This is an automated email from the ASF dual-hosted git repository.

liaoxin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new c9ee1936ffb [opt](memory) lazy-allocate PrefetchBuffer backing buffer to reduce peak memory (#61482)
c9ee1936ffb is described below

commit c9ee1936ffbc5b17495d71bb064c5958474ac264
Author: hui lai <[email protected]>
AuthorDate: Mon Mar 23 14:34:26 2026 +0800

    [opt](memory) lazy-allocate PrefetchBuffer backing buffer to reduce peak memory (#61482)
    
    ## Problem
    
    During a TVF scan over many small S3/HDFS files, each call to
    `CsvReader::init_reader()` creates a `PrefetchBufferedReader`, whose
    constructor immediately allocates `buffer_num` (typically 4)
    `PrefetchBuffer` objects, each pre-allocating an `s_max_pre_buffer_size`
    (4 MB) backing buffer. This costs **16 MB per file reader** at
    construction time, regardless of whether the reader ever performs any I/O.
    <img width="3838" height="1284" alt="image" src="https://github.com/user-attachments/assets/7d7d6fcd-7bf9-452c-8fb9-0e4479e426af" />
    
    
    ## Fix
    
    Defer the allocation of `_buf` from the `PrefetchBuffer` constructor to the
    first time `prefetch_buffer()` actually runs. This is safe because:
    
    1. `_buf` is only written by `prefetch_buffer()` (one writer).
    2. `read_buffer()` only accesses `_buf` after waiting on the condition
       variable for `PREFETCHED` status, which provides the required
       happens-before guarantee.
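
The two safety conditions above can be illustrated with a minimal, self-contained sketch. This is not the Doris implementation: `LazyBuffer`, `prefetch()`, `read_first_byte()`, `allocated()`, and `prefetch_then_read()` are hypothetical names chosen for illustration, and the sketch assumes a single producer thread, as condition 1 requires.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstring>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for PrefetchBuffer; names and layout are illustrative only.
class LazyBuffer {
public:
    enum class Status { PENDING, PREFETCHED };

    // The constructor records the size but allocates nothing.
    explicit LazyBuffer(std::size_t size) : _size(size) {}

    // Producer: the only function that writes _buf (single writer assumed).
    void prefetch(char fill) {
        if (!_buf) {
            _buf = std::make_unique<char[]>(_size);  // first-use allocation
        }
        std::memset(_buf.get(), fill, _size);
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _status = Status::PREFETCHED;
        }
        _prefetched.notify_all();
    }

    // Consumer: touches _buf only after observing PREFETCHED under the mutex,
    // so the producer's allocation and writes happen-before this read.
    char read_first_byte() {
        std::unique_lock<std::mutex> lock(_mutex);
        _prefetched.wait(lock, [this] { return _status == Status::PREFETCHED; });
        assert(_buf && "buffer must exist once status is PREFETCHED");
        return _buf[0];
    }

    // Not synchronized: call only while no prefetch is in flight.
    bool allocated() const { return _buf != nullptr; }

private:
    std::size_t _size;
    std::unique_ptr<char[]> _buf;  // stays null until the first prefetch
    Status _status = Status::PENDING;
    std::mutex _mutex;
    std::condition_variable _prefetched;
};

// Runs one producer thread and reads the result on the calling thread.
char prefetch_then_read(LazyBuffer& buffer, char fill) {
    std::thread producer([&buffer, fill] { buffer.prefetch(fill); });
    char first = buffer.read_first_byte();
    producer.join();
    return first;
}
```

In this sketch, a `LazyBuffer` that is constructed and then destroyed without a prefetch never allocates, while `prefetch_then_read()` allocates exactly once, on the producer thread.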
    
    ## Impact
    
    | Scenario | Before | After |
    |---|---|---|
    | Reader created, never reads (closed before prefetch runs) | 16 MB allocated, held until task drains | 0 MB allocated |
    | N tasks queued on closed buffers in thread pool | N × 16 MB stuck in memory | ~0 MB (empty shell objects only) |
    | Normal read path | 16 MB allocated when prefetch runs | 16 MB allocated when prefetch runs (unchanged) |
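
The arithmetic behind the table can be sketched as follows. The constants mirror the figures quoted in this description (4 buffers of 4 MB each); the function names are hypothetical, 1 MB is taken as 1024 × 1024 bytes, and an active reader is assumed to have run a prefetch on all of its buffers.

```cpp
#include <cstddef>

constexpr std::size_t kMiB = 1024 * 1024;
constexpr std::size_t kBufferNum = 4;               // "typically 4" per the description
constexpr std::size_t kMaxPreBufferSize = 4 * kMiB; // assumed value of s_max_pre_buffer_size
constexpr std::size_t kPerReader = kBufferNum * kMaxPreBufferSize;

// Eager allocation: every reader pays for its buffers at construction time.
constexpr std::size_t eager_bytes(std::size_t idle_readers, std::size_t active_readers) {
    return (idle_readers + active_readers) * kPerReader;
}

// Lazy allocation: only readers that actually ran a prefetch hold buffers.
constexpr std::size_t lazy_bytes(std::size_t /*idle_readers*/, std::size_t active_readers) {
    return active_readers * kPerReader;
}

static_assert(kPerReader == 16 * kMiB, "4 buffers x 4 MB = 16 MB per reader");
static_assert(lazy_bytes(100, 0) == 0, "idle readers cost no buffer memory");
static_assert(eager_bytes(0, 8) == lazy_bytes(0, 8), "normal read path is unchanged");
```

Under these assumptions the saving scales linearly with the number of readers that are created but never read, which matches the first two table rows.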
    
    Comparison of total load memory (the earlier portion is before the optimization):
    <img width="1038" height="426" alt="image" src="https://github.com/user-attachments/assets/895ce622-fd91-4bf6-93ae-ee02f9467801" />
---
 be/src/io/fs/buffered_reader.cpp | 7 +++++++
 be/src/io/fs/buffered_reader.h   | 2 +-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/be/src/io/fs/buffered_reader.cpp b/be/src/io/fs/buffered_reader.cpp
index b2328d0352e..5f9e4d63a8b 100644
--- a/be/src/io/fs/buffered_reader.cpp
+++ b/be/src/io/fs/buffered_reader.cpp
@@ -449,6 +449,13 @@ void PrefetchBuffer::prefetch_buffer() {
         _prefetched.notify_all();
     }
 
+    // Lazy-allocate the backing buffer on first actual prefetch, avoiding the cost of
+    // pre-allocating memory for readers that are initialized but never read (e.g. when
+    // many file readers are created concurrently for a TVF scan over many small S3 files).
+    if (!_buf) {
+        _buf = std::make_unique<char[]>(_size);
+    }
+
     int read_range_index = search_read_range(_offset);
     size_t buf_size;
     if (read_range_index == -1) {
diff --git a/be/src/io/fs/buffered_reader.h b/be/src/io/fs/buffered_reader.h
index d06402876bb..790dec16e8f 100644
--- a/be/src/io/fs/buffered_reader.h
+++ b/be/src/io/fs/buffered_reader.h
@@ -436,7 +436,7 @@ struct PrefetchBuffer : std::enable_shared_from_this<PrefetchBuffer>, public Pro
               _reader(reader),
               _io_ctx_holder(std::move(io_ctx)),
               _io_ctx(_io_ctx_holder.get()),
-              _buf(new char[buffer_size]),
+              _buf(nullptr),
               _sync_profile(std::move(sync_profile)) {}
 
     PrefetchBuffer(PrefetchBuffer&& other)

