github-actions[bot] commented on code in PR #61126:
URL: https://github.com/apache/doris/pull/61126#discussion_r2901997203


##########
be/src/olap/rowset/segment_v2/column_writer.cpp:
##########
@@ -875,7 +883,20 @@ Status ScalarColumnWriter::finish_current_page() {
         page->data.emplace_back(std::move(compressed_body));
     }
 
-    _push_back_page(std::move(page));

Review Comment:
   **MINOR: Duplicated lambda.** `collect_uncompressed_bytes` is identical to 
the lambda defined in `write_data()` (line 747). Consider hoisting it into a 
static helper function or a private method to avoid the duplication and reduce 
the risk of the two copies diverging.
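
   A minimal sketch of the deduplication (names and the helper body are 
assumptions, since the actual lambda body is not shown in this diff): both 
call sites would invoke one shared definition instead of redefining the lambda.

   ```cpp
   #include <cstddef>
   #include <string>
   #include <vector>

   namespace sketch {

   // Stand-in for the page body fragments both call sites iterate over
   // (the real code uses Doris Slice objects).
   using PageBody = std::vector<std::string>;

   // Single definition replacing the two identical lambdas in
   // write_data() and finish_current_page().
   inline size_t collect_uncompressed_bytes(const PageBody& body) {
       size_t total = 0;
       for (const auto& part : body) {
           total += part.size();
       }
       return total;
   }

   } // namespace sketch
   ```

   With one definition, a future change to the accounting (for example, adding 
nullmap bytes) only has to be made in one place.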



##########
be/src/common/config.cpp:
##########
@@ -454,6 +454,8 @@ DEFINE_mInt64(vertical_compaction_max_segment_size, "1073741824");
 DEFINE_mDouble(sparse_column_compaction_threshold_percent, "0.05");
 // Enable RLE batch Put optimization for compaction
 DEFINE_mBool(enable_rle_batch_put_optimization, "true");
+// When enabled, ScalarColumnWriter flushes each data page to disk immediately
+DEFINE_mBool(enable_streaming_page_flush, "true");

Review Comment:
   **HIGH: Default value is `"true"`, but the PR description says `(default: 
false)`.**
   
   Given the `estimate_buffer_size()` bug described in the column_writer.cpp 
comment, shipping this as `true` by default would silently bypass segment size 
limits in production for all users. Even after the estimation bug is fixed, a 
new feature like this should ship with `default: false` until it has been 
thoroughly validated in production.
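   
   If the recommendation is adopted, the config entry would look like this 
(a sketch of the suggested change, not the PR's actual code):
   
   ```cpp
   // When enabled, ScalarColumnWriter flushes each data page to disk immediately.
   // Off by default until streaming flush has been validated in production and
   // the estimate_buffer_size() accounting is fixed.
   DEFINE_mBool(enable_streaming_page_flush, "false");
   ```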



##########
be/src/olap/rowset/segment_v2/column_writer.cpp:
##########
@@ -875,7 +883,20 @@ Status ScalarColumnWriter::finish_current_page() {
         page->data.emplace_back(std::move(compressed_body));
     }

Review Comment:
   **CRITICAL BUG: `estimate_buffer_size()` is broken in streaming mode, 
causing unbounded segment growth.**
   
   When `_streaming_flush_enabled` is true, `_push_back_page()` is never 
called, so `_data_size` stays at **0** for the entire lifetime of the column 
writer. This means `estimate_buffer_size()` (which starts with `uint64_t size = 
_data_size;`) will only return the size of the current in-progress page + index 
builders, completely omitting all data already flushed to disk.
   
   This breaks the entire segment size estimation chain:
   1. `SegmentWriter::estimate_segment_size()` sums `estimate_buffer_size()` 
across all columns → vastly underestimates actual segment file size
   2. `SegmentWriter::max_row_to_add()` computes `(MAX_SEGMENT_SIZE - estimate) 
/ row_avg_size` → always thinks there's ~230MB of headroom
   3. `SegmentCreator::add_block()` never triggers segment flush based on size 
→ **segments can grow far beyond `MAX_SEGMENT_SIZE` (230MB)**
   4. `SegmentWriter::finalize()` disk capacity check also uses 
`estimate_segment_size()` → inaccurate
   
   The only remaining guardrail is `max_rows_per_segment`, which defaults to 
`INT32_MAX`.
   
   **Fix:** Track the cumulative bytes already flushed to disk, e.g.:
   ```cpp
   // In the streaming path of finish_current_page():
   _streaming_flushed_bytes += _file_writer->bytes_appended() - offset_before;
   ```
   Then in `estimate_buffer_size()`:
   ```cpp
   uint64_t size = _data_size + _streaming_flushed_bytes;
   ```
   Or simply use `_total_compressed_data_pages_size` (which is already tracked 
correctly) instead of `_data_size` in the estimate.
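   
   The accounting problem can be modeled in isolation (member names follow the 
review's suggestion, e.g. `_streaming_flushed_bytes`; the real 
`ScalarColumnWriter` has more state than this toy):
   
   ```cpp
   #include <cstdint>

   // Toy model of the estimate described above, not the actual Doris class.
   struct StreamingWriterModel {
       uint64_t data_size = 0;               // _data_size: bytes buffered in memory
       uint64_t streaming_flushed_bytes = 0; // bytes already written to disk
       uint64_t current_page_bytes = 0;      // in-progress page

       // Buggy estimate: mirrors the current code, which starts from
       // _data_size and therefore ignores everything already flushed.
       uint64_t estimate_buffer_size_buggy() const {
           return data_size + current_page_bytes;
       }

       // Fixed estimate: counts the streamed bytes as well.
       uint64_t estimate_buffer_size_fixed() const {
           return data_size + streaming_flushed_bytes + current_page_bytes;
       }

       // Streaming path of finish_current_page(): the page goes straight to
       // disk, so data_size never grows.
       void finish_page_streaming(uint64_t page_bytes) {
           streaming_flushed_bytes += page_bytes;
           current_page_bytes = 0;
       }
   };
   ```
   
   After streaming out three pages, the buggy estimate still reports 0 bytes, 
so the `MAX_SEGMENT_SIZE` check in `max_row_to_add()` can never fire; only the 
fixed estimate reflects the data actually on disk.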



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

