hongzhi-gao commented on code in PR #734:
URL: https://github.com/apache/tsfile/pull/734#discussion_r2992313487
##########
cpp/src/writer/tsfile_writer.cc:
##########
@@ -716,15 +764,33 @@ int TsFileWriter::write_tablet_aligned(const Tablet& tablet) {
data_types))) {
return ret;
}
- time_write_column(time_chunk_writer, tablet);
- ASSERT(value_chunk_writers.size() == tablet.get_column_count());
- for (uint32_t c = 0; c < value_chunk_writers.size(); c++) {
- ValueChunkWriter* value_chunk_writer = value_chunk_writers[c];
- if (IS_NULL(value_chunk_writer)) {
- continue;
+ for (uint32_t row = 0; row < tablet.get_cur_row_size(); row++) {
+ int32_t time_pages_before = time_chunk_writer->num_of_pages();
+ std::vector<int32_t> value_pages_before(value_chunk_writers.size(), 0);
+ for (uint32_t c = 0; c < value_chunk_writers.size(); c++) {
+ ValueChunkWriter* value_chunk_writer = value_chunk_writers[c];
+ if (!IS_NULL(value_chunk_writer)) {
+ value_pages_before[c] = value_chunk_writer->num_of_pages();
+ }
+ }
Review Comment:
Implemented the review suggestion for aligned page sealing.
- Added a new config flag `strict_page_size_` (default: true).
- Extended `TimeChunkWriter` and `ValueChunkWriter` with
`set_enable_page_seal_if_full(bool)` so callers can disable the per-`write()`
auto page-size/point-number sealing check and seal pages manually at chosen
boundaries.
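As a rough illustration of what such a toggle does, here is a minimal sketch. `ToyChunkWriter` and its members are hypothetical stand-ins, not the actual `TimeChunkWriter`/`ValueChunkWriter` classes; only the method name `set_enable_page_seal_if_full` is taken from this change. The sketch counts points only, whereas the real check also considers page size in bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a chunk writer; NOT the TsFile implementation.
class ToyChunkWriter {
public:
    explicit ToyChunkWriter(std::size_t max_points_per_page)
        : max_points_(max_points_per_page) {
        assert(max_points_per_page > 0);
    }

    // Same name as the setter described in the comment; behavior is assumed.
    void set_enable_page_seal_if_full(bool enable) { auto_seal_ = enable; }

    // Appends one point; seals the page only if auto sealing is enabled
    // and the open page has reached the point limit.
    void write(int64_t value) {
        current_page_.push_back(value);
        if (auto_seal_ && current_page_.size() >= max_points_) {
            seal_current_page();
        }
    }

    // Manual seal, for callers that choose their own page boundaries.
    void seal_current_page() {
        if (current_page_.empty()) return;
        sealed_page_sizes_.push_back(current_page_.size());
        current_page_.clear();
    }

    const std::vector<std::size_t>& sealed_page_sizes() const {
        return sealed_page_sizes_;
    }
    std::size_t open_page_size() const { return current_page_.size(); }

private:
    std::size_t max_points_;
    bool auto_seal_ = true;
    std::vector<int64_t> current_page_;
    std::vector<std::size_t> sealed_page_sizes_;
};
```

With auto sealing on and a limit of 3, seven writes leave two sealed pages and one open point. With it off, all seven points stay in the open page until the caller seals.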
Aligned writing behavior:
- `strict_page_size=true`: keep the original row-based insertion and call
  `maybe_seal_aligned_pages_together()` so that when the time page or any value
  page becomes full, all aligned pages are sealed together (strict page size
  semantics).
- `strict_page_size=false`:
  - No STRING/TEXT/BLOB columns: switch to column-based insertion. Disable
    auto sealing for the time and value writers, split the rows into segments
    of `page_writer_max_point_num_`, and seal pages at segment boundaries.
  - With STRING/TEXT/BLOB columns: write the time column first with auto
    sealing enabled and record the row boundaries at which time pages are
    sealed. Then write each value column with auto sealing disabled and seal
    manually at those recorded boundaries.
This restores a faster column-based write path while preserving page
alignment requirements according to the review proposal.
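The boundary-recording path for STRING/TEXT/BLOB columns can be sketched roughly as follows. This is a simplified model, not the code in this PR: `record_time_page_boundaries` and `write_value_column` are hypothetical helpers, and the time column is assumed to seal purely by point count.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: simulates writing the time column with auto sealing
// and returns the exclusive row index at which each time page was sealed.
std::vector<std::size_t> record_time_page_boundaries(
    std::size_t row_count, std::size_t max_points_per_page) {
    std::vector<std::size_t> boundaries;
    if (max_points_per_page == 0) return boundaries;
    std::size_t points_in_page = 0;
    for (std::size_t row = 0; row < row_count; ++row) {
        ++points_in_page;
        if (points_in_page >= max_points_per_page) {
            boundaries.push_back(row + 1);
            points_in_page = 0;
        }
    }
    return boundaries;
}

// Hypothetical helper: writes one value column with auto sealing disabled,
// sealing only at the recorded time-page boundaries. Rows after the last
// boundary are returned as a final, still-open page.
std::vector<std::vector<std::string>> write_value_column(
    const std::vector<std::string>& values,
    const std::vector<std::size_t>& boundaries) {
    assert(boundaries.empty() || boundaries.back() <= values.size());
    std::vector<std::vector<std::string>> pages;
    std::vector<std::string> current;
    std::size_t next_boundary = 0;
    for (std::size_t row = 0; row < values.size(); ++row) {
        current.push_back(values[row]);
        if (next_boundary < boundaries.size() &&
            row + 1 == boundaries[next_boundary]) {
            pages.push_back(std::move(current));
            current.clear();  // reset the moved-from vector before reuse
            ++next_boundary;
        }
    }
    if (!current.empty()) pages.push_back(std::move(current));
    return pages;
}
```

Because every value column seals at the same row indices as the time column, page `i` of each column covers the same rows regardless of how large the individual strings are.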