github-actions[bot] commented on code in PR #64133:
URL: https://github.com/apache/doris/pull/64133#discussion_r3473385233
##########
be/src/storage/rowset/beta_rowset_reader.cpp:
##########
@@ -342,13 +342,13 @@ Status BetaRowsetReader::_init_iterator() {
}
}
}
- if (_read_options.binlog_lsn_idx != -1) {
- sequence_loc = _read_options.binlog_lsn_idx;
Review Comment:
Using TSO as the merge sequence here loses the per-row ordering that
row-binlog DETAIL/MIN_DELTA depends on. The TSO column is not stored per row:
`RowBinlogSegmentWriter` writes a placeholder, and
`SegmentIterator::_update_tso_col_if_needed()` replaces every row in a rowset
with the same commit TSO. For same-key changes that land in different segments
of the same rowset, the merge comparator sees equal key and equal TSO, then
falls back to `data_id()`. These inputs are `LazyInitSegmentIterator`s, which
do not forward the wrapped `SegmentIterator::data_id()` and therefore all
return the base value 0. That leaves the heap with no deterministic
tie-breaker, so `BlockReader` can process a later update/delete before an
earlier one and produce wrong DETAIL/MIN_DELTA rows. Please keep LSN as a
secondary/per-row merge key for row-binlog scans, or otherwise provide a proven
stable ordering for same-key rows within a commit TSO.
##########
be/src/load/memtable/memtable_flush_executor.cpp:
##########
@@ -415,6 +390,17 @@ void FlushToken::_flush_memtable_impl(RowsetWriter*
flush_writer, MemTable* memt
// }};
std::shared_ptr<Block> flush_block;
RETURN_IF_ERROR(_memtable2block(memtable, shared_memtable,
flush_block));
+ if (flush_writer->context().write_binlog_opt().enable) {
+ const auto& memtable_lsns = memtable->row_binlog_lsns();
Review Comment:
This `CHECK(!memtable_lsns.empty())` makes a valid empty-rowset close crash
for row-binlog tables. `TabletsChannel::close()` closes every writer in a
touched partition, and `DeltaWriter::close()` explicitly initializes writers
with no data so tablets that received zero rows still create an empty rowset
for version continuity. In that path `_memtable2block()` returns a zero-row
block and the sidecar LSN vector is empty; the rowset writer already treats
zero-row memtables as OK, but this check fires before `flush_memtable()` can do
so. Please skip LSN registration when `flush_block->rows() == 0` instead of
requiring a non-empty sidecar.
##########
gensrc/proto/olap_file.proto:
##########
@@ -540,11 +540,14 @@ message TabletSchemaPB {
// storage_format; the fallback path lives in TabletSchema::init_from_pb.
optional TabletStorageFormatPB storage_format = 36;
- // cid of the binlog timestamp column (__DORIS_BINLOG_TIMESTAMP__).
- optional int32 binlog_timestamp_col_idx = 37 [default = -1];
-
// cid of the binlog lsn column (__DORIS_BINLOG_LSN__).
- optional int32 binlog_lsn_col_idx = 38 [default = -1];
+ optional int32 binlog_lsn_col_idx = 37 [default = -1];
Review Comment:
This renumbers persisted tablet-schema fields in a way that corrupts
existing row-binlog metadata on upgrade. In the base `TabletSchemaPB`, field 37
was `binlog_timestamp_col_idx` and field 38 was `binlog_lsn_col_idx`;
`TabletSchemaCloudPB` had the same timestamp/LSN fields at 38 and 39. After
this change, new code decodes the old timestamp tag as `binlog_lsn_col_idx` and
the old LSN tag as `binlog_op_col_idx` via `TabletSchema::init_from_pb()` and
the cloud conversion helpers. Any tablet meta written before the upgrade will
therefore load with LSN pointing at the old timestamp column and op pointing at
the old LSN column, and `RowBinlogSegmentWriter` will build the row-binlog
prefix from the wrong position. Please keep existing field numbers stable,
reserve removed fields, and add new field numbers plus an explicit
fallback/migration from the old timestamp/LSN fields.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]