github-actions[bot] commented on code in PR #63850:
URL: https://github.com/apache/doris/pull/63850#discussion_r3400741363
##########
fe/fe-core/src/main/java/org/apache/doris/datasource/InternalCatalog.java:
##########
@@ -4005,6 +3990,8 @@ public void createTableStream(CreateStreamCommand
command) throws DdlException {
.withBaseTable(baseTable)
.build();
Review Comment:
This validates the base table before stream properties are parsed, so
`newStream.getStreamScanType()` is still the default `MIN_DELTA`. That rejects
valid non-MIN_DELTA streams. For example, on a UNIQUE MOW table with ROW binlog
enabled but `binlog.need_historical_value=false`, `CREATE STREAM s ON TABLE t
PROPERTIES("type"="APPEND_ONLY")` reaches
`checkAsTableStreamBaseTable(MIN_DELTA)` and throws the historical-value error
even though APPEND_ONLY/DETAIL should not require historical values. Parse
`newStream.setProperties(properties)` first, then validate the resulting
`newStream.getStreamScanType()`, then check for leftover unknown properties.
##########
fe/fe-core/src/main/java/org/apache/doris/qe/NereidsCoordinator.java:
##########
@@ -148,6 +148,8 @@ public void exec() throws Exception {
Map<DistributedPlanWorker, TPipelineFragmentParamsList>
workerToFragments
= ThriftPlansBuilder.plansToThrift(coordinatorContext);
executionTask = PipelineExecutionTaskBuilder.build(coordinatorContext,
workerToFragments);
+
waitForTimeBasedReadTransactionsVisible(coordinatorContext.connectContext,
coordinatorContext.scanNodes,
Review Comment:
This wait runs after the scan ranges and thrift params have already been
materialized, so it cannot make the time-based read see the transactions it
just waited for. A concrete path is: a row-binlog table has txn T in COMMITTED
with commitTSO <= the requested end timestamp but T is not VISIBLE when
`OlapScanNode.init()` builds scan ranges; `addScanRangeLocations()` writes the
old partition visible version into `TPaloScanRange.version`. Here Nereids then
builds the thrift params/execution task, waits until T becomes VISIBLE, and
executes the already-built ranges, so BE still reads the old version and misses
T's binlog rows. The legacy path has the same ordering problem because its wait
is in `Coordinator.execInternal()` after planning initialized the scan node.
Please move the wait before scan-range version capture, or refresh/regenerate
the affected non-cloud scan range versions after the wait and before
building/sending thrift.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]