github-actions[bot] commented on code in PR #63850:
URL: https://github.com/apache/doris/pull/63850#discussion_r3400352261
##########
fe/fe-core/src/main/java/org/apache/doris/planner/OlapScanNode.java:
##########
@@ -1651,13 +1682,87 @@ public OlapTableStreamUpdate getStreamUpdate() {
Map<Long, Long> prev = Maps.newHashMap();
Map<Long, Long> next = Maps.newHashMap();
for (Long partitionId : getSelectedPartitionIds()) {
- Pair<Long, Long> streamUpdate = ((OlapTableStreamWrapper)
olapTable).getStreamUpdate(partitionId);
+ Pair<Long, Long> streamUpdate = getStreamUpdate(partitionId);
if (streamUpdate.first != null) {
- // prev could be null, ignore
+ // prev could be null, in case of historical scan
prev.put(partitionId, streamUpdate.first);
}
- next.put(partitionId, streamUpdate.second);
+ if (streamUpdate.second != null) {
+ next.put(partitionId, streamUpdate.second);
+ } else {
+ // next could be null, in case of incremental scan
+ next.put(partitionId,
olapTable.getPartition(partitionId).getTso());
+ }
Review Comment:
This still reads a fresh stream offset instead of the snapshot captured when
scan ranges were built. This is distinct from the earlier `next =
null`/open-ended range thread: even after setting an `end_tso`, this path calls
`OlapTableStreamWrapper.getStreamUpdate()` again, and that wrapper writes to
`outputUpdateMap` but returns `stream.getStreamUpdate(partitionId)` rather than
the cached value. If another stream consumer commits between
`addScanRangeLocations()` and this transaction update collection, the later
call can return the advanced `prev`; `checkPartitionOffset()` then accepts the
commit even though this query scanned from the older offset, and the
transaction can advance the stream past rows it did not read. Please make the
wrapper return the cached `outputUpdateMap` value (or `computeIfAbsent`) and
use that same `[prev,next]` snapshot for both the scan range and the
transaction update.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]