LiebingYu commented on code in PR #2197:
URL: https://github.com/apache/fluss/pull/2197#discussion_r2629962804
##########
fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/source/enumerator/FlinkSourceEnumerator.java:
##########
@@ -664,9 +677,16 @@ private void handlePartitionsRemoved(Collection<Partition>
removedPartitionInfo)
pendingSplitAssignment.forEach(
(reader, splits) ->
splits.removeIf(
- split ->
- removedPartitionsMap.containsKey(
-
split.getTableBucket().getPartitionId())));
+ split -> {
+ // Never remove LakeSnapshotSplit, because
during union reads,
+ // data from the lake partition must still
be read even if the
+ // partition has already expired in Fluss.
+ if (split instanceof LakeSnapshotSplit) {
Review Comment:
As discuss offline, LakeSnapshotAndFlussLogSplit should not be considered
here, as it pertains to how the reader handles partition expiration during the
reading process—an issue that should be addressed through other mechanisms,
such as a consumer. Introducing it here would add excessive complexity.
Furthermore, even if it were included, we still couldn't guarantee complete
data retrieval if the partition expires mid-read, because the data in Fluss
would still be purged by TTL.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]