mao-liu commented on code in PR #8186:
URL: https://github.com/apache/paimon/pull/8186#discussion_r3392629898
##########
paimon-flink/paimon-flink-common/src/main/java/org/apache/paimon/flink/sink/coordinator/TableWriteCoordinator.java:
##########
@@ -93,6 +100,16 @@ private synchronized void refresh() {
}
this.snapshot = latestSnapshot.get();
this.scan.withSnapshot(snapshot);
+ if (prefetchManifests) {
+ // Eagerly read all data manifests of the current snapshot once to warm the
+ // table's SegmentsCache (the byte-level manifest cache attached to the table
+ // inside the Job Manager). This reuses the same threaded `plan()` read path
+ // that per-task `scan` requests use, so subsequent concurrent requests hit
+ // warm bytes instead of each performing a cold manifest read.
+ scan.withPartitionFilter(PartitionPredicate.ALWAYS_TRUE)
+ .withBucketFilter(Filter.alwaysTrue())
Review Comment:
Thanks for the review @JingsongLi, and thanks for catching the bug 🙏
I have updated the prefetch to use a fresh scan instance, and also added a
test to cover this scenario:
https://github.com/apache/paimon/pull/8186/commits/709c79b808b20f3046c017f73f5e488664ed7002
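For context, here is a minimal sketch of why a fresh instance is the safer choice. It assumes the problem was that filters set on the shared, builder-style scan stay installed after the warm-up call; that reading is an assumption, and the linked commit is the authority. `ToyScan` and `PrefetchSketch` are hypothetical stand-ins, not Paimon classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for a mutable, builder-style scan; not Paimon's API. */
final class ToyScan {
    private final List<Integer> buckets;
    private Predicate<Integer> bucketFilter = b -> true;

    ToyScan(List<Integer> buckets) {
        this.buckets = buckets;
    }

    /** Mutates this instance and returns it, as builder-style setters usually do. */
    ToyScan withBucketFilter(Predicate<Integer> filter) {
        this.bucketFilter = filter;
        return this;
    }

    /** Returns the buckets that pass the currently installed filter. */
    List<Integer> plan() {
        List<Integer> result = new ArrayList<>();
        for (Integer bucket : buckets) {
            if (bucketFilter.test(bucket)) {
                result.add(bucket);
            }
        }
        return result;
    }
}

final class PrefetchSketch {
    static final List<Integer> BUCKETS = List.of(0, 1, 2, 3);

    /** Warm-up on the shared scan: the always-true filter replaces the caller's filter. */
    static List<Integer> prefetchOnShared() {
        ToyScan shared = new ToyScan(BUCKETS).withBucketFilter(b -> b == 1);
        shared.withBucketFilter(b -> true).plan(); // warm-up mutates shared state
        return shared.plan(); // the caller's filter is gone
    }

    /** Warm-up on a fresh scan: the shared scan keeps the caller's filter. */
    static List<Integer> prefetchOnFresh() {
        ToyScan shared = new ToyScan(BUCKETS).withBucketFilter(b -> b == 1);
        new ToyScan(BUCKETS).withBucketFilter(b -> true).plan(); // isolated warm-up
        return shared.plan(); // the caller's filter is still applied
    }
}
```

In this toy model, `prefetchOnShared()` returns every bucket because the warm-up overwrote the filter, while `prefetchOnFresh()` returns only bucket 1.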