gaborkaszab commented on code in PR #14508:
URL: https://github.com/apache/iceberg/pull/14508#discussion_r2511131964
##########
core/src/main/java/org/apache/iceberg/PartitionStatsHandler.java:
##########
@@ -225,7 +223,7 @@ public static PartitionStatisticsFile
computeAndWriteStatsFile(Table table, long
}
try {
- stats = computeAndMergeStatsIncremental(table, snapshot,
partitionType, statisticsFile);
+ stats = computeAndMergeStatsIncremental(table, snapshot,
statisticsFile.snapshotId());
Review Comment:
The reason this param is removed is that `computeAndMergeStatsIncremental`
is changed to use the new Scan API instead of `readPartitionStatsFile`. Since
the partitionType is deliberately made internal to the Scan, this is no longer
needed in this function.
What Do you mean we recalculate it later for every file? Whenever we
calculate stats, we produce this partitionType once for scanning the latest
existing partition stats in case it's an incremental computation, and then we
produce it one more time when we sort the stats. I think this is cheap enough
to worth keeping it internal to the Scan API and not giving a way for users to
inject it into the scan in order to save one computation of this.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]