aokolnychyi commented on code in PR #4520:
URL: https://github.com/apache/iceberg/pull/4520#discussion_r846351906
##########
core/src/main/java/org/apache/iceberg/BaseFilesTable.java:
##########
@@ -92,7 +94,26 @@ public TableScan appendsAfter(long fromSnapshotId) {
protected CloseableIterable<FileScanTask> planFiles(TableOperations ops, Snapshot snapshot, Expression rowFilter,
boolean ignoreResiduals, boolean caseSensitive, boolean colStats) {
- CloseableIterable<ManifestFile> filtered = filterManifests(manifests(), rowFilter, caseSensitive);
+ Map<Integer, PartitionSpec> specsById = table().specs();
+
+ LoadingCache<Integer, ManifestEvaluator> evalCache = Caffeine.newBuilder().build(specId -> {
+ PartitionSpec spec = specsById.get(specId);
+ PartitionSpec transformedSpec = transformSpec(fileSchema, spec, PARTITION_FIELD_PREFIX);
+ return ManifestEvaluator.forRowFilter(rowFilter, transformedSpec, caseSensitive);
+ });
+
+ CloseableIterable<ManifestFile> filtered = CloseableIterable.filter(
+ manifests(),
+ manifest -> {
+ PartitionSpec spec = specsById.get(manifest.partitionSpecId());
+
+ if (spec.fields().stream().anyMatch(f -> f.transform().equals(Transforms.alwaysNull()))) {
Review Comment:
Initially, filtering out such manifests looked wrong to me, but I am no
longer sure.
If we have a filter `partition.data_bucket = 1` and `data_bucket` is null
for manifests written under an evolved spec, maybe it is correct to skip
such manifests, since technically the value of that partition field is null?
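
To make the question concrete, here is a minimal sketch of the null semantics I have in mind. It deliberately does not use Iceberg classes: `FieldSummary`, `mightMatchEq`, and `mightMatchIsNull` are hypothetical names, and the assumption is that a manifest written under a spec where the field was replaced by an always-null transform reports only nulls for that field.

```java
final class NullPartitionSketch {

  /** Hypothetical per-field partition summary of one manifest (not Iceberg's class). */
  static final class FieldSummary {
    final boolean containsNull;
    final Integer lower; // null when the manifest holds no non-null value for the field
    final Integer upper;

    FieldSummary(boolean containsNull, Integer lower, Integer upper) {
      this.containsNull = containsNull;
      this.lower = lower;
      this.upper = upper;
    }

    /** Summary assumed for a field dropped via an always-null transform. */
    static FieldSummary allNull() {
      return new FieldSummary(true, null, null);
    }
  }

  /** Returns false when no file in the manifest can satisfy `field = literal`. */
  static boolean mightMatchEq(FieldSummary summary, int literal) {
    if (summary.lower == null || summary.upper == null) {
      // Every value is null, and `null = literal` is never true, so the manifest can be skipped.
      return false;
    }
    return summary.lower <= literal && literal <= summary.upper;
  }

  /** Returns true when some file in the manifest may satisfy `field IS NULL`. */
  static boolean mightMatchIsNull(FieldSummary summary) {
    return summary.containsNull;
  }

  public static void main(String[] args) {
    FieldSummary evolved = FieldSummary.allNull();                // data_bucket dropped from the spec
    FieldSummary current = new FieldSummary(false, 0, 3);         // data_bucket still present

    System.out.println(mightMatchEq(evolved, 1));     // partition.data_bucket = 1 on the evolved manifest
    System.out.println(mightMatchEq(current, 1));     // same filter on a current-spec manifest
    System.out.println(mightMatchIsNull(evolved));    // partition.data_bucket IS NULL on the evolved manifest
  }
}
```

Under these assumptions, skipping the manifest for an equality filter is consistent, while an `IS NULL` filter on the same field would still need to keep it.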
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]