szehon-ho opened a new pull request, #4520: URL: https://github.com/apache/iceberg/pull/4520
The change https://github.com/apache/iceberg/pull/2926 introduced partition predicate pushdown for the "files" metadata table. However, it is incorrect when the partition spec has evolved and data has been written with both specs.

The projectionFilter is constructed by projecting the expression onto the table's current partition spec, and the resulting evaluator is then applied to every manifest file.

For example, say a table has a former partition spec on "data" and a new partition spec on "id", with manifests written with both specs. A query like "select * from my_table.files where files.partition.id=1" projects successfully against the current partition spec "id", producing a ManifestEvaluator that looks for value 1. That evaluator is then used against all manifest files, even those written with the old partition spec "data". It erroneously filters the old manifests on whether data is 1, and correctly filters the new manifests on whether id is 1, giving wrong results.
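The failure mode described above can be reproduced with a small stand-alone model. This is only an illustrative sketch in Python, not Iceberg code: `Manifest`, `SPECS`, `make_evaluator`, `plan_buggy`, and `plan_per_spec` are hypothetical names, partition values are simplified to integers, and the "per spec" variant is one conservative way to avoid the problem (keep a manifest when the predicate's field is not in that manifest's spec), not necessarily what this pull request implements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Manifest:
    path: str
    spec_id: int
    # (lower, upper) bounds per partition field, ordered by the manifest's OWN spec
    bounds: tuple


# Hypothetical table history: spec 0 partitions by "data", spec 1 (current) by "id".
SPECS = {0: ("data",), 1: ("id",)}
CURRENT_SPEC_ID = 1


def make_evaluator(spec_fields, field, value):
    """Build a manifest predicate for `field == value` against one partition spec."""
    if field not in spec_fields:
        # The spec has no such field, so the bounds say nothing: never prune.
        return lambda manifest: True
    pos = spec_fields.index(field)

    def evaluate(manifest):
        lower, upper = manifest.bounds[pos]
        return lower <= value <= upper

    return evaluate


def plan_buggy(manifests, field, value):
    # One evaluator built from the current spec, applied to every manifest.
    evaluator = make_evaluator(SPECS[CURRENT_SPEC_ID], field, value)
    return [m.path for m in manifests if evaluator(m)]


def plan_per_spec(manifests, field, value):
    # One evaluator per spec; each manifest is checked with its own spec's evaluator.
    evaluators = {
        spec_id: make_evaluator(fields, field, value)
        for spec_id, fields in SPECS.items()
    }
    return [m.path for m in manifests if evaluators[m.spec_id](m)]


manifests = [
    Manifest("old-a.avro", 0, ((1, 1),)),  # data in [1, 1]
    Manifest("old-b.avro", 0, ((5, 9),)),  # data in [5, 9]
    Manifest("new-a.avro", 1, ((1, 1),)),  # id in [1, 1]
    Manifest("new-b.avro", 1, ((2, 3),)),  # id in [2, 3]
]

buggy = plan_buggy(manifests, "id", 1)
per_spec = plan_per_spec(manifests, "id", 1)
print(buggy)
print(per_spec)
```

In the buggy plan, "old-b.avro" is dropped only because its "data" bounds do not contain 1, even though the predicate was on "id". The per-spec plan never prunes old manifests on that basis, so any remaining filtering has to happen on the rows themselves.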
