RussellSpitzer opened a new pull request #1744: URL: https://github.com/apache/iceberg/pull/1744
Previously we would crash whenever attempting to project only columns other than data_file (or its sub-columns) from a partitioned Iceberg table. This occurred because our projection in ManifestReader always required the "data_file" column, even if no columns from within it were requested.

This worked on unpartitioned tables only because of a second bug in column pruning: a field would not be pruned if it contained another field whose schema was an empty struct, regardless of whether that field was requested. An empty partition schema therefore ensured that data_file was never pruned, even though none of its columns matched the pruning request.

Here we fix both bugs: first, by correctly pruning empty structs when they are not explicitly requested; second, by changing the projection for ManifestEntry to ignore the data_file field if no subfields are requested from it.
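To make the interaction between the two bugs concrete, here is a minimal toy model of ID-based column pruning. It is only a sketch and not Iceberg's actual implementation: the `Struct`, `Field`, `prune`, and `entry_schema` names, the `prune_empty` flag, and the field IDs are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Struct:
    fields: tuple = ()


@dataclass(frozen=True)
class Field:
    field_id: int
    name: str
    type: Union[str, Struct]  # a primitive type name or a nested struct


def prune(struct, selected, prune_empty=True):
    """Keep only fields whose IDs are selected, or that contain a kept field.

    prune_empty=False imitates the old behaviour described above (an
    assumption of this toy model): a field whose schema is an empty struct
    always survives, which in turn keeps its parent alive.
    """
    kept = []
    for f in struct.fields:
        if f.field_id in selected:
            kept.append(f)  # explicitly requested: keep as-is
        elif isinstance(f.type, Struct):
            child = prune(f.type, selected, prune_empty)
            originally_empty = not f.type.fields
            if child.fields or (originally_empty and not prune_empty):
                kept.append(Field(f.field_id, f.name, child))
    return Struct(tuple(kept))


def entry_schema(partition_type):
    # Hypothetical manifest-entry layout; IDs are illustrative only.
    data_file = Struct((
        Field(100, "file_path", "string"),
        Field(102, "partition", partition_type),
    ))
    return Struct((
        Field(0, "status", "int"),
        Field(1, "snapshot_id", "long"),
        Field(2, "data_file", data_file),
    ))


def names(struct):
    return [f.name for f in struct.fields]


unpartitioned = entry_schema(Struct())
partitioned = entry_schema(Struct((Field(1000, "day", "int"),)))
```

In this model, with `prune_empty=False` and only `status` and `snapshot_id` selected, `data_file` survives for the unpartitioned schema but is dropped for the partitioned one, which mirrors why a reader that always expects `data_file` would fail only on partitioned tables. With the default `prune_empty=True`, `data_file` is dropped in both cases unless one of its subfields (or the empty `partition` struct itself) is explicitly selected.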
