qlong opened a new pull request, #15385: URL: https://github.com/apache/iceberg/pull/15385
This PR adds support for manifest-based file skipping on variant columns.

Changes:
- SparkV2Filters: Convert variant_get/try_variant_get to Expressions.extract()
- Spark3Util.describe: Output extract terms as variant_get() for EXPLAIN

Tests:
- Added unit tests
- Manual end-to-end testing with spark-sql built with the dependency PRs. Verified that variant_get is pushed down to Iceberg for file skipping, and confirmed from the Spark history that files are skipped.

This PR depends on:
1. https://github.com/apache/iceberg/pull/15384: variant bound fix
2. https://github.com/apache/iceberg/pull/14297: shredded variant support for Spark
3. https://github.com/apache/spark/pull/54394: Spark-side change to add VariantGet to DSv2 filter

This PR can be safely merged once the first dependency PR is merged.
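
For readers unfamiliar with bounds-based file skipping, here is a minimal, self-contained sketch of the idea behind the pushdown. It is not the code in this PR and does not use Iceberg's API: the class VariantSkipSketch, its Bounds and DataFile types, and the path "$.event.id" are all hypothetical names chosen for illustration. It assumes each data file may carry inclusive lower/upper bounds for a shredded variant path, and that a predicate such as variant_get(col, path, 'long') > literal can rule a file out only when those bounds are present.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model for illustration only; these types are NOT Iceberg classes.
final class VariantSkipSketch {

  /** Inclusive lower/upper bound recorded for one variant path in one data file. */
  static final class Bounds {
    final long lower;
    final long upper;

    Bounds(long lower, long upper) {
      this.lower = lower;
      this.upper = upper;
    }
  }

  /** A data file and the per-path bounds assumed to be stored in its manifest entry. */
  static final class DataFile {
    final String name;
    final Map<String, Bounds> boundsByPath;

    DataFile(String name, Map<String, Bounds> boundsByPath) {
      this.name = name;
      this.boundsByPath = boundsByPath;
    }
  }

  /** Returns true only when no row in the file can satisfy "value at path > literal". */
  static boolean canSkipGreaterThan(DataFile file, String path, long literal) {
    Bounds bounds = file.boundsByPath.get(path);
    if (bounds == null) {
      return false; // no bounds for this path: the file must be read
    }
    return bounds.upper <= literal; // every value is at most upper, so none exceeds literal
  }

  /** Names of the files that still have to be scanned after skipping. */
  static List<String> filesToScan(List<DataFile> files, String path, long literal) {
    List<String> kept = new ArrayList<>();
    for (DataFile file : files) {
      if (!canSkipGreaterThan(file, path, literal)) {
        kept.add(file.name);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    String path = "$.event.id"; // hypothetical variant path
    List<DataFile> files = List.of(
        new DataFile("a.parquet", Map.of(path, new Bounds(1, 50))),    // skipped: upper <= 100
        new DataFile("b.parquet", Map.of(path, new Bounds(90, 200))),  // kept: may contain > 100
        new DataFile("c.parquet", Map.of()));                          // kept: no bounds
    System.out.println(filesToScan(files, path, 100L)); // prints [b.parquet, c.parquet]
  }
}
```

The sketch is deliberately conservative: a file with no bounds for the path is always kept, since skipping is only safe when the bounds prove that no row can match.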
