LuciferYang opened a new pull request #30652: URL: https://github.com/apache/spark/pull/30652
### What changes were proposed in this pull request?

As described in SPARK-33673, some tests in `ParquetV2SchemaPruningSuite` fail when `parquet.version` is set to 1.11.1, because Parquet returns empty results for non-existent columns since PARQUET-1765.

This PR uses `dataSchema` instead of `schema` to build `pushedParquetFilters` in `ParquetScanBuilder`, so that partition filters are not pushed down to `ParquetScan` for `DataSourceV2`.

### Why are the changes needed?

To prepare for the upgrade to Parquet 1.11.1.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

**Manual test**

```
mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.execution.datasources.parquet.ParquetV2SchemaPruningSuite -Dparquet.version=1.11.1 test -pl sql/core -am
```

**Before**

**After**

```
Run completed in 3 minutes, 46 seconds.
Total number of tests run: 134
Suites: completed 2, aborted 0
Tests: succeeded 134, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
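
The idea behind building `pushedParquetFilters` from `dataSchema` rather than `schema` can be sketched roughly as follows. This is a simplified, self-contained model and not Spark's actual code: `Field`, `EqualTo`, `pushable`, and the sample column names are hypothetical. It assumes that `schema` includes the partition columns, while `dataSchema` holds only the columns physically stored in the Parquet files.

```scala
// Hypothetical, simplified model; these are NOT Spark's real classes or APIs.
object PartitionFilterPushdownSketch {
  final case class Field(name: String)
  final case class EqualTo(column: String, value: Any)

  // Keep only the filters whose column exists in the given schema.
  def pushable(filters: Seq[EqualTo], schema: Seq[Field]): Seq[EqualTo] = {
    val names = schema.map(_.name).toSet
    filters.filter(f => names.contains(f.column))
  }

  // Assumption: data columns live in the Parquet files,
  // partition columns only in the directory layout.
  val dataSchema: Seq[Field] = Seq(Field("id"), Field("name"))
  val partitionSchema: Seq[Field] = Seq(Field("p"))
  val fullSchema: Seq[Field] = dataSchema ++ partitionSchema

  val filters: Seq[EqualTo] = Seq(EqualTo("id", 1), EqualTo("p", 1))

  def main(args: Array[String]): Unit = {
    // Built from the full schema: the partition filter on "p" is kept,
    // although "p" is not a column inside the Parquet files.
    println(pushable(filters, fullSchema))
    // Built from the data schema: only the filter on "id" remains.
    println(pushable(filters, dataSchema))
  }
}
```

In this sketch, filtering against `dataSchema` drops the predicate on the partition column `p`, so the file reader never sees a filter on a column that is absent from the files.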
