GitHub user mallman commented on the issue:
https://github.com/apache/spark/pull/22357
I have some bad news. The methods `testSchemaPruning` and
`testMixedCasePruning` do not set the configuration settings as expected.
Fixing that reveals 6 failures among the mixed-case tests. One of those
failing tests checks the scan and answer for a query with a filter
condition.
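The comment does not say how the settings get lost. One common way this can happen in a ScalaTest-style suite is that the configuration is applied around the call that registers a test, not around the test body, so the setting is already restored by the time the body runs. The sketch below illustrates only that general pitfall, assuming it applies here. It uses no Spark or ScalaTest code, and every name in it (`ConfScopeSketch`, `withConf`, `test`, `pruning.enabled`) is hypothetical.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for a session configuration and a test registry.
// None of these names come from Spark or ScalaTest.
object ConfScopeSketch {
  private val conf = mutable.Map[String, String]()
  private val registered = mutable.ArrayBuffer[(String, () => Unit)]()
  private val observed = mutable.Map[String, Option[String]]()

  // Set the given keys, evaluate the body, then restore the previous values.
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    val saved = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally saved.foreach {
      case (k, Some(old)) => conf(k) = old
      case (k, None)      => conf.remove(k)
    }
  }

  // Register a test. The body is stored here and only evaluated in runAll().
  def test(name: String)(body: => Unit): Unit =
    registered += ((name, () => body))

  def register(): Unit = {
    // Pitfall: the setting is active only while the test is being registered.
    withConf("pruning.enabled" -> "true") {
      test("conf outside test") {
        observed("outside") = conf.get("pruning.enabled")
      }
    }
    // Alternative: the setting is active while the test body runs.
    test("conf inside test") {
      withConf("pruning.enabled" -> "true") {
        observed("inside") = conf.get("pruning.enabled")
      }
    }
  }

  // Run every registered test and return what each body saw for the key.
  def runAll(): Map[String, Option[String]] = {
    registered.foreach { case (_, body) => body() }
    observed.toMap
  }

  def main(args: Array[String]): Unit = {
    register()
    runAll().foreach { case (name, value) => println(s"$name -> $value") }
  }
}
```

In this sketch the first test body never sees the setting, while the second does. A test written the first way can pass without ever exercising the configuration it was meant to cover.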
Based on what I'm seeing, I think it's fair to say that schema pruning is
broken under certain circumstances when using a table schema that includes
column names with upper-case characters (note that the test schema for contacts
in `ParquetSchemaPruningSuite.scala` includes no fields with upper-case
characters).
Fortunately, schema pruning is disabled by default, and I think it's still
considered "experimental" technology.
I think that fixing `ParquetSchemaPruningSuite.scala` is pretty
straightforward. Fixing the newly failing unit tests will be more effort.
In any case, I will create an issue in Jira and submit a PR.