[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

mallman Tue, 11 Sep 2018 07:48:54 -0700

Github user mallman commented on the issue:

    https://github.com/apache/spark/pull/22357
  
    I have some bad news. The methods `testSchemaPruning` and 
`testMixedCasePruning` do not apply the configuration settings as expected. 
Fixing that reveals 6 failures in the mixed-case tests. One of those 
failing tests checks both the scan and the answer for a query with a 
filter condition.
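
    The comment does not say why the settings fail to apply. As a general illustration only, and not a diagnosis of the Spark test suite, one common way a test helper can miss its configuration is by wrapping the test registration rather than the test body. The sketch below uses hypothetical stand-ins (`conf`, `withConf`, `test`, `runAll`, and the key `pruning.enabled`) instead of any real Spark or ScalaTest API.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for a session config and a test registry.
// None of these names come from Spark or ScalaTest.
object ConfScopeSketch {
  val conf = mutable.Map("pruning.enabled" -> "false")
  val registered = mutable.ListBuffer.empty[(String, () => String)]

  // Sets `key` for the duration of `body`, then restores the previous value.
  def withConf[T](key: String, value: String)(body: => T): T = {
    val old = conf.get(key)
    conf(key) = value
    try body
    finally {
      old match {
        case Some(v) => conf(key) = v
        case None    => conf.remove(key)
      }
    }
  }

  // Registers a test. The body is not run here; it runs later in runAll().
  def test(name: String)(body: => String): Unit =
    registered += ((name, () => body))

  // Executes every registered test and reports the value each one observed.
  def runAll(): Map[String, String] =
    registered.map { case (name, body) => name -> body() }.toMap

  def main(args: Array[String]): Unit = {
    // The setting wraps registration only, so it is restored before the body runs.
    withConf("pruning.enabled", "true") {
      test("wrapped registration") { conf("pruning.enabled") }
    }
    // The setting wraps the body, so it is active while the body runs.
    test("wrapped body") {
      withConf("pruning.enabled", "true") { conf("pruning.enabled") }
    }
    runAll().foreach { case (name, seen) =>
      println(s"$name saw pruning.enabled=$seen")
    }
  }
}
```

    In this sketch the "wrapped registration" test observes `false` and the "wrapped body" test observes `true`, because the first setting has already been restored by the time the registered body executes.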
    
    Based on what I'm seeing, I think it's fair to say that schema pruning is 
broken under certain circumstances when using a table schema that includes 
column names with upper-case characters (note that the test schema for contacts 
in `ParquetSchemaPruningSuite.scala` includes no fields with upper-case 
characters).
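
    To show why mixed-case column names deserve their own test coverage, here is a minimal sketch of exact versus case-insensitive field lookup. It is an assumption-laden illustration, not Spark's resolution logic: the field list and the helpers `resolveExact` and `resolveIgnoringCase` are hypothetical.

```scala
// Hypothetical illustration of name resolution against a schema whose
// field names contain upper-case characters. Not Spark's implementation.
object MixedCaseLookupSketch {
  // Field names as they might be written in a data file.
  val fileFields: Seq[String] = Seq("id", "firstName", "Address")

  // Exact, case-sensitive match.
  def resolveExact(requested: String): Option[String] =
    fileFields.find(_ == requested)

  // Case-insensitive match that returns the name as stored in the file.
  def resolveIgnoringCase(requested: String): Option[String] =
    fileFields.find(_.equalsIgnoreCase(requested))

  def main(args: Array[String]): Unit = {
    // A requested name that differs from the stored name only by case.
    val requested = "firstname"
    println(s"exact: ${resolveExact(requested)}")
    println(s"ignoring case: ${resolveIgnoringCase(requested)}")
  }
}
```

    With an all-lower-case schema the two lookups always agree, so a test schema without upper-case characters cannot tell them apart. Here the exact lookup finds nothing for `firstname`, while the case-insensitive lookup returns `firstName`.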
    
    Fortunately, schema pruning is disabled by default, and I think it's still 
considered an "experimental" feature.
    
    I think that fixing `ParquetSchemaPruningSuite.scala` is pretty 
straightforward. Fixing the newly failing unit tests will take more effort.
    
    In any case, I will create an issue in Jira and submit a PR.


