alamb opened a new issue #656:
URL: https://github.com/apache/arrow-datafusion/issues/656


   **Describe the bug**
   Predicate pruning no longer occurs for queries against parquet files
   
   **To Reproduce**
   Run a query against a parquet file with multiple row groups, using a predicate 
that could be used to prune. No pruning occurs.
   
   **Expected behavior**
   The predicate should be able to eliminate some row groups
   
   **Additional context**
   While updating IOx to use the latest datafusion in 
https://github.com/influxdata/influxdb_iox/pull/1799, I discovered another place 
where https://github.com/apache/arrow-datafusion/pull/55 has caused some issues.
   
   Basically, the predicates that get pushed down to the parquet exec scan are 
now fully qualified, for example `#foo.bar > 5`. However, the parquet schema 
only has columns named `bar`, and thus the code cannot match them up.
   
   The reason this was not caught in #55 is that there is no end-to-end test of 
parquet that exercises the entire path.
   
   The fix for this issue is fairly straightforward (strip the qualifiers 
from the expressions), but the end-to-end test is quite involved. I 
plan to fix this in two PRs.
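   
   A minimal sketch of the qualifier-stripping idea follows. It uses a toy 
expression type, not DataFusion's actual `Expr`; the names `Expr`, `unqualify`, 
and `column_names` are hypothetical and exist only for illustration. It shows 
why `#foo.bar > 5` fails to match a schema that only knows `bar`, and how 
dropping the qualifier restores the match.

```rust
// Toy expression tree for illustration only; this is NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// Column reference with an optional table qualifier, e.g. `foo.bar`.
    Column { relation: Option<String>, name: String },
    /// Integer literal.
    Literal(i64),
    /// Binary operation such as `left > right`.
    Binary { left: Box<Expr>, op: &'static str, right: Box<Expr> },
}

/// Recursively rebuild the expression with every column qualifier removed.
fn unqualify(expr: &Expr) -> Expr {
    match expr {
        Expr::Column { name, .. } => Expr::Column {
            relation: None,
            name: name.clone(),
        },
        Expr::Literal(v) => Expr::Literal(*v),
        Expr::Binary { left, op, right } => Expr::Binary {
            left: Box::new(unqualify(left)),
            op: *op,
            right: Box::new(unqualify(right)),
        },
    }
}

/// Collect the column names an expression refers to, as they would be
/// looked up in a schema (qualified names render as `relation.name`).
fn column_names(expr: &Expr) -> Vec<String> {
    match expr {
        Expr::Column { relation: Some(r), name } => vec![format!("{}.{}", r, name)],
        Expr::Column { relation: None, name } => vec![name.clone()],
        Expr::Literal(_) => vec![],
        Expr::Binary { left, right, .. } => {
            let mut names = column_names(left);
            names.extend(column_names(right));
            names
        }
    }
}
```

   With a file schema containing only `bar`, `column_names` on the qualified 
predicate yields `foo.bar`, which is not in the schema, so the predicate cannot 
be used for pruning. After `unqualify`, the same call yields `bar` and the 
lookup succeeds.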
   



