GitHub user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/511#issuecomment-41361251
  
    But are there any realistic workloads where you'd want to turn this on all 
the time, or turn it off all the time? It seems that in an ad-hoc query 
workload, you'll have some queries that can use this, and some that can't. You 
should just pick whether you want it as a default. Personally I'd go for it 
unless the cost is super high in the cases where it doesn't work, because I 
imagine filtering is pretty common in large schemas and I hope Parquet itself 
optimizes this down the line.


