Github user andreweduffy commented on the issue:
https://github.com/apache/spark/pull/14671
Yeah, benchmarking is definitely a great idea, since Spark will likely be
faster than Parquet at filtering individual records. But I'm still not quite
understanding why this filter is any different, or why it should block on a
row-by-row filtering decision. To my understanding, _all_ pushed-down filters
are evaluated row-by-row through ParquetRecordReader, and this one is no
different from any of the others in ParquetFilters.
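To make the row-by-row point concrete, here is a minimal conceptual sketch in plain Scala. The `Record` type and `rowByRowFilter` helper are hypothetical stand-ins, not the actual ParquetRecordReader or ParquetFilters APIs; the point is only that a pushed-down predicate, like any other, is ultimately applied to each materialized record in turn:

```scala
// Hypothetical stand-in for a materialized row; not a real Parquet type.
case class Record(id: Int, name: String)

// A pushed-down predicate, analogous to an entry built in ParquetFilters,
// is still evaluated against every record the reader produces.
def rowByRowFilter(records: Seq[Record], pred: Record => Boolean): Seq[Record] =
  records.filter(pred)

val records = Seq(Record(1, "a"), Record(2, "b"), Record(3, "c"))
val kept = rowByRowFilter(records, _.id > 1)
println(kept.map(_.id).mkString(","))  // prints 2,3
```

Under that view, any one filter in ParquetFilters costs the same per-record check as the rest, which is why this one doesn't look special.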