GitHub user andreweduffy opened a pull request:
https://github.com/apache/spark/pull/14159
[PARQUET] Fix for Parquet filter pushdown
## What changes were proposed in this pull request?
Fix Parquet filter pushdown so that it reaches all the way down to the file
level.
The previously used deprecated constructor defaults to null metadata, which
prevents pushdown from reaching the Parquet level.
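
To illustrate the failure mode, here is a minimal, self-contained sketch. It is a simplified model and not the Parquet API: `FileMeta`, `RowGroup`, `SketchReader` and `PushdownSketch` are hypothetical names, and the rule that statistics are only trusted when the file's writer is known is an assumption made for the example. It shows how a constructor overload that defaults metadata to null can silently disable statistics-based filtering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; none of these types are real Parquet classes.
final class FileMeta {
    final String createdBy;
    FileMeta(String createdBy) { this.createdBy = createdBy; }
}

final class RowGroup {
    final int min;
    final int max;
    RowGroup(int min, int max) { this.min = min; this.max = max; }
}

final class SketchReader {
    private final FileMeta meta;          // null when the old constructor is used
    private final List<RowGroup> groups;

    /** Old-style constructor: takes no metadata, so it defaults to null. */
    @Deprecated
    SketchReader(List<RowGroup> groups) { this(null, groups); }

    SketchReader(FileMeta meta, List<RowGroup> groups) {
        this.meta = meta;
        this.groups = groups;
    }

    /** Assumption: statistics are trusted only when the writer is known. */
    boolean statisticsUsable() {
        return meta != null && meta.createdBy != null && !meta.createdBy.isEmpty();
    }

    /** Row groups that must be read to evaluate the predicate value == target. */
    List<RowGroup> rowGroupsFor(int target) {
        if (!statisticsUsable()) {
            return new ArrayList<>(groups);   // no pushdown: read everything
        }
        List<RowGroup> kept = new ArrayList<>();
        for (RowGroup g : groups) {
            if (g.min <= target && target <= g.max) {
                kept.add(g);                  // min/max range may contain target
            }
        }
        return kept;
    }
}

class PushdownSketch {
    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        List<RowGroup> groups = Arrays.asList(
            new RowGroup(0, 9), new RowGroup(10, 19), new RowGroup(20, 29));

        SketchReader before = new SketchReader(groups);
        SketchReader after = new SketchReader(new FileMeta("example-writer"), groups);

        System.out.println(before.rowGroupsFor(15).size()); // prints 3
        System.out.println(after.rowGroupsFor(15).size());  // prints 1
    }
}
```

In this model the filter is still applied correctly in both cases; only the amount of data read differs, which is why the problem shows up as warnings rather than wrong results.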
## How was this patch tested?
Checked by looking at the output of collect calls in the Spark shell. Before
this patch, warnings about CorruptStatistics were printed and filters were not
pushed down to individual Parquet files. With this patch, the metadata in each
file is used for pushdown.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/andreweduffy/spark bugfix/pushdown
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/14159.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #14159
----
commit f825ad709cdc3c89d0cc7e41d0410998e6cc7541
Author: Andrew Duffy <[email protected]>
Date: 2016-07-12T19:41:22Z
Fix for Parquet filter pushdown
Use of previous deprecated constructor defaults to null metadata, which
prevents pushdown from reaching the Parquet level.
----