GitHub user liancheng opened a pull request:
https://github.com/apache/spark/pull/10377
[SPARK-12218] Fixes ORC conjunction predicate push down
This PR is a follow-up of PR #10362.
Two major changes:
1. The fix introduced in #10362 is OK for Parquet, but may disable ORC PPD
in many cases
The fix introduced in #10362 stops converting an `AND` predicate if either
branch is inconvertible. `OrcFilters`, on the other hand, first combines all
filters into a single big conjunction and then tries to convert it into an ORC
`SearchArgument`. As a result, if any filter is inconvertible, no filters can
be pushed down. This PR fixes the issue by identifying all convertible
filters before doing the actual conversion.
The current implementation is shaped mostly by a limitation of the ORC
`SearchArgument` builder, which this PR documents in detail.
2. Copied the `AND` predicate fix for ORC from #10362 to avoid a merge
conflict.
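
The first change above can be illustrated with a small, self-contained
sketch. This is not the Spark `OrcFilters` code: the filter classes, `convert`,
`create_filter_naive` and `create_filter` are hypothetical names used only for
illustration, and a plain string stands in for an ORC `SearchArgument`.

```python
from dataclasses import dataclass
from functools import reduce
from typing import List, Optional, Union


@dataclass(frozen=True)
class EqualTo:
    attribute: str
    value: object


@dataclass(frozen=True)
class StringContains:
    # Stand-in for any filter that has no pushed-down equivalent.
    attribute: str
    value: str


@dataclass(frozen=True)
class And:
    left: "Filter"
    right: "Filter"


Filter = Union[EqualTo, StringContains, And]


def convert(f: Filter) -> Optional[str]:
    """Render a filter as a string, or return None if it is inconvertible."""
    if isinstance(f, EqualTo):
        return f"{f.attribute} = {f.value!r}"
    if isinstance(f, And):
        # An AND is convertible only if both branches are.
        left, right = convert(f.left), convert(f.right)
        if left is None or right is None:
            return None
        return f"({left} AND {right})"
    return None


def create_filter_naive(filters: List[Filter]) -> Optional[str]:
    # Old behaviour: build one big conjunction, then convert it.
    # A single inconvertible filter makes the whole conversion fail.
    return convert(reduce(And, filters)) if filters else None


def create_filter(filters: List[Filter]) -> Optional[str]:
    # New behaviour: keep only convertible filters, then build the conjunction.
    convertible = [f for f in filters if convert(f) is not None]
    return convert(reduce(And, convertible)) if convertible else None


filters = [EqualTo("a", 1), StringContains("b", "x"), EqualTo("c", 2)]
print(create_filter_naive(filters))  # None
print(create_filter(filters))        # (a = 1 AND c = 2)
```

Dropping an inconvertible filter from a top-level conjunction is safe here
only on the assumption that the engine still evaluates every filter after the
scan, so push-down acts purely as an optimization.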
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/liancheng/spark spark-12218.fix-orc-conjunction-ppd
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/10377.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #10377
----
commit 11486080f755ef39097d6cbbe851264b2d539ef1
Author: Cheng Lian <[email protected]>
Date: 2015-12-18T11:32:46Z
Fixes ORC conjunction predicate push down
----