GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/21603
[SPARK-17091][SQL] Add rule to convert IN predicate to equivalent Parquet
filter
## What changes were proposed in this pull request?
Add a new optimizer rule to convert an IN predicate to an equivalent
Parquet filter.
The original PR is: https://github.com/apache/spark/pull/18424
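As a rough illustration of the idea (not the code in this patch), an IN predicate over a small set of literals can be rewritten as a chain of OR-ed equality filters, which a Parquet reader can evaluate. The sketch below is a standalone Python model: `Eq`, `Or`, and `in_to_filter` are hypothetical names, not Spark or Parquet APIs. The default of 20 comes from the second commit message in this pull request; treating it as an inclusive upper bound on the number of distinct values, and skipping the conversion for an empty list, are assumptions.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Optional, Sequence, Union


@dataclass(frozen=True)
class Eq:
    """Hypothetical stand-in for an equality filter on one column."""
    column: str
    value: Any


@dataclass(frozen=True)
class Or:
    """Hypothetical stand-in for a disjunction of two filters."""
    left: "Filter"
    right: "Filter"


Filter = Union[Eq, Or]


def in_to_filter(column: str, values: Sequence[Any],
                 threshold: int = 20) -> Optional[Filter]:
    """Rewrite `column IN (values)` as OR-ed equality filters.

    Returns None when the list is empty or has more distinct values
    than `threshold`, meaning no filter is produced (assumed behavior).
    """
    # Drop duplicate literals while keeping first-seen order.
    distinct = list(dict.fromkeys(values))
    if not distinct or len(distinct) > threshold:
        return None
    equalities = [Eq(column, v) for v in distinct]
    # Fold left: Or(Or(eq1, eq2), eq3), or a bare Eq for a single value.
    return reduce(Or, equalities)
```

For example, `in_to_filter("id", [1, 2, 3])` yields a left-nested `Or` of three `Eq` filters, while a list with more than 20 distinct values yields `None`. Capping the list length keeps the generated filter tree small.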
## How was this patch tested?
Unit tests and manual tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/wangyum/spark SPARK-17091
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21603.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21603
----
commit 264eed81e33d3af7d7ea50a3a49866dde18f163b
Author: Yuming Wang <yumwang@...>
Date: 2018-06-21T04:35:20Z
Convert IN predicate to Parquet filter push-down
commit 4f96881af4af6f613c049f3756ee3aba518ceab8
Author: Yuming Wang <yumwang@...>
Date: 2018-06-21T04:49:12Z
Change threshold to 20.
----