[GitHub] spark pull request #15558: [SPARK-17357][SPARK-6624][SQL] Convert filter pre...

2017-02-22 Thread viirya
GitHub user viirya closed the pull request at:

https://github.com/apache/spark/pull/15558


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15558: [SPARK-17357][SPARK-6624][SQL] Convert filter pre...

2016-10-19 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/15558

[SPARK-17357][SPARK-6624][SQL] Convert filter predicate to CNF in Optimizer 
for pushdown

## What changes were proposed in this pull request?

This PR addresses the problem that #14912 previously tried to solve. In 
short, some predicates currently cannot be pushed down through operators 
because they are written as a disjunction (a series of ORs).

A simple example is (a > 10) || (b > 2 && c == 3). If a data source has only 
attributes a and b, this filter predicate cannot be pushed down, because it 
also references c. If we convert it to CNF, (a > 10 || b > 2) && (a > 10 || 
c == 3), we can push down the conjunct (a > 10 || b > 2).
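
The conversion in this example can be sketched as follows. This is only an 
illustrative Python model, not the Catalyst rule added by this PR: the names 
`Atom`, `And`, `Or`, `to_cnf`, `conjuncts`, `refs` and `pushable` are all 
hypothetical, negation is not handled, and repeatedly distributing OR over AND 
can grow the expression exponentially in the worst case.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Union


@dataclass(frozen=True)
class Atom:
    text: str             # printable form, e.g. "a > 10"
    refs: FrozenSet[str]  # attributes the comparison reads


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


Expr = Union[Atom, And, Or]


def to_cnf(e: Expr) -> Expr:
    """Distribute OR over AND until no AND sits below an OR."""
    if isinstance(e, And):
        return And(to_cnf(e.left), to_cnf(e.right))
    if isinstance(e, Or):
        l, r = to_cnf(e.left), to_cnf(e.right)
        if isinstance(l, And):  # (x && y) || r  ->  (x || r) && (y || r)
            return And(to_cnf(Or(l.left, r)), to_cnf(Or(l.right, r)))
        if isinstance(r, And):  # l || (x && y)  ->  (l || x) && (l || y)
            return And(to_cnf(Or(l, r.left)), to_cnf(Or(l, r.right)))
        return Or(l, r)
    return e


def conjuncts(e: Expr) -> List[Expr]:
    """Split a CNF expression into its top-level AND-ed clauses."""
    if isinstance(e, And):
        return conjuncts(e.left) + conjuncts(e.right)
    return [e]


def refs(e: Expr) -> FrozenSet[str]:
    """Collect every attribute referenced anywhere in the expression."""
    if isinstance(e, Atom):
        return e.refs
    return refs(e.left) | refs(e.right)


def show(e: Expr) -> str:
    if isinstance(e, Atom):
        return e.text
    op = "&&" if isinstance(e, And) else "||"
    return f"({show(e.left)} {op} {show(e.right)})"


def pushable(e: Expr, available: Set[str]) -> List[Expr]:
    """Clauses of the CNF form that only read the available attributes."""
    return [c for c in conjuncts(to_cnf(e)) if refs(c) <= available]


a = Atom("a > 10", frozenset({"a"}))
b = Atom("b > 2", frozenset({"b"}))
c = Atom("c == 3", frozenset({"c"}))
predicate = Or(a, And(b, c))  # (a > 10) || (b > 2 && c == 3)

pushed = pushable(predicate, {"a", "b"})
print(show(to_cnf(predicate)))
print([show(p) for p in pushed])
```

With attributes a and b available, only the first clause of the CNF form is 
selected; the clause that mentions c stays in the filter above the data source.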

Converting the predicate to CNF solves this in a principled way, instead of 
the ad hoc approach taken in #14912.

There have been previous PRs for CNF conversion, such as #8200. Most of the 
tests added in `CNFNormalizationSuite` are copied from #8200.

## How was this patch tested?

Jenkins tests.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 filter-cnf

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15558.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15558


commit baac6327b5a9c1a234e34da538a72d8ef87a9e35
Author: Liang-Chi Hsieh 
Date:   2016-10-06T14:47:34Z

Convert filter predicate to CNF in Optimizer.

commit c0637b26808aed386c4d937ebca44958e9f89c09
Author: Liang-Chi Hsieh 
Date:   2016-10-07T02:49:35Z

Improve test.

commit f0872fe8b208ddda6e2cb335f9c6a58a195a0960
Author: Liang-Chi Hsieh 
Date:   2016-10-07T02:50:08Z

improve test.

commit 62a23691be61f33fa079520e00b573b4ad4aaf3e
Author: Liang-Chi Hsieh 
Date:   2016-10-19T15:35:01Z

Merge remote-tracking branch 'upstream/master' into filter-cnf

commit 5343947cfeb287e1f0e02e472cc2ada441c671a4
Author: Liang-Chi Hsieh 
Date:   2016-10-19T15:36:53Z

Add comments.



