GitHub user codingjaguar opened a pull request:
https://github.com/apache/spark/pull/10163
[SPARK-12161][SQL] Ignore order of predicates in cache matching
This PR improves `LogicalPlan.sameResult` so that semantically equivalent
queries whose predicates appear in a different order are still matched.
Consider an example:
Query 1: CACHE TABLE first AS SELECT * FROM table A WHERE A.id > 100 AND
A.id < 200;
Query 2: SELECT * FROM table A WHERE A.id < 200 AND A.id > 100;
Currently in Spark SQL, Query 2 cannot use the cached result of Query 1,
although the two queries are identical apart from the order of their
predicates.
We modified the comparison function `LogicalPlan.sameResult`. The idea is to
split the filter condition into a sequence of expressions and wrap that
sequence in a set. Comparing the sets, rather than comparing the conditions
literally, ignores the order of the predicates.
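The set-comparison idea can be illustrated with a small standalone sketch.
This is a toy model in Python, not the Catalyst code in this PR: the names
`Pred`, `And`, `split_conjuncts` and `same_condition` are hypothetical and
exist only for illustration, and predicates are treated as opaque strings.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Pred:
    """A leaf predicate such as `A.id > 100`, kept as opaque text (assumption)."""
    text: str


@dataclass(frozen=True)
class And:
    """A conjunction of two sub-expressions."""
    left: "Expr"
    right: "Expr"


Expr = Union[Pred, And]


def split_conjuncts(expr: Expr) -> FrozenSet[Pred]:
    """Flatten nested ANDs into the set of their leaf predicates."""
    if isinstance(expr, And):
        return split_conjuncts(expr.left) | split_conjuncts(expr.right)
    return frozenset([expr])


def same_condition(a: Expr, b: Expr) -> bool:
    """Compare two filter conditions while ignoring the order of conjuncts."""
    return split_conjuncts(a) == split_conjuncts(b)


# The two filter conditions from Query 1 and Query 2 above.
q1 = And(Pred("A.id > 100"), Pred("A.id < 200"))
q2 = And(Pred("A.id < 200"), Pred("A.id > 100"))
# A condition with a different bound, which must not match.
q3 = And(Pred("A.id < 200"), Pred("A.id > 150"))

print(q1 == q2)                # structural comparison depends on order
print(same_condition(q1, q2))  # set comparison does not
print(same_condition(q1, q3))  # different predicates still differ
```

Using a set also collapses repeated conjuncts, which is harmless for AND
because `p AND p` is equivalent to `p`. The sketch only flattens
conjunctions; it does not attempt the disjunctive case mentioned in the
commit log below.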
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/codingjaguar/spark master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/10163.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #10163
----
commit 579b5a24a726a0739284714ca35ffff4b6441537
Author: Jiang Chen <[email protected]>
Date: 2015-12-05T17:38:32Z
Add equivalentConditions()
commit 0a096983f86945415ef6fb91af0be400d4a05aaf
Author: windscope <[email protected]>
Date: 2015-12-05T18:04:57Z
set comparison for projection
commit 85f1ebca7ec568c84a2f08a131b3512c5a204526
Author: windscope <[email protected]>
Date: 2015-12-05T19:48:02Z
Fix set conversion bug
commit 1a2b534e01d0010d64fbe3b2382bd59eb0a28a4b
Author: windscope <[email protected]>
Date: 2015-12-05T20:13:42Z
Remove set comparison of projection
commit 5fcb85ca97e8f48aff7848e51e5dfb187597dbee
Author: windscope <[email protected]>
Date: 2015-12-05T21:02:32Z
Add test case for filter condition order
commit 8f93c6aa7a628b71e789385664352878d3e2fd3d
Author: windscope <[email protected]>
Date: 2015-12-05T21:32:53Z
Fix style error
commit 6eb6fddf82220c3181cd7151b573fe135d2e9c0a
Author: windscope <[email protected]>
Date: 2015-12-06T00:58:53Z
add testcase for SameResultSuite
commit 02fc878081da8ccd313fd60ed0ff81f9735794c0
Author: windscope <[email protected]>
Date: 2015-12-06T01:44:00Z
add testcase for OR split filter condition
commit bcb6df01a1706d92d728c4dce02a600be88f3fd9
Author: Jiang Chen <[email protected]>
Date: 2015-12-06T02:04:50Z
Supported expressions with disjunctive predicates;
refactor cleanArgs so that we can reuse cleanExpression().
commit 360bb2b9169f1ae030c040bec4e035f2ce8dc0c7
Author: windscope <[email protected]>
Date: 2015-12-06T02:07:00Z
Merge branch 'jiang.filter-set' of github.com:codingjaguar/spark into
jiang.filter-set
commit 94837d697c94a8f83bd4384f5681321b5cfe5d97
Author: Jiang Chen <[email protected]>
Date: 2015-12-06T02:46:27Z
Merge branch 'jiang.filter-set'
commit 0de3d7e10789b5e46e67f50942e607b8f229f64d
Author: windscope <[email protected]>
Date: 2015-12-06T02:07:00Z
Merge branch 'jiang.filter-set' of github.com:codingjaguar/spark into
jiang.filter-set
commit 13ce03f2f5132eb8264e2f8d785410a8a94efec0
Author: Jiang Chen <[email protected]>
Date: 2015-12-06T02:50:08Z
Removed dead code
commit 9f6df41f67540765abd646e917c13237f4af2147
Author: Jiang Chen <[email protected]>
Date: 2015-12-06T02:50:26Z
Merge branch 'jiang.filter-set' of github.com:codingjaguar/spark into
jiang.filter-set
commit e63df887670817937c2cd2da57aa8d5f06553ce7
Author: Jiang Chen <[email protected]>
Date: 2015-12-06T02:52:11Z
Merge branch 'jiang.filter-set'
----