GitHub user antonoal opened a pull request:
https://github.com/apache/spark/pull/11777
Added transitive closure transformation to Catalyst
## What changes were proposed in this pull request?
A relatively simple transformation is missing from Catalyst's arsenal:
the generation of transitive predicates. For instance, if you have the
following query:
`select *
from table1 t1
join table2 t2
on t1.a = t2.b
where t1.a = 42`
then t2.b must also equal 42 in every joined row, so an additional predicate
`t2.b = 42` can be generated. That predicate can in turn be pushed down
through the join, improving performance of the whole query by filtering out
rows before they are joined.
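The inference described above can be sketched outside of Catalyst. The following is a minimal Python illustration, not the code in this PR: `Predicate` and `infer_transitive` are hypothetical names, predicates are modeled as (column, operator, constant) triples, and only inner-join equalities are assumed (outer joins would need extra care).

```python
from typing import NamedTuple


class Predicate(NamedTuple):
    """Hypothetical model of a filter: <column> <op> <constant>."""
    column: str    # e.g. "t1.a"
    op: str        # e.g. "=", ">="
    value: object  # a constant operand (must be hashable)


def infer_transitive(equalities, predicates):
    """Return predicates implied by column equalities but not already present.

    equalities: iterable of (left_column, right_column) pairs, e.g. taken
                from inner-join conditions such as t1.a = t2.b.
    predicates: list of Predicate values over single columns.
    """
    # Group columns into equivalence classes with a small union-find.
    parent = {}

    def find(col):
        parent.setdefault(col, col)
        while parent[col] != col:
            parent[col] = parent[parent[col]]  # path halving
            col = parent[col]
        return col

    for left, right in equalities:
        parent[find(left)] = find(right)

    classes = {}
    for col in list(parent):
        classes.setdefault(find(col), set()).add(col)

    existing = set(predicates)
    inferred = []
    for pred in predicates:
        if pred.column not in parent:
            continue  # column takes part in no equality
        for other in sorted(classes[find(pred.column)]):
            candidate = pred._replace(column=other)
            if candidate not in existing:
                existing.add(candidate)
                inferred.append(candidate)
    return inferred


# The query from the description: join on t1.a = t2.b, filter t1.a = 42.
new_predicates = infer_transitive(
    equalities=[("t1.a", "t2.b")],
    predicates=[Predicate("t1.a", "=", 42)],
)
print(new_predicates)
```

For the example query this yields a single new predicate on `t2.b`, which is the one that could then be pushed below the join.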
Such a transformation exists in Oracle DB.
Please note that in this PR a transitive predicate is created for the
following operations:
* a BinaryComparison (=, >=, etc.) against a foldable expression
* an In, e.g. in (1, 2, 3), where all the values in the list are foldable
* a Not of any of the above
* an Or of any of the above
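The eligibility rules listed above can be illustrated with a small recursive check. This is a hedged Python sketch over a toy expression tree, not Catalyst's Scala classes: the node types (`Col`, `Lit`, `Cmp`, `In`, `Not`, `Or`) and `is_eligible` are hypothetical, "foldable" is approximated as "is a literal", and the exact operator set is an assumption based on the "(=, >=, etc.)" wording.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: object


@dataclass(frozen=True)
class Cmp:
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class In:
    col: Col
    values: Tuple[object, ...]


@dataclass(frozen=True)
class Not:
    child: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


# Assumed operator set; the description only says "=, >=, etc.".
COMPARISONS = {"=", "<", "<=", ">", ">="}


def is_foldable(expr):
    # Simplification: treat only literals as foldable.
    return isinstance(expr, Lit)


def is_eligible(expr):
    """True if expr matches one of the shapes listed in the description."""
    if isinstance(expr, Cmp):
        col_vs_const = isinstance(expr.left, Col) and is_foldable(expr.right)
        const_vs_col = is_foldable(expr.left) and isinstance(expr.right, Col)
        return expr.op in COMPARISONS and (col_vs_const or const_vs_col)
    if isinstance(expr, In):
        return all(is_foldable(v) for v in expr.values)
    if isinstance(expr, Not):
        return is_eligible(expr.child)
    if isinstance(expr, Or):
        return is_eligible(expr.left) and is_eligible(expr.right)
    return False


print(is_eligible(Cmp(">=", Col("t1.a"), Lit(42))))
print(is_eligible(Cmp("=", Col("t1.a"), Col("t2.b"))))
```

The sketch does not model whether both branches of an Or must refer to the same attribute; the description does not say, so that detail is left open here.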
## How was this patch tested?
I've added a new TransitiveClosureSuite with a series of unit tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/antonoal/spark transitive-closure
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/11777.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #11777
----
commit 7df4117749f7afc2e5e95190cf93a961b9c6ed3a
Author: Alex Antonov <[email protected]>
Date: 2016-03-16T21:53:38Z
Added transitive closure transformation to Catalyst
----