GitHub user sameeragarwal opened a pull request:
https://github.com/apache/spark/pull/11665
[SPARK-XXXX][SQL] Support for inferring filters from data constraints
## What changes were proposed in this pull request?
**[I'll link it to the JIRA once ASF JIRA is back online]**
This PR generalizes the `NullFiltering` optimizer rule in Catalyst into
`InferFiltersFromConstraints`, a rule that automatically infers all relevant
filters from an operator's constraints while ensuring two things:
(a) no redundant filters are generated, and
(b) filters that do not contribute to any further optimizations are not
generated.
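
To illustrate the idea, here is a minimal toy model in Python. It is not the Catalyst implementation from this PR; the predicate encoding and the helper names (`expand_constraints`, `columns_of`, `infer_filters`) are hypothetical and exist only for this sketch. It assumes that equality and greater-than are null-intolerant, that comparisons can be copied across an equi-join key, and it approximates point (b) by keeping only predicates that reference columns of a single join side.

```python
# Toy model only: predicates are plain tuples, not Catalyst expressions.
#   ("eq", col, col)      equi-join condition
#   ("gt", col, literal)  comparison against a literal
#   ("notnull", col)      IS NOT NULL


def expand_constraints(constraints):
    """Close a constraint set under two simple inference rules."""
    result = set(constraints)
    while True:
        new = set()
        for p in result:
            if p[0] == "eq":
                _, left, right = p
                # Equality never holds for NULL, so both sides are non-null.
                new.add(("notnull", left))
                new.add(("notnull", right))
                # Copy comparisons across the equality in both directions.
                for q in result:
                    if q[0] == "gt" and q[1] == left:
                        new.add(("gt", right, q[2]))
                    if q[0] == "gt" and q[1] == right:
                        new.add(("gt", left, q[2]))
            elif p[0] == "gt":
                # A comparison never holds for NULL either.
                new.add(("notnull", p[1]))
        if new <= result:
            return result
        result |= new


def columns_of(predicate):
    """Columns referenced by a predicate."""
    if predicate[0] == "eq":
        return {predicate[1], predicate[2]}
    return {predicate[1]}


def infer_filters(constraints, existing_filters, side_columns):
    """New filters for one join side: inferred, single-sided, not redundant."""
    inferred = expand_constraints(constraints)
    pushable = {p for p in inferred if columns_of(p) <= side_columns}
    return pushable - set(existing_filters)


# Join on a.x = b.y with a pre-existing filter a.x > 5 on the left side.
constraints = {("eq", "a.x", "b.y"), ("gt", "a.x", 5)}
left_filters = infer_filters(constraints, {("gt", "a.x", 5)}, {"a.x"})
right_filters = infer_filters(constraints, set(), {"b.y"})
```

In this sketch the left side gains only the non-null filter on `a.x`, because `a.x > 5` already exists there, while the right side gains both the non-null filter on `b.y` and the propagated comparison `b.y > 5`.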
## How was this patch tested?
Extended all tests in `InferFiltersFromConstraintsSuite` (which were
initially based on `NullFilteringSuite`) to test filter inference in `Filter`
and `Join` operators.
In particular, the 2 tests (`single inner join with pre-existing filters:
filter out values on either side` and `multiple inner joins: filter out values
on all sides on equi-join keys`) attempt to highlight/test the real potential
of this rule for join optimization.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sameeragarwal/spark infer-filters
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/11665.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #11665
----
commit 308e93c4854d38c99c37919383aec85b5272fdf9
Author: Sameer Agarwal <[email protected]>
Date: 2016-03-11T07:51:50Z
Add InferFiltersFromConstraints rule
----