GitHub user nongli opened a pull request:

    https://github.com/apache/spark/pull/10209

    [SPARK-9372] [SQL] For joins, insert IS NOT NULL filters to children.

    Some join types and conditions imply that the join keys cannot be NULL, so
    rows with NULL keys can be filtered out in the children. This patch does
    this for inner joins and introduces a mechanism to generate predicates. The
    complex part is making sure the transformation is stable. The problem we
    want to avoid is generating a filter in the join, having it pushed down,
    and then having the join regenerate the filter.
    
    This patch solves this by having the join remember predicates that it has
    generated. This mechanism should be general enough that we can infer other
    predicates, for example "a join b where a.id = b.id AND a.id = 10" could
    also use this mechanism to generate the predicate "b.id = 10".
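
To make the stability concern concrete, here is a minimal toy sketch in Python. It is not the Catalyst code from this patch: the names `IsNotNull`, `Relation`, `InnerJoin`, `infer_filters`, `push_down` and `optimize` are all hypothetical, and the plan model is deliberately simplified. It only illustrates the idea of a join remembering the predicates it has generated so that repeated rule application reaches a fixed point.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IsNotNull:
    """Hypothetical predicate: the named column must not be NULL."""
    column: str


@dataclass
class Relation:
    """Hypothetical join child that accumulates pushed-down filters."""
    name: str
    filters: list = field(default_factory=list)


@dataclass
class InnerJoin:
    """Hypothetical inner join on left_key = right_key."""
    left: Relation
    right: Relation
    left_key: str
    right_key: str
    condition: list = field(default_factory=list)  # extra predicates held by the join
    generated: set = field(default_factory=set)    # predicates this join already produced


def infer_filters(join):
    """Add IS NOT NULL predicates for the join keys, skipping remembered ones."""
    new = [IsNotNull(key) for key in (join.left_key, join.right_key)
           if IsNotNull(key) not in join.generated]
    join.condition.extend(new)
    join.generated.update(new)
    return bool(new)


def push_down(join):
    """Move every single-column predicate from the join into the matching child."""
    moved = False
    for pred in list(join.condition):
        child = join.left if pred.column == join.left_key else join.right
        child.filters.append(pred)
        join.condition.remove(pred)
        moved = True
    return moved


def optimize(join, max_iterations=10):
    """Apply both rules until nothing changes; return the iteration count."""
    for iteration in range(1, max_iterations + 1):
        changed = infer_filters(join)
        changed = push_down(join) or changed
        if not changed:
            return iteration
    raise RuntimeError("rules did not reach a fixed point")


join = InnerJoin(Relation("a"), Relation("b"), "a.id", "b.id")
iterations = optimize(join)
print(iterations, join.left.filters, join.right.filters)
```

In this toy model, dropping the `generated` check makes `infer_filters` re-add the predicates on every pass, because `push_down` has already moved them out of the join, and `optimize` never converges.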

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nongli/spark spark-9372

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10209.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10209
    
----
commit 565d15ec8f0c319a97370b133317c11140da49b4
Author: Nong Li <[email protected]>
Date:   2015-12-08T23:50:25Z

    [SPARK-9372] [SQL] For joins, insert IS NOT NULL filters to children.
    

----

