GitHub user sameeragarwal opened a pull request:

    https://github.com/apache/spark/pull/11372

    [WIP][SPARK-13495][SQL] Add Null Filters in the query plan for 
Filters/Joins based on their data constraints

    ## What changes were proposed in this pull request?
    
    This PR adds an optimizer rule that avoids reading NULL values that are 
not required for correctness by inserting `isNotNull` filters in the query 
plan. These filters are currently inserted beneath existing `Filter` and 
`Join` operators and are inferred from their data constraints.
     
    Note: While this optimization is applicable to all types of join, it 
primarily benefits `Inner` and `LeftSemi` joins.
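
As a rough illustration of why such filters are safe for an inner equi-join, here is a minimal sketch in plain Python. It is a toy model and not the code in this PR: rows are dicts, `None` stands in for SQL NULL, and the helpers `inner_join` and `is_not_null` as well as the sample tables are hypothetical names made up for this example.

```python
# Toy model, not Spark code: rows are dicts and None stands in for SQL NULL.

def inner_join(left, right, left_key, right_key):
    """Inner equi-join with SQL semantics: a NULL key never matches anything."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if l[left_key] is not None
        and r[right_key] is not None
        and l[left_key] == r[right_key]
    ]

def is_not_null(rows, column):
    """Hypothetical stand-in for an inferred `isNotNull(column)` filter."""
    return [row for row in rows if row[column] is not None]

# Hypothetical sample data with NULL join keys on both sides.
orders = [
    {"order_id": 1, "cust": 10},
    {"order_id": 2, "cust": None},
    {"order_id": 3, "cust": 20},
]
customers = [
    {"cust_id": 10, "name": "a"},
    {"cust_id": None, "name": "b"},
    {"cust_id": 30, "name": "c"},
]

# Join without the extra filters.
plain = inner_join(orders, customers, "cust", "cust_id")

# Join with null filters pushed beneath both inputs.
filtered = inner_join(
    is_not_null(orders, "cust"),
    is_not_null(customers, "cust_id"),
    "cust",
    "cust_id",
)

# The result is unchanged, but the join sees fewer input rows.
assert plain == filtered
```

In this sketch the equality condition can never hold for a NULL key, so dropping those rows before the join does not change the result and only shrinks the inputs.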
    
    ## How was this patch tested?
    
    WIP


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sameeragarwal/spark gen-isnotnull

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11372.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11372
    
----
commit 15eac821b5328ce34ab2a279fad2e48f471ccbdc
Author: Sameer Agarwal <[email protected]>
Date:   2016-02-25T07:49:17Z

    optimizer rules

commit 06d74da3ad1fd2748c395a143cfdd9f99e16009c
Author: Sameer Agarwal <[email protected]>
Date:   2016-02-25T18:10:50Z

    Null filtering

----


