GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/14228

    [SPARK-16583] [SQL] [WIP] Improve Partition Pruning in InMemoryTableScanExec

    #### What changes were proposed in this pull request?
    Currently, partition pruning in `InMemoryTableScanExec` can only use 
predicates in which either the left or the right side is a foldable 
expression. We can extend the existing framework to also use predicates in 
which both sides are attribute references. By comparing their minimum and 
maximum boundary values, we can prune the unnecessary partitions. The 
performance improvement depends on the data distribution and the predicates. 
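
As a rough illustration of the idea above (not the code in this patch, which is Scala inside Spark), the check can be modeled in plain Python. The names `ColumnStats`, `may_match`, `prune`, and the sample `batches` are hypothetical, and the sketch assumes null-free integer columns with a per-batch lower and upper bound for each attribute:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnStats:
    """Hypothetical per-batch statistics for one column (no nulls assumed)."""
    lower: int  # minimum value of the column within the cached batch
    upper: int  # maximum value of the column within the cached batch


def may_match(op: str, a: ColumnStats, b: ColumnStats) -> bool:
    """Return False only when no row in the batch can satisfy `a op b`."""
    if op == "<":
        return a.lower < b.upper
    if op == "<=":
        return a.lower <= b.upper
    if op == ">":
        return a.upper > b.lower
    if op == ">=":
        return a.upper >= b.lower
    if op == "=":
        # Equality is possible only if the two value ranges overlap.
        return a.lower <= b.upper and b.lower <= a.upper
    return True  # unrecognized operator: be conservative and keep the batch


def prune(op, left, right, batches):
    """Return the indices of the batches that still have to be scanned."""
    return [i for i, stats in enumerate(batches)
            if may_match(op, stats[left], stats[right])]


# Made-up statistics for three cached batches with columns `a` and `b`.
batches = [
    {"a": ColumnStats(1, 10), "b": ColumnStats(20, 30)},
    {"a": ColumnStats(15, 25), "b": ColumnStats(18, 22)},
    {"a": ColumnStats(40, 50), "b": ColumnStats(5, 9)},
]

print(prune("<", "a", "b", batches))  # batch 2 is skipped: min(a) >= max(b)
print(prune("=", "a", "b", batches))  # only the batch with overlapping ranges
```

The check is deliberately conservative: a batch is dropped only when the bounds prove that no row can match, so overlapping ranges always lead to a scan.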
    
    TODO: more examples here.
    
    #### How was this patch tested?
    TODO: add more test cases

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark inMemory

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14228.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14228
    
----
commit fd4794debb27cf766ce1fedd0b693b6b7f8b7073
Author: gatorsmile <[email protected]>
Date:   2016-07-11T21:18:25Z

    play1

commit 6cb4291758e27f2bd1c1dc1b4e2fe2ad528d1d1d
Author: gatorsmile <[email protected]>
Date:   2016-07-12T05:22:20Z

    test case fix.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
