GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/10942

    [SPARK-12850] [SQL] Support Bucket Pruning (Predicate Pushdown for Bucketed Tables)

    JIRA: https://issues.apache.org/jira/browse/SPARK-12850
    
    This PR adds support for bucket pruning when the predicate is `EqualTo`,
    `EqualNullSafe`, `IsNull`, `In`, or `InSet`.
    
    As in Hive, bucket pruning in this PR applies only when the bucketing key
    consists of exactly one column.
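
    To illustrate the idea, here is a minimal, self-contained sketch of how a
    supported predicate on a single bucketing column could be mapped to the set
    of buckets that need to be scanned. It is not the code in this PR: the case
    classes are simplified stand-ins for the Catalyst expressions, and
    `bucketIdFor`, `bucketsToScan`, and the `hashCode`-based bucket assignment
    are assumptions made for the example. Spark's actual hash function may
    differ.

```scala
// Simplified stand-ins for the predicate types named above.
// These are illustrative case classes, not Spark's Catalyst expressions.
sealed trait Predicate
case class EqualTo(column: String, value: Any) extends Predicate
case class EqualNullSafe(column: String, value: Any) extends Predicate
case class IsNull(column: String) extends Predicate
case class In(column: String, values: Seq[Any]) extends Predicate
case class InSet(column: String, values: Set[Any]) extends Predicate

object BucketPruningSketch {
  // Hypothetical bucket assignment: non-negative modulo of the value's hashCode.
  // Nulls are assumed to hash to 0. The real hash function may differ.
  def bucketIdFor(value: Any, numBuckets: Int): Int = {
    val h = if (value == null) 0 else value.hashCode
    ((h % numBuckets) + numBuckets) % numBuckets
  }

  // Returns Some(bucket ids to scan) when pruning applies,
  // or None when every bucket has to be read.
  def bucketsToScan(
      pred: Predicate,
      bucketColumns: Seq[String],
      numBuckets: Int): Option[Set[Int]] =
    bucketColumns match {
      // Pruning is only attempted for a single-column bucketing key.
      case Seq(key) =>
        pred match {
          case EqualTo(`key`, v)       => Some(Set(bucketIdFor(v, numBuckets)))
          case EqualNullSafe(`key`, v) => Some(Set(bucketIdFor(v, numBuckets)))
          case IsNull(`key`)           => Some(Set(bucketIdFor(null, numBuckets)))
          case In(`key`, vs)           => Some(vs.map(bucketIdFor(_, numBuckets)).toSet)
          case InSet(`key`, vs)        => Some(vs.map(bucketIdFor(_, numBuckets)))
          case _                       => None // predicate on another column
        }
      case _ => None // zero or multiple bucketing columns: no pruning
    }
}
```

    With 8 buckets and a bucketing key `id`, a filter such as `id IN (1, 9, 2)`
    would touch only two buckets under this hypothetical hash, while any
    predicate on a non-bucketing column falls back to a full scan.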
    
    So far, I have not found a way to verify how many buckets are actually
    scanned, though I did check it manually while debugging. Could you suggest
    how to do this properly? Thank you! @cloud-fan @yhuai @rxin @marmbrus
    
    BTW, we can add more cases to support complex predicates, including `Or` and
    `And`. Please let me know if I should do that in this PR.
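
    As a rough sketch of what such an extension might look like (this is not
    part of the PR, and `Filter`, `Leaf`, and `prune` are hypothetical names),
    the bucket sets of the two sides could be intersected for `And` and
    unioned for `Or`. A side that cannot be pruned is represented as `None`,
    meaning "all buckets".

```scala
// Hypothetical filter tree. A Leaf carries the pruning result of a simple
// predicate: Some(buckets) if it could be pruned, None if all buckets are needed.
sealed trait Filter
case class Leaf(buckets: Option[Set[Int]]) extends Filter
case class And(left: Filter, right: Filter) extends Filter
case class Or(left: Filter, right: Filter) extends Filter

object CompoundPruningSketch {
  def prune(f: Filter): Option[Set[Int]] = f match {
    case Leaf(b) => b
    case And(l, r) =>
      (prune(l), prune(r)) match {
        // Rows must satisfy both sides, so only common buckets can match.
        case (Some(a), Some(b)) => Some(a intersect b)
        // One side is unconstrained: the other side alone bounds the scan.
        case (Some(a), None)    => Some(a)
        case (None, Some(b))    => Some(b)
        case (None, None)       => None
      }
    case Or(l, r) =>
      // Rows may satisfy either side; if one side needs all buckets, so does the Or.
      for { a <- prune(l); b <- prune(r) } yield a union b
  }
}
```

    Under these rules an `And` can only narrow the set of buckets, whereas an
    `Or` with one non-prunable branch disables pruning altogether.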
    
    We may also need to add test cases to verify that bucket pruning works
    correctly for each data type.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark pruningBuckets

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10942.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10942
    
----
commit 959b498506ad737f36c65edeb43937a36a1b9255
Author: gatorsmile <[email protected]>
Date:   2016-01-27T05:39:14Z

    bucket pruning.

commit 5b2142cce0e8ce29a8d09db1de86fb9baeb00c4b
Author: gatorsmile <[email protected]>
Date:   2016-01-27T05:40:24Z

    Merge remote-tracking branch 'upstream/master' into pruningBuckets

----


