GitHub user davidnavas opened a pull request:

    https://github.com/apache/spark/pull/15084

    [SPARK-17529][core] Implement BitSet.clearUntil and use it during merge joins

    ## What changes were proposed in this pull request?
    
    Add a clearUntil() method to BitSet (adapted from the pre-existing setUntil() method).
    Use this method to clear only the subset of the BitSet that is needed during merge joins.
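
For readers unfamiliar with word-based bit sets, the idea can be sketched as follows. This is a minimal illustration in Java, not the code from the patch (Spark's BitSet is written in Scala); the class name `SimpleBitSet` and its layout are assumptions made for this example. It shows why clearing a prefix touches only the words below the index plus one masked update of the last, partially covered word.

```java
import java.util.Arrays;

// Hypothetical stand-in for a word-based bit set; not Spark's actual class.
class SimpleBitSet {
    private final long[] words;

    SimpleBitSet(int numBits) {
        // One 64-bit word per 64 bits, rounded up.
        words = new long[(numBits + 63) >> 6];
    }

    void set(int index) {
        words[index >> 6] |= 1L << (index & 0x3f);
    }

    boolean get(int index) {
        return (words[index >> 6] & (1L << (index & 0x3f))) != 0;
    }

    // Clears every bit: always writes the whole backing array.
    void clear() {
        Arrays.fill(words, 0L);
    }

    // Sets all bits in [0, bitIndex). Assumes bitIndex <= capacity.
    void setUntil(int bitIndex) {
        int wordIndex = bitIndex >> 6;
        Arrays.fill(words, 0, wordIndex, -1L);        // fully covered words
        if (wordIndex < words.length) {
            long mask = ~(-1L << (bitIndex & 0x3f));  // low bits of the last word
            words[wordIndex] |= mask;
        }
    }

    // Clears all bits in [0, bitIndex). Assumes bitIndex <= capacity.
    void clearUntil(int bitIndex) {
        int wordIndex = bitIndex >> 6;
        Arrays.fill(words, 0, wordIndex, 0L);         // fully covered words
        if (wordIndex < words.length) {
            long mask = -1L << (bitIndex & 0x3f);     // keep bits at or above bitIndex
            words[wordIndex] &= mask;                 // the extra read-modify-write
        }
    }
}
```

With a large bit set and a small `bitIndex`, `clearUntil` writes only a few words where `clear` would write the whole array, which is the case that matters for skewed joins. The masked update of the last word is the one extra read compared with a plain fill.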
    
    ## How was this patch tested?
    
    dev/run-tests, as well as performance tests on skewed data as described in JIRA.
    
    I expect a small local performance hit from using BitSet.clearUntil rather than BitSet.clear for normally shaped (unskewed) joins, because of an additional read of the last long. This is expected to be de minimis and was not specifically tested.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/davidnavas/spark bitSet

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15084.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #15084
    
----
commit a88af25fb820a6bd7f857a62bef50b7c1a816cdb
Author: David Navas <[email protected]>
Date:   2016-09-12T17:58:12Z

    Implement BitSet.clearUntil and use it during merge joins

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
