GitHub user yucai reopened a pull request:

    https://github.com/apache/spark/pull/21156

    [SPARK-24087][SQL] Avoid shuffle when join keys are a super-set of bucket keys

    ## What changes were proposed in this pull request?
    
    To improve bucketed joins, avoid the shuffle when the join keys are a superset of the bucket keys.
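
The reasoning behind this can be illustrated with a small, self-contained sketch. It is not Spark code and does not reflect how the patch is implemented: the function names, the bucket count, and the use of Python's built-in hash() are all assumptions made for illustration (Spark uses its own hash function for bucketing). The point it shows is that when both tables are bucketed on column i, any two rows that agree on the join keys (i, j) also agree on i, so they already sit in the same bucket and can be joined bucket by bucket without redistributing the data.

```python
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count, identical for both tables


def bucket_id(bucket_key_values, num_buckets=NUM_BUCKETS):
    """Stand-in for the engine's bucketing hash (assumption: Python's hash())."""
    return hash(bucket_key_values) % num_buckets


def write_bucketed(rows, bucket_cols):
    """Group rows into buckets by hashing only the bucket columns."""
    buckets = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in bucket_cols)
        buckets[bucket_id(key)].append(row)
    return buckets


def join_without_shuffle(left_buckets, right_buckets, join_cols):
    """Join bucket b of the left side only with bucket b of the right side."""
    matches = []
    for b in range(NUM_BUCKETS):
        for l in left_buckets.get(b, []):
            for r in right_buckets.get(b, []):
                if all(l[c] == r[c] for c in join_cols):
                    matches.append((l, r))
    return matches


def join_global(left_rows, right_rows, join_cols):
    """Reference result: compare every left row with every right row."""
    return [
        (l, r)
        for l in left_rows
        for r in right_rows
        if all(l[c] == r[c] for c in join_cols)
    ]


left = [
    {"i": 1, "j": 10, "v": "a"},
    {"i": 1, "j": 11, "v": "b"},
    {"i": 2, "j": 10, "v": "c"},
    {"i": 5, "j": 7, "v": "d"},
]
right = [
    {"i": 1, "j": 10, "w": "x"},
    {"i": 2, "j": 10, "w": "y"},
    {"i": 5, "j": 8, "w": "z"},
    {"i": 1, "j": 11, "w": "q"},
]

# Both sides are bucketed on ["i"]; the join keys ["i", "j"] are a superset.
local = join_without_shuffle(
    write_bucketed(left, ["i"]), write_bucketed(right, ["i"]), ["i", "j"]
)
reference = join_global(left, right, ["i", "j"])
```

Here the bucket-local join finds the same matching pairs as the global comparison, although possibly in a different order. The same argument does not hold in the other direction: if the join keys were only a subset of the bucket keys, matching rows could land in different buckets.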
    
    ## How was this patch tested?
    
    Enabled a previously ignored test.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yucai/spark SPARK-24087

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21156.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21156
    
----
commit b6bfdc21ed8edf98f9a3b9ac1c253c59adb141a2
Author: yucai <yyu1@...>
Date:   2018-04-25T00:49:43Z

    [SPARK-24087][SQL] Avoid shuffle when join keys are a super-set of bucket keys

commit a59c94f5b655fc034ce8907b98022cacf6bf318e
Author: yucai <yyu1@...>
Date:   2018-04-26T04:33:08Z

    simplify the codes

commit 4e026e5e437dc7f578434244b55bb1ebe189bace
Author: yucai <yyu1@...>
Date:   2018-06-04T02:22:12Z

    Add spark.sql.sortMergeJoinExec.childrenPartitioningDetection for user to disable this feature

commit fa76a7823baf4e6eb05f33bc746ade7f65f44372
Author: yucai <yyu1@...>
Date:   2018-06-04T05:25:01Z

    enable spark.sql.sortMergeJoinExec.childrenPartitioningDetection by default

commit 946688aee3d03d37a57270e654e00bb9236f21c4
Author: yucai <yyu1@...>
Date:   2018-06-04T05:28:51Z

    should return

commit 981a0fd22d30768ce533982c9fcc701b15d4dc44
Author: yucai <yyu1@...>
Date:   2018-07-06T06:51:24Z

    skip RangePartition

commit 76e7d5f67017604c29179ce55280e0fc56574fde
Author: yucai <yyu1@...>
Date:   2018-07-09T10:14:43Z

    Merge remote-tracking branch 'origin/master' into pr21156

commit 371c3a932f4dede4aeb1be2c9db404b457547ecf
Author: yucai <yyu1@...>
Date:   2018-07-09T11:33:17Z

    improve tests

commit de2bc4de76077f257b85e6a1d58ee17fbc770c8e
Author: yucai <yyu1@...>
Date:   2018-07-12T01:43:35Z

    support shuffled hash join

commit f40606203da01efe400431ed9d2b8b70c0476fc6
Author: yucai <yyu1@...>
Date:   2018-07-26T14:33:40Z

    remove bucket table check

----

