[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-09-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/7379 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-09-03 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-137569833 @hvanhovell thanks for working on this! To keep the PR queue manageable I propose we close this issue for now until you have time to bring it up to date and remove the

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-17 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/7379#discussion_r34891100 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala --- @@ -0,0 +1,411 @@ +/* + * Licensed to the

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-17 Thread chenghao-intel
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-122188991 Sorry, I shouldn't use the word `SMJ`. I mean if we are planning to improve the performance of RangeJoin, probably we can think of it in a more general way,

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-17 Thread hvanhovell
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-12257 No problem. ### Supporting N-Ary Predicates. In order to make the range join work we need the predicates to define a single interval for each side of the

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-17 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/7379#discussion_r34940713 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala --- @@ -0,0 +1,411 @@ +/* + * Licensed to the

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-17 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/7379#discussion_r34940933 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala --- @@ -0,0 +1,411 @@ +/* + * Licensed to the

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-16 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/7379#discussion_r34863539 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala --- @@ -0,0 +1,411 @@ +/* + * Licensed to the

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-16 Thread chenghao-intel
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-122176186 This is a very interesting optimization, but will it be more general if we consider that with the SortMergeJoin? As well as the case like: ``` SELECT

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-16 Thread hvanhovell
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-122177553 The = case is quite easy to implement. This implementation is currently targeted at range joining a rather small (broadcastable) to an arbitrarily large

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-16 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/7379#discussion_r34862439 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala --- @@ -0,0 +1,411 @@ +/* + * Licensed to the

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-16 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/7379#discussion_r34862068 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala --- @@ -0,0 +1,411 @@ +/* + * Licensed to the

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121811701 Merged build triggered.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121811713 Merged build started.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121817147 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121819633 Merged build triggered.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121819664 Merged build started.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121815160 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121815152 [Test build #37448 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37448/console) for PR 7379 at commit

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121820543 [Test build #37456 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37456/consoleFull) for PR 7379 at commit

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121812674 [Test build #37448 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37448/consoleFull) for PR 7379 at commit

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121833017 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121832849 [Test build #37456 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37456/console) for PR 7379 at commit

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121816918 Merged build triggered.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121816924 Merged build started.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121248652 Merged build triggered.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121276408 Merged build started.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121276830 [Test build #37229 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37229/consoleFull) for PR 7379 at commit

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-14 Thread hvanhovell
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121250819 Current test errors are a bit weird. They shouldn't have been caused by this change, because the functionality is disabled by default. Rebased to most recent

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121306024 [Test build #37229 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37229/console) for PR 7379 at commit

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121306135 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-14 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121316725 This looks pretty cool! I can try and do a more thorough review in a bit, but a few testing suggestions: It would be great to add a test for the query planner

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121130008 Merged build triggered.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121130017 Merged build started.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121117361 Merged build triggered.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121117436 [Test build #37182 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37182/consoleFull) for PR 7379 at commit

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121117367 Merged build started.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121130085 [Test build #37193 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37193/consoleFull) for PR 7379 at commit

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121116697 ok to test

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121117442 [Test build #37182 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37182/console) for PR 7379 at commit

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121117445 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121137638 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121137612 [Test build #37193 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37193/console) for PR 7379 at commit

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121082240 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7379#issuecomment-121082100 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-8682][SQL][WIP] Range Join

2015-07-13 Thread hvanhovell
GitHub user hvanhovell opened a pull request: https://github.com/apache/spark/pull/7379 [SPARK-8682][SQL][WIP] Range Join *...copied from JIRA (SPARK-8682):* Currently Spark SQL uses a Broadcast Nested Loop join (or a filtered Cartesian Join) when it has to execute the
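
For readers unfamiliar with the idea, here is a rough sketch (not the code from this pull request) of why a range join on a small, broadcastable side can beat a nested loop. It assumes half-open intervals `[lo, hi)` on the small side and plain numeric points on the large side; the function names `nested_loop_range_join` and `sorted_range_join` are hypothetical and exist only for this illustration.

```python
from bisect import bisect_right


def nested_loop_range_join(points, intervals):
    """Baseline: test every point against every interval, O(n * m) comparisons."""
    return [(p, iv) for p in points for iv in intervals if iv[0] <= p < iv[1]]


def sorted_range_join(points, intervals):
    """Sort the small side once, then binary search per point to prune candidates.

    Assumption for this sketch: intervals are half-open (lo, hi) tuples.
    """
    ordered = sorted(intervals)            # sort the small side by start value
    starts = [lo for lo, _ in ordered]     # start values only, for bisect
    out = []
    for p in points:
        # Intervals that start after p cannot contain p, so skip them all.
        upper = bisect_right(starts, p)
        for lo, hi in ordered[:upper]:
            if p < hi:                     # lo <= p already holds here
                out.append((p, (lo, hi)))
    return out


points = [1, 5, 10]
intervals = [(0, 3), (4, 8), (5, 12)]
result = sorted_range_join(points, intervals)
```

The binary search only bounds the candidates from one side, so this sketch still scans every interval that starts at or before the point. A fuller implementation would presumably use an index that also prunes by end value, but that is beyond what the snippets in this thread show.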