[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-15 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/21564
  
LGTM except the test


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21564
  
Merged build finished. Test PASSed.





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21564
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91856/





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21564
  
**[Test build #91856 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91856/testReport)** for PR 21564 at commit [`405ba94`](https://github.com/apache/spark/commit/405ba9441973a186569bbf733907bd9445331c34).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21564
  
Merged build finished. Test PASSed.





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21564
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91855/





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21564
  
**[Test build #91855 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91855/testReport)** for PR 21564 at commit [`0ef99cc`](https://github.com/apache/spark/commit/0ef99cc972a54fd9c98338e54a7e4e6b9a213654).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/21564
  
@yucai thanks. Could you please also add more unit tests to cover all the 
possible cases? Thanks.





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21564
  
**[Test build #91856 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91856/testReport)** for PR 21564 at commit [`405ba94`](https://github.com/apache/spark/commit/405ba9441973a186569bbf733907bd9445331c34).





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread yucai
Github user yucai commented on the issue:

https://github.com/apache/spark/pull/21564
  
@mgaido91 I have updated the code as per your suggestion, thanks!





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread yucai
Github user yucai commented on the issue:

https://github.com/apache/spark/pull/21564
  
@viirya I think `PartitioningCollection` should be considered as well, as in 
the case below:
```
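// Run in spark-shell, where spark.implicits._ is already imported.
// Disable broadcast joins so both joins are planned as sort-merge joins.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", -1)
// Disable whole-stage codegen so the plan prints without codegen stage markers.
spark.conf.set("spark.sql.codegen.wholeStage", false)
val df1 = Seq(1 -> "a", 3 -> "c", 2 -> "b").toDF("i", "j").as("t1")
val df2 = Seq(1 -> "a", 3 -> "c", 2 -> "b").toDF("m", "n").as("t2")
val d = df1.join(df2, $"t1.i" === $"t2.m")
d.cache()
// Self-join the cached result under two different aliases.
val d1 = d.as("t3")
val d2 = d.as("t4")
d1.join(d2, $"t3.i" === $"t4.i").explain()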
```
```
SortMergeJoin [i#5], [i#54], Inner
:- InMemoryTableScan [i#5, j#6, m#15, n#16]
:     +- InMemoryRelation [i#5, j#6, m#15, n#16], CachedRDDBuilder
:           +- SortMergeJoin [i#5], [m#15], Inner
:              :- Sort [i#5 ASC NULLS FIRST], false, 0
:              :  +- Exchange hashpartitioning(i#5, 10)
:              :     +- LocalTableScan [i#5, j#6]
:              +- Sort [m#15 ASC NULLS FIRST], false, 0
:                 +- Exchange hashpartitioning(m#15, 10)
:                    +- LocalTableScan [m#15, n#16]
+- Sort [i#54 ASC NULLS FIRST], false, 0
   +- Exchange hashpartitioning(i#54, 10)
      +- InMemoryTableScan [i#54, j#55, m#58, n#59]
            +- InMemoryRelation [i#54, j#55, m#58, n#59], CachedRDDBuilder
                  +- SortMergeJoin [i#5], [m#15], Inner
                     :- Sort [i#5 ASC NULLS FIRST], false, 0
                     :  +- Exchange hashpartitioning(i#5, 10)
                     :     +- LocalTableScan [i#5, j#6]
                     +- Sort [m#15 ASC NULLS FIRST], false, 0
                        +- Exchange hashpartitioning(m#15, 10)
                           +- LocalTableScan [m#15, n#16]
```
`Exchange hashpartitioning(i#54, 10)` is an extra shuffle: the cached join is 
already hash-partitioned on `i#5`, which is the same column as `i#54`.
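
To make the point about `PartitioningCollection` concrete, here is a minimal, self-contained sketch of the kind of rewrite being discussed. It does not use Spark's real classes: `Attr`, `HashPartitioning`, `PartitioningCollection`, `attributeMap` and `rewrite` below are simplified, hypothetical stand-ins, and the actual patch may look quite different. The assumption is that the partitioning of the cached join is a collection of two hash partitionings (on `i#5` and on `m#15`), and that each attribute has to be mapped to the attribute at the same position in the new output (`i#54`, `m#58`), recursing into the collection rather than handling only a top-level hash partitioning.

```scala
// Simplified stand-ins for Spark's internal classes; every name here is hypothetical.
final case class Attr(name: String, exprId: Int) {
  override def toString: String = s"$name#$exprId"
}

sealed trait Partitioning
final case class HashPartitioning(keys: Seq[Attr], numPartitions: Int) extends Partitioning
final case class PartitioningCollection(partitionings: Seq[Partitioning]) extends Partitioning

object RewritePartitioning {
  // Pair each attribute of the original output with the attribute
  // at the same position in the new output.
  def attributeMap(from: Seq[Attr], to: Seq[Attr]): Map[Attr, Attr] = {
    require(from.length == to.length, "outputs must have the same arity")
    from.zip(to).toMap
  }

  // Rewrite the attributes inside a partitioning; a collection is
  // rewritten member by member, so nested partitionings are covered too.
  def rewrite(p: Partitioning, mapping: Map[Attr, Attr]): Partitioning = p match {
    case HashPartitioning(keys, n) =>
      HashPartitioning(keys.map(k => mapping.getOrElse(k, k)), n)
    case PartitioningCollection(ps) =>
      PartitioningCollection(ps.map(rewrite(_, mapping)))
  }

  def main(args: Array[String]): Unit = {
    // Output of the cached plan and of its second, re-aliased scan (from the plan above).
    val cached = Seq(Attr("i", 5), Attr("j", 6), Attr("m", 15), Attr("n", 16))
    val aliased = Seq(Attr("i", 54), Attr("j", 55), Attr("m", 58), Attr("n", 59))
    val joinPartitioning = PartitioningCollection(Seq(
      HashPartitioning(Seq(cached(0)), 10),
      HashPartitioning(Seq(cached(2)), 10)))
    println(rewrite(joinPartitioning, attributeMap(cached, aliased)))
  }
}
```

In this toy model, once the collection is rewritten, one of its members is a hash partitioning on `i#54`, which is what the outer join requires; without the recursive case the partitioning would still refer to `i#5` and would not be recognized.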





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21564
  
**[Test build #91855 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91855/testReport)** for PR 21564 at commit [`0ef99cc`](https://github.com/apache/spark/commit/0ef99cc972a54fd9c98338e54a7e4e6b9a213654).





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21564
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91829/





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21564
  
Merged build finished. Test PASSed.





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21564
  
**[Test build #91829 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91829/testReport)** for PR 21564 at commit [`f37139b`](https://github.com/apache/spark/commit/f37139b2d07497af9df1984e5fb7a50931efbf9a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread yucai
Github user yucai commented on the issue:

https://github.com/apache/spark/pull/21564
  
@cloud-fan @viirya @gatorsmile, could you help review this?





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21564
  
**[Test build #91829 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91829/testReport)** for PR 21564 at commit [`f37139b`](https://github.com/apache/spark/commit/f37139b2d07497af9df1984e5fb7a50931efbf9a).





[GitHub] spark issue #21564: [SPARK-24556][SQL] ReusedExchange should rewrite output ...

2018-06-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21564
  
Can one of the admins verify this patch?




