[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-11-25 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/20541
  
@gatorsmile, OK, I will do it.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20541
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20541
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-03-23 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/20541
  
Oh, I see. I will fall back to only the change for non-deterministic 
expressions and keep the newly added test cases for `a + 1` and `a + b`. Is that OK?


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-03-22 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20541
  
I don't agree. `a + 1`/`a + b` are evaluated the same number of times, 
whether or not you push them through Union. I don't see any performance benefit in 
doing this, except that you can eliminate the entire Project above Union.
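
The evaluation-count argument can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark code: the relations, the `make_counted` helper, and both `project_*` functions are hypothetical names introduced for illustration.

```python
# Toy model (not Spark): three "relations" as lists of row dicts.
rel1 = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]
rel2 = [{"a": 3, "b": 30}]
rel3 = [{"a": 4, "b": 40}, {"a": 5, "b": 50}]
relations = [rel1, rel2, rel3]


def make_counted(fn):
    """Wrap fn so the number of calls can be inspected afterwards."""
    def wrapper(row):
        wrapper.calls += 1
        return fn(row)
    wrapper.calls = 0
    return wrapper


def project_above_union(relations, expr):
    # Union first, then evaluate the expression once per unioned row.
    unioned = [row for rel in relations for row in rel]
    return [expr(row) for row in unioned]


def project_below_union(relations, expr):
    # Evaluate the expression inside each child, then union the results.
    return [value for rel in relations for value in [expr(row) for row in rel]]


above = make_counted(lambda row: row["a"] + row["b"])
below = make_counted(lambda row: row["a"] + row["b"])
result_above = project_above_union(relations, above)
result_below = project_below_union(relations, below)
print(above.calls, below.calls)
```

In this model both placements evaluate the expression once per input row, so the counts and the results match; the only structural difference is whether a projection remains above the union.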


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-03-22 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/20541
  
Oh, yeah, there is a small difference for `a + 1` and `a + b`.
**for a + 1**:
```
PushProjectionThroughUnion rule handles:
Union
:- Project [(a#0 + 1) AS aa#10]
:  +- LocalRelation , [a#0, b#1, c#2]
:- Project [(d#3 + 1) AS aa#11]
:  +- LocalRelation , [d#3, e#4, f#5]
+- Project [(g#6 + 1) AS aa#12]
   +- LocalRelation , [g#6, h#7, i#8]

ColumnPruning rule handles:
Project [(a#0 + 1) AS aa#9]
+- Union
   :- Project [a#0]
   :  +- LocalRelation , [a#0, b#1, c#2]
   :- Project [d#3]
   :  +- LocalRelation , [d#3, e#4, f#5]
   +- Project [g#6]
      +- LocalRelation , [g#6, h#7, i#8]
```
  
**for a + b**:
```
PushProjectionThroughUnion rule handles:
Union
:- Project [(a#0 + b#1) AS ab#10]
:  +- LocalRelation , [a#0, b#1, c#2]
:- Project [(d#3 + e#4) AS ab#11]
:  +- LocalRelation , [d#3, e#4, f#5]
+- Project [(g#6 + h#7) AS ab#12]
   +- LocalRelation , [g#6, h#7, i#8]

ColumnPruning rule handles:
Project [(a#0 + b#1) AS ab#9]
+- Union
   :- Project [a#0, b#1]
   :  +- LocalRelation , [a#0, b#1, c#2]
   :- Project [d#3, e#4]
   :  +- LocalRelation , [d#3, e#4, f#5]
   +- Project [g#6, h#7]
      +- LocalRelation , [g#6, h#7, i#8]
```
  
So I think this may be why the `PushProjectionThroughUnion` rule is needed, 
and why it should be extended to non-deterministic expressions.
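
The two plan shapes compared above can be reproduced with a toy rewrite. This is a minimal sketch in plain Python under simplifying assumptions, not Catalyst's implementation: plans are nested tuples, expressions refer to Union output columns by position, and `Expr`, `push_through_union`, and `prune_columns` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Expr:
    template: str               # e.g. "({0} + {1})"; slots follow `ordinals`
    ordinals: Tuple[int, ...]   # positions of the referenced Union columns

    def render(self, columns):
        # Substitute the child's column names at the referenced positions.
        return self.template.format(*(columns[i] for i in self.ordinals))


def push_through_union(exprs, children):
    """Project(exprs, Union(children)) -> Union(Project(exprs, child), ...)."""
    return ("Union", [("Project", [e.render(cols) for e in exprs], cols)
                      for cols in children])


def prune_columns(exprs, children):
    """Keep the Project above Union; push only the referenced columns down."""
    needed = sorted({i for e in exprs for i in e.ordinals})
    pruned = [("Project", [cols[i] for i in needed], cols) for cols in children]
    return ("Project", [e.render(children[0]) for e in exprs], ("Union", pruned))


children = [("a", "b", "c"), ("d", "e", "f"), ("g", "h", "i")]
exprs = [Expr("({0} + {1})", (0, 1))]

pushed = push_through_union(exprs, children)
pruned = prune_columns(exprs, children)
print(pushed)
print(pruned)
```

In the pushed form the Project above the Union disappears and each child computes the full expression; in the pruned form each child only narrows to the referenced columns and the expression stays above the Union.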



---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-03-22 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20541
  
`ColumnPruning` rule handles `Union` already.


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-03-22 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/20541
  
In my opinion, `PushProjectionThroughUnion` helps when the data sources under 
the Union have many columns but the projection requires only a few of them, 
so that file operations perform better. Thanks.


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-03-21 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20541
  
I think the use case is, by pushing projects into Union, we are more likely 
to combine adjacent Unions. So I don't think we need to improve it to push part 
of the project list and still leave a project above Union.


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-03-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20541
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20541
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20541
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87237/
Test PASSed.


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20541
  
**[Test build #87237 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87237/testReport)**
 for PR 20541 at commit 
[`4f5d46b`](https://github.com/apache/spark/commit/4f5d46baca612caaa882cbabb3b35665e9c7ed8b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-02-08 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20541
  
I'm confused about why we need `PushProjectionThroughUnion`. Generally we 
only need to push down the required columns, not the entire project list, as there is 
no benefit in doing this. I think we just need to handle `Union` in the 
`ColumnPruning` rule, but I may be missing something. cc @gatorsmile 


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20541
  
**[Test build #87237 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87237/testReport)**
 for PR 20541 at commit 
[`4f5d46b`](https://github.com/apache/spark/commit/4f5d46baca612caaa882cbabb3b35665e9c7ed8b).


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20541
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20541
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87210/
Test PASSed.


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20541
  
**[Test build #87210 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87210/testReport)**
 for PR 20541 at commit 
[`36dbc9c`](https://github.com/apache/spark/commit/36dbc9c543f36dc5952a89c354bd70067ddd6883).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20541
  
**[Test build #87210 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87210/testReport)**
 for PR 20541 at commit 
[`36dbc9c`](https://github.com/apache/spark/commit/36dbc9c543f36dc5952a89c354bd70067ddd6883).


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-02-08 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20541
  
ok to test


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20541
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20541
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-02-08 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/20541
  
@gatorsmile, @cloud-fan Can you help me review it? Thanks.


---
