[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-20 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
Thanks for all your help getting this committed, @cloud-fan!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-20 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20387
  
Thanks, merging to master!


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87532/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87532 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87532/testReport)**
 for PR 20387 at commit 
[`1a603db`](https://github.com/apache/spark/commit/1a603dbe5528b447bff371d2e00abdbdee664a75).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87532 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87532/testReport)**
 for PR 20387 at commit 
[`1a603db`](https://github.com/apache/spark/commit/1a603dbe5528b447bff371d2e00abdbdee664a75).


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-17 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
Thanks for the update! Enjoy your vacation, and thanks for letting me know.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/951/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-16 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20387
  
I'm on vacation and will be back next week. I'll do a more thorough 
review then; sorry for the inconvenience!


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-16 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
@cloud-fan, is there anything else that needs to be updated, or is this 
ready to be merged?


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-14 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
@cloud-fan, can you have a look at this? I've made the requested changes 
and tests are passing.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87434/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87434 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87434/testReport)**
 for PR 20387 at commit 
[`3b55609`](https://github.com/apache/spark/commit/3b55609b605fb461f6c2616d1da95a2d4b27ff4b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87432/
Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87432 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87432/testReport)**
 for PR 20387 at commit 
[`b8e3623`](https://github.com/apache/spark/commit/b8e3623837047949b39141e46eb96f30de8aa21e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
Okay, I rebased again after SPARK-23303 was reverted.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87434 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87434/testReport)**
 for PR 20387 at commit 
[`3b55609`](https://github.com/apache/spark/commit/3b55609b605fb461f6c2616d1da95a2d4b27ff4b).


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/882/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87427/
Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87427 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87427/testReport)**
 for PR 20387 at commit 
[`adcb25a`](https://github.com/apache/spark/commit/adcb25a06240dc413f58b2d1240405b0a5485578).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87432 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87432/testReport)**
 for PR 20387 at commit 
[`b8e3623`](https://github.com/apache/spark/commit/b8e3623837047949b39141e46eb96f30de8aa21e).


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/880/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87426/
Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87426 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87426/testReport)**
 for PR 20387 at commit 
[`57e05c2`](https://github.com/apache/spark/commit/57e05c2babbcaec3ed3aa69765e1145539879c97).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
@cloud-fan, I've rebased and made the requested changes.

#20603 reverts the last commit that adds back support for user-supplied 
schemas that are identical to the source schema.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87427 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87427/testReport)**
 for PR 20387 at commit 
[`adcb25a`](https://github.com/apache/spark/commit/adcb25a06240dc413f58b2d1240405b0a5485578).


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/876/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87426 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87426/testReport)**
 for PR 20387 at commit 
[`57e05c2`](https://github.com/apache/spark/commit/57e05c2babbcaec3ed3aa69765e1145539879c97).


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/875/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87274/
Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87274 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87274/testReport)**
 for PR 20387 at commit 
[`ce5f40d`](https://github.com/apache/spark/commit/ce5f40d6a512874e2dd45bab9256f77ff74e628b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87274 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87274/testReport)**
 for PR 20387 at commit 
[`ce5f40d`](https://github.com/apache/spark/commit/ce5f40d6a512874e2dd45bab9256f77ff74e628b).


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/760/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87268/
Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87268 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87268/testReport)**
 for PR 20387 at commit 
[`87a36be`](https://github.com/apache/spark/commit/87a36be5b8dc2002a45bf71ffdc94c816f6d7355).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20387
  
> Sorry, what do you want to change?

Nothing, just a potential use case to support creating `DataSourceOptions` 
in `DataSourceV2Relation`. If there are a lot of places like this, it's painful 
to duplicate the logic of merging extra entries into `DataSourceOptions`.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87268 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87268/testReport)**
 for PR 20387 at commit 
[`87a36be`](https://github.com/apache/spark/commit/87a36be5b8dc2002a45bf71ffdc94c816f6d7355).


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/755/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-09 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
> See FindDataSourceTable.readDataSourceTable about how we handle the path 
option.

Sorry, what do you want to change?


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-08 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20387
  
> We've added a resolution rule from UnresolvedRelation to 
DataSourceV2Relation that uses our implementation. UnresolvedRelation needs to 
pass its TableIdentifier to the v2 relation, which is why I added this.

I've been thinking about this a little more. This is actually an existing 
problem for file-based data sources. The solution is that, when converting an 
unresolved relation to a data source relation, we add some new options to the 
existing data source options before passing them to the data source 
relation. See `FindDataSourceTable.readDataSourceTable` about how we handle the 
path option.
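
To make the idea concrete, here is a minimal, dependency-free sketch of merging 
extra entries into the user-supplied options before they are handed to a relation. 
It is not Spark code: `OptionsMergeSketch`, `withExtraOptions`, and the example 
path are hypothetical names chosen for illustration, and both the lower-casing of 
keys and the "extra entries win on conflict" precedence are assumptions of this 
sketch rather than something stated in this thread.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class OptionsMergeSketch {
    // Hypothetical helper (not a Spark API): returns a new map holding the
    // user-supplied options plus extra entries added during resolution.
    // Neither input map is modified.
    public static Map<String, String> withExtraOptions(
            Map<String, String> userOptions, Map<String, String> extraOptions) {
        Map<String, String> merged = new HashMap<>();
        // Assumption: keys are lower-cased so lookups are case-insensitive.
        for (Map.Entry<String, String> e : userOptions.entrySet()) {
            merged.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
        // Assumption: extra entries are applied last, so they win on conflict.
        for (Map.Entry<String, String> e : extraOptions.entrySet()) {
            merged.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> userOptions = new HashMap<>();
        userOptions.put("mergeSchema", "true");

        // Entry that a resolution rule might add, e.g. a table location.
        Map<String, String> extraOptions = new HashMap<>();
        extraOptions.put("path", "/warehouse/db/tbl"); // illustrative value

        Map<String, String> merged = withExtraOptions(userOptions, extraOptions);
        System.out.println(merged.get("path"));
        System.out.println(merged.get("mergeschema"));
    }
}
```

Keeping the merge in one helper like this is one way to avoid duplicating the 
logic at every place that needs to add entries.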


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87170/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87170 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87170/testReport)**
 for PR 20387 at commit 
[`181946d`](https://github.com/apache/spark/commit/181946d1f1c5889661544830a77bd23c4b4f685a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87170 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87170/testReport)**
 for PR 20387 at commit 
[`181946d`](https://github.com/apache/spark/commit/181946d1f1c5889661544830a77bd23c4b4f685a).


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/674/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-07 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
@cloud-fan: Rebased and removed path.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87123/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87123 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87123/testReport)**
 for PR 20387 at commit 
[`7ef90cb`](https://github.com/apache/spark/commit/7ef90cb7e20b903b3569470ae0e3c26a03cb6a2a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/637/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87123 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87123/testReport)**
 for PR 20387 at commit 
[`7ef90cb`](https://github.com/apache/spark/commit/7ef90cb7e20b903b3569470ae0e3c26a03cb6a2a).


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-06 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
Will do.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-06 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20387
  
I think there will be a lot of discussion about data source v2 table 
support. For now how about we remove the table/path stuff and get this PR in 
ASAP?


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-06 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
> Can you give a use case about this?

We've added a resolution rule from `UnresolvedRelation` to 
`DataSourceV2Relation` that uses our implementation. `UnresolvedRelation` needs 
to pass its `TableIdentifier` to the v2 relation, which is why I added this.

We could separate this out, but I think it makes sense to get the options 
for an immutable plan node right from the start. And I think we agree that 
`TableIdentifier` will be passed this way. I just don't see any benefit to 
separating this out. I agree that there isn't a requirement in this PR and I 
can remove it if necessary.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-06 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20387
  
> The reader is to be created and configured by the relation, so the
relation needs to be able to set the table, path, and other properties. This
adds necessary data to the relation that is no longer passed directly to the
reader from DataFrameReader.

This is an interesting thing that I missed before. Can you give a use case
for this? If we have a requirement for it, I totally agree we should put
them in `DataSourceV2Relation`.

> I don't see why we wouldn't want to have these options in the immutable 
relation node from the start. 

My main concern is, this PR itself doesn't show a strong use case for
putting this information into `DataSourceV2Relation`. If we do need it in the
future, we can add it when we really need it. `DataSourceV2Relation` is not a
public API so I think we don't need to rush.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-06 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
The reader is to be created and configured by the relation, so the
relation needs to be able to set the table, path, and other properties. This
adds necessary data to the relation that is no longer passed directly to the
reader from `DataFrameReader`.

From the other thread on this, I think we agree that minimizing the number 
of places that work with `DataSourceOptions` and the specific option strings is 
a good idea. So it makes sense to define the relation using `TableIdentifier`. 
Other paths that create `DataSourceV2Relation` need the table name to be passed 
like this.

I guess we *could* revert the change and add it in a separate commit, but I 
don't see a reason for the extra work. It would be impractical to backport a 
later `TableIdentifier` change without this immutability change. Similarly, why 
would someone want to move to an immutable plan, but leave some left-over logic 
for configuration in `DataFrameReader`?

I don't see why we wouldn't want to have these options in the immutable 
relation node from the start. Do you have a case in mind that I'm missing?


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-06 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20387
  
This PR does 3 things:
1. make `DataSourceV2Relation` immutable. This extends the constructor of 
`DataSourceV2Relation` to include pushed filters and pruned columns.
2. carry some standard information (table, path, etc.) in
`DataSourceV2Relation`. This extends the constructor of `DataSourceV2Relation`
to include the table identifier, path string, etc.
3. replace the new operator pushdown rule with `PhysicalOperation`.

It would be great if we only focus on 1, but I'm also OK if we do 1 and 3
together. I don't think we should include 2 here, as the benefit is unclear.
This standard information is only used to create `DataSourceOptions`
inside `DataSourceV2Relation`, which can also be done in `DataFrameReader`. I
suggest we don't change this part and just keep the `DataSourceOptions` in the
constructor of `DataSourceV2Relation`.
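
To make item 1 concrete, here is a minimal sketch of an immutable plan node
that carries pushed filters and pruned columns in its constructor. It is a toy
Python model, not Spark's Scala code: the `Relation` class, its fields, and the
`with_filters` / `with_projection` helpers are hypothetical names invented for
illustration, and filters are plain strings rather than Catalyst expressions.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Relation:
    """Toy immutable plan node (hypothetical; not Spark's DataSourceV2Relation)."""
    source: str
    options: Tuple[Tuple[str, str], ...] = ()
    projection: Tuple[str, ...] = ()  # pruned columns; empty means all columns
    filters: Tuple[str, ...] = ()     # pushed filters, modeled as plain strings

    def with_filters(self, *new_filters: str) -> "Relation":
        # Return a new node instead of mutating this one.
        return replace(self, filters=self.filters + new_filters)

    def with_projection(self, *columns: str) -> "Relation":
        # Return a new node that reads only the given columns.
        return replace(self, projection=columns)


base = Relation(source="parquet", options=(("path", "/tmp/t"),))
pushed = base.with_filters("id > 5").with_projection("id", "name")
```

In this model `base` is left untouched by the push-down, so any other plan
that still refers to it sees the original, unfiltered relation.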


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-06 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
@cloud-fan, this is a single commit on purpose because predicate push-down 
makes plan changes. I think it's best to do these at once to avoid unnecessary 
work. That's why I started looking more closely at push-down in the first 
place: updating the other push-down code for immutable plans was a mess.

I also think it is unlikely that we will need to revert the push-down 
changes here. If we end up redesigning push-down, then it is unlikely that the 
easiest starting point is to roll back this fix.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-05 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20387
  
I'm OK to replace the new push down implementation with 
`PhysicalOperation`, but please do that in an individual PR. If we do find the 
new implementation is necessary, it's easier for us to bring it back, if it was 
removed in an individual commit.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-05 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
> For safety, I wanna keep it unchanged, and start something new for data 
source v2 only.

I disagree.

* **#20476 addresses a bug caused by the new implementation that is not a 
problem if we reuse the current push-down code.** Using an entirely new 
implementation to push filters and projection is going to introduce bugs, and 
that problem demonstrates that it is a real risk.
* **Using unreliable push-down code is going to make it more difficult for 
anyone to use the v2 API.**
* **This approach throws away work that has accumulated over the past few
years that gives us confidence in the current push-down code.** The other code
paths have push-down tests that will help us catch bugs in the new push-down 
logic. If we limit the scope of this change to v2, we will not be able to reuse 
those tests and will have to write entirely new ones that cover all cases.

Lastly, I think it is clear that we need a design for a new push-down 
mechanism. **Adding this to DataSourceV2 as feature creep is not a good way to 
redesign it.** I'd like to see a design document that addresses some of the 
open questions.

I'd also prefer that this new implementation be removed from the v2 code 
path for 2.3.0. @marmbrus, what do you think?


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-05 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20387
  
As for doing pushdown in the logical or physical phase, I don't have a strong
preference. I think in the logical phase we should try our best to push
data-size-reduction operators (like filter, aggregate, limit, etc.) close to
the bottom of the plan, as that should always be good. We should apply pushdown
to data sources in the physical phase, as it's not always good and we need to
consider the cost. Currently it's done in the logical phase because of the
`computeStats` problem. Eventually we should compute the statistics and apply
pushdown to data sources in the physical phase.

About how to apply pushdown to data sources, I think `PhysicalOperation` is
in the right direction and the new pushdown rule also follows it. Generally the
logical phase is responsible for pushing data-size-reduction operators down
close to the data source relation, and in the physical phase we collect the
supported operators (currently only project and filter) above the data
source relation and apply the pushdown once, so this doesn't need to be
incremental.
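
The "collect once, then push" idea can be sketched as follows. This is a toy
Python model under stated assumptions, not Spark's `PhysicalOperation`: the
`Scan`, `Filter`, and `Project` classes and the `collect_pushdown` function are
hypothetical, conditions are plain strings, and alias substitution (which the
real pattern has to handle) is ignored.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    condition: str
    child: "Plan"


@dataclass(frozen=True)
class Project:
    columns: Tuple[str, ...]
    child: "Plan"


Plan = Union[Scan, Filter, Project]


def collect_pushdown(plan: Plan) -> Tuple[Optional[Tuple[str, ...]], List[str], Scan]:
    """Walk down through Project/Filter nodes to the scan in a single pass."""
    columns: Optional[Tuple[str, ...]] = None
    filters: List[str] = []
    node = plan
    while not isinstance(node, Scan):
        if isinstance(node, Project):
            if columns is None:
                columns = node.columns  # keep the outermost projection
        else:
            filters.append(node.condition)
        node = node.child
    return columns, filters, node


plan = Project(("id",), Filter("id > 5", Filter("name IS NOT NULL", Scan("t"))))
columns, filters, scan = collect_pushdown(plan)
```

Because everything above the scan is gathered in one traversal, the data
source would see the full set of columns and filters in a single call.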

We definitely need to document the contract for ordering and interactions
between different types of pushdowns. For now we don't need to worry about it,
as we only support column pruning and filter pushdown, and these two are
orthogonal: it doesn't matter whether the data source runs project first or
filter first. Here are some initial thoughts on how to define the contract.

Let's say the Data Source V2 framework supports pushing down required
columns (column pruning), filter, limit, and aggregate. Semantically, filter,
limit, and aggregate are not exchangeable, so we should respect their order in
the query. If we have all these operators in a query, how do we tell the data
source about the order of these operators?

My proposal is, since `DataSourceReader` is mutable (not the plan!), we can
ask the data source to remember which operators have been pushed down, via the
order in which these `pushXXX` methods are called. Data source
implementations should respect the order of pushdown when applying them
internally.
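
A minimal sketch of this proposal, written as a toy Python model rather than
the real `DataSourceReader` interface: the `RecordingReader` class and its
`push_filter`, `push_limit`, and `scan` methods are hypothetical names, and
rows are plain integers. It shows why the call order matters for filter and
limit.

```python
from typing import Callable, List, Tuple


class RecordingReader:
    """Hypothetical mutable reader that records pushdown calls in call order."""

    def __init__(self) -> None:
        self.pushed: List[Tuple[str, object]] = []

    def push_filter(self, predicate: Callable[[int], bool]) -> None:
        self.pushed.append(("filter", predicate))

    def push_limit(self, n: int) -> None:
        self.pushed.append(("limit", n))

    def scan(self, rows: List[int]) -> List[int]:
        # Apply the pushed operators in exactly the order they were pushed.
        for kind, arg in self.pushed:
            if kind == "filter":
                rows = [r for r in rows if arg(r)]
            else:
                rows = rows[:arg]
        return rows


def is_even(r: int) -> bool:
    return r % 2 == 0


rows = [1, 2, 3, 4, 5, 6]

filter_then_limit = RecordingReader()
filter_then_limit.push_filter(is_even)
filter_then_limit.push_limit(2)

limit_then_filter = RecordingReader()
limit_then_filter.push_limit(2)
limit_then_filter.push_filter(is_even)

first = filter_then_limit.scan(rows)   # [2, 4]
second = limit_then_filter.scan(rows)  # [2]
```

The two readers receive the same operators but return different rows, which
is the sense in which filter and limit are not exchangeable.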

About `PhysicalOperation`, it's pretty simple and we probably need to 
change it a lot when adding more operator pushdown. Another concern is, 
`PhysicalOperation` is used in a lot of places, not only data source pushdown. 
For safety, I wanna keep it unchanged, and start something new for data source 
v2 only.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87007/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87007 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87007/testReport)**
 for PR 20387 at commit 
[`f1d9872`](https://github.com/apache/spark/commit/f1d9872a2699cdbd5c87b02e702dc8103335131d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread marmbrus
Github user marmbrus commented on the issue:

https://github.com/apache/spark/pull/20387
  
Regarding `computeStats`, the logical plan seems like it might not be the
right place. As we move towards more CBO it seems like we are going to need to
pick physical operators before we can really reason about the cost of a
subplan. With the caveat that I haven't thought hard about this, I'd be
supportive of moving these kinds of metrics to the physical plan. +1 that we
need to be able to consider pushdown when producing stats either way.

On the second point, I don't think I understand DataSourceV2 enough yet to
know the answer, but you ask a lot of questions that I think need to be defined
as part of the API (if we haven't already). What is the contract for ordering
and interactions between different types of pushdown? Is it valid to push down
in pieces or will we only call the method once? (Sorry if this is written down
and I've just missed it.)

My gut feeling is that we don't really want to fuse incrementally. It
seems hard to reason about correctness and interactions between different
things that have been pushed. As I hinted at before, I think it's most natural
to split the concerns of pushdown within a query plan and fusing of operators.
But maybe this is limited in some way I don't realize.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87004/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87004 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87004/testReport)**
 for PR 20387 at commit 
[`ab945a1`](https://github.com/apache/spark/commit/ab945a19efe666c41deae9c044002f3455220c1d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/20387
  
> Why pushdown is happening in logical optimization and not during query 
planning. My first instinct would be to have the optimizer get operators as 
close to the leaves as possible and then fuse (or push down) as we convert to 
physical plan. I'm probably missing something.

I think there are two reasons, but I'm not fully convinced by either one:

* [`computeStats`](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala#L232)
is defined on logical plans, so the result of filter push-down needs to be a
logical plan if we want to be able to use accurate stats for a scan. I'm
interested here to ensure that we correctly produce broadcast relations based
on the actual scan stats, not the table-level stats. Maybe there's another way
to do this?
* One of the tests for DSv2 ends up invoking the push-down rule twice, 
which made me think about whether or not that should be valid. I think it 
probably should be. For example, what if a plan has nodes that can all be 
pushed, but they aren't in the right order? Or what if a projection wasn't 
pushed through a filter because of a rule problem, but it can still be pushed 
down? Incremental fusing during optimization might be an extensible way to 
handle odd cases, or it may be useless. I'm not quite sure yet.

It would be great to hear your perspective on these.
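
On the second point, one way to picture "invoking the rule twice should be
valid" is a rule that is idempotent on an immutable plan. The sketch below is
a toy Python model, not the DSv2 rule itself: `Relation`, `Filter`, and
`push_filters` are hypothetical names, and conditions are plain strings.

```python
from dataclasses import dataclass, replace
from typing import Tuple, Union


@dataclass(frozen=True)
class Relation:
    table: str
    filters: Tuple[str, ...] = ()  # filters already fused into the scan


@dataclass(frozen=True)
class Filter:
    condition: str
    child: Union["Filter", Relation]


def push_filters(plan: Union[Filter, Relation]) -> Union[Filter, Relation]:
    """Toy push-down rule: fuse Filter nodes that sit on top of a Relation."""
    if isinstance(plan, Filter):
        child = push_filters(plan.child)
        if isinstance(child, Relation):
            # Build a new relation; the input plan is never mutated.
            return replace(child, filters=child.filters + (plan.condition,))
        return replace(plan, child=child)
    return plan


plan = Filter("a > 1", Filter("b < 2", Relation("t")))
once = push_filters(plan)
twice = push_filters(once)
```

Here the second application finds nothing left to fuse and returns the same
plan, so running the rule again is harmless in this model.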


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87007 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87007/testReport)**
 for PR 20387 at commit 
[`f1d9872`](https://github.com/apache/spark/commit/f1d9872a2699cdbd5c87b02e702dc8103335131d).


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/543/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/541/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87004 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87004/testReport)**
 for PR 20387 at commit 
[`ab945a1`](https://github.com/apache/spark/commit/ab945a19efe666c41deae9c044002f3455220c1d).


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87002 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87002/testReport)**
 for PR 20387 at commit 
[`a7f0b90`](https://github.com/apache/spark/commit/a7f0b90b6ccb85c0801934ce7841831fe37b8739).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87002/
Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87002 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87002/testReport)**
 for PR 20387 at commit 
[`a7f0b90`](https://github.com/apache/spark/commit/a7f0b90b6ccb85c0801934ce7841831fe37b8739).


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/539/
Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20387
  
**[Test build #87001 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87001/testReport)**
 for PR 20387 at commit 
[`63f11a9`](https://github.com/apache/spark/commit/63f11a92ffefaf937dd0266e3e59d619d33ab873).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87001/
Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20387
  
Merged build finished. Test PASSed.


---



