[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20476 LGTM. Thanks! Merged to master/2.3 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20476 Merged build finished. Test PASSed.

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20476 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86956/ Test PASSed.

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20476 **[Test build #86956 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86956/testReport)** for PR 20476 at commit [`12c8035`](https://github.com/apache/spark/commit/1

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20476 Merged build finished. Test PASSed.

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20476 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/500/ Test

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20476 Merged build finished. Test PASSed.

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20476 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/498/ Test

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20476 **[Test build #86956 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86956/testReport)** for PR 20476 at commit [`12c8035`](https://github.com/apache/spark/commit/12

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20476 Since you are becoming more and more familiar with our code, I believe you can offer us more useful input. Let me merge this PR to fix the bugs. Then, we can have more detailed discus

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/20476 Yeah, I did review it, but at the time I wasn't familiar with how the other code paths worked and assumed that it was necessary to introduce this. I wasn't very familiar with how it *should* work, so

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20476 https://github.com/apache/spark/pull/19424 is the original PR that introduced the new rule `PushDownOperatorsToDataSource`. Both of us reviewed it. : ) Thank you for your understanding!

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/20476 @gatorsmile, Do you mean this? > Extensibility is not good and operator push-down capabilities are limited. If so, that's very open to interpretation. I would assume it means that the

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20476 @rdblue Operator pushdown is part of the [data source API V2 SPIP](https://docs.google.com/document/d/1n_vUVbF4KD3gxTmkNEon5qdQ-Z8qU5Frf6WMQZ6jJVM/edit#): https://issues.apache.org/jira/browse/SP

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/20476 @gatorsmile, thanks for the context. If we need to redesign push-down, then I think we should do that separately and with a design plan. **I don't think it's a good idea to bundle it into an

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20476 To everyone, this is a bug fix we should merge before the next RC of Spark 2.3.

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20476 @rdblue To be honest, the push-down solution in the current code base is not well designed. We got a lot of feedback from the community (e.g., SAP and IBM Research). One proposed a bottom-up solutio

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/20476 @cloud-fan, @gatorsmile, this PR demonstrates why we should use PhysicalOperation. I ported the tests from this PR over to our branch and they pass without modifying the push-down code. That's becaus

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20476 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86933/ Test PASSed.

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20476 Merged build finished. Test PASSed.

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20476 **[Test build #86933 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86933/testReport)** for PR 20476 at commit [`353dd6b`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20476 @rdblue I know you want to use `PhysicalOperation` to replace the current operator pushdown rule, but before we reach a consensus, I think we should still fix bugs in the existing code.

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20476 cc @gatorsmile @rdblue most of the changes are tests.

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20476 **[Test build #86933 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86933/testReport)** for PR 20476 at commit [`353dd6b`](https://github.com/apache/spark/commit/35

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20476 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/481/ Test

[GitHub] spark issue #20476: [SPARK-23301][SQL] data source column pruning should wor...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20476 Merged build finished. Test PASSed.