[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22104 Just realized that the PR title and description have not been updated. @icexelloss can you update them? Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22104 Thanks all for the review! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95312/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #95312 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95312/testReport)** for PR 22104 at commit [`3f0a97a`](https://github.com/apache/spark/commit/3f0a97a89b39d2ad57c587e49bb07203a670faba). * This patch passes all tests. * This patch **does not merge cleanly**. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22104 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22104 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95317/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #95317 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95317/testReport)** for PR 22104 at commit [`2325a4f`](https://github.com/apache/spark/commit/2325a4f18a2bc6cc95d96bc5ac6790749b3e927e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #95317 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95317/testReport)** for PR 22104 at commit [`2325a4f`](https://github.com/apache/spark/commit/2325a4f18a2bc6cc95d96bc5ac6790749b3e927e). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2594/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #95312 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95312/testReport)** for PR 22104 at commit [`3f0a97a`](https://github.com/apache/spark/commit/3f0a97a89b39d2ad57c587e49bb07203a670faba). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2589/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95309/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #95309 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95309/testReport)** for PR 22104 at commit [`8a8e0b9`](https://github.com/apache/spark/commit/8a8e0b9d6cedb01d9a55db0f30e9ea243f757ad8).
* This patch **fails Python style tests**.
* This patch **does not merge cleanly**.
* This patch adds the following public classes _(experimental)_:
  * `case class ArrowEvalPython(udfs: Seq[PythonUDF], output: Seq[Attribute], child: LogicalPlan)`
  * `case class BatchEvalPython(udfs: Seq[PythonUDF], output: Seq[Attribute], child: LogicalPlan)`

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #95309 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95309/testReport)** for PR 22104 at commit [`8a8e0b9`](https://github.com/apache/spark/commit/8a8e0b9d6cedb01d9a55db0f30e9ea243f757ad8). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2587/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22104 Can we make `ExtractPythonUDFs` a logical plan rule instead of a physical one? Then all the problems go away, since it would run before the data source strategy. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
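The ordering argument in the comment above can be illustrated with a toy model. This is a minimal sketch in plain Python, not Spark code: `Predicate`, `Scan`, `extract_python_udfs`, `data_source_strategy`, and the two `plan_*` functions are hypothetical stand-ins invented for illustration, and the real `ExtractPythonUDFs` rule and data source strategy are considerably more involved.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Toy stand-ins for plan pieces; none of these are Spark classes.
@dataclass(frozen=True)
class Predicate:
    text: str
    has_python_udf: bool


@dataclass(frozen=True)
class Scan:
    pushed: Tuple[Predicate, ...] = ()


def extract_python_udfs(filters: List[Predicate]):
    """Split predicates into (safe for the scan, needs a Python eval node)."""
    safe = [p for p in filters if not p.has_python_udf]
    udf = [p for p in filters if p.has_python_udf]
    return safe, udf


def data_source_strategy(filters: List[Predicate]) -> Scan:
    """Assumption: the strategy pushes every predicate it is handed."""
    return Scan(pushed=tuple(filters))


def plan_extract_late(filters: List[Predicate]):
    # Extraction happens after scan planning, so the scan sees the UDF predicate.
    return data_source_strategy(filters), []


def plan_extract_early(filters: List[Predicate]):
    # Extraction happens first, so only UDF-free predicates reach the scan.
    safe, udf = extract_python_udfs(filters)
    return data_source_strategy(safe), udf
```

With the filters `[Predicate("id = 1", False), Predicate("pythonUDF0 = 0", True)]`, `plan_extract_late` leaves the UDF predicate inside `Scan.pushed`, while `plan_extract_early` keeps it out of the scan and returns it separately for later evaluation. That is the sense in which running the extraction earlier would sidestep the problem.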
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22104 I mean, the current code will still break partitioned tables:

```
== Physical Plan ==
*(3) Project [_c0#223, pythonUDF0#231 AS v1#226]
+- BatchEvalPython [(0)], [_c0#223, pythonUDF0#231]
   +- *(2) Project [_c0#223]
      +- *(2) Filter (pythonUDF0#230 = 0)
         +- BatchEvalPython [(0)], [_c0#223, pythonUDF0#230]
            +- *(1) FileScan csv [_c0#223] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/tmp/tab3], PartitionFilters: [((0) = 0)], PushedFilters: [], ReadSchema: struct<_c0:string>
```

For instance:

```python
from pyspark.sql.functions import udf, lit, col

spark.range(1).selectExpr("id", "id as value").write.mode("overwrite").format('csv').partitionBy("id").save("/tmp/tab3")
df = spark.read.csv('/tmp/tab3')
df2 = df.withColumn('v1', udf(lambda x: x, 'int')(lit(0)))
df2 = df2.filter(df2['v1'] == 0)
df2.explain()
```

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22104 @icexelloss, why was https://github.com/apache/spark/pull/22104/commits/ccb27bb1ab75e33913f37a4dbe84793e6b9ddeec reverted in this PR? It looks like the correct approach. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95246/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #95246 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95246/testReport)** for PR 22104 at commit [`4d1ae29`](https://github.com/apache/spark/commit/4d1ae29a0b9777e0ce0ae26782280d3230e03396). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #95246 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95246/testReport)** for PR 22104 at commit [`4d1ae29`](https://github.com/apache/spark/commit/4d1ae29a0b9777e0ce0ae26782280d3230e03396). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2551/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22104 Let me take another look today or tomorrow. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22104 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95171/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #95171 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95171/testReport)** for PR 22104 at commit [`4d1ae29`](https://github.com/apache/spark/commit/4d1ae29a0b9777e0ce0ae26782280d3230e03396). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22104 @HyukjinKwon I addressed the comments. Do you mind taking another look? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95169/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #95169 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95169/testReport)** for PR 22104 at commit [`6b7445c`](https://github.com/apache/spark/commit/6b7445c60d07aea6d05aa59efa3b60b4de590313). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #95171 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95171/testReport)** for PR 22104 at commit [`4d1ae29`](https://github.com/apache/spark/commit/4d1ae29a0b9777e0ce0ae26782280d3230e03396). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2499/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #95169 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95169/testReport)** for PR 22104 at commit [`6b7445c`](https://github.com/apache/spark/commit/6b7445c60d07aea6d05aa59efa3b60b4de590313). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22104 Tests pass now. This comment https://github.com/apache/spark/pull/22104/files#r210414941 requires some attention. @cloud-fan Do you think this is the right way to handle GenericInternalRow inputs here? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94848/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #94848 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94848/testReport)** for PR 22104 at commit [`dcf07fb`](https://github.com/apache/spark/commit/dcf07fb4bae8206690db952da6aeeba342cc34f0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #94848 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94848/testReport)** for PR 22104 at commit [`dcf07fb`](https://github.com/apache/spark/commit/dcf07fb4bae8206690db952da6aeeba342cc34f0). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2244/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94822/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #94822 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94822/testReport)** for PR 22104 at commit [`fa7a869`](https://github.com/apache/spark/commit/fa7a8697a9b6812481ab25721311fac8b15bc233). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2228/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94826/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #94826 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94826/testReport)** for PR 22104 at commit [`8409611`](https://github.com/apache/spark/commit/84096114ae20e1c76ba58028083e5fdad7785e22). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #94826 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94826/testReport)** for PR 22104 at commit [`8409611`](https://github.com/apache/spark/commit/84096114ae20e1c76ba58028083e5fdad7785e22). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2226/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #94822 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94822/testReport)** for PR 22104 at commit [`fa7a869`](https://github.com/apache/spark/commit/fa7a8697a9b6812481ab25721311fac8b15bc233). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2223/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #94820 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94820/testReport)** for PR 22104 at commit [`38f3dbb`](https://github.com/apache/spark/commit/38f3dbbbd7d77b59b8441daf14f3a94ead1401b9). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94820/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test FAILed.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #94820 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94820/testReport)** for PR 22104 at commit [`38f3dbb`](https://github.com/apache/spark/commit/38f3dbbbd7d77b59b8441daf14f3a94ead1401b9).
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22104 Thanks @HyukjinKwon and @cloud-fan! I will take a look.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22104 > we can implement a dummy data source v1/v2 at scala side There's an example in https://github.com/apache/spark/pull/21007 that implements something in Scala and uses it in a Python-side test.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22104 @icexelloss we can implement a dummy data source v1/v2 on the Scala side and scan them in a PySpark test.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22104 I think another way to fix this is to move the logic to `ExtractPythonUDF` so that it ignores `FileScanExec`, `DataSourceScanExec` and `DataSourceV2ScanExec`, instead of changing all three rules. The downside is that if an XScanExec node with a pushed Python UDF filter throws an exception somewhere else, we need to fix that too. Not sure which way is better. Either way, it would be good to create test cases with data source and data source V2... I would appreciate some advice on how to create such a relation in a test.
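The idea in the PR title (excluding filters that contain Python UDFs from what gets pushed into a scan) can be illustrated with a toy model. This is plain Python and not Spark code: `Predicate`, `has_python_udf` and `split_pushable` are hypothetical names invented for this sketch, and the real change operates on Catalyst expressions inside the planner strategies.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Predicate:
    """Toy stand-in for a filter expression (hypothetical, not a Spark class)."""
    sql: str
    has_python_udf: bool = False


def split_pushable(filters: List[Predicate]) -> Tuple[List[Predicate], List[Predicate]]:
    """Split filters into (pushed into the scan, kept above the scan).

    Assumption for this sketch: any predicate that contains a Python UDF
    must stay above the scan so it can be evaluated separately later.
    """
    pushed = [f for f in filters if not f.has_python_udf]
    kept = [f for f in filters if f.has_python_udf]
    return pushed, kept


filters = [
    Predicate("a > 1"),
    Predicate("my_udf(b) = 2", has_python_udf=True),
]
pushed, kept = split_pushable(filters)
print([p.sql for p in pushed])
print([p.sql for p in kept])
```

In this model only `a > 1` reaches the scan, while the UDF predicate remains in a filter above it. The alternative discussed above would instead leave the scan nodes untouched and teach the extraction rule to skip them.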
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22104 @gatorsmile Can you advise on how to create a df with a data source? All my attempts end up triggering `FileSourceStrategy`, not `DataSourceStrategy`.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22104 @gatorsmile Possibly, let me see if I can create a test case.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22104 @icexelloss Do we face the same issue for DataSourceStrategy?
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22104 retest please
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94747/ Test FAILed.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test FAILed.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #94747 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94747/testReport)** for PR 22104 at commit [`3e167a6`](https://github.com/apache/spark/commit/3e167a64bc43bbda3f376db6c5ef4bb0c24850d2). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test PASSed.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2179/ Test PASSed.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22104 cc @cloud-fan. Followed your suggestion here: https://issues.apache.org/jira/browse/SPARK-24721?focusedCommentId=16560537&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16560537
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #94747 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94747/testReport)** for PR 22104 at commit [`3e167a6`](https://github.com/apache/spark/commit/3e167a64bc43bbda3f376db6c5ef4bb0c24850d2).
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Merged build finished. Test PASSed.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2178/ Test PASSed.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Build finished. Test FAILed.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #94746 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94746/testReport)** for PR 22104 at commit [`512f4b6`](https://github.com/apache/spark/commit/512f4b64cb7662baa23995c6f6c109a735ec8f5e). * This patch **fails Python style tests**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94746/ Test FAILed.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22104 **[Test build #94746 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94746/testReport)** for PR 22104 at commit [`512f4b6`](https://github.com/apache/spark/commit/512f4b64cb7662baa23995c6f6c109a735ec8f5e).
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2177/ Test PASSed.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22104 Build finished. Test PASSed.