[GitHub] [spark] wangyum commented on a change in pull request #25092: [SPARK-28312][SQL][TEST] Port numeric.sql
wangyum commented on a change in pull request #25092: [SPARK-28312][SQL][TEST] Port numeric.sql
URL: https://github.com/apache/spark/pull/25092#discussion_r301895001
## File path: sql/core/src/test/resources/sql-tests/inputs/pgSQL/numeric.sql ##
@@ -0,0 +1,1098 @@
+--
+-- Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+--
+--
+-- NUMERIC
+-- https://github.com/postgres/postgres/blob/REL_12_BETA1/src/test/regress/sql/numeric.sql
+--
+-- This test suite contains eight Cartesian products without using explicit CROSS JOIN syntax.
+-- Thus, we set spark.sql.crossJoin.enabled to true.
+set spark.sql.crossJoin.enabled=true;
+
+CREATE TABLE num_data (id int, val decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_add (id1 int, id2 int, expected decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_sub (id1 int, id2 int, expected decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_div (id1 int, id2 int, expected decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_mul (id1 int, id2 int, expected decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_sqrt (id int, expected decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_ln (id int, expected decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_log10 (id int, expected decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_power_10_ln (id int, expected decimal(38,10)) USING parquet;
+
+CREATE TABLE num_result (id1 int, id2 int, result decimal(38,10)) USING parquet;
+
+
+-- **
+-- * The following EXPECTED results are computed by bc(1)
+-- * with a scale of 200
+-- **
+
+-- BEGIN TRANSACTION;
+INSERT INTO num_exp_add VALUES (0,0,'0');
+INSERT INTO num_exp_sub VALUES (0,0,'0');
+INSERT INTO num_exp_mul VALUES (0,0,'0');
+INSERT INTO num_exp_div VALUES (0,0,'NaN');
+INSERT INTO num_exp_add VALUES (0,1,'0');
+INSERT INTO num_exp_sub VALUES (0,1,'0');
+INSERT INTO num_exp_mul VALUES (0,1,'0');
+INSERT INTO num_exp_div VALUES (0,1,'NaN');
+INSERT INTO num_exp_add VALUES (0,2,'-34338492.215397047');
+INSERT INTO num_exp_sub VALUES (0,2,'34338492.215397047');
+INSERT INTO num_exp_mul VALUES (0,2,'0');
+INSERT INTO num_exp_div VALUES (0,2,'0');
+INSERT INTO num_exp_add VALUES (0,3,'4.31');
+INSERT INTO num_exp_sub VALUES (0,3,'-4.31');
+INSERT INTO num_exp_mul VALUES (0,3,'0');
+INSERT INTO num_exp_div VALUES (0,3,'0');
+INSERT INTO num_exp_add VALUES (0,4,'7799461.4119');
+INSERT INTO num_exp_sub VALUES (0,4,'-7799461.4119');
+INSERT INTO num_exp_mul VALUES (0,4,'0');
+INSERT INTO num_exp_div VALUES (0,4,'0');
+INSERT INTO num_exp_add VALUES (0,5,'16397.038491');
+INSERT INTO num_exp_sub VALUES (0,5,'-16397.038491');
+INSERT INTO num_exp_mul VALUES (0,5,'0');
+INSERT INTO num_exp_div VALUES (0,5,'0');
+INSERT INTO num_exp_add VALUES (0,6,'93901.57763026');
+INSERT INTO num_exp_sub VALUES (0,6,'-93901.57763026');
+INSERT INTO num_exp_mul VALUES (0,6,'0');
+INSERT INTO num_exp_div VALUES (0,6,'0');
+INSERT INTO num_exp_add VALUES (0,7,'-83028485');
+INSERT INTO num_exp_sub VALUES (0,7,'83028485');
+INSERT INTO num_exp_mul VALUES (0,7,'0');
+INSERT INTO num_exp_div VALUES (0,7,'0');
+INSERT INTO num_exp_add VALUES (0,8,'74881');
+INSERT INTO num_exp_sub VALUES (0,8,'-74881');
+INSERT INTO num_exp_mul VALUES (0,8,'0');
+INSERT INTO num_exp_div VALUES (0,8,'0');
+INSERT INTO num_exp_add VALUES (0,9,'-24926804.045047420');
+INSERT INTO num_exp_sub VALUES (0,9,'24926804.045047420');
+INSERT INTO num_exp_mul VALUES (0,9,'0');
+INSERT INTO num_exp_div VALUES (0,9,'0');
+INSERT INTO num_exp_add VALUES (1,0,'0');
+INSERT INTO num_exp_sub VALUES (1,0,'0');
+INSERT INTO num_exp_mul VALUES (1,0,'0');
+INSERT INTO num_exp_div VALUES (1,0,'NaN');
+INSERT INTO num_exp_add VALUES (1,1,'0');
+INSERT INTO num_exp_sub VALUES (1,1,'0');
+INSERT INTO num_exp_mul VALUES (1,1,'0');
+INSERT INTO num_exp_div VALUES (1,1,'NaN');
+INSERT INTO num_exp_add VALUES (1,2,'-34338492.215397047');
+INSERT INTO num_exp_sub VALUES (1,2,'34338492.215397047');
+INSERT INTO num_exp_mul VALUES (1,2,'0');
+INSERT INTO num_exp_div VALUES (1,2,'0');
+INSERT INTO num_exp_add VALUES (1,3,'4.31');
+INSERT INTO num_exp_sub VALUES (1,3,'-4.31');
+INSERT INTO num_exp_mul VALUES (1,3,'0');
+INSERT INTO num_exp_div VALUES (1,3,'0');
+INSERT INTO num_exp_add VALUES (1,4,'7799461.4119');
+INSERT INTO num_exp_sub VALUES (1,4,'-7799461.4119');
+INSERT INTO num_exp_mul VALUES (1,4,'0');
+INSERT INTO num_exp_div VALUES (1,4,'0');
+INSERT INTO num_exp_add VALUES (1,5,'16397.038491');
+INSERT INTO num_exp_sub VALUES (1,5,'-16397.038491');
+INSERT INTO num_exp_mul VALUES (1,5,'0');
+INSERT INTO num_exp_div VALUES (1,5,'0');
+INSERT INTO num_exp_add VALUES (1,6,'93901.57763026');
+INSERT INTO num_exp_sub VALUES (1,6,'-93901.57763026');
+INSERT INTO num_exp_mul VALUES (1,6,'0');
+INSERT INTO num_exp_div VALUES (1,6,'0');
+INSERT INTO num_exp_add VALUES (1,7,'-83028485');
+INSERT INTO num_exp_sub VALUES (1,7,'83028485');
+INSERT
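The expected rows quoted in the diff above can be cross-checked with a few lines of Python. This is only a rough sketch and not part of the PR: the `vals` mapping is an assumption inferred from the quoted rows (the `add` rows for `id1 = 0` equal the operand for `id2`, which suggests id 0 holds 0), and the helper name `mismatches` is hypothetical. Only a handful of rows are copied in.

```python
from decimal import Decimal

# Operand values inferred from the quoted expected rows (an assumption,
# not taken from num_data itself, which is not shown in the diff excerpt).
vals = {
    0: Decimal("0"),
    2: Decimal("-34338492.215397047"),
    3: Decimal("4.31"),
}

# (id1, id2) -> expected value, copied from the INSERT statements in the diff.
exp_add = {(0, 2): Decimal("-34338492.215397047"), (0, 3): Decimal("4.31")}
exp_sub = {(0, 2): Decimal("34338492.215397047"), (0, 3): Decimal("-4.31")}
exp_mul = {(0, 2): Decimal("0"), (0, 3): Decimal("0")}


def mismatches(expected, op):
    """Return the (id1, id2) pairs whose expected value differs from op(a, b)."""
    return [k for k, want in expected.items()
            if op(vals[k[0]], vals[k[1]]) != want]


add_bad = mismatches(exp_add, lambda a, b: a + b)
sub_bad = mismatches(exp_sub, lambda a, b: a - b)
mul_bad = mismatches(exp_mul, lambda a, b: a * b)
print(add_bad, sub_bad, mul_bad)
```

An empty list for each operation means the copied rows are consistent with the inferred operands; it says nothing about how Spark itself evaluates `decimal(38,10)` arithmetic.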
[GitHub] [spark] AmplabJenkins removed a comment on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.
AmplabJenkins removed a comment on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#issuecomment-509917786 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #24933: [SPARK-28136][SQL][TEST] Port int8.sql
AmplabJenkins removed a comment on issue #24933: [SPARK-28136][SQL][TEST] Port int8.sql URL: https://github.com/apache/spark/pull/24933#issuecomment-509917672 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107425/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24933: [SPARK-28136][SQL][TEST] Port int8.sql
AmplabJenkins removed a comment on issue #24933: [SPARK-28136][SQL][TEST] Port int8.sql URL: https://github.com/apache/spark/pull/24933#issuecomment-509917666 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.
AmplabJenkins removed a comment on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#issuecomment-509917790 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12575/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.
AmplabJenkins commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#issuecomment-509917790 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12575/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24933: [SPARK-28136][SQL][TEST] Port int8.sql
AmplabJenkins commented on issue #24933: [SPARK-28136][SQL][TEST] Port int8.sql URL: https://github.com/apache/spark/pull/24933#issuecomment-509917666 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24933: [SPARK-28136][SQL][TEST] Port int8.sql
AmplabJenkins commented on issue #24933: [SPARK-28136][SQL][TEST] Port int8.sql URL: https://github.com/apache/spark/pull/24933#issuecomment-509917672 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107425/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.
AmplabJenkins commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#issuecomment-509917786 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #24933: [SPARK-28136][SQL][TEST] Port int8.sql
SparkQA removed a comment on issue #24933: [SPARK-28136][SQL][TEST] Port int8.sql URL: https://github.com/apache/spark/pull/24933#issuecomment-509884519 **[Test build #107425 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107425/testReport)** for PR 24933 at commit [`bd3a6c3`](https://github.com/apache/spark/commit/bd3a6c36f120d698cb2658c3ed29303b5a3e8610).
[GitHub] [spark] SparkQA commented on issue #24933: [SPARK-28136][SQL][TEST] Port int8.sql
SparkQA commented on issue #24933: [SPARK-28136][SQL][TEST] Port int8.sql URL: https://github.com/apache/spark/pull/24933#issuecomment-509917214 **[Test build #107425 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107425/testReport)** for PR 24933 at commit [`bd3a6c3`](https://github.com/apache/spark/commit/bd3a6c36f120d698cb2658c3ed29303b5a3e8610).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.
SparkQA commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#issuecomment-509916561 **[Test build #107440 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107440/testReport)** for PR 19096 at commit [`972164d`](https://github.com/apache/spark/commit/972164de305f010954d796eb766616356b30d759).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql
AmplabJenkins removed a comment on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql URL: https://github.com/apache/spark/pull/25092#issuecomment-509916109 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12574/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql
AmplabJenkins removed a comment on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql URL: https://github.com/apache/spark/pull/25092#issuecomment-509916096 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql
AmplabJenkins commented on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql URL: https://github.com/apache/spark/pull/25092#issuecomment-509916096 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql
AmplabJenkins commented on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql URL: https://github.com/apache/spark/pull/25092#issuecomment-509916109 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12574/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql
SparkQA commented on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql URL: https://github.com/apache/spark/pull/25092#issuecomment-509914965 **[Test build #107439 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107439/testReport)** for PR 25092 at commit [`60d3e65`](https://github.com/apache/spark/commit/60d3e6569c698ed1dab316fca2aec9f389dc1685).
[GitHub] [spark] wangyum opened a new pull request #25092: [SPARK-28312][SQL][TEST] Port numeric.sql
wangyum opened a new pull request #25092: [SPARK-28312][SQL][TEST] Port numeric.sql URL: https://github.com/apache/spark/pull/25092
## What changes were proposed in this pull request?
run test first, will add it later. (Please fill in changes proposed in this fix)
## How was this patch tested?
N/A
[GitHub] [spark] AmplabJenkins commented on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2)
AmplabJenkins commented on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2) URL: https://github.com/apache/spark/pull/25063#issuecomment-509912721 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2)
AmplabJenkins removed a comment on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2) URL: https://github.com/apache/spark/pull/25063#issuecomment-509912725 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107437/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2)
SparkQA removed a comment on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2) URL: https://github.com/apache/spark/pull/25063#issuecomment-509910399 **[Test build #107437 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107437/testReport)** for PR 25063 at commit [`ba30f58`](https://github.com/apache/spark/commit/ba30f586008df0b083bb84a5091cbdf91988a540).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2)
AmplabJenkins removed a comment on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2) URL: https://github.com/apache/spark/pull/25063#issuecomment-509912721 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2)
AmplabJenkins commented on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2) URL: https://github.com/apache/spark/pull/25063#issuecomment-509912725 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107437/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2)
SparkQA commented on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2) URL: https://github.com/apache/spark/pull/25063#issuecomment-509912649 **[Test build #107437 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107437/testReport)** for PR 25063 at commit [`ba30f58`](https://github.com/apache/spark/commit/ba30f586008df0b083bb84a5091cbdf91988a540).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base
AmplabJenkins removed a comment on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base URL: https://github.com/apache/spark/pull/25089#issuecomment-509911001 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107427/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base
AmplabJenkins removed a comment on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base URL: https://github.com/apache/spark/pull/25089#issuecomment-509910998 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2)
AmplabJenkins removed a comment on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2) URL: https://github.com/apache/spark/pull/25063#issuecomment-509911413 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
AmplabJenkins removed a comment on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND' URL: https://github.com/apache/spark/pull/25000#issuecomment-509911396 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12573/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
AmplabJenkins removed a comment on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND' URL: https://github.com/apache/spark/pull/25000#issuecomment-509911395 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2)
AmplabJenkins removed a comment on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2) URL: https://github.com/apache/spark/pull/25063#issuecomment-509911421 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12572/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
AmplabJenkins commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND' URL: https://github.com/apache/spark/pull/25000#issuecomment-509911395 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2)
AmplabJenkins commented on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2) URL: https://github.com/apache/spark/pull/25063#issuecomment-509911413 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
AmplabJenkins commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND' URL: https://github.com/apache/spark/pull/25000#issuecomment-509911396 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12573/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2)
AmplabJenkins commented on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2) URL: https://github.com/apache/spark/pull/25063#issuecomment-509911421 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12572/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base
SparkQA removed a comment on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base URL: https://github.com/apache/spark/pull/25089#issuecomment-509888717 **[Test build #107427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107427/testReport)** for PR 25089 at commit [`c7b4c46`](https://github.com/apache/spark/commit/c7b4c46425811ce68003dcf851a7d1867973ddf3).
[GitHub] [spark] AmplabJenkins commented on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base
AmplabJenkins commented on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base URL: https://github.com/apache/spark/pull/25089#issuecomment-509911001 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107427/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base
AmplabJenkins commented on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base URL: https://github.com/apache/spark/pull/25089#issuecomment-509910998 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base
SparkQA commented on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base URL: https://github.com/apache/spark/pull/25089#issuecomment-509910838 **[Test build #107427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107427/testReport)** for PR 25089 at commit [`c7b4c46`](https://github.com/apache/spark/commit/c7b4c46425811ce68003dcf851a7d1867973ddf3). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2)
SparkQA commented on issue #25063: [SPARK-28267][DOC] Update building-spark.md(support build with hadoop-3.2) URL: https://github.com/apache/spark/pull/25063#issuecomment-509910399 **[Test build #107437 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107437/testReport)** for PR 25063 at commit [`ba30f58`](https://github.com/apache/spark/commit/ba30f586008df0b083bb84a5091cbdf91988a540).
[GitHub] [spark] SparkQA commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
SparkQA commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND' URL: https://github.com/apache/spark/pull/25000#issuecomment-509910403 **[Test build #107438 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107438/testReport)** for PR 25000 at commit [`ebe917c`](https://github.com/apache/spark/commit/ebe917c0a8acc1b5a83b51a570749d5cd343160d).
[GitHub] [spark] viirya commented on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition
viirya commented on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition URL: https://github.com/apache/spark/pull/25091#issuecomment-509908957 Thanks! @HyukjinKwon
[GitHub] [spark] AmplabJenkins removed a comment on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
AmplabJenkins removed a comment on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND' URL: https://github.com/apache/spark/pull/25000#issuecomment-509908450 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
AmplabJenkins removed a comment on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND' URL: https://github.com/apache/spark/pull/25000#issuecomment-509908455 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12571/ Test PASSed.
[GitHub] [spark] HyukjinKwon commented on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition
HyukjinKwon commented on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition URL: https://github.com/apache/spark/pull/25091#issuecomment-509908407 Cool @viirya! I will take a closer look within 2 days and get this in.
[GitHub] [spark] AmplabJenkins commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
AmplabJenkins commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND' URL: https://github.com/apache/spark/pull/25000#issuecomment-509908455 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12571/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
AmplabJenkins commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND' URL: https://github.com/apache/spark/pull/25000#issuecomment-509908450 Merged build finished. Test PASSed.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
dongjoon-hyun commented on a change in pull request #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
URL: https://github.com/apache/spark/pull/25000#discussion_r301886053
## File path: common/unsafe/src/main/java/org/apache/spark/unsafe/types/CalendarInterval.java ##
@@ -174,12 +191,47 @@ public static CalendarInterval fromDayTimeString(String s) throws IllegalArgumen
 int sign = m.group(1) != null && m.group(1).equals("-") ? -1 : 1;
 long days = m.group(2) == null ? 0 : toLongWithRange("day", m.group(3),
Review comment: I intentionally kept the original to reduce the size of the patch.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
dongjoon-hyun commented on a change in pull request #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
URL: https://github.com/apache/spark/pull/25000#discussion_r301885929
## File path: common/unsafe/src/main/java/org/apache/spark/unsafe/types/CalendarInterval.java ##
@@ -160,6 +160,20 @@ public static CalendarInterval fromYearMonthString(String s) throws IllegalArgum
 * adapted from HiveIntervalDayTime.valueOf
 */
 public static CalendarInterval fromDayTimeString(String s) throws IllegalArgumentException {
+return fromDayTimeString(s, "day", "second");
+ }
+
+ /**
+ * Parse dayTime string in form: [-]d HH:mm:ss.n and [-]HH:mm:ss.n
+ *
+ * adapted from HiveIntervalDayTime.valueOf.
+ * Below interval conversion patterns are supported:
+ * - DAY TO (HOUR|MINUTE|SECOND)
+ * - HOUR TO (MINUTE|SECOND)
+ * - HOUR TO SECOND
Review comment: Oops. My bad. This should be `MINUTE TO SECOND`.
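For readers unfamiliar with the `[-]d HH:mm:ss.n` and `[-]HH:mm:ss.n` forms named in the Javadoc above, here is a minimal sketch of how such a string could be turned into microseconds. It is not the code under review: `DayTimeSketch`, `toMicros`, and `inRange` are hypothetical names, the regular expression is an assumption, and the `from`/`to` unit bounds that the new `fromDayTimeString(s, from, to)` overload takes are left out entirely.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not Spark's CalendarInterval. */
final class DayTimeSketch {
  // Assumed pattern: optional sign, optional "d " day part, HH:mm:ss, optional ".n" fraction.
  private static final Pattern DAY_TIME =
      Pattern.compile("^([+-])?(?:(\\d+) )?(\\d+):(\\d+):(\\d+)(?:\\.(\\d{1,9}))?$");

  static final long MICROS_PER_SECOND = 1_000_000L;
  static final long MICROS_PER_MINUTE = 60 * MICROS_PER_SECOND;
  static final long MICROS_PER_HOUR = 60 * MICROS_PER_MINUTE;
  static final long MICROS_PER_DAY = 24 * MICROS_PER_HOUR;

  /** Returns the interval length in microseconds; throws on malformed input. */
  static long toMicros(String s) {
    Matcher m = DAY_TIME.matcher(s.trim());
    if (!m.matches()) {
      throw new IllegalArgumentException("Not a day-time interval string: " + s);
    }
    long sign = "-".equals(m.group(1)) ? -1 : 1;
    long days = m.group(2) == null ? 0 : Long.parseLong(m.group(2));
    long hours = inRange("hour", m.group(3), 23);
    long minutes = inRange("minute", m.group(4), 59);
    long seconds = inRange("second", m.group(5), 59);
    long micros = 0;
    if (m.group(6) != null) {
      // Right-pad the fraction to nine digits (nanoseconds), then truncate to microseconds.
      String nanos = (m.group(6) + "000000000").substring(0, 9);
      micros = Long.parseLong(nanos) / 1000;
    }
    return sign * (days * MICROS_PER_DAY + hours * MICROS_PER_HOUR
        + minutes * MICROS_PER_MINUTE + seconds * MICROS_PER_SECOND + micros);
  }

  // Parses a non-negative field and rejects values above max.
  private static long inRange(String field, String text, long max) {
    long v = Long.parseLong(text);
    if (v > max) {
      throw new IllegalArgumentException(field + " " + v + " outside range [0, " + max + "]");
    }
    return v;
  }
}
```

Under these assumptions, `DayTimeSketch.toMicros("1 02:03:04.5")` covers the form with a day part and `DayTimeSketch.toMicros("-00:00:01")` the form without one.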
[GitHub] [spark] SparkQA commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls])
SparkQA commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls]) URL: https://github.com/apache/spark/pull/25082#issuecomment-509907415 **[Test build #107435 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107435/testReport)** for PR 25082 at commit [`b3787f5`](https://github.com/apache/spark/commit/b3787f5cebc1c86c1f82a16950f6aef98791fffd).
[GitHub] [spark] dongjoon-hyun commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
dongjoon-hyun commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND' URL: https://github.com/apache/spark/pull/25000#issuecomment-509907492 Thank you for merging.
[GitHub] [spark] SparkQA commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
SparkQA commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND' URL: https://github.com/apache/spark/pull/25000#issuecomment-509907459 **[Test build #107436 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107436/testReport)** for PR 25000 at commit [`bf78d73`](https://github.com/apache/spark/commit/bf78d73c0b5174a98cfe7ef8bf7483c24891fbca).
[GitHub] [spark] imback82 commented on a change in pull request #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base
imback82 commented on a change in pull request #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base
URL: https://github.com/apache/spark/pull/25090#discussion_r301885522
## File path: sql/core/src/test/resources/sql-tests/inputs/udf/udf-except-all.sql ##
@@ -0,0 +1,164 @@
+-- This test file was converted from except-all.sql.
+-- Note that currently registered UDF returns a string. So there are some differences, for instance
+-- in string cast within UDF in Scala and Python.
+
+CREATE TEMPORARY VIEW tab1 AS SELECT * FROM VALUES
+(0), (1), (2), (2), (2), (2), (3), (null), (null) AS tab1(c1);
+CREATE TEMPORARY VIEW tab2 AS SELECT * FROM VALUES
+(1), (2), (2), (3), (5), (5), (null) AS tab2(c1);
+CREATE TEMPORARY VIEW tab3 AS SELECT * FROM VALUES
+(1, 2),
+(1, 2),
+(1, 3),
+(2, 3),
+(2, 2)
+AS tab3(k, v);
+CREATE TEMPORARY VIEW tab4 AS SELECT * FROM VALUES
+(1, 2),
+(2, 3),
+(2, 2),
+(2, 2),
+(2, 20)
+AS tab4(k, v);
+
+-- Basic EXCEPT ALL
+SELECT * FROM tab1
+EXCEPT ALL
+SELECT * FROM tab2;
+
+-- MINUS ALL (synonym for EXCEPT)
+SELECT * FROM tab1
+MINUS ALL
+SELECT * FROM tab2;
+
+-- EXCEPT ALL same table in both branches
+SELECT * FROM tab1
+EXCEPT ALL
+SELECT * FROM tab2 WHERE udf(c1) IS NOT NULL;
+
+-- Empty left relation
+SELECT * FROM tab1 WHERE udf(c1) > 5
+EXCEPT ALL
+SELECT * FROM tab2;
+
+-- Empty right relation
+SELECT * FROM tab1
+EXCEPT ALL
+SELECT * FROM tab2 WHERE c1 > udf(6);
+
+-- Type Coerced ExceptAll
+SELECT * FROM tab1
+EXCEPT ALL
+SELECT CAST(udf(1) AS BIGINT);
+
+-- Error as types of two side are not compatible
+SELECT * FROM tab1
+EXCEPT ALL
+SELECT array(1);
+
+-- Basic
+SELECT * FROM tab3
+EXCEPT ALL
+SELECT * FROM tab4;
+
+-- Basic
+SELECT * FROM tab4
+EXCEPT ALL
+SELECT * FROM tab3;
+
+-- EXCEPT ALL + INTERSECT
+SELECT * FROM tab4
+EXCEPT ALL
+SELECT * FROM tab3
+INTERSECT DISTINCT
+SELECT * FROM tab4;
+
+-- EXCEPT ALL + EXCEPT
+SELECT * FROM tab4
+EXCEPT ALL
+SELECT * FROM tab3
+EXCEPT DISTINCT
+SELECT * FROM tab4;
+
+-- Chain of set operations
+SELECT * FROM tab3
+EXCEPT ALL
+SELECT * FROM tab4
+UNION ALL
+SELECT * FROM tab3
+EXCEPT DISTINCT
+SELECT * FROM tab4;
+
+-- Mismatch on number of columns across both branches
+SELECT k FROM tab3
+EXCEPT ALL
+SELECT k, v FROM tab4;
+
+-- Chain of set operations
+SELECT * FROM tab3
+EXCEPT ALL
+SELECT * FROM tab4
+UNION
+SELECT * FROM tab3
+EXCEPT DISTINCT
+SELECT * FROM tab4;
+
+-- Using MINUS ALL
+SELECT * FROM tab3
+MINUS ALL
+SELECT * FROM tab4
+UNION
+SELECT * FROM tab3
+MINUS DISTINCT
+SELECT * FROM tab4;
+
+-- Chain of set operations
+SELECT * FROM tab3
+EXCEPT ALL
+SELECT * FROM tab4
+EXCEPT DISTINCT
+SELECT * FROM tab3
+EXCEPT DISTINCT
+SELECT * FROM tab4;
+
+-- Join under except all. Should produce empty resultset since both left and right sets
+-- are same.
+SELECT *
+FROM (SELECT udf(tab3.k),
+ udf(tab4.v)
+FROM tab3
+ JOIN tab4
+ ON tab3.k = tab4.k)
Review comment: Cool. I will test it again once your changes are merged.
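As background for the `EXCEPT ALL` cases quoted above: unlike `EXCEPT DISTINCT`, `EXCEPT ALL` is a multiset difference, so a row that appears m times on the left and n times on the right survives max(m - n, 0) times, and NULLs are treated as equal to each other for this purpose. The sketch below illustrates that rule on the `tab1` and `tab2` values from the quoted file. `ExceptAllSketch` and `exceptAll` are hypothetical names, and this is plain Java over lists, not how Spark plans or executes the operator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of EXCEPT ALL as a multiset difference; not Spark code. */
final class ExceptAllSketch {
  static <T> List<T> exceptAll(List<T> left, List<T> right) {
    // Count how many copies of each right-hand row are available to cancel left-hand rows.
    // HashMap accepts a null key, which models NULLs matching each other in set operations.
    Map<T, Integer> toCancel = new HashMap<>();
    for (T row : right) {
      toCancel.merge(row, 1, Integer::sum);
    }
    List<T> out = new ArrayList<>();
    for (T row : left) {
      Integer remaining = toCancel.get(row);
      if (remaining != null && remaining > 0) {
        toCancel.put(row, remaining - 1); // one right-hand copy cancels this left-hand row
      } else {
        out.add(row); // no right-hand copy left, so the row survives
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Integer> tab1 = Arrays.asList(0, 1, 2, 2, 2, 2, 3, null, null);
    List<Integer> tab2 = Arrays.asList(1, 2, 2, 3, 5, 5, null);
    System.out.println(exceptAll(tab1, tab2)); // prints [0, 2, 2, null]
  }
}
```

SQL does not guarantee row order for set operations, so only the multiset of surviving rows is meaningful here, not the order the sketch happens to print.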
[GitHub] [spark] AmplabJenkins removed a comment on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls])
AmplabJenkins removed a comment on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls]) URL: https://github.com/apache/spark/pull/25082#issuecomment-509907003 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls])
AmplabJenkins removed a comment on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls]) URL: https://github.com/apache/spark/pull/25082#issuecomment-509907008 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12570/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls])
AmplabJenkins commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls]) URL: https://github.com/apache/spark/pull/25082#issuecomment-509907003 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls])
AmplabJenkins commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls]) URL: https://github.com/apache/spark/pull/25082#issuecomment-509907008 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12570/ Test PASSed.
[GitHub] [spark] dongjoon-hyun edited a comment on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
dongjoon-hyun edited a comment on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND' URL: https://github.com/apache/spark/pull/25000#issuecomment-509906663 Hi, @lipzhu . I made a PR to your branch. Could you review and merge that? - https://github.com/lipzhu/spark/pull/3 It's just for minimizing and simplifying the diff of the patch. I checked that all newly added UT passed.
[GitHub] [spark] wangyum commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls])
wangyum commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls]) URL: https://github.com/apache/spark/pull/25082#issuecomment-509906866 retest this please
[GitHub] [spark] dongjoon-hyun commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND'
dongjoon-hyun commented on issue #25000: [SPARK-28107][SQL] Support 'DAY TO (HOUR|MINUTE|SECOND)', 'HOUR TO (MINUTE|SECOND)' and 'MINUTE TO SECOND' URL: https://github.com/apache/spark/pull/25000#issuecomment-509906663 Hi, @lipzhu . I made a PR to your branch. Could you review and merge that? - https://github.com/lipzhu/spark/pull/3 It's just for minimizing and simplifying the diff of the patch.
[GitHub] [spark] viirya commented on a change in pull request #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base
viirya commented on a change in pull request #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base
URL: https://github.com/apache/spark/pull/25090#discussion_r301884760
## File path: sql/core/src/test/resources/sql-tests/inputs/udf/udf-except-all.sql ##
@@ -0,0 +1,164 @@
+-- This test file was converted from except-all.sql.
+-- Note that currently registered UDF returns a string. So there are some differences, for instance
+-- in string cast within UDF in Scala and Python.
+
+CREATE TEMPORARY VIEW tab1 AS SELECT * FROM VALUES
+(0), (1), (2), (2), (2), (2), (3), (null), (null) AS tab1(c1);
+CREATE TEMPORARY VIEW tab2 AS SELECT * FROM VALUES
+(1), (2), (2), (3), (5), (5), (null) AS tab2(c1);
+CREATE TEMPORARY VIEW tab3 AS SELECT * FROM VALUES
+(1, 2),
+(1, 2),
+(1, 3),
+(2, 3),
+(2, 2)
+AS tab3(k, v);
+CREATE TEMPORARY VIEW tab4 AS SELECT * FROM VALUES
+(1, 2),
+(2, 3),
+(2, 2),
+(2, 2),
+(2, 20)
+AS tab4(k, v);
+
+-- Basic EXCEPT ALL
+SELECT * FROM tab1
+EXCEPT ALL
+SELECT * FROM tab2;
+
+-- MINUS ALL (synonym for EXCEPT)
+SELECT * FROM tab1
+MINUS ALL
+SELECT * FROM tab2;
+
+-- EXCEPT ALL same table in both branches
+SELECT * FROM tab1
+EXCEPT ALL
+SELECT * FROM tab2 WHERE udf(c1) IS NOT NULL;
+
+-- Empty left relation
+SELECT * FROM tab1 WHERE udf(c1) > 5
+EXCEPT ALL
+SELECT * FROM tab2;
+
+-- Empty right relation
+SELECT * FROM tab1
+EXCEPT ALL
+SELECT * FROM tab2 WHERE c1 > udf(6);
+
+-- Type Coerced ExceptAll
+SELECT * FROM tab1
+EXCEPT ALL
+SELECT CAST(udf(1) AS BIGINT);
+
+-- Error as types of two side are not compatible
+SELECT * FROM tab1
+EXCEPT ALL
+SELECT array(1);
+
+-- Basic
+SELECT * FROM tab3
+EXCEPT ALL
+SELECT * FROM tab4;
+
+-- Basic
+SELECT * FROM tab4
+EXCEPT ALL
+SELECT * FROM tab3;
+
+-- EXCEPT ALL + INTERSECT
+SELECT * FROM tab4
+EXCEPT ALL
+SELECT * FROM tab3
+INTERSECT DISTINCT
+SELECT * FROM tab4;
+
+-- EXCEPT ALL + EXCEPT
+SELECT * FROM tab4
+EXCEPT ALL
+SELECT * FROM tab3
+EXCEPT DISTINCT
+SELECT * FROM tab4;
+
+-- Chain of set operations
+SELECT * FROM tab3
+EXCEPT ALL
+SELECT * FROM tab4
+UNION ALL
+SELECT * FROM tab3
+EXCEPT DISTINCT
+SELECT * FROM tab4;
+
+-- Mismatch on number of columns across both branches
+SELECT k FROM tab3
+EXCEPT ALL
+SELECT k, v FROM tab4;
+
+-- Chain of set operations
+SELECT * FROM tab3
+EXCEPT ALL
+SELECT * FROM tab4
+UNION
+SELECT * FROM tab3
+EXCEPT DISTINCT
+SELECT * FROM tab4;
+
+-- Using MINUS ALL
+SELECT * FROM tab3
+MINUS ALL
+SELECT * FROM tab4
+UNION
+SELECT * FROM tab3
+MINUS DISTINCT
+SELECT * FROM tab4;
+
+-- Chain of set operations
+SELECT * FROM tab3
+EXCEPT ALL
+SELECT * FROM tab4
+EXCEPT DISTINCT
+SELECT * FROM tab3
+EXCEPT DISTINCT
+SELECT * FROM tab4;
+
+-- Join under except all. Should produce empty resultset since both left and right sets
+-- are same.
+SELECT *
+FROM (SELECT udf(tab3.k),
+ udf(tab4.v)
+FROM tab3
+ JOIN tab4
+ ON tab3.k = tab4.k)
Review comment: Er, I think this is the bug I am fixing now. Please see #25091.
[GitHub] [spark] AmplabJenkins commented on issue #25000: [SPARK-28107][SQL] Support 'day to hour', 'day to minute', ‘day to second’, 'hour to minute', 'hour to second' and 'minute to second'
AmplabJenkins commented on issue #25000: [SPARK-28107][SQL] Support 'day to hour', 'day to minute', ‘day to second’, 'hour to minute', 'hour to second' and 'minute to second' URL: https://github.com/apache/spark/pull/25000#issuecomment-509905344 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25000: [SPARK-28107][SQL] Support 'day to hour', 'day to minute', ‘day to second’, 'hour to minute', 'hour to second' and 'minute to second'
AmplabJenkins removed a comment on issue #25000: [SPARK-28107][SQL] Support 'day to hour', 'day to minute', ‘day to second’, 'hour to minute', 'hour to second' and 'minute to second' URL: https://github.com/apache/spark/pull/25000#issuecomment-509905349 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107424/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25000: [SPARK-28107][SQL] Support 'day to hour', 'day to minute', ‘day to second’, 'hour to minute', 'hour to second' and 'minute to second'
AmplabJenkins removed a comment on issue #25000: [SPARK-28107][SQL] Support 'day to hour', 'day to minute', ‘day to second’, 'hour to minute', 'hour to second' and 'minute to second' URL: https://github.com/apache/spark/pull/25000#issuecomment-509905344 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #25000: [SPARK-28107][SQL] Support 'day to hour', 'day to minute', ‘day to second’, 'hour to minute', 'hour to second' and 'minute to second'
AmplabJenkins commented on issue #25000: [SPARK-28107][SQL] Support 'day to hour', 'day to minute', ‘day to second’, 'hour to minute', 'hour to second' and 'minute to second' URL: https://github.com/apache/spark/pull/25000#issuecomment-509905349 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107424/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls])
AmplabJenkins removed a comment on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls]) URL: https://github.com/apache/spark/pull/25082#issuecomment-509904915 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107428/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on issue #25087: [SPARK-28234][CORE][PYTHON] Add python and JavaSparkContext support to get resources
HyukjinKwon commented on issue #25087: [SPARK-28234][CORE][PYTHON] Add python and JavaSparkContext support to get resources URL: https://github.com/apache/spark/pull/25087#issuecomment-509905171 Yea, looks fine to me in general too. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #25000: [SPARK-28107][SQL] Support 'day to hour', 'day to minute', ‘day to second’, 'hour to minute', 'hour to second' and 'minute to second'
SparkQA commented on issue #25000: [SPARK-28107][SQL] Support 'day to hour', 'day to minute', ‘day to second’, 'hour to minute', 'hour to second' and 'minute to second' URL: https://github.com/apache/spark/pull/25000#issuecomment-509904983 **[Test build #107424 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107424/testReport)** for PR 25000 at commit [`d68ea26`](https://github.com/apache/spark/commit/d68ea26a52ef5623ce196b62b208518eb275381a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls])
AmplabJenkins removed a comment on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls]) URL: https://github.com/apache/spark/pull/25082#issuecomment-509904908 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #25000: [SPARK-28107][SQL] Support 'day to hour', 'day to minute', ‘day to second’, 'hour to minute', 'hour to second' and 'minute to second'
SparkQA removed a comment on issue #25000: [SPARK-28107][SQL] Support 'day to hour', 'day to minute', ‘day to second’, 'hour to minute', 'hour to second' and 'minute to second' URL: https://github.com/apache/spark/pull/25000#issuecomment-509883133 **[Test build #107424 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107424/testReport)** for PR 25000 at commit [`d68ea26`](https://github.com/apache/spark/commit/d68ea26a52ef5623ce196b62b208518eb275381a). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25087: [SPARK-28234][CORE][PYTHON] Add python and JavaSparkContext support to get resources
HyukjinKwon commented on a change in pull request #25087: [SPARK-28234][CORE][PYTHON] Add python and JavaSparkContext support to get resources URL: https://github.com/apache/spark/pull/25087#discussion_r301883872 ## File path: python/pyspark/resourceinformation.py ## @@ -0,0 +1,44 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + + +class ResourceInformation(object): + +""" +.. note:: Evolving + +Class to hold information about a type of Resource. A resource could be a GPU, FPGA, etc. +The array of addresses are resource specific and its up to the user to interpret the address. + +One example is GPUs, where the addresses would be the indices of the GPUs + +@param name the name of the resource +@param addresses an array of strings describing the addresses of the resource +""" + +_name = None +_addresses = None + +def __init__(self, name, addresses): +self._name = name +self._addresses = addresses + +def name(self): Review comment: +1. For addresses too to be consistent with Scala side. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [spark] AmplabJenkins commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls])
AmplabJenkins commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls]) URL: https://github.com/apache/spark/pull/25082#issuecomment-509904908 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls])
AmplabJenkins commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls]) URL: https://github.com/apache/spark/pull/25082#issuecomment-509904915 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107428/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls])
SparkQA commented on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls]) URL: https://github.com/apache/spark/pull/25082#issuecomment-509904811 **[Test build #107428 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107428/testReport)** for PR 25082 at commit [`b3787f5`](https://github.com/apache/spark/commit/b3787f5cebc1c86c1f82a16950f6aef98791fffd). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls])
SparkQA removed a comment on issue #25082: [SPARK-28310][SQL] Support ANSI SQL grammar:first_value/last_value(expression, [ignore/respect nulls]) URL: https://github.com/apache/spark/pull/25082#issuecomment-509889993 **[Test build #107428 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107428/testReport)** for PR 25082 at commit [`b3787f5`](https://github.com/apache/spark/commit/b3787f5cebc1c86c1f82a16950f6aef98791fffd). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] BryanCutler commented on a change in pull request #25087: [SPARK-28234][CORE][PYTHON] Add python and JavaSparkContext support to get resources
BryanCutler commented on a change in pull request #25087: [SPARK-28234][CORE][PYTHON] Add python and JavaSparkContext support to get resources URL: https://github.com/apache/spark/pull/25087#discussion_r301881588 ## File path: python/pyspark/context.py ## @@ -1105,6 +1106,18 @@ def getConf(self): conf.setAll(self._conf.getAll()) return conf +def resources(self): +resources = {} +jresources = self._jsc.resources() +for x in jresources: +name = jresources[x].name() +jaddresses = jresources[x].addresses() +addrs = [] +for addr in jaddresses: Review comment: a comprehension would be a little cleaner here, e.g. `addrs = [addr for addr in jaddresses]` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
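The comprehension suggested in this comment could look roughly like the sketch below. It does not use Py4J: `FakeJavaResource` and `collect_resources` are hypothetical stand-ins for the JVM-side objects and for the `resources` method in the quoted diff, and because the diff is cut off before the loop body, storing a plain list of addresses per resource name is an assumption.

```python
class FakeJavaResource:
    """Hypothetical stand-in for the JVM object reached through self._jsc.resources()."""

    def __init__(self, name, addresses):
        self._name = name
        self._addresses = addresses

    def name(self):
        return self._name

    def addresses(self):
        return self._addresses


def collect_resources(jresources):
    """Copy resource names and addresses into plain Python containers."""
    resources = {}
    for key in jresources:
        jresource = jresources[key]
        # The comprehension replaces the explicit `addrs = []` / `for addr in ...` loop.
        addrs = [addr for addr in jresource.addresses()]
        resources[jresource.name()] = addrs
    return resources


example = collect_resources({"gpu": FakeJavaResource("gpu", ("0", "1"))})
```

When no per-element transformation is needed, `list(jresource.addresses())` would be an equivalent spelling.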
[GitHub] [spark] BryanCutler commented on a change in pull request #25087: [SPARK-28234][CORE][PYTHON] Add python and JavaSparkContext support to get resources
BryanCutler commented on a change in pull request #25087: [SPARK-28234][CORE][PYTHON] Add python and JavaSparkContext support to get resources URL: https://github.com/apache/spark/pull/25087#discussion_r301881838 ## File path: python/pyspark/resourceinformation.py ## @@ -0,0 +1,44 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + + +class ResourceInformation(object): + +""" +.. note:: Evolving + +Class to hold information about a type of Resource. A resource could be a GPU, FPGA, etc. +The array of addresses are resource specific and its up to the user to interpret the address. + +One example is GPUs, where the addresses would be the indices of the GPUs + +@param name the name of the resource +@param addresses an array of strings describing the addresses of the resource +""" + +_name = None +_addresses = None + +def __init__(self, name, addresses): +self._name = name +self._addresses = addresses + +def name(self): Review comment: maybe use the `@property` decorator for these? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
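A minimal sketch of what the `@property` suggestion might look like, assuming the class keeps the `_name` and `_addresses` attributes from the quoted diff. This is an illustration of the review comment, not the code that was eventually merged.

```python
class ResourceInformation(object):
    """Holds the name of a resource type and its addresses (sketch only)."""

    def __init__(self, name, addresses):
        self._name = name
        self._addresses = addresses

    @property
    def name(self):
        # Read-only attribute access: info.name instead of info.name()
        return self._name

    @property
    def addresses(self):
        # Read-only attribute access: info.addresses instead of info.addresses()
        return self._addresses


gpu = ResourceInformation("gpu", ["0", "1"])
```

With properties, callers write `gpu.name` and `gpu.addresses`, and an accidental assignment such as `gpu.name = "fpga"` raises `AttributeError` because no setter is defined.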
[GitHub] [spark] AmplabJenkins removed a comment on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base
AmplabJenkins removed a comment on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base URL: https://github.com/apache/spark/pull/25090#issuecomment-509902718 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] imback82 commented on a change in pull request #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base
imback82 commented on a change in pull request #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base URL: https://github.com/apache/spark/pull/25090#discussion_r301882087 ## File path: sql/core/src/test/resources/sql-tests/inputs/udf/udf-except-all.sql ## @@ -0,0 +1,164 @@ +-- This test file was converted from except-all.sql. +-- Note that currently registered UDF returns a string. So there are some differences, for instance +-- in string cast within UDF in Scala and Python. + +CREATE TEMPORARY VIEW tab1 AS SELECT * FROM VALUES +(0), (1), (2), (2), (2), (2), (3), (null), (null) AS tab1(c1); +CREATE TEMPORARY VIEW tab2 AS SELECT * FROM VALUES +(1), (2), (2), (3), (5), (5), (null) AS tab2(c1); +CREATE TEMPORARY VIEW tab3 AS SELECT * FROM VALUES +(1, 2), +(1, 2), +(1, 3), +(2, 3), +(2, 2) +AS tab3(k, v); +CREATE TEMPORARY VIEW tab4 AS SELECT * FROM VALUES +(1, 2), +(2, 3), +(2, 2), +(2, 2), +(2, 20) +AS tab4(k, v); + +-- Basic EXCEPT ALL +SELECT * FROM tab1 +EXCEPT ALL +SELECT * FROM tab2; + +-- MINUS ALL (synonym for EXCEPT) +SELECT * FROM tab1 +MINUS ALL +SELECT * FROM tab2; + +-- EXCEPT ALL same table in both branches +SELECT * FROM tab1 +EXCEPT ALL +SELECT * FROM tab2 WHERE udf(c1) IS NOT NULL; + +-- Empty left relation +SELECT * FROM tab1 WHERE udf(c1) > 5 +EXCEPT ALL +SELECT * FROM tab2; + +-- Empty right relation +SELECT * FROM tab1 +EXCEPT ALL +SELECT * FROM tab2 WHERE c1 > udf(6); + +-- Type Coerced ExceptAll +SELECT * FROM tab1 +EXCEPT ALL +SELECT CAST(udf(1) AS BIGINT); + +-- Error as types of two side are not compatible +SELECT * FROM tab1 +EXCEPT ALL +SELECT array(1); + +-- Basic +SELECT * FROM tab3 +EXCEPT ALL +SELECT * FROM tab4; + +-- Basic +SELECT * FROM tab4 +EXCEPT ALL +SELECT * FROM tab3; + +-- EXCEPT ALL + INTERSECT +SELECT * FROM tab4 +EXCEPT ALL +SELECT * FROM tab3 +INTERSECT DISTINCT +SELECT * FROM tab4; + +-- EXCEPT ALL + EXCEPT +SELECT * FROM tab4 +EXCEPT ALL +SELECT * FROM tab3 +EXCEPT DISTINCT 
+SELECT * FROM tab4; + +-- Chain of set operations +SELECT * FROM tab3 +EXCEPT ALL +SELECT * FROM tab4 +UNION ALL +SELECT * FROM tab3 +EXCEPT DISTINCT +SELECT * FROM tab4; + +-- Mismatch on number of columns across both branches +SELECT k FROM tab3 +EXCEPT ALL +SELECT k, v FROM tab4; + +-- Chain of set operations +SELECT * FROM tab3 +EXCEPT ALL +SELECT * FROM tab4 +UNION +SELECT * FROM tab3 +EXCEPT DISTINCT +SELECT * FROM tab4; + +-- Using MINUS ALL +SELECT * FROM tab3 +MINUS ALL +SELECT * FROM tab4 +UNION +SELECT * FROM tab3 +MINUS DISTINCT +SELECT * FROM tab4; + +-- Chain of set operations +SELECT * FROM tab3 +EXCEPT ALL +SELECT * FROM tab4 +EXCEPT DISTINCT +SELECT * FROM tab3 +EXCEPT DISTINCT +SELECT * FROM tab4; + +-- Join under except all. Should produce empty resultset since both left and right sets +-- are same. +SELECT * +FROM (SELECT udf(tab3.k), + udf(tab4.v) +FROM tab3 + JOIN tab4 + ON tab3.k = tab4.k) Review comment: OK. I will update this thread after I create a JIRA. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base
AmplabJenkins removed a comment on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base URL: https://github.com/apache/spark/pull/25090#issuecomment-509902724 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12569/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base
AmplabJenkins commented on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base URL: https://github.com/apache/spark/pull/25090#issuecomment-509902718 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base
HyukjinKwon commented on a change in pull request #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base URL: https://github.com/apache/spark/pull/25089#discussion_r301881829 ## File path: sql/core/src/test/resources/sql-tests/inputs/udf/udf-count.sql ## @@ -0,0 +1,28 @@ +-- This test file was converted from count.sql +-- Test data. +CREATE OR REPLACE TEMPORARY VIEW testData AS SELECT * FROM VALUES +(1, 1), (1, 2), (2, 1), (1, 1), (null, 2), (1, null), (null, null) +AS testData(a, b); + +-- count with single expression +SELECT + udf(count(*)), udf(count(1)), udf(count(null)), udf(count(a)), udf(count(b)), udf(count(a + b)), udf(count((a, b))) Review comment: @vinodkc, can you make some other combinations like `udf(count(*))`, `count(udf(a))`, `udf(count(udf(a)))`?
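To make the requested combinations concrete, the sketch below models them in plain Python over the `testData` rows from the quoted file. It is not Spark code: `udf`, `count` and `add` are hypothetical helpers, and the assumption that the test UDF casts its input to a string while passing NULL through is based only on the note in the sibling `udf-except-all.sql` file that the registered UDF returns a string.

```python
# Rows copied from the testData view; SQL NULL is modelled as None.
rows = [(1, 1), (1, 2), (2, 1), (1, 1), (None, 2), (1, None), (None, None)]


def udf(value):
    """Assumed test UDF: casts to string and leaves NULL as NULL."""
    return None if value is None else str(value)


def count(values):
    """SQL count(expr): counts only the non-NULL values."""
    return sum(1 for v in values if v is not None)


def add(a, b):
    """SQL a + b: NULL if either operand is NULL."""
    return None if a is None or b is None else a + b


combos = {
    "udf(count(*))": udf(len(rows)),
    "count(udf(a))": count(udf(a) for a, _ in rows),
    "udf(count(udf(a)))": udf(count(udf(a) for a, _ in rows)),
    "udf(count(a + b))": udf(count(add(a, b) for a, b in rows)),
}
```

The point of mixing the nesting order is that wrapping the aggregate changes only the result type (a string under the assumption above), while wrapping the argument exercises the UDF once per input row.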
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base
HyukjinKwon commented on a change in pull request #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base URL: https://github.com/apache/spark/pull/25089#discussion_r301881829 ## File path: sql/core/src/test/resources/sql-tests/inputs/udf/udf-count.sql ## @@ -0,0 +1,28 @@ +-- This test file was converted from count.sql +-- Test data. +CREATE OR REPLACE TEMPORARY VIEW testData AS SELECT * FROM VALUES +(1, 1), (1, 2), (2, 1), (1, 1), (null, 2), (1, null), (null, null) +AS testData(a, b); + +-- count with single expression +SELECT + udf(count(*)), udf(count(1)), udf(count(null)), udf(count(a)), udf(count(b)), udf(count(a + b)), udf(count((a, b))) Review comment: @vinodkc, can you make some other combinations like `udf(count(*))`, `count(udf(a))`, `udf(count(udf(a)))`?
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base
HyukjinKwon commented on a change in pull request #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base URL: https://github.com/apache/spark/pull/25089#discussion_r301881895 ## File path: sql/core/src/test/resources/sql-tests/inputs/udf/udf-count.sql ## @@ -0,0 +1,28 @@ +-- This test file was converted from count.sql +-- Test data. +CREATE OR REPLACE TEMPORARY VIEW testData AS SELECT * FROM VALUES +(1, 1), (1, 2), (2, 1), (1, 1), (null, 2), (1, null), (null, null) +AS testData(a, b); + +-- count with single expression +SELECT + udf(count(*)), udf(count(1)), udf(count(null)), udf(count(a)), udf(count(b)), udf(count(a + b)), udf(count((a, b))) +FROM testData; + +-- distinct count with single expression +SELECT + udf(count(DISTINCT 1)), + udf(count(DISTINCT null)), + udf(count(DISTINCT a)), + udf(count(DISTINCT b)), + udf(count(DISTINCT (a + b))), + udf(count(DISTINCT (a, b))) Review comment: Here too :-) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base
AmplabJenkins commented on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base URL: https://github.com/apache/spark/pull/25090#issuecomment-509902724 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12569/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base
HyukjinKwon commented on issue #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base URL: https://github.com/apache/spark/pull/25089#issuecomment-509902744 Looks fine if there is no output diff compared to the original file.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base
HyukjinKwon commented on a change in pull request #25089: [SPARK-28275][SQL][PYTHON][TESTS] Convert and port 'count.sql' into UDF test base URL: https://github.com/apache/spark/pull/25089#discussion_r301881829 ## File path: sql/core/src/test/resources/sql-tests/inputs/udf/udf-count.sql ## @@ -0,0 +1,28 @@ +-- This test file was converted from count.sql +-- Test data. +CREATE OR REPLACE TEMPORARY VIEW testData AS SELECT * FROM VALUES +(1, 1), (1, 2), (2, 1), (1, 1), (null, 2), (1, null), (null, null) +AS testData(a, b); + +-- count with single expression +SELECT + udf(count(*)), udf(count(1)), udf(count(null)), udf(count(a)), udf(count(b)), udf(count(a + b)), udf(count((a, b))) Review comment: @vinodkc, can you make some other combinations like `udf(count(*))`, `count(udf(a))`?
[GitHub] [spark] viirya commented on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition
viirya commented on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition URL: https://github.com/apache/spark/pull/25091#issuecomment-509901842 cc @HyukjinKwon @BryanCutler This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition
AmplabJenkins removed a comment on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition URL: https://github.com/apache/spark/pull/25091#issuecomment-509901327 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition
SparkQA commented on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition URL: https://github.com/apache/spark/pull/25091#issuecomment-509901729 **[Test build #107433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107433/testReport)** for PR 25091 at commit [`95231a6`](https://github.com/apache/spark/commit/95231a6f79a5a6352491622aa5a767aca6571f66).
[GitHub] [spark] SparkQA commented on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base
SparkQA commented on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base URL: https://github.com/apache/spark/pull/25090#issuecomment-509901736 **[Test build #107434 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107434/testReport)** for PR 25090 at commit [`a512ef8`](https://github.com/apache/spark/commit/a512ef8c29b01568fed3f1ad80335e568f31).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition
AmplabJenkins removed a comment on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition URL: https://github.com/apache/spark/pull/25091#issuecomment-509901330 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12568/ Test PASSed.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base
HyukjinKwon commented on a change in pull request #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base URL: https://github.com/apache/spark/pull/25090#discussion_r301880585 ## File path: sql/core/src/test/resources/sql-tests/inputs/udf/udf-except-all.sql ## @@ -0,0 +1,164 @@ +-- This test file was converted from except-all.sql. +-- Note that currently registered UDF returns a string. So there are some differences, for instance +-- in string cast within UDF in Scala and Python. + +CREATE TEMPORARY VIEW tab1 AS SELECT * FROM VALUES +(0), (1), (2), (2), (2), (2), (3), (null), (null) AS tab1(c1); +CREATE TEMPORARY VIEW tab2 AS SELECT * FROM VALUES +(1), (2), (2), (3), (5), (5), (null) AS tab2(c1); +CREATE TEMPORARY VIEW tab3 AS SELECT * FROM VALUES +(1, 2), +(1, 2), +(1, 3), +(2, 3), +(2, 2) +AS tab3(k, v); +CREATE TEMPORARY VIEW tab4 AS SELECT * FROM VALUES +(1, 2), +(2, 3), +(2, 2), +(2, 2), +(2, 20) +AS tab4(k, v); + +-- Basic EXCEPT ALL +SELECT * FROM tab1 +EXCEPT ALL +SELECT * FROM tab2; + +-- MINUS ALL (synonym for EXCEPT) +SELECT * FROM tab1 +MINUS ALL +SELECT * FROM tab2; + +-- EXCEPT ALL same table in both branches +SELECT * FROM tab1 +EXCEPT ALL +SELECT * FROM tab2 WHERE udf(c1) IS NOT NULL; + +-- Empty left relation +SELECT * FROM tab1 WHERE udf(c1) > 5 +EXCEPT ALL +SELECT * FROM tab2; + +-- Empty right relation +SELECT * FROM tab1 +EXCEPT ALL +SELECT * FROM tab2 WHERE c1 > udf(6); + +-- Type Coerced ExceptAll +SELECT * FROM tab1 +EXCEPT ALL +SELECT CAST(udf(1) AS BIGINT); + +-- Error as types of two side are not compatible +SELECT * FROM tab1 +EXCEPT ALL +SELECT array(1); + +-- Basic +SELECT * FROM tab3 +EXCEPT ALL +SELECT * FROM tab4; + +-- Basic +SELECT * FROM tab4 +EXCEPT ALL +SELECT * FROM tab3; + +-- EXCEPT ALL + INTERSECT +SELECT * FROM tab4 +EXCEPT ALL +SELECT * FROM tab3 +INTERSECT DISTINCT +SELECT * FROM tab4; + +-- EXCEPT ALL + EXCEPT +SELECT * FROM tab4 +EXCEPT ALL +SELECT * FROM tab3 +EXCEPT DISTINCT 
+SELECT * FROM tab4; + +-- Chain of set operations +SELECT * FROM tab3 +EXCEPT ALL +SELECT * FROM tab4 +UNION ALL +SELECT * FROM tab3 +EXCEPT DISTINCT +SELECT * FROM tab4; + +-- Mismatch on number of columns across both branches +SELECT k FROM tab3 +EXCEPT ALL +SELECT k, v FROM tab4; + +-- Chain of set operations +SELECT * FROM tab3 +EXCEPT ALL +SELECT * FROM tab4 +UNION +SELECT * FROM tab3 +EXCEPT DISTINCT +SELECT * FROM tab4; + +-- Using MINUS ALL +SELECT * FROM tab3 +MINUS ALL +SELECT * FROM tab4 +UNION +SELECT * FROM tab3 +MINUS DISTINCT +SELECT * FROM tab4; + +-- Chain of set operations +SELECT * FROM tab3 +EXCEPT ALL +SELECT * FROM tab4 +EXCEPT DISTINCT +SELECT * FROM tab3 +EXCEPT DISTINCT +SELECT * FROM tab4; + +-- Join under except all. Should produce empty resultset since both left and right sets +-- are same. +SELECT * +FROM (SELECT udf(tab3.k), + udf(tab4.v) +FROM tab3 + JOIN tab4 + ON tab3.k = tab4.k)

Review comment: Hm, this sounds like it may be a bug. Let's create a JIRA. BTW, if possible, it would be better to create the JIRA with a minimised, narrowed-down reproducer in its description; that is actually one of the key points of why we're doing this :D. For instance,

```python
from pyspark.sql.functions import pandas_udf, PandasUDFType

@pandas_udf("string", PandasUDFType.SCALAR)
def noop(x):
    return x + 1

spark.udf.register("udf", noop)
spark.sql("CREATE TEMPORARY VIEW ...")
spark.sql("...udf(...)...").show()
```

_If possible_, it might be even better to reproduce it via the Python native APIs as well, since it would very likely be reproducible with the Python API itself. For instance,

```python
from pyspark.sql.functions import pandas_udf, PandasUDFType

@pandas_udf("string", PandasUDFType.SCALAR)
def noop(x):
    return x + 1

df1 = ...
df2 = ...
df1.join(df2 ...).show()
```

It might be even better to show that it works in the Scala API, with a minimised reproducer in the JIRA description. That will let other contributors and committers focus on the bug fix itself.
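For readers skimming the test file quoted above, the difference between `EXCEPT ALL` and `EXCEPT DISTINCT` can be sketched in plain Python using the `tab1` and `tab2` rows from the diff. This is only an illustration of the semantics, not how Spark implements the operators; `except_all` and `except_distinct` are hypothetical helper names, and `None` stands in for SQL NULL (set operations treat NULLs as equal to each other).

```python
from collections import Counter

# Rows of the single-column views tab1 and tab2 from the test file.
tab1 = [0, 1, 2, 2, 2, 2, 3, None, None]
tab2 = [1, 2, 2, 3, 5, 5, None]


def except_all(left, right):
    """Multiset difference: each right-hand row cancels at most one matching left-hand row."""
    # Counter subtraction keeps only values whose remaining count is positive.
    remaining = Counter(left) - Counter(right)
    return list(remaining.elements())


def except_distinct(left, right):
    """Set difference: a value present on the right is removed entirely; duplicates collapse."""
    right_values = set(right)
    # dict.fromkeys de-duplicates the left side while preserving first-seen order.
    return [v for v in dict.fromkeys(left) if v not in right_values]


print(except_all(tab1, tab2))
print(except_distinct(tab1, tab2))
```

With these inputs, `EXCEPT ALL` keeps the surplus copies of `2` and one NULL alongside `0`, whereas `EXCEPT DISTINCT` keeps only `0`.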
[GitHub] [spark] AmplabJenkins commented on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition
AmplabJenkins commented on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition URL: https://github.com/apache/spark/pull/25091#issuecomment-509901330 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12568/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition
AmplabJenkins commented on issue #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition URL: https://github.com/apache/spark/pull/25091#issuecomment-509901327 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base
AmplabJenkins removed a comment on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base URL: https://github.com/apache/spark/pull/25090#issuecomment-509892237 Can one of the admins verify this patch?
[GitHub] [spark] HyukjinKwon commented on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base
HyukjinKwon commented on issue #25090: [SPARK-28278][SQL][PYTHON][TESTS] Convert and port 'except-all.sql' into UDF test base URL: https://github.com/apache/spark/pull/25090#issuecomment-509901277 add to whitelist
[GitHub] [spark] viirya opened a new pull request #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition
viirya opened a new pull request #25091: [SPARK-28323][SQL][Python] PythonUDF should be able to use in join condition URL: https://github.com/apache/spark/pull/25091 ## What changes were proposed in this pull request? There is a bug in `ExtractPythonUDFs` that produces wrong result attributes. It causes a failure when `PythonUDF`s are used across multiple child plans, e.g., in a join. One example is using a `PythonUDF` in a join condition. ## How was this patch tested? Added test.