[GitHub] [spark] gengliangwang commented on a change in pull request #28690: [SPARK-31882][WEBUI] DAG-viz is not rendered correctly with pagination
gengliangwang commented on a change in pull request #28690: URL: https://github.com/apache/spark/pull/28690#discussion_r433641562 ## File path: core/src/main/scala/org/apache/spark/ui/UIUtils.scala ## @@ -292,6 +292,7 @@ private[spark] object UIUtils extends Logging { {commonHeaderNodes(request)} +setAppBasePath('{activeTab.basePath}') Review comment: I see. Let's keep it this way. The method `basicSparkPage` also calls the function `commonHeaderNodes` but there is no `activeTab` in it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource
AmplabJenkins commented on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-637303297
[GitHub] [spark] AmplabJenkins removed a comment on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource
AmplabJenkins removed a comment on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-637303297
[GitHub] [spark] gatorsmile commented on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements
gatorsmile commented on pull request #27246: URL: https://github.com/apache/spark/pull/27246#issuecomment-637319803 This PR has some legit test failures. Could you fix them?
[GitHub] [spark] gatorsmile commented on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements
gatorsmile commented on pull request #27246: URL: https://github.com/apache/spark/pull/27246#issuecomment-637320181 cc @Ngone51 @jiangxb1987 Can you take a look at this?
[GitHub] [spark] igreenfield commented on pull request #28629: [SPARK-31769] Add MDC support for driver threads
igreenfield commented on pull request #28629: URL: https://github.com/apache/spark/pull/28629#issuecomment-637299078 @cloud-fan I think you are right, it can't be. So maybe we don't need to add this as the default.
[GitHub] [spark] gengliangwang commented on pull request #28690: [SPARK-31882][WEBUI] DAG-viz is not rendered correctly with pagination
gengliangwang commented on pull request #28690: URL: https://github.com/apache/spark/pull/28690#issuecomment-637301679 @sarutak Does the bug exist in branch 3.0?
[GitHub] [spark] sarutak commented on pull request #28690: [SPARK-31882][WEBUI] DAG-viz is not rendered correctly with pagination
sarutak commented on pull request #28690: URL: https://github.com/apache/spark/pull/28690#issuecomment-637311750 @gengliangwang I've confirmed this bug exists in branch 3.0 too.
[GitHub] [spark] gengliangwang commented on pull request #28690: [SPARK-31882][WEBUI] DAG-viz is not rendered correctly with pagination
gengliangwang commented on pull request #28690: URL: https://github.com/apache/spark/pull/28690#issuecomment-637312344 Then, let's merge to both master and 3.0 :)
[GitHub] [spark] gatorsmile commented on pull request #27843: [SPARK-31029] Avoid using global execution context in driver main thread for YarnSchedulerBackend
gatorsmile commented on pull request #27843: URL: https://github.com/apache/spark/pull/27843#issuecomment-637323399 cc @Ngone51 @tgravescs
[GitHub] [spark] SparkQA removed a comment on pull request #28697: [SPARK-29150][CORE] Update RDD API for Stage level scheduling to be public
SparkQA removed a comment on pull request #28697: URL: https://github.com/apache/spark/pull/28697#issuecomment-637253029 **[Test build #123416 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123416/testReport)** for PR 28697 at commit [`7c2e2b7`](https://github.com/apache/spark/commit/7c2e2b7bfecb760ba9e7fc571c4deb2fa89e3daa).
[GitHub] [spark] SparkQA commented on pull request #28697: [SPARK-29150][CORE] Update RDD API for Stage level scheduling to be public
SparkQA commented on pull request #28697: URL: https://github.com/apache/spark/pull/28697#issuecomment-637334376 **[Test build #123416 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123416/testReport)** for PR 28697 at commit [`7c2e2b7`](https://github.com/apache/spark/commit/7c2e2b7bfecb760ba9e7fc571c4deb2fa89e3daa). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] sarutak commented on a change in pull request #28690: [SPARK-31882][WEBUI] DAG-viz is not rendered correctly with pagination
sarutak commented on a change in pull request #28690: URL: https://github.com/apache/spark/pull/28690#discussion_r433638598 ## File path: core/src/main/scala/org/apache/spark/ui/UIUtils.scala ## @@ -292,6 +292,7 @@ private[spark] object UIUtils extends Logging { {commonHeaderNodes(request)} +setAppBasePath('{activeTab.basePath}') Review comment: At first, I considered the approach you mention, but to do so we would need an additional parameter in `commonHeaderNodes` to pass `activeTab` only for this purpose. Which do you think is better?
[GitHub] [spark] SparkQA removed a comment on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource
SparkQA removed a comment on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-637199584 **[Test build #123408 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123408/testReport)** for PR 27366 at commit [`8bfd599`](https://github.com/apache/spark/commit/8bfd599fabd7e2f221771848cedf075069cf7055).
[GitHub] [spark] SparkQA commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource
SparkQA commented on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-637302319 **[Test build #123408 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123408/testReport)** for PR 27366 at commit [`8bfd599`](https://github.com/apache/spark/commit/8bfd599fabd7e2f221771848cedf075069cf7055). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] xuanyuanking edited a comment on pull request #27627: [SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
xuanyuanking edited a comment on pull request #27627: URL: https://github.com/apache/spark/pull/27627#issuecomment-637271156 > How about we merge it to master only first, and wait for the schema incompatibility check to be done? Agree. > Just to avoid redundant efforts, have you look into #24173? If your approach is different than #24173, what approach you will be proposing? @HeartSaVioR Thanks for the reminder. I also looked into #24173 before. My approach checks the underlying unsafe row format instead of adding a new schema file to the checkpoint. This is driven by the requirement to detect format changes during migration, where the user has no chance to create a schema file. But I think our approaches can complement each other. Let's discuss in my newly created PR; I'll submit it late today.
[GitHub] [spark] sarutak commented on pull request #28690: [SPARK-31882][WEBUI] DAG-viz is not rendered correctly with pagination
sarutak commented on pull request #28690: URL: https://github.com/apache/spark/pull/28690#issuecomment-637315623 Yeah, but #28627 isn't merged to `branch-3.0`, so we need to backport it first or open another PR for `branch-3.0` to backport this change without the test case.
[GitHub] [spark] gatorsmile commented on pull request #27598: [SPARK-30845] Do not upload local pyspark archives for spark-submit on Yarn
gatorsmile commented on pull request #27598: URL: https://github.com/apache/spark/pull/27598#issuecomment-637326327 cc @HyukjinKwon @vanzin @tgravescs
[GitHub] [spark] AngersZhuuuu commented on pull request #28490: [SPARK-31670][SQL]Struct Field in groupByExpr with CUBE
AngersZhuuuu commented on pull request #28490: URL: https://github.com/apache/spark/pull/28490#issuecomment-637325327 cc @maropu Could you review this?
[GitHub] [spark] sarutak commented on pull request #28690: [SPARK-31882][WEBUI] DAG-viz is not rendered correctly with pagination
sarutak commented on pull request #28690: URL: https://github.com/apache/spark/pull/28690#issuecomment-637326056 Anyway, I'll open a PR for backporting #28627.
[GitHub] [spark] AngersZhuuuu removed a comment on pull request #28490: [SPARK-31670][SQL]Struct Field in groupByExpr with CUBE
AngersZhuuuu removed a comment on pull request #28490: URL: https://github.com/apache/spark/pull/28490#issuecomment-626340051 Gentle ping @cloud-fan. Another analysis bug; this is a small band-aid fix for now. I'd welcome suggestions, including a better place to fix this.
[GitHub] [spark] gengliangwang commented on pull request #28690: [SPARK-31882][WEBUI] DAG-viz is not rendered correctly with pagination
gengliangwang commented on pull request #28690: URL: https://github.com/apache/spark/pull/28690#issuecomment-637330492 Awesome, let's merge this one after #28627 is backported
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28654: [SPARK-31834][SQL]Improve error message for incompatible data types
AmplabJenkins removed a comment on pull request #28654: URL: https://github.com/apache/spark/pull/28654#issuecomment-637335562 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28697: [SPARK-29150][CORE] Update RDD API for Stage level scheduling to be public
AmplabJenkins removed a comment on pull request #28697: URL: https://github.com/apache/spark/pull/28697#issuecomment-637335627
[GitHub] [spark] AmplabJenkins commented on pull request #28699: [SPARK-31885][SQL][3.0] Fix filter push down for old millis timestamps to Parquet
AmplabJenkins commented on pull request #28699: URL: https://github.com/apache/spark/pull/28699#issuecomment-637335673
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28696: [SPARK-31888][SQL] Support `java.time.Instant` in Parquet filter pushdown
AmplabJenkins removed a comment on pull request #28696: URL: https://github.com/apache/spark/pull/28696#issuecomment-637335317 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #28692: [SPARK-31879][SQL] Using GB as default Locale for datetime formatters
AmplabJenkins commented on pull request #28692: URL: https://github.com/apache/spark/pull/28692#issuecomment-637335864
[GitHub] [spark] SparkQA removed a comment on pull request #28692: [SPARK-31879][SQL] Using GB as default Locale for datetime formatters
SparkQA removed a comment on pull request #28692: URL: https://github.com/apache/spark/pull/28692#issuecomment-637227803 **[Test build #123413 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123413/testReport)** for PR 28692 at commit [`3339274`](https://github.com/apache/spark/commit/33392744c4c9f2ffebe8e05411908cea76de17cc).
[GitHub] [spark] AmplabJenkins commented on pull request #27627: [SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
AmplabJenkins commented on pull request #27627: URL: https://github.com/apache/spark/pull/27627#issuecomment-637335670
[GitHub] [spark] AmplabJenkins commented on pull request #28696: [SPARK-31888][SQL] Support `java.time.Instant` in Parquet filter pushdown
AmplabJenkins commented on pull request #28696: URL: https://github.com/apache/spark/pull/28696#issuecomment-637335317
[GitHub] [spark] SparkQA commented on pull request #28692: [SPARK-31879][SQL] Using GB as default Locale for datetime formatters
SparkQA commented on pull request #28692: URL: https://github.com/apache/spark/pull/28692#issuecomment-637335200 **[Test build #123413 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123413/testReport)** for PR 28692 at commit [`3339274`](https://github.com/apache/spark/commit/33392744c4c9f2ffebe8e05411908cea76de17cc). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #28686: [SPARK-31877][SQL]Avoid stats computation for Hive table
AmplabJenkins commented on pull request #28686: URL: https://github.com/apache/spark/pull/28686#issuecomment-637335430
[GitHub] [spark] SparkQA commented on pull request #28686: [SPARK-31877][SQL]Avoid stats computation for Hive table
SparkQA commented on pull request #28686: URL: https://github.com/apache/spark/pull/28686#issuecomment-637335204 **[Test build #123418 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123418/testReport)** for PR 28686 at commit [`a58a75f`](https://github.com/apache/spark/commit/a58a75f6d8fb33ecdc0ed8b563f2c38a3638c57f). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #28697: [SPARK-29150][CORE] Update RDD API for Stage level scheduling to be public
AmplabJenkins commented on pull request #28697: URL: https://github.com/apache/spark/pull/28697#issuecomment-637335627
[GitHub] [spark] SparkQA removed a comment on pull request #28686: [SPARK-31877][SQL]Avoid stats computation for Hive table
SparkQA removed a comment on pull request #28686: URL: https://github.com/apache/spark/pull/28686#issuecomment-637289802 **[Test build #123418 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123418/testReport)** for PR 28686 at commit [`a58a75f`](https://github.com/apache/spark/commit/a58a75f6d8fb33ecdc0ed8b563f2c38a3638c57f).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28699: [SPARK-31885][SQL][3.0] Fix filter push down for old millis timestamps to Parquet
AmplabJenkins removed a comment on pull request #28699: URL: https://github.com/apache/spark/pull/28699#issuecomment-637335673 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #28696: [SPARK-31888][SQL] Support `java.time.Instant` in Parquet filter pushdown
SparkQA removed a comment on pull request #28696: URL: https://github.com/apache/spark/pull/28696#issuecomment-637295139 **[Test build #123420 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123420/testReport)** for PR 28696 at commit [`a0a5923`](https://github.com/apache/spark/commit/a0a5923325592424ce78f5881573b337b8e4e300).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28686: [SPARK-31877][SQL]Avoid stats computation for Hive table
AmplabJenkins removed a comment on pull request #28686: URL: https://github.com/apache/spark/pull/28686#issuecomment-637335430 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #27908: [SPARK-31000] Add ability to set table description via Catalog.createTable()
SparkQA commented on pull request #27908: URL: https://github.com/apache/spark/pull/27908#issuecomment-637335203 **[Test build #123412 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123412/testReport)** for PR 27908 at commit [`31538f2`](https://github.com/apache/spark/commit/31538f26bb0d25cb321068d096c9e1610c65b968). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #27627: [SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
SparkQA commented on pull request #27627: URL: https://github.com/apache/spark/pull/27627#issuecomment-637335199 **[Test build #123414 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123414/testReport)** for PR 27627 at commit [`7795888`](https://github.com/apache/spark/commit/77958880245cca238bd976900e57715f6f96a3c4). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #28654: [SPARK-31834][SQL]Improve error message for incompatible data types
SparkQA commented on pull request #28654: URL: https://github.com/apache/spark/pull/28654#issuecomment-637335195 **[Test build #123415 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123415/testReport)** for PR 28654 at commit [`9c4b485`](https://github.com/apache/spark/commit/9c4b485030dbb80d0d757ef8100984a53ff7eb2b). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #28654: [SPARK-31834][SQL]Improve error message for incompatible data types
AmplabJenkins commented on pull request #28654: URL: https://github.com/apache/spark/pull/28654#issuecomment-637335562
[GitHub] [spark] SparkQA commented on pull request #28699: [SPARK-31885][SQL][3.0] Fix filter push down for old millis timestamps to Parquet
SparkQA commented on pull request #28699: URL: https://github.com/apache/spark/pull/28699#issuecomment-637335206 **[Test build #123411 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123411/testReport)** for PR 28699 at commit [`4ba7e05`](https://github.com/apache/spark/commit/4ba7e05073f47e9595fed34e035e219ca0ba19e1). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #27908: [SPARK-31000] Add ability to set table description via Catalog.createTable()
AmplabJenkins removed a comment on pull request #27908: URL: https://github.com/apache/spark/pull/27908#issuecomment-637336199 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123412/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28692: [SPARK-31879][SQL] Using GB as default Locale for datetime formatters
AmplabJenkins removed a comment on pull request #28692: URL: https://github.com/apache/spark/pull/28692#issuecomment-637335881 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123413/ Test FAILed.
[GitHub] [spark] yaooqinn commented on pull request #28692: [SPARK-31879][SQL] Using GB as default Locale for datetime formatters
yaooqinn commented on pull request #28692: URL: https://github.com/apache/spark/pull/28692#issuecomment-637337298 retest this please
[GitHub] [spark] yaooqinn commented on pull request #28692: [SPARK-31879][SQL] Using GB as default Locale for datetime formatters
yaooqinn commented on pull request #28692: URL: https://github.com/apache/spark/pull/28692#issuecomment-637336891 retest this please
[GitHub] [spark] viirya commented on pull request #28556: [SPARK-31736][SQL] Nested column aliasing for RepartitionByExpression/Join
viirya commented on pull request #28556: URL: https://github.com/apache/spark/pull/28556#issuecomment-637336896 ping @cloud-fan @dongjoon-hyun
[GitHub] [spark] AmplabJenkins commented on pull request #28692: [SPARK-31879][SQL] Using GB as default Locale for datetime formatters
AmplabJenkins commented on pull request #28692: URL: https://github.com/apache/spark/pull/28692#issuecomment-637339913
[GitHub] [spark] MaxGekk commented on pull request #28696: [SPARK-31888][SQL] Support `java.time.Instant` in Parquet filter pushdown
MaxGekk commented on pull request #28696: URL: https://github.com/apache/spark/pull/28696#issuecomment-637339865 jenkins, retest this, please
[GitHub] [spark] cloud-fan commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
cloud-fan commented on a change in pull request #28647: URL: https://github.com/apache/spark/pull/28647#discussion_r433668494
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ##
@@ -839,6 +839,19 @@ case class AlterTableSetLocationCommand(
 object DDLUtils {
   val HIVE_PROVIDER = "hive"
+  val METASTORE_GENERATED_PROPERTIES: Set[String] = Set(
+    "CreateTime",
+    "transient_lastDdlTime",
+    "grantTime",
+    "lastUpdateTime",
+    "last_modified_by",
+    "last_modified_time",
+    "Owner:",
+    "totalNumberFiles",
+    "maxFileSize",
+    "minFileSize"
Review comment: > We need to add some properties like last_modified_by, last_modified_time.
It's fine. It's better to do less hardcode.
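The diff quoted in this review thread introduces a hardcoded set of metastore-generated property keys. As a rough illustration only, the sketch below shows how such a set could be used to drop generated keys when copying table properties. It assumes that exclusion is the intent of the set; the object name `RetainTablePropertiesSketch` and the helper `retainUserProperties` are hypothetical and are not taken from PR #28647. Only the key names come from the diff.

```scala
// Hypothetical sketch, not the code from PR #28647.
object RetainTablePropertiesSketch {
  // Key names copied from the diff quoted in the review comment.
  val MetastoreGeneratedProperties: Set[String] = Set(
    "CreateTime",
    "transient_lastDdlTime",
    "grantTime",
    "lastUpdateTime",
    "last_modified_by",
    "last_modified_time",
    "Owner:",
    "totalNumberFiles",
    "maxFileSize",
    "minFileSize")

  // Assumed behaviour: keep every property whose key is not in the
  // generated set, so only user-supplied properties are carried over.
  def retainUserProperties(source: Map[String, String]): Map[String, String] =
    source.filter { case (key, _) => !MetastoreGeneratedProperties.contains(key) }

  def main(args: Array[String]): Unit = {
    // Made-up example properties for illustration.
    val sourceProps = Map(
      "transient_lastDdlTime" -> "1591000000",
      "comment" -> "sales fact table",
      "last_modified_by" -> "alice")
    println(retainUserProperties(sourceProps))
  }
}
```

A fixed set like this has to be extended by hand whenever the metastore starts writing a new generated key, which may be the concern behind the reviewer's preference for less hardcoding.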
[GitHub] [spark] SparkQA commented on pull request #28654: [SPARK-31834][SQL]Improve error message for incompatible data types
SparkQA commented on pull request #28654: URL: https://github.com/apache/spark/pull/28654#issuecomment-637342881 **[Test build #123423 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123423/testReport)** for PR 28654 at commit [`9c4b485`](https://github.com/apache/spark/commit/9c4b485030dbb80d0d757ef8100984a53ff7eb2b).
[GitHub] [spark] SparkQA commented on pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
SparkQA commented on pull request #25840: URL: https://github.com/apache/spark/pull/25840#issuecomment-637342888 **[Test build #123424 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123424/testReport)** for PR 25840 at commit [`f78ff9c`](https://github.com/apache/spark/commit/f78ff9c1f012f0a45e33f421f48366022efaf160).
[GitHub] [spark] SparkQA commented on pull request #28696: [SPARK-31888][SQL] Support `java.time.Instant` in Parquet filter pushdown
SparkQA commented on pull request #28696: URL: https://github.com/apache/spark/pull/28696#issuecomment-637342883 **[Test build #123422 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123422/testReport)** for PR 28696 at commit [`a0a5923`](https://github.com/apache/spark/commit/a0a5923325592424ce78f5881573b337b8e4e300).
[GitHub] [spark] AmplabJenkins commented on pull request #28654: [SPARK-31834][SQL]Improve error message for incompatible data types
AmplabJenkins commented on pull request #28654: URL: https://github.com/apache/spark/pull/28654#issuecomment-637343496
[GitHub] [spark] AmplabJenkins commented on pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
AmplabJenkins commented on pull request #25840: URL: https://github.com/apache/spark/pull/25840#issuecomment-637343501
[GitHub] [spark] AmplabJenkins commented on pull request #28696: [SPARK-31888][SQL] Support `java.time.Instant` in Parquet filter pushdown
AmplabJenkins commented on pull request #28696: URL: https://github.com/apache/spark/pull/28696#issuecomment-637343433
[GitHub] [spark] sarutak edited a comment on pull request #28702: [SPARK-31756][WEBUI][3.0] Add real headless browser support for UI test
sarutak edited a comment on pull request #28702: URL: https://github.com/apache/spark/pull/28702#issuecomment-637359385 CC: @gengliangwang @dongjoon-hyun I've opened this PR in response to [the discussion](https://github.com/apache/spark/pull/28690#issuecomment-637315623).
[GitHub] [spark] sarutak commented on pull request #28702: [SPARK-31756][WEBUI][3.0] Add real headless browser support for UI test
sarutak commented on pull request #28702: URL: https://github.com/apache/spark/pull/28702#issuecomment-637359385 CC: @gengliangwang @dongjoon-hyun I've opened this PR in response to [the discussion](https://github.com/apache/spark/pull/28690#issuecomment-637315623).
[GitHub] [spark] cloud-fan commented on a change in pull request #28645: [SPARK-31826][SQL] Support composed type of case class for typed Scala UDF
cloud-fan commented on a change in pull request #28645: URL: https://github.com/apache/spark/pull/28645#discussion_r433673986
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ##
@@ -2847,6 +2848,39 @@ class Analyzer(
     }
   }
+  object PrepareDeserializerForUDF extends Rule[LogicalPlan] {
+    override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp {
+      case p if !p.resolved => p // Skip unresolved nodes.
+
+      case p => p transformExpressionsUp {
+
+        case udf @ ScalaUDF(_, _, inputs, encoders, _, _, _, desers)
+            if encoders.nonEmpty && desers.isEmpty =>
+          val deserializers = encoders.zipWithIndex.map { case (encOpt, i) =>
+            val dataType = inputs(i).dataType
+            if (CatalystTypeConverters.isPrimitive(dataType) ||
+              dataType.isInstanceOf[UserDefinedType[_]]) {
+              // primitive/UDT data types use `CatalystTypeConverters` to
+              // convert internal data to external data.
+              None
+            } else {
+              encOpt.map { enc =>
+                val attrs = if (enc.isSerializedAsStructForTopLevel) {
+                  dataType.asInstanceOf[StructType].toAttributes
+                } else {
+                  // the field name doesn't matter here, so we use
+                  // a simple literal to avoid any overhead
+                  new StructType().add(s"input", dataType).toAttributes
+                }
+                enc.resolveAndBind(attrs).createDeserializer()
Review comment: We can't bind the attributes during analysis.
[GitHub] [spark] AmplabJenkins commented on pull request #28702: [SPARK-31756][WEBUI][3.0] Add real headless browser support for UI test
AmplabJenkins commented on pull request #28702: URL: https://github.com/apache/spark/pull/28702#issuecomment-637362304
[GitHub] [spark] SparkQA commented on pull request #28702: [SPARK-31756][WEBUI][3.0] Add real headless browser support for UI test
SparkQA commented on pull request #28702: URL: https://github.com/apache/spark/pull/28702#issuecomment-637361640 **[Test build #123425 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123425/testReport)** for PR 28702 at commit [`b0b43dc`](https://github.com/apache/spark/commit/b0b43dc045fb1d6f8d4418d93efb616b6c8c9acc).
[GitHub] [spark] AmplabJenkins commented on pull request #28703: SPARK-29897 Add implicit cast for SubtractTimestamps
AmplabJenkins commented on pull request #28703: URL: https://github.com/apache/spark/pull/28703#issuecomment-637376943 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28696: [SPARK-31888][SQL] Support `java.time.Instant` in Parquet filter pushdown
AmplabJenkins removed a comment on pull request #28696: URL: https://github.com/apache/spark/pull/28696#issuecomment-637335324 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123420/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28686: [SPARK-31877][SQL]Avoid stats computation for Hive table
AmplabJenkins removed a comment on pull request #28686: URL: https://github.com/apache/spark/pull/28686#issuecomment-637335438 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123418/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #27908: [SPARK-31000] Add ability to set table description via Catalog.createTable()
AmplabJenkins commented on pull request #27908: URL: https://github.com/apache/spark/pull/27908#issuecomment-637336186
[GitHub] [spark] AmplabJenkins removed a comment on pull request #27627: [SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
AmplabJenkins removed a comment on pull request #27627: URL: https://github.com/apache/spark/pull/27627#issuecomment-637335683 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123414/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28654: [SPARK-31834][SQL]Improve error message for incompatible data types
AmplabJenkins removed a comment on pull request #28654: URL: https://github.com/apache/spark/pull/28654#issuecomment-637335571 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123415/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28692: [SPARK-31879][SQL] Using GB as default Locale for datetime formatters
AmplabJenkins removed a comment on pull request #28692: URL: https://github.com/apache/spark/pull/28692#issuecomment-637335864 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #27908: [SPARK-31000] Add ability to set table description via Catalog.createTable()
AmplabJenkins removed a comment on pull request #27908: URL: https://github.com/apache/spark/pull/27908#issuecomment-637336186 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #28699: [SPARK-31885][SQL][3.0] Fix filter push down for old millis timestamps to Parquet
SparkQA removed a comment on pull request #28699: URL: https://github.com/apache/spark/pull/28699#issuecomment-637221536 **[Test build #123411 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123411/testReport)** for PR 28699 at commit [`4ba7e05`](https://github.com/apache/spark/commit/4ba7e05073f47e9595fed34e035e219ca0ba19e1).
[GitHub] [spark] SparkQA removed a comment on pull request #27908: [SPARK-31000] Add ability to set table description via Catalog.createTable()
SparkQA removed a comment on pull request #27908: URL: https://github.com/apache/spark/pull/27908#issuecomment-637221473 **[Test build #123412 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123412/testReport)** for PR 27908 at commit [`31538f2`](https://github.com/apache/spark/commit/31538f26bb0d25cb321068d096c9e1610c65b968).
[GitHub] [spark] SparkQA removed a comment on pull request #28654: [SPARK-31834][SQL]Improve error message for incompatible data types
SparkQA removed a comment on pull request #28654: URL: https://github.com/apache/spark/pull/28654#issuecomment-637245241 **[Test build #123415 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123415/testReport)** for PR 28654 at commit [`9c4b485`](https://github.com/apache/spark/commit/9c4b485030dbb80d0d757ef8100984a53ff7eb2b).
[GitHub] [spark] SparkQA removed a comment on pull request #27627: [SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
SparkQA removed a comment on pull request #27627: URL: https://github.com/apache/spark/pull/27627#issuecomment-637229956 **[Test build #123414 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123414/testReport)** for PR 27627 at commit [`7795888`](https://github.com/apache/spark/commit/77958880245cca238bd976900e57715f6f96a3c4).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #27627: [SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
AmplabJenkins removed a comment on pull request #27627: URL: https://github.com/apache/spark/pull/27627#issuecomment-637335670 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #28696: [SPARK-31888][SQL] Support `java.time.Instant` in Parquet filter pushdown
SparkQA commented on pull request #28696: URL: https://github.com/apache/spark/pull/28696#issuecomment-637335205 **[Test build #123420 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123420/testReport)** for PR 28696 at commit [`a0a5923`](https://github.com/apache/spark/commit/a0a5923325592424ce78f5881573b337b8e4e300). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] viirya commented on pull request #28560: [SPARK-27217][SQL] Nested column aliasing for more operators which can prune nested column
viirya commented on pull request #28560: URL: https://github.com/apache/spark/pull/28560#issuecomment-637336844 ping @cloud-fan @dongjoon-hyun
[GitHub] [spark] SparkQA commented on pull request #28692: [SPARK-31879][SQL] Using GB as default Locale for datetime formatters
SparkQA commented on pull request #28692: URL: https://github.com/apache/spark/pull/28692#issuecomment-637339211 **[Test build #123421 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123421/testReport)** for PR 28692 at commit [`3339274`](https://github.com/apache/spark/commit/33392744c4c9f2ffebe8e05411908cea76de17cc).
[GitHub] [spark] viirya commented on a change in pull request #28700: [SPARK-31860][BUILD] only push release tags on succes
viirya commented on a change in pull request #28700: URL: https://github.com/apache/spark/pull/28700#discussion_r433665567
## File path: dev/create-release/release-tag.sh ##
@@ -24,7 +24,7 @@ function exit_with_usage {
   local NAME=$(basename $0)
   cat << EOF
 usage: $NAME
-Tags a Spark release on a particular branch.
+Tags a Spark release on a particular branch. Must push after
Review comment: "Must push after" - it reads a bit weird, is it unfinished sentence?
[GitHub] [spark] maropu commented on pull request #28654: [SPARK-31834][SQL]Improve error message for incompatible data types
maropu commented on pull request #28654: URL: https://github.com/apache/spark/pull/28654#issuecomment-637340650 retest this please
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28699: [SPARK-31885][SQL][3.0] Fix filter push down for old millis timestamps to Parquet
AmplabJenkins removed a comment on pull request #28699: URL: https://github.com/apache/spark/pull/28699#issuecomment-637335687 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123411/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28692: [SPARK-31879][SQL] Using GB as default Locale for datetime formatters
AmplabJenkins removed a comment on pull request #28692: URL: https://github.com/apache/spark/pull/28692#issuecomment-637339913
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28696: [SPARK-31888][SQL] Support `java.time.Instant` in Parquet filter pushdown
AmplabJenkins removed a comment on pull request #28696: URL: https://github.com/apache/spark/pull/28696#issuecomment-637343444
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28654: [SPARK-31834][SQL]Improve error message for incompatible data types
AmplabJenkins removed a comment on pull request #28654: URL: https://github.com/apache/spark/pull/28654#issuecomment-637343496
[GitHub] [spark] AmplabJenkins removed a comment on pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
AmplabJenkins removed a comment on pull request #25840: URL: https://github.com/apache/spark/pull/25840#issuecomment-637343501
[GitHub] [spark] cloud-fan commented on a change in pull request #28645: [SPARK-31826][SQL] Support composed type of case class for typed Scala UDF
cloud-fan commented on a change in pull request #28645:
URL: https://github.com/apache/spark/pull/28645#discussion_r433673640

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ##
@@ -49,11 +52,19 @@ case class ScalaUDF(
     inputEncoders: Seq[Option[ExpressionEncoder[_]]] = Nil,
     udfName: Option[String] = None,
     nullable: Boolean = true,
-    udfDeterministic: Boolean = true)
+    udfDeterministic: Boolean = true,
+    inputDeserializers: Seq[Option[Deserializer[_]]] = Nil)

Review comment:
Do we need the `inputEncoders` parameter anymore?
[GitHub] [spark] cloud-fan commented on a change in pull request #28645: [SPARK-31826][SQL] Support composed type of case class for typed Scala UDF
cloud-fan commented on a change in pull request #28645:
URL: https://github.com/apache/spark/pull/28645#discussion_r433673986

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ##
@@ -2847,6 +2848,39 @@ class Analyzer(
     }
   }

+  object PrepareDeserializerForUDF extends Rule[LogicalPlan] {
+    override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp {
+      case p if !p.resolved => p // Skip unresolved nodes.
+
+      case p => p transformExpressionsUp {
+
+        case udf @ ScalaUDF(_, _, inputs, encoders, _, _, _, desers)
+          if encoders.nonEmpty && desers.isEmpty =>
+          val deserializers = encoders.zipWithIndex.map { case (encOpt, i) =>
+            val dataType = inputs(i).dataType
+            if (CatalystTypeConverters.isPrimitive(dataType) ||
+              dataType.isInstanceOf[UserDefinedType[_]]) {
+              // primitive/UDT data types use `CatalystTypeConverters` to
+              // convert internal data to external data.
+              None
+            } else {
+              encOpt.map { enc =>
+                val attrs = if (enc.isSerializedAsStructForTopLevel) {
+                  dataType.asInstanceOf[StructType].toAttributes
+                } else {
+                  // the field name doesn't matter here, so we use
+                  // a simple literal to avoid any overhead
+                  new StructType().add(s"input", dataType).toAttributes
+                }
+                enc.resolveAndBind(attrs).createDeserializer()

Review comment:
We can't bind the attributes during analysis.
[GitHub] [spark] sarutak opened a new pull request #28702: [SPARK-31756][WEBUI][3.0] Add real headless browser support for UI test
sarutak opened a new pull request #28702:
URL: https://github.com/apache/spark/pull/28702

### What changes were proposed in this pull request?

This PR backports the change of #28627. Some UI fixes such as #28690 should be merged to `branch-3.0` together with their test cases, so #28627 needs to be backported.

This PR mainly adds two things.

1. Real headless browser support for UI tests.
2. A test suite using headless Chrome as one instance of those browsers.

Also, for environments where Chrome and Chrome driver are not installed, the `ChromeUITest` tag is added to filter out the test suite. By default, test suites tagged with `ChromeUITest` are disabled.

### Why are the changes needed?

In `branch-3.0`, there are two problems with UI tests.

1. Many tests, especially JavaScript-related ones, are done manually. Appearance is better confirmed by eye, but logic should ideally be covered by test cases.
2. Compared to real web browsers, HtmlUnit doesn't seem to support JavaScript well enough. I previously added a JavaScript-related test for SPARK-31534 using HtmlUnit, which is a simple library-based headless browser for testing. The test more or less works, but a JavaScript-related error is shown in unit-tests.log.

```
=== EXCEPTION START
Exception class=[net.sourceforge.htmlunit.corejs.javascript.JavaScriptException]
com.gargoylesoftware.htmlunit.ScriptException: Error: TOOLTIP: Option "sanitizeFn" provided type "window" but expected type "(null|function)". (http://192.168.1.209:60724/static/jquery-3.4.1.min.js#2)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:904)
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:628)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:515)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:835)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:807)
    at com.gargoylesoftware.htmlunit.InteractivePage.executeJavaScriptFunctionIfPossible(InteractivePage.java:216)
    at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptFunctionJob.runJavaScript(JavaScriptFunctionJob.java:52)
    at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptExecutionJob.run(JavaScriptExecutionJob.java:102)
    at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptJobManagerImpl.runSingleJob(JavaScriptJobManagerImpl.java:426)
    at com.gargoylesoftware.htmlunit.javascript.background.DefaultJavaScriptExecutor.run(DefaultJavaScriptExecutor.java:157)
    at java.lang.Thread.run(Thread.java:748)
Caused by: net.sourceforge.htmlunit.corejs.javascript.JavaScriptException: Error: TOOLTIP: Option "sanitizeFn" provided type "window" but expected type "(null|function)". (http://192.168.1.209:60724/static/jquery-3.4.1.min.js#2)
    at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1009)
    at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:800)
    at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:105)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(ContextFactory.java:413)
    at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTopCall(HtmlUnitContextFactory.java:252)
    at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3264)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$4.doRun(JavaScriptEngine.java:828)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:889)
    ... 10 more
JavaScriptException value = Error: TOOLTIP: Option "sanitizeFn" provided type "window" but expected type "(null|function)".
== CALLING JAVASCRIPT ==
function () { throw e; }
=== EXCEPTION END
```

I tried to upgrade HtmlUnit to 2.40.0 but, even worse, the test stopped working, even though it works without error on real browsers like Chrome, Safari and Firefox.

```
[info] UISeleniumSuite:
[info] - SPARK-31534: text for tooltip should be escaped *** FAILED *** (17 seconds, 745 milliseconds)
[info]   The code passed to eventually never returned normally. Attempted 2 times over 12.910785232 seconds. Last failure message: com.gargoylesoftware.htmlunit.ScriptException: ReferenceError: Assignment to undefined "regeneratorRuntime" in strict mode (http://192.168.1.209:62132/static/vis-timeline-graph2d.min.js#52(Function)#1)
```

To resolve those problems, it's better to support
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28702: [SPARK-31756][WEBUI][3.0] Add real headless browser support for UI test
AmplabJenkins removed a comment on pull request #28702: URL: https://github.com/apache/spark/pull/28702#issuecomment-637362304
[GitHub] [spark] advancedxy commented on a change in pull request #28308: [SPARK-31526][SQL][TESTS] Add a new test suite for ExpressionInfo
advancedxy commented on a change in pull request #28308: URL: https://github.com/apache/spark/pull/28308#discussion_r433696896 ## File path: sql/core/src/test/scala/org/apache/spark/sql/expressions/ExpressionInfoSuite.scala ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.sql.expressions + +import scala.collection.parallel.immutable.ParVector + +import org.apache.spark.SparkFunSuite +import org.apache.spark.sql.catalyst.FunctionIdentifier +import org.apache.spark.sql.catalyst.expressions.ExpressionInfo +import org.apache.spark.sql.execution.HiveResult.hiveResultString +import org.apache.spark.sql.internal.SQLConf +import org.apache.spark.sql.test.SharedSparkSession + +class ExpressionInfoSuite extends SparkFunSuite with SharedSparkSession { + + test("Replace _FUNC_ in ExpressionInfo") { +val info = spark.sessionState.catalog.lookupFunctionInfo(FunctionIdentifier("upper")) +assert(info.getName === "upper") +assert(info.getClassName === "org.apache.spark.sql.catalyst.expressions.Upper") +assert(info.getUsage === "upper(str) - Returns `str` with all characters changed to uppercase.") +assert(info.getExamples.contains("> SELECT upper('SparkSql');")) +assert(info.getSince === "1.0.1") +assert(info.getNote === "") +assert(info.getExtended.contains("> SELECT upper('SparkSql');")) + } + + test("group info in ExpressionInfo") { +val info = spark.sessionState.catalog.lookupFunctionInfo(FunctionIdentifier("sum")) +assert(info.getGroup === "agg_funcs") + +Seq("agg_funcs", "array_funcs", "datetime_funcs", "json_funcs", "map_funcs", "window_funcs") +.foreach { groupName => + val info = new ExpressionInfo( +"testClass", null, "testName", null, "", "", "", groupName, "", "") + assert(info.getGroup === groupName) +} + +val errMsg = intercept[IllegalArgumentException] { + val invalidGroupName = "invalid_group_funcs" + new ExpressionInfo("testClass", null, "testName", null, "", "", "", invalidGroupName, "", "") +}.getMessage +assert(errMsg.contains("'group' is malformed in the expression [testName].")) + } + + test("error handling in ExpressionInfo") { +val errMsg1 = intercept[IllegalArgumentException] { + val invalidNote = " invalid note" + new ExpressionInfo("testClass", null, "testName", null, "", "", invalidNote, 
"", "", "") +}.getMessage +assert(errMsg1.contains("'note' is malformed in the expression [testName].")) + +val errMsg2 = intercept[IllegalArgumentException] { + val invalidSince = "-3.0.0" + new ExpressionInfo("testClass", null, "testName", null, "", "", "", "", invalidSince, "") +}.getMessage +assert(errMsg2.contains("'since' is malformed in the expression [testName].")) + +val errMsg3 = intercept[IllegalArgumentException] { + val invalidDeprecated = " invalid deprecated" + new ExpressionInfo("testClass", null, "testName", null, "", "", "", "", "", invalidDeprecated) +}.getMessage +assert(errMsg3.contains("'deprecated' is malformed in the expression [testName].")) + } + + test("using _FUNC_ instead of function names in examples") { +val exampleRe = "(>.*;)".r +val setStmtRe = "(?i)^(>\\s+set\\s+).+".r +val ignoreSet = Set( + // Examples for CaseWhen show simpler syntax: + // `CASE WHEN ... THEN ... WHEN ... THEN ... END` + "org.apache.spark.sql.catalyst.expressions.CaseWhen", + // _FUNC_ is replaced by `locate` but `locate(... IN ...)` is not supported + "org.apache.spark.sql.catalyst.expressions.StringLocate", + // _FUNC_ is replaced by `%` which causes a parsing error on `SELECT %(2, 1.8)` + "org.apache.spark.sql.catalyst.expressions.Remainder", + // Examples demonstrate alternative names, see SPARK-20749 + "org.apache.spark.sql.catalyst.expressions.Length") +spark.sessionState.functionRegistry.listFunction().foreach { funcId => + val info = spark.sessionState.catalog.lookupFunctionInfo(funcId) + val className = info.getClassName + withClue(s"Expression class '$className'") { +val exprExamples = info.getOriginalExamples
[GitHub] [spark] sathyaprakashg opened a new pull request #28703: SPARK-29897 Add implicit cast for SubtractTimestamps
sathyaprakashg opened a new pull request #28703:
URL: https://github.com/apache/spark/pull/28703

### What changes were proposed in this pull request?

Add implicit cast option for SubtractTimestamps expression

### Why are the changes needed?

Currently, the statement below fails because the timestamp is passed as a string. With the implicit cast trait added, the string is cast to the timestamp data type automatically.

SELECT EXTRACT(DAY FROM NOW() - '2014-08-02 08:10:56');

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

SQL statements added to sql-tests
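To make the effect of the cast concrete, here is a minimal sketch in plain Scala using `java.time`. It is not Spark's implementation: `ImplicitCastSketch`, `toTimestamp` and `subtract` are hypothetical names, and the fixed `yyyy-MM-dd HH:mm:ss` pattern is an assumption chosen to match the literal in the statement above. It only illustrates the idea that a string operand is converted to a timestamp before the subtraction is evaluated.

```scala
import java.time.{Duration, LocalDateTime}
import java.time.format.DateTimeFormatter

object ImplicitCastSketch {
  // Assumed literal format, matching '2014-08-02 08:10:56' in the PR description.
  private val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Hypothetical stand-in for the cast: turn a string operand into a timestamp.
  def toTimestamp(s: String): LocalDateTime = LocalDateTime.parse(s, fmt)

  // Subtract a string operand from a timestamp by casting the string first.
  def subtract(end: LocalDateTime, start: String): Duration =
    Duration.between(toTimestamp(start), end)
}
```

With a fixed value standing in for `NOW()`, `ImplicitCastSketch.subtract(LocalDateTime.of(2014, 8, 5, 8, 10, 56), "2014-08-02 08:10:56").toDays` gives the day component that `EXTRACT(DAY FROM ...)` asks for.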
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28703: SPARK-29897 Add implicit cast for SubtractTimestamps
AmplabJenkins removed a comment on pull request #28703: URL: https://github.com/apache/spark/pull/28703#issuecomment-637376943 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #28703: SPARK-29897 Add implicit cast for SubtractTimestamps
AmplabJenkins commented on pull request #28703: URL: https://github.com/apache/spark/pull/28703#issuecomment-637377768 Can one of the admins verify this patch?
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28308: [SPARK-31526][SQL][TESTS] Add a new test suite for ExpressionInfo
HyukjinKwon commented on a change in pull request #28308: URL: https://github.com/apache/spark/pull/28308#discussion_r433711518 ## File path: sql/core/src/test/scala/org/apache/spark/sql/expressions/ExpressionInfoSuite.scala ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.sql.expressions + +import scala.collection.parallel.immutable.ParVector + +import org.apache.spark.SparkFunSuite +import org.apache.spark.sql.catalyst.FunctionIdentifier +import org.apache.spark.sql.catalyst.expressions.ExpressionInfo +import org.apache.spark.sql.execution.HiveResult.hiveResultString +import org.apache.spark.sql.internal.SQLConf +import org.apache.spark.sql.test.SharedSparkSession + +class ExpressionInfoSuite extends SparkFunSuite with SharedSparkSession { + + test("Replace _FUNC_ in ExpressionInfo") { +val info = spark.sessionState.catalog.lookupFunctionInfo(FunctionIdentifier("upper")) +assert(info.getName === "upper") +assert(info.getClassName === "org.apache.spark.sql.catalyst.expressions.Upper") +assert(info.getUsage === "upper(str) - Returns `str` with all characters changed to uppercase.") +assert(info.getExamples.contains("> SELECT upper('SparkSql');")) +assert(info.getSince === "1.0.1") +assert(info.getNote === "") +assert(info.getExtended.contains("> SELECT upper('SparkSql');")) + } + + test("group info in ExpressionInfo") { +val info = spark.sessionState.catalog.lookupFunctionInfo(FunctionIdentifier("sum")) +assert(info.getGroup === "agg_funcs") + +Seq("agg_funcs", "array_funcs", "datetime_funcs", "json_funcs", "map_funcs", "window_funcs") +.foreach { groupName => + val info = new ExpressionInfo( +"testClass", null, "testName", null, "", "", "", groupName, "", "") + assert(info.getGroup === groupName) +} + +val errMsg = intercept[IllegalArgumentException] { + val invalidGroupName = "invalid_group_funcs" + new ExpressionInfo("testClass", null, "testName", null, "", "", "", invalidGroupName, "", "") +}.getMessage +assert(errMsg.contains("'group' is malformed in the expression [testName].")) + } + + test("error handling in ExpressionInfo") { +val errMsg1 = intercept[IllegalArgumentException] { + val invalidNote = " invalid note" + new ExpressionInfo("testClass", null, "testName", null, "", "", invalidNote, 
"", "", "") +}.getMessage +assert(errMsg1.contains("'note' is malformed in the expression [testName].")) + +val errMsg2 = intercept[IllegalArgumentException] { + val invalidSince = "-3.0.0" + new ExpressionInfo("testClass", null, "testName", null, "", "", "", "", invalidSince, "") +}.getMessage +assert(errMsg2.contains("'since' is malformed in the expression [testName].")) + +val errMsg3 = intercept[IllegalArgumentException] { + val invalidDeprecated = " invalid deprecated" + new ExpressionInfo("testClass", null, "testName", null, "", "", "", "", "", invalidDeprecated) +}.getMessage +assert(errMsg3.contains("'deprecated' is malformed in the expression [testName].")) + } + + test("using _FUNC_ instead of function names in examples") { +val exampleRe = "(>.*;)".r +val setStmtRe = "(?i)^(>\\s+set\\s+).+".r +val ignoreSet = Set( + // Examples for CaseWhen show simpler syntax: + // `CASE WHEN ... THEN ... WHEN ... THEN ... END` + "org.apache.spark.sql.catalyst.expressions.CaseWhen", + // _FUNC_ is replaced by `locate` but `locate(... IN ...)` is not supported + "org.apache.spark.sql.catalyst.expressions.StringLocate", + // _FUNC_ is replaced by `%` which causes a parsing error on `SELECT %(2, 1.8)` + "org.apache.spark.sql.catalyst.expressions.Remainder", + // Examples demonstrate alternative names, see SPARK-20749 + "org.apache.spark.sql.catalyst.expressions.Length") +spark.sessionState.functionRegistry.listFunction().foreach { funcId => + val info = spark.sessionState.catalog.lookupFunctionInfo(funcId) + val className = info.getClassName + withClue(s"Expression class '$className'") { +val exprExamples =
[GitHub] [spark] akshatb1 commented on pull request #28258: [SPARK-31486] [CORE] spark.submit.waitAppCompletion flag to control spark-submit exit in Standalone Cluster Mode
akshatb1 commented on pull request #28258: URL: https://github.com/apache/spark/pull/28258#issuecomment-637383391 Thanks, @Ngone51 for reviewing. CC: @srowen @jiangxb1987 @prakharjain09 Could you kindly help in reviewing this PR?
[GitHub] [spark] akshatb1 edited a comment on pull request #28258: [SPARK-31486] [CORE] spark.submit.waitAppCompletion flag to control spark-submit exit in Standalone Cluster Mode
akshatb1 edited a comment on pull request #28258: URL: https://github.com/apache/spark/pull/28258#issuecomment-637383391 Thanks, @Ngone51 for reviewing. I have taken care of the last comment (indentation) as well. CC: @srowen @jiangxb1987 @prakharjain09 Could you kindly help in reviewing this PR?
[GitHub] [spark] maropu commented on a change in pull request #28308: [SPARK-31526][SQL][TESTS] Add a new test suite for ExpressionInfo
maropu commented on a change in pull request #28308: URL: https://github.com/apache/spark/pull/28308#discussion_r433711978 ## File path: sql/core/src/test/scala/org/apache/spark/sql/expressions/ExpressionInfoSuite.scala ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.sql.expressions + +import scala.collection.parallel.immutable.ParVector + +import org.apache.spark.SparkFunSuite +import org.apache.spark.sql.catalyst.FunctionIdentifier +import org.apache.spark.sql.catalyst.expressions.ExpressionInfo +import org.apache.spark.sql.execution.HiveResult.hiveResultString +import org.apache.spark.sql.internal.SQLConf +import org.apache.spark.sql.test.SharedSparkSession + +class ExpressionInfoSuite extends SparkFunSuite with SharedSparkSession { + + test("Replace _FUNC_ in ExpressionInfo") { +val info = spark.sessionState.catalog.lookupFunctionInfo(FunctionIdentifier("upper")) +assert(info.getName === "upper") +assert(info.getClassName === "org.apache.spark.sql.catalyst.expressions.Upper") +assert(info.getUsage === "upper(str) - Returns `str` with all characters changed to uppercase.") +assert(info.getExamples.contains("> SELECT upper('SparkSql');")) +assert(info.getSince === "1.0.1") +assert(info.getNote === "") +assert(info.getExtended.contains("> SELECT upper('SparkSql');")) + } + + test("group info in ExpressionInfo") { +val info = spark.sessionState.catalog.lookupFunctionInfo(FunctionIdentifier("sum")) +assert(info.getGroup === "agg_funcs") + +Seq("agg_funcs", "array_funcs", "datetime_funcs", "json_funcs", "map_funcs", "window_funcs") +.foreach { groupName => + val info = new ExpressionInfo( +"testClass", null, "testName", null, "", "", "", groupName, "", "") + assert(info.getGroup === groupName) +} + +val errMsg = intercept[IllegalArgumentException] { + val invalidGroupName = "invalid_group_funcs" + new ExpressionInfo("testClass", null, "testName", null, "", "", "", invalidGroupName, "", "") +}.getMessage +assert(errMsg.contains("'group' is malformed in the expression [testName].")) + } + + test("error handling in ExpressionInfo") { +val errMsg1 = intercept[IllegalArgumentException] { + val invalidNote = " invalid note" + new ExpressionInfo("testClass", null, "testName", null, "", "", invalidNote, 
"", "", "") +}.getMessage +assert(errMsg1.contains("'note' is malformed in the expression [testName].")) + +val errMsg2 = intercept[IllegalArgumentException] { + val invalidSince = "-3.0.0" + new ExpressionInfo("testClass", null, "testName", null, "", "", "", "", invalidSince, "") +}.getMessage +assert(errMsg2.contains("'since' is malformed in the expression [testName].")) + +val errMsg3 = intercept[IllegalArgumentException] { + val invalidDeprecated = " invalid deprecated" + new ExpressionInfo("testClass", null, "testName", null, "", "", "", "", "", invalidDeprecated) +}.getMessage +assert(errMsg3.contains("'deprecated' is malformed in the expression [testName].")) + } + + test("using _FUNC_ instead of function names in examples") { +val exampleRe = "(>.*;)".r +val setStmtRe = "(?i)^(>\\s+set\\s+).+".r +val ignoreSet = Set( + // Examples for CaseWhen show simpler syntax: + // `CASE WHEN ... THEN ... WHEN ... THEN ... END` + "org.apache.spark.sql.catalyst.expressions.CaseWhen", + // _FUNC_ is replaced by `locate` but `locate(... IN ...)` is not supported + "org.apache.spark.sql.catalyst.expressions.StringLocate", + // _FUNC_ is replaced by `%` which causes a parsing error on `SELECT %(2, 1.8)` + "org.apache.spark.sql.catalyst.expressions.Remainder", + // Examples demonstrate alternative names, see SPARK-20749 + "org.apache.spark.sql.catalyst.expressions.Length") +spark.sessionState.functionRegistry.listFunction().foreach { funcId => + val info = spark.sessionState.catalog.lookupFunctionInfo(funcId) + val className = info.getClassName + withClue(s"Expression class '$className'") { +val exprExamples = info.getOriginalExamples +
[GitHub] [spark] viirya opened a new pull request #28704: [SPARK-31777][ML] Add user-specified fold column to CrossValidator
viirya opened a new pull request #28704:
URL: https://github.com/apache/spark/pull/28704

### What changes were proposed in this pull request?

This patch adds user-specified fold column support to `CrossValidator`. Users can assign fold numbers to the dataset instead of letting Spark do random splits.

### Why are the changes needed?

This gives `CrossValidator` users more flexibility in splitting folds.

### Does this PR introduce _any_ user-facing change?

Yes, a new `foldCol` param is added to `CrossValidator`. Users can use it to specify custom fold splitting.

### How was this patch tested?

Added unit tests.
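As a rough illustration of what splitting by a user-assigned fold number means, the sketch below uses plain Scala collections rather than Spark. `FoldSplitSketch`, `Row` and `splits` are hypothetical names, and the range check on fold indices is an assumption, not something the PR description states.

```scala
object FoldSplitSketch {
  // Hypothetical row type: one feature value plus a user-assigned fold index.
  final case class Row(feature: Double, fold: Int)

  // For each fold k, rows tagged k form the validation set and all other rows
  // form the training set. Returns (training, validation) pairs, one per fold.
  def splits(data: Seq[Row], numFolds: Int): Seq[(Seq[Row], Seq[Row])] = {
    // Assumed validation: every fold index must lie in [0, numFolds).
    require(data.forall(r => r.fold >= 0 && r.fold < numFolds),
      s"fold indices must be in [0, $numFolds)")
    (0 until numFolds).map { k =>
      val (validation, training) = data.partition(_.fold == k)
      (training, validation)
    }
  }
}
```

Because the fold index travels with each row, the same rows always land in the same validation set, which is the extra control over splitting that the description refers to.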
[GitHub] [spark] advancedxy commented on a change in pull request #28308: [SPARK-31526][SQL][TESTS] Add a new test suite for ExpressionInfo
advancedxy commented on a change in pull request #28308:
URL: https://github.com/apache/spark/pull/28308#discussion_r433715996

## File path: sql/core/src/test/scala/org/apache/spark/sql/expressions/ExpressionInfoSuite.scala ##
@@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.expressions
+
+import scala.collection.parallel.immutable.ParVector
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.catalyst.FunctionIdentifier
+import org.apache.spark.sql.catalyst.expressions.ExpressionInfo
+import org.apache.spark.sql.execution.HiveResult.hiveResultString
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.test.SharedSparkSession
+
+class ExpressionInfoSuite extends SparkFunSuite with SharedSparkSession {
+
+  test("Replace _FUNC_ in ExpressionInfo") {
+    val info = spark.sessionState.catalog.lookupFunctionInfo(FunctionIdentifier("upper"))
+    assert(info.getName === "upper")
+    assert(info.getClassName === "org.apache.spark.sql.catalyst.expressions.Upper")
+    assert(info.getUsage === "upper(str) - Returns `str` with all characters changed to uppercase.")
+    assert(info.getExamples.contains("> SELECT upper('SparkSql');"))
+    assert(info.getSince === "1.0.1")
+    assert(info.getNote === "")
+    assert(info.getExtended.contains("> SELECT upper('SparkSql');"))
+  }
+
+  test("group info in ExpressionInfo") {
+    val info = spark.sessionState.catalog.lookupFunctionInfo(FunctionIdentifier("sum"))
+    assert(info.getGroup === "agg_funcs")
+
+    Seq("agg_funcs", "array_funcs", "datetime_funcs", "json_funcs", "map_funcs", "window_funcs")
+      .foreach { groupName =>
+        val info = new ExpressionInfo(
+          "testClass", null, "testName", null, "", "", "", groupName, "", "")
+        assert(info.getGroup === groupName)
+      }
+
+    val errMsg = intercept[IllegalArgumentException] {
+      val invalidGroupName = "invalid_group_funcs"
+      new ExpressionInfo("testClass", null, "testName", null, "", "", "", invalidGroupName, "", "")
+    }.getMessage
+    assert(errMsg.contains("'group' is malformed in the expression [testName]."))
+  }
+
+  test("error handling in ExpressionInfo") {
+    val errMsg1 = intercept[IllegalArgumentException] {
+      val invalidNote = " invalid note"
+      new ExpressionInfo("testClass", null, "testName", null, "", "", invalidNote, "", "", "")
+    }.getMessage
+    assert(errMsg1.contains("'note' is malformed in the expression [testName]."))
+
+    val errMsg2 = intercept[IllegalArgumentException] {
+      val invalidSince = "-3.0.0"
+      new ExpressionInfo("testClass", null, "testName", null, "", "", "", "", invalidSince, "")
+    }.getMessage
+    assert(errMsg2.contains("'since' is malformed in the expression [testName]."))
+
+    val errMsg3 = intercept[IllegalArgumentException] {
+      val invalidDeprecated = " invalid deprecated"
+      new ExpressionInfo("testClass", null, "testName", null, "", "", "", "", "", invalidDeprecated)
+    }.getMessage
+    assert(errMsg3.contains("'deprecated' is malformed in the expression [testName]."))
+  }
+
+  test("using _FUNC_ instead of function names in examples") {
+    val exampleRe = "(>.*;)".r
+    val setStmtRe = "(?i)^(>\\s+set\\s+).+".r
+    val ignoreSet = Set(
+      // Examples for CaseWhen show simpler syntax:
+      // `CASE WHEN ... THEN ... WHEN ... THEN ... END`
+      "org.apache.spark.sql.catalyst.expressions.CaseWhen",
+      // _FUNC_ is replaced by `locate` but `locate(... IN ...)` is not supported
+      "org.apache.spark.sql.catalyst.expressions.StringLocate",
+      // _FUNC_ is replaced by `%` which causes a parsing error on `SELECT %(2, 1.8)`
+      "org.apache.spark.sql.catalyst.expressions.Remainder",
+      // Examples demonstrate alternative names, see SPARK-20749
+      "org.apache.spark.sql.catalyst.expressions.Length")
+    spark.sessionState.functionRegistry.listFunction().foreach { funcId =>
+      val info = spark.sessionState.catalog.lookupFunctionInfo(funcId)
+      val className = info.getClassName
+      withClue(s"Expression class '$className'") {
+        val exprExamples = info.getOriginalExamples
[GitHub] [spark] SparkQA commented on pull request #28704: [SPARK-31777][ML] Add user-specified fold column to CrossValidator
SparkQA commented on pull request #28704: URL: https://github.com/apache/spark/pull/28704#issuecomment-637390451 **[Test build #123426 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123426/testReport)** for PR 28704 at commit [`baec279`](https://github.com/apache/spark/commit/baec279c45b4ac9782e4d1c3286063fc04146eb1).