[GitHub] [spark] SparkQA removed a comment on issue #25024: [SPARK-27296][SQL] Allows Aggregator to be registered as a UDF
SparkQA removed a comment on issue #25024: [SPARK-27296][SQL] Allows Aggregator to be registered as a UDF URL: https://github.com/apache/spark/pull/25024#issuecomment-571249550 **[Test build #116183 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116183/testReport)** for PR 25024 at commit [`986a3b4`](https://github.com/apache/spark/commit/986a3b45a1b90b6311c1047abdbcadc1c4d1f7d8). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
SparkQA commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571332914 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/20982/
[GitHub] [spark] AmplabJenkins removed a comment on issue #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2
AmplabJenkins removed a comment on issue #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2 URL: https://github.com/apache/spark/pull/26231#issuecomment-571330348 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116182/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2
AmplabJenkins removed a comment on issue #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2 URL: https://github.com/apache/spark/pull/26231#issuecomment-571330331 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2
AmplabJenkins commented on issue #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2 URL: https://github.com/apache/spark/pull/26231#issuecomment-571330348 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116182/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
AmplabJenkins removed a comment on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571329537 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116190/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2
AmplabJenkins commented on issue #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2 URL: https://github.com/apache/spark/pull/26231#issuecomment-571330331 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2
SparkQA removed a comment on issue #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2 URL: https://github.com/apache/spark/pull/26231#issuecomment-571234526 **[Test build #116182 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116182/testReport)** for PR 26231 at commit [`eccadc7`](https://github.com/apache/spark/commit/eccadc71d0d60a4c6acb4a6fe242f0913e0856c3).
[GitHub] [spark] SparkQA removed a comment on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
SparkQA removed a comment on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571327718 **[Test build #116190 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116190/testReport)** for PR 27053 at commit [`8edba2d`](https://github.com/apache/spark/commit/8edba2d7ecdf5b1cdc51f27b576f2a07b05cf69c).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
AmplabJenkins removed a comment on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571329524 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
AmplabJenkins commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571329524 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
SparkQA commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571329510 **[Test build #116190 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116190/testReport)** for PR 27053 at commit [`8edba2d`](https://github.com/apache/spark/commit/8edba2d7ecdf5b1cdc51f27b576f2a07b05cf69c). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2
SparkQA commented on issue #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2 URL: https://github.com/apache/spark/pull/26231#issuecomment-571329654 **[Test build #116182 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116182/testReport)** for PR 26231 at commit [`eccadc7`](https://github.com/apache/spark/commit/eccadc71d0d60a4c6acb4a6fe242f0913e0856c3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
AmplabJenkins commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571329537 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116190/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
SparkQA commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571327718 **[Test build #116190 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116190/testReport)** for PR 27053 at commit [`8edba2d`](https://github.com/apache/spark/commit/8edba2d7ecdf5b1cdc51f27b576f2a07b05cf69c).
[GitHub] [spark] viirya commented on a change in pull request #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter
viirya commented on a change in pull request #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter
URL: https://github.com/apache/spark/pull/26993#discussion_r363491931

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala ##

@@ -204,6 +204,23 @@ class ParquetIOSuite extends QueryTest with ParquetTest with SharedSparkSession
     }
   }

+  testStandardAndLegacyModes("array of struct") {

Review comment:
Do we have a test for array of struct of struct?
[GitHub] [spark] AmplabJenkins removed a comment on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
AmplabJenkins removed a comment on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571316985 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116189/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
SparkQA removed a comment on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571316219 **[Test build #116189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116189/testReport)** for PR 27053 at commit [`eece6ae`](https://github.com/apache/spark/commit/eece6ae77df9dfbf8b6c62d7f54c796bd2180175).
[GitHub] [spark] AmplabJenkins commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
AmplabJenkins commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571316985 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116189/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
SparkQA commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571316959 **[Test build #116189 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116189/testReport)** for PR 27053 at commit [`eece6ae`](https://github.com/apache/spark/commit/eece6ae77df9dfbf8b6c62d7f54c796bd2180175). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
AmplabJenkins commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571316975 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
AmplabJenkins removed a comment on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571316975 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
SparkQA commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571316219 **[Test build #116189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116189/testReport)** for PR 27053 at commit [`eece6ae`](https://github.com/apache/spark/commit/eece6ae77df9dfbf8b6c62d7f54c796bd2180175).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter
AmplabJenkins removed a comment on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter URL: https://github.com/apache/spark/pull/26993#issuecomment-571313973 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20981/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter
AmplabJenkins removed a comment on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter URL: https://github.com/apache/spark/pull/26993#issuecomment-571313967 Merged build finished. Test PASSed.
[GitHub] [spark] tgravescs commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference
tgravescs commented on issue #27053: [WIP][SPARK-27495][Core][YARN][k8s] Stage Level Scheduling code for reference URL: https://github.com/apache/spark/pull/27053#issuecomment-571314271 I added in the pyspark support
[GitHub] [spark] AmplabJenkins commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter
AmplabJenkins commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter URL: https://github.com/apache/spark/pull/26993#issuecomment-571313967 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter
AmplabJenkins commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter URL: https://github.com/apache/spark/pull/26993#issuecomment-571313973 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20981/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter
SparkQA commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter URL: https://github.com/apache/spark/pull/26993#issuecomment-571313406 **[Test build #116188 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116188/testReport)** for PR 26993 at commit [`4651b2f`](https://github.com/apache/spark/commit/4651b2fd724a56515c087903284682c9ba947c31).
[GitHub] [spark] joshrosen-stripe commented on a change in pull request #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter
joshrosen-stripe commented on a change in pull request #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter
URL: https://github.com/apache/spark/pull/26993#discussion_r363479947

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala ##

@@ -318,10 +318,33 @@ private[parquet] class ParquetRowConverter(
         new ParquetMapConverter(parquetType.asGroupType(), t, updater)

       case t: StructType =>
+        val wrappedUpdater = {
+          // SPARK-30338: avoid unnecessary InternalRow copying for nested structs:
+          if (updater.isInstanceOf[RowUpdater]) {
+            // `updater` is a RowUpdater, implying that the parent container is a struct.
+            // We do NOT need to perform defensive copying here because either:
+            //
+            // 1. The path from the schema root to this field consists only of nested

Review comment:
Yes, that's right. After thinking about this some more, I think I've come up with a clearer explanation and have updated the code comment: https://github.com/apache/spark/pull/26993/commits/4651b2fd724a56515c087903284682c9ba947c31
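The "defensive copying" mentioned in the code comment above can be illustrated with a small, Spark-free sketch. It is not the ParquetRowConverter code: `MutableRow` and `collect` are hypothetical names invented for illustration, standing in for a converter that reuses one mutable row object. The sketch only shows why a reused mutable object must be copied before it is stored in a collection such as an array.

```scala
import scala.collection.mutable.ArrayBuffer

object DefensiveCopySketch {
  // Hypothetical stand-in for a reused mutable row; this is not Spark's InternalRow.
  final class MutableRow(var value: Int) {
    def copy(): MutableRow = new MutableRow(value)
  }

  // Writes each input into a single reused row and appends that row to a buffer.
  // With copyOnAppend = false every buffer slot references the same object,
  // so all slots end up showing the last value written.
  def collect(inputs: Seq[Int], copyOnAppend: Boolean): Seq[Int] = {
    val reused = new MutableRow(0)
    val out = ArrayBuffer.empty[MutableRow]
    inputs.foreach { v =>
      reused.value = v
      out += (if (copyOnAppend) reused.copy() else reused)
    }
    out.map(_.value).toSeq
  }
}
```

Calling `DefensiveCopySketch.collect(Seq(1, 2, 3), copyOnAppend = false)` shows the aliasing problem, while `copyOnAppend = true` preserves each value. The quoted comment argues that this copy can be skipped when the parent container is a struct; the toy model above does not attempt to reproduce that reasoning.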
[GitHub] [spark] tgravescs commented on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
tgravescs commented on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-571311572

> One remaining case that isn't handled:
> Before any "all free resource" offer, all free resources are offered one by one and all not rejected.
> This case should reset the timer, but won't with current impl.

So I assume by this you mean the startup case, but I'm not sure that is true. You get an "all free resource" case when you first submitTasks. I think there are 2 cases: static allocation and dynamic allocation. Generally with static allocation you will get your executors before you start any application code, so it won't matter if it makes offers before that. With dynamic allocation you generally won't have any executors yet, so perhaps this is the case where, on submitTasks, you offer all resources but there are no offers because there are no executors yet. Which case are you referring to?
[GitHub] [spark] tinhto-000 commented on a change in pull request #26955: [SPARK-30310] [Core] Resolve missing match case in SparkUncaughtExceptionHandler and added tests
tinhto-000 commented on a change in pull request #26955: [SPARK-30310] [Core] Resolve missing match case in SparkUncaughtExceptionHandler and added tests
URL: https://github.com/apache/spark/pull/26955#discussion_r363475923

## File path: core/src/main/scala/org/apache/spark/util/SparkUncaughtExceptionHandler.scala ##

@@ -48,11 +48,17 @@ private[spark] class SparkUncaughtExceptionHandler(val exitOnUncaughtException:
             System.exit(SparkExitCode.OOM)
           case _ if exitOnUncaughtException =>
             System.exit(SparkExitCode.UNCAUGHT_EXCEPTION)
+          case _ =>
+            // SPARK-30310: Don't System.exit() when exitOnUncaughtException is false
         }
       }
     } catch {
-      case oom: OutOfMemoryError => Runtime.getRuntime.halt(SparkExitCode.OOM)
-      case t: Throwable => Runtime.getRuntime.halt(SparkExitCode.UNCAUGHT_EXCEPTION_TWICE)
+      case oom: OutOfMemoryError =>
+        logError(s"Uncaught OutOfMemoryError in thread $thread, process halted.", oom)

Review comment:
Thanks for the comment. The reason for the logError is that it wasn't obvious to users or devs why the worker would just disappear and show as DEAD on the UI, and there was nothing in the worker log file to tell what happened. We couldn't find out why until we set SPARK_NO_DAEMONIZE=1 and examined the exit code. Is there any alternative way to indicate that the process halted unexpectedly?
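The "missing match case" that the pull request title refers to can be sketched in plain Scala. This is an assumption-laden toy, not the handler itself: `ExitDecisionSketch`, `decide`, `decideWithoutDefault`, and the two numeric codes are hypothetical and only mirror the shape of the match in the diff above. Instead of calling `System.exit`, the functions return the exit code they would use, so the behaviour is easy to inspect.

```scala
object ExitDecisionSketch {
  // Placeholder exit codes for illustration only; they are not taken from the quoted diff.
  val OomCode = 52
  val UncaughtCode = 50

  // Shape of the match before the fix: a non-OOM throwable with
  // exitOnUncaught = false matches no case, so the match throws scala.MatchError.
  def decideWithoutDefault(t: Throwable, exitOnUncaught: Boolean): Option[Int] =
    t match {
      case _: OutOfMemoryError => Some(OomCode)
      case _ if exitOnUncaught => Some(UncaughtCode)
    }

  // Shape of the match after the fix: the catch-all case means "do not exit".
  def decide(t: Throwable, exitOnUncaught: Boolean): Option[Int] =
    t match {
      case _: OutOfMemoryError => Some(OomCode)
      case _ if exitOnUncaught => Some(UncaughtCode)
      case _ => None
    }
}
```

With `exitOnUncaught = false` and an ordinary `RuntimeException`, `decideWithoutDefault` throws a `MatchError`, whereas `decide` returns `None`. In the real handler that `MatchError` would presumably be caught by the outer `catch` block shown in the diff, which is consistent with the unexplained halt described in the review comment, though the sketch does not model that part.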
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
URL: https://github.com/apache/spark/pull/26440#discussion_r363471020

## File path: core/src/test/scala/org/apache/spark/scheduler/WorkerDecommissionSuite.scala ##

@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler
+
+import scala.concurrent.TimeoutException
+import scala.concurrent.duration._
+
+import org.apache.spark.{LocalSparkContext, SparkConf, SparkContext, SparkException, SparkFunSuite}
+import org.apache.spark.internal.config
+import org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend
+import org.apache.spark.util.{RpcUtils, SerializableBuffer, ThreadUtils}
+
+class WorkerDecommissionSuite extends SparkFunSuite with LocalSparkContext {
+
+
+  override def beforeEach(): Unit = {
+    val conf = new SparkConf().setAppName("test").setMaster("local")
+      .set(config.Worker.WORKER_DECOMMISSION_ENABLED.key, "true")
+
+    sc = new SparkContext("local-cluster[2, 1, 1024]", "test", conf)
+  }
+
+  test("verify task with no decommissioning works as expected") {
+    val input = sc.parallelize(1 to 10)
+    input.count()
+    val sleepyRdd = input.mapPartitions{ x =>
+      Thread.sleep(100)
+      x
+    }
+    assert(sleepyRdd.count() === 10)
+  }
+
+  test("verify a task with all workers decommissioned succeeds") {
+    val input = sc.parallelize(1 to 10)
+    // Do a count to wait for the executors to be registered.
+    input.count()
+    val sleepyRdd = input.mapPartitions{ x =>
+      Thread.sleep(100)
+      x
+    }
+    // Start the task.
+    val asyncCount = sleepyRdd.countAsync()
+    // Give the job long enough to start.
+    Thread.sleep(20)
+    // Decommission all the executors, this should not halt the current task.
+    // The master passing message is tested with

Review comment:
tested with?
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#discussion_r363470570 ## File path: core/src/test/scala/org/apache/spark/scheduler/WorkerDecommissionSuite.scala ## @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.scheduler + +import scala.concurrent.TimeoutException +import scala.concurrent.duration._ + +import org.apache.spark.{LocalSparkContext, SparkConf, SparkContext, SparkException, SparkFunSuite} +import org.apache.spark.internal.config +import org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend +import org.apache.spark.util.{RpcUtils, SerializableBuffer, ThreadUtils} + +class WorkerDecommissionSuite extends SparkFunSuite with LocalSparkContext { + Review comment: too many empty lines This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#discussion_r363473266 ## File path: resource-managers/kubernetes/docker/src/main/dockerfiles/spark/decom.sh ## @@ -0,0 +1,38 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + + +set -ex +export LOG=/dev/termination-log Review comment: `/dev/` looks like a very weird place for a log file. In fact, is this log file useful at all? Won't it go away as soon as the container stops? (I looked at the k8s page around these event handlers but it doesn't seem to explain where the output of these commands ends up.)
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#discussion_r363470002 ## File path: core/src/main/scala/org/apache/spark/util/SignalUtils.scala ## @@ -60,10 +60,11 @@ private[spark] object SignalUtils extends Logging { if (SystemUtils.IS_OS_UNIX) { try { val handler = handlers.getOrElseUpdate(signal, { - logInfo("Registered signal handler for " + signal) + logInfo("Registering signal handler for " + signal) new ActionHandler(new Signal(signal)) }) handler.register(action) +logInfo("Registered signal handler for " + signal) Review comment: This seems unnecessary. If registration fails you'll get an error message from the exception handler below, no? Then the previous message is enough for the success case. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#discussion_r363471925 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala ## @@ -55,6 +55,9 @@ private[spark] abstract class KubernetesConf(val sparkConf: SparkConf) { } } + def workerDecommissioning: Boolean = Review comment: I'd avoid adding getters for simple configs. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#discussion_r363470718 ## File path: core/src/test/scala/org/apache/spark/scheduler/WorkerDecommissionSuite.scala ## @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.scheduler + +import scala.concurrent.TimeoutException +import scala.concurrent.duration._ + +import org.apache.spark.{LocalSparkContext, SparkConf, SparkContext, SparkException, SparkFunSuite} +import org.apache.spark.internal.config +import org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend +import org.apache.spark.util.{RpcUtils, SerializableBuffer, ThreadUtils} + +class WorkerDecommissionSuite extends SparkFunSuite with LocalSparkContext { + + + override def beforeEach(): Unit = { +val conf = new SparkConf().setAppName("test").setMaster("local") + .set(config.Worker.WORKER_DECOMMISSION_ENABLED.key, "true") Review comment: ` .set(config.Worker.WORKER_DECOMMISSION_ENABLED, true)` This is an automated message from the Apache Git Service. 
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#discussion_r363471618 ## File path: core/src/test/scala/org/apache/spark/scheduler/WorkerDecommissionSuite.scala ## @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.scheduler + +import scala.concurrent.TimeoutException +import scala.concurrent.duration._ + +import org.apache.spark.{LocalSparkContext, SparkConf, SparkContext, SparkException, SparkFunSuite} +import org.apache.spark.internal.config +import org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend +import org.apache.spark.util.{RpcUtils, SerializableBuffer, ThreadUtils} + +class WorkerDecommissionSuite extends SparkFunSuite with LocalSparkContext { + + + override def beforeEach(): Unit = { +val conf = new SparkConf().setAppName("test").setMaster("local") + .set(config.Worker.WORKER_DECOMMISSION_ENABLED.key, "true") + +sc = new SparkContext("local-cluster[2, 1, 1024]", "test", conf) + } + + test("verify task with no decommissioning works as expected") { +val input = sc.parallelize(1 to 10) +input.count() +val sleepyRdd = input.mapPartitions{ x => + Thread.sleep(100) + x +} +assert(sleepyRdd.count() === 10) + } + + test("verify a task with all workers decommissioned succeeds") { +val input = sc.parallelize(1 to 10) +// Do a count to wait for the executors to be registered. +input.count() +val sleepyRdd = input.mapPartitions{ x => + Thread.sleep(100) + x +} +// Start the task. +val asyncCount = sleepyRdd.countAsync() +// Give the job long enough to start. +Thread.sleep(20) Review comment: 20ms is enough? I recommend installing a listener and waiting on the job start (or the first task start) event. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
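The suggestion above (wait on a listener event instead of sleeping for a fixed 20 ms) might look roughly like the following sketch. It assumes Spark is on the classpath and uses a `local[2]` master rather than the `local-cluster` setup from the test; `WaitForTaskStart`, `run`, and the 30-second timeouts are hypothetical choices, and the decommissioning step itself is left as a comment because its API is still under review in this PR.

```scala
import java.util.concurrent.{Semaphore, TimeUnit}

import scala.concurrent.Await
import scala.concurrent.duration._

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskStart}

object WaitForTaskStart {
  def run(): Long = {
    val conf = new SparkConf().setAppName("listener-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Released once per started task; the test thread blocks on it below.
      val taskStarted = new Semaphore(0)
      sc.addSparkListener(new SparkListener {
        override def onTaskStart(taskStart: SparkListenerTaskStart): Unit = {
          taskStarted.release()
        }
      })

      val sleepyRdd = sc.parallelize(1 to 10).mapPartitions { x =>
        Thread.sleep(100)
        x
      }
      val asyncCount = sleepyRdd.countAsync()

      // Wait for the first task start event instead of guessing a sleep duration.
      require(taskStarted.tryAcquire(30L, TimeUnit.SECONDS), "no task started within 30 seconds")

      // The decommissioning call from the test would go here.

      Await.result(asyncCount, 30.seconds)
    } finally {
      sc.stop()
    }
  }
}
```

Blocking on the semaphore ties the test to an observed scheduler event, so it neither races on slow machines nor waits longer than necessary on fast ones.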
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#discussion_r363472016 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala ## @@ -192,6 +193,21 @@ private[spark] class BasicExecutorFeatureStep( .endResources() .build() }.getOrElse(executorContainer) +val containerWithLifecycle = kubernetesConf.workerDecommissioning match { + case false => Review comment: if / else for a simple boolean This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
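As a minimal sketch of the `if` / `else` form suggested here, assuming simplified stand-ins for the types in the diff: `Container`, `preStopHook`, `withLifecycle`, and the script path are hypothetical and are not the fabric8 `ContainerBuilder` API used in the PR.

```scala
object LifecycleChoice {
  // Hypothetical, simplified stand-in for the container being built.
  final case class Container(name: String, preStopHook: Option[String] = None)

  // A plain Boolean reads more directly as if / else than as a pattern match
  // with `case true` and `case false`.
  def withLifecycle(base: Container, decommissionEnabled: Boolean): Container =
    if (decommissionEnabled) {
      // Hypothetical script path, for illustration only.
      base.copy(preStopHook = Some("/opt/decom.sh"))
    } else {
      base
    }
}
```

Both branches still return a `Container`, so the result can be bound to a `val` exactly as in the `match` version.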
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#discussion_r363474668 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala ## @@ -192,6 +193,21 @@ private[spark] class BasicExecutorFeatureStep( .endResources() .build() }.getOrElse(executorContainer) +val containerWithLifecycle = kubernetesConf.workerDecommissioning match { + case false => +logInfo("Decommissioning not enabled, skipping shutdown script") +containerWithLimitCores + case true => +logInfo("Adding decommission script to lifecycle") +new ContainerBuilder(containerWithLimitCores).withNewLifecycle() + .withNewPreStop() Review comment: Will this get triggered when Spark itself stops the executor (i.e. when you turn on dynamic allocation)? Does the code behave as it should in that case? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#discussion_r363471411 ## File path: core/src/test/scala/org/apache/spark/scheduler/WorkerDecommissionSuite.scala ## @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.scheduler + +import scala.concurrent.TimeoutException +import scala.concurrent.duration._ + +import org.apache.spark.{LocalSparkContext, SparkConf, SparkContext, SparkException, SparkFunSuite} +import org.apache.spark.internal.config +import org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend +import org.apache.spark.util.{RpcUtils, SerializableBuffer, ThreadUtils} + +class WorkerDecommissionSuite extends SparkFunSuite with LocalSparkContext { + + + override def beforeEach(): Unit = { +val conf = new SparkConf().setAppName("test").setMaster("local") + .set(config.Worker.WORKER_DECOMMISSION_ENABLED.key, "true") + +sc = new SparkContext("local-cluster[2, 1, 1024]", "test", conf) + } + + test("verify task with no decommissioning works as expected") { +val input = sc.parallelize(1 to 10) +input.count() +val sleepyRdd = input.mapPartitions{ x => + Thread.sleep(100) + x +} +assert(sleepyRdd.count() === 10) + } + + test("verify a task with all workers decommissioned succeeds") { +val input = sc.parallelize(1 to 10) +// Do a count to wait for the executors to be registered. +input.count() +val sleepyRdd = input.mapPartitions{ x => + Thread.sleep(100) + x +} +// Start the task. +val asyncCount = sleepyRdd.countAsync() +// Give the job long enough to start. +Thread.sleep(20) +// Decommission all the executors, this should not halt the current task. 
+// The master passing message is tested with +val sched = sc.schedulerBackend.asInstanceOf[StandaloneSchedulerBackend] +val execs = sched.getExecutorIds() +execs.foreach(execId => sched.decommissionExecutor(execId)) +val asyncCountResult = ThreadUtils.awaitResult(asyncCount, 10.seconds) +assert(asyncCountResult === 10) +// Try and launch task after decommissioning, this should fail +val postDecommissioned = input.map(x => x) +val postDecomAsyncCount = postDecommissioned.countAsync() +val thrown = intercept[java.util.concurrent.TimeoutException]{ + val result = ThreadUtils.awaitResult(postDecomAsyncCount, 10.seconds) +} Review comment: I'd look for a better way to check this. This test will intentionally wait 10 seconds doing nothing in the success case. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#discussion_r363469587 ## File path: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala ## @@ -402,6 +408,27 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp scheduler.workerRemoved(workerId, host, message) } +/** + * Mark a given executor as decommissioned and stop making resource offers for it. + */ +private def decommissionExecutor(executorId: String): Boolean = { + val shouldDisable = CoarseGrainedSchedulerBackend.this.synchronized { +// Only bother decommissioning executors which are alive. +if (isExecutorActive(executorId)) { + executorsPendingDecommission += executorId Review comment: I see you adding executors to this set, but I didn't notice anywhere that removes an executor from it.
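The bookkeeping issue raised here can be sketched in isolation, assuming a simplified tracker rather than the real `CoarseGrainedSchedulerBackend`: `Tracker`, `register`, `decommission`, `remove`, and `pendingCount` are hypothetical names. The point is only that whatever path removes an executor should also clear its entry from the pending-decommission set, otherwise the set grows for the lifetime of the application.

```scala
import scala.collection.mutable

object DecommissionTracking {
  // Hypothetical, simplified stand-in for the scheduler backend's bookkeeping.
  final class Tracker {
    private val active = mutable.HashSet.empty[String]
    private val pendingDecommission = mutable.HashSet.empty[String]

    def register(executorId: String): Unit = synchronized {
      active += executorId
    }

    // Only executors that are still active are marked; returns whether it was marked.
    def decommission(executorId: String): Boolean = synchronized {
      if (active.contains(executorId)) {
        pendingDecommission += executorId
        true
      } else {
        false
      }
    }

    // Called when the executor is finally lost or removed: clear both sets.
    def remove(executorId: String): Unit = synchronized {
      active -= executorId
      pendingDecommission -= executorId
    }

    def pendingCount: Int = synchronized { pendingDecommission.size }
  }
}
```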
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#discussion_r363467451 ## File path: core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala ## @@ -140,6 +144,16 @@ private[spark] class CoarseGrainedExecutorBackend( if (executor == null) { exitExecutor(1, "Received LaunchTask command but executor was null") } else { +if (decommissioned) { + logError("Asked to launch a task while decommissioned.") + driver match { +case Some(endpoint) => Review comment: I think that instead of doing this here, it should be done in `onStart` where the driver reference is created. That means the decommission message is sent to the driver as soon as possible after the signal arrives, instead of waiting for the driver to try to use the executor for something. (That also means this block can go away and you can just keep the log message in `Executor.scala`.) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#discussion_r363468030 ## File path: core/src/main/scala/org/apache/spark/scheduler/ExecutorLossReason.scala ## @@ -58,3 +58,11 @@ private [spark] object LossReasonPending extends ExecutorLossReason("Pending los private[spark] case class SlaveLost(_message: String = "Slave lost", workerLost: Boolean = false) extends ExecutorLossReason(_message) + +/** + * A loss reason that means the worker is marked for decommissioning. Review comment: s/worker/executor This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on a change in pull request #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#discussion_r363472397 ## File path: resource-managers/kubernetes/docker/src/main/dockerfiles/spark/decom.sh ## @@ -0,0 +1,38 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + + +set -ex +export LOG=/dev/termination-log +echo "Asked to decommission" > ${LOG} +# Find the pid to signal +date | tee -a ${LOG} +WORKER_PID=$(ps axf | grep java |grep org.apache.spark.executor.CoarseGrainedExecutorBackend | grep -v grep) Review comment: nit: space after `|` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27096: [SPARK-28148][SQL] Repartition after join is not optimized away
AmplabJenkins commented on issue #27096: [SPARK-28148][SQL] Repartition after join is not optimized away URL: https://github.com/apache/spark/pull/27096#issuecomment-571305449 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27096: [SPARK-28148][SQL] Repartition after join is not optimized away
AmplabJenkins removed a comment on issue #27096: [SPARK-28148][SQL] Repartition after join is not optimized away URL: https://github.com/apache/spark/pull/27096#issuecomment-571305460 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20980/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #27096: [SPARK-28148][SQL] Repartition after join is not optimized away
AmplabJenkins commented on issue #27096: [SPARK-28148][SQL] Repartition after join is not optimized away URL: https://github.com/apache/spark/pull/27096#issuecomment-571305460 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20980/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27096: [SPARK-28148][SQL] Repartition after join is not optimized away
AmplabJenkins removed a comment on issue #27096: [SPARK-28148][SQL] Repartition after join is not optimized away URL: https://github.com/apache/spark/pull/27096#issuecomment-571305449 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] zero323 commented on a change in pull request #27109: [SPARK-30434][PYTHON][SQL] Move pandas related functionalities into 'pandas' sub-package
zero323 commented on a change in pull request #27109: [SPARK-30434][PYTHON][SQL] Move pandas related functionalities into 'pandas' sub-package URL: https://github.com/apache/spark/pull/27109#discussion_r363472878 ## File path: python/pyspark/sql/dataframe.py ## @@ -31,23 +31,23 @@ from pyspark import copy_func, since, _NoValue from pyspark.rdd import RDD, _load_from_socket, _local_iterator_from_socket, \ -ignore_unicode_prefix, PythonEvalType -from pyspark.serializers import ArrowCollectSerializer, BatchedSerializer, PickleSerializer, \ +ignore_unicode_prefix +from pyspark.serializers import BatchedSerializer, PickleSerializer, \ UTF8Deserializer from pyspark.storagelevel import StorageLevel from pyspark.traceback_utils import SCCallSiteSync from pyspark.sql.types import _parse_datatype_json_string from pyspark.sql.column import Column, _to_seq, _to_list, _to_java_column from pyspark.sql.readwriter import DataFrameWriter from pyspark.sql.streaming import DataStreamWriter -from pyspark.sql.types import IntegralType from pyspark.sql.types import * -from pyspark.util import _exception_message +from pyspark.sql.pandas.conversion import PandasConversionMixin +from pyspark.sql.pandas.map_ops import PandasMapOpsMixin __all__ = ["DataFrame", "DataFrameNaFunctions", "DataFrameStatFunctions"] -class DataFrame(object): +class DataFrame(PandasMapOpsMixin, PandasConversionMixin): Review comment: In general I am trying to get a better feeling of overall purpose of such refactoring. As for now there is no indication that any of these mixins will be ever used outside the current context (`DataFrame` and `GroupedData`). That impression is further enforced by explicit type checks ([here](https://github.com/apache/spark/blob/cfd78393e76f454503e7cf5416f6d56f1efffd0a/python/pyspark/sql/pandas/group_ops.py#L96) and [here](https://github.com/apache/spark/blob/cfd78393e76f454503e7cf5416f6d56f1efffd0a/python/pyspark/sql/pandas/map_ops.py#L64)). 
So this doesn't really seem like a canonical use of a mixin, especially when the core `DataFrame` is not designed for extensibility.

> Ah you mean API usages like:
>
>     df.pandas.mapInPandas(...)

That's one possible approach, though not the one I was thinking about. I assumed that the point is maintainability (though I am not sure, as the amount of code moved, excluding docs, messages, and some static stuff, is negligible and tightly coupled with `DataFrame` anyway). So a possible approach is either direct:

    def __init__(self, ...):
        ...
        self._pandasMapOpsMixin = PandasMapOpsMixin(self)
        ...

    def mapInPandas(self, udf):
        return self._pandasMapOpsMixin.mapInPandas(udf)

or indirect (by overriding `__getattr__`).
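To make the two alternatives in the comment above concrete, here is a minimal, self-contained sketch of composition with direct and indirect delegation. It is only an illustration: `PandasMapOps`, `DirectFrame`, `IndirectFrame`, the `rows` attribute, and `map_in_pandas` are hypothetical stand-ins, not PySpark classes and not the code proposed in the pull request.

```python
class PandasMapOps:
    """Hypothetical helper that owns the pandas-related operations."""

    def __init__(self, df):
        self._df = df

    def map_in_pandas(self, func):
        # Stand-in for the real logic: apply func to each row of the owner.
        return [func(row) for row in self._df.rows]


class DirectFrame:
    """Direct delegation: each public method is forwarded explicitly."""

    def __init__(self, rows):
        self.rows = list(rows)
        self._pandas_map_ops = PandasMapOps(self)

    def map_in_pandas(self, func):
        return self._pandas_map_ops.map_in_pandas(func)


class IndirectFrame:
    """Indirect delegation: unknown attributes are looked up on the helper."""

    def __init__(self, rows):
        self.rows = list(rows)
        self._pandas_map_ops = PandasMapOps(self)

    def __getattr__(self, name):
        # __getattr__ runs only after normal lookup fails, so attributes
        # defined on the frame itself always take precedence.
        if name.startswith("_"):
            # Never forward private names; this also avoids infinite
            # recursion if _pandas_map_ops has not been set yet.
            raise AttributeError(name)
        return getattr(self._pandas_map_ops, name)


direct_result = DirectFrame([1, 2, 3]).map_in_pandas(lambda x: x * 2)
indirect_result = IndirectFrame([1, 2, 3]).map_in_pandas(lambda x: x + 1)
```

The direct form keeps the public surface explicit and easy to document, at the cost of one forwarding method per operation. The `__getattr__` form needs no forwarding code, but the forwarded methods are harder to discover and every public attribute of the helper becomes reachable from the frame.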
[GitHub] [spark] SparkQA commented on issue #27096: [SPARK-28148][SQL] Repartition after join is not optimized away
SparkQA commented on issue #27096: [SPARK-28148][SQL] Repartition after join is not optimized away URL: https://github.com/apache/spark/pull/27096#issuecomment-571304820 **[Test build #116187 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116187/testReport)** for PR 27096 at commit [`ff573e8`](https://github.com/apache/spark/commit/ff573e864c8271d157acd2ae3b62de5aeb03117a).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27096: [SPARK-28148][CORE]: repartition after join is not optimized away
AmplabJenkins removed a comment on issue #27096: [SPARK-28148][CORE]: repartition after join is not optimized away URL: https://github.com/apache/spark/pull/27096#issuecomment-570843855 Can one of the admins verify this patch?
[GitHub] [spark] dbtsai commented on issue #27096: [SPARK-28148][CORE]: repartition after join is not optimized away
dbtsai commented on issue #27096: [SPARK-28148][CORE]: repartition after join is not optimized away URL: https://github.com/apache/spark/pull/27096#issuecomment-571303226 Jenkins, ok to test.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26956: [SPARK-30312][SQL] Preserve path permission and acl when truncate table
AmplabJenkins removed a comment on issue #26956: [SPARK-30312][SQL] Preserve path permission and acl when truncate table URL: https://github.com/apache/spark/pull/26956#issuecomment-571302669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116181/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26956: [SPARK-30312][SQL] Preserve path permission and acl when truncate table
AmplabJenkins commented on issue #26956: [SPARK-30312][SQL] Preserve path permission and acl when truncate table URL: https://github.com/apache/spark/pull/26956#issuecomment-571302669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116181/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26956: [SPARK-30312][SQL] Preserve path permission and acl when truncate table
AmplabJenkins commented on issue #26956: [SPARK-30312][SQL] Preserve path permission and acl when truncate table URL: https://github.com/apache/spark/pull/26956#issuecomment-571302657 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26956: [SPARK-30312][SQL] Preserve path permission and acl when truncate table
AmplabJenkins removed a comment on issue #26956: [SPARK-30312][SQL] Preserve path permission and acl when truncate table URL: https://github.com/apache/spark/pull/26956#issuecomment-571302657 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #26956: [SPARK-30312][SQL] Preserve path permission and acl when truncate table
SparkQA removed a comment on issue #26956: [SPARK-30312][SQL] Preserve path permission and acl when truncate table URL: https://github.com/apache/spark/pull/26956#issuecomment-571213066 **[Test build #116181 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116181/testReport)** for PR 26956 at commit [`39fe234`](https://github.com/apache/spark/commit/39fe2343dbf332f12be95de2e845a8e197a87f73).
[GitHub] [spark] SparkQA commented on issue #26956: [SPARK-30312][SQL] Preserve path permission and acl when truncate table
SparkQA commented on issue #26956: [SPARK-30312][SQL] Preserve path permission and acl when truncate table URL: https://github.com/apache/spark/pull/26956#issuecomment-571302080 **[Test build #116181 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116181/testReport)** for PR 26956 at commit [`39fe234`](https://github.com/apache/spark/commit/39fe2343dbf332f12be95de2e845a8e197a87f73). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26682: [SPARK-29306][CORE] Stage Level Sched: Executors need to track what ResourceProfile they are created with
AmplabJenkins removed a comment on issue #26682: [SPARK-29306][CORE] Stage Level Sched: Executors need to track what ResourceProfile they are created with URL: https://github.com/apache/spark/pull/26682#issuecomment-571299651 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26682: [SPARK-29306][CORE] Stage Level Sched: Executors need to track what ResourceProfile they are created with
AmplabJenkins commented on issue #26682: [SPARK-29306][CORE] Stage Level Sched: Executors need to track what ResourceProfile they are created with URL: https://github.com/apache/spark/pull/26682#issuecomment-571299651 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26682: [SPARK-29306][CORE] Stage Level Sched: Executors need to track what ResourceProfile they are created with
AmplabJenkins removed a comment on issue #26682: [SPARK-29306][CORE] Stage Level Sched: Executors need to track what ResourceProfile they are created with URL: https://github.com/apache/spark/pull/26682#issuecomment-571299660 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20979/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26682: [SPARK-29306][CORE] Stage Level Sched: Executors need to track what ResourceProfile they are created with
AmplabJenkins commented on issue #26682: [SPARK-29306][CORE] Stage Level Sched: Executors need to track what ResourceProfile they are created with URL: https://github.com/apache/spark/pull/26682#issuecomment-571299660 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20979/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26682: [SPARK-29306][CORE] Stage Level Sched: Executors need to track what ResourceProfile they are created with
SparkQA commented on issue #26682: [SPARK-29306][CORE] Stage Level Sched: Executors need to track what ResourceProfile they are created with URL: https://github.com/apache/spark/pull/26682#issuecomment-571299086 **[Test build #116186 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116186/testReport)** for PR 26682 at commit [`6cb7023`](https://github.com/apache/spark/commit/6cb7023426b6f9d0806df07b5d58867ae472a04d).
[GitHub] [spark] dongjoon-hyun commented on issue #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost
dongjoon-hyun commented on issue #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost URL: https://github.com/apache/spark/pull/24457#issuecomment-571294006 While preparing the 2.4.5 release, I noticed that this was closed recently, and we might need to fix the underlying issue. The test case failed on both `master` and `branch-2.4`. If watermarks are ignored, the internal state grows indefinitely. What do you think about the reported issue, @tdas, @zsxwing, @cloud-fan, @HeartSaVioR, @gatorsmile?
[GitHub] [spark] AmplabJenkins removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
AmplabJenkins removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571284825 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20978/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost
AmplabJenkins removed a comment on issue #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost URL: https://github.com/apache/spark/pull/24457#issuecomment-571283979 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
AmplabJenkins removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571284808 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
AmplabJenkins commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571284825 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20978/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
AmplabJenkins commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571284808 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost
AmplabJenkins commented on issue #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost URL: https://github.com/apache/spark/pull/24457#issuecomment-571284551 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
SparkQA commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571284101 **[Test build #116185 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116185/testReport)** for PR 26890 at commit [`d9eb441`](https://github.com/apache/spark/commit/d9eb4410d5c77234746d13132f3cad5aa6092d1d).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost
AmplabJenkins removed a comment on issue #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost URL: https://github.com/apache/spark/pull/24457#issuecomment-531894048 Can one of the admins verify this patch?
[GitHub] [spark] viirya commented on a change in pull request #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter
viirya commented on a change in pull request #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter URL: https://github.com/apache/spark/pull/26993#discussion_r363451729
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala ##
@@ -318,10 +318,33 @@ private[parquet] class ParquetRowConverter(
         new ParquetMapConverter(parquetType.asGroupType(), t, updater)
       case t: StructType =>
+        val wrappedUpdater = {
+          // SPARK-30338: avoid unnecessary InternalRow copying for nested structs:
+          if (updater.isInstanceOf[RowUpdater]) {
+            // `updater` is a RowUpdater, implying that the parent container is a struct.
+            // We do NOT need to perform defensive copying here because either:
+            //
+            // 1. The path from the schema root to this field consists only of nested
Review comment: When we have a deeply nested struct inside an array, is that the first case here? I think it is fine because, at the element converter, the top-level struct inside an array element will do the defensive copying. So in the nested struct converter we will see a RowUpdater from the parent struct and don't need defensive copying either. It might be good to also mention this in the doc comment.
[GitHub] [spark] AmplabJenkins commented on issue #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost
AmplabJenkins commented on issue #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost URL: https://github.com/apache/spark/pull/24457#issuecomment-571283979 Can one of the admins verify this patch?
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost
dongjoon-hyun commented on a change in pull request #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost URL: https://github.com/apache/spark/pull/24457#discussion_r363451556
## File path: sql/core/src/test/scala/org/apache/spark/sql/streaming/EventTimeWatermarkSuite.scala ##
@@ -591,6 +591,17 @@ class EventTimeWatermarkSuite extends StreamTest with BeforeAndAfter with Matche
     }
   }
+  test("SPARK-27340: Alias on TimeWindow expression may cause watermark metadata lost") {
+    val inputData = MemoryStream[Int]
+    val aliasWindow = inputData.toDF()
+      .withColumn("eventTime", $"value".cast("timestamp"))
+      .withWatermark("eventTime", "10 seconds")
+      .select(window($"eventTime", "5 seconds") as 'aliasWindow)
+
+    assert(aliasWindow.logicalPlan.output.exists(
+      _.metadata.contains(EventTimeWatermark.delayKey)))
Review comment: Since this test case seems to fail on the master branch (as of today), the issue still seems to exist.
[GitHub] [spark] LiangchangZ opened a new pull request #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost
LiangchangZ opened a new pull request #24457: [SPARK-27340][SS] Alias on TimeWindow expression may cause watermark metadata lost URL: https://github.com/apache/spark/pull/24457
## What changes were proposed in this pull request?
`window($"fooTime", "2 seconds").alias("fooWindow")` generates the expression tree `Alias(fooWindow) <- TimeWindow`. After analysis by the TimeWindowing rule, the tree becomes `Alias(fooWindow) <- Alias(window) <- Window(start, end)`. The `Alias(window)` gets the watermark metadata when it is created:
```
val windowStruct = Alias(getWindow(0, 1), WINDOW_COL_NAME)(
  exprId = windowAttr.exprId, explicitMetadata = Some(metadata))
```
but `Alias(fooWindow)` is created before the TimeWindowing rule takes effect. Its code path is:
```
...
case ne: NamedExpression => Alias(expr, alias)(explicitMetadata = Some(ne.metadata))
...
```
Before the TimeWindowing rule takes effect, `ne.metadata` is None, which causes the watermark metadata to be lost.
We make `def name(alias: String)` return an `Alias` that gets its metadata from its child automatically when metadata is not specified explicitly.
Thanks to @LinhongLiu for helping analyze this problem!
## How was this patch tested?
Added a unit test, and ran the example from the JIRA as an integration test: it completes successfully and no longer throws org.apache.spark.sql.AnalysisException.
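The metadata loss described in the pull request above can be illustrated with a toy model. This is a rough sketch in Python, not Catalyst code: `Expr`, `Alias`, `rewrite_child`, and the `watermark_delay_ms` key are hypothetical stand-ins chosen for the example. It contrasts an alias that snapshots its child's metadata at creation time with one that derives metadata from its current child when none was given explicitly.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the watermark delay metadata key.
DELAY_KEY = "watermark_delay_ms"


@dataclass(frozen=True)
class Expr:
    """Toy leaf expression carrying its own metadata."""
    name: str
    metadata: dict


@dataclass(frozen=True)
class Alias:
    """Toy alias: uses explicit metadata if given, else the child's."""
    child: object
    name: str
    explicit_metadata: Optional[dict] = None

    @property
    def metadata(self):
        if self.explicit_metadata is not None:
            return self.explicit_metadata
        return self.child.metadata


def rewrite_child(alias, new_child):
    # Models an analyzer rule that replaces the child under an alias
    # while leaving the alias's own fields untouched.
    return Alias(new_child, alias.name, alias.explicit_metadata)


# Before analysis the window expression has no watermark metadata yet.
time_window = Expr("TimeWindow", {})

# Eager: metadata is snapshotted when the user alias is created.
eager = Alias(time_window, "fooWindow", explicit_metadata=dict(time_window.metadata))
# Lazy: no explicit metadata, so it is read from the current child.
lazy = Alias(time_window, "fooWindow")

# The rule produces an inner alias that carries the watermark metadata.
analyzed = Alias(Expr("Window(start, end)", {}), "window",
                 explicit_metadata={DELAY_KEY: 10000})

eager_after = rewrite_child(eager, analyzed)
lazy_after = rewrite_child(lazy, analyzed)
```

In this model `eager_after.metadata` stays empty because the snapshot was taken too early, while `lazy_after.metadata` picks up the delay key from the rewritten child, which mirrors the behavior the pull request describes for `def name(alias: String)`.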
[GitHub] [spark] vanzin commented on issue #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
vanzin commented on issue #26440: [WIP][SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support URL: https://github.com/apache/spark/pull/26440#issuecomment-571282142
> recomissioning is something I considered out of scope
Recommissioning might be tricky, but perhaps it would be good to have a fail-safe for the executors to exit by themselves (or be killed by the driver) if the decommission doesn't really happen? At least the YARN API documentation leaves that behavior as a possibility.
[GitHub] [spark] gengliangwang commented on a change in pull request #24938: [SPARK-27946][SQL] Hive DDL to Spark DDL conversion USING "show create table"
gengliangwang commented on a change in pull request #24938: [SPARK-27946][SQL] Hive DDL to Spark DDL conversion USING "show create table" URL: https://github.com/apache/spark/pull/24938#discussion_r363444748
## File path: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ##
@@ -196,7 +196,7 @@ statement
     | SHOW PARTITIONS multipartIdentifier partitionSpec? #showPartitions
     | SHOW identifier? FUNCTIONS (LIKE? (multipartIdentifier | pattern=STRING))? #showFunctions
-    | SHOW CREATE TABLE multipartIdentifier #showCreateTable
+    | SHOW CREATE TABLE multipartIdentifier (AS SPARK)? #showCreateTable
Review comment: +1. The new proposal makes more sense!
[GitHub] [spark] joshrosen-stripe commented on issue #27089: [SPARK-30414][SQL] ParquetRowConverter optimizations: arrays, maps, plus misc. constant factors
joshrosen-stripe commented on issue #27089: [SPARK-30414][SQL] ParquetRowConverter optimizations: arrays, maps, plus misc. constant factors URL: https://github.com/apache/spark/pull/27089#issuecomment-571276738 @cloud-fan @HyukjinKwon @dongjoon-hyun @viirya, could you take a look at this PR which implements several small performance optimizations in `ParquetRowConverter`? These changes are aimed at improving performance when scanning very wide datasets with large numbers of columns, plus datasets with small maps and arrays. These changes are complementary but orthogonal to the changes in #26967.
[GitHub] [spark] tgravescs commented on issue #26682: [SPARK-29306][CORE] Stage Level Sched: Executors need to track what ResourceProfile they are created with
tgravescs commented on issue #26682: [SPARK-29306][CORE] Stage Level Sched: Executors need to track what ResourceProfile they are created with URL: https://github.com/apache/spark/pull/26682#issuecomment-571276795 Looks like a new test was added that I wasn't upmerged to; upmerging and looking at the failure.
[GitHub] [spark] gatorsmile commented on issue #24938: [SPARK-27946][SQL] Hive DDL to Spark DDL conversion USING "show create table"
gatorsmile commented on issue #24938: [SPARK-27946][SQL] Hive DDL to Spark DDL conversion USING "show create table" URL: https://github.com/apache/spark/pull/24938#issuecomment-571276597 cc @viirya Sorry for the late reply. Are you OK with addressing the above comment?
[GitHub] [spark] gatorsmile commented on a change in pull request #24938: [SPARK-27946][SQL] Hive DDL to Spark DDL conversion USING "show create table"
gatorsmile commented on a change in pull request #24938: [SPARK-27946][SQL] Hive DDL to Spark DDL conversion USING "show create table"
URL: https://github.com/apache/spark/pull/24938#discussion_r363444039

## File path: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4

@@ -196,7 +196,7 @@ statement
     | SHOW PARTITIONS multipartIdentifier partitionSpec? #showPartitions
     | SHOW identifier? FUNCTIONS (LIKE? (multipartIdentifier | pattern=STRING))? #showFunctions
-    | SHOW CREATE TABLE multipartIdentifier #showCreateTable
+    | SHOW CREATE TABLE multipartIdentifier (AS SPARK)? #showCreateTable

Review comment: After rethinking it, let us be more aggressive here. Instead of creating Spark native tables for the existing Hive serde tables, we can try to always show how to create Spark native tables if possible. This will further simplify the migration from Hive to Spark. For existing Spark users who prefer to keep Hive serde formats, we can introduce a new option `AS SERDE`, which keeps the behavior of Spark 2.4 and earlier.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
AmplabJenkins removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571275165 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116184/ Test FAILed.
[GitHub] [spark] joshrosen-stripe edited a comment on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter
joshrosen-stripe edited a comment on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter URL: https://github.com/apache/spark/pull/26993#issuecomment-571273025 @cloud-fan @dongjoon-hyun @viirya, could you take a look at this PR optimizing nested struct handling in `ParquetRowConverter`? I'm tagging this group because it looks like you've all helped to review recent changes to this file and I'd like some more eyes on this change.
[GitHub] [spark] SparkQA removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
SparkQA removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571274416 **[Test build #116184 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116184/testReport)** for PR 26890 at commit [`3796a3a`](https://github.com/apache/spark/commit/3796a3a2388c04e3896a57973a133fc481a64578).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
AmplabJenkins removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571275154 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
SparkQA commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571275144 **[Test build #116184 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116184/testReport)** for PR 26890 at commit [`3796a3a`](https://github.com/apache/spark/commit/3796a3a2388c04e3896a57973a133fc481a64578).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class CreateFunctionStatement(`
[GitHub] [spark] AmplabJenkins commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
AmplabJenkins commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571275154 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
AmplabJenkins commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571275165 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116184/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
SparkQA commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571274416 **[Test build #116184 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116184/testReport)** for PR 26890 at commit [`3796a3a`](https://github.com/apache/spark/commit/3796a3a2388c04e3896a57973a133fc481a64578).
[GitHub] [spark] joshrosen-stripe commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter
joshrosen-stripe commented on issue #26993: [SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRowConverter URL: https://github.com/apache/spark/pull/26993#issuecomment-571273025 @cloud-fan @dongjoon-hyun @viirya, could you take a look at this PR optimizing nested struct handling in `ParquetRowConverter`? I'm tagging this group because it looks like you've all helped to review recent changes to this file.
[GitHub] [spark] AmplabJenkins removed a comment on issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even table stats is empty
AmplabJenkins removed a comment on issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even table stats is empty URL: https://github.com/apache/spark/pull/22721#issuecomment-571272335 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116179/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even table stats is empty
AmplabJenkins removed a comment on issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even table stats is empty URL: https://github.com/apache/spark/pull/22721#issuecomment-571272321 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even table stats is empty
AmplabJenkins commented on issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even table stats is empty URL: https://github.com/apache/spark/pull/22721#issuecomment-571272321 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even table stats is empty
AmplabJenkins commented on issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even table stats is empty URL: https://github.com/apache/spark/pull/22721#issuecomment-571272335 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116179/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
AmplabJenkins removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571271761 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
AmplabJenkins removed a comment on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571271770 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20977/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
AmplabJenkins commented on issue #26890: [SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution URL: https://github.com/apache/spark/pull/26890#issuecomment-571271761 Merged build finished. Test PASSed.