[GitHub] [spark] AmplabJenkins removed a comment on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
AmplabJenkins removed a comment on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635628412 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123248/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maryannxue commented on pull request #28668: [SPARK-31862] Remove exception wrapping in AQE
maryannxue commented on pull request #28668: URL: https://github.com/apache/spark/pull/28668#issuecomment-635628548 cc @cloud-fan @Ngone51
[GitHub] [spark] SparkQA removed a comment on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
SparkQA removed a comment on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635624756 **[Test build #123248 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123248/testReport)** for PR 28331 at commit [`2da0f2d`](https://github.com/apache/spark/commit/2da0f2d761659a92ebf44a6f134d9640cff0138a).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
AmplabJenkins removed a comment on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635628404 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
SparkQA commented on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635628388 **[Test build #123248 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123248/testReport)** for PR 28331 at commit [`2da0f2d`](https://github.com/apache/spark/commit/2da0f2d761659a92ebf44a6f134d9640cff0138a).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
AmplabJenkins commented on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635628404
[GitHub] [spark] maryannxue opened a new pull request #28668: [SPARK-31862] Remove exception wrapping in AQE
maryannxue opened a new pull request #28668: URL: https://github.com/apache/spark/pull/28668
### What changes were proposed in this pull request?
This PR removes the excessive exception wrapping in AQE so that error messages are less verbose and mostly consistent with non-AQE execution. Exceptions from stage materialization are now wrapped with `SparkException` only if there are multiple stage failures. Also, stage cancellation errors are no longer included as part of the exception thrown; they are only logged as errors.
### Why are the changes needed?
This will make AQE error reporting more readable and easier to debug.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Updated existing tests.
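The wrapping rule described in this PR can be illustrated with a small sketch. This is not Spark code: it is a Python illustration in which the `SparkException` class, the `combine_stage_errors` function, the `other_failures` attribute, and the message text are all hypothetical stand-ins, and the policy is only a paraphrase of the PR description above.

```python
class SparkException(Exception):
    """Hypothetical stand-in for Spark's SparkException (not the real class)."""


def combine_stage_errors(errors):
    """Return the exception to surface for a list of stage failures.

    Assumed policy, paraphrasing the PR description: a single failure is
    surfaced unchanged; several failures are wrapped in one SparkException.
    """
    if not errors:
        raise ValueError("expected at least one stage failure")
    if len(errors) == 1:
        # Single failure: no wrapping, so the message matches non-AQE execution.
        return errors[0]
    # Multiple failures: wrap them, keeping the first as the cause and
    # attaching the rest so that none is silently dropped.
    wrapped = SparkException("Multiple failures in stage materialization.")
    wrapped.__cause__ = errors[0]
    wrapped.other_failures = list(errors[1:])  # illustrative attribute only
    return wrapped
```

With one failure the caller sees the original exception object; with two or more it sees a single wrapper whose cause is the first failure.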
[GitHub] [spark] dongjoon-hyun commented on pull request #28666: [BUILD][INFRA] bump the timeout to match the jenkins PRB
dongjoon-hyun commented on pull request #28666: URL: https://github.com/apache/spark/pull/28666#issuecomment-635627152 +1, late LGTM. Thank you, @shaneknapp .
[GitHub] [spark] holdenk commented on pull request #28667: [WIP][SPARK-31860][BUILD] only push release tags on success
holdenk commented on pull request #28667: URL: https://github.com/apache/spark/pull/28667#issuecomment-635625346 cc @shaneknapp
[GitHub] [spark] SparkQA commented on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
SparkQA commented on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635624756 **[Test build #123248 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123248/testReport)** for PR 28331 at commit [`2da0f2d`](https://github.com/apache/spark/commit/2da0f2d761659a92ebf44a6f134d9640cff0138a).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test
AmplabJenkins removed a comment on pull request #28627: URL: https://github.com/apache/spark/pull/28627#issuecomment-635622188
[GitHub] [spark] AmplabJenkins commented on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
AmplabJenkins commented on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635622316
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
AmplabJenkins removed a comment on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635622316
[GitHub] [spark] AmplabJenkins commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test
AmplabJenkins commented on pull request #28627: URL: https://github.com/apache/spark/pull/28627#issuecomment-635622188
[GitHub] [spark] SparkQA commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test
SparkQA commented on pull request #28627: URL: https://github.com/apache/spark/pull/28627#issuecomment-635621840 **[Test build #123247 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123247/testReport)** for PR 28627 at commit [`2e805ec`](https://github.com/apache/spark/commit/2e805ec3276d820935b987861ab90a042c1a8638).
[GitHub] [spark] sarutak commented on pull request #28666: [BUILD][INFRA] bump the timeout to match the jenkins PRB
sarutak commented on pull request #28666: URL: https://github.com/apache/spark/pull/28666#issuecomment-635619264 @shaneknapp Thanks!
[GitHub] [spark] shaneknapp commented on pull request #28666: [BUILD][INFRA] bump the timeout to match the jenkins PRB
shaneknapp commented on pull request #28666: URL: https://github.com/apache/spark/pull/28666#issuecomment-635618458 done!
[GitHub] [spark] shaneknapp commented on pull request #28666: [BUILD][INFRA] bump the timeout to match the jenkins PRB
shaneknapp commented on pull request #28666: URL: https://github.com/apache/spark/pull/28666#issuecomment-635617050 also going to backport this to 3.0 and 2.4
[GitHub] [spark] shaneknapp closed pull request #28666: [BUILD][INFRA] bump the timeout to match the jenkins PRB
shaneknapp closed pull request #28666: URL: https://github.com/apache/spark/pull/28666
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28666: [BUILD][INFRA] bump the timeout to match the jenkins PRB
AmplabJenkins removed a comment on pull request #28666: URL: https://github.com/apache/spark/pull/28666#issuecomment-635616372
[GitHub] [spark] AmplabJenkins commented on pull request #28666: [BUILD][INFRA] bump the timeout to match the jenkins PRB
AmplabJenkins commented on pull request #28666: URL: https://github.com/apache/spark/pull/28666#issuecomment-635616372
[GitHub] [spark] SparkQA removed a comment on pull request #28666: [BUILD][INFRA] bump the timeout to match the jenkins PRB
SparkQA removed a comment on pull request #28666: URL: https://github.com/apache/spark/pull/28666#issuecomment-635522914 **[Test build #123243 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123243/testReport)** for PR 28666 at commit [`3a4b29c`](https://github.com/apache/spark/commit/3a4b29c8ad6a73d3f088447052ac70a40a0b5e1c).
[GitHub] [spark] SparkQA commented on pull request #28666: [BUILD][INFRA] bump the timeout to match the jenkins PRB
SparkQA commented on pull request #28666: URL: https://github.com/apache/spark/pull/28666#issuecomment-635615385 **[Test build #123243 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123243/testReport)** for PR 28666 at commit [`3a4b29c`](https://github.com/apache/spark/commit/3a4b29c8ad6a73d3f088447052ac70a40a0b5e1c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #27066: [SPARK-31317][SQL] Add withField method to Column
SparkQA commented on pull request #27066: URL: https://github.com/apache/spark/pull/27066#issuecomment-635615292 **[Test build #123246 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123246/testReport)** for PR 27066 at commit [`238f2f2`](https://github.com/apache/spark/commit/238f2f29ac913875d7f80884d89e0395fc468215).
[GitHub] [spark] dilip-k-m edited a comment on pull request #4039: [SPARK-5236] Fix ClassCastException in SpecificMutableRow
dilip-k-m edited a comment on pull request #4039: URL: https://github.com/apache/spark/pull/4039#issuecomment-635613051 I've hit the same issue in production with Spark 1.6, and I was able to reproduce it in our performance test environment. My conclusion is that, with the same cluster configuration, if a Spark job is fed a growing rate of input traffic, it generates Parquet files with corrupted footers when writing after processing the feed. The probability of footer corruption increases when more unique values are fed (that is, the input file has more distinct values and fewer redundant field values). Increasing the number of partitions to write reduces this probability. I have found that Spark 2.x does not have this issue.
[GitHub] [spark] dilip-k-m commented on pull request #4039: [SPARK-5236] Fix ClassCastException in SpecificMutableRow
dilip-k-m commented on pull request #4039: URL: https://github.com/apache/spark/pull/4039#issuecomment-635613051 I've hit the same issue in production, and I was able to reproduce it in our performance test environment. My conclusion is that, with the same cluster configuration, if a Spark job is fed a growing rate of input traffic, it generates Parquet files with corrupted footers when writing after processing the feed. The probability of footer corruption increases when more unique values are fed (that is, the input file has more distinct values and fewer redundant field values). Increasing the number of partitions to write reduces this probability. I have found that Spark 2.x does not have this issue.
[GitHub] [spark] fqaiser94 commented on a change in pull request #27066: [SPARK-31317][SQL] Add withField method to Column
fqaiser94 commented on a change in pull request #27066: URL: https://github.com/apache/spark/pull/27066#discussion_r432128545
## File path: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala
## @@ -923,4 +923,452 @@ class ColumnExpressionSuite extends QueryTest with SharedSparkSession { val inSet = InSet(Literal("a"), Set("a", "b").map(UTF8String.fromString)) assert(inSet.sql === "('a' IN ('a', 'b'))") } + + {
Review comment: fair enough, I've made the change.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting and parsing functions
AmplabJenkins removed a comment on pull request #28650: URL: https://github.com/apache/spark/pull/28650#issuecomment-635610362
[GitHub] [spark] AmplabJenkins commented on pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting and parsing functions
AmplabJenkins commented on pull request #28650: URL: https://github.com/apache/spark/pull/28650#issuecomment-635610362
[GitHub] [spark] SparkQA removed a comment on pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting and parsing functions
SparkQA removed a comment on pull request #28650: URL: https://github.com/apache/spark/pull/28650#issuecomment-635459663 **[Test build #123239 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123239/testReport)** for PR 28650 at commit [`0680855`](https://github.com/apache/spark/commit/0680855705a6a19d2a60b746128085a3a320501d).
[GitHub] [spark] SparkQA commented on pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting and parsing functions
SparkQA commented on pull request #28650: URL: https://github.com/apache/spark/pull/28650#issuecomment-635609182 **[Test build #123239 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123239/testReport)** for PR 28650 at commit [`0680855`](https://github.com/apache/spark/commit/0680855705a6a19d2a60b746128085a3a320501d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] shaneknapp commented on pull request #28666: [BUILD][INFRA] bump the timeout to match the jenkins PRB
shaneknapp commented on pull request #28666: URL: https://github.com/apache/spark/pull/28666#issuecomment-635607691 i'm going to go ahead and merge this now, as it's blocking maven PRs. @srowen @HyukjinKwon @sarutak
[GitHub] [spark] AmplabJenkins commented on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
AmplabJenkins commented on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635608008
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
AmplabJenkins removed a comment on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635608008
[GitHub] [spark] SparkQA removed a comment on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
SparkQA removed a comment on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635515643 **[Test build #123242 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123242/testReport)** for PR 28665 at commit [`90aa354`](https://github.com/apache/spark/commit/90aa354539b503bec542ad6dff5afcdad4271a54).
[GitHub] [spark] SparkQA commented on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
SparkQA commented on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635607100 **[Test build #123242 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123242/testReport)** for PR 28665 at commit [`90aa354`](https://github.com/apache/spark/commit/90aa354539b503bec542ad6dff5afcdad4271a54).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28667: [WIP][SPARK-31860][BUILD] only push release tags on success
AmplabJenkins removed a comment on pull request #28667: URL: https://github.com/apache/spark/pull/28667#issuecomment-635605706
[GitHub] [spark] AmplabJenkins commented on pull request #28667: [WIP][SPARK-31860][BUILD] only push release tags on success
AmplabJenkins commented on pull request #28667: URL: https://github.com/apache/spark/pull/28667#issuecomment-635605706
[GitHub] [spark] SparkQA commented on pull request #28667: [WIP][SPARK-31860][BUILD] only push release tags on success
SparkQA commented on pull request #28667: URL: https://github.com/apache/spark/pull/28667#issuecomment-635605135 **[Test build #123245 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123245/testReport)** for PR 28667 at commit [`d26c735`](https://github.com/apache/spark/commit/d26c7353e01442459f0c9d17dd013d0f61f2ad02).
[GitHub] [spark] HeartSaVioR edited a comment on pull request #27694: [SPARK-30946][SS] Serde entry with UnsafeRow on FileStream(Source/Sink)Log with LZ4 compression
HeartSaVioR edited a comment on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-635603504
> One lesson I learned from the past is UnsafeRow is not designed to be persisted across Spark versions.

This sounds like a blocker for SS, as we leverage it to store state, which should be usable across Spark versions and shouldn't be corrupted or lost, since it's time-consuming or sometimes simply not possible to replay from scratch to reconstruct the same state due to retention policies. Do you have references for following the discussion/gotchas on that issue? I think we should really fix it for storing state; probably state must not be stored in UnsafeRow format then.

> Do you know what are the performance numbers if we just compress the text files?

I haven't experimented, but we can easily imagine the file size would be smaller, whereas processing time may be affected both positively (less to read from remote storage) and negatively (still have to serialize/deserialize with JSON + additional cost for compression/decompression). The point of the PR is that we know the schema (and even the version) of the file in advance, hence we don't (and shouldn't) pay a huge cost to make the file backward-compatible by itself. We don't do versioning for the data structures used by the event log, so we are paying a huge cost to make it backward/forward compatible. If we are not sure about the UnsafeRow format for storage, then we may be able to just try traditional approaches.
[GitHub] [spark] holdenk commented on pull request #28667: [WIP][SPARK-31860][BUILD] only push release tags on success
holdenk commented on pull request #28667: URL: https://github.com/apache/spark/pull/28667#issuecomment-635603600 cc @dongjoon-hyun ?
[GitHub] [spark] holdenk opened a new pull request #28667: [WIP][SPARK-31860][BUILD] only push release tags on success
holdenk opened a new pull request #28667: URL: https://github.com/apache/spark/pull/28667
### What changes were proposed in this pull request?
Only push the release tag after the build has finished.
### Why are the changes needed?
If the build fails we don't need a release tag.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Running locally with a fake user. [WIP - waiting to finish]
[GitHub] [spark] HeartSaVioR edited a comment on pull request #27694: [SPARK-30946][SS] Serde entry with UnsafeRow on FileStream(Source/Sink)Log with LZ4 compression
HeartSaVioR edited a comment on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-635603504
> One lesson I learned from the past is UnsafeRow is not designed to be persisted across Spark versions.
This sounds like a blocker for SS, as we leverage it to store state which should be usable across Spark versions and shouldn't be corrupted/lost, because replaying from scratch to reconstruct the same state is time-consuming, or sometimes simply not possible due to retention policy. Do you have references to the discussion/gotcha on that issue? I think we should really fix it for storing state - state probably must not be stored in UnsafeRow format then.
> Do you know what are the performance numbers if we just compress the text files?
I haven't experimented, but we can easily imagine the file size would be smaller, whereas processing time may be affected both positively (less to read from remote storage) and negatively (still have to serialize/deserialize with JSON + cost of compression). Here the point of the PR is that we know the schema (and even the versioning) of the file in advance, hence we don't (and shouldn't) pay a huge cost to make the file backward-compatible by itself. We don't version the data structures used by the event log, so we pay a huge cost to make it backward/forward compatible. If we are not sure about the UnsafeRow format for storage, then we may be able to just try traditional approaches.
[GitHub] [spark] HeartSaVioR commented on pull request #27694: [SPARK-30946][SS] Serde entry with UnsafeRow on FileStream(Source/Sink)Log with LZ4 compression
HeartSaVioR commented on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-635603504
> One lesson I learned from the past is UnsafeRow is not designed to be persisted across Spark versions.
This sounds like a blocker for SS, as we leverage it to store state which should be usable across Spark versions and shouldn't be corrupted/lost, because replaying from scratch to reconstruct the same state is time-consuming, or sometimes simply not possible due to retention policy. Do you have references to the discussion/gotcha on that issue? I think we should really fix it for storing state - state probably must not be stored in UnsafeRow format then.
> Do you know what are the performance numbers if we just compress the text files?
I haven't experimented, but we can easily imagine the file size would be smaller, whereas processing time may be affected both positively (less to read from remote storage) and negatively (still have to serialize/deserialize with JSON + compression). Here the point of the PR is that we know the schema (and even the versioning) of the file in advance, hence we don't (and shouldn't) pay a huge cost to make the file backward-compatible by itself. We don't version the data structures used by the event log, so we pay a huge cost to make it backward/forward compatible. If we are not sure about the UnsafeRow format for storage, then we may be able to just try traditional approaches.
[GitHub] [spark] holdenk commented on a change in pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
holdenk commented on a change in pull request #28331: URL: https://github.com/apache/spark/pull/28331#discussion_r432110087
## File path: core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionSuite.scala ##
@@ -69,36 +84,64 @@ class BlockManagerDecommissionSuite extends SparkFunSuite with LocalSparkContext
 })
 // Cache the RDD lazily
-sleepyRdd.persist()
+if (persist) {
+  testRdd.persist()
+}
 // Start the computation of RDD - this step will also cache the RDD
-val asyncCount = sleepyRdd.countAsync()
+val asyncCount = testRdd.countAsync()
 // Wait for the job to have started
 sem.acquire(1)
+// Give Spark a tiny bit to start the tasks after the listener says hello
+Thread.sleep(100)
Review comment: I've added the wait for all the executors to come up before starting the job, but I think this sleep is ok because we know it's less than the length of the job and we are essentially trying to test what happens in the middle of a job. I can't think of a reasonable way to avoid this sleep.
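Editor's sketch (not part of the PR): the wait-then-pause pattern described in the review comment above can be illustrated without Spark. Everything here is an assumption made for illustration: the object name `WaitForJobStartSketch`, the helpers `startFakeJob` and `observeMidJob`, and the plain thread standing in for a Spark job. It only shows why a short sleep, chosen to be well below the job length, tends to land in the middle of the job once the start signal has been seen.

```scala
import java.util.concurrent.{Semaphore, TimeUnit}

object WaitForJobStartSketch {
  // Hypothetical stand-in for a Spark job: it signals that it has started
  // (the role the listener plays in the test) and then works for jobMillis.
  def startFakeJob(started: Semaphore, jobMillis: Long): Thread = {
    val job = new Thread(new Runnable {
      override def run(): Unit = {
        started.release()
        Thread.sleep(jobMillis)
      }
    })
    job.start()
    job
  }

  // Returns (sawStart, stillRunningAfterPause).
  def observeMidJob(jobMillis: Long, pauseMillis: Long): (Boolean, Boolean) = {
    val started = new Semaphore(0)
    val job = startFakeJob(started, jobMillis)
    // Wait for the job to have started; the bound keeps a broken job from hanging the caller.
    val sawStart = started.tryAcquire(1, 5L, TimeUnit.SECONDS)
    // Short pause, chosen to be well below jobMillis, so the check happens mid-job.
    Thread.sleep(pauseMillis)
    val stillRunning = job.isAlive
    job.join()
    (sawStart, stillRunning)
  }

  def main(args: Array[String]): Unit = {
    val (sawStart, stillRunning) = observeMidJob(jobMillis = 1000L, pauseMillis = 100L)
    println(s"sawStart=$sawStart stillRunning=$stillRunning")
  }
}
```

The second value is timing-dependent by nature, which is the trade-off the comment acknowledges: the pause is only safe as long as it stays comfortably shorter than the job.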
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
AmplabJenkins removed a comment on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635591610
[GitHub] [spark] AmplabJenkins commented on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
AmplabJenkins commented on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635591610
[GitHub] [spark] holdenk commented on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
holdenk commented on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635591113 Oh wait, we already do that in the K8s integration test. So I think that's tested, but I'll see what I can do to improve the unit tests as well.
[GitHub] [spark] SparkQA commented on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
SparkQA commented on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635590890 **[Test build #123244 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123244/testReport)** for PR 28331 at commit [`838a346`](https://github.com/apache/spark/commit/838a346bc27e5d9b7892866ac6faf3188b93b615).
[GitHub] [spark] holdenk commented on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
holdenk commented on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635583163 Probably the easiest thing to do is delete the executor in the K8s test after it's had a chance to migrate the blocks.
[GitHub] [spark] holdenk commented on a change in pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
holdenk commented on a change in pull request #28331: URL: https://github.com/apache/spark/pull/28331#discussion_r432097155
## File path: core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionSuite.scala ##
@@ -69,36 +84,64 @@ class BlockManagerDecommissionSuite extends SparkFunSuite with LocalSparkContext
 })
 // Cache the RDD lazily
-sleepyRdd.persist()
+if (persist) {
+  testRdd.persist()
+}
 // Start the computation of RDD - this step will also cache the RDD
-val asyncCount = sleepyRdd.countAsync()
+val asyncCount = testRdd.countAsync()
 // Wait for the job to have started
 sem.acquire(1)
+// Give Spark a tiny bit to start the tasks after the listener says hello
+Thread.sleep(100)
Review comment: If I think of one, I'll update it. There might be another listener event I can look at, but I'll revisit this nearer the end.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28658: [SPARK-31730][CORE][TEST][3.0] Fix flaky tests in BarrierTaskContextSuite
AmplabJenkins removed a comment on pull request #28658: URL: https://github.com/apache/spark/pull/28658#issuecomment-635549762
[GitHub] [spark] AmplabJenkins commented on pull request #28658: [SPARK-31730][CORE][TEST][3.0] Fix flaky tests in BarrierTaskContextSuite
AmplabJenkins commented on pull request #28658: URL: https://github.com/apache/spark/pull/28658#issuecomment-635549762
[GitHub] [spark] SparkQA removed a comment on pull request #28658: [SPARK-31730][CORE][TEST][3.0] Fix flaky tests in BarrierTaskContextSuite
SparkQA removed a comment on pull request #28658: URL: https://github.com/apache/spark/pull/28658#issuecomment-635463295 **[Test build #123240 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123240/testReport)** for PR 28658 at commit [`4359923`](https://github.com/apache/spark/commit/43599236260d1f7affb3611c9ab1b7f0cf31831f).
[GitHub] [spark] SparkQA commented on pull request #28658: [SPARK-31730][CORE][TEST][3.0] Fix flaky tests in BarrierTaskContextSuite
SparkQA commented on pull request #28658: URL: https://github.com/apache/spark/pull/28658#issuecomment-635548841 **[Test build #123240 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123240/testReport)** for PR 28658 at commit [`4359923`](https://github.com/apache/spark/commit/43599236260d1f7affb3611c9ab1b7f0cf31831f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28654: [SPARK-31834][SQL]Improve error message for incompatible data types
AmplabJenkins removed a comment on pull request #28654: URL: https://github.com/apache/spark/pull/28654#issuecomment-635541684
[GitHub] [spark] AmplabJenkins commented on pull request #28654: [SPARK-31834][SQL]Improve error message for incompatible data types
AmplabJenkins commented on pull request #28654: URL: https://github.com/apache/spark/pull/28654#issuecomment-635541684
[GitHub] [spark] SparkQA removed a comment on pull request #28654: [SPARK-31834][SQL]Improve error message for incompatible data types
SparkQA removed a comment on pull request #28654: URL: https://github.com/apache/spark/pull/28654#issuecomment-635384182 **[Test build #123236 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123236/testReport)** for PR 28654 at commit [`77c7eef`](https://github.com/apache/spark/commit/77c7eef56ba7edaa4fee9bc8f6b5ac471d0806d7).
[GitHub] [spark] SparkQA commented on pull request #28654: [SPARK-31834][SQL]Improve error message for incompatible data types
SparkQA commented on pull request #28654: URL: https://github.com/apache/spark/pull/28654#issuecomment-635540632 **[Test build #123236 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123236/testReport)** for PR 28654 at commit [`77c7eef`](https://github.com/apache/spark/commit/77c7eef56ba7edaa4fee9bc8f6b5ac471d0806d7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28626: [SPARK-28481][SQL] More expressions should extend NullIntolerant
AmplabJenkins removed a comment on pull request #28626: URL: https://github.com/apache/spark/pull/28626#issuecomment-635535474
[GitHub] [spark] AmplabJenkins commented on pull request #28626: [SPARK-28481][SQL] More expressions should extend NullIntolerant
AmplabJenkins commented on pull request #28626: URL: https://github.com/apache/spark/pull/28626#issuecomment-635535474
[GitHub] [spark] SparkQA removed a comment on pull request #28626: [SPARK-28481][SQL] More expressions should extend NullIntolerant
SparkQA removed a comment on pull request #28626: URL: https://github.com/apache/spark/pull/28626#issuecomment-635384195 **[Test build #123237 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123237/testReport)** for PR 28626 at commit [`265abd3`](https://github.com/apache/spark/commit/265abd3b8905613aa4b9878326f0004cae2baab7).
[GitHub] [spark] SparkQA commented on pull request #28626: [SPARK-28481][SQL] More expressions should extend NullIntolerant
SparkQA commented on pull request #28626: URL: https://github.com/apache/spark/pull/28626#issuecomment-635534477 **[Test build #123237 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123237/testReport)** for PR 28626 at commit [`265abd3`](https://github.com/apache/spark/commit/265abd3b8905613aa4b9878326f0004cae2baab7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28666: bump the timeout to match the jenkins PRB
AmplabJenkins removed a comment on pull request #28666: URL: https://github.com/apache/spark/pull/28666#issuecomment-635523623
[GitHub] [spark] AmplabJenkins commented on pull request #28666: bump the timeout to match the jenkins PRB
AmplabJenkins commented on pull request #28666: URL: https://github.com/apache/spark/pull/28666#issuecomment-635523623
[GitHub] [spark] SparkQA commented on pull request #28666: bump the timeout to match the jenkins PRB
SparkQA commented on pull request #28666: URL: https://github.com/apache/spark/pull/28666#issuecomment-635522914 **[Test build #123243 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123243/testReport)** for PR 28666 at commit [`3a4b29c`](https://github.com/apache/spark/commit/3a4b29c8ad6a73d3f088447052ac70a40a0b5e1c).
[GitHub] [spark] shaneknapp opened a new pull request #28666: bump the timeout to match the jenkins PRB
shaneknapp opened a new pull request #28666: URL: https://github.com/apache/spark/pull/28666
### What changes were proposed in this pull request?
bump the timeout to match what's set in jenkins
### Why are the changes needed?
tests be timing out!
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
via jenkins
[GitHub] [spark] skonto edited a comment on pull request #28561: [SPARK-31740][K8S][TESTS] Use github URL instead of a broken link
skonto edited a comment on pull request #28561: URL: https://github.com/apache/spark/pull/28561#issuecomment-635519286 +1 LGTM. AFAIK this URL is very old (check [here](https://github.com/apache-spark-on-k8s/spark-integration/blob/bcecb08879131aaa490c5ef711959ab04cda3b0e/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala#L394)); it existed even before porting the tests to the Spark project for the first time.
[GitHub] [spark] skonto commented on pull request #28561: [SPARK-31740][K8S][TESTS] Use github URL instead of a broken link
skonto commented on pull request #28561: URL: https://github.com/apache/spark/pull/28561#issuecomment-635519286 +1 LGTM. AFAIK this url is very old check [here](https://github.com/apache-spark-on-k8s/spark-integration/blob/bcecb08879131aaa490c5ef711959ab04cda3b0e/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala), even before porting the tests to the Spark project for the first time.
[GitHub] [spark] skonto edited a comment on pull request #28561: [SPARK-31740][K8S][TESTS] Use github URL instead of a broken link
skonto edited a comment on pull request #28561: URL: https://github.com/apache/spark/pull/28561#issuecomment-635519286 +1 LGTM. AFAIK this url is very old check [here](https://github.com/apache-spark-on-k8s/spark-integration/blob/bcecb08879131aaa490c5ef711959ab04cda3b0e/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala#L394), even before porting the tests to the Spark project for the first time.
[GitHub] [spark] SparkQA commented on pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage
SparkQA commented on pull request #27690: URL: https://github.com/apache/spark/pull/27690#issuecomment-635513495 **[Test build #123235 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123235/testReport)** for PR 27690 at commit [`5098ebd`](https://github.com/apache/spark/commit/5098ebd19827cd4bcce6a31e41cba5d77cc8bc59). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
AmplabJenkins commented on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635516319
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
AmplabJenkins removed a comment on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635516319
[GitHub] [spark] SparkQA commented on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
SparkQA commented on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635515643 **[Test build #123242 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123242/testReport)** for PR 28665 at commit [`90aa354`](https://github.com/apache/spark/commit/90aa354539b503bec542ad6dff5afcdad4271a54).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage
AmplabJenkins removed a comment on pull request #27690: URL: https://github.com/apache/spark/pull/27690#issuecomment-635514705
[GitHub] [spark] AmplabJenkins commented on pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage
AmplabJenkins commented on pull request #27690: URL: https://github.com/apache/spark/pull/27690#issuecomment-635514705
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
AmplabJenkins removed a comment on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635513453 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123241/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
AmplabJenkins removed a comment on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635513441 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
SparkQA removed a comment on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635511876 **[Test build #123241 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123241/testReport)** for PR 28665 at commit [`05379e5`](https://github.com/apache/spark/commit/05379e5b9a5f0ebc0e86abb87ec6d00c0a44750c).
[GitHub] [spark] SparkQA removed a comment on pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage
SparkQA removed a comment on pull request #27690: URL: https://github.com/apache/spark/pull/27690#issuecomment-635355138 **[Test build #123235 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123235/testReport)** for PR 27690 at commit [`5098ebd`](https://github.com/apache/spark/commit/5098ebd19827cd4bcce6a31e41cba5d77cc8bc59).
[GitHub] [spark] AmplabJenkins commented on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
AmplabJenkins commented on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635513441
[GitHub] [spark] SparkQA commented on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
SparkQA commented on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635513427 **[Test build #123241 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123241/testReport)** for PR 28665 at commit [`05379e5`](https://github.com/apache/spark/commit/05379e5b9a5f0ebc0e86abb87ec6d00c0a44750c). * This patch **fails build dependency tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
AmplabJenkins removed a comment on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635512463
[GitHub] [spark] AmplabJenkins commented on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
AmplabJenkins commented on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635512463
[GitHub] [spark] SparkQA commented on pull request #28665: [SPARK-31858][BUILD][test-hadoop3.2] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
SparkQA commented on pull request #28665: URL: https://github.com/apache/spark/pull/28665#issuecomment-635511876 **[Test build #123241 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123241/testReport)** for PR 28665 at commit [`05379e5`](https://github.com/apache/spark/commit/05379e5b9a5f0ebc0e86abb87ec6d00c0a44750c).
[GitHub] [spark] dongjoon-hyun opened a new pull request #28665: [SPARK-31858][BUILD] Upgrade commons-io to 2.5 in Hadoop 3.2 profile
dongjoon-hyun opened a new pull request #28665: URL: https://github.com/apache/spark/pull/28665 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested?
[GitHub] [spark] BryanCutler commented on pull request #28659: [SPARK-25351][PYTHON][TEST][FOLLOWUP] Fix test assertions to be consistent
BryanCutler commented on pull request #28659: URL: https://github.com/apache/spark/pull/28659#issuecomment-635506194 Thanks @HyukjinKwon !
[GitHub] [spark] EnricoMi commented on pull request #28663: [SPARK-31853][DOCS] Mention removal of params mixins setter in migration guide
EnricoMi commented on pull request #28663: URL: https://github.com/apache/spark/pull/28663#issuecomment-635498007 Code that used to call into the setters, e.g. `HasOutputCols.setOutputCols(value)`, will not compile anymore.
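A minimal sketch of the kind of breakage described in the comment above, assuming the setter moved from the shared mixin to the concrete class. `HasOutputColsLike`, `ExampleTransformer` and `SetterMigrationSketch` are hypothetical stand-ins for illustration, not Spark's actual classes:

```scala
// Hypothetical stand-ins for illustration; these are NOT Spark's real classes.

// Shared mixin after the change: it only exposes the getter.
trait HasOutputColsLike {
  protected var outputCols: Array[String] = Array.empty[String]
  def getOutputCols: Array[String] = outputCols
}

// Assumption: the setter now lives on the concrete class instead of the mixin.
class ExampleTransformer extends HasOutputColsLike {
  def setOutputCols(value: Array[String]): this.type = {
    outputCols = value
    this
  }
}

object SetterMigrationSketch {
  def main(args: Array[String]): Unit = {
    // A reference typed as the mixin, as generic helper code might hold.
    val stage: HasOutputColsLike = new ExampleTransformer

    // stage.setOutputCols(Array("a"))  // would not compile: the mixin has no setter

    // Code written against the mixin has to go through the concrete type instead.
    stage match {
      case t: ExampleTransformer => t.setOutputCols(Array("a", "b"))
      case _ => ()
    }
    println(stage.getOutputCols.mkString(","))
  }
}
```

Under this assumption, callers that only hold the mixin type need either a reference to the concrete class or a pattern match like the one shown.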
[GitHub] [spark] attilapiros commented on pull request #28331: [WIP][SPARK-20629][CORE] Copy shuffle data when nodes are being shutdown
attilapiros commented on pull request #28331: URL: https://github.com/apache/spark/pull/28331#issuecomment-635492967 So it is not recalculation but the timing: the decommissioned executor is still giving back those shuffle data files: You can check it by adding a temporary log: ```diff diff --git a/core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala b/core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala index b959e83599..ca45dc0905 100644 --- a/core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala +++ b/core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala @@ -85,9 +85,11 @@ private[spark] class IndexShuffleBlockResolver( */ def getDataFile(shuffleId: Int, mapId: Long, dirs: Option[Array[String]]): File = { val blockId = ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID) -dirs +val file = dirs .map(ExecutorDiskUtils.getFile(_, blockManager.subDirsPerLocalDir, blockId.name)) .getOrElse(blockManager.diskBlockManager.getFile(blockId)) +logInfo(s"Attila: shuffle file: ${file.getAbsolutePath()}", new Exception()) +file } /** ``` Then please remove the `work` directory content and execute this test only i.e. by sbt: ``` > testOnly *.BlockManagerDecommissionSuite -- -z "verify that shuffle blocks are migrated" [warn] Multiple main classes detected. Run 'show discoveredMainClasses' to see the list [info] Packaging /Users/attilazsoltpiros/git/attilapiros/reviewing/core/target/scala-2.12/spark-core_2.12-3.1.0-SNAPSHOT.jar ... [info] Done packaging. [info] BlockManagerDecommissionSuite: [info] - verify that shuffle blocks are migrated. (12 seconds, 968 milliseconds) [info] ScalaTest [info] Run completed in 15 seconds, 145 milliseconds. [info] Total number of tests run: 1 [info] Suites: completed 1, aborted 0 [info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0 [info] All tests passed. 
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1 [success] Total time: 19 s, completed May 28, 2020 7:17:21 PM ``` You can check the logs in the `work` directory: ``` ╭─attilazsoltpiros@apiros-MBP16 ~/git/attilapiros/reviewing/work ‹pr/28331*› ╰─$ ag Attila app-20200528101709-/0/target/unit-tests.log 69:20/05/28 19:17:14.044 Executor task launch worker for task 10 INFO IndexShuffleBlockResolver: Attila: shuffle file: /Users/attilazsoltpiros/git/attilapiros/reviewing/target/tmp/spark-31f83c40-e25a-4247-8eee-d5fb7f2ec5f8/executor-4803c415-97f1-4429-87d7-537603fcce3b/blockmgr-add08797-7db9-4b18-9fff-76fb9a92cc72/33/shuffle_0_10_0.data 86:20/05/28 19:17:14.363 Executor task launch worker for task 10 INFO IndexShuffleBlockResolver: Attila: shuffle file: /Users/attilazsoltpiros/git/attilapiros/reviewing/target/tmp/spark-31f83c40-e25a-4247-8eee-d5fb7f2ec5f8/executor-4803c415-97f1-4429-87d7-537603fcce3b/blockmgr-add08797-7db9-4b18-9fff-76fb9a92cc72/33/shuffle_0_10_0.data 106:20/05/28 19:17:14.886 Executor task launch worker for task 12 INFO IndexShuffleBlockResolver: Attila: shuffle file: /Users/attilazsoltpiros/git/attilapiros/reviewing/target/tmp/spark-31f83c40-e25a-4247-8eee-d5fb7f2ec5f8/executor-4803c415-97f1-4429-87d7-537603fcce3b/blockmgr-add08797-7db9-4b18-9fff-76fb9a92cc72/31/shuffle_0_12_0.data 123:20/05/28 19:17:14.894 Executor task launch worker for task 12 INFO IndexShuffleBlockResolver: Attila: shuffle file: /Users/attilazsoltpiros/git/attilapiros/reviewing/target/tmp/spark-31f83c40-e25a-4247-8eee-d5fb7f2ec5f8/executor-4803c415-97f1-4429-87d7-537603fcce3b/blockmgr-add08797-7db9-4b18-9fff-76fb9a92cc72/31/shuffle_0_12_0.data 143:20/05/28 19:17:15.409 Executor task launch worker for task 13 INFO IndexShuffleBlockResolver: Attila: shuffle file: /Users/attilazsoltpiros/git/attilapiros/reviewing/target/tmp/spark-31f83c40-e25a-4247-8eee-d5fb7f2ec5f8/executor-4803c415-97f1-4429-87d7-537603fcce3b/blockmgr-add08797-7db9-4b18-9fff-76fb9a92cc72/30/shuffle_0_13_0.data 
160:20/05/28 19:17:15.416 Executor task launch worker for task 13 INFO IndexShuffleBlockResolver: Attila: shuffle file: /Users/attilazsoltpiros/git/attilapiros/reviewing/target/tmp/spark-31f83c40-e25a-4247-8eee-d5fb7f2ec5f8/executor-4803c415-97f1-4429-87d7-537603fcce3b/blockmgr-add08797-7db9-4b18-9fff-76fb9a92cc72/30/shuffle_0_13_0.data 180:20/05/28 19:17:15.933 Executor task launch worker for task 14 INFO IndexShuffleBlockResolver: Attila: shuffle file: /Users/attilazsoltpiros/git/attilapiros/reviewing/target/tmp/spark-31f83c40-e25a-4247-8eee-d5fb7f2ec5f8/executor-4803c415-97f1-4429-87d7-537603fcce3b/blockmgr-add08797-7db9-4b18-9fff-76fb9a92cc72/2f/shuffle_0_14_0.data 197:20/05/28 19:17:15.940 Executor task launch worker for task 14 INFO IndexShuffleBlockResolver: Attila: shuffle file: /Users/attilazsoltpiros
[GitHub] [spark] karuppayya commented on pull request #28662: [SPARK-31850] Prevent DetermineTableStats from computing stats multiple times for same table
karuppayya commented on pull request #28662: URL: https://github.com/apache/spark/pull/28662#issuecomment-635484418 @dongjoon-hyun @cloud-fan @gatorsmile Can anyone please help review this change? Thanks.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28661: [SPARK-31849][PYTHON][SQL] Make PySpark exceptions more Pythonic
AmplabJenkins removed a comment on pull request #28661: URL: https://github.com/apache/spark/pull/28661#issuecomment-635479265 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123234/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28661: [SPARK-31849][PYTHON][SQL] Make PySpark exceptions more Pythonic
AmplabJenkins removed a comment on pull request #28661: URL: https://github.com/apache/spark/pull/28661#issuecomment-635479255 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #28661: [SPARK-31849][PYTHON][SQL] Make PySpark exceptions more Pythonic
AmplabJenkins commented on pull request #28661: URL: https://github.com/apache/spark/pull/28661#issuecomment-635479255
[GitHub] [spark] SparkQA removed a comment on pull request #28661: [SPARK-31849][PYTHON][SQL] Make PySpark exceptions more Pythonic
SparkQA removed a comment on pull request #28661: URL: https://github.com/apache/spark/pull/28661#issuecomment-635338654 **[Test build #123234 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123234/testReport)** for PR 28661 at commit [`fb22d68`](https://github.com/apache/spark/commit/fb22d687afc86d308d16aaf7685f1ce9b6fb41a5).
[GitHub] [spark] SparkQA commented on pull request #28661: [SPARK-31849][PYTHON][SQL] Make PySpark exceptions more Pythonic
SparkQA commented on pull request #28661: URL: https://github.com/apache/spark/pull/28661#issuecomment-635478585 **[Test build #123234 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123234/testReport)** for PR 28661 at commit [`fb22d68`](https://github.com/apache/spark/commit/fb22d687afc86d308d16aaf7685f1ce9b6fb41a5). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28658: [SPARK-31730][CORE][TEST][3.0] Fix flaky tests in BarrierTaskContextSuite
AmplabJenkins removed a comment on pull request #28658: URL: https://github.com/apache/spark/pull/28658#issuecomment-635463919
[GitHub] [spark] yaooqinn commented on a change in pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting and parsing functions
yaooqinn commented on a change in pull request #28650: URL: https://github.com/apache/spark/pull/28650#discussion_r431977318 ## File path: sql/core/src/test/resources/sql-tests/inputs/datetime.sql ## @@ -160,3 +156,83 @@ select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampF select from_json('{"date":"26/October/2015"}', 'date Date', map('dateFormat', 'dd/M/')); select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/M/')); select from_csv('26/October/2015', 'date Date', map('dateFormat', 'dd/M/')); + +select date_format(null, null); +select date_format(null, '-MM-dd'); +select date_format(null, 'invalid'); +select date_format(cast(null as date), '-MM-dd'); +select date_format(date '1986-05-23', null); +select date_format(date '1986-05-23', 'invalid'); +select date_format(date '1986-05-23', '-MM-dd'); +select date_format(cast(null as string), '-MM-dd'); +select date_format('1986-05-23', null); +select date_format('1986-05-23', 'invalid'); +select date_format('1986-05-23', '-MM-dd'); +select date_format(cast(null as timestamp ), '-MM-dd'); +select date_format(timestamp '1986-05-23', null); +select date_format(timestamp '1986-05-23', 'invalid'); +select date_format(timestamp '1986-05-23', '-MM-dd'); + +select from_unixtime(null); +select from_unixtime(null , null); +select from_unixtime(12345 , null); +select from_unixtime(null , 'invalid'); +select from_unixtime(null , '-MM-dd'); +select from_unixtime(12345 , '-MM-dd'); + +select unix_timestamp() = unix_timestamp(); +select unix_timestamp(null); +select unix_timestamp(null, null); +select unix_timestamp(null, '-MM-dd'); +select unix_timestamp(null, 'invalid'); +select unix_timestamp(cast(null as date), '-MM-dd'); +select unix_timestamp(date '1986-05-23'); +select unix_timestamp(date '1986-05-23', null); +select unix_timestamp(date '1986-05-23', 'invalid'); +select unix_timestamp(date '1986-05-23', '-MM-dd'); +select unix_timestamp(cast(null as string), '-MM-dd'); 
+select unix_timestamp('1986-05-23'); +select unix_timestamp('1986-05-23', null); +select unix_timestamp('1986-05-23', 'invalid'); +select unix_timestamp('1986-05-23', '-MM-dd'); +select unix_timestamp(cast(null as timestamp ), '-MM-dd'); +select unix_timestamp(timestamp '1986-05-23'); +select unix_timestamp(timestamp '1986-05-23', null); +select unix_timestamp(timestamp '1986-05-23', 'invalid'); +select unix_timestamp(timestamp '1986-05-23', '-MM-dd'); + +select to_unix_timestamp(null); +select to_unix_timestamp(null, null); +select to_unix_timestamp(null, '-MM-dd'); +select to_unix_timestamp(null, 'invalid'); +select to_unix_timestamp(cast(null as date), '-MM-dd'); +select to_unix_timestamp(date '1986-05-23'); +select to_unix_timestamp(date '1986-05-23', null); +select to_unix_timestamp(date '1986-05-23', 'invalid'); +select to_unix_timestamp(date '1986-05-23', '-MM-dd'); +select to_unix_timestamp(cast(null as string), '-MM-dd'); +select to_unix_timestamp('1986-05-23'); +select to_unix_timestamp('1986-05-23', null); +select to_unix_timestamp('1986-05-23', 'invalid'); +select to_unix_timestamp('1986-05-23', '-MM-dd'); +select to_unix_timestamp(cast(null as timestamp ), '-MM-dd'); +select to_unix_timestamp(timestamp '1986-05-23'); +select to_unix_timestamp(timestamp '1986-05-23', null); +select to_unix_timestamp(timestamp '1986-05-23', 'invalid'); +select to_unix_timestamp(timestamp '1986-05-23', '-MM-dd'); + +select to_timestamp(null); +select to_timestamp(cast(null as string), '-MM-dd'); +select to_timestamp(cast(null as string), 'invalid'); +select to_timestamp('1986-05-23'); +select to_timestamp('1986-05-23', null); +select to_timestamp('1986-05-23', 'invalid'); +select to_timestamp('1986-05-23', '-MM-dd'); + +select to_date(null); +select to_date(cast(null as string), '-MM-dd'); +select to_date(cast(null as string), 'invalid'); +select to_date('1986-05-23'); +select to_date('1986-05-23', null); +select to_date('1986-05-23', 'invalid'); +select 
to_date('1986-05-23', '-MM-dd'); Review comment: added more tests to check codegen version
[GitHub] [spark] AmplabJenkins commented on pull request #28658: [SPARK-31730][CORE][TEST][3.0] Fix flaky tests in BarrierTaskContextSuite
AmplabJenkins commented on pull request #28658: URL: https://github.com/apache/spark/pull/28658#issuecomment-635463919
[GitHub] [spark] SparkQA commented on pull request #28658: [SPARK-31730][CORE][TEST][3.0] Fix flaky tests in BarrierTaskContextSuite
SparkQA commented on pull request #28658: URL: https://github.com/apache/spark/pull/28658#issuecomment-635463295 **[Test build #123240 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123240/testReport)** for PR 28658 at commit [`4359923`](https://github.com/apache/spark/commit/43599236260d1f7affb3611c9ab1b7f0cf31831f).
[GitHub] [spark] yaooqinn commented on a change in pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting and parsing functions
yaooqinn commented on a change in pull request #28650: URL: https://github.com/apache/spark/pull/28650#discussion_r431977318 ## File path: sql/core/src/test/resources/sql-tests/inputs/datetime.sql ## @@ -160,3 +156,83 @@ select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampF select from_json('{"date":"26/October/2015"}', 'date Date', map('dateFormat', 'dd/M/')); select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/M/')); select from_csv('26/October/2015', 'date Date', map('dateFormat', 'dd/M/')); + +select date_format(null, null); +select date_format(null, '-MM-dd'); +select date_format(null, 'invalid'); +select date_format(cast(null as date), '-MM-dd'); +select date_format(date '1986-05-23', null); +select date_format(date '1986-05-23', 'invalid'); +select date_format(date '1986-05-23', '-MM-dd'); +select date_format(cast(null as string), '-MM-dd'); +select date_format('1986-05-23', null); +select date_format('1986-05-23', 'invalid'); +select date_format('1986-05-23', '-MM-dd'); +select date_format(cast(null as timestamp ), '-MM-dd'); +select date_format(timestamp '1986-05-23', null); +select date_format(timestamp '1986-05-23', 'invalid'); +select date_format(timestamp '1986-05-23', '-MM-dd'); + +select from_unixtime(null); +select from_unixtime(null , null); +select from_unixtime(12345 , null); +select from_unixtime(null , 'invalid'); +select from_unixtime(null , '-MM-dd'); +select from_unixtime(12345 , '-MM-dd'); + +select unix_timestamp() = unix_timestamp(); +select unix_timestamp(null); +select unix_timestamp(null, null); +select unix_timestamp(null, '-MM-dd'); +select unix_timestamp(null, 'invalid'); +select unix_timestamp(cast(null as date), '-MM-dd'); +select unix_timestamp(date '1986-05-23'); +select unix_timestamp(date '1986-05-23', null); +select unix_timestamp(date '1986-05-23', 'invalid'); +select unix_timestamp(date '1986-05-23', '-MM-dd'); +select unix_timestamp(cast(null as string), '-MM-dd'); 
+select unix_timestamp('1986-05-23'); +select unix_timestamp('1986-05-23', null); +select unix_timestamp('1986-05-23', 'invalid'); +select unix_timestamp('1986-05-23', '-MM-dd'); +select unix_timestamp(cast(null as timestamp ), '-MM-dd'); +select unix_timestamp(timestamp '1986-05-23'); +select unix_timestamp(timestamp '1986-05-23', null); +select unix_timestamp(timestamp '1986-05-23', 'invalid'); +select unix_timestamp(timestamp '1986-05-23', '-MM-dd'); + +select to_unix_timestamp(null); +select to_unix_timestamp(null, null); +select to_unix_timestamp(null, '-MM-dd'); +select to_unix_timestamp(null, 'invalid'); +select to_unix_timestamp(cast(null as date), '-MM-dd'); +select to_unix_timestamp(date '1986-05-23'); +select to_unix_timestamp(date '1986-05-23', null); +select to_unix_timestamp(date '1986-05-23', 'invalid'); +select to_unix_timestamp(date '1986-05-23', '-MM-dd'); +select to_unix_timestamp(cast(null as string), '-MM-dd'); +select to_unix_timestamp('1986-05-23'); +select to_unix_timestamp('1986-05-23', null); +select to_unix_timestamp('1986-05-23', 'invalid'); +select to_unix_timestamp('1986-05-23', '-MM-dd'); +select to_unix_timestamp(cast(null as timestamp ), '-MM-dd'); +select to_unix_timestamp(timestamp '1986-05-23'); +select to_unix_timestamp(timestamp '1986-05-23', null); +select to_unix_timestamp(timestamp '1986-05-23', 'invalid'); +select to_unix_timestamp(timestamp '1986-05-23', '-MM-dd'); + +select to_timestamp(null); +select to_timestamp(cast(null as string), '-MM-dd'); +select to_timestamp(cast(null as string), 'invalid'); +select to_timestamp('1986-05-23'); +select to_timestamp('1986-05-23', null); +select to_timestamp('1986-05-23', 'invalid'); +select to_timestamp('1986-05-23', '-MM-dd'); + +select to_date(null); +select to_date(cast(null as string), '-MM-dd'); +select to_date(cast(null as string), 'invalid'); +select to_date('1986-05-23'); +select to_date('1986-05-23', null); +select to_date('1986-05-23', 'invalid'); +select 
to_date('1986-05-23', '-MM-dd'); Review comment: add more tests to check codegen version