[GitHub] [spark] SparkQA commented on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-06-14 Thread GitBox
SparkQA commented on pull request #25965: URL: https://github.com/apache/spark/pull/25965#issuecomment-643758032 **[Test build #124003 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124003/testReport)** for PR 25965 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #24173: URL: https://github.com/apache/spark/pull/24173#issuecomment-643729498 **[Test build #124004 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124004/testReport)** for PR 24173 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28422: URL: https://github.com/apache/spark/pull/28422#issuecomment-643758026 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28422: URL: https://github.com/apache/spark/pull/28422#issuecomment-643758026 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #28816: [SPARK-31986][SQL] Fix Julian-Gregorian micros rebasing of overlapping local timestamps

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #28816: URL: https://github.com/apache/spark/pull/28816#issuecomment-643751654 **[Test build #124006 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124006/testReport)** for PR 28816 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28816: [SPARK-31986][SQL] Fix Julian-Gregorian micros rebasing of overlapping local timestamps

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28816: URL: https://github.com/apache/spark/pull/28816#issuecomment-643784950 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28816: [SPARK-31986][SQL] Fix Julian-Gregorian micros rebasing of overlapping local timestamps

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28816: URL: https://github.com/apache/spark/pull/28816#issuecomment-643784950 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #28819: [SPARK-31980][SQL]Function sequence() fails if start and end of range are equal dates

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #28819: URL: https://github.com/apache/spark/pull/28819#issuecomment-643758259 **[Test build #124007 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124007/testReport)** for PR 28819 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28819: [SPARK-31980][SQL]Function sequence() fails if start and end of range are equal dates

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28819: URL: https://github.com/apache/spark/pull/28819#issuecomment-643791390 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r439853387 ## File path: core/src/main/scala/org/apache/spark/storage/BlockManagerDecommissioner.scala ## @@ -0,0 +1,281 @@ +/* + * Licensed to the Apache

[GitHub] [spark] MaxGekk commented on pull request #28816: [SPARK-31986][SQL] Fix Julian-Gregorian micros rebasing of overlapping local timestamps

2020-06-14 Thread GitBox
MaxGekk commented on pull request #28816: URL: https://github.com/apache/spark/pull/28816#issuecomment-643804818 @cloud-fan @HyukjinKwon Please, review the fix.

[GitHub] [spark] MaxGekk commented on pull request #28824: [SPARK-31984][SQL] Make micros rebasing functions via local timestamps pure

2020-06-14 Thread GitBox
MaxGekk commented on pull request #28824: URL: https://github.com/apache/spark/pull/28824#issuecomment-643804694 @cloud-fan @HyukjinKwon Please, review this PR.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28827: [SPARK-31989][SQL] Generate JSON rebasing files w/ 30 minutes step

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28827: URL: https://github.com/apache/spark/pull/28827#issuecomment-643804601 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28827: [SPARK-31989][SQL] Generate JSON rebasing files w/ 30 minutes step

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28827: URL: https://github.com/apache/spark/pull/28827#issuecomment-643806345 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] holdenk commented on a change in pull request #28817: [WIP][SPARK-31197][CORE] Exit the executor once all tasks and migrations are finished built on top of spark20629

2020-06-14 Thread GitBox
holdenk commented on a change in pull request #28817: URL: https://github.com/apache/spark/pull/28817#discussion_r439857534 ## File path: core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala ## @@ -258,26 +262,60 @@ private[spark] class

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28363: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28363: URL: https://github.com/apache/spark/pull/28363#issuecomment-643758404 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #26935: URL: https://github.com/apache/spark/pull/26935#issuecomment-643758109 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #25965: URL: https://github.com/apache/spark/pull/25965#issuecomment-643758286 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28819: [SPARK-31980][SQL]Function sequence() fails if start and end of range are equal dates

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28819: URL: https://github.com/apache/spark/pull/28819#issuecomment-643758444 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #28363: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28363: URL: https://github.com/apache/spark/pull/28363#issuecomment-643758404 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #25965: URL: https://github.com/apache/spark/pull/25965#issuecomment-643758286 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #24173: URL: https://github.com/apache/spark/pull/24173#issuecomment-643758210 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #28819: [SPARK-31980][SQL]Function sequence() fails if start and end of range are equal dates

2020-06-14 Thread GitBox
SparkQA commented on pull request #28819: URL: https://github.com/apache/spark/pull/28819#issuecomment-643758259 **[Test build #124007 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124007/testReport)** for PR 28819 at commit

[GitHub] [spark] guykhazma opened a new pull request #28826: [SPARK-31988] - schema pruning may discard attribute metadata

2020-06-14 Thread GitBox
guykhazma opened a new pull request #28826: URL: https://github.com/apache/spark/pull/28826 ### What changes were proposed in this pull request? Fixing the `getRootFields` function to preserve attribute metadata ### Why are the changes needed? This can lead to a

[GitHub] [spark] dongjoon-hyun commented on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-06-14 Thread GitBox
dongjoon-hyun commented on pull request #28708: URL: https://github.com/apache/spark/pull/28708#issuecomment-643797896 The K8s integration test failure is irrelevant to this PR. ``` - Run SparkPi with no resources *** FAILED *** ```

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r439851831 ## File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ## @@ -121,12 +121,28 @@ private class ShuffleStatus(numPartitions: Int)

[GitHub] [spark] MaxGekk opened a new pull request #28827: [SPARK-31989][SQL] Generate JSON rebasing files w/ 30 minutes step

2020-06-14 Thread GitBox
MaxGekk opened a new pull request #28827: URL: https://github.com/apache/spark/pull/28827 ### What changes were proposed in this pull request? 1. Change the max step from 1 week to 30 minutes in the tests `RebaseDateTimeSuite`.`generate 'gregorian-julian-rebase-micros.json'` and

[GitHub] [spark] SparkQA removed a comment on pull request #28827: [SPARK-31989][SQL] Generate JSON rebasing files w/ 30 minutes step

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #28827: URL: https://github.com/apache/spark/pull/28827#issuecomment-643804463 **[Test build #124010 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124010/testReport)** for PR 28827 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28827: [SPARK-31989][SQL] Generate JSON rebasing files w/ 30 minutes step

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28827: URL: https://github.com/apache/spark/pull/28827#issuecomment-643806342 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins commented on pull request #28827: [SPARK-31989][SQL] Generate JSON rebasing files w/ 30 minutes step

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28827: URL: https://github.com/apache/spark/pull/28827#issuecomment-643806342 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #28827: [SPARK-31989][SQL] Generate JSON rebasing files w/ 30 minutes step

2020-06-14 Thread GitBox
SparkQA commented on pull request #28827: URL: https://github.com/apache/spark/pull/28827#issuecomment-643806320 **[Test build #124010 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124010/testReport)** for PR 28827 at commit

[GitHub] [spark] SparkQA commented on pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-14 Thread GitBox
SparkQA commented on pull request #28685: URL: https://github.com/apache/spark/pull/28685#issuecomment-643767795 **[Test build #124008 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124008/testReport)** for PR 28685 at commit

[GitHub] [spark] SparkQA commented on pull request #28825: [SPARK-31950][SQL][FOLLOW-UP][MINOR] Better error message on SPARK_HOME or…

2020-06-14 Thread GitBox
SparkQA commented on pull request #28825: URL: https://github.com/apache/spark/pull/28825#issuecomment-643782599 **[Test build #124005 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124005/testReport)** for PR 28825 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #28825: [SPARK-31950][SQL][FOLLOW-UP][MINOR] Better error message on SPARK_HOME or…

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #28825: URL: https://github.com/apache/spark/pull/28825#issuecomment-643750910 **[Test build #124005 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124005/testReport)** for PR 28825 at commit

[GitHub] [spark] MaxGekk commented on a change in pull request #28821: [SPARK-31981][SQL] Keep TimestampType when taking an average of a Timestamp

2020-06-14 Thread GitBox
MaxGekk commented on a change in pull request #28821: URL: https://github.com/apache/spark/pull/28821#discussion_r439842079 ## File path: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestData.scala ## @@ -73,6 +74,17 @@ private[sql] trait SQLTestData { self => df

[GitHub] [spark] AmplabJenkins commented on pull request #24525: [SPARK-27633][SQL] Remove redundant aliases in NestedColumnAliasing

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #24525: URL: https://github.com/apache/spark/pull/24525#issuecomment-643786152 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #24525: [SPARK-27633][SQL] Remove redundant aliases in NestedColumnAliasing

2020-06-14 Thread GitBox
SparkQA commented on pull request #24525: URL: https://github.com/apache/spark/pull/24525#issuecomment-643786010 **[Test build #124009 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124009/testReport)** for PR 24525 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28826: [SPARK-31988] - schema pruning may discard attribute metadata

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28826: URL: https://github.com/apache/spark/pull/28826#issuecomment-643786098 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #24525: [SPARK-27633][SQL] Remove redundant aliases in NestedColumnAliasing

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #24525: URL: https://github.com/apache/spark/pull/24525#issuecomment-643786152 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28826: [SPARK-31988] - schema pruning may discard attribute metadata

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28826: URL: https://github.com/apache/spark/pull/28826#issuecomment-643785951 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28819: [SPARK-31980][SQL]Function sequence() fails if start and end of range are equal dates

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28819: URL: https://github.com/apache/spark/pull/28819#issuecomment-643791390 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r439852420 ## File path: core/src/main/scala/org/apache/spark/internal/config/package.scala ## @@ -420,6 +420,21 @@ package object config {

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r439852892 ## File path: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ## @@ -650,6 +658,23 @@ private[spark] class BlockManager(

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r439852796 ## File path: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ## @@ -650,6 +658,23 @@ private[spark] class BlockManager(

[GitHub] [spark] SparkQA removed a comment on pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #28685: URL: https://github.com/apache/spark/pull/28685#issuecomment-643767795 **[Test build #124008 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124008/testReport)** for PR 28685 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28685: URL: https://github.com/apache/spark/pull/28685#issuecomment-643800833 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28685: URL: https://github.com/apache/spark/pull/28685#issuecomment-643800833 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-06-14 Thread GitBox
SparkQA commented on pull request #24173: URL: https://github.com/apache/spark/pull/24173#issuecomment-643757996 **[Test build #124004 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124004/testReport)** for PR 24173 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #26935: URL: https://github.com/apache/spark/pull/26935#issuecomment-643729490 **[Test build #124002 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124002/testReport)** for PR 26935 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28819: [SPARK-31980][SQL]Function sequence() fails if start and end of range are equal dates

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28819: URL: https://github.com/apache/spark/pull/28819#issuecomment-643758444 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28826: [SPARK-31988] - schema pruning may discard attribute metadata

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28826: URL: https://github.com/apache/spark/pull/28826#issuecomment-643785951 Can one of the admins verify this patch?

[GitHub] [spark] SparkQA commented on pull request #28363: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2020-06-14 Thread GitBox
SparkQA commented on pull request #28363: URL: https://github.com/apache/spark/pull/28363#issuecomment-643758171 **[Test build #123999 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123999/testReport)** for PR 28363 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #24173: URL: https://github.com/apache/spark/pull/24173#issuecomment-643758210 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #25965: URL: https://github.com/apache/spark/pull/25965#issuecomment-643729499 **[Test build #124003 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124003/testReport)** for PR 25965 at commit

[GitHub] [spark] HyukjinKwon commented on pull request #28812: [SPARK-31977][SQL] Returns the plan directly from NestedColumnAliasing

2020-06-14 Thread GitBox
HyukjinKwon commented on pull request #28812: URL: https://github.com/apache/spark/pull/28812#issuecomment-643758126 Thanks @viirya and @maropu!

[GitHub] [spark] AmplabJenkins commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #26935: URL: https://github.com/apache/spark/pull/26935#issuecomment-643758109 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28685: URL: https://github.com/apache/spark/pull/28685#issuecomment-643767939 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28685: URL: https://github.com/apache/spark/pull/28685#issuecomment-643767939 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r439851900 ## File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ## @@ -479,6 +497,15 @@ private[spark] class MapOutputTrackerMaster(

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r439851962 ## File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ## @@ -775,7 +802,12 @@ private[spark] class MapOutputTrackerMaster(

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r439853174 ## File path: core/src/main/scala/org/apache/spark/storage/BlockManagerDecommissioner.scala ## @@ -0,0 +1,281 @@ +/* + * Licensed to the Apache

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r439853227 ## File path: core/src/main/scala/org/apache/spark/storage/BlockManagerDecommissioner.scala ## @@ -0,0 +1,281 @@ +/* + * Licensed to the Apache

[GitHub] [spark] zsxwing commented on a change in pull request #28607: [SPARK-24634][SS] Add a new metric regarding number of inputs later than watermark plus allowed delay

2020-06-14 Thread GitBox
zsxwing commented on a change in pull request #28607: URL: https://github.com/apache/spark/pull/28607#discussion_r439853232 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala ## @@ -77,6 +77,8 @@ trait StateStoreWriter

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28825: [SPARK-31950][SQL][FOLLOW-UP][MINOR] Better error message on SPARK_HOME or…

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28825: URL: https://github.com/apache/spark/pull/28825#issuecomment-643782881 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28825: [SPARK-31950][SQL][FOLLOW-UP][MINOR] Better error message on SPARK_HOME or…

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28825: URL: https://github.com/apache/spark/pull/28825#issuecomment-643782881 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #28819: [SPARK-31980][SQL]Function sequence() fails if start and end of range are equal dates

2020-06-14 Thread GitBox
SparkQA commented on pull request #28819: URL: https://github.com/apache/spark/pull/28819#issuecomment-643791166 **[Test build #124007 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124007/testReport)** for PR 28819 at commit

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r439852124 ## File path: core/src/main/scala/org/apache/spark/internal/config/package.scala ## @@ -420,6 +420,21 @@ package object config {

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r439852162 ## File path: core/src/main/scala/org/apache/spark/internal/config/package.scala ## @@ -420,6 +420,21 @@ package object config {

[GitHub] [spark] SparkQA commented on pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-14 Thread GitBox
SparkQA commented on pull request #28685: URL: https://github.com/apache/spark/pull/28685#issuecomment-643800607 **[Test build #124008 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124008/testReport)** for PR 28685 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-06-14 Thread GitBox
cloud-fan commented on a change in pull request #28593: URL: https://github.com/apache/spark/pull/28593#discussion_r439904191 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala ## @@ -552,28 +553,34 @@ class HiveQuerySuite extends

[GitHub] [spark] cloud-fan commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-06-14 Thread GitBox
cloud-fan commented on pull request #28593: URL: https://github.com/apache/spark/pull/28593#issuecomment-643873707 Why are there empty golden files generated in `sql/hive/src/test/resources/golden`?

[GitHub] [spark] HeartSaVioR commented on a change in pull request #28830: [SPARK-31990][SQL][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
HeartSaVioR commented on a change in pull request #28830: URL: https://github.com/apache/spark/pull/28830#discussion_r439905543 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -2541,7 +2542,20 @@ class Dataset[T] private[sql]( def

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28830: URL: https://github.com/apache/spark/pull/28830#discussion_r439907052 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -2541,7 +2542,20 @@ class Dataset[T] private[sql]( def

[GitHub] [spark] AmplabJenkins commented on pull request #27604: [SPARK-30849][CORE][SHUFFLE]Fix application failed due to failed to get MapStatuses broadcast block

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #27604: URL: https://github.com/apache/spark/pull/27604#issuecomment-643877494 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27604: [SPARK-30849][CORE][SHUFFLE]Fix application failed due to failed to get MapStatuses broadcast block

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #27604: URL: https://github.com/apache/spark/pull/27604#issuecomment-643877494 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] maropu commented on pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
maropu commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643879180 okay, I'll revert that part in this PR first.

[GitHub] [spark] HeartSaVioR edited a comment on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-06-14 Thread GitBox
HeartSaVioR edited a comment on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-643878976 I’m sorry, but version 4 doesn’t leverage UnsafeRow. (Version 2 did.) Please read the description carefully. As I commented earlier there’re still lots of

[GitHub] [spark] gatorsmile commented on pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
gatorsmile commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643879059 Yes. I prefer reverting the original fix in 3.0.1 and then discussing how to solve/avoid the problems in a proper way.

[GitHub] [spark] HeartSaVioR edited a comment on pull request #28830: [SPARK-31990][SS] Use toSet.toSeq in Dataset.dropDuplicates

2020-06-14 Thread GitBox
HeartSaVioR edited a comment on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-64318 How do we plan to consolidate both? How will we write the JIRA title/description and PR title/description? What is the type of the consolidated issue? Is the consolidated

[GitHub] [spark] huaxingao commented on pull request #28710: [SPARK-31893][ML] Add a generic ClassificationSummary trait

2020-06-14 Thread GitBox
huaxingao commented on pull request #28710: URL: https://github.com/apache/spark/pull/28710#issuecomment-643896578 retest this please This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] maropu commented on a change in pull request #28807: [SPARK-26905][SQL] Follow the SQL:2016 reserved keywords

2020-06-14 Thread GitBox
maropu commented on a change in pull request #28807: URL: https://github.com/apache/spark/pull/28807#discussion_r439927771 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala ## @@ -388,12 +396,24 @@ class

[GitHub] [spark] SparkQA commented on pull request #28807: [SPARK-26905][SQL] Follow the SQL:2016 reserved keywords

2020-06-14 Thread GitBox
SparkQA commented on pull request #28807: URL: https://github.com/apache/spark/pull/28807#issuecomment-643899210 **[Test build #124031 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124031/testReport)** for PR 28807 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28807: [SPARK-26905][SQL] Follow the SQL:2016 reserved keywords

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28807: URL: https://github.com/apache/spark/pull/28807#issuecomment-643899506 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28807: [SPARK-26905][SQL] Follow the SQL:2016 reserved keywords

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28807: URL: https://github.com/apache/spark/pull/28807#issuecomment-643899506 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28821: [SPARK-31981][SQL] Keep TimestampType when taking an average of a Timestamp

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28821: URL: https://github.com/apache/spark/pull/28821#issuecomment-643904439 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] SparkQA removed a comment on pull request #28821: [SPARK-31981][SQL] Keep TimestampType when taking an average of a Timestamp

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #28821: URL: https://github.com/apache/spark/pull/28821#issuecomment-643865812 **[Test build #124024 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124024/testReport)** for PR 28821 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28821: [SPARK-31981][SQL] Keep TimestampType when taking an average of a Timestamp

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28821: URL: https://github.com/apache/spark/pull/28821#issuecomment-643904434 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] AmplabJenkins commented on pull request #28821: [SPARK-31981][SQL] Keep TimestampType when taking an average of a Timestamp

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28821: URL: https://github.com/apache/spark/pull/28821#issuecomment-643904434 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] xuanyuanking edited a comment on pull request #28707: [SPARK-31894][SS] Introduce UnsafeRow format validation for streaming state store

2020-06-14 Thread GitBox
xuanyuanking edited a comment on pull request #28707: URL: https://github.com/apache/spark/pull/28707#issuecomment-643916110 cc @maropu @gatorsmile @HeartSaVioR @dongjoon-hyun A new regression bug SPARK-31990 was found when investigating the test failure

[GitHub] [spark] maropu commented on pull request #28830: [SPARK-31990][SS] Use toSet.toSeq in Dataset.dropDuplicates

2020-06-14 Thread GitBox
maropu commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643885408 > Thanks for the quick fix @maropu! I think maybe we can simplify the bugfix by combining it together with #28707. WDYT? I'll also reference this PR with #28707.

[GitHub] [spark] SparkQA commented on pull request #28786: [SPARK-31925][ML] Summary.totalIterations greater than maxIters

2020-06-14 Thread GitBox
SparkQA commented on pull request #28786: URL: https://github.com/apache/spark/pull/28786#issuecomment-643885633 **[Test build #124025 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124025/testReport)** for PR 28786 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28786: [SPARK-31925][ML] Summary.totalIterations greater than maxIters

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28786: URL: https://github.com/apache/spark/pull/28786#issuecomment-643885908 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28786: [SPARK-31925][ML] Summary.totalIterations greater than maxIters

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28786: URL: https://github.com/apache/spark/pull/28786#issuecomment-643885908 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #28786: [SPARK-31925][ML] Summary.totalIterations greater than maxIters

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #28786: URL: https://github.com/apache/spark/pull/28786#issuecomment-643867351 **[Test build #124025 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124025/testReport)** for PR 28786 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28593: URL: https://github.com/apache/spark/pull/28593#issuecomment-643892810 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-06-14 Thread GitBox
SparkQA commented on pull request #28593: URL: https://github.com/apache/spark/pull/28593#issuecomment-643892530 **[Test build #124029 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124029/testReport)** for PR 28593 at commit
