[GitHub] [spark] SaurabhChawla100 commented on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-14 Thread GitBox
SaurabhChawla100 commented on pull request #29045: URL: https://github.com/apache/spark/pull/29045#issuecomment-658561486 Retest this please This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29118: [SPARK-32318][SQL][TESTS] Add a test case to EliminateSortsSuite for ORDER BY in DISTRIBUTE BY

2020-07-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #29118: URL: https://github.com/apache/spark/pull/29118#issuecomment-658560629 Also, cc @cloud-fan , @HyukjinKwon , @maropu

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29118: [SPARK-32318][SQL][TESTS] Add a test case to EliminateSortsSuite for ORDER BY in DISTRIBUTE BY

2020-07-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #29118: URL: https://github.com/apache/spark/pull/29118#issuecomment-658560339 Could you review this, @viirya ? This will protect us from future regressions. This part is tricky.

[GitHub] [spark] dongjoon-hyun commented on pull request #29118: [SPARK-32318][SQL][TESTS] Add a test case to EliminateSortsSuite for ORDER BY in DISTRIBUTE BY

2020-07-14 Thread GitBox
dongjoon-hyun commented on pull request #29118: URL: https://github.com/apache/spark/pull/29118#issuecomment-658560629 Also, cc @cloud-fan and @HyukjinKwon .

[GitHub] [spark] dongjoon-hyun commented on pull request #29118: [SPARK-32318][SQL][TESTS] Add a test case to EliminateSortsSuite for ORDER BY in DISTRIBUTE BY

2020-07-14 Thread GitBox
dongjoon-hyun commented on pull request #29118: URL: https://github.com/apache/spark/pull/29118#issuecomment-658560339 Could you review this, @viirya ?

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658559706 The biggest factor is the file format rather than the Spark side. For instance, in the example above, ORC files are small because ORC supports a special encoding when the

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658559706 No~ It depends on the file format rather than the Spark side. For instance, in the example above, ORC files are small because ORC supports a special encoding when the

[GitHub] [spark] dongjoon-hyun commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
dongjoon-hyun commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658559706 No~ It depends on the file format rather than the Spark side. For instance, in the example above, ORC files are small because ORC supports a special encoding when the data is

[GitHub] [spark] dongjoon-hyun commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
dongjoon-hyun commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658558813 I made a PR to add test coverage for the above case. - https://github.com/apache/spark/pull/29118

[GitHub] [spark] viirya commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
viirya commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658558946 Oh, this is interesting. I know removing `Sort` before `Repartition` will result in different data distribution because `Repartition` uses `RoundRobinPartitioning`. Because I
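
As a rough illustration of the point in this comment, the sketch below deals rows to partitions in round-robin order, first from unsorted input and then from the same rows sorted. It is a simplified model and not Spark's `RoundRobinPartitioning` implementation; the helper `round_robin` and the sample rows are hypothetical.

```python
def round_robin(rows, num_partitions):
    """Deal rows to partitions in turn: row i goes to partition i % num_partitions."""
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions


# Hypothetical sample rows; only their arrival order differs between the two calls.
rows = [(3, "c"), (1, "a"), (4, "d"), (2, "b")]

unsorted_parts = round_robin(rows, 2)
sorted_parts = round_robin(sorted(rows), 2)
```

In this model the two calls place different rows in each partition, which suggests why dropping a sort ahead of a round-robin repartition can change where rows land.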

[GitHub] [spark] dongjoon-hyun opened a new pull request #29118: [SPARK-32318][SQL][TESTS] Add a test case to EliminateSortsSuite for ORDER BY in DISTRIBUTE BY

2020-07-14 Thread GitBox
dongjoon-hyun opened a new pull request #29118: URL: https://github.com/apache/spark/pull/29118 ### What changes were proposed in this pull request? This PR aims to add a test case to EliminateSortsSuite to protect a valid use case: using ORDER BY in a DISTRIBUTE BY statement.

[GitHub] [spark] viirya commented on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
viirya commented on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658556814 Did you read the two links above? The current approach is repeated random sub-sampling validation; this PR changes it to k-fold cross-validation.

[GitHub] [spark] viirya edited a comment on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
viirya edited a comment on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658556814 Did you read the two links above? The current approach is repeated random sub-sampling validation; this PR changes it to k-fold cross-validation.
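
For readers following the distinction drawn in this comment, here is a minimal, Spark-independent sketch of the two validation schemes. The function names `k_fold_splits` and `random_subsampling_splits` are hypothetical and are not part of Spark's `CrossValidator` or `MLUtils`; the sketch shows the general techniques only and makes no claim about what the current code or the PR actually does.

```python
import random


def k_fold_splits(n_rows, k, seed=0):
    """k-fold: shuffle once, cut into k disjoint folds, validate on each fold exactly once."""
    rng = random.Random(seed)
    idx = list(range(n_rows))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        validation = folds[i]
        training = [r for j, fold in enumerate(folds) if j != i for r in fold]
        splits.append((training, validation))
    return splits


def random_subsampling_splits(n_rows, rounds, validation_fraction, seed=0):
    """Repeated random sub-sampling: draw a fresh validation sample in every round."""
    rng = random.Random(seed)
    n_val = max(1, int(n_rows * validation_fraction))
    splits = []
    for _ in range(rounds):
        validation = rng.sample(range(n_rows), n_val)
        held_out = set(validation)
        training = [r for r in range(n_rows) if r not in held_out]
        splits.append((training, validation))
    return splits
```

In the k-fold version every row is validated exactly once across the k splits. In the sub-sampling version the validation sets are drawn independently, so a row may be validated several times or never.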

[GitHub] [spark] SparkQA commented on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-07-14 Thread GitBox
SparkQA commented on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-658555806 **[Test build #125876 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125876/testReport)** for PR 27694 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-07-14 Thread GitBox
SparkQA removed a comment on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-658519508 **[Test build #125876 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125876/testReport)** for PR 27694 at commit

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #28931: [SPARK-32103][CORE] Support IPv6 host/port in core module

2020-07-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #28931: URL: https://github.com/apache/spark/pull/28931#issuecomment-658553220 Hi, @gatorsmile . Technically, this only handles `host/port` parsing inside `core` module. I'm sure that this is a meaningful step inside Spark. However, we didn't

[GitHub] [spark] dongjoon-hyun commented on pull request #28931: [SPARK-32103][CORE] Support IPv6 host/port in core module

2020-07-14 Thread GitBox
dongjoon-hyun commented on pull request #28931: URL: https://github.com/apache/spark/pull/28931#issuecomment-658553220 Hi, @gatorsmile . Technically, this handles `host/port` parsing inside the `core` module only. I'm sure that this is a meaningful step inside Spark. However, we didn't

[GitHub] [spark] adjordan edited a comment on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
adjordan edited a comment on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658547236 @viirya Sorry, can you explain? I don't see how it changes the technique; it just allows models from multiple folds to be run in parallel. `MLUtils.kFold` is doing

[GitHub] [spark] dongjoon-hyun commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
dongjoon-hyun commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658550248 Very sorry, guys. Due to the above regression, I'll revert this commit urgently. We can rethink this PR.

[GitHub] [spark] maropu commented on a change in pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-14 Thread GitBox
maropu commented on a change in pull request #29085: URL: https://github.com/apache/spark/pull/29085#discussion_r454795948 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkScriptTransformationExec.scala ## @@ -0,0 +1,187 @@ +/* + * Licensed to the

[GitHub] [spark] dongjoon-hyun commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
dongjoon-hyun commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658549984 **AFTER SPARK-32276** ``` scala> scala.util.Random.shuffle((1 to 10).map(x => (x % 2, x))).toDF("a", "b").repartition(2).createOrReplaceTempView("t")

[GitHub] [spark] maropu commented on a change in pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-14 Thread GitBox
maropu commented on a change in pull request #29085: URL: https://github.com/apache/spark/pull/29085#discussion_r454780673 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/BaseScriptTransformationExec.scala ## @@ -87,17 +90,60 @@ trait

[GitHub] [spark] adjordan edited a comment on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
adjordan edited a comment on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658547236 @viirya Sorry, can you explain? I don't see how it changes the technique; it just allows models from multiple folds to be run in parallel.

[GitHub] [spark] adjordan commented on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
adjordan commented on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658547236 @viirya Sorry, can you explain? I don't see how it changes anything; it just allows models from multiple folds to be run in parallel.

[GitHub] [spark] srowen commented on a change in pull request #29111: [SPARK-29292][SQL][ML] Update rest of default modules (Hive, ML, etc) for Scala 2.13 compilation

2020-07-14 Thread GitBox
srowen commented on a change in pull request #29111: URL: https://github.com/apache/spark/pull/29111#discussion_r454792607 ## File path: mllib/src/main/scala/org/apache/spark/ml/Estimator.scala ## @@ -76,7 +76,7 @@ abstract class Estimator[M <: Model[M]] extends PipelineStage

[GitHub] [spark] srowen commented on pull request #29111: [SPARK-29292][SQL][ML] Update rest of default modules (Hive, ML, etc) for Scala 2.13 compilation

2020-07-14 Thread GitBox
srowen commented on pull request #29111: URL: https://github.com/apache/spark/pull/29111#issuecomment-658546568 I think I understand the last test failures, will fix too.

[GitHub] [spark] MaxGekk commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-14 Thread GitBox
MaxGekk commented on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-658546141 @cloud-fan Is there anything else I should do for this PR to be merged?

[GitHub] [spark] stczwd commented on a change in pull request #29088: [SPARK-32289][SQL] Some characters are garbled when opening csv files with Excel

2020-07-14 Thread GitBox
stczwd commented on a change in pull request #29088: URL: https://github.com/apache/spark/pull/29088#discussion_r454791986 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CsvOutputWriter.scala ## @@ -39,6 +39,10 @@ class CsvOutputWriter(

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658544475 To generate small final Parquet/ORC files, we do the above tricks, don't we? This may cause a regression on the size of output storage.

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658544475 To generate small final Parquet/ORC files, we do the above tricks, don't we? This PR may cause a regression on the size of output storage.

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658544475 To generate small final Parquet/ORC files, we do the above tricks, don't we?

[GitHub] [spark] dongjoon-hyun commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
dongjoon-hyun commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658544475 To generate small Parquet/ORC files, we do the above tricks, don't we?

[GitHub] [spark] warrenzhu25 edited a comment on pull request #29044: [WIP][SPARK-32227] Fix regression bug in load-spark-env.cmd with Spark 3.0.0

2020-07-14 Thread GitBox
warrenzhu25 edited a comment on pull request #29044: URL: https://github.com/apache/spark/pull/29044#issuecomment-656771107 > It's directly relevant to this PR because your patch is changing `environment` variable. > > * Please see this for the detail

[GitHub] [spark] dongjoon-hyun commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
dongjoon-hyun commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658543717 Oops. Sorry, guys. It seems that I missed something during testing. For the following case, we should not remove `Sort`. **BEFORE THIS PR** ```scala scala>

[GitHub] [spark] warrenzhu25 commented on pull request #28942: [SPARK-32125][UI] Support get taskList by status in Web UI and SHS Rest API

2020-07-14 Thread GitBox
warrenzhu25 commented on pull request #28942: URL: https://github.com/apache/spark/pull/28942#issuecomment-658543670 @gengliangwang Tests passed, could you help merge this?

[GitHub] [spark] HyukjinKwon opened a new pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-14 Thread GitBox
HyukjinKwon opened a new pull request #29117: URL: https://github.com/apache/spark/pull/29117 ### What changes were proposed in this pull request? TBD ### Why are the changes needed? TBD ### Does this PR introduce _any_ user-facing change? TBD ###

[GitHub] [spark] HeartSaVioR closed pull request #29077: [SPARK-31985][SS] Remove incomplete/undocumented stateful aggregation in continuous mode

2020-07-14 Thread GitBox
HeartSaVioR closed pull request #29077: URL: https://github.com/apache/spark/pull/29077

[GitHub] [spark] HeartSaVioR commented on pull request #29077: [SPARK-31985][SS] Remove incomplete/undocumented stateful aggregation in continuous mode

2020-07-14 Thread GitBox
HeartSaVioR commented on pull request #29077: URL: https://github.com/apache/spark/pull/29077#issuecomment-658539797 Thanks for the review and kind words :) I'll deal with merging.

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29111: [SPARK-29292][SQL][ML] Update rest of default modules (Hive, ML, etc) for Scala 2.13 compilation

2020-07-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #29111: URL: https://github.com/apache/spark/pull/29111#discussion_r454784921 ## File path: mllib/src/main/scala/org/apache/spark/ml/Estimator.scala ## @@ -76,7 +76,7 @@ abstract class Estimator[M <: Model[M]] extends

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29111: [SPARK-29292][SQL][ML] Update rest of default modules (Hive, ML, etc) for Scala 2.13 compilation

2020-07-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #29111: URL: https://github.com/apache/spark/pull/29111#discussion_r454784282 ## File path: examples/src/main/scala/org/apache/spark/examples/SparkKMeans.scala ## @@ -102,5 +102,10 @@ object SparkKMeans {

[GitHub] [spark] aokolnychyi commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
aokolnychyi commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658538432 Thanks, everyone!

[GitHub] [spark] dongjoon-hyun commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
dongjoon-hyun commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658538140 Also, cc @gatorsmile and @cloud-fan

[GitHub] [spark] SparkQA removed a comment on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
SparkQA removed a comment on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658519469 **[Test build #125874 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125874/testReport)** for PR 29080 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658536762 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
AmplabJenkins commented on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658537135

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658537135 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658537137 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658536619

[GitHub] [spark] SparkQA commented on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
SparkQA commented on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658536994 **[Test build #125874 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125874/testReport)** for PR 29080 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
AmplabJenkins commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658536613

[GitHub] [spark] AmplabJenkins commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
AmplabJenkins commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658536758

[GitHub] [spark] SparkQA removed a comment on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
SparkQA removed a comment on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658491516 **[Test build #125865 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125865/testReport)** for PR 29114 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
AmplabJenkins commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658536691

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658536613 Merged build finished. Test FAILed.

[GitHub] [spark] SparkQA commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
SparkQA commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658536417 **[Test build #125865 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125865/testReport)** for PR 29114 at commit

[GitHub] [spark] dongjoon-hyun closed pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-14 Thread GitBox
dongjoon-hyun closed pull request #29089: URL: https://github.com/apache/spark/pull/29089

[GitHub] [spark] SparkQA commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
SparkQA commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658535423 **[Test build #125878 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125878/testReport)** for PR 29114 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28708: URL: https://github.com/apache/spark/pull/28708#issuecomment-658534819 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-14 Thread GitBox
AmplabJenkins commented on pull request #28708: URL: https://github.com/apache/spark/pull/28708#issuecomment-658534813

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28708: URL: https://github.com/apache/spark/pull/28708#issuecomment-658534813 Merged build finished. Test FAILed.

[GitHub] [spark] SparkQA removed a comment on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-14 Thread GitBox
SparkQA removed a comment on pull request #28708: URL: https://github.com/apache/spark/pull/28708#issuecomment-658493500 **[Test build #125867 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125867/testReport)** for PR 28708 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28708: URL: https://github.com/apache/spark/pull/28708#issuecomment-658503907 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] SparkQA commented on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-14 Thread GitBox
SparkQA commented on pull request #28708: URL: https://github.com/apache/spark/pull/28708#issuecomment-658534225 **[Test build #125867 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125867/testReport)** for PR 28708 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658533895

[GitHub] [spark] AmplabJenkins commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
AmplabJenkins commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658533895

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28848: [SPARK-32003][CORE] When external shuffle service is used, unregister outputs for executor on fetch failure after executor is l

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28848: URL: https://github.com/apache/spark/pull/28848#issuecomment-658533186 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] HyukjinKwon commented on pull request #29116: [SPARK-32316][TESTS][INFRA] Test PySpark with Python 3.8 in Github Actions

2020-07-14 Thread GitBox
HyukjinKwon commented on pull request #29116: URL: https://github.com/apache/spark/pull/29116#issuecomment-658533425 Thanks, @dongjoon-hyun

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28848: [SPARK-32003][CORE] When external shuffle service is used, unregister outputs for executor on fetch failure after executor is l

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28848: URL: https://github.com/apache/spark/pull/28848#issuecomment-658533182 Merged build finished. Test FAILed.

[GitHub] [spark] SparkQA removed a comment on pull request #28848: [SPARK-32003][CORE] When external shuffle service is used, unregister outputs for executor on fetch failure after executor is lost

2020-07-14 Thread GitBox
SparkQA removed a comment on pull request #28848: URL: https://github.com/apache/spark/pull/28848#issuecomment-658485359 **[Test build #125863 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125863/testReport)** for PR 28848 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28848: [SPARK-32003][CORE] When external shuffle service is used, unregister outputs for executor on fetch failure after executor is lost

2020-07-14 Thread GitBox
AmplabJenkins commented on pull request #28848: URL: https://github.com/apache/spark/pull/28848#issuecomment-658533182

[GitHub] [spark] SparkQA commented on pull request #28848: [SPARK-32003][CORE] When external shuffle service is used, unregister outputs for executor on fetch failure after executor is lost

2020-07-14 Thread GitBox
SparkQA commented on pull request #28848: URL: https://github.com/apache/spark/pull/28848#issuecomment-658532861 **[Test build #125863 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125863/testReport)** for PR 28848 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658529664

[GitHub] [spark] AmplabJenkins commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
AmplabJenkins commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658529664

[GitHub] [spark] SparkQA commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-14 Thread GitBox
SparkQA commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658529122 **[Test build #125877 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125877/testReport)** for PR 29114 at commit

[GitHub] [spark] dongjoon-hyun closed pull request #29116: [SPARK-32316][TESTS][INFRA] Test PySpark with Python 3.8 in Github Actions

2020-07-14 Thread GitBox
dongjoon-hyun closed pull request #29116: URL: https://github.com/apache/spark/pull/29116

[GitHub] [spark] maropu commented on pull request #29101: [WIP][SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-14 Thread GitBox
maropu commented on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-658527647 Just a question; if this proposal works well, we don't need the fix, https://github.com/apache/spark/pull/29075 ?

[GitHub] [spark] jose-torres commented on pull request #29077: [SPARK-31985][SS] Remove incomplete/undocumented stateful aggregation in continuous mode

2020-07-14 Thread GitBox
jose-torres commented on pull request #29077: URL: https://github.com/apache/spark/pull/29077#issuecomment-658526205 LGTM. I don't have the repo fully set up on my new computer, so I'll try to find time to set it up and merge tomorrow. (Or you can do it if you want to try out your new

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29111: [SPARK-29292][SQL][ML] Update rest of default modules (Hive, ML, etc) for Scala 2.13 compilation

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29111: URL: https://github.com/apache/spark/pull/29111#issuecomment-658524127 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29111: [SPARK-29292][SQL][ML] Update rest of default modules (Hive, ML, etc) for Scala 2.13 compilation

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29111: URL: https://github.com/apache/spark/pull/29111#issuecomment-658524120 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] AmplabJenkins commented on pull request #29111: [SPARK-29292][SQL][ML] Update rest of default modules (Hive, ML, etc) for Scala 2.13 compilation

2020-07-14 Thread GitBox
AmplabJenkins commented on pull request #29111: URL: https://github.com/apache/spark/pull/29111#issuecomment-658524120 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29111: [SPARK-29292][SQL][ML] Update rest of default modules (Hive, ML, etc) for Scala 2.13 compilation

2020-07-14 Thread GitBox
SparkQA commented on pull request #29111: URL: https://github.com/apache/spark/pull/29111#issuecomment-658523924 **[Test build #125864 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125864/testReport)** for PR 29111 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #29111: [SPARK-29292][SQL][ML] Update rest of default modules (Hive, ML, etc) for Scala 2.13 compilation

2020-07-14 Thread GitBox
SparkQA removed a comment on pull request #29111: URL: https://github.com/apache/spark/pull/29111#issuecomment-658489540 **[Test build #125864 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125864/testReport)** for PR 29111 at commit

[GitHub] [spark] wangyum commented on a change in pull request #29088: [SPARK-32289][SQL] Some characters are garbled when opening csv files with Excel

2020-07-14 Thread GitBox
wangyum commented on a change in pull request #29088: URL: https://github.com/apache/spark/pull/29088#discussion_r454764920 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CsvOutputWriter.scala ## @@ -39,6 +39,10 @@ class CsvOutputWriter(

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29115: [SPARK-32315][ML] Provide an explanation error message when calling require

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29115: URL: https://github.com/apache/spark/pull/29115#issuecomment-658519915 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-07-14 Thread GitBox
AmplabJenkins commented on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-658519988 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-658519932 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29115: [SPARK-32315][ML] Provide an explanation error message when calling require

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29115: URL: https://github.com/apache/spark/pull/29115#issuecomment-658519909 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658519974 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-658519988 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA removed a comment on pull request #29115: [SPARK-32315][ML] Provide an explanation error message when calling require

2020-07-14 Thread GitBox
SparkQA removed a comment on pull request #29115: URL: https://github.com/apache/spark/pull/29115#issuecomment-658519446 **[Test build #125873 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125873/testReport)** for PR 29115 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
AmplabJenkins commented on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658519974 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29115: [SPARK-32315][ML] Provide an explanation error message when calling require

2020-07-14 Thread GitBox
SparkQA commented on pull request #29115: URL: https://github.com/apache/spark/pull/29115#issuecomment-658519894 **[Test build #125873 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125873/testReport)** for PR 29115 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29115: [SPARK-32315][ML] Provide an explanation error message when calling require

2020-07-14 Thread GitBox
AmplabJenkins commented on pull request #29115: URL: https://github.com/apache/spark/pull/29115#issuecomment-658519909 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-14 Thread GitBox
AmplabJenkins commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-658519932 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-07-14 Thread GitBox
SparkQA commented on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-658519508 **[Test build #125876 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125876/testReport)** for PR 27694 at commit

[GitHub] [spark] SparkQA commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-14 Thread GitBox
SparkQA commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-658519488 **[Test build #125875 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125875/testReport)** for PR 28904 at commit

[GitHub] [spark] SparkQA commented on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
SparkQA commented on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658519469 **[Test build #125874 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125874/testReport)** for PR 29080 at commit

[GitHub] [spark] SparkQA commented on pull request #29115: [SPARK-32315][ML] Provide an explanation error message when calling require

2020-07-14 Thread GitBox
SparkQA commented on pull request #29115: URL: https://github.com/apache/spark/pull/29115#issuecomment-658519446 **[Test build #125873 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125873/testReport)** for PR 29115 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel

2020-07-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-657302373 Can one of the admins verify this patch? This is an automated message from the Apache Git
