[GitHub] [spark] SparkQA commented on pull request #28464: [SPARK-31652][ML][PySpark] Add ANOVASelector and FValueSelector to PySpark

2020-05-06 Thread GitBox
SparkQA commented on pull request #28464: URL: https://github.com/apache/spark/pull/28464#issuecomment-624846842 **[Test build #122377 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122377/testReport)** for PR 28464 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #28464: [SPARK-31652][ML][PySpark] Add ANOVASelector and FValueSelector to PySpark

2020-05-06 Thread GitBox
SparkQA removed a comment on pull request #28464: URL: https://github.com/apache/spark/pull/28464#issuecomment-624814768 **[Test build #122377 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122377/testReport)** for PR 28464 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28464: [SPARK-31652][ML][PySpark] Add ANOVASelector and FValueSelector to PySpark

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28464: URL: https://github.com/apache/spark/pull/28464#issuecomment-624847358 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28467: [SPARK-31653][BUILD] Setuptools is needed before installing any other python packages

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28467: URL: https://github.com/apache/spark/pull/28467#issuecomment-624913243 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA removed a comment on pull request #28106: [SPARK-31335][SQL] Add try function support

2020-05-06 Thread GitBox
SparkQA removed a comment on pull request #28106: URL: https://github.com/apache/spark/pull/28106#issuecomment-624700888 **[Test build #122371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122371/testReport)** for PR 28106 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28106: [SPARK-31335][SQL] Add try function support

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28106: URL: https://github.com/apache/spark/pull/28106#issuecomment-624853403 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28370: [SPARK-20732][CORE] Decommission cache blocks to other executors when an executor is decommissioned

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28370: URL: https://github.com/apache/spark/pull/28370#issuecomment-624863594 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] baohe-zhang commented on pull request #28412: [SPARK-31608][CORE][WEBUI] Add a new type of KVStore to make loading UI faster

2020-05-06 Thread GitBox
baohe-zhang commented on pull request #28412: URL: https://github.com/apache/spark/pull/28412#issuecomment-624907362 @tgravescs What I saw in FsHistoryProvider is that "spark.history.fs.numReplayThreads" is used to create a thread pool to mergeApplicationListing and compact. These threads

[GitHub] [spark] holdenk opened a new pull request #28467: [SPARK-31653][BUILD] Setuptools is needed before installing any other python packages

2020-05-06 Thread GitBox
holdenk opened a new pull request #28467: URL: https://github.com/apache/spark/pull/28467 ### What changes were proposed in this pull request? Allow the Docker build to succeed ### Why are the changes needed? The base packages now depend on having setuptools installed

[GitHub] [spark] vinooganesh commented on a change in pull request #28128: [SPARK-31354] SparkSession Lifecycle methods to fix memory leak

2020-05-06 Thread GitBox
vinooganesh commented on a change in pull request #28128: URL: https://github.com/apache/spark/pull/28128#discussion_r421037627 ## File path: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala ## @@ -691,21 +752,56 @@ class SparkSession private( } //

[GitHub] [spark] vinooganesh commented on pull request #28128: [SPARK-31354] SparkSession Lifecycle methods to fix memory leak

2020-05-06 Thread GitBox
vinooganesh commented on pull request #28128: URL: https://github.com/apache/spark/pull/28128#issuecomment-624851677 Hey @cloud-fan - Sure, right now the listener issue is coupled with the operating model for `SparkSession`s (which is where I think the confusion is coming from).

[GitHub] [spark] SparkQA commented on pull request #28370: [SPARK-20732][CORE] Decommission cache blocks to other executors when an executor is decommissioned

2020-05-06 Thread GitBox
SparkQA commented on pull request #28370: URL: https://github.com/apache/spark/pull/28370#issuecomment-624862986 **[Test build #122366 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122366/testReport)** for PR 28370 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624886711 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on pull request #28466: [SPARK-31361][SQL][TESTS][FOLLOWUP] Check non-vectorized Parquet reader while date/timestamp rebasing

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28466: URL: https://github.com/apache/spark/pull/28466#issuecomment-624826963 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28466: [SPARK-31361][SQL][TESTS][FOLLOWUP] Check non-vectorized Parquet reader while date/timestamp rebasing

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28466: URL: https://github.com/apache/spark/pull/28466#issuecomment-624826963 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28370: [SPARK-20732][CORE] Decommission cache blocks to other executors when an executor is decommissioned

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28370: URL: https://github.com/apache/spark/pull/28370#issuecomment-624863583 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] SparkQA removed a comment on pull request #28370: [SPARK-20732][CORE] Decommission cache blocks to other executors when an executor is decommissioned

2020-05-06 Thread GitBox
SparkQA removed a comment on pull request #28370: URL: https://github.com/apache/spark/pull/28370#issuecomment-624649246 **[Test build #122366 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122366/testReport)** for PR 28370 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28370: [SPARK-20732][CORE] Decommission cache blocks to other executors when an executor is decommissioned

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28370: URL: https://github.com/apache/spark/pull/28370#issuecomment-624863583 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624886702 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] AmplabJenkins commented on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624886702 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
SparkQA removed a comment on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624810979 **[Test build #122376 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122376/testReport)** for PR 28442 at commit

[GitHub] [spark] MaxGekk commented on pull request #28466: [SPARK-31361][SQL][TESTS][FOLLOWUP] Check non-vectorized Parquet reader while date/timestamp rebasing

2020-05-06 Thread GitBox
MaxGekk commented on pull request #28466: URL: https://github.com/apache/spark/pull/28466#issuecomment-624823354 @dongjoon-hyun @HyukjinKwon @cloud-fan Please take a look at it if you have time. This is an automated message

[GitHub] [spark] MaxGekk opened a new pull request #28466: [SPARK-31361][SQL][TESTS][FOLLOWUP] Check non-vectorized Parquet reader while date/timestamp rebasing

2020-05-06 Thread GitBox
MaxGekk opened a new pull request #28466: URL: https://github.com/apache/spark/pull/28466 ### What changes were proposed in this pull request? In this PR, I propose to modify two tests of `ParquetIOSuite`: - SPARK-31159: rebasing timestamps in write - SPARK-31159: rebasing dates in

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28464: [SPARK-31652][ML][PySpark] Add ANOVASelector and FValueSelector to PySpark

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28464: URL: https://github.com/apache/spark/pull/28464#issuecomment-624847358 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #28106: [SPARK-31335][SQL] Add try function support

2020-05-06 Thread GitBox
SparkQA commented on pull request #28106: URL: https://github.com/apache/spark/pull/28106#issuecomment-624852524 **[Test build #122371 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122371/testReport)** for PR 28106 at commit

[GitHub] [spark] SparkQA commented on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
SparkQA commented on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624886206 **[Test build #122376 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122376/testReport)** for PR 28442 at commit

[GitHub] [spark] SparkQA commented on pull request #28466: [SPARK-31361][SQL][TESTS][FOLLOWUP] Check non-vectorized Parquet reader while date/timestamp rebasing

2020-05-06 Thread GitBox
SparkQA commented on pull request #28466: URL: https://github.com/apache/spark/pull/28466#issuecomment-624825910 **[Test build #122378 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122378/testReport)** for PR 28466 at commit

[GitHub] [spark] huaxingao commented on pull request #28464: [SPARK-31652][ML][PySpark] Add ANOVASelector and FValueSelector to PySpark

2020-05-06 Thread GitBox
huaxingao commented on pull request #28464: URL: https://github.com/apache/spark/pull/28464#issuecomment-624856985 cc @srowen @zhengruifeng This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] AmplabJenkins commented on pull request #28467: [SPARK-31653][BUILD] Setuptools is needed before installing any other python packages

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28467: URL: https://github.com/apache/spark/pull/28467#issuecomment-624913243 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] holdenk commented on pull request #28467: [SPARK-31653][BUILD] Setuptools is needed before installing any other python packages

2020-05-06 Thread GitBox
holdenk commented on pull request #28467: URL: https://github.com/apache/spark/pull/28467#issuecomment-624912627 Merged to branch-2.4 This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] SparkQA commented on pull request #28467: [SPARK-31653][BUILD] Setuptools is needed before installing any other python packages

2020-05-06 Thread GitBox
SparkQA commented on pull request #28467: URL: https://github.com/apache/spark/pull/28467#issuecomment-624912763 **[Test build #122379 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122379/testReport)** for PR 28467 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28106: [SPARK-31335][SQL] Add try function support

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28106: URL: https://github.com/apache/spark/pull/28106#issuecomment-624853403 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #28468: [SPARK-31365][SQL][Followup] Refine config document

2020-05-06 Thread GitBox
SparkQA commented on pull request #28468: URL: https://github.com/apache/spark/pull/28468#issuecomment-624926619 **[Test build #122380 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122380/testReport)** for PR 28468 at commit

[GitHub] [spark] viirya commented on pull request #28468: [SPARK-31365][SQL][Followup] Refine config document

2020-05-06 Thread GitBox
viirya commented on pull request #28468: URL: https://github.com/apache/spark/pull/28468#issuecomment-624925992 cc @HyukjinKwon This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [spark] viirya opened a new pull request #28468: [SPARK-31365][SQL][Followup] Refine config document

2020-05-06 Thread GitBox
viirya opened a new pull request #28468: URL: https://github.com/apache/spark/pull/28468 ### What changes were proposed in this pull request? This is a followup to address the https://github.com/apache/spark/pull/28366#discussion_r420611872 by refining the SQL config

[GitHub] [spark] SparkQA commented on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
SparkQA commented on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624940140 **[Test build #122381 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122381/testReport)** for PR 28442 at commit

[GitHub] [spark] HeartSaVioR commented on a change in pull request #28336: [SPARK-31559][YARN] Re-obtain tokens at the startup of AM for yarn cluster mode if principal and keytab are available

2020-05-06 Thread GitBox
HeartSaVioR commented on a change in pull request #28336: URL: https://github.com/apache/spark/pull/28336#discussion_r421172010 ## File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ## @@ -863,10 +864,20 @@ object

[GitHub] [spark] HeartSaVioR commented on a change in pull request #28336: [SPARK-31559][YARN] Re-obtain tokens at the startup of AM for yarn cluster mode if principal and keytab are available

2020-05-06 Thread GitBox
HeartSaVioR commented on a change in pull request #28336: URL: https://github.com/apache/spark/pull/28336#discussion_r421175345 ## File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ## @@ -863,10 +864,20 @@ object

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28467: [SPARK-31653][BUILD] Setuptools is needed before installing any other python packages

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28467: URL: https://github.com/apache/spark/pull/28467#issuecomment-624965875 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] HyukjinKwon commented on a change in pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
HyukjinKwon commented on a change in pull request #28442: URL: https://github.com/apache/spark/pull/28442#discussion_r421174937 ## File path: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala ## @@ -131,11 +130,7 @@ class

[GitHub] [spark] AmplabJenkins commented on pull request #28467: [SPARK-31653][BUILD] Setuptools is needed before installing any other python packages

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28467: URL: https://github.com/apache/spark/pull/28467#issuecomment-624965875 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] xwu99 commented on pull request #28229: [SPARK-31454][ML] An optimized K-Means based on DenseMatrix and GEMM

2020-05-06 Thread GitBox
xwu99 commented on pull request #28229: URL: https://github.com/apache/spark/pull/28229#issuecomment-624989871 > @xwu99 There was a [ticket](https://issues.apache.org/jira/browse/SPARK-30641) for this. > Now I have merged high-level BLAS support for

[GitHub] [spark] tianshizz opened a new pull request #28470: [SPARK-29803][Python] remove all instances of 'from __future__ import print_function'

2020-05-06 Thread GitBox
tianshizz opened a new pull request #28470: URL: https://github.com/apache/spark/pull/28470 ### What changes were proposed in this pull request? Removed all remaining instances of `from __future__ import print_function` ### Why are the changes needed? deprecate

[GitHub] [spark] xwu99 commented on pull request #28229: [SPARK-31454][ML] An optimized K-Means based on DenseMatrix and GEMM

2020-05-06 Thread GitBox
xwu99 commented on pull request #28229: URL: https://github.com/apache/spark/pull/28229#issuecomment-624993186 @zhengruifeng By the way, there is [a closed PR for ALS](https://github.com/apache/spark/pull/13891) which was my colleague's work before he left. I would like to rework it. Could you

[GitHub] [spark] SparkQA commented on pull request #28471: [SPARK-30660][ML][PYSPARK] LinearRegression blockify input vectors

2020-05-06 Thread GitBox
SparkQA commented on pull request #28471: URL: https://github.com/apache/spark/pull/28471#issuecomment-625007059 **[Test build #122386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122386/testReport)** for PR 28471 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28471: [SPARK-30660][ML][PYSPARK] LinearRegression blockify input vectors

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28471: URL: https://github.com/apache/spark/pull/28471#issuecomment-625007384 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28472: [SPARK-31655][BUILD] Upgrade snappy-java to 1.1.7.5

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28472: URL: https://github.com/apache/spark/pull/28472#issuecomment-625011208 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28473: [SPARK-31656][ML][PYSPARK] AFT blockify input vectors

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28473: URL: https://github.com/apache/spark/pull/28473#issuecomment-625011172 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28473: [SPARK-31656][ML][PYSPARK] AFT blockify input vectors

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28473: URL: https://github.com/apache/spark/pull/28473#issuecomment-625011172 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #28472: [SPARK-31655][BUILD] Upgrade snappy-java to 1.1.7.5

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28472: URL: https://github.com/apache/spark/pull/28472#issuecomment-625011208 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] Ngone51 commented on pull request #28460: [SPARK-31650][SQL] Fix wrong UI in case of AdaptiveSparkPlanExec has unmanaged subqueries

2020-05-06 Thread GitBox
Ngone51 commented on pull request #28460: URL: https://github.com/apache/spark/pull/28460#issuecomment-625010880 thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28420: [SPARK-31615][SQL] Pretty string output for sql method of RuntimeReplaceable expressions

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28420: URL: https://github.com/apache/spark/pull/28420#issuecomment-624930249 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] viirya commented on a change in pull request #28463: [SPARK-31399][CORE] Support indylambda Scala closure in ClosureCleaner

2020-05-06 Thread GitBox
viirya commented on a change in pull request #28463: URL: https://github.com/apache/spark/pull/28463#discussion_r421149812 ## File path: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ## @@ -372,14 +342,63 @@ private[spark] object ClosureCleaner extends Logging

[GitHub] [spark] viirya commented on a change in pull request #28463: [SPARK-31399][CORE] Support indylambda Scala closure in ClosureCleaner

2020-05-06 Thread GitBox
viirya commented on a change in pull request #28463: URL: https://github.com/apache/spark/pull/28463#discussion_r421155560 ## File path: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ## @@ -414,6 +433,142 @@ private[spark] object ClosureCleaner extends Logging

[GitHub] [spark] maropu commented on pull request #28459: [SPARK-31647][SQL] Deprecate 'spark.sql.optimizer.metadataOnly' configuration

2020-05-06 Thread GitBox
maropu commented on pull request #28459: URL: https://github.com/apache/spark/pull/28459#issuecomment-624953219 Thanks! Merged to master/3.0. This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] HyukjinKwon commented on pull request #28468: [SPARK-31365][SQL][FOLLOWUP] Refine config document for nested predicate pushdown

2020-05-06 Thread GitBox
HyukjinKwon commented on pull request #28468: URL: https://github.com/apache/spark/pull/28468#issuecomment-624961599 Thanks @viirya. This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] SparkQA commented on pull request #28336: [SPARK-31559][YARN] Re-obtain tokens at the startup of AM for yarn cluster mode if principal and keytab are available

2020-05-06 Thread GitBox
SparkQA commented on pull request #28336: URL: https://github.com/apache/spark/pull/28336#issuecomment-624988314 **[Test build #122384 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122384/testReport)** for PR 28336 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28336: [SPARK-31559][YARN] Re-obtain tokens at the startup of AM for yarn cluster mode if principal and keytab are available

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28336: URL: https://github.com/apache/spark/pull/28336#issuecomment-624988394 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #28336: [SPARK-31559][YARN] Re-obtain tokens at the startup of AM for yarn cluster mode if principal and keytab are available

2020-05-06 Thread GitBox
SparkQA removed a comment on pull request #28336: URL: https://github.com/apache/spark/pull/28336#issuecomment-624981803 **[Test build #122384 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122384/testReport)** for PR 28336 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28336: [SPARK-31559][YARN] Re-obtain tokens at the startup of AM for yarn cluster mode if principal and keytab are available

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28336: URL: https://github.com/apache/spark/pull/28336#issuecomment-624988394 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] viirya commented on pull request #28468: [SPARK-31365][SQL][FOLLOWUP] Refine config document for nested predicate pushdown

2020-05-06 Thread GitBox
viirya commented on pull request #28468: URL: https://github.com/apache/spark/pull/28468#issuecomment-624998359 Thanks @HyukjinKwon @maropu This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] SparkQA commented on pull request #28472: [SPARK-31655][BUILD] Upgrade snappy-java to 1.1.7.5

2020-05-06 Thread GitBox
SparkQA commented on pull request #28472: URL: https://github.com/apache/spark/pull/28472#issuecomment-625010899 **[Test build #122389 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122389/testReport)** for PR 28472 at commit

[GitHub] [spark] SparkQA commented on pull request #28473: [SPARK-31656][ML][PYSPARK] AFT blockify input vectors

2020-05-06 Thread GitBox
SparkQA commented on pull request #28473: URL: https://github.com/apache/spark/pull/28473#issuecomment-625010888 **[Test build #122388 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122388/testReport)** for PR 28473 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624940605 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624940605 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] maropu commented on pull request #26339: [SPARK-27194][SPARK-29302][SQL] For dynamic partition overwrite operation, fix speculation task conflict issue and FileAlreadyExistsException

2020-05-06 Thread GitBox
maropu commented on pull request #26339: URL: https://github.com/apache/spark/pull/26339#issuecomment-624941244 retest this please This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] HyukjinKwon commented on pull request #28459: [SPARK-31647][SQL] Deprecate 'spark.sql.optimizer.metadataOnly' configuration

2020-05-06 Thread GitBox
HyukjinKwon commented on pull request #28459: URL: https://github.com/apache/spark/pull/28459#issuecomment-624961194 Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] HeartSaVioR commented on pull request #28412: [SPARK-31608][CORE][WEBUI] Add a new type of KVStore to make loading UI faster

2020-05-06 Thread GitBox
HeartSaVioR commented on pull request #28412: URL: https://github.com/apache/spark/pull/28412#issuecomment-624964828 The idea is similar to HistoryServerDiskManager, so it makes sense in general. We may need to get concrete answers for these questions to go forward: 1. How we will

[GitHub] [spark] AmplabJenkins commented on pull request #28336: [SPARK-31559][YARN] Re-obtain tokens at the startup of AM for yarn cluster mode if principal and keytab are available

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28336: URL: https://github.com/apache/spark/pull/28336#issuecomment-624982113 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28336: [SPARK-31559][YARN] Re-obtain tokens at the startup of AM for yarn cluster mode if principal and keytab are available

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28336: URL: https://github.com/apache/spark/pull/28336#issuecomment-624982113 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624985896 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624986274 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] AmplabJenkins commented on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624985896 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] zhengruifeng edited a comment on pull request #28458: [SPARK-30659][ML][PYSPARK] LogisticRegression blockify input vectors

2020-05-06 Thread GitBox
zhengruifeng edited a comment on pull request #28458: URL: https://github.com/apache/spark/pull/28458#issuecomment-624985569 This PR is an update of https://github.com/apache/spark/pull/27374; it can avoid performance regression on sparse datasets by default (with blockSize=1). On dense

[GitHub] [spark] AmplabJenkins commented on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624986274 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] zhengruifeng commented on pull request #28229: [SPARK-31454][ML] An optimized K-Means based on DenseMatrix and GEMM

2020-05-06 Thread GitBox
zhengruifeng commented on pull request #28229: URL: https://github.com/apache/spark/pull/28229#issuecomment-624989291 @xwu99 There was a [ticket](https://issues.apache.org/jira/browse/SPARK-30641) for this. Now I have merged high-level BLAS support for

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28472: [SPARK-31655][BUILD] Upgrade snappy-java to 1.1.7.5

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28472: URL: https://github.com/apache/spark/pull/28472#issuecomment-625009560 Merged build finished. Test FAILed.

[GitHub] [spark] SparkQA removed a comment on pull request #28472: [SPARK-31655][BUILD] Upgrade snappy-java to 1.1.7.5

2020-05-06 Thread GitBox
SparkQA removed a comment on pull request #28472: URL: https://github.com/apache/spark/pull/28472#issuecomment-625009011 **[Test build #122387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122387/testReport)** for PR 28472 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28472: [SPARK-31655][BUILD] Upgrade snappy-java to 1.1.7.5

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28472: URL: https://github.com/apache/spark/pull/28472#issuecomment-625009566 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] zhengruifeng opened a new pull request #28473: [SPARK-31656][ML][PYSPARK] AFT blockify input vectors

2020-05-06 Thread GitBox
zhengruifeng opened a new pull request #28473: URL: https://github.com/apache/spark/pull/28473 ### What changes were proposed in this pull request? 1. add a new param blockSize; 2. add a new class InstanceBlock; 3. if blockSize==1, keep the original behavior; if blockSize>1, stack input

[GitHub] [spark] SparkQA commented on pull request #28420: [SPARK-31615][SQL] Pretty string output for sql method of RuntimeReplaceable expressions

2020-05-06 Thread GitBox
SparkQA commented on pull request #28420: URL: https://github.com/apache/spark/pull/28420#issuecomment-624929625 **[Test build #122375 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122375/testReport)** for PR 28420 at commit

[GitHub] [spark] viirya commented on a change in pull request #28463: [SPARK-31399][CORE] Support indylambda Scala closure in ClosureCleaner

2020-05-06 Thread GitBox
viirya commented on a change in pull request #28463: URL: https://github.com/apache/spark/pull/28463#discussion_r421156436 ## File path: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ## @@ -414,6 +433,142 @@ private[spark] object ClosureCleaner extends Logging

[GitHub] [spark] AmplabJenkins commented on pull request #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #27920: URL: https://github.com/apache/spark/pull/27920#issuecomment-624954267

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27920: [SPARK-31102][SQL] Spark-sql fails to parse when contains comment.

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #27920: URL: https://github.com/apache/spark/pull/27920#issuecomment-624954267

[GitHub] [spark] SparkQA removed a comment on pull request #28467: [SPARK-31653][BUILD] Setuptools is needed before installing any other python packages

2020-05-06 Thread GitBox
SparkQA removed a comment on pull request #28467: URL: https://github.com/apache/spark/pull/28467#issuecomment-624912763 **[Test build #122379 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122379/testReport)** for PR 28467 at commit

[GitHub] [spark] SparkQA commented on pull request #28467: [SPARK-31653][BUILD] Setuptools is needed before installing any other python packages

2020-05-06 Thread GitBox
SparkQA commented on pull request #28467: URL: https://github.com/apache/spark/pull/28467#issuecomment-624965503 **[Test build #122379 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122379/testReport)** for PR 28467 at commit

[GitHub] [spark] HyukjinKwon commented on pull request #28461: [SPARK-31361][SQL][FOLLOWUP] Use LEGACY_PARQUET_REBASE_DATETIME_IN_READ instead of avro config in ParquetIOSuite

2020-05-06 Thread GitBox
HyukjinKwon commented on pull request #28461: URL: https://github.com/apache/spark/pull/28461#issuecomment-624965251 Merged to master and branch-3.0.

[GitHub] [spark] maropu edited a comment on pull request #28420: [SPARK-31615][SQL] Pretty string output for sql method of RuntimeReplaceable expressions

2020-05-06 Thread GitBox
maropu edited a comment on pull request #28420: URL: https://github.com/apache/spark/pull/28420#issuecomment-624939556 Do we need to backport this into branch-2.4, too? I personally think printing debug info looks like a minor bug. @cloud-fan @dongjoon-hyun

[GitHub] [spark] SparkQA removed a comment on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
SparkQA removed a comment on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624940140 **[Test build #122381 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122381/testReport)** for PR 28442 at commit

[GitHub] [spark] SparkQA commented on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
SparkQA commented on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624985317 **[Test build #122381 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122381/testReport)** for PR 28442 at commit

[GitHub] [spark] yaooqinn commented on pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-06 Thread GitBox
yaooqinn commented on pull request #28442: URL: https://github.com/apache/spark/pull/28442#issuecomment-624985609 retest this please

[GitHub] [spark] zhengruifeng commented on pull request #28458: [SPARK-30659][ML][PYSPARK] LogisticRegression blockify input vectors

2020-05-06 Thread GitBox
zhengruifeng commented on pull request #28458: URL: https://github.com/apache/spark/pull/28458#issuecomment-624985569 This PR is an update of https://github.com/apache/spark/pull/27374; it can avoid performance regression on sparse datasets by default (with blockSize=1). On dense

[GitHub] [spark] AmplabJenkins commented on pull request #28470: [SPARK-29803][Python] remove all instances of 'from __future__ import print_function'

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28470: URL: https://github.com/apache/spark/pull/28470#issuecomment-624991743 Can one of the admins verify this patch?

[GitHub] [spark] Ngone51 commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

2020-05-06 Thread GitBox
Ngone51 commented on a change in pull request #27937: URL: https://github.com/apache/spark/pull/27937#discussion_r421201672 ## File path: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala ## @@ -93,7 +93,7 @@ sealed abstract class

[GitHub] [spark] zhengruifeng commented on pull request #28229: [SPARK-31454][ML] An optimized K-Means based on DenseMatrix and GEMM

2020-05-06 Thread GitBox
zhengruifeng commented on pull request #28229: URL: https://github.com/apache/spark/pull/28229#issuecomment-624991855 @xwu99 I think you can also refer to those two PRs, since some utils were added.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28470: [SPARK-29803][Python] remove all instances of 'from __future__ import print_function'

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28470: URL: https://github.com/apache/spark/pull/28470#issuecomment-624991486 Can one of the admins verify this patch?

[GitHub] [spark] SparkQA removed a comment on pull request #28468: [SPARK-31365][SQL][FOLLOWUP] Refine config document for nested predicate pushdown

2020-05-06 Thread GitBox
SparkQA removed a comment on pull request #28468: URL: https://github.com/apache/spark/pull/28468#issuecomment-624926619 **[Test build #122380 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122380/testReport)** for PR 28468 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28471: [SPARK-30660][ML][PYSPARK] LinearRegression blockify input vectors

2020-05-06 Thread GitBox
AmplabJenkins removed a comment on pull request #28471: URL: https://github.com/apache/spark/pull/28471#issuecomment-625007384

[GitHub] [spark] SparkQA commented on pull request #28468: [SPARK-31365][SQL][FOLLOWUP] Refine config document for nested predicate pushdown

2020-05-06 Thread GitBox
SparkQA commented on pull request #28468: URL: https://github.com/apache/spark/pull/28468#issuecomment-625007783 **[Test build #122380 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122380/testReport)** for PR 28468 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28468: [SPARK-31365][SQL][Followup] Refine config document

2020-05-06 Thread GitBox
AmplabJenkins commented on pull request #28468: URL: https://github.com/apache/spark/pull/28468#issuecomment-624927020
