[GitHub] [spark] SparkQA commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
SparkQA commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698739307 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33724/

[GitHub] [spark] SparkQA commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698742548 **[Test build #129104 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129104/testReport)** for PR 29869 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-25 Thread GitBox
cloud-fan commented on a change in pull request #29795: URL: https://github.com/apache/spark/pull/29795#discussion_r494764849 ## File path: sql/core/src/test/scala/org/apache/spark/sql/UpdateFieldsBenchmark.scala ## @@ -0,0 +1,310 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29843: [WIP][SPARK-29250][test-maven][test-hadoop2.7] Upgrade to Hadoop 3.2.1 and move to shaded client

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29843: URL: https://github.com/apache/spark/pull/29843#issuecomment-698737847 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] SparkQA removed a comment on pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-25 Thread GitBox
SparkQA removed a comment on pull request #29795: URL: https://github.com/apache/spark/pull/29795#issuecomment-698674623 **[Test build #129098 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129098/testReport)** for PR 29795 at commit

[GitHub] [spark] SparkQA commented on pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-25 Thread GitBox
SparkQA commented on pull request #29795: URL: https://github.com/apache/spark/pull/29795#issuecomment-698745385 **[Test build #129098 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129098/testReport)** for PR 29795 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29843: [WIP][SPARK-29250][test-maven][test-hadoop2.7] Upgrade to Hadoop 3.2.1 and move to shaded client

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29843: URL: https://github.com/apache/spark/pull/29843#issuecomment-698737847 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698747726 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698747726 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] SparkQA commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
SparkQA commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698748122 **[Test build #129105 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129105/testReport)** for PR 29850 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698747734 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29795: URL: https://github.com/apache/spark/pull/29795#issuecomment-698746139 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29795: URL: https://github.com/apache/spark/pull/29795#issuecomment-698746139 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698746743 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
SparkQA commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698746730 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33724/

[GitHub] [spark] SparkQA commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
SparkQA commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698750147 **[Test build #129103 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129103/testReport)** for PR 29850 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698750504 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
SparkQA removed a comment on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698725725 **[Test build #129103 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129103/testReport)** for PR 29850 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-25 Thread GitBox
cloud-fan commented on a change in pull request #29795: URL: https://github.com/apache/spark/pull/29795#discussion_r494764069 ## File path: sql/core/src/test/scala/org/apache/spark/sql/UpdateFieldsBenchmark.scala ## @@ -0,0 +1,310 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] SparkQA commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698747241 **[Test build #129102 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129102/testReport)** for PR 29869 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
SparkQA removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698711104 **[Test build #129102 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129102/testReport)** for PR 29869 at commit

[GitHub] [spark] zhengruifeng commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
zhengruifeng commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698747256 retest this please This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698746748 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698746743 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698750504 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] cloud-fan closed pull request #29756: [SPARK-32885][SS] Add DataStreamReader.table API

2020-09-25 Thread GitBox
cloud-fan closed pull request #29756: URL: https://github.com/apache/spark/pull/29756 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] cloud-fan commented on pull request #29756: [SPARK-32885][SS] Add DataStreamReader.table API

2020-09-25 Thread GitBox
cloud-fan commented on pull request #29756: URL: https://github.com/apache/spark/pull/29756#issuecomment-698755963 thanks, merging to master! This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] SparkQA removed a comment on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
SparkQA removed a comment on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698748122 **[Test build #129105 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129105/testReport)** for PR 29850 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698761936 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698761836 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] SparkQA removed a comment on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-25 Thread GitBox
SparkQA removed a comment on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698709228 **[Test build #129101 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129101/testReport)** for PR 29800 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698761844 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29860: [SPARK-32984][TESTS][SQL] Improve showing the differences between approved and actual plans of PlanStabilitySuite

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29860: URL: https://github.com/apache/spark/pull/29860#issuecomment-698762317 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698761768 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698761944 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] SparkQA commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698763007 **[Test build #129106 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129106/testReport)** for PR 29869 at commit

[GitHub] [spark] SparkQA commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698768029 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33725/

[GitHub] [spark] AmplabJenkins commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698768045 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] ulysses-you commented on pull request #29863: [SPARK-32877][SQL][TEST] Add test for Hive UDF complex decimal type

2020-09-25 Thread GitBox
ulysses-you commented on pull request #29863: URL: https://github.com/apache/spark/pull/29863#issuecomment-698767677 thanks for merging! FYI @cloud-fan @maropu This is an automated message from the Apache Git Service. To

[GitHub] [spark] izchen commented on a change in pull request #29862: [SPARK-32956][SQL] Ensure that the generated and existing headers are not duplicated in CSV DataSource

2020-09-25 Thread GitBox
izchen commented on a change in pull request #29862: URL: https://github.com/apache/spark/pull/29862#discussion_r494817447 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVUtils.scala ## @@ -93,6 +93,12 @@ object CSVUtils {

[GitHub] [spark] cloud-fan commented on a change in pull request #29860: [SPARK-32984][TESTS][SQL] Improve showing the differences between approved and actual plans of PlanStabilitySuite

2020-09-25 Thread GitBox
cloud-fan commented on a change in pull request #29860: URL: https://github.com/apache/spark/pull/29860#discussion_r494832113 ## File path: sql/core/src/test/scala/org/apache/spark/sql/PlanStabilitySuite.scala ## @@ -153,23 +154,93 @@ trait PlanStabilitySuite extends TPCDSBase

[GitHub] [spark] cloud-fan edited a comment on pull request #29785: [SPARK-32901][CORE] Do not allocate memory while spilling UnsafeExternalSorter

2020-09-25 Thread GitBox
cloud-fan edited a comment on pull request #29785: URL: https://github.com/apache/spark/pull/29785#issuecomment-698810109 Is it possible to refactor it a little bit to make the logic simpler? It's really tricky that we must be careful at each step, in case something was spilled right

[GitHub] [spark] cloud-fan commented on a change in pull request #29785: [SPARK-32901][CORE] Do not allocate memory while spilling UnsafeExternalSorter

2020-09-25 Thread GitBox
cloud-fan commented on a change in pull request #29785: URL: https://github.com/apache/spark/pull/29785#discussion_r494846145 ## File path: core/src/test/scala/org/apache/spark/memory/TestMemoryManager.scala ## @@ -119,6 +119,14 @@ class TestMemoryManager(conf: SparkConf)

[GitHub] [spark] SparkQA commented on pull request #29024: [SPARK-32001][SQL]Create JDBC authentication provider developer API

2020-09-25 Thread GitBox
SparkQA commented on pull request #29024: URL: https://github.com/apache/spark/pull/29024#issuecomment-698819795 **[Test build #129108 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129108/testReport)** for PR 29024 at commit

[GitHub] [spark] cloud-fan commented on pull request #29798: [SPARK-32931][SQL] Unevaluable Expressions are not Foldable

2020-09-25 Thread GitBox
cloud-fan commented on pull request #29798: URL: https://github.com/apache/spark/pull/29798#issuecomment-698770919 thanks, merging to master! This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] wangyum commented on a change in pull request #29790: [SPARK-32914][SQL] Avoid calling dataType multiple times for each expression

2020-09-25 Thread GitBox
wangyum commented on a change in pull request #29790: URL: https://github.com/apache/spark/pull/29790#discussion_r494810795 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ## @@ -3498,13 +3500,15 @@ object

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698788744 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698788735 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] beliefer commented on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-25 Thread GitBox
beliefer commented on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698797542 retest this please This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] SparkQA commented on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-25 Thread GitBox
SparkQA commented on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698797812 **[Test build #129107 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129107/testReport)** for PR 29800 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698832061 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698832061 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698761762 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698761836 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29860: [SPARK-32984][TESTS][SQL] Improve showing the differences between approved and actual plans of PlanStabilitySuite

2020-09-25 Thread GitBox
SparkQA commented on pull request #29860: URL: https://github.com/apache/spark/pull/29860#issuecomment-698761659 **[Test build #129099 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129099/testReport)** for PR 29860 at commit

[GitHub] [spark] SparkQA commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
SparkQA commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698761663 **[Test build #129105 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129105/testReport)** for PR 29850 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
SparkQA removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698742548 **[Test build #129104 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129104/testReport)** for PR 29869 at commit

[GitHub] [spark] SparkQA commented on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-25 Thread GitBox
SparkQA commented on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698761665 **[Test build #129101 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129101/testReport)** for PR 29800 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698761936 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698761667 **[Test build #129104 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129104/testReport)** for PR 29869 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698761762 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] SparkQA removed a comment on pull request #29860: [SPARK-32984][TESTS][SQL] Improve showing the differences between approved and actual plans of PlanStabilitySuite

2020-09-25 Thread GitBox
SparkQA removed a comment on pull request #29860: URL: https://github.com/apache/spark/pull/29860#issuecomment-698689552 **[Test build #129099 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129099/testReport)** for PR 29860 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29860: [SPARK-32984][TESTS][SQL] Improve showing the differences between approved and actual plans of PlanStabilitySuite

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29860: URL: https://github.com/apache/spark/pull/29860#issuecomment-698762317 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] cloud-fan commented on a change in pull request #29790: [SPARK-32914][SQL] Avoid calling dataType multiple times for each expression

2020-09-25 Thread GitBox
cloud-fan commented on a change in pull request #29790: URL: https://github.com/apache/spark/pull/29790#discussion_r494807719 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ## @@ -3498,13 +3500,15 @@ object

[GitHub] [spark] gatorsmile commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs

2020-09-25 Thread GitBox
gatorsmile commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r494808052 ## File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md ## @@ -36,6 +36,14 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier

[GitHub] [spark] cloud-fan commented on a change in pull request #29804: [SPARK-32859][SQL] Introduce physical rule to decide bucketing dynamically

2020-09-25 Thread GitBox
cloud-fan commented on a change in pull request #29804: URL: https://github.com/apache/spark/pull/29804#discussion_r494852129 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/bucketing/DisableUnnecessaryBucketedScan.scala ## @@ -0,0 +1,154 @@ +/* + *

[GitHub] [spark] SparkQA commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698831044 **[Test build #129106 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129106/testReport)** for PR 29869 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
SparkQA removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698763007 **[Test build #129106 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129106/testReport)** for PR 29869 at commit

[GitHub] [spark] tomvanbussel commented on pull request #29785: [SPARK-32901][CORE] Do not allocate memory while spilling UnsafeExternalSorter

2020-09-25 Thread GitBox
tomvanbussel commented on pull request #29785: URL: https://github.com/apache/spark/pull/29785#issuecomment-698836144 @cloud-fan I don't really see a way to simplify this without making invasive changes to memory management in Spark. For instance we could simplify this if we had some way to

[GitHub] [spark] SparkQA commented on pull request #29024: [SPARK-32001][SQL]Create JDBC authentication provider developer API

2020-09-25 Thread GitBox
SparkQA commented on pull request #29024: URL: https://github.com/apache/spark/pull/29024#issuecomment-698845157 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33729/

[GitHub] [spark] SparkQA commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
SparkQA commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698765677 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33726/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698773106 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
SparkQA commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698773089 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33726/

[GitHub] [spark] AmplabJenkins commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698773106 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] gatorsmile commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs

2020-09-25 Thread GitBox
gatorsmile commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r494808719 ## File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md ## @@ -36,6 +36,14 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier

[GitHub] [spark] cloud-fan commented on pull request #29785: [SPARK-32901][CORE] Do not allocate memory while spilling UnsafeExternalSorter

2020-09-25 Thread GitBox
cloud-fan commented on pull request #29785: URL: https://github.com/apache/spark/pull/29785#issuecomment-698810109 Is it possible to refactor it a little bit to make the logic simpler? It's really tricky that we must be careful at each step, in case something was spilled right before the

[GitHub] [spark] HyukjinKwon commented on pull request #29024: [SPARK-32001][SQL]Create JDBC authentication provider developer API

2020-09-25 Thread GitBox
HyukjinKwon commented on pull request #29024: URL: https://github.com/apache/spark/pull/29024#issuecomment-698818393 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29860: [SPARK-32984][TESTS][SQL] Improve showing the differences between approved and actual plans of PlanStabilitySuite

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29860: URL: https://github.com/apache/spark/pull/29860#issuecomment-698762326 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] cloud-fan commented on a change in pull request #29804: [SPARK-32859][SQL] Introduce physical rule to decide bucketing dynamically

2020-09-25 Thread GitBox
cloud-fan commented on a change in pull request #29804: URL: https://github.com/apache/spark/pull/29804#discussion_r494854327 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/bucketing/DisableUnnecessaryBucketedScan.scala ## @@ -0,0 +1,154 @@ +/* + *

[GitHub] [spark] SparkQA commented on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-25 Thread GitBox
SparkQA commented on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698829347 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33728/

[GitHub] [spark] AmplabJenkins commented on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698829368 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] cloud-fan closed pull request #29798: [SPARK-32931][SQL] Unevaluable Expressions are not Foldable

2020-09-25 Thread GitBox
cloud-fan closed pull request #29798: URL: https://github.com/apache/spark/pull/29798 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] cloud-fan commented on a change in pull request #29785: [SPARK-32901][CORE] Do not allocate memory while spilling UnsafeExternalSorter

2020-09-25 Thread GitBox
cloud-fan commented on a change in pull request #29785: URL: https://github.com/apache/spark/pull/29785#discussion_r494835305 ## File path: core/src/test/scala/org/apache/spark/memory/TestMemoryManager.scala ## @@ -119,6 +119,14 @@ class TestMemoryManager(conf: SparkConf)

[GitHub] [spark] cloud-fan commented on a change in pull request #29804: [SPARK-32859][SQL] Introduce physical rule to decide bucketing dynamically

2020-09-25 Thread GitBox
cloud-fan commented on a change in pull request #29804: URL: https://github.com/apache/spark/pull/29804#discussion_r494851172 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala ## @@ -348,20 +352,25 @@ case class FileSourceScanExec(

[GitHub] [spark] SparkQA commented on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-25 Thread GitBox
SparkQA commented on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698820632 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33728/

[GitHub] [spark] LuciferYang commented on pull request #29857: [SPARK-32972][ML] Fix UTs of `mllib` module in Scala 2.13 except RandomForestRegressorSuite

2020-09-25 Thread GitBox
LuciferYang commented on pull request #29857: URL: https://github.com/apache/spark/pull/29857#issuecomment-698819722 Synchronize the test result:

[GitHub] [spark] AmplabJenkins commented on pull request #29024: [SPARK-32001][SQL]Create JDBC authentication provider developer API

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29024: URL: https://github.com/apache/spark/pull/29024#issuecomment-698852755 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29024: [SPARK-32001][SQL]Create JDBC authentication provider developer API

2020-09-25 Thread GitBox
SparkQA commented on pull request #29024: URL: https://github.com/apache/spark/pull/29024#issuecomment-698852737 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33729/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29024: [SPARK-32001][SQL]Create JDBC authentication provider developer API

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29024: URL: https://github.com/apache/spark/pull/29024#issuecomment-698852755 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] gaborgsomogyi commented on a change in pull request #29024: [SPARK-32001][SQL]Create JDBC authentication provider developer API

2020-09-25 Thread GitBox
gaborgsomogyi commented on a change in pull request #29024: URL: https://github.com/apache/spark/pull/29024#discussion_r494903467 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ## @@ -23,12 +23,15 @@ import

[GitHub] [spark] SparkQA commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698758315 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33725/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698768045 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698780379 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33727/

[GitHub] [spark] AmplabJenkins commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
AmplabJenkins commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698788735 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-25 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698788706 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33727/

[GitHub] [spark] tomvanbussel commented on a change in pull request #29785: [SPARK-32901][CORE] Do not allocate memory while spilling UnsafeExternalSorter

2020-09-25 Thread GitBox
tomvanbussel commented on a change in pull request #29785: URL: https://github.com/apache/spark/pull/29785#discussion_r494839048 ## File path: core/src/test/scala/org/apache/spark/memory/TestMemoryManager.scala ## @@ -119,6 +119,14 @@ class TestMemoryManager(conf: SparkConf)

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29862: [SPARK-32956][SQL] Ensure that the generated and existing headers are not duplicated in CSV DataSource

2020-09-25 Thread GitBox
HyukjinKwon commented on a change in pull request #29862: URL: https://github.com/apache/spark/pull/29862#discussion_r494861000 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVUtils.scala ## @@ -93,6 +93,12 @@ object CSVUtils {

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698829380 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-25 Thread GitBox
AmplabJenkins removed a comment on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698829368 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29024: [SPARK-32001][SQL]Create JDBC authentication provider developer API

2020-09-25 Thread GitBox
HyukjinKwon commented on a change in pull request #29024: URL: https://github.com/apache/spark/pull/29024#discussion_r494878705 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ## @@ -23,12 +23,15 @@ import
