[GitHub] [spark] AmplabJenkins removed a comment on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726584408 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
AmplabJenkins commented on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726584408 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-12 Thread GitBox
beliefer commented on a change in pull request #29999: URL: https://github.com/apache/spark/pull/29999#discussion_r522752648 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala ## @@ -178,6 +180,90 @@ case class Like(left

[GitHub] [spark] SparkQA removed a comment on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
SparkQA removed a comment on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726576439 **[Test build #131041 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131041/testReport)** for PR 30346 at commit [`4e0c912`](https://gi

[GitHub] [spark] SparkQA commented on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
SparkQA commented on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726584082 **[Test build #131041 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131041/testReport)** for PR 30346 at commit [`4e0c912`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #30351: [SPARK-33441][CORE] Remove unused imports in core module

2020-11-12 Thread GitBox
SparkQA commented on pull request #30351: URL: https://github.com/apache/spark/pull/30351#issuecomment-726582750 **[Test build #131042 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131042/testReport)** for PR 30351 at commit [`6fbf7c8`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #30342: [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples

2020-11-12 Thread GitBox
SparkQA commented on pull request #30342: URL: https://github.com/apache/spark/pull/30342#issuecomment-726581889 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35645/ -

[GitHub] [spark] cloud-fan commented on pull request #30358: [SPARK-33394][SQL] Throw `NoSuchNamespaceException` for not existing namespace in `InMemoryTableCatalog.listTables()`

2020-11-12 Thread GitBox
cloud-fan commented on pull request #30358: URL: https://github.com/apache/spark/pull/30358#issuecomment-72658 SGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to Git

[GitHub] [spark] leanken closed pull request #30362: [SPARK-33139][SQL][FOLLOW-UP] add spark.sql.legacy.allowModifyActiveSession in Exception message

2020-11-12 Thread GitBox
leanken closed pull request #30362: URL: https://github.com/apache/spark/pull/30362 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-726580378 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-12 Thread GitBox
AmplabJenkins commented on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-726580378 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA commented on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-12 Thread GitBox
SparkQA commented on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-726580360 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35644/ ---

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30362: [SPARK-33139][SQL][FOLLOW-UP] add spark.sql.legacy.allowModifyActiveSession in Exception message

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30362: URL: https://github.com/apache/spark/pull/30362#issuecomment-726578423 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30366: [SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-726578608 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #30366: [SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-12 Thread GitBox
AmplabJenkins commented on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-726578608 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #30362: [SPARK-33139][SQL][FOLLOW-UP] add spark.sql.legacy.allowModifyActiveSession in Exception message

2020-11-12 Thread GitBox
AmplabJenkins commented on pull request #30362: URL: https://github.com/apache/spark/pull/30362#issuecomment-726578423 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #30366: [SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-12 Thread GitBox
SparkQA removed a comment on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-726517047 **[Test build #131036 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131036/testReport)** for PR 30366 at commit [`38a6c5a`](https://gi

[GitHub] [spark] MaxGekk commented on pull request #30358: [SPARK-33394][SQL] Throw `NoSuchNamespaceException` for not existing namespace in `InMemoryTableCatalog.listTables()`

2020-11-12 Thread GitBox
MaxGekk commented on pull request #30358: URL: https://github.com/apache/spark/pull/30358#issuecomment-726577886 The failed test is some kind of corner case. This guy https://github.com/apache/spark/blob/fcf8aa59b5025dde9b4af36953146894659967e2/sql/core/src/test/scala/org/apache/spark/sq

[GitHub] [spark] SparkQA commented on pull request #30366: [SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-12 Thread GitBox
SparkQA commented on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-726577752 **[Test build #131036 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131036/testReport)** for PR 30366 at commit [`38a6c5a`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #30362: [SPARK-33139][SQL][FOLLOW-UP] add spark.sql.legacy.allowModifyActiveSession in Exception message

2020-11-12 Thread GitBox
SparkQA removed a comment on pull request #30362: URL: https://github.com/apache/spark/pull/30362#issuecomment-726471300 **[Test build #131030 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131030/testReport)** for PR 30362 at commit [`4551bb0`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30164: [SPARK-32919][SHUFFLE][test-maven][test-hadoop2.7] Driver side changes for coordinating push based shuffle by selecting externa

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30164: URL: https://github.com/apache/spark/pull/30164#issuecomment-726576823 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/131

[GitHub] [spark] SparkQA commented on pull request #30362: [SPARK-33139][SQL][FOLLOW-UP] add spark.sql.legacy.allowModifyActiveSession in Exception message

2020-11-12 Thread GitBox
SparkQA commented on pull request #30362: URL: https://github.com/apache/spark/pull/30362#issuecomment-726577534 **[Test build #131030 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131030/testReport)** for PR 30362 at commit [`4551bb0`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30164: [SPARK-32919][SHUFFLE][test-maven][test-hadoop2.7] Driver side changes for coordinating push based shuffle by selecting externa

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30164: URL: https://github.com/apache/spark/pull/30164#issuecomment-726576817 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To r

[GitHub] [spark] HyukjinKwon edited a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-12 Thread GitBox
HyukjinKwon edited a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-726476089 It's just that both points are valid ... It would have been best to wait for a couple of days given that we had a sort of long discussions/reviews but I don't see any

[GitHub] [spark] AmplabJenkins commented on pull request #30164: [SPARK-32919][SHUFFLE][test-maven][test-hadoop2.7] Driver side changes for coordinating push based shuffle by selecting external shuffl

2020-11-12 Thread GitBox
AmplabJenkins commented on pull request #30164: URL: https://github.com/apache/spark/pull/30164#issuecomment-726576817 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #30164: [SPARK-32919][SHUFFLE][test-maven][test-hadoop2.7] Driver side changes for coordinating push based shuffle by selecting external shuf

2020-11-12 Thread GitBox
SparkQA removed a comment on pull request #30164: URL: https://github.com/apache/spark/pull/30164#issuecomment-726415580 **[Test build #131022 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131022/testReport)** for PR 30164 at commit [`2d6d266`](https://gi

[GitHub] [spark] SparkQA commented on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
SparkQA commented on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726576439 **[Test build #131041 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131041/testReport)** for PR 30346 at commit [`4e0c912`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #30164: [SPARK-32919][SHUFFLE][test-maven][test-hadoop2.7] Driver side changes for coordinating push based shuffle by selecting external shuffle serv

2020-11-12 Thread GitBox
SparkQA commented on pull request #30164: URL: https://github.com/apache/spark/pull/30164#issuecomment-726576238 **[Test build #131022 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131022/testReport)** for PR 30164 at commit [`2d6d266`](https://github.co

[GitHub] [spark] cloud-fan commented on pull request #30042: [SPARK-33139][SQL] protect setActionSession and clearActiveSession

2020-11-12 Thread GitBox
cloud-fan commented on pull request #30042: URL: https://github.com/apache/spark/pull/30042#issuecomment-726575683 Sorry, we may need to revert this. I got feedback about these 2 APIs, which can be valid use cases: Users don't want to pass a `SparkSession` parameter all the way around, and

[GitHub] [spark] HyukjinKwon commented on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
HyukjinKwon commented on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726575432 @itholic, can you check pyi files too and update? At least I found one diff `batchDuration: Union[float, int] = ...,`. -

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
HyukjinKwon commented on a change in pull request #30346: URL: https://github.com/apache/spark/pull/30346#discussion_r522725171 ## File path: python/pyspark/streaming/context.py ## @@ -45,10 +53,6 @@ class StreamingContext(object): def __init__(self, sparkContext, batchDur

[GitHub] [spark] SparkQA commented on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-12 Thread GitBox
SparkQA commented on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-726572104 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35644/ -

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
HyukjinKwon commented on a change in pull request #30346: URL: https://github.com/apache/spark/pull/30346#discussion_r522723304 ## File path: python/pyspark/streaming/dstream.py ## @@ -522,17 +551,25 @@ def reduceByKeyAndWindow(self, func, invFunc, windowDuration, slideDuratio

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
HyukjinKwon commented on a change in pull request #30346: URL: https://github.com/apache/spark/pull/30346#discussion_r522721724 ## File path: python/pyspark/streaming/dstream.py ## @@ -449,15 +462,21 @@ def reduceByWindow(self, reduceFunc, invReduceFunc, windowDuration, slideD

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
HyukjinKwon commented on a change in pull request #30346: URL: https://github.com/apache/spark/pull/30346#discussion_r522720697 ## File path: python/pyspark/streaming/dstream.py ## @@ -423,11 +432,15 @@ def window(self, windowDuration, slideDuration=None): Return a new

[GitHub] [spark] SparkQA commented on pull request #30300: [SPARK-33399][SQL] Normalize output partitioning and sortorder with respect to aliases to avoid unneeded exchange/sort nodes

2020-11-12 Thread GitBox
SparkQA commented on pull request #30300: URL: https://github.com/apache/spark/pull/30300#issuecomment-726570795 **[Test build #131040 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131040/testReport)** for PR 30300 at commit [`c66874a`](https://github.com

[GitHub] [spark] HeartSaVioR edited a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-12 Thread GitBox
HeartSaVioR edited a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-726569827 I'd rather avoid the chance of "post-review" whenever possible, but I'd admit everyone has different thoughts. I'm OK with it, and if that's considered here (and no o

[GitHub] [spark] HeartSaVioR edited a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-12 Thread GitBox
HeartSaVioR edited a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-726569827 I'd rather avoid the chance of "post-review" whenever possible, but I'd admit everyone has different thoughts. I'm OK with it, and if that's considered here (and no o

[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-12 Thread GitBox
cloud-fan commented on a change in pull request #29999: URL: https://github.com/apache/spark/pull/29999#discussion_r522719281 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -1408,7 +1408,20 @@ class AstBuilder(conf: SQLCon

[GitHub] [spark] prakharjain09 commented on a change in pull request #30300: [SPARK-33399][SQL] Normalize output partitioning and sortorder with respect to aliases to avoid unneeded exchange/sort node

2020-11-12 Thread GitBox
prakharjain09 commented on a change in pull request #30300: URL: https://github.com/apache/spark/pull/30300#discussion_r522717726 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/AliasAwareOutputExpression.scala ## @@ -68,11 +65,8 @@ trait AliasAwareOutputO

[GitHub] [spark] HeartSaVioR commented on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-12 Thread GitBox
HeartSaVioR commented on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-726569827 I'd rather avoid the chance of "post-review" whenever possible, but I'd admit everyone has different thoughts. I'm OK with it, and if that's considered here (and no one woul

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
HyukjinKwon commented on a change in pull request #30346: URL: https://github.com/apache/spark/pull/30346#discussion_r522717815 ## File path: python/pyspark/streaming/context.py ## @@ -286,11 +321,18 @@ def queueStream(self, rdds, oneAtATime=True, default=None): Creat

[GitHub] [spark] cloud-fan commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-12 Thread GitBox
cloud-fan commented on a change in pull request #29999: URL: https://github.com/apache/spark/pull/29999#discussion_r522716611 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala ## @@ -178,6 +180,90 @@ case class Like(lef

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
HyukjinKwon commented on a change in pull request #30346: URL: https://github.com/apache/spark/pull/30346#discussion_r522714463 ## File path: python/pyspark/streaming/context.py ## @@ -286,11 +321,18 @@ def queueStream(self, rdds, oneAtATime=True, default=None): Creat

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
HyukjinKwon commented on a change in pull request #30346: URL: https://github.com/apache/spark/pull/30346#discussion_r522713401 ## File path: python/pyspark/streaming/context.py ## @@ -242,9 +268,14 @@ def socketTextStream(self, hostname, port, storageLevel=StorageLevel.MEMORY

[GitHub] [spark] cloud-fan commented on a change in pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-12 Thread GitBox
cloud-fan commented on a change in pull request #30341: URL: https://github.com/apache/spark/pull/30341#discussion_r522712902 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionProxy.scala ## @@ -0,0 +1,57 @@ +/* + * Licensed to the A

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
HyukjinKwon commented on a change in pull request #30346: URL: https://github.com/apache/spark/pull/30346#discussion_r522711041 ## File path: python/pyspark/streaming/context.py ## @@ -268,8 +299,12 @@ def binaryRecordsStream(self, directory, recordLength): them from a

[GitHub] [spark] yaooqinn commented on pull request #30311: [MINOR][DOC] spark.executor.memoryOverhead is not cluster-mode only

2020-11-12 Thread GitBox
yaooqinn commented on pull request #30311: URL: https://github.com/apache/spark/pull/30311#issuecomment-726566732 @maropu the description for `spark.executor.memoryOverhead` is fine in http://spark.apache.org/docs/2.4.7/configuration.html. Then it is not that necessary to backport this alt

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
HyukjinKwon commented on a change in pull request #30346: URL: https://github.com/apache/spark/pull/30346#discussion_r522709942 ## File path: python/pyspark/streaming/context.py ## @@ -242,9 +268,14 @@ def socketTextStream(self, hostname, port, storageLevel=StorageLevel.MEMORY

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
HyukjinKwon commented on a change in pull request #30346: URL: https://github.com/apache/spark/pull/30346#discussion_r522708852 ## File path: python/pyspark/streaming/context.py ## @@ -90,8 +94,12 @@ def getOrCreate(cls, checkpointPath, setupFunc): recreated from the c

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
HyukjinKwon commented on a change in pull request #30346: URL: https://github.com/apache/spark/pull/30346#discussion_r522708521 ## File path: python/pyspark/streaming/context.py ## @@ -36,6 +36,14 @@ class StreamingContext(object): be started and stopped using `context.sta

[GitHub] [spark] SparkQA commented on pull request #30342: [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples

2020-11-12 Thread GitBox
SparkQA commented on pull request #30342: URL: https://github.com/apache/spark/pull/30342#issuecomment-726565526 **[Test build #131039 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131039/testReport)** for PR 30342 at commit [`64effa1`](https://github.com

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
HyukjinKwon commented on a change in pull request #30346: URL: https://github.com/apache/spark/pull/30346#discussion_r522706515 ## File path: python/pyspark/streaming/context.py ## @@ -45,10 +53,6 @@ class StreamingContext(object): def __init__(self, sparkContext, batchDur

[GitHub] [spark] sarutak commented on pull request #30292: [SPARK-33166][DOC] Provide Search Function in Spark docs site

2020-11-12 Thread GitBox
sarutak commented on pull request #30292: URL: https://github.com/apache/spark/pull/30292#issuecomment-726564502 Seems good to me. This is an automated message from the Apache Git Service. To respond to the message, please lo

[GitHub] [spark] maropu commented on pull request #30342: [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples

2020-11-12 Thread GitBox
maropu commented on pull request #30342: URL: https://github.com/apache/spark/pull/30342#issuecomment-726563747 retest this please This is an automated message from the Apache Git Service. To respond to the message, please lo

[GitHub] [spark] cloud-fan commented on pull request #30332: [SPARK-33419][SQL] Unexpected behavior when using SET commands before a query in SparkSession.sql

2020-11-12 Thread GitBox
cloud-fan commented on pull request #30332: URL: https://github.com/apache/spark/pull/30332#issuecomment-726562478 thanks, merging to master! This is an automated message from the Apache Git Service. To respond to the message

[GitHub] [spark] cloud-fan closed pull request #30332: [SPARK-33419][SQL] Unexpected behavior when using SET commands before a query in SparkSession.sql

2020-11-12 Thread GitBox
cloud-fan closed pull request #30332: URL: https://github.com/apache/spark/pull/30332 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [spark] maropu commented on pull request #30356: [SPARK-33433][SQL] Change Aggregate max rows to 1 if grouping is empty

2020-11-12 Thread GitBox
maropu commented on pull request #30356: URL: https://github.com/apache/spark/pull/30356#issuecomment-726562265 Thanks, @ulysses-you @HyukjinKwon @viirya ! Merged to master. This is an automated message from the Apache Git Se

[GitHub] [spark] maropu closed pull request #30356: [SPARK-33433][SQL] Change Aggregate max rows to 1 if grouping is empty

2020-11-12 Thread GitBox
maropu closed pull request #30356: URL: https://github.com/apache/spark/pull/30356 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to t

[GitHub] [spark] maropu commented on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-12 Thread GitBox
maropu commented on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-726559463 Looks fine to me. Btw, if we merge this, we can close #29085? If so, please add "Closes #29085" in the PR description. --

[GitHub] [spark] Lucusone commented on pull request #25201: [SPARK-28419][SQL] Enable SparkThriftServer support proxy user's authentication .

2020-11-12 Thread GitBox
Lucusone commented on pull request #25201: URL: https://github.com/apache/spark/pull/25201#issuecomment-726558699 ok This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726555340 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726555330 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To r

[GitHub] [spark] SparkQA commented on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-12 Thread GitBox
SparkQA commented on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-726556475 **[Test build #131038 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131038/testReport)** for PR 29414 at commit [`3ad44e3`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
SparkQA commented on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726555307 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35643/ ---

[GitHub] [spark] AmplabJenkins commented on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
AmplabJenkins commented on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726555330 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] HeartSaVioR commented on pull request #30139: [SPARK-31069][CORE] high cpu caused by chunksBeingTransferred in external shuffle service

2020-11-12 Thread GitBox
HeartSaVioR commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-726554569 I see this is now only making exception on not calculating the chunksBeingTransferred on default value of config. This is far from the original approach, so could you please

[GitHub] [spark] AngersZhuuuu commented on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-12 Thread GitBox
AngersZhuuuu commented on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-726553987 @maropu @HyukjinKwon @cloud-fan Have resolve the conflicts, can we start these work again? This is an auto

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30365: [SPARK-33439][INFRA] Use SERIAL_SBT_TESTS=1 for SQL modules

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30365: URL: https://github.com/apache/spark/pull/30365#issuecomment-726550078 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/131

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30365: [SPARK-33439][INFRA] Use SERIAL_SBT_TESTS=1 for SQL modules

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30365: URL: https://github.com/apache/spark/pull/30365#issuecomment-726550069 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To r

[GitHub] [spark] AmplabJenkins commented on pull request #30365: [SPARK-33439][INFRA] Use SERIAL_SBT_TESTS=1 for SQL modules

2020-11-12 Thread GitBox
AmplabJenkins commented on pull request #30365: URL: https://github.com/apache/spark/pull/30365#issuecomment-726550069 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #30365: [SPARK-33439][INFRA] Use SERIAL_SBT_TESTS=1 for SQL modules

2020-11-12 Thread GitBox
SparkQA removed a comment on pull request #30365: URL: https://github.com/apache/spark/pull/30365#issuecomment-726492687 **[Test build #131034 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131034/testReport)** for PR 30365 at commit [`1db4c78`](https://gi

[GitHub] [spark] SparkQA commented on pull request #30365: [SPARK-33439][INFRA] Use SERIAL_SBT_TESTS=1 for SQL modules

2020-11-12 Thread GitBox
SparkQA commented on pull request #30365: URL: https://github.com/apache/spark/pull/30365#issuecomment-726549526 **[Test build #131034 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131034/testReport)** for PR 30365 at commit [`1db4c78`](https://github.co

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30139: [SPARK-31069][CORE] high cpu caused by chunksBeingTransferred in external shuffle service

2020-11-12 Thread GitBox
AngersZh commented on a change in pull request #30139: URL: https://github.com/apache/spark/pull/30139#discussion_r522677914 ## File path: common/network-common/src/main/java/org/apache/spark/network/server/ChunkFetchRequestHandler.java ## @@ -88,12 +88,14 @@ public void p

[GitHub] [spark] cloud-fan commented on a change in pull request #30357: [SPARK-33432][SQL] SQL parser should use active SQLConf

2020-11-12 Thread GitBox
cloud-fan commented on a change in pull request #30357: URL: https://github.com/apache/spark/pull/30357#discussion_r522676804 ## File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLDriver.scala ## @@ -60,7 +60,7 @@ private[hive] class

[GitHub] [spark] cloud-fan commented on a change in pull request #30357: [SPARK-33432][SQL] SQL parser should use active SQLConf

2020-11-12 Thread GitBox
cloud-fan commented on a change in pull request #30357: URL: https://github.com/apache/spark/pull/30357#discussion_r522676774 ## File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala ## @@ -306,7 +306,7 @@ p

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30357: [SPARK-33432][SQL] SQL parser should use active SQLConf

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30357: URL: https://github.com/apache/spark/pull/30357#issuecomment-726546476 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To r

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30357: [SPARK-33432][SQL] SQL parser should use active SQLConf

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30357: URL: https://github.com/apache/spark/pull/30357#issuecomment-726546492 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/131

[GitHub] [spark] SparkQA removed a comment on pull request #30357: [SPARK-33432][SQL] SQL parser should use active SQLConf

2020-11-12 Thread GitBox
SparkQA removed a comment on pull request #30357: URL: https://github.com/apache/spark/pull/30357#issuecomment-726473494 **[Test build #131031 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131031/testReport)** for PR 30357 at commit [`6d02b6c`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #30357: [SPARK-33432][SQL] SQL parser should use active SQLConf

2020-11-12 Thread GitBox
AmplabJenkins commented on pull request #30357: URL: https://github.com/apache/spark/pull/30357#issuecomment-726546476 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA commented on pull request #30357: [SPARK-33432][SQL] SQL parser should use active SQLConf

2020-11-12 Thread GitBox
SparkQA commented on pull request #30357: URL: https://github.com/apache/spark/pull/30357#issuecomment-726546128 **[Test build #131031 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131031/testReport)** for PR 30357 at commit [`6d02b6c`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30366: [SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-726544504 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30366: [SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-726544495 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To r

[GitHub] [spark] SparkQA commented on pull request #30366: [SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-12 Thread GitBox
SparkQA commented on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-726544476 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35642/ ---

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-11-12 Thread GitBox
dongjoon-hyun commented on a change in pull request #29950: URL: https://github.com/apache/spark/pull/29950#discussion_r522674374 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ## @@ -724,20 +725,17 @@ object ColumnPruning ext

[GitHub] [spark] AmplabJenkins commented on pull request #30366: [SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-12 Thread GitBox
AmplabJenkins commented on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-726544495 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA commented on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
SparkQA commented on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726542163 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35643/ -

[GitHub] [spark] manuzhang edited a comment on pull request #29797: [SPARK-32932][SQL] Do not use local shuffle reader at final stage on write command

2020-11-12 Thread GitBox
manuzhang edited a comment on pull request #29797: URL: https://github.com/apache/spark/pull/29797#issuecomment-726493708 @maryannxue This could miss a LSR optimization on ~~build~~ probe side of leaf BHJ in multiple joins as follows. ``` *(6) BroadcastHashJoin [b#24], [a#33], Inn

[GitHub] [spark] manuzhang edited a comment on pull request #29797: [SPARK-32932][SQL] Do not use local shuffle reader at final stage on write command

2020-11-12 Thread GitBox
manuzhang edited a comment on pull request #29797: URL: https://github.com/apache/spark/pull/29797#issuecomment-726541325 @cloud-fan Sorry, it's probe side, `ShuffleQueryStage 1` which currently has a `CustomShuffleReader local` parent -

[GitHub] [spark] manuzhang commented on pull request #29797: [SPARK-32932][SQL] Do not use local shuffle reader at final stage on write command

2020-11-12 Thread GitBox
manuzhang commented on pull request #29797: URL: https://github.com/apache/spark/pull/29797#issuecomment-726541325 @cloud-fan `ShuffleQueryStage 1` which currently has a `CustomShuffleReader local` parent This is an automate

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-11-12 Thread GitBox
dongjoon-hyun commented on a change in pull request #29950: URL: https://github.com/apache/spark/pull/29950#discussion_r522669924 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ## @@ -758,6 +756,42 @@ object CollapseProject ex

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-11-12 Thread GitBox
dongjoon-hyun commented on a change in pull request #29950: URL: https://github.com/apache/spark/pull/29950#discussion_r522669719 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ## @@ -758,6 +756,42 @@ object CollapseProject ex

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-11-12 Thread GitBox
dongjoon-hyun commented on a change in pull request #29950: URL: https://github.com/apache/spark/pull/29950#discussion_r522668629 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -1963,6 +1963,27 @@ object SQLConf { .booleanConf

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
AmplabJenkins removed a comment on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726535588 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #30366: [SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-12 Thread GitBox
SparkQA commented on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-726535707 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35642/ -

[GitHub] [spark] SparkQA removed a comment on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
SparkQA removed a comment on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726521325 **[Test build #131037 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131037/testReport)** for PR 30346 at commit [`1945dcf`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
AmplabJenkins commented on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726535588 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-11-12 Thread GitBox
dongjoon-hyun commented on a change in pull request #29950: URL: https://github.com/apache/spark/pull/29950#discussion_r522667998 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseProjectSuite.scala ## @@ -170,4 +171,59 @@ class Collapse

[GitHub] [spark] SparkQA commented on pull request #30346: [SPARK-33253][PYTHON][DOCS] Migration to NumPy documentation style in Streaming (pyspark.streaming.*)

2020-11-12 Thread GitBox
SparkQA commented on pull request #30346: URL: https://github.com/apache/spark/pull/30346#issuecomment-726535326 **[Test build #131037 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131037/testReport)** for PR 30346 at commit [`1945dcf`](https://github.co
