[GitHub] [spark] LuciferYang commented on a diff in pull request #37842: [SPARK-40396][BUILD] Update scalatest and scalatestplus related dependencies to use stable version

2022-09-09 Thread GitBox
LuciferYang commented on code in PR #37842: URL: https://github.com/apache/spark/pull/37842#discussion_r967243275 ## pom.xml: ## @@ -1139,37 +1139,38 @@ org.scalatest scalatest_${scala.binary.version} -3.3.0-SNAP3 +3.2.13 Review

[GitHub] [spark] dtenedor commented on a diff in pull request #37840: [SPARK-40394][SQL] Move subquery expression CheckAnalysis error messages to use the new error framework

2022-09-09 Thread GitBox
dtenedor commented on code in PR #37840: URL: https://github.com/apache/spark/pull/37840#discussion_r967333759 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala: ## @@ -923,7 +966,11 @@ trait CheckAnalysis extends PredicateHelper with

[GitHub] [spark] dtenedor commented on a diff in pull request #37840: [SPARK-40394][SQL] Move subquery expression CheckAnalysis error messages to use the new error framework

2022-09-09 Thread GitBox
dtenedor commented on code in PR #37840: URL: https://github.com/apache/spark/pull/37840#discussion_r967333176 ## sql/core/src/test/resources/sql-tests/results/join-lateral.sql.out: ## @@ -323,14 +322,10 @@ SELECT * FROM t1, LATERAL (SELECT rand(0) FROM t2) struct<> -- !query

[GitHub] [spark] gengliangwang commented on pull request #37847: [SPARK-40280][SQL][FOLLOWUP][3.3] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
gengliangwang commented on PR #37847: URL: https://github.com/apache/spark/pull/37847#issuecomment-1242311085 @huaxingao Thanks for the ping. I will fix it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37733: [SPARK-40267][DOC] Add description for ExecutorAllocationManager metrics

2022-09-09 Thread GitBox
dongjoon-hyun commented on code in PR #37733: URL: https://github.com/apache/spark/pull/37733#discussion_r967366412 ## docs/monitoring.md: ## @@ -1207,12 +1207,12 @@ This is the component with the largest amount of instrumented metrics - namespace=ExecutorAllocationManager

[GitHub] [spark] dongjoon-hyun commented on pull request #37848: [SPARK-40389][SQL][FollowUp][3.3] Fix a test failure in SQLQuerySuite

2022-09-09 Thread GitBox
dongjoon-hyun commented on PR #37848: URL: https://github.com/apache/spark/pull/37848#issuecomment-1242347363 cc @huaxingao

[GitHub] [spark] roczei commented on a diff in pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-09 Thread GitBox
roczei commented on code in PR #37679: URL: https://github.com/apache/spark/pull/37679#discussion_r967420952 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala: ## @@ -286,7 +284,7 @@ class SessionCatalog( def dropDatabase(db: String,

[GitHub] [spark] dtenedor commented on a diff in pull request #37840: [SPARK-40394][SQL] Move subquery expression CheckAnalysis error messages to use the new error framework

2022-09-09 Thread GitBox
dtenedor commented on code in PR #37840: URL: https://github.com/apache/spark/pull/37840#discussion_r967336318 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala: ## @@ -923,7 +966,11 @@ trait CheckAnalysis extends PredicateHelper with

[GitHub] [spark] dtenedor commented on a diff in pull request #37840: [SPARK-40394][SQL] Move subquery expression CheckAnalysis error messages to use the new error framework

2022-09-09 Thread GitBox
dtenedor commented on code in PR #37840: URL: https://github.com/apache/spark/pull/37840#discussion_r967340825 ## core/src/main/resources/error/error-classes.json: ## @@ -327,6 +327,83 @@ ], "sqlState" : "42000" }, + "INVALID_SUBQUERY_EXPRESSION" : { +

[GitHub] [spark] dtenedor commented on a diff in pull request #37840: [SPARK-40394][SQL] Move subquery expression CheckAnalysis error messages to use the new error framework

2022-09-09 Thread GitBox
dtenedor commented on code in PR #37840: URL: https://github.com/apache/spark/pull/37840#discussion_r967340575 ## core/src/main/resources/error/error-classes.json: ## @@ -327,6 +327,83 @@ ], "sqlState" : "42000" }, + "INVALID_SUBQUERY_EXPRESSION" : { +

[GitHub] [spark] huaxingao commented on pull request #37846: [SPARK-40280][SQL][FOLLOWUP][3.2] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
huaxingao commented on PR #37846: URL: https://github.com/apache/spark/pull/37846#issuecomment-1242313810 The test failure doesn't seem to be related to this PR. I will merge this PR.

[GitHub] [spark] LuciferYang commented on a diff in pull request #37843: [WIP][SPARK-40398][SQL] Use Loop instead of Arrays.stream api

2022-09-09 Thread GitBox
LuciferYang commented on code in PR #37843: URL: https://github.com/apache/spark/pull/37843#discussion_r967265692 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java: ## @@ -393,4 +384,20 @@ private String joinListToString( }

[GitHub] [spark] Bahigac commented on pull request #35220: [SPARK-37922][SQL] Combine to one cast if we can safely up-cast two casts

2022-09-09 Thread GitBox
Bahigac commented on PR #35220: URL: https://github.com/apache/spark/pull/35220#issuecomment-1242201044 > ### What changes were proposed in this pull request? > > > > This PR improves `SimplifyCasts` to combine into one cast if they are both `NumericType` and can safely

[GitHub] [spark] dtenedor commented on a diff in pull request #37840: [SPARK-40394][SQL] Move subquery expression CheckAnalysis error messages to use the new error framework

2022-09-09 Thread GitBox
dtenedor commented on code in PR #37840: URL: https://github.com/apache/spark/pull/37840#discussion_r967338829 ## core/src/main/resources/error/error-classes.json: ## @@ -327,6 +327,83 @@ ], "sqlState" : "42000" }, + "INVALID_SUBQUERY_EXPRESSION" : { +

[GitHub] [spark] huaxingao commented on pull request #37846: [SPARK-40280][SQL][FOLLOWUP][3.2] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
huaxingao commented on PR #37846: URL: https://github.com/apache/spark/pull/37846#issuecomment-1242324401 @zzcclp Could you please update the PR description to fill all the required information?

[GitHub] [spark] huaxingao commented on pull request #37847: [SPARK-40280][SQL][FOLLOWUP][3.3] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
huaxingao commented on PR #37847: URL: https://github.com/apache/spark/pull/37847#issuecomment-1242325279 @zzcclp Could you please update the PR description to fill all the required information? Thanks!

[GitHub] [spark] dongjoon-hyun commented on pull request #37468: [SPARK-40034][SQL] PathOutputCommitters to support dynamic partitions

2022-09-09 Thread GitBox
dongjoon-hyun commented on PR #37468: URL: https://github.com/apache/spark/pull/37468#issuecomment-1242332651 Thank you, @steveloughran and @attilapiros .

[GitHub] [spark] huaxingao commented on pull request #37847: [SPARK-40280][SQL][FOLLOWUP][3.3] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
huaxingao commented on PR #37847: URL: https://github.com/apache/spark/pull/37847#issuecomment-1242332923 Merged to 3.3. Thanks @zzcclp for fixing this! Please remember to update the PR description. Thanks @revans2 @tgravescs for reviewing!

[GitHub] [spark] roczei commented on pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-09 Thread GitBox
roczei commented on PR #37679: URL: https://github.com/apache/spark/pull/37679#issuecomment-1242406713 > we should also update `V2SessionCatalog.defaultNamespace` @cloud-fan, for example this change will be good? ``` diff --git

[GitHub] [spark] LuciferYang commented on a diff in pull request #37826: [SPARK-40364][CORE] Use the unified `DBProvider#initDB ` method

2022-09-09 Thread GitBox
LuciferYang commented on code in PR #37826: URL: https://github.com/apache/spark/pull/37826#discussion_r967405666 ## common/network-common/src/main/java/org/apache/spark/network/util/LevelDBProvider.java: ## @@ -85,14 +84,6 @@ public static DB initLevelDB(File dbFile,

[GitHub] [spark] srowen commented on a diff in pull request #37842: [SPARK-40396][BUILD] Update scalatest and scalatestplus related dependencies to use stable version

2022-09-09 Thread GitBox
srowen commented on code in PR #37842: URL: https://github.com/apache/spark/pull/37842#discussion_r967248466 ## pom.xml: ## @@ -1139,37 +1139,38 @@ org.scalatest scalatest_${scala.binary.version} -3.3.0-SNAP3 +3.2.13 Review Comment:

[GitHub] [spark] dongjoon-hyun closed pull request #37848: [SPARK-40389][SQL][FollowUp][3.3] Fix a test failure in SQLQuerySuite

2022-09-09 Thread GitBox
dongjoon-hyun closed pull request #37848: [SPARK-40389][SQL][FollowUp][3.3] Fix a test failure in SQLQuerySuite URL: https://github.com/apache/spark/pull/37848

[GitHub] [spark] tgravescs commented on a diff in pull request #37826: [SPARK-40364][CORE] Use the unified `DBProvider#initDB ` method

2022-09-09 Thread GitBox
tgravescs commented on code in PR #37826: URL: https://github.com/apache/spark/pull/37826#discussion_r967392150 ## common/network-common/src/main/java/org/apache/spark/network/util/LevelDBProvider.java: ## @@ -85,14 +84,6 @@ public static DB initLevelDB(File dbFile,

[GitHub] [spark] LuciferYang commented on a diff in pull request #37842: [SPARK-40396][BUILD] Update scalatest and scalatestplus related dependencies to use stable version

2022-09-09 Thread GitBox
LuciferYang commented on code in PR #37842: URL: https://github.com/apache/spark/pull/37842#discussion_r967250684 ## pom.xml: ## @@ -1109,7 +1109,7 @@ org.scala-lang.modules scala-xml_${scala.binary.version} -1.2.0 +2.1.0 Review

[GitHub] [spark] LuciferYang commented on a diff in pull request #37843: [WIP][SPARK-40398][SQL] Use Loop instead of Arrays.stream api

2022-09-09 Thread GitBox
LuciferYang commented on code in PR #37843: URL: https://github.com/apache/spark/pull/37843#discussion_r967260907 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java: ## @@ -393,4 +384,20 @@ private String joinListToString( }

[GitHub] [spark] gengliangwang commented on a diff in pull request #37841: [SPARK-40324][SQL] Provide query context in AnalysisException

2022-09-09 Thread GitBox
gengliangwang commented on code in PR #37841: URL: https://github.com/apache/spark/pull/37841#discussion_r967326002 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala: ## @@ -77,12 +77,6 @@ class ExpressionTypeCheckingSuite

[GitHub] [spark] dtenedor commented on a diff in pull request #37840: [SPARK-40394][SQL] Move subquery expression CheckAnalysis error messages to use the new error framework

2022-09-09 Thread GitBox
dtenedor commented on code in PR #37840: URL: https://github.com/apache/spark/pull/37840#discussion_r967332227 ## core/src/main/resources/error/error-classes.json: ## @@ -327,6 +327,83 @@ ], "sqlState" : "42000" }, + "INVALID_SUBQUERY_EXPRESSION" : { +

[GitHub] [spark] LuciferYang commented on pull request #37843: [WIP][SPARK-40398][SQL] Use Loop instead of Arrays.stream api

2022-09-09 Thread GitBox
LuciferYang commented on PR #37843: URL: https://github.com/apache/spark/pull/37843#issuecomment-1242204716 Yes, there are other cases. I am sorting out the test data and hope to fix them all in this one

[GitHub] [spark] ahshahid commented on pull request #37824: [SPARK-40362][SQL] Bug in Canonicalization of expressions like Add & Multiply i.e Commutative Operators

2022-09-09 Thread GitBox
ahshahid commented on PR #37824: URL: https://github.com/apache/spark/pull/37824#issuecomment-1242223041 > Agree with @peter-toth that we should do all ordering in the 2nd pass, in a bottom-up way. I suppose @cloud-fan @peter-toth you want to code the change...? Or you want me to

[GitHub] [spark] wankunde commented on pull request #37533: [SPARK-40096]Fix finalize shuffle stage slow due to connection creation slow

2022-09-09 Thread GitBox
wankunde commented on PR #37533: URL: https://github.com/apache/spark/pull/37533#issuecomment-1242234794 I'm sorry for the late reply, I have updated the code.

[GitHub] [spark] dtenedor commented on a diff in pull request #37840: [SPARK-40394][SQL] Move subquery expression CheckAnalysis error messages to use the new error framework

2022-09-09 Thread GitBox
dtenedor commented on code in PR #37840: URL: https://github.com/apache/spark/pull/37840#discussion_r967339252 ## core/src/main/resources/error/error-classes.json: ## @@ -327,6 +327,83 @@ ], "sqlState" : "42000" }, + "INVALID_SUBQUERY_EXPRESSION" : { +

[GitHub] [spark] dongjoon-hyun closed pull request #37468: [SPARK-40034][SQL] PathOutputCommitters to support dynamic partitions

2022-09-09 Thread GitBox
dongjoon-hyun closed pull request #37468: [SPARK-40034][SQL] PathOutputCommitters to support dynamic partitions URL: https://github.com/apache/spark/pull/37468

[GitHub] [spark] dongjoon-hyun commented on pull request #37468: [SPARK-40034][SQL] PathOutputCommitters to support dynamic partitions

2022-09-09 Thread GitBox
dongjoon-hyun commented on PR #37468: URL: https://github.com/apache/spark/pull/37468#issuecomment-1242329502 Merged to master for Apache Spark 3.4.0.

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37733: [SPARK-40267][DOC] Add description for ExecutorAllocationManager metrics

2022-09-09 Thread GitBox
dongjoon-hyun commented on code in PR #37733: URL: https://github.com/apache/spark/pull/37733#discussion_r967367956 ## docs/monitoring.md: ## @@ -1207,12 +1207,12 @@ This is the component with the largest amount of instrumented metrics - namespace=ExecutorAllocationManager

[GitHub] [spark] dtenedor commented on a diff in pull request #37840: [SPARK-40394][SQL] Move subquery expression CheckAnalysis error messages to use the new error framework

2022-09-09 Thread GitBox
dtenedor commented on code in PR #37840: URL: https://github.com/apache/spark/pull/37840#discussion_r967378293 ## sql/core/src/test/resources/sql-tests/results/join-lateral.sql.out: ## @@ -323,14 +322,10 @@ SELECT * FROM t1, LATERAL (SELECT rand(0) FROM t2) struct<> -- !query

[GitHub] [spark] huaxingao commented on pull request #37847: [SPARK-40280][SQL][FOLLOWUP][3.3] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
huaxingao commented on PR #37847: URL: https://github.com/apache/spark/pull/37847#issuecomment-1242312672 I will merge this PR since the test failure is not related to this PR.

[GitHub] [spark] huaxingao commented on pull request #37847: [SPARK-40280][SQL][FOLLOWUP][3.3] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
huaxingao commented on PR #37847: URL: https://github.com/apache/spark/pull/37847#issuecomment-1242312241 Thanks @gengliangwang !

[GitHub] [spark] huaxingao commented on pull request #37846: [SPARK-40280][SQL][FOLLOWUP][3.2] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
huaxingao commented on PR #37846: URL: https://github.com/apache/spark/pull/37846#issuecomment-1242334119 Merged to 3.2. Thanks @zzcclp for fixing this! Also thanks @tgravescs @revans2 @wangyum for reviewing!

[GitHub] [spark] huaxingao closed pull request #37847: [SPARK-40280][SQL][FOLLOWUP][3.3] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
huaxingao closed pull request #37847: [SPARK-40280][SQL][FOLLOWUP][3.3] Fix 'ParquetFilterSuite' issue URL: https://github.com/apache/spark/pull/37847

[GitHub] [spark] huaxingao closed pull request #37846: [SPARK-40280][SQL][FOLLOWUP][3.2] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
huaxingao closed pull request #37846: [SPARK-40280][SQL][FOLLOWUP][3.2] Fix 'ParquetFilterSuite' issue URL: https://github.com/apache/spark/pull/37846

[GitHub] [spark] gengliangwang opened a new pull request, #37848: [SPARK-40389][SQL][FollowUp][3.3] Fix a test failure in SQLQuerySuite

2022-09-09 Thread GitBox
gengliangwang opened a new pull request, #37848: URL: https://github.com/apache/spark/pull/37848 ### What changes were proposed in this pull request? Fix a test failure in SQLQuerySuite on branch-3.3. It's from the backport of https://github.com/apache/spark/pull/37832 since

[GitHub] [spark] dtenedor commented on pull request #37841: [SPARK-40324][SQL] Provide query context in AnalysisException

2022-09-09 Thread GitBox
dtenedor commented on PR #37841: URL: https://github.com/apache/spark/pull/37841#issuecomment-1242354926 cc @dtenedor myself for context

[GitHub] [spark] dongjoon-hyun closed pull request #37716: [SPARK-40269][CORE] Randomize the orders of peer in BlockManagerDecommissioner

2022-09-09 Thread GitBox
dongjoon-hyun closed pull request #37716: [SPARK-40269][CORE] Randomize the orders of peer in BlockManagerDecommissioner URL: https://github.com/apache/spark/pull/37716

[GitHub] [spark] LuciferYang commented on a diff in pull request #37826: [SPARK-40364][CORE] Use the unified `DBProvider#initDB ` method

2022-09-09 Thread GitBox
LuciferYang commented on code in PR #37826: URL: https://github.com/apache/spark/pull/37826#discussion_r967386144 ## common/network-common/src/main/java/org/apache/spark/network/util/LevelDBProvider.java: ## @@ -85,14 +84,6 @@ public static DB initLevelDB(File dbFile,

[GitHub] [spark] dongjoon-hyun closed pull request #37802: [SPARK-40350][Kubernetes] Use spark config to configure the parameters of volcano podgroup

2022-09-09 Thread GitBox
dongjoon-hyun closed pull request #37802: [SPARK-40350][Kubernetes] Use spark config to configure the parameters of volcano podgroup URL: https://github.com/apache/spark/pull/37802

[GitHub] [spark] dongjoon-hyun commented on pull request #37802: [SPARK-40350][Kubernetes] Use spark config to configure the parameters of volcano podgroup

2022-09-09 Thread GitBox
dongjoon-hyun commented on PR #37802: URL: https://github.com/apache/spark/pull/37802#issuecomment-1242559032 Let me close this PR for now. We can continue our discussion on this PR.

[GitHub] [spark] dongjoon-hyun commented on pull request #37813: [SPARK-40228][SQL][3.3] Do not simplify multiLike if child is not a cheap expression

2022-09-09 Thread GitBox
dongjoon-hyun commented on PR #37813: URL: https://github.com/apache/spark/pull/37813#issuecomment-1242562694 Merged to branch-3.3. Thank you, @wangyum and @cloud-fan . I added comment during backporting. - https://github.com/apache/spark/pull/37813#discussion_r967496771

[GitHub] [spark] dongjoon-hyun closed pull request #37813: [SPARK-40228][SQL][3.3] Do not simplify multiLike if child is not a cheap expression

2022-09-09 Thread GitBox
dongjoon-hyun closed pull request #37813: [SPARK-40228][SQL][3.3] Do not simplify multiLike if child is not a cheap expression URL: https://github.com/apache/spark/pull/37813

[GitHub] [spark] wangyum closed pull request #37849: [SPARK-40401][CORE] Remove the support of deprecated `spark.akka.*` configs

2022-09-09 Thread GitBox
wangyum closed pull request #37849: [SPARK-40401][CORE] Remove the support of deprecated `spark.akka.*` configs URL: https://github.com/apache/spark/pull/37849

[GitHub] [spark] dongjoon-hyun opened a new pull request, #37849: [SPARK-40401][CORE] Remove the support of deprecated `spark.akka.*` configs

2022-09-09 Thread GitBox
dongjoon-hyun opened a new pull request, #37849: URL: https://github.com/apache/spark/pull/37849 ### What changes were proposed in this pull request? This PR aims to remove the support of `spark.akka.*` configs. ### Why are the changes needed? - Apache Spark 2.0+ is not

[GitHub] [spark] wangyum commented on pull request #37849: [SPARK-40401][CORE] Remove the support of deprecated `spark.akka.*` configs

2022-09-09 Thread GitBox
wangyum commented on PR #37849: URL: https://github.com/apache/spark/pull/37849#issuecomment-1242622789 Merged to master.

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37813: [SPARK-40228][SQL][3.3] Do not simplify multiLike if child is not a cheap expression

2022-09-09 Thread GitBox
dongjoon-hyun commented on code in PR #37813: URL: https://github.com/apache/spark/pull/37813#discussion_r967496771 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala: ## @@ -1075,6 +1075,16 @@ object CollapseProject extends Rule[LogicalPlan]

[GitHub] [spark] dongjoon-hyun closed pull request #37808: [SPARK-39830][SQL][TESTS][3.3] Add a test case to read ORC table that requires type promotion

2022-09-09 Thread GitBox
dongjoon-hyun closed pull request #37808: [SPARK-39830][SQL][TESTS][3.3] Add a test case to read ORC table that requires type promotion URL: https://github.com/apache/spark/pull/37808

[GitHub] [spark] dongjoon-hyun closed pull request #37730: [SPARK-39915][SQL][3.3] Dataset.repartition(N) may not create N partitions Non-AQE part

2022-09-09 Thread GitBox
dongjoon-hyun closed pull request #37730: [SPARK-39915][SQL][3.3] Dataset.repartition(N) may not create N partitions Non-AQE part URL: https://github.com/apache/spark/pull/37730

[GitHub] [spark] dongjoon-hyun commented on pull request #37849: [SPARK-40401][CORE] Remove the support of deprecated `spark.akka.*` configs

2022-09-09 Thread GitBox
dongjoon-hyun commented on PR #37849: URL: https://github.com/apache/spark/pull/37849#issuecomment-1242574665 cc @srowen

[GitHub] [spark] fanyilun commented on pull request #37803: [SPARK-39546][K8S] Support `ports` definition in executor pod template

2022-09-09 Thread GitBox
fanyilun commented on PR #37803: URL: https://github.com/apache/spark/pull/37803#issuecomment-1242604703 Thanks, driver pod template already supports ports definition. It works for me.

[GitHub] [spark] dongjoon-hyun closed pull request #37603: [SPARK-40168][CORE] Handle `SparkException` during shuffle block migration

2022-09-09 Thread GitBox
dongjoon-hyun closed pull request #37603: [SPARK-40168][CORE] Handle `SparkException` during shuffle block migration URL: https://github.com/apache/spark/pull/37603

[GitHub] [spark] srowen commented on pull request #37806: [MINOR][SQL] Print stacktrace when NoClassDefFoundError in HiveDelegationToken

2022-09-09 Thread GitBox
srowen commented on PR #37806: URL: https://github.com/apache/spark/pull/37806#issuecomment-1242549952 Merged to master

[GitHub] [spark] srowen closed pull request #37806: [MINOR][SQL] Print stacktrace when NoClassDefFoundError in HiveDelegationToken

2022-09-09 Thread GitBox
srowen closed pull request #37806: [MINOR][SQL] Print stacktrace when NoClassDefFoundError in HiveDelegationToken URL: https://github.com/apache/spark/pull/37806

[GitHub] [spark] srowen commented on pull request #37820: [MINOR][PS][DOCS] Fix note in missing pandas

2022-09-09 Thread GitBox
srowen commented on PR #37820: URL: https://github.com/apache/spark/pull/37820#issuecomment-1242550304 Merged to master

[GitHub] [spark] srowen closed pull request #37820: [MINOR][PS][DOCS] Fix note in missing pandas

2022-09-09 Thread GitBox
srowen closed pull request #37820: [MINOR][PS][DOCS] Fix note in missing pandas URL: https://github.com/apache/spark/pull/37820

[GitHub] [spark] zzcclp commented on pull request #37846: [SPARK-40280][SQL][FOLLOWUP][3.2] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
zzcclp commented on PR #37846: URL: https://github.com/apache/spark/pull/37846#issuecomment-1242578594 > @zzcclp Could you please update the PR description to fill all the required information? Done.

[GitHub] [spark] zzcclp commented on pull request #37847: [SPARK-40280][SQL][FOLLOWUP][3.3] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
zzcclp commented on PR #37847: URL: https://github.com/apache/spark/pull/37847#issuecomment-1242578672 > @zzcclp Could you please update the PR description to fill all the required information? Thanks! Done.

[GitHub] [spark] huaxingao commented on pull request #37847: [SPARK-40280][SQL][FOLLOWUP][3.3] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
huaxingao commented on PR #37847: URL: https://github.com/apache/spark/pull/37847#issuecomment-1242307565 @gengliangwang This test failed. ``` [info] - SPARK-40389: Don't eliminate a cast which can cause overflow *** FAILED *** (227 milliseconds) [info] "The value

[GitHub] [spark] dongjoon-hyun commented on pull request #37848: [SPARK-40389][SQL][FollowUp][3.3] Fix a test failure in SQLQuerySuite

2022-09-09 Thread GitBox
dongjoon-hyun commented on PR #37848: URL: https://github.com/apache/spark/pull/37848#issuecomment-1242353212 Thank you, @gengliangwang and @huaxingao . I tested this manually. Merged to branch-3.3 to recover the branch.

[GitHub] [spark] LuciferYang opened a new pull request, #37844: [DON'T MERGE] Upgrade slf4j to 2.0.0

2022-09-09 Thread GitBox
LuciferYang opened a new pull request, #37844: URL: https://github.com/apache/spark/pull/37844 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] LuciferYang commented on pull request #37604: [DON'T MERGE] Try to replace all `json4s` with `Jackson`

2022-09-09 Thread GitBox
LuciferYang commented on PR #37604: URL: https://github.com/apache/spark/pull/37604#issuecomment-1241543683 @plokhotnyuk Let me learn about [jsoniter-scala](https://github.com/plokhotnyuk/jsoniter-scala) first

[GitHub] [spark] dongjoon-hyun commented on pull request #37468: [SPARK-40034][SQL] PathOutputCommitters to support dynamic partitions

2022-09-09 Thread GitBox
dongjoon-hyun commented on PR #37468: URL: https://github.com/apache/spark/pull/37468#issuecomment-1241563463 BTW, when is the ETA for Apache Hadoop 3.3.5, @steveloughran ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] LuciferYang commented on pull request #37843: [WIP][SPARK-40398][SQL] Use Loop instead of Arrays.stream api

2022-09-09 Thread GitBox
LuciferYang commented on PR #37843: URL: https://github.com/apache/spark/pull/37843#issuecomment-1241538954 Will add more similar cases -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] LuciferYang commented on pull request #37844: [DON'T MERGE] Upgrade slf4j to 2.0.0

2022-09-09 Thread GitBox
LuciferYang commented on PR #37844: URL: https://github.com/apache/spark/pull/37844#issuecomment-1241573384 Test API compatibility first -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] cloud-fan commented on a diff in pull request #36027: [SPARK-38717][SQL] Handle Hive's bucket spec case preserving behaviour

2022-09-09 Thread GitBox
cloud-fan commented on code in PR #36027: URL: https://github.com/apache/spark/pull/36027#discussion_r966724740 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala: ## @@ -1095,7 +1095,11 @@ private[hive] object HiveClientImpl extends Logging {

[GitHub] [spark] cloud-fan commented on a diff in pull request #37841: [SPARK-40324][SQL] Provide query context in AnalysisException

2022-09-09 Thread GitBox
cloud-fan commented on code in PR #37841: URL: https://github.com/apache/spark/pull/37841#discussion_r966737035 ## sql/catalyst/src/main/scala/org/apache/spark/sql/AnalysisException.scala: ## @@ -124,12 +126,16 @@ class AnalysisException protected[sql] ( plan:

[GitHub] [spark] peter-toth commented on pull request #37824: [SPARK-40362][SQL] Bug in Canonicalization of expressions like Add & Multiply i.e Commutative Operators

2022-09-09 Thread GitBox
peter-toth commented on PR #37824: URL: https://github.com/apache/spark/pull/37824#issuecomment-1241638796 > > I think we could simply move the ordering logic from `BinaryComparison.preCanonicalized` to `Canonicalize.reorderCommutativeOperators` (and rename it to `reorderOperators`) and

[GitHub] [spark] beliefer commented on a diff in pull request #37830: [SPARK-40387][SQL] Improve the implementation of Spark Decimal

2022-09-09 Thread GitBox
beliefer commented on code in PR #37830: URL: https://github.com/apache/spark/pull/37830#discussion_r966770205 ## sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala: ## @@ -184,68 +184,56 @@ final class Decimal extends Ordered[Decimal] with Serializable {

[GitHub] [spark] peter-toth commented on a diff in pull request #36027: [SPARK-38717][SQL] Handle Hive's bucket spec case preserving behaviour

2022-09-09 Thread GitBox
peter-toth commented on code in PR #36027: URL: https://github.com/apache/spark/pull/36027#discussion_r966793263 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala: ## @@ -1095,7 +1095,11 @@ private[hive] object HiveClientImpl extends Logging {

[GitHub] [spark] peter-toth commented on a diff in pull request #36027: [SPARK-38717][SQL] Handle Hive's bucket spec case preserving behaviour

2022-09-09 Thread GitBox
peter-toth commented on code in PR #36027: URL: https://github.com/apache/spark/pull/36027#discussion_r966809657 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala: ## @@ -1095,7 +1095,11 @@ private[hive] object HiveClientImpl extends Logging {

[GitHub] [spark] zhengruifeng commented on a diff in pull request #37845: [SPARK-40399][PS] Make `pearson` correlation in `DataFrame.corr` support missing values and `min_periods `

2022-09-09 Thread GitBox
zhengruifeng commented on code in PR #37845: URL: https://github.com/apache/spark/pull/37845#discussion_r966813948 ## python/pyspark/pandas/tests/test_stats.py: ## @@ -257,6 +257,32 @@ def test_skew_kurt_numerical_stability(self): self.assert_eq(psdf.skew(),

[GitHub] [spark] cloud-fan commented on a diff in pull request #36027: [SPARK-38717][SQL] Handle Hive's bucket spec case preserving behaviour

2022-09-09 Thread GitBox
cloud-fan commented on code in PR #36027: URL: https://github.com/apache/spark/pull/36027#discussion_r966824596 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala: ## @@ -1095,7 +1095,11 @@ private[hive] object HiveClientImpl extends Logging {

[GitHub] [spark] cloud-fan commented on a diff in pull request #37830: [SPARK-40387][SQL] Improve the implementation of Spark Decimal

2022-09-09 Thread GitBox
cloud-fan commented on code in PR #37830: URL: https://github.com/apache/spark/pull/37830#discussion_r966829229 ## sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala: ## @@ -487,12 +488,12 @@ final class Decimal extends Ordered[Decimal] with Serializable {

[GitHub] [spark] AmplabJenkins commented on pull request #37819: [SPARK-40377][SQL] Allow customize maxBroadcastTableBytes and maxBroadcastRows

2022-09-09 Thread GitBox
AmplabJenkins commented on PR #37819: URL: https://github.com/apache/spark/pull/37819#issuecomment-1241842477 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] AmplabJenkins commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

2022-09-09 Thread GitBox
AmplabJenkins commented on PR #37817: URL: https://github.com/apache/spark/pull/37817#issuecomment-1241842534 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] wangyum commented on a diff in pull request #36850: [SPARK-39069][SQL] Enhance ConstantPropagation to replace constants in inequality predicates

2022-09-09 Thread GitBox
wangyum commented on code in PR #36850: URL: https://github.com/apache/spark/pull/36850#discussion_r966887563 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala: ## @@ -205,10 +205,14 @@ object ConstantPropagation extends Rule[LogicalPlan]

[GitHub] [spark] beliefer commented on a diff in pull request #37830: [SPARK-40387][SQL] Improve the implementation of Spark Decimal

2022-09-09 Thread GitBox
beliefer commented on code in PR #37830: URL: https://github.com/apache/spark/pull/37830#discussion_r966946241 ## sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala: ## @@ -504,7 +505,7 @@ final class Decimal extends Ordered[Decimal] with Serializable {

[GitHub] [spark] SelfImpr001 commented on a diff in pull request #37732: [SPARK-40253] [SQL] Fixed loss of precision for writing 0.00 specific…

2022-09-09 Thread GitBox
SelfImpr001 commented on code in PR #37732: URL: https://github.com/apache/spark/pull/37732#discussion_r967056734 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala: ## @@ -76,8 +76,19 @@ object Literal { val decimal = Decimal(d)

[GitHub] [spark] srowen commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

2022-09-09 Thread GitBox
srowen commented on PR #37817: URL: https://github.com/apache/spark/pull/37817#issuecomment-1241962781 Oh, remove the type ignore comment: ``` annotations failed mypy checks: python/pyspark/sql/pandas/conversion.py:298: error: unused "type: ignore" comment Found 1 error in 1

[GitHub] [spark] srowen commented on a diff in pull request #37732: [SPARK-40253] [SQL] Fixed loss of precision for writing 0.00 specific…

2022-09-09 Thread GitBox
srowen commented on code in PR #37732: URL: https://github.com/apache/spark/pull/37732#discussion_r967060213 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala: ## @@ -76,8 +76,19 @@ object Literal { val decimal = Decimal(d)

[GitHub] [spark] srowen commented on pull request #37845: [SPARK-40399][PS] Make `pearson` correlation in `DataFrame.corr` support missing values and `min_periods `

2022-09-09 Thread GitBox
srowen commented on PR #37845: URL: https://github.com/apache/spark/pull/37845#issuecomment-1241970409 Hm, does another library or method in Spark do this? It feels weird to have a method that computes "mostly a correlation" ignoring data -- This is an automated message from the Apache

[GitHub] [spark] wangyum closed pull request #37732: [SPARK-40253] [SQL] Fixed loss of precision for writing 0.00 specific…

2022-09-09 Thread GitBox
wangyum closed pull request #37732: [SPARK-40253] [SQL] Fixed loss of precision for writing 0.00 specific… URL: https://github.com/apache/spark/pull/37732 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] srowen commented on a diff in pull request #37842: [SPARK-40396][BUILD] Update scalatest and scalatestplus related dependencies to use stable version

2022-09-09 Thread GitBox
srowen commented on code in PR #37842: URL: https://github.com/apache/spark/pull/37842#discussion_r967068327 ## pom.xml: ## @@ -1139,37 +1139,38 @@ org.scalatest scalatest_${scala.binary.version} -3.3.0-SNAP3 +3.2.13 Review Comment:

[GitHub] [spark] srowen commented on pull request #37839: correct typo in rdd-programming-guide.md

2022-09-09 Thread GitBox
srowen commented on PR #37839: URL: https://github.com/apache/spark/pull/37839#issuecomment-1241972895 Again - not a typo -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] srowen closed pull request #37839: correct typo in rdd-programming-guide.md

2022-09-09 Thread GitBox
srowen closed pull request #37839: correct typo in rdd-programming-guide.md URL: https://github.com/apache/spark/pull/37839 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] tgravescs commented on pull request #37847: [SPARK-40280][SQL][FOLLOWUP][3.3] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
tgravescs commented on PR #37847: URL: https://github.com/apache/spark/pull/37847#issuecomment-1241976659 thanks for fixing @zzcclp I should have built it on these before merging -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] tgravescs commented on pull request #37846: [SPARK-40280][SQL][FOLLOWUP][3.2] Fix 'ParquetFilterSuite' issue

2022-09-09 Thread GitBox
tgravescs commented on PR #37846: URL: https://github.com/apache/spark/pull/37846#issuecomment-1241976737 thanks for fixing @zzcclp I should have built it on these before merging -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] EnricoMi commented on a diff in pull request #37407: [SPARK-39876][SQL] Add UNPIVOT to SQL syntax

2022-09-09 Thread GitBox
EnricoMi commented on code in PR #37407: URL: https://github.com/apache/spark/pull/37407#discussion_r967091275 ## sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4: ## @@ -618,6 +618,46 @@ pivotValue : expression (AS? identifier)? ;
