[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38503: [SPARK-40940] Remove Multi-stateful operator checkers for streaming queries.

2022-11-14 Thread GitBox
HeartSaVioR commented on code in PR #38503: URL: https://github.com/apache/spark/pull/38503#discussion_r1021175378 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite.scala: ## @@ -240,25 +240,30 @@ class

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38503: [SPARK-40940] Remove Multi-stateful operator checkers for streaming queries.

2022-11-14 Thread GitBox
HeartSaVioR commented on code in PR #38503: URL: https://github.com/apache/spark/pull/38503#discussion_r1021193999 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingDeduplicationSuite.scala: ## @@ -190,20 +190,25 @@ class StreamingDeduplicationSuite extends

[GitHub] [spark] zhengruifeng opened a new pull request, #38653: [SPARK-41128][CONNECT][PYTHON] Implement `DataFrame.fillna` and `DataFrame.na.fill`

2022-11-14 Thread GitBox
zhengruifeng opened a new pull request, #38653: URL: https://github.com/apache/spark/pull/38653 ### What changes were proposed in this pull request? Implement `DataFrame.fillna` and `DataFrame.na.fill` ### Why are the changes needed? For API coverage ### Does

[GitHub] [spark] MaxGekk opened a new pull request, #38656: [WIP][SPARK-41140][SQL] Rename the error class `_LEGACY_ERROR_TEMP_2440` to `INVALID_WHERE_CONDITION`

2022-11-14 Thread GitBox
MaxGekk opened a new pull request, #38656: URL: https://github.com/apache/spark/pull/38656 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this

[GitHub] [spark] itholic commented on pull request #38658: [SPARK-41109][CORE][FOLLOWUP] Re-order error class to fix `SparkThrowableSuite`

2022-11-14 Thread GitBox
itholic commented on PR #38658: URL: https://github.com/apache/spark/pull/38658#issuecomment-1313545319 Thanks for fixing this. Do you know why this suddenly started failing? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] cloud-fan closed pull request #38626: [SPARK-38959][SQL][FOLLOWUP] Do not optimize subqueries twice

2022-11-14 Thread GitBox
cloud-fan closed pull request #38626: [SPARK-38959][SQL][FOLLOWUP] Do not optimize subqueries twice URL: https://github.com/apache/spark/pull/38626

[GitHub] [spark] pan3793 opened a new pull request, #38651: [SPARK-41136][K8S] Shorten graceful shutdown time of ExecutorPodsSnapshotsStoreImpl to prevent blocking shutdown process

2022-11-14 Thread GitBox
pan3793 opened a new pull request, #38651: URL: https://github.com/apache/spark/pull/38651 ### What changes were proposed in this pull request? Shorten graceful shutdown time of `ExecutorPodsSnapshotsStoreImpl#stop` from 30s to 20s to prevent blocking shutdown process

[GitHub] [spark] cloud-fan commented on a diff in pull request #38631: [SPARK-40809] [CONNECT] [FOLLOW] Support `alias()` in Python client

2022-11-14 Thread GitBox
cloud-fan commented on code in PR #38631: URL: https://github.com/apache/spark/pull/38631#discussion_r1021211335 ## connector/connect/src/main/protobuf/spark/connect/expressions.proto: ## @@ -170,6 +170,8 @@ message Expression { message Alias { Expression expr = 1; -

[GitHub] [spark] itholic commented on pull request #38652: [SPARK-41137][SQL] Rename `LATERAL_JOIN_OF_TYPE` to `INVALID_LATERAL_JOIN_TYPE`

2022-11-14 Thread GitBox
itholic commented on PR #38652: URL: https://github.com/apache/spark/pull/38652#issuecomment-1313290570 cc @MaxGekk @srielau

[GitHub] [spark] itholic opened a new pull request, #38657: [SPARK-41139][SQL] Improve error class: `PYTHON_UDF_IN_ON_CLAUSE`

2022-11-14 Thread GitBox
itholic opened a new pull request, #38657: URL: https://github.com/apache/spark/pull/38657 ### What changes were proposed in this pull request? This PR proposes to improve the error message and test for `PYTHON_UDF_IN_ON_CLAUSE` ### Why are the changes needed? The

[GitHub] [spark] MaxGekk commented on a diff in pull request #38646: [SPARK-41131][SQL] Improve error message for `UNRESOLVED_MAP_KEY.WITHOUT_SUGGESTION`

2022-11-14 Thread GitBox
MaxGekk commented on code in PR #38646: URL: https://github.com/apache/spark/pull/38646#discussion_r1021194056 ## core/src/main/resources/error/error-classes.json: ## @@ -1044,7 +1044,7 @@ }, "UNRESOLVED_MAP_KEY" : { "message" : [ - "Cannot resolve column as a

[GitHub] [spark] grundprinzip commented on a diff in pull request #38631: [SPARK-40809] [CONNECT] [FOLLOW] Support `alias()` in Python client

2022-11-14 Thread GitBox
grundprinzip commented on code in PR #38631: URL: https://github.com/apache/spark/pull/38631#discussion_r1021214764 ## connector/connect/src/main/protobuf/spark/connect/expressions.proto: ## @@ -170,6 +170,8 @@ message Expression { message Alias { Expression expr = 1;

[GitHub] [spark] cloud-fan commented on a diff in pull request #38648: [SPARK-41134][SQL] Improve error message of internal errors

2022-11-14 Thread GitBox
cloud-fan commented on code in PR #38648: URL: https://github.com/apache/spark/pull/38648#discussion_r1021221885 ## sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala: ## @@ -494,7 +494,8 @@ object QueryExecution { private[sql] def

[GitHub] [spark] EnricoMi commented on a diff in pull request #38356: [SPARK-40885] `Sort` may not take effect when it is the last 'Transform' operator

2022-11-14 Thread GitBox
EnricoMi commented on code in PR #38356: URL: https://github.com/apache/spark/pull/38356#discussion_r1021351324 ## sql/core/src/test/scala/org/apache/spark/sql/sources/PartitionedWriteSuite.scala: ## @@ -220,6 +220,23 @@ class PartitionedWriteSuite extends QueryTest with

[GitHub] [spark] LuciferYang commented on a diff in pull request #38615: [SPARK-41109][SQL] Rename the error class _LEGACY_ERROR_TEMP_1216 to INVALID_LIKE_PATTERN

2022-11-14 Thread GitBox
LuciferYang commented on code in PR #38615: URL: https://github.com/apache/spark/pull/38615#discussion_r1021384022 ## core/src/main/resources/error/error-classes.json: ## @@ -630,6 +630,23 @@ "Input schema can only contain STRING as a key type for a MAP." ] },

[GitHub] [spark] cloud-fan commented on pull request #38626: [SPARK-38959][SQL][FOLLOWUP] Do not optimize subqueries twice

2022-11-14 Thread GitBox
cloud-fan commented on PR #38626: URL: https://github.com/apache/spark/pull/38626#issuecomment-1313245999 The failed test is known to be flaky: ``` SPARK-37555: spark-sql should pass last unclosed comment to backend *** FAILED *** (2 minutes, 10 seconds) [info]

[GitHub] [spark] cloud-fan commented on pull request #38626: [SPARK-38959][SQL][FOLLOWUP] Do not optimize subqueries twice

2022-11-14 Thread GitBox
cloud-fan commented on PR #38626: URL: https://github.com/apache/spark/pull/38626#issuecomment-1313244564 thanks for review, merging to master!

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38503: [SPARK-40940] Remove Multi-stateful operator checkers for streaming queries.

2022-11-14 Thread GitBox
HeartSaVioR commented on code in PR #38503: URL: https://github.com/apache/spark/pull/38503#discussion_r1021196487 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingDeduplicationSuite.scala: ## @@ -190,20 +190,25 @@ class StreamingDeduplicationSuite extends

[GitHub] [spark] itholic opened a new pull request, #38652: [SPARK-41137][SQL] Rename `LATERAL_JOIN_OF_TYPE` to `INVALID_LATERAL_JOIN_TYPE`

2022-11-14 Thread GitBox
itholic opened a new pull request, #38652: URL: https://github.com/apache/spark/pull/38652 ### What changes were proposed in this pull request? This PR proposes to rename `LATERAL_JOIN_OF_TYPE` to `INVALID_LATERAL_JOIN_TYPE`. Also remove this from the sub-class under

[GitHub] [spark] MaxGekk commented on a diff in pull request #38648: [SPARK-41134][SQL] Improve error message of internal errors

2022-11-14 Thread GitBox
MaxGekk commented on code in PR #38648: URL: https://github.com/apache/spark/pull/38648#discussion_r1021200448 ## sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala: ## @@ -494,7 +494,8 @@ object QueryExecution { private[sql] def toInternalError(msg:

[GitHub] [spark] cloud-fan commented on a diff in pull request #38631: [SPARK-40809] [CONNECT] [FOLLOW] Support `alias()` in Python client

2022-11-14 Thread GitBox
cloud-fan commented on code in PR #38631: URL: https://github.com/apache/spark/pull/38631#discussion_r1021212753 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -348,7 +350,16 @@ class SparkConnectPlanner(session:

[GitHub] [spark] grundprinzip commented on a diff in pull request #38631: [SPARK-40809] [CONNECT] [FOLLOW] Support `alias()` in Python client

2022-11-14 Thread GitBox
grundprinzip commented on code in PR #38631: URL: https://github.com/apache/spark/pull/38631#discussion_r1021212933 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -348,7 +350,16 @@ class SparkConnectPlanner(session:

[GitHub] [spark] cloud-fan commented on a diff in pull request #38631: [SPARK-40809] [CONNECT] [FOLLOW] Support `alias()` in Python client

2022-11-14 Thread GitBox
cloud-fan commented on code in PR #38631: URL: https://github.com/apache/spark/pull/38631#discussion_r1021219879 ## connector/connect/src/main/protobuf/spark/connect/expressions.proto: ## @@ -170,6 +170,8 @@ message Expression { message Alias { Expression expr = 1; -

[GitHub] [spark] LuciferYang commented on a diff in pull request #38651: [SPARK-41136][K8S] Shorten graceful shutdown time of ExecutorPodsSnapshotsStoreImpl to prevent blocking shutdown process

2022-11-14 Thread GitBox
LuciferYang commented on code in PR #38651: URL: https://github.com/apache/spark/pull/38651#discussion_r1021225985 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshotsStoreImpl.scala: ## @@ -94,7 +95,9 @@

[GitHub] [spark] MaxGekk commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

2022-11-14 Thread GitBox
MaxGekk commented on code in PR #38531: URL: https://github.com/apache/spark/pull/38531#discussion_r1021226420 ## core/src/main/resources/error/error-classes.json: ## @@ -290,6 +290,46 @@ "Null typed values cannot be used as arguments of ." ] }, +

[GitHub] [spark] zhengruifeng commented on pull request #38654: [SPARK-41005][CONNECT][DOC][FOLLOW-UP] Document the reason of sending batch in main thread

2022-11-14 Thread GitBox
zhengruifeng commented on PR #38654: URL: https://github.com/apache/spark/pull/38654#issuecomment-1313363736 cc @cloud-fan

[GitHub] [spark] zhengruifeng opened a new pull request, #38655: [SPARK-41138][PYTHON] `DataFrame.na.fill` should have the same augment types as `DataFrame.fillna`

2022-11-14 Thread GitBox
zhengruifeng opened a new pull request, #38655: URL: https://github.com/apache/spark/pull/38655 ### What changes were proposed in this pull request? Make `DataFrame.na.fill` have the same augment types as `DataFrame.fillna` ### Why are the changes needed? `DataFrame.na.fill`

[GitHub] [spark] itholic commented on pull request #38657: [SPARK-41139][SQL] Improve error class: `PYTHON_UDF_IN_ON_CLAUSE`

2022-11-14 Thread GitBox
itholic commented on PR #38657: URL: https://github.com/apache/spark/pull/38657#issuecomment-1313489138 cc @srielau @MaxGekk

[GitHub] [spark] Yikun commented on pull request #38064: [SPARK-40622][SQL][CORE]Remove the limitation that single task result must fit in 2GB

2022-11-14 Thread GitBox
Yikun commented on PR #38064: URL: https://github.com/apache/spark/pull/38064#issuecomment-1313530277 The `2GB` should not be the limit of GitHub Actions; there is only [a 7GB

[GitHub] [spark] LuciferYang commented on pull request #38658: [SPARK-41109][CORE][FOLLOWUP] Fix error class order

2022-11-14 Thread GitBox
LuciferYang commented on PR #38658: URL: https://github.com/apache/spark/pull/38658#issuecomment-1313537233 cc @MaxGekk @dongjoon-hyun to fix the GA task: https://github.com/LuciferYang/spark/actions/runs/3459895559/jobs/5775820010

[GitHub] [spark] grundprinzip commented on a diff in pull request #38631: [SPARK-40809] [CONNECT] [FOLLOW] Support `alias()` in Python client

2022-11-14 Thread GitBox
grundprinzip commented on code in PR #38631: URL: https://github.com/apache/spark/pull/38631#discussion_r1021203690 ## connector/connect/src/main/protobuf/spark/connect/expressions.proto: ## @@ -170,6 +170,8 @@ message Expression { message Alias { Expression expr = 1;

[GitHub] [spark] MaxGekk commented on a diff in pull request #38650: [SPARK-41135][SQL] Rename `UNSUPPORTED_EMPTY_LOCATION` to `INVALID_EMPTY_LOCATION`

2022-11-14 Thread GitBox
MaxGekk commented on code in PR #38650: URL: https://github.com/apache/spark/pull/38650#discussion_r1021209665 ## core/src/main/resources/error/error-classes.json: ## @@ -616,6 +616,11 @@ ], "sqlState" : "42000" }, + "INVALID_EMPTY_LOCATION" : { +"message" : [

[GitHub] [spark] MaxGekk commented on a diff in pull request #38648: [SPARK-41134][SQL] Improve error message of internal errors

2022-11-14 Thread GitBox
MaxGekk commented on code in PR #38648: URL: https://github.com/apache/spark/pull/38648#discussion_r1021233627 ## sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala: ## @@ -494,7 +494,8 @@ object QueryExecution { private[sql] def toInternalError(msg:

[GitHub] [spark] zhengruifeng opened a new pull request, #38654: [SPARK-41005][CONNECT][DOC][FOLLOW-UP] Document the reason of sending batch in main thread

2022-11-14 Thread GitBox
zhengruifeng opened a new pull request, #38654: URL: https://github.com/apache/spark/pull/38654 ### What changes were proposed in this pull request? Document the reason of sending batch in main thread ### Why are the changes needed? as per

[GitHub] [spark] grundprinzip commented on a diff in pull request #38631: [SPARK-40809] [CONNECT] [FOLLOW] Support `alias()` in Python client

2022-11-14 Thread GitBox
grundprinzip commented on code in PR #38631: URL: https://github.com/apache/spark/pull/38631#discussion_r1021263016 ## connector/connect/src/main/protobuf/spark/connect/expressions.proto: ## @@ -170,6 +170,8 @@ message Expression { message Alias { Expression expr = 1;

[GitHub] [spark] ulysses-you commented on a diff in pull request #38356: [SPARK-40885] `Sort` may not take effect when it is the last 'Transform' operator

2022-11-14 Thread GitBox
ulysses-you commented on code in PR #38356: URL: https://github.com/apache/spark/pull/38356#discussion_r1021331641 ## sql/core/src/test/scala/org/apache/spark/sql/sources/PartitionedWriteSuite.scala: ## @@ -220,6 +220,23 @@ class PartitionedWriteSuite extends QueryTest with

[GitHub] [spark] LuciferYang opened a new pull request, #38658: [SPARK-41109][CORE][FOLLOWUP] Fix error class order

2022-11-14 Thread GitBox
LuciferYang opened a new pull request, #38658: URL: https://github.com/apache/spark/pull/38658 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] LuciferYang commented on a diff in pull request #38637: [SPARK-41121][BUILD] Upgrade sbt-assembly from 1.2.0 to 2.0.0

2022-11-14 Thread GitBox
LuciferYang commented on code in PR #38637: URL: https://github.com/apache/spark/pull/38637#discussion_r1021393419 ## project/plugins.sbt: ## @@ -25,7 +25,7 @@ libraryDependencies += "com.puppycrawl.tools" % "checkstyle" % "9.3" // checkstyle uses guava 31.0.1-jre.

[GitHub] [spark] LuciferYang commented on pull request #38658: [SPARK-41109][CORE][FOLLOWUP] Re-order error class to fix `SparkThrowableSuite`

2022-11-14 Thread GitBox
LuciferYang commented on PR #38658: URL: https://github.com/apache/spark/pull/38658#issuecomment-1313551135 Still under investigation. It should have been discovered very early

[GitHub] [spark] itholic commented on a diff in pull request #38646: [SPARK-41131][SQL] Improve error message for `UNRESOLVED_MAP_KEY.WITHOUT_SUGGESTION`

2022-11-14 Thread GitBox
itholic commented on code in PR #38646: URL: https://github.com/apache/spark/pull/38646#discussion_r1021411839 ## core/src/main/resources/error/error-classes.json: ## @@ -1044,7 +1044,7 @@ }, "UNRESOLVED_MAP_KEY" : { "message" : [ - "Cannot resolve column as a

[GitHub] [spark] cloud-fan commented on pull request #38631: [SPARK-40809] [CONNECT] [FOLLOW] Support `alias()` in Python client

2022-11-14 Thread GitBox
cloud-fan commented on PR #38631: URL: https://github.com/apache/spark/pull/38631#issuecomment-1313590270 @HyukjinKwon can you review the python side?

[GitHub] [spark] zero323 commented on pull request #38643: [SPARK-41091][BUILD][3.2] Fix Docker release tool for branch-3.2

2022-11-14 Thread GitBox
zero323 commented on PR #38643: URL: https://github.com/apache/spark/pull/38643#issuecomment-1313603005 LGTM. Thanks @sunchao!

[GitHub] [spark] LuciferYang commented on pull request #38658: [SPARK-41109][CORE][FOLLOWUP] Re-order error class to fix `SparkThrowableSuite`

2022-11-14 Thread GitBox
LuciferYang commented on PR #38658: URL: https://github.com/apache/spark/pull/38658#issuecomment-1313667810 > Thanks for fixing this. > > Do you know why this suddenly started failing? @itholic After https://github.com/apache/spark/pull/38615 was merged, there is a code

[GitHub] [spark] dengziming opened a new pull request, #38659: [SPARK-41114][CONNECT] Support local data for LocalRelation

2022-11-14 Thread GitBox
dengziming opened a new pull request, #38659: URL: https://github.com/apache/spark/pull/38659 ### What changes were proposed in this pull request? This PR supports local data for LocalRelation, we have 2 approaches to represent a row: 1. Use Expression.Literal.Struct 2. Use

[GitHub] [spark] AmplabJenkins commented on pull request #38649: [SPARK-41132][SQL] Convert LikeAny and NotLikeAny to InSet if no pattern contains wildcards

2022-11-14 Thread GitBox
AmplabJenkins commented on PR #38649: URL: https://github.com/apache/spark/pull/38649#issuecomment-1313876520 Can one of the admins verify this patch?

[GitHub] [spark] tgravescs commented on pull request #38622: [SPARK-39601][YARN] AllocationFailure should not be treated as exitCausedByApp when driver is shutting down

2022-11-14 Thread GitBox
tgravescs commented on PR #38622: URL: https://github.com/apache/spark/pull/38622#issuecomment-1313919867 can you please add a description to the issue: https://issues.apache.org/jira/projects/SPARK/issues/SPARK-39601

[GitHub] [spark] EnricoMi commented on a diff in pull request #38356: [SPARK-40885] `Sort` may not take effect when it is the last 'Transform' operator

2022-11-14 Thread GitBox
EnricoMi commented on code in PR #38356: URL: https://github.com/apache/spark/pull/38356#discussion_r1021635684 ## sql/core/src/test/scala/org/apache/spark/sql/sources/PartitionedWriteSuite.scala: ## @@ -220,6 +220,23 @@ class PartitionedWriteSuite extends QueryTest with

[GitHub] [spark] tgravescs commented on a diff in pull request #38622: [SPARK-39601][YARN] AllocationFailure should not be treated as exitCausedByApp when driver is shutting down

2022-11-14 Thread GitBox
tgravescs commented on code in PR #38622: URL: https://github.com/apache/spark/pull/38622#discussion_r1021695959 ## resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala: ## @@ -815,6 +815,7 @@ private[spark] class ApplicationMaster(

[GitHub] [spark] peter-toth commented on pull request #38640: [SPARK-41124][SQL][TEST] Add DSv2 PlanStabilitySuites

2022-11-14 Thread GitBox
peter-toth commented on PR #38640: URL: https://github.com/apache/spark/pull/38640#issuecomment-1313560978 cc @cloud-fan

[GitHub] [spark] itholic commented on a diff in pull request #38644: [SPARK-41130][SQL] Rename `OUT_OF_DECIMAL_TYPE_RANGE` to `NUMERIC_OUT_OF_SUPPORTED_RANGE`

2022-11-14 Thread GitBox
itholic commented on code in PR #38644: URL: https://github.com/apache/spark/pull/38644#discussion_r1021424319 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastWithAnsiOnSuite.scala: ## @@ -242,9 +242,13 @@ class CastWithAnsiOnSuite extends

[GitHub] [spark] dengziming commented on a diff in pull request #38638: [SPARK-41122][CONNECT] Explain API can support different modes

2022-11-14 Thread GitBox
dengziming commented on code in PR #38638: URL: https://github.com/apache/spark/pull/38638#discussion_r1021527911 ## connector/connect/src/main/protobuf/spark/connect/base.proto: ## @@ -38,6 +38,30 @@ message Plan { } } +// Plan explanation mode. +enum ExplainMode { +

[GitHub] [spark] LuciferYang commented on pull request #38658: [SPARK-41109][CORE][FOLLOWUP] Re-order error class to fix `SparkThrowableSuite`

2022-11-14 Thread GitBox
LuciferYang commented on PR #38658: URL: https://github.com/apache/spark/pull/38658#issuecomment-1313699403 friendly ping @cloud-fan

[GitHub] [spark] cloud-fan commented on pull request #38640: [SPARK-41124][SQL][TEST] Add DSv2 PlanStabilitySuites

2022-11-14 Thread GitBox
cloud-fan commented on PR #38640: URL: https://github.com/apache/spark/pull/38640#issuecomment-1313585893 This is great! Is it using v2 parquet?

[GitHub] [spark] cloud-fan commented on a diff in pull request #38654: [SPARK-41005][CONNECT][DOC][FOLLOW-UP] Document the reason of sending batch in main thread

2022-11-14 Thread GitBox
cloud-fan commented on code in PR #38654: URL: https://github.com/apache/spark/pull/38654#discussion_r1021448099 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamHandler.scala: ## @@ -168,8 +168,12 @@ class

[GitHub] [spark] peter-toth commented on pull request #38640: [SPARK-41124][SQL][TEST] Add DSv2 PlanStabilitySuites

2022-11-14 Thread GitBox
peter-toth commented on PR #38640: URL: https://github.com/apache/spark/pull/38640#issuecomment-1313612631 > This is great! Is it using v2 parquet? Yes it is, like other stability tests.

[GitHub] [spark] LuciferYang commented on pull request #38658: [SPARK-41109][CORE][FOLLOWUP] Re-order error class to fix `SparkThrowableSuite`

2022-11-14 Thread GitBox
LuciferYang commented on PR #38658: URL: https://github.com/apache/spark/pull/38658#issuecomment-1313618482 An interesting thing. I didn't find logs related to `SparkThrowableSuite` tests in the two tests passed link. Is that my problem?

[GitHub] [spark] LuciferYang commented on pull request #38635: [SPARK-41118][SQL] `to_number`/`try_to_number` should return `null` when format is `null`

2022-11-14 Thread GitBox
LuciferYang commented on PR #38635: URL: https://github.com/apache/spark/pull/38635#issuecomment-1313712161 cc @MaxGekk FYI, Should `fmt` be checked for null in `checkInputDataTypes`?

[GitHub] [spark] cloud-fan commented on a diff in pull request #38356: [SPARK-40885] `Sort` may not take effect when it is the last 'Transform' operator

2022-11-14 Thread GitBox
cloud-fan commented on code in PR #38356: URL: https://github.com/apache/spark/pull/38356#discussion_r1021444905 ## sql/core/src/test/scala/org/apache/spark/sql/sources/PartitionedWriteSuite.scala: ## @@ -220,6 +220,23 @@ class PartitionedWriteSuite extends QueryTest with

[GitHub] [spark] AmplabJenkins commented on pull request #38651: [SPARK-41136][K8S] Shorten graceful shutdown time of ExecutorPodsSnapshotsStoreImpl to prevent blocking shutdown process

2022-11-14 Thread GitBox
AmplabJenkins commented on PR #38651: URL: https://github.com/apache/spark/pull/38651#issuecomment-1313645025 Can one of the admins verify this patch?

[GitHub] [spark] MaxGekk commented on pull request #38658: [SPARK-41109][CORE][FOLLOWUP] Re-order error class to fix `SparkThrowableSuite`

2022-11-14 Thread GitBox
MaxGekk commented on PR #38658: URL: https://github.com/apache/spark/pull/38658#issuecomment-1314019272 +1, LGTM. Merging to master. I have checked the test suite locally: ``` [info] SparkThrowableSuite: ... [info] - prohibit dots in error class names (87 milliseconds) [info]

[GitHub] [spark] MaxGekk closed pull request #38658: [SPARK-41109][CORE][FOLLOWUP] Re-order error class to fix `SparkThrowableSuite`

2022-11-14 Thread GitBox
MaxGekk closed pull request #38658: [SPARK-41109][CORE][FOLLOWUP] Re-order error class to fix `SparkThrowableSuite` URL: https://github.com/apache/spark/pull/38658

[GitHub] [spark] sunchao commented on pull request #38643: [SPARK-41091][BUILD][3.2] Fix Docker release tool for branch-3.2

2022-11-14 Thread GitBox
sunchao commented on PR #38643: URL: https://github.com/apache/spark/pull/38643#issuecomment-1314053963 Thanks @dongjoon-hyun @zero323 ! I think I'm unblocked from 3.2.3 release now.

[GitHub] [spark] pan3793 commented on a diff in pull request #38622: [SPARK-39601][YARN] AllocationFailure should not be treated as exitCausedByApp when driver is shutting down

2022-11-14 Thread GitBox
pan3793 commented on code in PR #38622: URL: https://github.com/apache/spark/pull/38622#discussion_r1021822652 ## resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala: ## @@ -815,6 +815,7 @@ private[spark] class ApplicationMaster(

[GitHub] [spark] pan3793 commented on a diff in pull request #38651: [SPARK-41136][K8S] Shorten graceful shutdown time of ExecutorPodsSnapshotsStoreImpl to prevent blocking shutdown process

2022-11-14 Thread GitBox
pan3793 commented on code in PR #38651: URL: https://github.com/apache/spark/pull/38651#discussion_r1021830231 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshotsStoreImpl.scala: ## @@ -94,7 +95,9 @@ private[spark]

[GitHub] [spark] MaxGekk commented on a diff in pull request #38576: [SPARK-41062][SQL] Rename `UNSUPPORTED_CORRELATED_REFERENCE` to `CORRELATED_REFERENCE`

2022-11-14 Thread GitBox
MaxGekk commented on code in PR #38576: URL: https://github.com/apache/spark/pull/38576#discussion_r1021741191 ## core/src/main/resources/error/error-classes.json: ## @@ -1277,6 +1277,11 @@ "A correlated outer name reference within a subquery expression body was not

[GitHub] [spark] LuciferYang commented on pull request #38658: [SPARK-41109][CORE][FOLLOWUP] Re-order error class to fix `SparkThrowableSuite`

2022-11-14 Thread GitBox
LuciferYang commented on PR #38658: URL: https://github.com/apache/spark/pull/38658#issuecomment-1314066195 Thanks @MaxGekk @cloud-fan @itholic -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] pan3793 commented on pull request #38622: [SPARK-39601][YARN] AllocationFailure should not be treated as exitCausedByApp when driver is shutting down

2022-11-14 Thread GitBox
pan3793 commented on PR #38622: URL: https://github.com/apache/spark/pull/38622#issuecomment-1314081544 > can you please add a description to the issue: https://issues.apache.org/jira/projects/SPARK/issues/SPARK-39601 Thanks for the reminder, updated. -- This is an automated message

[GitHub] [spark] carlfu-db commented on a diff in pull request #38404: [SPARK-40956] SQL Equivalent for Dataframe overwrite command

2022-11-14 Thread GitBox
carlfu-db commented on code in PR #38404: URL: https://github.com/apache/spark/pull/38404#discussion_r1021850376 ## sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4: ## @@ -319,6 +319,7 @@ query insertInto : INSERT OVERWRITE TABLE?

[GitHub] [spark] dongjoon-hyun commented on pull request #38643: [SPARK-41091][BUILD][3.2] Fix Docker release tool for branch-3.2

2022-11-14 Thread GitBox
dongjoon-hyun commented on PR #38643: URL: https://github.com/apache/spark/pull/38643#issuecomment-1314132688 It's great. I saw the tag. Thank you! - https://github.com/apache/spark/releases/tag/v3.2.3-rc1 -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] WweiL commented on a diff in pull request #38503: [SPARK-40940] Remove Multi-stateful operator checkers for streaming queries.

2022-11-14 Thread GitBox
WweiL commented on code in PR #38503: URL: https://github.com/apache/spark/pull/38503#discussion_r1021870148 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingDeduplicationSuite.scala: ## @@ -190,20 +190,25 @@ class StreamingDeduplicationSuite extends

[GitHub] [spark] amaliujia commented on pull request #38659: [SPARK-41114][CONNECT] Support local data for LocalRelation

2022-11-14 Thread GitBox
amaliujia commented on PR #38659: URL: https://github.com/apache/spark/pull/38659#issuecomment-1314224467 Question: can we re-use the ARROW collection we have done here? cc @zhengruifeng -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] MaxGekk commented on pull request #38656: [SPARK-41140][SQL] Rename the error class `_LEGACY_ERROR_TEMP_2440` to `INVALID_WHERE_CONDITION`

2022-11-14 Thread GitBox
MaxGekk commented on PR #38656: URL: https://github.com/apache/spark/pull/38656#issuecomment-1314224185 @srielau @cloud-fan @LuciferYang @panbingkun @itholic Please, review this PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] amaliujia commented on pull request #38659: [SPARK-41114][CONNECT] Support local data for LocalRelation

2022-11-14 Thread GitBox
amaliujia commented on PR #38659: URL: https://github.com/apache/spark/pull/38659#issuecomment-1314224623 also cc @hvanhovell -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] sunchao closed pull request #38628: [SPARK-41096][SQL] Support reading parquet FIXED_LEN_BYTE_ARRAY type

2022-11-14 Thread GitBox
sunchao closed pull request #38628: [SPARK-41096][SQL] Support reading parquet FIXED_LEN_BYTE_ARRAY type URL: https://github.com/apache/spark/pull/38628 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] sunchao commented on pull request #38628: [SPARK-41096][SQL] Support reading parquet FIXED_LEN_BYTE_ARRAY type

2022-11-14 Thread GitBox
sunchao commented on PR #38628: URL: https://github.com/apache/spark/pull/38628#issuecomment-1314225366 Committed to master, thanks @kazuyukitanimura ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] kazuyukitanimura commented on pull request #38628: [SPARK-41096][SQL] Support reading parquet FIXED_LEN_BYTE_ARRAY type

2022-11-14 Thread GitBox
kazuyukitanimura commented on PR #38628: URL: https://github.com/apache/spark/pull/38628#issuecomment-1314250536 Thank you @huaxingao @sunchao @LuciferYang -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] aokolnychyi commented on a diff in pull request #38005: [SPARK-40550][SQL] DataSource V2: Handle DELETE commands for delta-based sources

2022-11-14 Thread GitBox
aokolnychyi commented on code in PR #38005: URL: https://github.com/apache/spark/pull/38005#discussion_r1022088980 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala: ## @@ -254,6 +254,113 @@ case class ReplaceData( } } +/** + *

[GitHub] [spark] xkrogen commented on pull request #37949: [SPARK-40504][YARN] Make yarn appmaster load config from client

2022-11-14 Thread GitBox
xkrogen commented on PR #37949: URL: https://github.com/apache/spark/pull/37949#issuecomment-1314492128 Ah, I see. It seems you're using `spark.yarn.populateHadoopClasspath = true`. It looks like it's expected that the Hadoop conf from the node overrides the one from `__hadoop_conf__` in

[GitHub] [spark] xkrogen commented on a diff in pull request #38648: [SPARK-41134][SQL] Improve error message of internal errors

2022-11-14 Thread GitBox
xkrogen commented on code in PR #38648: URL: https://github.com/apache/spark/pull/38648#discussion_r1021976624 ## sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala: ## @@ -494,7 +494,8 @@ object QueryExecution { private[sql] def toInternalError(msg:

[GitHub] [spark] viirya commented on pull request #38628: [SPARK-41096][SQL] Support reading parquet FIXED_LEN_BYTE_ARRAY type

2022-11-14 Thread GitBox
viirya commented on PR #38628: URL: https://github.com/apache/spark/pull/38628#issuecomment-1314270163 Just found a previous PR #35902. The change is the same, but there is some Avro test stuff that we can consider adding as a follow-up too. -- This is an automated message from the

[GitHub] [spark] SandishKumarHN commented on a diff in pull request #38384: [SPARK-40657][PROTOBUF] Require shading for Java class jar, improve error handling

2022-11-14 Thread GitBox
SandishKumarHN commented on code in PR #38384: URL: https://github.com/apache/spark/pull/38384#discussion_r1021864869 ## connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufUtils.scala: ## @@ -155,21 +155,52 @@ private[sql] object ProtobufUtils extends

[GitHub] [spark] grundprinzip commented on a diff in pull request #38642: [SPARK-41127][CONNECT][PYTHON] Implement DataFrame.CreateGlobalView in Python client

2022-11-14 Thread GitBox
grundprinzip commented on code in PR #38642: URL: https://github.com/apache/spark/pull/38642#discussion_r1022030795 ## python/pyspark/sql/tests/connect/test_connect_basic.py: ## @@ -207,6 +208,18 @@ def test_range(self): .equals(self.spark.range(start=0, end=10,

[GitHub] [spark] kazuyukitanimura commented on pull request #38628: [SPARK-41096][SQL] Support reading parquet FIXED_LEN_BYTE_ARRAY type

2022-11-14 Thread GitBox
kazuyukitanimura commented on PR #38628: URL: https://github.com/apache/spark/pull/38628#issuecomment-1314336519 Thanks @viirya I also realized PR https://github.com/apache/spark/pull/35902 along with https://github.com/apache/spark/pull/20826 and https://github.com/apache/spark/pull/1737

[GitHub] [spark] grundprinzip commented on a diff in pull request #38642: [SPARK-41127][CONNECT][PYTHON] Implement DataFrame.CreateGlobalView in Python client

2022-11-14 Thread GitBox
grundprinzip commented on code in PR #38642: URL: https://github.com/apache/spark/pull/38642#discussion_r1022031268 ## python/pyspark/sql/tests/connect/test_connect_basic.py: ## @@ -207,6 +208,18 @@ def test_range(self): .equals(self.spark.range(start=0, end=10,

[GitHub] [spark] aokolnychyi commented on a diff in pull request #38005: [SPARK-40550][SQL] DataSource V2: Handle DELETE commands for delta-based sources

2022-11-14 Thread GitBox
aokolnychyi commented on code in PR #38005: URL: https://github.com/apache/spark/pull/38005#discussion_r1022079865 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ProjectingInternalRow.scala: ## @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] amaliujia commented on a diff in pull request #38638: [SPARK-41122][CONNECT] Explain API can support different modes

2022-11-14 Thread GitBox
amaliujia commented on code in PR #38638: URL: https://github.com/apache/spark/pull/38638#discussion_r1021980330 ## connector/connect/src/main/protobuf/spark/connect/base.proto: ## @@ -48,6 +72,9 @@ message Request { // The logical plan to be executed / analyzed. Plan

[GitHub] [spark] grundprinzip commented on a diff in pull request #38638: [SPARK-41122][CONNECT] Explain API can support different modes

2022-11-14 Thread GitBox
grundprinzip commented on code in PR #38638: URL: https://github.com/apache/spark/pull/38638#discussion_r1022036015 ## connector/connect/src/main/protobuf/spark/connect/base.proto: ## @@ -38,16 +38,50 @@ message Plan { } } +// Explains the input plan based on a

[GitHub] [spark] aokolnychyi commented on a diff in pull request #38005: [SPARK-40550][SQL] DataSource V2: Handle DELETE commands for delta-based sources

2022-11-14 Thread GitBox
aokolnychyi commented on code in PR #38005: URL: https://github.com/apache/spark/pull/38005#discussion_r1022077282 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Exec.scala: ## @@ -477,6 +507,73 @@ object DataWritingSparkTask extends

[GitHub] [spark] amaliujia commented on a diff in pull request #38642: [SPARK-41127][CONNECT][PYTHON] Implement DataFrame.CreateGlobalView in Python client

2022-11-14 Thread GitBox
amaliujia commented on code in PR #38642: URL: https://github.com/apache/spark/pull/38642#discussion_r1022140526 ## python/pyspark/sql/tests/connect/test_connect_basic.py: ## @@ -207,6 +208,18 @@ def test_range(self): .equals(self.spark.range(start=0, end=10,

[GitHub] [spark] WweiL commented on a diff in pull request #38503: [SPARK-40940] Remove Multi-stateful operator checkers for streaming queries.

2022-11-14 Thread GitBox
WweiL commented on code in PR #38503: URL: https://github.com/apache/spark/pull/38503#discussion_r1021870148 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingDeduplicationSuite.scala: ## @@ -190,20 +190,25 @@ class StreamingDeduplicationSuite extends

[GitHub] [spark] amaliujia commented on a diff in pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-g

2022-11-14 Thread GitBox
amaliujia commented on code in PR #38609: URL: https://github.com/apache/spark/pull/38609#discussion_r1021995966 ## project/SparkBuild.scala: ## @@ -109,6 +109,16 @@ object SparkBuild extends PomBuild { if (profiles.contains("jdwp-test-debug")) {

[GitHub] [spark] amaliujia commented on pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-ja

2022-11-14 Thread GitBox
amaliujia commented on PR #38609: URL: https://github.com/apache/spark/pull/38609#issuecomment-1314293906 Looks easy to follow! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] amaliujia commented on a diff in pull request #38642: [SPARK-41127][CONNECT][PYTHON] Implement DataFrame.CreateGlobalView in Python client

2022-11-14 Thread GitBox
amaliujia commented on code in PR #38642: URL: https://github.com/apache/spark/pull/38642#discussion_r1022037157 ## python/pyspark/sql/tests/connect/test_connect_basic.py: ## @@ -207,6 +208,18 @@ def test_range(self): .equals(self.spark.range(start=0, end=10,
