[GitHub] [spark] cxzl25 commented on pull request #36808: [SPARK-39415][CORE] Local mode supports HadoopDelegationTokenManager

2022-06-08 Thread GitBox
cxzl25 commented on PR #36808: URL: https://github.com/apache/spark/pull/36808#issuecomment-1150693473 Duplicate #32009 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] cxzl25 closed pull request #36808: [SPARK-39415][CORE] Local mode supports HadoopDelegationTokenManager

2022-06-08 Thread GitBox
cxzl25 closed pull request #36808: [SPARK-39415][CORE] Local mode supports HadoopDelegationTokenManager URL: https://github.com/apache/spark/pull/36808

[GitHub] [spark] HyukjinKwon commented on pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <0.18 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon commented on PR #36813: URL: https://github.com/apache/spark/pull/36813#issuecomment-1150686914 Python docs passed. Merged to master, branch-3.3 and branch-3.2.

[GitHub] [spark] HyukjinKwon closed pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <0.18 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon closed pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <0.18 in documentation build URL: https://github.com/apache/spark/pull/36813

[GitHub] [spark] AngersZhuuuu commented on a diff in pull request #36564: [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage when committed file not consistent with task status

2022-06-08 Thread GitBox
AngersZhuuuu commented on code in PR #36564: URL: https://github.com/apache/spark/pull/36564#discussion_r893074610 ## core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala: ## @@ -187,12 +181,6 @@ class OutputCommitCoordinatorSuite extends

[GitHub] [spark] AngersZhuuuu commented on a diff in pull request #36564: [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage when committed file not consistent with task status

2022-06-08 Thread GitBox
AngersZhuuuu commented on code in PR #36564: URL: https://github.com/apache/spark/pull/36564#discussion_r893072393 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -461,7 +467,8 @@ class SparkContext(config: SparkConf) extends Logging {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

2022-06-08 Thread GitBox
cloud-fan commented on code in PR #36663: URL: https://github.com/apache/spark/pull/36663#discussion_r893067774 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java: ## @@ -196,6 +196,90 @@ *Since version: 3.4.0 * *

[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

2022-06-08 Thread GitBox
cloud-fan commented on code in PR #36663: URL: https://github.com/apache/spark/pull/36663#discussion_r893067221 ## sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala: ## @@ -259,6 +259,55 @@ class V2ExpressionBuilder( } else {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36812: [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null

2022-06-08 Thread GitBox
cloud-fan commented on code in PR #36812: URL: https://github.com/apache/spark/pull/36812#discussion_r893062401 ## core/src/main/resources/error/error-classes.json: ## @@ -551,5 +551,10 @@ "Writing job aborted" ], "sqlState" : "4" + }, +

[GitHub] [spark] cloud-fan commented on a diff in pull request #36564: [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage when committed file not consistent with task status

2022-06-08 Thread GitBox
cloud-fan commented on code in PR #36564: URL: https://github.com/apache/spark/pull/36564#discussion_r893061431 ## core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala: ## @@ -187,12 +181,6 @@ class OutputCommitCoordinatorSuite extends SparkFunSuite

[GitHub] [spark] HeartSaVioR commented on pull request #36737: [SPARK-39347] [SS] Generate wrong time window when (timestamp-startTime) % slideDuration…

2022-06-08 Thread GitBox
HeartSaVioR commented on PR #36737: URL: https://github.com/apache/spark/pull/36737#issuecomment-1150658562 General comment from what I see in review comments: I see you repeat the explanation of the code you changed; I don't think reviewers asked about the detailed explanation of

[GitHub] [spark] HyukjinKwon commented on pull request #36816: [SPARK-39425][PYTHON][PS] Add migration guide for pandas 1.4 behavior changes

2022-06-08 Thread GitBox
HyukjinKwon commented on PR #36816: URL: https://github.com/apache/spark/pull/36816#issuecomment-1150648303 Thanks for updating the guide @Yikun

[GitHub] [spark] HyukjinKwon commented on pull request #36814: [SPARK-39422][SQL] Improve error message for 'SHOW CREATE TABLE' with unsupported serdes

2022-06-08 Thread GitBox
HyukjinKwon commented on PR #36814: URL: https://github.com/apache/spark/pull/36814#issuecomment-1150647932 Hey thanks for letting me know. https://github.com/apache/spark/pull/36813 should fix that.

[GitHub] [spark] huaxingao commented on pull request #36814: [SPARK-39422][SQL] Improve error message for 'SHOW CREATE TABLE' with unsupported serdes

2022-06-08 Thread GitBox
huaxingao commented on PR #36814: URL: https://github.com/apache/spark/pull/36814#issuecomment-1150646179 @HyukjinKwon The python doc generation failed. I saw the same error in other PRs too. ``` /__w/spark/spark/docs/_plugins/copy_api_dirs.rb:130:in `': Python doc generation

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.7 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon commented on code in PR #36813: URL: https://github.com/apache/spark/pull/36813#discussion_r893049422 ## .github/workflows/build_and_test.yml: ## @@ -547,6 +547,7 @@ jobs: # See also https://issues.apache.org/jira/browse/SPARK-35375. # Pin the

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.7 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon commented on code in PR #36813: URL: https://github.com/apache/spark/pull/36813#discussion_r893049087 ## dev/requirements.txt: ## @@ -35,6 +35,7 @@ numpydoc jinja2<3.0.0 sphinx<3.1.0 sphinx-plotly-directive +docutils~=1.7.0 Review Comment: ```suggestion

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.7 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon commented on code in PR #36813: URL: https://github.com/apache/spark/pull/36813#discussion_r893047340 ## dev/requirements.txt: ## @@ -35,6 +35,7 @@ numpydoc jinja2<3.0.0 sphinx<3.1.0 sphinx-plotly-directive +docutils<1.7.0 Review Comment: ```suggestion

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.7 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon commented on code in PR #36813: URL: https://github.com/apache/spark/pull/36813#discussion_r893047298 ## .github/workflows/build_and_test.yml: ## @@ -547,6 +547,7 @@ jobs: # See also https://issues.apache.org/jira/browse/SPARK-35375. # Pin the

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.7 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon commented on code in PR #36813: URL: https://github.com/apache/spark/pull/36813#discussion_r893047077 ## .github/workflows/build_and_test.yml: ## @@ -549,6 +549,7 @@ jobs: # See also https://issues.apache.org/jira/browse/SPARK-38279. python3.9 -m

[GitHub] [spark] Yikun commented on a diff in pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.7 in documentation build

2022-06-08 Thread GitBox
Yikun commented on code in PR #36813: URL: https://github.com/apache/spark/pull/36813#discussion_r893046990 ## .github/workflows/build_and_test.yml: ## @@ -549,6 +549,7 @@ jobs: # See also https://issues.apache.org/jira/browse/SPARK-38279. python3.9 -m pip

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.7 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon commented on code in PR #36813: URL: https://github.com/apache/spark/pull/36813#discussion_r893046599 ## .github/workflows/build_and_test.yml: ## @@ -549,6 +549,7 @@ jobs: # See also https://issues.apache.org/jira/browse/SPARK-38279. Review Comment:

[GitHub] [spark] itholic commented on pull request #36793: [SPARK-39406][PYTHON] Accept NumPy array in createDataFrame

2022-06-08 Thread GitBox
itholic commented on PR #36793: URL: https://github.com/apache/spark/pull/36793#issuecomment-1150640677 Otherwise looks good if the tests pass.

[GitHub] [spark] itholic commented on pull request #36793: [SPARK-39406][PYTHON] Accept NumPy array in createDataFrame

2022-06-08 Thread GitBox
itholic commented on PR #36793: URL: https://github.com/apache/spark/pull/36793#issuecomment-1150640100 Seems like it would be fixed when https://github.com/apache/spark/pull/36813 is merged. Let's rebase after that.

[GitHub] [spark] Yikun commented on a diff in pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.7 in documentation build

2022-06-08 Thread GitBox
Yikun commented on code in PR #36813: URL: https://github.com/apache/spark/pull/36813#discussion_r893045350 ## .github/workflows/build_and_test.yml: ## @@ -549,6 +549,7 @@ jobs: # See also https://issues.apache.org/jira/browse/SPARK-38279. python3.9 -m pip

[GitHub] [spark] Yikun commented on pull request #36816: [SPARK-39425][PYTHON][PS] Add migration guide for pandas 1.4 behavior changes

2022-06-08 Thread GitBox
Yikun commented on PR #36816: URL: https://github.com/apache/spark/pull/36816#issuecomment-1150637320 @LuciferYang Thanks for the reminder, I guess I need a rebase after the doctest is fixed.

[GitHub] [spark] LuciferYang commented on pull request #36816: [SPARK-39425][PYTHON][PS] Add migration guide for pandas 1.4 behavior changes

2022-06-08 Thread GitBox
LuciferYang commented on PR #36816: URL: https://github.com/apache/spark/pull/36816#issuecomment-1150637062 > @LuciferYang Hmm, this is not a fix for the doc test failure :). It adds the migration doc for [SPARK-38819](https://issues.apache.org/jira/browse/SPARK-38819) OK

[GitHub] [spark] Yikun commented on pull request #36816: [SPARK-39425][PYTHON][PS] Add migration guide for pandas 1.4 behavior changes

2022-06-08 Thread GitBox
Yikun commented on PR #36816: URL: https://github.com/apache/spark/pull/36816#issuecomment-1150636796 @LuciferYang Hmm, this is not a fix for the doc test failure :). It adds the migration doc for SPARK-38819

[GitHub] [spark] nyingping commented on a diff in pull request #36737: [SPARK-39347] [SS] Generate wrong time window when (timestamp-startTime) % slideDuration…

2022-06-08 Thread GitBox
nyingping commented on code in PR #36737: URL: https://github.com/apache/spark/pull/36737#discussion_r893042194 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3963,8 +3966,10 @@ object TimeWindowing extends Rule[LogicalPlan] {

[GitHub] [spark] LuciferYang commented on pull request #36816: Add migration guide for pandas 1.4 behavior changes

2022-06-08 Thread GitBox
LuciferYang commented on PR #36816: URL: https://github.com/apache/spark/pull/36816#issuecomment-1150634766 @Yikun Will the current PR fix [SPARK-39424](https://issues.apache.org/jira/browse/SPARK-39424)?

[GitHub] [spark] LuciferYang commented on pull request #36807: [WIP][SPARK-39414][BUILD] Upgrade Scala to 2.12.16

2022-06-08 Thread GitBox
LuciferYang commented on PR #36807: URL: https://github.com/apache/spark/pull/36807#issuecomment-1150633518 There are no official release notes yet

[GitHub] [spark] Yikun opened a new pull request, #36816: Add migration guide for pandas 1.4 behavior changes

2022-06-08 Thread GitBox
Yikun opened a new pull request, #36816: URL: https://github.com/apache/spark/pull/36816 ### What changes were proposed in this pull request? Add migration guide for pandas 1.4 behavior changes: * SPARK-39054 https://github.com/apache/spark/pull/36581: In Spark 3.4, if Pandas

[GitHub] [spark] cloud-fan commented on a diff in pull request #36564: [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage when committed file not consistent with task status

2022-06-08 Thread GitBox
cloud-fan commented on code in PR #36564: URL: https://github.com/apache/spark/pull/36564#discussion_r893038044 ## core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala: ## @@ -187,8 +188,8 @@ class OutputCommitCoordinatorSuite extends SparkFunSuite

[GitHub] [spark] cloud-fan commented on a diff in pull request #36564: [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage when committed file not consistent with task status

2022-06-08 Thread GitBox
cloud-fan commented on code in PR #36564: URL: https://github.com/apache/spark/pull/36564#discussion_r893037549 ## core/src/main/scala/org/apache/spark/SparkEnv.scala: ## @@ -423,6 +432,7 @@ object SparkEnv extends Logging { envInstance } + // scalastyle:on argcount

[GitHub] [spark] gengliangwang commented on a diff in pull request #36812: [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null

2022-06-08 Thread GitBox
gengliangwang commented on code in PR #36812: URL: https://github.com/apache/spark/pull/36812#discussion_r893036298 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -3828,6 +3828,15 @@ object SQLConf { .booleanConf

[GitHub] [spark] yaooqinn commented on a diff in pull request #36810: [SPARK-39417][SQL] Handle Null partition values in PartitioningUtils

2022-06-08 Thread GitBox
yaooqinn commented on code in PR #36810: URL: https://github.com/apache/spark/pull/36810#discussion_r893034665 ## sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetPartitionDiscoverySuite.scala: ## @@ -1259,6 +1259,14 @@ class

[GitHub] [spark] HyukjinKwon commented on pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.8 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon commented on PR #36813: URL: https://github.com/apache/spark/pull/36813#issuecomment-1150621696 It actually fails with 1.7.0 too

[GitHub] [spark] LuciferYang commented on pull request #36807: [WIP][SPARK-39414][BUILD] Upgrade Scala to 2.12.16

2022-06-08 Thread GitBox
LuciferYang commented on PR #36807: URL: https://github.com/apache/spark/pull/36807#issuecomment-1150621586 `Run documentation build` failed: ``` /__w/spark/spark/python/pyspark/pandas/supported_api_gen.py:101: UserWarning: Warning: Latest version of pandas(>=1.4.0) is required to

[GitHub] [spark] cloud-fan commented on pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.8 in documentation build

2022-06-08 Thread GitBox
cloud-fan commented on PR #36813: URL: https://github.com/apache/spark/pull/36813#issuecomment-1150620951 I think a higher version is better? Maybe we should change the Dockerfile later (after 3.3 is released...)

[GitHub] [spark] HyukjinKwon commented on pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.8 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon commented on PR #36813: URL: https://github.com/apache/spark/pull/36813#issuecomment-1150620405 I matched the version to Dockerfile.

[GitHub] [spark] zhengruifeng opened a new pull request, #35250: [SPARK-37961][SQL] Override maxRows/maxRowsPerPartition for some logical operators

2022-06-08 Thread GitBox
zhengruifeng opened a new pull request, #35250: URL: https://github.com/apache/spark/pull/35250 ### What changes were proposed in this pull request? 1, override `maxRowsPerPartition` in `Sort`,`Expand`,`Sample`,`CollectMetrics`; 2, override `maxRows` in

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.8 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon commented on code in PR #36813: URL: https://github.com/apache/spark/pull/36813#discussion_r893029627 ## .github/workflows/build_and_test.yml: ## @@ -549,6 +549,7 @@ jobs: # See also https://issues.apache.org/jira/browse/SPARK-38279. python3.9 -m

[GitHub] [spark] cloud-fan closed pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-08 Thread GitBox
cloud-fan closed pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace URL: https://github.com/apache/spark/pull/36586

[GitHub] [spark] cloud-fan commented on pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-08 Thread GitBox
cloud-fan commented on PR #36586: URL: https://github.com/apache/spark/pull/36586#issuecomment-1150618882 thanks, merging to master!

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-08 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r893029290 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala: ## @@ -965,6 +965,10 @@ class SessionCatalog(

[GitHub] [spark] singhpk234 commented on pull request #36810: [SPARK-39417][SQL] Handle Null partition values in PartitioningUtils

2022-06-08 Thread GitBox
singhpk234 commented on PR #36810: URL: https://github.com/apache/spark/pull/36810#issuecomment-1150618735 @cloud-fan, This seems to be introduced via [commit](https://github.com/apache/spark/commit/fc29c91f27d866502f5b6cc4261d4943b57e),

[GitHub] [spark] dongjoon-hyun commented on pull request #36787: [SPARK-39387][FOLLOWUP][TESTS] Add a test case for HIVE-25190

2022-06-08 Thread GitBox
dongjoon-hyun commented on PR #36787: URL: https://github.com/apache/spark/pull/36787#issuecomment-1150615271 Merged to master.

[GitHub] [spark] dongjoon-hyun closed pull request #36787: [SPARK-39387][FOLLOWUP][TESTS] Add a test case for HIVE-25190

2022-06-08 Thread GitBox
dongjoon-hyun closed pull request #36787: [SPARK-39387][FOLLOWUP][TESTS] Add a test case for HIVE-25190 URL: https://github.com/apache/spark/pull/36787

[GitHub] [spark] beliefer commented on pull request #36805: [SPARK-39413][SQL] Capitalize sql keywords in JDBCV2Suite

2022-06-08 Thread GitBox
beliefer commented on PR #36805: URL: https://github.com/apache/spark/pull/36805#issuecomment-1150613902 @huaxingao @cloud-fan Thanks a lot!

[GitHub] [spark] mridulm commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-08 Thread GitBox
mridulm commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r893022070 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -576,6 +661,7 @@ public MergeStatuses

[GitHub] [spark] mridulm commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-08 Thread GitBox
mridulm commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r893021304 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -992,6 +1233,45 @@ AppShufflePartitionInfo

[GitHub] [spark] ulysses-you commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-06-08 Thread GitBox
ulysses-you commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r893019308 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -734,11 +920,35 @@ case class Pmod( override def nullable:

[GitHub] [spark] AngersZhuuuu commented on a diff in pull request #36564: [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage when committed file not consistent with task status

2022-06-08 Thread GitBox
AngersZhuuuu commented on code in PR #36564: URL: https://github.com/apache/spark/pull/36564#discussion_r893017815 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -461,7 +467,8 @@ class SparkContext(config: SparkConf) extends Logging {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-06-08 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r893017456 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -734,11 +920,35 @@ case class Pmod( override def nullable:

[GitHub] [spark] dongjoon-hyun commented on pull request #36815: [SPARK-37670][FOLLOWUP][SQL][TESTS][3.2] Update TPCDS golden files

2022-06-08 Thread GitBox
dongjoon-hyun commented on PR #36815: URL: https://github.com/apache/spark/pull/36815#issuecomment-1150604227 Thank you, @sunchao and @cloud-fan .

[GitHub] [spark] Eugene-Mark commented on pull request #36499: [SPARK-38846][SQL] Add explicit data mapping between Teradata Numeric Type and Spark DecimalType

2022-06-08 Thread GitBox
Eugene-Mark commented on PR #36499: URL: https://github.com/apache/spark/pull/36499#issuecomment-1150603723 retest this please

[GitHub] [spark] dongjoon-hyun commented on pull request #36815: [SPARK-37670][FOLLOWUP][SQL][TESTS][3.2] Update TPCDS golden files

2022-06-08 Thread GitBox
dongjoon-hyun commented on PR #36815: URL: https://github.com/apache/spark/pull/36815#issuecomment-1150603278 Thank you, @HyukjinKwon !

[GitHub] [spark] HyukjinKwon commented on pull request #36815: [SPARK-37670][FOLLOWUP][SQL][TESTS][3.2] Update TPCDS golden files

2022-06-08 Thread GitBox
HyukjinKwon commented on PR #36815: URL: https://github.com/apache/spark/pull/36815#issuecomment-1150603145

[GitHub] [spark] dongjoon-hyun commented on pull request #36815: [SPARK-37670][FOLLOWUP][SQL][TESTS][3.2] Update TPCDS golden files

2022-06-08 Thread GitBox
dongjoon-hyun commented on PR #36815: URL: https://github.com/apache/spark/pull/36815#issuecomment-1150602874 cc @maryannxue , @cloud-fan , @HyukjinKwon , @sunchao

[GitHub] [spark] dongjoon-hyun commented on pull request #34929: [SPARK-37670][SQL] Support predicate pushdown and column pruning for de-duped CTEs

2022-06-08 Thread GitBox
dongjoon-hyun commented on PR #34929: URL: https://github.com/apache/spark/pull/34929#issuecomment-1150601161 Here is a followup. - https://github.com/apache/spark/pull/36815

[GitHub] [spark] dongjoon-hyun opened a new pull request, #36815: [SPARK-37670][FOLLOWUP][SQL][TESTS][3.2] Update TPCDS golden files

2022-06-08 Thread GitBox
dongjoon-hyun opened a new pull request, #36815: URL: https://github.com/apache/spark/pull/36815 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] HyukjinKwon commented on pull request #36683: [SPARK-39301][SQL][PYTHON] Leverage LocalRelation and respect Arrow batch size in createDataFrame with Arrow optimization

2022-06-08 Thread GitBox
HyukjinKwon commented on PR #36683: URL: https://github.com/apache/spark/pull/36683#issuecomment-1150592704 Let me merge this in a few days ... assuming that we're all good. Hopefully my benchmark is good enough.

[GitHub] [spark] HyukjinKwon commented on pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.8 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon commented on PR #36813: URL: https://github.com/apache/spark/pull/36813#issuecomment-1150589281 seems like Max already fixed it in https://github.com/apache/spark/blob/master/dev/create-release/spark-rm/Dockerfile#L45

[GitHub] [spark] JoshRosen opened a new pull request, #36814: [SPARK-39422][SQL] Improve error message for 'SHOW CREATE TABLE' with unsupported serdes

2022-06-08 Thread GitBox
JoshRosen opened a new pull request, #36814: URL: https://github.com/apache/spark/pull/36814 ### What changes were proposed in this pull request? This PR improves the error message that is thrown when trying to run `SHOW CREATE TABLE` on a Hive table with an unsupported serde.

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-06-08 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r893003521 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -953,15 +952,16 @@ case class Pmod( // when we reach here,

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-06-08 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r893003240 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -953,15 +952,16 @@ case class Pmod( // when we reach here,

[GitHub] [spark] cloud-fan closed pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-08 Thread GitBox
cloud-fan closed pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path URL: https://github.com/apache/spark/pull/36693

[GitHub] [spark] cloud-fan commented on pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-08 Thread GitBox
cloud-fan commented on PR #36693: URL: https://github.com/apache/spark/pull/36693#issuecomment-1150579697 thanks, merging to master!

[GitHub] [spark] cloud-fan commented on pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-08 Thread GitBox
cloud-fan commented on PR #36693: URL: https://github.com/apache/spark/pull/36693#issuecomment-1150579622 The python doc issue is being fixed by https://github.com/apache/spark/pull/36813

[GitHub] [spark] gengliangwang commented on a diff in pull request #36812: [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null

2022-06-08 Thread GitBox
gengliangwang commented on code in PR #36812: URL: https://github.com/apache/spark/pull/36812#discussion_r892996318 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala: ## @@ -375,9 +375,17 @@ case class ArrayTransform( //

[GitHub] [spark] cloud-fan commented on pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <1.8 in documentation build

2022-06-08 Thread GitBox
cloud-fan commented on PR #36813: URL: https://github.com/apache/spark/pull/36813#issuecomment-1150577614 how about the docker file we use for release?

[GitHub] [spark] gengliangwang commented on a diff in pull request #36812: [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null

2022-06-08 Thread GitBox
gengliangwang commented on code in PR #36812: URL: https://github.com/apache/spark/pull/36812#discussion_r892992649 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -3828,6 +3828,15 @@ object SQLConf { .booleanConf

[GitHub] [spark] cloud-fan commented on a diff in pull request #36812: [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null

2022-06-08 Thread GitBox
cloud-fan commented on code in PR #36812: URL: https://github.com/apache/spark/pull/36812#discussion_r892991176 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -3828,6 +3828,15 @@ object SQLConf { .booleanConf

[GitHub] [spark] cloud-fan commented on a diff in pull request #36812: [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null

2022-06-08 Thread GitBox
cloud-fan commented on code in PR #36812: URL: https://github.com/apache/spark/pull/36812#discussion_r892990862 ## sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala: ## @@ -2012,4 +2012,12 @@ private[sql] object QueryExecutionErrors extends

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <0.18 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon commented on code in PR #36813: URL: https://github.com/apache/spark/pull/36813#discussion_r892987817 ## .github/workflows/build_and_test.yml: ## @@ -549,6 +549,7 @@ jobs: # See also https://issues.apache.org/jira/browse/SPARK-38279. python3.9 -m

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36812: [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null

2022-06-08 Thread GitBox
HyukjinKwon commented on code in PR #36812: URL: https://github.com/apache/spark/pull/36812#discussion_r892987033 ## sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala: ## @@ -2012,4 +2012,12 @@ private[sql] object QueryExecutionErrors extends

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36812: [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null

2022-06-08 Thread GitBox
HyukjinKwon commented on code in PR #36812: URL: https://github.com/apache/spark/pull/36812#discussion_r892986587 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -3828,6 +3828,15 @@ object SQLConf { .booleanConf

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36793: [SPARK-39406][PYTHON] Accept NumPy array in createDataFrame

2022-06-08 Thread GitBox
HyukjinKwon commented on code in PR #36793: URL: https://github.com/apache/spark/pull/36793#discussion_r892985507 ## python/pyspark/sql/tests/test_arrow.py: ## @@ -495,6 +509,22 @@ def test_schema_conversion_roundtrip(self): schema_rt = from_arrow_schema(arrow_schema)

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <0.18 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon commented on code in PR #36813: URL: https://github.com/apache/spark/pull/36813#discussion_r892981613 ## .github/workflows/build_and_test.yml: ## @@ -549,6 +549,7 @@ jobs: # See also https://issues.apache.org/jira/browse/SPARK-38279. python3.9 -m

[GitHub] [spark] HyukjinKwon opened a new pull request, #36813: [SPARK-39421][PYTHON][DOCS] Pin the docutils version <0.18 in documentation build

2022-06-08 Thread GitBox
HyukjinKwon opened a new pull request, #36813: URL: https://github.com/apache/spark/pull/36813 ### What changes were proposed in this pull request? This PR fixes the Sphinx build failure below (see https://github.com/singhpk234/spark/runs/6799026458?check_suite_focus=true):

[GitHub] [spark] wangyum commented on a diff in pull request #36810: [SPARK-39417][SQL] Handle Null partition values in PartitioningUtils

2022-06-08 Thread GitBox
wangyum commented on code in PR #36810: URL: https://github.com/apache/spark/pull/36810#discussion_r892980250 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala: ## @@ -359,7 +359,12 @@ object PartitioningUtils extends SQLConfHelper{

[GitHub] [spark] cloud-fan commented on pull request #36810: [SPARK-39417][SQL] Handle Null partition values in PartitioningUtils

2022-06-08 Thread GitBox
cloud-fan commented on PR #36810: URL: https://github.com/apache/spark/pull/36810#issuecomment-1150557713 do we know which commit caused this issue? is it a 3.3 only bug?

[GitHub] [spark] github-actions[bot] commented on pull request #35605: [SPARK-38280][SQL] The Rank windows to be ordered is not necessary in a query.

2022-06-08 Thread GitBox
github-actions[bot] commented on PR #35605: URL: https://github.com/apache/spark/pull/35605#issuecomment-1150538958 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] wangyum commented on pull request #36786: [SPARK-39400][SQL] spark-sql should remove hive resource dir in all case

2022-06-08 Thread GitBox
wangyum commented on PR #36786: URL: https://github.com/apache/spark/pull/36786#issuecomment-1150532390 Merged to master.

[GitHub] [spark] wangyum closed pull request #36786: [SPARK-39400][SQL] spark-sql should remove hive resource dir in all case

2022-06-08 Thread GitBox
wangyum closed pull request #36786: [SPARK-39400][SQL] spark-sql should remove hive resource dir in all case URL: https://github.com/apache/spark/pull/36786

[GitHub] [spark] huaxingao commented on pull request #36810: [SPARK-39417][SQL] Handle Null partition values in PartitioningUtils

2022-06-08 Thread GitBox
huaxingao commented on PR #36810: URL: https://github.com/apache/spark/pull/36810#issuecomment-1150517588 The Python doc generation failure seems to be irrelevant. All the other tests passed.

[GitHub] [spark] ueshin opened a new pull request, #36812: [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null

2022-06-08 Thread GitBox
ueshin opened a new pull request, #36812: URL: https://github.com/apache/spark/pull/36812 ### What changes were proposed in this pull request? Fixes `ArraySort` to throw an exception when the comparator returns `null`. Also updates the doc to follow the corrected behavior.

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-08 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r892940331 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -576,6 +661,7 @@ public MergeStatuses

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-08 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r892938847 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -992,6 +1233,45 @@ AppShufflePartitionInfo

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-08 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r892923191 ## resource-managers/yarn/src/test/scala/org/apache/spark/network/shuffle/ShuffleTestAccessor.scala: ## @@ -44,14 +48,113 @@ object ShuffleTestAccessor {

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-08 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r892923038 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -655,6 +743,197 @@ public void registerExecutor(String

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-08 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r892921497 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -203,15 +237,16 @@ private AppShufflePartitionInfo

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-08 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r892921190 ## common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java: ## @@ -451,6 +472,7 @@ protected File initRecoveryDb(String dbName) {

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-08 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r892921010 ## common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java: ## @@ -287,7 +301,13 @@ protected void serviceInit(Configuration

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-08 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r892920621 ## common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java: ## @@ -230,11 +241,14 @@ protected void serviceInit(Configuration

[GitHub] [spark] xinrong-databricks commented on a diff in pull request #36793: [SPARK-39406][PYTHON] Accept NumPy array in createDataFrame

2022-06-08 Thread GitBox
xinrong-databricks commented on code in PR #36793: URL: https://github.com/apache/spark/pull/36793#discussion_r892890115 ## python/pyspark/sql/tests/test_arrow.py: ## @@ -495,6 +509,22 @@ def test_schema_conversion_roundtrip(self): schema_rt =

[GitHub] [spark] huaxingao commented on pull request #36781: [SPARK-39393][SQL] Parquet data source only supports push-down predicate filters for non-repeated primitive types

2022-06-08 Thread GitBox
huaxingao commented on PR #36781: URL: https://github.com/apache/spark/pull/36781#issuecomment-1150422569 @Borjianamin98 I forgot that I need to add you to the contributors list first. I just did and assigned the jira OK :)

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36793: [SPARK-39406][PYTHON] Accept NumPy array in createDataFrame

2022-06-08 Thread GitBox
HyukjinKwon commented on code in PR #36793: URL: https://github.com/apache/spark/pull/36793#discussion_r892865940 ## python/pyspark/sql/tests/test_arrow.py: ## @@ -495,6 +509,22 @@ def test_schema_conversion_roundtrip(self): schema_rt = from_arrow_schema(arrow_schema)

[GitHub] [spark] EnricoMi commented on a diff in pull request #35965: [SPARK-38647][SQL] Add SupportsReportOrdering mix in interface for Scan (DataSourceV2)

2022-06-08 Thread GitBox
EnricoMi commented on code in PR #35965: URL: https://github.com/apache/spark/pull/35965#discussion_r892865243 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExecBase.scala: ## @@ -138,6 +138,15 @@ trait DataSourceV2ScanExecBase extends
