Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

2023-10-31 Thread via GitHub
LuciferYang commented on code in PR #43603: URL: https://github.com/apache/spark/pull/43603#discussion_r1378421435 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala: ## @@ -1677,25 +1677,24 @@ class PlanParserSuite extends AnalysisTest {

Re: [PR] [SPARK-45753][CORE] Support `spark.deploy.driverIdPattern` [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43615: URL: https://github.com/apache/spark/pull/43615#issuecomment-1788449104 Could you review this Spark `Master` class improvement, @yaooqinn ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] [SPARK-45749][CORE][WEBUI] Fix `Spark History Server` to sort `Duration` column properly [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43613: URL: https://github.com/apache/spark/pull/43613#issuecomment-1788445849 Thank you, @LuciferYang .

[PR] [SPARK-45754][CORE] Support `spark.deploy.appIdPattern` [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun opened a new pull request, #43616: URL: https://github.com/apache/spark/pull/43616 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

Re: [PR] [WIP][SPARK-45629]Fix `Implicit definition should have explicit type` [spark]

2023-10-31 Thread via GitHub
LuciferYang commented on PR #43526: URL: https://github.com/apache/spark/pull/43526#issuecomment-1788436569 https://github.com/apache/spark/blob/e1bc48b729e40390a4b0f977eec4a9050c7cac77/project/SparkBuild.scala#L251 @laglangyue You can delete the above line and then compile

Re: [PR] [SPARK-45680][CONNECT] Release session [spark]

2023-10-31 Thread via GitHub
HyukjinKwon commented on PR #43546: URL: https://github.com/apache/spark/pull/43546#issuecomment-1788436733 hmmm .. `test_error_enrichment_jvm_stacktrace` seems a real test failure too. lemme take a look soon

[PR] [SPARK-45753][CORE] Support `spark.deploy.driverIdPattern` [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun opened a new pull request, #43615: URL: https://github.com/apache/spark/pull/43615 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

Re: [PR] [SPARK-45752][SQL] Unreferenced CTE should all be checked by CheckAnalysis0 [spark]

2023-10-31 Thread via GitHub
amaliujia commented on PR #43614: URL: https://github.com/apache/spark/pull/43614#issuecomment-1788420163 @cloud-fan

[PR] [SPARK-45752][SQL] Unreferenced CTE should all be checked by CheckAnalysis0 [spark]

2023-10-31 Thread via GitHub
amaliujia opened a new pull request, #43614: URL: https://github.com/apache/spark/pull/43614 ### What changes were proposed in this pull request? This PR fixes an issue that if a CTE is referenced by a non-referenced CTE, then this CTE should also have ref count as 0 and

Re: [PR] [SPARK-45704][BUILD] Fix compile warning - using symbols inherited from a superclass shadow symbols defined in an outer scope [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43593: URL: https://github.com/apache/spark/pull/43593#issuecomment-1788413155 Thank you so much, @panbingkun . Merged to master.

Re: [PR] [SPARK-45704][BUILD] Fix compile warning - using symbols inherited from a superclass shadow symbols defined in an outer scope [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun closed pull request #43593: [SPARK-45704][BUILD] Fix compile warning - using symbols inherited from a superclass shadow symbols defined in an outer scope URL: https://github.com/apache/spark/pull/43593

Re: [PR] [SPARK-45734][BUILD] Upgrade commons-io to 2.15.0 [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43592: URL: https://github.com/apache/spark/pull/43592#issuecomment-1788412347 Merged to master. Thank you, @LuciferYang .

Re: [PR] [SPARK-45734][BUILD] Upgrade commons-io to 2.15.0 [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun closed pull request #43592: [SPARK-45734][BUILD] Upgrade commons-io to 2.15.0 URL: https://github.com/apache/spark/pull/43592

Re: [PR] [WIP][SPARK-45629]Fix `Implicit definition should have explicit type` [spark]

2023-10-31 Thread via GitHub
laglangyue commented on PR #43526: URL: https://github.com/apache/spark/pull/43526#issuecomment-1788391495 Hi, brother @LuciferYang How can I reproduce this? I would like to check if I have made all the necessary modifications ```shell [error]

Re: [PR] [SPARK-45748][SQL] Add a .fromSQL helper function for Literals [spark]

2023-10-31 Thread via GitHub
cloud-fan commented on code in PR #43612: URL: https://github.com/apache/spark/pull/43612#discussion_r1378358968 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala: ## @@ -247,6 +251,67 @@ object Literal { s"Literal must have a

Re: [PR] [SPARK-45748][SQL] Add a .fromSQL helper function for Literals [spark]

2023-10-31 Thread via GitHub
cloud-fan commented on code in PR #43612: URL: https://github.com/apache/spark/pull/43612#discussion_r1378358241 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala: ## @@ -247,6 +251,67 @@ object Literal { s"Literal must have a

Re: [PR] [SPARK-45748][SQL] Add a .fromSQL helper function for Literals [spark]

2023-10-31 Thread via GitHub
cloud-fan commented on code in PR #43612: URL: https://github.com/apache/spark/pull/43612#discussion_r1378356652 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala: ## @@ -247,6 +251,67 @@ object Literal { s"Literal must have a

Re: [PR] [SPARK-45748][SQL] Add a .fromSQL helper function for Literals [spark]

2023-10-31 Thread via GitHub
cloud-fan commented on code in PR #43612: URL: https://github.com/apache/spark/pull/43612#discussion_r1378356375 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala: ## @@ -247,6 +251,67 @@ object Literal { s"Literal must have a

Re: [PR] [SPARK-33393][SQL] Support SHOW TABLE EXTENDED in v2 [spark]

2023-10-31 Thread via GitHub
panbingkun commented on code in PR #37588: URL: https://github.com/apache/spark/pull/37588#discussion_r1378350993 ## sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTablesSuite.scala: ## @@ -165,4 +154,222 @@ class ShowTablesSuite extends

Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

2023-10-31 Thread via GitHub
panbingkun commented on code in PR #43603: URL: https://github.com/apache/spark/pull/43603#discussion_r1378349936 ## project/SparkBuild.scala: ## @@ -269,7 +269,8 @@ object SparkBuild extends PomBuild { // SPARK-45627 `enum`, `export` and `given` will become keywords

Re: [PR] [SPARK-33393][SQL] Support SHOW TABLE EXTENDED in v2 [spark]

2023-10-31 Thread via GitHub
cloud-fan commented on code in PR #37588: URL: https://github.com/apache/spark/pull/37588#discussion_r1378345147 ## sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTablesSuite.scala: ## @@ -165,4 +154,222 @@ class ShowTablesSuite extends

Re: [PR] [SPARK-45718][PS] Remove remaining deprecated Pandas features from Spark 3.4.0 [spark]

2023-10-31 Thread via GitHub
itholic commented on PR #43581: URL: https://github.com/apache/spark/pull/43581#issuecomment-1788315425 Oh, just realized that there was an actual failure. Thanks for the notice!

Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

2023-10-31 Thread via GitHub
panbingkun commented on PR #43603: URL: https://github.com/apache/spark/pull/43603#issuecomment-1788304312 https://github.com/apache/spark/assets/15246973/76a23c09-cd87-4eeb-bca4-ec01fda4000d https://github.com/apache/spark/assets/15246973/1ebfb343-a1ae-46bc-a690-b02c93612b34

Re: [PR] [SPARK-45742][CORE][CONNECT][MLLIB][PYTHON] Introduce an implicit function for Scala Array to wrap into `immutable.ArraySeq`. [spark]

2023-10-31 Thread via GitHub
LuciferYang commented on code in PR #43607: URL: https://github.com/apache/spark/pull/43607#discussion_r1378331137 ## common/utils/src/main/scala/org/apache/spark/util/ArrayImplicits.scala: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] [SPARK-45749][CORE][WEBUI] Fix `Spark History Server` to sort `Duration` column properly [spark]

2023-10-31 Thread via GitHub
LuciferYang commented on PR #43613: URL: https://github.com/apache/spark/pull/43613#issuecomment-1788299717 late LGTM

Re: [PR] [SPARK-45749][CORE][WEBUI] Fix `Spark History Server` to sort `Duration` column properly [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43613: URL: https://github.com/apache/spark/pull/43613#issuecomment-1788295247 Thank you so much, @yaooqinn !

Re: [PR] [SPARK-45739][PYTHON] Catch IOException instead of EOFException alone for faulthandler [spark]

2023-10-31 Thread via GitHub
ueshin commented on PR #43600: URL: https://github.com/apache/spark/pull/43600#issuecomment-1788287681 Late LGTM.

Re: [PR] [SPARK-45749][CORE][WEBUI] Fix `Spark History Server` to sort `Duration` column properly [spark]

2023-10-31 Thread via GitHub
yaooqinn commented on PR #43613: URL: https://github.com/apache/spark/pull/43613#issuecomment-1788282500 Thanks, @dongjoon-hyun. Merged to master for 4.0.0, and to 3.5.1, 3.4.2, and 3.3.4 for the maintenance releases.

Re: [PR] [SPARK-45749][CORE][WEBUI] Fix `Spark History Server` to sort `Duration` column properly [spark]

2023-10-31 Thread via GitHub
yaooqinn closed pull request #43613: [SPARK-45749][CORE][WEBUI] Fix `Spark History Server` to sort `Duration` column properly URL: https://github.com/apache/spark/pull/43613

Re: [PR] [SPARK-45740][SQL] Relax the node prefix of SparkPlanGraphCluster [spark]

2023-10-31 Thread via GitHub
ulysses-you commented on PR #43602: URL: https://github.com/apache/spark/pull/43602#issuecomment-1788274447 cc @yaooqinn @cloud-fan thank you

Re: [PR] [SPARK-42746][SQL] Add the LISTAGG() aggregate function [spark]

2023-10-31 Thread via GitHub
Hisoka-X commented on PR #42398: URL: https://github.com/apache/spark/pull/42398#issuecomment-1788269081 kindly ping @cloud-fan @MaxGekk @HyukjinKwon

Re: [PR] [SPARK-45680][CONNECT] Release session [spark]

2023-10-31 Thread via GitHub
HyukjinKwon commented on PR #43546: URL: https://github.com/apache/spark/pull/43546#issuecomment-1788264993 I pushed some changes but I think you should take a look ...

Re: [PR] [SPARK-45481][SPARK-45664][SPARK-45711][SQL][FOLLOWUP] Avoid magic strings copy from parquet|orc|avro compression codes [spark]

2023-10-31 Thread via GitHub
beliefer commented on PR #43604: URL: https://github.com/apache/spark/pull/43604#issuecomment-1788257760 > Is this the last follow-up, @beliefer ? There are still some changes to the PR that need to be updated, but you have already merged them for me. No matter what, thank you

Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

2023-10-31 Thread via GitHub
panbingkun commented on code in PR #43603: URL: https://github.com/apache/spark/pull/43603#discussion_r1378302898 ## project/SparkBuild.scala: ## @@ -269,7 +269,8 @@ object SparkBuild extends PomBuild { // SPARK-45627 `enum`, `export` and `given` will become keywords

Re: [PR] [SPARK-45368][SQL] Remove scala2.12 compatibility logic for DoubleType, FloatType, Decimal [spark]

2023-10-31 Thread via GitHub
panbingkun commented on PR #43456: URL: https://github.com/apache/spark/pull/43456#issuecomment-1788247942 It's strange that I haven't received any notification about this PR until today when I received the 'resolved' email on Jira.

Re: [PR] [SPARK-45680][CONNECT] Release session [spark]

2023-10-31 Thread via GitHub
HyukjinKwon commented on code in PR #43546: URL: https://github.com/apache/spark/pull/43546#discussion_r1378300936 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/SparkSession.scala: ## @@ -667,6 +667,7 @@ class SparkSession private[sql] ( * @since 3.4.0

Re: [PR] [SPARK-45680][CONNECT] Release session [spark]

2023-10-31 Thread via GitHub
HyukjinKwon commented on code in PR #43546: URL: https://github.com/apache/spark/pull/43546#discussion_r1378300602 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/SparkSession.scala: ## @@ -667,6 +667,7 @@ class SparkSession private[sql] ( * @since 3.4.0

Re: [PR] [SPARK-45704][BUILD] Fix compile warning - using symbols inherited from a superclass shadow symbols defined in an outer scope [spark]

2023-10-31 Thread via GitHub
panbingkun commented on PR #43593: URL: https://github.com/apache/spark/pull/43593#issuecomment-1788245919 > Could you revise the PR title a little more specific, @panbingkun ? Done

Re: [PR] [SPARK-45368][SQL] Remove scala2.12 compatibility logic for DoubleType, FloatType, Decimal [spark]

2023-10-31 Thread via GitHub
panbingkun commented on PR #43456: URL: https://github.com/apache/spark/pull/43456#issuecomment-1788238152 late LGTM.

Re: [PR] [SPARK-45749][CORE][WEBUI] Fix Spark History Server to sort `Duration` column properly [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43613: URL: https://github.com/apache/spark/pull/43613#issuecomment-1788231527 Could you review this, @yaooqinn and @LuciferYang ?

[PR] [SPARK-45749][CORE][WEBUI] Fix Spark History Server to sort `Duration` column properly [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun opened a new pull request, #43613: URL: https://github.com/apache/spark/pull/43613 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[PR] [SQL]WIP [spark]

2023-10-31 Thread via GitHub
anchovYu opened a new pull request, #43612: URL: https://github.com/apache/spark/pull/43612 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-45680][CONNECT] Release session [spark]

2023-10-31 Thread via GitHub
HyukjinKwon commented on code in PR #43546: URL: https://github.com/apache/spark/pull/43546#discussion_r1378275852 ## python/pyspark/sql/tests/connect/test_connect_basic.py: ## @@ -3451,7 +3451,6 @@ def test_can_create_multiple_sessions_to_different_remotes(self): #

Re: [PR] [SPARK-45733][CONNECT][PYTHON] Support multiple retry policies [spark]

2023-10-31 Thread via GitHub
HyukjinKwon commented on PR #43591: URL: https://github.com/apache/spark/pull/43591#issuecomment-1788205796 @grundprinzip and @nija-at let me know if this is good to go.

Re: [PR] [SPARK-45654][PYTHON] Add Python data source write API [spark]

2023-10-31 Thread via GitHub
HyukjinKwon closed pull request #43516: [SPARK-45654][PYTHON] Add Python data source write API URL: https://github.com/apache/spark/pull/43516

Re: [PR] [SPARK-45654][PYTHON] Add Python data source write API [spark]

2023-10-31 Thread via GitHub
HyukjinKwon commented on PR #43516: URL: https://github.com/apache/spark/pull/43516#issuecomment-1788205097 Merged to master.

Re: [PR] [SPARK-45733][CONNECT][PYTHON] Support multiple retry policies [spark]

2023-10-31 Thread via GitHub
cdkrot commented on PR #43591: URL: https://github.com/apache/spark/pull/43591#issuecomment-1788204673 meow

Re: [PR] [SPARK-43380][SQL][Follow-up] Fix slowdown in Avro read [spark]

2023-10-31 Thread via GitHub
gengliangwang closed pull request #43606: [SPARK-43380][SQL][Follow-up] Fix slowdown in Avro read URL: https://github.com/apache/spark/pull/43606

Re: [PR] [SPARK-43380][SQL][Follow-up] Fix slowdown in Avro read [spark]

2023-10-31 Thread via GitHub
gengliangwang commented on PR #43606: URL: https://github.com/apache/spark/pull/43606#issuecomment-1788184967 Thanks, merging to master/branch-3.5

Re: [PR] [MINOR][DOCS]Fix the variable name in the docs - testing_pyspark.ipynb [spark]

2023-10-31 Thread via GitHub
HyukjinKwon closed pull request #43610: [MINOR][DOCS]Fix the variable name in the docs - testing_pyspark.ipynb URL: https://github.com/apache/spark/pull/43610

Re: [PR] [MINOR][DOCS]Fix the variable name in the docs - testing_pyspark.ipynb [spark]

2023-10-31 Thread via GitHub
HyukjinKwon commented on PR #43610: URL: https://github.com/apache/spark/pull/43610#issuecomment-1788177955 Merged to master.

Re: [PR] [SPARK-45741][BUILD] Upgrade Netty to 4.1.100.Final [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43605: URL: https://github.com/apache/spark/pull/43605#issuecomment-1788171441 Merged to master for Apache Spark 4.0.0.

Re: [PR] [SPARK-45741][BUILD] Upgrade Netty to 4.1.100.Final [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun closed pull request #43605: [SPARK-45741][BUILD] Upgrade Netty to 4.1.100.Final URL: https://github.com/apache/spark/pull/43605

Re: [PR] [SPARK-45746][Python] Return specific error messages if UDTF 'analyze' method accepts or returns wrong values [spark]

2023-10-31 Thread via GitHub
dtenedor commented on PR #43611: URL: https://github.com/apache/spark/pull/43611#issuecomment-1788161885 cc @ueshin

[PR] [SPARK-45746][Python] Return specific error messages if UDTF 'analyze' method accepts or returns wrong values [spark]

2023-10-31 Thread via GitHub
dtenedor opened a new pull request, #43611: URL: https://github.com/apache/spark/pull/43611 ### What changes were proposed in this pull request? This PR adds checks to return specific error messages if any Python UDTF 'analyze' method accepts or returns wrong values. ### Why

Re: [PR] [MINOR][DOCS]Fix the variable name in the docs - testing_pyspark.ipynb [spark]

2023-10-31 Thread via GitHub
asl3 commented on PR #43610: URL: https://github.com/apache/spark/pull/43610#issuecomment-1788153546 thanks for the fix! LGTM @huciaa

Re: [PR] [MINOR][DOCS]Fix the variable name in the docs - testing_pyspark.ipynb [spark]

2023-10-31 Thread via GitHub
HyukjinKwon commented on PR #43610: URL: https://github.com/apache/spark/pull/43610#issuecomment-1788126781 cc @asl3

Re: [PR] [SPARK-45739][PYTHON] Catch IOException instead of EOFException alone for faulthandler [spark]

2023-10-31 Thread via GitHub
HyukjinKwon closed pull request #43600: [SPARK-45739][PYTHON] Catch IOException instead of EOFException alone for faulthandler URL: https://github.com/apache/spark/pull/43600

Re: [PR] [SPARK-45739][PYTHON] Catch IOException instead of EOFException alone for faulthandler [spark]

2023-10-31 Thread via GitHub
HyukjinKwon commented on PR #43600: URL: https://github.com/apache/spark/pull/43600#issuecomment-1788125916 Retriggered and passed at https://github.com/HyukjinKwon/spark/actions/runs/6705864326/job/18242116209

Re: [PR] [SPARK-45739][PYTHON] Catch IOException instead of EOFException alone for faulthandler [spark]

2023-10-31 Thread via GitHub
HyukjinKwon commented on PR #43600: URL: https://github.com/apache/spark/pull/43600#issuecomment-1788126056 Merged to master.

Re: [PR] [SPARK-43380][SQL][Follow-up] Fix slowdown in Avro read [spark]

2023-10-31 Thread via GitHub
zeruibao commented on PR #43606: URL: https://github.com/apache/spark/pull/43606#issuecomment-1788111440 Hi @gengliangwang and @dongjoon-hyun, all avro related tests have passed. Should be good to merge now haha.

Re: [PR] [SPARK-45511][SS] State Data Source - Reader [spark]

2023-10-31 Thread via GitHub
chaoqin-li1123 commented on code in PR #43425: URL: https://github.com/apache/spark/pull/43425#discussion_r1378121683 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/StatePartitionReader.scala: ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache

[PR] Fix the variable name in the docs - testing_pyspark.ipynb [spark]

2023-10-31 Thread via GitHub
huciaa opened a new pull request, #43610: URL: https://github.com/apache/spark/pull/43610 ### What changes were proposed in this pull request? I'm changing a variable name in one of the cells in the documentation regarding pyspark testing. ### Why are the changes needed?

Re: [PR] [SPARK-45744][CORE] Switch `spark.history.store.serializer` to use `PROTOBUF` by default [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43609: URL: https://github.com/apache/spark/pull/43609#issuecomment-1787967100 Oh, got it. Let me convert this to `Draft` first and stay tuned for your work. Thank you so much, @gengliangwang .

Re: [PR] [SPARK-45744][CORE] Switch `spark.history.store.serializer` to use `PROTOBUF` by default [spark]

2023-10-31 Thread via GitHub
gengliangwang commented on PR #43609: URL: https://github.com/apache/spark/pull/43609#issuecomment-1787964921 @dongjoon-hyun I like the idea. In the long term, the major concern is that there can be issues if a protobuf serializer is not implemented properly in handling null string values.

Re: [PR] [SPARK-45741][BUILD] Upgrade Netty to 4.1.100.Final [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43605: URL: https://github.com/apache/spark/pull/43605#issuecomment-1787956941 Thank you, @LuciferYang .

Re: [PR] [SPARK-45744][CORE] Switch `spark.history.store.serializer` to use `PROTOBUF` by default [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43609: URL: https://github.com/apache/spark/pull/43609#issuecomment-1787951831 WDYT, @gengliangwang ?

[PR] [SPARK-45744][CORE] Switch `spark.history.store.serializer` to use `PROTOBUF` by default [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun opened a new pull request, #43609: URL: https://github.com/apache/spark/pull/43609 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

[PR] [SPARK-45743][BUILD] Upgrade dropwizard metrics 4.2.21 [spark]

2023-10-31 Thread via GitHub
LuciferYang opened a new pull request, #43608: URL: https://github.com/apache/spark/pull/43608 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

Re: [PR] [SPARK-45743][BUILD] Upgrade dropwizard metrics 4.2.21 [spark]

2023-10-31 Thread via GitHub
LuciferYang commented on PR #43608: URL: https://github.com/apache/spark/pull/43608#issuecomment-1787914598 Test first

Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub
LuciferYang commented on code in PR #43502: URL: https://github.com/apache/spark/pull/43502#discussion_r1378069029 ## common/kvstore/src/test/java/org/apache/spark/util/kvstore/RocksDBSuite.java: ## @@ -381,6 +382,56 @@ public void testSkipAfterDBClose() throws Exception {

Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub
LuciferYang commented on code in PR #43502: URL: https://github.com/apache/spark/pull/43502#discussion_r1378067392 ## common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java: ## @@ -180,13 +183,20 @@ public boolean skip(long n) { @Override public

Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub
LuciferYang commented on code in PR #43502: URL: https://github.com/apache/spark/pull/43502#discussion_r1378065031 ## common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java: ## @@ -270,20 +280,30 @@ static int compare(byte[] a, byte[] b) { return

Re: [PR] [SPARK-45242][SQL][FOLLOWUP] Canonicalize DataFrame ID in CollectMetrics [spark]

2023-10-31 Thread via GitHub
gengliangwang closed pull request #43594: [SPARK-45242][SQL][FOLLOWUP] Canonicalize DataFrame ID in CollectMetrics URL: https://github.com/apache/spark/pull/43594

[PR] [SPARK-45742][CORE][CONNECT][MLLIB] Introduce an implicit function for Scala Array to wrap into `immutable.ArraySeq`. [spark]

2023-10-31 Thread via GitHub
LuciferYang opened a new pull request, #43607: URL: https://github.com/apache/spark/pull/43607 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

Re: [PR] [SPARK-45726][CONNECT] Make Dataset.collectResult private [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43586: URL: https://github.com/apache/spark/pull/43586#issuecomment-1787800557 According to the above discussion result, let me close this PR. We can re-open this later if something is changed in the future.

Re: [PR] [SPARK-45726][CONNECT] Make Dataset.collectResult private [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun closed pull request #43586: [SPARK-45726][CONNECT] Make Dataset.collectResult private URL: https://github.com/apache/spark/pull/43586

[PR] [SPARK-43380][SQL][Follow-up] Fix slowdown in Avro read [spark]

2023-10-31 Thread via GitHub
zeruibao opened a new pull request, #43606: URL: https://github.com/apache/spark/pull/43606 ### What changes were proposed in this pull request? Fix slowdown in Avro read. There is a https://github.com/apache/spark/pull/42503 that causes the performance regression. It seems that

Re: [PR] [SPARK-45481][SPARK-45664][SPARK-45711][SQL][FOLLOWUP] Avoid magic strings copy from parquet|orc|avro compression codes [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43604: URL: https://github.com/apache/spark/pull/43604#issuecomment-1787797604 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-45481][SPARK-45664][SPARK-45711][SQL][FOLLOWUP] Avoid magic strings copy from parquet|orc|avro compression codes [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun closed pull request #43604: [SPARK-45481][SPARK-45664][SPARK-45711][SQL][FOLLOWUP] Avoid magic strings copy from parquet|orc|avro compression codes URL: https://github.com/apache/spark/pull/43604 -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] [SPARK-45511][SS] State Data Source - Reader [spark]

2023-10-31 Thread via GitHub
anishshri-db commented on code in PR #43425: URL: https://github.com/apache/spark/pull/43425#discussion_r1378010285 ## sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/v2/state/StateDataSourceReadSuite.scala: ## @@ -0,0 +1,695 @@ +/* + * Licensed to the Apache

Re: [PR] [SPARK-45680][CONNECT] Release session [spark]

2023-10-31 Thread via GitHub
juliuszsompolski commented on code in PR #43546: URL: https://github.com/apache/spark/pull/43546#discussion_r1377960758 ## python/pyspark/sql/tests/connect/test_connect_basic.py: ## @@ -3451,7 +3451,6 @@ def test_can_create_multiple_sessions_to_different_remotes(self):

Re: [PR] [SPARK-45719][K8S] Upgrade AWS SDK to v2 for Kubernetes integration tests module [spark]

2023-10-31 Thread via GitHub
steveloughran commented on code in PR #43510: URL: https://github.com/apache/spark/pull/43510#discussion_r1377952623 ## pom.xml: ## @@ -162,6 +162,7 @@ 1.12.0 1.11.655 +2.20.128 Review Comment: hadoop is @ 2.20.160 already -- This is an automated

Re: [PR] [SPARK-45741][BUILD] Upgrade Netty to 4.1.100.Final [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43605: URL: https://github.com/apache/spark/pull/43605#issuecomment-1787683326 Thank you, @viirya ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-45713][PYTHON][FOLLOWUP] Fix SparkThrowableSuite for GA [spark]

2023-10-31 Thread via GitHub
allisonwang-db commented on PR #43598: URL: https://github.com/apache/spark/pull/43598#issuecomment-1787676008 Thanks for fixing this @panbingkun! Would be good to know why the test was not triggered in CI. -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] [SPARK-45511][SS] State Data Source - Reader [spark]

2023-10-31 Thread via GitHub
anishshri-db commented on PR #43425: URL: https://github.com/apache/spark/pull/43425#issuecomment-1787661104 Another question: do we need to explicitly block using this source as part of `readStream` somewhere? Do we also need a test for this? -- This is an automated message from

Re: [PR] [SPARK-45511][SS] State Data Source - Reader [spark]

2023-10-31 Thread via GitHub
anishshri-db commented on PR #43425: URL: https://github.com/apache/spark/pull/43425#issuecomment-1787660261 @HeartSaVioR - the test failure seems related? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[PR] [SPARK-45741][BUILD] Upgrade Netty to 4.1.100.Final [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun opened a new pull request, #43605: URL: https://github.com/apache/spark/pull/43605 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

Re: [PR] [SPARK-45592][SQL] Correctness issue in AQE with InMemoryTableScanExec [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43435: URL: https://github.com/apache/spark/pull/43435#issuecomment-1787597375 Thank you, @eejbyfeldt and all. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] [SPARK-45718][PS] Remove remaining deprecated Pandas features from Spark 3.4.0 [spark]

2023-10-31 Thread via GitHub
dongjoon-hyun commented on PR #43581: URL: https://github.com/apache/spark/pull/43581#issuecomment-1787577153 Unfortunately, it seems that three PySpark test pipelines still fail. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub
LuciferYang commented on code in PR #43502: URL: https://github.com/apache/spark/pull/43502#discussion_r1377856270 ## common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java: ## @@ -182,23 +189,21 @@ public boolean skip(long n) { @Override public

Re: [PR] [SPARK-45527][core] Use fraction to do the resource calculation [spark]

2023-10-31 Thread via GitHub
tgravescs commented on code in PR #43494: URL: https://github.com/apache/spark/pull/43494#discussion_r1377768603 ## core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala: ## @@ -273,7 +273,8 @@ private[spark] class CoarseGrainedExecutorBackend(

Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub
zhaomin1423 commented on code in PR #43502: URL: https://github.com/apache/spark/pull/43502#discussion_r1377858576 ## common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java: ## @@ -182,23 +189,21 @@ public boolean skip(long n) { @Override public

Re: [PR] [SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator [spark]

2023-10-31 Thread via GitHub
zhaomin1423 commented on code in PR #43502: URL: https://github.com/apache/spark/pull/43502#discussion_r1377851138 ## common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBIterator.java: ## @@ -182,23 +189,21 @@ public boolean skip(long n) { @Override public

Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

2023-10-31 Thread via GitHub
LuciferYang commented on code in PR #43603: URL: https://github.com/apache/spark/pull/43603#discussion_r1377847272 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala: ## @@ -1678,23 +1678,27 @@ class PlanParserSuite extends AnalysisTest {
