[PR] [MINOR][DOCS] Change `SPARK_ANSI_SQL_MODE` in PlanStabilitySuite documentation [spark]

2024-04-21 Thread via GitHub
HyukjinKwon opened a new pull request, #46148: URL: https://github.com/apache/spark/pull/46148 ### What changes were proposed in this pull request? This PR proposes to fix `SPARK_ANSI_SQL_MODE=true` to `SPARK_ANSI_SQL_MODE=false` in `PlanStabilitySuite` documentation ### Why

Re: [PR] [DO-NOT-MERGE] Test Apache Hive 2.3.10 RC0 [spark]

2024-04-21 Thread via GitHub
pan3793 commented on PR #45372: URL: https://github.com/apache/spark/pull/45372#issuecomment-2067976202 @dongjoon-hyun @sunchao @LuciferYang the integration test with Hive RC0 passed, do you have other concerns or extra cases that need to be verified? -- This is an automated message from

Re: [PR] [MINOR] Improve SparkPi example [spark]

2024-04-21 Thread via GitHub
EnricoMi closed pull request #45664: [MINOR] Improve SparkPi example URL: https://github.com/apache/spark/pull/45664 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [MINOR] Improve SparkPi example [spark]

2024-04-21 Thread via GitHub
EnricoMi commented on PR #45664: URL: https://github.com/apache/spark/pull/45664#issuecomment-2067958222 Thanks for the input. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [DO-NOT-MERGE] Test Apache Hive 2.3.10 RC0 [spark]

2024-04-21 Thread via GitHub
pan3793 commented on PR #45372: URL: https://github.com/apache/spark/pull/45372#issuecomment-2067976499 also cc @wangyum @HyukjinKwon @yaooqinn -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [SPARK-47018][BUILD] Upgrade built-in Hive to 2.3.10 [spark]

2024-04-21 Thread via GitHub
pan3793 commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r1574144603 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala: ## @@ -1626,6 +1625,7 @@ class HiveQuerySuite extends HiveComparisonTest with

Re: [PR] [SPARK-47018][BUILD] Upgrade built-in Hive to 2.3.10 [spark]

2024-04-21 Thread via GitHub
pan3793 commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r1574116717 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala: ## @@ -1626,6 +1625,7 @@ class HiveQuerySuite extends HiveComparisonTest with

Re: [PR] [SPARK-47928][SQL][TEST] Speed up test "Add jar support Ivy URI in SQL" [spark]

2024-04-21 Thread via GitHub
pan3793 commented on PR #46150: URL: https://github.com/apache/spark/pull/46150#issuecomment-2068478468 cc @dongjoon-hyun @LuciferYang -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] [SPARK-47890][CONNECT][PYTHON] Add variant functions to Scala and Python. [spark]

2024-04-21 Thread via GitHub
itholic commented on code in PR #46123: URL: https://github.com/apache/spark/pull/46123#discussion_r1574132243 ## python/pyspark/sql/tests/test_functions.py: ## @@ -1315,6 +1315,35 @@ def test_parse_json(self): self.assertEqual("""{"a":1}""", actual["var"])

Re: [PR] [SPARK-47932][SQL][TEST] Avoid using legacy commons-lang [spark]

2024-04-21 Thread via GitHub
smileyboy2019 commented on PR #46154: URL: https://github.com/apache/spark/pull/46154#issuecomment-2068511712 Structured Streaming supports writing Spark SQL and using SQL to write stream processing logic. Is this possible, similar to the syntax of Flink. SQL can satisfy the flow

Re: [PR] [SPARK-47596][DSTREAMS] Streaming: Migrate logWarn with variables to structured logging framework [spark]

2024-04-21 Thread via GitHub
smileyboy2019 commented on PR #46079: URL: https://github.com/apache/spark/pull/46079#issuecomment-2068510767 Structured Streaming supports writing Spark SQL and using SQL to write stream processing logic. Is this possible, similar to the syntax of Flink. SQL can satisfy the flow

Re: [PR] [SPARK-47912][SQL] Infer serde class from format classes [spark]

2024-04-21 Thread via GitHub
smileyboy2019 commented on PR #46132: URL: https://github.com/apache/spark/pull/46132#issuecomment-2068512262 Structured Streaming supports writing Spark SQL and using SQL to write stream processing logic. Is this possible, similar to the syntax of Flink. SQL can satisfy the flow

Re: [PR] [SPARK-47018][BUILD] Upgrade built-in Hive to 2.3.10 [spark]

2024-04-21 Thread via GitHub
pan3793 commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r1574143381 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala: ## @@ -1627,10 +1627,8 @@ class HiveQuerySuite extends HiveComparisonTest with

Re: [PR] [SPARK-47932][SQL][TEST] Avoid using legacy commons-lang [spark]

2024-04-21 Thread via GitHub
pan3793 commented on PR #46154: URL: https://github.com/apache/spark/pull/46154#issuecomment-2068515201 @smileyboy2019 the question is irrelevant to this PR, please ask questions in the mailing list or Slack. -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] [SPARK-47890][CONNECT][PYTHON] Add variant functions to Scala and Python. [spark]

2024-04-21 Thread via GitHub
zhengruifeng commented on code in PR #46123: URL: https://github.com/apache/spark/pull/46123#discussion_r1574083962 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala: ## @@ -6975,6 +6975,71 @@ object functions { */ def parse_json(json:

Re: [PR] [SPARK-47907] Put bang under a config [spark]

2024-04-21 Thread via GitHub
cloud-fan commented on code in PR #46138: URL: https://github.com/apache/spark/pull/46138#discussion_r1574084085 ## sql/core/src/test/resources/sql-tests/analyzer-results/predicate-functions.sql.out: ## @@ -450,3 +450,18 @@ Project [NOT between(to_timestamp(2022-12-26 00:00:01,

Re: [PR] [SPARK-47904][SQL] Preserve case in Avro schema when using enableStableIdentifiersForUnionType [spark]

2024-04-21 Thread via GitHub
dongjoon-hyun commented on code in PR #46126: URL: https://github.com/apache/spark/pull/46126#discussion_r1574081702 ## connector/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala: ## @@ -208,14 +208,13 @@ object SchemaConverters { // could

Re: [PR] [SPARK-47907] Put bang under a config [spark]

2024-04-21 Thread via GitHub
srielau commented on code in PR #46138: URL: https://github.com/apache/spark/pull/46138#discussion_r1574085978 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ## @@ -365,6 +367,8 @@ class AstBuilder extends DataTypeAstBuilder with

Re: [PR] [SPARK-47907] Put bang under a config [spark]

2024-04-21 Thread via GitHub
srielau commented on code in PR #46138: URL: https://github.com/apache/spark/pull/46138#discussion_r1574095617 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -3798,6 +3798,21 @@ ], "sqlState" : "22003" }, + "SYNTAX_DISCONTINUED" : { +

Re: [PR] [SPARK-47904][SQL] Preserve case in Avro schema when using enableStableIdentifiersForUnionType [spark]

2024-04-21 Thread via GitHub
sadikovi commented on PR #46126: URL: https://github.com/apache/spark/pull/46126#issuecomment-2068451840 Thanks for the review @dongjoon-hyun. I have addressed your comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] [SPARK-47018][BUILD] Upgrade built-in Hive to 2.3.10 [spark]

2024-04-21 Thread via GitHub
pan3793 commented on PR #45372: URL: https://github.com/apache/spark/pull/45372#issuecomment-2068475033 @dongjoon-hyun This one is mostly for verification, let's try any ideas on it. I plan to split it into several PRs once Hive 2.3.10 is officially released. -- This is an automated

Re: [PR] [SPARK-47018][BUILD] Upgrade built-in Hive to 2.3.10 [spark]

2024-04-21 Thread via GitHub
pan3793 commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r1574116295 ## sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala: ## @@ -3748,7 +3748,7 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with

Re: [PR] [SPARK-47907] Put bang under a config [spark]

2024-04-21 Thread via GitHub
cloud-fan commented on code in PR #46138: URL: https://github.com/apache/spark/pull/46138#discussion_r1574084230 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -3798,6 +3798,21 @@ ], "sqlState" : "22003" }, + "SYNTAX_DISCONTINUED" : { +

[PR] [SPARK-47931][SQL] Remove unused and leaked threadlocal/session sessionHive [spark]

2024-04-21 Thread via GitHub
yaooqinn opened a new pull request, #46153: URL: https://github.com/apache/spark/pull/46153 ### What changes were proposed in this pull request? This `sessionHive` is never used and never properly closed ### Why are the changes needed? A thread local

Re: [PR] [SPARK-47907] Put bang under a config [spark]

2024-04-21 Thread via GitHub
srielau commented on code in PR #46138: URL: https://github.com/apache/spark/pull/46138#discussion_r1574101454 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -3798,6 +3798,21 @@ ], "sqlState" : "22003" }, + "SYNTAX_DISCONTINUED" : { +

Re: [PR] [SPARK-47148][SQL] Avoid to materialize AQE ExchangeQueryStageExec on the cancellation [spark]

2024-04-21 Thread via GitHub
erenavsarogullari commented on PR #45234: URL: https://github.com/apache/spark/pull/45234#issuecomment-2068474192 Thanks @cloud-fan and @ulysses-you for the reviews and approval. I have just rebased to get a green build. -- This is an automated message from the Apache Git Service. To

Re: [PR] [SPARK-47018][BUILD] Upgrade built-in Hive to 2.3.10 [spark]

2024-04-21 Thread via GitHub
pan3793 commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r1574127899 ## sql/hive/src/test/java/org/apache/spark/sql/hive/test/Complex.java: ## @@ -16,7 +16,7 @@ */ package org.apache.spark.sql.hive.test; -import

Re: [PR] [SPARK-47910][CORE] close stream when DiskBlockObjectWriter closeResources to avoid memory leak [spark]

2024-04-21 Thread via GitHub
JacobZheng0927 commented on PR #46131: URL: https://github.com/apache/spark/pull/46131#issuecomment-2068538332 > Thank you for making a PR, @JacobZheng0927 . > > However, your PR fails to compile. Please make GitHub Action CI green. > > > [error] (core / Compile /

Re: [PR] [SPARK-47018][BUILD] Upgrade built-in Hive to 2.3.10 [spark]

2024-04-21 Thread via GitHub
dongjoon-hyun commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r1574094232 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala: ## @@ -1626,6 +1625,7 @@ class HiveQuerySuite extends HiveComparisonTest

Re: [PR] [SPARK-47907] Put bang under a config [spark]

2024-04-21 Thread via GitHub
cloud-fan commented on code in PR #46138: URL: https://github.com/apache/spark/pull/46138#discussion_r1574094424 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ## @@ -365,6 +367,8 @@ class AstBuilder extends DataTypeAstBuilder with

[PR] [SPARK-47933][PYTHON] Parent Column class for Spark Connect and Spark Classic [spark]

2024-04-21 Thread via GitHub
HyukjinKwon opened a new pull request, #46155: URL: https://github.com/apache/spark/pull/46155 ### What changes were proposed in this pull request? Same as https://github.com/apache/spark/pull/46129 but for `Column` class. ### Why are the changes needed? Same as

Re: [PR] [SPARK-47018][BUILD] Upgrade built-in Hive to 2.3.10 [spark]

2024-04-21 Thread via GitHub
dongjoon-hyun commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r1574093194 ## sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala: ## @@ -3748,7 +3748,7 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with

Re: [PR] [SPARK-47633][SQL] Include right-side plan output in `LateralJoin#allAttributes` for more consistent canonicalization [spark]

2024-04-21 Thread via GitHub
cloud-fan commented on code in PR #45763: URL: https://github.com/apache/spark/pull/45763#discussion_r1574118395 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala: ## @@ -2056,6 +2056,8 @@ case class LateralJoin(

[PR] [SPARK-47932][SQL][TEST] Avoid using legacy commons-lang [spark]

2024-04-21 Thread via GitHub
pan3793 opened a new pull request, #46154: URL: https://github.com/apache/spark/pull/46154 ### What changes were proposed in this pull request? Remove usage of `commons-lang` in favor of `commons-lang3` ### Why are the changes needed? Migrate `commons-lang` to

Re: [PR] [SPARK-47904][SQL] Preserve case in Avro schema when using enableStableIdentifiersForUnionType [spark]

2024-04-21 Thread via GitHub
dongjoon-hyun commented on code in PR #46126: URL: https://github.com/apache/spark/pull/46126#discussion_r1574083057 ## connector/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala: ## @@ -367,6 +367,20 @@ abstract class AvroSuite

Re: [PR] [SPARK-47922][SQL] Implement the try_parse_json expression [spark]

2024-04-21 Thread via GitHub
harshmotw-db commented on code in PR #46141: URL: https://github.com/apache/spark/pull/46141#discussion_r1574119555 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/variant/variantExpressions.scala: ## @@ -75,6 +75,36 @@ case class ParseJson(child:

Re: [PR] [SPARK-47907] Put bang under a config [spark]

2024-04-21 Thread via GitHub
cloud-fan commented on code in PR #46138: URL: https://github.com/apache/spark/pull/46138#discussion_r157408 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ## @@ -365,6 +367,8 @@ class AstBuilder extends DataTypeAstBuilder with

Re: [PR] [SPARK-47907] Put bang under a config [spark]

2024-04-21 Thread via GitHub
srielau commented on code in PR #46138: URL: https://github.com/apache/spark/pull/46138#discussion_r1574088137 ## sql/core/src/test/resources/sql-tests/analyzer-results/predicate-functions.sql.out: ## @@ -450,3 +450,18 @@ Project [NOT between(to_timestamp(2022-12-26 00:00:01,

Re: [PR] [SPARK-47930][BUILD] Upgrade RoaringBitmap to 1.0.6 [spark]

2024-04-21 Thread via GitHub
panbingkun commented on PR #46152: URL: https://github.com/apache/spark/pull/46152#issuecomment-2068523328 The benchmark (org.apache.spark.MapStatusesConvertBenchmark) results are as follows: JDK 17: JDK 21: -- This is an automated message from the Apache Git Service. To respond to the

[PR] [SPARK-47730][K8S] Support APP_ID and EXECUTOR_ID placeholders in labels [spark]

2024-04-21 Thread via GitHub
jshmchenxi opened a new pull request, #46149: URL: https://github.com/apache/spark/pull/46149 Currently, only pod annotations support `APP_ID` and `EXECUTOR_ID` placeholders. This commit aims to add the same function to pod labels. The use case is to support using customized

Re: [PR] [SPARK-46632][SQL] EquivalentExpressions addExprTree should allow all type of expressions [spark]

2024-04-21 Thread via GitHub
planga82 closed pull request #45894: [SPARK-46632][SQL] EquivalentExpressions addExprTree should allow all type of expressions URL: https://github.com/apache/spark/pull/45894 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] [SPARK-46632][SQL] EquivalentExpressions addExprTree should allow all type of expressions [spark]

2024-04-21 Thread via GitHub
planga82 commented on PR #45894: URL: https://github.com/apache/spark/pull/45894#issuecomment-2067982861 Thank you @zml1206 . I saw that you have proposed another solution. I'm going to close this PR. -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [PR] [SPARK-47412][SQL] Add Collation Support for LPad/RPad. [spark]

2024-04-21 Thread via GitHub
GideonPotok commented on code in PR #46041: URL: https://github.com/apache/spark/pull/46041#discussion_r1573733891 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala: ## @@ -54,7 +54,7 @@ object CollationTypeCasts extends

Re: [PR] [SPARK-47412][SQL] Add Collation Support for LPad/RPad. [spark]

2024-04-21 Thread via GitHub
GideonPotok commented on code in PR #46041: URL: https://github.com/apache/spark/pull/46041#discussion_r1573741040 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala: ## @@ -54,7 +54,7 @@ object CollationTypeCasts extends

Re: [PR] [SPARK-47730][K8S] Support APP_ID and EXECUTOR_ID placeholders in labels [spark]

2024-04-21 Thread via GitHub
jshmchenxi commented on PR #46149: URL: https://github.com/apache/spark/pull/46149#issuecomment-2068041487 cc @dongjoon-hyun This is a similar feature to #35704 and #42600. Please take a look, thanks! -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] [SPARK-47412][SQL] Add Collation Support for LPad/RPad. [spark]

2024-04-21 Thread via GitHub
uros-db commented on code in PR #46041: URL: https://github.com/apache/spark/pull/46041#discussion_r1573839675 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala: ## @@ -52,6 +52,12 @@ object CollationTypeCasts extends

Re: [PR] [SPARK-47412][SQL] Add Collation Support for LPad/RPad. [spark]

2024-04-21 Thread via GitHub
uros-db commented on code in PR #46041: URL: https://github.com/apache/spark/pull/46041#discussion_r1573842165 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala: ## @@ -52,6 +52,12 @@ object CollationTypeCasts extends

Re: [PR] [SPARK-47412][SQL] Add Collation Support for LPad/RPad. [spark]

2024-04-21 Thread via GitHub
GideonPotok commented on PR #46041: URL: https://github.com/apache/spark/pull/46041#issuecomment-2068126309 > small changes in casting @uros-db done -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[PR] [SPARK-47928][SQL][TEST] Speed up test "Add jar support Ivy URI in SQL" [spark]

2024-04-21 Thread via GitHub
pan3793 opened a new pull request, #46150: URL: https://github.com/apache/spark/pull/46150 ### What changes were proposed in this pull request? `SQLQuerySuite`/"SPARK-33084: Add jar support Ivy URI in SQL" uses Hive deps to test `ADD JAR` which pulls tons of transitive deps,

Re: [PR] [SPARK-47412][SQL] Add Collation Support for LPad/RPad. [spark]

2024-04-21 Thread via GitHub
uros-db commented on code in PR #46041: URL: https://github.com/apache/spark/pull/46041#discussion_r1573905109 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala: ## @@ -52,6 +52,14 @@ object CollationTypeCasts extends

Re: [PR] [SPARK-41469][CORE] Avoid unnecessary task rerun on decommissioned executor lost if shuffle data migrated [spark]

2024-04-21 Thread via GitHub
parthshyara commented on code in PR #39011: URL: https://github.com/apache/spark/pull/39011#discussion_r1573887387 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1046,17 +1048,45 @@ private[spark] class TaskSetManager( /** Called by

Re: [PR] [SPARK-47928][SQL][TEST] Speed up test "Add jar support Ivy URI in SQL" [spark]

2024-04-21 Thread via GitHub
pan3793 commented on code in PR #46150: URL: https://github.com/apache/spark/pull/46150#discussion_r1573890525 ## sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala: ## @@ -3748,22 +3748,21 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with

Re: [PR] [MINOR][DOCS] Change `SPARK_ANSI_SQL_MODE` in PlanStabilitySuite documentation [spark]

2024-04-21 Thread via GitHub
HyukjinKwon commented on PR #46148: URL: https://github.com/apache/spark/pull/46148#issuecomment-2068330509 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [MINOR][DOCS] Change `SPARK_ANSI_SQL_MODE` in PlanStabilitySuite documentation [spark]

2024-04-21 Thread via GitHub
HyukjinKwon closed pull request #46148: [MINOR][DOCS] Change `SPARK_ANSI_SQL_MODE` in PlanStabilitySuite documentation URL: https://github.com/apache/spark/pull/46148 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] [SPARK-47845][SQL][PYTHON][CONNECT] Support Column type in split function for scala and python [spark]

2024-04-21 Thread via GitHub
zhengruifeng closed pull request #46045: [SPARK-47845][SQL][PYTHON][CONNECT] Support Column type in split function for scala and python URL: https://github.com/apache/spark/pull/46045 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] [SPARK-47845][SQL][PYTHON][CONNECT] Support Column type in split function for scala and python [spark]

2024-04-21 Thread via GitHub
zhengruifeng commented on PR #46045: URL: https://github.com/apache/spark/pull/46045#issuecomment-2068333943 thank you @CTCC1 merged to master -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] [SPARK-47890][CONNECT][PYTHON] Add variant functions to Scala and Python. [spark]

2024-04-21 Thread via GitHub
chenhao-db commented on code in PR #46123: URL: https://github.com/apache/spark/pull/46123#discussion_r1574033907 ## python/pyspark/sql/tests/test_functions.py: ## @@ -1315,6 +1315,35 @@ def test_parse_json(self): self.assertEqual("""{"a":1}""", actual["var"])

Re: [PR] [SPARK-47910][CORE] close stream when DiskBlockObjectWriter closeResources to avoid memory leak [spark]

2024-04-21 Thread via GitHub
JacobZheng0927 commented on PR #46131: URL: https://github.com/apache/spark/pull/46131#issuecomment-2068359129 cc @dongjoon-hyun This is similar to https://github.com/apache/spark/pull/35613. Please take a look, thanks! -- This is an automated message from the Apache Git Service. To

[PR] [SPARK-47930][BUILD] Upgrade RoaringBitmap to 1.0.6 [spark]

2024-04-21 Thread via GitHub
panbingkun opened a new pull request, #46152: URL: https://github.com/apache/spark/pull/46152 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-47730][K8S] Support `APP_ID` and `EXECUTOR_ID` placeholders in labels [spark]

2024-04-21 Thread via GitHub
dongjoon-hyun commented on code in PR #46149: URL: https://github.com/apache/spark/pull/46149#discussion_r1574055887 ## resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStepSuite.scala: ## @@ -35,7 +35,9 @@ import

Re: [PR] [DO-NOT-MERGE] Test Apache Hive 2.3.10 RC0 [spark]

2024-04-21 Thread via GitHub
LuciferYang commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r1574056705 ## dev/deps/spark-deps-hadoop-3-hive-2.3: ## @@ -33,6 +33,7 @@ breeze-macros_2.13/2.1.0//breeze-macros_2.13-2.1.0.jar breeze_2.13/2.1.0//breeze_2.13-2.1.0.jar

Re: [PR] [SPARK-47914][SQL] Do not display the splits parameter in Rang [spark]

2024-04-21 Thread via GitHub
guixiaowen commented on PR #46136: URL: https://github.com/apache/spark/pull/46136#issuecomment-2068383150 > Mind checking the test failures? ok. I will check it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] [SPARK-47730][K8S] Support `APP_ID` and `EXECUTOR_ID` placeholders in labels [spark]

2024-04-21 Thread via GitHub
dongjoon-hyun commented on code in PR #46149: URL: https://github.com/apache/spark/pull/46149#discussion_r1574056485 ## resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStepSuite.scala: ## @@ -35,7 +35,9 @@ import

Re: [PR] [SPARK-47730][K8S] Support `APP_ID` and `EXECUTOR_ID` placeholders in labels [spark]

2024-04-21 Thread via GitHub
dongjoon-hyun commented on code in PR #46149: URL: https://github.com/apache/spark/pull/46149#discussion_r1574056783 ## resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/BasicTestsSuite.scala: ## @@ -102,13 +102,15 @@

Re: [PR] [SPARK-47730][K8S] Support `APP_ID` and `EXECUTOR_ID` placeholders in labels [spark]

2024-04-21 Thread via GitHub
dongjoon-hyun commented on code in PR #46149: URL: https://github.com/apache/spark/pull/46149#discussion_r1574057309 ## resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala: ## @@ -589,7 +589,8 @@ class

Re: [PR] [SPARK-47904][SQL] Preserve case in Avro schema when using enableStableIdentifiersForUnionType [spark]

2024-04-21 Thread via GitHub
dongjoon-hyun commented on PR #46126: URL: https://github.com/apache/spark/pull/46126#issuecomment-2068387729 Sorry for being late. I'll take a look today, @sadikovi . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] [DO-NOT-MERGE] Test Apache Hive 2.3.10 RC0 [spark]

2024-04-21 Thread via GitHub
LuciferYang commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r1574059438 ## project/SparkBuild.scala: ## @@ -952,7 +955,7 @@ object Unsafe { object DockerIntegrationTests { // This serves to override the override specified in

Re: [PR] [SPARK-47922][SQL] Implement the try_parse_json expression [spark]

2024-04-21 Thread via GitHub
cloud-fan commented on code in PR #46141: URL: https://github.com/apache/spark/pull/46141#discussion_r1574060791 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/variant/variantExpressions.scala: ## @@ -75,6 +75,36 @@ case class ParseJson(child:

Re: [PR] [DO-NOT-MERGE] Test Apache Hive 2.3.10 RC0 [spark]

2024-04-21 Thread via GitHub
LuciferYang commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r1574062976 ## sql/hive/src/test/java/org/apache/spark/sql/hive/test/Complex.java: ## @@ -16,7 +16,7 @@ */ package org.apache.spark.sql.hive.test; -import

Re: [PR] [DO-NOT-MERGE] Test Apache Hive 2.3.10 RC0 [spark]

2024-04-21 Thread via GitHub
pan3793 commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r1574065984 ## project/SparkBuild.scala: ## @@ -952,7 +955,7 @@ object Unsafe { object DockerIntegrationTests { // This serves to override the override specified in

Re: [PR] [DO-NOT-MERGE] Test Apache Hive 2.3.10 RC0 [spark]

2024-04-21 Thread via GitHub
LuciferYang commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r1574066564 ## project/SparkBuild.scala: ## @@ -952,7 +955,7 @@ object Unsafe { object DockerIntegrationTests { // This serves to override the override specified in

Re: [PR] [DO-NOT-MERGE] Test Apache Hive 2.3.10 RC0 [spark]

2024-04-21 Thread via GitHub
pan3793 commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r1574066575 ## sql/hive/src/test/java/org/apache/spark/sql/hive/test/Complex.java: ## @@ -16,7 +16,7 @@ */ package org.apache.spark.sql.hive.test; -import

Re: [PR] [DO-NOT-MERGE] Test Apache Hive 2.3.10 RC0 [spark]

2024-04-21 Thread via GitHub
pan3793 commented on code in PR #45372: URL: https://github.com/apache/spark/pull/45372#discussion_r157404 ## dev/deps/spark-deps-hadoop-3-hive-2.3: ## @@ -33,6 +33,7 @@ breeze-macros_2.13/2.1.0//breeze-macros_2.13-2.1.0.jar breeze_2.13/2.1.0//breeze_2.13-2.1.0.jar

Re: [PR] [SPARK-47912][SQL] Infer serde class from format classes [spark]

2024-04-21 Thread via GitHub
wForget commented on PR #46132: URL: https://github.com/apache/spark/pull/46132#issuecomment-2068404048 @cloud-fan could you please take a look? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [SPARK-47902][SQL]Making Compute Current Time* expressions foldable [spark]

2024-04-21 Thread via GitHub
cloud-fan commented on PR #46120: URL: https://github.com/apache/spark/pull/46120#issuecomment-2068419611 thanks, merging to master! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-47902][SQL]Making Compute Current Time* expressions foldable [spark]

2024-04-21 Thread via GitHub
cloud-fan closed pull request #46120: [SPARK-47902][SQL]Making Compute Current Time* expressions foldable URL: https://github.com/apache/spark/pull/46120 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] [SPARK-47412][SQL] Add Collation Support for LPad/RPad. [spark]

2024-04-21 Thread via GitHub
GideonPotok commented on code in PR #46041: URL: https://github.com/apache/spark/pull/46041#discussion_r1573929256 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala: ## @@ -52,6 +52,14 @@ object CollationTypeCasts extends

Re: [PR] [SPARK-47412][SQL] Add Collation Support for LPad/RPad. [spark]

2024-04-21 Thread via GitHub
09306677806 commented on PR #46041: URL: https://github.com/apache/spark/pull/46041#issuecomment-2068214445 Lotfan barsihay lam ra shoro konid ShahrzadMahro On Monday, April 15, 2024, at 13:41, Gideon Potok ***@***.***> wrote: > @uros-db

Re: [PR] [SPARK-41469][CORE] Avoid unnecessary task rerun on decommissioned executor lost if shuffle data migrated [spark]

2024-04-21 Thread via GitHub
09306677806 commented on PR #39011: URL: https://github.com/apache/spark/pull/39011#issuecomment-2068221273 #(&£€&445₩)*khakqganHynosql sharsing ShahrzadMahro On Sunday, April 21, 2024, at 22:36, Parth Shyara ***@***.***> wrote: > ***@***. commented on this pull

[PR] [SPARK-47600][CORE] MLLib: Migrate logInfo with variables to structured logging framework [spark]

2024-04-21 Thread via GitHub
zeotuan opened a new pull request, #46151: URL: https://github.com/apache/spark/pull/46151 The PR aims to migrate `logInfo` in module MLLib with variables to structured logging framework. ### Why are the changes needed? To enhance Apache Spark's logging system by

Re: [PR] [SPARK-47909][PYTHON][CONNECT] Parent DataFrame class for Spark Connect and Spark Classic [spark]

2024-04-21 Thread via GitHub
HyukjinKwon commented on PR #46129: URL: https://github.com/apache/spark/pull/46129#issuecomment-2068279142 Merged to master. I will follow up if there are more comments to address. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] [SPARK-47909][PYTHON][CONNECT] Parent DataFrame class for Spark Connect and Spark Classic [spark]

2024-04-21 Thread via GitHub
HyukjinKwon closed pull request #46129: [SPARK-47909][PYTHON][CONNECT] Parent DataFrame class for Spark Connect and Spark Classic URL: https://github.com/apache/spark/pull/46129 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] [WIP][SPARK-46620][PS][CONNECT] Implement `Frame.asfreq` [spark]

2024-04-21 Thread via GitHub

Re: [PR] [SPARK-47413][SQL] - add support to substr/left/right for collations [spark]

2024-04-21 Thread via GitHub
GideonPotok commented on PR #46040: URL: https://github.com/apache/spark/pull/46040#issuecomment-2068213773 @uros-db -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [SPARK-47412][SQL] Add Collation Support for LPad/RPad. [spark]

2024-04-21 Thread via GitHub
09306677806 commented on PR #46041: URL: https://github.com/apache/spark/pull/46041#issuecomment-2068216025 U;Sh(7700077)SQL HARMAN 04#G) ShahrzadMahro On Monday, April 22, 2024, at 01:24, Gideon Potok ***@***.***> wrote: > ***@***. commented on this pull request.

[PR] [SPARK-47929]Setup Static Analysis for Operator [spark-kubernetes-operator]

2024-04-21 Thread via GitHub
jiangzho opened a new pull request, #6: URL: https://github.com/apache/spark-kubernetes-operator/pull/6 ### What changes were proposed in this pull request? This is a breakdown PR from #2 - setting up common build tasks and required plugins. Checkstyle xml is

Re: [PR] [SPARK-47890][CONNECT][PYTHON] Add variant functions to Scala and Python. [spark]

2024-04-21 Thread via GitHub
chenhao-db commented on PR #46123: URL: https://github.com/apache/spark/pull/46123#issuecomment-2068317353 @zhengruifeng @LuciferYang Could you help take another look? Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] [SPARK-47890][CONNECT][PYTHON] Add variant functions to Scala and Python. [spark]

2024-04-21 Thread via GitHub
itholic commented on code in PR #46123: URL: https://github.com/apache/spark/pull/46123#discussion_r1574020712 ## python/pyspark/sql/tests/test_functions.py: ## @@ -1315,6 +1315,35 @@ def test_parse_json(self): self.assertEqual("""{"a":1}""", actual["var"])