Re: [PR] [SPARK-49539][SS] Update internal col families start identifier to a different one [spark]

2024-09-09 Thread via GitHub
HeartSaVioR commented on PR #48030: URL: https://github.com/apache/spark/pull/48030#issuecomment-2339593712 Thanks! Merging to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-49539][SS] Update internal col families start identifier to a different one [spark]

2024-09-09 Thread via GitHub
HeartSaVioR closed pull request #48030: [SPARK-49539][SS] Update internal col families start identifier to a different one URL: https://github.com/apache/spark/pull/48030

Re: [PR] [SPARK-49576][INFRA] Upload Python logs in CI [spark]

2024-09-09 Thread via GitHub
HyukjinKwon closed pull request #48048: [SPARK-49576][INFRA] Upload Python logs in CI URL: https://github.com/apache/spark/pull/48048

Re: [PR] [SPARK-49534][CORE][3.5] No longer prepend `sql/hive`and `sql/hive-thriftserver` when `spark-hive_xxx.jar` is not in the classpath [spark]

2024-09-09 Thread via GitHub
LuciferYang commented on PR #48046: URL: https://github.com/apache/spark/pull/48046#issuecomment-2339631391 > Actually this is more test-only and won't need to backport right? fine to me. cc @wangyum for awareness

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-09-09 Thread via GitHub
mihailom-db commented on code in PR #47364: URL: https://github.com/apache/spark/pull/47364#discussion_r1751315215 ## common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java: ## @@ -923,4 +1063,13 @@ public static String getClosestSuggestionsOnInvali

Re: [PR] [SPARK-49547][SQL][PYTHON] Support returning iterator of RecordBatches in applyInArrow [spark]

2024-09-09 Thread via GitHub
zhengruifeng commented on code in PR #48038: URL: https://github.com/apache/spark/pull/48038#discussion_r1751352741 ## python/pyspark/sql/pandas/_typing/__init__.pyi: ## @@ -347,10 +348,16 @@ PandasCogroupedMapFunction = Union[ ArrowGroupedMapFunction = Union[ Callable[[py

[PR] [SPARK-49578][SQL] Change error message for CAST_INVALID_INPUT and CAST_OVERFLOW [spark]

2024-09-09 Thread via GitHub
mihailom-db opened a new pull request, #48054: URL: https://github.com/apache/spark/pull/48054 ### What changes were proposed in this pull request? Removal of suggestion to turn off ansi when experiencing CAST_INVALID_INPUT or CAST_OVERFLOW. ### Why are the changes needed? T

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-09-10 Thread via GitHub
panbingkun commented on code in PR #47364: URL: https://github.com/apache/spark/pull/47364#discussion_r1751388958 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala: ## @@ -1112,4 +1112,29 @@ class SparkSqlAstBuilder extends AstBuilder { withId

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-09-10 Thread via GitHub
panbingkun commented on code in PR #47364: URL: https://github.com/apache/spark/pull/47364#discussion_r1751390300 ## common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java: ## @@ -704,6 +809,39 @@ private String collationName() { }

Re: [PR] [SPARK-49548][CONNECT] Replace coarse-locking in SparkConnectSessionManager with ConcurrentMap [spark]

2024-09-10 Thread via GitHub
changgyoopark-db commented on PR #48036: URL: https://github.com/apache/spark/pull/48036#issuecomment-2339871455 @juliuszsompolski Can you please review this PR (as you're the one who wrote the majority of the code in this file)? Thanks!

Re: [PR] [SPARK-49544][CONNECT] Replace coarse-locking in SparkConnectExecutionManager with ConcurrentMap [spark]

2024-09-10 Thread via GitHub
changgyoopark-db commented on PR #48034: URL: https://github.com/apache/spark/pull/48034#issuecomment-2339872073 @juliuszsompolski Can you please review this PR too? Thanks!

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-09-10 Thread via GitHub
panbingkun commented on code in PR #47364: URL: https://github.com/apache/spark/pull/47364#discussion_r1751402898 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala: ## @@ -1112,4 +1112,29 @@ class SparkSqlAstBuilder extends AstBuilder { withId

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-09-10 Thread via GitHub
mihailom-db commented on code in PR #47364: URL: https://github.com/apache/spark/pull/47364#discussion_r1751410939 ## common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java: ## @@ -88,12 +90,45 @@ public Optional getVersion() { } } + publ

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-09-10 Thread via GitHub
panbingkun commented on code in PR #47364: URL: https://github.com/apache/spark/pull/47364#discussion_r1751414438 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala: ## @@ -1112,4 +1112,29 @@ class SparkSqlAstBuilder extends AstBuilder { withId

Re: [PR] [SPARK-49547][SQL][PYTHON] Support returning iterator of RecordBatches in applyInArrow [spark]

2024-09-10 Thread via GitHub
xinrong-meng commented on code in PR #48038: URL: https://github.com/apache/spark/pull/48038#discussion_r1751420860 ## python/pyspark/sql/pandas/group_ops.py: ## @@ -538,26 +538,28 @@ def applyInArrow( as a `DataFrame`. The function should take a `pyarrow.Tab

Re: [PR] [SPARK-49547][SQL][PYTHON] Support returning iterator of RecordBatches in applyInArrow [spark]

2024-09-10 Thread via GitHub
xinrong-meng commented on code in PR #48038: URL: https://github.com/apache/spark/pull/48038#discussion_r1751421772 ## python/pyspark/sql/pandas/group_ops.py: ## @@ -538,26 +538,28 @@ def applyInArrow( as a `DataFrame`. The function should take a `pyarrow.Tab

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-09-10 Thread via GitHub
mihailom-db commented on code in PR #47364: URL: https://github.com/apache/spark/pull/47364#discussion_r1751423306 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala: ## @@ -1112,4 +1112,29 @@ class SparkSqlAstBuilder extends AstBuilder { withI

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-09-10 Thread via GitHub
mihailom-db commented on code in PR #47364: URL: https://github.com/apache/spark/pull/47364#discussion_r1751428380 ## common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java: ## @@ -923,4 +1063,13 @@ public static String getClosestSuggestionsOnInvali

Re: [PR] [SPARK-36796][BUILD][CORE][SQL] Pass all `sql/core` and dependent modules UTs with JDK 17 except one case in `postgreSQL/text.sql` [spark]

2024-09-10 Thread via GitHub
lachen-rms commented on PR #34153: URL: https://github.com/apache/spark/pull/34153#issuecomment-2339925239 Hi guys, is there a Spark version today that supports Java 17? I'm trying to run a project locally that uses Spark and Java 16, Scala version 2.12.15, and it is failing because trying to acce

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-09-10 Thread via GitHub
panbingkun commented on code in PR #47364: URL: https://github.com/apache/spark/pull/47364#discussion_r1751449761 ## common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java: ## @@ -923,4 +1063,13 @@ public static String getClosestSuggestionsOnInvalid

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-09-10 Thread via GitHub
panbingkun commented on code in PR #47364: URL: https://github.com/apache/spark/pull/47364#discussion_r1751452589 ## common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java: ## @@ -88,12 +90,45 @@ public Optional getVersion() { } } + publi

[PR] [SPARK-49582][PYTHON][CONNECT] Fix "dispatch_window_method" utility and documentation [spark]

2024-09-10 Thread via GitHub
xinrong-meng opened a new pull request, #48056: URL: https://github.com/apache/spark/pull/48056 ### What changes were proposed in this pull request? Fix "dispatch_window_method" utility and documentation ### Why are the changes needed? Correct dispatch and better documentation.

Re: [PR] [SPARK-49398][SQL] Cache Table with Parameter markers returns wrong error [spark]

2024-09-10 Thread via GitHub
mihailom-db commented on code in PR #48055: URL: https://github.com/apache/spark/pull/48055#discussion_r1751489604 ## sql/core/src/test/scala/org/apache/spark/sql/ParametersSuite.scala: ## @@ -715,4 +715,20 @@ class ParametersSuite extends QueryTest with SharedSparkSession with

Re: [PR] [SPARK-49553][PYTHON][DOCS] Remove the experimental API notes for pandas related functions [spark]

2024-09-10 Thread via GitHub
HeartSaVioR commented on code in PR #48042: URL: https://github.com/apache/spark/pull/48042#discussion_r1751499369 ## python/pyspark/sql/pandas/group_ops.py: ## @@ -329,8 +327,6 @@ def applyInPandasWithState( Notes - This function requires a full s

Re: [PR] [SPARK-49582][PYTHON][CONNECT] Fix "dispatch_window_method" utility and docstring [spark]

2024-09-10 Thread via GitHub
zhengruifeng commented on code in PR #48056: URL: https://github.com/apache/spark/pull/48056#discussion_r1751515263 ## python/pyspark/sql/utils.py: ## @@ -399,11 +399,13 @@ def wrapped(*args: Any, **kwargs: Any) -> Any: if is_remote() and "PYSPARK_NO_NAMESPACE_SHARE" no

Re: [PR] [SPARK-49582][PYTHON][CONNECT] Fix "dispatch_window_method" utility and docstring [spark]

2024-09-10 Thread via GitHub
zhengruifeng commented on code in PR #48056: URL: https://github.com/apache/spark/pull/48056#discussion_r1751516321 ## python/pyspark/sql/utils.py: ## @@ -399,11 +399,13 @@ def wrapped(*args: Any, **kwargs: Any) -> Any: if is_remote() and "PYSPARK_NO_NAMESPACE_SHARE" no

Re: [PR] [MINOR][DOCS] Fix scaladoc for `FlatMapGroupsInArrowExec` and `FlatMapCoGroupsInArrowExec` [spark]

2024-09-10 Thread via GitHub
zhengruifeng commented on PR #48052: URL: https://github.com/apache/spark/pull/48052#issuecomment-2340027257 merged to master

Re: [PR] [MINOR][DOCS] Fix scaladoc for `FlatMapGroupsInArrowExec` and `FlatMapCoGroupsInArrowExec` [spark]

2024-09-10 Thread via GitHub
zhengruifeng closed pull request #48052: [MINOR][DOCS] Fix scaladoc for `FlatMapGroupsInArrowExec` and `FlatMapCoGroupsInArrowExec` URL: https://github.com/apache/spark/pull/48052

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-09-10 Thread via GitHub
panbingkun commented on PR #47364: URL: https://github.com/apache/spark/pull/47364#issuecomment-2340065192 @mihailom-db @MaxGekk all done!

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-09-10 Thread via GitHub
mihailom-db commented on code in PR #47364: URL: https://github.com/apache/spark/pull/47364#discussion_r1751555879 ## common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java: ## @@ -428,6 +461,52 @@ protected Collation buildCollation() {

Re: [PR] [SPARK-49584][BUILD] Upgrade log4j2 to 2.24.0 [spark]

2024-09-10 Thread via GitHub
LuciferYang commented on PR #48057: URL: https://github.com/apache/spark/pull/48057#issuecomment-2340103506 Test first

Re: [PR] [SPARK-49544][CONNECT] Replace coarse-locking in SparkConnectExecutionManager with ConcurrentMap [spark]

2024-09-10 Thread via GitHub
changgyoopark-db commented on code in PR #48034: URL: https://github.com/apache/spark/pull/48034#discussion_r1751600547 ## sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectExecutionManager.scala: ## @@ -61,6 +60,7 @@ private[connect] class Spark

Re: [PR] [SPARK-49398][SQL] Cache Table with Parameter markers returns wrong error [spark]

2024-09-10 Thread via GitHub
mikhailnik-db commented on code in PR #48055: URL: https://github.com/apache/spark/pull/48055#discussion_r1751612040 ## sql/core/src/test/scala/org/apache/spark/sql/ParametersSuite.scala: ## @@ -715,4 +715,20 @@ class ParametersSuite extends QueryTest with SharedSparkSession wi

Re: [PR] [SPARK-49544][CONNECT] Replace coarse-locking in SparkConnectExecutionManager with ConcurrentMap [spark]

2024-09-10 Thread via GitHub
changgyoopark-db commented on code in PR #48034: URL: https://github.com/apache/spark/pull/48034#discussion_r1751626231 ## sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectExecutionManager.scala: ## @@ -77,26 +77,28 @@ private[connect] class Spa

Re: [PR] [SPARK-49544][CONNECT] Replace coarse-locking in SparkConnectExecutionManager with ConcurrentMap [spark]

2024-09-10 Thread via GitHub
changgyoopark-db commented on code in PR #48034: URL: https://github.com/apache/spark/pull/48034#discussion_r1751619058 ## sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectExecutionManager.scala: ## @@ -108,43 +110,50 @@ private[connect] class S

Re: [PR] [SPARK-49544][CONNECT] Replace coarse-locking in SparkConnectExecutionManager with ConcurrentMap [spark]

2024-09-10 Thread via GitHub
juliuszsompolski commented on code in PR #48034: URL: https://github.com/apache/spark/pull/48034#discussion_r1751639015 ## sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectExecutionManager.scala: ## @@ -61,6 +60,7 @@ private[connect] class Spark

Re: [PR] [SPARK-49544][CONNECT] Replace coarse-locking in SparkConnectExecutionManager with ConcurrentMap [spark]

2024-09-10 Thread via GitHub
juliuszsompolski commented on code in PR #48034: URL: https://github.com/apache/spark/pull/48034#discussion_r1751665785 ## sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectExecutionManager.scala: ## @@ -77,26 +77,28 @@ private[connect] class Spa

Re: [PR] [SPARK-49544][CONNECT] Replace coarse-locking in SparkConnectExecutionManager with ConcurrentMap [spark]

2024-09-10 Thread via GitHub
changgyoopark-db commented on code in PR #48034: URL: https://github.com/apache/spark/pull/48034#discussion_r1751714503 ## sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectExecutionManager.scala: ## @@ -61,6 +60,7 @@ private[connect] class Spark

Re: [PR] [SPARK-49544][CONNECT] Replace coarse-locking in SparkConnectExecutionManager with ConcurrentMap [spark]

2024-09-10 Thread via GitHub
changgyoopark-db commented on code in PR #48034: URL: https://github.com/apache/spark/pull/48034#discussion_r1751722122 ## sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectExecutionManager.scala: ## @@ -77,26 +77,28 @@ private[connect] class Spa

Re: [PR] [SPARK-49582][PYTHON][CONNECT] Fix "dispatch_window_method" utility and docstring [spark]

2024-09-10 Thread via GitHub
xinrong-meng commented on code in PR #48056: URL: https://github.com/apache/spark/pull/48056#discussion_r1751726121 ## python/pyspark/sql/utils.py: ## @@ -399,11 +399,13 @@ def wrapped(*args: Any, **kwargs: Any) -> Any: if is_remote() and "PYSPARK_NO_NAMESPACE_SHARE" no

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

2024-09-10 Thread via GitHub
panbingkun commented on code in PR #47364: URL: https://github.com/apache/spark/pull/47364#discussion_r1751736488 ## sql/core/src/test/scala/org/apache/spark/sql/CollationSuite.scala: ## @@ -1624,4 +1624,22 @@ class CollationSuite extends DatasourceV2SQLBase with AdaptiveSparkP

Re: [PR] [SPARK-49519][SQL] Merge options of table and relation when constructing FileScanBuilder [spark]

2024-09-10 Thread via GitHub
liujiayi771 commented on PR #47996: URL: https://github.com/apache/spark/pull/47996#issuecomment-2340447026 @cloud-fan @dongjoon-hyun I have added a test case, could you help to review?

Re: [PR] [SPARK-49547][SQL][PYTHON] Support returning iterator of RecordBatches in applyInArrow [spark]

2024-09-10 Thread via GitHub
zhengruifeng commented on PR #48038: URL: https://github.com/apache/spark/pull/48038#issuecomment-2340481093 @Kimahriman thank you so much for working on it! 1, regarding the cogrouped API, I personally prefer ``` (Iterator[RecordBatch], Iterator[RecordBatch]) -> Iterator[Record

Re: [PR] [SPARK-49548][CONNECT] Replace coarse-locking in SparkConnectSessionManager with ConcurrentMap [spark]

2024-09-10 Thread via GitHub
changgyoopark-db commented on PR #48036: URL: https://github.com/apache/spark/pull/48036#issuecomment-2340667167 @hvanhovell @HyukjinKwon Hello Herman and Hyukjin, would you mind merging this PR? Thanks!

Re: [PR] [SPARK-49398][SQL] Cache Table with Parameter markers returns wrong error [spark]

2024-09-10 Thread via GitHub
MaxGekk commented on code in PR #48055: URL: https://github.com/apache/spark/pull/48055#discussion_r1751909597 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ## @@ -5081,6 +5081,13 @@ class AstBuilder extends DataTypeAstBuilder val o

Re: [PR] [SPARK-49583][SQL] Define the error sub-condition `SECONDS_FRACTION` for invalid seconds fraction pattern [spark]

2024-09-10 Thread via GitHub
MaxGekk commented on PR #48058: URL: https://github.com/apache/spark/pull/48058#issuecomment-2340858407 Merging to master. Thank you, @cloud-fan for review.

Re: [PR] [SPARK-48195] Save and reuse RDD/Broadcast created by SparkPlan [spark]

2024-09-10 Thread via GitHub
cloud-fan commented on code in PR #48037: URL: https://github.com/apache/spark/pull/48037#discussion_r1752024515 ## core/src/main/scala/org/apache/spark/util/LazyTry.scala: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contrib

Re: [PR] [SPARK-49249][SPARK-49320] Add new tag-related APIs in Connect back to Spark Core [spark]

2024-09-10 Thread via GitHub
xupefei commented on code in PR #47815: URL: https://github.com/apache/spark/pull/47815#discussion_r1752028523 ## core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala: ## @@ -167,7 +168,7 @@ private[spark] class DAGScheduler( // Stages that must be resubmitted du

Re: [PR] [SPARK-48195] Save and reuse RDD/Broadcast created by SparkPlan [spark]

2024-09-10 Thread via GitHub
cloud-fan commented on code in PR #48037: URL: https://github.com/apache/spark/pull/48037#discussion_r1752038999 ## core/src/test/scala/org/apache/spark/util/UtilsSuite.scala: ## @@ -1523,6 +1523,116 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties { co

Re: [PR] [SPARK-48195] Save and reuse RDD/Broadcast created by SparkPlan [spark]

2024-09-10 Thread via GitHub
cloud-fan commented on code in PR #48037: URL: https://github.com/apache/spark/pull/48037#discussion_r1752046485 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala: ## @@ -182,6 +182,10 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] with Logging

Re: [PR] [SPARK-48195] Save and reuse RDD/Broadcast created by SparkPlan [spark]

2024-09-10 Thread via GitHub
juliuszsompolski commented on code in PR #48037: URL: https://github.com/apache/spark/pull/48037#discussion_r1752047279 ## core/src/test/scala/org/apache/spark/util/LazyTrySuite.scala: ## @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or mor

Re: [PR] [SPARK-48195] Save and reuse RDD/Broadcast created by SparkPlan [spark]

2024-09-10 Thread via GitHub
juliuszsompolski commented on code in PR #48037: URL: https://github.com/apache/spark/pull/48037#discussion_r1752051998 ## core/src/test/scala/org/apache/spark/util/UtilsSuite.scala: ## @@ -1523,6 +1523,116 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties {

Re: [PR] [SPARK-49398][SQL] Cache Table with Parameter markers returns wrong error [spark]

2024-09-10 Thread via GitHub
mikhailnik-db commented on code in PR #48055: URL: https://github.com/apache/spark/pull/48055#discussion_r1752061272 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ## @@ -5677,4 +5684,24 @@ class AstBuilder extends DataTypeAstBuilder w

Re: [PR] [SPARK-49443][SQL][PYTHON] Implement to_variant_object expression and make schema_of_variant expressions print OBJECT for for Variant Objects [spark]

2024-09-10 Thread via GitHub
cloud-fan commented on PR #47907: URL: https://github.com/apache/spark/pull/47907#issuecomment-2340961708 The link failure is unrelated, thanks, merging to master!

Re: [PR] [SPARK-49398][SQL] Cache Table with Parameter markers returns wrong error [spark]

2024-09-10 Thread via GitHub
MaxGekk commented on PR #48055: URL: https://github.com/apache/spark/pull/48055#issuecomment-2340957562 @mikhailnik-db Could you improve PR's title, please. It should say what the PR does, but not what was wrong. Like: ``` Improve the error for parameters in the query of CACHE TABLE

Re: [PR] [SPARK-49443][SQL][PYTHON] Implement to_variant_object expression and make schema_of_variant expressions print OBJECT for for Variant Objects [spark]

2024-09-10 Thread via GitHub
cloud-fan closed pull request #47907: [SPARK-49443][SQL][PYTHON] Implement to_variant_object expression and make schema_of_variant expressions print OBJECT for for Variant Objects URL: https://github.com/apache/spark/pull/47907

Re: [PR] [SPARK-49584][BUILD] Upgrade log4j2 to 2.24.0 [spark]

2024-09-10 Thread via GitHub
LuciferYang commented on code in PR #48057: URL: https://github.com/apache/spark/pull/48057#discussion_r1752095345 ## common/utils/src/test/resources/log4j2.component.properties: ## @@ -0,0 +1 @@ +log4j2.garbagefree.threadContextMap = true Review Comment: https://github.com

Re: [PR] [SPARK-49545][INFRA] Increase timeout for build from 3 to 4 hours [spark]

2024-09-10 Thread via GitHub
dongjoon-hyun commented on PR #48033: URL: https://github.com/apache/spark/pull/48033#issuecomment-2341017211 Thank you so much, @HyukjinKwon !

Re: [PR] [SPARK-49579][SQL][TESTS] Rename `errorClass` to `condition` in `checkError()` [spark]

2024-09-10 Thread via GitHub
MaxGekk commented on PR #48027: URL: https://github.com/apache/spark/pull/48027#issuecomment-2341220061 Merging to master. Thank you, @cloud-fan for review.

Re: [PR] [SPARK-49579][SQL][TESTS] Rename `errorClass` to `condition` in `checkError()` [spark]

2024-09-10 Thread via GitHub
MaxGekk closed pull request #48027: [SPARK-49579][SQL][TESTS] Rename `errorClass` to `condition` in `checkError()` URL: https://github.com/apache/spark/pull/48027

Re: [PR] [SPARK-46088][PYTHON][SQL][DOCS] Add a self-contained example dataframe from jdbc MySQL [spark]

2024-09-10 Thread via GitHub
eder001 commented on PR #48045: URL: https://github.com/apache/spark/pull/48045#issuecomment-2341246884 @cloud-fan Can you help me with my first request? I am looking to start collaborating more with the community

[PR] [SPARK-49579][SQL][TESTS][FOLLOWUP] Use `condition` instead of `errorClass` in `SparkThrowableSuite` and in `DateTimeFormatterHelperSuite` [spark]

2024-09-10 Thread via GitHub
MaxGekk opened a new pull request, #48061: URL: https://github.com/apache/spark/pull/48061 ### What changes were proposed in this pull request? In the PR, I propose to use `condition` instead of `errorClass` in two test suites: - SparkThrowableSuite - DateTimeFormatterHelperSuite

Re: [PR] [SPARK-49579][SQL][TESTS][FOLLOWUP] Use `condition` instead of `errorClass` in `SparkThrowableSuite` and in `DateTimeFormatterHelperSuite` [spark]

2024-09-10 Thread via GitHub
MaxGekk commented on PR #48061: URL: https://github.com/apache/spark/pull/48061#issuecomment-2341371653 Merging to master. Thank you, @dongjoon-hyun for the quick review.

Re: [PR] [SPARK-49398][SQL] Improve the error for parameters in the query of CACHE TABLE and CREATE VIEW [spark]

2024-09-10 Thread via GitHub
MaxGekk commented on code in PR #48055: URL: https://github.com/apache/spark/pull/48055#discussion_r1752270062 ## sql/core/src/test/scala/org/apache/spark/sql/ParametersSuite.scala: ## @@ -715,4 +715,30 @@ class ParametersSuite extends QueryTest with SharedSparkSession with Pla

Re: [PR] [SPARK-49398][SQL] Improve the error for parameters in the query of CACHE TABLE and CREATE VIEW [spark]

2024-09-10 Thread via GitHub
MaxGekk commented on code in PR #48055: URL: https://github.com/apache/spark/pull/48055#discussion_r1752271470 ## sql/core/src/test/scala/org/apache/spark/sql/ParametersSuite.scala: ## @@ -715,4 +715,30 @@ class ParametersSuite extends QueryTest with SharedSparkSession with Pla

Re: [PR] [SPARK-49584][BUILD] Upgrade log4j2 to 2.24.0 [spark]

2024-09-10 Thread via GitHub
LuciferYang closed pull request #48057: [SPARK-49584][BUILD] Upgrade log4j2 to 2.24.0 URL: https://github.com/apache/spark/pull/48057

Re: [PR] [WIP][SPARK-49541][BUILD] Upgrade log4j2 to 2.24.0 [spark]

2024-09-10 Thread via GitHub
LuciferYang commented on PR #48029: URL: https://github.com/apache/spark/pull/48029#issuecomment-2341472710 https://github.com/apache/spark/pull/48057#discussion_r1752095345 Upon further analysis, it appears that there are some issues with the method `copyAndPutAll` in `org.apache.lo

Re: [PR] [DO-NOT-MERGE] Test Python 3.12 build [spark]

2024-09-10 Thread via GitHub
dongjoon-hyun commented on PR #48049: URL: https://github.com/apache/spark/pull/48049#issuecomment-2341560455 It seems that `Python 3.12 - pyspark-connect` passed in 58 mins today. - https://github.com/apache/spark/actions/runs/10779540436/job/29948161788

[PR] [SPARK-49574][CONNECT][SQL] ExpressionEncoder tracks the AgnosticEncoder that created it. [spark]

2024-09-10 Thread via GitHub
hvanhovell opened a new pull request, #48062: URL: https://github.com/apache/spark/pull/48062 ### What changes were proposed in this pull request? This PR makes ExpressionEncoder track the AgnosticEncoder it is created from. The main reason for this change is to allow for situations where

Re: [PR] [SPARK-49505][SQL] Create new SQL functions "randstr" and "uniform" to generate random strings or numbers within ranges [spark]

2024-09-10 Thread via GitHub
MaxGekk commented on code in PR #48004: URL: https://github.com/apache/spark/pull/48004#discussion_r1752520445 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala: ## @@ -181,3 +189,215 @@ case class Randn(child: Expression, hideSeed:

Re: [PR] [SPARK-49162][SQL] Push down date_trunc function [spark]

2024-09-10 Thread via GitHub
IvanK-db commented on code in PR #47666: URL: https://github.com/apache/spark/pull/47666#discussion_r1752531210 ## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/PostgresIntegrationSuite.scala: ## @@ -123,4 +132,81 @@ class PostgresIntegrationSuit

Re: [PR] [SPARK-48700] [SQL] Mode expression for complex types (all collations) [spark]

2024-09-10 Thread via GitHub
MaxGekk commented on code in PR #47154: URL: https://github.com/apache/spark/pull/47154#discussion_r1752552252 ## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala: ## @@ -1852,40 +1888,67 @@ class CollationSQLExpressionsSuite s"array(col

Re: [PR] [SPARK-49590] E2E test template includes invalid spec field [spark-kubernetes-operator]

2024-09-10 Thread via GitHub
jiangzho commented on PR #119: URL: https://github.com/apache/spark-kubernetes-operator/pull/119#issuecomment-2341900212 cc @viirya can you please help to review ?

Re: [PR] [SPARK-48965][SQL] Use the correct schema in `Dataset#toJSON` [spark]

2024-09-10 Thread via GitHub
dongjoon-hyun commented on PR #47982: URL: https://github.com/apache/spark/pull/47982#issuecomment-2341940738 Thank you for the confirmation, @bersprockets .

Re: [PR] [MINOR][SQL][TESTS] Check `sqlState` in `checkError()` [spark]

2024-09-10 Thread via GitHub
dongjoon-hyun closed pull request #48059: [MINOR][SQL][TESTS] Check `sqlState` in `checkError()` URL: https://github.com/apache/spark/pull/48059

Re: [PR] [MINOR][SQL][TESTS] Check `sqlState` in `checkError()` [spark]

2024-09-10 Thread via GitHub
dongjoon-hyun commented on PR #48059: URL: https://github.com/apache/spark/pull/48059#issuecomment-2341942705 Merged to master.

[PR] [SPARK-49591] Add Logical Type columnt to variant readme [spark]

2024-09-10 Thread via GitHub
cashmand opened a new pull request, #48064: URL: https://github.com/apache/spark/pull/48064 ### What changes were proposed in this pull request? Add a concept of logical type to the variant README.md, distinct from the physical encoding of a value. In particular, decimal and i

Re: [PR] [SPARK-49569][CONNECT][SQL] Add shims to support SparkContext and RDD [spark]

2024-09-10 Thread via GitHub
hvanhovell commented on code in PR #48065: URL: https://github.com/apache/spark/pull/48065#discussion_r1752691614 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/package.scala: ## @@ -52,4 +52,7 @@ package object sql { f(builder) column(builder.buil

Re: [PR] [SPARK-49569][CONNECT][SQL] Add shims to support SparkContext and RDD [spark]

2024-09-10 Thread via GitHub
hvanhovell commented on code in PR #48065: URL: https://github.com/apache/spark/pull/48065#discussion_r1752692273 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/SparkSession.scala: ## @@ -84,10 +87,14 @@ class SparkSession private[sql] ( private[sql] va

Re: [PR] [SPARK-49556][SQL] Add SQL pipe syntax for the SELECT operator [spark]

2024-09-10 Thread via GitHub
dtenedor commented on PR #48047: URL: https://github.com/apache/spark/pull/48047#issuecomment-2342008738 cc @cloud-fan @srielau @gengliangwang

Re: [PR] [SPARK-49556][SQL] Add SQL pipe syntax for the SELECT operator [spark]

2024-09-10 Thread via GitHub
gengliangwang commented on code in PR #48047: URL: https://github.com/apache/spark/pull/48047#discussion_r1752917950 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -3707,6 +3707,12 @@ ], "sqlState" : "42K03" }, + "PIPE_OPERATOR_SELECT_CONTAIN

Re: [PR] [SPARK-49591] Add Logical Type columnt to variant readme [spark]

2024-09-10 Thread via GitHub
gene-db commented on code in PR #48064: URL: https://github.com/apache/spark/pull/48064#discussion_r1752923065 ## common/variant/README.md: ## @@ -362,6 +362,8 @@ The Decimal type contains a scale, but no precision. The implied precision of a | 18 <= precision <= 38 | int128

Re: [PR] [SPARK-49556][SQL] Add SQL pipe syntax for the SELECT operator [spark]

2024-09-10 Thread via GitHub
gengliangwang commented on code in PR #48047: URL: https://github.com/apache/spark/pull/48047#discussion_r1752925497 ## sql/core/src/test/resources/sql-tests/inputs/pipe-operators.sql: ## @@ -0,0 +1,84 @@ +-- Prepare some test data. +-- +drop table if exi

Re: [PR] [SPARK-49556][SQL] Add SQL pipe syntax for the SELECT operator [spark]

2024-09-10 Thread via GitHub
dtenedor commented on code in PR #48047: URL: https://github.com/apache/spark/pull/48047#discussion_r1752928254 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -4989,6 +4989,15 @@ object SQLConf { .stringConf .createWithDefault("ve

Re: [PR] [SPARK-49556][SQL] Add SQL pipe syntax for the SELECT operator [spark]

2024-09-10 Thread via GitHub
dtenedor commented on code in PR #48047: URL: https://github.com/apache/spark/pull/48047#discussion_r1752933403 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ## @@ -5677,4 +5677,29 @@ class AstBuilder extends DataTypeAstBuilder withOr

Re: [PR] [SPARK-49553][PYTHON][DOCS] Remove the experimental API notes for pandas related functions [spark]

2024-09-10 Thread via GitHub
allisonwang-db commented on code in PR #48042: URL: https://github.com/apache/spark/pull/48042#discussion_r1752933907 ## python/pyspark/sql/pandas/group_ops.py: ## @@ -636,8 +630,6 @@ def applyInArrow( into memory, so the user should be aware of the potential OOM risk i

Re: [PR] [DO-NOT-MERGE] Test Python 3.12 build [spark]

2024-09-10 Thread via GitHub
HyukjinKwon commented on PR #48049: URL: https://github.com/apache/spark/pull/48049#issuecomment-2342344799 thanks!!! closing!

Re: [PR] [SPARK-48397][SQL] Add data write time metric to FileFormatDataWriter [spark]

2024-09-10 Thread via GitHub
github-actions[bot] commented on PR #46714: URL: https://github.com/apache/spark/pull/46714#issuecomment-2342382680 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-49582][PYTHON][CONNECT] Improve "dispatch_window_method" utility and docstring [spark]

2024-09-10 Thread via GitHub
HyukjinKwon commented on PR #48056: URL: https://github.com/apache/spark/pull/48056#issuecomment-2342428792 Merged to master.

Re: [PR] [SPARK-49582][PYTHON][CONNECT] Improve "dispatch_window_method" utility and docstring [spark]

2024-09-10 Thread via GitHub
HyukjinKwon closed pull request #48056: [SPARK-49582][PYTHON][CONNECT] Improve "dispatch_window_method" utility and docstring URL: https://github.com/apache/spark/pull/48056

Re: [PR] [SPARK-49553][PYTHON][DOCS] Remove the experimental API notes for pandas related functions [spark]

2024-09-10 Thread via GitHub
HyukjinKwon commented on PR #48042: URL: https://github.com/apache/spark/pull/48042#issuecomment-2342473527 Merged to master.

Re: [PR] [SPARK-49553][PYTHON][DOCS] Remove the experimental API notes for pandas related functions [spark]

2024-09-10 Thread via GitHub
HyukjinKwon closed pull request #48042: [SPARK-49553][PYTHON][DOCS] Remove the experimental API notes for pandas related functions URL: https://github.com/apache/spark/pull/48042

Re: [PR] [SPARK-49519][SQL] Merge options of table and relation when constructing FileScanBuilder [spark]

2024-09-10 Thread via GitHub
adrian-wang commented on PR #47996: URL: https://github.com/apache/spark/pull/47996#issuecomment-2342502388 +1, LGTM

Re: [PR] [SPARK-49582][PYTHON][CONNECT] Improve "dispatch_window_method" utility and docstring [spark]

2024-09-10 Thread via GitHub
xinrong-meng commented on PR #48056: URL: https://github.com/apache/spark/pull/48056#issuecomment-2342493135 Thank you all!

Re: [PR] [SPARK-49592] Modify OperatorStateMetadataV2Reader to return the right metadata for the provided batchId [spark]

2024-09-10 Thread via GitHub
ericm-db commented on PR #48066: URL: https://github.com/apache/spark/pull/48066#issuecomment-2342538881 Sorry, thought about this a little more. I don't think this is actually fixing anything, in that we don't write metadata for every single batch, only every run. Say we have run a que

[PR] [SPARK-49595] Fix DataFrame.unpivot/melt in Spark Connect [spark]

2024-09-10 Thread via GitHub
xinrong-meng opened a new pull request, #48069: URL: https://github.com/apache/spark/pull/48069 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch

Re: [PR] [SPARK-49592] Modify OperatorStateMetadataV2Reader to return the right metadata for the provided batchId [spark]

2024-09-10 Thread via GitHub
ericm-db commented on PR #48066: URL: https://github.com/apache/spark/pull/48066#issuecomment-2342532300 @neilramaswamy Can you please add a test for this, looks good to me otherwise
