Re: [PR] [SPARK-47969][PYTHON][TESTS] Make `test_creation_index` deterministic [spark]

2024-04-24 Thread via GitHub
zhengruifeng commented on PR #46200: URL: https://github.com/apache/spark/pull/46200#issuecomment-2074150375 thank you @dongjoon-hyun so much -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] [SPARK-47965][CORE] Avoid orNull in TypedConfigBuilder and OptionalConfigEntry [spark]

2024-04-24 Thread via GitHub
HyukjinKwon commented on PR #46197: URL: https://github.com/apache/spark/pull/46197#issuecomment-2074175363 yeah, it should not be a breaking change I believe. I checked the configuration related interface, and we don't allow setting `null`s. The main idea is that code path that uses

Re: [PR] [SPARK-47969][PYTHON][TESTS] Make `test_creation_index` deterministic [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun closed pull request #46200: [SPARK-47969][PYTHON][TESTS] Make `test_creation_index` deterministic URL: https://github.com/apache/spark/pull/46200

Re: [PR] [MINOR][DOCS] Add `docs/_generated/` to .gitignore [spark]

2024-04-24 Thread via GitHub
nchammas commented on PR #46178: URL: https://github.com/apache/spark/pull/46178#issuecomment-2074131881 Whoops, I believe this is the fault of #44971. I have the ignore in my local checkout but I didn't add it to the PR. Thanks for patching that up.

Re: [PR] [SPARK-47969][PYTHON][TESTS] Make `test_creation_index` deterministic [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on PR #46200: URL: https://github.com/apache/spark/pull/46200#issuecomment-2074134861 Merged to master for Apache Spark 4.0.0.

Re: [PR] [SPARK-47965][CORE] Avoid orNull in TypedConfigBuilder and OptionalConfigEntry [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #46197: URL: https://github.com/apache/spark/pull/46197#discussion_r1577322044 ## core/src/test/scala/org/apache/spark/internal/config/ConfigEntrySuite.scala: ## @@ -196,12 +196,6 @@ class ConfigEntrySuite extends SparkFunSuite {

Re: [PR] [SPARK-47952][CORE][CONNECT] Support retrieving the real SparkConnectService GRPC address and port programmatically when running on Yarn [spark]

2024-04-24 Thread via GitHub
TakawaAkirayo commented on PR #46182: URL: https://github.com/apache/spark/pull/46182#issuecomment-2074148793 cc @grundprinzip @HyukjinKwon Please let me know if this change aligns with the design and usage pattern of SparkConnect and also whether there are already related plans on the

Re: [PR] [SPARK-47903][PYTHON][FOLLOW-UP] Removed changes relating to try_parse_json [spark]

2024-04-24 Thread via GitHub
HyukjinKwon commented on PR #46170: URL: https://github.com/apache/spark/pull/46170#issuecomment-2074184788 Merged to master.

Re: [PR] [SPARK-47903][PYTHON][FOLLOW-UP] Removed changes relating to try_parse_json [spark]

2024-04-24 Thread via GitHub
HyukjinKwon closed pull request #46170: [SPARK-47903][PYTHON][FOLLOW-UP] Removed changes relating to try_parse_json URL: https://github.com/apache/spark/pull/46170

Re: [PR] [SPARK-47954][K8S] Support creating ingress entry for external UI access [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #46184: URL: https://github.com/apache/spark/pull/46184#discussion_r1577326962 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/DriverIngressFeatureStep.scala: ## @@ -0,0 +1,96 @@ +/* + * Licensed to

[PR] [SPARK-47967][SQL] Make JdbcUtils.makeGetter handle reading time type as NTZ correctly [spark]

2024-04-24 Thread via GitHub
yaooqinn opened a new pull request, #46201: URL: https://github.com/apache/spark/pull/46201 ### What changes were proposed in this pull request? This PR adds a specific rule for reading the JDBC TIME value as Spark TimestampNTZ. Currently, we use getTimestamp API to
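The PR description is cut off, but the gist is that reading a JDBC TIME column through a timestamp-oriented API does not match TimestampNTZ (no-time-zone) semantics. As a rough sketch only (not the PR's actual code; `TimeAsNtz` and `timeToNtz` are made-up names), a TIME value can be mapped to the wall-clock time on the epoch date, which is the value a TimestampNTZ column would hold:

```java
import java.sql.Time;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

public class TimeAsNtz {
    // Hypothetical helper: convert a JDBC TIME value to the LocalDateTime a
    // TimestampNTZ column would store -- the wall-clock time anchored on the
    // epoch date (1970-01-01), with no time-zone shift applied.
    static LocalDateTime timeToNtz(Time time) {
        LocalTime local = time.toLocalTime();            // wall-clock HH:mm:ss
        return LocalDateTime.of(LocalDate.EPOCH, local); // 1970-01-01THH:mm:ss
    }

    public static void main(String[] args) {
        Time t = Time.valueOf("23:59:59");
        System.out.println(timeToNtz(t)); // 1970-01-01T23:59:59
    }
}
```

In a real JDBC reader the `Time` would come from `ResultSet.getTime`; the point is that the conversion goes through local wall-clock time rather than a zone-dependent timestamp.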

Re: [PR] [SPARK-47965][CORE] Avoid orNull in TypedConfigBuilder and OptionalConfigEntry [spark]

2024-04-24 Thread via GitHub
HyukjinKwon closed pull request #46197: [SPARK-47965][CORE] Avoid orNull in TypedConfigBuilder and OptionalConfigEntry URL: https://github.com/apache/spark/pull/46197

Re: [PR] [SPARK-47818][CONNECT][FOLLOW-UP] Introduce plan cache in SparkConnectPlanner to improve performance of Analyze requests [spark]

2024-04-24 Thread via GitHub
hvanhovell commented on PR #46098: URL: https://github.com/apache/spark/pull/46098#issuecomment-2075061098 @xi-db please update the PR.

Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]

2024-04-24 Thread via GitHub
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1577686240 ## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala: ## @@ -144,6 +146,22 @@ trait LoggingSuiteBase } } + // An

Re: [PR] [SPARK-47819][CONNECT][3.5] Use asynchronous callback for execution cleanup [spark]

2024-04-24 Thread via GitHub
hvanhovell closed pull request #46064: [SPARK-47819][CONNECT][3.5] Use asynchronous callback for execution cleanup URL: https://github.com/apache/spark/pull/46064

Re: [PR] [SPARK-47958][TESTS] Change LocalSchedulerBackend to notify scheduler of executor on start [spark]

2024-04-24 Thread via GitHub
cloud-fan commented on PR #46187: URL: https://github.com/apache/spark/pull/46187#issuecomment-2074741167 thanks, merging to master!

Re: [PR] [SPARK-47414][SQL] Lowercase collation support for regexp expressions [spark]

2024-04-24 Thread via GitHub
uros-db commented on code in PR #46077: URL: https://github.com/apache/spark/pull/46077#discussion_r1577746735 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala: ## @@ -158,7 +162,9 @@ case class Like(left: Expression, right:

Re: [PR] [SPARK-47958][TESTS] Change LocalSchedulerBackend to notify scheduler of executor on start [spark]

2024-04-24 Thread via GitHub
cloud-fan closed pull request #46187: [SPARK-47958][TESTS] Change LocalSchedulerBackend to notify scheduler of executor on start URL: https://github.com/apache/spark/pull/46187

Re: [PR] [SPARK-47955][SQL] Improve `DeduplicateRelations` performance [spark]

2024-04-24 Thread via GitHub
peter-toth commented on code in PR #46183: URL: https://github.com/apache/spark/pull/46183#discussion_r1578042754 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DeduplicateRelations.scala: ## @@ -38,28 +38,31 @@ case class RelationWrapper(cls: Class[_],

Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]

2024-04-24 Thread via GitHub
LuciferYang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1577602188 ## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala: ## @@ -144,6 +146,22 @@ trait LoggingSuiteBase } } + // An

Re: [PR] [SPARK-47965][CORE] Avoid orNull in TypedConfigBuilder and OptionalConfigEntry [spark]

2024-04-24 Thread via GitHub
HyukjinKwon commented on PR #46197: URL: https://github.com/apache/spark/pull/46197#issuecomment-2074644249 Merged to master.

[PR] [SPARK-47974][BUILD] Remove `install_scala` from `build/mvn` [spark]

2024-04-24 Thread via GitHub
pan3793 opened a new pull request, #46204: URL: https://github.com/apache/spark/pull/46204 ### What changes were proposed in this pull request? Remove `install_scala` from `build/mvn` ### Why are the changes needed? Scala seems only used for standalone Zinc,

Re: [PR] [SPARK-47414][SQL] Lowercase collation support for regexp expressions [spark]

2024-04-24 Thread via GitHub
uros-db commented on code in PR #46077: URL: https://github.com/apache/spark/pull/46077#discussion_r1577746005 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala: ## @@ -1161,7 +1186,8 @@ object RegExpUtils { | // regex

Re: [PR] [SPARK-47952][CORE][CONNECT] Support retrieving the real SparkConnectService GRPC address and port programmatically when running on Yarn [spark]

2024-04-24 Thread via GitHub
grundprinzip commented on PR #46182: URL: https://github.com/apache/spark/pull/46182#issuecomment-2074921088 Hi @TakawaAkirayo , thanks a lot for the contribution. Please let me have a look at it. This is something we wanted to address, but I need to first spend some time looking at the PR

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578075108 ## spark-operator-api/src/main/java/org/apache/spark/k8s/operator/spec/ResourceRetentionPolicy.java: ## @@ -0,0 +1,26 @@ +/* + * Licensed to the

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578076963 ## spark-operator-api/src/main/java/org/apache/spark/k8s/operator/spec/RestartConfig.java: ## @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578093461 ## spark-operator-api/src/test/java/org/apache/spark/k8s/operator/status/ApplicationStatusTest.java: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the

[PR] Update sql-ref-syntax-qry-select-setops.md [spark]

2024-04-24 Thread via GitHub
Legumolas opened a new pull request, #46208: URL: https://github.com/apache/spark/pull/46208 Remove incorrect statements about set operators and ALL. Using the set operators UNION, INTERSECT, EXCEPT, and MINUS is equivalent to using UNION DISTINCT, INTERSECT DISTINCT, EXCEPT

Re: [PR] [SPARK-47633][SQL][3.5] Include right-side plan output in `LateralJoin#allAttributes` for more consistent canonicalization [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on PR #46190: URL: https://github.com/apache/spark/pull/46190#issuecomment-2075408294 Merged to branch-3.5 for Apache Spark 3.5.2. Thank you, @bersprockets.

Re: [PR] [SPARK-47597][STREAMING] Streaming: Migrate logInfo with variables to structured logging framework [spark]

2024-04-24 Thread via GitHub
gengliangwang commented on code in PR #46192: URL: https://github.com/apache/spark/pull/46192#discussion_r1578509327 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala: ## @@ -102,7 +102,7 @@ class IncrementalExecution(

Re: [PR] [SPARK-47960][SS] Allow chaining other stateful operators after transformWIthState operator. [spark]

2024-04-24 Thread via GitHub
anishshri-db commented on code in PR #45376: URL: https://github.com/apache/spark/pull/45376#discussion_r1578226523 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala: ## @@ -347,6 +347,27 @@ class IncrementalExecution(

Re: [PR] Update sql-ref-syntax-qry-select-setops.md [spark]

2024-04-24 Thread via GitHub
Legumolas commented on PR #46208: URL: https://github.com/apache/spark/pull/46208#issuecomment-2075436551 This is the first time I've tried to contribute to a public repo... I checked for existing pull requests related to the fix, but did not see any.

Re: [PR] [SPARK-47583][CORE] SQL core: Migrate logError with variables to structured logging framework [spark]

2024-04-24 Thread via GitHub
gengliangwang commented on PR #45969: URL: https://github.com/apache/spark/pull/45969#issuecomment-2075806528 Thanks, merging to master.

Re: [PR] [SPARK-47597][STREAMING] Streaming: Migrate logInfo with variables to structured logging framework [spark]

2024-04-24 Thread via GitHub
gengliangwang commented on code in PR #46192: URL: https://github.com/apache/spark/pull/46192#discussion_r1578512880 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala: ## @@ -171,7 +173,9 @@ class MicroBatchExecution( //

Re: [PR] [WIP][SPARK-47975][SQL] Add Collation Support for trim/ltrim/rtrim [spark]

2024-04-24 Thread via GitHub
uros-db commented on PR #46205: URL: https://github.com/apache/spark/pull/46205#issuecomment-2075272977 FYI, I think this is already in progress https://issues.apache.org/jira/browse/SPARK-47409 @panbingkun https://issues.apache.org/jira/browse/SPARK-47975 seems like a duplicate

Re: [PR] [SPARK-47960][SS] Allow chaining other stateful operators after transformWIthState operator. [spark]

2024-04-24 Thread via GitHub
anishshri-db commented on code in PR #45376: URL: https://github.com/apache/spark/pull/45376#discussion_r157822 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithStateChainingSuite.scala: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software

[PR] [SPARK-46122][SQL] Disable spark.sql.legacy.createHiveTableByDefault by default [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun opened a new pull request, #46207: URL: https://github.com/apache/spark/pull/46207 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

Re: [PR] [SPARK-47793][SS][PYTHON] Implement SimpleDataSourceStreamReader for python streaming data source [spark]

2024-04-24 Thread via GitHub
allisonwang-db commented on code in PR #45977: URL: https://github.com/apache/spark/pull/45977#discussion_r1578365745 ## python/pyspark/sql/datasource_internal.py: ## @@ -0,0 +1,146 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578066857 ## spark-operator-api/src/main/java/org/apache/spark/k8s/operator/spec/ResourceRetentionPolicy.java: ## @@ -0,0 +1,26 @@ +/* + * Licensed to the

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578090933 ## spark-operator-api/src/main/java/org/apache/spark/k8s/operator/spec/RestartPolicy.java: ## @@ -0,0 +1,44 @@ +/* + * Licensed to the Apache

Re: [PR] [SPARK-47409][SQL] Add support for collation for StringTrim type of functions/expressions [spark]

2024-04-24 Thread via GitHub
davidm-db closed pull request #45749: [SPARK-47409][SQL] Add support for collation for StringTrim type of functions/expressions URL: https://github.com/apache/spark/pull/45749

Re: [PR] [SPARK-47597][STREAMING] Streaming: Migrate logInfo with variables to structured logging framework [spark]

2024-04-24 Thread via GitHub
dtenedor commented on PR #46192: URL: https://github.com/apache/spark/pull/46192#issuecomment-2075678628 Note that the single test failure is flaky and unrelated; the latest CI run otherwise passes all tests.

[PR] [SPARK-47939][SQL] Implement a new Analyzer rule to move ParameterizedQuery inside ExplainCommand and DescribeQueryCommand [spark]

2024-04-24 Thread via GitHub
vladimirg-db opened a new pull request, #46209: URL: https://github.com/apache/spark/pull/46209 ### What changes were proposed in this pull request? Mark `DescribeQueryCommand` and `ExplainCommand` as `SupervisingCommand` (they don't expose their wrapped nodes, but supervise them

Re: [PR] [SPARK-47583][CORE] SQL core: Migrate logError with variables to structured logging framework [spark]

2024-04-24 Thread via GitHub
gengliangwang closed pull request #45969: [SPARK-47583][CORE] SQL core: Migrate logError with variables to structured logging framework URL: https://github.com/apache/spark/pull/45969

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578058548 ## spark-operator-api/src/main/java/org/apache/spark/k8s/operator/spec/JDKVersion.java: ## @@ -0,0 +1,26 @@ +/* + * Licensed to the Apache Software

Re: [PR] [SPARK-47967][SQL] Make `JdbcUtils.makeGetter` handle reading time type as NTZ correctly [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #46201: URL: https://github.com/apache/spark/pull/46201#discussion_r1578098806 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala: ## @@ -498,6 +498,12 @@ object JdbcUtils extends Logging with

Re: [PR] [WIP][SPARK-24815] [CORE] Trigger Interval based DRA for Structured Streaming [spark]

2024-04-24 Thread via GitHub
pkotikalapudi commented on code in PR #42352: URL: https://github.com/apache/spark/pull/42352#discussion_r1578115234 ## core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala: ## @@ -669,6 +753,27 @@ private[spark] class ExecutorAllocationManager( private val

Re: [PR] [SPARK-47974][BUILD] Remove `install_scala` from `build/mvn` [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on PR #46204: URL: https://github.com/apache/spark/pull/46204#issuecomment-2075258406 Merged to master.

Re: [PR] [SPARK-47974][BUILD] Remove `install_scala` from `build/mvn` [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun closed pull request #46204: [SPARK-47974][BUILD] Remove `install_scala` from `build/mvn` URL: https://github.com/apache/spark/pull/46204

Re: [PR] [WIP][SPARK-24815] [CORE] Trigger Interval based DRA for Structured Streaming [spark]

2024-04-24 Thread via GitHub
pkotikalapudi commented on code in PR #42352: URL: https://github.com/apache/spark/pull/42352#discussion_r1578138588 ## core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala: ## @@ -916,8 +1040,13 @@ private[spark] class ExecutorAllocationManager(

Re: [PR] [SPARK-47960][SS] Allow chaining other stateful operators after transformWIthState operator. [spark]

2024-04-24 Thread via GitHub
anishshri-db commented on code in PR #45376: URL: https://github.com/apache/spark/pull/45376#discussion_r1578228218 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithStateChainingSuite.scala: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software

Re: [PR] [CONNECT] Use v1 as spark connect go library starting version [spark-connect-go]

2024-04-24 Thread via GitHub
viirya commented on code in PR #19: URL: https://github.com/apache/spark-connect-go/pull/19#discussion_r1578325614 ## internal/generated/expressions.pb.go: ## @@ -211,6 +211,7 @@ type Expression struct { // *Expression_UpdateFields_ //

Re: [PR] [CONNECT] Use v1 as spark connect go library starting version [spark-connect-go]

2024-04-24 Thread via GitHub
viirya commented on code in PR #19: URL: https://github.com/apache/spark-connect-go/pull/19#discussion_r1578326225 ## go.mod: ## @@ -49,10 +49,10 @@ require ( github.com/pmezard/go-difflib v1.0.0 // indirect github.com/zeebo/xxh3 v1.0.2 // indirect

Re: [PR] [CONNECT] Use v1 as spark connect go library starting version [spark-connect-go]

2024-04-24 Thread via GitHub
viirya commented on PR #19: URL: https://github.com/apache/spark-connect-go/pull/19#issuecomment-2075559200 Do we need to create a JIRA ticket for this? @hiboyang

Re: [PR] [CONNECT] Use v1 as spark connect go library starting version [spark-connect-go]

2024-04-24 Thread via GitHub
viirya commented on PR #19: URL: https://github.com/apache/spark-connect-go/pull/19#issuecomment-2075558029 maybe cc @HyukjinKwon?

Re: [PR] [SPARK-47955][SQL] Improve `DeduplicateRelations` performance [spark]

2024-04-24 Thread via GitHub
peter-toth commented on PR #46183: URL: https://github.com/apache/spark/pull/46183#issuecomment-2075265856 > Thank you, @peter-toth . However, it would be great if we can have a way to measure this improvement. Could you add some reproducible procedure in PR description? Or, do you think

Re: [PR] [SPARK-46841][SQL] Add collation support for ICU locales and collation specifiers [spark]

2024-04-24 Thread via GitHub
nikolamand-db commented on PR #46180: URL: https://github.com/apache/spark/pull/46180#issuecomment-2075286793 Please review, collation team: @dbatomic @stefankandic @uros-db @mihailom-db @stevomitric.

Re: [PR] [SPARK-47633][SQL][3.5] Include right-side plan output in `LateralJoin#allAttributes` for more consistent canonicalization [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun closed pull request #46190: [SPARK-47633][SQL][3.5] Include right-side plan output in `LateralJoin#allAttributes` for more consistent canonicalization URL: https://github.com/apache/spark/pull/46190

Re: [PR] [CONNECT] Use v1 as spark connect go library starting version [spark-connect-go]

2024-04-24 Thread via GitHub
hiboyang commented on PR #19: URL: https://github.com/apache/spark-connect-go/pull/19#issuecomment-2075513386 Yeah, Go module names normally put `v2`/`v3`/etc. at the end to indicate a major version bump, like `module example.com/mymodule/v2`; see the example at the end of
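To illustrate the convention the comment above refers to, here is a minimal `go.mod` fragment using the same hypothetical module path from the comment (not the actual spark-connect-go module declaration):

```
// go.mod for a hypothetical v2 release: the major version becomes part of the
// module path, so import paths change on a breaking release; v0 and v1 omit
// the suffix entirely.
module example.com/mymodule/v2

go 1.21
```

Starting the library at v1 therefore keeps the module path free of a version suffix, which is why the choice of starting version matters.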

Re: [PR] [CONNECT] Use v1 as spark connect go library starting version [spark-connect-go]

2024-04-24 Thread via GitHub
viirya commented on code in PR #19: URL: https://github.com/apache/spark-connect-go/pull/19#discussion_r1578324700 ## client/sql/dataframe.go: ## @@ -23,7 +23,7 @@ import ( "github.com/apache/arrow/go/v12/arrow" "github.com/apache/arrow/go/v12/arrow/array"

Re: [PR] [CONNECT] [SPARK-47842] Use v1 as spark connect go library starting version [spark-connect-go]

2024-04-24 Thread via GitHub
hiboyang commented on code in PR #19: URL: https://github.com/apache/spark-connect-go/pull/19#discussion_r1578559424 ## internal/generated/expressions.pb.go: ## @@ -211,6 +211,7 @@ type Expression struct { // *Expression_UpdateFields_ //

Re: [PR] [SPARK-47979][SQL][TESTS] Use Hive tables explicitly for Hive table capability tests [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #46211: URL: https://github.com/apache/spark/pull/46211#discussion_r1578613668 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala: ## @@ -2642,9 +2642,6 @@ class HiveDDLSuite sql("CREATE TABLE t3 (v

Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]

2024-04-24 Thread via GitHub
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578638859 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## @@ -16,393 +16,393 @@ */ package org.apache.spark.internal +trait ILogKey + /** *

Re: [PR] [SPARK-47979][SQL][TESTS] Use Hive tables explicitly for Hive table capability tests [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on PR #46211: URL: https://github.com/apache/spark/pull/46211#issuecomment-2076096134 Thank you so much, @viirya!

Re: [PR] [SPARK-47580][SQL] SQL catalyst: eliminate unnamed variables in error logs [spark]

2024-04-24 Thread via GitHub
cloud-fan commented on code in PR #46212: URL: https://github.com/apache/spark/pull/46212#discussion_r1578757845 ## sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala: ## @@ -629,7 +629,7 @@ private[sql] object QueryExecutionErrors extends

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578774317 ## spark-operator-api/src/main/java/org/apache/spark/k8s/operator/status/BaseStatus.java: ## @@ -0,0 +1,64 @@ +/* + * Licensed to the Apache

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578774826 ## spark-operator-api/src/main/java/org/apache/spark/k8s/operator/status/BaseStatus.java: ## @@ -0,0 +1,64 @@ +/* + * Licensed to the Apache

Re: [PR] [SPARK-47912][SQL] Infer serde class from format classes [spark]

2024-04-24 Thread via GitHub
yaooqinn commented on code in PR #46132: URL: https://github.com/apache/spark/pull/46132#discussion_r1578788071 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala: ## @@ -681,11 +681,14 @@ class SparkSqlAstBuilder extends AstBuilder { } else {

Re: [PR] [SPARK-47912][SQL] Infer serde class from format classes [spark]

2024-04-24 Thread via GitHub
wForget commented on code in PR #46132: URL: https://github.com/apache/spark/pull/46132#discussion_r1578808475 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala: ## @@ -681,11 +681,14 @@ class SparkSqlAstBuilder extends AstBuilder { } else {

Re: [PR] [WIP][SPARK-47975][SQL] Add Collation Support for trim/ltrim/rtrim [spark]

2024-04-24 Thread via GitHub
panbingkun commented on PR #46205: URL: https://github.com/apache/spark/pull/46205#issuecomment-2075910662 > FYI, I think this is already in progress https://issues.apache.org/jira/browse/SPARK-47409 > > @panbingkun https://issues.apache.org/jira/browse/SPARK-47975 seems like a

Re: [PR] [WIP][SPARK-47975][SQL] Add Collation Support for trim/ltrim/rtrim [spark]

2024-04-24 Thread via GitHub
panbingkun closed pull request #46205: [WIP][SPARK-47975][SQL] Add Collation Support for trim/ltrim/rtrim URL: https://github.com/apache/spark/pull/46205

Re: [PR] [CONNECT] [SPARK-47842] Use v1 as spark connect go library starting version [spark-connect-go]

2024-04-24 Thread via GitHub
hiboyang commented on PR #19: URL: https://github.com/apache/spark-connect-go/pull/19#issuecomment-2075911147 Thanks @viirya for the review! @HyukjinKwon @grundprinzip do you have time to look at this PR?

Re: [PR] [SPARK-47977] DateTimeUtils.timestampDiff and DateTimeUtils.timestampAdd should not throw INTERNAL_ERROR exception [spark]

2024-04-24 Thread via GitHub
vitaliili-db commented on PR #46210: URL: https://github.com/apache/spark/pull/46210#issuecomment-2075911376 @MaxGekk please take a look.

[PR] [SPARK-47977] DateTimeUtils.timestampDiff and DateTimeUtils.timestampAdd should not throw INTERNAL_ERROR exception [spark]

2024-04-24 Thread via GitHub
vitaliili-db opened a new pull request, #46210: URL: https://github.com/apache/spark/pull/46210 ### What changes were proposed in this pull request? Convert `INTERNAL_ERROR` for `timestampAdd` and `timestampDiff` to error with class. Reusing

Re: [PR] [SPARK-47960][SS] Allow chaining other stateful operators after transformWIthState operator. [spark]

2024-04-24 Thread via GitHub
sahnib commented on code in PR #45376: URL: https://github.com/apache/spark/pull/45376#discussion_r1578604240 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithStateChainingSuite.scala: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software Foundation

[PR] [SPARK-47979][SQL][TESTS] Use Hive tables explicitly for Hive table capability tests [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun opened a new pull request, #46211: URL: https://github.com/apache/spark/pull/46211 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]

2024-04-24 Thread via GitHub
gengliangwang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578608646 ## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala: ## @@ -144,6 +146,22 @@ trait LoggingSuiteBase } } + // An

Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]

2024-04-24 Thread via GitHub
gengliangwang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578624300 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## @@ -16,393 +16,393 @@ */ package org.apache.spark.internal +trait ILogKey + /**

Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]

2024-04-24 Thread via GitHub
gengliangwang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578624819 ## common/utils/src/test/scala/org/apache/spark/util/StructuredLoggingSuite.scala: ## @@ -144,6 +146,22 @@ trait LoggingSuiteBase } } + // An

Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]

2024-04-24 Thread via GitHub
gengliangwang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578623991 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## @@ -16,393 +16,393 @@ */ package org.apache.spark.internal +trait ILogKey Review

Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]

2024-04-24 Thread via GitHub
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578641891 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## @@ -16,393 +16,393 @@ */ package org.apache.spark.internal +trait ILogKey + /** *
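The diff fragments above only show the added `trait ILogKey` line. As a rough, hypothetical illustration of the extensibility idea under discussion (a shared log-key abstraction so projects outside Spark can define their own keys), the names and shapes below are illustrative only, not Spark's actual API:

```java
import java.util.Locale;

// Hypothetical sketch of the SPARK-47963 idea: a shared log-key interface
// that both the framework's own keys and third-party keys implement.
interface ILogKey {
    String name();
    // Render the key in lower-case, as structured-logging backends often do.
    default String rendered() { return name().toLowerCase(Locale.ROOT); }
}

// Keys shipped with the framework itself (illustrative subset).
// An enum's built-in name() satisfies the interface.
enum SparkLogKeys implements ILogKey { EXECUTOR_ID, TASK_ID }

// A downstream ecosystem project adds its own keys without touching the
// framework's enum.
enum ConnectorLogKeys implements ILogKey { SHARD_ID }

// A minimal MDC-style wrapper that accepts any ILogKey implementation.
final class Mdc {
    final ILogKey key;
    final Object value;
    Mdc(ILogKey key, Object value) { this.key = key; this.value = value; }
    @Override public String toString() { return key.rendered() + "=" + value; }
}

public class LogKeyDemo {
    public static void main(String[] args) {
        System.out.println(new Mdc(SparkLogKeys.EXECUTOR_ID, 7));    // executor_id=7
        System.out.println(new Mdc(ConnectorLogKeys.SHARD_ID, "a")); // shard_id=a
    }
}
```

Because the MDC wrapper is typed against the interface rather than a concrete enum, external code compiles against a stable abstraction while the framework keeps its own key set closed.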

Re: [PR] [SPARK-42846][SQL] Integrate _LEGACY_ERROR_TEMP_2011 into UNEXPECTED_DATA_TYPE [spark]

2024-04-24 Thread via GitHub
HiuKwok commented on PR #45786: URL: https://github.com/apache/spark/pull/45786#issuecomment-2076080001 @MaxGekk Hi, would you mind having a look at this? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578672802 ## spark-operator-api/src/main/java/org/apache/spark/k8s/operator/spec/JDKVersion.java: ## @@ -0,0 +1,25 @@ +/* + * Licensed to the Apache Software

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578766652 ## spark-operator-api/src/main/java/org/apache/spark/k8s/operator/spec/RuntimeVersions.java: ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578772843 ## spark-operator-api/src/main/java/org/apache/spark/k8s/operator/status/BaseStatus.java: ## @@ -0,0 +1,64 @@ +/* + * Licensed to the Apache

Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]

2024-04-24 Thread via GitHub
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578772766 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/OffsetSeq.scala: ## @@ -21,12 +21,12 @@ import org.json4s.{Formats, NoTypeHints} import

Re: [PR] [SPARK-47967][SQL] Make `JdbcUtils.makeGetter` handle reading time type as NTZ correctly [spark]

2024-04-24 Thread via GitHub
yaooqinn closed pull request #46201: [SPARK-47967][SQL] Make `JdbcUtils.makeGetter` handle reading time type as NTZ correctly URL: https://github.com/apache/spark/pull/46201

Re: [PR] [SPARK-47967][SQL] Make `JdbcUtils.makeGetter` handle reading time type as NTZ correctly [spark]

2024-04-24 Thread via GitHub
yaooqinn commented on PR #46201: URL: https://github.com/apache/spark/pull/46201#issuecomment-2076239188 Thank you @dongjoon-hyun Merged to master
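The merged PR above concerns reading a JDBC TIME column as a TIMESTAMP_NTZ value. As a hedged sketch of the underlying conversion (`timeToNtz` is a hypothetical helper, not Spark's actual `JdbcUtils` code), the key point is to go through the wall-clock `LocalTime` rather than `Time.getTime()` epoch millis, which would re-apply the JVM time zone:

```java
import java.sql.Time;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

public class TimeAsNtzDemo {
    // Hypothetical helper: map a JDBC TIME value onto a timezone-less
    // timestamp on the epoch date. Time.toLocalTime() preserves the
    // wall-clock fields, which is what an NTZ type should carry.
    static LocalDateTime timeToNtz(Time t) {
        LocalTime wallClock = t.toLocalTime();
        return LocalDateTime.of(LocalDate.EPOCH, wallClock);
    }

    public static void main(String[] args) {
        Time t = Time.valueOf("13:30:15");
        System.out.println(timeToNtz(t)); // 1970-01-01T13:30:15
    }
}
```

The same wall-clock value comes out regardless of the JVM's default time zone, which is the correctness property an NTZ reader needs.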

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578777664 ## spark-operator-api/src/main/java/org/apache/spark/k8s/operator/utils/ModelUtils.java: ## @@ -0,0 +1,115 @@ +/* + * Licensed to the Apache

Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]

2024-04-24 Thread via GitHub
panbingkun commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578780993 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala: ## @@ -20,15 +20,15 @@ package org.apache.spark.sql.catalyst.optimizer

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578781776 ## spark-operator-api/src/main/java/org/apache/spark/k8s/operator/utils/ModelUtils.java: ## @@ -0,0 +1,115 @@ +/* + * Licensed to the Apache

Re: [PR] [SPARK-47963][CORE] Make the external Spark ecosystem can use structured logging mechanisms [spark]

2024-04-24 Thread via GitHub
LuciferYang commented on code in PR #46193: URL: https://github.com/apache/spark/pull/46193#discussion_r1578786930 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## @@ -16,399 +16,399 @@ */ package org.apache.spark.internal +trait LogKey Review

Re: [PR] [WIP] Relax the type of `MDC#key` [spark]

2024-04-24 Thread via GitHub
LuciferYang closed pull request #46186: [WIP] Relax the type of `MDC#key` URL: https://github.com/apache/spark/pull/46186

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578793732 ## spark-operator-api/src/test/java/org/apache/spark/k8s/operator/spec/RestartPolicyTest.java: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578793511 ## spark-operator-api/src/test/java/org/apache/spark/k8s/operator/spec/RestartPolicyTest.java: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache

Re: [PR] [SPARK-47979][SQL][TESTS] Use Hive tables explicitly for Hive table capability tests [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on PR #46211: URL: https://github.com/apache/spark/pull/46211#issuecomment-2075998795 Could you review this PR, @viirya ?

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578668674 ## spark-operator-api/src/main/java/org/apache/spark/kubernetes/operator/status/ApplicationAttemptSummary.java: ## @@ -0,0 +1,40 @@ +/* + * Licensed

Re: [PR] [SPARK-47979][SQL][TESTS] Use Hive tables explicitly for Hive table capability tests [spark]

2024-04-24 Thread via GitHub
dongjoon-hyun closed pull request #46211: [SPARK-47979][SQL][TESTS] Use Hive tables explicitly for Hive table capability tests URL: https://github.com/apache/spark/pull/46211

Re: [PR] [SPARK-47950] Add Java API Module for Spark Operator [spark-kubernetes-operator]

2024-04-24 Thread via GitHub
dongjoon-hyun commented on code in PR #8: URL: https://github.com/apache/spark-kubernetes-operator/pull/8#discussion_r1578765373 ## spark-operator-api/src/main/java/org/apache/spark/k8s/operator/spec/RuntimeVersions.java: ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache
