[PR] [SPARK-46051][INFRA] Cache python deps for linter and documentation [spark]

2023-11-21 Thread via GitHub
zhengruifeng opened a new pull request, #43953: URL: https://github.com/apache/spark/pull/43953 ### What changes were proposed in this pull request? Cache python deps for linter and documentation ### Why are the changes needed? 1, to avoid unnecessary installation: some

Re: [PR] [SPARK-46051][INFRA] Cache python deps for linter and documentation [spark]

2023-11-21 Thread via GitHub
zhengruifeng commented on code in PR #43953: URL: https://github.com/apache/spark/pull/43953#discussion_r1401642302 ## .github/workflows/build_and_test.yml: ## @@ -689,15 +689,6 @@ jobs: # Should delete this section after SPARK 3.5 EOL. python3.9 -m pip

Re: [PR] [SPARK-46050][SQL]: Allow additional rules for the Substitution batch [spark]

2023-11-21 Thread via GitHub
nastra commented on PR #43952: URL: https://github.com/apache/spark/pull/43952#issuecomment-1822250414 /cc @jzhuge @aokolnychyi @holdenk -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[PR] [SPARK-46050][SQL]: Allow additional rules for the Substitution batch [spark]

2023-11-21 Thread via GitHub
nastra opened a new pull request, #43952: URL: https://github.com/apache/spark/pull/43952 ### What changes were proposed in this pull request? Analyzer improvement that allows providing additional rules for the Substitution batch ### Why are the changes needed?

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
laglangyue commented on PR #43526: URL: https://github.com/apache/spark/pull/43526#issuecomment-1822242366 test local in idea for `SparkSQLExample` ![image](https://github.com/apache/spark/assets/35491928/f7028b96-f449-4102-88c7-f7939590c88e) -- This is an automated message from

Re: [PR] [SPARK-46038][BUILD] Upgrade log4j2 to 2.22.0 [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on PR #43940: URL: https://github.com/apache/spark/pull/43940#issuecomment-1822238182 Thanks @dongjoon-hyun -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-46035][BUILD] Upgrade zstd-jni to 1.5.5-10 [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on PR #43937: URL: https://github.com/apache/spark/pull/43937#issuecomment-1822237912 Thanks @dongjoon-hyun -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-45807][SQL]: Add createOrReplaceView(..) / replaceView(..) to ViewCatalog [spark]

2023-11-21 Thread via GitHub
nastra commented on PR #43677: URL: https://github.com/apache/spark/pull/43677#issuecomment-1822236908 thanks for reviewing @aokolnychyi, I've applied your suggestion -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] [SPARK-46047][BUILD][SQL] Fix the usage of Scala 2.13 deprecated APIs and make the usage of Scala 2.13 deprecated APIs as compilation error [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on PR #43950: URL: https://github.com/apache/spark/pull/43950#issuecomment-1822236512 Thanks @dongjoon-hyun @HyukjinKwon ~ I will merge this PR as soon as possible and rebase the code locally before merging to perform another compilation check -- This is an

Re: [PR] [SPARK-46047][BUILD] Remove the compilation suppression rule related to the Scala 2.13 deprecated APIs usage [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun commented on PR #43950: URL: https://github.com/apache/spark/pull/43950#issuecomment-1822218576 Thank you! > There should not be many bad cases left, Ok, I will fix the remaining parts in this PR and convert this rule into compilation error. -- This is an

Re: [PR] [SPARK-46047][BUILD] Remove the compilation suppression rule related to the Scala 2.13 deprecated APIs usage [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on PR #43950: URL: https://github.com/apache/spark/pull/43950#issuecomment-1822218219 There should not be many bad cases left, Ok, I will fix the remaining parts in this PR and convert this rule into compilation error. -- This is an automated message from

Re: [PR] [SPARK-46006][YARN] YarnAllocator miss clean targetNumExecutorsPerResourceProfileId after YarnSchedulerBackend call stop [spark]

2023-11-21 Thread via GitHub
AngersZh commented on code in PR #43906: URL: https://github.com/apache/spark/pull/43906#discussion_r1401593709 ## resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala: ## @@ -384,19 +384,27 @@ private[yarn] class YarnAllocator(

Re: [PR] [SPARK-46048][PYTHON][SQL] Support DataFrame.groupingSets in PySpark [spark]

2023-11-21 Thread via GitHub
HyukjinKwon closed pull request #43951: [SPARK-46048][PYTHON][SQL] Support DataFrame.groupingSets in PySpark URL: https://github.com/apache/spark/pull/43951 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] [SPARK-46048][PYTHON][SQL] Support DataFrame.groupingSets in PySpark [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on PR #43951: URL: https://github.com/apache/spark/pull/43951#issuecomment-1822213625 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-46006][YARN] YarnAllocator miss clean targetNumExecutorsPerResourceProfileId after YarnSchedulerBackend call stop [spark]

2023-11-21 Thread via GitHub
yaooqinn commented on code in PR #43906: URL: https://github.com/apache/spark/pull/43906#discussion_r1401587275 ## resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala: ## @@ -384,19 +384,27 @@ private[yarn] class YarnAllocator(

Re: [PR] [SPARK-46047][BUILD] Remove the compilation suppression rule related to the Scala 2.13 deprecated APIs usage [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on PR #43950: URL: https://github.com/apache/spark/pull/43950#issuecomment-1822207316 Yeah, in principle I agree with making it fail .. just wonder how much we should fix them ... If we should, probably we should just fix them in one go ... -- This is an automated

Re: [PR] [SPARK-45833][SS][DOCS] Document the new introduction of state data source [spark]

2023-11-21 Thread via GitHub
anishshri-db commented on code in PR #43920: URL: https://github.com/apache/spark/pull/43920#discussion_r1401583755 ## docs/structured-streaming-programming-guide.md: ## @@ -2452,6 +2452,14 @@ Specifically for built-in HDFS state store provider, users can check the state s it

Re: [PR] [SPARK-45833][SS][DOCS] Document the new introduction of state data source [spark]

2023-11-21 Thread via GitHub
anishshri-db commented on code in PR #43920: URL: https://github.com/apache/spark/pull/43920#discussion_r1401583560 ## docs/structured-streaming-programming-guide.md: ## @@ -2452,6 +2452,14 @@ Specifically for built-in HDFS state store provider, users can check the state s it

Re: [PR] [SPARK-45833][SS][DOCS] Document the new introduction of state data source [spark]

2023-11-21 Thread via GitHub
anishshri-db commented on code in PR #43920: URL: https://github.com/apache/spark/pull/43920#discussion_r1401582848 ## docs/structured-streaming-programming-guide.md: ## @@ -2452,6 +2452,14 @@ Specifically for built-in HDFS state store provider, users can check the state s it

Re: [PR] [SPARK-46021][CORE] Support cancel future jobs belonging to a job group [spark]

2023-11-21 Thread via GitHub
beliefer commented on code in PR #43926: URL: https://github.com/apache/spark/pull/43926#discussion_r1401569798 ## core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala: ## @@ -1264,14 +1280,24 @@ private[spark] class DAGScheduler(
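For context on the job-group API being extended in this PR, here is a minimal PySpark sketch of the existing calls (the group name and description are illustrative assumptions); today `cancelJobGroup` affects jobs already submitted, and the proposal is to optionally also cancel jobs submitted to the group afterwards.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

# Tag all jobs submitted from this thread with a group id; interruptOnCancel
# asks executors to interrupt running tasks when the group is cancelled.
sc.setJobGroup("reporting", "nightly reporting jobs", interruptOnCancel=True)
sc.parallelize(range(1000)).count()

# Cancels the group's active jobs; the PR under review adds the option to keep
# the cancellation in effect for jobs submitted to the same group later on.
sc.cancelJobGroup("reporting")
```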

Re: [PR] [SPARK-45656][SQL] Fix observation when named observations with the same name on different datasets [spark]

2023-11-21 Thread via GitHub
cloud-fan commented on PR #43519: URL: https://github.com/apache/spark/pull/43519#issuecomment-1822188529 Oh sorry, I was a bit confused as well. I think it's because of self-joins that we don't require the observation name to be unique. -- This is an automated message from the Apache Git
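For readers skimming the thread, a small PySpark sketch of the scenario under discussion: two independently created named observations that happen to share a name, attached to different datasets (names, metrics, and the printed values are illustrative; prior to this fix the reported values could get mixed up).

```python
from pyspark.sql import Observation, SparkSession
from pyspark.sql import functions as sf

spark = SparkSession.builder.getOrCreate()
df = spark.range(10)

# Two observations that reuse the same name on different datasets.
obs_a = Observation("metrics")
obs_b = Observation("metrics")

observed_a = df.observe(obs_a, sf.count(sf.lit(1)).alias("rows"))
observed_b = df.filter("id % 2 = 0").observe(obs_b, sf.count(sf.lit(1)).alias("rows"))

observed_a.collect()
observed_b.collect()

# Each observation should report the metrics of its own dataset.
print(obs_a.get)  # e.g. {'rows': 10}
print(obs_b.get)  # e.g. {'rows': 5}
```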

Re: [PR] [SPARK-46043][SQL] Support create table using DSv2 sources [spark]

2023-11-21 Thread via GitHub
cloud-fan commented on code in PR #43949: URL: https://github.com/apache/spark/pull/43949#discussion_r1401567348 ## sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2Suite.scala: ## @@ -633,6 +633,38 @@ class DataSourceV2Suite extends QueryTest with

Re: [PR] [SPARK-46043][SQL] Support create table using DSv2 sources [spark]

2023-11-21 Thread via GitHub
cloud-fan commented on code in PR #43949: URL: https://github.com/apache/spark/pull/43949#discussion_r1401566798 ## sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2Suite.scala: ## @@ -633,6 +633,38 @@ class DataSourceV2Suite extends QueryTest with

Re: [PR] [SPARK-46043][SQL] Support create table using DSv2 sources [spark]

2023-11-21 Thread via GitHub
cloud-fan commented on code in PR #43949: URL: https://github.com/apache/spark/pull/43949#discussion_r1401565526 ## sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2Suite.scala: ## @@ -633,6 +633,38 @@ class DataSourceV2Suite extends QueryTest with

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
laglangyue commented on code in PR #43526: URL: https://github.com/apache/spark/pull/43526#discussion_r1401554599 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala: ## @@ -47,16 +46,16 @@ import org.apache.spark.util.ArrayImplicits._ *

Re: [PR] [SPARK-46029][SQL] Escape the single quote, `_` and `%` for DS V2 pushdown [spark]

2023-11-21 Thread via GitHub
beliefer commented on PR #43801: URL: https://github.com/apache/spark/pull/43801#issuecomment-1822182897 @cloud-fan The GA failure is unrelated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [SPARK-46021][CORE] Support cancel future jobs belonging to a job group [spark]

2023-11-21 Thread via GitHub
cloud-fan commented on code in PR #43926: URL: https://github.com/apache/spark/pull/43926#discussion_r1401555960 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -2961,6 +2961,12 @@ ], "sqlState" : "42601" }, + "SPARK_JOB_CANCELLED" : { +

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
laglangyue commented on code in PR #43526: URL: https://github.com/apache/spark/pull/43526#discussion_r1401554599 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala: ## @@ -47,16 +46,16 @@ import org.apache.spark.util.ArrayImplicits._ *

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
laglangyue commented on code in PR #43526: URL: https://github.com/apache/spark/pull/43526#discussion_r1401545560 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala: ## @@ -47,16 +46,16 @@ import org.apache.spark.util.ArrayImplicits._ *

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
laglangyue commented on code in PR #43526: URL: https://github.com/apache/spark/pull/43526#discussion_r1401539215 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala: ## @@ -140,7 +137,9 @@ abstract class CompactibleFileStreamLog[T

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
laglangyue commented on code in PR #43526: URL: https://github.com/apache/spark/pull/43526#discussion_r1401536784 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala: ## @@ -35,7 +35,6 @@ import

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
laglangyue commented on code in PR #43526: URL: https://github.com/apache/spark/pull/43526#discussion_r1401536527 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala: ## @@ -47,16 +46,16 @@ import org.apache.spark.util.ArrayImplicits._ *

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
laglangyue commented on code in PR #43526: URL: https://github.com/apache/spark/pull/43526#discussion_r1401536364 ## resource-managers/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala: ## @@ -64,14 +64,14 @@ private[spark] abstract class

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
laglangyue commented on code in PR #43526: URL: https://github.com/apache/spark/pull/43526#discussion_r1401534482 ## core/src/test/scala/org/apache/spark/SparkContextSuite.scala: ## @@ -32,7 +32,7 @@ import org.apache.hadoop.io.{BytesWritable, LongWritable, Text} import

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
laglangyue commented on code in PR #43526: URL: https://github.com/apache/spark/pull/43526#discussion_r1401534260 ## mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeansModel.scala: ## @@ -188,7 +188,7 @@ object BisectingKMeansModel extends

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
laglangyue commented on code in PR #43526: URL: https://github.com/apache/spark/pull/43526#discussion_r1401534708 ## core/src/test/scala/org/apache/spark/deploy/worker/WorkerSuite.scala: ## @@ -23,7 +23,7 @@ import java.util.function.Supplier import

Re: [PR] [SPARK-46035][BUILD] Upgrade zstd-jni to 1.5.5-10 [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun commented on PR #43937: URL: https://github.com/apache/spark/pull/43937#issuecomment-1822147196 Merged to master. Thank you, @LuciferYang . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] [SPARK-46038][BUILD] Upgrade log4j2 to 2.22.0 [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun commented on PR #43940: URL: https://github.com/apache/spark/pull/43940#issuecomment-1822146950 Merged to master. Thank you, @LuciferYang . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] [SPARK-46045][DOCS][PS] Add individual categories for `Options and settings` to API reference [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun commented on PR #43948: URL: https://github.com/apache/spark/pull/43948#issuecomment-1822146678 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-46045][DOCS][PS] Add individual categories for `Options and settings` to API reference [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun closed pull request #43948: [SPARK-46045][DOCS][PS] Add individual categories for `Options and settings` to API reference URL: https://github.com/apache/spark/pull/43948 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] [SPARK-46035][BUILD] Upgrade zstd-jni to 1.5.5-10 [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun commented on code in PR #43937: URL: https://github.com/apache/spark/pull/43937#discussion_r1401527611 ## core/benchmarks/ZStandardBenchmark-results.txt: ## @@ -2,26 +2,26 @@ Benchmark ZStandardCompressionCodec

Re: [PR] [SPARK-46038][BUILD] Upgrade log4j2 to 2.22.0 [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun closed pull request #43940: [SPARK-46038][BUILD] Upgrade log4j2 to 2.22.0 URL: https://github.com/apache/spark/pull/43940 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-46035][BUILD] Upgrade zstd-jni to 1.5.5-10 [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun closed pull request #43937: [SPARK-46035][BUILD] Upgrade zstd-jni to 1.5.5-10 URL: https://github.com/apache/spark/pull/43937 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] [SPARK-46035][BUILD] Upgrade zstd-jni to 1.5.5-10 [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun commented on code in PR #43937: URL: https://github.com/apache/spark/pull/43937#discussion_r1401526447 ## core/benchmarks/ZStandardBenchmark-results.txt: ## @@ -2,26 +2,26 @@ Benchmark ZStandardCompressionCodec

Re: [PR] [SPARK-46048][PYTHON][SQL] Support DataFrame.groupingSets in PySpark [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on PR #43951: URL: https://github.com/apache/spark/pull/43951#issuecomment-1822120965 cc @zhengruifeng -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[PR] [SPARK-46048][PYTHON][SQL] Support DataFrame.groupingSets in PySpark [spark]

2023-11-21 Thread via GitHub
HyukjinKwon opened a new pull request, #43951: URL: https://github.com/apache/spark/pull/43951 ### What changes were proposed in this pull request? https://github.com/apache/spark/pull/43813 added Scala API of `DataFrame.groupingSets`. This PR proposes to have the same API in PySpark
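A minimal usage sketch, assuming the PySpark method mirrors the Scala signature added in #43813 (a list of grouping sets followed by the grouping columns); the sample data and the SQL in the comment are illustrative only.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as sf

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [("dotNET", 2012, 10000), ("Java", 2012, 20000), ("dotNET", 2013, 48000)],
    ["course", "year", "earnings"],
)

# Roughly equivalent to:
#   SELECT course, year, SUM(earnings) FROM courses
#   GROUP BY GROUPING SETS ((course), (year))
(df.groupingSets([[df.course], [df.year]], df.course, df.year)
   .agg(sf.sum(df.earnings).alias("sum_earnings"))
   .orderBy("course", "year")
   .show())
```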

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on PR #43526: URL: https://github.com/apache/spark/pull/43526#issuecomment-1822114074 Since Spark 4.0 currently does not support Scala 3, and it is not a compilation error in Scala 3, I think we can remove the part related to Scala 3 from the pr description. --

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on PR #43526: URL: https://github.com/apache/spark/pull/43526#issuecomment-1822111798 Got -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [SPARK-45556][UI] Allow web page respond customized status code and message through WebApplicationException [spark]

2023-11-21 Thread via GitHub
kuwii commented on PR #43646: URL: https://github.com/apache/spark/pull/43646#issuecomment-1822106783 Also kindly ping @srowen @dongjoon-hyun. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
laglangyue commented on PR #43526: URL: https://github.com/apache/spark/pull/43526#issuecomment-1822095997 > > > Thank you very much @laglangyue, I will review this PR as soon as possible > > > > > I created a new project and tested the code below, and `sbt compile` does not cause an error,

Re: [PR] [SPARK-46047][BUILD] Remove the compilation suppression rule related to the Scala 2.13 deprecated APIs usage [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on PR #43950: URL: https://github.com/apache/spark/pull/43950#issuecomment-1822092479 cc @srowen @HyukjinKwon @dongjoon-hyun Another option is to change this compilation rule to ```

[PR] [SPARK-46047][BUILD] Remove the compilation suppression rule related to the Scala 2.13 deprecated APIs usage [spark]

2023-11-21 Thread via GitHub
LuciferYang opened a new pull request, #43950: URL: https://github.com/apache/spark/pull/43950 ### What changes were proposed in this pull request? After some subtasks of SPARK-45314 were completed, there is almost no usage of deprecated Scala 2.13 APIs left in the current Spark code. Therefore,

Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on PR #43603: URL: https://github.com/apache/spark/pull/43603#issuecomment-1822059435 Merged into master for Spark 4.0. Thanks @panbingkun -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

2023-11-21 Thread via GitHub
LuciferYang closed pull request #43603: [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` URL: https://github.com/apache/spark/pull/43603 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] [SPARK-45696][CORE] Fix method tryCompleteWith in trait Promise is deprecated [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on PR #43556: URL: https://github.com/apache/spark/pull/43556#issuecomment-1822050721 Merged into master for Spark 4.0. Thanks @zhaomin1423 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] [SPARK-45696][CORE] Fix method tryCompleteWith in trait Promise is deprecated [spark]

2023-11-21 Thread via GitHub
LuciferYang closed pull request #43556: [SPARK-45696][CORE] Fix method tryCompleteWith in trait Promise is deprecated URL: https://github.com/apache/spark/pull/43556 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] [SPARK-46039][BUILD][CONNECT] Upgrade `grpcio*` to 1.59.3 for Python 3.12 [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on PR #43942: URL: https://github.com/apache/spark/pull/43942#issuecomment-1822045375 late LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [SPARK-46043][SQL] Support create table using DSv2 sources [spark]

2023-11-21 Thread via GitHub
allisonwang-db commented on PR #43949: URL: https://github.com/apache/spark/pull/43949#issuecomment-1822044270 cc @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-46035][BUILD] Upgrade zstd-jni to 1.5.5-10 [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on code in PR #43937: URL: https://github.com/apache/spark/pull/43937#discussion_r1401467919 ## core/benchmarks/ZStandardBenchmark-results.txt: ## @@ -2,26 +2,26 @@ Benchmark ZStandardCompressionCodec

[PR] [SPARK-46043][SQL] Support create table using DSv2 sources [spark]

2023-11-21 Thread via GitHub
allisonwang-db opened a new pull request, #43949: URL: https://github.com/apache/spark/pull/43949 ### What changes were proposed in this pull request? This PR supports `CREATE TABLE ... USING source` for DSv2 sources. ### Why are the changes needed? To support
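A hedged sketch of the statement shape this PR enables; `com.example.MyTableProvider` is a hypothetical DSv2 `TableProvider` implementation standing in for whatever source is registered on the classpath.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Create a table backed by a (hypothetical) DataSource V2 provider rather than
# a built-in file source; the provider class must be available on the classpath.
spark.sql("""
    CREATE TABLE dsv2_tbl (id INT, name STRING)
    USING com.example.MyTableProvider
""")
spark.sql("DESCRIBE TABLE EXTENDED dsv2_tbl").show(truncate=False)
```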

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401465887 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1822,7 +1822,7 @@ class SparkContext(config: SparkConf) extends Logging { logInfo(s"Added
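For context on the behavior under review, a minimal sketch of `SparkContext.addFile` plus `SparkFiles.get` (the file path is a hypothetical example); the PR asks that the added file also be copied to the local root path, not only made resolvable on executors.

```python
from pyspark import SparkFiles
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

# "/tmp/config.json" is a hypothetical local file; addFile distributes it so
# that tasks can resolve a local copy through SparkFiles.get.
sc.addFile("/tmp/config.json")

def read_config(_):
    with open(SparkFiles.get("config.json")) as f:
        return f.readline()

print(sc.parallelize([0], 1).map(read_config).collect())
```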

Re: [PR] [SPARK-46012][CORE][FOLLOWUP] Invoke `fs.listStatus` once and reuse the result [spark]

2023-11-21 Thread via GitHub
mridulm commented on PR #43944: URL: https://github.com/apache/spark/pull/43944#issuecomment-1822031693 Thanks for fixing this @dongjoon-hyun ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [SPARK-45856] Move ArtifactManager from Spark Connect into SparkSession (sql/core) [spark]

2023-11-21 Thread via GitHub
fhalde commented on PR #43735: URL: https://github.com/apache/spark/pull/43735#issuecomment-1822029844 Hi, we are super interested in having an isolated classloader per Spark session for our use case. I believe this is only achievable today if jobs are run from a Connect client. We

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on PR #43526: URL: https://github.com/apache/spark/pull/43526#issuecomment-1822026592 > > Thank you very much @laglangyue, I will review this PR as soon as possible > > > > I created a new project and tested the code below, and `sbt compile` does not cause an error, why

Re: [PR] [SPARK-45629][CORE][SQL][CONNECT][ML][STREAMING][BUILD][EXAMPLES]Fix `Implicit definition should have explicit type` [spark]

2023-11-21 Thread via GitHub
laglangyue commented on PR #43526: URL: https://github.com/apache/spark/pull/43526#issuecomment-1822024412 > Thank you very much @laglangyue, I will review this PR as soon as possible > > > > I created a new project and tested the code below, and `sbt compile` does not cause an error, why

Re: [PR] [SPARK-46031][SQL] Replace `!Optional.isPresent()` with `Optional.isEmpty()` [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on PR #43931: URL: https://github.com/apache/spark/pull/43931#issuecomment-1822018446 Thanks @dongjoon-hyun -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
AngersZh commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401448409 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1836,7 +1836,7 @@ class SparkContext(config: SparkConf) extends Logging { val uriToUse =

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
AngersZh commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401447578 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1822,7 +1822,7 @@ class SparkContext(config: SparkConf) extends Logging {

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
AngersZh commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401447858 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1836,7 +1836,7 @@ class SparkContext(config: SparkConf) extends Logging { val uriToUse =

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
AngersZh commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401447419 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1822,7 +1822,7 @@ class SparkContext(config: SparkConf) extends Logging {

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401445511 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1822,7 +1822,7 @@ class SparkContext(config: SparkConf) extends Logging { logInfo(s"Added

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401443129 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1836,7 +1836,7 @@ class SparkContext(config: SparkConf) extends Logging { val uriToUse =

Re: [PR] [SPARK-45807][SQL]: Add createOrReplaceView(..) / replaceView(..) to ViewCatalog [spark]

2023-11-21 Thread via GitHub
aokolnychyi commented on code in PR #43677: URL: https://github.com/apache/spark/pull/43677#discussion_r1401440744 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java: ## @@ -140,6 +140,87 @@ View createView( String[] columnComments,

Re: [PR] [SPARK-45807][SQL]: Add createOrReplaceView(..) / replaceView(..) to ViewCatalog [spark]

2023-11-21 Thread via GitHub
aokolnychyi commented on PR #43677: URL: https://github.com/apache/spark/pull/43677#issuecomment-1821999176 Will review today. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-46012][CORE][FOLLOWUP] Invoke `fs.listStatus` once and reuse the result [spark]

2023-11-21 Thread via GitHub
LuciferYang commented on PR #43944: URL: https://github.com/apache/spark/pull/43944#issuecomment-1821990625 late LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
AngersZh commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401427353 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1822,7 +1822,7 @@ class SparkContext(config: SparkConf) extends Logging {

Re: [PR] [SPARK-46022][SPARK-46017][DOCS][PYTHON] Remove deprecated function APIs from documents [spark]

2023-11-21 Thread via GitHub
HyukjinKwon closed pull request #43932: [SPARK-46022][SPARK-46017][DOCS][PYTHON] Remove deprecated function APIs from documents URL: https://github.com/apache/spark/pull/43932 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] [SPARK-46013][PYTHON][SQL][DOCS] Improve data source load and save functions docs page [spark]

2023-11-21 Thread via GitHub
HyukjinKwon closed pull request #43917: [SPARK-46013][PYTHON][SQL][DOCS] Improve data source load and save functions docs page URL: https://github.com/apache/spark/pull/43917 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] [SPARK-46022][SPARK-46017][DOCS][PYTHON] Remove deprecated function APIs from documents [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on PR #43932: URL: https://github.com/apache/spark/pull/43932#issuecomment-1821984339 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-46013][PYTHON][SQL][DOCS] Improve data source load and save functions docs page [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on PR #43917: URL: https://github.com/apache/spark/pull/43917#issuecomment-1821983343 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[PR] [SPARK-46045][DOCS][PS] Add individual categories for `Options and settings` to API reference [spark]

2023-11-21 Thread via GitHub
itholic opened a new pull request, #43948: URL: https://github.com/apache/spark/pull/43948 ### What changes were proposed in this pull request? This PR proposes to add individual categories for `Options and settings` to API reference such as pandas does:

Re: [PR] [SPARK-46026][PYTHON][DOCS] Refine docstring of UDTF [spark]

2023-11-21 Thread via GitHub
allisonwang-db commented on PR #43928: URL: https://github.com/apache/spark/pull/43928#issuecomment-1821970655 Nice! Thanks for the update. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] [SPARK-46012][CORE][FOLLOWUP] Invoke `fs.listStatus` once and reuse the result [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun commented on PR #43944: URL: https://github.com/apache/spark/pull/43944#issuecomment-1821965278 Merged to master/3.5/3.4/3.3. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] [SPARK-46012][CORE][FOLLOWUP] Invoke `fs.listStatus` once and reuse the result [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun closed pull request #43944: [SPARK-46012][CORE][FOLLOWUP] Invoke `fs.listStatus` once and reuse the result URL: https://github.com/apache/spark/pull/43944 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] [SPARK-46012][CORE][FOLLOWUP] Invoke `fs.listStatus` once and reuse the result [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun commented on PR #43944: URL: https://github.com/apache/spark/pull/43944#issuecomment-1821964194 Thank you, @yaooqinn ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-23015][WINDOWS] Mitigate bug in Windows where starting multiple Spark instances within the same second causes a failure [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on PR #43706: URL: https://github.com/apache/spark/pull/43706#issuecomment-1821928455 Sent https://lists.apache.org/thread/9y0xrxo08cv3nt27wt40jrpncgzz1kt1 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] [SPARK-23015][WINDOWS] Mitigate bug in Windows where starting multiple Spark instances within the same second causes a failure [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on PR #43706: URL: https://github.com/apache/spark/pull/43706#issuecomment-1821926816 Let me ask the dev mailing list and see if we can have others test this patch -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] [SPARK-46021][CORE] Support cancel future jobs belonging to a job group [spark]

2023-11-21 Thread via GitHub
anchovYu commented on code in PR #43926: URL: https://github.com/apache/spark/pull/43926#discussion_r1401373298 ## core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala: ## @@ -169,6 +169,12 @@ private[spark] class DAGScheduler( private[scheduler] val

Re: [PR] [SPARK-46022][SPARK-46017][DOCS][PYTHON] Remove deprecated function APIs from documents [spark]

2023-11-21 Thread via GitHub
itholic commented on code in PR #43932: URL: https://github.com/apache/spark/pull/43932#discussion_r1401370529 ## python/docs/source/reference/pyspark.sql/functions.rst: ## @@ -504,18 +496,6 @@ Generator Functions stack -Partition Transformation Functions

Re: [PR] [SPARK-46021][CORE] Support cancel future jobs belonging to a job group [spark]

2023-11-21 Thread via GitHub
anchovYu commented on code in PR #43926: URL: https://github.com/apache/spark/pull/43926#discussion_r1401370271 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -2961,6 +2961,12 @@ ], "sqlState" : "42601" }, + "SPARK_JOB_CANCELLED" : { +

Re: [PR] [SPARK-46021][CORE] Support cancel future jobs belonging to a job group [spark]

2023-11-21 Thread via GitHub
anchovYu commented on code in PR #43926: URL: https://github.com/apache/spark/pull/43926#discussion_r1401368945 ## common/utils/src/main/resources/error/README.md: ## @@ -1347,7 +1347,7 @@ The following SQLSTATEs are collated from: |XX001|XX |Internal Error

Re: [PR] [SPARK-46040][SQL][Python] Update UDTF API for 'analyze' partitioning/ordering columns to support general expressions [spark]

2023-11-21 Thread via GitHub
dtenedor commented on PR #43946: URL: https://github.com/apache/spark/pull/43946#issuecomment-1821907222 cc @ueshin @allisonwang-db -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[PR] [SPARK-46040][SQL][Python] Update UDTF API for 'analyze' partitioning/ordering columns to support general expressions [spark]

2023-11-21 Thread via GitHub
dtenedor opened a new pull request, #43946: URL: https://github.com/apache/spark/pull/43946 ### What changes were proposed in this pull request? This PR updates the Python user-defined table function (UDTF) API for the `analyze` method to support general expressions for the
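To make the change concrete, a hedged sketch of the `analyze` shape this PR generalizes, assuming the helper classes (`AnalyzeArgument`, `AnalyzeResult`, `PartitioningColumn`, `OrderingColumn`) are importable from `pyspark.sql.functions` as in the existing UDTF documentation; the column names are assumptions for illustration, and the PR's point is that these fields should accept general expressions rather than only simple column references.

```python
from pyspark.sql.functions import (
    AnalyzeArgument, AnalyzeResult, OrderingColumn, PartitioningColumn, udtf,
)
from pyspark.sql.types import IntegerType, StringType, StructType

# A UDTF whose static 'analyze' method asks Spark to partition the TABLE
# argument by "device_id" and order rows within each partition by "event_time"
# before eval() is called. Both column names are illustrative assumptions.
@udtf
class CountPerDevice:
    @staticmethod
    def analyze(table_arg: AnalyzeArgument) -> AnalyzeResult:
        return AnalyzeResult(
            schema=StructType()
                .add("device_id", StringType())
                .add("n_events", IntegerType()),
            partitionBy=[PartitioningColumn("device_id")],
            orderBy=[OrderingColumn("event_time")],
        )

    def __init__(self):
        self._device_id = None
        self._count = 0

    def eval(self, row):
        # Rows arrive grouped by device_id and ordered by event_time.
        self._device_id = row["device_id"]
        self._count += 1

    def terminate(self):
        yield self._device_id, self._count
```

Registered with `spark.udtf.register("count_per_device", CountPerDevice)`, such a function could then be invoked as `SELECT * FROM count_per_device(TABLE(events))`, with the partitioning and ordering requested by `analyze` applied by the planner.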

Re: [PR] [SPARK-46039][BUILD][CONNECT] Upgrade `grpcio*` to 1.59.3 for Python 3.12 [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on PR #43942: URL: https://github.com/apache/spark/pull/43942#issuecomment-1821903760 @juliuszsompolski would you mind taking a look at https://github.com/apache/spark/pull/43942#issuecomment-1821896165 when you find some time? TL;DR: A bit of context is that

Re: [PR] [SPARK-46039][BUILD][CONNECT] Upgrade `grpcio*` to 1.59.3 for Python 3.12 [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun commented on PR #43942: URL: https://github.com/apache/spark/pull/43942#issuecomment-1821898015 Merged to master for Apache Spark 4.0.0. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] [SPARK-46039][BUILD][CONNECT] Upgrade `grpcio*` to 1.59.3 for Python 3.12 [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun closed pull request #43942: [SPARK-46039][BUILD][CONNECT] Upgrade `grpcio*` to 1.59.3 for Python 3.12 URL: https://github.com/apache/spark/pull/43942 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] [WIP][SPARK-44781][SQL] Runtime filter should supports reuse exchange if it can reduce the data size of application side [spark]

2023-11-21 Thread via GitHub
github-actions[bot] closed pull request #42468: [WIP][SPARK-44781][SQL] Runtime filter should supports reuse exchange if it can reduce the data size of application side URL: https://github.com/apache/spark/pull/42468 -- This is an automated message from the Apache Git Service. To respond to

Re: [PR] [SPARK-44767] Plugin API for PySpark and SparkR workers [spark]

2023-11-21 Thread via GitHub
github-actions[bot] closed pull request #42440: [SPARK-44767] Plugin API for PySpark and SparkR workers URL: https://github.com/apache/spark/pull/42440 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] [SPARK-46039][BUILD][CONNECT] Upgrade `grpcio*` to 1.59.3 for Python 3.12 [spark]

2023-11-21 Thread via GitHub
dongjoon-hyun commented on PR #43942: URL: https://github.com/apache/spark/pull/43942#issuecomment-1821896165 There is only one failure case here. Although this fails eventually, the behavior is slightly different. I filed SPARK-46042.

Re: [PR] [SPARK-45511][SS][FOLLOWUP] fix comment in StateDataSourceReadSuite [spark]

2023-11-21 Thread via GitHub
HyukjinKwon closed pull request #43945: [SPARK-45511][SS][FOLLOWUP] fix comment in StateDataSourceReadSuite URL: https://github.com/apache/spark/pull/43945 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to
