[GitHub] [spark] HyukjinKwon commented on pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

2022-09-25 Thread GitBox
HyukjinKwon commented on PR #37989: URL: https://github.com/apache/spark/pull/37989#issuecomment-1257175536 cc @mridulm FYI -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

2022-09-25 Thread GitBox
HyukjinKwon commented on code in PR #37989: URL: https://github.com/apache/spark/pull/37989#discussion_r979395539 ## core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala: ## @@ -4469,7 +4469,7 @@ class DAGSchedulerSuite extends SparkFunSuite with

[GitHub] [spark] HyukjinKwon opened a new pull request, #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

2022-09-25 Thread GitBox
HyukjinKwon opened a new pull request, #37989: URL: https://github.com/apache/spark/pull/37989 ### What changes were proposed in this pull request? This PR is a followup of https://github.com/apache/spark/pull/37533 that works around the test failure by explicitly checking the

[GitHub] [spark] cchantep commented on pull request #36030: Draft: [SPARK-38715] Configurable client ID for Kafka Spark SQL producer

2022-09-25 Thread GitBox
cchantep commented on PR #36030: URL: https://github.com/apache/spark/pull/36030#issuecomment-1257182398 Closing because nobody review it timely?

[GitHub] [spark] roczei commented on pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-25 Thread GitBox
roczei commented on PR #37679: URL: https://github.com/apache/spark/pull/37679#issuecomment-1257184837 Hi @cloud-fan, All build issues have been fixed and all of your feedbacks have been implemented. Latest state: ``` $ bin/spark-shell --conf

[GitHub] [spark] attilapiros commented on a diff in pull request #37990: [WIP][SPARK-40458][K8S] Bump Kubernetes Client Version to 6.1.1

2022-09-25 Thread GitBox
attilapiros commented on code in PR #37990: URL: https://github.com/apache/spark/pull/37990#discussion_r979435892 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/K8sSubmitOps.scala: ## @@ -144,14 +134,13 @@ private[spark] class

[GitHub] [spark] srowen commented on pull request #37988: [SPARK-40142][PYTHON][SQL][FOLLOW-UP] Make pyspark.sql.functions examples self-contained (FINAL)

2022-09-25 Thread GitBox
srowen commented on PR #37988: URL: https://github.com/apache/spark/pull/37988#issuecomment-1257243173 Merged to master

[GitHub] [spark] srowen closed pull request #37988: [SPARK-40142][PYTHON][SQL][FOLLOW-UP] Make pyspark.sql.functions examples self-contained (FINAL)

2022-09-25 Thread GitBox
srowen closed pull request #37988: [SPARK-40142][PYTHON][SQL][FOLLOW-UP] Make pyspark.sql.functions examples self-contained (FINAL) URL: https://github.com/apache/spark/pull/37988

[GitHub] [spark] khalidmammadov commented on pull request #37988: [SPARK-40142][PYTHON][SQL][FOLLOW-UP] Make pyspark.sql.functions examples self-contained (FINAL)

2022-09-25 Thread GitBox
khalidmammadov commented on PR #37988: URL: https://github.com/apache/spark/pull/37988#issuecomment-1257199499 > Thanks for finishing this work, @khalidmammadov. Happy to help

[GitHub] [spark] ivoson commented on pull request #37268: [SPARK-39853][CORE] Support stage level task resource profile for standalone cluster when dynamic allocation disabled

2022-09-25 Thread GitBox
ivoson commented on PR #37268: URL: https://github.com/apache/spark/pull/37268#issuecomment-1257212450 > Mostly looks good. We need to update the docs like: https://github.com/apache/spark/blob/master/docs/configuration.md#stage-level-scheduling-overview It says "the current implementation

[GitHub] [spark] attilapiros commented on pull request #37990: [WIP][SPARK-40458] Bump Kubernetes Client Version to 6.1.1

2022-09-25 Thread GitBox
attilapiros commented on PR #37990: URL: https://github.com/apache/spark/pull/37990#issuecomment-1257240658 The `inNamespace` calls are added because of the [namespace changes](https://github.com/fabric8io/kubernetes-client/blob/master/doc/MIGRATION-v6.md#namespace-changes) and to have

[GitHub] [spark] bjornjorgensen opened a new pull request, #37991: [SPARK-40552][] Upgrade `protobuf-python` to 4.21.6

2022-09-25 Thread GitBox
bjornjorgensen opened a new pull request, #37991: URL: https://github.com/apache/spark/pull/37991 ### What changes were proposed in this pull request? Upgrade protobuf-python from 4.21.5 to 4.21.6 ### Why are the changes needed?

[GitHub] [spark] attilapiros commented on a diff in pull request #37990: [WIP][SPARK-40458][K8S] Bump Kubernetes Client Version to 6.1.1

2022-09-25 Thread GitBox
attilapiros commented on code in PR #37990: URL: https://github.com/apache/spark/pull/37990#discussion_r979439689 ## resource-managers/kubernetes/core/pom.xml: ## @@ -75,6 +75,11 @@ test + Review Comment:

[GitHub] [spark] AmplabJenkins commented on pull request #37991: [SPARK-40552][BUILD] Upgrade `protobuf-python` to 4.21.6

2022-09-25 Thread GitBox
AmplabJenkins commented on PR #37991: URL: https://github.com/apache/spark/pull/37991#issuecomment-1257250279 Can one of the admins verify this patch?

[GitHub] [spark] HyukjinKwon commented on pull request #37988: [SPARK-40142][PYTHON][SQL][FOLLOW-UP] Make pyspark.sql.functions examples self-contained (FINAL)

2022-09-25 Thread GitBox
HyukjinKwon commented on PR #37988: URL: https://github.com/apache/spark/pull/37988#issuecomment-1257175625 Thanks for finishing this work, @khalidmammadov.

[GitHub] [spark] lvshaokang commented on pull request #37986: [SPARK-40357][SQL] Migrate window type check failures onto error classes

2022-09-25 Thread GitBox
lvshaokang commented on PR #37986: URL: https://github.com/apache/spark/pull/37986#issuecomment-1257206608 @MaxGekk Thanks for you review. I'm already addressing it.

[GitHub] [spark] attilapiros commented on a diff in pull request #37990: [WIP][SPARK-40458][K8S] Bump Kubernetes Client Version to 6.1.1

2022-09-25 Thread GitBox
attilapiros commented on code in PR #37990: URL: https://github.com/apache/spark/pull/37990#discussion_r979435775 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala: ## @@ -115,7 +115,10 @@ private[spark] object

[GitHub] [spark] attilapiros opened a new pull request, #37990: [WIP][SPARK-40458] Bump Kubernetes Client Version to 6.1.1

2022-09-25 Thread GitBox
attilapiros opened a new pull request, #37990: URL: https://github.com/apache/spark/pull/37990 ### What changes were proposed in this pull request? Bump kubernetes-client version from 5.12.3 to 6.1.1 and clean up all the deprecations. ### Why are the changes needed?

[GitHub] [spark] attilapiros commented on a diff in pull request #37990: [WIP][SPARK-40458][K8S] Bump Kubernetes Client Version to 6.1.1

2022-09-25 Thread GitBox
attilapiros commented on code in PR #37990: URL: https://github.com/apache/spark/pull/37990#discussion_r979436243 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala: ## @@ -168,23 +171,19 @@

[GitHub] [spark] EvgenyZamyatin commented on pull request #37967: Scalable SkipGram-Word2Vec implementation

2022-09-25 Thread GitBox
EvgenyZamyatin commented on PR #37967: URL: https://github.com/apache/spark/pull/37967#issuecomment-1257192550 @zhengruifeng Hi! Could you please review my changes?

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37991: [SPARK-40552][BUILD] Upgrade `protobuf-python` to 4.21.6

2022-09-25 Thread GitBox
HyukjinKwon commented on code in PR #37991: URL: https://github.com/apache/spark/pull/37991#discussion_r979489138 ## dev/requirements.txt: ## @@ -48,4 +48,4 @@ black==22.6.0 # Spark Connect grpcio==1.48.1 -protobuf==4.21.5 \ No newline at end of file +protobuf==4.21.6

[GitHub] [spark] grundprinzip opened a new pull request, #37993: [Cleanup] Update generated proto files for Spark Connect

2022-09-25 Thread GitBox
grundprinzip opened a new pull request, #37993: URL: https://github.com/apache/spark/pull/37993 ### What changes were proposed in this pull request? This patch cleans up the generated proto files from the initial Spark Connect import. The previous files had a Databricks specific

[GitHub] [spark] zhengruifeng closed pull request #37923: [SPARK-40334][PS] Implement `GroupBy.prod`

2022-09-25 Thread GitBox
zhengruifeng closed pull request #37923: [SPARK-40334][PS] Implement `GroupBy.prod` URL: https://github.com/apache/spark/pull/37923

[GitHub] [spark] zhengruifeng commented on a diff in pull request #37978: [SPARK-40330][PS] Implement `Series.searchsorted`

2022-09-25 Thread GitBox
zhengruifeng commented on code in PR #37978: URL: https://github.com/apache/spark/pull/37978#discussion_r979482727 ## python/pyspark/pandas/series.py: ## @@ -6610,6 +6610,78 @@ def compare( ) return DataFrame(internal) +# todo: 1, support array-like

[GitHub] [spark] zhengruifeng commented on a diff in pull request #37978: [SPARK-40330][PS] Implement `Series.searchsorted`

2022-09-25 Thread GitBox
zhengruifeng commented on code in PR #37978: URL: https://github.com/apache/spark/pull/37978#discussion_r979482634 ## python/pyspark/pandas/series.py: ## @@ -6610,6 +6610,78 @@ def compare( ) return DataFrame(internal) +# todo: 1, support array-like

[GitHub] [spark] nkronenfeld commented on pull request #36613: [WIP][SPARK-30983] Support typed select in Datasets up to the max tuple size

2022-09-25 Thread GitBox
nkronenfeld commented on PR #36613: URL: https://github.com/apache/spark/pull/36613#issuecomment-1257319840 also, I don't see a button to re-open it - does anyone know where that is?

[GitHub] [spark] bersprockets commented on a diff in pull request #37825: [SPARK-40382][SQL] Group distinct aggregate expressions by semantically equivalent children in `RewriteDistinctAggregates`

2022-09-25 Thread GitBox
bersprockets commented on code in PR #37825: URL: https://github.com/apache/spark/pull/37825#discussion_r979487585 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala: ## @@ -291,7 +298,8 @@ object RewriteDistinctAggregates

[GitHub] [spark] weixiuli commented on a diff in pull request #37922: [WIP][SPARK-40480][SHUFFLE] Remove push-based shuffle data after query finished

2022-09-25 Thread GitBox
weixiuli commented on code in PR #37922: URL: https://github.com/apache/spark/pull/37922#discussion_r979507631 ## core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala: ## @@ -2543,16 +2541,13 @@ private[spark] class DAGScheduler( shuffleIdToMapStage.filter {

[GitHub] [spark] HeartSaVioR commented on pull request #37285: [POC][PYTHON][SS] Arbitrary stateful processing in Structured Streaming with Python

2022-09-25 Thread GitBox
HeartSaVioR commented on PR #37285: URL: https://github.com/apache/spark/pull/37285#issuecomment-1257400388 We can close this now.

[GitHub] [spark] HeartSaVioR closed pull request #37285: [POC][PYTHON][SS] Arbitrary stateful processing in Structured Streaming with Python

2022-09-25 Thread GitBox
HeartSaVioR closed pull request #37285: [POC][PYTHON][SS] Arbitrary stateful processing in Structured Streaming with Python URL: https://github.com/apache/spark/pull/37285

[GitHub] [spark] HeartSaVioR commented on pull request #37935: [SPARK-40492][SS] Do maintenance before streaming StateStore unload

2022-09-25 Thread GitBox
HeartSaVioR commented on PR #37935: URL: https://github.com/apache/spark/pull/37935#issuecomment-1257401965 Thanks! Merging to master.

[GitHub] [spark] LuciferYang commented on pull request #37976: [SPARK-40544][SQL][TESTS] Restore the file appender log level threshold of the hive UTs to info

2022-09-25 Thread GitBox
LuciferYang commented on PR #37976: URL: https://github.com/apache/spark/pull/37976#issuecomment-1257403015 thanks @wangyum

[GitHub] [spark] HeartSaVioR commented on pull request #37935: [SPARK-40492][SS] Do maintenance before streaming StateStore unload

2022-09-25 Thread GitBox
HeartSaVioR commented on PR #37935: URL: https://github.com/apache/spark/pull/37935#issuecomment-1257403721 Thanks @chaoqin-li1123 for the contribution! I merged this to master.

[GitHub] [spark] HeartSaVioR closed pull request #37935: [SPARK-40492][SS] Do maintenance before streaming StateStore unload

2022-09-25 Thread GitBox
HeartSaVioR closed pull request #37935: [SPARK-40492][SS] Do maintenance before streaming StateStore unload URL: https://github.com/apache/spark/pull/37935

[GitHub] [spark] grundprinzip commented on pull request #37710: [SPARK-40448][CONNECT] Spark Connect build as Driver Plugin with Shaded Dependencies

2022-09-25 Thread GitBox
grundprinzip commented on PR #37710: URL: https://github.com/apache/spark/pull/37710#issuecomment-1257437750 Ack, I will regenerate the protos and update.

[GitHub] [spark] zhengruifeng commented on pull request #37923: [SPARK-40334][PS] Implement `GroupBy.prod`

2022-09-25 Thread GitBox
zhengruifeng commented on PR #37923: URL: https://github.com/apache/spark/pull/37923#issuecomment-1257315237 Merged into master, thank you @ayudovin for working on it!

[GitHub] [spark] bersprockets commented on a diff in pull request #37825: [SPARK-40382][SQL] Group distinct aggregate expressions by semantically equivalent children in `RewriteDistinctAggregates`

2022-09-25 Thread GitBox
bersprockets commented on code in PR #37825: URL: https://github.com/apache/spark/pull/37825#discussion_r979489753 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala: ## @@ -213,7 +213,16 @@ object RewriteDistinctAggregates

[GitHub] [spark] LuciferYang commented on pull request #37979: [SPARK-40545][SQL][TESTS] Clean up `metastorePath` after `SparkSQLEnvSuite` execution

2022-09-25 Thread GitBox
LuciferYang commented on PR #37979: URL: https://github.com/apache/spark/pull/37979#issuecomment-1257402939 thanks @wangyum

[GitHub] [spark] beliefer commented on pull request #37825: [SPARK-40382][SQL] Group distinct aggregate expressions by semantically equivalent children in `RewriteDistinctAggregates`

2022-09-25 Thread GitBox
beliefer commented on PR #37825: URL: https://github.com/apache/spark/pull/37825#issuecomment-1257447790 It seems a little complex. I have an idea to simplify the binary expressions in other optimizer rule. Please reference `SimplifyBinaryComparison`.

[GitHub] [spark] HyukjinKwon closed pull request #37978: [SPARK-40330][PS] Implement `Series.searchsorted`

2022-09-25 Thread GitBox
HyukjinKwon closed pull request #37978: [SPARK-40330][PS] Implement `Series.searchsorted` URL: https://github.com/apache/spark/pull/37978

[GitHub] [spark] HyukjinKwon commented on pull request #37978: [SPARK-40330][PS] Implement `Series.searchsorted`

2022-09-25 Thread GitBox
HyukjinKwon commented on PR #37978: URL: https://github.com/apache/spark/pull/37978#issuecomment-1257439757 Merged to master.

[GitHub] [spark] ulysses-you opened a new pull request, #36700: [SPARK-39318][SQL] Remove tpch-plan-stability WithStats golden files

2022-09-25 Thread GitBox
ulysses-you opened a new pull request, #36700: URL: https://github.com/apache/spark/pull/36700 ### What changes were proposed in this pull request? Remove all TPCH with stats golden files. ### Why are the changes needed? It's a dead golden files since we have no

[GitHub] [spark] cloud-fan commented on pull request #36700: [SPARK-39318][SQL] Remove tpch-plan-stability WithStats golden files

2022-09-25 Thread GitBox
cloud-fan commented on PR #36700: URL: https://github.com/apache/spark/pull/36700#issuecomment-1257481093 sorry I missed this PR. @ulysses-you can you do a rebase?

[GitHub] [spark] bjornjorgensen commented on pull request #37991: [SPARK-40552][BUILD] Upgrade `protobuf-python` to 4.21.6

2022-09-25 Thread GitBox
bjornjorgensen commented on PR #37991: URL: https://github.com/apache/spark/pull/37991#issuecomment-1257271472 cc @grundprinzip

[GitHub] [spark] srowen commented on pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

2022-09-25 Thread GitBox
srowen commented on PR #37989: URL: https://github.com/apache/spark/pull/37989#issuecomment-1257299340 Merged to master

[GitHub] [spark] srowen closed pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

2022-09-25 Thread GitBox
srowen closed pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length URL: https://github.com/apache/spark/pull/37989

[GitHub] [spark] nkronenfeld commented on pull request #36613: [WIP][SPARK-30983] Support typed select in Datasets up to the max tuple size

2022-09-25 Thread GitBox
nkronenfeld commented on PR #36613: URL: https://github.com/apache/spark/pull/36613#issuecomment-1257319350 I haven't done anything on the branch because I was waiting for comments - but as far as I know, no one even looked at it. Am I missing something for it to get considered in the

[GitHub] [spark] github-actions[bot] closed pull request #36874: [SPARK-39475][SQL] Pull out complex join keys for shuffled join

2022-09-25 Thread GitBox
github-actions[bot] closed pull request #36874: [SPARK-39475][SQL] Pull out complex join keys for shuffled join URL: https://github.com/apache/spark/pull/36874

[GitHub] [spark] github-actions[bot] closed pull request #36128: [SPARK-34444][SQL] Pushdown scalar-subquery filter to FileSourceScan

2022-09-25 Thread GitBox
github-actions[bot] closed pull request #36128: [SPARK-34444][SQL] Pushdown scalar-subquery filter to FileSourceScan URL: https://github.com/apache/spark/pull/36128

[GitHub] [spark] github-actions[bot] closed pull request #36088: [SPARK-38805][SHUFFLE] Automatically remove an expired indexFilePath from the ESS shuffleIndexCache or the PBS indexCache to save memory.

2022-09-25 Thread GitBox
github-actions[bot] closed pull request #36088: [SPARK-38805][SHUFFLE] Automatically remove an expired indexFilePath from the ESS shuffleIndexCache or the PBS indexCache to save memory. URL: https://github.com/apache/spark/pull/36088

[GitHub] [spark] github-actions[bot] commented on pull request #35858: [SPARK-38448] [YARN] [CORE] Sending Available Resources in Yarn Cluster Information to Spark Driver

2022-09-25 Thread GitBox
github-actions[bot] commented on PR #35858: URL: https://github.com/apache/spark/pull/35858#issuecomment-1257323009 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] closed pull request #35927: [WIP] Simplify the rule of auto-generated alias name

2022-09-25 Thread GitBox
github-actions[bot] closed pull request #35927: [WIP] Simplify the rule of auto-generated alias name URL: https://github.com/apache/spark/pull/35927

[GitHub] [spark] github-actions[bot] commented on pull request #35845: [SPARK-38520][SQL] ANSI interval overflow when reading CSV

2022-09-25 Thread GitBox
github-actions[bot] commented on PR #35845: URL: https://github.com/apache/spark/pull/35845#issuecomment-1257323023 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #35867: [SPARK-38559][SQL][WEBUI]Display the number of empty partitions on spark ui

2022-09-25 Thread GitBox
github-actions[bot] commented on PR #35867: URL: https://github.com/apache/spark/pull/35867#issuecomment-1257323002 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #35808: [WIP][SPARK-38512] Rebased traversal order from "pre-order" to "post-order" for `ResolveFunctions` Rule

2022-09-25 Thread GitBox
github-actions[bot] commented on PR #35808: URL: https://github.com/apache/spark/pull/35808#issuecomment-1257323034 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #36829: [SPARK-39438][SQL] Add a threshold to not in line CTE

2022-09-25 Thread GitBox
github-actions[bot] commented on PR #36829: URL: https://github.com/apache/spark/pull/36829#issuecomment-1257322956 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] closed pull request #36301: [SPARK-21697][SQL] NPE & ExceptionInInitializerError trying to load UDF from HDFS

2022-09-25 Thread GitBox
github-actions[bot] closed pull request #36301: [SPARK-21697][SQL] NPE & ExceptionInInitializerError trying to load UDF from HDFS URL: https://github.com/apache/spark/pull/36301

[GitHub] [spark] github-actions[bot] commented on pull request #36265: [SPARK-38951][SQL] Aggregate aliases override field names in ResolveAggregateFunctions

2022-09-25 Thread GitBox
github-actions[bot] commented on PR #36265: URL: https://github.com/apache/spark/pull/36265#issuecomment-1257322967 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] closed pull request #36208: [SPARK-38911][CORE] Fix the potential resource profile id mess up issue

2022-09-25 Thread GitBox
github-actions[bot] closed pull request #36208: [SPARK-38911][CORE] Fix the potential resource profile id mess up issue URL: https://github.com/apache/spark/pull/36208

[GitHub] [spark] github-actions[bot] closed pull request #36180: [SPARK-38887][SQL] Support switch inner join side for sort merge join

2022-09-25 Thread GitBox
github-actions[bot] closed pull request #36180: [SPARK-38887][SQL] Support switch inner join side for sort merge join URL: https://github.com/apache/spark/pull/36180

[GitHub] [spark] github-actions[bot] closed pull request #36151: WIP: [SPARK-27998] [SQL] Add support for double-quoted named expressions

2022-09-25 Thread GitBox
github-actions[bot] closed pull request #36151: WIP: [SPARK-27998] [SQL] Add support for double-quoted named expressions URL: https://github.com/apache/spark/pull/36151

[GitHub] [spark] github-actions[bot] commented on pull request #35806: [SPARK-38505][SQL] Make partial aggregation adaptive

2022-09-25 Thread GitBox
github-actions[bot] commented on PR #35806: URL: https://github.com/apache/spark/pull/35806#issuecomment-1257323038 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #35764: [SPARK-38444][SQL]Automatically calculate the upper and lower bounds of partitions when no specified partition related params

2022-09-25 Thread GitBox
github-actions[bot] commented on PR #35764: URL: https://github.com/apache/spark/pull/35764#issuecomment-1257323067 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #35799: [SPARK-38498][STREAM] Support customized StreamingListener by configuration

2022-09-25 Thread GitBox
github-actions[bot] commented on PR #35799: URL: https://github.com/apache/spark/pull/35799#issuecomment-1257323057 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #35763: [SPARK-38433][BUILD] change the shell code style with shellcheck

2022-09-25 Thread GitBox
github-actions[bot] commented on PR #35763: URL: https://github.com/apache/spark/pull/35763#issuecomment-1257323076 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] lvshaokang commented on pull request #37986: [SPARK-40357][SQL] Migrate window type check failures onto error classes

2022-09-25 Thread GitBox
lvshaokang commented on PR #37986: URL: https://github.com/apache/spark/pull/37986#issuecomment-1257478094 @MaxGekk I have addressed, please take a look, thk!

[GitHub] [spark] itholic commented on pull request #37988: [SPARK-40142][PYTHON][SQL][FOLLOW-UP] Make pyspark.sql.functions examples self-contained (FINAL)

2022-09-25 Thread GitBox
itholic commented on PR #37988: URL: https://github.com/apache/spark/pull/37988#issuecomment-1257289652 Thanks for your efforts to finish this work!

[GitHub] [spark] Kwafoor closed pull request #37951: [SPARK-40506]Spark Streaming metrics name doesn't need application name

2022-09-25 Thread GitBox
Kwafoor closed pull request #37951: [SPARK-40506]Spark Streaming metrics name doesn't need application name URL: https://github.com/apache/spark/pull/37951

[GitHub] [spark] mridulm commented on pull request #37989: [SPARK-40096][CORE][TESTS][FOLLOW-UP] Explicitly check the element and length

2022-09-25 Thread GitBox
mridulm commented on PR #37989: URL: https://github.com/apache/spark/pull/37989#issuecomment-1257374954 This is very interesting behavior ! Thanks for fixing this @HyukjinKwon

[GitHub] [spark] zhengruifeng opened a new pull request, #37992: [SPARK-40554][PS] Make `ddof` in `DataFrame.sem` and `Series.sem` accept arbitrary integers

2022-09-25 Thread GitBox
zhengruifeng opened a new pull request, #37992: URL: https://github.com/apache/spark/pull/37992 ### What changes were proposed in this pull request? Make `ddof` in `DataFrame.sem` and `Series.sem` accept arbitrary integers ### Why are the changes needed? for API coverage

[GitHub] [spark] beliefer commented on a diff in pull request #37825: [SPARK-40382][SQL] Group distinct aggregate expressions by semantically equivalent children in `RewriteDistinctAggregates`

2022-09-25 Thread GitBox
beliefer commented on code in PR #37825: URL: https://github.com/apache/spark/pull/37825#discussion_r979537901 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala: ## @@ -218,9 +218,16 @@ object RewriteDistinctAggregates

[GitHub] [spark] zhengruifeng commented on pull request #37978: [SPARK-40330][PS] Implement `Series.searchsorted`

2022-09-25 Thread GitBox
zhengruifeng commented on PR #37978: URL: https://github.com/apache/spark/pull/37978#issuecomment-1257440961 @HyukjinKwon @itholic Thanks for the reviews!

[GitHub] [spark] HyukjinKwon commented on pull request #37993: [CONNECT] [Cleanup] Update generated proto files for Spark Connect

2022-09-25 Thread GitBox
HyukjinKwon commented on PR #37993: URL: https://github.com/apache/spark/pull/37993#issuecomment-1257456288 (This probably needs a separate JIRA)

[GitHub] [spark] HyukjinKwon commented on pull request #37985: [SPARK-40548][BUILD] Upgrade rocksdbjni from 7.5.3 to 7.6.0

2022-09-25 Thread GitBox
HyukjinKwon commented on PR #37985: URL: https://github.com/apache/spark/pull/37985#issuecomment-1257193303 Merged to master.

[GitHub] [spark] HyukjinKwon closed pull request #37985: [SPARK-40548][BUILD] Upgrade rocksdbjni from 7.5.3 to 7.6.0

2022-09-25 Thread GitBox
HyukjinKwon closed pull request #37985: [SPARK-40548][BUILD] Upgrade rocksdbjni from 7.5.3 to 7.6.0 URL: https://github.com/apache/spark/pull/37985

[GitHub] [spark] attilapiros commented on a diff in pull request #37990: [WIP][SPARK-40458][K8S] Bump Kubernetes Client Version to 6.1.1

2022-09-25 Thread GitBox
attilapiros commented on code in PR #37990: URL: https://github.com/apache/spark/pull/37990#discussion_r979437063 ## resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/submit/K8sSubmitOpSuite.scala: ## @@ -101,18 +114,19 @@ class K8sSubmitOpSuite

[GitHub] [spark] cloud-fan commented on pull request #37982: [SPARK-38717][SQL][3.3] Handle Hive's bucket spec case preserving behaviour

2022-09-25 Thread GitBox
cloud-fan commented on PR #37982: URL: https://github.com/apache/spark/pull/37982#issuecomment-1257487700 all tests passed: https://github.com/peter-toth/spark/runs/8514875267 merging to 3.3, thanks!

[GitHub] [spark] amaliujia commented on pull request #37994: [SPARK-40454][Connect]Initial DSL framework for protobuf testing

2022-09-25 Thread GitBox
amaliujia commented on PR #37994: URL: https://github.com/apache/spark/pull/37994#issuecomment-1257516316 @cloud-fan @HyukjinKwon @grundprinzip

[GitHub] [spark] cloud-fan commented on a diff in pull request #36265: [SPARK-38951][SQL] Aggregate aliases override field names in ResolveAggregateFunctions

2022-09-25 Thread GitBox
cloud-fan commented on code in PR #36265: URL: https://github.com/apache/spark/pull/36265#discussion_r979562710 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -2594,6 +2601,31 @@ class Analyzer(override val catalogManager:

[GitHub] [spark] cloud-fan commented on a diff in pull request #35789: [SPARK-32268][SQL] Row-level Runtime Filtering

2022-09-25 Thread GitBox
cloud-fan commented on code in PR #35789: URL: https://github.com/apache/spark/pull/35789#discussion_r979569276 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/InjectRuntimeFilter.scala: ## @@ -0,0 +1,303 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] cloud-fan closed pull request #37982: [SPARK-38717][SQL][3.3] Handle Hive's bucket spec case preserving behaviour

2022-09-25 Thread GitBox
cloud-fan closed pull request #37982: [SPARK-38717][SQL][3.3] Handle Hive's bucket spec case preserving behaviour URL: https://github.com/apache/spark/pull/37982

[GitHub] [spark] cloud-fan commented on a diff in pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-25 Thread GitBox
cloud-fan commented on code in PR #37679: URL: https://github.com/apache/spark/pull/37679#discussion_r979572317 ## sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala: ## @@ -1932,6 +1932,13 @@ private[sql] object QueryExecutionErrors extends

[GitHub] [spark] cloud-fan commented on a diff in pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-25 Thread GitBox
cloud-fan commented on code in PR #37679: URL: https://github.com/apache/spark/pull/37679#discussion_r979572820 ## sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala: ## @@ -148,13 +148,19 @@ private[sql] class SharedState( val externalCatalog =

[GitHub] [spark] beliefer commented on a diff in pull request #35789: [SPARK-32268][SQL] Row-level Runtime Filtering

2022-09-25 Thread GitBox
beliefer commented on code in PR #35789: URL: https://github.com/apache/spark/pull/35789#discussion_r979579903 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/InjectRuntimeFilter.scala: ## @@ -0,0 +1,303 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] Ngone51 commented on a diff in pull request #37268: [SPARK-39853][CORE] Support stage level task resource profile for standalone cluster when dynamic allocation disabled

2022-09-25 Thread GitBox
Ngone51 commented on code in PR #37268: URL: https://github.com/apache/spark/pull/37268#discussion_r979579541 ## core/src/main/scala/org/apache/spark/resource/ResourceProfileManager.scala: ## @@ -59,35 +59,65 @@ private[spark] class ResourceProfileManager(sparkConf: SparkConf,

[GitHub] [spark] cloud-fan commented on a diff in pull request #37825: [SPARK-40382][SQL] Group distinct aggregate expressions by semantically equivalent children in `RewriteDistinctAggregates`

2022-09-25 Thread GitBox
cloud-fan commented on code in PR #37825: URL: https://github.com/apache/spark/pull/37825#discussion_r979582134 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala: ## @@ -218,9 +218,16 @@ object RewriteDistinctAggregates

[GitHub] [spark] amaliujia opened a new pull request, #37994: [SPARK-40454] Initial DSL framework for protobuf testing

2022-09-25 Thread GitBox
amaliujia opened a new pull request, #37994: URL: https://github.com/apache/spark/pull/37994 ### What changes were proposed in this pull request? Implement an approach to testing the proto to Scala conversion with a DSL according to the proposal in

[GitHub] [spark] khalidmammadov opened a new pull request, #37988: [SPARK-40142][PYTHON][SQL][FOLLOW-UP] Make pyspark.sql.functions examples self-contained (FINAL)

2022-09-25 Thread GitBox
khalidmammadov opened a new pull request, #37988: URL: https://github.com/apache/spark/pull/37988 ### What changes were proposed in this pull request? It's part of the Pyspark docstrings improvement series (https://github.com/apache/spark/pull/37592,

[GitHub] [spark] AmplabJenkins commented on pull request #37988: [SPARK-40142][PYTHON][SQL][FOLLOW-UP] Make pyspark.sql.functions examples self-contained (FINAL)

2022-09-25 Thread GitBox
AmplabJenkins commented on PR #37988: URL: https://github.com/apache/spark/pull/37988#issuecomment-1257145678 Can one of the admins verify this patch?

[GitHub] [spark] khalidmammadov commented on pull request #37988: [SPARK-40142][PYTHON][SQL][FOLLOW-UP] Make pyspark.sql.functions examples self-contained (FINAL)

2022-09-25 Thread GitBox
khalidmammadov commented on PR #37988: URL: https://github.com/apache/spark/pull/37988#issuecomment-1257156072 @srowen @itholic @HyukjinKwon Please review