[GitHub] [spark] wangyum opened a new pull request, #37984: [SPARK-40322][DOCS][3.3] Fix all dead links in the docs

2022-09-24 Thread GitBox
wangyum opened a new pull request, #37984: URL: https://github.com/apache/spark/pull/37984 This PR backports https://github.com/apache/spark/pull/37981 to branch-3.3. The original PR description: ### What changes were proposed in this pull request? This PR fixes any dead links

[GitHub] [spark] HyukjinKwon closed pull request #37984: [SPARK-40322][DOCS][3.3] Fix all dead links in the docs

2022-09-24 Thread GitBox
HyukjinKwon closed pull request #37984: [SPARK-40322][DOCS][3.3] Fix all dead links in the docs URL: https://github.com/apache/spark/pull/37984

[GitHub] [spark] HyukjinKwon commented on pull request #37984: [SPARK-40322][DOCS][3.3] Fix all dead links in the docs

2022-09-24 Thread GitBox
HyukjinKwon commented on PR #37984: URL: https://github.com/apache/spark/pull/37984#issuecomment-1256904136 Merged to branch-3.3.

[GitHub] [spark] EvgenyZamyatin commented on pull request #37967: [WIP] Scalable SkipGram-Word2Vec implementation

2022-09-24 Thread GitBox
EvgenyZamyatin commented on PR #37967: URL: https://github.com/apache/spark/pull/37967#issuecomment-1256936762 How could I fix this error? The problem is due to the change of parallel collections in Scala 2.13. I can fix it for Scala 2.13, but how could I fix it for cross-building?
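Context added here, not an answer from the thread: on Scala 2.13 the parallel collections were split out into the separate `org.scala-lang.modules:scala-parallel-collections` module, and `.par` needs an extra import that does not exist on 2.12, so a cross-build typically keeps that import in a version-specific source directory (e.g. `src/main/scala-2.13`) or behind a small shim. A minimal 2.13-only sketch:

```scala
// build.sbt (2.13 only):
//   libraryDependencies += "org.scala-lang.modules" %% "scala-parallel-collections" % "1.0.4"

// On 2.13 this import restores `.par` on the standard collections; it does not
// exist on 2.12, which is why cross-builds isolate it in a 2.13-specific source dir.
import scala.collection.parallel.CollectionConverters._

object ParDemo {
  def main(args: Array[String]): Unit = {
    val words = Seq("spark", "word2vec", "skipgram")
    // `.par` converts the Seq into a ParSeq whose map runs in parallel.
    val lengths = words.par.map(_.length)
    println(lengths.sum) // 21
  }
}
```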

[GitHub] [spark] panbingkun commented on a diff in pull request #37941: [SPARK-40501][SQL] Add PushProjectionThroughLimit for Optimizer

2022-09-24 Thread GitBox
panbingkun commented on code in PR #37941: URL: https://github.com/apache/spark/pull/37941#discussion_r979185389 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PushProjectionThroughLimit.scala: ## @@ -0,0 +1,39 @@
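The rule body itself is not visible in this snippet. Purely as a hedged illustration of what a Catalyst rule with this name could look like — not necessarily what the PR implements — the sketch below swaps a deterministic Project with the Limit beneath it:

```scala
import org.apache.spark.sql.catalyst.plans.logical.{Limit, LogicalPlan, Project}
import org.apache.spark.sql.catalyst.rules.Rule

// Hypothetical sketch only, not the PR's contents: push a deterministic
// Project below the Limit it sits on, so the projection is planned under the limit.
object PushProjectionThroughLimitSketch extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    // Limit(le, child) is Catalyst's extractor for GlobalLimit(le, LocalLimit(le, child)).
    case Project(projectList, Limit(le, child)) if projectList.forall(_.deterministic) =>
      Limit(le, Project(projectList, child))
  }
}
```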

[GitHub] [spark] roczei commented on a diff in pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-24 Thread GitBox
roczei commented on code in PR #37679: URL: https://github.com/apache/spark/pull/37679#discussion_r979213153 ## core/src/main/resources/error/error-classes.json: ## @@ -70,6 +70,11 @@ ], "sqlState" : "22008" }, + "DEFAULT_CATALOG_DATABASE_NOT_EXISTS" : {

[GitHub] [spark] lvshaokang opened a new pull request, #37986: [SPARK-40357][SQL] Migrate window type check failures onto error classes

2022-09-24 Thread GitBox
lvshaokang opened a new pull request, #37986: URL: https://github.com/apache/spark/pull/37986 ### What changes were proposed in this pull request? In the PR, I propose to use error classes in the case of type check failure in window expressions. ### Why are the changes

[GitHub] [spark] wangyum commented on pull request #37976: [SPARK-40544][SQL][TESTS] Restore the file appender log level threshold of the hive UTs to info

2022-09-24 Thread GitBox
wangyum commented on PR #37976: URL: https://github.com/apache/spark/pull/37976#issuecomment-1256994302 Merged to master.

[GitHub] [spark] wangyum closed pull request #37976: [SPARK-40544][SQL][TESTS] Restore the file appender log level threshold of the hive UTs to info

2022-09-24 Thread GitBox
wangyum closed pull request #37976: [SPARK-40544][SQL][TESTS] Restore the file appender log level threshold of the hive UTs to info URL: https://github.com/apache/spark/pull/37976

[GitHub] [spark] bersprockets commented on a diff in pull request #37825: [SPARK-40382][SQL] Group distinct aggregate expressions by semantically equivalent children in `RewriteDistinctAggregates`

2022-09-24 Thread GitBox
bersprockets commented on code in PR #37825: URL: https://github.com/apache/spark/pull/37825#discussion_r979267219 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala: ## @@ -213,7 +213,16 @@ object RewriteDistinctAggregates
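Illustration added here, not the PR's code: "semantically equivalent children" refers to Catalyst's expression canonicalization, under which commutative operators such as `a + b` and `b + a` compare equal, so `count(DISTINCT a + b)` and `count(DISTINCT b + a)` can share one distinct group. A small sketch:

```scala
import org.apache.spark.sql.catalyst.expressions.{Add, AttributeReference, ExpressionSet}
import org.apache.spark.sql.types.IntegerType

// Illustration only: Catalyst canonicalizes commutative operators, so a + b
// and b + a are semantically equal and would land in the same group.
object SemanticEquivalenceDemo {
  def main(args: Array[String]): Unit = {
    val a = AttributeReference("a", IntegerType)()
    val b = AttributeReference("b", IntegerType)()

    val sum1 = Add(a, b)
    val sum2 = Add(b, a)

    println(sum1.semanticEquals(sum2))           // true
    println(ExpressionSet(Seq(sum1, sum2)).size) // 1: the duplicates collapse
  }
}
```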

[GitHub] [spark] AmplabJenkins commented on pull request #37985: [SPARK-40548][BUILD] Upgrade rocksdbjni from 7.5.3 to 7.6.0

2022-09-24 Thread GitBox
AmplabJenkins commented on PR #37985: URL: https://github.com/apache/spark/pull/37985#issuecomment-1256998639 Can one of the admins verify this patch?

[GitHub] [spark] panbingkun opened a new pull request, #37985: [BUILD] Upgrade rocksdbjni from 7.5.3 to 7.6.0

2022-09-24 Thread GitBox
panbingkun opened a new pull request, #37985: URL: https://github.com/apache/spark/pull/37985 ### What changes were proposed in this pull request? This PR aims to upgrade RocksDB JNI library from 7.5.3 to 7.6.0. ### Why are the changes needed? This version brings performance

[GitHub] [spark] peter-toth commented on pull request #37982: [SPARK-38717][SQL][3.3] Handle Hive's bucket spec case preserving behaviour

2022-09-24 Thread GitBox
peter-toth commented on PR #37982: URL: https://github.com/apache/spark/pull/37982#issuecomment-1256996469 > Is the Hive metastore case-sensitivity documented somewhere, or do we have to run some code or play with Hive directly to confirm the behavior? It just came up with a query

[GitHub] [spark] github-actions[bot] closed pull request #36378: [SPARK-39022][SQL] Fix combination of HAVING and SORT not being resolved correctly

2022-09-24 Thread GitBox
github-actions[bot] closed pull request #36378: [SPARK-39022][SQL] Fix combination of HAVING and SORT not being resolved correctly URL: https://github.com/apache/spark/pull/36378

[GitHub] [spark] github-actions[bot] commented on pull request #36030: Draft: [SPARK-38715] Configurable client ID for Kafka Spark SQL producer

2022-09-24 Thread GitBox
github-actions[bot] commented on PR #36030: URL: https://github.com/apache/spark/pull/36030#issuecomment-1257089545 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #36005: [SPARK-38506][SQL] Push partial aggregation through join

2022-09-24 Thread GitBox
github-actions[bot] commented on PR #36005: URL: https://github.com/apache/spark/pull/36005#issuecomment-1257089547 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] closed pull request #36789: [SPARK-39403] Add SPARK_SUBMIT_OPTS in spark-env.sh.template

2022-09-24 Thread GitBox
github-actions[bot] closed pull request #36789: [SPARK-39403] Add SPARK_SUBMIT_OPTS in spark-env.sh.template URL: https://github.com/apache/spark/pull/36789

[GitHub] [spark] github-actions[bot] commented on pull request #35969: [SPARK-38651][SQL] Add configuration to support writing out empty schemas in supported filebased datasources

2022-09-24 Thread GitBox
github-actions[bot] commented on PR #35969: URL: https://github.com/apache/spark/pull/35969#issuecomment-1257089558 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #36052: [SPARK-38777][YARN] Add `bin/spark-submit --kill / --status` support for yarn

2022-09-24 Thread GitBox
github-actions[bot] commented on PR #36052: URL: https://github.com/apache/spark/pull/36052#issuecomment-1257089535 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #36046: [SPARK-38771][SQL] Adaptive Bloom filter Join

2022-09-24 Thread GitBox
github-actions[bot] commented on PR #36046: URL: https://github.com/apache/spark/pull/36046#issuecomment-1257089538 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #35990: [SPARK-38639][SQL] Support ignoreCorruptRecord flag to ensure querying broken sequence file table smoothly

2022-09-24 Thread GitBox
github-actions[bot] commented on PR #35990: URL: https://github.com/apache/spark/pull/35990#issuecomment-1257089554 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #35927: [WIP] Simplify the rule of auto-generated alias name

2022-09-24 Thread GitBox
github-actions[bot] commented on PR #35927: URL: https://github.com/apache/spark/pull/35927#issuecomment-1257089562 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] wangyum closed pull request #37979: [SPARK-40545][SQL][TESTS] Clean up `metastorePath` after `SparkSQLEnvSuite` execution

2022-09-24 Thread GitBox
wangyum closed pull request #37979: [SPARK-40545][SQL][TESTS] Clean up `metastorePath` after `SparkSQLEnvSuite` execution URL: https://github.com/apache/spark/pull/37979

[GitHub] [spark] wangyum commented on pull request #37979: [SPARK-40545][SQL][TESTS] Clean up `metastorePath` after `SparkSQLEnvSuite` execution

2022-09-24 Thread GitBox
wangyum commented on PR #37979: URL: https://github.com/apache/spark/pull/37979#issuecomment-1257095684 Merged to master.

[GitHub] [spark] MaxGekk commented on a diff in pull request #37986: [SPARK-40357][SQL] Migrate window type check failures onto error classes

2022-09-24 Thread GitBox
MaxGekk commented on code in PR #37986: URL: https://github.com/apache/spark/pull/37986#discussion_r979293197 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala: ## @@ -421,7 +454,12 @@ sealed abstract class

[GitHub] [spark] github-actions[bot] commented on pull request #36874: [SPARK-39475][SQL] Pull out complex join keys for shuffled join

2022-09-24 Thread GitBox
github-actions[bot] commented on PR #36874: URL: https://github.com/apache/spark/pull/36874#issuecomment-1257089507 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] closed pull request #36859: DTW: new distance measure for clustering

2022-09-24 Thread GitBox
github-actions[bot] closed pull request #36859: DTW: new distance measure for clustering URL: https://github.com/apache/spark/pull/36859

[GitHub] [spark] github-actions[bot] closed pull request #36770: [SPARK-39382][WEBUI] UI show the duration of the failed task when the executor lost

2022-09-24 Thread GitBox
github-actions[bot] closed pull request #36770: [SPARK-39382][WEBUI] UI show the duration of the failed task when the executor lost URL: https://github.com/apache/spark/pull/36770

[GitHub] [spark] github-actions[bot] closed pull request #36700: [SPARK-39318][SQL] Remove tpch-plan-stability WithStats golden files

2022-09-24 Thread GitBox
github-actions[bot] closed pull request #36700: [SPARK-39318][SQL] Remove tpch-plan-stability WithStats golden files URL: https://github.com/apache/spark/pull/36700

[GitHub] [spark] github-actions[bot] closed pull request #36658: [SPARK-39278][CORE] Fix backward compatibility of alternative configs of Hadoop Filesystems to access

2022-09-24 Thread GitBox
github-actions[bot] closed pull request #36658: [SPARK-39278][CORE] Fix backward compatibility of alternative configs of Hadoop Filesystems to access URL: https://github.com/apache/spark/pull/36658

[GitHub] [spark] github-actions[bot] closed pull request #36305: [SPARK-38987][shuffle] Handle fallback when merged shuffle blocks are corrupted and spark.shuffle.detectCorrupt is set to true

2022-09-24 Thread GitBox
github-actions[bot] closed pull request #36305: [SPARK-38987][shuffle] Handle fallback when merged shuffle blocks are corrupted and spark.shuffle.detectCorrupt is set to true URL: https://github.com/apache/spark/pull/36305

[GitHub] [spark] github-actions[bot] closed pull request #36126: [SPARK-38843][SQL] Fix translate metadata col filters

2022-09-24 Thread GitBox
github-actions[bot] closed pull request #36126: [SPARK-38843][SQL] Fix translate metadata col filters URL: https://github.com/apache/spark/pull/36126

[GitHub] [spark] github-actions[bot] commented on pull request #36088: [SPARK-38805][SHUFFLE] Automatically remove an expired indexFilePath from the ESS shuffleIndexCache or the PBS indexCache to save

2022-09-24 Thread GitBox
github-actions[bot] commented on PR #36088: URL: https://github.com/apache/spark/pull/36088#issuecomment-1257089532 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] closed pull request #36304: [SPARK-38959][SQL] DS V2: Support runtime group filtering in row-level commands

2022-09-24 Thread GitBox
github-actions[bot] closed pull request #36304: [SPARK-38959][SQL] DS V2: Support runtime group filtering in row-level commands URL: https://github.com/apache/spark/pull/36304

[GitHub] [spark] AmplabJenkins commented on pull request #37986: [SPARK-40357][SQL] Migrate window type check failures onto error classes

2022-09-24 Thread GitBox
AmplabJenkins commented on PR #37986: URL: https://github.com/apache/spark/pull/37986#issuecomment-1257107320 Can one of the admins verify this patch?

[GitHub] [spark] amaliujia commented on pull request #37982: [SPARK-38717][SQL][3.3] Handle Hive's bucket spec case preserving behaviour

2022-09-24 Thread GitBox
amaliujia commented on PR #37982: URL: https://github.com/apache/spark/pull/37982#issuecomment-1257114799 > > Is the Hive metastore case-sensitivity documented somewhere, or do we have to run some code or play with Hive directly to confirm the behavior? > > @amaliujia, it just came up

[GitHub] [spark] HyukjinKwon commented on pull request #37710: [SPARK-40448][CONNECT] Spark Connect build as Driver Plugin with Shaded Dependencies

2022-09-24 Thread GitBox
HyukjinKwon commented on PR #37710: URL: https://github.com/apache/spark/pull/37710#issuecomment-1257127356 Merged to master. I will follow up and actively work on cleanup and follow-up tasks from tomorrow.

[GitHub] [spark] HyukjinKwon closed pull request #37710: [SPARK-40448][CONNECT] Spark Connect build as Driver Plugin with Shaded Dependencies

2022-09-24 Thread GitBox
HyukjinKwon closed pull request #37710: [SPARK-40448][CONNECT] Spark Connect build as Driver Plugin with Shaded Dependencies URL: https://github.com/apache/spark/pull/37710

[GitHub] [spark] dependabot[bot] opened a new pull request, #37987: Bump protobuf from 4.21.5 to 4.21.6 in /dev

2022-09-24 Thread GitBox
dependabot[bot] opened a new pull request, #37987: URL: https://github.com/apache/spark/pull/37987 Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 4.21.5 to 4.21.6. Release notes sourced from [protobuf's releases](https://github.com/protocolbuffers/protobuf/releases).

[GitHub] [spark] HyukjinKwon closed pull request #37987: Bump protobuf from 4.21.5 to 4.21.6 in /dev

2022-09-24 Thread GitBox
HyukjinKwon closed pull request #37987: Bump protobuf from 4.21.5 to 4.21.6 in /dev URL: https://github.com/apache/spark/pull/37987

[GitHub] [spark] dependabot[bot] commented on pull request #37987: Bump protobuf from 4.21.5 to 4.21.6 in /dev

2022-09-24 Thread GitBox
dependabot[bot] commented on PR #37987: URL: https://github.com/apache/spark/pull/37987#issuecomment-1257127701 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version,