date:20221219

[GitHub] [spark] LuciferYang opened a new pull request, #39135: [SPARK-41427][UI][FOLLOWUP] Remove duplicate `getMetricValue` from `ExecutorMetricsSerializer#serialize`

2022-12-19 Thread GitBox

LuciferYang opened a new pull request, #39135: URL: https://github.com/apache/spark/pull/39135 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] LuciferYang commented on a diff in pull request #39100: [SPARK-41427][UI] Protobuf serializer for ExecutorStageSummaryWrapper

2022-12-19 Thread GitBox

LuciferYang commented on code in PR #39100: URL: https://github.com/apache/spark/pull/39100#discussion_r1053000909 ## core/src/main/scala/org/apache/spark/status/protobuf/ExecutorMetricsSerializer.scala: ## @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] rxin commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

rxin commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052995619 ## sql/core/src/test/resources/sql-tests/results/group-by-star-mosha.sql.out: ## @@ -0,0 +1,141 @@ +-- Automatically generated by SQLQueryTestSuite +-- !query +create

[GitHub] [spark] rxin commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

rxin commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052993548 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveGroupByStar.scala: ## @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] rxin commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

rxin commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052993302 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveGroupByStar.scala: ## @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] HeartSaVioR closed pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

HeartSaVioR closed pull request #38517: [SPARK-39591][SS] Async Progress Tracking URL: https://github.com/apache/spark/pull/38517 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HeartSaVioR commented on pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

HeartSaVioR commented on PR #38517: URL: https://github.com/apache/spark/pull/38517#issuecomment-1358951698 https://github.com/jerrypeng/spark/actions/runs/3737519049 Above build passed for the last commit

[GitHub] [spark] amaliujia commented on pull request #38984: [SPARK-41349][CONNECT][PYTHON] Implement DataFrame.hint

2022-12-19 Thread GitBox

amaliujia commented on PR #38984: URL: https://github.com/apache/spark/pull/38984#issuecomment-1358948232 Many thanks to @dengziming for tackling this work!

[GitHub] [spark] cloud-fan commented on a diff in pull request #39040: [SPARK-27561][SQL][FOLLOWUP] Support implicit lateral column alias resolution on Aggregate

2022-12-19 Thread GitBox

cloud-fan commented on code in PR #39040: URL: https://github.com/apache/spark/pull/39040#discussion_r1052986240 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveLateralColumnAlias.scala: ## @@ -244,6 +303,61 @@ object

[GitHub] [spark] EnricoMi commented on pull request #39131: [SPARK-41162][SQL] Fix anti- and semi-join for self-join with aggregations

2022-12-19 Thread GitBox

EnricoMi commented on PR #39131: URL: https://github.com/apache/spark/pull/39131#issuecomment-1358939935 @cloud-fan I think this is a better approach to fix correctness bug SPARK-41162 than #38676.

[GitHub] [spark-docker] Yikun commented on pull request #27: [SPARK-40513] Add support to generate DOI mainifest

2022-12-19 Thread GitBox

Yikun commented on PR #27: URL: https://github.com/apache/spark-docker/pull/27#issuecomment-1358933985 cc @HyukjinKwon

[GitHub] [spark-docker] Yikun opened a new pull request, #27: [SPARK-40513] Add support to generate DOI mainifest

2022-12-19 Thread GitBox

Yikun opened a new pull request, #27: URL: https://github.com/apache/spark-docker/pull/27 ### What changes were proposed in this pull request? This patch adds support for generating the DOI manifest from versions.json. ### Why are the changes needed? To help generate DOI manifest

[GitHub] [spark] cloud-fan commented on pull request #39054: [SPARK-27561][SQL][FOLLOWUP] Move the two rules for Later column alias into one file

2022-12-19 Thread GitBox

cloud-fan commented on PR #39054: URL: https://github.com/apache/spark/pull/39054#issuecomment-1358931368 Reverted at https://github.com/apache/spark/commit/52082d3906bc3813cd3ff4447f7c75beb4f28612

[GitHub] [spark] cloud-fan commented on pull request #39054: [SPARK-27561][SQL][FOLLOWUP] Move the two rules for Later column alias into one file

2022-12-19 Thread GitBox

cloud-fan commented on PR #39054: URL: https://github.com/apache/spark/pull/39054#issuecomment-1358929659 Since this PR introduced a regression (case insensitive problem) and it's actually not necessary after my refactor https://github.com/apache/spark/pull/3 , I'm reverting it.

[GitHub] [spark] cloud-fan commented on a diff in pull request #39040: [SPARK-27561][SQL][FOLLOWUP] Support implicit lateral column alias resolution on Aggregate

2022-12-19 Thread GitBox

cloud-fan commented on code in PR #39040: URL: https://github.com/apache/spark/pull/39040#discussion_r1052973933 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -182,6 +183,157 @@ object AnalysisContext { } } +object Analyzer

[GitHub] [spark] venkyvb commented on pull request #37738: add Support Java Class with circular references

2022-12-19 Thread GitBox

venkyvb commented on PR #37738: URL: https://github.com/apache/spark/pull/37738#issuecomment-1358924659 Hey all, Wondering if this PR (or some similar fix) got merged. I have similar issues with circular references and it would be great to have an option to skip the check. Thanks.

[GitHub] [spark] jerrypeng commented on a diff in pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

jerrypeng commented on code in PR #38517: URL: https://github.com/apache/spark/pull/38517#discussion_r1052964642 ## sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecutionSuite.scala: ## @@ -78,7 +72,9 @@ class

[GitHub] [spark] jerrypeng commented on a diff in pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

jerrypeng commented on code in PR #38517: URL: https://github.com/apache/spark/pull/38517#discussion_r1052963923 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecution.scala: ## @@ -275,7 +298,7 @@ object

[GitHub] [spark] jerrypeng commented on a diff in pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

jerrypeng commented on code in PR #38517: URL: https://github.com/apache/spark/pull/38517#discussion_r1052964187 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecution.scala: ## @@ -157,8 +172,17 @@ class

[GitHub] [spark] jerrypeng commented on a diff in pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

jerrypeng commented on code in PR #38517: URL: https://github.com/apache/spark/pull/38517#discussion_r1052963344 ## sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecutionSuite.scala: ## @@ -1762,4 +1344,173 @@ class

[GitHub] [spark] jerrypeng commented on a diff in pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

jerrypeng commented on code in PR #38517: URL: https://github.com/apache/spark/pull/38517#discussion_r1052960918 ## sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecutionSuite.scala: ## @@ -68,11 +64,42 @@ class

[GitHub] [spark] mridulm commented on pull request #39131: [SPARK-41162][SQL] Fix anti- and semi-join for self-join with aggregations

2022-12-19 Thread GitBox

mridulm commented on PR #39131: URL: https://github.com/apache/spark/pull/39131#issuecomment-1358901058 +CC @shardulm94

[GitHub] [spark] HeartSaVioR commented on pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

HeartSaVioR commented on PR #38517: URL: https://github.com/apache/spark/pull/38517#issuecomment-1358900740 Let me give +1 once the builds pass rather than waiting for all minor/nit comments to be addressed. We can deal with them in a follow-up PR.

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

HeartSaVioR commented on code in PR #38517: URL: https://github.com/apache/spark/pull/38517#discussion_r1052937189 ## sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecutionSuite.scala: ## @@ -68,11 +64,42 @@ class

[GitHub] [spark] HeartSaVioR commented on pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

HeartSaVioR commented on PR #38517: URL: https://github.com/apache/spark/pull/38517#issuecomment-1358899601 (You can ignore outdated comments, since I mistakenly looked at only the two most recent commits and may have left some comments that apply only to an old commit.)

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

HeartSaVioR commented on code in PR #38517: URL: https://github.com/apache/spark/pull/38517#discussion_r1052941322 ## sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecutionSuite.scala: ## @@ -0,0 +1,1865 @@ +/* + * Licensed to

[GitHub] [spark] rxin commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

rxin commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052933954 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveGroupByStar.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

HeartSaVioR commented on code in PR #38517: URL: https://github.com/apache/spark/pull/38517#discussion_r1052931956 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecution.scala: ## @@ -275,7 +298,7 @@ object

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

HeartSaVioR commented on code in PR #38517: URL: https://github.com/apache/spark/pull/38517#discussion_r1052927996 ## sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecutionSuite.scala: ## @@ -78,7 +72,9 @@ class

[GitHub] [spark] infoankitp commented on pull request #38865: [SPARK-41232][SQL][PYTHON] Adding array_append function

2022-12-19 Thread GitBox

infoankitp commented on PR #38865: URL: https://github.com/apache/spark/pull/38865#issuecomment-1358869906 @beliefer @LuciferYang Friendly ping! Please review the changes.

[GitHub] [spark] rxin commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

rxin commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052926462 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveGroupByStar.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] gengliangwang commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

gengliangwang commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052920528 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveGroupByStar.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] gengliangwang commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

gengliangwang commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052920262 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveGroupByStar.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] gengliangwang commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

gengliangwang commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052919912 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveGroupByStar.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] gengliangwang commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

gengliangwang commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052919759 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveGroupByStar.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] gengliangwang commented on pull request #39104: [SPARK-41425][UI] Protobuf serializer for RDDStorageInfoWrapper

2022-12-19 Thread GitBox

gengliangwang commented on PR #39104: URL: https://github.com/apache/spark/pull/39104#issuecomment-1358850152 Thanks, merging to master

[GitHub] [spark] gengliangwang closed pull request #39104: [SPARK-41425][UI] Protobuf serializer for RDDStorageInfoWrapper

2022-12-19 Thread GitBox

gengliangwang closed pull request #39104: [SPARK-41425][UI] Protobuf serializer for RDDStorageInfoWrapper URL: https://github.com/apache/spark/pull/39104

[GitHub] [spark] rxin commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

rxin commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052911704 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveGroupByStar.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] gengliangwang closed pull request #39100: [SPARK-41427][UI] Protobuf serializer for ExecutorStageSummaryWrapper

2022-12-19 Thread GitBox

gengliangwang closed pull request #39100: [SPARK-41427][UI] Protobuf serializer for ExecutorStageSummaryWrapper URL: https://github.com/apache/spark/pull/39100

[GitHub] [spark] cloud-fan commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

cloud-fan commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052911386 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveGroupByStar.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] gengliangwang commented on pull request #39100: [SPARK-41427][UI] Protobuf serializer for ExecutorStageSummaryWrapper

2022-12-19 Thread GitBox

gengliangwang commented on PR #39100: URL: https://github.com/apache/spark/pull/39100#issuecomment-1358848179 Merging to master

[GitHub] [spark] cloud-fan commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

cloud-fan commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052911008 ## sql/core/src/test/resources/sql-tests/results/group-by-star-mosha.sql.out: ## @@ -0,0 +1,141 @@ +-- Automatically generated by SQLQueryTestSuite +-- !query

[GitHub] [spark] rxin commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

rxin commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052910941 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveGroupByStar.scala: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] rxin commented on a diff in pull request #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

rxin commented on code in PR #39134: URL: https://github.com/apache/spark/pull/39134#discussion_r1052909245 ## sql/core/src/test/resources/sql-tests/inputs/group-by-star.sql: ## @@ -0,0 +1,45 @@ +-- group by all Review Comment: do we need a test case for window functions?

[GitHub] [spark] rxin opened a new pull request, #39134: [WIP] Implement group by star (aka group by all)

2022-12-19 Thread GitBox

rxin opened a new pull request, #39134: URL: https://github.com/apache/spark/pull/39134 ### What changes were proposed in this pull request? This patch implements group by star. This is similar to the "group by all" implemented in DuckDB. Note that I'm not done yet. We need to decide if

[GitHub] [spark] jerrypeng commented on pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

jerrypeng commented on PR #38517: URL: https://github.com/apache/spark/pull/38517#issuecomment-1358828963 @HeartSaVioR I have addressed your comments; please take another look.

[GitHub] [spark] zhengruifeng commented on pull request #38984: [SPARK-41349][CONNECT][PYTHON] Implement DataFrame.hint

2022-12-19 Thread GitBox

zhengruifeng commented on PR #38984: URL: https://github.com/apache/spark/pull/38984#issuecomment-1358823132 merged into master, thank you @dengziming for working on it!

[GitHub] [spark] zhengruifeng closed pull request #38984: [SPARK-41349][CONNECT][PYTHON] Implement DataFrame.hint

2022-12-19 Thread GitBox

zhengruifeng closed pull request #38984: [SPARK-41349][CONNECT][PYTHON] Implement DataFrame.hint URL: https://github.com/apache/spark/pull/38984

[GitHub] [spark] itholic commented on pull request #39128: [SPARK-41586][PYTHON] Introduce new PySpark package: `pyspark.errors` and error classes.

2022-12-19 Thread GitBox

itholic commented on PR #39128: URL: https://github.com/apache/spark/pull/39128#issuecomment-1358810969 Let me close it for now, and re-create the PR to change the logic to re-use JVM.

[GitHub] [spark] itholic closed pull request #39128: [SPARK-41586][PYTHON] Introduce new PySpark package: `pyspark.errors` and error classes.

2022-12-19 Thread GitBox

itholic closed pull request #39128: [SPARK-41586][PYTHON] Introduce new PySpark package: `pyspark.errors` and error classes. URL: https://github.com/apache/spark/pull/39128

[GitHub] [spark] LuciferYang commented on pull request #39124: [DON'T MERGE] Test build and test with hadoop 3.3.5-RC0

2022-12-19 Thread GitBox

LuciferYang commented on PR #39124: URL: https://github.com/apache/spark/pull/39124#issuecomment-1358800857 also cc @wangyum

[GitHub] [spark] LuciferYang commented on pull request #39124: [DON'T MERGE] Test build and test with hadoop 3.3.5-RC0

2022-12-19 Thread GitBox

LuciferYang commented on PR #39124: URL: https://github.com/apache/spark/pull/39124#issuecomment-1358794388 Many tests failed as follows: ``` 2022-12-20T03:15:37.0609530Z [info] org.apache.spark.sql.hive.execution.command.AlterTableAddColumnsSuite *** ABORTED *** (28 milliseconds)

[GitHub] [spark] amaliujia commented on a diff in pull request #39078: [SPARK-41534][CONNECT][SQL] Setup initial client module for Spark Connect

2022-12-19 Thread GitBox

amaliujia commented on code in PR #39078: URL: https://github.com/apache/spark/pull/39078#discussion_r1052834300 ## connector/connect/client/src/main/scala/org/apache/spark/sql/connect/client/SparkSession.scala: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] for08 commented on pull request #39111: [MINOR] Fix some typos

2022-12-19 Thread GitBox

for08 commented on PR #39111: URL: https://github.com/apache/spark/pull/39111#issuecomment-1358783259 I reopened this PR following the suggestions of srowen and bjornjorgensen. As a beginner, I will continue to learn and use Spark, not just fix more typos.

[GitHub] [spark] for08 commented on pull request #39111: [MINOR] Fix some typos

2022-12-19 Thread GitBox

for08 commented on PR #39111: URL: https://github.com/apache/spark/pull/39111#issuecomment-1358782929 I reopen this PR according to the suggestions of srowen and bjornjorgensen. At 2022-12-19 11:30:19, "UCB AMPLab"

[GitHub] [spark] beliefer commented on a diff in pull request #39084: [SPARK-41464][CONNECT][PYTHON] Implement `DataFrame.to`

2022-12-19 Thread GitBox

beliefer commented on code in PR #39084: URL: https://github.com/apache/spark/pull/39084#discussion_r1052820825 ## python/pyspark/sql/tests/connect/test_connect_basic.py: ## @@ -389,6 +389,21 @@ def test_schema(self): self.connect.sql(query).schema.__repr__(),

[GitHub] [spark] rangadi commented on pull request #38922: [SPARK-41396][SQL][PROTOBUF] OneOf field support and recursion checks

2022-12-19 Thread GitBox

rangadi commented on PR #38922: URL: https://github.com/apache/spark/pull/38922#issuecomment-1358763884 Asking @HeartSaVioR to take a quick look to approve. @cloud-fan take a look at the updated PR description for an example of how the Spark schema would look with the different setting

[GitHub] [spark] HyukjinKwon commented on pull request #39041: [SPARK-41528][CONNECT] Merge namespace of Spark Connect and PySpark API

2022-12-19 Thread GitBox

HyukjinKwon commented on PR #39041: URL: https://github.com/apache/spark/pull/39041#issuecomment-1358746938 Build: https://github.com/HyukjinKwon/spark/runs/10178697287

[GitHub] [spark] LuciferYang commented on a diff in pull request #39125: [SPARK-41584][BUILD] Upgrade RoaringBitmap to 0.9.36

2022-12-19 Thread GitBox

LuciferYang commented on code in PR #39125: URL: https://github.com/apache/spark/pull/39125#discussion_r1052806208 ## core/benchmarks/MapStatusesConvertBenchmark-results.txt: ## @@ -2,12 +2,12 @@ MapStatuses Convert Benchmark

[GitHub] [spark] hvanhovell commented on a diff in pull request #39078: [SPARK-41534][CONNECT][SQL] Setup initial client module for Spark Connect

2022-12-19 Thread GitBox

hvanhovell commented on code in PR #39078: URL: https://github.com/apache/spark/pull/39078#discussion_r1052791191 ## connector/connect/client/src/main/scala/org/apache/spark/sql/connect/client/SparkSession.scala: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] hvanhovell commented on a diff in pull request #39078: [SPARK-41534][CONNECT][SQL] Setup initial client module for Spark Connect

2022-12-19 Thread GitBox

hvanhovell commented on code in PR #39078: URL: https://github.com/apache/spark/pull/39078#discussion_r1052790753 ## connector/connect/client/src/main/scala/org/apache/spark/sql/connect/client/SparkSession.scala: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] allisonwang-db opened a new pull request, #39133: [SPARK-41595][SQL] Support generator function explode/explode_outer in the FROM clause

2022-12-19 Thread GitBox

allisonwang-db opened a new pull request, #39133: URL: https://github.com/apache/spark/pull/39133 ### What changes were proposed in this pull request? This PR supports using table-valued generator functions in the FROM clause of a query. A generator function can be registered in

[GitHub] [spark] HeartSaVioR closed pull request #39132: [MINOR][DOC] Fix for Kafka Consumer Config Link

2022-12-19 Thread GitBox

HeartSaVioR closed pull request #39132: [MINOR][DOC] Fix for Kafka Consumer Config Link URL: https://github.com/apache/spark/pull/39132

[GitHub] [spark] HeartSaVioR commented on pull request #39132: [MINOR][DOC] Fix for Kafka Consumer Config Link

2022-12-19 Thread GitBox

HeartSaVioR commented on PR #39132: URL: https://github.com/apache/spark/pull/39132#issuecomment-1358709546 Thanks! Merging to master. (It's just a small doc change so won't wait for CI build.)

[GitHub] [spark] zhengruifeng commented on a diff in pull request #39068: [SPARK-41434][CONNECT][PYTHON] Initial `LambdaFunction` implementation

2022-12-19 Thread GitBox

zhengruifeng commented on code in PR #39068: URL: https://github.com/apache/spark/pull/39068#discussion_r1052762202 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -534,6 +536,36 @@ class

[GitHub] [spark] WweiL opened a new pull request, #39132: [MINOR][DOC] Fix for Kafka Consumer Config Link

2022-12-19 Thread GitBox

WweiL opened a new pull request, #39132: URL: https://github.com/apache/spark/pull/39132 ### What changes were proposed in this pull request? Correct the redirect link for the Kafka consumer config. Previously it pointed to the top of the page; now it redirects to the correct

[GitHub] [spark] HyukjinKwon closed pull request #39117: [SPARK-41535][SQL] Set null correctly for calendar interval fields in `InterpretedUnsafeProjection` and `InterpretedMutableProjection`

2022-12-19 Thread GitBox

HyukjinKwon closed pull request #39117: [SPARK-41535][SQL] Set null correctly for calendar interval fields in `InterpretedUnsafeProjection` and `InterpretedMutableProjection` URL: https://github.com/apache/spark/pull/39117

[GitHub] [spark] HyukjinKwon commented on pull request #39117: [SPARK-41535][SQL] Set null correctly for calendar interval fields in `InterpretedUnsafeProjection` and `InterpretedMutableProjection`

2022-12-19 Thread GitBox

HyukjinKwon commented on PR #39117: URL: https://github.com/apache/spark/pull/39117#issuecomment-1358675707 Merged to master, branch-3.3, and branch-3.2.

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39128: [SPARK-41586][PYTHON] Introduce new PySpark package: `pyspark.errors` and error classes.

2022-12-19 Thread GitBox

HyukjinKwon commented on code in PR #39128: URL: https://github.com/apache/spark/pull/39128#discussion_r1052752065 ## python/pyspark/sql/functions.py: ## @@ -8122,15 +8130,13 @@ def _get_lambda_parameters(f: Callable) -> ValuesView[inspect.Parameter]: # Validate that

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39128: [SPARK-41586][PYTHON] Introduce new PySpark package: `pyspark.errors` and error classes.

2022-12-19 Thread GitBox

HyukjinKwon commented on code in PR #39128: URL: https://github.com/apache/spark/pull/39128#discussion_r1052751875 ## python/pyspark/testing/utils.py: ## @@ -138,6 +140,32 @@ def setUpClass(cls): def tearDownClass(cls): cls.sc.stop() +def checkError( Review

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39128: [SPARK-41586][PYTHON] Introduce new PySpark package: `pyspark.errors` and error classes.

2022-12-19 Thread GitBox

HyukjinKwon commented on code in PR #39128: URL: https://github.com/apache/spark/pull/39128#discussion_r1052751697 ## python/pyspark/errors/error_classes.py: ## @@ -0,0 +1,30 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39128: [SPARK-41586][PYTHON] Introduce new PySpark package: `pyspark.errors` and error classes.

2022-12-19 Thread GitBox

HyukjinKwon commented on code in PR #39128: URL: https://github.com/apache/spark/pull/39128#discussion_r1052751097 ## python/pyspark/errors/__init__.py: ## @@ -0,0 +1,140 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39128: [SPARK-41586][PYTHON] Introduce new PySpark package: `pyspark.errors` and error classes.

2022-12-19 Thread GitBox

HyukjinKwon commented on code in PR #39128: URL: https://github.com/apache/spark/pull/39128#discussion_r1052750600 ## python/pyspark/errors/__init__.py: ## @@ -0,0 +1,140 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license

[GitHub] [spark] HyukjinKwon commented on pull request #39129: [SPARK-41587][BUILD] Upgrade `org.scalatestplus:selenium-4-4` to `org.scalatestplus:selenium-4-7`

2022-12-19 Thread GitBox

HyukjinKwon commented on PR #39129: URL: https://github.com/apache/spark/pull/39129#issuecomment-1358669841 cc @sarutak FYI -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] github-actions[bot] closed pull request #37831: [SPARK-40354][SQL] Support eliminate dynamic partition for datasource v1 writes

2022-12-19 Thread GitBox

github-actions[bot] closed pull request #37831: [SPARK-40354][SQL] Support eliminate dynamic partition for datasource v1 writes URL: https://github.com/apache/spark/pull/37831 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] HyukjinKwon commented on pull request #39041: [SPARK-41528][CONNECT] Merge namespace of Spark Connect and PySpark API

2022-12-19 Thread GitBox

HyukjinKwon commented on PR #39041: URL: https://github.com/apache/spark/pull/39041#issuecomment-1358666500 Let me get this in in a few days if there are no more comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] srielau commented on a diff in pull request #38861: [SPARK-41294][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1203 / 1168

2022-12-19 Thread GitBox

srielau commented on code in PR #38861: URL: https://github.com/apache/spark/pull/38861#discussion_r1052745779 ## sql/core/src/test/resources/sql-tests/results/postgreSQL/numeric.sql.out: ## @@ -3831,12 +3831,12 @@ struct<> -- !query output

[GitHub] [spark] HyukjinKwon closed pull request #39123: [SPARK-41583][CONNECT][PROTOBUF] Add Spark Connect and protobuf into setup.py with specifying dependencies

2022-12-19 Thread GitBox

HyukjinKwon closed pull request #39123: [SPARK-41583][CONNECT][PROTOBUF] Add Spark Connect and protobuf into setup.py with specifying dependencies URL: https://github.com/apache/spark/pull/39123 -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] HyukjinKwon commented on pull request #39123: [SPARK-41583][CONNECT][PROTOBUF] Add Spark Connect and protobuf into setup.py with specifying dependencies

2022-12-19 Thread GitBox

HyukjinKwon commented on PR #39123: URL: https://github.com/apache/spark/pull/39123#issuecomment-1358655672 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39123: [SPARK-41583][CONNECT][PROTOBUF] Add Spark Connect and protobuf into setup.py with specifying dependencies

2022-12-19 Thread GitBox

HyukjinKwon commented on code in PR #39123: URL: https://github.com/apache/spark/pull/39123#discussion_r1052742314 ## python/setup.py: ## @@ -113,6 +113,7 @@ def _supports_symlinks(): # Also don't forget to update python/docs/source/getting_started/install.rst.

[GitHub] [spark] gengliangwang commented on a diff in pull request #39104: [SPARK-41425][UI] Protobuf serializer for RDDStorageInfoWrapper

2022-12-19 Thread GitBox

gengliangwang commented on code in PR #39104: URL: https://github.com/apache/spark/pull/39104#discussion_r1052742075 ## core/src/main/scala/org/apache/spark/status/protobuf/RDDStorageInfoWrapperSerializer.scala: ## @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] anchovYu commented on a diff in pull request #39040: [SPARK-27561][SQL][FOLLOWUP] Support implicit lateral column alias resolution on Aggregate

2022-12-19 Thread GitBox

anchovYu commented on code in PR #39040: URL: https://github.com/apache/spark/pull/39040#discussion_r1052738997 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveLateralColumnAlias.scala: ## @@ -244,7 +303,67 @@ object

[GitHub] [spark] gengliangwang commented on a diff in pull request #39040: [SPARK-27561][SQL][FOLLOWUP] Support implicit lateral column alias resolution on Aggregate

2022-12-19 Thread GitBox

gengliangwang commented on code in PR #39040: URL: https://github.com/apache/spark/pull/39040#discussion_r1052735930 ## sql/core/src/test/scala/org/apache/spark/sql/LateralColumnAliasSuite.scala: ## @@ -689,4 +713,38 @@ class LateralColumnAliasSuite extends

[GitHub] [spark] gengliangwang commented on a diff in pull request #39040: [SPARK-27561][SQL][FOLLOWUP] Support implicit lateral column alias resolution on Aggregate

2022-12-19 Thread GitBox

gengliangwang commented on code in PR #39040: URL: https://github.com/apache/spark/pull/39040#discussion_r1052735150 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveLateralColumnAlias.scala: ## @@ -244,7 +303,67 @@ object

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

HeartSaVioR commented on code in PR #38517: URL: https://github.com/apache/spark/pull/38517#discussion_r1052728943 ## sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecutionSuite.scala: ## @@ -0,0 +1,1865 @@ +/* + * Licensed to

[GitHub] [spark] amaliujia commented on a diff in pull request #39068: [SPARK-41434][CONNECT][PYTHON] Initial `LambdaFunction` implementation

2022-12-19 Thread GitBox

amaliujia commented on code in PR #39068: URL: https://github.com/apache/spark/pull/39068#discussion_r1052727913 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -534,6 +536,36 @@ class SparkConnectPlanner(session:

[GitHub] [spark] amaliujia commented on a diff in pull request #39078: [SPARK-41534][CONNECT][SQL] Setup initial client module for Spark Connect

2022-12-19 Thread GitBox

amaliujia commented on code in PR #39078: URL: https://github.com/apache/spark/pull/39078#discussion_r1052725981 ## connector/connect/client/src/main/scala/org/apache/spark/sql/connect/client/SparkSession.scala: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] amaliujia commented on a diff in pull request #39078: [SPARK-41534][CONNECT][SQL] Setup initial client module for Spark Connect

2022-12-19 Thread GitBox

amaliujia commented on code in PR #39078: URL: https://github.com/apache/spark/pull/39078#discussion_r1052725694 ## connector/connect/client/src/main/scala/org/apache/spark/sql/connect/client/SparkSession.scala: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] techaddict commented on a diff in pull request #39104: [SPARK-41425][UI] Protobuf serializer for RDDStorageInfoWrapper

2022-12-19 Thread GitBox

techaddict commented on code in PR #39104: URL: https://github.com/apache/spark/pull/39104#discussion_r1052723531 ## core/src/main/scala/org/apache/spark/status/protobuf/KVStoreProtobufSerializer.scala: ## @@ -17,7 +17,7 @@ package org.apache.spark.status.protobuf -import

[GitHub] [spark] techaddict commented on pull request #39104: [SPARK-41425][UI] Protobuf serializer for RDDStorageInfoWrapper

2022-12-19 Thread GitBox

techaddict commented on PR #39104: URL: https://github.com/apache/spark/pull/39104#issuecomment-1358568558 @gengliangwang addressed comments -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] techaddict commented on a diff in pull request #39104: [SPARK-41425][UI] Protobuf serializer for RDDStorageInfoWrapper

2022-12-19 Thread GitBox

techaddict commented on code in PR #39104: URL: https://github.com/apache/spark/pull/39104#discussion_r1052723424 ## core/src/test/scala/org/apache/spark/status/protobuf/KVStoreProtobufSerializerSuite.scala: ## @@ -21,8 +21,8 @@ import java.util.Date import

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

HeartSaVioR commented on code in PR #38517: URL: https://github.com/apache/spark/pull/38517#discussion_r1052722369 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala: ## @@ -342,17 +342,14 @@ class MicroBatchExecution(

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

HeartSaVioR commented on code in PR #38517: URL: https://github.com/apache/spark/pull/38517#discussion_r1052721653 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala: ## @@ -342,17 +342,14 @@ class MicroBatchExecution(

[GitHub] [spark] gengliangwang commented on a diff in pull request #39104: [SPARK-41425][UI] Protobuf serializer for RDDStorageInfoWrapper

2022-12-19 Thread GitBox

gengliangwang commented on code in PR #39104: URL: https://github.com/apache/spark/pull/39104#discussion_r1052719961 ## core/src/test/scala/org/apache/spark/status/protobuf/KVStoreProtobufSerializerSuite.scala: ## @@ -21,8 +21,8 @@ import java.util.Date import

[GitHub] [spark] gengliangwang commented on a diff in pull request #39104: [SPARK-41425][UI] Protobuf serializer for RDDStorageInfoWrapper

2022-12-19 Thread GitBox

gengliangwang commented on code in PR #39104: URL: https://github.com/apache/spark/pull/39104#discussion_r1052719752 ## core/src/main/scala/org/apache/spark/status/protobuf/KVStoreProtobufSerializer.scala: ## @@ -17,7 +17,7 @@ package org.apache.spark.status.protobuf

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38517: [SPARK-39591][SS] Async Progress Tracking

2022-12-19 Thread GitBox

HeartSaVioR commented on code in PR #38517: URL: https://github.com/apache/spark/pull/38517#discussion_r1052719778 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecution.scala: ## @@ -0,0 +1,282 @@ +/* + * Licensed to the

[GitHub] [spark] gengliangwang closed pull request #39120: [SPARK-41588][SQL] Make "rule id not found" error slightly easier to debug.

2022-12-19 Thread GitBox

gengliangwang closed pull request #39120: [SPARK-41588][SQL] Make "rule id not found" error slightly easier to debug. URL: https://github.com/apache/spark/pull/39120 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the
