date:20230608

[GitHub] [spark] TongWei1105 closed pull request #41513: [SPARK-44007][SQL] Unresolved hint cause query failure

2023-06-08 Thread via GitHub

TongWei1105 closed pull request #41513: [SPARK-44007][SQL] Unresolved hint cause query failure URL: https://github.com/apache/spark/pull/41513 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] LuciferYang commented on a diff in pull request #41529: [SPARK-43988][INFRA] Add maven testing GitHub Action task for connect client module

2023-06-08 Thread via GitHub

LuciferYang commented on code in PR #41529: URL: https://github.com/apache/spark/pull/41529#discussion_r1223872778 ## .github/workflows/build_and_test.yml: ## @@ -728,6 +729,83 @@ jobs: ./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pmesos -Pkubernetes -Pvolcano

[GitHub] [spark] pan3793 commented on a diff in pull request #41448: [SPARK-43885][SQL] DataSource V2: Handle MERGE commands for delta-based sources

2023-06-08 Thread via GitHub

pan3793 commented on code in PR #41448: URL: https://github.com/apache/spark/pull/41448#discussion_r1223871226 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/MergeRowsExec.scala: ## @@ -0,0 +1,216 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] aokolnychyi commented on pull request #41448: [SPARK-43885][SQL] DataSource V2: Handle MERGE commands for delta-based sources

2023-06-08 Thread via GitHub

aokolnychyi commented on PR #41448: URL: https://github.com/apache/spark/pull/41448#issuecomment-1584012053 Also created SPARK-44013 to add a benchmark. Will be used to measure the impact of adding codegen later.

[GitHub] [spark] aokolnychyi commented on pull request #41448: [SPARK-43885][SQL] DataSource V2: Handle MERGE commands for delta-based sources

2023-06-08 Thread via GitHub

aokolnychyi commented on PR #41448: URL: https://github.com/apache/spark/pull/41448#issuecomment-1584010899 Fixed, tested `SparkThrowableSuite` locally.

[GitHub] [spark] LuciferYang commented on pull request #41529: [SPARK-43988][INFRA] Add maven testing GitHub Action task for connect client module

2023-06-08 Thread via GitHub

LuciferYang commented on PR #41529: URL: https://github.com/apache/spark/pull/41529#issuecomment-1584010670 wait https://github.com/apache/spark/pull/41487

[GitHub] [spark] LuciferYang opened a new pull request, #41529: [SPARK-43988][INFRA] Add maven testing GitHub Action task for connect client module

2023-06-08 Thread via GitHub

LuciferYang opened a new pull request, #41529: URL: https://github.com/apache/spark/pull/41529 ### What changes were proposed in this pull request? This PR aims to add a Maven testing job on GitHub Actions for the `connect-client-jvm` module. ### Why are the changes needed?

[GitHub] [spark] aokolnychyi commented on a diff in pull request #41448: [SPARK-43885][SQL] DataSource V2: Handle MERGE commands for delta-based sources

2023-06-08 Thread via GitHub

aokolnychyi commented on code in PR #41448: URL: https://github.com/apache/spark/pull/41448#discussion_r1223867780 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/MergeRowsExec.scala: ## @@ -0,0 +1,216 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] aokolnychyi commented on a diff in pull request #41448: [SPARK-43885][SQL] DataSource V2: Handle MERGE commands for delta-based sources

2023-06-08 Thread via GitHub

aokolnychyi commented on code in PR #41448: URL: https://github.com/apache/spark/pull/41448#discussion_r1223868149 ## core/src/main/resources/error/error-classes.json: ## @@ -1539,6 +1539,13 @@ "Parse Mode: . To process malformed records as null result, try setting the

[GitHub] [spark] LuciferYang commented on pull request #41487: [SPARK-43648][CONNECT][TESTS] Move `interrupt all` related test to a new test file to pass Maven test

2023-06-08 Thread via GitHub

LuciferYang commented on PR #41487: URL: https://github.com/apache/spark/pull/41487#issuecomment-1584008688 @dongjoon-hyun @vicennial @juliuszsompolski Do you have any other suggestions for this PR? Can we merge this one first? This would let the Maven test pass first, and if there is a

[GitHub] [spark] LuciferYang commented on a diff in pull request #41516: [SPARK-43932][SQL][PYTHON][CONNECT] Add `current` like functions to Scala and Python

2023-06-08 Thread via GitHub

LuciferYang commented on code in PR #41516: URL: https://github.com/apache/spark/pull/41516#discussion_r1223861231 ## sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala: ## @@ -5570,6 +5570,21 @@ class DataFrameFunctionsSuite extends QueryTest with

[GitHub] [spark] zeruibao commented on a diff in pull request #41521: [SPARK-43380][SQL] Fix conversion of Avro logical timestamp type to Long

2023-06-08 Thread via GitHub

zeruibao commented on code in PR #41521: URL: https://github.com/apache/spark/pull/41521#discussion_r1223853665 ## connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala: ## @@ -158,7 +158,7 @@ private[sql] class AvroDeserializer( } case

[GitHub] [spark] cloud-fan commented on a diff in pull request #41521: [SPARK-43380][SQL] Fix conversion of Avro logical timestamp type to Long

2023-06-08 Thread via GitHub

cloud-fan commented on code in PR #41521: URL: https://github.com/apache/spark/pull/41521#discussion_r1223844129 ## connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala: ## @@ -158,7 +158,7 @@ private[sql] class AvroDeserializer( } case

[GitHub] [spark] itholic opened a new pull request, #41528: [SPARK-43610][CONNECT][PS] Enable `InternalFrame.attach_distributed_column` in Spark Connect.

2023-06-08 Thread via GitHub

itholic opened a new pull request, #41528: URL: https://github.com/apache/spark/pull/41528 ### What changes were proposed in this pull request? This PR proposes to enable `InternalFrame.attach_distributed_column` in Spark Connect. ### Why are the changes needed? To

[GitHub] [spark] HyukjinKwon closed pull request #41522: [SPARK-44010][PYTHON][SS][MINOR] Python StreamingQueryProgress rowsPerSecond type fix

2023-06-08 Thread via GitHub

HyukjinKwon closed pull request #41522: [SPARK-44010][PYTHON][SS][MINOR] Python StreamingQueryProgress rowsPerSecond type fix URL: https://github.com/apache/spark/pull/41522

[GitHub] [spark] HyukjinKwon commented on pull request #41522: [SPARK-44010][PYTHON][SS][MINOR] Python StreamingQueryProgress rowsPerSecond type fix

2023-06-08 Thread via GitHub

HyukjinKwon commented on PR #41522: URL: https://github.com/apache/spark/pull/41522#issuecomment-1583946638 Merged to master.

[GitHub] [spark] jdesjean commented on a diff in pull request #41443: [SPARK-43923][CONNECT] Post listenerBus events during ExecutePlanRequest

2023-06-08 Thread via GitHub

jdesjean commented on code in PR #41443: URL: https://github.com/apache/spark/pull/41443#discussion_r1223820830 ## connector/connect/server/src/test/scala/org/apache/spark/sql/connect/service/EventsSuite.scala: ## @@ -0,0 +1,246 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] wangyum commented on a diff in pull request #41513: [SPARK-44007][SQL] Unresolved hint cause query failure

2023-06-08 Thread via GitHub

wangyum commented on code in PR #41513: URL: https://github.com/apache/spark/pull/41513#discussion_r1223820020 ## sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala: ## @@ -4683,6 +4683,27 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with

[GitHub] [spark] jdesjean commented on a diff in pull request #41443: [SPARK-43923][CONNECT] Post listenerBus events during ExecutePlanRequest

2023-06-08 Thread via GitHub

jdesjean commented on code in PR #41443: URL: https://github.com/apache/spark/pull/41443#discussion_r1223819874 ## connector/connect/server/src/test/scala/org/apache/spark/sql/connect/service/ExecutePlanHolderSuite.scala: ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache

[GitHub] [spark] beliefer closed pull request #41464: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

2023-06-08 Thread via GitHub

beliefer closed pull request #41464: [SPARK-43879][CONNECT] Decouple handle command and send response on server side URL: https://github.com/apache/spark/pull/41464

[GitHub] [spark] beliefer opened a new pull request, #41527: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

2023-06-08 Thread via GitHub

beliefer opened a new pull request, #41527: URL: https://github.com/apache/spark/pull/41527 ### What changes were proposed in this pull request? `SparkConnectStreamHandler` handles the proto requests from the connect client and sends the responses back to the connect client.

[GitHub] [spark] zhengruifeng opened a new pull request, #41526: [SPARK-43943][SPARK-43935][SPARK-43930][DOCS][FOLLOW-UP] Add missing `versionadded` annotations

2023-06-08 Thread via GitHub

zhengruifeng opened a new pull request, #41526: URL: https://github.com/apache/spark/pull/41526 ### What changes were proposed in this pull request? Add missing `versionadded` annotations ### Why are the changes needed? for better doc ### Does this PR introduce

[GitHub] [spark] zhengruifeng commented on a diff in pull request #41505: [SPARK-43938][CONNECT][PYTHON] Add to_* functions to Scala and Python

2023-06-08 Thread via GitHub

zhengruifeng commented on code in PR #41505: URL: https://github.com/apache/spark/pull/41505#discussion_r1223802204 ## connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala: ## @@ -1964,6 +1964,34 @@ class PlanGenerationTestSuite

[GitHub] [spark] beliefer commented on a diff in pull request #41516: [SPARK-43932][SQL][PYTHON][CONNECT] Add `current` like functions to Scala and Python

2023-06-08 Thread via GitHub

beliefer commented on code in PR #41516: URL: https://github.com/apache/spark/pull/41516#discussion_r1223802410 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala: ## @@ -2807,6 +2807,38 @@ object functions { // Misc functions

[GitHub] [spark] beliefer commented on a diff in pull request #41516: [SPARK-43932][SQL][PYTHON][CONNECT] Add `current` like functions to Scala and Python

2023-06-08 Thread via GitHub

beliefer commented on code in PR #41516: URL: https://github.com/apache/spark/pull/41516#discussion_r1223802759 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala: ## @@ -2961,6 +2993,14 @@ object functions { allowDifferentLgConfigK:

[GitHub] [spark] LuciferYang commented on pull request #41519: [SPARK-43943][SQL][TESTS][FOLLOW] Fix `DataFrame function and SQL function parity` in `DataFrameFunctionsSuite`

2023-06-08 Thread via GitHub

LuciferYang commented on PR #41519: URL: https://github.com/apache/spark/pull/41519#issuecomment-1583915596 Thanks @zhengruifeng

[GitHub] [spark] siying opened a new pull request, #41525: [SPARK-44012][SS]KafkaDataConsumer to print some read status

2023-06-08 Thread via GitHub

siying opened a new pull request, #41525: URL: https://github.com/apache/spark/pull/41525 ### What changes were proposed in this pull request? At the end of each KafkaDataConsumer, it logs some stats. Here is a sample log line: 23/06/08 23:48:14 INFO KafkaDataConsumer: From

[GitHub] [spark] beliefer commented on a diff in pull request #41518: [SPARK-19335][SQL] Add upserts for writing to JDBC

2023-06-08 Thread via GitHub

beliefer commented on code in PR #41518: URL: https://github.com/apache/spark/pull/41518#discussion_r1223800177 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala: ## @@ -878,6 +898,7 @@ object JdbcUtils extends Logging with SQLConfHelper

[GitHub] [spark] itholic commented on a diff in pull request #41514: [SPARK-43684][SPARK-43685][SPARK-43686][SPARK-43691][CONNECT][PS] Fix `(NullOps|NumOps).(eq|ne)` for Spark Connect.

2023-06-08 Thread via GitHub

itholic commented on code in PR #41514: URL: https://github.com/apache/spark/pull/41514#discussion_r1223797787 ## python/pyspark/pandas/data_type_ops/null_ops.py: ## @@ -43,6 +43,22 @@ class NullOps(DataTypeOps): def pretty_name(self) -> str: return "nulls" +

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #41514: [SPARK-43684][SPARK-43685][SPARK-43686][SPARK-43691][CONNECT][PS] Fix `(NullOps|NumOps).(eq|ne)` for Spark Connect.

2023-06-08 Thread via GitHub

HyukjinKwon commented on code in PR #41514: URL: https://github.com/apache/spark/pull/41514#discussion_r1223797237 ## python/pyspark/pandas/data_type_ops/null_ops.py: ## @@ -43,6 +43,22 @@ class NullOps(DataTypeOps): def pretty_name(self) -> str: return "nulls"

[GitHub] [spark] allisonwang-db commented on a diff in pull request #41316: [SPARK-43798][SQL][PYTHON] Support Python user-defined table functions

2023-06-08 Thread via GitHub

allisonwang-db commented on code in PR #41316: URL: https://github.com/apache/spark/pull/41316#discussion_r1223573185 ## python/docs/source/reference/pyspark.sql/udtf.rst: ## Review Comment: Thanks, I will add the link. Can I compile the doc locally to see how it looks

[GitHub] [spark] zhengruifeng commented on a diff in pull request #41477: [SPARK-43931][SQL][PYTHON][CONNECT] Add make_* functions to Scala and Python

2023-06-08 Thread via GitHub

zhengruifeng commented on code in PR #41477: URL: https://github.com/apache/spark/pull/41477#discussion_r1223794670 ## python/pyspark/sql/connect/functions.py: ## @@ -2373,6 +2374,117 @@ def hours(col: "ColumnOrName") -> Column: hours.__doc__ = pysparkfuncs.hours.__doc__ +

[GitHub] [spark] zhengruifeng commented on a diff in pull request #41515: [SPARK-43934][SQL][PYTHON][CONNECT] Add regexp_* functions to Scala and Python

2023-06-08 Thread via GitHub

zhengruifeng commented on code in PR #41515: URL: https://github.com/apache/spark/pull/41515#discussion_r1223794078 ## python/pyspark/sql/connect/functions.py: ## @@ -1988,13 +1988,70 @@ def split(str: "ColumnOrName", pattern: str, limit: int = -1) -> Column: split.__doc__ =

[GitHub] [spark] zhengruifeng opened a new pull request, #41524: [SPARK-43621][PS][CONNECT] Enable `pyspark.pandas.spark.functions.repeat` in Spark Connect

2023-06-08 Thread via GitHub

zhengruifeng opened a new pull request, #41524: URL: https://github.com/apache/spark/pull/41524 ### What changes were proposed in this pull request? Enable `pyspark.pandas.spark.functions.repeat` in Spark Connect ### Why are the changes needed? feature parity ### Does

[GitHub] [spark] pang-wu commented on a diff in pull request #41498: [SPARK-44001][Protobuf] spark protobuf: handle well known wrapper types

2023-06-08 Thread via GitHub

pang-wu commented on code in PR #41498: URL: https://github.com/apache/spark/pull/41498#discussion_r1223754931 ## connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufDeserializer.scala: ## @@ -247,12 +247,86 @@ private[sql] class ProtobufDeserializer(

[GitHub] [spark] zhengruifeng opened a new pull request, #41523: [WIP][SPARK-43616][PS][CONNECT] Enable `pyspark.pandas.spark.functions.mode` in Spark Connect

2023-06-08 Thread via GitHub

zhengruifeng opened a new pull request, #41523: URL: https://github.com/apache/spark/pull/41523 ### What changes were proposed in this pull request? Enable `pyspark.pandas.spark.functions.mode` in Spark Connect ### Why are the changes needed? for feature parity ###

[GitHub] [spark] WweiL opened a new pull request, #41522: [SPARK-44010][Python][SS][Minor] Python StreamingQueryProgress rowsPerSecond type fix

2023-06-08 Thread via GitHub

WweiL opened a new pull request, #41522: URL: https://github.com/apache/spark/pull/41522 ### What changes were proposed in this pull request? Fix Python StreamingQueryProgress' inputRowsPerSecond and processedRowsPerSecond's return type. They should be float according to

[GitHub] [spark] beliefer commented on pull request #41515: [SPARK-43934][SQL][PYTHON][CONNECT] Add regexp_* functions to Scala and Python

2023-06-08 Thread via GitHub

beliefer commented on PR #41515: URL: https://github.com/apache/spark/pull/41515#issuecomment-1583826262 ping @HyukjinKwon @zhengruifeng cc @cloud-fan

[GitHub] [spark] panbingkun commented on pull request #41505: [SPARK-43938][CONNECT][PYTHON] Add to_* functions to Scala and Python

2023-06-08 Thread via GitHub

panbingkun commented on PR #41505: URL: https://github.com/apache/spark/pull/41505#issuecomment-1583800048 @zhengruifeng Let me be more careful. I have checked all the naming and annotations, as well as the location of the functions, and added UT for each case.

[GitHub] [spark] thepinetree commented on a diff in pull request #41072: [SPARK-43393][SQL] Address sequence expression overflow bug.

2023-06-08 Thread via GitHub

thepinetree commented on code in PR #41072: URL: https://github.com/apache/spark/pull/41072#discussion_r1223723240 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala: ## @@ -3448,13 +3449,32 @@ object Sequence { ||

[GitHub] [spark] itholic commented on a diff in pull request #41514: [SPARK-43684][SPARK-43685][SPARK-43686][SPARK-43691][CONNECT][PS] Fix `(NullOps|NumOps).(eq|ne)` for Spark Connect.

2023-06-08 Thread via GitHub

itholic commented on code in PR #41514: URL: https://github.com/apache/spark/pull/41514#discussion_r1223720225 ## python/pyspark/pandas/data_type_ops/null_ops.py: ## @@ -43,6 +43,22 @@ class NullOps(DataTypeOps): def pretty_name(self) -> str: return "nulls" +

[GitHub] [spark] itholic commented on pull request #41517: [SPARK-42290][SQL] Fix the OOM error can't be reported when AQE on

2023-06-08 Thread via GitHub

itholic commented on PR #41517: URL: https://github.com/apache/spark/pull/41517#issuecomment-1583753744 last LGTM. Thanks for the fix

[GitHub] [spark] melin commented on pull request #41518: [SPARK-19335][SQL] Add upserts for writing to JDBC

2023-06-08 Thread via GitHub

melin commented on PR #41518: URL: https://github.com/apache/spark/pull/41518#issuecomment-1583718745 Many databases support MERGE SQL, including Oracle: https://issues.apache.org/jira/browse/SPARK-38200

[GitHub] [spark] zhengruifeng commented on a diff in pull request #41516: [SPARK-43932][SQL][PYTHON][CONNECT] Add `current` like functions to Scala and Python

2023-06-08 Thread via GitHub

zhengruifeng commented on code in PR #41516: URL: https://github.com/apache/spark/pull/41516#discussion_r1223691672 ## sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala: ## @@ -5570,6 +5570,21 @@ class DataFrameFunctionsSuite extends QueryTest with

[GitHub] [spark] github-actions[bot] commented on pull request #38035: [SPARK-42438][SQL] Improve constraint propagation using multiTransform

2023-06-08 Thread via GitHub

github-actions[bot] commented on PR #38035: URL: https://github.com/apache/spark/pull/38035#issuecomment-1583660803 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #39185: [SPARK-41551][SQL] Dynamic/absolute path support in PathOutputCommitters

2023-06-08 Thread via GitHub

github-actions[bot] commented on PR #39185: URL: https://github.com/apache/spark/pull/39185#issuecomment-1583660791 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] closed pull request #40178: [MINOR][DOCS] Remove `Jenkins` from web page.

2023-06-08 Thread via GitHub

github-actions[bot] closed pull request #40178: [MINOR][DOCS] Remove `Jenkins` from web page. URL: https://github.com/apache/spark/pull/40178

[GitHub] [spark] github-actions[bot] commented on pull request #40189: [SPARK-27483] [SS] [SQL] Move fallback logic for StreamingRelation(V2) to an analyzer rule

2023-06-08 Thread via GitHub

github-actions[bot] commented on PR #40189: URL: https://github.com/apache/spark/pull/40189#issuecomment-1583660763 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #40221: [SPARK-41551][SQL] Dynamic/absolute path support in PathOutputCommitters

2023-06-08 Thread via GitHub

github-actions[bot] commented on PR #40221: URL: https://github.com/apache/spark/pull/40221#issuecomment-1583660746 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] Hisoka-X commented on pull request #41348: [SPARK-43203][SQL] Move all Drop Table case to DataSource V2

2023-06-08 Thread via GitHub

Hisoka-X commented on PR #41348: URL: https://github.com/apache/spark/pull/41348#issuecomment-1583654481 > Is the test failure related? > > ``` > DataFrameFunctionsSuite.DataFrame function and SQL functon parity > org.scalatest.exceptions.TestFailedException: Set("ceiling",

[GitHub] [spark] Hisoka-X commented on a diff in pull request #41348: [SPARK-43203][SQL] Move all Drop Table case to DataSource V2

2023-06-08 Thread via GitHub

Hisoka-X commented on code in PR #41348: URL: https://github.com/apache/spark/pull/41348#discussion_r1223661427 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/IdentifierImpl.java: ## @@ -30,12 +30,12 @@ * An {@link Identifier} implementation. */

[GitHub] [spark] WweiL commented on a diff in pull request #41129: [SPARK-43133] Scala Client DataStreamWriter Foreach support

2023-06-08 Thread via GitHub

WweiL commented on code in PR #41129: URL: https://github.com/apache/spark/pull/41129#discussion_r1223658122 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -2445,10 +2451,24 @@ class SparkConnectPlanner(val

[GitHub] [spark] WweiL commented on a diff in pull request #41129: [SPARK-43133] Scala Client DataStreamWriter Foreach support

2023-06-08 Thread via GitHub

WweiL commented on code in PR #41129: URL: https://github.com/apache/spark/pull/41129#discussion_r1223658042 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala: ## @@ -202,6 +208,28 @@ final class DataStreamWriter[T]

[GitHub] [spark] cloud-fan commented on pull request #41517: [SPARK-42290][SQL] Fix the OOM error can't be reported when AQE on

2023-06-08 Thread via GitHub

cloud-fan commented on PR #41517: URL: https://github.com/apache/spark/pull/41517#issuecomment-1583637991 late LGTM

[GitHub] [spark] HyukjinKwon closed pull request #41318: [SPARK-43803] [SS] [CONNECT] Improve awaitTermination() to handle client disconnects

2023-06-08 Thread via GitHub

HyukjinKwon closed pull request #41318: [SPARK-43803] [SS] [CONNECT] Improve awaitTermination() to handle client disconnects URL: https://github.com/apache/spark/pull/41318

[GitHub] [spark] HyukjinKwon closed pull request #41129: [SPARK-43133] Scala Client DataStreamWriter Foreach support

2023-06-08 Thread via GitHub

HyukjinKwon closed pull request #41129: [SPARK-43133] Scala Client DataStreamWriter Foreach support URL: https://github.com/apache/spark/pull/41129

[GitHub] [spark] HyukjinKwon commented on pull request #41129: [SPARK-43133] Scala Client DataStreamWriter Foreach support

2023-06-08 Thread via GitHub

HyukjinKwon commented on PR #41129: URL: https://github.com/apache/spark/pull/41129#issuecomment-1583617244 Merged to master.

[GitHub] [spark] HyukjinKwon commented on pull request #41318: [SPARK-43803] [SS] [CONNECT] Improve awaitTermination() to handle client disconnects

2023-06-08 Thread via GitHub

HyukjinKwon commented on PR #41318: URL: https://github.com/apache/spark/pull/41318#issuecomment-1583617127 Merged to master.

[GitHub] [spark] HyukjinKwon closed pull request #41511: [SPARK-43613][PS][CONNECT] Enable `pyspark.pandas.spark.functions.covar` in Spark Connect

2023-06-08 Thread via GitHub

HyukjinKwon closed pull request #41511: [SPARK-43613][PS][CONNECT] Enable `pyspark.pandas.spark.functions.covar` in Spark Connect URL: https://github.com/apache/spark/pull/41511

[GitHub] [spark] HyukjinKwon closed pull request #41512: [SPARK-43700][SPARK-43701][CONNECT][PS] Enable `TimedeltaOps.(sub|rsub)` with Spark Connect

2023-06-08 Thread via GitHub

HyukjinKwon closed pull request #41512: [SPARK-43700][SPARK-43701][CONNECT][PS] Enable `TimedeltaOps.(sub|rsub)` with Spark Connect URL: https://github.com/apache/spark/pull/41512

[GitHub] [spark] HyukjinKwon commented on pull request #41511: [SPARK-43613][PS][CONNECT] Enable `pyspark.pandas.spark.functions.covar` in Spark Connect

2023-06-08 Thread via GitHub

HyukjinKwon commented on PR #41511: URL: https://github.com/apache/spark/pull/41511#issuecomment-1583614730 Merged to master.

[GitHub] [spark] HyukjinKwon commented on pull request #41512: [SPARK-43700][SPARK-43701][CONNECT][PS] Enable `TimedeltaOps.(sub|rsub)` with Spark Connect

2023-06-08 Thread via GitHub

HyukjinKwon commented on PR #41512: URL: https://github.com/apache/spark/pull/41512#issuecomment-1583614693 Merged to master.

[GitHub] [spark] zhengruifeng commented on pull request #41519: [SPARK-43943][SQL][TESTS][FOLLOW] Fix `DataFrame function and SQL function parity` in `DataFrameFunctionsSuite`

2023-06-08 Thread via GitHub

zhengruifeng commented on PR #41519: URL: https://github.com/apache/spark/pull/41519#issuecomment-1583614159 merged to master

[GitHub] [spark] zhengruifeng closed pull request #41519: [SPARK-43943][SQL][TESTS][FOLLOW] Fix `DataFrame function and SQL function parity` in `DataFrameFunctionsSuite`

2023-06-08 Thread via GitHub

zhengruifeng closed pull request #41519: [SPARK-43943][SQL][TESTS][FOLLOW] Fix `DataFrame function and SQL function parity` in `DataFrameFunctionsSuite` URL: https://github.com/apache/spark/pull/41519

[GitHub] [spark] zhengruifeng commented on pull request #41519: [SPARK-43943][SQL][TESTS][FOLLOW] Fix `DataFrame function and SQL function parity` in `DataFrameFunctionsSuite`

2023-06-08 Thread via GitHub

zhengruifeng commented on PR #41519: URL: https://github.com/apache/spark/pull/41519#issuecomment-1583612968 I can repro this issue, so this fix LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] zhengruifeng commented on pull request #41519: [SPARK-43943][SQL][TESTS][FOLLOW] Fix `DataFrame function and SQL function parity` in `DataFrameFunctionsSuite`

2023-06-08 Thread via GitHub

zhengruifeng commented on PR #41519: URL: https://github.com/apache/spark/pull/41519#issuecomment-1583612759 @LuciferYang thanks for the catch. I was not aware of this failure, since the `sql - other` failed before it ran this test. -- This is an automated message from the Apache Git

[GitHub] [spark] rangadi commented on a diff in pull request #41129: [SPARK-43133] Scala Client DataStreamWriter Foreach support

2023-06-08 Thread via GitHub

rangadi commented on code in PR #41129: URL: https://github.com/apache/spark/pull/41129#discussion_r1223638394 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -2445,10 +2451,24 @@ class SparkConnectPlanner(val

[GitHub] [spark] amaliujia commented on pull request #41427: [SPARK-43888][CONNECT][FOLLOW-UP] Spark Connect client should depend on common-utils explicitly

2023-06-08 Thread via GitHub

amaliujia commented on PR #41427: URL: https://github.com/apache/spark/pull/41427#issuecomment-1583598012 @LuciferYang thank you so much! It seems to be just a one-line change. Done. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] dtenedor commented on pull request #41191: [SPARK-43529][SQL] Support general constant expressions as CREATE/REPLACE TABLE OPTIONS values

2023-06-08 Thread via GitHub

dtenedor commented on PR #41191: URL: https://github.com/apache/spark/pull/41191#issuecomment-1583566600 (Note, this is passing all CI again.) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] jimmyzzxhlh commented on pull request #39691: [SPARK-31561][SQL] Add QUALIFY clause

2023-06-08 Thread via GitHub

jimmyzzxhlh commented on PR #39691: URL: https://github.com/apache/spark/pull/39691#issuecomment-1583566431 ^ Same question -- Any plan to release this feature? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] pengzhon-db commented on pull request #41318: [SPARK-43803] [SS] [CONNECT] Improve awaitTermination() to handle client disconnects

2023-06-08 Thread via GitHub

pengzhon-db commented on PR #41318: URL: https://github.com/apache/spark/pull/41318#issuecomment-1583565719 @HyukjinKwon can you help merge this? Thanks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark-connect-go] hiboyang commented on a diff in pull request #10: [SPARK-43351] Add DataFrame writer and reader prototype code

2023-06-08 Thread via GitHub

hiboyang commented on code in PR #10: URL: https://github.com/apache/spark-connect-go/pull/10#discussion_r1223599098 ## client/sql/dataframe.go: ## @@ -31,6 +31,7 @@ type DataFrame interface { Show(numRows int, truncate bool) error Schema() (*StructType, error)

[GitHub] [spark] allisonwang-db commented on a diff in pull request #41316: [SPARK-43798][SQL][PYTHON] Support Python user-defined table functions

2023-06-08 Thread via GitHub

allisonwang-db commented on code in PR #41316: URL: https://github.com/apache/spark/pull/41316#discussion_r1223573185 ## python/docs/source/reference/pyspark.sql/udtf.rst: ## Review Comment: Thanks, I will add the link. Can I compile the doc locally to see how it looks

[GitHub] [spark] yliou commented on pull request #40503: [SPARK-42830] [UI] Link skipped stages on Spark UI

2023-06-08 Thread via GitHub

yliou commented on PR #40503: URL: https://github.com/apache/spark/pull/40503#issuecomment-1583357328 CC: @HyukjinKwon is there interest in this feature? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] rangadi commented on a diff in pull request #41146: [SPARK-43474] [SS] [CONNECT] Add a spark connect function to create DataFrame reference

2023-06-08 Thread via GitHub

rangadi commented on code in PR #41146: URL: https://github.com/apache/spark/pull/41146#discussion_r1223516152 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectCachedDataFrameManager.scala: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the

[GitHub] [spark] ueshin commented on a diff in pull request #41316: [SPARK-43798][SQL][PYTHON] Support Python user-defined table functions

2023-06-08 Thread via GitHub

ueshin commented on code in PR #41316: URL: https://github.com/apache/spark/pull/41316#discussion_r1223463725 ## python/docs/source/reference/pyspark.sql/udtf.rst: ## Review Comment: Need an entry in `python/docs/source/reference/pyspark.sql/index.rst` or

[GitHub] [spark] zeruibao commented on pull request #41521: [SPARK-43380][SQL] Fix conversion of Avro logical timestamp type to Long

2023-06-08 Thread via GitHub

zeruibao commented on PR #41521: URL: https://github.com/apache/spark/pull/41521#issuecomment-1583330021 Yeah, I think https://github.com/apache/spark/pull/41052 is only merged to master branch. @dongjoon-hyun -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] viirya commented on a diff in pull request #41348: [SPARK-43203][SQL] Move all Drop Table case to DataSource V2

2023-06-08 Thread via GitHub

viirya commented on code in PR #41348: URL: https://github.com/apache/spark/pull/41348#discussion_r1223519017 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/IdentifierImpl.java: ## @@ -30,12 +30,12 @@ * An {@link Identifier} implementation. */

[GitHub] [spark] viirya commented on pull request #41348: [SPARK-43203][SQL] Move all Drop Table case to DataSource V2

2023-06-08 Thread via GitHub

viirya commented on PR #41348: URL: https://github.com/apache/spark/pull/41348#issuecomment-1583295230 Is the test failure related? ``` DataFrameFunctionsSuite.DataFrame function and SQL functon parity org.scalatest.exceptions.TestFailedException: Set("ceiling", "negative",

[GitHub] [spark] dongjoon-hyun commented on pull request #41517: [SPARK-42290][SQL] Fix the OOM error can't be reported when AQE on

2023-06-08 Thread via GitHub

dongjoon-hyun commented on PR #41517: URL: https://github.com/apache/spark/pull/41517#issuecomment-1583290913 Thank you, @Hisoka-X , @LuciferYang , @kazuyukitanimura . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] dongjoon-hyun closed pull request #41517: [SPARK-42290][SQL] Fix the OOM error can't be reported when AQE on

2023-06-08 Thread via GitHub

dongjoon-hyun closed pull request #41517: [SPARK-42290][SQL] Fix the OOM error can't be reported when AQE on URL: https://github.com/apache/spark/pull/41517 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] dongjoon-hyun commented on pull request #41517: [SPARK-42290][SQL] Fix the OOM error can't be reported when AQE on

2023-06-08 Thread via GitHub

dongjoon-hyun commented on PR #41517: URL: https://github.com/apache/spark/pull/41517#issuecomment-1583274941 I verified manually. Merged to master/3.4. ``` $ build/sbt "sql/testOnly *.QueryExecutionErrorsSuite -- -z SPARK-42290" [info] QueryExecutionErrorsSuite: 13:10:15.573

[GitHub] [spark] tgravescs commented on pull request #34622: [SPARK-37340][UI] Display StageIds in Operators for SQL UI

2023-06-08 Thread via GitHub

tgravescs commented on PR #34622: URL: https://github.com/apache/spark/pull/34622#issuecomment-1583262505 Sure, it's been a while, but I think I had tried this out and was seeing some performance issues with it. I'd have to relook at it to remember. Did you run any performance tests? --

[GitHub] [spark] yliou commented on pull request #40502: [SPARK-42829] [UI] add repeat identifier to cached RDD on stage page

2023-06-08 Thread via GitHub

yliou commented on PR #40502: URL: https://github.com/apache/spark/pull/40502#issuecomment-1583251122 CC: @HyukjinKwon is there interest in this feature? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] dongjoon-hyun commented on pull request #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql'

2023-06-08 Thread via GitHub

dongjoon-hyun commented on PR #41520: URL: https://github.com/apache/spark/pull/41520#issuecomment-1583246487 Merged to master/3.4. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] yliou commented on pull request #34622: [SPARK-37340][UI] Display StageIds in Operators for SQL UI

2023-06-08 Thread via GitHub

yliou commented on PR #34622: URL: https://github.com/apache/spark/pull/34622#issuecomment-1583245920 @tgravescs @martin-g should I create another pull request for this feature to try to get it merged? I'm unable to reopen the PR. -- This is an automated message from the Apache Git

[GitHub] [spark] dongjoon-hyun closed pull request #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql'

2023-06-08 Thread via GitHub

dongjoon-hyun closed pull request #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql' URL: https://github.com/apache/spark/pull/41520 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] dongjoon-hyun commented on pull request #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql'

2023-06-08 Thread via GitHub

dongjoon-hyun commented on PR #41520: URL: https://github.com/apache/spark/pull/41520#issuecomment-1583208534 Thank you, @dtenedor and @gengliangwang I verified the relocated suite manually. ``` $ build/sbt "sql/testOnly *.ResolveDefaultColumnsSuite" ... [info]

[GitHub] [spark] zeruibao opened a new pull request, #41521: [SPARK-43380][SQL] Fix conversion of Avro logical timestamp type to Long

2023-06-08 Thread via GitHub

zeruibao opened a new pull request, #41521: URL: https://github.com/apache/spark/pull/41521 ### What changes were proposed in this pull request? Fix conversion of Avro logical timestamp type to Long ### Why are the changes needed? The fix in

[GitHub] [spark] ueshin commented on a diff in pull request #41316: [SPARK-43798][SQL][PYTHON] Support Python user-defined table functions

2023-06-08 Thread via GitHub

ueshin commented on code in PR #41316: URL: https://github.com/apache/spark/pull/41316#discussion_r1223418548 ## python/pyspark/worker.py: ## @@ -871,6 +941,16 @@ def process(): else: process() +if eval_type == PythonEvalType.SQL_TABLE_UDF: +

[GitHub] [spark] dtenedor commented on pull request #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql'

2023-06-08 Thread via GitHub

dtenedor commented on PR #41520: URL: https://github.com/apache/spark/pull/41520#issuecomment-1583148512 LGTM! Thanks @dongjoon-hyun for the clean up. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] dongjoon-hyun commented on pull request #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql'

2023-06-08 Thread via GitHub

dongjoon-hyun commented on PR #41520: URL: https://github.com/apache/spark/pull/41520#issuecomment-1583145315 cc @dtenedor and @gengliangwang -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #40652: [SPARK-43018][SQL] Fix bug for INSERT commands with timestamp literals

2023-06-08 Thread via GitHub

dongjoon-hyun commented on code in PR #40652: URL: https://github.com/apache/spark/pull/40652#discussion_r1223414585 ## sql/core/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveDefaultColumnsSuite.scala: ## @@ -0,0 +1,77 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] dongjoon-hyun opened a new pull request, #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql'

2023-06-08 Thread via GitHub

dongjoon-hyun opened a new pull request, #41520: URL: https://github.com/apache/spark/pull/41520 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###


