[GitHub] [spark] TongWei1105 closed pull request #41513: [SPARK-44007][SQL] Unresolved hint cause query failure

2023-06-08 Thread via GitHub
TongWei1105 closed pull request #41513: [SPARK-44007][SQL] Unresolved hint cause query failure URL: https://github.com/apache/spark/pull/41513 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] LuciferYang commented on a diff in pull request #41529: [SPARK-43988][INFRA] Add maven testing GitHub Action task for connect client module

2023-06-08 Thread via GitHub
LuciferYang commented on code in PR #41529: URL: https://github.com/apache/spark/pull/41529#discussion_r1223872778 ## .github/workflows/build_and_test.yml: ## @@ -728,6 +729,83 @@ jobs: ./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pmesos -Pkubernetes -Pvolcano

[GitHub] [spark] pan3793 commented on a diff in pull request #41448: [SPARK-43885][SQL] DataSource V2: Handle MERGE commands for delta-based sources

2023-06-08 Thread via GitHub
pan3793 commented on code in PR #41448: URL: https://github.com/apache/spark/pull/41448#discussion_r1223871226 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/MergeRowsExec.scala: ## @@ -0,0 +1,216 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] aokolnychyi commented on pull request #41448: [SPARK-43885][SQL] DataSource V2: Handle MERGE commands for delta-based sources

2023-06-08 Thread via GitHub
aokolnychyi commented on PR #41448: URL: https://github.com/apache/spark/pull/41448#issuecomment-1584012053 Also created SPARK-44013 to add a benchmark. Will be used to measure the impact of adding codegen later.

[GitHub] [spark] aokolnychyi commented on pull request #41448: [SPARK-43885][SQL] DataSource V2: Handle MERGE commands for delta-based sources

2023-06-08 Thread via GitHub
aokolnychyi commented on PR #41448: URL: https://github.com/apache/spark/pull/41448#issuecomment-1584010899 Fixed, tested `SparkThrowableSuite` locally.

[GitHub] [spark] LuciferYang commented on pull request #41529: [SPARK-43988][INFRA] Add maven testing GitHub Action task for connect client module

2023-06-08 Thread via GitHub
LuciferYang commented on PR #41529: URL: https://github.com/apache/spark/pull/41529#issuecomment-1584010670 wait https://github.com/apache/spark/pull/41487

[GitHub] [spark] LuciferYang opened a new pull request, #41529: [SPARK-43988][INFRA] Add maven testing GitHub Action task for connect client module

2023-06-08 Thread via GitHub
LuciferYang opened a new pull request, #41529: URL: https://github.com/apache/spark/pull/41529 ### What changes were proposed in this pull request? This PR aims to add a Maven testing job on GitHub Actions for the `connect-client-jvm` module. ### Why are the changes needed?

[GitHub] [spark] aokolnychyi commented on a diff in pull request #41448: [SPARK-43885][SQL] DataSource V2: Handle MERGE commands for delta-based sources

2023-06-08 Thread via GitHub
aokolnychyi commented on code in PR #41448: URL: https://github.com/apache/spark/pull/41448#discussion_r1223867780 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/MergeRowsExec.scala: ## @@ -0,0 +1,216 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] aokolnychyi commented on a diff in pull request #41448: [SPARK-43885][SQL] DataSource V2: Handle MERGE commands for delta-based sources

2023-06-08 Thread via GitHub
aokolnychyi commented on code in PR #41448: URL: https://github.com/apache/spark/pull/41448#discussion_r1223868149 ## core/src/main/resources/error/error-classes.json: ## @@ -1539,6 +1539,13 @@ "Parse Mode: . To process malformed records as null result, try setting the

[GitHub] [spark] LuciferYang commented on pull request #41487: [SPARK-43648][CONNECT][TESTS] Move `interrupt all` related test to a new test file to pass Maven test

2023-06-08 Thread via GitHub
LuciferYang commented on PR #41487: URL: https://github.com/apache/spark/pull/41487#issuecomment-1584008688 @dongjoon-hyun @vicennial @juliuszsompolski Do you have any other suggestions for this PR? Can we merge this one first? This would let the Maven tests pass first, and if there is a

[GitHub] [spark] LuciferYang commented on a diff in pull request #41516: [SPARK-43932][SQL][PYTHON][CONNECT] Add `current` like functions to Scala and Python

2023-06-08 Thread via GitHub
LuciferYang commented on code in PR #41516: URL: https://github.com/apache/spark/pull/41516#discussion_r1223861231 ## sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala: ## @@ -5570,6 +5570,21 @@ class DataFrameFunctionsSuite extends QueryTest with

[GitHub] [spark] zeruibao commented on a diff in pull request #41521: [SPARK-43380][SQL] Fix conversion of Avro logical timestamp type to Long

2023-06-08 Thread via GitHub
zeruibao commented on code in PR #41521: URL: https://github.com/apache/spark/pull/41521#discussion_r1223853665 ## connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala: ## @@ -158,7 +158,7 @@ private[sql] class AvroDeserializer( } case

[GitHub] [spark] cloud-fan commented on a diff in pull request #41521: [SPARK-43380][SQL] Fix conversion of Avro logical timestamp type to Long

2023-06-08 Thread via GitHub
cloud-fan commented on code in PR #41521: URL: https://github.com/apache/spark/pull/41521#discussion_r1223844129 ## connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala: ## @@ -158,7 +158,7 @@ private[sql] class AvroDeserializer( } case

[GitHub] [spark] itholic opened a new pull request, #41528: [SPARK-43610][CONNECT][PS] Enable `InternalFrame.attach_distributed_column` in Spark Connect.

2023-06-08 Thread via GitHub
itholic opened a new pull request, #41528: URL: https://github.com/apache/spark/pull/41528 ### What changes were proposed in this pull request? This PR proposes to enable `InternalFrame.attach_distributed_column` in Spark Connect. ### Why are the changes needed? To

[GitHub] [spark] HyukjinKwon closed pull request #41522: [SPARK-44010][PYTHON][SS][MINOR] Python StreamingQueryProgress rowsPerSecond type fix

2023-06-08 Thread via GitHub
HyukjinKwon closed pull request #41522: [SPARK-44010][PYTHON][SS][MINOR] Python StreamingQueryProgress rowsPerSecond type fix URL: https://github.com/apache/spark/pull/41522

[GitHub] [spark] HyukjinKwon commented on pull request #41522: [SPARK-44010][PYTHON][SS][MINOR] Python StreamingQueryProgress rowsPerSecond type fix

2023-06-08 Thread via GitHub
HyukjinKwon commented on PR #41522: URL: https://github.com/apache/spark/pull/41522#issuecomment-1583946638 Merged to master.

[GitHub] [spark] jdesjean commented on a diff in pull request #41443: [SPARK-43923][CONNECT] Post listenerBus events during ExecutePlanRequest

2023-06-08 Thread via GitHub
jdesjean commented on code in PR #41443: URL: https://github.com/apache/spark/pull/41443#discussion_r1223820830 ## connector/connect/server/src/test/scala/org/apache/spark/sql/connect/service/EventsSuite.scala: ## @@ -0,0 +1,246 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] wangyum commented on a diff in pull request #41513: [SPARK-44007][SQL] Unresolved hint cause query failure

2023-06-08 Thread via GitHub
wangyum commented on code in PR #41513: URL: https://github.com/apache/spark/pull/41513#discussion_r1223820020 ## sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala: ## @@ -4683,6 +4683,27 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with

[GitHub] [spark] jdesjean commented on a diff in pull request #41443: [SPARK-43923][CONNECT] Post listenerBus events during ExecutePlanRequest

2023-06-08 Thread via GitHub
jdesjean commented on code in PR #41443: URL: https://github.com/apache/spark/pull/41443#discussion_r1223819874 ## connector/connect/server/src/test/scala/org/apache/spark/sql/connect/service/ExecutePlanHolderSuite.scala: ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache

[GitHub] [spark] beliefer closed pull request #41464: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

2023-06-08 Thread via GitHub
beliefer closed pull request #41464: [SPARK-43879][CONNECT] Decouple handle command and send response on server side URL: https://github.com/apache/spark/pull/41464

[GitHub] [spark] beliefer opened a new pull request, #41527: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

2023-06-08 Thread via GitHub
beliefer opened a new pull request, #41527: URL: https://github.com/apache/spark/pull/41527 ### What changes were proposed in this pull request? `SparkConnectStreamHandler` handles the proto requests from the connect client and sends the responses back to the connect client.

[GitHub] [spark] zhengruifeng opened a new pull request, #41526: [SPARK-43943][SPARK-43935][SPARK-43930][DOCS][FOLLOW-UP] Add missing `versionadded` annotations

2023-06-08 Thread via GitHub
zhengruifeng opened a new pull request, #41526: URL: https://github.com/apache/spark/pull/41526 ### What changes were proposed in this pull request? Add missing `versionadded` annotations ### Why are the changes needed? for better doc ### Does this PR introduce

[GitHub] [spark] zhengruifeng commented on a diff in pull request #41505: [SPARK-43938][CONNECT][PYTHON] Add to_* functions to Scala and Python

2023-06-08 Thread via GitHub
zhengruifeng commented on code in PR #41505: URL: https://github.com/apache/spark/pull/41505#discussion_r1223802204 ## connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala: ## @@ -1964,6 +1964,34 @@ class PlanGenerationTestSuite

[GitHub] [spark] beliefer commented on a diff in pull request #41516: [SPARK-43932][SQL][PYTHON][CONNECT] Add `current` like functions to Scala and Python

2023-06-08 Thread via GitHub
beliefer commented on code in PR #41516: URL: https://github.com/apache/spark/pull/41516#discussion_r1223802410 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala: ## @@ -2807,6 +2807,38 @@ object functions { // Misc functions

[GitHub] [spark] beliefer commented on a diff in pull request #41516: [SPARK-43932][SQL][PYTHON][CONNECT] Add `current` like functions to Scala and Python

2023-06-08 Thread via GitHub
beliefer commented on code in PR #41516: URL: https://github.com/apache/spark/pull/41516#discussion_r1223802759 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala: ## @@ -2961,6 +2993,14 @@ object functions { allowDifferentLgConfigK:

[GitHub] [spark] LuciferYang commented on pull request #41519: [SPARK-43943][SQL][TESTS][FOLLOW] Fix `DataFrame function and SQL function parity` in `DataFrameFunctionsSuite`

2023-06-08 Thread via GitHub
LuciferYang commented on PR #41519: URL: https://github.com/apache/spark/pull/41519#issuecomment-1583915596 Thanks @zhengruifeng

[GitHub] [spark] siying opened a new pull request, #41525: [SPARK-44012][SS]KafkaDataConsumer to print some read status

2023-06-08 Thread via GitHub
siying opened a new pull request, #41525: URL: https://github.com/apache/spark/pull/41525 ### What changes were proposed in this pull request? At the end of each KafkaDataConsumer, it logs some stats. Here is a sample log line: 23/06/08 23:48:14 INFO KafkaDataConsumer: From

[GitHub] [spark] beliefer commented on a diff in pull request #41518: [SPARK-19335][SQL] Add upserts for writing to JDBC

2023-06-08 Thread via GitHub
beliefer commented on code in PR #41518: URL: https://github.com/apache/spark/pull/41518#discussion_r1223800177 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala: ## @@ -878,6 +898,7 @@ object JdbcUtils extends Logging with SQLConfHelper

[GitHub] [spark] itholic commented on a diff in pull request #41514: [SPARK-43684][SPARK-43685][SPARK-43686][SPARK-43691][CONNECT][PS] Fix `(NullOps|NumOps).(eq|ne)` for Spark Connect.

2023-06-08 Thread via GitHub
itholic commented on code in PR #41514: URL: https://github.com/apache/spark/pull/41514#discussion_r1223797787 ## python/pyspark/pandas/data_type_ops/null_ops.py: ## @@ -43,6 +43,22 @@ class NullOps(DataTypeOps): def pretty_name(self) -> str: return "nulls" +

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #41514: [SPARK-43684][SPARK-43685][SPARK-43686][SPARK-43691][CONNECT][PS] Fix `(NullOps|NumOps).(eq|ne)` for Spark Connect.

2023-06-08 Thread via GitHub
HyukjinKwon commented on code in PR #41514: URL: https://github.com/apache/spark/pull/41514#discussion_r1223797237 ## python/pyspark/pandas/data_type_ops/null_ops.py: ## @@ -43,6 +43,22 @@ class NullOps(DataTypeOps): def pretty_name(self) -> str: return "nulls"

[GitHub] [spark] allisonwang-db commented on a diff in pull request #41316: [SPARK-43798][SQL][PYTHON] Support Python user-defined table functions

2023-06-08 Thread via GitHub
allisonwang-db commented on code in PR #41316: URL: https://github.com/apache/spark/pull/41316#discussion_r1223573185 ## python/docs/source/reference/pyspark.sql/udtf.rst: ## Review Comment: Thanks, I will add the link. Can I compile the doc locally to see how it looks

[GitHub] [spark] zhengruifeng commented on a diff in pull request #41477: [SPARK-43931][SQL][PYTHON][CONNECT] Add make_* functions to Scala and Python

2023-06-08 Thread via GitHub
zhengruifeng commented on code in PR #41477: URL: https://github.com/apache/spark/pull/41477#discussion_r1223794670 ## python/pyspark/sql/connect/functions.py: ## @@ -2373,6 +2374,117 @@ def hours(col: "ColumnOrName") -> Column: hours.__doc__ = pysparkfuncs.hours.__doc__ +

[GitHub] [spark] zhengruifeng commented on a diff in pull request #41515: [SPARK-43934][SQL][PYTHON][CONNECT] Add regexp_* functions to Scala and Python

2023-06-08 Thread via GitHub
zhengruifeng commented on code in PR #41515: URL: https://github.com/apache/spark/pull/41515#discussion_r1223794078 ## python/pyspark/sql/connect/functions.py: ## @@ -1988,13 +1988,70 @@ def split(str: "ColumnOrName", pattern: str, limit: int = -1) -> Column: split.__doc__ =

[GitHub] [spark] zhengruifeng opened a new pull request, #41524: [SPARK-43621][PS][CONNECT] Enable `pyspark.pandas.spark.functions.repeat` in Spark Connect

2023-06-08 Thread via GitHub
zhengruifeng opened a new pull request, #41524: URL: https://github.com/apache/spark/pull/41524 ### What changes were proposed in this pull request? Enable `pyspark.pandas.spark.functions.repeat` in Spark Connect ### Why are the changes needed? feature parity ### Does

[GitHub] [spark] pang-wu commented on a diff in pull request #41498: [SPARK-44001][Protobuf] spark protobuf: handle well known wrapper types

2023-06-08 Thread via GitHub
pang-wu commented on code in PR #41498: URL: https://github.com/apache/spark/pull/41498#discussion_r1223754931 ## connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufDeserializer.scala: ## @@ -247,12 +247,86 @@ private[sql] class ProtobufDeserializer(

[GitHub] [spark] zhengruifeng opened a new pull request, #41523: [WIP][SPARK-43616][PS][CONNECT] Enable `pyspark.pandas.spark.functions.mode` in Spark Connect

2023-06-08 Thread via GitHub
zhengruifeng opened a new pull request, #41523: URL: https://github.com/apache/spark/pull/41523 ### What changes were proposed in this pull request? Enable `pyspark.pandas.spark.functions.mode` in Spark Connect ### Why are the changes needed? for feature parity ###

[GitHub] [spark] WweiL opened a new pull request, #41522: [SPARK-44010][Python][SS][Minor] Python StreamingQueryProgress rowsPerSecond type fix

2023-06-08 Thread via GitHub
WweiL opened a new pull request, #41522: URL: https://github.com/apache/spark/pull/41522 ### What changes were proposed in this pull request? Fix Python StreamingQueryProgress' inputRowsPerSecond and processedRowsPerSecond's return type. They should be float according to

[GitHub] [spark] beliefer commented on pull request #41515: [SPARK-43934][SQL][PYTHON][CONNECT] Add regexp_* functions to Scala and Python

2023-06-08 Thread via GitHub
beliefer commented on PR #41515: URL: https://github.com/apache/spark/pull/41515#issuecomment-1583826262 ping @HyukjinKwon @zhengruifeng cc @cloud-fan

[GitHub] [spark] panbingkun commented on pull request #41505: [SPARK-43938][CONNECT][PYTHON] Add to_* functions to Scala and Python

2023-06-08 Thread via GitHub
panbingkun commented on PR #41505: URL: https://github.com/apache/spark/pull/41505#issuecomment-1583800048 @zhengruifeng Let me be more careful. I have checked all the naming and annotations, as well as the location of the functions, and added UT for each case.

[GitHub] [spark] thepinetree commented on a diff in pull request #41072: [SPARK-43393][SQL] Address sequence expression overflow bug.

2023-06-08 Thread via GitHub
thepinetree commented on code in PR #41072: URL: https://github.com/apache/spark/pull/41072#discussion_r1223723240 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala: ## @@ -3448,13 +3449,32 @@ object Sequence { ||

[GitHub] [spark] itholic commented on a diff in pull request #41514: [SPARK-43684][SPARK-43685][SPARK-43686][SPARK-43691][CONNECT][PS] Fix `(NullOps|NumOps).(eq|ne)` for Spark Connect.

2023-06-08 Thread via GitHub
itholic commented on code in PR #41514: URL: https://github.com/apache/spark/pull/41514#discussion_r1223720225 ## python/pyspark/pandas/data_type_ops/null_ops.py: ## @@ -43,6 +43,22 @@ class NullOps(DataTypeOps): def pretty_name(self) -> str: return "nulls" +

[GitHub] [spark] itholic commented on pull request #41517: [SPARK-42290][SQL] Fix the OOM error can't be reported when AQE on

2023-06-08 Thread via GitHub
itholic commented on PR #41517: URL: https://github.com/apache/spark/pull/41517#issuecomment-1583753744 last LGTM. Thanks for the fix

[GitHub] [spark] melin commented on pull request #41518: [SPARK-19335][SQL] Add upserts for writing to JDBC

2023-06-08 Thread via GitHub
melin commented on PR #41518: URL: https://github.com/apache/spark/pull/41518#issuecomment-1583718745 Many databases support merge sql, including oracle https://issues.apache.org/jira/browse/SPARK-38200

[GitHub] [spark] zhengruifeng commented on a diff in pull request #41516: [SPARK-43932][SQL][PYTHON][CONNECT] Add `current` like functions to Scala and Python

2023-06-08 Thread via GitHub
zhengruifeng commented on code in PR #41516: URL: https://github.com/apache/spark/pull/41516#discussion_r1223691672 ## sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala: ## @@ -5570,6 +5570,21 @@ class DataFrameFunctionsSuite extends QueryTest with

[GitHub] [spark] github-actions[bot] commented on pull request #38035: [SPARK-42438][SQL] Improve constraint propagation using multiTransform

2023-06-08 Thread via GitHub
github-actions[bot] commented on PR #38035: URL: https://github.com/apache/spark/pull/38035#issuecomment-1583660803 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #39185: [SPARK-41551][SQL] Dynamic/absolute path support in PathOutputCommitters

2023-06-08 Thread via GitHub
github-actions[bot] commented on PR #39185: URL: https://github.com/apache/spark/pull/39185#issuecomment-1583660791 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] closed pull request #40178: [MINOR][DOCS] Remove `Jenkins` from web page.

2023-06-08 Thread via GitHub
github-actions[bot] closed pull request #40178: [MINOR][DOCS] Remove `Jenkins` from web page. URL: https://github.com/apache/spark/pull/40178

[GitHub] [spark] github-actions[bot] commented on pull request #40189: [SPARK-27483] [SS] [SQL] Move fallback logic for StreamingRelation(V2) to an analyzer rule

2023-06-08 Thread via GitHub
github-actions[bot] commented on PR #40189: URL: https://github.com/apache/spark/pull/40189#issuecomment-1583660763 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #40221: [SPARK-41551][SQL] Dynamic/absolute path support in PathOutputCommitters

2023-06-08 Thread via GitHub
github-actions[bot] commented on PR #40221: URL: https://github.com/apache/spark/pull/40221#issuecomment-1583660746 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] Hisoka-X commented on pull request #41348: [SPARK-43203][SQL] Move all Drop Table case to DataSource V2

2023-06-08 Thread via GitHub
Hisoka-X commented on PR #41348: URL: https://github.com/apache/spark/pull/41348#issuecomment-1583654481 > Is the test failure related? > > ``` > DataFrameFunctionsSuite.DataFrame function and SQL functon parity > org.scalatest.exceptions.TestFailedException: Set("ceiling",

[GitHub] [spark] Hisoka-X commented on a diff in pull request #41348: [SPARK-43203][SQL] Move all Drop Table case to DataSource V2

2023-06-08 Thread via GitHub
Hisoka-X commented on code in PR #41348: URL: https://github.com/apache/spark/pull/41348#discussion_r1223661427 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/IdentifierImpl.java: ## @@ -30,12 +30,12 @@ * An {@link Identifier} implementation. */

[GitHub] [spark] WweiL commented on a diff in pull request #41129: [SPARK-43133] Scala Client DataStreamWriter Foreach support

2023-06-08 Thread via GitHub
WweiL commented on code in PR #41129: URL: https://github.com/apache/spark/pull/41129#discussion_r1223658122 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -2445,10 +2451,24 @@ class SparkConnectPlanner(val

[GitHub] [spark] WweiL commented on a diff in pull request #41129: [SPARK-43133] Scala Client DataStreamWriter Foreach support

2023-06-08 Thread via GitHub
WweiL commented on code in PR #41129: URL: https://github.com/apache/spark/pull/41129#discussion_r1223658042 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala: ## @@ -202,6 +208,28 @@ final class DataStreamWriter[T]

[GitHub] [spark] cloud-fan commented on pull request #41517: [SPARK-42290][SQL] Fix the OOM error can't be reported when AQE on

2023-06-08 Thread via GitHub
cloud-fan commented on PR #41517: URL: https://github.com/apache/spark/pull/41517#issuecomment-1583637991 late LGTM

[GitHub] [spark] HyukjinKwon closed pull request #41318: [SPARK-43803] [SS] [CONNECT] Improve awaitTermination() to handle client disconnects

2023-06-08 Thread via GitHub
HyukjinKwon closed pull request #41318: [SPARK-43803] [SS] [CONNECT] Improve awaitTermination() to handle client disconnects URL: https://github.com/apache/spark/pull/41318

[GitHub] [spark] HyukjinKwon closed pull request #41129: [SPARK-43133] Scala Client DataStreamWriter Foreach support

2023-06-08 Thread via GitHub
HyukjinKwon closed pull request #41129: [SPARK-43133] Scala Client DataStreamWriter Foreach support URL: https://github.com/apache/spark/pull/41129

[GitHub] [spark] HyukjinKwon commented on pull request #41129: [SPARK-43133] Scala Client DataStreamWriter Foreach support

2023-06-08 Thread via GitHub
HyukjinKwon commented on PR #41129: URL: https://github.com/apache/spark/pull/41129#issuecomment-1583617244 Merged to master.

[GitHub] [spark] HyukjinKwon commented on pull request #41318: [SPARK-43803] [SS] [CONNECT] Improve awaitTermination() to handle client disconnects

2023-06-08 Thread via GitHub
HyukjinKwon commented on PR #41318: URL: https://github.com/apache/spark/pull/41318#issuecomment-1583617127 Merged to master.

[GitHub] [spark] HyukjinKwon closed pull request #41511: [SPARK-43613][PS][CONNECT] Enable `pyspark.pandas.spark.functions.covar` in Spark Connect

2023-06-08 Thread via GitHub
HyukjinKwon closed pull request #41511: [SPARK-43613][PS][CONNECT] Enable `pyspark.pandas.spark.functions.covar` in Spark Connect URL: https://github.com/apache/spark/pull/41511

[GitHub] [spark] HyukjinKwon closed pull request #41512: [SPARK-43700][SPARK-43701][CONNECT][PS] Enable `TimedeltaOps.(sub|rsub)` with Spark Connect

2023-06-08 Thread via GitHub
HyukjinKwon closed pull request #41512: [SPARK-43700][SPARK-43701][CONNECT][PS] Enable `TimedeltaOps.(sub|rsub)` with Spark Connect URL: https://github.com/apache/spark/pull/41512

[GitHub] [spark] HyukjinKwon commented on pull request #41511: [SPARK-43613][PS][CONNECT] Enable `pyspark.pandas.spark.functions.covar` in Spark Connect

2023-06-08 Thread via GitHub
HyukjinKwon commented on PR #41511: URL: https://github.com/apache/spark/pull/41511#issuecomment-1583614730 Merged to master.

[GitHub] [spark] HyukjinKwon commented on pull request #41512: [SPARK-43700][SPARK-43701][CONNECT][PS] Enable `TimedeltaOps.(sub|rsub)` with Spark Connect

2023-06-08 Thread via GitHub
HyukjinKwon commented on PR #41512: URL: https://github.com/apache/spark/pull/41512#issuecomment-1583614693 Merged to master.

[GitHub] [spark] zhengruifeng commented on pull request #41519: [SPARK-43943][SQL][TESTS][FOLLOW] Fix `DataFrame function and SQL function parity` in `DataFrameFunctionsSuite`

2023-06-08 Thread via GitHub
zhengruifeng commented on PR #41519: URL: https://github.com/apache/spark/pull/41519#issuecomment-1583614159 merged to master

[GitHub] [spark] zhengruifeng closed pull request #41519: [SPARK-43943][SQL][TESTS][FOLLOW] Fix `DataFrame function and SQL function parity` in `DataFrameFunctionsSuite`

2023-06-08 Thread via GitHub
zhengruifeng closed pull request #41519: [SPARK-43943][SQL][TESTS][FOLLOW] Fix `DataFrame function and SQL function parity` in `DataFrameFunctionsSuite` URL: https://github.com/apache/spark/pull/41519

[GitHub] [spark] zhengruifeng commented on pull request #41519: [SPARK-43943][SQL][TESTS][FOLLOW] Fix `DataFrame function and SQL function parity` in `DataFrameFunctionsSuite`

2023-06-08 Thread via GitHub
zhengruifeng commented on PR #41519: URL: https://github.com/apache/spark/pull/41519#issuecomment-1583612968 I can repro this issue, so this fix LGTM

[GitHub] [spark] zhengruifeng commented on pull request #41519: [SPARK-43943][SQL][TESTS][FOLLOW] Fix `DataFrame function and SQL function parity` in `DataFrameFunctionsSuite`

2023-06-08 Thread via GitHub
zhengruifeng commented on PR #41519: URL: https://github.com/apache/spark/pull/41519#issuecomment-1583612759 @LuciferYang thanks for the catch. I was not aware of this failure, since the `sql - other` failed before it ran this test. -- This is an automated message from the Apache Git

[GitHub] [spark] rangadi commented on a diff in pull request #41129: [SPARK-43133] Scala Client DataStreamWriter Foreach support

2023-06-08 Thread via GitHub
rangadi commented on code in PR #41129: URL: https://github.com/apache/spark/pull/41129#discussion_r1223638394 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -2445,10 +2451,24 @@ class SparkConnectPlanner(val

[GitHub] [spark] amaliujia commented on pull request #41427: [SPARK-43888][CONNECT][FOLLOW-UP] Spark Connect client should depend on common-utils explicitly

2023-06-08 Thread via GitHub
amaliujia commented on PR #41427: URL: https://github.com/apache/spark/pull/41427#issuecomment-1583598012 @LuciferYang thank you so much! It seems to be just a one-line change. Done. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] dtenedor commented on pull request #41191: [SPARK-43529][SQL] Support general constant expressions as CREATE/REPLACE TABLE OPTIONS values

2023-06-08 Thread via GitHub
dtenedor commented on PR #41191: URL: https://github.com/apache/spark/pull/41191#issuecomment-1583566600 (Note, this is passing all CI again.) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] jimmyzzxhlh commented on pull request #39691: [SPARK-31561][SQL] Add QUALIFY clause

2023-06-08 Thread via GitHub
jimmyzzxhlh commented on PR #39691: URL: https://github.com/apache/spark/pull/39691#issuecomment-1583566431 ^ Same question -- Any plan to release this feature? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] pengzhon-db commented on pull request #41318: [SPARK-43803] [SS] [CONNECT] Improve awaitTermination() to handle client disconnects

2023-06-08 Thread via GitHub
pengzhon-db commented on PR #41318: URL: https://github.com/apache/spark/pull/41318#issuecomment-1583565719 @HyukjinKwon can you help merge this? Thanks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark-connect-go] hiboyang commented on a diff in pull request #10: [SPARK-43351] Add DataFrame writer and reader prototype code

2023-06-08 Thread via GitHub
hiboyang commented on code in PR #10: URL: https://github.com/apache/spark-connect-go/pull/10#discussion_r1223599098 ## client/sql/dataframe.go: ## @@ -31,6 +31,7 @@ type DataFrame interface { Show(numRows int, truncate bool) error Schema() (*StructType, error)

[GitHub] [spark] allisonwang-db commented on a diff in pull request #41316: [SPARK-43798][SQL][PYTHON] Support Python user-defined table functions

2023-06-08 Thread via GitHub
allisonwang-db commented on code in PR #41316: URL: https://github.com/apache/spark/pull/41316#discussion_r1223573185 ## python/docs/source/reference/pyspark.sql/udtf.rst: ## Review Comment: Thanks, I will add the link. Can I compile the doc locally to see how it looks

[GitHub] [spark] yliou commented on pull request #40503: [SPARK-42830] [UI] Link skipped stages on Spark UI

2023-06-08 Thread via GitHub
yliou commented on PR #40503: URL: https://github.com/apache/spark/pull/40503#issuecomment-1583357328 CC: @HyukjinKwon is there interest in this feature? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] rangadi commented on a diff in pull request #41146: [SPARK-43474] [SS] [CONNECT] Add a spark connect function to create DataFrame reference

2023-06-08 Thread via GitHub
rangadi commented on code in PR #41146: URL: https://github.com/apache/spark/pull/41146#discussion_r1223516152 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectCachedDataFrameManager.scala: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the

[GitHub] [spark] ueshin commented on a diff in pull request #41316: [SPARK-43798][SQL][PYTHON] Support Python user-defined table functions

2023-06-08 Thread via GitHub
ueshin commented on code in PR #41316: URL: https://github.com/apache/spark/pull/41316#discussion_r1223463725 ## python/docs/source/reference/pyspark.sql/udtf.rst: ## Review Comment: Need an entry in `python/docs/source/reference/pyspark.sql/index.rst` or

[GitHub] [spark] zeruibao commented on pull request #41521: [SPARK-43380][SQL] Fix conversion of Avro logical timestamp type to Long

2023-06-08 Thread via GitHub
zeruibao commented on PR #41521: URL: https://github.com/apache/spark/pull/41521#issuecomment-1583330021 Yeah, I think https://github.com/apache/spark/pull/41052 is only merged to master branch. @dongjoon-hyun -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] viirya commented on a diff in pull request #41348: [SPARK-43203][SQL] Move all Drop Table case to DataSource V2

2023-06-08 Thread via GitHub
viirya commented on code in PR #41348: URL: https://github.com/apache/spark/pull/41348#discussion_r1223519017 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/IdentifierImpl.java: ## @@ -30,12 +30,12 @@ * An {@link Identifier} implementation. */

[GitHub] [spark] viirya commented on pull request #41348: [SPARK-43203][SQL] Move all Drop Table case to DataSource V2

2023-06-08 Thread via GitHub
viirya commented on PR #41348: URL: https://github.com/apache/spark/pull/41348#issuecomment-1583295230 Is the test failure related? ``` DataFrameFunctionsSuite.DataFrame function and SQL functon parity org.scalatest.exceptions.TestFailedException: Set("ceiling", "negative",

[GitHub] [spark] dongjoon-hyun commented on pull request #41517: [SPARK-42290][SQL] Fix the OOM error can't be reported when AQE on

2023-06-08 Thread via GitHub
dongjoon-hyun commented on PR #41517: URL: https://github.com/apache/spark/pull/41517#issuecomment-1583290913 Thank you, @Hisoka-X , @LuciferYang , @kazuyukitanimura . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] dongjoon-hyun closed pull request #41517: [SPARK-42290][SQL] Fix the OOM error can't be reported when AQE on

2023-06-08 Thread via GitHub
dongjoon-hyun closed pull request #41517: [SPARK-42290][SQL] Fix the OOM error can't be reported when AQE on URL: https://github.com/apache/spark/pull/41517 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] dongjoon-hyun commented on pull request #41517: [SPARK-42290][SQL] Fix the OOM error can't be reported when AQE on

2023-06-08 Thread via GitHub
dongjoon-hyun commented on PR #41517: URL: https://github.com/apache/spark/pull/41517#issuecomment-1583274941 I verified manually. Merged to master/3.4. ``` $ build/sbt "sql/testOnly *.QueryExecutionErrorsSuite -- -z SPARK-42290" [info] QueryExecutionErrorsSuite: 13:10:15.573

[GitHub] [spark] tgravescs commented on pull request #34622: [SPARK-37340][UI] Display StageIds in Operators for SQL UI

2023-06-08 Thread via GitHub
tgravescs commented on PR #34622: URL: https://github.com/apache/spark/pull/34622#issuecomment-1583262505 Sure, it's been a while, but I think I tried this out and saw some performance issues with it. I'd have to look at it again to remember. Did you run any performance tests? --

[GitHub] [spark] yliou commented on pull request #40502: [SPARK-42829] [UI] add repeat identifier to cached RDD on stage page

2023-06-08 Thread via GitHub
yliou commented on PR #40502: URL: https://github.com/apache/spark/pull/40502#issuecomment-1583251122 CC: @HyukjinKwon is there interest in this feature? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] dongjoon-hyun commented on pull request #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql'

2023-06-08 Thread via GitHub
dongjoon-hyun commented on PR #41520: URL: https://github.com/apache/spark/pull/41520#issuecomment-1583246487 Merged to master/3.4. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] yliou commented on pull request #34622: [SPARK-37340][UI] Display StageIds in Operators for SQL UI

2023-06-08 Thread via GitHub
yliou commented on PR #34622: URL: https://github.com/apache/spark/pull/34622#issuecomment-1583245920 @tgravescs @martin-g should I create another pull request for this feature to try to get it merged? I'm unable to reopen the PR. -- This is an automated message from the Apache Git

[GitHub] [spark] dongjoon-hyun closed pull request #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql'

2023-06-08 Thread via GitHub
dongjoon-hyun closed pull request #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql' URL: https://github.com/apache/spark/pull/41520 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] dongjoon-hyun commented on pull request #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql'

2023-06-08 Thread via GitHub
dongjoon-hyun commented on PR #41520: URL: https://github.com/apache/spark/pull/41520#issuecomment-1583208534 Thank you, @dtenedor and @gengliangwang I verified the relocated suite manually. ``` $ build/sbt "sql/testOnly *.ResolveDefaultColumnsSuite" ... [info]

[GitHub] [spark] zeruibao opened a new pull request, #41521: [SPARK-43380][SQL] Fix conversion of Avro logical timestamp type to Long

2023-06-08 Thread via GitHub
zeruibao opened a new pull request, #41521: URL: https://github.com/apache/spark/pull/41521 ### What changes were proposed in this pull request? Fix conversion of Avro logical timestamp type to Long ### Why are the changes needed? The fix in

[GitHub] [spark] ueshin commented on a diff in pull request #41316: [SPARK-43798][SQL][PYTHON] Support Python user-defined table functions

2023-06-08 Thread via GitHub
ueshin commented on code in PR #41316: URL: https://github.com/apache/spark/pull/41316#discussion_r1223418548 ## python/pyspark/worker.py: ## @@ -871,6 +941,16 @@ def process(): else: process() +if eval_type == PythonEvalType.SQL_TABLE_UDF: +

[GitHub] [spark] dtenedor commented on pull request #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql'

2023-06-08 Thread via GitHub
dtenedor commented on PR #41520: URL: https://github.com/apache/spark/pull/41520#issuecomment-1583148512 LGTM! Thanks @dongjoon-hyun for the clean up. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] dongjoon-hyun commented on pull request #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql'

2023-06-08 Thread via GitHub
dongjoon-hyun commented on PR #41520: URL: https://github.com/apache/spark/pull/41520#issuecomment-1583145315 cc @dtenedor and @gengliangwang -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #40652: [SPARK-43018][SQL] Fix bug for INSERT commands with timestamp literals

2023-06-08 Thread via GitHub
dongjoon-hyun commented on code in PR #40652: URL: https://github.com/apache/spark/pull/40652#discussion_r1223414585 ## sql/core/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveDefaultColumnsSuite.scala: ## @@ -0,0 +1,77 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] dongjoon-hyun opened a new pull request, #41520: [MINOR][SQL][TESTS] Move ResolveDefaultColumnsSuite to 'o.a.s.sql'

2023-06-08 Thread via GitHub
dongjoon-hyun opened a new pull request, #41520: URL: https://github.com/apache/spark/pull/41520 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###
