HyukjinKwon closed pull request #39390: [SPARK-41840][CONNECT][PYTHON] Add the
missing alias `groupby`
URL: https://github.com/apache/spark/pull/39390
HyukjinKwon closed pull request #39392: [SPARK-41846][CONNECT][PYTHON] Enable
doctests for window functions
URL: https://github.com/apache/spark/pull/39392
HyukjinKwon commented on PR #39392:
URL: https://github.com/apache/spark/pull/39392#issuecomment-1371554769
Merged to master.
HyukjinKwon commented on PR #39390:
URL: https://github.com/apache/spark/pull/39390#issuecomment-1371554309
Merged to master.
HyukjinKwon commented on code in PR #39393:
URL: https://github.com/apache/spark/pull/39393#discussion_r1061976193
##
python/pyspark/sql/tests/test_dataframe.py:
##
@@ -553,13 +553,17 @@ def test_generic_hints(self):
def test_extended_hint_types(self):
df =
lu-wang-dl commented on code in PR #39188:
URL: https://github.com/apache/spark/pull/39188#discussion_r1061973010
##
python/pyspark/ml/torch/distributor.py:
##
@@ -0,0 +1,491 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
akpatnam25 commented on PR #38959:
URL: https://github.com/apache/spark/pull/38959#issuecomment-1371548903
@mridulm updated the PR so that it has no protocol/server-side changes. In this
case, we create a new connection every time a SASL retry is triggered.
Confirmed that this is the
lu-wang-dl commented on code in PR #39188:
URL: https://github.com/apache/spark/pull/39188#discussion_r1061968170
##
python/pyspark/ml/torch/distributor.py:
##
@@ -0,0 +1,491 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
rithwik-db commented on code in PR #39188:
URL: https://github.com/apache/spark/pull/39188#discussion_r1061943616
##
python/pyspark/ml/torch/distributor.py:
##
@@ -0,0 +1,491 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
rithwik-db commented on code in PR #39188:
URL: https://github.com/apache/spark/pull/39188#discussion_r1061941062
##
python/pyspark/ml/torch/distributor.py:
##
@@ -0,0 +1,491 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
rithwik-db commented on code in PR #39188:
URL: https://github.com/apache/spark/pull/39188#discussion_r1061940175
##
python/pyspark/ml/torch/distributor.py:
##
@@ -0,0 +1,491 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
dongjoon-hyun commented on code in PR #39268:
URL: https://github.com/apache/spark/pull/39268#discussion_r1061939747
##
sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala:
##
@@ -140,6 +166,7 @@ object SQLExecution {
} finally {
leewyang commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r1061936672
##
python/pyspark/ml/functions.py:
##
@@ -106,6 +138,605 @@ def array_to_vector(col: Column) -> Column:
return
dongjoon-hyun commented on code in PR #39268:
URL: https://github.com/apache/spark/pull/39268#discussion_r1061935560
##
sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala:
##
@@ -55,6 +56,28 @@ object SQLExecution {
}
}
+ /**
+ * Track the
dongjoon-hyun commented on code in PR #39268:
URL: https://github.com/apache/spark/pull/39268#discussion_r1061933315
##
sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala:
##
@@ -55,6 +56,28 @@ object SQLExecution {
}
}
+ /**
+ * Track the
dongjoon-hyun commented on code in PR #39268:
URL: https://github.com/apache/spark/pull/39268#discussion_r1061928098
##
core/src/main/scala/org/apache/spark/internal/config/UI.scala:
##
@@ -229,4 +229,11 @@ private[spark] object UI {
.stringConf
dongjoon-hyun commented on code in PR #39268:
URL: https://github.com/apache/spark/pull/39268#discussion_r1061927117
##
core/src/main/resources/org/apache/spark/ui/static/webui.css:
##
@@ -187,6 +187,18 @@ pre {
display: none;
}
+.sub-execution-list {
+ font-size:0.9rem;
leewyang commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r1061926376
##
python/pyspark/ml/model_cache.py:
##
@@ -0,0 +1,44 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.
leewyang commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r1061926045
##
python/pyspark/ml/functions.py:
##
@@ -106,6 +117,474 @@ def array_to_vector(col: Column) -> Column:
return
leewyang commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r1061925652
##
python/pyspark/ml/functions.py:
##
@@ -106,6 +117,474 @@ def array_to_vector(col: Column) -> Column:
return
gengliangwang closed pull request #39357: [SPARK-41677][CORE][SQL][SS][UI] Add
Protobuf serializer for StreamingQueryProgressWrapper
URL: https://github.com/apache/spark/pull/39357
gengliangwang commented on PR #39357:
URL: https://github.com/apache/spark/pull/39357#issuecomment-1371482013
Thanks, merging to master
gengliangwang commented on code in PR #39357:
URL: https://github.com/apache/spark/pull/39357#discussion_r1061918139
##
core/src/main/protobuf/org/apache/spark/status/protobuf/store_types.proto:
##
@@ -684,3 +684,54 @@ message ExecutorPeakMetricsDistributions {
repeated
gengliangwang commented on code in PR #39226:
URL: https://github.com/apache/spark/pull/39226#discussion_r1061892643
##
core/src/test/scala/org/apache/spark/status/AutoCleanupLiveUIDirSuite.scala:
##
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
gengliangwang closed pull request #39286: [SPARK-41768][CORE] Refactor the
definition of enum to follow with the code style
URL: https://github.com/apache/spark/pull/39286
gengliangwang commented on PR #39286:
URL: https://github.com/apache/spark/pull/39286#issuecomment-1371426746
Thanks, merging to master
lu-wang-dl commented on code in PR #39188:
URL: https://github.com/apache/spark/pull/39188#discussion_r1061884522
##
python/pyspark/ml/torch/distributor.py:
##
@@ -0,0 +1,491 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
lu-wang-dl commented on code in PR #39188:
URL: https://github.com/apache/spark/pull/39188#discussion_r1061884189
##
python/pyspark/ml/torch/distributor.py:
##
@@ -0,0 +1,491 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
srowen commented on PR #39391:
URL: https://github.com/apache/spark/pull/39391#issuecomment-1371405185
Can you try rerunning the tests? They're stuck or something.
gerashegalov commented on code in PR #39383:
URL: https://github.com/apache/spark/pull/39383#discussion_r1061843543
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala:
##
@@ -634,7 +634,12 @@ case class RegExpReplace(subject:
srowen commented on code in PR #39286:
URL: https://github.com/apache/spark/pull/39286#discussion_r1061854567
##
core/src/main/protobuf/org/apache/spark/status/protobuf/store_types.proto:
##
@@ -27,10 +27,10 @@ package org.apache.spark.status.protobuf;
enum
gengliangwang commented on code in PR #39286:
URL: https://github.com/apache/spark/pull/39286#discussion_r1061842276
##
core/src/main/protobuf/org/apache/spark/status/protobuf/store_types.proto:
##
@@ -27,10 +27,10 @@ package org.apache.spark.status.protobuf;
enum
gengliangwang commented on code in PR #39385:
URL: https://github.com/apache/spark/pull/39385#discussion_r1061839322
##
sql/core/src/test/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListenerSuite.scala:
##
@@ -1007,6 +1004,36 @@ class SQLAppStatusListenerSuite extends
dongjoon-hyun commented on PR #39371:
URL: https://github.com/apache/spark/pull/39371#issuecomment-1371346517
The Apache Spark community always recommends using the latest release. In the
case of SPARK-41030, `v3.3.2` is the soonest release with that fix.
dongjoon-hyun commented on PR #39371:
URL: https://github.com/apache/spark/pull/39371#issuecomment-1371344946
Before `v3.2.4`,
- `v3.3.2` will arrive in the Feb/March timeframe
- `v3.4.0` feature freeze will start on January 16th and the RC will start on
dongjoon-hyun commented on PR #39371:
URL: https://github.com/apache/spark/pull/39371#issuecomment-1371342486
BTW, `v3.2.4` is expected in April 2023 as an EOL release, according to the
release cadence.
bjornjorgensen commented on PR #39371:
URL: https://github.com/apache/spark/pull/39371#issuecomment-1371339204
@kyle-ai2 Yes, this PR is part of the 3.2 branch now.
kyle-ai2 commented on PR #39371:
URL: https://github.com/apache/spark/pull/39371#issuecomment-1371329812
Thanks everyone. Will this be released in a new Spark 3.2.4 image?
MaxGekk closed pull request #39284: [SPARK-41573][SQL] Assign name to
_LEGACY_ERROR_TEMP_2136
URL: https://github.com/apache/spark/pull/39284
MaxGekk commented on PR #39284:
URL: https://github.com/apache/spark/pull/39284#issuecomment-1371324425
+1, LGTM. Merging to master.
Thank you, @itholic.
neshkeev commented on PR #39350:
URL: https://github.com/apache/spark/pull/39350#issuecomment-1371297053
@srowen, thank you. Please tell me when I can safely delete the branch
MaxGekk commented on code in PR #39282:
URL: https://github.com/apache/spark/pull/39282#discussion_r1061771537
##
sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala:
##
@@ -680,6 +681,18 @@ class QueryCompilationErrorsSuite
context =
lu-wang-dl commented on code in PR #39188:
URL: https://github.com/apache/spark/pull/39188#discussion_r1061769537
##
python/pyspark/ml/torch/distributor.py:
##
@@ -0,0 +1,491 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
viirya commented on code in PR #39131:
URL: https://github.com/apache/spark/pull/39131#discussion_r1061750586
##
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/LeftSemiAntiJoinPushDownSuite.scala:
##
@@ -46,7 +46,7 @@ class LeftSemiPushdownSuite extends
smallzhongfeng commented on PR #39368:
URL: https://github.com/apache/spark/pull/39368#issuecomment-1371229625
cc @mccheah @cloud-fan @HyukjinKwon. If I have misunderstood, thank you very
much for pointing it out :).
cloud-fan commented on PR #39131:
URL: https://github.com/apache/spark/pull/39131#issuecomment-1371227066
@EnricoMi thanks for the fix! Which Spark version did this bug first appear in?
thejdeep commented on code in PR #36165:
URL: https://github.com/apache/spark/pull/36165#discussion_r1061723365
##
core/src/main/scala/org/apache/spark/status/storeTypes.scala:
##
@@ -233,6 +243,38 @@ private[spark] class TaskDataWrapper(
val shuffleLocalBytesRead: Long,
thejdeep commented on PR #36165:
URL: https://github.com/apache/spark/pull/36165#issuecomment-1371213103
Fixed failing tests and updated commit messages to reflect the overall
changes. PTAL. Thanks
itholic commented on PR #39388:
URL: https://github.com/apache/spark/pull/39388#issuecomment-1371198964
nit:
> Implement DataFrame.hint for pyspark
Maybe `DataFrame.repartition` or `RepartitionByExpression` instead of
`DataFrame.hint`?
itholic commented on code in PR #39383:
URL: https://github.com/apache/spark/pull/39383#discussion_r1061705413
##
sql/core/src/test/scala/org/apache/spark/sql/StringFunctionsSuite.scala:
##
@@ -663,4 +664,18 @@ class StringFunctionsSuite extends QueryTest with
EnricoMi commented on PR #38223:
URL: https://github.com/apache/spark/pull/38223#issuecomment-1371196200
@HyukjinKwon @cloud-fan @xinrong-meng can we get this into Spark 3.4?
EnricoMi closed pull request #38676: [SPARK-41162][SQL] Do not push down
anti-join predicates that become ambiguous
URL: https://github.com/apache/spark/pull/38676
EnricoMi commented on PR #38676:
URL: https://github.com/apache/spark/pull/38676#issuecomment-1371192971
Closed in favour of #39131.
srowen commented on code in PR #39326:
URL: https://github.com/apache/spark/pull/39326#discussion_r1061696714
##
core/pom.xml:
##
@@ -181,6 +181,10 @@
commons-codec
commons-codec
+
+ org.apache.commons
+ commons-compress
+
Review Comment:
itholic commented on code in PR #39393:
URL: https://github.com/apache/spark/pull/39393#discussion_r1061696091
##
python/pyspark/sql/connect/dataframe.py:
##
@@ -478,9 +478,10 @@ def to_jcols(
def hint(self, name: str, *params: Any) -> "DataFrame":
for param in
srowen commented on code in PR #39286:
URL: https://github.com/apache/spark/pull/39286#discussion_r1061695522
##
core/src/main/protobuf/org/apache/spark/status/protobuf/store_types.proto:
##
@@ -27,10 +27,10 @@ package org.apache.spark.status.protobuf;
enum
srowen closed pull request #39350: [MINOR] Fix a typo "from from" -> "from"
URL: https://github.com/apache/spark/pull/39350
srowen commented on PR #39350:
URL: https://github.com/apache/spark/pull/39350#issuecomment-1371179995
I'll merge it. You didn't enable the tests to run, but these are just .md file
changes that can't affect anything else
MaxGekk commented on code in PR #39260:
URL: https://github.com/apache/spark/pull/39260#discussion_r1061672988
##
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala:
##
@@ -2405,22 +2405,24 @@ private[sql] object QueryCompilationErrors extends
techaddict commented on PR #39385:
URL: https://github.com/apache/spark/pull/39385#issuecomment-1371130899
@LuciferYang good catch, thanks
itholic opened a new pull request, #39394:
URL: https://github.com/apache/spark/pull/39394
### What changes were proposed in this pull request?
This PR proposes to assign the name "TASK_WRITE_FAILED" to
_LEGACY_ERROR_TEMP_2054.
### Why are the changes needed?
olaky commented on code in PR #39314:
URL: https://github.com/apache/spark/pull/39314#discussion_r1061637682
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala:
##
@@ -249,8 +255,26 @@ object FileSourceStrategy extends Strategy with
techaddict closed pull request #39355: [SPARK-40263][CORE] Use interruptible
lock instead of synchronized in TransportClientFactory.createClient()
URL: https://github.com/apache/spark/pull/39355
cloud-fan commented on code in PR #39277:
URL: https://github.com/apache/spark/pull/39277#discussion_r1061624700
##
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/V1WritesHiveUtils.scala:
##
@@ -105,4 +112,164 @@ trait V1WritesHiveUtils {
.map(_ =>
cloud-fan commented on code in PR #39277:
URL: https://github.com/apache/spark/pull/39277#discussion_r1061620498
##
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala:
##
@@ -294,3 +282,40 @@ case class InsertIntoHiveTable(
override
cloud-fan commented on code in PR #39277:
URL: https://github.com/apache/spark/pull/39277#discussion_r1061619079
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriteFiles.scala:
##
@@ -53,13 +59,17 @@ case class WriteFiles(child: LogicalPlan) extends
techaddict opened a new pull request, #39393:
URL: https://github.com/apache/spark/pull/39393
### What changes were proposed in this pull request?
Spark Connect DataFrame hint parameters can be str, list, float, or int. This
is done for parity with PySpark's DataFrame.hint.
### Why
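For illustration, a minimal sketch of the parameter types in question, using the existing PySpark `DataFrame.hint` API; the specific hints and DataFrame below are examples, not taken from the PR:
```python
# Sketch: DataFrame.hint accepts str, list, float, and int parameters.
# The hints and data below are illustrative examples only.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.range(100)

df.hint("broadcast")              # no parameters
df.hint("coalesce", 5)            # int parameter
df.hint("repartition", 10, "id")  # mixed int and str parameters
```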
grundprinzip commented on code in PR #39361:
URL: https://github.com/apache/spark/pull/39361#discussion_r1061588135
##
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala:
##
@@ -18,14 +18,15 @@ package org.apache.spark.sql.connect.config
cloud-fan commented on PR #39343:
URL: https://github.com/apache/spark/pull/39343#issuecomment-1371044732
cc @bogdanghit
cloud-fan commented on code in PR #38163:
URL: https://github.com/apache/spark/pull/38163#discussion_r1061576773
##
sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala:
##
@@ -337,6 +338,7 @@ case class WindowInPandasExec(
if
cloud-fan commented on PR #39170:
URL: https://github.com/apache/spark/pull/39170#issuecomment-1371031081
```
SELECT *
FROM bf1
JOIN bf2
JOIN bf3
ON bf1.c1 = bf2.c2
AND bf3.c3 = bf2.c2
WHERE bf2.a2 = 5
```
Can you show the query plan before
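For reference, a minimal sketch of how the requested plan could be printed from PySpark, assuming `bf1`, `bf2`, and `bf3` are registered as temporary views (the query is the one quoted above):
```python
# Sketch: print the formatted query plan for the query quoted above.
# Assumes bf1, bf2, and bf3 are registered as temporary views.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

plan_df = spark.sql("""
    SELECT *
    FROM bf1
    JOIN bf2
    JOIN bf3
    ON bf1.c1 = bf2.c2
    AND bf3.c3 = bf2.c2
    WHERE bf2.a2 = 5
""")
plan_df.explain(mode="formatted")
```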
cloud-fan commented on code in PR #39314:
URL: https://github.com/apache/spark/pull/39314#discussion_r1061563125
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala:
##
@@ -249,8 +255,26 @@ object FileSourceStrategy extends Strategy
vicennial commented on code in PR #39361:
URL: https://github.com/apache/spark/pull/39361#discussion_r1061524162
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/SparkConnectClientSuite.scala:
##
@@ -16,17 +16,79 @@
*/
package
Daniel-Davies commented on code in PR #38867:
URL: https://github.com/apache/spark/pull/38867#discussion_r1061515779
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##
@@ -4601,6 +4601,231 @@ case class ArrayExcept(left:
zhengruifeng opened a new pull request, #39392:
URL: https://github.com/apache/spark/pull/39392
### What changes were proposed in this pull request?
Enable doctests for window functions
### Why are the changes needed?
For test coverage.
### Does this PR introduce
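As an illustration of the kind of doctest this enables, a hedged sketch using `row_number` over a window; the data and column names are hypothetical:
```python
# Sketch: a doctest-style window function example; data is hypothetical.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as sf

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [("a", 1), ("a", 2), ("b", 3)], ["key", "value"]
)
w = Window.partitionBy("key").orderBy("value")
df.withColumn("rn", sf.row_number().over(w)).show()
# Expected (partition order may vary):
# +---+-----+---+
# |key|value| rn|
# +---+-----+---+
# |  a|    1|  1|
# |  a|    2|  2|
# |  b|    3|  1|
# +---+-----+---+
```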
LuciferYang opened a new pull request, #39391:
URL: https://github.com/apache/spark/pull/39391
### What changes were proposed in this pull request?
This PR aims to upgrade dropwizard metrics to 4.2.15.
### Why are the changes needed?
The release notes are as follows:
-
zhengruifeng commented on PR #39390:
URL: https://github.com/apache/spark/pull/39390#issuecomment-1370856219
cc @HyukjinKwon
zhengruifeng opened a new pull request, #39390:
URL: https://github.com/apache/spark/pull/39390
### What changes were proposed in this pull request?
Add the missing alias `groupby`
### Why are the changes needed?
For API coverage and test coverage.
### Does this PR
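A minimal sketch of the alias in use; the data is hypothetical:
```python
# Sketch: `groupby` is an alias for `groupBy`; both produce the same result.
from pyspark.sql import SparkSession
from pyspark.sql import functions as sf

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("a", 1), ("a", 2), ("b", 3)], ["key", "value"])

df.groupBy("key").agg(sf.sum("value")).show()
df.groupby("key").agg(sf.sum("value")).show()  # same as groupBy
```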
LuciferYang commented on PR #39385:
URL: https://github.com/apache/spark/pull/39385#issuecomment-1370840366
> @LuciferYang Thanks for fixing this! Could you add tests for the SQL UI
with RocksDB as the backend? For example, you can have a new test suite based
on `SQLAppStatusListenerSuite`
LuciferYang commented on code in PR #39385:
URL: https://github.com/apache/spark/pull/39385#discussion_r1061409595
##
core/src/main/protobuf/org/apache/spark/status/protobuf/store_types.proto:
##
@@ -395,7 +395,8 @@ message SQLExecutionUIData {
optional string error_message
itholic opened a new pull request, #39389:
URL: https://github.com/apache/spark/pull/39389
### What changes were proposed in this pull request?
This PR proposes to migrate `_LEGACY_ERROR_TEMP_2136` into `INTERNAL_ERROR`.
### Why are the changes needed?
We
LuciferYang commented on code in PR #39385:
URL: https://github.com/apache/spark/pull/39385#discussion_r1061400974
##
sql/core/src/main/scala/org/apache/spark/status/protobuf/sql/SparkPlanGraphWrapperSerializer.scala:
##
@@ -53,8 +53,9 @@ class SparkPlanGraphWrapperSerializer
WeichenXu123 commented on code in PR #39188:
URL: https://github.com/apache/spark/pull/39188#discussion_r1061399015
##
python/pyspark/ml/torch/distributor.py:
##
@@ -0,0 +1,491 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
juechen507 commented on PR #30003:
URL: https://github.com/apache/spark/pull/30003#issuecomment-1370825133
Can a Hive bucketed table written by Spark be read by spark-sql to do bucket
filter pruning, joins, and group-by?
In my test, bucket information cannot be used for group-by
itholic commented on code in PR #39387:
URL: https://github.com/apache/spark/pull/39387#discussion_r1061376821
##
python/pyspark/sql/tests/test_functions.py:
##
@@ -1030,9 +1031,13 @@ def test_lit_list(self):
self.assertEqual(actual, expected)
df =
itholic commented on code in PR #39387:
URL: https://github.com/apache/spark/pull/39387#discussion_r1061372891
##
python/pyspark/errors/error-classes.json:
##
@@ -0,0 +1,7 @@
+{
+ "COLUMN_IN_LIST" : {
+"message" : [
+ " does not allow a column in a list"
+]
dengziming opened a new pull request, #39388:
URL: https://github.com/apache/spark/pull/39388
### What changes were proposed in this pull request?
Implement DataFrame.hint for pyspark
### Why are the changes needed?
For API coverage
### Does this PR introduce _any_
itholic commented on code in PR #39387:
URL: https://github.com/apache/spark/pull/39387#discussion_r1061374539
##
python/pyspark/sql/functions.py:
##
@@ -172,7 +173,9 @@ def lit(col: Any) -> Column:
return col
elif isinstance(col, list):
if
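For context, a hedged sketch of the behavior this branch appears to govern: `lit` applied to a plain Python list builds an array literal, while a `Column` inside the list is rejected via the new `COLUMN_IN_LIST` error class quoted elsewhere in this thread; the exact exception type is an assumption:
```python
# Sketch of lit() with a list argument, inferred from the snippet above.
from pyspark.sql import SparkSession
from pyspark.sql import functions as sf

spark = SparkSession.builder.getOrCreate()

# A plain list becomes an array literal column.
spark.range(1).select(sf.lit([1, 2, 3]).alias("arr")).show()

# A Column inside the list should be rejected (COLUMN_IN_LIST);
# the exact exception type raised is an assumption here.
try:
    sf.lit([sf.col("id"), 2])
except Exception as exc:
    print(type(exc).__name__, exc)
```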
itholic commented on code in PR #39387:
URL: https://github.com/apache/spark/pull/39387#discussion_r1061372713
##
python/pyspark/errors/error-classes.json:
##
@@ -0,0 +1,7 @@
+{
+ "COLUMN_IN_LIST" : {
+"message" : [
+ " does not allow a column in a list"
+]
itholic commented on code in PR #39387:
URL: https://github.com/apache/spark/pull/39387#discussion_r1061372309
##
python/pyspark/errors/error-classes.json:
##
@@ -0,0 +1,7 @@
+{
+ "COLUMN_IN_LIST" : {
Review Comment:
Will add more error-classes in follow-up PRs.
LuciferYang commented on code in PR #39385:
URL: https://github.com/apache/spark/pull/39385#discussion_r1061369728
##
core/src/main/protobuf/org/apache/spark/status/protobuf/store_types.proto:
##
@@ -395,7 +395,8 @@ message SQLExecutionUIData {
optional string error_message
itholic opened a new pull request, #39387:
URL: https://github.com/apache/spark/pull/39387
### What changes were proposed in this pull request?
This PR proposes to introduce `pyspark.errors` and error classes to unify and
improve the errors generated by PySpark under a single path.
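A hedged sketch of the mechanism, based on the `error-classes.json` excerpt quoted earlier; the `<func_name>` placeholder and the lookup helper below are assumptions for illustration, not the PR's actual API:
```python
# Sketch: resolving an error class to a message template.
# The helper and the <func_name> placeholder are hypothetical;
# only the JSON shape mirrors the quoted error-classes.json excerpt.
import json

ERROR_CLASSES = json.loads("""
{
  "COLUMN_IN_LIST": {
    "message": ["<func_name> does not allow a column in a list"]
  }
}
""")

def error_message(error_class: str, **params: str) -> str:
    template = "".join(ERROR_CLASSES[error_class]["message"])
    for name, value in params.items():
        template = template.replace(f"<{name}>", value)
    return template

print(error_message("COLUMN_IN_LIST", func_name="lit"))
# lit does not allow a column in a list
```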
zhengruifeng commented on PR #39386:
URL: https://github.com/apache/spark/pull/39386#issuecomment-1370782139
cc @HyukjinKwon @grundprinzip
AmplabJenkins commented on PR #39350:
URL: https://github.com/apache/spark/pull/39350#issuecomment-1370754460
Can one of the admins verify this patch?
vicennial commented on code in PR #39361:
URL: https://github.com/apache/spark/pull/39361#discussion_r1061307438
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/SparkConnectClientSuite.scala:
##
@@ -16,17 +16,79 @@
*/
package
vicennial commented on code in PR #39361:
URL: https://github.com/apache/spark/pull/39361#discussion_r1061305904
##
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala:
##
@@ -18,14 +18,15 @@ package org.apache.spark.sql.connect.config
LorenzoMartini commented on PR #39367:
URL: https://github.com/apache/spark/pull/39367#issuecomment-1370667960
Tests are passing; the failures are in `linters`, which seems like a very
common flake, and in `docker integration tests`, which doesn't seem related at
all, so it's probably just another flake
LorenzoMartini commented on PR #39366:
URL: https://github.com/apache/spark/pull/39366#issuecomment-1370664846
Tests are passing. Only the `python linter` tests are failing, but there is no
related change, and it seems like they are failing in many other PRs, so it's a
flake