ivoson commented on code in PR #39410:
URL: https://github.com/apache/spark/pull/39410#discussion_r1064356843
##
core/src/test/scala/org/apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala:
##
@@ -403,6 +405,92 @@ class CoarseGrainedSchedulerBackendSuite extends
WangGuangxin commented on code in PR #38877:
URL: https://github.com/apache/spark/pull/38877#discussion_r1064352128
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/ScriptTransformation.scala:
##
@@ -32,7 +32,13 @@ case class ScriptTransformation(
MaxGekk commented on code in PR #39464:
URL: https://github.com/apache/spark/pull/39464#discussion_r1064350822
##
core/src/main/resources/error/README.md:
##
@@ -24,27 +24,27 @@ Throw with arbitrary error message:
### After
-`error-class.json`
+`error-classes.json`
MaxGekk closed pull request #39282: [SPARK-41581][SQL] Update
`_LEGACY_ERROR_TEMP_1230` as `INTERNAL_ERROR`
URL: https://github.com/apache/spark/pull/39282
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to
MaxGekk commented on PR #39282:
URL: https://github.com/apache/spark/pull/39282#issuecomment-1375213258
+1, LGTM. Merging to master.
Thank you, @itholic.
itholic opened a new pull request, #39464:
URL: https://github.com/apache/spark/pull/39464
### What changes were proposed in this pull request?
This PR proposes to update error class guidelines for
`core/src/main/resources/error/README.md`.
### Why are the changes
zhengruifeng commented on code in PR #39461:
URL: https://github.com/apache/spark/pull/39461#discussion_r1064336239
##
python/pyspark/sql/tests/connect/test_parity_functions.py:
##
@@ -90,8 +90,6 @@ def test_nested_higher_order_function(self):
def
zhengruifeng commented on PR #39461:
URL: https://github.com/apache/spark/pull/39461#issuecomment-1375197987
due to the duplicated column names?
can you update the example with a simpler one?
zhengruifeng commented on PR #39388:
URL: https://github.com/apache/spark/pull/39388#issuecomment-1375190045
merged into master, thank you @dengziming for working on this!
zhengruifeng closed pull request #39388: [SPARK-41354][CONNECT][PYTHON]
Implement RepartitionByExpression
URL: https://github.com/apache/spark/pull/39388
dongjoon-hyun commented on code in PR #39410:
URL: https://github.com/apache/spark/pull/39410#discussion_r1064326242
##
core/src/test/scala/org/apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala:
##
@@ -403,6 +405,92 @@ class CoarseGrainedSchedulerBackendSuite
zhengruifeng commented on PR #39456:
URL: https://github.com/apache/spark/pull/39456#issuecomment-1375171107
@beliefer you may also need to change
EnricoMi commented on code in PR #39431:
URL: https://github.com/apache/spark/pull/39431#discussion_r1064317964
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala:
##
@@ -138,9 +144,21 @@ object FileFormatWriter extends Logging {
HyukjinKwon commented on PR #39463:
URL: https://github.com/apache/spark/pull/39463#issuecomment-1375166461
cc @zhengruifeng @amaliujia FYI
HyukjinKwon opened a new pull request, #39463:
URL: https://github.com/apache/spark/pull/39463
### What changes were proposed in this pull request?
This PR mainly proposes to pass the user-specified configurations to local
remote mode.
Previously, all user-specific
smallzhongfeng commented on PR #39448:
URL: https://github.com/apache/spark/pull/39448#issuecomment-1375166196
> There was a long discussion thread regarding this implementation in this
[PR](https://github.com/apache/spark/pull/35085#discussion_r786892744). There
will be some issue with
zhengruifeng opened a new pull request, #39462:
URL: https://github.com/apache/spark/pull/39462
### What changes were proposed in this pull request?
Make `DataFrame.collect` support nested types, by introducing a new data
converter
### Why are the changes needed?
to be
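The converter idea in #39462 can be sketched in plain Python (hypothetical shapes, not the actual Spark Connect converter): structs arrive as dicts and arrays as lists, and the converter recursively turns structs into Row-like tuples.

```python
# Hypothetical sketch of a recursive data converter for nested types.
# Struct values arrive as dicts, arrays as lists; the converter turns
# structs into tuples (Row-like) and recurses into array elements.

def convert(value):
    """Recursively convert a fetched value into a Row-like shape."""
    if isinstance(value, dict):
        # Struct: convert each field, preserving field order, as a tuple.
        return tuple(convert(v) for v in value.values())
    if isinstance(value, list):
        # Array: convert each element.
        return [convert(v) for v in value]
    # Scalar types pass through unchanged.
    return value

rows = [{"name": "a", "point": {"x": 1, "y": [2, 3]}}]
converted = [convert(r) for r in rows]
# converted[0] == ("a", (1, [2, 3]))
```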
dengziming commented on PR #39388:
URL: https://github.com/apache/spark/pull/39388#issuecomment-1375160517
> @dengziming would you mind resolving the conflicts? thanks
Done!
beliefer opened a new pull request, #39461:
URL: https://github.com/apache/spark/pull/39461
### What changes were proposed in this pull request?
Python: connect client should not use pyarrow.Table.to_pylist to transform
fetched data.
For example:
the data in pyarrow.Table show
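The failure mode zhengruifeng asks about in #39461 follows from how `to_pylist`-style conversion works: each row becomes a dict keyed by column name, so duplicated column names collide. A pure-Python sketch of that behavior (mimicking, not calling, pyarrow):

```python
# Sketch of why a to_pylist-style conversion loses data under
# duplicated column names: each row is built as a dict keyed by
# column name, so a repeated name keeps only the last column's value.

def to_pylist_like(column_names, columns):
    """Mimic a columnar-to-row conversion returning one dict per row."""
    return [dict(zip(column_names, row)) for row in zip(*columns)]

names = ["id", "id"]          # duplicated column name
columns = [[1, 2], [10, 20]]  # two distinct columns of data
rows = to_pylist_like(names, columns)
# rows == [{"id": 10}, {"id": 20}] -- the first "id" column is lost
```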
cloud-fan commented on PR #39333:
URL: https://github.com/apache/spark/pull/39333#issuecomment-1375126878
thanks, merging to master!
cloud-fan closed pull request #39333: [SPARK-41805][SQL] Reuse expressions in
WindowSpecDefinition
URL: https://github.com/apache/spark/pull/39333
zhouyejoe commented on PR #39448:
URL: https://github.com/apache/spark/pull/39448#issuecomment-1375124798
There was a long discussion thread regarding this implementation in this
[PR](https://github.com/apache/spark/pull/35085#discussion_r786892744). There
will be some issue with setgid.
LuciferYang commented on PR #39458:
URL: https://github.com/apache/spark/pull/39458#issuecomment-1375122436
Thanks @dongjoon-hyun @HyukjinKwon
dongjoon-hyun commented on PR #39458:
URL: https://github.com/apache/spark/pull/39458#issuecomment-1375122255
Merged to master.
dongjoon-hyun closed pull request #39458: [SPARK-41941][BUILD] Upgrade
`scalatest` related test dependencies to 3.2.15
URL: https://github.com/apache/spark/pull/39458
itholic commented on code in PR #39387:
URL: https://github.com/apache/spark/pull/39387#discussion_r1064286705
##
python/pyspark/errors/tests/test_errors.py:
##
@@ -0,0 +1,48 @@
+# -*- encoding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
itholic commented on code in PR #39387:
URL: https://github.com/apache/spark/pull/39387#discussion_r1064285339
##
python/pyspark/testing/utils.py:
##
@@ -138,6 +140,32 @@ def setUpClass(cls):
def tearDownClass(cls):
cls.sc.stop()
+def check_error(
+
cloud-fan commented on PR #38163:
URL: https://github.com/apache/spark/pull/38163#issuecomment-1375102383
thanks, merging to master!
cloud-fan closed pull request #38163: [SPARK-40711][SQL] Add spill size metrics
for window
URL: https://github.com/apache/spark/pull/38163
smallzhongfeng commented on code in PR #39448:
URL: https://github.com/apache/spark/pull/39448#discussion_r1064268534
##
core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala:
##
@@ -301,9 +301,6 @@ private[spark] class DiskBlockManager(
* Create a directory
beliefer commented on PR #39091:
URL: https://github.com/apache/spark/pull/39091#issuecomment-1375067246
> In particular, the discussion on the `isObservation` flag in the proto
message needs to be addressed to simplify.
Hi, @grundprinzip . In fact, I removed the `Observation` that
cloud-fan commented on code in PR #39431:
URL: https://github.com/apache/spark/pull/39431#discussion_r1064261329
##
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/V1WriteCommandSuite.scala:
##
@@ -181,13 +178,111 @@ class V1WriteCommandSuite extends
cloud-fan commented on code in PR #39431:
URL: https://github.com/apache/spark/pull/39431#discussion_r1064261228
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala:
##
@@ -138,9 +144,21 @@ object FileFormatWriter extends Logging {
cloud-fan commented on PR #35709:
URL: https://github.com/apache/spark/pull/35709#issuecomment-1375060899
Maybe we should deprecate the old `datediff` function. The new `datediff`
function has a special parser rule so that it won't conflict with the old one,
but I agree that 2 `datediff`
wankunde opened a new pull request, #39460:
URL: https://github.com/apache/spark/pull/39460
### What changes were proposed in this pull request?
Makes DPP support the pruning side has `Union`. For example:
```sql
SELECT f.store_id,
f.date_id,
s.state_province
itholic commented on code in PR #39260:
URL: https://github.com/apache/spark/pull/39260#discussion_r1064249717
##
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala:
##
@@ -2166,6 +2166,22 @@ abstract class DDLSuite extends QueryTest with
LuciferYang commented on PR #39406:
URL: https://github.com/apache/spark/pull/39406#issuecomment-1375032421
Thanks @HeartSaVioR
itholic commented on PR #39282:
URL: https://github.com/apache/spark/pull/39282#issuecomment-1375031781
Update the title & description, thanks :-)
HeartSaVioR closed pull request #39406: [SPARK-41894][SS][TESTS] Restore the
write permission of `commitDir` after run `testAsyncWriteErrorsPermissionsIssue`
URL: https://github.com/apache/spark/pull/39406
HeartSaVioR commented on PR #39406:
URL: https://github.com/apache/spark/pull/39406#issuecomment-1375031024
Thanks! Merging to master.
itholic commented on code in PR #39389:
URL: https://github.com/apache/spark/pull/39389#discussion_r1064247087
##
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala:
##
@@ -378,10 +378,9 @@ private[sql] object QueryExecutionErrors extends
itholic commented on PR #39394:
URL: https://github.com/apache/spark/pull/39394#issuecomment-1375027707
Sounds good. Just exposed `path` to error message.
Thanks!
ulysses-you commented on code in PR #39431:
URL: https://github.com/apache/spark/pull/39431#discussion_r1064243471
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala:
##
@@ -138,9 +144,21 @@ object FileFormatWriter extends Logging {
ulysses-you commented on PR #38163:
URL: https://github.com/apache/spark/pull/38163#issuecomment-1375015821
cc @cloud-fan @HyukjinKwon if you find some time to take another look,
thank you
LuciferYang commented on PR #39458:
URL: https://github.com/apache/spark/pull/39458#issuecomment-1375013705
re-trigger the failed task
ulysses-you commented on code in PR #39277:
URL: https://github.com/apache/spark/pull/39277#discussion_r1064238626
##
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala:
##
@@ -294,3 +285,40 @@ case class InsertIntoHiveTable(
override
HyukjinKwon closed pull request #39368: [SPARK-28764][CORE][TEST] Remove
writePartitionedFile in ExternalSorter
URL: https://github.com/apache/spark/pull/39368
HyukjinKwon commented on PR #39368:
URL: https://github.com/apache/spark/pull/39368#issuecomment-1374983004
Merged to master.
HyukjinKwon commented on code in PR #39448:
URL: https://github.com/apache/spark/pull/39448#discussion_r1064230294
##
core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala:
##
@@ -301,9 +301,6 @@ private[spark] class DiskBlockManager(
* Create a directory that
HyukjinKwon commented on code in PR #39456:
URL: https://github.com/apache/spark/pull/39456#discussion_r1064229998
##
python/pyspark/sql/tests/connect/test_parity_functions.py:
##
@@ -122,8 +122,6 @@ def test_nested_higher_order_function(self):
def
HyukjinKwon closed pull request #39453: [SPARK-41938][BUILD] Upgrade sbt to
1.8.2
URL: https://github.com/apache/spark/pull/39453
HyukjinKwon commented on PR #39453:
URL: https://github.com/apache/spark/pull/39453#issuecomment-1374979616
Merged to master.
HyukjinKwon closed pull request #39454: [SPARK-41937][R] Fix error in R (>=
4.2.0) for SparkR datetime column comparing with Sys.time()
URL: https://github.com/apache/spark/pull/39454
HyukjinKwon commented on PR #39454:
URL: https://github.com/apache/spark/pull/39454#issuecomment-1374979008
Merged to master.
github-actions[bot] commented on PR #37759:
URL: https://github.com/apache/spark/pull/37759#issuecomment-1374972033
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #36850:
URL: https://github.com/apache/spark/pull/36850#issuecomment-1374972048
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] closed pull request #36588: [SPARK-39217][SQL] Makes DPP
support the pruning side has Union
URL: https://github.com/apache/spark/pull/36588
vicennial commented on code in PR #39361:
URL: https://github.com/apache/spark/pull/39361#discussion_r1064210276
##
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/connect/client/SparkConnectClient.scala:
##
@@ -17,31 +17,147 @@
package
vicennial commented on code in PR #39361:
URL: https://github.com/apache/spark/pull/39361#discussion_r1064217874
##
connector/connect/client/jvm/pom.xml:
##
@@ -52,6 +53,12 @@
${protobuf.version}
compile
+
+ com.google.guava
+ guava
+
grundprinzip commented on code in PR #39456:
URL: https://github.com/apache/spark/pull/39456#discussion_r1064210177
##
connector/connect/common/src/main/protobuf/spark/connect/expressions.proto:
##
@@ -209,11 +209,15 @@ message Expression {
// (Required) Indicate if this
grundprinzip commented on PR #39361:
URL: https://github.com/apache/spark/pull/39361#issuecomment-1374940010
Hi @vicennial, please resolve the addressed comments for easier review.
grundprinzip commented on PR #39091:
URL: https://github.com/apache/spark/pull/39091#issuecomment-1374939250
Hi @beliefer,
when you're ready for another round of reviews, I would suggest resolving
the comments that you think you have addressed because otherwise it's going to
be
grundprinzip commented on PR #38879:
URL: https://github.com/apache/spark/pull/38879#issuecomment-1374938706
Closing for now.
grundprinzip closed pull request #38879: [SPARK-41362][CONNECT][PYTHON] Better
error messages for invalid argument types.
URL: https://github.com/apache/spark/pull/38879
warrenzhu25 commented on code in PR #39280:
URL: https://github.com/apache/spark/pull/39280#discussion_r1064192028
##
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala:
##
@@ -102,6 +104,15 @@ class
wankunde closed pull request #39457: [WIP][SPARK-41940][SQL] Infer IsNotNull
constraints for complex join expressions
URL: https://github.com/apache/spark/pull/39457
wankunde commented on PR #39457:
URL: https://github.com/apache/spark/pull/39457#issuecomment-1374887807
This pr may infer too many unnecessary constraints.
Maybe we can add a `MayBeNull` trait for the expressions whose output may
be evaluated to null even when all inputs are not null. And
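The inference idea in #39457 can be illustrated with a small Python sketch (hypothetical expression encoding, not Spark's Catalyst types): for a null-intolerant join condition, every attribute it references must be non-null, so each yields an IsNotNull constraint.

```python
# Hypothetical sketch of IsNotNull constraint inference: a
# null-intolerant expression (one that evaluates to null whenever any
# input is null) used as a join condition implies IsNotNull on every
# attribute it references.

def referenced_attrs(expr):
    """Collect attribute names from a nested ('op', args...) expression."""
    op, *args = expr
    if op == "attr":
        return {args[0]}
    attrs = set()
    for a in args:
        if isinstance(a, tuple):
            attrs |= referenced_attrs(a)
    return attrs

# Join condition: a.x + a.y = b.z
condition = ("eq", ("add", ("attr", "x"), ("attr", "y")), ("attr", "z"))
constraints = {f"IsNotNull({a})" for a in referenced_attrs(condition)}
# constraints == {"IsNotNull(x)", "IsNotNull(y)", "IsNotNull(z)"}
```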
smallzhongfeng commented on PR #39448:
URL: https://github.com/apache/spark/pull/39448#issuecomment-1374881384
cc @cloud-fan @HyukjinKwon @LuciferYang @zhouyejoe Hope to get your
opinion. :)
ivoson opened a new pull request, #39459:
URL: https://github.com/apache/spark/pull/39459
### What changes were proposed in this pull request?
Make rdd block(rdd cache) available only when a task generate the block
succeed.
### Why are the changes needed?
Fixing the bug as
ivoson commented on code in PR #39410:
URL: https://github.com/apache/spark/pull/39410#discussion_r1064168456
##
core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala:
##
@@ -262,9 +269,11 @@ private[spark] class CoarseGrainedExecutorBackend(
ivoson commented on code in PR #39410:
URL: https://github.com/apache/spark/pull/39410#discussion_r1064168342
##
core/src/test/scala/org/apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala:
##
@@ -403,6 +405,92 @@ class CoarseGrainedSchedulerBackendSuite extends
ivoson commented on code in PR #39410:
URL: https://github.com/apache/spark/pull/39410#discussion_r1064167868
##
core/src/test/scala/org/apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala:
##
@@ -403,6 +405,92 @@ class CoarseGrainedSchedulerBackendSuite extends
ivoson commented on code in PR #39410:
URL: https://github.com/apache/spark/pull/39410#discussion_r1064167770
##
core/src/test/scala/org/apache/spark/scheduler/CoarseGrainedSchedulerBackendSuite.scala:
##
@@ -403,6 +405,92 @@ class CoarseGrainedSchedulerBackendSuite extends
ivoson commented on code in PR #39410:
URL: https://github.com/apache/spark/pull/39410#discussion_r1064156234
##
core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala:
##
@@ -262,9 +269,11 @@ private[spark] class CoarseGrainedExecutorBackend(
Daniel-Davies commented on code in PR #38867:
URL: https://github.com/apache/spark/pull/38867#discussion_r1064153224
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##
@@ -4601,6 +4601,231 @@ case class ArrayExcept(left:
Daniel-Davies commented on code in PR #38867:
URL: https://github.com/apache/spark/pull/38867#discussion_r1059780745
##
sql/core/src/test/resources/sql-tests/results/array.sql.out:
##
@@ -427,6 +427,103 @@ struct
NULL
+-- !query
+select array_insert(array(1, 2, 3), 4, 4)
Daniel-Davies commented on code in PR #38867:
URL: https://github.com/apache/spark/pull/38867#discussion_r1064152931
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##
@@ -4601,6 +4601,231 @@ case class ArrayExcept(left:
LuciferYang opened a new pull request, #39458:
URL: https://github.com/apache/spark/pull/39458
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
###
LuciferYang commented on PR #39406:
URL: https://github.com/apache/spark/pull/39406#issuecomment-1374832851
friendly ping @HyukjinKwon @HeartSaVioR
MaxGekk commented on PR #39332:
URL: https://github.com/apache/spark/pull/39332#issuecomment-1374816921
@cloud-fan @srielau Could you review generating of column aliases, please.
xinrong-meng commented on code in PR #39384:
URL: https://github.com/apache/spark/pull/39384#discussion_r1064122141
##
python/pyspark/sql/tests/test_arrow_python_udf.py:
##
@@ -0,0 +1,131 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor
xinrong-meng commented on code in PR #39384:
URL: https://github.com/apache/spark/pull/39384#discussion_r1064122105
##
python/pyspark/sql/udf.py:
##
@@ -75,6 +81,104 @@ def _create_udf(
return udf_obj._wrapped()
+def _create_py_udf(
+f: Callable[..., Any],
+
wankunde opened a new pull request, #39457:
URL: https://github.com/apache/spark/pull/39457
### What changes were proposed in this pull request?
Infer IsNotNull constraints for complex join expressions along with
IsNotNull constraints for the attribute.
For example,
techaddict commented on code in PR #39450:
URL: https://github.com/apache/spark/pull/39450#discussion_r1064101499
##
python/pyspark/sql/tests/test_functions.py:
##
@@ -24,6 +24,7 @@
from py4j.protocol import Py4JJavaError
from pyspark.sql import Row, Window, types
+from
techaddict commented on code in PR #39451:
URL: https://github.com/apache/spark/pull/39451#discussion_r1064101288
##
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##
@@ -1123,7 +1123,7 @@ class