TakawaAkirayo commented on PR #45367:
URL: https://github.com/apache/spark/pull/45367#issuecomment-2053512982
@mridulm @beliefer @LuciferYang Thanks for your review and guidance to
improve the PR :-)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
panbingkun opened a new pull request, #46038:
URL: https://github.com/apache/spark/pull/46038
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How
mridulm commented on PR #46013:
URL: https://github.com/apache/spark/pull/46013#issuecomment-2053492527
+CC @shardulm94
mridulm commented on PR #45367:
URL: https://github.com/apache/spark/pull/45367#issuecomment-2053492389
I have updated the description, and merged to master.
Thanks for fixing this @TakawaAkirayo !
Thanks for the review @beliefer and @LuciferYang :-)
mridulm closed pull request #45367: [SPARK-47253][CORE] Allow LiveEventBus to
stop without completely draining the event queue
URL: https://github.com/apache/spark/pull/45367
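The idea behind SPARK-47253 can be illustrated with a toy sketch (illustrative Python only, not Spark's Scala implementation; all names here are made up): a listener bus whose `stop()` waits a bounded time for its queue to drain instead of blocking until the queue is completely empty.

```python
import queue
import threading


class ToyEventBus:
    """Toy illustration (not Spark code): stop() waits at most `timeout`
    seconds for the worker to drain the queue instead of blocking forever."""

    def __init__(self) -> None:
        self.q: queue.Queue = queue.Queue()
        self.stopped = threading.Event()
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def _run(self) -> None:
        # Keep draining until stop is requested AND the queue is empty.
        while not self.stopped.is_set() or not self.q.empty():
            try:
                _event = self.q.get(timeout=0.1)
            except queue.Empty:
                continue
            # a real bus would dispatch _event to registered listeners here

    def post(self, event) -> None:
        self.q.put(event)

    def stop(self, timeout: float = 1.0) -> bool:
        """Request stop; return True if the queue drained within `timeout`."""
        self.stopped.set()
        self.worker.join(timeout)  # bounded wait; may leave events behind
        return self.q.empty()
```

The point of the bounded `join` is that shutdown latency no longer depends on how many events are still queued.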
mridulm commented on PR #45367:
URL: https://github.com/apache/spark/pull/45367#issuecomment-2053490996
The test failures are unrelated to this PR.
grundprinzip commented on PR #46002:
URL: https://github.com/apache/spark/pull/46002#issuecomment-2053489832
Thank you @HyukjinKwon
ericm-db commented on PR #45932:
URL: https://github.com/apache/spark/pull/45932#issuecomment-2053489012
@HeartSaVioR PTAL, thanks!
panbingkun commented on PR #45957:
URL: https://github.com/apache/spark/pull/45957#issuecomment-2053481060
> @panbingkun Thanks for the work. LGTM except for two comments.
Updated, done.
panbingkun commented on code in PR #45957:
URL: https://github.com/apache/spark/pull/45957#discussion_r1563691923
##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala:
##
@@ -210,10 +211,10 @@ class
HyukjinKwon commented on PR #46036:
URL: https://github.com/apache/spark/pull/46036#issuecomment-2053258705
https://github.com/HyukjinKwon/spark/actions/runs/8670824482/job/23779286417
WweiL opened a new pull request, #46037:
URL: https://github.com/apache/spark/pull/46037
### What changes were proposed in this pull request?
Server and client side for the client side listener.
The client should start by sending an `add_listener_bus_listener` RPC for the
TakawaAkirayo commented on code in PR #45367:
URL: https://github.com/apache/spark/pull/45367#discussion_r1563578933
##
core/src/test/scala/org/apache/spark/scheduler/SparkListenerSuite.scala:
##
@@ -176,6 +176,56 @@ class SparkListenerSuite extends SparkFunSuite with
HyukjinKwon commented on code in PR #46002:
URL: https://github.com/apache/spark/pull/46002#discussion_r1563534141
##
python/pyspark/sql/connect/streaming/readwriter.py:
##
@@ -557,7 +557,7 @@ def foreach(self, f: Union[Callable[[Row], None],
"SupportsProcess"]) -> "DataSt
HyukjinKwon closed pull request #46002: [SPARK-47812][CONNECT] Support
Serialization of SparkSession for ForEachBatch worker
URL: https://github.com/apache/spark/pull/46002
HyukjinKwon commented on PR #46002:
URL: https://github.com/apache/spark/pull/46002#issuecomment-2052885568
Merged to master.
HyukjinKwon commented on PR #46036:
URL: https://github.com/apache/spark/pull/46036#issuecomment-2052879562
https://github.com/HyukjinKwon/spark/actions/runs/8670017978/job/23777524704
HyukjinKwon opened a new pull request, #46036:
URL: https://github.com/apache/spark/pull/46036
### What changes were proposed in this pull request?
This PR proposes to test that the PySpark Connect server has the
`pyspark.core` package by running Python workers once (and they will be
HyukjinKwon commented on PR #46032:
URL: https://github.com/apache/spark/pull/46032#issuecomment-2052780662
Merged to master.
HyukjinKwon closed pull request #46032: [MINOR][PYTHON] Enable parity test
`test_different_group_key_cardinality`
URL: https://github.com/apache/spark/pull/46032
github-actions[bot] commented on PR #43908:
URL: https://github.com/apache/spark/pull/43908#issuecomment-205272
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] closed pull request #1: [SPARK-46477][SQL] Add bucket
info to SD in toHivePartition
URL: https://github.com/apache/spark/pull/1
github-actions[bot] closed pull request #44197: [SPARK-39800][SQL][WIP]
DataSourceV2: View Support
URL: https://github.com/apache/spark/pull/44197
github-actions[bot] commented on PR #44546:
URL: https://github.com/apache/spark/pull/44546#issuecomment-2052721068
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #44572:
URL: https://github.com/apache/spark/pull/44572#issuecomment-2052721053
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
gengliangwang commented on PR #45926:
URL: https://github.com/apache/spark/pull/45926#issuecomment-2052711109
@itholic The Hive-thriftserver tests failed. Please check it.
gengliangwang commented on code in PR #45923:
URL: https://github.com/apache/spark/pull/45923#discussion_r1563384223
##
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/ui/HiveThriftServer2Listener.scala:
##
@@ -218,7 +232,9 @@ private[thriftserver]
gengliangwang commented on PR #45957:
URL: https://github.com/apache/spark/pull/45957#issuecomment-2052707701
@panbingkun Thanks for the work. LGTM except for two comments.
fanyue-xia commented on code in PR #45971:
URL: https://github.com/apache/spark/pull/45971#discussion_r1563379719
##
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryHashPartitionVerifySuite.scala:
##
@@ -0,0 +1,199 @@
+/*
+ * Licensed to the Apache Software
gengliangwang commented on code in PR #45957:
URL: https://github.com/apache/spark/pull/45957#discussion_r1563377336
##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala:
##
@@ -282,7 +283,7 @@ class
gengliangwang commented on code in PR #45957:
URL: https://github.com/apache/spark/pull/45957#discussion_r1563376076
##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala:
##
@@ -210,10 +211,10 @@ class
fanyue-xia commented on PR #45971:
URL: https://github.com/apache/spark/pull/45971#issuecomment-2052704919
> > the seed might behave differently across runs/on different machines
>
> Ah I see, this indeed makes sense.
>
> In this case, I think we should fix the generator of
sahnib commented on PR #46035:
URL: https://github.com/apache/spark/pull/46035#issuecomment-2052704693
@HeartSaVioR PTAL.
sahnib opened a new pull request, #46035:
URL: https://github.com/apache/spark/pull/46035
### What changes were proposed in this pull request?
Streaming queries with Union of 2 data streams followed by an Aggregate
(groupBy) can produce incorrect results if the grouping
chaoqin-li1123 commented on code in PR #45977:
URL: https://github.com/apache/spark/pull/45977#discussion_r1563370002
##
python/pyspark/sql/worker/plan_data_source_read.py:
##
@@ -51,6 +52,71 @@
)
+def records_to_arrow_batches(
+output_iter: Iterator[Tuple],
+
margorczynski commented on PR #2:
URL:
https://github.com/apache/spark-kubernetes-operator/pull/2#issuecomment-2052702949
Hey, great stuff. Could you tell me, if you find a moment, how this relates
to the https://github.com/kubeflow/spark-operator/tree/master operator? Is the
approach
fanyue-xia commented on code in PR #45971:
URL: https://github.com/apache/spark/pull/45971#discussion_r1563367019
##
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryHashPartitionVerifySuite.scala:
##
@@ -0,0 +1,199 @@
+/*
+ * Licensed to the Apache Software
fanyue-xia commented on code in PR #45971:
URL: https://github.com/apache/spark/pull/45971#discussion_r1563365767
##
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryHashPartitionVerifySuite.scala:
##
@@ -0,0 +1,199 @@
+/*
+ * Licensed to the Apache Software
kelvinjian-db commented on PR #46034:
URL: https://github.com/apache/spark/pull/46034#issuecomment-2052701361
cc @cloud-fan @jchen5
kelvinjian-db opened a new pull request, #46034:
URL: https://github.com/apache/spark/pull/46034
### What changes were proposed in this pull request?
- Fixes a bug where `RewriteWithExpression` can rewrite an `Aggregate` into
an invalid one. The fix is done by separating
gengliangwang commented on PR #46013:
URL: https://github.com/apache/spark/pull/46013#issuecomment-2052657458
@yaooqinn enabling the ANSI SQL mode does address certain unreasonable SQL
behaviors, such as integer overflow and division by zero, which could
potentially disrupt users'
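As a side note on what "integer overflow" means here (plain Python merely illustrating two's-complement wrap-around; this is not Spark code): non-ANSI mode lets 32-bit arithmetic wrap silently, while ANSI mode raises an error instead.

```python
# Simulate the silent 32-bit two's-complement wrap-around that ANSI SQL
# mode guards against (illustration only; under spark.sql.ansi.enabled=true
# Spark raises an ARITHMETIC_OVERFLOW error rather than wrapping).
def add_int32(a: int, b: int) -> int:
    result = (a + b) & 0xFFFFFFFF          # truncate to 32 bits
    # Reinterpret the high bit as the sign bit.
    return result - 0x100000000 if result >= 0x80000000 else result


INT_MAX = 2**31 - 1
print(add_int32(INT_MAX, 1))  # -2147483648: wraps instead of failing
```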
gengliangwang commented on code in PR #45990:
URL: https://github.com/apache/spark/pull/45990#discussion_r1563258451
##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -1609,6 +1609,19 @@ object SQLConf {
gengliangwang commented on code in PR #45990:
URL: https://github.com/apache/spark/pull/45990#discussion_r1563257890
##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -1609,6 +1609,19 @@ object SQLConf {
gengliangwang commented on code in PR #45990:
URL: https://github.com/apache/spark/pull/45990#discussion_r1563255189
##
sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala:
##
@@ -204,6 +215,8 @@ class CacheManager extends Logging with
mridulm commented on code in PR #45367:
URL: https://github.com/apache/spark/pull/45367#discussion_r1563152817
##
core/src/test/scala/org/apache/spark/scheduler/SparkListenerSuite.scala:
##
@@ -176,6 +176,56 @@ class SparkListenerSuite extends SparkFunSuite with
chaoqin-li1123 commented on code in PR #45977:
URL: https://github.com/apache/spark/pull/45977#discussion_r1563133283
##
python/pyspark/sql/datasource.py:
##
@@ -469,6 +501,192 @@ def stop(self) -> None:
...
+class SimpleInputPartition(InputPartition):
+def
chaoqin-li1123 commented on code in PR #45977:
URL: https://github.com/apache/spark/pull/45977#discussion_r1563131511
##
python/pyspark/sql/datasource.py:
##
@@ -469,6 +501,192 @@ def stop(self) -> None:
...
+class SimpleInputPartition(InputPartition):
+def
chaoqin-li1123 commented on code in PR #45977:
URL: https://github.com/apache/spark/pull/45977#discussion_r1563131182
##
sql/core/src/main/scala/org/apache/spark/sql/execution/python/PythonStreamingSourceRunner.scala:
##
@@ -164,7 +175,20 @@ class PythonStreamingSourceRunner(
chaoqin-li1123 commented on code in PR #45977:
URL: https://github.com/apache/spark/pull/45977#discussion_r1563130456
##
python/pyspark/sql/streaming/python_streaming_source_runner.py:
##
@@ -76,6 +97,19 @@ def commit_func(reader: DataSourceStreamReader, infile: IO,
outfile:
chaoqin-li1123 commented on code in PR #45977:
URL: https://github.com/apache/spark/pull/45977#discussion_r1563129813
##
python/pyspark/sql/datasource.py:
##
@@ -469,6 +501,192 @@ def stop(self) -> None:
...
+class SimpleInputPartition(InputPartition):
+def
WweiL commented on PR #45971:
URL: https://github.com/apache/spark/pull/45971#issuecomment-2052419964
> the seed might behave differently across runs/on different machines
Ah I see, this indeed makes sense.
In this case, I think we should fix the generator of rows. It's okay
ayinresh closed pull request #46033: [BACKPORT][SPARK-42369][CORE] Fix
constructor for java.nio.DirectByteBuffer (#39909)
URL: https://github.com/apache/spark/pull/46033
ayinresh opened a new pull request, #46033:
URL: https://github.com/apache/spark/pull/46033
### What changes were proposed in this pull request?
Backport of https://github.com/apache/spark/pull/39909 to 3.4.
### Why are the changes needed?
It's required to
anishshri-db commented on code in PR #45932:
URL: https://github.com/apache/spark/pull/45932#discussion_r1563064461
##
sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithValueStateTTLSuite.scala:
##
@@ -171,203 +160,15 @@ case class
anishshri-db commented on code in PR #45932:
URL: https://github.com/apache/spark/pull/45932#discussion_r1563063876
##
sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithListStateTTLSuite.scala:
##
@@ -0,0 +1,349 @@
+/*
+ * Licensed to the Apache Software
anishshri-db commented on code in PR #45932:
URL: https://github.com/apache/spark/pull/45932#discussion_r1563050397
##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ListStateImplWithTTL.scala:
##
@@ -0,0 +1,220 @@
+/*
+ * Licensed to the Apache Software
amaliujia commented on PR #46013:
URL: https://github.com/apache/spark/pull/46013#issuecomment-2052308167
Will this be included in any maintenance release that is cut before Spark
4? I assume not?
mridulm commented on PR #46014:
URL: https://github.com/apache/spark/pull/46014#issuecomment-2052306349
Sounds good to me for docs @dongjoon-hyun - we will have to forward port
that to master as well.
And I guess we leave the config as 4.0 in code?
chaoqin-li1123 commented on code in PR #45977:
URL: https://github.com/apache/spark/pull/45977#discussion_r1563028452
##
python/pyspark/sql/datasource.py:
##
@@ -469,6 +501,192 @@ def stop(self) -> None:
...
+class SimpleInputPartition(InputPartition):
+def
chaoqin-li1123 commented on code in PR #45977:
URL: https://github.com/apache/spark/pull/45977#discussion_r1563028108
##
sql/core/src/main/scala/org/apache/spark/sql/execution/python/PythonStreamingSourceRunner.scala:
##
@@ -199,4 +223,30 @@ class PythonStreamingSourceRunner(
fanyue-xia commented on PR #45971:
URL: https://github.com/apache/spark/pull/45971#issuecomment-2052224756
> Thanks for the effort! This really requires some deep understanding of
spark internals...
>
> There is still one important concern, that the golden file size is too
big. I
WweiL commented on code in PR #45971:
URL: https://github.com/apache/spark/pull/45971#discussion_r1562953554
##
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryHashPartitionVerifySuite.scala:
##
@@ -0,0 +1,199 @@
+/*
+ * Licensed to the Apache Software
fanyue-xia commented on code in PR #45971:
URL: https://github.com/apache/spark/pull/45971#discussion_r1562948068
##
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryHashPartitionVerifySuite.scala:
##
@@ -0,0 +1,199 @@
+/*
+ * Licensed to the Apache Software
WweiL commented on code in PR #45971:
URL: https://github.com/apache/spark/pull/45971#discussion_r1562849582
##
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryHashPartitionVerifySuite.scala:
##
@@ -0,0 +1,199 @@
+/*
+ * Licensed to the Apache Software
WweiL commented on PR #45971:
URL: https://github.com/apache/spark/pull/45971#issuecomment-2052183431
Thanks for the effort! This really requires some deep understanding of spark
internals...
There is still one important concern, that the golden file size is too big.
I looked a bit,
sahnib commented on code in PR #45977:
URL: https://github.com/apache/spark/pull/45977#discussion_r1562866326
##
python/pyspark/sql/datasource.py:
##
@@ -469,6 +501,192 @@ def stop(self) -> None:
...
+class SimpleInputPartition(InputPartition):
+def
dongjoon-hyun commented on PR #46014:
URL: https://github.com/apache/spark/pull/46014#issuecomment-2052103818
We can follow the Apache Spark Security page convention.
- https://spark.apache.org/security.html
> 3.2.2, or 3.3.1 or later
In this case, maybe, `3.4.3, or 3.5.2 or
ericm-db commented on code in PR #45932:
URL: https://github.com/apache/spark/pull/45932#discussion_r1562819807
##
sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithValueStateTTLSuite.scala:
##
@@ -399,7 +205,7 @@ class TransformWithValueStateTTLSuite
nchammas commented on PR #44971:
URL: https://github.com/apache/spark/pull/44971#issuecomment-2052098166
> The discussion on
[SPARK-46810](https://issues.apache.org/jira/browse/SPARK-46810) is not moving
forward, unfortunately.
This discussion has since been resolved in favor of
dongjoon-hyun commented on code in PR #46019:
URL: https://github.com/apache/spark/pull/46019#discussion_r1562817518
##
core/src/main/scala/org/apache/spark/api/python/WriteInputFormatTestDataGenerator.scala:
##
@@ -104,6 +105,7 @@ private[python] class
dongjoon-hyun commented on code in PR #46019:
URL: https://github.com/apache/spark/pull/46019#discussion_r1562814727
##
core/src/main/scala/org/apache/spark/api/python/WriteInputFormatTestDataGenerator.scala:
##
@@ -104,6 +105,7 @@ private[python] class
anishshri-db commented on code in PR #45932:
URL: https://github.com/apache/spark/pull/45932#discussion_r1562801897
##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ListStateImplWithTTL.scala:
##
@@ -0,0 +1,235 @@
+/*
+ * Licensed to the Apache Software
csviri commented on code in PR #2:
URL:
https://github.com/apache/spark-kubernetes-operator/pull/2#discussion_r1562687820
##
spark-operator/src/main/java/org/apache/spark/kubernetes/operator/health/SentinelManager.java:
##
@@ -0,0 +1,210 @@
+/*
+ * Licensed to the Apache
sahnib commented on code in PR #45932:
URL: https://github.com/apache/spark/pull/45932#discussion_r1562685312
##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ListStateImplWithTTL.scala:
##
@@ -0,0 +1,235 @@
+/*
+ * Licensed to the Apache Software Foundation
harshmotw-db commented on code in PR #46011:
URL: https://github.com/apache/spark/pull/46011#discussion_r1562704219
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/variant/VariantExpressionEvalUtils.scala:
##
@@ -41,4 +41,15 @@ object
harshmotw-db commented on PR #46017:
URL: https://github.com/apache/spark/pull/46017#issuecomment-2051963610
> Do we support variant as the join keys? HashJoin will hash the join key
values as well.
We currently do not since we haven't implemented the `=` operator on variant.
yaooqinn commented on PR #46026:
URL: https://github.com/apache/spark/pull/46026#issuecomment-2051928714
Is the Hadoop native zstd library still missing?
cloud-fan commented on code in PR #46008:
URL: https://github.com/apache/spark/pull/46008#discussion_r1562653009
##
sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala:
##
@@ -163,6 +165,45 @@ class CollationStringExpressionsSuite
})
}
+
zhengruifeng opened a new pull request, #46032:
URL: https://github.com/apache/spark/pull/46032
### What changes were proposed in this pull request?
Enable the parity test `test_different_group_key_cardinality` by triggering
the analysis
### Why are the changes needed?
for test
cloud-fan closed pull request #45946: [SPARK-47765][SQL] Add SET COLLATION to
parser rules
URL: https://github.com/apache/spark/pull/45946
cloud-fan commented on PR #45946:
URL: https://github.com/apache/spark/pull/45946#issuecomment-2051862860
thanks, merging to master!
zhengruifeng commented on code in PR #46012:
URL: https://github.com/apache/spark/pull/46012#discussion_r1562547428
##
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SessionHolder.scala:
##
@@ -381,6 +405,53 @@ case class SessionHolder(userId:
zhengruifeng commented on code in PR #46012:
URL: https://github.com/apache/spark/pull/46012#discussion_r1562547428
##
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SessionHolder.scala:
##
@@ -381,6 +405,53 @@ case class SessionHolder(userId:
tanelk commented on PR #26029:
URL: https://github.com/apache/spark/pull/26029#issuecomment-2051754913
Hello, I know this is an "ancient" PR, but it seems like it caused a severe
performance regression.
I made a JIRA issue for it: https://issues.apache.org/jira/browse/SPARK-47836
I'll
stefankandic opened a new pull request, #46031:
URL: https://github.com/apache/spark/pull/46031
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
###
pan3793 opened a new pull request, #46030:
URL: https://github.com/apache/spark/pull/46030
### What changes were proposed in this pull request?
This PR logically reverts
https://github.com/apache/spark/commit/2c82745686f4456c4d5c84040a431dcb5b6cb60b,
to allow disable
hvanhovell commented on PR #46027:
URL: https://github.com/apache/spark/pull/46027#issuecomment-2051701249
@vicennial @xi-db should we also fix this in 3.5?
hvanhovell closed pull request #46027: [SPARK-47819][CONNECT] Use asynchronous
callback for execution cleanup
URL: https://github.com/apache/spark/pull/46027
uros-db commented on code in PR #46008:
URL: https://github.com/apache/spark/pull/46008#discussion_r1562392631
##
sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala:
##
@@ -163,6 +165,45 @@ class CollationStringExpressionsSuite
})
}
+
uros-db commented on code in PR #46008:
URL: https://github.com/apache/spark/pull/46008#discussion_r1562392631
##
sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala:
##
@@ -163,6 +165,45 @@ class CollationStringExpressionsSuite
})
}
+
mihailom-db commented on code in PR #46008:
URL: https://github.com/apache/spark/pull/46008#discussion_r1562381738
##
sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala:
##
@@ -163,6 +165,45 @@ class CollationStringExpressionsSuite
})
}
mihailom-db commented on code in PR #46008:
URL: https://github.com/apache/spark/pull/46008#discussion_r1562379501
##
sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala:
##
@@ -89,6 +89,45 @@ class CollationStringExpressionsSuite
pan3793 commented on code in PR #25899:
URL: https://github.com/apache/spark/pull/25899#discussion_r1562356543
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala:
##
@@ -734,30 +735,52 @@ object DataSource extends Logging {
* Checks and
LuciferYang opened a new pull request, #46029:
URL: https://github.com/apache/spark/pull/46029
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
###
pan3793 commented on PR #46028:
URL: https://github.com/apache/spark/pull/46028#issuecomment-2051465166
cc @srowen @viirya @dongjoon-hyun @yaooqinn
pan3793 opened a new pull request, #46028:
URL: https://github.com/apache/spark/pull/46028
### What changes were proposed in this pull request?
SPARK-29089 parallelized `checkAndGlobPathIfNecessary` by leveraging fork
join pools, but it also introduced a side effect: the reported
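For context, the general technique under discussion can be sketched as follows (a hedged stdlib-Python sketch, not the Scala code from SPARK-29089; the function name mirrors the one mentioned above, but the body is illustrative): existence/glob checks over many paths run concurrently in a pool, so one slow filesystem call does not serialize the rest.

```python
import glob
from concurrent.futures import ThreadPoolExecutor
from typing import List


def check_and_glob(paths: List[str], num_threads: int = 8) -> List[str]:
    """Expand each glob pattern in parallel; fail if any matches nothing.

    Illustrative only: the real Spark method checks Hadoop FileSystem
    paths, not local glob patterns.
    """
    def expand(pattern: str) -> List[str]:
        matches = glob.glob(pattern)
        if not matches:
            raise FileNotFoundError(f"Path does not exist: {pattern}")
        return matches

    with ThreadPoolExecutor(max_workers=min(num_threads, len(paths))) as pool:
        # map preserves input order; worker exceptions re-raise on iteration
        return [m for ms in pool.map(expand, paths) for m in ms]
```

The side effect the comment alludes to is the kind of trade-off such parallelization brings; the message is truncated before it names it.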
vicennial commented on PR #46027:
URL: https://github.com/apache/spark/pull/46027#issuecomment-2051417438
cc @HyukjinKwon @hvanhovell