HeartSaVioR closed pull request #39662: [SPARK-42105][SS][DOCS] Reflect the
change of SPARK-40925 to SS guide doc
URL: https://github.com/apache/spark/pull/39662
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
AmplabJenkins commented on PR #39629:
URL: https://github.com/apache/spark/pull/39629#issuecomment-1398038934
Can one of the admins verify this patch?
--
AmplabJenkins commented on PR #39628:
URL: https://github.com/apache/spark/pull/39628#issuecomment-1398038997
Can one of the admins verify this patch?
--
AmplabJenkins commented on PR #39626:
URL: https://github.com/apache/spark/pull/39626#issuecomment-1398039062
Can one of the admins verify this patch?
--
gengliangwang commented on code in PR #39666:
URL: https://github.com/apache/spark/pull/39666#discussion_r1082286371
##
core/src/main/scala/org/apache/spark/status/protobuf/Utils.scala:
##
@@ -17,10 +17,24 @@
package org.apache.spark.status.protobuf
+import
dongjoon-hyun commented on PR #39664:
URL: https://github.com/apache/spark/pull/39664#issuecomment-1398159305
I merged the newer PR, @ggershinsky. :)
-
https://github.com/apache/spark/commit/e1c630a98c45ae07c43c8cf95979532b51bf59ec
--
HeartSaVioR commented on PR #39662:
URL: https://github.com/apache/spark/pull/39662#issuecomment-1398038094
Thanks for the quick review! Merging to master. (I'll deal with a follow-up
PR if there are outstanding post-review comments.)
--
gengliangwang commented on PR #39666:
URL: https://github.com/apache/spark/pull/39666#issuecomment-1398047138
cc @LuciferYang @panbingkun @techaddict let's **update all the string
fields** to make sure null string values are well handled. To avoid creating
too many PRs and making the
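The null-string concern raised here can be sketched in miniature. This is a hedged illustration only: a plain dict stands in for a protobuf message builder, and the helper names are hypothetical, not the actual `Utils.scala` API. Protobuf string fields cannot hold null, so the pattern is to leave the field unset for null and map an absent field back to null on read:

```python
# Hypothetical helpers modeling null-safe handling of protobuf string fields.
# A dict stands in for a generated protobuf message builder.

def set_string_field(builder: dict, name: str, value):
    """Only set the field when the value is non-null; protobuf strings
    cannot store null, so null is represented by leaving the field unset."""
    if value is not None:
        builder[name] = value

def get_string_field(msg: dict, name: str):
    """Map 'field absent' back to None when deserializing."""
    return msg.get(name)

builder = {}
set_string_field(builder, "description", None)   # field stays unset
set_string_field(builder, "name", "stage-1")
```

On read, `get_string_field(builder, "description")` yields `None`, round-tripping the null without protobuf ever storing it.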
LuciferYang commented on PR #39642:
URL: https://github.com/apache/spark/pull/39642#issuecomment-1398075391
Will refactor after https://github.com/apache/spark/pull/39666 merged
--
gengliangwang commented on code in PR #39666:
URL: https://github.com/apache/spark/pull/39666#discussion_r1082251439
##
core/src/main/scala/org/apache/spark/status/protobuf/Utils.scala:
##
@@ -17,10 +17,24 @@
package org.apache.spark.status.protobuf
+import
sadikovi commented on PR #39660:
URL: https://github.com/apache/spark/pull/39660#issuecomment-1398098578
Thanks @dongjoon-hyun. I will address your comments soon-ish.
@beliefer, yes, you are right. The documentation describes TOP (N) returning
the N top rows when used together with
beliefer opened a new pull request, #39667:
URL: https://github.com/apache/spark/pull/39667
### What changes were proposed in this pull request?
Currently, JDBCRDD uses a fixed format for the SELECT statement.
```
val sqlText = options.prepareQuery +
  s"SELECT $columnList FROM
```
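As a rough model of the fixed template above (a hedged Python sketch with illustrative parameter names, not Spark's actual Scala code or JDBC option names):

```python
# Hedged sketch: JDBCRDD's fixed SELECT shape is roughly
# prepareQuery + "SELECT <columns> FROM <table>" plus optional clauses.
def build_select(prepare_query: str, column_list: str, table: str,
                 where_clause: str = "") -> str:
    sql = f"{prepare_query}SELECT {column_list} FROM {table}"
    if where_clause:
        sql += f" WHERE {where_clause}"
    return sql

print(build_select("", "a, b", "t1"))  # SELECT a, b FROM t1
```

The point of the PR is that this shape is hard-coded; a dialect that needs a different clause order (for example `TOP (N)` dialects) cannot customize it.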
dongjoon-hyun commented on code in PR #39666:
URL: https://github.com/apache/spark/pull/39666#discussion_r1082308565
##
core/src/main/protobuf/org/apache/spark/status/protobuf/store_types.proto:
##
@@ -22,7 +22,12 @@ package org.apache.spark.status.protobuf;
* Developer
dongjoon-hyun commented on PR #39665:
URL: https://github.com/apache/spark/pull/39665#issuecomment-1398167017
I fixed the `Affected Version` from 3.3.1 to 3.4.0 because this fails in
`branch-3.3`.
```
[info] ParquetEncryptionSuite:
[info] - SPARK-34990: Write and read an encrypted
```
dongjoon-hyun opened a new pull request, #39668:
URL: https://github.com/apache/spark/pull/39668
This aims to test for possible test failures on the Spark 3.4.0 RC tag.
--
HyukjinKwon commented on code in PR #39541:
URL: https://github.com/apache/spark/pull/39541#discussion_r1082320316
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/util/RemoteSparkSession.scala:
##
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the
HyukjinKwon commented on code in PR #39541:
URL: https://github.com/apache/spark/pull/39541#discussion_r1082329724
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/SparkConnectClientSuite.scala:
##
@@ -78,7 +78,7 @@ class
HyukjinKwon commented on code in PR #39541:
URL: https://github.com/apache/spark/pull/39541#discussion_r1082329225
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/util/RemoteSparkSession.scala:
##
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the
HyukjinKwon commented on PR #39668:
URL: https://github.com/apache/spark/pull/39668#issuecomment-1398189235
cc @xinrong-meng FYI
--
antonipp commented on PR #38376:
URL: https://github.com/apache/spark/pull/38376#issuecomment-1398209302
Thank you for the reviews and for the merge!
I am not 100% sure what the backport process is, but I opened 2 PRs (for 3.3
and 3.2) since I believe both are still supported based on
vicennial opened a new pull request, #39672:
URL: https://github.com/apache/spark/pull/39672
### What changes were proposed in this pull request?
Adds the following methods:
- Dataframe API methods
- project
- filter
- limit
- SparkSession
- range (and
gengliangwang opened a new pull request, #39666:
URL: https://github.com/apache/spark/pull/39666
### What changes were proposed in this pull request?
After revisiting https://github.com/apache/spark/pull/39416 and
https://github.com/apache/spark/pull/39623, I propose:
*
HeartSaVioR commented on PR #39647:
URL: https://github.com/apache/spark/pull/39647#issuecomment-1398045113
Thanks! Merging to master.
--
HeartSaVioR closed pull request #39647: [SPARK-42075][DSTREAM] Deprecate
DStream API
URL: https://github.com/apache/spark/pull/39647
--
HyukjinKwon commented on code in PR #39628:
URL: https://github.com/apache/spark/pull/39628#discussion_r1082211081
##
python/pyspark/ml/functions.py:
##
@@ -647,37 +386,369 @@ def predict_columnar(x1: np.ndarray, x2: np.ndarray) ->
Mapping[str, np.ndarray]
Function
HyukjinKwon commented on PR #39665:
URL: https://github.com/apache/spark/pull/39665#issuecomment-1398048922
Mind keeping the PR description template
https://github.com/apache/spark/blob/master/.github/PULL_REQUEST_TEMPLATE?
--
LuciferYang commented on PR #39666:
URL: https://github.com/apache/spark/pull/39666#issuecomment-1398066333
> cc @LuciferYang @panbingkun @techaddict let's **update all the string
fields** to make sure null string values are well handled. To avoid creating
too many PRs and making the
LuciferYang commented on code in PR #39666:
URL: https://github.com/apache/spark/pull/39666#discussion_r1082248889
##
core/src/main/scala/org/apache/spark/status/protobuf/Utils.scala:
##
@@ -17,10 +17,24 @@
package org.apache.spark.status.protobuf
+import
dongjoon-hyun commented on code in PR #39666:
URL: https://github.com/apache/spark/pull/39666#discussion_r1082307793
##
core/src/main/protobuf/org/apache/spark/status/protobuf/store_types.proto:
##
@@ -22,7 +22,12 @@ package org.apache.spark.status.protobuf;
* Developer
HyukjinKwon commented on code in PR #39541:
URL: https://github.com/apache/spark/pull/39541#discussion_r1082319733
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/util/RemoteSparkSession.scala:
##
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the
dongjoon-hyun commented on PR #39541:
URL: https://github.com/apache/spark/pull/39541#issuecomment-1398177490
BTW, while I was reviewing this PR, I felt the need to open an official
PR to test any potential test failures on tagging.
Here is the general PR to detect any `SNAPSHOT`
antonipp opened a new pull request, #39669:
URL: https://github.com/apache/spark/pull/39669
### What changes were proposed in this pull request?
Backport https://github.com/apache/spark/pull/38376 to `branch-3.3`
You can find a detailed description of the issue and an example
antonipp opened a new pull request, #39670:
URL: https://github.com/apache/spark/pull/39670
### What changes were proposed in this pull request?
Backport https://github.com/apache/spark/pull/38376 to `branch-3.2`
You can find a detailed description of the issue and an example
dongjoon-hyun commented on PR #39671:
URL: https://github.com/apache/spark/pull/39671#issuecomment-1398234989
Oh, does `Zulu` only have that released version, @wangyum?
- https://bugs.openjdk.org/browse/JDK-8296506
I cannot find a Docker image or Adoptium (Temurin) Java yet.
-
zhengruifeng commented on PR #39661:
URL: https://github.com/apache/spark/pull/39661#issuecomment-1398122520
LGTM, thanks!
--
dongjoon-hyun closed pull request #39665: [SPARK-42114][SQL][TESTS] Add uniform
parquet encryption test case
URL: https://github.com/apache/spark/pull/39665
--
HyukjinKwon commented on code in PR #39541:
URL: https://github.com/apache/spark/pull/39541#discussion_r1082323765
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/util/RemoteSparkSession.scala:
##
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the
HyukjinKwon commented on code in PR #39541:
URL: https://github.com/apache/spark/pull/39541#discussion_r1082326873
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/util/RemoteSparkSession.scala:
##
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the
HyukjinKwon commented on code in PR #39541:
URL: https://github.com/apache/spark/pull/39541#discussion_r1082329957
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala:
##
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software
WeichenXu123 commented on code in PR #39299:
URL: https://github.com/apache/spark/pull/39299#discussion_r1082414835
##
python/pyspark/ml/torch/log_communication.py:
##
@@ -0,0 +1,201 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor
LuciferYang commented on code in PR #39666:
URL: https://github.com/apache/spark/pull/39666#discussion_r1082425788
##
core/src/main/scala/org/apache/spark/status/protobuf/Utils.scala:
##
@@ -17,10 +17,24 @@
package org.apache.spark.status.protobuf
+import
dongjoon-hyun commented on PR #39668:
URL: https://github.com/apache/spark/pull/39668#issuecomment-1398314758
It seems that we have only one failure.
![Screenshot 2023-01-20 at 4 28 07
dongjoon-hyun commented on PR #39671:
URL: https://github.com/apache/spark/pull/39671#issuecomment-1398319998
To @LuciferYang, I don't think this is a compatibility issue or any kind of failure.
--
LuciferYang opened a new pull request, #39674:
URL: https://github.com/apache/spark/pull/39674
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
###
panbingkun opened a new pull request, #39675:
URL: https://github.com/apache/spark/pull/39675
### What changes were proposed in this pull request?
The pr aims to update the doc of arrow & kubernetes.
### Why are the changes needed?
dongjoon-hyun commented on PR #39671:
URL: https://github.com/apache/spark/pull/39671#issuecomment-1398367056
BTW, we didn't cut the branch yet and we still have one month before the
Apache Spark 3.4.0 release. I'm considering that time period for this
decision, @LuciferYang. You are also
dongjoon-hyun commented on code in PR #39675:
URL: https://github.com/apache/spark/pull/39675#discussion_r1082532817
##
docs/running-on-kubernetes.md:
##
@@ -34,13 +34,13 @@ Please see [Spark Security](security.html) and the specific
security sections in
Images built from
LuciferYang commented on code in PR #39674:
URL: https://github.com/apache/spark/pull/39674#discussion_r1082533145
##
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:
##
@@ -1005,26 +1005,6 @@ private[spark] class Client(
val tmpDir = new
wangyum commented on PR #39671:
URL: https://github.com/apache/spark/pull/39671#issuecomment-1398267365
> Oh, does `Zulu` only have that released version, @wangyum?
>
> * https://bugs.openjdk.org/browse/JDK-8296506
>
> I cannot find a Docker image or Adoptium (Temurin) Java
WeichenXu123 commented on code in PR #39299:
URL: https://github.com/apache/spark/pull/39299#discussion_r1082420394
##
python/pyspark/ml/torch/log_communication.py:
##
@@ -0,0 +1,201 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor
EnricoMi commented on PR #39640:
URL: https://github.com/apache/spark/pull/39640#issuecomment-1398282605
@cloud-fan following issue: `ds.groupByKey` adds key columns to the plan:
```
def groupByKey[K: Encoder](func: T => K): KeyValueGroupedDataset[K, T] = {
  val withGroupingKey
```
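The behavior described can be modeled in miniature (a toy model, not Spark's plan machinery): the grouping key computed by `func` is appended to the child plan's output, so downstream operators see the original columns plus the key columns.

```python
# Toy model: groupByKey appends the computed grouping-key attribute(s)
# to the existing output of the child plan.
def with_grouping_key(output_columns, key_columns):
    return output_columns + key_columns

plan_output = with_grouping_key(["id#0L", "value#1"], ["key#2L"])
# plan_output now carries the key alongside the original columns
```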
WeichenXu123 commented on code in PR #39299:
URL: https://github.com/apache/spark/pull/39299#discussion_r1082428873
##
python/pyspark/ml/torch/log_communication.py:
##
@@ -0,0 +1,201 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor
wecharyu commented on PR #39115:
URL: https://github.com/apache/spark/pull/39115#issuecomment-1398308497
> Can you tune the config spark.sql.addPartitionInBatch.size? Setting it to
> a larger number can reduce the number of RPCs.
It does not help in `RepairTableCommand`, when enable
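For context, the batching idea behind `spark.sql.addPartitionInBatch.size` can be sketched as follows (a hedged illustration of the general mechanism, not Spark's code): grouping P partitions into chunks of N turns P metastore RPCs into ceil(P / N).

```python
# Hedged sketch: chunk partitions so one RPC covers a whole batch.
def batches(items, batch_size):
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

partitions = [f"dt=2023-01-{d:02d}" for d in range(1, 11)]
rpc_payloads = list(batches(partitions, 4))
# 10 partitions with batch size 4 -> 3 RPCs instead of 10
```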
LuciferYang commented on PR #39671:
URL: https://github.com/apache/spark/pull/39671#issuecomment-1398314217
One problem is that GA is still using Temurin 8u352 for build and test. We
need to wait for a while before running GA tasks using 8u362.
--
LuciferYang commented on code in PR #39674:
URL: https://github.com/apache/spark/pull/39674#discussion_r1082496069
##
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:
##
@@ -1005,26 +1005,6 @@ private[spark] class Client(
val tmpDir = new
LuciferYang commented on PR #39663:
URL: https://github.com/apache/spark/pull/39663#issuecomment-1398346083
Thanks @dongjoon-hyun
--
dongjoon-hyun commented on PR #39671:
URL: https://github.com/apache/spark/pull/39671#issuecomment-1398362594
Timezone issues are inevitable; we need to adjust the code on a regular
basis, @LuciferYang.
--
dongjoon-hyun commented on code in PR #39674:
URL: https://github.com/apache/spark/pull/39674#discussion_r1082528460
##
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:
##
@@ -1005,26 +1005,6 @@ private[spark] class Client(
val tmpDir = new
dongjoon-hyun commented on code in PR #39675:
URL: https://github.com/apache/spark/pull/39675#discussion_r1082529696
##
docs/index.md:
##
@@ -45,7 +45,6 @@ Java 8 prior to version 8u201 support is deprecated as of
Spark 3.2.0.
When using the Scala API, it is necessary for
srowen commented on PR #39190:
URL: https://github.com/apache/spark/pull/39190#issuecomment-1398393768
Yeah, but do you know how it happens, or have a theory? Just want to see if
the change seems to match some theory of how it arises. Or does this change
definitely change the output
EnricoMi opened a new pull request, #39673:
URL: https://github.com/apache/spark/pull/39673
### What changes were proposed in this pull request?
This deduplicate attributes that exist on both sides of a `CoGroup` by
aliasing the occurrence on the right side.
### Why are the
EnricoMi commented on PR #39673:
URL: https://github.com/apache/spark/pull/39673#issuecomment-1398246138
Ideally, `QueryPlan.rewriteAttrs` would not replace occurrences of `id#0L`
with `id#13L` in all fields of `CoGroup`, but only in `rightDeserializer`,
`rightGroup`, `rightAttr`,
kuwii commented on PR #39190:
URL: https://github.com/apache/spark/pull/39190#issuecomment-1398263751
@srowen We found this issue in some of our Spark applications. Here's the event
log of an example, which can be loaded through the history server:
WeichenXu123 commented on code in PR #39299:
URL: https://github.com/apache/spark/pull/39299#discussion_r1082416985
##
python/pyspark/ml/torch/log_communication.py:
##
@@ -0,0 +1,201 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor
WeichenXu123 commented on code in PR #39299:
URL: https://github.com/apache/spark/pull/39299#discussion_r1082420774
##
python/pyspark/ml/torch/log_communication.py:
##
@@ -0,0 +1,201 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor
WeichenXu123 commented on code in PR #39369:
URL: https://github.com/apache/spark/pull/39369#discussion_r1082443370
##
python/pyspark/ml/torch/distributor.py:
##
@@ -495,32 +546,119 @@ def set_gpus(context: "BarrierTaskContext") -> None:
def _run_distributed_training(
WeichenXu123 commented on code in PR #39369:
URL: https://github.com/apache/spark/pull/39369#discussion_r1082450887
##
python/pyspark/ml/torch/tests/test_distributor.py:
##
@@ -224,8 +293,10 @@ def setUp(self) -> None:
self.sc =
dongjoon-hyun commented on PR #39541:
URL: https://github.com/apache/spark/pull/39541#issuecomment-1398317390
As @HyukjinKwon pointed out, this causes a failure for RC and official
release.
- https://github.com/apache/spark/pull/39668#issuecomment-1398314758
![Screenshot
LuciferYang commented on PR #39671:
URL: https://github.com/apache/spark/pull/39671#issuecomment-1398316727
Could you use 8u362 to run full UTs offline to check compatibility? Thanks ~
@wangyum
--
dongjoon-hyun closed pull request #39663: [SPARK-42129][BUILD] Upgrade
rocksdbjni to 7.9.2
URL: https://github.com/apache/spark/pull/39663
--
LuciferYang commented on PR #39671:
URL: https://github.com/apache/spark/pull/39671#issuecomment-1398356049
@dongjoon-hyun Hmm... do you remember SPARK-40846? When we upgraded from 8u345
to 8u352 for GA testing, there were some time zone issues that needed to be
solved by changing the code, so I
dongjoon-hyun commented on code in PR #39675:
URL: https://github.com/apache/spark/pull/39675#discussion_r1082534577
##
docs/running-on-kubernetes.md:
##
@@ -34,13 +34,13 @@ Please see [Spark Security](security.html) and the specific
security sections in
Images built from
LuciferYang commented on PR #39671:
URL: https://github.com/apache/spark/pull/39671#issuecomment-1398388671
OK, plenty of time. I am fine with making this change
--
dtenedor commented on PR #39678:
URL: https://github.com/apache/spark/pull/39678#issuecomment-1399101487
Hi @RyanBerti, after a few initial conversations about this proposal, we
wanted to express some questions and opinions here for your consideration. In
general, we wholeheartedly
zhengruifeng commented on PR #39622:
URL: https://github.com/apache/spark/pull/39622#issuecomment-1399144510
merged into master, thank you @cloud-fan @HyukjinKwon @dongjoon-hyun !
--
mridulm commented on code in PR #39654:
URL: https://github.com/apache/spark/pull/39654#discussion_r1083189284
##
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java:
##
@@ -815,7 +815,7 @@ public MergeStatuses
zhengruifeng commented on code in PR #39638:
URL: https://github.com/apache/spark/pull/39638#discussion_r1083224562
##
python/pyspark/sql/tests/test_functions.py:
##
@@ -763,25 +798,55 @@ def test_higher_order_function_failures(self):
from pyspark.sql.functions import
mridulm commented on PR #39682:
URL: https://github.com/apache/spark/pull/39682#issuecomment-1399193866
Thanks for clarifying - yeah, you have to use has/get, and either an
`Optional(value).isPresent` check or a null check before `set`
--
LuciferYang opened a new pull request, #39684:
URL: https://github.com/apache/spark/pull/39684
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
###
HyukjinKwon closed pull request #39299: [SPARK-41593][PYTHON][ML] Adding
logging from executors
URL: https://github.com/apache/spark/pull/39299
--
HyukjinKwon commented on PR #39299:
URL: https://github.com/apache/spark/pull/39299#issuecomment-1399197595
Merged to master.
--
LuciferYang commented on code in PR #39682:
URL: https://github.com/apache/spark/pull/39682#discussion_r1083259379
##
sql/core/src/test/scala/org/apache/spark/status/protobuf/sql/KVStoreProtobufSerializerSuite.scala:
##
@@ -48,6 +48,43 @@ class KVStoreProtobufSerializerSuite
dongjoon-hyun closed pull request #39679: [SPARK-42137][CORE] Enable
`spark.kryo.unsafe` by default
URL: https://github.com/apache/spark/pull/39679
--
dongjoon-hyun commented on PR #39679:
URL: https://github.com/apache/spark/pull/39679#issuecomment-1399134868
Merged to master for Apache Spark 3.4.0.
The one PySpark pipeline seems slow, but it's irrelevant to this PR and was
verified here before.
--
zhengruifeng closed pull request #39622:
[SPARK-42099][SPARK-41845][CONNECT][PYTHON] Fix `count(*)` and `count(col(*))`
URL: https://github.com/apache/spark/pull/39622
--
huaxingao commented on PR #39678:
URL: https://github.com/apache/spark/pull/39678#issuecomment-1399184316
@RyanBerti Thanks for the great work!
+1 for using Apache DataSketches library.
I am wondering if we can use [Theta
HyukjinKwon commented on PR #39677:
URL: https://github.com/apache/spark/pull/39677#issuecomment-1399194601
Merged to master.
--
HyukjinKwon closed pull request #39677: [SPARK-42043][CONNECT][TEST][FOLLOWUP]
Fix jar finding bug and use better env vars and time measurement
URL: https://github.com/apache/spark/pull/39677
--
gengliangwang commented on PR #39685:
URL: https://github.com/apache/spark/pull/39685#issuecomment-1399196312
cc @LuciferYang
--
gengliangwang opened a new pull request, #39685:
URL: https://github.com/apache/spark/pull/39685
### What changes were proposed in this pull request?
Similar to https://github.com/apache/spark/pull/39666, this PR handles null
string values in
LuciferYang commented on PR #39684:
URL: https://github.com/apache/spark/pull/39684#issuecomment-1399196254
cc @gengliangwang
--
tedyu closed pull request #39654: [MINOR][SHUFFLE] Include IOException in
warning log of finalizeShuffleMerge
URL: https://github.com/apache/spark/pull/39654
--
joveyuan-db opened a new pull request, #39681:
URL: https://github.com/apache/spark/pull/39681
### What changes were proposed in this pull request?
This PR ensures that SparkR serializes `NA` dates as `"NA"` (string) to
avoid an undefined length when deserializing in the JVM.
###
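The fix described for SparkR's `NA` dates can be modeled in miniature (a hedged Python sketch; the actual change is in SparkR's R-to-JVM serializer): a missing date is written as the literal string "NA" so the reader always receives a value of well-defined type and length.

```python
from datetime import date

# Toy model of the described scheme: None (R's NA) round-trips as the
# string "NA"; real dates round-trip via a fixed-format string.
def serialize_date(d):
    return "NA" if d is None else d.isoformat()

def deserialize_date(s):
    return None if s == "NA" else date.fromisoformat(s)
```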
huaxingao commented on PR #39676:
URL: https://github.com/apache/spark/pull/39676#issuecomment-1399156214
Merged to 3.3/master. Thanks for fixing this, @peter-toth
--
huaxingao closed pull request #39676: [SPARK-42134][SQL] Fix
getPartitionFiltersAndDataFilters() to handle filters without referenced
attributes
URL: https://github.com/apache/spark/pull/39676
--
HyukjinKwon commented on code in PR #39638:
URL: https://github.com/apache/spark/pull/39638#discussion_r1083233072
##
python/pyspark/sql/tests/test_functions.py:
##
@@ -763,25 +798,55 @@ def test_higher_order_function_failures(self):
from pyspark.sql.functions import
mridulm commented on PR #39644:
URL: https://github.com/apache/spark/pull/39644#issuecomment-1399165650
Thanks for the ping @dongjoon-hyun :)
I will merge this to 3.3
--