sadikovi commented on code in PR #37933:
URL: https://github.com/apache/spark/pull/37933#discussion_r974908544
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala:
##
@@ -224,7 +223,7 @@ class UnivocityParser(
case NonFatal(e) =>
xiaonanyang-db commented on code in PR #37933:
URL: https://github.com/apache/spark/pull/37933#discussion_r974909141
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala:
##
@@ -224,7 +223,7 @@ class UnivocityParser(
case NonFatal(e)
xiaonanyang-db commented on PR #37933:
URL: https://github.com/apache/spark/pull/37933#issuecomment-1251869330
> Can you update the description to list all of the semantics of the change?
You can remove the point where we need to merge them to TimestampType if this
is not what the PR
LuciferYang commented on PR #37940:
URL: https://github.com/apache/spark/pull/37940#issuecomment-1251869079
Test the following code with input size
`1,5,10,20,50,100,150,200,300,400,500,1000,5000,1,2`
```
def testZipWithIndexToMap(valuesPerIteration: Int, collectionSize:
```
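The Scala benchmark above is truncated; as a rough Python analogue of the comparison being made (building a map through zip-style pairing versus a dedicated single-pass helper), one might sketch the following. All names here are hypothetical illustrations, not code from the PR:

```python
import timeit

def to_map_via_zip(keys, values):
    # Analogue of `keys.zip(values).toMap`: pairs are produced by zip
    # and then collected into a dict in one expression.
    return dict(zip(keys, values))

def to_map_single_pass(keys, values):
    # Analogue of a dedicated `Utils.toMap(keys, values)` helper that
    # fills a pre-created map in a single explicit loop.
    result = {}
    for k, v in zip(keys, values):
        result[k] = v
    return result

if __name__ == "__main__":
    for size in (1, 5, 10, 100, 1000, 5000):
        keys = list(range(size))
        values = [str(k) for k in keys]
        t1 = timeit.timeit(lambda: to_map_via_zip(keys, values), number=1000)
        t2 = timeit.timeit(lambda: to_map_single_pass(keys, values), number=1000)
        print(f"size={size}: zip={t1:.4f}s single-pass={t2:.4f}s")
```

Both functions must agree on their output; the benchmark only measures construction cost across the input sizes.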
HeartSaVioR commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974906420
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala:
##
@@ -311,6 +323,56 @@ object UnsupportedOperationChecker
sadikovi commented on code in PR #37933:
URL: https://github.com/apache/spark/pull/37933#discussion_r974905472
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala:
##
@@ -224,7 +223,7 @@ class UnivocityParser(
case NonFatal(e) =>
HeartSaVioR commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974905452
##
python/pyspark/worker.py:
##
@@ -207,6 +209,65 @@ def wrapped(key_series, value_series):
return lambda k, v: [(wrapped(k, v), to_arrow_type(return_type))]
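The line above shows the worker pairing a wrapped user function with the Arrow type derived from its declared return type. A simplified, self-contained sketch of that closure pattern (names are illustrative, not the actual pyspark internals):

```python
def wrap_grouped_udf(user_func, return_type):
    # Wrap the user function, then expose a callable that returns the
    # result together with the type tag the serializer should use,
    # mirroring `[(wrapped(k, v), to_arrow_type(return_type))]`.
    def wrapped(key, values):
        return user_func(key, values)
    return lambda k, v: [(wrapped(k, v), return_type)]

runner = wrap_grouped_udf(lambda key, values: sum(values), "bigint")
print(runner("a", [1, 2, 3]))  # → [(6, 'bigint')]
```

Returning the type tag alongside the result lets the downstream serializer encode each group's output without re-deriving the schema.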
dongjoon-hyun commented on PR #37729:
URL: https://github.com/apache/spark/pull/37729#issuecomment-1251865262
Thank you, @wangyum .
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific
LuciferYang opened a new pull request, #37940:
URL: https://github.com/apache/spark/pull/37940
### What changes were proposed in this pull request?
Similar to https://github.com/apache/spark/pull/37876, this PR introduces a
new `toMap` method to `o.a.spark.util.collection.Utils`, use
wangyum commented on PR #37729:
URL: https://github.com/apache/spark/pull/37729#issuecomment-1251857340
OK. https://issues.apache.org/jira/browse/SPARK-40493
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL
beliefer commented on PR #37937:
URL: https://github.com/apache/spark/pull/37937#issuecomment-1251853764
> Hm, why do we need this? Can't we do `spark.read.jdbc(...).rdd` or `toDS`?
I know. This PR just follows the legacy documentation of `JdbcRDD`. If we don't
need the change, we may
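For context on what the legacy `JdbcRDD` does beyond `spark.read.jdbc(...).rdd`: it splits an inclusive numeric key range into per-partition bounds that are substituted into a bounded query such as `SELECT * FROM t WHERE id >= ? AND id <= ?`. A Python sketch of that bound arithmetic (a simplification for illustration, not the actual Scala implementation):

```python
def jdbc_partition_bounds(lower, upper, num_partitions):
    """Split the inclusive key range [lower, upper] into contiguous,
    non-overlapping (start, end) bounds, one pair per partition."""
    length = upper - lower + 1
    bounds = []
    for i in range(num_partitions):
        start = lower + (i * length) // num_partitions
        end = lower + ((i + 1) * length) // num_partitions - 1
        bounds.append((start, end))
    return bounds

# Each (start, end) pair would be bound into one partition's query.
print(jdbc_partition_bounds(1, 100, 4))
# → [(1, 25), (26, 50), (51, 75), (76, 100)]
```

Integer division keeps the partitions near-equal in size even when the range does not divide evenly.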
dongjoon-hyun commented on code in PR #37924:
URL: https://github.com/apache/spark/pull/37924#discussion_r974893716
##
core/src/main/scala/org/apache/spark/internal/config/package.scala:
##
@@ -2221,6 +2221,14 @@ package object config {
.checkValue(_ >= 0, "needs to be a
dongjoon-hyun commented on code in PR #37934:
URL: https://github.com/apache/spark/pull/37934#discussion_r974891717
##
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala:
##
@@ -1461,10 +1462,7 @@ class ParquetIOSuite extends
cloud-fan commented on code in PR #37679:
URL: https://github.com/apache/spark/pull/37679#discussion_r974888079
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala:
##
@@ -48,9 +48,6 @@ import org.apache.spark.sql.types.StructType
import
dongjoon-hyun commented on PR #36096:
URL: https://github.com/apache/spark/pull/36096#issuecomment-1251844564
This test commit is backported to branch-3.3 according to the community
request, https://github.com/apache/spark/pull/36087#issuecomment-1251757187 .
--
This is an automated
dongjoon-hyun commented on PR #36087:
URL: https://github.com/apache/spark/pull/36087#issuecomment-1251843627
Sure, @Yikun .
This test commit is backported to branch-3.3 according to the community
request.
--
This is an automated message from the Apache Git Service.
To respond to
sunpe commented on PR #33154:
URL: https://github.com/apache/spark/pull/33154#issuecomment-1251836832
> Hello @sunpe, thank you for your very fast answer.
>
> Please let me give you some more context, I am using Spark v3.3.0 in K8s
using [Spark on K8S
HyukjinKwon opened a new pull request, #37939:
URL: https://github.com/apache/spark/pull/37939
### What changes were proposed in this pull request?
This PR proposes to document datetime.timedelta support in PySpark in SQL
DataType reference page. This support was added in SPARK-37275
HyukjinKwon commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974870249
##
python/pyspark/sql/pandas/group_ops.py:
##
@@ -216,6 +218,105 @@ def applyInPandas(
jdf = self._jgd.flatMapGroupsInPandas(udf_column._jc.expr())
HeartSaVioR commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r97486
##
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ApplyInPandasWithStatePythonRunner.scala:
##
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache
HyukjinKwon closed pull request #37932: [SPARK-40460][SS][3.3] Fix streaming
metrics when selecting _metadata
URL: https://github.com/apache/spark/pull/37932
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above
HyukjinKwon commented on PR #37932:
URL: https://github.com/apache/spark/pull/37932#issuecomment-1251799957
Merged to branch-3.3.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific
HeartSaVioR commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974858058
##
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ApplyInPandasWithStatePythonRunner.scala:
##
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache
HeartSaVioR commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974856402
##
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ApplyInPandasWithStateWriter.scala:
##
@@ -0,0 +1,246 @@
+/*
+ * Licensed to the Apache Software
HeartSaVioR commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974854298
##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -2705,6 +2705,44 @@ object SQLConf {
.booleanConf
LuciferYang commented on code in PR #37938:
URL: https://github.com/apache/spark/pull/37938#discussion_r974854247
##
common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java:
##
@@ -237,6 +241,10 @@ protected void serviceInit(Configuration
Yikun commented on PR #36087:
URL: https://github.com/apache/spark/pull/36087#issuecomment-1251792098
Here is a simple demo to show why we need them:
https://github.com/Yikun/spark-docker/pull/5
- docker image build with tag v3.3.0
- test with 3.3.0 K8S IT in GitHub Actions
-
LuciferYang commented on code in PR #37938:
URL: https://github.com/apache/spark/pull/37938#discussion_r974846876
##
common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java:
##
@@ -237,6 +241,10 @@ protected void serviceInit(Configuration
LuciferYang commented on PR #37938:
URL: https://github.com/apache/spark/pull/37938#issuecomment-1251782282
cc @tgravescs
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
LuciferYang opened a new pull request, #37938:
URL: https://github.com/apache/spark/pull/37938
### What changes were proposed in this pull request?
After SPARK-17321, `YarnShuffleService` persists data to and reloads data from
the local shuffle state db only when Yarn
beliefer opened a new pull request, #37937:
URL: https://github.com/apache/spark/pull/37937
### What changes were proposed in this pull request?
According to the legacy documentation of `JdbcRDD`, we need to expose a jdbcRDD
function in `SparkContext`.
### Why are the changes
alex-balikov commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974707164
##
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ApplyInPandasWithStatePythonRunner.scala:
##
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache
zhengruifeng commented on code in PR #37918:
URL: https://github.com/apache/spark/pull/37918#discussion_r974838677
##
mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala:
##
@@ -496,18 +499,23 @@ class ALSModel private[ml] (
.iterator.map { j =>
gengliangwang commented on code in PR #37840:
URL: https://github.com/apache/spark/pull/37840#discussion_r974833681
##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -3911,6 +3911,15 @@ object SQLConf {
LuciferYang commented on PR #37926:
URL: https://github.com/apache/spark/pull/37926#issuecomment-1251761360
thanks @viirya @srowen @dongjoon-hyun
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to
Yikun commented on PR #36087:
URL: https://github.com/apache/spark/pull/36087#issuecomment-1251757187
And also this https://github.com/apache/spark/pull/36096
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL
Yikun commented on PR #36087:
URL: https://github.com/apache/spark/pull/36087#issuecomment-1251756266
@dongjoon-hyun Could we backport this to branch-3.3? This will be very helpful
for running branch-3.3 K8S in GitHub Actions.
--
This is an automated message from the Apache Git Service.
To respond
zhengruifeng commented on PR #37929:
URL: https://github.com/apache/spark/pull/37929#issuecomment-1251750796
cc @itholic @HyukjinKwon
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the
Yikun commented on code in PR #37923:
URL: https://github.com/apache/spark/pull/37923#discussion_r974808692
##
python/pyspark/pandas/groupby.py:
##
@@ -993,6 +994,101 @@ def nth(self, n: int) -> FrameLike:
return self._prepare_return(DataFrame(internal))
+def
HeartSaVioR commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974805894
##
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ApplyInPandasWithStatePythonRunner.scala:
##
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache
HeartSaVioR commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974803979
##
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ApplyInPandasWithStatePythonRunner.scala:
##
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache
HeartSaVioR commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974798726
##
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ApplyInPandasWithStatePythonRunner.scala:
##
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache
warrenzhu25 commented on code in PR #37924:
URL: https://github.com/apache/spark/pull/37924#discussion_r974793372
##
core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala:
##
@@ -1860,8 +1867,18 @@ private[spark] class DAGScheduler(
s"(attempt
HeartSaVioR commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974784396
##
python/pyspark/sql/pandas/serializers.py:
##
@@ -371,3 +375,292 @@ def load_stream(self, stream):
raise ValueError(
viirya commented on code in PR #37934:
URL: https://github.com/apache/spark/pull/37934#discussion_r974783229
##
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala:
##
@@ -1461,10 +1462,7 @@ class ParquetIOSuite extends QueryTest with
zhengruifeng commented on PR #37923:
URL: https://github.com/apache/spark/pull/37923#issuecomment-1251698604
```
Oh no!
2 files would be reformatted, 352 files would be left unchanged.
Please run 'dev/reformat-python' script.
1
Error: Process completed with exit code 1.
```
zhengruifeng commented on code in PR #37923:
URL: https://github.com/apache/spark/pull/37923#discussion_r974780246
##
python/pyspark/pandas/groupby.py:
##
@@ -993,6 +994,101 @@ def nth(self, n: int) -> FrameLike:
return self._prepare_return(DataFrame(internal))
+
zhengruifeng commented on code in PR #37923:
URL: https://github.com/apache/spark/pull/37923#discussion_r974779483
##
python/pyspark/pandas/groupby.py:
##
@@ -44,6 +43,7 @@
)
import warnings
+import numpy as np
Review Comment:
`numpy` in the docstring was imported in
chaoqin-li1123 commented on PR #37935:
URL: https://github.com/apache/spark/pull/37935#issuecomment-1251692899
@HeartSaVioR
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
kazuyukitanimura commented on code in PR #37934:
URL: https://github.com/apache/spark/pull/37934#discussion_r974775400
##
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala:
##
@@ -1461,10 +1462,7 @@ class ParquetIOSuite extends
kazuyukitanimura commented on code in PR #37934:
URL: https://github.com/apache/spark/pull/37934#discussion_r974772538
##
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala:
##
@@ -1461,10 +1462,7 @@ class ParquetIOSuite extends
HeartSaVioR closed pull request #37917: [SPARK-40466][SS] Improve the error
message when DSv2 is disabled while DSv1 is not available
URL: https://github.com/apache/spark/pull/37917
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to
HeartSaVioR commented on PR #37917:
URL: https://github.com/apache/spark/pull/37917#issuecomment-1251667880
Thanks! Merging to master.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the
WweiL opened a new pull request, #37936:
URL: https://github.com/apache/spark/pull/37936
## What changes were proposed in this pull request?
Add complex tests to `StreamingSessionWindowSuite`. Concretely, I created
two helper functions,
- one is called
mengxr commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r974747883
##
python/pyspark/ml/functions.py:
##
@@ -106,6 +111,167 @@ def array_to_vector(col: Column) -> Column:
return
dongjoon-hyun commented on PR #37729:
URL: https://github.com/apache/spark/pull/37729#issuecomment-1251641154
Sorry, but I missed that this is an ancient patch. To @wangyum , we need a
new JIRA when we revert already released patches.
--
This is an automated message from the Apache Git
viirya commented on code in PR #37934:
URL: https://github.com/apache/spark/pull/37934#discussion_r974736080
##
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala:
##
@@ -1461,10 +1462,7 @@ class ParquetIOSuite extends QueryTest with
viirya commented on code in PR #37934:
URL: https://github.com/apache/spark/pull/37934#discussion_r974734194
##
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala:
##
@@ -1461,10 +1462,7 @@ class ParquetIOSuite extends QueryTest with
chaoqin-li1123 opened a new pull request, #37935:
URL: https://github.com/apache/spark/pull/37935
### What changes were proposed in this pull request?
Before unload of a StateStore, perform a cleanup.
### Why are the changes needed?
Currently, the maintenance of
wbo4958 commented on PR #37855:
URL: https://github.com/apache/spark/pull/37855#issuecomment-1251592287
>
> @wbo4958
>
> Issue: The xgboost code uses rdd barrier mode, but barrier mode does not
work with `coalesce` operator.
@mridulm just suggested using
dongjoon-hyun closed pull request #37424: [SPARK-39991][SQL][AQE] Use available
column statistics from completed query stages
URL: https://github.com/apache/spark/pull/37424
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and
AmplabJenkins commented on PR #37922:
URL: https://github.com/apache/spark/pull/37922#issuecomment-1251551259
Can one of the admins verify this patch?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to
AmplabJenkins commented on PR #37923:
URL: https://github.com/apache/spark/pull/37923#issuecomment-1251551212
Can one of the admins verify this patch?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to
AmplabJenkins commented on PR #37924:
URL: https://github.com/apache/spark/pull/37924#issuecomment-1251551174
Can one of the admins verify this patch?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to
alex-balikov commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974680745
##
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ApplyInPandasWithStatePythonRunner.scala:
##
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache
alex-balikov commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974672806
##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -2705,6 +2705,44 @@ object SQLConf {
.booleanConf
MaxGekk commented on PR #37921:
URL: https://github.com/apache/spark/pull/37921#issuecomment-1251528429
@srielau @anchovYu Could you take a look at the PR, please.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
leewyang commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r974649030
##
python/pyspark/ml/functions.py:
##
@@ -106,6 +111,167 @@ def array_to_vector(col: Column) -> Column:
return
leewyang commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r974646917
##
python/pyspark/ml/functions.py:
##
@@ -106,6 +111,167 @@ def array_to_vector(col: Column) -> Column:
return
xinrong-meng commented on PR #37908:
URL: https://github.com/apache/spark/pull/37908#issuecomment-1251489189
Thank you!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To
leewyang commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r974626628
##
python/pyspark/ml/functions.py:
##
@@ -106,6 +111,167 @@ def array_to_vector(col: Column) -> Column:
return
kazuyukitanimura commented on PR #37934:
URL: https://github.com/apache/spark/pull/37934#issuecomment-1251489126
cc @sunchao @viirya @flyrain
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the
xinrong-meng commented on PR #37912:
URL: https://github.com/apache/spark/pull/37912#issuecomment-1251488032
Thank you!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To
xinrong-meng commented on PR #37888:
URL: https://github.com/apache/spark/pull/37888#issuecomment-1251487641
Thank you @HyukjinKwon @zhengruifeng @Yikun for taking care of the merging!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to
kazuyukitanimura opened a new pull request, #37934:
URL: https://github.com/apache/spark/pull/37934
### What changes were proposed in this pull request?
This PR proposes to support `NullType` in `ColumnarBatchRow`.
### Why are the changes needed?
`ColumnarBatchRow.get()`
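The description is truncated here, but the shape of the change is a type dispatch in `get()` that yields null for `NullType` instead of failing on an unhandled case. A hypothetical Python sketch of such a dispatch (type tags stand in for Spark's DataType classes; this is not the actual Java code from the PR):

```python
# Hypothetical type tags standing in for Spark's DataType classes.
NULL_TYPE, INT_TYPE = "null", "int"

class ColumnarRowSketch:
    def __init__(self, values):
        self.values = values

    def get(self, ordinal, data_type):
        if data_type == NULL_TYPE:
            # A NullType column carries no data: always yield null
            # instead of falling through to an unsupported-type error.
            return None
        if data_type == INT_TYPE:
            return self.values[ordinal]
        raise ValueError(f"unsupported type: {data_type}")

row = ColumnarRowSketch([42, None])
print(row.get(0, INT_TYPE), row.get(1, NULL_TYPE))  # → 42 None
```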
grundprinzip commented on code in PR #37710:
URL: https://github.com/apache/spark/pull/37710#discussion_r974611991
##
connect/src/main/scala/org/apache/spark/sql/sparkconnect/planner/SparkConnectPlanner.scala:
##
@@ -0,0 +1,275 @@
+/*
+ * Licensed to the Apache Software
mridulm commented on code in PR #37924:
URL: https://github.com/apache/spark/pull/37924#discussion_r974600104
##
docs/configuration.md:
##
@@ -2605,6 +2605,15 @@ Apart from these, the following properties are also
available, and may be useful
2.2.0
+
+
grundprinzip commented on code in PR #37710:
URL: https://github.com/apache/spark/pull/37710#discussion_r974602945
##
connect/src/main/scala/org/apache/spark/sql/sparkconnect/planner/SparkConnectPlanner.scala:
##
@@ -0,0 +1,275 @@
+/*
+ * Licensed to the Apache Software
grundprinzip commented on code in PR #37710:
URL: https://github.com/apache/spark/pull/37710#discussion_r974599749
##
connect/src/main/scala/org/apache/spark/sql/sparkconnect/command/SparkConnectCommandPlanner.scala:
##
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software
AmplabJenkins commented on PR #37928:
URL: https://github.com/apache/spark/pull/37928#issuecomment-1251440593
Can one of the admins verify this patch?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to
shrprasa commented on PR #37880:
URL: https://github.com/apache/spark/pull/37880#issuecomment-1251427562
@gaborgsomogyi @dongjoon-hyun @HyukjinKwon Can you please review this PR?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to
dtenedor commented on code in PR #37840:
URL: https://github.com/apache/spark/pull/37840#discussion_r974575955
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##
@@ -730,6 +729,13 @@ trait CheckAnalysis extends PredicateHelper with
alex-balikov commented on code in PR #37893:
URL: https://github.com/apache/spark/pull/37893#discussion_r974517188
##
python/pyspark/worker.py:
##
@@ -361,6 +429,32 @@ def read_udfs(pickleSer, infile, eval_type):
if eval_type ==
roczei commented on PR #37679:
URL: https://github.com/apache/spark/pull/37679#issuecomment-1251364841
Thanks @cloud-fan, I have implemented this and all tests passed. As far as I
can see, we have resolved all of your feedback.
--
This is an automated message from the Apache Git Service.
To respond
mridulm commented on PR #37922:
URL: https://github.com/apache/spark/pull/37922#issuecomment-1251349810
> The push-based shuffle service will auto clean up the old shuffle merge
data
Consider the case I mentioned above - stage retry for an `INDETERMINATE`
stage.
We clean up
ayudovin commented on code in PR #37923:
URL: https://github.com/apache/spark/pull/37923#discussion_r974514017
##
python/pyspark/pandas/groupby.py:
##
@@ -993,6 +993,98 @@ def nth(self, n: int) -> FrameLike:
return self._prepare_return(DataFrame(internal))
+def
pralabhkumar commented on PR #37417:
URL: https://github.com/apache/spark/pull/37417#issuecomment-1251334364
@dongjoon-hyun, I have incorporated all the review comments; please take a
look.
--
This is an automated message from the Apache Git Service.
To respond to the message,
AmplabJenkins commented on PR #37930:
URL: https://github.com/apache/spark/pull/37930#issuecomment-1251324416
Can one of the admins verify this patch?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to
xiaonanyang-db opened a new pull request, #37933:
URL: https://github.com/apache/spark/pull/37933
### What changes were proposed in this pull request?
Adjust part of changes in https://github.com/apache/spark/pull/36871.
In the pr above, we introduced the support of date
xkrogen commented on code in PR #37634:
URL: https://github.com/apache/spark/pull/37634#discussion_r974499166
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala:
##
@@ -252,28 +267,44 @@ object
xkrogen commented on PR #37634:
URL: https://github.com/apache/spark/pull/37634#issuecomment-1251319065
Thanks for the suggestion @cloud-fan! Good point about there being many places
where Spark trusts nullability. Here I am trying to target places where _user
code_ could introduce a null. This
Yaohua628 commented on PR #37905:
URL: https://github.com/apache/spark/pull/37905#issuecomment-1251311583
> There's conflict in branch-3.3. @Yaohua628 Could you please craft a PR for
branch-3.3? Thanks in advance!
Done! https://github.com/apache/spark/pull/37932 - Thank you
--
Yaohua628 commented on PR #37932:
URL: https://github.com/apache/spark/pull/37932#issuecomment-1251310801
cc @HeartSaVioR
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
Yaohua628 opened a new pull request, #37932:
URL: https://github.com/apache/spark/pull/37932
### What changes were proposed in this pull request?
Cherry-picked from #37905
Streaming metrics report all 0 (`processedRowsPerSecond`, etc) when
selecting `_metadata`
dongjoon-hyun commented on code in PR #37924:
URL: https://github.com/apache/spark/pull/37924#discussion_r974491025
##
core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala:
##
@@ -2159,6 +2171,16 @@ private[spark] class DAGScheduler(
}
}
+ private def
dongjoon-hyun commented on code in PR #37924:
URL: https://github.com/apache/spark/pull/37924#discussion_r974490653
##
core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala:
##
@@ -1860,8 +1863,17 @@ private[spark] class DAGScheduler(
s"(attempt
dongjoon-hyun commented on code in PR #37924:
URL: https://github.com/apache/spark/pull/37924#discussion_r974486551
##
docs/configuration.md:
##
@@ -2605,6 +2605,15 @@ Apart from these, the following properties are also
available, and may be useful
2.2.0
+
+