mridulm commented on PR #40307:
URL: https://github.com/apache/spark/pull/40307#issuecomment-145754
We are evaluating it currently @dongjoon-hyun :-)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
dongjoon-hyun closed pull request #40289: [SPARK-42478][SQL][3.2] Make a
serializable jobTrackerId instead of a non-serializable JobID in
FileWriterFactory
URL: https://github.com/apache/spark/pull/40289
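The pattern behind SPARK-42478 can be sketched in pure Python (hypothetical names; the actual fix is in Scala's `FileWriterFactory`): instead of capturing a non-serializable object in the task closure, capture the plain string it was built from and reconstruct the object lazily on the worker.

```python
import pickle
from datetime import datetime

class JobID:
    """Stand-in for a non-serializable Hadoop JobID (hypothetical)."""
    def __init__(self, job_tracker_id: str, job_id: int):
        self.job_tracker_id = job_tracker_id
        self.job_id = job_id

    def __reduce__(self):
        # Simulate a class that refuses to be serialized.
        raise TypeError("JobID is not serializable")

class FileWriterFactory:
    """Ships only the serializable jobTrackerId; rebuilds JobID lazily."""
    def __init__(self, job_tracker_id: str):
        self.job_tracker_id = job_tracker_id  # plain string: serializable

    def create_job_id(self, job_id: int) -> JobID:
        # The non-serializable object is created on the worker side.
        return JobID(self.job_tracker_id, job_id)

factory = FileWriterFactory(datetime(2023, 3, 7).strftime("%Y%m%d%H%M%S"))
blob = pickle.dumps(factory)              # succeeds: only a string is captured
rebuilt = pickle.loads(blob).create_job_id(42)
```

The design choice is the same either way: serialize the cheap, stable value and defer construction of the heavyweight object to the side that needs it.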
dongjoon-hyun closed pull request #40283: [SPARK-42673][BUILD] Make
`build/mvn` build Spark only with the verified maven version
URL: https://github.com/apache/spark/pull/40283
hvanhovell closed pull request #40217: [SPARK-42559][CONNECT] Implement
DataFrameNaFunctions
URL: https://github.com/apache/spark/pull/40217
mridulm opened a new pull request, #40307:
URL: https://github.com/apache/spark/pull/40307
### What changes were proposed in this pull request?
Currently, if there is an executor node loss, we assume the shuffle data on
that node is also lost. This is not necessarily the case if
mridulm commented on PR #40307:
URL: https://github.com/apache/spark/pull/40307#issuecomment-1456844136
This is still WIP, but I want to get early feedback.
+CC @Ngone51, @otterc, @waitinfuture
ueshin commented on code in PR #40276:
URL: https://github.com/apache/spark/pull/40276#discussion_r1127045146
##
python/pyspark/sql/connect/types.py:
##
@@ -342,20 +343,325 @@ def from_arrow_schema(arrow_schema: "pa.Schema") ->
StructType:
def parse_data_type(data_type:
otterc commented on code in PR #40307:
URL: https://github.com/apache/spark/pull/40307#discussion_r1127049718
##
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala:
##
@@ -203,7 +205,8 @@ private[spark] class ExecutorAllocationManager(
throw new
aokolnychyi opened a new pull request, #40308:
URL: https://github.com/apache/spark/pull/40308
### What changes were proposed in this pull request?
This PR adds a rule to align UPDATE assignments with table attributes.
### Why are the changes needed?
amaliujia commented on code in PR #40304:
URL: https://github.com/apache/spark/pull/40304#discussion_r1127070223
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala:
##
@@ -76,7 +76,8 @@ class ClientE2ETestSuite extends
mridulm commented on code in PR #40307:
URL: https://github.com/apache/spark/pull/40307#discussion_r1127076610
##
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala:
##
@@ -203,7 +205,8 @@ private[spark] class ExecutorAllocationManager(
throw new
dongjoon-hyun commented on PR #40307:
URL: https://github.com/apache/spark/pull/40307#issuecomment-1457022823
If you don't mind, please share some results later~ :)
aokolnychyi commented on code in PR #40308:
URL: https://github.com/apache/spark/pull/40308#discussion_r1127081206
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -3344,43 +3345,6 @@ class Analyzer(override val catalogManager:
zhenlineo opened a new pull request, #40304:
URL: https://github.com/apache/spark/pull/40304
### What changes were proposed in this pull request?
Mute the UDF test.
### Why are the changes needed?
The test fails during maven test runs because the server cannot find the udf
in
zhenlineo commented on PR #40274:
URL: https://github.com/apache/spark/pull/40274#issuecomment-1456552298
https://github.com/apache/spark/pull/40304
https://github.com/apache/spark/pull/40303
amaliujia commented on PR #40303:
URL: https://github.com/apache/spark/pull/40303#issuecomment-1456893990
LGTM
dongjoon-hyun commented on PR #40290:
URL: https://github.com/apache/spark/pull/40290#issuecomment-1457080979
Merged to branch-3.3. Thank you, @Yikf and @cloud-fan .
dongjoon-hyun closed pull request #40290: [SPARK-42478][SQL][3.3] Make a
serializable jobTrackerId instead of a non-serializable JobID in
FileWriterFactory
URL: https://github.com/apache/spark/pull/40290
mridulm commented on code in PR #40307:
URL: https://github.com/apache/spark/pull/40307#discussion_r1126939110
##
core/src/main/scala/org/apache/spark/SparkContext.scala:
##
@@ -596,6 +591,13 @@ class SparkContext(config: SparkConf) extends Logging {
aokolnychyi commented on code in PR #40308:
URL: https://github.com/apache/spark/pull/40308#discussion_r1127079791
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AlignRowLevelCommandAssignments.scala:
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache
amaliujia commented on PR #40309:
URL: https://github.com/apache/spark/pull/40309#issuecomment-1457104885
cc @zhengruifeng @HyukjinKwon @grundprinzip
amaliujia opened a new pull request, #40309:
URL: https://github.com/apache/spark/pull/40309
### What changes were proposed in this pull request?
Rename Connect proto Request client_id to session_id.
On the one hand, when I read client_id I was confused about what it is
FurcyPin commented on code in PR #40271:
URL: https://github.com/apache/spark/pull/40271#discussion_r1126999170
##
python/pyspark/sql/tests/test_functions.py:
##
@@ -1268,6 +1268,12 @@ def test_bucket(self):
message_parameters={"arg_name": "numBuckets", "arg_type":
srielau commented on code in PR #40282:
URL: https://github.com/apache/spark/pull/40282#discussion_r1126835916
##
python/docs/source/development/errors.rst:
##
@@ -0,0 +1,92 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license
huanliwang-db opened a new pull request, #40306:
URL: https://github.com/apache/spark/pull/40306
`pivot` is an unsupported operation in structured streaming but produces a
bad error message that is quite misleading.
The following is the current error message for the pivot in SS:
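(The quoted error message is elided above.) The kind of check involved can be sketched in Python with hypothetical names; the real check lives in Spark's unsupported-operation analysis: detect the unsupported operation up front and raise a targeted message instead of letting it fail obscurely downstream.

```python
# Illustrative set; which operations streaming rejects is Spark's call.
UNSUPPORTED_STREAMING_OPS = {"pivot", "cube", "rollup"}

def check_streaming_plan(operations):
    """Raise a descriptive error for operations not supported on
    streaming DataFrames. Sketch, not Spark's actual checker."""
    for op in operations:
        if op in UNSUPPORTED_STREAMING_OPS:
            raise ValueError(
                f"Operation '{op}' is not supported with streaming "
                "DataFrames/Datasets; apply it to a static DataFrame instead."
            )

try:
    check_streaming_plan(["groupBy", "pivot", "agg"])
except ValueError as e:
    message = str(e)
```

The point of the PR is exactly this: name the offending operation in the error rather than surfacing an unrelated downstream failure.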
hvanhovell commented on PR #40217:
URL: https://github.com/apache/spark/pull/40217#issuecomment-1456804274
LGTM
zhenlineo opened a new pull request, #40305:
URL: https://github.com/apache/spark/pull/40305
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How
hvanhovell commented on PR #40217:
URL: https://github.com/apache/spark/pull/40217#issuecomment-1456804495
Merging
aokolnychyi commented on code in PR #40308:
URL: https://github.com/apache/spark/pull/40308#discussion_r1127080574
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AlignRowLevelCommandAssignments.scala:
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache
HeartSaVioR commented on PR #40306:
URL: https://github.com/apache/spark/pull/40306#issuecomment-1457192756
Thanks, merging to master!
hvanhovell commented on PR #40218:
URL: https://github.com/apache/spark/pull/40218#issuecomment-1457206111
Merging.
ueshin opened a new pull request, #40310:
URL: https://github.com/apache/spark/pull/40310
### What changes were proposed in this pull request?
Fixes `createDataFrame` to autogenerate missing column names.
### Why are the changes needed?
Currently the number of the column
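The autogeneration being added can be sketched in plain Python (the `_1`, `_2`, … convention matches what Spark uses for unnamed columns; the function itself is illustrative, not Spark's code): when the user supplies fewer column names than the data has columns, the remainder are filled in.

```python
def complete_column_names(num_cols, provided=None):
    """Pad a possibly short list of column names with autogenerated
    `_N` names, 1-indexed by position. Illustrative sketch."""
    provided = list(provided or [])
    return provided + [f"_{i + 1}" for i in range(len(provided), num_cols)]

complete_column_names(3)           # ['_1', '_2', '_3']
complete_column_names(3, ["id"])   # ['id', '_2', '_3']
```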
hvanhovell closed pull request #40309: [SPARK-42688][CONNECT] Rename Connect
proto Request client_id to session_id
URL: https://github.com/apache/spark/pull/40309
beliefer commented on PR #40287:
URL: https://github.com/apache/spark/pull/40287#issuecomment-1457418420
@hvanhovell Scala also uses
`UnresolvedNamedLambdaVariable.freshVarName("x")` to get the unique names. see:
cloud-fan commented on code in PR #40300:
URL: https://github.com/apache/spark/pull/40300#discussion_r1127307264
##
sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -2714,6 +2726,17 @@ class Dataset[T] private[sql](
*/
def withColumn(colName: String,
cloud-fan commented on code in PR #40300:
URL: https://github.com/apache/spark/pull/40300#discussion_r1127307622
##
sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -2714,6 +2726,17 @@ class Dataset[T] private[sql](
*/
def withColumn(colName: String,
aokolnychyi commented on code in PR #40308:
URL: https://github.com/apache/spark/pull/40308#discussion_r1127340319
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AssignmentUtils.scala:
##
@@ -0,0 +1,275 @@
+/*
+ * Licensed to the Apache Software
amaliujia commented on PR #40310:
URL: https://github.com/apache/spark/pull/40310#issuecomment-1457579306
LGTM!
amaliujia commented on code in PR #40310:
URL: https://github.com/apache/spark/pull/40310#discussion_r1127370577
##
python/pyspark/sql/connect/session.py:
##
@@ -235,6 +235,9 @@ def createDataFrame(
# If no schema supplied by user then get the names of columns only
viirya commented on code in PR #40215:
URL: https://github.com/apache/spark/pull/40215#discussion_r1127412999
##
docs/structured-streaming-programming-guide.md:
##
@@ -1848,12 +1848,137 @@ Additional details on supported joins:
- As of Spark 2.4, you can use joins only when
zhenlineo commented on PR #40305:
URL: https://github.com/apache/spark/pull/40305#issuecomment-1457166376
If this PR is accepted then there is no need to merge
https://github.com/apache/spark/pull/40303 as this PR overrides the changes
needed there.
zhenlineo commented on PR #40303:
URL: https://github.com/apache/spark/pull/40303#issuecomment-1457165581
Or even better? -> https://github.com/apache/spark/pull/40305
HeartSaVioR closed pull request #40306: [SPARK-42687][SS] Better error message
for the unsupported `pivot` operation in Streaming
URL: https://github.com/apache/spark/pull/40306
github-actions[bot] closed pull request #38736: [SPARK-41214][SQL] - SQL
Metrics are missing from Spark UI when AQE for Cached DataFrame is enabled
URL: https://github.com/apache/spark/pull/38736
vitaliili-db commented on PR #40295:
URL: https://github.com/apache/spark/pull/40295#issuecomment-1457383616
build timed out but succeeded on rerun:
https://github.com/vitaliili-db/spark/actions/runs/4346311324/jobs/7598960402
vitaliili-db commented on PR #40295:
URL: https://github.com/apache/spark/pull/40295#issuecomment-1457384015
@gengliangwang can you review this please?
hvanhovell closed pull request #40303: [SPARK-42656][CONNECT][Followup] Improve
the script to start spark-connect server
URL: https://github.com/apache/spark/pull/40303
HyukjinKwon commented on PR #40244:
URL: https://github.com/apache/spark/pull/40244#issuecomment-1457397715
WDYT @hvanhovell ?
HeartSaVioR commented on PR #39931:
URL: https://github.com/apache/spark/pull/39931#issuecomment-1457520455
Thanks all for the huge effort put into reviewing this complicated change! The
implementation got better with the review comments.
aokolnychyi commented on code in PR #40308:
URL: https://github.com/apache/spark/pull/40308#discussion_r1127342306
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AssignmentUtils.scala:
##
@@ -0,0 +1,275 @@
+/*
+ * Licensed to the Apache Software
aokolnychyi commented on code in PR #40308:
URL: https://github.com/apache/spark/pull/40308#discussion_r1127343402
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala:
##
@@ -129,7 +129,7 @@ object TableOutputResolver {
}
}
itholic commented on PR #40280:
URL: https://github.com/apache/spark/pull/40280#issuecomment-1457558427
Thanks, @panbingkun !
By the way, I think this issue has a pretty high priority since the default
nullability of a schema is `False`.
```python
>>> sdf =
viirya commented on code in PR #40215:
URL: https://github.com/apache/spark/pull/40215#discussion_r1127438527
##
docs/structured-streaming-programming-guide.md:
##
@@ -1848,12 +1848,137 @@ Additional details on supported joins:
- As of Spark 2.4, you can use joins only when
hvanhovell commented on PR #40309:
URL: https://github.com/apache/spark/pull/40309#issuecomment-1457390771
Merging.
LuciferYang commented on PR #40283:
URL: https://github.com/apache/spark/pull/40283#issuecomment-1457450172
Thanks @dongjoon-hyun @pan3793 ~
Also thanks @gnodet @hboutemy
cloud-fan commented on PR #40300:
URL: https://github.com/apache/spark/pull/40300#issuecomment-1457500143
It's a good idea to provide an API that allows people to unambiguously
reference metadata columns, and I like the new `Dataset.metadataColumn`
function. However, I think the prepending
aokolnychyi commented on code in PR #40308:
URL: https://github.com/apache/spark/pull/40308#discussion_r1127348254
##
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala:
##
@@ -2057,6 +2057,17 @@ private[sql] object QueryCompilationErrors
wangyum commented on code in PR #40268:
URL: https://github.com/apache/spark/pull/40268#discussion_r1127193046
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala:
##
@@ -138,56 +136,53 @@ object ConstantPropagation extends Rule[LogicalPlan]
mridulm commented on PR #40307:
URL: https://github.com/apache/spark/pull/40307#issuecomment-1457315803
The test failure is unrelated, so existing tests work fine - will work on
specifically checking for the changes in this PR later today.
HeartSaVioR closed pull request #39931: [SPARK-42376][SS] Introduce watermark
propagation among operators
URL: https://github.com/apache/spark/pull/39931
HeartSaVioR commented on PR #39931:
URL: https://github.com/apache/spark/pull/39931#issuecomment-1457521207
Merging to master.
olaky commented on code in PR #40300:
URL: https://github.com/apache/spark/pull/40300#discussion_r1127449861
##
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileMetadataStructSuite.scala:
##
@@ -244,6 +245,89 @@ class FileMetadataStructSuite extends
hvanhovell closed pull request #40218: [SPARK-42579][CONNECT] Part-1:
`function.lit` support `Array[_]` dataType
URL: https://github.com/apache/spark/pull/40218
WeichenXu123 commented on code in PR #40297:
URL: https://github.com/apache/spark/pull/40297#discussion_r1127288752
##
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/AlgorithmRegisty.scala:
##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software
beliefer commented on code in PR #40277:
URL: https://github.com/apache/spark/pull/40277#discussion_r1127311861
##
connector/connect/common/src/main/protobuf/spark/connect/relations.proto:
##
@@ -140,6 +140,9 @@ message Read {
// (Optional) A list of path for file-system
zsxwing commented on code in PR #39931:
URL: https://github.com/apache/spark/pull/39931#discussion_r1127324257
##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/WatermarkPropagator.scala:
##
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation
itholic commented on PR #40288:
URL: https://github.com/apache/spark/pull/40288#issuecomment-1457564181
cc @allanf-db addressed the comments we discussed offline
HeartSaVioR commented on PR #40215:
URL: https://github.com/apache/spark/pull/40215#issuecomment-1457584928
cc. @zsxwing @rangadi @jerrypeng @anishshri-db @chaoqin-li1123
cc-ing folks who reviewed the code change PR. This PR is a doc change to
show what is being unblocked, like we
jerqi commented on PR #40307:
URL: https://github.com/apache/spark/pull/40307#issuecomment-1457625037
> spark.shuffle.reduceLocality.enabled
Thanks, I got it.
HyukjinKwon opened a new pull request, #40311:
URL: https://github.com/apache/spark/pull/40311
### What changes were proposed in this pull request?
This PR proposes to disable ANSI mode in both `replace float with nan` and
`replace double with nan` tests.
### Why are the
LuciferYang commented on PR #40218:
URL: https://github.com/apache/spark/pull/40218#issuecomment-1457373651
Thanks @hvanhovell
hvanhovell commented on PR #40303:
URL: https://github.com/apache/spark/pull/40303#issuecomment-1457382781
Merging
cloud-fan commented on code in PR #40300:
URL: https://github.com/apache/spark/pull/40300#discussion_r1127321842
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala:
##
@@ -42,6 +42,24 @@ abstract class LogicalPlan
*/
def
aokolnychyi commented on PR #40308:
URL: https://github.com/apache/spark/pull/40308#issuecomment-1457537193
cc @huaxingao @cloud-fan @dongjoon-hyun @sunchao @viirya @gengliangwang
shrprasa commented on PR #37880:
URL: https://github.com/apache/spark/pull/37880#issuecomment-1457588129
Gentle ping @holdenk @dongjoon-hyun @Ngone51 , @HyukjinKwon
HyukjinKwon commented on PR #40296:
URL: https://github.com/apache/spark/pull/40296#issuecomment-1457393807
Merged to master and branch-3.4.
HyukjinKwon closed pull request #40296: [SPARK-42680][CONNECT][TESTS] Create
the helper function withSQLConf for connect test framework
URL: https://github.com/apache/spark/pull/40296
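The helper named in #40296 follows a common set-then-restore pattern. A sketch of that pattern in Python, against a plain dict rather than a real SparkSession (names and structure are hypothetical, not the Connect test framework's code):

```python
from contextlib import contextmanager

@contextmanager
def with_sql_conf(conf, overrides):
    """Temporarily apply config overrides, restoring the prior values
    on exit. Sketch of the withSQLConf test-helper pattern."""
    missing = object()  # sentinel: distinguishes "unset" from a stored None
    saved = {k: conf.get(k, missing) for k in overrides}
    conf.update(overrides)
    try:
        yield conf
    finally:
        for k, v in saved.items():
            if v is missing:
                conf.pop(k, None)   # key did not exist before: remove it
            else:
                conf[k] = v         # key existed: restore the old value

conf = {"spark.sql.ansi.enabled": "true"}
with with_sql_conf(conf, {"spark.sql.ansi.enabled": "false"}) as c:
    inside = c["spark.sql.ansi.enabled"]   # "false" inside the block
after = conf["spark.sql.ansi.enabled"]     # restored to "true" afterwards
```

The `try/finally` is the important part: the original values come back even if the test body raises.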
beliefer commented on PR #40296:
URL: https://github.com/apache/spark/pull/40296#issuecomment-1457422020
@HyukjinKwon @zhengruifeng Thank you.
beliefer commented on code in PR #40277:
URL: https://github.com/apache/spark/pull/40277#discussion_r1127291008
##
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/DataFrameReader.scala:
##
@@ -250,6 +250,46 @@ class DataFrameReader private[sql] (sparkSession:
hvanhovell commented on PR #40287:
URL: https://github.com/apache/spark/pull/40287#issuecomment-1457433571
@beliefer here is the thing. When this was designed it was mainly aimed at
sql, and there we definitely do not generate unique names in lambda functions
either. This is all done in
zhengruifeng commented on PR #40296:
URL: https://github.com/apache/spark/pull/40296#issuecomment-1457271633
@beliefer I think it's not one of the `new features` mentioned in the PR description
panbingkun commented on PR #40280:
URL: https://github.com/apache/spark/pull/40280#issuecomment-1457349284
> Thanks @panbingkun for the nice fix! Btw, think I found another
`createDataFrame` bug which is not working properly with non-nullable schema as
below:
>
> ```python
> >>>
zhengruifeng commented on code in PR #40297:
URL: https://github.com/apache/spark/pull/40297#discussion_r1127232443
##
connector/connect/common/src/main/protobuf/spark/connect/ml.proto:
##
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or
HeartSaVioR commented on PR #40215:
URL: https://github.com/apache/spark/pull/40215#issuecomment-1457585553
cc. @viirya as well who may be interested with new feature in SS.
shrprasa commented on PR #40258:
URL: https://github.com/apache/spark/pull/40258#issuecomment-1457585690
Gentle Ping @srowen @dongjoon-hyun
shrprasa commented on PR #40128:
URL: https://github.com/apache/spark/pull/40128#issuecomment-1457586866
gentle ping @holdenk
jerqi commented on PR #40307:
URL: https://github.com/apache/spark/pull/40307#issuecomment-1457606879
Hi @mridulm , thanks for your great work! Apache Uniffle is a similar project
to Apache Celeborn. We also patched Apache Spark like
pan3793 commented on PR #40307:
URL: https://github.com/apache/spark/pull/40307#issuecomment-1457619866
@jerqi locality may still have benefits when RSS works in hybrid
deployments; besides, there is a dedicated configuration for that:
`spark.shuffle.reduceLocality.enabled`
amaliujia commented on PR #40311:
URL: https://github.com/apache/spark/pull/40311#issuecomment-1457682741
LGTM