EnricoMi commented on code in PR #37304:
URL: https://github.com/apache/spark/pull/37304#discussion_r932044028
##
python/pyspark/sql/dataframe.py:
##
@@ -2188,6 +2188,142 @@ def cube(self, *cols: "ColumnOrName") -> "GroupedData":
# type: ignore[misc]
return
panbingkun commented on code in PR #36996:
URL: https://github.com/apache/spark/pull/36996#discussion_r932085322
##
sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/AlterTableSetSerdeSuite.scala:
##
@@ -0,0 +1,203 @@
+/*
+ * Licensed to the Apache Software
physinet opened a new pull request, #37329:
URL: https://github.com/apache/spark/pull/37329
### What changes were proposed in this pull request?
Support either literal Python strings or Column objects for the pattern and
replacement arguments for `regexp_replace`.
### Why are the
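For context, a minimal PySpark sketch of the two call styles (the `Column`-typed pattern/replacement arguments are the proposal in this PR; the literal-string form already exists):
```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, regexp_replace

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("abc", "a", "x")], ["s", "pat", "rep"])

df.select(regexp_replace("s", "a", "x")).show()                # literal strings (existing)
df.select(regexp_replace("s", col("pat"), col("rep"))).show()  # Column arguments (proposed)
```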
cloud-fan commented on code in PR #37320:
URL: https://github.com/apache/spark/pull/37320#discussion_r932183215
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala:
##
@@ -410,12 +413,24 @@ object V2ScanRelationPushDown extends
cloud-fan closed pull request #36918: [SQL][SPARK-39528] Use V2 Filter in
SupportsRuntimeFiltering
URL: https://github.com/apache/spark/pull/36918
peter-toth commented on PR #37319:
URL: https://github.com/apache/spark/pull/37319#issuecomment-1197915164
So, I was thinking about adding
```
case _: Union =>
  var first = true
  plan.mapChildren { child =>
    if (first) {
      first =
wayneguow commented on PR #36775:
URL: https://github.com/apache/spark/pull/36775#issuecomment-1198234993
IMO, it would be better if users could configure which exceptions can be ignored as corrupt files.
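For reference, the knob under discussion is a single boolean today; the suggestion above would refine it to a configurable set of exception types. A minimal sketch of the current behavior:
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Existing switch: when enabled, files that throw while being read are
# skipped wholesale, regardless of the exception type.
spark.conf.set("spark.sql.files.ignoreCorruptFiles", "true")
```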
beliefer commented on PR #37317:
URL: https://github.com/apache/spark/pull/37317#issuecomment-1198064228
ping @MaxGekk @gengliangwang @dongjoon-hyun cc @cloud-fan
cloud-fan commented on code in PR #37320:
URL: https://github.com/apache/spark/pull/37320#discussion_r932182709
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala:
##
@@ -545,6 +560,9 @@ case class ScanBuilderHolder(
var
cloud-fan commented on code in PR #37320:
URL: https://github.com/apache/spark/pull/37320#discussion_r932181859
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala:
##
@@ -545,6 +560,9 @@ case class ScanBuilderHolder(
var
cloud-fan commented on code in PR #37320:
URL: https://github.com/apache/spark/pull/37320#discussion_r932186273
##
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##
@@ -811,6 +800,244 @@ class JDBCV2Suite extends QueryTest with
SharedSparkSession with
goutam-git commented on code in PR #37065:
URL: https://github.com/apache/spark/pull/37065#discussion_r932196681
##
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/compression/compressionSchemes.scala:
##
@@ -421,7 +421,7 @@ private[columnar] case object
cloud-fan commented on PR #37319:
URL: https://github.com/apache/spark/pull/37319#issuecomment-1198119033
`Union.output` is a long-standing issue (same for `Join.output`). It reuses
the first child's output but apparently `Union` and its first child output
different values. We have to
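A quick PySpark illustration of the point about `Union.output` (the unioned plan reuses the first child's attributes, so the result's column names come from the first input):
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df1 = spark.createDataFrame([(1, 2)], ["a", "b"])
df2 = spark.createDataFrame([(3, 4)], ["x", "y"])

print(df1.union(df2).columns)  # ['a', 'b'] -- reused from the first child
```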
ulysses-you opened a new pull request, #37330:
URL: https://github.com/apache/spark/pull/37330
### What changes were proposed in this pull request?
Optimize Global sort to RepartitionByExpression, for example:
```
Sort local          Sort local
Sort global   =>    RepartitionByExpression
```
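A hedged sketch of the intuition in the DataFrame API: a global `Sort` is executed as range partitioning plus a per-partition sort, so when only the partitioning is needed downstream it can be replaced by a `RepartitionByExpression`:
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.range(100).withColumnRenamed("id", "key")

df.orderBy("key").explain()             # global Sort: range partitioning + local sort
df.repartitionByRange("key").explain()  # RepartitionByExpression: partitioning only
```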
LuciferYang commented on code in PR #37293:
URL: https://github.com/apache/spark/pull/37293#discussion_r932335919
##
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedDeltaBinaryPackedReader.java:
##
@@ -300,7 +300,8 @@ private void
senthh commented on PR #35785:
URL: https://github.com/apache/spark/pull/35785#issuecomment-1198036728
@dongjoon-hyun @dgd-contributor @gaborgsomogyi @squito Could you kindly review this PR, please?
LuciferYang opened a new pull request, #37331:
URL: https://github.com/apache/spark/pull/37331
### What changes were proposed in this pull request?
Testing with Arrow 9.0.0, will update here later
### Why are the changes needed?
### Does this PR introduce _any_
AngersZh commented on PR #37162:
URL: https://github.com/apache/spark/pull/37162#issuecomment-1197950750
ping @dongjoon-hyun The latest GA run failed, caused by
```
* DONE (miniUI)
ERROR: dependency ‘pkgdown’ is not available for package ‘devtools’
* removing
panbingkun commented on PR #37314:
URL: https://github.com/apache/spark/pull/37314#issuecomment-1197977785
cc @dongjoon-hyun
MaxGekk commented on code in PR #36996:
URL: https://github.com/apache/spark/pull/36996#discussion_r932008623
##
sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/AlterTableSetSerdeSuite.scala:
##
@@ -0,0 +1,203 @@
+/*
+ * Licensed to the Apache Software
ulysses-you commented on PR #37275:
URL: https://github.com/apache/spark/pull/37275#issuecomment-1197941198
cc @cloud-fan ready for branch-3.2
ulysses-you commented on PR #37276:
URL: https://github.com/apache/spark/pull/37276#issuecomment-1197940919
cc @cloud-fan ready for branch-3.1
huaxingao commented on PR #36918:
URL: https://github.com/apache/spark/pull/36918#issuecomment-1198240325
Thanks @cloud-fan @zinking
cloud-fan commented on code in PR #37287:
URL: https://github.com/apache/spark/pull/37287#discussion_r932317590
##
sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala:
##
@@ -110,53 +108,44 @@ class CatalogImpl(sparkSession: SparkSession) extends
Catalog {
peter-toth commented on PR #37319:
URL: https://github.com/apache/spark/pull/37319#issuecomment-1198030620
I don't think that extra `Alias` does any harm in that test; just the expected result needs to be amended.
My proposal also fixes the issue of the following:
```
SELECT a, b
cloud-fan commented on code in PR #37327:
URL: https://github.com/apache/spark/pull/37327#discussion_r932204473
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala:
##
@@ -153,19 +153,24 @@ class CSVOptions(
* Disabled by default for backwards
ala commented on code in PR #37228:
URL: https://github.com/apache/spark/pull/37228#discussion_r932280019
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala:
##
@@ -223,8 +216,25 @@ object FileSourceStrategy extends Strategy with
github-actions[bot] closed pull request #36240: [SPARK-37787][CORE] fix bug,
Long running Spark Job throw HDFS_DELEGATE_TOKEN not found in cache Exception
URL: https://github.com/apache/spark/pull/36240
RS131419 commented on code in PR #37230:
URL: https://github.com/apache/spark/pull/37230#discussion_r932792260
##
sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala:
##
@@ -1611,4 +1611,26 @@ class StatisticsSuite extends
StatisticsCollectionTestBase with
cloud-fan commented on PR #37287:
URL: https://github.com/apache/spark/pull/37287#issuecomment-1198793708
> Is the issue that `listTables()` does not respect the current catalog fixed in this PR?
I think so, by always passing the fully qualified name to `getTable` in
`listTables`. We can add tests later,
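A hedged sketch of the behavior being discussed (assumed table name; with the fix, each listed entry is resolved via its fully qualified name, so the current catalog is respected):
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark.sql("CREATE TABLE IF NOT EXISTS spark_catalog.default.t1 (id INT) USING parquet")

print([t.name for t in spark.catalog.listTables("default")])  # includes 't1'
```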
HyukjinKwon commented on PR #37329:
URL: https://github.com/apache/spark/pull/37329#issuecomment-1198809042
cc @zero323 FYI
HyukjinKwon commented on code in PR #37329:
URL: https://github.com/apache/spark/pull/37329#discussion_r932809698
##
python/pyspark/sql/functions.py:
##
@@ -3262,7 +3262,19 @@ def regexp_extract(str: "ColumnOrName", pattern: str,
idx: int) -> Column:
return
Yikun commented on PR #37258:
URL: https://github.com/apache/spark/pull/37258#issuecomment-1198812267
Sorry for the late reply, I've been busy with local meetings these past few days.
> In addition, can we get the content of dmesg?
@LuciferYang We can add a separate step like:
```
-
LuciferYang commented on PR #37258:
URL: https://github.com/apache/spark/pull/37258#issuecomment-1198825739
> Sorry for the late reply, I've been busy with local meetings these past few days.
>
> > In addition, can we get the content of dmesg?
>
> @LuciferYang We can add a separate step like:
deshanxiao opened a new pull request, #37336:
URL: https://github.com/apache/spark/pull/37336
### What changes were proposed in this pull request?
Today we have two SchemaUtils: SQL SchemaUtils and mllib SchemaUtils. This PR tries to remove the SchemaUtils in mllib.
### Why are the
zhengruifeng commented on code in PR #37304:
URL: https://github.com/apache/spark/pull/37304#discussion_r932846004
##
sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -2127,6 +2127,15 @@ class Dataset[T] private[sql](
valueColumnName: String): DataFrame
MaxGekk commented on code in PR #37337:
URL: https://github.com/apache/spark/pull/37337#discussion_r932884678
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalMathUtils.scala:
##
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
ivoson commented on code in PR #37268:
URL: https://github.com/apache/spark/pull/37268#discussion_r928873929
##
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala:
##
@@ -388,14 +388,19 @@ private[spark] class TaskSchedulerImpl(
val execId =
Yikun commented on code in PR #37305:
URL: https://github.com/apache/spark/pull/37305#discussion_r932795358
##
dev/lint-python:
##
@@ -210,7 +210,7 @@ function black_test {
local BLACK_STATUS=
# Skip check if black is not installed.
-$BLACK_BUILD 2> /dev/null
+
beliefer commented on code in PR #37320:
URL: https://github.com/apache/spark/pull/37320#discussion_r932847111
##
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##
@@ -811,6 +800,244 @@ class JDBCV2Suite extends QueryTest with
SharedSparkSession with
zhengruifeng commented on code in PR #37304:
URL: https://github.com/apache/spark/pull/37304#discussion_r932851912
##
sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -2127,6 +2127,15 @@ class Dataset[T] private[sql](
valueColumnName: String): DataFrame
gengliangwang opened a new pull request, #37337:
URL: https://github.com/apache/spark/pull/37337
### What changes were proposed in this pull request?
Similar to https://github.com/apache/spark/pull/37313: currently, when arithmetic overflow errors happen under ANSI mode,
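For context, a sketch of the failure mode the PR targets (assumed minimal query; under ANSI mode, integer arithmetic that overflows raises an error instead of wrapping around):
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark.conf.set("spark.sql.ansi.enabled", "true")

try:
    spark.sql("SELECT 2147483647 + 1").collect()  # INT overflow under ANSI mode
except Exception as e:
    print(e)  # arithmetic overflow error (exact message wording varies by version)

spark.sql("SELECT try_add(2147483647, 1)").show()  # try_* variant returns NULL instead
```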
gengliangwang opened a new pull request, #37338:
URL: https://github.com/apache/spark/pull/37338
### What changes were proposed in this pull request?
Update the codegen error message for data types which can't be compared by replacing `un-comparable` with `incomparable`.
gengliangwang commented on PR #37338:
URL: https://github.com/apache/spark/pull/37338#issuecomment-1198884914
This is trivial. I found it when working on
https://github.com/apache/spark/pull/37337
Jonathancui123 commented on PR #37327:
URL: https://github.com/apache/spark/pull/37327#issuecomment-1198894995
> Should we keep the requirement that `inferDate = true` needs `inferSchema = true`? I think we should clarify the semantics.
@sadikovi I think we should keep the requirement and
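A sketch of the option pairing under discussion (option names as in the PR; the rename from `inferDate` to `preferDate` was still being debated):
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = (spark.read
      .option("header", True)
      .option("inferSchema", True)  # date inference only applies during schema inference
      .option("inferDate", True)    # proposed option from this PR
      .csv("/path/to/data.csv"))    # hypothetical path
```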
MaxGekk commented on code in PR #37322:
URL: https://github.com/apache/spark/pull/37322#discussion_r932499581
##
sql/core/src/test/scala/org/apache/spark/sql/DatasetUnpivotSuite.scala:
##
@@ -305,14 +305,17 @@ class DatasetUnpivotSuite extends QueryTest
dongjoon-hyun commented on code in PR #37335:
URL: https://github.com/apache/spark/pull/37335#discussion_r932701092
##
python/pyspark/sql/dataframe.py:
##
@@ -3237,17 +3237,18 @@ def drop(self, *cols: "ColumnOrName") -> "DataFrame":
# type: ignore[misc]
"""
dtenedor commented on PR #37280:
URL: https://github.com/apache/spark/pull/37280#issuecomment-1198675997
@gengliangwang Sure, this is done.
amaliujia commented on PR #37287:
URL: https://github.com/apache/spark/pull/37287#issuecomment-1198716223
Is the issue that `listTables()` does not respect the current catalog fixed in this PR?
dongjoon-hyun commented on code in PR #37335:
URL: https://github.com/apache/spark/pull/37335#discussion_r932774765
##
python/pyspark/sql/tests/test_dataframe.py:
##
@@ -87,6 +87,21 @@ def test_help_command(self):
pydoc.render_doc(df.foo)
cfmcgrady commented on code in PR #37334:
URL: https://github.com/apache/spark/pull/37334#discussion_r932804420
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:
##
@@ -559,6 +559,17 @@ object RemoveRedundantAliases extends
Yikun commented on code in PR #37305:
URL: https://github.com/apache/spark/pull/37305#discussion_r932810013
##
python/pyspark/ml/feature.py:
##
@@ -968,7 +968,7 @@ class _CountVectorizerParams(JavaParams, HasInputCol,
HasOutputCol):
def __init__(self, *args: Any):
Yikun commented on PR #37328:
URL: https://github.com/apache/spark/pull/37328#issuecomment-1198820043
otherwise LGTM! Thanks
HyukjinKwon commented on PR #37258:
URL: https://github.com/apache/spark/pull/37258#issuecomment-1198843792
Let me close this one. I believe all are fixed now.
HyukjinKwon closed pull request #37258: [DO-NOT-MERGE] trigger CI
URL: https://github.com/apache/spark/pull/37258
zhengruifeng commented on code in PR #37304:
URL: https://github.com/apache/spark/pull/37304#discussion_r932840669
##
python/pyspark/context.py:
##
@@ -309,10 +309,7 @@ def _do_init(
if sys.version_info[:2] < (3, 8):
with warnings.catch_warnings():
MaxGekk commented on PR #37322:
URL: https://github.com/apache/spark/pull/37322#issuecomment-1198870290
@anchovYu @cloud-fan @HyukjinKwon @gengliangwang Could you review this PR, please?
amaliujia commented on PR #37287:
URL: https://github.com/apache/spark/pull/37287#issuecomment-1198905411
> > Is the issue that `listTables()` does not respect the current catalog fixed in this PR?
>
> I think so, by always passing the fully qualified name to `getTable` in
`listTables`. We can add tests
gengliangwang closed pull request #37280: [SPARK-39862][SQL] Fix two bugs in
existence DEFAULT value lookups
URL: https://github.com/apache/spark/pull/37280
gengliangwang commented on PR #37280:
URL: https://github.com/apache/spark/pull/37280#issuecomment-1198710296
Thanks, merging to master
huaxingao commented on PR #37332:
URL: https://github.com/apache/spark/pull/37332#issuecomment-1198735772
The GA failure doesn't seem relevant.
HyukjinKwon commented on code in PR #37335:
URL: https://github.com/apache/spark/pull/37335#discussion_r932808436
##
python/pyspark/sql/dataframe.py:
##
@@ -3244,10 +3244,14 @@ def drop(self, *cols: "ColumnOrName") -> "DataFrame":
# type: ignore[misc]
else:
HyukjinKwon commented on PR #37326:
URL: https://github.com/apache/spark/pull/37326#issuecomment-1198807963
Merged to master.
deshanxiao commented on PR #37336:
URL: https://github.com/apache/spark/pull/37336#issuecomment-1198839786
CC @gengliangwang @dongjoon-hyun @cloud-fan
beliefer commented on code in PR #37320:
URL: https://github.com/apache/spark/pull/37320#discussion_r932843706
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala:
##
@@ -545,6 +560,9 @@ case class ScanBuilderHolder(
var
huaxingao commented on PR #37332:
URL: https://github.com/apache/spark/pull/37332#issuecomment-1198736391
@cloud-fan Could you please take a look when you have time? Thanks!
HyukjinKwon closed pull request #37326: [SPARK-39906][INFRA] Eliminate build
warnings - 'sbt 0.13 shell syntax is deprecated; use slash syntax instead'
URL: https://github.com/apache/spark/pull/37326
HyukjinKwon commented on PR #37328:
URL: https://github.com/apache/spark/pull/37328#issuecomment-1198808336
cc @itholic @xinrong-meng @ueshin FYI
Yikun commented on PR #37305:
URL: https://github.com/apache/spark/pull/37305#issuecomment-1198817543
and CI failed due to a git clone networking issue in [Run / Scala 2.13 build with SBT](https://github.com/grundprinzip/spark/runs/7546678501?check_suite_focus=true); I think we can pass it by
sadikovi commented on PR #37327:
URL: https://github.com/apache/spark/pull/37327#issuecomment-1198896103
Yes, that was my thinking too. Okay, I will make a few changes to the PR.
Yikun commented on code in PR #37328:
URL: https://github.com/apache/spark/pull/37328#discussion_r932814726
##
python/pyspark/pandas/series.py:
##
@@ -6322,13 +6322,21 @@ def argmax(self, axis: Axis = None, skipna: bool =
True) -> int:
# If the maximum is achieved
ulysses-you commented on PR #36253:
URL: https://github.com/apache/spark/pull/36253#issuecomment-1198822779
cc @cloud-fan @huaxingao if you have time to take a look, thank you
c21 commented on code in PR #37290:
URL: https://github.com/apache/spark/pull/37290#discussion_r932846383
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/V1Writes.scala:
##
@@ -117,20 +117,26 @@ object V1WritesUtils {
outputColumns: Seq[Attribute],
sadikovi commented on PR #37327:
URL: https://github.com/apache/spark/pull/37327#issuecomment-1198856750
Should we keep the requirement that `inferDate = true` needs `inferSchema = true`? I think it is unclear right now.
c21 commented on code in PR #37264:
URL: https://github.com/apache/spark/pull/37264#discussion_r932857471
##
sql/core/src/test/scala/org/apache/spark/sql/DataFrameAsSchemaSuite.scala:
##
@@ -46,15 +46,11 @@ class DataFrameAsSchemaSuite extends QueryTest with
SharedSparkSession
c21 commented on PR #37264:
URL: https://github.com/apache/spark/pull/37264#issuecomment-1198868034
The PR is ready for review again, thanks @cloud-fan.
viirya commented on code in PR #37290:
URL: https://github.com/apache/spark/pull/37290#discussion_r932864145
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala:
##
@@ -107,8 +108,10 @@ object FileFormatWriter extends Logging {
MaxGekk closed pull request #37322: [SPARK-39905][SQL][TESTS] Remove
`checkErrorClass()` and use `checkError()` instead
URL: https://github.com/apache/spark/pull/37322
MaxGekk commented on PR #37322:
URL: https://github.com/apache/spark/pull/37322#issuecomment-1198880973
Merging to master. Thank you, @gengliangwang for review.
dongjoon-hyun commented on code in PR #37287:
URL: https://github.com/apache/spark/pull/37287#discussion_r932517832
##
sql/core/src/main/scala/org/apache/spark/sql/catalog/Catalog.scala:
##
@@ -33,36 +33,37 @@ import org.apache.spark.storage.StorageLevel
abstract class Catalog
ueshin commented on code in PR #35391:
URL: https://github.com/apache/spark/pull/35391#discussion_r932566706
##
python/pyspark/sql/tests/test_dataframe.py:
##
@@ -953,6 +953,30 @@ def test_to_pandas_from_mixed_dataframe(self):
pdf_with_only_nulls =
santosh-d3vpl3x closed pull request #37335: SPARK-39895 pyspark support
multiple column drop
URL: https://github.com/apache/spark/pull/37335
cabral1888 commented on code in PR #37230:
URL: https://github.com/apache/spark/pull/37230#discussion_r932418981
##
sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala:
##
@@ -1611,4 +1611,26 @@ class StatisticsSuite extends
StatisticsCollectionTestBase
MaxGekk commented on code in PR #37322:
URL: https://github.com/apache/spark/pull/37322#discussion_r932495675
##
sql/core/src/test/scala/org/apache/spark/sql/DatasetUnpivotSuite.scala:
##
@@ -305,14 +305,17 @@ class DatasetUnpivotSuite extends QueryTest
Jonathancui123 commented on code in PR #37327:
URL: https://github.com/apache/spark/pull/37327#discussion_r932486986
##
docs/sql-data-sources-csv.md:
##
@@ -109,7 +109,7 @@ Data source options of CSV can be set via:
read
-inferDate
+preferDate
false
MaxGekk commented on code in PR #37322:
URL: https://github.com/apache/spark/pull/37322#discussion_r932506853
##
sql/core/src/test/scala/org/apache/spark/sql/DatasetUnpivotSuite.scala:
##
@@ -305,14 +305,17 @@ class DatasetUnpivotSuite extends QueryTest
otterc commented on PR #35906:
URL: https://github.com/apache/spark/pull/35906#issuecomment-1198491146
> Should be easy to add. We can have a feature flag, and when initiating the RemoteBlockPushResolver, the db can be set to null if this feature flag is turned off, and all the later DB
sunchao commented on code in PR #36995:
URL: https://github.com/apache/spark/pull/36995#discussion_r932515356
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DistributionAndOrderingUtils.scala:
##
@@ -17,22 +17,33 @@
package
EnricoMi commented on PR #37304:
URL: https://github.com/apache/spark/pull/37304#issuecomment-1198506801
> btw, you may also need to run `dev/reformat-python`
Why do I have to reformat `python/pyspark/context.py`? That seems unrelated.
santosh-d3vpl3x closed pull request #37333: SPARK-39895 pyspark support
multiple column drop
URL: https://github.com/apache/spark/pull/37333
sadikovi commented on code in PR #37327:
URL: https://github.com/apache/spark/pull/37327#discussion_r932712356
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala:
##
@@ -153,19 +153,24 @@ class CSVOptions(
* Disabled by default for backwards
gengliangwang closed pull request #37311: [SPARK-39865][SQL][3.3] Show proper
error messages on the overflow errors of table insert
URL: https://github.com/apache/spark/pull/37311
peter-toth opened a new pull request, #37334:
URL: https://github.com/apache/spark/pull/37334
### What changes were proposed in this pull request?
Keep the output attributes of a `Union` node's first child in the
`RemoveRedundantAliases` rule to avoid correctness issues.
### Why
peter-toth commented on PR #37319:
URL: https://github.com/apache/spark/pull/37319#issuecomment-1198525757
I've opened a PR with my proposal here:
https://github.com/apache/spark/pull/37334
santosh-d3vpl3x opened a new pull request, #37335:
URL: https://github.com/apache/spark/pull/37335
* SPARK-39895 pyspark support multiple column drop
### What changes were proposed in this pull request?
Fixes issues related to type confirmation in the PySpark API.
### Why are the
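A hypothetical sketch of what SPARK-39895 asks for (several `Column` objects in one `drop` call; several string names already work):
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, 2, 3)], ["a", "b", "c"])

df.drop("a", "b")    # supported today: multiple names
df.drop(df.a, df.b)  # proposed: multiple Column objects in one call
```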
gengliangwang commented on PR #37280:
URL: https://github.com/apache/spark/pull/37280#issuecomment-1198601705
@dtenedor could you also update the PR description about the ORC fix?
ulysses-you commented on PR #37290:
URL: https://github.com/apache/spark/pull/37290#issuecomment-1197941930
cc @viirya @cloud-fan @c21
cfmcgrady commented on PR #37319:
URL: https://github.com/apache/spark/pull/37319#issuecomment-1197968855
hi @peter-toth, thank you for your feedback.
While these changes to `RemoveRedundantAliases` solve this issue, they break the guarantee that `alias removal should not break after push