MaxGekk opened a new pull request, #38027:
URL: https://github.com/apache/spark/pull/38027
### What changes were proposed in this pull request?
In the PR, I propose to migrate 100 compilation errors to temporary error
classes with the prefix `_LEGACY_ERROR_TEMP_12xx`. The error message
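The migration above maps each hard-coded error onto a named error class with a message template. A minimal pure-Python sketch of that idea, with hypothetical class numbers and template text (the real templates live in Spark's error-classes JSON, not here):

```python
# Sketch: temporary error classes mapped to message templates.
# Class names follow the _LEGACY_ERROR_TEMP_12xx prefix from the PR;
# the template text below is hypothetical, not Spark's actual wording.
ERROR_CLASSES = {
    "_LEGACY_ERROR_TEMP_1200": "Cannot resolve column <colName>.",
    "_LEGACY_ERROR_TEMP_1201": "Table <tableName> already exists.",
}

def error_message(error_class: str, **params: str) -> str:
    """Render an error message by substituting <param> placeholders."""
    template = ERROR_CLASSES[error_class]
    for name, value in params.items():
        template = template.replace(f"<{name}>", value)
    return template
```

Centralizing messages this way lets tests assert on the error class instead of the exact message text.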
LuciferYang opened a new pull request, #38028:
URL: https://github.com/apache/spark/pull/38028
### What changes were proposed in this pull request?
https://github.com/apache/spark/pull/37894 changed the preconditions for the
following two tests from
cloud-fan commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r982063973
##
connect/src/main/scala/org/apache/spark/sql/catalyst/connect/connect.scala:
##
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
itholic commented on code in PR #38015:
URL: https://github.com/apache/spark/pull/38015#discussion_r982276182
##
python/pyspark/pandas/indexes/base.py:
##
@@ -1907,6 +1908,9 @@ def append(self, other: "Index") -> "Index":
)
index_fields =
mridulm commented on PR #38030:
URL: https://github.com/apache/spark/pull/38030#issuecomment-1260850742
How/where are we populating the message ?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to
srowen commented on code in PR #38024:
URL: https://github.com/apache/spark/pull/38024#discussion_r982356825
##
core/src/main/scala/org/apache/spark/internal/config/package.scala:
##
@@ -1078,6 +1078,13 @@ package object config {
.booleanConf
.createWithDefault(false)
AngersZh commented on PR #35799:
URL: https://github.com/apache/spark/pull/35799#issuecomment-1260559187
@cloud-fan @dongjoon-hyun How about this one?
LuciferYang commented on PR #38028:
URL: https://github.com/apache/spark/pull/38028#issuecomment-1260522666
cc @HeartSaVioR and @HyukjinKwon
cloud-fan opened a new pull request, #38029:
URL: https://github.com/apache/spark/pull/38029
### What changes were proposed in this pull request?
In `CheckAnalysis`, we inline CTE relations first and then check the plan.
This causes an issue if the CTE relation is not used,
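The fix amounts to validating every CTE definition before inlining, so an invalid but unused CTE still surfaces an analysis error. A conceptual pure-Python sketch with a hypothetical plan representation (not Spark's actual `CheckAnalysis` code):

```python
# Conceptual sketch: validate every CTE definition first, then inline
# only the ones the main query references. The dict/list "plan" shape
# here is hypothetical, not Spark's logical plan.
def check_plan(cte_defs, main_query, validate):
    for name, body in cte_defs.items():
        validate(body)  # check even CTEs the main query never uses
    # inline only the CTEs that are actually referenced
    return [cte_defs.get(ref, ref) for ref in main_query]
```

Checking before inlining is the key ordering: inlining first would silently drop the unused (and possibly broken) definitions.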
cloud-fan commented on PR #38029:
URL: https://github.com/apache/spark/pull/38029#issuecomment-1260569657
cc @MaxGekk @srielau
EnricoMi commented on code in PR #37407:
URL: https://github.com/apache/spark/pull/37407#discussion_r982215819
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##
@@ -1098,6 +1106,87 @@ class AstBuilder extends
EnricoMi commented on code in PR #37407:
URL: https://github.com/apache/spark/pull/37407#discussion_r982352630
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -869,26 +869,50 @@ class Analyzer(override val catalogManager:
amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r982038430
##
connect/src/main/scala/org/apache/spark/sql/catalyst/connect/connect.scala:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
cloud-fan commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r982067988
##
connect/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation
zhengruifeng closed pull request #38026: [SPARK-40592][PS] Implement
`min_count` in `GroupBy.max`
URL: https://github.com/apache/spark/pull/38026
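The `min_count` parameter follows pandas semantics: a group with fewer than `min_count` non-null values yields a missing result instead of a max. A minimal pure-Python sketch of that behavior (illustrative only, not the pandas-on-Spark implementation):

```python
# Sketch of pandas-style min_count semantics for a grouped max:
# if a group has fewer than min_count non-null values, return None.
def group_max(values, min_count=-1):
    non_null = [v for v in values if v is not None]
    if min_count > 0 and len(non_null) < min_count:
        return None
    return max(non_null) if non_null else None
```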
zhengruifeng commented on PR #38026:
URL: https://github.com/apache/spark/pull/38026#issuecomment-1260564475
Merged into master, thanks @HyukjinKwon for reviews
itholic commented on code in PR #38031:
URL: https://github.com/apache/spark/pull/38031#discussion_r982260594
##
python/pyspark/pandas/tests/test_dataframe.py:
##
@@ -6076,7 +6076,13 @@ def test_corrwith(self):
def _test_corrwith(self, psdf, psobj):
pdf =
itholic opened a new pull request, #38033:
URL: https://github.com/apache/spark/pull/38033
### What changes were proposed in this pull request?
This PR proposes to fix the plotting functions to work properly with pandas
1.5.0.
This includes two fixes:
- Fix the
cloud-fan commented on code in PR #38004:
URL: https://github.com/apache/spark/pull/38004#discussion_r982037071
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/LogicalWriteInfo.java:
##
@@ -45,4 +45,18 @@ public interface LogicalWriteInfo {
* the schema
cloud-fan commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r982067325
##
connect/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation
mridulm commented on PR #38030:
URL: https://github.com/apache/spark/pull/38030#issuecomment-1260855604
Ok, so this is mainly to propagate it in `SparkListenerExecutorRemoved`; sounds
reasonable.
+CC @dongjoon-hyun
yaooqinn commented on code in PR #38024:
URL: https://github.com/apache/spark/pull/38024#discussion_r982003775
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FilePartitionReader.scala:
##
@@ -36,8 +36,15 @@ class FilePartitionReader[T](
private
cloud-fan commented on code in PR #38004:
URL: https://github.com/apache/spark/pull/38004#discussion_r982048790
##
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/write/LogicalWriteInfoImpl.scala:
##
@@ -23,4 +23,6 @@ import
HeartSaVioR commented on PR #38013:
URL: https://github.com/apache/spark/pull/38013#issuecomment-1260507874
Thanks! Merging to master.
cloud-fan commented on code in PR #38023:
URL: https://github.com/apache/spark/pull/38023#discussion_r982071397
##
connect/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##
@@ -96,7 +96,9 @@ class SparkConnectPlanner(plan: proto.Relation,
LuciferYang commented on PR #37654:
URL: https://github.com/apache/spark/pull/37654#issuecomment-1260459690
Thanks @cloud-fan @sadikovi
cloud-fan commented on code in PR #38004:
URL: https://github.com/apache/spark/pull/38004#discussion_r982053631
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/DeltaWriter.java:
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
cloud-fan commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r982062517
##
connect/src/main/scala/org/apache/spark/sql/catalyst/connect/connect.scala:
##
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
AngersZh commented on PR #35594:
URL: https://github.com/apache/spark/pull/35594#issuecomment-1260558788
> @AngersZh can you rebase? We should merge this PR.
Done
LuciferYang commented on PR #37844:
URL: https://github.com/apache/spark/pull/37844#issuecomment-1260663996
Ran `mvn test` with the `Java8 + hadoop-2` profile and this PR; all tests passed.
bozhang2820 opened a new pull request, #38030:
URL: https://github.com/apache/spark/pull/38030
### What changes were proposed in this pull request?
This change populates `ExecutorDecommission` with messages in
`ExecutorDecommissionInfo`.
### Why are the changes needed?
yaooqinn opened a new pull request, #38032:
URL: https://github.com/apache/spark/pull/38032
### What changes were proposed in this pull request?
The local modes, without an explicit number-of-task-failures option specified, are
currently hard-coded to 1. The resilience in
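The semantics in question can be illustrated with a plain-Python retry loop mirroring what `spark.task.maxFailures` controls: a task is re-attempted until it succeeds or the attempt count reaches the limit (a conceptual sketch, not Spark's scheduler code):

```python
# Illustrative retry loop mirroring spark.task.maxFailures semantics:
# with max_failures=1 (the hard-coded local-mode value), a single task
# failure fails the job; a larger value allows re-attempts.
def run_with_retries(task, max_failures=1):
    last_exc = None
    for attempt in range(max_failures):
        try:
            return task(attempt)
        except Exception as exc:
            last_exc = exc
    raise last_exc
```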
HeartSaVioR closed pull request #38013: [SPARK-40509][SS][PYTHON] Add example
for applyInPandasWithState
URL: https://github.com/apache/spark/pull/38013
itholic opened a new pull request, #38031:
URL: https://github.com/apache/spark/pull/38031
### What changes were proposed in this pull request?
This PR proposes to skip the `DataFrame.corrwith` test when the `other` is
`pyspark.pandas.Series` and the `method` is
mridulm commented on PR #38024:
URL: https://github.com/apache/spark/pull/38024#issuecomment-1260859431
I am a bit confused about what this PR is trying to do.
If we want to ignore corrupt files, by definition failures will be ignored -
and tasks will be marked successful: because that
EnricoMi commented on code in PR #37407:
URL: https://github.com/apache/spark/pull/37407#discussion_r982503768
##
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4:
##
@@ -618,6 +618,46 @@ pivotValue
: expression (AS? identifier)?
;
EnricoMi commented on PR #38036:
URL: https://github.com/apache/spark/pull/38036#issuecomment-1260959605
Ideally, `EnsureRequirements` should not call into
`HashShuffleSpec.createPartitioning(clustering)` with a `clustering` that has
an incompatible cardinality.
EnricoMi commented on PR #38036:
URL: https://github.com/apache/spark/pull/38036#issuecomment-1260961186
@HyukjinKwon @cloud-fan
peter-toth opened a new pull request, #38038:
URL: https://github.com/apache/spark/pull/38038
### What changes were proposed in this pull request?
This is a WIP PR to refactor `BroadcastHashJoinExec` output partitioning
calculation.
As this PR is based on
cloud-fan opened a new pull request, #38039:
URL: https://github.com/apache/spark/pull/38039
### What changes were proposed in this pull request?
Currently, Spark swallows the error thrown by catalog implementations, and
re-throws a standard error. However, the original error
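Preserving the original error typically means chaining it as the cause of the standardized error instead of swallowing it. A hedged Python sketch with hypothetical names (Python's `raise ... from ...` plays the role of Java's exception cause):

```python
# Sketch: re-throw a standardized error while keeping the original
# catalog error attached as the cause, so it is not lost.
class NoSuchTableError(Exception):
    pass

def load_table(catalog_load, name):
    try:
        return catalog_load(name)
    except Exception as original:
        # chain the original error instead of swallowing it
        raise NoSuchTableError(f"Table {name} not found") from original
```

The caller can then inspect `__cause__` (or the printed traceback) to see what the catalog implementation actually reported.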
cloud-fan commented on PR #38039:
URL: https://github.com/apache/spark/pull/38039#issuecomment-1261135792
cc @MaxGekk @srielau @viirya
EnricoMi opened a new pull request, #38036:
URL: https://github.com/apache/spark/pull/38036
Cogrouping two grouped DataFrames in PySpark that have different group key
cardinalities raises an error that is not very descriptive:
```
py4j.protocol.Py4JJavaError: An error occurred
```
mridulm commented on PR #37779:
URL: https://github.com/apache/spark/pull/37779#issuecomment-1260865571
Can we also add the example code you had to reproduce the issue as a test ?
cloud-fan commented on code in PR #37407:
URL: https://github.com/apache/spark/pull/37407#discussion_r982381621
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##
@@ -1098,6 +1106,87 @@ class AstBuilder extends
peter-toth opened a new pull request, #38034:
URL: https://github.com/apache/spark/pull/38034
### What changes were proposed in this pull request?
This PR introduce `TreeNode.multiTransform()` methods to be able to
recursively transform a `TreeNode` (and so a tree) into multiple
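The idea behind `multiTransform()` - a rule produces several alternatives per node, and the tree expands into every combination of them - can be sketched in pure Python with a hypothetical tuple-based tree (not Spark's `TreeNode`):

```python
from itertools import product

# Conceptual sketch of multiTransform: the rule maps a node to a list of
# alternatives (or None to leave it alone); inner nodes combine their
# children's alternatives via a cartesian product.
def multi_transform(node, rule):
    alternatives = rule(node)
    if alternatives is not None:
        return list(alternatives)
    if isinstance(node, tuple):  # inner node: recurse into children
        child_options = [multi_transform(child, rule) for child in node]
        return [tuple(combo) for combo in product(*child_options)]
    return [node]  # untouched leaf
```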
Kimahriman commented on code in PR #37770:
URL: https://github.com/apache/spark/pull/37770#discussion_r982400522
##
sql/core/src/test/scala/org/apache/spark/sql/GeneratorFunctionSuite.scala:
##
@@ -219,20 +219,21 @@ class GeneratorFunctionSuite extends QueryTest with
peter-toth opened a new pull request, #38035:
URL: https://github.com/apache/spark/pull/38035
### What changes were proposed in this pull request?
This is a WIP PR to improve constraint generation with the help of
`TreeNode.multiTransform()`.
As this PR is based on
grundprinzip opened a new pull request, #38037:
URL: https://github.com/apache/spark/pull/38037
### What changes were proposed in this pull request?
This patch adds the missing type annotations for the Spark Connect Python
client and re-enables the mypy checks. In addition, the
EnricoMi commented on code in PR #37407:
URL: https://github.com/apache/spark/pull/37407#discussion_r982517306
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -869,26 +869,50 @@ class Analyzer(override val catalogManager:
grundprinzip commented on code in PR #38023:
URL: https://github.com/apache/spark/pull/38023#discussion_r982536156
##
connect/src/main/protobuf/spark/connect/expressions.proto:
##
@@ -155,4 +156,7 @@ message Expression {
string expression = 1;
}
+ // represent *
cloud-fan commented on code in PR #37407:
URL: https://github.com/apache/spark/pull/37407#discussion_r982631012
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##
@@ -1098,6 +1106,87 @@ class AstBuilder extends
cloud-fan commented on code in PR #37407:
URL: https://github.com/apache/spark/pull/37407#discussion_r982631649
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -869,26 +869,50 @@ class Analyzer(override val catalogManager:
amaliujia commented on code in PR #38007:
URL: https://github.com/apache/spark/pull/38007#discussion_r982632002
##
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala:
##
@@ -2635,6 +2635,10 @@ class JDBCV2Suite extends QueryTest with
SharedSparkSession with
mridulm commented on PR #38032:
URL: https://github.com/apache/spark/pull/38032#issuecomment-1260862926
The reason to retry on failures is the inherent nature of distributed
computation - which does not apply in local mode.
What is the scenario where we are looking for this change to be
peter-toth commented on PR #38034:
URL: https://github.com/apache/spark/pull/38034#issuecomment-1261011468
I've opened 3 WIP PRs to demonstrate the usage of `multiTransform()`:
- A bug fix to improve AliasAwareOutputPartitioning to take all aliases into
account:
cloud-fan commented on code in PR #37407:
URL: https://github.com/apache/spark/pull/37407#discussion_r982382485
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -869,26 +869,50 @@ class Analyzer(override val catalogManager:
EnricoMi commented on code in PR #37407:
URL: https://github.com/apache/spark/pull/37407#discussion_r982515423
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##
@@ -1098,6 +1106,87 @@ class AstBuilder extends
LucaCanali commented on PR #33559:
URL: https://github.com/apache/spark/pull/33559#issuecomment-1261280991
The issue with SQLQueryTestSuite.udf/postgreSQL/udf-aggregates_part3.sql
should be fixed now.
I have also extended the instrumentation to the recently introduced
applyInPandasWithState
aokolnychyi commented on code in PR #38004:
URL: https://github.com/apache/spark/pull/38004#discussion_r982741016
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/DeltaWriter.java:
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
holdenk commented on code in PR #37885:
URL: https://github.com/apache/spark/pull/37885#discussion_r982829203
##
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala:
##
@@ -971,18 +971,30 @@ private[spark] class TaskSchedulerImpl(
}
override def
itholic commented on PR #38018:
URL: https://github.com/apache/spark/pull/38018#issuecomment-1261583694
Yeah, and more specifically, "pandas-on-Spark" is used when another
noun follows right after "pandas API on Spark".
For example,
"pandas API on Spark DataFrame is
amaliujia commented on code in PR #37994:
URL: https://github.com/apache/spark/pull/37994#discussion_r982725060
##
connect/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation
aokolnychyi commented on code in PR #38004:
URL: https://github.com/apache/spark/pull/38004#discussion_r982723941
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/LogicalWriteInfo.java:
##
@@ -45,4 +45,18 @@ public interface LogicalWriteInfo {
* the schema
sadikovi commented on code in PR #37654:
URL: https://github.com/apache/spark/pull/37654#discussion_r982881638
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala:
##
@@ -72,87 +69,9 @@ class ParquetFileFormat
job: Job,
sadikovi commented on code in PR #37654:
URL: https://github.com/apache/spark/pull/37654#discussion_r982881822
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala:
##
@@ -72,87 +69,9 @@ class ParquetFileFormat
job: Job,
amaliujia commented on code in PR #38037:
URL: https://github.com/apache/spark/pull/38037#discussion_r982943887
##
connect/dev/generate_protos.sh:
##
@@ -0,0 +1,79 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.
amaliujia commented on code in PR #38037:
URL: https://github.com/apache/spark/pull/38037#discussion_r982876885
##
python/pyspark/sql/connect/client.py:
##
@@ -21,18 +21,20 @@
import typing
import uuid
-import grpc
+import grpc # type: ignore
import pandas
import pandas
grundprinzip commented on code in PR #38037:
URL: https://github.com/apache/spark/pull/38037#discussion_r982910120
##
python/pyspark/sql/connect/client.py:
##
@@ -21,18 +21,20 @@
import typing
import uuid
-import grpc
+import grpc # type: ignore
import pandas
import
MaxGekk commented on PR #38029:
URL: https://github.com/apache/spark/pull/38029#issuecomment-1261301966
+1, LGTM. Merging to master.
Thank you, @cloud-fan and @amaliujia for review.
MaxGekk closed pull request #38029: [SPARK-40595][SQL] Improve error message
for unused CTE relations
URL: https://github.com/apache/spark/pull/38029
amaliujia commented on code in PR #38037:
URL: https://github.com/apache/spark/pull/38037#discussion_r982912553
##
python/pyspark/sql/connect/client.py:
##
@@ -21,18 +21,20 @@
import typing
import uuid
-import grpc
+import grpc # type: ignore
import pandas
import pandas
rdblue commented on PR #36304:
URL: https://github.com/apache/spark/pull/36304#issuecomment-1261555810
I talked with @aokolnychyi about this and I think this is a data source
problem, not something Spark should track right now.
The main problem is that some table sources have
allisonwang-db commented on code in PR #37641:
URL: https://github.com/apache/spark/pull/37641#discussion_r982873072
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala:
##
@@ -177,7 +171,15 @@ object FileFormatWriter extends Logging {
dongjoon-hyun commented on PR #38030:
URL: https://github.com/apache/spark/pull/38030#issuecomment-1261492253
Thank you, @mridulm .
dongjoon-hyun commented on code in PR #38030:
URL: https://github.com/apache/spark/pull/38030#discussion_r982884660
##
core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala:
##
@@ -123,7 +123,8 @@ private[spark] object MapStatus {
private[spark] class
aokolnychyi commented on code in PR #38004:
URL: https://github.com/apache/spark/pull/38004#discussion_r982731665
##
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/write/LogicalWriteInfoImpl.scala:
##
@@ -23,4 +23,6 @@ import
amaliujia commented on code in PR #38037:
URL: https://github.com/apache/spark/pull/38037#discussion_r982878444
##
python/pyspark/sql/connect/client.py:
##
@@ -21,18 +21,20 @@
import typing
import uuid
-import grpc
+import grpc # type: ignore
import pandas
import pandas
bozhang2820 commented on code in PR #38030:
URL: https://github.com/apache/spark/pull/38030#discussion_r982969089
##
core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala:
##
@@ -123,7 +123,8 @@ private[spark] object MapStatus {
private[spark] class
Ngone51 commented on code in PR #38030:
URL: https://github.com/apache/spark/pull/38030#discussion_r983009468
##
core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionIntegrationSuite.scala:
##
@@ -186,6 +186,8 @@ class BlockManagerDecommissionIntegrationSuite
srowen commented on code in PR #38024:
URL: https://github.com/apache/spark/pull/38024#discussion_r983008731
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FilePartitionReader.scala:
##
@@ -36,8 +36,15 @@ class FilePartitionReader[T](
private def
zhengruifeng commented on code in PR #37995:
URL: https://github.com/apache/spark/pull/37995#discussion_r983016797
##
python/pyspark/pandas/series.py:
##
@@ -6442,6 +6445,8 @@ def argmin(self, axis: Axis = None, skipna: bool = True)
-> int:
raise ValueError("axis
LuciferYang commented on code in PR #38041:
URL: https://github.com/apache/spark/pull/38041#discussion_r983020028
##
connect/src/test/resources/log4j2.properties:
##
@@ -0,0 +1,39 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
LuciferYang commented on PR #38041:
URL: https://github.com/apache/spark/pull/38041#issuecomment-1261689871
I'm not sure whether this should be a subtask of SPARK-39375. If
not, please help move it
wangyum commented on PR #37996:
URL: https://github.com/apache/spark/pull/37996#issuecomment-1261697253
Another advantage is that we can coalesce into fewer partitions and then
build the bloom filter, because high parallelism cannot improve the
performance of building the bloom filter. For
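The cost intuition shows up in a minimal bloom filter sketch: each per-partition filter is a fixed-size bit array, and combining them is a full bitwise OR over the whole array, so more partitions mean more full-size merges without shrinking any individual filter (illustrative only, not Spark's implementation):

```python
import hashlib

# Minimal bloom filter: k hash positions per item, one shared bit array.
# Merging two partition-local filters is a bitwise OR of their bits.
class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def merge(self, other):
        # OR-merge: touches every bit regardless of how few items
        # the other filter holds, hence the per-partition merge cost.
        self.bits |= other.bits
```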
LuciferYang commented on PR #37654:
URL: https://github.com/apache/spark/pull/37654#issuecomment-1261594105
Thanks @sadikovi ~ rebased to keep the code up to date; let's wait for GA
github-actions[bot] closed pull request #35549: [SPARK-38230][SQL]
InsertIntoHadoopFsRelationCommand unnecessarily fetches details of partitions
in most cases
URL: https://github.com/apache/spark/pull/35549
github-actions[bot] closed pull request #35548: [SPARK-38234] [SQL] [SS] Added
structured streaming monitoring APIs.
URL: https://github.com/apache/spark/pull/35548
github-actions[bot] closed pull request #35371: [WIP][SPARK-37946][SQL] Use
error classes in the execution errors related to partitions
URL: https://github.com/apache/spark/pull/35371
github-actions[bot] closed pull request #35319: [SPARK-36571][SQL] Add new
SQLPathHadoopMapReduceCommitProtocol resolve conflict when write into partition
table's different partition
URL: https://github.com/apache/spark/pull/35319
github-actions[bot] closed pull request #35337: [SPARK-37840][SQL] Dynamic
Update of UDF
URL: https://github.com/apache/spark/pull/35337
github-actions[bot] closed pull request #34903: [SPARK-37650][PYTHON] Tell
spark-env.sh the python interpreter
URL: https://github.com/apache/spark/pull/34903
github-actions[bot] closed pull request #34856: [SPARK-37602][CORE] Add config
property to set default Spark listeners
URL: https://github.com/apache/spark/pull/34856
github-actions[bot] commented on PR #34791:
URL: https://github.com/apache/spark/pull/34791#issuecomment-1261602244
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #34829:
URL: https://github.com/apache/spark/pull/34829#issuecomment-126160
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.