AmplabJenkins commented on PR #36548:
URL: https://github.com/apache/spark/pull/36548#issuecomment-1126812505
Can one of the admins verify this patch?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
huaxingao commented on PR #36521:
URL: https://github.com/apache/spark/pull/36521#issuecomment-1126826707
Merged to master. Thanks!
@beliefer I can't merge to 3.3 because there are conflicts. Could you please
backport to 3.3?
--
AmplabJenkins commented on PR #36544:
URL: https://github.com/apache/spark/pull/36544#issuecomment-1126831264
Can one of the admins verify this patch?
--
AmplabJenkins commented on PR #36545:
URL: https://github.com/apache/spark/pull/36545#issuecomment-1126831255
Can one of the admins verify this patch?
--
HyukjinKwon commented on PR #36501:
URL: https://github.com/apache/spark/pull/36501#issuecomment-1126833206
Test results are in https://github.com/dtenedor/spark/runs/6433699950. It seems
the sync failed for some reason.
--
HyukjinKwon commented on PR #36549:
URL: https://github.com/apache/spark/pull/36549#issuecomment-1126832926
Merged to master, branch-3.3 and branch-3.2.
--
zhengruifeng commented on PR #36554:
URL: https://github.com/apache/spark/pull/36554#issuecomment-1126864262
cc @HyukjinKwon
--
github-actions[bot] commented on PR #35357:
URL: https://github.com/apache/spark/pull/35357#issuecomment-1126832112
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
HyukjinKwon commented on PR #36267:
URL: https://github.com/apache/spark/pull/36267#issuecomment-1126832061
Merged to master and branch-3.3.
--
HyukjinKwon closed pull request #36267: [SPARK-38953][PYTHON][DOC] Document
PySpark common exceptions / errors
URL: https://github.com/apache/spark/pull/36267
--
HyukjinKwon closed pull request #36546: [SPARK-37544][SQL] Correct date
arithmetic in sequences
URL: https://github.com/apache/spark/pull/36546
--
huaxingao closed pull request #36521: [SPARK-39162][SQL] Jdbc dialect should
decide which function could be pushed down
URL: https://github.com/apache/spark/pull/36521
--
MaxGekk opened a new pull request, #36553:
URL: https://github.com/apache/spark/pull/36553
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How
HyukjinKwon commented on PR #36546:
URL: https://github.com/apache/spark/pull/36546#issuecomment-1126832443
@bersprockets, it has a conflict with branch-3.1. Please create a PR to
backport if you think it should be backported :-).
--
HyukjinKwon commented on PR #36546:
URL: https://github.com/apache/spark/pull/36546#issuecomment-1126832378
Merged to master, branch-3.3 and branch-3.2.
--
Yikun commented on code in PR #36464:
URL: https://github.com/apache/spark/pull/36464#discussion_r873100372
##
python/pyspark/pandas/groupby.py:
##
@@ -2110,22 +2110,79 @@ def _limit(self, n: int, asc: bool) -> FrameLike:
groupkey_scols =
zhengruifeng commented on PR #36554:
URL: https://github.com/apache/spark/pull/36554#issuecomment-1126847198
before this PR:
```
In [2]: pdf = pd.DataFrame(
...: {
...: "A": [1, 1, 1, 1, 1],
...: "B": [1.0,
```
zhengruifeng opened a new pull request, #36554:
URL: https://github.com/apache/spark/pull/36554
### What changes were proposed in this pull request?
Improve the numerical stability of skewness for cases with small `m2` and
`m3`
### Why are the changes needed?
the
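The instability the PR description refers to shows up when `m2` (the centered second-moment sum) is tiny: the ratio `m3 / m2^1.5` divides two near-zero quantities. A minimal Python sketch of one possible guard — the function name, the tolerance test, and the NaN fallback are illustrative assumptions, not the PR's actual fix:

```python
import math

def skewness_with_guard(xs, rel_tol=1e-12):
    # Population skewness, sqrt(n) * m3 / m2^(3/2), with a guard for
    # near-constant inputs where m2 underflows and the ratio is 0/0.
    # The guard and its tolerance are illustrative assumptions.
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs)
    m3 = sum((x - mean) ** 3 for x in xs)
    if m2 <= rel_tol * max(abs(x) for x in xs) ** 2 * n:
        return float("nan")  # treat the input as effectively constant
    return math.sqrt(n) * m3 / m2 ** 1.5
```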
HyukjinKwon closed pull request #36549: [SPARK-39186][PYTHON] Make
pandas-on-Spark's skew consistent with pandas
URL: https://github.com/apache/spark/pull/36549
--
AmplabJenkins commented on PR #36540:
URL: https://github.com/apache/spark/pull/36540#issuecomment-1126840371
Can one of the admins verify this patch?
--
HyukjinKwon commented on code in PR #36545:
URL: https://github.com/apache/spark/pull/36545#discussion_r872942079
##
python/pyspark/sql/tests/test_types.py:
##
@@ -285,6 +285,64 @@ def test_infer_nested_dict_as_struct(self):
df = self.spark.createDataFrame(data)
zhengruifeng commented on PR #36549:
URL: https://github.com/apache/spark/pull/36549#issuecomment-1126676744
latest master:
```
pdf = pd.DataFrame(
{
"A": [1, -2, np.nan, -4, 5],
"B": [1.0, -2, np.nan, -4, 5],
"C": [-6.0, -7, -8, np.nan,
```
zhengruifeng opened a new pull request, #36549:
URL: https://github.com/apache/spark/pull/36549
### What changes were proposed in this pull request?
the logic for computing skewness differs between Spark SQL and pandas:
spark sql: [`sqrt(n) * m3 / sqrt(m2 * m2 *
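The comparison above can be made concrete: Spark SQL's `skewness` computes the population coefficient `sqrt(n) * m3 / m2^1.5` over the centered power sums, while pandas' `skew` additionally applies the sample bias correction `sqrt(n*(n-1)) / (n-2)`. A small sketch in plain Python (no Spark or pandas required; function names are mine):

```python
import math

def spark_style_skew(xs):
    # Spark SQL `skewness`: population coefficient over centered power sums.
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs)
    m3 = sum((x - mean) ** 3 for x in xs)
    return math.sqrt(n) * m3 / m2 ** 1.5

def pandas_style_skew(xs):
    # pandas `skew`: adjusted Fisher-Pearson coefficient, i.e. the population
    # value rescaled by the sample bias correction sqrt(n*(n-1)) / (n - 2).
    n = len(xs)
    return spark_style_skew(xs) * math.sqrt(n * (n - 1)) / (n - 2)
```

For `[1, 2, 3, 4, 10]` the two disagree (roughly 1.138 vs 1.697), which is the kind of mismatch this PR reconciles.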
HyukjinKwon commented on code in PR #36501:
URL: https://github.com/apache/spark/pull/36501#discussion_r872943310
##
sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala:
##
@@ -511,6 +511,30 @@ case class StructType(fields: Array[StructField]) extends
bjornjorgensen commented on PR #36547:
URL: https://github.com/apache/spark/pull/36547#issuecomment-1126676101
[all](https://docs.python.org/3/library/functions.html#all) is a built-in
function in Python. Can we rename this to `all_to_skip()`?
--
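The concern is concrete: a module-level name `all` shadows the builtin, so any later call such as `all(x > 0 for x in xs)` in that scope resolves to the wrong function. A minimal reproduction (the helper body is a placeholder, not the PR's code):

```python
def all():  # placeholder helper that shadows the builtin `all`
    return ["test_a", "test_b"]

def count_positive_checks():
    # This call now resolves to the zero-argument helper above, not the
    # builtin, and raises TypeError on the generator-expression argument.
    try:
        return all(x > 0 for x in [1, 2, 3])
    except TypeError:
        return "builtin `all` is shadowed"
```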
Yikun commented on code in PR #36464:
URL: https://github.com/apache/spark/pull/36464#discussion_r872956648
##
python/pyspark/pandas/groupby.py:
##
@@ -2110,22 +2110,79 @@ def _limit(self, n: int, asc: bool) -> FrameLike:
groupkey_scols =
abhishekd0907 commented on PR #35683:
URL: https://github.com/apache/spark/pull/35683#issuecomment-1126682797
@mridulm @attilapiros I have addressed all your comments. Could you please
review the PR again?
--
MaxGekk opened a new pull request, #36550:
URL: https://github.com/apache/spark/pull/36550
### What changes were proposed in this pull request?
Remove `SparkIllegalStateException` and replace it by
`IllegalStateException` where it was used.
### Why are the changes needed?
To
MaxGekk commented on code in PR #36550:
URL: https://github.com/apache/spark/pull/36550#discussion_r872960720
##
sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala:
##
@@ -272,15 +271,6 @@ class QueryExecutionErrorsSuite
}
}
-
HyukjinKwon commented on PR #36545:
URL: https://github.com/apache/spark/pull/36545#issuecomment-1126657800
We should probably add a configuration like
`spark.sql.pyspark.legacy.inferFirstElementInArray.enabled` (feel free to pick
other names if you have other ideas).
--
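The shape of the suggested legacy switch: infer the array's element type from the first element only (old behavior) or require agreement across all elements (new behavior), gated by a flag wired to the suggested configuration. A hypothetical Python sketch — the function name, the merge rule, and the error message are all assumptions, not PySpark's actual code:

```python
def infer_array_element_type(values, legacy_first_element_only=False):
    # Hypothetical sketch of flag-gated inference; not PySpark's actual code.
    if legacy_first_element_only:
        # Legacy behavior: only the first element decides the type.
        return type(values[0]).__name__
    # New behavior: all elements must agree on a single type.
    inferred = {type(v).__name__ for v in values}
    if len(inferred) == 1:
        return inferred.pop()
    raise TypeError(f"conflicting element types: {sorted(inferred)}")
```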
beliefer commented on PR #36516:
URL: https://github.com/apache/spark/pull/36516#issuecomment-1126682083
@cloud-fan Thank you!
--
zhengruifeng commented on PR #36549:
URL: https://github.com/apache/spark/pull/36549#issuecomment-1126688194
cc @HyukjinKwon @xinrong-databricks @itholic should this be a bug-fix?
--
HyukjinKwon commented on PR #36545:
URL: https://github.com/apache/spark/pull/36545#issuecomment-1126657160
Nice PR description. Yeah, we should probably add a configuration then; please
also refer to
https://github.com/apache/spark/commit/2537fe8cbaf49070137d4b5bc39af078b306c4c8
for
beliefer commented on code in PR #36531:
URL: https://github.com/apache/spark/pull/36531#discussion_r872962001
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala:
##
@@ -2117,7 +2265,9 @@ case class Cast(
child: Expression,
dataType:
panbingkun opened a new pull request, #36551:
URL: https://github.com/apache/spark/pull/36551
### What changes were proposed in this pull request?
This change is to refactor exceptions thrown in
FixedLengthBinaryRecordReader to use error class framework.
### Why are the changes
panbingkun commented on PR #36479:
URL: https://github.com/apache/spark/pull/36479#issuecomment-1126696228
pinging @MaxGekk
--
zhengruifeng commented on code in PR #36464:
URL: https://github.com/apache/spark/pull/36464#discussion_r872972240
##
python/pyspark/pandas/groupby.py:
##
@@ -2110,22 +2110,79 @@ def _limit(self, n: int, asc: bool) -> FrameLike:
groupkey_scols =
weixiuli commented on PR #36162:
URL: https://github.com/apache/spark/pull/36162#issuecomment-1126711880
@mridulm @Ngone51 Sorry for the late reply. Please help review when you have
time. Thank you very much.
--
wangyum opened a new pull request, #36552:
URL: https://github.com/apache/spark/pull/36552
### What changes were proposed in this pull request?
1. Add a new optimizer rule(PushPartialAggregationThroughJoin) to push the
partial aggregation through join. It supports the following
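The rule rests on an algebraic identity: a SUM over an equi-join equals, per join key, the left side's partial sum times the right side's match count. A toy Python illustration of that identity (the actual rule also handles grouping, other aggregates, and applicability conditions not shown here):

```python
from collections import defaultdict

# SUM(a.x) over A JOIN B ON a.k = b.k, two ways.
A = [("k1", 10), ("k1", 20), ("k2", 5)]
B = [("k1",), ("k1",), ("k2",)]

# Direct plan: join first, then aggregate.
direct = sum(x for (ka, x) in A for (kb,) in B if ka == kb)

# Pushed plan: partially aggregate each side, then combine per key.
sums, counts = defaultdict(int), defaultdict(int)
for k, x in A:
    sums[k] += x
for (k,) in B:
    counts[k] += 1
pushed = sum(sums[k] * counts[k] for k in sums)

assert direct == pushed  # 10+10+20+20+5 = 65 on both paths
```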
beliefer commented on code in PR #36541:
URL: https://github.com/apache/spark/pull/36541#discussion_r872964095
##
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala:
##
@@ -82,52 +82,45 @@ abstract class SparkStrategies extends
QueryPlanner[SparkPlan]
beliefer commented on code in PR #36541:
URL: https://github.com/apache/spark/pull/36541#discussion_r872964108
##
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala:
##
@@ -82,52 +82,45 @@ abstract class SparkStrategies extends
QueryPlanner[SparkPlan]
HyukjinKwon commented on PR #36534:
URL: https://github.com/apache/spark/pull/36534#issuecomment-1126689562
Merged to master, branch-3.3, branch-3.2 and branch-3.1.
--
HyukjinKwon closed pull request #36534: [SPARK-39174][SQL] Catalogs loading
swallows missing classname for ClassNotFoundException
URL: https://github.com/apache/spark/pull/36534
--
MaxGekk commented on code in PR #36546:
URL: https://github.com/apache/spark/pull/36546#discussion_r873062601
##
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala:
##
@@ -964,6 +964,50 @@ class CollectionExpressionsSuite
AmplabJenkins commented on PR #36551:
URL: https://github.com/apache/spark/pull/36551#issuecomment-1126795684
Can one of the admins verify this patch?
--
wangyum commented on PR #36552:
URL: https://github.com/apache/spark/pull/36552#issuecomment-1126736623
Part of the TPC-DS q24a query plan.
Before this PR | After this PR
-- | --
MaxGekk commented on code in PR #36479:
URL: https://github.com/apache/spark/pull/36479#discussion_r873053587
##
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala:
##
@@ -147,14 +147,17 @@ object QueryCompilationErrors extends QueryErrorsBase