Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/20894
jenkins, retest this, please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20894
**[Test build #89365 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89365/testReport)**
for PR 20894 at commit
Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20629#discussion_r181547119
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala
---
@@ -64,12 +65,12 @@ class ClusteringEvaluator @Since("2.3.0")
Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20629#discussion_r181547107
--- Diff: python/pyspark/ml/clustering.py ---
@@ -322,7 +323,11 @@ def computeCost(self, dataset):
"""
Return the K-means cost
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20629
@holdenk I am not sure whether cluster centers should be required for this
metric. On one hand, since the `ClusteringEvaluator` should be a general
interface for all clustering models and some of them
Github user stoader commented on the issue:
https://github.com/apache/spark/pull/21067
@mccheah
> But whether or not the driver should be relaunchable should be determined
by the application submitter, and not necessarily done all the time. Can we
make this behavior
Github user sujith71955 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20611#discussion_r181543985
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
@@ -304,45 +304,14 @@ case class LoadDataCommand(
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20937
**[Test build #89364 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89364/testReport)**
for PR 20937 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21056
**[Test build #89366 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89366/testReport)**
for PR 21056 at commit
GitHub user mgaido91 opened a pull request:
https://github.com/apache/spark/pull/21072
[SPARK-23973][SQL] Remove consecutive Sorts
## What changes were proposed in this pull request?
In SPARK-23375 we introduced the ability to remove the `Sort` operation
during query
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21072
**[Test build #89367 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89367/testReport)**
for PR 21072 at commit
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/21072
cc @cloud-fan @henryr
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20894
**[Test build #89365 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89365/testReport)**
for PR 20894 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20894
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89365/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20894
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21072
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21072
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2329/
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20937
**[Test build #89364 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89364/testReport)**
for PR 20937 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20937
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89364/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20937
Merged build finished. Test PASSed.
---
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/20894
jenkins, retest this, please
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
Merged to branch-2.3.
Thanks for reviewing this @BryanCutler.
---
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20611#discussion_r181552961
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
@@ -304,45 +304,14 @@ case class LoadDataCommand(
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21057
ok to test
---
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21057#discussion_r181553378
--- Diff: python/pyspark/streaming/kafka.py ---
@@ -104,7 +104,7 @@ def createDirectStream(ssc, topics, kafkaParams,
fromOffsets=None,
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21056
**[Test build #89366 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89366/testReport)**
for PR 21056 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21004
(next time, let's avoid a PR description that just says "improvement")
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21070
@rdblue, BTW mind fixing the title to `[SPARK-23972][...] ...`? It's
actually written in the guide.
---
Github user HyukjinKwon closed the pull request at:
https://github.com/apache/spark/pull/21060
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21057
BTW, mind fixing the PR title to `[MINOR][PYTHON] ...` and making the title
more descriptive? Not a big deal, but it's good to match other PRs.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21057
**[Test build #89368 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89368/testReport)**
for PR 21057 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21057
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21057
**[Test build #89368 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89368/testReport)**
for PR 21057 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21057
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89368/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21056
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21056
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89366/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21072
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89367/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21072
**[Test build #89367 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89367/testReport)**
for PR 21072 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21072
Merged build finished. Test PASSed.
---
Github user henryr commented on a diff in the pull request:
https://github.com/apache/spark/pull/21072#discussion_r181563918
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -736,12 +736,15 @@ object EliminateSorts extends
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21007#discussion_r181569225
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -3189,10 +3189,10 @@ class Dataset[T] private[sql](
private[sql]
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/21070#discussion_r181563672
--- Diff: pom.xml ---
@@ -129,7 +129,7 @@
1.2.1
10.12.1.1
-1.8.2
+1.10.0
--- End diff --
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/21070#discussion_r181564129
--- Diff: pom.xml ---
@@ -129,7 +129,7 @@
1.2.1
10.12.1.1
-1.8.2
+1.10.0
--- End diff --
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/18378#discussion_r181567408
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1750,6 +1761,24 @@ def _to_scala_map(sc, jm):
return sc._jvm.PythonUtils.toScalaMap(jm)
Github user edlee123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18378#discussion_r181565606
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1750,6 +1761,24 @@ def _to_scala_map(sc, jm):
return sc._jvm.PythonUtils.toScalaMap(jm)
Github user edlee123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18378#discussion_r181567770
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1750,6 +1761,24 @@ def _to_scala_map(sc, jm):
return sc._jvm.PythonUtils.toScalaMap(jm)
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21073
**[Test build #89369 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89369/testReport)**
for PR 21073 at commit
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/21057#discussion_r181568263
--- Diff: python/pyspark/streaming/listener.py ---
@@ -22,6 +22,10 @@ class StreamingListener(object):
def __init__(self):
pass
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/21057
I think you may create a minor JIRA ticket for this.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21007
@HyukjinKwon @BryanCutler @viirya @felixcheung The first sentence of this
PR really scares me. After reading the PR description. Since the PR description
will be part of our change log. Please
GitHub user bersprockets opened a pull request:
https://github.com/apache/spark/pull/21073
[SPARK-23936][SQL][WIP] Implement map_concat
## What changes were proposed in this pull request?
Implement the map_concat higher-order function.
This is a work in progress.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
Since this is not a bug fix, I plan to revert this PR. WDYT? @HyukjinKwon
@BryanCutler
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21073
Can one of the admins verify this patch?
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21057
+1 for ^.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
I guess the behaviour change here is that a custom query execution
listener can now recognise the action `collect` in PySpark, which other APIs
have detected. Mind explaining how it breaks
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
I agree that it's better to avoid a behaviour change, but this one is
clearly a bug and the fix is straightforward. I am puzzled why this
specifically prompted you. I wouldn't revert if
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
I am a bit puzzled because `QueryExecutionListener` should call the
callback for actions and `collect` triggers it in Scala and R but it doesn't in
PySpark specifically. It sounds like a bug and
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
This is just the basic backport rule we follow for each PR. We should not
make an exception for this PR.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
How about we formally document that in the guide?
I have always put more importance on practice and I personally
think we are fine to make a backport if it's a bug and the fix
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
hm I would say it's a bug since the action is not detected which is
supposed to call the callback. The test is a bit complicated but the fix is
relatively straightforward.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
This will introduce a behavior change, and it is not a regression. The
changes we made in this PR could break external apps. We should not do this in
a maintenance release.
---
Github user echarles commented on the issue:
https://github.com/apache/spark/pull/20451
Now that #20910 has been merged, I will update this PR to take into account the
refactoring. @inetfuture Once these changes are pushed, there is the review
process which needs to occur, so difficult
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
If this can be treated as a bug to backport, we have many behavior change
PRs that can be backported. We are building system software. We have to be
more principled.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21007
What's wrong with the description and PR title, and what should be documented? Do you
mean the first sentence `This PR proposes to add collect to a query executor as
an action.` is wrong because this
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
User apps should not be blamed in this case. If they want this change,
they should upgrade to the newer release. Basically, we should not introduce
any external behavior change in the
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
Yup, that should reduce some overhead like this. I would like to hear
what you guys think cc @srowenn, @vanzin, @felixcheung, @holdenk too.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
I do think we should clearly document the rule for what we can backport.
I do not think we should make an exception for this PR. cc @rxin @marmbrus
@yhuai @cloud-fan @ueshin
---