Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2806#issuecomment-59600547
[QA tests have
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/398/consoleFull)
for PR 2806 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2808#issuecomment-59600543
[QA tests have
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/397/consoleFull)
for PR 2808 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2806#issuecomment-59601469
[QA tests have
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/398/consoleFull)
for PR 2806 at commit
Github user zsxwing commented on the pull request:
https://github.com/apache/spark/pull/2732#issuecomment-59601473
Jenkins, test this please.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2808#issuecomment-59601740
[QA tests have
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/397/consoleFull)
for PR 2808 at commit
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2828#issuecomment-59602394
**tl;dr**: _this patch looks pretty good to me based on the testing that
I've done so far. For my own interest / fun, I'd like to find a way to extend
my test
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/2780#discussion_r19052549
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -1011,4 +1014,99 @@ object DecisionTree extends Serializable with
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/2780#discussion_r19052547
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -912,16 +913,18 @@ object DecisionTree extends Serializable with
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/2780#discussion_r19052551
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -1011,4 +1014,99 @@ object DecisionTree extends Serializable with
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/2780#discussion_r19052554
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -1011,4 +1014,99 @@ object DecisionTree extends Serializable with
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/2780#discussion_r19052556
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -1011,4 +1014,99 @@ object DecisionTree extends Serializable with
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/2780#discussion_r19052544
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -36,6 +36,7 @@ import org.apache.spark.mllib.tree.model._
import
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/2780#discussion_r19052548
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -912,16 +913,18 @@ object DecisionTree extends Serializable with
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/2780#discussion_r19052546
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -912,16 +913,18 @@ object DecisionTree extends Serializable with
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/2780#discussion_r19052555
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -1011,4 +1014,99 @@ object DecisionTree extends Serializable with
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/2780#discussion_r19052550
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -1011,4 +1014,99 @@ object DecisionTree extends Serializable with
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/2780#discussion_r19052553
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -1011,4 +1014,99 @@ object DecisionTree extends Serializable with
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/2780#issuecomment-59603338
@chouqin Thanks for this PR! This method should be a real improvement. I
added some small comments inline.
My main concern right now is testing edge cases,
Github user scwf commented on the pull request:
https://github.com/apache/spark/pull/2576#issuecomment-59603629
Hi, I am working on this and addressing the comments in another branch of
mine (https://github.com/scwf/spark/tree/orc) before it's good to merge here;
you can see the update
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/2732#issuecomment-59603852
retest this please
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2732#issuecomment-59603960
[QA tests have
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21875/consoleFull)
for PR 2732 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2732#issuecomment-59606118
[QA tests have
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21875/consoleFull)
for PR 2732 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/2732#issuecomment-59606120
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user PraveenSeluka commented on a diff in the pull request:
https://github.com/apache/spark/pull/2840#discussion_r19053493
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -851,6 +851,38 @@ class SparkContext(config: SparkConf) extends Logging {
GitHub user saucam opened a pull request:
https://github.com/apache/spark/pull/2841
SPARK-3968 Use parquet-mr filter2 api in spark sql
The parquet-mr project has introduced a new filter API
(https://github.com/apache/incubator-parquet-mr/pull/4), along with several
fixes. It can
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/2841#issuecomment-59612787
Can one of the admins verify this patch?
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2576#issuecomment-59620121
[QA tests have
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21876/consoleFull)
for PR 2576 at commit
GitHub user baishuo opened a pull request:
https://github.com/apache/spark/pull/2842
[SPARK-3999][deploy] resolve the wrong number of arguments for pattern error
AssociationErrorEvent, which is provided by
akka-remote_2.10-2.2.3-shaded-protobuf.jar, only has 4 arguments
You can
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/2842#issuecomment-59620378
Can one of the admins verify this patch?
Github user scwf commented on a diff in the pull request:
https://github.com/apache/spark/pull/2576#discussion_r19054716
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcTableOperations.scala
---
@@ -0,0 +1,351 @@
+/*
+ * Licensed to the Apache Software
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/2576#issuecomment-59622596
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2842#issuecomment-59623836
Jenkins, this is ok to test.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2842#issuecomment-59623964
[QA tests have
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21877/consoleFull)
for PR 2842 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/2842#issuecomment-59624079
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2842#issuecomment-59624078
[QA tests have
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21877/consoleFull)
for PR 2842 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2753#issuecomment-59624284
[QA tests have
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21878/consoleFull)
for PR 2753 at commit
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2805#issuecomment-59626413
I can confirm that this seems to have fixed the serialization issue; here's
my test-case:
```scala
import org.apache.spark.api.java._
val pairs =
```
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/2805#discussion_r19055750
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
---
@@ -587,4 +587,11 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]]
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/2805#discussion_r19055757
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
---
@@ -587,4 +587,11 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]]
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/2805
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2805#issuecomment-59627052
I made those minor fixups and committed this as
f406a8391825d8866110f29a0d656c82cd064520. I also cherry-picked it into
`branch-1.1`. Thanks!
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2842#issuecomment-59627171
It looks like this could be a case where the `AssociationErrorEvent`
case-class is backwards-compatible from a Java POV but not w.r.t. Scala
pattern-matching. If we
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/2753#issuecomment-59627314
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2753#issuecomment-59627310
[QA tests have
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21878/consoleFull)
for PR 2753 at commit
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/2842#issuecomment-59628189
@baishuo are you trying to modify the akka version inside of Spark?
Spark currently supports a specific akka version; that version was upgraded
recently, but Spark
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/2837#issuecomment-59628478
I'm not sure whether this is a good idea - since estimates like this are
likely to be very inaccurate due to the presence of stragglers in most jobs. I
think it's
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/2790#issuecomment-59629160
@wangxiaojing I updated the JIRA title a bit - do you mind updating your
pull request?
https://issues.apache.org/jira/browse/SPARK-3940
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2837#issuecomment-59629809
@pwendell I agree; progress estimation for arbitrary jobs is a really hard
problem and I share your concerns about wrong estimates confusing users.
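The straggler concern can be made concrete with a small sketch. This is not Spark code: the helper name `naive_eta` and the task durations are hypothetical, chosen only to show how a linear estimate based on completed tasks can be badly wrong when one task runs much longer than the rest.

```python
def naive_eta(elapsed, done, total):
    """Linear extrapolation: assume the remaining tasks finish at the
    average rate observed so far.

    Hypothetical helper for illustration only; not part of Spark.
    """
    if done == 0:
        raise ValueError("no completed tasks to extrapolate from")
    return elapsed * (total - done) / done


# Hypothetical stage: 10 tasks start together; 9 take 10 s, one straggler takes 100 s.
durations = [10.0] * 9 + [100.0]

elapsed = 10.0                                     # the moment the 9 fast tasks finish
done = sum(1 for d in durations if d <= elapsed)   # tasks completed by then
estimate = naive_eta(elapsed, done, len(durations))  # seconds the naive estimate predicts
actual = max(durations) - elapsed                  # seconds actually remaining

print(f"estimated remaining: {estimate:.1f} s, actual remaining: {actual:.1f} s")
```

With these made-up numbers the naive estimate predicts roughly one more second while the stage actually needs ninety, which is the kind of misleading figure the comments above worry about showing to users.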
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2836#issuecomment-59630692
Can your use-case be addressed by logging into the master and running a
command after it's launched? I think you could even automate this in your own
bash script that
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/2838#discussion_r19056687
--- Diff: python/pyspark/worker.py ---
@@ -131,6 +130,14 @@ def process():
for (aid, accum) in _accumulatorRegistry.items():
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/2838#discussion_r19056695
--- Diff: python/pyspark/worker.py ---
@@ -57,7 +57,7 @@ def main(infile, outfile):
boot_time = time.time()
split_index =
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2838#issuecomment-59632902
The `take()`-handling code is a little tricky and has been the source of
multiple bugs in the past, so it may be worth considering several different
approaches for
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2681#issuecomment-59633500
Based on some offline discussions, we think that we may have identified the
cause of the Snappy issues that we've seen with TorrentBroadcast. I'm going to
merge this
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2681#issuecomment-59633558
Actually, I changed my mind: I'm going to hold off on merging this because
I don't want to backport the embedding of small objects into `branch-1.1` but I
_do_ want to
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/2681#issuecomment-59634598
I hope that we can have this in 1.1; some people see a regression in 1.1
because of TorrentBroadcast, and this patch will help them.
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/2838#discussion_r19057007
--- Diff: python/pyspark/worker.py ---
@@ -57,7 +57,7 @@ def main(infile, outfile):
boot_time = time.time()
split_index =
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/2838#issuecomment-59635035
take() is not the only call that can introduce problems; a user could call
mapPartitions() and read only part of the items in the infile.
Not only re-use the worker,
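The problem with partially consumed input can be sketched without Spark. In the example below, `io.BytesIO` stands in for the worker's `infile`. The length-prefixed framing and the names `write_record`, `read_record`, `drain` and `END_OF_TASK` are assumptions made for illustration, not PySpark's actual protocol. A task that stops early leaves bytes behind, so a reused worker would have to drain them before it can safely start the next task.

```python
import io
import struct

END_OF_TASK = -1  # hypothetical marker that terminates one task's input


def write_record(stream, payload):
    """Write one length-prefixed record (assumed framing, for illustration)."""
    stream.write(struct.pack("!i", len(payload)))
    stream.write(payload)


def read_record(stream):
    """Read one record, or return None when the end-of-task marker is hit."""
    (length,) = struct.unpack("!i", stream.read(4))
    if length == END_OF_TASK:
        return None
    return stream.read(length)


def drain(stream):
    """Discard the unread records of the current task; return how many were skipped."""
    skipped = 0
    while read_record(stream) is not None:
        skipped += 1
    return skipped


# Build a stream that holds the input of two tasks back to back.
infile = io.BytesIO()
for item in (b"a1", b"a2", b"a3"):
    write_record(infile, item)
infile.write(struct.pack("!i", END_OF_TASK))
write_record(infile, b"b1")
infile.write(struct.pack("!i", END_OF_TASK))
infile.seek(0)

first = read_record(infile)            # the first task stops early, like take(1)
skipped = drain(infile)                # without this, the next read would return b"a2"
next_task_first = read_record(infile)  # now correctly the second task's first record
```

Draining is only one possible approach; the sketch shows why a worker cannot be reused blindly, not how the patch under review handles it.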
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/2808#issuecomment-59635066
@JoshRosen updated the readme.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2808#issuecomment-59635202
[QA tests have
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21879/consoleFull)
for PR 2808 at commit
Github user chouqin commented on a diff in the pull request:
https://github.com/apache/spark/pull/2780#discussion_r19057305
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -1011,4 +1014,99 @@ object DecisionTree extends Serializable with
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/2808#issuecomment-59636460
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2808#issuecomment-59636456
[QA tests have
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21879/consoleFull)
for PR 2808 at commit
Github user chouqin commented on a diff in the pull request:
https://github.com/apache/spark/pull/2780#discussion_r19057399
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
@@ -1011,4 +1014,99 @@ object DecisionTree extends Serializable with
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2808#issuecomment-59636873
LGTM; it seems fine to me to bump the Jekyll version, since the prior
version was really old and didn't seem to properly handle some of the cases
that we use here. We
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/2808
Github user chouqin commented on the pull request:
https://github.com/apache/spark/pull/2780#issuecomment-59637054
@manishamde @jkbradley thanks for your comments, I have changed my code
now. Do you have any more suggestions?
Github user theoryno3 commented on the pull request:
https://github.com/apache/spark/pull/2833#issuecomment-59637188
Just wondering when this pull request will be accepted?
Github user chouqin commented on the pull request:
https://github.com/apache/spark/pull/2780#issuecomment-59637234
Jenkins, test this please.
Github user scwf commented on the pull request:
https://github.com/apache/spark/pull/2576#issuecomment-59638937
Test failed in CliSuite.
Github user ash211 commented on the pull request:
https://github.com/apache/spark/pull/2828#issuecomment-59639085
This is EXCELLENT work @JoshRosen! Looking forward to future integration
tests that cover these sorts of behaviors.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2576#issuecomment-59639208
[QA tests have
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21880/consoleFull)
for PR 2576 at commit
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/2840#discussion_r19058079
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -851,6 +851,38 @@ class SparkContext(config: SparkConf) extends Logging {