GitHub user cleaton opened a pull request:
https://github.com/apache/spark/pull/4338
[STREAMING] SPARK-4986 Wait for receivers to deregister and receiver job to
terminate
A slow receiver might not have enough time to shutdown cleanly even when
graceful shutdown is used. This PR
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4338#issuecomment-72662235
Can one of the admins verify this patch?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user cleaton commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-72662664
@tdas I have created a new master branch PR. You can find it here:
https://github.com/apache/spark/pull/4338
---
If your project is set up for it, you can reply to this
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4338#issuecomment-72745892
[Test build #26674 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26674/consoleFull)
for PR 4338 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4338#issuecomment-72745909
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/4338#issuecomment-72707020
ok to test.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4338#issuecomment-72707729
[Test build #26662 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26662/consoleFull)
for PR 4338 at commit
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/4338#issuecomment-72714914
@cleaton Thanks for this PR. It looks reasonably good, except a few minor
comments. Could you please address them very soon? I would like to merge this
today for Spark 1.3.
Github user cleaton commented on the pull request:
https://github.com/apache/spark/pull/4338#issuecomment-72718673
@tdas I will address it now. I have one question though. Don't you think
there at least should be some sort of timeout while waiting for the receivers
to deregister? I
Github user cleaton closed the pull request at:
https://github.com/apache/spark/pull/3868
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4338#issuecomment-72720928
[Test build #26662 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26662/consoleFull)
for PR 4338 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4338#issuecomment-72720945
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/4338#discussion_r24029758
--- Diff:
streaming/src/test/scala/org/apache/spark/streaming/StreamingContextSuite.scala
---
@@ -319,6 +346,38 @@ object TestReceiver {
val counter =
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-72719061
Thanks! @cleaton Could you close this PR, and then create a new PR with the
new stuff. Also create a new JIRA saying something like Make the
gracefulStopTimeout more
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/4338#discussion_r24029813
--- Diff:
streaming/src/test/scala/org/apache/spark/streaming/StreamingContextSuite.scala
---
@@ -205,6 +205,33 @@ class StreamingContextSuite extends FunSuite
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/4338#discussion_r24029997
--- Diff:
streaming/src/test/scala/org/apache/spark/streaming/StreamingContextSuite.scala
---
@@ -205,6 +205,33 @@ class StreamingContextSuite extends FunSuite
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/4338#discussion_r24048309
--- Diff:
streaming/src/test/scala/org/apache/spark/streaming/StreamingContextSuite.scala
---
@@ -205,6 +205,33 @@ class StreamingContextSuite extends FunSuite
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/4338#issuecomment-72753609
Alright, I like this. I am merging this. Please submit the other PR as well
:)
---
If your project is set up for it, you can reply to this email and have your
reply
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/4338
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4338#issuecomment-72733188
[Test build #26674 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26674/consoleFull)
for PR 4338 at commit
Github user cleaton commented on a diff in the pull request:
https://github.com/apache/spark/pull/4338#discussion_r24038573
--- Diff:
streaming/src/test/scala/org/apache/spark/streaming/StreamingContextSuite.scala
---
@@ -205,6 +205,33 @@ class StreamingContextSuite extends
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-72556583
It is fine to not address the timeout in that PR. That PR is about
correctness. With that PR, the system will not loose data when gracefully
stopped. The time it may take
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-72553531
Yes, the branch looks good. Can you please submit that as a PR. There are a
few minor comments that I want to make.
---
If your project is set up for it, you can reply to
Github user cleaton commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-72555661
I will rebase to master tomorrow and submit it as a new PR for SPARK-4986.
There is still the problem of how to handle the gracefulStopTimeout option, any
thoughts about
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-72101801
1. The merge window for features closes this weekend. However this is a bug
fix so we can make it to early next week as well. But the change has to be
surgical.
Github user cleaton commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-72158188
Thanks @tdas
I'll create a new minimal PR with only the receiver shutdown part and the
unit test. The main problem is how to deal with the timeout then. The two
Github user cleaton commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-72164837
@tdas I have a new branch here:
https://github.com/cleaton/spark/compare/apache:branch-1.2...receiverstop
Still missing code to handle gracefulStopTimeout but
Github user cleaton commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-71843209
Jenkins, test this please.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23705218
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala
---
@@ -68,36 +69,60 @@ class JobScheduler(val ssc:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23706083
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala
---
@@ -223,22 +224,33 @@ class ReceiverTracker(ssc:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23701730
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala
---
@@ -68,36 +69,60 @@ class JobScheduler(val ssc:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23702493
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala
---
@@ -68,36 +69,60 @@ class JobScheduler(val ssc:
Github user cleaton commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-71982990
Thank you for the comments @tdas . I think the diff file for this PR has
become a bit misleading. It looks like there is more change than there actually
is. This is a
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23699803
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala
---
@@ -68,36 +69,60 @@ class JobScheduler(val ssc:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23699867
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala
---
@@ -68,36 +69,60 @@ class JobScheduler(val ssc:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23699970
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala
---
@@ -68,36 +69,60 @@ class JobScheduler(val ssc:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23700247
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala
---
@@ -68,36 +69,60 @@ class JobScheduler(val ssc:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23706669
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala
---
@@ -208,6 +208,7 @@ class ReceiverTracker(ssc:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23706702
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala
---
@@ -223,22 +224,33 @@ class ReceiverTracker(ssc:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23709301
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala
---
@@ -68,36 +69,60 @@ class JobScheduler(val ssc:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23708107
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
---
@@ -126,10 +112,13 @@ class JobGenerator(jobScheduler:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23708100
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
---
@@ -93,27 +93,13 @@ class JobGenerator(jobScheduler:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23708402
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
---
@@ -93,27 +93,13 @@ class JobGenerator(jobScheduler:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23708418
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
---
@@ -126,10 +112,13 @@ class JobGenerator(jobScheduler:
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r23710541
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala
---
@@ -68,36 +69,60 @@ class JobScheduler(val ssc:
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-71905464
This patch is definitely heading in the right direction. But as I was
reading it, I realize that it affects the whole shutdown behavior of all the
components
Github user cleaton commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-71375392
@tdas I've added a unit test now covering slow receiver shutdown. What do
you think about the approach now? Thanks
---
If your project is set up for it, you can reply
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68689798
I am happy to review the code if you take a pass on implementing (2). I can
jump in if things get too hairy.
---
If your project is set up for it, you can reply to this
Github user cleaton commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68678753
@tdas Thank you for the input.
Yes, the main purpose of this patch is to make ReceiverTracker graceful by
waiting for ssc.sparkContext.runJob(tempRDD,
Github user cleaton commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68707568
OK sounds great. :+1:
I can prepare an implementation of (2). Bit busy now, but I think I can
have something to review in a week.
Any specific unit test
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68647850
Jenkins, test this please.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68647896
[Test build #25037 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25037/consoleFull)
for PR 3868 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68652450
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68652443
[Test build #25037 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25037/consoleFull)
for PR 3868 at commit
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68654184
Could you add an unit test that tests this functionality. Something like
this is easy undergo regression silently.
---
If your project is set up for it, you can reply to
Github user cleaton commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68656608
@tdas yes I think I can, I will take a look at it. Thanks for the feedback.
---
If your project is set up for it, you can reply to this email and have your
reply appear
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r22444552
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala
---
@@ -223,22 +227,40 @@ class ReceiverTracker(ssc:
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r22444547
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala
---
@@ -18,6 +18,8 @@
package
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r22444560
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala
---
@@ -285,16 +307,18 @@ class ReceiverTracker(ssc:
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r22444563
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/util/TimeoutUtils.scala ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r22444565
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/util/TimeoutUtils.scala ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r22444576
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/util/TimeoutUtils.scala ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r22444579
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/util/TimeoutUtils.scala ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r22444573
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/util/TimeoutUtils.scala ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r22444597
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala
---
@@ -223,22 +227,40 @@ class ReceiverTracker(ssc:
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68656939
As long as you're planning to do more work on this, I left a round of (kind
of nitpicky) style comments
Github user cleaton commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68657548
@JoshRosen Thank you for all the comments. I will update accordingly :).
Regarding the waitUntilDone function, yes I assume there must be something
like this in
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68658085
Yeah, we can't / shouldn't use ScalaTest's `eventually` here; I was just
noting the similarity. Maybe there's a library that has something like this,
but I don't know
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r22447509
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/util/TimeoutUtils.scala ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/3868#discussion_r22447455
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
---
@@ -93,27 +93,18 @@ class JobGenerator(jobScheduler:
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68670827
I think I get the idea at a high-level what this patch is trying to do. The
key change essentially is in the `ReceiverLauncher` - the `stop` is made to use
the config
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/3868#issuecomment-68617053
Jenkins, this is ok to test
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
72 matches
Mail list logo