[GitHub] spark pull request: [SPARK-3736] Workers reconnect when disassocia...

2014-11-25 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-64323860 Andrew's got a patch for this: #3447 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project

2014-11-24 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-64316156 It looks like this patch may have introduced a race-condition / bug during multi-master failover: https://issues.apache.org/jira/browse/SPARK-4592. I'm working on a

2014-10-20 Thread mccheah
Github user mccheah commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59803881 @JoshRosen agreed with @ash211, this is really good. Are there any actual comments on the PR, or can it be merged? =)

2014-10-20 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59807896 @JoshRosen, this is awesome for testing Spark integration with Docker. @mccheah, this PR LGTM now, except that we exposed too many should-be-private members in

2014-10-20 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59810037 @CodingCat, Worker is private[spark], so what is the nature of your concern? In fact, I'm wondering whether we really want the changes in this PR that make some

2014-10-20 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59810532 @markhamstra, yeah, my concern is just this: though Worker is marked as private[spark], is it good practice to expose every implementation detail to the

2014-10-20 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r19101917 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -166,26 +178,47 @@ private[spark] class Worker( } }

2014-10-20 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59811803 A legitimate concern, and certainly something that could be worked up into a JIRA issue and separate pull request. But it's not a very pressing issue since nothing

2014-10-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59812443 As a general principle, you should use the most private access modifiers that are sufficient. We can always make methods / fields _more_ visible, but it's much harder
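
The principle in this comment can be illustrated with a minimal Scala sketch. It is not code from `Worker.scala` or from the pull request; `AccessDemo`, `RetryCounter`, and all member names are hypothetical and exist only to show a field kept `private` while a small public surface is exposed.

```scala
// Illustrative only: RetryCounter and its members are hypothetical names,
// not code from Worker.scala or the pull request.
object AccessDemo {
  final class RetryCounter(maxAttempts: Int) {
    // Implementation detail: nothing outside this class can read or write it.
    private var attempts: Int = 0

    // Public because callers need it; returns false once the budget is used up.
    def tryAgain(): Boolean =
      if (attempts < maxAttempts) { attempts += 1; true } else false

    // Read-only view of the counter; callers cannot mutate it directly.
    def attemptsSoFar: Int = attempts
  }
}
```

Starting with `private` and widening later (for example to `private[spark]`) does not break existing callers, whereas narrowing a member that other code already uses does.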

2014-10-20 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r19102299 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -64,8 +66,17 @@ private[spark] class Worker( // Send a heartbeat

2014-10-20 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r19102413 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -166,26 +178,47 @@ private[spark] class Worker( } }

2014-10-20 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59814026 Sure, I created the JIRA: https://issues.apache.org/jira/browse/SPARK-4011

2014-10-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59814881 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21926/consoleFull) for PR 2828 at commit

2014-10-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59815368 This looks good to me. Thanks! I'm going to merge this into `master`.

2014-10-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2828

2014-10-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59823491 Test FAILed. Refer to this link for build results (access rights to CI server needed):

2014-10-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59823485 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21926/consoleFull) for PR 2828 at commit

2014-10-20 Thread mccheah
Github user mccheah commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59824518 The PR doesn't seem to be related to the unit tests that failed. How shall we tackle this issue?

2014-10-20 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59833561 Don't worry about it. This test is a little flaky and will be fixed shortly. I highly doubt that the test failure is caused by this PR.

2014-10-18 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59602394 **tl;dr**: _this patch looks pretty good to me based on the testing that I've done so far. For my own interest / fun, I'd like to find a way to extend my test

2014-10-18 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59639085 This is EXCELLENT work, @JoshRosen! Looking forward to future integration tests that cover these sorts of behaviors.

2014-10-17 Thread ash211
Github user ash211 commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r19002923 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -362,9 +372,19 @@ private[spark] class Worker( } }

2014-10-17 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r19011614 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -362,9 +372,19 @@ private[spark] class Worker( } }

2014-10-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59596562 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21873/consoleFull) for PR 2828 at commit

2014-10-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59598067 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21873/consoleFull) for PR 2828 at commit

2014-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59598071 Test PASSed. Refer to this link for build results (access rights to CI server needed):

2014-10-16 Thread mccheah
GitHub user mccheah opened a pull request: https://github.com/apache/spark/pull/2828 [SPARK-3736] Workers reconnect when disassociated from the master. Previously, if the master node was killed and restarted, the worker nodes would not attempt to reconnect to the Master. Therefore,
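
The reconnect behavior described in this pull request summary can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not the patch itself: the event names, `WorkerState`, the `step` function, and the retry limit `MaxAttempts` are all hypothetical, and the real Worker is an actor rather than a pure function.

```scala
// Hypothetical sketch of "re-register after losing the master".
// None of these names come from Worker.scala.
object ReconnectSketch {
  sealed trait Event
  case object MasterDisassociated extends Event  // connection to the master was lost
  case object RegisteredWithMaster extends Event // master acknowledged registration
  case object RetryTick extends Event            // periodic timer while unregistered

  final case class WorkerState(registered: Boolean, attempts: Int)

  // Assumed retry budget, chosen only for illustration.
  val MaxAttempts = 3

  // Returns the next state and whether a registration request should be sent now.
  def step(state: WorkerState, event: Event): (WorkerState, Boolean) = event match {
    case MasterDisassociated =>
      // Mark the worker unregistered and make the first attempt immediately.
      (WorkerState(registered = false, attempts = 1), true)
    case RegisteredWithMaster =>
      (WorkerState(registered = true, attempts = 0), false)
    case RetryTick if !state.registered && state.attempts < MaxAttempts =>
      (state.copy(attempts = state.attempts + 1), true)
    case RetryTick =>
      // Already registered, or the retry budget is exhausted: send nothing.
      (state, false)
  }
}
```

Keeping the transition logic in a pure function makes it testable without an actor test harness; whether that fits the actual Worker design is a separate question.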

2014-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59406399 Can one of the admins verify this patch?

2014-10-16 Thread mccheah
Github user mccheah commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59408043 One remark is that there are no automated tests in this commit for now. I was unsuccessful in setting up TestKit to emulate a worker and master sending messages

2014-10-16 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18976036 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -94,6 +96,7 @@ private[spark] class Worker( val finishedExecutors =

2014-10-16 Thread ash211
Github user ash211 commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18976148 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala --- @@ -341,7 +341,11 @@ private[spark] class Master( case

2014-10-16 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18976594 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -365,6 +375,16 @@ private[spark] class Worker( def

2014-10-16 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18976619 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -365,6 +375,16 @@ private[spark] class Worker( def

2014-10-16 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18977018 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala --- @@ -341,7 +341,11 @@ private[spark] class Master( case

2014-10-16 Thread mccheah
Github user mccheah commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18977288 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala --- @@ -341,7 +341,11 @@ private[spark] class Master( case

2014-10-16 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18978412 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -362,9 +372,19 @@ private[spark] class Worker( } }

2014-10-16 Thread mccheah
Github user mccheah commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18978981 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -362,9 +372,19 @@ private[spark] class Worker( } }

2014-10-16 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59425292 add to whitelist

2014-10-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59425990 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21818/consoleFull) for PR 2828 at commit

2014-10-16 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18985861 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -362,9 +372,19 @@ private[spark] class Worker( } }

2014-10-16 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18985880 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -243,6 +249,10 @@ private[spark] class Worker(

2014-10-16 Thread mccheah
Github user mccheah commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18986188 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -362,9 +372,19 @@ private[spark] class Worker( } }

2014-10-16 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18986488 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -362,9 +372,19 @@ private[spark] class Worker( } }

2014-10-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59430399 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21822/consoleFull) for PR 2828 at commit

2014-10-16 Thread mccheah
Github user mccheah commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18986702 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -362,9 +372,19 @@ private[spark] class Worker( } }

2014-10-16 Thread mccheah
Github user mccheah commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18986941 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -362,9 +372,19 @@ private[spark] class Worker( } }

2014-10-16 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18988031 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -362,9 +372,19 @@ private[spark] class Worker( } }

2014-10-16 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18988140 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -362,9 +372,19 @@ private[spark] class Worker( } }

2014-10-16 Thread mccheah
Github user mccheah commented on a diff in the pull request: https://github.com/apache/spark/pull/2828#discussion_r18988742 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -362,9 +372,19 @@ private[spark] class Worker( } }

2014-10-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59435763 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21818/consoleFull) for PR 2828 at commit

2014-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59435771 Test PASSed. Refer to this link for build results (access rights to CI server needed):

2014-10-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59439367 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21822/consoleFull) for PR 2828 at commit

2014-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-59439376 Test PASSed. Refer to this link for build results (access rights to CI server needed):