[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-29 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60877916 [Test build #22430 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22430/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-29 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60877993 I've rewritten this patch so that thread dumps are triggered on-demand using a new driver-to-executor RPC channel. There are a few hacks involved in setting this up,

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-29 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60878081 [Test build #22430 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22430/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60878083 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-29 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2944#discussion_r19521370 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala --- @@ -412,6 +415,17 @@ class BlockManagerMasterActor(val isLocal:

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-29 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2944#discussion_r19521449 --- Diff: core/src/main/scala/org/apache/spark/ui/exec/ThreadDumpPage.scala --- @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-29 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60878689 [Test build #22433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22433/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-29 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60878899 [Test build #22433 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22433/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60878901 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-29 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-61012331 [Test build #22482 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22482/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-29 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-61021043 [Test build #22482 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22482/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-61021051 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60818509 Wow, awesome!! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60818581 This is even easier to read than the raw jstack output.

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60818745 @JoshRosen This is super awesome!

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60834858 It looks like executorIds are assigned by the cluster manager, so in principle they could be arbitrary strings but in practice they seem to not contain special

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60837003 Do you know how large the threadDump typically is? I'm concerned this might make the heartbeat too large.

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60837888 The other idea I had was that we could just open a port on the executor and have a web ui on it. This could also display the executor's stderr (Which is very painful to

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60842701 @shivaram That's a good point RE: the size of the thread dumps. I can now imagine problems where a thread-leak in an executor causes the heartbeat to become huge and

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60843263 [Test build #22383 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22383/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60845020 I like the idea of running a separate UI server on the executor, but this seems like a much more involved change that will take a lot more design review. For example,

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60846384 Yes - I think having a separate RPC sounds good for now.

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60846448 Upon closer inspection, there's not a general driver-to-executor RPC path that I can use to send arbitrary Akka messages to executors. To keep this PR simple and

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60852209 [Test build #22383 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22383/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60852214 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60856220 [Test build #22402 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22402/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60856334 Alright, I've updated this to send the dumps as part of a separate fire-and-forget RPC and removed the new code from the heartbeat code paths (which should make things

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60857056 That sounds good. Actually, could we make this a request-reply pattern, i.e. only fetch the stack traces if somebody clicks on the link?

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60857353 I've thought about that, but it looks like we don't actually create addressable actors on the executors, so there's no path for the driver to send an RPC to the

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60857677 Another subtlety: when the web UI receives a request for a thread-dump, it would need to issue an RPC to the executor to fetch that dump. Ideally, we wouldn't block

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60858892 Hmm okay - I agree that we don't really have a request-reply route from the web UI (maybe this is also worth investigating if/when we have an executor web UI).

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60859702 What if we had a 1000-node cluster, though, and kept the default heartbeat interval of 10 seconds? In that case, we'd be sending a huge flood of data to the driver,
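To put a rough number on the concern above, here is a back-of-the-envelope sketch. The 1000 executors and the 10-second interval come from the comment; the 100 KiB dump size is purely an assumption for illustration (the thread never states a typical size), and the class and method names are hypothetical.

```java
final class HeartbeatLoadSketch {
    /** Average bytes per second reaching the driver if every executor ships one dump per interval. */
    static long bytesPerSecond(int executors, long dumpBytes, int intervalSeconds) {
        return executors * dumpBytes / intervalSeconds;
    }

    public static void main(String[] args) {
        // ASSUMPTION: 100 KiB per thread dump; the real size depends on thread count and stack depth.
        long assumedDumpBytes = 100L * 1024;
        long rate = bytesPerSecond(1000, assumedDumpBytes, 10);
        System.out.println(rate + " bytes/s arriving at the driver");
    }
}
```

Under that assumed size the driver would receive roughly 10 MB of thread-dump text per second whether or not anyone is looking at the page, which is the argument for fetching dumps only on demand.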

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60862614 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60862610 [Test build #22402 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22402/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60650825 [Test build #22301 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22301/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60664728 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-27 Thread shaneknapp
Github user shaneknapp commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60664970 jenkins, test this

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-27 Thread shaneknapp
Github user shaneknapp commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60668583 jenkins, test this please

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60668699 [Test build #22305 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22305/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60680949 [Test build #22305 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22305/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60680961 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-25 Thread JoshRosen
GitHub user JoshRosen opened a pull request: https://github.com/apache/spark/pull/2944 [SPARK-611] [WIP] Display executor thread dumps in web UI This patch allows executor thread dumps to be viewed in the Spark web UI. Thread dumps obtained from Thread.getAllStackTraces()
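For readers unfamiliar with the JDK call named in the description, the following is a minimal sketch of collecting a thread dump with `Thread.getAllStackTraces()` and rendering it as text. It is not the code from the pull request; the class name, method name, and output layout are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ThreadDumpSketch {
    /** Formats the stack trace of every live thread in this JVM as plain text. */
    static String dumpAllThreads() {
        // Snapshot of all live threads and their current stack frames.
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();

        // Sort by thread ID so the output order is stable between calls.
        List<Thread> threads = new ArrayList<>(traces.keySet());
        threads.sort((a, b) -> Long.compare(a.getId(), b.getId()));

        StringBuilder out = new StringBuilder();
        for (Thread t : threads) {
            out.append("Thread ").append(t.getId())
               .append(" \"").append(t.getName()).append("\" (")
               .append(t.getState()).append(")\n");
            for (StackTraceElement frame : traces.get(t)) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

The output grows with the number of live threads and the depth of each stack, so its size is not bounded in advance.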

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-25 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2944#discussion_r19378809 --- Diff: core/src/main/scala/org/apache/spark/scheduler/local/LocalBackend.scala --- @@ -47,7 +47,7 @@ private[spark] class LocalActor(

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60505076 [Test build #8 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/8/consoleFull) for PR 2944 at commit

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-25 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60505192 One subtle issue that I've run into is that the driver always runs a block manager but only runs an Executor in local mode. So, the executors tab in the web UI is

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-25 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60505272 Executor IDs are strings, so I should probably check whether they'll need to be url-encoded; I guess this depends on which components create these strings.
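A small sketch of the defensive option raised here: encode the executor ID before placing it in a link, so an arbitrary string cannot break the URL. The path `/executors/threadDump/` and the `executorId` parameter name are hypothetical, chosen only for illustration; `URLEncoder` and `URLDecoder` are the standard JDK classes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class ExecutorIdUrlSketch {
    /** Form-encodes an executor ID for use as a query-string value. */
    static String encodeId(String executorId) {
        try {
            return URLEncoder.encode(executorId, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encodeId on the receiving side. */
    static String decodeId(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds a link to a HYPOTHETICAL thread-dump page for the given executor. */
    static String threadDumpUrl(String executorId) {
        return "/executors/threadDump/?executorId=" + encodeId(executorId);
    }

    public static void main(String[] args) {
        System.out.println(threadDumpUrl("12"));
        System.out.println(threadDumpUrl("<driver>"));
    }
}
```

Purely numeric IDs pass through unchanged, so encoding costs nothing in the common case and protects against IDs that contain spaces, slashes, or angle brackets.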

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60505576 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-611] [WIP] Display executor thread dump...

2014-10-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2944#issuecomment-60505573 [Test build #8 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/8/consoleFull) for PR 2944 at commit