[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-23 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/5491 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-23 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-95654146 LGTM

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-22 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-95073146 @vanzin Please take a look, thanks~

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-94620834 [Test build #30623 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30623/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-20 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-94621686 If we changed the iteration the way you suggested, the delete operations could take too long here (or even get stuck) if the DFS client throws IOException often

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-94634400 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-94634395 [Test build #30623 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30623/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-20 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-94525786 LGTM, just left a minor comment.

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-20 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/5491#discussion_r28712948 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -266,34 +268,38 @@ private[history] class FsHistoryProvider(conf:

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-19 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-94341483 [Test build #30571 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30571/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-19 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-94341589 @vanzin Added another temporary ListBuffer `leftToClean` to store the apps that weren't deleted successfully, which avoids modifying `appsToClean` while iterating over it.
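
A minimal sketch of the pattern described in this comment, not the code from the PR: failed deletions are collected in a temporary `leftToClean` buffer, and `appsToClean` is only rewritten after the loop over it has finished. `AppInfo`, `CleanerSketch`, `cleanRound`, and the `delete` callback are hypothetical names used for illustration; only `leftToClean` and `appsToClean` come from the comment.

```scala
import java.io.IOException
import scala.collection.mutable

// Hypothetical stand-in for the provider's per-application record.
final case class AppInfo(id: String, logPath: String)

object CleanerSketch {
  // Apps whose logs are pending deletion; kept across cleaning rounds.
  val appsToClean: mutable.ListBuffer[AppInfo] = mutable.ListBuffer.empty[AppInfo]

  // One cleaning round. `delete` is a hypothetical callback that throws
  // IOException when the underlying file system call fails.
  def cleanRound(delete: AppInfo => Unit): Unit = {
    val leftToClean = mutable.ListBuffer.empty[AppInfo]
    appsToClean.foreach { app =>
      try {
        delete(app)
      } catch {
        // Remember the failure so the app is retried in the next round.
        case _: IOException => leftToClean += app
      }
    }
    // Rewrite appsToClean only after the iteration above has completed.
    appsToClean.clear()
    appsToClean ++= leftToClean
  }
}
```

Each app gets one delete attempt per round, so a file system that keeps throwing cannot stall the round; the failed entries simply carry over.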

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-94351145 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-19 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-94351140 [Test build #30571 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30571/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/5491#discussion_r28443327 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -273,33 +278,34 @@ private[history] class FsHistoryProvider(conf:

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/5491#discussion_r28442371 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -134,6 +138,7 @@ private[history] class FsHistoryProvider(conf:

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/5491#discussion_r28442204 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -21,6 +21,7 @@ import java.io.{IOException, BufferedInputStream,

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/5491#discussion_r28443347 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -273,33 +278,34 @@ private[history] class FsHistoryProvider(conf:

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-93624836 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-93624827 [Test build #30390 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30390/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-93613012 [Test build #30390 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30390/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-93213684 [Test build #30314 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30314/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-93242420 [Test build #30314 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30314/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-93242499 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-93210950 [Test build #30312 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30312/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-93216107 @vanzin I now use an extra global ListBuffer to store the apps to clean, updating its contents and deleting their dirs/files in every cleaning round. I know the

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-93229453 [Test build #30312 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30312/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-93229522 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92624802 [Test build #30218 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30218/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92624868 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-14 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/5491#discussion_r28347842 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -273,33 +273,32 @@ private[history] class FsHistoryProvider(conf:

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-14 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/5491#discussion_r28347868 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -273,33 +273,32 @@ private[history] class FsHistoryProvider(conf:

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92451595 "I can easily imagine Spark applications running in jobserver mode that sit idle for a long time between active jobs." I can see that, but I wonder it a

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92447578 @vanzin as a counterpoint -- I can easily imagine Spark applications running in jobserver mode that sit idle for a long time between active jobs. Can we

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92555754 "where the app itself would explicitly keep the log's mod time updated" All I mean here is that `EventLoggingListener` could from time to time just call

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92566099 @vanzin Erhhh, it seems like another solution, but there are a few concerns: 1. It adds logic to the event logger (more code and more actions). 2. It increases the

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92551078 Okay, I made an observation on my cluster: the thrift server was started at 21:01:32 and hasn't done anything since. Its event log's modification time is

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/5491#discussion_r28297121 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -273,35 +273,28 @@ private[history] class FsHistoryProvider(conf:

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92550220 @vanzin It is not just theoretical. I tested using a ThriftServer instance; before this patch its event log was deleted by the cleaner (when it expired).

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92567471 Yeah, I'm a little on the fence. Pressure on HDFS shouldn't be a problem - you don't need to call `setTimes` that often, just often enough that the HS wouldn't clean up
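
A rough sketch of the "touch the log occasionally" idea from this comment. To stay self-contained it swaps in `java.nio.file.Files.setLastModifiedTime` on a local file instead of the `setTimes` call the comment mentions, so it only illustrates the throttling logic. `TouchSketch`, `touchIfStale`, and the `minIntervalMs` parameter are hypothetical names, not anything from Spark.

```scala
import java.nio.file.{Files, Path}
import java.nio.file.attribute.FileTime

object TouchSketch {
  // Bump the log's modification time to `nowMs`, but only if the current
  // one is at least `minIntervalMs` old, so the call stays infrequent.
  // Returns true when the modification time was actually updated.
  def touchIfStale(log: Path, nowMs: Long, minIntervalMs: Long): Boolean = {
    val lastModifiedMs = Files.getLastModifiedTime(log).toMillis
    if (nowMs - lastModifiedMs >= minIntervalMs) {
      Files.setLastModifiedTime(log, FileTime.fromMillis(nowMs))
      true
    } else {
      false
    }
  }
}
```

Choosing `minIntervalMs` well below the cleaner's expiry age would keep an idle but live application's log from looking expired, at the cost of one metadata update per interval.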

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/5491#discussion_r28297179 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -273,35 +273,28 @@ private[history] class FsHistoryProvider(conf:

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread viper-kun
Github user viper-kun commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92578144 I think it is OK. Users must call `sc.stop()`; if they don't, some event logs just won't be deleted.

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92595667 [Test build #30218 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30218/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92596329 Used an extra Map (`appsToClean`) to store applications that need to be deleted, and add them back to `applications` if the delete fails. @vanzin Please check

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92437781 Question: did you actually run into this problem or is this theoretical? Because I'd expect a live application to be updating the event log, in which case its

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread WangTaoTheTonic
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/5491 [SPARK-6879][HistoryServer]check if app is completed before clean it up https://issues.apache.org/jira/browse/SPARK-6879 Use `applications` to replace `FileStatus`, and check if the
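
As a hedged illustration of what "check if app is completed before clean it up" could look like, the sketch below selects only applications that are both completed and older than a maximum age. It is an assumption-laden toy, not the PR's code: `AppRecord`, its fields, `ExpirySketch`, `expiredApps`, and `maxAgeMs` are all hypothetical names.

```scala
// Hypothetical record for one application known to the history server.
final case class AppRecord(id: String, lastUpdatedMs: Long, completed: Boolean)

object ExpirySketch {
  // Select the apps whose logs may be cleaned: the app must have completed
  // AND its last update must be older than maxAgeMs. An app that is still
  // running is never selected, however long it has been idle.
  def expiredApps(apps: Seq[AppRecord], nowMs: Long, maxAgeMs: Long): Seq[AppRecord] =
    apps.filter(app => app.completed && nowMs - app.lastUpdatedMs > maxAgeMs)
}
```

With an age-only check, a long-idle but still running application (such as the ThriftServer case raised later in the thread) would be selected too; the extra `completed` condition is what excludes it.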

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92339932 [Test build #30159 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30159/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92340806 CC @vanzin @viper-kun I think this makes sense, although it does change the logic slightly. Now, log cleanup happens based on the application's state, rather than the

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92368706 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92368689 [Test build #30159 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30159/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92360278 [Test build #30166 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30166/consoleFull) for PR 5491 at commit

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92359112 Fixed the wrong deletion path; I have tested it on my cluster and it worked fine.

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92395847 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-6879][HistoryServer]check if app is com...

2015-04-13 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5491#issuecomment-92395785 [Test build #30166 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30166/consoleFull) for PR 5491 at commit