Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/5491
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-95654146
LGTM
Github user WangTaoTheTonic commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-95073146
@vanzin Please take a look, thanks~
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-94620834
[Test build #30623 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30623/consoleFull)
for PR 5491 at commit
Github user WangTaoTheTonic commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-94621686
If we changed the way we iterate as you suggested, the delete operations could
cost too much time here (or even get stuck) if the DFS client throws IOException
often
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-94634400
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-94634395
[Test build #30623 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30623/consoleFull)
for PR 5491 at commit
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-94525786
LGTM, just left a minor comment.
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/5491#discussion_r28712948
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -266,34 +268,38 @@ private[history] class FsHistoryProvider(conf:
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-94341483
[Test build #30571 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30571/consoleFull)
for PR 5491 at commit
Github user WangTaoTheTonic commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-94341589
@vanzin Added another temporary ListBuffer `leftToClean` to store the apps
that weren't deleted successfully, and to avoid editing `appsToClean` inside
its own iteration.
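The fix described above can be sketched as follows; `cleanRound` and the `delete` callback are illustrative names, not Spark's actual code, showing how failed deletions go into a temporary `leftToClean` buffer instead of mutating `appsToClean` mid-iteration:

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical sketch of the cleanup pass: entries whose deletion fails
// are collected into a separate `leftToClean` buffer and copied back
// afterwards, so `appsToClean` is never edited while it is being iterated.
def cleanRound(appsToClean: ListBuffer[String], delete: String => Boolean): Unit = {
  val leftToClean = new ListBuffer[String]
  appsToClean.foreach { app =>
    if (!delete(app)) {
      leftToClean += app  // keep it for the next clean round
    }
  }
  appsToClean.clear()
  appsToClean ++= leftToClean
}
```

A failed delete (e.g. an IOException from the DFS client) then only postpones that one app's cleanup to the next round instead of stalling the loop.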
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-94351145
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-94351140
[Test build #30571 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30571/consoleFull)
for PR 5491 at commit
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/5491#discussion_r28443327
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -273,33 +278,34 @@ private[history] class FsHistoryProvider(conf:
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/5491#discussion_r28442371
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -134,6 +138,7 @@ private[history] class FsHistoryProvider(conf:
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/5491#discussion_r28442204
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -21,6 +21,7 @@ import java.io.{IOException, BufferedInputStream,
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/5491#discussion_r28443347
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -273,33 +278,34 @@ private[history] class FsHistoryProvider(conf:
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-93624836
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-93624827
[Test build #30390 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30390/consoleFull)
for PR 5491 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-93613012
[Test build #30390 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30390/consoleFull)
for PR 5491 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-93213684
[Test build #30314 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30314/consoleFull)
for PR 5491 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-93242420
[Test build #30314 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30314/consoleFull)
for PR 5491 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-93242499
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-93210950
[Test build #30312 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30312/consoleFull)
for PR 5491 at commit
Github user WangTaoTheTonic commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-93216107
@vanzin Now I use an extra global ListBuffer to store the apps to clean,
updating its contents and deleting their dirs/files in every clean round.
I know the
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-93229453
[Test build #30312 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30312/consoleFull)
for PR 5491 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-93229522
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92624802
[Test build #30218 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30218/consoleFull)
for PR 5491 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92624868
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/5491#discussion_r28347842
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -273,33 +273,32 @@ private[history] class FsHistoryProvider(conf:
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/5491#discussion_r28347868
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -273,33 +273,32 @@ private[history] class FsHistoryProvider(conf:
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92451595
I can easily imagine Spark applications running in jobserver
mode that sit idle for a long time between active jobs.
I can see that, but I wonder if a
Github user squito commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92447578
@vanzin as a counterpoint -- I can easily imagine Spark applications
running in jobserver mode that sit idle for a long time between active jobs.
Can we
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92555754
where the app itself would explicitly keep the log's mod time updated
All I mean here is that `EventLoggingListener` could from time to time just
call
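A minimal sketch of that suggestion, assuming a hypothetical `touchEventLog` helper (this is not what `EventLoggingListener` actually does; it only illustrates the proposed `setTimes` keep-alive):

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical keep-alive: refresh the event log's modification time so an
// age-based cleaner does not treat a long-idle application's log as expired.
// A listener would call this periodically, e.g. from a scheduled task.
def touchEventLog(fs: FileSystem, logPath: Path): Unit = {
  val now = System.currentTimeMillis()
  fs.setTimes(logPath, now, -1)  // mtime = now; -1 leaves access time unchanged
}
```

The cost on HDFS is one `setTimes` RPC per interval, so the interval only needs to be shorter than the history server's cleanup age threshold.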
Github user WangTaoTheTonic commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92566099
@vanzin Erhhh, it seems like another solution, but there are a few questions:
1. It adds logic to the event logger (more code and more actions)
2. It increases the
Github user WangTaoTheTonic commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92551078
Okay, I made an observation on my cluster: the thrift server was started at
21:01:32 and hasn't done anything since then. Its event log's modification time
is
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/5491#discussion_r28297121
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -273,35 +273,28 @@ private[history] class FsHistoryProvider(conf:
Github user WangTaoTheTonic commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92550220
@vanzin
It is not just theoretical. I tested using a ThriftServer instance; before
this patch its event log was deleted by the cleaner (when it expired).
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92567471
Yeah, I'm a little on the fence. Pressure on HDFS shouldn't be a problem -
you don't need to call `setTimes` that often, just often enough that the HS
wouldn't clean up
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/5491#discussion_r28297179
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -273,35 +273,28 @@ private[history] class FsHistoryProvider(conf:
Github user viper-kun commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92578144
I think it is OK. The user must call sc.stop(); if not, it just won't delete
some event logs.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92595667
[Test build #30218 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30218/consoleFull)
for PR 5491 at commit
Github user WangTaoTheTonic commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92596329
Used an extra Map (`appsToClean`) to store applications that need to be
deleted, adding them back to `applications` once a delete fails.
@vanzin Please check
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92437781
Question: did you actually run into this problem or is this theoretical?
Because I'd expect a live application to be updating the event log, in
which case its
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/spark/pull/5491
[SPARK-6879][HistoryServer]check if app is completed before clean it up
https://issues.apache.org/jira/browse/SPARK-6879
Use `applications` to replace `FileStatus`, and check if the
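The core idea of the patch, sketched with illustrative field names (the real code works against Spark's application listing inside `FsHistoryProvider`, not this simplified model):

```scala
// Simplified model: only completed applications are eligible for age-based
// cleanup, so an in-progress app whose log merely looks stale is skipped.
case class AppInfo(id: String, completed: Boolean, lastUpdatedMs: Long)

def expiredApps(apps: Seq[AppInfo], maxAgeMs: Long, nowMs: Long): Seq[AppInfo] =
  apps.filter(a => a.completed && nowMs - a.lastUpdatedMs > maxAgeMs)
```

Under this check, a long-idle but still-running application (e.g. a thrift server) is never selected for cleanup, regardless of its log's modification time.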
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92339932
[Test build #30159 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30159/consoleFull)
for PR 5491 at commit
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92340806
CC @vanzin @viper-kun I think this makes sense, although it does change the
logic slightly. Now, log cleanup happens based on the application's state,
rather than the
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92368706
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92368689
[Test build #30159 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30159/consoleFull)
for PR 5491 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92360278
[Test build #30166 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30166/consoleFull)
for PR 5491 at commit
Github user WangTaoTheTonic commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92359112
Fixed the wrong path to delete; I have tested it on my cluster and it worked
fine.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92395847
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5491#issuecomment-92395785
[Test build #30166 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30166/consoleFull)
for PR 5491 at commit