[ https://issues.apache.org/jira/browse/MAPREDUCE-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250151#comment-13250151 ]
Jason Lowe commented on MAPREDUCE-4099:
---------------------------------------
All of the reported test failures appear to be unrelated to the patch. They
all fail because a ResourceManager process can't start due to a socket bind
problem -- a runaway RM process on the build machine, perhaps? I ran the RM
unit tests locally with this patch and they all pass.
I also manually tested the patch on a single-node cluster running sleep and
wordcount jobs. In addition, I attached a debugger to the ApplicationMaster so
that it lingered artificially in the FINISHING state, and verified that killing
or expiring an application in that state behaves properly.
> ApplicationMaster may fail to remove staging directory
> ------------------------------------------------------
>
> Key: MAPREDUCE-4099
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4099
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.2
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Priority: Critical
> Attachments: MAPREDUCE-4099.patch, MAPREDUCE-4099.patch
>
>
> When the ApplicationMaster shuts down it's supposed to remove the staging
> directory, assuming properties weren't set to override this behavior. During
> shutdown the AM tells the ResourceManager that it has finished before it
> cleans up the staging directory. However, upon hearing the AM has finished,
> the RM turns right around and kills the AM container. If the AM is too slow,
> the AM will be killed before the staging directory is removed.
> We're seeing the AM lose this race fairly consistently on our clusters, and
> the lack of staging directory cleanup quickly leads to filesystem quota
> issues for some users.
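
To make the race in the description concrete, here is a toy model of the two
shutdown orderings. It is only a sketch and not Hadoop code: the class, enum,
and method names are hypothetical, the timings are made-up parameters, and the
"cleanup first" ordering is shown purely for contrast, not as what the attached
patch does.

```java
// Toy model of the shutdown ordering described above; not Hadoop code.
// All names here are hypothetical and chosen only for illustration.
class StagingCleanupRaceSketch {

    enum ShutdownOrder {
        UNREGISTER_THEN_CLEANUP,  // ordering described in the report
        CLEANUP_THEN_UNREGISTER   // alternative ordering, shown only for contrast
    }

    /**
     * Returns true if the staging directory is removed before the AM
     * container is killed.
     *
     * @param cleanupMillis   time the AM needs to delete the staging directory
     * @param killDelayMillis time between the RM hearing "finished" and the
     *                        AM container actually being killed
     */
    static boolean stagingDirRemoved(ShutdownOrder order,
                                     long cleanupMillis,
                                     long killDelayMillis) {
        switch (order) {
            case UNREGISTER_THEN_CLEANUP:
                // The kill countdown starts at unregistration, so the
                // cleanup only completes if it beats the kill.
                return cleanupMillis < killDelayMillis;
            case CLEANUP_THEN_UNREGISTER:
                // Cleanup finishes before the RM is told anything, so the
                // kill cannot interrupt it in this model.
                return true;
            default:
                throw new IllegalArgumentException("unknown order: " + order);
        }
    }

    public static void main(String[] args) {
        // Assumed timings: a slow cleanup (500 ms) against a quick kill (100 ms).
        long cleanupMillis = 500;
        long killDelayMillis = 100;
        for (ShutdownOrder order : ShutdownOrder.values()) {
            System.out.println(order + " -> staging dir removed: "
                    + stagingDirRemoved(order, cleanupMillis, killDelayMillis));
        }
    }
}
```

In this model the reported ordering leaves the staging directory behind
whenever the cleanup takes at least as long as the RM takes to kill the
container, which matches the slow-AM case described above.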