[ https://issues.apache.org/jira/browse/MAPREDUCE-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251652#comment-13251652 ]
Jason Lowe commented on MAPREDUCE-4099:
---------------------------------------
The TestClientRMService failure appears to be unrelated to this patch. The
test covers an area of the code that the patch does not touch, and it passes
for me when I run it locally.
> ApplicationMaster may fail to remove staging directory
> ------------------------------------------------------
>
> Key: MAPREDUCE-4099
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4099
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.2
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Priority: Critical
> Fix For: 0.23.3, 2.0.0
>
> Attachments: MAPREDUCE-4099-addendum.patch,
> MAPREDUCE-4099-addendum.patch, MAPREDUCE-4099.patch, MAPREDUCE-4099.patch,
> MAPREDUCE-4099.patch
>
>
> When the ApplicationMaster shuts down, it's supposed to remove the staging
> directory, assuming properties weren't set to override this behavior. During
> shutdown, the AM tells the ResourceManager that it has finished before it
> cleans up the staging directory. However, upon hearing that the AM has
> finished, the RM immediately kills the AM container. If the AM is too slow,
> it will be killed before the staging directory is removed.
> We're seeing the AM lose this race fairly consistently on our clusters, and
> the lack of staging directory cleanup quickly leads to filesystem quota
> issues for some users.
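
The ordering problem described above can be illustrated with a small,
self-contained sketch. It is a toy model under stated assumptions, not Hadoop
code and not necessarily what the attached patches do: FakeAppMaster,
unregisterFromRM, and cleanupStagingDir are hypothetical names, and the RM's
kill is modeled as happening immediately after unregistration (the worst case
of the race).

```java
import java.util.ArrayList;
import java.util.List;

public class ShutdownOrderSketch {

    /** Hypothetical stand-in for the ApplicationMaster; not a Hadoop class. */
    static class FakeAppMaster {
        final List<String> log = new ArrayList<>();
        boolean killed = false;
        boolean stagingDirExists = true;

        /** Tell the (simulated) RM that the application has finished. */
        void unregisterFromRM() {
            log.add("unregister");
            // Assumption: the RM kills the container right away, so any
            // work attempted after this point never runs.
            killed = true;
        }

        /** Remove the (simulated) staging directory if still alive. */
        void cleanupStagingDir() {
            if (killed) {
                log.add("cleanup skipped: container already killed");
                return;
            }
            stagingDirExists = false;
            log.add("cleanup");
        }
    }

    /** Order described in the report: unregister first, then clean up. */
    static boolean racyShutdown(FakeAppMaster am) {
        am.unregisterFromRM();
        am.cleanupStagingDir();
        return am.stagingDirExists; // true means the directory leaked
    }

    /** One way to avoid the race: clean up before unregistering. */
    static boolean safeShutdown(FakeAppMaster am) {
        am.cleanupStagingDir();
        am.unregisterFromRM();
        return am.stagingDirExists;
    }

    public static void main(String[] args) {
        System.out.println("racy order leaks staging dir: "
                + racyShutdown(new FakeAppMaster()));
        System.out.println("safe order leaks staging dir: "
                + safeShutdown(new FakeAppMaster()));
    }
}
```

In this model the first ordering always leaks the directory because the kill
is immediate; on a real cluster the outcome depends on timing, which matches
the "fairly consistently" wording in the report.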