[
https://issues.apache.org/jira/browse/MAPREDUCE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131440#comment-13131440
]
Siddharth Seth commented on MAPREDUCE-3159:
-------------------------------------------
The new Application state transition (INITING to FINISHED on APP_INIT_FAILED)
will cause problems with subsequent startContainer() calls (new containers will
be stuck in the NEW state) and with finishApplication() calls. As the patch
says, an additional state would be needed, and it would have to deal with new
container and finishApp requests. Also, an intermittent failure in app
initialization would end up making the node unusable for that specific app.
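
To illustrate the stuck-container concern, here is a toy model. It is not the
NodeManager's actual state machine: the class, enum, and method names below are
hypothetical, and it assumes a container only leaves NEW once its application
has reached a running state.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; not the NodeManager's actual Application state machine. */
public class AppStateSketch {
    enum AppState { INITING, RUNNING, FINISHED }
    enum ContainerState { NEW, LOCALIZING }

    AppState state = AppState.INITING;
    final List<ContainerState> containers = new ArrayList<>();

    /** Assumption: a container only advances past NEW when the app is RUNNING. */
    int startContainer() {
        containers.add(state == AppState.RUNNING
                ? ContainerState.LOCALIZING
                : ContainerState.NEW);
        return containers.size() - 1;
    }

    /** The transition under discussion: INITING to FINISHED on APP_INIT_FAILED. */
    void appInitFailed() {
        if (state == AppState.INITING) {
            state = AppState.FINISHED;
        }
    }

    public static void main(String[] args) {
        AppStateSketch app = new AppStateSketch();
        app.appInitFailed();             // init fails once, app jumps to FINISHED
        int id = app.startContainer();   // a later container request for the same app
        System.out.println(app.state + " " + app.containers.get(id)); // FINISHED NEW
    }
}
```

In this model nothing ever moves the application out of FINISHED, so every
later container for the app stays in NEW.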
Changing DCE (DefaultContainerExecutor) to remove the delete in
{code}createAppDirs{code} is probably a simpler fix. job.jar and job.xml are
separate App resources, each localized into its own directory. A failure in one
should not affect the other; it should only affect the container associated
with the failed localization attempt. I don't think the comment in the code
about cleaning up the dir is valid.
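
A minimal sketch of what "create without delete" could look like. This is not
the DefaultContainerExecutor code: the class name, the method signature, and
the use of java.nio.file are assumptions made for illustration. The point is
only that creating the app dir can be idempotent, so a second localization
leaves files from an earlier one in place.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the app-directory setup; not the real DCE code. */
public class AppDirSketch {

    /**
     * Ensures appcache/<appId> exists under each local dir without deleting
     * anything already there. Files.createDirectories does nothing when the
     * directory already exists.
     */
    public static void createAppDirs(List<Path> localDirs, String appId)
            throws IOException {
        for (Path localDir : localDirs) {
            Path appDir = localDir.resolve("appcache").resolve(appId);
            Files.createDirectories(appDir);
        }
    }

    public static void main(String[] args) throws IOException {
        Path localDir = Files.createTempDirectory("nm-local");
        List<Path> dirs = Collections.singletonList(localDir);
        String appId = "application_0001"; // made-up id for the example

        // First localization creates the dir and drops a resource into it.
        createAppDirs(dirs, appId);
        Path jobXml = localDir.resolve("appcache").resolve(appId).resolve("job.xml");
        Files.write(jobXml, "<configuration/>".getBytes());

        // Second localization for the same app must not wipe the first one.
        createAppDirs(dirs, appId);
        System.out.println(Files.exists(jobXml)); // true
    }
}
```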
> DefaultContainerExecutor removes appcache dir on every localization
> -------------------------------------------------------------------
>
> Key: MAPREDUCE-3159
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3159
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Blocker
> Attachments: mr-3159.txt, mr-3159.txt
>
>
> The DefaultContainerExecutor currently has code that removes the application
> dir from appcache/ in the local directories on every task localization. This
> causes any concurrent executing tasks from the same job to fail.