[jira] [Commented] (MAPREDUCE-3347) Resource manager is not respawning MRAppMaster process if it goes down in the middle of job execution and the job is getting failed.

Ramgopal N (Commented) (JIRA) Fri, 04 Nov 2011 21:59:20 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144582#comment-13144582
 ]


Ramgopal N commented on MAPREDUCE-3347:
---------------------------------------

Hi Vinod,
Setting yarn.resourcemanager.am.max-retries in yarn-site.xml makes the RM retry 
the ApplicationMaster the specified number of times before failing the job. Thanks
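
For illustration, a yarn-site.xml entry for this property might look like the 
sketch below. The property name is the one mentioned above; the value 2 is only 
an assumed example, not a setting recommended in this issue.

```xml
<!-- yarn-site.xml fragment (sketch); the value is an assumed example -->
<property>
  <name>yarn.resourcemanager.am.max-retries</name>
  <value>2</value>
</property>
```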



                
> Resource manager is not respawning MRAppMaster process if it goes down in the 
> middle of job execution and the job is getting failed.
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3347
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3347
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Ramgopal N
>
> The ApplicationMaster service should recover the job if the MRAppMaster process 
> goes down in the middle of job execution. If not, the MRAppMaster process 
> becomes a single point of failure for the job and loses the advantage of the 
> MRV1 framework.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
