[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537275#comment-15537275
 ] 

Hitesh Shah commented on MAPREDUCE-6776:
----------------------------------------

From a practical standpoint, this is not really an incompatible change: only 
internal behavior is changing, from no retries to up to 3 retries. 

However, from a purely theoretical compat perspective, a public default value 
is being changed, as well as the value in mapred-default.xml. Tests that 
previously verified this behavior would expect an immediate failure, whereas 
now the client might reconnect, or fail only after 6 seconds or so. 
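
To make the "6 seconds or so" figure concrete, here is a minimal sketch of the 
arithmetic. It assumes the client makes one initial attempt and then sleeps 
for the retry interval before each retry. This is not the actual Hadoop client 
code: RetryDelaySketch, Outcome and simulate() are hypothetical names, and the 
sleep is only counted, not performed.

```java
public class RetryDelaySketch {

    /** Result of a simulated lookup (hypothetical type, for illustration only). */
    static final class Outcome {
        final boolean succeeded;
        final int attempts;
        final long waitedMs;

        Outcome(boolean succeeded, int attempts, long waitedMs) {
            this.succeeded = succeeded;
            this.attempts = attempts;
            this.waitedMs = waitedMs;
        }
    }

    /**
     * Simulates a client that tries once and then retries up to maxRetries
     * times, waiting retryIntervalMs before each retry. The call is assumed
     * to fail failuresBeforeSuccess times before it would succeed.
     */
    static Outcome simulate(int maxRetries, long retryIntervalMs,
                            int failuresBeforeSuccess) {
        long waitedMs = 0;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (attempt > 0) {
                // Stand-in for Thread.sleep(retryIntervalMs).
                waitedMs += retryIntervalMs;
            }
            if (attempt >= failuresBeforeSuccess) {
                return new Outcome(true, attempt + 1, waitedMs);
            }
        }
        // Every attempt failed; the client gives up here.
        return new Outcome(false, maxRetries + 1, waitedMs);
    }

    public static void main(String[] args) {
        // Old default (0 retries) versus proposed default (3 retries),
        // both with the 2s interval, against a server that never answers.
        Outcome oldDefault = simulate(0, 2000, Integer.MAX_VALUE);
        Outcome newDefault = simulate(3, 2000, Integer.MAX_VALUE);
        System.out.println("old: failed after " + oldDefault.waitedMs + " ms");
        System.out.println("new: failed after " + newDefault.waitedMs + " ms");
    }
}
```

Under these assumptions, a test that asserts an immediate failure would now 
have to wait through three 2s intervals, unless it explicitly sets 
yarn.app.mapreduce.client.job.max-retries back to 0. 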

I suggest pushing this to trunk for sure as we are still in the alpha stage of 
releases. As for branch-2, I would check with the 2.8 release manager. 

> yarn.app.mapreduce.client.job.max-retries should have a more useful default
> ---------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6776
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6776
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 2.8.0
>            Reporter: Daniel Templeton
>            Assignee: Miklos Szegedi
>         Attachments: MAPREDUCE-6776.001.patch, MAPREDUCE-6776.002.patch, 
> MAPREDUCE-6776.003.patch
>
>
> The default is 0, so any communication failure results in a client failure.  
> Oozie doesn't like that.  If the RM is failing over and Oozie gets a 
> communication failure, it assumes the target job has failed.  I propose 
> raising the default to something modest like 3 or 5.  The default retry 
> interval is 2s.


