[ https://issues.apache.org/jira/browse/MAPREDUCE-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15530134#comment-15530134 ]
Miklos Szegedi commented on MAPREDUCE-6776:
-------------------------------------------
The unit test failure
(org.apache.hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager)
must have been intermittent and unrelated to this patch. I ran the test
locally and it passes.
> yarn.app.mapreduce.client.job.max-retries should have a more useful default
> ---------------------------------------------------------------------------
>
> Key: MAPREDUCE-6776
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6776
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client
> Affects Versions: 2.8.0
> Reporter: Daniel Templeton
> Assignee: Miklos Szegedi
> Attachments: MAPREDUCE-6776.001.patch, MAPREDUCE-6776.002.patch,
> MAPREDUCE-6776.003.patch
>
>
> The default is 0, so any communication failure results in a client failure.
> Oozie doesn't like that. If the RM is failing over and Oozie gets a
> communication failure, it assumes the target job has failed. I propose
> raising the default to something modest like 3 or 5. The default retry
> interval is 2s.
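
To illustrate the behaviour described above, here is a minimal sketch in
plain Java with no Hadoop dependency. The names RetrySketch and
callWithRetries are hypothetical, and this is not the actual MapReduce client
code. It only assumes that "max-retries" means the number of extra attempts
after the first one, with a fixed sleep between attempts. Under that
assumption, a value of 0 turns a single communication failure into a client
failure, while a value such as 3 with a 2s interval may ride out a short RM
failover.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Hadoop: retries an operation that may
// fail with an IOException (standing in for a communication failure).
class RetrySketch {

    // Runs op once, then up to maxRetries more times, sleeping
    // retryIntervalMs between attempts. Rethrows the last IOException
    // once all attempts are used up.
    static <T> T callWithRetries(Callable<T> op, int maxRetries,
                                 long retryIntervalMs) throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                last = e;
                // Sleep only if another attempt will follow.
                if (attempt < maxRetries) {
                    Thread.sleep(retryIntervalMs);
                }
            }
        }
        throw last;
    }
}
```

With maxRetries set to 0 the loop body runs exactly once, so the first
IOException reaches the caller. With maxRetries set to 3 and an interval of
2000 ms, the caller sees a failure only if four consecutive attempts fail.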