[
https://issues.apache.org/jira/browse/MAPREDUCE-7264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
tuyu updated MAPREDUCE-7264:
----------------------------
Description:
During a rolling restart of the NodeManagers, some MapReduce jobs exit because of
an unhandled TA_TOO_MANY_FETCH_FAILURE event.
Details:
If a task is in the SUCCEEDED state and receives
When a map task sends the done RPC to the AM, the AM transitions the task to the
SUCCESS_FINISHING_CONTAINER state and adds it to the
mapAttemptCompletionEvents list. When a reducer then sends the
getMapAttemptCompletionEvents RPC to fetch the completed maps, a task that is
still in the SUCCESS_FINISHING_CONTAINER state is included in the response.
was: During a rolling restart of the NodeManagers, some MapReduce jobs exit because
of an unhandled
> overall reduction of ApplicationMaster exit because of unhandled
> TA_TOO_MANY_FETCH_FAILURE event
> ------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-7264
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7264
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: applicationmaster
> Affects Versions: 3.2.1
> Reporter: tuyu
> Priority: Critical
> Fix For: 3.2.1
>
>
> During a rolling restart of the NodeManagers, some MapReduce jobs exit because of
> an unhandled TA_TOO_MANY_FETCH_FAILURE event.
> Details:
> If a task is in the SUCCEEDED state and receives
> When a map task sends the done RPC to the AM, the AM transitions the task to the
> SUCCESS_FINISHING_CONTAINER state and adds it to the
> mapAttemptCompletionEvents list. When a reducer then sends the
> getMapAttemptCompletionEvents RPC to fetch the completed maps, a task that is
> still in the SUCCESS_FINISHING_CONTAINER state is included in the response.
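
The sequence described in the report can be modeled as a small state machine.
What follows is a minimal sketch in Java, not the actual Hadoop task attempt
code. Only the state names SUCCEEDED and SUCCESS_FINISHING_CONTAINER and the
event TA_TOO_MANY_FETCH_FAILURE come from the report. The class name
AttemptStateSketch, the RUNNING and FAILED states, the TA_DONE and
TA_CONTAINER_COMPLETED events, the transition table, and the flag
handleFetchFailureWhileFinishing are assumptions made for illustration. The
flag shows one possible way to tolerate the event and is not claimed to be the
fix applied to this issue.

```java
import java.util.EnumMap;
import java.util.Map;

public class AttemptStateSketch {
    // SUCCEEDED and SUCCESS_FINISHING_CONTAINER are named in the report;
    // RUNNING and FAILED are assumed here to complete the model.
    public enum State { RUNNING, SUCCESS_FINISHING_CONTAINER, SUCCEEDED, FAILED }

    // TA_TOO_MANY_FETCH_FAILURE is named in the report; the other two
    // events are assumptions standing in for "done RPC" and "container exited".
    public enum Event { TA_DONE, TA_CONTAINER_COMPLETED, TA_TOO_MANY_FETCH_FAILURE }

    // Transition table: current state -> (event -> next state).
    private final Map<State, Map<Event, State>> table =
        new EnumMap<State, Map<Event, State>>(State.class);
    private State state = State.RUNNING;

    public AttemptStateSketch(boolean handleFetchFailureWhileFinishing) {
        add(State.RUNNING, Event.TA_DONE, State.SUCCESS_FINISHING_CONTAINER);
        add(State.SUCCESS_FINISHING_CONTAINER, Event.TA_CONTAINER_COMPLETED, State.SUCCEEDED);
        // Assumption: a fully SUCCEEDED attempt can be failed by fetch failures.
        add(State.SUCCEEDED, Event.TA_TOO_MANY_FETCH_FAILURE, State.FAILED);
        if (handleFetchFailureWhileFinishing) {
            // Hypothetical extra transition that closes the gap in the report.
            add(State.SUCCESS_FINISHING_CONTAINER, Event.TA_TOO_MANY_FETCH_FAILURE, State.FAILED);
        }
    }

    private void add(State from, Event on, State to) {
        Map<Event, State> row = table.get(from);
        if (row == null) {
            row = new EnumMap<Event, State>(Event.class);
            table.put(from, row);
        }
        row.put(on, to);
    }

    public State getState() {
        return state;
    }

    // Applies an event; an event with no registered transition is an error,
    // which models the "unhandled event" that makes the job exit.
    public State handle(Event event) {
        Map<Event, State> row = table.get(state);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event " + event + " in state " + state);
        }
        state = next;
        return state;
    }

    public static void main(String[] args) {
        AttemptStateSketch attempt = new AttemptStateSketch(false);
        attempt.handle(Event.TA_DONE); // map reported done, container still finishing
        try {
            // A reducer already learned about this map and its fetches failed.
            attempt.handle(Event.TA_TOO_MANY_FETCH_FAILURE);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the window between TA_DONE and TA_CONTAINER_COMPLETED is the
problem: the attempt is already visible to reducers but cannot yet accept a
fetch failure report.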