[ https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261811#comment-13261811 ]

Siddharth Seth commented on MAPREDUCE-3921:
-------------------------------------------
Thanks for the updated patch, Bikas. Will take a look. Still waiting for input
from the MR veterans on some of the previous comments about how things were
handled in 20, specifically killing map/reduce tasks on unhealthy nodes, and
treating 'node unhealthy' similar to 'fetch failure' (State Killed / Failed, as
well as counting towards max_attempts).

bq. About the OBSOLETE part. I get how it is used. What I don't get is why we
are marking a previously successful task as obsolete and invalid upon the
completion of a new task without first checking if the new task was itself
successful or not.

Are you considering leaving the task in the SUCCESSFUL state, even while it's
being retried, so that the Reduce *may* still be able to pull data before
there's a new SUCCESSFUL attempt?

Otherwise, marking the attempt as OBSOLETE and removing the task from
successAttemptCompletionEventNoMap (which tracks only SUCCESSFUL attempts)
seems like the correct thing to do.
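
To make the behaviour under discussion concrete, here is a rough sketch of the
bookkeeping as I read it from the comments above. It is not the code from the
patch: the class, the Status enum, and every method name are hypothetical, and
only the role of the map (task -> index of its successful completion event) is
taken from the discussion.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; not the actual MR AM implementation.
public class CompletionEventSketch {
    enum Status { SUCCEEDED, OBSOLETE, FAILED, KILLED }

    static final class Event {
        final String attemptId;
        Status status;

        Event(String attemptId, Status status) {
            this.attemptId = attemptId;
            this.status = status;
        }
    }

    // All completion events, in arrival order.
    private final List<Event> events = new ArrayList<>();

    // Task id -> index in `events` of that task's successful attempt.
    // Stands in for successAttemptCompletionEventNoMap from the discussion.
    private final Map<String, Integer> successEventNoByTask = new HashMap<>();

    void onAttemptCompleted(String taskId, String attemptId, Status status) {
        // The behaviour being questioned: the earlier successful attempt is
        // marked OBSOLETE whether or not the new attempt succeeded.
        Integer previous = successEventNoByTask.remove(taskId);
        if (previous != null) {
            events.get(previous).status = Status.OBSOLETE;
        }
        events.add(new Event(attemptId, status));
        // Only SUCCEEDED attempts are tracked in the map.
        if (status == Status.SUCCEEDED) {
            successEventNoByTask.put(taskId, events.size() - 1);
        }
    }

    Status statusOf(int eventNo) {
        return events.get(eventNo).status;
    }

    boolean hasSuccessfulAttempt(String taskId) {
        return successEventNoByTask.containsKey(taskId);
    }
}
```

With this sketch, a SUCCEEDED attempt followed by a KILLED attempt for the same
task leaves the first event OBSOLETE and the task with no tracked successful
attempt. The alternative raised in the question above would amount to doing the
remove-and-mark step only when the new status is SUCCEEDED.
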
> MR AM should act on the nodes liveliness information when nodes go
> up/down/unhealthy
> ------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-3921
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am, mrv2
> Affects Versions: 0.23.0
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Bikas Saha
> Fix For: 0.23.2
>
> Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch,
> MAPREDUCE-3921-4.patch, MAPREDUCE-3921-5.patch, MAPREDUCE-3921-6.patch,
> MAPREDUCE-3921-7.patch, MAPREDUCE-3921-branch-0.23.patch,
> MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch,
> MAPREDUCE-3921.patch
>
>
--
This message is automatically generated by JIRA.