[
https://issues.apache.org/jira/browse/HADOOP-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582376#action_12582376
]
Sameer Paranjpye commented on HADOOP-2175:
------------------------------------------
Let's not confuse lost and blacklisted tasktrackers. A lost tasktracker is one
that doesn't check in with the JT, whereas a tasktracker blacklisted for a job
is one that causes tasks to fail for that job. The two need to be handled very
differently.
We should move this issue to 0.18. We don't have a coherent model for task
failures; the blacklisting logic is already messy, and adding half a dozen if
statements will only make it messier.
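
To make the lost-versus-blacklisted distinction concrete, here is a minimal
sketch. It is not taken from the Hadoop code base: the class, the method names,
and the heartbeat timeout are hypothetical, and the failure threshold of 4 is
borrowed from the issue description quoted below. It only shows that the two
states come from different signals (missed heartbeats versus per-job task
failures) and so can be modelled separately.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Hadoop code base.
public class TrackerStateSketch {
    public enum TrackerState { HEALTHY, LOST, BLACKLISTED_FOR_JOB }

    // Assumed value, for illustration only.
    public static final long HEARTBEAT_TIMEOUT_MS = 10 * 60 * 1000L;
    // Threshold taken from the issue description ("fails 4 mappers").
    public static final int MAX_TASK_FAILURES_PER_JOB = 4;

    private final Map<String, Long> lastHeartbeatMs = new HashMap<>();
    private final Map<String, Integer> failuresForJob = new HashMap<>();

    public void recordHeartbeat(String tracker, long nowMs) {
        lastHeartbeatMs.put(tracker, nowMs);
    }

    public void recordTaskFailure(String tracker) {
        failuresForJob.merge(tracker, 1, Integer::sum);
    }

    // A tracker is LOST when it stops checking in, and BLACKLISTED_FOR_JOB
    // when it still checks in but has failed too many tasks of this job.
    public TrackerState classify(String tracker, long nowMs) {
        Long last = lastHeartbeatMs.get(tracker);
        if (last == null || nowMs - last > HEARTBEAT_TIMEOUT_MS) {
            return TrackerState.LOST;
        }
        if (failuresForJob.getOrDefault(tracker, 0) >= MAX_TASK_FAILURES_PER_JOB) {
            return TrackerState.BLACKLISTED_FOR_JOB;
        }
        return TrackerState.HEALTHY;
    }
}
```

Keeping the two checks in one place, rather than spreading them across
conditionals, is one way to avoid the extra if statements the comment warns
about.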
> Blacklisted hosts may not be able to serve map outputs
> ------------------------------------------------------
>
> Key: HADOOP-2175
> URL: https://issues.apache.org/jira/browse/HADOOP-2175
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Runping Qi
> Assignee: Amar Kamat
> Fix For: 0.18.0
>
> Attachments: HADOOP-2175-v1.1.patch, HADOOP-2175-v1.patch,
> HADOOP-2175-v2.patch, HADOOP-2175-v2.patch
>
>
> After a node fails 4 mappers (tasks), it is added to the blacklist and will
> no longer accept tasks.
> However, it will continue to serve the map outputs of any mappers that ran
> successfully there.
> The node may not be able to serve those map outputs either.
> In that case the reducers mark the corresponding map outputs as coming from
> slow hosts
> but keep trying to fetch the map outputs from that node.
> This may lead to waiting forever.
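
The unbounded retry described above could be avoided by counting fetch failures
per map output and giving up after a limit. The sketch below is one possible
shape for that bookkeeping, not the approach taken in the attached patches: the
class name, the method names, and the limit of 3 are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and the limit are assumptions, not Hadoop APIs.
public class FetchFailureSketch {
    // Assumed limit, for illustration only.
    public static final int MAX_FETCH_FAILURES = 3;

    private final Map<String, Integer> failures = new HashMap<>();

    // Records one failed fetch of a map output. Returns true once the limit
    // is reached, meaning the reducer should stop retrying and report the
    // map output as unavailable so the map task can be rerun elsewhere.
    public boolean recordFailure(String mapTaskId) {
        int count = failures.merge(mapTaskId, 1, Integer::sum);
        return count >= MAX_FETCH_FAILURES;
    }

    // A successful fetch clears the failure count for that map output.
    public void recordSuccess(String mapTaskId) {
        failures.remove(mapTaskId);
    }
}
```

With a bound like this, a reducer never waits forever on a node that cannot
serve its outputs; what the JobTracker does with the report is left open here.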