[ https://issues.apache.org/jira/browse/HADOOP-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581796#action_12581796 ]
Amar Kamat commented on HADOOP-2175:
------------------------------------
This patch incorporates Devaraj's comments. The changes are as follows:
- The call to lostTaskTracker now expects a reason/info for why/how the tracker
is lost.
- Losing a blacklisted tracker now happens in {{addTrackerFailure}}.
- Before a tracker is declared lost, a check is made that the tracker exists
in the JT, and its status is updated so that the TT gets reinitialized in the
next heartbeat cycle.
- The test case no longer depends on timing.
- The only change in {{MiniMRCluster}} concerns the default conf for JT/TT.
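
For readers without the patch at hand, the flow in the list above can be sketched roughly as below. This is only an illustrative model, not the code in the patch: the class {{TrackerBookkeepingSketch}}, its fields, the {{register}}, {{heartbeat}} and {{lostReason}} helpers, and the method signatures are all assumptions made for the example. The threshold of 4 is borrowed from the issue description.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the bookkeeping described above; NOT the Hadoop JobTracker code.
class TrackerBookkeepingSketch {
    // Assumed threshold, borrowed from the "4 mappers" in the issue description.
    static final int MAX_FAILURES = 4;

    private final Set<String> knownTrackers = new HashSet<>();
    private final Set<String> pendingReinit = new HashSet<>();
    private final Map<String, Integer> failureCounts = new HashMap<>();
    private final Map<String, String> lostReasons = new HashMap<>();

    // Registers a tracker with a fresh failure count.
    void register(String tracker) {
        knownTrackers.add(tracker);
        failureCounts.put(tracker, 0);
    }

    // Counts one task failure; on reaching the threshold the tracker is lost here.
    void addTrackerFailure(String tracker) {
        int count = failureCounts.merge(tracker, 1, Integer::sum);
        if (count == MAX_FAILURES) {
            lostTaskTracker(tracker, "blacklisted after " + count + " task failures");
        }
    }

    // Takes a reason and checks that the tracker is known before losing it.
    // Returns false when the tracker is unknown, so nothing is lost twice.
    boolean lostTaskTracker(String tracker, String reason) {
        if (!knownTrackers.remove(tracker)) {
            return false;
        }
        lostReasons.put(tracker, reason);
        pendingReinit.add(tracker); // picked up on the next heartbeat
        return true;
    }

    // Returns true when the tracker is told to reinitialize on this heartbeat.
    boolean heartbeat(String tracker) {
        if (pendingReinit.remove(tracker)) {
            register(tracker); // the tracker comes back with a clean slate
            return true;
        }
        return false;
    }

    // Returns the recorded reason, or null if the tracker was never lost.
    String lostReason(String tracker) {
        return lostReasons.get(tracker);
    }
}
```

In this sketch the fourth call to {{addTrackerFailure}} for a registered tracker loses it with a recorded reason, and the next {{heartbeat}} call for that tracker returns true exactly once.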
> Blacklisted hosts may not be able to serve map outputs
> ------------------------------------------------------
>
> Key: HADOOP-2175
> URL: https://issues.apache.org/jira/browse/HADOOP-2175
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Runping Qi
> Assignee: Amar Kamat
> Attachments: HADOOP-2175-v1.1.patch, HADOOP-2175-v1.patch,
> HADOOP-2175-v2.patch
>
>
> After a node fails 4 mappers (tasks), it is added to the blacklist, so it
> will no longer accept tasks.
> But it will continue to serve the map outputs of any mappers that ran
> successfully there.
> However, the node may not be able to serve the map outputs either.
> This will cause the reducers to mark the corresponding map outputs as coming
> from slow hosts,
> yet they will keep trying to fetch the map outputs from that node.
> The reducers may then wait forever.