[ https://issues.apache.org/jira/browse/HADOOP-5985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722282#action_12722282 ]

Matei Zaharia commented on HADOOP-5985:
---------------------------------------

I think this approach is the right one, Aaron. However, the tricky part will be 
determining what metrics to use for deciding that a TaskTracker is slow at 
serving outputs and for deciding on a reducer when you want to try a second 
location.

For identifying slow TT's, just looking at the number of shards served might not 
be enough. First of all, TT's might have run different numbers of tasks. Second, 
TT's might become slow partway through their lifetime. If most TT's have served 
1000/1000 shards while one is at 800/1000 and then starts thrashing, you have to 
recognize that at that point. I think there are two possible ways to go:
1) Instead of looking at shards served, ask reducers periodically about how 
many shards they are in the process of fetching from each TT. If there is a TT 
that a lot of reducers are currently fetching from, it's probably being slow 
(we could even say "when there are only N tasktrackers left that we are 
fetching from, start considering them slow"). This might have to be weighted by 
number of shards that ran per node. The idea is to match the same kind of 
detection that a human operator may do ("hey, everyone is waiting on 
tasktracker X").
2) Look at rate of data being served (in bytes/second) from each TT, as well as 
the demand (how many shards are being requested). Mark TT's where the rate is 
below some threshold and the demand is above some other threshold as slow. 
(These thresholds may be calculated based on how the other TT's did).
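To make heuristic (2) concrete, here is a minimal sketch of what the detection 
might look like. All names and thresholds are illustrative, not existing Hadoop 
APIs: the rate threshold is derived from the other TT's (here, a fraction of 
the median serve rate), and a TT is only flagged when demand on it is also high, 
so an idle-but-slow node isn't falsely marked.

```java
import java.util.*;

// Hypothetical sketch of heuristic (2): flag a TaskTracker as slow when its
// serve rate is well below its peers' while demand on it stays high.
// Names and thresholds are illustrative only.
public class SlowTrackerDetector {

    // rateByTT: bytes/second currently served by each TT
    // demandByTT: number of shards reducers are currently requesting from each TT
    static List<String> findSlowTrackers(Map<String, Double> rateByTT,
                                         Map<String, Integer> demandByTT,
                                         double rateFraction, int minDemand) {
        // Derive the rate threshold from how the other TT's are doing:
        // here, a fixed fraction of the median serve rate.
        double[] rates = rateByTT.values().stream()
                .mapToDouble(Double::doubleValue).sorted().toArray();
        double median = rates.length % 2 == 1
                ? rates[rates.length / 2]
                : (rates[rates.length / 2 - 1] + rates[rates.length / 2]) / 2.0;
        double rateThreshold = rateFraction * median;

        List<String> slow = new ArrayList<>();
        for (Map.Entry<String, Double> e : rateByTT.entrySet()) {
            int demand = demandByTT.getOrDefault(e.getKey(), 0);
            // Flag only if the rate is low AND many reducers are waiting on it;
            // a low rate with low demand just means nobody needs its outputs.
            if (e.getValue() < rateThreshold && demand >= minDemand) {
                slow.add(e.getKey());
            }
        }
        return slow;
    }

    public static void main(String[] args) {
        Map<String, Double> rates = new HashMap<>();
        rates.put("tt1", 50e6); rates.put("tt2", 48e6);
        rates.put("tt3", 52e6); rates.put("tt4", 2e6);   // struggling node
        Map<String, Integer> demand = new HashMap<>();
        demand.put("tt1", 3); demand.put("tt2", 2);
        demand.put("tt3", 4); demand.put("tt4", 40);     // everyone waits on it
        System.out.println(findSlowTrackers(rates, demand, 0.5, 10));
    }
}
```

The demand condition is what separates this from a naive rate check: it matches 
the "everyone is waiting on tasktracker X" intuition from option (1).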

For deciding when a reducer should try a second location for a shard, the tricky 
part will be telling heavily loaded TT's apart from genuinely slow ones. The 
switching between locations can't be too aggressive.

> A single slow (but not dead) map TaskTracker impedes MapReduce progress
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-5985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5985
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.18.3
>            Reporter: Aaron Kimball
>
> We see cases where there may be a large number of mapper nodes running many 
> tasks (e.g., a thousand). The reducers will pull 980 of the map task 
> intermediate files down, but will be unable to retrieve the final 
> intermediate shards from the last node. The TaskTracker on that node returns 
> data to reducers either slowly or not at all, but its heartbeat messages make 
> it back to the JobTracker -- so the JobTracker doesn't mark the tasks as 
> failed. Manually stopping the offending TaskTracker works to migrate the 
> tasks to other nodes, where the shuffling process finishes very quickly. Left 
> on its own, it can take hours to unjam itself otherwise.
> We need a mechanism for reducers to provide feedback to the JobTracker that 
> one of the mapper nodes should be regarded as lost.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.