[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965147#action_12965147
 ] 

Joydeep Sen Sarma commented on MAPREDUCE-2206:
----------------------------------------------

AFAIK, from looking at the code, there is no requirement for the cleanup task to 
go to the same machine. It happens to go to the same machine because whenever a 
task reports failed/killed, a slot is freed up and the JT schedules the newly 
created cleanup task on the same TT. But there is no hard requirement for this 
in the code, and it is possible that the JT does not schedule it on the same 
machine (for example, if the TT was previously oversubscribed).
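
To illustrate the point above, here is a toy sketch of the slot accounting. It 
is not Hadoop code: TaskTracker and schedule_cleanup are hypothetical names, 
and the rule that the reporting TT is considered first is an assumption made 
only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskTracker:
    """Hypothetical stand-in for a TT; not a Hadoop class."""
    name: str
    max_slots: int
    running: int  # can exceed max_slots when the TT is oversubscribed

    def has_free_slot(self) -> bool:
        return self.running < self.max_slots


def schedule_cleanup(trackers, failed_on):
    """Return the name of the TT that receives the cleanup task, or None."""
    by_name = {tt.name: tt for tt in trackers}
    # The failed/killed report frees the slot the task was occupying.
    by_name[failed_on].running -= 1
    # Assumption: the reporting TT is considered first, then the rest in list order.
    order = [by_name[failed_on]] + [tt for tt in trackers if tt.name != failed_on]
    for tt in order:
        if tt.has_free_slot():
            tt.running += 1
            return tt.name
    return None


# Usual case: the freed slot is reused, so the cleanup lands on the same TT.
normal = [TaskTracker("tt1", 2, 2), TaskTracker("tt2", 2, 1)]
print(schedule_cleanup(normal, "tt1"))  # tt1

# Oversubscribed TT: freeing one slot still leaves it full, so another TT gets it.
oversubscribed = [TaskTracker("tt1", 2, 3), TaskTracker("tt2", 2, 1)]
print(schedule_cleanup(oversubscribed, "tt1"))  # tt2
```

In this model, nothing pins the cleanup task to the TT where the task failed. 
It only ends up there when the freed slot is the first one available.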

If the failure was caused by problems with task localization (for example), 
the results are truly miserable. I have hit scenarios where two 10-minute task 
timeouts were required to fail a task on a bad node (one for the task failure 
and one for its cleanup).

> The task-cleanup tasks should be optional
> -----------------------------------------
>
>                 Key: MAPREDUCE-2206
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2206
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>    Affects Versions: 0.23.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.23.0
>
>
> For jobs that do not use OutputCommitter.abort(), it should be possible to 
> turn the task-cleanup tasks off. This improves job latency because failed 
> tasks are often the bottleneck of a job.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
