[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876156#action_12876156
 ] 

Vinod K V commented on MAPREDUCE-1682:
--------------------------------------

The bug described above is also responsible for some corner cases in which a 
job never finishes. We saw scenarios in which speculative attempts get 
launched and are then killed within seconds. This repeats indefinitely, so 
the job never ends.

> Tasks should not be scheduled after tip is killed/failed.
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-1682
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1682
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>            Reporter: Amareshwari Sriramadasu
>             Fix For: 0.20.3
>
>
> We have seen the following scenario in our cluster:
> A job got marked failed because four attempts of a TIP failed. This kills 
> all the map and reduce TIPs. Then a job-cleanup attempt is launched.
> The job-cleanup attempt failed because it could not report status for 10 
> minutes. There were 3 such job-cleanup attempts, so the job got killed only 
> after half an hour.
> While waiting for the job cleanup to finish, the JobTracker scheduled many 
> tasks of the job on TaskTrackers and sent a KillTaskAction in the next 
> heartbeat. This wastes a lot of resources. We should avoid scheduling tasks 
> of a TIP once the TIP is killed/failed.
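
To make the proposed behaviour concrete, here is a minimal sketch of the kind 
of guard the description asks for. It is not the actual JobTracker code and 
not the patch for this issue: the class TipSchedulingSketch, the nested Tip 
class, and the methods isSchedulable() and selectSchedulable() are 
hypothetical names used only for illustration. The limit of four attempts is 
taken from the scenario above.

```java
import java.util.ArrayList;
import java.util.List;

public class TipSchedulingSketch {

    /** Hypothetical stand-in for a task-in-progress (TIP); not the Hadoop class. */
    static class Tip {
        private final int maxAttempts;
        private int failedAttempts = 0;
        private boolean killed = false;

        Tip(int maxAttempts) {
            this.maxAttempts = maxAttempts;
        }

        void recordFailedAttempt() {
            failedAttempts++;
        }

        void kill() {
            killed = true;
        }

        boolean isFailed() {
            return failedAttempts >= maxAttempts;
        }

        boolean isKilled() {
            return killed;
        }

        /** Guard: a TIP that is killed or failed must never get a new attempt. */
        boolean isSchedulable() {
            return !isKilled() && !isFailed();
        }
    }

    /** Returns only the TIPs that may still be handed to a TaskTracker. */
    static List<Tip> selectSchedulable(List<Tip> tips) {
        List<Tip> result = new ArrayList<>();
        for (Tip tip : tips) {
            if (tip.isSchedulable()) {
                result.add(tip);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Tip healthy = new Tip(4);

        Tip failed = new Tip(4);
        for (int i = 0; i < 4; i++) {
            failed.recordFailedAttempt();
        }

        Tip killed = new Tip(4);
        killed.kill();

        List<Tip> tips = new ArrayList<>();
        tips.add(healthy);
        tips.add(failed);
        tips.add(killed);

        System.out.println("schedulable TIPs: " + selectSchedulable(tips).size());
    }
}
```

With a check of this kind applied before any attempt is handed out, including 
speculative ones, a killed/failed TIP would not be launched only to receive a 
KillTaskAction in the next heartbeat.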

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
