[
https://issues.apache.org/jira/browse/MAPREDUCE-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542570#comment-13542570
]
Karthik Kambatla commented on MAPREDUCE-2217:
---------------------------------------------
The patch posted on 16/Nov fixes the issue.
To verify this, I ran a 4-node Hadoop cluster with both MR-2217.patch and
expose-bug-mr-2217.patch applied. The tasks assigned to machine01 time out,
are subsequently scheduled on other nodes, and the job completes. Without
MR-2217.patch, the job makes no progress even after an hour. I used the pi
job with 8 mappers and 1000 input splits for this.
> The expire launching task should cover the UNASSIGNED task
> ----------------------------------------------------------
>
> Key: MAPREDUCE-2217
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2217
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: jobtracker
> Affects Versions: 0.23.0
> Reporter: Scott Chen
> Assignee: Karthik Kambatla
> Fix For: 0.24.0
>
> Attachments: expose-bug-mr-2217.patch, MAPREDUCE-2217.1.txt,
> MR-2217.patch
>
>
> The ExpireLaunchingTask thread kills tasks that have been scheduled but
> have not responded.
> Currently, if a task is scheduled on a tasktracker and for some reason the
> tasktracker cannot move it to RUNNING, the task just hangs in the
> UNASSIGNED state and the JobTracker keeps waiting for it.
> JobTracker.ExpireLaunchingTask should be able to kill this task.
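
The behaviour requested above can be illustrated with a minimal, standalone
sketch. It is not the JobTracker code and not the attached patch: the class
LaunchExpirySketch, its methods, and the 10-second timeout are all
hypothetical names and values chosen for illustration. The idea it shows is
that a status report only stops the launch timer once the attempt has left
UNASSIGNED, so an attempt that stays UNASSIGNED is eventually expired and can
be rescheduled elsewhere.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; names do not come from the Hadoop code base.
public class LaunchExpirySketch {
    enum State { UNASSIGNED, RUNNING }

    // Attempt id -> time (ms) at which the attempt was handed to a tasktracker.
    private final Map<String, Long> launching = new LinkedHashMap<>();
    private final long timeoutMs;

    LaunchExpirySketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Start the launch timer when an attempt is scheduled on a tasktracker.
    void onScheduled(String attemptId, long nowMs) {
        launching.put(attemptId, nowMs);
    }

    // Stop the timer only once the attempt has left UNASSIGNED.
    // A report that still says UNASSIGNED keeps the attempt under watch.
    void onStatusReport(String attemptId, State state) {
        if (state != State.UNASSIGNED) {
            launching.remove(attemptId);
        }
    }

    // Return the attempts that have waited longer than the timeout and
    // stop tracking them; the caller would kill and reschedule these.
    List<String> expire(long nowMs) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = launching.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs - entry.getValue() > timeoutMs) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        LaunchExpirySketch tracker = new LaunchExpirySketch(10_000);
        tracker.onScheduled("attempt_a", 0);
        tracker.onScheduled("attempt_b", 0);
        tracker.onStatusReport("attempt_a", State.UNASSIGNED); // still stuck
        tracker.onStatusReport("attempt_b", State.RUNNING);    // launched normally
        System.out.println(tracker.expire(10_001)); // prints [attempt_a]
    }
}
```

Passing the current time in as a parameter, rather than reading the system
clock inside the class, is a choice made here only to keep the sketch
deterministic and easy to test.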