[
https://issues.apache.org/jira/browse/HADOOP-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678010#action_12678010
]
Vinod K V commented on HADOOP-5376:
-----------------------------------
bq. In a discussion offline, Amareshwari explained that this issue occurs if a
setup/cleanup task is running on a TT that subsequently becomes lost, and the
task moves to the KILLED_UNCLEAN state. This causes the setup/cleanup task to be
incorrectly added to the list of tasks that need cleanup.
Confirmed the same from the logs.
{code}
2009-02-28 23:05:41,986 INFO org.apache.hadoop.mapred.JobTracker: Adding task
'attempt_200902261046_9662_m_007800_0' to tip task_200902261046_9662_m_007800,
for tracker '<tracker_host:port>'
2009-02-28 23:17:14,800 INFO org.apache.hadoop.mapred.TaskInProgress: Error
from attempt_200902261046_9662_m_007800_0: Lost task tracker:
<tracker_host:port>
{code}
As a result, the job's cleanup task got stuck: it is shown as pending on the
JT UI, and no subsequent attempts are launched for it, so the job hangs. I
tried killing the cleanup attempt from the client command line, thinking it
might get rescheduled, but that failed with the message "Could not kill task
attempt_200902261046_9662_m_007800_0". Even killing the job didn't work :(
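The failure mode described above can be modelled with a small, self-contained Java sketch. It is not the Hadoop source: the class, fields, and the guarded variant are hypothetical, and it assumes (this comment does not state it) that the job-level cleanup task carries an id just past the end of the array of regular map tasks, so that using the id as an index into that array goes out of bounds.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; all names here are hypothetical stand-ins.
class CleanupIndexSketch {
    // Stand-in for the job's regular map tasks (setup/cleanup tasks are NOT in here).
    static final String[] maps = {"map-0", "map-1", "map-2"};

    // Assumption: the job cleanup task is numbered right after the last map task.
    static final int JOB_CLEANUP_ID = maps.length;

    // Ids of attempts that ended in KILLED_UNCLEAN and await a task-cleanup attempt.
    static final List<Integer> mapCleanupTasks = new ArrayList<>();

    // Unguarded path: every KILLED_UNCLEAN attempt is queued, whatever its kind.
    static void onKilledUnclean(int taskId) {
        mapCleanupTasks.add(taskId);
    }

    // Hypothetical guard: setup/cleanup attempts are never queued for task cleanup.
    static void onKilledUncleanGuarded(int taskId, boolean isSetupOrCleanup) {
        if (!isSetupOrCleanup) {
            mapCleanupTasks.add(taskId);
        }
    }

    // Takes the next queued id and resolves it against the regular map tasks.
    static String obtainTaskCleanupTask() {
        if (mapCleanupTasks.isEmpty()) {
            return null;
        }
        int id = mapCleanupTasks.remove(0); // remove(int index), then unboxed
        return maps[id];                    // out of bounds when id == JOB_CLEANUP_ID
    }

    public static void main(String[] args) {
        onKilledUnclean(JOB_CLEANUP_ID);
        try {
            obtainTaskCleanupTask();
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("lookup failed: " + e.getClass().getSimpleName());
        }
    }
}
```

In this model the bad id has already been removed from the queue when the lookup throws, which is one way a task could end up neither cleaned up nor rescheduled. Whether the real code behaves the same way would need to be checked against the source.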
> JobInProgress.obtainTaskCleanupTask() throws an ArrayIndexOutOfBoundsException
> ------------------------------------------------------------------------------
>
> Key: HADOOP-5376
> URL: https://issues.apache.org/jira/browse/HADOOP-5376
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.20.0
> Reporter: Vinod K V
>