JobTracker's TaskCommitQueue is vulnerable to non-IOExceptions
--------------------------------------------------------------

                 Key: HADOOP-2051
                 URL: https://issues.apache.org/jira/browse/HADOOP-2051
             Project: Hadoop
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.15.0
            Reporter: Arun C Murthy
            Assignee: Arun C Murthy
            Priority: Blocker
             Fix For: 0.15.0

The {{JobTracker#TaskCommitQueue#run}} method only handles {{IOException}}s, so any other exception thrown inside it terminates the thread. Christian Kunz ran into a scenario where a job was stuck with all tasks in the {{COMMIT_PENDING}} state, and the stack traces showed that the "Task Commit Thread" was no longer running.

The work-around is to model {{TaskCommitQueue#run}} along the lines of the other long-running threads in the {{JobTracker}} ({{ExpireLaunchingTasks}}, {{ExpireTrackers}}, etc.): catch, log, and ignore any {{Exception}} inside the loop.
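
The catch, log, and ignore loop described above could look roughly like the sketch below. It is not the actual {{TaskCommitQueue}} code: the class {{ResilientCommitLoop}}, the {{CommitTask}} interface, and the in-memory {{errors}} list (standing in for a real logger) are all hypothetical names introduced for illustration, and treating interruption as the shutdown signal is an assumption.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

/** Illustrative stand-in for a commit-queue worker; not the actual JobTracker code. */
class ResilientCommitLoop implements Runnable {

    /** Hypothetical unit of work whose commit may fail in checked or unchecked ways. */
    interface CommitTask {
        void commit() throws IOException;
    }

    final BlockingQueue<CommitTask> queue = new LinkedBlockingQueue<>();
    final List<String> errors = new CopyOnWriteArrayList<>(); // stand-in for a real logger

    @Override
    public void run() {
        while (true) {
            try {
                queue.take().commit();
            } catch (InterruptedException ie) {
                // Assumption: interruption is the shutdown signal, so leave the loop.
                Thread.currentThread().interrupt();
                return;
            } catch (Exception e) {
                // Catch, log, and ignore: one bad task must not kill the thread.
                errors.add(e.getClass().getSimpleName() + ": " + e.getMessage());
            }
        }
    }

    /** Feeds two failing tasks and one good task through the loop; returns the logged errors. */
    static List<String> demo() {
        ResilientCommitLoop loop = new ResilientCommitLoop();
        CountDownLatch committed = new CountDownLatch(1);
        loop.queue.add(() -> { throw new IOException("disk full"); });
        loop.queue.add(() -> { throw new IllegalStateException("bad state"); });
        loop.queue.add(committed::countDown);

        Thread worker = new Thread(loop, "Task Commit Thread (sketch)");
        worker.start();
        try {
            committed.await();  // the third task ran, so the thread survived both failures
            worker.interrupt(); // request shutdown
            worker.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("demo interrupted", e);
        }
        return loop.errors;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this sketch the {{IllegalStateException}} is the interesting case: with a handler for {{IOException}} only, it would propagate out of {{run}} and end the thread, leaving the third task uncommitted. Note that {{catch (Exception e)}} still lets {{Error}}s such as {{OutOfMemoryError}} through, which matches the scope proposed in this issue.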