[ https://issues.apache.org/jira/browse/HADOOP-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12463087 ]
Arun C Murthy commented on HADOOP-600:
--------------------------------------

Attached a straightforward fix: lock the JobTracker before locking 'taskTrackers' and 'trackerExpiryQueue'. I didn't bother building a list of dead task trackers and then locking the JobTracker, since the inner loop only checks timestamps and hence shouldn't be a big deal... :-)

> Race condition in JobTracker updating the task tracker's status while declaring it lost
> ---------------------------------------------------------------------------------------
>
>                 Key: HADOOP-600
>                 URL: https://issues.apache.org/jira/browse/HADOOP-600
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.7.1
>            Reporter: Owen O'Malley
>         Assigned To: Arun C Murthy
>             Fix For: 0.10.1
>
>         Attachments: HADOOP-600_20070108_1.patch
>
>
> There was a case where the JobTracker lost track of a set of tasks that were on a task tracker. It appears to be a race condition because the ExpireTrackers thread doesn't lock the JobTracker while updating the state.
> The fix would be to build a list of dead task trackers and then lock the job tracker while updating their status.
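
The lock ordering described in the comment above (JobTracker first, then 'taskTrackers', then 'trackerExpiryQueue') can be sketched roughly as follows. This is not the attached patch. The class LockOrderSketch, its methods (heartbeat, expireTrackers, liveTrackerCount) and the field types are hypothetical stand-ins chosen for illustration; only the names 'taskTrackers' and 'trackerExpiryQueue' come from the comment. The point is that the heartbeat path and the expiry path take the same locks in the same order, so a tracker's status cannot be updated halfway through declaring it lost.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, heavily simplified stand-in for the JobTracker (not the real Hadoop class).
class LockOrderSketch {
    // Assumed shape: tracker name -> timestamp of its last heartbeat.
    private final Map<String, Long> taskTrackers = new HashMap<>();
    // Assumed shape: names of trackers that are candidates for expiry.
    private final Set<String> trackerExpiryQueue = new LinkedHashSet<>();

    // Heartbeat path: outer lock on the tracker object itself, then the two collections.
    void heartbeat(String name, long now) {
        synchronized (this) {
            synchronized (taskTrackers) {
                synchronized (trackerExpiryQueue) {
                    taskTrackers.put(name, now);
                    trackerExpiryQueue.add(name);
                }
            }
        }
    }

    // Expiry path: takes the same three locks in the same order as heartbeat().
    // The loop body only compares timestamps, so the outer lock is held briefly.
    List<String> expireTrackers(long now, long expiryMillis) {
        List<String> lost = new ArrayList<>();
        synchronized (this) {
            synchronized (taskTrackers) {
                synchronized (trackerExpiryQueue) {
                    Iterator<String> it = trackerExpiryQueue.iterator();
                    while (it.hasNext()) {
                        String name = it.next();
                        Long lastSeen = taskTrackers.get(name);
                        if (lastSeen != null && now - lastSeen > expiryMillis) {
                            it.remove();               // drop from the expiry queue
                            taskTrackers.remove(name); // declare the tracker lost
                            lost.add(name);
                        }
                    }
                }
            }
        }
        return lost;
    }

    // Number of trackers currently considered alive.
    int liveTrackerCount() {
        synchronized (this) {
            synchronized (taskTrackers) {
                return taskTrackers.size();
            }
        }
    }
}
```

The alternative from the issue description (collect the dead trackers first, then lock the JobTracker only while updating their status) would shorten the time the outer lock is held, at the cost of re-checking each tracker's timestamp once the lock is acquired.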