[ https://issues.apache.org/jira/browse/MAPREDUCE-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836439#action_12836439 ]
Matei Zaharia commented on MAPREDUCE-1436:
------------------------------------------

I'm still a little concerned about the update thread not locking the JT all the time, but maybe I don't need to be. Just to clarify, the convention for locking is the following:
* If both the JT and the FairScheduler must be locked, the JT is locked first.
* If both the FairScheduler and a JIP must be locked, the FairScheduler is locked first.
* If both the JT and a JIP must be locked, the JT is locked first.
* If the JT, FS and JIP must all be locked, the order is JT -> FS -> JIP.

If this is it, then I think we're fine with the current usage.

> Deadlock in preemption code in fair scheduler
> ---------------------------------------------
>
>                 Key: MAPREDUCE-1436
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1436
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/fair-share
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Matei Zaharia
>            Assignee: Matei Zaharia
>            Priority: Blocker
>         Attachments: deadlock.png, mapreduce-1436-v2.patch,
> mapreduce-1436.patch
>
>
> In testing the fair scheduler with preemption, I found a deadlock between
> updatePreemptionVariables and some code in the JobTracker. This was found
> while testing a backport of the fair scheduler to Hadoop 0.20, but it looks
> like it could also happen in trunk and 0.21. Details are in a comment below.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
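
The lock-ordering convention listed in the comment above (JT -> FS -> JIP) can be illustrated with a minimal Java sketch. This is not the Hadoop code or the attached patch: the class name, the three monitor objects and both methods are hypothetical stand-ins, assumed here only to show that threads which always nest their locks in the same global order cannot deadlock on each other.

```java
import java.util.ArrayList;
import java.util.List;

public class LockOrderSketch {
    // Hypothetical stand-ins for the JobTracker (JT), FairScheduler (FS)
    // and one JobInProgress (JIP) monitor; not the real Hadoop objects.
    static final Object jobTracker = new Object();
    static final Object fairScheduler = new Object();
    static final Object jobInProgress = new Object();

    // A path that needs all three locks takes them in the order JT -> FS -> JIP.
    static List<String> updateWithAllLocks() {
        List<String> acquired = new ArrayList<>();
        synchronized (jobTracker) {
            acquired.add("JT");
            synchronized (fairScheduler) {
                acquired.add("FS");
                synchronized (jobInProgress) {
                    acquired.add("JIP");
                }
            }
        }
        return acquired;
    }

    // A path that needs only FS and JIP still takes FS before JIP,
    // so it never holds JIP while waiting for FS.
    static List<String> schedulerThenJob() {
        List<String> acquired = new ArrayList<>();
        synchronized (fairScheduler) {
            acquired.add("FS");
            synchronized (jobInProgress) {
                acquired.add("JIP");
            }
        }
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        // Both threads respect the same order, so neither can hold a
        // later lock while waiting for an earlier one.
        Thread a = new Thread(LockOrderSketch::updateWithAllLocks);
        Thread b = new Thread(LockOrderSketch::schedulerThenJob);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(updateWithAllLocks());
    }
}
```

A deadlock of the kind described in the issue needs at least one path that takes two of these locks in the opposite order (for example, holding FS while waiting for JT), which is exactly what the convention rules out.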