[
https://issues.apache.org/jira/browse/HADOOP-4904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Konstantin Shvachko updated HADOOP-4904:
----------------------------------------
Attachment: safeModeDeadlock.patch
The attached patch should solve the problem. With it, {{leaveSafeMode()}} first
acquires the {{FSNamesystem}} lock and then the {{SafeMode}} lock.
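
For illustration only, here is a minimal sketch of that lock ordering. It is not the contents of the patch. The class LockOrderSketch is a hypothetical stand-in for FSNamesystem, its inner SafeModeInfo class is a simplified stand-in for the real one, and the "acquired" list exists only to make the acquisition order visible.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for FSNamesystem; not the actual Hadoop class. */
class LockOrderSketch {
    /** Records the order in which monitors are entered (illustration only). */
    final List<String> acquired = new ArrayList<>();
    private final SafeModeInfo safeMode = new SafeModeInfo();

    /** Simplified stand-in for SafeModeInfo (assumption, not Hadoop code). */
    class SafeModeInfo {
        private boolean on = true;

        // Inner lock. Callers are expected to hold the outer monitor already.
        synchronized void leave() {
            acquired.add("SafeModeInfo");
            on = false;
            // Needs the outer monitor. The calling thread already holds it,
            // and Java monitors are reentrant, so this call cannot block.
            processMisReplicatedBlocks();
        }

        synchronized boolean isOn() {
            return on;
        }
    }

    // Requires the global (outer) lock.
    synchronized void processMisReplicatedBlocks() {
        acquired.add("processMisReplicatedBlocks");
    }

    // Outer lock first, then the inner SafeModeInfo lock.
    synchronized void leaveSafeMode() {
        acquired.add("FSNamesystem");
        safeMode.leave();
    }

    // Readers follow the same order: outer lock, then inner lock.
    synchronized boolean isInSafeMode() {
        return safeMode.isOn();
    }

    public static void main(String[] args) {
        LockOrderSketch ns = new LockOrderSketch();
        ns.leaveSafeMode();
        // prints [FSNamesystem, SafeModeInfo, processMisReplicatedBlocks]
        System.out.println(ns.acquired);
    }
}
```

Because every path in the sketch takes the outer monitor before the inner one, no thread can hold the inner lock while waiting for the outer one, which is the cycle that produces the deadlock.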
> Deadlock while leaving safe mode.
> ---------------------------------
>
> Key: HADOOP-4904
> URL: https://issues.apache.org/jira/browse/HADOOP-4904
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.18.3
> Reporter: Konstantin Shvachko
> Priority: Blocker
> Fix For: 0.18.3
>
> Attachments: safeModeDeadlock.patch
>
>
> {{SafeModeInfo.leave()}} acquires locks in an incorrect order, which causes
> the deadlock.
> It first acquires the {{SafeModeInfo}} lock, then calls
> {{FSNamesystem.processMisReplicatedBlocks()}}, which requires the global
> {{FSNamesystem}} lock.
> It should be the other way around: first {{FSNamesystem}} lock, then
> {{SafeModeInfo}}.