[ 
https://issues.apache.org/jira/browse/HADOOP-4904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HADOOP-4904:
----------------------------------------

    Attachment: safeModeDeadlock.patch

The attached patch should solve the problem. With it, {{leaveSafeMode()}} acquires the 
{{FSNamesystem}} lock first and then the {{SafeModeInfo}} lock.
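
For readers who want to see the ordering rule in isolation, here is a minimal sketch. It is not taken from the patch: {{LockOrderSketch}}, both lock fields, {{isInSafeMode()}} and {{runDemo()}} are hypothetical stand-ins, and only the method names {{leaveSafeMode()}} and {{processMisReplicatedBlocks()}} are borrowed from this issue. It assumes plain Java monitors, which are reentrant.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for illustration only; not the Hadoop classes.
class LockOrderSketch {
    private final Object namesystemLock = new Object(); // plays the FSNamesystem monitor
    private final Object safeModeLock = new Object();   // plays the SafeModeInfo monitor
    private final AtomicInteger scans = new AtomicInteger();
    private boolean inSafeMode = true;

    // Needs the global lock, like processMisReplicatedBlocks() in the report.
    void processMisReplicatedBlocks() {
        synchronized (namesystemLock) {
            scans.incrementAndGet();
        }
    }

    // Consistent order: global lock first, then the safe mode lock.
    void leaveSafeMode() {
        synchronized (namesystemLock) {
            synchronized (safeModeLock) {
                inSafeMode = false;
                // Safe: this thread already holds namesystemLock (monitors are reentrant).
                processMisReplicatedBlocks();
            }
        }
    }

    // Any other path that needs both locks takes them in the same order.
    boolean isInSafeMode() {
        synchronized (namesystemLock) {
            synchronized (safeModeLock) {
                return inSafeMode;
            }
        }
    }

    // Runs two threads that both need both locks and returns the number of scans.
    static int runDemo() {
        LockOrderSketch ns = new LockOrderSketch();
        Thread leaver = new Thread(() -> { for (int i = 0; i < 1000; i++) ns.leaveSafeMode(); });
        Thread reader = new Thread(() -> { for (int i = 0; i < 1000; i++) ns.isInSafeMode(); });
        leaver.start();
        reader.start();
        try {
            leaver.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo interrupted", e);
        }
        return ns.scans.get();
    }

    public static void main(String[] args) {
        System.out.println("scans completed: " + runDemo());
    }
}
```

Because every path takes the global lock before the safe mode lock, no thread can hold the inner lock while waiting for the outer one, so the two threads cannot block each other in a cycle. Swapping the two {{synchronized}} blocks in only one of the methods would reintroduce the kind of inversion described in this issue.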


> Deadlock while leaving safe mode.
> ---------------------------------
>
>                 Key: HADOOP-4904
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4904
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.3
>            Reporter: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.18.3
>
>         Attachments: safeModeDeadlock.patch
>
>
> {{SafeModeInfo.leave()}} acquires locks in an incorrect order, which causes 
> the deadlock.
> It first acquires the {{SafeModeInfo}} lock, then calls 
> {{FSNamesystem.processMisReplicatedBlocks()}}, which requires the global 
> {{FSNamesystem}} lock.
> It should be the other way around: first the {{FSNamesystem}} lock, then 
> the {{SafeModeInfo}} lock.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
