[ 
https://issues.apache.org/jira/browse/HDFS-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172886#comment-13172886
 ] 

Todd Lipcon commented on HDFS-2693:
-----------------------------------

Found a nasty bug in the current patch: in {{getBlockLocationsUpdateTimes}} the 
{{checkOperation}} call has to move down inside {{try..catch}} or else calling 
this function on an NN in standby state will leak the FSN readlock, killing the 
SBN. I'll upload a new patch soon. Still worth reviewing the general patch 
since most of it seems to be working correctly.
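
To illustrate the leak described above, here is a minimal, self-contained sketch. It is not the actual FSNamesystem code: the class {{LockLeakSketch}}, its fields, and every method other than the name {{checkOperation}} are hypothetical, and a plain {{ReentrantReadWriteLock}} stands in for the FSN lock. It only shows why a check that throws after the lock is taken but outside the guarded block leaves the read lock held.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the namesystem; not the real HDFS classes.
class LockLeakSketch {
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    private final boolean standby = true; // assume this node is in standby state

    // Rejects the operation when the node is in standby state.
    private void checkOperation() {
        if (standby) {
            throw new IllegalStateException("operation not allowed in standby state");
        }
    }

    // Buggy ordering: the check runs after lock() but outside the try block,
    // so the exception skips the unlock and the read lock stays held.
    void leakyRead() {
        fsLock.readLock().lock();
        checkOperation();
        try {
            // ... read block locations ...
        } finally {
            fsLock.readLock().unlock();
        }
    }

    // Fixed ordering: the check runs inside the try block,
    // so the finally clause always releases the read lock.
    void safeRead() {
        fsLock.readLock().lock();
        try {
            checkOperation();
            // ... read block locations ...
        } finally {
            fsLock.readLock().unlock();
        }
    }

    // Number of read holds currently outstanding on the lock.
    int readLockCount() {
        return fsLock.getReadLockCount();
    }

    // True if the write lock can be taken right now (released again immediately).
    boolean writerCanProceed() {
        if (fsLock.writeLock().tryLock()) {
            fsLock.writeLock().unlock();
            return true;
        }
        return false;
    }
}
```

In this sketch, once {{leakyRead}} has thrown, nothing can take the write lock again. That is presumably how a single rejected call ends up wedging a standby node that needs the write lock to keep applying edits.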
                
> Synchronization issues around state transition
> ----------------------------------------------
>
>                 Key: HDFS-2693
>                 URL: https://issues.apache.org/jira/browse/HDFS-2693
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: hdfs-2693.txt
>
>
> Currently when the NN changes state, it does so without synchronization. In 
> particular, the state transition function does:
> (1) leave old state
> (2) change state variable
> (3) enter new state
> This means that the NN is marked as "active" before it has actually 
> transitioned to active mode and opened its edit logs. This gives a window 
> where write transactions can come in and the {{checkOperation}} allows them, 
> but then they fail because the edit log is not yet opened.
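
The window described in the issue can be sketched as follows. This is one possible way to close it, under stated assumptions, and not necessarily what the attached patch does: the three transition steps run under a write lock, and operations check the state under the same lock, so no caller can see "active" before the edit log is open. The class {{TransitionSketch}}, the {{editLogOpen}} flag, and the method names are all hypothetical.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of the state transition; not the real NameNode classes.
class TransitionSketch {
    enum State { STANDBY, ACTIVE }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private State state = State.STANDBY;
    private boolean editLogOpen = false;

    // Performs all three steps atomically with respect to logEdit().
    void transitionToActive() {
        lock.writeLock().lock();
        try {
            // (1) leave old state: nothing to tear down in this sketch
            state = State.ACTIVE;  // (2) change state variable
            editLogOpen = true;    // (3) enter new state: open the edit log
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Models a write transaction: it checks the state and then uses the edit log.
    // Because it holds the lock, it never observes ACTIVE with the log still closed.
    boolean logEdit() {
        lock.readLock().lock();
        try {
            if (state != State.ACTIVE) {
                throw new IllegalStateException("not active");
            }
            return editLogOpen;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Without the lock, a caller could run between steps (2) and (3), pass the state check, and then fail on the closed edit log, which is the failure the issue reports.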


        
