[ 
https://issues.apache.org/jira/browse/HDFS-7046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134106#comment-14134106
 ] 

Daryn Sharp commented on HDFS-7046:
-----------------------------------

Modifying the startup of the secret manager feels hacky.  Changing active state 
in the very middle of processing an edit op seems pretty dangerous and wrong, 
even if it works today.  It's not something I would have ever expected the NN 
to do.

If the NN is allowed to exit safemode before finishing the edits replay, then 
it is effectively exiting because it was consistent "at some point in the 
past", rather than because it is consistent "right now".
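
To make the concern concrete, here is a minimal sketch of deferring the 
safemode exit until the replay has finished and the log is writable.  It is 
not the attached patch and not actual NN code: the class, its fields and its 
methods are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the NameNode; none of these names are real HDFS APIs.
class DeferredTransitionSketch {
    private boolean replaying = false;
    private boolean leaveSafeModePending = false;
    private boolean inSafeMode = true;
    private boolean editLogOpenForWrite = false;
    private final List<String> events = new ArrayList<>();

    // Called from inside an edit op, e.g. after block totals are adjusted.
    void safeModeThresholdReached() {
        if (replaying) {
            // Mid-replay: only remember the request, do not change state.
            leaveSafeModePending = true;
            events.add("deferred");
        } else {
            leaveSafeMode();
        }
    }

    // Applies every op, then opens the log, then honors a deferred exit.
    void replay(List<Runnable> ops) {
        replaying = true;
        try {
            for (Runnable op : ops) {
                op.run();
            }
        } finally {
            replaying = false;
        }
        editLogOpenForWrite = true; // assumed: log opens only after replay
        events.add("log-open");
        if (leaveSafeModePending) {
            leaveSafeModePending = false;
            leaveSafeMode();
        }
    }

    private void leaveSafeMode() {
        // Fail with a clear message instead of an NPE further down.
        if (!editLogOpenForWrite) {
            throw new IllegalStateException("edit log is not open for write");
        }
        inSafeMode = false;
        events.add("left-safemode");
    }

    boolean isInSafeMode() {
        return inSafeMode;
    }

    List<String> events() {
        return events;
    }
}
```

With this shape, anything that writes an edit on going active (such as a new 
secret key) can only run once the replay is complete, so the decision to 
leave safemode reflects the state "right now".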

> HA NN can NPE upon transition to active
> ---------------------------------------
>
>                 Key: HDFS-7046
>                 URL: https://issues.apache.org/jira/browse/HDFS-7046
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.0.0, 2.5.0
>            Reporter: Daryn Sharp
>            Assignee: Kihwal Lee
>            Priority: Critical
>         Attachments: HDFS-7046.patch, HDFS-7046_test_reproduce.patch
>
>
> While processing edits, the NN may decide after adjusting block totals to 
> leave safe mode - in the middle of the edit.  Going active starts the secret 
> manager which generates a new secret key, which in turn generates an edit, 
> which NPEs because the edit log is not open.
> # Transitions should _not_ occur in the middle of an edit.
> # The edit log appears to claim it's open for write when the stream isn't 
> even open.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
