[ 
https://issues.apache.org/jira/browse/HDFS-10192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15222641#comment-15222641
 ] 

Mingliang Liu commented on HDFS-10192:
--------------------------------------

I had a look at [HDFS-7046] and found that there was no easy fix to avoid the 
NPE caused by leaving safe mode early in the middle of an edit. For now I'm in 
favor of the current fix: we deliberately avoid leaving safe mode in the middle 
of an edit when failing over, and check the safe mode after starting active 
services.

That said,
# I had a look at {{BlockManagerSafeMode#checkSafeMode()}}: if the safe mode 
is OFF, the call is a no-op. This means we can check the safe mode without 
side effects (e.g. OFF -> PENDING_THRESHOLD). This is important if 
{{BlockManager#checkSafeMode()}} is public.
# I think we can add another unit test asserting that 
{{BlockManagerSafeMode#checkSafeMode()}} never leaves safe mode (or, even 
better, is a no-op) in the context of starting active services. This may be 
similar to the test case in the patch (or we can consolidate them into a 
single test).
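
To make the first point concrete, here is a minimal sketch of the no-op 
property. It is a simplified stand-in and not the actual Hadoop code: the 
class {{SafeModeSketch}}, its fields, and {{adjustSafeBlocks()}} are 
hypothetical names, and only the two states mentioned above are modelled.

```java
/** Simplified model of a safe mode state machine; NOT the real Hadoop class. */
class SafeModeSketch {
    /** Only the two states named in the comment; the real class is more involved. */
    enum Status { PENDING_THRESHOLD, OFF }

    private Status status = Status.PENDING_THRESHOLD;
    private final long blockThreshold; // hypothetical: safe blocks needed to leave
    private long blockSafe = 0;        // hypothetical: safe blocks reported so far

    SafeModeSketch(long blockThreshold) {
        this.blockThreshold = blockThreshold;
    }

    /** Hypothetical helper: add (or, with a negative delta, remove) safe blocks. */
    void adjustSafeBlocks(long delta) {
        blockSafe += delta;
    }

    /** Re-evaluates the state; returns immediately once safe mode is OFF. */
    void checkSafeMode() {
        if (status == Status.OFF) {
            return; // no-op: never transitions OFF -> PENDING_THRESHOLD
        }
        if (blockSafe >= blockThreshold) {
            status = Status.OFF;
        }
    }

    Status getStatus() {
        return status;
    }

    public static void main(String[] args) {
        SafeModeSketch sm = new SafeModeSketch(2);
        sm.adjustSafeBlocks(2);
        sm.checkSafeMode();     // threshold reached, leaves safe mode
        sm.adjustSafeBlocks(-2);
        sm.checkSafeMode();     // already OFF, so nothing changes
        System.out.println(sm.getStatus());
    }
}
```

A unit test along the lines of the second point would assert on the status 
before and after the call, in the same way {{main}} exercises it here.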

Any comment?

> Namenode safemode not coming out during failover
> ------------------------------------------------
>
>                 Key: HDFS-10192
>                 URL: https://issues.apache.org/jira/browse/HDFS-10192
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Brahma Reddy Battula
>            Assignee: Brahma Reddy Battula
>         Attachments: HDFS-10192-01.patch
>
>
> Scenario:
> =======
> write some blocks
> wait till roll edits happen
> Stop SNN
> Delete some blocks in ANN, wait till the blocks are deleted in DN also.
> restart the SNN and Wait till block reports come from datanode to SNN
> Kill ANN then make SNN to Active.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
