[ 
https://issues.apache.org/jira/browse/HDFS-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Mankude updated HDFS-2753:
-------------------------------

    Summary: Standby namenode stuck in safemode during a failover  (was: 
Standby namenode stuck in safenode during a failover)
    
> Standby namenode stuck in safemode during a failover
> ----------------------------------------------------
>
>                 Key: HDFS-2753
>                 URL: https://issues.apache.org/jira/browse/HDFS-2753
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Hari Mankude
>            Assignee: Hari Mankude
>
> teragen is run to generate write traffic. A manual failover is initiated by 
> killing namenode process. namenode which was killed is then restarted and it 
> comes up as a standby. However, the standby never exits out of the safemode. 
> Looking at the standby logs, it looks like the standby namenode gets 
> addStoredBlock immediately after data node registration which results in 
> numBlocks getting incremented. There is an optimization in processReport() 
> which causes the follow on complete block report to be ignored when NN is in 
> safemode. So, NN never exits out of the safemode.
> 2012-01-05 18:57:46,030 INFO  hdfs.StateChange 
> (DatanodeManager.java:registerDatanode(573)) - BLOCK* 
> NameSystem.registerDatanode: node registration from 98.137.233.235:50010 
> storage DS-526656430-98.137.233.235-50010-1325723536492
> 2012-01-05 18:57:46,033 INFO  net.NetworkTopology 
> (NetworkTopology.java:add(344)) - Adding a new node: 
> /default-rack/98.137.233.235:50010
> 2012-01-05 18:57:46,033 INFO  namenode.FSNamesystem 
> (FSNamesystem.java:checkMode(3411)) - DID NOT call initialize at 1
> 2012-01-05 18:57:46,034 INFO  hdfs.StateChange 
> (DatanodeManager.java:registerDatanode(573)) - BLOCK* 
> NameSystem.registerDatanode: node registration from 98.137.233.237:50010 
> storage DS-1961520590-98.137.233.237-50010-1325725253057
> 2012-01-05 18:57:46,034 INFO  net.NetworkTopology 
> (NetworkTopology.java:add(344)) - Adding a new node: 
> /default-rack/98.137.233.237:50010
> 2012-01-05 18:57:46,042 INFO  namenode.FSNamesystem 
> (FSNamesystem.java:checkMode(3411)) - DID NOT call initialize at 1
> 2012-01-05 18:57:46,045 INFO  hdfs.StateChange 
> (BlockManager.java:addStoredBlock(1775)) - BLOCK* addStoredBlock: blockMap 
> updated: 98.137.233.235:50010 is added to 
> blk_-3183325095022454724_1172{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[98.137.233.235:50010|FINALIZED]]} size 0
> 2012-01-05 18:57:46,046 INFO  hdfs.StateChange 
> (BlockManager.java:addStoredBlock(1775)) - BLOCK* addStoredBlock: blockMap 
> updated: 98.137.233.235:50010 is added to 
> blk_5617057825952660916_1173{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[98.137.233.235:50010|FINALIZED]]} size 0
> 2012-01-05 18:57:46,046 INFO  hdfs.StateChange 
> (BlockManager.java:addStoredBlock(1775)) - BLOCK* addStoredBlock: blockMap 
> updated: 98.137.233.237:50010 is added to 
> blk_-3183325095022454724_1172{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[98.137.233.235:50010|FINALIZED], 
> ReplicaUnderConstruction[98.137.233.237:50010|FINALIZED]]} size 0
> 2012-01-05 18:57:46,046 INFO  hdfs.StateChange 
> (BlockManager.java:addStoredBlock(1775)) - BLOCK* addStoredBlock: blockMap 
> updated: 98.137.233.237:50010 is added to 
> blk_5617057825952660916_1173{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[98.137.233.235:50010|FINALIZED], 
> ReplicaUnderConstruction[98.137.233.237:50010|FINALIZED]]} size 0
> 2012-01-05 18:57:46,049 INFO  hdfs.StateChange 
> (BlockManager.java:processReport(1365)) - BLOCK* processReport: discarded 
> non-initial block report from 98.137.233.235:50010 because namenode still in 
> startup phase
> 2012-01-05 18:57:46,049 INFO  hdfs.StateChange 
> (BlockManager.java:processReport(1365)) - BLOCK* processReport: discarded 
> non-initial block report from 98.137.233.237:50010 because namenode still in 
> startup phase
> 2012-01-05 18:58:05,167 INFO  namenode.NameNode 
> (NameNodeRpcServer.java:blockReceivedAndDeleted(894)) - Required GS=1175, 
> Queuing blockReceivedAndDeleted message
> 2012-01-05 18:58:05,168 INFO  namenode.NameNode 
> (NameNodeRpcServer.java:blockReceivedAndDeleted(894)) - Required GS=1175, 
> Queuing blockReceivedAndDeleted message
> 2012-01-05 18:58:06,634 INFO  namenode.NameNode 
> (NameNodeRpcServer.java:blockReceivedAndDeleted(894)) - Required GS=1176, 
> Queuing blockReceivedAndDeleted message
> 2012-01-05 18:58:06,636 INFO  namenode.NameNode 
> (NameNodeRpcServer.java:blockReceivedAndDeleted(894)) - Required GS=1176, 
> Queuing blockReceivedAndDeleted message
> 2012-01-05 18:58:08,097 INFO  namenode.NameNode 
> (NameNodeRpcServer.java:blockReceivedAndDeleted(894)) - Required GS=1177, 
> Queuing blockReceivedAndDeleted message
> 2012-01-05 18:58:08,097 INFO  namenode.NameNode 
> (NameNodeRpcServer.java:blockReceivedAndDeleted(894)) - Required GS=1177, 
> Queuing blockReceivedAndDeleted message

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to