Hi,

I have an issue with the secondary namenode crashing after a simple move
operation.
I would appreciate any ideas on how to resolve it.

Details below:
I was moving old backups to a separate folder, exact command:

    sudo -u hdfs hadoop fs -mv /hbase-bak /backup/

and shortly after the command the secondary namenode crashed with the following message:

 2012-05-09 09:37:44,168 INFO
org.apache.hadoop.hdfs.server.common.Storage: Edits file
/NNBak/current/edits of size 7680232 edits # 45318 loaded in 1
seconds.
2012-05-09 09:37:44,232 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
transactions: 0 Total time for transactions(ms): 0Number of
transactions batched in S
2012-05-09 09:37:45,449 ERROR
org.apache.hadoop.hdfs.server.common.Storage: Unable to save image for
/NNBak
java.io.IOException: saveLeases found path
/backup/base-bak/.logs/data1,60020,1304443405002/data1%3A60020.1308869024750
but no matching entry in namespace.
       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:5449)
       at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:1070)
       at org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1172)
       at org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1120)
       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:731)
       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$500(SecondaryNameNode.java:628)
       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:505)
       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:469)
       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:333)
       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
       at java.lang.Thread.run(Thread.java:662)
2012-05-09 09:37:45,450 WARN
org.apache.hadoop.hdfs.server.common.Storage: Removing storage dir
/NNBak
2012-05-09 09:37:45,450 FATAL
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: No edit streams
are accessible
java.lang.Exception: No edit streams are accessible
       at org.apache.hadoop.hdfs.server.namenode.FSEditLog.fatalExit(FSEditLog.java:410)
       at org.apache.hadoop.hdfs.server.namenode.FSEditLog.exitIfNoStreams(FSEditLog.java:429)
       at org.apache.hadoop.hdfs.server.namenode.FSEditLog.open(FSEditLog.java:374)
       at org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1158)
       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:731)
       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$500(SecondaryNameNode.java:628)
       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:505)
       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:469)
       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:333)
       at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
       at java.lang.Thread.run(Thread.java:662)
2012-05-09 09:37:45,451 INFO
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode:
SHUTDOWN_MSG:


It looks like it is expecting the file
/backup/base-bak/.logs/data1,60020,1304443405002/data1%3A60020.1308869024750
which never existed, but there is
/backup/hbase-bak/.logs/data1,60020,1304443405002/data1%3A60020.1308869024750
which was moved.


fsck / and fsck /backup report no issues.
I can back up fsimage and edits from
 http://namenode:50070/getimage?getimage=1
 http://namenode:50070/getimage?getedit=1


but all attempts to start the secondary namenode have resulted in the same crash.


On the primary namenode, all edits go to edits.new and edits is not being updated.

Looking at the output of "strings edits.new", I see lines like:

 /backup/base-bak/.logs
1336585759935
hdfs
supergroup
:/backup/base-bak/.logs/data1.inadco.gg,60020,1304443405002
1336585759963
hdfs
supergroup
a/backup/hbase-bak/.logs/data1.inadco.gg,60020,1304443405002/data1.inadco.gg%3A60020.1308869024750
1336585759995
`/backup/base-bak/.logs/data1.inadco.gg,60020,1304443405002/data1.inadco.gg%3A60020.1308869024750
1336585760017
1336585760017
67108864

These entries appear under both /backup/hbase-bak and /backup/base-bak.
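
To see how often each of the two paths shows up in the edit log, something like the following could be used. This is only a rough sketch: tally_prefixes is a hypothetical helper name, not part of Hadoop, and it assumes the paths are stored as plain text that "strings" can extract, as in the output above.

```shell
#!/bin/sh
# tally_prefixes (hypothetical helper): read text on stdin and print, for each
# path prefix given as an argument, the number of lines containing it.
# Matching is a fixed-string search, so "/backup/base-bak" does not match
# "/backup/hbase-bak".
tally_prefixes() {
    input=$(cat)
    for prefix in "$@"; do
        # grep -c prints 0 and exits non-zero when nothing matches; ignore that status.
        count=$(printf '%s\n' "$input" | grep -c -F -- "$prefix" || true)
        printf '%s %s\n' "$count" "$prefix"
    done
}
```

Possible usage, run in the directory that holds edits.new:

    strings edits.new | tally_prefixes /backup/base-bak /backup/hbase-bak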



Thanks
-- Alex
