Hi,
I have an issue with the secondary namenode crashing after a simple move
operation.
I would appreciate any ideas on how to resolve it.
Details below:
I was moving old backups to a separate folder. The exact command was:
sudo -u hdfs hadoop fs -mv /hbase-bak /backup/
Shortly after the command, the secondary namenode crashed with the following message:
2012-05-09 09:37:44,168 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /NNBak/current/edits of size 7680232 edits # 45318 loaded in 1 seconds.
2012-05-09 09:37:44,232 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in S
2012-05-09 09:37:45,449 ERROR org.apache.hadoop.hdfs.server.common.Storage: Unable to save image for /NNBak
java.io.IOException: saveLeases found path /backup/base-bak/.logs/data1,60020,1304443405002/data1%3A60020.1308869024750 but no matching entry in namespace.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:5449)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:1070)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1172)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1120)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:731)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$500(SecondaryNameNode.java:628)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:505)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:469)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:333)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
        at java.lang.Thread.run(Thread.java:662)
2012-05-09 09:37:45,450 WARN org.apache.hadoop.hdfs.server.common.Storage: Removing storage dir /NNBak
2012-05-09 09:37:45,450 FATAL org.apache.hadoop.hdfs.server.namenode.FSNamesystem: No edit streams are accessible
java.lang.Exception: No edit streams are accessible
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.fatalExit(FSEditLog.java:410)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.exitIfNoStreams(FSEditLog.java:429)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.open(FSEditLog.java:374)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1158)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:731)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$500(SecondaryNameNode.java:628)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:505)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:469)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:333)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
        at java.lang.Thread.run(Thread.java:662)
2012-05-09 09:37:45,451 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: SHUTDOWN_MSG:
It looks like the secondary namenode is expecting the file
/backup/base-bak/.logs/data1,60020,1304443405002/data1%3A60020.1308869024750
which never existed. What does exist is
/backup/hbase-bak/.logs/data1,60020,1304443405002/data1%3A60020.1308869024750
which is the file that was moved.
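
To make the mismatch explicit, here is a minimal shell sketch that derives the directory the move should have produced and compares it with the directory prefix in the saveLeases error. The helper name moved_path is made up for illustration, and it only models plain "mv SRC DSTDIR/" semantics; it makes no claim about what HDFS does internally.

```shell
#!/bin/sh
# Hypothetical helper: where SRC ends up after "mv SRC DSTDIR/",
# assuming DSTDIR already exists as a directory.
moved_path() {
    src="$1"
    dstdir="${2%/}"              # drop one trailing slash, if present
    printf '%s/%s\n' "$dstdir" "${src##*/}"
}

expected=$(moved_path /hbase-bak /backup/)
logged=/backup/base-bak          # directory prefix copied from the saveLeases error

echo "expected: $expected"
echo "logged:   $logged"
if [ "$expected" != "$logged" ]; then
    echo "prefixes differ"
fi
```

The two prefixes differ only by the leading "h" of the directory name.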
Both fsck / and fsck /backup report no issues.
I can back up the fsimage and edits from
http://namenode:50070/getimage?getimage=1
http://namenode:50070/getimage?getedit=1
but all attempts to start the secondary namenode have resulted in the same crash.
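
For reference, the backup from the two getimage URLs above could be scripted roughly like this. It is only a sketch: the function names and the destination directory are assumptions, and it presumes curl is installed on the machine doing the backup.

```shell
#!/bin/sh
# Hypothetical helper: build the getimage URL for a given host:port.
# $2 is "getimage" (fsimage) or "getedit" (edits), matching the URLs above.
getimage_url() {
    printf 'http://%s/getimage?%s=1\n' "$1" "$2"
}

# Hypothetical helper: download both files into a destination directory.
backup_nn_metadata() {
    nn="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    # -f makes curl fail on HTTP errors instead of saving an error page
    curl -sSf -o "$dest/fsimage" "$(getimage_url "$nn" getimage)" || return 1
    curl -sSf -o "$dest/edits" "$(getimage_url "$nn" getedit)" || return 1
}

# Dry run: only print the URLs that would be fetched.
getimage_url namenode:50070 getimage
getimage_url namenode:50070 getedit
```

Calling backup_nn_metadata namenode:50070 /some/backup/dir would then perform the actual download, with /some/backup/dir being a placeholder.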
Thanks
-- Alex