Hi,
My secondary namenode is crashing after a simple move operation.
I'd appreciate any ideas on how to resolve this.
Details below:
I was moving old backups to a separate folder. The exact command was:
sudo -u hdfs hadoop fs -mv /hbase-bak /backup/
Shortly after the command ran, the secondary namenode crashed with the following messages:
2012-05-09 09:37:44,168 INFO
org.apache.hadoop.hdfs.server.common.Storage: Edits file
/NNBak/current/edits of size 7680232 edits # 45318 loaded in 1
seconds.
2012-05-09 09:37:44,232 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
transactions: 0 Total time for transactions(ms): 0Number of
transactions batched in S
2012-05-09 09:37:45,449 ERROR
org.apache.hadoop.hdfs.server.common.Storage: Unable to save image for
/NNBak
java.io.IOException: saveLeases found path
/backup/base-bak/.logs/data1,60020,1304443405002/data1%3A60020.1308869024750
but no matching entry in namespace.
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:5449)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:1070)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1172)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1120)
at
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:731)
at
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$500(SecondaryNameNode.java:628)
at
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:505)
at
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:469)
at
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:333)
at
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
at java.lang.Thread.run(Thread.java:662)
2012-05-09 09:37:45,450 WARN
org.apache.hadoop.hdfs.server.common.Storage: Removing storage dir
/NNBak
2012-05-09 09:37:45,450 FATAL
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: No edit streams
are accessible
java.lang.Exception: No edit streams are accessible
at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.fatalExit(FSEditLog.java:410)
at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.exitIfNoStreams(FSEditLog.java:429)
at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.open(FSEditLog.java:374)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1158)
at
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:731)
at
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$500(SecondaryNameNode.java:628)
at
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:505)
at
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:469)
at
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:333)
at
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
at java.lang.Thread.run(Thread.java:662)
2012-05-09 09:37:45,451 INFO
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode:
SHUTDOWN_MSG:
It looks like it is expecting the file
/backup/base-bak/.logs/data1,60020,1304443405002/data1%3A60020.1308869024750
which never existed, but there is
/backup/hbase-bak/.logs/data1,60020,1304443405002/data1%3A60020.1308869024750
which was moved.
fsck / and fsck /backup return no issues.
I can back up fsimage and edits from
http://namenode:50070/getimage?getimage=1
http://namenode:50070/getimage?getedit=1
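For reference, a minimal Python sketch of scripting that backup. The getimage URLs and port 50070 are the ones quoted above; the helper names getimage_url and fetch and the destination file names are made up for illustration and are not part of Hadoop.

```python
import shutil
import urllib.request

# Query strings taken from the two URLs quoted in this post.
GETIMAGE_PARAMS = {"fsimage": "getimage=1", "edits": "getedit=1"}


def getimage_url(host: str, what: str, port: int = 50070) -> str:
    """Build the namenode getimage URL for 'fsimage' or 'edits' (hypothetical helper)."""
    return f"http://{host}:{port}/getimage?{GETIMAGE_PARAMS[what]}"


def fetch(url: str, dest: str) -> None:
    """Stream the response body of url into the local file dest."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
```

Something like fetch(getimage_url("namenode", "fsimage"), "fsimage.bak") followed by the same call with "edits" should leave a local copy of both files, assuming the servlet is reachable from the machine running the script.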
But all attempts to start the secondary namenode result in the same crash.
On the primary namenode, all edits go to edits.new and edits is not being updated.
Looking at the output of "strings edits.new", I see lines like:
/backup/base-bak/.logs
1336585759935
hdfs
supergroup
:/backup/base-bak/.logs/data1.inadco.gg,60020,1304443405002
1336585759963
hdfs
supergroup
a/backup/hbase-bak/.logs/data1.inadco.gg,60020,1304443405002/data1.inadco.gg%3A60020.1308869024750
1336585759995
`/backup/base-bak/.logs/data1.inadco.gg,60020,1304443405002/data1.inadco.gg%3A60020.1308869024750
1336585760017
1336585760017
67108864
so the same paths show up under both /backup/hbase-bak and /backup/base-bak.
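To put numbers on that, here is a rough Python sketch that scans a copy of the edits file the way strings does and counts how often each directory directly under /backup shows up. It is only a text scan, not a parser for the edit log format, and the function names are hypothetical.

```python
import re
from collections import Counter

# Runs of 4 or more printable ASCII bytes, roughly what strings prints by default.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# Captures the first path component under /backup (for example "hbase-bak").
BACKUP_CHILD = re.compile(rb"/backup/([^/]+)")


def count_backup_children(data: bytes) -> Counter:
    """Count occurrences of each direct child of /backup in raw bytes."""
    counts = Counter()
    for run in PRINTABLE_RUN.findall(data):
        for child in BACKUP_CHILD.findall(run):
            counts[child.decode("ascii")] += 1
    return counts


def count_in_file(path: str) -> Counter:
    """Read a local copy of an edits file and count /backup children in it."""
    with open(path, "rb") as handle:
        return count_backup_children(handle.read())
```

Running count_in_file on a copy of edits.new (never on the live file) should show whether base-bak entries appear alongside hbase-bak ones, and roughly how many of each.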
Thanks
-- Alex