Hi,
I tried to run a backup node on HDFS 0.21, but the daemon crashed with a
NullPointerException (stack trace below) and left an 'edits.new' file in the
$dfs.namenode.name.dir/current directory. After that, I could not restart
either the namenode or the backup node; both fail with the same exception.
Could anyone help me recover the cluster? The NN can be restarted by creating
an empty 'edits' file, but much data would be lost that way.
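For reference, the empty-'edits' workaround mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: the function names and the `current.bak-<timestamp>` backup location are hypothetical, the script must run while the namenode is stopped, and it first copies the whole `current` directory so the original 'edits' and 'edits.new' files survive for a later recovery attempt. Any transactions in the old 'edits' file are not replayed after this.

```python
import shutil
import time
from pathlib import Path


def backup_current_dir(name_dir: str) -> Path:
    """Copy <name_dir>/current to a timestamped sibling directory (hypothetical layout)."""
    current = Path(name_dir) / "current"
    backup = Path(name_dir) / f"current.bak-{int(time.time())}"
    shutil.copytree(current, backup)  # raises if the backup directory already exists
    return backup


def reset_edits(name_dir: str) -> Path:
    """Back up the storage directory, then replace 'edits' with an empty file.

    Returns the backup path. The old 'edits' and 'edits.new' remain in the backup.
    """
    backup = backup_current_dir(name_dir)
    edits = Path(name_dir) / "current" / "edits"
    edits.write_bytes(b"")  # empty log: the old transactions are NOT replayed
    return backup
```

Here `name_dir` stands for the directory configured as dfs.namenode.name.dir. The backup copy is the important part; the truncation step is the lossy workaround and is only worth running once the originals are safely stored elsewhere.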

12/10/09 15:32:45 ERROR namenode.Checkpointer: Throwable Exception in doCheckpoint:
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedSetTimes(FSDirectory.java:1765)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedSetTimes(FSDirectory.java:1753)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadEditRecords(FSEditLog.java:708)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:411)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:378)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1209)
    at org.apache.hadoop.hdfs.server.namenode.BackupStorage.loadCheckpoint(BackupStorage.java:158)
    at org.apache.hadoop.hdfs.server.namenode.Checkpointer.doCheckpoint(Checkpointer.java:243)
    at org.apache.hadoop.hdfs.server.namenode.Checkpointer.run(Checkpointer.java:141)
12/10/09 15:32:45 WARN namenode.FSNamesystem: ReplicationMonitor thread received InterruptedException.java.lang.InterruptedException: sleep interrupted
12/10/09 15:32:45 WARN namenode.DecommissionManager: Monitor interrupted: java.lang.InterruptedException: sleep interrupted
12/10/09 15:32:45 INFO namenode.FSNamesystem: Number of transactions: 24 Total time for transactions(ms): 4Number of transactions batched in Syncs: 0 Number of syncs: 25 SyncTimes(ms): 239
12/10/09 15:32:45 INFO ipc.Server: Stopping server on 50100

2012-10-22
rongshen.long