Hey,

Yesterday we restarted our NameNode for the first time in a while to push
out some new configuration updates. When it started up again we got this
error:

2010-12-01 10:59:39,635 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 121229
2010-12-01 10:59:41,578 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 126
2010-12-01 10:59:41,598 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 19581054 loaded in 1 seconds.
2010-12-01 10:59:41,600 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NullPointerException
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1073)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1085)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:992)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:195)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:615)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:999)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:812)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:312)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:293)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:224)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:306)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1004)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1013)

I tracked this down to a race condition in saving the edit log:
https://issues.apache.org/jira/browse/HDFS-909

We fixed this by applying the patch from
https://issues.apache.org/jira/browse/HDFS-1002, which let us start the
NameNode and have it repair the edit log, but it also meant we lost some
files from HDFS.
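
For anyone attempting the same recovery, it is probably worth copying the
NameNode metadata directory somewhere safe before starting a patched
NameNode that rewrites the edit log. Below is a minimal sketch of that,
not something taken from either JIRA issue. The function name
backup_name_dir and both directory arguments are hypothetical; the source
directory is assumed to be whatever dfs.name.dir points to on your cluster.

```python
import shutil
import time
from pathlib import Path


def backup_name_dir(name_dir, backup_root):
    """Copy a NameNode metadata directory into a timestamped backup folder.

    name_dir    -- assumed to be the directory configured as dfs.name.dir
    backup_root -- hypothetical location where backups are collected
    Returns the path of the newly created backup.
    """
    src = Path(name_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"name directory not found: {src}")

    # A timestamp in the folder name keeps successive backups apart.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-{stamp}"

    # copytree raises an error if dest already exists, so an earlier
    # backup is never overwritten by accident.
    shutil.copytree(src, dest)
    return dest
```

You would run this with the NameNode stopped, for example
backup_name_dir("/data/dfs/name", "/data/backup") with paths adjusted to
your setup, so the image and edits files are not changing during the copy.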

We're using CDH2-169.68, and this bug was fixed in CDH2-169.113, released in
September, so I would recommend that everyone upgrade to that!

Thanks,

-- 
Dan Harvey | Datamining Engineer
www.mendeley.com/profiles/dan-harvey

Mendeley Limited | London, UK | www.mendeley.com
Registered in England and Wales | Company Number 6419015
