[
https://issues.apache.org/jira/browse/HDFS-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006335#comment-13006335
]
dhruba borthakur commented on HDFS-1594:
----------------------------------------
another question related to this one. Can we find the root cause why the edits
log get corrupted when disk device is full? As far as the code is concerned,
the namenode will fail to preallocate the transaction log and should exit
(expected behaviour). One reason why it can refuse to restart is because the nn
tries to take a new checkpoint at startup time, can you pl say whether this was
what happened in your situation?
new method: numLiveDataNodes()?
> When the disk becomes full Namenode is getting shutdown and not able to
> recover
> -------------------------------------------------------------------------------
>
> Key: HDFS-1594
> URL: https://issues.apache.org/jira/browse/HDFS-1594
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.21.0, 0.21.1, 0.22.0
> Environment: Linux linux124 2.6.27.19-5-default #1 SMP 2009-02-28
> 04:40:21 +0100 x86_64 x86_64 x86_64 GNU/Linux
> Reporter: Devaraj K
> Fix For: 0.23.0
>
> Attachments: HDFS-1594.patch, HDFS-1594.patch, HDFS-1594.patch,
> hadoop-root-namenode-linux124.log
>
>
> When the disk becomes full name node is shutting down and if we try to start
> after making the space available It is not starting and throwing the below
> exception.
> {code:xml}
> 2011-01-24 23:23:33,727 ERROR
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
> initialization failed.
> java.io.EOFException
> at java.io.DataInputStream.readFully(DataInputStream.java:180)
> at org.apache.hadoop.io.UTF8.readFields(UTF8.java:117)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readString(FSImageSerialization.java:201)
> at
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:185)
> at
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
> at
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:60)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1089)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:1041)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:487)
> at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:149)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:306)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:284)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:328)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:356)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:577)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:570)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1529)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1538)
> 2011-01-24 23:23:33,729 ERROR
> org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.EOFException
> at java.io.DataInputStream.readFully(DataInputStream.java:180)
> at org.apache.hadoop.io.UTF8.readFields(UTF8.java:117)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readString(FSImageSerialization.java:201)
> at
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:185)
> at
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
> at
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:60)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1089)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:1041)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:487)
> at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:149)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:306)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:284)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:328)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:356)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:577)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:570)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1529)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1538)
> 2011-01-24 23:23:33,730 INFO org.apache.hadoop.hdfs.server.namenode.NameNode:
> SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at linux124/10.18.52.124
> ************************************************************/
> {code}
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira