[ 
https://issues.apache.org/jira/browse/HADOOP-7458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067554#comment-13067554
 ] 

Sakthivel Murugasamy commented on HADOOP-7458:
----------------------------------------------

Hi All,

Thank you so much for your valuable solutions!

The problem is resolved, but with significant time and data loss (since we were 
running on an experimental basis, we only had to reload a few GB of data). I 
used the -importCheckpoint option.
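For context, the recovery roughly followed the steps below. This is a sketch, not an exact transcript: the paths are from our cluster, and on 0.20.2 the dfs.name.dir must be empty so that -importCheckpoint will load the SecondaryNameNode's last checkpoint from fs.checkpoint.dir.

```shell
# Stop the NameNode first, then move the corrupt metadata aside
# so dfs.name.dir is empty for the import.
mv /opt/data/name /opt/data/name.corrupt
mkdir -p /opt/data/name

# Import the last good checkpoint from the SecondaryNameNode
# (fs.checkpoint.dir, here ${hadoop.tmp.dir}/dfs/namesecondary).
hadoop namenode -importCheckpoint
```

Any edits made after that checkpoint are lost, which is where the data/time loss mentioned above comes from.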

I would like to share the likely scenario in which the edit log corruption 
might have happened (correct me if I am wrong).

Below were the relevant configurations (in hdfs-site.xml and core-site.xml):

    * hadoop.tmp.dir : /opt/data/tmp
    * dfs.name.dir : /opt/data/name
    * dfs.data.dir : /opt/data/data
    * mapred.local.dir  : ${hadoop.tmp.dir}/mapred/local

/opt/data is a mounted storage volume of 50 GB. The NameNode, SecondaryNameNode 
(${hadoop.tmp.dir}/dfs/namesecondary), and DataNode directories were all 
configured within /opt/data itself.
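For reference, a safer layout keeps the NameNode metadata on a volume that DataNode blocks and MapReduce spill files cannot fill. A sketch (the /disk* paths are examples, not our actual new values):

```xml
<!-- hdfs-site.xml: keep NameNode metadata on its own volume,
     with a redundant copy on a second disk or an NFS mount. -->
<property>
  <name>dfs.name.dir</name>
  <value>/disk1/hdfs/name,/disk2/hdfs/name</value>
</property>

<!-- core-site.xml: move hadoop.tmp.dir (and with it the derived
     mapred.local.dir and the secondary checkpoint directory)
     off the metadata volume. -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/disk3/hadoop/tmp</value>
</property>
```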

Once I moved in a 3.6 GB compressed (bz2) file, I suspect the usage of 
/opt/data reached 100% (I checked with `df -h` after the incident). I then ran 
Hive with a simple SELECT query; its job.jar files also needed to be created 
in the same directory, which had no space left. This is how the edit log 
corruption could have occurred: presumably the NameNode could no longer write 
complete edit log records to the full volume.
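Since we only noticed the full volume after the fact with `df -h`, a simple programmatic check could have caught it earlier. A minimal sketch (not part of Hadoop; the 95% threshold is an arbitrary example value):

```python
import shutil

def volume_nearly_full(path, threshold=0.95):
    """Return True if the filesystem holding `path` is above
    `threshold` fraction used -- a rough equivalent of `df -h`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total >= threshold

# Example: warn before a job fills the NameNode's volume.
if volume_nearly_full("/opt/data" if False else "/"):
    print("WARNING: volume nearly full; edit log writes may fail")
```

Running something like this from cron against dfs.name.dir's mount point would have flagged the problem before the edit log was damaged.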

This was a really good lesson for me! I have since changed those configurations.

Thanks

> Namenode not get started! FSNamesystem initialization failed. 
> java.io.FileNotFoundException
> -------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-7458
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7458
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.20.2
>         Environment: CentOS release 5.5 (Final), 18 node Cluster 
>            Reporter: Sakthivel Murugasamy
>            Priority: Blocker
>              Labels: hadoop
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> 2011-07-13 12:04:12,967 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
> java.io.FileNotFoundException: File does not exist: /opt/data/tmp/mapred/system/job_201107041958_0120/j^@^@^@^@^@^@
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedSetPermission(FSDirectory.java:544)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:724)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:992)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:812)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
> 2011-07-13 12:04:13,006 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.FileNotFoundException: File does not exist: /opt/data/tmp/mapred/system/job_201107041958_0120/j^@^@^@^@^@^@
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedSetPermission(FSDirectory.java:544)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:724)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:992)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:812)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
> In the path /opt/data/tmp/mapred, the "system/" folder itself is not available.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
