[
https://issues.apache.org/jira/browse/HADOOP-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615310#action_12615310
]
lohit edited comment on HADOOP-3724 at 7/21/08 11:29 AM:
--------------------------------------------------------------------
After looking at the logs, here is what might have triggered this:
- 00:06:35,042 /mapredsystem/user/mapredsystem/machine/job_200807182151_0100/job.jar was created
- 00:06:36,641 request for NameSystem.allocateBlock blk_353852126366132690_6689141 of file /mapredsystem/user/mapredsystem/machine/job_200807182151_0100/job.jar
- 00:10:18,138 request for rename of /mapredsystem/user/mapredsystem/machine to /user/user/.Trash/Current/machine
- 01:06:35,583 after around one hour, lease recovery for the open file kicks in, but with the new path under .Trash, and this exception repeats forever:
{noformat}
INFO org.apache.hadoop.dfs.LeaseManager: Lease Monitor: Removing lease [Lease.
Holder: DFSClient_-2067534834, pendingcreates: 1], sortedLeases.size()=: 13
INFO org.apache.hadoop.fs.FSNamesystem: Recovering lease=[Lease. Holder:
DFSClient_-2067534834, pendingcreates: 1],
src=/user/user/.Trash/Current/machine/machine/job_200807182151_0100/job.jar
WARN org.apache.hadoop.dfs.StateChange: DIR* NameSystem.internalReleaseCreate:
attempt to release a create lock on
/user/user/.Trash/Current/machine/machine/job_200807182151_0100/job.jar file
does not exist.
{noformat}
Also,
- For this particular file, allocateBlock was never followed by addStoredBlock; none of the datanodes reported the block back to the namenode with its length, so the file was still open and its lease was never released (see the sketch below).
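To make the failure mode concrete, here is a minimal, self-contained sketch in plain Java. It is not the real FSNamesystem/LeaseManager code; the class name, method names and data structures are made up for illustration, and it only models the effect visible in the logs: after the rename, the path recorded in the lease no longer resolves to any entry in the namespace, so the lease monitor complains on every pass and the lease is never removed. (In this toy the lease simply keeps its original path; in the actual log the recovery attempt uses the post-rename path under .Trash, but either way there is no matching entry.)
{noformat}
// Toy model only -- not the real FSNamesystem/LeaseManager code.
// All names here are illustrative.
import java.util.HashMap;
import java.util.Map;

public class StaleLeaseSketch {
    // Namespace: set of absolute paths that currently exist (value unused).
    private final Map<String, Boolean> namespace = new HashMap<>();
    // Leases: holder -> path of the file still under construction.
    private final Map<String, String> leases = new HashMap<>();

    void create(String holder, String path) {
        namespace.put(path, Boolean.TRUE);  // file exists and is open
        leases.put(holder, path);           // lease remembers this exact path
    }

    // Rename a directory prefix: namespace entries move, but nothing here
    // keeps the path stored in the lease consistent with the namespace --
    // that mismatch is the suspected gap in the scenario above.
    void renamePrefix(String oldPrefix, String newPrefix) {
        Map<String, Boolean> moved = new HashMap<>();
        namespace.forEach((p, v) ->
            moved.put(p.startsWith(oldPrefix)
                    ? newPrefix + p.substring(oldPrefix.length())
                    : p, v));
        namespace.clear();
        namespace.putAll(moved);
    }

    // Lease monitor pass: recovery never succeeds and never removes the
    // lease, so the same warning is logged over and over.
    void recoverExpiredLeases() {
        leases.forEach((holder, path) -> {
            if (!namespace.containsKey(path)) {
                System.out.println("attempt to release a create lock on "
                        + path + " file does not exist.");
            }
        });
    }

    public static void main(String[] args) {
        StaleLeaseSketch fs = new StaleLeaseSketch();
        fs.create("DFSClient_-2067534834",
            "/mapredsystem/user/mapredsystem/machine/job_200807182151_0100/job.jar");
        fs.renamePrefix("/mapredsystem/user/mapredsystem/machine",
            "/user/user/.Trash/Current/machine");
        fs.recoverExpiredLeases();  // the recorded path no longer resolves
    }
}
{noformat}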
> Namenode does not start due to exception throw while saving Image
> -----------------------------------------------------------------
>
> Key: HADOOP-3724
> URL: https://issues.apache.org/jira/browse/HADOOP-3724
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.18.0
> Reporter: Lohit Vijayarenu
> Assignee: dhruba borthakur
> Priority: Blocker
> Fix For: 0.18.0
>
>
> Restart of the namenode failed with this stack trace while saving the image
> during initialization
> {noformat}
> 2008-07-09 00:20:21,470 INFO org.apache.hadoop.ipc.Server: Stopping server on
> 9000
> 2008-07-09 00:20:21,493 ERROR org.apache.hadoop.dfs.NameNode:
> java.io.IOException: saveLeases found path /foo/bar/jambajuice but no
> matching entry in namespace.
> at
> org.apache.hadoop.dfs.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:4376)
>
> at org.apache.hadoop.dfs.FSImage.saveFSImage(FSImage.java:874)
> at org.apache.hadoop.dfs.FSImage.saveFSImage(FSImage.java:892)
> at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:81)
> at org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:273)
> at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:252)
> at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:148)
> at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:193)
> at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:179)
> at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:822)
> at org.apache.hadoop.dfs.NameNode.main(NameNode.java:831)
> {noformat}
> Looks like it was throwing the IOException in saveFilesUnderConstruction
> (a minimal sketch of that failing check follows after this quoted description).
> Before the restart, the NameNode was killed while some jobs were running. Looking
> at the namenode log prior to the shutdown, there were many entries like this:
> {noformat}
> 2008-07-09 00:12:55,301 INFO org.apache.hadoop.fs.FSNamesystem: Recovering
> lease=[Lease. Holder: DFSClient_-510679348, pendingcreates: 1],
> src=/foo/bar/jambajuice
> 2008-07-09 00:12:55,301 WARN org.apache.hadoop.dfs.StateChange: DIR*
> NameSystem.internalReleaseCreate: attempt to release a create lock on
> /foo/bar/jambajuice file does not exist.
> {noformat}
> These two lines repeat every second, to the point where a 7-node cluster had a
> namenode log close to 41G in size.
> Could not find any other information about the file, as there were no earlier
> namenode logs.
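For reference, here is a similar minimal sketch of the failing save-time check described by the stack trace above. Again this is plain illustrative Java, not the actual FSImage/FSNamesystem code, and the toy data structures and sample paths are assumptions: if any lease still records a path that has no matching entry in the namespace, writing the image aborts with exactly this kind of IOException, which is why the namenode could not start.
{noformat}
// Toy model only -- not the real FSImage/FSNamesystem code; names and data
// are illustrative. It mirrors the stack trace above: walking the leases
// while saving the image and throwing as soon as a lease path is missing.
import java.io.IOException;
import java.util.Map;
import java.util.Set;

public class SaveLeasesSketch {

    // Save files under construction: one lease path has no namespace entry,
    // so the whole image save (and hence the namenode start) aborts.
    static void saveFilesUnderConstruction(Set<String> namespace,
                                           Map<String, String> leases)
            throws IOException {
        for (String path : leases.values()) {
            if (!namespace.contains(path)) {
                throw new IOException("saveLeases found path " + path
                        + " but no matching entry in namespace.");
            }
            // the real code would serialize the under-construction file here
        }
    }

    public static void main(String[] args) throws IOException {
        Set<String> namespace = Set.of("/user/user/somefile");   // sample data
        Map<String, String> leases =
                Map.of("DFSClient_-510679348", "/foo/bar/jambajuice");
        saveFilesUnderConstruction(namespace, leases);  // throws, like the restart did
    }
}
{noformat}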