[
https://issues.apache.org/jira/browse/HDFS-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102653#comment-14102653
]
Hudson commented on HDFS-6825:
------------------------------
FAILURE: Integrated in Hadoop-Hdfs-trunk #1842 (See
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1842/])
HDFS-6825. Edit log corruption due to delayed block removal. Contributed by
Yongjun Zhang. (wang:
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618684)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCommitBlockSynchronization.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeleteRace.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestPipelinesFailover.java
> Edit log corruption due to delayed block removal
> ------------------------------------------------
>
> Key: HDFS-6825
> URL: https://issues.apache.org/jira/browse/HDFS-6825
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.5.0
> Reporter: Yongjun Zhang
> Assignee: Yongjun Zhang
> Fix For: 2.6.0
>
> Attachments: HDFS-6825.001.patch, HDFS-6825.002.patch,
> HDFS-6825.003.patch, HDFS-6825.004.patch, HDFS-6825.005.patch
>
>
> Observed the following stack:
> {code}
> 2014-08-04 23:49:44,133 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: commitBlockSynchronization(lastblock=BP-.., newgenerationstamp=..., newlength=..., newtargets=..., closeFile=true, deleteBlock=false)
> 2014-08-04 23:49:44,133 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Unexpected exception while updating disk space.
> java.io.FileNotFoundException: Path not found: /solr/hierarchy/core_node1/data/tlog/tlog.xyz
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateSpaceConsumed(FSDirectory.java:1807)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitOrCompleteLastBlock(FSNamesystem.java:3975)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.closeFileCommitBlocks(FSNamesystem.java:4178)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:4146)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.commitBlockSynchronization(NameNodeRpcServer.java:662)
>     at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolServerSideTranslatorPB.java:270)
>     at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28073)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
> {code}
> Analysis shows the following sequence of events:
> - The client created the file /solr/hierarchy/core_node1/data/tlog/tlog.xyz.
> - The client tried to append to this file, but the lease had expired, so
> lease recovery was started and the append failed.
> - The file was deleted, but its pending blocks had not yet been removed.
> - commitBlockSynchronization() was then called (see the stack above). An
> INodeFile was derived from the pending block, with no awareness that the
> file had already been deleted.
> - FSDirectory.updateSpaceConsumed threw a FileNotFoundException, which was
> swallowed by commitOrCompleteLastBlock.
> - closeFileCommitBlocks went on to call finalizeINodeFileUnderConstruction
> and wrote a CloseOp to the edit log.
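
The sequence above can be sketched as a toy model. This is only an illustration under stated assumptions, not the HDFS code or the attached patch: ToyNamesystem, FileNode, removeBlockLater, the checkDeleted flag, the ADD/DELETE/CLOSE record strings, and the path /tmp/tlog.example are all hypothetical names chosen for this example. It shows how a close record can land in the edit log after a delete record when block removal lags behind the delete, and how one plausible guard (checking whether the file was already deleted) avoids that.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the race; none of these names are real HDFS APIs. */
public class DelayedBlockRemovalSketch {

    /** Hypothetical stand-in for a file inode that is still being written. */
    public static final class FileNode {
        public final String path;
        public boolean deleted = false;
        public FileNode(String path) { this.path = path; }
    }

    /** Hypothetical stand-in for the namespace, the blocks map, and the edit log. */
    public static final class ToyNamesystem {
        public final Map<String, FileNode> namespace = new HashMap<>();
        public final Map<Long, FileNode> blocksMap = new HashMap<>();
        public final List<String> editLog = new ArrayList<>();

        public void create(String path, long blockId) {
            FileNode file = new FileNode(path);
            namespace.put(path, file);
            blocksMap.put(blockId, file);
            editLog.add("ADD " + path);
        }

        /** Removes the path immediately; the block entry is left for later removal. */
        public void delete(String path) {
            FileNode file = namespace.remove(path);
            if (file != null) {
                file.deleted = true;
                editLog.add("DELETE " + path);
            }
        }

        /** The delayed step: drops the block entry some time after delete(). */
        public void removeBlockLater(long blockId) {
            blocksMap.remove(blockId);
        }

        /**
         * Finds the file through its pending block and logs a close record.
         * With checkDeleted=false the deleted file is closed anyway (the bug);
         * with checkDeleted=true the call backs off. Returns true if a close
         * record was written.
         */
        public boolean commitBlockSynchronization(long blockId, boolean checkDeleted) {
            FileNode file = blocksMap.get(blockId);
            if (file == null) {
                return false; // block already removed, nothing to close
            }
            if (checkDeleted && file.deleted) {
                return false; // assumed guard: do not close a deleted file
            }
            editLog.add("CLOSE " + file.path);
            return true;
        }
    }

    /** Runs create, delete, then a late commit, and returns the resulting edit log. */
    public static List<String> run(boolean checkDeleted) {
        ToyNamesystem ns = new ToyNamesystem();
        ns.create("/tmp/tlog.example", 1L);
        ns.delete("/tmp/tlog.example");
        // The commit arrives before the delayed block removal has happened.
        ns.commitBlockSynchronization(1L, checkDeleted);
        ns.removeBlockLater(1L);
        return ns.editLog;
    }

    public static void main(String[] args) {
        System.out.println("without guard: " + run(false));
        System.out.println("with guard:    " + run(true));
    }
}
```

Without the guard, the toy edit log ends with a CLOSE record for a path that a preceding DELETE record already removed, which is the kind of inconsistency a later replay of the log cannot resolve. With the guard, the log stops at DELETE.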
--
This message was sent by Atlassian JIRA
(v6.2#6252)