[
https://issues.apache.org/jira/browse/HDFS-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565741#comment-13565741
]
Brandon Li commented on HDFS-4212:
----------------------------------
Sorry, Yanbo. I thought I had replied to your comment. This problem was
identified on branch-1 in a few deployed environments. I will try your tests
on trunk and get back to you soon.
Part of the problem here is that getAdditionalBlock() (and thus addBlock()) is
not truly idempotent. When the client, the namenode, or the network between
them fails, the namenode can be left with an assigned blockID for which no
block was ever created on a datanode.
If addBlock() were truly idempotent, the namenode could identify and delete the
dangling blockID when it receives the repeated addBlock() request. One way to
make this API idempotent is to add the offset as an input parameter, so that
the namenode can check the offset to determine whether a request is a repeat.
I will upload a patch for that.
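
To illustrate the offset check, here is a minimal sketch of a toy in-memory
namenode. It is not the actual patch and not the real HDFS API: the class
IdempotentAddBlockSketch, its addBlock(path, offset) signature, and the
blockCount() helper are all hypothetical names chosen for this example. It
assumes the client passes the file offset at which the requested block would
start, and that a request whose offset matches the start of the last allocated
block is a retry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model, not HDFS code: shows how an offset parameter
// lets the namenode recognize a repeated addBlock() request.
public class IdempotentAddBlockSketch {

    // A block as the toy namenode tracks it: an ID and the file offset
    // at which the block starts.
    static final class Block {
        final long id;
        final long startOffset;

        Block(long id, long startOffset) {
            this.id = id;
            this.startOffset = startOffset;
        }
    }

    private final Map<String, List<Block>> files = new HashMap<>();
    private long nextBlockId = 1;

    // Allocates a block for `path` starting at `offset` and returns its ID.
    // If the last allocated block already starts at `offset`, the request is
    // treated as a retry: the dangling block is dropped before a new one is
    // allocated, so retries never leave an extra block behind.
    public long addBlock(String path, long offset) {
        List<Block> blocks = files.computeIfAbsent(path, p -> new ArrayList<>());
        if (!blocks.isEmpty()) {
            Block last = blocks.get(blocks.size() - 1);
            if (offset < last.startOffset) {
                throw new IllegalArgumentException(
                    "offset " + offset + " precedes the last allocated block");
            }
            if (offset == last.startOffset) {
                // Repeated request: the earlier allocation never reached the
                // client, so its block ID is dangling and can be deleted.
                blocks.remove(blocks.size() - 1);
            }
        }
        Block block = new Block(nextBlockId++, offset);
        blocks.add(block);
        return block.id;
    }

    // Number of blocks currently recorded for `path`.
    public int blockCount(String path) {
        List<Block> blocks = files.get(path);
        return blocks == null ? 0 : blocks.size();
    }
}
```

In this sketch, calling addBlock("/f", 0) twice leaves a single block recorded
for the file, while a later call with a larger offset appends a second block.
A real implementation would also have to validate the offset against the
length of the previous block and coordinate with lease and edit-log handling,
which the sketch leaves out.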
> NameNode can't differentiate between a never-created block and a block which
> is really missing
> ----------------------------------------------------------------------------------------------
>
> Key: HDFS-4212
> URL: https://issues.apache.org/jira/browse/HDFS-4212
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 1.2.0, 3.0.0
> Reporter: Brandon Li
> Assignee: Brandon Li
> Attachments: hdfs-4212-junit-test.patch
>
>
> In one test case, NameNode allocated a block and then was killed before the
> client got the addBlock response.
> After NameNode restarted, the block which was never created was considered as
> a missing block and FSCK would report the file is corrupted.
> The problem seems to be that, NameNode can't differentiate between a
> never-created block and a block which is really missing.