[
https://issues.apache.org/jira/browse/HDFS-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191436#comment-13191436
]
Todd Lipcon commented on HDFS-2791:
-----------------------------------
I've been thinking about this over the weekend and this morning. My current
thinking is that the safest bet is the following approach:
When an RBW replica is reported for a block the NN considers finalized:
- Case 1) if the reported replica has a too-low generation stamp, mark it
corrupt.
- Case 2) if the reported replica has the correct generation stamp, ignore it
(don't add it to the block locations and don't mark it corrupt)
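The two cases above can be sketched as a small decision function. This is an illustrative sketch only; the function name, arguments, and return values are assumptions for exposition, not the actual BlockManager API:

```python
def handle_rbw_report(stored_genstamp, reported_genstamp):
    """Decide what to do when a DN reports an RBW replica for a block
    the NameNode already considers finalized/complete.
    Illustrative sketch -- not the real HDFS code."""
    if reported_genstamp < stored_genstamp:
        # Case 1: stale genstamp -> the replica missed an append or a
        # pipeline recovery; safe to mark it corrupt.
        return "MARK_CORRUPT"
    elif reported_genstamp == stored_genstamp:
        # Case 2: correct genstamp but stale RBW state -> almost
        # certainly a delayed block report; ignore it (don't add to
        # block locations, don't mark corrupt).
        return "IGNORE"
    else:
        # A genstamp newer than the NN's should not occur in this path.
        return "UNEXPECTED"
```
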
Here's the reasoning:
*Case 1* One of the DNs is reporting a stale generation stamp.
This means that the client must have either appended to the block or undergone
pipeline recovery. There are two possibilities for why the DN is reporting an
old genstamp:
- 1a) it is a "delayed block report" as described in this JIRA. We will later
see a correct/up-to-date BR for the same block.
Here it is OK to mark the block as corrupt: when we send the "invalidate"
message to the DN, we invalidate the old genstamp specifically. So when the
DN receives the invalidation, it will not delete the new (correct) replica,
but rather just ignore the stale invalidation.
- 1b) the client lost its connection to this DN and performed a pipeline
recovery before closing the file. In this case we will never see a
correct/up-to-date BR.
Here it's also OK to mark it as corrupt, because it really is corrupt (i.e.,
it did not participate in the pipeline recovery).
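The reason 1a is safe hinges on invalidations identifying a replica by both block ID and genstamp. A DN-side sketch of that matching (hypothetical names, not the actual DataNode code):

```python
def dn_handle_invalidate(local_replicas, block_id, genstamp_to_delete):
    """DataNode-side sketch: an invalidation names the replica by
    (block_id, genstamp). If the DN holds only a replica with a newer
    genstamp, the stale invalidation matches nothing and is ignored,
    so the new (correct) replica survives.
    Illustrative only -- not the real HDFS DataNode code."""
    key = (block_id, genstamp_to_delete)
    if key in local_replicas:
        del local_replicas[key]
        return "DELETED"
    return "IGNORED"  # newer replica left untouched
```
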
*Case 2* correct generation stamp, but an RBW report for a FINALIZED block
As far as I can tell, the only way we can get here is the "delayed
report" scenario described in this JIRA. The reasoning is as follows:
- in order for the client to call completeBlock(), it must have gotten a
successful pipeline close from all of the DNs in the current pipeline
- if the pipeline nodes had changed, the block would have gotten a different
generation stamp. So, all of the nodes that have a replica with the correct
genstamp were in the closed pipeline
- thus all of the nodes with the correct genstamp have the correct length
and state, and any report saying otherwise must be due to a message delay.
The only other possibility is something like a machine crash that doesn't
replay the ext3 journal, causing some blocks to be rolled back to a prior
state. In that case, upon restart, the DN would change the replica to RWR
(ReplicaWaitingToBeRecovered) and we could use the original logic of marking
it corrupt.
I think the above solution is safer and simpler than any other solutions I
could come up with.
> If block report races with closing of file, replica is incorrectly marked
> corrupt
> ---------------------------------------------------------------------------------
>
> Key: HDFS-2791
> URL: https://issues.apache.org/jira/browse/HDFS-2791
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node, name-node
> Affects Versions: 0.22.0, 0.23.0
> Reporter: Todd Lipcon
> Attachments: hdfs-2791-test.txt
>
>
> The following sequence of events results in a replica mistakenly marked
> corrupt:
> 1. Pipeline is open with 2 replicas
> 2. DN1 generates a block report but is slow in sending it to the NN (e.g.,
> due to a flaky network). It gets "stuck" right before the block report RPC.
> 3. Client closes the file.
> 4. DN2 is fast and sends blockReceived to the NN. NN marks the block as
> COMPLETE
> 5. DN1's block report proceeds, and includes the block in an RBW state.
> 6. (x) NN incorrectly marks the replica as corrupt, since it is an RBW
> replica on a COMPLETE block.
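The six-step race above can be replayed in a few lines. The class and method names here are illustrative stand-ins, not real HDFS classes; the point is only the message ordering that triggers step 6:

```python
class NameNodeSketch:
    """Minimal model of the NN's buggy handling of a delayed block report.
    Illustrative only -- not the actual NameNode implementation."""

    def __init__(self):
        self.block_state = "UNDER_CONSTRUCTION"
        self.corrupt = set()

    def block_received(self, dn):
        # Step 4: DN2's blockReceived arrives; file close completes the block.
        self.block_state = "COMPLETE"

    def block_report(self, dn, replica_state):
        # Step 5: DN1's delayed report finally arrives, still showing RBW.
        if self.block_state == "COMPLETE" and replica_state == "RBW":
            # Step 6 (x): the buggy behaviour -- replica wrongly marked corrupt.
            self.corrupt.add(dn)

nn = NameNodeSketch()
nn.block_received("DN2")        # fast DN: block goes COMPLETE
nn.block_report("DN1", "RBW")   # slow DN: stale RBW report lands afterwards
# DN1's healthy replica now sits in nn.corrupt
```
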