[ 
https://issues.apache.org/jira/browse/HDFS-11755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003459#comment-16003459
 ] 

Nathan Roberts commented on HDFS-11755:
---------------------------------------

bq. Do you know which one makes more sense?
I'm not an expert in this area, but here's my understanding. When a block is 
completed and the client has received the necessary acks, the client either 
adds another block or completes the file. Both cause the namenode to consider 
the block complete, and at that point the namenode will properly maintain 
replication of the completed block. If the pipeline fails while writing, the 
client may (depending on the configured policy) rebuild the pipeline to 
maintain the desired level of replication in the pipeline. So, while a block is 
mutating, it is the client that is ultimately responsible for making sure 
enough datanodes remain in the pipeline and in sync with the data. Once a block 
is complete, it becomes the namenode's responsibility to maintain replication. 
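
To make the hand-off described above concrete, here is a minimal sketch in plain 
Java. It is a toy model, not HDFS code: the class, the enums, and the boolean 
replaceOnFailure flag (standing in for "policy configured") are all names I made 
up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only; none of these names are real HDFS classes. */
final class WritePipelineSketch {
    enum BlockState { UNDER_CONSTRUCTION, COMPLETE }
    enum Owner { CLIENT, NAMENODE }

    /** Who is responsible for keeping enough replicas of a block in this state. */
    static Owner replicationOwner(BlockState state) {
        return state == BlockState.COMPLETE ? Owner.NAMENODE : Owner.CLIENT;
    }

    /**
     * Client-side reaction to a datanode dropping out while the block is mutating.
     * replaceOnFailure is a hypothetical stand-in for the configured policy.
     */
    static List<String> rebuildPipeline(List<String> pipeline, String failed,
                                        List<String> spares, boolean replaceOnFailure) {
        List<String> rebuilt = new ArrayList<>(pipeline);
        rebuilt.remove(failed); // drop the failed node, if it was in the pipeline
        boolean lostANode = rebuilt.size() < pipeline.size();
        if (replaceOnFailure && lostANode && !spares.isEmpty()) {
            rebuilt.add(spares.get(0)); // restore the desired pipeline width
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        List<String> pipeline = Arrays.asList("DN1", "DN2", "DN3");
        List<String> spares = Arrays.asList("DN4");
        System.out.println(rebuildPipeline(pipeline, "DN3", spares, true));
        System.out.println(rebuildPipeline(pipeline, "DN3", spares, false));
        System.out.println(replicationOwner(BlockState.UNDER_CONSTRUCTION));
        System.out.println(replicationOwner(BlockState.COMPLETE));
    }
}
```

With the flag off, the client simply continues with the shorter pipeline, which 
is the case where replication stays below the desired level until the block is 
completed and the namenode takes over.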

If a client dies before completing the last block, lease recovery will, after a 
timeout, close the file and synchronize and commit the blocks if possible.  

There is also hsync(), which applications can use to enhance the durability 
guarantees at the datanode (via fsync).
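
For anyone unfamiliar with what the fsync part buys you, the sketch below shows 
the local-file analogue in plain Java using FileChannel.force(). It does not use 
the Hadoop API at all and says nothing about how the datanode is implemented; 
the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Local-file illustration of an fsync-style flush; not Hadoop code. */
final class LocalFsyncSketch {
    /**
     * Appends a record and forces it to the storage device before returning.
     * Returns the file size in bytes after the write.
     */
    static long appendDurably(Path file, String record) {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(record.getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf); // data may still sit in OS buffers after this
            }
            ch.force(true);    // ask the OS to push data and metadata to the device
            return ch.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fsync-sketch", ".log");
        System.out.println(appendDurably(file, "record-1\n"));
        System.out.println(appendDurably(file, "record-2\n"));
        Files.deleteIfExists(file);
    }
}
```

Without the force() call the write can return while the bytes are still only in 
memory, which is the gap an fsync-backed call is meant to close.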

Hope that helps a little.


> Underconstruction blocks can be considered missing
> --------------------------------------------------
>
>                 Key: HDFS-11755
>                 URL: https://issues.apache.org/jira/browse/HDFS-11755
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0-alpha2, 2.8.1
>            Reporter: Nathan Roberts
>            Assignee: Nathan Roberts
>         Attachments: HDFS-11755.001.patch
>
>
> The following sequence of events can lead to an under-construction block 
> being considered missing.
> - pipeline of 3 DNs, DN1->DN2->DN3
> - DN3 has a failing disk so some updates take a long time
> - Client writes entire block and is waiting for final ack
> - DN1, DN2 and DN3 have all received the block 
> - DN1 is waiting for ACK from DN2 who is waiting for ACK from DN3
> - DN3 is having trouble finalizing the block due to the failing drive. It 
> does eventually succeed but it is VERY slow at doing so. 
> - DN2 times out waiting for DN3 and tears down its pieces of the pipeline, so 
> DN1 notices and does the same. Neither DN1 nor DN2 finalized the block.
> - DN3 finally sends an IBR to the NN indicating the block has been received.
> - Drive containing the block on DN3 fails enough that the DN takes it offline 
> and notifies NN of failed volume
> - NN removes DN3's replica from the triplets and then declares the block 
> missing because there are no other replicas
> Seems like we shouldn't consider uncompleted blocks for replication.  
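
The last few steps of the sequence above can be replayed in a small toy model. 
This is only a sketch under my own assumptions: it is not NameNode code and not 
what the attached patch does, and every name in it (BlockInfo here is a made-up 
class, as are both isMissing methods) is hypothetical. It contrasts a check that 
looks only at the replica count with one that also requires the block to be 
complete, which is one reading of the closing suggestion.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy replay of the reported sequence; not NameNode code. */
final class MissingBlockSketch {
    enum BlockState { UNDER_CONSTRUCTION, COMPLETE }

    /** Hypothetical stand-in for the namenode's per-block bookkeeping. */
    static final class BlockInfo {
        final BlockState state;
        final Set<String> replicas = new HashSet<>();
        BlockInfo(BlockState state) { this.state = state; }
    }

    /** Behaviour as reported: a block with no known replicas is missing. */
    static boolean isMissingAsReported(BlockInfo b) {
        return b.replicas.isEmpty();
    }

    /** Assumed alternative: only completed blocks can be declared missing. */
    static boolean isMissingIfCompleteOnly(BlockInfo b) {
        return b.state == BlockState.COMPLETE && b.replicas.isEmpty();
    }

    /** Replays the namenode-visible events from the sequence above. */
    static BlockInfo replayScenario() {
        // The client never got its final ack, so the block was never completed.
        BlockInfo block = new BlockInfo(BlockState.UNDER_CONSTRUCTION);
        block.replicas.add("DN3");    // DN3's IBR: block received
        block.replicas.remove("DN3"); // DN3 reports the failed volume
        return block;
    }

    public static void main(String[] args) {
        BlockInfo block = replayScenario();
        System.out.println("as reported:   " + isMissingAsReported(block));
        System.out.println("complete-only: " + isMissingIfCompleteOnly(block));
    }
}
```

In this model DN1 and DN2 never show up in the replica set because neither 
finalized the block, so the count-only check flags it as missing while the 
complete-only check leaves it to the client or to lease recovery.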


