[ 
https://issues.apache.org/jira/browse/HDFS-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733172#comment-14733172
 ] 

Konstantin Shvachko commented on HDFS-8999:
-------------------------------------------

I spent some time browsing JIRA. This issue was discussed earlier in HDFS-1172 
(linking).
# NN cannot rely on locations reported by the client (or a primary DN) because 
that leads to a race condition between the client report and block reports from 
the DN that contains the replica: the block report may not contain the replica 
that was reported by the client. This was noted in [Hairong's 
comment|https://issues.apache.org/jira/browse/HDFS-1172?focusedCommentId=12874030&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12874030].
# [~hairong] proposed a solution, which makes NN place replicas that were not 
yet reported by DNs into the {{pendingReplication}} queue instead of 
{{neededReplication}}. This is absolutely logical, because NN knows that the 
missing replicas were in the successful pipeline and can assume they will be 
reported soon.
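
To make the second point concrete, here is a minimal sketch of the decision NN would take when the client completes the file. It is not the actual BlockManager code or the HDFS-1172 patch: the class {{PendingReplicaSketch}}, the enum {{Placement}} and the method {{classifyOnComplete}} are hypothetical names, and DNs are modeled as plain strings.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, simplified model of the proposal; not real NameNode code. */
class PendingReplicaSketch {

    /** Where the block ends up once the client completes the file. */
    enum Placement { NONE, PENDING_REPLICATION, NEEDED_REPLICATION }

    /**
     * @param expectedReplication replication factor of the file
     * @param pipelineNodes DNs in the last successful write pipeline
     * @param reportedNodes DNs that have already reported the replica to NN
     */
    static Placement classifyOnComplete(int expectedReplication,
                                        Set<String> pipelineNodes,
                                        Set<String> reportedNodes) {
        int reported = reportedNodes.size();

        // Pipeline members whose replica NN has not heard about yet.
        Set<String> unreported = new HashSet<>(pipelineNodes);
        unreported.removeAll(reportedNodes);

        if (reported >= expectedReplication) {
            // Enough replicas are already confirmed by the DNs themselves.
            return Placement.NONE;
        }
        if (reported + unreported.size() >= expectedReplication) {
            // The missing replicas were written by the pipeline and should be
            // reported soon, so wait for them instead of re-replicating.
            return Placement.PENDING_REPLICATION;
        }
        // Even the full pipeline cannot satisfy the replication factor.
        return Placement.NEEDED_REPLICATION;
    }
}
```

For example, with replication 3, a pipeline of three DNs and only one of them reported so far, the sketch returns {{PENDING_REPLICATION}}. The client-reported locations are used only to decide which queue to use, never as confirmed replicas, so the race described in the first point does not affect the block's recorded locations.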

I don't know why HDFS-1172 was never committed. Maybe it is time to revisit it 
now.

> Namenode need not wait for {{blockReceived}} for the last block before 
> completing a file.
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-8999
>                 URL: https://issues.apache.org/jira/browse/HDFS-8999
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Jitendra Nath Pandey
>
> This comes out of a discussion in HDFS-8763. Pasting [~jingzhao]'s comment 
> from the jira:
> {quote}
> ...whether we need to let NameNode wait for all the block_received msgs to 
> announce the replica is safe. Looking into the code, now we have
>    # NameNode knows the DataNodes involved when initially setting up the 
> writing pipeline
>    # If any DataNode fails during the writing, client bumps the GS and 
> finally reports all the DataNodes included in the new pipeline to NameNode 
> through the updatePipeline RPC.
>    # When the client received the ack for the last packet of the block (and 
> before the client tries to close the file on NameNode), the replica has been 
> finalized in all the DataNodes.
> Then in this case, when NameNode receives the close request from the client, 
> the NameNode already knows the latest replicas for the block. Currently the 
> checkReplication call only counts in all the replicas that NN has already 
> received the block_received msg, but based on the above #2 and #3, it may be 
> safe to also count in all the replicas in the 
> BlockUnderConstructionFeature#replicas?
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
