[
https://issues.apache.org/jira/browse/HDFS-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064771#comment-15064771
]
Konstantin Shvachko commented on HDFS-8999:
-------------------------------------------
Kihwal, I did not understand exactly what you are proposing, but it seems that
the following two statements of yours contradict each other.
- ??NN might incorrectly mark a replica as corrupt or have more locations than
committed and do not know which are valid??
- ??Conceptually, namenode knows exactly where the replicas of an
under-construction blocks are.??
I think the first is correct, as the NameNode knows only where the replicas
should be, but never knows where they actually are at any given moment. The
same is true of clients.
I was talking about the following race condition. Suppose we let the NN
complete the block based on the client-reported locations, and it does. Then a
full block report (FBR) arrives from a DN. The FBR could have been formed
before the replica was received by that DN, and therefore will NOT contain the
new replica. Because the block is already complete, the NN will _incorrectly_
remove the valid replica.
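
To make the ordering concrete, here is a minimal sketch of that race as a toy
model. It is not NameNode code: the class FbrRaceSketch, its methods, and the
node names dn1..dn3 are all hypothetical, and the rule "a complete block loses
a location that the DN's full report omits" is a simplifying assumption made
for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of one block on a NameNode. Hypothetical names; not HDFS code. */
class FbrRaceSketch {
    /** DNs the toy NN currently counts as holding a valid replica. */
    private final Set<String> locations = new TreeSet<>();
    /** Locations dropped because a full block report omitted the block. */
    private final List<String> removedByFbr = new ArrayList<>();
    private boolean complete = false;

    /** The proposal under discussion: trust the client-reported pipeline. */
    void completeFromClientLocations(List<String> pipeline) {
        locations.addAll(pipeline);
        complete = true;
    }

    /** A DN tells the NN it has received the replica. */
    void blockReceived(String dn) {
        locations.add(dn);
    }

    /** Assumed rule: only a complete block loses a location on a missing entry. */
    void fullBlockReport(String dn, boolean reportContainsBlock) {
        if (reportContainsBlock) {
            locations.add(dn);
        } else if (complete && locations.remove(dn)) {
            removedByFbr.add(dn);
        }
    }

    /** Replays the scenario and returns the locations the stale FBR removed. */
    static List<String> simulate(boolean completeFromClient) {
        FbrRaceSketch nn = new FbrRaceSketch();
        List<String> pipeline = Arrays.asList("dn1", "dn2", "dn3");
        if (completeFromClient) {
            nn.completeFromClientLocations(pipeline);
        }
        // dn3 built this report before it received the replica, so the block is absent.
        nn.fullBlockReport("dn3", false);
        // The blockReceived messages arrive only afterwards.
        for (String dn : pipeline) {
            nn.blockReceived(dn);
        }
        nn.complete = true;
        return nn.removedByFbr;
    }
}
```

In this model simulate(true) returns [dn3], the valid replica dropped by the
stale report, while simulate(false) returns an empty list because the block is
not yet complete when the report is processed.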
I think HDFS-1172 fixed the problem with closing. Replications do not start
immediately, so DNs have a chance to report the remaining replicas via
incremental block reports (IBRs). If they don't, these replicas will
eventually be moved from pendingReplication into neededReplication, and then
replicated.
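
The fallback described above can be sketched roughly as follows. This is a toy
model under stated assumptions, not the actual NameNode data structures: the
class PendingTimeoutSketch, its method names, and the explicit timestamps
passed in by the caller are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Toy pending -> needed hand-off. Hypothetical names; not HDFS code. */
class PendingTimeoutSketch {
    /** Block id -> time (ms) at which it entered the pending set. */
    private final Map<String, Long> pending = new HashMap<>();
    /** Blocks that timed out while pending and now need replication. */
    private final Deque<String> needed = new ArrayDeque<>();
    private final long timeoutMs;

    PendingTimeoutSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    void addPending(String block, long nowMs) {
        pending.put(block, nowMs);
    }

    /** A DN reported the replica in time, so nothing more is needed. */
    void replicaReported(String block) {
        pending.remove(block);
    }

    /** Moves every entry that has been pending for at least timeoutMs. */
    void checkTimeouts(long nowMs) {
        Iterator<Map.Entry<String, Long>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (nowMs - e.getValue() >= timeoutMs) {
                needed.add(e.getKey());
                it.remove();
            }
        }
    }

    boolean isPending(String block) {
        return pending.containsKey(block);
    }

    boolean isNeeded(String block) {
        return needed.contains(block);
    }
}
```

A block whose replica is reported before the timeout simply leaves the pending
set; one that is never reported ends up in the needed queue on the first
checkTimeouts call after the timeout has elapsed.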
I don't see any problems here and don't think anything needs to be done.
> Namenode need not wait for {{blockReceived}} for the last block before
> completing a file.
> -----------------------------------------------------------------------------------------
>
> Key: HDFS-8999
> URL: https://issues.apache.org/jira/browse/HDFS-8999
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: Jitendra Nath Pandey
> Assignee: Tsz Wo Nicholas Sze
>
> This comes out of a discussion in HDFS-8763. Pasting [~jingzhao]'s comment
> from the jira:
> {quote}
> ...whether we need to let NameNode wait for all the block_received msgs to
> announce the replica is safe. Looking into the code, now we have
> # NameNode knows the DataNodes involved when initially setting up the
> writing pipeline
> # If any DataNode fails during the writing, client bumps the GS and
> finally reports all the DataNodes included in the new pipeline to NameNode
> through the updatePipeline RPC.
> # When the client received the ack for the last packet of the block (and
> before the client tries to close the file on NameNode), the replica has been
> finalized in all the DataNodes.
> Then in this case, when NameNode receives the close request from the client,
> the NameNode already knows the latest replicas for the block. Currently the
> checkReplication call only counts in all the replicas that NN has already
> received the block_received msg, but based on the above #2 and #3, it may be
> safe to also count in all the replicas in the
> BlockUnderConstructionFeature#replicas?
> {quote}