Jitendra Nath Pandey created HDFS-8999:
------------------------------------------
Summary: Namenode need not wait for {{blockReceived}} for the last
block before completing a file.
Key: HDFS-8999
URL: https://issues.apache.org/jira/browse/HDFS-8999
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Reporter: Jitendra Nath Pandey
This comes out of a discussion in HDFS-8763. Pasting [~jingzhao]'s comment from
the jira:
...whether we need to let the NameNode wait for all the block_received msgs
before announcing that the replica is safe. Looking at the code, we currently
have:
# The NameNode knows the DataNodes involved when initially setting up the
write pipeline.
# If any DataNode fails during the write, the client bumps the generation
stamp (GS) and then reports all the DataNodes included in the new pipeline to
the NameNode through the updatePipeline RPC.
# When the client receives the ack for the last packet of the block (and
before the client tries to close the file on the NameNode), the replica has
already been finalized on all the DataNodes.
In this case, when the NameNode receives the close request from the client,
it already knows the latest replicas for the block. Currently the
checkReplication call counts only the replicas for which the NN has already
received the block_received msg, but based on #2 and #3 above, it may be
safe to also count all the replicas in
BlockUnderConstructionFeature#replicas?
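
To make the proposal concrete, here is a minimal, self-contained sketch of the
two counting strategies. It is not the actual HDFS code: the class
LastBlockReplicaCountSketch, its nested BlockUnderConstruction type, the
method names, and the DataNode ids are all hypothetical, and minReplication is
just an illustrative threshold. It only assumes, per #2 and #3 above, that the
expected pipeline is up to date by the time the close request arrives.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class LastBlockReplicaCountSketch {

    /** Hypothetical, simplified view of the last block while under construction. */
    static final class BlockUnderConstruction {
        // DataNodes in the latest pipeline, as last reported via updatePipeline.
        final Set<String> expectedReplicas = new LinkedHashSet<>();
        // DataNodes from which the NameNode has already received block_received.
        final Set<String> reportedReplicas = new LinkedHashSet<>();
    }

    /** Current behaviour as described above: count only reported replicas. */
    static int countReportedOnly(BlockUnderConstruction b) {
        return b.reportedReplicas.size();
    }

    /** Proposed behaviour: also count the replicas expected in the pipeline. */
    static int countReportedOrExpected(BlockUnderConstruction b) {
        Set<String> all = new HashSet<>(b.reportedReplicas);
        all.addAll(b.expectedReplicas); // set union, so no double counting
        return all.size();
    }

    /** The file can be completed once enough replicas are counted. */
    static boolean canComplete(int countedReplicas, int minReplication) {
        return countedReplicas >= minReplication;
    }

    public static void main(String[] args) {
        BlockUnderConstruction last = new BlockUnderConstruction();
        last.expectedReplicas.addAll(Arrays.asList("dn1", "dn2", "dn3"));
        last.reportedReplicas.add("dn1"); // only one block_received so far

        int minReplication = 2; // illustrative threshold
        System.out.println(canComplete(countReportedOnly(last), minReplication));       // false
        System.out.println(canComplete(countReportedOrExpected(last), minReplication)); // true
    }
}
```

In this sketch the close request would have to be retried under the first
strategy, while the second lets the file complete immediately. Whether that is
actually safe is the open question of this jira.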
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)