[ 
https://issues.apache.org/jira/browse/HDFS-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17235524#comment-17235524
 ] 

honestman commented on HDFS-4452:
---------------------------------

Sorry for raising a question after so many years.

I don't think simply returning the previously created block on a retry is safe. 
Consider this sequence:
 # The client calls addBlock; the NN creates a block and starts to sync the 
edit log.
 # The edit log sync on the NN gets stuck for some reason.
 # The client times out and retries. This time it gets the block created in 
step 1 and starts writing data.
 # The NN crashes after the edit log sync fails.
 # When the standby transitions to active, it cannot find the created block in 
the edit log.
 # When the client closes the file and commits the block, it gets an exception.
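
The sequence above can be sketched with a toy model. This is only an illustration under my own assumptions, not HDFS code: {{ToyNameNode}}, {{failover}}, {{runScenario}} and the {{syncOk}} flag are hypothetical names, and the "edit log" is just a list shared between the active and the standby.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these classes or methods are HDFS APIs.
class AddBlockRetrySketch {

    /** Hypothetical stand-in for a NameNode: in-memory blocks plus a shared edit log. */
    static class ToyNameNode {
        final List<Long> inMemoryBlocks = new ArrayList<>(); // namespace state of this NN
        final List<Long> durableEdits;                       // edits the standby can replay
        long nextBlockId = 1;

        ToyNameNode(List<Long> durableEdits) {
            this.durableEdits = durableEdits;
        }

        /** syncOk=false models step 2: the edit never becomes durable. */
        long addBlock(boolean isRetry, boolean syncOk) {
            if (isRetry && !inMemoryBlocks.isEmpty()) {
                // Retry path: return the block that already exists in memory,
                // without checking that its edit was ever synced.
                return inMemoryBlocks.get(inMemoryBlocks.size() - 1);
            }
            long id = nextBlockId++;
            inMemoryBlocks.add(id);
            if (syncOk) {
                durableEdits.add(id);
            }
            return id;
        }

        /** Commit succeeds only if this NN knows the block. */
        boolean commitBlock(long id) {
            return inMemoryBlocks.contains(id);
        }
    }

    /** The standby rebuilds its state from the durable edits only. */
    static ToyNameNode failover(List<Long> sharedEdits) {
        ToyNameNode standby = new ToyNameNode(sharedEdits);
        standby.inMemoryBlocks.addAll(sharedEdits);
        return standby;
    }

    /** Runs steps 1-6 and reports whether the final commit succeeds. */
    static boolean runScenario(boolean syncOk) {
        List<Long> sharedEdits = new ArrayList<>();
        ToyNameNode active = new ToyNameNode(sharedEdits);
        active.addBlock(false, syncOk);                 // steps 1-2: client times out, result discarded
        long block = active.addBlock(true, syncOk);     // step 3: retry gets the in-memory block
        ToyNameNode newActive = failover(sharedEdits);  // steps 4-5: crash, standby takes over
        return newActive.commitBlock(block);            // step 6: commit against the new active
    }

    public static void main(String[] args) {
        System.out.println("commit ok when sync failed:    " + runScenario(false));
        System.out.println("commit ok when sync succeeded: " + runScenario(true));
    }
}
```

In this model the commit fails only when the edit for the block never became durable before the failover, which is the case I am worried about.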

> getAdditionalBlock() can create multiple blocks if the client times out and 
> retries.
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-4452
>                 URL: https://issues.apache.org/jira/browse/HDFS-4452
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.0.2-alpha
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Critical
>             Fix For: 2.0.3-alpha
>
>         Attachments: TestAddBlockRetry.java, 
> getAdditionalBlock-branch2.patch, getAdditionalBlock.patch, 
> getAdditionalBlock.patch, getAdditionalBlock.patch
>
>
> HDFS client tries to addBlock() to a file. If the NameNode is busy, the client 
> can time out and reissue the same request. The two requests will race 
> with each other in {{FSNamesystem.getAdditionalBlock()}}, which can result in 
> creating two new blocks on the NameNode while the client knows of only 
> one of them. This eventually results in {{NotReplicatedYetException}} because 
> the extra block is never reported by any DataNode, which stalls file creation 
> and puts the file in an invalid state with an empty block in the middle.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

