[ https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinayakumar B updated HDFS-11674:
---------------------------------
Attachment: HDFS-11674-branch-2.7-03.patch
> reserveSpaceForReplicas is not released if append request failed due to
> mirror down and replica recovered
> ---------------------------------------------------------------------------------------------------------
>
> Key: HDFS-11674
> URL: https://issues.apache.org/jira/browse/HDFS-11674
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Vinayakumar B
> Assignee: Vinayakumar B
> Priority: Critical
> Labels: release-blocker
> Attachments: HDFS-11674-01.patch, HDFS-11674-02.patch,
> HDFS-11674-03.patch, HDFS-11674-branch-2.7-03.patch
>
>
> Scenario:
> 1. A 3-node cluster with
> "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT.
> A block is written with x data.
> 2. One of the DataNodes, NOT the first DN, goes down.
> 3. The client tries to append data to the block and fails because one DN is down.
> 4. The client calls recoverLease() on the file.
> 5. Recovery completes successfully.
> Issue:
> 1. DNs that the client connected to before it encountered the mirror-down
> failure have their reservedSpaceForReplicas incremented, BUT never decremented.
> 2. So in the long run, all of a DN's space ends up in reservedSpaceForReplicas,
> resulting in OutOfSpace errors.
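
The accounting problem described above can be illustrated with a small
standalone model. This is only a sketch and does not use the real DataNode
classes: VolumeModel, ReplicaModel, ReservedSpaceDemo and the per-append
reservation size are all hypothetical names and values chosen for
illustration, and releasing on recovery is shown as one possible way to
balance the counter, not necessarily what the attached patches do.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a DataNode volume; NOT the real HDFS class. */
class VolumeModel {
    private final AtomicLong reservedForReplicas = new AtomicLong();

    /** Reserve space when a replica is opened for append. */
    void reserve(long bytes) {
        reservedForReplicas.addAndGet(bytes);
    }

    /** Release a reservation, never dropping below zero. */
    void release(long bytes) {
        reservedForReplicas.updateAndGet(cur -> Math.max(0L, cur - bytes));
    }

    long reserved() {
        return reservedForReplicas.get();
    }
}

/** Hypothetical replica that remembers how much space it still holds reserved. */
class ReplicaModel {
    private final VolumeModel volume;
    private long bytesReserved;

    ReplicaModel(VolumeModel volume) {
        this.volume = volume;
    }

    /** Append request reaches this DN: the reservation is taken up front. */
    void openForAppend(long bytes) {
        volume.reserve(bytes);
        bytesReserved += bytes;
    }

    /** Models the reported behaviour: recovery succeeds, reservation is kept. */
    void recoverWithoutRelease() {
        // Intentionally empty: nothing gives the reserved bytes back.
    }

    /** Assumed remedy for this model: hand back whatever is still reserved. */
    void recoverWithRelease() {
        volume.release(bytesReserved);
        bytesReserved = 0;
    }
}

class ReservedSpaceDemo {
    /** Runs failed append + recovery cycles; returns the reservation left over. */
    static long run(int attempts, long bytesPerAppend, boolean releaseOnRecovery) {
        VolumeModel volume = new VolumeModel();
        for (int i = 0; i < attempts; i++) {
            ReplicaModel replica = new ReplicaModel(volume);
            replica.openForAppend(bytesPerAppend); // step 3: append starts, mirror is down
            if (releaseOnRecovery) {
                replica.recoverWithRelease();      // steps 4-5 with balanced accounting
            } else {
                replica.recoverWithoutRelease();   // steps 4-5 as reported in this issue
            }
        }
        return volume.reserved();
    }

    public static void main(String[] args) {
        long bytesPerAppend = 128L * 1024 * 1024; // illustrative value only
        System.out.println("reserved after 3 failed appends, no release: "
                + run(3, bytesPerAppend, false));
        System.out.println("reserved after 3 failed appends, with release: "
                + run(3, bytesPerAppend, true));
    }
}
```

In this model every failed append that is later recovered without a release
leaves its full reservation behind, so the counter grows with the number of
failures; with the release on recovery it returns to zero.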