[ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-935:
-------------------------------------
    Attachment: HDDS-935.004.patch

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-935
>                 URL: https://issues.apache.org/jira/browse/HDDS-935
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: Ozone Datanode
>    Affects Versions: 0.4.0
>            Reporter: Rakesh R
>            Assignee: Shashikant Banerjee
>            Priority: Major
>         Attachments: HDDS-935.000.patch, HDDS-935.001.patch, 
> HDDS-935.002.patch, HDDS-935.003.patch, HDDS-935.004.patch
>
>
> Currently, a container is created when a writeChunk request reaches 
> HddsDispatcher and the container does not already exist. If a disk holding 
> a container is removed and the datanode restarts, a subsequent writeChunk 
> request might create the same container again with an updated BCSID, because 
> the datanode won't detect that the disk was removed. SCM won't detect this 
> either, as the container will have the latest BCSID. This Jira aims to 
> address this issue.
> The proposed fix is to persist all the containerIds existing in the 
> containerSet in the snapshot file whenever a Ratis snapshot is taken. If a 
> disk is removed and the datanode restarts, the container set will be rebuilt 
> by scanning all the available disks, and the container list stored in the 
> snapshot file will give all the containers created on the datanode. The diff 
> between these two gives the exact list of containers that were created 
> but were not detected after the restart. Any writeChunk request should then 
> validate its container ID against the list of missing containers. Also, we 
> need to ensure that container creation does not happen as part of 
> applyTransaction of a writeChunk request in Ratis.
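
The diff described above can be sketched in a few lines of Java. This is only an illustration of the idea, not code from the attached patches: the class MissingContainerTracker, its method names, and the choice of IllegalStateException are all hypothetical, and the real datanode code may be structured quite differently.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical helper (not an actual Ozone class): computes the containers
 * that were recorded in the last Ratis snapshot but were not found when the
 * disks were rescanned after a restart.
 */
final class MissingContainerTracker {
    private final Set<Long> missing;

    /**
     * @param snapshotIds container IDs persisted in the snapshot file
     * @param scannedIds  container IDs rebuilt by scanning available disks
     */
    MissingContainerTracker(Set<Long> snapshotIds, Set<Long> scannedIds) {
        // Set difference: everything in the snapshot that the scan did not find.
        Set<Long> diff = new TreeSet<>(snapshotIds);
        diff.removeAll(scannedIds);
        this.missing = Collections.unmodifiableSet(diff);
    }

    /** Containers created earlier but not detected after the restart. */
    Set<Long> missingContainers() {
        return missing;
    }

    /**
     * Rejects a writeChunk aimed at a missing container, so the container
     * is not silently recreated. The exception type is an assumption.
     */
    void validateWriteChunk(long containerId) {
        if (missing.contains(containerId)) {
            throw new IllegalStateException(
                "Container " + containerId + " is missing; refusing to recreate it");
        }
    }
}
```

With a snapshot list of {1, 2, 3} and a post-restart scan that finds {1, 3}, the tracker reports container 2 as missing and rejects a writeChunk for it, while writes to containers 1 and 3 pass validation.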



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
