[jira] [Updated] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2019-03-05 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-935:
-
   Resolution: Fixed
Fix Version/s: 0.4.0
   Status: Resolved  (was: Patch Available)

Thanks [~arpitagarwal] and [~jnp] for the review. I have committed this change 
to trunk.

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --
>
> Key: HDDS-935
> URL: https://issues.apache.org/jira/browse/HDDS-935
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Rakesh R
> Assignee: Shashikant Banerjee
> Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-935.000.patch, HDDS-935.001.patch, 
> HDDS-935.002.patch, HDDS-935.003.patch, HDDS-935.004.patch, 
> HDDS-935.005.patch, HDDS-935.006.patch, HDDS-935.007.patch, HDDS-935.008.patch
>
>
> Currently, a container is created when a writeChunk request reaches 
> HddsDispatcher and the container does not already exist. If a disk holding 
> a container is removed and the datanode restarts, a subsequent writeChunk 
> request might re-create the same container with an updated BCSID, because 
> the datanode cannot detect that the disk was removed. SCM will not detect 
> this either, because the re-created container will have the latest BCSID. 
> This Jira aims to address this issue.
> The proposed fix is to persist all the containerIds existing in the 
> containerSet to the snapshot file whenever a Ratis snapshot is taken. If a 
> disk is removed and the datanode is restarted, the container set will be 
> rebuilt by scanning all the available disks, while the container list stored 
> in the snapshot file will give all the containers created on the datanode. 
> The diff between these two gives the exact list of containers that were 
> created but were not detected after the restart. Any writeChunk request 
> should now validate its container ID against the list of missing containers. 
> Also, we need to ensure that container creation does not happen as part of 
> applyTransaction of a writeChunk request in Ratis.
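
The proposed fix in the description above can be illustrated with a minimal sketch. This is not the committed patch: the class MissingContainerSketch, the methods findMissing and onWriteChunk, and the Decision enum are hypothetical names introduced only for illustration, and container IDs are assumed to be plain long values. The sketch computes the diff between the IDs persisted in the snapshot and the IDs found by the post-restart disk scan, then refuses to re-create any container that falls in that diff. The applyTransaction restriction is not modelled here.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only; these are hypothetical names, not Ozone classes. */
class MissingContainerSketch {

    /** Possible outcomes for a writeChunk request (hypothetical). */
    enum Decision { WRITE_TO_EXISTING, CREATE_NEW, REJECT_MISSING }

    /**
     * Returns the container IDs that were recorded in the snapshot but were
     * not found when the available disks were scanned after the restart.
     */
    static Set<Long> findMissing(Set<Long> snapshotIds, Set<Long> scannedIds) {
        Set<Long> missing = new TreeSet<>(snapshotIds);
        missing.removeAll(scannedIds); // set difference: snapshot minus scanned
        return Collections.unmodifiableSet(missing);
    }

    /**
     * Decides what to do with a writeChunk request for the given container.
     * A container known to be missing must never be silently re-created.
     */
    static Decision onWriteChunk(long containerId,
                                 Set<Long> scannedIds,
                                 Set<Long> missingIds) {
        if (missingIds.contains(containerId)) {
            return Decision.REJECT_MISSING;
        }
        if (scannedIds.contains(containerId)) {
            return Decision.WRITE_TO_EXISTING;
        }
        // Never seen before on this datanode: creation is still allowed.
        return Decision.CREATE_NEW;
    }

    public static void main(String[] args) {
        Set<Long> snapshotIds = Set.of(1L, 2L, 3L); // persisted with the snapshot
        Set<Long> scannedIds = Set.of(1L, 3L);      // disk holding container 2 was removed
        Set<Long> missing = findMissing(snapshotIds, scannedIds);

        System.out.println("missing containers: " + missing);
        System.out.println("writeChunk for 2: " + onWriteChunk(2L, scannedIds, missing));
        System.out.println("writeChunk for 4: " + onWriteChunk(4L, scannedIds, missing));
    }
}
```

Keeping the missing list as a separate set, rather than merging it back into the container set, lets the dispatcher distinguish "never created" from "created but lost", which is the distinction the description relies on.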



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2019-03-05 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-935:
-
Attachment: HDDS-935.008.patch

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --
>
> Key: HDDS-935
> URL: https://issues.apache.org/jira/browse/HDDS-935
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Rakesh R
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDDS-935.000.patch, HDDS-935.001.patch, 
> HDDS-935.002.patch, HDDS-935.003.patch, HDDS-935.004.patch, 
> HDDS-935.005.patch, HDDS-935.006.patch, HDDS-935.007.patch, HDDS-935.008.patch
>
>



[jira] [Updated] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2019-03-04 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-935:
-
Attachment: HDDS-935.007.patch

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --
>
> Key: HDDS-935
> URL: https://issues.apache.org/jira/browse/HDDS-935
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Rakesh R
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDDS-935.000.patch, HDDS-935.001.patch, 
> HDDS-935.002.patch, HDDS-935.003.patch, HDDS-935.004.patch, 
> HDDS-935.005.patch, HDDS-935.006.patch, HDDS-935.007.patch
>
>



[jira] [Updated] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2019-02-26 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-935:
-
Attachment: HDDS-935.006.patch

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --
>
> Key: HDDS-935
> URL: https://issues.apache.org/jira/browse/HDDS-935
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Rakesh R
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDDS-935.000.patch, HDDS-935.001.patch, 
> HDDS-935.002.patch, HDDS-935.003.patch, HDDS-935.004.patch, 
> HDDS-935.005.patch, HDDS-935.006.patch
>
>



[jira] [Updated] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2019-02-18 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-935:
-
Status: Patch Available  (was: Open)

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --
>
> Key: HDDS-935
> URL: https://issues.apache.org/jira/browse/HDDS-935
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Rakesh R
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDDS-935.000.patch, HDDS-935.001.patch, 
> HDDS-935.002.patch, HDDS-935.003.patch, HDDS-935.004.patch, HDDS-935.005.patch
>
>



[jira] [Updated] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2019-02-14 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-935:
-
Attachment: HDDS-935.005.patch

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --
>
> Key: HDDS-935
> URL: https://issues.apache.org/jira/browse/HDDS-935
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Rakesh R
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDDS-935.000.patch, HDDS-935.001.patch, 
> HDDS-935.002.patch, HDDS-935.003.patch, HDDS-935.004.patch, HDDS-935.005.patch
>
>



[jira] [Updated] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2019-02-05 Thread Mukul Kumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-935:
---
Status: Open  (was: Patch Available)

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --
>
> Key: HDDS-935
> URL: https://issues.apache.org/jira/browse/HDDS-935
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Rakesh R
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDDS-935.000.patch, HDDS-935.001.patch, 
> HDDS-935.002.patch, HDDS-935.003.patch, HDDS-935.004.patch
>
>



[jira] [Updated] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2019-02-04 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-935:
-
Attachment: HDDS-935.004.patch

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --
>
> Key: HDDS-935
> URL: https://issues.apache.org/jira/browse/HDDS-935
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Rakesh R
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDDS-935.000.patch, HDDS-935.001.patch, 
> HDDS-935.002.patch, HDDS-935.003.patch, HDDS-935.004.patch
>
>



[jira] [Updated] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2019-02-01 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-935:
-
Attachment: HDDS-935.003.patch

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --
>
> Key: HDDS-935
> URL: https://issues.apache.org/jira/browse/HDDS-935
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Rakesh R
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDDS-935.000.patch, HDDS-935.001.patch, 
> HDDS-935.002.patch, HDDS-935.003.patch
>
>



[jira] [Updated] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2019-02-01 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-935:
-
Attachment: HDDS-935.002.patch

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --
>
> Key: HDDS-935
> URL: https://issues.apache.org/jira/browse/HDDS-935
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Rakesh R
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDDS-935.000.patch, HDDS-935.001.patch, 
> HDDS-935.002.patch
>
>



[jira] [Updated] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2019-01-22 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-935:
-
Attachment: HDDS-935.001.patch

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --
>
> Key: HDDS-935
> URL: https://issues.apache.org/jira/browse/HDDS-935
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Rakesh R
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDDS-935.000.patch, HDDS-935.001.patch
>
>



[jira] [Updated] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2019-01-08 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-935:
-
Status: Patch Available  (was: Open)

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --
>
> Key: HDDS-935
> URL: https://issues.apache.org/jira/browse/HDDS-935
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Rakesh R
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDDS-935.000.patch
>
>



[jira] [Updated] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2019-01-08 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-935:
-
Attachment: HDDS-935.000.patch

> Avoid creating an already created container on a datanode in case of disk 
> removal followed by datanode restart
> --
>
> Key: HDDS-935
> URL: https://issues.apache.org/jira/browse/HDDS-935
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Rakesh R
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDDS-935.000.patch
>
>