[
https://issues.apache.org/jira/browse/HDDS-7465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Devesh Kumar Singh updated HDDS-7465:
-------------------------------------
Description:
Recon - the \{containerId}/keys API does not return key details for missing
containers that were created while Recon was down.
Steps to reproduce:
# Create a cluster with 3 datanodes and a single, non-HA SCM.
# Bring Recon down.
# Create a few volumes, buckets and keys so that a few (say 5-6) containers
are created.
# Verify that the SCM RocksDB has been updated with these containers. Make sure
that the "ozone.scm.ratis.enable" property is set to false so that containers
and their state are flushed to the SCM RocksDB immediately.
# Bring all 3 datanodes down.
# Containers will move from OPEN to CLOSING. Bring all 3 datanodes up and
make sure that the container state moves from QUASI_CLOSED to CLOSED.
# Bring all 3 datanodes down again.
# Bring Recon up. All containers should be reported as missing.
# For each missing container, expand the container ID to list the key and
volume info.
Expected: Key and volume info is listed for each missing container ID.
Actual: Key and volume info is not listed for the missing container IDs.
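The check in the last step can also be scripted against Recon's REST API
instead of the UI. The following is a minimal Python sketch and is not part of
the original report. The base URL (http://localhost:9888), the
/api/v1/containers path prefix and the "keys" field of the JSON response are
assumptions about the deployment and should be verified against the Recon
version in use; the helper names are hypothetical.

```python
import json
from urllib.parse import quote


def container_keys_url(recon_base_url, container_id):
    """Build the URL of the {containerId}/keys endpoint.

    The /api/v1/containers prefix is an assumption, not taken from the report.
    """
    base = recon_base_url.rstrip("/")
    return f"{base}/api/v1/containers/{quote(str(container_id))}/keys"


def has_key_info(response_body):
    """Return True if the response lists at least one key.

    Assumes the body is a JSON object with a "keys" list (hypothetical schema).
    """
    payload = json.loads(response_body)
    return len(payload.get("keys", [])) > 0


# Assumed Recon address; adjust to the cluster under test.
print(container_keys_url("http://localhost:9888", 1))
```

The body fetched from that URL (for example with urllib.request.urlopen) can
be passed to has_key_info(). With this bug, the result would be False for
every missing container created while Recon was down.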
Root cause: While Recon was down, it did not receive any updates (PUT, UPDATE,
DELETE) from OM for the keys that were created, so no OM delta updates were
applied to the Recon OM DB and the containerKey mapping data is empty.
Solution: Periodically fetch a full snapshot of the OM RocksDB and process it
to build the containerKeyMapping info. If Recon was down when the containers
were created, a full snapshot of the OM DB is never taken because of the logic
shown below. That logic takes a full snapshot only when currentSequenceNumber
is less than or equal to zero, which never happens here because the OM DB
sequence number had already moved beyond zero when the keys were written.
!image-2022-11-09-09-34-16-930.png!
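The attached image is not reproduced in this message. As a rough illustration
of the condition described in the Solution paragraph, the sketch below
contrasts a sequence-number-only check with a variant that also forces a full
snapshot once a configured interval has elapsed. It is written in Python for
brevity and is not the Recon code; all function and parameter names are
hypothetical, and the interval-based trigger is only one possible reading of
"full periodic snapshot".

```python
def full_snapshot_by_sequence(current_sequence_number):
    """Condition as described in the report: a full snapshot is taken
    only when the sequence number is less than or equal to zero."""
    return current_sequence_number <= 0


def full_snapshot_with_period(current_sequence_number,
                              seconds_since_last_full,
                              full_snapshot_interval):
    """Hypothetical variant: additionally take a full snapshot when the
    configured interval since the last full snapshot has elapsed."""
    if current_sequence_number <= 0:
        return True
    return seconds_since_last_full >= full_snapshot_interval


# Scenario from the report: the sequence number is already beyond zero.
print(full_snapshot_by_sequence(42))              # False: no full snapshot
print(full_snapshot_with_period(42, 7200, 3600))  # True: interval elapsed
```

With the sequence-number-only check, the scenario above never triggers a full
snapshot, which matches the empty containerKey mapping seen in the report.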
was:
Recon - the \{containerId}/keys API does not return key details for missing
containers that were created while Recon was down.
Steps to reproduce:
# Create a cluster with 3 datanodes and a single, non-HA SCM.
# Bring Recon down.
# Create a few volumes, buckets and keys so that a few (say 5-6) containers
are created.
# Verify that the SCM RocksDB has been updated with these containers. Make sure
that the "ozone.scm.ratis.enable" property is set to false so that containers
and their state are flushed to the SCM RocksDB immediately.
# Bring all 3 datanodes down.
# Containers will move from OPEN to CLOSING. Bring all 3 datanodes up and
make sure that the container state moves from QUASI_CLOSED to CLOSED.
# Bring all 3 datanodes down again.
# Bring Recon up. All containers should be reported as missing.
# For each missing container, expand the container ID to list the key and
volume info.
Expected: Key and volume info is listed for each missing container ID.
Actual: Key and volume info is not listed for the missing container IDs.
Root cause: While Recon was down, it did not receive any updates (PUT, UPDATE,
DELETE) from OM for the keys that were created, so no OM delta updates were
applied to the Recon OM DB and the containerKey mapping data is empty.
Solution: Periodically fetch a full snapshot of the OM RocksDB and process it
to build the containerKeyMapping info.
> Recon - {containerId}/keys API is not giving details of keys for missing
> containers when recon was down
> -------------------------------------------------------------------------------------------------------
>
> Key: HDDS-7465
> URL: https://issues.apache.org/jira/browse/HDDS-7465
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Devesh Kumar Singh
> Assignee: Devesh Kumar Singh
> Priority: Major
> Attachments: image-2022-11-09-09-34-16-930.png
>