Nilotpal Nandi created HDDS-988:
-----------------------------------
Summary: Containers remain in CLOSING state in one of the
datanodes when the datanodes are isolated in a docker cluster
Key: HDDS-988
URL: https://issues.apache.org/jira/browse/HDDS-988
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Datanode, SCM
Reporter: Nilotpal Nandi
Attachments: datanode_1, datanode_2, datanode_3, om, scm
Steps taken:
-------------------
# Created a 3-datanode docker cluster.
# Wrote some data to create a pipeline.
# Isolated all datanodes, i.e., the datanodes could not communicate with each
other (they could still communicate with SCM and OM).
# Tried to write some data again; the write failed as expected.
# After waiting for 'ozone.scm.stale.node.interval' and
'ozone.scm.dead.node.interval', the container replicas were still in the CLOSING
state. The containers never reached the CLOSED state.
{noformat}
hadoop@8876c7214ee5:~$ cat /data/hdds/hdds/40bb080a-1a9f-42c8-9e20-8257ed567e46/current/containerDir0/*/metadata/*.container
!<KeyValueContainerData>
checksum: 7ee8f706cf215a5fa4b7e9a195529c15147823ceea302ab4998c7476ee64ebf4
chunksPath: /data/hdds/hdds/40bb080a-1a9f-42c8-9e20-8257ed567e46/current/containerDir0/2/chunks
containerDBType: RocksDB
containerID: 2
containerType: KeyValueContainer
layOutVersion: 1
maxSize: 5368709120
metadata: {}
metadataPath: /data/hdds/hdds/40bb080a-1a9f-42c8-9e20-8257ed567e46/current/containerDir0/2/metadata
originNodeId: 6e077f73-9fd9-4f4e-930f-578c9857912c
originPipelineId: ee5f9e7a-0d63-412a-839a-77af2cf7ca93
state: CLOSING
{noformat}
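For anyone reproducing this, the check above could be scripted instead of reading each file by hand. The following is only a rough sketch: the helper names `parse_container_state` and `container_states` are made up for illustration and are not part of Ozone, and it assumes the `.container` file keeps `state:` as a top-level key, as in the output shown above. It deliberately avoids a YAML library, since the leading `!<KeyValueContainerData>` tag may not load with a generic safe loader.

```python
import glob
from typing import Dict, Optional


def parse_container_state(text: str) -> Optional[str]:
    """Return the value of the top-level 'state:' field, or None if absent."""
    for line in text.splitlines():
        if line.startswith("state:"):
            return line.split(":", 1)[1].strip()
    return None


def container_states(pattern: str) -> Dict[str, Optional[str]]:
    """Map each .container file matching the glob pattern to its state."""
    states: Dict[str, Optional[str]] = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as fh:
            states[path] = parse_container_state(fh.read())
    return states


if __name__ == "__main__":
    # Same path pattern as in the cat command above; adjust per datanode.
    pattern = (
        "/data/hdds/hdds/40bb080a-1a9f-42c8-9e20-8257ed567e46/"
        "current/containerDir0/*/metadata/*.container"
    )
    for path, state in container_states(pattern).items():
        print(path, state)
```

Running it inside each datanode container should list every container file on that node together with its recorded state.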
Expectation:
---------------------
The container should have at least two CLOSED replicas.
The SCM, OM, and datanode logs are attached.
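The expectation can also be written down as a small check. This is just a sketch: `meets_expectation` is a hypothetical helper, and the per-datanode states in the example are placeholders for illustration, not values taken from the attached logs.

```python
from typing import Dict


def meets_expectation(replica_states: Dict[str, str], minimum_closed: int = 2) -> bool:
    """Return True if at least `minimum_closed` replicas report CLOSED."""
    closed = sum(1 for state in replica_states.values() if state == "CLOSED")
    return closed >= minimum_closed


if __name__ == "__main__":
    # Placeholder states keyed by datanode name (illustrative only).
    observed = {
        "datanode_1": "CLOSING",
        "datanode_2": "CLOSING",
        "datanode_3": "CLOSING",
    }
    print(meets_expectation(observed))
```

The threshold is left as a parameter in case a different minimum number of CLOSED replicas is the right expectation for this scenario.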