Jyotirmoy Sinha created HDDS-10951:
--------------------------------------
Summary: Container is stuck in CLOSING state for more than 12
hours on getting ICR of UNHEALTHY replica
Key: HDDS-10951
URL: https://issues.apache.org/jira/browse/HDDS-10951
Project: Apache Ozone
Issue Type: Bug
Components: SCM
Reporter: Jyotirmoy Sinha
Steps to reproduce:
* Create a volume, bucket, and key
* Simulate an UNHEALTHY replica in the container that holds the key
* Wait for the container to close
Expected behaviour: the container should close soon after SCM receives the
ICR (incremental container report) for the UNHEALTHY replica.
Actual behaviour: the container remains stuck in CLOSING state for more than
12 hours after SCM receives the ICR.
Container close was initiated at:
{code:java}
2023-10-26 19:56:08,079 INFO [FixedThreadPoolWithAffinityExecutor-1-0]-org.apache.hadoop.hdds.scm.container.IncrementalContainerReportHandler: Moving OPEN container #18002 to CLOSING state, datanode f2a6be07-db06-430b-8311-534247744f99(quasar-yzwbdi-8.quasar-yzwbdi.root.hwx.site/172.27.112.2) reported UNHEALTHY replica with index 0.
{code}
Current state of the container:
{code:java}
root@quasar-yzwbdi-1:~# ozone admin container info 18002
Container id: 18002
Pipeline id: 34771df9-8ba5-4a3e-9e48-abb590e67ea2
Container State: CLOSING
Datanodes:
[f2a6be07-db06-430b-8311-534247744f99/quasar-yzwbdi-8.quasar-yzwbdi.root.hwx.site,
baa35af1-7b51-4275-b465-f750c429c618/quasar-yzwbdi-5.quasar-yzwbdi.root.hwx.site,
f40aed3a-dddf-4f2b-a30f-035136bfceba/quasar-yzwbdi-4.quasar-yzwbdi.root.hwx.site]
Replicas: [State: CLOSING; ReplicaIndex: 0; Origin:
f40aed3a-dddf-4f2b-a30f-035136bfceba; Location:
f40aed3a-dddf-4f2b-a30f-035136bfceba/quasar-yzwbdi-4.quasar-yzwbdi.root.hwx.site,
State: UNHEALTHY; ReplicaIndex: 0; Origin:
f2a6be07-db06-430b-8311-534247744f99; Location:
f2a6be07-db06-430b-8311-534247744f99/quasar-yzwbdi-8.quasar-yzwbdi.root.hwx.site,
State: CLOSING; ReplicaIndex: 0; Origin: baa35af1-7b51-4275-b465-f750c429c618;
Location:
baa35af1-7b51-4275-b465-f750c429c618/quasar-yzwbdi-5.quasar-yzwbdi.root.hwx.site]
root@quasar-yzwbdi-1:~# date
Fri 27 Oct 2023 04:57:29 AM UTC
{code}
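For triage, the replica states in the output above can be tallied with a small script. The following is a minimal Python sketch, not part of Ozone: the function name `summarize_container_info` and both regular expressions are assumptions based only on the output layout quoted in this report, and `SAMPLE` is an abbreviated copy of that output (the Datanodes list is omitted).

```python
import re
from collections import Counter

# Assumed patterns, derived only from the text layout of the
# `ozone admin container info` output quoted in this report.
CONTAINER_STATE_RE = re.compile(r"Container State:\s*(\w+)")
REPLICA_STATE_RE = re.compile(r"State:\s*(\w+);\s*ReplicaIndex:\s*(\d+)")


def summarize_container_info(text):
    """Return (container_state, Counter of replica states) parsed from text.

    Hypothetical helper: container_state is None if no state line is found.
    """
    match = CONTAINER_STATE_RE.search(text)
    container_state = match.group(1) if match else None
    # Each replica entry looks like "State: X; ReplicaIndex: N; ...".
    # The "Container State: X" line has no "; ReplicaIndex" and is skipped.
    replica_states = Counter(
        state for state, _index in REPLICA_STATE_RE.findall(text)
    )
    return container_state, replica_states


# Abbreviated copy of the output in this report (Datanodes list omitted).
SAMPLE = """\
Container id: 18002
Pipeline id: 34771df9-8ba5-4a3e-9e48-abb590e67ea2
Container State: CLOSING
Replicas: [State: CLOSING; ReplicaIndex: 0; Origin:
f40aed3a-dddf-4f2b-a30f-035136bfceba; Location:
f40aed3a-dddf-4f2b-a30f-035136bfceba/quasar-yzwbdi-4.quasar-yzwbdi.root.hwx.site,
State: UNHEALTHY; ReplicaIndex: 0; Origin:
f2a6be07-db06-430b-8311-534247744f99; Location:
f2a6be07-db06-430b-8311-534247744f99/quasar-yzwbdi-8.quasar-yzwbdi.root.hwx.site,
State: CLOSING; ReplicaIndex: 0; Origin: baa35af1-7b51-4275-b465-f750c429c618;
Location:
baa35af1-7b51-4275-b465-f750c429c618/quasar-yzwbdi-5.quasar-yzwbdi.root.hwx.site]
"""

if __name__ == "__main__":
    state, replicas = summarize_container_info(SAMPLE)
    print(f"container state: {state}")
    for replica_state, count in sorted(replicas.items()):
        print(f"  {replica_state}: {count}")
```

For the output in this report, the sketch finds a CLOSING container with two CLOSING replicas and one UNHEALTHY replica, which matches the stuck state described above.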