[
https://issues.apache.org/jira/browse/HDDS-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sadanand Shenoy updated HDDS-9055:
----------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
> Datanode decommission Failed, Follower never received the command
> -----------------------------------------------------------------
>
> Key: HDDS-9055
> URL: https://issues.apache.org/jira/browse/HDDS-9055
> Project: Apache Ozone
> Issue Type: Bug
> Components: SCM HA
> Reporter: Soumitra Sulav
> Assignee: Sumit Agrawal
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.4.0
>
>
> *Issue:*
> In one of the Cloudera system tests, 2 Datanodes are scheduled for
> decommission after the data write and data pipeline close.
> The LEADER node received the scheduled decommission command as the test
> expected, but the FOLLOWER never received the decommission command.
> *Summary logs :*
> Follower
> {code:java}
> 19:58:04,931 : persistedOpState: DECOMMISSIONING, the value stored in SCM
> (IN_SERVICE, 0)
> 19:58:10,016 : persistedOpState: IN_SERVICE, the value stored in SCM
> (DECOMMISSIONING, 0)
> {code}
> Leader: TimeOut
> {code:java}
> 2023-07-20 19:38:31,689 : persistedOpState: IN_SERVICE, the value stored in
> SCM (DECOMMISSIONING, 0)
> ...... multiple retries .......
> 2023-07-20 19:55:54,323 : persistedOpState: IN_SERVICE, the value stored in
> SCM (DECOMMISSIONING, 0)
> 2023-07-20 19:56:24,344 : persistedOpState: IN_SERVICE, the value stored in
> SCM (DECOMMISSIONING, 0)
> 2023-07-20 19:58:04,931 : persistedOpState: DECOMMISSIONING, the value stored
> in SCM (IN_SERVICE, 0)
> {code}
> *Detailed logs :*
> {code:java}
> FOLLOWER
> 2023-07-20 19:58:04,931 INFO org.apache.hadoop.hdds.scm.node.SCMNodeManager:
> Update the operationalState saved in follower SCM for
> 33c95701-aaa5-4b08-a56b-70ac5d237187{ip: 172.27.12.66, host:
> quasar-zqlpfe-5.quasar-zqlpfe.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default-rack, certSerialId: 70976812254805668,
> persistedOpState: DECOMMISSIONING, persistedOpStateExpiryEpochSec: 0} as the
> reported value does not match the value stored in SCM (IN_SERVICE, 0)
> 2023-07-20 19:58:10,016 INFO org.apache.hadoop.hdds.scm.node.SCMNodeManager:
> Update the operationalState saved in follower SCM for
> 33c95701-aaa5-4b08-a56b-70ac5d237187{ip: 172.27.12.66, host:
> quasar-zqlpfe-5.quasar-zqlpfe.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default-rack, certSerialId: 70976812254805668,
> persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0} as the
> reported value does not match the value stored in SCM (DECOMMISSIONING, 0)
> LEADER
> 2023-07-20 19:56:24,344 INFO org.apache.hadoop.hdds.scm.node.SCMNodeManager:
> Scheduling a command to update the operationalState persisted on
> 33c95701-aaa5-4b08-a56b-70ac5d237187{ip: 172.27.12.66, host:
> quasar-zqlpfe-5.quasar-zqlpfe.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default-rack, certSerialId: 70976812254805668,
> persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0} as the
> reported value does not match the value stored in SCM (DECOMMISSIONING, 0)
> 2023-07-20 19:58:04,931 INFO org.apache.hadoop.hdds.scm.node.SCMNodeManager:
> Scheduling a command to update the operationalState persisted on
> 33c95701-aaa5-4b08-a56b-70ac5d237187{ip: 172.27.12.66, host:
> quasar-zqlpfe-5.quasar-zqlpfe.root.hwx.site, ports: [REPLICATION=9886,
> RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859],
> networkLocation: /default-rack, certSerialId: 70976812254805668,
> persistedOpState: DECOMMISSIONING, persistedOpStateExpiryEpochSec: 0} as the
> reported value does not match the value stored in SCM (IN_SERVICE, 0)
> {code}
> Please find the SCM logs attached for more details.