Wei-Chiu Chuang created HDDS-15350:
--------------------------------------
Summary: Divide by zero bug crash SCM when decommission datanode
Key: HDDS-15350
URL: https://issues.apache.org/jira/browse/HDDS-15350
Project: Apache Ozone
Issue Type: Bug
Components: SCM
Reporter: Wei-Chiu Chuang
Encountered an interesting bug:
{noformat}
2026-05-22 16:23:39,439 ERROR
[ReplicationMonitor]-org.apache.hadoop.hdds.scm.container.replication.ReplicationManager:
Exception in Replication Monitor Thr
ead.
java.lang.ArithmeticException: / by zero
at
org.apache.hadoop.hdds.scm.SCMCommonPlacementPolicy.getMaxReplicasPerRack(SCMCommonPlacementPolicy.java:419)
at
org.apache.hadoop.hdds.scm.SCMCommonPlacementPolicy.validateContainerPlacement(SCMCommonPlacementPolicy.java:466)
at
org.apache.hadoop.hdds.scm.container.replication.health.ECMisReplicationCheckHandler.getPlacementStatus(ECMisReplicationCheckHandler.java:138)
at
org.apache.hadoop.hdds.scm.container.replication.health.ECMisReplicationCheckHandler.checkMisReplication(ECMisReplicationCheckHandler.java:93)
at
org.apache.hadoop.hdds.scm.container.replication.health.ECMisReplicationCheckHandler.handle(ECMisReplicationCheckHandler.java:69)
at
org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:38)
at
org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
at
org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
at
org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
at
org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
at
org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
at
org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
at
org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
at
org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
at
org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
at
org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
at
org.apache.hadoop.hdds.scm.container.replication.ReplicationManager.processContainer(ReplicationManager.java:899)
at
org.apache.hadoop.hdds.scm.container.replication.ReplicationManager.processContainer(ReplicationManager.java:872)
at
org.apache.hadoop.hdds.scm.container.replication.ReplicationManager.processAll(ReplicationManager.java:399)
at
org.apache.hadoop.hdds.scm.container.replication.ReplicationManager.run(ReplicationManager.java:953)
at java.lang.Thread.run(Thread.java:748)
2026-05-22 16:23:39,442 INFO
[ReplicationMonitor]-org.apache.hadoop.util.ExitUtil: Exiting with status 1:
java.lang.ArithmeticException: / by zero {noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]