[
https://issues.apache.org/jira/browse/HDDS-15535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rakesh Radhakrishnan updated HDDS-15535:
----------------------------------------
Labels: container-balancer (was: )
> Container Balancer should validate configuration and report startup failures
> to user
> ------------------------------------------------------------------------------------
>
> Key: HDDS-15535
> URL: https://issues.apache.org/jira/browse/HDDS-15535
> Project: Apache Ozone
> Issue Type: Task
> Reporter: Ashish Kumar
> Priority: Major
> Labels: container-balancer
>
> When the balancer is started from the CLI, it reports that it started and
> then silently fails internally.
> Instead, such failures should be shown to the user interactively in the UI
> so that the command can be corrected.
> Example: a cluster has 10 nodes and we run the balancer with the config below:
> {code:java}
> ozone admin containerbalancer start \
>   --balancing-iteration-interval-minutes=15 \
>   --max-datanodes-percentage-to-involve-per-iteration=10 \
>   --max-size-entering-target-in-gb=50 \
>   --iterations=10 \
>   --max-size-leaving-source-in-gb=50 \
>   --max-size-to-move-per-iteration-in-gb=300 \
>   --threshold=5
> Container Balancer started successfully. {code}
> CLI output: Container Balancer started successfully.
> However, in a cluster with 10 datanodes,
> max-datanodes-percentage-to-involve-per-iteration=10 allows only
> 10% of 10 datanodes = 1 datanode to take part in an iteration.
> Container balancing requires both a source and a target datanode to perform
> a move. With only one datanode allowed to participate, balancing cannot
> proceed.
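
The arithmetic above could be turned into a check that runs before the CLI prints its success message. The following is only a minimal sketch under stated assumptions: the class `BalancerConfigCheck` and its methods are hypothetical and are not part of Ozone's actual API, and it assumes the datanode limit is computed by rounding down, which matches the limit of 1 seen in the SCM log.

```java
import java.util.Optional;

/** Hypothetical pre-start check; illustrative only, not Ozone's API. */
public final class BalancerConfigCheck {

  private BalancerConfigCheck() { }

  /**
   * Number of datanodes allowed to take part in one iteration.
   * Assumption: the percentage is applied with rounding down.
   */
  public static int maxDatanodesToInvolve(int totalDatanodes, int percentage) {
    return (int) ((long) totalDatanodes * percentage / 100);
  }

  /**
   * Returns an error message when the configuration cannot balance
   * anything, or an empty Optional when the check passes.
   */
  public static Optional<String> validate(int totalDatanodes, int percentage) {
    if (percentage < 0 || percentage > 100) {
      return Optional.of(
          "percentage must be between 0 and 100, got " + percentage);
    }
    int limit = maxDatanodesToInvolve(totalDatanodes, percentage);
    // A move needs one source and one target, so at least two datanodes.
    if (limit < 2) {
      return Optional.of(
          percentage + "% of " + totalDatanodes + " datanodes allows only "
              + limit + " datanode(s) per iteration; at least 2 are needed");
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // The configuration from this issue: 10 datanodes, 10 percent.
    System.out.println(validate(10, 10).orElse("OK"));
    // A configuration that leaves room for a source and a target.
    System.out.println(validate(10, 20).orElse("OK"));
  }
}
```

With a check along these lines, the start command in this issue would be rejected with an explanatory message instead of reporting success, while the same command with a percentage of 20 would pass.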
>
> The failure is only visible in SCM logs:
> {code:java}
> 2026-06-10 06:49:56,159 DEBUG
> [node2-ContainerBalancerTask-2]-org.apache.hadoop.hdds.scm.container.balancer.ContainerBalancerTask:
> Approaching max datanodes to involve limit. 0 datanodes have already been
> selected for balancing and the limit is 1. Only already selected targets can
> be selected as targets now.
> ------
> 2026-06-10 06:49:56,178 INFO
> [node2-ContainerBalancerTask-2]-org.apache.hadoop.hdds.scm.container.balancer.ContainerBalancerTask:
> Result of this iteration of Container Balancer:
> CAN_NOT_BALANCE_ANY_MORE{code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)