This is an automated email from the ASF dual-hosted git repository.
jojochuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/ozone-site.git
The following commit(s) were added to refs/heads/master by this push:
new 37c57befd HDDS-15525. Update DiskBalancer Doc and Blog with latest
changes (#462)
37c57befd is described below
commit 37c57befdb1ad7fbd5e250fda2011af0a79d2c1f
Author: Gargi Jaiswal <[email protected]>
AuthorDate: Tue Jun 16 10:04:36 2026 +0530
HDDS-15525. Update DiskBalancer Doc and Blog with latest changes (#462)
---
blog/2026-01-29-disk-balancer-preview.md | 23 ++++++--
cspell.yaml | 1 +
.../05-data-balancing/02-disk-balancer.md | 69 ++++++++++++----------
3 files changed, 57 insertions(+), 36 deletions(-)
diff --git a/blog/2026-01-29-disk-balancer-preview.md
b/blog/2026-01-29-disk-balancer-preview.md
index e63ceec62..04377d700 100644
--- a/blog/2026-01-29-disk-balancer-preview.md
+++ b/blog/2026-01-29-disk-balancer-preview.md
@@ -37,14 +37,15 @@ Balancing is local and safe:
- A scheduler periodically checks for imbalance and dispatches copy-and-import
tasks.
- Bandwidth and concurrency are **operator-tunable** to avoid interfering with
production I/O.
-This runs independently on each Datanode. To use it, first enable the feature
by setting `hdds.datanode.disk.balancer.enabled = true` in `ozone-site.xml` on
your Datanodes. Once enabled, clients use `ozone admin datanode diskbalancer`
commands to talk directly to Datanodes, with SCM only used to discover
IN_SERVICE Datanodes when running batch operations with
`--in-service-datanodes`.
+This runs independently on each Datanode. The feature can be disabled by
setting `hdds.datanode.disk.balancer.enabled = false` in `ozone-site.xml` on
your Datanodes. Once disabled, clients can no longer use `ozone admin datanode
diskbalancer` commands to balance disks on a datanode.
## How DiskBalancer Decides What to Move
-DiskBalancer uses simple but robust policies to decide **which disks to
balance** and **which containers to move** (see the design doc for details:
`diskbalancer.md` in
[HDDS-5713](https://issues.apache.org/jira/browse/HDDS-5713)).
+DiskBalancer uses a simple but robust policy to decide **which disks to
balance** and **which containers to move** (see the design doc for details:
`diskbalancer.md` in
[HDDS-5713](https://issues.apache.org/jira/browse/HDDS-5713)).
-- **Default Volume Choosing Policy**: Picks the most over‑utilized volume as
the source and the most under‑utilized volume as the destination, based on each
disk’s **Volume Data Density** and the Datanode’s average utilization.
-- **Default Container Choosing Policy**: Scans containers on the source volume
and moves only **CLOSED** containers that are not already being moved. To avoid
repeatedly scanning the same list, it caches container metadata with automatic
expiry.
+- **Default Container Choosing Policy**: This is the default policy that
consolidates both volume selection and container selection into a single
operation. It identifies the most over-utilized volume
+as the source and the most under-utilized volume with sufficient space as the
destination, then iterates through containers on the source to pick the first
one that is movable (per `hdds.datanode.disk.balancer.container.states`,
+default **CLOSED** and **QUASI_CLOSED**) and is not already being moved. It
caches the list of containers for each volume, and the cache expires automatically after one hour.
These defaults aim to make safe, incremental moves that converge the disks
toward an even utilization state.
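
The selection described in the updated policy text can be illustrated with a minimal Python sketch. This is not the Ozone implementation: the `Volume` and `Container` classes, the `choose_move` function, and the `in_progress` set are hypothetical names introduced here, and "sufficient space" is assumed to mean that the destination has room for the chosen container.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    container_id: int
    state: str        # e.g. "CLOSED", "QUASI_CLOSED", "OPEN"
    size_bytes: int


@dataclass
class Volume:
    name: str
    capacity_bytes: int
    used_bytes: int
    containers: list = field(default_factory=list)

    @property
    def utilization(self) -> float:
        return self.used_bytes / self.capacity_bytes

    @property
    def free_bytes(self) -> int:
        return self.capacity_bytes - self.used_bytes


def choose_move(volumes, movable_states=("CLOSED", "QUASI_CLOSED"),
                in_progress=frozenset()):
    """Return (source, destination, container), or None if nothing can move."""
    if len(volumes) < 2:
        return None
    # Source: the most utilized volume on this Datanode.
    source = max(volumes, key=lambda v: v.utilization)
    for container in source.containers:
        # Skip containers in a non-movable state or already being moved.
        if container.state not in movable_states:
            continue
        if container.container_id in in_progress:
            continue
        # Destination: the least utilized other volume with room (assumption).
        candidates = [v for v in volumes
                      if v is not source and v.free_bytes >= container.size_bytes]
        if candidates:
            return source, min(candidates, key=lambda v: v.utilization), container
    return None
```

The `movable_states` default mirrors the documented default of `hdds.datanode.disk.balancer.container.states`; the one-hour container list cache is left out of the sketch.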
@@ -56,13 +57,22 @@ When DiskBalancer moves a container from one disk to
another on the **same Datan
2. Transition that copy into a **RECOVERING** state and import it as a new
container on the destination.
3. Once import and metadata updates succeed, delete the original CLOSED
container from the source disk.
+```
+D1 ----> C1-CLOSED --- (5) ---> C1-DELETED
+ |
+ |
+ (1)
+ |
+D2 ----> Temp C1-CLOSED --- (2) ---> Temp C1-RECOVERING --- (3) ---> C1-RECOVERING --- (4) ---> C1-CLOSED
+```
+
This ensures that data is always consistent: the destination copy is fully
validated before the original is removed, minimizing risk during balancing.
## Using Disk Balancer
-First, enable the Disk Balancer feature on each Datanode by setting the
following in `ozone-site.xml`:
+The Disk Balancer has a feature flag that is **enabled by default** on each
Datanode. It can be disabled by setting the following property in
`ozone-site.xml`:
-- `hdds.datanode.disk.balancer.enabled = true`
+- `hdds.datanode.disk.balancer.enabled = false`
The Disk Balancer CLI supports two command patterns:
@@ -121,6 +131,7 @@ The following parameters can be specified during **start**
or **update configura
| `--bandwidth-in-mb` | `-b` | `10` | Maximum bandwidth for
DiskBalancer per second. |
| `--parallel-thread` | `-p` | `5` | Max parallel thread count for
DiskBalancer. |
| `--stop-after-disk-even` | `-s` | `true` | Stop DiskBalancer
automatically after disk utilization is even. |
+| `--container-states` | `-c` | `CLOSED,QUASI_CLOSED` | Comma-separated list
of container states that are eligible for moving during balancing. |
## Benefits for operators
diff --git a/cspell.yaml b/cspell.yaml
index 6e0fbeb65..b7264e3d3 100644
--- a/cspell.yaml
+++ b/cspell.yaml
@@ -103,6 +103,7 @@ words:
- AOS
- FCQ
- QoS
+- QUASI_CLOSED
# Other systems' words
- savepoints
- HDDs
diff --git
a/docs/05-administrator-guide/03-operations/05-data-balancing/02-disk-balancer.md
b/docs/05-administrator-guide/03-operations/05-data-balancing/02-disk-balancer.md
index 8c1174cf9..9c3662df1 100644
---
a/docs/05-administrator-guide/03-operations/05-data-balancing/02-disk-balancer.md
+++
b/docs/05-administrator-guide/03-operations/05-data-balancing/02-disk-balancer.md
@@ -21,9 +21,9 @@ A disk is considered a candidate for balancing if its
`VolumeDataDensity` exceed
## Feature Flag
-The Disk Balancer feature is introduced with a feature flag. By default, this
feature is disabled.
+The Disk Balancer feature is introduced with a feature flag. By default, this
feature is enabled.
-The feature can be **enabled** by setting the following property to `true` in
the `ozone-site.xml` configuration file: `hdds.datanode.disk.balancer.enabled =
true`
+The feature can be **disabled** by setting the following property to `false` in
the `ozone-site.xml` configuration file: `hdds.datanode.disk.balancer.enabled =
false`.
## Authentication and Authorization
@@ -45,9 +45,12 @@ In secure clusters with Kerberos enabled, the Datanode must
have its Kerberos pr
</description>
</property>
```
+:::note
-**Note:** Without this configuration, DiskBalancer commands will fail with
authentication errors in secure clusters. The client uses this principal to
verify the Datanode's identity when establishing RPC connections.
+ Without this configuration, DiskBalancer commands will fail with
authentication errors in secure clusters. The client uses this principal to
verify the
+ Datanode's identity when establishing RPC connections.
+:::
### Authorization Configuration
Each Datanode performs authorization checks using `OzoneAdmins` based on the
`ozone.administrators` configuration:
@@ -96,7 +99,11 @@ To allow other users to perform DiskBalancer admin
operations (start, stop, upda
The DiskBalancer is managed through the `ozone admin datanode diskbalancer`
command.
-**Note:** This command is hidden from the main help message (`ozone admin
datanode --help`). This is because the feature is currently considered
experimental and is disabled by default. The command is, however, fully
functional for those who wish to enable and use the feature.
+:::note
+
+ DiskBalancer is enabled by default on Datanodes. Set
`hdds.datanode.disk.balancer.enabled=false` in `ozone-site.xml` to disable the
service on Datanodes and prevent CLI commands from running.
+
+:::
### Command Syntax
@@ -132,15 +139,16 @@ ozone admin datanode diskbalancer report
[<datanode-address> ...] [--in-service-
### Command Options
-| Option | Description
| Example |
-|--------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------|
-| `<datanode-address>` | One or more Datanode addresses as positional
arguments. Addresses can be:<br />- Hostname (e.g., `DN-1`) - uses default
CLIENT_RPC port (19864)<br />- Hostname with port (e.g., `DN-1:19864`)<br />-
IP address (e.g., `192.168.1.10`)<br />- IP address with port (e.g.,
`192.168.1.10:19864`)<br />- Stdin (`-`) - reads Datanode addresses from
standard input, one per line | `DN-1`<br />`DN-1:19864`<br />`192.168.1.10`<br
/>`-` |
-| `--in-service-datanodes` | It queries SCM for all IN_SERVICE Datanodes and
executes the command on all of them.
| `--in-service-datanodes` |
-| `--json` | Format output as JSON.
| `--json` |
-| `-t/--threshold-percentage` | Volume density threshold percentage (default:
10.0). Used with `start` and `update` commands.
| `-t 5`<br />`--threshold-percentage 5.0` |
-| `-b/--bandwidth-in-mb` | Maximum disk bandwidth in MB/s (default: 10). Used
with `start` and `update` commands.
|
`-b 20`<br />`--bandwidth-in-mb 50` |
-| `-p/--parallel-thread` | Number of parallel threads (default: 1). Used with
`start` and `update` commands.
|
`-p 5`<br />`--parallel-thread 10` |
-| `-s/--stop-after-disk-even` | Stop automatically after disks are balanced
(default: true). Used with `start` and `update` commands.
| `-s false`<br />`--stop-after-disk-even true` |
+| Option | Description
| Example
|
+|--------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------|
+| `<datanode-address>` | One or more Datanode addresses as positional
arguments. Addresses can be:<br />- Hostname (e.g., `DN-1`) - uses default
CLIENT_RPC port (19864)<br />- Hostname with port (e.g., `DN-1:19864`)<br />-
IP address (e.g., `192.168.1.10`)<br />- IP address with port (e.g.,
`192.168.1.10:19864`)<br />- Stdin (`-`) - reads Datanode addresses from
standard input, one per line | `DN-1`<br />`DN-1:19864`<br />`192.168.1.10`<br
/>`-` |
+| `--in-service-datanodes` | It queries SCM for all IN_SERVICE Datanodes and
executes the command on all of them.
| `--in-service-datanodes` |
+| `--json` | Format output as JSON.
| `--json`
|
+| `-t/--threshold-percentage` | Volume density threshold percentage (default:
10.0). Used with `start` and `update` commands.
| `-t 5`<br />`--threshold-percentage 5.0` |
+| `-b/--bandwidth-in-mb` | Maximum disk bandwidth in MB/s (default: 10). Used
with `start` and `update` commands.
|
`-b 20`<br />`--bandwidth-in-mb 50` |
+| `-p/--parallel-thread` | Number of parallel threads (default: 1). Used with
`start` and `update` commands.
|
`-p 5`<br />`--parallel-thread 10` |
+| `-s/--stop-after-disk-even` | Stop automatically after disks are balanced
(default: true). Used with `start` and `update` commands.
| `-s false`<br />`--stop-after-disk-even true` |
+| `-c/--container-states` | Comma-separated list of container lifecycle states that are eligible for moving between disks (default: `CLOSED,QUASI_CLOSED`). Used with `start` and `update` commands. | `-c CLOSED,QUASI_CLOSED`<br />`--container-states CLOSED` |
### Examples
@@ -150,7 +158,7 @@ ozone admin datanode diskbalancer report
[<datanode-address> ...] [--in-service-
# Start DiskBalancer on multiple datanodes
ozone admin datanode diskbalancer start DN-1 DN-2 DN-3
-# Start DiskBalancer on all IN_SERVICE datanodes
+# Start DiskBalancer on all IN_SERVICE and HEALTHY datanodes
ozone admin datanode diskbalancer start --in-service-datanodes
# Start DiskBalancer with configuration parameters
@@ -171,7 +179,7 @@ ozone admin datanode diskbalancer start DN-1 --json
# Stop DiskBalancer on multiple datanodes
ozone admin datanode diskbalancer stop DN-1 DN-2 DN-3
-# Stop DiskBalancer on all IN_SERVICE datanodes
+# Stop DiskBalancer on all IN_SERVICE and HEALTHY datanodes
ozone admin datanode diskbalancer stop --in-service-datanodes
# Stop DiskBalancer with json output
@@ -184,7 +192,7 @@ ozone admin datanode diskbalancer stop DN-1 --json
# Update multiple parameters
ozone admin datanode diskbalancer update DN-1 -t 5 -b 50 -p 10
-# Update on all IN_SERVICE datanodes
+# Update on all IN_SERVICE and HEALTHY datanodes
ozone admin datanode diskbalancer update --in-service-datanodes -t 5
# Or using the long form:
ozone admin datanode diskbalancer update --in-service-datanodes
--threshold-percentage 5
@@ -199,7 +207,7 @@ ozone admin datanode diskbalancer update DN-1 -b 50 --json
# Get status from multiple datanodes
ozone admin datanode diskbalancer status DN-1 DN-2 DN-3
-# Get status from all IN_SERVICE datanodes
+# Get status from all IN_SERVICE and HEALTHY datanodes
ozone admin datanode diskbalancer status --in-service-datanodes
# Get status as JSON
@@ -212,7 +220,7 @@ ozone admin datanode diskbalancer status
--in-service-datanodes --json
# Get report from multiple datanodes
ozone admin datanode diskbalancer report DN-1 DN-2 DN-3
-# Get report from all IN_SERVICE datanodes
+# Get report from all IN_SERVICE and HEALTHY datanodes
ozone admin datanode diskbalancer report --in-service-datanodes
# Get report as JSON
@@ -223,15 +231,16 @@ ozone admin datanode diskbalancer report
--in-service-datanodes --json
The DiskBalancer's behavior can be controlled using the following
configuration properties in `ozone-site.xml`.
-| Property | Default
| Purpose
|
-|----------|--------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `hdds.datanode.disk.balancer.enabled` | `false`
| If false, the
DiskBalancer service on the Datanode is disabled. Configure it to true for
diskBalancer to be enabled.
|
-| `hdds.datanode.disk.balancer.volume.density.threshold.percent` | `10.0`
| A percentage (0-100). A Datanode is considered balanced if for each
volume, its utilization differs from the average Datanode utilization by no
more than this threshold. |
-| `hdds.datanode.disk.balancer.max.disk.throughputInMBPerSec` | `10`
| The maximum bandwidth (in MB/s) that the balancer can use for moving data,
to avoid impacting client I/O.
|
-| `hdds.datanode.disk.balancer.parallel.thread` | `5`
| The
number of worker threads to use for moving containers in parallel.
|
-| `hdds.datanode.disk.balancer.service.interval` | `60s`
| The time
interval at which the Datanode DiskBalancer service checks for imbalance and
updates its configuration.
|
-| `hdds.datanode.disk.balancer.stop.after.disk.even` | `true`
| If
true, the DiskBalancer will automatically stop its balancing activity once
disks are considered balanced (i.e., all volume densities are within the
threshold). |
-| `hdds.datanode.disk.balancer.volume.choosing.policy` |
`org.apache.hadoop.`<br />`ozone.container.` <br />`diskbalancer.policy.` <br
/>`DefaultVolumeChoosingPolicy` | The policy class for selecting source and
destination volumes for balancing.
|
-| `hdds.datanode.disk.balancer.container.choosing.policy` |
`org.apache.hadoop.`<br />`ozone.container.` <br />`diskbalancer.policy.` <br
/>`DefaultContainerChoosingPolicy` | The policy class for selecting which
containers to move from a source volume to destination volume.
|
-| `hdds.datanode.disk.balancer.service.timeout` | `300s`
| Timeout
for the Datanode DiskBalancer service operations.
|
-| `hdds.datanode.disk.balancer.should.run.default` | `false`
| If the
balancer fails to read its persisted configuration, this value determines if
the service should run by default.
|
+| Property | Default
| Purpose
|
+|----------|------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `hdds.datanode.disk.balancer.enabled` | `true`
|
If false, the DiskBalancer service on the Datanode is disabled. By default,
DiskBalancer is enabled on datanodes.
|
+| `hdds.datanode.disk.balancer.volume.density.threshold.percent` | `10.0`
| A percentage (0-100). A Datanode is considered balanced
if for each volume, its utilization differs from the average Datanode
utilization by no more than this threshold. |
+| `hdds.datanode.disk.balancer.max.disk.throughputInMBPerSec` | `10`
| The maximum bandwidth (in MB/s) that the balancer can use
for moving data, to avoid impacting client I/O.
|
+| `hdds.datanode.disk.balancer.parallel.thread` | `5`
| The number of worker threads to use for moving containers in parallel.
|
+| `hdds.datanode.disk.balancer.service.interval` | `60s`
| The time interval at which the Datanode DiskBalancer service checks for
imbalance and updates its configuration.
|
+| `hdds.datanode.disk.balancer.stop.after.disk.even` | `true`
| If true, the DiskBalancer will automatically stop its balancing
activity once disks are considered balanced (i.e., all volume densities are
within the threshold). |
+| `hdds.datanode.disk.balancer.replica.deletion.delay` | `5m` | The delay between a container being successfully moved from the source volume to the destination volume and the deletion of the source container replica. This lazy deletion gives read threads that still hold the old container replica a grace period before their reads fail. Supported units: ns, ms, s, m, h, d. |
+| `hdds.datanode.disk.balancer.container.states` | `CLOSED,QUASI_CLOSED`
| Comma-separated container lifecycle state names that may be moved
between disks (must match enum names exactly, uppercase). Default includes
**CLOSED** and **QUASI_CLOSED**; extend the list when additional states need
to be balanced. All defined container states which are eligible to move
**QUASI_CLOSED**, **CLOSED**, [...]
+| `hdds.datanode.disk.balancer.container.choosing.policy` |
`org.apache.hadoop.`<br />`ozone.container.` <br />`diskbalancer.policy.` <br
/>`DefaultContainerChoosingPolicy` | The policy for selecting
source/destination volumes and which containers to move.
|
+| `hdds.datanode.disk.balancer.service.timeout` | `300s`
| Timeout for the Datanode DiskBalancer service operations.
|
+| `hdds.datanode.disk.balancer.should.run.default` | `false`
| If the balancer fails to read its persisted configuration, this value
determines if the service should run by default.
|
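
The balance condition given for `hdds.datanode.disk.balancer.volume.density.threshold.percent` can be written out as a small Python check. This is a sketch under stated assumptions, not the Datanode's code: `is_balanced` is a hypothetical helper, and the "average Datanode utilization" is assumed to be total used space divided by total capacity across all volumes.

```python
def is_balanced(volumes, threshold_percent=10.0):
    """Check the documented balance condition.

    volumes: list of (used_bytes, capacity_bytes) tuples, one per volume.
    """
    total_used = sum(used for used, _ in volumes)
    total_capacity = sum(capacity for _, capacity in volumes)
    # Assumption: the Datanode average is total used / total capacity.
    average = total_used / total_capacity
    # Balanced when every volume is within the threshold of the average.
    return all(
        abs(used / capacity - average) * 100 <= threshold_percent
        for used, capacity in volumes
    )
```

With the default threshold of `10.0`, two equally sized volumes at 50% and 55% utilization count as balanced, while volumes at 90% and 10% do not.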