This is an automated email from the ASF dual-hosted git repository.

sammichen pushed a commit to branch HDDS-5713
in repository https://gitbox.apache.org/repos/asf/ozone.git


The following commit(s) were added to refs/heads/HDDS-5713 by this push:
     new 5a1997fcb66 HDDS-14049. [DiskBalancer] Missing stop-after-disk-even in DiskBalancer status output and doc updation (#9415)
5a1997fcb66 is described below

commit 5a1997fcb6657770ed4e61770caf7c8951e2d475
Author: Gargi Jaiswal <[email protected]>
AuthorDate: Wed Dec 3 11:43:07 2025 +0530

    HDDS-14049. [DiskBalancer] Missing stop-after-disk-even in DiskBalancer status output and doc updation (#9415)
---
 hadoop-hdds/docs/content/design/diskbalancer.md    | 101 +++++----------------
 hadoop-hdds/docs/content/feature/DiskBalancer.md   |  76 ++++++++++++++--
 .../docs/content/feature/DiskBalancer.zh.md        |  75 +++++++++++++--
 .../cli/datanode/DiskBalancerStatusSubcommand.java |   8 +-
 .../cli/datanode/TestDiskBalancerSubCommands.java  |   2 +
 5 files changed, 165 insertions(+), 97 deletions(-)

diff --git a/hadoop-hdds/docs/content/design/diskbalancer.md b/hadoop-hdds/docs/content/design/diskbalancer.md
index 0c9a8b4436a..f546b5253d7 100644
--- a/hadoop-hdds/docs/content/design/diskbalancer.md
+++ b/hadoop-hdds/docs/content/design/diskbalancer.md
@@ -119,81 +119,32 @@ and is not already being moved by another balancing operation. To optimize perfo
 containers repeatedly, it caches the list of containers for each volume which auto expires after one hour of its last 
 used time or if the container iterator for that is invalidated on full utilisation.
 
-## CLI Interface
+## Security Design
+DiskBalancer follows the same security model as other services:
 
-The DiskBalancer CLI provides the following commands:
+* **Authentication**: Clients communicate directly with datanodes via RPC. In secure clusters, RPC authentication is required (Kerberos).
 
-### Command Syntax
+* **Authorization**: After successful authentication, each datanode performs authorization checks using `OzoneAdmins` based on the `ozone.administrators` configuration:
+  - **Admin operations** (start, stop, update): Require the authenticated user to be in `ozone.administrators` or belong to a group in `ozone.administrators.groups`
+  - **Read-only operations** (status, report): Do not require admin privileges - any authenticated user can query status and reports
+  
+By default, if `ozone.administrators` is not configured, only the user who launched the datanode service has admin privileges. This ensures that DiskBalancer operations are restricted to authorized administrators while allowing read-only access for monitoring purposes.
 
-**Start DiskBalancer:**
-```bash
-ozone admin datanode diskbalancer start [<datanode-address> ...] [OPTIONS] [--in-service-datanodes]
-```
-
-**Stop DiskBalancer:**
-```bash
-ozone admin datanode diskbalancer stop [<datanode-address> ...] [--in-service-datanodes]
-```
-
-**Update Configuration:**
-```bash
-ozone admin datanode diskbalancer update [<datanode-address> ...] [OPTIONS] [--in-service-datanodes]
-```
-
-**Get Status:**
-```bash
-ozone admin datanode diskbalancer status [<datanode-address> ...] [--in-service-datanodes] [--json]
-```
-
-**Get Report:**
-```bash
-ozone admin datanode diskbalancer report [<datanode-address> ...] [--in-service-datanodes] [--json]
-```
-
-### Command Options
-
-| Option                              | Description | Example                                        |
-|-------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------|
-| `<datanode-address>`                | One or more datanode addresses as positional arguments. Addresses can be:<br>- Hostname (e.g., `DN-1`) - uses default CLIENT_RPC port (9858)<br>- Hostname with port (e.g., `DN-1:9858`)<br>- IP address (e.g., `192.168.1.10`)<br>- IP address with port (e.g., `192.168.1.10:9858`)<br>- Stdin (`-`) - reads datanode addresses from standard input, one per line | `DN-1`<br>`DN-1:9858`<br>`192.168.1.10`<br>`-` |
-| `--in-service-datanodes`            | It queries SCM for all IN_SERVICE datanodes and executes the command on all of them. | `--in-service-datanodes`                       |
-| `--json`                            | Format output as JSON. | `--json`                                       |
-| `-t/--threshold`                    | Volume density threshold percentage (default: 10.0). Used with `start` and `update` commands. | `-t 5`<br>`--threshold 5.0`                    |
-| `-b/--bandwidth-in-mb`              | Maximum disk bandwidth in MB/s (default: 10). Used with `start` and `update` commands. | `-b 20`<br>`--bandwidth-in-mb 50`              |
-| `-p/--parallel-thread`              | Number of parallel threads (default: 1). Used with `start` and `update` commands. | `-p 5`<br>`--parallel-thread 10`               |
-| `-s/--stop-after-disk-even`         | Stop automatically after disks are balanced (default: false). Used with `start` and `update` commands. | `-s false`<br>`--stop-after-disk-even true`    |
-
-### Examples
+## CLI Interface Design
 
-```bash
-# Start DiskBalancer on a specific datanode
-ozone admin datanode diskbalancer start DN-1
+The DiskBalancer CLI provides five main commands that communicate directly with datanodes:
 
-# Start DiskBalancer on multiple datanodes
-ozone admin datanode diskbalancer start DN-1 DN-2 DN-3
+1. **start** - Initiates DiskBalancer on `specified datanodes` or all `in-service-datanodes` with optional configuration parameters
+2. **stop** - Stops DiskBalancer operations on specified datanodes.
+3. **update** - Updates DiskBalancer configuration.
+4. **status** - Retrieves current DiskBalancer status including running state, metrics, and configuration.
+5. **report** - Retrieves volume density report showing imbalance analysis.
 
-# Start DiskBalancer on all IN_SERVICE datanodes
-ozone admin datanode diskbalancer start --in-service-datanodes
-
-# Start DiskBalancer with configuration parameters
-ozone admin datanode diskbalancer start DN-1 -t 5 -b 20 -p 5
-
-# Read datanode addresses from stdin
-echo -e "DN-1\nDN-2" | ozone admin datanode diskbalancer start -
-
-# Get status as JSON
-ozone admin datanode diskbalancer status --in-service-datanodes --json
-
-# Update configuration on specific datanode (partial update - only specified parameters are updated)
-ozone admin datanode diskbalancer update DN-1 -b 50
-```
-
-### Authentication and Authorization
-
-* **Authentication**: RPC authentication is required (e.g., via `kinit` in secure clusters). The client's identity is verified by the datanode's RPC layer.
-
-* **Authorization**: Each datanode performs authorization checks using `OzoneAdmins` based on the `ozone.administrators` configuration:
-  - **Admin operations** (start, stop, update): Require the user to be in `ozone.administrators`
-  - **Read-only operations** (status, report): Do not require admin privileges
+The CLI supports:
+- **Direct datanode addressing**: Commands can target specific datanodes by hostname or IP address
+- **Batch operations**: The `--in-service-datanodes` flag queries SCM for all IN_SERVICE and HEALTHY datanodes and executes commands on all of them
+- **Flexible input**: Datanode addresses can be provided as positional arguments or read from stdin
+- **Output formats**: Results can be displayed in human-readable format or JSON for programmatic access
 
 ### Operational State Awareness
 
@@ -206,17 +157,7 @@ This ensures DiskBalancer respects datanode lifecycle management and does not in
 
 ## Feature Flag
 
-The Disk Balancer feature is introduced with a feature flag. By default, this feature is disabled.
-
-The feature can be enabled by setting the following property to `true` in the `ozone-site.xml` configuration file:
-`hdds.datanode.disk.balancer.enabled = false`
-
-Developers who wish to test or use the Disk Balancer must explicitly enable it. Once the feature is 
-considered stable, the default value may be changed to `true` in a future release.
-
-**Note:** This command is hidden from the main help message (`ozone admin datanode --help`). This is because the feature
-is currently considered experimental and is disabled by default. The command is, however, fully functional for those who
-wish to enable and use the feature.
+The DiskBalancer feature is gated behind a feature flag (`hdds.datanode.disk.balancer.enabled`) to allow controlled rollout. By default, the feature is disabled. When disabled, the DiskBalancer service is not initialized on datanodes, and the CLI commands are hidden from the main help output to prevent accidental usage.
 
 ## DiskBalancer Metrics
 
diff --git a/hadoop-hdds/docs/content/feature/DiskBalancer.md b/hadoop-hdds/docs/content/feature/DiskBalancer.md
index 7b0fe76c2f4..ac0ea9aa28e 100644
--- a/hadoop-hdds/docs/content/feature/DiskBalancer.md
+++ b/hadoop-hdds/docs/content/feature/DiskBalancer.md
@@ -48,6 +48,74 @@ The Disk Balancer feature is introduced with a feature flag. By default, this fe
 The feature can be **enabled** by setting the following property to `true` in the `ozone-site.xml` configuration file:
 `hdds.datanode.disk.balancer.enabled = false`
 
+### Authentication and Authorization
+
+DiskBalancer commands communicate directly with datanodes via RPC, requiring proper authentication and authorization configuration.
+
+#### Authentication Configuration
+
+In secure clusters with Kerberos enabled, the datanode must have its Kerberos principal configured for RPC authentication in `ozone-site.xml`:
+
+```xml
+<property>
+  <name>hdds.datanode.kerberos.principal</name>
+  <value>dn/[email protected]</value>
+  <description>
+    The Datanode service principal. This is typically set to
+    dn/_HOST@REALM.TLD. Each Datanode will substitute _HOST with its
+    own fully qualified hostname at startup. The _HOST placeholder
+    allows using the same configuration setting on all Datanodes.
+  </description>
+</property>
+```
+
+**Note**: Without this configuration, DiskBalancer commands will fail with authentication errors in secure clusters.
+The client uses this principal to verify the datanode's identity when establishing RPC connections.
+
+#### Authorization Configuration
+
+Each datanode performs authorization checks using `OzoneAdmins` based on the `ozone.administrators` configuration:
+- **Admin operations** (start, stop, update): Require the user to be in `ozone.administrators` or belong to a group in `ozone.administrators.groups`
+- **Read-only operations** (status, report): Do not require admin privileges - any authenticated user can query status and reports
+
+#### Default Behavior
+
+By default, if `ozone.administrators` is not configured, only the user who launched the datanode service can start, stop,
+or update DiskBalancer. This means that in a typical deployment where the datanode runs as user `dn`, only that user has
+admin privileges for DiskBalancer operations.
+
+#### Enabling Authorization for Additional Users
+
+To allow other users to perform DiskBalancer admin operations (start, stop, update), configure the `ozone.administrators` property in `ozone-site.xml`:
+
+**Example 1: Single user**
+```xml
+<property>
+  <name>ozone.administrators</name>
+  <value>scm</value>
+</property>
+```
+
+**Example 2: Multiple users**
+```xml
+<property>
+  <name>ozone.administrators</name>
+  <value>scm,hdfs</value>
+</property>
+```
+
+**Example 3: Using groups**
+```xml
+<property>
+  <name>ozone.administrators.groups</name>
+  <value>ozone-admins,cluster-operators</value>
+</property>
+```
+
+**Note**: `ozone-admins` and `cluster-operators` are example group names. Replace them with actual
+group names from your environment. After updating the `ozone.administrators` configuration,
+restart the datanode service for the changes to take effect.
+
 ## Command Line Usage
 The DiskBalancer is managed through the `ozone admin datanode diskbalancer` command.
 
@@ -162,14 +230,6 @@ ozone admin datanode diskbalancer report --in-service-datanodes
 ozone admin datanode diskbalancer report --in-service-datanodes --json
 ```
 
-### Authentication and Authorization
-
-* **Authentication**: RPC authentication is required (e.g., via `kinit` in secure clusters). The client's identity is verified by the datanode's RPC layer.
-
-* **Authorization**: Each datanode performs authorization checks using `OzoneAdmins` based on the `ozone.administrators` configuration:
-  - **Admin operations** (start, stop, update): Require the user to be in `ozone.administrators`
-  - **Read-only operations** (status, report): Do not require admin privileges
-
 ## **DiskBalancer Configurations**
 
 The DiskBalancer's behavior can be controlled using the following configuration properties in `ozone-site.xml`.
diff --git a/hadoop-hdds/docs/content/feature/DiskBalancer.zh.md b/hadoop-hdds/docs/content/feature/DiskBalancer.zh.md
index 106ba1de6a9..ab83c6e7538 100644
--- a/hadoop-hdds/docs/content/feature/DiskBalancer.zh.md
+++ b/hadoop-hdds/docs/content/feature/DiskBalancer.zh.md
@@ -44,6 +44,74 @@ summary: 数据节点的磁盘平衡器.
 可以通过在“ozone-site.xml”配置文件中将以下属性设置为“true”来**启用**该功能:
 `hdds.datanode.disk.balancer.enabled = false`
 
+### 身份验证和授权
+
+DiskBalancer 命令通过 RPC 直接与数据节点通信,因此需要进行正确的身份验证和授权配置。
+
+#### 身份验证配置
+
+在启用了 Kerberos 的安全集群中,必须在 `ozone-site.xml` 文件中配置数据节点的 Kerberos 主体以进行 RPC 身份验证:
+
+```xml
+<property>
+<name>hdds.datanode.kerberos.principal</name>
+<value>dn/[email protected]</value>
+<description>
+  The Datanode service principal. This is typically set to
+  dn/_HOST@REALM.TLD. Each Datanode will substitute _HOST with its
+  own fully qualified hostname at startup. The _HOST placeholder
+  allows using the same configuration setting on all Datanodes.
+</description>
+
+</property>
+```
+
+**注意**:如果没有此配置,DiskBalancer 命令在安全集群中将因身份验证错误而失败。 客户端使用此主体在建立 RPC 连接时验证数据节点的身份。
+
+#### 授权配置
+
+每个数据节点都使用 `OzoneAdmins` 根据 `ozone.administrators` 配置执行授权检查:
+
+- **管理员操作**(启动、停止、更新):要求用户位于 `ozone.administrators` 成员列表中,或属于 `ozone.administrators.groups` 中的某个组。
+
+- **只读操作**(状态、报告):不需要管理员权限 - 任何已认证的用户都可以查询状态和报告。
+
+#### 默认行为
+
+默认情况下,如果未配置 `ozone.administrators`,则只有启动数据节点服务的用户才能启动、停止或更新 DiskBalancer。
+
+这意味着在典型的部署中,如果数据节点以用户 `dn` 的身份运行,则只有该用户拥有 DiskBalancer 操作的 管理员权限。
+
+#### 为其他用户启用身份验证
+
+要允许其他用户执行 DiskBalancer 管理操作(启动、停止、更新),请在 `ozone-site.xml` 文件中配置 `ozone.administrators` 属性:
+
+**Example 1: Single user**
+```xml
+<property>
+  <name>ozone.administrators</name>
+  <value>scm</value>
+</property>
+```
+
+**Example 2: Multiple users**
+```xml
+<property>
+  <name>ozone.administrators</name>
+  <value>scm,hdfs</value>
+</property>
+```
+
+**Example 3: Using groups**
+```xml
+<property>
+  <name>ozone.administrators.groups</name>
+  <value>ozone-admins,cluster-operators</value>
+</property>
+```
+**注意**:`ozone-admins` 和 `cluster-operators` 是示例组名称。请将其替换为您环境中的实际组名称。 更新 `ozone.administrators` 配置后,
+请重启数据节点服务以使更改生效。
+
 ## 命令行用法
 DiskBalancer 通过 `ozone admin datanode diskbalancer` 命令进行管理。
 
@@ -157,13 +225,6 @@ ozone admin datanode diskbalancer report --in-service-datanodes
 # 以 JSON 格式获取报告
 ozone admin datanode diskbalancer report --in-service-datanodes --json
 ```
-### 身份验证和授权
-
-* **身份验证**:需要 RPC 身份验证(例如,在安全集群中通过 `kinit`)。客户端的身份由数据节点的 RPC 层验证。
-
-* **授权**:每个数据节点都使用 `OzoneAdmins` 根据 `ozone.administrators` 配置执行授权检查:
-- **管理操作**(启动、停止、更新):要求用户位于 `ozone.administrators` 成员中
-- **只读操作**(状态、报告):不需要管理员权限
 
 ## DiskBalancer Configurations
 
diff --git a/hadoop-ozone/cli-admin/src/main/java/org/apache/hadoop/hdds/scm/cli/datanode/DiskBalancerStatusSubcommand.java b/hadoop-ozone/cli-admin/src/main/java/org/apache/hadoop/hdds/scm/cli/datanode/DiskBalancerStatusSubcommand.java
index 7ef6c001043..1bcc62e512e 100644
--- a/hadoop-ozone/cli-admin/src/main/java/org/apache/hadoop/hdds/scm/cli/datanode/DiskBalancerStatusSubcommand.java
+++ b/hadoop-ozone/cli-admin/src/main/java/org/apache/hadoop/hdds/scm/cli/datanode/DiskBalancerStatusSubcommand.java
@@ -91,7 +91,7 @@ protected void displayResults(List<String> successNodes, List<String> failedNode
 
   private String generateStatus(List<DatanodeDiskBalancerInfoProto> protos) {
     StringBuilder formatBuilder = new StringBuilder("Status result:%n" +
-        "%-35s %-15s %-15s %-15s %-12s %-12s %-12s %-15s %-15s %-15s%n");
+        "%-35s %-15s %-15s %-15s %-12s %-20s %-12s %-12s %-15s %-18s %-20s%n");
 
     List<String> contentList = new ArrayList<>();
     contentList.add("Datanode");
@@ -99,6 +99,7 @@ private String generateStatus(List<DatanodeDiskBalancerInfoProto> protos) {
     contentList.add("Threshold(%)");
     contentList.add("BandwidthInMB");
     contentList.add("Threads");
+    contentList.add("StopAfterDiskEven");
     contentList.add("SuccessMove");
     contentList.add("FailureMove");
     contentList.add("BytesMoved(MB)");
@@ -106,7 +107,7 @@ private String generateStatus(List<DatanodeDiskBalancerInfoProto> protos) {
     contentList.add("EstTimeLeft(min)");
 
     for (HddsProtos.DatanodeDiskBalancerInfoProto proto : protos) {
-      formatBuilder.append("%-35s %-15s %-15s %-15s %-12s %-12s %-12s %-15s %-15s %-15s%n");
+      formatBuilder.append("%-35s %-15s %-15s %-15s %-12s %-20s %-12s %-12s %-15s %-18s %-20s%n");
       long estimatedTimeLeft = calculateEstimatedTimeLeft(proto);
       long bytesMovedMB = (long) Math.ceil(proto.getBytesMoved() / (1024.0 * 1024.0));
       long bytesToMoveMB = (long) Math.ceil(proto.getBytesToMove() / (1024.0 * 1024.0));
@@ -119,6 +120,8 @@ private String generateStatus(List<DatanodeDiskBalancerInfoProto> protos) {
           String.valueOf(proto.getDiskBalancerConf().getDiskBandwidthInMB()));
       contentList.add(
           String.valueOf(proto.getDiskBalancerConf().getParallelThread()));
+      contentList.add(
+          String.valueOf(proto.getDiskBalancerConf().getStopAfterDiskEven()));
       contentList.add(String.valueOf(proto.getSuccessMoveCount()));
       contentList.add(String.valueOf(proto.getFailureMoveCount()));
       contentList.add(String.valueOf(bytesMovedMB));
@@ -153,6 +156,7 @@ private Map<String, Object> createStatusResult(DatanodeDiskBalancerInfoProto sta
     result.put("threshold", status.getDiskBalancerConf().getThreshold());
     result.put("bandwidthInMB", status.getDiskBalancerConf().getDiskBandwidthInMB());
     result.put("threads", status.getDiskBalancerConf().getParallelThread());
+    result.put("stopAfterDiskEven", status.getDiskBalancerConf().getStopAfterDiskEven());
     result.put("successMove", status.getSuccessMoveCount());
     result.put("failureMove", status.getFailureMoveCount());
     result.put("bytesMovedMB", (long) Math.ceil(status.getBytesMoved() / (1024.0 * 1024.0)));
diff --git a/hadoop-ozone/cli-admin/src/test/java/org/apache/hadoop/hdds/scm/cli/datanode/TestDiskBalancerSubCommands.java b/hadoop-ozone/cli-admin/src/test/java/org/apache/hadoop/hdds/scm/cli/datanode/TestDiskBalancerSubCommands.java
index 18cd51a48e2..9f95fd3f117 100644
--- a/hadoop-ozone/cli-admin/src/test/java/org/apache/hadoop/hdds/scm/cli/datanode/TestDiskBalancerSubCommands.java
+++ b/hadoop-ozone/cli-admin/src/test/java/org/apache/hadoop/hdds/scm/cli/datanode/TestDiskBalancerSubCommands.java
@@ -442,6 +442,7 @@ public void testStatusDiskBalancerWithJson() throws Exception {
       assertTrue(output.contains("\"threshold\""));
       assertTrue(output.contains("\"bandwidthInMB\""));
       assertTrue(output.contains("\"threads\""));
+      assertTrue(output.contains("\"stopAfterDiskEven\""));
     }
   }
 
@@ -648,6 +649,7 @@ private DatanodeDiskBalancerInfoProto createStatusProto(String hostname,
         .setThreshold(threshold)
         .setDiskBandwidthInMB(bandwidthInMB)
         .setParallelThread(parallelThread)
+        .setStopAfterDiskEven(true)
         .build();
 
     return DatanodeDiskBalancerInfoProto.newBuilder()

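Note on the `generateStatus` change above: the patch widens the row format from 10 to 11 `%-Ns` specifiers and adds one `StopAfterDiskEven` cell per row, so the format string and the flat cell list must grow together. The following is a minimal standalone sketch of that pattern, not the project's code. The class name `StatusTableSketch` and the sample row are hypothetical, and the `Status` and `EstBytesToMove(MB)` header names are assumptions, since those two `contentList.add` calls fall outside the hunks shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StatusTableSketch {
  // Same 11-column row format as the patched generateStatus().
  static final String ROW =
      "%-35s %-15s %-15s %-15s %-12s %-20s %-12s %-12s %-15s %-18s %-20s%n";

  // "Status" and "EstBytesToMove(MB)" are assumed; the rest appear in the diff.
  static final List<String> HEADERS = Arrays.asList(
      "Datanode", "Status", "Threshold(%)", "BandwidthInMB", "Threads",
      "StopAfterDiskEven", "SuccessMove", "FailureMove", "BytesMoved(MB)",
      "EstBytesToMove(MB)", "EstTimeLeft(min)");

  // Builds one format string (title + header row + one ROW per data row)
  // and a flat cell list, then formats everything in a single call.
  static String render(List<List<String>> rows) {
    StringBuilder format = new StringBuilder("Status result:%n").append(ROW);
    List<String> cells = new ArrayList<>(HEADERS);
    for (List<String> row : rows) {
      if (row.size() != HEADERS.size()) {
        // A cell-count mismatch would otherwise surface as a
        // MissingFormatArgumentException or as silently shifted columns.
        throw new IllegalArgumentException(
            "expected " + HEADERS.size() + " cells, got " + row.size());
      }
      format.append(ROW);
      cells.addAll(row);
    }
    return String.format(format.toString(), cells.toArray());
  }

  public static void main(String[] args) {
    // Hypothetical sample values for a single datanode.
    List<String> row = Arrays.asList("dn-1.example.com", "RUNNING", "10.0",
        "10", "1", "true", "4", "0", "512", "1024", "15");
    System.out.print(render(Arrays.asList(row)));
  }
}
```

The explicit size check is a design choice of this sketch: it makes a forgotten cell fail loudly instead of misaligning every column to its right.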

