jojochuang commented on code in PR #276:
URL: https://github.com/apache/ozone-site/pull/276#discussion_r2733761311

##########
blog/2025-12-09-disk-balancer-preview.md:
##########
@@ -0,0 +1,68 @@
+---
+title: "Disk Balancer in Apache Ozone: A Preview"
+authors: ["apache-ozone-community"]
+date: 2025-12-09
+tags: [Ozone, Disk Balancer, Ozone 2.2, Datanode]
+---
+
+**Disk Balancer** — a lightweight, automatic way to keep disk usage within each Datanode evenly distributed.

Review Comment:
   Replace with a better hook.

   ```suggestion
   Ever replaced a drive on a Datanode only to watch it become an I/O hotspot? Or seen one disk hit 95% usage while others on the same machine sit idle? These imbalances create performance bottlenecks and increase failure risk. Apache Ozone's new intra-node Disk Balancer is designed to fix this—automatically.
   ```

##########
blog/2025-12-09-disk-balancer-preview.md:
##########
@@ -0,0 +1,68 @@
+---
+title: "Disk Balancer in Apache Ozone: A Preview"
+authors: ["apache-ozone-community"]
+date: 2025-12-09
+tags: [Ozone, Disk Balancer, Ozone 2.2, Datanode]
+---
+
+**Disk Balancer** — a lightweight, automatic way to keep disk usage within each Datanode evenly distributed.
+
+<!-- truncate -->
+
+Cluster-wide balancing in Ozone already ensures replicas are evenly spread across Datanodes. But inside a single Datanode, disks can still drift out of balance over time — for example after adding new disks, replacing hardware, or performing large deletions. This leads to I/O hotspots and uneven wear.
+
+Disk Balancer closes that gap.
+
+## Why Disk Balancer?
+
+- **Disks fill unevenly** when nodes gain or lose volumes.
+
+- **Large deletes** can empty some disks disproportionately.
+
+- **Hot disks degrade performance** and become failure risks.
+
+Even if the cluster is balanced, the node itself may not be. Disk Balancer fixes this automatically.
+
+## How it works
+
+The design ([HDDS-5713](https://issues.apache.org/jira/browse/HDDS-5713)) introduces a simple metric: **Volume Data Density** — how much a disk's utilization deviates from the node's average. If the deviation exceeds a threshold, the node begins balancing.
+
+Balancing is local and safe:
+
+- Only **closed containers** are moved.
+- Moves happen entirely **within the same Datanode.**
+- A scheduler periodically checks for imbalance and dispatches copy-and-import tasks.
+- Bandwidth and concurrency are **operator-tunable** to avoid interfering with production I/O.
+
+This runs independently on each Datanode; SCM just receives reports.
+
+## Using Disk Balancer
+
+CLI examples:
+
+```bash
+# Start on all datanodes
+ozone admin datanode diskbalancer start -a
+
+# Check status
+ozone admin datanode diskbalancer status
+```
+
+Key settings include the density threshold, per-task throughput cap, parallel thread count, and whether to auto-stop once balanced.
+
+## Benefits for operators
+
+- **Even I/O load** across disks → more stable performance.
+- **Smooth ops after hardware changes** (new or replaced disks).
+- **Hands-off balancing** once enabled.
+- **Clear metrics** for observability and troubleshooting.
+
+It complements the existing Container Balancer: one works across nodes, the other within nodes.
+
+## Closing Thoughts
+
+Disk Balancer is small but impactful. It brings Ozone closer to being a fully self-healing, self-balancing object store — reducing hotspots, simplifying maintenance, and improving cluster longevity.
+
+Ozone 2.2 will ship with this feature enabled via simple CLI controls and safe defaults. If you run long-lived clusters, this is a feature to watch.

Review Comment:
   ```suggestion
   Ozone 2.2 will ship with this feature available via simple CLI controls and safe defaults. If you run long-lived clusters, this is a feature to watch.
   ```

##########
blog/2025-12-09-disk-balancer-preview.md:
##########
@@ -0,0 +1,68 @@
+---
+title: "Disk Balancer in Apache Ozone: A Preview"

Review Comment:
   Let's work on the title a bit more to make it viral.

   ```suggestion
   title: "No More Hotspots: Introducing the Automatic Disk Balancer in Apache Ozone"
   ```

   ```suggestion
   title: "Self-Healing Storage: A First Look at Ozone's Datanode Disk Balancer"
   ```

   ```suggestion
   title: "Your Ozone Datanodes Just Got Smarter: Introducing the Automatic Disk Balancer in Apache Ozone"
   ```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
