Re: [PR] HDDS-14314. [Website v2] Blog: Apache Ozone Best Practices at Didi [ozone-site]

via GitHub Fri, 30 Jan 2026 01:22:31 -0800


sarvekshayr commented on code in PR #306:
URL: https://github.com/apache/ozone-site/pull/306#discussion_r2745362833



##########
blog/2026-01-30-apache-ozone-best-practices-at-didi.md:
##########
@@ -0,0 +1,94 @@
+---
+title: "Apache Ozone Best Practices at Didi: Scaling to Tens of Billions of Files"
+date: 2026-01-30
+authors: ["rich7420", "jojochuang", "apache-ozone-community"]
+tags: [user-stories, performance, erasure-coding, scale]
+---
+
+Guest post by the Didi Engineering Team. For the full story with detailed slides, see [Apache Ozone Best Practices at Didi (PDF)](https://ozone.apache.org/assets/ApacheOzoneBestPracticesAtDidi.pdf).
+
+As Didi's volume of unstructured data surged into the hundreds of petabytes, comprising tens of billions of files, their traditional storage architecture faced severe scalability bottlenecks. This post summarizes how they migrated from HDFS to Apache Ozone, the optimizations they implemented for high-performance reads, and their journey in contributing these improvements back to the community.
+
+<!-- truncate -->
+
+## The Challenge: HDFS at Scale
+
+Like many data-driven enterprises, Didi relied heavily on HDFS. However, as their data scale grew, they hit the classic "NameNode Limit."
+
+- **Metadata Pressure:** Storing hundreds of millions of files put immense pressure on the HDFS NameNode memory.
+- **Block Reporting Storms:** With massive file counts, block reporting became a significant overhead.
+- **Scalability Ceiling:** They needed a solution that could handle tens of billions of files without partitioning their clusters into unmanageable silos.
+
+## Why Ozone?
+
+They chose Apache Ozone as their next-generation storage engine because it addresses these limitations architecturally:
+
+- **Decoupled Metadata:** By separating the Ozone Manager (OM) for namespace and Storage Container Manager (SCM) for block management, Ozone scales significantly better than HDFS.
+- **RocksDB-based Metadata:** Unlike HDFS, which relies entirely on heap memory, Ozone stores metadata in RocksDB, removing the memory bottleneck.
+- **Container Logic:** Managing data in "containers" rather than individual blocks reduces the reporting overhead on the SCM.
+
+Today, Ozone has been running in production at Didi for over two years, managing hundreds of PB of storage.
+
+Figure 1: Ozone Cluster Scale at Didi
+
+## Architecture & Key Optimizations
+
+Migrating was just the first step. To meet Didi's strict latency requirements (especially for "first-frame" read access), they engineered several critical optimizations.
+
+### 1. Multi-Cluster Routing with ViewFs
+
+To manage the sheer volume of data, they utilized a client-side routing mechanism inspired by HDFS ViewFs. By mapping paths to specific clusters (e.g., `vol/bucket/prefix1` → cluster1), they effectively balanced the load and kept the file count in each cluster under 5 billion, alleviating RPC pressure on individual Ozone Managers.
+
+### 2. Boosting Read Performance: S3G Follower Reads
+
+They observed that the Leader OM often became a bottleneck for S3 Gateway (S3G) requests. To solve this, they implemented a Follower Read strategy.
+
+They introduced a "probe task" in the client (e.g. every 3 seconds) that evaluates:
+
+- **Latency:** Selects the OM node with the lowest response time.
+- **Freshness:** Checks the lastAppliedIndex to ensure the Follower isn't serving stale data.
+
+**Result:** The P90 latency for S3G metadata requests (GetMetaLatency) dropped from a weekly average of ~90ms to ~17ms; in best cases, from tens of milliseconds to under 3ms.
+
+Figure 2: Significant drop in S3G latency after enabling Follower Reads

Review Comment:
   The figure is missing here as well.



##########
cspell.yaml:
##########
@@ -199,10 +199,13 @@ words:
 # Apache Ozone community member names
 - Sumit
 # Company names for "Who Uses Ozone" page
+- Didi
 - Shopee
 - Qihoo360
 - Meituan
 - Unicom
+- LRU
+- SPDK

Review Comment:
   These two (`LRU` and `SPDK`) are better placed under `Other systems' words`.



##########
blog/2026-01-30-apache-ozone-best-practices-at-didi.md:
##########
@@ -0,0 +1,94 @@
+---
+title: "Apache Ozone Best Practices at Didi: Scaling to Tens of Billions of Files"
+date: 2026-01-30
+authors: ["rich7420", "jojochuang", "apache-ozone-community"]
+tags: [user-stories, performance, erasure-coding, scale]
+---
+
+Guest post by the Didi Engineering Team. For the full story with detailed slides, see [Apache Ozone Best Practices at Didi (PDF)](https://ozone.apache.org/assets/ApacheOzoneBestPracticesAtDidi.pdf).
+
+As Didi's volume of unstructured data surged into the hundreds of petabytes, comprising tens of billions of files, their traditional storage architecture faced severe scalability bottlenecks. This post summarizes how they migrated from HDFS to Apache Ozone, the optimizations they implemented for high-performance reads, and their journey in contributing these improvements back to the community.
+
+<!-- truncate -->
+
+## The Challenge: HDFS at Scale
+
+Like many data-driven enterprises, Didi relied heavily on HDFS. However, as their data scale grew, they hit the classic "NameNode Limit."
+
+- **Metadata Pressure:** Storing hundreds of millions of files put immense pressure on the HDFS NameNode memory.
+- **Block Reporting Storms:** With massive file counts, block reporting became a significant overhead.
+- **Scalability Ceiling:** They needed a solution that could handle tens of billions of files without partitioning their clusters into unmanageable silos.
+
+## Why Ozone?
+
+They chose Apache Ozone as their next-generation storage engine because it addresses these limitations architecturally:
+
+- **Decoupled Metadata:** By separating the Ozone Manager (OM) for namespace and Storage Container Manager (SCM) for block management, Ozone scales significantly better than HDFS.
+- **RocksDB-based Metadata:** Unlike HDFS, which relies entirely on heap memory, Ozone stores metadata in RocksDB, removing the memory bottleneck.
+- **Container Logic:** Managing data in "containers" rather than individual blocks reduces the reporting overhead on the SCM.
+
+Today, Ozone has been running in production at Didi for over two years, managing hundreds of PB of storage.
+
+Figure 1: Ozone Cluster Scale at Didi

Review Comment:
   The figures are missing from the doc. Please include them.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]