This is an automated email from the ASF dual-hosted git repository.

weichiu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/ozone.git


The following commit(s) were added to refs/heads/master by this push:
     new ac7a9a90f20 HDDS-14199. [Docs] Explain how EC write pipelines are calculated (#9520)
ac7a9a90f20 is described below

commit ac7a9a90f205c742c711b08c879d0a0c72385d44
Author: Wei-Chiu Chuang <[email protected]>
AuthorDate: Sat Jan 3 05:35:48 2026 -0800

    HDDS-14199. [Docs] Explain how EC write pipelines are calculated (#9520)
---
 hadoop-hdds/docs/content/feature/ErasureCoding.md  | 35 ++++++++++++++++++++++
 .../docs/content/start/ProductionDeployment.md     |  2 +-
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/hadoop-hdds/docs/content/feature/ErasureCoding.md b/hadoop-hdds/docs/content/feature/ErasureCoding.md
index 9e60c3a923a..b7899010a8c 100644
--- a/hadoop-hdds/docs/content/feature/ErasureCoding.md
+++ b/hadoop-hdds/docs/content/feature/ErasureCoding.md
@@ -228,6 +228,41 @@ When using ofs/o3fs, we can pass the EC Replication Config by setting the config
 
 In the case bucket already has default EC Replication Config, there is no need of passing EC Replication Config while creating key.
 
+#### Calculating EC Pipeline Limits
+
+The target number of open EC pipelines SCM aims to maintain is calculated dynamically for each EC replication configuration (e.g., RS-6-3, RS-3-2). The calculation is based on the following two properties, with the final target being the greater of the two resulting values.
+
+*   `ozone.scm.ec.pipeline.minimum`
+    *   **Description**: The guaranteed minimum number of open pipelines to maintain for each EC configuration, regardless of other factors.
+    *   **Default Value**: `5`
+
+*   `ozone.scm.ec.pipeline.per.volume.factor`
+    *   **Description**: A factor used to calculate a target number of pipelines based on the total number of healthy volumes across all datanodes in the cluster.
+    *   **Default Value**: `1.0`
+
+**Calculation Logic:**
+
+SCM first calculates a volume-based target using the formula:
+`(<pipeline.per.volume.factor> * <total healthy volumes>) / <required nodes for EC config>`
+
+The final target number of pipelines is then determined by:
+`max(<volume-based target>, <pipeline.minimum>)`
+
+**Example:**
+
+Consider a cluster with **200 total healthy volumes** across all datanodes and an EC policy of **RS-6-3** (which requires 9 nodes).
+*   `ozone.scm.ec.pipeline.minimum` = **5** (default)
+*   `ozone.scm.ec.pipeline.per.volume.factor` = **1.0** (default)
+
+1.  The volume-based target is: `(1.0 * 200) / 9 ≈ 22.2`, or about 22
+2.  The final target is: `max(22, 5) = 22`
+
+SCM will attempt to create and maintain approximately **22** open RS-6-3 EC pipelines.
+
+**Production Recommendation:**
+
+The default values are a good starting point for most clusters. If you have a very high number of volumes and a write-heavy EC workload, you might consider slightly increasing `ozone.scm.ec.pipeline.per.volume.factor`. Conversely, for read-heavy workloads, the default minimum of 5 pipelines is often sufficient.
+
 ### Enable Intel ISA-L
 
 Intel Intelligent Storage Acceleration Library (ISA-L) is an open-source collection of optimized low-level functions used for
diff --git a/hadoop-hdds/docs/content/start/ProductionDeployment.md b/hadoop-hdds/docs/content/start/ProductionDeployment.md
index ed24a7b2671..e3c16b060e6 100644
--- a/hadoop-hdds/docs/content/start/ProductionDeployment.md
+++ b/hadoop-hdds/docs/content/start/ProductionDeployment.md
@@ -85,5 +85,5 @@ A typical production Ozone cluster includes the following services:
 ### Ozone Configuration
 
 *   **Monitoring**: Install Prometheus and Grafana for monitoring the Ozone cluster. For audit logs, consider using a log ingestion framework such as the ELK Stack (Elasticsearch, Logstash, and Kibana) with FileBeat, or other similar frameworks. Alternatively, you can use Apache Ranger to manage audit logs.
-*   **Pipeline Limits**: Increase the number of allowed write pipelines to better suit your workload by adjusting `ozone.scm.datanode.pipeline.limit` and `ozone.scm.ec.pipeline.minimum`.
+*   **Pipeline Limits**: Increase the number of allowed write pipelines to better suit your workload by adjusting `ozone.scm.datanode.pipeline.limit` (for Ratis) and `ozone.scm.ec.pipeline.minimum` (for EC).
 *   **Heap Sizes**: Configure sufficient heap sizes for Ozone Manager (OM), Storage Container Manager (SCM), Recon, DataNode, S3 Gateway (S3G), and HttpFs services to ensure stability.
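
The calculation documented in the ErasureCoding.md hunk above can be sanity-checked with a small Python sketch. This is not SCM's code: the function name `ec_pipeline_target` and its parameter names are hypothetical, and rounding the volume-based value down to a whole number is an assumption (the documentation only says "approximately").

```python
import math


def ec_pipeline_target(total_healthy_volumes, required_nodes,
                       per_volume_factor=1.0, minimum=5):
    """Estimate the target number of open EC pipelines for one EC config.

    Hypothetical helper mirroring the documented formula:
        max((factor * healthy volumes) / required nodes, minimum)
    The defaults match the documented defaults of
    ozone.scm.ec.pipeline.per.volume.factor (1.0) and
    ozone.scm.ec.pipeline.minimum (5).
    """
    if required_nodes <= 0:
        raise ValueError("required_nodes must be positive")
    # Assumption: the fractional result is rounded down to a whole pipeline.
    volume_based = math.floor(
        per_volume_factor * total_healthy_volumes / required_nodes)
    return max(volume_based, minimum)


# RS-6-3 needs 6 data + 3 parity = 9 nodes; 200 healthy volumes, defaults.
print(ec_pipeline_target(200, 9))   # 22
# Small cluster: the volume-based value (2) is below the minimum, so 5 wins.
print(ec_pipeline_target(20, 9))    # 5
# RS-3-2 needs 5 nodes, so the same cluster gets a larger target.
print(ec_pipeline_target(200, 5))   # 40
```

The second call shows why the minimum matters: on small clusters the volume-based value can fall below `ozone.scm.ec.pipeline.minimum`, and the minimum then determines the target.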


