Re: [PR] HDDS-13578. [Docs] Add pipeline placement policy to Topology Awareness doc [ozone]

via GitHub Mon, 18 Aug 2025 05:36:15 -0700


sodonnel commented on code in PR #8947:
URL: https://github.com/apache/ozone/pull/8947#discussion_r2282275946



##########
hadoop-hdds/docs/content/feature/Topology.md:
##########
@@ -104,78 +107,53 @@ Uses an external script to resolve rack locations for IPs.
 
 **Topology Mapping Best Practices:**
 
-* **Accuracy:** Mappings must be accurate and current.
-* **Static Mapping:** Simpler for small, stable clusters; requires manual 
updates.
-* **Dynamic Mapping:** Flexible for large/dynamic clusters. Script 
performance, correctness, and reliability are vital; ensure it's idempotent and 
handles batch lookups efficiently.
-
-## Pipeline Choosing Policies
-
-Ozone supports several policies for selecting a pipeline when placing 
containers. The policy for Ratis containers is configured by the property 
`hdds.scm.pipeline.choose.policy.impl` for SCM. The policy for EC (Erasure 
Coded) containers is configured by the property 
`hdds.scm.ec.pipeline.choose.policy.impl`. For both, the default value is 
`org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicy`.
-
-These policies help optimize for different goals such as load balancing, 
health, or simplicity:
-
-- **RandomPipelineChoosePolicy** (Default): Selects a pipeline at random from 
the available list, without considering utilization or health. This policy is 
simple and does not optimize for any particular metric.
+*   **Accuracy:** Mappings must be accurate and current.
+*   **Static Mapping:** Simpler for small, stable clusters; requires manual 
updates.
+*   **Dynamic Mapping:** Flexible for large/dynamic clusters. Script 
performance, correctness, and reliability are vital; ensure it's idempotent and 
handles batch lookups efficiently.
 
-- **CapacityPipelineChoosePolicy**: Picks two random pipelines and selects the 
one with lower utilization, favoring pipelines with more available capacity and 
helping to balance the load across the cluster.
+## Placement and Selection Policies
 
-- **RoundRobinPipelineChoosePolicy**: Selects pipelines in a round-robin 
order. This policy is mainly used for debugging and testing, ensuring even 
distribution but not considering health or capacity.
+Ozone uses three distinct types of policies to manage how and where data is 
written.
 
-- **HealthyPipelineChoosePolicy**: Randomly selects pipelines but only returns 
a healthy one. If no healthy pipeline is found, it returns the last tried 
pipeline as a fallback.
+### 1. Pipeline Creation Policy
 
-These policies can be configured to suit different deployment needs and 
workloads.
+This policy selects a set of datanodes to form a new pipeline. Its purpose is 
to ensure new pipelines are internally fault-tolerant by spreading their nodes 
across racks. This is the primary mechanism for topology awareness on the write 
path for open containers.
 
-## Container Placement Policies for Replicated (RATIS) Containers
+The policy is configured by the `ozone.scm.pipeline.placement.impl` property 
in `ozone-site.xml`.
 
-SCM uses a pluggable policy to place additional replicas of *closed* 
RATIS-replicated containers. This is configured using the 
`ozone.scm.container.placement.impl` property in `ozone-site.xml`. Available 
policies are found in the 
`org.apache.hadoop.hdds.scm.container.placement.algorithms` package \[1, 3\].
+*   **`SCMContainerPlacementRackAware` (Default)**
+    *   **Function:** Distributes the datanodes of a pipeline across racks for 
fault tolerance (e.g., for a 3-node pipeline, it aims for at least two racks). 
Similar to HDFS placement. [1]
+    *   **Use Cases:** Production clusters needing rack-level fault tolerance.
+    *   **Limitations:** Designed for single-layer rack topologies (e.g., 
`/rack/node`). Not recommended for multi-layer hierarchies (e.g., 
`/dc/row/rack/node`) as it may not interpret deeper levels correctly. [1]
 
-These policies are applied when SCM needs to re-replicate containers, such as 
during container balancing.
+*   **`SCMContainerPlacementRandom`**
+    *   **Function:** Randomly selects healthy, available DataNodes, ignoring 
rack topology. [1, 4]
+    *   **Use Cases:** Small/dev/test clusters where rack fault tolerance is 
not critical.
 
-### 1. `SCMContainerPlacementRackAware` (Default)
+*   **`SCMContainerPlacementCapacity`**
+    *   **Function:** Selects DataNodes by available capacity (favors lower 
disk utilization) to balance disk usage across the cluster. [5, 6]
+    *   **Use Cases:** Heterogeneous storage clusters or where even disk 
utilization is key.
 
-* **Function:** Distributes replicas across racks for fault tolerance (e.g., 
for 3 replicas, aims for at least two racks). Similar to HDFS placement. \[1]
-* **Use Cases:** Production clusters needing rack-level fault tolerance.
-* **Configuration:**
-    ```xml
-    <property>
-      <name>ozone.scm.container.placement.impl</name>
-      <value>org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware</value>
-    </property>
-    ```
-* **Best Practices:** Requires accurate topology mapping.
-* **Limitations:** Designed for single-layer rack topologies (e.g., 
`/rack/node`). Not recommended for multi-layer hierarchies (e.g., 
`/dc/row/rack/node`) as it may not interpret deeper levels correctly. \[1]
+### 2. Pipeline Selection (Load Balancing) Policy
 
-### 2. `SCMContainerPlacementRandom`
+After a pool of healthy, open, and rack-aware pipelines has been created, this 
policy is used to **select one** of them to handle a client's write request. 
Its purpose is **load balancing**, not topology awareness, as the topology has 
already been handled during pipeline creation.
 
-* **Function:** Randomly selects healthy, available DataNodes meeting basic 
criteria (space, no existing replica), ignoring rack topology. \[1, 4\]
-* **Use Cases:** Small/dev/test clusters, or if rack fault tolerance for 
closed replicas isn't critical.
-* **Configuration:**
-    ```xml
-    <property>
-      <name>ozone.scm.container.placement.impl</name>
-      <value>org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRandom</value>
-    </property>
-    ```
-* **Best Practices:** Not for production needing rack failure resilience.
+The policy is configured by `hdds.scm.pipeline.choose.policy.impl` in 
`ozone-site.xml`.
 
-### 3. `SCMContainerPlacementCapacity`
+*   **`RandomPipelineChoosePolicy` (Default):** Selects a pipeline at random 
from the available list. This policy is simple and distributes load without 
considering other metrics.
+*   **`CapacityPipelineChoosePolicy`:** Picks two random pipelines and selects 
the one with lower utilization, favoring pipelines with more available capacity.
+*   **`RoundRobinPipelineChoosePolicy`:** Selects pipelines in a round-robin 
order. This is mainly for debugging and testing.
+*   **`HealthyPipelineChoosePolicy`:** Randomly selects pipelines but only 
returns a healthy one.
 
-* **Function:** Selects DataNodes by available capacity (favors lower disk 
utilization) to balance disk usage. \[5, 6\]
-* **Use Cases:** Heterogeneous storage clusters or where even disk utilization 
is key.
-* **Configuration:**
-    ```xml
-    <property>
-      <name>ozone.scm.container.placement.impl</name>
-      <value>org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementCapacity</value>
-    </property>
-    ```
-* **Best Practices:** Prevents uneven node filling.
-* **Interaction:** This container placement policy selects datanodes by 
randomly picking two nodes from a pool of healthy, available nodes and then 
choosing the one with lower utilization (more free space). This approach aims 
to distribute containers more evenly across the cluster over time, favoring 
less utilized nodes without overwhelming newly added nodes.
+### 3. Closed Container Replication Policy
 
+This policy is used only when SCM needs to create an **additional replica of a 
closed container**. This happens during re-replication (after a node failure) 
or container balancing. Its scope is narrow compared to the pipeline creation 
and selection policies.
 
+This is configured using the `ozone.scm.container.placement.impl` property in 
`ozone-site.xml`. The available policies are the same as for Pipeline Creation 
(e.g., `SCMContainerPlacementRackAware`, `SCMContainerPlacementRandom`).

Review Comment:
   This is correct, but I am not sure the `SCMContainerPlacement*` policies
   listed under Pipeline Creation are valid for pipeline creation, insofar as
   they have probably never been tested there. It might be best to move that
   list into this section and not mention those policies under Pipeline
   Creation at all.
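
   For illustration only: if the list moves into this section, the
   configuration example shown alongside it could look roughly like the
   fragment below. The property name and the class name are copied from the
   lines this diff removes. This is a sketch of the closed-container setting
   only and says nothing about pipeline creation.

   ```xml
   <!-- Sketch: placement policy for additional replicas of closed containers.
        Both the property name and the value appear in the removed lines of this diff. -->
   <property>
     <name>ozone.scm.container.placement.impl</name>
     <value>org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware</value>
   </property>
   ```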



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

