sodonnel commented on code in PR #8947:
URL: https://github.com/apache/ozone/pull/8947#discussion_r2282267728


##########
hadoop-hdds/docs/content/feature/Topology.md:
##########
@@ -104,78 +107,53 @@ Uses an external script to resolve rack locations for IPs.
 
 **Topology Mapping Best Practices:**
 
-* **Accuracy:** Mappings must be accurate and current.
-* **Static Mapping:** Simpler for small, stable clusters; requires manual 
updates.
-* **Dynamic Mapping:** Flexible for large/dynamic clusters. Script 
performance, correctness, and reliability are vital; ensure it's idempotent and 
handles batch lookups efficiently.
-
-## Pipeline Choosing Policies
-
-Ozone supports several policies for selecting a pipeline when placing 
containers. The policy for Ratis containers is configured by the property 
`hdds.scm.pipeline.choose.policy.impl` for SCM. The policy for EC (Erasure 
Coded) containers is configured by the property 
`hdds.scm.ec.pipeline.choose.policy.impl`. For both, the default value is 
`org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicy`.
-
-These policies help optimize for different goals such as load balancing, 
health, or simplicity:
-
-- **RandomPipelineChoosePolicy** (Default): Selects a pipeline at random from 
the available list, without considering utilization or health. This policy is 
simple and does not optimize for any particular metric.
+*   **Accuracy:** Mappings must be accurate and current.
+*   **Static Mapping:** Simpler for small, stable clusters; requires manual 
updates.
+*   **Dynamic Mapping:** Flexible for large/dynamic clusters. Script 
performance, correctness, and reliability are vital; ensure it's idempotent and 
handles batch lookups efficiently.
 
-- **CapacityPipelineChoosePolicy**: Picks two random pipelines and selects the 
one with lower utilization, favoring pipelines with more available capacity and 
helping to balance the load across the cluster.
+## Placement and Selection Policies
 
-- **RoundRobinPipelineChoosePolicy**: Selects pipelines in a round-robin 
order. This policy is mainly used for debugging and testing, ensuring even 
distribution but not considering health or capacity.
+Ozone uses three distinct types of policies to manage how and where data is 
written.
 
-- **HealthyPipelineChoosePolicy**: Randomly selects pipelines but only returns 
a healthy one. If no healthy pipeline is found, it returns the last tried 
pipeline as a fallback.
+### 1. Pipeline Creation Policy
 
-These policies can be configured to suit different deployment needs and 
workloads.
+This policy selects a set of datanodes to form a new pipeline. Its purpose is 
to ensure new pipelines are internally fault-tolerant by spreading their nodes 
across racks. This is the primary mechanism for topology awareness on the write 
path for open containers.
 
-## Container Placement Policies for Replicated (RATIS) Containers
+The policy is configured by the `ozone.scm.pipeline.placement.impl` property 
in `ozone-site.xml`.
 
-SCM uses a pluggable policy to place additional replicas of *closed* 
RATIS-replicated containers. This is configured using the 
`ozone.scm.container.placement.impl` property in `ozone-site.xml`. Available 
policies are found in the 
`org.apache.hadoop.hdds.scm.container.placement.algorithms` package \[1, 3\].
+*   **`SCMContainerPlacementRackAware` (Default)**

Review Comment:
   This is not the default for pipeline creation. Pipeline creation uses the 
class `PipelinePlacementPolicy` by default, which is a completely different 
code path from the policy used by the balancer or replication manager.
   
   The difference is that the pipeline creation policy takes "pipeline count" 
into consideration to attempt to balance the pipelines across the nodes evenly.
   
   The default considers rack awareness (if racks are configured) as well as 
the pipeline count across nodes.
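
   To make the distinction concrete, here is a rough sketch of what "rack 
awareness plus pipeline count" could look like. It is not the actual 
`PipelinePlacementPolicy` code: the class `PipelineCountAwareChooser`, the 
`Node` type, and the two-pass selection are all hypothetical and only 
illustrate the idea of preferring lightly loaded nodes while spreading across 
racks.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is NOT Ozone's PipelinePlacementPolicy.
final class PipelineCountAwareChooser {

  /** Hypothetical view of a datanode: its name, rack, and current pipeline count. */
  static final class Node {
    public final String name;
    public final String rack;
    public final int pipelineCount;

    Node(String name, String rack, int pipelineCount) {
      this.name = name;
      this.rack = rack;
      this.pipelineCount = pipelineCount;
    }
  }

  /**
   * Picks {@code required} nodes, preferring nodes that serve fewer pipelines
   * and, while unused racks remain, nodes on racks that are not yet chosen.
   */
  static List<Node> choose(List<Node> candidates, int required) {
    if (candidates.size() < required) {
      throw new IllegalArgumentException("Not enough candidate nodes");
    }
    // Least-loaded nodes first; List.sort is stable, so ties keep input order.
    List<Node> sorted = new ArrayList<>(candidates);
    sorted.sort(Comparator.comparingInt((Node n) -> n.pipelineCount));

    List<Node> chosen = new ArrayList<>();
    Set<String> usedRacks = new HashSet<>();

    // Pass 1: take the least-loaded node from each rack not used so far.
    for (Node n : sorted) {
      if (chosen.size() == required) {
        break;
      }
      if (usedRacks.add(n.rack)) {
        chosen.add(n);
      }
    }
    // Pass 2: fill any remaining slots with the least-loaded leftover nodes.
    for (Node n : sorted) {
      if (chosen.size() == required) {
        break;
      }
      if (!chosen.contains(n)) {
        chosen.add(n);
      }
    }
    return chosen;
  }
}
```

   With a single rack (no topology configured), pass 1 picks one node and 
pass 2 falls back to choosing purely by pipeline count, which mirrors the "if 
racks are configured" caveat above.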



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

