peterxcli commented on code in PR #224:
URL: https://github.com/apache/ozone-site/pull/224#discussion_r2672693415


##########
docs/03-core-concepts/02-replication/02-write-pipelines.md:
##########
@@ -1,5 +1,298 @@
+---
+sidebar_label: Write Pipelines
+---
+
 # Write Pipelines
 
-**TODO:** File a subtask under [HDDS-9857](https://issues.apache.org/jira/browse/HDDS-9857) and complete this page or section.
+Write pipelines are a fundamental component of Apache Ozone's storage architecture, enabling reliable data storage across distributed nodes. This document provides a comprehensive overview of write pipelines, covering both replication and erasure coding approaches, their architecture, implementation details, and usage patterns.
+
+## What are Write Pipelines?
+
+Write pipelines are groups of Datanodes that work together as a unit to store and replicate data in Ozone. They serve as the foundation for Ozone's data redundancy strategy, providing:
+
+- A coordinated path for write operations across multiple nodes
+- Consistency guarantees for data replication
+- Efficient management of data distribution and storage
+
+The Storage Container Manager (SCM) is responsible for creating and managing write pipelines, selecting appropriate Datanodes based on factors like availability, capacity, and network topology.
+
+## Pipeline Types
+
+Ozone supports different types of write pipelines to accommodate various durability and storage efficiency requirements:
+
+### 1. Ratis Pipelines (Replicated)
+
+Ratis pipelines use the [Apache Ratis](https://ratis.apache.org/) implementation of the Raft consensus protocol for strongly consistent replication.
+
+- **Structure**: Typically consists of three Datanodes (one leader, multiple followers)
+- **Consistency**: Provides strong consistency through synchronous replication
+- **Durability**: Data is fully replicated on all nodes in the pipeline
+- **Use Case**: Default replication strategy for most Ozone deployments
+
+![Ratis Replication Pipeline](../../../static/img/replication/ratis.svg)
+
+#### Ratis Pipeline V1: Async API
+
+The original Ozone replication pipeline (V1) uses the Ratis Async API for data replication across multiple Datanodes:
+
+1. Client buffers data locally until a certain threshold is reached
+2. Data is sent to the leader Datanode in the pipeline
+3. Leader replicates data to follower Datanodes
+4. Once a quorum of Datanodes acknowledge the write, the operation is considered successful
+
+This approach ensures data consistency but has some limitations in terms of network topology awareness and buffer handling efficiency.
+
+#### Ratis Pipeline V2: Streaming API

Review Comment:
   Should we add a reference here to https://www.cloudera.com/blog/technical/ozone-write-pipeline-v2-with-ratis-streaming.html?
   
   Or maybe somewhere else. I just feel this article is so good that it should be placed somewhere on the official site.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

