sreejasahithi commented on code in PR #279:
URL: https://github.com/apache/ozone-site/pull/279#discussion_r2720831623


##########
docs/07-system-internals/04-replication/02-data/02-containers/08-replication.md:
##########
@@ -4,6 +4,155 @@ sidebar_label: Replication
 
 # Container Replication
 
-**TODO:** File a subtask under 
[HDDS-9862](https://issues.apache.org/jira/browse/HDDS-9862) and complete this 
page or section.
+## Overview
 
-Document replication process among Datanodes. Advantages of push replication 
and cases where an EC container can be replicated, like decommissioning.
+Container replication is a critical mechanism in Apache Ozone that ensures 
data availability and durability by copying containers from source Datanodes to 
destination Datanodes. This document provides a comprehensive description of 
the replication process, including the detailed steps involved, advantages of 
push replication, and scenarios where EC (Erasure Coded) containers can be 
replicated.
+
+## Replication Mode
+
+Apache Ozone supports **Push Replication** by **default**, where the source 
Datanode actively pushes the container to the target Datanode. The replication 
mode is controlled by the configuration property `hdds.scm.replication.push` 
(default: `true`). When set to `false`, the system uses pull replication where 
the target Datanode pulls from source Datanodes.
+
+**Push Replication :** `PushReplicator` class handles push replication by:
+
+- Using `OnDemandContainerReplicationSource` to prepare the container
+- Using `GrpcContainerUploader` to upload the container via gRPC stream
+- Streaming the container data directly to the target Datanode
+
+:::note
+
+Both regular container replication and EC container replication respect the 
same `hdds.scm.replication.push` configuration setting. EC container 
replication scenarios (decommissioning, under-replication, maintenance mode, 
mis-replication) will use push mode when the configuration is `true` (default) 
or pull mode when set to `false`.
+
+:::
+
+---
+
+## Detailed Replication Process
+
+The container replication process involves several well-defined steps.
+
+### Step 1: Source Datanode Prepares Container Tarball
+
+The source Datanode creates a tarball containing:
+
+- Container descriptor file (`container.yaml`) with metadata
+- RocksDB metadata files (database files)
+- Container chunk files (actual data)
+- Container checksum file (if exists)
+
+**Compression:** This tarball is not compressed by default. Optional 
compression can be enabled via `hdds.container.replication.compression` 
(values: `NO_COMPRESSION`(default), `GZIP`, `SNAPPY`, `LZ4`, `ZSTD`).
+
+### Step 2: Destination Datanode Receives Tarball
+
+The source Datanode streams the tarball to the destination via gRPC:
+
+- Establishes gRPC stream connection
+- Streams data in chunks via `SendContainerRequest` messages
+- Destination writes chunks to temporary file in the temp dir of the volume: 
`/tmp/container-copy/`
+
+Before receiving, the destination selects a volume, **reserves space (2x 
container size)**, and creates the temporary directory.

Review Comment:
   I think it would be better to explain here why 2x the container size is 
reserved, i.e. space is needed for both the tarball and the extracted container.
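
As a possible illustration for the doc, here is a minimal sketch of the arithmetic. It assumes the reservation is simply the tarball plus the extracted copy, each about one container size (the tarball is uncompressed by default). The function name and the 5 GB figure are hypothetical and are not taken from the Ozone code.

```python
def required_space_bytes(container_size_bytes: int) -> int:
    """Space to reserve on the destination volume (hypothetical helper).

    Assumption: the uncompressed tarball and the extracted container each
    take roughly one container size, so both must fit at the same time.
    """
    tarball = container_size_bytes    # downloaded tarball in the temp dir
    extracted = container_size_bytes  # container data after extraction
    return tarball + extracted


GB = 1024 ** 3
# Example with a hypothetical 5 GB container.
print(required_space_bytes(5 * GB) // GB)  # prints 10
```

A sentence along these lines in the doc would make the 2x factor self-explanatory.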



##########
docs/07-system-internals/04-replication/02-data/02-containers/08-replication.md:
##########
@@ -4,6 +4,155 @@ sidebar_label: Replication
 
 # Container Replication
 
-**TODO:** File a subtask under 
[HDDS-9862](https://issues.apache.org/jira/browse/HDDS-9862) and complete this 
page or section.
+## Overview
 
-Document replication process among Datanodes. Advantages of push replication 
and cases where an EC container can be replicated, like decommissioning.
+Container replication is a critical mechanism in Apache Ozone that ensures 
data availability and durability by copying containers from source Datanodes to 
destination Datanodes. This document provides a comprehensive description of 
the replication process, including the detailed steps involved, advantages of 
push replication, and scenarios where EC (Erasure Coded) containers can be 
replicated.
+
+## Replication Mode
+
+Apache Ozone supports **Push Replication** by **default**, where the source 
Datanode actively pushes the container to the target Datanode. The replication 
mode is controlled by the configuration property `hdds.scm.replication.push` 
(default: `true`). When set to `false`, the system uses pull replication where 
the target Datanode pulls from source Datanodes.
+
+**Push Replication :** `PushReplicator` class handles push replication by:
+
+- Using `OnDemandContainerReplicationSource` to prepare the container
+- Using `GrpcContainerUploader` to upload the container via gRPC stream
+- Streaming the container data directly to the target Datanode
+
+:::note
+
+Both regular container replication and EC container replication respect the 
same `hdds.scm.replication.push` configuration setting. EC container 
replication scenarios (decommissioning, under-replication, maintenance mode, 
mis-replication) will use push mode when the configuration is `true` (default) 
or pull mode when set to `false`.
+
+:::
+
+---
+
+## Detailed Replication Process
+
+The container replication process involves several well-defined steps.
+
+### Step 1: Source Datanode Prepares Container Tarball
+
+The source Datanode creates a tarball containing:
+
+- Container descriptor file (`container.yaml`) with metadata
+- RocksDB metadata files (database files)
+- Container chunk files (actual data)
+- Container checksum file (if exists)
+
+**Compression:** This tarball is not compressed by default. Optional 
compression can be enabled via `hdds.container.replication.compression` 
(values: `NO_COMPRESSION`(default), `GZIP`, `SNAPPY`, `LZ4`, `ZSTD`).
+
+### Step 2: Destination Datanode Receives Tarball
+
+The source Datanode streams the tarball to the destination via gRPC:
+
+- Establishes gRPC stream connection
+- Streams data in chunks via `SendContainerRequest` messages
+- Destination writes chunks to temporary file in the temp dir of the volume: 
`/tmp/container-copy/`

Review Comment:
   I think this should be `<volume-root>/tmp/container-copy/`
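
To make the difference concrete, here is a small sketch of how the path would resolve under the volume root rather than the system `/tmp`. The helper name and the `/data/disk1` volume root are hypothetical examples, not values from the Ozone code or configuration.

```python
from pathlib import PurePosixPath


def container_copy_dir(volume_root: str) -> PurePosixPath:
    """Temp dir for incoming container tarballs (hypothetical helper).

    Assumption: the directory lives at <volume-root>/tmp/container-copy,
    as suggested in the review comment.
    """
    return PurePosixPath(volume_root) / "tmp" / "container-copy"


# Example with a hypothetical volume root.
print(container_copy_dir("/data/disk1"))  # prints /data/disk1/tmp/container-copy
```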



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

