This is an automated email from the ASF dual-hosted git repository.

prashantpogde pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/ozone.git


The following commit(s) were added to refs/heads/master by this push:
     new 87bf336083 HDDS-7745. [Snapshot] Add feature documentation (#4989)
87bf336083 is described below

commit 87bf336083b1f5d2384953588ce86f120d41498e
Author: prashantpogde <[email protected]>
AuthorDate: Wed Jun 28 19:10:37 2023 -0700

    HDDS-7745. [Snapshot] Add feature documentation (#4989)
---
 hadoop-hdds/docs/content/feature/Snapshot.md | 68 ++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/hadoop-hdds/docs/content/feature/Snapshot.md b/hadoop-hdds/docs/content/feature/Snapshot.md
new file mode 100644
index 0000000000..d0223f12cd
--- /dev/null
+++ b/hadoop-hdds/docs/content/feature/Snapshot.md
@@ -0,0 +1,68 @@
+---
+title: "Ozone Snapshot"
+weight: 1
+menu:
+   main:
+      parent: Features
+summary: Ozone Snapshot
+---
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+## Introduction
+
+The snapshot feature for the Apache Ozone object store allows users to take a 
point-in-time consistent image of a given bucket. The snapshot feature enables 
various use cases, including:
+ * Backup and Restore: Create hourly, daily, weekly, or monthly snapshots for 
backup and recovery when needed.
+ * Archival and Compliance: Take snapshots for compliance purposes and archive 
them as required.
+ * Replication and Disaster Recovery (DR): Snapshots provide frozen immutable 
images of the bucket on the source Ozone cluster. Snapshots can be used for 
replicating these immutable bucket images to remote DR sites.
+ * Incremental Replication: DistCp with SnapshotDiff offers an efficient way 
to incrementally sync up source and destination buckets.
+
+## Snapshot APIs
+
+The snapshot feature is available through the 'ozone fs' and 'ozone sh' CLIs. This 
feature can also be accessed programmatically from the Ozone `ObjectStore` Java 
client. The feature provides the following functionality:
+* Create Snapshot: Create an instantaneous snapshot for a given bucket
+```shell
+ozone sh snapshot create [-hV] <bucket> [<snapshotName>]
+```
+* List Snapshots: List all snapshots of a given bucket
+```shell
+ozone sh snapshot list [-hV] <bucket>
+```
+* Delete Snapshot: Delete a given snapshot of a given bucket
+```shell
+ozone sh snapshot delete [-hV] <bucket> <snapshotName>
+```
+* Snapshot Diff: Given two snapshots, list all the keys that differ 
between them.
+```shell
+ozone sh snapshot diff [-chV] [-p=<pageSize>] [-t=<continuation-token>] <bucket> <fromSnapshot> <toSnapshot>
+```
+
+The SnapshotDiff CLI/API is asynchronous. The first time the API is invoked, OM 
starts a background thread to calculate the SnapshotDiff and returns "Retry" 
with a suggested duration for the retry operation. Once the SnapshotDiff is 
computed, this API returns the diffs in multiple pages. Within each diff 
response, OM also returns a continuation token for the client to continue from 
the last batch of diff results. This API is safe to be called multiple times 
for a given snapshot source and de [...]
+
+## Architecture
+
+The Ozone Snapshot architecture leverages the fact that data blocks, once written, 
remain immutable for their lifetime. These data blocks are reclaimed only when 
the object key metadata that references them is deleted from the Ozone 
namespace. All of this Ozone metadata is stored on the OM nodes in the Ozone 
cluster. When a user takes a snapshot of an Ozone bucket, internally the system 
takes a snapshot of the Ozone metadata in OM nodes. Since Ozone doesn't allow 
updates to datanode blocks, in [...]
+Ozone also provides a SnapshotDiff API. Whenever a user issues a SnapshotDiff 
between two snapshots, it efficiently calculates all the keys that differ 
between these two snapshots and returns a paginated list of the differences.
+
+## Deployment
+----------
+### Cluster and Hardware Configuration
+
+The snapshot feature places additional demands on the cluster in terms of CPU, 
memory and storage. Cluster nodes running Ozone Managers and Ozone Datanodes 
should be configured with extra storage capacity depending on the number of 
active snapshots that the user wants to keep. Ozone snapshots consume an 
incremental amount of space per snapshot. For example, if the active object store has 
100 GB of data (before replication) and a snapshot is taken, then the 100 GB of 
space will be locked in that snapshot.  [...]
+
+Similarly, nodes running Ozone Manager should be configured with extra memory 
depending on how many snapshots are read from concurrently. This also depends 
on how many concurrent SnapshotDiff jobs are expected in the cluster. By 
default, an Ozone Manager allows 10 concurrent SnapshotDiff jobs, 
which can be increased in the configuration.
+
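The asynchronous SnapshotDiff flow described in the "Snapshot APIs" section of the added page (invoke, receive "Retry", invoke again until the diff is ready) could be wrapped in a small polling helper. The following is only a sketch and is not part of the commit. It assumes that an in-progress response contains the word "retry" (case-insensitive), which the page implies but does not specify. The function name `snapshot_diff_wait`, the `OZONE_CMD` variable, the fixed wait time, and the attempt limit are all hypothetical. It does not handle pagination or continuation tokens.

```shell
# Sketch: poll `ozone sh snapshot diff` until it stops asking for a retry.
# Assumption: an in-progress response contains the word "retry"
# (case-insensitive). OZONE_CMD, the wait time and the attempt limit are
# illustrative knobs, not Ozone settings.
OZONE_CMD="${OZONE_CMD:-ozone}"

snapshot_diff_wait() {
  bucket="$1"; from_snap="$2"; to_snap="$3"
  wait_seconds="${4:-5}"; max_attempts="${5:-20}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # The page states that repeating the call for the same snapshot pair is safe.
    output="$("$OZONE_CMD" sh snapshot diff "$bucket" "$from_snap" "$to_snap")" || return 1
    if printf '%s\n' "$output" | grep -qi 'retry'; then
      # Diff is still being computed in the background; wait and ask again.
      sleep "$wait_seconds"
      attempt=$((attempt + 1))
    else
      # No retry hint: treat the output as the first page of the diff.
      printf '%s\n' "$output"
      return 0
    fi
  done
  echo "snapshot diff not ready after $max_attempts attempts" >&2
  return 2
}
```

A call such as `snapshot_diff_wait /vol1/bucket1 snap1 snap2` (hypothetical bucket path and snapshot names) would print the first page once it is available. A fixed wait keeps the sketch simple; a real script would more likely honor the retry duration suggested by OM.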

