ivandika3 commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r1692365363


##########
hadoop-hdds/docs/content/design/storage-policy.md:
##########
@@ -0,0 +1,397 @@
+---
+title: Ozone Storage Policy Support
+summary: Support storage policies in Ozone so that keys can be written to a 
specified type of storage medium.
+date: 2024-07-25
+jira: HDDS-11233
+status: draft
+---
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# Terminology
+
+## Terminology
+
+- Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
+- Storage Type: The type of a disk or Container replica in a Datanode. Storage 
types include RAM_DISK, SSD, HDD, ARCHIVE, etc.
+- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode.
+- Prefix: In this document, unless otherwise specified, a prefix refers to a 
storage policy prefix, not an ACL prefix. A storage policy prefix is used to 
configure the storage policy for keys under the specified prefix.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relationship between Storage Policy, Storage Type, and Storage Tier:
+
+- The storage policy is a property of a key, bucket, or prefix (managed by OM).
+- The storage tier is a property of a Pipeline and Container (managed by SCM).
+- The storage type is a property of a volume and Container replicas (managed 
by DN).
+- Only the storage policy can be modified directly by the user via the ozone 
command.
+
+Example:
+
+For keyA, the storage policy is Hot, its Container 1 is in the SSD tier, and 
Container 1 has three replicas, all of which are of the SSD storage type.
+
+# User Scenarios
+
+- User A needs a bucket that supports high-performance IO, so User A creates a 
bucket with the storage policy set to Hot. Data written by User A to the bucket 
will automatically be distributed across the SSD disks in the cluster.
+- User B needs higher IO performance for the directory/prefix 
/project/metadata, so User B sets the storage policy for the prefix 
/project/metadata to Hot. Subsequently, data written to /project/metadata will 
be automatically distributed across the SSD disks in the cluster.
+- User C has already written key1 to the cluster and requires better IO 
performance. The storage policy for key1 can be set to Hot, and then a 
migration can be triggered to move key1 to the SSD disks.
+- User D uses the command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class XXX` to upload a file to the Ozone SSD tier.
+
+# Current Status
+
+- Ozone currently has some support for tiered storage such as storage type, 
and some parts of this article may already be implemented.
+- Currently, in Ozone, when a key is created, the key's Block can appear on 
any volume of a Datanode. When a key is created, SCM first needs to allocate a 
Block for the key through Pipelines. The Client then writes the Block to the 
corresponding Datanode based on the Pipeline information. In this process, the 
smallest element managed by the SCM Pipeline is the Datanode, and when the 
Datanode creates a Container, the Container may appear on any volume with 
enough remaining space. Under the current architecture, Ozone does not support 
writing data to specific disks.
+
+# Goal Requirements Specification
+
+### **Support for Storage Policy Writing and Management**
+
+- **Writing keys**: Allow keys to be written to specified storage tiers based 
on storage policies.
+- **Policy Management**: Enable setting, unsetting, and inheriting storage 
policies for keys, prefixes, and buckets. If no specific policy is set, the 
policy is inherited from the longest matching prefix or from the bucket.
+
+### **Support for Data Migration Across Different Storage Policies**
+
+- **Data Migration**: Support data migration across different storage policies 
via manual triggers, ensuring data is moved to the appropriate storage tiers.
+
+### **Adaptation of AWS S3 StorageClass**
+
+- **S3 StorageClass Mapping**: Map AWS S3 storage classes to Ozone storage 
policies, supporting related API operations (PutObject, CopyObject, Multipart 
Upload, GetObject, HeadObject, ListObjects).
+
+### **Management and Monitoring Tools**
+
+- **Storage Policy Commands**: Provide tools to view storage policies of 
containers, datanode usage, and pipeline information.
+- **Metrics and Monitoring**: Enable visibility into storage policy 
compliance, container storage types, and space information across different 
storage policies.
+
+### **Future Enhancements**
+
+- **Intelligent Storage Policies**: Plan to support automatic data migration 
based on access frequency, similar to S3 Intelligent-Tiering.
+- **Bucket StorageClass Lifecycle Rules**: Support setting storage policy 
lifecycle rules at the bucket level.
+- **Recon Support**: Enhance Recon to display relevant storage tier 
information.
+
+# Detailed Requirements Specification
+
+## Storage Policy and Storage Types
+
+### Supported Storage Types
+
+- Specify the Storage Type for each volume through configuration. If no 
Storage Type is specified, the default value is DISK.
+- Supported Storage Types: SSD / DISK / ARCHIVE / RAM_DISK
+
+### Supported Storage Policies
+
+Supported storage policies: Hot, Warm, Cold
+
+### Storage Policies Map To Storage Tiers
+
+| Storage Policy | Storage Tier for Write | Fallback Tier for Write |
+| --- | --- | --- |
+| Hot | SSD | DISK |
+| Warm | DISK | none |
+| Cold | ARCHIVE | none |
+
+- **Storage Tier for Write**: The preferred storage tier where data is written 
when a storage policy is specified.
+- **Fallback Tier for Write**: If the specified storage policy cannot be 
satisfied with the preferred storage tier, the SCM will attempt to use this 
fallback tier to meet the policy requirements.
+
+### Storage Tier Map To Storage Type
+
+| Tier | StorageType of Pipeline | One Replication 
+Container Replicas Storage Type | Three replication
+Container Replicas Storage Type | EC
+Container Replicas Storage Type |
+| --- | --- | --- | --- | --- |
+| SSD | SSD | SSD | 3 SSD | n SSD |
+| DISK | DISK | DISK | 3 DISK | n DISK |
+| ARCHIVE | ARCHIVE | ARCHIVE | 3 ARCHIVE | n ARCHIVE |

Review Comment:
   Nit: The table does not seem to be rendered properly.
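   A possible fix, assuming the rendering breaks because the header row spans four lines in the source (GFM tables need the header row on a single line). This is only a sketch of the same table with each header cell joined onto one line; the cell wording is taken from the diff above:

   ```markdown
   | Tier | StorageType of Pipeline | One Replication Container Replicas Storage Type | Three Replication Container Replicas Storage Type | EC Container Replicas Storage Type |
   | --- | --- | --- | --- | --- |
   | SSD | SSD | SSD | 3 SSD | n SSD |
   | DISK | DISK | DISK | 3 DISK | n DISK |
   | ARCHIVE | ARCHIVE | ARCHIVE | 3 ARCHIVE | n ARCHIVE |
   ```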



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

