Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-05-12 Thread via GitHub


xichen01 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4436643699

   > @ivandika3, @xichen01, as we have 
[HDDS-15239](https://issues.apache.org/jira/browse/HDDS-15239) merged and a new 
branch created. Are we ready to move forward?
   
   Yes, we will continue to submit new PRs after the merge


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-05-12 Thread via GitHub


greenwich commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4435956576

   @ivandika3, @xichen01, as we have HDDS-15239 merged and a new branch 
created. Are we ready to move forward?





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-05-05 Thread via GitHub


greenwich commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4384175779

   > @ivandika3 @greenwich @chungen0126 #10191 Please help to review
   
   Looks good!





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-05-05 Thread via GitHub


xichen01 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4379351270

   @ivandika3 @greenwich @chungen0126 
https://github.com/apache/ozone/pull/10191 Please help to review





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-05-05 Thread via GitHub


xichen01 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4378706326

   The new HDDS-11233 branch https://github.com/apache/ozone/tree/HDDS-11233





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-05-05 Thread via GitHub


xichen01 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4378698452

   > @xichen01 Could you help to take over? I'm currently quite busy and my GH 
account was recently blocked from GH actions (still waiting for support) so no 
CI can be triggered.
   
   OK, I will





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-05-05 Thread via GitHub


ivandika3 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4378415007

   @xichen01 Could you help to take over? My GH account was recently blocked 
from GH actions (still waiting for support) so no CI can be triggered.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-05-03 Thread via GitHub


xichen01 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4366105397

   > @xichen01 How about we cut a new 
[HDDS-11233](https://issues.apache.org/jira/browse/HDDS-11233) branch? The 
current branch can be done without the transition patch 
([HDDS-8342](https://issues.apache.org/jira/browse/HDDS-8342)). We can add the 
transition after [HDDS-8342](https://issues.apache.org/jira/browse/HDDS-8342) 
is merged to master.
   
   I think we can create a new HDDS-11233 branch in the Apache Ozone 
repository, just like https://github.com/apache/ozone/tree/HDDS-8342. Then we 
can merge the related commits into this branch.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-05-02 Thread via GitHub


ivandika3 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4365281668

   @xichen01 How about we cut a new HDDS-11233 branch? The current branch can 
be done without the transition patch 
([HDDS-8342](https://issues.apache.org/jira/browse/HDDS-8342)). We can add the 
transition after HDDS-8342 is merged to master.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-05-02 Thread via GitHub


xichen01 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4364194566

   > @greenwich @xichen01 FYI, test in 
https://github.com/ivandika3/ozone/tree/refs/heads/backport-storage-policy-storage-class
 passed. You can refer to the diff in https://github.com/ivandika3/ozone/pull/4
   
   @ivandika3 We can create corresponding sub tasks for these commits and then 
merge them into the HDDS-8342 branch via MR





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-04-30 Thread via GitHub


ivandika3 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4353095024

   > Regarding Phase 2, just thinking ahead — would it make sense to initially 
target a simpler mover-style implementation (similar in spirit to HDFS Mover) 
before introducing a separate job worker subsystem? That might allow basic 
Storage Policy Migration functionality to be delivered earlier and iterated on 
over time.
   
   @greenwich Yes, we can first implement a standalone StoragePolicySatisfier 
(similar to HDFS Mover or HDFS StoragePolicySatisfier) without needing to 
implement a separate subsystem.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-04-30 Thread via GitHub


ivandika3 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4351185647

   @greenwich @xichen01 FYI, test in 
https://github.com/ivandika3/ozone/tree/refs/heads/backport-storage-policy-storage-class
 passed.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-04-25 Thread via GitHub


ivandika3 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4319936554

   Thanks @greenwich. @xichen01, would you mind taking a look at that?
   
   I'm currently backporting it in my fork, but the tests are still failing 
(https://github.com/ivandika3/ozone/tree/refs/heads/backport-storage-policy-storage-class)





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-04-24 Thread via GitHub


greenwich commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4317800028

   > @greenwich Sorry for the late reply. Let me try to use an AI agent to 
backport the storage policy to my fork.
   
   @ivandika3, I managed to do a similar thing because I wanted to prepare 
everything up front to make things go faster for this PR.
   - I picked the patch attached to this PR (that was branched out from ozone 
1.4)
   - Merged it to sync with master around two weeks ago, fixed all the 
conflicts and broken tests
   - Yesterday I synced upstream's master with that branch again to be up to 
date.
   - Please have a look, maybe it already has what you were planning to do: 
https://github.com/greenwich/ozone/tree/refs/heads/HDDS-11233_patch_merged
   - All tests are green: 
https://github.com/greenwich/ozone/actions/runs/24915662451 
   cc: @xichen01 





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-04-22 Thread via GitHub


ivandika3 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4295042380

   @greenwich Sorry for the late reply. Let me try to use an AI agent to 
backport this to my fork.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-04-13 Thread via GitHub


greenwich commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4234740368

   @ivandika3, have you managed to backport the patch as per the discussion 
here: https://github.com/apache/ozone/pull/6989#issuecomment-2953554337?
   
   Is any help required?





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-29 Thread via GitHub


greenwich commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r3007348520


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,607 @@
+---
+title: Ozone Storage Policy Support
+summary: Support storage policy in Ozone to write key data into specified 
types of storage media.
+date: 2026-03-23
+jira: HDDS-11233
+status: draft
+---
+
+
+
+# Terminology
+
+## Definitions
+
+- Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
+- Storage Type: The type of each Datanode volume or container replica. Each 
Datanode volume can be configured with a
+  storage type, including SSD, DISK, and ARCHIVE.
+- Storage Tier: A specific storage tier is composed of all replicas of a 
container based on their storage type. For
+  example, a 3-replica SSD tier consists of 3 replicas of SSD type.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode.
+- Key: In this document, a key refers to an object in Ozone, including entries 
in both the KeyTable and FileTable.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relationship between Storage Policy, Storage Type, and Storage Tier:
+
+- The storage policy is the property of key/bucket (managed by OM).
+- The storage tier is the property of Pipeline and Container (managed by SCM).
+- The storage type is the property of volume and container replica (managed by 
DN).
+- Only the storage policy can be modified by the user directly via the ozone 
command.
+
+Example:
+
+For a keyA, its storage policy is Hot, Its Container tier is SSD tier, the 
Container has three replicas, all of which
+are of the SSD storage type.
+
+# User Scenarios
+
+- User A needs a bucket that supports high-performance IO, so they create a 
bucket with the storage policy set to Hot.
+  Data written by User A to the bucket will automatically be distributed 
across SSD disks in the cluster.
+- User B needs higher IO performance for a specific key. They write a key with 
the storage policy set to Hot. The
+  key's data will be distributed across SSD disks in the cluster.
+- User C uses the command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class STANDARD` to upload a file
+  to the Ozone SSD tier. The key's data will be distributed across SSD disks 
in the cluster.
+
+# Goals
+
+- Storage Policy: Introduce storage policy and related concepts. Define 
multiple storage policies and support S3
+  storage class.
+- Storage Policy Writing: Allow writing keys/files to specified storage tiers 
based on storage policy. Support S3,
+  API, and shell command interfaces.
+- Storage Policy Update: Enable setting and unsetting storage policies for 
buckets, and setting storage tiers for
+  containers.
+- Storage Policy Display: Support displaying the storage policy attribute of 
buckets and keys. Support displaying the
+  storage tier of SCM containers and pipelines. Support displaying Datanode 
storage type usage information. Support
+  checking whether the key storage policy is satisfied.
+- Container Balancer: Support migrating container replicas between Datanodes 
to volumes of the matching storage type.
+  For example, SSD type container replicas will be migrated to SSD type 
volumes, and will not be migrated to DISK
+  type volumes.
+- ReplicationManager: Support managing the storage type of container replicas 
to ensure that container replicas on
+  Datanodes reside on the correct volumes. Ensure that the storage types of 
container replicas forming a storage
+  tier are correct. For example, a 3-replica SSD storage tier container in SCM 
should consist of 3 SSD type container
+  replicas, and each container replica should reside on an SSD type volume.
+- DiskBalancerService: Support migrating container replicas within a Datanode 
to volumes of the matching storage type.
+  For example, SSD type container replicas will be migrated to SSD type 
volumes, and will not be migrated to DISK
+  type volumes.
+
+# Design
+
+## Supported Storage Policies
+
+- Supported storage policies: Hot / Warm / Cold
+- Supported storage tiers: SSD / DISK / ARCHIVE / EMPTY
+- Supported storage types: SSD / DISK / ARCHIVE
+- Supported bucket layouts: FILE_SYSTEM_OPTIMIZED, OBJECT_STORE, LEGACY
+- S3 storage classes: STANDARD / STANDARD_IA / GLACIER
+
+### Storage Policy Map to Storage Tier
+
+| Storage Policy | Storage Tier for Write | Fallback Tier for Write |
+|----------------|------------------------|-------------------------|
+| Hot            | SSD                    | DISK                    |
+| Warm           | DISK                   | EMPTY                   |
+| Cold           | ARCHIVE                | EMPTY                   |
+
+- Storage Tier for Write: The primary storage tier where data is written when 
a storage policy is specified.
+- Fallbac
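
The policy-to-tier table quoted above can be read as a small lookup with an optional fallback. The sketch below is illustrative only and is not Ozone code: the names `StorageTier`, `POLICY_TO_TIERS`, `choose_write_tier`, and `replicas_form_tier` are hypothetical, and it assumes that `EMPTY` means "no fallback tier" and that the fallback is used only when the primary tier is unavailable (the quoted text is cut off before it defines the fallback behaviour).

```python
from enum import Enum
from typing import Iterable, Optional, Set


class StorageTier(Enum):
    SSD = "SSD"
    DISK = "DISK"
    ARCHIVE = "ARCHIVE"
    EMPTY = "EMPTY"  # assumption: stands for "no fallback tier"


# (tier for write, fallback tier for write), copied from the quoted table.
POLICY_TO_TIERS = {
    "Hot": (StorageTier.SSD, StorageTier.DISK),
    "Warm": (StorageTier.DISK, StorageTier.EMPTY),
    "Cold": (StorageTier.ARCHIVE, StorageTier.EMPTY),
}


def choose_write_tier(policy: str, available: Set[StorageTier]) -> Optional[StorageTier]:
    """Pick the tier to write to, or None if neither tier is available.

    Assumption: the fallback tier is tried only when the primary tier
    is not available; the real selection logic may differ.
    """
    primary, fallback = POLICY_TO_TIERS[policy]
    if primary in available:
        return primary
    if fallback is not StorageTier.EMPTY and fallback in available:
        return fallback
    return None


def replicas_form_tier(replica_types: Iterable[str], tier: StorageTier) -> bool:
    """True if every replica has the storage type that the tier is named after,
    e.g. a 3-replica SSD tier consists of 3 replicas of SSD type."""
    types = list(replica_types)
    return bool(types) and all(t == tier.value for t in types)
```

Under these assumptions, a Hot key goes to SSD when SSD is available, to DISK when only DISK is available, and a Warm key with no DISK available gets no tier at all.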

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-29 Thread via GitHub


greenwich commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r3007344267


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@
+---
+title: Ozone Storage Policy Support
+summary: Support Ozone storage strategy, and support to write key into the 
specified type of storage medium.
+date: 2024-07-25
+jira: HDDS-11233
+status: draft
+---
+
+
+# Terminology
+
+## Terminology
+
+- Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
+- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
+- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
+- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relation of Storage Policy, Storage Type and Storage Tier
+
+- The storage policy is the property of key/bucket/ prefix (Managed by OM);
+- The storage tier is the property of Pipeline and Container (Managed by SCM);
+- The storage type is the property of volume and Container replicas (Managed 
by DN);
+- Only the storage policy can be modified by the user directly via ozone 
command;
+
+Example:
+
+For a keyA, its storage policy is Hot, its Container 1 tier is SSD tier, and 
Container 1 has three replicas, all of which are of the SSD storage type.
+
+# User Scenarios
+
+- User A needs a bucket that supports high-performance IO, so create a bucket 
with the storage policy set to Hot. Data written by User A to bucket will 
automatically be distributed across the SSD disks in the cluster.
+- User B needs higher IO performance for the directory/prefix 
/project/metadata, so set the storage policy for the prefix /project/metadata 
to Hot. Subsequently, data written to /project/metadata will be automatically 
distributed across the SSD disks in the cluster.
+- User C has already written key1 to the cluster and requires better IO 
performance. The storage policy for key1 can be set to Hot, and then a 
migration can be triggered to move key1 to the SSD disks.
+- Use D use command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class XXX` upload a file the Ozone SSD tier
+
+# Current Status
+
+- Ozone currently has some support for tiered storage such as storage type, 
and some parts of this article may already be implemented.
+- Currently, in Ozone, when a key is created, the key's Block can appear on 
any volume of a Datanode. When a key is created, SCM first needs to allocate a 
Block for the key through Pipelines. The Client then writes the Block to the 
corresponding Datanode based on the Pipeline information. In this process, the 
smallest element managed by the SCM Pipeline is the Datanode, and when the 
Datanode creates a Container, the Container may appear on any volume with 
enough remaining space. Under the current architecture, Ozone does not support 
writing data to specific disks
+
+# Goal Requirements Specification
+
+### **Support for Storage Policy Writing and Management**
+
+- **Writing keys**: Allow keys to be written to specified storage tiers based 
on storage policies.
+- **Policy Management**: Enable setting, unsetting, and inheriting storage 
policies for keys, prefixes, and buckets. Inherit policies based on the longest 
matching prefix or bucket if no specific policy is set.
+
+### **Support for Data Migration Across Different Storage Policies**
+
+- **Data Migration**: Support data migration across different storage policies 
via manual triggers, ensuring data is moved to the appropriate storage tiers.
+
+### **Adaptation of AWS S3 StorageClass**
+
+- **S3 StorageClass Mapping**: Map AWS S3 storage classes to Ozone storage 
policies, supporting related API operations (PutObject, CopyObject, Multipart 
Upload, GetObject, HeadObject, ListObjects).
+
+### **Management and Monitoring Tools**
+
+- **Storage Policy Commands**: Provide tools to view storage policies of 
containers, datanode usage, and pipeline information.
+- **Metrics and Monitoring**: Enable visibility into storage policy 
compliance, container storage types, and space information across different 
storage policies.
+
+### **Future Enhancements**
+
+- **Intelligent Storage Policies**: Plan to support automatic data migration 
based on access frequency, similar to S3 Intelligent-Tiering.
+- **Bucket StorageClass Lifecycle Rules: Support setting storage policies 
Lifecycle Rules at the bucket level.**
+- **Recon Support**: Enhance Recon to display 
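
The 2024 draft quoted above says policies are inherited "based on the longest matching prefix or bucket if no specific policy is set". A minimal sketch of that resolution order follows. It is not taken from the PR: `resolve_policy` and its parameters are hypothetical names, and the precedence (key, then longest matching prefix, then bucket) is an assumption drawn from that one sentence.

```python
from typing import Dict, Optional


def resolve_policy(
    key_name: str,
    key_policy: Optional[str],
    prefix_policies: Dict[str, str],
    bucket_policy: Optional[str],
) -> Optional[str]:
    """Return the effective storage policy for a key.

    Assumed precedence: the key's own policy, then the policy of the
    longest prefix that matches the key name, then the bucket policy.
    """
    if key_policy is not None:
        return key_policy
    matches = [prefix for prefix in prefix_policies if key_name.startswith(prefix)]
    if matches:
        # The longest matching prefix is the most specific one.
        return prefix_policies[max(matches, key=len)]
    return bucket_policy
```

For example, with prefix policies `{"project/": "Warm", "project/metadata/": "Hot"}` and a bucket policy of `Cold`, a key named `project/metadata/a` with no policy of its own would resolve to `Hot` under these assumptions.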

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-27 Thread via GitHub


chungen0126 commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2999375115


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -19,379 +20,588 @@ status: draft
 
 # Terminology
 
-## Terminology
+## Definitions
 
 - Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
-- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
-- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
-- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
-- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+- Storage Type: The type of each Datanode volume or container replica. Each 
Datanode volume can be configured with a
+  storage type, including SSD, DISK, and ARCHIVE.
+- Storage Tier: A specific storage tier is composed of all replicas of a 
container based on their storage type. For
+  example, a 3-replica SSD tier consists of 3 replicas of SSD type.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode.
+- Key: In this document, a key refers to an object in Ozone, including entries 
in both the KeyTable and FileTable.
 
 ## Storage Policy vs Storage Type vs Storage Tier
 
 
![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
 
-The relation of Storage Policy, Storage Type and Storage Tier
+The relationship between Storage Policy, Storage Type, and Storage Tier:
 
-- The storage policy is the property of key/bucket/ prefix (Managed by OM);
-- The storage tier is the property of Pipeline and Container (Managed by SCM);
-- The storage type is the property of volume and Container replicas (Managed 
by DN);
-- Only the storage policy can be modified by the user directly via ozone 
command;
+- The storage policy is the property of key/bucket (managed by OM).
+- The storage tier is the property of Pipeline and Container (managed by SCM).
+- The storage type is the property of volume and container replica (managed by 
DN).
+- Only the storage policy can be modified by the user directly via the ozone 
command.
 
 Example:
 
-For a keyA, its storage policy is Hot, its Container 1 tier is SSD tier, and 
Container 1 has three replicas, all of which are of the SSD storage type.
+For a keyA, its storage policy is Hot, Its Container tier is SSD tier, the 
Container has three replicas, all of which
+are of the SSD storage type.
 
 # User Scenarios
 
-- User A needs a bucket that supports high-performance IO, so create a bucket 
with the storage policy set to Hot. Data written by User A to bucket will 
automatically be distributed across the SSD disks in the cluster.
-- User B needs higher IO performance for the directory/prefix 
/project/metadata, so set the storage policy for the prefix /project/metadata 
to Hot. Subsequently, data written to /project/metadata will be automatically 
distributed across the SSD disks in the cluster.
-- User C has already written key1 to the cluster and requires better IO 
performance. The storage policy for key1 can be set to Hot, and then a 
migration can be triggered to move key1 to the SSD disks.
-- Use D use command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class XXX` upload a file the Ozone SSD tier
+- User A needs a bucket that supports high-performance IO, so they create a 
bucket with the storage policy set to Hot.
+  Data written by User A to the bucket will automatically be distributed 
across SSD disks in the cluster.
+- User B needs higher IO performance for a specific key. They write a key with 
the storage policy set to Hot. The
+  key's data will be distributed across SSD disks in the cluster.
+- User C uses the command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class STANDARD` to upload a file
+  to the Ozone SSD tier. The key's data will be distributed across SSD disks 
in the cluster.
+
+# Goals
+
+- Storage Policy: Introduce storage policy and related concepts. Define 
multiple storage policies and support S3
+  storage class.
+- Storage Policy Writing: Allow writing keys/files to specified storage tiers 
based on storage policy. Support S3,
+  API, and shell command interfaces.
+- Storage Policy Update: Enable setting and unsetting storage policies for 
buckets, and setting storage tiers for
+  containers.
+- Storage Policy Display: Support displaying the storage policy attribute of 
buckets and keys. Support displaying the
+  storage tier of SCM containers and pipelines. Support displaying Datanode 
storage type usage information. Support
+  checking whether the key storage policy is satisfied.
+- Co

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-26 Thread via GitHub


xichen01 commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2996045328


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -19,379 +20,588 @@ status: draft
 
 # Terminology
 
-## Terminology
+## Definitions
 
 - Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
-- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
-- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
-- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
-- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+- Storage Type: The type of each Datanode volume or container replica. Each 
Datanode volume can be configured with a
+  storage type, including SSD, DISK, and ARCHIVE.
+- Storage Tier: A specific storage tier is composed of all replicas of a 
container based on their storage type. For
+  example, a 3-replica SSD tier consists of 3 replicas of SSD type.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode.
+- Key: In this document, a key refers to an object in Ozone, including entries 
in both the KeyTable and FileTable.
 
 ## Storage Policy vs Storage Type vs Storage Tier
 
 
![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
 
-The relation of Storage Policy, Storage Type and Storage Tier
+The relationship between Storage Policy, Storage Type, and Storage Tier:
 
-- The storage policy is the property of key/bucket/ prefix (Managed by OM);
-- The storage tier is the property of Pipeline and Container (Managed by SCM);
-- The storage type is the property of volume and Container replicas (Managed 
by DN);
-- Only the storage policy can be modified by the user directly via ozone 
command;
+- The storage policy is the property of key/bucket (managed by OM).
+- The storage tier is the property of Pipeline and Container (managed by SCM).
+- The storage type is the property of volume and container replica (managed by 
DN).
+- Only the storage policy can be modified by the user directly via the ozone 
command.
 
 Example:
 
-For a keyA, its storage policy is Hot, its Container 1 tier is SSD tier, and 
Container 1 has three replicas, all of which are of the SSD storage type.
+For a keyA, its storage policy is Hot, Its Container tier is SSD tier, the 
Container has three replicas, all of which
+are of the SSD storage type.
 
 # User Scenarios
 
-- User A needs a bucket that supports high-performance IO, so create a bucket 
with the storage policy set to Hot. Data written by User A to bucket will 
automatically be distributed across the SSD disks in the cluster.
-- User B needs higher IO performance for the directory/prefix 
/project/metadata, so set the storage policy for the prefix /project/metadata 
to Hot. Subsequently, data written to /project/metadata will be automatically 
distributed across the SSD disks in the cluster.
-- User C has already written key1 to the cluster and requires better IO 
performance. The storage policy for key1 can be set to Hot, and then a 
migration can be triggered to move key1 to the SSD disks.
-- Use D use command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class XXX` upload a file the Ozone SSD tier
+- User A needs a bucket that supports high-performance IO, so they create a 
bucket with the storage policy set to Hot.
+  Data written by User A to the bucket will automatically be distributed 
across SSD disks in the cluster.
+- User B needs higher IO performance for a specific key. They write a key with 
the storage policy set to Hot. The
+  key's data will be distributed across SSD disks in the cluster.
+- User C uses the command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class STANDARD` to upload a file
+  to the Ozone SSD tier. The key's data will be distributed across SSD disks 
in the cluster.
+
+# Goals
+
+- Storage Policy: Introduce storage policy and related concepts. Define 
multiple storage policies and support S3
+  storage class.
+- Storage Policy Writing: Allow writing keys/files to specified storage tiers 
based on storage policy. Support S3,
+  API, and shell command interfaces.
+- Storage Policy Update: Enable setting and unsetting storage policies for 
buckets, and setting storage tiers for
+  containers.
+- Storage Policy Display: Support displaying the storage policy attribute of 
buckets and keys. Support displaying the
+  storage tier of SCM containers and pipelines. Support displaying Datanode 
storage type usage information. Support
+  checking whether the key storage policy is satisfied.
+- Conta

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-26 Thread via GitHub


chungen0126 commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2993398281


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@
+---
+title: Ozone Storage Policy Support
+summary: Support Ozone storage strategy, and support to write key into the 
specified type of storage medium.
+date: 2024-07-25
+jira: HDDS-11233
+status: draft
+---
+
+
+# Terminology
+
+## Terminology
+
+- Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
+- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
+- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
+- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relation of Storage Policy, Storage Type and Storage Tier
+
+- The storage policy is the property of key/bucket/ prefix (Managed by OM);
+- The storage tier is the property of Pipeline and Container (Managed by SCM);
+- The storage type is the property of volume and Container replicas (Managed 
by DN);
+- Only the storage policy can be modified by the user directly via ozone 
command;
+
+Example:
+
+For a keyA, its storage policy is Hot, its Container 1 tier is SSD tier, and 
Container 1 has three replicas, all of which are of the SSD storage type.
+
+# User Scenarios
+
+- User A needs a bucket that supports high-performance IO, so create a bucket 
with the storage policy set to Hot. Data written by User A to bucket will 
automatically be distributed across the SSD disks in the cluster.
+- User B needs higher IO performance for the directory/prefix 
/project/metadata, so set the storage policy for the prefix /project/metadata 
to Hot. Subsequently, data written to /project/metadata will be automatically 
distributed across the SSD disks in the cluster.
+- User C has already written key1 to the cluster and requires better IO 
performance. The storage policy for key1 can be set to Hot, and then a 
migration can be triggered to move key1 to the SSD disks.
+- Use D use command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class XXX` upload a file the Ozone SSD tier
+
+# Current Status
+
+- Ozone currently has some support for tiered storage such as storage type, 
and some parts of this article may already be implemented.
+- Currently, in Ozone, when a key is created, the key's Block can appear on 
any volume of a Datanode. When a key is created, SCM first needs to allocate a 
Block for the key through Pipelines. The Client then writes the Block to the 
corresponding Datanode based on the Pipeline information. In this process, the 
smallest element managed by the SCM Pipeline is the Datanode, and when the 
Datanode creates a Container, the Container may appear on any volume with 
enough remaining space. Under the current architecture, Ozone does not support 
writing data to specific disks
+
+# Goal Requirements Specification
+
+### **Support for Storage Policy Writing and Management**
+
+- **Writing keys**: Allow keys to be written to specified storage tiers based 
on storage policies.
+- **Policy Management**: Enable setting, unsetting, and inheriting storage 
policies for keys, prefixes, and buckets. Inherit policies based on the longest 
matching prefix or bucket if no specific policy is set.
+
+### **Support for Data Migration Across Different Storage Policies**
+
+- **Data Migration**: Support data migration across different storage policies 
via manual triggers, ensuring data is moved to the appropriate storage tiers.
+
+### **Adaptation of AWS S3 StorageClass**
+
+- **S3 StorageClass Mapping**: Map AWS S3 storage classes to Ozone storage 
policies, supporting related API operations (PutObject, CopyObject, Multipart 
Upload, GetObject, HeadObject, ListObjects).
+
+### **Management and Monitoring Tools**
+
+- **Storage Policy Commands**: Provide tools to view storage policies of 
containers, datanode usage, and pipeline information.
+- **Metrics and Monitoring**: Enable visibility into storage policy 
compliance, container storage types, and space information across different 
storage policies.
+
+### **Future Enhancements**
+
+- **Intelligent Storage Policies**: Plan to support automatic data migration 
based on access frequency, similar to S3 Intelligent-Tiering.
+- **Bucket StorageClass Lifecycle Rules: Support setting storage policies 
Lifecycle Rules at the bucket level.**
+- **Recon Support**: Enhance Recon to displa

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-26 Thread via GitHub


chungen0126 commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2993177512


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -19,379 +20,588 @@ status: draft
 
 # Terminology
 
-## Terminology
+## Definitions
 
 - Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
-- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
-- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
-- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
-- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+- Storage Type: The type of each Datanode volume or container replica. Each 
Datanode volume can be configured with a
+  storage type, including SSD, DISK, and ARCHIVE.
+- Storage Tier: A specific storage tier is composed of all replicas of a 
container based on their storage type. For
+  example, a 3-replica SSD tier consists of 3 replicas of SSD type.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode.
+- Key: In this document, a key refers to an object in Ozone, including entries 
in both the KeyTable and FileTable.
 
 ## Storage Policy vs Storage Type vs Storage Tier
 
 
![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
 
-The relation of Storage Policy, Storage Type and Storage Tier
+The relationship between Storage Policy, Storage Type, and Storage Tier:
 
-- The storage policy is the property of key/bucket/ prefix (Managed by OM);
-- The storage tier is the property of Pipeline and Container (Managed by SCM);
-- The storage type is the property of volume and Container replicas (Managed 
by DN);
-- Only the storage policy can be modified by the user directly via ozone 
command;
+- The storage policy is the property of key/bucket (managed by OM).
+- The storage tier is the property of Pipeline and Container (managed by SCM).
+- The storage type is the property of volume and container replica (managed by 
DN).
+- Only the storage policy can be modified by the user directly via the ozone 
command.
 
 Example:
 
-For a keyA, its storage policy is Hot, its Container 1 tier is SSD tier, and 
Container 1 has three replicas, all of which are of the SSD storage type.
+For a keyA, its storage policy is Hot, Its Container tier is SSD tier, the 
Container has three replicas, all of which
+are of the SSD storage type.
 
 # User Scenarios
 
-- User A needs a bucket that supports high-performance IO, so create a bucket 
with the storage policy set to Hot. Data written by User A to bucket will 
automatically be distributed across the SSD disks in the cluster.
-- User B needs higher IO performance for the directory/prefix 
/project/metadata, so set the storage policy for the prefix /project/metadata 
to Hot. Subsequently, data written to /project/metadata will be automatically 
distributed across the SSD disks in the cluster.
-- User C has already written key1 to the cluster and requires better IO 
performance. The storage policy for key1 can be set to Hot, and then a 
migration can be triggered to move key1 to the SSD disks.
-- Use D use command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class XXX` upload a file the Ozone SSD tier
+- User A needs a bucket that supports high-performance IO, so they create a 
bucket with the storage policy set to Hot.
+  Data written by User A to the bucket will automatically be distributed 
across SSD disks in the cluster.
+- User B needs higher IO performance for a specific key. They write a key with 
the storage policy set to Hot. The
+  key's data will be distributed across SSD disks in the cluster.
+- User C uses the command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class STANDARD` to upload a file
+  to the Ozone SSD tier. The key's data will be distributed across SSD disks 
in the cluster.
+
+# Goals
+
+- Storage Policy: Introduce storage policy and related concepts. Define 
multiple storage policies and support S3
+  storage class.
+- Storage Policy Writing: Allow writing keys/files to specified storage tiers 
based on storage policy. Support S3,
+  API, and shell command interfaces.
+- Storage Policy Update: Enable setting and unsetting storage policies for 
buckets, and setting storage tiers for
+  containers.
+- Storage Policy Display: Support displaying the storage policy attribute of 
buckets and keys. Support displaying the
+  storage tier of SCM containers and pipelines. Support displaying Datanode 
storage type usage information. Support
+  checking whether the key storage policy is satisfied.
+- Co

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-23 Thread via GitHub


xichen01 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-4109608481

   @greenwich @chungen0126 @errose28 @vtutrinov The document has been updated. 
Please check it. All the content covered in the current document has been 
implemented in our internal version.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-09 Thread via GitHub


chungen0126 commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2906025795


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@
+---
+title: Ozone Storage Policy Support
+summary: Support Ozone storage strategy, and support to write key into the 
specified type of storage medium.
+date: 2024-07-25
+jira: HDDS-11233
+status: draft
+---
+
+
+# Terminology
+
+## Terminology
+
+- Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
+- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
+- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
+- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relation of Storage Policy, Storage Type and Storage Tier
+
+- The storage policy is the property of key/bucket/ prefix (Managed by OM);
+- The storage tier is the property of Pipeline and Container (Managed by SCM);
+- The storage type is the property of volume and Container replicas (Managed 
by DN);
+- Only the storage policy can be modified by the user directly via ozone 
command;
+
+Example:
+
+For a keyA, its storage policy is Hot, its Container 1 tier is SSD tier, and 
Container 1 has three replicas, all of which are of the SSD storage type.
+
+# User Scenarios
+
+- User A needs a bucket that supports high-performance IO, so create a bucket 
with the storage policy set to Hot. Data written by User A to bucket will 
automatically be distributed across the SSD disks in the cluster.
+- User B needs higher IO performance for the directory/prefix 
/project/metadata, so set the storage policy for the prefix /project/metadata 
to Hot. Subsequently, data written to /project/metadata will be automatically 
distributed across the SSD disks in the cluster.
+- User C has already written key1 to the cluster and requires better IO 
performance. The storage policy for key1 can be set to Hot, and then a 
migration can be triggered to move key1 to the SSD disks.
+- Use D use command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class XXX` upload a file the Ozone SSD tier
+
+# Current Status
+
+- Ozone currently has some support for tiered storage such as storage type, 
and some parts of this article may already be implemented.
+- Currently, in Ozone, when a key is created, the key's Block can appear on 
any volume of a Datanode. When a key is created, SCM first needs to allocate a 
Block for the key through Pipelines. The Client then writes the Block to the 
corresponding Datanode based on the Pipeline information. In this process, the 
smallest element managed by the SCM Pipeline is the Datanode, and when the 
Datanode creates a Container, the Container may appear on any volume with 
enough remaining space. Under the current architecture, Ozone does not support 
writing data to specific disks
+
+# Goal Requirements Specification
+
+### **Support for Storage Policy Writing and Management**
+
+- **Writing keys**: Allow keys to be written to specified storage tiers based 
on storage policies.
+- **Policy Management**: Enable setting, unsetting, and inheriting storage 
policies for keys, prefixes, and buckets. Inherit policies based on the longest 
matching prefix or bucket if no specific policy is set.
+
+### **Support for Data Migration Across Different Storage Policies**
+
+- **Data Migration**: Support data migration across different storage policies 
via manual triggers, ensuring data is moved to the appropriate storage tiers.
+
+### **Adaptation of AWS S3 StorageClass**
+
+- **S3 StorageClass Mapping**: Map AWS S3 storage classes to Ozone storage 
policies, supporting related API operations (PutObject, CopyObject, Multipart 
Upload, GetObject, HeadObject, ListObjects).
+
+### **Management and Monitoring Tools**
+
+- **Storage Policy Commands**: Provide tools to view storage policies of 
containers, datanode usage, and pipeline information.
+- **Metrics and Monitoring**: Enable visibility into storage policy 
compliance, container storage types, and space information across different 
storage policies.
+
+### **Future Enhancements**
+
+- **Intelligent Storage Policies**: Plan to support automatic data migration 
based on access frequency, similar to S3 Intelligent-Tiering.
+- **Bucket StorageClass Lifecycle Rules: Support setting storage policies 
Lifecycle Rules at the bucket level.**
+- **Recon Support**: Enhance Recon to displa

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-02 Thread via GitHub


greenwich commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2875955485


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@
+---
+title: Ozone Storage Policy Support
+summary: Support Ozone storage strategy, and support to write key into the 
specified type of storage medium.
+date: 2024-07-25
+jira: HDDS-11233
+status: draft
+---
+
+
+# Terminology
+
+## Terminology
+
+- Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
+- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
+- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
+- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relation of Storage Policy, Storage Type and Storage Tier
+
+- The storage policy is the property of key/bucket/ prefix (Managed by OM);
+- The storage tier is the property of Pipeline and Container (Managed by SCM);
+- The storage type is the property of volume and Container replicas (Managed 
by DN);
+- Only the storage policy can be modified by the user directly via ozone 
command;
+
+Example:
+
+For a keyA, its storage policy is Hot, its Container 1 tier is SSD tier, and 
Container 1 has three replicas, all of which are of the SSD storage type.
+
+# User Scenarios
+
+- User A needs a bucket that supports high-performance IO, so create a bucket 
with the storage policy set to Hot. Data written by User A to bucket will 
automatically be distributed across the SSD disks in the cluster.
+- User B needs higher IO performance for the directory/prefix 
/project/metadata, so set the storage policy for the prefix /project/metadata 
to Hot. Subsequently, data written to /project/metadata will be automatically 
distributed across the SSD disks in the cluster.
+- User C has already written key1 to the cluster and requires better IO 
performance. The storage policy for key1 can be set to Hot, and then a 
migration can be triggered to move key1 to the SSD disks.
+- Use D use command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class XXX` upload a file the Ozone SSD tier
+
+# Current Status
+
+- Ozone currently has some support for tiered storage such as storage type, 
and some parts of this article may already be implemented.
+- Currently, in Ozone, when a key is created, the key's Block can appear on 
any volume of a Datanode. When a key is created, SCM first needs to allocate a 
Block for the key through Pipelines. The Client then writes the Block to the 
corresponding Datanode based on the Pipeline information. In this process, the 
smallest element managed by the SCM Pipeline is the Datanode, and when the 
Datanode creates a Container, the Container may appear on any volume with 
enough remaining space. Under the current architecture, Ozone does not support 
writing data to specific disks
+
+# Goal Requirements Specification
+
+### **Support for Storage Policy Writing and Management**
+
+- **Writing keys**: Allow keys to be written to specified storage tiers based 
on storage policies.
+- **Policy Management**: Enable setting, unsetting, and inheriting storage 
policies for keys, prefixes, and buckets. Inherit policies based on the longest 
matching prefix or bucket if no specific policy is set.
+
+### **Support for Data Migration Across Different Storage Policies**
+
+- **Data Migration**: Support data migration across different storage policies 
via manual triggers, ensuring data is moved to the appropriate storage tiers.
+
+### **Adaptation of AWS S3 StorageClass**
+
+- **S3 StorageClass Mapping**: Map AWS S3 storage classes to Ozone storage 
policies, supporting related API operations (PutObject, CopyObject, Multipart 
Upload, GetObject, HeadObject, ListObjects).
+
+### **Management and Monitoring Tools**
+
+- **Storage Policy Commands**: Provide tools to view storage policies of 
containers, datanode usage, and pipeline information.
+- **Metrics and Monitoring**: Enable visibility into storage policy 
compliance, container storage types, and space information across different 
storage policies.
+
+### **Future Enhancements**
+
+- **Intelligent Storage Policies**: Plan to support automatic data migration 
based on access frequency, similar to S3 Intelligent-Tiering.
+- **Bucket StorageClass Lifecycle Rules: Support setting storage policies 
Lifecycle Rules at the bucket level.**
+- **Recon Support**: Enhance Recon to display 

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-02 Thread via GitHub


greenwich commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2875959913


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@
+---
+title: Ozone Storage Policy Support
+summary: Support Ozone storage strategy, and support to write key into the 
specified type of storage medium.
+date: 2024-07-25
+jira: HDDS-11233
+status: draft
+---
+
+
+# Terminology
+
+## Terminology
+
+- Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
+- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
+- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
+- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relation of Storage Policy, Storage Type and Storage Tier
+
+- The storage policy is the property of key/bucket/ prefix (Managed by OM);
+- The storage tier is the property of Pipeline and Container (Managed by SCM);
+- The storage type is the property of volume and Container replicas (Managed 
by DN);
+- Only the storage policy can be modified by the user directly via ozone 
command;
+
+Example:
+
+For a keyA, its storage policy is Hot, its Container 1 tier is SSD tier, and 
Container 1 has three replicas, all of which are of the SSD storage type.
+
+# User Scenarios
+
+- User A needs a bucket that supports high-performance IO, so create a bucket 
with the storage policy set to Hot. Data written by User A to bucket will 
automatically be distributed across the SSD disks in the cluster.
+- User B needs higher IO performance for the directory/prefix 
/project/metadata, so set the storage policy for the prefix /project/metadata 
to Hot. Subsequently, data written to /project/metadata will be automatically 
distributed across the SSD disks in the cluster.
+- User C has already written key1 to the cluster and requires better IO 
performance. The storage policy for key1 can be set to Hot, and then a 
migration can be triggered to move key1 to the SSD disks.
+- Use D use command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class XXX` upload a file the Ozone SSD tier
+
+# Current Status
+
+- Ozone currently has some support for tiered storage such as storage type, 
and some parts of this article may already be implemented.
+- Currently, in Ozone, when a key is created, the key's Block can appear on 
any volume of a Datanode. When a key is created, SCM first needs to allocate a 
Block for the key through Pipelines. The Client then writes the Block to the 
corresponding Datanode based on the Pipeline information. In this process, the 
smallest element managed by the SCM Pipeline is the Datanode, and when the 
Datanode creates a Container, the Container may appear on any volume with 
enough remaining space. Under the current architecture, Ozone does not support 
writing data to specific disks
+
+# Goal Requirements Specification
+
+### **Support for Storage Policy Writing and Management**
+
+- **Writing keys**: Allow keys to be written to specified storage tiers based 
on storage policies.
+- **Policy Management**: Enable setting, unsetting, and inheriting storage 
policies for keys, prefixes, and buckets. Inherit policies based on the longest 
matching prefix or bucket if no specific policy is set.
+
+### **Support for Data Migration Across Different Storage Policies**
+
+- **Data Migration**: Support data migration across different storage policies 
via manual triggers, ensuring data is moved to the appropriate storage tiers.
+
+### **Adaptation of AWS S3 StorageClass**
+
+- **S3 StorageClass Mapping**: Map AWS S3 storage classes to Ozone storage 
policies, supporting related API operations (PutObject, CopyObject, Multipart 
Upload, GetObject, HeadObject, ListObjects).
+
+### **Management and Monitoring Tools**
+
+- **Storage Policy Commands**: Provide tools to view storage policies of 
containers, datanode usage, and pipeline information.
+- **Metrics and Monitoring**: Enable visibility into storage policy 
compliance, container storage types, and space information across different 
storage policies.
+
+### **Future Enhancements**
+
+- **Intelligent Storage Policies**: Plan to support automatic data migration 
based on access frequency, similar to S3 Intelligent-Tiering.
+- **Bucket StorageClass Lifecycle Rules**: Support setting storage policy 
Lifecycle Rules at the bucket level.
+- **Recon Support**: Enhance Recon to display 

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-02 Thread via GitHub


greenwich commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2875955485


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@
+---
+title: Ozone Storage Policy Support
+summary: Support storage policies in Ozone, so that keys can be written to a 
specified type of storage medium.
+date: 2024-07-25
+jira: HDDS-11233
+status: draft
+---
+
+
+# Terminology
+
+## Terminology
+
+- Storage Policy: Defines which storage tiers the data replicas of a key 
should be stored in.
+- Storage Type: The type of a disk or Container replica in a Datanode, for 
example RAM_DISK, SSD, HDD, or ARCHIVE.
+- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode.
+- Prefix: In this document, unless otherwise specified, a prefix refers to a 
storage policy prefix, not an ACL prefix. A storage policy prefix is used to 
configure the storage policy for keys under the specified prefix.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relationship between Storage Policy, Storage Type, and Storage Tier:
+
+- The storage policy is a property of a key, bucket, or prefix (managed by OM).
+- The storage tier is a property of a Pipeline and Container (managed by SCM).
+- The storage type is a property of a volume and Container replica (managed 
by DN).
+- Only the storage policy can be modified by the user directly via the ozone 
command.
+
+Example:
+
+For keyA, the storage policy is Hot, its Container 1 belongs to the SSD tier, 
and Container 1 has three replicas, all of which are of the SSD storage type.
+
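The relation in the keyA example can be sketched as a small data model. This is only a hedged illustration: `POLICY_STORAGE_TYPE`, `Container`, and `satisfies_policy` are hypothetical names that do not come from the design doc, and only the Hot to SSD pairing is taken from the example.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical table: the storage type every replica must have for a policy.
# Only the "Hot" -> "SSD" pairing comes from the example in the text.
POLICY_STORAGE_TYPE = {"Hot": "SSD"}


@dataclass
class Container:
    container_id: int
    # Storage type of the volume holding each replica, as reported by Datanodes.
    replica_storage_types: List[str]


def satisfies_policy(policy: str, container: Container) -> bool:
    """Return True if every replica sits on the storage type the policy expects."""
    expected = POLICY_STORAGE_TYPE[policy]
    return all(t == expected for t in container.replica_storage_types)
```

With `Container(1, ["SSD", "SSD", "SSD"])`, `satisfies_policy("Hot", ...)` holds, which corresponds to Container 1 being in the SSD tier; a container with one HDD replica would not satisfy the Hot policy in this sketch.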
+# User Scenarios
+
+- User A needs a bucket that supports high-performance IO, so User A creates a 
bucket with the storage policy set to Hot. Data written by User A to the bucket 
will automatically be distributed across the SSD disks in the cluster.
+- User B needs higher IO performance for the directory/prefix 
/project/metadata, so User B sets the storage policy for the prefix 
/project/metadata to Hot. Subsequently, data written to /project/metadata will 
be automatically distributed across the SSD disks in the cluster.
+- User C has already written key1 to the cluster and requires better IO 
performance. The storage policy for key1 can be set to Hot, and then a 
migration can be triggered to move key1 to the SSD disks.
+- User D uses the command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class XXX` to upload a file to the Ozone SSD tier.
+
+# Current Status
+
+- Ozone currently has some support for tiered storage such as storage type, 
and some parts of this article may already be implemented.
+- Currently, in Ozone, when a key is created, the key's Block can appear on 
any volume of a Datanode. When a key is created, SCM first needs to allocate a 
Block for the key through Pipelines. The Client then writes the Block to the 
corresponding Datanode based on the Pipeline information. In this process, the 
smallest element managed by the SCM Pipeline is the Datanode, and when the 
Datanode creates a Container, the Container may appear on any volume with 
enough remaining space. Under the current architecture, Ozone does not support 
writing data to specific disks.
+
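The gap described in this section can be illustrated with a minimal sketch of volume selection on a Datanode. `Volume`, `choose_volume`, and the most-free-space tie-break are assumptions made for illustration; they are not Ozone's actual volume-choosing code. Passing `storage_type=None` models the current behaviour (any volume with enough remaining space), while passing a type models what a storage-policy-aware write would need.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Volume:
    path: str
    storage_type: str  # e.g. "SSD" or "HDD"
    free_bytes: int


def choose_volume(volumes: List[Volume],
                  required_bytes: int,
                  storage_type: Optional[str] = None) -> Optional[Volume]:
    """Pick a volume with enough space, optionally restricted to one storage type."""
    candidates = [
        v for v in volumes
        if v.free_bytes >= required_bytes
        and (storage_type is None or v.storage_type == storage_type)
    ]
    if not candidates:
        return None
    # Illustrative tie-break only: prefer the volume with the most free space.
    return max(candidates, key=lambda v: v.free_bytes)
```

Without the `storage_type` filter a Container can land on an HDD volume even when an SSD volume exists, which is why the current architecture cannot direct data to specific disks.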
+# Goal Requirements Specification
+
+### **Support for Storage Policy Writing and Management**
+
+- **Writing keys**: Allow keys to be written to specified storage tiers based 
on storage policies.
+- **Policy Management**: Enable setting, unsetting, and inheriting storage 
policies for keys, prefixes, and buckets. Inherit policies based on the longest 
matching prefix or bucket if no specific policy is set.
+
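The inheritance rule in the Policy Management item (key, then longest matching prefix, then bucket) can be sketched as follows. This is a hedged illustration, not the proposed OM implementation: `resolve_policy` and its parameters are hypothetical, prefix matching is assumed to be a plain string `startswith`, and the final `cluster_default` fallback is an assumption that the text does not specify.

```python
from typing import Dict, Optional


def resolve_policy(key: str,
                   key_policies: Dict[str, str],
                   prefix_policies: Dict[str, str],
                   bucket_policy: Optional[str],
                   cluster_default: str) -> str:
    """Return the effective storage policy for `key`."""
    # 1. A policy set explicitly on the key wins.
    if key in key_policies:
        return key_policies[key]
    # 2. Otherwise inherit from the longest prefix that matches the key.
    matches = [p for p in prefix_policies if key.startswith(p)]
    if matches:
        return prefix_policies[max(matches, key=len)]
    # 3. Otherwise inherit from the bucket, if it has a policy.
    if bucket_policy is not None:
        return bucket_policy
    # 4. Assumed fallback when nothing is set anywhere.
    return cluster_default
```

For example, with prefix policies `{"project/": "Warm", "project/metadata/": "Hot"}` (where "Warm" is a made-up policy name), the key `project/metadata/a` resolves to Hot because the longer prefix takes precedence.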
+### **Support for Data Migration Across Different Storage Policies**
+
+- **Data Migration**: Support data migration across different storage policies 
via manual triggers, ensuring data is moved to the appropriate storage tiers.
+
+### **Adaptation of AWS S3 StorageClass**
+
+- **S3 StorageClass Mapping**: Map AWS S3 storage classes to Ozone storage 
policies, supporting related API operations (PutObject, CopyObject, Multipart 
Upload, GetObject, HeadObject, ListObjects).
+
+### **Management and Monitoring Tools**
+
+- **Storage Policy Commands**: Provide tools to view storage policies of 
containers, datanode usage, and pipeline information.
+- **Metrics and Monitoring**: Enable visibility into storage policy 
compliance, container storage types, and space information across different 
storage policies.
+
+### **Future Enhancements**
+
+- **Intelligent Storage Policies**: Plan to support automatic data migration 
based on access frequency, similar to S3 Intelligent-Tiering.
+- **Bucket StorageClass Lifecycle Rules**: Support setting storage policy 
Lifecycle Rules at the bucket level.
+- **Recon Support**: Enhance Recon to display 

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-02 Thread via GitHub


xichen01 commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2873208891


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-02 Thread via GitHub


xichen01 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-3985905946

   @greenwich @chungen0126 @errose28 @vtutrinov I will update this document, 
adding more detailed content and incorporating some minor changes. I will try 
to complete it this week.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-03-01 Thread via GitHub


greenwich commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2870767609


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-02-28 Thread via GitHub


xichen01 commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2867351940


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-02-24 Thread via GitHub


greenwich commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-3956982569

   @ivandika3 Thanks for the clarification — that makes sense.
   
   I fully understand that contributions are voluntary, and I really appreciate 
the time and effort everyone is putting into this.
   
   I’m very interested in helping move this forward and would be glad to 
contribute where it makes the most sense.
   
   Regarding Phase 2, just thinking ahead — would it make sense to initially 
target a simpler mover-style implementation (similar in spirit to HDFS Mover) 
before introducing a separate job worker subsystem? That might allow basic 
Storage Policy Migration functionality to be delivered earlier and iterated on 
over time.
   
   Of course, I’m happy to align with the broader design direction — just 
exploring whether an incremental path could also work here.
   
   Please let me know how I can best contribute.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-02-24 Thread via GitHub


ivandika3 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-3956283360

   > the diff above has references to the following non-existent files 
(relative to ozone-1.4.1):
   
   These refer to our internal placement policy to support a multi-DC setup and 
can be ignored, as they do not pertain to any current functionality in 
community Ozone. The patch diff serves only as an overview of what the changes 
might look like.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-02-24 Thread via GitHub


ivandika3 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-3956078125

   @xichen01 Let's reopen this patch and move this forward. I am willing to spend 
some time on the storage policy support development. Hopefully other community 
members (@errose28 @chungen0126) are also able to help push the review 
process.
   
   @greenwich As mentioned in the 
https://github.com/apache/ozone/discussions/8811#discussioncomment-13928693 
discussion last year, there are mainly two phases of Ozone Storage Policy 
Support:
   1. Supporting storage policy and storage types on Ozone
   2. Storage Policy Migration support
   
   The Phase 1 goal is to allow clients to upload keys with different storage 
policies and to integrate storage policies and types into all Ozone pipelines, 
containers, etc. This will be our short-term focus for now, since it does not 
introduce any new subsystem.
   
   The Phase 2 goal is to support Storage Policy Migration. However, our current 
implementation requires a separate job worker subsystem. This will be longer 
term, since it introduces a new subsystem. If you don't require a 
separate job worker subsystem, you might need to write your own implementation 
of "Storage Policy Satisfier".
   
   That said, I hope you understand that all contributions to Ozone are purely 
voluntary and made by members with other, higher-level priorities, and therefore 
we cannot 100% guarantee that this will be done in a timely manner.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-02-24 Thread via GitHub


greenwich commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-3955939286

   @xichen01, Thanks for the update. That's great news indeed! In that case, 
there is no point in me reinventing the wheel. I had a quick look at the patch you 
submitted earlier, and it looks comprehensive. I will have a detailed look 
today.
   
   Would it be possible to move forward with that PR (and merge it, or at least 
create a working branch and sync it with master)? Also, as I am very interested 
in that feature, I can provide assistance on my end.
   
   cc: @ivandika3 @chungen0126 
   Please let me know.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-02-24 Thread via GitHub


xichen01 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-3953103047

   @errose28 Thanks for noticing this PR.
   
   @greenwich Thank you for the update.
   
   ### Regarding the current status of this PR
   
   The current PR hasn't been updated for a while, mainly because there doesn't 
seem to be a strong demand for this feature from other members of the 
community, so reviews have stagnated.
   However, this feature has been fully implemented internally, including 
support for StoragePolicy across all S3 and Filesystem write interfaces, as 
well as support for StoragePolicy in ReplicationManager and ContainerBalancer.
   We've basically implemented it according to this design document (some parts 
of the document need updating; I can update the document if needed).
   
   We also support S3 Lifecycle 
(https://issues.apache.org/jira/browse/HDDS-8342), allowing you to set a 
Lifecycle for a Bucket to migrate specified keys to a specified StoragePolicy 
at a specified time (including migrating from SSD to DISK, and from THREE 
replication to EC), as well as functionality similar to HDFS SPS (Storage 
Policy Satisfier).
   
   ### Follow-up
   
   If you or others in the community are willing, we can continue to move 
forward with this PR. Of course, you can also move forward with your own 
proposals, and we can cooperate if you need it.
   
   
   





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-02-23 Thread via GitHub


greenwich commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-3947561978

   @errose28 Thanks for looking! I am very sorry for the noise my PR caused; it 
wasn't intended to be public (my bad, I didn't set it to Draft when I created 
it).
   
   Let me explain my motivation.
   1. My team **needs** storage policy and tiering support for Ozone. 
Unfortunately, reality is tough, and if we don't have it this year (ideally 
H1), then Ozone might be deprioritised because we use other Ozone 
competitors in the company.
   2. The patch attached to this pull request doesn't seem to be up to date 
with the master or even 2.0 or 2.1 releases.
   3. As I can see, there has been no work on this PR since Nov 2025.
   4. Plus, I might be wrong, but I got the impression that we want a perfect 
design and implementation here before we start coding.
   5. My approach is different - I want to design and build it 
**incrementally** because that's the simplest way to adopt it in my team, start 
using it and collect the feedback from others. So
   - I analysed the current state, reviewed the contents of the patch in this 
pull request, and created a small roadmap of the features I want in storage 
tiering. 
   - I prioritised them and split them into small releases. I called them 
MVP-1, MVP-2, etc. Each MVP should take me around 1 week to implement. 
   - Each MVP should have a complete set of features that work e2e. At the 
moment, I work on MVP-1.
   - I planned to test MVPs on our PROD environment, and if it works as 
expected, then go back to the Ozone community and share what I have.
   6. I haven't created any Jira tickets because I planned to implement 
and test my MVP first. However, each independent feature is pushed as a 
separate commit, so it can later be retrofitted into different ASF Jira 
tickets (if needed).
   7. Lastly, using Apache Ozone's PR feature is very useful for me 
because it triggers the whole CI and whatnot and lets me see the diff against 
master to keep my branch in sync.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-02-23 Thread via GitHub


errose28 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-3946550817

   Hi @greenwich I see you have also opened #9807.  If you would like to 
continue work on this, we should start by reaching agreement on a design doc. 
I'm not sure we finished that process yet. @ivandika3 @xichen01 does this doc 
need more updates/review? Should we continue work on it here or open a new PR?





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2026-02-17 Thread via GitHub


greenwich commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-3919009991

   Hi, do you know if any work is planned for this ticket? AFAIK, the patch 
diff wasn't added to the branch and is probably out of date right now. What are 
the next steps here?





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-11-18 Thread via GitHub


github-actions[bot] closed pull request #6989: HDDS-11233. Ozone Storage Policy 
Support.
URL: https://github.com/apache/ozone/pull/6989





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-11-18 Thread via GitHub


github-actions[bot] commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-3549968542

   Thank you for your contribution. This PR is being closed due to inactivity. 
If needed, feel free to reopen it.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-11-11 Thread via GitHub


github-actions[bot] commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-3519269594

   This PR has been marked as stale due to 21 days of inactivity. Please 
comment or remove the stale label to keep it open. Otherwise, it will be 
automatically closed in 7 days.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-06-19 Thread via GitHub


vtutrinov commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-2988190589

   @ivandika3 @xichen01 the diff above has references to the following 
non-existent files (relative to ozone-1.4.1):
   
   ```
   hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/placement/algorithms/SCMContainerPlacementDataCenterAware.java
   hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/placement/algorithms/SCMContainerPlacementDataRecovery.java
   hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelinePlacementDataCenterAware.java
   hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/placement/algorithms/TestSCMContainerPlacementDataCenterAware.java
   hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/placement/algorithms/TestSCMContainerPlacementDataCenterAwareSpecialCase.java
   hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/placement/algorithms/TestSCMContainerPlacementDataRecovery.java
   hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/placement/algorithms/TestSCMContainerPlacementDcFlow.java
   hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/pipeline/TestPipelinePlacementDataCenterAware.java
   hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/diskbalancer/TestDiskBalancerService.java
   hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/placement/algorithms/TestSCMContainerPlacementStorageTier.java
   hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/client/StorageTierUtil.java
   hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeUtils.java
   hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/TestSpecialCloseContainerEventHandler.java
   hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/hdds/scm/container/TestPeriodicContainerCloser.java
   hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/AbstractRootedOzoneFileSystemTest.java
   hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/shell/UpdateBucketOptions.java
   hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/client/StorageTypeUtils.java
   hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestOzoneStoragePolicy.java
   hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/balancer/TestContainerBalancerTaskDcFlow.java
   hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/diskbalancer/DiskBalancerService.java
   hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/diskbalancer/DiskBalancerUtils.java
   hadoop-hdds/common/src/test/java/org/apache/hadoop/hdds/client/StorageTierUtilTest.java
   hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/dcflow/ContainerBalancerSelectionCriteriaDcFlow.java
   hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/volume/AbstractStorageTypeChoosingPolicy.java
   hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/storagepolicy/StoragePolicyCommands.java
   hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/storagepolicy/UsageInfoSubCommand.java
   ```
   
   Could you provide them too, or point me to the commit where I can fetch them?





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-06-16 Thread via GitHub


ivandika3 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-2976778259

   @vtutrinov Thanks for the reminder. I have attached 
https://issues.apache.org/jira/secure/attachment/13077025/storage-policy-diff.tar.gz
 for the list of diffs of the storage policy integration. 
   
   Please be reminded to attribute @xichen01 for any commits generated from 
these diffs.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-06-16 Thread via GitHub


vtutrinov commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-2976630111

   @ivandika3 I don't want to rush, but is there any news about the mentioned 
diff? 





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-06-09 Thread via GitHub


vtutrinov commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-2955514606

   @ivandika3 it would be great!





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-06-09 Thread via GitHub


ivandika3 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-2955463014

   @vtutrinov I think the fastest option is to provide you with the diffs. 
However, these diffs won't apply cleanly on the master branch, since our 
branch is based on version 1.4.1 plus some internal-specific changes. I can 
probably provide some of them this weekend.
   
   A feature branch in my fork might take a while, since we need to resolve 
the conflicts first.
   
   





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-06-09 Thread via GitHub


vtutrinov commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-2955438634

   @ivandika3 thanks for the response!
   Can we glance at the implementation as the first phase (maybe in a custom 
feature branch)? Or are there too many private details?





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-06-07 Thread via GitHub


ivandika3 commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-2953554337

   @vtutrinov The implementation has been worked on internally for the past 
year.
   
   The basic integration of storage policies and storage types with 
containers, pipelines, volumes, the S3 storage class, and key / file creation 
with a storage policy has been implemented, but it still needs extensive 
testing. Currently we are focusing on the storage policy migration 
implementation.
   
   @xichen01 would know more about the approximate timeline, but we hope to 
have a working implementation in the next quarter (i.e. Q3 2025) or so.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-05-28 Thread via GitHub


vtutrinov commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-2915981608

   @xichen01 @kerneltime @sodonnel, could you help move the review of the 
design doc forward? The feature is much needed, and I would gladly start the 
implementation.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-05-22 Thread via GitHub


vtutrinov commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-2900322703

   @xichen01 is there an estimated time frame for implementing this 
functionality? I'd like to start creating the JIRA tickets and implementing them.





Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-05-21 Thread via GitHub


xichen01 commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2100544578


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@
+---
+title: Ozone Storage Policy Support
+summary: Support Ozone storage strategy, and support to write key into the 
specified type of storage medium.
+date: 2024-07-25
+jira: HDDS-11233
+status: draft
+---
+
+
+# Terminology
+
+## Terminology
+
+- Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
+- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
+- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
+- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relation of Storage Policy, Storage Type and Storage Tier
+
+- The storage policy is the property of key/bucket/ prefix (Managed by OM);
+- The storage tier is the property of Pipeline and Container (Managed by SCM);

Review Comment:
   Storage Tier is more like `ReplicationConfig`: it will be an independent 
field in `ContainerInfo` and `Pipeline`.
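
   To illustrate the point above, here is a minimal Java sketch of a 
pipeline-like object that carries the storage tier as its own field next to 
the replication setting. All names (`StorageTierSketch`, `PipelineSketch`, the 
`StorageTier` enum and its values) are hypothetical and are not Ozone's actual 
classes; the replication setting is reduced to a plain string for brevity.

```java
import java.util.Objects;

/** Hypothetical sketch only; none of these types are Ozone's real classes. */
final class StorageTierSketch {

  /** Illustrative tier names, borrowed from the storage types in the design doc. */
  enum StorageTier { SSD, HDD, ARCHIVE }

  /** Pipeline-like holder: the tier sits next to replication, not inside it. */
  static final class PipelineSketch {
    private final String replication;   // stand-in for a replication setting, e.g. "RATIS/THREE"
    private final StorageTier tier;     // independent field, as suggested in the comment

    PipelineSketch(String replication, StorageTier tier) {
      this.replication = Objects.requireNonNull(replication, "replication");
      this.tier = Objects.requireNonNull(tier, "tier");
    }

    /** A request matches only if both independent fields match. */
    boolean matches(String wantedReplication, StorageTier wantedTier) {
      return replication.equals(wantedReplication) && tier == wantedTier;
    }
  }

  public static void main(String[] args) {
    PipelineSketch ssdPipeline = new PipelineSketch("RATIS/THREE", StorageTier.SSD);
    // Same replication, same tier.
    System.out.println(ssdPipeline.matches("RATIS/THREE", StorageTier.SSD));
    // Same replication, different tier.
    System.out.println(ssdPipeline.matches("RATIS/THREE", StorageTier.HDD));
  }
}
```

   Keeping the two fields independent lets a lookup filter on replication and 
tier separately, which is the analogy with `ReplicationConfig` drawn above. How 
the real `Pipeline` and `ContainerInfo` would expose this is not settled in 
this thread.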






Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-05-20 Thread via GitHub


vtutrinov commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2098394743


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@
+---
+title: Ozone Storage Policy Support
+summary: Support Ozone storage strategy, and support to write key into the 
specified type of storage medium.
+date: 2024-07-25
+jira: HDDS-11233
+status: draft
+---
+
+
+# Terminology
+
+## Terminology
+
+- Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
+- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
+- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
+- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relation of Storage Policy, Storage Type and Storage Tier
+
+- The storage policy is the property of key/bucket/ prefix (Managed by OM);
+- The storage tier is the property of Pipeline and Container (Managed by SCM);

Review Comment:
   I meant `org.apache.hadoop.hdds.scm.net.NodeSchema`. Will the Storage Tier 
(aka `rack of specific storage volumes`) become a part of the network topology?






Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-05-20 Thread via GitHub


xichen01 commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2098242333


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@
+---
+title: Ozone Storage Policy Support
+summary: Support Ozone storage strategy, and support to write key into the 
specified type of storage medium.
+date: 2024-07-25
+jira: HDDS-11233
+status: draft
+---
+
+
+# Terminology
+
+## Terminology
+
+- Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
+- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
+- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
+- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relation of Storage Policy, Storage Type and Storage Tier
+
+- The storage policy is the property of key/bucket/ prefix (Managed by OM);
+- The storage tier is the property of Pipeline and Container (Managed by SCM);
+- The storage type is the property of volume and Container replicas (Managed 
by DN);
+- Only the storage policy can be modified by the user directly via ozone 
command;
+
+Example:
+
+For a keyA, its storage policy is Hot, its Container 1 tier is SSD tier, and 
Container 1 has three replicas, all of which are of the SSD storage type.
+
+# User Scenarios
+
+- User A needs a bucket that supports high-performance IO, so create a bucket 
with the storage policy set to Hot. Data written by User A to bucket will 
automatically be distributed across the SSD disks in the cluster.
+- User B needs higher IO performance for the directory/prefix 
/project/metadata, so set the storage policy for the prefix /project/metadata 
to Hot. Subsequently, data written to /project/metadata will be automatically 
distributed across the SSD disks in the cluster.
+- User C has already written key1 to the cluster and requires better IO 
performance. The storage policy for key1 can be set to Hot, and then a 
migration can be triggered to move key1 to the SSD disks.
+- Use D use command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class XXX` upload a file the Ozone SSD tier
+
+# Current Status
+
+- Ozone currently has some support for tiered storage such as storage type, 
and some parts of this article may already be implemented.
+- Currently, in Ozone, when a key is created, the key's Block can appear on 
any volume of a Datanode. When a key is created, SCM first needs to allocate a 
Block for the key through Pipelines. The Client then writes the Block to the 
corresponding Datanode based on the Pipeline information. In this process, the 
smallest element managed by the SCM Pipeline is the Datanode, and when the 
Datanode creates a Container, the Container may appear on any volume with 
enough remaining space. Under the current architecture, Ozone does not support 
writing data to specific disks
+
+# Goal Requirements Specification
+
+### **Support for Storage Policy Writing and Management**
+
+- **Writing keys**: Allow keys to be written to specified storage tiers based 
on storage policies.
+- **Policy Management**: Enable setting, unsetting, and inheriting storage 
policies for keys, prefixes, and buckets. Inherit policies based on the longest 
matching prefix or bucket if no specific policy is set.
+
+### **Support for Data Migration Across Different Storage Policies**
+
+- **Data Migration**: Support data migration across different storage policies 
via manual triggers, ensuring data is moved to the appropriate storage tiers.
+
+### **Adaptation of AWS S3 StorageClass**
+
+- **S3 StorageClass Mapping**: Map AWS S3 storage classes to Ozone storage 
policies, supporting related API operations (PutObject, CopyObject, Multipart 
Upload, GetObject, HeadObject, ListObjects).
+
+### **Management and Monitoring Tools**
+
+- **Storage Policy Commands**: Provide tools to view storage policies of 
containers, datanode usage, and pipeline information.
+- **Metrics and Monitoring**: Enable visibility into storage policy 
compliance, container storage types, and space information across different 
storage policies.
+
+### **Future Enhancements**
+
+- **Intelligent Storage Policies**: Plan to support automatic data migration 
based on access frequency, similar to S3 Intelligent-Tiering.
+- **Bucket StorageClass Lifecycle Rules: Support setting storage policies 
Lifecycle Rules at the bucket level.**
+- **Recon Support**: Enhance Recon to display r

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2025-05-20 Thread via GitHub


vtutrinov commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r2097266132


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@
+---
+title: Ozone Storage Policy Support
+summary: Support Ozone storage strategy, and support to write key into the 
specified type of storage medium.
+date: 2024-07-25
+jira: HDDS-11233
+status: draft
+---
+
+
+# Terminology
+
+## Terminology
+
+- Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
+- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
+- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
+- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relation of Storage Policy, Storage Type and Storage Tier
+
+- The storage policy is the property of key/bucket/ prefix (Managed by OM);
+- The storage tier is the property of Pipeline and Container (Managed by SCM);

Review Comment:
   Will we deal with the storage tier as an entry of the cluster topology?



##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@
+---
+title: Ozone Storage Policy Support
+summary: Support Ozone storage strategy, and support to write key into the 
specified type of storage medium.
+date: 2024-07-25
+jira: HDDS-11233
+status: draft
+---
+
+
+# Terminology
+
+## Terminology
+
+- Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
+- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
+- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
+- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relation of Storage Policy, Storage Type and Storage Tier
+
+- The storage policy is the property of key/bucket/ prefix (Managed by OM);
+- The storage tier is the property of Pipeline and Container (Managed by SCM);
+- The storage type is the property of volume and Container replicas (Managed 
by DN);
+- Only the storage policy can be modified by the user directly via ozone 
command;
+
+Example:
+
+For a keyA, its storage policy is Hot, its Container 1 tier is SSD tier, and 
Container 1 has three replicas, all of which are of the SSD storage type.
+
+# User Scenarios
+
+- User A needs a bucket that supports high-performance IO, so create a bucket 
with the storage policy set to Hot. Data written by User A to bucket will 
automatically be distributed across the SSD disks in the cluster.
+- User B needs higher IO performance for the directory/prefix 
/project/metadata, so set the storage policy for the prefix /project/metadata 
to Hot. Subsequently, data written to /project/metadata will be automatically 
distributed across the SSD disks in the cluster.
+- User C has already written key1 to the cluster and requires better IO 
performance. The storage policy for key1 can be set to Hot, and then a 
migration can be triggered to move key1 to the SSD disks.
+- Use D use command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class XXX` upload a file the Ozone SSD tier
+
+# Current Status
+
+- Ozone currently has some support for tiered storage such as storage type, 
and some parts of this article may already be implemented.
+- Currently, in Ozone, when a key is created, the key's Block can appear on 
any volume of a Datanode. When a key is created, SCM first needs to allocate a 
Block for the key through Pipelines. The Client then writes the Block to the 
corresponding Datanode based on the Pipeline information. In this process, the 
smallest element managed by the SCM Pipeline is the Datanode, and when the 
Datanode creates a Container, the Container may appear on any volume with 
enough remaining space. Under the current architecture, Ozone does not support 
writing data to specific disks
+
+# Goal Requirements Specification
+
+### **Support for Storage Policy Writing

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2024-07-25 Thread via GitHub


ivandika3 commented on code in PR #6989:
URL: https://github.com/apache/ozone/pull/6989#discussion_r1692365363


##
hadoop-hdds/docs/content/design/storage-policy.md:
##
@@ -0,0 +1,397 @@
+---
+title: Ozone Storage Policy Support
+summary: Support Ozone storage strategy, and support to write key into the 
specified type of storage medium.
+date: 2024-07-25
+jira: HDDS-11233
+status: draft
+---
+
+
+# Terminology
+
+## Terminology
+
+- Storage Policy: Defines where key data replicas should be stored in specific 
storage tiers.
+- Storage Type: The types of disks/Container replicas in a Datanode, storage 
type could include RAM_DISK, SSD, HDD, ARCHIVE, etc.
+- Storage Tier: A set of Container replicas in a cluster that satisfy the 
storage policy.
+- Volume: In this document, unless otherwise specified, a volume refers to the 
volume of a Datanode..
+- prefix: The prefix in this article, unless otherwise specified, refers to 
the prefix of the storage policy type, not the ACL prefix. The prefix of the 
storage policy type is used to configure the prefix of the storage policy for 
the specified prefix.
+
+## Storage Policy vs Storage Type vs Storage Tier
+
+![storage-policy](https://issues.apache.org/jira/secure/attachment/13070477/storage-policy.png)
+
+The relation of Storage Policy, Storage Type and Storage Tier
+
+- The storage policy is the property of key/bucket/ prefix (Managed by OM);
+- The storage tier is the property of Pipeline and Container (Managed by SCM);
+- The storage type is the property of volume and Container replicas (Managed 
by DN);
+- Only the storage policy can be modified by the user directly via ozone 
command;
+
+Example:
+
+For a keyA, its storage policy is Hot, its Container 1 tier is SSD tier, and 
Container 1 has three replicas, all of which are of the SSD storage type.
+
+# User Scenarios
+
+- User A needs a bucket that supports high-performance IO, so create a bucket 
with the storage policy set to Hot. Data written by User A to bucket will 
automatically be distributed across the SSD disks in the cluster.
+- User B needs higher IO performance for the directory/prefix 
/project/metadata, so set the storage policy for the prefix /project/metadata 
to Hot. Subsequently, data written to /project/metadata will be automatically 
distributed across the SSD disks in the cluster.
+- User C has already written key1 to the cluster and requires better IO 
performance. The storage policy for key1 can be set to Hot, and then a 
migration can be triggered to move key1 to the SSD disks.
+- Use D use command `aws s3 cp myfile.txt s3://my-bucket/myfile.txt 
--storage-class XXX` upload a file the Ozone SSD tier
+
+# Current Status
+
+- Ozone currently has some support for tiered storage such as storage type, 
and some parts of this article may already be implemented.
+- Currently, in Ozone, when a key is created, the key's Block can appear on 
any volume of a Datanode. When a key is created, SCM first needs to allocate a 
Block for the key through Pipelines. The Client then writes the Block to the 
corresponding Datanode based on the Pipeline information. In this process, the 
smallest element managed by the SCM Pipeline is the Datanode, and when the 
Datanode creates a Container, the Container may appear on any volume with 
enough remaining space. Under the current architecture, Ozone does not support 
writing data to specific disks
+
+# Goal Requirements Specification
+
+### **Support for Storage Policy Writing and Management**
+
+- **Writing keys**: Allow keys to be written to specified storage tiers based 
on storage policies.
+- **Policy Management**: Enable setting, unsetting, and inheriting storage 
policies for keys, prefixes, and buckets. Inherit policies based on the longest 
matching prefix or bucket if no specific policy is set.
+
+### **Support for Data Migration Across Different Storage Policies**
+
+- **Data Migration**: Support data migration across different storage policies 
via manual triggers, ensuring data is moved to the appropriate storage tiers.
+
+### **Adaptation of AWS S3 StorageClass**
+
+- **S3 StorageClass Mapping**: Map AWS S3 storage classes to Ozone storage 
policies, supporting related API operations (PutObject, CopyObject, Multipart 
Upload, GetObject, HeadObject, ListObjects).
+
+### **Management and Monitoring Tools**
+
+- **Storage Policy Commands**: Provide tools to view storage policies of 
containers, datanode usage, and pipeline information.
+- **Metrics and Monitoring**: Enable visibility into storage policy 
compliance, container storage types, and space information across different 
storage policies.
+
+### **Future Enhancements**
+
+- **Intelligent Storage Policies**: Plan to support automatic data migration 
based on access frequency, similar to S3 Intelligent-Tiering.
+- **Bucket StorageClass Lifecycle Rules: Support setting storage policies 
Lifecycle Rules at the bucket level.**
+- **Recon Support**: Enhance Recon to display 

Re: [PR] HDDS-11233. Ozone Storage Policy Support. [ozone]

2024-07-25 Thread via GitHub


kerneltime commented on PR #6989:
URL: https://github.com/apache/ozone/pull/6989#issuecomment-2251173313

   cc @sodonnel 

