xichen01 commented on code in PR #6589:
URL: https://github.com/apache/ozone/pull/6589#discussion_r1600862649

##########
hadoop-hdds/docs/content/design/lifecycle-configurations.md:
##########
@@ -0,0 +1,253 @@
+---
+title: AWS S3 Lifecycle Configurations
+summary: Enables users to manage lifecycle configurations for buckets, allowing automated deletion of keys based on predefined rules.
+date: 2024-04-25
+jira: HDDS-8342
+status: draft
+---
+<!--
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. See accompanying LICENSE file.
+-->
+
+# Lifecycle Management
+
+## Introduction
+I encountered the need for a retention solution within my cluster, specifically the ability to delete keys in specific paths after a certain time period.
+This requirement closely resembled the functionality provided by AWS S3 Lifecycle configurations, particularly the Expiration part ([AWS S3 Lifecycle Configuration Examples](https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-configuration-examples.html)).
+
+## Overview
+
+### Functionality
+- User should be able to create/remove/fetch lifecycle configurations for a specific S3 bucket.
+- The lifecycle configurations will be executed periodically.
+- Depending on the rules of the lifecycle configuration there could be different actions or even multiple actions.
+- At the moment only expiration is supported (keys get deleted).
+- The lifecycle configurations supports all buckets not only S3 buckets.
+
+
+### Components
+
+- Lifecycle configurations (will be stored in DB) consists of volumeName, bucketName and a list of rules
+ - A rule contains prefix (string), Expiration and an optional Filter.
+ - Object tagging integrations for bucket lifecycle configuration.
+ - Expiration contains either days (integer) or Date (long)
+ - Filter contains prefix (string).
+- S3G bucket endpoint needs few updates to accept ?/lifecycle
+- ClientProtocol and all implementers provides (get, list, delete and create) lifecycle configuration
+- RetentionManager:
+ - Upon startup, the OzoneManager initializes the Retention Manager based on configuration parameters such as retention interval.
+ - A background retention service is responsible for scheduling and executing tasks at specified intervals.
+ - The Retention Manager retrieves lifecycle configurations associated with buckets.
+ - Then assigns each lifecycle configuration (attached to a bucket) to a threadpool (Configurable) for further processing.
+ - Each task will iterate through keys of a specific bucket and issue deletion request for eligible keys.
+
+
+
+### Flow
+1. Users interact with lifecycle configurations via S3Gateway.
+2. Configuration details are processed by a handler.
+3. Configurations are saved/fetched from the database.
+4. RetentionManager, running periodically in the Leader OM, executes lifecycle configurations and issues deletions for eligible keys.
+
+## Limitations
+- The current solution lacks certain features:
+ - Only expiration actions are supported.
+ - Lack of CLI support for managing lifecycle configurations across all buckets (S3G is the only supported entry point).
+
+All these kind of features can be added in the future.
+
+## Protobuf Definitions
+```protobuf
+/**
+S3 lifecycles (filter, expiration, rule and configuration).
+ */
+message LifecycleFilter {
+ optional string prefix = 1;
+}
+
+message LifecycleExpiration {
+ optional uint32 days = 1;
+ optional string date = 2;
+}
+
+message LifecycleRule {
+ optional string id = 1;
+ optional string prefix = 2;
+ required bool enabled = 3;
+ optional LifecycleExpiration expiration = 4;
+ optional LifecycleFilter filter = 5;
+}
+
+message LifecycleConfiguration {
+ required string volume = 1;
+ required string bucket = 2;
+ required string owner = 3;
+ optional uint64 creationTime = 4;
+ repeated LifecycleRule rules = 5;
+ optional uint64 objectID = 6;
+ optional uint64 updateID = 7;
+}
+
+message CreateLifecycleConfigurationRequest {
+ required LifecycleConfiguration lifecycleConfiguration = 1;
+}
+
+message CreateLifecycleConfigurationResponse {
+
+}
+
+message InfoLifecycleConfigurationRequest {
+ required string volumeName = 1;
+ required string bucketName = 2;
+}
+
+message InfoLifecycleConfigurationResponse {
+ required LifecycleConfiguration lifecycleConfiguration = 1;
+}
+
+message DeleteLifecycleConfigurationRequest {
+ required string volumeName = 1;
+ required string bucketName = 2;

Review Comment:
AWS S3 only supports deleting the whole `LifecycleConfiguration` of a bucket at once. See: https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteBucketLifecycle.html

##########
hadoop-hdds/docs/content/design/lifecycle-configurations.md:
##########
@@ -0,0 +1,253 @@
+---
+title: AWS S3 Lifecycle Configurations
+summary: Enables users to manage lifecycle configurations for buckets, allowing automated deletion of keys based on predefined rules.
+date: 2024-04-25
+jira: HDDS-8342
+status: draft
+---
+<!--
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. See accompanying LICENSE file.
+-->
+
+# Lifecycle Management
+
+## Introduction
+I encountered the need for a retention solution within my cluster, specifically the ability to delete keys in specific paths after a certain time period.
+This requirement closely resembled the functionality provided by AWS S3 Lifecycle configurations, particularly the Expiration part ([AWS S3 Lifecycle Configuration Examples](https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-configuration-examples.html)).
+
+## Overview
+
+### Functionality
+- User should be able to create/remove/fetch lifecycle configurations for a specific S3 bucket.
+- The lifecycle configurations will be executed periodically.
+- Depending on the rules of the lifecycle configuration there could be different actions or even multiple actions.
+- At the moment only expiration is supported (keys get deleted).
+- The lifecycle configurations supports all buckets not only S3 buckets.
+
+
+### Components
+
+- Lifecycle configurations (will be stored in DB) consists of volumeName, bucketName and a list of rules
+ - A rule contains prefix (string), Expiration and an optional Filter.
+ - Object tagging integrations for bucket lifecycle configuration.
+ - Expiration contains either days (integer) or Date (long)
+ - Filter contains prefix (string).
+- S3G bucket endpoint needs few updates to accept ?/lifecycle
+- ClientProtocol and all implementers provides (get, list, delete and create) lifecycle configuration
+- RetentionManager:
+ - Upon startup, the OzoneManager initializes the Retention Manager based on configuration parameters such as retention interval.
+ - A background retention service is responsible for scheduling and executing tasks at specified intervals.
+ - The Retention Manager retrieves lifecycle configurations associated with buckets.
+ - Then assigns each lifecycle configuration (attached to a bucket) to a threadpool (Configurable) for further processing.
+ - Each task will iterate through keys of a specific bucket and issue deletion request for eligible keys.
+
+
+
+### Flow
+1. Users interact with lifecycle configurations via S3Gateway.
+2. Configuration details are processed by a handler.
+3. Configurations are saved/fetched from the database.
+4. RetentionManager, running periodically in the Leader OM, executes lifecycle configurations and issues deletions for eligible keys.
+
+## Limitations
+- The current solution lacks certain features:
+ - Only expiration actions are supported.
+ - Lack of CLI support for managing lifecycle configurations across all buckets (S3G is the only supported entry point).
+
+All these kind of features can be added in the future.
+
+## Protobuf Definitions
+```protobuf
+/**
+S3 lifecycles (filter, expiration, rule and configuration).
+ */
+message LifecycleFilter {
+ optional string prefix = 1;
+}
+
+message LifecycleExpiration {
+ optional uint32 days = 1;
+ optional string date = 2;
+}
+
+message LifecycleRule {
+ optional string id = 1;
+ optional string prefix = 2;
+ required bool enabled = 3;
+ optional LifecycleExpiration expiration = 4;
+ optional LifecycleFilter filter = 5;
+}
+
+message LifecycleConfiguration {
+ required string volume = 1;
+ required string bucket = 2;
+ required string owner = 3;
+ optional uint64 creationTime = 4;
+ repeated LifecycleRule rules = 5;
+ optional uint64 objectID = 6;
+ optional uint64 updateID = 7;

Review Comment:
@SaketaChalamchala The `required bool enabled = 3;` field in `LifecycleRule` is used to indicate whether the configuration is enabled or disabled.

##########
hadoop-hdds/docs/content/design/lifecycle-configurations.md:
##########
@@ -0,0 +1,253 @@
+---
+title: AWS S3 Lifecycle Configurations
+summary: Enables users to manage lifecycle configurations for buckets, allowing automated deletion of keys based on predefined rules.
+date: 2024-04-25
+jira: HDDS-8342
+status: draft
+---
+<!--
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. See accompanying LICENSE file.
+-->
+
+# Lifecycle Management
+
+## Introduction
+I encountered the need for a retention solution within my cluster, specifically the ability to delete keys in specific paths after a certain time period.
+This requirement closely resembled the functionality provided by AWS S3 Lifecycle configurations, particularly the Expiration part ([AWS S3 Lifecycle Configuration Examples](https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-configuration-examples.html)).
+
+## Overview
+
+### Functionality
+- User should be able to create/remove/fetch lifecycle configurations for a specific S3 bucket.

Review Comment:
> Do you have any thoughts on what Acl checks would be performed for creating a lifecycle configuration. Would it be restricted to the owners of the keys or an ozone administrator?

Maybe we need the `WRITE` permission on the bucket being operated on? If a user has `WRITE` permission on a bucket, they can already overwrite or delete another user's key in that bucket without going through `Lifecycle`.

> When Lifecycle deletes a key, as long as the `Rule` is met, the key will be deleted, if we want to block users from removing or deleting objects from specific bucket, bucket owner should not give the `WRITE` permission for the other user.

When Lifecycle deletes a key, the key is deleted as long as the `Rule` is met. The delete operation is executed by the OM itself, and the OM is an `admin` user. If we want to block users from removing or deleting objects from a specific bucket, the bucket owner should not give other users the `WRITE` permission on that bucket.

https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketLifecycleConfiguration.html
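
To make "as long as the `Rule` is met" concrete, here is a minimal sketch of how a single rule might be evaluated against a key. It is not Ozone code: `RuleSketch` and `matches` are hypothetical names, the rule carries one prefix rather than both `prefix` and `filter`, `date` is assumed to be already parsed into an `Instant`, and `days` is assumed to count from the key's creation time as in AWS S3 (without S3's rounding to midnight UTC).

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical stand-in for the proposed LifecycleRule message; not an Ozone class. */
final class RuleSketch {
  private final boolean enabled; // mirrors `required bool enabled = 3`
  private final String prefix;   // null or empty: rule applies to every key
  private final Integer days;    // expire this many days after creation, or null
  private final Instant date;    // expire at or after this instant, or null

  RuleSketch(boolean enabled, String prefix, Integer days, Instant date) {
    this.enabled = enabled;
    this.prefix = prefix;
    this.days = days;
    this.date = date;
  }

  /** Returns true if this rule makes the key eligible for deletion at {@code now}. */
  boolean matches(String keyName, Instant creationTime, Instant now) {
    if (!enabled) {
      return false; // disabled rules are stored but never acted on
    }
    if (prefix != null && !keyName.startsWith(prefix)) {
      return false; // key is outside the rule's scope
    }
    if (days != null) {
      // Assumption: age is measured from the key's creation time.
      Instant expiry = creationTime.plus(Duration.ofDays(days));
      return !now.isBefore(expiry);
    }
    if (date != null) {
      return !now.isBefore(date);
    }
    return false; // no expiration configured, nothing to do
  }
}
```

The sketch deliberately performs no ACL check on the key: that matches the point above that the deletion is executed by the OM itself, so access has to be controlled through `WRITE` permission on the bucket.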
