xichen01 commented on code in PR #6589: URL: https://github.com/apache/ozone/pull/6589#discussion_r1615158426
##########
hadoop-hdds/docs/content/design/lifecycle-configurations.md:
##########
@@ -0,0 +1,253 @@
+---
+title: AWS S3 Lifecycle Configurations
+summary: Enables users to manage lifecycle configurations for buckets, allowing automated deletion of keys based on predefined rules.
+date: 2024-04-25
+jira: HDDS-8342
+status: draft
+---
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# Lifecycle Management
+
+## Introduction
+I encountered the need for a retention solution within my cluster, specifically the ability to delete keys in specific paths after a certain time period.
+This requirement closely resembled the functionality provided by AWS S3 Lifecycle configurations, particularly the Expiration part ([AWS S3 Lifecycle Configuration Examples](https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-configuration-examples.html)).
+
+## Overview
+
+### Functionality
+- User should be able to create/remove/fetch lifecycle configurations for a specific S3 bucket.
+- The lifecycle configurations will be executed periodically.
+- Depending on the rules of the lifecycle configuration there could be different actions or even multiple actions.
+- At the moment only expiration is supported (keys get deleted).
+- The lifecycle configurations supports all buckets not only S3 buckets.
+
+ ### Components
+
+ - Lifecycle configurations (will be stored in DB) consists of volumeName, bucketName and a list of rules
+   - A rule contains prefix (string), Expiration and an optional Filter.
+   - Object tagging integrations for bucket lifecycle configuration.
+   - Expiration contains either days (integer) or Date (long)
+   - Filter contains prefix (string).
+- S3G bucket endpoint needs few updates to accept ?/lifecycle
+- ClientProtocol and all implementers provides (get, list, delete and create) lifecycle configuration
+- RetentionManager:
+  - Upon startup, the OzoneManager initializes the Retention Manager based on configuration parameters such as retention interval.
+  - A background retention service is responsible for scheduling and executing tasks at specified intervals.
+  - The Retention Manager retrieves lifecycle configurations associated with buckets.
+  - Then assigns each lifecycle configuration (attached to a bucket) to a threadpool (Configurable) for further processing.
+  - Each task will iterate through keys of a specific bucket and issue deletion request for eligible keys.

Review Comment:
   This simply means that every rule that matches is executed. Currently, the only "action" in our Lifecycle is deletion, so when a key is checked, it is deleted if any rule matches it. AWS S3 Lifecycle follows the same rule:
   https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-configuration-examples.html#lifecycle-config-conceptual-ex5
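
   To illustrate the "any rule matches" semantics described in the comment, here is a minimal sketch. It is not code from this PR: the `Rule` class, its `prefix` and `expirationDays` fields, and the `shouldDelete` helper are hypothetical names chosen for illustration, and the sketch assumes a rule expires a key once the key's age reaches the configured number of days.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.List;

public class AnyRuleMatch {
  /** Hypothetical rule: a key prefix plus an age threshold in days. */
  static final class Rule {
    final String prefix;
    final int expirationDays;

    Rule(String prefix, int expirationDays) {
      this.prefix = prefix;
      this.expirationDays = expirationDays;
    }

    /** True if the key is under this rule's prefix and old enough to expire. */
    boolean matches(String keyName, Instant created, Instant now) {
      return keyName.startsWith(prefix)
          && !created.plus(Duration.ofDays(expirationDays)).isAfter(now);
    }
  }

  /** A key is eligible for deletion as soon as any single rule matches it. */
  static boolean shouldDelete(String keyName, Instant created, Instant now,
      List<Rule> rules) {
    return rules.stream().anyMatch(r -> r.matches(keyName, created, now));
  }

  public static void main(String[] args) {
    Instant now = Instant.parse("2024-05-25T00:00:00Z");
    Instant twoDaysAgo = now.minus(Duration.ofDays(2));
    List<Rule> rules =
        Arrays.asList(new Rule("logs/", 30), new Rule("logs/tmp/", 1));

    // Under both prefixes, but only the 1-day rule has elapsed: deleted.
    System.out.println(shouldDelete("logs/tmp/a", twoDaysAgo, now, rules));
    // Under "logs/" only, and younger than 30 days: kept.
    System.out.println(shouldDelete("logs/app/b", twoDaysAgo, now, rules));
  }
}
```

   With overlapping rules like these, the key under `logs/tmp/` is deleted because the shorter rule already matches, even though the broader 30-day rule does not.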
