[
https://issues.apache.org/jira/browse/HDDS-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17827384#comment-17827384
]
Ivan Andika edited comment on HDDS-8342 at 3/15/24 6:15 AM:
------------------------------------------------------------
[~mohanad] I suggest raising a PR with a markdown file similar to this
ticket's description (similar to
[hadoop-hdds/docs/content/design/container-reconciliation.md|https://github.com/apache/ozone/pull/6121/files#diff-60a513c58274af0ec3972bffd0366ddf38021b56e03d2e254bca32a7ca423e5c]
in [https://github.com/apache/ozone/pull/6121]), so that other community
members can review the markdown file inline and give their suggestions. After
the design has been reviewed, we can ask to cut a feature branch, create
subtasks under this ticket, and raise the corresponding subtask PRs against
the feature branch.
Also, IMO the design documentation should include the actual protobuf schema
definitions (e.g. the lifecycleConfigurationTable schema) so that they can be
reviewed as well.
> AWS S3 Lifecycle Configurations
> -------------------------------
>
> Key: HDDS-8342
> URL: https://issues.apache.org/jira/browse/HDDS-8342
> Project: Apache Ozone
> Issue Type: New Feature
> Components: OM, S3
> Reporter: Mohanad Elsafty
> Assignee: Mohanad Elsafty
> Priority: Major
> Attachments: image-2023-03-31-12-42-46-971.png
>
>
> I had the need for a retention solution in my cluster (delete keys in
> specific paths after some time). The idea was very similar to AWS S3
> Lifecycle configurations (Expiration part).
> [https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-configuration-examples.html]
> I made a design and have already implemented most of it, and would like to
> contribute it back to the Apache Ozone community.
> h2. Here is what is included
> # Users should be able to create/remove/fetch lifecycle configurations for a
> specific S3 bucket.
> # The lifecycle configurations will be executed periodically.
> # Depending on the rules of the lifecycle configuration, there could be
> different actions or even multiple actions.
> # At the moment only expiration is supported (keys get deleted).
> # Lifecycle configurations support all buckets, not only S3 buckets.
>
> h1. Design
> !image-2023-03-31-12-42-46-971.png!
>
> h2. Components
> # A lifecycle configuration (stored in the DB) consists of volumeName,
> bucketName and a list of rules.
> ** A rule contains a prefix (string), an Expiration and an optional Filter.
> ** Expiration contains either days (integer) or Date (long).
> ** Filter contains a prefix (string).
> # The S3G bucket endpoint needs a few updates to accept the ?lifecycle
> query parameter.
> # ClientProtocol and all its implementers provide get, list, delete and
> create operations for lifecycle configurations.
> # RetentionManager will run periodically.
> ** Fetches the list of lifecycle configurations with the help of OM.
> ** Executes each lifecycle configuration on its specific bucket.
> ** Lifecycle configurations will run in parallel (each one against a
> different bucket).
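As a rough illustration of the data model listed above, here is a minimal Java sketch. All class and method names are hypothetical and are not taken from the Ozone code base. It also makes three assumptions the description does not state: Date is epoch milliseconds, the Filter prefix takes precedence over the rule prefix when both are present, and "days" is counted from the key's creation time without rounding.

```java
import java.time.Duration;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of the lifecycle data model; not Ozone's actual classes. */
final class LifecycleSketch {

  /** Holds either a number of days or an absolute date (assumed epoch millis). */
  static final class Expiration {
    final Integer days;     // null when an absolute date is used
    final Long dateMillis;  // null when days are used

    private Expiration(Integer days, Long dateMillis) {
      this.days = days;
      this.dateMillis = dateMillis;
    }

    static Expiration afterDays(int days) {
      return new Expiration(days, null);
    }

    static Expiration onDate(long dateMillis) {
      return new Expiration(null, dateMillis);
    }

    boolean isExpired(long keyCreationMillis, long nowMillis) {
      if (days != null) {
        // Assumption: age is measured from creation time, with no rounding.
        return nowMillis >= keyCreationMillis + Duration.ofDays(days).toMillis();
      }
      return nowMillis >= dateMillis;
    }
  }

  static final class Filter {
    final String prefix;

    Filter(String prefix) {
      this.prefix = prefix;
    }
  }

  static final class Rule {
    final String prefix;  // non-null; "" matches every key
    final Expiration expiration;
    final Optional<Filter> filter;

    Rule(String prefix, Expiration expiration, Optional<Filter> filter) {
      this.prefix = prefix;
      this.expiration = expiration;
      this.filter = filter;
    }

    boolean matches(String keyName) {
      // Assumption: the filter prefix wins over the rule prefix when present.
      String effectivePrefix = filter.map(f -> f.prefix).orElse(prefix);
      return keyName.startsWith(effectivePrefix);
    }
  }

  static final class LifecycleConfiguration {
    final String volumeName;
    final String bucketName;
    final List<Rule> rules;

    LifecycleConfiguration(String volumeName, String bucketName, List<Rule> rules) {
      this.volumeName = volumeName;
      this.bucketName = bucketName;
      this.rules = rules;
    }

    /** A key is eligible for deletion if any rule matches it and has expired. */
    boolean shouldExpire(String keyName, long keyCreationMillis, long nowMillis) {
      return rules.stream().anyMatch(r ->
          r.matches(keyName) && r.expiration.isExpired(keyCreationMillis, nowMillis));
    }
  }
}
```

For example, a configuration with one rule `new Rule("logs/", Expiration.afterDays(30), Optional.empty())` would mark `logs/a.txt` as eligible once 30 days have passed since its creation, and would never match `data/a.txt`.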
> h2. Flow
> # Users PUT/GET/DELETE lifecycle configurations via S3Gateway.
> # The lifecycle configuration details are sent to a handler to be
> processed.
> # The lifecycle configurations are saved to/fetched from the DB.
> # RetentionManager runs periodically on the leader OM to execute
> these lifecycle configurations.
> # RetentionManager issues deletions for eligible keys.
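The periodic, leader-only, per-bucket-parallel behaviour described in the flow above could look roughly like the sketch below. This is only an illustration built on `java.util.concurrent`. The class name, the constructor parameters and the string bucket identifiers are all hypothetical, and the leader check, the listing of configured buckets and the per-bucket expiration are passed in as plain functions rather than real OM calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch of a periodic retention runner; not Ozone's actual class. */
final class RetentionManagerSketch implements AutoCloseable {
  private final BooleanSupplier isLeader;                     // stand-in for an OM leader check
  private final Supplier<List<String>> bucketsWithLifecycle;  // stand-in for an OM listing call
  private final Consumer<String> expireBucket;                // applies one configuration to its bucket
  private final ExecutorService workers;
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  RetentionManagerSketch(BooleanSupplier isLeader,
                         Supplier<List<String>> bucketsWithLifecycle,
                         Consumer<String> expireBucket,
                         int parallelism) {
    this.isLeader = isLeader;
    this.bucketsWithLifecycle = bucketsWithLifecycle;
    this.expireBucket = expireBucket;
    this.workers = Executors.newFixedThreadPool(parallelism);
  }

  /** Schedules a pass every {@code period}, starting after the first period. */
  void start(long period, TimeUnit unit) {
    scheduler.scheduleWithFixedDelay(() -> {
      try {
        runOnce();
      } catch (RuntimeException e) {
        // Swallow so that one bad pass does not cancel the schedule.
      }
    }, period, period, unit);
  }

  /** Runs one pass and returns the number of buckets processed successfully. */
  int runOnce() {
    if (!isLeader.getAsBoolean()) {
      return 0;  // followers do nothing
    }
    List<Future<?>> pending = new ArrayList<>();
    for (String bucket : bucketsWithLifecycle.get()) {
      // One task per bucket, so configurations run in parallel.
      pending.add(workers.submit(() -> expireBucket.accept(bucket)));
    }
    int succeeded = 0;
    for (Future<?> f : pending) {
      try {
        f.get();
        succeeded++;
      } catch (ExecutionException e) {
        // A failure in one bucket must not stop the others.
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
      }
    }
    return succeeded;
  }

  @Override
  public void close() {
    scheduler.shutdownNow();
    workers.shutdownNow();
  }
}
```

Keeping `runOnce()` separate from the scheduling makes a single pass easy to exercise in a unit test without waiting for the timer.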
>
> h2. Not a complete solution
> The solution lacks some interesting features, for example:
> * The filter doesn't support `AND` yet.
> * Only expiration is supported.
> * A CLI to manage lifecycle configurations for all the buckets (at the
> moment S3G is the only supported entry point).
> But these kinds of features can be added in the future.
>
>
> *I made some decisions that must be discussed before contributing (Current
> design)*
> Lifecycle configurations will be stored in their own column family in the DB
> instead of being a field in {*}OmBucketInfo{*}.
> I preferred the lifecycle configuration to have its own table for two reasons:
> # No need to modify the OmBucketInfo table.
> # Given the way the RetentionManager works, it will query only the
> buckets that have an attached lifecycle configuration. If the lifecycle were
> a field in OmBucketInfo, it would have to query all the buckets and filter
> the ones that have a LifecycleConfiguration.
> If the other way is preferred, then I will get rid of
> LifecycleConfigurationsManager & the new codec.
>
> To summarize this:
>
> ||A new table for lifecycle configurations||A new field in OmBucketInfo||
> |A new table|Existing table|
> |Efficient query|Less efficient|
> |A new manager (lifecycle manager)|No need|
> |A new codec |No need|
> |No need to alter existing design|Need to update the existing design|
> |Need to update Bucket Deletion. Delete the linked lifecycle configuration when the bucket is deleted.|No need for updates|
> | |Needs updates to create, get, list and delete lifecycle configuration in the BucketManager.|
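To make the trade-off in the table above concrete, here is a small in-memory sketch of the "separate table" option. The maps stand in for DB column families, and the `/volume/bucket` key format and all names are assumptions for illustration only. It shows the two consequences listed in the table: listing touches only buckets that have a configuration, and bucket deletion has to remove the linked entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of a dedicated lifecycle table; not Ozone code. */
final class LifecycleTableSketch {
  // Stand-ins for two DB column families.
  private final Map<String, String> bucketTable = new HashMap<>();
  private final TreeMap<String, String> lifecycleTable = new TreeMap<>();

  /** Assumed key format shared by both tables. */
  static String key(String volume, String bucket) {
    return "/" + volume + "/" + bucket;
  }

  void createBucket(String volume, String bucket) {
    bucketTable.put(key(volume, bucket), "bucket-info");
  }

  void putLifecycle(String volume, String bucket, String configuration) {
    String k = key(volume, bucket);
    if (!bucketTable.containsKey(k)) {
      throw new IllegalArgumentException("No such bucket: " + k);
    }
    lifecycleTable.put(k, configuration);
  }

  /** Iterates only the lifecycle table, so unconfigured buckets are never read. */
  List<String> bucketsWithLifecycle() {
    return new ArrayList<>(lifecycleTable.keySet());
  }

  /** Bucket deletion must also drop the linked lifecycle configuration. */
  void deleteBucket(String volume, String bucket) {
    String k = key(volume, bucket);
    bucketTable.remove(k);
    lifecycleTable.remove(k);
  }
}
```

With the alternative design (a field in OmBucketInfo), `bucketsWithLifecycle()` would instead have to scan every entry of the bucket table and filter, but `deleteBucket` would need no extra step.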
>
>
> h2. Plan for contribution
> The implementation is too large to review in one piece. I believe it needs to
> be split into a few merge requests for better review. Here is my suggested
> breakdown.
> # Basic building blocks (lifecycle configuration, rule, expiration, ...) and
> the related table (if needed).
> # New ClientProtocol & OzoneManager operations (create, get, list, delete)
> for lifecycle configurations (protobuf messages as well).
> # S3G endpoint updates.
> # The retention manager.
> # All of them to be merged into a new branch (let's call it X).
> # Then merge branch X into master.
>
> Please feel free to review the design and ask for more clarifications if
> needed.
>