[
https://issues.apache.org/jira/browse/HDDS-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17841071#comment-17841071
]
Ivan Andika edited comment on HDDS-8342 at 4/26/24 7:30 AM:
------------------------------------------------------------
Thank you for the review [~ritesh]
> We do not need to make retention specific to S3, objects ingested via Hadoop
>APIs should inherit the feature.
Yes, this feature deletes all expired objects, not only those ingested
through S3. However, bucket lifecycle configurations can currently only be
created using the S3 API. We can introduce Ozone CLI commands in the future.
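For reference, the S3 API accepts a lifecycle configuration as an XML body in a PutBucketLifecycleConfiguration request; a minimal expiration-only rule (prefix and day count here are made-up illustrative values) looks like this:

```xml
<LifecycleConfiguration>
  <Rule>
    <ID>expire-old-logs</ID>
    <Filter>
      <Prefix>logs/</Prefix>
    </Filter>
    <Status>Enabled</Status>
    <Expiration>
      <Days>30</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>
```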
> The actual loop to scan objects for retention will need more detailed design.
> OM scales to billions and making sure the implementation of scanner is
> efficient will be an important aspect.
Agreed. From my understanding, the design implemented in our cluster uses an OM
background service: a single thread that periodically scans through all the
OmLifecycleConfiguration entries. For each OmLifecycleConfiguration (usually one
per bucket), it submits a task to a ThreadPoolExecutor (with a configurable
pool size) which scans the OM keyTable for keys under the bucket that match the
OmLifecycleConfiguration and deletes them if needed.
[~mohanad] Please correct me if I'm wrong. Also, could you help update the
design documents with implementation details on RetentionManager and
RetentionService?
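As a rough, self-contained sketch of the loop described above (all class and method names besides OmLifecycleConfiguration and keyTable are hypothetical stand-ins, not the actual implementation), with one task per bucket submitted to a fixed-size pool:

```java
import java.util.*;
import java.util.concurrent.*;

public class RetentionServiceSketch {
    // Minimal stand-ins for the OM types; fields are illustrative only.
    record LifecycleRule(String prefix, long expirationMillis) {}
    record OmLifecycleConfiguration(String volume, String bucket, List<LifecycleRule> rules) {}

    // Scan one bucket's keyTable (key name -> creation time) and collect keys
    // matching any rule whose expiration window has elapsed.
    static List<String> scanBucket(Map<String, Long> keyTable,
                                   OmLifecycleConfiguration conf, long now) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : keyTable.entrySet()) {
            for (LifecycleRule rule : conf.rules()) {
                if (e.getKey().startsWith(rule.prefix())
                        && now - e.getValue() >= rule.expirationMillis()) {
                    expired.add(e.getKey());
                    break;
                }
            }
        }
        return expired;
    }

    public static void main(String[] args) throws Exception {
        // Configurable pool size, as in the description.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Map<String, Long> keyTable = Map.of(
            "logs/a.log", 0L,       // old key under "logs/"
            "logs/b.log", 9_000L,   // recent key, not yet expired
            "data/c.txt", 0L);      // old, but no matching rule
        OmLifecycleConfiguration conf = new OmLifecycleConfiguration("vol1", "bucket1",
            List.of(new LifecycleRule("logs/", 5_000L)));
        // The background thread would submit one such task per configuration.
        Future<List<String>> task = pool.submit(() -> scanBucket(keyTable, conf, 10_000L));
        System.out.println("expired: " + task.get()); // prints expired: [logs/a.log]
        pool.shutdown();
    }
}
```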
> It is also possible to defer the actual walk of the objects in Recon and have
> recon invoke the OM API to revalidate the configuration for an object for a
> bucket. So this way Recon can walk its copy of OM's data and even if it is
> stale the final validation will happen in OM. Just a thought on the top of my
> mind.
Thanks for the suggestions. I think this is a valid approach, since we don't
need the most up-to-date OM DB state for retention. However, due to HDDS-8271,
we currently have Recon's OM updates disabled because of OM stability concerns.
There may also be users that do not run Recon, since Recon is not required
for a functional Ozone cluster, and there are concerns that Recon sending OM
requests might increase the RPC load on the OM. Therefore, we currently
implement the scanning as a background service in the OM. However, we can
revisit this approach if needed. (cc: [~mohanad] [~XiChen] )
By the way, [~mohanad] just raised a design PR at
[https://github.com/apache/ozone/pull/6589]; we can continue our discussions
there.
> AWS S3 Lifecycle Configurations
> -------------------------------
>
> Key: HDDS-8342
> URL: https://issues.apache.org/jira/browse/HDDS-8342
> Project: Apache Ozone
> Issue Type: New Feature
> Components: OM, S3
> Reporter: Mohanad Elsafty
> Assignee: Mohanad Elsafty
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2023-03-31-12-42-46-971.png
>
>
> I had the need for a retention solution in my cluster (delete keys in
> specific paths after some time). The idea was very similar to AWS S3
> Lifecycle configurations (Expiration part).
> [https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-configuration-examples.html]
> I made a design and have already implemented most of it, and would like to
> contribute it back to the Apache Ozone community.
> h2. Here is what's included
> # Users should be able to create/remove/fetch lifecycle configurations for a
> specific S3 bucket.
> # The lifecycle configurations will be executed periodically.
> # Depending on the rules of the lifecycle configuration, there could be
> different actions or even multiple actions.
> # At the moment only expiration is supported (keys get deleted).
> # Lifecycle configurations support all buckets, not only S3 buckets.
>
> h1. Design
> !image-2023-03-31-12-42-46-971.png!
>
> h2. Components
> # Lifecycle configurations (will be stored in the DB) consist of volumeName,
> bucketName and a list of rules.
> ** A rule contains a prefix (string), an Expiration and an optional Filter.
> ** Expiration contains either days (integer) or a Date (long).
> ** Filter contains a prefix (string).
> # The S3G bucket endpoint needs a few updates to accept ?lifecycle.
> # ClientProtocol and all implementers provide (get, list, delete and
> create) operations for lifecycle configurations.
> # The RetentionManager will be running periodically.
> ** Fetches the list of lifecycle configurations with the help of the OM.
> ** Executes each lifecycle configuration against a specific bucket.
> ** Lifecycle configurations will be run in parallel (each one against a
> different bucket).
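The rule model described above, including the either-days-or-date Expiration check, could be sketched roughly as follows (all names are illustrative stand-ins, not the actual implementation):

```java
import java.util.List;
import java.util.concurrent.TimeUnit;

public class LifecycleModelSketch {
    // Expiration holds either a day count or an absolute date, as described above.
    record Expiration(Integer days, Long dateMillis) {
        boolean isExpired(long keyCreationMillis, long nowMillis) {
            if (days != null) {
                return nowMillis - keyCreationMillis >= TimeUnit.DAYS.toMillis(days);
            }
            return dateMillis != null && nowMillis >= dateMillis;
        }
    }

    // A rule carries a prefix, an Expiration, and an optional Filter prefix.
    record Rule(String prefix, Expiration expiration, String filterPrefix) {
        boolean matches(String key) {
            return key.startsWith(prefix)
                && (filterPrefix == null || key.startsWith(filterPrefix));
        }
    }

    // One configuration per bucket, holding a list of rules.
    record LifecycleConfiguration(String volumeName, String bucketName, List<Rule> rules) {}

    public static void main(String[] args) {
        Rule rule = new Rule("logs/", new Expiration(30, null), null);
        long now = TimeUnit.DAYS.toMillis(31); // 31 days after key creation at t=0
        System.out.println(rule.matches("logs/a.log")
            && rule.expiration().isExpired(0L, now)); // prints true
    }
}
```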
> h2. Flow
> # Users PUT/GET/DELETE lifecycle configurations via S3Gateway.
> # The lifecycle configurations details will be sent to some handler to be
> processed.
> # The lifecycle configurations will be saved to/fetched from the DB.
> # RetentionManager will be running periodically in the Leader OM to execute
> these lifecycle configurations.
> # RetentionManager will be issuing deletions for eligible keys.
>
> h2. Not a complete solution
> The solution lacks some interesting features, for example:
> * The filter doesn't support `AND` yet.
> * Only expiration is supported.
> * A CLI to manage lifecycle configurations for all the buckets (at the
> moment, S3G is the only supported entry point).
> But these kinds of features can be added in the future.
>
>
> *I made some decisions that must be discussed before contributing (Current
> design)*
> Lifecycle configurations will be stored in their own column family in the DB
> instead of being a field in the {*}OmBucketInfo{*}.
> I preferred the lifecycle configuration to have its own table for two reasons:
> # No need to modify the OmBucketInfo table.
> # Because of the way the RetentionManager works, it will query only the
> buckets that have an attached lifecycle configuration. If the lifecycle
> configuration were a field in OmBucketInfo, it would have to query all the
> buckets and filter the ones that have a LifecycleConfiguration.
> If the other way is preferred, then I will get rid of the
> LifecycleConfigurationsManager & the new codec.
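The query-efficiency argument can be illustrated with a toy comparison (entirely hypothetical data structures, not the actual OM tables): a dedicated table lets the RetentionManager iterate only configured buckets, while an embedded field forces a scan of every bucket entry.

```java
import java.util.*;
import java.util.stream.*;

public class TableChoiceSketch {
    // Embedded-field option: scan all buckets and keep only those with a config.
    static List<String> bucketsWithConfig(Map<String, Optional<String>> bucketTable) {
        return bucketTable.entrySet().stream()
            .filter(e -> e.getValue().isPresent())
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Dedicated lifecycle table: already contains only configured buckets.
        Map<String, String> lifecycleTable = Map.of("bucket1", "expire-30d");

        // Embedded field: the config is an optional field on every bucket entry.
        Map<String, Optional<String>> bucketTable = new TreeMap<>();
        bucketTable.put("bucket1", Optional.of("expire-30d"));
        bucketTable.put("bucket2", Optional.empty());
        bucketTable.put("bucket3", Optional.empty());

        System.out.println("dedicated table entries scanned: " + lifecycleTable.size());
        System.out.println("embedded field entries scanned: " + bucketTable.size()
            + ", matched: " + bucketsWithConfig(bucketTable).size());
    }
}
```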
>
> To summarize this:
>
> ||A new table for lifecycle configurations||A new field in OmBucketInfo||
> |A new table|Existing table|
> |Efficient query|Less efficient|
> |A new manager (lifecycle manager)|No need|
> |A new codec|No need|
> |No need to alter existing design|Need to update the existing design|
> |Need to update bucket deletion: delete the linked lifecycle configuration when the bucket is deleted.|No need for updates|
> | |Needs updates to create, get, list and delete lifecycle configurations in the BucketManager.|
>
>
> h2. Plan for contribution
> The implementation is too large for a single review. I believe it needs to be
> split into a few merge requests for easier review. Here is my suggested
> breakdown.
> # Basic building blocks (lifecycle configuration, rule, expiration, ...) and
> the related table (if needed).
> # New ClientProtocol & OzoneManager operations (create, get, list, delete) for
> lifecycle configurations (protobuf messages as well).
> # S3G endpoint updates.
> # The retention manager.
> # All of them to be merged into a new branch (let's call it X).
> # Then merge branch X into master.
>
> Please feel free to review the design and ask for more clarifications if
> needed.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)