[
https://issues.apache.org/jira/browse/HDDS-11256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870046#comment-17870046
]
Ethan Rose commented on HDDS-11256:
-----------------------------------
{quote}However, this only protects large scale deletion through Hadoop client
(e.g. "hadoop fs -rm -r"). However, other deletions using -skipTrash or through
S3 are not protected by this trash.
{quote}
Recovering from deletes in S3 is an outstanding issue. Snapshots are not a
complete answer because they may not capture the most recent set of changes
that were just deleted. The description sort of implies this already, but our
implementation of delete recovery would likely be tied to the protocol being
used. That means ofs -> trash, and S3 -> key versioning. IMO implementing key
versioning is the solution to this problem.
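As a rough illustration of why key versioning covers the S3 case, here is a toy Python model. It assumes versioning behaves like S3 delete markers, where a delete appends a marker instead of removing data. The class and method names (VersionedBucket, undelete) are hypothetical and are not Ozone or AWS APIs.

```python
class VersionedBucket:
    """Toy model of S3-style key versioning; not Ozone or AWS code."""

    DELETE_MARKER = object()  # sentinel standing in for a delete marker

    def __init__(self):
        self.versions = {}  # key -> list of versions, oldest first

    def put(self, key, data):
        self.versions.setdefault(key, []).append(data)

    def delete(self, key):
        # A delete only appends a marker; earlier versions stay in place.
        self.versions.setdefault(key, []).append(self.DELETE_MARKER)

    def get(self, key):
        history = self.versions.get(key, [])
        if not history or history[-1] is self.DELETE_MARKER:
            return None  # the key looks deleted to a normal reader
        return history[-1]

    def undelete(self, key):
        # Recovery is just dropping the trailing delete marker.
        history = self.versions.get(key, [])
        if history and history[-1] is self.DELETE_MARKER:
            history.pop()
            return True
        return False


bucket = VersionedBucket()
bucket.put("logs/app.log", "v1")
bucket.delete("logs/app.log")      # get() now returns None
bucket.undelete("logs/app.log")    # get() returns "v1" again
```

Because the most recent version is kept at delete time, this does not have the snapshot gap described above, where the latest changes may not have been captured.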
> OM Key Trash Feature
> --------------------
>
> Key: HDDS-11256
> URL: https://issues.apache.org/jira/browse/HDDS-11256
> Project: Apache Ozone
> Issue Type: New Feature
> Components: OM
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major
>
> Context: Currently Ozone supports a Trash feature with an implementation
> similar to the HDFS trash feature. However, this only protects large scale deletion
> through Hadoop client (e.g. "hadoop fs -rm -r"). However, other deletions
> using -skipTrash or through S3 are not protected by this trash.
> We are currently working on the implementation in our internal cluster. We
> started with the idea of a DN block trash (similar to HDFS-12996), but realized
> that this requires changes in all Ozone components and the complexity will be
> very high. The final design implements an OM-based solution that resembles
> HDDS-2416 (Ozone Trash Feature):
> * There is a separate "Trash Table" that holds the deleted keys
> * There will be a background service that checks the "Trash Table" for
> deleted keys older than a certain expiry threshold and moves them to the
> deletedTable for normal deletion (a trash cleanup service)
> * In the event of large accidental deletions, an admin can issue a "recover"
> request, which queries the trash tables and restores the entries to their
> original keys
> However, there are also some planned implementation differences that can be
> covered in a design document, which include things like:
> * Another table which stores the modificationTime as the RDB key for faster
> DB traversal
> * A recovery request starts a stateful background service on OM (similar to
> ContainerBalancer) to handle a large number of keys
> * Enabling the trash at runtime using a new OM request which uses a Ratis
> transaction to ensure consistency of the OM DB. This will set a flag in the
> OM metaTable that is used by OM to decide whether to use normal OM deletion
> or a "trash" deletion
> * Runtime reconfigurability of trash cleanup service parameters
> If the community members are interested and see the need for this feature, I
> can come up with a more complete design document.
> As I understand it, Ozone already supports snapshots, which can protect
> against accidental deletion, so there might be a lot of overlap. Due to
> this overlap, it might not make sense to have this trash feature.
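To make the proposed flow easier to follow, here is a minimal Python sketch of the tables described in the issue. It is an in-memory toy, not Ozone code. The class ToyOm, its field names, and the choice to set modificationTime at delete time are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyOm:
    """Toy in-memory stand-in for the OM tables; not Ozone code."""
    trash_enabled: bool = True                         # stands in for the metaTable flag
    key_table: dict = field(default_factory=dict)      # key name -> value
    trash_table: dict = field(default_factory=dict)    # key name -> (value, mod_time)
    expiry_index: dict = field(default_factory=dict)   # (mod_time, key name) -> None
    deleted: list = field(default_factory=list)        # (key name, value) awaiting purge

    def delete(self, name, now):
        value = self.key_table.pop(name)
        if not self.trash_enabled:
            self.deleted.append((name, value))         # normal deletion path
            return
        if name in self.trash_table:
            # Assumption: one trash entry per name; an older copy is purged.
            old_value, old_time = self.trash_table.pop(name)
            del self.expiry_index[(old_time, name)]
            self.deleted.append((name, old_value))
        self.trash_table[name] = (value, now)
        self.expiry_index[(now, name)] = None

    def cleanup(self, now, expiry):
        # Walk the time-ordered index and stop at the first unexpired entry,
        # so the whole trash table never has to be scanned.
        for mod_time, name in sorted(self.expiry_index):
            if now - mod_time < expiry:
                break
            value, _ = self.trash_table.pop(name)
            del self.expiry_index[(mod_time, name)]
            self.deleted.append((name, value))

    def recover(self, prefix):
        # Restore trashed keys under a prefix unless the name is in use again.
        restored = []
        for name in sorted(self.trash_table):
            if name.startswith(prefix) and name not in self.key_table:
                value, mod_time = self.trash_table.pop(name)
                del self.expiry_index[(mod_time, name)]
                self.key_table[name] = value
                restored.append(name)
        return restored


om = ToyOm(key_table={"vol/b/a": "A", "vol/b/b": "B"})
om.delete("vol/b/a", now=100)
om.delete("vol/b/b", now=200)
om.cleanup(now=250, expiry=100)   # "vol/b/a" has expired, "vol/b/b" has not
restored = om.recover("vol/b/")   # only "vol/b/b" is still recoverable
```

In the real proposal the recovery would run as a stateful background service and the flag change would go through a Ratis transaction; neither is modelled here.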