[
https://issues.apache.org/jira/browse/HDDS-13806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18031359#comment-18031359
]
Ivan Andika edited comment on HDDS-13806 at 10/21/25 8:19 AM:
--------------------------------------------------------------
> I feel we need to somehow get the client FS to move to the correct place for
> ozone if the underlying store is Ozone
Yes, this should be partially handled by HADOOP-19217 for all paths that use the
ofs:// scheme. However, the main issue is with S3A, since we currently don't
have a way to determine which trash root implementation the underlying
S3-compatible storage uses (e.g. AWS S3 or Ozone). We might be able to achieve
this with some kind of custom header ("x-hadoop-") sent from S3AFileSystem to
derive the underlying file system, but this requires some Hadoop FileSystem
contract changes.
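To make the scheme distinction concrete, here is a minimal Python sketch. It is
not Hadoop's API: trash_layout_for and its return values are hypothetical, and
it assumes the ofs:// behaviour is as described for HADOOP-19217. It only shows
that the scheme identifies the layout for ofs:// and hdfs://, while for s3a://
the client falls back to the HDFS-style default without knowing which store is
behind the endpoint.

```python
from typing import Tuple
from urllib.parse import urlparse


def trash_layout_for(uri: str) -> Tuple[str, bool]:
    """Return (layout, backend_known) for a path URI.

    Hypothetical helper for illustration only; not part of Hadoop or Ozone.
    """
    scheme = urlparse(uri).scheme
    if scheme == "ofs":
        # Assumption: the client resolves the Ozone trash root for ofs:// paths.
        return ("ozone", True)
    if scheme == "hdfs":
        return ("hdfs", True)
    if scheme == "s3a":
        # The store could be AWS S3, Ozone, or anything S3-compatible, so the
        # client uses the HDFS-style default without knowing the backend.
        return ("hdfs", False)
    raise ValueError(f"unsupported scheme: {scheme!r}")
```

The second element of the tuple is the gap discussed here: for s3a:// nothing in
the URI tells the client which trash layout the server expects.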
> Solving this at the server by adding numerous paths that are implemented by
> various FS paths is not a good design IMO. This can lead to bugs which can
> wrongfully delete a key leading to data loss.
IMO this seems to be the underlying design of the Hadoop trash, which allows a
client-side TrashPolicy to be inconsistent with the server side. Strictly
speaking, if we want the trash root to be consistent, the client needs to get
it from the HDFS or OM server before moving a file to the trash root.
However, this might cause unnecessary communication overhead, and many
S3-compatible stores won't implement it.
> TrashOzoneFileSystem#getTrashRoots should support Ozone, S3A (HDFS trash),
> and currently HDFS trash before HADOOP-19217.
This seems to be the simplest solution right now, but might not be a good
long-term solution.
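A rough Python sketch of what "supporting several trash roots" could mean on
the server side. This is not the TrashOzoneFileSystem code: the function names
are hypothetical, and it assumes the HDFS-style root created through s3a:// is
resolved relative to the bucket. The values "s3v", "b1" and "alice" in the usage
note are example values only.

```python
from typing import List, Optional


def candidate_trash_roots(volume: str, bucket: str, user: str) -> List[str]:
    """List the trash roots a cleanup thread might need to scan for one user."""
    bucket_root = f"/{volume}/{bucket}"
    return [
        # Ozone layout, as described in the issue.
        f"{bucket_root}/.Trash/{user}",
        # HDFS-style layout; ASSUMPTION: it sits under the bucket root.
        f"{bucket_root}/user/{user}/.Trash",
    ]


def matching_trash_root(path: str, roots: List[str]) -> Optional[str]:
    """Return the root that contains path, or None if it is not trash."""
    for root in roots:
        # Compare on a path-component boundary so that a sibling such as
        # ".Trash/alice2" is never treated as being inside ".Trash/alice".
        if path == root or path.startswith(root + "/"):
            return root
    return None
```

For example, matching_trash_root("/s3v/b1/.Trash/alice2/Current/k",
candidate_trash_roots("s3v", "b1", "alice")) returns None. The strict boundary
check matters more as roots are added, since a loose prefix match is exactly
the kind of bug that could delete a live key.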
> Files are moved to different trash root when deleted through s3a://
> -------------------------------------------------------------------
>
> Key: HDDS-13806
> URL: https://issues.apache.org/jira/browse/HDDS-13806
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Sammi Chen
> Priority: Major
>
> When trash is enabled and a user deletes an Ozone file through the s3a://
> scheme, the file is moved to the HDFS default trash root instead of the Ozone
> default trash root. Since the trash deletion thread in Ozone only checks the
> Ozone default trash root, files deleted through s3a:// will never get a
> chance to be deleted.
> HDFS trash behavior:
> * For unencrypted files, the HDFS trash root is typically located in the
> user's home directory under /user/<username>/.Trash, where deleted files are
> moved to /user/<username>/.Trash/Current/OriginalPath.
> * For encrypted files, the trash root is within the encryption zone's root
> directory at /EncryptionZoneRoot/.Trash, and files are moved to
> /EncryptionZoneRoot/.Trash/$USER/Current/OriginalPath. The trash can be
> accessed using hdfs dfs -ls /user/<username>/.Trash or by using a path
> prefixed with hdfs://
> Ozone trash behavior:
> In Apache Ozone, the default trash location for keys in a File System
> Optimized (FSO) bucket is within the bucket itself. The specific path
> is /<volume>/<bucket>/.Trash/<user>, where deleted files are moved to
> /<volume>/<bucket>/.Trash/<user>/Current/OriginalPath.
> The problem was found by [~chenxi]
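
The three layouts quoted in the description can be written out as a short
Python sketch. The helper names are hypothetical, and the paths simply follow
the patterns given in the issue text rather than any Hadoop or Ozone source.

```python
def _under(root: str, original_path: str) -> str:
    """Append original_path beneath root, avoiding a doubled slash."""
    return root.rstrip("/") + "/" + original_path.lstrip("/")


def hdfs_trash_path(user: str, original_path: str) -> str:
    # Unencrypted HDFS files: /user/<username>/.Trash/Current/OriginalPath
    return _under(f"/user/{user}/.Trash/Current", original_path)


def hdfs_encrypted_trash_path(zone_root: str, user: str, original_path: str) -> str:
    # Encryption zone: /EncryptionZoneRoot/.Trash/$USER/Current/OriginalPath
    return _under(f"{zone_root.rstrip('/')}/.Trash/{user}/Current", original_path)


def ozone_trash_path(volume: str, bucket: str, user: str, original_path: str) -> str:
    # Ozone FSO bucket: /<volume>/<bucket>/.Trash/<user>/Current/OriginalPath
    return _under(f"/{volume}/{bucket}/.Trash/{user}/Current", original_path)
```

Comparing hdfs_trash_path("alice", "/d/k") with
ozone_trash_path("vol1", "b1", "alice", "/d/k") shows the mismatch: the first
puts the user directory above .Trash, the second puts it below, so a cleanup
thread that only scans the Ozone root never sees the HDFS-style one.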