[
https://issues.apache.org/jira/browse/HADOOP-18798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17742348#comment-17742348
]
Steve Loughran commented on HADOOP-18798:
-----------------------------------------
Looking through the code, I don't see it going anywhere near the FsShell; it uses
FileSystem.delete() instead.
Unless someone wants to somehow add trash support *everywhere*, this doc has to
be considered out of date.
Now, HDFS-9913 does seem to cover that, but its merge bounced; I don't remember
why. If someone wants to revisit that, we should.
In the meantime, we should fix the docs.
> hadoop distcp -delete sends deleted data to null instead of trash
> -----------------------------------------------------------------
>
> Key: HADOOP-18798
> URL: https://issues.apache.org/jira/browse/HADOOP-18798
> Project: Hadoop Common
> Issue Type: Bug
> Components: documentation
> Reporter: Ryan Blough
> Priority: Major
>
> In the docs the -delete option is specified as moving data to Trash when it
> is enabled:
> [https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html#:~:text=%2Ddelete,or%20overwrite%20options].
> However, it does not go to trash, it goes to null. I know of two instances
> where this misunderstanding has caused data loss.
> The statement that the data goes to Trash should be removed, and it should be
> specified that the data is deleted.
> An earlier reproduction:
> hdfs dfs -mkdir -p /tmp/test1/test2
> hdfs dfs -put /tmp/test.img /tmp/
> hdfs dfs -put /tmp/test.img /tmp/test2/file1
>
> drwxr-xr-x - root supergroup 0 2023-04-17 19:07 /tmp/test1
> drwxr-xr-x - hdfs supergroup 0 2023-04-17 19:06 /tmp/test1/test2
> -rw-r--r-- 3 hdfs supergroup 1073741824 2023-04-17 19:06 /tmp/test1/test2/file1
>
> distcp -update -delete /tmp/test.img /tmp/test1
>
> -rw-r--r-- 3 root supergroup 1073741824 2023-04-17 18:52 /tmp/test.img
> drwxr-xr-x - root supergroup 0 2023-04-17 19:03 /tmp/test1
> -rw-r--r-- 3 hdfs supergroup 1073741824 2023-04-17 19:03 /tmp/test1/test.img
>
> 2023-04-17 19:08:44,252 INFO FSNamesystem.audit: allowed=true ugi=hdfs
> (auth:SIMPLE) ip=/172.25.41.195 cmd=delete src=/tmp/test1/test2
> dst=null perm=null proto
>
> [hdfs@c4401-node2 root]$ date
> Mon Apr 17 19:11:22 UTC 2023
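
The audit entry in the reproduction above is what shows the data was removed outright: it records cmd=delete with dst=null. A rough sketch of how such lines could be classified follows. It assumes that a move to trash would instead show up as cmd=rename with a destination under a .Trash directory; that assumption, the function name classify_audit_entry, and the label strings are illustrative and not taken from the issue.

```python
import re

# key=value fields in an HDFS audit line; each value runs to the next whitespace.
_FIELD = re.compile(r"(\w+)=(\S+)")


def classify_audit_entry(line: str) -> str:
    """Return 'permanent-delete', 'moved-to-trash', or 'other' for one audit line."""
    fields = dict(_FIELD.findall(line))
    cmd = fields.get("cmd")
    dst = fields.get("dst", "null")
    if cmd == "delete":
        # A delete has no destination: the data is gone, not relocated.
        return "permanent-delete"
    if cmd == "rename" and "/.Trash/" in dst:
        # Assumption: trash moves are logged as renames into a .Trash directory.
        return "moved-to-trash"
    return "other"


# The audit line from the reproduction, joined onto one line.
reported = (
    "2023-04-17 19:08:44,252 INFO FSNamesystem.audit: allowed=true ugi=hdfs "
    "(auth:SIMPLE) ip=/172.25.41.195 cmd=delete src=/tmp/test1/test2 "
    "dst=null perm=null proto"
)
print(classify_audit_entry(reported))
```

Running a check like this over the NameNode audit log after a distcp -update -delete run would indicate whether anything was removed without passing through trash.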