[ 
https://issues.apache.org/jira/browse/HADOOP-18798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17742348#comment-17742348
 ] 

Steve Loughran commented on HADOOP-18798:
-----------------------------------------

Looking through the code, I don't see it going anywhere near the FsShell; it 
uses FileSystem.delete() instead.

Unless someone wants to somehow add trash support *everywhere*, this doc has 
to be considered out of date.

Now, HDFS-9913 does seem to cover that, but its merge bounced; I don't 
remember why. If someone wants to revisit that, we should.

In the meantime, we should fix the docs.
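
For anyone wanting to check their own NameNode audit logs for this, here is a 
rough sketch, not anything shipped with Hadoop. It relies on the audit line 
quoted in the issue below (cmd=delete with dst=null) and on the assumption 
that a trash move shows up as a cmd=rename whose dst sits under a .Trash 
directory. The function name classify_removal and the sample line are 
illustrative only.

```python
import re

# Matches key=value pairs in an HDFS audit line. Assumption: values contain no
# whitespace, which holds for the cmd/src/dst fields in the quoted audit line.
_FIELD = re.compile(r"(\w+)=(\S+)")


def classify_removal(audit_line: str) -> str:
    """Return 'permanent', 'trash', or 'other' for one audit log line."""
    fields = dict(_FIELD.findall(audit_line))
    cmd = fields.get("cmd")
    dst = fields.get("dst", "null")
    if cmd == "delete":
        # A delete call, as in the quoted log: nothing is kept anywhere.
        return "permanent"
    if cmd == "rename" and "/.Trash/" in dst:
        # Assumed shape of a trash move: a rename into a .Trash directory.
        return "trash"
    return "other"


# Fields taken from the audit line quoted in the issue, whitespace collapsed.
quoted = ("2023-04-17 19:08:44,252 INFO FSNamesystem.audit: allowed=true "
          "ugi=hdfs (auth:SIMPLE) ip=/172.25.41.195 cmd=delete "
          "src=/tmp/test1/test2 dst=null perm=null")
print(classify_removal(quoted))  # permanent
```

Any line this classifies as 'permanent' for a path you expected to find in 
trash is worth a closer look.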

> hadoop distcp -delete sends deleted data to null instead of trash
> -----------------------------------------------------------------
>
>                 Key: HADOOP-18798
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18798
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: documentation
>            Reporter: Ryan Blough
>            Priority: Major
>
> In the docs the -delete option is specified as moving data to Trash when it 
> is enabled:
> [https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html#:~:text=%2Ddelete,or%20overwrite%20options].
> However, it does not go to trash, it goes to null. I know of two instances 
> where this misunderstanding has caused data loss.
> The statement that the data goes to Trash should be removed, and it should be 
> specified that the data is deleted.
> An earlier reproduction:
> hdfs dfs -mkdir -p /tmp/test1/test2
> hdfs dfs -put /tmp/test.img /tmp/
> hdfs dfs -put /tmp/test.img /tmp/test2/file1
>  
> drwxr-xr-x   - root   supergroup          0 2023-04-17 19:07 /tmp/test1
> drwxr-xr-x   - hdfs   supergroup          0 2023-04-17 19:06 /tmp/test1/test2
> -rw-r--r--   3 hdfs   supergroup 1073741824 2023-04-17 19:06 /tmp/test1/test2/file1
>  
> distcp -update -delete /tmp/test.img /tmp/test1
>  
> -rw-r--r--   3 root   supergroup 1073741824 2023-04-17 18:52 /tmp/test.img
> drwxr-xr-x   - root   supergroup          0 2023-04-17 19:03 /tmp/test1
> -rw-r--r--   3 hdfs   supergroup 1073741824 2023-04-17 19:03 /tmp/test1/test.img
>  
> 2023-04-17 19:08:44,252 INFO FSNamesystem.audit: allowed=true   ugi=hdfs 
> (auth:SIMPLE)  ip=/172.25.41.195       cmd=delete      src=/tmp/test1/test2   
>  dst=null        perm=null       proto
>  
> [hdfs@c4401-node2 root]$ date
> Mon Apr 17 19:11:22 UTC 2023


