[ https://issues.apache.org/jira/browse/HADOOP-432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469617 ]
Doug Cutting commented on HADOOP-432:
-------------------------------------

> expunge(), as implemented, purges the entire trash

No, it only removes things in folders older than the interval. So, in particular, it never removes the current trash, and won't remove a checkpoint until it's older than the interval.

> results in a large load on the namenode

Expunge lists checkpoints, then removes entire checkpoints with a single call to the namenode. So it could take a long time on the namenode if the checkpoint has lots of files, but it doesn't make a lot of calls to the namenode: all directory enumeration except for the top-level checkpoints is done server-side, at the namenode. So the RPC load on the namenode is minimized.

> It would be nice if the code, when deleting a file, checked if the source
> file is already in the trash and would expunge it

Yes, that would be a useful feature. Calling moveToTrash() on any path that begins with the trash's root should cause it to be immediately removed. +1

> support undelete, snapshots, or other mechanism to recover lost files
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-432
>                 URL: https://issues.apache.org/jira/browse/HADOOP-432
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Yoram Arnon
>         Assigned To: Doug Cutting
>     Attachments: trash.patch, undelete12.patch, undelete16.patch, undelete17.patch
>
>
> currently, once you delete a file it's gone forever.
> most file systems allow some form of recovery of deleted files.
> a simple solution would be an 'undelete' command.
> a more comprehensive solution would include snapshots, manual and automatic, with scheduling options.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
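The two behaviors discussed above can be sketched in miniature. This is a hypothetical illustration only, not Hadoop's actual Trash class: the trash root, the interval value, and the method names (checkpointsToExpunge, isAlreadyInTrash, moveToTrash) are all invented for the example. It shows (a) that expunge only selects checkpoints older than the interval, so the current trash and recent checkpoints survive, each removal being a single delete of a whole checkpoint directory, and (b) the proposed rule that calling moveToTrash() on a path already under the trash root removes it immediately instead of moving it again.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class TrashSketch {
    static final String TRASH_ROOT = "/user/doug/.Trash";  // hypothetical trash root
    static final long INTERVAL_MS = 6L * 60 * 60 * 1000;   // hypothetical 6-hour interval

    // Returns the checkpoints expunge would remove: only those older than the
    // interval. Each entry maps a checkpoint path to its creation timestamp;
    // each selected checkpoint would be removed with one delete call.
    static List<String> checkpointsToExpunge(Map<String, Long> checkpoints, long now) {
        List<String> old = new ArrayList<>();
        for (Map.Entry<String, Long> e : checkpoints.entrySet()) {
            if (now - e.getValue() > INTERVAL_MS) {
                old.add(e.getKey());
            }
        }
        return old;
    }

    // True if 'path' already lies inside the trash directory.
    static boolean isAlreadyInTrash(String path) {
        return path.equals(TRASH_ROOT) || path.startsWith(TRASH_ROOT + "/");
    }

    // Proposed behavior: trashing an already-trashed path deletes it outright.
    // Returns a description of the action for illustration.
    static String moveToTrash(String path) {
        if (isAlreadyInTrash(path)) {
            return "delete " + path;  // permanent, immediate removal
        }
        return "rename " + path + " -> " + TRASH_ROOT + "/Current" + path;
    }
}
```

Under this sketch, a checkpoint created three hours ago is kept while one created nine hours ago is selected for removal, and a second moveToTrash() on a file already under /user/doug/.Trash deletes it rather than nesting it deeper in the trash.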