[ http://issues.apache.org/jira/browse/HADOOP-432?page=comments#action_12436970 ]

Wendy Chien commented on HADOOP-432:
------------------------------------
Here's the current proposal. Comments are welcome.

Two config items:
* maximum size (say 5TB). DFS tries to keep recycle bin under this size.
* minimum time (say 1 hour). Files are never removed less than this time after they're deleted.

Namenode:
* keeps track of recycle bin size
* records deletion time of each file
* occasionally wakes up, scans deleted files and removes LRU files until desired size is reached, or files are too young, whichever comes first.

notes:
* namenode keeps deleted files sorted based on deletion times, so scan for oldest file is O(1). It's the equivalent of having ls -tr * (only in namenode, not exposed externally)
* it's all automatic. No user intervention ever, no purge command.
* file removal is lazy. Options:
  - namenode wakes up occasionally (once a minute?) and removes *all* the files pending deletion
  - namenode wakes up frequently (once a second?) and removes a small (100?) number of files at most
* deleted files are renamed with entire path, username of deleter, and time included.

> support undelete, snapshots, or other mechanism to recover lost files
> ---------------------------------------------------------------------
>
> Key: HADOOP-432
> URL: http://issues.apache.org/jira/browse/HADOOP-432
> Project: Hadoop
> Issue Type: Improvement
> Components: dfs
> Reporter: Yoram Arnon
> Assigned To: Wendy Chien
>
> currently, once you delete a file it's gone forever.
> most file systems allow some form of recovery of deleted files.
> a simple solution would be an 'undelete' command.
> a more comprehensive solution would include snapshots, manual and automatic,
> with scheduling options.
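The purge policy proposed in the comment above could be sketched roughly as follows. This is only an illustration in Python, not Hadoop code: the names RecycleBin, delete and purge, the trash-name format, and the batch_limit parameter are all assumptions made for the example. It also assumes deletion timestamps never go backwards, so insertion order doubles as deletion-time order and the oldest entry is always at the front.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class DeletedFile:
    size: int          # bytes occupied by the deleted file
    deleted_at: float  # deletion time, seconds since some epoch


class RecycleBin:
    """Hypothetical model of the namenode-side recycle bin (not Hadoop code)."""

    def __init__(self, max_bytes, min_age_secs, batch_limit=100):
        self.max_bytes = max_bytes        # "maximum size" config item
        self.min_age_secs = min_age_secs  # "minimum time" config item
        self.batch_limit = batch_limit    # cap on removals per wake-up (assumed)
        self.total_bytes = 0              # running recycle bin size
        # Entries are appended in deletion order, so the front is the oldest.
        self._files = OrderedDict()

    def delete(self, path, user, size, now):
        """Move a file into the bin under a name built from path, user, time."""
        # Assumed naming scheme; the proposal only says these three are included.
        trash_name = f"{path}.{user}.{int(now)}"
        self._files[trash_name] = DeletedFile(size=size, deleted_at=now)
        self.total_bytes += size
        return trash_name

    def purge(self, now):
        """Remove oldest files until the bin fits, or the oldest is too young."""
        removed = []
        while (self._files
               and self.total_bytes > self.max_bytes
               and len(removed) < self.batch_limit):
            name, entry = next(iter(self._files.items()))  # oldest entry, O(1)
            if now - entry.deleted_at < self.min_age_secs:
                break  # everything after this one is younger still
            self._files.popitem(last=False)
            self.total_bytes -= entry.size
            removed.append(name)
        return removed
```

A periodic namenode thread would call purge(now) on each wake-up. A large batch_limit with a long sleep corresponds to the first lazy-removal option, and a small batch_limit (say 100) with a short sleep corresponds to the second.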
