[ http://issues.apache.org/jira/browse/HADOOP-432?page=comments#action_12437000 ]

Yoram Arnon commented on HADOOP-432:
------------------------------------
removing data only when space is required will result in the filesystem always being 100% full. That has several downsides:

* performance: memory usage and image file size on the namenode; difficulty finding space for data - file systems are slower when full; to allocate space you must first delete something (right now, online), slowing down writes
* contention for space: if a disk is shared between dfs and, say, map-reduce temporary storage, or the OS, then dfs will take over everything
* when the FS is really full, undelete will fail, because no space is reserved for it.

Better to declare the FS full earlier and keep the (configured) undelete space available - it's also the common way of doing things. A rough sketch of such a reserved-space check appears after the quoted issue below.

> support undelete, snapshots, or other mechanism to recover lost files
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-432
>                 URL: http://issues.apache.org/jira/browse/HADOOP-432
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Yoram Arnon
>         Assigned To: Wendy Chien
>
> currently, once you delete a file it's gone forever.
> most file systems allow some form of recovery of deleted files.
> a simple solution would be an 'undelete' command.
> a more comprehensive solution would include snapshots, manual and automatic, with scheduling options.
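As a rough illustration of the comment's suggestion (keep a configured amount of undelete space free and declare the FS full early) and of the 'undelete' command mentioned in the issue description, here is a minimal, self-contained Java sketch. It is not DFS code; the class name, the reservedUndeleteBytes setting, and the .Trash directory layout are hypothetical, chosen only to show the idea of refusing writes before the disk is truly full and renaming deleted files into a trash area so they stay recoverable.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/**
 * Hypothetical sketch only: (1) report the filesystem as "full" while a
 * configured amount of undelete space is still free, and (2) implement
 * delete/undelete as renames into and out of a trash directory.
 */
public class UndeleteSketch {

    // Assumed configuration value: bytes kept free so undelete can still work.
    private final long reservedUndeleteBytes;
    private final Path trashDir;

    public UndeleteSketch(long reservedUndeleteBytes, Path trashDir) throws IOException {
        this.reservedUndeleteBytes = reservedUndeleteBytes;
        this.trashDir = Files.createDirectories(trashDir);
    }

    /** Accept a write only if it leaves the reserved undelete space untouched. */
    public boolean hasRoomFor(long requestedBytes, long freeBytes) {
        return freeBytes - requestedBytes > reservedUndeleteBytes;
    }

    /** "Delete" by moving the file into the trash directory; data stays recoverable. */
    public Path delete(Path file) throws IOException {
        Path target = trashDir.resolve(file.getFileName());
        return Files.move(file, target, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Undelete by moving the file back out of the trash directory. */
    public Path undelete(String fileName, Path restoreTo) throws IOException {
        Path trashed = trashDir.resolve(fileName);
        return Files.move(trashed, restoreTo.resolve(fileName));
    }

    public static void main(String[] args) throws IOException {
        UndeleteSketch fs = new UndeleteSketch(64L * 1024 * 1024, Paths.get("/tmp/.Trash"));
        // With 100 MB free and 64 MB reserved, a 30 MB write fits but a 50 MB write does not.
        System.out.println(fs.hasRoomFor(30L * 1024 * 1024, 100L * 1024 * 1024)); // true
        System.out.println(fs.hasRoomFor(50L * 1024 * 1024, 100L * 1024 * 1024)); // false
    }
}

Reserving the space up front means an undelete never competes with new writes for the last few blocks, which is the failure mode the third bullet above warns about.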