[ http://issues.apache.org/jira/browse/HADOOP-432?page=comments#action_12437000 ]

Yoram Arnon commented on HADOOP-432:
------------------------------------

Removing data only when space is required will result in the filesystem
always being 100% full. That has several downsides:
* performance:
       memory usage and image file size on the namenode keep growing
       difficulty finding space for new data - file systems get slower as
       they fill up
       to allocate space you must first delete space (synchronously, on the
       write path), slowing down writes
* contention for space:
       if a disk is shared between dfs and, say, map-reduce temporary
       storage, or the OS, then dfs will take over everything
* undelete reliability:
       when the FS is really full, undelete will fail, because no space is
       reserved for it. Better to declare the FS full earlier and keep the
       (configured) undelete space available

Also, expiring trash on a schedule is the common way of doing things (a
sketch of such a policy follows below).
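
To make the argued-for policy concrete, here is a minimal sketch in Java.
It is purely illustrative: the class name (TrashPolicySketch), the two knobs
(retentionMillis, reservedFraction) and all methods are hypothetical, not
Hadoop's actual trash API. Deleted files park in a trash queue, space is
reclaimed on expiry rather than on demand, and writes are refused once they
would eat into the reserved undelete space.

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.Iterator;

    public class TrashPolicySketch {
        /** One trashed file: path, size, and when it was deleted. */
        private static final class Entry {
            final String path;
            final long size;
            final long deletedAt;
            Entry(String path, long size, long deletedAt) {
                this.path = path;
                this.size = size;
                this.deletedAt = deletedAt;
            }
        }

        private final long capacityBytes;
        private final long retentionMillis;    // hypothetical knob: how long trash stays recoverable
        private final double reservedFraction; // hypothetical knob: capacity kept free for undelete
        private long usedBytes;
        private final Deque<Entry> trash = new ArrayDeque<Entry>(); // oldest deletions first

        public TrashPolicySketch(long capacityBytes, long retentionMillis,
                                 double reservedFraction) {
            this.capacityBytes = capacityBytes;
            this.retentionMillis = retentionMillis;
            this.reservedFraction = reservedFraction;
        }

        /** Declare the FS full early: refuse writes that would eat into reserved space. */
        public boolean allocate(long bytes) {
            expire(System.currentTimeMillis());
            long reserved = (long) (capacityBytes * reservedFraction);
            if (usedBytes + bytes > capacityBytes - reserved) {
                return false; // "full" - nothing gets deleted on the write path
            }
            usedBytes += bytes;
            return true;
        }

        /** Delete only parks the file in trash; its bytes stay counted until expiry. */
        public void delete(String path, long sizeBytes) {
            trash.addLast(new Entry(path, sizeBytes, System.currentTimeMillis()));
        }

        /** Recover the most recently trashed copy of a path, if still retained. */
        public boolean undelete(String path) {
            expire(System.currentTimeMillis());
            for (Iterator<Entry> it = trash.descendingIterator(); it.hasNext(); ) {
                Entry e = it.next();
                if (e.path.equals(path)) {
                    it.remove();
                    return true; // restored; its bytes were never released
                }
            }
            return false;
        }

        /** Reclaim space on a schedule: drop entries older than the retention interval. */
        private void expire(long now) {
            while (!trash.isEmpty()
                    && now - trash.peekFirst().deletedAt > retentionMillis) {
                usedBytes -= trash.removeFirst().size;
            }
        }
    }

The reserved fraction addresses the third bullet directly: undelete cannot
fail for lack of space, because allocate() never hands that space out, and
trash is reclaimed by expiry rather than by write-path pressure.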

> support undelete, snapshots, or other mechanism to recover lost files
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-432
>                 URL: http://issues.apache.org/jira/browse/HADOOP-432
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Yoram Arnon
>         Assigned To: Wendy Chien
>
> Currently, once you delete a file it's gone forever.
> Most file systems allow some form of recovery of deleted files.
> A simple solution would be an 'undelete' command.
> A more comprehensive solution would include snapshots, manual and
> automatic, with scheduling options.
