[ 
https://issues.apache.org/jira/browse/HADOOP-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726993#comment-14726993
 ] 

Weiwei Yang commented on HADOOP-5323:
-------------------------------------

We should definitely improve the trash documentation; it is out of date and 
easily confuses readers. The document should also cover the issues resolved in 
HADOOP-6761. I propose revising the doc to read:

*File Deletes and Undeletes*

When a file is deleted by a user or an application, it is not immediately 
removed from HDFS. Instead, HDFS moves it to a trash directory (each user has 
their own trash directory under `/user/<username>/.Trash`). The most recently 
deleted files are moved to the current trash directory 
(`/user/<username>/.Trash/Current`). At a configurable interval, HDFS creates 
checkpoints (under `/user/<username>/.Trash/<date>`) of the files in the 
current trash directory and deletes old checkpoints once they expire.

Currently the trash feature is disabled by default (files are deleted without 
being stored in trash). Users can enable it by setting the parameter 
`fs.trash.interval` (in core-site.xml) to a value greater than zero. This value 
tells the NameNode how long a checkpoint is kept before it expires and is 
removed from HDFS. In addition, users can set `fs.trash.checkpoint.interval` 
(also in core-site.xml) to tell the NameNode how often to create trash 
checkpoints; this value should be smaller than or equal to 
`fs.trash.interval`.
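
As a rough illustration of the two settings, a core-site.xml fragment might 
look like the sketch below. The numeric values are assumptions chosen only for 
the example (both properties are expressed in minutes); they are not 
recommendations from this issue.

```xml
<configuration>
  <!-- Illustrative value (assumption): keep each checkpoint for 1440 minutes,
       i.e. one day, before it expires. Any value greater than zero enables
       the trash feature. -->
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>

  <!-- Illustrative value (assumption): create a trash checkpoint every 60
       minutes. This should be smaller than or equal to fs.trash.interval. -->
  <property>
    <name>fs.trash.checkpoint.interval</name>
    <value>60</value>
  </property>
</configuration>
```

With values like these, a deleted file would first sit in 
`/user/<username>/.Trash/Current`, be rolled into a checkpoint directory 
within about an hour, and be removed once that checkpoint is older than a day.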




> Trash documentation needs to be more elaborated.
> ------------------------------------------------
>
>                 Key: HADOOP-5323
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5323
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: documentation
>    Affects Versions: 0.18.3
>            Reporter: Suman Sehgal
>            Assignee: Weiwei Yang
>            Priority: Minor
>              Labels: newbie
>
> Trash documentation should mention the significance of the "Current" and 
> "<time-stamp>" directories that are generated inside the Trash directory. The 
> documentation should also incorporate the modifications made in HADOOP-4970.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
