[ 
https://issues.apache.org/jira/browse/HADOOP-5712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708134#action_12708134
 ] 

Koji Noguchi commented on HADOOP-5712:
--------------------------------------

bq. At present, it is very easy to find out the last version of the file in 
Trash.

If all the nodes in the cluster are in the same timezone, wouldn't a timestamp 
(almost) serve this purpose?

bq. Another option would be to make the Trash client retrieve the contents of 
the Trash directory and then scan what files pre-exist in the list. 

listStatus is one of the most expensive calls to the Namenode right now.  
I really want to avoid adding extra overhead on the Namenode for such a common 
command.

bq. Also, this code is not fool-proof because there is no atomicity between the 
exists and the rename. 

True.  But I haven't seen this become a problem yet.
For me, the contract is that we *try* to move the files to Trash, but if that 
fails, we simply delete them.
We completely delete the files if rename fails twice in a row anyway.
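
To make that contract concrete, here is a minimal sketch. It is not the actual Trash code: it uses the local filesystem through Python's os module as a stand-in for HDFS, and the name move_to_trash, the attempts parameter and the millisecond-timestamp suffix are assumptions made for illustration. The point is that a name clash is resolved with one exists check plus a timestamp, never with a listing of the Trash directory, and that the file is deleted once the rename has failed twice.

```python
import os
import time


def move_to_trash(path, trash_dir, attempts=2):
    """Try to move `path` into `trash_dir`; delete it if every attempt fails.

    Returns the new location inside the trash, or None if the file was deleted.
    Hypothetical helper, local-filesystem stand-in for the HDFS calls.
    """
    os.makedirs(trash_dir, exist_ok=True)
    target = os.path.join(trash_dir, os.path.basename(path))
    for _ in range(attempts):
        if os.path.exists(target):
            # Name clash: append a millisecond timestamp rather than listing
            # the trash directory to search for a free name.
            target = "%s.%d" % (target, int(time.time() * 1000))
        try:
            os.rename(path, target)
            return target
        except OSError:
            # exists() and rename() are not atomic, so the rename can still
            # fail; try once more, then fall through to the delete.
            continue
    # Every rename attempt failed: the contract is to simply delete the file.
    os.remove(path)
    return None
```

Note that a local POSIX rename silently replaces an existing file, so the race between the exists check and the rename does not show up as an error here the way a failed rename would on HDFS. The sketch only illustrates the control flow.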

In short, I want the Trash feature to stay as simple as it is now without 
involving the Namenode much.


> Namenode slowed down when many files with same filename were moved to Trash
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-5712
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5712
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.3
>            Reporter: Koji Noguchi
>            Priority: Minor
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
