access times of HDFS files
--------------------------
Key: HADOOP-1869
URL: https://issues.apache.org/jira/browse/HADOOP-1869
Project: Hadoop
Issue Type: New Feature
Components: dfs
Reporter: dhruba borthakur
HDFS should support some type of statistics that allows an administrator to
determine when a file was last accessed.
Since HDFS does not have quotas yet, users are likely to keep accumulating
files in their home directories without much regard to the amount of space
they occupy. Because the namenode keeps the metadata for every file in memory,
this causes memory-related problems with the namenode.
Access times are costly to maintain. AFS does not maintain access times. I
think DCE-DFS does maintain access times with a coarse granularity.
One proposal for HDFS would be to implement something like an "access bit".
1. The access bit is set when a file is accessed. If the access bit is already
set, then the access does not result in a transaction.
2. A call to FileSystem.clearAccessBits() indicates that the access bits of all
files need to be cleared.
An administrator can use the above mechanism (maybe from a daily cron job) to
determine which files were recently used.
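
The access-bit proposal above could be modeled roughly as follows. This is a
minimal in-memory sketch, not HDFS code: the class AccessBitSketch and its
methods addFile, markAccessed, accessedSinceLastClear and transactionCount are
hypothetical names introduced for illustration, and counting
clearAccessBits() as a single transaction is an assumption that the proposal
does not spell out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of the proposed access bit. Not HDFS code.
class AccessBitSketch {
    // Path -> access bit. TreeMap keeps the report in sorted path order.
    private final Map<String, Boolean> accessBits = new TreeMap<>();
    // Number of state changes that would have to be logged as a transaction.
    private long transactions = 0;

    // Registers a file with its bit cleared (file creation is not counted here).
    void addFile(String path) {
        accessBits.put(path, Boolean.FALSE);
    }

    // Step 1: set the bit on access. Returns true only when the bit flipped,
    // which is the only case that costs a transaction.
    boolean markAccessed(String path) {
        Boolean bit = accessBits.get(path);
        if (bit == null) {
            throw new IllegalArgumentException("no such file: " + path);
        }
        if (bit) {
            return false; // bit already set: nothing to log
        }
        accessBits.put(path, Boolean.TRUE);
        transactions++;
        return true;
    }

    // Step 2: clear the bits of all files (assumed to be one transaction).
    void clearAccessBits() {
        accessBits.replaceAll((path, bit) -> Boolean.FALSE);
        transactions++;
    }

    // Files whose bit is set, i.e. accessed since the last clear.
    List<String> accessedSinceLastClear() {
        List<String> accessed = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : accessBits.entrySet()) {
            if (entry.getValue()) {
                accessed.add(entry.getKey());
            }
        }
        return accessed;
    }

    long transactionCount() {
        return transactions;
    }

    public static void main(String[] args) {
        AccessBitSketch ns = new AccessBitSketch();
        ns.addFile("/user/alice/hot.dat");
        ns.addFile("/user/alice/old.log");
        ns.markAccessed("/user/alice/hot.dat");
        ns.markAccessed("/user/alice/hot.dat"); // repeat access, nothing logged
        System.out.println("accessed: " + ns.accessedSinceLastClear());
        System.out.println("transactions: " + ns.transactionCount());
        ns.clearAccessBits(); // what a daily cron job might do after reporting
    }
}
```

In this model a daily job would first read accessedSinceLastClear() and then
call clearAccessBits(), so each file costs at most one transaction per day no
matter how often it is read.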