[
https://issues.apache.org/jira/browse/HADOOP-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623465#action_12623465
]
dhruba borthakur commented on HADOOP-1869:
------------------------------------------
I plan on doing the following:
1. Add a 4 byte field to the in-memory inode to maintain access time.
2. Add a new transaction, OP_SET_ACCESSTIME, to the edits log.
3. Every access to read data/metadata for a file will do the following:
-- update the in-memory access time of the inode.
-- write an OP_SET_ACCESSTIME entry for this inode to the edits log buffer,
but do not sync or flush the buffer.
4. Enhance the dfs shell/webUI to display access times of files/directories
This should not adversely impact the transaction processing rate of the
namenode. Other types of transactions (e.g. file creation) will cause the
transaction log buffer to be synced to disk fairly quickly anyway. This
implementation will not distinguish between different kinds of metadata
access; it is primarily intended to identify files that have not been used
for a long time.
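The plan above could be sketched roughly as follows. This is only an
illustration and not namenode code: the Inode and EditLogBuffer classes, the
opcode value 13, the record layout, and the choice to store seconds since the
epoch in the 4-byte field are all assumptions made for the example. It shows a
read updating the in-memory access time and appending an OP_SET_ACCESSTIME
record to a buffer without syncing; the sync happens only when some other
transaction asks for it.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class AccessTimeSketch {
    // Hypothetical opcode value; the plan names the opcode but not its number.
    static final byte OP_SET_ACCESSTIME = 13;

    /** Minimal stand-in for an in-memory inode. */
    static final class Inode {
        final String path;
        int accessTimeSecs; // the 4-byte field (assumed: seconds since epoch)

        Inode(String path) { this.path = path; }
    }

    /** Minimal stand-in for the edits log: buffers records, syncs on request. */
    static final class EditLogBuffer {
        private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
        private final DataOutputStream out = new DataOutputStream(pending);
        private int syncCount = 0;

        /** Appends a record to the buffer; deliberately does not sync. */
        void logSetAccessTime(String path, int secs) {
            try {
                out.writeByte(OP_SET_ACCESSTIME);
                out.writeUTF(path);   // assumed record layout: opcode, path, time
                out.writeInt(secs);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        /** A real log would write the buffer to disk here; this one discards it. */
        void logSync() {
            pending.reset();
            syncCount++;
        }

        int pendingBytes() { return pending.size(); }
        int syncCount() { return syncCount; }
    }

    /** Step 3 of the plan: update the inode, buffer the edit, do not sync. */
    static void recordAccess(Inode inode, EditLogBuffer log, long nowMillis) {
        inode.accessTimeSecs = (int) (nowMillis / 1000L);
        log.logSetAccessTime(inode.path, inode.accessTimeSecs);
    }

    public static void main(String[] args) {
        Inode inode = new Inode("/user/a/f");
        EditLogBuffer log = new EditLogBuffer();

        recordAccess(inode, log, System.currentTimeMillis()); // read: buffered only
        System.out.println("pending bytes after read: " + log.pendingBytes());

        log.logSync(); // e.g. triggered by a later file creation
        System.out.println("pending bytes after sync: " + log.pendingBytes());
    }
}
```

One consequence of this design, under the assumptions above, is that an access
time recorded shortly before a crash may be lost if no other transaction has
forced a sync in the meantime.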
> access times of HDFS files
> --------------------------
>
> Key: HADOOP-1869
> URL: https://issues.apache.org/jira/browse/HADOOP-1869
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs
> Reporter: dhruba borthakur
>
> HDFS should support some type of statistics that allows an administrator to
> determine when a file was last accessed.
> Since HDFS does not have quotas yet, it is likely that users keep on
> accumulating files in their home directories without much regard to the
> amount of space they are occupying. This causes memory-related problems with
> the namenode.
> Access times are costly to maintain. AFS does not maintain access times. I
> think DCE-DFS does maintain access times with a coarse granularity.
> One proposal for HDFS would be to implement something like an "access bit".
> 1. This access-bit is set when a file is accessed. If the access bit is
> already set, then this call does not result in a transaction.
> 2. A FileSystem.clearAccessBits() indicates that the access bits of all files
> need to be cleared.
> An administrator can effectively use the above mechanism (maybe a daily cron
> job) to determine which files have been used recently.
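For illustration, the access-bit proposal in the issue description might look
something like the sketch below. It is not an existing HDFS API: the
AccessBitSketch class, its method names other than clearAccessBits, and the
transaction counter are hypothetical, and whether clearing the bits would
itself be logged is left open because the description does not say.

```java
import java.util.HashSet;
import java.util.Set;

public class AccessBitSketch {
    // Paths whose access bit is currently set.
    private final Set<String> accessed = new HashSet<>();
    // Counts how many accesses would have produced a transaction.
    private int transactions = 0;

    /** Sets the access bit; returns true only if this caused a transaction. */
    boolean markAccessed(String path) {
        boolean newlySet = accessed.add(path); // false if the bit was already set
        if (newlySet) {
            transactions++;
        }
        return newlySet;
    }

    boolean isAccessed(String path) {
        return accessed.contains(path);
    }

    /** Stand-in for the proposed FileSystem.clearAccessBits(). */
    void clearAccessBits() {
        accessed.clear();
    }

    int transactionCount() {
        return transactions;
    }

    public static void main(String[] args) {
        AccessBitSketch fs = new AccessBitSketch();
        fs.markAccessed("/user/a/f"); // first access: bit set, one transaction
        fs.markAccessed("/user/a/f"); // repeat access: no new transaction
        System.out.println("transactions: " + fs.transactionCount());

        // A periodic job could report the accessed files, then reset the bits.
        System.out.println("f accessed: " + fs.isAccessed("/user/a/f"));
        fs.clearAccessBits();
        System.out.println("f accessed after clear: " + fs.isAccessed("/user/a/f"));
    }
}
```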