[
https://issues.apache.org/jira/browse/HADOOP-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628246#action_12628246
]
Sanjay Radia commented on HADOOP-1869:
--------------------------------------
The main use cases are distcp and restore (or untar).
Konstantine raises two good points:
* Restrict this to the create operation. For this to work, the time has to be
applied at the close event; otherwise you run into the situation that Ragu
raises about the file taking a long time to write its last block.
* Restrict times to be <= the NN's current time.
This could run into problems with distcp between two HDFS clusters whose
clocks are out of sync.
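
To make the clock-skew concern concrete, here is a minimal sketch of the "<= NN's current time" rule. The class and method names are hypothetical and the numbers are made up for illustration; this is not NameNode code.

```java
public class TimeBoundCheck {
    /**
     * Hypothetical rule: accept a client-supplied time only if it is
     * not in the destination NameNode's future.
     */
    public static boolean isAcceptable(long requestedMillis, long nameNodeNowMillis) {
        return requestedMillis <= nameNodeNowMillis;
    }

    public static void main(String[] args) {
        long destNow = 1_000_000L;   // destination NameNode clock (assumed value)
        long skew = 5_000L;          // assume the source cluster runs 5 s ahead

        // File modified 1 s ago according to the source cluster's clock.
        long sourceModTime = destNow + skew - 1_000L;

        // The source time is in the destination's future, so it is rejected.
        System.out.println(isAcceptable(sourceModTime, destNow));    // false

        // With synchronized clocks the same file would be accepted.
        System.out.println(isAcceptable(destNow - 1_000L, destNow)); // true
    }
}
```

With this rule, a distcp of a recently written file fails or has to clamp the time whenever the source clock is ahead by more than the file's age.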
While the extended create operation works for our use case, there are a few
advantages to the utimes() approach:
- it handles other use cases we haven't thought of today
- if we provide partial POSIX compatibility in the future, one could use
POSIX's restore/untar tools
Hence I am in favour of:
FileSystem.utimes(path, modTime, aTime).
+1
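
As an illustration of the restore use case, here is a rough sketch of how a utimes-style call would be used: copy the data first, then stamp the copy with the original times. It runs against the local file system through java.nio as a stand-in for HDFS; the class RestoreTimes and its methods are hypothetical names, not the proposed FileSystem API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributeView;
import java.nio.file.attribute.BasicFileAttributes;
import java.nio.file.attribute.FileTime;

public class RestoreTimes {
    /** Local-file-system stand-in for utimes(path, modTime, aTime). */
    public static void utimes(Path path, long modTimeMillis, long accessTimeMillis)
            throws IOException {
        BasicFileAttributeView view =
                Files.getFileAttributeView(path, BasicFileAttributeView.class);
        // A null creation time means "leave it unchanged".
        view.setTimes(FileTime.fromMillis(modTimeMillis),
                      FileTime.fromMillis(accessTimeMillis),
                      null);
    }

    /** Copy the bytes, then apply the source's times to the copy. */
    public static void restore(Path src, Path dst) throws IOException {
        // Read the times before copying, since reading src may touch its atime.
        BasicFileAttributes attrs = Files.readAttributes(src, BasicFileAttributes.class);
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        utimes(dst, attrs.lastModifiedTime().toMillis(),
                    attrs.lastAccessTime().toMillis());
    }

    /** Small self-check: returns the modification time of the restored copy. */
    public static long demo() {
        try {
            Path src = Files.createTempFile("src", ".dat");
            Path dst = Files.createTempFile("dst", ".dat");
            try {
                Files.write(src, new byte[] {1, 2, 3});
                utimes(src, 1_000_000_000_000L, 1_000_000_000_000L);
                restore(src, dst);
                return Files.getLastModifiedTime(dst).toMillis();
            } finally {
                Files.deleteIfExists(src);
                Files.deleteIfExists(dst);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("restored mtime (ms): " + demo());
    }
}
```

Because the times are set in a separate call after the data is written, the slow-last-block problem of the create-time approach does not arise in this sketch.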
> access times of HDFS files
> --------------------------
>
> Key: HADOOP-1869
> URL: https://issues.apache.org/jira/browse/HADOOP-1869
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Fix For: 0.19.0
>
> Attachments: accessTime1.patch, accessTime4.patch, accessTime5.patch,
> accessTime6.patch
>
>
> HDFS should support some type of statistics that allows an administrator to
> determine when a file was last accessed.
> Since HDFS does not have quotas yet, it is likely that users keep on
> accumulating files in their home directories without much regard to the
> amount of space they are occupying. This causes memory-related problems with
> the namenode.
> Access times are costly to maintain. AFS does not maintain access times. I
> think DCE-DFS does maintain access times with a coarse granularity.
> One proposal for HDFS would be to implement something like an "access bit".
> 1. This access-bit is set when a file is accessed. If the access bit is
> already set, then this call does not result in a transaction.
> 2. A FileSystem.clearAccessBits() indicates that the access bits of all files
> need to be cleared.
> An administrator can effectively use the above mechanism (maybe a daily cron
> job) to determine files that are recently used.
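
The access-bit proposal quoted above can be sketched as a small in-memory model. This is only an illustration under assumptions: AccessBitModel and its methods are hypothetical, the "transaction" is just a counter, and whether clearAccessBits() itself would be logged is not stated in the description, so it is not counted here.

```java
import java.util.HashMap;
import java.util.Map;

public class AccessBitModel {
    // path -> access bit (true once the file has been accessed since the last clear)
    private final Map<String, Boolean> accessed = new HashMap<>();
    private int transactions = 0;

    public void create(String path) {
        accessed.put(path, false);
    }

    /** Record an access; only a clear-to-set transition costs a transaction. */
    public void access(String path) {
        Boolean bit = accessed.get(path);
        if (bit == null) {
            throw new IllegalArgumentException("no such file: " + path);
        }
        if (!bit) {
            accessed.put(path, true);
            transactions++;
        }
    }

    /** Clear the access bit of every file (assumption: not counted as a transaction). */
    public void clearAccessBits() {
        accessed.replaceAll((path, bit) -> false);
    }

    public boolean wasAccessed(String path) {
        return Boolean.TRUE.equals(accessed.get(path));
    }

    public int transactions() {
        return transactions;
    }

    public static void main(String[] args) {
        AccessBitModel fs = new AccessBitModel();
        fs.create("/user/a/data");
        fs.create("/user/b/old");

        // Three reads of the same file cost one transaction in total.
        fs.access("/user/a/data");
        fs.access("/user/a/data");
        fs.access("/user/a/data");

        System.out.println(fs.transactions());              // 1
        System.out.println(fs.wasAccessed("/user/a/data")); // true
        System.out.println(fs.wasAccessed("/user/b/old"));  // false

        // The daily job reports the bits, then resets them.
        fs.clearAccessBits();
        System.out.println(fs.wasAccessed("/user/a/data")); // false
    }
}
```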