Creation time and modification time for hadoop files and directories
--------------------------------------------------------------------

                 Key: HADOOP-1377
                 URL: https://issues.apache.org/jira/browse/HADOOP-1377
             Project: Hadoop
          Issue Type: New Feature
            Reporter: dhruba borthakur
         Assigned To: dhruba borthakur


This issue will document the requirements, design and implementation of 
creation times and modification times of hadoop files and directories.

My proposal is to support two additional attributes for each file and 
directory in HDFS. The "creation time" is the time when the file/directory was 
created. It is an 8-byte integer stored in each FSDirectory.INode. The 
"modification time" is the time when the last modification occurred to the 
file/directory. It is also an 8-byte integer stored in the FSDirectory.INode. 
These two fields are stored in the FSEdits and FSImage as part of the 
transaction that created the file/directory.
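
As a rough illustration of the two attributes, here is a minimal sketch, 
assuming each timestamp is a Java long holding milliseconds since the epoch. 
The class name INodeTimes and its methods are hypothetical and are not part of 
the existing FSDirectory.INode; the byte layout is only an assumption about how 
the fields might appear in an FSEdits or FSImage record.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for FSDirectory.INode; only the two proposed
// attributes are modelled here.
class INodeTimes {
    final long creationTime;      // assumed unit: milliseconds since the epoch
    final long modificationTime;  // same assumed unit

    INodeTimes(long creationTime, long modificationTime) {
        this.creationTime = creationTime;
        this.modificationTime = modificationTime;
    }

    // Serialize both fields; each writeLong emits exactly 8 bytes.
    byte[] toBytes() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeLong(creationTime);
            out.writeLong(modificationTime);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Read the two fields back in the same order they were written.
    static INodeTimes fromBytes(byte[] bytes) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(bytes))) {
            long created = in.readLong();
            long modified = in.readLong();
            return new INodeTimes(created, modified);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Under these assumptions the two attributes add 16 bytes to each serialized 
file or directory record.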

My current proposal is to not support "access time" for a file/directory. It is 
costly to implement and current applications might not need it.

In the current implementation, the "modification time" for a file will be the 
same as its creation time because HDFS files are currently unmodifiable. 
Setting file attributes (e.g. setting the replication factor) of a file does 
not change the "modification time" of that file. The "modification time" for a 
directory is either its creation time or the time when the most recent 
file-delete or file-create occurred in that directory.
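
The update rules above can be sketched as follows. This is only an 
illustration under stated assumptions: DirNode and FileNode are hypothetical 
names, timestamps are passed in explicitly rather than read from a clock, and 
none of this is the actual namenode code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical file node: the modification time is fixed at creation
// because HDFS files are currently unmodifiable.
class FileNode {
    final long creationTime;
    final long modificationTime;
    short replication;

    FileNode(long now, short replication) {
        this.creationTime = now;
        this.modificationTime = now;  // same as the creation time
        this.replication = replication;
    }

    // Changing an attribute leaves the modification time untouched.
    void setReplication(short replication) {
        this.replication = replication;
    }
}

// Hypothetical directory node: the modification time advances only on
// file-create and file-delete inside this directory.
class DirNode {
    final long creationTime;
    long modificationTime;
    private final Map<String, FileNode> files = new HashMap<>();

    DirNode(long now) {
        this.creationTime = now;
        this.modificationTime = now;
    }

    FileNode createFile(String name, long now) {
        FileNode file = new FileNode(now, (short) 3);
        files.put(name, file);
        modificationTime = now;
        return file;
    }

    boolean deleteFile(String name, long now) {
        if (files.remove(name) == null) {
            return false;  // nothing was deleted, so the time is unchanged
        }
        modificationTime = now;
        return true;
    }
}
```

In this sketch a failed delete does not touch the directory's modification 
time; the proposal does not say how that case should behave, so treat it as an 
assumption.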

A new command named "hadoop dfs -lsl" will display the creation time and 
modification time of the files/directories that it lists. The output of the 
existing command "hadoop dfs -ls" will not be affected.

The ClientProtocol will change because DFSFileInfo will have two additional 
fields: the creation time and modification time of the file that it represents. 
This information can be retrieved by clients through the 
ClientProtocol.getListings() method. The FileSystem public API will have two 
additional methods: getCreationTime() and getModificationTime().
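
One way a client-side lookup could be shaped is sketched below. The proposal 
does not give the method signatures, so the String path parameter, the long 
return type, the exception thrown for a missing path, and the class names 
FileInfoSketch and InMemoryFs are all assumptions made for illustration; the 
in-memory map merely stands in for a listing fetched from the namenode.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for DFSFileInfo carrying the two new fields.
final class FileInfoSketch {
    final String path;
    final long creationTime;
    final long modificationTime;

    FileInfoSketch(String path, long creationTime, long modificationTime) {
        this.path = path;
        this.creationTime = creationTime;
        this.modificationTime = modificationTime;
    }
}

// Hypothetical file system facade exposing the two proposed getters.
class InMemoryFs {
    private final Map<String, FileInfoSketch> listing = new HashMap<>();

    // Stands in for receiving a listing entry from the namenode.
    void put(FileInfoSketch info) {
        listing.put(info.path, info);
    }

    long getCreationTime(String path) {
        return lookup(path).creationTime;
    }

    long getModificationTime(String path) {
        return lookup(path).modificationTime;
    }

    private FileInfoSketch lookup(String path) {
        FileInfoSketch info = listing.get(path);
        if (info == null) {
            throw new IllegalArgumentException("No such path: " + path);
        }
        return info;
    }
}
```

Both getters read from the same listing entry here, so a caller that needs 
both values could fetch the entry once.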

This design and implementation are completely transparent to the datanodes, 
which require no changes.

