[
https://issues.apache.org/jira/browse/HADOOP-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713260#action_12713260
]
Tsz Wo (Nicholas), SZE commented on HADOOP-5620:
------------------------------------------------
> Dhruba told me modification times of directories are not persistent, that is,
> on namenode restart they are set to the latest modification time amongst the
> files they contain.
I have just checked the code, and that does not seem to be true.
Also, DistCp works on the generic FileSystem API, so it should not depend on a
particular implementation.
> If we get atime inside the if, it will be the copy time (last access after
> copying the file) instead of the latest access time before copying, which is
> what we need for migration.
FileStatus is a local object. Once it has been obtained from a FileSystem, it
remains unchanged even if the actual status of the file changes. So the atime
inside the if-statement will be the latest access time before copying, since
getFileStatus is called before copying.
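
To illustrate the snapshot behaviour, here is a minimal sketch. It is not the
DistCp patch: it uses java.nio.file from the JDK as a stand-in for Hadoop's
FileSystem and FileStatus so that it runs without a cluster. The class name
SnapshotTimes and the helper copyPreservingTimes are hypothetical. The
attributes are read once before the copy, and that object keeps the pre-copy
times even though the copy itself reads the source file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributeView;
import java.nio.file.attribute.BasicFileAttributes;
import java.nio.file.attribute.FileTime;

public class SnapshotTimes {

    // Hypothetical helper (an assumption for illustration, not DistCp code):
    // copies src to dst, then applies the times captured before the copy.
    static BasicFileAttributes copyPreservingTimes(Path src, Path dst) throws IOException {
        // Snapshot taken BEFORE the copy. Like a FileStatus, this object is
        // local and does not change when the file is accessed afterwards.
        BasicFileAttributes before = Files.readAttributes(src, BasicFileAttributes.class);

        // The copy reads src, which may update its access time on disk,
        // but the snapshot above is unaffected.
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);

        // Apply the pre-copy mtime and atime to the destination.
        // Passing null leaves the creation time unchanged.
        Files.getFileAttributeView(dst, BasicFileAttributeView.class)
             .setTimes(before.lastModifiedTime(), before.lastAccessTime(), null);
        return before;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("src", ".txt");
        Path dst = Files.createTempFile("dst", ".txt");
        Files.write(src, "hello".getBytes());
        // Give the source a recognisable modification time in the past.
        Files.setLastModifiedTime(src, FileTime.fromMillis(1_000_000_000_000L));

        BasicFileAttributes snap = copyPreservingTimes(src, dst);
        System.out.println("src mtime before copy: " + snap.lastModifiedTime());
        System.out.println("dst mtime after copy:  " + Files.getLastModifiedTime(dst));
    }
}
```

In the Hadoop case the same ordering would apply: call getFileStatus on the
source first, copy, and then set the times on the destination from that
earlier FileStatus.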
BTW, there is a whitespace change in the patch. Could you remove it?
> distcp can preserve modification times of files
> -----------------------------------------------
>
> Key: HADOOP-5620
> URL: https://issues.apache.org/jira/browse/HADOOP-5620
> Project: Hadoop Core
> Issue Type: Improvement
> Components: tools/distcp
> Reporter: dhruba borthakur
> Assignee: Rodrigo Schmidt
> Fix For: 0.21.0
>
> Attachments: HADOOP-5620.patch
>
>
> It will be helpful if distcp can preserve the modification time and access
> time of files. This helps to archive/unarchive hdfs files.