Chandan Biswas created HDFS-8714:
------------------------------------
Summary: Folder ModificationTime in Millis Changed When NameNode
is restarted
Key: HDFS-8714
URL: https://issues.apache.org/jira/browse/HDFS-8714
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Chandan Biswas
*Steps to Reproduce*
# Steps to perform in the program
** Create a folder in HDFS
** Print folder modificationTime in millis
** Upload a file or copy a file to this newly created folder
** Print file and folder modificationTime in millis
** Restart the name node
** Print file and folder modificationTime in millis
# Expected Result
** before the name node restart, the folder modification time should equal the
file modification time
** the folder modification time should not change after the name node restart
# Actual result
** the folder modification time is not the same as the file modification time
** the folder modification time changes after the name node restart: it becomes
equal to the file modification time
*Impact of this behavior:* Before a task is launched, distributed cache
files/folders are checked for any modification. The check is done by
comparing the file/folder modificationTime in millis. So any job that uses
the distributed cache can fail if
# the name node restarts and running tasks are resubmitted, or
# the name node restarts while tasks are still queued, e.g. 50 of 100 tasks
are waiting to run
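The comparison described above can be sketched roughly as follows. This is only an illustration, not the actual Hadoop distributed cache code: the class name CacheTimestampCheck and the method verifyUnchanged are hypothetical, and the two timestamps are the folder values from the output further down in this report.

```java
import java.io.IOException;

public class CacheTimestampCheck {

    // Hypothetical helper (not a Hadoop API): fails when the timestamp
    // recorded at job submission differs from the one HDFS reports now.
    public static void verifyUnchanged(String path, long recordedMillis,
            long currentMillis) throws IOException {
        if (recordedMillis != currentMillis) {
            throw new IOException("Resource " + path
                    + " changed on the source file system (expected "
                    + recordedMillis + ", was " + currentMillis + ")");
        }
    }

    public static void main(String[] args) {
        String folder = "/user/vagrant/chandan/test";
        long beforeRestart = 1435868217353L; // folder time logged before the restart
        long afterRestart = 1435868217368L;  // folder time logged after the restart
        try {
            verifyUnchanged(folder, beforeRestart, afterRestart);
            System.out.println("Timestamp unchanged, task may start");
        } catch (IOException e) {
            System.out.println("Task would fail: " + e.getMessage());
        }
    }
}
```

With the values from this report the check fails even though nobody touched the folder, which is why a restart between job submission and task launch is enough to break the job.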
Here is the sample code I used for testing:
{code}
// fileSystem (FileSystem) and configuration (Configuration) are set up elsewhere
// Create the folder in HDFS and print its modification time
final Path pathToFiles = new Path("/user/vagrant/chandan/test/");
fileSystem.mkdirs(pathToFiles);
System.out.println("HDFS Folder Modification Time in long Before file copy:"
    + fileSystem.getFileStatus(pathToFiles).getModificationTime());
// Copy a file into the new folder, then print both modification times
FileUtil.copy(fileSystem, new Path("/user/cloudera/test"), fileSystem,
    pathToFiles, false, configuration);
System.out.println("HDFS File Modification Time in long:"
    + fileSystem.getFileStatus(
        new Path("/user/vagrant/chandan/test/test")).getModificationTime());
System.out.println("HDFS Folder Modification Time in long After file copy:"
    + fileSystem.getFileStatus(pathToFiles).getModificationTime());
// Print both times every 2 minutes; the name node is restarted while this runs
for (int i = 0; i < 100; i++) {
    System.out.println("Normal HDFS Folder Modification Time in long:"
        + fileSystem.getFileStatus(pathToFiles).getModificationTime());
    System.out.println("Normal HDFS File Modification Time in long:"
        + fileSystem.getFileStatus(
            new Path("/user/vagrant/chandan/test/test")).getModificationTime());
    Thread.sleep(60000 * 2);
}
{code}
Here is the output:
{code}
HDFS Folder Modification Time in long Before file copy:1435868217309
HDFS File Modification Time in long:1435868217368
HDFS Folder Modification Time in long After file copy:1435868217353
Normal HDFS Folder Modification Time in long:1435868217353
Normal HDFS File Modification Time in long:1435868217368
Normal HDFS Folder Modification Time in long:1435868217353
Normal HDFS File Modification Time in long:1435868217368
Normal HDFS Folder Modification Time in long:1435868217368
Normal HDFS File Modification Time in long:1435868217368
{code}
The last two lines were printed after the name node restart.
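To make the jump easier to see, here is a small sketch that scans the polled values for the first sample that differs from the previous one. The class MtimeDrift and the method firstChange are made up for this illustration; the arrays hold the values from the "Normal HDFS ..." lines of the output above.

```java
public class MtimeDrift {

    // Returns the index of the first sample that differs from the
    // previous sample, or -1 if the value never changes.
    public static int firstChange(long[] samples) {
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] != samples[i - 1]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Values copied from the polling loop output in this report.
        long[] folder = {1435868217353L, 1435868217353L, 1435868217368L};
        long[] file = {1435868217368L, 1435868217368L, 1435868217368L};

        System.out.println("folder first changed at sample " + firstChange(folder));
        System.out.println("file first changed at sample " + firstChange(file));
        System.out.println("folder shift in ms: " + (folder[2] - folder[1]));
    }
}
```

For these values the folder time first changes at the third sample (index 2), moving forward by 15 ms to match the file time, while the file time never changes.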
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)