[ https://issues.apache.org/jira/browse/HADOOP-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17228567#comment-17228567 ]
Vinayakumar B edited comment on HADOOP-17306 at 11/9/20, 1:14 PM:
------------------------------------------------------------------

Hi [~Jim_Brennan],

Thanks for pointing out the test failures.

AFAIK, the test failures occur because the test code explicitly sets the timestamp of {{LocalResource}} to the value returned by {{File.lastModified()}} for the script file used in the tests. As mentioned in this Jira's title, {{File.lastModified()}} is broken and loses accuracy. I tried replacing the {{File.lastModified()}} calls with {{Files.getLastModifiedTime(file.toPath()).toMillis()}}, and all tests passed.

AMs set the timestamp using the value returned by {{FileStatus#getModificationTime()}}, in which case it will be consistent. So I don't think there is any problem with the production code as long as {{FileStatus#getModificationTime()}} is used.

As Steve mentioned, relying on modificationTime and length may not be a good idea for detecting changes.

Without this fix, data could be corrupted or modified within the same second without changing the length of the file, and the change would go undetected because the modification time loses its milliseconds part.

was (Author: vinayrpet):

Hi [~Jim_Brennan],

Thanks for pointing out the test failures.

AFAIK, the test failures occur because the test code explicitly sets the timestamp of {{LocalResource}} to the value returned by {{File.lastModified()}} for the script file used in the tests. As mentioned in this Jira's title, {{File.lastModified()}} is broken and loses accuracy. I tried replacing the {{File.lastModified()}} calls with {{Files.getLastModifiedTime(file.toPath()).toMillis()}}, and all tests passed.

AMs set the timestamp using the value returned by {{FileStatus#getModificationTime()}}, in which case it will be consistent. So I don't think there is any problem with the production code as long as {{FileStatus#getModificationTime()}} is used.

As Steve mentioned, relying on modificationTime and length may not be a good idea for detecting changes.
There could be possibilities of corruption

> RawLocalFileSystem's lastModifiedTime() looses milli seconds in JDK < 10.b09
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-17306
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17306
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> RawLocalFileSystem's FileStatus uses the {{File.lastModified()}} API from the JDK.
> This API loses milliseconds due to a JDK bug:
> [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8177809]
> The bug is fixed from JDK 10 b09 onwards but still exists in JDK 8, which is
> still used in many production deployments.
> {{Files.getLastModifiedTime()}} from Java's nio package returns the
> correct time.
> Use {{Files.getLastModifiedTime()}} instead of {{File.lastModified()}} as a
> workaround.
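
For reference, the workaround described in the issue can be sketched roughly as below. This is only an illustration of the two JDK calls being compared, not the actual Hadoop patch; the class name {{ModificationTimeSketch}} and the helper names {{legacyMillis}} and {{nioMillis}} are made up for this example.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical helper class, for illustration only; not part of Hadoop.
final class ModificationTimeSketch {
    private ModificationTimeSketch() {}

    // Legacy call. On JDK builds affected by JDK-8177809 the returned
    // value is truncated to whole seconds (the millisecond part is 000).
    static long legacyMillis(File file) {
        return file.lastModified();
    }

    // Workaround from the issue: read the timestamp through java.nio,
    // which keeps the millisecond part when the file system records it.
    static long nioMillis(File file) throws IOException {
        return Files.getLastModifiedTime(file.toPath()).toMillis();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("mtime-sketch", ".tmp");
        try {
            System.out.println("File.lastModified():         " + legacyMillis(file));
            System.out.println("Files.getLastModifiedTime(): " + nioMillis(file));
        } finally {
            Files.deleteIfExists(file.toPath());
        }
    }
}
```

On an affected JDK 8 build the first value should end in 000 while the second keeps the milliseconds; on a fixed JDK both values should match. Note that {{File.lastModified()}} returns 0 for a missing file, whereas {{Files.getLastModifiedTime()}} throws an {{IOException}}, so callers switching over need to handle that difference.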