[ 
https://issues.apache.org/jira/browse/HADOOP-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17228567#comment-17228567
 ] 

Vinayakumar B edited comment on HADOOP-17306 at 11/9/20, 1:14 PM:
------------------------------------------------------------------

Hi [~Jim_Brennan], thanks for pointing out the test failures.

AFAIK, the test failures occur because the test code explicitly sets the 
timestamp of the {{LocalResource}} to the value returned by 
{{File.lastModified()}} for the script file used in the tests. As mentioned in 
this Jira's title, {{File.lastModified()}} is broken and loses accuracy. When I 
replaced the {{File.lastModified()}} calls with 
{{Files.getLastModifiedTime(file.toPath()).toMillis()}}, all tests passed.
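
To make the difference between the two calls concrete, here is a minimal sketch that reads the same file's timestamp through both APIs. The class {{ModTimeCompare}} and its method names are hypothetical and are not taken from the Hadoop tests; on an affected JDK 8 build the legacy value is expected to come back truncated to whole seconds, while a fixed JDK should return the same value from both.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.attribute.FileTime;

// Hypothetical helper for illustration only; not part of Hadoop.
final class ModTimeCompare {

  /** Legacy API; affected JDKs (JDK-8177809) may drop the milliseconds. */
  static long legacyMillis(File file) {
    return file.lastModified();
  }

  /** NIO API, the replacement described in the comment above. */
  static long nioMillis(File file) {
    try {
      return Files.getLastModifiedTime(file.toPath()).toMillis();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /**
   * Creates a temporary file, stamps it with the given modification time,
   * and returns { legacy value, NIO value } for comparison.
   */
  static long[] sample(long mtimeMillis) {
    try {
      File file = File.createTempFile("script", ".sh");
      file.deleteOnExit();
      Files.setLastModifiedTime(file.toPath(), FileTime.fromMillis(mtimeMillis));
      return new long[] { legacyMillis(file), nioMillis(file) };
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Calling {{ModTimeCompare.sample(1604927640123L)}} returns both readings; the two should always agree to the second, and any difference in the milliseconds part depends on the JDK build and on the timestamp granularity of the underlying file system.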

AMs set the timestamp using the value returned by 
{{FileStatus#getModificationTime()}}, in which case it will be consistent. So I 
don't think there is any problem with the production code as long as 
{{FileStatus#getModificationTime()}} is used.

As Steve mentioned, relying on modificationTime and length may not be a good 
idea to detect changes.

Without this fix, corruption or modification of data that happens within the 
same second and does not change the length of the file will go undetected, 
because the modification time loses its milliseconds part.
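
The scenario can be sketched with a simple check on modification time and length. This is an illustrative model only: {{ChangeCheck}}, its methods, and the sample timestamps are assumptions, not Hadoop code, and the truncation is simulated rather than produced by the JDK bug.

```java
// Hypothetical model of a (modification time, length) change check.
final class ChangeCheck {

  /** Returns true when both recorded values match the current ones. */
  static boolean looksUnchanged(long recordedMtime, long recordedLength,
                                long currentMtime, long currentLength) {
    return recordedMtime == currentMtime && recordedLength == currentLength;
  }

  /** Simulates the precision loss: drops the milliseconds part. */
  static long truncateToSeconds(long millis) {
    return (millis / 1000) * 1000;
  }

  /** Two same-length writes within one second, compared at full precision. */
  static boolean missedWithMillis(long firstWrite, long secondWrite, long length) {
    return looksUnchanged(firstWrite, length, secondWrite, length);
  }

  /** The same two writes, compared after the milliseconds are dropped. */
  static boolean missedWithSecondsOnly(long firstWrite, long secondWrite, long length) {
    return looksUnchanged(truncateToSeconds(firstWrite), length,
                          truncateToSeconds(secondWrite), length);
  }
}
```

For two writes of 42 bytes at {{1604927640123}} and {{1604927640789}}, the full-precision check reports a change, while the seconds-only check treats the file as unchanged and so misses the second write.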


> RawLocalFileSystem's lastModifiedTime() looses milli seconds in JDK < 10.b09
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-17306
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17306
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> RawLocalFileSystem's FileStatus uses the {{File.lastModified()}} API from the JDK.
> This API loses milliseconds due to a JDK bug.
> [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8177809]
> This bug is fixed from JDK 10 b09 onwards but still exists in JDK 8, which is 
> still used in many production deployments.
> Apparently, {{Files.getLastModifiedTime()}} from Java's nio package returns 
> the correct time.
> Use {{Files.getLastModifiedTime()}} instead of {{File.lastModified()}} as a 
> workaround.


