[
https://issues.apache.org/jira/browse/HDFS-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500757#comment-14500757
]
Arpit Agarwal commented on HDFS-8163:
-------------------------------------
While working on more tests I found some more issues with the timestamp usage.
The
[System.nanoTime|https://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime()]
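docs state that it can return a negative value and that the value can overflow
between successive invocations. So two values should never be compared directly
(t1 < t0); instead the sign of their difference should be checked (t1 - t0 < 0),
which stays correct across overflow.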
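To illustrate the point, here is a minimal sketch of the difference-based comparison. The class and method names (NanoTimeCompare, isBefore, elapsedNanos) are hypothetical and not part of Hadoop or any patch; the sketch assumes the two timestamps are less than about 292 years apart, which is the limit the System.nanoTime docs give.

```java
// Hypothetical helper, for illustration only; not part of Hadoop.
final class NanoTimeCompare {
    private NanoTimeCompare() {}

    // True if timestamp a was taken before timestamp b.
    // Uses the sign of the difference, so it stays correct when the
    // counter wraps around, as long as a and b are < 2^63 ns apart.
    static boolean isBefore(long a, long b) {
        return a - b < 0;
    }

    // Elapsed nanoseconds from start to end; wraparound cancels out
    // in two's-complement subtraction.
    static long elapsedNanos(long start, long end) {
        return end - start;
    }

    public static void main(String[] args) {
        long start = Long.MAX_VALUE - 10; // just before the counter wraps
        long end = start + 20;            // wraps to a negative value

        System.out.println(end < start);              // true: direct comparison misorders the values
        System.out.println(isBefore(start, end));     // true: difference-based comparison is correct
        System.out.println(elapsedNanos(start, end)); // 20
    }
}
```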
My guess is that negative values/overflow are unlikely on the platforms we care
about but we should be handling them correctly anyway. I plan to split out the
timestamp handling logic of BPServiceActor into a separate utility class for
clarity and ease of unit testing. Will post an updated patch later today.
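As a rough idea of what such a utility class could look like, here is a sketch. It is not the actual patch: the class name BlockReportTimer, its methods, and the injected LongSupplier clock are all assumptions made for illustration. It stores the next due time relative to the current clock value instead of relying on a literal 0, and it only ever looks at differences of timestamps.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch, not the HDFS-8163 patch.
final class BlockReportTimer {
    private final LongSupplier clock; // monotonic clock in ms, e.g. Time::monotonicNow
    private final long intervalMs;    // e.g. dnConf.blockReportInterval
    private long nextReportTime;      // only meaningful relative to clock values

    BlockReportTimer(LongSupplier clock, long intervalMs) {
        this.clock = clock;
        this.intervalMs = intervalMs;
        this.nextReportTime = clock.getAsLong() + intervalMs;
    }

    // Difference-based check: works for small, negative, or wrapped clock values.
    boolean isDue() {
        return clock.getAsLong() - nextReportTime >= 0;
    }

    // Call after a report has been sent to schedule the next one.
    void markSent() {
        nextReportTime = clock.getAsLong() + intervalMs;
    }

    // Test hook: make the next report due immediately, without assuming
    // anything about the absolute value of the clock.
    void triggerNow() {
        nextReportTime = clock.getAsLong();
    }
}
```

Because the clock is injected, a unit test can pass a fake clock that starts at a small or negative value to simulate a recently restarted machine.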
> Using monotonicNow for block report scheduling causes test failures on
> recently restarted systems
> -------------------------------------------------------------------------------------------------
>
> Key: HDFS-8163
> URL: https://issues.apache.org/jira/browse/HDFS-8163
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 2.6.1
> Reporter: Arpit Agarwal
> Assignee: Arpit Agarwal
> Priority: Blocker
> Attachments: HDFS-8163.01.patch, HDFS-8163.02.patch
>
>
> {{BPServiceActor#blockReport}} has the following check:
> {code}
> List<DatanodeCommand> blockReport() throws IOException {
> // send block report if timer has expired.
> final long startTime = monotonicNow();
> if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
> return null;
> }
> {code}
> Many tests trigger an immediate block report via
> {{BPServiceActor#triggerBlockReportForTests}} which sets {{lastBlockReport =
> 0}}. However, if the machine was restarted recently, {{startTime}} may be less
> than {{dnConf.blockReportInterval}}, so the check above still returns early
> and the block report is not sent.
> {{Time#monotonicNow}} uses {{System#nanoTime}} which represents time elapsed
> since an arbitrary origin. The time should be used only for comparison with
> other values returned by {{System#nanoTime}}.