[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408648#comment-13408648
 ] 

Uma Maheswara Rao G commented on BOOKKEEPER-327:
------------------------------------------------

Update:

We have seen this issue again. I am not sure about the first occurrence, 
but with nanoTime we have seen this negative value in one of our BK clusters.

We have just run a small sample test with this elapsed time calculation.

Elapsed Time: -4294965969
Elapsed Time: -4287444203
Elapsed Time: -4287387003
Elapsed Time: -4286339034
Elapsed Time: -4287371149
Elapsed Time: -4287349218
Elapsed Time: -4287274419
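
The sample test itself is not shown here. A minimal sketch of what such a check could look like, assuming it simply subtracts two consecutive System.nanoTime() readings; the class and method names are hypothetical:

```java
// Hypothetical reproduction sketch; not the test that produced the output above.
final class ElapsedTimeCheck {

    // Same subtraction as a latency calculation: the result is negative
    // when the reading taken later is smaller than the earlier one.
    static long elapsedNanos(long startNanos, long endNanos) {
        return endNanos - startNanos;
    }

    // Takes pairs of System.nanoTime() readings, prints every pair whose
    // difference is negative, and returns how many such pairs were seen.
    static int countNegative(int iterations) {
        int negatives = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            long end = System.nanoTime();
            long elapsed = elapsedNanos(start, end);
            if (elapsed < 0) {
                negatives++;
                System.out.println("Elapsed Time: " + elapsed);
            }
        }
        return negatives;
    }

    public static void main(String[] args) {
        System.out.println("Negative samples: " + countNegative(1_000_000));
    }
}
```

Running this on the affected machine alongside the other processes would show whether negative differences appear outside BookKeeper as well.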


We have an 8-CPU machine running the BK cluster, with many other processes running on it.

Also, I have seen some posts about similar experiences:
http://stackoverflow.com/questions/510462/is-system-nanotime-completely-useless

It seems System.currentTimeMillis can also run 'backwards', even in the 
absence of clock adjustments.

My conclusion is that we should not depend on this time value where we are 
doing sensitive operations. There should be a way to recover even if the 
time difference value goes slightly wrong.

What I am thinking here is:

{code}
if (latency < 0) {
    // Less than 0 ms. Ideally this should not happen, but we have
    // seen negative latency in some cases.
    LOG.warn("Latency time coming negative");
    bucket = 0;
}
{code}

We will reserve the 0th bucket for values like this that come in with 
negative latency.
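
To illustrate the idea, here is a minimal sketch of a bucket selection with the 0th bucket reserved for negative values. The bucket layout (10 ms steps below 100 ms, one overflow bucket) and all names are assumptions for illustration, not the actual BKStats code:

```java
// Hypothetical bucket layout for illustration; not the actual BKStats code.
final class LatencyBuckets {

    // Bucket 0: negative latency.
    // Buckets 1..10: 0-99 ms in 10 ms steps.
    // Bucket 11: everything from 100 ms upwards.
    static final int NUM_BUCKETS = 12;

    // Maps a latency in milliseconds to a bucket index that is always
    // within [0, NUM_BUCKETS - 1], even when the latency is negative.
    static int bucketFor(long latencyMillis) {
        if (latencyMillis < 0) {
            return 0;
        }
        if (latencyMillis < 100) {
            return 1 + (int) (latencyMillis / 10);
        }
        return NUM_BUCKETS - 1;
    }

    // Increments the counter of the bucket that the latency falls into.
    static void record(long[] counts, long latencyMillis) {
        counts[bucketFor(latencyMillis)]++;
    }
}
```

With this layout a negative difference is counted in bucket 0 instead of being used as a negative array index.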

From 1 + (int) (latency / 10); --> 1 * 9 + (int) (latency / 100); can be for 
less than 100ms latency metrics.

Can you check the impact of this proposal? I have not tried Statistics in 
any of our clusters until now. I have temporarily disabled statistics in our 
cluster until this issue is solved.
                
> System.currentTimeMillis usage in BookKeeper
> --------------------------------------------
>
>                 Key: BOOKKEEPER-327
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-327
>             Project: Bookkeeper
>          Issue Type: Bug
>    Affects Versions: 4.0.0, 4.1.0
>            Reporter: Rakesh R
>            Priority: Minor
>         Attachments: BOOKKEEPER-327.patch
>
>
> The following exception occurred in the bookie statistics logic due to 
> system time changes. Our bookie cluster runs a periodic sync script to 
> unify the system time across all the machines. This causes the problem and 
> results in an ArrayIndexOutOfBoundsException.
> {code}
> Exception in thread "BookieJournal-3181" 
> java.lang.ArrayIndexOutOfBoundsException: -423
> at org.apache.bookkeeper.proto.BKStats$OpStats.updateLatency(BKStats.java:126)
> at org.apache.bookkeeper.proto.BookieServer.writeComplete(BookieServer.java:655)
> at org.apache.bookkeeper.bookie.Journal.run(Journal.java:507)
> {code}
> This jira is raised to discuss whether to use ??System.nanoTime()?? instead 
> of ??System.currentTimeMillis()??


        
