[ 
https://issues.apache.org/jira/browse/ARTEMIS-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729028#comment-16729028
 ] 

ASF GitHub Bot commented on ARTEMIS-2213:
-----------------------------------------

Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2481
  
    I will be able to look into that in the next few days :) 
    In the meantime, could you collect time-to-safepoint/GC pause data if 
possible, e.g. with -XX:+PrintGCApplicationStoppedTime?
    2 minutes seems far too long TBH, but it is worth taking a look if you rely 
on the MAPPED journal and/or paging a lot, given that major page faults can 
cause long stalls similar to a very long full GC, though TBH nothing that long 
(2 minutes is a lot!).
    As an additional suggestion, you could run a Java program that uses just one 
core and measures the elapsed time between 2 consecutive nanoTime calls, 
recording the wall-clock time at which a backward "drift" happens, so you can 
check whether a broker shutdown happened around the same time. Does that make 
sense?
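
    A minimal sketch of such a program could look like the one below. The class 
name NanoTimeDriftCheck, the Anomaly type, the 100 ms stall threshold and the 
default iteration count are all made up for illustration and are not part of 
Artemis. It also flags unusually large forward jumps, which is an addition to 
what was suggested above.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical diagnostic helper; not part of Artemis.
class NanoTimeDriftCheck {

    /** One suspicious observation: when it was seen and how large the jump was. */
    static final class Anomaly {
        final Instant wallClock;
        final long deltaNanos;

        Anomaly(Instant wallClock, long deltaNanos) {
            this.wallClock = wallClock;
            this.deltaNanos = deltaNanos;
        }

        @Override
        public String toString() {
            return wallClock + " delta=" + deltaNanos + "ns";
        }
    }

    /**
     * Reads the clock in a tight single-threaded loop and records every pair of
     * consecutive readings whose difference is negative (backward drift) or
     * larger than stallThresholdNanos (a long stall or forward jump).
     */
    static List<Anomaly> sample(LongSupplier clock, long iterations, long stallThresholdNanos) {
        List<Anomaly> anomalies = new ArrayList<>();
        long previous = clock.getAsLong();
        for (long i = 0; i < iterations; i++) {
            long now = clock.getAsLong();
            long delta = now - previous; // subtraction stays correct even if nanoTime overflows
            if (delta < 0 || delta > stallThresholdNanos) {
                // Wall-clock timestamp, to be correlated with the broker logs.
                anomalies.add(new Anomaly(Instant.now(), delta));
            }
            previous = now;
        }
        return anomalies;
    }

    public static void main(String[] args) {
        // Assumed defaults; pass a larger iteration count for a longer run.
        long iterations = args.length > 0 ? Long.parseLong(args[0]) : 10_000_000L;
        long stallThresholdNanos = TimeUnit.MILLISECONDS.toNanos(100);
        for (Anomaly anomaly : sample(System::nanoTime, iterations, stallThresholdNanos)) {
            System.out.println(anomaly);
        }
    }
}
```

    The loop itself is single-threaded; keeping it on one core would have to be 
done at the OS level (for example with taskset on Linux). Any printed timestamps 
can then be compared with the times of the broker halts.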



> Clock drift causing server halt
> -------------------------------
>
>                 Key: ARTEMIS-2213
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2213
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.6.3
>            Reporter: yangwei
>            Priority: Critical
>
> In our production cluster some brokers crashed. There is nothing unusual in 
> the stack dump. After digging into the code, we found that a component was 
> incorrectly expired. When the clock drifted back, the leave time was less than 
> the enter time. If the component was not entered within the default 120000ms, 
> it would be expired and the server was halted.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
