[ceph-users] clock skew

mj Thu, 25 Apr 2019 03:34:20 -0700

Hi all,

On our three-node cluster, we have set up chrony for time sync. Even though chrony reports that it is synced to NTP time, ceph occasionally reports time skews that can last several hours.


See for example:

root@ceph2:~# ceph -v
ceph version 12.2.10 (fc2b1783e3727b66315cc667af9d663d30fe7ed4) luminous (stable)
root@ceph2:~# ceph health detail
HEALTH_WARN clock skew detected on mon.1
MON_CLOCK_SKEW clock skew detected on mon.1
    mon.1 addr 10.10.89.2:6789/0 clock skew 0.506374s > max 0.5s (latency 0.000591877s)
root@ceph2:~# chronyc tracking
Reference ID    : 7F7F0101 ()
Stratum         : 10
Ref time (UTC)  : Wed Apr 24 19:05:28 2019
System time     : 0.000000133 seconds slow of NTP time
Last offset     : -0.003333524 seconds
RMS offset      : 0.003333524 seconds
Frequency       : 12.641 ppm slow
Residual freq   : +0.000 ppm
Skew            : 0.000 ppm
Root delay      : 0.000000 seconds
Root dispersion : 0.000000 seconds
Update interval : 1.4 seconds
Leap status     : Normal

root@ceph2:~#

For the record: mon.1 = ceph2 = 10.10.89.2, and time is synced similarly with NTP on the two other nodes.


We don't understand this...

I have now injected mon_clock_drift_allowed 0.7, so at least we have HEALTH_OK again (to stop upsetting my monitoring system).
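
For anyone reproducing this: a runtime change of this kind is usually made with injectargs. The following is only a minimal sketch, assuming the standard "ceph tell ... injectargs" syntax and that all monitors should receive the new value. The function name set_mon_drift_allowed is hypothetical, and the exact command used for the injection above is not shown in this post.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative, not part of ceph):
# raise the monitors' allowed clock drift at runtime.
# Assumes the ceph CLI and an admin keyring are available on this node.
set_mon_drift_allowed() {
    # $1 = new threshold in seconds, e.g. 0.7
    ceph tell 'mon.*' injectargs "--mon_clock_drift_allowed $1"
}
```

Calling "set_mon_drift_allowed 0.7" would correspond to the value mentioned above. Values injected this way apply to the running daemons only and are not written to ceph.conf.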


But two questions:

- Can anyone explain why this is happening? It looks as if ceph and NTP/chrony disagree on just how time-synced the servers are.

- How can I determine the current clock skew from ceph's perspective? "ceph health detail" does not show it in case of HEALTH_OK. (I want to start monitoring it continuously, to see if I can find some sort of pattern.)
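
As a starting point for the monitoring idea, the skew value can at least be pulled out of the warning line shown above. This is a rough sketch, assuming the "ceph health detail" line format from the transcript in this post; the function name parse_clock_skew is hypothetical. It only yields data while the warning is raised, so it does not yet answer the HEALTH_OK case.

```shell
#!/bin/sh
# Hypothetical helper: read "ceph health detail" output on stdin and print
# "<mon> <skew_seconds>" for every monitor flagged for clock skew.
# Assumes the line format shown in this post:
#   mon.1 addr 10.10.89.2:6789/0 clock skew 0.506374s > max 0.5s (latency ...
parse_clock_skew() {
    # Keep only lines that start with a monitor name and carry a skew value.
    sed -n 's/^[[:space:]]*\(mon\.[^[:space:]]*\) addr .* clock skew \([0-9.]*\)s.*/\1 \2/p'
}
```

Usage would be "ceph health detail | parse_clock_skew", which prints "mon.1 0.506374" for the output pasted above and nothing when no skew warning is present.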


Thanks!

MJ