I’ve had good success with this strategy: have the mons peer with each other, 
and perhaps have the OSD / other nodes chime against the mons too.  
Chrony is much better than ntpd for this.
With modern interval backoff / iburst there’s no reason not to have a robust 
set of peers.  
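
A rough sketch of what that layout might look like, as a small Python helper that renders a chrony.conf for each mon.  The hostnames, subnet and the function itself are made up for illustration; `server ... iburst`, `peer`, `local stratum ... orphan` and `allow` are ordinary chrony directives, but check them against the chrony version you run before copying anything.

```python
def render_mon_chrony_conf(this_mon, mons, upstreams, client_subnet):
    """Return chrony.conf text for one mon. All names passed in are hypothetical."""
    lines = []
    # Outside sources; iburst speeds up the initial sync.
    for host in upstreams:
        lines.append(f"server {host} iburst")
    # Peer "sideways" with the other mons so they keep pace with each other.
    for mon in mons:
        if mon != this_mon:
            lines.append(f"peer {mon}")
    # Keep serving time to clients even if every outside source is unreachable.
    lines.append("local stratum 10 orphan")
    # Let OSDs and other cluster nodes query this mon.
    lines.append(f"allow {client_subnet}")
    return "\n".join(lines) + "\n"


mons = ["mon1.example.internal", "mon2.example.internal", "mon3.example.internal"]
upstreams = ["ntp-a.example.internal", "ntp-b.example.internal"]
conf = render_mon_chrony_conf(mons[0], mons, upstreams, "10.0.0.0/24")
print(conf)
```

The OSD and client nodes would then just list the three mons as `server` lines.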

The public NTP pools rotate DNS on some period, so when quality / jitter 
varies a lot among the members of a given pool you can experience swings.  So 
depending on the scale of one’s organization, it often makes sense to have a 
set of internal stratum 2 servers that the rest of the fleet chimes against, 
which mesh among themselves and sync against both geo-local public servers and 
a few hand-picked quality *distant* servers.  Jitter matters more than latency 
AIUI.  
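
To illustrate the jitter-vs-latency point, here is a toy Python sketch that ranks candidate servers by how much their measured offset bounces around rather than by round-trip delay.  The hostnames and sample numbers are invented, and the standard deviation of offsets is only a simplified stand-in for the jitter statistics chronyd / ntpd actually compute.

```python
from statistics import mean, pstdev

# Hypothetical (offset_ms, delay_ms) samples per server; the numbers are invented.
samples = {
    "near.example.net": [(0.4, 2.0), (-3.1, 2.2), (2.8, 1.9), (-2.5, 2.1)],
    "far.example.net": [(0.3, 80.1), (0.5, 80.3), (0.2, 79.9), (0.4, 80.2)],
}


def summarize(points):
    """Reduce one server's samples to a jitter figure and a mean delay."""
    offsets = [offset for offset, _ in points]
    delays = [delay for _, delay in points]
    return {"jitter_ms": pstdev(offsets), "delay_ms": mean(delays)}


def rank_by_jitter(samples):
    """Return hostnames ordered from steadiest offset to noisiest."""
    stats = {host: summarize(points) for host, points in samples.items()}
    return sorted(stats, key=lambda host: stats[host]["jitter_ms"])


ranked = rank_by_jitter(samples)
print(ranked)
```

With these made-up numbers the distant but steady server ranks ahead of the nearby noisy one, which is the reason for hand-picking a few quality distant servers.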

Local stratum 1 servers are cool, though getting coax to a DC roof and an 
antenna mounted can be an expensive hassle.  

Success depends on having a variety of time sources, so that it doesn’t all go 
to hell when some specific server goes weird or disappears, both of which 
happen.  E.g., if there’s a window with sky access, even in an office area, add 
a couple of these (or similar) to the mix as a source for the workhorse server 
stratum:

https://www.netburner.com/products/network-time-server/pk70-ex-ntp-network-time-server/#

Not a DC grade item, or a sole solution, but the bang for the buck is 
unbeatable.  


Unless things have changed in the last few years, don’t run NTP servers on VMs.  
Some network gear can run a server, but be careful with the load it presents 
and how many clients it can support without impacting its primary role.  


> On Dec 8, 2021, at 12:14 AM, Janne Johansson <[email protected]> wrote:
> 
> Den ons 8 dec. 2021 kl 02:35 skrev mhnx <[email protected]>:
>> I've been building Ceph clusters since 2014 and the most annoying and
>> worst failure is the NTP server faults and having different times on
>> Ceph nodes.
>> 
>> I've fixed few clusters because of the ntp failure.
>> - Sometimes NTP servers can be unavailable,
>> - Sometimes NTP servers can go crazy.
>> - Sometimes NTP servers can respond but systemd-timesyncd can not sync
>> the time without manual help.
>> 
>> I don't want to deal with another ntp problem and because of that I've
>> decided to build internal ntp servers for the cluster.
>> 
>> I'm thinking of creating 3 NTP servers on the 3 monitor nodes to get
>> an internal ntp server cluster.
>> I will use the internal NTP cluster for the OSD nodes and other services.
>> With this way, I believe that I'll always have a stable and fast time server.
> 
> We do something like this. mons gather "calendar time" from outside
> ntp servers, but also peer against eachother, so if/when they drift
> away the mons drift away equal amounts, then all OSDs/RGWs and ceph
> clients pull time from the mons who serve internal ntp based on their
> idea of what time it is.
> 
> Not using systemd, but both chronyd and ntpd allow you to set peers
> for which you sync "sideways" just to keep the pace in-between hosts.
> 
> -- 
> May the most significant bit of your life be positive.
> _______________________________________________
> ceph-users mailing list -- [email protected]
> To unsubscribe send an email to [email protected]