Kradorex,

If you look at the CSV log link from your monitor page, your server's time is
excellent, but the monitoring station periodically gets no response, and each
missed response lowers your score by 5.  Is your server being DDoS'd, or using
more bandwidth than it's allocated?  Your stress testing could be causing this
as well.  Can you ping your server continuously from an external host to check
its reachability and look for intermittent downtime?
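
A rough sketch of such a check, in case it helps.  Instead of ICMP ping it
sends a plain NTP client request over UDP, on the assumption that this is
closer to what the monitoring station actually tests.  The function names,
the 2 second timeout and the 10 second interval are my own illustrative
choices, not anything the pool uses.

```python
import socket
import time

NTP_PORT = 123


def build_request():
    """48-byte NTP client request: LI=0, VN=3, Mode=3 (client) gives 0x1b."""
    return b"\x1b" + 47 * b"\x00"


def probe(host, timeout=2.0):
    """Return the round-trip time in seconds, or None if no valid reply arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(build_request(), (host, NTP_PORT))
            data, _ = sock.recvfrom(512)
        except OSError:
            # Covers timeouts, name resolution failures and ICMP errors.
            return None
        # A usable reply is at least 48 bytes and has Mode=4 (server).
        if len(data) < 48 or (data[0] & 0x07) != 4:
            return None
        return time.monotonic() - start


def watch(host, interval=10.0, count=None):
    """Probe repeatedly, one line per attempt; count=None runs until interrupted."""
    sent = 0
    while count is None or sent < count:
        rtt = probe(host)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
        status = "no reply" if rtt is None else f"reply in {rtt * 1000:.1f} ms"
        print(f"{stamp} {host} {status}")
        sent += 1
        time.sleep(interval)
```

Calling watch("72.38.129.202") from a machine outside your network and
leaving it running for an hour or two should show whether the "no reply"
lines line up with the -5 steps in the log.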

In other words, the problem isn't between your server and torix, or with your
server's ability to keep accurate time; it's that reachability between clients
and your server is very poor.  I would also check dmesg for any Ethernet
issues, IP conflicts, hardware issues, etc.  When in doubt, reboot.  If you do
have to reboot it or take it offline, I like to avoid the times when the
monitoring station will take a measurement, which is every ~16-18 minutes or
so, just so my score doesn't take a -5 hit for an intentional reboot.

ts_epoch,ts,offset,step,score
1435181838,"2015-06-24 21:37:18",-0.0000635385513305664,1,-13
1435180826,"2015-06-24 21:20:26",,-5,-14.8
1435179831,"2015-06-24 21:03:51",-0.000896215438842773,1,-10.3
1435178869,"2015-06-24 20:47:49",0.000105142593383789,1,-11.9
1435177896,"2015-06-24 20:31:36",-0.000214815139770508,1,-13.6
1435176950,"2015-06-24 20:15:50",,-5,-15.3
1435175977,"2015-06-24 19:59:37",-0.000658512115478516,1,-10.9
1435175034,"2015-06-24 19:43:54",,-5,-12.5
1435174065,"2015-06-24 19:27:45",,-5,-7.9
1435173077,"2015-06-24 19:11:17",0.000123381614685059,1,-3
1435172121,"2015-06-24 18:55:21",-0.000849366188049316,1,-4.3
1435171132,"2015-06-24 18:38:52",-0.000703096389770508,1,-5.5
1435170180,"2015-06-24 18:23:00",,-5,-6.9
1435169217,"2015-06-24 18:06:57",-0.000395655632019043,1,-2
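
To put numbers on the excerpt above, here is a small sketch that reads the
CSV text, counts the checks with an empty offset (the ones that got no
response) and measures the spacing between checks.  The summarize() helper
and its result keys are hypothetical names of mine; the rows are copied
straight from the log above.

```python
import csv
import io

LOG = """\
ts_epoch,ts,offset,step,score
1435181838,"2015-06-24 21:37:18",-0.0000635385513305664,1,-13
1435180826,"2015-06-24 21:20:26",,-5,-14.8
1435179831,"2015-06-24 21:03:51",-0.000896215438842773,1,-10.3
1435178869,"2015-06-24 20:47:49",0.000105142593383789,1,-11.9
1435177896,"2015-06-24 20:31:36",-0.000214815139770508,1,-13.6
1435176950,"2015-06-24 20:15:50",,-5,-15.3
1435175977,"2015-06-24 19:59:37",-0.000658512115478516,1,-10.9
1435175034,"2015-06-24 19:43:54",,-5,-12.5
1435174065,"2015-06-24 19:27:45",,-5,-7.9
1435173077,"2015-06-24 19:11:17",0.000123381614685059,1,-3
1435172121,"2015-06-24 18:55:21",-0.000849366188049316,1,-4.3
1435171132,"2015-06-24 18:38:52",-0.000703096389770508,1,-5.5
1435170180,"2015-06-24 18:23:00",,-5,-6.9
1435169217,"2015-06-24 18:06:57",-0.000395655632019043,1,-2
"""


def summarize(log_text):
    """Count missed checks and measure the gaps between checks, in seconds."""
    rows = list(csv.DictReader(io.StringIO(log_text)))
    # An empty offset column means the monitor received no reply.
    missed = [row for row in rows if row["offset"] == ""]
    epochs = sorted(int(row["ts_epoch"]) for row in rows)
    gaps = [later - earlier for earlier, later in zip(epochs, epochs[1:])]
    return {
        "checks": len(rows),
        "missed": len(missed),
        "min_gap_s": min(gaps),
        "max_gap_s": max(gaps),
    }


summary = summarize(LOG)
print(summary)
```

For these rows, 5 of the 14 checks got no response, and the checks are
between 943 and 1012 seconds apart.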

Best regards,
Mike

-----Original Message-----
From: pool [mailto:[email protected]] On Behalf Of 
Kradorex Xeron
Sent: June 24, 2015 1:58 PM
To: [email protected]
Subject: [Pool] Extremely strange monitoring behaviour

Hello,

I have been having extreme difficulty in identifying the cause of this 
fault, but monitoring seems to be acting extremely strange on my NTP 
instance:

http://www.pool.ntp.org/scores/72.38.129.202

I have done all kinds of evaluation on this end, including a lot of
monitoring and stress-testing of network links to verify that the
infrastructure is not at fault, but I am rather fuzzy on what would cause a
zig-zag pattern rather than a direct fall for inaccuracy. The issue seems to
have begun after an update to ntpd [email protected] on Sunday.

This instance has ~3 Stratum 1s to synchronize against; I have reduced
the maxpoll to no avail.

# ntpq -c peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ntp1.torix.ca   .PPS.            1 u   25   64  377    9.612   -0.503   2.745
+ntp2.torix.ca   .PPS.            1 u   26   64  377    9.619   -0.129   2.284
+CLOCK.UREGINA.C .GPS.            1 u   25   64  377   57.845    0.283   1.022

If someone could give me some insight, that'd be great.

Thanks.

-- 
/s/
Kradorex Xeron <[email protected]>
Executive Director,
Digibase Operations, Research and Development <http://digibase.ca>
_______________________________________________
pool mailing list
[email protected]
http://lists.ntp.org/listinfo/pool
