Kradorex,

If you look at the CSV log linked from your monitor page, your server's time is excellent, but the monitoring station periodically gets no response at all, and each missed response costs your score 5 points. Is your server being DDoS'd, or is it using more bandwidth than it has been allocated? Your stress testing could be causing this as well. Can you ping your server continuously from an external host to check its reachability and catch any intermittent downtime?
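A plain ICMP ping shows whether the host is up, but the monitoring station scores NTP replies, so a probe that sends an actual NTP query may be closer to what it sees. The following is a minimal Python sketch of such a probe, not the pool's own monitoring code; the helper names (`build_request`, `parse_reply`, `probe`, `watch`), the 2-second timeout and the 10-second interval are all assumptions chosen for illustration.

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01


def build_request() -> bytes:
    """Build a 48-byte SNTP client request: LI=0, VN=3, Mode=3 (client)."""
    first_byte = (0 << 6) | (3 << 3) | 3
    return bytes([first_byte]) + bytes(47)


def parse_reply(data: bytes) -> dict:
    """Extract mode, stratum and transmit time (Unix seconds) from a reply."""
    if len(data) < 48:
        raise ValueError("short NTP packet")
    mode = data[0] & 0x07
    stratum = data[1]
    tx_sec, tx_frac = struct.unpack("!II", data[40:48])
    tx_unix = tx_sec - NTP_EPOCH_OFFSET + tx_frac / 2**32
    return {"mode": mode, "stratum": stratum, "tx_unix": tx_unix}


def probe(host: str, timeout: float = 2.0) -> bool:
    """Return True if host answers one NTP query within `timeout` seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_request(), (host, 123))
            data, _ = sock.recvfrom(512)
            return parse_reply(data)["mode"] == 4  # mode 4 = server reply
        except (OSError, ValueError):
            # Timeouts and resolution failures are OSError subclasses.
            return False


def watch(host: str, interval: float = 10.0) -> None:
    """Loop forever, printing a timestamped line for each unanswered query."""
    while True:
        if not probe(host):
            print(time.strftime("%Y-%m-%d %H:%M:%S"), "no NTP reply from", host)
        time.sleep(interval)
```

Run from a host outside your own network, `watch("72.38.129.202")` (the address in your score page URL) prints a line for each unanswered query. Comparing those timestamps with the -5 entries in the CSV log would show whether the drops are visible from elsewhere too.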
In other words, the problem isn't the path between your server and torix, or your server's ability to keep accurate time; it's that reachability between clients and your server is very poor. I would also check dmesg for Ethernet issues, IP conflicts, hardware issues, etc. When in doubt, reboot. If you do have to reboot it or take it offline, I like to avoid the times when the monitoring station takes a measurement (every ~16-18 minutes or so), just so my score doesn't take a -5 hit for an intentional reboot.

ts_epoch,ts,offset,step,score
1435181838,"2015-06-24 21:37:18",-0.0000635385513305664,1,-13
1435180826,"2015-06-24 21:20:26",,-5,-14.8
1435179831,"2015-06-24 21:03:51",-0.000896215438842773,1,-10.3
1435178869,"2015-06-24 20:47:49",0.000105142593383789,1,-11.9
1435177896,"2015-06-24 20:31:36",-0.000214815139770508,1,-13.6
1435176950,"2015-06-24 20:15:50",,-5,-15.3
1435175977,"2015-06-24 19:59:37",-0.000658512115478516,1,-10.9
1435175034,"2015-06-24 19:43:54",,-5,-12.5
1435174065,"2015-06-24 19:27:45",,-5,-7.9
1435173077,"2015-06-24 19:11:17",0.000123381614685059,1,-3
1435172121,"2015-06-24 18:55:21",-0.000849366188049316,1,-4.3
1435171132,"2015-06-24 18:38:52",-0.000703096389770508,1,-5.5
1435170180,"2015-06-24 18:23:00",,-5,-6.9
1435169217,"2015-06-24 18:06:57",-0.000395655632019043,1,-2

Best regards,
Mike

-----Original Message-----
From: pool [mailto:[email protected]] On Behalf Of Kradorex Xeron
Sent: June 24, 2015 1:58 PM
To: [email protected]
Subject: [Pool] Extremely strange monitoring behaviour

Hello,

I have been having extreme difficulty in identifying the cause of this fault, but monitoring seems to be acting extremely strange on my NTP instance:

http://www.pool.ntp.org/scores/72.38.129.202

I have done all kinds of evaluation on this end, including a lot of monitoring and stress-testing of network links to verify that the infrastructure is not at fault, but I am rather fuzzy on what would cause a zig-zag pattern rather than a direct fall for inaccuracy.
The issue seems to have begun after an update to ntpd [email protected] on Sunday. This instance has ~3 Stratum 1s to synchronize against; I have reduced the maxpoll to no avail.

# ntpq -c peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ntp1.torix.ca   .PPS.            1 u   25   64  377    9.612   -0.503   2.745
+ntp2.torix.ca   .PPS.            1 u   26   64  377    9.619   -0.129   2.284
+CLOCK.UREGINA.C .GPS.            1 u   25   64  377   57.845    0.283   1.022

If someone could give me some insight, that'd be great.

Thanks.

--
/s/ Kradorex Xeron <[email protected]>
Executive Director, Digibase Operations, Research and Development
<http://digibase.ca>
_______________________________________________
pool mailing list
[email protected]
http://lists.ntp.org/listinfo/pool
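As an aside on the zig-zag: the score log Mike quoted can be checked with a few lines of Python. The sketch below parses those 14 rows, counts the measurements with no offset (no response), and replays the scores under the assumption that each new score is 0.95 times the previous one plus the step. That recurrence is inferred from the numbers in the log, not taken from the pool's documentation, and the function names are illustrative.

```python
import csv
import io

# The 14 rows quoted in the reply, newest first, exactly as logged.
LOG = """ts_epoch,ts,offset,step,score
1435181838,"2015-06-24 21:37:18",-0.0000635385513305664,1,-13
1435180826,"2015-06-24 21:20:26",,-5,-14.8
1435179831,"2015-06-24 21:03:51",-0.000896215438842773,1,-10.3
1435178869,"2015-06-24 20:47:49",0.000105142593383789,1,-11.9
1435177896,"2015-06-24 20:31:36",-0.000214815139770508,1,-13.6
1435176950,"2015-06-24 20:15:50",,-5,-15.3
1435175977,"2015-06-24 19:59:37",-0.000658512115478516,1,-10.9
1435175034,"2015-06-24 19:43:54",,-5,-12.5
1435174065,"2015-06-24 19:27:45",,-5,-7.9
1435173077,"2015-06-24 19:11:17",0.000123381614685059,1,-3
1435172121,"2015-06-24 18:55:21",-0.000849366188049316,1,-4.3
1435171132,"2015-06-24 18:38:52",-0.000703096389770508,1,-5.5
1435170180,"2015-06-24 18:23:00",,-5,-6.9
1435169217,"2015-06-24 18:06:57",-0.000395655632019043,1,-2
"""


def load(text):
    """Parse the CSV and return the rows in chronological order."""
    rows = list(csv.DictReader(io.StringIO(text)))
    rows.reverse()  # the log lists the newest measurement first
    return rows


def summarize(rows):
    """Return (timestamps of missed checks, min gap, max gap) in minutes."""
    missed = [r["ts"] for r in rows if r["offset"] == ""]
    gaps = [int(b["ts_epoch"]) - int(a["ts_epoch"])
            for a, b in zip(rows, rows[1:])]
    return missed, min(gaps) / 60, max(gaps) / 60


def replay(rows, decay=0.95):
    """Worst difference between logged scores and decay * previous + step.

    The 0.95 decay factor is an assumption inferred from this log.
    """
    worst = 0.0
    for prev, cur in zip(rows, rows[1:]):
        predicted = decay * float(prev["score"]) + float(cur["step"])
        worst = max(worst, abs(predicted - float(cur["score"])))
    return worst


rows = load(LOG)
missed, gap_min, gap_max = summarize(rows)
print(len(missed), "of", len(rows), "checks got no response")
print("interval between checks: %.1f to %.1f minutes" % (gap_min, gap_max))
print("worst mismatch against assumed recurrence: %.3f" % replay(rows))
```

On this log, 5 of the 14 checks went unanswered, the checks are roughly 16 to 17 minutes apart, and the assumed recurrence matches every logged score to within rounding. That would account for the zig-zag: each miss costs 5 points at once, while each good reply earns back only 1 point plus a small decay toward zero.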
