Re: [ntp:questions] ntpd wedged again

A C Sat, 11 Feb 2012 11:04:07 -0800

On 2/11/2012 06:51, Dave Hart wrote:

On Sat, Feb 11, 2012 at 09:21, A C<[email protected]>  wrote:

So ntpd has been behaving reasonably well with the snprintf fix.  I had good
results with only internet servers.  My PPS and SHM refclocks were set to
noselect.


I removed the noselect on the PPS refclock and left flag3 set to zero (no
kernel discipline).

Everything seemed fine and then:

Sat Feb 11 01:12:10 PST 2012
     remote           refid      st t when poll reach   delay   offset   jitter
==============================================================================
x127.127.22.0    .PPS.            0 l    -   16  377    0.000  -111.40  351.464
 127.127.28.0    .GPSD.           4 l   49  128  377    0.000  -14655.  2814.64
 169.229.70.201  169.229.128.214  3 u  103  512  377   39.347  -9274.2  6597.61
 72.14.179.211   127.67.113.92    2 u   79  512  377   57.746  -14699.  10685.0
 24.124.0.251    132.236.56.250   3 u  521  512  377   77.930  -9835.0  7451.10
 130.207.165.28  130.207.244.240  2 u  153  512  377   79.131  -9155.6  6554.15
 131.144.4.10    130.207.244.240  2 u  142  512  377   86.537  -9102.3  6526.3
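
For anyone reading the billboard above by eye: the first character of each row is the tally code ('x' marks a source discarded as a falseticker), and the delay, offset and jitter columns are in milliseconds. A minimal Python sketch of pulling those fields out follows. The function name parse_peer_row is made up for illustration, the three sample rows are copied from the output above with their wrapped jitter values rejoined, and the sketch assumes the ten-column layout shown here.

```python
# Tally codes that ntpq may print in the first column of a peer row.
TALLY_CODES = set(" x.-+#*o")

# Three rows copied from the billboard above (wrapped jitter values rejoined).
ROWS = [
    "x127.127.22.0    .PPS.            0 l    -   16  377    0.000  -111.40  351.464",
    " 127.127.28.0    .GPSD.           4 l   49  128  377    0.000  -14655.  2814.64",
    " 169.229.70.201  169.229.128.214  3 u  103  512  377   39.347  -9274.2  6597.61",
]


def parse_peer_row(row):
    """Return (tally, remote, offset_ms, jitter_ms) for one peer row.

    Hypothetical helper: assumes the ten whitespace-separated columns
    remote, refid, st, t, when, poll, reach, delay, offset, jitter.
    """
    if row and row[0] in TALLY_CODES:
        tally, rest = row[0], row[1:]
    else:
        # No tally column present (e.g. leading space stripped by a mailer).
        tally, rest = " ", row
    fields = rest.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}")
    return tally, fields[0], float(fields[8]), float(fields[9])


for row in ROWS:
    tally, remote, offset_ms, jitter_ms = parse_peer_row(row)
    status = "falseticker" if tally == "x" else "not flagged"
    print(f"{remote:<16} offset {offset_ms / 1000:8.3f} s  ({status})")
```

Read this way, the PPS refclock is the only source flagged 'x', and it reports an offset of about -0.1 s while the SHM refclock and the internet servers all report offsets between roughly -9 and -15 s.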


Did you forget to mention you commented out the NMEA refclock at the
same time you removed noselect from the atom/PPS and SHM drivers?

I am a bit tired right now, so forgive me for latching onto a nit
rather than the juicy part, but I want to be as clear as possible.
You say everything was fine until you made some changes, without
specifying the previous state, and when I try to infer what that
earlier state was based on the two changes, I'm left with a setup with
no refclocks, which is obviously not particularly comparable.  I'm
also hesitating to point a finger at the gpsd+SHM combo, particularly
because I suspect it's racy especially on non-x86 systems and have on
my to-do list rewriting it to use a safer shared memory access
protocol...

So first, let's be clear about what you're reporting.  Was the change
from 3 refclock drivers with 2 marked noselect to 2 selectable
drivers?

No problem. SHM has been disabled by noselect for a while. It is still currently disabled by noselect (but not commented out so I can still observe its relative offset). During the snprintf testing from this week, ATOM has also been disabled by noselect (also so I could continue to observe its relative offset) so I was left with only the internet servers (five total) as my time sources.

For an entire week I ran with ATOM and SHM in noselect and things looked fine. Offsets for all internet servers settled down to 1-2ms and the reported ATOM offset also stayed in that same range without straying away (again, this is reported offset but the clock wasn't being used because it was still noselect).

I removed the noselect from ATOM only (not SHM) so now I had the internet servers (five) plus ATOM. Everything looked fine for a few hours after I restarted ntpd with ATOM enabled again (allowed to be selected). But after a few hours, the clock went crazy and started slewing very quickly. When I restarted ntpd, it had to step the clock backwards by 16.6 seconds to bring it into agreement. The clock gained 16 seconds in a matter of about 5 minutes (the amount of time I let ntpd run in this crazy state).
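
To put that rate in perspective, here is a rough calculation in Python. It assumes the 16.6 second step and the roughly 5 minute window are both accurate and that the whole error accumulated inside that window; the function name drift_ppm is made up for illustration. The 500 ppm figure is ntpd's documented maximum slew rate.

```python
# ntpd's documented maximum slew rate, in parts per million.
NTPD_MAX_SLEW_PPM = 500.0


def drift_ppm(gained_seconds, elapsed_seconds):
    """Average clock rate error in ppm over the given interval."""
    return gained_seconds / elapsed_seconds * 1e6


elapsed = 5 * 60                      # about 5 minutes, as estimated above
observed = drift_ppm(16.6, elapsed)   # 16.6 s is the step ntpd applied on restart

# The most a 500 ppm slew could move the clock in the same interval.
max_slew_seconds = NTPD_MAX_SLEW_PPM * 1e-6 * elapsed

print(f"observed rate error : {observed:.0f} ppm")
print(f"ratio to 500 ppm    : {observed / NTPD_MAX_SLEW_PPM:.0f}x")
print(f"500 ppm over 5 min  : {max_slew_seconds:.2f} s")
```

If those two numbers are about right, the clock was running fast by tens of thousands of ppm, far more than a 500 ppm slew could produce in 5 minutes, so this does not look like ordinary slewing toward a bad source.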


_______________________________________________
questions mailing list
[email protected]
http://lists.ntp.org/listinfo/questions
