Re: [ntp:questions] Is it possible to confuse ntpd's freq error measurement procedure?

2012-06-07 Thread Paul Malishev
Thanks Dave.

I have some realtime processes on this server and 128ms is too much for
stepping.
But thanks for the hint; now I have a starting point to investigate.

2012/6/7 Dave Hart h...@ntp.org

 On Wed, Jun 6, 2012 at 6:07 PM, Paul Malishev p.malis...@gmail.com
 wrote:
  Hello.
 
  I have two ntpd peers which exchange time between themselves and also
  receive time from external server.
  I believe that at some moment connection to external server was lost and
  time on these two peers drifted a bit.
 
  When connection to external server was restored both ntpd on both peers
  logged something like:
  Jun  5 13:21:09 peer0 ntpd[5052]: frequency error 18158 PPM exceeds
  tolerance 500 PPM
 
  After that there were a lot of messages with not so big freq error:
  Jun  5 13:23:18 DIG ntpd[5052]: frequency error 608 PPM exceeds tolerance
  500 PPM
 
  When an operator saw time difference with external server about 30sec he
  just restarted ntpd on both nodes and surprisingly freq error messages
  disappeared. Now difference is about 1ms and stability stays about 0.021
 
  So my question is: is it possible to confuse ntpd's freq error
 measurement
  with some wrong settings?

 ntpd measures the frequency error directly only at startup, lacking a
 driftfile.  While it's operating, it's not measuring the frequency
 error, but rather manipulating it to steer the clock offset toward
 zero.  That manipulation is capped at 500 parts per million relative
 to the nominal clock rate.

  My config is:
  tinker step 0

 You've told ntpd to never step the clock to correct it (by default, it
 is stepped when the offset exceeds 128ms).  So instead, ntpd must
 eliminate the offset slowly by running the clock faster or slower (but
 not more than 500 PPM faster or slower).  The large frequency error in the
 messages is a direct result of the perceived local clock offset.

 Likely restarting ntpd also invoked ntpdate, which stepped the clock
 so it was close enough that after restarting, ntpd's rate adjustments
 were well under 500 PPM.

 Cheers,
 Dave Hart

___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions


Re: [ntp:questions] Is it possible to confuse ntpd's freq error measurement procedure?

2012-06-07 Thread Paul Malishev
Oh. Thanks.

This true flag may be the root cause of the problem, along with slew
adjusting instead of stepping. Thank you, I'll try to investigate this
problem further.

2012/6/7 E-Mail Sent to this address will be added to the BlackLists 
Null@blacklist.anitech-systems.invalid

 Paul Malishev wrote:
  I have two ntpd peers which exchange time between
   themselves and also receive time from external server.
  I believe that at some moment connection to external
   server was lost and time on these two peers drifted a bit.
 
  When connection to external server was restored both ntpd
   on both peers logged something like:
  Jun  5 13:21:09 peer0 ntpd[5052]:
   frequency error 18158 PPM exceeds tolerance 500 PPM
 
  After that there were a lot of messages with not so big freq error:
  Jun  5 13:23:18 DIG ntpd[5052]:
   frequency error 608 PPM exceeds tolerance 500 PPM
 ...
  When an operator saw time difference with external server about 30sec
 ...

  They must have been unable to reach the external server,
  for a really long time?


  server 127.127.1.0 noselect
  fudge  127.127.1.0 stratum 10


  If it loses all the other servers, it will likely continue
   to run away at whatever frequency was last set;
  If it can still contact the internal peer,
   they should run off together (or one would chase the other).

  You might try orphan mode instead; e.g.

 tos cohort 1 orphan 10

  http://www.eecis.udel.edu/~mills/ntp/html/orphan.html
  {Although I'm not certain it would have any significant value,
when only one other server can be reached.}


  restrict 192.168.0.240 mask 255.255.255.240

  If you ever use a server by host name,
  especially when the name may return multiple A records,
  (e.g. pool servers) you may need to add a restrict source
  line; e.g.

 restrict source nomodify


  tinker step 0

  Remove that if you want it to step, instead of slew always.
  BTW, step 0 also disables kernel discipline!


  tos minclock 1 minsane 1

 MinSane defaults to 1 ?


  peer 192.168.0.241 burst iburst minpoll 4 maxpoll 6 prefer true
  server **external-server-ip** burst iburst minpoll 4 maxpoll 6 true

  You should not do burst on servers that are not your own.
  {I have no idea who **external-server-ip** belongs to.}
   {iburst is fine}

  The docs also seem to say to not use burst or iburst with peer ?
  http://www.eecis.udel.edu/~mills/ntp/html/assoc.html
  http://www.eecis.udel.edu/~mills/ntp/html/confopt.html


  Are you really intentionally saying to treat both
  the other internal server and the external server
  as if they always have valid time, with the true option?
   {Even if some day they may not be even close.}

   Are you treating them both as true chimers
because you only have two servers to reference?


 --
 E-Mail Sent to this address blackl...@anitech-systems.com
  will be added to the BlackLists.



Re: [ntp:questions] Is it possible to confuse ntpd's freq error measurement procedure?

2012-06-07 Thread David Woolley

Paul Malishev wrote:

Thanks Dave.

I have some realtime processes on this server and 128ms is too much for
stepping.
But thanks for the hint; now I have a starting point to investigate.



Are you aware that disabling stepping also disables the kernel 
discipline, and therefore makes the algorithm subtly different?




[ntp:questions] Is it possible to confuse ntpd's freq error measurement procedure?

2012-06-06 Thread Paul Malishev
Hello.

I have two ntpd peers which exchange time between themselves and also
receive time from an external server.
I believe that at some moment the connection to the external server was
lost and the time on these two peers drifted a bit.

When the connection to the external server was restored, ntpd on both
peers logged something like:
Jun  5 13:21:09 peer0 ntpd[5052]: frequency error 18158 PPM exceeds
tolerance 500 PPM

After that there were a lot of messages with a smaller freq error:
Jun  5 13:23:18 DIG ntpd[5052]: frequency error 608 PPM exceeds tolerance
500 PPM

When an operator saw a time difference of about 30 sec from the external
server, he just restarted ntpd on both nodes and, surprisingly, the freq
error messages disappeared. Now the difference is about 1ms and stability
stays at about 0.021

So my question is: is it possible to confuse ntpd's freq error measurement
with some wrong settings?


My config is:
-
driftfile /var/lib/ntp/drift
keys /etc/ntp/keys

restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
restrict 127.0.0.1
restrict -6 ::1

server 127.127.1.0 noselect
fudge  127.127.1.0 stratum 10

restrict 192.168.0.240 mask 255.255.255.240

tinker step 0
tos minclock 1 minsane 1

peer 192.168.0.241 burst iburst minpoll 4 maxpoll 6 prefer true
server **external-server-ip** burst iburst minpoll 4 maxpoll 6 true
---


Re: [ntp:questions] Is it possible to confuse ntpd's freq error measurement procedure?

2012-06-06 Thread Dave Hart
On Wed, Jun 6, 2012 at 6:07 PM, Paul Malishev p.malis...@gmail.com wrote:
 Hello.

 I have two ntpd peers which exchange time between themselves and also
 receive time from external server.
 I believe that at some moment connection to external server was lost and
 time on these two peers drifted a bit.

 When connection to external server was restored both ntpd on both peers
 logged something like:
 Jun  5 13:21:09 peer0 ntpd[5052]: frequency error 18158 PPM exceeds
 tolerance 500 PPM

 After that there were a lot of messages with not so big freq error:
 Jun  5 13:23:18 DIG ntpd[5052]: frequency error 608 PPM exceeds tolerance
 500 PPM

 When an operator saw time difference with external server about 30sec he
 just restarted ntpd on both nodes and surprisingly freq error messages
 disappeared. Now difference is about 1ms and stability stays about 0.021

 So my question is: is it possible to confuse ntpd's freq error measurement
 with some wrong settings?

ntpd measures the frequency error directly only at startup, lacking a
driftfile.  While it's operating, it's not measuring the frequency
error, but rather manipulating it to steer the clock offset toward
zero.  That manipulation is capped at 500 parts per million relative
to the nominal clock rate.

 My config is:
 tinker step 0

You've told ntpd to never step the clock to correct it (by default, it
is stepped when the offset exceeds 128ms).  So instead, ntpd must
eliminate the offset slowly by running the clock faster or slower (but
not more than 500 PPM faster or slower).  The large frequency error in the
messages is a direct result of the perceived local clock offset.
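
To put rough numbers on the 500 PPM cap, here is a back-of-the-envelope
sketch in Python. It is not ntpd's code: the function name is made up for
illustration, and it assumes the clock is slewed at the full cap the whole
time, which only gives a lower bound on the correction time.

```python
# Back-of-the-envelope: how long does a pure slew take at the 500 PPM cap?
# Assumption: the clock is corrected at the maximum rate the whole time,
# which is the best case; ntpd's control loop need not behave this way.

MAX_SLEW_PPM = 500  # cap quoted in this thread


def min_slew_seconds(offset_seconds, slew_ppm=MAX_SLEW_PPM):
    """Lower bound on the time needed to slew away an offset."""
    if slew_ppm <= 0:
        raise ValueError("slew_ppm must be positive")
    # 1 PPM corrects one microsecond of offset per second of elapsed time.
    return abs(offset_seconds) / (slew_ppm * 1e-6)


if __name__ == "__main__":
    for offset in (0.128, 1.0, 30.0):
        seconds = min_slew_seconds(offset)
        print(f"{offset:>7.3f} s offset -> at least {seconds / 3600:.2f} h")
```

Under that assumption, the 30 sec offset the operator saw needs at least
60000 seconds (roughly 16.7 hours) of slewing, while an offset at the
default 128ms step threshold needs at least 256 seconds.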

Likely restarting ntpd also invoked ntpdate, which stepped the clock
so it was close enough that after restarting, ntpd's rate adjustments
were well under 500 PPM.

Cheers,
Dave Hart


Re: [ntp:questions] Is it possible to confuse ntpd's freq error measurement procedure?

2012-06-06 Thread E-Mail Sent to this address will be added to the BlackLists
Paul Malishev wrote:
 I have two ntpd peers which exchange time between
  themselves and also receive time from external server.
 I believe that at some moment connection to external
  server was lost and time on these two peers drifted a bit.

 When connection to external server was restored both ntpd
  on both peers logged something like:
 Jun  5 13:21:09 peer0 ntpd[5052]:
  frequency error 18158 PPM exceeds tolerance 500 PPM

 After that there were a lot of messages with not so big freq error:
 Jun  5 13:23:18 DIG ntpd[5052]:
  frequency error 608 PPM exceeds tolerance 500 PPM
...
 When an operator saw time difference with external server about 30sec
...

 They must have been unable to reach the external server,
  for a really long time?


 server 127.127.1.0 noselect
 fudge  127.127.1.0 stratum 10


 If it loses all the other servers, it will likely continue
   to run away at whatever frequency was last set;
  If it can still contact the internal peer,
   they should run off together (or one would chase the other).

 You might try orphan mode instead; e.g.

tos cohort 1 orphan 10

 http://www.eecis.udel.edu/~mills/ntp/html/orphan.html
  {Although I'm not certain it would have any significant value,
when only one other server can be reached.}


 restrict 192.168.0.240 mask 255.255.255.240

 If you ever use a server by host name,
  especially when the name may return multiple A records,
  (e.g. pool servers) you may need to add a restrict source
  line; e.g.

restrict source nomodify


 tinker step 0

 Remove that if you want it to step, instead of slew always.
 BTW, step 0 also disables kernel discipline!


 tos minclock 1 minsane 1

MinSane defaults to 1 ?


 peer 192.168.0.241 burst iburst minpoll 4 maxpoll 6 prefer true
 server **external-server-ip** burst iburst minpoll 4 maxpoll 6 true

 You should not do burst on servers that are not your own.
  {I have no idea who **external-server-ip** belongs to.}
   {iburst is fine}

  The docs also seem to say to not use burst or iburst with peer ?
  http://www.eecis.udel.edu/~mills/ntp/html/assoc.html
  http://www.eecis.udel.edu/~mills/ntp/html/confopt.html


 Are you really intentionally saying to treat both
  the other internal server and the external server
  as if they always have valid time, with the true option?
   {Even if some day they may not be even close.}

   Are you treating them both as true chimers
because you only have two servers to reference?
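
The remarks above can be collected into a quick check of an ntp.conf. This
is a hypothetical helper written for this thread, not part of the NTP
distribution; the function name and the wording of the remarks are
assumptions, and the rules only restate the points made in this message
rather than validate ntp.conf in general.

```python
# Hypothetical helper (not part of the NTP distribution): flag the ntp.conf
# options questioned in this message. The rules only restate the remarks
# above; this is not an authoritative ntp.conf validator.


def review_ntp_conf(text):
    """Return one remark per rule matched by a configuration line."""
    remarks = []
    for raw in text.splitlines():
        words = raw.split("#", 1)[0].split()  # drop comments, then tokenize
        if not words:
            continue
        keyword, options = words[0], words[1:]
        if keyword == "tinker" and options[:2] == ["step", "0"]:
            remarks.append("tinker step 0: never steps, and disables kernel discipline")
        if keyword in ("server", "peer") and "true" in options:
            remarks.append(f"{keyword} {options[0]}: 'true' treats the source as always valid")
        if keyword == "peer" and ("burst" in options or "iburst" in options):
            remarks.append(f"peer {options[0]}: docs seem to advise against burst/iburst")
        if keyword == "server" and "burst" in options:
            remarks.append(f"server {options[0]}: avoid burst on servers that are not your own")
    return remarks


if __name__ == "__main__":
    sample = """\
tinker step 0
peer 192.168.0.241 burst iburst minpoll 4 maxpoll 6 prefer true
server **external-server-ip** burst iburst minpoll 4 maxpoll 6 true
"""
    for remark in review_ntp_conf(sample):
        print(remark)
```

Run against the three lines quoted from the posted config, it prints one
remark for the tinker line and two each for the peer and server lines.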


-- 
E-Mail Sent to this address blackl...@anitech-systems.com
  will be added to the BlackLists.
