Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-31 Thread Bill Unruh

On Wed, 30 Jan 2008, Danny Mayer wrote:

 Unruh wrote:
  David L. Mills [EMAIL PROTECTED] writes:
 
   David,
 
   We can argue about the Hurst parameter, which can't be truly random-walk 
   as I have assumed, but the approximation is valid up to lag times of at 
   least a week. However, as I have been cautioned, these plots are really 
   sensitive to spectral lines due to nonuniform sampling. I was very 
   careful to avoid such things.

  But the lines I am referring to are not artifacts, they are there because
  of the way the computer is used. -- the temp fluctuations caused by people
  running the machine daily, except on weekends. These are not part of any
  random walk process. They are real jumps in the drift rate of the
  machine, large jumps, and definitely not random.
 

 Well of course. You are running Linux and losing interrupts. FreeBSD and 
 friends don't suffer from that problem. I seem to remember setting HZ=100 
 mostly eliminates that problem, at the price of rebuilding the kernel.

 Danny

No they are not lost interrupts. They are NOT jumps in the offset, they are
jumps in the frequency, which will last for a few hours and then jump back.
Lost interrupts do not act like that-- they would jump the offset by 10ms
(or 4ms) which is definitely not happening. And it is hard to gain
interrupts.
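The distinction drawn above can be put in numbers. The following is a minimal sketch, not anything taken from ntpd or chrony: the function names are made up for illustration, and the 1 PPM / 3 hour figures are assumed values. A lost timer interrupt shifts the offset once by one tick period, while a frequency jump makes the offset ramp for as long as the jump lasts.

```python
def lost_tick_step_ms(hz):
    """Offset step (ms) caused by one lost timer interrupt at tick rate hz."""
    return 1000.0 / hz


def offset_from_freq_jump_ms(freq_ppm, seconds):
    """Offset (ms) accumulated while the clock runs freq_ppm off for `seconds`."""
    return freq_ppm * 1e-6 * seconds * 1e3


if __name__ == "__main__":
    # HZ=100 gives the 10 ms step and HZ=250 the 4 ms step mentioned above.
    for hz in (100, 250):
        print(f"HZ={hz}: one lost tick = {lost_tick_step_ms(hz):g} ms step")
    # Assumed example: a 1 PPM frequency jump lasting 3 hours gives a slow ramp.
    print(f"1 PPM for 3 h = {offset_from_freq_jump_ms(1.0, 3 * 3600):.1f} ms ramp")
```

A step shows up as a discontinuity in the offset record; a frequency jump shows up as a change of slope that reverses when the frequency jumps back.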





-- 
William G. Unruh   |  Canadian Institute for| Tel: +1(604)822-3273
Physics&Astronomy  | Advanced Research  | Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology | [EMAIL PROTECTED]
Canada V6T 1Z1 |  and Gravity   |  www.theory.physics.ubc.ca/
___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-31 Thread David Woolley

Danny Mayer wrote:

 
 Well of course. You are running Linux and losing interrupts. FreeBSD and 

Lost interrupts are not the problem here and nothing about FreeBSD 
should help (unless it runs the CPU permanently at full power).


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-29 Thread David L. Mills

David,

Cite: Judah Levine of NIST, personal communication. A few little 
mistakes on my part proved him right.

Dave

[EMAIL PROTECTED] wrote:

 In comp.protocols.time.ntp you write:

 Hi Dave,

 We can argue about the Hurst parameter, which can't be truly random-walk
 as I have assumed, but the approximation is valid up to lag times of at
 least a week. However, as I have been cautioned, these plots are really
 sensitive to spectral lines due to nonuniform sampling. I was very
 careful to avoid such things.


 Do you have a cite for that?

 Have you seen Vit Klemes' take on tree ring data:

 http://iahs.info/perugia/2007IAHSKlemesTreeRings.pdf

 It might appeal to your sense of humour.

 David.


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-28 Thread Unruh

[EMAIL PROTECTED] (Danny Mayer) writes:

David L. Mills wrote:
 Danny,
 
 It doesn't stop working; it just clamps whatever it gets to +-500 PPM as 
 appropriate. If the intrinsic error is greater than 500 PPM, the loop 
 will do what it can with the residual it can't correct showing as a 
 systematic time offset.
 
 Dave
 

I didn't mean to suggest that ntpd stopped running. It was that the 
clock was drifting steadily off into the sunset. I realize that if the 
problem corrected itself ntpd would bring things back to normal.

But that suggests that the drift rate of your chip became bigger than
500PPM, which is huge. Maybe something altered the tick size
inappropriately. ntp should have hauled the offset back to zero -- just
taking a longer time ( 100msec at 500PPM takes about 200 sec to eliminate--
which is not that long.)
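The arithmetic behind that estimate is just offset divided by slew rate. A minimal sketch, assuming the full 500 PPM is applied continuously (which, as noted elsewhere in the thread, the ntpd loop does not actually do); the function name is hypothetical:

```python
def slew_seconds(offset_s, slew_ppm=500.0):
    """Time (s) to remove offset_s when slewing at a constant slew_ppm."""
    return abs(offset_s) / (slew_ppm * 1e-6)


if __name__ == "__main__":
    for offset_ms in (50, 100, 128):
        t = slew_seconds(offset_ms / 1000.0)
        print(f"{offset_ms} ms at 500 PPM: about {t:.0f} s")
```

This is a lower bound on the correction time; a loop that ramps the slew rate up and down will take longer.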



Danny

 Danny Mayer wrote:
 David L. Mills wrote:

 Danny,

 Unless the computer clock intrinsic frequency error is huge, the only 
 time the 500-PPM kicks in is with a 100-ms step transient and poll 
 interval 16 s. The loop still works if it hits the stops; it just can't 
 drive the offset to zero.

 Dave

 Yes, I found this out when my laptop stopped disciplining the clock and 
 was complaining about the frequency limits and I started digging into 
 the code to figure out why.

 Danny
 
 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-28 Thread Unruh

[EMAIL PROTECTED] (David Malone) writes:

Unruh [EMAIL PROTECTED] writes:

weekends. Lots of power at 10^-5 Hz and harmonics, and .7 10^-8Hz.-- more
than would be predicted by 1/f

10^-5Hz is about once per day. I'm not sure what .7 10^-8Hz is - it
seems to be about once every 4.5 years? I would have assumed you'd
get power around 10^-5Hz (daily), 10^-6 Hz (weekly) and maybe 3x10^-8
(yearly) based on a mix of environmental factors (air conditioning/heating)
and usage?

Yes, that was supposed to  be 1/week. 
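For anyone checking the figures in this exchange, the conversion between spectral-line frequency and period is easy to script. A small sketch; the helper names are made up, and a year is taken as 365.25 days:

```python
DAY_S = 86400.0


def period_days(freq_hz):
    """Period in days of a spectral line at freq_hz."""
    return 1.0 / freq_hz / DAY_S


def hz_for_period_days(days):
    """Frequency in Hz of a cycle that repeats every `days` days."""
    return 1.0 / (days * DAY_S)


if __name__ == "__main__":
    print(f"1e-5 Hz   -> {period_days(1e-5):.2f} days")
    print(f"0.7e-8 Hz -> {period_days(0.7e-8) / 365.25:.1f} years")
    print(f"daily     -> {hz_for_period_days(1):.2e} Hz")
    print(f"weekly    -> {hz_for_period_days(7):.2e} Hz")
    print(f"yearly    -> {hz_for_period_days(365.25):.2e} Hz")
```

On these numbers a weekly line sits near 1.7 x 10^-6 Hz, which is consistent with the correction above.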


   David.


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-28 Thread Danny Mayer

Unruh wrote:
 [EMAIL PROTECTED] (Danny Mayer) writes:
 
 David L. Mills wrote:
 Danny,

 It doesn't stop working; it just clamps whatever it gets to +-500 PPM as 
 appropriate. If the intrinsic error is greater than 500 PPM, the loop 
 will do what it can with the residual it can't correct showing as a 
 systematic time offset.

 Dave

 
 I didn't mean to suggest that ntpd stopped running. It was that the 
 clock was drifting steadily off into the sunset. I realize that if the 
 problem corrected itself ntpd would bring things back to normal.
 
 But that suggests that the drift rate of your chip became bigger than
 500PPM, which is huge. Maybe something altered the tick size
 inappropriately. ntp should have hauled the offset back to zero -- just
 taking a longer time ( 100msec at 500PPM takes about 200 sec to eliminate--
 which is not that long.)


No, it was something else entirely and not something that ntpd, chrony 
or any other application could do anything about. It's fixed now.

Danny

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-28 Thread Bill Unruh

What was the problem?

On Mon, 28 Jan 2008, Danny Mayer wrote:

 Unruh wrote:
  [EMAIL PROTECTED] (Danny Mayer) writes:
 
   David L. Mills wrote:
Danny,
   
It doesn't stop working; it just clamps whatever it gets to +-500 PPM 
as appropriate. If the intrinsic error is greater than 500 PPM, the 
loop will do what it can with the residual it can't correct showing as 
a systematic time offset.
   
Dave
  
 
   I didn't mean to suggest that ntpd stopped running. It was that the 
   clock was drifting steadily off into the sunset. I realize that if the 
   problem corrected itself ntpd would bring things back to normal.

  But that suggests that the drift rate of your chip became bigger than
  500PPM, which is huge. Maybe something altered the tick size
  inappropriately. ntp should have hauled the offset back to zero -- just
  taking a longer time ( 100msec at 500PPM takes about 200 sec to
  eliminate--
  which is not that long.)
 

 No, it was something else entirely and not something that ntpd, chrony or any 
 other application could do anything about. It's fixed now.

 Danny


-- 
William G. Unruh   |  Canadian Institute for| Tel: +1(604)822-3273
Physics&Astronomy  | Advanced Research  | Fax: +1(604)822-5324
UBC, Vancouver,BC  |   Program in Cosmology | [EMAIL PROTECTED]
Canada V6T 1Z1 |  and Gravity   |  www.theory.physics.ubc.ca/

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-28 Thread David L. Mills

David,

We can argue about the Hurst parameter, which can't be truly random-walk 
as I have assumed, but the approximation is valid up to lag times of at 
least a week. However, as I have been cautioned, these plots are really 
sensitive to spectral lines due to nonuniform sampling. I was very 
careful to avoid such things.

Dave

David Malone wrote:

 Unruh [EMAIL PROTECTED] writes:
 
 
weekends. Lots of power at 10^-5 Hz and harmonics, and .7 10^-8Hz.-- more
than would be predicted by 1/f
 
 
 10^-5Hz is about once per day. I'm not sure what .7 10^-8Hz is - it
 seems to be about once every 4.5 years? I would have assumed you'd
 get power around 10^-5Hz (daily), 10^-6 Hz (weekly) and maybe 3x10^-8
 (yearly) based on a mix of environmental factors (air conditioning/heating)
 and usage?
 
   David.


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-28 Thread David L. Mills

Maarten,

Maybe I didn't make myself clear. The case in question is when the 
intrinsic frequency error of the computer clock is greater than 500 PPM, 
in which case the discipline loop cannot compensate for the error. The 
result is a systematic time offset error that cannot be driven to zero. 
This has nothing to do with the initial offset as you suggest.
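The case described here can be illustrated with a deliberately simplified model. This is only a sketch under stated assumptions, not ntpd's discipline loop: it assumes the correction is a pure frequency adjustment clamped to +-500 PPM, and the names `clamp` and `residual_ppm` are invented for the example.

```python
def clamp(value, limit=500.0):
    """Limit value to the range [-limit, +limit]."""
    return max(-limit, min(limit, value))


def residual_ppm(intrinsic_ppm, limit_ppm=500.0):
    """Frequency error (PPM) left over after the clamped correction is applied."""
    return intrinsic_ppm - clamp(intrinsic_ppm, limit_ppm)


if __name__ == "__main__":
    # Assumed example values for the intrinsic oscillator error.
    for intrinsic in (30.0, 500.0, 650.0, -700.0):
        print(f"intrinsic {intrinsic:+.0f} PPM -> residual "
              f"{residual_ppm(intrinsic):+.0f} PPM")
```

With an intrinsic error inside the limit the residual is zero and the loop can null the offset; outside the limit a nonzero residual remains no matter how small the initial offset was.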

Dave

Maarten Wiltink wrote:
 Unruh [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]
 
David L. Mills wrote:
 
 
Unless the computer clock intrinsic frequency error is huge, the
only time the 500-PPM kicks in is with a 100-ms step transient and
poll interval 16 s. The loop still works if it hits the stops; it
just can't drive the offset to zero.
 
 [...]
 
Why can't it drive the offset to zero? 100ms should take about 5 min(if
it were always 500 but the loop would make it take longer)
 
 
 That would presumably be in the case of 'huge intrinsic frequency error'.
 
 Groetjes,
 Maarten Wiltink
 
 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-27 Thread Danny Mayer

David L. Mills wrote:
 It's easy to make your own Allan characteristic. Just let the computer 
 clock free-run for a couple of weeks and record the offset relative to a 
 known and stable standard, preferably at the smallest poll interval you 
 can. The PPS from a GPS receiver is an ideal source, but you have to 
 jerry-rig a means to capture each transition.
 
 Compute the RMS frequency differences, decimate and repeat. Don't take 
 the following seriously, I lifted it without considering context, but 
 that's the general idea. Be very careful about missing data, etc., as 
 that creates spectral lines that mess up the plot.
 
 p = w; r = diff(x); q = y; i = 1; d = 1;
 while (length(q) >= 10)
  u = diff(p) / d;
  x2(i) = sqrt(mean(u .* u) / 2);
  u = diff(r) / d;
  x1(i) = sqrt(mean(u .* u) / 2);
  u = diff(q);
  y1(i) = sqrt(mean(u .* u) / 2);
  p = p(1:2:length(p));
  r = r(1:2:length(r));
  q = q(1:2:length(q));
  m1(i) = d; i = i + 1; d = d * 2;
 end
 loglog(m1, x2 * 1e6, m1, x1 * 1e6, m1, y1 * 1e6, m1, (x1 + y1) * 1e6)
 axis([1 1e5 1e-4 100]);
 xlabel('Time Interval (s)');
 ylabel('Allan Deviation (PPM)');
 print -dtiff allan
 
 Dave

And for those of you who didn't recognize it, that's MatLab code.
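For readers without MATLAB, the "compute, decimate and repeat" idea can be sketched in plain Python. This is not a port of the snippet above (whose inputs `w`, `x` and `y` are not defined in the thread); it is the standard non-overlapping Allan deviation computed from phase (time offset) samples, with `phase` and `tau0` as assumed names for the offset record and its sample spacing in seconds.

```python
import math


def allan_deviation(phase, tau0=1.0):
    """Return [(tau, adev)] at octave-spaced tau from evenly spaced phase samples.

    phase: clock offsets in seconds, one every tau0 seconds, with no gaps.
    """
    results = []
    x = list(phase)
    m = 1
    while len(x) >= 10:
        tau = m * tau0
        # Second differences of phase are proportional to frequency differences.
        d2 = [x[i + 2] - 2 * x[i + 1] + x[i] for i in range(len(x) - 2)]
        avar = sum(v * v for v in d2) / (2.0 * tau * tau * len(d2))
        results.append((tau, math.sqrt(avar)))
        x = x[::2]   # decimate by two
        m *= 2       # the averaging interval doubles
    return results


if __name__ == "__main__":
    # Assumed test input: a pure frequency offset (linear phase ramp),
    # which should give an Allan deviation of essentially zero.
    ramp = [25e-6 * i for i in range(64)]
    for tau, adev in allan_deviation(ramp):
        print(f"tau={tau:g} s  adev={adev:.3e}")
```

As the thread warns, gaps or uneven sampling in `phase` would need to be handled before a calculation like this means anything.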

Danny

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-27 Thread Danny Mayer

David L. Mills wrote:
 Danny,
 
 Unless the computer clock intrinsic frequency error is huge, the only 
 time the 500-PPM kicks in is with a 100-ms step transient and poll 
 interval 16 s. The loop still works if it hits the stops; it just can't 
 drive the offset to zero.
 
 Dave

Yes, I found this out when my laptop stopped disciplining the clock and 
was complaining about the frequency limits and I started digging into 
the code to figure out why.

Danny

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread David Woolley

Danny Mayer wrote:

 No, ntpd deliberately limits frequency changes to 500 PPM. That's hard 
 coded. You need to avoid using anything greater than that as Dave has 
 explained. That would be the reason why it takes ntpd longer to bring the 
 clock back to the right time.

Assuming that the static frequency error is consistent with a medium to 
high quality motherboard, slew rate limiting should only kick in if the 
clock was out by more than the order of a second in the first place, in 
which case stepping would have to have been inhibited.  For normal users 
the slow convergence is due to the loop time constant being more suited to 
handling gradual temperature variations than startup transients or 
frequency hits.

The slew rate limit, for zero static error, is 1s/2000s.  I seem to 
remember the loop's first zero crossing being quoted at about 3000s, with 
minpoll set for 64 and the slew rate not being exceeded.  The resulting 
peak slew rate is more than 1/3000, for a 1 second error, but will be 
well below 1/2000 for a 128ms error.


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread Petri Kaukasoina

David Woolley  [EMAIL PROTECTED] wrote:
Petri Kaukasoina wrote:
 Basically, it stepped time with ntpdate, slept 100 seconds and stepped time
 again with ntpdate. From the time adjustment, the script calculated the
 drift value and put that to the drift file. Again, the time offset always
 stays below 1 ms.

That has quite a lot of similarity with what ntpd itself does if it is 
cold started with iburst.  The only big difference is that it uses 900, 
rather than 100 seconds.  I don't know if that is the same 900 as 
controlled by tinker stepout, but, even if it is, the side effect on 
stepout's would probably be undesirable.  To cold start you need to 
delete the drift file, or not configure it.

Hmm, I can't see that. I put in only one good time source with iburst,
deleted the drift file and started ntpd. The time offset just keeps growing
and the frequency changes in very small steps. Now, after 30 minutes time is
already 25 ms off and the frequency is only 1.5 ppm (the correct value would
be about 25 ppm).
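The drift estimate made by the two-step script quoted above comes down to one division. A minimal sketch with a hypothetical helper name; the sign convention (positive means the second step moved the clock forward) is an assumption and has to match whatever is written to the drift file:

```python
def drift_ppm(offset_change_s, interval_s):
    """Frequency error in PPM implied by offset_change_s accrued over interval_s."""
    return offset_change_s / interval_s * 1e6


if __name__ == "__main__":
    # Assumed example: the second ntpdate step corrects 2.5 ms after 100 s.
    print(f"{drift_ppm(0.0025, 100.0):.1f} PPM")
    # The observation above: 25 ms accumulated over 30 minutes.
    print(f"{drift_ppm(0.025, 30 * 60):.1f} PPM average uncorrected error")
```

With only 100 s between the two steps, 1 ms of measurement noise is already 10 PPM, so a scheme like this depends on a low-jitter source.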


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread David Woolley

Petri Kaukasoina wrote:
 David Woolley  [EMAIL PROTECTED] wrote:

 That has quite a lot of similarity with what ntpd itself does if it is 
 cold started with iburst.  The only big difference is that it uses 900, 

 
 Hmm, I can't see that. I put in only one good time source with iburst,
 deleted the drift file and started ntpd. The time offset just keeps growing
 and the frequency changes in very small steps. Now, after 30 minutes time is
 already 25 ms off and the frequency is only 1.5 ppm (the correct value would
 be about 25 ppm).

Looking at the comments in the 4.2.0 source code, it looks like you may 
be right; yet another reason why ntpd doesn't handle startup transients 
well!

If this is still true in the latest version (< max means offset < 128ms):

  *  State   < max   > max                  Comments
  *  ====================================================
  *  NSET    FREQ    FREQ                   no ntp.drift

  *  FREQ    SYNC    if (mu < 900) FREQ     calculate frequency
  *                  else if (allow) TSET
  *                  else FREQ
  *

Worse than is obvious here, it only sets the time on the first sample if 
it is out by more than 128ms.  More obviously, unless the frequency error 
is so high that the time changes by more than 128ms between the first 
two good samples, it will use the slow PLL method of calibrating the 
frequency.  Even then, unless the offset is more than 128ms both for the 
first sample, and after every subsequent sample, it will compute the 
frequency based on the final absolute value of clock offset, not the 
difference between the first and last readings; this might not be too 
important, because it looks to me to require the initial offset to be 
very close to 128ms (low probability) or the frequency error to be 
quite high (percentage error in frequency calculation relatively low) 
for it to complete the frequency calibration.
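To make the two quoted table rows easier to follow, here they are written out as a transition function. This is a sketch only: it transcribes just the NSET and FREQ rows as read from the 4.2.0 comment, the function and constant names are made up, and it says nothing about what the actual code (or any later version) does.

```python
STEP_THRESHOLD_S = 0.128   # "max" in the comment table
STEPOUT_S = 900            # the 900 s interval in the FREQ row


def next_state(state, offset_s, mu_s, allow_step=True):
    """Next state per the two quoted rows; mu_s is time since the last update."""
    over_max = abs(offset_s) > STEP_THRESHOLD_S
    if state == "NSET":
        # No drift file: go and measure the frequency whatever the offset is.
        return "FREQ"
    if state == "FREQ":
        if not over_max:
            return "SYNC"
        if mu_s < STEPOUT_S:
            return "FREQ"
        return "TSET" if allow_step else "FREQ"
    raise ValueError(f"state {state!r} is not covered by the quoted rows")


if __name__ == "__main__":
    print(next_state("NSET", 0.010, 0))      # FREQ
    print(next_state("FREQ", 0.025, 1800))   # SYNC
    print(next_state("FREQ", 0.200, 1000))   # TSET
```

Read this way, a small offset takes FREQ straight to SYNC, so the timed frequency calculation only happens when the offset stays above 128ms.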

What I was expecting was for it to unconditionally do both frequency and 
phase calibration, in the absence of the drift file.  I presume that 
chrony does a correction on the first couple of samples and then refines it.

Incidentally, the else FREQ doesn't seem to match the code and looks 
like it would prevent it ever getting out of the calibration under some 
conditions.

It looks like I need to fetch the latest source, although it looks, from 
your observations, as though it is still far from what I would consider 
right.


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread Jan Ceuleers

David,

David Woolley wrote:
 ISTR that time stamps on financial transactions are required to be 
 within two seconds of the correct time.  With NTP that standard is not 
 too difficult to meet.
 
 In 2006, it turns out that it was 3 seconds 
 http://tf.nist.gov/general/pdf/2125.pdf,

NIST is a US government institution; might there perhaps be different 
laws or regulations elsewhere in the world? Does anyone among the 
readership here know?

Thx, Jan


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread David Woolley

Jan Ceuleers wrote:

 NIST is a US government institution; might there perhaps be different 
 laws or regulations elsewhere in the world? Does anyone among the 
 readership here know?

I used the US case as that is the one that has come up on the newsgroup, 
but I assume there are similar rules elsewhere.  I think the NIST is 
only documenting the rules in this case, my guess is that it is the SEC 
that sets them, in the USA. I did the search with a site:nist.gov to 
reduce the false positives.


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread David L. Mills

Richard,

There were several different architecture computers considered in the 
1995 and 1998 studies, including SPARC, Alpha, Intel and several lab 
instruments. All oscillators conformed to a simple model: white phase 
noise (slope -1) below the intercept, random-walk frequency noise (slope 
+0.5) above the intercept. This is equivalent to your model.
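The two-slope model just described fixes the intercept once the two lines are known. A sketch under stated assumptions: the white phase noise line is written as a / tau and the random-walk frequency line as b * sqrt(tau), where `a` and `b` are the deviations each line would have at tau = 1 s. The function name and the example numbers are hypothetical, not measured values.

```python
import math


def allan_intercept(a, b):
    """Return (tau, sigma) where a / tau equals b * sqrt(tau).

    a: white phase noise deviation at tau = 1 s (slope -1 line)
    b: random-walk frequency deviation at tau = 1 s (slope +0.5 line)
    """
    tau = (a / b) ** (2.0 / 3.0)
    return tau, a / tau


if __name__ == "__main__":
    # Assumed example: 1 PPM and 0.001 PPM at tau = 1 s.
    tau, sigma = allan_intercept(1.0, 0.001)
    print(f"intercept near tau = {tau:.0f} s, deviation = {sigma:.3g} PPM")
    # Cross-check: both lines give the same value at the intercept.
    print(f"random-walk line at intercept: {0.001 * math.sqrt(tau):.3g} PPM")
```

Setting a / tau = b * sqrt(tau) gives tau^(3/2) = a / b, hence the 2/3 power in the code.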

Additional data are in the nanokernel documentation. The only 
differences are in the (x, y) intercept. You don't need das Buch to 
justify this model; there is evidence all over the place. Clocks of all 
kinds from cold rocks to Cesium oscillators all show very similar 
characteristics, whether modelled in the time domain or frequency domain.

It's easy to make your own Allan characteristic. Just let the computer 
clock free-run for a couple of weeks and record the offset relative to a 
known and stable standard, preferably at the smallest poll interval you 
can. The PPS from a GPS receiver is an ideal source, but you have to 
jerry-rig a means to capture each transition.

Compute the RMS frequency differences, decimate and repeat. Don't take 
the following seriously, I lifted it without considering context, but 
that's the general idea. Be very careful about missing data, etc., as 
that creates spectral lines that mess up the plot.

p = w; r = diff(x); q = y; i = 1; d = 1;
while (length(q) >= 10)
 u = diff(p) / d;
 x2(i) = sqrt(mean(u .* u) / 2);
 u = diff(r) / d;
 x1(i) = sqrt(mean(u .* u) / 2);
 u = diff(q);
 y1(i) = sqrt(mean(u .* u) / 2);
 p = p(1:2:length(p));
 r = r(1:2:length(r));
 q = q(1:2:length(q));
 m1(i) = d; i = i + 1; d = d * 2;
end
loglog(m1, x2 * 1e6, m1, x1 * 1e6, m1, y1 * 1e6, m1, (x1 + y1) * 1e6)
axis([1 1e5 1e-4 100]);
xlabel('Time Interval (s)');
ylabel('Allan Deviation (PPM)');
print -dtiff allan

Dave

Richard B. Gilbert wrote:
 Unruh wrote:
 
 David L. Mills [EMAIL PROTECTED] writes:


 David,



 1. I have explained in very gory detail in many places how the time 
 constant is chosen for the best accuracy using typical computer 
 oscillators and network paths. See the briefings on the NTP project 
 page and especially the discussion about the Allan intercept. If you 
 want the 



 The Allan intercept is predicated on a very specific model of the 
 noise in
 a clock ( as I recall basically random gaussian noise at high 
 frequencies,
 and 1/f noise at low). It is not at all clear that real computers comply
 with that.


 best accuracy over the long term, you had better respect that. Proof 
 positive is in my 1995 SIGCOMM paper, later IEEE Transactions on 
 Networking paper and das Buch. I absolutely relish scientific 
 critique, but see the briefings and read the papers first.



 2. To reduce the convergence time, reduce the time constant, but only 
 at the expense of long term accuracy. An extended treatise on that is 
 in das Buch, especially Chapters 4, 6 and 12. I would be delighted to 
 hear critique of the material, but read the chapters first.



 While you may know what in the world Das Buch is (Hitlers Mein Kampf?) 
 I do
 not. Nor do I know where to get it.
 
 
 Computer Network Time Synchronization: The Network Time Protocol by 
 David L. Mills (Hardcover - Mar 24, 2006)
 
 Available from Amazon.com.   You may be able to find a copy at a 
 University Book store.  Be prepared for Sticker Shock.  It ain't 
 cheap!  Publishing in small quantities is EXPENSIVE!!!  It's different 
 when you can amortize your setup costs over 50,000 copies!
 
 Das Buch is unlikely to become a best seller!
 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread Richard B. Gilbert

David L. Mills wrote:
 Richard,
 
 There were several different architecture computers considered in the 
 1995 and 1998 studies, including SPARC, Alpha, Intel and several lab 
 instruments. All oscillators conformed to a simple model: white phase 
 noise (slope -1) below the intercept, random-walk frequency noise (slope 
 +0.5) above the intercept. This is equivalent to your model.
 
 Additional data are in the nanokernel documentation. The only 
 differences are in the (x, y) intercept. You don't need das Buch to 
 justify this model; there is evidence all over the place. Clocks of all 
 kinds from cold rocks to Cesium oscillators all show very similar 
 characteristics, whether modelled in the time domain or frequency domain.
 
 It's easy to make your own Allan characteristic. Just let the computer 
 clock free-run for a couple of weeks and record the offset relative to a 
 known and stable standard, preferably at the smallest poll interval you 
 can. The PPS from a GPS receiver is an ideal source, but you have to 
 jerry-rig a means to capture each transition.
 
 Compute the RMS frequency differences, decimate and repeat. Don't take 
 the following seriously, I lifted it without considering context, but 
 that's the general idea. Be very careful about missing data, etc., as 
 that creates spectral lines that mess up the plot.
 
 p = w; r = diff(x); q = y; i = 1; d = 1;
 while (length(q) >= 10)
 u = diff(p) / d;
 x2(i) = sqrt(mean(u .* u) / 2);
 u = diff(r) / d;
 x1(i) = sqrt(mean(u .* u) / 2);
 u = diff(q);
 y1(i) = sqrt(mean(u .* u) / 2);
 p = p(1:2:length(p));
 r = r(1:2:length(r));
 q = q(1:2:length(q));
 m1(i) = d; i = i + 1; d = d * 2;
 end
 loglog(m1, x2 * 1e6, m1, x1 * 1e6, m1, y1 * 1e6, m1, (x1 + y1) * 1e6)
 axis([1 1e5 1e-4 100]);
 xlabel('Time Interval (s)');
 ylabel('Allan Deviation (PPM)');
 print -dtiff allan
 
 Dave
 
 Richard B. Gilbert wrote:
 
 Unruh wrote:

 David L. Mills [EMAIL PROTECTED] writes:


 David,




 1. I have explained in very gory detail in many places how the time 
 constant is chosen for the best accuracy using typical computer 
 oscillators and network paths. See the briefings on the NTP project 
 page and especially the discussion about the Allan intercept. If you 
 want the 




 The Allan intercept is predicated on a very specific model of the 
 noise in
 a clock ( as I recall basically random gaussian noise at high 
 frequencies,
 and 1/f noise at low). It is not at all clear that real computers comply
 with that.


 best accuracy over the long term, you had better respect that. Proof 
 positive is in my 1995 SIGCOMM paper, later IEEE Transactions on 
 Networking paper and das Buch. I absolutely relish scientific 
 critique, but see the briefings and read the papers first.




 2. To reduce the convergence time, reduce the time constant, but 
 only at the expense of long term accuracy. An extended treatise on 
 that is in das Buch, especially Chapters 4, 6 and 12. I would be 
 delighted to hear critique of the material, but read the chapters 
 first.




 While you may know what in the world Das Buch is (Hitlers Mein 
 Kampf?) I do
 not. Nor do I know where to get it.



 Computer Network Time Synchronization: The Network Time Protocol by 
 David L. Mills (Hardcover - Mar 24, 2006)

 Available from Amazon.com.   You may be able to find a copy at a 
 University Book store.  Be prepared for Sticker Shock.  It ain't 
 cheap!  Publishing in small quantities is EXPENSIVE!!!  It's different 
 when you can amortize your setup costs over 50,000 copies!

 Das Buch is unlikely to become a best seller!


David,

Why are you telling me this?   My contribution to this thread consisted 
of the above exposition of the publication data and availability of Das 
Buch.


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread David L. Mills

Danny,

Unless the computer clock intrinsic frequency error is huge, the only 
time the 500-PPM kicks in is with a 100-ms step transient and poll 
interval 16 s. The loop still works if it hits the stops; it just can't 
drive the offset to zero.

Dave

Danny Mayer wrote:

 Unruh wrote:
 
David L. Mills [EMAIL PROTECTED] writes:
 
 
Reading your claims literally, chrony would have to slew the clock 
considerably greater than the 500 PPM provided by the standard Unix 
adjtime() system call. Please explain how it does that.

Using the Linux adjtimex system call which has the ability to change the
ticksize which gives much greater than 500PPM slew rate for the clocks.
(Up to 100000 PPM, i.e. 10%, although that is never used.) And as I understand it,
your handling of leap seconds in ntp also uses far greater than 500PPM slew 
rates. 
 
 
 No, ntpd deliberately limits frequency changes to 500 PPM. That's hard 
 coded. You need to avoid using anything greater than that as Dave has 
 explained. That would be the reason why it takes ntpd longer to bring the 
 clock back to the right time.
 
 Danny


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread David L. Mills

Petri,

The default 900-s stepout interval was originally determined by the time 
an old Spectracom WWVB receiver took to regain synchronization after a 
leapsecond and should probably be reduced. It can of course be tinkered.

During the initial training period the time is not disciplined other 
than to amortize the initial offset. The bookkeeping to do that and 
preserve an accurate frequency measurement got too tedious and fragile. 
So, at the end of the training period the offset that built up during 
the interval is amortized. I didn't think this was much of a problem, 
since in practice the training is done only once.

Dave

Petri Kaukasoina wrote:

 David Woolley  [EMAIL PROTECTED] wrote:
 
Petri Kaukasoina wrote:

Basically, it stepped time with ntpdate, slept 100 seconds and stepped time
again with ntpdate. From the time adjustment, the script calculated the
drift value and put that to the drift file. Again, the time offset always
stays below 1 ms.

That has quite a lot of similarity with what ntpd itself does if it is 
cold started with iburst.  The only big difference is that it uses 900, 
rather than 100 seconds.  I don't know if that is the same 900 as 
controlled by tinker stepout, but, even if it is, the side effect on 
stepout's would probably be undesirable.  To cold start you need to 
delete the drift file, or not configure it.
 
 
 Hmm, I can't see that. I put in only one good time source with iburst,
 deleted the drift file and started ntpd. The time offset just keeps growing
 and the frequency changes in very small steps. Now, after 30 minutes time is
 already 25 ms off and the frequency is only 1.5 ppm (the correct value would
 be about 25 ppm).


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread Unruh

[EMAIL PROTECTED] (Danny Mayer) writes:

Unruh wrote:
 David L. Mills [EMAIL PROTECTED] writes:

 Reading your claims literally, chrony would have to slew the clock 
 considerably greater than the 500 PPM provided by the standard Unix 
 adjtime() system call. Please explain how it does that.
 
 Using the Linux adjtimex system call which has the ability to change the
 ticksize which gives much greater than 500PPM slew rate for the clocks.
 (Up to 100000 PPM, i.e. 10%, although that is never used.) And as I understand it,
 your handling of leap seconds in ntp also uses far greater than 500PPM slew 
 rates. 
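The relation between tick size and slew rate mentioned in that quote can be worked out without calling adjtimex at all. A minimal sketch, assuming a nominal tick of 10000 microseconds (USER_HZ = 100) and the +-10% range that the Linux adjtimex(2) man page gives for the tick field; the helper name is invented and nothing here touches the system clock:

```python
NOMINAL_TICK_US = 10000   # assumed: USER_HZ = 100


def tick_offset_ppm(tick_us, nominal_us=NOMINAL_TICK_US):
    """Clock rate change in PPM produced by setting the tick to tick_us."""
    return (tick_us - nominal_us) / nominal_us * 1e6


if __name__ == "__main__":
    for tick in (9000, 9999, 10000, 10001, 11000):
        print(f"tick={tick} us -> {tick_offset_ppm(tick):+.0f} PPM")
```

One microsecond of tick is 100 PPM on these assumptions, so the tick gives coarse control and the frequency field has to supply the fine adjustment.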

No, ntpd deliberately limits frequency changes to 500 PPM. That's hard 
coded. You need to avoid using anything greater than that as Dave has 
explained. That would be the reason why it takes ntpd longer to bring the 
clock back to the right time.

Well, no to both. ntpd steps, which hardly obeys that limit, and the reason
ntp takes such a long time is that it has an integration loop with such a
long time constant. If it put its mind to it and used the 500PPM to get rid
of a 50ms offset, it would only take 100 sec, not 3 hours.
It slowly jacks the PPM to 400 or so and then slowly drops it again below
the nominal. This is done to avoid thrashing or instability.



Danny


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread Unruh

David J Taylor [EMAIL PROTECTED] writes:

Richard B. Gilbert wrote:
[]
 Computer Network Time Synchronization: The Network Time Protocol by
 David L. Mills (Hardcover - Mar 24, 2006)

 Available from Amazon.com.   You may be able to find a copy at a
 University Book store.  Be prepared for Sticker Shock.  It ain't
 cheap!  Publishing in small quantities is EXPENSIVE!!!  It's different
 when you can amortize your setup costs over 50,000 copies!

 Das Buch is unlikely to become a best seller!

Perhaps we could have a Lulu version?  They can manage small quantities 
very effectively.  See:

  http://www.lulu.com

I'd love to see the book, but can't afford those Amazon prices.

Would have been nice if there were an online version.

Cheers,
David 


___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread David L. Mills

David,

I don't know your version, but the TSET state was removed some time ago 
and your comments are different from the current source. It's really 
hard to test the discipline under all conceivable conditions. Now and 
then somebody cooks up a case considered very unlikely, like Solaris 
adjtime() behavior with large offsets and force-slew mode, so the code 
does get tweaked from time to time.

Dave

David Woolley wrote:

 Petri Kaukasoina wrote:
 
 David Woolley  [EMAIL PROTECTED] wrote:
 
 
 That has quite a lot of similarity with what ntpd itself does if it 
 is cold started with iburst.  The only big difference is that it uses 
 900, 
 
 

 Hmm, I can't see that. I put in only one good time source with iburst,
 deleted the drift file and started ntpd. The time offset just keeps 
 growing
 and the frequency changes in very small steps. Now, after 30 minutes 
 time is
 already 25 ms off and the frequency is only 1.5 ppm (the correct value 
 would
 be about 25 ppm).
 
 
 Looking at the comments in the 4.2.0 source code, it looks like you may 
 be right; yet another reason why ntpd doesn't handle startup transients 
 well!
 
 If this is still true in the latest version ("> max" means offset > 128ms):
 
  *  State   < max   > max                  Comments
  *  ========================================================
  *  NSET    FREQ    FREQ                   no ntp.drift
 
  *  FREQ    SYNC    if (mu < 900) FREQ     calculate frequency
  *                  else if (allow) TSET
  *                  else FREQ
  *
 
 Worse than is obvious here, it only sets the time on the first sample if 
 it is out by more than 128ms.  More obvious, unless the frequency error 
 is so high that the time changes by more than  128ms between the first 
 two good samples, it will use the slow PLL method of calibrating the 
 frequency.  Even then, unless the offset is more than 128ms both for the 
 first sample, and after every subsequent sample, it will compute the 
 frequency based on the final absolute value of clock offset, not the 
 difference between the first and last readings; this might not be too 
 important, because it looks to me to require the initial offset to be 
 very close to 128ms (low  probability) or the frequency error to be 
 quite high (percentage error in frequency calculation relatively low) 
 for it to complete the frequency calibration.
 
 What I was expecting was for it to unconditionally do both frequency and 
 phase calibration, in the absence of the drift file.  I presume that 
 chrony does a correction on the first couple of samples and then refines 
 it.
 
 Incidentally, the else FREQ doesn't seem to match the code and looks 
 like it would prevent it ever getting out of the calibration under some 
 conditions.
 
 It looks like I need to fetch the latest source, although it looks, from 
 your observations, as though it is still far from what I would consider 
 right.

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread Unruh

David Woolley [EMAIL PROTECTED] writes:


What I was expecting was for it to unconditionally do both frequency and 
phase calibration, in the absence of the drift file.  I presume that 
chrony does a correction on the first couple of samples and then refines it.

Yes. Actually it does a recalibration using the last n samples (where n is
dynamic and grows with stability and shrinks if the linear fit is not a
very good one; "good" is defined by looking at how often the errors in the
linear fit cross zero). It then uses the adjtimex OFFSET single shot
adjustment to get rid of the offset and uses the slope to set the frequency,
adjusting the old samples to account for the change in offset and
frequency, and keeping track of the offset adjustment in case it was
interrupted or did not completely compensate.
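
The regression step described above can be sketched as follows. This is not chrony's source code: it is a plain least-squares line fit in Python, with the function names and the sign-change count as illustrative assumptions. How chrony actually chooses n from the zero crossings is not reproduced here.

```python
def fit_line(times, offsets):
    """Least-squares fit offsets ~ a + b*t. Returns (a, b, residuals)."""
    n = len(times)
    t_mean = sum(times) / n
    o_mean = sum(offsets) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (o - o_mean) for t, o in zip(times, offsets))
    slope = sxy / sxx
    intercept = o_mean - slope * t_mean
    residuals = [o - (intercept + slope * t) for t, o in zip(times, offsets)]
    return intercept, slope, residuals


def sign_changes(residuals):
    """Count how often consecutive non-zero residuals change sign."""
    signs = [r > 0 for r in residuals if r != 0.0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def estimate(times, offsets):
    """Return (offset at the last sample in s, frequency error in ppm, sign changes)."""
    a, b, res = fit_line(times, offsets)
    return a + b * times[-1], b * 1e6, sign_changes(res)


if __name__ == "__main__":
    # Hypothetical samples: 1 ms initial offset, clock drifting at 25 ppm
    ts = [0.0, 16.0, 32.0, 48.0, 64.0]
    offs = [0.001 + 25e-6 * t for t in ts]
    print(estimate(ts, offs))
```

Few sign changes in the residuals suggest the samples are not well described by a straight line (for example because the frequency changed part way through), which is the situation in which the description above says the sample window shrinks.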



Incidentally, the else FREQ doesn't seem to match the code and looks 
like it would prevent it ever getting out of the calibration under some 
conditions.

It looks like I need to fetch the latest source, although it looks, from 
your observations, as though it is still far from what I would consider 
right.

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-26 Thread Unruh

Richard B. Gilbert [EMAIL PROTECTED] writes:

David,

Why are you telling me this?   My contribution to this thread consisted 
of the above exposition of the publication data and availability of Das 
Buch.

He is not good at following attributions in threads. He addressed it to you
because he read my comments in your reply. I understood them to be directed
to me.

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-25 Thread David Woolley

David L. Mills wrote:

 5. This flap about the speed of convergence has become silly. Most of us 
 are less concerned about squeezing to the low microseconds in four 

Have you done the market surveys to confirm this?  I don't have the 
resources or time to do that, but my impression from the sort of 
questions that appear on this newsgroup is that most IT managers and 
turnkey system developers who want better than 100ms clock accuracy want 
one or both of:

- fast convergence (small compared with overall bootup time) -
   a common case, these days, is that they are not allowed to process
   financial transactions until convergence is complete;

- strict monotonicity.

It may well be that most users don't need better than 100ms, but those 
users don't care about long term stability, and their long term may be 
an 8 hour shift.


(My interest in NTP is more theoretical, as I work in an industry sector 
that, whilst it deals with timestamped data, those timestamps are often 
a minute or two out (and are added by equipment that is out of our 
control), but I do notice the sorts of questions that keep coming up 
time and time again.)

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-25 Thread David L. Mills

Root,

Right; 5 microseconds per timer interrupt at 100 Hz is 0.5 ms/s. That 
was the original Unix kernel value.
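
The arithmetic behind that figure, as a small Python sketch (the function name is illustrative only): an adjustment of a fixed number of microseconds per timer interrupt, multiplied by the interrupt rate, gives microseconds per second, which is the same number as PPM.

```python
def slew_ppm(adjust_us_per_tick: float, hz: int) -> float:
    """Slew rate in PPM for a per-tick adjustment at a given interrupt rate.

    Microseconds of adjustment per second of real time equals PPM.
    """
    return adjust_us_per_tick * hz


if __name__ == "__main__":
    ppm = slew_ppm(5.0, 100)
    print(ppm, "PPM =", ppm / 1000.0, "ms/s")
```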

Dave

root wrote:
 David L. Mills [EMAIL PROTECTED] writes:
 snip

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-25 Thread David Woolley

Petri Kaukasoina wrote:
 Basically, it stepped time with ntpdate, slept 100 seconds and stepped time
 again with ntpdate. From the time adjustment, the script calculated the
 drift value and put that to the drift file. Again, the time offset always
 stays below 1 ms.

That has quite a lot of similarity with what ntpd itself does if it is 
cold started with iburst.  The only big difference is that it uses 900, 
rather than 100 seconds.  I don't know if that is the same 900 as 
controlled by tinker stepout, but, even if it is, the side effect on 
stepout's would probably be undesirable.  To cold start you need to 
delete the drift file, or not configure it.
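
The two-step calibration described in the quoted script can be sketched in Python as below. The sketch assumes the second step measured by ntpdate is entirely due to frequency error accumulated over the sleep interval; the sign convention for the drift file and the 1 ms measurement error are assumptions for illustration, not values from the thread.

```python
def drift_ppm(second_step_s: float, interval_s: float) -> float:
    """Frequency error implied by the correction needed after interval_s."""
    return second_step_s / interval_s * 1e6


def uncertainty_ppm(measure_err_s: float, interval_s: float) -> float:
    """Error in the drift estimate caused by an offset measurement error."""
    return measure_err_s / interval_s * 1e6


if __name__ == "__main__":
    # A 2.5 ms correction after 100 s implies about 25 ppm
    print(drift_ppm(0.0025, 100.0))
    # The same 1 ms measurement error hurts less over a longer interval
    print(uncertainty_ppm(0.001, 100.0), uncertainty_ppm(0.001, 900.0))
```

The second function shows the trade-off between the 100 s and 900 s intervals mentioned above: a longer interval takes longer to finish but dilutes the effect of measurement noise on the frequency estimate.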

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-25 Thread Richard B. Gilbert

David Woolley wrote:
 David L. Mills wrote:
 
 5. This flap about the speed of convergence has become silly. Most of 
 us are less concerned about squeezing to the low microseconds in four 
 
 
 Have you done the market surveys to confirm this?  I don't have the 
 resources or time to do that, but my impression from the sort of 
 questions that appear on this newsgroup is that most IT managers and 
 turnkey system developers who want better than 100ms clock accuracy want 
 one or both of:
 
 - fast convergence (small compared with overall bootup time) -
   a common case, these days, is that they are not allowed to process
   financial transactions until convergence is complete;
 
 - strict monotonicity.
 
snip

ISTR that time stamps on financial transactions are required to be 
within two seconds of the correct time.  With NTP that standard is not 
too difficult to meet.

Other applications might be far more demanding.




___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-25 Thread David L. Mills

David,

Well, I have done a market survey of sorts, if you can count my 
consulting clients. There seems general agreement that 1 ms is a good 
target, but there is a wide range of expectations on how quickly that 
must be achieved. Actually, if the TOY chip is within 1 PPM and the 
downtime is less than 1000 s, convergence is essentially instantaneous. 
My advice to the Aegis crew was to isolate the NTP puppies on the fire 
control Ethernet and allow only a couple of other computers on the wire. 
Chrony would work just fine.

Here's another contribution to the market survey. There is a seismic 
network on the sea floor off the Washington state coast. They need a 
millisecond for experiments lasting months, not just 8-hour shifts, and 
that when the experiment boxes get rather warm. Chrony might work here as 
well, but it would have to track large swings in temperature.

Here's another one. National Public Radio (NPR) distributes almost all 
program media via IP and digital satellite. They don't need 1 ms, but 
they do need good stability in the face of highly variable transmission 
delays that could drive chrony nuts.

And another one. A transatlantic link used by Ford Motor was once a 
statistical multiplexer that interleaved terminal keystrokes on a 
demand-assigned basis. Toss NTP packets in that mess and watch the huge 
jitter. That not only drove NTP nuts, it drove the TCP retransmission 
algorithm nuts, too.

Seems like the market is highly fragmented.

I hear you say 100 ms which I interpret as 100 milliseconds. Even 25 
year old fuzzballs could do much better than that on the congested 
ARPAnet. Did you mean 100 microseconds?

Dave

David Woolley wrote:

 David L. Mills wrote:
 
 5. This flap about the speed of convergence has become silly. Most of 
 us are less concerned about squeezing to the low microseconds in four 
 
 
 Have you done the market surveys to confirm this?  I don't have the 
 resources or time to do that, but my impression from the sort of 
 questions that appear on this newsgroup is that most IT managers and 
 turnkey system developers who want better than 100ms clock accuracy want 
 one or both of:
 
 - fast convergence (small compared with overall bootup time) -
   a common case, these days, is that they are not allowed to process
   financial transactions until convergence is complete;
 
 - strict monotonicity.
 
 It may well be that most users don't need better than 100ms, but those 
 users don't care about long term stability, and their long term may be 
 an 8 hour shift.
 
 
 (My interest in NTP is more theoretical, as I work in an industry sector 
 that, whilst it deals with timestamped data, those timestamps are often 
 a minute or two out (and are added by equipment that is out of our 
 control), but I do notice the sorts of questions that keep coming up 
 time and time again.)

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-25 Thread Brian Utterback

Danny, I agree with everything you said except:

Danny Mayer wrote:

 
 I agree. I don't see how it can be a specification violation. The 
 biggest factor is how well it keeps time. A caesium clock keeps good 
 time but you wouldn't say that it violates the specification.
 

When we first started looking at the V4 spec for the ntp-wg, my first
thought was the same as yours, namely that what happens inside a system
shouldn't matter, the algorithms don't matter, only what it chimes
matters. And strictly speaking, this is true. However, after reading
Dave's book (Das Buch as he calls it), I realized that an important
factor to the stability of the NTP network is the actual speed at
which the clocks slew, i.e. the 500 PPM limit. This is largely
ignored in the spec. I sent in some comments about how I thought it
should be addressed but alas, my changes didn't make it in the latest
versions.

Brian Utterback

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-25 Thread Unruh

David Woolley [EMAIL PROTECTED] writes:

David L. Mills wrote:
 
 The NTP discipline is basically a type-II feedback control system. Your 
 training should recall exactly how such a loop works and how it responds 
 to a 50-ms step. Eleven seconds after NTP comes up the mitigation 

You both have problems here.

Dave Mills:  your problem is that you haven't explained why one should 
continue to use a long time constant linear feedback system when a human 
observer can easily tell you how to get within 10 microseconds of the 
correct time after no more than about 3 samples.

Bill Unruh:  you haven't explained what real world situation this test 
is simulating; it is a standard doctrine that ntpd is not a substitute 
for good hardware and system software (e.g. you shouldn't use ntpd to 
get round lost clock interrupts).

The real world situation that the test is run on (not simulating) is having
a computer on a LAN with another computer running ntp from a Garmin PPS
acting as the server. It is a best case scenario, I will completely agree.
I still get round trip times of msec rather than 150usec at times, the
oscillators on the machines have glitches in which the clock rate changes
by 1-2PPM suddenly (over less than 1/2 hr) and then long periods of
quiescence.
I have NOT tested the two in situations where there are longer paths,
through many routers. I have not tested it on the road to Mandalay, or
Indonesia. I have been looking at the real world response in a working
system but where the network delays are minimal. 

Is my testing complete? Heavens no. It is one data point. 
Do I expect chrony to fall over on the road to Mandalay? Looking at its
design, no, but experiments are the answer. 



 algorithms present that transient to the loop and what happens 
 afterwards conforms to the equations of control theory. Discussion about 
 what happens at any time after that is a matter of mathematics and ntpd 
 does conform to the mathematics as confirmed by observation and simulation.

That's an indication that the equations are inappropriate in that context.

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-25 Thread David L. Mills

Brian,

The 500 PPM limit in the reference implementation was originally set to 
match the adjtime() slew of that value, but so many kernels have 
hacked adjtime() that this might not even be appropriate now. The bottom 
line is that an update given to adjtime() should be completed before the 
next update. Even if it's not, the leftover is carried over to the next 
update. However, in order to avoid disturbing application programs that 
compute intervals, the slew rate should be no more than necessary.

Dave

Brian Utterback wrote:
 Danny, I agree with everything you said except:
 
 Danny Mayer wrote:
 


 I agree. I don't see how it can be a specification violation. The 
 biggest factor is how well it keeps time. A caesium clock keeps good 
 time but you wouldn't say that it violates the specification.

 
 When we first started looking at the V4 spec for the ntp-wg, my first
 thought was the same as yours, namely that what happens inside a system
 shouldn't matter, the algorithms don't matter, only what it chimes
 matters. And strictly speaking, this is true. However, after reading
 Dave's book (Das Buch as he calls it), I realized that an important
 factor to the stability of the NTP network is the actual speed at
 which the clocks slew, i.e. the 500 PPM limit. This is largely
 ignored in the spec. I sent in some comments about how I thought it
 should be addressed but alas, my changes didn't make it in the latest
 versions.
 
 Brian Utterback

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-24 Thread David Woolley

Unruh wrote:
 I am sorry, but this is idiotic. The ONLY requirement should be that the
 communication protocol is implemented properly and that the clock is

Only a very small part of the mandatory parts of the NTP specification 
describe the wire formats.  The pool is an NTP network, not an SNTP one.

 Yes, I can say that. Elementary clock measurement techniques tell you which
 of the two clocks is better, even if you do not know which. How in the
 world do you think the people who run the national time standards know
 which the better or worse clocks are? They have no clock that is better

I believe they use techniques similar to those in ntpd and they are the 
people who come up with terms like Allan intercept.  However, they are 
operating with instrumentation where they know that the oscillator is 
the main source of error.  In the typical NTP setup, the clock is not 
responsible for the jitter component.

 than the ones they have to act as a standard. They have the best in the world,
  and they can tell which is better or worse. 

Actually, I believe the standard is an average of the individual clocks, 
  and has no physical hardware realization.  It is also only available a 
long time after the time to which it relates.

 Stubbornness is good. As long as it is allied with a willingness to listen
 and reexamine his own preconceptions. Scientific progress is made by people
 defending their position but being willing to give it up if it becomes
 clear that it is wrong. 

Dave Mills, if you are still reading, I would point out that anyone 
reading this, except for the committed ntpd users on this list (most of 
whom don't understand the clock discipline theory - I think I understand 
it better than many, but my understanding is still rather fuzzy - many 
have said that they don't understand and don't really care), will be 
pretty convinced that they should be using chrony for any real world 
clock synchronization.

Unless you address, on list:

- the problem that ntpd clearly reacts poorly to real world transients
   (and this is an issue that keeps getting raised, not just in this
   thread).

- why chrony's algorithms are bad.

ntpd is going to lose the battle here for anyone reading the thread who 
wasn't already fundamentally committed to ntpd.

I don't have the depth of understanding to defend the ntpd approach and 
I agree with people when they say that ntpd fails to recognize and 
rapidly recover from situations where it is trivially obvious to a human 
that the time is wrong.

___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-24 Thread Unruh

David J Taylor [EMAIL PROTECTED] writes:

chrony falls at the first hurdle for me - there appears to be no native 
Windows implementation.

Correct. chrony is not implemented on nearly as many platforms as ntp. 

There were plans once upon a time, but life got in Curnow's way. 
Anyway, I am NOT advocating everyone change to chrony. I am trying to
understand the clock discipline algorithm. chrony uses a lot of the special
features of Linux. 


___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-24 Thread David J Taylor

Unruh wrote:
 David J Taylor
 [EMAIL PROTECTED] writes:

 chrony falls at the first hurdle for me - there appears to be no
 native Windows implementation.

 Correct. chrony is not implemented on nearly as many platforms as ntp.

 There were plans once upon a time, but life got in Curnow's way.
 Anyway, I am NOT advocating everyone change to chrony. I am trying to
 understand the clock discipline algorithm. chrony uses a lot of the
 special features of Linux.

It prevents me from making a comparative test.  In general, NTP has worked 
well for me, but I have seen behaviour sometimes when the drift file seems 
to get stuck near one of the limits and NTP can't fix it.  I suspect 
that's a programming rather than a fundamental algorithm error.  Here's 
what I get, which is quite good enough for me (apart from the Vista PC):

  http://www.david-taylor.myby.co.uk/mrtg/daily_ntp.html

Cheers,
David 


___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-24 Thread Unruh

Maarten Wiltink [EMAIL PROTECTED] writes:

Unruh [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 David J Taylor
[EMAIL PROTECTED] writes:

 chrony falls at the first hurdle for me - there appears to be no native
 Windows implementation.

 Correct. chrony is not implemented on nearly as many platforms as ntp.

 There were plans once upon a time, but life got in Curnow's way.
 Anyway, I am NOT advocating everyone change to chrony. I am trying to
 understand the clock discipline algorithm. chrony uses a lot of the special
 features of Linux.

And that is _not_ a good thing. To win over the world, as the other David
(Woolley) predicts elsewhere, it would need to be available on Windows at
least. That might actually happen. But to win over some of the people who

Sure, it would be nice. It sure will not be me. However, if people are
convinced that chrony is a better approach (and that still does need
proof, even though the suggestions are there) then I am sure volunteers
will be found to port it to Windows.  That is after all how NTP got ported. 

_really_ matter, it would have to be implemented in a simple, transparent,
and platform-neutral way, and to be driven by clock engineering, not code
writing. The other other David (the original Dave) can *prove* that NTP

It is, and this discussion and my experiments are precisely to try to do
the clock experiments to see which approach works better. I think Curnow
did put a lot of effort into thinking about how to make it work when he
wrote chrony some 10 years ago. Astonishingly, it has changed very little
and still works extremely well. 


will not go unstable under a variety of adverse conditions. Curnow may
have years of logs showing that chrony keeps offsets lower than ntpd, but
standards laboratories are likely to shrug that off as anecdotal evidence,
not proof.

Well, no I do not think he does.  I suspect I have the most logs of anyone in 
the world
(www.theory.physics.ubc.ca/chrony/chrony.html-- follow the past logs link)


And anybody who's really serious about timekeeping seems to be playing with
reference clocks on FreeBSD. Err...

OK, I do not know why, but... 
Anyway, I think chrony works on BSD, but could not swear to it. It does
work on SunOS and Solaris (or did) but they have (had?) terrible clock
control, with no frequency changes, at least as implemented in chrony.
chrony is old code, but it works very well. Its whole design goal is to
reduce errors in the offset. It is really hard to go unstable if that is
the goal, but it is possible, especially if you are interested in very long
intervals between clock queries. It is at that point that you want to make
sure that your model of the frequency drift and your estimation of the
frequencies is the best possible. I do worry a bit about chrony in that
situation, but have no reasons for that worry. The very worst case is if
the system runs for a while on very short poll intervals, and then suddenly
has very long poll intervals. The short period estimation of the drift is
not a good estimator of the long period drift. But I suspect that NTP would
have problems in that situation as well. 





___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-24 Thread David L. Mills

Unruh,

The NTP discipline is basically a type-II feedback control system. Your 
training should recall exactly how such a loop works and how it responds 
to a 50-ms step. Eleven seconds after NTP comes up the mitigation 
algorithms present that transient to the loop and what happens 
afterwards conforms to the equations of control theory. Discussion about 
what happens at any time after that is a matter of mathematics and ntpd 
does conform to the mathematics as confirmed by observation and simulation.

If you have problems with the loop time constant, tough. It was chosen 
as a compromise for LANs and WANs. You are invited to justify a 
different time constant, but it has to work on a bumpy road to Malaysia.
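
For readers who want to see what a type-II loop does with a 50-ms step, the following Python sketch integrates a generic proportional-plus-integral loop. It is not the ntpd discipline: the gains kp and ki are arbitrary illustrative values, the loop is not adaptive, and there is no poll interval, clock filter or measurement noise.

```python
def simulate_step(offset0_s, kp, ki, duration_s, dt_s=1.0):
    """Euler-integrate a PI (type-II) loop after an initial offset.

    kp is the phase gain (1/s), ki the frequency-integrator gain (1/s^2).
    Returns a list of (time, offset) pairs.
    """
    offset = offset0_s
    freq_corr = 0.0  # integrator state: accumulated frequency correction
    t = 0.0
    history = [(t, offset)]
    while t < duration_s:
        rate = -kp * offset + freq_corr    # total rate applied to the clock
        freq_corr += -ki * offset * dt_s   # integrator keeps pushing while offset != 0
        offset += rate * dt_s
        t += dt_s
        history.append((t, offset))
    return history


if __name__ == "__main__":
    # Illustrative gains only (natural frequency 0.001 rad/s, damping about 0.7)
    hist = simulate_step(0.050, kp=1.414e-3, ki=1e-6, duration_s=20000.0)
    crossing = next(t for t, o in hist if o <= 0.0)
    print("first zero crossing at", crossing, "s")
    print("largest overshoot", min(o for _, o in hist), "s")
```

With any fixed pair of gains the shape is the same: the offset decays, crosses zero, overshoots and settles, on a timescale set by the gains. Shortening that timescale means raising the gains, which is the time-constant argument in this thread.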

Further discussion on this issue is neither interesting nor helpful and, 
frankly, boring.

Dave

Unruh wrote:

 David L. Mills [EMAIL PROTECTED] writes:
 
 
Maarten,
 
 
I turn my machines off and on all the time and the clock is set from the 
server within 11 seconds after starting ntpd. If I didn't use burst 
mode, that would take four minutes. Golly.
 
 
 When you say the clock is set what do you mean? With what accuracy is the
 clock running 4 min after powerup in comparison with its accuracy after say
 5 days. (let me define the accuracy as the offset ,not the jitter, but the
 offset on each measurement from your best time source.)
 
 
 
 
 
Please understand the difference between impulse response and poll 
interval. It is true that it might take 3000 s to amortize the initial 
offset from the TOY chip at power-up. This is no different than if some 
server torqued your clock by that amount.
 
 
 So, if some server did torque your clock by 50ms as a one time event, or if
 you stepped your system clock by 50ms, how long would it take ntp to settle
 down (let's say you are running at maxpoll 7, minpoll 4). Let us assume that
 in steady state your clock is controlled to 50usec. How long would it take
 to regain that +- 50usec behaviour with ntp? Again, I mean by +- 50 usec
 that the measurement offsets ( what is reported in the peerstats
 file as clock offset) are fluctuating by +-50usec?
 You may not like that as a measure of the clock accuracy, but I want to be
 clear that we are not talking about different things.
 
 
 
 
 
 
Dave
 
 
Maarten Wiltink wrote:

Unruh [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]


David L. Mills [EMAIL PROTECTED] writes:


There are lots of ways to measure the loop transient response. The
easiest way is to set the clock some 50-100 ms off from some stable
source (not necessarily accurate) and watch the loop converge. The
response should cross zero in about 3000 s and overshoot about 6
percent

3000 s is a HUGE time. For people who switch on their computers daily,
that means most of their time is spent with the computer unsynchronised
to best accuracy. The timescale of chrony is far faster. (I am not a
writer of chrony.I am a user who is trying to get the very best out of
the timekeeping.)


But NTP is from a time when people didn't switch on their computers
daily. When NTP was young, dinosaurs walked the machine room and
_you_ did _not_ get to decide when the machine on the other end of
your terminal was rebooted.

NTP can, after weeks of training, teach a computer to keep time very,
very well. As a result, it's less optimised for the other end of the
spectrum.

Features like iburst and the drift file can get your clock synchronised
to within a few milliseconds in less than a minute. If you want better
than that, or you want it faster... don't turn your computer off.

Groetjes,
Maarten Wiltink



___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-24 Thread Unruh

David L. Mills [EMAIL PROTECTED] writes:

Unruh,

I'm sure you know that an ntpd simulator is included in the NTP software 
distribution. It handles multiple simultaneous servers using the same 
algorithms as in the working daemon. We use it to test the daemon 
response to all kinds of possible but unlikely scenarios, all at warp speed.

No I did not know that, but the problem is that chrony does not. 



So, why are we having this discussion? Whip up a worthy opponent for 
chrony and we can watch a glorious battle of the simulators. For your 
first battle, I have rawstats for a bumpy backdoor path to Malaysia.

I am astonished at your comments about poll interval and reading the 
clock, which are totally independent of each other. The ntpd daemon has 

I am puzzled by what you are referring to. Oh, you mean the simulator
discussion. Never mind, that comment has absolutely nothing to do with ntp 
but referred to how easy it was to get chrony to run in simulator mode.
It was a comment purely on that issue. 
 

configurable disconnects for the feedback loop and for individual 
servers. I assume chrony has similar features, as you wouldn't be able 
to make the claims you do without them.

Reading your claims literally, chrony would have to slew the clock 
considerably greater than the 500 PPM provided by the standard Unix 
adjtime() system call. Please explain how it does that.

Using the Linux adjtimex system call which has the ability to change the
ticksize which gives much greater than 500PPM slew rate for the clocks.
( Up to 10PPM, although that is never used. ) And as I understand it,
your handling of leap seconds in ntp also uses far greater than 500PPM slew 
rates. 



Dave

Unruh wrote:
 [EMAIL PROTECTED] (Danny Mayer) writes:
 
 
Unruh wrote:

[EMAIL PROTECTED] (Danny Mayer) writes:


Virtual machines buys you the same problem as above. Even on a virtual 
machine there's only one clock. You can have only one application 
discipline that clock never mind how many virtual machines are running. 
Don't be fooled by the technology.

Not if the virtual machines have a virtual clock-- i.e. a little program
which intercepts all the clock routines and returns the output of a little
program simulating a clock. Now intercepting the various adjtimex calls is
not that hard (just rewrite the adjtimex and gettimeofday routines and
overload them for your program) but chrony and ntp also use the clock as a
scheduler, and that is a lot more difficult to simulate and catch.

 
 
As a fellow physicist I would expect you to understand this better. It's 
a basic principle in quantum mechanics: the observer influences the 
observed results. In this case, it's not enough since you are directly 
and deliberately affecting the clock itself and there really can only 
 
 
 NO, you do not understand. The clocks I am talking about are NOT hardware
 related clocks, they are just subroutines which return what is supposed to
 be a time when queried, and which change their algorithm for generating
 those numbers when disciplined by the program. 
 
 The really big problem is that the system goes into wait states, and you
 would also have to wake it up appropriately. For example, the polling
 interval is done by the clock. Now there is absolutely no reason why a poll
 which is supposed to be running at poll 10 could not return immediately
 with the clock set to tell it that 1024 sec had passed. However getting
 this right would require a really big rewrite of the NTP or chrony program.
 
 
 
and deliberately affecting the clock itself and there really can only
be one clock. Multiple clocks lead to chaotic events. All virtual
 
 
 Of course there can be many clocks. After all each computer I have has one
 so if I have 10 computers I have 10 clocks. Now of course you are referring
 to a single computer with a single bit of hardware. But the virtual clocks
 I am talking about are not hardware related at all. They are just
 subroutines which spit out a number when queried.
 
 
 
There are no simulators that I've ever seen that can run tests faster 
than real-time. They are always many orders of magnitude slower, even 
with hardware assist.

We are not asking for a machine simulator but a clock simulator and that
can run thousands of times faster than the real clock. You can run it at
any speed you want. And you can have a separate simulated clock with its
own theory of operation on each virtual machine. 
 
 
I've run many different simulators including hardware ones and I can 
assure you nothing runs slower than a simulator. Like I said there is 
only one real clock in a virtual machine, there just appears to be one 
per virtual machine.
 
 
 A simulator of a clock can run far far faster than a clock. After all I can
 output the numbers from 1 to 1 far faster than 1 sec. That is how
 weather forecasting works. The simulation of the weather is run much faster
 than the real weather. Otherwise the forecast is a bit useless.

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-24 Thread Brian Utterback

Unruh wrote:
 situation, but have no reasons for that worry. The very worst case is if
 the system runs for a while on very short poll intervals, and then suddenly
 has very long poll intervals. The short period estimation of the drift is
 not a good estimator of the long period drift. But I suspect that NTP would
 have problems in that situation as well.
 

Perhaps. Perhaps not. The NTP reference code chooses its own poll
interval based on the clock stability and the sample jitter. For
a frequency correction to be valid, the clock offset must be greater
than the sample jitter. As the frequency gets closer to the correct
value the poll interval must get longer. See, NTP has a different
design goal than chrony. The goal of NTP is not merely to keep the
clock in sync, but to also discipline the frequency, while also
providing a stable time synchronization network. If your goal is
to keep the offset as low as possible, just keep the poll interval
as short as possible. That doesn't take much work.

Brian Utterback


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-24 Thread David L. Mills

Guys,

Sure, I'm stubborn as a bull. The laws of physics make me so.

I am dismissing any comparisons between ntpd and chrony or any other 
vehicle unless the comparison includes substantially all the scenarios 
that ntpd is designed to work with. The protocol is specifically 
designed to work over a wide spectrum including lightly loaded LANs and 
highly congested WANs. The choice of parameters, specifically the time 
constant and operating range, was chosen as a compromise to maximize 
accuracy and minimize network loads under typical and extreme conditions.

As for the SNTP restrictions, please, please read the draft 
specification, which explains exactly what SNTP should and should not 
do. At the crux of the matter is the impulse response of a cascade of 
intervening servers each with its own idiosyncratic impulse response. 
The NTP impulse response has a controlled risetime and overshoot over a 
wide range of time constants. Each server in the cascade must have the 
same impulse response to avoid instabilities and possible whip effects.

We could have simply specified the transfer function in polynomial form 
(it's in RFC 1305 and das Buch) and told the implementor to use that. A 
student of digital signal processing would know how to use that 
directly. But, we thought there would be folks like you that would not 
believe the principles and do something evil like bring up a pool server 
running openntp or chrony and synchronized via a flaky circuit to Indonesia.

It is easy to detect that a particular server has or has not the current 
reference implementation. There are a number of features intrinsic to 
the protocol design and others fiendishly crafted to do that, but I'm 
not going to reveal them here.

Dave


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-24 Thread David L. Mills

Guys,

Reprinted without permission from the draft spec:

14.  Simple Network Time Protocol (SNTP)

Primary servers and clients complying with a subset of NTP, called
the Simple Network Time Protocol (SNTPv4) [2], do not need to
implement the mitigation algorithms described in Section 9 and
following sections.  SNTP is intended for primary servers equipped
with a single reference clock, as well as for clients with a single
upstream server and no dependent clients.  The fully developed NTPv4
implementation is intended for secondary servers with multiple
upstream servers and multiple downstream servers or clients.  Other
than these considerations, NTP and SNTP servers and clients are
completely interoperable and can be intermixed in NTP subnets.

An SNTP primary server implementing the on-wire protocol described in
Section 8 has no upstream servers except a single reference clock.
In principle, it is indistinguishable from an NTP primary server that
has the mitigation algorithms and therefore capable of mitigating
between multiple reference clocks.

Upon receiving a client request, an SNTP primary server constructs
and sends the reply packet as described in Figure 34.  Note that the
dispersion field in the packet header must be updated as described in
Section 5.

Dave

Danny Mayer wrote:

snip


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-24 Thread Unruh

David L. Mills [EMAIL PROTECTED] writes:

Unruh,

This answers my earlier question. I can't believe this is so crude and 
dangerous. You really need to provide an analysis of the errors this 
creates when reading the clock during the slew. The problem is not the 
residual time offset but the rate at which time changes. Measuring time 

I am confused. The clock is off from the real time. Anything that happens
during the slew is less off than it was before. You are worried perhaps
that instead of 2 seconds having passed only 1 second and 700 msec have
passed?
Note that the rate change is also limited in chrony even during an offset
slew.

intervals is very different during the slew. The NTP design carefully 
limits this to no more than 500 microseconds per second without the kernel 
and even smaller with the kernel.

OK, why?



Dave

Unruh wrote:

 Brian Utterback [EMAIL PROTECTED] writes:
 
 
Unruh wrote:

Just an update: I started chrony with a 60ms offset. It had the right drift
file. It took about 1 min ( having collected about 4 samples from the
servers at minpoll 4) to drive the offset down to about 100 usec (Yes, a
1000 fold improvement in about 50 sec.) Ie, the time constant for
correction of offset errors is enough time to collect enough samples to
determine that the offset really is statistically way off. 

 
 
Is that supposed to be impressive? One of the design constraints of NTP
is to limit the clock frequency change during offset adjustments to
500ppm to prevent NTP network instabilities. If the offset was
amortized over the 50secs you stated, then that is a slew rate of
1200 ppm. If this happened entirely at the end of the 4 samples, then it 
sounds simply like a step to me. By that reasoning, ntpdate far
 
 
 NO it is NOT a step. It is done via a fast slew by a change in the tick
 size, which can be 10% (i.e. +-100000 PPM). The clock always runs forward.
 It does not step. It may seem like a step from the point of view of the
 coarse sampling done by chrony or ntp, but if you ran a PPS clock and
 looked at the time returned by gettimeofday, it would be continuous and
 positive, just like ntp. When the NTP offset
 changes by 100ms between samples spaced at 500 sec apart, did it do that by
 stepping? No it did it by increasing the frequency by 200PPM. Chrony
 behaves the same way, only it uses the ticksize as well as the frequency to
 produce fast slews to get rid of the offsets, and it does not go unstable
 that I have ever seen. 
  
 
 
outperforms chrony. I presume that chrony cannot behave as a server and
only does clients right?
 
 
 Chrony is also  a server. The key detraction for me is that it cannot use 
 hardware clocks. 
 It also does not act as a multicast/broadcast server  which may be a
 detraction for others and does not do leap
 seconds. On the other hand with its rapid response it will correct the
 leapsecond within less than an hour. 
 
 Anyway, the issue here is the clock disciplining routine, not a comparison
 of the chronyd program with the ntp implementation.
 
 I am arguing that chrony's clock discipline routine keeps the hardware
 clock much closer to the real time (in the real world) and reacts to real
 world changes much faster than does the NTP discipline routine. 
 
 
 And chrony is just as stable it seems as NTP is. The offset fluctuations
 are better than NTP's are. The key question is how close to the real time
 is the time that the system clock delivers. Chrony is closer by factors of
 at least 2 and probably if run at high priority as my ntp is, much better
 than that. In particular if there are glitches in the clock drift rate,
 chrony reacts much faster, and keeps the time much much closer to the true
 time.  Instability would produce worse behaviour not better. 
 
 
 
 
I also started chrony without a drift file. In this case it took about 5
min to get a frequency within 10% of the long term stable frequency and
that error disappeared within 1/2 hour.

 
 
I don't know about the version of ntp you are running, but recent
versions have a bug in the initial frequency calculations which
has since been fixed, but not released (ahem. Harlan?).
 
 
 The initial horrible  transient was under 4.2.0. After this round I will
 try an initial transient test with 4.2.4.  But the transient
 behaviour I am describing in the previous post is during the normal running
  of NTP. It is not  an initial transient. It is the response of the system
 to a real world drift rate glitch.
 
 It is after NTP has been running for 5 days and the hardware clock on the
 machine suffered a frequency glitch. I have no idea what is causing those
 frequency glitches-- the clock suddenly changes its drift rate by 0.2 to 2 PPM.
 I have seen this both with a chrony controlled clock and an NTP controlled
 clock. It is just that the NTP response is not good. 
 
 
 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-24 Thread Unruh

Brian Utterback [EMAIL PROTECTED] writes:

Unruh wrote:
 situation, but have no reasons for that worry. The very worst case is if
 the system runs for a while on very short poll intervals, and then suddenly
 has very long poll intervals. The short period estimation of the drift is
 not a good estimator of the long period drift. But I suspect that NTP would
 have problems in that situation as well.
 

Perhaps. Perhaps not. The NTP reference code chooses its own poll
interval based on the clock stability and the sample jitter. For
a frequency correction to be valid, the clock offset must be greater
than the sample jitter. As the frequency gets closer to the correct
value the poll interval must get longer. See, NTP has a different
design goal than chrony. The goal of NTP is not merely to keep the
clock in sync, but to also discipline the frequency, while also
providing a stable time synchronization network. If your goal is
to keep the offset as low as possible, just keep the poll interval
as short as possible. That doesn't take much work.

No, I am referring to the case where the network suddenly goes down for 2
days. Your poll has gone from 2 min (say on maxpoll 10) to 3 days.
The design goal of ntp and chrony is the same as you outline for ntp.
The question is what algorithm accomplishes those goals, including
minimizing the offset. Chrony is just as adaptive on poll intervals as is
ntp. It is just that sometimes the world hands you garbage, and the question
is how the system responds to the garbage.




Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-24 Thread David Woolley

David L. Mills wrote:
 
 The NTP discipline is basically a type-II feedback control system. Your 
 training should recall exactly how such a loop works and how it responds 
 to a 50-ms step. Eleven seconds after NTP comes up the mitigation 

You both have problems here.

Dave Mills:  your problem is that you haven't explained why one should 
continue to use a long time constant linear feedback system when a human 
observer can easily tell you how to get within 10 microseconds of the 
correct time after no more than about 3 samples.

Bill Unruh:  you haven't explained what real world situation this test 
is simulating; it is a standard doctrine that ntpd is not a substitute 
for good hardware and system software (e.g. you shouldn't use ntpd to 
get round lost clock interrupts).

 algorithms present that transient to the loop and what happens 
 afterwards conforms to the equations of control theory. Discussion about 
 what happens at any time after that is a matter of mathematics and ntpd 
 does conform to the mathematics as confirmed by observation and simulation.

That's an indication that the equations are inappropriate in that context.


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-23 Thread Unruh


Just an update: I started chrony with a 60ms offset. It had the right drift
file. It took about 1 min ( having collected about 4 samples from the
servers at minpoll 4) to drive the offset down to about 100 usec (Yes, a
1000 fold improvement in about 50 sec.) Ie, the time constant for
correction of offset errors is enough time to collect enough samples to
determine that the offset really is statistically way off. 
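
As a quick check on the timing quoted above, here is a minimal sketch that turns a poll exponent into seconds and estimates how long a handful of samples takes to collect. It assumes the usual NTP convention that a poll value p means an interval of 2^p seconds. The helper names are invented for this example.

```python
def poll_seconds(poll_exponent: int) -> int:
    """Poll interval in seconds for a given poll exponent (interval = 2**exponent)."""
    return 2 ** poll_exponent


def collection_time_s(samples: int, poll_exponent: int, first_immediate: bool = False) -> int:
    """Rough time needed to gather `samples` measurements at a fixed poll interval.

    If the first sample is taken immediately at start-up, only samples - 1
    intervals have to elapse.
    """
    intervals = samples - 1 if first_immediate else samples
    return intervals * poll_seconds(poll_exponent)


if __name__ == "__main__":
    print("minpoll 4 interval:", poll_seconds(4), "s")
    print("4 samples, waiting a full interval each:", collection_time_s(4, 4), "s")
    print("4 samples, first one immediate:", collection_time_s(4, 4, True), "s")
    print("poll 10 interval:", poll_seconds(10), "s")
```

Both estimates (48 s and 64 s) are in line with the "about 50 sec" and "about 1 min" figures reported here.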

I also started chrony without a drift file. In this case it took about 5
min to get a frequency within 10% of the long term stable frequency and
that error disappeared within 1/2 hour.

I have also patched chrony so that it can put itself at max priority. It
seems clear to me that the reason that NTP was so much better at the round
trip scatter was that it was running at max priority. I.e., the large spikes
in the round trip times were because chrony was not being woken up, or was
swapped out, rather than any problem with the network.

However I will have to run chrony again for a while to collect statistics.
However, even without the high priority, chrony did better than NTP at
keeping the clock disciplined, and this taming of the round trip
fluctuations should help. 

To compare the transient response of chrony and NTP, look at the graphs for
flory (bottom graph on the right at
www.theory.physics.ubc.ca/chrony/chrony.html)
and fluxon (fourth down on the right). Both appear to have suffered a sudden
change in the drift rate of the clock.
On the NTP controlled clock there seems to have been a sudden 0.2 PPM change
in the drift rate of the clock on Jan 22.8. This caused a 500 usec error in
the clock offset, which took a few hours to settle down.
Contrast this with fluxon at Jan 21.27 where it seems to have suffered a
2PPM sudden change in the drift (ten times the change that flory suffered).
This caused only a 200 usec offset, which chrony corrected within 5 min.
The similar jump at 21.4 behaved in the same way. I.e., a jump 10 times as
big had an effect less than 1/2 as large, and was fixed on a timescale
over 20 times shorter.
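
To put the two glitches on a common footing, the sketch below computes how long an uncorrected frequency step would need to build up a given offset error, using the fact that 1 PPM is 1 microsecond of error per second. It is only back-of-the-envelope arithmetic on the numbers quoted above, not an analysis of either daemon, and the function name is made up here.

```python
def seconds_to_accumulate(offset_us: float, drift_change_ppm: float) -> float:
    """Time for an uncorrected frequency step to build up offset_us of error.

    1 PPM equals 1 microsecond of error per second, so the result is in seconds.
    """
    if drift_change_ppm <= 0:
        raise ValueError("drift_change_ppm must be positive")
    return offset_us / drift_change_ppm


if __name__ == "__main__":
    flory = seconds_to_accumulate(500.0, 0.2)   # NTP-controlled clock
    fluxon = seconds_to_accumulate(200.0, 2.0)  # chrony-controlled clock
    print(f"flory:  {flory:.0f} s of uncorrected drift equivalent")
    print(f"fluxon: {fluxon:.0f} s of uncorrected drift equivalent")
```

Read this way, the 500 usec excursion corresponds to roughly 2500 s of effectively uncorrected drift, while the 200 usec excursion corresponds to roughly 100 s.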




Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-23 Thread Danny Mayer

Harlan Stenn wrote:
 In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Danny Mayer) writes:
 
 Danny Harlan Stenn wrote:
 Unruh Unfortunately I cannot run both ntp and chrony on the same system at
 Unruh the same time.
  Bill,

 Exactly why can you not run ntpd and chrony on the same system at the
 same time?
 
 Danny Harlan, really. You *cannot* have two different
 Danny mechanisms/applications to discipline the clock at the same time. I
 Danny invite you to try. You have access to my code so you can test this
 Danny easily.
 
 You are, as is so often the case, missing my point.  It is possible to run
 ntpd in a way that it does not discipline the clock.

In which case you are not able to compare the two algorithms since the 
clock is the central part of the testing and algorithm.

  I am curious about
 your last sentence though - what is special about your code that would allow
 this to be tested?
 

It includes the noall option. You can then run two instances on the same 
box and have one discipline the clock and the other not. Feel free to 
have both of them try. The results will be hilarious.

 I want the ability to run multiple instances of ntpd where at most 1
 instance of ntpd is actually controlling the clock, specifically to make
 it easy to (more quickly) analyze the performance/behavior of different
 configurations of ntpd.  I understand that the boat is rocking while this
 is going on, but I suspect this capability would be a useful one in at
 least some cases.

 
 Danny I don't see the benefit of doing this with two separate
 Danny instances. It's easier and simpler to just add the other servers into
 Danny the one instance and specify noselect.
 
 Again you are missing my point.  Allowing this would let us, for example,
 see how two different versions of ntpd would discipline the clock.  It would
 allow us to see how ntpd might discipline the clock compared to chrony.
 
 I understand and get that by not actually disciplining the clock we are
 removing an important part of the feedback loop, and I do not know if that
 will fatally affect these sort of experiments or not.
 
 

No you cannot do that. The clock is the central part of the algorithm. 
You *cannot* have two different applications discipline the clock 
without disastrous results.

Put it this way: chrony decides that it needs to adjust the clock 
frequency and offset by amounts dX and X. ntpd decides to change them by dY 
and Y. When chrony next looks at the clock it decides that the change it 
made wasn't good enough and makes changes by an even bigger amount and 
delta, and so on and so forth.
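
The scenario just described can be caricatured with a toy model. This is a deliberately crude sketch and does not model the actual algorithms of ntpd or chrony: it assumes every daemon measures the same offset at the same instant and then applies its own full correction, unaware of the others. All names are hypothetical.

```python
def simulate(initial_offset_s: float, daemons: int, rounds: int, gain: float = 1.0) -> list:
    """Offset history when `daemons` independent correctors share one clock.

    Each round, every daemon sees the same measured offset and subtracts
    gain * measured from the clock, without knowing about the others.
    """
    offset = initial_offset_s
    history = [offset]
    for _ in range(rounds):
        measured = offset              # all daemons read the same clock
        for _ in range(daemons):
            offset -= gain * measured  # each applies its own correction
        history.append(offset)
    return history


if __name__ == "__main__":
    print("one daemon: ", simulate(0.06, daemons=1, rounds=4))
    print("two daemons:", simulate(0.06, daemons=2, rounds=4))
```

In this toy model a single corrector removes the 60 ms offset in one round, while two correctors overshoot by the full amount every round, so the offset flips sign forever and never shrinks.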

 And as Bill said, it would be Swell if there was a way to do this using, eg,
 virtual machines so that we could test them that way.  Better yet, it would
 be nice to have a simulator framework where we could run these tests faster
 than in real-time.

Virtual machines buys you the same problem as above. Even on a virtual 
machine there's only one clock. You can have only one application 
discipline that clock never mind how many virtual machines are running. 
Don't be fooled by the technology.

There are no simulators that I've ever seen that can run tests faster 
than real-time. They are always many orders of magnitude slower, even 
with hardware assist.

Danny

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-23 Thread Unruh

[EMAIL PROTECTED] (Danny Mayer) writes:

...
 And as Bill said, it would be Swell if there was a way to do this using, eg,
 virtual machines so that we could test them that way.  Better yet, it would
 be nice to have a simulator framework where we could run these tests faster
 than in real-time.

Virtual machines buys you the same problem as above. Even on a virtual 
machine there's only one clock. You can have only one application 
discipline that clock never mind how many virtual machines are running. 
Don't be fooled by the technology.

Not if the virtual machines have a virtual clock-- i.e. a little program
which intercepts all the clock routines and returns the output of a little
program simulating a clock. Now intercepting the various adjtimex calls is
not that hard (just rewrite the adjtimex and gettimeofday routines and
overload them for your program) but chrony and ntp also use the clock as a
scheduler, and that is a lot more difficult to simulate and catch.

There are no simulators that I've ever seen that can run tests faster 
than real-time. They are always many orders of magnitude slower, even 
with hardware assist.
We are not asking for a machine simulator but a clock simulator and that
can run thousands of times faster than the real clock. You can run it at
any speed you want. And you can have a separate simulated clock with its
own theory of operation on each virtual machine. 







Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-23 Thread Danny Mayer

Unruh wrote:
 [EMAIL PROTECTED] (Danny Mayer) writes:

 Virtual machines buys you the same problem as above. Even on a virtual 
 machine there's only one clock. You can have only one application 
 discipline that clock never mind how many virtual machines are running. 
 Don't be fooled by the technology.
 
 Not if the virtual machines have a virtual clock-- i.e. a little program
 which intercepts all the clock routines and returns the output of a little
 program simulating a clock. Now intercepting the various adjtimex calls is
 not that hard (just rewrite the adjtimex and gettimeofday routines and
 overload them for your program) but chrony and ntp also use the clock as a
 scheduler, and that is a lot more difficult to simulate and catch.
 

As a fellow physicist I would expect you to understand this better. It's 
a basic principle in quantum mechanics: the observer influences the 
observed results. In this case, it's not enough since you are directly 
and deliberately affecting the clock itself and there really can only 
be one clock. Multiple clocks lead to chaotic events. All virtual 
clocks are driven off the real one which means that updating the clock 
needs to update the real clock. You don't really have separate clocks, 
it just looks like you do.

 There are no simulators that I've ever seen that can run tests faster 
 than real-time. They are always many orders of magnitude slower, even 
 with hardware assist.
 We are not asking for a machine simulator but a clock simulator and that
 can run thousands of times faster than the real clock. You can run it at
 any speed you want. And you can have a separate simulated clock with its
 own theory of operation on each virtual machine. 

I've run many different simulators including hardware ones and I can 
assure you nothing runs slower than a simulator. Like I said there is 
only one real clock in a virtual machine, there just appears to be one 
per virtual machine.

Danny

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-23 Thread David L. Mills

Maarten,

I turn my machines off and on all the time and the clock is set from the 
server within 11 seconds after starting ntpd. If I didn't use burst 
mode, that would take four minutes. Golly.

Please understand the difference between impulse response and poll 
interval. It is true that it might take 3000 s to amortize the initial 
offset from the TOC chip at power-up. This is no different than if some 
server torqued your clock by that amount.

Dave

Maarten Wiltink wrote:
 Unruh [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]
 
David L. Mills [EMAIL PROTECTED] writes:
 
 
There are lots of ways to measure the loop transient response. The
easiest way is to set the clock some 50-100 ms off from some stable
source (not necessarily accurate) and watch the loop converge. The
response should cross zero in about 3000 s and overshoot about 6
percent

3000 s is a HUGE time. For people who switch on their computers daily,
that means most of their time is spent with the computer unsynchronised
to best accuracy. The timescale of chrony is far faster. (I am not a
writer of chrony. I am a user who is trying to get the very best out of
the timekeeping.)
 
 
 But NTP is from a time when people didn't switch on their computers
 daily. When NTP was young, dinosaurs walked the machine room and
 _you_ did _not_ get to decide when the machine on the other end of
 your terminal was rebooted.
 
 NTP can, after weeks of training, teach a computer to keep time very,
 very well. As a result, it's less optimised for the other end of the
 spectrum.
 
 Features like iburst and the drift file can get your clock synchronised
 to within a few milliseconds in less than a minute. If you want better
 than that, or you want it faster... don't turn your computer off.
 
 Groetjes,
 Maarten Wiltink
 
 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-23 Thread Unruh

[EMAIL PROTECTED] (Danny Mayer) writes:

Unruh wrote:
 [EMAIL PROTECTED] (Danny Mayer) writes:

 Virtual machines buys you the same problem as above. Even on a virtual 
 machine there's only one clock. You can have only one application 
 discipline that clock never mind how many virtual machines are running. 
 Don't be fooled by the technology.
 
 Not if the virtual machines have a virtual clock-- i.e. a little program
 which intercepts all the clock routines and returns the output of a little
 program simulating a clock. Now intercepting the various adjtimex calls is
 not that hard (just rewrite the adjtimex and gettimeofday routines and
 overload them for your program) but chrony and ntp also use the clock as a
 scheduler, and that is a lot more difficult to simulate and catch.
 

As a fellow physicist I would expect you to understand this better. It's 
a basic principle in quantum mechanics: the observer influences the 
observed results. In this case, it's not enough since you are directly 
and deliberately affecting the clock itself and there really can only 

NO, you do not understand. The clocks I am talking about are NOT hardware
related clocks, they are just subroutines which return what is supposed to
be a time when queried, and which change their algorithm for generating
those numbers when disciplined by the program. 

The really big problem is that the system goes into wait states, and you
would also have to wake it up appropriately. For example, the polling
interval is done by the clock. Now there is absolutely no reason why a poll
which is supposed to be running at poll 10 could not return immediately
with the clock set to tell it that 1024 sec had passed. However getting
this right would require a really big rewrite of the NTP or chrony program.


and deliberately affecting the clock itself and there really can only
be one clock. Multiple clocks lead to chaotic events. All virtual

Of course there can be many clocks. After all each computer I have has one
so if I have 10 computers I have 10 clocks. Now of course you are referring
to a single computer with a single bit of hardware. But the virtual clocks
I am talking about are not hardware related at all. They are just
subroutines which spit out a number when queried.


 There are no simulators that I've ever seen that can run tests faster 
 than real-time. They are always many orders of magnitude slower, even 
 with hardware assist.
 We are not asking for a machine simulator but a clock simulator and that
 can run thousands of times faster than the real clock. You can run it at
 any speed you want. And you can have a separate simulated clock with its
 own theory of operation on each virtual machine. 

I've run many different simulators including hardware ones and I can 
assure you nothing runs slower than a simulator. Like I said there is 
only one real clock in a virtual machine, there just appears to be one 
per virtual machine.

A simulator of a clock can run far far faster than a clock. After all I can
output the numbers from 1 to 1 far faster than 1 sec. That is how
weather forecasting works. The simulation of the weather is run much faster
than the real weather. Otherwise the forecast is a bit useless.
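
The idea of a clock that is "just a subroutine" can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from chrony or ntpd: the class, its method names, and the polling helper are all invented here, and the "wait" for the next poll is replaced by simply jumping simulated time forward.

```python
class SimClock:
    """A software-only clock: a subroutine that returns a number when queried."""

    def __init__(self, freq_error_ppm: float = 0.0, offset_s: float = 0.0):
        self.true_time = 0.0                  # simulated "real" time, in seconds
        self.freq_error_ppm = freq_error_ppm  # error of the simulated oscillator
        self.adj_ppm = 0.0                    # correction set by the disciplining program
        self._reading = offset_s              # what the clock currently reports

    def advance(self, dt: float) -> None:
        """Move simulated time forward by dt seconds without actually waiting."""
        rate = 1.0 + (self.freq_error_ppm + self.adj_ppm) * 1e-6
        self._reading += dt * rate
        self.true_time += dt

    def gettimeofday(self) -> float:
        return self._reading

    def adjust_frequency(self, ppm: float) -> None:
        self.adj_ppm = ppm

    def offset(self) -> float:
        return self._reading - self.true_time


def run_polls(clock: SimClock, poll_s: float, polls: int) -> list:
    """Instead of sleeping poll_s per poll, jump the clock and return at once."""
    history = []
    for _ in range(polls):
        clock.advance(poll_s)
        history.append(clock.offset())
    return history


if __name__ == "__main__":
    clock = SimClock(freq_error_ppm=50.0)
    offsets = run_polls(clock, poll_s=1024.0, polls=1000)
    print(f"simulated {clock.true_time:.0f} s, final offset {offsets[-1]:.4f} s")
```

A thousand polls at poll 10 cover more than eleven simulated days and finish almost instantly. The hard part noted above, making a real daemon's scheduler wake up on simulated time, is exactly what this sketch sidesteps.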



Danny


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-23 Thread Unruh

Brian Utterback [EMAIL PROTECTED] writes:

Unruh wrote:
 Just an update: I started chrony with a 60ms offset. It had the right drift
 file. It took about 1 min ( having collected about 4 samples from the
 servers at minpoll 4) to drive the offset down to about 100 usec (Yes, a
 1000 fold improvement in about 50 sec.) Ie, the time constant for
 correction of offset errors is enough time to collect enough samples to
 determine that the offset really is statistically way off. 
 

Is that supposed to be impressive? One of the design constraints of NTP
is to limit the clock frequency change during offset adjustments to
500ppm to prevent NTP network instabilities. If the offset was
amortized over the 50secs you stated, then that is a slew rate of
1200 ppm. If this happened entirely at the end of the 4 samples, then it 
sounds simply like a step to me. By that reasoning, ntpdate far

NO it is NOT a step. It is done via a fast slew by a change in the tick
size, which can be 10% (i.e. +-100000 PPM). The clock always runs forward.
It does not step. It may seem like a step from the point of view of the
coarse sampling done by chrony or ntp, but if you ran a PPS clock and
looked at the time returned by gettimeofday, it would be continuous and
positive, just like ntp. When the NTP offset
changes by 100ms between samples spaced at 500 sec apart, did it do that by
stepping? No it did it by increasing the frequency by 200PPM. Chrony
behaves the same way, only it uses the ticksize as well as the frequency to
produce fast slews to get rid of the offsets, and it does not go unstable
that I have ever seen. 
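
For reference, the slew rates quoted in this exchange are just the offset divided by the time over which it is removed. A small Python sketch of that arithmetic; the function name is made up for illustration and is not part of ntpd or chrony:

```python
def slew_rate_ppm(offset_s: float, interval_s: float) -> float:
    """Average frequency change, in parts per million, needed to remove
    offset_s seconds of clock error over interval_s seconds."""
    return offset_s / interval_s * 1e6


# 60 ms removed over 50 s, as in the chrony startup described above
chrony_startup = slew_rate_ppm(0.060, 50.0)
# 100 ms removed between samples 500 s apart
ntp_example = slew_rate_ppm(0.100, 500.0)
print(round(chrony_startup), round(ntp_example))
```

The first case is well above the 500 PPM limit mentioned for ntpd; the second is within it.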
 

outperforms chrony. I presume that chrony cannot behave as a server and
only does clients right?

Chrony is also a server. The key drawback for me is that it cannot use
hardware clocks.
It also does not act as a multicast/broadcast server, which may be a
drawback for others, and does not do leap
seconds. On the other hand, with its rapid response it will correct the
leap second within less than an hour.

Anyway, the issue here is the clock disciplining routine, not a comparison
of the chronyd program with the ntp implementation.

I am arguing that chrony's clock discipline routine keeps the hardware
clock much closer to the real time (in the real world) and reacts to real
world changes much faster than does the NTP discipline routine. 


And chrony is just as stable it seems as NTP is. The offset fluctuations
are better than NTP's are. The key question is how close to the real time
is the time that the system clock delivers. Chrony is closer by factors of
at least 2 and probably if run at high priority as my ntp is, much better
than that. In particular if there are glitches in the clock drift rate,
chrony reacts much faster, and keeps the time much much closer to the true
time.  Instability would produce worse behaviour not better. 



 I also started chrony without a drift file. In this case it took about 5
 min to get a frequency within 10% of the long term stable frequency and
 that error disappeared within 1/2 hour.
 

I don't know about the version of ntp you are running, but recent
versions have a bug in the initial frequency calculations which
has since been fixed, but not released (ahem. Harlan?).

The initial horrible transient was under 4.2.0. After this round I will
try an initial transient test with 4.2.4. But the transient
behaviour I am describing in the previous post is during the normal running
of NTP. It is not an initial transient. It is the response of the system
to a real world drift rate glitch.

It is after NTP has been running for 5 days and the hardware clock on the
machine suffered a frequency glitch. I have no idea what is causing those
frequency glitches-- the clock suddenly changes its drift rate by 0.2 to 2 PPM.
I have seen this both with a chrony controlled clock and an NTP controlled
clock. It is just that the NTP response is not good.




Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-23 Thread Unruh

David L. Mills [EMAIL PROTECTED] writes:

Maarten,

I turn my machines off and on all the time and the clock is set from the 
server within 11 seconds after starting ntpd. If I didn't use burst 
mode, that would take four minutes. Golly.

When you say the clock is set, what do you mean? With what accuracy is the
clock running 4 min after powerup, in comparison with its accuracy after, say,
5 days? (Let me define the accuracy as the offset, not the jitter: the
offset on each measurement from your best time source.)




Please understand the difference between impulse response and poll 
interval. It is true that it might take 3000 s to amortize the initial 
offset from the TOC chip at power-up. This is no different than if some 
server torqued your clock by that amount.

So, if some server did torque your clock by 50ms as a one-time event, or if
you stepped your system clock by 50ms, how long would it take ntp to settle
down (let's say you are running at maxpoll 7, minpoll 4)? Let us assume that
in steady state your clock is controlled to 50usec. How long would it take
to regain that +-50usec behaviour with ntp? Again, I mean by +-50usec
that the measurement offsets (what is reported in the peerstats
file as clock offset) are fluctuating by +-50usec.
You may not like that as a measure of the clock accuracy, but I want to be
clear that we are not talking about different things.





Dave

Maarten Wiltink wrote:
 Unruh [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]
 
David L. Mills [EMAIL PROTECTED] writes:
 
 
There are lots of ways to measure the loop transient response. The
easiest way is to set the clock some 50-100 ms off from some stable
source (not necessarily accurate) and watch the loop converge. The
response should cross zero in about 3000 s and overshoot about 6
percent

3000 s is a HUGE time. For people who switch on their computers daily,
that means most of their time is spent with the computer unsynchronised
to best accuracy. The timescale of chrony is far faster. (I am not a
writer of chrony. I am a user who is trying to get the very best out of
the timekeeping.)
 
 
 But NTP is from a time when people didn't switch on their computers
 daily. When NTP was young, dinosaurs walked the machine room and
 _you_ did _not_ get to decide when the machine on the other end of
 your terminal was rebooted.
 
 NTP can, after weeks of training, teach a computer to keep time very,
 very well. As a result, it's less optimised for the other end of the
 spectrum.
 
 Features like iburst and the drift file can get your clock synchronised
 to within a few milliseconds in less than a minute. If you want better
 than that, or you want it faster... don't turn your computer off.
 
 Groetjes,
 Maarten Wiltink
 
 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-23 Thread Danny Mayer

Unruh wrote:
 [EMAIL PROTECTED] (Danny Mayer) writes:
 
 Unruh wrote:
 [EMAIL PROTECTED] (Danny Mayer) writes:

 Virtual machines buys you the same problem as above. Even on a virtual 
 machine there's only one clock. You can have only one application 
 discipline that clock never mind how many virtual machines are running. 
 Don't be fooled by the technology.
 Not if the virtual machines have a virtual clock-- i.e. a little program
 which intercepts all the clock routines and returns the output of a little
 program simulating a clock. Now intercepting the various adjtimex calls is
 not that hard (just rewrite the adjtimex and gettimeofday routines and
 overload them for
 your program) but chrony and ntp also use the clock as a scheduler, and
 that is a lot more difficult to simulate and catch.

 
 As a fellow physicist I would expect you to understand this better. It's
 a basic principle in quantum mechanics: the observer influences the
 observed results. In this case, it's not enough since you are directly
 and deliberately affecting the clock itself and there really can only
 
 NO, you do not understand. The clocks I am talking about are NOT hardware
 related clocks, they are just subroutines which return what is supposed to
 be a time when queried, and which change their algorithm for generating
 those numbers when disciplined by the program.
 

Clocks are not stable enough to be treated as just an algorithm in a
subroutine. Real clocks are unstable, otherwise we wouldn't be having
this conversation. In other words, you are not conducting a real
experiment and you are not modeling the way an actual clock works.

 The really big problem is that the system goes into wait states, and you
 would also have to wake it up appropriately. For example, the polling
 interval is done by the clock. Now there is absolutely no reason why a poll
 which is supposed to be running at poll 10 could not return immediately
 with the clock set to tell it that 1024 sec had passed. However getting
 this right would require a really big rewrite of the NTP or chrony program.
 
 
 and deliberately affecting the clock itself and there really can only
 be one clock. Multiple clocks lead to chaotic events. All virtual
 
 Of course there can be many clocks. After all each computer I have has one
 so if I have 10 computers I have 10 clocks. Now of course you are referring
 to a single computer with a single bit of hardware. But the virtual clocks
 I am talking about are not hardware related at all. They are just
 subroutines which spit out a number when queried.
 

Virtual machines run on a real machine. The clock of the virtual machine
is the same as that of the real machine, it's just hidden from you. Subroutines
don't model a real clock. You have to implement subroutines based on
real clock behavior.

 
 There are no simulators that I've ever seen that can run tests faster 
 than real-time. They are always many orders of magnitude slower, even 
 with hardware assist.
 We are not asking for a machine simulator but a clock simulator and that
 can run thousands of times faster than the real clock. You can run it at
 any speed you want. And you can have a separate simulated clock with its
 own theory of operation on each virtual machine. 
 
 I've run many different simulators including hardware ones and I can 
 assure you nothing runs slower than a simulator. Like I said there is 
 only one real clock in a virtual machine, there just appears to be one 
 per virtual machine.
 
 A simulator of a clock can run far, far faster than a clock. After all, I can
 output the numbers from 1 to 1 far faster than 1 sec. That is how
 weather forecasting works. The simulation of the weather is run much faster
 than the real weather. Otherwise the forecast is a bit useless.

Weather modeling requires a great deal of effort to reflect actual
weather fluctuations and changes. It's a very different model and
situation and the feedback loop is likely to be much weaker in weather
modeling. And the observer does not influence the results (except for
their own personal biases).

Danny

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-23 Thread Harlan Stenn

 In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Danny Mayer) writes:

Danny You do realize that there are timers built into the code so in order
Danny to run faster you'd need to figure out how to change the timers to
Danny work that way?

When was the last time you looked at the ntpdsim code?
-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-23 Thread Unruh

[EMAIL PROTECTED] (Danny Mayer) writes:

Harlan Stenn wrote:
 In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Danny Mayer) writes:
 
 Danny Harlan Stenn wrote:
 Unruh Unfortunately I cannot run both ntp and chrony on the same system at
 Unruh the same time.
  Bill,

 Exactly why can you not run ntpd and chrony on the same system at the
 same time?
 
 Danny Harlan, really. You *cannot* have two different
 Danny mechanisms/applications to discipline the clock at the same time. I
 Danny invite you to try. You have access to my code so you can test this
 Danny easily.
 
 You are, as is so often the case, missing my point.  It is possible to run
 ntpd in a way that it does not discipline the clock.  I am curious about
 your last sentence though - what is special about your code that would allow
 this to be tested?
 
 I want the ability to run multiple instances of ntpd where at most 1
 instance of ntpd is actually controlling the clock, specifically to make
 it easy to (more quickly) analyze the performance/behavior of different
 configurations of ntpd.  I understand that the boat is rocking while this
 is going on, but I suspect this capability would be a useful one in at
 least some cases.

 
 Danny I don't see the benefit of doing this with two separate
 Danny instances. It's easier and simpler to just add the other servers into
 Danny the one instance and specify noselect.
 
 Again you are missing my point.  Allowing this would let us, for example,
 see how two different versions of ntpd would discipline the clock.  It would
 allow us to see how ntpd might discipline the clock compared to chrony.
 
 I understand and get that by not actually disciplining the clock we are
 removing an important part of the feedback loop, and I do not know if that
 will fatally affect these sort of experiments or not.
 
 And as Bill said, it would be Swell if there was a way to do this using, eg,
 virtual machines so that we could test them that way.  Better yet, it would
 be nice to have a simulator framework where we could run these tests faster
 than in real-time.

You do realize that there are timers built into the code so in order to 
run faster you'd need to figure out how to change the timers to work 
that way?

As I said, it is not easy, particularly because the clock is used as a
scheduler. If the only problem were gettimeofday and adjtimex (to use
the Linux expression) then you could simply replace them by having them
interface with a clock simulator. However there are the schedulers (timers) and
timeout functions which are harder to make work in a simulator.
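
To make the idea concrete, here is a minimal Python sketch of such a clock simulator. Everything in it is hypothetical (the class and method names are mine, and it is not the ntpdsim interface): it only shows a software clock whose reading drifts at a settable rate, plus a sleep that advances simulated time at once instead of blocking, which is the scheduler part that is hard to retrofit into the real daemons.

```python
class SimClock:
    """Toy software clock: true time distorted by a frequency error.

    All names here are hypothetical illustrations.
    """

    def __init__(self, freq_error_ppm: float = 0.0, offset_s: float = 0.0):
        self.true_time = 0.0                  # simulated "real" time, seconds
        self.reading = offset_s               # what gettimeofday() would return
        self.freq_error_ppm = freq_error_ppm  # uncorrected oscillator error
        self.freq_adj_ppm = 0.0               # correction set by the discipline

    def gettimeofday(self) -> float:
        return self.reading

    def adjust_frequency(self, ppm: float) -> None:
        """Stand-in for the frequency part of an adjtimex-style call."""
        self.freq_adj_ppm = ppm

    def sleep(self, seconds: float) -> None:
        """Advance simulated time instantly instead of blocking."""
        rate = 1.0 + (self.freq_error_ppm + self.freq_adj_ppm) * 1e-6
        self.true_time += seconds
        self.reading += seconds * rate

    def offset(self) -> float:
        return self.reading - self.true_time


clock = SimClock(freq_error_ppm=25.0)
clock.sleep(1024)                 # one poll-10 interval passes at once
drift = clock.offset()            # about 25.6 ms gained
clock.adjust_frequency(-25.0)     # cancel the oscillator error
clock.sleep(1024)                 # offset no longer grows
print(drift, clock.offset())
```

A discipline algorithm under test would call gettimeofday, adjust_frequency and sleep on an object like this, so thousands of simulated seconds cost almost no wall-clock time.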






Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-23 Thread David J Taylor

chrony falls at the first hurdle for me - there appears to be no native 
Windows implementation.

David 



Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread Petri Kaukasoina

Unruh  [EMAIL PROTECTED] wrote:
After I collect more data on steady state, I will rerun startups both with
no drift file and a bad drift file to see how fast the convergence is with
4.2.4.

Hi,

On recent Linux kernels, I think the drift file is always bad after reboot.
HZ=100, no dynamic ticks aka tickless system (CONFIG_NO_HZ not set). I think
I even tried with a kernel command line option lpj= but it didn't help.
If the system is rebooted, ntpd stabilizes to a new different drift value.

With a bad or missing drift file, time set with ntpdate, ntpd can soon take
the offset to a 100 or 200 ms error for a long time.

If you are using Linux and are experimenting with these, please try
something like this which has given me good results (a coarse calibration of
the drift file during boot before starting ntpd):

#!/bin/sh
DRIFTFILE=/etc/ntp/drift
NTPSERVER=ip.address.of.a.good.nearby.ntp.server
TIME=100
# remember to stop ntpd first if running
# reset frequency offset to zero
adjtimex -f 0
# calibrate clock rate during $TIME seconds
ntpdate -sb $NTPSERVER
sleep $TIME
ADJUST=$(ntpdate -b $NTPSERVER | sed 's/.*offset \(.*\) sec.*/\1/')
# ntpdate adjusted $ADJUST seconds; convert seconds per $TIME seconds to PPM
FREQUENCY=$(echo "scale=3; $ADJUST * 1000000 / $TIME" | bc)
# reset the drift file and start ntpd
echo "$FREQUENCY" > "$DRIFTFILE"
/etc/rc.d/rc.ntpd start
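
The conversion in the script is offset in seconds divided by the calibration interval, scaled to parts per million. A hedged Python sketch of the same step follows; the sample ntpdate line is invented for illustration, and the regular expression only assumes the "offset ... sec" wording that the sed expression above relies on.

```python
import re


def drift_ppm(ntpdate_line: str, interval_s: float) -> float:
    """Convert the offset ntpdate reports after interval_s seconds of
    free-running into a frequency error in PPM."""
    match = re.search(r"offset (-?\d+(?:\.\d+)?) sec", ntpdate_line)
    if match is None:
        raise ValueError("no offset found in ntpdate output")
    return float(match.group(1)) / interval_s * 1e6


# Hypothetical output line, for illustration only
line = ("22 Jan 12:00:00 ntpdate[1234]: step time server 192.0.2.1 "
        "offset 0.002532 sec")
print(round(drift_ppm(line, 100.0), 3))
```

A longer calibration interval reduces the effect of network jitter on the estimate, at the cost of a slower boot.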


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread David L. Mills

Unruh,

It may help to review the material on Allan deviation and noise 
modelling in the briefings on the NTP project page. If you are down in 
the low microsecond range with poll intervals much over 64 s, expect to 
see a frequency sway due to small temperature variations less than one 
degree C. This should appear random-walk in nature and not periodic. Be
fortunate about the temperature dependency; it makes a very good fire
detector and fan failure alarm.

Dave

Unruh wrote:
 [EMAIL PROTECTED] (David Woolley) writes:
 
 
In article [EMAIL PROTECTED],
Bill Unruh [EMAIL PROTECTED] wrote:
 
 
Offset error:
   NTP: Mean=-3.1usec, Std Dev=63.1usec
 
 
If offset is the value reported by ntpq, please note that, when ntpd
is locked up, this is an indication of the instantaneous measurement
error, the actual error in the local time should be more stable (there may
be systematic error) by one or two orders of magnitude.
 
 
 No, the offset is the value reported in loopstats.
 
 
 
More generally though, Dave Mills really needs to get in here and defend
his clock discipline algorithm, and the Chrony developer needs to 
defend theirs.  Arguing the cases by proxy isn't particularly satisfactory.
 
 
 This is not arguing by proxy, this is running experiments. As I know, since
 I am a physicist, experiment trumps theory always. 
 
 
 
 
Dave, please remember that what tends to concern people about the algorithm
is not the behaviour in response to gaussian phase noise, but its behaviour
in response to transients, in particular startup transients.  (Personally
I would say that lost clock tick transients should be fixed at source,
but Bill Unruh would also like it to tolerate those well.)
 
 
  Chrony: Mean=-1.5 usec, Std Dev=20.1usec
 
 
Given the way that I understand it works, I think this is the actual
correction applied on that sample.
 
 
 No, this is offset as measured by the ntp procedure ( (t1+t4-t2-t3)/2 )
 
 
 
 
 
Rate fluctuation:
  NTP:Mean=25.32  Std Dev=.078 (PPM) 
  Chrony: Mean=25.26 Std Dev=.091 (PPM)
 
 
 Running it for a longer time, the standard deviation of the rate for ntp
 has dropped to about .020PPM, which is much better than chrony's. 
 
 
 
 
The means depend on the hardware, and, as long as they are within the order
of one standard deviation of each other, they are as good as each other.
 
 
 Yes, I agree that the mean rates are the same. It is the standard deviation
 that is important here. I.e., ntp seems to be much better (smaller
 fluctuation) than chrony here, at the expense of much worse offset control
 (which makes sense if the rate fluctuations are real-- i.e., I can make
 chrony's rate fluctuations much smaller by running an average of the rates
 over a couple of hours, but that will make the offset deviation increase). I
 guess it depends on which you consider more important, an accurate rate or an
 accurate clock.
 
 
 
From the point of view of another machine, chrony will have episodes where
 
the frequency changes much more, as it applies the phase correction.
 
 
 ??? These are done on the same machine. If you mean that when the real drift
 rate of the computer changes, chrony's rate will change, then I would
 hope that that happens. Remember that this is not comparing two different
 machines, but the same machine at two different times.
 
 And yes, the physical events could have changed between the two. It would
 be nice if one could do a simulation-- put them both on some virtual
 machine and feed in exactly the same real clock drift changes, and use some
 model of the noise ( measurement, transmission, etc) so one could provide
 the two algorithms with exactly the same data to work with. But neither
 chrony nor ntp are set up for that. 
 
 
 
 
over the weekend, and chrony encompassed the weekdays when the grad
students use the computer) the offset control by chrony was a factor of 3
better than by ntp.  
 
 
If the figures are the actual correction for chrony and the sample error for
ntpd and Dave Mills is correct about the phase noise rejection of the ntpd 
filter being a couple of orders of magnitude, ntpd might actually be 30 
times better.
 
 
 Nope, the figures are the actual samples as measured by chrony, and the
 processed output from ntp as reported in loopstats-- whatever that figure
 is. I.e., if processing makes a difference then the advantage lies with ntp.
 But I just checked using the offset reported in peerstats (choosing only the
 packets from the one local server) and
 I get the same result as from loopstats.
 I.e., both the results for ntp and for chrony are the raw offsets,
 and chrony's are about 2.2 times better than ntp's.
 
 So, chrony, at least in this one test, controls the offsets of the clock
 much better, at the expense of worse consistency in the frequency. It also
 reacts
 much faster to gross changes in the time (e.g. startup with no drift file).
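
For anyone wanting to repeat the comparison, the mean and standard deviation of the offsets can be pulled out of a loopstats file with a few lines of Python. This is only a sketch: it assumes the usual loopstats field order (MJD, seconds past midnight, offset in seconds, frequency in PPM, jitter, wander, time constant), and the four records are made up for the example, not measured data.

```python
import statistics


def offsets_usec(loopstats_lines):
    """Extract the clock offset (third field, seconds) in microseconds."""
    values = []
    for line in loopstats_lines:
        fields = line.split()
        if len(fields) >= 3:
            values.append(float(fields[2]) * 1e6)
    return values


# Made-up records in the assumed loopstats layout:
# MJD, seconds, offset (s), frequency (PPM), jitter, wander, time constant
sample = [
    "54487 100.0 0.000010 25.32 0.000004 0.01 7",
    "54487 228.0 -0.000020 25.31 0.000004 0.01 7",
    "54487 356.0 0.000030 25.33 0.000004 0.01 7",
    "54487 484.0 -0.000040 25.30 0.000004 0.01 7",
]
usec = offsets_usec(sample)
print(statistics.mean(usec), statistics.pstdev(usec))
```

The same function applied to column 4 instead of column 3 would give the rate statistics.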
 
 
 
 
 
 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread David L. Mills

Guys,

I haven't read every word on this thread, but all I can contribute is 
that nothing reported here is anything like my experience here. Our 
servers pogo.udel.edu and rackety.udel.edu are synchronized via GPS and 
PPS. I invite the skeptics to peek at them from time to time. I describe 
their behavior as like cats; most of the time they are quiet and gentle 
at a few microseconds, but once in a while they show a surge of ten
microseconds or more, especially after a power failure, which we do get 
from time to time.

There is a persistent report that appears as a low-frequency ringing 
with more or less constant period. This would seem to suggest something 
wrong with the discipline loop transient response. In the past the most 
likely cause has been an ill-advised tinker with the Unix adjtime() 
system call with the dubious purpose of reducing the time to slew the 
clock over some range. This wrecks the transient response and easily 
leads to loop instability. If you are using the kernel time discipline 
and not adjtime() this is not an issue.

There are lots of ways to measure the loop transient response. The 
easiest way is to set the clock some 50-100 ms off from some stable 
source (not necessarily accurate) and watch the loop converge. The 
response should cross zero in about 3000 s and overshoot about 6 percent 
and smoothly amortize over several hours. Be sure to clamp the poll
interval to 64 s over that period. If it does something else, like show
an exponentially decreasing ringing, go looking for trouble.
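
The shape of such a transient can be explored offline with a generic second-order (proportional plus integral) loop. The Python sketch below is not ntpd's discipline algorithm, and the loop constants are arbitrary assumptions, so it will not reproduce the 3000 s and 6 percent figures; it only shows how one would measure the zero crossing and overshoot from a recorded or simulated offset trace.

```python
def step_response(x0, omega, zeta, dt, steps):
    """Offset history of a generic second-order (PI) loop after a time step.

    x is the clock offset, z the accumulated frequency correction.
    """
    kp = 2.0 * zeta * omega      # proportional (phase) gain
    ki = omega * omega           # integral (frequency) gain
    x, z = x0, 0.0
    history = [x]
    for _ in range(steps):
        dx = (-kp * x - z) * dt
        dz = ki * x * dt
        x, z = x + dx, z + dz
        history.append(x)
    return history


# Assumed values: 64 s updates, 100 ms initial offset, arbitrary loop constants
trace = step_response(x0=0.100, omega=4e-4, zeta=0.707, dt=64.0, steps=2000)
first_zero = next(i for i, v in enumerate(trace) if v <= 0.0) * 64.0
overshoot = -min(trace) / trace[0]
print(first_zero, overshoot, trace[-1])
```

A well-damped loop shows one zero crossing, a single undershoot and a smooth decay; repeated ringing in the trace is the sign of trouble described above.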

As for "offset should be much larger than the error", be careful here.
By error I assume you mean what ntpq rv shows as jitter. The best case
is when offset is indeed less than jitter; if the offset is much larger
than the error, this suggests the frequency has surged and the time
constant/poll interval needs to be reduced. Watch the poll interval
behavior in the loopstats data.

Dave

David Woolley wrote:

 In article [EMAIL PROTECTED],
 Unruh [EMAIL PROTECTED] wrote:
 
 
No, the offset is the value reported in loopstats.
 
 
 Same thing.  If chrony is reporting the same measurements, neither set of
 measurements is particularly valid.  You need to measure the actual
 offsets, using something that has a repeatability a couple of orders
 of magnitude better.  Certainly for ntpd, offset should be much larger
 than the error, when locked.  Is the server running ntpd?
 
 Anyway, as I said, arguing by proxy is difficult and I'm rather hoping that
 Dave Mills will take over.  Certainly it is Dave Mills you have to 
 convince if ntpd is going to change.
 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread David L. Mills

Unruh,

Please read the specification. The offset statistic is the 
maximum-likelihood estimate of the remote clock offset relative to the 
local clock and the sign really does matter. The best way to describe 
this and keep the sign straight is to assume the signed offset is the 
quantity in seconds to add to the local clock in order to maintain the 
same time as the remote clock.

The variance statistic, which is represented as an exponentially 
weighted RMS average called jitter, is the expected error when computing 
the offset statistic. Generally speaking, as long as the jitter is 
unbiased, it does not materially affect the clock accuracy due to the 
extreme lowpass characteristic of the discipline. You should be watching
the offset statistic, not the jitter statistic.

Dave

Unruh wrote:

 [EMAIL PROTECTED] (Danny Mayer) writes:
 
 
Unruh wrote:

correction applied on that sample.

No, this is offset as measured by the ntp procedure ( (t1+t4-t2-t3)/2 )

 
 
No, that's wrong. It is very carefully described in the NTPv4 draft 
section 8 (p27):
 
 
theta = T(B) - T(A) = 1/2 * [(T2-T1) + (T3-T4)]
 
 
Not only do you have the wrong sign, the differences must be calculated 
first, otherwise the errors in the calculation overwhelm the resulting 
value. That's why it's written the way it is.
 
 
 I was not writing code. I was telling you what time difference I was
 referring to. And
 sign is a convention as to whether you are saying positive means the computer
 is fast or the external source is fast. Everything I talked about is sign
 independent (standard deviation uses squares), and the difference is that
 as reported by ntp or chrony, and
 both are careful to do the calculations with as high an accuracy as
 possible.
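
For readers following the sign argument, the offset formula quoted from the draft can be written out as below. This is an illustrative Python sketch, not code from ntpd or chrony; the timestamps in the example are invented so that the server is 5 ms ahead of the client.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """On-wire offset and round-trip delay from the four NTP timestamps.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive (all in seconds).
    A positive offset means the server clock is ahead of the client.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # differences first, then average
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Hypothetical exchange: server 5 ms ahead, 2 ms each way, 1 ms server hold
offset, delay = ntp_offset_delay(t1=100.000, t2=100.007,
                                 t3=100.008, t4=100.005)
print(offset, delay)
```

Swapping the roles of the two clocks flips the sign of the offset but leaves its magnitude, and therefore any standard deviation computed from it, unchanged.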
 
 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread David L. Mills

Unruh,

The basic clock discipline feedback loop has been unchanged since 1992, 
although minor changes have been made to improve behavior in very long 
poll intervals. The only radical change has been using a preliminary 
15-minute initial frequency computation when no frequency file is 
available. So, if you are comparing ntpd and chrony at initial startup 
and without a frequency file, expect to find wide differences in 
behavior. Starting ntpd with an intentionally bad frequency file is not
useful unless you can configure chrony in the same way.

If you really do want a definitive experiment, do what I suggested 
earlier: measure the transient response of both ntpd and chrony starting 
from the SAME initial conditions and with a frequency file containing 
zero PPM. Pay attention to the poll interval, which should be the same in
both cases. That will tell you the story, the whole story and nothing 
but the truth.

Dave

Unruh wrote:

 [EMAIL PROTECTED] (Danny Mayer) writes:
 
 
Unruh wrote:

All I say is that the experiments I have carried out show that ntp is slow
to converge if it starts off badly, and leaves the offset scatter larger
than chrony does. It does have a smaller scatter in the rate.

 
 
But you are using an extremely old version of ntp and things have 
radically changed since that version was released. Try rerunning your
experiments with ntp 4.2.4 and see what you get then. You also need to 
fix your calculations if you are going to get good results as I 
mentioned in a previous message.
 
 
 Most of the standard deviation results are with 4.2.4. Only the startup was
 with 4.2.0. Are you saying that things have radically changed in the
 handling of the startup? After I collect more data on steady state, I will
 rerun startups both with no drift file and a bad drift file to see how fast
 the convergence is with 4.2.4.
 
 
 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread David L. Mills

Unruh,

As you can see from the Allan deviation plots in the briefings on the 
NTP project page, it does no good whatsoever to average over ten hours; 
the observations are completely uncorrelated over that lag. The Allan 
intercept, or best averaging time, is more like twenty minutes to two 
hours, depending on the whims of fate and the cut of the rock.
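
For those who want to compute an Allan deviation from their own offset logs, a minimal Python sketch of the overlapping estimator follows. It assumes evenly spaced phase (offset) samples, which real loopstats or peerstats data may not be, and the test series are synthetic.

```python
import math


def allan_deviation(phase, tau0, m):
    """Overlapping Allan deviation at averaging time m * tau0.

    phase: clock offsets in seconds, sampled every tau0 seconds.
    """
    n = len(phase) - 2 * m
    if n < 1:
        raise ValueError("not enough samples for this averaging time")
    total = 0.0
    for i in range(n):
        second_diff = phase[i + 2 * m] - 2.0 * phase[i + m] + phase[i]
        total += second_diff ** 2
    tau = m * tau0
    return math.sqrt(total / (2.0 * n * tau * tau))


# Synthetic check: a constant frequency offset gives zero Allan deviation,
# while a pure linear frequency drift D gives D * tau / sqrt(2).
tau0, drift = 64.0, 1e-10
ramp = [25e-6 * k * tau0 for k in range(200)]
aging = [0.5 * drift * (k * tau0) ** 2 for k in range(200)]
print(allan_deviation(ramp, tau0, 4), allan_deviation(aging, tau0, 4))
```

Plotting the result against tau on log-log axes and looking for the minimum gives an estimate of the best averaging time for a particular machine.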

It is a matter of physics that the NTP offsets will truly average out to 
zero in the long term unless there is an intrinsic bias in the timestamp 
calculations and, believe me, these calculations are very carefully done 
to reduce residual bias to essentially zero. In principle, assuming a 
precision source is available, this puppy can hold better than one 
picosecond. If you see a persistent nonzero offset, there very likely is 
an oscillator frequency problem or the adjtime() or equivalent system 
call has an inherent bias. In any case, even if it does, and unless 
there is a huge frequency error in the order of several hundred PPM, the 
long term average offset will be very close to zero.

In principle, comparing ntpd and chrony or any other vehicle is 
meaningful only if using the same hardware, operating system and poll
interval. I assume chrony has done its homework and optimized the time 
constant for the given poll interval. I am on a limb here, because 
nobody has confirmed that chrony does in fact discipline the clock using 
  some sort of feedback loop sensitive to both time and frequency 
offset. If in fact it does not, then why are we having this discussion?

Dave

Unruh wrote:

 [EMAIL PROTECTED] (David Woolley) writes:
 
 
In article [EMAIL PROTECTED],
Unruh [EMAIL PROTECTED] wrote:
 
 
No, the offset is the value reported in loopstats.
 
 
Same thing.  If chrony is reporting the same measurements, neither set of
measurements is particularly valid.  You need to measure the actual
 
 
 I am sorry but I do not understand what you are saying. The best estimate
 of the time error of the clock is the measurement that you make of that
 error. Now, you might argue that if the drift never changed, and the clock
 never changed then one could get a better estimate by averaging the
 measurements. But the hypothesis that the time reported by the ntp process
 is the true time plus random uncorrelated errors is simply wrong, as
 looking at the plot of the offsets will rapidly convince you. The offsets
 oscillate with a period on the order of an hour or so. Chrony does this.
 ntp does this. The errors are NOT gaussian uncorrelated random errors. 
 
 Thus most of that error budget is in such correlated errors, and ntp does
 NOT do many orders of magnitude better ( even with uncorrelated random
 errors you would need to average 100 samples-- collected over 10 hours at
 poll 7 to get one order of magnitude, and by then the drift errors would have
 gotten you.)
 
 Anyway, I am comparing like with like in the two programs. Chrony is much
 better in offsets, which implies that at least half of the error in ntp
 is internal error. Ie, it is errors which do NOT average out. ( and I do
 not believe that chrony's errors are the minimal uncorrelated random errors
 either.)
 
 
offsets, using something that has a repeatability a couple of orders
 
 
 Of course that is the best way. Unfortunately I do not have that. I might
 extend the line running the main server to also give me the true offsets
 for the machine. However one can also get an estimate of the errors by
 looking at the measured offsets using the ntp exchange. 
 
 
of magnitude better.  Certainly for ntpd, offset should be much larger
than the error, when locked.  Is the server running ntpd?
 
 
 I do not believe this. 
 Yes, the server is running ntp and its offset errors are of the order of 
 3usec-- again correlated as you can see from the string graph near the 
 bottom of the page.
 
 For example, if I take a 10 element running average and subtract it from
 the raw output of ntp for the server , the standard deviation goes from
 3usec to .5usec. Ie, the errors are highly correlated. Averaging may be
 able to  get
 rid of that .5usec, but not the rest of the standard deviation (3usec)
 which is some sort of highly correlated noise. IF the errors were really
 uncorrelated random errors then subtracting off the running average would
 make no (well, little) difference to the standard deviation (it would
 decrease it by something like sqrt((N-1)/N), where N is the length of the
 running average).
 
 Ie, it is simply not true that the measured offsets  reported by ntp, or 
 chrony,
 are simply some independent gaussian random process around the true time. 
 
 
 
 
Anyway, as I said, arguing by proxy is difficult and I'm rather hoping that
Dave Mills will take over.  Certainly it is Dave Mills you have to 
convince if ntpd is going to change.
 
 
 I do not know if I am trying to convince. I am trying to report the
 outcomes of some experiments. Now if one (Mills) wants ntp to behave
 differently than the

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread David L. Mills

Petri,

I knew Linux was broken, but what you report suggests it is broken 
beyond my wildest imagination. First, I do know Linux supports the 
precision time kernel, as it has the ntp_adjtime() system call, even if 
it is buried in a wrapper. If so, and assuming that syscall is 
implemented correctly, it doesn't matter whether ticks are significant 
or not.

Second, Linux has completely broken the initial frequency computation 
and intended semantics of the frequency file. The really serious and sad 
issue here is that Linux has added much unnecessary baggage that 
disables or distorts the carefully engineered design principles. It 
would be much, much better to rip out all that baggage and use only what 
comes with the bare ntpd. Certainly, at least Solaris and FreeBSD have 
no such baggage.

I know that at least one time in the past ntpd ran just fine on Linux, 
at least the ntpd version that leaves here, not the one that comes with 
Linux. If this is still the case, your course is clear.

Dave

Petri Kaukasoina wrote:

 Unruh  [EMAIL PROTECTED] wrote:
 
After I collect more data on steady state, I will rerun startups both with
no drift file and a bad drift file to see how fast the convergence is with
4.2.4.
 
 
 Hi,
 
 On recent Linux kernels, I think the drift file is always bad after reboot.
 HZ=100, no dynamic ticks aka tickless system (CONFIG_NO_HZ not set). I think
 I even tried with a kernel command line option lpj= but it didn't help.
 If the system is rebooted, ntpd stabilizes to a new different drift value.
 
 With a bad or missing drift file, time set with ntpdate, ntpd can soon take
 the offset to a 100 or 200 ms error for a long time.
 
 If you are using Linux and are experimenting with these, please try
 something like this which has given me good results (a coarse calibration of
 the drift file during boot before starting ntpd):
 
 #!/bin/sh
 DRIFTFILE=/etc/ntp/drift
 NTPSERVER=ip.address.of.a.good.nearby.ntp.server
 TIME=100
 # remember to stop ntpd first if running
 # reset frequency offset to zero
 adjtimex -f 0
 # calibrate clock rate during $TIME seconds
 ntpdate -sb "$NTPSERVER"
 sleep "$TIME"
 ADJUST=$(ntpdate -b "$NTPSERVER" | sed 's/.*offset \(.*\) sec.*/\1/')
 # ntpdate adjusted $ADJUST seconds; the drift file holds PPM,
 # i.e. seconds of error per second of elapsed time, times 1000000
 FREQUENCY=$(echo "scale=3; $ADJUST * 1000000 / $TIME" | bc)
 # reset the drift file and start ntpd
 echo "$FREQUENCY" > "$DRIFTFILE"
 /etc/rc.d/rc.ntpd start
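
The calibration step in the script above comes down to one piece of arithmetic: a clock that picks up some offset over a known interval has a frequency error of offset/interval, and the drift file expects that value in parts per million. A minimal Python sketch of the conversion, assuming the offset is the one reported by the second ntpdate run and the interval is the sleep time. The function name drift_ppm is made up for illustration and is not part of ntpd or ntpdate.

```python
def drift_ppm(offset_seconds, interval_seconds):
    """Frequency error in PPM implied by an offset that built up over an interval."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    # seconds of error per second of elapsed time, scaled to parts per million
    return offset_seconds / interval_seconds * 1_000_000


# Example: 2.5 ms accumulated over a 100 s calibration run.
example_ppm = drift_ppm(0.0025, 100)
print(f"{example_ppm:.3f} PPM")
```

With 2.5 ms over 100 s this comes out at roughly 25 PPM, the same order as the rates reported elsewhere in this thread. A longer calibration interval reduces the effect of network jitter on the estimate, at the cost of a slower boot.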


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread Unruh

David L. Mills [EMAIL PROTECTED] writes:

Unruh,

It may help to review the material on Allan deviation and noise 
modelling in the briefings on the NTP project page. If you are down in 
the low microsecond range with poll intervals much over 64 s, expect to 
see a frequency sway due to small temperature variations less than one 
degree C. This should appear random-walk in nature and not periodic. Be 
fortunate about the temperature dependency; it makes a very good fire 
detector and fan failure alarm.

Sure, I understand that. I am not worried about the fluctuations in the
frequency, especially if they track real fluctuations in the drift rate of
the clock. I am worried about the offsets, since they indicate that the
system is NOT following the real drift rate of the clock. Especially when
the fluctuations are highly correlated ( ie are not just random noise).

The much better behaviour of chrony on offsets suggests that ntp is NOT
following the drift rate of the clock. Especially as the scatter on chrony
is a) much more random and b) I suspect is tied to the much worse behaviour
of chrony in the round trip time department. Ie, chrony is doing much
better (in controlling offsets)  even though it is suffering much worse noise 
than is ntp.



Dave

Unruh wrote:
 [EMAIL PROTECTED] (David Woolley) writes:
 
 
In article [EMAIL PROTECTED],
Bill Unruh [EMAIL PROTECTED] wrote:
 
 
Offset error:
   NTP: Mean=-3.1usec, Std Dev=63.1usec
 
 
If offset is the value reported by ntpq, please note that, when ntpd
is locked up, this is an indication of the instantaneous measurement
error, the actual error in the local time should be more stable (there may
be systematic error) by one or two orders of magnitude.
 
 
 No, the offset is the value reported in loopstats.
 
 
 
More generally though, Dave Mills really needs to get in here and defend
his clock discipline algorithm, and the Chrony developer needs to 
defend theirs.  Arguing the cases by proxy isn't particularly satisfactory.
 
 
 This is not arguing by proxy, this is running experiments. As I know, since
 I am a physicist, experiment trumps theory always. 
 
 
 
 
Dave, please remember that what tends to concern people about the algorithm
is not the behaviour in response to gaussian phase noise, but its behaviour
in response to transients, in particular startup transients.  (Personally
I would say that lost clock tick transients should be fixed at source,
but Bill Unruh would also like it to tolerate those well.)
 
 
  Chrony: Mean=-1.5 usec, Std Dev=20.1usec
 
 
Given the way that I understand it works, I think this is the actual
correction applied on that sample.
 
 
 No, this is offset as measured by the ntp procedure ( (t1+t4-t2-t3)/2 )
 
 
 
 
 
Rate fluctuation:
  NTP:Mean=25.32  Std Dev=.078 (PPM) 
  Chrony: Mean=25.26 Std Dev=.091 (PPM)
 
 
 Running it for a longer time, the standard deviation of the rate for ntp
 has dropped to about .020PPM, which is much better than chrony's. 
 
 
 
 
The means depend on the hardware, and, as long as they are within the order
of one standard deviation of each other, they are as good as each other.
 
 
 Yes, I agree that the mean rates are the same. It is the standard deviation
 that is important here. Ie, ntp seems to be much better (smaller
 fluctuation) than chrony here, at the expense of much worse offset control
 (which makes sense if the rate fluctuations are real). Ie, I can make
 chrony's rate fluctuations much smaller by taking a running average of the
 rates over a couple of hours, but that will make the offset deviation
 increase. I guess it depends on which you consider more important: an
 accurate rate, or an accurate clock.
 
 
 
From the point of view of another machine, chrony will have episodes where
 
the frequency changes much more, as it applies the phase correction.
 
 
 ??? These are done on the same machine. If you mean that the real drift
 rate of the computer changes, then chrony's rate will change, then I would
 hope that that happens. Remember that this is not comparing two different
 machines, but the same machine at two different times. 
 
 And yes, the physical events could have changed between the two. It would
 be nice if one could do a simulation-- put them both on some virtual
 machine and feed in exactly the same real clock drift changes, and use some
 model of the noise ( measurement, transmission, etc) so one could provide
 the two algorithms with exactly the same data to work with. But neither
 chrony nor ntp are set up for that. 
 
 
 
 
over the weekend, and chrony encompassed the weekdays when the grad
students use the computer) the offset control by chrony was a factor of 3
better than by ntp.  
 
 
If the figures are the actual correction for chrony and the sample error for
ntpd and Dave Mills is correct about the phase noise rejection of the ntpd 
filter being a couple of orders of magnitude, ntpd might actually be 30 
times better.
 
 
 Nope, the figures are the actual samples

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread Unruh

David L. Mills [EMAIL PROTECTED] writes:


In principle, comparing ntpd and chrony or any other vehicle is 
meaningful only if using the same hardware, operating system and poll 
interval. I assume chrony has done its homework and optimized the time 
constant for the given poll interval. I am on a limb here, because 
nobody has confirmed that chrony does in fact discipline the clock using 
  some sort of feedback loop sensitive to both time and frequency 
offset. If in fact it does not, then why are we having this discussion?

It does. And the comparison is on exactly the same machine, running exactly
the same operating system. The ONLY change is that chrony was replaced by
ntp on that system. Now the date is not the same. Unfortunately I cannot
run both ntp and chrony on the same system at the same time.




Dave

Unruh wrote:

 [EMAIL PROTECTED] (David Woolley) writes:
 
 
In article [EMAIL PROTECTED],
Unruh [EMAIL PROTECTED] wrote:
 
 
No, the offset is the value reported in loopstats.
 
 
Same thing.  If chrony is reporting the same measurements, neither set of
measurements is particularly valid.  You need to measure the actual
 
 
 I am sorry but I do not understand what you are saying. The best estimate
 of the time error of the clock is the measurement that you make of that
 error. Now, you might argue that if the drift never changed, and the clock
 never changed then one could get a better estimate by averaging the
 measurements. But the hypothesis that the time reported by the ntp process
 is the true time plus random uncorrelated errors is simply wrong, as
 looking at the plot of the offsets will rapidly convince you. The offsets
 oscillate with a period on the order of an hour or so. Chrony does this.
 ntp does this. The errors are NOT gaussian uncorrelated random errors. 
 
 Thus most of that error budget is in such correlated errors, and ntp does
 NOT do many orders of magnitude better ( even with uncorrelated random
 errors you would need to average 100 samples-- collected over 10 hours at
 poll 7 to get one order of magnitude, and by then the drift errors would have
 gotten you.)
 
 Anyway, I am comparing like with like in the two programs. Chrony is much
 better in offsets, which implies that at least half of the error in ntp
 is internal error. Ie, it is errors which do NOT average out. ( and I do
 not believe that chrony's errors are the minimal uncorrelated random errors
 either.)
 
 
offsets, using something that has a repeatability a couple of orders
 
 
 Of course that is the best way. Unfortunately I do not have that. I might
 extend the line running the main server to also give me the true offsets
 for the machine. However one can also get an estimate of the errors by
 looking at the measured offsets using the ntp exchange. 
 
 
of magnitude better.  Certainly for ntpd, offset should be much larger
than the error, when locked.  Is the server running ntpd?
 
 
 I do not believe this. 
 Yes, the server is running ntp and its offset errors are of the order of 
 3usec-- again correlated as you can see from the string graph near the 
 bottom of the page.
 
 For example, if I take a 10 element running average and subtract it from
 the raw output of ntp for the server , the standard deviation goes from
 3usec to .5usec. Ie, the errors are highly correlated. Averaging may be
 able to  get
 rid of that .5usec, but not the rest of the standard deviation (3usec)
 which is some sort of highly correlated noise. IF the errors were really
 uncorrelated random errors then subtracting off the running average would
 make no (well, little) difference to the standard deviation (it would
 decrease it by something like sqrt((N-1)/N), where N is the length of the
 running average).
 
 Ie, it is simply not true that the measured offsets  reported by ntp, or 
 chrony,
 are simply some independent gaussian random process around the true time. 
 
 
 
 
Anyway, as I said, arguing by proxy is difficult and I'm rather hoping that
Dave Mills will take over.  Certainly it is Dave Mills you have to 
convince if ntpd is going to change.
 
 
 I do not know if I am trying to convince. I am trying to report the
 outcomes of some experiments. Now if one (Mills) wants ntp to behave
 differently than the experiments show it does, then I guess he will change
 it. If not, then not. 
 
 All I say is that the experiments I have carried out show that ntp is slow
 to converge if it starts off badly, and leaves the offset scatter larger
 than chrony does. It does have a smaller scatter in the rate. 
 
 One of the great advantages of two different people-- Mills and Curnow--
 trying to implement the same ideas in different ways is that one can learn
 by studying the difference between their results. 
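
The running-average test described above can be tried on synthetic data. The sketch below is only an illustration under stated assumptions: the "correlated" series is a slow sinusoid plus a little white noise, chosen arbitrarily to mimic an hour-scale oscillation, and none of the numbers come from the measurements in this thread. It subtracts an N-point running mean from each sample and compares the standard deviation before and after. For uncorrelated noise the ratio should sit near sqrt((N-1)/N); for slowly varying errors it drops well below that.

```python
import math
import random
import statistics


def residual_std_ratio(samples, window):
    """Std dev of (sample - running mean) divided by std dev of the raw samples.

    The running mean covers `window` consecutive samples and includes the
    sample it is subtracted from.
    """
    residuals = []
    for i in range(len(samples) - window + 1):
        block = samples[i:i + window]
        centre = block[window // 2]
        residuals.append(centre - sum(block) / window)
    return statistics.pstdev(residuals) / statistics.pstdev(samples)


rng = random.Random(1)  # fixed seed so the run is repeatable
n, window = 20000, 10

# Uncorrelated gaussian errors, 3 usec standard deviation (arbitrary units).
white = [rng.gauss(0.0, 3.0) for _ in range(n)]

# Assumed correlated errors: slow oscillation (period 200 samples) plus
# a small uncorrelated part of 0.5 usec.
correlated = [3.0 * math.sin(2 * math.pi * i / 200) + rng.gauss(0.0, 0.5)
              for i in range(n)]

white_ratio = residual_std_ratio(white, window)
corr_ratio = residual_std_ratio(correlated, window)
expected_white = math.sqrt((window - 1) / window)

print(f"white noise ratio {white_ratio:.3f}, theory {expected_white:.3f}")
print(f"correlated ratio  {corr_ratio:.3f}")
```

A ratio close to the theoretical value says averaging can still help; a ratio far below it says most of the scatter is slowly varying and will not average away over a short window.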
 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread Unruh

David L. Mills [EMAIL PROTECTED] writes:

Unruh,

Please read the specification. The offset statistic is the 
maximum-likelihood estimate of the remote clock offset relative to the 
local clock and the sign really does matter. The best way to describe 
this and keep the sign straight is to assume the signed offset is the 
quantity in seconds to add to the local clock in order to maintain the 
same time as the remote clock.

Of course the sign matters if you are trying to correct things. Sheesh. 
The sign does NOT matter to the standard deviation. That was ALL I was
saying. 



The variance statistic, which is represented as an exponentially 
weighted RMS average called jitter, is the expected error when computing 
the offset statistic. Generally speaking, as long as the jitter is 
unbiased, it does not materially affect the clock accuracy due to the 
extreme lowpass characteristic of the discipline. You should be watching 
the offset statistic, not the jitter statistic.

??? I am watching the offset. I am taking the offset as measured by the ntp
negotiation and calculating the rms deviation of that. 



Dave

Unruh wrote:

 [EMAIL PROTECTED] (Danny Mayer) writes:
 
 
Unruh wrote:

correction applied on that sample.

No, this is offset as measured by the ntp procedure ( (t1+t4-t2-t3)/2 )

 
 
No, that's wrong. It is very carefully described in the NTPv4 draft 
section 8 (p27):
 
 
theta = T(B) - T(A) = 1/2 * [(T2-T1) + (T3-T4)]
 
 
Not only do you have the wrong sign, the differences must be calculated 
first, otherwise the errors in the calculation overwhelm the resulting 
value. That's why it's written the way it is.
 
 
 I was not writing code. I was telling you which time difference I was
 referring to. And sign is a convention as to whether you are saying positive
 means the computer is fast or the external source is fast. Everything I
 talked about is sign independent (standard deviation uses squares), and the
 difference is that as reported by ntp or chrony, and both are careful to do
 the calculations with as high an accuracy as possible.
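
For reference, the two expressions argued over above can be put side by side. This is a minimal Python sketch, not code from ntpd or chrony: the function name and the sample timestamps are invented for illustration. It follows the form quoted from the NTPv4 draft, taking the two differences first, and also evaluates the (t1+t4-t2-t3)/2 form to show that it is the same quantity with the opposite sign.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Offset and round-trip delay from one NTP exchange.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive (all in seconds).
    Positive offset means the server clock is ahead of the client clock.
    """
    outbound = t2 - t1          # differences are taken first, as in the draft
    inbound = t3 - t4
    offset = (outbound + inbound) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Invented example: server 5 ms ahead, 10 ms one-way delay each direction,
# 1 ms processing time on the server.
t1, t2, t3, t4 = 100.000, 100.015, 100.016, 100.021
offset, delay = ntp_offset_and_delay(t1, t2, t3, t4)

# The form written earlier in the thread: same magnitude, opposite sign.
flipped = (t1 + t4 - t2 - t3) / 2

print(f"offset {offset * 1e3:.3f} ms, delay {delay * 1e3:.3f} ms")
print(f"flipped-sign form {flipped * 1e3:.3f} ms")
```

Since a standard deviation depends only on squared deviations from the mean, either sign convention gives the same scatter figure; the sign matters only when the value is used to correct the clock.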
 
 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread Unruh

David L. Mills [EMAIL PROTECTED] writes:

Unruh,

The basic clock discipline feedback loop has been unchanged since 1992, 
although minor changes have been made to improve behavior in very long 
poll intervals. The only radical change has been using a preliminary 
15-minute initial frequency computation when no frequency file is 
available. So, if you are comparing ntpd and chrony at initial startup 
and without a frequency file, expect to find wide differences in 
behavior. Starting ntpd with an intentionally bad frequency file is not 
useful unless you can configure chrony in the same way.

Of course I can. And have done so.
As you say, the transient response of ntp is terrible-- 3000 sec is far too
slow. 


If you really do want a definitive experiment, do what I suggested 
earlier: measure the transient response of both ntpd and chrony starting 
from the SAME initial conditions and with a frequency file containing 
zero PPM. Pay attention to the poll interval, which should be the same in 
both cases. That will tell you the story, the whole story and nothing 
but the truth.

Precisely what I have been doing. 



Dave

Unruh wrote:

 [EMAIL PROTECTED] (Danny Mayer) writes:
 
 
Unruh wrote:

All I say is that the experiments I have carried out show that ntp is slow
to converge if it starts off badly, and leaves the offset scatter larger
than chrony does. It does have a smaller scatter in the rate. 

 
 
But you are using an extremely old version of ntp and things have 
radically changed since that version was released. Try rerunning your 
experiments with ntp 4.2.4 and see what you get then. You also need to 
fix your calculations if you are going to get good results as I 
mentioned in a previous message.
 
 
 Most of the standard deviation results are with 4.2.4. Only the startup was
 with 4.2.0. Are you saying that things have radically changed in the
 handling of the startup? After I collect more data on steady state, I will
 rerun startups both with no drift file and a bad drift file to see how fast
 the convergence is with 4.2.4.
 
 
 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread Harlan Stenn

Hey Danny!

 In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] writes:

Unruh Unfortunately I cannot run both ntp and chrony on the same system at
Unruh the same time.

Bill,

Exactly why can you not run ntpd and chrony on the same system at the same
time?

I want the ability to run multiple instances of ntpd where at most 1
instance of ntpd is actually controlling the clock, specifically to make it
easy to (more quickly) analyze the performance/behavior of different
configurations of ntpd.  I understand that the boat is rocking while this is
going on, but I suspect this capability would be a useful one in at least
some cases.

H


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread Maarten Wiltink

Unruh [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 David L. Mills [EMAIL PROTECTED] writes:

 There are lots of ways to measure the loop transient response. The
 easiest way is to set the clock some 50-100 ms off from some stable
 source (not necessarily accurate) and watch the loop converge. The
 response should cross zero in about 3000 s and overshoot about 6
 percent

 3000 s is a HUGE time. For people who switch on their computers daily,
 that means most of their time is spent with the computer unsynchronised
 to best accuracy. The timescale of chrony is far faster. (I am not a
 writer of chrony. I am a user who is trying to get the very best out of
 the timekeeping.)

But NTP is from a time when people didn't switch on their computers
daily. When NTP was young, dinosaurs walked the machine room and
_you_ did _not_ get to decide when the machine on the other end of
your terminal was rebooted.

NTP can, after weeks of training, teach a computer to keep time very,
very well. As a result, it's less optimised for the other end of the
spectrum.

Features like iburst and the drift file can get your clock synchronised
to within a few milliseconds in less than a minute. If you want better
than that, or you want it faster... don't turn your computer off.

Groetjes,
Maarten Wiltink



Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread David Woolley

David L. Mills wrote:

  As for offset should be much larger than the error, be careful here.
  By error I assume you mean what ntpq rv shows as jitter. The best case

No. By error I meant a measurement that neither ntpd nor chrony can 
actually make, namely the difference between the user's concept of 
perfect time and the actual time in the software clock in the client. If 
you could actually measure it, you would probably characterize it by the 
root mean square of this.

What actually happens is that, say you have a server, that you define as 
perfect time, you desperately want a measure of how accurate your client 
is compared with the server's internal time.  People seize on offset as 
a measure of that, but, if the loop is well locked, which I think 
amounts to jitter and RMS offset being essentially the same, offset is 
almost entirely made up of measurement error.  In reality the client's 
software clock may well be in almost perfect synchronization with the 
server's and certainly should have an RMS difference that is much less 
than deduced from offset/jitter.  (Systematic errors may result in a 
systematic offset, so one is really talking about a jitter-like measure, 
relative to the, unavailable, perfect time.)

The measurement cannot be made using ntpd or chrony alone, because if 
they could measure the true error, they could correct for it.

  is when offset is indeed less than jitter; if the error is much larger
  than error, this suggests the frequency has surged and the time

I think you meant the first error to be offset and the second one to be 
jitter.  I would consider this case to be one where the loop was not 
properly locked.

  constant/poll interval needs to be reduced. Watch the poll interval
  behavior in the loopstats data.

I think you really need to address two issues to put this thread to
rest:

- the use of linear regression algorithms on finite histories, as an
   alternative to the ntpd algorithm (i.e. the statisticians/scientists
   approach, versus the engineer's);

- the handling of cases where it is obvious to a human that the time
   is wrong, but ntpd will take 3000+s to fully correct.

chrony uses linear regression (modified least squares) and it seems to 
be getting a reputation for recovering from transients much better than 
ntpd.  Unruh believes that this is the consequence of the algorithm that 
it uses, which means that least squares type techniques are beginning to 
be associated with the way to go with time synchronization.  I know you 
disagree, but you have to convince people of that when chrony seems to 
behave much better in the transients seen in real uses of ntpd.

I wonder if what is really needed is to use linear regression to gain 
and regain lock and to use the current ntpd algorithm when you are 
reasonably convinced that the loop is locked.  At the moment, you do a 
two point linear regression on a cold start, or after a step, although 
two point least squares fits are rather trivial as they always have zero 
variance if the points are distinct!


My understanding of chrony, based on high level documents and a quick 
skim of the code is that:

- it is not NTP compliant because it doesn't seem to implement
   normative parts of the NTPv3 specification, like the intersection
   algorithm (but many people don't distinguish between SNTP and NTP
   because they use the same wire formats);

- the way it works is to maintain a finite history of measurements
   and to use linear regression (least squares modified to give less
   weight to outliers) to calculate a phase and frequency error.

   It applies the phase correction as a fast slew (which is seen as an
   advantage, because only a fixed frequency correction is left if the
   server goes away) and the frequency correction continuously.

   Once it has applied a correction, it adjusts the historic measurements
   to account for its current time and frequency scales.

   I think there is more to it than this, e.g. adjusting sample rates
   and the number of retained samples.
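
To make the regression idea concrete, here is a minimal Python sketch of fitting a straight line to a short history of (time, offset) samples: the slope is the frequency error and the fitted value at the latest sample is the phase error. It is an assumption-laden illustration only. It uses plain unweighted least squares rather than the outlier-resistant variant described above, it is not taken from the chrony source, and the function name and sample values are hypothetical.

```python
def fit_offset_and_frequency(times, offsets):
    """Ordinary least-squares line through (time, offset) samples.

    Returns (offset_now, frequency): the fitted offset at the most recent
    sample time, in seconds, and the slope in seconds per second.
    """
    n = len(times)
    if n < 2:
        raise ValueError("need at least two samples")
    t_mean = sum(times) / n
    o_mean = sum(offsets) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (o - o_mean) for t, o in zip(times, offsets))
    slope = sxy / sxx
    offset_now = o_mean + slope * (times[-1] - t_mean)
    return offset_now, slope


# Hypothetical history: samples every 64 s from a clock that starts 1 ms
# off and drifts at 25 PPM, with no measurement noise.
times = [0.0, 64.0, 128.0, 192.0, 256.0]
offsets = [1e-3 + 25e-6 * t for t in times]

offset_now, freq = fit_offset_and_frequency(times, offsets)
print(f"phase error {offset_now * 1e3:.3f} ms, frequency error {freq * 1e6:.3f} PPM")
```

With noisy samples the same fit averages the noise over the whole history, which is why the length of the retained history plays a role loosely comparable to a loop time constant.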

Because it is significantly different in principle from ntpd, it is not 
entirely clear that ntpd concepts like loop time constants are explicit 
in the chrony model, although they might be implicit in things like the 
period over which samples are currently being retained.

A problem that Unruh is having is that some of the answers he is getting 
seem to represent blind faith in ntpd without any knowledge of 
alternative approaches.


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread Unruh

Harlan Stenn [EMAIL PROTECTED] writes:

Hey Danny!

 In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] writes:

Unruh Unfortunately I cannot run both ntp and chrony on the same system at
Unruh the same time.

Bill,

Exactly why can you not run ntpd and chrony on the same system at the same
time?

The key problem is that there is a feedback between the clock control
algorithm and the clock loop routines. That feedback is missing if chrony
or whatever does not control the clock. It may tell you a bit, but removing
such a crucial part of the feedback loop would totally change the behaviour
of the loop. 

What would be great to have would be the ability to run them on a fake
clock machine, where it is a program which responds to the output (clock
control) and the input ( the network stuff.) In particular chrony uses
system clock stuff to schedule various events-- like sending out the
packets, etc. One would have to have the system feed those routines as
well. However, if one could do that then one could look at how both ntp and
chrony reacted to exactly the same input/output. It is however a massive
rewrite of the code, unfortunately, as far as I can see. 



I want the ability to run multiple instances of ntpd where at most 1
instance of ntpd is actually controlling the clock, specifically to make it
easy to (more quickly) analyze the performance/behavior of different
configurations of ntpd.  I understand that the boat is rocking while this is
going on, but I suspect this capability would be a useful one in at least
some cases.

Unfortunately I do not think that will give much info as to how the
different configurations behave. It would be like disconnecting the
feedback in an amplifier-- the amp behaves very very differently.



H


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread Bill Unruh

On Mon, 21 Jan 2008, Danny Mayer wrote:

 Unruh wrote:
  All I say is that the experiments I have carried out show that ntp is slow
  to converge if it starts off badly, and leaves the offset scatter larger
  than chrony does. It does have a smaller scatter in the rate. 

 But you are using an extremely old version of ntp and things have radically 
 changed since that version was released. Try rerunning your experiments with 
 ntp 4.2.4 and see what you get then. You also need to fix your calculations 
 if you are going to get good results as I mentioned in a previous message.

I did. The calculations I presented were with 4.2.4, except for the
convergence on initial transient. I have not retried that experiment (it
takes too long). Most of the results regarding the scattering of the offset
are for 4.2.4. It is a factor of a little over two worse than chrony in
regulating the offset.

In a few weeks I will probably try the initial transient stuff again.
(I am out of town next week)

However, do you believe that the behaviour of 4.2.4 under initial conditions is
better than 4.2.0 (eg either no drift file or a bad drift file)?

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread Hal Murray


On recent Linux kernels, I think the drift file is always bad after reboot.
HZ=100, no dynamic ticks aka tickless system (CONFIG_NO_HZ not set). I think
I even tried with a kernel command line option lpj= but it didn't help.
If the system is rebooted, ntpd stabilizes to a new different drift value.

That's a bug in the TSC calibration code.

grep your /var/log/messages* for Detected.  You will find things like this:
  Jan  4 11:21:49 shuksan kernel: Detected 2793.137 MHz processor.
  Jan  4 21:30:43 shuksan kernel: Detected 2793.209 MHz processor.
  Jan 22 09:32:20 shuksan kernel: Detected 2793.139 MHz processor.

The differences in the bottom bits turn into different drift values.
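
As a rough worked example of how much those bottom bits matter: if the kernel converts TSC counts to time using the detected frequency, a relative error in that calibration shows up as an equal relative error in clock rate. The Python sketch below makes that assumption explicit and only reports the size of the shift between boots, not its direction; the function name is invented for illustration.

```python
def calibration_shift_ppm(mhz_before, mhz_after):
    """Relative change between two detected CPU frequencies, in PPM."""
    return (mhz_after - mhz_before) / mhz_before * 1_000_000


# Detected frequencies from the three boots in the log excerpt above.
detected_mhz = [2793.137, 2793.209, 2793.139]

# Shift of each later boot relative to the first one.
shifts = [calibration_shift_ppm(detected_mhz[0], mhz) for mhz in detected_mhz[1:]]

for mhz, shift in zip(detected_mhz[1:], shifts):
    print(f"{mhz:.3f} MHz differs from the first boot by {shift:.2f} PPM")
```

Under that assumption the second boot differs from the first by about 26 PPM and the third by under 1 PPM, so a drift file written before a reboot can be off by a rate comparable to the whole drift value reported earlier in the thread.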


Recent Linux kernels use the TSC for timekeeping.  (At least on the
systems I work with.)  There may be a simple command line option
to use another chunk of hardware.


-- 
These are my opinions, not necessarily my employer's.  I hate spam.


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread Danny Mayer

Harlan Stenn wrote:
 Hey Danny!
 
 In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] writes:
 
 Unruh Unfortunately I cannot run both ntp and chrony on the same system at
 Unruh the same time.
 
 Bill,
 
 Exactly why can you not run ntpd and chrony on the same system at the same
 time?
 

Harlan, really. You *cannot* have two different mechanisms/applications 
to discipline the clock at the same time. I invite you to try. You have 
access to my code so you can test this easily.

 I want the ability to run multiple instances of ntpd where at most 1
 instance of ntpd is actually controlling the clock, specifically to make it
 easy to (more quickly) analyze the performance/behavior of different
 configurations of ntpd.  I understand that the boat is rocking while this is
 going on, but I suspect this capability would be a useful one in at least
 some cases.
 

I don't see the benefit of doing this with two separate instances. It's 
easier and simpler to just add the other servers into the one instance 
and specify noselect.

Danny

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-22 Thread Harlan Stenn

 In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Danny Mayer) writes:

Danny Harlan Stenn wrote:
Unruh Unfortunately I cannot run both ntp and chrony on the same system at
Unruh the same time.
  Bill,
 
 Exactly why can you not run ntpd and chrony on the same system at the
 same time?

Danny Harlan, really. You *cannot* have two different
Danny mechanisms/applications to discipline the clock at the same time. I
Danny invite you to try. You have access to my code so you can test this
Danny easily.

You are, as is so often the case, missing my point.  It is possible to run
ntpd in a way that it does not discipline the clock.  I am curious about
your last sentence though - what is special about your code that would allow
this to be tested?

 I want the ability to run multiple instances of ntpd where at most 1
 instance of ntpd is actually controlling the clock, specifically to make
 it easy to (more quickly) analyze the performance/behavior of different
 configurations of ntpd.  I understand that the boat is rocking while this
 is going on, but I suspect this capability would be a useful one in at
 least some cases.
 

Danny I don't see the benefit of doing this with two separate
Danny instances. It's easier and simpler to just add the other servers into
Danny the one instance and specify noselect.

Again you are missing my point.  Allowing this would let us, for example,
see how two different versions of ntpd would discipline the clock.  It would
allow us to see how ntpd might discipline the clock compared to chrony.

I understand and get that by not actually disciplining the clock we are
removing an important part of the feedback loop, and I do not know if that
will fatally affect these sorts of experiments or not.

And as Bill said, it would be Swell if there was a way to do this using, eg,
virtual machines so that we could test them that way.  Better yet, it would
be nice to have a simulator framework where we could run these tests faster
than in real-time.
-- 
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org  - be a member!


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-21 Thread Unruh

[EMAIL PROTECTED] (David Woolley) writes:

In article [EMAIL PROTECTED],
Bill Unruh [EMAIL PROTECTED] wrote:

 Offset error:
NTP: Mean=-3.1usec, Std Dev=63.1usec

If offset is the value reported by ntpq, please note that, when ntpd
is locked up, this is an indication of the instantaneous measurement
error, the actual error in the local time should be more stable (there may
be systematic error) by one or two orders of magnitude.

No, the offset is the value reported in loopstats.


More generally though, Dave Mills really needs to get in here and defend
his clock discipline algorithm, and the Chrony developer needs to 
defend theirs.  Arguing the cases by proxy isn't particularly satisfactory.

This is not arguing by proxy, this is running experiments. As I know, since
I am a physicist, experiment trumps theory always. 



Dave, please remember that what tends to concern people about the algorithm
is not the behaviour in response to gaussian phase noise, but its behaviour
in response to transients, in particular startup transients.  (Personally
I would say that lost clock tick transients should be fixed at source,
but Bill Unruh would also like it to tolerate those well.)

   Chrony: Mean=-1.5 usec, Std Dev=20.1usec

Given the way that I understand it works, I think this is the actual
correction applied on that sample.

No, this is offset as measured by the ntp procedure ( (t1+t4-t2-t3)/2 )




 Rate fluctuation:
   NTP:Mean=25.32  Std Dev=.078 (PPM) 
   Chrony: Mean=25.26 Std Dev=.091 (PPM)

Running it for a longer time, the standard deviation of the rate for ntp
has dropped to about .020PPM, which is much better than chrony's. 



The means depend on the hardware, and, as long as they are within the order
of one standard deviation of each other, they are as good as each other.

Yes, I agree that the mean rates are the same. It is the standard deviation
that is important here. Ie, ntp seems to be much better (smaller
fluctuation) than chrony here, at the expense of much worse offset control
(which makes sense if the rate fluctuations are real-- ie, I can make
chrony's rate fluctuations much smaller by taking a running average of the
rates over a couple of hours, but that will make the offset deviation
increase). I guess it depends on which you consider more important, an
accurate rate or an accurate clock.



From the point of view of another machine, chrony will have episodes where
the frequency changes much more, as it applies the phase correction.

??? These are done on the same machine. If you mean that when the real drift
rate of the computer changes, chrony's rate will change too, then I would
hope that that happens. Remember that this is not comparing two different
machines, but the same machine at two different times. 

And yes, the physical events could have changed between the two. It would
be nice if one could do a simulation-- put them both on some virtual
machine and feed in exactly the same real clock drift changes, and use some
model of the noise ( measurement, transmission, etc) so one could provide
the two algorithms with exactly the same data to work with. But neither
chrony nor ntp are set up for that. 



 over the weekend, and chrony encompassed the weekdays when the grad
 students use the computer) the offset control by chrony was a factor of 3
 better than by ntp.  

If the figures are the actual correction for chrony and the sample error for
ntpd and Dave Mills is correct about the phase noise rejection of the ntpd 
filter being a couple of orders of magnitude, ntpd might actually be 30 
times better.

Nope, the figures are the actual samples as measured by chrony, and the
processed output from ntp as reported in loopstats-- whatever that figure
is. Ie, if processing makes a difference then the advantage lies with ntp.
But I just checked using the offset reported in peerstats (choosing only the
packets from the one local server), and I get the same result as from
loopstats. Ie, the results for both ntp and chrony are the raw offsets,
and chrony's are about 2.2 times better than ntp's.

So, chrony, at least in this one test, controls the offsets of the clock
much better, at the expense of worse consistency in the frequency. It also
reacts much faster to gross changes in the time (eg startup with no drift
file).







Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-21 Thread Unruh

[EMAIL PROTECTED] (David Woolley) writes:

In article [EMAIL PROTECTED],
Unruh [EMAIL PROTECTED] wrote:

 No, the offset is the value reported in loopstats.

Same thing.  If chrony is reporting the same measurements, neither set of
measurements is particularly valid.  You need to measure the actual

I am sorry but I do not understand what you are saying. The best estimate
of the time error of the clock is the measurement that you make of that
error. Now, you might argue that if the drift never changed, and the clock
never changed then one could get a better estimate by averaging the
measurements. But the hypothesis that the time reported by the ntp process
is the true time plus random uncorrelated errors is simply wrong, as
looking at the plot of the offsets will rapidly convince you. The offsets
oscillate with a period on the order of an hour or so. Chrony does this.
ntp does this. The errors are NOT gaussian uncorrelated random errors. 

Thus most of that error budget is in such correlated errors, and ntp does
NOT do many orders of magnitude better ( even with uncorrelated random
errors you would need to average 100 samples-- collected over 10 hours at
poll 7 to get one order of magnitude, and by then the drift errors would have
gotten you.)
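
As a rough illustration of the averaging argument above, here is a minimal sketch that assumes the errors really are independent gaussian errors (the very assumption the paragraph says fails for real offsets). The function name, the unit sigma, and the batch counts are illustrative choices, not measured values.

```python
import random
import statistics

def std_of_batch_means(sigma, samples_per_batch, batches, seed=0):
    """Standard deviation of the means of many batches of independent errors."""
    rng = random.Random(seed)
    means = []
    for _ in range(batches):
        batch = [rng.gauss(0.0, sigma) for _ in range(samples_per_batch)]
        means.append(sum(batch) / len(batch))
    return statistics.pstdev(means)

# With sigma = 1, averaging 100 independent samples should shrink the
# scatter by roughly sqrt(100) = 10, ie one order of magnitude.
improvement = 1.0 / std_of_batch_means(1.0, 100, 2000)
print(f"improvement from averaging 100 samples: about {improvement:.1f}x")
```

Each further order of magnitude would need 100 times more samples again, and that is only for uncorrelated errors.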

Anyway, I am comparing like with like in the two programs. Chrony is much
better in offsets, which implies that at least half of the error in ntp
is internal error. Ie, it is errors which do NOT average out. ( and I do
not believe that chrony's errors are the minimal uncorrelated random errors
either.)

offsets, using something that has a repeatability a couple of orders

Of course that is the best way. Unfortunately I do not have that. I might
extend the line running the main server to also give me the true offsets
for the machine. However one can also get an estimate of the errors by
looking at the measured offsets using the ntp exchange. 

of magnitude better.  Certainly for ntpd, offset should be much larger
than the error, when locked.  Is the server running ntpd?

I do not believe this. 
Yes, the server is running ntp and its offset errors are of the order of 
3usec-- again correlated as you can see from the string graph near the bottom 
of the page.

For example, if I take a 10 element running average and subtract it from
the raw output of ntp for the server, the standard deviation goes from
3usec to .5usec. Ie, the errors are highly correlated. Averaging may be
able to get rid of that .5usec, but not the rest of the standard deviation
(3usec), which is some sort of highly correlated noise. If the errors were
really uncorrelated random errors, then subtracting off the running average
would make no (well, little) difference to the standard deviation (it would
decrease it by a factor of something like sqrt((N-1)/N), where N is the
length of the running average).

Ie, it is simply not true that the measured offsets  reported by ntp, or chrony,
are simply some independent gaussian random process around the true time. 
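
The running-average test described above can be sketched as follows. This is only an illustration on synthetic data, not the measured offsets: the white-noise series and the slow sinusoid (standing in for correlated wander) are assumptions, as are the function names and the trailing-window definition of the running average.

```python
import math
import random
import statistics

def residual_after_running_mean(samples, window):
    """Subtract from each sample the mean of the `window` samples ending at it."""
    residuals = []
    running_sum = 0.0
    for i, x in enumerate(samples):
        running_sum += x
        if i >= window:
            running_sum -= samples[i - window]  # drop the sample leaving the window
        if i >= window - 1:
            residuals.append(x - running_sum / window)
    return residuals

def std_ratio(samples, window):
    """Std dev after removing the running mean, relative to the raw std dev."""
    residuals = residual_after_running_mean(samples, window)
    return statistics.pstdev(residuals) / statistics.pstdev(samples)

rng = random.Random(0)
n, window = 20000, 10
white = [rng.gauss(0.0, 3.0) for _ in range(n)]                   # uncorrelated errors
slow = [3.0 * math.sin(2 * math.pi * i / 1000) for i in range(n)]  # correlated wander

white_ratio = std_ratio(white, window)  # expected near sqrt((N-1)/N)
slow_ratio = std_ratio(slow, window)    # expected much smaller than 1
print(f"white noise: {white_ratio:.3f}, theory {math.sqrt((window - 1) / window):.3f}")
print(f"slow wander: {slow_ratio:.3f}")
```

For uncorrelated noise the ratio stays close to sqrt(9/10), while for the slowly varying series most of the scatter disappears, which is the signature of correlated errors.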



Anyway, as I said, arguing by proxy is difficult and I'm rather hoping that
Dave Mills will take over.  Certainly it is Dave Mills you have to 
convince if ntpd is going to change.

I do not know if I am trying to convince. I am trying to report the
outcomes of some experiments. Now if one (Mills) wants ntp to behave
differently than the experiments show it does, then I guess he will change
it. If not, then not. 

All I say is that the experiments I have carried out show that ntp is slow
to converge if it starts off badly, and leaves the offset scatter larger
than chrony does. It does have a smaller scatter in the rate. 

One of the great advantages of two different people-- Mills and Curnow--
trying to implement the same ideas in different ways is that one can learn
by studying the difference between their results. 


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-21 Thread Danny Mayer

Unruh wrote:
 correction applied on that sample.
 
 No, this is offset as measured by the ntp procedure ( (t1+t4-t2-t3)/2 )
 

No, that's wrong. It is very carefully described in the NTPv4 draft 
section 8 (p27):

theta = T(B) - T(A) = 1/2 * [(T2-T1) + (T3-T4)]

Not only do you have the wrong sign, the differences must be calculated 
first, otherwise the errors in the calculation overwhelm the resulting 
value. That's why it's written the way it is.
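
A minimal sketch of the formula quoted above, with the differences formed first so the arithmetic works on small intervals rather than large absolute timestamps. The function name and the timestamps are made up for illustration; the delay expression is the usual round-trip formula and is not taken from this message.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive (all in seconds)."""
    outbound = t2 - t1            # differences are taken first
    inbound = t3 - t4
    offset = 0.5 * (outbound + inbound)
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

# Hypothetical exchange: server clock 0.5 s ahead of the client,
# 0.25 s network delay in each direction.
t1, t2, t3, t4 = 10.0, 10.75, 11.0, 10.75
offset, delay = ntp_offset_and_delay(t1, t2, t3, t4)
opposite_sign = (t1 + t4 - t2 - t3) / 2   # the expression as written earlier in the thread
print(offset, delay, opposite_sign)        # 0.5 0.5 -0.5
```

The two expressions differ only in sign, so they give the same magnitude of offset for the same exchange.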

Danny


Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-21 Thread Danny Mayer

Unruh wrote:
 All I say is that the experiments I have carried out show that ntp is slow
 to converge if it starts off badly, and leaves the offset scatter larger
 than chrony does. It does have a smaller scatter in the rate. 
 

But you are using an extremely old version of ntp and things have 
radically changed since that version was released. Try rerunning your 
experiments with ntp 4.2.4 and see what you get then. You also need to 
fix your calculations if you are going to get good results as I 
mentioned in a previous message.

Danny

Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-21 Thread Unruh

[EMAIL PROTECTED] (Danny Mayer) writes:

Unruh wrote:
 All I say is that the experiments I have carried out show that ntp is slow
 to converge if it starts off badly, and leaves the offset scatter larger
 than chrony does. It does have a smaller scatter in the rate. 
 

But you are using an extremely old version of ntp and things have 
radically changed since that version was released. Try rerunning your 
experiments with ntp 4.2.4 and see what you get then. You also need to 
fix your calculations if you are going to get good results as I 
mentioned in a previous message.

Most of the standard deviation results are with 4.2.4. Only the startup was
with 4.2.0. Are you saying that things have radically changed in the
handling of the startup? After I collect more data on steady state, I will
rerun startups both with no drift file and a bad drift file to see how fast
the convergence is with 4.2.4.




Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)

2008-01-21 Thread Unruh

[EMAIL PROTECTED] (Danny Mayer) writes:

Unruh wrote:
 correction applied on that sample.
 
 No, this is offset as measured by the ntp procedure ( (t1+t4-t2-t3)/2 )
 

No, that's wrong. It is very carefully described in the NTPv4 draft 
section 8 (p27):

theta = T(B) - T(A) = 1/2 * [(T2-T1) + (T3-T4)]

Not only do you have the wrong sign, the differences must be calculated 
first, otherwise the errors in the calculation overwhelm the resulting 
value. That's why it's written the way it is.

I was not writing code. I was telling you what time difference I was
referring to. And sign is a convention as to whether you are saying positive
means the computer is fast or the external source is fast. Everything I
talked about is sign independent (standard deviation uses squares), and the
difference is that reported by ntp or chrony, and both are careful to do the
calculations with as high an accuracy as possible.


