Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
On Wed, 30 Jan 2008, Danny Mayer wrote:

Unruh wrote:

David L. Mills [EMAIL PROTECTED] writes:

David,

We can argue about the Hurst parameter, which can't be truly random-walk as I have assumed, but the approximation is valid up to lag times of at least a week. However, as I have been cautioned, these plots are really sensitive to spectral lines due to nonuniform sampling. I was very careful to avoid such things.

But the lines I am referring to are not artifacts; they are there because of the way the computer is used -- the temperature fluctuations caused by people running the machine daily, except on weekends. These are not part of any random-walk process. They are real jumps in the drift rate of the machine, large jumps, and definitely not random.

Well of course. You are running Linux and losing interrupts. FreeBSD and friends don't suffer from that problem. I seem to remember setting HZ=100 mostly eliminates that problem, at the price of rebuilding the kernel.

Danny

No, they are not lost interrupts. They are NOT jumps in the offset; they are jumps in the frequency, which last for a few hours and then jump back. Lost interrupts do not act like that -- they would jump the offset by 10 ms (or 4 ms), which is definitely not happening. And it is hard to gain interrupts.

--
William G. Unruh | Canadian Institute for | Tel: +1(604)822-3273
Physics&Astronomy | Advanced Research | Fax: +1(604)822-5324
UBC, Vancouver,BC | Program in Cosmology | [EMAIL PROTECTED]
Canada V6T 1Z1 | and Gravity | www.theory.physics.ubc.ca/
___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Danny Mayer wrote:

Well of course. You are running Linux and losing interrupts. FreeBSD and

Lost interrupts are not the problem here and nothing about FreeBSD should help (unless it runs the CPU permanently at full power).
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David,

Cite: Judah Levine of NIST, personal communication. A few little mistakes on my part proved him right.

Dave

[EMAIL PROTECTED] wrote:

In comp.protocols.time.ntp you write:

Hi Dave,

We can argue about the Hurst parameter, which can't be truly random-walk as I have assumed, but the approximation is valid up to lag times of at least a week. However, as I have been cautioned, these plots are really sensitive to spectral lines due to nonuniform sampling. I was very careful to avoid such things.

Do you have a cite for that? Have you seen Vit Klemes' take on tree ring data:

http://iahs.info/perugia/2007IAHSKlemesTreeRings.pdf

It might appeal to your sense of humour.

David.
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
[EMAIL PROTECTED] (Danny Mayer) writes:

David L. Mills wrote:

Danny,

It doesn't stop working; it just clamps whatever it gets to +-500 PPM as appropriate. If the intrinsic error is greater than 500 PPM, the loop will do what it can, with the residual it can't correct showing as a systematic time offset.

Dave

I didn't mean to suggest that ntpd stopped running. It was that the clock was drifting steadily off into the sunset. I realize that if the problem corrected itself ntpd would bring things back to normal.

But that suggests that the drift rate of your chip became bigger than 500 PPM, which is huge. Maybe something altered the tick size inappropriately. ntp should have hauled the offset back to zero -- just taking a longer time (100 ms at 500 PPM takes about 200 s to eliminate, which is not that long).

Danny

Danny Mayer wrote:

David L. Mills wrote:

Danny,

Unless the computer clock intrinsic frequency error is huge, the only time the 500-PPM limit kicks in is with a 100-ms step transient and poll interval 16 s. The loop still works if it hits the stops; it just can't drive the offset to zero.

Dave

Yes, I found this out when my laptop stopped disciplining the clock and was complaining about the frequency limits and I started digging into the code to figure out why.

Danny
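The "100 ms at 500 PPM takes about 200 s" figure is plain division, assuming the clock is slewed at a constant rate for the whole interval (the real loop does not do this). A minimal sketch; the function name `slew_seconds` is made up for illustration:

```python
def slew_seconds(offset_s: float, slew_ppm: float = 500.0) -> float:
    """Seconds needed to remove offset_s when slewing at a constant slew_ppm.

    Assumption: the slew rate is held constant, which is an idealisation;
    a real discipline loop ramps the rate up and down.
    """
    if slew_ppm <= 0:
        raise ValueError("slew_ppm must be positive")
    # 1 PPM is one microsecond of correction per second of elapsed time.
    return abs(offset_s) / (slew_ppm * 1e-6)


hundred_ms = slew_seconds(0.100)  # the 100 ms case from the message above
```

By the same arithmetic, a 50 ms offset at 500 PPM would take 100 s.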
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
[EMAIL PROTECTED] (David Malone) writes:

Unruh [EMAIL PROTECTED] writes:

weekends. Lots of power at 10^-5 Hz and harmonics, and .7 10^-8 Hz -- more than would be predicted by 1/f

10^-5 Hz is about once per day. I'm not sure what .7 10^-8 Hz is; it seems to be about once every 4.5 years? I would have assumed you'd get power around 10^-5 Hz (daily), 10^-6 Hz (weekly) and maybe 3x10^-8 Hz (yearly) based on a mix of environmental factors (air conditioning/heating) and usage?

Yes, that was supposed to be 1/week.

David.
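The frequency figures being traded here are easy to check by converting between a period in days and a frequency in Hz. A small sketch; the helper names are illustrative only:

```python
SECONDS_PER_DAY = 86_400.0
DAYS_PER_YEAR = 365.25


def hz_from_period_days(period_days: float) -> float:
    """Frequency in Hz of something that repeats every period_days."""
    return 1.0 / (period_days * SECONDS_PER_DAY)


def period_days_from_hz(freq_hz: float) -> float:
    """Period in days corresponding to a frequency in Hz."""
    return 1.0 / (freq_hz * SECONDS_PER_DAY)


daily = hz_from_period_days(1.0)              # about 1.16e-5 Hz
weekly = hz_from_period_days(7.0)             # about 1.65e-6 Hz
yearly = hz_from_period_days(DAYS_PER_YEAR)   # about 3.17e-8 Hz
odd_line_years = period_days_from_hz(0.7e-8) / DAYS_PER_YEAR  # about 4.5 years
```

A weekly line therefore sits near 1.7 x 10^-6 Hz, which fits the reply that the figure was meant to be 1/week.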
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh wrote:

[EMAIL PROTECTED] (Danny Mayer) writes:

David L. Mills wrote:

Danny,

It doesn't stop working; it just clamps whatever it gets to +-500 PPM as appropriate. If the intrinsic error is greater than 500 PPM, the loop will do what it can, with the residual it can't correct showing as a systematic time offset.

Dave

I didn't mean to suggest that ntpd stopped running. It was that the clock was drifting steadily off into the sunset. I realize that if the problem corrected itself ntpd would bring things back to normal.

But that suggests that the drift rate of your chip became bigger than 500 PPM, which is huge. Maybe something altered the tick size inappropriately. ntp should have hauled the offset back to zero -- just taking a longer time (100 ms at 500 PPM takes about 200 s to eliminate, which is not that long).

No, it was something else entirely and not something that ntpd, chrony or any other application could do anything about. It's fixed now.

Danny
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
What was the problem?

On Mon, 28 Jan 2008, Danny Mayer wrote:

Unruh wrote:

[EMAIL PROTECTED] (Danny Mayer) writes:

David L. Mills wrote:

Danny,

It doesn't stop working; it just clamps whatever it gets to +-500 PPM as appropriate. If the intrinsic error is greater than 500 PPM, the loop will do what it can, with the residual it can't correct showing as a systematic time offset.

Dave

I didn't mean to suggest that ntpd stopped running. It was that the clock was drifting steadily off into the sunset. I realize that if the problem corrected itself ntpd would bring things back to normal.

But that suggests that the drift rate of your chip became bigger than 500 PPM, which is huge. Maybe something altered the tick size inappropriately. ntp should have hauled the offset back to zero -- just taking a longer time (100 ms at 500 PPM takes about 200 s to eliminate, which is not that long).

No, it was something else entirely and not something that ntpd, chrony or any other application could do anything about. It's fixed now.

Danny

--
William G. Unruh | Canadian Institute for | Tel: +1(604)822-3273
Physics&Astronomy | Advanced Research | Fax: +1(604)822-5324
UBC, Vancouver,BC | Program in Cosmology | [EMAIL PROTECTED]
Canada V6T 1Z1 | and Gravity | www.theory.physics.ubc.ca/
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David,

We can argue about the Hurst parameter, which can't be truly random-walk as I have assumed, but the approximation is valid up to lag times of at least a week. However, as I have been cautioned, these plots are really sensitive to spectral lines due to nonuniform sampling. I was very careful to avoid such things.

Dave

David Malone wrote:

Unruh [EMAIL PROTECTED] writes:

weekends. Lots of power at 10^-5 Hz and harmonics, and .7 10^-8 Hz -- more than would be predicted by 1/f

10^-5 Hz is about once per day. I'm not sure what .7 10^-8 Hz is; it seems to be about once every 4.5 years? I would have assumed you'd get power around 10^-5 Hz (daily), 10^-6 Hz (weekly) and maybe 3x10^-8 Hz (yearly) based on a mix of environmental factors (air conditioning/heating) and usage?

David.
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Maarten,

Maybe I didn't make myself clear. The case in question is when the intrinsic frequency error of the computer clock is greater than 500 PPM, in which case the discipline loop cannot compensate for the error. The result is a systematic time offset error that cannot be driven to zero. This has nothing to do with the initial offset as you suggest.

Dave

Maarten Wiltink wrote:

Unruh [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]

David L. Mills wrote:

Unless the computer clock intrinsic frequency error is huge, the only time the 500-PPM limit kicks in is with a 100-ms step transient and poll interval 16 s. The loop still works if it hits the stops; it just can't drive the offset to zero.

[...]

Why can't it drive the offset to zero? 100 ms should take about 5 min (if it were always 500, but the loop would make it take longer).

That would presumably be in the case of 'huge intrinsic frequency error'.

Groetjes,
Maarten Wiltink
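The clamping case described here can be put in numbers. This is only an illustration of the arithmetic, not ntpd's loop: it assumes the correction is simply the negated intrinsic error, limited to the +-500 PPM range mentioned in the thread. The function name is hypothetical.

```python
MAX_FREQ_PPM = 500.0  # the frequency limit discussed in the thread


def residual_ppm(intrinsic_error_ppm: float, limit_ppm: float = MAX_FREQ_PPM) -> float:
    """Frequency error left over after applying a correction clamped to +-limit_ppm."""
    wanted = -intrinsic_error_ppm                      # correction that would cancel the error
    applied = max(-limit_ppm, min(limit_ppm, wanted))  # what the clamp lets through
    return intrinsic_error_ppm + applied


within_limit = residual_ppm(300.0)   # fully cancelled
beyond_limit = residual_ppm(600.0)   # 100 PPM cannot be corrected
```

Whenever the intrinsic error exceeds the limit, the leftover frequency error is non-zero, which is the situation in which the offset cannot be driven to zero.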
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David L. Mills wrote:

It's easy to make your own Allan characteristic. Just let the computer clock free-run for a couple of weeks and record the offset relative to a known and stable standard, preferably at the smallest poll interval you can. The PPS from a GPS receiver is an ideal source, but you have to jerry-rig a means to capture each transition. Compute the RMS frequency differences, decimate and repeat. Don't take the following seriously, I lifted it without considering context, but that's the general idea. Be very careful about missing data, etc., as that creates spectral lines that mess up the plot.

    % w, x and y are input series from the original context;
    % they are not defined in this excerpt.
    p = w; r = diff(x); q = y;
    i = 1; d = 1;
    while (length(q) >= 10)
        u = diff(p) / d;
        x2(i) = sqrt(mean(u .* u) / 2);
        u = diff(r) / d;
        x1(i) = sqrt(mean(u .* u) / 2);
        u = diff(q);
        y1(i) = sqrt(mean(u .* u) / 2);
        p = p(1:2:length(p));
        r = r(1:2:length(r));
        q = q(1:2:length(q));
        m1(i) = d;
        i = i + 1;
        d = d * 2;
    end
    loglog(m1, x2 * 1e6, m1, x1 * 1e6, m1, y1 * 1e6, m1, (x1 + y1) * 1e6)
    axis([1 1e5 1e-4 100]);
    xlabel('Time Interval (s)');
    ylabel('Allan Deviation (PPM)');
    print -dtiff allan

Dave

And for those of you who didn't recognize it, that's MatLab code.

Danny
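For readers without MatLab, here is a rough Python sketch of the same general idea. It is not a translation of the code above: it uses the textbook overlapping Allan deviation computed from evenly spaced time-offset samples, and the function names and the `min_terms` cutoff are assumptions made for illustration. As the message warns, gaps in the data must be handled before using anything like this.

```python
import math


def allan_deviation(phase_s, tau0_s, m):
    """Overlapping Allan deviation at tau = m * tau0_s.

    phase_s: evenly spaced time-offset samples in seconds (no gaps assumed).
    tau0_s:  sampling interval in seconds.
    """
    n = len(phase_s)
    if m < 1 or n < 2 * m + 1:
        raise ValueError("need at least 2*m + 1 samples")
    tau = m * tau0_s
    total = 0.0
    for i in range(n - 2 * m):
        # Second difference of the phase over two adjacent intervals of length tau.
        second_diff = phase_s[i + 2 * m] - 2.0 * phase_s[i + m] + phase_s[i]
        total += second_diff * second_diff
    return math.sqrt(total / (2.0 * (n - 2 * m) * tau * tau))


def allan_curve(phase_s, tau0_s, min_terms=10):
    """Return [(tau, adev)] for m = 1, 2, 4, ... while enough terms remain."""
    points = []
    m = 1
    while len(phase_s) - 2 * m >= min_terms:
        points.append((m * tau0_s, allan_deviation(phase_s, tau0_s, m)))
        m *= 2
    return points
```

Plotting the returned pairs on log-log axes gives the kind of curve discussed in the thread; a constant frequency offset alone contributes nothing, since its second differences vanish.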
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David L. Mills wrote:

Danny,

Unless the computer clock intrinsic frequency error is huge, the only time the 500-PPM limit kicks in is with a 100-ms step transient and poll interval 16 s. The loop still works if it hits the stops; it just can't drive the offset to zero.

Dave

Yes, I found this out when my laptop stopped disciplining the clock and was complaining about the frequency limits and I started digging into the code to figure out why.

Danny
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Danny Mayer wrote:

No, ntpd deliberately limits frequency changes to 500 PPM. That's hard coded. You need to avoid using anything greater than that as Dave has explained. That would be the reason why it takes ntpd longer to bring the clock back to the right time.

Assuming that the static frequency error is consistent with a medium to high quality motherboard, slew rate limiting should only kick in if the clock was out by more than the order of a second in the first place, in which case stepping would have to have been inhibited. For normal users the slow convergence is due to the loop time constant being more suited to handling gradual temperature variations than startup transients or frequency hits.

The slew rate limit, for zero static error, is 1 s per 2000 s. I seem to remember the loop's first zero crossing being quoted at about 3000 s, with minpoll set for 64 and the slew rate not being exceeded. The resulting peak slew rate is more than 1/3000 for a 1 second error, but will be well below 1/2000 for a 128 ms error.
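The comparison in the last paragraph can be checked roughly. The sketch below takes the recalled 3000 s first zero crossing at face value and computes only the mean slew rate needed to remove an offset in that time; the peak rate of a real loop is higher than the mean, so this is a lower bound. The function name is illustrative.

```python
SLEW_LIMIT_PPM = 500.0  # 1 s per 2000 s


def mean_slew_ppm(offset_s: float, zero_crossing_s: float = 3000.0) -> float:
    """Average slew rate (PPM) needed to remove offset_s within zero_crossing_s.

    Assumption: zero_crossing_s = 3000 is the figure recalled in the message,
    not a measured or documented value.
    """
    return offset_s / zero_crossing_s * 1e6


one_second = mean_slew_ppm(1.0)      # about 333 PPM on average, peak higher
step_limit = mean_slew_ppm(0.128)    # about 43 PPM on average
```

For a 128 ms offset the mean is more than a factor of ten below the 500 PPM limit, which leaves ample room for the peak to stay under it.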
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David Woolley [EMAIL PROTECTED] wrote:

Petri Kaukasoina wrote:

Basically, it stepped time with ntpdate, slept 100 seconds and stepped time again with ntpdate. From the time adjustment, the script calculated the drift value and put that to the drift file. Again, the time offset always stays below 1 ms.

That has quite a lot of similarity with what ntpd itself does if it is cold started with iburst. The only big difference is that it uses 900, rather than 100 seconds. I don't know if that is the same 900 as controlled by tinker stepout, but, even if it is, the side effect on stepouts would probably be undesirable. To cold start you need to delete the drift file, or not configure it.

Hmm, I can't see that. I put in only one good time source with iburst, deleted the drift file and started ntpd. The time offset just keeps growing and the frequency changes in very small steps. Now, after 30 minutes, time is already 25 ms off and the frequency is only 1.5 ppm (the correct value would be about 25 ppm).
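The drift estimate in the script described at the top of this message comes down to one division: the size of the second step divided by the time between the two steps. A minimal sketch, assuming the second adjustment is known in seconds and that a positive adjustment means the clock had fallen behind; the function name and sign convention are assumptions, and the actual script is not shown in the thread.

```python
def drift_ppm(second_step_s: float, interval_s: float) -> float:
    """Frequency correction in PPM implied by a step of second_step_s after interval_s.

    Assumption: the clock was set exactly right at the start of the interval
    and ran uncorrected until the second step.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return second_step_s / interval_s * 1e6


example = drift_ppm(0.0025, 100.0)  # 2.5 ms accumulated over 100 s
```

A 2.5 ms adjustment after 100 s corresponds to 25 PPM, the size of error mentioned for this machine. The short interval makes the estimate sensitive to measurement noise in the two steps.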
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Petri Kaukasoina wrote:

David Woolley [EMAIL PROTECTED] wrote:

That has quite a lot of similarity with what ntpd itself does if it is cold started with iburst. The only big difference is that it uses 900,

Hmm, I can't see that. I put in only one good time source with iburst, deleted the drift file and started ntpd. The time offset just keeps growing and the frequency changes in very small steps. Now, after 30 minutes, time is already 25 ms off and the frequency is only 1.5 ppm (the correct value would be about 25 ppm).

Looking at the comments in the 4.2.0 source code, it looks like you may be right; yet another reason why ntpd doesn't handle startup transients well! If this is still true in the latest version (< max means offset < 128 ms):

 * State   < max   > max                   Comments
 *
 * NSET    FREQ    FREQ                    no ntp.drift
 * FREQ    SYNC    if (mu < 900) FREQ      calculate frequency
 *                 else if (allow) TSET
 *                 else FREQ
 *

Worse than is obvious here, it only sets the time on the first sample if it is out by more than 128 ms.

More obvious, unless the frequency error is so high that the time changes by more than 128 ms between the first two good samples, it will use the slow PLL method of calibrating the frequency.

Even then, unless the offset is more than 128 ms both for the first sample and after every subsequent sample, it will compute the frequency based on the final absolute value of clock offset, not the difference between the first and last readings; this might not be too important, because it looks to me to require the initial offset to be very close to 128 ms (low probability) or the frequency error to be quite high (percentage error in frequency calculation relatively low) for it to complete the frequency calibration.

What I was expecting was for it to unconditionally do both frequency and phase calibration, in the absence of the drift file. I presume that chrony does a correction on the first couple of samples and then refines it.

Incidentally, the else FREQ doesn't seem to match the code and looks like it would prevent it ever getting out of the calibration under some conditions.

It looks like I need to fetch the latest source, although it looks, from your observations, as though it is still far from what I would consider right.
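To make the quoted comment table easier to follow, here is a sketch of just the two rows cited from the 4.2.0 comment. It is a transcription of the comment, not of ntpd's code, and it rests on two assumptions: that the table's two columns mean offset below and above the 128 ms threshold, and that the condition reads mu < 900 (the comparison operators appear to have been lost from the archived text). The function name and the reading of `allow` as "stepping is permitted" are also assumptions.

```python
STEP_THRESHOLD_S = 0.128  # "max" in the quoted comment
STEPOUT_S = 900.0         # the 900 s interval discussed in the thread


def next_state(state: str, offset_s: float, mu_s: float, allow: bool) -> str:
    """Next state for the NSET and FREQ rows of the quoted comment table.

    mu_s is the time since the last update; allow is assumed to mean that
    stepping the clock is permitted.
    """
    below_max = abs(offset_s) < STEP_THRESHOLD_S
    if state == "NSET":            # no ntp.drift: always go and measure frequency
        return "FREQ"
    if state == "FREQ":            # calculating frequency
        if below_max:
            return "SYNC"
        if mu_s < STEPOUT_S:
            return "FREQ"
        return "TSET" if allow else "FREQ"
    raise ValueError("only the NSET and FREQ rows are quoted in the message")
```

Read this way, the final "else FREQ" branch means a large offset with stepping disallowed never leaves FREQ, which is the concern raised about that branch.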
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David,

David Woolley wrote:

ISTR that time stamps on financial transactions are required to be within two seconds of the correct time.

With NTP that standard is not too difficult to meet.

In 2006, it turns out that it was 3 seconds: http://tf.nist.gov/general/pdf/2125.pdf

NIST is a US government institution; might there perhaps be different laws or regulations elsewhere in the world? Does anyone among the readership here know?

Thx, Jan
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Jan Ceuleers wrote:

NIST is a US government institution; might there perhaps be different laws or regulations elsewhere in the world? Does anyone among the readership here know?

I used the US case as that is the one that has come up on the newsgroup, but I assume there are similar rules elsewhere. I think NIST is only documenting the rules in this case; my guess is that it is the SEC that sets them in the USA. I did the search with a site:nist.gov to reduce the false positives.
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Richard,

There were several different architecture computers considered in the 1995 and 1998 studies, including SPARC, Alpha, Intel and several lab instruments. All oscillators conformed to a simple model: white phase noise (slope -1) below the intercept, random-walk frequency noise (slope +0.5) above the intercept. This is equivalent to your model. Additional data are in the nanokernel documentation. The only differences are in the (x, y) intercept.

You don't need das Buch to justify this model; there is evidence all over the place. Clocks of all kinds from cold rocks to Cesium oscillators all show very similar characteristics, whether modelled in the time domain or frequency domain.

It's easy to make your own Allan characteristic. Just let the computer clock free-run for a couple of weeks and record the offset relative to a known and stable standard, preferably at the smallest poll interval you can. The PPS from a GPS receiver is an ideal source, but you have to jerry-rig a means to capture each transition. Compute the RMS frequency differences, decimate and repeat. Don't take the following seriously, I lifted it without considering context, but that's the general idea. Be very careful about missing data, etc., as that creates spectral lines that mess up the plot.

    % w, x and y are input series from the original context;
    % they are not defined in this excerpt.
    p = w; r = diff(x); q = y;
    i = 1; d = 1;
    while (length(q) >= 10)
        u = diff(p) / d;
        x2(i) = sqrt(mean(u .* u) / 2);
        u = diff(r) / d;
        x1(i) = sqrt(mean(u .* u) / 2);
        u = diff(q);
        y1(i) = sqrt(mean(u .* u) / 2);
        p = p(1:2:length(p));
        r = r(1:2:length(r));
        q = q(1:2:length(q));
        m1(i) = d;
        i = i + 1;
        d = d * 2;
    end
    loglog(m1, x2 * 1e6, m1, x1 * 1e6, m1, y1 * 1e6, m1, (x1 + y1) * 1e6)
    axis([1 1e5 1e-4 100]);
    xlabel('Time Interval (s)');
    ylabel('Allan Deviation (PPM)');
    print -dtiff allan

Dave

Richard B. Gilbert wrote:

Unruh wrote:

David L. Mills [EMAIL PROTECTED] writes:

David,

1. I have explained in very gory detail in many places how the time constant is chosen for the best accuracy using typical computer oscillators and network paths. See the briefings on the NTP project page and especially the discussion about the Allan intercept. If you want the

The Allan intercept is predicated on a very specific model of the noise in a clock (as I recall, basically random gaussian noise at high frequencies, and 1/f noise at low). It is not at all clear that real computers comply with that.

best accuracy over the long term, you had better respect that. Proof positive is in my 1995 SIGCOMM paper, later IEEE Transactions on Networking paper and das Buch. I absolutely relish scientific critique, but see the briefings and read the papers first.

2. To reduce the convergence time, reduce the time constant, but only at the expense of long term accuracy. An extended treatise on that is in das Buch, especially Chapters 4, 6 and 12. I would be delighted to hear critique of the material, but read the chapters first.

While you may know what in the world Das Buch is (Hitler's Mein Kampf?) I do not. Nor do I know where to get it.

Computer Network Time Synchronization: The Network Time Protocol by David L. Mills (Hardcover - Mar 24, 2006)

Available from Amazon.com. You may be able to find a copy at a University Book store. Be prepared for Sticker Shock. It ain't cheap! Publishing in small quantities is EXPENSIVE!!! It's different when you can amortize your setup costs over 50,000 copies! Das Buch is unlikely to become a best seller!
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David L. Mills wrote:

Richard,

There were several different architecture computers considered in the 1995 and 1998 studies, including SPARC, Alpha, Intel and several lab instruments. All oscillators conformed to a simple model: white phase noise (slope -1) below the intercept, random-walk frequency noise (slope +0.5) above the intercept. This is equivalent to your model. Additional data are in the nanokernel documentation. The only differences are in the (x, y) intercept.

You don't need das Buch to justify this model; there is evidence all over the place. Clocks of all kinds from cold rocks to Cesium oscillators all show very similar characteristics, whether modelled in the time domain or frequency domain.

It's easy to make your own Allan characteristic. Just let the computer clock free-run for a couple of weeks and record the offset relative to a known and stable standard, preferably at the smallest poll interval you can. The PPS from a GPS receiver is an ideal source, but you have to jerry-rig a means to capture each transition. Compute the RMS frequency differences, decimate and repeat. Don't take the following seriously, I lifted it without considering context, but that's the general idea. Be very careful about missing data, etc., as that creates spectral lines that mess up the plot.

    % w, x and y are input series from the original context;
    % they are not defined in this excerpt.
    p = w; r = diff(x); q = y;
    i = 1; d = 1;
    while (length(q) >= 10)
        u = diff(p) / d;
        x2(i) = sqrt(mean(u .* u) / 2);
        u = diff(r) / d;
        x1(i) = sqrt(mean(u .* u) / 2);
        u = diff(q);
        y1(i) = sqrt(mean(u .* u) / 2);
        p = p(1:2:length(p));
        r = r(1:2:length(r));
        q = q(1:2:length(q));
        m1(i) = d;
        i = i + 1;
        d = d * 2;
    end
    loglog(m1, x2 * 1e6, m1, x1 * 1e6, m1, y1 * 1e6, m1, (x1 + y1) * 1e6)
    axis([1 1e5 1e-4 100]);
    xlabel('Time Interval (s)');
    ylabel('Allan Deviation (PPM)');
    print -dtiff allan

Dave

Richard B. Gilbert wrote:

Unruh wrote:

David L. Mills [EMAIL PROTECTED] writes:

David,

1. I have explained in very gory detail in many places how the time constant is chosen for the best accuracy using typical computer oscillators and network paths. See the briefings on the NTP project page and especially the discussion about the Allan intercept. If you want the

The Allan intercept is predicated on a very specific model of the noise in a clock (as I recall, basically random gaussian noise at high frequencies, and 1/f noise at low). It is not at all clear that real computers comply with that.

best accuracy over the long term, you had better respect that. Proof positive is in my 1995 SIGCOMM paper, later IEEE Transactions on Networking paper and das Buch. I absolutely relish scientific critique, but see the briefings and read the papers first.

2. To reduce the convergence time, reduce the time constant, but only at the expense of long term accuracy. An extended treatise on that is in das Buch, especially Chapters 4, 6 and 12. I would be delighted to hear critique of the material, but read the chapters first.

While you may know what in the world Das Buch is (Hitler's Mein Kampf?) I do not. Nor do I know where to get it.

Computer Network Time Synchronization: The Network Time Protocol by David L. Mills (Hardcover - Mar 24, 2006)

Available from Amazon.com. You may be able to find a copy at a University Book store. Be prepared for Sticker Shock. It ain't cheap! Publishing in small quantities is EXPENSIVE!!! It's different when you can amortize your setup costs over 50,000 copies! Das Buch is unlikely to become a best seller!

David,

Why are you telling me this? My contribution to this thread consisted of the above exposition of the publication data and availability of Das Buch.
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Danny,

Unless the computer clock intrinsic frequency error is huge, the only time the 500-PPM limit kicks in is with a 100-ms step transient and poll interval 16 s. The loop still works if it hits the stops; it just can't drive the offset to zero.

Dave

Danny Mayer wrote:

Unruh wrote:

David L. Mills [EMAIL PROTECTED] writes:

Reading your claims literally, chrony would have to slew the clock considerably greater than the 500 PPM provided by the standard Unix adjtime() system call. Please explain how it does that.

Using the Linux adjtimex system call, which has the ability to change the ticksize, which gives much greater than 500 PPM slew rate for the clocks. ( Up to 10PPM, although that is never used. ) And as I understand it, your handling of leap seconds in ntp also uses far greater than 500 PPM slew rates.

No, ntpd deliberately limits frequency changes to 500 PPM. That's hard coded. You need to avoid using anything greater than that as Dave has explained. That would be the reason why it takes ntpd longer to bring the clock back to the right time.

Danny
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Petri,

The default 900-s stepout interval was originally determined by the time an old Spectracom WWVB receiver took to regain synchronization after a leap second and should probably be reduced. It can of course be tinkered.

During the initial training period the time is not disciplined other than to amortize the initial offset. The bookkeeping to do that and preserve an accurate frequency measurement got too tedious and fragile. So, at the end of the training period the offset that built up during the interval is amortized. I didn't think this was much of a problem, since in practice the training is done only once.

Dave

Petri Kaukasoina wrote:

David Woolley [EMAIL PROTECTED] wrote:

Petri Kaukasoina wrote:

Basically, it stepped time with ntpdate, slept 100 seconds and stepped time again with ntpdate. From the time adjustment, the script calculated the drift value and put that to the drift file. Again, the time offset always stays below 1 ms.

That has quite a lot of similarity with what ntpd itself does if it is cold started with iburst. The only big difference is that it uses 900, rather than 100 seconds. I don't know if that is the same 900 as controlled by tinker stepout, but, even if it is, the side effect on stepouts would probably be undesirable. To cold start you need to delete the drift file, or not configure it.

Hmm, I can't see that. I put in only one good time source with iburst, deleted the drift file and started ntpd. The time offset just keeps growing and the frequency changes in very small steps. Now, after 30 minutes, time is already 25 ms off and the frequency is only 1.5 ppm (the correct value would be about 25 ppm).
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
[EMAIL PROTECTED] (Danny Mayer) writes:

Unruh wrote:

David L. Mills [EMAIL PROTECTED] writes:

Reading your claims literally, chrony would have to slew the clock considerably greater than the 500 PPM provided by the standard Unix adjtime() system call. Please explain how it does that.

Using the Linux adjtimex system call, which has the ability to change the ticksize, which gives much greater than 500 PPM slew rate for the clocks. ( Up to 10PPM, although that is never used. ) And as I understand it, your handling of leap seconds in ntp also uses far greater than 500 PPM slew rates.

No, ntpd deliberately limits frequency changes to 500 PPM. That's hard coded. You need to avoid using anything greater than that as Dave has explained. That would be the reason why it takes ntpd longer to bring the clock back to the right time.

Well, no to both. ntpd steps, which hardly obeys that limit, and the reason ntp takes such a long time is that it has an integration loop with such a long time constant. If it put its mind to it and used the 500 PPM to get rid of a 50 ms offset, it would only take 200 sec, not 3 hours. It slowly jacks the PPM to 400 or so and then slowly drops it again below the nominal. This is done to avoid thrashing or instability.

Danny
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David J Taylor [EMAIL PROTECTED] writes:

Richard B. Gilbert wrote:

[]

Computer Network Time Synchronization: The Network Time Protocol by David L. Mills (Hardcover - Mar 24, 2006)

Available from Amazon.com. You may be able to find a copy at a University Book store. Be prepared for Sticker Shock. It ain't cheap! Publishing in small quantities is EXPENSIVE!!! It's different when you can amortize your setup costs over 50,000 copies! Das Buch is unlikely to become a best seller!

Perhaps we could have a Lulu version? They can manage small quantities very effectively. See: http://www.lulu.com

I'd love to see the book, but can't afford those Amazon prices. Would have been nice if there were an online version.

Cheers, David
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David, I don't know your version, but the TSET state was removed some time ago and your comments are different from the current source. It's really hard to test the discipline under all conceivable conditions. Now and then somebody cooks up a case considered very unlikely, like Solaris adjtime() behavior with large offsets and force-slew mode, so the code does get tweaked from time to time. Dave David Woolley wrote: Petri Kaukasoina wrote: David Woolley [EMAIL PROTECTED] wrote: That has quite a lot of similarity with what ntpd itself does if it is cold started with iburst. The only big difference is that it uses 900, Hmm, I can't see that. I put in only one good time source with iburst, deleted the drift file and started ntpd. The time offset just keeps growing and the frequency changes in very small steps. Now, after 30 minutes time is already 25 ms off and the frequency is only 1.5 ppm (the correct value would be about 25 ppm). Looking at the comments in the 4.2.0 source code, it looks like you may be right; yet another reason why ntpd doesn't handle startup transients well! If this is still true in the latest version ( max means offset 128ms): * Statemaxmax Comments * * NSETFREQFREQno ntp.drift * FREQSYNCif (mu 900) FREQ calculate frequency * else if (allow) TSET * else FREQ * Worse than is obvious here, it only sets the time on the first sample if it is out by more than 128ms. More obvious, unless the frequency error is so high that the time changes by more than 128ms between the first two good samples, it will use the slow PLL method of calibrating the frequency. 
Even then, unless the offset is more than 128ms both for the first sample and after every subsequent sample, it will compute the frequency based on the final absolute value of clock offset, not the difference between the first and last readings; this might not be too important, because it looks to me to require the initial offset to be very close to 128ms (low probability) or the frequency error to be quite high (percentage error in frequency calculation relatively low) for it to complete the frequency calibration. What I was expecting was for it to unconditionally do both frequency and phase calibration, in the absence of the drift file. I presume that chrony does a correction on the first couple of samples and then refines it. Incidentally, the else FREQ doesn't seem to match the code and looks like it would prevent it ever getting out of the calibration under some conditions. It looks like I need to fetch the latest source, although it looks, from your observations, as though it is still far from what I would consider right.
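The difference between the two frequency estimates discussed in the message above can be made concrete with a small sketch. This is not ntpd code; the function names are invented for illustration, and the numbers (100 ms initial offset, 25 ppm true error, 900 s window) are assumptions chosen only to resemble the case described in the thread.

```python
def freq_from_endpoints(t_first, off_first, t_last, off_last):
    """Frequency error in PPM from the change in offset between two samples."""
    elapsed = t_last - t_first
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (off_last - off_first) / elapsed * 1e6


def freq_from_final_offset(t_first, t_last, off_last):
    """Frequency error in PPM computed from the final absolute offset only.

    This treats the whole final offset as if it had accumulated during the
    window, so any offset already present at the start inflates the result.
    """
    elapsed = t_last - t_first
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return off_last / elapsed * 1e6


# Assumed scenario: clock starts 100 ms off and drifts at 25 ppm for 900 s.
start_offset = 0.100
end_offset = start_offset + 25e-6 * 900   # 0.1225 s

print(freq_from_endpoints(0, start_offset, 900, end_offset))   # about 25 ppm
print(freq_from_final_offset(0, 900, end_offset))               # about 136 ppm
```

With a non-zero starting offset the second estimate is badly wrong, which is why the distinction only stops mattering when the clock was stepped to (near) zero offset at the first sample.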
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David Woolley [EMAIL PROTECTED] writes: What I was expecting was for it to unconditionally do both frequency and phase calibration, in the absence of the drift file. I presume that chrony does a correction on the first couple of samples and then refines it. Yes. Actually it does a recalibration using the last n samples (where n is dynamic: it grows with stability and shrinks if the linear fit is not a very good one, with "good" defined by looking at how often the errors in the linear fit cross zero). It then uses the adjtimex OFFSET single-shot adjustment to get rid of the offset and uses the slope to set the frequency, adjusting the old samples to account for the change in offset and frequency, and keeping track of the offset adjustment in case it was interrupted or did not completely compensate. Incidentally, the else FREQ doesn't seem to match the code and looks like it would prevent it ever getting out of the calibration under some conditions. It looks like I need to fetch the latest source, although it looks, from your observations, as though it is still far from what I would consider right.
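A minimal sketch of the kind of fit described above: an ordinary least-squares line through the last n (time, offset) samples, with the slope read as the frequency error and the number of residual sign changes used as a crude goodness-of-fit indicator. This is not chrony's source; the function name and return convention are assumptions made for illustration.

```python
def fit_offset_and_frequency(times, offsets):
    """Least-squares line through (time, offset) samples.

    Returns (offset_now, slope, sign_changes):
      offset_now   - fitted offset at the most recent sample time (seconds)
      slope        - fitted frequency error (seconds per second)
      sign_changes - how often consecutive residuals change sign; few
                     changes over many samples suggest the data are curved
                     and a straight line is a poor model
    """
    n = len(times)
    if n < 2 or n != len(offsets):
        raise ValueError("need at least two paired samples")
    t_mean = sum(times) / n
    o_mean = sum(offsets) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("sample times must not all be equal")
    sxy = sum((t - t_mean) * (o - o_mean) for t, o in zip(times, offsets))
    slope = sxy / sxx
    intercept = o_mean - slope * t_mean
    residuals = [o - (intercept + slope * t) for t, o in zip(times, offsets)]
    sign_changes = sum(1 for a, b in zip(residuals, residuals[1:]) if a * b < 0)
    offset_now = intercept + slope * times[-1]
    return offset_now, slope, sign_changes


# Assumed example: 1 ms starting offset, 25 ppm drift, polled every 16 s.
ts = [0.0, 16.0, 32.0, 48.0, 64.0]
offs = [1e-3 + 25e-6 * t for t in ts]
offset_now, slope, _ = fit_offset_and_frequency(ts, offs)
print(offset_now, slope * 1e6)   # about 2.6 ms and about 25 ppm
```

A caller in the spirit of the description would then remove `offset_now` with a one-shot adjustment, apply `-slope` as a frequency correction, and shorten or lengthen the sample window depending on `sign_changes`.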
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Richard B. Gilbert [EMAIL PROTECTED] writes: David, Why are you telling me this? My contribution to this thread consisted of the above exposition of the publication data and availability of Das Buch. He is not good at following attributions in threads. He addressed it to you because he read my comments in your reply. I understood them to be directed to me. ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David L. Mills wrote: 5. This flap about the speed of convergence has become silly. Most of us are less concerned about squeezing to the low microseconds in four Have you done the market surveys to confirm this? I don't have the resources or time to do that, but my impression from the sort of questions that appear on this newsgroup is that most IT managers and turnkey system developers who want better than 100ms clock accuracy want one or both of:
- fast convergence (small compared with overall bootup time); a common case, these days, is that they are not allowed to process financial transactions until convergence is complete;
- strict monotonicity.
It may well be that most users don't need better than 100ms, but those users don't care about long term stability, and their long term may be an 8 hour shift. (My interest in NTP is more theoretical, as I work in an industry sector that, whilst it deals with timestamped data, those timestamps are often a minute or two out (and are added by equipment that is out of our control), but I do notice the sorts of questions that keep coming up time and time again.)
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Root, Right; 5 microseconds per timer interrupt at 100 Hz is 0.5 ms/s. That was the original Unix kernel value. Dave root wrote: David L. Mills [EMAIL PROTECTED] writes: snip ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
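The arithmetic in the message above, written out as a small sketch (the function name is invented for illustration): an adjustment of a fixed number of microseconds per timer interrupt, multiplied by the interrupt rate, gives microseconds of slew per second, which is numerically the same as parts per million.

```python
def slew_rate_ppm(adjust_us_per_tick, ticks_per_second):
    """Slew rate from a per-tick adjustment.

    Microseconds of correction per second of elapsed time is the same
    number as parts per million, so no further scaling is needed.
    """
    return adjust_us_per_tick * ticks_per_second


rate = slew_rate_ppm(5, 100)
print(rate)            # 500 (us per second, i.e. 500 PPM)
print(rate / 1000.0)   # 0.5 (ms per second)
```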
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Petri Kaukasoina wrote: Basically, it stepped time with ntpdate, slept 100 seconds and stepped time again with ntpdate. From the time adjustment, the script calculated the drift value and put that to the drift file. Again, the time offset always stays below 1 ms. That has quite a lot of similarity with what ntpd itself does if it is cold started with iburst. The only big difference is that it uses 900, rather than 100 seconds. I don't know if that is the same 900 as controlled by tinker stepout, but, even if it is, the side effect on stepout's would probably be undesirable. To cold start you need to delete the drift file, or not configure it. ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
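The calculation the script above performs can be sketched as follows. The original script is not shown in the thread, so this is only a guess at its arithmetic: after the first step the offset is taken as zero, so the size of the second step divided by the sleep interval is the frequency error. The function name is invented, the sign convention (a positive value meaning the second step moved the clock forward) is an assumption that should be checked against the tools actually used, and the output is formatted on the assumption that the drift file holds a single value in PPM.

```python
def drift_file_text(second_step_s, interval_s):
    """Turn the size of the second time step into drift-file text.

    second_step_s: correction applied by the second step, in seconds
                   (sign convention is an assumption, see above)
    interval_s:    time between the two steps, in seconds
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    drift_ppm = second_step_s / interval_s * 1e6
    return "%.3f\n" % drift_ppm


# Assumed example: the second step had to correct 2.5 ms after 100 s.
print(drift_file_text(0.0025, 100.0), end="")   # 25.000
```

Note that with a 100 s interval, a measurement error of 100 microseconds in either step already corresponds to 1 ppm of error in the result, which may be one reason to prefer a longer interval.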
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David Woolley wrote: David L. Mills wrote: 5. This flap about the speed of convergence has become silly. Most of us are less concerned about squeezing to the low microseconds in four Have you done the market surveys to confirm this? I don't have the resources or time to do that, but my impression from the sort of questions that appear on this newsgroup is that most IT managers and turnkey system developers who want better than 100ms clock accuracy want one or both of: - fast convergence (small compared with overall bootup time) - a a common case, these days, is that they are not allowed to process financial transactions until convergence is complete; - strict monotonicity. snip ISTR that time stamps on financial transactions are required to be within two seconds of the correct time. With NTP that standard is not too difficult to meet. Other applications might be far more demanding. ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David, Well, I have done a market survey of sorts, if you can count my consulting clients. There seems general agreement that 1 ms is a good target, but there is a wide range of expectations on how quickly that must be achieved. Actually, if the TOY chip is within 1 PPM and the downtime is less than 1000 s, convergence is essentially instantaneous. My advice to the Aegis crew was to isolate the NTP puppies on the fire control Ethernet and allow only a couple of other computers on the wire. Chrony would work just fine. Here's another contribution to the market survey. There is a seismic network on the sea floor off the Washington state coast. They need a millisecond for experiments lasting months, not just 8-hour shifts, and that when the experiment boxes get rather warm. Chrony might work here as well, but it would have to track large swings in temperature. Here's another one. National Public Radio (NPR) distributes almost all program media via IP and digital satellite. They don't need 1 ms, but they do need good stability in the face of highly variable transmission delays that could drive chrony nuts. And another one. A transatlantic link used by Ford Motor was once a statistical multiplexer that interleaved terminal keystrokes on a demand-assigned basis. Toss NTP packets in that mess and watch the huge jitter. That not only drove NTP nuts, it drove the TCP retransmission algorithm nuts, too. Seems like the market is highly fragmented. I hear you say 100 ms which I interpret as 100 milliseconds. Even 25-year-old fuzzballs could do much better than that on the congested ARPAnet. Did you mean 100 microseconds? Dave David Woolley wrote: David L. Mills wrote: 5. This flap about the speed of convergence has become silly. Most of us are less concerned about squeezing to the low microseconds in four Have you done the market surveys to confirm this? 
I don't have the resources or time to do that, but my impression from the sort of questions that appear on this newsgroup is that most IT managers and turnkey system developers who want better than 100ms clock accuracy want one or both of: - fast convergence (small compared with overall bootup time) - a a common case, these days, is that they are not allowed to process financial transactions until convergence is complete; - strict monotonicity. It may well be that most users don't need better than 100ms, but those users don't care about long term stability, and their long term may be an 8 hour shift. (My interest in NTP is more theoretical, as I work in an industry sector that, whilst it deals with timestamped data, those timestamps are often a minute or two out (and are added by equipment that is out of our control), but I do notice the sorts of questions that keep coming up time and time again.) ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Danny, I agree with everything you said except: Danny Mayer wrote: I agree. I don't see how it can be a specification violation. The biggest factor is how well it keeps time. A caesium clock keeps good time but you wouldn't say that it violates the specification. When we first started looking at the V4 spec for the ntp-wg, my first thought was the same as yours, namely that what happens inside a system shouldn't matter, the algorithms don't matter, only what it chimes matters. And strictly speaking, this is true. However, after reading Dave's book (Das Buch as he calls it), I realized that an important factor to the stability of the NTP network is the actual speed at which the clocks slew, i.e. the 500 PPM limit. This is largely ignored in the spec. I sent in some comments about how I thought it should be addressed but alas, my changes didn't make it in the latest versions. Brian Utterback ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David Woolley [EMAIL PROTECTED] writes: David L. Mills wrote: The NTP discipline is basically a type-II feedback control system. Your training should recall exactly how such a loop works and how it responds to a 50-ms step. Eleven seconds after NTP comes up the mitigation You both have problems here. Dave Mills: your problem is that you haven't explained why one should continue to use a long time constant linear feedback system when a human observer can easily tell you how to get within 10 microseconds of the correct time after no more than about 3 samples. Bill Unruh: you haven't explained what real world situation this test is simulating; it is a standard doctrine that ntpd is not a substitute for good hardware and system software (e.g. you shouldn't use ntpd to get round lost clock interrupts). The real world situation that the test is run on (not simulating) is having a computer on a LAN with another computer running ntp from a Garmin PPS acting as the server. It is a best case scenario, I will completely agree. I still get round trip times of msec rather than 150usec at times, the oscillators on the machines have glitches in which the clock rate changes by 1-2PPS suddenly (over less than 1/2 hr) and then long periods of quiescence. I have NOT tested the two in situations where there are longer paths, through many routers. I have not tested it on the road to Mandalay, or Indonesia. I have been looking at the real world response in a working system but where the network delays are minimal. Is my testing complete? Heavens no. It is one data point. Do I expect chrony to fall over on the road to Mandalay? Looking at its design, no, but experiments are the answer. algorithms present that transient to the loop and what happens afterwards conforms to the equations of control theory. Discussion about what happens at any time after that is a matter of mathematics and ntpd does conform to the mathematics as confirmed by observation and simulation. 
That's an indication that the equations are inappropriate in that context. ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Brian, The 500 PPM limit in the reference implementation was originally set to match the adjtime() slew of that value, but so many kernels have been hacked adjtime that this might not even be appropriate now. The bottom line is that an update given to adjtime() should be completed before the next update. Even if it's not, the leftover is carried over to the next update. However, in order to avoid disturbing application programs that compute intervals, the slew rate should be no more than necessary. Dave Brian Utterback wrote: Danny, I agree with everything you said except: Danny Mayer wrote: I agree. I don't see how it can be a specification violation. The biggest factor is how well it keeps time. A caesium clock keeps good time but you wouldn't say that it violates the specification. When we first started looking at the V4 spec for the ntp-wg, my first thought was the same as yours, namely that what happens inside a system shouldn't matter, the algorithms don't matter, only what it chimes matters. And strictly speaking, this is true. However, after reading Dave's book (Das Buch as he calls it), I realized that an important factor to the stability of the NTP network is the actual speed at which the clocks slew, i.e. the 500 PPM limit. This is largely ignored in the spec. I sent in some comments about how I thought it should be addressed but alas, my changes didn't make it in the latest versions. Brian Utterback ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
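A small sketch of the point made above about a fixed slew rate: how long a given offset takes to amortize at 500 PPM, and how much is left over if the next update arrives first. The function names and the example figures (128 ms offset, 64 s between updates) are assumptions for illustration and are not taken from any adjtime() implementation.

```python
def slew_duration_s(offset_s, rate_ppm=500.0):
    """Seconds needed to slew away an offset at a fixed rate."""
    if rate_ppm <= 0:
        raise ValueError("rate must be positive")
    return abs(offset_s) / (rate_ppm * 1e-6)


def leftover_s(offset_s, update_interval_s, rate_ppm=500.0):
    """Magnitude of the offset still outstanding when the next update arrives."""
    slewed = rate_ppm * 1e-6 * update_interval_s
    return max(0.0, abs(offset_s) - slewed)


print(slew_duration_s(0.050))     # about 100 s for a 50 ms offset
print(leftover_s(0.128, 64.0))    # about 0.096 s carried over to the next update
print(leftover_s(0.010, 64.0))    # 0.0, the adjustment completes in time
```

At 500 PPM an update is completed before the next one only if the offset is below 0.5 ms per second of update interval, which is the condition the message describes.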
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh wrote: I am sorry, but this is idiotic. The ONLY requirement should be that the communication protocol is implimented properly and that the clock is Only a very small part of the mandatory parts of the NTP specification describe the wire formats. The pool is an NTP network, not an SNTP one. Yes, I can say that. Elementary clock measurement techniques tell you which of the two clocks is better, even if you do not know which. How in the world do you think the people who run the national time standards know which the better or worse clocks are? They have no clock that is better I believe they use techniques similar to those in ntpd and they are the people who come up with terms like Allan intercept. However, they are operating with instrumentation where they know that the oscillator is the main source of error. In the typical NTP setup, the clock is not responsible for the jitter component. than the ones they have to act as a standard. They have the best in the world, and they can tell which is better or worse. Actually, I believe the standard is an average of the individual clocks, and has no physical hardware realization. It is also only available a long time after the time to which it relates. Stubborness is good. As long as it is allied with a willingness to listen and reexamine his own preconceptions. Scientific progress is made by people defending their position but being willing to give it up if it becomes clear that it is wrong. Dave Mills, if you are still reading, I would point out that to anyone reading this except for the committed ntpd users on this list (most of whom don't understand the clock discipline theory - I think I understand it better than many, but my understanding is still rather fuzzy - many have said that they don't understand and don't really care) will be pretty convinced that they should be using chrony for any real world clock synchronization. 
Unless you address, on list: - the problem that ntpd clearly reacts poorly to real world transients (and this is an issue that keeps getting raised, not just in this thread). - why chrony's algorithms are bad. ntpd is going to lose the battle here for anyone reading the thread who wasn't already fundamentally committed to ntpd. I don't have the depth of understanding to defend the ntpd approach and I agree with people when they say that ntpd fails to recognize and rapidly recover from situations where it is trivially obvious to a human that the time is wrong. ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David J Taylor [EMAIL PROTECTED] writes: chrony falls at the first hurdle for me - there appears to be no native Windows implementation. Correct. chrony is not implemented on nearly as many platforms as ntp. There were plans once upon a time, but life got in Curnow's way. Anyway, I am NOT advocating everyone change to chrony. I am trying to understand the clock discipline algorithm. It uses a lot of the special features of Linux.
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh wrote: David J Taylor [EMAIL PROTECTED] writes: chrony falls at the first hurdle for me - there appears to be no native Windows implementation. Correct. chrony is not implemented on nearly as many platforms as ntp. There were plans once upon a time, but life got in Curnow's way. Anyway, I am NOT advocating everyone change to chrony. I am trying to understand the clock discipline algorithm. It uses a lot of the special features of Linux. It prevents me from making a comparative test. In general, NTP has worked well for me, but I have seen behaviour sometimes when the drift file seems to get stuck near one of the limits and NTP can't fix it. I suspect that's a programming rather than a fundamental algorithm error. Here's what I get, which is quite good enough for me (apart from the Vista PC): http://www.david-taylor.myby.co.uk/mrtg/daily_ntp.html Cheers, David
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Maarten Wiltink [EMAIL PROTECTED] writes: Unruh [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] David J Taylor [EMAIL PROTECTED] writes: chrony falls at the first hurdle for me - there appears to be no native Windows implementation. Correct. chrony is not implimented on nearly as many platforms as ntp. There were plans once upon a time, but life got in Curnoe's way. Anyway, I am NOT advocating everyone change to chrony. I am trying to understand the clock discipline algorithm. It uses a lot of the special features of linux. And that is _not_ a good thing. To win over the world, as the other David (Woolley) predicts elsewhere, it would need to be available on Windows at least. That might actually happen. But to win over some of the people who Sure, it would be nice. It sure will not be me. HOwever if people are convinced that chrony is a better approach ( and that still does need proof, even though the suggestions are there) then I am sure volunteers will be found to port it to Windows. That is after all how NTP got ported. _really_ matter, it would have to be implemented in a simple, transparent, and platform-neutral way, and to be driven by clock engineering, not code writing. The other other David (the original Dave) can *prove* that NTP It is, and this discussion and my experiments are precisely to try to do the clock experiments to see which approach works better. I think Curnoe did put a lot of effort into thinking about how to make it work when he wrote chrony some 10 years ago-- astonishingly it has changed very little and still works extremely well. will not go unstable under a variety of adverse conditions. Curnoe may have years of logs showing that chrony keeps offsets lower than ntpd, but standards laboratories are likely to shrug that off as anecdotal evidence, not proof. Well, no I do not think he does. 
I suspect I have the most logs of anyone in the world (www.theory.physics.ubc.ca/chrony/chrony.html-- follow the past logs link) And anybody who's really serious about timekeeping seems to be playing with reference clocks on FreeBSD. Err... OK, I do not know why, but... Anyway, I think chrony works on BSD, but could not swear to it. It does work on SunOS and Solaris ( or did) but they have (had?) terrible clock control -- no frequency changes-- at least as implimented in chrony. chrony is old code, but it works very well. Its whole design goal is to reduce errors in the offset. It is really hard to go unstable if that is the goal, but it is possible, especially if you are interested in very long intervals between clock queries. It is at that point that you want to make sure that your model of the frequency drift and your estimation of the frequencies is the best possible. I do worry a bit about chrony in that situation, but have no reasons for that worry. The very worst case is if the system runs for a while on very short poll intervals, and then suddenly has very log poll intervals. The short period estimation of the drift is not a good estimator of the long period drift. But I suspect that NTP would have problemsi in that situation as well. ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh, The NTP discipline is basically a type-II feedback control system. Your training should recall exactly how such a loop works and how it responds to a 50-ms step. Eleven seconds after NTP comes up the mitigation algorithms present that transient to the loop and what happens afterwards conforms to the equations of control theory. Discussion about what happens at any time after that is a matter of mathematics and ntpd does conform to the mathematics as confirmed by observation and simulation. If you have problems with the loop time constant, tough. It was chosen as a compromise for LANs and WANs. You are invited to justify a different time constant, but it has to work an a bumpy road to Malaysia. Further discussion on this issue is neither interesting nor helpful and, frankly, boring. Dave Unruh wrote: David L. Mills [EMAIL PROTECTED] writes: Maarten, I turn my machines off and on all the time and the clock is set from the server within 11 seconds after starting ntpd. If I didn't use burst mode, that would take four minutes. Golly. When you say the clock is set what do you mean? With what accuracy is the clock running 4 min after powerup in comparison with its accuracy after say 5 days. (let me define the accuracy as the offset ,not the jitter, but the offset on each measurement from your best time source.) Please understand the difference between impulse response and poll interval. It is true that it might take 3000 s to amortize the initial offset from the TOC chip at power-up. This is no different than if some server torqued your clock by that amount. So, if some server did torque your clock by 50ms as a one time event, or if you stepped your system clock by 50ms, how long would it take ntp to settle down (lets say you are running at maxpoll 7, minpoll 4). Let us assume that in steady state your clock is controlled to 50usec. HOw long would it take to regain that +- 50usec behaviour with ntp? 
Again, I mean by +- 50 usec that the measurement offsets ( what is reported in the peerstats file as clock offset) are fluctuating by +-50usec? You may not like that as a measure of the clock accuracy, but I want to be clear that we are not talking about different things. Dave Maarten Wiltink wrote: Unruh [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] David L. Mills [EMAIL PROTECTED] writes: There are lots of ways to measure the loop transient response. The easiest way is to set the clock some 50-100 ms off from some stable source (not necessarily accurate) and watch the loop converge. The response should cross zero in about 3000 s and overshoot about 6 percent 3000 s is a HUGE time. For people who switch on their computers daily, that means most of their time is spent with the computer unsynchronised to best accuracy. The timescale of chrony is far faster. (I am not a writer of chrony.I am a user who is trying to get the very best out of the timekeeping.) But NTP is from a time when people didn't switch on their computers daily. When NTP was young, dinosaurs walked the machine room and _you_ did _not_ get to decide when the machine on the other end of your terminal was rebooted. NTP can, after weeks of training, teach a computer to keep time very, very well. As a result, it's less optimised for the other end of the spectrum. Features like iburst and the drift file can get your clock synchronised to within a few milliseconds in less than a minute. If you want better than that, or you want it faster... don't turn your computer off. Groetjes, Maarten Wiltink ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
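For readers who want to see what "a type-II loop responding to a 50-ms step" looks like, here is a minimal generic sketch: a proportional-plus-integral correction driving a clock, integrated in 1 s steps. It is not the ntpd discipline. The natural frequency `omega` and damping `zeta` are arbitrary illustrative values, so the zero-crossing time and overshoot it produces are not the 3000 s and 6 percent figures quoted in the thread.

```python
def simulate_type2_loop(step_s=0.050, omega=0.002, zeta=0.707,
                        dt=1.0, duration=20000.0):
    """Step response of a generic second-order (type-II) phase-locked loop.

    The clock offset is driven by a proportional term plus an integral
    (frequency) term. Returns a list of (time, offset) pairs.
    """
    offset = step_s      # initial phase error, seconds
    freq_corr = 0.0      # accumulated frequency correction, s/s
    trace = []
    steps = int(duration / dt)
    for k in range(steps + 1):
        trace.append((k * dt, offset))
        freq_corr += omega * omega * offset * dt        # integral path
        rate = 2.0 * zeta * omega * offset + freq_corr  # total correction rate
        offset -= rate * dt
    return trace


def first_zero_crossing(trace):
    """Time of the first sample at or below zero offset, or None."""
    for t, off in trace:
        if off <= 0.0:
            return t
    return None


trace = simulate_type2_loop()
print(first_zero_crossing(trace))          # when the response first crosses zero
print(min(off for _, off in trace))        # most negative offset (the overshoot)
print(abs(trace[-1][1]))                   # residual offset at the end of the run
```

The qualitative behaviour is the point: the offset decays, crosses zero, overshoots and then settles, on a time scale set by the loop constants rather than by the poll interval. Changing `omega` trades convergence speed against sensitivity to noisy samples.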
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David L. Mills [EMAIL PROTECTED] writes: Unruh, I'm sure you know that an ntpd simulator is included in the NTP software distribution. It handles multiple simultaneous servers using the same algorithms as in the working daemon. We use it to test the daemon response to all kinds of possible but unlikely scenarios, all at warp speed. No I did not know that, but the problem is that chrony does not. So, why are we having this discussion? Whip up a worthy opponent for chrony and we can watch a glorious battle of the simulators. For your first battle, I have rawstats for a bumpy backdoor path to Malaysia. I am astonished at your comments about poll interval and reading the clock, which are totally independent of each other. The ntpd daemon has I am puzzled by what you are referring to-- oh, you mean the simulator discussion. Never mind, that comment has absolutely nothing to do with ntp but referred to how easy it was to get chrony to run in simulator mode. It was a comment purely on that issue. configurable disconnects for the feedback loop and for individual servers. I assume chrony has similar features, as you wouldn't be able to make the claims you do without them. Reading your claims literally, chrony would have to slew the clock considerably greater than the 500 PPM provided by the standard Unix adjtime() system call. Please explain how it does that. Using the Linux adjtimex system call which has the ability to change the tick size, which gives much greater than 500PPM slew rate for the clocks. ( Up to 10PPM, although that is never used. ) And as I understand it, your handling of leap seconds in ntp also uses far greater than 500PPM slew rates. Dave Unruh wrote: [EMAIL PROTECTED] (Danny Mayer) writes: Unruh wrote: [EMAIL PROTECTED] (Danny Mayer) writes: Virtual machines buys you the same problem as above. Even on a virtual machine there's only one clock. You can have only one application discipline that clock never mind how many virtual machines are running. 
Don't be fooled by the technology. Not if the virtual machines have a virtual clock-- Ie a little program which intercepts all the clock routines and return the output of a little program simulating a clock. Now intercepting the various adjtimex calls is not that hard ( just rewrite the adjtimex and gettimeofday routine and and overload it for your program) but chrony and ntp also use the clock as a scheduler, and that is a lot more difficult to simulate and catch. As a fellow physicist I would expect you to understand this better. It's a basic principal in quantum mechanics: the observers influences the observed results. In this case, it's not enough since you are directly and deliberately affecting the clock itself and there really can only NO you do not understant. The clocks I am talking about are NOT hardware related clocks, they are just subroutines which return what is supposed to be a time when queried, and which change their algorithm for generating those numbers when disciplined by the program. The really big problem is that the system goes into wait states, and you would also have to wake it up appropriately. For example, the polling interval is done by the clock. Now there is absolutely no reason why a poll which is supposed to be running at poll 10 could not return immediately with the clock set to tell it that 1024 sec had passed. However getting this right would require a really big rewrite of the NTP or chrony program. and deliberately affecting the clock itself and there really can only be one clock. Multiple clocks lead to chaotic events. All virtual Of course there can be many clocks. After all each computer I have has one so if I have 10 computers I have 10 clocks. NOw of course you are refering to a single computer with a single bit of hardware. But the virtual clocks I am talking about are not hardware related at all. They are just subroutines which spit out an number when queried. 
There are no simulators that I've ever seen that can run tests faster than real-time. They are always many orders of magnitude slower, even with hardware assist. We are not asking for a machine simulator but a clock simulator and that can run thousands of times faster than the real clock. You can run it at any speed you want. And you can have a separate simulated clock with its own theory of operation on each virtual machine. I've run many different simulators including hardware ones and I can assure you nothing runs slower than a simulator. Like I said there is only one real clock in a virtual machine, there just appears to be one per virtual machine. A simulator of a clock can run far far faster than a clock. After all I can output the numbers from 1 to 1 far faster than 1 sec. That is how weather forecasting works. The simulation of the weather is run much faster than the real weather. Otherwise the forecast is a bit useless.
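The "clock as a subroutine" idea argued for above can be sketched in a few lines. This is a hypothetical illustration, not code from ntpd's simulator or from chrony: the class and method names are invented, and a real test harness would also have to replace the daemon's timers and network I/O, which is the hard part the message points out.

```python
class SimulatedClock:
    """A software-only clock: simulated time advances only when told to."""

    def __init__(self, freq_error_ppm=0.0, offset_s=0.0):
        self.true_time = 0.0               # ideal elapsed time, seconds
        self.reading = offset_s            # what this clock reports
        self.freq_error_ppm = freq_error_ppm

    def sleep(self, seconds):
        """Pretend `seconds` of true time passed; returns immediately."""
        self.true_time += seconds
        self.reading += seconds * (1.0 + self.freq_error_ppm * 1e-6)

    def now(self):
        """Stand-in for a gettimeofday()-style call."""
        return self.reading

    def offset(self):
        """Error of this clock relative to ideal time."""
        return self.reading - self.true_time

    def adjust_frequency(self, delta_ppm):
        """Stand-in for a frequency adjustment by the discipline."""
        self.freq_error_ppm += delta_ppm

    def step(self, delta_s):
        """Stand-in for stepping the clock."""
        self.reading += delta_s


# One simulated poll interval of 1024 s costs no wall-clock time at all.
clock = SimulatedClock(freq_error_ppm=25.0)
clock.sleep(1024)
print(clock.offset())            # about 0.0256 s accumulated error
clock.step(-clock.offset())      # discipline removes the offset ...
clock.adjust_frequency(-25.0)    # ... and the frequency error
clock.sleep(1024)
print(clock.offset())            # stays near zero from here on
```

Because nothing here touches hardware, any number of such clocks can coexist in one process, each with its own frequency error.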
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh wrote: situation, but have no reasons for that worry. The very worst case is if the system runs for a while on very short poll intervals, and then suddenly has very long poll intervals. The short period estimation of the drift is not a good estimator of the long period drift. But I suspect that NTP would have problems in that situation as well. Perhaps. Perhaps not. The NTP reference code chooses its own poll interval based on the clock stability and the sample jitter. For a frequency correction to be valid, the clock offset must be greater than the sample jitter. As the frequency gets closer to the correct value the poll interval must get longer. See, NTP has a different design goal than chrony. The goal of NTP is not merely to keep the clock in sync, but to also discipline the frequency, while also providing a stable time synchronization network. If your goal is to keep the offset as low as possible, just keep the poll interval as short as possible. That doesn't take much work. Brian Utterback
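The reasoning above (the offset accumulated over one poll interval must exceed the sample jitter before a frequency correction means anything) can be turned into a rough calculation. This is only a back-of-the-envelope sketch, not the rule the reference implementation uses to pick its poll interval; the function names and example figures are assumptions. It rounds up to a power of two because NTP poll intervals are expressed as powers of two in seconds.

```python
import math


def min_useful_interval_s(jitter_s, residual_freq_ppm):
    """Shortest interval over which the residual drift exceeds the jitter."""
    if jitter_s <= 0 or residual_freq_ppm <= 0:
        raise ValueError("jitter and residual frequency error must be positive")
    return jitter_s / (residual_freq_ppm * 1e-6)


def poll_exponent(jitter_s, residual_freq_ppm):
    """Smallest n such that 2**n seconds covers the minimum useful interval."""
    return math.ceil(math.log2(min_useful_interval_s(jitter_s, residual_freq_ppm)))


# Assumed example: 50 us of sample jitter.
print(poll_exponent(50e-6, 0.1))    # 9  -> 512 s when 0.1 ppm remains
print(poll_exponent(50e-6, 0.01))   # 13 -> 8192 s when only 0.01 ppm remains
```

The second line shows the trend described in the message: the smaller the remaining frequency error, the longer the interval needed to see it above the noise.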
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Guys,

Sure, I'm stubborn as a bull. The laws of physics make me so. I am dismissing any comparisons between ntpd and chrony or any other vehicle unless the comparison includes substantially all the scenarios that ntpd is designed to work with. The protocol is specifically designed to work over a wide spectrum including lightly loaded LANs and highly congested WANs. The choice of parameters, specifically the time constant and operating range, was chosen as a compromise to maximize accuracy and minimize network loads under typical and extreme conditions.

As for the SNTP restrictions, please, please read the draft specification, which explains exactly what SNTP should and should not do.

At the crux of the matter is the impulse response of a cascade of intervening servers each with its own idiosyncratic impulse response. The NTP impulse response has a controlled risetime and overshoot over a wide range of time constants. Each server in the cascade must have the same impulse response to avoid instabilities and possible whip effects. We could have simply specified the transfer function in polynomial form (it's in RFC 1305 and das Buch) and told the implementor to use that. A student of digital signal processing would know how to use that directly. But, we thought there would be folks like you that would not believe the principles and do something evil like bring up a pool server running openntp or chrony and synchronized via a flaky circuit to Indonesia.

It is easy to detect that a particular server has or has not the current reference implementation. There are a number of features intrinsic to the protocol design and others fiendishly crafted to do that, but I'm not going to reveal them here.

Dave
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Guys, Reprinted without permission from the draft spec: 14. Simple Network Time Protocol (SNTP) Primary servers and clients complying with a subset of NTP, called the Simple Network Time Protocol (SNTPv4) [2], do not need to implement the mitigation algorithms described in Section 9 and following sections. SNTP is intended for primary servers equipped with a single reference clock, as well as for clients with a single upstream server and no dependent clients. The fully developed NTPv4 implementation is intended for secondary servers with multiple upstream servers and multiple downstream servers or clients. Other than these considerations, NTP and SNTP servers and clients are completely interoperable and can be intermixed in NTP subnets. An SNTP primary server implementing the on-wire protocol described in Section 8 has no upstream servers except a single reference clock. In principle, it is indistinguishable from an NTP primary server that has the mitigation algorithms and therefore capable of mitigating between multiple reference clocks. Upon receiving a client request, an SNTP primary server constructs and sends the reply packet as described in Figure 34. Note that the dispersion field in the packet header must be updated as described in Section 5. Dave Danny Mayer wrote: snip ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David L. Mills [EMAIL PROTECTED] writes: Unruh, This answers my earlier question. I can't believe this is so crude and dangerous. you really need to provide an analysis on the errors this creates when reading the clock during the slew. The problem is not the residual time offset but the rate at which time changes. Measuring time I am confused. The clock is off from the real time. Anything that happens during the slew is less off than it was before. You are worried perhaps that instead of 2 seconds having passed only 1 second and 700 msec have passed? Note that the rate change is also limited in chrony even during an offset slew. intervals is very different during the slew. The NTP design carefully limits this to no more than 5 microseconds per second without the kernel and even smaller with the kernel. OK, why? Dave Unruh wrote: Brian Utterback [EMAIL PROTECTED] writes: Unruh wrote: Just an update: I started chrony with a 60ms offset. It had the right drift file. It took about 1 min ( having collected about 4 samples from the servers at minpoll 4) to drive the offset down to about 100 usec (Yes, a 1000 fold improvement in about 50 sec.) Ie, the time constant for correction of offset errors is enough time to collect enough samples to determine that the offset really is statistically way off. Is that supposed to be impressive? One of the design constraints of NTP is to limit the clock frequency change during offset adjustments to 500ppm to prevent NTP network instabilities. If the offset was amortized over the 50secs you stated, then that is a slew rate of 1200 ppm. If this happened entirely at the end of the 4 samples, then it sounds simply like a step to me. By that reasoning, ntpdate far NO it is NOT a step. It is done via a fast slew by a change in the tick size, which can be 10% (ie +-10PPM) The clock always runs forward. It does not step. 
It may seem like a step from the point of the coarse sampling done by chrony or ntp, but if you ran a PPS clock and looked at the time returned by gettimeofday, it would be continuous and positive, just like ntp. When the NTP offset changes by 100ms between samples spaced at 500 sec apart, did it do that by stepping? No it did it by increasing the frequency by 200PPM. Chrony behaves the same way, only it uses the ticksize as well as the frequency to produce fast slews to get rid of the offsets, and it does not go unstable that I have ever seen.

outperforms chrony. I presume that chrony cannot behave as a server and only does clients right?

Chrony is also a server. The key detraction for me is that it cannot use hardware clocks. It also does not act as a multicast/broadcast server which may be a detraction for others and does not do leap seconds. On the other hand with its rapid response it will correct the leapsecond within less than an hour. Anyway, the issue here is the clock disciplining routine, not a comparison of the chronyd program with the ntp implementation. I am arguing that chrony's clock discipline routine keeps the hardware clock much closer to the real time (in the real world) and reacts to real world changes much faster than does the NTP discipline routine. And chrony is just as stable it seems as NTP is. The offset fluctuations are better than NTP's are. The key question is how close to the real time is the time that the system clock delivers. Chrony is closer by factors of at least 2 and probably if run at high priority as my ntp is, much better than that. In particular if there are glitches in the clock drift rate, chrony reacts much faster, and keeps the time much much closer to the true time. Instability would produce worse behaviour not better.

I also started chrony without a drift file. In this case it took about 5 min to get a frequency within 10% of the long term stable frequency and that error disappeared within 1/2 hour.
I don't know about the version of ntp you are running, but recent versions have a bug in the initial frequency calculations which has since been fixed, but not released (ahem. Harlan?).

The initial horrible transient was under 4.2.0. After this round I will try an initial transient test with 4.2.4. But the transient behaviour I am describing in the previous post is during the normal running of NTP. It is not an initial transient. It is the response of the system to a real world drift rate glitch. It is after NTP has been running for 5 days and the hardware clock on the machine suffered a frequency glitch. I have no idea what is causing those frequency glitches-- the clock suddenly changes its drift rate by .2 to 2 PPM. I have seen this both with a chrony controlled clock and an NTP controlled clock. It is just that the NTP response is not good.
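The slew rates traded back and forth in this exchange all come from dividing an offset by the interval over which it is removed. A minimal check of that arithmetic in Python follows; the function name `slew_ppm` is made up for illustration and is not part of ntpd or chrony.

```python
def slew_ppm(offset_s: float, interval_s: float) -> float:
    """Average rate change, in parts per million, needed to remove
    offset_s seconds of error over interval_s seconds."""
    return offset_s / interval_s * 1e6


# 60 ms amortized over 50 s (the chrony startup case in the thread)
print(round(slew_ppm(0.060, 50)))

# 100 ms between samples spaced 500 s apart (the NTP example)
print(round(slew_ppm(0.100, 500)))

# A 10% change in tick size, expressed in PPM
print(round(0.10 * 1e6))
```

The first two lines reproduce the 1200 ppm and 200 PPM figures quoted in the thread. The last line shows that a 10% tick change corresponds to 100000 PPM, which is far above the 500 ppm limit cited for NTP.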
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Brian Utterback [EMAIL PROTECTED] writes:

Unruh wrote:

situation, but have no reasons for that worry. The very worst case is if the system runs for a while on very short poll intervals, and then suddenly has very long poll intervals. The short period estimation of the drift is not a good estimator of the long period drift. But I suspect that NTP would have problems in that situation as well.

Perhaps. Perhaps not. The NTP reference code chooses its own poll interval based on the clock stability and the sample jitter. For a frequency correction to be valid, the clock offset must be greater than the sample jitter. As the frequency gets closer to the correct value the poll interval must get longer. See, NTP has a different design goal than chrony. The goal of NTP is not merely to keep the clock in sync, but to also discipline the frequency, while also providing a stable time synchronization network. If your goal is to keep the offset as low as possible, just keep the poll interval as short as possible. That doesn't take much work.

No, I am referring to the case where the network suddenly goes down for 2 days. Your poll has gone from 2 min ( say on maxpoll 10) to 3 days. The design goal of ntp and chrony is the same as you outline for ntp. The question is what algorithm accomplishes those goals including minimizing the offset. Chrony is just as adaptive on poll intervals as is ntp. It is just that sometimes the world hands garbage, and the question is how does the system respond to the garbage.
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David L. Mills wrote: The NTP discipline is basically a type-II feedback control system. Your training should recall exactly how such a loop works and how it responds to a 50-ms step. Eleven seconds after NTP comes up the mitigation You both have problems here. Dave Mills: your problem is that you haven't explained why one should continue to use a long time constant linear feedback system when a human observer can easily tell you how to get within 10 microseconds of the correct time after no more than about 3 samples. Bill Unruh: you haven't explained what real world situation this test is simulating; it is a standard doctrine that ntpd is not a substitute for good hardware and system software (e.g. you shouldn't use ntpd to get round lost clock interrupts). algorithms present that transient to the loop and what happens afterwards conforms to the equations of control theory. Discussion about what happens at any time after that is a matter of mathematics and ntpd does conform to the mathematics as confirmed by observation and simulation. That's an indication that the equations are inappropriate in that context. ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Just an update: I started chrony with a 60ms offset. It had the right drift file. It took about 1 min ( having collected about 4 samples from the servers at minpoll 4) to drive the offset down to about 100 usec (Yes, a 1000 fold improvement in about 50 sec.) Ie, the time constant for correction of offset errors is enough time to collect enough samples to determine that the offset really is statistically way off. I also started chrony without a drift file. In this case it took about 5 min to get a frequency within 10% of the long term stable frequency and that error disappeared within 1/2 hour. I have also patched chrony so that it can put itself at max priority. It seems clear to me that the reason that NTP was so much better at the round trip scatter was that it was running at max priority. Ie, the large spikes in the round trip times was because chrony was not being woken up, or was swapped out, rather than any problem with the network. However I will have to run chrony again for a while to collect statistics. However, even without the high priority, chrony did better than NTP at keeping the clock disciplined, and this taming of the round trip fluctuations should help. To compare the transient response of chrony and NTP, look at the graphs for flory (bottom graph on the right at www.theory.physics.ubc.ca/chrony/chrony.html) and fluxon (fourth down on the right).Both suffered a sudden change in the drift rate of the clock it appears. On the NTP controlled clock there seems to have been a sudden .2PPM change in the drift rate of the clock on Jan22.8. This caused a 500usec error in the offset errors in the clock, which took a few hours to settle down. Contrast this with fluxon at Jan 21.27 where it seems to have suffered a 2PPM sudden change in the drift (ten times the change that flory suffered). This caused only a 200 usec offset, which chrony corrected within 5 min. The similar jump at 21.4 behaved in the same way. 
Ie, a jump 10 times as big had an effect less than 1/2 as large, and fixed on the timescale of over 20 times faster. ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Harlan Stenn wrote: In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Danny Mayer) writes: Danny Harlan Stenn wrote: Unruh Unfortunately I cannot run both ntp and chrony on the same system at Unruh the same time. Bill, Exactly why can you not run ntpd and chrony on the same system at the same time? Danny Harlan, really. You *cannot* have two different Danny mechanisms/applications to discipline the clock at the same time. I Danny invite you to try. You have access to my code so you can test this Danny easily. You are, as is so often the case, missing my point. It is possible to run ntpd in a way that it does not discipline the clock. In which case you are not able to compare the two algorithms since the clock is the central part of the testing and algorithm. I am curious about your last sentence though - what is special about your code that would allow this to be tested? It includes the noall option. You can then run two instances on the same box and have one discipline the clock and the other not. Feel free to have both of them try. The results will be hilarious. I want the ability to run multiple instances of ntpd where at most 1 instance of ntpd is actually controlling the clock, specifically to make it easy to (more quickly) analyze the performance/behavior of different configurations of ntpd. I understand that the boat is rocking while this is going on, but I suspect this capability would be a useful one in at least some cases. Danny I don't see the benefit of doing this with two separate Danny instances. It's easier and simpler to just add the other servers into Danny the one instance and specify noselect. Again you are missing my point. Allowing this would let us, for example, see how two different versions of ntpd would discipline the clock. It would allow us to see how ntpd might discipline the clock compared to chrony. 
I understand and get that by not actually disciplining the clock we are removing an important part of the feedback loop, and I do not know if that will fatally affect these sort of experiments or not.

No you cannot do that. The clock is the central part of the algorithm. You *cannot* have two different applications discipline the clock without disastrous results. Put it this way: chrony decides that it needs to adjust the clock frequency and amount by amount dX and X. ntpd decides to change it by dY and Y. When chrony next looks at the clock it decides that the change it made wasn't good enough and makes changes by an even bigger amount and delta. and so on and so forth.

And as Bill said, it would be Swell if there was a way to do this using, eg, virtual machines so that we could test them that way. Better yet, it would be nice to have a simulator framework where we could run these tests faster than in real-time.

Virtual machines buys you the same problem as above. Even on a virtual machine there's only one clock. You can have only one application discipline that clock never mind how many virtual machines are running. Don't be fooled by the technology. There are no simulators that I've ever seen that can run tests faster than real-time. They are always many orders of magnitude slower, even with hardware assist.

Danny
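The "two applications, one clock" argument above can be illustrated with a toy model. This is only a sketch under strong assumptions: each controller is a bare proportional loop that sees the same offset and applies its own correction, unaware of the other. Neither ntpd nor chrony works this simply, and the gain values are arbitrary.

```python
def run(gain: float, controllers: int, steps: int = 20, x0: float = 1.0):
    """Offset history when `controllers` independent loops each subtract
    gain * offset from the same clock at every step."""
    x = x0
    history = [x]
    for _ in range(steps):
        # Every controller reacts to the same measured offset.
        correction = sum(gain * x for _ in range(controllers))
        x -= correction
        history.append(x)
    return history


# One loop with gain 0.75 converges smoothly.
print(abs(run(0.75, 1)[-1]))
# Two such loops overshoot on every step (the sign flips) but still converge.
print(abs(run(0.75, 2)[-1]))
# Two loops that are each stable alone (gain 1.1) diverge together.
print(abs(run(1.1, 2)[-1]))
```

In this model the effective loop gain is the sum of the individual gains, so whether the pair merely oscillates or actually runs away depends on how aggressive each controller is on its own.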
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
[EMAIL PROTECTED] (Danny Mayer) writes: ...

And as Bill said, it would be Swell if there was a way to do this using, eg, virtual machines so that we could test them that way. Better yet, it would be nice to have a simulator framework where we could run these tests faster than in real-time.

Virtual machines buys you the same problem as above. Even on a virtual machine there's only one clock. You can have only one application discipline that clock never mind how many virtual machines are running. Don't be fooled by the technology.

Not if the virtual machines have a virtual clock-- Ie a little program which intercepts all the clock routines and return the output of a little program simulating a clock. Now intercepting the various adjtimex calls is not that hard ( just rewrite the adjtimex and gettimeofday routine and overload it for your program) but chrony and ntp also use the clock as a scheduler, and that is a lot more difficult to simulate and catch.

There are no simulators that I've ever seen that can run tests faster than real-time. They are always many orders of magnitude slower, even with hardware assist.

We are not asking for a machine simulator but a clock simulator and that can run thousands of times faster than the real clock. You can run it at any speed you want. And you can have a separate simulated clock with its own theory of operation on each virtual machine.
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh wrote:

[EMAIL PROTECTED] (Danny Mayer) writes:

Virtual machines buys you the same problem as above. Even on a virtual machine there's only one clock. You can have only one application discipline that clock never mind how many virtual machines are running. Don't be fooled by the technology.

Not if the virtual machines have a virtual clock-- Ie a little program which intercepts all the clock routines and return the output of a little program simulating a clock. Now intercepting the various adjtimex calls is not that hard ( just rewrite the adjtimex and gettimeofday routine and overload it for your program) but chrony and ntp also use the clock as a scheduler, and that is a lot more difficult to simulate and catch.

As a fellow physicist I would expect you to understand this better. It's a basic principle in quantum mechanics: the observer influences the observed results. In this case, it's not enough since you are directly and deliberately affecting the clock itself and there really can only be one clock. Multiple clocks lead to chaotic events. All virtual clocks are driven off the real one which means that updating the clock needs to update the real clock. You don't really have separate clocks, it just looks like you do.

There are no simulators that I've ever seen that can run tests faster than real-time. They are always many orders of magnitude slower, even with hardware assist.

We are not asking for a machine simulator but a clock simulator and that can run thousands of times faster than the real clock. You can run it at any speed you want. And you can have a separate simulated clock with its own theory of operation on each virtual machine.

I've run many different simulators including hardware ones and I can assure you nothing runs slower than a simulator. Like I said there is only one real clock in a virtual machine, there just appears to be one per virtual machine.
Danny
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Maarten, I turn my machines off and on all the time and the clock is set from the server within 11 seconds after starting ntpd. If I didn't use burst mode, that would take four minutes. Golly. Please understand the difference between impulse response and poll interval. It is true that it might take 3000 s to amortize the initial offset from the TOC chip at power-up. This is no different than if some server torqued your clock by that amount. Dave Maarten Wiltink wrote: Unruh [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] David L. Mills [EMAIL PROTECTED] writes: There are lots of ways to measure the loop transient response. The easiest way is to set the clock some 50-100 ms off from some stable source (not necessarily accurate) and watch the loop converge. The response should cross zero in about 3000 s and overshoot about 6 percent 3000 s is a HUGE time. For people who switch on their computers daily, that means most of their time is spent with the computer unsynchronised to best accuracy. The timescale of chrony is far faster. (I am not a writer of chrony.I am a user who is trying to get the very best out of the timekeeping.) But NTP is from a time when people didn't switch on their computers daily. When NTP was young, dinosaurs walked the machine room and _you_ did _not_ get to decide when the machine on the other end of your terminal was rebooted. NTP can, after weeks of training, teach a computer to keep time very, very well. As a result, it's less optimised for the other end of the spectrum. Features like iburst and the drift file can get your clock synchronised to within a few milliseconds in less than a minute. If you want better than that, or you want it faster... don't turn your computer off. Groetjes, Maarten Wiltink ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
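The zero crossing and overshoot mentioned above are generic features of a type-II (proportional plus integral) loop responding to an initial offset. The sketch below simulates such a loop with made-up gains `kp` and `ki`; these are assumptions chosen for illustration, not ntpd's actual parameters, so the crossing time and overshoot it produces are not the 3000 s and 6 percent quoted in the thread.

```python
def step_response(x0=0.050, kp=1.414e-3, ki=1.0e-6, dt=1.0, duration=20000.0):
    """Offsets (seconds) of a toy type-II loop after an initial offset x0.

    kp and ki are hypothetical loop gains, not values taken from ntpd.
    """
    x = x0            # clock offset from the reference, seconds
    freq_corr = 0.0   # accumulated frequency correction, s/s
    offsets = [x]
    for _ in range(int(duration / dt)):
        freq_corr += ki * x * dt        # integral term accumulates the offset
        x -= (kp * x + freq_corr) * dt  # proportional plus integral correction
        offsets.append(x)
    return offsets


offsets = step_response()
# Index (in seconds, since dt = 1) of the first sample at or below zero.
first_zero = next(i for i, x in enumerate(offsets) if x <= 0)
# Overshoot as a fraction of the initial offset.
overshoot = -min(offsets) / offsets[0]
print(first_zero, overshoot)
```

Raising or lowering the gains together changes how quickly the loop crosses zero, while their ratio sets the damping and hence the overshoot. That is the trade-off behind the choice of time constant debated in this thread.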
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
[EMAIL PROTECTED] (Danny Mayer) writes:

Unruh wrote:

[EMAIL PROTECTED] (Danny Mayer) writes:

Virtual machines buys you the same problem as above. Even on a virtual machine there's only one clock. You can have only one application discipline that clock never mind how many virtual machines are running. Don't be fooled by the technology.

Not if the virtual machines have a virtual clock-- Ie a little program which intercepts all the clock routines and return the output of a little program simulating a clock. Now intercepting the various adjtimex calls is not that hard ( just rewrite the adjtimex and gettimeofday routine and overload it for your program) but chrony and ntp also use the clock as a scheduler, and that is a lot more difficult to simulate and catch.

As a fellow physicist I would expect you to understand this better. It's a basic principle in quantum mechanics: the observer influences the observed results. In this case, it's not enough since you are directly and deliberately affecting the clock itself and there really can only

NO you do not understand. The clocks I am talking about are NOT hardware related clocks, they are just subroutines which return what is supposed to be a time when queried, and which change their algorithm for generating those numbers when disciplined by the program. The really big problem is that the system goes into wait states, and you would also have to wake it up appropriately. For example, the polling interval is done by the clock. Now there is absolutely no reason why a poll which is supposed to be running at poll 10 could not return immediately with the clock set to tell it that 1024 sec had passed. However getting this right would require a really big rewrite of the NTP or chrony program.

and deliberately affecting the clock itself and there really can only be one clock. Multiple clocks lead to chaotic events. All virtual

Of course there can be many clocks.
After all each computer I have has one so if I have 10 computers I have 10 clocks. Now of course you are referring to a single computer with a single bit of hardware. But the virtual clocks I am talking about are not hardware related at all. They are just subroutines which spit out a number when queried.

There are no simulators that I've ever seen that can run tests faster than real-time. They are always many orders of magnitude slower, even with hardware assist.

We are not asking for a machine simulator but a clock simulator and that can run thousands of times faster than the real clock. You can run it at any speed you want. And you can have a separate simulated clock with its own theory of operation on each virtual machine.

I've run many different simulators including hardware ones and I can assure you nothing runs slower than a simulator. Like I said there is only one real clock in a virtual machine, there just appears to be one per virtual machine.

A simulator of a clock can run far far faster than a clock. After all I can output the numbers from 1 to 1 far faster than 1 sec. That is how weather forecasting works. The simulation of the weather is run much faster than the real weather. Otherwise the forecast is a bit useless.

Danny
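The "clock as a subroutine" idea argued for above can be sketched in a few lines. The class `SimClock` and its methods are hypothetical names invented for this illustration. It does not intercept adjtimex or gettimeofday, and it sidesteps the scheduling problem the thread identifies as the hard part. It only shows that a simulated clock can skip a 1024 s poll interval instantly and that several such clocks can coexist.

```python
class SimClock:
    """A purely software clock: a subroutine that returns a number when queried."""

    def __init__(self, drift_ppm: float = 0.0):
        self.true_time = 0.0        # simulated "real" time, seconds
        self.reading = 0.0          # what this clock reports
        self.drift_ppm = drift_ppm  # intrinsic frequency error of the oscillator
        self.adjust_ppm = 0.0       # frequency correction applied by a daemon

    def advance(self, seconds: float) -> None:
        """Jump simulated time forward; this costs no wall-clock time."""
        rate = 1.0 + (self.drift_ppm + self.adjust_ppm) * 1e-6
        self.true_time += seconds
        self.reading += seconds * rate

    def gettime(self) -> float:
        return self.reading

    def set_frequency(self, ppm: float) -> None:
        """Stand-in for a frequency adjustment call."""
        self.adjust_ppm = ppm

    def offset(self) -> float:
        return self.reading - self.true_time


# Two independent simulated clocks, each stepped through ten polls at poll 10.
fast = SimClock(drift_ppm=20.0)
slow = SimClock(drift_ppm=-5.0)
for clock in (fast, slow):
    for _ in range(10):
        clock.advance(1024)
print(fast.offset(), slow.offset())

# Cancelling the drift freezes the accumulated offset.
fast.set_frequency(-20.0)
fast.advance(1024)
print(fast.offset())
```

A discipline algorithm under test would call `gettime` and `set_frequency` in place of the system routines, and the test harness would call `advance` in place of sleeping.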
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Brian Utterback [EMAIL PROTECTED] writes: Unruh wrote: Just an update: I started chrony with a 60ms offset. It had the right drift file. It took about 1 min ( having collected about 4 samples from the servers at minpoll 4) to drive the offset down to about 100 usec (Yes, a 1000 fold improvement in about 50 sec.) Ie, the time constant for correction of offset errors is enough time to collect enough samples to determine that the offset really is statistically way off. Is that supposed to be impressive? One of the design constraints of NTP is to limit the clock frequency change during offset adjustments to 500ppm to prevent NTP network instabilities. If the offset was amortized over the 50secs you stated, then that is a slew rate of 1200 ppm. If this happened entirely at the end of the 4 samples, then it sounds simply like a step to me. By that reasoning, ntpdate far NO it is NOT a step. It is done via a fast slew by a change in the tick size, which can be 10% (ie +-10PPM) The clock always runs forward. It does not step. It may seem like a step from the point of the coarse sampling done by chrony or ntp, but if you ran a PPS clock and looked at the time returned by gettimeofday, it would be continuous and positive, just like ntp. When the NPT offset changes by 100ms between samples spaced at 500 sec apart, did it do that by stepping? No it did it by increasing the frequency by 200PPM. Chrony behaves the same way, only it uses the ticksize as well as the frequency to produce fast slews to get rid of the offsets, and it does not go unstable that I have ever seen. outperforms chrony. I presume that chrony cannot behave as a server and only does clients right? Chrony is also a server. The key detraction for me is that it cannot use hardware clocks. It also does not act as a multicast/broadcast server which may be a detraction for others and does not do leap seconds. On the other hand with its rapid response it will correct the leapsecond within less than an hour. 
Anyway, the issue here is the clock disciplining routine, not a comparison of the chronyd program with the ntp implementation. I am arguing that chrony's clock discipline routine keeps the hardware clock much closer to the real time (in the real world) and reacts to real world changes much faster than does the NTP discipline routine. And chrony is just as stable it seems as NTP is. The offset fluctuations are better than NTP's are. The key question is how close to the real time is the time that the system clock delivers. Chrony is closer by factors of at least 2 and probably if run at high priority as my ntp is, much better than that. In particular if there are glitches in the clock drift rate, chrony reacts much faster, and keeps the time much much closer to the true time. Instability would produce worse behaviour not better.

I also started chrony without a drift file. In this case it took about 5 min to get a frequency within 10% of the long term stable frequency and that error disappeared within 1/2 hour.

I don't know about the version of ntp you are running, but recent versions have a bug in the initial frequency calculations which has since been fixed, but not released (ahem. Harlan?).

The initial horrible transient was under 4.2.0. After this round I will try an initial transient test with 4.2.4. But the transient behaviour I am describing in the previous post is during the normal running of NTP. It is not an initial transient. It is the response of the system to a real world drift rate glitch. It is after NTP has been running for 5 days and the hardware clock on the machine suffered a frequency glitch. I have no idea what is causing those frequency glitches-- the clock suddenly changes its drift rate by .2 to 2 PPM. I have seen this both with a chrony controlled clock and an NTP controlled clock. It is just that the NTP response is not good.
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David L. Mills [EMAIL PROTECTED] writes:

> Maarten,
>
> I turn my machines off and on all the time and the clock is set from the server within 11 seconds after starting ntpd. If I didn't use burst mode, that would take four minutes. Golly.

When you say the clock is set, what do you mean? With what accuracy is the clock running 4 min after power-up, compared with its accuracy after, say, 5 days? (Let me define the accuracy as the offset, not the jitter: the offset on each measurement from your best time source.)

> Please understand the difference between impulse response and poll interval. It is true that it might take 3000 s to amortize the initial offset from the TOC chip at power-up. This is no different than if some server torqued your clock by that amount.

So, if some server did torque your clock by 50 ms as a one-time event, or if you stepped your system clock by 50 ms, how long would it take ntp to settle down (let's say you are running at maxpoll 7, minpoll 4)? Let us assume that in steady state your clock is controlled to 50 usec. How long would it take to regain that +-50 usec behaviour with ntp? Again, by +-50 usec I mean that the measured offsets (what is reported in the peerstats file as clock offset) are fluctuating by +-50 usec. You may not like that as a measure of the clock accuracy, but I want to be clear that we are not talking about different things.

> Dave
>
> Maarten Wiltink wrote:
>> Unruh [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
>>> David L. Mills [EMAIL PROTECTED] writes:
>>>> There are lots of ways to measure the loop transient response. The easiest way is to set the clock some 50-100 ms off from some stable source (not necessarily accurate) and watch the loop converge. The response should cross zero in about 3000 s and overshoot about 6 percent
>>>
>>> 3000 s is a HUGE time. For people who switch on their computers daily, that means most of their time is spent with the computer unsynchronised to best accuracy. The timescale of chrony is far faster.
>>> (I am not a writer of chrony. I am a user who is trying to get the very best out of the timekeeping.)
>>
>> But NTP is from a time when people didn't switch on their computers daily. When NTP was young, dinosaurs walked the machine room and _you_ did _not_ get to decide when the machine on the other end of your terminal was rebooted.
>>
>> NTP can, after weeks of training, teach a computer to keep time very, very well. As a result, it's less optimised for the other end of the spectrum. Features like iburst and the drift file can get your clock synchronised to within a few milliseconds in less than a minute. If you want better than that, or you want it faster... don't turn your computer off.
>>
>> Groetjes,
>> Maarten Wiltink
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh wrote:
> [EMAIL PROTECTED] (Danny Mayer) writes:
>> Unruh wrote:
>>> [EMAIL PROTECTED] (Danny Mayer) writes:
>>>> Virtual machines buy you the same problem as above. Even on a virtual machine there's only one clock. You can have only one application discipline that clock, never mind how many virtual machines are running. Don't be fooled by the technology.
>>>
>>> Not if the virtual machines have a virtual clock -- i.e. a little program which intercepts all the clock routines and returns the output of a little program simulating a clock. Now intercepting the various adjtimex calls is not that hard (just rewrite the adjtimex and gettimeofday routines and overload them for your program), but chrony and ntp also use the clock as a scheduler, and that is a lot more difficult to simulate and catch.
>>
>> As a fellow physicist I would expect you to understand this better. It's a basic principle in quantum mechanics: the observer influences the observed results. In this case, it's not enough since you are directly and deliberately affecting the clock itself and there really can only
>
> NO, you do not understand. The clocks I am talking about are NOT hardware related clocks, they are just subroutines which return what is supposed to be a time when queried, and which change their algorithm for generating those numbers when disciplined by the program.

Clocks are not stable enough to be just used as an algorithm in a subroutine. Real clocks are unstable, otherwise we wouldn't be having this conversation. In other words, you are not conducting a real experiment and you are not modeling the way an actual clock works.

>>> The really big problem is that the system goes into wait states, and you would also have to wake it up appropriately. For example, the polling interval is done by the clock. Now there is absolutely no reason why a poll which is supposed to be running at poll 10 could not return immediately with the clock set to tell it that 1024 sec had passed.
>>> However, getting this right would require a really big rewrite of the NTP or chrony program.
>>
>> and deliberately affecting the clock itself and there really can only be one clock. Multiple clocks lead to chaotic events. All virtual
>
> Of course there can be many clocks. After all, each computer I have has one, so if I have 10 computers I have 10 clocks. Now of course you are referring to a single computer with a single bit of hardware. But the virtual clocks I am talking about are not hardware related at all. They are just subroutines which spit out a number when queried.

Virtual machines run on a real machine. The clock of the virtual machine is the same as that of the real machine, it's just hidden from you.

>> subroutines don't model a real clock. You have to implement subroutines based on real clock behavior. There are no simulators that I've ever seen that can run tests faster than real-time. They are always many orders of magnitude slower, even with hardware assist.
>
> We are not asking for a machine simulator but a clock simulator, and that can run thousands of times faster than the real clock. You can run it at any speed you want. And you can have a separate simulated clock with its own theory of operation on each virtual machine.

I've run many different simulators, including hardware ones, and I can assure you nothing runs slower than a simulator. Like I said, there is only one real clock in a virtual machine, there just appears to be one per virtual machine.

> A simulator of a clock can run far, far faster than a clock. After all, I can output the numbers from 1 to 10000 far faster than 10000 sec. That is how weather forecasting works. The simulation of the weather is run much faster than the real weather. Otherwise the forecast is a bit useless.

Weather modeling requires a great deal of effort to reflect actual weather fluctuations and changes. It's a very different model and situation, and the feedback loop is likely to be much weaker in weather modeling.
And the observer does not influence the results (except for their own personal biases).

Danny
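As an illustration of the "clock as a subroutine" idea argued over above, here is a minimal Python sketch. It is not taken from ntpd, chrony or any virtual machine; the class name `SimClock`, its methods and the constant-frequency-error drift model are all assumptions made for the example. A realistic model would also need the temperature-driven frequency changes and measurement noise that the thread discusses.

```python
class SimClock:
    """A software clock driven by simulated 'true' time, not by hardware."""

    def __init__(self, freq_error_ppm=25.0, initial_offset=0.0):
        self.true_time = 0.0                  # seconds of simulated true time
        self.reading = initial_offset         # what the simulated clock shows
        self.freq_error_ppm = freq_error_ppm  # intrinsic oscillator error (assumed constant)
        self.correction_ppm = 0.0             # set by the discipline program

    def advance(self, seconds):
        """Move simulated true time forward; no real waiting is involved."""
        rate = 1.0 + (self.freq_error_ppm + self.correction_ppm) * 1e-6
        self.true_time += seconds
        self.reading += seconds * rate

    def gettime(self):
        """Stand-in for gettimeofday()."""
        return self.reading

    def adjust_frequency(self, ppm):
        """Stand-in for the frequency part of adjtimex()."""
        self.correction_ppm = ppm

    def offset(self):
        """Clock error relative to simulated true time (positive = fast)."""
        return self.reading - self.true_time


clock = SimClock(freq_error_ppm=25.0)
clock.advance(1024)                  # one poll-10 interval, instantly
uncorrected = clock.offset()         # roughly 25.6 ms gained
clock.adjust_frequency(-25.0)        # the discipline cancels the drift
clock.advance(1024)
corrected_gain = clock.offset() - uncorrected   # roughly zero
```

Because `advance()` only does arithmetic, such a clock can be stepped through days of simulated time in a moment, which is the point Unruh is making; Danny's objection is about how faithful the drift model is, not about speed.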
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Danny Mayer) writes:

Danny> You do realize that there are timers built into the code so in order
Danny> to run faster you'd need to figure out how to change the timers to
Danny> work that way?

When was the last time you looked at the ntpdsim code?

--
Harlan Stenn [EMAIL PROTECTED]
http://ntpforum.isc.org - be a member!
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
[EMAIL PROTECTED] (Danny Mayer) writes:

> Harlan Stenn wrote:
>> In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Danny Mayer) writes:
>>
>> Danny> Harlan Stenn wrote:
>> Unruh> Unfortunately I cannot run both ntp and chrony on the same system at
>> Unruh> the same time.
>>
>> Bill, exactly why can you not run ntpd and chrony on the same system at the same time?
>>
>> Danny> Harlan, really. You *cannot* have two different
>> Danny> mechanisms/applications to discipline the clock at the same time. I
>> Danny> invite you to try. You have access to my code so you can test this
>> Danny> easily.
>>
>> You are, as is so often the case, missing my point. It is possible to run ntpd in a way that it does not discipline the clock. I am curious about your last sentence though - what is special about your code that would allow this to be tested?
>>
>> I want the ability to run multiple instances of ntpd where at most 1 instance of ntpd is actually controlling the clock, specifically to make it easy to (more quickly) analyze the performance/behavior of different configurations of ntpd. I understand that the boat is rocking while this is going on, but I suspect this capability would be a useful one in at least some cases.
>>
>> Danny> I don't see the benefit of doing this with two separate
>> Danny> instances. It's easier and simpler to just add the other servers into
>> Danny> the one instance and specify noselect.
>>
>> Again you are missing my point. Allowing this would let us, for example, see how two different versions of ntpd would discipline the clock. It would allow us to see how ntpd might discipline the clock compared to chrony. I understand and get that by not actually disciplining the clock we are removing an important part of the feedback loop, and I do not know if that will fatally affect these sort of experiments or not.
>>
>> And as Bill said, it would be Swell if there was a way to do this using, eg, virtual machines so that we could test them that way.
>>
>> Better yet, it would be nice to have a simulator framework where we could run these tests faster than in real-time.
>
> You do realize that there are timers built into the code so in order to run faster you'd need to figure out how to change the timers to work that way?

As I said, it is not easy, particularly because the clock is used as a scheduler. If the only problem were gettimeofday and adjtimex (to use the Linux names), then you could simply replace them by having them interface with a clock simulator. However, there are the schedulers (timers) and timeout functions, which are harder to make work in a simulator.
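The timer problem described above is what a discrete-event loop addresses: instead of sleeping until a timer expires, the loop jumps simulated time to the next expiry. The Python sketch below shows only that general technique. `VirtualScheduler`, `call_later` and `run_until` are hypothetical names for this example, and nothing here reflects how ntpdsim, ntpd or chrony actually handle their timers.

```python
import heapq


class VirtualScheduler:
    """Runs timer callbacks in simulated time, never sleeping."""

    def __init__(self):
        self.now = 0.0       # simulated seconds since start
        self._queue = []     # heap of (due_time, sequence, callback)
        self._seq = 0        # tie-breaker so callbacks are never compared

    def call_later(self, delay, callback):
        """Schedule callback to run `delay` simulated seconds from now."""
        heapq.heappush(self._queue, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run_until(self, end_time):
        """Fire every timer due up to end_time, jumping the clock each time."""
        while self._queue and self._queue[0][0] <= end_time:
            due, _, callback = heapq.heappop(self._queue)
            self.now = due            # jump straight to the timer expiry
            callback()
        self.now = end_time


sched = VirtualScheduler()
poll_times = []
POLL_INTERVAL = 1024  # seconds, i.e. poll 10


def poll():
    poll_times.append(sched.now)   # a real client would exchange packets here
    sched.call_later(POLL_INTERVAL, poll)


sched.call_later(POLL_INTERVAL, poll)
sched.run_until(86400)             # one simulated day, with no real waiting
```

The hard part Unruh points to remains: an existing daemon would have to be rewritten so that every sleep, timeout and socket wait goes through such a scheduler rather than through the operating system.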
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
chrony falls at the first hurdle for me - there appears to be no native Windows implementation.

David
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh [EMAIL PROTECTED] wrote:
> After I collect more data on steady state, I will rerun startups both with no drift file and a bad drift file to see how fast the convergence is with 4.2.4.

Hi,

On recent Linux kernels, I think the drift file is always bad after reboot. HZ=100, no dynamic ticks aka tickless system (CONFIG_NO_HZ not set). I think I even tried the kernel command line option lpj= but it didn't help. If the system is rebooted, ntpd stabilizes to a new, different drift value. With a bad or missing drift file and the time set with ntpdate, ntpd can soon take the offset to a 100 or 200 ms error and keep it there for a long time.

If you are using Linux and are experimenting with these, please try something like this, which has given me good results (a coarse calibration of the drift file during boot before starting ntpd):

#!/bin/sh
DRIFTFILE=/etc/ntp/drift
NTPSERVER=ip.address.of.a.good.nearby.ntp.server
TIME=100

# remember to stop ntpd first if running

# reset frequency offset to zero
adjtimex -f 0

# calibrate clock rate during $TIME seconds
ntpdate -sb $NTPSERVER
sleep $TIME
ADJUST=$(ntpdate -b $NTPSERVER | sed 's/.*offset \(.*\) sec.*/\1/')
# ntpdate adjusted $ADJUST seconds

# convert seconds gained or lost over $TIME seconds to PPM
FREQUENCY=$(echo "scale=3; $ADJUST * 1000000 / $TIME" | bc)

# reset the drift file and start ntpd
echo "$FREQUENCY" > "$DRIFTFILE"
/etc/rc.d/rc.ntpd start
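The arithmetic behind a coarse calibration of this kind is a unit conversion: an offset in seconds that accumulated over a known interval, expressed in parts per million, which is the unit the ntpd drift file uses. A small Python sketch, where the function name `drift_ppm` and the example figures are invented for illustration:

```python
def drift_ppm(offset_seconds, interval_seconds):
    """Frequency error in parts per million implied by an offset that
    accumulated over the given interval."""
    return offset_seconds / interval_seconds * 1_000_000


# Example: a 2.5 ms offset measured at the end of a 100 s calibration run.
print(round(drift_ppm(0.0025, 100), 3))
```

That example works out to about 25 PPM, the same order as the mean rates quoted elsewhere in this thread. The sign of the result simply follows the sign of the offset that was measured.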
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh,

It may help to review the material on Allan deviation and noise modelling in the briefings on the NTP project page. If you are down in the low microsecond range with poll intervals much over 64 s, expect to see a frequency sway due to small temperature variations of less than one degree C. This should appear random-walk in nature and not periodic. Be fortunate about the temperature dependency; it makes a very good fire detector and fan failure alarm.

Dave

Unruh wrote:
> [EMAIL PROTECTED] (David Woolley) writes:
>> In article [EMAIL PROTECTED], Bill Unruh [EMAIL PROTECTED] wrote:
>>> Offset error: NTP: Mean=-3.1usec, Std Dev=63.1usec
>>
>> If offset is the value reported by ntpq, please note that, when ntpd is locked up, this is an indication of the instantaneous measurement error; the actual error in the local time should be more stable (there may be systematic error) by one or two orders of magnitude.
>
> No, the offset is the value reported in loopstats.
>
>> More generally though, Dave Mills really needs to get in here and defend his clock discipline algorithm, and the Chrony developer needs to defend theirs. Arguing the cases by proxy isn't particularly satisfactory.
>
> This is not arguing by proxy, this is running experiments. As I know, since I am a physicist, experiment trumps theory always.
>
>> Dave, please remember that what tends to concern people about the algorithm is not the behaviour in response to gaussian phase noise, but its behaviour in response to transients, in particular startup transients. (Personally I would say that lost clock tick transients should be fixed at source, but Bill Unruh would also like it to tolerate those well.)
>>
>>> Chrony: Mean=-1.5 usec, Std Dev=20.1usec
>>
>> Given the way that I understand it works, I think this is the actual correction applied on that sample.
>
> No, this is the offset as measured by the ntp procedure ( (t1+t4-t2-t3)/2 )
>
>>> Rate fluctuation:
>>> NTP: Mean=25.32 Std Dev=.078 (PPM)
>>> Chrony: Mean=25.26 Std Dev=.091 (PPM)
>
> Running it for a longer time, the standard deviation of the rate for ntp has dropped to about .020PPM, which is much better than chrony's.
>
>> The means depend on the hardware, and, as long as they are within the order of one standard deviation of each other, they are as good as each other.
>
> Yes, I agree that the mean rates are the same. It is the standard deviation that is important here. I.e., ntp seems to be much better (smaller fluctuation) than chrony here, at the expense of much worse offset control (which makes sense if the rate fluctuations are real -- i.e., I can make chrony's rate fluctuations much smaller by running-averaging the rates over a couple of hours, but that will make the offset deviation increase). I guess it depends on which you consider more important, an accurate rate or an accurate clock.
>
>> From the point of view of another machine, chrony will have episodes where the frequency changes much more, as it applies the phase correction.
>
> ??? These are done on the same machine. If you mean that when the real drift rate of the computer changes, then chrony's rate will change, then I would hope that that happens.
> Remember that this is not comparing two different machines, but the same machine at two different times. And yes, the physical events could have changed between the two. It would be nice if one could do a simulation -- put them both on some virtual machine and feed in exactly the same real clock drift changes, and use some model of the noise (measurement, transmission, etc.) so one could provide the two algorithms with exactly the same data to work with. But neither chrony nor ntp is set up for that.
>
>>> over the weekend, and chrony encompassed the weekdays when the grad students use the computer) the offset control by chrony was a factor of 3 better than by ntp.
>>
>> If the figures are the actual correction for chrony and the sample error for ntpd, and Dave Mills is correct about the phase noise rejection of the ntpd filter being a couple of orders of magnitude, ntpd might actually be 30 times better.
>
> Nope, the figures are the actual samples as measured by chrony, and the processed output from ntp as reported in loopstats -- whatever that figure is. I.e., if processing makes a difference then the advantage lies with ntp. But I just checked using the offset reported in peerstats (choosing only the packets from the one local server) and I get the same result as from loopstats. I.e., both the results for ntp and for chrony are the raw offsets, and chrony's are about 2.2 times better than ntp's.
>
> So, chrony, at least in this one test, controls the offsets of the clock much better, at the expense of worse consistency in the frequency. It also reacts much faster to gross changes in the time (e.g. startup with no drift file).
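For readers who want to reproduce figures like the means and standard deviations quoted above, here is a hedged Python sketch. It assumes the ntpd loopstats record layout of modified Julian day, seconds past midnight, offset in seconds, frequency in PPM, jitter, wander and time constant. The function name and the sample records are made up for illustration and are not data from this thread.

```python
import statistics


def loopstats_summary(lines):
    """Mean and standard deviation of offset (microseconds) and frequency (PPM)."""
    offsets_us, freqs_ppm = [], []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue                      # skip blank or truncated lines
        offsets_us.append(float(fields[2]) * 1e6)   # field 3: offset in seconds
        freqs_ppm.append(float(fields[3]))          # field 4: frequency in PPM
    return {
        "offset_mean_us": statistics.mean(offsets_us),
        "offset_std_us": statistics.stdev(offsets_us),
        "freq_mean_ppm": statistics.mean(freqs_ppm),
        "freq_std_ppm": statistics.stdev(freqs_ppm),
    }


# Made-up sample records, for illustration only.
sample = [
    "54495 100.000 0.000010 25.30 0.000004 0.010 7",
    "54495 228.000 -0.000020 25.32 0.000004 0.010 7",
    "54495 356.000 0.000030 25.34 0.000004 0.010 7",
]
summary = loopstats_summary(sample)
```

In practice one would pass the lines of a real loopstats file, for example `loopstats_summary(open(path))` with `path` pointing at wherever the local ntpd writes its statistics.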
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Guys,

I haven't read every word on this thread, but all I can contribute is that nothing reported here is anything like my experience here. Our servers pogo.udel.edu and rackety.udel.edu are synchronized via GPS and PPS. I invite the skeptics to peek at them from time to time. I describe their behavior as like cats; most of the time they are quiet and gentle at a few microseconds, but once in a while they show a surge of ten microseconds or more, especially after a power failure, which we do get from time to time.

There is a persistent report that appears as a low-frequency ringing with more or less constant period. This would seem to suggest something wrong with the discipline loop transient response. In the past the most likely cause has been an ill-advised tinker with the Unix adjtime() system call with the dubious purpose of reducing the time to slew the clock over some range. This wrecks the transient response and easily leads to loop instability. If you are using the kernel time discipline and not adjtime(), this is not an issue.

There are lots of ways to measure the loop transient response. The easiest way is to set the clock some 50-100 ms off from some stable source (not necessarily accurate) and watch the loop converge. The response should cross zero in about 3000 s, overshoot about 6 percent and smoothly amortize over several hours. Be sure to clamp the poll interval to 64 s over that period. If it does something else, like show an exponentially decreasing ringing, go looking for trouble.

As for "offset should be much larger than the error", be careful here. By error I assume you mean what ntpq rv shows as jitter. The best case is when offset is indeed less than jitter; if the offset is much larger than the error, this suggests the frequency has surged and the time constant/poll interval needs to be reduced. Watch the poll interval behavior in the loopstats data.

Dave

David Woolley wrote:
> In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] wrote:
>> No, the offset is the value reported in loopstats.
>
> Same thing. If chrony is reporting the same measurements, neither set of measurements is particularly valid. You need to measure the actual offsets, using something that has a repeatability a couple of orders of magnitude better. Certainly for ntpd, offset should be much larger than the error, when locked. Is the server running ntpd?
>
> Anyway, as I said, arguing by proxy is difficult and I'm rather hoping that Dave Mills will take over. Certainly it is Dave Mills you have to convince if ntpd is going to change.
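The step-and-watch test described above produces a time series of offsets, and the two figures of merit mentioned (time of first zero crossing, percentage overshoot) can be read off it mechanically. The Python sketch below does that for any list of samples. The function name and the sample values are invented, and the sketch says nothing about what ntpd or chrony will actually produce.

```python
def transient_summary(samples):
    """samples: list of (seconds_since_step, offset_seconds), in time order.

    Returns (zero_crossing_time, overshoot_fraction); the crossing time is
    None if the offset never changes sign.
    """
    _, initial = samples[0]
    crossing = None
    overshoot = 0.0
    for (t_prev, x_prev), (t_cur, x_cur) in zip(samples, samples[1:]):
        if crossing is None and x_prev * x_cur <= 0 and x_prev != x_cur:
            # linear interpolation between the two samples that straddle zero
            crossing = t_prev + (t_cur - t_prev) * x_prev / (x_prev - x_cur)
        if crossing is not None and x_cur * initial < 0:
            # largest excursion on the far side of zero, relative to the step
            overshoot = max(overshoot, abs(x_cur) / abs(initial))
    return crossing, overshoot


# Invented data: a 100 ms step that decays, crosses zero and overshoots.
demo = [(0, 0.100), (1000, 0.060), (2000, 0.025), (3000, -0.001),
        (4000, -0.006), (5000, -0.004), (6000, -0.001)]
crossing, overshoot = transient_summary(demo)
```

For the invented data the crossing falls just before 3000 s and the overshoot is 6 percent of the initial step. Fed with offsets extracted from loopstats or from chrony's tracking log after a deliberate step, the same function gives a like-for-like comparison of the two daemons.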
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh,

Please read the specification. The offset statistic is the maximum-likelihood estimate of the remote clock offset relative to the local clock, and the sign really does matter. The best way to describe this and keep the sign straight is to assume the signed offset is the quantity in seconds to add to the local clock in order to maintain the same time as the remote clock. The variance statistic, which is represented as an exponentially weighted RMS average called jitter, is the expected error when computing the offset statistic. Generally speaking, as long as the jitter is unbiased, it does not materially affect the clock accuracy due to the extreme lowpass characteristic of the discipline. You should be watching the offset statistic, not the jitter statistic.

Dave

Unruh wrote:
> [EMAIL PROTECTED] (Danny Mayer) writes:
>> Unruh wrote:
>>>> correction applied on that sample.
>>>
>>> No, this is the offset as measured by the ntp procedure ( (t1+t4-t2-t3)/2 )
>>
>> No, that's wrong. It is very carefully described in the NTPv4 draft section 8 (p27):
>>
>> theta = T(B) - T(A) = 1/2 * [(T2-T1) + (T3-T4)]
>>
>> Not only do you have the wrong sign, the differences must be calculated first, otherwise the errors in the calculation overwhelm the resulting value. That's why it's written the way it is.
>
> I was not writing code. I was telling you what time difference I was referring to. And sign is a convention as to whether you are saying positive means the computer is fast or the external source is fast. Everything I talked about is sign independent (standard deviation uses squares), and the difference is that as reported by ntp or chrony, and both are careful to do the calculations with as high an accuracy as possible.
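The two expressions being argued over differ only in sign and in the order of operations. A short Python sketch of the on-wire calculation, using the theta formula quoted above together with the standard round-trip delay expression; the timestamps are invented for the example:

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """t1: client transmit, t2: server receive, t3: server transmit,
    t4: client receive. Returns (offset, delay) in the units of the inputs."""
    offset = ((t2 - t1) + (t3 - t4)) / 2   # differences first, then the sum
    delay = (t4 - t1) - (t3 - t2)          # round trip minus server hold time
    return offset, delay


# Invented timestamps: the server is 50 ms ahead, 10 ms each way on the wire.
offset, delay = ntp_offset_and_delay(100.000, 100.060, 100.061, 100.021)
flipped = (100.000 + 100.021 - 100.060 - 100.061) / 2  # the other sign convention
```

Here `offset` comes out at about +0.050 s, the amount to add to the local clock, and `delay` at about 0.020 s, while `flipped` is the same magnitude with the opposite sign. Taking the differences first matters when timestamps are large values held in limited precision, since each difference is small and can be represented far more accurately than a sum of full timestamps.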
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh,

The basic clock discipline feedback loop has been unchanged since 1992, although minor changes have been made to improve behavior in very long poll intervals. The only radical change has been using a preliminary 15-minute initial frequency computation when no frequency file is available. So, if you are comparing ntpd and chrony at initial startup and without a frequency file, expect to find wide differences in behavior. Starting ntpd with an intentionally bad frequency file is not useful unless you can configure chrony in the same way.

If you really do want a definitive experiment, do what I suggested earlier: measure the transient response of both ntpd and chrony starting from the SAME initial conditions and with a frequency file containing zero PPM. Pay attention to the poll interval, which should be the same in both cases. That will tell you the story, the whole story and nothing but the truth.

Dave

Unruh wrote:
> [EMAIL PROTECTED] (Danny Mayer) writes:
>> Unruh wrote:
>>> All I say is that the experiments I have carried out show that ntp is slow to converge if it starts off badly, and leaves the offset scatter larger than chrony does. It does have a smaller scatter in the rate.
>>
>> But you are using an extremely old version of ntp and things have radically changed since that version was released. Try rerunning your experiments with ntp 4.2.4 and see what you get then. You also need to fix your calculations if you are going to get good results, as I mentioned in a previous message.
>
> Most of the standard deviation results are with 4.2.4. Only the startup was with 4.2.0. Are you saying that things have radically changed in the handling of the startup?
>
> After I collect more data on steady state, I will rerun startups both with no drift file and a bad drift file to see how fast the convergence is with 4.2.4.
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh,

As you can see from the Allan deviation plots in the briefings on the NTP project page, it does no good whatsoever to average over ten hours; the observations are completely uncorrelated over that lag. The Allan intercept, or best averaging time, is more like twenty minutes to two hours, depending on the whims of fate and the cut of the rock.

It is a matter of physics that the NTP offsets will truly average out to zero in the long term unless there is an intrinsic bias in the timestamp calculations and, believe me, these calculations are very carefully done to reduce residual bias to essentially zero. In principle, assuming a precision source is available, this puppy can hold better than one picosecond.

If you see a persistent nonzero offset, there very likely is an oscillator frequency problem, or the adjtime() or equivalent system call has an inherent bias. In any case, even if it does, and unless there is a huge frequency error in the order of several hundred PPM, the long term average offset will be very close to zero.

In principle, comparing ntpd and chrony or any other vehicle is meaningful only if using the same hardware, operating system and poll interval. I assume chrony has done its homework and optimized the time constant for the given poll interval. I am out on a limb here, because nobody has confirmed that chrony does in fact discipline the clock using some sort of feedback loop sensitive to both time and frequency offset. If in fact it does not, then why are we having this discussion?

Dave

Unruh wrote:
> [EMAIL PROTECTED] (David Woolley) writes:
>> In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] wrote:
>>> No, the offset is the value reported in loopstats.
>>
>> Same thing. If chrony is reporting the same measurements, neither set of measurements is particularly valid. You need to measure the actual
>
> I am sorry but I do not understand what you are saying. The best estimate of the time error of the clock is the measurement that you make of that error.
> Now, you might argue that if the drift never changed, and the clock never changed, then one could get a better estimate by averaging the measurements. But the hypothesis that the time reported by the ntp process is the true time plus random uncorrelated errors is simply wrong, as looking at the plot of the offsets will rapidly convince you. The offsets oscillate with a period on the order of an hour or so. Chrony does this. ntp does this. The errors are NOT gaussian uncorrelated random errors. Thus most of that error budget is in such correlated errors, and ntp does NOT do many orders of magnitude better (even with uncorrelated random errors you would need to average 100 samples -- collected over 10 hours at poll 7 -- to get one order of magnitude, and by then the drift errors would have gotten you).
>
> Anyway, I am comparing like with like in the two programs. Chrony is much better in offsets, which implies that at least half of the error in ntp is internal error. I.e., it is errors which do NOT average out (and I do not believe that chrony's errors are the minimal uncorrelated random errors either).
>
>> offsets, using something that has a repeatability a couple of orders
>
> Of course that is the best way. Unfortunately I do not have that. I might extend the line running the main server to also give me the true offsets for the machine. However, one can also get an estimate of the errors by looking at the measured offsets using the ntp exchange.
>
>> of magnitude better. Certainly for ntpd, offset should be much larger than the error, when locked. Is the server running ntpd?
>
> I do not believe this. Yes, the server is running ntp and its offset errors are of the order of 3 usec -- again correlated, as you can see from the string graph near the bottom of the page. For example, if I take a 10 element running average and subtract it from the raw output of ntp for the server, the standard deviation goes from 3 usec to .5 usec. I.e., the errors are highly correlated.
> Averaging may be able to get rid of that .5 usec, but not the rest of the standard deviation (3 usec), which is some sort of highly correlated noise. IF the errors were really uncorrelated random errors, then subtracting off the running average would make no (well, little) difference to the standard deviation (it would decrease it by something like sqrt((N-1)/N), where N is the length of the running average). I.e., it is simply not true that the measured offsets reported by ntp, or chrony, are simply some independent gaussian random process around the true time.
>
>> Anyway, as I said, arguing by proxy is difficult and I'm rather hoping that Dave Mills will take over. Certainly it is Dave Mills you have to convince if ntpd is going to change.
>
> I do not know if I am trying to convince. I am trying to report the outcomes of some experiments. Now if one (Mills) wants ntp to behave differently than the
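The running-average argument above can be checked numerically. For uncorrelated noise, subtracting the mean of an N-sample window (one that includes the sample itself) should shrink the standard deviation only by a factor of sqrt((N-1)/N), about 0.95 for N = 10. A drop from 3 usec to 0.5 usec is therefore a sign of strong correlation. A Python sketch using synthetic white noise, with no data from the thread; the function name is invented:

```python
import math
import random
import statistics


def residual_std_ratio(samples, n):
    """Std dev of (sample - mean of the n-sample window ending at it),
    divided by the std dev of the raw samples."""
    residuals = []
    window_sum = sum(samples[:n])
    for i in range(n - 1, len(samples)):
        if i >= n:
            # slide the window: add the newest sample, drop the oldest
            window_sum += samples[i] - samples[i - n]
        residuals.append(samples[i] - window_sum / n)
    return statistics.pstdev(residuals) / statistics.pstdev(samples)


random.seed(1)
white = [random.gauss(0.0, 3.0) for _ in range(100_000)]   # uncorrelated noise
ratio = residual_std_ratio(white, 10)
expected = math.sqrt((10 - 1) / 10)                        # about 0.949
```

Running the same function on real offset data and comparing `ratio` with `expected` gives a quick test of whether the measurement errors can plausibly be treated as independent.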
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Petri,

I knew Linux was broken, but what you report suggests it is broken beyond my wildest imagination.

First, I do know Linux supports the precision time kernel, as it has the ntp_adjtime() system call, even if it is buried in a wrapper. If so, and assuming that syscall is implemented correctly, it doesn't matter whether ticks are significant or not.

Second, Linux has completely broken the initial frequency computation and intended semantics of the frequency file.

The really serious and sad issue here is that Linux has added much unnecessary baggage that disables or distorts the carefully engineered design principles. It would be much, much better to rip out all that baggage and use only what comes with the bare ntpd. Certainly, at least Solaris and FreeBSD have no such baggage. I know that at least one time in the past ntpd ran just fine on Linux, at least the ntpd version that leaves here, not the one that comes with Linux. If this is still the case, your course is clear.

Dave
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David L. Mills [EMAIL PROTECTED] writes:

> Unruh,
>
> It may help to review the material on Allan deviation and noise modelling in the briefings on the NTP project page. If you are down in the low microsecond range with poll intervals much over 64 s, expect to see a frequency sway due to small temperature variations of less than one degree C. This should appear random-walk in nature and not periodic. Be fortunate about the temperature dependency; it makes a very good fire detector and fan failure alarm.

Sure, I understand that. I am not worried about the fluctuations in the frequency, especially if they track real fluctuations in the drift rate of the clock. I am worried about the offsets, since they indicate that the system is NOT following the real drift rate of the clock, especially when the fluctuations are highly correlated (i.e. are not just random noise).

The much better behaviour of chrony on offsets suggests that ntp is NOT following the drift rate of the clock, especially as the scatter on chrony is a) much more random and b), I suspect, tied to the much worse behaviour of chrony in the round trip time department. I.e., chrony is doing much better (in controlling offsets) even though it is suffering much worse noise than ntp is.

> Dave
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David L. Mills [EMAIL PROTECTED] writes: In principle, comparing ntpd and chrony or any other vehicle is meaninful only if using the same hardware, operating system and poll interval. I assume chrony has done its homework and optimized the time constant for the given poll interval. I am on a limb here, because nobody has confirmed that chrony does in fact discipline the clock using some sort of feedback loop sensitive to both time and frequency offset. If in fact it does not, then why are we having this discussion? It does. And the comparison is on exactly the same machine, running exactly the same operating system. The ONLY change is that chrony was replaced by ntp on that system. Now the date is not the same. Unfortunately I cannot run both ntp and chrony on the same system at the same time. Dave Unruh wrote: [EMAIL PROTECTED] (David Woolley) writes: In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] wrote: No, the offset is the value reported in loopstats. Same thing. If chrony is reporting the same measurements, neither set of measurements is particularly valid. You need to measure the actual I am sorry but I do not understand what you are saying. The best estimate of the time error of the clock is the measurement that you make of that error. Now, you might argue that if the drift never changed, and the clock never changed then one could get a better estimate by averaging the measurements. But the hypothesis that the time reported by the ntp process is the true time plus random uncorrelated errors is simply wrong, as looking at the plot of the offsets will rapidly convince you. The offsets oscillate with a period on the order of an hour or so. Chrony does this. ntp does this. The errors are NOT gaussian uncorrelated random errors. 
Thus most of that error budget is in such correlated errors, and ntp does NOT do many orders of magnitude better ( even with uncorrelated random errors you would need to average 100 samples-- collected over 10 hours at poll 7 to get one order of magnitude, and by then the drift errors would have gotten you.) Anyway, I am comparing like with like in the two programs. Chrony is much better in offsets, which implies that at least half of the error in ntp is internal error. Ie, it is errors which do NOT average out. ( and I do not believe that chrony's errors are the minimal uncorrelated random errors either.) offsets, using something that has a repeatability a couple of orders Of course that is the best way. Unfortunately I do not have that. I might extend the line running the main server to also give me the true offsets for the machine. However one can also get an estimate of the errors by looking at the measured offsets using the ntp exchange. of magnitude better. Certainly for ntpd, offset should be much larger than the error, when locked. Is the server running ntpd? I do not believe this. Yes, the server is running ntp and its offset errors are of the order of 3usec-- again correlated as you can see from the string graph near the bottom of the page. For example, if I take a 10 element running average and subtract it from the raw output of ntp for the server , the standard deviation goes from 3usec to .5usec. Ie, the errors are highly correlated. Averaging may be able to get rid of that .5usec, but not the rest of the standard deviation (3usec) which is some sort of highly correlated noise. 
If the errors were really uncorrelated random errors, then subtracting off the running average would make no (well, little) difference to the standard deviation (it would decrease it by something like sqrt((N-1)/N), where N is the length of the running average). I.e., it is simply not true that the measured offsets reported by ntp, or chrony, are simply some independent gaussian random process around the true time. Anyway, as I said, arguing by proxy is difficult and I'm rather hoping that Dave Mills will take over. Certainly it is Dave Mills you have to convince if ntpd is going to change. I do not know if I am trying to convince. I am trying to report the outcomes of some experiments. Now if one (Mills) wants ntp to behave differently than the experiments show it does, then I guess he will change it. If not, then not. All I say is that the experiments I have carried out show that ntp is slow to converge if it starts off badly, and leaves the offset scatter larger than chrony does. It does have a smaller scatter in the rate. One of the great advantages of two different people, Mills and Curnow, trying to implement the same ideas in different ways is that one can learn by studying the difference between their results.
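The running-average argument above can be tried on synthetic data. The following is only a sketch under stated assumptions: the two series are made up (white gaussian noise, and a slowly wandering AR(1) series standing in for correlated offsets), the window is a trailing one that includes the current sample, and the names `subtract_running_average` and `std_ratio` are hypothetical helpers, not anything from ntp or chrony.

```python
import math
import random
import statistics


def subtract_running_average(samples, n):
    """Return each sample minus the mean of the trailing n-sample window.

    The window includes the current sample, so the first n-1 samples
    (which have no full window) are skipped.
    """
    residuals = []
    window_sum = sum(samples[:n - 1])
    for i in range(n - 1, len(samples)):
        window_sum += samples[i]
        residuals.append(samples[i] - window_sum / n)
        window_sum -= samples[i - n + 1]
    return residuals


def std_ratio(samples, n):
    """Standard deviation after subtracting the running average, over the raw one."""
    return (statistics.pstdev(subtract_running_average(samples, n))
            / statistics.pstdev(samples))


rng = random.Random(0)  # fixed seed so the run is repeatable
N = 10                  # window length, as in the 10-element example above

# Uncorrelated case: independent gaussian samples (synthetic).
white = [rng.gauss(0.0, 3.0) for _ in range(50_000)]

# Correlated case: a slowly wandering AR(1) series (synthetic stand-in).
wander = []
x = 0.0
for _ in range(50_000):
    x = 0.99 * x + rng.gauss(0.0, 1.0)
    wander.append(x)

white_ratio = std_ratio(white, N)
wander_ratio = std_ratio(wander, N)
expected_white = math.sqrt((N - 1) / N)

print("white noise ratio:", white_ratio, "expected about", expected_white)
print("correlated ratio: ", wander_ratio)
```

For the uncorrelated series the ratio stays close to sqrt((N-1)/N), while for the wandering series it drops sharply, which is the signature described above for correlated offsets.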
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David L. Mills [EMAIL PROTECTED] writes: Unruh, Please read the specification. The offset statistic is the maximum-likelihood estimate of the remote clock offset relative to the local clock and the sign really does matter. The best way to describe this and keep the sign straight is to assume the signed offset is the quantity in seconds to add to the local clock in order to maintain the same time as the remote clock. Of course the sign matters if you are trying to correct things. Sheesh. The sign does NOT matter to the standard deviation. That was ALL I was saying. The variance statistic, which is represented as an exponentially weighted RMS average called jitter, is the expected error when computing the offset statistic. Generally speaking, as long as the jitter is unbiased, it does not materially affect the clock accuracy due to the extreme lowpass characteristic of the discipline. You should be watching the offset statistic, not the jitter statistic. ??? I am watching the offset. I am taking the offset as measured by the ntp negotiation and calculating the rms deviation of that. Dave Unruh wrote: [EMAIL PROTECTED] (Danny Mayer) writes: Unruh wrote: correction applied on that sample. No, this is offset as measured by the ntp procedure ((t1+t4-t2-t3)/2) No, that's wrong. It is very carefully described in the NTPv4 draft section 8 (p27): theta = T(B) - T(A) = 1/2 * [(T2-T1) + (T3-T4)] Not only do you have the wrong sign, the differences must be calculated first, otherwise the errors in the calculation overwhelm the resulting value. That's why it's written the way it is. I was not writing code. I was telling you what time difference I was referring to. And sign is a convention as to whether positive means the computer is fast or the external source is fast.
Everything I talked about is sign independent (standard deviation uses squares), and the difference is the one reported by ntp or chrony, both of which are careful to do the calculations with as high an accuracy as possible.
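Two points in this exchange, that an exponentially weighted RMS average tracks the size of the errors and that such squared statistics do not care about sign, can be illustrated with a small sketch. This is not ntpd's code: the function name `update_jitter`, the weight of 0.25 and the sample values are all assumptions chosen for illustration.

```python
import math
import statistics


def update_jitter(jitter, error, weight=0.25):
    """One step of an exponentially weighted RMS average.

    The weight is a hypothetical value; it sets how quickly old
    errors are forgotten.
    """
    return math.sqrt(jitter ** 2 + weight * (error ** 2 - jitter ** 2))


def run_jitter(errors, weight=0.25):
    """Feed a sequence of errors through the averager, starting from zero."""
    jitter = 0.0
    for e in errors:
        jitter = update_jitter(jitter, e, weight)
    return jitter


# Made-up offset samples in microseconds.
offsets = [12.0, -30.5, 4.25, 18.0, -7.5, 22.0, -15.0]
flipped = [-o for o in offsets]

# Both statistics are built from squares, so flipping every sign
# leaves them unchanged.
jitter_a = run_jitter(offsets)
jitter_b = run_jitter(flipped)
std_a = statistics.pstdev(offsets)
std_b = statistics.pstdev(flipped)

# A constant error magnitude is eventually reported as that magnitude.
steady = run_jitter([5.0, -5.0] * 50)

print(jitter_a, jitter_b, std_a, std_b, steady)
```

The sign convention therefore only matters when the offset is used as a correction, not when its scatter is being compared.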
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David L. Mills [EMAIL PROTECTED] writes: Unruh, The basic clock discipline feedback loop has been unchanged since 1992, although minor changes have been made to improve behavior in very long poll intervals. The only radical change has been using a preliminary 15-minute initial frequency computation when no frequency file is available. So, if you are comparing ntpd and chrony at initial startup and without a frequency file, expect to find wide differences in behavior. Starting ntpd with an intentionally bad frequency file is not useful unless you can configure chrony in the same way. Of course I can. And have done so. As you say, the transient response of ntp is terrible-- 3000 sec is far too slow. If you really do want a definitive experiment, do what I suggested earlier: measure the transient response of both ntpd and chrony starting from the SAME initial conditions and with a frequency file containing zero PPM. Pay attention to the poll interval, which should be the same in both cases. That will tell you the story, the whole story and nothing but the truth. Precisely what I have been doing. Dave Unruh wrote: [EMAIL PROTECTED] (Danny Mayer) writes: Unruh wrote: All I say is that the experiments I have carried out show that ntp is slow to converge if it starts off badly, and leaves the offset scatter larger than chrony does. It does have a smaller scatter in the rate. But you are using an extremely old version of ntp and things have radically changed since that version was released. Try rerunning your experiments with ntp 4.2.4 and see what you get then. You also need to fix your calculations if you are going to get good results as I mentioned in a previous message. Most of the standard deviation results are with 4.2.4. Only the startup was with 4.2.0. Are you saying that things have radically changed in the handling of the startup?
After I collect more data on steady state, I will rerun startups both with no drift file and a bad drift file to see how fast the convergence is with 4.2.4.
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Hey Danny! In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] writes: Unruh Unfortunately I cannot run both ntp and chrony on the same system at Unruh the same time. Bill, Exactly why can you not run ntpd and chrony on the same system at the same time? I want the ability to run multiple instances of ntpd where at most 1 instance of ntpd is actually controlling the clock, specifically to make it easy to (more quickly) analyze the performance/behavior of different configurations of ntpd. I understand that the boat is rocking while this is going on, but I suspect this capability would be a useful one in at least some cases. H ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] David L. Mills [EMAIL PROTECTED] writes:

There are lots of ways to measure the loop transient response. The easiest way is to set the clock some 50-100 ms off from some stable source (not necessarily accurate) and watch the loop converge. The response should cross zero in about 3000 s and overshoot about 6 percent

3000 s is a HUGE time. For people who switch on their computers daily, that means most of their time is spent with the computer unsynchronised to best accuracy. The timescale of chrony is far faster. (I am not a writer of chrony. I am a user who is trying to get the very best out of the timekeeping.)

But NTP is from a time when people didn't switch on their computers daily. When NTP was young, dinosaurs walked the machine room and _you_ did _not_ get to decide when the machine on the other end of your terminal was rebooted. NTP can, after weeks of training, teach a computer to keep time very, very well. As a result, it's less optimised for the other end of the spectrum. Features like iburst and the drift file can get your clock synchronised to within a few milliseconds in less than a minute. If you want better than that, or you want it faster... don't turn your computer off.

Groetjes, Maarten Wiltink
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
David L. Mills wrote: As for offset should be much larger than the error, be careful here. By error I assume you mean what ntpq rv shows as jitter. The best case No. By error I meant a measurement that neither ntpd nor chrony can actually make, namely the difference between the user's concept of perfect time and the actual time in the software clock in the client. If you could actually measure it, you would probably characterize it by the root mean square of this. What actually happens is that, say you have a server, that you define as perfect time, you desperately want a measure of how accurate your client is compared with the server's internal time. People seize on offset as a measure of that, but, if the loop is well locked, which I think amounts to jitter and RMS offset being essentially the same, offset is almost entirely made up of measurement error. In reality the client's software clock may well be in almost perfect synchronization with the server's and certainly should have an RMS difference that is much less than deduced from offset/jitter. (Systematic errors may result in a systematic offset, so one is really talking about a jitter-like measure, relative to the, unavailable, perfect time.) The measurement cannot be made using ntpd or chrony alone, because if they could measure the true error, they could correct for it. is when offset is indeed less than jitter; if the error is much larger than error, this suggests the frequency has surged and the time I think you meant the first error to be offset and the second one to be jitter. I would consider this case to be one where the loop was not properly locked. constant/poll interval needs to be reduced. Watch the poll interval behavior in the loopstats data. I think you really need to address two issues to put this thread to rest: - the use of linear regression algorithms on finite histories, as an alternative to the ntpd algorithm (i.e. 
the statistician's/scientist's approach, versus the engineer's); - the handling of cases where it is obvious to a human that the time is wrong, but ntpd will take 3000+s to fully correct. chrony uses linear regression (modified least squares) and it seems to be getting a reputation for recovering from transients much better than ntpd. Unruh believes that this is the consequence of the algorithm that it uses, which means that least squares type techniques are beginning to be associated with the way to go with time synchronization. I know you disagree, but you have to convince people of that when chrony seems to behave much better in the transients seen in real uses of ntpd. I wonder if what is really needed is to use linear regression to gain and regain lock and to use the current ntpd algorithm when you are reasonably convinced that the loop is locked. At the moment, you do a two point linear regression on a cold start, or after a step, although two point least squares fits are rather trivial as they always have zero variance if the points are distinct! My understanding of chrony, based on high level documents and a quick skim of the code is that: - it is not NTP compliant because it doesn't seem to implement normative parts of the NTPv3 specification, like the intersection algorithm (but many people don't distinguish between SNTP and NTP because they use the same wire formats); - the way it works is to maintain a finite history of measurements and to use linear regression (least squares modified to give less weight to outliers) to calculate a phase and frequency error. It applies the phase correction as a fast slew (which is seen as an advantage, because only a fixed frequency correction is left if the server goes away) and the frequency correction continuously. Once it has applied a correction, it adjusts the historic measurements to account for its current time and frequency scales. I think there is more to it than this, e.g.
adjusting sample rates and the number of retained samples. Because it is significantly different in principle from ntpd, it is not entirely clear that ntpd concepts like loop time constants are explicit in the chrony model, although they might be implicit in things like the period over which samples are currently being retained. A problem that Unruh is having is that some of the answers he is getting seem to represent blind faith in ntpd without any knowledge of alternative approaches.
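The regression idea described in this message, fitting a straight line through a finite history of offset measurements to obtain a phase error and a frequency error, can be sketched as follows. This is ordinary unweighted least squares, not chrony's outlier-resistant variant, and the function name `fit_offset_and_rate` and the sample data are assumptions made for illustration only.

```python
import math


def fit_offset_and_rate(times, offsets):
    """Fit offset = a + rate * t by ordinary least squares.

    Returns (offset_at_latest_time, rate, residual_rms). The rate is the
    frequency error (seconds of offset per second), and the fitted offset
    at the newest sample is the phase error to correct.
    """
    n = len(times)
    if n < 2 or n != len(offsets):
        raise ValueError("need at least two (time, offset) pairs")
    t_mean = sum(times) / n
    o_mean = sum(offsets) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0.0:
        raise ValueError("sample times must be distinct")
    sxy = sum((t - t_mean) * (o - o_mean) for t, o in zip(times, offsets))
    rate = sxy / sxx
    fitted = [o_mean + rate * (t - t_mean) for t in times]
    residual_rms = math.sqrt(
        sum((o - f) ** 2 for o, f in zip(offsets, fitted)) / n)
    return fitted[-1], rate, residual_rms


# Made-up history: five samples 64 s apart on a clock that starts 1 ms
# off and drifts at 25 PPM.
times = [0.0, 64.0, 128.0, 192.0, 256.0]
offsets = [1e-3 + 25e-6 * t for t in times]

phase, rate, rms = fit_offset_and_rate(times, offsets)
print("phase error (s):", phase)
print("frequency error (PPM):", rate * 1e6)

# With only two distinct points the line passes through both, so the
# residual scatter carries no information.
_, _, two_point_rms = fit_offset_and_rate(times[:2], [1e-3, 3e-3])
```

The two-point call at the end reproduces the remark above that a two-point fit always has zero variance, which is why it cannot say anything about measurement noise.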
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Harlan Stenn [EMAIL PROTECTED] writes: Hey Danny! In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] writes: Unruh Unfortunately I cannot run both ntp and chrony on the same system at Unruh the same time. Bill, Exactly why can you not run ntpd and chrony on the same system at the same time? The key problem is that there is a feedback between the clock control algorithm and the clock loop routines. That feedback is missing if chrony or whatever does not control the clock. It may tell you a bit but removing such a crucial part of the feedback loop would totally change the behaviour of the loop. What would be great to have would be the ability to run them on a fake clock machine, where it is a program which responds to the output (clock control) and the input ( the network stuff.) In particular chrony uses system clock stuff to schedule various events-- like sending out the packets, etc. One would have to have the system feed those routines as well. HOwever if one could do that then one could look at how both ntp and chrony reacted to exactly the same input/output. It is however a massive rewrite of the code, unfortunately, as far as i can see. I want the ability to run multiple instances of ntpd where at most 1 instance of ntpd is actually controlling the clock, specifically to make it easy to (more quickly) analyze the performance/behavior of different configurations of ntpd. I understand that the boat is rocking while this is going on, but I suspect this capability would be a useful one in at least some cases. Unfortunately I do not think that will give much info as to how the different configurations behave. It would be like disconnecting the feedback in an amplifier-- the amp behaves very very differently. H ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
On Mon, 21 Jan 2008, Danny Mayer wrote: Unruh wrote: All I say is that the experiments I have carried out show that ntp is slow to converge if it starts off badly, and leaves the offset scatter larger than chrony does. It does have a smaller scatter in the rate. But you are using an extremely old version of ntp and things have radically changed since that version was released. Try rerunning your experiments with ntp 4.2.4 and see what you get then. You also need to fix your calculations if you are going to get good results as I mentioned in a previous message. I did. The calculations I presented were with 4.2.4, except for the convergence on initial transient. I have not retried that experiment (it takes too long). Most of the results regarding the scattering of the offset are for 4.2.4. It is a factor of a little over two worse than chrony in regulating the offset. In a few weeks I will probably try the initial transient stuff again. (I am out of town next week.) However, do you believe that the behaviour of 4.2.4 under initial conditions is better than 4.2.0 (e.g. either no drift file or a bad drift file)?
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
On recent Linux kernels, I think the drift file is always bad after reboot. HZ=100, no dynamic ticks aka tickless system (CONFIG_NO_HZ not set). I think I even tried with a kernel command line option lpj= but it didn't help. If the system is rebooted, ntpd stabilizes to a new different drift value.

That's a bug in the TSC calibration code. grep your /var/log/messages* for "Detected". You will find things like this:

Jan 4 11:21:49 shuksan kernel: Detected 2793.137 MHz processor.
Jan 4 21:30:43 shuksan kernel: Detected 2793.209 MHz processor.
Jan 22 09:32:20 shuksan kernel: Detected 2793.139 MHz processor.

The differences in the bottom bits turn into different drift values. Recent Linux kernels use the TSC for timekeeping. (At least on the systems I work with.) There may be a simple command line option to use another chunk of hardware.

-- These are my opinions, not necessarily my employer's. I hate spam.
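To see how large those bottom-bit differences are in drift terms, the three "Detected" values quoted above can be turned into relative differences in PPM. This is only a worked-arithmetic sketch, assuming the kernel scales TSC counts by the detected frequency so that a relative calibration error appears directly as an apparent frequency offset; the helper name `ppm_difference` is hypothetical.

```python
# Frequencies from the three "Detected ... MHz processor" log lines above.
detected_mhz = [2793.137, 2793.209, 2793.139]


def ppm_difference(reference_mhz, measured_mhz):
    """Relative difference between two calibrations, in parts per million."""
    return (measured_mhz - reference_mhz) / reference_mhz * 1e6


reference = detected_mhz[0]
shifts_ppm = [ppm_difference(reference, mhz) for mhz in detected_mhz]

for mhz, ppm in zip(detected_mhz, shifts_ppm):
    print(f"{mhz:.3f} MHz -> {ppm:+.2f} PPM relative to the first boot")
```

Under that assumption the second boot differs from the first by roughly 26 PPM, which is of the same order as the whole drift value reported elsewhere in this thread, so a drift file saved before a reboot could easily be wrong by that much afterwards.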
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Harlan Stenn wrote: Hey Danny! In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] writes: Unruh Unfortunately I cannot run both ntp and chrony on the same system at Unruh the same time. Bill, Exactly why can you not run ntpd and chrony on the same system at the same time? Harlan, really. You *cannot* have two different mechanisms/applications to discipline the clock at the same time. I invite you to try. You have access to my code so you can test this easily. I want the ability to run multiple instances of ntpd where at most 1 instance of ntpd is actually controlling the clock, specifically to make it easy to (more quickly) analyze the performance/behavior of different configurations of ntpd. I understand that the boat is rocking while this is going on, but I suspect this capability would be a useful one in at least some cases. I don't see the benefit of doing this with two separate instances. It's easier and simpler to just add the other servers into the one instance and specify noselect. Danny ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Danny Mayer) writes: Danny Harlan Stenn wrote: Unruh Unfortunately I cannot run both ntp and chrony on the same system at Unruh the same time. Bill, Exactly why can you not run ntpd and chrony on the same system at the same time? Danny Harlan, really. You *cannot* have two different Danny mechanisms/applications to discipline the clock at the same time. I Danny invite you to try. You have access to my code so you can test this Danny easily. You are, as is so often the case, missing my point. It is possible to run ntpd in a way that it does not discipline the clock. I am curious about your last sentence though - what is special about your code that would allow this to be tested? I want the ability to run multiple instances of ntpd where at most 1 instance of ntpd is actually controlling the clock, specifically to make it easy to (more quickly) analyze the performance/behavior of different configurations of ntpd. I understand that the boat is rocking while this is going on, but I suspect this capability would be a useful one in at least some cases. Danny I don't see the benefit of doing this with two separate Danny instances. It's easier and simpler to just add the other servers into Danny the one instance and specify noselect. Again you are missing my point. Allowing this would let us, for example, see how two different versions of ntpd would discipline the clock. It would allow us to see how ntpd might discipline the clock compared to chrony. I understand and get that by not actually disciplining the clock we are removing an important part of the feedback loop, and I do not know if that will fatally affect these sort of experiments or not. And as Bill said, it would be Swell if there was a way to do this using, eg, virtual machines so that we could test them that way. Better yet, it would be nice to have a simulator framework where we could run these tests faster than in real-time. 
-- Harlan Stenn [EMAIL PROTECTED] http://ntpforum.isc.org - be a member!
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
[EMAIL PROTECTED] (David Woolley) writes: In article [EMAIL PROTECTED], Bill Unruh [EMAIL PROTECTED] wrote: Offset error: NTP: Mean=-3.1usec, Std Dev=63.1usec If offset is the value reported by ntpq, please note that, when ntpd is locked up, this is an indication of the instantaneous measurement error, the actual error in the local time should be more stable (there may be systematic error) by one or two orders of magnitude. No, the offset is the value reported in loopstats. More generally though, Dave Mills really needs to get in here and defend his clock discipline algorithm, and the Chrony developer needs to defend theirs. Arguing the cases by proxy isn't particularly satisfactory. This is not arguing by proxy, this is running experiments. As I know, since I am a physicist, experiment trumps theory always. Dave, please remember that what tends to concern people about the algorithm is not the behaviour in response to gaussian phase noise, but its behaviour in response to transients, in particular startup transients. (Personally I would say that lost clock tick transients should be fixed at source, but Bill Unruh would also like it to tolerate those well.) Chrony: Mean=-1.5 usec, Std Dev=20.1usec Given the way that I understand it works, I think this is the actual correction applied on that sample. No, this is offset as measured by the ntp procedure ( (t1+t4-t2-t3)/2 ) Rate fluctuation: NTP:Mean=25.32 Std Dev=.078 (PPM) Chrony: Mean=25.26 Std Dev=.091 (PPM) Running it for a longer time, the standard deviation of the rate for ntp has dropped to about .020PPM, which is much better than chrony's. The means depend on the hardware, and, as long as they are within the order of one standard deviation of each other, they are as good as each other. Yes, I agree that the mean rates are the same. It is the standard deviation that is important here. 
Ie, ntp seems to be much better (smaller fluctutation) than chrony here, at the expense of much worst offset control (which makes sense if the rate fluctuations are real-- ie, I can make chrony's rate fluctuations much smaller by i running averaging the rates over a couple of hours but that will make the offset deviation increase. I guess it depends on which you consider more important, and accurate rate, or an accurate clock. From the point of view of another machine, chrony will have episodes where the frequency changes much more, as it applies the phase correction. ??? These are done on the same machine. If you mean that the real drift rate of the computer changes, then chrony's rate will change, then I would hope that that happens. Remember that this is not comparing two different machines, but the same machine at two different times. And yes, the physical events could have changed between the two. It would be nice if one could do a simulation-- put them both on some virtual machine and feed in exactly the same real clock drift changes, and use some model of the noise ( measurement, transmission, etc) so one could provide the two algorithms with exactly the same data to work with. But neither chrony nor ntp are set up for that. over the weekend, and chrony encompassed the weekdays when the grad students use the computer) the offset control by chrony was a factor of 3 better than by ntp. If the figures are the actual correction for chrony and the sample error for ntpd and Dave Mills is correct about the phase noise rejection of the ntpd filter being a couple of orders of magnitude, ntpd might actually be 30 times better. Nope, the figures are the actual samples as measured by chrony, and the processed output from ntp as reported in loopstats-- whatever that figure is. Ie, if processing makes a difference then the advantage lies with ntp. 
But I just checked using the offset reported in peerstats (choosing only the packets from the one local server), and I get the same result as from loopstats. I.e., both the results for ntp and for chrony are the raw offsets, and chrony's are about 2.2 times better than ntp's. So, chrony, at least in this one test, controls the offsets of the clock much better, at the expense of worse consistency in the frequency. It also reacts much faster to gross changes in the time (e.g. startup with no drift file).
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
[EMAIL PROTECTED] (David Woolley) writes: In article [EMAIL PROTECTED], Unruh [EMAIL PROTECTED] wrote: No, the offset is the value reported in loopstats. Same thing. If chrony is reporting the same measurements, neither set of measurements is particularly valid. You need to measure the actual I am sorry but I do not understand what you are saying. The best estimate of the time error of the clock is the measurement that you make of that error. Now, you might argue that if the drift never changed, and the clock never changed then one could get a better estimate by averaging the measurements. But the hypothesis that the time reported by the ntp process is the true time plus random uncorrelated errors is simply wrong, as looking at the plot of the offsets will rapidly convince you. The offsets oscillate with a period on the order of an hour or so. Chrony does this. ntp does this. The errors are NOT gaussian uncorrelated random errors. Thus most of that error budget is in such correlated errors, and ntp does NOT do many orders of magnitude better ( even with uncorrelated random errors you would need to average 100 samples-- collected over 10 hours at poll 7 to get one order of magnitude, and by then the drift errors would have gotten you.) Anyway, I am comparing like with like in the two programs. Chrony is much better in offsets, which implies that at least half of the error in ntp is internal error. Ie, it is errors which do NOT average out. ( and I do not believe that chrony's errors are the minimal uncorrelated random errors either.) offsets, using something that has a repeatability a couple of orders Of course that is the best way. Unfortunately I do not have that. I might extend the line running the main server to also give me the true offsets for the machine. However one can also get an estimate of the errors by looking at the measured offsets using the ntp exchange. of magnitude better. 
Certainly for ntpd, offset should be much larger than the error, when locked. Is the server running ntpd? I do not believe this. Yes, the server is running ntp and its offset errors are of the order of 3usec-- again correlated as you can see from the string graph near the bottom of the page. For example, if I take a 10 element running average and subtract it from the raw output of ntp for the server, the standard deviation goes from 3usec to .5usec. I.e., the errors are highly correlated. Averaging may be able to get rid of that .5usec, but not the rest of the standard deviation (3usec) which is some sort of highly correlated noise. If the errors were really uncorrelated random errors, then subtracting off the running average would make no (well, little) difference to the standard deviation (it would decrease it by something like sqrt((N-1)/N), where N is the length of the running average). I.e., it is simply not true that the measured offsets reported by ntp, or chrony, are simply some independent gaussian random process around the true time. Anyway, as I said, arguing by proxy is difficult and I'm rather hoping that Dave Mills will take over. Certainly it is Dave Mills you have to convince if ntpd is going to change. I do not know if I am trying to convince. I am trying to report the outcomes of some experiments. Now if one (Mills) wants ntp to behave differently than the experiments show it does, then I guess he will change it. If not, then not. All I say is that the experiments I have carried out show that ntp is slow to converge if it starts off badly, and leaves the offset scatter larger than chrony does. It does have a smaller scatter in the rate. One of the great advantages of two different people, Mills and Curnow, trying to implement the same ideas in different ways is that one can learn by studying the difference between their results.
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh wrote:

correction applied on that sample.

No, this is offset as measured by the ntp procedure ( (t1+t4-t2-t3)/2 )

No, that's wrong. It is very carefully described in the NTPv4 draft section 8 (p27):

theta = T(B) - T(A) = 1/2 * [(T2-T1) + (T3-T4)]

Not only do you have the wrong sign, the differences must be calculated first, otherwise the errors in the calculation overwhelm the resulting value. That's why it's written the way it is.

Danny
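For readers following the formula, here is a small sketch of the offset calculation with the differences taken first, as the draft writes it. It is not code from ntpd or chrony; the function names and the example timestamps are hypothetical, chosen so that the arithmetic is exact in binary floating point. T1 is the client's transmit time, T2 the server's receive time, T3 the server's transmit time and T4 the client's receive time.

```python
def ntp_offset(t1, t2, t3, t4):
    """theta = 1/2 * [(T2 - T1) + (T3 - T4)], differences taken first."""
    return ((t2 - t1) + (t3 - t4)) / 2.0


def ntp_delay(t1, t2, t3, t4):
    """Round-trip delay: time away from the client minus server hold time."""
    return (t4 - t1) - (t3 - t2)


# Hypothetical exchange: the server clock is 0.5 s ahead of the client,
# each network leg takes 0.25 s, and the server holds the packet for 0.125 s.
t1 = 100.0      # client clock, request sent
t2 = 100.75     # server clock, request received (100.25 true + 0.5 ahead)
t3 = 100.875    # server clock, reply sent
t4 = 100.625    # client clock, reply received

offset = ntp_offset(t1, t2, t3, t4)
delay = ntp_delay(t1, t2, t3, t4)

# The expression quoted earlier in the thread is the same quantity with the
# opposite sign convention.
flipped = (t1 + t4 - t2 - t3) / 2.0

print(offset, delay, flipped)
```

Taking the two differences before adding them keeps the intermediate values small compared with the raw timestamps, which is the rounding concern raised above. The two forms differ only in sign, so any statistic built on squares, such as a standard deviation, comes out the same either way.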
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
Unruh wrote:

All I say is that the experiments I have carried out show that ntp is slow to converge if it starts off badly, and leaves the offset scatter larger than chrony does. It does have a smaller scatter in the rate.

But you are using an extremely old version of ntp, and things have radically changed since that version was released. Try rerunning your experiments with ntp 4.2.4 and see what you get then. You also need to fix your calculations if you are going to get good results, as I mentioned in a previous message.

Danny
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
[EMAIL PROTECTED] (Danny Mayer) writes:

Unruh wrote:

All I say is that the experiments I have carried out show that ntp is slow to converge if it starts off badly, and leaves the offset scatter larger than chrony does. It does have a smaller scatter in the rate.

But you are using an extremely old version of ntp, and things have radically changed since that version was released. Try rerunning your experiments with ntp 4.2.4 and see what you get then. You also need to fix your calculations if you are going to get good results, as I mentioned in a previous message.

Most of the standard deviation results are with 4.2.4. Only the startup was with 4.2.0. Are you saying that things have radically changed in the handling of the startup? After I collect more data on steady state, I will rerun startups, both with no drift file and with a bad drift file, to see how fast the convergence is with 4.2.4.
Re: [ntp:questions] NTP vs chrony comparison (Was: oscillations in ntp clock synchronization)
[EMAIL PROTECTED] (Danny Mayer) writes:

Unruh wrote:

correction applied on that sample.

No, this is offset as measured by the ntp procedure ( (t1+t4-t2-t3)/2 )

No, that's wrong. It is very carefully described in the NTPv4 draft section 8 (p27):

theta = T(B) - T(A) = 1/2 * [(T2-T1) + (T3-T4)]

Not only do you have the wrong sign, the differences must be calculated first, otherwise the errors in the calculation overwhelm the resulting value. That's why it's written the way it is.

I was not writing code. I was telling you what time difference I was referring to. And the sign is a convention: whether positive means the computer is fast or the external source is fast. Everything I talked about is sign independent (standard deviation uses squares), and the difference is the one reported by ntp or chrony, both of which are careful to do the calculations with as high an accuracy as possible.