Re: [chrony-users] Possible bug in PPS support
Miroslav Lichvar wrote:
> On Tue, Oct 24, 2017 at 11:14:21PM +0200, Rob Janssen wrote:
> > I am now monitoring the Root dispersion and this appears to work OK, after some tweaking of the threshold value. The reference time unfortunately is in a format that is not easy to check for "being recent" in a simple script, it would be nice if there was a "seconds since epoch" field as well (as there is in ntpd/ntpq).
>
> With the -c option, which is available in newer chrony versions, the reference timestamp is printed in "seconds since epoch".
>
> $ chronyc -c tracking | awk -F , '{ print $4 }'
> 1508912704.491908798

Thanks! I have updated to 3.2 but not re-read the manpage. This format is much easier to parse in our monitoring plugin, I'll rework it to use this feature.

Rob

--
To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject.
For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject.
Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-users] Possible bug in PPS support
On Tue, Oct 24, 2017 at 11:14:21PM +0200, Rob Janssen wrote:
> I am now monitoring the Root dispersion and this appears to work OK, after some tweaking of the threshold value. The reference time unfortunately is in a format that is not easy to check for "being recent" in a simple script, it would be nice if there was a "seconds since epoch" field as well (as there is in ntpd/ntpq).

With the -c option, which is available in newer chrony versions, the reference timestamp is printed in "seconds since epoch".

$ chronyc -c tracking | awk -F , '{ print $4 }'
1508912704.491908798

--
Miroslav Lichvar
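A monitoring plugin could use that field to test whether the reference time is recent. The Python sketch below is one possible way to do it. Only the position of the reference time (the fourth comma-separated field) comes from the awk command above; the function name `reference_age`, the first three fields of the sample line and the `now` value are made up for illustration.

```python
import time


def reference_age(csv_line, now=None):
    """Return seconds elapsed since the reference time found in one
    line of `chronyc -c tracking` output.

    The reference time is read from the 4th comma-separated field,
    matching the awk '{ print $4 }' command quoted in the thread.
    """
    fields = csv_line.strip().split(",")
    ref_time = float(fields[3])
    if now is None:
        now = time.time()
    return now - ref_time


# Hypothetical, truncated sample line: only the 4th field is taken
# from the thread, the other fields are placeholders.
sample = "50505300,PPS,1,1508912704.491908798"
age = reference_age(sample, now=1508912764.5)
print(f"last clock update {age:.1f} s ago")
```

In a real plugin the line would come from running `chronyc -c tracking`, and the age would be compared against whatever "recent" means for the application.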
Re: [chrony-users] Possible bug in PPS support
Miroslav Lichvar wrote:
> I think the best approach for checking the accuracy of the clock is to monitor the root delay+dispersion. That's the estimated maximum error of the clock. If you really wanted to make sure an update of the clock was made in the last X seconds, you can check the reference time.

I am now monitoring the Root dispersion and this appears to work OK, after some tweaking of the threshold value. The reference time unfortunately is in a format that is not easy to check for "being recent" in a simple script, it would be nice if there was a "seconds since epoch" field as well (as there is in ntpd/ntpq).

But well, it looks like the dispersion increases rapidly when there is no PPS reference and this is much like what I require. (after all, the same is happening to the uncertainty of the time for our application)

Thanks for the hint!

Rob
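As a rough illustration of alerting on that estimate, the Python sketch below combines root delay and root dispersion into one error bound and maps it to a Nagios-style exit code. It assumes the usual NTP convention of counting half the root delay (the message above simply says "root delay+dispersion"). The function names, the placeholder root delay and the 20 us limit are illustrative and not taken from any existing plugin.

```python
OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def estimated_max_error(root_delay, root_dispersion):
    """Upper bound on the clock error, in seconds.

    Assumption: half the root delay plus the root dispersion (the NTP
    "root distance"). Replace with root_delay + root_dispersion for a
    more conservative bound.
    """
    return root_delay / 2.0 + root_dispersion


def check_max_error(root_delay, root_dispersion, limit):
    """Return a Nagios-style status for the given error limit (seconds)."""
    error = estimated_max_error(root_delay, root_dispersion)
    return CRITICAL if error > limit else OK


# The root dispersion is the value Rob reports elsewhere in the thread,
# a couple of minutes after the PPS was removed. The root delay is an
# assumed placeholder.
status = check_max_error(1e-9, 0.000662125, limit=20e-6)
print("CRITICAL" if status == CRITICAL else "OK")
```

The limit would still need tuning against real data, as Rob notes about his threshold.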
Re: [chrony-users] Possible bug in PPS support
On Mon, 23 Oct 2017, Bill Unruh wrote: If that is all you want, then you could look at the "refclock" log and see Sorry. That's refclocks.log when the last successful input came in. If it is more than say 15 min ago then the reach would be down to 0 and the refclock would have stopped. Or you could run chronyc in a cron, and use the sources and look at the reach and if it was 0 hit an error flag. William G. Unruh __| Canadian Institute for| Tel: +1(604)822-3273 Physics _|___ Advanced Research _| Fax: +1(604)822-5324 UBC, Vancouver,BC _|_ Program in Cosmology | un...@physics.ubc.ca Canada V6T 1Z1 | and Gravity __|_ www.theory.physics.ubc.ca/ On Mon, 23 Oct 2017, Rob Janssen wrote: Bill Unruh wrote: On Mon, 23 Oct 2017, Rob Janssen wrote: Bill Unruh wrote: If you really need 20usec, then relying on one gps is certainly a bad decision. You should have two or three machines all with independent gps sources so you could catch one of them going rogue, or quitting. The GPSDOs we are using are 2-3 orders of magnitude better than that. These are not your typical $50 modules, but professional GPSDO with OCXO or better oscillator. It is not the accuracy of the individual gps but the the fallback in case one of them goes mad (as happened to you). You do not want them on the same machine unless they have hardware timestamping, since the interrupt latency is far larger than 1us for servicing each interrupt. Again you are wandering away from the topic Bill! The discussion is about detection of a possible problem, not about availability. I did not specify availability of the system, it may well be down when there is a component failure, but we only want to know about it. Monitoring of their accuracy is done by their owners, we only get the signal via distribution amplifiers. That is why we would prefer to have some additional validation, like the PPS signal completely missing. 
(which could also be caused by a mistakenly unplugged or cut cable, which would never be detected by the GPSDO monitoring) As I said, you could do that with a cron job every 5 min cheching. We already have a comprehensive monitoring system based on Nagios, that in case of this service uses "chronyc -h host tracking" to regularly retrieve the status of chrony and alerts responsible people when something is wrong. The issue is that it monitors "stratum" and "last offset" and it failed to trigger when the PPS signal went away, even after 13 hours. It would have triggered when stratum went above 1 or last offset above 20us, but it didn't. Both of these values remain frozen when there is no PPS. That is the issue I want to rectify, but that won't happen when I discuss with you. Fortunately there is Miroslav who gave me useful hints. Rob -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org. -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-users] Possible bug in PPS support
If that is all you want, then you could look at the "refclock" log and see when the last successful input came in. If it is more than say 15 min ago then the reach would be down to 0 and the refclock would have stopped. Or you could run chronyc in a cron, and use the sources and look at the reach and if it was 0 hit an error flag. William G. Unruh __| Canadian Institute for| Tel: +1(604)822-3273 Physics _|___ Advanced Research _| Fax: +1(604)822-5324 UBC, Vancouver,BC _|_ Program in Cosmology | un...@physics.ubc.ca Canada V6T 1Z1 | and Gravity __|_ www.theory.physics.ubc.ca/ On Mon, 23 Oct 2017, Rob Janssen wrote: Bill Unruh wrote: On Mon, 23 Oct 2017, Rob Janssen wrote: Bill Unruh wrote: If you really need 20usec, then relying on one gps is certainly a bad decision. You should have two or three machines all with independent gps sources so you could catch one of them going rogue, or quitting. The GPSDOs we are using are 2-3 orders of magnitude better than that. These are not your typical $50 modules, but professional GPSDO with OCXO or better oscillator. It is not the accuracy of the individual gps but the the fallback in case one of them goes mad (as happened to you). You do not want them on the same machine unless they have hardware timestamping, since the interrupt latency is far larger than 1us for servicing each interrupt. Again you are wandering away from the topic Bill! The discussion is about detection of a possible problem, not about availability. I did not specify availability of the system, it may well be down when there is a component failure, but we only want to know about it. Monitoring of their accuracy is done by their owners, we only get the signal via distribution amplifiers. That is why we would prefer to have some additional validation, like the PPS signal completely missing. 
(which could also be caused by a mistakenly unplugged or cut cable, which would never be detected by the GPSDO monitoring) As I said, you could do that with a cron job every 5 min cheching. We already have a comprehensive monitoring system based on Nagios, that in case of this service uses "chronyc -h host tracking" to regularly retrieve the status of chrony and alerts responsible people when something is wrong. The issue is that it monitors "stratum" and "last offset" and it failed to trigger when the PPS signal went away, even after 13 hours. It would have triggered when stratum went above 1 or last offset above 20us, but it didn't. Both of these values remain frozen when there is no PPS. That is the issue I want to rectify, but that won't happen when I discuss with you. Fortunately there is Miroslav who gave me useful hints. Rob -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org. -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
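The Reach check suggested at the top of this message could look roughly like the Python sketch below. It parses one row of the human-readable `chronyc sources` listing, using the column order shown in the listings quoted in this thread (MS, Name/IP address, Stratum, Poll, Reach, LastRx, Last sample). The function names are hypothetical and the two sample rows are copied from the thread.

```python
def parse_reach(sources_line):
    """Return the Reach register of one `chronyc sources` row as an int.

    Reach is printed in octal: 377 means all of the last 8 polls
    produced a valid sample, 0 means none of them did.
    """
    columns = sources_line.split()
    return int(columns[4], 8)


def source_is_dead(sources_line):
    """True when none of the last 8 polls produced a valid sample."""
    return parse_reach(sources_line) == 0


lost = "#* PPS 0 4 0 13h -279ns[ -401ns] +/- 79ns"
good = "^- xx..xxx 1 10 377 250 +3462us[+3462us] +/- 10ms"
print(source_is_dead(lost), source_is_dead(good))
```

Run from cron or from a Nagios check, this would flag the lost PPS even while `tracking` still reports stratum 1.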
Re: [chrony-users] Possible bug in PPS support
Bill Unruh wrote: On Mon, 23 Oct 2017, Rob Janssen wrote: Bill Unruh wrote: If you really need 20usec, then relying on one gps is certainly a bad decision. You should have two or three machines all with independent gps sources so you could catch one of them going rogue, or quitting. The GPSDOs we are using are 2-3 orders of magnitude better than that. These are not your typical $50 modules, but professional GPSDO with OCXO or better oscillator. It is not the accuracy of the individual gps but the the fallback in case one of them goes mad (as happened to you). You do not want them on the same machine unless they have hardware timestamping, since the interrupt latency is far larger than 1us for servicing each interrupt. Again you are wandering away from the topic Bill! The discussion is about detection of a possible problem, not about availability. I did not specify availability of the system, it may well be down when there is a component failure, but we only want to know about it. Monitoring of their accuracy is done by their owners, we only get the signal via distribution amplifiers. That is why we would prefer to have some additional validation, like the PPS signal completely missing. (which could also be caused by a mistakenly unplugged or cut cable, which would never be detected by the GPSDO monitoring) As I said, you could do that with a cron job every 5 min cheching. We already have a comprehensive monitoring system based on Nagios, that in case of this service uses "chronyc -h host tracking" to regularly retrieve the status of chrony and alerts responsible people when something is wrong. The issue is that it monitors "stratum" and "last offset" and it failed to trigger when the PPS signal went away, even after 13 hours. It would have triggered when stratum went above 1 or last offset above 20us, but it didn't. Both of these values remain frozen when there is no PPS. That is the issue I want to rectify, but that won't happen when I discuss with you. 
Fortunately there is Miroslav who gave me useful hints.

Rob
Re: [chrony-users] Possible bug in PPS support
On Mon, 23 Oct 2017, Rob Janssen wrote: Bill Unruh wrote: If you really need 20usec, then relying on one gps is certainly a bad decision. You should have two or three machines all with independent gps sources so you could catch one of them going rogue, or quitting. The GPSDOs we are using are 2-3 orders of magnitude better than that. These are not your typical $50 modules, but professional GPSDO with OCXO or better oscillator. It is not the accuracy of the individual gps but the the fallback in case one of them goes mad (as happened to you). You do not want them on the same machine unless they have hardware timestamping, since the interrupt latency is far larger than 1us for servicing each interrupt. Monitoring of their accuracy is done by their owners, we only get the signal via distribution amplifiers. That is why we would prefer to have some additional validation, like the PPS signal completely missing. (which could also be caused by a mistakenly unplugged or cut cable, which would never be detected by the GPSDO monitoring) As I said, you could do that with a cron job every 5 min cheching. You seem to be saying that having no time source whatsoever is better than having one which may be off by 20us? I think you need to set out the real conditions that you need in detail ("We need accuracy to 20us" could be because it was a number that some administrator with absolutely no idea of time came up with, or it could be a legal requirement, or it could be "we should be able to do that" kind of requirement) The time is used for a single-channel simulcast transmitter system. That is, the same signal is transmitted from multiple locations on the same frequency at the same time. When this is not done within 20us at the same time, it will cause severe distortion of the signal. When we don't know we are within 20us, we prefer to not transmit at all, so disable that particular transmitter. 
OK then you should have redundancy on each transmitter, and monitoring eg via that cron job. I think I know better what is involved and what the limitations are than you do. Of course. But that is not what is at issue here. Also, I prefer to discuss with Miroslav, who concentrates on the problem under discussion rather than casting doubt on everything. Thank you for you input until now. ??? You are making claims. I ask for what your evidence is for those claims, and you have never given the evidence. Operating on false evidence is a sure way of making bad decision. I am not casting doubt on everything. I am trying to explain how chrony works and why it does what it does. Rob -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org. -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-users] Possible bug in PPS support
Bill Unruh wrote: If you really need 20usec, then relying on one gps is certainly a bad decision. You should have two or three machines all with independent gps sources so you could catch one of them going rogue, or quitting. The GPSDOs we are using are 2-3 orders of magnitude better than that. These are not your typical $50 modules, but professional GPSDO with OCXO or better oscillator. Monitoring of their accuracy is done by their owners, we only get the signal via distribution amplifiers. That is why we would prefer to have some additional validation, like the PPS signal completely missing. (which could also be caused by a mistakenly unplugged or cut cable, which would never be detected by the GPSDO monitoring) You seem to be saying that having no time source whatsoever is better than having one which may be off by 20us? I think you need to set out the real conditions that you need in detail ("We need accuracy to 20us" could be because it was a number that some administrator with absolutely no idea of time came up with, or it could be a legal requirement, or it could be "we should be able to do that" kind of requirement) The time is used for a single-channel simulcast transmitter system. That is, the same signal is transmitted from multiple locations on the same frequency at the same time. When this is not done within 20us at the same time, it will cause severe distortion of the signal. When we don't know we are within 20us, we prefer to not transmit at all, so disable that particular transmitter. I think I know better what is involved and what the limitations are than you do. Also, I prefer to discuss with Miroslav, who concentrates on the problem under discussion rather than casting doubt on everything. Thank you for you input until now. Rob -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? 
Email listmas...@chrony.tuxfamily.org.
Re: [chrony-users] Possible bug in PPS support
Bill Unruh wrote: On Mon, 23 Oct 2017, Miroslav Lichvar wrote: On Mon, Oct 23, 2017 at 10:24:54AM -0700, Bill Unruh wrote: On Mon, 23 Oct 2017, Rob Janssen wrote: You don't support my calculation that if the clock apparently wandered away 3400us Again, no evidence of that 3400 us. If I understand it correctly, 3.4ms was the offset of the NTP source My question is how he determined that the offset was 3.4 ms after 13 hours. Simply looking at the offset from the one of the ntp servers does not cut it. That is only 2 std dev from the mean. I pasted the output for a single server but the other 2 were within very short offset of that: MS Name/IP address Stratum Poll Reach LastRx Last sample === #* PPS 0 4 0 13h -279ns[ -401ns] +/- 79ns ^- xx..xxx 1 10 377 17m +3476us[+3476us] +/- 9930us ^- xx..xxx 1 10 377 250 +3462us[+3462us] +/- 10ms ^- xxx.xx..xxx 1 10 377 299 +3459us[+3459us] +/- 10ms I am confident that those offsets were correct, but as I mentioned I forgot to subtract the offset that was already there when the PPS sync was present (due to network delay asymmetry). So that part of his concern is certainly valid. On the other hand, being worried about the loss of connectivity on the 15 min time scale probably is not, unless he has evidence. But the evidence is all there in the measurement logs when the pps is running. He could use that to estimate what the skew is over a variety of time periods. Again, I am not interested in the performance when the clock is free-running as I do not believe that it is good enough for our application anyway. I am interested in monitoring/detecting that the clock is not synced to (recent) PPS input. Rob -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-users] Possible bug in PPS support
If you really need 20usec, then relying on one gps is certainly a bad decision. You should have two or three machines all with independent gps sources so you could catch one of them going rogue, or quitting. You seem to be saying that having no time source whatsoever is better than having one which may be off by 20us? I think you need to set out the real conditions that you need in detail ("We need accuracy to 20us" could be because it was a number that some administrator with absolutely no idea of time came up with, or it could be a legal requirement, or it could be "we should be able to do that" kind of requirement) They impliment a system which can with some confidence deliver that. There are of course no guarentees. A nuke over the building would severely degrade the accuracy of the clocks in a way that was totally unpredictable beforehand. Or a power failure. etc. William G. Unruh __| Canadian Institute for| Tel: +1(604)822-3273 Physics _|___ Advanced Research _| Fax: +1(604)822-5324 UBC, Vancouver,BC _|_ Program in Cosmology | un...@physics.ubc.ca Canada V6T 1Z1 | and Gravity __|_ www.theory.physics.ubc.ca/ On Mon, 23 Oct 2017, Rob Janssen wrote: Miroslav Lichvar wrote: On Mon, Oct 23, 2017 at 10:24:54AM -0700, Bill Unruh wrote: On Mon, 23 Oct 2017, Rob Janssen wrote: You don't support my calculation that if the clock apparently wandered away 3400us Again, no evidence of that 3400 us. If I understand it correctly, 3.4ms was the offset of the NTP source 13 hours after the PPS stopped working. The stddev of the NTP source from sourcestats is ~50 microseconds, so if the offset was originally better than 1.6ms (there are 3 different sources in the original report with -0.1ms, 1.5ms, 1.5ms offsets), it drifted at least by ~1.8ms in that time. Yes, I forgot that there was a systematic 1.4 ms offset at the time the PPS sync was active so after it ran unsynced for 13h and had a 3.4 ms offset the drift was more like 2ms instead of 3.4ms. 
However, that still is 2 orders of magnitude more than we can allow. So we certainly need to alert on this condition, we cannot just freewheel for 13 hours and assume the time is still accurate enough. I am now testing with the root delay/dispersion. A couple of minutes after the PPS has been removed, the root delay remains at 0.1 seconds but the root dispersion now has increased to 0.000662125 seconds. That certainly is a value that is immediately affected by the lack of sync, however I need to determine a threshold value for the monitoring alert. The tracking also shows "System time : 0.9 seconds fast of NTP time" but I cannot believe the time is still that accurate. I understand now that the 10ms value shown in "chronyc sources" is based on the 20ms roundtriptime of the network towards the NTP source. This time is quite constant as indicated by the low Std Dev but the fixed RTT apparently makes chrony believe the network is dodgy (as Bill expresses it). The only thing dodgy about it is that for this particular site there is a systematic offset in the propagation time from/towards the site of 1.4 ms resulting in the 1.4ms offset observed when PPS is available, probably caused by asymmetric routing. Other than that, it is quite stable. It is a network designed for distribution of audio and video to transmitter sites, well dimensioned with guaranteed bandwidth and not overloaded at any time. Rob -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org. -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-users] Possible bug in PPS support
On Mon, 23 Oct 2017, Miroslav Lichvar wrote: On Mon, Oct 23, 2017 at 10:24:54AM -0700, Bill Unruh wrote: On Mon, 23 Oct 2017, Rob Janssen wrote: You don't support my calculation that if the clock apparently wandered away 3400us Again, no evidence of that 3400 us. If I understand it correctly, 3.4ms was the offset of the NTP source My question is how he determined that the offset was 3.4 ms after 13 hours. Simply looking at the offset from the one of the ntp servers does not cut it. That is only 2 std dev from the mean. 13 hours after the PPS stopped working. The stddev of the NTP source from sourcestats is ~50 microseconds, so if the offset was originally better than 1.6ms (there are 3 different sources in the original report with -0.1ms, 1.5ms, 1.5ms offsets), it drifted at least by ~1.8ms in that time. If there was a significant change in the temperature, the error gained in 13 hours could be much larger. On one of my servers with PPS I see that the frequency offset can change by 0.5 ppm in just few seconds. I agree. and chrony PPS does a bad job of measuring that. Perhaps chrony should keep track of the drift over a much longer period than the measurement period (max 64 samples are 16 sec per sample is only about 15 min. so, keeping a list of the drift rate over say a day would give a much better feeling for the drift wander due to temp differences, etc. It is certainly true that the drift fluctuations are not guassian so an estimate derived from 15 min really gives a very poor estimate of the fluctuations on the time scale of hours or days. So that part of his concern is certainly valid. On the other hand, being worried about the loss of connectivity on the 15 min time scale probably is not, unless he has evidence. But the evidence is all there in the measurement logs when the pps is running. He could use that to estimate what the skew is over a variety of time periods. 
-- Miroslav Lichvar
Re: [chrony-users] Possible bug in PPS support
Miroslav Lichvar wrote: On Mon, Oct 23, 2017 at 10:24:54AM -0700, Bill Unruh wrote: On Mon, 23 Oct 2017, Rob Janssen wrote: You don't support my calculation that if the clock apparently wandered away 3400us Again, no evidence of that 3400 us. If I understand it correctly, 3.4ms was the offset of the NTP source 13 hours after the PPS stopped working. The stddev of the NTP source from sourcestats is ~50 microseconds, so if the offset was originally better than 1.6ms (there are 3 different sources in the original report with -0.1ms, 1.5ms, 1.5ms offsets), it drifted at least by ~1.8ms in that time. Yes, I forgot that there was a systematic 1.4 ms offset at the time the PPS sync was active so after it ran unsynced for 13h and had a 3.4 ms offset the drift was more like 2ms instead of 3.4ms. However, that still is 2 orders of magnitude more than we can allow. So we certainly need to alert on this condition, we cannot just freewheel for 13 hours and assume the time is still accurate enough. I am now testing with the root delay/dispersion. A couple of minutes after the PPS has been removed, the root delay remains at 0.1 seconds but the root dispersion now has increased to 0.000662125 seconds. That certainly is a value that is immediately affected by the lack of sync, however I need to determine a threshold value for the monitoring alert. The tracking also shows "System time : 0.9 seconds fast of NTP time" but I cannot believe the time is still that accurate. I understand now that the 10ms value shown in "chronyc sources" is based on the 20ms roundtriptime of the network towards the NTP source. This time is quite constant as indicated by the low Std Dev but the fixed RTT apparently makes chrony believe the network is dodgy (as Bill expresses it). 
The only thing dodgy about it is that for this particular site there is a systematic offset in the propagation time from/towards the site of 1.4 ms resulting in the 1.4ms offset observed when PPS is available, probably caused by asymmetric routing. Other than that, it is quite stable. It is a network designed for distribution of audio and video to transmitter sites, well dimensioned with guaranteed bandwidth and not overloaded at any time. Rob -- To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-users-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-users] Possible bug in PPS support
On Mon, Oct 23, 2017 at 10:24:54AM -0700, Bill Unruh wrote:
> On Mon, 23 Oct 2017, Rob Janssen wrote:
> > You don't support my calculation that if the clock apparently wandered away 3400us
>
> Again, no evidence of that 3400 us.

If I understand it correctly, 3.4ms was the offset of the NTP source 13 hours after the PPS stopped working. The stddev of the NTP source from sourcestats is ~50 microseconds, so if the offset was originally better than 1.6ms (there are 3 different sources in the original report with -0.1ms, 1.5ms, 1.5ms offsets), it drifted at least by ~1.8ms in that time.

If there was a significant change in the temperature, the error gained in 13 hours could be much larger. On one of my servers with PPS I see that the frequency offset can change by 0.5 ppm in just a few seconds.

--
Miroslav Lichvar
Re: [chrony-users] Possible bug in PPS support
On Mon, 23 Oct 2017, Rob Janssen wrote:
> Bill Unruh wrote:
>
> Ok but rather than "only a few hours" I would like to see "only a few minutes".

But that would be totally ridiculous. The offset of the local clock from UTC after even a few hours is still far far better than that from the network, and far better even than 20us. Remember what you want to know is how far the local clock is from UTC, not whether the local clock has heard from PPS in the past few minutes.

> You don't support my calculation that if the clock apparently wandered away 3400us

Again, no evidence of that 3400 us. From the evidence that chrony has, pps does NOT wander that badly in 13 hrs. Remember chrony constantly measures both the standard deviation in the offset AND in the rate. So it has a good estimate of how far the offset will wander in that time. And it is NOT 3400us. So you need to tell us how you measure that 3400us.

> after 13 hours, it would take about 5 minutes to wander 20us?

No, it would not. chrony has measured it, and it is not that much.

> I would think it is a best-case calculation as it assumes a linear drift in one direction. In practice it will probably wobble, and take less than 5 minutes to wander 20us. Please note we are talking MICROseconds here. Not MILLIseconds. I don't think many standard systems will remain within 20us for several hours if left without sync. (it would likely require some TCXO clock option)

Sure they could, if the temp is constant, as you claim. That is the main cause of changes in drift rate.

> The Span indicated by sourcestats is 79 for the PPS source now, and 103m for the network sources. Would that mean it drops the PPS after 79 seconds? That would be fine.

No. You really need to think through what you want and what the time on your server machine delivers. After all, if the computer clock in your local machine was an exact track of UTC, always to attoseconds, and you used the GPS only to make the initial offset determination, then it would be silly to throw away that source just because the pps had not been heard from.

> We are not interested in "time that is likely a good estimation". We require accurate time and if

I am sorry, but nothing will give you "accurate time". Not even GPS. What it can give you is an estimate of the time and the accuracy of that estimate.

> we do not have it, or do not have certainty about it, we need to shut down our application. So we require some monitoring. Of course I can add monitoring of "sources" or "sourcestats" to the monitoring of "tracking" that we currently do, and alert when "Reach" of the PPS clock is zero. That is probably our quickest solution. However, I would have expected this error condition (missing PPS pulses) to be somehow reflected in the "tracking" output.

Why?

> Rob
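For reference, the "about 5 minutes" figure disputed in this message can be reproduced with the short Python calculation below. It is only the linear-drift arithmetic Rob describes (constant drift rate in one direction) and says nothing about how a real oscillator behaves; the function name is made up for this sketch.

```python
def seconds_to_drift(observed_drift_us, elapsed_s, limit_us):
    """Time for a clock to drift by limit_us, assuming the drift observed
    over elapsed_s accumulated at a constant rate (linear drift)."""
    rate_us_per_s = observed_drift_us / elapsed_s
    return limit_us / rate_us_per_s


thirteen_hours = 13 * 3600  # seconds
t = seconds_to_drift(3400, thirteen_hours, 20)
print(f"{t / 60:.1f} minutes to drift 20 us")
```

With the figure of about 2 ms that Rob arrives at later in the thread (after subtracting the systematic 1.4 ms offset), the same formula gives about 7.8 minutes.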
Re: [chrony-users] Possible bug in PPS support
William G. Unruh   |  Canadian Institute for  |  Tel: +1(604)822-3273
Physics            |  Advanced Research       |  Fax: +1(604)822-5324
UBC, Vancouver,BC  |  Program in Cosmology    |  un...@physics.ubc.ca
Canada V6T 1Z1     |  and Gravity             |  www.theory.physics.ubc.ca/

On Mon, 23 Oct 2017, Rob Janssen wrote:

> Bill Unruh wrote:
> > > 210 Number of sources = 4
> > > MS Name/IP address  Stratum Poll Reach LastRx  Last sample
> > > ========================================================================
> > > #* PPS                    0    4   377     24   +218ns[ +278ns] +/- 124ns
> > > ^- xx..xxx                1   10   377    877   -147us[ -122us] +/-  11ms
> > > ^- xx..xxx                1   10   377     14  +1480us[+1480us] +/-  10ms
> > > ^- xxx.xx..xxx            1   10   377    345  +1446us[+1447us] +/-  10ms
> > >
> > > However, recently at one site the PPS signal was lost, but chrony keeps "locked" to it:
> > >
> > > MS Name/IP address  Stratum Poll Reach LastRx  Last sample
> > > ========================================================================
> > > #* PPS                    0    4     0    13h   -279ns[ -401ns] +/-  79ns
> > > ^- xx..xxx                1   10   377    250  +3462us[+3462us] +/-  10ms
> > >
> > > As can be seen, it has been lost for 13 hours but it still has the * sign in the 2nd column. We are remotely monitoring these systems using chronyc tracking and it still indicated stratum 1 referenced to PPS. I would have expected it to drop back to using those network time servers after some time of not getting pulses (i.e. once "Reach" is 0) and the stratum to increase to 2. If it operated that way, we would have received an alert. Furthermore, the clock had drifted by 3.5ms by the time the above status was noticed, while when synchronized to network time it usually is within 1 to 1.5ms. So it really is not considering those network time sources anymore.
> >
> > Not sure what the above paragraph means. How do you know it has drifted by 3.5ms or 1 ms? I do not believe those figures, unless you meant 3.5us and 1usec. Unless by remote monitoring you mean really, really remote, with a dodgy network in between.
>
> Look in the above stats: it usually is at about 1.5ms (14xx us) from the network time sources, and when the error condition occurred, it was at 3462us offset. There is a network between the source and the system, but it isn't dodgy.

Yes, it is. Note that it is saying that the standard deviation is 10ms. That one particular measurement was only off by 1.5ms does not tell one anything. The standard deviation tells much more. And if it is off by 1.5 ms, that is still 1 times worse than the PPS.

> > Was this a test, by the way, where you unplugged the GPS from the machine? Otherwise figuring out why the GPS PPS was lost for that period of time is probably the first thing to do.
>
> We know what happened: the GPSDO went defective so there were no PPS pulses anymore. (and also no 10 MHz reference, which we need in another part of the system)

That is of course a different issue. And seeing no 10MHz reference is surely something you can test for elsewhere.

> What I would like to see is handling of the error condition. Of course it is

The purpose of chrony is to discipline the local clock, not to test GPS receivers. You could run a cron job which looks at the PPS reach every 5 min and, if it finds it has dropped to 0, does something like letting you know your GPS has problems. But why should that be chrony's job? It is giving you the best estimate of UTC it can given the data. I certainly would not want it giving me worse estimates.

> understandable that there is no time syncing when there are no PPS pulses, but the condition

Sure there is. You can still use the past info from PPS to sync the current clock.

> should be visible. (e.g. by the stratum increasing and/or the source changing)
>
> > Miroslav is better placed to figure out what is happening within chrony when it loses PPS input. Given the uncertainty in the rate as estimated from the PPS, the PPS from 13 hrs ago is still probably a better estimate of the current time than is the network time from the other systems.
>
> It isn't! Network time from the other systems would be about 1500us out, time was now 3400us out.

No idea what you mean. As I said, I have seen no evidence about how you determined those figures.

> However, that is not the main point.
>
> > Remember that they are at poll 10, which is 1000 seconds or so (about 15 min), so the network time sources have not had that many "measurements" in that time interval and those are pretty crappy (10ms std dev, which is really huge). The PPS std dev is in the ns range, about 1 times better.
>
> I don't think the shown output in the last column of "chronyc sources" is the stddev. Right now that column still indicates 10ms, but when I use "chronyc sourcestats" the last column actually has a header Std Dev and the values are around 40-60us.
>
> > So the PPS is still, even 13 hrs later, a better
Re: [chrony-users] Possible bug in PPS support
Miroslav Lichvar wrote:

> One point I forgot to make is that even if chronyd reselected immediately after the reach value of the PPS refclock got to 0, like ntpd does, checking stratum or selected source wouldn't be a reliable way to monitor the accuracy, because the reselection wouldn't happen if the NTP source was down too.

Ok, I will experiment with watching the root delay and root dispersion and see how they behave when removing PPS on my test system. At the moment (after being locked for 8 hours or so) it shows:

Root delay      : 0.1 seconds
Root dispersion : 0.10389 seconds

Rob
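A threshold check on root delay plus root dispersion might look like the sketch below. It assumes the two values have already been read from "chronyc tracking" as seconds; summing them follows the "root delay+dispersion" suggestion made earlier in this thread, and the 20us default is the accuracy requirement Rob stated. The function names are invented for illustration.

```python
def estimated_max_error(root_delay_s: float, root_dispersion_s: float) -> float:
    """Estimated maximum clock error in seconds, taken as delay + dispersion."""
    return root_delay_s + root_dispersion_s


def within_tolerance(root_delay_s: float, root_dispersion_s: float,
                     limit_s: float = 20e-6) -> bool:
    """True while the estimated maximum error stays at or below the limit."""
    return estimated_max_error(root_delay_s, root_dispersion_s) <= limit_s


# Example with made-up readings: 1 ns root delay, 10 us root dispersion
print(within_tolerance(1e-9, 10e-6))
```

Because dispersion keeps growing while no new samples arrive, a check like this should eventually trip after the PPS is lost, without depending on which source is selected.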
Re: [chrony-users] Possible bug in PPS support
On Mon, Oct 23, 2017 at 07:01:30PM +0200, Rob Janssen wrote:
> Miroslav Lichvar wrote:
> >
> > The last skew was 14 ppb, so it would take about 8 days to accumulate
> > 10 milliseconds worth of dispersion.
>
> Can you explain where the 10ms comes from? I know it is displayed in the
> "sources" output, but how is it calculated? It is way above the StdDev
> indicated in the "sourcestats".
> And of course it is also way above our usual accuracy.

It includes the root delay and distance. Check "chronyc ntpdata". Most of that is probably the round-trip time to the server.

One point I forgot to make is that even if chronyd reselected immediately after the reach value of the PPS refclock got to 0, like ntpd does, checking stratum or selected source wouldn't be a reliable way to monitor the accuracy, because the reselection wouldn't happen if the NTP source was down too.

--
Miroslav Lichvar
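The "about 8 days" figure quoted above can be checked with a short calculation. This is a sketch that assumes the error simply grows linearly at the skew rate, which appears to be what the quoted estimate does; the function name is made up.

```python
def seconds_to_accumulate(error_s: float, rate_ppb: float) -> float:
    """Time in seconds for an error growing at rate_ppb (parts per billion)
    to reach error_s seconds, assuming purely linear growth."""
    return error_s / (rate_ppb * 1e-9)


# 10 ms of dispersion at a skew of 14 ppb
days = seconds_to_accumulate(10e-3, 14.0) / 86400
print(f"{days:.1f} days")
```

With these inputs the result is a little over 8 days, consistent with the estimate in the message.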
Re: [chrony-users] Possible bug in PPS support
Miroslav Lichvar wrote:

> The last skew was 14 ppb, so it would take about 8 days to accumulate 10 milliseconds worth of dispersion.

Can you explain where the 10ms comes from? I know it is displayed in the "sources" output, but how is it calculated? It is way above the StdDev indicated in the "sourcestats". And of course it is also way above our usual accuracy.

Rob
Re: [chrony-users] Possible bug in PPS support
Bill Unruh wrote:

> > Ok but rather than "only a few hours" I would like to see "only a few minutes".
>
> But that would be totally ridiculous. The offset of the local clock from UTC after even a few hours is still far, far better than that from the network, and far better even than 20us. Remember, what you want to know is how far the local clock is from UTC, not whether or not the local clock has heard from PPS in the past few minutes.

You don't support my calculation that if the clock apparently wandered away 3400us after 13 hours, it would take about 5 minutes to wander 20us? I would think it is a best-case calculation as it assumes a linear drift in one direction. In practice it will probably wobble, and take less than 5 minutes to wander 20us. Please note we are talking MICROseconds here. Not MILLIseconds. I don't think many standard systems will remain within 20us for several hours if left without sync. (it would likely require some TCXO clock option)

> > The Span indicated by sourcestats is 79 for the PPS source now, and 103m for the network sources. Would that mean it drops the PPS after 79 seconds? That would be fine.
>
> No. You really need to think through what you want and what the time on your server machine delivers. After all, if the computer clock in your local machine were an exact track of UTC, always to attoseconds, and you used the GPS only to make the initial offset determination, then it would be silly to throw away that source just because the PPS had not been heard from.

We are not interested in "time that is likely a good estimation". We require accurate time and if we do not have it, or do not have certainty about it, we need to shut down our application. So we require some monitoring. Of course I can add monitoring of "sources" or "sourcestats" to the monitoring of "tracking" that we currently do, and alert when "Reach" of the PPS clock is zero. That is probably our quickest solution. However, I would have expected this error condition (missing PPS pulses) to be somehow reflected in the "tracking" output.

Rob
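The linear-drift calculation referred to in this message can be written out as follows. It is a sketch of the arithmetic only; whether the 3400us figure really represents clock drift is exactly what is disputed in the thread. The function name is invented.

```python
def time_to_drift(observed_drift_s: float, observed_span_s: float,
                  target_drift_s: float) -> float:
    """Seconds needed to drift by target_drift_s, assuming the constant rate
    implied by observed_drift_s accumulated over observed_span_s."""
    rate = observed_drift_s / observed_span_s  # seconds of drift per second
    return target_drift_s / rate


# 3400 us observed after 13 hours; how long until 20 us at that rate?
seconds = time_to_drift(3400e-6, 13 * 3600, 20e-6)
print(f"{seconds / 60:.1f} minutes")
```

This gives a bit under 5 minutes, matching the "about 5 minutes" in the message, and corresponds to an implied rate of roughly 73 ppb.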
Re: [chrony-users] Possible bug in PPS support
...

> > > Is it to be considered a bug, or is this just a design feature?
> >
> > It's a feature, but there is apparently a bug which may make the switch take much longer than it should.
>
> However, we use this form of time synchronization because we need the clock to be within about 20us of real time. When the PPS sync is lost and only network sync is achieved, that is not really attainable. So we need some indication whenever there is no PPS sync. Would it not be reasonable to indicate loss of PPS sync when the Reach value becomes zero? Ok, it could be that freewheeling keeps a more accurate time than syncing to another source, but at least the error condition should be monitored.
>
> > > How could we work around that in this case?
> >
> > Decreasing the maximum number of samples of the NTP source with the maxsamples option should reduce the maximum span (as reported in sourcestats) and also the time it will take to switch from unreachable sources. Increasing the maxclockerror would do that too if it was included in the source selection. Even with the default value it would take only a few hours to switch in your case.
>
> Ok but rather than "only a few hours" I would like to see "only a few minutes".

But that would be totally ridiculous. The offset of the local clock from UTC after even a few hours is still far, far better than that from the network, and far better even than 20us. Remember, what you want to know is how far the local clock is from UTC, not whether or not the local clock has heard from PPS in the past few minutes.

> The Span indicated by sourcestats is 79 for the PPS source now, and 103m for the network sources. Would that mean it drops the PPS after 79 seconds? That would be fine.

No. You really need to think through what you want and what the time on your server machine delivers. After all, if the computer clock in your local machine were an exact track of UTC, always to attoseconds, and you used the GPS only to make the initial offset determination, then it would be silly to throw away that source just because the PPS had not been heard from.

> Rob
Re: [chrony-users] Possible bug in PPS support
Bill Unruh wrote:

> > 210 Number of sources = 4
> > MS Name/IP address  Stratum Poll Reach LastRx  Last sample
> > ========================================================================
> > #* PPS                    0    4   377     24   +218ns[ +278ns] +/- 124ns
> > ^- xx..xxx                1   10   377    877   -147us[ -122us] +/-  11ms
> > ^- xx..xxx                1   10   377     14  +1480us[+1480us] +/-  10ms
> > ^- xxx.xx..xxx            1   10   377    345  +1446us[+1447us] +/-  10ms
> >
> > However, recently at one site the PPS signal was lost, but chrony keeps "locked" to it:
> >
> > MS Name/IP address  Stratum Poll Reach LastRx  Last sample
> > ========================================================================
> > #* PPS                    0    4     0    13h   -279ns[ -401ns] +/-  79ns
> > ^- xx..xxx                1   10   377    250  +3462us[+3462us] +/-  10ms
> >
> > As can be seen, it has been lost for 13 hours but it still has the * sign in the 2nd column. We are remotely monitoring these systems using chronyc tracking and it still indicated stratum 1 referenced to PPS. I would have expected it to drop back to using those network time servers after some time of not getting pulses (i.e. once "Reach" is 0) and the stratum to increase to 2. If it operated that way, we would have received an alert. Furthermore, the clock had drifted by 3.5ms by the time the above status was noticed, while when synchronized to network time it usually is within 1 to 1.5ms. So it really is not considering those network time sources anymore.
>
> Not sure what the above paragraph means. How do you know it has drifted by 3.5ms or 1 ms? I do not believe those figures, unless you meant 3.5us and 1usec. Unless by remote monitoring you mean really, really remote, with a dodgy network in between.

Look in the above stats: it usually is at about 1.5ms (14xx us) from the network time sources, and when the error condition occurred, it was at 3462us offset. There is a network between the source and the system, but it isn't dodgy.

> Was this a test, by the way, where you unplugged the GPS from the machine? Otherwise figuring out why the GPS PPS was lost for that period of time is probably the first thing to do.

We know what happened: the GPSDO went defective so there were no PPS pulses anymore. (and also no 10 MHz reference, which we need in another part of the system) What I would like to see is handling of the error condition. Of course it is understandable that there is no time syncing when there are no PPS pulses, but the condition should be visible. (e.g. by the stratum increasing and/or the source changing)

> Miroslav is better placed to figure out what is happening within chrony when it loses PPS input. Given the uncertainty in the rate as estimated from the PPS, the PPS from 13 hrs ago is still probably a better estimate of the current time than is the network time from the other systems.

It isn't! Network time from the other systems would be about 1500us out, time was now 3400us out. However, that is not the main point.

> Remember that they are at poll 10, which is 1000 seconds or so (about 15 min), so the network time sources have not had that many "measurements" in that time interval and those are pretty crappy (10ms std dev, which is really huge). The PPS std dev is in the ns range, about 1 times better.

I don't think the shown output in the last column of "chronyc sources" is the stddev. Right now that column still indicates 10ms, but when I use "chronyc sourcestats" the last column actually has a header Std Dev and the values are around 40-60us.

> So the PPS is still, even 13 hrs later, a better estimate of the true time than are those crappy network sources.

The network sources aren't crappy. There is a systematic offset but the variation is low. I have no idea what the figure in the last column of sources means; it has no header.

> > The above situation occurred with chrony 2.1. However, I have reproduced it with an installation updated to version 3.2, although with an "outage" time of 15 minutes. It had Reach 0 but still was indicating lock to PPS after 869 seconds.
>
> The star means that the PPS is the best indicator of what the true time now is.

Even when it has not provided information for 13 hours?

> > Is it to be considered a bug, or is this just a design feature?
>
> It is neither a bug nor a "design feature" (by which I assume you mean it is not working properly but the designer does not care; that is what it is often taken to mean).

Of course it could be that the design has a different objective. We need the time to be very accurate (preferably within 2us but certainly within 20us) and it looks like chrony is normally able to achieve that, but a design feature could be that it is freewheeling on loss of sync rather than indicating an error. I don't mind that it is freewheeling but I need an indication of that - because I need to turn off our application as I know it does not take long
Re: [chrony-users] Possible bug in PPS support
> 210 Number of sources = 4
> MS Name/IP address  Stratum Poll Reach LastRx  Last sample
> ========================================================================
> #* PPS                    0    4   377     24   +218ns[ +278ns] +/- 124ns
> ^- xx..xxx                1   10   377    877   -147us[ -122us] +/-  11ms
> ^- xx..xxx                1   10   377     14  +1480us[+1480us] +/-  10ms
> ^- xxx.xx..xxx            1   10   377    345  +1446us[+1447us] +/-  10ms
>
> However, recently at one site the PPS signal was lost, but chrony keeps "locked" to it:
>
> MS Name/IP address  Stratum Poll Reach LastRx  Last sample
> ========================================================================
> #* PPS                    0    4     0    13h   -279ns[ -401ns] +/-  79ns
> ^- xx..xxx                1   10   377    250  +3462us[+3462us] +/-  10ms
>
> As can be seen, it has been lost for 13 hours but it still has the * sign in the 2nd column. We are remotely monitoring these systems using chronyc tracking and it still indicated stratum 1 referenced to PPS. I would have expected it to drop back to using those network time servers after some time of not getting pulses (i.e. once "Reach" is 0) and the stratum to increase to 2. If it operated that way, we would have received an alert. Furthermore, the clock had drifted by 3.5ms by the time the above status was noticed, while when synchronized to network time it usually is within 1 to 1.5ms. So it really is not considering those network time sources anymore.

Not sure what the above paragraph means. How do you know it has drifted by 3.5ms or 1 ms? I do not believe those figures, unless you meant 3.5us and 1usec. Unless by remote monitoring you mean really, really remote, with a dodgy network in between.

Was this a test, by the way, where you unplugged the GPS from the machine? Otherwise figuring out why the GPS PPS was lost for that period of time is probably the first thing to do.

Miroslav is better placed to figure out what is happening within chrony when it loses PPS input. Given the uncertainty in the rate as estimated from the PPS, the PPS from 13 hrs ago is still probably a better estimate of the current time than is the network time from the other systems. Remember that they are at poll 10, which is 1000 seconds or so (about 15 min), so the network time sources have not had that many "measurements" in that time interval and those are pretty crappy (10ms std dev, which is really huge). The PPS std dev is in the ns range, about 1 times better. So the PPS is still, even 13 hrs later, a better estimate of the true time than are those crappy network sources.

> The above situation occurred with chrony 2.1. However, I have reproduced it with an installation updated to version 3.2, although with an "outage" time of 15 minutes. It had Reach 0 but still was indicating lock to PPS after 869 seconds.

The star means that the PPS is the best indicator of what the true time now is.

> Is it to be considered a bug, or is this just a design feature?

It is neither a bug nor a "design feature" (by which I assume you mean it is not working properly but the designer does not care; that is what it is often taken to mean). Here it indicates that the PPS is still, 13 hrs later, the best indication of the offset from UTC.

Now, this assumption that it is the best could itself be off. For example, if the time span used by the PPS was overnight when the machine was cool inside, and during the day the machine is used a lot and heats up, then the estimate from the PPS rate could well be off, because those kinds of jumps in the rate would not enter into the estimate of the skew for the PPS. (If the PPS had accumulated 64 samples at 16 sec per sample, that is only 15 min, so the time span over which the PPS is measuring the rate and the changes in the rate is quite short and would not capture large rate deviations which occur with a non-Gaussian distribution, like the heating up every morning.)

> How could we work around that in this case?

It is not clear what it is you want to work around. From all the data, the PPS from 13 hrs ago is still the best estimate of UTC. Why would you want chrony to use a measurably much worse source just because the PPS has not been heard from for 13 hrs? Eventually the PPS from the remote past is no longer as good as the relatively really crappy time from the network, but that could take days.

> Rob
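The poll values in the "sources" output are base-2 logarithms of the polling interval in seconds, which is where the "1000 seconds or so" and "64 samples at 16 sec" figures above come from. A small sketch of that conversion (the function name is made up):

```python
def poll_to_seconds(poll: int) -> int:
    """Convert an NTP-style poll exponent to an interval in seconds (2**poll)."""
    return 2 ** poll


network_interval = poll_to_seconds(10)   # network sources at poll 10
pps_span = 64 * poll_to_seconds(4)       # 64 PPS samples at poll 4
print(network_interval, pps_span)
```

Both work out to 1024 seconds, a little over 17 minutes: a single network poll interval is as long as the whole span covered by 64 PPS samples.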
Re: [chrony-users] Possible bug in PPS support
Miroslav Lichvar wrote:

> On Mon, Oct 23, 2017 at 10:54:52AM +0200, Rob Janssen wrote:
> > However, recently at one site the PPS signal was lost, but chrony keeps "locked" to it:
> >
> > MS Name/IP address  Stratum Poll Reach LastRx  Last sample
> > ========================================================================
> > #* PPS                    0    4     0    13h   -279ns[ -401ns] +/-  79ns
> > ^- xx..xxx                1   10   377    250  +3462us[+3462us] +/-  10ms
> >
> > As can be seen, it has been lost for 13 hours but it still has the * sign in the 2nd column. We are remotely monitoring these systems using chronyc tracking and it still indicated stratum 1 referenced to PPS. I would have expected it to drop back to using those network time servers after some time of not getting pulses (i.e. once "Reach" is 0) and the stratum to increase to 2. If it operated that way, we would have received an alert. Furthermore, the clock had drifted by 3.5ms by the time the above status was noticed, while when synchronized to network time it usually is within 1 to 1.5ms. So it really is not considering those network time sources anymore.
>
> It would have switched eventually when the estimated error of the refclock was larger than the error of the NTP source (10 milliseconds).

That does not seem reasonable... should it not refer to the estimated error of the source itself rather than to the network source?

> Have you saved the tracking or sourcestats output? From the skew we could estimate how long it would take.

Ok, here is the tracking.log, the last few lines before it failed:

2017-10-21 22:18:30 PPS  1  -12.275  0.048  -6.697e-07  N  1  4.525e-07  1.504e-07
2017-10-21 22:18:46 PPS  1  -12.279  0.030  -1.661e-07  N  1  3.788e-07  3.638e-11
2017-10-21 22:19:02 PPS  1  -12.284  0.029  -7.386e-07  N  1  4.446e-07  1.177e-07
2017-10-21 22:19:18 PPS  1  -12.286  0.020  -6.956e-08  N  1  3.629e-07  4.908e-11
2017-10-21 22:19:34 PPS  1  -12.290  0.022  -7.190e-07  N  1  4.091e-07  6.094e-08
2017-10-21 22:19:50 PPS  1  -12.292  0.018  -1.540e-07  N  1  3.709e-07  4.822e-11
2017-10-21 22:20:06 PPS  1  -12.295  0.017  -4.841e-07  N  1  4.030e-07  1.114e-07
2017-10-21 22:20:22 PPS  1  -12.297  0.014  -1.363e-07  N  1  3.626e-07  8.935e-09

After this, nothing was logged until I restarted chronyd 13 hours later and it synced to the network sources.

> > Is it to be considered a bug, or is this just a design feature?
>
> It's a feature, but there is apparently a bug which may make the switch take much longer than it should.

However, we use this form of time synchronization because we need the clock to be within about 20us of real time. When the PPS sync is lost and only network sync is achieved, that is not really attainable. So we need some indication whenever there is no PPS sync. Would it not be reasonable to indicate loss of PPS sync when the Reach value becomes zero? Ok, it could be that freewheeling keeps a more accurate time than syncing to another source, but at least the error condition should be monitored.

> > How could we work around that in this case?
>
> Decreasing the maximum number of samples of the NTP source with the maxsamples option should reduce the maximum span (as reported in sourcestats) and also the time it will take to switch from unreachable sources. Increasing the maxclockerror would do that too if it was included in the source selection. Even with the default value it would take only a few hours to switch in your case.

Ok but rather than "only a few hours" I would like to see "only a few minutes". The Span indicated by sourcestats is 79 for the PPS source now, and 103m for the network sources. Would that mean it drops the PPS after 79 seconds? That would be fine.

Rob