Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On Tue, Nov 14, 2017 at 11:08:11AM -0700, James Feeney wrote: > You seem to be describing some kind of log message that has no meaning in > isolation - like an episode of a TV show that makes no sense unless you've > seen the entire series. I think it has a meaning in isolation. It says that the chronyd's idea of the true time has shifted by x seconds, even though we don't know what exactly that is or why it changed. How to describe that in a syslog message? > Personally, I would say that that is just very bad User Interface design. In > contrast, I believe that each log message should be "self contained", that it > should have a clear and precise meaning, even in isolation. > > So far, after much back and forth, I have come away with an incomplete > understanding. > > Miroslav, if you really believe in the importance of this "stacked messages", > "current remaining correction" concept, then please write a precise > mathematical expression for what you are describing, beginning first with a > complete list of variable names and their descriptions. I suspect that description would be very long, for either of those values, and it couldn't be precisely described in a syslog message. There are at least four different offsets that chronyd works with, starting with individual NTP measurements and ending with an estimate of the offset of the system clock that should be corrected. BTW, there is an interesting case that could happen if the message was changed to report the "true" offset of the system clock, but kept the threshold based on the change. It could report an offset smaller than the threshold. For example, if the first update was to correct a 10 second offset, and in the middle of the correction there was an update of -5 seconds, there would be a message saying that the "true" offset is zero. -- Miroslav Lichvar -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
William G. Unruh __| Canadian Institute for| Tel: +1(604)822-3273 Physics _|___ Advanced Research _| Fax: +1(604)822-5324 UBC, Vancouver,BC _|_ Program in Cosmology | un...@physics.ubc.ca Canada V6T 1Z1 | and Gravity __|_ www.theory.physics.ubc.ca/ On Tue, 14 Nov 2017, James Feeney wrote: On 11/14/2017 01:23 AM, Miroslav Lichvar wrote: Yes, the "system clock wrong by" messages were meant to be stacked on top of each other. If we change it to report the current remaining correction, it won't be clear how much the clock has drifted between the messages. If the log had system clock behind 10 seconds system clock behind 9 seconds the clock could have drifted by any value between the logchange threshold and 9 seconds. If you were to ask me to describe to some other third person what it is that is now being reported by chronyd in these log messages, based upon my own understanding, derived from your descriptions and Bill's descriptions, I would be unable to provide any kind of precise explanation. I do not mean to be difficult or contrarian. I just do not understand what is being expressed with these "messages ... stacked on top of each other." Each message says that the clock has suffered an unexpected shift in time from what it should be at that time, and the message says tht that shift is being corrected. The key works are "unexpected" and "shift". It may also have suffered a shift in the past, but that past shift is being corrected already and so it is not unexpected. You seem to be describing some kind of log message that has no meaning in isolation - like an episode of a TV show that makes no sense unless you've seen the entire series. Personally, I would say that that is just very bad User Interface design. In contrast, I believe that each log message should be "self contained", that it should have a clear and precise meaning, even in isolation. Interpreted as above I believe it does have meaning in isolation. "Something happened to the time and chrony will not try to correct it." So far, after much back and forth, I have come away with an incomplete understanding. Miroslav, if you really believe in the importance of this "stacked messages", "current remaining correction" concept, then please write a precise mathematical expression for what you are describing, beginning first with a complete list of variable names and their descriptions. So far, the variables bandied about include: 1) the NTP server clock estimate 2) a not always present on-board Hardware Clock That has nothing to do with chrony except when the system starts up. So never worry about that. 3) the kernel's System Clock 4) some "hidden" chronyd internal clock Chrony corrects errors by altering the drift of the clock (assuming that the error is not so large as to demand a step of the clock). 5-10) the Six offsets between each pair of those "clocks" Ignore the hardware clock. It is never of any relevance. 11-X) some, as yet, uncertain set of "partial offsets" No such thing. Y-Z) some, as yet, uncertain set of "cumulative offsets" No such thing. A-B) possibly some additional set of statistical "running estimates" No such thing. If I am able to understand what this "current remaining correction" thing is, I can better offer some opinion with respect to your questions: The time is out by 5 sec. The rate of the clock is changed so that that 5 sec offset should go to 0 in 100 seconds. The difference between that prediction and the estimate of the NTP time is the "current remaining offset". - What value should be compared with the logchange threshold? - What value should be reported in the message? - How should be that value described in the message? Please be certain that each of these proposed "value" choices is included in the list of variable names and the list of equations. I think that you first need to go and study the theory of how ntp and chrony operate. It is really a bit much to be asking Miroslav to give you a complete less in the theory of chrony. I just like when chronyd reports what it is doing when it starts-up. Still, based upon my best guess, I don't think I care about any "remaining correction" that has no relationship to any "user visible" quantity. I assume that you know the story: "The answer is - 42! But, what is the question?" Does chronyd harbor some "hidden variables"? Yes, many. The last N measurements of the offsets from the measured NTP time from the server, corrected for changes in rate which have occured since they were measured. These are used to make a least squares fit to estimate what the difference in rate and time of the system clock and the NTP server's clock. The old "system clock wrong" I interpreted as an error that the clock acquired between updates, but "ahead/behind" I'd interpret relative to NTP or UTC. So, you are actually talking about two different things? "acquired between
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On 11/14/2017 01:23 AM, Miroslav Lichvar wrote: > Yes, the "system clock wrong by" messages were meant to be stacked on > top of each other. If we change it to report the current remaining > correction, it won't be clear how much the clock has drifted between > the messages. If the log had > > system clock behind 10 seconds > system clock behind 9 seconds > > the clock could have drifted by any value between the logchange > threshold and 9 seconds. If you were to ask me to describe to some other third person what it is that is now being reported by chronyd in these log messages, based upon my own understanding, derived from your descriptions and Bill's descriptions, I would be unable to provide any kind of precise explanation. I do not mean to be difficult or contrarian. I just do not understand what is being expressed with these "messages ... stacked on top of each other." You seem to be describing some kind of log message that has no meaning in isolation - like an episode of a TV show that makes no sense unless you've seen the entire series. Personally, I would say that that is just very bad User Interface design. In contrast, I believe that each log message should be "self contained", that it should have a clear and precise meaning, even in isolation. So far, after much back and forth, I have come away with an incomplete understanding. Miroslav, if you really believe in the importance of this "stacked messages", "current remaining correction" concept, then please write a precise mathematical expression for what you are describing, beginning first with a complete list of variable names and their descriptions. So far, the variables bandied about include: 1) the NTP server clock estimate 2) a not always present on-board Hardware Clock 3) the kernel's System Clock 4) some "hidden" chronyd internal clock 5-10) the Six offsets between each pair of those "clocks" 11-X) some, as yet, uncertain set of "partial offsets" Y-Z) some, as yet, uncertain set of "cumulative offsets" A-B) possibly some additional set of statistical "running estimates" If I am able to understand what this "current remaining correction" thing is, I can better offer some opinion with respect to your questions: > - What value should be compared with the logchange threshold? > - What value should be reported in the message? > - How should be that value described in the message? Please be certain that each of these proposed "value" choices is included in the list of variable names and the list of equations. I just like when chronyd reports what it is doing when it starts-up. Still, based upon my best guess, I don't think I care about any "remaining correction" that has no relationship to any "user visible" quantity. I assume that you know the story: "The answer is - 42! But, what is the question?" Does chronyd harbor some "hidden variables"? > The old "system clock wrong" I interpreted as an error > that the clock acquired between updates, but "ahead/behind" I'd > interpret relative to NTP or UTC. So, you are actually talking about two different things? "acquired between updates", and "relative to [ the NTP server clock estimate ]"? But, "acquired", is relative to what? The NTP server clock estimate? I'd suggest that two different things should be treated as two different things, with proper descriptions, so that they are not confused. Different log messages could say: "System clock acquired a 2second lead between NTP clock estimate updates." "System clock is leading the NTP clock estimate by 2seconds." But then, what's the difference? -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On Fri, Nov 10, 2017 at 12:06:07PM -0800, Bill Unruh wrote: > > > > Bill, I really do not understand what point you are trying to present here, > > with respect to the system log message. I was only suggesting that the log > > message should be more clear as to, let us call it, "leading" and > > "lagging", "forward" and "backward". > > You are requesting something further. You are also requesting a change in the > value that is reported, not just making the sign of the error clearer. I am in > perfect agreement with the request for making the meaning of the sign clearer. > I am (perhaps) not in agreement with the value you want reported. Same for me. I think there are three separate questions: - What value should be compared with the logchange threshold? - What value should be reported in the message? - How should be that value described in the message? For the first one, I think we should keep using the new adjustment that is applied on top of the remaining correction in order to avoid reporting a large adjustment multiple times. The reported value could be changed to include both the old and new adjustment, and I think it would be well described as "system clock ahead/behind". Alternatively, we can keep reporting the change in the correction, but that probably needs a different description than "system clock ahead/behind". The old "system clock wrong" I interpreted as an error that the clock acquired between updates, but "ahead/behind" I'd interpret relative to NTP or UTC. > > On 11/09/2017 09:05 PM, Bill Unruh wrote: > > > The three seconds is being corrected, meaning that the clock has been > > > slowed > > > down so that say in 10 seconds it would be out by 0 seconds. It is a known > > > error and it is an error that, if everything is left as it is, will be 0 > > > in 10 > > > sec. > > > > > > > > > > > Precisely, so why would you report that? > > > > > > > > Indeed. I do not see any reason to report that. > > > > > > But you are claiming you should report that 3 sec. > > > > Uhm - 4seconds, using Miroslav's example. I am saying that the log message > > should report that the system clock offset is currently 4seconds. I do not > > understand where you interpret a "but" out of that. Miroslav has said > > that, presently, chronyd will report instead, what I understand to be a > > kind of "differential slew error", which, in this example, would be > > "1second", not "4seconds". And, I was suggesting that the log message > > should *not* report "1second". > > > There are two components of that 4 seconds. One is the three seconds which is > already being corrected by the alteration of the slew. The other is the 1 > second which is a new offset. The three seconds is already being corrected, > and to report it now again may be confusing. Yes, the "system clock wrong by" messages were meant to be stacked on top of each other. If we change it to report the current remaining correction, it won't be clear how much the clock has drifted between the messages. If the log had system clock behind 10 seconds system clock behind 9 seconds the clock could have drifted by any value between the logchange threshold and 9 seconds. > > What is it that you are thinking that the log message should report? > > From what Lichvar says, it seems that it now reports the 1 sec which is an > additional uncorrected error. The measurments log already reports the raw > current offset and the corrected > offset. The measurements log doesn't actually include the raw offset. I think only the refclocks log has that. However, the remaining correction is reported in the tracking log, so it should be possible to reconstruct the raw offset in the measurements, statistics and tracking logs. -- Miroslav Lichvar -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
Bill, I really do not understand what point you are trying to present here, with respect to the system log message. I was only suggesting that the log message should be more clear as to, let us call it, "leading" and "lagging", "forward" and "backward". You are requesting something further. You are also requesting a change in the value that is reported, not just making the sign of the error clearer. I am in perfect agreement with the request for making the meaning of the sign clearer. I am (perhaps) not in agreement with the value you want reported. On 11/09/2017 09:05 PM, Bill Unruh wrote: The three seconds is being corrected, meaning that the clock has been slowed down so that say in 10 seconds it would be out by 0 seconds. It is a known error and it is an error that, if everything is left as it is, will be 0 in 10 sec. Precisely, so why would you report that? Indeed. I do not see any reason to report that. But you are claiming you should report that 3 sec. Uhm - 4seconds, using Miroslav's example. I am saying that the log message should report that the system clock offset is currently 4seconds. I do not understand where you interpret a "but" out of that. Miroslav has said that, presently, chronyd will report instead, what I understand to be a kind of "differential slew error", which, in this example, would be "1second", not "4seconds". And, I was suggesting that the log message should *not* report "1second". There are two components of that 4 seconds. One is the three seconds which is already being corrected by the alteration of the slew. The other is the 1 second which is a new offset. The three seconds is already being corrected, and to report it now again may be confusing. What is it that you are thinking that the log message should report? From what Lichvar says, it seems that it now reports the 1 sec which is an additional uncorrected error. The measurments log already reports the raw current offset and the corrected offset. Below, you say that chronyd, by slewing the clock, will correct an offset in "10*DT", "where DT is that measured offset". We see that 10*4seconds = 40seconds, not the 10seconds in your example. Think of this another way. Observing the display time, adding or subtracting 1second to the display will not give the correct time when the clock is off by 4seconds. But that is not what the log reports. Note that as I understand it, chrony itself reports to other machines the corrected time -- the system time minus the 3 seconds,( but I am not sure of this). Ie, there is the system time, and the system time corrected for the ongoing slew, "Slewing" is a different thing from "fix it immediately". chronyd *will* step the clock when configured with "makestep", which is probably what you mean by "fix it immediately". No, it will only make a step if the step size is larger than some value. Of course. I should not have to remind you that the "makestep" option takes a "threshold" argument? What is your point? chrony measures the offset of the clock by doing a least squared fit on the past N measurements offsets, to determine what the best value of the current offset is. Once it has that information it alters the rate of the clock so that in something like DT*2000 seconds (where DT is that measured offset) the offset will be 0. (ntpd makes that 2000 a minimum. chrony will slew so that the error disappears in 10*DT if DT is very large and the DT is less than the value to make a step.) That's useful to know. Previous meaning "previously begun". Chrony, and ntpd do not correct times by stepping the clock. They correct it by slightly increasing or decreasing the rate by which the clock is running, until they have corrected the error. There always exists an "instantaneous offset", of the system clock relative to the NTP server clock, which is an easy thing to describe, under any circumstances. No, it is not because all the measurements have uncertainties-- due to travel time fluctuations, due to measurement accuracy, due to interrupt handling fluctruations, etc. Ie, no you cannot know what the "instantaneous offset" is. You state the obvious. What is your point? I have no idea what is obvious to you. If I wanted to be pedantic - which I often do - I might have said "the instantaneous offset estimate", instead of simply "the instantaneous offset". But, we are all adults here, yes? Yes, there is a variable frequency oscillator. Or rather the time in seconds between the ticks of the cpu clock can be changed. More of an electronics hardware guy would take issue with that blurring of the distinction between analog circuitry and digital circuitry. ;) It is all digital. The cpu clock ticks ticks at a certain, unknown rate. The system counts those ticks and converts them to time. That coversion process must be calbrated and that calibration can be changed. It is primarily that change in calibration that chrony
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
Bill, I really do not understand what point you are trying to present here, with respect to the system log message. I was only suggesting that the log message should be more clear as to, let us call it, "leading" and "lagging", "forward" and "backward". On 11/09/2017 09:05 PM, Bill Unruh wrote: > The three seconds is being corrected, meaning that the clock has been slowed > down so that say in 10 seconds it would be out by 0 seconds. It is a known > error and it is an error that, if everything is left as it is, will be 0 in 10 > sec. > > >>> Precisely, so why would you report that? >> >> Indeed. I do not see any reason to report that. > > But you are claiming you should report that 3 sec. Uhm - 4seconds, using Miroslav's example. I am saying that the log message should report that the system clock offset is currently 4seconds. I do not understand where you interpret a "but" out of that. Miroslav has said that, presently, chronyd will report instead, what I understand to be a kind of "differential slew error", which, in this example, would be "1second", not "4seconds". And, I was suggesting that the log message should *not* report "1second". What is it that you are thinking that the log message should report? Below, you say that chronyd, by slewing the clock, will correct an offset in "10*DT", "where DT is that measured offset". We see that 10*4seconds = 40seconds, not the 10seconds in your example. Think of this another way. Observing the display time, adding or subtracting 1second to the display will not give the correct time when the clock is off by 4seconds. >> "Slewing" is a different thing from "fix it immediately". chronyd *will* >> step the clock when configured with "makestep", which is probably what you >> mean by "fix it immediately". > > No, it will only make a step if the step size is larger than some value. Of course. I should not have to remind you that the "makestep" option takes a "threshold" argument? What is your point? > chrony measures the offset of the clock by doing a least squared fit on the > past N measurements offsets, to determine what the best value of the current > offset is. Once it has that information it alters the rate of the clock so > that in something like DT*2000 seconds (where DT is that measured offset) the > offset will be 0. (ntpd makes that 2000 a minimum. chrony will slew so that > the > error disappears in 10*DT if DT is very large and the DT is less than the > value > to make a step.) That's useful to know. >>> Previous meaning "previously begun". Chrony, and ntpd do not correct times >>> by >>> stepping the clock. They correct it by slightly increasing or decreasing the >>> rate by which the clock is running, until they have corrected the error. >> >> There always exists an "instantaneous offset", of the system clock relative >> to the NTP server clock, which is an easy thing to describe, under any >> circumstances. > > No, it is not because all the measurements have uncertainties-- due to travel > time fluctuations, due to measurement accuracy, due to interrupt handling > fluctruations, etc. Ie, no you cannot know what the "instantaneous offset" is. You state the obvious. What is your point? If I wanted to be pedantic - which I often do - I might have said "the instantaneous offset estimate", instead of simply "the instantaneous offset". But, we are all adults here, yes? > Yes, there is a variable frequency oscillator. Or rather the time in seconds > between the ticks of the cpu clock can be changed. More of an electronics hardware guy would take issue with that blurring of the distinction between analog circuitry and digital circuitry. ;) >> Presumably, a measurement of the current instantaneous absolute offset of >> the system clock relative to the NTP server clock. I believe that that is - >> or would be - the intuitive interpretation of most users. The idea that >> this "measurement" reported in the the log message would represent something >> much more abstract - and obtuse - than that is - unfortunate. > > No. In chrony it is the least squares linear fit of the last N measurements of > the offset that is used to estimate the current offset. Because the > instantaneously measured offset is beset by uncertainties that can be > "averaged out" by doing a least squares fit. N is chosen actively to make sure > that over that set of N measurements the offsets are well approximated by a > straight line with gaussian noise. Unfortunately in order to get a good > estimate, one HAS to use "abstract - and > obtuse" in your words, procedures. chrony's way of doing it is a factor of 3 > to a 20 times better than ntpd's method according to measurements that both I > and Lichvar have made. That's nice. Still, you seem to misunderstand what I am referring to as "obtuse". Let me try again. Remember that the topic here is the log message, and nothing about the internal workings of chornyd - which we
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
William G. Unruh __| Canadian Institute for| Tel: +1(604)822-3273 Physics _|___ Advanced Research _| Fax: +1(604)822-5324 UBC, Vancouver,BC _|_ Program in Cosmology | un...@physics.ubc.ca Canada V6T 1Z1 | and Gravity __|_ www.theory.physics.ubc.ca/ On Thu, 9 Nov 2017, James Feeney wrote: On 11/09/2017 02:22 PM, Bill Unruh wrote: That is unclear. Chrony knows that it is out by a certain amount. That is why it is slewing the clock, and in a few seconds or minutes the system time will be exactly what it thinks NTP time is. It now finds it is out by a second. Does it report that it is out by a second or by that second plus the three seconds that it already is correcting. What does that mean, "already is correcting"? Either the system clock current instantaneous offset is 1 second, 3 seconds, or 4 seconds, relative to the NTP server clock. The three seconds is being corrected, meaning that the clock has been slowed down so that say in 10 seconds it would be out by 0 seconds. It is a known error and it is an error that, if everything is left as it is, will be 0 in 10 sec. Precisely, so why would you report that? Indeed. I do not see any reason to report that. But you are claiming you should report that 3 sec. The logs are not the place to give a complete description of the theory of chrony operations. That would just be silly, verbost and totally confusing. One could put that into the documentation. Yes. So, probably not useful to have a log message that even tries to report that "differential slew error". Why? Possibly because your naive expectation is "If the clock is out, fix it immediately" But that is not what chrony does. "Slewing" is a different thing from "fix it immediately". chronyd *will* step the clock when configured with "makestep", which is probably what you mean by "fix it immediately". No, it will only make a step if the step size is larger than some value. But, I don't actually have enough information to understand what is meant by "didn't finish yet". When, or how often, and in what manner is this "previous slew" measured? Is there some "current slew" that is distinct from a "previous slew"? Does a previous slew "end" before the system clock becomes synchronous with the NTP server clock? chrony measures the offset of the clock by doing a least squared fit on the past N measurements offsets, to determine what the best value of the current offset is. Once it has that information it alters the rate of the clock so that in something like DT*2000 seconds (where DT is that measured offset) the offset will be 0. (ntpd makes that 2000 a minimum. chrony will slew so that the error disappears in 10*DT if DT is very large and the DT is less than the value to make a step.) Previous meaning "previously begun". Chrony, and ntpd do not correct times by stepping the clock. They correct it by slightly increasing or decreasing the rate by which the clock is running, until they have corrected the error. There always exists an "instantaneous offset", of the system clock relative to the NTP server clock, which is an easy thing to describe, under any circumstances. No, it is not because all the measurements have uncertainties-- due to travel time fluctuations, due to measurement accuracy, due to interrupt handling fluctruations, etc. Ie, no you cannot know what the "instantaneous offset" is. When a "new measurement" is made of a "previously begun" slewing process - increasing or decreasing the rate of the system clock - the log message should be clear as to the nature of that measurement. What was measured? What is measured is the difference between the average of the time when the ntp packet was sent out and when it was received again according to the local clock , and the time when the remote machine claimed to have received it according to the remote clock . chrony keeps measuring its deviation from the ntp time, and slews the clock, at various rates, to correct for difference. This takes place all the time. So the system clock frequency is changed, rather than the clock value being increased or decreased with small discontinuous offsets greater than 2 counts. Clearly, that is a challenge for the kernel, since there is no variable frequency oscillator hardware, and some hardware counter will need to increment or decrement by 2 counts instead of by 1 count at some rate, generated using a second counter. Yes, there is a variable frequency oscillator. Or rather the time in seconds between the ticks of the cpu clock can be changed. It tweaks it every time it makes a measurement. Presumably, a measurement of the current instantaneous absolute offset of the system clock relative to the NTP server clock. I believe that that is - or would be - the intuitive interpretation of most users. The idea that this "measurement" reported in the the log message would represent something much more abstract
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On 11/09/2017 02:22 PM, Bill Unruh wrote: > That is unclear. Chrony knows that it is out by a certain amount. That is why > it is slewing the clock, and in a few seconds or minutes the system time will > be exactly what it thinks NTP time is. It now finds it is out by a second. > Does it report that it is out by a second or by that second plus the three > seconds that it already is correcting. What does that mean, "already is correcting"? Either the system clock current instantaneous offset is 1 second, 3 seconds, or 4 seconds, relative to the NTP server clock. > Precisely, so why would you report that? Indeed. I do not see any reason to report that. > The logs are not the place to give a complete description of the theory of > chrony operations. That would just be silly, verbost and totally confusing. > One could put that into the documentation. Yes. So, probably not useful to have a log message that even tries to report that "differential slew error". > Why? Possibly because your naive expectation is "If the clock is out, fix it > immediately" But that is not what chrony does. "Slewing" is a different thing from "fix it immediately". chronyd *will* step the clock when configured with "makestep", which is probably what you mean by "fix it immediately". >> But, I don't actually have enough information to understand what is meant by >> "didn't finish yet". When, or how often, and in what manner is this >> "previous slew" measured? Is there some "current slew" that is distinct >> from a "previous slew"? Does a previous slew "end" before the system clock >> becomes synchronous with the NTP server clock? >> > > Previous meaning "previously begun". Chrony, and ntpd do not correct times by > stepping the clock. They correct it by slightly increasing or decreasing the > rate by which the clock is running, until they have corrected the error. There always exists an "instantaneous offset", of the system clock relative to the NTP server clock, which is an easy thing to describe, under any circumstances. When a "new measurement" is made of a "previously begun" slewing process - increasing or decreasing the rate of the system clock - the log message should be clear as to the nature of that measurement. What was measured? > chrony keeps measuring its deviation from the ntp time, and slews the clock, > at various rates, to correct for difference. This takes place all the time. So the system clock frequency is changed, rather than the clock value being increased or decreased with small discontinuous offsets greater than 2 counts. Clearly, that is a challenge for the kernel, since there is no variable frequency oscillator hardware, and some hardware counter will need to increment or decrement by 2 counts instead of by 1 count at some rate, generated using a second counter. > It tweaks it every time it makes a measurement. Presumably, a measurement of the current instantaneous absolute offset of the system clock relative to the NTP server clock. I believe that that is - or would be - the intuitive interpretation of most users. The idea that this "measurement" reported in the the log message would represent something much more abstract - and obtuse - than that is - unfortunate. > And not to all your questions. You really need to read about the theory of > ntpd clock adjustment which is > that same to lowest order as that of chrony. That would not hurt, but the issue I am raising is about the meaning of a simple log message, not about the theory of the network time protocol. This does not need to be complicated. I believe that it should not be *made* complicated. -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
William G. Unruh __| Canadian Institute for| Tel: +1(604)822-3273 Physics _|___ Advanced Research _| Fax: +1(604)822-5324 UBC, Vancouver,BC _|_ Program in Cosmology | un...@physics.ubc.ca Canada V6T 1Z1 | and Gravity __|_ www.theory.physics.ubc.ca/ On Thu, 9 Nov 2017, James Feeney wrote: On 11/09/2017 05:25 AM, Miroslav Lichvar wrote: So, if there is a large adjustment running and a new measurement says the offset of the clock is not what expect (e.g. the clock or the server has drifted for some reason), should we report by how much the adjustment which is still running had to be adjusted, or the total amount of the remaining adjustment? Currently, it's the former. Well, my intuition and naive assumption would be that any allusion to a system clock offset was describing an "absolute offset", relative to the NTP server clock, and NOT relative to some chronyd internal process, which, in this case, would be the slewing process. And I also naively would expect that other people would make that same assumption. That is unclear. Chrony knows that it is out by a certain amount. That is why it is slewing the clock, and in a few seconds or minutes the system time will be exactly what it thinks NTP time is. It now finds it is out by a second. Does it report that it is out by a second or by that second plus the three seconds that it already is correcting. Any log message that was referring to some offset of the system clock relative to some clock *other* than the NTP server clock, in my mind, should require some very very clear and emphasized language. Even then, how would that be worded? I am still not clear about this offset that is being reported! A log message would seem to require some wording as "involved" as the expression of your example: "the initial offset was 5 seconds and the system clock was already corrected by 2 seconds when another measurement is made, which says the offset of the system clock is now actually 4 seconds instead of the expected 3 seconds." I don't even know how often "another measurement is made", so I cannot put that offset error in a precise context. Precisely, so why would you report that? And, why would anyone care about this "differential slew error"? Is there some attempt being made to illustrate a possible hardware fault, for instance, where the system clock is drifting faster than the chrony slewing algorithm is able to track? And then, if that were the case, I would expect a different log message, such as: "Unable to correct system clock - excessive drift." So, there could be an initial message, for instance, "The system clock is leading/lagging the NTP server clock by N seconds", followed by one of two differing follow-up messages, depending upon whether the system clock is corrected by slewing or by stepping. For slewing, there could be a message that simply says something like "Slewing the system clock forward/backward, correction started.", and for stepping, a message something like "Stepping the system clock forward/backward by N seconds." The logs are not the place to give a complete description of the theory of chrony operations. That would just be silly, verbost and totally confusing. One could put that into the documentation. What value should be reported if the previous slew didn't finish yet? That's the normal case. It's just that the slew is rarely larger than logchange. My instinctive answer is "always the absolute offset of the system clock relative to the NTP server clock." Why? Possibly because your naive expectation is "If the clock is out, fix it immediately" But that is not what chrony does. But, I don't actually have enough information to understand what is meant by "didn't finish yet". When, or how often, and in what manner is this "previous slew" measured? Is there some "current slew" that is distinct from a "previous slew"? Does a previous slew "end" before the system clock becomes synchronous with the NTP server clock? Previous meaning "previously begun". Chrony, and ntpd do not correct times by stepping the clock. They correct it by slightly increasing or decreasing the rate by which the clock is running, until they have corrected the error. It is more important that the the clock deliver a smooth time than that it deliver the correct time. For example, backstepping the clock (so that there are two times at which the system clock shows 1496549 seconds for example). That kind of backstepping can cause havoc to the filesystem. My intuitive assumption would be that the slewing process is simply initiated when chronyd is started, and that the slewing process changes its rate dynamically, however much is required to "catch up" the system clock to the NTP server clock, and that a single slewing process will continue until synchronization is achieved. Does "slewing" work like that? Or some other way? chrony keeps measuring its deviation from the ntp
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On 11/09/2017 05:25 AM, Miroslav Lichvar wrote: > So, if there is a large adjustment running and a new measurement says > the offset of the clock is not what expect (e.g. the clock or the > server has drifted for some reason), should we report by how much the > adjustment which is still running had to be adjusted, or the total > amount of the remaining adjustment? Currently, it's the former. Well, my intuition and naive assumption would be that any allusion to a system clock offset was describing an "absolute offset", relative to the NTP server clock, and NOT relative to some chronyd internal process, which, in this case, would be the slewing process. And I also naively would expect that other people would make that same assumption. Any log message that was referring to some offset of the system clock relative to some clock *other* than the NTP server clock, in my mind, should require some very very clear and emphasized language. Even then, how would that be worded? I am still not clear about this offset that is being reported! A log message would seem to require some wording as "involved" as the expression of your example: "the initial offset was 5 seconds and the system clock was already corrected by 2 seconds when another measurement is made, which says the offset of the system clock is now actually 4 seconds instead of the expected 3 seconds." I don't even know how often "another measurement is made", so I cannot put that offset error in a precise context. And, why would anyone care about this "differential slew error"? Is there some attempt being made to illustrate a possible hardware fault, for instance, where the system clock is drifting faster than the chrony slewing algorithm is able to track? And then, if that were the case, I would expect a different log message, such as: "Unable to correct system clock - excessive drift." >> So, there could be an initial message, for instance, "The system clock is >> leading/lagging the NTP server clock by N seconds", followed by one of two >> differing follow-up messages, depending upon whether the system clock is >> corrected by slewing or by stepping. For slewing, there could be a message >> that simply says something like "Slewing the system clock forward/backward, >> correction started.", and for stepping, a message something like "Stepping >> the system clock forward/backward by N seconds." > What value should be reported if the previous slew didn't finish yet? > That's the normal case. It's just that the slew is rarely larger than > logchange. My instinctive answer is "always the absolute offset of the system clock relative to the NTP server clock." But, I don't actually have enough information to understand what is meant by "didn't finish yet". When, or how often, and in what manner is this "previous slew" measured? Is there some "current slew" that is distinct from a "previous slew"? Does a previous slew "end" before the system clock becomes synchronous with the NTP server clock? My intuitive assumption would be that the slewing process is simply initiated when chronyd is started, and that the slewing process changes its rate dynamically, however much is required to "catch up" the system clock to the NTP server clock, and that a single slewing process will continue until synchronization is achieved. Does "slewing" work like that? Or some other way? I have no idea how often chronyd actually "tweaks" the system clock, whether chronyd is in a "slewing" state, or even in a "synchronized" state. Is there some hard-coded maximum allowed offset before the system clock is "tweaked"? Or is there a periodic "tweak" rate? And is this "tweak" interval dynamically adjusted? Is the system clock "frequency" automatically "scaled" by the kernel? I know that those are kind of basic concepts, but I really don't know what chronyd is doing at that low level. I'm appreciating your patience with this explanation... -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On Tue, Nov 07, 2017 at 02:27:50PM -0700, James Feeney wrote: > On 11/06/2017 09:17 AM, Miroslav Lichvar wrote: > > For example, if the initial offset was 5 seconds and the system clock > > was already corrected by 2 seconds when another measurement is made, > > which says the offset of the system clock is now actually 4 seconds > > instead of the expected 3, a message will be logged that the clock is > > wrong by 1 second. The question is which offset of the system clock > > should be reported (4 or 1) and when it should be reported (abs(4) > > > logchange or abs(1) > logchange). > > Ah - too many small numbers - is the "1" from the "5-4" or from the "4-3"? It's the "4-3". > I do not understand in what sense the clock is then wrong by 1 second. And, > I am thinking that the log message should never say that - too confusing. If > I am understanding, you are saying that currently, a message like that would > be logged - and maybe there should not be a message like that. It is not > immediately evident to me why a message like that would be informative. It > may be better to only log a simpler message, describing the *current* > relationship between the system clock and the NTP server clock. I think the idea was to warn the user when the clock needs a larger adjustment. The logchange directive which configures this message is described in the manual as "the threshold for the adjustment of the system clock that will generate a syslog message. An example ... message to be generated if a system clock error of over 0.1 seconds starts to be compensated". So, if there is a large adjustment running and a new measurement says the offset of the clock is not what expect (e.g. the clock or the server has drifted for some reason), should we report by how much the adjustment which is still running had to be adjusted, or the total amount of the remaining adjustment? Currently, it's the former. > So, there could be an initial message, for instance, "The system clock is > leading/lagging the NTP server clock by N seconds", followed by one of two > differing follow-up messages, depending upon whether the system clock is > corrected by slewing or by stepping. For slewing, there could be a message > that simply says something like "Slewing the system clock forward/backward, > correction started.", and for stepping, a message something like "Stepping > the system clock forward/backward by N seconds." What value should be reported if the previous slew didn't finish yet? That's the normal case. It's just that the slew is rarely larger than logchange. -- Miroslav Lichvar -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On Tue, 7 Nov 2017, James Feeney wrote: On 11/06/2017 09:17 AM, Miroslav Lichvar wrote: From the other suggestions that have been made, I liked best "was stepped backward/forward". That's good too. For example, if the initial offset was 5 seconds and the system clock was already corrected by 2 seconds when another measurement is made, which says the offset of the system clock is now actually 4 seconds instead of the expected 3, a message will be logged that the clock is wrong by 1 second. The question is which offset of the system clock should be reported (4 or 1) and when it should be reported (abs(4) > logchange or abs(1) > logchange). Ah - too many small numbers - is the "1" from the "5-4" or from the "4-3"? I do not understand in what sense the clock is then wrong by 1 second. And, I am thinking that the log message should never say that - too confusing. If I am understanding, you are saying that currently, a message like that would be logged - and maybe there should not be a message like that. It is not immediately evident to me why a message like that would be informative. It may be better to only log a simpler message, describing the *current* relationship between the system clock and the NTP server clock. The system is drifting the clock so as to get rid of the 5 second offset. It should have drifted it by 2 seconds by this time, so that time should have read 3 sec wrong. But instead it is 4 seconds wrong. If one allowed the drift to continue it would be 0 seconds wrong in 1 1/2 measurement periods. With that same drift however, it would, at the end of the drift still be 1 second out. Ie, the clock is now out by 1 second from the expected drifting clock. There are at least the following options: 1. change the message to better describe what is behind or ahead 2. change the message to print the real offset of the system clock a) print it on each update as long as the offset is larger than logchange b) print it only when the change in the offset is larger than logchange Thoughts? If I am understanding, I think I prefer option '2.a)'. What I don't like about 2.a is that if there is a large initial offset and stepping is not enabled by "makestep", there may be a large number of the "clock wrong by" messages in the log, even if everything is working as expected. I am guessing that you mean, if a slewing correction is used, that there would then be a log message for every instance in which a small change is made? Yes, that would be too much unimportant information. I was supposing that alternative '2.b)' would mean that no message would be printed saying anything about the system clock when the offset was less than logchange. I may have misunderstood. I'm thinking that something should be printed initially about the system clock offset, but then, subsequently, only "state change" events should be printed in the log. For instance, if the system clock drifts and, for some reason, the offset exceeds logchange, or if, when slewing the system clock, the system clock has achieved synchronization. So, there could be an initial message, for instance, "The system clock is leading/lagging the NTP server clock by N seconds", followed by one of two differing follow-up messages, depending upon whether the system clock is corrected by slewing or by stepping. For slewing, there could be a message that simply says something like "Slewing the system clock forward/backward, correction started.", and for stepping, a message something like "Stepping the system clock forward/backward by N seconds." Slewing is the expected behaviour. Stepping is not. Thus one would expect a log warning on stepping but not slewing, is what I would expect. -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org. -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On 11/06/2017 09:17 AM, Miroslav Lichvar wrote: > From the other suggestions that have been made, I liked best "was > stepped backward/forward". That's good too. > For example, if the initial offset was 5 seconds and the system clock > was already corrected by 2 seconds when another measurement is made, > which says the offset of the system clock is now actually 4 seconds > instead of the expected 3, a message will be logged that the clock is > wrong by 1 second. The question is which offset of the system clock > should be reported (4 or 1) and when it should be reported (abs(4) > > logchange or abs(1) > logchange). Ah - too many small numbers - is the "1" from the "5-4" or from the "4-3"? I do not understand in what sense the clock is then wrong by 1 second. And, I am thinking that the log message should never say that - too confusing. If I am understanding, you are saying that currently, a message like that would be logged - and maybe there should not be a message like that. It is not immediately evident to me why a message like that would be informative. It may be better to only log a simpler message, describing the *current* relationship between the system clock and the NTP server clock. >>> There are at least the following options: >>> 1. change the message to better describe what is behind or ahead >>> 2. change the message to print the real offset of the system clock >>>a) print it on each update as long as the offset is larger than >>> logchange >>>b) print it only when the change in the offset is larger than >>> logchange >>> >>> Thoughts? >> >> If I am understanding, I think I prefer option '2.a)'. > > What I don't like about 2.a is that if there is a large initial offset > and stepping is not enabled by "makestep", there may be a large number > of the "clock wrong by" messages in the log, even if everything is > working as expected. I am guessing that you mean, if a slewing correction is used, that there would then be a log message for every instance in which a small change is made? Yes, that would be too much unimportant information. I was supposing that alternative '2.b)' would mean that no message would be printed saying anything about the system clock when the offset was less than logchange. I may have misunderstood. I'm thinking that something should be printed initially about the system clock offset, but then, subsequently, only "state change" events should be printed in the log. For instance, if the system clock drifts and, for some reason, the offset exceeds logchange, or if, when slewing the system clock, the system clock has achieved synchronization. So, there could be an initial message, for instance, "The system clock is leading/lagging the NTP server clock by N seconds", followed by one of two differing follow-up messages, depending upon whether the system clock is corrected by slewing or by stepping. For slewing, there could be a message that simply says something like "Slewing the system clock forward/backward, correction started.", and for stepping, a message something like "Stepping the system clock forward/backward by N seconds." -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On 10/31/2017 10:41 AM, James Feeney wrote: > "ahead" and "behind" also will work for both the offset message and the> "was > stepped" message. Also: forward / back advanced / delayed leading / lagging These are more traditional technical terms describing offsets. incremented / decremented increased / decreased "stepped forward by" -> "increased by", etc. raised / loweredHa! Our concept of time is so "horizontal". early / late Hmm - on second thought, "was stepped ahead" seems natural, but "was stepped behind", not so much. And then, "clock behind by 3.00 seconds" followed by "was stepped ahead by 3.00 seconds" might seem to mix-together "behind" and "ahead" in a contradictory way. Different word pairs for each message might be easier to read and understand. For reference, from my original example: chronyd[622]: System clock wrong by 1.693005 seconds, adjustment started chronyd[622]: System clock was stepped by 1.693005 seconds I might lean more toward "leading / lagging" and "increased / decreased". -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On 10/31/2017 05:12 AM, Miroslav Lichvar wrote: >> For system logs messages >> "ahead" / "behind" for offset >> "fast" / "slow" for freq > That looks good to me. "ahead" and "behind" also will work for both the offset message and the "was stepped" message. > Another question is whether it's correct to say "System clock". The > messages don't actually print the offset of the system clock, but > rather the change in the offset relative to the NTP time that chronyd > is tracking. On start that is the same as the offset of the system > clock, but if there is an unfinished correction, it's the difference > between the old offset and new offset. I'm sorry, I don't understand. What does that mean, "the change in the offset"? The quantity ( system_clock_time - NTP_server_time ), I think I understand. Now, are you describing something like (( new_system_clock_time - new_NTP_server_time ) - ( old_system_clock_time - old_NTP_server_time ))? Are you saying that there is a system log message that would be saying "the system clock was behind by 5 seconds, but now, after some correction, the system clock is behind by only 3 seconds, and the offset has been 'reduced' by 2 seconds"? That is what "the difference between the old offset and new offset" suggests to me. But then - really? Is chrony reporting that? Why would anyone want to see a report about that? Instead, it might be useful to know, in that example, that there was still 3 seconds of correction remaining. But that quantity would be exactly the same as saying "the system clock is now behind by 3 seconds". So, I need some help there. > There are at least the following options: > 1. change the message to better describe what is behind or ahead > 2. change the message to print the real offset of the system clock >a) print it on each update as long as the offset is larger than > logchange >b) print it only when the change in the offset is larger than > logchange > > Thoughts? If I am understanding, I think I prefer option '2.a)'. Option '2.b)' does not tell me what event is going to compel a system log message - at the first state-change? And only once? Where option '2.a)' will print a log message "on each update" - which I don't know when chrony normally makes those updates - but then, the log messages would inform me. That sounds like a good thing. I am suspicious that option '1.' means having to describe some really complicated concept, this "change in the offset". But I'm only guessing, since I don't actually understand the "what". Could you give an example? In common speaking, I still tend to speak in terms of frequency. I might say something like "That clock is running fast" and "That clock is running slow", where "running" and "slow" are adjectives, as in a "fast running clock" and a "slow running clock". Somehow, instinctively, that is the information for which I am searching. I can infer a "slow clock" when the system clock offset is "behind", but in my mind, I want to reduce the concept to an effective relative frequency. Of course, subsequently, I will want to quantify the resulting offset. I don't know that other people do that in their head. But it might be something to consider, in the printing of system log messages. A "complicated" log message might say something like "The system clock is slow and is behind by 3.00 seconds." That's a bit more like natural human speech - though it may not be as "manly" as a terse and cryptic log message. Ha! Of course, there are also assumptions there, 1) that there exists some local reference clock, and 2) that local clock had been set to the correct time in the past, and 3) that that local clock was used to set the system clock, and 4) the system clock is an analog of the local reference clock. Assumption 4 is not actually valid if, for instance, the local reference clock and the system clock operate from different crystals, which is common. The system clock might actually run fast while the local clock runs slow. But, as you suggest, the casual assumption is also that "the offset is larger than logchange", which, I think, leaves us pretty much in the realm of setting the system clock at boot-up, or at wake-up. I suspect that the casual user understands this. Once the system clock has been set, tweaking the system clock drift is at a very different scale, and no system log message is printed for that change. I'm also assuming that booting a device that has no local reference clock, like a network appliance, a router or a switch or an access point, there is not going to be any "offset" message or any "stepped by" message printed in the system log. Or, maybe the system clock time simply defaults to the epic, "0", and the offset to the current time is shown? -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On 10/30/2017 05:12 PM, Bill Unruh wrote: > Yes, that occurs in the system logs (/var/log/messages, or journalctl ) and I > do not parse those. There is virtually nothing there in the older versions of > chrony. I go to the /var/log/chrony/* logs Ok, I see. But that's a different thing altogether. Do you have any preference with respect to just the system log messages? -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On 10/30/2017 02:42 PM, Bill Unruh wrote: On Mon, 30 Oct 2017, James Feeney wrote: On 10/30/2017 05:07 AM, Miroslav Lichvar wrote: existing scripts that parse the log I am not familiar with any of those scripts - who parses chrony log files? - so my initial reaction is to be unsympathetic. But I don't know - how important is that? I certainly do. Mind you I convert the time to seconds myself with awk. Ok - and you would need to modify your script, if the word "wrong" was removed, but not if there were trailing text? Yes, that occurs in the system logs (/var/log/messages, or journalctl ) and I do not parse those. There is virtually nothing there in the older versions of chrony. I go to the /var/log/chrony/* logs Do you think it would make sense to keep the sign ... I am confused - unclear about what chrony is actually reporting. In my example, the implicit sign was positive, '+'. But then, in apparent contradiction, "chronyc tracking" reports "Frequency : 5.626 ppm slow", which would have accumulated a *negative* offset, '-', of the system time relative to the actual UTC time. The frequency and the offset are not really related to each other. Uhm - no. The system clock frequency and the instantaneous system clock offset on reboot really *are* related to each other. Not sure what you mean. On boot, the system clock frequency is irrelevant since it was not operating between reboots. What is important is the rtc clock offset and frequency, and yes, they can be related to each other. Is there really a sign associated with the system time reported in the log? Yes. The offset can be positive or negative. Ok. All of my hardware clocks must be slow. Could be. There is certainly a sign associated. Of course, the phrases "stepped forward" and "stepped backward" can be used without putting the words "forward" and "backward" in parentheses. Still, saying "stepped backward by -2.00 seconds" would have the flavor of a double negative. Similarly, the phrase "slow by -2.00 seconds" would have a nagging ambiguity about it. If you were going to keep the sign, then you would choose one or the other (ahead or behind is far better than fast or slow, since those are rate terms, not offset term. A clock can be slow and still be ahead. Eg the clock could tick only 59 times per minute, and yet be set ahead. Yes, ok, "ahead" and "behind". But again, frequency errors in my system clock are the cause of instantaneous offsets in the current time. The time is not being set from scratch, with a clock that has never been set. Technically, there are two pieces of information to be stated: 1) the frequency error or drift rate, and 2) the instantaneous absolute offset error. The displayed units serve to distinguish these, "seconds per second" for a frequency error, and "seconds" for an offset error, and so this is already clear in the system log message. I suppose it depends on what language you are use to hearing, as far as fast/slow or ahead/behind. Yes, the rate and the offset are the two pieces that chrony uses. It is not clear here what you are trying to get at. So, the clock is -3.564452 ahead of NTP time is pretty unambiguous. Ha! No, that sounds like an explicitly contradictory statement to me. I'm not even sure what you are trying to say with that example. It means that to get the system time, you add whatever that number is (-3.564452) to the NTP time. If chrony were to use explicitly signed numbers in the log messages, the parenthetic slow/fast and forward/backward would be most appropriate. It does explicity use signed numbers. A few lines from my measurements.log Date (UTC) Time IP Address L St 1234 abc 5678 LP RP Score Offset Peer del. Peer disp. Root del. Root disp. 2017-10-22 11:02:59 142.103.234.23 N 1 111 6 6 1.00 -3.600e-04 2.440e-02 1.007e-06 0.000e+00 3.052e-05 Note the -3.600 in the offset column. I'm confused - why are you presenting your measurements.log? The log file message that I was discussing was the chrony message in the system log. Because it is an example of the report of a negative offset. So, when you said, above, that "The offset can be positive or negative", were you actually talking about your "measurements.log", and not the system log? The example was the measurement log. Here is an example from /var/log/daemonlog : /var/log/daemonlog:Jan 12 00:58:04 planet chronyd[820]: System clock wrong by -1.121814 seconds, adjustment started Note the negative sign. Was my meaning unclear? "Explicitly signed" means that a positive number always has an explicit leading '+'. Of course that is still only suggestive, that the number could just as well have been negative, and does not
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On 10/30/2017 02:42 PM, Bill Unruh wrote: > On Mon, 30 Oct 2017, James Feeney wrote: >> On 10/30/2017 05:07 AM, Miroslav Lichvar wrote: >>> existing scripts that parse the log >> >> I am not familiar with any of those scripts - who parses chrony log files? - >> so my initial reaction is to be unsympathetic. But I don't know - how >> important is that? > > I certainly do. Mind you I convert the time to seconds myself with awk. Ok - and you would need to modify your script, if the word "wrong" was removed, but not if there were trailing text? >>> Do you think it would make sense to keep the sign ... >> >> I am confused - unclear about what chrony is actually reporting. In my >> example, the implicit sign was positive, '+'. But then, in apparent >> contradiction, "chronyc tracking" reports "Frequency : 5.626 ppm >> slow", which would have accumulated a *negative* offset, '-', of the system >> time relative to the actual UTC time. > > The frequency and the offset are not really related to each other. Uhm - no. The system clock frequency and the instantaneous system clock offset on reboot really *are* related to each other. >> Is there really a sign associated with the system time reported in the log? > > Yes. The offset can be positive or negative. Ok. All of my hardware clocks must be slow. >> Of course, the phrases "stepped forward" and "stepped backward" can be used >> without putting the words "forward" and "backward" in parentheses. Still, >> saying "stepped backward by -2.00 seconds" would have the flavor of a >> double negative. Similarly, the phrase "slow by -2.00 seconds" would >> have a nagging ambiguity about it. > > If you were going to keep the sign, then you would choose one or the other > (ahead or behind is far better than fast or slow, since those are rate terms, > not offset term. A clock can be slow and still be ahead. Eg the clock could > tick only 59 times per minute, and yet be set ahead. Yes, ok, "ahead" and "behind". But again, frequency errors in my system clock are the cause of instantaneous offsets in the current time. The time is not being set from scratch, with a clock that has never been set. Technically, there are two pieces of information to be stated: 1) the frequency error or drift rate, and 2) the instantaneous absolute offset error. The displayed units serve to distinguish these, "seconds per second" for a frequency error, and "seconds" for an offset error, and so this is already clear in the system log message. I suppose it depends on what language you are use to hearing, as far as fast/slow or ahead/behind. > So, the clock is -3.564452 ahead of NTP time is pretty unambiguous. Ha! No, that sounds like an explicitly contradictory statement to me. I'm not even sure what you are trying to say with that example. >> If chrony were to use explicitly signed numbers in the log messages, the >> parenthetic slow/fast and forward/backward would be most appropriate. > > It does explicity use signed numbers. > > A few lines from my measurements.log > > > Date (UTC) Time IP Address L St 1234 abc 5678 LP RP Score Offset > Peer del. Peer disp. Root del. Root disp. > > > 2017-10-22 11:02:59 142.103.234.23 N 1 111 6 6 1.00 > -3.600e-04 2.440e-02 1.007e-06 0.000e+00 3.052e-05 > > Note the -3.600 in the offset column. I'm confused - why are you presenting your measurements.log? The log file message that I was discussing was the chrony message in the system log. So, when you said, above, that "The offset can be positive or negative", were you actually talking about your "measurements.log", and not the system log? Was my meaning unclear? "Explicitly signed" means that a positive number always has an explicit leading '+'. Of course that is still only suggestive, that the number could just as well have been negative, and does not represent a simple magnitude. But that would be my interpretation. I am back to guessing that chrony does not use signed numbers in the system log message, not having actually looked at the code myself. -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On 10/30/2017 05:07 AM, Miroslav Lichvar wrote: > existing scripts that parse the log I am not familiar with any of those scripts - who parses chrony log files? - so my initial reaction is to be unsympathetic. But I don't know - how important is that? > Do you think it would make sense to keep the sign ... I am confused - unclear about what chrony is actually reporting. In my example, the implicit sign was positive, '+'. But then, in apparent contradiction, "chronyc tracking" reports "Frequency : 5.626 ppm slow", which would have accumulated a *negative* offset, '-', of the system time relative to the actual UTC time. Is there really a sign associated with the system time reported in the log? > and indicate whether it's fast or slow in parentheses? If that is necessary, yes, that would work. I'm just not clear about the sign thing, and that the term "wrong" with a back-reference needs to be used. Of course, the phrases "stepped forward" and "stepped backward" can be used without putting the words "forward" and "backward" in parentheses. Still, saying "stepped backward by -2.00 seconds" would have the flavor of a double negative. Similarly, the phrase "slow by -2.00 seconds" would have a nagging ambiguity about it. If chrony were to use explicitly signed numbers in the log messages, the parenthetic slow/fast and forward/backward would be most appropriate. But then, I'd like to see an explicit 'plus' sign on the numbers, when appropriate, along with the explicit 'minus' sign, to make clear that the numbers are actually signed, and are not simply magnitudes. -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On Fri, Oct 27, 2017 at 10:31:16AM -0600, James Feeney wrote: > How about, rather than using the term "wrong", instead use the terms "fast" > and "slow" to describe this quantity "1.693005 seconds"? Then the log > message might read: > > chronyd[622]: System clock fast by 1.693005 seconds, adjustment started > > or > > chronyd[622]: System clock slow by 1.693005 seconds, adjustment started I agree this would be much clearer for the user. I never remember which sign is for fast and slow in what context (it's not consistent unfortunately). The trouble is that it would break existing scripts that parse the log and the parsing itself would be more difficult if it had to look for the word "slow" or "fast" instead of the sign. I'm not sure how important that really is. Do you think it would make sense to keep the sign and indicate whether it's fast or slow in parentheses? System clock wrong by +/-?.??? (slow/fast) ? System clock stepped by +/-?.??? (forward/backward) ? There are other messages that print an offset, so maybe they could be all changed at once to keep it consistent. What do others think? -- Miroslav Lichvar -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.
Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages
On 10/27/2017 11:10 AM, Bill Unruh wrote: > ... Of course the sentence does not indicate what convention chrony uses (ie > does > the plus sign mean that the system clock is ahead or behind the true time) I> > can never remember which convetion is used but I think it is UTC-systemclock. Ha! Yes, difficult to remember, on occasion. And then, it seems simple enough to add a single word to clarify the "direction" of the time offsets. > In the tracking report, one has the terminology > > System time : 0.62501 seconds fast of NTP time > > which would not be ambigous as to the sign convention. Including "fast" there does make it much easier to read. I'm thinking that it would be nice to include this sort of thing also in the log messages. -- To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" in the subject. For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the subject. Trouble? Email listmas...@chrony.tuxfamily.org.