Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-11-20 Thread Miroslav Lichvar
On Tue, Nov 14, 2017 at 11:08:11AM -0700, James Feeney wrote:
> You seem to be describing some kind of log message that has no meaning in 
> isolation - like an episode of a TV show that makes no sense unless you've 
> seen the entire series.

I think it has a meaning in isolation. It says that the chronyd's idea
of the true time has shifted by x seconds, even though we don't know
what exactly that is or why it changed.

How to describe that in a syslog message?

> Personally, I would say that that is just very bad User Interface design.  In 
> contrast, I believe that each log message should be "self contained", that it 
> should have a clear and precise meaning, even in isolation.
> 
> So far, after much back and forth, I have come away with an incomplete 
> understanding.
> 
> Miroslav, if you really believe in the importance of this "stacked messages", 
> "current remaining correction" concept, then please write a precise 
> mathematical expression for what you are describing, beginning first with a 
> complete list of variable names and their descriptions.

I suspect that description would be very long, for either of those
values, and it couldn't be precisely described in a syslog message.
There are at least four different offsets that chronyd works with,
starting with individual NTP measurements and ending with an estimate
of the offset of the system clock that should be corrected.

BTW, there is an interesting case that could happen if the message was
changed to report the "true" offset of the system clock, but kept the
threshold based on the change. It could report an offset smaller than
the threshold. For example, if the first update was to correct a 10
second offset, and in the middle of the correction there was an update
of -5 seconds, there would be a message saying that the "true" offset
is zero.

-- 
Miroslav Lichvar

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-11-14 Thread Bill Unruh



William G. Unruh __| Canadian Institute for| Tel: +1(604)822-3273
Physics _|___ Advanced Research _| Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology | un...@physics.ubc.ca
Canada V6T 1Z1 | and Gravity __|_ www.theory.physics.ubc.ca/

On Tue, 14 Nov 2017, James Feeney wrote:


On 11/14/2017 01:23 AM, Miroslav Lichvar wrote:

Yes, the "system clock wrong by" messages were meant to be stacked on
top of each other. If we change it to report the current remaining
correction, it won't be clear how much the clock has drifted between
the messages. If the log had

system clock behind 10 seconds
system clock behind 9 seconds

the clock could have drifted by any value between the logchange
threshold and 9 seconds.


If you were to ask me to describe to some other third person what it is that is now being 
reported by chronyd in these log messages, based upon my own understanding, derived from 
your descriptions and Bill's descriptions, I would be unable to provide any kind of 
precise explanation.  I do not mean to be difficult or contrarian.  I just do not 
understand what is being expressed with these "messages ... stacked on top of each 
other."


Each message says that the clock has suffered an unexpected shift in time from
what it should be at that time, and the message says tht that shift is being
corrected. The key works are "unexpected" and "shift". It may also have
suffered a shift in the past, but that past shift is being corrected already
and so it is not unexpected.



You seem to be describing some kind of log message that has no meaning in 
isolation - like an episode of a TV show that makes no sense unless you've seen 
the entire series.

Personally, I would say that that is just very bad User Interface design.  In contrast, I 
believe that each log message should be "self contained", that it should have a 
clear and precise meaning, even in isolation.


Interpreted as above I believe it does have meaning in isolation. "Something
happened to the time and chrony will not try to correct it."


So far, after much back and forth, I have come away with an incomplete 
understanding.

Miroslav, if you really believe in the importance of this "stacked messages", 
"current remaining correction" concept, then please write a precise mathematical 
expression for what you are describing, beginning first with a complete list of variable names and 
their descriptions.




So far, the variables bandied about include:
1) the NTP server clock estimate
2) a not always present on-board Hardware Clock


That has nothing to do with chrony except when the system starts up. So never
worry about that.


3) the kernel's System Clock
4) some "hidden" chronyd internal clock


Chrony corrects errors by altering the drift of the clock (assuming that the
error is not so large as to demand a step of the clock).



5-10) the Six offsets between each pair of those "clocks"


Ignore the hardware clock. It is never of any relevance.



11-X) some, as yet, uncertain set of "partial offsets"


No such thing.


Y-Z) some, as yet, uncertain set of "cumulative offsets"


No such thing.


A-B) possibly some additional set of statistical "running estimates"


No such thing.



If I am able to understand what this "current remaining correction" thing is, I 
can better offer some opinion with respect to your questions:


The time is out by 5 sec. The rate of the clock is changed so that that 5 sec
offset should go to 0 in 100 seconds. The difference between that prediction
and the estimate of the NTP time is the "current remaining offset".




- What value should be compared with the logchange threshold?
- What value should be reported in the message?
- How should be that value described in the message?


Please be certain that each of these proposed "value" choices is included in 
the list of variable names and the list of equations.


I think that you first need to go and study the theory of how ntp and chrony
operate. It is really a bit much to be asking Miroslav to give you a complete
less in the theory of chrony.




I just like when chronyd reports what it is doing when it starts-up.  Still, based upon my best guess, I don't think I 
care about any "remaining correction" that has no relationship to any "user visible" quantity.  I 
assume that you know the story:  "The answer is - 42!  But, what is the question?"  Does chronyd harbor some 
"hidden variables"?


Yes, many. The last N measurements of the offsets from the measured NTP time
from the server, corrected for changes in rate which have occured since they
were measured. These are used to make a least squares fit to estimate what the
difference in rate and time of the system clock and the NTP server's clock.





The old "system clock wrong" I interpreted as an error
that the clock acquired between updates, but "ahead/behind" I'd
interpret relative to NTP or UTC.


So, you are actually talking about two different things? "acquired between 

Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-11-14 Thread James Feeney
On 11/14/2017 01:23 AM, Miroslav Lichvar wrote:
> Yes, the "system clock wrong by" messages were meant to be stacked on
> top of each other. If we change it to report the current remaining
> correction, it won't be clear how much the clock has drifted between
> the messages. If the log had
> 
> system clock behind 10 seconds
> system clock behind 9 seconds
> 
> the clock could have drifted by any value between the logchange
> threshold and 9 seconds.

If you were to ask me to describe to some other third person what it is that is 
now being reported by chronyd in these log messages, based upon my own 
understanding, derived from your descriptions and Bill's descriptions, I would 
be unable to provide any kind of precise explanation.  I do not mean to be 
difficult or contrarian.  I just do not understand what is being expressed with 
these "messages ... stacked on top of each other."

You seem to be describing some kind of log message that has no meaning in 
isolation - like an episode of a TV show that makes no sense unless you've seen 
the entire series.

Personally, I would say that that is just very bad User Interface design.  In 
contrast, I believe that each log message should be "self contained", that it 
should have a clear and precise meaning, even in isolation.

So far, after much back and forth, I have come away with an incomplete 
understanding.

Miroslav, if you really believe in the importance of this "stacked messages", 
"current remaining correction" concept, then please write a precise 
mathematical expression for what you are describing, beginning first with a 
complete list of variable names and their descriptions.

So far, the variables bandied about include:
1) the NTP server clock estimate
2) a not always present on-board Hardware Clock
3) the kernel's System Clock
4) some "hidden" chronyd internal clock
5-10) the Six offsets between each pair of those "clocks"
11-X) some, as yet, uncertain set of "partial offsets"
Y-Z) some, as yet, uncertain set of "cumulative offsets"
A-B) possibly some additional set of statistical "running estimates"

If I am able to understand what this "current remaining correction" thing is, I 
can better offer some opinion with respect to your questions:

> - What value should be compared with the logchange threshold?
> - What value should be reported in the message?
> - How should be that value described in the message?

Please be certain that each of these proposed "value" choices is included in 
the list of variable names and the list of equations.

I just like when chronyd reports what it is doing when it starts-up.  Still, 
based upon my best guess, I don't think I care about any "remaining correction" 
that has no relationship to any "user visible" quantity.  I assume that you 
know the story:  "The answer is - 42!  But, what is the question?"  Does 
chronyd harbor some "hidden variables"?

> The old "system clock wrong" I interpreted as an error
> that the clock acquired between updates, but "ahead/behind" I'd
> interpret relative to NTP or UTC.

So, you are actually talking about two different things? "acquired between 
updates", and "relative to [ the NTP server clock estimate ]"?  But, 
"acquired", is relative to what?  The NTP server clock estimate?

I'd suggest that two different things should be treated as two different 
things, with proper descriptions, so that they are not confused.  Different log 
messages could say:

"System clock acquired a 2second lead between NTP clock estimate updates."
"System clock is leading the NTP clock estimate by 2seconds."

But then, what's the difference?

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-11-14 Thread Miroslav Lichvar
On Fri, Nov 10, 2017 at 12:06:07PM -0800, Bill Unruh wrote:
> > 
> > Bill, I really do not understand what point you are trying to present here, 
> > with respect to the system log message.  I was only suggesting that the log 
> > message should be more clear as to, let us call it, "leading" and 
> > "lagging", "forward" and "backward".
> 
> You are requesting something further. You are also requesting a change in the
> value that is reported, not just making the sign of the error clearer. I am in
> perfect agreement with the request for making the meaning of the sign clearer.
> I am (perhaps) not in agreement with the value you want reported.

Same for me.

I think there are three separate questions:
- What value should be compared with the logchange threshold?
- What value should be reported in the message?
- How should be that value described in the message?

For the first one, I think we should keep using the new adjustment
that is applied on top of the remaining correction in order to avoid
reporting a large adjustment multiple times.

The reported value could be changed to include both the old and new
adjustment, and I think it would be well described as "system clock
ahead/behind".

Alternatively, we can keep reporting the change in the correction, but
that probably needs a different description than "system clock
ahead/behind". The old "system clock wrong" I interpreted as an error
that the clock acquired between updates, but "ahead/behind" I'd
interpret relative to NTP or UTC.

> > On 11/09/2017 09:05 PM, Bill Unruh wrote:
> > > The three seconds is being corrected, meaning that the clock has been 
> > > slowed
> > > down so that say in 10 seconds it would be out by 0 seconds. It is a known
> > > error and it is an error that, if everything is left as it is, will be 0 
> > > in 10
> > > sec.
> > > 
> > > 
> > > > > Precisely, so why would you report that?
> > > > 
> > > > Indeed.  I do not see any reason to report that.
> > > 
> > > But you are claiming you should report that 3 sec.
> > 
> > Uhm - 4seconds, using Miroslav's example.  I am saying that the log message 
> > should report that the system clock offset is currently 4seconds.  I do not 
> > understand where you interpret a "but" out of that.  Miroslav has said 
> > that, presently, chronyd will report instead, what I understand to be a 
> > kind of "differential slew error", which, in this example, would be 
> > "1second", not "4seconds".  And, I was suggesting that the log message 
> > should *not* report "1second".
> 
> 
> There are two components of that 4 seconds. One is the three seconds which is
> already being corrected by the alteration of the slew. The other is the 1
> second which is a new offset. The three seconds is already being corrected,
> and to report it now again may be confusing.

Yes, the "system clock wrong by" messages were meant to be stacked on
top of each other. If we change it to report the current remaining
correction, it won't be clear how much the clock has drifted between
the messages. If the log had

system clock behind 10 seconds
system clock behind 9 seconds

the clock could have drifted by any value between the logchange
threshold and 9 seconds.

> > What is it that you are thinking that the log message should report?
> 
> From what Lichvar says, it seems that it now reports the 1 sec which is an
> additional uncorrected error. The measurments log already reports the raw
> current offset and the corrected
> offset.

The measurements log doesn't actually include the raw offset. I think
only the refclocks log has that. However, the remaining correction is
reported in the tracking log, so it should be possible to reconstruct
the raw offset in the measurements, statistics and tracking logs.

-- 
Miroslav Lichvar

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-11-10 Thread Bill Unruh


Bill, I really do not understand what point you are trying to present here, with respect to the system log message.  I 
was only suggesting that the log message should be more clear as to, let us call it, "leading" and 
"lagging", "forward" and "backward".


You are requesting something further. You are also requesting a change in the
value that is reported, not just making the sign of the error clearer. I am in
perfect agreement with the request for making the meaning of the sign clearer.
I am (perhaps) not in agreement with the value you want reported.



On 11/09/2017 09:05 PM, Bill Unruh wrote:

The three seconds is being corrected, meaning that the clock has been slowed
down so that say in 10 seconds it would be out by 0 seconds. It is a known
error and it is an error that, if everything is left as it is, will be 0 in 10
sec.



Precisely, so why would you report that?


Indeed.  I do not see any reason to report that.


But you are claiming you should report that 3 sec.


Uhm - 4seconds, using Miroslav's example.  I am saying that the log message should report that the system clock offset is 
currently 4seconds.  I do not understand where you interpret a "but" out of that.  Miroslav has said that, presently, 
chronyd will report instead, what I understand to be a kind of "differential slew error", which, in this example, would 
be "1second", not "4seconds".  And, I was suggesting that the log message should *not* report 
"1second".



There are two components of that 4 seconds. One is the three seconds which is
already being corrected by the alteration of the slew. The other is the 1
second which is a new offset. The three seconds is already being corrected,
and to report it now again may be confusing.




What is it that you are thinking that the log message should report?


From what Lichvar says, it seems that it now reports the 1 sec which is an
additional uncorrected error. 
The measurments log already reports the raw current offset and the corrected

offset.




Below, you say that chronyd, by slewing the clock, will correct an offset in "10*DT", 
"where DT is that measured offset".  We see that 10*4seconds = 40seconds, not the 
10seconds in your example.

Think of this another way.  Observing the display time, adding or subtracting 
1second to the display will not give the correct time when the clock is off by 
4seconds.


But that is not what the log reports. Note that as I understand it, chrony
itself reports to other machines the corrected time -- the system time minus
the 3 seconds,( but I am not sure of this). Ie, there is the system time, and 
the
system time corrected for the ongoing slew,





"Slewing" is a different thing from "fix it immediately".  chronyd *will* step the clock when 
configured with "makestep", which is probably what you mean by "fix it immediately".


No, it will only make a step if the step size is larger than some value.


Of course.  I should not have to remind you that the "makestep" option takes a 
"threshold" argument?  What is your point?


chrony measures the offset of the clock by doing a least squared fit on the
past N measurements offsets, to determine what the best value of the current
offset is. Once it has that information it alters the rate of the clock so
that in something like DT*2000 seconds (where DT is that measured offset)  the
offset will be 0. (ntpd makes that 2000 a minimum. chrony will slew so that the
error disappears in 10*DT if DT is very large and the DT is less than the value
to make a step.)


That's useful to know.


Previous meaning "previously begun". Chrony, and ntpd do not correct times by
stepping the clock. They correct it by slightly increasing or decreasing the
rate by which the clock is running, until they have corrected the error.


There always exists an "instantaneous offset", of the system clock relative to 
the NTP server clock, which is an easy thing to describe, under any circumstances.


No, it is not because all the measurements have uncertainties-- due to travel
time fluctuations, due to measurement accuracy, due to interrupt handling
fluctruations, etc. Ie, no you cannot know what the "instantaneous offset" is.


You state the obvious.  What is your point?


I have no idea what is obvious to you.



If I wanted to be pedantic - which I often do - I might have said "the instantaneous offset 
estimate", instead of simply "the instantaneous offset".  But, we are all adults 
here, yes?


Yes, there is a variable frequency oscillator. Or rather the time in seconds
between the ticks of the cpu clock can be changed.


More of an electronics hardware guy would take issue with that blurring of the 
distinction between analog circuitry and digital circuitry. ;)


It is all digital. The cpu clock ticks ticks at a certain, unknown rate. The
system counts those ticks and converts them to time. That coversion process
must be calbrated and that calibration can be changed. It is primarily that
change in calibration that chrony 

Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-11-10 Thread James Feeney
Bill, I really do not understand what point you are trying to present here, 
with respect to the system log message.  I was only suggesting that the log 
message should be more clear as to, let us call it, "leading" and "lagging", 
"forward" and "backward".

On 11/09/2017 09:05 PM, Bill Unruh wrote:
> The three seconds is being corrected, meaning that the clock has been slowed
> down so that say in 10 seconds it would be out by 0 seconds. It is a known
> error and it is an error that, if everything is left as it is, will be 0 in 10
> sec.
> 
> 
>>> Precisely, so why would you report that?
>>
>> Indeed.  I do not see any reason to report that.
> 
> But you are claiming you should report that 3 sec.

Uhm - 4seconds, using Miroslav's example.  I am saying that the log message 
should report that the system clock offset is currently 4seconds.  I do not 
understand where you interpret a "but" out of that.  Miroslav has said that, 
presently, chronyd will report instead, what I understand to be a kind of 
"differential slew error", which, in this example, would be "1second", not 
"4seconds".  And, I was suggesting that the log message should *not* report 
"1second".

What is it that you are thinking that the log message should report?

Below, you say that chronyd, by slewing the clock, will correct an offset in 
"10*DT", "where DT is that measured offset".  We see that 10*4seconds = 
40seconds, not the 10seconds in your example.

Think of this another way.  Observing the display time, adding or subtracting 
1second to the display will not give the correct time when the clock is off by 
4seconds.

>> "Slewing" is a different thing from "fix it immediately".  chronyd *will* 
>> step the clock when configured with "makestep", which is probably what you 
>> mean by "fix it immediately".
> 
> No, it will only make a step if the step size is larger than some value.

Of course.  I should not have to remind you that the "makestep" option takes a 
"threshold" argument?  What is your point?

> chrony measures the offset of the clock by doing a least squared fit on the
> past N measurements offsets, to determine what the best value of the current
> offset is. Once it has that information it alters the rate of the clock so
> that in something like DT*2000 seconds (where DT is that measured offset)  the
> offset will be 0. (ntpd makes that 2000 a minimum. chrony will slew so that 
> the
> error disappears in 10*DT if DT is very large and the DT is less than the 
> value
> to make a step.)

That's useful to know.

>>> Previous meaning "previously begun". Chrony, and ntpd do not correct times 
>>> by
>>> stepping the clock. They correct it by slightly increasing or decreasing the
>>> rate by which the clock is running, until they have corrected the error.
>>
>> There always exists an "instantaneous offset", of the system clock relative 
>> to the NTP server clock, which is an easy thing to describe, under any 
>> circumstances.
> 
> No, it is not because all the measurements have uncertainties-- due to travel
> time fluctuations, due to measurement accuracy, due to interrupt handling
> fluctruations, etc. Ie, no you cannot know what the "instantaneous offset" is.

You state the obvious.  What is your point?

If I wanted to be pedantic - which I often do - I might have said "the 
instantaneous offset estimate", instead of simply "the instantaneous offset".  
But, we are all adults here, yes?

> Yes, there is a variable frequency oscillator. Or rather the time in seconds
> between the ticks of the cpu clock can be changed.

More of an electronics hardware guy would take issue with that blurring of the 
distinction between analog circuitry and digital circuitry. ;)

>> Presumably, a measurement of the current instantaneous absolute offset of 
>> the system clock relative to the NTP server clock.  I believe that that is - 
>> or would be - the intuitive interpretation of most users.  The idea that 
>> this "measurement" reported in the the log message would represent something 
>> much more abstract - and obtuse - than that is - unfortunate.
> 
> No. In chrony it is the least squares linear fit of the last N measurements of
> the offset that is used to estimate the current offset. Because the
> instantaneously measured offset is beset by uncertainties that can be
> "averaged out" by doing a least squares fit. N is chosen actively to make sure
> that over that set of N measurements the offsets are well approximated by a
> straight line with gaussian noise. Unfortunately in order to get a good 
> estimate, one HAS to  use "abstract - and
> obtuse" in your words, procedures. chrony's way of doing it is a factor of 3
> to a 20 times better than ntpd's method according to measurements that both I
> and Lichvar have made.

That's nice.  Still, you seem to misunderstand what I am referring to as 
"obtuse".  Let me try again.  Remember that the topic here is the log message, 
and nothing about the internal workings of chornyd - which we 

Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-11-09 Thread Bill Unruh



William G. Unruh __| Canadian Institute for| Tel: +1(604)822-3273
Physics _|___ Advanced Research _| Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology | un...@physics.ubc.ca
Canada V6T 1Z1 | and Gravity __|_ www.theory.physics.ubc.ca/

On Thu, 9 Nov 2017, James Feeney wrote:


On 11/09/2017 02:22 PM, Bill Unruh wrote:

That is unclear. Chrony knows that it is out by a certain amount. That is why
it is slewing the clock, and in a few seconds or minutes the system time will
be exactly what it thinks NTP time is. It now finds it is out by a second.
Does it report that it is out by a second or by that second plus the three
seconds that it already is correcting.


What does that mean, "already is correcting"?  Either the system clock current 
instantaneous offset is 1 second, 3 seconds, or 4 seconds, relative to the NTP server 
clock.



The three seconds is being corrected, meaning that the clock has been slowed
down so that say in 10 seconds it would be out by 0 seconds. It is a known
error and it is an error that, if everything is left as it is, will be 0 in 10
sec.



Precisely, so why would you report that?


Indeed.  I do not see any reason to report that.


But you are claiming you should report that 3 sec.





The logs are not the place to give a complete description of the theory of
chrony operations. That would just be silly, verbost and totally confusing.
One could put that into the documentation.


Yes.  So, probably not useful to have a log message that even tries to report that 
"differential slew error".


Why? Possibly because your naive expectation is "If the clock is out, fix it
immediately" But that is not what chrony does.


"Slewing" is a different thing from "fix it immediately".  chronyd *will* step the clock when 
configured with "makestep", which is probably what you mean by "fix it immediately".


No, it will only make a step if the step size is larger than some value.





But, I don't actually have enough information to understand what is meant by "didn't finish yet".  When, or how often, 
and in what manner is this "previous slew" measured?  Is there some "current slew" that is distinct from a 
"previous slew"?  Does a previous slew "end" before the system clock becomes synchronous with the NTP server 
clock?


chrony measures the offset of the clock by doing a least squared fit on the
past N measurements offsets, to determine what the best value of the current
offset is. Once it has that information it alters the rate of the clock so
that in something like DT*2000 seconds (where DT is that measured offset)  the  offset will 
be 0. (ntpd makes that 2000 a minimum. chrony will slew so that the error

disappears in 10*DT if DT is very large and the DT is less than the value to
make a step.)






Previous meaning "previously begun". Chrony, and ntpd do not correct times by
stepping the clock. They correct it by slightly increasing or decreasing the
rate by which the clock is running, until they have corrected the error.


There always exists an "instantaneous offset", of the system clock relative to 
the NTP server clock, which is an easy thing to describe, under any circumstances.


No, it is not because all the measurements have uncertainties-- due to travel
time fluctuations, due to measurement accuracy, due to interrupt handling
fluctruations, etc. Ie, no you cannot know what the "instantaneous offset" is.




When a "new measurement" is made of a "previously begun" slewing process - 
increasing or decreasing the rate of the system clock - the log message should be clear as to the 
nature of that measurement.  What was measured?


What is measured is the difference between the average of the time when the
ntp packet was sent out and when it was received again according to the local
clock , and the time when the
remote machine claimed to have received it according to the remote clock .




chrony keeps measuring its deviation from the ntp time, and slews the clock,
at various rates, to correct for difference. This takes place all the time.


So the system clock frequency is changed, rather than the clock value being 
increased or decreased with small discontinuous offsets greater than 2 counts.  
Clearly, that is a challenge for the kernel, since there is no variable 
frequency oscillator hardware, and some hardware counter will need to increment 
or decrement by 2 counts instead of by 1 count at some rate, generated using a 
second counter.


Yes, there is a variable frequency oscillator. Or rather the time in seconds
between the ticks of the cpu clock can be changed.





It tweaks it every time it makes a measurement.


Presumably, a measurement of the current instantaneous absolute offset of the system 
clock relative to the NTP server clock.  I believe that that is - or would be - the 
intuitive interpretation of most users.  The idea that this "measurement" 
reported in the the log message would represent something much more abstract 

Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-11-09 Thread James Feeney
On 11/09/2017 02:22 PM, Bill Unruh wrote:
> That is unclear. Chrony knows that it is out by a certain amount. That is why
> it is slewing the clock, and in a few seconds or minutes the system time will
> be exactly what it thinks NTP time is. It now finds it is out by a second.
> Does it report that it is out by a second or by that second plus the three
> seconds that it already is correcting.

What does that mean, "already is correcting"?  Either the system clock current 
instantaneous offset is 1 second, 3 seconds, or 4 seconds, relative to the NTP 
server clock.
 
> Precisely, so why would you report that?

Indeed.  I do not see any reason to report that.

> The logs are not the place to give a complete description of the theory of
> chrony operations. That would just be silly, verbost and totally confusing.
> One could put that into the documentation.

Yes.  So, probably not useful to have a log message that even tries to report 
that "differential slew error".

> Why? Possibly because your naive expectation is "If the clock is out, fix it
> immediately" But that is not what chrony does.

"Slewing" is a different thing from "fix it immediately".  chronyd *will* step 
the clock when configured with "makestep", which is probably what you mean by 
"fix it immediately".

>> But, I don't actually have enough information to understand what is meant by 
>> "didn't finish yet".  When, or how often, and in what manner is this 
>> "previous slew" measured?  Is there some "current slew" that is distinct 
>> from a "previous slew"?  Does a previous slew "end" before the system clock 
>> becomes synchronous with the NTP server clock?
>>
> 
> Previous meaning "previously begun". Chrony, and ntpd do not correct times by
> stepping the clock. They correct it by slightly increasing or decreasing the
> rate by which the clock is running, until they have corrected the error.

There always exists an "instantaneous offset", of the system clock relative to 
the NTP server clock, which is an easy thing to describe, under any 
circumstances.

When a "new measurement" is made of a "previously begun" slewing process - 
increasing or decreasing the rate of the system clock - the log message should 
be clear as to the nature of that measurement.  What was measured?

> chrony keeps measuring its deviation from the ntp time, and slews the clock,
> at various rates, to correct for difference. This takes place all the time.

So the system clock frequency is changed, rather than the clock value being 
increased or decreased with small discontinuous offsets greater than 2 counts.  
Clearly, that is a challenge for the kernel, since there is no variable 
frequency oscillator hardware, and some hardware counter will need to increment 
or decrement by 2 counts instead of by 1 count at some rate, generated using a 
second counter.

> It tweaks it every time it makes a measurement.

Presumably, a measurement of the current instantaneous absolute offset of the 
system clock relative to the NTP server clock.  I believe that that is - or 
would be - the intuitive interpretation of most users.  The idea that this 
"measurement" reported in the the log message would represent something much 
more abstract - and obtuse - than that is - unfortunate.

> And not to all your questions. You really need to read about the theory of 
> ntpd clock adjustment which is
> that same to lowest order as that of chrony.

That would not hurt, but the issue I am raising is about the meaning of a 
simple log message, not about the theory of the network time protocol.  This 
does not need to be complicated.  I believe that it should not be *made* 
complicated.

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-11-09 Thread Bill Unruh



William G. Unruh __| Canadian Institute for| Tel: +1(604)822-3273
Physics _|___ Advanced Research _| Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology | un...@physics.ubc.ca
Canada V6T 1Z1 | and Gravity __|_ www.theory.physics.ubc.ca/

On Thu, 9 Nov 2017, James Feeney wrote:


On 11/09/2017 05:25 AM, Miroslav Lichvar wrote:

So, if there is a large adjustment running and a new measurement says
the offset of the clock is not what expect (e.g. the clock or the
server has drifted for some reason), should we report by how much the
adjustment which is still running had to be adjusted, or the total
amount of the remaining adjustment? Currently, it's the former.


Well, my intuition and naive assumption would be that any allusion to a system clock 
offset was describing an "absolute offset", relative to the NTP server clock, 
and NOT relative to some chronyd internal process, which, in this case, would be the 
slewing process.  And I also naively would expect that other people would make that same 
assumption.


That is unclear. Chrony knows that it is out by a certain amount. That is why
it is slewing the clock, and in a few seconds or minutes the system time will
be exactly what it thinks NTP time is. It now finds it is out by a second.
Does it report that it is out by a second or by that second plus the three
seconds that it already is correcting.



Any log message that was referring to some offset of the system clock relative to some clock *other* than the 
NTP server clock, in my mind, should require some very very clear and emphasized language.  Even then, how 
would that be worded?  I am still not clear about this offset that is being reported!  A log message would 
seem to require some wording as "involved" as the expression of your example: "the initial 
offset was 5 seconds and the system clock was already corrected by 2 seconds when another measurement is 
made, which says the offset of the system clock is now actually 4 seconds instead of the  expected 3 
seconds."  I don't even know how often "another measurement is made", so I cannot put that 
offset error in a precise context.


Precisely, so why would you report that?




And, why would anyone care about this "differential slew error"?  Is there some attempt 
being made to illustrate a possible hardware fault, for instance, where the system clock is 
drifting faster than the chrony slewing algorithm is able to track?  And then, if that were the 
case, I would expect a different log message, such as: "Unable to correct system clock - 
excessive drift."


So, there could be an initial message, for instance, "The system clock is leading/lagging the NTP server 
clock by N seconds", followed by one of two differing follow-up messages, depending upon whether the 
system clock is corrected by slewing or by stepping.  For slewing, there could be a message that simply says 
something like "Slewing the system clock forward/backward, correction started.", and for stepping, 
a message something like "Stepping the system clock forward/backward by N seconds."


The logs are not the place to give a complete description of the theory of
chrony operations. That would just be silly, verbost and totally confusing.
One could put that into the documentation.





What value should be reported if the previous slew didn't finish yet?
That's the normal case. It's just that the slew is rarely larger than
logchange.


My instinctive answer is "always the absolute offset of the system clock relative to 
the NTP server clock."


Why? Possibly because your naive expectation is "If the clock is out, fix it
immediately" But that is not what chrony does.



But, I don't actually have enough information to understand what is meant by "didn't finish yet".  When, or how often, 
and in what manner is this "previous slew" measured?  Is there some "current slew" that is distinct from a 
"previous slew"?  Does a previous slew "end" before the system clock becomes synchronous with the NTP server 
clock?



Previous meaning "previously begun". Chrony, and ntpd do not correct times by
stepping the clock. They correct it by slightly increasing or decreasing the
rate by which the clock is running, until they have corrected the error. It is
more important that the the clock deliver a smooth time than that it deliver
the correct time. For example, backstepping the clock (so that there are two
times at which the system clock shows 1496549 seconds for example). That kind
of backstepping can cause havoc to the filesystem.



My intuitive assumption would be that the slewing process is simply initiated when chronyd is 
started, and that the slewing process changes its rate dynamically, however much is required to 
"catch up" the system clock to the NTP server clock, and that a single slewing process 
will continue until synchronization is achieved.  Does "slewing" work like that?  Or some 
other way?


chrony keeps measuring its deviation from the ntp 

Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-11-09 Thread James Feeney
On 11/09/2017 05:25 AM, Miroslav Lichvar wrote:
> So, if there is a large adjustment running and a new measurement says
> the offset of the clock is not what expect (e.g. the clock or the
> server has drifted for some reason), should we report by how much the
> adjustment which is still running had to be adjusted, or the total
> amount of the remaining adjustment? Currently, it's the former.

Well, my intuition and naive assumption would be that any allusion to a system 
clock offset was describing an "absolute offset", relative to the NTP server 
clock, and NOT relative to some chronyd internal process, which, in this case, 
would be the slewing process.  And I also naively would expect that other 
people would make that same assumption.

Any log message that was referring to some offset of the system clock relative 
to some clock *other* than the NTP server clock, in my mind, should require 
some very very clear and emphasized language.  Even then, how would that be 
worded?  I am still not clear about this offset that is being reported!  A log 
message would seem to require some wording as "involved" as the expression of 
your example: "the initial offset was 5 seconds and the system clock was 
already corrected by 2 seconds when another measurement is made, which says the 
offset of the system clock is now actually 4 seconds instead of the  expected 3 
seconds."  I don't even know how often "another measurement is made", so I 
cannot put that offset error in a precise context.

And, why would anyone care about this "differential slew error"?  Is there some 
attempt being made to illustrate a possible hardware fault, for instance, where 
the system clock is drifting faster than the chrony slewing algorithm is able 
to track?  And then, if that were the case, I would expect a different log 
message, such as: "Unable to correct system clock - excessive drift."

>> So, there could be an initial message, for instance, "The system clock is 
>> leading/lagging the NTP server clock by N seconds", followed by one of two 
>> differing follow-up messages, depending upon whether the system clock is 
>> corrected by slewing or by stepping.  For slewing, there could be a message 
>> that simply says something like "Slewing the system clock forward/backward, 
>> correction started.", and for stepping, a message something like "Stepping 
>> the system clock forward/backward by N seconds."

> What value should be reported if the previous slew didn't finish yet?
> That's the normal case. It's just that the slew is rarely larger than
> logchange.

My instinctive answer is "always the absolute offset of the system clock 
relative to the NTP server clock."

But, I don't actually have enough information to understand what is meant by 
"didn't finish yet".  When, or how often, and in what manner is this "previous 
slew" measured?  Is there some "current slew" that is distinct from a "previous 
slew"?  Does a previous slew "end" before the system clock becomes synchronous 
with the NTP server clock?

My intuitive assumption would be that the slewing process is simply initiated 
when chronyd is started, and that the slewing process changes its rate 
dynamically, however much is required to "catch up" the system clock to the NTP 
server clock, and that a single slewing process will continue until 
synchronization is achieved.  Does "slewing" work like that?  Or some other way?

I have no idea how often chronyd actually "tweaks" the system clock, whether 
chronyd is in a "slewing" state, or even in a "synchronized" state.  Is there 
some hard-coded maximum allowed offset before the system clock is "tweaked"?  
Or is there a periodic "tweak" rate?  And is this "tweak" interval dynamically 
adjusted?  Is the system clock "frequency" automatically "scaled" by the 
kernel?  I know that those are kind of basic concepts, but I really don't know 
what chronyd is doing at that low level.

I'm appreciating your patience with this explanation...

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-11-09 Thread Miroslav Lichvar
On Tue, Nov 07, 2017 at 02:27:50PM -0700, James Feeney wrote:
> On 11/06/2017 09:17 AM, Miroslav Lichvar wrote:
> > For example, if the initial offset was 5 seconds and the system clock
> > was already corrected by 2 seconds when another measurement is made,
> > which says the offset of the system clock is now actually 4 seconds
> > instead of the expected 3, a message will be logged that the clock is
> > wrong by 1 second. The question is which offset of the system clock
> > should be reported (4 or 1) and when it should be reported (abs(4) >
> > logchange or abs(1) > logchange).
>  
> Ah - too many small numbers - is the "1" from the "5-4" or from the "4-3"?

It's the "4-3".

> I do not understand in what sense the clock is then wrong by 1 second.  And, 
> I am thinking that the log message should never say that - too confusing.  If 
> I am understanding, you are saying that currently, a message like that would 
> be logged - and maybe there should not be a message like that.  It is not 
> immediately evident to me why a message like that would be informative.  It 
> may be better to only log a simpler message, describing the *current* 
> relationship between the system clock and the NTP server clock.

I think the idea was to warn the user when the clock needs a larger
adjustment. The logchange directive which configures this message is
described in the manual as "the threshold for the adjustment of the
system clock that will generate a syslog message. An example ...
message to be generated if a system clock error of over 0.1 seconds
starts to be compensated".

So, if there is a large adjustment running and a new measurement says
the offset of the clock is not what expect (e.g. the clock or the
server has drifted for some reason), should we report by how much the
adjustment which is still running had to be adjusted, or the total
amount of the remaining adjustment? Currently, it's the former.

> So, there could be an initial message, for instance, "The system clock is 
> leading/lagging the NTP server clock by N seconds", followed by one of two 
> differing follow-up messages, depending upon whether the system clock is 
> corrected by slewing or by stepping.  For slewing, there could be a message 
> that simply says something like "Slewing the system clock forward/backward, 
> correction started.", and for stepping, a message something like "Stepping 
> the system clock forward/backward by N seconds."

What value should be reported if the previous slew didn't finish yet?
That's the normal case. It's just that the slew is rarely larger than
logchange.

-- 
Miroslav Lichvar

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-11-07 Thread Bill Unruh


On Tue, 7 Nov 2017, James Feeney wrote:


On 11/06/2017 09:17 AM, Miroslav Lichvar wrote:

From the other suggestions that have been made, I liked best "was
stepped backward/forward".


That's good too.


For example, if the initial offset was 5 seconds and the system clock
was already corrected by 2 seconds when another measurement is made,
which says the offset of the system clock is now actually 4 seconds
instead of the expected 3, a message will be logged that the clock is
wrong by 1 second. The question is which offset of the system clock
should be reported (4 or 1) and when it should be reported (abs(4) >
logchange or abs(1) > logchange).


Ah - too many small numbers - is the "1" from the "5-4" or from the "4-3"?

I do not understand in what sense the clock is then wrong by 1 second.  And, I 
am thinking that the log message should never say that - too confusing.  If I 
am understanding, you are saying that currently, a message like that would be 
logged - and maybe there should not be a message like that.  It is not 
immediately evident to me why a message like that would be informative.  It may 
be better to only log a simpler message, describing the *current* relationship 
between the system clock and the NTP server clock.


The system is drifting the clock so as to get rid of the 5 second offset. It
should have drifted it by 2 seconds by this time, so that time should have
read 3 sec wrong. But instead it is 4 seconds wrong. If one allowed the drift
to continue it would be 0 seconds wrong in 1 1/2 measurement periods. With
that same drift however, it would, at the end of the drift still be 1 second
out. Ie, the clock is now out by 1 second from the expected drifting clock.






There are at least the following options:
1. change the message to better describe what is behind or ahead
2. change the message to print the real offset of the system clock
   a) print it on each update as long as the offset is larger than
  logchange
   b) print it only when the change in the offset is larger than
  logchange

Thoughts?


If I am understanding, I think I prefer option '2.a)'.


What I don't like about 2.a is that if there is a large initial offset
and stepping is not enabled by "makestep", there may be a large number
of the "clock wrong by" messages in the log, even if everything is
working as expected.


I am guessing that you mean, if a slewing correction is used, that there would 
then be a log message for every instance in which a small change is made?  Yes, 
that would be too much unimportant information.

I was supposing that alternative '2.b)' would mean that no message would be printed 
saying anything about the system clock when the offset was less than logchange.  I may 
have misunderstood.  I'm thinking that something should be printed initially about the 
system clock offset, but then, subsequently, only "state change" events should 
be printed in the log.  For instance, if the system clock drifts and, for some reason, 
the offset exceeds logchange, or if, when slewing the system clock, the system clock has 
achieved synchronization.

So, there could be an initial message, for instance, "The system clock is leading/lagging the NTP server 
clock by N seconds", followed by one of two differing follow-up messages, depending upon whether the 
system clock is corrected by slewing or by stepping.  For slewing, there could be a message that simply says 
something like "Slewing the system clock forward/backward, correction started.", and for stepping, 
a message something like "Stepping the system clock forward/backward by N seconds."



Slewing is the expected behaviour. Stepping is not. Thus one would expect a
log warning on stepping but not slewing, is what I would expect.



--
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



--
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-11-07 Thread James Feeney
On 11/06/2017 09:17 AM, Miroslav Lichvar wrote:
> From the other suggestions that have been made, I liked best "was
> stepped backward/forward".

That's good too.
 
> For example, if the initial offset was 5 seconds and the system clock
> was already corrected by 2 seconds when another measurement is made,
> which says the offset of the system clock is now actually 4 seconds
> instead of the expected 3, a message will be logged that the clock is
> wrong by 1 second. The question is which offset of the system clock
> should be reported (4 or 1) and when it should be reported (abs(4) >
> logchange or abs(1) > logchange).
 
Ah - too many small numbers - is the "1" from the "5-4" or from the "4-3"?

I do not understand in what sense the clock is then wrong by 1 second.  And, I 
am thinking that the log message should never say that - too confusing.  If I 
am understanding, you are saying that currently, a message like that would be 
logged - and maybe there should not be a message like that.  It is not 
immediately evident to me why a message like that would be informative.  It may 
be better to only log a simpler message, describing the *current* relationship 
between the system clock and the NTP server clock.


>>> There are at least the following options:
>>> 1. change the message to better describe what is behind or ahead
>>> 2. change the message to print the real offset of the system clock
>>>a) print it on each update as long as the offset is larger than
>>>   logchange
>>>b) print it only when the change in the offset is larger than
>>>   logchange
>>>
>>> Thoughts?
>>
>> If I am understanding, I think I prefer option '2.a)'.
> 
> What I don't like about 2.a is that if there is a large initial offset
> and stepping is not enabled by "makestep", there may be a large number
> of the "clock wrong by" messages in the log, even if everything is
> working as expected.

I am guessing that you mean, if a slewing correction is used, that there would 
then be a log message for every instance in which a small change is made?  Yes, 
that would be too much unimportant information.

I was supposing that alternative '2.b)' would mean that no message would be 
printed saying anything about the system clock when the offset was less than 
logchange.  I may have misunderstood.  I'm thinking that something should be 
printed initially about the system clock offset, but then, subsequently, only 
"state change" events should be printed in the log.  For instance, if the 
system clock drifts and, for some reason, the offset exceeds logchange, or if, 
when slewing the system clock, the system clock has achieved synchronization.

So, there could be an initial message, for instance, "The system clock is 
leading/lagging the NTP server clock by N seconds", followed by one of two 
differing follow-up messages, depending upon whether the system clock is 
corrected by slewing or by stepping.  For slewing, there could be a message 
that simply says something like "Slewing the system clock forward/backward, 
correction started.", and for stepping, a message something like "Stepping the 
system clock forward/backward by N seconds."

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-10-31 Thread James Feeney
On 10/31/2017 10:41 AM, James Feeney wrote:
> "ahead" and "behind" also will work for both the offset message and the> "was 
> stepped" message.

Also:
forward / back
advanced / delayed
leading / lagging   These are more traditional technical terms describing 
offsets.
incremented / decremented
increased / decreased   "stepped forward by" -> "increased by", etc.
raised / loweredHa!  Our concept of time is so "horizontal".
early / late

Hmm - on second thought, "was stepped ahead" seems natural, but "was stepped 
behind", not so much.  And then, "clock behind by 3.00 seconds" followed by 
"was stepped ahead by 3.00 seconds" might seem to mix-together "behind" and 
"ahead" in a contradictory way.  Different word pairs for each message might be 
easier to read and understand.

For reference, from my original example:

chronyd[622]: System clock wrong by 1.693005 seconds, adjustment started
chronyd[622]: System clock was stepped by 1.693005 seconds

I might lean more toward "leading / lagging" and "increased / decreased".

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-10-31 Thread James Feeney
On 10/31/2017 05:12 AM, Miroslav Lichvar wrote:
>> For system logs messages
>> "ahead" / "behind" for offset
>> "fast" / "slow" for freq
> That looks good to me.

"ahead" and "behind" also will work for both the offset message and the "was 
stepped" message.

> Another question is whether it's correct to say "System clock". The
> messages don't actually print the offset of the system clock, but
> rather the change in the offset relative to the NTP time that chronyd
> is tracking. On start that is the same as the offset of the system
> clock, but if there is an unfinished correction, it's the difference
> between the old offset and new offset.

I'm sorry, I don't understand.  What does that mean, "the change in the offset"?

The quantity ( system_clock_time - NTP_server_time ), I think I understand.

Now, are you describing something like
 (( new_system_clock_time - new_NTP_server_time ) -
 ( old_system_clock_time - old_NTP_server_time ))?

Are you saying that there is a system log message that would be saying "the 
system clock was behind by 5 seconds, but now, after some correction, the 
system clock is behind by only 3 seconds, and the offset has been 'reduced' by 
2 seconds"?

That is what "the difference between the old offset and new offset" suggests to 
me.  But then - really?  Is chrony reporting that?  Why would anyone want to 
see a report about that?  Instead, it might be useful to know, in that example, 
that there was still 3 seconds of correction remaining.  But that quantity 
would be exactly the same as saying "the system clock is now behind by 3 
seconds".

So, I need some help there.

> There are at least the following options:
> 1. change the message to better describe what is behind or ahead
> 2. change the message to print the real offset of the system clock
>a) print it on each update as long as the offset is larger than
>   logchange
>b) print it only when the change in the offset is larger than
>   logchange
> 
> Thoughts?

If I am understanding, I think I prefer option '2.a)'.

Option '2.b)' does not tell me what event is going to compel a system log 
message - at the first state-change?  And only once?  Where option '2.a)' will 
print a log message "on each update" - which I don't know when chrony normally 
makes those updates - but then, the log messages would inform me.  That sounds 
like a good thing.

I am suspicious that option '1.' means having to describe some really 
complicated concept, this "change in the offset".  But I'm only guessing, since 
I don't actually understand the "what".  Could you give an example?

In common speaking, I still tend to speak in terms of frequency.  I might say 
something like "That clock is running fast" and "That clock is running slow", 
where "running" and "slow" are adjectives, as in a "fast running clock" and a 
"slow running clock".  Somehow, instinctively, that is the information for 
which I am searching.  I can infer a "slow clock" when the system clock offset 
is "behind", but in my mind, I want to reduce the concept to an effective 
relative frequency.  Of course, subsequently, I will want to quantify the 
resulting offset.

I don't know that other people do that in their head.  But it might be 
something to consider, in the printing of system log messages.  A "complicated" 
log message might say something like "The system clock is slow and is behind by 
3.00 seconds."  That's a bit more like natural human speech - though it may 
not be as "manly" as a terse and cryptic log message.  Ha!

Of course, there are also assumptions there,
 1) that there exists some local reference clock, and
 2) that local clock had been set to the correct time in the past, and
 3) that that local clock was used to set the system clock, and
 4) the system clock is an analog of the local reference clock.

Assumption 4 is not actually valid if, for instance, the local reference clock 
and the system clock operate from different crystals, which is common.  The 
system clock might actually run fast while the local clock runs slow.  But, as 
you suggest, the casual assumption is also that "the offset is larger than 
logchange", which, I think, leaves us pretty much in the realm of setting the 
system clock at boot-up, or at wake-up.  I suspect that the casual user 
understands this.  Once the system clock has been set, tweaking the system 
clock drift is at a very different scale, and no system log message is printed 
for that change.

I'm also assuming that booting a device that has no local reference clock, like 
a network appliance, a router or a switch or an access point, there is not 
going to be any "offset" message or any "stepped by" message printed in the 
system log.  Or, maybe the system clock time simply defaults to the epic, "0", 
and the offset to the current time is shown?

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email 

Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-10-30 Thread James Feeney
On 10/30/2017 05:12 PM, Bill Unruh wrote:
> Yes, that occurs in the system logs (/var/log/messages, or journalctl ) and I
> do not parse those. There is virtually nothing there in the older versions of
> chrony. I go to the /var/log/chrony/* logs

Ok, I see.  But that's a different thing altogether.

Do you have any preference with respect to just the system log messages?

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-10-30 Thread Bill Unruh





On 10/30/2017 02:42 PM, Bill Unruh wrote:

On Mon, 30 Oct 2017, James Feeney wrote:

On 10/30/2017 05:07 AM, Miroslav Lichvar wrote:

existing scripts that parse the log


I am not familiar with any of those scripts - who parses chrony log files? - so 
my initial reaction is to be unsympathetic.  But I don't know - how important 
is that?


I certainly do. Mind you I convert the time to seconds myself with awk.


Ok - and you would need to modify your script, if the word "wrong" was removed, 
but not if there were trailing text?


Yes, that occurs in the system logs (/var/log/messages, or journalctl ) and I
do not parse those. There is virtually nothing there in the older versions of
chrony. I go to the /var/log/chrony/* logs







Do you think it would make sense to keep the sign ...


I am confused - unclear about what chrony is actually reporting.  In my example, the implicit sign 
was positive, '+'.  But then, in apparent contradiction, "chronyc tracking" reports 
"Frequency   : 5.626 ppm slow", which would have accumulated a *negative* offset, 
'-', of the system time relative to the actual UTC time.


The frequency and the offset are not really related to each other.


Uhm - no.  The system clock frequency and the instantaneous system clock offset 
on reboot really *are* related to each other.


Not sure what you mean. On boot, the system clock frequency is irrelevant
since it was not operating between reboots. What is important is the rtc clock
offset and frequency, and yes, they can be related to each other.





Is there really a sign associated with the system time reported in the log?


Yes. The offset can be positive or negative.


Ok.  All of my hardware clocks must be slow.


Could be. There is certainly a sign associated.





Of course, the phrases "stepped forward" and "stepped backward" can be used without putting the words 
"forward" and "backward" in parentheses.  Still, saying "stepped backward by -2.00 seconds" would have 
the flavor of a double negative.  Similarly, the phrase "slow by -2.00 seconds" would have a nagging ambiguity about it.


If you were going to keep the sign, then you would choose one or the other
(ahead or behind is far better than fast or slow, since those are rate terms,
not offset term. A clock can be slow and still be ahead. Eg the clock could
tick only 59 times per minute, and yet be set ahead.


Yes, ok, "ahead" and "behind".

But again, frequency errors in my system clock are the cause of instantaneous offsets in the 
current time.  The time is not being set from scratch, with a clock that has never been set.  
Technically, there are two pieces of information to be stated: 1) the frequency error or drift 
rate, and 2) the instantaneous absolute offset error.  The displayed units serve to distinguish 
these, "seconds per second" for a frequency error, and "seconds" for an offset 
error, and so this is already clear in the system log message.  I suppose it depends on what 
language you are use to hearing, as far as fast/slow or ahead/behind.


Yes, the rate and the offset are the two pieces that chrony uses. It is not
clear here what you are trying to get at.




So, the clock is -3.564452 ahead of NTP time is pretty unambiguous.


Ha!  No, that sounds like an explicitly contradictory statement to me.  I'm not 
even sure what you are trying to say with that example.


It means that to get the system time, you add whatever that number is
(-3.564452) to the NTP time.




If chrony were to use explicitly signed numbers in the log messages, the 
parenthetic slow/fast and forward/backward would be most appropriate.


It does explicity use signed numbers.

A few lines from my measurements.log


   Date (UTC) Time IP Address   L St 1234 abc 5678 LP RP Score Offset Peer 
del. Peer disp. Root del.  Root disp.
   

   2017-10-22 11:02:59 142.103.234.23  N  1  111   6  6 1.00 -3.600e-04 
 2.440e-02  1.007e-06  0.000e+00  3.052e-05

Note the -3.600 in the offset column.


I'm confused - why are you presenting your measurements.log?  The log file 
message that I was discussing was the chrony message in the system log.


Because it is an example of the report of a negative offset.




So, when you said, above, that "The offset can be positive or negative", were you 
actually talking about your "measurements.log", and not the system log?


The example was the measurement log. Here is an example from
/var/log/daemonlog
:
/var/log/daemonlog:Jan 12 00:58:04 planet chronyd[820]: System clock wrong by
-1.121814 seconds, adjustment started

Note the negative sign.




Was my meaning unclear? "Explicitly signed" means that a positive number always 
has an explicit leading '+'.  Of course that is still only suggestive, that the number 
could just as well have been negative, and does not 

Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-10-30 Thread James Feeney
On 10/30/2017 02:42 PM, Bill Unruh wrote:
> On Mon, 30 Oct 2017, James Feeney wrote:
>> On 10/30/2017 05:07 AM, Miroslav Lichvar wrote:
>>> existing scripts that parse the log
>>
>> I am not familiar with any of those scripts - who parses chrony log files? - 
>> so my initial reaction is to be unsympathetic.  But I don't know - how 
>> important is that?
> 
> I certainly do. Mind you I convert the time to seconds myself with awk.

Ok - and you would need to modify your script, if the word "wrong" was removed, 
but not if there were trailing text?

>>> Do you think it would make sense to keep the sign ...
>>
>> I am confused - unclear about what chrony is actually reporting.  In my 
>> example, the implicit sign was positive, '+'.  But then, in apparent 
>> contradiction, "chronyc tracking" reports "Frequency   : 5.626 ppm 
>> slow", which would have accumulated a *negative* offset, '-', of the system 
>> time relative to the actual UTC time.
> 
> The frequency and the offset are not really related to each other.

Uhm - no.  The system clock frequency and the instantaneous system clock offset 
on reboot really *are* related to each other.

>> Is there really a sign associated with the system time reported in the log?
> 
> Yes. The offset can be positive or negative.

Ok.  All of my hardware clocks must be slow.

>> Of course, the phrases "stepped forward" and "stepped backward" can be used 
>> without putting the words "forward" and "backward" in parentheses.  Still, 
>> saying "stepped backward by -2.00 seconds" would have the flavor of a 
>> double negative.  Similarly, the phrase "slow by -2.00 seconds" would 
>> have a nagging ambiguity about it.
> 
> If you were going to keep the sign, then you would choose one or the other
> (ahead or behind is far better than fast or slow, since those are rate terms,
> not offset term. A clock can be slow and still be ahead. Eg the clock could
> tick only 59 times per minute, and yet be set ahead.

Yes, ok, "ahead" and "behind".

But again, frequency errors in my system clock are the cause of instantaneous 
offsets in the current time.  The time is not being set from scratch, with a 
clock that has never been set.  Technically, there are two pieces of 
information to be stated: 1) the frequency error or drift rate, and 2) the 
instantaneous absolute offset error.  The displayed units serve to distinguish 
these, "seconds per second" for a frequency error, and "seconds" for an offset 
error, and so this is already clear in the system log message.  I suppose it 
depends on what language you are use to hearing, as far as fast/slow or 
ahead/behind.

> So, the clock is -3.564452 ahead of NTP time is pretty unambiguous.

Ha!  No, that sounds like an explicitly contradictory statement to me.  I'm not 
even sure what you are trying to say with that example.

>> If chrony were to use explicitly signed numbers in the log messages, the 
>> parenthetic slow/fast and forward/backward would be most appropriate.
> 
> It does explicity use signed numbers.
> 
> A few lines from my measurements.log
> 
> 
>    Date (UTC) Time IP Address   L St 1234 abc 5678 LP RP Score Offset 
> Peer del. Peer disp. Root del.  Root disp.
>    
> 
>    2017-10-22 11:02:59 142.103.234.23  N  1  111   6  6 1.00 
> -3.600e-04  2.440e-02  1.007e-06  0.000e+00  3.052e-05
> 
> Note the -3.600 in the offset column.

I'm confused - why are you presenting your measurements.log?  The log file 
message that I was discussing was the chrony message in the system log.

So, when you said, above, that "The offset can be positive or negative", were 
you actually talking about your "measurements.log", and not the system log?

Was my meaning unclear? "Explicitly signed" means that a positive number always 
has an explicit leading '+'.  Of course that is still only suggestive, that the 
number could just as well have been negative, and does not represent a simple 
magnitude.  But that would be my interpretation.

I am back to guessing that chrony does not use signed numbers in the system log 
message, not having actually looked at the code myself.

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-10-30 Thread James Feeney
On 10/30/2017 05:07 AM, Miroslav Lichvar wrote:
> existing scripts that parse the log

I am not familiar with any of those scripts - who parses chrony log files? - so 
my initial reaction is to be unsympathetic.  But I don't know - how important 
is that?

> Do you think it would make sense to keep the sign ...

I am confused - unclear about what chrony is actually reporting.  In my 
example, the implicit sign was positive, '+'.  But then, in apparent 
contradiction, "chronyc tracking" reports "Frequency   : 5.626 ppm slow", 
which would have accumulated a *negative* offset, '-', of the system time 
relative to the actual UTC time.

Is there really a sign associated with the system time reported in the log?

> and indicate whether it's fast or slow in parentheses?

If that is necessary, yes, that would work.  I'm just not clear about the sign 
thing, and that the term "wrong" with a back-reference needs to be used.

Of course, the phrases "stepped forward" and "stepped backward" can be used 
without putting the words "forward" and "backward" in parentheses.  Still, 
saying "stepped backward by -2.00 seconds" would have the flavor of a 
double negative.  Similarly, the phrase "slow by -2.00 seconds" would have 
a nagging ambiguity about it.

If chrony were to use explicitly signed numbers in the log messages, the 
parenthetic slow/fast and forward/backward would be most appropriate.

But then, I'd like to see an explicit 'plus' sign on the numbers, when 
appropriate, along with the explicit 'minus' sign, to make clear that the 
numbers are actually signed, and are not simply magnitudes.

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-10-30 Thread Miroslav Lichvar
On Fri, Oct 27, 2017 at 10:31:16AM -0600, James Feeney wrote:
> How about, rather than using the term "wrong", instead use the terms "fast" 
> and "slow" to describe this quantity "1.693005 seconds"?  Then the log 
> message might read:
> 
> chronyd[622]: System clock fast by 1.693005 seconds, adjustment started
> 
> or
> 
> chronyd[622]: System clock slow by 1.693005 seconds, adjustment started

I agree this would be much clearer for the user. I never remember
which sign is for fast and slow in what context (it's not consistent
unfortunately). The trouble is that it would break existing scripts
that parse the log and the parsing itself would be more difficult if
it had to look for the word "slow" or "fast" instead of the sign. I'm
not sure how important that really is.

Do you think it would make sense to keep the sign and indicate whether
it's fast or slow in parentheses?

System clock wrong by +/-?.??? (slow/fast) ?

System clock stepped by +/-?.??? (forward/backward) ?

There are other messages that print an offset, so maybe they could be
all changed at once to keep it consistent.

What do others think?

-- 
Miroslav Lichvar

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] new feature request: add "fast" and "slow" to "clock wrong" and "clock stepped" log messages

2017-10-27 Thread James Feeney
On 10/27/2017 11:10 AM, Bill Unruh wrote:
> ... Of course the sentence does not indicate what convention chrony uses (ie 
> does
> the plus sign mean that the system clock is ahead or behind the true time) I> 
> can never remember which convetion is used but I think it is UTC-systemclock.

Ha!  Yes, difficult to remember, on occasion.  And then, it seems simple enough 
to add a single word to clarify the "direction" of the time offsets.

> In the tracking report, one has the terminology
> 
> System time : 0.62501 seconds fast of NTP time
> 
> which would not be ambigous as to the sign convention.

Including "fast" there does make it much easier to read.  I'm thinking that it 
would be nice to include this sort of thing also in the log messages.

-- 
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.