Re: [chrony-users] chronyc tracking question

2017-07-27 Thread Chris Perl
On Thu, Jul 27, 2017 at 4:57 AM, Miroslav Lichvar  wrote:
> On Wed, Jul 26, 2017 at 06:15:13PM -0400, Chris Perl wrote:
>> I'm pretty sure the interleaving gets messed up and I don't wind up
>> getting back the timestamp from the previous exchange (i.e. I'm losing
>> out on the hardware timestamping of t3 on the server).  I think that
>> accounts for the extra 15us that I'm seeing.
>
> Your analysis is correct.

Thanks for confirming.

> With the current implementation, interleaved mode with multiple
> clients on the same IP address is not expected to work. The man page
> mentions this issue with multiple clients behind NAT.

Doh.  Next time I'll try to read the man page more carefully.

> Modifying the code to keep track of individual ports wouldn't help,
> because clients usually change their port between requests.

I had come to the same realization a few hours after I had sent my prior email.

> The timestamps could be separate from IP addresses completely. This
> would allow multiple clients on the same IP address, or even clients
> that change their address between requests. However, a broken client
> sending too many requests would be able to flush timestamps that belong
> to other clients. I'm not sure which is worse. There may be a better
> way to do this.

Interesting thought.  I agree with your sentiment, "I'm not sure which
is worse."

> As a workaround in your case, you could configure the monitoring
> client as a peer or you could run a second server instance on a
> different port serving local time.

I had thought of the latter, but not the former.  I'll try both and
see how I make out.

-- 
To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org 
with "unsubscribe" in the subject.
For help email chrony-users-requ...@chrony.tuxfamily.org 
with "help" in the subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-users] chronyc tracking question

2017-07-27 Thread Miroslav Lichvar
On Wed, Jul 26, 2017 at 06:15:13PM -0400, Chris Perl wrote:
> I'm pretty sure the interleaving gets messed up and I don't wind up
> getting back the timestamp from the previous exchange (i.e. I'm losing
> out on the hardware timestamping of t3 on the server).  I think that
> accounts for the extra 15us that I'm seeing.

Your analysis is correct. 

> Is this something that should work, or is the thing I'm trying to do
> not intended to work?

With the current implementation, interleaved mode with multiple
clients on the same IP address is not expected to work. The man page
mentions this issue with multiple clients behind NAT.

Modifying the code to keep track of individual ports wouldn't help,
because clients usually change their port between requests.

The timestamps could be separate from IP addresses completely. This
would allow multiple clients on the same IP address, or even clients
that change their address between requests. However, a broken client
sending too many requests would be able to flush timestamps that belong
to other clients. I'm not sure which is worse. There may be a better
way to do this.
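
The trade-off described above can be sketched with a toy model. This is
not chrony's code: the class name TimestampStore, its capacity, and the
idea of keying saved transmit timestamps by the server's receive
timestamp are all assumptions made for illustration. The sketch shows a
store decoupled from IP addresses serving two clients behind one
address, and a burst of requests from one client evicting the entries
of the others.

```python
from collections import OrderedDict


class TimestampStore:
    """Toy bounded store of server transmit timestamps (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # rx timestamp -> precise tx timestamp

    def save(self, rx_ts, tx_ts):
        # Evict the oldest entry once the store is full.
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)
        self.entries[rx_ts] = tx_ts

    def lookup(self, rx_ts):
        # In this toy model a request refers to an earlier exchange by the
        # server's receive timestamp; None means "fall back to basic mode".
        return self.entries.get(rx_ts)


store = TimestampStore(capacity=4)

# Two clients sharing one IP address each get their own entry.
store.save(rx_ts=100.0, tx_ts=100.00002)  # client A
store.save(rx_ts=101.0, tx_ts=101.00002)  # client B
a_before_flood = store.lookup(100.0)
b_before_flood = store.lookup(101.0)

# A misbehaving client sends a burst of requests and flushes both entries.
for i in range(4):
    store.save(rx_ts=200.0 + i, tx_ts=200.00002 + i)
a_after_flood = store.lookup(100.0)
b_after_flood = store.lookup(101.0)

print(a_before_flood, b_before_flood, a_after_flood, b_after_flood)
```

With a larger capacity the flood has to be correspondingly larger, which
is the tuning question behind "I'm not sure which is worse".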

As a workaround in your case, you could configure the monitoring
client as a peer or you could run a second server instance on a
different port serving local time.

-- 
Miroslav Lichvar




Re: [chrony-users] chronyc tracking question

2017-07-26 Thread Chris Perl
My best guess at the moment is that this is due to using interleaved mode.

As far as I can tell from the debugging and code reading I've done,
chrony, when acting as a server, will eventually call
`NCR_ProcessRxUnknown' for each request from the client.

Then, chrony will try to figure out if it's seen this client before by
calling `CLG_LogNTPAccess' and `CLG_GetNtpTimestamps', but that is
based on IP address only, not port.

I'm pretty sure the interleaving gets messed up and I don't wind up
getting back the timestamp from the previous exchange (i.e. I'm losing
out on the hardware timestamping of t3 on the server).  I think that
accounts for the extra 15us that I'm seeing.
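
The effect described above can be reproduced with a toy model. This is
not chrony's implementation: IpKeyedServer, run() and the address string
are hypothetical, and the only behaviour modelled is "one saved
timestamp pair per client IP address". A request gets an interleaved
("I") reply only if it refers to the exchange the server has stored for
that address; otherwise the reply is basic ("B").

```python
class IpKeyedServer:
    """Toy server remembering one timestamp pair per client IP (hypothetical)."""

    def __init__(self):
        self.saved = {}  # ip -> (rx_ts of last request, tx_ts of last response)

    def handle(self, ip, origin_ts, rx_ts, tx_ts):
        prev = self.saved.get(ip)
        # Interleaving works only if the request points at the stored exchange.
        interleaved = prev is not None and prev[0] == origin_ts
        self.saved[ip] = (rx_ts, tx_ts)
        return "I" if interleaved else "B"


def run(schedule):
    """Replay a list of client names that all share one IP address."""
    server = IpKeyedServer()
    last_rx = {}  # what each client remembers from its previous exchange
    modes = []
    for t, client in enumerate(schedule):
        rx_ts = float(t)
        origin = last_rx.get(client)
        mode = server.handle("shared-ip", origin, rx_ts, rx_ts + 1e-5)
        last_rx[client] = rx_ts
        modes.append((client, mode))
    return modes


# One client alone: basic on the first exchange, interleaved afterwards.
single = run(["hw", "hw", "hw", "hw"])

# A second instance on the same address polls once in between.
shared = run(["hw", "hw", "sw", "hw", "hw"])

print(single)
print(shared)
```

In the shared run the occasional poller never gets an interleaved reply,
and the frequent poller loses one interleaved exchange after each poll
of the other instance, which is the pattern observed in this thread.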

Is this something that should work, or is the thing I'm trying to do
not intended to work?

On Wed, Jul 26, 2017 at 1:59 PM, Chris Perl  wrote:
> On Wed, Jul 26, 2017 at 12:23 PM, Chris Perl  wrote:
>> Fwiw, I also have the `xleave' option specified for the second instance.
>
> But, interestingly, the `measurements.log' file from the second
> instance always reports "B" packets:
>
> 2017-07-26 17:32:18 192.168.1.100  N  1 111 111    4  4 0.00
> -8.916e-06  4.595e-05  3.401e-07  0.000e+00  1.526e-05 50545030 4B K K
> 2017-07-26 17:32:34 192.168.1.100  N  1 111 111    4  4 0.00
> -8.401e-06  4.537e-05  3.402e-07  0.000e+00  1.526e-05 50545030 4B K K
> 2017-07-26 17:32:50 192.168.1.100  N  1 111 111    4  4 0.00
> -2.156e-05  7.147e-05  3.954e-07  0.000e+00  1.526e-05 50545030 4B K K




Re: [chrony-users] chronyc tracking question

2017-07-26 Thread Chris Perl
On Wed, Jul 26, 2017 at 12:23 PM, Chris Perl  wrote:
> Fwiw, I also have the `xleave' option specified for the second instance.

But, interestingly, the `measurements.log' file from the second
instance always reports "B" packets:

2017-07-26 17:32:18 192.168.1.100  N  1 111 111    4  4 0.00
-8.916e-06  4.595e-05  3.401e-07  0.000e+00  1.526e-05 50545030 4B K K
2017-07-26 17:32:34 192.168.1.100  N  1 111 111    4  4 0.00
-8.401e-06  4.537e-05  3.402e-07  0.000e+00  1.526e-05 50545030 4B K K
2017-07-26 17:32:50 192.168.1.100  N  1 111 111    4  4 0.00
-2.156e-05  7.147e-05  3.954e-07  0.000e+00  1.526e-05 50545030 4B K K
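
For reference, the trailing "4B K K" can be decoded with a few lines of
Python. The helper name parse_mode_field is made up, and the field
meanings are my reading of the measurements log description in the
chrony.conf man page (first character is the NTP mode, B = basic,
I = interleaved, then the source of the local transmit and receive
timestamps: D = daemon, K = kernel, H = hardware).

```python
def parse_mode_field(line):
    """Decode the last three fields of a measurements.log line."""
    mode, tx_src, rx_src = line.split()[-3:]
    exchange = {"B": "basic", "I": "interleaved"}
    source = {"D": "daemon", "K": "kernel", "H": "hardware"}
    return {
        "ntp_mode": int(mode[0]),
        "exchange": exchange[mode[1]],
        "tx_timestamp": source[tx_src],
        "rx_timestamp": source[rx_src],
    }


# First log line from above, joined back into a single line.
line = (
    "2017-07-26 17:32:18 192.168.1.100  N  1 111 111    4  4 0.00 "
    "-8.916e-06  4.595e-05  3.401e-07  0.000e+00  1.526e-05 50545030 4B K K"
)
decoded = parse_mode_field(line)
print(decoded)
```

So every sample here is a basic-mode exchange with kernel timestamps on
both transmit and receive, despite `xleave' being configured.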




Re: [chrony-users] chronyc tracking question

2017-07-26 Thread Chris Perl
On Tue, Apr 25, 2017 at 4:11 AM, Miroslav Lichvar  wrote:
> On Mon, Apr 24, 2017 at 03:54:25PM -0400, Chris Perl wrote:
>> 1.  If there is asymmetry, it's unlikely it is constant for the entire
>> life of the chrony process, assuming you're running chrony for a
>> reasonable period and have a reasonably designed network and your time
>> sources are located reasonably close ("reasonable" can obviously be
>> different for different people).
>
> A major source of constant asymmetry is the timestamping. Unless you
> are using HW timestamping, there can easily be an asymmetry of few
> tens of microseconds due to interrupt coalescing and other delays in
> the kernel, driver and HW. If both ends had the same asymmetry, it
> would cancel out, but at least in my experience that's unusual. Even
> if both machines had the same HW and SW, there may be a difference due
> to the timing of the packets (i.e. server sends a response
> immediately after receiving a request).
>
> For example, here is a client with an Intel i210 card running two
> chronyd instances using the same server. One is using HW timestamping
> and controlling the clock, the other is using SW timestamping and just
> monitoring the server with the noselect option.
>
> # chronyc -h 127.0.0.1 -p 323 -m sources sourcestats
> 210 Number of sources = 1
> MS Name/IP address         Stratum Poll Reach LastRx Last sample
> ===============================================================================
> ^* ntp1.local                    1   0   377     1     +9ns[  +38ns] +/-   26us
> 210 Number of sources = 1
> Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
> ==============================================================================
> ntp1.local                  8   5     9     +0.000      0.018     +0ns    31ns
>
> # chronyc -h 127.0.0.1 -p 324 -m sources sourcestats
> 210 Number of sources = 1
> MS Name/IP address         Stratum Poll Reach LastRx Last sample
> ===============================================================================
> ^? ntp1.local                    1   0   377     4  +9740ns[+9740ns] +/-   56us
> 210 Number of sources = 1
> Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
> ==============================================================================
> ntp1.local                 24  14   354     -0.000      0.001  -8542ns   201ns
>
> While the second instance seems to be stable to within a few hundred
> nanoseconds, the measured offset has an error of about 9 microseconds.
> The extra delay of about 40 microseconds is largely asymmetric.
>
> Don't forget to use the xleave option on your clients if their servers
> are running chrony. Even with SW timestamping it can help a lot.

I tried to set up a similar experiment myself, but am seeing some weird
behavior I don't currently understand.

I am running chrony as of git commit
9d9107dcdb7768a03dc129d33b2a7a25f1eea2f5620bc85eb00cfea07c1b6075 with
your chrony-timestamping.patch from the copr repo applied (so I can
use hardware timestamping on CentOS 7.3).

My server is sync'd to a GPS appliance via PTP with linuxptp using an
X540 interface (this is running in a network namespace to hide it from
chrony, re the requirement of only having one interface that supports
hardware timestamping when running on CentOS 7.3).

I have chrony configured using a PHC reflock and serving time via an
I350 interface and configured to use hardware timestamping.

My client is configured to talk to this server via its own I350
interface, set to use hardware timestamping and using the xleave
option.

When running like this, I'm seeing offsets of around hundreds of
nanos, root dispersion of about 17us and root delay of about 14us.

The weird part happens when I try to run a second instance of chrony
using kernel timestamps to compare against the first with hardware
timestamps.

The second instance of chrony is configured to use different paths for
everything, listens on a different command port and is not set up to
act as a server (i.e. it has no `allow' directive).  Or, at least I
believe it is; it's possible I've missed something.

When I run the second instance of chrony, I see the root delay for the
first instance jump from a very consistent 14us to about 30us (the
30us is pretty consistent with another machine where I'm running a
client using kernel timestamps only).  I'm observing this by running
`chronyc tracking' in a loop every second.  Further digging reveals
that the increase in the root delay is due to an increase in the peer
delay (observed by running `chronyc ntpdata' every 1s).

I have tried varying the `minpoll' and `maxpoll' on the second
instance and have observed that the jump in the peer delay on the
first instance corresponds with the interval at which the second
instance of chrony is polling (e.g. if I set the second instance to
poll every 16s, the jump only happens about once every 16s).

Further, looking at the `measurements.log'

Re: [chrony-users] chronyc tracking question

2017-04-25 Thread Miroslav Lichvar
On Tue, Apr 25, 2017 at 08:19:54AM -0700, Bill Unruh wrote:
> On Tue, 25 Apr 2017, Miroslav Lichvar wrote:
> > For example, here is a client with an Intel i210 card running two
> > chronyd instances using the same server. One is using HW timestamping
> > and controlling the clock, the other is using SW timestamping and just
> > monitoring the server with the noselect option.
> 
> Running two instances of chrony means that one has to wait with its interrupt
> while the other finishes. That can give a large delay. I once tried that (not
> with one doing hardware timestamping however) and found a large delay (about
> 10us if I remember correctly) of the second waiting for the first.

The two instances are sending and receiving packets at different
times, which are timestamped by the kernel or the HW, so I think it
shouldn't matter. Currently it's necessary to use separate instances
for experiments like this, because the kernel doesn't allow SW
timestamping to be used together with HW timestamping. I'm working on
some patches that should remove this limitation.

-- 
Miroslav Lichvar




Re: [chrony-users] chronyc tracking question

2017-04-25 Thread Bill Unruh


On Tue, 25 Apr 2017, Miroslav Lichvar wrote:
> On Mon, Apr 24, 2017 at 03:54:25PM -0400, Chris Perl wrote:
>> 1.  If there is asymmetry, it's unlikely it is constant for the entire
>> life of the chrony process, assuming you're running chrony for a
>> reasonable period and have a reasonably designed network and your time
>> sources are located reasonably close ("reasonable" can obviously be
>> different for different people).
>
> A major source of constant asymmetry is the timestamping. Unless you
> are using HW timestamping, there can easily be an asymmetry of few
> tens of microseconds due to interrupt coalescing and other delays in
> the kernel, driver and HW. If both ends had the same asymmetry, it
> would cancel out, but at least in my experience that's unusual. Even
> if both machines had the same HW and SW, there may be a difference due
> to the timing of the packets (i.e. server sends a response
> immediately after receiving a request).
>
> For example, here is a client with an Intel i210 card running two
> chronyd instances using the same server. One is using HW timestamping
> and controlling the clock, the other is using SW timestamping and just
> monitoring the server with the noselect option.

Running two instances of chrony means that one has to wait with its interrupt
while the other finishes. That can give a large delay. I once tried that (not
with one doing hardware timestamping however) and found a large delay (about
10us if I remember correctly) of the second waiting for the first.




Re: [chrony-users] chronyc tracking question

2017-04-25 Thread Miroslav Lichvar
On Mon, Apr 24, 2017 at 03:54:25PM -0400, Chris Perl wrote:
> 1.  If there is asymmetry, it's unlikely it is constant for the entire
> life of the chrony process, assuming you're running chrony for a
> reasonable period and have a reasonably designed network and your time
> sources are located reasonably close ("reasonable" can obviously be
> different for different people).

A major source of constant asymmetry is the timestamping. Unless you
are using HW timestamping, there can easily be an asymmetry of few
tens of microseconds due to interrupt coalescing and other delays in
the kernel, driver and HW. If both ends had the same asymmetry, it
would cancel out, but at least in my experience that's unusual. Even
if both machines had the same HW and SW, there may be a difference due
to the timing of the packets (i.e. server sends a response
immediately after receiving a request).

For example, here is a client with an Intel i210 card running two
chronyd instances using the same server. One is using HW timestamping
and controlling the clock, the other is using SW timestamping and just
monitoring the server with the noselect option.

# chronyc -h 127.0.0.1 -p 323 -m sources sourcestats
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* ntp1.local                    1   0   377     1     +9ns[  +38ns] +/-   26us
210 Number of sources = 1
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
ntp1.local                  8   5     9     +0.000      0.018     +0ns    31ns

# chronyc -h 127.0.0.1 -p 324 -m sources sourcestats
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^? ntp1.local                    1   0   377     4  +9740ns[+9740ns] +/-   56us
210 Number of sources = 1
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
ntp1.local                 24  14   354     -0.000      0.001  -8542ns   201ns

While the second instance seems to be stable to within a few hundred
nanoseconds, the measured offset has an error of about 9 microseconds.
The extra delay of about 40 microseconds is largely asymmetric.

Don't forget to use the xleave option on your clients if their servers
are running chrony. Even with SW timestamping it can help a lot.

-- 
Miroslav Lichvar




Re: [chrony-users] chronyc tracking question

2017-04-24 Thread Chris Perl
On Mon, Apr 24, 2017 at 7:09 PM, Bill Unruh  wrote:
> But what is a "reasonable network"? Is it defined as one where that does not
> happen? And how do you know your network is "reasonable"?

I think I used a few too many "reasonable's" in my descriptions. :)

I think what I was trying to say was that if you have a network that
is completely under your control then you can potentially make some
statements about what you think may or may not be likely in terms of
delay introduced at various points.

> I would not say it is unlikely. There may well be a reason why one leg or
> the other has a very consistent extra delay, which does not vary.

Yea, agreed.  I think this comes back to what I was saying above.  If
you control the whole thing, then perhaps you can make some stronger
statements about what may or may not be likely based on things like
hardware, software, data flows, etc.

> Sure you can. Put other clocks which do not have that asymmetry on both
> ends.

Yes, of course.  I think I originally wrote something along the lines
"... unless you put other reference clocks on both ends" but then
decided it wasn't worth the extra verbiage.  Apologies for not being
clear.

> These statements are so wishy washy that they do not really add anything for
> anyone reading them.

Haha, nothing worse than being wishy washy.  I was just attempting to
state that I thought it would be unlikely for the delays to change in
lock step like that, but I guess you could come up with plenty of
scenarios where this happens.  Again, coming back to what sorts of
claims you're willing to make about the network.

> See for example the graphs in www.theory.physics.ubc.ca/chrony/chrony.html
> at the bottom of the page. These are offset vs delay graphs. They all
> clearly show that the larger offsets come from asymmetric random delays.
> There is a correlation between offset and delay which would not be there
> if the delays were symmetric and random. This of course does not mean that
> there is not also a systematic asymmetry in the delays which would not
> show up in such a scatter plot.

Awesome, thanks!  I'll take a look at that page in more detail.
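
The correlation argument can be checked with a small simulation. This is
only a sketch under assumptions of my own: the clocks are taken to be
perfectly synchronised, each one-way delay is a 20us base plus
exponentially distributed jitter, and the function names (pearson,
one_way, simulate) are made up for the example. The measured offset is
then (forward - return) / 2 and the measured delay is forward + return.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def one_way(rng, base, jitter):
    # Base delay plus exponential jitter with the given mean (0 = no jitter).
    return base + (rng.expovariate(1.0 / jitter) if jitter > 0 else 0.0)


def simulate(n, fwd_jitter, ret_jitter, seed=1):
    rng = random.Random(seed)
    offsets, delays = [], []
    for _ in range(n):
        d_fwd = one_way(rng, 20e-6, fwd_jitter)
        d_ret = one_way(rng, 20e-6, ret_jitter)
        offsets.append((d_fwd - d_ret) / 2)  # true offset is zero
        delays.append(d_fwd + d_ret)
    return offsets, delays


# Jitter in one direction only: offset and delay move together.
one_sided = pearson(*simulate(10000, 10e-6, 0.0))

# Same jitter distribution in both directions: no linear correlation.
balanced = pearson(*simulate(10000, 10e-6, 10e-6))

print(one_sided, balanced)
```

A constant extra delay on one leg shifts every offset by the same amount
and leaves both correlations unchanged, which is why a systematic
asymmetry does not show up in such a plot.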




Re: [chrony-users] chronyc tracking question

2017-04-24 Thread Bill Unruh



William G. Unruh __| Canadian Institute for| Tel: +1(604)822-3273
Physics&Astronomy _|___ Advanced Research _| Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology | un...@physics.ubc.ca
Canada V6T 1Z1 | and Gravity __|_ www.theory.physics.ubc.ca/

On Mon, 24 Apr 2017, Chris Perl wrote:
> On Thu, Apr 20, 2017 at 5:32 AM, Miroslav Lichvar wrote:
>> Yes. That's a good description. If you would like to improve the
>> manual pages, I'll gladly accept patches :).
>
> I'll take another read through what is already there and see if I
> think I can add anything useful.
>
>> The root distance (root delay / 2 + root dispersion) is the
>> uncertainty of the "true" time, which is increasing when no updates of
>> the clock are made. You can graph "system time" +/- "root distance" to
>> show the maximum assumed error of the clock at given time.
>>
>> Graphing "last offset" can be useful to show the stability of the
>> synchronization and estimate the minimum error of the clock.
>
> Great, thanks, that was very helpful.
>
> In my setup it seems the largest potential error comes from the root
> delay piece (in my case, usually something like 50us).
>
> I believe that taking one half that value is essentially saying that
> it's possible that the delay is asymmetric with one direction being
> instantaneous and the other taking the full amount of the measured
> round trip.

Yes, but without other information that is all you can say. Remember what it
says is "maximum error". You need more information on the distribution of the
delays to make stronger statements.

> That would obviously not generally be the case in a reasonable
> network.  Although I guess "reasonable" is the important part of that
> sentence.

But what is a "reasonable network"? Is it defined as one where that does not
happen? And how do you know your network is "reasonable"?

> But, I think what you're saying is I could watch the measured offsets
> to get a "feel" for how stable things seem to be.

The problem is that those measured offsets will only give a feel for the
"random" component of the offset variation. Thus if there is a gate which adds
50us to every outbound packet there is no way that watching the offsets will
give you a hint that that could be happening.
Of course you could put in another clock (GPS, atomic clock,...) and get a
feel for whether or not such a systematic error exists.

> Maybe something like the following set of statements:
>
> 1.  If there is asymmetry, it's unlikely it is constant for the entire
> life of the chrony process, assuming you're running chrony for a

I would not say it is unlikely. There may well be a reason why one leg or the
other has a very consistent extra delay, which does not vary.

> reasonable period and have a reasonably designed network and your time
> sources are located reasonably close ("reasonable" can obviously be
> different for different people).

Not sure what that sentence says to anyone.

> 2.  If the asymmetry is totally constant, there isn't much you can do
> to detect it.

Sure you can. Put other clocks which do not have that asymmetry on both ends.

> 3.  If there is asymmetry, it would seem unlikely (although not
> impossible) that it would change in such a way that the TX delay and
> the RX delay both changed by the same amount with opposite signs
> (meaning you wouldn't actually see any change to the offset
> measurements).

These statements are so wishy washy that they do not really add anything for
anyone reading them.

> 4.  If asymmetry is introduced then it's possible we could detect that
> via the offsets.  Either seeing some step in the offsets or just an
> increase in the variability of the offset measurements.

See for example the graphs in www.theory.physics.ubc.ca/chrony/chrony.html
at the bottom of the page. These are offset vs delay graphs. They all clearly
show that the larger offsets come from asymmetric random delays. There is a
correlation between offset and delay which would not be there if the delays
were symmetric and random.
This of course does not mean that there is not also a systematic asymmetry in
the delays which would not show up in such a scatter plot.







Re: [chrony-users] chronyc tracking question

2017-04-24 Thread Chris Perl
On Thu, Apr 20, 2017 at 5:32 AM, Miroslav Lichvar  wrote:
> Yes. That's a good description. If you would like to improve the
> manual pages, I'll gladly accept patches :).

I'll take another read through what is already there and see if I
think I can add anything useful.

> The root distance (root delay / 2 + root dispersion) is the
> uncertainty of the "true" time, which is increasing when no updates of
> the clock are made. You can graph "system time" +/- "root distance" to
> show the maximum assumed error of the clock at given time.
>
> Graphing "last offset" can be useful to show the stability of the
> synchronization and estimate the minimum error of the clock.

Great, thanks, that was very helpful.

In my setup it seems the largest potential error comes from the root
delay piece (in my case, usually something like 50us).

I believe that taking one half that value is essentially saying that
it's possible that the delay is asymmetric with one direction being
instantaneous and the other taking the full amount of the measured
round trip.
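
That reading of "root delay / 2" can be checked against the standard NTP
offset and delay formulas. The sketch below is an illustration, not
chrony code: the function name measure, the 1ms true offset and the
zero server processing time are assumptions chosen to keep the
arithmetic simple.

```python
def measure(true_offset, d_fwd, d_ret):
    """Return (offset, delay) computed from the four NTP timestamps."""
    t1 = 0.0                        # client transmit (client clock)
    t2 = t1 + d_fwd + true_offset   # server receive (server clock)
    t3 = t2                         # server transmit, zero processing time
    t4 = t3 - true_offset + d_ret   # client receive (client clock)
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


TRUE_OFFSET = 1e-3   # assumed actual clock difference, 1ms
ROUND_TRIP = 50e-6   # the 50us mentioned above

errors = {}
for name, d_fwd in [("symmetric", 25e-6), ("all forward", 50e-6), ("all return", 0.0)]:
    offset, delay = measure(TRUE_OFFSET, d_fwd, ROUND_TRIP - d_fwd)
    errors[name] = offset - TRUE_OFFSET
    print(f"{name:12s} delay={delay * 1e6:5.1f}us error={errors[name] * 1e6:+6.1f}us")
```

The measured delay is 50us in all three cases, while the offset error
ranges from -25us to +25us, i.e. plus or minus half the round trip in
the two extreme cases.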

That would obviously not generally be the case in a reasonable
network.  Although I guess "reasonable" is the important part of that
sentence.

But, I think what you're saying is I could watch the measured offsets
to get a "feel" for how stable things seem to be.

Maybe something like the following set of statements:

1.  If there is asymmetry, it's unlikely it is constant for the entire
life of the chrony process, assuming you're running chrony for a
reasonable period and have a reasonably designed network and your time
sources are located reasonably close ("reasonable" can obviously be
different for different people).

2.  If the asymmetry is totally constant, there isn't much you can do
to detect it.

3.  If there is asymmetry, it would seem unlikely (although not
impossible) that it would change in such a way that the TX delay and
the RX delay both changed by the same amount with opposite signs
(meaning you wouldn't actually see any change to the offset
measurements).

4.  If asymmetry is introduced then it's possible we could detect that
via the offsets.  Either seeing some step in the offsets or just an
increase in the variability of the offset measurements.




Re: [chrony-users] chronyc tracking question

2017-04-20 Thread Miroslav Lichvar
On Wed, Apr 19, 2017 at 02:34:21PM -0400, Chris Perl wrote:
> I'm trying to figure out what metric(s) from `chronyc tracking' I
> should be logging and graphing to answer the question of "how far off
> is my system from its reference".

> The System time field from `chronyc tracking' represents the
> difference between what chrony believes the "true" time to be and the
> system time.  But, chrony is constantly updating its notion of what
> the "true" time is.  In the absence of updates, it simply does its job
> and drives that difference to 0.

Yes. That's a good description. If you would like to improve the
manual pages, I'll gladly accept patches :).

> If that is true, does that mean that I should be graphing the "System
> time", but that I need to be sure I've got sufficiently fresh samples
> to make sure the notion of the "true" time is correct?  Or, should I
> just be graphing the "Last offset", or perhaps something else
> entirely?

The root distance (root delay / 2 + root dispersion) is the
uncertainty of the "true" time, which is increasing when no updates of
the clock are made. You can graph "system time" +/- "root distance" to
show the maximum assumed error of the clock at given time.

Graphing "last offset" can be useful to show the stability of the
synchronization and estimate the minimum error of the clock.
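
As a rough sketch of how these values could be derived for graphing, the
Python below parses text in the layout printed by `chronyc tracking' and
computes "system time" +/- "root distance". The sample numbers are
illustrative and not taken from this thread, the function names are made
up, and treating "slow of NTP time" as a negative value is a sign
convention chosen for the example.

```python
SAMPLE = """\
System time     : 0.000006523 seconds slow of NTP time
Last offset     : -0.000006747 seconds
Root delay      : 0.013639022 seconds
Root dispersion : 0.001100737 seconds
"""


def parse_tracking(text):
    """Split 'Name : value' lines into a dict (only the first ':' splits)."""
    fields = {}
    for line in text.splitlines():
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def seconds(value):
    # The numeric value is always the first whitespace-separated token.
    return float(value.split()[0])


def error_bounds(text):
    f = parse_tracking(text)
    system_time = seconds(f["System time"])
    if "slow" in f["System time"]:
        system_time = -system_time  # convention: slow clock = negative
    root_distance = seconds(f["Root delay"]) / 2 + seconds(f["Root dispersion"])
    return system_time - root_distance, system_time + root_distance, root_distance


low, high, root_distance = error_bounds(SAMPLE)
print(f"root distance {root_distance:.9f}s, bounds [{low:.9f}, {high:.9f}]")
```

In practice the text would come from running `chronyc tracking'
periodically, with the "Last offset" value graphed alongside the bounds.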

-- 
Miroslav Lichvar




[chrony-users] chronyc tracking question

2017-04-19 Thread Chris Perl
I'm trying to figure out what metric(s) from `chronyc tracking' I
should be logging and graphing to answer the question of "how far off
is my system from its reference".

I believe the "Last offset" field is the most recent offset that
chrony has received from the source that it has selected to
synchronize from (e.g. the `SRC_Instance' which is `SRC_SELECTED').
So, if you were running `chronyc tracking' in a loop and tailing
`tracking.log', I think this field should line up with the "Offset"
column from the tracking file.

I also believe that the "System time" field represents chrony's notion
of how far the system clock is off from the "true" time, and that it's
the samples from the sources that allow chrony to update its notion of
the "true" time.

I've been doing some testing, purposely attempting to cram 10 Gb/s
worth of traffic into a 1 Gb/s host running chrony-3.1 (from the
release tarball) at irregular intervals.

As one might expect, this causes lots of variability in the delay
measurements, and causes "test C" to fail a bunch of the time (as
indicated in the `measurements.log' file).  This is obviously good and
by design to stop chrony from using bogus samples and jumping all over
the place (and I believe can be controlled by adjusting
`maxdelaydevratio').
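
A simplified version of that delay test, as I understand the
`maxdelaydevratio' description in the chrony.conf man page, is sketched
below. It is not chrony's actual code (the real test takes more into
account), the function name and the sample numbers are hypothetical, and
the 10.0 default is what I believe the man page documents.

```python
def passes_delay_test(delay, min_delay, std_dev, max_delay_dev_ratio=10.0):
    """Simplified delay test: reject samples whose delay grew too much.

    delay      round-trip delay of the new sample (seconds)
    min_delay  minimum delay among the previous samples (seconds)
    std_dev    standard deviation of the previous samples (seconds)
    """
    if std_dev <= 0.0:
        return True  # nothing to compare against in this toy version
    return (delay - min_delay) / std_dev <= max_delay_dev_ratio


MIN_DELAY = 45e-6  # assumed quiet-network round trip
STD_DEV = 0.3e-6   # assumed sample standard deviation

quiet = passes_delay_test(46e-6, MIN_DELAY, STD_DEV)       # ratio ~3.3
congested = passes_delay_test(500e-6, MIN_DELAY, STD_DEV)  # ratio ~1500

print(quiet, congested)
```

Raising the ratio lets more of the congested samples through, at the
cost of accepting samples with larger and probably asymmetric delays.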

What I've observed is that if chrony can't get a good sample for a
long period of time (with chrony polling every second, say it can't
get a sample to pass "test C" for something like 60s), the "System
time" is driven to zero.

Is the gist something like the following?

The System time field from `chronyc tracking' represents the
difference between what chrony believes the "true" time to be and the
system time.  But, chrony is constantly updating its notion of what
the "true" time is.  In the absence of updates, it simply does its job
and drives that difference to 0.

If that is true, does that mean that I should be graphing the "System
time", but that I need to be sure I've got sufficiently fresh samples
to make sure the notion of the "true" time is correct?  Or, should I
just be graphing the "Last offset", or perhaps something else
entirely?

Obviously I don't expect chrony to run like this in a real world
scenario, I'm just trying to test out some of the worst cases.
