> On Dec 05, 2016, at 01:05, Miroslav Lichvar <mlich...@redhat.com> wrote:
> 
> If I understand your change correctly, it increases the minimum
> acceptable delay, which increases the number of combined readings and
> makes the offset more stable. I'm worried on a busy system bad
> readings could be included and disrupt the mean value.

On a busier systems, I was actually seeing more error than on quiet systems. 
This brings the noise and spike level on the busy system down to the same level 
as the quiet system.

The change has a couple of things it does. First, it ignore bad slots. On all 
the systems I’ve tested, slot one is always the worst slot, which long delays, 
resulting in a baseline skew. The second is avoiding the SysPrecision variable 
as a gate. Using this results in a tendency to select the wider slots for 
averaging, while ignoring the consistent slots. In my testing on busy systems, 
it discarded many slots, sometimes to the point of all but two slots. This is 
obviously bad for averaging. My first attempt was simply to remove the 
averaging and use the best slot. This provided an improvement on the noise over 
the prior approach, but using the slots within 10% is smoother.

FWIW, I haven’t dug into how SysPrecision is calculated, but in looking at 
several systems it appears to be inconsistent on identical hardware.


> Note that if you save the offset between PHC and system clock to a
> double, you will lose precision when the offset is large as double has
> a 53-bit precision.

I agree. I think all of this should be done using 64 bit integers, but the code 
that was there was using floating point so I stayed with that to fit in. Use of 
doubles seems rather pervasive throughout chrony. I was very surprised by this. 
I think it makes sense to use integer math for as much as possible. At least 
for anything to do with time intervals. But it seemed a bit much to rewrite 
everything to address this one issue. :)

Denny

Reply via email to