Re: [chrony-dev] SW/HW timestamping on Linux

2016-11-18 Thread Denny Page
I’ll have to come back to this after the offset issue is resolved.

Denny



> On Nov 18, 2016, at 06:37, Miroslav Lichvar wrote:
> 
> On Thu, Nov 17, 2016 at 05:44:23PM -0800, Denny Page wrote:
>> Although reduced, I’m still seeing spikes with the patch below.
> 
> I'm not sure what could be wrong at this point. Maybe it really is a
> kernel or HW issue. I'm wondering what would be the best way to
> confirm or reject that idea.


--
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] SW/HW timestamping on Linux

2016-11-18 Thread Denny Page
Miroslav,

I believe that the hardware NTP device, chrony, or both, are 
striking/calculating timestamps incorrectly. I have a test in mind that will 
allow me to determine if this is correct, and if so which. Back to you soon.

Denny



> On Nov 18, 2016, at 00:00, Miroslav Lichvar wrote:
> 
> On Thu, Nov 17, 2016 at 05:49:44PM -0800, Denny Page wrote:
>> This port speed differential appears to result in an asymmetry in 
>> transmit/receive time which significantly affects the calculations. If I 
>> lock the monitor host port at 100Mb, all three units show precise 
>> synchronization, both with hardware and software time stamping. As noted 
>> previously, with the monitor host port at 1Gb, I see ~300ns (positive) with 
>> software and ~2200ns (negative) with hardware.
> 
> Very interesting!
> 
>> I’ve spent many years on latency in networks, but have never come across 
>> this specific issue. I would like to get my head around how the asymmetry 
>> comes about, and how much it is. I am continuing to research this. I believe 
>> I generally understand how asymmetry affects the calculations, but would 
>> appreciate any guidance you can offer in terms of quantifying how much 
>> asymmetry is required to produce the offsets seen. Also any reason that you 
>> can think of for the offset to be positive with software timestamps, but 
>> negative with hardware timestamps.
> 
> The general rule is that in order to see a positive increase in offset
> of d, the delay of packets from the server to the client needs to
> increase by 2 * d. So, in your case if we take the offset of the local
> unit as a reference, we see an increase of 600ns in the client->server
> delay with SW timestamping and an increase of 4400ns in the
> server->client direction with HW timestamping.
> 
> I don't know much about networking HW and I can only speculate. I
> suspect that if the link speeds don't match, the switch is forced to
> buffer the data and this buffering takes longer when going from 100Mb
> to 1Gb than when going from 1Gb to 100Mb. This might explain the
> offset with HW timestamping.
> 
> In the case with SW timestamping, maybe the lower speed of the link to
> the local unit increases the delay of the RX interrupt for some
> reason? Maybe coalescing is not completely disabled and the delay
> takes into account the link speed? I've no idea. It would be great to
> hear from someone who is familiar with the HW and network driver.
> 
> -- 
> Miroslav Lichvar





Re: [chrony-dev] SW/HW timestamping on Linux

2016-11-18 Thread Miroslav Lichvar
On Thu, Nov 17, 2016 at 05:44:23PM -0800, Denny Page wrote:
> Although reduced, I’m still seeing spikes with the patch below.

I'm not sure what could be wrong at this point. Maybe it really is a
kernel or HW issue. I'm wondering what would be the best way to
confirm or reject that idea.

-- 
Miroslav Lichvar




Re: [chrony-dev] SW/HW timestamping on Linux

2016-11-18 Thread Miroslav Lichvar
On Thu, Nov 17, 2016 at 05:49:44PM -0800, Denny Page wrote:
> This port speed differential appears to result in an asymmetry in 
> transmit/receive time which significantly affects the calculations. If I lock 
> the monitor host port at 100Mb, all three units show precise synchronization, 
> both with hardware and software time stamping. As noted previously, with the 
> monitor host port at 1Gb, I see ~300ns (positive) with software and ~2200ns 
> (negative) with hardware.

Very interesting!

> I’ve spent many years on latency in networks, but have never come across this 
> specific issue. I would like to get my head around how the asymmetry comes 
> about, and how much it is. I am continuing to research this. I believe I 
> generally understand how asymmetry affects the calculations, but would 
> appreciate any guidance you can offer in terms of quantifying how much 
> asymmetry is required to produce the offsets seen. Also any reason that you 
> can think of for the offset to be positive with software timestamps, but 
> negative with hardware timestamps.

The general rule is that in order to see a positive increase in offset
of d, the delay of packets from the server to the client needs to
increase by 2 * d. So, in your case if we take the offset of the local
unit as a reference, we see an increase of 600ns in the client->server
delay with SW timestamping and an increase of 4400ns in the
server->client direction with HW timestamping.
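
The 2 * d rule can be checked with a small sketch of the usual
four-timestamp NTP exchange. This is only an illustration, not chrony's
code: the function name, the 50 us base delay and the sign convention
(offset = client clock minus server clock, picked so that it agrees
with the rule as stated above) are all assumptions.

```python
def measure(true_offset_ns, c2s_delay_ns, s2c_delay_ns, server_proc_ns=0):
    """Simulate one NTP exchange and return (offset_ns, delay_ns).

    Assumed sign convention: offset = client clock - server clock,
    so a positive value means the client is ahead of the server.
    """
    # The client sends at server time 0; its own clock reads true_offset_ns.
    t1 = true_offset_ns
    # The server receives the request after the client->server delay.
    t2 = c2s_delay_ns
    # The server replies after an optional processing time.
    t3 = t2 + server_proc_ns
    # The client receives the reply; its clock still carries the true offset.
    t4 = t3 + s2c_delay_ns + true_offset_ns

    # The estimate assumes both directions take equally long, so any
    # asymmetry leaks into the offset at half its size.
    offset = ((t1 - t2) + (t4 - t3)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


base = 50_000  # hypothetical one-way delay in ns

# Symmetric path: the true offset (here 0) is recovered exactly.
print(measure(0, base, base))

# Extra 600 ns server->client: the measured offset moves by +300 ns.
print(measure(0, base, base + 600))

# Extra 600 ns client->server: the measured offset moves by -300 ns.
print(measure(0, base + 600, base))
```

By the same arithmetic an offset error of 2200ns corresponds to a
4400ns difference between the two directions. Under the opposite sign
convention (server minus client) the signs above are reversed.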

I don't know much about networking HW and I can only speculate. I
suspect that if the link speeds don't match, the switch is forced to
buffer the data and this buffering takes longer when going from 100Mb
to 1Gb than when going from 1Gb to 100Mb. This might explain the
offset with HW timestamping.

In the case with SW timestamping, maybe the lower speed of the link to
the local unit increases the delay of the RX interrupt for some
reason? Maybe coalescing is not completely disabled and the delay
takes into account the link speed? I've no idea. It would be great to
hear from someone who is familiar with the HW and network driver.

-- 
Miroslav Lichvar
