Hello Miroslav,

First, thanks a lot for the suggestions: it is working now! But I had to make 
some changes, and the jitter remains strong.

I would like to have server2 and 3 match the timestamping of server1, to avoid 
introducing inconsistencies in the time series due to measurement issues.

The way I see it, if, due to clock jitter, the same event occurring at t0 is
seen at t0 on server1, t0+32 us on server2, and t0-32 us on server3 (the worst
case with the worst values measured so far), then the aggregated measurements
from the cluster have at best a temporal resolution of about 64 us, and any two
events not separated by at least 64 us risk being seen in the wrong order.
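
To make that arithmetic explicit, here is a minimal sketch in Python. The
+/-32 us errors are the worst values I measured; the function names and the
50 us / 80 us gaps are only made up for illustration.

```python
# Worst-case timestamp error per server in microseconds (measured values above;
# server1 is taken as the reference, so its error is 0 by definition).
ERR_US = {"server1": 0.0, "server2": +32.0, "server3": -32.0}

def ambiguity_window_us(errors):
    """Spread between the most-late and most-early server clocks."""
    return max(errors.values()) - min(errors.values())

def may_be_misordered(gap_us, errors):
    """True if two events gap_us apart could be timestamped in the wrong order."""
    return gap_us < ambiguity_window_us(errors)

window = ambiguity_window_us(ERR_US)
print(window)                             # 64.0
print(may_be_misordered(50.0, ERR_US))    # True
print(may_be_misordered(80.0, ERR_US))    # False
```

So the window is set by the spread between the two slaves, not by either one's
error relative to server1.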


  *   On the PTP master you could try this:

The NIC clocks were off by quite a lot, even after synchronizing the system 
clock with NTP and checking with ntpdate -q.

With the suggested commands, I got:

phc2sys[12119]: [4853480.133] eth0 sys offset -36998785228 s0 freq  -26003 
delay  12680
(with the offset slowly adjusting)

That’s about 37 seconds. I suppose it’s due to the TAI-UTC difference, with the 
RTC being on UTC.
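
As a quick sanity check of that guess, a small Python sketch: it converts the
reported offset from nanoseconds to seconds and looks at what is left once a
whole TAI-UTC offset (37 s since 1 January 2017) is taken out. The function
names are my own, and I am only comparing magnitudes here, not asserting which
clock is ahead.

```python
TAI_MINUS_UTC_S = 37  # TAI-UTC in seconds, valid since 2017-01-01

def offset_seconds(offset_ns):
    """Convert a phc2sys offset (nanoseconds) to seconds."""
    return offset_ns / 1e9

def residual_after_tai_utc_ns(offset_ns, tai_minus_utc_s=TAI_MINUS_UTC_S):
    """Offset magnitude minus a whole TAI-UTC offset, in nanoseconds."""
    return abs(offset_ns) - tai_minus_utc_s * 1_000_000_000

reported_ns = -36998785228  # offset from the phc2sys line above
print(offset_seconds(reported_ns))
print(residual_after_tai_utc_ns(reported_ns))  # -1214772
```

The remainder is about 1.2 ms, so the offset is 37 s to within roughly a
millisecond, which fits the TAI-UTC explanation.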

After reading more about that, instead of hardcoding an offset of 37 (and 
keeping track of when it must be changed) I decided to hardcode 0 instead, 
and to run the ptp4l server in UTC, through either legacy hardware timestamping 
or software timestamping.

Also, it takes some time to reach a smaller offset. I would like to “start from 
scratch” and have the offset jump, i.e. make the NICs match the RTC immediately, 
before starting to broadcast PTP messages.

However, the -F option, which I expected to force a step on start, seems to be 
ignored, regardless of the units I use:

/usr/sbin/phc2sys -s CLOCK_REALTIME -c %i -r -r -P 1e-4 -I 1e-8 -O0 
-F99999999999

phc2sys[4859094.548]: eth2 sys offset -45059123402 s0 freq -5389409 delay  12788
phc2sys[4859209.606]: eth2 sys offset -44439496394 s0 freq -5389409 delay  12426
(…)
phc2sys[4859247.349]: eth2 sys offset -44236236081 s0 freq -5389409 delay  12748

Also, the legacy hardware timestamping doesn’t work on the e1000e NIC:

/usr/sbin/ptp4l --clockClass 6 --free_running 1 --uds_address 
/var/run/ptp4l.%i.socket -L -i %i -m

ptp4l[4861645.037]: interface 'eth3' does not support requested timestamping 
mode

For now, I’m using linreg in phc2sys (even if I can’t start from scratch, at 
least it is adjusting faster), and software timestamping in ptp4l (to get UTC 
values, i.e. an -O0 offset):

/usr/sbin/phc2sys -s CLOCK_REALTIME -c eth3 -r -r -P 1e-4 -I 1e-8 -O0 -m -F 
09999 -E linreg

phc2sys[4859656.487]: eth3 sys offset -48759179009 s0 freq -6251529 delay  12759
phc2sys[4859675.492]: eth3 sys offset -39138078779 s2 freq -599999999 delay  
20288
(…)
phc2sys[4860253.139]: eth3 sys offset       -43 s2 freq   -5230 delay  12800
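
To put a number on how long the adjustment takes, here is a rough Python
sketch that parses the three log lines above (the second one rejoined onto a
single line) and measures the time until the offset first falls within a
threshold. The regular expression only covers this particular log form, and
the helper names and the 1000 ns threshold are arbitrary choices of mine.
Since the lines in between are elided, the result is only an upper bound.

```python
import re

# Matches the "phc2sys[<time>]: <clock> sys offset ..." form shown above.
LINE_RE = re.compile(
    r"phc2sys\[(?P<ts>\d+\.\d+)\]:\s+(?P<clock>\S+)\s+sys\s+offset\s+"
    r"(?P<offset>-?\d+)\s+(?P<state>s\d)\s+freq\s+(?P<freq>[+-]?\d+)\s+"
    r"delay\s+(?P<delay>\d+)"
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "ts": float(m["ts"]),
        "clock": m["clock"],
        "offset_ns": int(m["offset"]),
        "state": m["state"],
        "freq": int(m["freq"]),
        "delay": int(m["delay"]),
    }

def seconds_until_within(records, threshold_ns):
    """Time from the first record to the first one with |offset| <= threshold_ns."""
    start = records[0]["ts"]
    for rec in records:
        if abs(rec["offset_ns"]) <= threshold_ns:
            return rec["ts"] - start
    return None

LOG = [
    "phc2sys[4859656.487]: eth3 sys offset -48759179009 s0 freq -6251529 delay  12759",
    "phc2sys[4859675.492]: eth3 sys offset -39138078779 s2 freq -599999999 delay  20288",
    "phc2sys[4860253.139]: eth3 sys offset       -43 s2 freq   -5230 delay  12800",
]

records = [r for r in map(parse_line, LOG) if r is not None]
elapsed = seconds_until_within(records, threshold_ns=1000)
print(f"within 1 us after at most {elapsed:.1f} s")
```

With these lines that is just under ten minutes between the first sample and
the sample at -43 ns.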

/usr/sbin/ptp4l --clockClass 6 -S -i eth2 -i eth3 --uds_address 
/var/run/ptp4l.socket -m

With this, on server3 and server2, the jitter is still far from the ns scale, 
but more tolerable:

Server3:
$ chronyc sources
(…)
#* PTP0                          0   2   377     2    +78ns[+1392ns] +/-   32us
(…)

$ chronyc tracking
Reference ID    : 50545030 (PTP0)
Stratum         : 1
Ref time (UTC)  : Tue Mar 05 07:43:03 2019
System time     : 0.000003320 seconds fast of NTP time
Last offset     : +0.000016404 seconds
RMS offset      : 0.000030792 seconds
Frequency       : 15.905 ppm fast
Residual freq   : +0.678 ppm
Skew            : 11.099 ppm
Root delay      : 0.000010 seconds
Root dispersion : 0.000111 seconds
Update interval : 4.0 seconds
Leap status     : Normal

Server 2:

#* PTP0                          0   2   377     5    +76ns[ +258ns] +/- 5226ns

Reference ID    : 50545030 (PTP0)
Stratum         : 1
Ref time (UTC)  : Tue Mar 05 08:09:37 2019
System time     : 0.000000016 seconds fast of NTP time
Last offset     : +0.000000043 seconds
RMS offset      : 0.000000128 seconds
Frequency       : 21.068 ppm fast
Residual freq   : +0.000 ppm
Skew            : 0.007 ppm
Root delay      : 0.000010 seconds
Root dispersion : 0.000006 seconds
Update interval : 4.0 seconds
Leap status     : Normal
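
To compare the two slaves side by side, a small Python sketch that pulls the
fields expressed in seconds out of the chronyc tracking output. Only a few
lines of each output are pasted in, and parse_tracking is a helper of my own,
not part of chrony.

```python
def parse_tracking(text):
    """Return {field name: value in seconds} for lines like 'Name : 0.1 seconds'."""
    out = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        parts = value.split()
        # Keep only fields whose value is followed by the word "seconds".
        if sep and len(parts) >= 2 and parts[1] == "seconds":
            out[name.strip()] = float(parts[0])
    return out

SERVER3 = """\
Ref time (UTC)  : Tue Mar 05 07:43:03 2019
Last offset     : +0.000016404 seconds
RMS offset      : 0.000030792 seconds
Root dispersion : 0.000111 seconds
"""

SERVER2 = """\
Ref time (UTC)  : Tue Mar 05 08:09:37 2019
Last offset     : +0.000000043 seconds
RMS offset      : 0.000000128 seconds
Root dispersion : 0.000006 seconds
"""

s3 = parse_tracking(SERVER3)
s2 = parse_tracking(SERVER2)
ratio = s3["RMS offset"] / s2["RMS offset"]
print(f"server3 RMS offset is about {ratio:.0f}x that of server2")
```

By that measure server3 is a couple of hundred times worse than server2, even
though both are fed from the same master.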

Would you have any idea on how to improve that?


  *   With good switches with PTP support jitter of 50 nanosecond might be
possible.

There is no switch anywhere in my setup. All the NICs are connected directly to 
each other by crossover cables to minimize such issues. I just can’t have a 
proper PTP grandmaster in the DC, so I’m trying to find workarounds: server 1 
in master mode, servers 2 and 3 running timemaster (slave mode).

Is there anything I could do to take advantage of the direct connection between 
the NICs to further reduce jitter on servers 2 and 3? Is there any interest at 
all in using L2? Or in using the direct connection between server 2 and server 3?

Because another way to look at them is that the servers are hooked together in 
a triangle:

server 1 (eth2) <---- server 2 (eth1)
server 1 (eth3)                (eth3)
   |                              ^
   V                              |
server 3 (eth1)                   |
server 3 (eth2) ------------------/


Thanks!
_______________________________________________
Linuxptp-users mailing list
Linuxptp-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-users
