Hello Miroslav,

First, thanks a lot for the suggestions: it is working now! But I had to make some changes, and the jitter remains strong.

I would like servers 2 and 3 to match the timestamping of server 1, to avoid introducing inconsistencies in the time series due to measurement issues. The way I see it, if clock jitter causes the same event occurring at t0 to be seen at t0 on server1, at t0+32 us on server2, and at t0-32 us on server3 (the worst-case scenario with the worst values measured so far), then the aggregated measurements from the cluster have at best a temporal resolution of about 64 us, and any two events not separated by at least 64 us risk being seen in the wrong order.

* On the PTP master you could try this:

The NIC clocks were off by quite a lot, even after synchronizing the system clock by NTP and checking with ntpdate -q. With the suggested commands, I got:

    phc2sys[12119]: [4853480.133] eth0 sys offset -36998785228 s0 freq -26003 delay 12680

(with the offset slowly adjusting)

That’s about 37 seconds. I suppose it’s due to the TAI-UTC difference, with the RTC being on UTC. After reading more about that, instead of hardcoding an offset of 37 (and keeping track of when it must be changed), I decided to hardcode 0 and to run the ptp4l server in UTC, through either legacy hardware timestamping or software timestamping.

Also, it takes some time to reach a smaller offset. I would like to “start from scratch” and have the offset jump, i.e. make the NICs match the RTC immediately, before starting to broadcast PTP messages.
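
As a rough sanity check of the "about 37 seconds" reading, here is a minimal Python sketch. The regular expression and the helper name parse_phc2sys are my own assumptions, inferred only from the phc2sys lines quoted in this message, and the TAI-UTC value of 37 s is the one in force since 1 January 2017. It only shows how close the magnitude of the reported offset is to a whole TAI-UTC offset; it says nothing about the cause.

```python
import re

# TAI-UTC difference in seconds (37 s since 1 January 2017); treated as a constant here.
TAI_UTC_S = 37

# Assumed log layout, inferred only from the phc2sys lines quoted in this thread.
LINE_RE = re.compile(r"offset\s+(-?\d+)\s+s(\d)\s+freq\s+([+-]?\d+)\s+delay\s+(\d+)")

def parse_phc2sys(line):
    """Return (offset_ns, servo_state, freq, delay_ns), or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    offset, state, freq, delay = m.groups()
    return int(offset), int(state), int(freq), int(delay)

line = ("phc2sys[12119]: [4853480.133] eth0 sys offset -36998785228 "
        "s0 freq -26003 delay 12680")
offset_ns, state, freq, delay_ns = parse_phc2sys(line)

# Offsets are reported in nanoseconds; convert to seconds.
offset_s = offset_ns / 1e9
# How far the magnitude of the offset is from a whole TAI-UTC difference.
residual_s = abs(offset_s) - TAI_UTC_S

print(f"offset = {offset_s:.6f} s")
print(f"|offset| - TAI_UTC = {residual_s * 1e3:.3f} ms")
```

For the line above, the magnitude is within roughly a millisecond of 37 s, which is at least consistent with the TAI-UTC explanation.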

However, the -F option, which I expected to force such a step at start, seems to be ignored, regardless of the units I use:

    /usr/sbin/phc2sys -s CLOCK_REALTIME -c %i -r -r -P 1e-4 -I 1e-8 -O0 -F99999999999

    phc2sys[4859094.548]: eth2 sys offset -45059123402 s0 freq -5389409 delay 12788
    phc2sys[4859209.606]: eth2 sys offset -44439496394 s0 freq -5389409 delay 12426
    (…)
    phc2sys[4859247.349]: eth2 sys offset -44236236081 s0 freq -5389409 delay 12748

Also, the legacy hardware timestamping doesn’t work on the e1000e NIC:

    /usr/sbin/ptp4l --clockClass 6 --free_running 1 --uds_address /var/run/ptp4l.%i.socket -L -i %i -m

    ptp4l[4861645.037]: interface 'eth3' does not support requested timestamping mode

For now, I’m using linreg in phc2sys (even if I can’t start from scratch, at least it adjusts faster) and software timestamping in ptp4l (to get UTC values, i.e. a -O0 offset):

    /usr/sbin/phc2sys -s CLOCK_REALTIME -c eth3 -r -r -P 1e-4 -I 1e-8 -O0 -m -F 09999 -E linreg

    phc2sys[4859656.487]: eth3 sys offset -48759179009 s0 freq -6251529 delay 12759
    phc2sys[4859675.492]: eth3 sys offset -39138078779 s2 freq -599999999 delay 20288
    (…)
    phc2sys[4860253.139]: eth3 sys offset -43 s2 freq -5230 delay 12800

    /usr/sbin/ptp4l --clockClass 6 -S -i eth2 -i eth3 --uds_address /var/run/ptp4l.socket -m

With this, on server3 and server2, the jitter is still far from the ns scale, but more tolerable:

Server 3:

    $ chronyc sources
    (…)
    #* PTP0    0   2   377   2    +78ns[+1392ns] +/-   32us
    (…)
    $ chronyc tracking
    Reference ID    : 50545030 (PTP0)
    Stratum         : 1
    Ref time (UTC)  : Tue Mar 05 07:43:03 2019
    System time     : 0.000003320 seconds fast of NTP time
    Last offset     : +0.000016404 seconds
    RMS offset      : 0.000030792 seconds
    Frequency       : 15.905 ppm fast
    Residual freq   : +0.678 ppm
    Skew            : 11.099 ppm
    Root delay      : 0.000010 seconds
    Root dispersion : 0.000111 seconds
    Update interval : 4.0 seconds
    Leap status     : Normal

Server 2:

    #* PTP0    0   2   377   5    +76ns[ +258ns] +/- 5226ns

    Reference ID    : 50545030 (PTP0)
    Stratum         : 1
    Ref time (UTC)  : Tue Mar 05 08:09:37 2019
    System time     : 0.000000016 seconds fast of NTP time
    Last offset     : +0.000000043 seconds
    RMS offset      : 0.000000128 seconds
    Frequency       : 21.068 ppm fast
    Residual freq   : +0.000 ppm
    Skew            : 0.007 ppm
    Root delay      : 0.000010 seconds
    Root dispersion : 0.000006 seconds
    Update interval : 4.0 seconds
    Leap status     : Normal

Would you have any idea on how to improve that?

* With good switches with PTP support jitter of 50 nanosecond might be possible.

There is no switch anywhere in my setup. All the NICs are connected directly to each other by crossover cables to minimize such issues. I just can’t have a proper PTP grandmaster in the DC, so I’m trying to find workarounds: server 1 in master mode, servers 2 and 3 running timemaster (slave mode).

Is there anything I could do to take advantage of the direct connection between the NICs to further reduce jitter on servers 2 and 3? Is there any benefit at all in using L2? Or in using the direct connection between server 2 and server 3? Another way to look at it is that the servers are hooked together in a triangle:

    server1 (eth2) <-- server 2 (eth1)
    server 1 (eth3)             (eth3)
          |                       ^
          V                       |
    Server 3(eth1)                |
    Server 3 (eth2) --------------/

Thanks!
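
P.S. To make the ordering concern from the start of this message concrete, here is a minimal Python sketch of the arithmetic. The per-host bounds are the illustrative worst-case figures from my scenario (+/- 32 us on servers 2 and 3, server 1 taken as the reference), not new measurements, and the function names are hypothetical.

```python
from itertools import combinations

# Worst-case clock error per host in microseconds, relative to server1.
# Illustrative bounds taken from the worst-case scenario in this message.
WORST_ERROR_US = {"server1": 0.0, "server2": 32.0, "server3": 32.0}

def ordering_window_us(errors):
    """Largest possible disagreement between the timestamps of any two hosts."""
    return max(errors[a] + errors[b] for a, b in combinations(errors, 2))

def order_is_reliable(t_a_us, host_a, t_b_us, host_b, errors=WORST_ERROR_US):
    """True if clock error alone cannot flip the recorded order of two events."""
    return abs(t_a_us - t_b_us) > errors[host_a] + errors[host_b]

window = ordering_window_us(WORST_ERROR_US)
print(f"cluster-wide ordering window: {window} us")

# Two events recorded 50 us apart on server2 and server3: order not guaranteed.
print(order_is_reliable(100.0, "server2", 150.0, "server3"))
# Two events recorded 70 us apart on the same pair: order is safe.
print(order_is_reliable(100.0, "server2", 170.0, "server3"))
```

With these bounds the window is 64 us, which is where the figure in my first paragraph comes from; tightening the bound on either server shrinks the window for that pair accordingly.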

_______________________________________________
Linuxptp-users mailing list
Linuxptp-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-users