Richard,

Thanks for the response. I now have one board running kernel 3.18 and another running kernel 4.9.
I still see the issue with 3.18, but I haven't yet seen it on 4.9. Unfortunately, we have a proprietary driver for a device on the PCIe bus which doesn't yet support 4.x kernels, and it is this device that generates (via an application) most of the network traffic. I might have to port all of the stmmac changes back to 3.18.

If I add 37 seconds to getnstimeofday then the effect of the "glitch" is less pronounced. Kernel 3.18 introduced timekeeping.c, with timekeeping_get_tai_offset(), which I thought might give me the UTC offset, but it returns 0 at the point I call it. Is there a call within the kernel to find the UTC offset?

Regards
Ian T.

-----Original Message-----
From: Richard Cochran [mailto:richardcoch...@gmail.com]
Sent: Thursday, April 06, 2017 12:49 AM
To: Ian Thompson
Cc: linuxptp-users@lists.sourceforge.net
Subject: [External] Re: [Linuxptp-users] PTP - MAC time

On Wed, Apr 05, 2017 at 02:34:14PM +0000, Ian Thompson wrote:
> Why is the time that gets put into the PTP registers in the STM MAC, Unix
> time rather than PTP time?

See below. To your question from the other thread:

On Tue, Apr 04, 2017 at 03:45:16PM +0000, Ian Thompson wrote:
> Possibly following on from David’s post.
>
> We have a system with 18 boards in a rack, each board has an Altera SoC with
> the STM Ethernet MAC connected via gigabit Ethernet to an Arista ptp-aware
> switch and then a Spectracom GrandMaster.
> The boards are running Linux kernel 3.15.0.

That HW puts the time stamps into the buffer descriptor, and so in theory it should never miss a time stamp. This is most likely a driver bug. Looking at the git log I see:

v4.11-rc1~124^2~171^2~12 deeb637 net: stmmac: remove freesoftware address
v4.9-rc7~33^2~33^2~1 ba1ffd7 stmmac: fix PTP support for GMAC4
v4.9-rc7~33^2~33^2~2 d204205 stmmac: update the PTP header file
v4.9-rc4~28^2~68 c30a70d stmmac: fix and review the ptp registration.
v4.9-rc4~28^2~96 50756eb stmmac: fix an error code in stmmac_ptp_register()
v4.9-rc1~28^2~10 7086605 stmmac: fix error check when init ptp
v4.9-rc1~127^2~108 efee95f ptp_clock: future-proofing drivers against PTP subsystem becoming optional
v4.1-rc1~128^2~100^2~5 e7ea55b ptp: stmmac: use helpers for converting ns to timespec.
v4.1-rc1~128^2~119^2~6 3f6c465 ptp: stmmac: convert to the 64 bit get/set time methods.
v3.17-rc5~41^2~38 5566401 stmmac: ptp: fix the reference clock
v3.17-rc5~41^2~50 f95f404 stmmac: set ptp_clock to NULL while unregister
v3.15-rc1~113^2~108^2~5 4986b4f0 ptp: drivers: set the number of programmable pins.
v3.13-rc7~13^2 7cd0139 stmmac: Fix incorrect spinlock release and PTP cap detection.
v3.10-rc1~66^2~195 32ceabc stmmac: improve/review and fix kernel-doc
v3.10-rc1~66^2~327^2~1 92ba688 stmmac: add the support for PTP hw clock driver
v3.10-rc1~66^2~327^2~2 891434b stmmac: add IEEE PTPv1 and PTPv2 support.

Especially ba1ffd7 looks suspicious.

> Apr 4 13:42:04 localhost user.info ptp4l: [537.164] rms 123 max 599 freq +255 +/- 39 delay 7362 +/- 48
> Apr 4 13:42:29 localhost user.err ptp4l: [561.387] timed out while polling for tx timestamp
> Apr 4 13:42:29 localhost user.err ptp4l: [561.387] increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
> Apr 4 13:42:29 localhost user.err ptp4l: [561.387] port 1: send delay request failed
> Apr 4 13:42:29 localhost user.notice ptp4l: [561.387] port 1: SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
> Apr 4 13:42:45 localhost user.notice ptp4l: [577.388] port 1: FAULTY to LISTENING on FAULT_CLEARED
> Apr 4 13:42:45 localhost user.warn ptp4l: [577.414] clockcheck: clock jumped backward or running slower than expected!
> Apr 4 13:42:45 localhost user.notice ptp4l: [577.414] port 1: new foreign master 000cec.fffe.0a085d-1
> Apr 4 13:42:47 localhost user.notice ptp4l: [579.414] selected best master clock 000cec.fffe.0a085d
> Apr 4 13:42:47 localhost user.notice ptp4l: [579.414] port 1: LISTENING to UNCALIBRATED on RS_SLAVE
> Apr 4 13:42:54 localhost user.notice ptp4l: [587.164] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
> Apr 4 13:46:46 localhost user.info ptp4l: [818.414] rms 2312500092 max 37000001557 freq +246 +/- 250 delay 7358 +/- 46
> Apr 4 13:51:02 localhost user.info ptp4l: [1074.413] rms 116 max 681 freq +256 +/- 48 delay 7373 +/- 88
>
> Does this imply that one lost delay request can do this, or is there a retry
> mechanism?

One lost delay request shouldn't introduce such a large error. This is a driver bug. Notice that the time error is 37 seconds, or the UTC/TAI offset. When resetting the fault, ptp4l re-initializes HW time stamping. The function stmmac_hwtstamp_ioctl(), in drivers/net/ethernet/stmicro/stmmac/stmmac_main.c, programs the system time (UTC) into the PHC every time HW time stamping is enabled. It shouldn't do that.

> We have a lot of traffic leaving the boards but only PTP traffic
> coming in. As we increase the off board transfer rates the problem
> seems to occur more often.

That could indicate a driver or a HW issue, or both.

HTH,
Richard
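
For anyone hitting the same symptom, the 37-second signature noted above can be checked for mechanically. What follows is only a minimal sketch, not part of linuxptp: the function name find_timescale_jumps, the regular expression (written against the ptp4l summary lines quoted in this thread) and the 1 ms tolerance are all assumptions made for illustration, and the 37 s constant is the UTC/TAI offset mentioned in the thread (valid as of April 2017).

```python
import re

# Assumption: 37 s is the UTC/TAI offset mentioned in the thread (April 2017).
TAI_UTC_OFFSET_NS = 37 * 1_000_000_000

# Assumption: pattern written against the ptp4l summary lines quoted in the
# thread, e.g. "ptp4l: [818.414] rms 2312500092 max 37000001557 freq ..."
SUMMARY_RE = re.compile(
    r"ptp4l: \[(?P<uptime>\d+\.\d+)\] rms (?P<rms>\d+) max (?P<max>\d+) "
)


def find_timescale_jumps(lines, tolerance_ns=1_000_000):
    """Return (uptime, max_offset_ns) for each summary line whose max offset
    lies within tolerance_ns of the UTC/TAI offset."""
    hits = []
    for line in lines:
        m = SUMMARY_RE.search(line)
        if m is None:
            continue  # not a summary line (state changes, errors, ...)
        max_ns = int(m.group("max"))
        if abs(max_ns - TAI_UTC_OFFSET_NS) <= tolerance_ns:
            hits.append((float(m.group("uptime")), max_ns))
    return hits


# Three summary lines taken from the log in this thread.
log = [
    "Apr 4 13:42:04 localhost user.info ptp4l: [537.164] rms 123 max 599 freq +255 +/- 39 delay 7362 +/- 48",
    "Apr 4 13:46:46 localhost user.info ptp4l: [818.414] rms 2312500092 max 37000001557 freq +246 +/- 250 delay 7358 +/- 46",
    "Apr 4 13:51:02 localhost user.info ptp4l: [1074.413] rms 116 max 681 freq +256 +/- 48 delay 7373 +/- 88",
]

print(find_timescale_jumps(log))  # -> [(818.414, 37000001557)]
```

Only the summary at uptime 818.414 is flagged: its max offset differs from 37 s by 1557 ns, which is consistent with the PHC having been set to UTC rather than to the PTP timescale.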