Richard

Thanks for the response.
I now have one board running kernel 3.18 and another running kernel 4.9.

I still see the issue on 3.18 but have not yet seen it on 4.9.
Unfortunately, our proprietary driver for a device on the PCIe bus does not
yet support 4.x kernels, and it is that device which generates (via an
application) most of the network traffic.
I might have to backport all of the stmmac changes to 3.18.

If I add 37 seconds to the value from getnstimeofday(), the effect of the
"glitch" is less pronounced.
Kernel 3.18 has timekeeping_get_tai_offset() in timekeeping.c, which I thought
might give me the UTC offset, but it returns 0 at the point where I call it.
Is there a call within the kernel to find the UTC offset?

Regards
Ian T. 

-----Original Message-----
From: Richard Cochran [mailto:richardcoch...@gmail.com] 
Sent: Thursday, April 06, 2017 12:49 AM
To: Ian Thompson
Cc: linuxptp-users@lists.sourceforge.net
Subject: [External] Re: [Linuxptp-users] PTP - MAC time

On Wed, Apr 05, 2017 at 02:34:14PM +0000, Ian Thompson wrote:
> Why is the time that gets put into the PTP registers in the STM MAC, Unix 
> time rather than PTP time?

See below.

To your question from the other thread:

On Tue, Apr 04, 2017 at 03:45:16PM +0000, Ian Thompson wrote:
> Possibly following on from David’s post.
> 
> We have a system with 18 boards in a rack, each board has an Altera SoC with 
> the STM Ethernet MAC connected via gigabit Ethernet to an Arista ptp-aware 
> switch and then a Spectracom GrandMaster.
> The boards are running Linux kernel 3.15.0.

That HW puts the time stamps into the buffer descriptor, and so in theory it 
should never miss a time stamp.  This is most likely a driver bug.  Looking at 
the git log I see:

 v4.11-rc1~124^2~171^2~12 deeb637 net: stmmac: remove freesoftware address
     v4.9-rc7~33^2~33^2~1 ba1ffd7 stmmac: fix PTP support for GMAC4
     v4.9-rc7~33^2~33^2~2 d204205 stmmac: update the PTP header file
         v4.9-rc4~28^2~68 c30a70d stmmac: fix and review the ptp registration.
         v4.9-rc4~28^2~96 50756eb stmmac: fix an error code in stmmac_ptp_register()
         v4.9-rc1~28^2~10 7086605 stmmac: fix error check when init ptp
       v4.9-rc1~127^2~108 efee95f ptp_clock: future-proofing drivers against PTP subsystem becoming optional
   v4.1-rc1~128^2~100^2~5 e7ea55b ptp: stmmac: use helpers for converting ns to timespec.
   v4.1-rc1~128^2~119^2~6 3f6c465 ptp: stmmac: convert to the 64 bit get/set time methods.
        v3.17-rc5~41^2~38 5566401 stmmac: ptp: fix the reference clock
        v3.17-rc5~41^2~50 f95f404 stmmac: set ptp_clock to NULL while unregister
  v3.15-rc1~113^2~108^2~5 4986b4f0 ptp: drivers: set the number of programmable pins.
           v3.13-rc7~13^2 7cd0139 stmmac: Fix incorrect spinlock release and PTP cap detection.
       v3.10-rc1~66^2~195 32ceabc stmmac: improve/review and fix kernel-doc
   v3.10-rc1~66^2~327^2~1 92ba688 stmmac: add the support for PTP hw clock driver
   v3.10-rc1~66^2~327^2~2 891434b stmmac: add IEEE PTPv1 and PTPv2 support.

Especially ba1ffd7 looks suspicious.

> Apr  4 13:42:04 localhost user.info   ptp4l: [537.164] rms  123 max  599 freq   +255 +/-  39 delay  7362 +/-  48
> Apr  4 13:42:29 localhost user.err    ptp4l: [561.387] timed out while polling for tx timestamp
> Apr  4 13:42:29 localhost user.err    ptp4l: [561.387] increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
> Apr  4 13:42:29 localhost user.err    ptp4l: [561.387] port 1: send delay request failed
> Apr  4 13:42:29 localhost user.notice ptp4l: [561.387] port 1: SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
> Apr  4 13:42:45 localhost user.notice ptp4l: [577.388] port 1: FAULTY to LISTENING on FAULT_CLEARED
> Apr  4 13:42:45 localhost user.warn   ptp4l: [577.414] clockcheck: clock jumped backward or running slower than expected!
> Apr  4 13:42:45 localhost user.notice ptp4l: [577.414] port 1: new foreign master 000cec.fffe.0a085d-1
> Apr  4 13:42:47 localhost user.notice ptp4l: [579.414] selected best master clock 000cec.fffe.0a085d
> Apr  4 13:42:47 localhost user.notice ptp4l: [579.414] port 1: LISTENING to UNCALIBRATED on RS_SLAVE
> Apr  4 13:42:54 localhost user.notice ptp4l: [587.164] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
> Apr  4 13:46:46 localhost user.info   ptp4l: [818.414] rms 2312500092 max 37000001557 freq   +246 +/- 250 delay  7358 +/-  46
> Apr  4 13:51:02 localhost user.info   ptp4l: [1074.413] rms  116 max  681 freq   +256 +/-  48 delay  7373 +/-  88
> 
> Does this imply that one lost delay request can do this, or is there a retry 
> mechanism?

One lost delay request shouldn't introduce such a large error.  This is a 
driver bug.  Notice that the time error is 37 seconds, which is the current 
UTC/TAI offset.
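
For anyone who wants to check the arithmetic, here is a minimal sketch in
Python. It pulls the "max" value (in nanoseconds) out of a ptp4l summary line
and subtracts the UTC/TAI offset. The function names and the hard-coded
37 second constant are assumptions for illustration only; they are not part
of linuxptp, and the offset changes whenever a leap second is inserted.

```python
import re

NS_PER_S = 1_000_000_000
# Assumption: TAI-UTC was 37 s in April 2017; hard-coded for illustration.
TAI_UTC_OFFSET_S = 37

# Matches the "rms <n> max <n>" part of a ptp4l summary line.
SUMMARY_RE = re.compile(r"rms\s+(\d+)\s+max\s+(\d+)")


def max_offset_ns(line):
    """Return the 'max' offset in ns from a ptp4l summary line, or None."""
    m = SUMMARY_RE.search(line)
    return int(m.group(2)) if m else None


def residual_after_tai_utc(offset_ns, tai_utc_s=TAI_UTC_OFFSET_S):
    """Return what remains of offset_ns once the TAI-UTC offset is removed."""
    return offset_ns - tai_utc_s * NS_PER_S


if __name__ == "__main__":
    line = ("Apr  4 13:46:46 localhost user.info   ptp4l: [818.414] "
            "rms 2312500092 max 37000001557 freq   +246 +/- 250 "
            "delay  7358 +/-  46")
    off = max_offset_ns(line)
    # Prints: 37000001557 1557
    print(off, residual_after_tai_utc(off))
```

The residual of 1557 ns is of the same order as the offsets seen while the
port was tracking normally, which is what one would expect if the clock had
been stepped by exactly the UTC/TAI offset.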

When resetting the fault, ptp4l re-initializes HW time stamping.

The function, stmmac_hwtstamp_ioctl(), in

    drivers/net/ethernet/stmicro/stmmac/stmmac_main.c 

programs the system time (UTC) into the PHC every time HW time stamping is 
enabled.  It shouldn't do that.
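
A toy model of that sequence, in Python, may make the effect clearer. It is
not driver code: reenable_timestamping() and its overwrite_phc flag are
hypothetical names, and the model assumes that the PHC was tracking the
master (PTP time, i.e. TAI) and that the system clock holds correct UTC.

```python
NS_PER_S = 1_000_000_000
# Assumption: TAI-UTC was 37 s in April 2017; hard-coded for illustration.
TAI_UTC_OFFSET_S = 37


def offset_from_master_ns(phc_ns, master_ns):
    """Offset as the slave sees it: local PHC time minus master time."""
    return phc_ns - master_ns


def reenable_timestamping(phc_ns, system_utc_ns, overwrite_phc):
    """Hypothetical model of enabling HW time stamping.

    overwrite_phc=True mimics loading the system time (UTC) into the PHC;
    overwrite_phc=False leaves the PHC value untouched.
    """
    return system_utc_ns if overwrite_phc else phc_ns


if __name__ == "__main__":
    master_tai_ns = 1_000_000 * NS_PER_S  # arbitrary master time (TAI)
    system_utc_ns = master_tai_ns - TAI_UTC_OFFSET_S * NS_PER_S
    phc_ns = master_tai_ns                # PHC in sync before the fault

    stepped = reenable_timestamping(phc_ns, system_utc_ns, overwrite_phc=True)
    kept = reenable_timestamping(phc_ns, system_utc_ns, overwrite_phc=False)

    # Prints: -37000000000 0
    print(offset_from_master_ns(stepped, master_tai_ns),
          offset_from_master_ns(kept, master_tai_ns))
```

In this model the overwritten PHC ends up 37 seconds behind the master, in
line with the "clock jumped backward" message in the log, while a PHC that
is left alone stays in sync across the fault reset.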

> We have a lot of traffic leaving the boards but only PTP traffic 
> coming in. As we increase the off board transfer rates the problem 
> seems to occur more often.

That could indicate a driver or a HW issue, or both.

HTH,
Richard
_______________________________________________
Linuxptp-users mailing list
Linuxptp-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-users
