On 6/7/2021 1:19 PM, YENDstudio wrote:
> Hello,
>
> I have configured one of my machines as a unicast BC which is
> synchronized to the grandmaster clock via the first of its two ports.
> The second port is used to provide sync to another local machine. This
> setup works for a few hours after which one of the ports (master port)
> is marked as faulty, and it never recovers (the second machine stops
> receiving sync) until I restart the ptp4l application. Yet, the first
> port continues sync'ing with the grandmaster clock.
>
> The fault is triggered by a timeout during polling of tx timestamp
> (sk_receive function call). As I am not able to fix this issue, I would
> like to at least make the ptp application recover the port
> automatically. I had tried to close-then-open the port when it goes to a
> FAULTY state but it didn't help (the slave machine is not able to sync).
>
Hi,
ptp4l already attempts to recover from a fault once the fault reset
timeout expires. The timeout is set by the fault_reset_interval option,
expressed as a power of two in seconds; the default of 4 gives 16 seconds.
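For reference, a minimal configuration sketch showing where the two
relevant knobs live. The option names are the ones ptp4l documents; the
values are illustrative assumptions, not recommendations for any
particular NIC or driver:

```
[global]
# Seconds between fault detection and the automatic reset, as a power
# of two: 4 means 2^4 = 16 s. Smaller values retry sooner.
fault_reset_interval    4
# Milliseconds to wait for the driver to deliver a tx timestamp before
# the port is declared faulty. 50 is an arbitrary example value.
tx_timestamp_timeout    50
```

Raising tx_timestamp_timeout only helps if the timestamp arrives late; it
will not help if the driver stops delivering timestamps altogether.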
You should see it recover, something like:
> ptp4l[1022068.490]: selected /dev/ptp2 as PTP clock
> ptp4l[1022068.510]: port 1 (enp244s0f0): INITIALIZING to LISTENING on INIT_COMPLETE
> ptp4l[1022068.510]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
> ptp4l[1022068.510]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
> ptp4l[1022070.454]: port 1 (enp244s0f0): new foreign master 527b94.fffe.96b1f3-1
> ptp4l[1022074.454]: selected best master clock 527b94.fffe.96b1f3
> ptp4l[1022074.454]: port 1 (enp244s0f0): LISTENING to UNCALIBRATED on RS_SLAVE
> ptp4l[1022076.454]: master offset 3148999551 s0 freq +0 path delay 1466
> ptp4l[1022077.482]: master offset 3149000658 s1 freq +1107 path delay 1615
> ptp4l[1022078.029]: timed out while polling for tx timestamp
> ptp4l[1022078.029]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
> ptp4l[1022078.029]: port 1 (enp244s0f0): send delay request failed
> ptp4l[1022078.029]: port 1 (enp244s0f0): UNCALIBRATED to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
> ptp4l[1022082.057]: port 1 (enp244s0f0): FAULTY to LISTENING on INIT_COMPLETE
^^^
Specifically here.
> ptp4l[1022082.455]: port 1 (enp244s0f0): new foreign master 527b94.fffe.96b1f3-1
> ptp4l[1022086.455]: selected best master clock 527b94.fffe.96b1f3
> ptp4l[1022086.455]: port 1 (enp244s0f0): LISTENING to UNCALIBRATED on RS_SLAVE
> ptp4l[1022087.460]: master offset -7124120 s2 freq -7123013 path delay 1615
> ptp4l[1022087.460]: port 1 (enp244s0f0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
> ptp4l[1022088.460]: master offset -39903 s2 freq -2176032 path delay 1615
> ptp4l[1022089.460]: master offset 2165416 s2 freq +17316 path delay 1466
> ptp4l[1022090.460]: master offset 2161742 s2 freq +663267 path delay 1615
> ptp4l[1022091.460]: master offset 1503260 s2 freq +653307 path delay 1615
> ptp4l[1022092.460]: master offset 850970 s2 freq +451995 path delay 1764
> ptp4l[1022093.460]: master offset 398679 s2 freq +254995 path delay 2160
> ptp4l[1022094.460]: master offset 143441 s2 freq +119361 path delay 2556
> ptp4l[1022095.460]: master offset 2567 s2 freq +21519 path delay 24523
If you're seeing that transition but the port fails to actually recover
(i.e. timestamps never begin working again), this is likely a fault of
the driver or hardware for the device.
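
To check whether a port really leaves FAULTY in your own logs, something
like the following may help. It is only a sketch: it assumes the
one-entry-per-line log format shown above, and the function name
fault_recoveries and the regular expression are my own illustration, not
part of linuxptp.

```python
import re

# Matches port state transitions such as
# "ptp4l[1022078.029]: port 1 (enp244s0f0): UNCALIBRATED to FAULTY on ..."
TRANSITION = re.compile(
    r"ptp4l\[(?P<ts>\d+\.\d+)\]: port (?P<port>\d+)\b.*?: "
    r"(?P<old>\w+) to (?P<new>\w+) on "
)


def fault_recoveries(lines):
    """Return (port, fault_time, recovery_time) tuples.

    recovery_time is None for a port that entered FAULTY and never left it
    within the given lines.
    """
    pending = {}  # port number -> timestamp at which it entered FAULTY
    results = []
    for line in lines:
        m = TRANSITION.search(line)
        if not m:
            continue
        ts, port = float(m["ts"]), int(m["port"])
        if m["new"] == "FAULTY":
            pending[port] = ts
        elif m["old"] == "FAULTY" and port in pending:
            results.append((port, pending.pop(port), ts))
    # Anything still pending never recovered.
    results.extend((port, ts, None) for port, ts in pending.items())
    return results


sample = [
    "ptp4l[1022078.029]: port 1 (enp244s0f0): UNCALIBRATED to FAULTY on "
    "FAULT_DETECTED (FT_UNSPECIFIED)",
    "ptp4l[1022082.057]: port 1 (enp244s0f0): FAULTY to LISTENING on INIT_COMPLETE",
]
for port, t_fault, t_ok in fault_recoveries(sample):
    if t_ok is None:
        print(f"port {port}: faulted at {t_fault:.3f}, never recovered")
    else:
        print(f"port {port}: recovered after {t_ok - t_fault:.3f} s")
```

A port reported as never recovered, or one that re-enters FAULTY right
after every reset, points at the driver or hardware rather than at ptp4l.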
Thanks,
Jake
_______________________________________________
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel