I have a setup of multiple devices, each with an ARM CPU and a Marvell PHY
with hardware timestamping capabilities for networking.

These Marvell PHYs inspect incoming and outgoing packets for PTP packets,
save a hardware timestamp and issue an interrupt.
The PHY driver catches the interrupt, reads the timestamp and places it in a
queue, where ptp4l picks it up via sk_receive().

The issue is that the PHY has only two slots for timestamps, one for
outgoing packets and one for incoming packets. If the device in question
is a PTP master and has multiple slaves, it sometimes happens that two
DELAY_REQ packets come in in short succession, and the PHY
driver/interrupt handler is then too slow to fetch all timestamps.

From a PTP perspective this is not a big issue: a single missing delay
measurement doesn't break a clock.
But ptp4l treats this as a critical error, printing
"timed out while polling for tx timestamp",
"increasing tx_timestamp_timeout may correct this issue, but it is likely
caused by a driver bug"
and going into the PS_FAULTY state.

My first idea for a fix is to add an EV_NONCRIT_FAULT_DETECTED event,
fired on such occasions.
This event would be counted in the FSM; the counter would be reset by
EV_NONE, and the port would go to PS_FAULTY after x occurrences.
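
To make the proposal concrete, here is a minimal sketch of the counting
logic in isolation. It is not linuxptp code: every name in it
(noncrit_fault_counter, noncrit_fault_update, the SK_EV_* values) is
hypothetical, and the limit of 3 is only a stand-in for the "x" above.

```c
#include <stdbool.h>

/* Hypothetical limit; stands in for the "x" in the proposal. */
#define NONCRIT_FAULT_LIMIT 3

/* Hypothetical stand-ins for EV_NONE and the proposed new event. */
enum sketch_event {
	SK_EV_NONE,
	SK_EV_NONCRIT_FAULT_DETECTED,
};

/* Per-port counter of consecutive non-critical faults. */
struct noncrit_fault_counter {
	unsigned int count;
};

/*
 * Feed one event into the counter.
 *
 * Returns true when NONCRIT_FAULT_LIMIT non-critical faults have been
 * seen without an intervening SK_EV_NONE, i.e. when the caller should
 * escalate to the regular fault handling (PS_FAULTY). The counter is
 * cleared on escalation and on any event other than the non-critical
 * fault.
 */
static bool noncrit_fault_update(struct noncrit_fault_counter *c,
				 enum sketch_event ev)
{
	switch (ev) {
	case SK_EV_NONCRIT_FAULT_DETECTED:
		c->count++;
		if (c->count >= NONCRIT_FAULT_LIMIT) {
			c->count = 0;
			return true;
		}
		return false;
	case SK_EV_NONE:
	default:
		c->count = 0;
		return false;
	}
}
```

With this sketch, two faults followed by SK_EV_NONE leave the port
untouched, while three faults in a row make noncrit_fault_update() return
true. Whether EV_NONE is really the right reset condition in the actual
state machine is part of the question.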

Is there a reason not to do this, or is there a better way to use linuxptp
on systems with similar hardware constraints?
Are there design considerations within linuxptp that I should know of
before implementing and submitting such a fix?
_______________________________________________
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
