Re: [Linuxptp-devel] Handling missing hardware timestamps on embedded systems
On Wed, Feb 21, 2018 at 04:26:06PM +0100, Oliver Westermann wrote: > I got curious after your last messages and again dove deeper into the > kernel and drivers and tried to find out what happens when. > Just FYI: I inherited the code, so feel free to criticize and suggest > improvements where needed ;-) We can't really comment on invisible code, so my first criticism is, post the driver to linux netdev as a RFC. Thanks, Richard -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
Re: [Linuxptp-devel] Handling missing hardware timestamps on embedded systems
2018-02-16 17:51 GMT+01:00 Keller, Jacob E: > > Even the interrupt itself takes up to 12 ms sometimes? > I got curious after your last messages and again dove deeper into the kernel and drivers and tried to find out what happens when. Just FYI: I inherited the code, so feel free to criticize and suggest improvements where needed ;-) Also I'm quite new to anything on linux kernel level so forgive me if I mess up somewhere. My Setup was again two devices, I've created extra CPU Load AND Network Load to increase the chance of big delays and I got some. Here is a small extract from a kernel trace including trace messages from ptp4l (hoping gmail does properly send the formatting): # tracer: nop # # entries-in-buffer/entries-written: 6694/6694 #P:2 # # _-=> irqs-off # / _=> need-resched #| / _---=> hardirq/softirq #|| / _--=> preempt-depth #||| / delay # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | ptp4l-587 [000] ...1 2799.157884: tracing_mark_write: send delay_req ptp4l-587 [000] ...1 2799.157900: tracing_mark_write: transport_send() seq id: 31495/1915 ptp4l-587 [000] d.H2 2799.158000: phy_interrupt: phy_interrupt "handled" ptp4l-587 [000] ...1 2799.158048: tracing_mark_write: enter sk_receive kworker/0:1-1710 [000] 2799.162773: add_event.constprop.9: add_event id 1915 kworker/0:1-1710 [000] 2799.170782: clock_find_ts.part.0: matched id 1915 ptp4l-587 [000] ...1 2799.170791: tracing_mark_write: poll worked What I now notice is that the interrupt in fact is not delayed (takes ~0.1ms). But by comments in the phy_interrupt() function (in phy.c) of the linux kernel thats not the place to handle the interrupt, so it schedules a work item in a work queue. This work queue is handled by the phy_change() function (in phy.c), which in a subfunction retrieves the timestamp from the phy (this is the add_event call from the trace above). This is delayed by 4ms. IMHO, this looks like proper design of this driver. The actual interrupt handler is short and according to a comment in the function phy_interrupt () "The MDIO bus is not allowed to be written in interrupt context", so I cant't read the timestamps there. The package send also sends a skb into the txtstamp function, which does set the SKBTX_IN_PROGRESS flag as described by Jake. As the package is not send out yet, is queues the skb in a workqueue which is part of the phy driver. The handler of this work queue matches the skbs to the timestamps queued by the add_event() function and sends them back up to ptp4l. This handler shows itself above by the " clock_find_ts" line and is as well delayed by another 8ms, resulting in a 12 ms delay from sending to skb returning. I do understand that this is an issue due to two scheduled workers, so what would be a proper design of the "second stage" of the timestamp readout process? And is there something like a good, readable phy driver I can look into for this? Best regards, Olli -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
Re: [Linuxptp-devel] Handling missing hardware timestamps on embedded systems
On Wed, Feb 21, 2018 at 04:26:06PM +0100, Oliver Westermann wrote: > This work queue is handled by the phy_change() function (in phy.c), which > in a subfunction retrieves the timestamp from the phy (this is the > add_event call from the trace above). This is delayed by 4ms. That code in phy.c is for the phy state machine. It is not time critical, and so using a work item is appropriate. However your packet time stamps *are* time critical. I wouldn't put the time stamp code there. You can't really increase the scheduling priority of work queues. We have a ready made kworker in ptp_clock.c. ptp->kworker = kthread_create_worker(0, worker_name ? worker_name : info->name); You can use that, and then set the priority of the kthread using chrt. HTH, Richard -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
Re: [Linuxptp-devel] Handling missing hardware timestamps on embedded systems
Hi, > -Original Message- > From: Oliver Westermann [mailto:owesterm...@gmail.com] > Sent: Friday, February 16, 2018 1:20 AM > To: Richard Cochran <richardcoch...@gmail.com> > Cc: linuxptp-devel@lists.sourceforge.net > Subject: Re: [Linuxptp-devel] Handling missing hardware timestamps on > embedded systems > > I may have run into a design constraint which I understood now, hopefully. > > The issue were to strict timeouts that sometimes ran into an issue. > The current process is like this: > > - ptp4l sends a DELAY_REQ package into the network stack and sends a SKB to > retrieve the timestamp > - when the PHY handles it, it saves the timestamp and generates an interrupt > - the interrupt handler of my PHY reads seq_id and timestamp and places an > event with both into a queue > - the PHY worker fetches the SKBs and searches for matching events in the > queue > So you're likely adding increased delays and potential pitfalls for a couple reasons here: When ptp4l sends a delay request, it indicates an SKB which sets the SKBTX_HW_TSTAMP bit. Your driver responds to this by setting the SKB_TX_IN_PROGRESS bit which indicates that you will be doing a Tx timestamp. You should only set this bit if you are capable of generating a timestamp. If your hardware can only timestamp one packet at a time, then you should be sure to never set this again until the outstanding request is finished, and it is expected that most hardware only supports one outstanding request at once. That's one pitfall you should be aware of. Next, you have an interrupt which the PHY signals that a timestamp is ready, followed by a separate worker thread to handle this? The problem with an extra worker thread here, is that it will potentially be hit by unknown delays due to scheduling. If possible you may be able to handle the work directly in the interrupt thread. Especially easy if you only support a single outstanding timestamp request, since there wouldn't need to be a queue at all. > I added some tracing into ptp4l and my phy driver and measured the timings > over > some minutes of PTP (with additional load on the network interface, additional > load on cpus, all timeouts set to 100ms, 2170 measurements): > > send to interrupt, mean/standard deviation/max in ms: 2,97 / 1,48 / 12,85 > > interrupt to match: 0,55 / 0,16 / 6,86 > send to match: 3,52 / 1,62 / 13,21 > > From that perspective my tx_timestamp_timeout of 10ms was to short. I would > raise it to 18ms (~50% extra margin). > So if I understand this correctly, the mean time to match is 3.5 seconds, but there's occasional stalls? Even the interrupt itself takes up to 12 ms sometimes? It's quite possible that's just time until you service the interrupt, which could be impacted by a number of things. A number of other drivers forgo using interrupts, and instead poll using the "do_work" ptp kernel thread, which is likely more resillient against delays compared to something such as a workqueue item. > Is there a reason not to do this? You can change the configuration to fit what you need, but I suspect this can be fixed by improving the driver. Thanks, Jake > > Thanks, Olli -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
Re: [Linuxptp-devel] Handling missing hardware timestamps on embedded systems
2018-02-13 19:36 GMT+01:00 Richard Cochran: > On Tue, Feb 13, 2018 at 07:01:22PM +0100, Oliver Westermann wrote: > > The issue is that the PHY only has two slots for timestamps, one for > > outgoing packages and one for incoming packages. If the device in > question > > is a ptp master and has multiple slave, it sometimes happen that both > > DELAY_REQ packages come in short succession, which results in the PHY > > driver/interrupt handler being to slow to get all timestamps. > > This is actually a bug in your driver. If an incoming delay request > misses a time stamp, no fault is generated. The *transmit* time stamp > is still available, but your driver fails to fetch it. > Thank you for your hints, you pointed me in the right direction. > This should not happen. You should fix your driver so that it always > fetches the Tx time stamp after sending a marked packet. No interrupt > is necessary. > I may have run into a design constraint which I understood now, hopefully. The issue were to strict timeouts that sometimes ran into an issue. The current process is like this: - ptp4l sends a DELAY_REQ package into the network stack and sends a SKB to retrieve the timestamp - when the PHY handles it, it saves the timestamp and generates an interrupt - the interrupt handler of my PHY reads seq_id and timestamp and places an event with both into a queue - the PHY worker fetches the SKBs and searches for matching events in the queue I added some tracing into ptp4l and my phy driver and measured the timings over some minutes of PTP (with additional load on the network interface, additional load on cpus, all timeouts set to 100ms, 2170 measurements): send to interrupt, mean/standard deviation/max in ms: 2,97 / 1,48 / 12,85 interrupt to match: 0,55 / 0,16 / 6,86 send to match: 3,52 / 1,62 / 13,21 >From that perspective my tx_timestamp_timeout of 10ms was to short. I would raise it to 18ms (~50% extra margin). Is there a reason not to do this? Thanks, Olli 2018-02-13 19:36 GMT+01:00 Richard Cochran : > On Tue, Feb 13, 2018 at 07:01:22PM +0100, Oliver Westermann wrote: > > The issue is that the PHY only has two slots for timestamps, one for > > outgoing packages and one for incoming packages. If the device in > question > > is a ptp master and has multiple slave, it sometimes happen that both > > DELAY_REQ packages come in short succession, which results in the PHY > > driver/interrupt handler being to slow to get all timestamps. > > This is actually a bug in your driver. If an incoming delay request > misses a time stamp, no fault is generated. The *transmit* time stamp > is still available, but your driver fails to fetch it. > > > From a PTP perspective this is not a big issue: A single missing delay > > measurement doesn't break a clock. > > Right. > > > But PTP4L threats this as a critical error, printing > > "timed out while polling for tx timestamp", > > This should not happen. You should fix your driver so that it always > fetches the Tx time stamp after sending a marked packet. No interrupt > is necessary. > > Thanks, > Richard > -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
Re: [Linuxptp-devel] Handling missing hardware timestamps on embedded systems
On Tue, Feb 13, 2018 at 07:01:22PM +0100, Oliver Westermann wrote: > The issue is that the PHY only has two slots for timestamps, one for > outgoing packages and one for incoming packages. If the device in question > is a ptp master and has multiple slave, it sometimes happen that both > DELAY_REQ packages come in short succession, which results in the PHY > driver/interrupt handler being to slow to get all timestamps. This is actually a bug in your driver. If an incoming delay request misses a time stamp, no fault is generated. The *transmit* time stamp is still available, but your driver fails to fetch it. > From a PTP perspective this is not a big issue: A single missing delay > measurement doesn't break a clock. Right. > But PTP4L threats this as a critical error, printing > "timed out while polling for tx timestamp", This should not happen. You should fix your driver so that it always fetches the Tx time stamp after sending a marked packet. No interrupt is necessary. Thanks, Richard -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
Re: [Linuxptp-devel] Handling missing hardware timestamps on embedded systems
On Tue, Feb 13, 2018 at 07:01:22PM +0100, Oliver Westermann wrote: > Is there a reason not to do this or a better idea to use linuxptp on > systems with similar hardware constraints? I have had such HW, and I usually just set "fault_reset_interval ASAP" in the configuration. > Are there design considerations within linuxptp that I should know of > before implementing and submitting such a fix? The vast majority of HW is not limited in this respect, and there are no mainline drivers that support such HW. My feeling is that the linuxptp stack shouldn't be filled with special cases just for buggy or deficient HW designs. Thanks, Richard -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
[Linuxptp-devel] Handling missing hardware timestamps on embedded systems
I've a setup of multiple devices with a ARM CPU using a Marvell PHY with hardware timestamping capabilitys for networking. These Marvell PHYs analyse incoming and outgoing packages for PTP packages, saves a hardware timestamp and issues a interrupt. The PHY driver catches the interrupt, gets the timestamp and places it in a queue for ptp4l to get it via sk_receive(). The issue is that the PHY only has two slots for timestamps, one for outgoing packages and one for incoming packages. If the device in question is a ptp master and has multiple slave, it sometimes happen that both DELAY_REQ packages come in short succession, which results in the PHY driver/interrupt handler being to slow to get all timestamps. >From a PTP perspective this is not a big issue: A single missing delay measurement doesn't break a clock. But PTP4L threats this as a critical error, printing "timed out while polling for tx timestamp", "increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug" and going into the "PS_FAULTY" state. My first idea for the fix is to add a EV_NONCRIT_FAULT_DETECTED event, fired on such occasions. This event would be counted in the fsm, reset by EV_NONE and resulting in a PS_FAULTY after x appearances. Is there a reason not to do this or a better idea to use linuxptp on systems with similar hardware constraints? Are there design considerations within linuxptp that I should know of before implementing and submitting such a fix? -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel