2018-02-16 17:51 GMT+01:00 Keller, Jacob E <jacob.e.kel...@intel.com>:
>
> Even the interrupt itself takes up to 12 ms sometimes?
>

I got curious after your last messages and again dove deeper into the
kernel and drivers and tried to find out what happens when.
Just FYI: I inherited the code, so feel free to criticize and suggest
improvements where needed ;-) Also I'm quite new to anything on linux
kernel level so forgive me if I mess up somewhere.

My Setup was again two devices, I've created extra CPU Load AND Network
Load to increase the chance of big delays and I got some.
Here is a small extract from a kernel trace including trace messages from
ptp4l (hoping gmail does properly send the formatting):
# tracer: nop
#
# entries-in-buffer/entries-written: 6694/6694   #P:2
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
           ptp4l-587   [000] ...1  2799.157884: tracing_mark_write: send
delay_req
           ptp4l-587   [000] ...1  2799.157900: tracing_mark_write:
transport_send() seq id: 31495/1915
           ptp4l-587   [000] d.H2  2799.158000: phy_interrupt:
phy_interrupt "handled"
           ptp4l-587   [000] ...1  2799.158048: tracing_mark_write: enter
sk_receive
     kworker/0:1-1710  [000] ....  2799.162773: add_event.constprop.9:
add_event id 1915
     kworker/0:1-1710  [000] ....  2799.170782: clock_find_ts.part.0:
matched id 1915
           ptp4l-587   [000] ...1  2799.170791: tracing_mark_write: poll
worked

What I now notice is that the interrupt in fact is not delayed (takes
~0.1ms).
But by comments in the phy_interrupt() function (in phy.c) of the linux
kernel thats not the place to handle the interrupt, so it schedules a work
item in a work queue.

This work queue is handled by the phy_change() function (in phy.c), which
in a subfunction retrieves the timestamp from the phy (this is the
add_event call from the trace above). This is delayed by 4ms.

IMHO, this looks like proper design of this driver. The actual interrupt
handler is short and according to a comment in the function phy_interrupt ()
"The MDIO bus is not allowed to be written in interrupt context", so I
cant't read the timestamps there.

The package send also sends a skb into the txtstamp function, which does
set the SKBTX_IN_PROGRESS flag as described by Jake. As the package is not
send out yet, is queues the skb in a workqueue which is part of the phy
driver.
The handler of this work queue matches the skbs to the timestamps queued by
the add_event() function and sends them back up to ptp4l. This handler
shows itself above by the " clock_find_ts" line and is as well delayed by
another 8ms, resulting in a 12 ms delay from sending to skb returning.

I do understand that this is an issue due to two scheduled workers, so what
would be a proper design of the "second stage" of the timestamp readout
process? And is there something like a good, readable phy driver I can look
into for this?

Best regards, Olli
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel

Reply via email to