Re: [Linuxptp-devel] tx_timestamp_timeout default
jacob.e.kel...@intel.com said: > We get into the kthread function within a few hundred usec or less, and then > the firmware read takes a long time, sometimes over 2 milliseconds. Why is it taking so long? How long does it take when things go well? Is there anything complicated going on with a "firmware read"? Is there a software lock, maybe because some other code is using related hardware resources? Is the thread not getting scheduled? --- My first try was rejected by the list software because I sent it from the wrong address. -- These are my opinions. I hate spam. ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
Re: [Linuxptp-devel] [PATCH] Increase the default tx_timestamp_timeout to 5
If the timestamp is available in less than the timeout (5ms) will it still wait for the timeout, or continue processing after the timestamp is received? Eric -Original Message- From: Jacob Keller Sent: Wednesday, July 7, 2021 8:02 PM To: linuxptp-devel@lists.sourceforge.net Subject: [Linuxptp-devel] [PATCH] Increase the default tx_timestamp_timeout to 5 The tx_timestamp_timeout configuration defines the number of milliseconds to wait for a Tx timestamp from the kernel stack. This delay is necessary as Tx timestamps are captured after a packet is sent and reported back via the socket error queue. The current default is to poll for up to 1 millisecond. In practice, it turns out that this is not always enough time for hardware and software to capture the timestamp and report it back. Some hardware designs require reading timestamps over registers or other slow mechanisms. This extra delay results in the timestamp not being sent back to userspace within the default 1 millisecond polling time. If that occurs the following can be seen from ptp4l: ptp4l[4756.840]: timed out while polling for tx timestamp ptp4l[4756.840]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug ptp4l[4756.840]: port 1 (p2p1): send sync failed ptp4l[4756.840]: port 1 (p2p1): MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED) This can confuse users because it implies this is a bug, when the correct solution in many cases is to just increase the timeout to a slightly higher value. Since we know this is a problem for many drivers and hardware designs, lets increase the default timeout. I chose 5 since it is a large enough increase to avoid the issues on test systems I have. We do want to keep this timeout small because it prevents ptp4l from doing any other processing while we wait for the timestamp. An alternative solution would be to refactor ptp4l so that it does not stop and wait for a Tx timestamp, but instead handles the timestamps asynchronously. While this could be done, it adds significant complexity to the application with minimal or no gain. In most cases, hardware is only capable of a single outstanding timestamp at a time, so we cannot send another packet anyways until the first one has completed. Signed-off-by: Jacob Keller --- config.c| 2 +- configs/default.cfg | 2 +- ptp4l.8 | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/config.c b/config.c index 4472d3d9d6f9..f33f177c696a 100644 --- a/config.c +++ b/config.c @@ -323,7 +323,7 @@ struct config_item config_tab[] = { GLOB_ITEM_INT("ts2phc.pulsewidth", 5, 100, 99900), PORT_ITEM_ENU("tsproc_mode", TSPROC_FILTER, tsproc_enu), GLOB_ITEM_INT("twoStepFlag", 1, 0, 1), - GLOB_ITEM_INT("tx_timestamp_timeout", 1, 1, INT_MAX), + GLOB_ITEM_INT("tx_timestamp_timeout", 5, 1, INT_MAX), PORT_ITEM_INT("udp_ttl", 1, 1, 255), PORT_ITEM_INT("udp6_scope", 0x0E, 0x00, 0x0F), GLOB_ITEM_STR("uds_address", "/var/run/ptp4l"), diff --git a/configs/default.cfg b/configs/default.cfg index 64ef3bd7c81d..5e9444da57ee 100644 --- a/configs/default.cfg +++ b/configs/default.cfg @@ -51,7 +51,7 @@ hybrid_e2e0 inhibit_multicast_service 0 net_sync_monitor 0 tc_spanning_tree 0 -tx_timestamp_timeout 1 +tx_timestamp_timeout 5 unicast_listen 0 unicast_master_table 0 unicast_req_duration 3600 diff --git a/ptp4l.8 b/ptp4l.8 index fe9e1502231c..024fad3d19b2 100644 --- a/ptp4l.8 +++ b/ptp4l.8 @@ -496,7 +496,7 @@ switches all implement this option together with the BMCA. .B tx_timestamp_timeout The number of milliseconds to poll waiting for the tx time stamp from the kernel when a message has recently been sent. -The default is 1. +The default is 5. .TP .B check_fup_sync Because of packet reordering that can occur in the network, in the -- 2.31.1.331.gb0c09ab8796f ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
Re: [Linuxptp-devel] [PATCH] Increase the default tx_timestamp_timeout to 5
On Wed, Jul 07, 2021 at 05:02:21PM -0700, Jacob Keller wrote: > diff --git a/config.c b/config.c > index 4472d3d9d6f9..f33f177c696a 100644 > --- a/config.c > +++ b/config.c > @@ -323,7 +323,7 @@ struct config_item config_tab[] = { > GLOB_ITEM_INT("ts2phc.pulsewidth", 5, 100, 99900), > PORT_ITEM_ENU("tsproc_mode", TSPROC_FILTER, tsproc_enu), > GLOB_ITEM_INT("twoStepFlag", 1, 0, 1), > - GLOB_ITEM_INT("tx_timestamp_timeout", 1, 1, INT_MAX), > + GLOB_ITEM_INT("tx_timestamp_timeout", 5, 1, INT_MAX), Let's make it 10 ms. Thanks, Richard ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
Re: [Linuxptp-devel] tx_timestamp_timeout default
On Wed, Jul 07, 2021 at 11:46:16PM +, Keller, Jacob E wrote: > Either way, I found that whether I used a kthread or not I was > unable to avoid the timeout issue with ice hardware because the > delay is caused by the method we must use to access the Tx > timestamps :( We get into the kthread function within a few hundred > usec or less, and then the firmware read takes a long time, > sometimes over 2 milliseconds. Yes, but when using "work" that 2 ms can easily become 200 ms or more on a busy RT system. The work is sched_other, and as such it is effectively running at the lowest priority WRT the sched_fifo tasks. Thanks, Richard ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
[Linuxptp-devel] [PATCH] Increase the default tx_timestamp_timeout to 5
The tx_timestamp_timeout configuration defines the number of milliseconds to wait for a Tx timestamp from the kernel stack. This delay is necessary as Tx timestamps are captured after a packet is sent and reported back via the socket error queue. The current default is to poll for up to 1 millisecond. In practice, it turns out that this is not always enough time for hardware and software to capture the timestamp and report it back. Some hardware designs require reading timestamps over registers or other slow mechanisms. This extra delay results in the timestamp not being sent back to userspace within the default 1 millisecond polling time. If that occurs the following can be seen from ptp4l: ptp4l[4756.840]: timed out while polling for tx timestamp ptp4l[4756.840]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug ptp4l[4756.840]: port 1 (p2p1): send sync failed ptp4l[4756.840]: port 1 (p2p1): MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED) This can confuse users because it implies this is a bug, when the correct solution in many cases is to just increase the timeout to a slightly higher value. Since we know this is a problem for many drivers and hardware designs, lets increase the default timeout. I chose 5 since it is a large enough increase to avoid the issues on test systems I have. We do want to keep this timeout small because it prevents ptp4l from doing any other processing while we wait for the timestamp. An alternative solution would be to refactor ptp4l so that it does not stop and wait for a Tx timestamp, but instead handles the timestamps asynchronously. While this could be done, it adds significant complexity to the application with minimal or no gain. In most cases, hardware is only capable of a single outstanding timestamp at a time, so we cannot send another packet anyways until the first one has completed. Signed-off-by: Jacob Keller --- config.c| 2 +- configs/default.cfg | 2 +- ptp4l.8 | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/config.c b/config.c index 4472d3d9d6f9..f33f177c696a 100644 --- a/config.c +++ b/config.c @@ -323,7 +323,7 @@ struct config_item config_tab[] = { GLOB_ITEM_INT("ts2phc.pulsewidth", 5, 100, 99900), PORT_ITEM_ENU("tsproc_mode", TSPROC_FILTER, tsproc_enu), GLOB_ITEM_INT("twoStepFlag", 1, 0, 1), - GLOB_ITEM_INT("tx_timestamp_timeout", 1, 1, INT_MAX), + GLOB_ITEM_INT("tx_timestamp_timeout", 5, 1, INT_MAX), PORT_ITEM_INT("udp_ttl", 1, 1, 255), PORT_ITEM_INT("udp6_scope", 0x0E, 0x00, 0x0F), GLOB_ITEM_STR("uds_address", "/var/run/ptp4l"), diff --git a/configs/default.cfg b/configs/default.cfg index 64ef3bd7c81d..5e9444da57ee 100644 --- a/configs/default.cfg +++ b/configs/default.cfg @@ -51,7 +51,7 @@ hybrid_e2e0 inhibit_multicast_service 0 net_sync_monitor 0 tc_spanning_tree 0 -tx_timestamp_timeout 1 +tx_timestamp_timeout 5 unicast_listen 0 unicast_master_table 0 unicast_req_duration 3600 diff --git a/ptp4l.8 b/ptp4l.8 index fe9e1502231c..024fad3d19b2 100644 --- a/ptp4l.8 +++ b/ptp4l.8 @@ -496,7 +496,7 @@ switches all implement this option together with the BMCA. .B tx_timestamp_timeout The number of milliseconds to poll waiting for the tx time stamp from the kernel when a message has recently been sent. -The default is 1. +The default is 5. .TP .B check_fup_sync Because of packet reordering that can occur in the network, in the -- 2.31.1.331.gb0c09ab8796f ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
Re: [Linuxptp-devel] tx_timestamp_timeout default
> -Original Message- > From: Richard Cochran > Sent: Wednesday, July 07, 2021 4:29 PM > To: Keller, Jacob E > Cc: linuxptp-devel@lists.sourceforge.net > Subject: Re: [Linuxptp-devel] tx_timestamp_timeout default > > On Wed, Jul 07, 2021 at 10:22:59PM +, Keller, Jacob E wrote: > > I am wondering if there would be support for either (a) increasing > > the default timeout, or (b) adding something to the PTP get > > capabilities for indicating a sort of default timeout for the > > device, and if it's not set in the config then the default timeout > > is used? > > I wouldn't mind increasing the default. > > I doubt capabilities would help, because much depends on the system > being used. We really should replace "work" with the kthread in the > drivers, and then tell people to give the kthreads sched_fifo using > chrt. > > Thanks, > Richard I implemented this as a kthread in the ice driver, and I am hoping to get some time to fix the other Intel drivers. Note that ice did not use the ptp do_aux_work kthread because of needing to handle multiple ports simultaneously across multiple PFs (the clock is shared across all PFs, so they don't all have access to the do_aux_work function which is only controlled by a single PF which allocated the clock). Either way, I found that whether I used a kthread or not I was unable to avoid the timeout issue with ice hardware because the delay is caused by the method we must use to access the Tx timestamps :( We get into the kthread function within a few hundred usec or less, and then the firmware read takes a long time, sometimes over 2 milliseconds. Ok. I'll propose a patch to increase the default. Thanks, Jake ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
Re: [Linuxptp-devel] tx_timestamp_timeout default
On Wed, Jul 07, 2021 at 10:22:59PM +, Keller, Jacob E wrote: > I am wondering if there would be support for either (a) increasing > the default timeout, or (b) adding something to the PTP get > capabilities for indicating a sort of default timeout for the > device, and if it's not set in the config then the default timeout > is used? I wouldn't mind increasing the default. I doubt capabilities would help, because much depends on the system being used. We really should replace "work" with the kthread in the drivers, and then tell people to give the kthreads sched_fifo using chrt. Thanks, Richard ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
[Linuxptp-devel] tx_timestamp_timeout default
Hi, I've been working on implementing support for PTP in the E810 and E822 devices for Intel's ice driver. As part of this work we've discovered that (again...) timestamps can sometimes take longer than the 1millisecond default delay. Although we get an interrupt within 10usec, the timestamp has to be retrieved from a PHY register, and this access is unfortunately mediated by firmware. The result is that while we can sometimes recover the timestamp within the 1000usec default delay, we sometimes need as much as 2000 or 2500 usec. We've been recommending that people default to using tx_timestamp_timeout of 5 or 10 to avoid issues. I am wondering if there would be support for either (a) increasing the default timeout, or (b) adding something to the PTP get capabilities for indicating a sort of default timeout for the device, and if it's not set in the config then the default timeout is used? The reason for this need is because it is confusing for users to be immediately prompted with "this is likely a driver bug" messaging. We know that the only resolution is to wait slightly longer. This is also not the first time we've had issues with the 1msec delay being too short, I believe we've often had to recommend increasing the delay for other Intel parts in the past for similar reasons. Note I tried a number of changes to the driver implementation in order to see if I could reduce the delay, and although I was able to improve it, I was never able to get it consistently below 1000usec, as we're effectively limited by the need to read the register access over the firmware :( Thanks, Jake ___ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel