> -----Original Message-----
> From: Maciek Machnikowski <mac...@machnikowski.net>
> Sent: Tuesday, July 18, 2023 12:33 AM
> To: Keller, Jacob E <jacob.e.kel...@intel.com>; Richard Cochran
> <richardcoch...@gmail.com>
> Cc: linuxptp-devel@lists.sourceforge.net; Czapnik, Lukasz
> <lukasz.czap...@intel.com>; Kolacinski, Karol <karol.kolacin...@intel.com>;
> Plachno, Lukasz <lukasz.plac...@intel.com>; Pacuszka, MateuszX
> <mateuszx.pacus...@intel.com>; Glaza, Jan <jan.gl...@intel.com>
> Subject: Re: [Linuxptp-devel] [PATCH] sk: don't report random errno on timeout
> 
> 
> 
> On 7/18/2023 1:08 AM, Jacob Keller wrote:
> >
> >
> > On 7/16/2023 1:27 PM, Richard Cochran wrote:
> >> On Fri, Jul 14, 2023 at 08:43:30PM +0000, Keller, Jacob E wrote:
> >>
> >>>> With this patch applied, one will get proper error in last line,
> >>>> "Timer expired", and more modern suggestion about how to approach
> >>>> fixing it
> >>>>
> >>>
> >>>
> >>> I think changing the message about what might be causing timeout is
> >>> unnecessary. It may be helpful purely in the context of some
> >>> devices, but it is not a good general message as not all hardware
> >>> and drivers have the same design. In the *general* case if this
> >>> timeout is hit then it is usually a bug in the driver for that
> >>> hardware. In the specific case for ice hardware, the mention of
> >>> thread starvation is accurate, but that is unlikely to be general
> >>> across all hardware.
> >>
> >> But the point about kthread priority is a good hint.  How about
> >> keeping the part about possible driver bug (since we have had many,
> >> Many, MANY questions on this list when somebody is developing a new
> >> driver) and adding a hint about kworker scheduling priority?
> >>
> >
> > Sounds good to me. It might help us reduce our own support burden when
> > users say "I get this message that says it's a driver bug, and I already
> > tried <ridiculous delay value here>, but still see timeouts", and then
> > we need to educate them on priority to ensure that they aren't starving
> > the thread which processes timestamps.
> >
> > But I agree keeping the message about it possibly being a bug is
> > important because it has been caused by drivers a lot in the past.
> >
> > Maybe something like:
> >
> > timed out while polling for tx timestamp
> > increasing tx_timestamp_timeout or increasing priority of relevant
> > kworker threads may correct this issue, but it is likely caused by a
> > driver bug
> >
> > Thanks,
> > Jake
> 
> What about:
> increasing tx_timestamp_timeout or increasing kworker threads priority
> may correct this issue, but a driver bug likely causes it
> 
> That would work if we also add the instruction how to do that in the
> manual. Otherwise we'll get the same number of questions - just
> different ones :)
> 
> Thanks,
> Maciek
> 

Yeah, a section on the Tx timestamp timeout in the man page with additional
details would be useful.

Thanks,
Jake

> >
> >>> Thus, I think we should leave the error message alone and just fix
> >>> the errno value. Improving the errno value is important since it
> >>> would be less confusing than seeing arbitrary error values which are
> >>> unrelated to the actual error.
> >>
> >> Yeah, errno fix is needed.
> >>
> >> Thanks,
> >> Richard
> >
> >
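
For anyone following the errno part of this thread: poll() returns 0 on a
timeout and leaves errno untouched, so printing errno afterwards reports
whatever an earlier call happened to leave behind. Below is a minimal sketch
of the idea, not the actual sk.c code. The helper name
wait_for_tx_timestamp(), its parameters, and the use of POLLIN are
assumptions made for illustration only.

```c
#include <errno.h>
#include <poll.h>

/*
 * Hypothetical helper (not taken from linuxptp): wait up to timeout_ms
 * for fd to become readable.
 *
 * Returns 0 when data is ready and -1 on failure. On a timeout, errno
 * is set explicitly, because poll() itself does not modify errno when
 * it returns 0.
 */
static int wait_for_tx_timestamp(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int res = poll(&pfd, 1, timeout_ms);

	if (res == 0) {
		/* Timeout: replace the stale errno with a meaningful one. */
		errno = ETIME;
		return -1;
	}
	if (res < 0) {
		/* Real poll() failure: errno was already set by poll(). */
		return -1;
	}
	return 0;
}
```

A caller that logs strerror(errno) after a -1 return would then print the
ETIME text (on glibc, "Timer expired") instead of an unrelated error.
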
> > _______________________________________________
> > Linuxptp-devel mailing list
> > Linuxptp-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
