On Thu, May 28, 2020 at 08:36:08PM +0000, Mikolaj Kucharski wrote:
> On Wed, May 27, 2020 at 07:54:32AM +0000, Mikolaj Kucharski wrote:
> > On Wed, May 27, 2020 at 09:31:00AM +0200, Stefan Sperling wrote:
> > > > Uptime of 3h37m with following two entries (from dmesg):
> > > 
> > > So this uptime is a lot better than what you saw before?
> > 
> > I actually cannot tell whether it is better or not. This PC Engines
> > machine runs -current and I upgrade it very regularly. Uptime below a
> > week is normal. Uptime of 30+ days would probably be because I'm traveling and
> > I don't want to do remote upgrades. With COVID-19 I'm not really
> > traveling these days, so no long uptimes for that box.
> 
> At the time of writing this email, the access point has 36 hrs of uptime
> and I was not able to trigger a kernel panic, nor did any XXX messages show
> up in dmesg with the latest version of the diff (int rekeysta = 0).
> 
> As I cannot repro the panic, I guess for now I don't have anything more
> to add to this thread, except that your diff works, Stefan.
 
Thank you, Mikolaj. I have committed the fix.

> I can trigger athn device timeouts, but this looks like a different
> issue, so I may start a new thread about it. For now I need to think
> about how to collect anything useful for this problem, because apart from
> the dmesg messages I don't have any other information about it.

Yeah, I occasionally see those, too.

"device timeout" happens when hardware does not report Tx success/failure
back to the driver after some time has passed. The hardware device is
supposed to assert an interrupt whenever a frame on its queue has been
transmitted successfully, or if transmission has failed, so that the
driver can clean up resources the OS has allocated to that particular frame.

When "device timeout" is logged, such an interrupt did not occur within a
couple of seconds, and the driver will simply free all queued frames,
reset the device, and start over. It's unclear why the problem happens.
There could be many reasons. In any case, the driver can recover from such
errors and they usually only affect one or a couple of frames.
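
To illustrate the recovery path described above, here is a minimal
standalone sketch of a Tx watchdog in C. It is a simulation, not the actual
athn(4) code: the names sim_softc, sim_start, sim_tx_intr and sim_watchdog
are made up for this example, and the timeout of 5 ticks (assumed to be
roughly one second each) is an assumption.

```c
#include <assert.h>
#include <stdio.h>

#define SIM_TX_TIMEOUT 5	/* assumed: watchdog ticks before giving up */

/* Hypothetical driver state; not the real athn(4) softc. */
struct sim_softc {
	int queued;	/* frames handed to hardware, awaiting a Tx interrupt */
	int tx_timer;	/* ticks left before "device timeout"; 0 = disarmed */
	int resets;	/* number of times the device was reset */
};

/* Queue one frame for transmission and (re)arm the watchdog. */
static void
sim_start(struct sim_softc *sc)
{
	sc->queued++;
	sc->tx_timer = SIM_TX_TIMEOUT;
}

/* Tx interrupt: hardware reported success or failure for one frame. */
static void
sim_tx_intr(struct sim_softc *sc)
{
	assert(sc->queued > 0);
	sc->queued--;			/* release resources held for that frame */
	if (sc->queued == 0)
		sc->tx_timer = 0;	/* nothing pending, disarm the watchdog */
}

/* Called once per tick. Returns 1 if a timeout was handled, else 0. */
static int
sim_watchdog(struct sim_softc *sc)
{
	/* Disarmed, or armed but not yet expired: nothing to do. */
	if (sc->tx_timer == 0 || --sc->tx_timer > 0)
		return 0;

	/* No Tx interrupt arrived in time. */
	printf("sim0: device timeout\n");
	sc->queued = 0;		/* free all queued frames */
	sc->resets++;		/* reset the device and start over */
	return 1;
}
```

In this sketch, calling sim_start() twice and then sim_watchdog() five
times with no sim_tx_intr() in between logs one timeout, leaving queued
at 0 and resets at 1. If sim_tx_intr() drains the queue first, the
watchdog stays silent.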
