On 08/06/17(Thu) 20:38, Björn Ketelaars wrote:
> On Thu 08/06/2017 16:55, Martin Pieuchot wrote:
> > On 07/06/17(Wed) 09:43, Björn Ketelaars wrote:
> > > On Sat 03/06/2017 08:44, Björn Ketelaars wrote:
> > > >
> > > > Reverting back to the previous kernel fixed the issue above. Question:
> > > > can someone give a hint on how to track this issue?
> > >
> > > After a bit of experimenting I'm able to reproduce the problem. Summary is
> > > that queueing in pf and use of a current (after May 30), multi processor
> > > kernel (bsd.mp from snapshots) causes these specific watchdog timeouts
> > > followed by a system freeze.
> > >
> > > Issue is 'gone' when:
> > > 1.) using an older kernel (before May 30);
> > > 2.) removal of queueing statements from pf.conf. Included below the
> > > specific snippet;
> > > 3.) switch from MP kernel to SP kernel.
> > >
> > > New observation is that while queueing, using a MP kernel, the download
> > > bandwidth is only a fraction of what is expected. Exchanging the MP kernel
> > > with a SP kernel restores the download bandwidth to expected level.
> > >
> > > I'm guessing that this issue is related to recent work on PF?
> >
> > It's certainly a problem in, or exposed by, re(4) with the recent MP work
> > in the network stack.
> >
> > It would help if you could build a kernel with MP_LOCKDEBUG defined and
> > see if the resulting kernel enters ddb(4) instead of freezing.
> >
> > Thanks,
> > Martin
>
> Thanks for the hint! It helped in entering ddb. I collected a bit of output,
> which you can find below. If I read the trace correctly the crash is related
> to line 1750 of sys/dev/ic/re.c:
>
> d->rl_cmdstat |= htole32(RL_TDESC_CMD_EOF);

Could you test the diff below, always with an MP_LOCKDEBUG kernel, and
tell us whether you can still reproduce the freeze or whether the kernel
enters ddb(4)?

Another question: how often do you see "watchdog timeout" messages?

Index: re.c
===================================================================
RCS file: /cvs/src/sys/dev/ic/re.c,v
retrieving revision 1.201
diff -u -p -r1.201 re.c
--- re.c 24 Jan 2017 03:57:34 -0000 1.201
+++ re.c 9 Jun 2017 10:04:43 -0000
@@ -2074,9 +2074,6 @@ re_watchdog(struct ifnet *ifp)
s = splnet();
printf("%s: watchdog timeout\n", sc->sc_dev.dv_xname);
- re_txeof(sc);
- re_rxeof(sc);
-
re_init(ifp);
splx(s);