On 08/06/17(Thu) 20:38, Björn Ketelaars wrote:
> On Thu 08/06/2017 16:55, Martin Pieuchot wrote:
> > On 07/06/17(Wed) 09:43, Björn Ketelaars wrote:
> > > On Sat 03/06/2017 08:44, Björn Ketelaars wrote:
> > > > 
> > > > Reverting back to the previous kernel fixed the issue above. Question:
> > > > can someone give a hint on how to track this issue?
> > > 
> > > After a bit of experimenting I'm able to reproduce the problem. In summary,
> > > queueing in pf combined with a current (after May 30) multiprocessor
> > > kernel (bsd.mp from snapshots) causes these specific watchdog timeouts,
> > > followed by a system freeze.
> > > 
> > > Issue is 'gone' when:
> > > 1.) using an older kernel (before May 30);
> > > 2.) removal of queueing statements from pf.conf. Included below the
> > >     specific snippet;
> > > 3.) switch from MP kernel to SP kernel.
> > > 
> > > A new observation is that while queueing with an MP kernel, the download
> > > bandwidth is only a fraction of what is expected. Replacing the MP kernel
> > > with an SP kernel restores the download bandwidth to the expected level.
> > > 
> > > I'm guessing that this issue is related to recent work on PF?
> > 
> > It's certainly a problem in, or exposed by, re(4) with the recent MP work
> > in the network stack.
> > 
> > It would help if you could build a kernel with MP_LOCKDEBUG defined and
> > see if the resulting kernel enters ddb(4) instead of freezing.
> > 
> > Thanks,
> > Martin
> 
> Thanks for the hint! It helped in entering ddb. I collected a bit of output,
> which you can find below. If I read the trace correctly the crash is related
> to line 1750 of sys/dev/ic/re.c:
> 
>       d->rl_cmdstat |= htole32(RL_TDESC_CMD_EOF);

Could you test the diff below, still with an MP_LOCKDEBUG kernel, and
tell us whether you can reproduce the freeze or whether the kernel
enters ddb(4)?

Another question: how often do you see "watchdog timeout" messages?

Index: re.c
===================================================================
RCS file: /cvs/src/sys/dev/ic/re.c,v
retrieving revision 1.201
diff -u -p -r1.201 re.c
--- re.c        24 Jan 2017 03:57:34 -0000      1.201
+++ re.c        9 Jun 2017 10:04:43 -0000
@@ -2074,9 +2074,6 @@ re_watchdog(struct ifnet *ifp)
        s = splnet();
        printf("%s: watchdog timeout\n", sc->sc_dev.dv_xname);
 
-       re_txeof(sc);
-       re_rxeof(sc);
-
        re_init(ifp);
 
        splx(s);
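
For anyone following along who is unfamiliar with the "watchdog timeout"
message, here is a minimal sketch of the usual interface watchdog pattern,
written as a standalone C model rather than kernel code. It assumes the
common arrangement where the transmit path arms a countdown (if_timer),
TX completion disarms it once nothing is pending, and a once-per-second
tick calls the watchdog when the countdown reaches zero. All model_*
names, the struct, and the 5 second value are assumptions made for
illustration; this is not the re(4) source.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for the interface state. */
struct model_if {
	int if_timer;		/* seconds until the watchdog fires; 0 = disarmed */
	int tx_pending;		/* packets handed to the chip, not yet completed */
	int watchdog_fired;	/* number of times the watchdog has run */
};

/* Transmit path: queue one packet and (re)arm the watchdog. */
static void
model_start(struct model_if *ifp)
{
	ifp->tx_pending++;
	ifp->if_timer = 5;	/* assumed timeout, in ticks of one second */
}

/* TX completion: reclaim packets, disarm once nothing is pending. */
static void
model_txeof(struct model_if *ifp, int completed)
{
	if (completed > ifp->tx_pending)
		completed = ifp->tx_pending;
	ifp->tx_pending -= completed;
	if (ifp->tx_pending == 0)
		ifp->if_timer = 0;
}

/*
 * Watchdog: report and reinitialize only, without first reaping the
 * rings, which is the shape re_watchdog() has with the diff applied.
 */
static void
model_watchdog(struct model_if *ifp)
{
	printf("model0: watchdog timeout\n");
	ifp->watchdog_fired++;
	ifp->tx_pending = 0;	/* the model's "reinit" drops pending packets */
}

/* Once-per-second tick: count down and fire when reaching zero. */
static void
model_slowtimo(struct model_if *ifp)
{
	if (ifp->if_timer > 0 && --ifp->if_timer == 0)
		model_watchdog(ifp);
}
```

In this model the message is printed only if five ticks pass after
model_start() without model_txeof() completing everything that is
pending, so frequent messages would point at TX completions being
delayed or lost rather than at the watchdog itself.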
