On Wed, Jan 03, 2024 at 04:51:39PM +1000, Jonathan Matthew wrote:
> On Wed, Jan 03, 2024 at 01:50:06AM +0100, Alexander Bluhm wrote:
> > On Wed, Jan 03, 2024 at 12:26:26AM +0100, Hrvoje Popovski wrote:
> > > While testing kettenis@ ipl diff from tech@ and doing iperf3 to bnxt
> > > interface and ifconfig bnxt0 down/up at the same time I can trigger
> > > panic. Panic can be triggered without kettenis@ diff...
> >
> > It is easy to reproduce. ifconfig bnxt1 down/up a few times while
> > receiving TCP traffic with iperf3. Machine still has kettenis@ diff.
> > My panic looks different.
>
> It looks like I wasn't trying very hard when I wrote bnxt_down().
> I think there's also a problem with bnxt_up() unwinding after failure
> in various places, but that's a different issue.
>
> This makes it a more resilient for me, though it still logs
> 'bnxt0: unexpected completion type 3' a lot if I take the interface
> down while it's in use. I'll look at that separately.
Should we intr_barrier(sc->sc_queues[0].q_ihc) if sc->sc_intrmap == NULL ?
All these barriers make sense to me. OK bluhm@
> Index: if_bnxt.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/if_bnxt.c,v
> retrieving revision 1.39
> diff -u -p -r1.39 if_bnxt.c
> --- if_bnxt.c 10 Nov 2023 15:51:20 -0000 1.39
> +++ if_bnxt.c 3 Jan 2024 06:36:02 -0000
> @@ -1158,12 +1159,16 @@ bnxt_down(struct bnxt_softc *sc)
>
> CLR(ifp->if_flags, IFF_RUNNING);
>
> + intr_barrier(sc->sc_ih);
> +
> for (i = 0; i < sc->sc_nqueues; i++) {
> ifq_clr_oactive(ifp->if_ifqs[i]);
> ifq_barrier(ifp->if_ifqs[i]);
> - /* intr barrier? */
>
> - timeout_del(&sc->sc_queues[i].q_rx.rx_refill);
> + timeout_del_barrier(&sc->sc_queues[i].q_rx.rx_refill);
> +
> + if (sc->sc_intrmap != NULL)
> + intr_barrier(sc->sc_queues[i].q_ihc);
> }
>
> bnxt_hwrm_free_filter(sc, &sc->sc_vnic);
>