On 2015-05-09 17:10:56, Brad Smith <[email protected]> wrote:
> On Sun, May 03, 2015 at 12:16:21PM +0200, Mark Kettenis wrote:
> > > Date: Sun, 3 May 2015 02:38:12 -0700
> > > From: Bryan Linton <[email protected]>
> > >
> > > The key difference is the following two lines. The first wedged,
> > > the second unwedged:
> > > em0 2048 2 2 256 2
> > > em0 2048 4 2 256 4
> > >
> > > It seems like the em0 line always shows the latter line, so I'm hoping
> > > this indicates something useful.
> >
> > I believe em(4) needs at least 4 descriptors on its rx ring. If you
> > fall below that limit (which can happen if the CPU is busy), the card
> > stops receiving packets. I believe em(4) is supposed to get an
> > interrupt for dropped packets and will attempt to refill the ring if
> > it receives such an interrupt. It seems that mechanism isn't working
> > for your card. Perhaps the usual workaround of using a timeout should
> > be added to the driver.
>
> Good hint, rev 1.280 broke the chips not capable of jumbo frames. Correcting
> the low watermark for the affected chips resolves the issue Christian and
> Bryan were experiencing.
>
>
> Index: if_em.c
> ===================================================================
> RCS file: /home/cvs/src/sys/dev/pci/if_em.c,v
> retrieving revision 1.295
> diff -u -p -u -p -r1.295 if_em.c
> --- if_em.c	11 Feb 2015 23:21:47 -0000	1.295
> +++ if_em.c	3 May 2015 11:17:49 -0000
> @@ -2597,6 +2597,7 @@ int
>  em_setup_receive_structures(struct em_softc *sc)
>  {
>  	struct ifnet *ifp = &sc->interface_data.ac_if;
> +	u_int lwm;
> 
>  	memset(sc->rx_desc_base, 0,
>  	    sizeof(struct em_rx_desc) * sc->num_rx_desc);
> @@ -2608,8 +2609,12 @@ em_setup_receive_structures(struct em_so
>  	sc->next_rx_desc_to_check = 0;
>  	sc->last_rx_desc_filled = sc->num_rx_desc - 1;
> 
> -	if_rxr_init(&sc->rx_ring, 2 * ((ifp->if_hardmtu / MCLBYTES) + 1),
> -	    sc->num_rx_desc);
> +	if (sc->hw.max_frame_size == ETHER_MAX_LEN)
> +		lwm = 4;
> +	else
> +		lwm = 2 * ((ifp->if_hardmtu / MCLBYTES) + 1);
> +
> +	if_rxr_init(&sc->rx_ring, lwm, sc->num_rx_desc);
> 
>  	if (em_rxfill(sc) == 0) {
>  		printf("%s: unable to fill any rx descriptors\n",
>
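
For anyone following along, here is a rough standalone sketch of the
arithmetic the patch changes. It is not driver code. The constants are
assumptions on my part (ETHER_MAX_LEN = 1518, MCLBYTES = 2048), the
hardmtu of 1500 for a non-jumbo chip is assumed, and the helper names
and the jumbo sizes in lwm_examples() are made up for illustration.

```c
#include <assert.h>

/* Assumed values; check your own tree before relying on them. */
#define ETHER_MAX_LEN	1518u	/* max standard Ethernet frame, with CRC */
#define MCLBYTES	2048u	/* mbuf cluster size (assumed) */

/* Low watermark as computed before the patch (hypothetical helper). */
unsigned int
lwm_before(unsigned int hardmtu)
{
	return 2 * ((hardmtu / MCLBYTES) + 1);
}

/* Low watermark as computed with the patch (hypothetical helper). */
unsigned int
lwm_after(unsigned int max_frame_size, unsigned int hardmtu)
{
	if (max_frame_size == ETHER_MAX_LEN)
		return 4;	/* chips that cannot do jumbo frames */
	return 2 * ((hardmtu / MCLBYTES) + 1);
}

/* Worked examples; the jumbo sizes (9234/9216) are illustrative only. */
void
lwm_examples(void)
{
	/* 1500 / 2048 == 0 in integer division, so the old result is 2. */
	assert(lwm_before(1500) == 2);
	/* With the patch the same chip gets the minimum of 4. */
	assert(lwm_after(ETHER_MAX_LEN, 1500) == 4);
	/* Jumbo-capable case is unchanged: 9216 / 2048 == 4, so 2 * 5. */
	assert(lwm_after(9234, 9216) == 10);
}
```

If those assumptions hold, any hardmtu below MCLBYTES gave a low
watermark of 2 before the patch, which is under the 4 descriptors Mark
says em(4) needs. The jumbo-capable path computes the same value as
before.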

I can confirm that this patch has fixed the issue for me and has caused
no regressions. I would very much like to see it committed since it
resolves a major regression for me that is basically a DoS.

If an attacker can force a machine with one of these em(4) NICs to
consume a lot of CPU and/or memory resources (maybe by retrieving a
webpage thousands of times that is generated from a script that runs
some moderately intensive commands) then it is possible to remotely DoS
a machine with any of these em(4) chips.

Regardless of that, I'm happy that just running "pkg_add -ui" will
finish without bringing the network down anymore. :)

-- 
Bryan
