On Sun, Jan 28, 2024 at 12:16:20AM +0100, Hrvoje Popovski wrote:

> On 27.1.2024. 21:01, Marcus Glocker wrote:
> > On Sat, Jan 27, 2024 at 08:01:09AM +0100, Hrvoje Popovski wrote:
> > 
> >> On 26.1.2024. 21:56, Marcus Glocker wrote:
> >>> On Fri, Jan 26, 2024 at 11:41:49AM +0100, Hrvoje Popovski wrote:
> >>>
> >>>> I've managed to reproduce the TSO em problem on another setup,
> >>>> unfortunately a production one.
> >>>>
> >>>> Setup is very simple
> >>>>
> >>>> em0 - carp <- uplink
> >>>> em1 - pfsync
> >>>> ix1 - vlans - carp
> >>> Would it be possible that you also share an "ifconfig -a hwfeatures" of
> >>> that box?  You can mask the IPs if it's too sensitive.
> >>>
> >>> I'm still trying to reproduce the issue here, and so far I can't.
> >>> Maybe in your full ifconfig output I can see some specifics about your
> >>> configuration, which makes it more likely to reproduce the issue here.
> >>>
> >> Hi,
> >>
> >> here's the ifconfig output from the second setup, where the watchdog is
> >> triggered much faster.  Originally the uplink in this setup was ix0; I've
> >> changed it to em0 to see whether the problem would be the same as in the
> >> other setup, and it is.  That's good, because this is a pfsync setup for
> >> students and I can do whatever I want with it :)
> > Thanks.
> > 
> > But still, I can do whatever I want on my em(4) I210 box, carp(4),
> > vlan(4), creating a lot of traffic, I can't reproduce the watchdog which
> > you are seeing :-(  I'm not sure if this is something related to your
> > I350.
> > 
> > Also, I can't understand why the watchdog still triggers when you disable
> > TSO by setting net.inet.tcp.tso=0.
> > 
> > Just to rule out that you're receiving a MAXMCLBYTES (65536) packet,
> > while EM_TSO_SIZE (65535) is one byte less, can you please apply this
> > diff to -current and test it?  I doubt it will make a difference, but
> > I'm running a bit out of ideas here.
> 
> 
> Hi,
> 
> with this diff I'm still getting em watchdog
> 
> Jan 28 00:14:12 bcbnfw1 /bsd: em0: watchdog: head 120 tail 185 TDH 185 TDT 120

Thanks for testing again.
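
For reading the watchdog line above, here is a minimal sketch of how many
TX descriptors appear to be stuck.  It assumes "head" is the driver's
producer index and "tail" is the oldest descriptor not yet cleaned, and
it assumes a 512-slot TX ring; the ring size is not in the report.
tx_pending() and TX_RING_SLOTS are hypothetical names for illustration,
not em(4) identifiers.

```c
#include <assert.h>

/* Assumed ring size (hypothetical; not taken from the report). */
#define TX_RING_SLOTS 512u

/*
 * Number of descriptors queued by the driver but not yet cleaned,
 * given the producer index (head) and the consumer index (tail).
 * Adding the ring size before subtracting handles wrap-around.
 */
static unsigned int
tx_pending(unsigned int head, unsigned int tail, unsigned int slots)
{
	return (head + slots - tail) % slots;
}
```

Under those assumptions, tx_pending(120, 185, TX_RING_SLOTS) gives 447,
so the ring would be nearly full while TDH has stopped at the same index
as the driver's tail, which is what a ring stall would look like.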

I think we might have a generic problem with TSO in the current em(4)
code on some chips.  See this recent FreeBSD commit:

e1000: disable TSO on lem(4) and em(4):
Disable TSO on lem(4) and em(4) until a ring stall can be debugged.
https://github.com/freebsd/freebsd-src/commit/797e480cba8834e584062092c098e60956d28180

Can you try this diff to specifically disable TSO for I350 please?

We will need to discuss internally which way to go.  I currently see
these options:

- Entirely pull out the TSO diff.
- Leave the TSO code in but disable TSO for now (what FreeBSD did).
- Leave the TSO code in but disable TSO only for chips we see issues
  with (this diff).


Index: if_em.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_em.c,v
diff -u -p -u -p -r1.370 if_em.c
--- if_em.c     31 Dec 2023 08:42:33 -0000      1.370
+++ if_em.c     28 Jan 2024 09:30:59 -0000
@@ -2013,7 +2013,9 @@ em_setup_interface(struct em_softc *sc)
        if (sc->hw.mac_type >= em_82575 && sc->hw.mac_type <= em_i210) {
                ifp->if_capabilities |= IFCAP_CSUM_IPv4;
                ifp->if_capabilities |= IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
-               ifp->if_capabilities |= IFCAP_TSOv4 | IFCAP_TSOv6;
+               /* XXX: Enabling TSO on I350 causes watchdogs */
+               if (sc->hw.mac_type != em_i350)
+                       ifp->if_capabilities |= IFCAP_TSOv4 | IFCAP_TSOv6;
        }
 
        /* 
