Re: TSO em(4) problem
On 26.1.2024. 22:47, Alexander Bluhm wrote:
> On Fri, Jan 26, 2024 at 11:41:49AM +0100, Hrvoje Popovski wrote:
>> I've managed to reproduce the TSO em problem on another setup,
>> unfortunately production.
>
> What helped debugging a similar issue with ixl(4) and TSO was to
> remove all TSO specific code from the driver.  Then only this part
> remains from the original em(4) TSO diff.
>
>	error = bus_dmamap_create(sc->sc_dmat, EM_TSO_SIZE,
>	    EM_MAX_SCATTER / (sc->pcix_82544 ? 2 : 1),
>	    EM_TSO_SEG_SIZE, 0, BUS_DMA_NOWAIT, &pkt->pkt_map);
>
> The parameters that changed when adding TSO are:
>
> bus_size_t size:     MAX_JUMBO_FRAME_SIZE 16128 -> EM_TSO_SIZE 65535
> bus_size_t maxsegsz: MAX_JUMBO_FRAME_SIZE 16128 -> EM_TSO_SEG_SIZE 4096
>
> I suspect that this is the cause for the regression as disabling
> TSO did not help.  Would it be possible to run the diff below?  I
> expect that the problem will still be there.  But then we know it
> must be the change of one of the bus_dmamap_create() arguments.
>
> bluhm

Hi,

with this diff em0 seems happy and the em watchdog is gone.

bcbnfw1# uptime
 8:06AM  up 44 mins, 2 users, load averages: 0.00, 0.00, 0.00

bcbnfw1# ifconfig em0 hwfeatures
em0: flags=8b43 mtu 1500
	hwfeatures=1b7 hardmtu 9216
	lladdr 0c:c4:7a:da:cd:5a
	index 3 priority 0 llprio 3
	groups: egress
	media: Ethernet autoselect (1000baseT full-duplex,master,rxpause)
	status: active
	inet 10.10.155.234 netmask 0xfff8 broadcast 10.10.155.239

This morning, without the diff:

bcbnfw1# cat /var/log/messages | grep watchdog
Jan 27 07:12:03 bcbnfw1 /bsd: em0: watchdog: head 50 tail 114 TDH 114 TDT 50
Jan 27 07:15:29 bcbnfw1 /bsd: em0: watchdog: head 370 tail 434 TDH 434 TDT 370
Jan 27 07:15:43 bcbnfw1 /bsd: em0: watchdog: head 219 tail 283 TDH 283 TDT 219
Jan 27 07:15:54 bcbnfw1 /bsd: em0: watchdog: head 322 tail 386 TDH 386 TDT 322
Jan 27 07:16:08 bcbnfw1 /bsd: em0: watchdog: head 115 tail 179 TDH 179 TDT 115
Jan 27 07:16:21 bcbnfw1 /bsd: em0: watchdog: head 364 tail 428 TDH 428 TDT 364
Jan 27 07:16:35 bcbnfw1 /bsd: em0: watchdog: head 473 tail 26 TDH 26 TDT 473
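
Editorial sketch: the watchdog lines above all follow one pattern, so the distance from "head" forward to "tail" can be computed mechanically. The C fragment below is only an illustration and is not taken from if_em.c. The names wd_sample, wd_parse, wd_gap and the ring size of 512 are assumptions; the thread does not state the ring size, and it only matters for lines where the index wraps.

```c
#include <assert.h>
#include <stdio.h>

/* ASSUMPTION: TX ring size. The thread does not state it. */
#define ASSUMED_TX_RING_SIZE	512

/* Hypothetical container for the four numbers in one watchdog line. */
struct wd_sample {
	unsigned int head, tail, tdh, tdt;
};

/*
 * Parse the part of a log line that starts at "watchdog:".
 * Returns 1 if all four fields were read, 0 otherwise.
 */
static int
wd_parse(const char *line, struct wd_sample *s)
{
	return sscanf(line, "watchdog: head %u tail %u TDH %u TDT %u",
	    &s->head, &s->tail, &s->tdh, &s->tdt) == 4;
}

/* Number of slots from head forward to tail, wrapping at the ring size. */
static unsigned int
wd_gap(const struct wd_sample *s, unsigned int ring)
{
	assert(ring > 0 && s->head < ring && s->tail < ring);
	return (s->tail + ring - s->head) % ring;
}
```

Fed the text from "watchdog:" onward, every line in this thread yields a gap of 64 or 65 under the assumed ring size, and in every line head equals TDT and tail equals TDH. The sketch only makes that pattern easy to check; it does not explain the cause.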
Re: TSO em(4) problem
On 26.1.2024. 21:56, Marcus Glocker wrote:
> On Fri, Jan 26, 2024 at 11:41:49AM +0100, Hrvoje Popovski wrote:
>
>> I've managed to reproduce the TSO em problem on another setup,
>> unfortunately production.
>>
>> Setup is very simple
>>
>> em0 - carp <- uplink
>> em1 - pfsync
>> ix1 - vlans - carp
>
> Would it be possible that you also share an "ifconfig -a hwfeatures" of
> that box?  You can mask the IPs if it's too sensitive.
>
> I'm still trying to reproduce the issue here, and for now I can't.
> Maybe in your full ifconfig output I can see some specifics about your
> configuration, which makes it more likely to reproduce the issue here.
>

Hi,

here's the ifconfig from the second setup, where the watchdog is triggered
much faster.  Originally the uplink in this setup was ix0; I changed it to
em0 to see whether the problem would be the same as in the other setup,
and it is.  That's good, because this is a pfsync setup for students and
I can do whatever I want with it :)

bcbnfw1# ifconfig -a hwfeatures
lo0: flags=2008049 mtu 32768
	hwfeatures=7187
	index 6 priority 0 llprio 3
	groups: lo
	inet 127.0.0.1 netmask 0xff00
ix0: flags=2008802 mtu 1500
	hwfeatures=71b7 hardmtu 9198
	lladdr 90:e2:ba:d7:1b:f4
	index 1 priority 0 llprio 3
	media: Ethernet autoselect (10GbaseSR full-duplex)
	status: active
ix1: flags=2008b43 mtu 1500
	hwfeatures=71b7 hardmtu 9198
	lladdr 90:e2:ba:d7:1b:f5
	index 2 priority 0 llprio 3
	media: Ethernet autoselect (10GbaseSR full-duplex,rxpause,txpause)
	status: active
em0: flags=8b43 mtu 1500
	hwfeatures=31b7 hardmtu 9216
	lladdr 0c:c4:7a:da:cd:5a
	index 3 priority 0 llprio 3
	groups: egress
	media: Ethernet autoselect (1000baseT full-duplex,rxpause)
	status: active
	inet 10.10.155.234 netmask 0xfff8 broadcast 10.10.155.239
em1: flags=8843 mtu 1500
	hwfeatures=31b7 hardmtu 9216
	lladdr 0c:c4:7a:da:cd:5b
	index 4 priority 0 llprio 3
	media: Ethernet autoselect (1000baseT full-duplex,rxpause,txpause)
	status: active
	inet 192.168.0.77 netmask 0xfffc broadcast 192.168.0.79
enc0: flags=0<>
	hwfeatures=0<>
	index 5 priority 0 llprio 3
	groups: enc
	status: active
carp0: flags=8843 mtu 1500
	hwfeatures=3187 hardmtu 1500
	lladdr 00:00:5e:00:01:01
	index 7 priority 15 llprio 3
	carp: MASTER carpdev em0 vhid 1 advbase 1 advskew 10
	groups: carp
	status: master
	inet 10.10.155.236 netmask 0x
carp1100: flags=8843 mtu 1500
	hwfeatures=3187 hardmtu 1500
	lladdr 00:00:5e:00:01:12
	index 8 priority 15 llprio 3
	carp: MASTER carpdev vlan1100 vhid 18 advbase 1 advskew 10
	groups: carp
	status: master
	inet 10.30.16.1 netmask 0x
carp1101: flags=8843 mtu 1500
	hwfeatures=3187 hardmtu 1500
	lladdr 00:00:5e:00:01:16
	index 9 priority 15 llprio 3
	carp: MASTER carpdev vlan1101 vhid 22 advbase 1 advskew 10
	groups: carp
	status: master
	inet 10.31.16.1 netmask 0x
carp1102: flags=8843 mtu 1500
	hwfeatures=3187 hardmtu 1500
	lladdr 00:00:5e:00:01:19
	index 10 priority 15 llprio 3
	carp: MASTER carpdev vlan1102 vhid 25 advbase 1 advskew 10
	groups: carp
	status: master
	inet 10.32.16.1 netmask 0x
carp1103: flags=8843 mtu 1500
	hwfeatures=3187 hardmtu 1500
	lladdr 00:00:5e:00:01:1c
	index 11 priority 15 llprio 3
	carp: MASTER carpdev vlan1103 vhid 28 advbase 1 advskew 10
	groups: carp
	status: master
	inet 10.33.16.1 netmask 0x
carp1130: flags=8843 mtu 1500
	hwfeatures=3187 hardmtu 1500
	lladdr 00:00:5e:00:01:13
	index 12 priority 15 llprio 3
	carp: MASTER carpdev vlan1130 vhid 19 advbase 1 advskew 10
	groups: carp
	status: master
	inet 10.30.0.1 netmask 0x
carp1131: flags=8843 mtu 1500
	hwfeatures=3187 hardmtu 1500
	lladdr 00:00:5e:00:01:17
	index 13 priority 15 llprio 3
	carp: MASTER carpdev vlan1131 vhid 23 advbase 1 advskew 10
	groups: carp
	status: master
	inet 10.31.0.1 netmask 0x
carp1132: flags=8843 mtu 1500
	hwfeatures=3187 hardmtu 1500
	lladdr 00:00:5e:00:01:1a
	index 14 priority 15 llprio 3
	carp: MASTER carpdev vlan1132 vhid 26 advbase 1 advskew 10
	groups: carp
	status: master
	inet 10.32.0.1 netmask 0x
carp1133: flags=8843 mtu 1500
	hwfeatures=3187 hardmtu 1500
	lladdr 00:00:5e:00:01:1d
	index 15 priority 15 llprio 3
	carp: MASTER carpdev vlan1133 vhid 29 advbase 1 advskew 10
	groups: carp
	status: master
	inet 10.33.0.1 netmask 0x
carp1150: flags=8843 mtu 1500
	hwfeatures=3187 hardmtu 1500
	lladdr 00:00:5e:00:01:14
	index 16 priority 15 llprio 3
	carp: MASTER carpdev vlan1150
Re: TSO em(4) problem
On Fri, Jan 26, 2024 at 11:41:49AM +0100, Hrvoje Popovski wrote:
> I've managed to reproduce the TSO em problem on another setup,
> unfortunately production.

What helped debugging a similar issue with ixl(4) and TSO was to
remove all TSO specific code from the driver.  Then only this part
remains from the original em(4) TSO diff.

	error = bus_dmamap_create(sc->sc_dmat, EM_TSO_SIZE,
	    EM_MAX_SCATTER / (sc->pcix_82544 ? 2 : 1),
	    EM_TSO_SEG_SIZE, 0, BUS_DMA_NOWAIT, &pkt->pkt_map);

The parameters that changed when adding TSO are:

bus_size_t size:     MAX_JUMBO_FRAME_SIZE 16128 -> EM_TSO_SIZE 65535
bus_size_t maxsegsz: MAX_JUMBO_FRAME_SIZE 16128 -> EM_TSO_SEG_SIZE 4096

I suspect that this is the cause for the regression as disabling
TSO did not help.  Would it be possible to run the diff below?  I
expect that the problem will still be there.  But then we know it
must be the change of one of the bus_dmamap_create() arguments.

bluhm

Index: dev/pci/if_em.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_em.c,v
diff -u -p -r1.370 if_em.c
--- dev/pci/if_em.c	31 Dec 2023 08:42:33 -	1.370
+++ dev/pci/if_em.c	26 Jan 2024 21:32:08 -
@@ -291,8 +291,6 @@ void em_receive_checksum(struct em_softc
 		    struct mbuf *);
 u_int	em_transmit_checksum_setup(struct em_queue *, struct mbuf *, u_int,
 	    u_int32_t *, u_int32_t *);
-u_int	em_tso_setup(struct em_queue *, struct mbuf *, u_int, u_int32_t *,
-	    u_int32_t *);
 u_int	em_tx_ctx_setup(struct em_queue *, struct mbuf *, u_int,
 	    u_int32_t *, u_int32_t *);
 void	em_iff(struct em_softc *);
@@ -1238,15 +1236,7 @@ em_encap(struct em_queue *que, struct mb
 	}
 
 	if (sc->hw.mac_type >= em_82575 && sc->hw.mac_type <= em_i210) {
-		if (ISSET(m->m_pkthdr.csum_flags, M_TCP_TSO)) {
-			used += em_tso_setup(que, m, head, &txd_upper,
-			    &txd_lower);
-			if (!used)
-				return (used);
-		} else {
-			used += em_tx_ctx_setup(que, m, head, &txd_upper,
-			    &txd_lower);
-		}
+		used += em_tx_ctx_setup(que, m, head, &txd_upper, &txd_lower);
 	} else if (sc->hw.mac_type >= em_82543) {
 		used += em_transmit_checksum_setup(que, m, head,
 		    &txd_upper, &txd_lower);
@@ -1579,21 +1569,6 @@ em_update_link_status(struct em_softc *s
 		ifp->if_link_state = link_state;
 		if_link_state_change(ifp);
 	}
-
-	/* Disable TSO for 10/100 speeds to avoid some hardware issues */
-	switch (sc->link_speed) {
-	case SPEED_10:
-	case SPEED_100:
-		if (sc->hw.mac_type >= em_82575 && sc->hw.mac_type <= em_i210) {
-			ifp->if_capabilities &= ~IFCAP_TSOv4;
-			ifp->if_capabilities &= ~IFCAP_TSOv6;
-		}
-		break;
-	case SPEED_1000:
-		if (sc->hw.mac_type >= em_82575 && sc->hw.mac_type <= em_i210)
-			ifp->if_capabilities |= IFCAP_TSOv4 | IFCAP_TSOv6;
-		break;
-	}
 }
 
 /*
@@ -2013,7 +1988,6 @@ em_setup_interface(struct em_softc *sc)
 	if (sc->hw.mac_type >= em_82575 && sc->hw.mac_type <= em_i210) {
 		ifp->if_capabilities |= IFCAP_CSUM_IPv4;
 		ifp->if_capabilities |= IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
-		ifp->if_capabilities |= IFCAP_TSOv4 | IFCAP_TSOv6;
 	}
 
 	/*
@@ -2429,81 +2403,6 @@ em_free_transmit_structures(struct em_so
 		    0, que->tx.sc_tx_dma.dma_map->dm_mapsize,
 		    BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
 	}
-}
-
-u_int
-em_tso_setup(struct em_queue *que, struct mbuf *mp, u_int head,
-    u_int32_t *olinfo_status, u_int32_t *cmd_type_len)
-{
-	struct ether_extracted ext;
-	struct e1000_adv_tx_context_desc *TD;
-	uint32_t vlan_macip_lens = 0, type_tucmd_mlhl = 0, mss_l4len_idx = 0;
-	uint32_t paylen = 0;
-	uint8_t iphlen = 0;
-
-	*olinfo_status = 0;
-	*cmd_type_len = 0;
-	TD = (struct e1000_adv_tx_context_desc *)&que->tx.sc_tx_desc_ring[head];
-
-#if NVLAN > 0
-	if (ISSET(mp->m_flags, M_VLANTAG)) {
-		uint32_t vtag = mp->m_pkthdr.ether_vtag;
-		vlan_macip_lens |= vtag << E1000_ADVTXD_VLAN_SHIFT;
-		*cmd_type_len |= E1000_ADVTXD_DCMD_VLE;
-	}
-#endif
-
-	ether_extract_headers(mp, &ext);
-	if (ext.tcp == NULL)
-		goto out;
-
-	vlan_macip_lens |= (sizeof(*ext.eh) << E1000_ADVTXD_MACLEN_SHIFT);
-
-	if (ext.ip4) {
-		iphlen = ext.ip4->ip_hl << 2;
-
-		type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_IPV4;
-		*olinfo_status |=
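
Editorial sketch: the message above names two bus_dmamap_create() arguments that changed, size and maxsegsz. The C fragment below only works through the arithmetic of those quoted values: how many DMA segments are needed at minimum to cover size bytes when no segment may exceed maxsegsz bytes. The three constants are the numbers quoted in the mail; min_segments is a hypothetical helper and is not part of if_em.c or bus_dma(9).

```c
#include <assert.h>
#include <stddef.h>

/* Values as quoted in the mail above. */
#define MAX_JUMBO_FRAME_SIZE	16128
#define EM_TSO_SIZE		65535
#define EM_TSO_SEG_SIZE		4096

/*
 * Hypothetical helper, not driver code: smallest number of segments
 * that can cover `size` bytes if each segment holds at most
 * `maxsegsz` bytes (ceiling division).
 */
static size_t
min_segments(size_t size, size_t maxsegsz)
{
	assert(maxsegsz > 0);
	return (size + maxsegsz - 1) / maxsegsz;
}
```

With the pre-TSO values, min_segments(MAX_JUMBO_FRAME_SIZE, MAX_JUMBO_FRAME_SIZE) is 1; with the TSO values, min_segments(EM_TSO_SIZE, EM_TSO_SEG_SIZE) is 16. This is a lower bound only, since a real mbuf chain may map to more segments, and it does not say which of the two arguments is responsible for the watchdog.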
Re: TSO em(4) problem
On Fri, Jan 26, 2024 at 11:41:49AM +0100, Hrvoje Popovski wrote:

> I've managed to reproduce the TSO em problem on another setup,
> unfortunately production.
>
> Setup is very simple
>
> em0 - carp <- uplink
> em1 - pfsync
> ix1 - vlans - carp

Would it be possible that you also share an "ifconfig -a hwfeatures" of
that box?  You can mask the IPs if it's too sensitive.

I'm still trying to reproduce the issue here, and for now I can't.
Maybe in your full ifconfig output I can see some specifics about your
configuration, which makes it more likely to reproduce the issue here.
Re: TSO em(4) problem
I've managed to reproduce the TSO em problem on another setup,
unfortunately production.

Setup is very simple

em0 - carp <- uplink
em1 - pfsync
ix1 - vlans - carp

Jan 26 11:19:23 bcbnfw1 /bsd: em0: watchdog: head 34 tail 98 TDH 98 TDT 34
Jan 26 11:19:33 bcbnfw1 /bsd: em0: watchdog: head 345 tail 409 TDH 409 TDT 345
Jan 26 11:19:54 bcbnfw1 /bsd: em0: watchdog: head 259 tail 323 TDH 323 TDT 259
Jan 26 11:20:08 bcbnfw1 /bsd: em0: watchdog: head 343 tail 407 TDH 407 TDT 343
Jan 26 11:20:24 bcbnfw1 /bsd: em0: watchdog: head 20 tail 85 TDH 85 TDT 20
Jan 26 11:20:47 bcbnfw1 /bsd: em0: watchdog: head 388 tail 452 TDH 452 TDT 388
Jan 26 11:21:09 bcbnfw1 /bsd: em0: watchdog: head 25 tail 89 TDH 89 TDT 25
Jan 26 11:21:32 bcbnfw1 /bsd: em0: watchdog: head 105 tail 169 TDH 169 TDT 105
Jan 26 11:21:52 bcbnfw1 /bsd: em0: watchdog: head 23 tail 88 TDH 88 TDT 23

647470: Jan 26 11:19:25: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to down
647474: Jan 26 11:19:29: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to up
647478: Jan 26 11:19:35: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to down
647483: Jan 26 11:19:39: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to up
647487: Jan 26 11:19:56: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to down
647491: Jan 26 11:19:59: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to up
647495: Jan 26 11:20:10: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to down
647499: Jan 26 11:20:13: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to up
647504: Jan 26 11:20:26: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to down
647508: Jan 26 11:20:29: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to up
647512: Jan 26 11:20:49: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to down
647516: Jan 26 11:20:52: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to up
647520: Jan 26 11:21:11: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to down
647524: Jan 26 11:21:14: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to up
647528: Jan 26 11:21:34: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to down
647532: Jan 26 11:21:36: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to up
647536: Jan 26 11:21:54: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to down
647540: Jan 26 11:21:56: %LINK-SP-3-UPDOWN: Interface GigabitEthernet4/48, changed state to up

bcbnfw1# kstat em0:::
em0:0:em-stats:0
    rx crc errs: 0 packets
    rx align errs: 0 packets
    rx align errs: 0 packets
    rx errs: 0 packets
    rx missed: 0 packets
    tx single coll: 0 packets
    tx excess coll: 0 packets
    tx multi coll: 0 packets
    tx late coll: 0 packets
    tx coll: 0
    tx defers: 0
    tx no CRS: 0 packets
    seq errs: 0
    carr ext errs: 0 packets
    rx len errs: 0 packets
    rx xon: 0 packets
    tx xon: 0 packets
    rx xoff: 0 packets
    tx xoff: 0 packets
    FC unsupported: 0 packets
    rx 64B: 6555 packets
    rx 65-127B: 11144 packets
    rx 128-255B: 6264 packets
    rx 256-511B: 2390 packets
    rx 512-1023B: 3706 packets
    rx 1024-maxB: 87987 packets
    rx good: 118046 packets
    rx bcast: 3 packets
    rx mcast: 82 packets
    tx good: 56796 packets
    rx good: 132532686 bytes
    tx good: 13691390 bytes
    rx no buffers: 0 packets
    rx undersize: 0 packets
    rx fragments: 0 packets
    rx oversize: 0 packets
    rx jabbers: 0 packets
    rx mgmt: 0 packets
    rx mgmt drops: 0 packets
    tx mgmt: 0 packets
    rx total: 132532686 bytes
    tx total: 13691390 bytes
    rx total: 118046 packets
    tx total: 56796 packets
    tx 64B: 11861 packets
    tx 65-127B: 28718 packets
    tx 128-255B: 7202 packets
    tx 256-511B: 1834 packets
    tx 512-1023B: 2059 packets
    tx 1024-maxB: 5122 packets
    tx mcast: 18 packets
    tx bcast: 2 packets
em0:0:rxq:0
    packets: 1009629 packets
    bytes: 1172569417 bytes
    fdrops: 0 packets
    qdrops: 0 packets
    errors: 0 packets
    qlen: 0 packets
    enqueues: 348100
    dequeues: 348031
em0:0:txq:0
    packets: 465709 packets
    bytes: 103430590 bytes
    qdrops: 53674 packets
    errors: 0 packets
    qlen: 0 packets
    maxqlen: 511 packets
    oactive: false
    oactives: 9

OpenBSD 7.4-current (GENERIC.MP) #1626: Thu Jan 25 20:05:01 MST 2024
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 34224844800 (32639MB)
avail mem = 33166430208 (31629MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0xec9b0 (62