Re: ugen(4) communication issues with UPS (nut) blazer_usb and nutdrv_qx
Errata: A few minutes later the problem occured on 6.8 It was looking stable but: Apr 23 00:04:49 libre /bsd: ugen0 detached Apr 23 00:04:56 libre /bsd: ugen0 at uhub0 port 2 "INNO TECH USB to Serial" rev 1.10/0.02 addr 2 Apr 23 00:06:08 libre upsmon[1619]: UPS ups0@localhost on battery Apr 23 00:06:18 libre upsmon[1619]: UPS ups0@localhost on line power Apr 23 00:07:50 libre pkg_add: Added gnuwatch-3.2.8p0 Apr 23 00:21:24 libre upsd[80608]: Data for UPS [ups0] is stale - check driver Apr 23 00:21:26 libre upsmon[1619]: Poll UPS [ups0@localhost] failed - Data stale Apr 23 00:21:26 libre upsmon[1619]: Communications with UPS ups0@localhost lost Apr 23 00:21:31 libre upsmon[1619]: Poll UPS [ups0@localhost] failed - Data stale Apr 23 00:22:06 libre last message repeated 7 times restarting upsd made it re-connect, however still no `clear endpoints failed!` in dmesg. On Fri, 2021-04-23 at 00:17 +0200, xs wrote: > My x230 laptop was setup to do the testing with 6.8. > > First observations: > > - There's no `ugen_clear_iface_eps: clear endpoints failed!` messages > - I haven't managed to trigger the bug on 6.8 > - I'm still getting trouble to make upsd able to detect the ups at > boot > > I have to replug the USB then it works. > > x230 laptop: > OpenBSD 6.8 (GENERIC.MP) #98: Sun Oct 4 18:13:26 MDT 2020 > > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP > real mem = 8360439808 (7973MB) > avail mem = 8092012544 (7717MB) > random: good seed from bootblocks > mpath0 at root > scsibus0 at mpath0: 256 targets > mainbus0 at root > bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xbff35020 (18 entries) > bios0: vendor coreboot version "CBET4000 4.11-1765-g4bd6927388-dirty" > date 03/24/2020 > bios0: LENOVO 23256CG > acpi0 at bios0: ACPI 4.0 > acpi0: sleep states S0 S3 S4 S5 > acpi0: tables DSDT FACP SSDT MCFG TCPA APIC DMAR HPET > acpi0: wakeup devices HDEF(S4) EHC1(S4) EHC2(S4) XHC_(S4) SLPB(S3) > LID_(S3) > acpitimer0 at acpi0: 3579545 Hz, 24 bits > acpimcfg0 at acpi0 > acpimcfg0: addr 0xf000, bus 0-63 > acpimadt0 at acpi0 addr 0xfee0: PC-AT compat > cpu0 at mainbus0: apid 0 (boot processor) > cpu0: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz, 2893.87 MHz, 06-3a-09 > cpu0: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE > 36 > ,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MW > AI > T,DS- > CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,PO > PC > NT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC, > FS > GSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVE > OP > T,MELTDOWN > cpu0: 256KB 64b/line 8-way L2 cache > cpu0: smt 0, core 0, package 0 > mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges > cpu0: apic clock running at 99MHz > cpu0: mwait min=64, max=64, C-substates=0.2.1.1.2, IBE > cpu1 at mainbus0: apid 1 (application processor) > cpu1: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz, 2893.44 MHz, 06-3a-09 > cpu1: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE > 36 > ,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MW > AI > T,DS- > CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,PO > PC > NT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC, > FS > GSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVE > OP > T,MELTDOWN > cpu1: 256KB 64b/line 8-way L2 cache > cpu1: smt 1, core 0, package 0 > cpu2 at mainbus0: apid 2 (application processor) > cpu2: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz, 2893.44 MHz, 06-3a-09 > cpu2: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE > 36 > ,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MW > AI > T,DS- > CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,PO > PC > NT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC, > FS > GSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVE > OP > T,MELTDOWN > cpu2: 256KB 64b/line 8-way L2 cache > cpu2: smt 0, core 1, package 0 > cpu3 at mainbus0: apid 3 (application processor) > cpu3: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz, 2893.44 MHz, 06-3a-09 > cpu3: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE > 36 > ,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MW > AI > T,DS- > CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,PO > PC > NT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC, > FS > GSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVE > OP > T,MELTDOWN > cpu3: 256KB 64b/line 8-way L2 cache > cpu3: smt 1, core 1, package 0 > ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins > acpihpet0 at acpi0: 14318179 Hz > acpiprt0 at acpi0: bus 0 (PCI0) > acpiprt1 at acpi0: bus 1 (RP01) > acpiprt2 at acpi0: bus 2 (RP02) > acpiprt3 at acpi0: bus 3 (RP03) > acpiprt4 at acpi0: bus -1 (RP04) > acpiprt5 at acpi0: bus -1 (RP05) >
Re: [External] : Re: pf_state_key_link_reverse() is prone to race on parallel forwarding
Hello, > > + > Can you remove one of the double empty lines? sure, updated diff is below. > > + > > + old_reverse = atomic_cas_ptr(>reverse, NULL, sk); > > + if (old_reverse != NULL) > > + KASSERT(old_reverse == sk); > > + else > > + pf_state_key_ref(sk); > > } > > > > #if NPFLOG > 0 > > In genua code Markus and Haebaert have a more complex solution. > I don't know if we need this and I did not test it. Just showing > so you can consider it. A bunch of other functions are also > affected. If we want it, I can convert it into a working diff. > > If we think sashan@'s diff is sufficient, I am fine with that. > Otherwise I would recommend to port the working solution from genua. > > void > pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key *skrev) > { > struct pf_state_key *rev; > > /* > * 'skrev' points to the statekey from the packet. > * > * We try to set 'skrev->reverse' to 'sk', so we can find 'sk' w/o > * a lookup in the tree in the future. > * However, both 'skrev->reverse' and 'sk->reverse' might still point > * to state keys that are about to be removed, so we need to do these > * steps: > * > * 1. If 'skrev->reverse' exists, we do nothing because the linked > *state key might be removed concurrently, i.e. the 'removed' > *attribute is set and pf_state_key_unlink_reverse() is running. > *When the remove is complete skrev->reverse will be set to NULL. > * 2. Otherwise ('skrev->reverse' is NULL) we try to set > *sk->reverse=skrev'. This will fail if sk->reverse also points > *to a state key with 'removed' set. > * 3. Only if step (2) succeeds we can update 'skrev->reverse=sk'. > * > * This way both skrev and sk point to each other through the > * 'reverse' attribute. > */ looks like there must be one more difference between OpenBSD and genua. It looks like genua-pf is able to run state expiration concurrently with pf_test() function somehow. In OpenBSD pf_purge_expired_states() removes states under write lock from look up table. pf_state_key_link_reverse() on OpenBSD always runs as reader on state lock. the pf_purge_expired_states() on OpenBSD has two parts. The first part runs as a reader on state lock. It walks through entire state table finding expired states. It puts those states to private garbage collector list. Once done the state lock is dropped and re-acquired as a writer to enter a second part of pf_purge_expired_states(). At this point no packet is able to do a state look up, create state. Also no pf_state_key_link_reverse() functions are running at this point (because they acquire state lock as reader). having exclusive access to state table pf_purge_expired_states() walks its private list of garbage collected states removing them from look up table, removing state keys as well. so it seems to me the reverse key in OpenBSD is either set to correct peer or NULL. thanks and regards sashan 8<---8<---8<--8< diff --git a/sys/net/pf.c b/sys/net/pf.c index 28aa6157552..74293d6b70b 100644 --- a/sys/net/pf.c +++ b/sys/net/pf.c @@ -7366,11 +7366,19 @@ pf_inp_unlink(struct inpcb *inp) void pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key *skrev) { - /* Note that sk and skrev may be equal, then we refcount twice. */ - KASSERT(sk->reverse == NULL); - KASSERT(skrev->reverse == NULL); - sk->reverse = pf_state_key_ref(skrev); - skrev->reverse = pf_state_key_ref(sk); + struct pf_state_key *old_reverse; + + old_reverse = atomic_cas_ptr(>reverse, NULL, skrev); + if (old_reverse != NULL) + KASSERT(old_reverse == skrev); + else + pf_state_key_ref(skrev); + + old_reverse = atomic_cas_ptr(>reverse, NULL, sk); + if (old_reverse != NULL) + KASSERT(old_reverse == sk); + else + pf_state_key_ref(sk); } #if NPFLOG > 0
Re: ugen(4) communication issues with UPS (nut) blazer_usb and nutdrv_qx
My x230 laptop was setup to do the testing with 6.8. First observations: - There's no `ugen_clear_iface_eps: clear endpoints failed!` messages - I haven't managed to trigger the bug on 6.8 - I'm still getting trouble to make upsd able to detect the ups at boot I have to replug the USB then it works. x230 laptop: OpenBSD 6.8 (GENERIC.MP) #98: Sun Oct 4 18:13:26 MDT 2020 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 8360439808 (7973MB) avail mem = 8092012544 (7717MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xbff35020 (18 entries) bios0: vendor coreboot version "CBET4000 4.11-1765-g4bd6927388-dirty" date 03/24/2020 bios0: LENOVO 23256CG acpi0 at bios0: ACPI 4.0 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP SSDT MCFG TCPA APIC DMAR HPET acpi0: wakeup devices HDEF(S4) EHC1(S4) EHC2(S4) XHC_(S4) SLPB(S3) LID_(S3) acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimcfg0 at acpi0 acpimcfg0: addr 0xf000, bus 0-63 acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz, 2893.87 MHz, 06-3a-09 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36 ,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAI T,DS- CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPC NT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FS GSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOP T,MELTDOWN cpu0: 256KB 64b/line 8-way L2 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 99MHz cpu0: mwait min=64, max=64, C-substates=0.2.1.1.2, IBE cpu1 at mainbus0: apid 1 (application processor) cpu1: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz, 2893.44 MHz, 06-3a-09 cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36 ,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAI T,DS- CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPC NT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FS GSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOP T,MELTDOWN cpu1: 256KB 64b/line 8-way L2 cache cpu1: smt 1, core 0, package 0 cpu2 at mainbus0: apid 2 (application processor) cpu2: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz, 2893.44 MHz, 06-3a-09 cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36 ,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAI T,DS- CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPC NT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FS GSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOP T,MELTDOWN cpu2: 256KB 64b/line 8-way L2 cache cpu2: smt 0, core 1, package 0 cpu3 at mainbus0: apid 3 (application processor) cpu3: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz, 2893.44 MHz, 06-3a-09 cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36 ,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAI T,DS- CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPC NT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FS GSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOP T,MELTDOWN cpu3: 256KB 64b/line 8-way L2 cache cpu3: smt 1, core 1, package 0 ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins acpihpet0 at acpi0: 14318179 Hz acpiprt0 at acpi0: bus 0 (PCI0) acpiprt1 at acpi0: bus 1 (RP01) acpiprt2 at acpi0: bus 2 (RP02) acpiprt3 at acpi0: bus 3 (RP03) acpiprt4 at acpi0: bus -1 (RP04) acpiprt5 at acpi0: bus -1 (RP05) acpiprt6 at acpi0: bus -1 (RP06) acpiprt7 at acpi0: bus -1 (RP07) acpiprt8 at acpi0: bus -1 (RP08) acpiec0 at acpi0 acpipci0 at acpi0 PCI0: 0x 0x0011 0x0001 acpiac0 at acpi0: AC unit online acpibat0 at acpi0: BAT0 model "45N1029" serial 338 type LION oem "LGC" acpibat1 at acpi0: BAT1 not present acpibtn0 at acpi0: SLPB acpibtn1 at acpi0: LID_ acpithinkpad0 at acpi0: version 1.0 acpicmos0 at acpi0 tpm0 at acpi0 TPM_ addr 0xfed4/0x5000, device 0x104a rev 0x4e "PNP0C0B" at acpi0 not configured "BOOT" at acpi0 not configured acpicpu0 at acpi0: C3(200@90 mwait.1@0x30), C2(500@63 mwait.1@0x10), C1(1000@1 mwait.1), PSS acpicpu1 at acpi0: C3(200@90 mwait.1@0x30), C2(500@63 mwait.1@0x10), C1(1000@1 mwait.1), PSS acpicpu2 at acpi0: C3(200@90 mwait.1@0x30), C2(500@63 mwait.1@0x10), C1(1000@1 mwait.1), PSS acpicpu3 at acpi0: C3(200@90 mwait.1@0x30), C2(500@63 mwait.1@0x10), C1(1000@1 mwait.1), PSS acpitz0 at acpi0: critical temperature is 100 degC acpipwrres0 at acpi0: FPWR, resource for FAN_ acpitz1 at acpi0: critical temperature is 99 degC acpivideo0 at acpi0: GFX0 acpivout0 at acpivideo0: LCD0 cpu0: using VERW MDS workaround (except on
69.html: rsync(1) -> openrsync(1)
Found a small mistake in the 6.9 changelog. Index: 69.html === RCS file: /cvs/www/69.html,v retrieving revision 1.70 diff -u -p -r1.70 69.html --- 69.html 22 Apr 2021 17:59:45 - 1.70 +++ 69.html 22 Apr 2021 20:34:16 - @@ -1048,7 +1048,7 @@ to 6.9. Added a simple --timeout implementation to https://man.openbsd.org/openrsync.1;>openrsync(1). - Added the https://man.openbsd.org/rsync.1;>rsync(1) + Added the https://man.openbsd.org/openrsync.1;>openrsync(1) option --no-motd to suppress the information output by the client at the start of a daemon transfer. Added support for the use of !command to
Re: Unlock top part of uvm_fault()
On Thu, 22 Apr 2021 15:38:53 +0200, Martin Pieuchot wrote: > Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for > both amd64 and sparc64. That means the kernel lock will only be taken > for lower faults and some amap/anon code will now run without it. > > I'd be interested to have this tested and see how much does that impact > the build time of packages. > > We should be able to do the switch on an arch-by-arch basis. It's > easier for me to develop & debug on these two architectures so I started > with them. If you want to unlock another architecture and report back, > I'd be glad. > > Thanks, > Martin Hi Martin, I'd be willing to test this on octeon. However, with my poor little EdgeRouter Lite, I would only be able to test building a few packages. Would this be the correct diff for octeon? diff 5fe2fdbf987b2faadb61c6a00cf1dd30ab3e7fa6 /usr/src blob - 6530ef75203fbea02a4f1aaca62e4e78a81f4cdf file + sys/arch/mips64/mips64/trap.c --- sys/arch/mips64/mips64/trap.c +++ sys/arch/mips64/mips64/trap.c @@ -340,9 +340,7 @@ itsa(struct trapframe *trapframe, struct cpu_info *ci, va = trunc_page((vaddr_t)trapframe->badvaddr); onfault = pcb->pcb_onfault; pcb->pcb_onfault = 0; - KERNEL_LOCK(); rv = uvm_fault(kernel_map, va, 0, access_type); - KERNEL_UNLOCK(); pcb->pcb_onfault = onfault; if (rv == 0) return; @@ -421,9 +419,7 @@ fault_common_no_miss: onfault = pcb->pcb_onfault; pcb->pcb_onfault = 0; - KERNEL_LOCK(); rv = uvm_fault(map, va, 0, access_type); - KERNEL_UNLOCK(); pcb->pcb_onfault = onfault; /*
Re: ugen(4) communication issues with UPS (nut) blazer_usb and nutdrv_qx
On 2021/04/22 22:52, xs wrote: > - I've seen mentions of usb_quirks.c for usbhid driver in > /usr/local/share/doc/pkg-readmes/nut > ``` > The option with fewest side-effects is to add the following entries to > the table in /sys/dev/usb/usb_quirks.c and build a new kernel: > > { USB_VENDOR_APC, USB_PRODUCT_APC_UPS, ANY, { UQ_BAD_HID }}, > { USB_VENDOR_APC, USB_PRODUCT_APC_UPS5G, ANY, { UQ_BAD_HID }}, > ``` > > - Would it be useful ? No. That is for the case where a USB attaches to the uhid/upd drivers; sometimes a device can still work with NUT when that happens, but often not. > - Is this related: > https://www.mail-archive.com/tech@openbsd.org/msg62221.html ? Since it's attaching to ugen: no. > - What should I try to get this USB communication reliable ? Did it work any better with an older version? Userland access to USB devices is definitely not perfect in OpenBSD. btw, dmesg was missing from your mail.
ugen(4) communication issues with UPS (nut) blazer_usb and nutdrv_qx
I'm having an USB communication issue with ugen and the blazer_usb NUT driver on the last snapshot 6.9 GENERIC.MP#473 amd64. Notably the 'ugen_clear_iface_eps: clear endpoints failed!' message and communication loss with the UPS. Here is a link to my dmesg: https://clbin.com/pc1dk - NUT configuration is working normally until the following points - Sometimes upsd can't access the device at boot, I have to manually restart the service - "randomly" The communication fails and I have to restart the service too - The device can be totally unaccessible after some restart, needing to replug the USB Other points that could be useful: - This UPS USB interface was working without any issue on FreeBSD and ugen - The UPS is supported by nut with blazer_usb: https://networkupstools.org/ddl/PowerWalker/Line-Interactive_VI_1400.html It's mentioned it should work with the NUT nutdrv_qx driver too so I'm trying with this one too, see below. Below are detailed information of the issue: The tail of /var/log/messages with comments: Apr 22 19:30:25 se-h1 /bsd: sd5: 457860MB, 512 bytes/sector, 937697393 sectors Apr 22 19:30:25 se-h1 /bsd: root on sd5a (468563ff41793b32.a) swap on sd5b dump on sd5b Apr 22 19:30:25 se-h1 /bsd: inteldrm0: 1024x768, 32bpp Apr 22 19:30:25 se-h1 /bsd: wsdisplay0 at inteldrm0 mux 1: console (std, vt100 emulation), using wskbd0 Apr 22 19:30:25 se-h1 /bsd: wsdisplay0: screen 1-5 added (std, vt100 emulation) Apr 22 19:30:25 se-h1 ntpd[96719]: constraints configured but none available Apr 22 19:30:41 se-h1 ntpd[43289]: no reply received in time, skipping initial time setting Apr 22 19:30:41 se-h1 savecore: no core dump ## The following message happens when upsd tries to access the device ## Apr 22 19:30:47 se-h1 /bsd: ugen_clear_iface_eps: clear endpoints failed! Apr 22 19:30:56 se-h1 reorder_kernel: kernel relinking done ## At this point upsd can communicate to my device without any issue, everything works as expected ## ## But suddenly, ten minutes later: ## Apr 22 19:40:53 se-h1 blazer_usb[31504]: Communications with UPS lost: status read failed! Apr 22 19:40:55 se-h1 upsd[80896]: Data for UPS [ups0] is stale - check driver Apr 22 19:40:59 se-h1 upsmon[82703]: Poll UPS [ups0@localhost] failed - Data stale Apr 22 19:40:59 se-h1 upsmon[82703]: Communications with UPS ups0@localhost lost Apr 22 19:41:04 se-h1 upsmon[82703]: Poll UPS [ups0@localhost] failed - Data stale Apr 22 19:41:39 se-h1 last message repeated 7 times Apr 22 19:42:24 se-h1 last message repeated 9 times ## So what I do there is rcctl restart upsd (sometimes I need to do it several times) ## Apr 22 19:42:27 se-h1 upsd[80896]: mainloop: Interrupted system call Apr 22 19:42:29 se-h1 upsmon[82703]: Poll UPS [ups0@localhost] failed - Write error: Broken pipe ## Again: ## Apr 22 19:42:32 se-h1 /bsd: ugen_clear_iface_eps: clear endpoints failed! Apr 22 19:42:34 se-h1 upsmon[82703]: Communications with UPS ups0@localhost established ## Back to an operational state ## My device is 0665:5161 $ doas usbdevs -v Controller /dev/usb0: addr 01: 8086: Intel, xHCI root hub super speed, self powered, config 1, rev 1.00 driver: uhub0 addr 02: 0665:5161 INNO TECH, USB to Serial # --- this one --- # low speed, power 100 mA, config 1, rev 0.02, iSerial 20100826 driver: ugen0 addr 03: 1058:25a1 Western Digital, Elements 25A1 super speed, power 224 mA, config 1, rev 10.14, iSerial ommited driver: umass0 addr 04: 1058:1078 Western Digital, Elements 1078 super speed, power 224 mA, config 1, rev 10.65, iSerial ommited driver: umass1 addr 05: 1058:25a2 Western Digital, Elements 25A2 super speed, power 224 mA, config 1, rev 10.04, iSerial ommited driver: umass2 Controller /dev/usb1: addr 01: 8086: Intel, EHCI root hub high speed, self powered, config 1, rev 1.00 driver: uhub1 addr 02: 8087:8008 Intel, Rate Matching Hub high speed, self powered, config 1, rev 0.04 driver: uhub3 Controller /dev/usb2: addr 01: 8086: Intel, EHCI root hub high speed, self powered, config 1, rev 1.00 driver: uhub2 addr 02: 8087:8000 Intel, Rate Matching Hub high speed, self powered, config 1, rev 0.04 driver: uhub4 _ups permissions are set with /etc/rc.local $ cat /etc/rc.local # ups0 ugen0 permissions for USB access chgrp _ups /dev/usb0 /dev/ugen0* chmod g+rw /dev/usb0 /dev/ugen0* When access works: $ upsc ups0 battery.charge: 100 battery.voltage: 13.70 battery.voltage.high: 13.00 battery.voltage.low: 10.40 battery.voltage.nominal: 12.0 device.type: ups driver.flag.nolock: enabled driver.name: blazer_usb driver.parameter.pollinterval: 2 driver.parameter.port: /dev/ugen0 driver.parameter.productid: 5161 driver.parameter.protocol: megatec driver.parameter.synchronous: yes driver.parameter.vendorid: 0665 driver.version:
Re: Unlock top part of uvm_fault()
On Thu, Apr 22, 2021 at 03:38:53PM +0200, Martin Pieuchot wrote: > Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for > both amd64 and sparc64. That means the kernel lock will only be taken > for lower faults and some amap/anon code will now run without it. > > I'd be interested to have this tested and see how much does that impact > the build time of packages. Full regress test on amd64 successful. http://bluhm.genua.de/regress/results/regress-ot6.html When you click on custom, you see the diff that was tested. bluhm > We should be able to do the switch on an arch-by-arch basis. It's > easier for me to develop & debug on these two architectures so I started > with them. If you want to unlock another architecture and report back, > I'd be glad. > > Thanks, > Martin > > Index: arch/amd64/amd64/trap.c > === > RCS file: /cvs/src/sys/arch/amd64/amd64/trap.c,v > retrieving revision 1.87 > diff -u -p -r1.87 trap.c > --- arch/amd64/amd64/trap.c 22 Oct 2020 13:41:51 - 1.87 > +++ arch/amd64/amd64/trap.c 22 Apr 2021 13:06:54 - > @@ -176,10 +176,7 @@ upageflttrap(struct trapframe *frame, ui > union sigval sv; > int signal, sicode, error; > > - KERNEL_LOCK(); > error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type); > - KERNEL_UNLOCK(); > - > if (error == 0) { > uvm_grow(p, va); > return 1; > @@ -261,9 +258,7 @@ kpageflttrap(struct trapframe *frame, ui > if (curcpu()->ci_inatomic == 0 || map == kernel_map) { > onfault = pcb->pcb_onfault; > pcb->pcb_onfault = NULL; > - KERNEL_LOCK(); > error = uvm_fault(map, va, 0, access_type); > - KERNEL_UNLOCK(); > pcb->pcb_onfault = onfault; > > if (error == 0 && map != kernel_map) > Index: arch/sparc64/sparc64/trap.c > === > RCS file: /cvs/src/sys/arch/sparc64/sparc64/trap.c,v > retrieving revision 1.108 > diff -u -p -r1.108 trap.c > --- arch/sparc64/sparc64/trap.c 11 Mar 2021 11:17:00 - 1.108 > +++ arch/sparc64/sparc64/trap.c 22 Apr 2021 13:06:54 - > @@ -773,10 +773,7 @@ data_access_fault(struct trapframe64 *tf > if (!(addr & TLB_TAG_ACCESS_CTX)) { > /* CTXT == NUCLEUS */ > > - KERNEL_LOCK(); > error = uvm_fault(kernel_map, va, 0, access_type); > - KERNEL_UNLOCK(); > - > if (error == 0) > return; > goto kfault; > @@ -792,9 +789,7 @@ data_access_fault(struct trapframe64 *tf > > onfault = (vaddr_t)p->p_addr->u_pcb.pcb_onfault; > p->p_addr->u_pcb.pcb_onfault = NULL; > - KERNEL_LOCK(); > error = uvm_fault(>p_vmspace->vm_map, (vaddr_t)va, 0, access_type); > - KERNEL_UNLOCK(); > p->p_addr->u_pcb.pcb_onfault = (void *)onfault; > > /* > @@ -959,9 +954,7 @@ text_access_fault(struct trapframe64 *tf > uvm_map_inentry_sp, p->p_vmspace->vm_map.sserial)) > goto out; > > - KERNEL_LOCK(); > error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type); > - KERNEL_UNLOCK(); > > /* >* If this was a stack access we keep track of the maximum > @@ -1055,9 +1048,7 @@ text_access_error(struct trapframe64 *tf > uvm_map_inentry_sp, p->p_vmspace->vm_map.sserial)) > goto out; > > - KERNEL_LOCK(); > error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type); > - KERNEL_UNLOCK(); > > /* >* If this was a stack access we keep track of the maximum
Allow mounting small FAT32 partitions
If partition is smaller than 32MB, mount_msdos(8) will yield "not an MSDOS filesystem". pmp->pm_Sectors is set by newfs_msdos(8) when the number of sectors is less than MAXU16 even if asked to format in FAT32. I choosed to fix mounting instead of formatting (newfs_msdos.c frightens me...) FreeBSD fixed this in 2013. Index: msdosfs_vfsops.c === RCS file: /cvs/src/sys/msdosfs/msdosfs_vfsops.c,v retrieving revision 1.94 diff -u -p -r1.94 msdosfs_vfsops.c --- msdosfs_vfsops.c10 Aug 2020 05:18:46 - 1.94 +++ msdosfs_vfsops.c22 Apr 2021 19:35:40 - @@ -334,8 +334,7 @@ msdosfs_mountfs(struct vnode *devvp, str } if (pmp->pm_RootDirEnts == 0) { - if (pmp->pm_Sectors || pmp->pm_FATsecs || - getushort(b710->bpbFSVers)) { + if (pmp->pm_FATsecs || getushort(b710->bpbFSVers)) { error = EINVAL; goto error_exit; }
Re: re-enable A-MSDU support with iwm(4) and iwx(4) fixed
On Thu, Apr 22, 2021 at 07:47:29PM +0200, Matthias Schmidt wrote: > I have a kernel with your patch running since several hours and > noticed a regression. My usual "test case" is copying several large > files from my file server via NFSv3 to my laptop. In the beginning the > transfer rate was about 2-3M/s and after some time it dropped to around > 50-300K/s and never recovered (transfer is now running for 2.5h). > > I have the following device in a Thinkpad T450s connect to a Fritzbox > AP. > > iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless-AC 8265" rev 0x78, msi > iwm0: hw rev 0x230, fw ver 34.0.1, address The patch doesn't actually change anything in the driver's receive path for the 8265 chip. You might have seen a side-effect of something else that is unrelated to the patch, such as the wifi channel suddently becoming very busy with unrelated traffic. We should rule that out with a fair amount of certainty before trying to debug the issue. So my question would be if this problem is 100% repeatable with the patch and not at all repeatable without the patch? One possibility is that you've managed to trigger a problem in the receive path of net80211 with A-MSDUs enabled. Such a bug would already have existed but it would have been dormant since July 2019 when A-MSDUs were disabled. For 8265 devices the entire patch boils down to this change so you could simply revert all the other changes and try this much smaller patch instead to test for this regression: diff 804ff40445876fb6652323b00b9804f826133e70 /tmp/net80211 blob - a2de00f7bcdef99ced5d09da5e9b4bc8615156bd file + ieee80211_input.c --- ieee80211_input.c +++ ieee80211_input.c @@ -2758,7 +2758,7 @@ ieee80211_recv_addba_req(struct ieee80211com *ic, stru ba->ba_params = (params & IEEE80211_ADDBA_BA_POLICY); ba->ba_params |= ((ba->ba_winsize << IEEE80211_ADDBA_BUFSZ_SHIFT) | (tid << IEEE80211_ADDBA_TID_SHIFT)); -#if 0 +#if 1 /* iwm(4) 9k and iwx(4) need more work before AMSDU can be enabled. */ ba->ba_params |= IEEE80211_ADDBA_AMSDU; #endif blob - 5bd45d993b558bac50a513c1c4422508d96f44ba file + ieee80211_proto.c --- ieee80211_proto.c +++ ieee80211_proto.c @@ -695,7 +695,7 @@ ieee80211_addba_request(struct ieee80211com *ic, struc ba->ba_params = (ba->ba_winsize << IEEE80211_ADDBA_BUFSZ_SHIFT) | (tid << IEEE80211_ADDBA_TID_SHIFT); -#if 0 +#if 1 /* iwm(4) 9k and iwx(4) need more work before AMSDU can be enabled. */ ba->ba_params |= IEEE80211_ADDBA_AMSDU; #endif
Re: dt(4) ifdef sysctl
On 22/04/21(Thu) 20:19, Alexander Bluhm wrote: > Hi, > > sysctl witnesswatch gives an error message if the feature is not > compiled into the kernel. I think dt(4) allowdt should do the same. > > sysctl: kern.allowdt: value is not available > > This removes a bit of unused code from ramdisk kernel. > The variable allowdt should be in the device, not in sysctl source. > We don't need #ifdef for extern and prototypes, without it code is > more readable. > Put the unneeded sysctl code into an #if NDT > 0. > > ok? ok mpi@ > By the way, can we enable dt(4) in GENERIC? I use it quite often > and it is handy to have it is avaiable. Missuse is prevented by > securelevel sysctl. Any downside? I don't see any reason to not do it. > Index: dev/dt/dt_dev.c > === > RCS file: /data/mirror/openbsd/cvs/src/sys/dev/dt/dt_dev.c,v > retrieving revision 1.12 > diff -u -p -r1.12 dt_dev.c > --- dev/dt/dt_dev.c 26 Mar 2021 21:17:10 - 1.12 > +++ dev/dt/dt_dev.c 22 Apr 2021 17:38:22 - > @@ -109,6 +109,8 @@ SIMPLEQ_HEAD(, dt_probe) dt_probe_list; > struct rwlockdt_lock = RWLOCK_INITIALIZER("dtlk"); > volatile uint32_tdt_tracing = 0; /* [K] # of processes tracing */ > > +int allowdt; > + > void dtattach(struct device *, struct device *, void *); > int dtopen(dev_t, int, int, struct proc *); > int dtclose(dev_t, int, int, struct proc *); > @@ -145,7 +147,6 @@ dtopen(dev_t dev, int flags, int mode, s > { > struct dt_softc *sc; > int unit = minor(dev); > - extern int allowdt; > > if (!allowdt) > return EPERM; > Index: kern/kern_sysctl.c > === > RCS file: /data/mirror/openbsd/cvs/src/sys/kern/kern_sysctl.c,v > retrieving revision 1.389 > diff -u -p -r1.389 kern_sysctl.c > --- kern/kern_sysctl.c8 Feb 2021 10:51:02 - 1.389 > +++ kern/kern_sysctl.c22 Apr 2021 17:38:22 - > @@ -114,24 +114,21 @@ > #endif > > #include "audio.h" > -#include "video.h" > +#include "dt.h" > #include "pf.h" > +#include "video.h" > > extern struct forkstat forkstat; > extern struct nchstats nchstats; > extern int nselcoll, fscale; > extern struct disklist_head disklist; > extern fixpt_t ccpu; > -extern long numvnodes; > -#if NAUDIO > 0 > +extern long numvnodes; > +extern int allowdt; > extern int audio_record_enable; > -#endif > -#if NVIDEO > 0 > extern int video_record_enable; > -#endif > > int allowkmem; > -int allowdt; > > int sysctl_diskinit(int, struct proc *); > int sysctl_proc_args(int *, u_int, void *, size_t *, struct proc *); > @@ -142,12 +139,8 @@ int sysctl_proc_vmmap(int *, u_int, void > int sysctl_intrcnt(int *, u_int, void *, size_t *); > int sysctl_sensors(int *, u_int, void *, size_t *, void *, size_t); > int sysctl_cptime2(int *, u_int, void *, size_t *, void *, size_t); > -#if NAUDIO > 0 > int sysctl_audio(int *, u_int, void *, size_t *, void *, size_t); > -#endif > -#if NVIDEO > 0 > int sysctl_video(int *, u_int, void *, size_t *, void *, size_t); > -#endif > int sysctl_cpustats(int *, u_int, void *, size_t *, void *, size_t); > int sysctl_utc_offset(void *, size_t *, void *, size_t); > > @@ -479,10 +472,12 @@ kern_sysctl(int *name, u_int namelen, vo > return (EPERM); > securelevel = level; > return (0); > +#if NDT > 0 > case KERN_ALLOWDT: > if (securelevel > 0) > return (sysctl_rdint(oldp, oldlenp, newp, allowdt)); > return (sysctl_int(oldp, oldlenp, newp, newlen, )); > +#endif > case KERN_ALLOWKMEM: > if (securelevel > 0) > return (sysctl_rdint(oldp, oldlenp, newp, allowkmem)); >
dt(4) ifdef sysctl
Hi, sysctl witnesswatch gives an error message if the feature is not compiled into the kernel. I think dt(4) allowdt should do the same. sysctl: kern.allowdt: value is not available This removes a bit of unused code from ramdisk kernel. The variable allowdt should be in the device, not in sysctl source. We don't need #ifdef for extern and prototypes, without it code is more readable. Put the unneeded sysctl code into an #if NDT > 0. ok? By the way, can we enable dt(4) in GENERIC? I use it quite often and it is handy to have it is avaiable. Missuse is prevented by securelevel sysctl. Any downside? bluhm Index: dev/dt/dt_dev.c === RCS file: /data/mirror/openbsd/cvs/src/sys/dev/dt/dt_dev.c,v retrieving revision 1.12 diff -u -p -r1.12 dt_dev.c --- dev/dt/dt_dev.c 26 Mar 2021 21:17:10 - 1.12 +++ dev/dt/dt_dev.c 22 Apr 2021 17:38:22 - @@ -109,6 +109,8 @@ SIMPLEQ_HEAD(, dt_probe)dt_probe_list; struct rwlock dt_lock = RWLOCK_INITIALIZER("dtlk"); volatile uint32_t dt_tracing = 0; /* [K] # of processes tracing */ +int allowdt; + void dtattach(struct device *, struct device *, void *); intdtopen(dev_t, int, int, struct proc *); intdtclose(dev_t, int, int, struct proc *); @@ -145,7 +147,6 @@ dtopen(dev_t dev, int flags, int mode, s { struct dt_softc *sc; int unit = minor(dev); - extern int allowdt; if (!allowdt) return EPERM; Index: kern/kern_sysctl.c === RCS file: /data/mirror/openbsd/cvs/src/sys/kern/kern_sysctl.c,v retrieving revision 1.389 diff -u -p -r1.389 kern_sysctl.c --- kern/kern_sysctl.c 8 Feb 2021 10:51:02 - 1.389 +++ kern/kern_sysctl.c 22 Apr 2021 17:38:22 - @@ -114,24 +114,21 @@ #endif #include "audio.h" -#include "video.h" +#include "dt.h" #include "pf.h" +#include "video.h" extern struct forkstat forkstat; extern struct nchstats nchstats; extern int nselcoll, fscale; extern struct disklist_head disklist; extern fixpt_t ccpu; -extern long numvnodes; -#if NAUDIO > 0 +extern long numvnodes; +extern int allowdt; extern int audio_record_enable; -#endif -#if NVIDEO > 0 extern int video_record_enable; -#endif int allowkmem; -int allowdt; int sysctl_diskinit(int, struct proc *); int sysctl_proc_args(int *, u_int, void *, size_t *, struct proc *); @@ -142,12 +139,8 @@ int sysctl_proc_vmmap(int *, u_int, void int sysctl_intrcnt(int *, u_int, void *, size_t *); int sysctl_sensors(int *, u_int, void *, size_t *, void *, size_t); int sysctl_cptime2(int *, u_int, void *, size_t *, void *, size_t); -#if NAUDIO > 0 int sysctl_audio(int *, u_int, void *, size_t *, void *, size_t); -#endif -#if NVIDEO > 0 int sysctl_video(int *, u_int, void *, size_t *, void *, size_t); -#endif int sysctl_cpustats(int *, u_int, void *, size_t *, void *, size_t); int sysctl_utc_offset(void *, size_t *, void *, size_t); @@ -479,10 +472,12 @@ kern_sysctl(int *name, u_int namelen, vo return (EPERM); securelevel = level; return (0); +#if NDT > 0 case KERN_ALLOWDT: if (securelevel > 0) return (sysctl_rdint(oldp, oldlenp, newp, allowdt)); return (sysctl_int(oldp, oldlenp, newp, newlen, )); +#endif case KERN_ALLOWKMEM: if (securelevel > 0) return (sysctl_rdint(oldp, oldlenp, newp, allowkmem));
Re: re-enable A-MSDU support with iwm(4) and iwx(4) fixed
Hi Stefan, * Stefan Sperling wrote: > This patch adds A-MSDU rx offloading support for both iwm(4) and iwx(4) > and re-enables net80211's software A-MSDU Rx support for all 11n drivers. > > Meaning iwn(4) and athn(4) will also be receiving A-MSDUs again. > This feature has been turned off since July 2019: > https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/net80211/ieee80211_input.c#rev1.207 > The root cause of the problem I saw at that time was related to dlg's Rx > queue back-pressure mechanism. Once we figured this out, all wireless drivers > were fixed to call if_input() only once per interrupt so this is no longer > an issue. > > For a very brief period we tried to enable A-MSDUs again in -current but > we found out that iwm/iwx needed additional work for the new devices which > received support while A-MSDU was disabled in-tree: > https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/net80211/ieee80211_input.c#rev1.226 > The patch below completes the missing bits to make A-MSDUs work on these > new devices. I have a kernel with your patch running since several hours and noticed a regression. My usual "test case" is copying several large files from my file server via NFSv3 to my laptop. In the beginning the transfer rate was about 2-3M/s and after some time it dropped to around 50-300K/s and never recovered (transfer is now running for 2.5h). I have the following device in a Thinkpad T450s connect to a Fritzbox AP. iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless-AC 8265" rev 0x78, msi iwm0: hw rev 0x230, fw ver 34.0.1, address Cheers Matthias
Patch: USB-N10 Nano B1 support - wifi usb dongle
Hello tech@, following patch enables urtwn driver to attach to USB wifi adapter ASUS USB-N10 Nano (Wireless-N150) Sticker on box says: Model: USB-N10 NANO B1 H/W Ver.: B1 ### N10 B1 usbdevs: addr 02: 0b05:18f0 Realtek, 802.11n NIC dmesg: urtwn0 at uhub0 port 1 configuration 1 interface 0 "Realtek 802.11n NIC" rev 2.00/0.00 addr 2 I also have non-B1 release of the same wi-fi adapter that works out of the box and that one is detected like this: ### N10 non-B1 usbdevs: addr 03: 0b05:17ba Realtek, 802.11n WLAN Adapter dmesg: urtwn1 at uhub0 port 2 configuration 1 interface 0 "Realtek 802.11n WLAN Adapter" rev 2.00/2.00 addr 3 I have updated the manpage for urtwn with these two models too. I have one question - where the B1 version is detected as NIC while the non-B1 version says WLAN Adapter. Where does this come from? Does the device itself report it? Thank you, JV ### PATCH Index: share/man/man4/urtwn.4 === RCS file: /cvs/src/share/man/man4/urtwn.4,v retrieving revision 1.49 diff -u -p -u -p -r1.49 urtwn.4 --- share/man/man4/urtwn.4 15 Nov 2020 00:33:24 - 1.49 +++ share/man/man4/urtwn.4 22 Apr 2021 17:24:38 - @@ -105,6 +105,8 @@ The following adapters should work: .It Alfa AWUS036NHR .It Approx APPUSB300NANO V1 .It Aus. Linx AL-9604R1S +.It Asus USB-N10 NANO +.It Asus USB-N10 NANO B1 .It B-Link BL-LW05-5R .It Belkin F7D1102 Surf Wireless Micro .It D-Link DWA-121 Index: sys/dev/usb/if_urtwn.c === RCS file: /cvs/src/sys/dev/usb/if_urtwn.c,v retrieving revision 1.96 diff -u -p -u -p -r1.96 if_urtwn.c --- sys/dev/usb/if_urtwn.c 15 Nov 2020 00:04:05 - 1.96 +++ sys/dev/usb/if_urtwn.c 22 Apr 2021 17:24:38 - @@ -331,6 +331,8 @@ static const struct urtwn_type { URTWN_DEV_8188EU(REALTEK, RTL8188ETV), URTWN_DEV_8188EU(REALTEK, RTL8188EU), URTWN_DEV_8188EU(TPLINK,RTL8188EUS), + URTWN_DEV_8188EU(ASUS, RTL8188EUS), + /* URTWN_RTL8192EU */ URTWN_DEV_8192EU(DLINK, DWA131E1), URTWN_DEV_8192EU(REALTEK, RTL8192EU), Index: sys/dev/usb/usbdevs.h === RCS file: /cvs/src/sys/dev/usb/usbdevs.h,v retrieving revision 1.750 diff -u -p -u -p -r1.750 usbdevs.h --- sys/dev/usb/usbdevs.h 5 Apr 2021 20:46:06 - 1.750 +++ sys/dev/usb/usbdevs.h 22 Apr 2021 17:24:38 - @@ -1083,6 +1083,7 @@ #defineUSB_PRODUCT_ASUS_RTL8192CU_20x17ba /* RTL8192CU */ #defineUSB_PRODUCT_ASUS_RTL8192CU_30x17c0 /* RTL8192CU */ #defineUSB_PRODUCT_ASUS_RTL81560x18d1 /* RTL8156 */ +#defineUSB_PRODUCT_ASUS_RTL8188EUS 0x18f0 /* RTL8188EUS */ #defineUSB_PRODUCT_ASUS_MYPAL_A730 0x4202 /* MyPal A730 */ #defineUSB_PRODUCT_ASUS2_USBN110x0b05 /* USB-N11 */ Index: sys/dev/usb/usbdevs_data.h === RCS file: /cvs/src/sys/dev/usb/usbdevs_data.h,v retrieving revision 1.744 diff -u -p -u -p -r1.744 usbdevs_data.h --- sys/dev/usb/usbdevs_data.h 5 Apr 2021 20:46:06 - 1.744 +++ sys/dev/usb/usbdevs_data.h 22 Apr 2021 17:24:38 - @@ -1134,6 +1134,10 @@ const struct usb_known_product usb_known "RT3070", }, { + USB_VENDOR_ASUS, USB_PRODUCT_ASUS_RTL8188EUS, + "RTL8188EUS", + }, + { USB_VENDOR_ASUS, USB_PRODUCT_ASUS_RTL8192SU_1, "RTL8192SU", }, ### DMESG // I've changed MAC address to xx:xx:xx:xx:xx:xx and serial for battery // and nvme to zeroes. OpenBSD 6.9-current (GENERIC.MP) #0: Thu Apr 22 19:03:24 CEST 2021 ja...@mercurius.bsd:/sys/arch/amd64/compile/GENERIC.MP real mem = 8469352448 (8077MB) avail mem = 8197271552 (7817MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.7 @ 0x8afbc000 (33 entries) bios0: vendor Apple Inc. version "426.0.0.0.0" date 12/17/2020 bios0: Apple Inc. MacBookAir7,2 acpi0 at bios0: ACPI 5.0 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP HPET APIC SBST ECDT SSDT SSDT SSDT SSDT SSDT SSDT SSDT SSDT DMAR MCFG acpi0: wakeup devices PEG0(S3) EC__(S3) HDEF(S3) RP01(S3) RP02(S3) RP03(S4) ARPT(S4) RP05(S3) RP06(S3) SPIT(S3) XHC1(S3) ADP1(S3) LID0(S3) acpitimer0 at acpi0: 3579545 Hz, 24 bits acpihpet0 at acpi0: 14318179 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Core(TM) i5-5250U CPU @ 1.60GHz, 1500.21 MHz, 06-3d-04 cpu0:
Re: pf_state_key_link_reverse() is prone to race on parallel forwarding
On Wed, Apr 21, 2021 at 10:19:10PM +0200, Alexandr Nedvedicky wrote: > would it be OK to commit it once bluhm's diff [1] will be in? Diff can be commited independently when we know that it is correct. > 8<---8<---8<--8< > diff --git a/sys/net/pf.c b/sys/net/pf.c > index 28aa6157552..d4212c21d7b 100644 > --- a/sys/net/pf.c > +++ b/sys/net/pf.c > @@ -7366,11 +7366,20 @@ pf_inp_unlink(struct inpcb *inp) > void > pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key > *skrev) > { > - /* Note that sk and skrev may be equal, then we refcount twice. */ > - KASSERT(sk->reverse == NULL); > - KASSERT(skrev->reverse == NULL); > - sk->reverse = pf_state_key_ref(skrev); > - skrev->reverse = pf_state_key_ref(sk); > + struct pf_state_key *old_reverse; > + > + old_reverse = atomic_cas_ptr(>reverse, NULL, skrev); > + if (old_reverse != NULL) > + KASSERT(old_reverse == skrev); > + else > + pf_state_key_ref(skrev); > + Can you remove one of the double empty lines? > + > + old_reverse = atomic_cas_ptr(>reverse, NULL, sk); > + if (old_reverse != NULL) > + KASSERT(old_reverse == sk); > + else > + pf_state_key_ref(sk); > } > > #if NPFLOG > 0 In genua code Markus and Haebaert have a more complex solution. I don't know if we need this and I did not test it. Just showing so you can consider it. A bunch of other functions are also affected. If we want it, I can convert it into a working diff. If we think sashan@'s diff is sufficient, I am fine with that. Otherwise I would recommend to port the working solution from genua. void pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key *skrev) { struct pf_state_key *rev; /* * 'skrev' points to the statekey from the packet. * * We try to set 'skrev->reverse' to 'sk', so we can find 'sk' w/o * a lookup in the tree in the future. * However, both 'skrev->reverse' and 'sk->reverse' might still point * to state keys that are about to be removed, so we need to do these * steps: * * 1. If 'skrev->reverse' exists, we do nothing because the linked *state key might be removed concurrently, i.e. the 'removed' *attribute is set and pf_state_key_unlink_reverse() is running. *When the remove is complete skrev->reverse will be set to NULL. * 2. Otherwise ('skrev->reverse' is NULL) we try to set *sk->reverse=skrev'. This will fail if sk->reverse also points *to a state key with 'removed' set. * 3. Only if step (2) succeeds we can update 'skrev->reverse=sk'. * * This way both skrev and sk point to each other through the * 'reverse' attribute. */ if ((rev = skrev->reverse) != NULL) { /* points either to sk, or removed entry */ KASSERT(rev == sk || rev->removed); return; } /* * Grab a reference to sk/skrev, if we succeed, try to set our reverse * pointers, if we fail, undo do reference. */ if (!pf_state_key_ref_try(sk)) return; if (!pf_state_key_ref_try(skrev)) { pf_state_key_unref(sk); return; } if (atomic_cas_ptr(>reverse, NULL, skrev) != NULL) { /* points either to skrev, NULL, or removed entry */ rev = sk->reverse; KASSERT(rev == skrev || rev == NULL || rev->removed); pf_state_key_unref(skrev); pf_state_key_unref(sk); return; } if (atomic_cas_ptr(>reverse, NULL, sk) != NULL) { rev = skrev->reverse; KASSERT(rev == sk || rev == NULL); return; } /* done */ } int pf_state_key_ref_try0(struct pf_state_key *sk) { if (!pf_state_key_isvalid(sk)) return (0); return (refcnt_take_try(sk->refcnt)); } int refcnt_take_try(struct refcnt *r) { volatile unsigned int refcnt; while (1) { refcnt = r->refs; /* Failed */ if (refcnt == 0) return (0); if (atomic_cas_uint(>refs, refcnt, refcnt + 1) == refcnt) return (1); } return (1); /* NOTREACHED */ }
Unlock top part of uvm_fault()
Diff below remove the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for both amd64 and sparc64. That means the kernel lock will only be taken for lower faults and some amap/anon code will now run without it. I'd be interested to have this tested and see how much does that impact the build time of packages. We should be able to do the switch on an arch-by-arch basis. It's easier for me to develop & debug on these two architectures so I started with them. If you want to unlock another architecture and report back, I'd be glad. Thanks, Martin Index: arch/amd64/amd64/trap.c === RCS file: /cvs/src/sys/arch/amd64/amd64/trap.c,v retrieving revision 1.87 diff -u -p -r1.87 trap.c --- arch/amd64/amd64/trap.c 22 Oct 2020 13:41:51 - 1.87 +++ arch/amd64/amd64/trap.c 22 Apr 2021 13:06:54 - @@ -176,10 +176,7 @@ upageflttrap(struct trapframe *frame, ui union sigval sv; int signal, sicode, error; - KERNEL_LOCK(); error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type); - KERNEL_UNLOCK(); - if (error == 0) { uvm_grow(p, va); return 1; @@ -261,9 +258,7 @@ kpageflttrap(struct trapframe *frame, ui if (curcpu()->ci_inatomic == 0 || map == kernel_map) { onfault = pcb->pcb_onfault; pcb->pcb_onfault = NULL; - KERNEL_LOCK(); error = uvm_fault(map, va, 0, access_type); - KERNEL_UNLOCK(); pcb->pcb_onfault = onfault; if (error == 0 && map != kernel_map) Index: arch/sparc64/sparc64/trap.c === RCS file: /cvs/src/sys/arch/sparc64/sparc64/trap.c,v retrieving revision 1.108 diff -u -p -r1.108 trap.c --- arch/sparc64/sparc64/trap.c 11 Mar 2021 11:17:00 - 1.108 +++ arch/sparc64/sparc64/trap.c 22 Apr 2021 13:06:54 - @@ -773,10 +773,7 @@ data_access_fault(struct trapframe64 *tf if (!(addr & TLB_TAG_ACCESS_CTX)) { /* CTXT == NUCLEUS */ - KERNEL_LOCK(); error = uvm_fault(kernel_map, va, 0, access_type); - KERNEL_UNLOCK(); - if (error == 0) return; goto kfault; @@ -792,9 +789,7 @@ data_access_fault(struct trapframe64 *tf onfault = (vaddr_t)p->p_addr->u_pcb.pcb_onfault; p->p_addr->u_pcb.pcb_onfault = NULL; - KERNEL_LOCK(); error = uvm_fault(>p_vmspace->vm_map, (vaddr_t)va, 0, access_type); - KERNEL_UNLOCK(); p->p_addr->u_pcb.pcb_onfault = (void *)onfault; /* @@ -959,9 +954,7 @@ text_access_fault(struct trapframe64 *tf uvm_map_inentry_sp, p->p_vmspace->vm_map.sserial)) goto out; - KERNEL_LOCK(); error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type); - KERNEL_UNLOCK(); /* * If this was a stack access we keep track of the maximum @@ -1055,9 +1048,7 @@ text_access_error(struct trapframe64 *tf uvm_map_inentry_sp, p->p_vmspace->vm_map.sserial)) goto out; - KERNEL_LOCK(); error = uvm_fault(>p_vmspace->vm_map, va, 0, access_type); - KERNEL_UNLOCK(); /* * If this was a stack access we keep track of the maximum
Re: [External] : Re: running network stack forwarding in parallel
Hello, On Thu, Apr 22, 2021 at 03:08:21PM +0200, Mark Kettenis wrote: > > Date: Thu, 22 Apr 2021 14:43:24 +0200 > > From: Alexandr Nedvedicky > > > > Hello, > > > > On Thu, Apr 22, 2021 at 01:09:34PM +0200, Alexander Bluhm wrote: > > > On Thu, Apr 22, 2021 at 12:33:13PM +0200, Hrvoje Popovski wrote: > > > > r620-1# papnpaiancini:cc :p :op > > > > opooolo_llc_ac_caccahhceh_ei_eti_tieetmme_mm__amgamigacigci__cc_hccehhcekcekc:: > > > > k :m bmubmfubfuppflp llc pc pcuup uf rfferree eel el iilsitss tm tom > > > > omddoidfiiifeifeidde:d ::i ti etietmme m > > > > a daddardd rd0 r0 > > > > xx0fxfddf88d08c0cc0c6c76afc9b3f04500400++01+61 610 6x0 > > > > fx0fxffdffdf88d08 > > > > 00020720d72a8c0049703eb!ef!e==!0=x009x59x95995b9ebbaee3ae3ae344ef54f5a4bff7db07990a9 > > > > > > Wow. 3 CPUs panic in pool_cache_get() pool_cache_item_magic_check > > > simultaneously. This makes me think we may have a bug there. > > > > > > > I took a look at arch/amd64/include/intrdefs.h where interrupt > > priorities are defined. > > > > IPL_NET has priority set to 7, > > IPL_SOFTNET has higher priority set 5 > > > > all allocations are coming from mbpool via m_gethdr(), interrupt > > level priority for mbpool is set to IPL_NET. If I understand > > code in m_pool_get() right, then the pool_cache_enter() does not > > stop guys who call m_gethdr() with IPL_SOFTNET. > > > > if we put KERNEL_LOCK() there the problem is gone, mostlikely > > because the IPL_SOFTNET guy waits for KERNEL_LOCK therefore it > > can not interfere with our IPL_NET task, which forwards packet. > > > > I admit it's a poor speculation, I have no 'hard proof' for my > > claim here. So I might be very wrong here. > > Not sure what you are trying to say here, but IPL_SOFTNET is lower > than IPL_NET. So code that runs at IPL_SOFTNET will raise the IPL to > IPL_NET in pool_cache_enter(), blocking IPL_NET interrupts until > pool_cache_leave() is called and the IPL is lowered again to > IPL_SOFTNET. > > I'm fairly confident the "normal" pools are mpsafe; we have been using > those in concurrent contexts without holding the kernel lock for a > long time already. But the pool cache layer is still relatively new... my understanding was twisted: lower the number the higher the priority. this made me thinking IPL_NET won't prevent processes running with IPL_SOFTNET from running into pool_cache_enter()/pool_cache_leave() section. thanks and regards sashan
Re: [External] : Re: running network stack forwarding in parallel
On 22/04/21(Thu) 15:08, Mark Kettenis wrote: > > Date: Thu, 22 Apr 2021 14:43:24 +0200 > > From: Alexandr Nedvedicky > > > > Hello, > > > > On Thu, Apr 22, 2021 at 01:09:34PM +0200, Alexander Bluhm wrote: > > > On Thu, Apr 22, 2021 at 12:33:13PM +0200, Hrvoje Popovski wrote: > > > > r620-1# papnpaiancini:cc :p :op > > > > opooolo_llc_ac_caccahhceh_ei_eti_tieetmme_mm__amgamigacigci__cc_hccehhcekcekc:: > > > > k :m bmubmfubfuppflp llc pc pcuup uf rfferree eel el iilsitss tm tom > > > > omddoidfiiifeifeidde:d ::i ti etietmme m > > > > a daddardd rd0 r0 > > > > xx0fxfddf88d08c0cc0c6c76afc9b3f04500400++01+61 610 6x0 > > > > fx0fxffdffdf88d08 > > > > 00020720d72a8c0049703eb!ef!e==!0=x009x59x95995b9ebbaee3ae3ae344ef54f5a4bff7db07990a9 > > > > > > Wow. 3 CPUs panic in pool_cache_get() pool_cache_item_magic_check > > > simultaneously. This makes me think we may have a bug there. > > > > > > > I took a look at arch/amd64/include/intrdefs.h where interrupt > > priorities are defined. > > > > IPL_NET has priority set to 7, > > IPL_SOFTNET has higher priority set 5 > > > > all allocations are coming from mbpool via m_gethdr(), interrupt > > level priority for mbpool is set to IPL_NET. If I understand > > code in m_pool_get() right, then the pool_cache_enter() does not > > stop guys who call m_gethdr() with IPL_SOFTNET. > > > > if we put KERNEL_LOCK() there the problem is gone, mostlikely > > because the IPL_SOFTNET guy waits for KERNEL_LOCK therefore it > > can not interfere with our IPL_NET task, which forwards packet. > > > > I admit it's a poor speculation, I have no 'hard proof' for my > > claim here. So I might be very wrong here. > > Not sure what you are trying to say here, but IPL_SOFTNET is lower > than IPL_NET. So code that runs at IPL_SOFTNET will raise the IPL to > IPL_NET in pool_cache_enter(), blocking IPL_NET interrupts until > pool_cache_leave() is called and the IPL is lowered again to > IPL_SOFTNET. IPL_SOFTNET is only blocking timeouts that don't run in a thread context. It is mostly a legacy of the time where received packets where processed in a soft-interrupt context. > I'm fairly confident the "normal" pools are mpsafe; we have been using > those in concurrent contexts without holding the kernel lock for a > long time already. But the pool cache layer is still relatively new... That said, no other subsystem in the kernel is currently as multi-threaded as a network stack with 4 threads. So it is possible that existing races are more likely to show up with this diff,
Re: [External] : Re: running network stack forwarding in parallel
> Date: Thu, 22 Apr 2021 14:43:24 +0200 > From: Alexandr Nedvedicky > > Hello, > > On Thu, Apr 22, 2021 at 01:09:34PM +0200, Alexander Bluhm wrote: > > On Thu, Apr 22, 2021 at 12:33:13PM +0200, Hrvoje Popovski wrote: > > > r620-1# papnpaiancini:cc :p :op > > > opooolo_llc_ac_caccahhceh_ei_eti_tieetmme_mm__amgamigacigci__cc_hccehhcekcekc:: > > > k :m bmubmfubfuppflp llc pc pcuup uf rfferree eel el iilsitss tm tom > > > omddoidfiiifeifeidde:d ::i ti etietmme m > > > a daddardd rd0 r0 > > > xx0fxfddf88d08c0cc0c6c76afc9b3f04500400++01+61 610 6x0 > > > fx0fxffdffdf88d08 > > > 00020720d72a8c0049703eb!ef!e==!0=x009x59x95995b9ebbaee3ae3ae344ef54f5a4bff7db07990a9 > > > > Wow. 3 CPUs panic in pool_cache_get() pool_cache_item_magic_check > > simultaneously. This makes me think we may have a bug there. > > > > I took a look at arch/amd64/include/intrdefs.h where interrupt > priorities are defined. > > IPL_NET has priority set to 7, > IPL_SOFTNET has higher priority set 5 > > all allocations are coming from mbpool via m_gethdr(), interrupt > level priority for mbpool is set to IPL_NET. If I understand > code in m_pool_get() right, then the pool_cache_enter() does not > stop guys who call m_gethdr() with IPL_SOFTNET. > > if we put KERNEL_LOCK() there the problem is gone, mostlikely > because the IPL_SOFTNET guy waits for KERNEL_LOCK therefore it > can not interfere with our IPL_NET task, which forwards packet. > > I admit it's a poor speculation, I have no 'hard proof' for my > claim here. So I might be very wrong here. Not sure what you are trying to say here, but IPL_SOFTNET is lower than IPL_NET. So code that runs at IPL_SOFTNET will raise the IPL to IPL_NET in pool_cache_enter(), blocking IPL_NET interrupts until pool_cache_leave() is called and the IPL is lowered again to IPL_SOFTNET. I'm fairly confident the "normal" pools are mpsafe; we have been using those in concurrent contexts without holding the kernel lock for a long time already. But the pool cache layer is still relatively new...
Re: have bpf kq events fire when the interface goes away
On Thu, Apr 22, 2021 at 01:13:50PM +1000, David Gwynne wrote: > On Wed, Apr 21, 2021 at 01:15:53PM +, Visa Hankala wrote: > > On Wed, Apr 21, 2021 at 11:04:20AM +1000, David Gwynne wrote: > > > On Wed, Apr 21, 2021 at 10:21:32AM +1000, David Gwynne wrote: > > > > if you have a program that uses kq (or libevent) to wait for bytes to > > > > read off an idle network interface via /dev/bpf and that interface > > > > goes away, the program doesnt get woken up. this is because the kq > > > > read filter in bpf only checks if there ares bytes available. because a > > > > detached interface never gets packets (how very zen), this condition > > > > never changes and the program will never know something happened. > > > > > > > > this has the bpf filter check if the interface is detached too. with > > > > this change my test program wakes up, tries to read, and gets EIO. which > > > > is great. > > > > > > > > note that in the middle of this is the vdevgone machinery. when an > > > > interface is detached, bpfdetach gets called, which ends up calling > > > > vdevgone. vdevgone sort of swaps out bpf on the currently open vdev with > > > > some dead operations, part of which involves calling bpfclose() to try > > > > and clean up the existing state associated with the vdev. bpfclose tries > > > > to wake up any waiting listeners, which includes kq handlers. that's how > > > > the kernel goes from an interface being detached to the bpf kq filter > > > > being run. the bpf kq filter just has to check that the interface is > > > > still attached. > > > > > > I thought tun(4) had this same problem, but I wrote a test and couldn't > > > reproduce it. tun works because it addresses the problem in a different > > > way. Instead of having its own kq filter check if the device is dead or > > > not, it calls klist_invalidate, which switches things around like the > > > vdevgone/vop_revoke stuff does with the vdev. > > > > > > So an alternative way to solve this problem in bpf(4) would be the > > > following: > > > > > > Index: bpf.c > > > === > > > RCS file: /cvs/src/sys/net/bpf.c,v > > > retrieving revision 1.203 > > > diff -u -p -r1.203 bpf.c > > > --- bpf.c 21 Jan 2021 12:33:14 - 1.203 > > > +++ bpf.c 21 Apr 2021 00:54:30 - > > > @@ -401,6 +401,7 @@ bpfclose(dev_t dev, int flag, int mode, > > > bpf_wakeup(d); > > > LIST_REMOVE(d, bd_list); > > > mtx_leave(>bd_mtx); > > > + klist_invalidate(>bd_sel.si_note); > > > bpf_put(d); > > > > > > return (0); > > > > I think bpf should call klist_invalidate() from the detach path. > > bpfsdetach() might be the right place. This would make the code pattern > > similar to the existing uses of klist_invalidate(). > > > > Calling klist_invalidate() from the close function twists the logic, > > at least in my opinion. When a file descriptor is closed, the file > > descriptor layer will remove the knotes automatically. This is why close > > functions usually do not have to manage with knotes. However, the > > automatic removal does not happen when a device is revoked, which is > > mended by klist_invalidate(). > > yep, makes sense to me. how's this look? it works as well as my previous > diff did in my tests. Looks good. OK visa@ > Index: bpf.c > === > RCS file: /cvs/src/sys/net/bpf.c,v > retrieving revision 1.203 > diff -u -p -r1.203 bpf.c > --- bpf.c 21 Jan 2021 12:33:14 - 1.203 > +++ bpf.c 22 Apr 2021 03:09:27 - > @@ -1690,8 +1690,10 @@ bpfsdetach(void *p) > if (cdevsw[maj].d_open == bpfopen) > break; > > - while ((bd = SMR_SLIST_FIRST_LOCKED(>bif_dlist))) > + while ((bd = SMR_SLIST_FIRST_LOCKED(>bif_dlist))) { > vdevgone(maj, bd->bd_unit, bd->bd_unit, VCHR); > + klist_invalidate(>bd_sel.si_note); > + } > > for (tbp = bpf_iflist; tbp; tbp = tbp->bif_next) { > if (tbp->bif_next == bp) { >
Re: [External] : Re: running network stack forwarding in parallel
Hello, On Thu, Apr 22, 2021 at 01:09:34PM +0200, Alexander Bluhm wrote: > On Thu, Apr 22, 2021 at 12:33:13PM +0200, Hrvoje Popovski wrote: > > r620-1# papnpaiancini:cc :p :op > > opooolo_llc_ac_caccahhceh_ei_eti_tieetmme_mm__amgamigacigci__cc_hccehhcekcekc:: > > k :m bmubmfubfuppflp llc pc pcuup uf rfferree eel el iilsitss tm tom > > omddoidfiiifeifeidde:d ::i ti etietmme m > > a daddardd rd0 r0 > > xx0fxfddf88d08c0cc0c6c76afc9b3f04500400++01+61 610 6x0 > > fx0fxffdffdf88d08 > > 00020720d72a8c0049703eb!ef!e==!0=x009x59x95995b9ebbaee3ae3ae344ef54f5a4bff7db07990a9 > > Wow. 3 CPUs panic in pool_cache_get() pool_cache_item_magic_check > simultaneously. This makes me think we may have a bug there. > I took a look at arch/amd64/include/intrdefs.h where interrupt priorities are defined. IPL_NET has priority set to 7, IPL_SOFTNET has higher priority set 5 all allocations are coming from mbpool via m_gethdr(), interrupt level priority for mbpool is set to IPL_NET. If I understand code in m_pool_get() right, then the pool_cache_enter() does not stop guys who call m_gethdr() with IPL_SOFTNET. if we put KERNEL_LOCK() there the problem is gone, mostlikely because the IPL_SOFTNET guy waits for KERNEL_LOCK therefore it can not interfere with our IPL_NET task, which forwards packet. I admit it's a poor speculation, I have no 'hard proof' for my claim here. So I might be very wrong here. thanks and regards sashan
Re: running network stack forwarding in parallel
On Thu, Apr 22, 2021 at 12:26:50AM +0200, Alexander Bluhm wrote: > As a wild guess, you could apply this diff on top. Something similar > has fixed IPv6 NDP problem I have seen. Maybe it is in the routing > table, that is used for ARP and NDP. Here are the performance numbers for forwarding with kernel lock around ARP. Columns are: 1. current, 21 April 2. parallel nettask, kernel lock around ARP 3. parallel nettask, remove ipsec hack 4. parallel nettask http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/gnuplot/forward.png Interresting is, that the forwarding performance spreads much more when we have kernel lock around arp. So maybe different timing hides the bug. All the data, you can click on "diff" to see what was tested. http://bluhm.genua.de/perform/results/2021-04-21T10%3A50%3A37Z/perform.html > --- net/if_ethersubr.c > +++ net/if_ethersubr.c > @@ -222,7 +222,10 @@ ether_resolve(struct ifnet *ifp, struct mbuf *m, struct > sockaddr *dst, > > switch (af) { > case AF_INET: > + KERNEL_LOCK(); > + /* XXXSMP there is a MP race in arpresolve() */ > error = arpresolve(ifp, rt, m, dst, eh->ether_dhost); > + KERNEL_UNLOCK(); > if (error) > return (error); > eh->ether_type = htons(ETHERTYPE_IP);
re-enable A-MSDU support with iwm(4) and iwx(4) fixed
This patch adds A-MSDU rx offloading support for both iwm(4) and iwx(4) and re-enables net80211's software A-MSDU Rx support for all 11n drivers. Meaning iwn(4) and athn(4) will also be receiving A-MSDUs again. This feature has been turned off since July 2019: https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/net80211/ieee80211_input.c#rev1.207 The root cause of the problem I saw at that time was related to dlg's Rx queue back-pressure mechanism. Once we figured this out, all wireless drivers were fixed to call if_input() only once per interrupt so this is no longer an issue. For a very brief period we tried to enable A-MSDUs again in -current but we found out that iwm/iwx needed additional work for the new devices which received support while A-MSDU was disabled in-tree: https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/net80211/ieee80211_input.c#rev1.226 The patch below completes the missing bits to make A-MSDUs work on these new devices. My previous iwm-only patch and related test reports are here: https://marc.info/?t=16170390711=1=2 The iwm changes have already been extensively tested. For iwm, this new patch only adds an additional range check: @@ -4640,7 +4640,8 @@ iwm_rx_reorder(struct iwm_softc *sc, struct mbuf *m, i baid = (reorder_data & IWM_RX_MPDU_REORDER_BAID_MASK) >> IWM_RX_MPDU_REORDER_BAID_SHIFT; - if (baid == IWM_RX_REORDER_DATA_INVALID_BAID) + if (baid == IWM_RX_REORDER_DATA_INVALID_BAID || + baid >= nitems(sc->sc_rxba_data)) return 0; rxba = >sc_rxba_data[baid]; I have tested on the following devices: iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless-AC 8265" rev 0x78, msi iwm0: hw rev 0x230, fw ver 34.0.1, address 7c:11:xx:xx:xx:xx iwx0 at pci5 dev 0 function 0 "Intel Wi-Fi 6 AX200" rev 0x1a, msix iwx0: hw rev 0x340, fw ver 48.1335886879.0, address d0:ab:xx:xx:xx:xx iwx0 at pci0 dev 20 function 3 "Intel Wi-Fi 6 AX201" rev 0x00, msix iwx0: hw rev 0x350, fw ver 48.1335886879.0, address 0c:7a:xx:xx:xx:xx ok? diff refs/heads/master refs/heads/amsdu blob - 00bf20b37ed33a652232885349c2f3dfa0d666d3 blob + 05cc01334b522b7b537fba6df4b4fd3e0c5d8eb9 --- sys/dev/pci/if_iwm.c +++ sys/dev/pci/if_iwm.c @@ -144,6 +144,8 @@ #include #include #include +#include /* for SEQ_LT */ +#undef DPRINTF /* defined in ieee80211_priv.h */ #define DEVNAME(_s)((_s)->sc_dev.dv_xname) @@ -328,12 +330,17 @@ int iwm_mimo_enabled(struct iwm_softc *); void iwm_setup_ht_rates(struct iwm_softc *); void iwm_htprot_task(void *); void iwm_update_htprot(struct ieee80211com *, struct ieee80211_node *); +void iwm_init_reorder_buffer(struct iwm_reorder_buffer *, uint16_t, + uint16_t); +void iwm_clear_reorder_buffer(struct iwm_softc *, struct iwm_rxba_data *); intiwm_ampdu_rx_start(struct ieee80211com *, struct ieee80211_node *, uint8_t); void iwm_ampdu_rx_stop(struct ieee80211com *, struct ieee80211_node *, uint8_t); +void iwm_rx_ba_session_expired(void *); +void iwm_reorder_timer_expired(void *); void iwm_sta_rx_agg(struct iwm_softc *, struct ieee80211_node *, uint8_t, - uint16_t, uint16_t, int); + uint16_t, uint16_t, int, int); #ifdef notyet intiwm_ampdu_tx_start(struct ieee80211com *, struct ieee80211_node *, uint8_t); @@ -372,8 +379,10 @@ intiwm_rxmq_get_signal_strength(struct iwm_softc *, s void iwm_rx_rx_phy_cmd(struct iwm_softc *, struct iwm_rx_packet *, struct iwm_rx_data *); intiwm_get_noise(const struct iwm_statistics_rx_non_phy *); +intiwm_rx_hwdecrypt(struct iwm_softc *, struct mbuf *, uint32_t, + struct ieee80211_rxinfo *); intiwm_ccmp_decap(struct iwm_softc *, struct mbuf *, - struct ieee80211_node *); + struct ieee80211_node *, struct ieee80211_rxinfo *); void iwm_rx_frame(struct iwm_softc *, struct mbuf *, int, uint32_t, int, int, uint32_t, struct ieee80211_rxinfo *, struct mbuf_list *); void iwm_rx_tx_cmd_single(struct iwm_softc *, struct iwm_rx_packet *, @@ -490,6 +499,20 @@ void iwm_nic_umac_error(struct iwm_softc *); #endif void iwm_rx_mpdu(struct iwm_softc *, struct mbuf *, void *, size_t, struct mbuf_list *); +void iwm_flip_address(uint8_t *); +intiwm_detect_duplicate(struct iwm_softc *, struct mbuf *, + struct iwm_rx_mpdu_desc *, struct ieee80211_rxinfo *); +intiwm_is_sn_less(uint16_t, uint16_t, uint16_t); +void iwm_release_frames(struct iwm_softc *, struct ieee80211_node *, + struct iwm_rxba_data *, struct iwm_reorder_buffer *, uint16_t, + struct mbuf_list *); +intiwm_oldsn_workaround(struct iwm_softc *, struct ieee80211_node *, + int, struct iwm_reorder_buffer *, uint32_t, uint32_t); +intiwm_rx_reorder(struct iwm_softc *, struct mbuf *, int, + struct iwm_rx_mpdu_desc *, int, int, uint32_t, +
Re: pf_state_key_link_reverse() is prone to race on parallel forwarding
On 21.4.2021. 22:19, Alexandr Nedvedicky wrote: > Hello, > > people who will be running pf(4) with bluhm's diff [1], may trip > one of the asserts triggered by pf_state_key_link_reverse() here: > > 7366 void > 7367 pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key > *skrev) > 7368 { > 7369 /* Note that sk and skrev may be equal, then we refcount twice. > */ > 7370 KASSERT(sk->reverse == NULL); > 7371 KASSERT(skrev->reverse == NULL); > 7372 sk->reverse = pf_state_key_ref(skrev); > 7373 skrev->reverse = pf_state_key_ref(sk); > 7374 } > > pf(4) currently does not provide a mutual access to state instances. > so it may happen pf(4) will be handling more packets, which will be > updating the same state. This is the situation of one winner and > more losers. At this point I'm suggesting to change those asserts > to take the race into account. diff below makes pf_state_key_link_reverse() > happy when pf(4) runs in parallel. > > would it be OK to commit it once bluhm's diff [1] will be in? > > thanks and > regards > sashan I'm running this diff with latest version of bluhm@ network parallel diff and i'm not seeing and problems .. i'm not running pfsync ..
Re: [External] : Re: running network stack forwarding in parallel
> Date: Thu, 22 Apr 2021 13:09:34 +0200 > From: Alexander Bluhm > > On Thu, Apr 22, 2021 at 12:33:13PM +0200, Hrvoje Popovski wrote: > > r620-1# papnpaiancini:cc :p :op > > opooolo_llc_ac_caccahhceh_ei_eti_tieetmme_mm__amgamigacigci__cc_hccehhcekcekc:: > > k :m bmubmfubfuppflp llc pc pcuup uf rfferree eel el iilsitss tm tom > > omddoidfiiifeifeidde:d ::i ti etietmme m > > a daddardd rd0 r0 > > xx0fxfddf88d08c0cc0c6c76afc9b3f04500400++01+61 610 6x0 > > fx0fxffdffdf88d08 > > 00020720d72a8c0049703eb!ef!e==!0=x009x59x95995b9ebbaee3ae3ae344ef54f5a4bff7db07990a9 > > Wow. 3 CPUs panic in pool_cache_get() pool_cache_item_magic_check > simultaneously. This makes me think we may have a bug there. Note that POOL_DEBUG was disabled until very recently. That may affect both: 1. The point at which pool corruption is reported 2. The benchmark results. Cheers, Mark
Re: [External] : Re: running network stack forwarding in parallel
On Thu, Apr 22, 2021 at 12:33:13PM +0200, Hrvoje Popovski wrote: > r620-1# papnpaiancini:cc :p :op > opooolo_llc_ac_caccahhceh_ei_eti_tieetmme_mm__amgamigacigci__cc_hccehhcekcekc:: > k :m bmubmfubfuppflp llc pc pcuup uf rfferree eel el iilsitss tm tom > omddoidfiiifeifeidde:d ::i ti etietmme m > a daddardd rd0 r0 > xx0fxfddf88d08c0cc0c6c76afc9b3f04500400++01+61 610 6x0 > fx0fxffdffdf88d08 > 00020720d72a8c0049703eb!ef!e==!0=x009x59x95995b9ebbaee3ae3ae344ef54f5a4bff7db07990a9 Wow. 3 CPUs panic in pool_cache_get() pool_cache_item_magic_check simultaneously. This makes me think we may have a bug there. bluhm
Re: [External] : Re: running network stack forwarding in parallel
On 22.4.2021. 12:38, Alexander Bluhm wrote: > It is not clear why the lock helps. Is it a bug in routing or ARP? > Or is it just different timing introduced by the additional kernel > lock? The parallel network task stress the subsystems of the kernel > more than before with MP load. Having more results from machines > with different timing would help. > > bluhm > I'm running this diff on Dell r620 with 6 x Intel(R) Xeon(R) CPU E5-2643 v2 @ 3.50GHz, 3600.52 MHz, 06-3e-04 and ix and forwarding went up from 1Mpps to 1.8Mpps Will play with this and contact you if something pops up .. tnx .. > > Index: net/if.c > === > RCS file: /data/mirror/openbsd/cvs/src/sys/net/if.c,v > retrieving revision 1.640 > diff -u -p -r1.640 if.c > --- net/if.c 26 Mar 2021 22:41:06 - 1.640 > +++ net/if.c 21 Apr 2021 12:52:08 - > @@ -238,7 +238,7 @@ int ifq_congestion; > > int netisr; > > -#define NET_TASKQ 1 > +#define NET_TASKQ 4 > struct taskq *nettqmp[NET_TASKQ]; > > struct task if_input_task_locked = TASK_INITIALIZER(if_netisr, NULL); > @@ -834,10 +834,10 @@ if_input_process(struct ifnet *ifp, stru >* to PF globals, pipex globals, unicast and multicast addresses >* lists and the socket layer. >*/ > - NET_LOCK(); > + NET_RLOCK_IN_SOFTNET(); > while ((m = ml_dequeue(ml)) != NULL) > (*ifp->if_input)(ifp, m); > - NET_UNLOCK(); > + NET_RUNLOCK_IN_SOFTNET(); > } > > void > @@ -895,6 +895,12 @@ if_netisr(void *unused) > KERNEL_UNLOCK(); > } > #endif > + if (n & (1 << NETISR_IP)) > + ipintr(); > +#ifdef INET6 > + if (n & (1 << NETISR_IPV6)) > + ip6intr(); > +#endif > #if NPPP > 0 > if (n & (1 << NETISR_PPP)) { > KERNEL_LOCK(); > @@ -3316,12 +3322,15 @@ unhandled_af(int af) > * globals aren't ready to be accessed by multiple threads in > * parallel. > */ > -int nettaskqs = NET_TASKQ; > +int nettaskqs; > > struct taskq * > net_tq(unsigned int ifindex) > { > struct taskq *t = NULL; > + > + if (nettaskqs == 0) > + nettaskqs = min(NET_TASKQ, ncpus); > > t = nettqmp[ifindex % nettaskqs]; > > Index: net/if_ethersubr.c > === > RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_ethersubr.c,v > retrieving revision 1.274 > diff -u -p -r1.274 if_ethersubr.c > --- net/if_ethersubr.c7 Mar 2021 06:02:32 - 1.274 > +++ net/if_ethersubr.c21 Apr 2021 22:16:05 - > @@ -222,7 +222,10 @@ ether_resolve(struct ifnet *ifp, struct > > switch (af) { > case AF_INET: > + KERNEL_LOCK(); > + /* XXXSMP there is a MP race in arpresolve() */ > error = arpresolve(ifp, rt, m, dst, eh->ether_dhost); > + KERNEL_UNLOCK(); > if (error) > return (error); > eh->ether_type = htons(ETHERTYPE_IP); > @@ -245,7 +248,10 @@ ether_resolve(struct ifnet *ifp, struct > break; > #ifdef INET6 > case AF_INET6: > + KERNEL_LOCK(); > + /* XXXSMP there is a MP race in nd6_resolve() */ > error = nd6_resolve(ifp, rt, m, dst, eh->ether_dhost); > + KERNEL_UNLOCK(); > if (error) > return (error); > eh->ether_type = htons(ETHERTYPE_IPV6); > Index: net/ifq.c > === > RCS file: /data/mirror/openbsd/cvs/src/sys/net/ifq.c,v > retrieving revision 1.43 > diff -u -p -r1.43 ifq.c > --- net/ifq.c 20 Feb 2021 04:37:26 - 1.43 > +++ net/ifq.c 21 Apr 2021 13:12:44 - > @@ -243,7 +243,7 @@ void > ifq_init(struct ifqueue *ifq, struct ifnet *ifp, unsigned int idx) > { > ifq->ifq_if = ifp; > - ifq->ifq_softnet = net_tq(ifp->if_index); /* + idx */ > + ifq->ifq_softnet = net_tq(ifp->if_index + idx); > ifq->ifq_softc = NULL; > > mtx_init(>ifq_mtx, IPL_NET); > @@ -617,7 +617,7 @@ void > ifiq_init(struct ifiqueue *ifiq, struct ifnet *ifp, unsigned int idx) > { > ifiq->ifiq_if = ifp; > - ifiq->ifiq_softnet = net_tq(ifp->if_index); /* + idx */ > + ifiq->ifiq_softnet = net_tq(ifp->if_index + idx); > ifiq->ifiq_softc = NULL; > > mtx_init(>ifiq_mtx, IPL_NET); > Index: net/netisr.h > === > RCS file: /data/mirror/openbsd/cvs/src/sys/net/netisr.h,v > retrieving revision 1.55 > diff -u -p -r1.55 netisr.h > --- net/netisr.h 5 Jan 2021 20:43:36 - 1.55 > +++ net/netisr.h 21 Apr 2021 12:52:08 - > @@ -41,8 +41,10 @@ > * interrupt used for scheduling the network code to calls > * on the lowest level
Re: [External] : Re: running network stack forwarding in parallel
On Thu, Apr 22, 2021 at 11:36:07AM +0200, Hrvoje Popovski wrote: > On 22.4.2021. 11:02, Alexander Bluhm wrote: > > This was without my kernel lock around ARP bandage, right? > > yes, yes ... Good. Just wanted to be sure. > > Did you enter boot reboot before doing mach ddbcpu 0xa? > > nope... is doing that ever useful? No. I was looking at this strange trace. It is a bug in panic and ddb, you did nothing wrong. > > The if (db_panic) in the panic() function was not written with > > simultaneous panics on multiple CPUs in mind. > > if you want i'll try to reproduce in on other boxes.. > maybe i can trigger it here easily because of 2 sockets ? The panic and ddb bug is pretty clear. We just need an idea for the fix and someone who wants to fix it. But if you have more than one machine, it would be great to stress it with "network in parallel" and the "kernel lock around ARP bandage" diff. Attached together for convenience. It is not clear why the lock helps. Is it a bug in routing or ARP? Or is it just different timing introduced by the additional kernel lock? The parallel network task stress the subsystems of the kernel more than before with MP load. Having more results from machines with different timing would help. bluhm Index: net/if.c === RCS file: /data/mirror/openbsd/cvs/src/sys/net/if.c,v retrieving revision 1.640 diff -u -p -r1.640 if.c --- net/if.c26 Mar 2021 22:41:06 - 1.640 +++ net/if.c21 Apr 2021 12:52:08 - @@ -238,7 +238,7 @@ int ifq_congestion; int netisr; -#defineNET_TASKQ 1 +#defineNET_TASKQ 4 struct taskq *nettqmp[NET_TASKQ]; struct task if_input_task_locked = TASK_INITIALIZER(if_netisr, NULL); @@ -834,10 +834,10 @@ if_input_process(struct ifnet *ifp, stru * to PF globals, pipex globals, unicast and multicast addresses * lists and the socket layer. */ - NET_LOCK(); + NET_RLOCK_IN_SOFTNET(); while ((m = ml_dequeue(ml)) != NULL) (*ifp->if_input)(ifp, m); - NET_UNLOCK(); + NET_RUNLOCK_IN_SOFTNET(); } void @@ -895,6 +895,12 @@ if_netisr(void *unused) KERNEL_UNLOCK(); } #endif + if (n & (1 << NETISR_IP)) + ipintr(); +#ifdef INET6 + if (n & (1 << NETISR_IPV6)) + ip6intr(); +#endif #if NPPP > 0 if (n & (1 << NETISR_PPP)) { KERNEL_LOCK(); @@ -3316,12 +3322,15 @@ unhandled_af(int af) * globals aren't ready to be accessed by multiple threads in * parallel. */ -int nettaskqs = NET_TASKQ; +int nettaskqs; struct taskq * net_tq(unsigned int ifindex) { struct taskq *t = NULL; + + if (nettaskqs == 0) + nettaskqs = min(NET_TASKQ, ncpus); t = nettqmp[ifindex % nettaskqs]; Index: net/if_ethersubr.c === RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_ethersubr.c,v retrieving revision 1.274 diff -u -p -r1.274 if_ethersubr.c --- net/if_ethersubr.c 7 Mar 2021 06:02:32 - 1.274 +++ net/if_ethersubr.c 21 Apr 2021 22:16:05 - @@ -222,7 +222,10 @@ ether_resolve(struct ifnet *ifp, struct switch (af) { case AF_INET: + KERNEL_LOCK(); + /* XXXSMP there is a MP race in arpresolve() */ error = arpresolve(ifp, rt, m, dst, eh->ether_dhost); + KERNEL_UNLOCK(); if (error) return (error); eh->ether_type = htons(ETHERTYPE_IP); @@ -245,7 +248,10 @@ ether_resolve(struct ifnet *ifp, struct break; #ifdef INET6 case AF_INET6: + KERNEL_LOCK(); + /* XXXSMP there is a MP race in nd6_resolve() */ error = nd6_resolve(ifp, rt, m, dst, eh->ether_dhost); + KERNEL_UNLOCK(); if (error) return (error); eh->ether_type = htons(ETHERTYPE_IPV6); Index: net/ifq.c === RCS file: /data/mirror/openbsd/cvs/src/sys/net/ifq.c,v retrieving revision 1.43 diff -u -p -r1.43 ifq.c --- net/ifq.c 20 Feb 2021 04:37:26 - 1.43 +++ net/ifq.c 21 Apr 2021 13:12:44 - @@ -243,7 +243,7 @@ void ifq_init(struct ifqueue *ifq, struct ifnet *ifp, unsigned int idx) { ifq->ifq_if = ifp; - ifq->ifq_softnet = net_tq(ifp->if_index); /* + idx */ + ifq->ifq_softnet = net_tq(ifp->if_index + idx); ifq->ifq_softc = NULL; mtx_init(>ifq_mtx, IPL_NET); @@ -617,7 +617,7 @@ void ifiq_init(struct ifiqueue *ifiq, struct ifnet *ifp, unsigned int idx) { ifiq->ifiq_if = ifp; - ifiq->ifiq_softnet = net_tq(ifp->if_index); /* + idx */ + ifiq->ifiq_softnet =
Re: [External] : Re: running network stack forwarding in parallel
On 22.4.2021. 11:36, Hrvoje Popovski wrote: > if you want i'll try to reproduce in on other boxes.. > maybe i can trigger it here easily because of 2 sockets ? on second box with 6 x Intel(R) Xeon(R) CPU E5-2643 v2 @ 3.50GHz, 3600.02 MHz.. r620-1# papnpaiancini:cc :p :op opooolo_llc_ac_caccahhceh_ei_eti_tieetmme_mm__amgamigacigci__cc_hccehhcekcekc:: k :m bmubmfubfuppflp llc pc pcuup uf rfferree eel el iilsitss tm tom omddoidfiiifeifeidde:d ::i ti etietmme m a daddardd rd0 r0 xx0fxfddf88d08c0cc0c6c76afc9b3f04500400++01+61 610 6x0 fx0fxffdffdf88d08 00020720d72a8c0049703eb!ef!e==!0=x009x59x95995b9ebbaee3ae3ae344ef54f5a4bff7db07990a9 aa Stopped at db_enter+0x10: popq%rbp TIDPIDUID PRFLAGS PFLAGS CPU COMMAND 514514 85981 0 0x14000 0x2003 softnet *327946 47274 0 0x14000 0x2004 softnet 401838 53270 0 0x14000 0x2002 softnet db_enter() at db_enter+0x10 panic(81df3a9d) at panic+0x12a pool_cache_get(822fade8) at pool_cache_get+0x25b pool_get(822fade8,2) at pool_get+0x5e m_clget(0,2,802) at m_clget+0xdd ixgbe_get_buf(800699d0,2a) at ixgbe_get_buf+0xa3 ixgbe_rxfill(800699d0) at ixgbe_rxfill+0xae ixgbe_queue_intr(80024c80) at ixgbe_queue_intr+0x4f intr_handler(80002486ae00,8006fa00) at intr_handler+0x6e Xintr_ioapic_edge0_untramp() at Xintr_ioapic_edge0_untramp+0x18f srp_leave(80002486aed0) at srp_leave+0x11 if_get(2) at if_get+0x6f ip_output(fd80ccca9e00,0,80002486b130,1,0,0) at ip_output+0x393 ip_forward(fd80ccca9e00,80082048,fd83b39ada88,0) at ip_forward+ 0x261 end trace frame: 0x80002486b250, count: 0 https://www.openbsd.org/ddb.html describes the minimum info required in bug reports. Insufficient info makes it difficult to find and fix bugs. ddb{4}> show panic pool_cache_item_magic_check: mbufpl cpu free list modified: item addr 0xfd8 0ccca9300+16 0xfd80027dc47e!=0x959bea3e4ffab79a ddb{4}> trace db_enter() at db_enter+0x10 panic(81df3a9d) at panic+0x12a pool_cache_get(822fade8) at pool_cache_get+0x25b pool_get(822fade8,2) at pool_get+0x5e m_clget(0,2,802) at m_clget+0xdd ixgbe_get_buf(800699d0,2a) at ixgbe_get_buf+0xa3 ixgbe_rxfill(800699d0) at ixgbe_rxfill+0xae ixgbe_queue_intr(80024c80) at ixgbe_queue_intr+0x4f intr_handler(80002486ae00,8006fa00) at intr_handler+0x6e Xintr_ioapic_edge0_untramp() at Xintr_ioapic_edge0_untramp+0x18f srp_leave(80002486aed0) at srp_leave+0x11 if_get(2) at if_get+0x6f ip_output(fd80ccca9e00,0,80002486b130,1,0,0) at ip_output+0x393 ip_forward(fd80ccca9e00,80082048,fd83b39ada88,0) at ip_forward+0x261 ip_input_if(80002486b268,80002486b274,4,0,80082048) at ip_input_if+0x608 ipv4_input(80082048,fd80ccca9e00) at ipv4_input+0x39 if_input_process(80082048,80002486b2e8) at if_input_process+0x6f ifiq_process(80086200) at ifiq_process+0x69 taskq_thread(80030100) at taskq_thread+0x9f end trace frame: 0x0, count: -19 ddb{4}> show locks shared rwlock netlock r = 0 (0x82101bd8) #0 witness_lock+0x339 #1 if_input_process+0x43 #2 ifiq_process+0x69 #3 taskq_thread+0x9f #4 proc_trampoline+0x1c shared rwlock softnet r = 0 (0x80030170) #0 witness_lock+0x339 #1 taskq_thread+0x92 #2 proc_trampoline+0x1c ddb{4}> show all locks Process 53626 (ld) thread 0x800024972008 (479301) exclusive rrwlock inode r = 0 (0xfd842e8d92c0) #0 witness_lock+0x339 #1 rw_enter+0x286 #2 rrw_enter+0x56 #3 VOP_LOCK+0x33 #4 vn_lock+0x84 #5 uvn_get+0x15f #6 uvm_fault_lower+0x2ae #7 uvm_fault+0x19e #8 upageflttrap+0x69 #9 usertrap+0x18b #10 recall_trap+0x8 Process 47274 (softnet) thread 0x8000e2a0 (327946) shared rwlock netlock r = 0 (0x82101bd8) #0 witness_lock+0x339 #1 if_input_process+0x43 #2 ifiq_process+0x69 #3 taskq_thread+0x9f #4 proc_trampoline+0x1c shared rwlock softnet r = 0 (0x80030170) #0 witness_lock+0x339 #1 taskq_thread+0x92 #2 proc_trampoline+0x1c Process 53270 (softnet) thread 0x8000e540 (401838) shared rwlock netlock r = 0 (0x82101bd8) #0 witness_lock+0x339 #1 if_input_process+0x43 #2 ifiq_process+0x69 #3 taskq_thread+0x9f #4 proc_trampoline+0x1c shared rwlock softnet r = 0 (0x80030070) #0 witness_lock+0x339 #1 taskq_thread+0x92 #2 proc_trampoline+0x1c ddb{4}> ps PID TID PPIDUID S FLAGS WAIT COMMAND 53626 479301 83760 0 3 0x2 biowait ld 83760 75359 86199 0 30x10008a sigsusp sh 86199 465340 39220 0 30x10008a sigsusp make 28046 189992 1 0 30x100083 ttyin ksh 39220 374653 1 0 30x10008b sigsusp ksh 4909 380794 1 0 30x100098 poll cron 38969 26279 39918720 3
Re: tmpfs & UVM aobj
Am Thu, Apr 22, 2021 at 11:19:22AM +0200 schrieb Martin Pieuchot: > uao_shrink() and uao_grow() are only used by TMPFS, ok to place them > under an #ifdef? This save some bytes on RAMDISKs. sure, ok patrick@ > Index: uvm/uvm_aobj.c > === > RCS file: /cvs/src/sys/uvm/uvm_aobj.c,v > retrieving revision 1.94 > diff -u -p -r1.94 uvm_aobj.c > --- uvm/uvm_aobj.c31 Mar 2021 08:53:39 - 1.94 > +++ uvm/uvm_aobj.c22 Apr 2021 09:00:27 - > @@ -416,6 +416,7 @@ uao_free(struct uvm_aobj *aobj) > * pager functions > */ > > +#ifdef TMPFS > /* > * Shrink an aobj to a given number of pages. The procedure is always the > same: > * assess the necessity of data structure conversion (hash to array), secure > @@ -692,6 +693,7 @@ uao_grow(struct uvm_object *uobj, int pa > else > return uao_grow_convert(uobj, pages); > } > +#endif /* TMPFS */ > > /* > * uao_create: create an aobj of the given size and return its uvm_object. >
Re: [External] : Re: running network stack forwarding in parallel
On 22.4.2021. 11:02, Alexander Bluhm wrote: > On Thu, Apr 22, 2021 at 09:03:22AM +0200, Hrvoje Popovski wrote: >> something like this: >> >> x3550m4# pappnaiannc:iicc :p:o ppoolo_oolcla__ddcohoe__gg_eiettt::e m >> _mmcbmualg2fkpilc2_:: chppeaag >> gceke: ee mmmbppttuyfyp > > This was without my kernel lock around ARP bandage, right? yes, yes ... > >> ddb{9}> mach ddbcpu 0xa >> Stopped at x86_ipi_db+0x12:leave >> x86_ipi_db(800021a2aff0) at x86_ipi_db+0x12 >> x86_ipi_handler() at x86_ipi_handler+0x80 >> Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23 >> pool_get(8221e568,2) at pool_get+0x43 >> m_gethdr(2,1) at m_gethdr+0x3f >> rtm_msg1(e,800026e3cf70) at rtm_msg1+0x4c >> rtm_ifchg(805b3800) at rtm_ifchg+0x61 >> if_down(805b3800) at if_down+0xa0 >> if_downall() at if_downall+0x5b >> boot(104) at boot+0x99 >> reboot(104) at reboot+0x5b >> panic(81df855b) at panic+0x132 >> pool_do_get(8221ebc8,2,800026e3d294) at pool_do_get+0x309 >> pool_get(8221ebc8,2) at pool_get+0x95 >> end trace frame: 0x800026e3d340, count: 0 >> >> ddb{10}> mach ddbcpu 0xb >> Stopped at db_enter+0x10: popq%rbp >> db_enter() at db_enter+0x10 >> panic(81df855b) at panic+0x12a >> pool_do_get(8221e568,2,800026e43294) at pool_do_get+0x309 >> pool_get(8221e568,2) at pool_get+0x95 >> m_clget(0,2,802) at m_clget+0xdd >> ixgbe_get_buf(8015c0e8,e) at ixgbe_get_buf+0xa3 >> ixgbe_rxfill(8015c0e8) at ixgbe_rxfill+0xae >> ixgbe_queue_intr(8011ac40) at ixgbe_queue_intr+0x4f >> intr_handler(800026e434b0,800cd700) at intr_handler+0x6e >> Xintr_ioapic_edge4_untramp() at Xintr_ioapic_edge4_untramp+0x18f >> acpicpu_idle() at acpicpu_idle+0x1ea >> sched_idle(800021a33ff0) at sched_idle+0x27e >> end trace frame: 0x0, count: 3 > > Two processors 10 and 11 in pool get. > > CPU 10 does pool_get, panic, boot, pool_get again. > CPU 11 was the one that originally stopped in ddb. > > Did you enter boot reboot before doing mach ddbcpu 0xa? nope... is doing that ever useful? > Or how did we get the boot sequence in this trace? > > Can it be that both CPU paniced simultaeously? The mangled massage > indicates this. Then cpu 10 saw that cpu 11 already paniced to ddb > and tried to reboot. There it paniced again and got stuck in a > recursive call to pool_get(). > > The if (db_panic) in the panic() function was not written with > simultaneous panics on multiple CPUs in mind. if you want i'll try to reproduce in on other boxes.. maybe i can trigger it here easily because of 2 sockets ?
tmpfs & UVM aobj
uao_shrink() and uao_grow() are only used by TMPFS, ok to place them under an #ifdef? This save some bytes on RAMDISKs. Index: uvm/uvm_aobj.c === RCS file: /cvs/src/sys/uvm/uvm_aobj.c,v retrieving revision 1.94 diff -u -p -r1.94 uvm_aobj.c --- uvm/uvm_aobj.c 31 Mar 2021 08:53:39 - 1.94 +++ uvm/uvm_aobj.c 22 Apr 2021 09:00:27 - @@ -416,6 +416,7 @@ uao_free(struct uvm_aobj *aobj) * pager functions */ +#ifdef TMPFS /* * Shrink an aobj to a given number of pages. The procedure is always the same: * assess the necessity of data structure conversion (hash to array), secure @@ -692,6 +693,7 @@ uao_grow(struct uvm_object *uobj, int pa else return uao_grow_convert(uobj, pages); } +#endif /* TMPFS */ /* * uao_create: create an aobj of the given size and return its uvm_object.
Re: [External] : Re: running network stack forwarding in parallel
On Thu, Apr 22, 2021 at 09:03:22AM +0200, Hrvoje Popovski wrote: > something like this: > > x3550m4# pappnaiannc:iicc :p:o ppoolo_oolcla__ddcohoe__gg_eiettt::e m > _mmcbmualg2fkpilc2_:: chppeaag > gceke: ee mmmbppttuyfyp This was without my kernel lock around ARP bandage, right? > ddb{9}> mach ddbcpu 0xa > Stopped at x86_ipi_db+0x12:leave > x86_ipi_db(800021a2aff0) at x86_ipi_db+0x12 > x86_ipi_handler() at x86_ipi_handler+0x80 > Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23 > pool_get(8221e568,2) at pool_get+0x43 > m_gethdr(2,1) at m_gethdr+0x3f > rtm_msg1(e,800026e3cf70) at rtm_msg1+0x4c > rtm_ifchg(805b3800) at rtm_ifchg+0x61 > if_down(805b3800) at if_down+0xa0 > if_downall() at if_downall+0x5b > boot(104) at boot+0x99 > reboot(104) at reboot+0x5b > panic(81df855b) at panic+0x132 > pool_do_get(8221ebc8,2,800026e3d294) at pool_do_get+0x309 > pool_get(8221ebc8,2) at pool_get+0x95 > end trace frame: 0x800026e3d340, count: 0 > > ddb{10}> mach ddbcpu 0xb > Stopped at db_enter+0x10: popq%rbp > db_enter() at db_enter+0x10 > panic(81df855b) at panic+0x12a > pool_do_get(8221e568,2,800026e43294) at pool_do_get+0x309 > pool_get(8221e568,2) at pool_get+0x95 > m_clget(0,2,802) at m_clget+0xdd > ixgbe_get_buf(8015c0e8,e) at ixgbe_get_buf+0xa3 > ixgbe_rxfill(8015c0e8) at ixgbe_rxfill+0xae > ixgbe_queue_intr(8011ac40) at ixgbe_queue_intr+0x4f > intr_handler(800026e434b0,800cd700) at intr_handler+0x6e > Xintr_ioapic_edge4_untramp() at Xintr_ioapic_edge4_untramp+0x18f > acpicpu_idle() at acpicpu_idle+0x1ea > sched_idle(800021a33ff0) at sched_idle+0x27e > end trace frame: 0x0, count: 3 Two processors 10 and 11 in pool get. CPU 10 does pool_get, panic, boot, pool_get again. CPU 11 was the one that originally stopped in ddb. Did you enter boot reboot before doing mach ddbcpu 0xa? Or how did we get the boot sequence in this trace? Can it be that both CPU paniced simultaeously? The mangled massage indicates this. Then cpu 10 saw that cpu 11 already paniced to ddb and tried to reboot. There it paniced again and got stuck in a recursive call to pool_get(). The if (db_panic) in the panic() function was not written with simultaneous panics on multiple CPUs in mind. bluhm
Re: [External] : Re: running network stack forwarding in parallel
On 22.4.2021. 1:10, Hrvoje Popovski wrote: > On 22.4.2021. 0:31, Alexandr Nedvedicky wrote: >> Hello, >> >> Hi, with this diff i'm getting panic when i'm pushing traffic over that box. This is plain forwarding. To compile with witness ? >>> >>> >>> with witness >>> >> >> any chance to check other CPUs to see what code they are executing? >> I hope to be lucky enough and see thread, which could corrupt the memory. > > > no problem, will do that tomorrow > something like this: x3550m4# pappnaiannc:iicc :p:o ppoolo_oolcla__ddcohoe__gg_eiettt::e m _mmcbmualg2fkpilc2_:: chppeaag gceke: ee mmmbppttuyfyp l Stopped at db_enter+0x10: popq%rbp TIDPIDUID PRFLAGS PFLAGS CPU COMMAND 110314 50947 0 0x14000 0x2002 softnet 370908 43261 0 0x14000 0x2003 softnet 271599 79025 0 0x14000 0x2000 softnet 121483 71835 0 0x14000 0x2001 softnet db_enter() at db_enter+0x10 panic(81df855b) at panic+0x12a pool_do_get(8221e568,2,800026e43294) at pool_do_get+0x309 pool_get(8221e568,2) at pool_get+0x95 m_clget(0,2,802) at m_clget+0xdd ixgbe_get_buf(8015c0e8,e) at ixgbe_get_buf+0xa3 ixgbe_rxfill(8015c0e8) at ixgbe_rxfill+0xae ixgbe_queue_intr(8011ac40) at ixgbe_queue_intr+0x4f intr_handler(800026e434b0,800cd700) at intr_handler+0x6e Xintr_ioapic_edge4_untramp() at Xintr_ioapic_edge4_untramp+0x18f acpicpu_idle() at acpicpu_idle+0x1ea sched_idle(800021a33ff0) at sched_idle+0x27e end trace frame: 0x0, count: 3 https://www.openbsd.org/ddb.html describes the minimum info required in bug reports. Insufficient info makes it difficult to find and fix bugs. ddb{11}> ddb{11}> show panic pool_cache_item_magic_check: mbufpl cpu free list modified: item addr 0xfd8 066b54500+16 0xfd80792f4bbe!=0xb49ab28a8a25d432 ddb{11}> trace db_enter() at db_enter+0x10 panic(81df855b) at panic+0x12a pool_do_get(8221e568,2,800026e43294) at pool_do_get+0x309 pool_get(8221e568,2) at pool_get+0x95 m_clget(0,2,802) at m_clget+0xdd ixgbe_get_buf(8015c0e8,e) at ixgbe_get_buf+0xa3 ixgbe_rxfill(8015c0e8) at ixgbe_rxfill+0xae ixgbe_queue_intr(8011ac40) at ixgbe_queue_intr+0x4f intr_handler(800026e434b0,800cd700) at intr_handler+0x6e Xintr_ioapic_edge4_untramp() at Xintr_ioapic_edge4_untramp+0x18f acpicpu_idle() at acpicpu_idle+0x1ea sched_idle(800021a33ff0) at sched_idle+0x27e end trace frame: 0x0, count: -12 ddb{11}> show all locks Process 59777 (ld) thread 0x800026ea97a8 (7337) exclusive rrwlock inode r = 0 (0xfd887e57bc58) #0 witness_lock+0x339 #1 rw_enter+0x286 #2 rrw_enter+0x56 #3 VOP_LOCK+0x33 #4 vn_lock+0x84 #5 uvn_get+0x15f #6 uvm_fault_lower+0x2ae #7 uvm_fault+0x19e #8 upageflttrap+0x69 #9 usertrap+0x18b #10 recall_trap+0x8 Process 50947 (softnet) thread 0x800026dc8a80 (110314) shared rwlock netlock r = 0 (0x820e0ea8) #0 witness_lock+0x339 #1 if_input_process+0x43 #2 ifiq_process+0x69 #3 taskq_thread+0x9f #4 proc_trampoline+0x1c shared rwlock softnet r = 0 (0x80030370) #0 witness_lock+0x339 #1 taskq_thread+0x92 #2 proc_trampoline+0x1c ddb{11}> ps PID TID PPIDUID S FLAGS WAIT COMMAND 597777337 5166 0 3 0x2 biowait ld 5166 135990 73189 0 30x10008a sigsusp sh 73189 29786 50754 0 30x10008a sigsusp make 19125584 1 0 30x100083 ttyin ksh 50754 423311 1 0 30x10008b sigsusp ksh 41817 523177 1 0 30x100098 poll cron 55124 161629 41428720 30x90 kqreadlldpd 41428 491873 1 0 30x80 netio lldpd 1877 283481 6199 95 30x100092 kqreadsmtpd 65700 262335 6199103 30x100092 kqreadsmtpd 7456 289409 6199 95 30x100092 kqreadsmtpd 4455 333406 6199 95 30x100092 kqreadsmtpd 11921 364120 6199 95 30x100092 kqreadsmtpd 28535 449450 6199 95 30x100092 kqreadsmtpd 6199 140680 1 0 30x100080 kqreadsmtpd 69702 89036 1 0 30x80 selectsshd 98630 388346 1 0 30x100080 poll ntpd 65290 297654 68128 83 30x100092 poll ntpd 68128 253220 1 83 30x100092 poll ntpd 8458 234468 4179 73 30x100090 kqreadsyslogd 4179 69049 1 0 30x100082 netio syslogd 25880 194929 0 0 3 0x14200 bored smr 22006 336506 0 0 3 0x14200 pgzerozerothread 26528 405700 0 0 3 0x14200 aiodoned aiodoned 9962 288230 0 0 3 0x14200 syncerupdate 77445 330499