I guess it’s a known issue caused by ifioctl() races.

> On 5 Aug 2020, at 19:06, Sven F. <sven.falem...@gmail.com> wrote:
>
> On Wed, Aug 5, 2020 at 9:14 AM Sven F. <sven.falem...@gmail.com> wrote:
>>
>> Never seen before crash ( 6.7 stable )
>>
>> My devices run a lot of things in, load is easily 4
>> which is good for breaking lock code ?
>>
>> uvm_fault(0xfffffd820a916cc0, 0x8, 0, 1) -> e
>> kernel: page fault trap, code=0
>> Stopped at bridge_brlconf+0x24: movq 0x8(%rdi),%rax
>> ddb{1}> bridge_brlconf(0,ffff800022e482b0) at bridge_brlconf+0x24
>> bridge_ioctl(ffff800000524000,c030694f,ffff800022e482b0) at bridge_ioctl+0x34f
>> ifioctl(fffffd8118f33c98,c030694f,ffff800022e482b0,ffff800022895510) at ifioctl+0xa03
>> soo_ioctl(fffffd814c9cf2e8,c030694f,ffff800022e482b0,ffff800022895510) at soo_ioctl+0x171
>> sys_ioctl(ffff800022895510,ffff800022e483c0,ffff800022e48420) at sys_ioctl+0x2df
>> syscall(ffff800022e48490) at syscall+0x389
>> Xsyscall() at Xsyscall+0x128
>> end of kernel
>> end trace frame: 0x7f7ffffd2c30, count: -7
>> ddb{1}>    PID     TID   PPID  UID  S     FLAGS  WAIT    COMMAND
>>          93583  108149    190    0  3       0x2  smrbar  ifconfig
>>          91803  290614  22775    0  2         0          perl
>>          10793  382902  22775    0  2         0          perl
>>          83906   60674  57936    0  2       0x2          ifconfig
>>         *80202  204044  63585    0  7       0x2          ifconfig
>>          63585    7414  22775    0  3      0x80  piperd  perl
>>          57936  439878  22775    0  3      0x80  piperd  perl
>>          70874   88156  22775    0  7         0          perl
>>          72682  513697      1    0  3      0x80  poll    openvpn
>>          84518  279046  22775    0  2         0          perl
>>          55262   52512  22775    0  2         0          perl
>>          36158  256298  22775    0  2         0          perl
>>          84969  398264      1    0  3      0x80  poll    openvpn
>>          31484  239479      1    0  3      0x80  poll    openvpn
>>           7902   84087  60669    0  3      0x82  netio   sshd
>>          25285  366282      1    0  3      0x80  poll    openvpn
>>          96838  424361      1    0  3      0x80  poll    openvpn
>>          42763  368876      1    0  3      0x80  poll    openvpn
>>          92032  243887      1    0  3      0x80  poll    openvpn
>>          22775  200805  21119    0  2       0x2          perl
>>          21119  407468  75737    0  3  0x10008a  pause   sh
>> ddb{1}> rebooting...
>> OpenBSD 6.7-stable (GENERIC.MP) #43: Tue Jul 28 21:46:24 EDT 2020
>> root@builder:/sys/arch/amd64/compile/GENERIC.MP
>> real mem = 8371683328 (7983MB)
>> avail mem = 8105316352 (7729MB)
>> mpath0 at root
>> scsibus0 at mpath0: 256 targets
>> mainbus0 at root
>> bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf6830 (11 entries)
>> bios0: vendor SeaBIOS version "2:1.10.2-58953eb7" date 04/01/2014
>> bios0: OpenStack Foundation OpenStack Nova
>> acpi0 at bios0: ACPI 1.0
>> acpi0: sleep states S3 S4 S5
>> acpi0: tables DSDT FACP APIC
>> acpi0: wakeup devices
>> acpitimer0 at acpi0: 3579545 Hz, 24 bits
>> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
>> cpu0 at mainbus0: apid 0 (boot processor)
>> cpu0: Intel Core Processor (Haswell, no TSX), 2394.80 MHz, 06-3c-01
>> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,RDTSCP,LONG,LAHF,ABM,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,ARAT,XSAVEOPT,MELTDOWN
>> cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
>> cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
>> cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
>> cpu0: smt 0, core 0, package 0
>> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
>> cpu0: apic clock running at 1000MHz
>> cpu1 at mainbus0: apid 1 (application processor)
>> cpu1: Intel Core Processor (Haswell, no TSX), 2394.56 MHz, 06-3c-01
>> cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,VMX,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,RDTSCP,LONG,LAHF,ABM,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,ARAT,XSAVEOPT,MELTDOWN
>> cpu1: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
>> cpu1: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
>> cpu1: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
>> cpu1: smt 0, core 0, package 1
>> ioapic0 at mainbus0: apid 0 pa 0xfec00000, version 11, 24 pins
>> acpiprt0 at acpi0: bus 0 (PCI0)
>> acpicpu0 at acpi0: C1(@1 halt!)
>> acpicpu1 at acpi0: C1(@1 halt!)
>> "ACPI0006" at acpi0 not configured
>> acpipci0 at acpi0 PCI0: _OSC failed
>> acpicmos0 at acpi0
>> "PNP0A06" at acpi0 not configured
>> "PNP0A06" at acpi0 not configured
>> "PNP0A06" at acpi0 not configured
>> "QEMU0002" at acpi0 not configured
>> "ACPI0010" at acpi0 not configured
>> cpu0: using VERW MDS workaround
>> pvbus0 at mainbus0: KVM
>> pvclock0 at pvbus0
>> pci0 at mainbus0 bus 0
>> pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
>> pcib0 at pci0 dev 1 function 0 "Intel 82371SB ISA" rev 0x00
>> pciide0 at pci0 dev 1 function 1 "Intel 82371SB IDE" rev 0x00: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
>> pciide0: channel 0 disabled (no drives)
>> pciide0: channel 1 disabled (no drives)
>> uhci0 at pci0 dev 1 function 2 "Intel 82371SB USB" rev 0x01: apic 0 int 11
>> piixpm0 at pci0 dev 1 function 3 "Intel 82371AB Power" rev 0x03: apic 0 int 9
>> iic0 at piixpm0
>> vga1 at pci0 dev 2 function 0 "Cirrus Logic CL-GD5446" rev 0x00
>> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
>> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
>> virtio0 at pci0 dev 3 function 0 "Qumranet Virtio Network" rev 0x00
>> vio0 at virtio0: address fa:16:3e:d5:60:9d
>> virtio0: msix shared
>> virtio1 at pci0 dev 4 function 0 "Qumranet Virtio Storage" rev 0x00
>> vioblk0 at virtio1
>> scsibus1 at vioblk0: 2 targets
>> sd0 at scsibus1 targ 0 lun 0: <VirtIO, Block Device, >
>> sd0: 40960MB, 512 bytes/sector, 83886080 sectors
>> virtio1: msix shared
>> virtio2 at pci0 dev 5 function 0 "Qumranet Virtio Memory Balloon" rev 0x00
>> viomb0 at virtio2
>> virtio2: apic 0 int 10
>> isa0 at pcib0
>> isadma0 at isa0
>> fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
>> com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
>> com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
>> pckbc0 at isa0 port 0x60/5 irq 1 irq 12
>> pckbd0 at pckbc0 (kbd slot)
>> wskbd0 at pckbd0: console keyboard, using wsdisplay0
>> pms0 at pckbc0 (aux slot)
>> wsmouse0 at pms0 mux 0
>> pcppi0 at isa0 port 0x61
>> spkr0 at pcppi0
>> usb0 at uhci0: USB revision 1.0
>> uhub0 at usb0 configuration 1 interface 0 "Intel UHCI root hub" rev 1.00/1.00 addr 1
>> vmm0 at mainbus0: VMX/EPT (using slow L1TF mitigation)
>> uhidev0 at uhub0 port 1 configuration 1 interface 0 "QEMU QEMU USB Tablet" rev 2.00/0.00 addr 2
>> uhidev0: iclass 3/0
>> ums0 at uhidev0: 3 buttons, Z dir
>> wsmouse1 at ums0 mux 0
>> vscsi0 at root
>> scsibus2 at vscsi0: 256 targets
>> softraid0 at root
>> scsibus3 at softraid0: 256 targets
>> root on sd0a (32f861a76884732c.a) swap on sd0b dump on sd0b
>> WARNING: / was not properly unmounted
>> fd0 at fdc0 drive 1: density unknown
>>
>> I can patch and test new kernel ( over stable not current ) for this
>> but it can takes *hours* before a problem can be triggers :-(
>>
>
> +0x24 is at /usr/src/sys/net/bridgectl.c:612
>
>         SIMPLEQ_FOREACH(n, &bif->bif_brlout, brl_next) {
>                 total++;
>         }
>
> bridge_brlconf():
> /usr/src/sys/net/bridgectl.c:611
>     1120: 4c 8b 1d 00 00 00 00    mov    0(%rip),%r11    # 1127 <bridge_brlconf+0x7>
>             1123: R_X86_64_PC32 __retguard_2528+0xfffffffffffffffc
>     1127: 4c 33 1c 24             xor    (%rsp),%r11
>     112b: 55                      push   %rbp
>     112c: 48 89 e5                mov    %rsp,%rbp
>     112f: 57                      push   %rdi
>     1130: 56                      push   %rsi
>     1131: 41 53                   push   %r11
>     1133: 41 57                   push   %r15
>     1135: 41 56                   push   %r14
>     1137: 41 55                   push   %r13
>     1139: 41 54                   push   %r12
>     113b: 53                      push   %rbx
>     113c: 48 83 ec 30             sub    $0x30,%rsp
>     1140: 48 89 75 b0             mov    %rsi,0xffffffffffffffb0(%rbp)
> /usr/src/sys/net/bridgectl.c:612
>     1144: 48 8b 47 08             mov    0x8(%rdi),%rax
>     1148: 48 89 45 98             mov    %rax,0xffffffffffffff98(%rbp)
>     114c: 48 89 7d b8             mov    %rdi,0xffffffffffffffb8(%rbp)
>     1150: 48 8b 47 18             mov    0x18(%rdi),%rax
>     1154: 45 31 ed                xor    %r13d,%r13d
> /usr/src/sys/net/bridgectl.c:618
>
> --
> --
> ---------------------------------------------------------------------------------------------------------------------
> Knowing is not enough; we must apply. Willing is not enough; we must do