> Date: Sat, 1 Oct 2022 20:43:31 +0200
> From: Hrvoje Popovski <hrv...@srce.hr>
>
> Hi all,
>
> I have 3 machines where I can reproduce problem in subject.
> Dell R6515 16 core with 4 mcx (64 queues). One mcx have 16 queues. If I
> lower number of cpus to 12 which lowers number of queues for mcx to 12,
> everything is fine.
> Supermicro 24 core with 2 bnxt, 2 mcx and 4 ix (112 queues). bnxt have 8
> queues, mcx and ix 16 queues.
> IBM 12 core with 2 ix and 4 em (56 queues). On this box I have jmatthew@
> multiqueue diff for em(4) and em have 8 queues and ix 12 queues.
>
> On all those machines if I enable all interfaces machines freeze or panic.
>
> In attachment you can find vmstat -iz and dmesg from those machines.
>
> I would like to utilize all 16 core for my new dell firewalls :)

At least on some of these machines, you're simply running out of kernel
malloc space. The machines "hang" because the M_WAITOK flag is used for
the allocations, and malloc(9) waits in the hope someone else gives up
the memory. Maybe we need to allow for more malloc space. But to be
sure this isn't the drivers being completely stupid, vmstat -m output
would be helpful.

As for interrupts, the x86 architecture only has 256 interrupt vectors
(of which some are actually reserved for exception vectors). And the
way we prioritize interrupts means we (probably) can't use all of the
remaining vectors. We currently treat them as a global resource, but
probably need to treat them more as a per-CPU resource such that we can
re-use vectors for different purposes on different CPUs. Fixing that
isn't an easy task though.

> Dell
> When frozen I can't drop to ddb over ipmi console only over idrac vga.
> Here link to ddb output
>
> https://kosjenka.srce.hr/~hrvoje/openbsd/dell_mcx1.jpg
> https://kosjenka.srce.hr/~hrvoje/openbsd/dell_mcx2.jpg
> https://kosjenka.srce.hr/~hrvoje/openbsd/dell_mcx3.jpg
>
>
>
>
> Supermicro
>
> if last enabled interface is ix then i'm getting log below
> smc24# ifconfig ix3 up
> ix3: Unable to create TX DMA map
> ix3: Could not setup transmit structures
>
>
> if last enabled interface is mcx then box freeze
> smc24# ifconfig mcx1 up
>
> ~B [send break]
> Stopped at db_enter+0x10: popq %rbp
> ddb{0}> trace
> db_enter() at db_enter+0x10
> comintr(ffff800000278000) at comintr+0x2de
> intr_handler(ffff800026d1a270,ffff800000267980) at intr_handler+0x6e
> Xintr_ioapic_edge17_untramp() at Xintr_ioapic_edge17_untramp+0x18f
> acpicpu_idle() at acpicpu_idle+0x11f
> sched_idle(ffffffff822cbff0) at sched_idle+0x280
> end trace frame: 0x0, count: -6
>
> ddb{0}> ps
> PID TID PPID UID S FLAGS WAIT COMMAND
> 33034 218257 24243 0 3 0x3 devbuf ifconfig
>
> ddb{0}> trace /t 0t218257
> sleep_finish(ffff800026e40880,1) at sleep_finish+0xfe
> msleep(ffffffff82376190,ffffffff822be2e0,2,ffffffff81f42147,0) at msleep+0xc7
> malloc(1d8,2,9) at malloc+0x323
> _bus_dmamap_create(ffffffff822d87b8,251c,d,251c,0,2002,f2c29ad4e18b508f) at _bus_dmamap_create+0x53
> mcx_up(ffff800000323000) at mcx_up+0x55c
> mcx_ioctl(ffff800000323048,80206910,ffff800026e40fb0) at mcx_ioctl+0x5f4
> ifioctl(fffffd904d7b0ab8,80206910,ffff800026e40fb0,ffff800026e2f510) at ifioctl+0x7bf
> soo_ioctl(fffffd8e5bd46440,80206910,ffff800026e40fb0,ffff800026e2f510) at soo_ioctl+0x171
> sys_ioctl(ffff800026e2f510,ffff800026e410c0,ffff800026e41120) at sys_ioctl+0x2c4
> syscall(ffff800026e41190) at syscall+0x384
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x7f7ffffde0b0, count: -11
>
>
> interesting log in dmesg is:
>
> xhci3 at pci23 dev 0 function 3 vendor "AMD", unknown product 0x148c rev 0x00failed to allocate interrupt slot for PIC msi pin -2143091968
> : couldn't establish interrupt at msi
> ahci2 at pci24 dev 0 function 0 "AMD FCH AHCI" rev 0x51: msi,failed to allocate interrupt slot for PIC msi pin -2143027200
> ahci2: unable to map interrupt
> ahci3 at pci25 dev 0 function 0 "AMD FCH AHCI" rev 0x51: msi,failed to allocate interrupt slot for PIC msi pin -2142961664
> ahci3: unable to map interrupt
>
>
>
>
> IBM
>
> if last enabled interface is ix then box panic
> x3550m4# ifconfig ix0 up
> ix0: Unable to create Pack DMA map
> uvm_fault(0xfffffd887f12cbb0, 0xc, 0, 1) -> e
> kernel: page fault trap, code=0
> Stopped at _bus_dmamap_destroy+0x9: movl 0xc(%rsi),%eax
> TID PID UID PRFLAGS PFLAGS CPU COMMAND
> *474869 73471 0 0x3 0 1K ifconfig
> _bus_dmamap_destroy(ffffffff822d5a90,0) at _bus_dmamap_destroy+0x9
> ixgbe_free_receive_buffers(ffff8000000e9910) at ixgbe_free_receive_buffers+0xb2
> ixgbe_init(ffff8000000e7000) at ixgbe_init+0x778
> ixgbe_ioctl(ffff8000000e7048,80206910,ffff800021c356e0) at ixgbe_ioctl+0x327
> ifioctl(fffffd887c9d3008,80206910,ffff800021c356e0,ffff800021c50a88) at ifioctl+0x7bf
> soo_ioctl(fffffd87851d0c40,80206910,ffff800021c356e0,ffff800021c50a88) at soo_ioctl+0x171
> sys_ioctl(ffff800021c50a88,ffff800021c357f0,ffff800021c35850) at sys_ioctl+0x2c4
> syscall(ffff800021c358c0) at syscall+0x384
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x7f7ffffbd7a0, count: 6
> https://www.openbsd.org/ddb.html describes the minimum info required in
> bug reports. Insufficient info makes it difficult to find and fix bugs.
>
>
> if last enabled interface is em then i'm getting log below
> x3550m4# ifconfig em3 up
> em3: Unable to create TX DMA map
> em3: Could not setup transmit structures