On 3.1.2024. 7:51, Jonathan Matthew wrote:
> On Wed, Jan 03, 2024 at 01:50:06AM +0100, Alexander Bluhm wrote:
>> On Wed, Jan 03, 2024 at 12:26:26AM +0100, Hrvoje Popovski wrote:
>>> While testing kettenis@ ipl diff from tech@ and doing iperf3 to bnxt
>>> interface and ifconfig bnxt0 down/up at the same time I can trigger
>>> panic. Panic can be triggered without kettenis@ diff...
>> It is easy to reproduce. ifconfig bnxt1 down/up a few times while
>> receiving TCP traffic with iperf3. Machine still has kettenis@ diff.
>> My panic looks different.
> It looks like I wasn't trying very hard when I wrote bnxt_down().
> I think there's also a problem with bnxt_up() unwinding after failure
> in various places, but that's a different issue.
>
> This makes it more resilient for me, though it still logs
> 'bnxt0: unexpected completion type 3' a lot if I take the interface
> down while it's in use. I'll look at that separately.
Hi,

with this diff I can still panic the box with ifconfig up/down, but not as
quickly as without it.

Panic with the diff applied:
bnxt0: HWRM_RING_ALLOC command returned RESOURCE_ALLOC_ERROR error.
bnxt0: failed to set up tx ring
uvm_fault(0xfffffd8e57e02460, 0xff0, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at bnxt_queue_down+0x62: movq 0(%r12,%rax,1),%rsi
TID PID UID PRFLAGS PFLAGS CPU COMMAND
* 70181 53204 0 0x3 0 0K ifconfig
bnxt_queue_down(ffff8000002c9000,ffff8000002c9f88) at bnxt_queue_down+0x62
bnxt_up(ffff8000002c9000) at bnxt_up+0x36b
bnxt_ioctl(ffff8000002c9048,80206910,ffff8000607fffd0) at bnxt_ioctl+0x162
ifioctl(fffffd8e417ab758,80206910,ffff8000607fffd0,ffff800060797aa8) at ifioctl+0x726
sys_ioctl(ffff800060797aa8,ffff8000608000d0,ffff800060800120) at sys_ioctl+0x2af
syscall(ffff800060800190) at syscall+0x3b4
Xsyscall() at Xsyscall+0x128
end of kernel
end trace frame: 0x7e3d0a930430, count: 8
https://www.openbsd.org/ddb.html describes the minimum info required in
bug reports. Insufficient info makes it difficult to find and fix bugs.
ddb{0}> show reg
rdi 0xffffffff8244b950 pci_bus_dma_tag
rsi 0xffff8000002c9f88
rbp 0xffff8000607ffe40
rbx 0x101
rdx 0xc800000000030000
rcx 0x206
rax 0xff0
r8 0x3f
r9 0
r10 0xa14b312597c5ea6a
r11 0xffffffff819fac40 _bus_dmamap_destroy
r12 0
r13 0x100
r14 0xffff8000002c9f88
r15 0xffff8000002c9000
rip 0xffffffff81b578e2 bnxt_queue_down+0x62
cs 0x8
rflags 0x10216 __ALIGN_SIZE+0xf216
rsp 0xffff8000607ffde0
ss 0x10
bnxt_queue_down+0x62: movq 0(%r12,%rax,1),%rsi
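
As a rough cross-check of the register dump above, the faulting operand
0(%r12,%rax,1) can be decoded by hand. The sketch below is only an
illustration: the helper name effective_address is made up for this note and
is not part of the driver or of ddb. With r12 = 0 and rax = 0xff0 the computed
address matches the 0xff0 reported by uvm_fault, which would be consistent
with bnxt_queue_down() reading through a NULL base pointer while bnxt_up()
unwinds after the tx ring setup failure. That reading is an assumption drawn
from the trace, not a confirmed diagnosis.

```python
# Effective address of an x86-64 memory operand written as
# disp(base,index,scale) in AT&T syntax, wrapped to 64 bits.
# Note: effective_address is a hypothetical helper for illustration only.
def effective_address(base, index, scale=1, disp=0):
    return (base + index * scale + disp) & 0xFFFFFFFFFFFFFFFF

# Register values copied from the "show reg" output above.
r12 = 0x0    # base register: zero, i.e. a NULL pointer
rax = 0xff0  # index register

# Faulting instruction: movq 0(%r12,%rax,1),%rsi
fault_addr = effective_address(r12, rax, scale=1, disp=0)
print(hex(fault_addr))  # 0xff0, the address in the uvm_fault line
```
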
ddb{0}> ps
PID TID PPID UID S FLAGS WAIT COMMAND
*53204 70181 81971 0 7 0x3 ifconfig
57044 336864 81971 0 3 0x100083 kqread iperf3
57044 317909 81971 0 3 0x4100083 kqread iperf3
57044 253167 81971 0 3 0x4100083 kqread iperf3
57044 199984 81971 0 3 0x4100083 kqread iperf3
57044 343144 81971 0 3 0x4100083 kqread iperf3
81971 379109 1 0 3 0x10008b sigsusp ksh
69236 410163 1 0 3 0x100098 kqread cron
28984 478747 27164 95 3 0x1100092 kqread smtpd
75309 290569 27164 103 3 0x1100092 kqread smtpd
3782 175531 27164 95 3 0x1100092 kqread smtpd
60089 38850 27164 95 3 0x100092 kqread smtpd
72803 151501 27164 95 3 0x1100092 kqread smtpd
88240 203086 27164 95 3 0x1100092 kqread smtpd
27164 293957 1 0 3 0x100080 kqread smtpd
51687 170066 1 0 3 0x88 kqread sshd
82716 114406 1 0 3 0x100080 kqread ntpd
95469 439610 76144 83 3 0x100092 kqread ntpd
76144 242283 1 83 3 0x1100092 kqread ntpd
25275 206721 16938 73 3 0x1100090 kqread syslogd
16938 424245 1 0 3 0x100082 netio syslogd
92580 279098 0 0 3 0x14200 bored smr
40549 159120 0 0 3 0x14200 pgzero zerothread
12488 115575 0 0 3 0x14200 aiodoned aiodoned
91171 460632 0 0 3 0x14200 syncer update
83952 275089 0 0 3 0x14200 cleaner cleaner
6394 148862 0 0 3 0x14200 reaper reaper
60888 287201 0 0 3 0x14200 pgdaemon pagedaemon
25804 403088 0 0 3 0x14200 usbtsk usbtask
39034 435293 0 0 3 0x14200 usbatsk usbatsk
14657 265992 0 0 3 0x40014200 acpi0 acpi0
71223 108563 0 0 7 0x40014200 idle23
64527 365115 0 0 7 0x40014200 idle22
72713 21307 0 0 7 0x40014200 idle21
10463 246001 0 0 7 0x40014200 idle20
12917 81705 0 0 7 0x40014200 idle19
21272 95506 0 0 7 0x40014200 idle18
84051 282899 0 0 7 0x40014200 idle17
29950 315242 0 0 7 0x40014200 idle16
74076 50514 0 0 7 0x40014200 idle15
78730 255732 0 0 7 0x40014200 idle14
61629 222239 0 0 7 0x40014200 idle13
55577 69638 0 0 7 0x40014200 idle12
43127 284674 0 0 7 0x40014200 idle11
9490 433717 0 0 7 0x40014200 idle10
59350 221845 0 0 7 0x40014200 idle9
38139 511479 0 0 7 0x40014200 idle8
13306 271932 0 0 7 0x40014200 idle7
58184 304262 0 0 7 0x40014200 idle6
9056 44466 0 0 7 0x40014200 idle5
36823 251810 0 0 7 0x40014200 idle4
88045 230617 0 0 7 0x40014200 idle3
44234 150739 0 0 7 0x40014200 idle2
53028 404805 0 0 7 0x40014200 idle1
36997 229117 0 0 3 0x14200 bored sensors
74306 486052 0 0 3 0x14200 bored softnet15
1765 174236 0 0 3 0x14200 bored softnet14
31387 391287 0 0 3 0x14200 bored softnet13
88981 250088 0 0 3 0x14200 bored softnet12
69014 72250 0 0 3 0x14200 bored softnet11
96695 447362 0 0 3 0x14200 bored softnet10
61058 68822 0 0 3 0x14200 bored softnet9
47217 76327 0 0 3 0x14200 bored softnet8
29217 119308 0 0 3 0x14200 bored softnet7
56584 41551 0 0 3 0x14200 bored softnet6
58844 260791 0 0 3 0x14200 bored softnet5
82630 67345 0 0 3 0x14200 bored softnet4
61327 178726 0 0 3 0x14200 bored softnet3
67763 262066 0 0 3 0x14200 bored softnet2
20749 444743 0 0 3 0x14200 bored softnet1
8928 487450 0 0 3 0x14200 bored softnet0
3095 40129 0 0 3 0x14200 bored systqmp
42348 515847 0 0 3 0x14200 bored systq
71069 335621 0 0 3 0x14200 tmoslp softclockmp
16785 187571 0 0 3 0x40014200 tmoslp softclock
63607 400073 0 0 3 0x40014200 idle0
1 204660 0 0 3 0x82 wait init
0 0 -1 0 3 0x10200 scheduler swapper
ddb{0}> ps /o
TID PID UID PRFLAGS PFLAGS CPU COMMAND
* 70181 53204 0 0x3 0 0K ifconfig
ddb{0}>
ddb{0}> trace /t 0t70181
uvm_fault(0xfffffd8e57e02460, 0x8, 0, 1) -> e
kernel: page fault trap, code=0
Faulted in DDB; continuing...