On 23.11.2021. 14:18, Alexander Bluhm wrote:
> On Tue, Nov 23, 2021 at 06:54:59AM +0100, Hrvoje Popovski wrote:
>> After 24 hours of hitting the sasyncd setup, one box panicked.
>
> Thanks for testing.
>
> I have reduced my iked lifetime to about 10 seconds and got the
> same panic on my new 8 core test machine.
>
> ddb{2}> trace
> db_enter() at db_enter+0x10
> panic(ffffffff81eaa8e3) at panic+0xbf
> pool_do_get(ffffffff821e64d8,9,ffff8000238b0524) at pool_do_get+0x35c
> pool_get(ffffffff821e64d8,9) at pool_get+0x93
> tdb_alloc(0) at tdb_alloc+0x62
> reserve_spi(0,100,ffffffff,ffff800000d41254,ffff800000d41238,32,cbd2b00c6d3d3ecd) at reserve_spi+0xfc
> pfkeyv2_send(fffffd8739174900,ffff800001b3ba80,50) at pfkeyv2_send+0x19c6
> pfkeyv2_output(fffffd80948cea00,fffffd8739174900,0,0) at pfkeyv2_output+0x8a
> pfkeyv2_usrreq(fffffd8739174900,9,fffffd80948cea00,0,0,ffff8000238857b0) at pfkeyv2_usrreq+0x1b0
> sosend(fffffd8739174900,0,ffff8000238b0b60,0,0,0) at sosend+0x3a9
> dofilewritev(ffff8000238857b0,3,ffff8000238b0b60,0,ffff8000238b0c60) at dofilewritev+0x14d
> sys_writev(ffff8000238857b0,ffff8000238b0c00,ffff8000238b0c60) at sys_writev+0xd2
> syscall(ffff8000238b0cd0) at syscall+0x3a9
> Xsyscall() at Xsyscall+0x128
>
>> ddb{3}> show tdb
>
> You have to add the pool item addr to this command.
>
> I additionally have a refcount tracing diff on my machine. With that
> I see this result:
>
> ddb{2}> show panic
> *cpu2: pool_do_get: tdb free list modified: page 0xffff800008010000; item addr 0xffff80000801c998; offset 0x28=0xdeadbeee
>
> ddb{2}> show tdb /f 0xffff80000801c998
> tdb at 0xffff80000801c998
> hnext: 0x4c38c8f8ffb0cab5
> dnext: 0xff2c2a5ac7964242
> snext: 0xdeadbeefdeadbeef
> ...
> tdb_trace[78]: 350309838: refs 5 -1 cpu2 ipsec_forward_check:1081
> tdb_trace[79]: 350309839: refs 4 +1 cpu2 gettdb_dir:358
> tdb_trace[80]: 350309840: refs 5 -1 cpu2 ipsec_common_input:355
> tdb_trace[81]: 350309841: refs 4 +1 cpu2 gettdb_dir:358
> tdb_trace[82]: 350309842: refs 5 -1 cpu2 ipsec_forward_check:1081
> tdb_trace[83]: 350310888: refs 4 -1 cpu2 ipsp_spd_lookup:529
> tdb_trace[84]: 350816099: refs 3 -1 cpu0 tdb_soft_timeout:726
> tdb_trace[85]: 351266117: refs 2 +1 cpu2 gettdb_dir:358
> tdb_trace[86]: 351266118: refs 3 +0 cpu2 pfkeyv2_send:1599
> tdb_trace[87]: 351266119: refs 3 -1 cpu2 tdb_delete0:997
> tdb_trace[88]: 351271898: refs 2 -1 cpu2 pfkeyv2_send:2143
> tdb_trace[89]: 351300368: refs 1 +0 cpu0 tdb_timeout:688
> tdb_trace[90]: 351300369: refs 1 -1 cpu0 tdb_delete0:997
> tdb_trace[91]: 351300370: refs 3735928559 -1 cpu0 tdb_timeout:691
>
> I will try mvs@'s IPL_NET fix and think a bit more about the problem.
>
> bluhm
>
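The tail of the tdb_trace output quoted above can be replayed mechanically. The sketch below is only an illustration and not part of the tracing diff: it assumes that "refs N" is the counter value before the signed delta is applied, which the quoted entries are consistent with. The names TRACE, replay and first_inconsistency are made up for this example.

```python
import re

# tdb_trace entries 78..91, copied verbatim from the quoted ddb output.
TRACE = """\
tdb_trace[78]: 350309838: refs 5 -1 cpu2 ipsec_forward_check:1081
tdb_trace[79]: 350309839: refs 4 +1 cpu2 gettdb_dir:358
tdb_trace[80]: 350309840: refs 5 -1 cpu2 ipsec_common_input:355
tdb_trace[81]: 350309841: refs 4 +1 cpu2 gettdb_dir:358
tdb_trace[82]: 350309842: refs 5 -1 cpu2 ipsec_forward_check:1081
tdb_trace[83]: 350310888: refs 4 -1 cpu2 ipsp_spd_lookup:529
tdb_trace[84]: 350816099: refs 3 -1 cpu0 tdb_soft_timeout:726
tdb_trace[85]: 351266117: refs 2 +1 cpu2 gettdb_dir:358
tdb_trace[86]: 351266118: refs 3 +0 cpu2 pfkeyv2_send:1599
tdb_trace[87]: 351266119: refs 3 -1 cpu2 tdb_delete0:997
tdb_trace[88]: 351271898: refs 2 -1 cpu2 pfkeyv2_send:2143
tdb_trace[89]: 351300368: refs 1 +0 cpu0 tdb_timeout:688
tdb_trace[90]: 351300369: refs 1 -1 cpu0 tdb_delete0:997
tdb_trace[91]: 351300370: refs 3735928559 -1 cpu0 tdb_timeout:691
"""

LINE = re.compile(
    r"tdb_trace\[(\d+)\]: \d+: refs (\d+) ([+-]\d+) (cpu\d+) (\S+)")


def replay(trace):
    """Return (index, refs_before, delta, refs_after, where) per entry.

    Assumption: 'refs' is the value before the delta is applied.
    """
    events = []
    for m in LINE.finditer(trace):
        idx, before, delta = int(m.group(1)), int(m.group(2)), int(m.group(3))
        events.append((idx, before, delta, before + delta, m.group(5)))
    return events


def first_inconsistency(events):
    """Index of the first entry whose 'refs' differs from the previous result."""
    for prev, cur in zip(events, events[1:]):
        if cur[1] != prev[3]:
            return cur[0]
    return None


events = replay(TRACE)
for idx, before, delta, after, where in events:
    print("[%d] %#x %+d -> %#x  %s" % (idx, before, delta, after, where))
print("first inconsistent entry:", first_inconsistency(events))
```

Under that assumption the counter reaches 0 at entry 90 (tdb_delete0:997), and entry 91 (tdb_timeout:691) then decrements 3735928559, which is 0xdeadbeef, to 0xdeadbeee. That is the value reported at offset 0x28 in the panic message.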
Hi,
I got a panic with the mvs@ diff applied.
bluhm@, thank you for the tips.
r620-2# panic: pool_do_get: tdb free list modified: page 0xffff8000012ee000; item addr 0xffff8000012f1bb0; offset 0x28=0xdeafbeac
Stopped at db_enter+0x10: popq %rbp
TID PID UID PRFLAGS PFLAGS CPU COMMAND
263347 98359 68 0x10 0 2 sasyncd
*136177 87522 68 0x10 0 3 isakmpd
282035 451 0 0x14000 0x200 1 softnet
db_enter() at db_enter+0x10
panic(ffffffff81ea6d34) at panic+0xbf
pool_do_get(ffffffff822308b8,9,ffff800022da0f94) at pool_do_get+0x35c
pool_get(ffffffff822308b8,9) at pool_get+0x93
tdb_alloc(0) at tdb_alloc+0x62
reserve_spi(0,100,ffffffff,ffff8000012c4254,ffff8000012c4238,32,3f96bc02a5ef3acf) at reserve_spi+0xff
pfkeyv2_send(fffffd83b1902a90,ffff8000012c3400,50) at pfkeyv2_send+0x146a
pfkeyv2_output(fffffd80a5358c00,fffffd83b1902a90,0,0) at pfkeyv2_output+0x8a
pfkeyv2_usrreq(fffffd83b1902a90,9,fffffd80a5358c00,0,0,ffff800022d03a48) at pfkeyv2_usrreq+0x1b0
sosend(fffffd83b1902a90,0,ffff800022da15e0,0,0,0) at sosend+0x3a9
dofilewritev(ffff800022d03a48,7,ffff800022da15e0,0,ffff800022da16e0) at dofilewritev+0x14d
sys_writev(ffff800022d03a48,ffff800022da1680,ffff800022da16e0) at sys_writev+0xd2
syscall(ffff800022da1750) at syscall+0x3a9
Xsyscall() at Xsyscall+0x128
end of kernel
end trace frame: 0x7f7ffffc6cc0, count: 1
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports. Insufficient info makes it difficult to find and fix bugs.
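To show how a value like 0xdeafbeac can end up in a freed item, here is a small user-space simulation. It is a sketch under assumptions, not the kernel's pool code: it fills a hypothetical 0x40-byte item with the 32-bit pattern 0xdeafbead (the pattern that appears in several fields of the dump in this mail), performs one stale 32-bit decrement at offset 0x28, and then scans for the first non-poison word. ITEM_SIZE and the helper names are invented for this example, and the idea that the word at offset 0x28 is a reference counter is an assumption.

```python
import struct

POISON = 0xdeafbead   # 32-bit pattern as seen in the dump in this mail
ITEM_SIZE = 0x40      # hypothetical item size, just large enough for 0x28
REF_OFFSET = 0x28     # assumption: the stale write hits the reported offset


def poison(size):
    """Fill a freed item with the repeating 32-bit poison pattern."""
    return bytearray(struct.pack("<I", POISON) * (size // 4))


def stale_unref(item, offset):
    """Decrement a 32-bit counter inside an item that was already freed."""
    (value,) = struct.unpack_from("<I", item, offset)
    struct.pack_into("<I", item, offset, (value - 1) & 0xffffffff)


def check(item):
    """Return (offset, value) of the first non-poison word, or None."""
    for off in range(0, len(item), 4):
        (value,) = struct.unpack_from("<I", item, off)
        if value != POISON:
            return off, value
    return None


item = poison(ITEM_SIZE)
stale_unref(item, REF_OFFSET)     # the write after free
off, value = check(item)
print("offset 0x%x=0x%x" % (off, value))   # prints "offset 0x28=0xdeafbeac"
```

The printed offset and value match the ones in the panic message above, which fits a single decrement of poisoned memory after the item went back to the pool.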
ddb{3}> show tdb
0xffff8000012ee000: 00000000 (unknown address family)->(unknown address family):0 #-2135853982 dead4110
ddb{3}> show all tdb
0xffff8000012ee8b0: fac0dfe4 192.168.42.113->192.168.42.100:50 #3 00001082
0xffff8000012efdf0: 4e927a9b 192.168.42.112->192.168.42.100:50 #2 00000012
0xffff8000012f0ab0: c630e737 192.168.42.100->192.168.42.113:50 #4 000d1082
ddb{3}>
ddb{3}> show tdb /f 0xffff8000012ee000
tdb at 0xffff8000012ee000
hnext: 0x7105245c18ca6678
dnext: 0xf0a45d406013ea71
snext: 0xdeafbeaddeafbead
inext: 0xdeafbeaddeafbead
onext: 0xdeafbeaddeafbead
xform: 0x85c3318b80a89349
refcnt: -2135853982
encalgxform: 0xf663dace0c312228
authalgxform: 0xdead4110dead4110
compalgxform: 0xdead4110dead4110
flags: dead4110<INVALID,SOFT_BYTES,USEDTUNNEL,PFSYNC,PFSYNC_RPL>
seq: 0
exp_allocations: 0
soft_allocations: -2124599952
cur_allocations: -1
exp_bytes: -140737468506064
soft_bytes: 0
cur_bytes: 21474836480
exp_timeout: 4294967295
soft_timeout: 0
established: 0
first_use: 0
soft_first_use: 0
exp_first_use: 0
last_used: 60
last_marked: 0
cryptoid: 0
tdb_spi: 00000000
amxkeylen: 0
emxkeylen: 0
ivlen: 0
sproto: 0
wnd: 0
satype: 0
updates: 0
dst: (unknown address family)
src: (unknown address family)
amxkey: 0x0
emxkey: 0x0
rpl: 2267742732288
ids: 0x0
ids_swapped: 0
mtu: 0
mtutimeout: 0
udpencap_port: 0
tag: 0
tap: 0
rdomain: 0
rdomain_post: 0
Now I will try the diff from tobhe@ :)