Re: -current 100% CPU, softdep related

Mark Patruck Tue, 03 Mar 2020 01:21:56 -0800

After ~3 days of uptime, the crash on "reboot" looks different.
It is now in handle_workitem_freeblocks(); according to objdump:


/sys/ufs/ffs/ffs_softdep.c:2365
    120e:       48 8b 47 28             mov    0x28(%rdi),%rax
    1212:       48 8b 80 98 02 00 00    mov    0x298(%rax),%rax
--> 1219:       48 83 78 18 01          cmpq   $0x1,0x18(%rax)
    121e:       48 8d 8d b8 fd ff ff    lea    0xfffffffffffffdb8(%rbp),%rcx
    1225:       48 89 4d 98             mov    %rcx,0xffffffffffffff98(%rbp)


syncing disks...uvm_fault(0xfffffd83a1b78228, 0x18, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at      handle_workitem_freeblocks+0x39:        cmpq    $0x1,0x18(%rax)


ddb{0}> trace
handle_workitem_freeblocks(fffffd82b482bce8) at handle_workitem_freeblocks+0x39
process_worklist_item(ffff8000001b8800,40) at process_worklist_item+0x1f2
softdep_process_worklist(ffff8000001b8800) at softdep_process_worklist+0xed
softdep_flushworklist(ffff8000001b8800,ffff8000225ca708,ffff8000226deb00) at softdep_flushworklist+0xb8
ffs_sync(ffff8000001b8800,1,0,fffffd841f7c26c0,ffff8000226deb00) at ffs_sync+0xdd
dounmount_leaf(ffff8000001b8800,80000,ffff8000226deb00) at dounmount_leaf+0xaa
dounmount(ffff8000001b8800,80000,ffff8000226deb00) at dounmount+0xfc
vfs_unmountall() at vfs_unmountall+0x8e
vfs_shutdown(ffff8000226deb00) at vfs_shutdown+0x3b
boot(0) at boot+0x6c
reboot(0) at reboot+0x5c
sys_reboot(ffff8000226deb00,ffff8000225ca960,ffff8000225ca9c0) at sys_reboot+0x7e
syscall(ffff8000225caa30) at syscall+0x389
Xsyscall() at Xsyscall+0x128
end of kernel
end trace frame: 0x7f7ffffdc410, count: -14


ddb{0}> ps
   PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
*50794  484816      1      0  7         0x3                reboot
 19349  244035      0      0  3     0x14200  bored         smr
 91090  149880      0      0  2     0x14200                zerothread
 90933  308106      0      0  3     0x14200  aiodoned      aiodoned
  3559    3994      0      0  3     0x14200  syncer        update
 13412  220874      0      0  3     0x14200  cleaner       cleaner
 67192  445479      0      0  3     0x14200  reaper        reaper
 16869  348395      0      0  3     0x14200  pgdaemon      pagedaemon
 59176  404002      0      0  3     0x14200  bored         crynlk
 70887  377538      0      0  3     0x14200  bored         crypto
 59125  295509      0      0  3     0x14200  usbtsk        usbtask
 73888  331304      0      0  3     0x14200  usbatsk       usbatsk
 87760  317235      0      0  3  0x40014200  acpi0         acpi0
 32435  334346      0      0  7  0x40014200                idle7
 97709  444879      0      0  7  0x40014200                idle6
 85595  223792      0      0  7  0x40014200                idle5
 89353  449747      0      0  7  0x40014200                idle4
 89616   53984      0      0  7  0x40014200                idle3
 17444  112424      0      0  7  0x40014200                idle2
 50503  439810      0      0  7  0x40014200                idle1
 70433  274545      0      0  3     0x14200  bored         sensors
 58599   63095      0      0  3     0x14200  bored         softnet
  4228   56259      0      0  3     0x14200  bored         systqmp
 94385  233288      0      0  3     0x14200  bored         systq
  7159  306934      0      0  3  0x40014200  bored         softclock
 67585  249534      0      0  3  0x40014200                idle0
     1  208459      0      0  3        0x82  wait          init
     0       0     -1      0  3     0x10200  scheduler     swapper


ddb{0}> show registers
rdi               0xfffffd82b482bce8
rsi                             0x40
rbp               0xffff8000225ca440
rbx                              0x1
rdx               0xfe00000007ff1e3a
rcx                            0x286
rax                                0
r8                               0x8
r9                               0x1
r10               0xcbb4b964dc1cd4c3
r11               0xab8da676dd4aa070
r12                             0x40
r13                              0x9
r14               0xfffffd82b482bce8
r15               0xffff8000001b8800
rip               0xffffffff812b3589    handle_workitem_freeblocks+0x39
cs                               0x8
rflags                       0x10286    __ALIGN_SIZE+0xf286
rsp               0xffff8000225ca1f0
ss                              0x10
handle_workitem_freeblocks+0x39:        cmpq    $0x1,0x18(%rax)
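
For readers following along: the faulting instruction is cmpq $0x1,0x18(%rax), rax is 0 in the register dump, and uvm_fault reports address 0x18. That is consistent with a NULL pointer being dereferenced at member offset 0x18 (the earlier crash in handle_workitem_freefile, with fault address 0x20 on movq 0x20(%rax),%rcx, fits the same pattern). A minimal sketch of that arithmetic follows; the helper names are made up for illustration and are not kernel code, and which structure member sits at those offsets is not established here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Effective address of an x86-64 operand of the form disp(%reg):
 * the register value plus the displacement.
 */
static uint64_t
effective_address(uint64_t reg, uint64_t disp)
{
	return reg + disp;
}

/*
 * A fault is consistent with a NULL base pointer when the faulting
 * address equals the displacement, i.e. the register held 0.
 */
static int
looks_like_null_deref(uint64_t fault_addr, uint64_t disp)
{
	return fault_addr == effective_address(0, disp);
}

/* Values copied from the two traces in this thread. */
static void
check_traces(void)
{
	/* cmpq $0x1,0x18(%rax) with rax = 0, uvm_fault address 0x18 */
	assert(looks_like_null_deref(0x18, 0x18));
	/* movq 0x20(%rax),%rcx, uvm_fault address 0x20 */
	assert(looks_like_null_deref(0x20, 0x20));
}
```

This only shows that both faults match a zero base register; it says nothing about why the pointer was NULL.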


On 2020-02-29 10:01, Mark Patruck wrote:

On 2020-02-28 21:57, Todd C. Miller wrote:

This sounds like the loop in softdep_process_worklist() is never
exiting.  It shouldn't run for more than a second, though.

FreeBSD breaks out of the loop if process_worklist_item() can't
make progress.  You could try the following (untested) diff to see
if it changes the behavior.
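
The diff itself is not quoted in this message. Purely to illustrate the idea described above (leave the loop when an iteration retires nothing), here is a small user-space sketch. It is not the diff and not the softdep code: struct worklist, process_one() and drain() are hypothetical names, and the "stuck" counter is an assumed stand-in for items that cannot be processed yet.

```c
#include <assert.h>

/* Hypothetical stand-in for the pending work count. */
struct worklist {
	int pending;	/* items still queued */
	int stuck;	/* items that cannot be processed right now */
};

/* Hypothetical: returns the number of items retired (0 or 1). */
static int
process_one(struct worklist *wl)
{
	if (wl->pending <= wl->stuck)
		return 0;	/* only stuck items remain: no progress */
	wl->pending--;
	return 1;
}

/* Returns the number of items retired; exits when no progress is made. */
static int
drain(struct worklist *wl)
{
	int done = 0, n;

	while (wl->pending > 0) {
		n = process_one(wl);
		if (n == 0)
			break;	/* without this, the loop would spin */
		done += n;
	}
	return done;
}
```

With 5 pending items of which 2 are stuck, drain() retires 3 and returns instead of looping forever on the remaining 2.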


After ~11h with your diff, the system was still up and running,
so I decided to reboot. It crashed while syncing disks.


syncing disks...uvm_fault(0xfffffd83a134d668, 0x20, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at      handle_workitem_freefile+0x2a:  movq    0x20(%rax),%rcx
ddb{0}>

SNIP

--
Mark Patruck ( mark at wrapped.cx )
GPG key 0xF2865E51 / 187F F6D3 EE04 1DCE 1C74  F644 0D3C F66F F286 5E51

https://www.wrapped.cx
