On Wed, Jul 17, 2019 at 05:19:55PM +0200, Alexander Bluhm wrote:
> On Wed, Jul 17, 2019 at 01:26:29PM +0200, Alexander Bluhm wrote:
> > I got a strange panic on my daily amd64 regress machine
> >
> > reordering libraries:panic: pool_do_get: scxspl free list modified: page
>
> I see more strange effects on my regress machines for a while now.
> The SSH connection that controls my tests fails with broken pipe.
Unfortunately SSH broken pipe is the common error if I lose contact
with the test machine.  Most of the time it was a local fuckup: a pf
on a bridge sending TCP RST.

> Fri Jul 12 09:59:59 MDT 2019
> malloc_duel(22164) in free(): chunk canary corrupted 0x4fd491216a0 0x4@0x4
> (double free?)

This problem is real.  Running /usr/src/regress/lib/libpthread/malloc_duel
in a loop ends in a crash.  The reaper on CPU 0 does a NULL dereference
when removing the page.  On CPU 1 zerothread is waiting for the kernel
lock.  CPUs 2 and 3 are idle.

uvm_fault(0xfffffd8240760cc8, 0x7f827ea48908, 0, 2) -> e
kernel: page fault trap, code=0
Stopped at      pmap_page_remove+0x210: xchgq   %rax,0(%rcx,%rdx,1)

version: OpenBSD 6.5-current (GENERIC.MP) #129: Mon Jul 15 18:54:34 MDT 2019\012
[email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP

ddb{0}> trace
pmap_page_remove(fffffd8107d97b00) at pmap_page_remove+0x210
uvm_anfree(fffffd8214777360) at uvm_anfree+0x36
amap_wipeout(fffffd825d3dc130) at amap_wipeout+0xe5
uvm_unmap_detach(ffff800021eba2d8,1) at uvm_unmap_detach+0xef
uvm_map_teardown(fffffd8279b37340) at uvm_map_teardown+0x1c1
uvmspace_free(fffffd8279b37340) at uvmspace_free+0x57
uvm_exit(ffff8000ffff6d90) at uvm_exit+0x24
reaper(ffff8000ffff8770) at reaper+0x13b
end trace frame: 0x0, count: -8

*52315  402033      0      0  7   0x14200  reaper

ddb{0}> show register
rdi                              0xa
rsi               0xfffffd8107d97b68
rbp               0xffff800021eba1f0
rbx                                0
rdx                   0x7f8000000000
rcx                      0x20a8ea850
rax                                0
r8                0xfffffd810a6e5d80
r9                0xffffffff81d27ff0    cpu_info_full_primary+0x1ff0
r10               0x45e35718a213a84d
r11               0xd63ebed56fdc728a
r12               0xfffffd8107d97b00
r13               0xfffffd823df88940
r14                      0x27f7c2000
r15               0xfffffd8107d97b68
rip               0xffffffff817df8f0    pmap_page_remove+0x210
cs                               0x8
rflags                       0x10246    __ALIGN_SIZE+0xf246
rsp               0xffff800021eba190
ss                              0x10
pmap_page_remove+0x210: xchgq   %rax,0(%rcx,%rdx,1)

ddb{1}> trace
x86_ipi_db(ffff800021c80ff0) at x86_ipi_db+0x12
x86_ipi_handler() at x86_ipi_handler+0x80
Xresume_lapic_ipi(a,ffff800021c80ff0,ffff8000ffff9640,0,0,ffff8000ffff9718) at Xresume_lapic_ipi+0x23
_kernel_lock() at _kernel_lock+0xae
timeout_del_barrier(ffff8000ffff9718) at timeout_del_barrier+0xa2
msleep(ffffffff81dacc6c,ffffffff81dacbb8,7f,ffffffff81a91b36,0) at msleep+0xf5
uvm_pagezero_thread(ffff8000ffff9640) at uvm_pagezero_thread+0xa2
end trace frame: 0x0, count: -7

*63535   48641      0      0  7   0x14200  zerothread

ddb{1}> show register
rdi               0xffff800021c80ff0
rsi                                0
rbp               0xffff800021ed2440
rbx               0xffffffff81d06168    ipifunc+0x38
rdx                                0
rcx                              0x7
rax                       0xffffff7f
r8                                 0
r9                                 0
r10                                0
r11               0x29c1412aeecf9c9a
r12                              0x7
r13                                0
r14               0xffff800021c80ff0
r15                                0
rip               0xffffffff813572e2    x86_ipi_db+0x12
cs                               0x8
rflags                         0x206
rsp               0xffff800021ed2430
ss                              0x10
x86_ipi_db+0x12:        leave

I will update the kernel and see whether the panic is reproducible.

bluhm
