daily CVS update output
Updating src tree:
P src/crypto/external/bsd/openssh/dist/auth.c
P src/crypto/external/bsd/openssh/dist/auth2.c
P src/crypto/external/bsd/openssh/dist/monitor.c
P src/crypto/external/bsd/openssh/dist/sshd.c
P src/distrib/sets/lists/comp/mi
P src/etc/rc.d/zfs
P src/external/bsd/libproc/dist/proc_util.c
P src/external/bsd/mdocml/dist/mdoc.c
P src/external/bsd/mdocml/dist/mdoc_argv.c
P src/external/bsd/mdocml/dist/st.c
P src/include/monetary.h
P src/lib/libc/atomic/Makefile.inc
P src/lib/libutil/snprintb.3
P src/sbin/modstat/modstat.8
P src/share/man/man4/audio.4
P src/share/man/man4/pci.4
P src/share/man/man9/atomic_loadstore.9
P src/share/man/man9/rnd.9
P src/sys/arch/amd64/amd64/amd64_trap.S
P src/sys/arch/arm/rockchip/rk3399_pcie.c
P src/sys/arch/arm/sunxi/sun50i_a64_ccu.c
P src/sys/arch/evbarm/conf/std.generic64
P src/sys/arch/x86/x86/cpu.c
P src/sys/arch/x86/x86/hyperv.c
P src/sys/arch/x86/x86/hypervvar.h
P src/sys/dev/dm/device-mapper.c
P src/sys/dev/dm/dm.h
P src/sys/dev/dm/dm_dev.c
P src/sys/dev/dm/dm_pdev.c
P src/sys/dev/dm/dm_table.c
P src/sys/dev/dm/dm_target.c
P src/sys/dev/dm/dm_target_error.c
P src/sys/dev/dm/dm_target_linear.c
P src/sys/dev/dm/dm_target_mirror.c
P src/sys/dev/dm/dm_target_snapshot.c
P src/sys/dev/dm/dm_target_stripe.c
P src/sys/dev/dm/dm_target_zero.c
P src/sys/dev/dm/doc/locking.txt
P src/sys/dev/hyperv/hyperv_common.c
P src/sys/dev/hyperv/hypervvar.h
P src/sys/dev/hyperv/vmbus.c
P src/sys/dev/hyperv/vmbusvar.h
P src/sys/external/bsd/drm2/dist/drm/nouveau/include/nvif/os.h
P src/sys/kern/kern_synch.c
P src/sys/kern/subr_kcov.c
P src/sys/rump/librump/rumpkern/rump.c
P src/sys/uvm/pmap/pmap_pvt.c
P src/usr.bin/mkubootimage/mkubootimage.1
P src/usr.bin/mkubootimage/mkubootimage.c
P src/usr.sbin/mopd/common/loop-linux2.c
P src/usr.sbin/sysinst/disklabel.c

Updating xsrc tree:

Killing core files:

Updating file list:
-rw-rw-r-- 1 srcmastr netbsd 34709970 Dec 8 03:04 ls-lRA.gz
Re: Current test failures
On Sat, Dec 07, 2019 at 09:53:35PM +0200, Andreas Gustafsson wrote:
> Perhaps, but before Taylor made that commit, at least one other bug
> was introduced that is causing the system to panic before finishing
> the tests:
>
> fs/vfs/t_renamerace (726/847): 28 test cases
> ext2fs_renamerace: [6.743565s] Failed: Test program received signal 11 (core dumped)
> ext2fs_renamerace_dirs: [6.690776s] Failed: Test program received signal 11 (core dumped)
> ffs_renamerace: [6.602727s] Failed: Test program received signal 11 (core dumped)
> ffs_renamerace_dirs: [ 3923.9308316] panic: kernel diagnostic assertion "l->l_cpu == ci" failed: file "/tmp/bracket/build/2019.12.06.21.45.14-amd64-baremetal/src/sys/kern/kern_synch.c", line 764
> [ 3924.1108893] cpu7: Begin traceback...
> [ 3924.1509019] vpanic() at netbsd:vpanic+0x178
> [ 3924.2009181] kern_assert() at netbsd:kern_assert+0x48
> [ 3924.2609379] mi_switch() at netbsd:mi_switch+0x569
> [ 3924.3209576] sleepq_block() at netbsd:sleepq_block+0xb7
> [ 3924.3809774] lwp_park() at netbsd:lwp_park+0x10d
> [ 3924.4409956] syslwp_park60() at netbsd:syslwp_park60+0x5d
> [ 3924.5110189] syscall() at netbsd:syscall+0x299
> [ 3924.5610351] --- syscall (number 478) ---
> [ 3924.6110531] 7adcb44b035a:
> [ 3924.6410624] cpu7: End traceback...

Fixed with sys/kern/kern_synch.c revision 1.330.

> Could everyone please refrain from committing new kernel-crashing bugs
> until the test infrastructure has recovered from the previous round?

I think that's a reasonable suggestion.

Looking at it from a positive viewpoint, your system and ATF appear to be doing a brilliant job in finding problems. Also on a positive note, a minority of the bugs have been ancient and only exposed due to recent changes.

I will make an effort to run ATF more often locally.

Thank you,
Andrew
Re: Current test failures
Taylor R Campbell wrote:
> OOPS -- rmind removed pserialize_init from rump_init, so the mutex
> never got initialized. Fixed in rump.c 1.337!

Perhaps, but before Taylor made that commit, at least one other bug was introduced that is causing the system to panic before finishing the tests:

fs/vfs/t_renamerace (726/847): 28 test cases
ext2fs_renamerace: [6.743565s] Failed: Test program received signal 11 (core dumped)
ext2fs_renamerace_dirs: [6.690776s] Failed: Test program received signal 11 (core dumped)
ffs_renamerace: [6.602727s] Failed: Test program received signal 11 (core dumped)
ffs_renamerace_dirs: [ 3923.9308316] panic: kernel diagnostic assertion "l->l_cpu == ci" failed: file "/tmp/bracket/build/2019.12.06.21.45.14-amd64-baremetal/src/sys/kern/kern_synch.c", line 764
[ 3924.1108893] cpu7: Begin traceback...
[ 3924.1509019] vpanic() at netbsd:vpanic+0x178
[ 3924.2009181] kern_assert() at netbsd:kern_assert+0x48
[ 3924.2609379] mi_switch() at netbsd:mi_switch+0x569
[ 3924.3209576] sleepq_block() at netbsd:sleepq_block+0xb7
[ 3924.3809774] lwp_park() at netbsd:lwp_park+0x10d
[ 3924.4409956] syslwp_park60() at netbsd:syslwp_park60+0x5d
[ 3924.5110189] syscall() at netbsd:syscall+0x299
[ 3924.5610351] --- syscall (number 478) ---
[ 3924.6110531] 7adcb44b035a:
[ 3924.6410624] cpu7: End traceback...

More logs at:

http://www.gson.org/netbsd/bugs/build/amd64-baremetal/commits-2019.12.html#2019.12.07.14.55.58

Could everyone please refrain from committing new kernel-crashing bugs until the test infrastructure has recovered from the previous round?

-- 
Andreas Gustafsson, g...@netbsd.org
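When comparing panics across several test runs, it can help to reduce each traceback to its list of frames. The following is a minimal sketch, not part of any NetBSD tool: it assumes frames look like the `name() at netbsd:name+0xNN` lines quoted in the message above, and the names `FRAME` and `frames` are made up for this illustration.

```python
import re

# Traceback lines copied from the panic quoted above.
log = """\
[ 3924.1509019] vpanic() at netbsd:vpanic+0x178
[ 3924.2009181] kern_assert() at netbsd:kern_assert+0x48
[ 3924.2609379] mi_switch() at netbsd:mi_switch+0x569
[ 3924.3209576] sleepq_block() at netbsd:sleepq_block+0xb7
"""

# Assumed frame layout: "func() at netbsd:func+0xOFFSET".
FRAME = re.compile(r"(\w+)\(\) at netbsd:\w+\+0x([0-9a-f]+)")

# Collect (function name, offset) pairs in the order they were printed.
frames = [(m.group(1), int(m.group(2), 16)) for m in FRAME.finditer(log)]

for name, offset in frames:
    print(f"{name} +{offset:#x}")
```

Frames without an offset (such as the bare `snprintf` entry seen in other traces) would not match this pattern and would need a looser expression.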
Re: LOCKDEBUG: Mutex error: mi_switch,528: spin lock held
Hi,

On Sat, Dec 07, 2019 at 07:24:32PM +0900, Kimihiro Nonaka wrote:
> I got a panic with recent updated source.

This should be fixed with rev. 1.330 of sys/kern/kern_synch.c.

Thank you,
Andrew
Re: Current test failures
> Date: Sat, 7 Dec 2019 11:19:49 +0200
> From: Andreas Gustafsson
>
> Martin Husemann wrote:
> > Here is a simple recipe to reproduce the massive test lossage in -current:
> >
> > cd /usr/tests/dev/raidframe && atf-run
>
> I have now bisected it down to the following commits:
>
> 2019.12.05.03.21.08 riastradh src/sys/kern/subr_percpu.c 1.20

Unlikely to be relevant -- this only makes an assertion fire in fewer circumstances, so it can't reasonably cause _more_ crashes.

> 2019.12.05.03.21.17 riastradh src/sys/kern/subr_pserialize.c 1.16

OOPS -- rmind removed pserialize_init from rump_init, so the mutex never got initialized. Fixed in rump.c 1.337!

> 2019.12.05.03.21.29 riastradh src/sys/kern/subr_pserialize.c 1.17

Unlikely to be relevant -- this only makes an evcnt attach statically rather than in pserialize_init.

> 2019.12.05.03.21.42 riastradh src/external/cddl/osnet/sys/sys/opentypes.h 1.5

Can't imagine how this could be relevant -- only affects the tools build!
Re: Current test failures
Here is gdb output from the rump_server core:

Program terminated with signal SIGSEGV, Segmentation fault.
#0 rumpuser_mutex_spin_p (mtx=0x0) at /work/src/lib/librumpuser/rumpuser_pth.c:166
166 /work/src/lib/librumpuser/rumpuser_pth.c: No such file or directory.
[Current thread is 1 (process 28)]
(gdb) bt
#0 rumpuser_mutex_spin_p (mtx=0x0) at /work/src/lib/librumpuser/rumpuser_pth.c:166
#1 0xfde3c02c in mutex_enter (mtx=0xfdea6e80) at /work/src/lib/librump/../../sys/rump/librump/rumpkern/locks.c:164
#2 0xfdde6100 in pserialize_perform (psz=0xfd6cd000) at /work/src/lib/librump/../../sys/rump/../kern/subr_pserialize.c:126
#3 0xfdd0aac4 in fstrans_setstate (mp=, new_state=FSTRANS_SUSPENDING) at /work/src/lib/librumpvfs/../../sys/rump/../kern/vfs_trans.c:635
#4 0xfdcfc42c in genfs_suspendctl (mp=0xfb604000, cmd=) at /work/src/lib/librumpvfs/../../sys/rump/../miscfs/genfs/genfs_vfsops.c:83
#5 0xfdd16794 in VFS_SUSPENDCTL (mp=0xfb604000, a=) at /work/src/lib/librumpvfs/../../sys/rump/../kern/vfs_subr.c:1483
#6 0xfdd0af54 in vfs_suspend (mp=0xfb604000, nowait=) at /work/src/lib/librumpvfs/../../sys/rump/../kern/vfs_trans.c:703
#7 0xfdd19b7c in dounmount (mp=0xfb604000, flags=524288, l=0xfb62bb80) at /work/src/lib/librumpvfs/../../sys/rump/../kern/vfs_mount.c:854
#8 0xfdd19e74 in vfs_unmountall1 (l=l@entry=0xfb62bb80, force=force@entry=true, verbose=verbose@entry=true) at /work/src/lib/librumpvfs/../../sys/rump/../kern/vfs_mount.c:1021
#9 0xfdd19f94 in vfs_unmountall (l=l@entry=0xfb62bb80) at /work/src/lib/librumpvfs/../../sys/rump/../kern/vfs_mount.c:933
#10 0xfdd1a00c in vfs_shutdown () at /work/src/lib/librumpvfs/../../sys/rump/../kern/vfs_mount.c:1086
#11 0xfdd26000 in fini () at /work/src/lib/librumpvfs/../../sys/rump/librump/rumpvfs/rump_vfs.c:81
#12 0xfde45064 in cpu_reboot (howto=0, bootstr=) at /work/src/lib/librump/../../sys/rump/librump/rumpkern/emul.c:415
#13 0xfddc3808 in sys_reboot (l=, uap=0xfb0d8038, retval=) at /work/src/lib/librump/../../sys/rump/../kern/kern_reboot.c:73
#14 0xfdec0f8c in sy_call (rval=0xf99fff18, uap=0xfb0d8038, l=0xfb62bb80, sy=0xfdead600 ) at /work/src/sys/rump/kern/lib/libsysproxy/../../../../sys/syscallvar.h:65
#15 sy_invoke (code=, rval=0xf99fff18, uap=0xfb0d8038, l=0xfb62bb80, sy=0xfdead600 ) at /work/src/sys/rump/kern/lib/libsysproxy/../../../../sys/syscallvar.h:94
#16 hyp_syscall (num=, arg=0xfb0d8038, retval=0xf99fff78) at /work/src/sys/rump/kern/lib/libsysproxy/sysproxy.c:74
#17 0xfde45154 in rspo_wrap_syscall (num=, arg=, retval=) at /work/src/lib/librump/../../sys/rump/librump/rumpkern/rump.c:144
#18 0xfdca6c7c in rumpsyscall (regrv=0xf99fff80, data=0xfb0d8038, sysnum=208) at /work/src/lib/librumpuser/rumpuser_sp.c:267
#19 serv_handlesyscall (rhdr=0xfb104308, rhdr=0xfb104308, data=0xfb0d8038 , spc=0xfdcbd9d0) at /work/src/lib/librumpuser/rumpuser_sp.c:690
#20 serv_workbouncer (arg=) at /work/src/lib/librumpuser/rumpuser_sp.c:773
#21 0xfdc7e130 in pthread__create_tramp (cookie=0xfb135000) at /work/src/lib/libpthread/pthread.c:593
#22 0xfdaa4ba8 in __mknod50 () from /usr/lib/libc.so.12
(gdb) up
#1 0xfde3c02c in mutex_enter (mtx=0xfdea6e80) at /work/src/lib/librump/../../sys/rump/librump/rumpkern/locks.c:164
164 /work/src/lib/librump/../../sys/rump/librump/rumpkern/locks.c: No such file or directory.
(gdb) p *mtx
$1 = {u = {p = {mtxp_a = 0, mtxp_b = {0, 0
(gdb) up
#2 0xfdde6100 in pserialize_perform (psz=0xfd6cd000) at /work/src/lib/librump/../../sys/rump/../kern/subr_pserialize.c:126
126 /work/src/lib/librump/../../sys/rump/../kern/subr_pserialize.c: No such file or directory.
(gdb) p *psz
$2 = {psz_owner = 0x0}

Martin
LOCKDEBUG: Mutex error: mi_switch,528: spin lock held
Hi,

I got a panic with recent updated source.

--
[ 1.000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
[ 1.000] 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
[ 1.000] 2018, 2019 The NetBSD Foundation, Inc. All rights reserved.
[ 1.000] Copyright (c) 1982, 1986, 1989, 1991, 1993
[ 1.000] The Regents of the University of California. All rights reserved.
[ 1.000] NetBSD 9.99.20 (GENERIC) #3: Sat Dec 7 19:03:12 JST 2019
...
[ 1.4973159] cpu0 has 1x core siblings: cpu0
[ 1.4973159] cpu0 has 4x package siblings: cpu1 cpu2 cpu3 cpu0
[ 1.4973159] cpu0 has 1x peer siblings: cpu0
[ 1.4973159] cpu1 has 1x core siblings: cpu1
[ 1.4973159] cpu1 has 4x package siblings: cpu2 cpu3 cpu0 cpu1
[ 1.4973159] cpu1 has 1x peer siblings: cpu1
[ 1.4973159] cpu2 has 1x core siblings: cpu2
[ 1.4973159] cpu2 has 4x package siblings: cpu3 cpu0 cpu1 cpu2
[ 1.4973159] cpu2 has 1x peer siblings: cpu2
[ 1.4973159] cpu3 has 1x core siblings: cpu3
[ 1.4973159] cpu3 has 4x package siblings: cpu0 cpu1 cpu2 cpu3
[ 1.4973159] cpu3 has 1x peer siblings: cpu3
[ 2.2846485] Mutex error: mi_switch,528: spin lock held
[ 2.2846485] lock address : 0x9466c63e44c0 type : spin
[ 2.2846485] initialized : 0x809fb614
[ 2.2846485] shared holds : 0 exclusive: 1
[ 2.2846485] shares wanted: 0 exclusive: 0
[ 2.2846485] current cpu : 1 last held: 1
[ 2.2846485] current lwp : 0x94663ea08100 last held: 0x94663ea08100
[ 2.2846485] last locked* : 0x809dbf35 unlocked : 0x809fbd4a
[ 2.2846485] owner field : 0x00010700 wait/spin:0/1
[ 2.3346217] panic: LOCKDEBUG: Mutex error: mi_switch,528: spin lock held
[ 2.3346788] cpu1: Begin traceback...
[ 2.[3 3 46 728.83]3 a4c6p7i88b]a tv0p:a nfiac()i laet d tnoe tebsvda:lvpuaantiec +0_xB1I78F [ 2.334[ 6 7 828.3] 3:46 7A8E8_]E RsRnOprRi [ 2.3346788] ntf() at netbsds:ds0n prati nstcfs [ 2.349254[3 ] i2b.u3s409 2t5a4r3g]e tl o1c klduenb u0g:_ md edbiusgk_ mfoirxee 5 2.35464[7 2 ] 2d.3 [46472] mis_sdw0i:t cfha(b) riacta ting na egtebosmde:tmriy_ [ [2 .23.3554466447722]] ssdw0i:t c8h1+902x06 5M [ 2.3546[4 7 2 ]2 .B3,5 4861497220 ]c yild,l e6_4l ohoepa(d), a3t 2 sence,t b51s2d :biydtlees_/lsoeocpt+ 0xx 11a6d7 2.[3 6 4 626.6326]4 767626126]0 csepcut_ohrast [ 2.36466[6 2 ] 2 .c3h6(4)6 6a6t2 ] snde0t: bfsadb:rcipcua_thiantgc ha+ 0gxe1o7mfe ] 2[. 3 6 4626.6326]4 6t6r6y2 cpu1: End traceback... [ 2.3746869] fat[ [2 . 3 7246.836794]6 8a6l9 ]b rdekak0 paoti nstd 0t: r"aEpF Ii"n, s2u6p2e1r4v4i sbolro cmkosd ea [ [2 .23.734764866896]9 ]t t6r4a,p ttyyppee: 1n tcfodse [ 2 . 327.34764866896]9 ] 0d kr1i pa t0 xsfdf0f:f f"fNfeft8B0S2D1"d,d 81d5 9c1s2 303x889 rbflloacgsk s0 xa2t 0226 2c2r722, 0t iylpee:v eflf[s0 [ [ 2 . 32.834864763723]2 ]x 8d krs2 pa t0 xsfdf0f:f 9"4s8w0a6p7"3,8 68e32806 [ [2 .23.84368743627]3 24]6 3c ubrllwpo c0kxsf faftf 9145696338e5a06861400, tpyipde :0 .s2w0a pl [ [2 . 3 924.638944658]4 5ow]e scdt 0 kastt asccks 0ixbfufsf0f 94t8a0r6g7e3t8 321c 0l
Stopped in pid 0.20 (system) at netbsd:breakpoint+0x5: leave
db{1}> bt
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x178
snprintf() at netbsd:snprintf
lockdebug_more() at netbsd:lockdebug_more
mi_switch() at netbsd:mi_switch+0x65
idle_loop() at netbsd:idle_loop+0x1ad
cpu_hatch() at netbsd:cpu_hatch+0x17f
db{1}>
--

-- 
Kimihiro Nonaka
Re: Current test failures
Martin Husemann wrote:
> Here is a simple recipe to reproduce the massive test lossage in -current:
>
> cd /usr/tests/dev/raidframe && atf-run

I have now bisected it down to the following commits:

2019.12.05.03.21.08 riastradh src/sys/kern/subr_percpu.c 1.20
2019.12.05.03.21.17 riastradh src/sys/kern/subr_pserialize.c 1.16
2019.12.05.03.21.29 riastradh src/sys/kern/subr_pserialize.c 1.17
2019.12.05.03.21.42 riastradh src/external/cddl/osnet/sys/sys/opentypes.h 1.5

-- 
Andreas Gustafsson, g...@netbsd.org
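The bisection described above can be pictured as a binary search over the chronologically ordered commits. This is only a sketch of the general technique, not the tooling actually used for these test runs: `first_bad` and `is_bad` are hypothetical names, and the predicate below stands in for building that source date and running the tests.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are in chronological order, the last one is bad, and
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo]


# Commit timestamps taken from the message above.
commits = [
    "2019.12.05.03.21.08",
    "2019.12.05.03.21.17",
    "2019.12.05.03.21.29",
    "2019.12.05.03.21.42",
]

# Stand-in predicate (an assumption for illustration): pretend that
# everything from 03.21.17 onward fails the tests.
culprit = first_bad(commits, lambda c: c >= "2019.12.05.03.21.17")
print(culprit)
```

Each probe in a real bisection costs a full build and test run, which is why halving the candidate range at every step matters.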
Current test failures
Here is a simple recipe to reproduce the massive test lossage in -current:

cd /usr/tests/dev/raidframe && atf-run

In the output you will find:

tp-start: 1575709106.330783, t_raid, 7
tc-start: 1575709106.330864, old_numrows_config
tc-so:Executing command [ rump_server -lrumpvfs -lrumpdev -lrumpdev_disk -lrumpdev_raidframe -d key=/disk0,hostpath=disk0.img,size=1m -d key=/disk1,hostpath=disk1.img,size=1m unix://sock ]
tc-so:Executing command [ rump.raidctl -C raid.conf raid0 ]
tc-se:rump.halt: reboot: Socket is not connected
tc-se:t_raid: ERROR: The test case cleanup returned a non-ok exit code, but this is not allowed

and then all tries to use the same socket will fail.

On macppc I get this kernel log when the rump_server process dies:

[ 51469.7829284] trap: pid 9120.28 (rump_server): user read DSI trap @ 0x20 by 0xfdca89d4 (DSISR 0x4000, err=14)

Martin
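To pick lines like these out of a long atf-run log, something along the following lines may help. It is a rough sketch under stated assumptions: it treats `tc-se:` as marking lines the test case wrote to standard error and `tc-start:` as naming the test case that follows, based only on the excerpt above; `stderr_by_test_case` is a made-up helper name, not part of ATF.

```python
import re

# A few lines copied from the atf-run output quoted above.
sample = """\
tp-start: 1575709106.330783, t_raid, 7
tc-start: 1575709106.330864, old_numrows_config
tc-so:Executing command [ rump.raidctl -C raid.conf raid0 ]
tc-se:rump.halt: reboot: Socket is not connected
tc-se:t_raid: ERROR: The test case cleanup returned a non-ok exit code, but this is not allowed
"""

# Assumed layout of a test case header: "tc-start: <timestamp>, <name>".
TC_START = re.compile(r"^tc-start: [\d.]+, (\S+)$")


def stderr_by_test_case(text):
    """Map each test case name to the tc-se lines seen after its tc-start."""
    result = {}
    current = None
    for line in text.splitlines():
        m = TC_START.match(line)
        if m:
            current = m.group(1)
            result[current] = []
        elif line.startswith("tc-se:") and current is not None:
            result[current].append(line[len("tc-se:"):])
    return result


errors = stderr_by_test_case(sample)
for name, lines in errors.items():
    for text in lines:
        print(f"{name}: {text}")
```

Feeding it the saved output of the recipe above would list, per test case, the error text that atf-run captured.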