Re: Automated report: NetBSD-current/i386 test failure
The NetBSD Test Fixture wrote: > This is an automatically generated notice of a new failure of the > NetBSD test suite. > > The newly failing test case is: > > net/ipsec/t_ipsec_tunnel_odd:ipsec_tunnel_v4v6_esp_cast128cbc > > The above test failed in each of the last 4 test runs, and passed in > at least 26 consecutive runs before that. > > The following commits were made between the last successful test and > the first failed test: > > 2023.10.12.17.18.38 riastradh > src/crypto/external/bsd/heimdal/Makefile.inc 1.11 (etc) Sorry about the multitude of automated reports, some of which (such as the one quoted above) pointed at the wrong commit. What happened is that a commit by ad@ caused a large number of tests to start failing randomly, and because the failures are random and numerous and today is Friday the 13th, some of the tests happened to fail for the first time only after some later commit, and then also happened to fail several times in a row. This resulted in a pattern of a long stretch of successes followed by several failures in a row. This looks very much the same as a deterministic failure introduced by the later commit, and the testbed mistook it as such. Even though some of the test failures were reported as an excessive number of separate emails and attributed to the wrong commit, the failures themselves are real. -- Andreas Gustafsson, g...@gson.org
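The pattern described above, a long stretch of passes followed by several failures in a row, can be checked mechanically. The following is only a sketch of that kind of heuristic, not the testbed's actual code; the function name is made up, and the thresholds (26 passes, 4 failures) are simply the numbers from the quoted report.

```python
def looks_like_new_failure(results, min_passes=26, min_failures=4):
    """Return True if the history ends with at least min_failures
    failures immediately preceded by at least min_passes passes.

    results is a list of booleans, oldest run first (True = passed).
    """
    i = len(results)

    # Count the failures at the end of the history.
    failures = 0
    while i > 0 and not results[i - 1]:
        failures += 1
        i -= 1

    # Count the consecutive passes immediately before them.
    passes = 0
    while i > 0 and results[i - 1]:
        passes += 1
        i -= 1

    return failures >= min_failures and passes >= min_passes


# A deterministic breakage and an unlucky streak of random failures
# produce exactly the same history, so this check cannot tell them apart.
history = [True] * 26 + [False] * 4
print(looks_like_new_failure(history))
```

Because the check sees only the pass/fail history, a randomly failing test that happens to fail four times in a row is indistinguishable from a test broken by the most recent commit.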
Re: odd setlist failure
Greg Troxel wrote: > current fails to build for me, complaining about ati_drv.so.19 in > destdir but not in setlist. I see that .6 is in the setlists now. In > my destdir I have: > > -r--r--r-- 1 gdt wheel 7420 Jan 26 10:48 > /usr/obj/gdt-current/destdir/i386/usr/X11R7/lib/modules/drivers/ati_drv.so.6 > -r--r--r-- 1 gdt wheel 7420 Feb 25 08:06 > /usr/obj/gdt-current/destdir/i386/usr/X11R7/lib/modules/drivers/ati_drv.so.19 > lrwxr-xr-x 1 gdt wheel 13 Feb 25 08:06 > /usr/obj/gdt-current/destdir/i386/usr/X11R7/lib/modules/drivers/ati_drv.so -> > ati_drv.so.19 > > Build host is netbsd-9 amd64. > > Is anyone else seeing this? Yes: https://www.gson.org/netbsd/bugs/build/amd64-baremetal/commits-2022.02.html#2022.02.24.08.06.41 I'm guessing it started with this commit: 2022.02.23.17.28.31 mrg src/external/mit/xorg/server/drivers/xf86-video-ati/Makefile 1.7 -- Andreas Gustafsson, g...@gson.org
Re: black screen, boot doesn't finish
Thomas Klausner wrote: > This commit > > $NetBSD: drmfb.c,v 1.13 2022/02/16 23:30:10 riastradh Exp $ > > makes my graphical console disappear. It also makes my i386 laptop testbed hang during boot: http://www.gson.org/netbsd/bugs/build/i386-laptop/commits-2022.02.html#2022.02.16.23.30.10 -- Andreas Gustafsson, g...@gson.org
Re: Heads up: objdir is now rm -rf resistant
m...@netbsd.org wrote: > I hope fixing this is enough to fix all the cryptic issues. The build is now fixed, but I still need to give the testbeds the ability to automatically remove objdirs containing non-writable directories, because otherwise they will get stuck whenever they decide to build a historic version from the affected time range. This is also going to be an ongoing pitfall for anyone building historic versions, for example when bisecting. -- Andreas Gustafsson, g...@gson.org
Heads up: objdir is now rm -rf resistant
All, The TNF testbed is currently failing to start new builds because it is unable to remove the objdirs from previous builds using the Python equivalent of "rm -rf". Specifically, after the i386 build fails the way it currently does, the objdir contains two directories with mode 0111, which rm -rf is unable to remove: obj/distrib/i386/cdroms/bootcd/cdrom/var/spool/ftp/hidden obj/distrib/i386/cdroms/bootcd-com/cdrom/var/spool/ftp/hidden The work-around is to manually chmod the directories to 0755 before removing the objdir, but until I get around to automating that on the testbed, you can expect a reduced level of automated testing service. Also, you may want to be on the lookout for this failure mode in your own builds (or the cleanup after them). -- Andreas Gustafsson, g...@gson.org
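The work-around described above (chmod the directories to 0755, then remove the objdir) could be automated along the following lines. This is a minimal sketch under assumptions, not the code the testbed actually runs; the function name force_rmtree is made up for illustration.

```python
import os
import shutil
import tempfile


def force_rmtree(path):
    """Remove a directory tree even if it contains directories whose
    mode (for example 0111) prevents listing or modifying them."""
    os.chmod(path, 0o755)
    # os.walk is top-down by default, so each subdirectory is made
    # accessible here before os.walk tries to list it.
    for dirpath, dirnames, _filenames in os.walk(path):
        for name in dirnames:
            full = os.path.join(dirpath, name)
            # Skip symlinks: os.chmod would change the link target.
            if not os.path.islink(full):
                os.chmod(full, 0o755)
    shutil.rmtree(path)


# Demonstration on a throwaway tree that mimics the problem directories.
demo_root = tempfile.mkdtemp()
demo_hidden = os.path.join(demo_root, "var", "spool", "ftp", "hidden")
os.makedirs(demo_hidden)
open(os.path.join(demo_hidden, "file"), "w").close()
os.chmod(demo_hidden, 0o111)
force_rmtree(demo_root)
print(os.path.exists(demo_root))
```

Doing the chmod pass up front keeps the removal itself a plain shutil.rmtree call, at the cost of walking the tree twice.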
Re: Automated report: NetBSD-current/i386 build failure
The build is still failing, now with: --- release-bootcd-com --- Copying set gpufw pax: line 206: ./libdata/firmware/nouveau/nvidia/gp107/gr/fecs_bl.bin: type mismatch: specfile link, tree file This is from: http://releng.netbsd.org/b5reports/i386/2021/2021.12.14.16.55.45/build.log.tail -- Andreas Gustafsson, g...@gson.org
i386 install failing
Hi ryo, NetBSD-current/i386 panics when booting the install media since your recent COMPAT_LINUX32 commits, with panic: kernel diagnostic assertion "*e->e_sigobject == NULL" failed: file "/tmp/build/2021.11.25.03.08.05-i386/src/sys/kern/kern_exec.c", line 2032 Logs: http://releng.NetBSD.org/b5reports/i386/commits-2021.11.html#2021.11.25.03.08.05 The sparc port is also broken; it installs but panics when booting the installed system. -- Andreas Gustafsson, g...@gson.org
Panic running tests
2fs/ext2fs_vnops.c,v 1.136 2021.10.20.03.08.19 thorpej src/sys/ufs/lfs/lfs_rename.c,v 1.25 2021.10.20.03.08.19 thorpej src/sys/ufs/lfs/lfs_vnops.c,v 1.340 2021.10.20.03.08.19 thorpej src/sys/ufs/lfs/ulfs_readwrite.c,v 1.28 2021.10.20.03.08.19 thorpej src/sys/ufs/lfs/ulfs_vnops.c,v 1.55 2021.10.20.03.08.19 thorpej src/sys/ufs/ufs/ufs_acl.c,v 1.3 2021.10.20.03.08.19 thorpej src/sys/ufs/ufs/ufs_extern.h,v 1.88 2021.10.20.03.08.19 thorpej src/sys/ufs/ufs/ufs_readwrite.c,v 1.127 2021.10.20.03.08.19 thorpej src/sys/ufs/ufs/ufs_rename.c,v 1.14 2021.10.20.03.08.19 thorpej src/sys/ufs/ufs/ufs_vnops.c,v 1.260 2021.10.20.03.08.19 thorpej src/tests/kernel/kqueue/t_vnode.c,v 1.2 2021.10.20.03.09.45 thorpej src/sys/sys/param.h,v 1.705 2021.10.20.03.13.14 thorpej src/sys/kern/vnode_if.c,v 1.115 2021.10.20.03.13.14 thorpej src/sys/rump/include/rump/rumpvnode_if.h,v 1.37 2021.10.20.03.13.14 thorpej src/sys/rump/librump/rumpvfs/rumpvnode_if.c,v 1.37 2021.10.20.03.13.14 thorpej src/sys/sys/vnode_if.h,v 1.108 Logs can be found at: http://releng.NetBSD.org/b5reports/i386/commits-2021.10.html#2021.10.20.03.26.20 -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
Hi maya, The build is still failing with: > ./libdata/firmware/nouveau/nvidia/LICENCE.nvidia > ./libdata/firmware/nouveau/nvidia/gm206/fecs_data.bin > ./libdata/firmware/nouveau/nvidia/gm206/fecs_inst.bin > ./libdata/firmware/nouveau/nvidia/gm206/gpccs_data.bin > ./libdata/firmware/nouveau/nvidia/gm206/gpccs_inst.bin > = end of 5 extra files === > *** Failed target: checkflist > *** Failed commands: > ${SETSCMD} ${.CURDIR}/checkflist ${MAKEFLIST_FLAGS} > ${CHECKFLIST_FLAGS} ${METALOG.unpriv} > *** [checkflist] Error code 1 > nbmake[2]: stopped in /tmp/build/2021.09.25.21.26.04-i386/src/distrib/sets > 1 error > nbmake[2]: stopped in /tmp/build/2021.09.25.21.26.04-i386/src/distrib/sets > nbmake[1]: stopped in /tmp/build/2021.09.25.21.26.04-i386/src > nbmake: stopped in /tmp/build/2021.09.25.21.26.04-i386/src > ERROR: Failed to make release since this commit: > 2021.09.25.21.26.03 maya > src/distrib/common/bootimage/Makefile.installimage,v 1.10 > 2021.09.25.21.26.03 maya src/distrib/sets/sets.subr,v 1.197 > 2021.09.25.21.26.04 maya src/distrib/sets/lists/gpufw/mi,v 1.3 -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
Yesterday, the NetBSD Test Fixture wrote: > nbmake[3]: stopped in /tmp/build/2021.08.17.17.31.59-i386/src > nbmake[2]: stopped in /tmp/build/2021.08.17.17.31.59-i386/src > nbmake[1]: stopped in /tmp/build/2021.08.17.17.31.59-i386/src > nbmake: stopped in /tmp/build/2021.08.17.17.31.59-i386/src > ERROR: Failed to make release kre@ fixed this particular error, but the build is still failing on i386 and other 32-bit platforms, now with different errors such as these: --- dependall-sodium --- In file included from /tmp/build/2021.08.17.22.29.11-i386/src/sys/external/isc/libsodium/dist/src/libsodium/include/sodium/private/ed25519_ref10_fe_51.h:3, from /tmp/build/2021.08.17.22.29.11-i386/src/sys/external/isc/libsodium/dist/src/libsodium/include/sodium/private/ed25519_ref10.h:23, from /tmp/build/2021.08.17.22.29.11-i386/src/sys/external/isc/libsodium/dist/src/libsodium/crypto_scalarmult/curve25519/ref10/x25519_ref10.c:7: /tmp/build/2021.08.17.22.29.11-i386/src/sys/external/isc/libsodium/dist/src/libsodium/include/sodium/private/common.h:14:1: error: unable to emulate 'TI' 14 | typedef unsigned uint128_t __attribute__((mode(TI))); | ^~~ In file included from /tmp/build/2021.08.17.22.29.11-i386/src/sys/external/isc/libsodium/dist/src/libsodium/include/sodium/private/ed25519_ref10.h:23, from /tmp/build/2021.08.17.22.29.11-i386/src/sys/external/isc/libsodium/dist/src/libsodium/crypto_scalarmult/curve25519/ref10/x25519_ref10.c:7: /tmp/build/2021.08.17.22.29.11-i386/src/sys/external/isc/libsodium/dist/src/libsodium/include/sodium/private/ed25519_ref10_fe_51.h: In function 'fe25519_mul': /tmp/build/2021.08.17.22.29.11-i386/src/sys/external/isc/libsodium/dist/src/libsodium/include/sodium/private/ed25519_ref10_fe_51.h:300:17: error: right shift count >= width of type [-Werror=shift-count-overflow] 300 | carry = r0 >> 51; | ^~ This is from: http://releng.netbsd.org/b5reports/i386/2021/2021.08.17.22.29.11/build.log.tail -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
On Monday, David Holland wrote: > Right... I wonder what happened to bracket's error-matching script; it > usually does better than that. I have now deployed bracket 2.15 on babylon5.netbsd.org, and the latest build failure report looks much better: http://mail-index.netbsd.org/current-users/2021/07/24/msg041311.html -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
David Holland wrote: > On Mon, Jul 19, 2021 at 10:32:20AM +0900, Rin Okuyama wrote: > > Logs below are usually more helpful. > > Right... I wonder what happened to bracket's error-matching script; it > usually does better than that. There are multiple causes, but a major one is that since babylon5 was upgraded to a new server with more cores, the builds have more parallelism, which causes make(1) to print more output from the other parallel jobs after the actual error message, and bracket isn't looking far enough back in the log. I have a fix in testing on my own testbed but still need to deploy it on babylon5. -- Andreas Gustafsson, g...@gson.org
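To illustrate the problem described above: if the error search only covers a fixed number of lines at the end of the log, output from other parallel jobs can push the real error out of that window. The sketch below is not bracket's actual error-matching script; the regular expression, the function name, and the sample log are assumptions made up for the example.

```python
import re

# Hypothetical pattern; the real script's patterns are not shown here.
ERROR_RE = re.compile(r"\berror:|\*\*\* \[[^\]]*\] Error code")


def find_error(log_lines, lookback):
    """Search the last `lookback` lines of a build log and return the
    earliest line in that window that looks like an error, or None."""
    if lookback <= 0:
        return None
    for line in log_lines[-lookback:]:
        if ERROR_RE.search(line):
            return line
    return None


# One real error followed by a lot of output from other parallel jobs.
log = (
    ["foo.c:1:1: error: something went wrong"]
    + ["--- job %d --- compiling" % n for n in range(50)]
    + ["nbmake: stopped in /tmp/build/src"]
)

print(find_error(log, lookback=20))   # window too small to reach the error
print(find_error(log, lookback=200))  # window covers the whole log
```

With more cores and therefore more parallel jobs, the amount of unrelated output after the error grows, so a window that used to be large enough no longer is.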
Re: 9.99.86 HEAD
Martin Husemann wrote: > Hmm, that is the last commit I needed to get everything working again > here - any idea what exactly hangs and where? The same way it has been hanging ever since dholland's commit of 2021.06.29.22.37.11 (except for a period when the tests didn't even start): kernel/t_umountstress (98/899): 2 test cases fileop: The latest test run on b5 has hung but not timed out yet; when it does, the log will appear here: http://releng.netbsd.org/b5reports/i386/commits-2021.07.html#2021.07.01.04.25.51 -- Andreas Gustafsson, g...@gson.org
Re: 9.99.86 HEAD
Martin Husemann wrote: > All regressions I am aware of have been fixed now. At least i386 still hangs while running the ATF tests as of source date 2021.07.01.04.25.51. -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 install success
The NetBSD Test Fixture wrote: > The NetBSD-current/i386 install is working again. So it is, but the system now panics during the ATF tests: dev/fss/t_fss (38/899): 1 test cases basic: [ 148.2076287] panic: kernel diagnostic assertion "KERNEL_LOCKED_P()" failed: file "/tmp/build/2021.06.13.03.09.20-i386/src/sys/kern/subr_autoconf.c", line 1972 [ 148.2076287] cpu0: Begin traceback... [ 148.2076287] vpanic(c11949f0,c9c33d7c,c9c33db0,c0cdc391,c11949f0,c1194937,c11cf716,c129d924,7b4,c2786c00) at netbsd:vpanic+0x13c [ 148.2076287] kern_assert(c11949f0,c1194937,c11cf716,c129d924,7b4,c2786c00,0,c14e1de0,c2689a00,c2689a00) at netbsd:kern_assert+0x23 [ 148.2076287] config_detach(c2689a00,2,c1d1d000,0,0,c0d7335e,0,a300,80,0) at netbsd:config_detach+0x430 [ 148.2076287] fss_close(a300,0,1,6000,c2a89180,0,a300,0,1,c21740c0) at netbsd:fss_close+0x131 [ 148.2200843] spec_close(c9c33e3c,3,0,c117cc10,c2786c00,1,c1a304c0,c2786c00,c9c33e74,c0d69ab7) at netbsd:spec_close+0x209 [ 148.2200843] VOP_CLOSE(c2786c00,1,c1a304c0,0,c9c33f38,c22ad980,c22ad99c,c9c33e88,c0d69b48,c2786c00) at netbsd:VOP_CLOSE+0x3d [ 148.2200843] vn_close(c2786c00,1,c1a304c0,c9c33ec4,c0c8971c,c22ad980,0,0,c9c33eac,c26a4680) at netbsd:vn_close+0x39 [ 148.2200843] vn_closefile(c22ad980,0,0,c9c33eac,c26a4680,2c,402c7413,c2b05bc0,0,c2b05c4c) at netbsd:vn_closefile+0x22 [ 148.2200843] closef(c22ad980,c2a89180,c9c33f9c,c012273c,c270a4d4,b3ff,2,c26a4680,c22ad980,c2b05bf0) at netbsd:closef+0x4f [ 148.2200843] fd_close(3,0,c2a89180,c2a89180,c2a89180,c9c33f9c,c04a1f8b,c2a89180,c9c33f68,c9c33f60) at netbsd:fd_close+0x17d [ 148.2200843] sys_close(c2a89180,c9c33f68,c9c33f60,c19f7bc8,0,6,c9c33f60,c9c33f68,0,0) at netbsd:sys_close+0x20 [ 148.2200843] syscall() at netbsd:syscall+0x17c [ 148.2395535] --- syscall (number 6) --- [ 148.2395535] b405b597: [ 148.2395535] cpu0: End traceback... Logs: http://releng.netbsd.org/b5reports/i386/commits-2021.06.html#2021.06.13.00.11.17 amd64 is also affected. 
-- Andreas Gustafsson, g...@netbsd.org
Re: Automated report: NetBSD-current/i386 build failure
The i386 build is still failing, but now with a different error: --- in6_pcb.o --- /tmp/build/2021.05.27.08.58.29-i386/src/sys/netinet6/in6_pcb.c: In function 'in6_pcblookup_port': cc1: error: function may return address of local variable [-Werror=return-local-addr] /tmp/build/2021.05.27.08.58.29-i386/src/sys/netinet6/in6_pcb.c:1056:26: note: declared here 1056 | struct vestigial_inpcb better; | ^~ It's not clear to me which of the commits made since christos first broke the build could have triggered this, nor why this is not affecting all ports. Logs: http://releng.netbsd.org/b5reports/i386/commits-2021.05.html#2021.05.27.08.41.35 -- Andreas Gustafsson, g...@gson.org
Re: Problem reports for version control systems
Brett Lymn wrote: > Just for your info... there are a few NetBSD developers in .au, myself > included. I haven't > had any issues with cvs disconnects. Not to deny you have an issue, just > letting you know > it works ok for people near you. For what it's worth, my connections to anoncvs from Finland frequently break in the middle of a transfer, though I'm using rsync rather than cvs. I have had an admins@ ticket open about this for a couple of years (#160795). -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
The i386 build is still failing, with errors like these: --- dependall-tmux --- /tmp/build/2021.04.18.12.05.29-i386/src/external/bsd/tmux/dist/cmd-display-menu.c: In function 'cmd_display_menu_get_position': /tmp/build/2021.04.18.12.05.29-i386/src/external/bsd/tmux/dist/cmd-display-menu.c:158:8: error: comparison of integer expressions of different signedness: 'long int' and 'u_int' {aka 'unsigned int'} [-Werror=sign-compare] 158 | if (n >= tty->sy) |^~ /tmp/build/2021.04.18.12.05.29-i386/src/external/bsd/tmux/dist/cmd-display-menu.c:191:8: error: comparison of integer expressions of different signedness: 'long int' and 'u_int' {aka 'unsigned int'} [-Werror=sign-compare] 191 | if (n >= tty->sy) |^~ /tmp/build/2021.04.18.12.05.29-i386/src/external/bsd/tmux/dist/cmd-display-menu.c:239:8: error: comparison of integer expressions of different signedness: 'long int' and 'u_int' {aka 'unsigned int'} [-Werror=sign-compare] 239 | if (n < h) | ^ -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
The cause of the 1000+ new test failures has now been narrowed down to the following commit: 2021.01.16.23.50.49 chs src/sys/rump/librump/rumpkern/rump.c,v 1.352 2021.01.16.23.51.50 chs src/sys/arch/arm/arm/psci.c,v 1.5 2021.01.16.23.51.50 chs src/sys/conf/files,v 1.1278 2021.01.16.23.51.51 chs src/sys/lib/libkern/arch/hppa/bcopy.S,v 1.16 2021.01.16.23.51.51 chs src/sys/lib/libkern/libkern.h,v 1.141 2021.01.16.23.51.51 chs src/sys/sys/cdefs.h,v 1.156 2021.01.16.23.51.51 chs src/sys/sys/queue.h,v 1.76 Logs: http://releng.netbsd.org/b5reports/i386/commits-2021.01.html#2021.01.16.23.51.51 -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
Yesterday, the NetBSD Test Fixture wrote: > The newly failing test case is: > > rump/rumpkern/t_vm:busypage This one is still failing. The rump kernel panics with: [ 1.1400050] panic: kernel diagnostic assertion "(pg->flags & PG_FAKE) == 0" failed: file "/tmp/build/2020.12.07.10.02.51-i386/src/lib/librump/../../sys/rump/librump/rumpkern/vm.c", line 710 A full log and backtrace are at: http://releng.netbsd.org/b5reports/i386/2020/2020.12.07.10.02.51/test.html#rump_rumpkern_t_vm_busypage -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
Roland Illig wrote: > >=== 2 extra files in DESTDIR = > Fixed in distrib/sets/lists/tests/mi 1.984. Confirmed fixed, thanks. The i386 build is still failing, though - it's now back to failing in mpu_acpi.c: --- kern-GENERIC --- /tmp/build/2020.12.07.08.31.07-i386/src/sys/dev/acpi/mpu_acpi.c:119:38: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast] 119 | sc->arg = acpi_intr_establish(self, (uint64_t)aa->aa_node->ad_handle, | ^ Logs: http://releng.netbsd.org/b5reports/i386/commits-2020.12.html#2020.12.07.08.31.07 -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
The build is now failing differently: === 2 extra files in DESTDIR = Files in DESTDIR but missing from flist. File is obsolete or flist is out of date ? -- ./usr/tests/usr.bin/make/unit-tests/opt-keep-going-multiple.exp ./usr/tests/usr.bin/make/unit-tests/opt-keep-going-multiple.mk = end of 2 extra files === Logs: http://releng.netbsd.org/b5reports/i386/commits-2020.12.html#2020.12.07.01.32.04 -- Andreas Gustafsson, g...@gson.org
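The "extra files" check in the message above amounts to comparing the files found in DESTDIR with the entries in the set lists. The following is a rough sketch of that comparison only, not the real checkflist script; the function name is an assumption and the set lists are reduced to a plain list of paths.

```python
def compare_flist(destdir_files, flist_entries):
    """Return (extra, missing): files present in DESTDIR but not in the
    file list, and files in the list but absent from DESTDIR."""
    destdir = set(destdir_files)
    flist = set(flist_entries)
    extra = sorted(destdir - flist)
    missing = sorted(flist - destdir)
    return extra, missing


# The two extra files are the ones named in the build log above; the
# other path is a made-up placeholder entry present in both.
in_destdir = [
    "./usr/tests/usr.bin/make/unit-tests/example.mk",
    "./usr/tests/usr.bin/make/unit-tests/opt-keep-going-multiple.exp",
    "./usr/tests/usr.bin/make/unit-tests/opt-keep-going-multiple.mk",
]
in_flist = [
    "./usr/tests/usr.bin/make/unit-tests/example.mk",
]

extra, missing = compare_flist(in_destdir, in_flist)
for path in extra:
    print("extra:", path)
for path in missing:
    print("missing:", path)
```

Either list being non-empty corresponds to a failed check: extra files mean the set list needs a new entry (or the file should not be installed), and missing files mean the list names something the build did not produce.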
Re: Automated report: NetBSD-current/i386 build failure
The NetBSD Test Fixture wrote: > nbmake[2]: stopped in > /tmp/build/2020.12.06.12.54.32-i386/obj/sys/arch/i386/compile/LEGACY More specifically: --- mpu_acpi.o --- /tmp/build/2020.12.06.12.23.13-i386/src/sys/dev/acpi/mpu_acpi.c: In function 'mpu_acpi_attach': /tmp/build/2020.12.06.12.23.13-i386/src/sys/dev/acpi/mpu_acpi.c:119:38: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast] 119 | sc->arg = acpi_intr_establish(self, (uint64_t)aa->aa_node->ad_handle, | ^ > The following commits were made between the last successful build and > the failed build: Now bisected to this commit: > 2020.12.06.12.23.13 jmcneill src/sys/dev/acpi/amdccp_acpi.c,v 1.3 > 2020.12.06.12.23.13 jmcneill src/sys/dev/acpi/atppc_acpi.c,v 1.18 > 2020.12.06.12.23.13 jmcneill src/sys/dev/acpi/fdc_acpi.c,v 1.44 > 2020.12.06.12.23.13 jmcneill src/sys/dev/acpi/lpt_acpi.c,v 1.21 > 2020.12.06.12.23.13 jmcneill src/sys/dev/acpi/mpu_acpi.c,v 1.14 > 2020.12.06.12.23.13 jmcneill src/sys/dev/acpi/pckbc_acpi.c,v 1.38 > 2020.12.06.12.23.13 jmcneill src/sys/dev/acpi/spic_acpi.c,v 1.7 > 2020.12.06.12.23.13 jmcneill src/sys/dev/acpi/wb_acpi.c,v 1.6 -- Andreas Gustafsson, g...@gson.org
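Bisecting a build failure to a single commit, as mentioned above, is a binary search over the commit history between the last known good and the first known bad source date. Here is a minimal sketch of the idea, assuming the failure is deterministic; it is not the testbed's implementation, the function name is invented, and all timestamps in the example except 2020.12.06.12.23.13 are made up.

```python
def bisect_first_bad(commits, is_good):
    """Return the first bad commit.

    commits is ordered oldest first; commits[0] must be known good and
    commits[-1] known bad.  is_good(commit) builds/tests that commit.
    """
    lo, hi = 0, len(commits) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


# Hypothetical history; only 2020.12.06.12.23.13 comes from the report.
commits = [
    "2020.12.06.10.00.00",
    "2020.12.06.11.15.00",
    "2020.12.06.12.23.13",
    "2020.12.06.12.40.00",
    "2020.12.06.12.54.32",
]


def build_succeeds(commit):
    # Stand-in for a real build: everything before the breaking commit works.
    return commit < "2020.12.06.12.23.13"


print(bisect_first_bad(commits, build_succeeds))
```

Each step halves the range, so even a few hundred commits need fewer than ten builds, provided every build in the range gives a reliable good/bad answer.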
Re: Automated report: NetBSD-current/i386 build success
The NetBSD Test Fixture wrote: > The NetBSD-current/i386 build is working again. It is, but the amd64 build is failing with: === 1 extra files in DESTDIR = Files in DESTDIR but missing from flist. File is obsolete or flist is out of date ? -- ./usr/bin/gdbserver = end of 1 extra files === -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
The build is now failing with: === 2 extra files in DESTDIR = Files in DESTDIR but missing from flist. File is obsolete or flist is out of date ? -- ./usr/tests/usr.bin/make/unit-tests/objdir-writable.exp ./usr/tests/usr.bin/make/unit-tests/objdir-writable.mk = end of 2 extra files === Logs at: http://releng.netbsd.org/b5reports/i386/commits-2020.11.html#2020.11.13.09.56.53 -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
The build is still failing, but now with a different error: nbmake[5]: "/tmp/build/2020.11.13.08.33.07-i386/src/external/ofl/Makefile" line 3: Malformed conditional (${MKX11} != "no") -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
The NetBSD Test Fixture wrote: > nbmake[7]: nbmake[7]: don't know how to make -ltermlib. Stop The build is still failing. The problems started with this commit: 2020.11.08.21.56.47 nia src/external/bsd/kyua-cli/Makefile.inc,v 1.8 2020.11.08.21.56.47 nia src/external/ibm-public/postfix/Makefile.inc,v 1.25 2020.11.08.21.56.48 nia src/external/public-domain/sqlite/Makefile.inc,v 1.9 2020.11.08.21.56.48 nia src/external/public-domain/sqlite/bin/Makefile,v 1.7 2020.11.08.21.56.48 nia src/external/public-domain/sqlite/lib/Makefile,v 1.12 2020.11.08.21.56.48 nia src/external/public-domain/sqlite/lib/sqlite3.pc.in,v 1.3 2020.11.08.21.56.48 nia src/usr.sbin/makemandb/Makefile,v 1.10 -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
The NetBSD Test Fixture wrote: > This is an automatically generated notice of a new failure of the > NetBSD test suite. > > The newly failing test case is: > > lib/libm/t_fmod:fmod > [...] > 2020.08.23.06.12.52 rillig src/usr.bin/make/buf.c,v 1.36 [...] False alarm - it looks like the testbed has somehow managed to dig up an old test failure from August that has already been fixed. Sorry about that; I will make some changes to keep it from happening again. -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
nia wrote: > It should be fixed already. It's not, it's now failing earlier in the build: http://releng.netbsd.org/b5reports/i386/commits-2020.10.html#2020.10.29.16.35.33 -- Andreas Gustafsson, g...@gson.org
Re: All ATF curses tests failing on babylon5 i386
> > | Should be fixed. > > Is fixed, thanks. > Thanks for the fixes. The wgetch test case is still failing: http://releng.netbsd.org/b5reports/i386/2020/2020.10.28.03.21.25/test.html#lib_libcurses_t_curses_wgetch -- Andreas Gustafsson, g...@gson.org
Automated report: NetBSD-current/i386 test failure
[Manually forwarded as the automated notifications are temporarily disabled while new hardware is being tested] This is an automatically generated notice of new failures of the NetBSD test suite. The newly failing test cases are: lib/libcurses/t_curses:addch lib/libcurses/t_curses:addchnstr lib/libcurses/t_curses:addchstr lib/libcurses/t_curses:addnstr lib/libcurses/t_curses:addstr lib/libcurses/t_curses:assume_default_colors lib/libcurses/t_curses:attributes lib/libcurses/t_curses:background lib/libcurses/t_curses:beep lib/libcurses/t_curses:bkgdset lib/libcurses/t_curses:box lib/libcurses/t_curses:can_change_color lib/libcurses/t_curses:cbreak lib/libcurses/t_curses:chgat lib/libcurses/t_curses:clear lib/libcurses/t_curses:copywin lib/libcurses/t_curses:curs_set lib/libcurses/t_curses:define_key lib/libcurses/t_curses:derwin lib/libcurses/t_curses:doupdate lib/libcurses/t_curses:dupwin lib/libcurses/t_curses:erasechar lib/libcurses/t_curses:flash lib/libcurses/t_curses:getattrs lib/libcurses/t_curses:getbkgd lib/libcurses/t_curses:getch lib/libcurses/t_curses:getcurx lib/libcurses/t_curses:getmaxx lib/libcurses/t_curses:getmaxy lib/libcurses/t_curses:getnstr lib/libcurses/t_curses:getparx lib/libcurses/t_curses:getstr lib/libcurses/t_curses:has_colors lib/libcurses/t_curses:has_ic lib/libcurses/t_curses:hline lib/libcurses/t_curses:inch lib/libcurses/t_curses:inchnstr lib/libcurses/t_curses:init_color lib/libcurses/t_curses:innstr lib/libcurses/t_curses:is_linetouched lib/libcurses/t_curses:is_wintouched lib/libcurses/t_curses:keyname lib/libcurses/t_curses:keyok lib/libcurses/t_curses:killchar lib/libcurses/t_curses:meta lib/libcurses/t_curses:mvaddch lib/libcurses/t_curses:mvaddchnstr lib/libcurses/t_curses:mvaddchstr lib/libcurses/t_curses:mvaddnstr lib/libcurses/t_curses:mvaddstr lib/libcurses/t_curses:mvchgat lib/libcurses/t_curses:mvcur lib/libcurses/t_curses:mvderwin lib/libcurses/t_curses:mvgetnstr lib/libcurses/t_curses:mvgetstr 
lib/libcurses/t_curses:mvhline lib/libcurses/t_curses:mvinchnstr lib/libcurses/t_curses:mvprintw lib/libcurses/t_curses:mvscanw lib/libcurses/t_curses:mvvline lib/libcurses/t_curses:mvwin lib/libcurses/t_curses:nocbreak lib/libcurses/t_curses:nodelay lib/libcurses/t_curses:pad lib/libcurses/t_curses:startup lib/libcurses/t_curses:termattrs lib/libcurses/t_curses:timeout lib/libcurses/t_curses:wborder lib/libcurses/t_curses:window lib/libcurses/t_curses:wprintw lib/libcurses/t_curses:wscrl The above tests failed in each of the last 4 test runs, and passed in at least 26 consecutive runs before that. Between the last successful test and the failed test, a total of 299 revisions were committed, by the following developers: blymn rillig The first of these commits was made on CVS date 2020.10.24.04.40.45, and the last on 2020.10.24.04.46.17. Logs can be found at: http://releng.NetBSD.org/b5reports/i386/commits-2020.10.html#2020.10.24.04.46.17
Automated report: NetBSD-current/i386 build failure
[Manually forwarded as the automated notifications are temporarily disabled while new hardware is being tested] This is an automatically generated notice of a NetBSD-current/i386 build failure. The failure occurred on babylon5.netbsd.org, a NetBSD/amd64 host, using sources from CVS date 2020.10.24.04.47.43. An extract from the build.sh output follows: ./usr/tests/lib/libcurses/tests/window_hierarchy ./usr/tests/lib/libcurses/tests/winnstr ./usr/tests/lib/libcurses/tests/winnwstr ./usr/tests/lib/libcurses/tests/wins_nwstr ./usr/tests/lib/libcurses/tests/wins_wch ./usr/tests/lib/libcurses/tests/wins_wstr ./usr/tests/lib/libcurses/tests/winsch ./usr/tests/lib/libcurses/tests/winwstr ./usr/tests/lib/libcurses/tests/wredrawln ./usr/tests/lib/libcurses/tests/wsetscrreg ./usr/tests/lib/libcurses/tests/wstandout ./usr/tests/lib/libcurses/tests/wtimeout ./usr/tests/lib/libcurses/tests/wtouchln ./usr/tests/lib/libcurses/tests/wunderscore ./usr/tests/lib/libcurses/tests/wvline ./usr/tests/lib/libcurses/tests/wvline_set end of 249 missing files == *** [checkflist] Error code 1 nbmake[2]: stopped in /tmp/build/2020.10.24.04.47.43-i386/src/distrib/sets 1 error nbmake[2]: stopped in /tmp/build/2020.10.24.04.47.43-i386/src/distrib/sets ERROR: Failed to make release The following commits were made between the last successful build and the failed build: 2020.10.24.04.47.43 blymn src/distrib/sets/lists/tests/mi,v 1.951 Logs can be found at: http://releng.NetBSD.org/b5reports/i386/commits-2020.10.html#2020.10.24.04.47.43
Re: Automated report: NetBSD-current/i386 test failure (l2tp)
Roy Marples wrote:
> This is rump crashing and I don't know why.

If the rump kernel crashes in the test, that likely means the real
kernel will crash in actual use.

> I can't get a backtrace to tell me where the problem is.

I managed to get one this way:

  sysctl -w kern.defcorename="/tmp/%n.core"
  cd /usr/tests/net/if_l2tp
  ./t_l2tp l2tp_basic_ipv4overipv4
  gdb rump_server /tmp/rump_server.core

It looks like this:

(gdb) bt
#0 0x752206d751ea in _lwp_kill () from /usr/lib/libc.so.12
#1 0x752206d756e5 in abort () at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/lib/libc/stdlib/abort.c:74
#2 0x7522076088bf in rumpuser_exit (rv=rv@entry=-1) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/lib/librumpuser/rumpuser.c:236
#3 0x7522082c2b74 in cpu_reboot (howto=, bootstr=) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/lib/librump/../../sys/rump/librump/rumpkern/emul.c:429
#4 0x75220827b08d in kern_reboot (howto=4, bootstr=0x0) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/lib/librump/../../sys/rump/../kern/kern_reboot.c:73
#5 0x752208279efe in vpanic (fmt=0x752205ea5428 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ", ap=0x75220319fc88) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/lib/librump/../../sys/rump/../kern/subr_prf.c:290
#6 0x75220825f298 in kern_assert (fmt=) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/lib/librump/../../sys/rump/../lib/libkern/kern_assert.c:51
#7 0x752205e9fa7d in if_percpuq_enqueue (ipq=0x0, m=0x752208061650) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/sys/rump/net/lib/libnet/../../../../net/if.c:911
#8 0x752204a03801 in in_l2tp_input (eparg=, proto=, off=20, m=0x752208061858) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/sys/rump/net/lib/libl2tp/../../../../netinet/in_l2tp.c:349
#9 in_l2tp_input (m=0x752208061858, off=20, proto=, eparg=) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/sys/rump/net/lib/libl2tp/../../../../netinet/in_l2tp.c:249
#10 0x752205e75097 in encap4_input (m=0x752208061858, off=20, proto=115) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/sys/rump/net/lib/libnet/../../../../netinet/ip_encap.c:357
#11 0x752205e7d465 in ip_input (ifp=, m=) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/sys/rump/net/lib/libnet/../../../../netinet/ip_input.c:821
#12 ipintr (arg=) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/sys/rump/net/lib/libnet/../../../../netinet/ip_input.c:412
#13 0x7522082c265c in sithread (arg=) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/lib/librump/../../sys/rump/librump/rumpkern/intr.c:180
#14 0x7522082bf52e in threadbouncer (arg=0x75220883bac0) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/lib/librump/../../sys/rump/librump/rumpkern/threads.c:90
#15 0x75220720bf7e in pthread__create_tramp (cookie=0x7522085a4800) at /tmp/build/2020.10.22.11.21.42-amd64-debug/src/lib/libpthread/pthread.c:560
#16 0x752206c91dc0 in ?? () from /usr/lib/libc.so.12

-- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure (l2tp)
Hi Roy, On Oct 16, the NetBSD Test Fixture wrote: > The newly failing test cases are: > > net/if_l2tp/t_l2tp:l2tp_basic_ipv4overipv4 > net/if_l2tp/t_l2tp:l2tp_basic_ipv4overipv6 > net/if_l2tp/t_l2tp:l2tp_basic_ipv6overipv4 > net/if_l2tp/t_l2tp:l2tp_basic_ipv6overipv6 > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv4_transport_ah_hmacsha512 > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv4_transport_ah_null > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv4_transport_esp_null > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv4_transport_esp_rijndaelcbc > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv4_tunnel_ah_hmacsha512 > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv4_tunnel_ah_null > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv4_tunnel_esp_null > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv4_tunnel_esp_rijndaelcbc > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv6_transport_ah_hmacsha512 > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv6_transport_ah_null > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv6_transport_esp_null > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv6_transport_esp_rijndaelcbc > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv6_tunnel_ah_hmacsha512 > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv6_tunnel_ah_null > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv6_tunnel_esp_null > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv6_tunnel_esp_rijndaelcbc These are still failing as of 2020.10.21.15.12.15, and the commit that triggered the failures has now been identified: 2020.10.15.02.54.10 roy src/sys/net/if_l2tp.c 1.44 For logs, see http://www.gson.org/netbsd/bugs/build/amd64/commits-2020.10.html#2020.10.15.02.54.10 -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
Two days ago, the NetBSD Test Fixture wrote: > This is an automatically generated notice of new failures of the > NetBSD test suite. > > The newly failing test cases are: > > sbin/resize_ffs/t_grow:grow_16M_v0_8192 > sbin/resize_ffs/t_grow:grow_16M_v1_16384 > sbin/resize_ffs/t_grow:grow_16M_v2_32768 > sbin/resize_ffs/t_grow_swapped:grow_16M_v0_65536 > sbin/resize_ffs/t_grow_swapped:grow_16M_v1_4096 > sbin/resize_ffs/t_grow_swapped:grow_16M_v2_8192 > sbin/resize_ffs/t_shrink:shrink_24M_16M_v0_32768 > sbin/resize_ffs/t_shrink:shrink_24M_16M_v1_65536 > sbin/resize_ffs/t_shrink_swapped:shrink_24M_16M_v0_4096 > sbin/resize_ffs/t_shrink_swapped:shrink_24M_16M_v1_8192 These are still failing as of source date 2020.10.21.06.36.10, and the commit that triggered the failures has now been identified: 2020.10.18.18.22.29 chs src/sys/rump/librump/rumpvfs/vm_vfs.c 1.39 2020.10.18.18.22.29 chs src/sys/uvm/uvm_page.c 1.248 2020.10.18.18.22.29 chs src/sys/uvm/uvm_pager.c 1.130 Logs from real amd64 hardware are at: http://www.gson.org/netbsd/bugs/build/amd64-baremetal/commits-2020.10.html#2020.10.18.18.22.29 -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
On Oct 8, the NetBSD Test Fixture wrote: > The newly failing test cases are: > > net/carp/t_basic:carp_handover_ipv4_halt_carpdevip > net/carp/t_basic:carp_handover_ipv4_halt_nocarpdevip > net/carp/t_basic:carp_handover_ipv4_ifdown_carpdevip > net/carp/t_basic:carp_handover_ipv4_ifdown_nocarpdevip > net/carp/t_basic:carp_handover_ipv6_halt_carpdevip > net/carp/t_basic:carp_handover_ipv6_ifdown_carpdevip These were fixed on Oct 8, but then broken again on Oct 12: http://releng.netbsd.org/b5reports/i386/commits-2020.10.html#2020.10.12.11.07.27 They are still failing as of source date 2020.10.13.21.27.18. -- Andreas Gustafsson, g...@gson.org
-current fails to boot
All, At least i386 and amd64 are currently in a state where installation using sysinst completes successfully, but the installed system fails to boot. The problem appears to have started yesterday during a period of build breakage encompassing the following commits: 2020.10.03.17.30.54 rillig src/usr.bin/make/unit-tests/Makefile 1.159 2020.10.03.17.30.54 rillig src/usr.bin/make/unit-tests/hanoi-include.exp 1.1 2020.10.03.17.30.54 rillig src/usr.bin/make/unit-tests/hanoi-include.mk 1.1 2020.10.03.17.31.46 thorpej src/sys/arch/alpha/alpha/autoconf.c 1.55 2020.10.03.17.31.46 thorpej src/sys/arch/alpha/alpha/machdep.c 1.366 2020.10.03.17.31.46 thorpej src/sys/arch/alpha/alpha/prom.c 1.58 2020.10.03.17.31.46 thorpej src/sys/arch/alpha/include/alpha.h 1.42 2020.10.03.17.31.46 thorpej src/sys/arch/alpha/include/prom.h 1.16 2020.10.03.17.32.49 thorpej src/sys/arch/alpha/alpha/qemu.c 1.3 2020.10.03.17.33.23 thorpej src/sys/arch/alpha/include/rpb.h 1.44 2020.10.03.18.06.37 christos src/sbin/mount_nfs/mount_nfs.8 1.49 2020.10.03.18.06.37 christos src/sbin/mount_nfs/mount_nfs.c 1.73 2020.10.03.18.29.02 wiz src/sbin/mount_nfs/mount_nfs.8 1.50 2020.10.03.18.30.39 christos src/include/rpc/auth.h 1.20 2020.10.03.18.31.29 christos src/lib/libc/rpc/Makefile.inc 1.27 2020.10.03.18.31.29 christos src/lib/libc/rpc/auth_unix.c 1.27 2020.10.03.18.31.29 christos src/lib/libc/rpc/rpc_clnt_auth.3 1.7 2020.10.03.18.33.52 christos src/distrib/sets/lists/comp/mi 1.2361 2020.10.03.18.34.15 christos src/lib/libc/shlib_version 1.290 2020.10.03.18.35.21 christos src/distrib/sets/lists/base/shl.mi 1.908 2020.10.03.18.35.21 christos src/distrib/sets/lists/debug/shl.mi 1.267 2020.10.03.18.42.20 christos src/sbin/mount_nfs/mount_nfs.c 1.74 2020.10.03.18.54.18 martin src/usr.sbin/sysinst/bsddisklabel.c 1.46 2020.10.03.18.54.18 martin src/usr.sbin/sysinst/disklabel.c 1.40 2020.10.03.18.54.18 martin src/usr.sbin/sysinst/gpt.c 1.19 2020.10.03.18.54.18 martin src/usr.sbin/sysinst/label.c 1.26 
2020.10.03.18.54.18 martin src/usr.sbin/sysinst/mbr.c 1.34 2020.10.03.18.54.18 martin src/usr.sbin/sysinst/part_edit.c 1.18 2020.10.03.18.54.18 martin src/usr.sbin/sysinst/partitions.h 1.17 2020.10.03.20.34.06 rillig src/distrib/sets/lists/tests/mi 1.937 -- Andreas Gustafsson, g...@gson.org
Finding errors in build logs
On June 21, Simon J. Gerraty wrote: > > It would be helpful for both human and robotic users if error messages > > consistently included the word "error", or if there was some other easy > > way of identifying them in the build log. > > The regex 'make.*stopped' is the best clue to look for since it will > always be present. It is not present in this recent log extract: http://releng.netbsd.org/b5reports/i386/2020/2020.09.27.13.59.24/build.log.tail I also checked the full log, and it's not there, either. -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
The NetBSD Test Fixture reported this test failure twice: > net/net/t_unix:sockaddr_un_local_peereid Sorry about the duplicate report. The testbed is now using Python 3 and that appears to have broken the duplicate suppression. I'll fix it. -- Andreas Gustafsson, g...@gson.org
Re: System panicing on boot since recent uvm changes
Tobias Nygren wrote: > Seems there is still something wrong with -current. > ./build.sh -j8 hangs in <10 seconds on a t3.2xlarge EC2 instance. > Reverting to a -D20200812 kernel makes it stable. FWIW, I successfully completed a "build.sh -j 24 release" of 9.0 hosted on a -current built from source date 2020.08.16.00.24.41, running on real amd64 hardware. -- Andreas Gustafsson, g...@gson.org
Re: System panicing on boot since recent uvm changes
Chuck Silvers wrote: > this should be fixed now. > sorry about that, the problem did not happen for me and > it took me forever to find a way that I could reproduce it. This is not to pick on you specifically as almost everyone is doing the same thing, but IMO, in cases like this it would generally be better to revert the commit immediately and later re-commit a correct version rather than leaving things broken during the entire process of reproducing and fixing the issue. -- Andreas Gustafsson, g...@gson.org
System panicing on boot since recent uvm changes
Hi chs, At least i386, amd64, and sparc have all been panicking on boot since this commit: 2020.08.14.09.06.14 chs src/sys/miscfs/genfs/genfs_io.c 1.100 2020.08.14.09.06.15 chs src/sys/uvm/uvm_extern.h 1.231 2020.08.14.09.06.15 chs src/sys/uvm/uvm_object.c 1.24 2020.08.14.09.06.15 chs src/sys/uvm/uvm_object.h 1.39 2020.08.14.09.06.15 chs src/sys/uvm/uvm_page.c 1.245 2020.08.14.09.06.15 chs src/sys/uvm/uvm_page_status.c 1.6 2020.08.14.09.06.15 chs src/sys/uvm/uvm_pager.c 1.129 2020.08.14.09.06.15 chs src/sys/uvm/uvm_vnode.c 1.116 Logs: http://releng.netbsd.org/b5reports/i386/commits-2020.08.html#2020.08.14.09.06.15 Please revert the commit. -- Andreas Gustafsson, g...@gson.org
Re: i386 and amd64 testbeds now use NVMM
Jukka Ruohonen wrote: > > This reduces the time it takes to run the test suite from more than > > 20 hours to about 3-4 hours. Many thanks to Maxime Villard for making > > this possible by writing NVMM. > > Does this mean that the amount of test runs increases accordingly (i.e., > to about six runs per 24h)? It's a bit more complicated than that. Since multiple source versions are tested in parallel, the i386 tests have been achieving a throughput of more than six runs per 24 h even before the switch to NVMM. Using NVMM frees up a significant amount of CPU, but the builds and the sparc tests still use as much CPU as before. So the overall throughput of the server has increased, but by a smaller factor than the latency of the i386 and amd64 tests has decreased. -- Andreas Gustafsson, g...@gson.org
i386 and amd64 testbeds now use NVMM
Hi all, The TNF testbed is now using NVMM for the i386 and amd64 tests: http://releng.netbsd.org/b5reports/i386/ http://releng.netbsd.org/b5reports/amd64/ This reduces the time it takes to run the test suite from more than 20 hours to about 3-4 hours. Many thanks to Maxime Villard for making this possible by writing NVMM. The switch to NVMM was made yesterday, but since the testbed may test source versions out of order, there is not necessarily an unambiguous transition point in terms of -current source dates. To determine whether a given test run was made using NVMM or not, look for "-accel nvmm" in the qemu command line in the console log. Some test cases that were previously failing are now passing and vice versa. For example, kernel/t_trapsignal:fpe_* now pass, but lib/libpthread/t_condwait:* now fail (these contain a work-around for the qemu timing issues of PR 43997, but now fail to detect that they are running under qemu). -- Andreas Gustafsson, g...@gson.org
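The check described above, looking for "-accel nvmm" in the qemu command line in the console log, can be sketched in a few lines of Python. This is only an illustration and not part of the testbed: the function name is made up, and the sample command lines are hypothetical, since the exact layout of the console log is not shown in this thread.

```python
import re

# Matches the accelerator option regardless of how much whitespace
# separates "-accel" from "nvmm".
NVMM_OPTION = re.compile(r"-accel\s+nvmm\b")


def run_used_nvmm(console_log_text):
    """Return True if the console log text contains the "-accel nvmm" option."""
    return NVMM_OPTION.search(console_log_text) is not None


# Hypothetical command lines, for illustration only.
with_nvmm = "qemu-system-i386 -m 256 -accel nvmm -nographic"
without_nvmm = "qemu-system-i386 -m 256 -nographic"

print(run_used_nvmm(with_nvmm))
print(run_used_nvmm(without_nvmm))
```

Because the testbed may test source versions out of order, a check like this would have to be applied to each run's console log individually rather than inferred from the source date.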
Re: Automated report: NetBSD-current/i386 build failure
The NetBSD Test Fixture wrote: > --- kern-XEN3PAE_DOMU --- > *** [netbsd] Error code 1 > nbmake[2]: stopped in > /tmp/bracket/build/2020.08.09.11.04.05-i386/obj/sys/arch/i386/compile/XEN3PAE_DOMU Specifically: --- kern-XEN3PAE_DOMU --- /tmp/bracket/build/2020.08.09.11.04.05-i386/tools/bin/i486--netbsdelf-ld: trap.o: in function `trap': trap.c:(.text+0xe27): undefined reference to `x86_cpu_is_lcall' -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
The build is still failing, current error as of 2020.07.26.09.17.24: === 1 extra files in DESTDIR = Files in DESTDIR but missing from flist. File is obsolete or flist is out of date ? -- ./usr/libdata/debug/usr/tests/sys/crypto/chacha = end of 1 extra files === -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
The build is still broken as of source date 2020.07.19.16.22.44: http://releng.netbsd.org/b5reports/i386/commits-2020.07.html#2020.07.19.16.22.44 -- Andreas Gustafsson, g...@gson.org
Re: NetBSD-7.0 boots OK and NetBSD-8.0 hangs/crashes during boot on a MacBook7,1
Brian Buhrow wrote: > Hello. I'm thinking of notebooks. Yes, they have screens and > keyboards, but those are not always usable and, having a serial console > over USB could let someone install to a notebook remotely. > Also, I've encountered some Intel based appliance boards that don't have > easily > used serial ports on them. When they're installed in cramped wiring > closets, it's much easier to get a USB serial port on them than it is to > get a screen and keyboard. It's not just laptops and appliance boards - even ATX sized PC motherboards have been made with no com ports for a long time, for example the Intel DH67CL from 2011. The specifications at https://ark.intel.com/content/www/us/en/ark/products/50101/intel-desktop-board-dh67cl.html say # of Serial Ports: 0 Serial Port via Internal Header: No and when booting NetBSD on one, the dmesg output contains no "com" entry. -- Andreas Gustafsson, g...@gson.org
Re: NetBSD-7.0 boots OK and NetBSD-8.0 hangs/crashes during boot on a MacBook7,1
Martin Husemann wrote: > USB keyboards as console in ddb worked fine last I tested. That has not been my experience. For example, PR 52569 Entering ddb using USB keyboard panics with "locking against myself" PR 54599 Can't enter ddb using USB keyboard because console > Running the usb host in polled mode however is quite a bit simpler than > doing the device part (where you have to obey timings from the host). Why would acting as a device be needed? -- Andreas Gustafsson, g...@gson.org
Re: Build error on amd64 -current
Paul Goyette wrote: > With up-to-date sources I'm getting > > /build/netbsd-compat/src_ro/sys/arch/xen/x86/cpu.c: In function > 'mp_cpu_start': > /build/netbsd-compat/src_ro/sys/arch/xen/x86/cpu.c:999:1: error: stack usage > is 5408 bytes [-Werror=stack-usage=] > mp_cpu_start(struct cpu_info *ci, vaddr_t target) > ^~~~ It started with this commit: 2020.06.25.14.52.26 jdolecek src/sys/conf/Makefile.kern.inc 1.274 enable gcc stack usage limit for kernel functions, set to 3.5 KiB for now as that seems to be enough to accomodate the current biggest stack usages there are about six functions which use over 3KiB local stack, and about a dozen between 2-3 KiB, so pushing this further needs more work if desired compile tested on amd64, i386, sparc64, sparc, powerpc (evbppc - BookE), m68k (mac68k) -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
Simon J. Gerraty wrote: > Simon J. Gerraty wrote: > > > It would be helpful for both human and robotic users if error messages > > > consistently included the word "error", or if there was some other easy > > > way of identifying them in the build log. > > > > The regex 'make.*stopped' is the best clue to look for since it will > > always be present. I'll change bracket to look for that and see how it works. > BTW if this behavior change is a problem for your automation, you can > disable it by setting .MAKE.DIE_QUIETLY=no That would be counter to my principle of testing with default settings. -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
Martin pointed me to this error some 63 lines from the end of the log: --- dependall-tests --- nbmake[7]: nbmake[7]: don't know how to make t_cabsl.cc. Stop I think the reason I didn't find it myself is that I have developed a habit of searching for the message "Error code 1" (or similar with another number) which used to be printed by make, but that's no longer there. Bracket also looks for that string as part of its heuristics for deciding how much of the build log to include in the email report, which is why this report didn't include any of it. It would be helpful for both human and robotic users if error messages consistently included the word "error", or if there was some other easy way of identifying them in the build log. -- Andreas Gustafsson, g...@gson.org
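To make the problem concrete, here is a minimal Python sketch of scanning a build log for the kinds of markers discussed in this thread: the "Error code N" line, the 'make.*stopped' regex, and the "don't know how to make" message. It is not bracket's actual heuristic; the function name and the choice of patterns are assumptions made for illustration.

```python
import re

# Patterns taken from messages quoted in this thread. Treating exactly
# these three as "error markers" is an assumption of this sketch.
ERROR_PATTERNS = [
    re.compile(r"\*\*\* \[.*\] Error code \d+"),
    re.compile(r"make.*stopped"),
    re.compile(r"don't know how to make"),
]


def find_error_lines(log_text):
    """Return (line_number, line) pairs for lines matching any pattern."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in ERROR_PATTERNS):
            hits.append((number, line))
    return hits


# A small made-up log assembled from lines quoted in this thread.
sample_log = "\n".join([
    "--- dependall-tests ---",
    "nbmake[7]: nbmake[7]: don't know how to make t_cabsl.cc. Stop",
    "compiling foo.c",
    "*** [netbsd] Error code 1",
    "nbmake[2]: stopped in /tmp/x",
])

for number, line in find_error_lines(sample_log):
    print(number, line)
```

A scan like this finds the "don't know how to make" line only because that exact phrase is listed; a consistent marker such as the word "error" would remove the need to enumerate messages one by one.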
Re: Automated report: NetBSD-current/i386 build failure
The NetBSD Test Fixture wrote: > This is an automatically generated notice of a NetBSD-current/i386 > build failure. > > The failure occurred on babylon5.netbsd.org, a NetBSD/amd64 host, > using sources from CVS date 2020.06.21.03.39.21. > > The following commits were made between the last successful build and > the failed build: > > 2020.06.21.03.39.21 lukem src/share/mk/bsd.dep.mk,v 1.85 > > Logs can be found at: > > > http://releng.NetBSD.org/b5reports/i386/commits-2020.06.html#2020.06.21.03.39.21 The full build log can be found at: http://releng.netbsd.org/b5reports/i386/2020/2020.06.21.03.39.21/build.log It's not clear from the log what the error was or where it occurred, and I'm wondering if the lack of identifying and locating information could be related to another recent commit: 2020.06.19.21.17.48 sjg src/usr.bin/make/job.c 1.198 2020.06.19.21.17.48 sjg src/usr.bin/make/main.c 1.275 2020.06.19.21.17.48 sjg src/usr.bin/make/make.h 1.108 Avoid unnecessary noise when sub-make or sibling dies When analyzing a build log, the first 'stopped' output from make, is the end of interesting output. Normally when a build fails deep down in a parallel build the log ends with many blockes of error output from make, with all but the fist being unhelpful. We add a function dieQuietly() which will return true if we should supress the error output from make. If the failing node was a sub-make, we want to die quietly. Also when we read an abort token we call dieQuietly telling we want to die quietly. This behavior is suppressed by -dj or setting .MAKE.DIE_QUIETLY=no Reviewed by: christos -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
The NetBSD Test Fixture wrote: > This is an automatically generated notice of new failures of the > NetBSD test suite. > > The newly failing test cases are: > > dev/audio/t_audio:AUDIO_ERROR_RDWR > dev/audio/t_audio:AUDIO_ERROR_WRONLY [and many more] That message got stuck somewhere (moderation?) for three days, and those particular failures have already been fixed. There are still plenty of other test cases failing, though, 79 of them at last count: http://releng.netbsd.org/b5reports/i386/2020/2020.04.27.02.54.42/test.html#failed-tcs-summary -- Andreas Gustafsson, g...@gson.org
Re: github.com/NetBSD/src 5 days old?
m...@netbsd.org wrote: > Yes, I believe joerg and spz are changing the conversion from > cvs->??->git to hg->git, to match what will be done once we stop using > CVS. Has there been a formal decision choosing hg over git? -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
There are actually now more than 2,000 failing test cases in total, but the email message reporting most of them has failed to appear on current-users, perhaps because of its size. -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
The NetBSD Test Fixture sent three reports listing the following groups of commits, respectively: >2020.04.16.14.39.58 joerg src/lib/libc/gen/pthread_atfork.c,v 1.13 >2020.04.16.14.39.58 joerg src/libexec/ld.elf_so/rtld.c,v 1.204 >2020.04.16.14.39.58 joerg src/libexec/ld.elf_so/rtld.h,v 1.139 >2020.04.16.14.39.58 joerg src/libexec/ld.elf_so/symbols.map,v 1.3 >2020.04.16.18.20.46 msaitoh src/sys/dev/pci/pcidevs,v 1.1406 >2020.04.16.18.21.12 msaitoh src/sys/dev/pci/pcidevs.h,v 1.1394 >2020.04.16.18.21.12 msaitoh src/sys/dev/pci/pcidevs_data.h,v 1.1393 >2020.04.16.18.32.29 msaitoh src/sys/dev/pci/ichsmb.c,v 1.67 >2020.04.16.18.51.47 pgoyette src/share/man/man4/man4.x86/imcsmb.4,v 1.8 >2020.04.16.18.56.04 pgoyette src/share/man/man4/man4.x86/imcsmb.4,v 1.9 >2020.04.16.19.23.50 bouyer src/sys/arch/xen/xen/Attic/xen_clock.c,v 1.1 >2020.04.16.15.47.19 christos > src/external/gpl3/binutils/dist/ld/emultempl/elf.em,v 1.2 >2020.04.16.15.58.13 jdolecek > src/sys/external/mit/xen-include-public/dist/xen/include/public/io/blkif.h,v > 1.2 >2020.04.16.16.38.43 jdolecek src/sys/arch/xen/xen/xbd_xenbus.c,v 1.116 >2020.04.16.17.18.27 nat src/sys/dev/ic/rtwnreg.h,v 1.3 >2020.04.16.17.18.27 nat src/sys/dev/usb/if_urtwn.c,v 1.86 The latter two reports are spurious, and the commits listed in them have nothing to do with the breakage. The reason for the spurious reports is that a large number of t_ptrace_wait* test cases started failing with the commit listed in the first report, but are not failing in every run. Tests that happened to pass in the first run and fail four times in a row after that got reported one commit too late, etc. -- Andreas Gustafsson, g...@gson.org
Re: Build time measurements
Earlier, I wrote: > > After disabling DIAGNOSTIC and acpicpu, they are: > > > > 2016.09.06.06.27.17 3319.87 real 9767.39 user 4184.24 sys > > 2019.10.18.17.16.50 3525.65 real 10309.00 user 11618.57 sys > > 2020.03.17.22.03.41 2419.52 real 9577.58 user 9602.81 sys > > 2020.03.22.19.56.07 2363.06 real 9482.36 user 7614.66 sys One more with the same settings: 2020.04.09.11.10.07 2210.82 real 9435.36 user 4388.02 sys That's a great reduction in system time in the last few weeks. -- Andreas Gustafsson, g...@gson.org
Re: WRT the failing ATF tests (some of them)
Robert Elz wrote: > A bunch of the net yet fixed, relatively recent ATF test failures are > caused by: > > rn_init: radix functions require max_keylen be set I don't think there is a causal relationship between those messages and any current test failures. If it's any consolation, I made the same mistaken assumption back in 2011. Tens or hundreds of those messages appear in the output from most test runs starting in 2009, including ones where all tests passed. I think the only runs where they don't occur are those where the system was unable to actually run the tests. Can you please file a PR (about the messages being printed and confusing people for a decade now, not about the tests being broken)? -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
NetBSD Test Fixture wrote: > This is an automatically generated notice of new failures of the > NetBSD test suite. > > The newly failing test cases are: > > fs/puffs/t_basic:root_chrdev > fs/puffs/t_basic:root_fifo > fs/puffs/t_basic:root_lnk (etc) These are already reported in kern/55146 as the automated report was delayed until kern/54786 got fixed. Some but not all of the failures reported are already fixed; a more up-to-date list of the tests still failing is at http://www.gson.org/netbsd/bugs/build/amd64-baremetal/2020/2020.04.06.20.26.16/test.html#failed-tcs-summary -- Andreas Gustafsson, g...@gson.org
Re: Build time measurements
Andrew, You wrote: > > 2016.09.06.06.27.17 3319.87 real 9767.39 user 4184.24 sys > > 2019.10.18.17.16.50 3525.65 real 10309.00 user 11618.57 sys > > 2020.03.17.22.03.41 2419.52 real 9577.58 user 9602.81 sys > > 2020.03.22.19.56.07 2363.06 real 9482.36 user 7614.66 sys > > Thanks for repeating the tests. For the sys time to still be that high in > relation to user, there's some other limiting factor. Does that machine > have tmpfs /tmp? It is a fresh install with all default settings except for disabling DIAGNOSTIC and acpicpu. For 2020.03.22.19.56.07 that means it does have a tmpfs /tmp, but I have not checked the others. The SRCDIR, OBJDIR, etc are all on a single SATA SSD. > Is NUMA enabled in the BIOS? Different node number for > CPUs in recent kernels in dmesg is a good clue. Different from the other CPUs in the same dmesg, or different from a non-recent kernel? And how recent is recent? > Is it a really old source tree? Every build is of the official NetBSD-8.1/amd64 tree. > I would be interested to see lockstat output from a kernel build at > some point, if you're so inclined. Is this just "lockstat build.sh ...", or are there some specific lockstat options I should use? PS. I would prefer that you prioritize fixing the fallout from the changes you have already made so far over making further changes. -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
The NetBSD Test Fixture wrote: > --- dependall-gdb --- > > CC=/tmp/bracket/build/2020.04.02.11.52.41-i386/tools/bin/i486--netbsdelf-c++ > /tmp/bracket/build/2020.04.02.11.52.41-i386/tools/bin/nbmkdep -f maint.d.tmp > -- -std=gnu++11 > --sysroot=/tmp/bracket/build/2020.04.02.11.52.41-i386/destdir -D_KERNTYPES > -I/tmp/bracket/build/2020.04.02.11.52.41-i386/src/external/gpl3/gdb/lib/libgdb > > -I/tmp/bracket/build/2020.04.02.11.52.41-i386/src/external/gpl3/gdb/lib/libgdb/arch/i386 > > -I/tmp/bracket/build/2020.04.02.11.52.41-i386/src/external/gpl3/gdb/lib/libgdb/../../dist/gdb > > -I/tmp/bracket/build/2020.04.02.11.52.41-i386/src/external/gpl3/gdb/lib/libgdb/../../dist/gdb/config > > -I/tmp/bracket/build/2020.04.02.11.52.41-i386/src/external/gpl3/gdb/lib/libgdb/../../dist/gdb/common > > -I/tmp/bracket/build/2020.04.02.11.52.41-i386/src/external/gpl3/gdb/lib/libgdb/../../dist/gdb/gnulib/import > > -I/tmp/bracket/build/2020.04.02.11.52.41-i386/src/external/gpl3/gdb/lib/libgdb/../../dist/include/opcode > -I/tmp/bracket/build/2020.04.02.11.52.41-i386/src/external/gpl3/gdb > /lib/libgdb/../../dist/libdecn--- dependall-gcc --- > > /tmp/bracket/build/2020.04.02.11.52.41-i386/src/external/gpl3/gcc/dist/gcc/machmode.h:593:30: > error: 'mode_nunits_inline' was not declared in this scope > ? mode_nunits_inline (mode) : mode_nunits[mode]); > ^ > *** [min-insn-modes.lo] Error code 1 > nbmake[9]: stopped in > /tmp/bracket/build/2020.04.02.11.52.41-i386/src/external/gpl3/gcc/usr.bin/backend > --- gengtype.lo --- This looks like a random failure, and there have been a couple of other similar ones also involving machmode.h: http://releng.netbsd.org/b5reports/i386/2019/2019.11.26.08.38.19/build.log.tail http://releng.netbsd.org/b5reports/i386/2020/2020.03.15.15.58.24/build.log.tail Someone please fix. -- Andreas Gustafsson, g...@gson.org
Re: Build time measurements
On Wednesday, I said: > I will rerun the 24-core tests with these disabled for comparison. Done. To recap, with a stock GENERIC kernel, the numbers were:

  2016.09.06.06.27.17  3321.55 real   9853.49 user   5156.92 sys
  2019.10.18.17.16.50  3767.63 real  10376.15 user  16100.99 sys
  2020.03.17.22.03.41  2910.76 real   9696.10 user  18367.58 sys
  2020.03.22.19.56.07  2711.14 real   9729.10 user  12068.90 sys

After disabling DIAGNOSTIC and acpicpu, they are:

  2016.09.06.06.27.17  3319.87 real   9767.39 user   4184.24 sys
  2019.10.18.17.16.50  3525.65 real  10309.00 user  11618.57 sys
  2020.03.17.22.03.41  2419.52 real   9577.58 user   9602.81 sys
  2020.03.22.19.56.07  2363.06 real   9482.36 user   7614.66 sys

-- Andreas Gustafsson, g...@gson.org
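For anyone who wants to compare these rows programmatically, the following Python sketch parses lines of the form "source-date real user sys" and computes the ratio of system time to user time for each. It is only an illustration: the `Timing` type and the function names are made up for this example, and the sample data is the set of figures measured after disabling DIAGNOSTIC and acpicpu.

```python
import re
from typing import NamedTuple


class Timing(NamedTuple):
    source_date: str
    real: float
    user: float
    sys: float


# A source date has the form YYYY.MM.DD.hh.mm.ss. The whitespace after it
# is optional so that rows whose first two columns have run together
# (as can happen in archived copies of the mail) still parse.
ROW = re.compile(
    r"(\d{4}(?:\.\d{2}){5})\s*([\d.]+)\s+real\s+([\d.]+)\s+user\s+([\d.]+)\s+sys"
)


def parse_timings(text):
    """Return a list of Timing tuples for every row found in the text."""
    return [
        Timing(m.group(1), float(m.group(2)), float(m.group(3)), float(m.group(4)))
        for m in ROW.finditer(text)
    ]


def sys_to_user_ratio(timing):
    """System time divided by user time for one build."""
    return timing.sys / timing.user


SAMPLE = """
2016.09.06.06.27.17 3319.87 real 9767.39 user 4184.24 sys
2019.10.18.17.16.50 3525.65 real 10309.00 user 11618.57 sys
2020.03.17.22.03.41 2419.52 real 9577.58 user 9602.81 sys
2020.03.22.19.56.07 2363.06 real 9482.36 user 7614.66 sys
"""

for timing in parse_timings(SAMPLE):
    print(timing.source_date, f"{sys_to_user_ratio(timing):.2f}")
```

The ratio is just one convenient summary; since user time is roughly constant across these builds, comparing the sys column directly tells much the same story.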
Re: Build time measurements
Andrew, You wrote: > Thank you for doing this, and for bisecting the performance losses over > time (I fixed the vnode regression you found BTW). Thank you for the fix and the other performance improvements! > There are two options enabled in -current that spoil performance on multi > processor machines: DIAGNOSTIC and acpicpu. I'm guessing that you had both > enabled during your test runs. Yes, my tests so far have all been using unmodified GENERIC kernels. > We ship releases without DIAGNOSTIC, and acpicpu really needs to be > fixed. I will rerun the 24-core tests with these disabled for comparison. -- Andreas Gustafsson, g...@gson.org
Build time measurements
Hi all, In September and November, I reported some measurements of the amount of system time it takes to build a NetBSD-8/amd64 release on different versions of -current/amd64. I have now repeated the measurements with a couple of newer versions of -current on the same hardware, and here are the results. The left column is the source date of the -current system hosting the build.

HP ProLiant DL360 G7, 2 x Xeon L5630, 8 cores, 32 GB, build.sh -j 8

  2016.09.06.06.27.17  3930.86 real  15737.04 user   4245.26 sys
  2019.10.18.17.16.50  4461.47 real  16687.37 user   9344.68 sys
  2020.03.17.22.03.41  4723.81 real  16646.42 user   8928.72 sys
  2020.03.22.19.56.07  4595.95 real  16592.80 user   8171.56 sys

I also measured the same versions on a newer machine with more cores:

Dell PowerEdge 630, 2 x Xeon E5-2678 v3, 24 cores, 32 GB, build.sh -j 24

  2016.09.06.06.27.17  3321.55 real   9853.49 user   5156.92 sys
  2019.10.18.17.16.50  3767.63 real  10376.15 user  16100.99 sys
  2020.03.17.22.03.41  2910.76 real   9696.10 user  18367.58 sys
  2020.03.22.19.56.07  2711.14 real   9729.10 user  12068.90 sys

-- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
The NetBSD Test Fixture wrote: > cc1: all warnings being treated as errors > *** [t_ptrace_wait.o] Error code 1 The compiler error message did not appear because it was too far back from the end of the build log (5149 lines): --- dependall-sys --- /tmp/bracket/build/2020.03.07.14.53.14-i386/src/tests/lib/libc/sys/t_ptrace_wait.c: In function 'traceme_crash': /tmp/bracket/build/2020.03.07.14.53.14-i386/src/tests/lib/libc/sys/t_ptrace_wait.c:441:24: error: implicit declaration of function 'are_fpu_exceptions_supported'; did you mean 'are_fpu_exceptions_supporter'? [-Werror=implicit-function-declaration] if (sig == SIGFPE && !are_fpu_exceptions_supported()) ^~~~ are_fpu_exceptions_supporter -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
This morning, the NetBSD Test Fixture wrote: > The newly failing test case is: > > net/if_ipsec/t_ipsec_natt:ipsecif_natt_transport_rijndaelcbc > > The above test failed in each of the last 3 test runs, and passed in > at least 27 consecutive runs before that. > > The following commits were made between the last successful test and > the failed test: > > 2020.03.04.22.00.03 ad src/sys/arch/x86/x86/pmap.c,v 1.362 > 2020.03.04.22.07.08 christos src/external/bsd/Makefile,v 1.68 > 2020.03.04.22.09.00 christos src/distrib/sets/lists/base/mi,v 1.1231 > 2020.03.04.22.09.00 christos src/distrib/sets/lists/debug/mi,v 1.296 > 2020.03.04.22.24.46 fcambus src/share/misc/inter.phone,v 1.32 > 2020.03.04.22.56.08 jmcneill src/external/bsd/Makefile,v 1.69 I'm not sure what happened here. The ipsecif_natt_transport_null and ipsecif_natt_transport_rijndaelcbc test cases both failed in the same three consecutive tests which seems unlikely to be a coincidence, but whatever it was, it appears to have been resolved as they have both passed twice since then. Here are the outcomes of the last 40 runs for the two test cases with "-" meaning success and "X" meaning failure: XX-X--X-X--XXX-- net/if_ipsec/t_ipsec_natt:ipsecif_natt_transport_null -X-XXX-- net/if_ipsec/t_ipsec_natt:ipsecif_natt_transport_rijndaelcbc -- Andreas Gustafsson, g...@gson.org
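The pattern the automated reports key on, a long stretch of successes followed by several failures in a row, can be sketched in Python using the same "-" and "X" notation. This is a hypothetical illustration, not the testbed's code: the function name is made up, and the thresholds of 27 passes and 3 failures are simply the numbers that appear in the quoted report.

```python
def first_blamed_run(outcomes, min_passes=27, min_failures=3):
    """Return the index of the first run in the trailing failure streak.

    `outcomes` is a string with one character per run, oldest first,
    where "-" means success and "X" means failure. Returns None unless
    the history ends in at least `min_failures` failures directly
    preceded by at least `min_passes` successes.
    """
    # Length of the streak of failures at the end of the history.
    failures = len(outcomes) - len(outcomes.rstrip("X"))
    if failures < min_failures:
        return None
    # Length of the streak of successes immediately before it.
    earlier = outcomes[: len(outcomes) - failures]
    passes = len(earlier) - len(earlier.rstrip("-"))
    if passes < min_passes:
        return None
    return len(outcomes) - failures


# A clean transition: 27 passes, then 3 failures.
print(first_blamed_run("-" * 27 + "XXX"))

# A test that has been failing intermittently all along does not
# match, because the passing streak before the failures is too short.
print(first_blamed_run("X-X--X" + "-" * 10 + "XXX"))
```

A randomly failing test can still produce a history that satisfies this check purely by chance, which is how a report can end up attributed to an unrelated commit.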
Re: Automated report: NetBSD-current/i386 build failure
NetBSD Test Fixture wrote: > *** [cleandir-pamu2fcfg] Error code 2 > nbmake[7]: stopped in > /tmp/bracket/build/2020.03.03.00.47.33-i386/src/external/bsd/pam-u2f/bin The build is still broken as of source date 2020.03.03.08.56.05: http://releng.netbsd.org/b5reports/i386/commits-2020.03.html#2020.03.03.08.56.05 Would it be too much to ask that imports of entire new subsystems like this be at least build tested with "build.sh release"? -- Andreas Gustafsson, g...@gson.org
Re: Regressions
Jason Thorpe wrote: > The issue seems to be that rump really wants to join threads that > are created for work queues when the rump server exits. But in this > particular case, there's a global work queue that never goes away > because in the real kernel, there's no need to do this before the > system reboots / shuts down. Any change to fix this will be 100% > for the appeasement of rump. Well, yes, just like any change to fix the current build breakage in if_stge.c will be 100% for the appeasement of 32-bit platforms. Are you saying fixing one or the other is not your responsibility, and if so, whose? -- Andreas Gustafsson, g...@gson.org
Regressions
Hi all, NetBSD-current is again suffering from a number of regressions. The last time the ATF tests showed zero unexpected failures on real amd64 hardware was on Dec 12, and the sparc, sparc64, pmax, and hpcmips tests have all been unable to run to completion for more than a month. Here are the PRs for some of the issues: 50350 rump/rumpkern/t_sp/stress_{long,short} fail on Core 2 Quad 54810 sparc64 pool_redzone_check errors during install 54845 sparc panics in sleepq_remove 54923 pmax test runs fail to complete since Jan 15 55018 atf tests for pppoe sometimes leave rump_server processes around 55020 dbregs_dr?_dont_inherit_lwp test cases fail on real hardware 55032 rump/rumpkern/t_vm:uvmwait test case now fails What can be done? -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
The NetBSD Test Fixture wrote: > *** [hifn7751.o] Error code 1 > nbmake[8]: stopped in > /tmp/bracket/build/2020.02.29.11.03.44-i386/src/sys/modules/hifn Specifically: --- dependall-hifn --- In file included from /tmp/bracket/build/2020.02.29.11.03.44-i386/src/sys/dev/pci/hifn7751.c:53: /tmp/bracket/build/2020.02.29.11.03.44-i386/src/sys/dev/pci/hifn7751.c: In function 'hifn_rng_locked': /tmp/bracket/build/2020.02.29.11.03.44-i386/src/sys/dev/pci/hifn7751.c:692:13: error: comparison of integer expressions of different signedness: 'unsigned int' and 'int' [-Werror=sign-compare] nwords = MIN(__arraycount(num), nwords); ^~~ /tmp/bracket/build/2020.02.29.11.03.44-i386/src/sys/dev/pci/hifn7751.c:692:13: error: operand of ?: changes signedness from 'int' to 'unsigned int' due to unsignedness of other operand [-Werror=sign-compare] nwords = MIN(__arraycount(num), nwords); ^~~ /tmp/bracket/build/2020.02.29.11.03.44-i386/src/sys/dev/pci/hifn7751.c: In function 'hifn_next_signature': /tmp/bracket/build/2020.02.29.11.03.44-i386/src/sys/dev/pci/hifn7751.c:850:16: error: comparison of integer expressions of different signedness: 'int' and 'u_int' {aka 'unsigned int'} [-Werror=sign-compare] for (i = 0; i < cnt; i++) { ^ /tmp/bracket/build/2020.02.29.11.03.44-i386/src/sys/dev/pci/hifn7751.c: In function 'hifn_ramtype': /tmp/bracket/build/2020.02.29.11.03.44-i386/src/sys/dev/pci/hifn7751.c:1134:16: error: comparison of integer expressions of different signedness: 'int' and 'unsigned int' [-Werror=sign-compare] for (i = 0; i < sizeof(data); i++) ^ /tmp/bracket/build/2020.02.29.11.03.44-i386/src/sys/dev/pci/hifn7751.c:1145:16: error: comparison of integer expressions of different signedness: 'int' and 'unsigned int' [-Werror=sign-compare] for (i = 0; i < sizeof(data); i++) ^ /tmp/bracket/build/2020.02.29.11.03.44-i386/src/sys/dev/pci/hifn7751.c: In function 'hifn_sramsize': /tmp/bracket/build/2020.02.29.11.03.44-i386/src/sys/dev/pci/hifn7751.c:1171:16: error: comparison of integer 
expressions of different signedness: 'int32_t' {aka 'int'} and 'unsigned int' [-Werror=sign-compare] for (i = 0; i < sizeof(data); i++) ^ -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
The NetBSD Test Fixture wrote: > nbmake[8]: nbmake[8]: don't know how to make libpam_echo.so.. Stop This was fixed but the build is still broken as of 2020.02.27.03.25.08: # link rescue/rescue [...] /tmp/bracket/build/2020.02.27.03.25.08-i386/tools/lib/gcc/i486--netbsdelf/8.3.0/../../../../i486--netbsdelf/bin/ld: /tmp/bracket/build/2020.02.27.03.25.08-i386/destdir/usr/lib/libssh.a(sshkey.o): in function `.L1885': sshkey.c:(.text+0x83d9): undefined reference to `sshsk_sign' collect2: error: ld returned 1 exit status *** [rescue] Error code 1 More logs: http://releng.netbsd.org/b5reports/i386/commits-2020.02.html#2020.02.27.03.25.08 -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
The NetBSD Test Fixture wrote: > nbmake[8]: stopped in > /tmp/bracket/build/2020.02.14.04.38.48-i386/src/sys/modules/drmkms > 1 error --- dependall-sys --- /tmp/bracket/build/2020.02.14.04.38.48-i386/src/sys/external/bsd/drm2/dist/drm/drm_bufs.c:958:40: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith] buf->address = (void *)(dmah->vaddr + offset); ^ -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
The NetBSD Test Fixture wrote: > The newly failing test case is: > > net/ipsec/t_ipsec_l2tp:ipsec_l2tp_ipv6_transport_ah_null > > The above test failed in each of the last 3 test runs, and passed in > at least 27 consecutive runs before that. The fourth test run passed, so this looks like another random occurrence made more likely by the high frequency of ipsec test failures reported in PR 54897. -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
The NetBSD Test Fixture wrote: > The newly failing test case is: > > net/ipsec/t_ipsec_tunnel:ipsec_tunnel_ipv4_ah_keyedmd5 > > The above test failed in each of the last 3 test runs, and passed in > at least 27 consecutive runs before that. > > The following commits were made between the last successful test and > the failed test: > > 2020.01.28.07.43.42 martin src/usr.sbin/sysinst/partitions.c,v 1.10 > 2020.01.28.07.47.26 skrll src/sys/arch/arm/mainbus/cpu_mainbus.c,v 1.17 > 2020.01.28.08.09.19 martin src/sys/dev/fdt/fdtbus.c,v 1.32 > 2020.01.28.09.23.15 ad src/lib/libpthread/pthread.c,v 1.158 This is probably unrelated to the commits listed. As reported in PR 54897, many IPSEC tests have been failing randomly since Feb 15, and with so many randomly failing tests, one of them failing three times in a row after succeeding 27 times is not all that unlikely. -- Andreas Gustafsson, g...@gson.org
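The argument above can be made concrete with a rough calculation. The sketch below assumes, purely for illustration, that each flaky test fails independently with a fixed probability per run; the failure rate and the number of flaky tests are hypothetical inputs, not values measured for PR 54897.

```python
def streak_probability(p_fail: float, passes: int = 27, fails: int = 3) -> float:
    """Probability that one flaky test passes `passes` times in a row
    and then fails `fails` times in a row, assuming independent runs."""
    return (1.0 - p_fail) ** passes * p_fail ** fails


def any_test_probability(p_fail: float, n_tests: int,
                         passes: int = 27, fails: int = 3) -> float:
    """Probability that at least one of `n_tests` independent flaky tests
    shows the pass/fail streak in a given window of runs."""
    q = streak_probability(p_fail, passes, fails)
    return 1.0 - (1.0 - q) ** n_tests


if __name__ == "__main__":
    # Hypothetical numbers: 100 flaky tests, each failing 5% of the time.
    print(any_test_probability(p_fail=0.05, n_tests=100))
```

The per-window probability is small, but the testbed effectively evaluates a new window after every test run, so over many runs and many flaky tests such a pattern can be expected to show up now and then.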
Re: Automated report: NetBSD-current/i386 build failure
The i386 build is still failing as of source date 2020.01.09.04.04.01: --- dependall-exec_elf32 --- /tmp/bracket/build/2020.01.09.04.04.01-i386/src/sys/kern/core_elf32.c: In function 'coredump_note_elf32': /tmp/bracket/build/2020.01.09.04.04.01-i386/src/sys/kern/core_elf32.c:518:2: error: 'PT32_GETXSTATE' undeclared (first use in this function); did you mean 'PT_GETXSTATE'? COREDUMP_MACHDEP_LWP_NOTES(l, ns, name); ^~ /tmp/bracket/build/2020.01.09.04.04.01-i386/src/sys/kern/core_elf32.c:518:2: note: each undeclared identifier is reported only once for each function it appears in *** [core_elf32.o] Error code 1 -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
Andrew Doran wrote: > > sbin/resize_ffs/t_grow:grow_16M_v1_16384 > > sbin/resize_ffs/t_grow:grow_16M_v2_32768 > > sbin/resize_ffs/t_grow_swapped:grow_16M_v0_65536 > > sbin/resize_ffs/t_grow_swapped:grow_16M_v1_4096 > > sbin/resize_ffs/t_grow_swapped:grow_16M_v2_8192 > > sbin/resize_ffs/t_shrink:shrink_24M_16M_v0_32768 > > sbin/resize_ffs/t_shrink:shrink_24M_16M_v1_65536 > > sbin/resize_ffs/t_shrink_swapped:shrink_24M_16M_v0_4096 > > sbin/resize_ffs/t_shrink_swapped:shrink_24M_16M_v1_8192 > > Hmm, I wonder I this is a rump issue. In any case I'll take a look into the > failures this evening. Looks like the resize_ffs failures have been fixed already: http://www.gson.org/netbsd/bugs/build/i386-baremetal/commits-2019.12.html#2019.12.17.18.59.39 The lfs ones are still failing. -- Andreas Gustafsson, g...@gson.org
Re: current/Xen i386 broken on 2019-12-16 01:20 UTC
Martin Husemann wrote: > We see that on various architectures. Indeed. Here's one from i386 under qemu/KVM under Linux, with a more helpful backtrace than the Xen one: ipsec_l2tp_ipv6_tunnel_ah_null: [18.164581s] Passed. ipsec_l2tp_ipv6_tunnel_esp_null: [19.390658s] Passed. ipsec_l2tp_ipv6_tunnel_esp_rijndaelcbc: [ 4272.8545386] panic: kernel diagnostic assertion "pg->offset >= nextoff" failed: file "/bracket/build/2019.12.15.23.13.33-i386/src/sys/miscfs/genfs/genfs_io.c", line 972 [ 4272.8545386] cpu0: Begin traceback... [ 4272.8545386] vpanic(c10df2ac,c52e9c98,c52e9dd8,c09f5e28,c10df2ac,c10df1ef,c11b1014,c11b0c0c,3cc,0) at netbsd:vpanic+0x139 [ 4272.8545386] kern_assert(c10df2ac,c10df1ef,c11b1014,c11b0c0c,3cc,0,0,c52e9cf0,c0974f02,c17a188c) at netbsd:kern_assert+0x23 [ 4272.8545386] genfs_do_putpages(c1a309dc,0,0,0,0,8011,0,c52e9e30,c09f23cb,c52e9e10) at netbsd:genfs_do_putpages+0x75e [ 4272.8648708] genfs_putpages(c52e9e10,0,10,c10548c4,c1a309dc,0,0,0,0,8011) at netbsd:genfs_putpages+0x3f [ 4272.8648708] VOP_PUTPAGES(c1a309dc,0,0,0,0,8011,6,c1a309dc,0,0) at netbsd:VOP_PUTPAGES+0x4c [ 4272.8648708] vflushbuf(c1a309dc,8,c180d300,c1972100,3d5a9ff1,c1972c00,c1972100,4,c0155f51,c52e9f3c) at netbsd:vflushbuf+0x62 [ 4272.8648708] ffs_full_fsync(c1a309dc,8,c0980010,30,c09e3a39,10,c1992008,c1a309dc,c52e9f08,102) at netbsd:ffs_full_fsync+0x134 [ 4272.8745850] ffs_fsync(c52e9f3c,c52e9f60,c09e7529,c1054c18,c1a309dc,c1808040,8,0,0,0) at netbsd:ffs_fsync+0x127 [ 4272.8745850] VOP_FSYNC(c1a309dc,c1808040,8,0,0,0,0,5df6e568,0,c1ba7188) at netbsd:VOP_FSYNC+0x4f [ 4272.8745850] sched_sync(c1972c00,16bd000,16c8000,0,c01005a3,0,0,0,0,0) at netbsd:sched_sync+0x1f0 [ 4272.8745850] cpu0: End traceback... 
This is from: http://www.gson.org/netbsd/bugs/build/i386-linuxhost/2019/2019.12.15.23.13.33/test.log There's also: http://releng.netbsd.org/b5reports/i386/2019/2019.12.15.22.50.51/install.log http://releng.netbsd.org/b5reports/sparc64/2019/2019.12.16.00.03.50/install.log -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 test failure
The NetBSD Test Fixture wrote: > The newly failing test case is: > > sbin/resize_ffs/t_grow:grow_16M_v0_8192 > > The above test failed in each of the last 4 test runs, and passed in > at least 36 consecutive runs before that. > > The following commits were made between the last successful test and > the failed test: > > 2019.12.15.23.13.33 uwe src/lib/libpthread/pthread_rwlock.c,v 1.35 > 2019.12.16.00.03.50 jmcneill src/sys/arch/aarch64/aarch64/efi_machdep.c,v > 1.5 > 2019.12.16.00.03.50 jmcneill src/sys/arch/arm/arm/efi_runtime.c,v 1.3 > 2019.12.16.00.03.50 jmcneill src/sys/arch/arm/arm/efi_runtime.h,v 1.3 There was a second report that showed several other resize_ffs test cases already failing before these commits were made, so it seems likely that this particular test case just happened to randomly pass in the first run after the bug was introduced, which would mean these commits are innocent. -- Andreas Gustafsson, g...@gson.org
New ATF test failures
The following test cases are now failing on multiple testbeds: dev/sysmon/t_swsensor/alarm_sensor dev/sysmon/t_swsensor/entropy_interrupt_sensor dev/sysmon/t_swsensor/entropy_polled_sensor dev/sysmon/t_swsensor/limit_sensor dev/sysmon/t_swsensor/simple_sensor net/if_vlan/t_vlan/vlan_auto_follow_mtu net/if_vlan/t_vlan/vlan_auto_follow_mtu6 net/if_vlan/t_vlan/vlan_basic net/if_vlan/t_vlan/vlan_basic6 net/if_vlan/t_vlan/vlan_bridge net/if_vlan/t_vlan/vlan_bridge6 net/if_vlan/t_vlan/vlan_configs net/if_vlan/t_vlan/vlan_configs6 net/if_vlan/t_vlan/vlan_create_destroy net/if_vlan/t_vlan/vlan_create_destroy6 net/if_vlan/t_vlan/vlan_multicast net/if_vlan/t_vlan/vlan_multicast6 net/if_vlan/t_vlan/vlan_vlanid net/if_vlan/t_vlan/vlan_vlanid6 since these commits: 2019.12.12.22.55.20 pgoyette src/sys/kern/files.kern 1.39 2019.12.12.22.55.20 pgoyette src/sys/kern/init_main.c 1.509 2019.12.12.22.55.20 pgoyette src/sys/kern/kern_module.c 1.141 2019.12.12.22.55.20 pgoyette src/sys/kern/kern_module_hook.c 1.1 2019.12.12.22.55.20 pgoyette src/sys/rump/librump/rumpkern/Makefile.rumpkern 1.178 2019.12.12.22.55.20 pgoyette src/sys/sys/module_hook.h 1.6 2019.12.12.22.55.20 pgoyette src/sys/sys/param.h 1.624 For logs, see: http://www.gson.org/netbsd/bugs/build/amd64-baremetal/commits-2019.12.html#2019.12.12.22.55.20 -- Andreas Gustafsson, g...@gson.org
Build breakage
Hi all, As of source date 2019.12.12.05.00.33, the evbarm-earmv7hf build is failing with: --- kern_module.o --- /tmp/bracket/build/2019.12.12.11.47.30-evbarm-earmv7hf/src/lib/librump/../../sys/rump/../kern/kern_module.c: In function 'module_init': /tmp/bracket/build/2019.12.12.11.47.30-evbarm-earmv7hf/src/lib/librump/../../sys/rump/../kern/kern_module.c:456:20: error: implicit declaration of function 'pserialize_create'; did you mean 'sysctl_create'? [-Werror=implicit-function-declaration] module_hook_psz = pserialize_create(); ^ sysctl_create cc1: all warnings being treated as errors *** [kern_module.o] Error code 1 and the sparc build is also failing with a similar error. -- Andreas Gustafsson, g...@gson.org
Re: Current test failures
Taylor R Campbell wrote: > OOPS -- rmind removed pserialize_init from rump_init, so the mutex > never got initialized. Fixed in rump.c 1.337! Perhaps, but before Taylor made that commit, at least one other bug was introduced that is causing the system to panic before finishing the tests: fs/vfs/t_renamerace (726/847): 28 test cases ext2fs_renamerace: [6.743565s] Failed: Test program received signal 11 (core dumped) ext2fs_renamerace_dirs: [6.690776s] Failed: Test program received signal 11 (core dumped) ffs_renamerace: [6.602727s] Failed: Test program received signal 11 (core dumped) ffs_renamerace_dirs: [ 3923.9308316] panic: kernel diagnostic assertion "l->l_cpu == ci" failed: file "/tmp/bracket/build/2019.12.06.21.45.14-amd64-baremetal/src/sys/kern/kern_synch.c", line 764 [ 3924.1108893] cpu7: Begin traceback... [ 3924.1509019] vpanic() at netbsd:vpanic+0x178 [ 3924.2009181] kern_assert() at netbsd:kern_assert+0x48 [ 3924.2609379] mi_switch() at netbsd:mi_switch+0x569 [ 3924.3209576] sleepq_block() at netbsd:sleepq_block+0xb7 [ 3924.3809774] lwp_park() at netbsd:lwp_park+0x10d [ 3924.4409956] syslwp_park60() at netbsd:syslwp_park60+0x5d [ 3924.5110189] syscall() at netbsd:syscall+0x299 [ 3924.5610351] --- syscall (number 478) --- [ 3924.6110531] 7adcb44b035a: [ 3924.6410624] cpu7: End traceback... More logs at: http://www.gson.org/netbsd/bugs/build/amd64-baremetal/commits-2019.12.html#2019.12.07.14.55.58 Could everyone please refrain from committing new kernel-crashing bugs until the test infrastructure has recovered from the previous round? -- Andreas Gustafsson, g...@netbsd.org
Re: Current test failures
Martin Husemann wrote: > Here is a simple recipe to reproduce the massive test lossage in -current: > > cd /usr/tests/dev/raidframe && atf-run I have now bisected it down to the following commits: 2019.12.05.03.21.08 riastradh src/sys/kern/subr_percpu.c 1.20 2019.12.05.03.21.17 riastradh src/sys/kern/subr_pserialize.c 1.16 2019.12.05.03.21.29 riastradh src/sys/kern/subr_pserialize.c 1.17 2019.12.05.03.21.42 riastradh src/external/cddl/osnet/sys/sys/opentypes.h 1.5 -- Andreas Gustafsson, g...@netbsd.org
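For readers unfamiliar with the procedure, a bisection like this amounts to a binary search over the ordered list of source dates. The following is only a minimal sketch: the `is_bad` callback is a hypothetical stand-in for building and testing a given source date, it is not the actual testbed code, and it assumes a single transition from good to bad.

```python
from typing import Callable, Sequence


def bisect_first_bad(commits: Sequence[str],
                     is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in an ordered list of commits.

    Assumes commits[0] is known good, commits[-1] is known bad, and
    every commit after the first bad one is also bad.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]
```

Each call to `is_bad` stands for a full build and test run, so the number of runs grows only with the logarithm of the number of commits in the range.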
Testbed breakage
Hi all, For the last few days, most of the testbeds have been seeing the system under test either hang or panic before the ATF tests have run to completion. The failures are too many and varied to file a PR about each, but for a start, you can look for "tests: did not complete" in the following: http://releng.netbsd.org/b5reports/i386/commits-2019.12.html http://releng.netbsd.org/b5reports/amd64/commits-2019.12.html http://releng.netbsd.org/b5reports/evbarm-aarch64/commits-2019.12.html http://releng.netbsd.org/b5reports/pmax/commits-2019.12.html For sparc, there is PR 54734. Both qemu and gxemul based testbeds are failing, but my i386 and amd64 testbeds running on real hardware are not (other than the latest amd64 test run showing 1336 new test failures, which looks like an unrelated bug). That the failing hosts are uniprocessors and the working ones are multiprocessors may or may not be a coincidence. Please help find and fix the offending commit(s); until that is done, there can be very little automated testing of new commits. -- Andreas Gustafsson, g...@gson.org
Re: Increases in build system time
Steffen Nurpmeso wrote: > This thread reminds me of me turning off hyperthreading. > Using the four cores i have with HT turned on results in a 40 > percent time penalty compared to when its off. (For example, > compiling the Linux kernel 4.19.X takes almost exactly 10 minutes > when it is turned off, and about 14 minutes when it is turned > on. Just a thought.) FWIW, these tests were run with hyperthreading disabled. -- Andreas Gustafsson, g...@gson.org
Re: Increases in build system time
Mateusz Guzik wrote: > > http://www.gson.org/netbsd/bugs/system-time/fg.svg > > First thing which jumps at me is DIAGNOSTIC being on (seen with e.g., > _vstate_assert). Did your older kernels have it? If you just compiled > GENERIC from release branches it is presumably removed, so would be > nice to retest without it. All the versions tested were built from the CVS trunk, and all used the GENERIC kernel. The only thing from a release branch was the build target (8.1), which was the same in all test runs. > That said, can you rerun without DIGANOSTIC but with lockstat? I'd rather leave that to someone else, and to a separate thread. All the test results presented in this thread were produced with the same options so that they can be meaningfully compared, and running new tests with different options would only confuse things. -- Andreas Gustafsson, g...@gson.org
Re: Increases in build system time
Jaromír Doleček wrote: > I wonder also if we could try enabling vm.ubc_direct on the build machine? Using 2019.11.14.13.58.22 sources: with default settings: 4612.56 real 16896.10 user 9325.87 sys with vm.ubc_direct = 1: 4615.95 real 16819.96 user 9416.13 sys -- Andreas Gustafsson, g...@gson.org
Re: Increases in build system time
Mateusz Guzik wrote: > Can you get a kernel-side flamegraph? Done, using sources from 2019.11.14.13.58.22: http://www.gson.org/netbsd/bugs/system-time/fg.svg -- Andreas Gustafsson, g...@gson.org
Re: Increases in build system time
Michael van Elst wrote: > g...@gson.org (Andreas Gustafsson) writes: > > >mitigations, which I guess is not really surprising. But the 12% net > >increase from jemalloc and the 7% increase from vfs_vnode.c 1.63 seem > >to call for closer investigation. > > Is this also reflected in real time? Only partly. With the jemalloc change, the system time increased by 938 seconds, but the real time only increased by 170 seconds, and the user time decreased by 89 seconds: 2019.03.08.20.34.24/build_8.log.gz: 4229.21 real 16686.03 user 7932.08 sys 2019.03.08.20.35.10/build_8.log.gz: 4398.88 real 16597.25 user 8870.49 sys With the vfs_vnode.c change, the system time increased by 305 seconds, but the real time only increased by 35 seconds: 2016.12.14.15.48.55/build_8.log.gz: 3934.44 real 15707.68 user 4243.02 sys 2016.12.14.15.49.35/build_8.log.gz: 3969.58 real 15718.85 user 4548.50 sys -- Andreas Gustafsson, g...@gson.org
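The differences quoted above can be recomputed from the time(1) lines themselves. This is a small sketch for that arithmetic only; the function names are made up for illustration, and the regular expression assumes the "real / user / sys" layout shown in the logs.

```python
import re

# Matches lines such as "4229.21 real 16686.03 user 7932.08 sys".
TIME_RE = re.compile(r"([\d.]+)\s+real\s+([\d.]+)\s+user\s+([\d.]+)\s+sys")


def parse_time(line: str) -> dict:
    """Extract real, user and sys seconds from a time(1) output line."""
    m = TIME_RE.search(line)
    if m is None:
        raise ValueError("no time(1) output found in: %r" % line)
    real, user, sys_ = (float(x) for x in m.groups())
    return {"real": real, "user": user, "sys": sys_}


def delta(before: str, after: str) -> dict:
    """Per-field difference (after minus before), rounded to 0.01 s."""
    b, a = parse_time(before), parse_time(after)
    return {k: round(a[k] - b[k], 2) for k in b}


if __name__ == "__main__":
    print(delta("4229.21 real 16686.03 user 7932.08 sys",
                "4398.88 real 16597.25 user 8870.49 sys"))
```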
Increases in build system time
Hi all, Back in September, I wrote: > I'm trying to run a bisection to determine why builds hosted on recent > versions of NetBSD seem to be taking significantly more system time > than they used to, building the same thing. I finally have some results to report. These are from builds of the NetBSD-8/amd64 release hosted on various versions of -current/amd64, on a HP DL360 G7 with dual Xeon L5630 CPUs (8 cores in all). The amount of system time taken by each build was measured using time(1). Between a -current from September 2016 and one from October 2019, the system time more than doubled, from 4245 seconds to 9344 seconds. The time(1) output from the oldest and newest version was: 3930.86 real 15737.04 user 4245.26 sys 4461.47 real 16687.37 user 9344.68 sys This means that on the recent -current, on average, roughly four of the eight cores were executing the build tools (compilers, etc), roughly two were executing the kernel, and the remaining two were presumably idle. The increase did not happen all at once but in several smaller steps as shown in this graph: http://www.gson.org/netbsd/bugs/system-time/graph.png For each step, finding the commits that caused it required a separate bisection. Each bisection took 1-2 days to run, so I have only bisected the largest steps, those of 5 percent or more. They are listed below in order from largest to smallest, with CVS revisions and commit messages. 38% increase: 2018.04.04.12.59.49 maxv src/sys/arch/amd64/amd64/machdep.c 1.303 2018.04.04.12.59.49 maxv src/sys/arch/x86/include/cpu.h 1.91 2018.04.04.12.59.49 maxv src/sys/arch/x86/x86/cpu.c 1.154 2018.04.04.12.59.49 maxv src/sys/arch/x86/x86/spectre.c 1.8 Enable the SpectreV2 mitigation by default at boot time. 12% increase: 2019.03.08.20.35.10 christos src/share/mk/bsd.own.mk 1.1108 Back to using jemalloc for x86_64; all problems have been resolved. 9% increase: 2018.02.26.05.52.50 maxv src/sys/arch/amd64/conf/GENERIC 1.485 Enable SVS by default. 
7% increase: 2016.12.14.15.49.35 hannken src/sys/kern/vfs_vnode.c 1.63 Change the freelists to lrulists, all vnodes are always on one of the lists. Speeds up namei on cached vnodes by ~3 percent. Merge "vrele_thread" into "vdrain_thread" so we have one thread working on the lrulists. Adapt vfs_drainvnodes() to always wait for a complete cycle of vdrain_thread(). 5% increase: 2018.04.07.22.39.31 christos src/external/Makefile 1.21 2018.04.07.22.39.31 christos src/external/README 1.16 [302 more revisions by christos elided] 2018.04.07.22.39.53 christos src/external/bsd/Makefile 1.59 2018.04.07.22.41.55 christos src/doc/3RDPARTY 1.1515 2018.04.07.22.41.55 christos src/doc/CHANGES 1.2376 2018.04.08.00.52.38 mrg src/sys/arch/amd64/conf/ALL 1.85 2018.04.08.00.52.38 mrg src/sys/arch/amd64/conf/GENERIC 1.489 2018.04.08.00.52.38 mrg src/sys/arch/i386/conf/ALL 1.437 2018.04.08.00.52.38 mrg src/sys/arch/i386/conf/GENERIC 1.1177 2018.04.08.01.30.01 christos src/external/mpl/Makefile 1.1 [Too many commit messages to list here, but the following from mrg's commit of src/sys/arch/amd64/conf/GENERIC 1.489 may be relevant] turn on GCC spectre v2 mitigation options. 5% increase: 2019.03.10.15.32.42 christos src/external/bsd/jemalloc/lib/Makefile.inc 1.5 turn on debugging to help find problems 5% decrease: 2019.07.23.06.31.20 martin src/external/bsd/jemalloc/lib/Makefile.inc 1.10 Disable JEMALLOC_DEBUG, it served us well, but now we want performance back. Discussed with christos. To summarize, most of the increase was due to Spectre and Meltdown mitigations, which I guess is not really surprising. But the 12% net increase from jemalloc and the 7% increase from vfs_vnode.c 1.63 seem to call for closer investigation. -- Andreas Gustafsson, g...@gson.org
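The estimate of how the eight cores were occupied follows from dividing the user and system times by the real time. The sketch below just redoes that arithmetic for the newest build; the function name is made up for illustration, and treating the remainder as idle time is the same simplifying assumption made in the message.

```python
def core_usage(real: float, user: float, sys_: float, ncpu: int):
    """Average number of cores spent in user code, in the kernel,
    and left over (presumed idle) during a build."""
    user_cores = user / real
    sys_cores = sys_ / real
    idle_cores = ncpu - user_cores - sys_cores
    return user_cores, sys_cores, idle_cores


if __name__ == "__main__":
    # Newest -current build from the message: 8 cores in all.
    u, s, i = core_usage(real=4461.47, user=16687.37, sys_=9344.68, ncpu=8)
    print("user %.1f, sys %.1f, idle %.1f cores" % (u, s, i))
```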
Re: vm.ubc_direct
Patrick Welche wrote: > I have been running with vm.ubc_direct=1 and feeling a speedup and no > inconveniences on multicore systems. What are thoughts on having it as > a default? No such option is documented, hence the question makes no sense. -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
Robert Elz wrote: > There's no info in the log (the available log on the website) to allow > anyone to work out what happened there (why is make debug enabled? The > log that is there is filled with make debug noise) I don't know the answer to this, but the question is known as PR 53561. -- Andreas Gustafsson, g...@gson.org
Re: Xen kernel diagnostic assertion "powerof2(align)" failed
Manuel Bouyer wrote: > Hello, > the xen test run for the 201911010850Z build fails with: > [ 1.1185040] panic: kernel diagnostic assertion "powerof2(align)" failed: > file "/usr/src/sys/uvm/uvm_map.c", line 196 [...] > Has it been fixed since then ? Yes, by: commit 2019.11.01.13.04.22 rin src/sys/uvm/uvm_map.c 1.366 -- Andreas Gustafsson, g...@gson.org
Re: time(1) reporting corrupted system time
Mateusz Guzik wrote: > Hi, I failed to find a follow up to this. > > I see someone gave the you the fix for corrupted time accounting. > Did you get around to finding the offending commit? For the corrupted system time, I believe the offending commit was kern_resource.c 1.180, and the fix was 1.182. As for the increased system time taken by release builds, it has happened in multiple steps. I have bisected the largest increases, but analyzing and writing up the results for current-users is still on my "to do" list. -- Andreas Gustafsson, g...@gson.org
-current panics on boot in atabus_alloc_drives()
Christos, Both i386 and amd64 are failing to install on the testbed due to the install kernel panicking on boot. The amd64 console log contains a backtrace: [ 1.0264751] panic: lock error: Mutex: mutex_vector_exit,742: exiting unheld spin mutex: lock 0xc83161f0 cpu 0 lwp 0xc217e7c90540 [ 1.0264751] cpu0: Begin traceback... [ 1.2751115] vpanic() at netbsd:vpanic+0x160 [ 1.2751115] snprintf() at netbsd:snprintf [ 1.3032327] lockdebug_abort() at netbsd:lockdebug_abort+0xee [ 1.3032327] mutex_vector_exit() at netbsd:mutex_vector_exit+0xbd [ 1.3032327] atabus_alloc_drives() at netbsd:atabus_alloc_drives+0x61 [ 1.3231953] wdc_drvprobe() at netbsd:wdc_drvprobe+0x40 [ 1.3432056] atabusconfig() at netbsd:atabusconfig+0x65 [ 1.3432056] atabus_thread() at netbsd:atabus_thread+0x7e [ 1.3432056] cpu0: End traceback... This is from: http://releng.netbsd.org/b5reports/amd64/2019/2019.10.21.19.00.11/install.log On i386, the only kernel commits between the last success and the failure were: 2019.10.21.18.37.47 christos src/sys/dev/ata/ata.c 1.152 2019.10.21.18.58.57 christos src/sys/dev/ata/ata.c 1.153 2019.10.21.19.00.11 christos src/sys/dev/pci/satalink.c 1.57 -- Andreas Gustafsson, g...@gson.org
Re: time(1) reporting corrupted system time
Michael van Elst wrote: > First there was a change to precent that user/system time are decreasing > in kern-resource.c 1.180. But it shouldn't be related to negative numbers. > > Additionally a possible underflow of user/system time was fixed in > kern_resource.c 1.182. This prevents negative numbers, but IIRC this > would only happen for very small values, not when values already > accumulated to a few thousand seconds. Thanks. I think what happened is that 1.180 caused the bug, and 1.182 fixed it. In any case, I think I now have what I need to back-port the fix and bisect the other bug. -- Andreas Gustafsson, g...@gson.org
time(1) reporting corrupted system time
Hi all, I'm trying to run a bisection to determine why builds hosted on recent versions of NetBSD seem to be taking significantly more system time than they used to, building the same thing. My efforts are hampered by time(1) reporting corrupted system times on certain past versions of -current: 2017.01.01.03.06.06/build_8.log: 3562.32 real 15806.10 user 4893.62 sys 2018.05.21.10.28.13/build_8.log: 4250.22 real 16835.23 user 608742554440425.55 sys 2019.01.30.20.20.36/build_8.log: 4228.25 real 16801.48 user 700976274808841.24 sys 2019.09.27.08.57.12/build_8.log: 4488.49 real 16670.79 user 9279.25 sys Does anyone happen to know which commits caused and/or fixed this? This information could save me a couple of days of bisection run time. -- Andreas Gustafsson, g...@gson.org
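One way to spot the corrupted measurements automatically is a plausibility check: the CPU time consumed cannot exceed the real time multiplied by the number of CPUs. The sketch below assumes an 8-CPU build host, which is an assumption of the example rather than something stated in this message, and the function name is made up for illustration.

```python
def is_plausible(real: float, user: float, sys_: float, ncpu: int = 8) -> bool:
    """Return True if the user and system times could have been
    accumulated within `real` seconds on `ncpu` CPUs."""
    if user < 0 or sys_ < 0:
        return False
    return user + sys_ <= real * ncpu


if __name__ == "__main__":
    samples = {
        "2017.01.01.03.06.06": (3562.32, 15806.10, 4893.62),
        "2018.05.21.10.28.13": (4250.22, 16835.23, 608742554440425.55),
        "2019.01.30.20.20.36": (4228.25, 16801.48, 700976274808841.24),
        "2019.09.27.08.57.12": (4488.49, 16670.79, 9279.25),
    }
    for date, (real, user, sys_) in samples.items():
        print(date, "ok" if is_plausible(real, user, sys_) else "corrupted")
```

Builds flagged this way can then be excluded from the system-time comparison or rerun with a fixed kernel.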
Re: [Small Heads up] USE_SHLIBDIR=yes added to some library Makefiles
Brad Spencer wrote: > I committed a change today to add USE_SHLIBDIR=yes to the libraries used > by /sbin/{zfs,mount_zfs,zpool}. The general effect will be to move the > libraries from /usr/lib to /lib and put compatibility links in place so > that things, say in /usr/pkg, continue to work as expected. Run tested > on amd64 and i386 and compile tested on evbarm. This will allow /usr > and /var to be mounted as a ZFS legacy filesystem and keeping with the > apparent pattern of having items in /sbin depend on items in /lib and > not /usr/lib. > > Sorry if this breaks anything. Looks like it broke MKDEBUG=yes builds: === 8 extra files in DESTDIR = Files in DESTDIR but missing from flist. File is obsolete or flist is out of date ? -- ./usr/libdata/debug/lib/libavl.so.0.0.debug ./usr/libdata/debug/lib/libnvpair.so.0.0.debug ./usr/libdata/debug/lib/libpthread.so.1.4.debug ./usr/libdata/debug/lib/libumem.so.0.0.debug ./usr/libdata/debug/lib/libuutil.so.0.0.debug ./usr/libdata/debug/lib/libzfs.so.0.0.debug ./usr/libdata/debug/lib/libzfs_core.so.0.0.debug ./usr/libdata/debug/lib/libzpool.so.0.0.debug = end of 8 extra files === *** [checkflist] Error code 1 -- Andreas Gustafsson, g...@gson.org
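Conceptually, the check that failed here is a comparison between two sets of paths: the files present in DESTDIR and the files named in the set lists. The following is only an illustrative sketch of that comparison, not the actual checkflist script; the function name and the sample paths used with it are assumptions for the example.

```python
from typing import Iterable, List, Tuple


def compare_flist(destdir_files: Iterable[str],
                  flist: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (extra, missing): files in DESTDIR but not in the flist,
    and files in the flist but not in DESTDIR."""
    dest, listed = set(destdir_files), set(flist)
    extra = sorted(dest - listed)
    missing = sorted(listed - dest)
    return extra, missing


if __name__ == "__main__":
    extra, missing = compare_flist(
        destdir_files=["./lib/libzfs.so.0.0",
                       "./usr/libdata/debug/lib/libzfs.so.0.0.debug"],
        flist=["./lib/libzfs.so.0.0"],
    )
    print("extra:", extra)
    print("missing:", missing)
```

A non-empty "extra" list corresponds to the "extra files in DESTDIR" error quoted above, which typically means the set lists need new entries for the relocated debug files.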
Re: Automated report: NetBSD-current/i386 build failure
The build is still failing as of source date 2019.09.16.04.59.32, now with a different error: == 2 missing files in DESTDIR Files in flist but missing from DESTDIR. File wasn't installed ? -- ./usr/share/man/html8/mount_zfs.html ./usr/share/man/man8/mount_zfs.8 end of 2 missing files == -- Andreas Gustafsson, g...@gson.org
Re: Automated report: NetBSD-current/i386 build failure
The build is still failing as of source date 2019.09.08.11.53.23, with: /tmp/bracket/build/2019.09.08.11.53.23-i386/src/sys/dev/usb/xhci.c: In function 'xhci_address_device': /tmp/bracket/build/2019.09.08.11.53.23-i386/src/sys/dev/usb/xhci.c:2893:26: error: suggest braces around empty body in an 'else' statement [-Werror=empty-body] icp, slot_id, 0, 0); -- Andreas Gustafsson, g...@gson.org