Re: panic: assertion "!cpu_softintr_p()" failed

2023-10-02 Thread Andrew Doran
Hi, On Sun, Oct 01, 2023 at 10:12:47AM +0200, Thomas Klausner wrote: > panic: kernel diagnostic assertion "!cpu_softintr_p()" failed: file > "/usr/src/sys/kern/subr_kmem.c", line 451 Sorry about that. Should be fixed by: /cvsroot/src/sys/kern/kern_mutex_obj.c,v <-- kern_mutex_obj.c new

Re: Growth in pool usage between netbsd-9 and -10?

2023-09-09 Thread Andrew Doran
On Fri, Sep 08, 2023 at 12:27:57PM +1000, Paul Ripke wrote: > I need to read more code to see when the pools decide to release idle > pages - because this is remarkably wasteful considering my machine is > also paging, and "only" has 16GiB RAM. > > Memory resource pool statistics > Name

Re: kernel diagnostic assertion "c->c_cpu->cc_lwp == curlwp || c->c_cpu->cc_active != c" failed

2020-06-14 Thread Andrew Doran
Hi, On Fri, Jun 12, 2020 at 11:17:30PM +0200, Thomas Klausner wrote: > With a 9.99.63/amd64 kernel from May 19 I saw a panic: > > Jun 7 01:01:01 yt savecore: reboot after panic: [ 396809.5836453] panic: > kernel diagnostic assertion "c->c_cpu->cc_lwp == curlwp || > c->c_cpu->cc_active != c"

Re: cmake hanging

2020-06-10 Thread Andrew Doran
On Mon, Jun 08, 2020 at 03:38:44PM +0100, Chavdar Ivanov wrote: > On Sun, 7 Jun 2020 at 10:25, Chavdar Ivanov wrote: > > > > Hi, > > > > I just had another one, rebuilding gimp, running gegl. Again gdb -p > > ... ; quit sorted it out. > > > > On Sat, 6 Jun 2020 at 20:36, Chavdar Ivanov wrote: >

Re: cmake hanging

2020-06-10 Thread Andrew Doran
On Wed, Jun 10, 2020 at 01:30:22AM +0200, Joerg Sonnenberger wrote: > On Tue, Jun 09, 2020 at 11:22:27PM +0000, Andrew Doran wrote: > > On Sat, Jun 06, 2020 at 09:25:55PM +0200, Joerg Sonnenberger wrote: > > > > > On Sat, Jun 06, 2020 at 06:45:03PM +0100, Chavdar Ivanov

Re: cmake hanging

2020-06-09 Thread Andrew Doran
On Sat, Jun 06, 2020 at 09:25:55PM +0200, Joerg Sonnenberger wrote: > On Sat, Jun 06, 2020 at 06:45:03PM +0100, Chavdar Ivanov wrote: > > On Sat, 6 Jun 2020 at 18:43, Chavdar Ivanov wrote: > > > > > > Hi, > > > > > > I got another cmake hang during pkg_rolling-replace today, while > > > building

Re: Automated report: NetBSD-current/i386 build failure

2020-05-21 Thread Andrew Doran
On Fri, May 22, 2020 at 12:07:52AM +, NetBSD Test Fixture wrote: > This is an automatically generated notice of a NetBSD-current/i386 > build failure. > > The failure occurred on babylon5.netbsd.org, a NetBSD/amd64 host, > using sources from CVS date 2020.05.21.21.12.31. ... > ---

Re: NFS swap on current appears to deadlock

2020-05-18 Thread Andrew Doran
Hi, Finally got around to trying this. Having beaten on it for a while with real hardware I don't see any problem with swapping over NFS on 9.99.63. On Sat, May 02, 2020 at 12:06:48PM +1000, Paul Ripke wrote: > I have a qemu guest for experimenting with -current, 1 CPU & 64MiB RAM. 64 megs,

Re: lang/rust build fails

2020-05-15 Thread Andrew Doran
On Thu, May 14, 2020 at 11:53:04AM -0500, Robert Nestor wrote: > Ran into an interesting problem trying to build lang/rust from both -current > and 2020Q1 pkgsrc. On a NetBSD installation of 9.99.45 kernel and user land, > the builds succeed. Under 9.99.60 kernel and user land the builds fail.

Re: Panic: vrelel: bad ref count (9.99.54)

2020-05-05 Thread Andrew Doran
On Mon, May 04, 2020 at 03:54:57PM +0200, Leonardo Taccari wrote: > Hello Yorick and Andrew, > > Yorick Hardy writes: > > > > > [...] > > > > > > > > > > Crash version 9.99.55, image version 9.99.55. > > > > > crash: _kvm_kvatop(0) > > > > > Kernel compiled without options LOCKDEBUG. > > >

Re: NFS swap on current appears to deadlock

2020-05-04 Thread Andrew Doran
Hi Paul, On Sat, May 02, 2020 at 12:06:48PM +1000, Paul Ripke wrote: > I have a qemu guest for experimenting with -current, 1 CPU & 64MiB RAM. > I gave it an NFS swap space to cope with a few small builds, and it now > locks up hard after touching that swap device. > > From ddb, stacks are

Re: firefox build broken

2020-05-04 Thread Andrew Doran
Hi, On Tue, Apr 28, 2020 at 12:36:04PM +0200, Thomas Klausner wrote: > It seems to me some recent change broke the firefox build. > > I've built all packages from scratch on 9.99.59/amd64 from 20200426. > > Firefox consistently fails with > > stack backtrace: >0: 0x490088e2 - >

Heads up: ubc_direct enabled by default

2020-04-23 Thread Andrew Doran
Hi, This affects amd64, alpha and aarch64, but only 1 and 2 CPU systems so far. Any more and it's still off by default. Only the default has changed so the sysctl (vm.ubc_direct) still works for turning it on and off manually. This works great for me on amd64 but needs some tweaks to handle
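As a rough illustration of toggling the knob mentioned above by hand: the sketch below only prints the commands rather than running them. The sysctl name vm.ubc_direct comes from the message; the values 1 (on) and 0 (off) and the helper name ubc_direct_cmd are assumptions for illustration.

```shell
#!/bin/sh
# Print (do not run) the sysctl command for toggling vm.ubc_direct.
# Assumption: the knob takes 1 for "on" and 0 for "off".
# ubc_direct_cmd is a hypothetical helper, not part of NetBSD.
ubc_direct_cmd() {
    case "$1" in
        on)  echo "sysctl -w vm.ubc_direct=1" ;;
        off) echo "sysctl -w vm.ubc_direct=0" ;;
        *)   echo "usage: ubc_direct_cmd on|off" >&2; return 1 ;;
    esac
}

ubc_direct_cmd on
ubc_direct_cmd off
```

Running the printed command needs root; adding the setting to /etc/sysctl.conf would presumably make it persist across reboots.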

Re: panic: LOCKDEBUG: Mutex error: mutex_vector_enter,514: spin lock held

2020-04-22 Thread Andrew Doran
Hi Paul, On Wed, Apr 22, 2020 at 12:06:41PM +1000, Paul Ripke wrote: > On -current as of ~yesterday, in a 1CPU amd64 qemu boot, I'm seeing: > > Waiting for duplicate address detection to finish... > Starting dhcpcd. > dhcpcd-9.0.1 starting > unknown option: > [ 17.0102686] wm0: link state UP

Re: Automated report: NetBSD-current/i386 build failure

2020-04-19 Thread Andrew Doran
Doesn't show up in Opengrok, maybe it dislikes rump. Already fixed. Andrew On Sun, Apr 19, 2020 at 09:56:55PM +, NetBSD Test Fixture wrote: > This is an automatically generated notice of a NetBSD-current/i386 > build failure. > > The failure occurred on babylon5.netbsd.org, a NetBSD/amd64

Re: Panic: vrelel: bad ref count (9.99.54)

2020-04-19 Thread Andrew Doran
Hi Yorick. On Sat, Apr 18, 2020 at 11:00:02AM +0200, Yorick Hardy wrote: > > I just had the same panic with 9.99.55: > > > > Crash version 9.99.55, image version 9.99.55. > > crash: _kvm_kvatop(0) > > Kernel compiled without options LOCKDEBUG. > > System panicked: vrelel: bad ref count

Re: Panic: vrelel: bad ref count (9.99.54)

2020-04-07 Thread Andrew Doran
Hi Yorick. On Mon, Apr 06, 2020 at 11:16:37PM +0200, Yorick Hardy wrote: >Crash version 9.99.54, image version 9.99.54. >crash: _kvm_kvatop(0) >Kernel compiled without options LOCKDEBUG. >System panicked: vrelel: bad ref count >Backtrace from time of crash is available. >

Re: Build time measurements

2020-04-05 Thread Andrew Doran
Hi Andreas, On Fri, Mar 27, 2020 at 10:39:44AM +0200, Andreas Gustafsson wrote: > On Wednesday, I said: > > I will rerun the 24-core tests with these disabled for comparison. > > Done. To recap, with a stock GENERIC kernel, the numbers were: > > 2016.09.06.06.27.17 3321.55 real 9853.49

Re: Automated report: NetBSD-current/i386 build failure

2020-03-26 Thread Andrew Doran
Fixed with 1.18 src/sys/rump/librump/rumpkern/sleepq.c Apologies, forgot to commit earlier. Andrew On Thu, Mar 26, 2020 at 10:36:27PM +, NetBSD Test Fixture wrote: > This is an automatically generated notice of a NetBSD-current/i386 > build failure. > > The failure occurred on

Re: locking error using today's sources

2020-03-26 Thread Andrew Doran
Fixed as of src/sys/kern/kern_lwp.c 1.231. Andrew On Mon, Mar 23, 2020 at 01:00:11AM +, Andrew Doran wrote: > Hi, > > I looked into this, it's quite an old bug and you were unlucky to run into > it, there's a very small window of opportunity for it to occur. I'll see

Re: Build time measurements

2020-03-26 Thread Andrew Doran
On Wed, Mar 25, 2020 at 09:44:19PM +, Mike Pumford wrote: > On 24/03/2020 21:47, Andrew Doran wrote: > > DIAGNOSTIC and acpicpu are disabled in all kernels but they are otherwise > > GENERIC. The 2020-04-?? kernel is HEAD plus the remaining changes from the > >

Re: Build time measurements

2020-03-24 Thread Andrew Doran
Hi Andreas. On Mon, Mar 23, 2020 at 04:11:17PM +0200, Andreas Gustafsson wrote: > In September and November, I reported some measurements of the amount > of system time it takes to build a NetBSD-8/amd64 release on different > versions of -current/amd64. I have now repeated the measurements

Re: 9.99.51 crash: kernel diagnostic assertion "ncp->nc_vp == vp" failed

2020-03-23 Thread Andrew Doran
On Mon, Mar 23, 2020 at 08:27:15AM +0100, Thomas Klausner wrote: > I've updated to last night's 9.99.51, started a bulk build and went to > sleep. It lasted about an hour or two, then it panicked with: > > kernel diagnostic assertion "ncp->nc_vp == vp" failed: file >

Re: locking error using today's sources

2020-03-22 Thread Andrew Doran
Hi, I looked into this, it's quite an old bug and you were unlucky to run into it, there's a very small window of opportunity for it to occur. I'll see about fixing it. Thanks, Andrew On Thu, Mar 19, 2020 at 02:51:05PM -0700, David Hopper wrote: > I just got this using today's kernel+userland

Re: Heads up: UVM changes

2020-03-22 Thread Andrew Doran
On Sun, Mar 22, 2020 at 12:40:00PM -0700, Jason Thorpe wrote: > > On Mar 22, 2020, at 12:34 PM, Andrew Doran wrote: > > > > This works well for me on amd64, but I haven't tried it on other machines. > > From code inspection I can see that some archite

Heads up: UVM changes

2020-03-22 Thread Andrew Doran
Hi, I changed UVM to allow for concurrent page faults on shared objects. Previously this was single threaded due to locking, which caused a lot of contention over busy objects like libc.so or PostgreSQL's shared buffer for example. This works well for me on amd64, but I haven't tried it on

Re: Automated report: NetBSD-current/i386 build failure

2020-03-22 Thread Andrew Doran
Fixed with src/sys/kern/vfs_vnode.c 1.115. Andrew On Sun, Mar 22, 2020 at 04:23:47PM +, NetBSD Test Fixture wrote: > This is an automatically generated notice of a NetBSD-current/i386 > build failure. > > The failure occurred on babylon5.netbsd.org, a NetBSD/amd64 host, > using sources

Re: Another pmap panic

2020-03-20 Thread Andrew Doran
Hi, I meant to send a note yesterday but fatigue got the better of me. I suggest updating to the latest, delivered yesterday, which has fixes for every problem I have encountered or seen mentioned including this one, and survives low memory stress testing for me: /* $NetBSD: pmap.c,v

Re: pmap panic

2020-03-17 Thread Andrew Doran
Ok. I think the problems here should be fixed. Andrew On Sun, Mar 15, 2020 at 04:16:20PM +, Andrew Doran wrote: > Hi, > > Thanks for the reports. This and the NVMM related panics should be fixed > now, with: 1.369 src/sys/arch/x86/x86/pmap.c > > I don't have a machine

Re: current: completely stuck after four minutes of uptime

2020-03-16 Thread Andrew Doran
On Mon, Mar 16, 2020 at 11:14:38AM +0100, Lars Reichardt wrote: > > On 2020-03-16 10:45, Thomas Klausner wrote: > > On Sun, Mar 15, 2020 at 11:29:15AM +0100, Thomas Klausner wrote: > > > I've just upgraded my 9.99.49 kernel from March 12 to today's from an > > > hour ago. > > > > > > After

Re: pmap panic

2020-03-15 Thread Andrew Doran
Hi, Thanks for the reports. This and the NVMM related panics should be fixed now, with: 1.369 src/sys/arch/x86/x86/pmap.c I don't have a machine capable of running X11 on NetBSD at the moment so I will spin up qemu or VirtualBox or something to try that out now. Apologies for the disruption.

Re: change within last day broke nvmm

2020-03-15 Thread Andrew Doran
On Sun, Mar 15, 2020 at 02:38:19PM +0100, Tobias Nygren wrote: > This is consistently reproducible while trying to boot Linux on nvmm. > > panic: LIST_INSERT_HEAD 0x88713368 x86/pmap.c:2135 > vpanic() > panic() > pmap_enter_pv() > pmap_ept_enter() > uvm_fault_lower_enter() >

Re: Automated report: NetBSD-current/i386 build failure

2020-03-14 Thread Andrew Doran
Should be fixed with 1.91 src/sys/miscfs/genfs/genfs_io.c. Andrew On Sat, Mar 14, 2020 at 07:46:02PM +, NetBSD Test Fixture wrote: > This is an automatically generated notice of a NetBSD-current/i386 > build failure. > > The failure occurred on babylon5.netbsd.org, a NetBSD/amd64 host, >

Re: Weird qemu-nvmm problem

2020-03-12 Thread Andrew Doran
On Wed, Mar 11, 2020 at 06:45:06PM +0100, Maxime Villard wrote: > Please CC me for issues related to NVMM, there is a number of lists where > I'm not subscribed. > > My understanding is that this commit is the cause (CC ad@): > >

Re: diagnostic assertion curcpu()->ci_biglock_count == 0 failed

2020-03-08 Thread Andrew Doran
Hi, On Sun, Feb 23, 2020 at 10:30:24AM +0100, Thomas Klausner wrote: > With a 9.99.47/amd64 kernel from February 16, I just had panic (handcopied): > > panic: kernel diagnostic assertion "curcpu()->ci_biglock_count == 0" failed: > file .../kern_exit.c line 214: kernel_lock leaked > cpu12:

Re: panic: softint screwup

2020-03-07 Thread Andrew Doran
Has anyone observed this again in the last couple of weeks? Assuming it's fixed now. Thanks, Andrew On Sun, Feb 09, 2020 at 05:05:14PM +0100, Thomas Klausner wrote: > I just had a panic in 9.99.46/amd64: > > Feb 9 16:27:54 yt savecore: reboot after panic: [ 14300.7407347] panic: > softint

Re: assertion "pve == NULL" failed

2020-03-04 Thread Andrew Doran
On Wed, Mar 04, 2020 at 11:41:33AM +, Patrick Welche wrote: > Netbooting (uefi) a -current/amd64 kernel from yesterday morning, > with a serial console, I just had > > [ 131.6333574] panic: kernel diagnostic assertion "pve == NULL" failed: file > "/ > [ 131.7633964] cpu8: Begin

Re: Panic on aarch64

2020-03-03 Thread Andrew Doran
On Tue, Mar 03, 2020 at 10:03:38PM +, Robert Swindells wrote: > > I just got this: > > panic: pr_phinpage_check: [vcachepl] item 0x54d19880 not part of pool > cpu0: Begin traceback... > trace fp ffc0405efc90 > fp ffc0405efcb0 vpanic() at ffc000240880 netbsd:vpanic+0x160 >

Re: benchmark results on ryzen 3950x with netbsd-9, -current, and -current (no DIAGNOSTIC)

2020-03-03 Thread Andrew Doran
Hi. On Tue, Mar 03, 2020 at 08:25:25PM +1100, matthew green wrote: > here are a few build benchmark tests on an amd ryzen 3950x > system, to see the cumulative effect of the various fixes we've > seen since netbsd-9, for this 16 core/ 32 thread CPU, 64GB of > ram, separate nvme ssd for src &

Re: Regressions

2020-03-01 Thread Andrew Doran
On Sun, Mar 01, 2020 at 03:26:12PM +0200, Andreas Gustafsson wrote: > NetBSD-current is again suffering from a number of regressions. The > last time the ATF tests showed zero unexpected failures on real amd64 > hardware was on Dec 12, and the sparc, sparc64, pmax, and hpcmips > tests have all

Re: Failures in x86 pmap

2020-02-24 Thread Andrew Doran
On Mon, Feb 24, 2020 at 01:22:15PM +, Patrick Welche wrote: > On Sun, Feb 23, 2020 at 06:59:50PM +0000, Andrew Doran wrote: > > I think I found the problem, which has existed since ~8PM GMT yesterday. > > Hopefully fixed by revision 1.17 of src/sys/arch/x86/x86/x86_tlb.c. >

Re: Failures in x86 pmap

2020-02-23 Thread Andrew Doran
I think I found the problem, which has existed since ~8PM GMT yesterday. Hopefully fixed by revision 1.17 of src/sys/arch/x86/x86/x86_tlb.c. Andrew On Sun, Feb 23, 2020 at 06:29:38PM +, Andrew Doran wrote: > Having gotten a report of this privately I've now started running into it.

Failures in x86 pmap

2020-02-23 Thread Andrew Doran
Having gotten a report of this privately I've now started running into it. Has anyone else seen this, and if so any idea when it started happening? I wonder if there is a memory corruption or TLB coherency issue. Andrew hanging here: db{0}> bt pmap_pp_remove() at netbsd:pmap_pp_remove+0x46c

Re: diagnostic assertion curcpu()->ci_biglock_count == 0 failed

2020-02-23 Thread Andrew Doran
Hi Thomas. On Sun, Feb 23, 2020 at 10:30:24AM +0100, Thomas Klausner wrote: > With a 9.99.47/amd64 kernel from February 16, I just had panic (handcopied): > > panic: kernel diagnostic assertion "curcpu()->ci_biglock_count == 0" failed: > file .../kern_exit.c line 214: kernel_lock leaked >

Re: 9.99.47 panic: diagnostic assertion "lwp_locked(l, spc->spc_mutex)" failed: file ".../kern_synch.c", line 1001

2020-02-16 Thread Andrew Doran
Hi, On Sun, Feb 16, 2020 at 12:27:45PM +0100, Thomas Klausner wrote: > I just updated -current and quite soon had a panic: > cpu1: Begin traceback... > vpanic() > kern_assert > schend_lendpri > turnstile_block > rw_vector_enter > genfs_lock > layer_bypass > VOP_LOCK > vn_lock > layerfs_root >

Re: Automated report: NetBSD-current/i386 test failure

2020-01-29 Thread Andrew Doran
On Wed, Jan 29, 2020 at 02:45:22AM +, NetBSD Test Fixture wrote: > The newly failing test case is: > > lib/libpthread/t_detach:pthread_detach ... > 2020.01.27.20.50.05 ad src/lib/libpthread/pthread.c,v 1.157 Wrong error code from the kernel (ESRCH vs EINVAL) worked around with

Re: 9.99.40: panic: kernel diagnostic assertion "ci->ci_biglock_count == 0" failed

2020-01-27 Thread Andrew Doran
On Sun, Jan 26, 2020 at 09:09:47PM +, Andrew Doran wrote: > Hi Frank, > > On Sun, Jan 26, 2020 at 09:00:51PM +0100, Frank Kardel wrote: > > > While bulk building pkgsrc with 9.99.42 from Jan 25t I see > > > > panic:kernel diagnostic assertion "cur

Re: 9.99.40: panic: kernel diagnostic assertion "ci->ci_biglock_count == 0" failed

2020-01-26 Thread Andrew Doran
> That happens every couple of thousand packages - sorry no dump (locking > against myself as expected). Thanks for letting me know. I will make a change tomorrow to mitigate the panic, and allow the badly behaved code to be identified. Andrew > > Frank > > > > On 01/22/20

Re: assertion (pinned->l_flag & LW_RUNNING) != 0 failed

2020-01-25 Thread Andrew Doran
On Sat, Jan 25, 2020 at 09:13:49AM +, co...@sdf.org wrote: > -current is getting even less reliable at booting for me. > > I think the thing I'm doing unusual is having an Android phone connected > by USB. It's pretty chatty at boot compared to my other USB devices. > > It's hitting this

Re: 9.99.40: panic: kernel diagnostic assertion "ci->ci_biglock_count == 0" failed

2020-01-22 Thread Andrew Doran
On Tue, Jan 21, 2020 at 07:59:35PM +, Andrew Doran wrote: > Hi Thomas, > > On Tue, Jan 21, 2020 at 08:47:44PM +0100, Thomas Klausner wrote: > > > During a bulk build (in rust AFAICT), I got a panic with > > panic: kernel diagnostic assertion "ci->ci_

Re: 9.99.40: panic: kernel diagnostic assertion "ci->ci_biglock_count == 0" failed

2020-01-21 Thread Andrew Doran
Hi Thomas, On Tue, Jan 21, 2020 at 08:47:44PM +0100, Thomas Klausner wrote: > During a bulk build (in rust AFAICT), I got a panic with > panic: kernel diagnostic assertion "ci->ci_biglock_count == 0" failed: file > "/usr/src/sys/sys/userret.h", line 88 > > That's this one: > > static __inline

Re: CVS commit: src/sys [freeze on boot]

2020-01-20 Thread Andrew Doran
Fix committed with sys/kern/kern_rwlock.c rev 1.62. I didn't see the problem as I am running with LOCKDEBUG. Apologies for the disruption. Andrew

Re: File corruption?

2020-01-19 Thread Andrew Doran
On Sun, Jan 19, 2020 at 12:21:06PM -0600, Robert Nestor wrote: > Thanks! I followed Andrew's instructions and got a photo of the stack > trace and sent it to him directly. Hope it helps him figure out what's > happening. Thanks for the photo. This is a problem in the DRM code. It was fixed a

Re: File corruption?

2020-01-19 Thread Andrew Doran
Hi Robert, On Sun, Jan 19, 2020 at 10:42:37AM -0600, Robert Nestor wrote: > Sorry for not being specific. When I do the shutdown on a subsequent > reboot all the filesystems are dirty forcing fsck to run. Sometimes it > finds some minor errors and repairs them. > > I'm running xfce4, so when

Re: File corruption?

2020-01-19 Thread Andrew Doran
On Sun, Jan 19, 2020 at 05:17:25PM +, Andrew Doran wrote: > Hi Robert, > > On Sun, Jan 19, 2020 at 10:42:37AM -0600, Robert Nestor wrote: > > > Sorry for not being specific. When I do the shutdown on a subsequent > > reboot all the filesystems are dirty forcing

Re: 9.99.38 panic - i915drm + EFI ?

2020-01-17 Thread Andrew Doran
On Fri, Jan 17, 2020 at 08:35:13PM +0100, Kamil Rytarowski wrote: > On 17.01.2020 20:29, Andrew Doran wrote: > > Hi, > > > > On Fri, Jan 17, 2020 at 07:58:52PM +0100, Kamil Rytarowski wrote: > > > >> My system with i915 survived with these changes appli

Re: 9.99.38 panic - i915drm + EFI ?

2020-01-17 Thread Andrew Doran
Hi, On Fri, Jan 17, 2020 at 07:58:52PM +0100, Kamil Rytarowski wrote: > My system with i915 survived with these changes applied (credit > riastradh@ for hints): > > https://www.netbsd.org/~kamil/patch-00215-i915-dirty-pages.txt That change looks good to me. I didn't put the mutex acquire in

Re: 9.99.38 panic - i915drm + EFI ?

2020-01-16 Thread Andrew Doran
Hi, On Thu, Jan 16, 2020 at 06:14:40PM +, Chavdar Ivanov wrote: > Today's update brought 9.99.38, which fails to boot on my HP Envy 17 > laptop with Intel 530 graphics and NVidia GeForce; latter not used. The > system uses EFI boot and the panic happens the moment it has to switch > the

Re: Xen MP panics in cpu_switchto()

2020-01-13 Thread Andrew Doran
On Mon, Jan 13, 2020 at 09:17:28PM +0100, Manuel Bouyer wrote: > On Mon, Jan 13, 2020 at 07:11:21PM +0000, Andrew Doran wrote: > > On Mon, Jan 13, 2020 at 07:36:41PM +0100, Manuel Bouyer wrote: > > > > > On Mon, Jan 13, 2020 at 06:33:08PM +, Andrew Doran wrote: >

Re: Xen MP panics in cpu_switchto()

2020-01-13 Thread Andrew Doran
On Mon, Jan 13, 2020 at 07:36:41PM +0100, Manuel Bouyer wrote: > On Mon, Jan 13, 2020 at 06:33:08PM +0000, Andrew Doran wrote: > > On Mon, Jan 13, 2020 at 05:43:51PM +0100, Manuel Bouyer wrote: > > > > > On Mon, Jan 13, 2020 at 04:59:50PM +0100, Manuel Bouyer wrote:

Re: Xen MP panics in cpu_switchto()

2020-01-13 Thread Andrew Doran
On Mon, Jan 13, 2020 at 05:43:51PM +0100, Manuel Bouyer wrote: > On Mon, Jan 13, 2020 at 04:59:50PM +0100, Manuel Bouyer wrote: > > It also sets rsp and rbp. I think rbp is not set by anything else, at least > > in the Xen case. > > The different rbp value would explain why in one case we hit a

Re: Xen MP panics in cpu_switchto()

2020-01-13 Thread Andrew Doran
On Mon, Jan 13, 2020 at 03:16:05PM +0100, Manuel Bouyer wrote: > On Mon, Jan 13, 2020 at 12:02:13PM +0000, Andrew Doran wrote: > > Ah yes it does, I saw something that made me think it affected x86_64 only. > > I'll make the change on i386 too. > > thanks. > >

Re: Xen MP panics in cpu_switchto()

2020-01-13 Thread Andrew Doran
On Mon, Jan 13, 2020 at 12:56:22PM +0100, Manuel Bouyer wrote: > On Mon, Jan 13, 2020 at 11:42:17AM +0000, Andrew Doran wrote: > > Hi Manuel, > > > > On Mon, Jan 13, 2020 at 10:56:23AM +0100, Manuel Bouyer wrote: > > > Hello, > > > A current Xen domU kernel

Re: Xen MP panics in cpu_switchto()

2020-01-13 Thread Andrew Doran
Hi Manuel, On Mon, Jan 13, 2020 at 10:56:23AM +0100, Manuel Bouyer wrote: > Hello, > A current Xen domU kernel fails to boot with: > [ 1.000] hypervisor0 at mainbus0: Xen version 4.11.3nb1 > [ 1.000] vcpu0 at hypervisor0 > [ 1.000] vcpu0: Intel(R) Xeon(TM) CPU 3.00GHz, id 0xf64

Re: pmap lock changes: Xen panic

2020-01-07 Thread Andrew Doran
Manuel, On Tue, Jan 07, 2020 at 10:39:33AM +0100, Manuel Bouyer wrote: > Hello, > with 2020-01-05 00:40 UTC sources, Xen domUs panics because of what looks like > locking changes in the pmap code (full log at > http://www-soc.lip6.fr/~bouyer/NetBSD-tests/xen/HEAD/): > mlock error: Mutex:

Re: Automated report: NetBSD-current/i386 test failure

2020-01-02 Thread Andrew Doran
The remaining failures should be fixed by: 1.181 src/sys/rump/librump/rumpkern/vm.c Cheers, Andrew On Thu, Jan 02, 2020 at 01:26:42PM +, Andrew Doran wrote: > I think this is likely fixed already but will take a look now. > > Andrew > > On Thu, Jan 02, 2020 at 0

Re: Automated report: NetBSD-current/i386 test failure

2020-01-02 Thread Andrew Doran
I think this is likely fixed already but will take a look now. Andrew On Thu, Jan 02, 2020 at 08:35:09AM +, NetBSD Test Fixture wrote: > This is an automatically generated notice of new failures of the > NetBSD test suite. > > The newly failing test cases are: > >

Re: 9.99.32 panic

2020-01-01 Thread Andrew Doran
Hi, Missing commit to sys/uvm/uvm_amap.c. Fixed today. Thanks, Andrew On Wed, Jan 01, 2020 at 05:48:35PM +, Chavdar Ivanov wrote: > Hi, > > I get: > ... > #0 0x80224245 in cpu_reboot () > #1 0x807b9723 in db_reboot_cmd () > #2 0x807b9f3b in db_command () > #3

Re: odd panic

2019-12-26 Thread Andrew Doran
Hi Michael, On Thu, Dec 26, 2019 at 11:04:12AM -0800, Michael Cheponis wrote: > what does this mean? (Received last night on RPi 3B+ that has been h/w > stable): > > panic: kernel diagnostic assertion "uvmexp.swpgonly + npages <= > uvmexp.swpginuse" failed: file

Re: building kernel w/o (CPU_UCODE and COMPAT_60) fails

2019-12-21 Thread Andrew Doran
On Sat, Dec 21, 2019 at 12:46:12PM -, Michael van Elst wrote: > a...@netbsd.org (Andrew Doran) writes: > > >cvs rdiff -u -r1.88 -r1.89 src/sys/kern/kern_cpu.c > >cvs rdiff -u -r1.1 -r1.2 src/sys/kern/subr_cpu.c > > Still broken. The topology print r

Re: building kernel w/o (CPU_UCODE and COMPAT_60) fails

2019-12-21 Thread Andrew Doran
Hi, On Sat, Dec 21, 2019 at 12:30:25PM +0100, K. Schreiner wrote: > after the last changes to src/sys/kern/kern_cpu.c compiling > a custom kernel w/o "options CPU_UCODE" and "options COMPAT_60" > fails in kern_cpu.c: > > compile vNBx64/kern_cpu.o > /u/NetBSD/src/sys/kern/kern_cpu.c: In

Re: Automated report: NetBSD-current/i386 test failure

2019-12-20 Thread Andrew Doran
On Wed, Dec 18, 2019 at 01:27:32PM +, Andrew Doran wrote: > On Wed, Dec 18, 2019 at 08:25:15AM +, NetBSD Test Fixture wrote: > > > This is an automatically generated notice of new failures of the > > NetBSD test suite. > > > > The newly failing test

Re: kaybe lake panic

2019-12-20 Thread Andrew Doran
Hi, I had a quick look and this code is confusing. I could not see what lock it's trying to get. Presumably you have a netbsd.gdb in the build directory (seems to be the way now). Could you feed it to gdb and try: "info line *execlists_update_context+0x1234" where 0x1234 is the actual offset
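A minimal sketch of the lookup suggested above, assuming a netbsd.gdb in the current directory. The symbol name is taken from the message; the offset 0x1234 is the message's own placeholder, and the helper name gdb_info_line is hypothetical. The script only prints the gdb invocation so it can be checked before running.

```shell
#!/bin/sh
# Build the gdb invocation that maps symbol+offset to a source line.
# gdb_info_line is a hypothetical helper; the path and offset are placeholders.
gdb_info_line() {
    # $1 = path to netbsd.gdb, $2 = symbol, $3 = hex offset from the backtrace
    printf 'gdb -batch -ex "info line *%s+%s" %s\n' "$2" "$3" "$1"
}

gdb_info_line ./netbsd.gdb execlists_update_context 0x1234
```

Substitute the real offset from the backtrace for 0x1234 before running the printed command.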

Re: Automated report: NetBSD-current/i386 test failure

2019-12-18 Thread Andrew Doran
On Wed, Dec 18, 2019 at 08:25:15AM +, NetBSD Test Fixture wrote: > This is an automatically generated notice of new failures of the > NetBSD test suite. > > The newly failing test cases are: > > fs/vfs/t_full:lfs_fillfs > fs/vfs/t_io:lfs_extendfile >

Re: current/Xen i386 broken on 2019-12-16 01:20 UTC

2019-12-18 Thread Andrew Doran
Hi, On Wed, Dec 18, 2019 at 09:48:46AM +0100, Martin Husemann wrote: > On Wed, Dec 18, 2019 at 09:41:45AM +0100, Manuel Bouyer wrote: > > kernel diagnostic assertion "pg->offset >= nextoff" failed: file > > "/home/source/ab/HEAD/src/sys/miscfs/genfs/genfs_io.c", line 972 > > We see that on

Re: amd64 -current build failure

2019-12-17 Thread Andrew Doran
Hi, On Tue, Dec 17, 2019 at 09:49:58AM +, Chavdar Ivanov wrote: > Last two days I haven't been able to build amd64 -current: > ... > /home/sysbuild/amd64/tools/lib/gcc/x86_64--netbsd/8.3.0/../../../../x86_64--netbsd/bin/ld: > /home/sysbuild/amd64/destdir/usr/lib/librump.so: undefined

Re: Automated report: NetBSD-current/i386 build failure

2019-12-15 Thread Andrew Doran
On Sun, Dec 15, 2019 at 10:10:15PM +, NetBSD Test Fixture wrote: > This is an automatically generated notice of a NetBSD-current/i386 > build failure. Fixed already with rev 1.285 src/sys/sys/vnode.h. Andrew > The failure occurred on babylon5.netbsd.org, a NetBSD/amd64 host, > using

Re: Automated report: NetBSD-current/i386 build failure

2019-12-10 Thread Andrew Doran
Hi Robert, On Tue, Dec 10, 2019 at 08:54:12AM +0700, Robert Elz wrote: > Date:Tue, 10 Dec 2019 00:49:43 + (UTC) > From:NetBSD Test Fixture > Message-ID: <157593898298.13655.3447375934086628...@babylon5.netbsd.org> > > | This is an automatically generated

Re: Current test failures

2019-12-07 Thread Andrew Doran
On Sat, Dec 07, 2019 at 09:53:35PM +0200, Andreas Gustafsson wrote: > Perhaps, but before Taylor made that commit, at least one other bug > was introduced that is causing the system to panic before finishing > the tests: > > fs/vfs/t_renamerace (726/847): 28 test cases > ext2fs_renamerace:

Re: LOCKDEBUG: Mutex error: mi_switch,528: spin lock held

2019-12-07 Thread Andrew Doran
Hi, On Sat, Dec 07, 2019 at 07:24:32PM +0900, Kimihiro Nonaka wrote: > I got a panic with recent updated source. This should be fixed with rev. 1.330 of sys/kern/kern_synch.c. Thank you, Andrew

Re: Testbed breakage

2019-12-06 Thread Andrew Doran
Hi, On Fri, Dec 06, 2019 at 04:03:05PM +0200, Andreas Gustafsson wrote: > For the last few days, most of the testbeds have been seeing the > system under test either hang or panic before the ATF tests have run > to completion. The failures are too many and varied to file a PR > about each, but

Re: vm.ubc_direct

2019-12-04 Thread Andrew Doran
On Tue, Dec 03, 2019 at 11:28:07PM +0100, Jaromír Doleček wrote: > On Tue, Dec 3, 2019 at 18:59, Chuck Silvers wrote: > > > On Mon, Dec 02, 2019 at 07:10:52PM +0000, Andrew Doran wrote: > > > Hello, > > > > > > In light of the recent discussion

Re: vm.ubc_direct

2019-12-03 Thread Andrew Doran
On Tue, Dec 03, 2019 at 09:58:39AM -0800, Chuck Silvers wrote: > The current ubc_direct code still has the problem that I pointed out > originally, > which is that it deadlocks if you read() or write() a page of a file into > a mapping of itself. We should not enable this by default until that

Re: vm.ubc_direct

2019-12-02 Thread Andrew Doran
On Mon, Dec 02, 2019 at 08:30:58PM +0100, Kamil Rytarowski wrote: > On 02.12.2019 20:10, Andrew Doran wrote: > > Hello, > > > > In light of the recent discussion, and having asked Jaromir his thoughts on > > the subject, we both think it's time to enable this by de

Re: Xen MP hang in pmap

2019-12-02 Thread Andrew Doran
On Mon, Dec 02, 2019 at 03:47:41PM +0100, Manuel Bouyer wrote: > in pmap_update(): > pmap_update(c0c751c0,c13e2000,c13e4000,c13e2000,2000,0,ccf7dd88,c041babf,c0c8d680,c13e2000) > at netbsd:pmap_update+0x21 > uvm_km_kmem_free(c0c8d680,c13e2000,2000,2,0,c13e3000,1,ccf7dd98,c03d267b,c13e2000) > at

vm.ubc_direct

2019-12-02 Thread Andrew Doran
Hello, In light of the recent discussion, and having asked Jaromir his thoughts on the subject, we both think it's time to enable this by default, so it gets wider testing. Is there a good reason not to? Cheers, Andrew

Re: Xen panic in lwp_need_userret()

2019-11-29 Thread Andrew Doran
OK, I just checked in a fix. Andrew On Fri, Nov 29, 2019 at 09:42:44AM +0100, Manuel Bouyer wrote: > On Tue, Nov 26, 2019 at 01:38:08PM +0000, Andrew Doran wrote: > > Hi Manuel, > > > > On Tue, Nov 26, 2019 at 09:01:28AM +0100, Manuel Bouyer wrote: > > > >

Re: Xen panic in lwp_need_userret()

2019-11-29 Thread Andrew Doran
Hi, On Fri, Nov 29, 2019 at 09:42:44AM +0100, Manuel Bouyer wrote: > > Yes indeed, since yesterday with rev 1.51 src/sys/kern/kern_softint.c. > > Well, the 201911261940Z now panics with: > [ 1.000] xenbus0 at hypervisor0: Xen Virtual Bus Interface > [ 1.000] xencons0 at hypervisor0:

Re: Xen panic in lwp_need_userret()

2019-11-26 Thread Andrew Doran
Hi Manuel, On Tue, Nov 26, 2019 at 09:01:28AM +0100, Manuel Bouyer wrote: > Any chance this has been fixed since 2 days ago ? Yes indeed, since yesterday with rev 1.51 src/sys/kern/kern_softint.c. Cheers, Andrew

Re: Crash with HEAD on amd64 - in setrunnable()

2019-11-25 Thread Andrew Doran
Hi Paul, On Sun, Nov 24, 2019 at 07:15:24PM -0800, Paul Goyette wrote: > On Sun, 24 Nov 2019, Paul Goyette wrote: > > > With a very current kernel, I just got this: > > > > # crash -M /var/crash/netbsd.21.core -N /netbsd.gdb > > Crash version 9.99.18, image version 9.99.18. > > System panicked:

Re: Automated report: NetBSD-current/sparc test failure

2019-11-23 Thread Andrew Doran
I checked in a potential fix for this. More scheduler changes to come later today, though. Andrew On Sat, Nov 23, 2019 at 12:48:29PM +, NetBSD Test Fixture wrote: > This is an automatically generated notice of new failures of the > NetBSD test suite. > > The newly failing test cases are: >