Re: Patch #3 Re: UMA panic under load

2002-12-19 Thread Kris Kennaway
On Tue, Dec 17, 2002 at 05:06:06PM -0500, Brian F. Feldman wrote: Matthew Dillon wrote: Whoop. Ok, here's a new patch. I think this covers all the cases. I've done some testing and it appears to do the right thing, please look it over (the last patch had

Re: UMA panic under load

2002-12-14 Thread Brian F. Feldman
John Baldwin wrote: On 12-Dec-2002 Kris Kennaway wrote: I got this on an alpha tonight. It was under heavy load at the time (18 simultaneous package builds had just been spawned on the machine). Any ideas? Slab at 0xfc00042d3fb8, freei 2 = 0. panic: Duplicate

Re: UMA panic under load

2002-12-14 Thread Jake Burkholder
Apparently, On Sat, Dec 14, 2002 at 07:37:31PM -0500, Brian F. Feldman said words to the effect of; John Baldwin wrote: On 12-Dec-2002 Kris Kennaway wrote: I got this on an alpha tonight. It was under heavy load at the time (18 simultaneous package builds

Re: UMA panic under load

2002-12-14 Thread Matthew Dillon
:The problem appears to be that swapout_procs() is swapping out a process :that is in the process of exiting (in exit1()) and having already :relinquished its vmspace, but has not set PRS_ZOMBIE yet (which would be :preventing the swapout). It's clearly not correct for a process in exit1()

Re: UMA panic under load

2002-12-14 Thread Matthew Dillon
:P_WEXIT is set, so the process won't get swapped out. The problem is that :the vmspace refcnt is 0 when swapout_procs is called, since it was :decremented in exit1. The refcnt is incremented before p_flag is tested :for P_WEXIT, the swapout is skipped because it's found to be set, and then
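
A minimal single-threaded sketch of the kind of reference-count race the quoted text describes, replayed deterministically. Every name here (`toy_vmspace`, `naive_scan`, `guarded_scan`, and so on) is hypothetical and none of this is FreeBSD code. The quoted message is cut off after "and then", so the final step (the temporary reference being dropped, which frees the object a second time) is an assumption chosen to match the "Duplicate free" panic reported in the thread. The guarded variant is one illustrative way to avoid it, not the patch that was posted.

```c
#include <assert.h>

/* Toy stand-in for a vmspace. All names are hypothetical. */
struct toy_vmspace {
    int refcnt;      /* references currently held */
    int exiting;     /* nonzero once the owning process has begun to exit */
    int free_calls;  /* how many times the object has been "freed" */
};

/* Count frees instead of releasing memory, so a double free is observable. */
static void toy_free(struct toy_vmspace *vm)
{
    vm->free_calls++;
}

/* Drop one reference and free on the transition to zero. */
static void toy_release(struct toy_vmspace *vm)
{
    if (--vm->refcnt == 0)
        toy_free(vm);
}

/* Naive scanner: takes a temporary reference before checking the flag. */
static void naive_scan(struct toy_vmspace *vm)
{
    vm->refcnt++;        /* 0 -> 1 if the owner has already released */
    if (vm->exiting) {
        /* swapout is skipped, but the temporary reference is still dropped */
    }
    toy_release(vm);     /* 1 -> 0: the object is freed a second time */
}

/* Guarded scanner: never takes a reference on a dying object. */
static void guarded_scan(struct toy_vmspace *vm)
{
    if (vm->exiting || vm->refcnt == 0)
        return;
    vm->refcnt++;
    toy_release(vm);
}

/* Replay: the owner exits first, then the naive scanner runs. */
int run_naive(void)
{
    struct toy_vmspace vm = { 1, 0, 0 };
    vm.exiting = 1;
    toy_release(&vm);    /* exit path: 1 -> 0, first free */
    naive_scan(&vm);
    return vm.free_calls;
}

/* Same replay with the guarded scanner. */
int run_guarded(void)
{
    struct toy_vmspace vm = { 1, 0, 0 };
    vm.exiting = 1;
    toy_release(&vm);    /* exit path: 1 -> 0, first free */
    guarded_scan(&vm);
    return vm.free_calls;
}
```

In this toy, `run_naive()` counts two frees of the same object and `run_guarded()` counts one. A real kernel would also need the flag test and the reference acquisition to be atomic with respect to the exit path, which this single-threaded replay does not model.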

Re: UMA panic under load

2002-12-14 Thread Matthew Dillon
What about something like this. If the vm_refcnt is still being decremented too early, could it be moved to just before the thread_exit() call? -Matt Index: kern/kern_exit.c

Re: UMA panic under load

2002-12-14 Thread Brian F. Feldman
Jake Burkholder wrote: Apparently, On Sat, Dec 14, 2002 at 07:37:31PM -0500, Brian F. Feldman said words to the effect of; John Baldwin wrote: On 12-Dec-2002 Kris Kennaway wrote: I got this on an alpha tonight. It was under heavy load at

Re: UMA panic under load

2002-12-14 Thread Brian F. Feldman
Matthew Dillon wrote: What about something like this. If the vm_refcnt is still being decremented too early, could it be moved to just before the thread_exit() call? The problem that had to be fixed by removing this race was that two processes with the same

Re: UMA panic under load

2002-12-14 Thread Matthew Dillon
It's a big mess. exit1() sets up vm->vm_freer = p and then vmspace_exitfree() tests that and calls vmspace_dofree(). It looks like vm->vm_freer is acting like an exit-lock, so only one process/thread actually frees the vmspace. But there are still some serious race conditions.

Re: UMA panic under load

2002-12-14 Thread Matthew Dillon
Here's another go at a patch (untested). -Matt Index: kern/kern_exit.c === RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v retrieving revision 1.187 diff -u -r1.187 kern_exit.c ---

(lots of posts today Matt!) Re: UMA panic under load

2002-12-14 Thread Matthew Dillon
oops, sorry, I blew that patch. exitingcnt would have to be incremented unconditionally. -Matt Index: kern/kern_exit.c === RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v retrieving

Patch #3 Re: UMA panic under load

2002-12-14 Thread Matthew Dillon
Whoop. Ok, here's a new patch. I think this covers all the cases. I've done some testing and it appears to do the right thing, please look it over (the last patch had typos and didn't cover the correct cases). -Matt Index:

Re: Another UMA panic under load

2002-12-13 Thread Terry Lambert
Andrew Gallatin wrote: Ugh. Since it may call kmem_malloc(), UMA must hold Giant. This is the same problem the mbuf system has, and it's what's keeping network device drivers under Giant in 5.0. Both subsystems should probably have GIANT_REQUIRED at all entry points so as to catch locking

UMA panic under load

2002-12-12 Thread Kris Kennaway
I got this on an alpha tonight. It was under heavy load at the time (18 simultaneous package builds had just been spawned on the machine). Any ideas? Slab at 0xfc00042d3fb8, freei 2 = 0. panic: Duplicate free of item 0xfc00042d22e0 from zone 0xfc0007d31800(VMSPACE)

RE: UMA panic under load

2002-12-12 Thread John Baldwin
On 12-Dec-2002 Kris Kennaway wrote: I got this on an alpha tonight. It was under heavy load at the time (18 simultaneous package builds had just been spawned on the machine). Any ideas? Slab at 0xfc00042d3fb8, freei 2 = 0. panic: Duplicate free of item 0xfc00042d22e0 from zone

Another UMA panic under load

2002-12-12 Thread Kris Kennaway
I think this is the same one I reported a few days ago (another alpha under heavy load). panic: mutex Giant not owned at /local0/src-client/sys/vm/vm_kern.c:312 db_print_backtrace() at db_print_backtrace+0x18 panic() at panic+0x104 _mtx_assert() at _mtx_assert+0xb4 kmem_malloc() at

Re: Another UMA panic under load

2002-12-12 Thread Andrew Gallatin
Ugh. Since it may call kmem_malloc(), UMA must hold Giant. This is the same problem the mbuf system has, and it's what's keeping network device drivers under Giant in 5.0. Both subsystems should probably have GIANT_REQUIRED at all entry points so as to catch locking problems like this earlier.
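
A rough sketch of why asserting lock ownership at every entry point catches problems earlier than asserting only where the lock is actually needed. Everything here is hypothetical: `toy_lock`, `toy_giant`, `alloc_check_deep` and `alloc_check_entry` are toy stand-ins, the "thread id" is a plain integer, and violations are reported through a return value rather than a panic. It is not the kernel's GIANT_REQUIRED macro or UMA's real allocation path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that remembers its owner. All names are hypothetical. */
struct toy_lock {
    bool held;
    int owner;   /* toy thread id, -1 when unowned */
};

struct toy_lock toy_giant = { false, -1 };

void toy_lock_acquire(struct toy_lock *l, int tid)
{
    l->held = true;
    l->owner = tid;
}

void toy_lock_release(struct toy_lock *l)
{
    l->held = false;
    l->owner = -1;
}

bool toy_lock_owned(const struct toy_lock *l, int tid)
{
    return l->held && l->owner == tid;
}

/*
 * Ownership is checked only on the slow path that needs the lock.
 * Returns true if a locking violation was detected on this call.
 */
bool alloc_check_deep(int tid, bool cache_hit)
{
    if (cache_hit)
        return false;   /* fast path: a missing lock goes unnoticed */
    return !toy_lock_owned(&toy_giant, tid);
}

/*
 * Ownership is checked at the entry point, on every call.
 * Returns true if a locking violation was detected on this call.
 */
bool alloc_check_entry(int tid, bool cache_hit)
{
    (void)cache_hit;    /* the check does not depend on which path is taken */
    return !toy_lock_owned(&toy_giant, tid);
}
```

With the lock not held, `alloc_check_deep(1, true)` reports nothing because the fast path never reaches the check, while `alloc_check_entry(1, true)` flags the caller immediately. The deep check fires only on the rare slow path, which is consistent with the panics in this thread showing up under heavy load.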

uma panic

2002-03-23 Thread Kris Kennaway
I upgraded the bento package building cluster to a more recent -current to try and get packages building again (every other snapshot I've tried for the last week has either been broken or does not build). One of the client machines panicked after about an hour under load: The kernel and

Re: uma panic

2002-03-23 Thread Robert Watson
I think I've run into this pre-UMA, so suspect it's not a UMA panic. I could be wrong. Robert N M Watson FreeBSD Core Team, TrustedBSD Project NAI Labs, Safeport Network Services On Sat, 23 Mar 2002, Kris Kennaway wrote: I upgraded the bento package

Re: uma panic

2002-03-23 Thread David O'Brien
On Sat, Mar 23, 2002 at 12:41:58AM -0800, Kris Kennaway wrote: I upgraded the bento package building cluster to a more recent -current to try and get packages building again (every other snapshot Could you run the actual DP#1 code? This would make sure the packages match the snapshot, and

Re: uma panic

2002-03-23 Thread Dag-Erling Smorgrav
Kris Kennaway writes: I upgraded the bento package building cluster to a more recent -current to try and get packages building again (every other snapshot I've tried for the last week has either been broken or does not build). One of the client machines panicked after about

Re: uma panic

2002-03-23 Thread Kris Kennaway
On Sat, Mar 23, 2002 at 09:23:33AM -0800, David O'Brien wrote: On Sat, Mar 23, 2002 at 12:41:58AM -0800, Kris Kennaway wrote: I upgraded the bento package building cluster to a more recent -current to try and get packages building again (every other snapshot Could you run the actual DP#1