Re: KSE signal problems still

2002-07-03 Thread Bruce Evans

On Tue, 2 Jul 2002, Julian Elischer wrote:

 On Tue, 2 Jul 2002, Andrew Gallatin wrote:

 
  An easy way to induce a panic w/a post KSE -current is to ^C gdb as it
  starts on an SMP machine:

 A possibly related breakage is:

 type ^Z while doing make buildworld (or something similar).

 when you type 'fg' there is a high chance the build will abort..
 
  # gdb -k /var/crash/kernel.1  /var/crash/vmcore.1
  GNU gdb 5.2.0 (FreeBSD) 20020627
  Copyright 2002 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and
  you are
  welcome to change it and/or distribute copies of it under certain
  conditions.
  Type show copying to see the conditions.
  There is absolutely no warranty for GDB.  Type show warranty for
  details.
  This GDB was configured as i386-undermydesk-freebsd...
  ^C
 
  panic: mutex sched lock not owned at ../../../kern/subr_smp.c:126
  cpuid = 1; lapic.id = 0100
  Debugger(panic)
  Stopped at  Debugger+0x46:  xchgl   %ebx,in_Debugger.0
  db> where
  No such command
  db> tr
  Debugger(c02dbf5a) at Debugger+0x46
  panic(c02db1a8,c02db318,c02df736,7e,c4445540) at panic+0xd6
  _mtx_assert(c0315440,1,c02df736,7e) at _mtx_assert+0xa8
  forward_signal(c4445540) at forward_signal+0x1a
  tdsignal(c4445540,2,2) at tdsignal+0x182
  psignal(c443d558,2) at psignal+0x3c8
  pgsignal(c441ad00,2,1,c441ad1c,0) at pgsignal+0x63
  ttyinput(3,c41e8e30,c41e8e00,0,c0347903) at ttyinput+0x316
  ptcwrite(c4307a00,d7d5ec88,7f0011,1,d7d5ebc4) at ptcwrite+0x17f
  spec_write(d7d5ebf0,d7d5ec3c,c0204cc8,d7d5ebf0,7f0011) at spec_write+0x5a
  spec_vnoperate(d7d5ebf0) at spec_vnoperate+0x13
  vn_write(c41ded5c,d7d5ec88,c440cd80,0,c409e780) at vn_write+0x1c8
  dofilewrite(c409e780,c41ded5c,5,8088000,1) at dofilewrite+0xaf
  write(c409e780,d7d5ed14,3,b,282) at write+0x39
  syscall(2f,2f,2f,1,8073410) at syscall+0x23c
  syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
  --- syscall (4, FreeBSD ELF, write), eip = 0x281fb3a3, esp =
  0xbfbff37c, ebp = 0xbfbff3e8 ---
 
 

 hum

 so, the question is:
 where should we get the sched lock?

Maybe just remove the foot-shooting that releases it?

% Index: kern_sig.c
% ===
% RCS file: /home/ncvs/src/sys/kern/kern_sig.c,v
% retrieving revision 1.170
% retrieving revision 1.171
% diff -u -1 -r1.170 -r1.171
% --- kern_sig.c  29 Jun 2002 02:00:01 -  1.170
% +++ kern_sig.c  29 Jun 2002 17:26:18 -  1.171
% @@ -1486,15 +1540,9 @@
%  */
% - if (p->p_stat == SRUN) {
% + mtx_unlock_spin(&sched_lock);
^ shoot foot
% + if (td->td_state == TDS_RUNQ ||
% +     td->td_state == TDS_RUNNING) {

I think sched_lock is needed for checking td_state too (strictly to use
the result of the check, so the lock is not critical if the use doesn't
do anything harmful), but there is no lock indication for td_state
in proc.h like there used to be for p_stat.

% + signotify(td->td_proc);

Holding sched_lock when calling signotify() used to be an error, but that
was changed in rev.1.155.  This signotify() call seems to be bogus anyway.
signotify() should only be called after the signal mask is changed.  The
call to signotify() here was removed in rev.1.154 when the semantics of
signotify() was changed a little.  Bogus calls to signotify() just waste
time.

%  #ifdef SMP
% - struct kse *ke;
% - struct thread *td = curthread;
% -/* we should only deliver to one thread.. but which one? */
% - FOREACH_KSEGRP_IN_PROC(p, kg) {
% - FOREACH_KSE_IN_GROUP(kg, ke) {
% - if (ke->ke_thread == td) {
% - continue;
% - }
% - forward_signal(ke->ke_thread);
% - }
% - }
% + if (td->td_state == TDS_RUNNING && td != curthread)
% + forward_signal(td);
%  #endif

forward_signal() was called with sched_lock held in rev.1.170, and
forward_signal() still requires it to be held.  I think sched_lock is
needed for checking td_state too, as above.  Here it is fairly clear
that calling forward_signal() bogusly after losing a race is harmless.
It just wakes up td to look for a signal that isn't there or can't
be handled yet.  Since this only happens if we lose a race, it may be
more efficient to let it happen (rarely) than to lock (always) to prevent
it happening.  But we already held the lock so the locking was free
except for latency issues.
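
The ordering problem can be pictured with a small userland model. This is only a sketch, not the kernel code: `model_lock`, `model_forward_signal()` and the two `tdsignal_*` functions are hypothetical stand-ins, the lock is single-threaded, and the assertion that panics in the real forward_signal() is modelled as an error return so the two orderings can be compared.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of a spin mutex; not the kernel's struct mtx. */
struct model_lock {
	bool held;
};

static void
model_lock_spin(struct model_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void
model_unlock_spin(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

/*
 * Stand-in for forward_signal(): the caller must hold the lock.
 * Where the kernel would panic via its mutex assertion, the model
 * returns -1 instead.
 */
static int
model_forward_signal(const struct model_lock *l, int *forwarded)
{
	if (!l->held)
		return (-1);
	(*forwarded)++;
	return (0);
}

/* Ordering as in rev 1.171: drop the lock, then forward. */
static int
tdsignal_unlock_first(struct model_lock *l, int *forwarded)
{
	model_lock_spin(l);
	model_unlock_spin(l);		/* the "shoot foot" line */
	return (model_forward_signal(l, forwarded));
}

/* Ordering suggested above: forward first, then drop the lock. */
static int
tdsignal_unlock_last(struct model_lock *l, int *forwarded)
{
	int error;

	model_lock_spin(l);
	error = model_forward_signal(l, forwarded);
	model_unlock_spin(l);
	return (error);
}
```

In this model, `tdsignal_unlock_first()` reports the failure that shows up as the panic in the trace, while `tdsignal_unlock_last()` forwards once and leaves the lock released.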

Bruce


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: KSE signal problems still

2002-07-03 Thread Bruce Evans

On Wed, 3 Jul 2002, Bruce Evans wrote:

 Maybe just remove the foot-shooting that releases it?

 % Index: kern_sig.c
 % ===
 % RCS file: /home/ncvs/src/sys/kern/kern_sig.c,v
 % retrieving revision 1.170
 % retrieving revision 1.171
 % diff -u -1 -r1.170 -r1.171
 % --- kern_sig.c  29 Jun 2002 02:00:01 -  1.170
 % +++ kern_sig.c  29 Jun 2002 17:26:18 -  1.171
 % @@ -1486,15 +1540,9 @@
 %  */
 % -   if (p->p_stat == SRUN) {
 % +   mtx_unlock_spin(&sched_lock);
   ^ shoot foot
 % +   if (td->td_state == TDS_RUNQ ||
 % +       td->td_state == TDS_RUNNING) {

 I think sched_lock is needed for checking td_state too (strictly to use
 the result of the check, so the lock is not critical if the use doesn't
 do anything harmful), but there is no lock indication for td_state
 in proc.h like there used to be for p_stat.

 % +   signotify(td->td_proc);

 Holding sched_lock when calling signotify() used to be an error, but that
 was changed in rev.1.155.  This signotify() call seems to be bogus anyway.
 signotify() should only be called after the signal mask is changed.  The
 call to signotify() here was removed in rev.1.154 when the semantics of
 signotify() was changed a little.  Bogus calls to signotify() just waste
 time.

 %  #ifdef SMP
 % -   struct kse *ke;
 % -   struct thread *td = curthread;
 % -/* we should only deliver to one thread.. but which one? */
 % -   FOREACH_KSEGRP_IN_PROC(p, kg) {
 % -   FOREACH_KSE_IN_GROUP(kg, ke) {
 % -   if (ke->ke_thread == td) {
 % -   continue;
 % -   }
 % -   forward_signal(ke->ke_thread);
 % -   }
 % -   }
 % +   if (td->td_state == TDS_RUNNING && td != curthread)
 % +   forward_signal(td);
 %  #endif

 forward_signal() was called with sched_lock held in rev.1.170, and
 forward_signal() still requires it to be held.  I think sched_lock is
 needed for checking td_state too, as above.  Here it is fairly clear
 that calling forward_signal() bogusly after losing a race is harmless.
 It just wakes up td to look for a signal that isn't there or can't
 be handled yet.  Since this only happens if we lose a race, it may be
 more efficient to let it happen (rarely) than to lock (always) to prevent
 it happening.  But we already held the lock so the locking was free
 except for latency issues.

 Bruce

Untested fix for these bugs and some style bugs in tdsignal():

Index: kern_sig.c
===
RCS file: /home/ncvs/src/sys/kern/kern_sig.c,v
retrieving revision 1.171
diff -u -2 -r1.171 kern_sig.c
--- kern_sig.c  29 Jun 2002 17:26:18 -  1.171
+++ kern_sig.c  3 Jul 2002 07:42:31 -
@@ -1468,5 +1449,5 @@
 /*
  * The force of a signal has been directed against a single
- * thread. We need to see what we can do about knocking it
+ * thread.  We need to see what we can do about knocking it
  * out of any sleep it may be in etc.
  */
@@ -1485,8 +1466,7 @@
 	 */
 	mtx_lock_spin(&sched_lock);
-	if ((action == SIG_DFL) && (prop & SA_KILL)) {
-		if (td->td_priority > PUSER) {
+	if (action == SIG_DFL && (prop & SA_KILL)) {
+		if (td->td_priority > PUSER)
 			td->td_priority = PUSER;
-		}
 	}
 	mtx_unlock_spin(&sched_lock);
@@ -1496,7 +1476,7 @@
 	 * except that stopped processes must be continued by SIGCONT.
 	 */
-	if (action == SIG_HOLD) {
+	if (action == SIG_HOLD)
 		goto out;
-	}
+
 	mtx_lock_spin(&sched_lock);
 	if (td->td_state == TDS_SLP) {
@@ -1531,24 +1511,17 @@
 		}
 		goto runfast;
-		/* NOTREACHED */
-
 	} else {
 		/*
-		 * Other states do nothing with the signal immediatly,
+		 * Other states do nothing with the signal immediately,
 		 * other than kicking ourselves if we are running.
 		 * It will either never be noticed, or noticed very soon.
 		 */
-		mtx_unlock_spin(&sched_lock);
-		if (td->td_state == TDS_RUNQ ||
-		    td->td_state == TDS_RUNNING) {
-			signotify(td->td_proc);
 #ifdef SMP
-			if (td->td_state == TDS_RUNNING && td != curthread)
-				forward_signal(td);
+		if (td->td_state == TDS_RUNNING && td != curthread)
+			forward_signal(td);
 #endif
-		}
+		mtx_unlock_spin(&sched_lock);
 		goto out;
 	}
-	/*NOTREACHED*/

 

Re: KSE signal problems still

2002-07-03 Thread Julian Elischer



On Wed, 3 Jul 2002, Bruce Evans wrote:

 On Tue, 2 Jul 2002, Julian Elischer wrote:
 
 Maybe just remove the foot-shooting that releases it?

Yes, I'm rationalising it at the moment.
It turns out that just holding it for all of tdsignal() works well.
Also, removing it from setrunnable() is OK, as all the callers I could find
have already locked it.

I checked in a stopgap to stop the panics, but I'm reworking it now.
The trouble is that thread semantics are really not well
defined for multithreaded processes.
What does it mean to make a process run when it has many threads?

Should ALL threads be awakened, or is it enough if ONE thread awakens to
deliver the signal?

For right now it's mostly important that single-threaded processes act
as they used to. We can always change how multithreaded processes
work.





 
 % Index: kern_sig.c
 % ===
 % RCS file: /home/ncvs/src/sys/kern/kern_sig.c,v
 % retrieving revision 1.170
 % retrieving revision 1.171
 % diff -u -1 -r1.170 -r1.171
 % --- kern_sig.c  29 Jun 2002 02:00:01 -  1.170
 % +++ kern_sig.c  29 Jun 2002 17:26:18 -  1.171
 % @@ -1486,15 +1540,9 @@
 %  */
 % -   if (p->p_stat == SRUN) {
 % +   mtx_unlock_spin(&sched_lock);
   ^ shoot foot
 % +   if (td->td_state == TDS_RUNQ ||
 % +       td->td_state == TDS_RUNNING) {
 
 I think sched_lock is needed for checking td_state too (strictly to use
 the result of the check, so the lock is not critical if the use doesn't
 do anything harmful), but there is no lock indication for td_state
 in proc.h like there used to be for p_stat.
 
 % +   signotify(td->td_proc);
 
 Holding sched_lock when calling signotify() used to be an error, but that
 was changed in rev.1.155.  This signotify() call seems to be bogus anyway.
 signotify() should only be called after the signal mask is changed.  The
 call to signotify() here was removed in rev.1.154 when the semantics of
 signotify() was changed a little.  Bogus calls to signotify() just waste
 time.

signotify() is already called in psignal(), so I've removed this one
from my version.

 
 %  #ifdef SMP
 % -   struct kse *ke;
 % -   struct thread *td = curthread;
 % -/* we should only deliver to one thread.. but which one? */
 % -   FOREACH_KSEGRP_IN_PROC(p, kg) {
 % -   FOREACH_KSE_IN_GROUP(kg, ke) {
 % -   if (ke->ke_thread == td) {
 % -   continue;
 % -   }
 % -   forward_signal(ke->ke_thread);
 % -   }
 % -   }
 % +   if (td->td_state == TDS_RUNNING && td != curthread)
 % +   forward_signal(td);
 %  #endif
 
 forward_signal() was called with sched_lock held in rev.1.170, and
 forward_signal() still requires it to be held.  I think sched_lock is
 needed for checking td_state too, as above.  Here it is fairly clear
 that calling forward_signal() bogusly after losing a race is harmless.
 It just wakes up td to look for a signal that isn't there or can't
 be handled yet.  Since this only happens if we lose a race, it may be
 more efficient to let it happen (rarely) than to lock (always) to prevent
 it happening.  But we already held the lock so the locking was free
 except for latency issues.

Much of what you say will be in my next commit.
I told Andrew Gallatin that I would work on cleaning up
tdsignal and maybe psignal tonight, so that's what I've been doing.

It's not perfect, though, but it clears it up a bit.
I'm just testing it at the moment.


 
 Bruce
 
 





Re: KSE signal problems still

2002-07-03 Thread John Baldwin


On 03-Jul-2002 Julian Elischer wrote:
 
 
 On Wed, 3 Jul 2002, John Baldwin wrote:
 
 
 Erm, I thought I changed signotify() to require sched_lock and made the
 second half of psignal() (the whole case statement) lock sched_lock.
 Did you change that?  (To Julian)
 
 psignal as a whole hasn't existed in the KSE tree since December.
 
 I must have missed it in the complicated merge that came from that in P4.
 
 I just checked it in like this for now to stop 
 the panics until I can work out what the equivalent
 change to yours is..
 
 (feel free to check out the new psignal/tdsignal
 combination.)

Well then it must be full of races then that were fixed since DP1.
*sigh*  I wonder how many other things were lost and need to be
reimplemented.

-- 

John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/




Re: KSE signal problems still

2002-07-03 Thread Terry Lambert

Julian Elischer wrote:
 Should ALL threads be awakened, or is it enough if ONE thread awakens to
 deliver the signal?
 
 For right now it's mostly important that single-threaded processes act
 as they used to. We can always change how multithreaded processes
 work.

POSIX makes no guarantees for per-thread delivery of signals.

Specifically, signals are not thread-things, they are process-things,
and there are separate thread-things for sending the
moral equivalents (e.g. pthread_kill) to threads on an individual
basis, but the system is not expected to make a distinction on
signal delivery as to which thread is running, nor is there
expected to be per-thread masking, etc.

Garrett would probably be the right person to ask; he's a much
better POSIX lawyer.

This is really the problem I tried to explain earlier when it
came to the disabling of SIG_POLL on a per-descriptor basis.

-- Terry




Re: KSE signal problems still

2002-07-03 Thread Julian Elischer



On Wed, 3 Jul 2002, John Baldwin wrote:

 
 On 03-Jul-2002 Julian Elischer wrote:
  
  
  On Wed, 3 Jul 2002, John Baldwin wrote:
  
  
  Erm, I thought I changed signotify() to require sched_lock and made the
  second half of psignal() (the whole case statement) lock sched_lock.
  Did you change that?  (To Julian)
  
  psignal as a whole hasn't existed in the KSE tree since December.
  
  I must have missed it in the complicated merge that came from that in P4.
  
  I just checked it in like this for now to stop 
  the panics until I can work out what the equivalent
  change to yours is..
  
  (feel free to check out the new psignal/tdsignal
  combination.)
 
 Well then it must be full of races then that were fixed since DP1.
 *sigh*  I wonder how many other things were lost and need to be
 reimplemented.
 

psignal() is, aside from kern_switch.c, probably the largest single casualty.
I'm just checking in a cleanup now.
Wait a few minutes.


 -- 
 
 John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
 Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/
 
 





Re: KSE signal problems still

2002-07-03 Thread Julian Elischer


Expanding on my own mail:

On Wed, 3 Jul 2002, Julian Elischer wrote:

 On Wed, 3 Jul 2002, John Baldwin wrote:
 
  
  Well then it must be full of races then that were fixed since DP1.
  *sigh*  I wonder how many other things were lost and need to be
  reimplemented.
  

Almost anything you checked into psignal will need looking at.
It may not be missing, but since signals for threaded processes are
fundamentally different from signals for non-threaded processes, some
things just don't apply.

For example, if you checked in something to code that just doesn't exist
any more in a KSE kernel, what is the correct integration?

Each one has to be evaluated on its own.





 
 





Re: KSE signal problems still

2002-07-03 Thread John Baldwin


On 03-Jul-2002 Julian Elischer wrote:
 
 Expanding on my own mail:
 
 On Wed, 3 Jul 2002, Julian Elischer wrote:
 
 On Wed, 3 Jul 2002, John Baldwin wrote:
 
  
  Well then it must be full of races then that were fixed since DP1.
  *sigh*  I wonder how many other things were lost and need to be
  reimplemented.
  
 
 Almost anything you checked into psignal will need looking at.
 It may not be missing, but since signals for threaded processes are
 fundamentally different from signals for non-threaded processes, some
 things just don't apply.
 
 For example, if you checked in something to code that just doesn't exist
 any more in a KSE kernel, what is the correct integration?
 
 Each one has to be evaluated on its own.

The one in question here was fairly simple, it just expanded the sched_lock
locking some.

The argument could be made that you shouldn't be checking in stuff until
you know how it works, etc., or that you could commit in smaller pieces
(say, get multiple threads per process for kernel processes working in
the scheduler and just ignoring userland-only things like signals until
you have the other working).

-- 

John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/




Re: KSE signal problems still

2002-07-03 Thread Julian Elischer



On Wed, 3 Jul 2002, John Baldwin wrote:

 
 The argument could be made that you shouldn't be checking in stuff
 until you know how it works, etc., or that you could commit in smaller
 pieces (say, get multiple threads per process for kernel processes
 working in the scheduler and just ignoring userland-only things like
 signals until you have the other working).

You can't do those separately, unfortunately.

Anyhow, it's not that I don't understand it, it's just that
it's complicated.

The new version is as close as I can get quickly, but it still
needs some cleaning.

 
 --
 
 John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
 Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/
 





Re: KSE signal problems still

2002-07-03 Thread John Baldwin


On 03-Jul-2002 Julian Elischer wrote:
 
 
 On Wed, 3 Jul 2002, John Baldwin wrote:
 
 
 The argument could be made that you shouldn't be checking in stuff
 until you know how it works, etc., or that you could commit in smaller
 pieces (say, get multiple threads per process for kernel processes
 working in the scheduler and just ignoring userland-only things like
 signals until you have the other working).
 
 You can't do those separately, unfortunately.

Sure you could, just have kernel-only KSE processes at first and
use some special kernel processes for testing.  They would never
return to userland but would be adequate to test that all the
various run and sleep queues, etc. worked fine.

 anyhow, it's not that I don't understand it, it's just that
 it's complicated.. 

That part of my message was overly harsh.  I'm sorry.

-- 

John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/




KSE signal problems still

2002-07-02 Thread Andrew Gallatin


An easy way to induce a panic w/a post KSE -current is to ^C gdb as it
starts on an SMP machine:

# gdb -k /var/crash/kernel.1  /var/crash/vmcore.1 
GNU gdb 5.2.0 (FreeBSD) 20020627
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for
details.
This GDB was configured as i386-undermydesk-freebsd...
^C

panic: mutex sched lock not owned at ../../../kern/subr_smp.c:126
cpuid = 1; lapic.id = 0100
Debugger(panic)
Stopped at  Debugger+0x46:  xchgl   %ebx,in_Debugger.0
db> where
No such command
db> tr
Debugger(c02dbf5a) at Debugger+0x46
panic(c02db1a8,c02db318,c02df736,7e,c4445540) at panic+0xd6
_mtx_assert(c0315440,1,c02df736,7e) at _mtx_assert+0xa8
forward_signal(c4445540) at forward_signal+0x1a
tdsignal(c4445540,2,2) at tdsignal+0x182
psignal(c443d558,2) at psignal+0x3c8
pgsignal(c441ad00,2,1,c441ad1c,0) at pgsignal+0x63
ttyinput(3,c41e8e30,c41e8e00,0,c0347903) at ttyinput+0x316
ptcwrite(c4307a00,d7d5ec88,7f0011,1,d7d5ebc4) at ptcwrite+0x17f
spec_write(d7d5ebf0,d7d5ec3c,c0204cc8,d7d5ebf0,7f0011) at spec_write+0x5a
spec_vnoperate(d7d5ebf0) at spec_vnoperate+0x13
vn_write(c41ded5c,d7d5ec88,c440cd80,0,c409e780) at vn_write+0x1c8
dofilewrite(c409e780,c41ded5c,5,8088000,1) at dofilewrite+0xaf
write(c409e780,d7d5ed14,3,b,282) at write+0x39
syscall(2f,2f,2f,1,8073410) at syscall+0x23c
syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
--- syscall (4, FreeBSD ELF, write), eip = 0x281fb3a3, esp =
0xbfbff37c, ebp = 0xbfbff3e8 ---


This is a kernel with updated versions of kern_synch.c and
kern_condvar.c:

lcvs status kern/kern_synch.c kern/kern_condvar.c
===
File: kern_synch.c      Status: Up-to-date

   Working revision:    1.179   Tue Jul  2 20:18:15 2002
   Repository revision: 1.179   /home/ncvs/src/sys/kern/kern_synch.c,v
   Sticky Tag:          (none)
   Sticky Date:         (none)
   Sticky Options:      (none)

===
File: kern_condvar.c    Status: Up-to-date

   Working revision:    1.24    Tue Jul  2 20:18:14 2002
   Repository revision: 1.24    /home/ncvs/src/sys/kern/kern_condvar.c,v
   Sticky Tag:          (none)
   Sticky Date:         (none)
   Sticky Options:      (none)


I apologize if I'm being redundant, but the FreeBSD mail server seems
to be stuck -- I haven't gotten any messages on committers or -current
in hours.

Drew




Re: KSE signal problems still

2002-07-02 Thread Julian Elischer



On Tue, 2 Jul 2002, Andrew Gallatin wrote:

 
 An easy way to induce a panic w/a post KSE -current is to ^C gdb as it
 starts on an SMP machine:

A possibly related breakage is:

Type ^Z while doing make buildworld (or something similar).

When you type 'fg' there is a high chance the build will abort.


 
 # gdb -k /var/crash/kernel.1  /var/crash/vmcore.1 
 GNU gdb 5.2.0 (FreeBSD) 20020627
 Copyright 2002 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and
 you are
 welcome to change it and/or distribute copies of it under certain
 conditions.
 Type show copying to see the conditions.
 There is absolutely no warranty for GDB.  Type show warranty for
 details.
 This GDB was configured as i386-undermydesk-freebsd...
 ^C
 
 panic: mutex sched lock not owned at ../../../kern/subr_smp.c:126
 cpuid = 1; lapic.id = 0100
 Debugger(panic)
 Stopped at  Debugger+0x46:  xchgl   %ebx,in_Debugger.0
 db> where
 No such command
 db> tr
 Debugger(c02dbf5a) at Debugger+0x46
 panic(c02db1a8,c02db318,c02df736,7e,c4445540) at panic+0xd6
 _mtx_assert(c0315440,1,c02df736,7e) at _mtx_assert+0xa8
 forward_signal(c4445540) at forward_signal+0x1a
 tdsignal(c4445540,2,2) at tdsignal+0x182
 psignal(c443d558,2) at psignal+0x3c8
 pgsignal(c441ad00,2,1,c441ad1c,0) at pgsignal+0x63
 ttyinput(3,c41e8e30,c41e8e00,0,c0347903) at ttyinput+0x316
 ptcwrite(c4307a00,d7d5ec88,7f0011,1,d7d5ebc4) at ptcwrite+0x17f
 spec_write(d7d5ebf0,d7d5ec3c,c0204cc8,d7d5ebf0,7f0011) at spec_write+0x5a
 spec_vnoperate(d7d5ebf0) at spec_vnoperate+0x13
 vn_write(c41ded5c,d7d5ec88,c440cd80,0,c409e780) at vn_write+0x1c8
 dofilewrite(c409e780,c41ded5c,5,8088000,1) at dofilewrite+0xaf
 write(c409e780,d7d5ed14,3,b,282) at write+0x39
 syscall(2f,2f,2f,1,8073410) at syscall+0x23c
 syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
 --- syscall (4, FreeBSD ELF, write), eip = 0x281fb3a3, esp =
 0xbfbff37c, ebp = 0xbfbff3e8 ---
 
 

hum

so, the question is:
where should we get the sched lock?


 
 Drew
 





Re: KSE signal problems still

2002-07-02 Thread Andrew Gallatin


Julian Elischer writes:
  
  
  On Tue, 2 Jul 2002, Andrew Gallatin wrote:
  
   
   An easy way to induce a panic w/a post KSE -current is to ^C gdb as it
   starts on an SMP machine:
  
  A possibly related breakage is:
  
  type ^Z while doing make buildworld (or something similar).
  
  when you type 'fg' there is a high chance the build will abort..
  

This is nearly 100% for me.  But only on MP boxes.  On my uniprocessor
alpha, things work just fine.  Oh.. hmm.. I'm not sure if I have
witless compiled in there.. 

Drew




Re: KSE signal problems still

2002-07-02 Thread Julian Elischer



On Tue, 2 Jul 2002, Andrew Gallatin wrote:

 
 Julian Elischer writes:
   
   
   On Tue, 2 Jul 2002, Andrew Gallatin wrote:
   

An easy way to induce a panic w/a post KSE -current is to ^C gdb as it
starts on an SMP machine:
   
   A possibly related breakage is:
   
   type ^Z while doing make buildworld (or something similar).
   
   when you type 'fg' there is a high chance the build will abort..
   
 
 This is nearly 100% for me.  But only on MP boxes.  On my uniprocessor
 alpha, things work just fine.  Oh.. hmm.. I'm not sure if I have
 witless compiled in there.. 

which is almost 100%? the ^Z killing the process, or ^C killing the
machine?

 
 Drew
 





Re: KSE signal problems still

2002-07-02 Thread Matthew Dillon

:...
: 
:   
:   This is nearly 100% for me.  But only on MP boxes.  On my uniprocessor
:   alpha, things work just fine.  Oh.. hmm.. I'm not sure if I have
:   witless compiled in there.. 
:  
:  which is almost 100%? the ^Z killing the process, or ^C killing the
:  machine?
:
:^C killing the machine.
:
:Drew

How are we doing on IA32?  I've successfully run 9 buildworld -j 5's
so far with a SMP build of -current.   I'm going to run a bunch more
and then I'll switch to testing signals (a buildworld only generates 4 or
5 signals over the entire build so it isn't a good test for signal-related
issues).

-Matt
Matthew Dillon 
[EMAIL PROTECTED]




Re: KSE signal problems still

2002-07-02 Thread Julian Elischer

try this:

In tdsignal() (kern_sig.c),
take a lock on sched_lock and release it again, just around the call to
forward_signal():

forward_signal(c4445540) at forward_signal+0x1a
tdsignal(c4445540,2,2) at tdsignal+0x182
psignal(c443d558,2) at psignal+0x3c8

Hopefully this will not be called with sched_lock already locked.

If we panic because we already own it, it gets more difficult.

On Tue, 2 Jul 2002, Andrew Gallatin wrote:

 
 Julian Elischer writes:
   
   
   On Tue, 2 Jul 2002, Andrew Gallatin wrote:
   

Julian Elischer writes:
  
  
  On Tue, 2 Jul 2002, Andrew Gallatin wrote:
  
   
   An easy way to induce a panic w/a post KSE -current is to ^C gdb as it
   starts on an SMP machine:
  
  A possibly related breakage is:
  
  type ^Z while doing make buildworld (or something similar).
  
  when you type 'fg' there is a high chance the build will abort..
  

This is nearly 100% for me.  But only on MP boxes.  On my uniprocessor
alpha, things work just fine.  Oh.. hmm.. I'm not sure if I have
witless compiled in there.. 
   
   which is almost 100%? the ^Z killing the process, or ^C killing the
   machine?
 
 ^C killing the machine.
 
 Drew
 






Re: KSE signal problems still

2002-07-02 Thread Andrew Gallatin


Matthew Dillon writes:
  :...
  : 
  :   
  :   This is nearly 100% for me.  But only on MP boxes.  On my uniprocessor
  :   alpha, things work just fine.  Oh.. hmm.. I'm not sure if I have
  :   witless compiled in there.. 
  :  
  :  which is almost 100%? the ^Z killing the process, or ^C killing the
  :  machine?
  :
  :^C killing the machine.
  :
  :Drew
  
  How are we doing on IA32?  I've successfully run 9 buildworld -j 5's
  so far with a SMP build of -current.   I'm going to run a bunch more
  and then I'll switch to testing signals (a buildworld only generates 4 or
  5 signals over the entire build so it isn't a good test for signal-related
  issues).
  

The above refers to IA32.  My (UP, w/o witness) alpha seems solid.  No
panics so far.  It's the SMP IA32 box that keeps falling on its face
with a signal..

Drew




Re: KSE signal problems still

2002-07-02 Thread Andrew Gallatin


Julian Elischer writes:
  try this:
  
  In tdsignal() (kern_sig.c),
  take a lock on sched_lock and release it again, just around the call to
  forward_signal():
  
  forward_signal(c4445540) at forward_signal+0x1a
  tdsignal(c4445540,2,2) at tdsignal+0x182
  psignal(c443d558,2) at psignal+0x3c8
  
  Hopefully this will not be called with sched_lock already locked.
  

Following your suggestion, the appended patch appears to work.

However, it does seem a bit silly, as we end up dropping
and reacquiring the sched lock quite a few times:

mtx_unlock_spin(&sched_lock);
if (td->td_state == TDS_RUNQ ||
    td->td_state == TDS_RUNNING) {
        signotify(td->td_proc); /* grabs & releases sched_lock */
#ifdef SMP
        if (td->td_state == TDS_RUNNING && td != curthread) {
                mtx_lock_spin(&sched_lock);
                forward_signal(td);
                mtx_unlock_spin(&sched_lock);
        }
#endif
}
goto out;



Wouldn't it be cleaner if there was a signotify_locked () that 
assumed you had the sched_lock held (and was called by signotify)?
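
One way to picture that suggestion, as a rough userland sketch rather than kernel code: every name below (`model_lock`, `model_signotify_locked()`, and so on) is hypothetical, and the lock is a single-threaded model that merely counts acquisitions for the notify/forward step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded lock model that counts acquisitions. */
struct model_lock {
	bool held;
	int acquisitions;
};

static void
model_lock_spin(struct model_lock *l)
{
	assert(!l->held);
	l->held = true;
	l->acquisitions++;
}

static void
model_unlock_spin(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Does the real work; the caller must already hold the lock. */
static void
model_signotify_locked(const struct model_lock *l, int *notified)
{
	assert(l->held);
	(*notified)++;
}

/* Convenience wrapper for callers that do not hold the lock. */
static void
model_signotify(struct model_lock *l, int *notified)
{
	model_lock_spin(l);
	model_signotify_locked(l, notified);
	model_unlock_spin(l);
}

/* Stand-in for forward_signal(); the caller must hold the lock. */
static void
model_forward_signal(const struct model_lock *l, int *forwarded)
{
	assert(l->held);
	(*forwarded)++;
}

/* Shape of the stopgap: two separate lock round trips. */
static void
notify_then_forward(struct model_lock *l, int *notified, int *forwarded)
{
	model_signotify(l, notified);
	model_lock_spin(l);
	model_forward_signal(l, forwarded);
	model_unlock_spin(l);
}

/* With a _locked variant, one acquisition covers both calls. */
static void
notify_and_forward_locked(struct model_lock *l, int *notified, int *forwarded)
{
	model_lock_spin(l);
	model_signotify_locked(l, notified);
	model_forward_signal(l, forwarded);
	model_unlock_spin(l);
}
```

In this model the stopgap shape takes the lock twice for the same work that the `_locked` variant does under a single acquisition.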

Drew




Index: kern_sig.c
===
RCS file: /home/ncvs/src/sys/kern/kern_sig.c,v
retrieving revision 1.171
diff -u -r1.171 kern_sig.c
--- kern_sig.c  29 Jun 2002 17:26:18 -  1.171
+++ kern_sig.c  3 Jul 2002 01:48:35 -
@@ -1543,8 +1543,11 @@
 		    td->td_state == TDS_RUNNING) {
 			signotify(td->td_proc);
 #ifdef SMP
-			if (td->td_state == TDS_RUNNING && td != curthread)
+			if (td->td_state == TDS_RUNNING && td != curthread) {
+				mtx_lock_spin(&sched_lock);
 				forward_signal(td);
+				mtx_unlock_spin(&sched_lock);
+			}
 #endif
 		}
 		goto out;




Re: KSE signal problems still

2002-07-02 Thread Julian Elischer


AHH I assumed it was alpha...

On Tue, 2 Jul 2002, Andrew Gallatin wrote:

 
 Matthew Dillon writes:
   :...
   : 
   :   
   :   This is nearly 100% for me.  But only on MP boxes.  On my uniprocessor
   :   alpha, things work just fine.  Oh.. hmm.. I'm not sure if I have
   :   witless compiled in there.. 
   :  
   :  which is almost 100%? the ^Z killing the process, or ^C killing the
   :  machine?
   :
   :^C killing the machine.
   :
   :Drew
   
   How are we doing on IA32?  I've successfully run 9 buildworld -j 5's
   so far with a SMP build of -current.   I'm going to run a bunch more
   and then I'll switch to testing signals (a buildworld only generates 4 or
   5 signals over the entire build so it isn't a good test for signal-related
   issues).
   
 
 The above refers to IA32.  My (UP, w/o witness) alpha seems solid.  No
 panics so far.  It's the SMP IA32 box that keeps falling on its face
 with a signal..
 
 Drew
 





Re: KSE signal problems still

2002-07-02 Thread Julian Elischer

We seem pretty solid on ia32.
^Z and then fg will sometimes kill the process instead of foregrounding it,
though.

(I aborted several buildworlds that way accidentally.)

Andrew's panic seems SMP-specific, though.
You may want to check whether ia32 and alpha differ
in whether sched_lock is held at this point:


panic: mutex sched lock not owned at ../../../kern/subr_smp.c:126
cpuid = 1; lapic.id = 0100
Debugger(panic)
Stopped at  Debugger+0x46:  xchgl   %ebx,in_Debugger.0
db> where
No such command
db> tr
Debugger(c02dbf5a) at Debugger+0x46
panic(c02db1a8,c02db318,c02df736,7e,c4445540) at panic+0xd6
_mtx_assert(c0315440,1,c02df736,7e) at _mtx_assert+0xa8
forward_signal(c4445540) at forward_signal+0x1a
tdsignal(c4445540,2,2) at tdsignal+0x182
psignal(c443d558,2) at psignal+0x3c8
pgsignal(c441ad00,2,1,c441ad1c,0) at pgsignal+0x63
ttyinput(3,c41e8e30,c41e8e00,0,c0347903) at ttyinput+0x316
ptcwrite(c4307a00,d7d5ec88,7f0011,1,d7d5ebc4) at ptcwrite+0x17f
spec_write(d7d5ebf0,d7d5ec3c,c0204cc8,d7d5ebf0,7f0011) at spec_write+0x5a
spec_vnoperate(d7d5ebf0) at spec_vnoperate+0x13
vn_write(c41ded5c,d7d5ec88,c440cd80,0,c409e780) at vn_write+0x1c8
dofilewrite(c409e780,c41ded5c,5,8088000,1) at dofilewrite+0xaf
write(c409e780,d7d5ed14,3,b,282) at write+0x39
syscall(2f,2f,2f,1,8073410) at syscall+0x23c
syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
--- syscall (4, FreeBSD ELF, write), eip = 0x281fb3a3, esp =
0xbfbff37c, ebp = 0xbfbff3e8 ---

I'm trying to test Jeff's latest patch but got sidetracked by hardware.



On Tue, 2 Jul 2002, Matthew Dillon wrote:

 :...
 : 
 :   
 :   This is nearly 100% for me.  But only on MP boxes.  On my uniprocessor
 :   alpha, things work just fine.  Oh.. hmm.. I'm not sure if I have
 :   witless compiled in there.. 
 :  
 :  which is almost 100%,? the ^Z killing the process, or ^C killing the
 :  machine?
 :
 :^C killing the machine.
 :
 :Drew
 
 How are we doing on IA32?  I've successfully run 9 buildworld -j 5's
 so far with a SMP build of -current.   I'm going to run a bunch more
 and then I'll switch to testing signals (a buildworld only generates 4 or
 5 signals over the entire build so it isn't a good test for signal-related
 issues).
 
   -Matt
   Matthew Dillon 
   [EMAIL PROTECTED]
 






Re: KSE signal problems still

2002-07-02 Thread Julian Elischer



On Tue, 2 Jul 2002, Andrew Gallatin wrote:

 
 Julian Elischer writes:
   try this:
   
   in tdsignal() (kern_sig.c),
   take sched_lock and release it again, just around the call to
   forward_signal()
   
   forward_signal(c4445540) at forward_signal+0x1a
   tdsignal(c4445540,2,2) at tdsignal+0x182
   psignal(c443d558,2) at psignal+0x3c8
   
   hopefully this will not be called with the schedlock already locked
   
 
 Following your suggestion, the appended patch appears to work.
 
 However, it does seem a bit silly, as we end up dropping
 and reacquiring the sched lock quite a few times:

That's why I just asked you to test the concept.
If I know that just acquiring it here is OK
(I presume you tried doing some work like this),
that tells me that this code isn't called from some odd place
with the sched lock already held.

(That and code inspection, of course.)

Now that we know it works, we can try to optimise it.

I'm going home now for dinner,
so if you feel like checking in this or something more optimal,
be my guest :-)




 
 mtx_unlock_spin(&sched_lock);
 if (td->td_state == TDS_RUNQ ||
     td->td_state == TDS_RUNNING) {
         signotify(td->td_proc); /* grabs & releases sched_lock */
 #ifdef SMP
         if (td->td_state == TDS_RUNNING && td != curthread) {
                 mtx_lock_spin(&sched_lock);
                 forward_signal(td);
                 mtx_unlock_spin(&sched_lock);
         }
 #endif
 }
 goto out;
 
 
 
 Wouldn't it be cleaner if there were a signotify_locked() that
 assumed you had the sched_lock held (and was called by signotify())?
 
 Drew
 
 
 
 
 Index: kern_sig.c
 ===
 RCS file: /home/ncvs/src/sys/kern/kern_sig.c,v
 retrieving revision 1.171
 diff -u -r1.171 kern_sig.c
 --- kern_sig.c29 Jun 2002 17:26:18 -  1.171
 +++ kern_sig.c3 Jul 2002 01:48:35 -
 @@ -1543,8 +1543,11 @@
              td->td_state == TDS_RUNNING) {
                  signotify(td->td_proc);
  #ifdef SMP
 -                if (td->td_state == TDS_RUNNING && td != curthread)
 +                if (td->td_state == TDS_RUNNING && td != curthread) {
 +                        mtx_lock_spin(&sched_lock);
                          forward_signal(td);
 +                        mtx_unlock_spin(&sched_lock);
 +                }
  #endif
          }
          goto out;
 





Re: KSE signal problems still

2002-07-02 Thread Julian Elischer

ignore this Matt.. it was on ia32.



On Tue, 2 Jul 2002, Julian Elischer wrote:

 We seem pretty solid on ia32.
 ^Z and then fg will sometimes kill the process instead of foregrounding it,
 though.
 
 (I aborted several buildworlds that way accidentally.)
 
 Andrew's panic seems SMP-specific, though.
 You may want to check whether ia32 and alpha differ
 in whether sched_lock is held at this point:
 
 
 panic: mutex sched lock not owned at ../../../kern/subr_smp.c:126
 cpuid = 1; lapic.id = 0100
 Debugger(panic)
 Stopped at  Debugger+0x46:  xchgl   %ebx,in_Debugger.0
 db> where
 No such command
 db> tr
 Debugger(c02dbf5a) at Debugger+0x46
 panic(c02db1a8,c02db318,c02df736,7e,c4445540) at panic+0xd6
 _mtx_assert(c0315440,1,c02df736,7e) at _mtx_assert+0xa8
 forward_signal(c4445540) at forward_signal+0x1a
 tdsignal(c4445540,2,2) at tdsignal+0x182
 psignal(c443d558,2) at psignal+0x3c8
 pgsignal(c441ad00,2,1,c441ad1c,0) at pgsignal+0x63
 ttyinput(3,c41e8e30,c41e8e00,0,c0347903) at ttyinput+0x316
 ptcwrite(c4307a00,d7d5ec88,7f0011,1,d7d5ebc4) at ptcwrite+0x17f
 spec_write(d7d5ebf0,d7d5ec3c,c0204cc8,d7d5ebf0,7f0011) at spec_write+0x5a
 spec_vnoperate(d7d5ebf0) at spec_vnoperate+0x13
 vn_write(c41ded5c,d7d5ec88,c440cd80,0,c409e780) at vn_write+0x1c8
 dofilewrite(c409e780,c41ded5c,5,8088000,1) at dofilewrite+0xaf
 write(c409e780,d7d5ed14,3,b,282) at write+0x39
 syscall(2f,2f,2f,1,8073410) at syscall+0x23c
 syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
 --- syscall (4, FreeBSD ELF, write), eip = 0x281fb3a3, esp =
 0xbfbff37c, ebp = 0xbfbff3e8 ---
 
 I'm trying to test Jeff's latest patch but got sidetracked by hardware.
 
 
 
 On Tue, 2 Jul 2002, Matthew Dillon wrote:
 
  :...
  : 
  :   
  :   This is nearly 100% for me.  But only on MP boxes.  On my uniprocessor
  :   alpha, things work just fine.  Oh.. hmm.. I'm not sure if I have
  :   witless compiled in there.. 
  :  
  :  which is almost 100%,? the ^Z killing the process, or ^C killing the
  :  machine?
  :
  :^C killing the machine.
  :
  :Drew
  
  How are we doing on IA32?  I've successfully run 9 buildworld -j 5's
  so far with a SMP build of -current.   I'm going to run a bunch more
  and then I'll switch to testing signals (a buildworld only generates 4 or
  5 signals over the entire build so it isn't a good test for signal-related
  issues).
  
  -Matt
  Matthew Dillon 
  [EMAIL PROTECTED]
  
 
 
 
 





Re: KSE signal problems still

2002-07-02 Thread Andrew Gallatin


Julian Elischer writes:
   
   However, it does seem a bit silly, as we end up dropping
   and reacquiring the sched lock quite a few times:
  
  That's why I just asked you to test the concept.
  If I know that just acquiring it here is OK
  (I presume you tried doing some work like this),
  that tells me that this code isn't called from some odd place
  with the sched lock already held.
  
  (That and code inspection, of course.)
  
  Now that we know it works, we can try to optimise it.
  
  I'm going home now for dinner,
  so if you feel like checking in this or something more optimal,
  be my guest :-)

OK, I've checked in the unoptimized fix.   Please do optimize it when
you get a chance.

Thanks,

Drew




Re: KSE signal problems still

2002-07-02 Thread John Baldwin


On 03-Jul-2002 Andrew Gallatin wrote:
 
 Julian Elischer writes:

    However, it does seem a bit silly, as we end up dropping
    and reacquiring the sched lock quite a few times:
   
   That's why I just asked you to test the concept.
   If I know that just acquiring it here is OK
   (I presume you tried doing some work like this),
   that tells me that this code isn't called from some odd place
   with the sched lock already held.
   
   (That and code inspection, of course.)
   
   Now that we know it works, we can try to optimise it.
   
   I'm going home now for dinner,
   so if you feel like checking in this or something more optimal,
   be my guest :-)
 
 OK, I've checked in the unoptimized fix.   Please do optimize it when
 you get a chance.

Erm, I thought I changed signotify() to require sched_lock and made the
second half of psignal() (the whole case statement) lock sched_lock.
Did you change that?  (To Julian)

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/




Re: KSE signal problems still

2002-07-02 Thread Julian Elischer



On Wed, 3 Jul 2002, John Baldwin wrote:

 
 Erm, I thought I changed signotify() to require sched_lock and made the
 second half of psignal() (the whole case statement) lock sched_lock.
 Did you change that?  (To Julian)

psignal() as a whole hasn't existed in the KSE tree since December.

I must have missed it in the complicated merge that came from that in P4.

I just checked it in like this for now to stop
the panics until I can work out what the equivalent
change to yours is.

(Feel free to check out the new psignal()/tdsignal()
combination.)






Re: KSE signal problems still

2002-07-02 Thread Matthew Dillon

I can get a panic when ^C'ing buildworld on an SMP build of -current:

-Matt

test3# j
test3# panic: mutex sched lock not owned at 
/FreeBSD/FreeBSD-current/src/sys/kern/subr_smp.c:126
cpuid = 1; lapic.id = 
Debugger(panic)
Stopped at  Debugger+0x46:  xchgl   %ebx,in_Debugger.0
db> trace
Debugger(c02ec4ba) at Debugger+0x46
panic(c02eb5e8,c02eb758,c02efe80,7e,c6a5a9c0) at panic+0xd6
_mtx_assert(c0325a20,1,c02efe80,7e) at _mtx_assert+0xa8
forward_signal(c6a5a9c0) at forward_signal+0x1a
tdsignal(c6a5a9c0,2,0) at tdsignal+0x182
psignal(c665f804,2) at psignal+0x3c8
pgsignal(c6bbe480,2,1,c6bbe49c,0) at pgsignal+0x63
ttyinput(3,c6413230,c6413200,0,e0e71b03) at ttyinput+0x316
ptcwrite(c6648600,e0e71c88,7f0011,1,e0e71bc4) at ptcwrite+0x17f
spec_write(e0e71bf0,e0e71c3c,c020f0a0,e0e71bf0,7f0011) at spec_write+0x5a
spec_vnoperate(e0e71bf0) at spec_vnoperate+0x13
vn_write(c645aec4,e0e71c88,c6641380,0,c622b3c0) at vn_write+0x1c8
dofilewrite(c622b3c0,c645aec4,7,807f000,1) at dofilewrite+0xaf
write(c622b3c0,e0e71d14,3,9,282) at write+0x39
syscall(2f,2f,2f,8074600,8074644) at syscall+0x23c
syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
--- syscall (4, FreeBSD ELF, write), eip = 0x281fc3a3, esp = 0xbfbff36c, ebp = 
0xbfbff3d8 ---





Re: KSE signal problems still

2002-07-02 Thread David Xu


Andrew Gallatin fixed the problem in kern_sig.c, check it out:

gallatin    2002/07/02 19:55:48 PDT

  Modified files:
    sys/kern             kern_sig.c
  Log:
  Hold the sched lock across call to forward_signal() in tdsignal() to
  keep SMP systems from panic'ing when ^C'ing an app

  suggested by julian

  Revision  Changes    Path
  1.172     +4 -1      src/sys/kern/kern_sig.c
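
As a rough user-space model of what the commit changes (a sketch only; toy_mtx and all toy_* names are hypothetical and not the kernel API): the forward step checks that the lock is owned, the pre-fix caller reaches it without the lock and trips the check, and the post-fix caller holds the lock across the call. The check reports failure with a return code instead of panicking so both paths can be exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy lock with an ownership flag (hypothetical; NOT the kernel mtx API). */
struct toy_mtx {
    const char *name;
    bool owned;
};

static struct toy_mtx toy_sched_lock = { "sched lock", false };

/*
 * Models an ownership assertion. Where the kernel would panic, this
 * prints a message shaped like the one in the trace and returns false.
 */
static bool toy_mtx_assert_owned(const struct toy_mtx *m,
                                 const char *file, int line)
{
    if (m->owned)
        return true;
    fprintf(stderr, "mutex %s not owned at %s:%d\n", m->name, file, line);
    return false;
}

/* Models forward_signal(): refuses to run unless the lock is owned. */
static int toy_forward_signal(int *forwards)
{
    if (!toy_mtx_assert_owned(&toy_sched_lock, __FILE__, __LINE__))
        return -1;      /* the real kernel would panic here */
    (*forwards)++;
    return 0;
}

/* Pre-fix caller: reaches the forward step without holding the lock. */
static int toy_tdsignal_before_fix(int *forwards)
{
    return toy_forward_signal(forwards);
}

/* Post-fix caller: holds the lock across the forward step. */
static int toy_tdsignal_after_fix(int *forwards)
{
    int rc;

    assert(!toy_sched_lock.owned);  /* toy lock is not recursive */
    toy_sched_lock.owned = true;    /* models mtx_lock_spin(&sched_lock) */
    rc = toy_forward_signal(forwards);
    toy_sched_lock.owned = false;   /* models mtx_unlock_spin(&sched_lock) */
    return rc;
}
```

Calling toy_tdsignal_before_fix() returns -1 and leaves the counter untouched; toy_tdsignal_after_fix() returns 0 and increments it once.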

----- Original Message -----
From: Matthew Dillon [EMAIL PROTECTED]
To: Julian Elischer [EMAIL PROTECTED]
Cc: Andrew Gallatin [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Wednesday, July 03, 2002 1:36 PM
Subject: Re: KSE signal problems still


 I can get a panic when ^C'ing buildworld on an SMP build of -current:
 
 -Matt
 
 test3# j
 test3# panic: mutex sched lock not owned at 
/FreeBSD/FreeBSD-current/src/sys/kern/subr_smp.c:126
 cpuid = 1; lapic.id = 
 Debugger(panic)
 Stopped at  Debugger+0x46:  xchgl   %ebx,in_Debugger.0
 db> trace
 Debugger(c02ec4ba) at Debugger+0x46
 panic(c02eb5e8,c02eb758,c02efe80,7e,c6a5a9c0) at panic+0xd6
 _mtx_assert(c0325a20,1,c02efe80,7e) at _mtx_assert+0xa8
 forward_signal(c6a5a9c0) at forward_signal+0x1a
 tdsignal(c6a5a9c0,2,0) at tdsignal+0x182
 psignal(c665f804,2) at psignal+0x3c8
 pgsignal(c6bbe480,2,1,c6bbe49c,0) at pgsignal+0x63
 ttyinput(3,c6413230,c6413200,0,e0e71b03) at ttyinput+0x316
 ptcwrite(c6648600,e0e71c88,7f0011,1,e0e71bc4) at ptcwrite+0x17f
 spec_write(e0e71bf0,e0e71c3c,c020f0a0,e0e71bf0,7f0011) at spec_write+0x5a
 spec_vnoperate(e0e71bf0) at spec_vnoperate+0x13
 vn_write(c645aec4,e0e71c88,c6641380,0,c622b3c0) at vn_write+0x1c8
 dofilewrite(c622b3c0,c645aec4,7,807f000,1) at dofilewrite+0xaf
 write(c622b3c0,e0e71d14,3,9,282) at write+0x39
 syscall(2f,2f,2f,8074600,8074644) at syscall+0x23c
 syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
 --- syscall (4, FreeBSD ELF, write), eip = 0x281fc3a3, esp = 0xbfbff36c, ebp = 
0xbfbff3d8 ---






Re: KSE signal problems still

2002-07-02 Thread Julian Elischer

I just fixed that.. get a new version of kern_sig.c


On Tue, 2 Jul 2002, Matthew Dillon wrote:

 I can get a panic when ^C'ing buildworld on an SMP build of -current:
 
   -Matt
 
 test3# j
 test3# panic: mutex sched lock not owned at 
/FreeBSD/FreeBSD-current/src/sys/kern/subr_smp.c:126
 cpuid = 1; lapic.id = 
 Debugger(panic)
 Stopped at  Debugger+0x46:  xchgl   %ebx,in_Debugger.0
 db> trace
 Debugger(c02ec4ba) at Debugger+0x46
 panic(c02eb5e8,c02eb758,c02efe80,7e,c6a5a9c0) at panic+0xd6
 _mtx_assert(c0325a20,1,c02efe80,7e) at _mtx_assert+0xa8
 forward_signal(c6a5a9c0) at forward_signal+0x1a
 tdsignal(c6a5a9c0,2,0) at tdsignal+0x182
 psignal(c665f804,2) at psignal+0x3c8
 pgsignal(c6bbe480,2,1,c6bbe49c,0) at pgsignal+0x63
 ttyinput(3,c6413230,c6413200,0,e0e71b03) at ttyinput+0x316
 ptcwrite(c6648600,e0e71c88,7f0011,1,e0e71bc4) at ptcwrite+0x17f
 spec_write(e0e71bf0,e0e71c3c,c020f0a0,e0e71bf0,7f0011) at spec_write+0x5a
 spec_vnoperate(e0e71bf0) at spec_vnoperate+0x13
 vn_write(c645aec4,e0e71c88,c6641380,0,c622b3c0) at vn_write+0x1c8
 dofilewrite(c622b3c0,c645aec4,7,807f000,1) at dofilewrite+0xaf
 write(c622b3c0,e0e71d14,3,9,282) at write+0x39
 syscall(2f,2f,2f,8074600,8074644) at syscall+0x23c
 syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
 --- syscall (4, FreeBSD ELF, write), eip = 0x281fc3a3, esp = 0xbfbff36c, ebp = 
0xbfbff3d8 ---
 
 





Re: KSE signal problems still

2002-07-02 Thread Matthew Dillon


:
:
:Andrew Gallatin fixed the problem in kern_sig.c, check it out:
:
:gallatin    2002/07/02 19:55:48 PDT
:

Will do tomorrow!

-Matt
