Re: LOR: filedesc structure - pipe mutex

2002-12-07 Thread Robert Watson
On Fri, 6 Dec 2002, Kris Kennaway wrote:

 On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote:
 
  I'm getting this too:
 
 After discussing this with various people on IRC, it was determined that
 this is not the place where the reversal is occurring, but since witness
 doesn't have the lock order defined it has to guess, and in this
 instance it is guessing the wrong way around.  After adding the lock
 order to subr_witness.c I now get this: 

Yeah, we're exchanging some out-of-band e-mail on this: the basic problem
is that:

filedescpipe
pipesigio
sigioproc
procfiledesc

We're talking about some possible solutions, including deferred
signalling, etc, etc. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: LOR: filedesc structure - pipe mutex

2002-12-07 Thread Robert Watson
BTW, one upshot of this whole event is that we should probably be
hard-coding the lock order of all important locks rather than allowing it
to be automatically determined.  We'd uncover problems of this sort much
faster and much more easily, and it would provide better documentation of
the intended lock order.  Also, the authors of the following locking bits
should document them in the SMPng architecture document:

- File descriptor locking
- Pipe locking
- Select locking
- Process locking

I've gone ahead and documented the SIGIO locking, and I might get to the
rest sometime also, but it would be a lot faster if the people who did the
work documented it rather than having to rely on a lot of code reading to
figure out the subtleties.

Do we know if WITNESS supports the notion of a partial order, in which it
is simply asserted that there is no valid locking order between two locks? 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Fri, 6 Dec 2002, Kris Kennaway wrote:

 On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote:
 
  I'm getting this too:
 
 After discussing this with various people on IRC, it was determined
 that this is not the place where the reversal is occurring, but since
 witness doesn't have the lock order defined it has to guess, and in
 this instance it is guessing the wrong way around.  After adding the
 lock order to subr_witness.c I now get this:
 
 lock order reversal
  1st 0xc41bc418 process lock (process lock) @ 
/local0/src-client/sys/kern/kern_descrip.c:2094
  2nd 0xc42de934 filedesc structure (filedesc structure) @ 
/local0/src-client/sys/kern/kern_descrip.c:2101
 Debugger(witness_lock)
 Stopped at  Debugger+0x45:  xchgl   %ebx,in_Debugger.0
 db trace
 Debugger(c037a085) at Debugger+0x45
 witness_lock(c42de934,8,c039e3cc,835,c41bc418) at witness_lock+0x532
 _mtx_lock_flags(c42de934,0,c039e3cc,835,0) at _mtx_lock_flags+0x7f
 sysctl_kern_file(c03d8a00,0,0,d7aaac08) at sysctl_kern_file+0x119
 sysctl_root(0,d7aaacb4,2,d7aaac08,c0416240) at sysctl_root+0x116
 userland_sysctl(c3f84a80,d7aaacb4,2,bfbfe588,bfbff3f8) at userland_sysctl+0xec
 __sysctl(c3f84a80,d7aaad14,6,1,297) at __sysctl+0x71
 syscall(2f,2f,2f,2,bfbfe588) at syscall+0x211
 Xint0x80_syscall() at Xint0x80_syscall+0x1d
 --- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x805a717, esp = 0xbfbfe53c, ebp = 
0xbfbfe568 ---
 db
 
 Index: sys/kern/subr_witness.c
 ===
 RCS file: /home/ncvs/src/sys/kern/subr_witness.c,v
 retrieving revision 1.130
 diff -u -r1.130 subr_witness.c
 --- sys/kern/subr_witness.c 11 Nov 2002 16:36:20 -  1.130
 +++ sys/kern/subr_witness.c 7 Dec 2002 04:18:29 -
 @@ -198,6 +198,8 @@
   { Giant, lock_class_mtx_sleep },
   { proctree, lock_class_sx },
   { allproc, lock_class_sx },
 + { filedesc structure, lock_class_mtx_sleep },
 + { pipe mutex, lock_class_mtx_sleep },
   { sigio lock, lock_class_mtx_sleep },
   { process group, lock_class_mtx_sleep },
   { process lock, lock_class_mtx_sleep },
 


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: LOR: filedesc structure - pipe mutex

2002-12-07 Thread Kris Kennaway
On Sat, Dec 07, 2002 at 08:58:26AM -0500, Robert Watson wrote:
 On Fri, 6 Dec 2002, Kris Kennaway wrote:
 
  On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote:
  
   I'm getting this too:
  
  After discussing this with various people on IRC, it was determined that
  this is not the place where the reversal is occurring, but since witness
  doesn't have the lock order defined it has to guess, and in this
  instance it is guessing the wrong way around.  After adding the lock
  order to subr_witness.c I now get this: 
 
 Yeah, we're exchanging some out-of-band e-mail on this: the basic problem
 is that:
 
 filedescpipe
 pipesigio
 sigioproc
 procfiledesc
 
 We're talking about some possible solutions, including deferred
 signalling, etc, etc. 

Here's another panic I got overnight:

 lock order reversal
 1st 0xc4cbcd80 pipe mutex (pipe mutex) @ /local0/src-client/sys/kern/sys_pipe.c:465
 2nd 0xc044e780 sigio lock (sigio lock) @ /local0/src-client/sys/kern/kern_sig.c:2225
Debugger(witness_lock)
Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
db trace
Debugger(c03efc85,c044e780,c04141a3,c04141a3,c0416b5c) at Debugger+0x54
witness_lock(c044e780,8,c0416b5c,8b1,c4cbcd80) at witness_lock+0x667
_mtx_lock_flags(c044e780,0,c0416b5c,8b1,23) at _mtx_lock_flags+0xb1
pgsigio(c5cf3cf8,17,0,1ad,0) at pgsigio+0x30
pipe_read(c63b2c6c,d9539c7c,c4e91f00,0,c67608c0) at pipe_read+0x516
dofileread(c67608c0,c63b2c6c,4,bfbff533,1) at dofileread+0xd2
read(c67608c0,d9539d10,c043094c,407,3) at read+0x6b
syscall(2f,2f,2f,0,bfbff534) at syscall+0x28e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (3, FreeBSD ELF32, read), eip = 0x2812d1c8, esp = 0xbfbff500, ebp = 
0xbfbff700 ---
db

Kris


msg48337/pgp0.pgp
Description: PGP signature


Re: LOR: filedesc structure - pipe mutex

2002-12-07 Thread John Baldwin

On 07-Dec-2002 Robert Watson wrote:
 BTW, one upshot of this whole event is that we should probably be
 hard-coding the lock order of all important locks rather than allowing it
 to be automatically determined.  We'd uncover problems of this sort much
 faster and much more easily, and it would provide better documentation of
 the intended lock order.  Also, the authors of the following locking bits
 should document them in the SMPng architecture document:
 
 - File descriptor locking
 - Pipe locking
 - Select locking
 - Process locking
 
 I've gone ahead and documented the SIGIO locking, and I might get to the
 rest sometime also, but it would be a lot faster if the people who did the
 work documented it rather than having to rely on a lot of code reading to
 figure out the subtleties.
 
 Do we know if WITNESS supports the notion of a partial order, in which it
 is simply asserted that there is no valid locking order between two locks?

You mean that it is not ok to obtain lock A while holdig lock B and it is not
ok to obtain lock B while holding lock A?   It doesn't really support that
right now, no.

 Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
 [EMAIL PROTECTED]  Network Associates Laboratories
 
 On Fri, 6 Dec 2002, Kris Kennaway wrote:
 
 On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote:
 
  I'm getting this too:
 
 After discussing this with various people on IRC, it was determined
 that this is not the place where the reversal is occurring, but since
 witness doesn't have the lock order defined it has to guess, and in
 this instance it is guessing the wrong way around.  After adding the
 lock order to subr_witness.c I now get this:
 
 lock order reversal
  1st 0xc41bc418 process lock (process lock) @ 
/local0/src-client/sys/kern/kern_descrip.c:2094
  2nd 0xc42de934 filedesc structure (filedesc structure) @
  /local0/src-client/sys/kern/kern_descrip.c:2101
 Debugger(witness_lock)
 Stopped at  Debugger+0x45:  xchgl   %ebx,in_Debugger.0
 db trace
 Debugger(c037a085) at Debugger+0x45
 witness_lock(c42de934,8,c039e3cc,835,c41bc418) at witness_lock+0x532
 _mtx_lock_flags(c42de934,0,c039e3cc,835,0) at _mtx_lock_flags+0x7f
 sysctl_kern_file(c03d8a00,0,0,d7aaac08) at sysctl_kern_file+0x119
 sysctl_root(0,d7aaacb4,2,d7aaac08,c0416240) at sysctl_root+0x116
 userland_sysctl(c3f84a80,d7aaacb4,2,bfbfe588,bfbff3f8) at userland_sysctl+0xec
 __sysctl(c3f84a80,d7aaad14,6,1,297) at __sysctl+0x71
 syscall(2f,2f,2f,2,bfbfe588) at syscall+0x211
 Xint0x80_syscall() at Xint0x80_syscall+0x1d
 --- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x805a717, esp = 0xbfbfe53c, ebp 
= 0xbfbfe568
 ---
 db
 
 Index: sys/kern/subr_witness.c
 ===
 RCS file: /home/ncvs/src/sys/kern/subr_witness.c,v
 retrieving revision 1.130
 diff -u -r1.130 subr_witness.c
 --- sys/kern/subr_witness.c 11 Nov 2002 16:36:20 -  1.130
 +++ sys/kern/subr_witness.c 7 Dec 2002 04:18:29 -
 @@ -198,6 +198,8 @@
  { Giant, lock_class_mtx_sleep },
  { proctree, lock_class_sx },
  { allproc, lock_class_sx },
 +{ filedesc structure, lock_class_mtx_sleep },
 +{ pipe mutex, lock_class_mtx_sleep },
  { sigio lock, lock_class_mtx_sleep },
  { process group, lock_class_mtx_sleep },
  { process lock, lock_class_mtx_sleep },
 
 

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: LOR: filedesc structure - pipe mutex

2002-12-06 Thread Kris Kennaway
On Fri, Dec 06, 2002 at 02:36:49PM -0800, Lars Eggert wrote:
 On today's -current:
 
 lock order reversal
  1st 0xc6948234 filedesc structure (filedesc structure) @ 
 /usr/src/sys/kern/sys_generic.c:901
  2nd 0xc691f380 pipe mutex (pipe mutex) @ /usr/src/sys/kern/sys_pipe.c:1239
 
 Debugger(witness_lock)
 Stopped at  Debugger+0x5a:  xchgl   %ebx,in_Debugger.0
 
 db trace
 
 Debugger(c0320e4d,c691f380,c0341c32,c0341c32,c0342671) at Debugger+0x5a
 witness_lock(c691f380,8,c0342671,4d7,c01f3b4a) at witness_lock+0x142
 _mtx_lock_flags(c691f380,0,c0342671,4d7,c65c76cc) at _mtx_lock_flags+0x63
 pipe_poll(c65c76cc,40,c21af180,c65d3460,c21af180) at pipe_poll+0x3d
 selscan(c65d3460,eb36ab90,eb36ab80,14,4) at selscan+0xdf
 kern_select(c65d3460,14,804c940,0,0) at kern_select+0x41f
 select(c65d3460,eb36ad10,c035adb4,407,c65d27c8) at select+0x42
 syscall(2f,2f,2f,804c9c0,804c8a0) at syscall+0x3c6
 Xint0x80_syscall() at Xint0x80_syscall+0x1d
 --- syscall (93, FreeBSD ELF32, select), eip = 0x280b3853, esp = 
 0xbfbff90c, ebp = 0xbfbffd60 ---

I'm getting this too:

Local package initialization:lock order reversal
 1st 0xc449ad34 filedesc structure (filedesc structure) @ 
/local0/src-client/sys/kern/sys_generic.c:901
 2nd 0xc4146780 pipe mutex (pipe mutex) @ /local0/src-client/sys/kern/sys_pipe.c:1239
Debugger(witness_lock)
Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
db trace
Debugger(c03efc85,c4146780,c04192dc,c04192dc,c0419d33) at Debugger+0x54
witness_lock(c4146780,8,c0419d33,4d7,c0291b89) at witness_lock+0x667
_mtx_lock_flags(c4146780,0,c0419d33,4d7,c418b4ec) at _mtx_lock_flags+0xb1
pipe_poll(c418b4ec,40,c420ad00,c3f5cb60,c420ad00) at pipe_poll+0x40
selscan(c3f5cb60,d7a29b98,d7a29b88,7,4) at selscan+0x12e
kern_select(c3f5cb60,7,8076190,0,0) at kern_select+0x36f
select(c3f5cb60,d7a29d10,c043094c,407,5) at select+0x66
syscall(2f,2f,2f,8076190,1e) at syscall+0x28e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (93, FreeBSD ELF32, select), eip = 0x2829a433, esp = 0xbfbff4dc, ebp = 
0xbfbffda0 ---
db

Kris

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: LOR: filedesc structure - pipe mutex

2002-12-06 Thread Lars Eggert
On 12/6/2002 6:22 PM, Kris Kennaway wrote:

On Fri, Dec 06, 2002 at 02:36:49PM -0800, Lars Eggert wrote:


On today's -current:

lock order reversal
1st 0xc6948234 filedesc structure (filedesc structure) @ 
/usr/src/sys/kern/sys_generic.c:901
2nd 0xc691f380 pipe mutex (pipe mutex) @ /usr/src/sys/kern/sys_pipe.c:1239

Debugger(witness_lock)
Stopped at  Debugger+0x5a:  xchgl   %ebx,in_Debugger.0

db trace

Debugger(c0320e4d,c691f380,c0341c32,c0341c32,c0342671) at Debugger+0x5a
witness_lock(c691f380,8,c0342671,4d7,c01f3b4a) at witness_lock+0x142
_mtx_lock_flags(c691f380,0,c0342671,4d7,c65c76cc) at _mtx_lock_flags+0x63
pipe_poll(c65c76cc,40,c21af180,c65d3460,c21af180) at pipe_poll+0x3d
selscan(c65d3460,eb36ab90,eb36ab80,14,4) at selscan+0xdf
kern_select(c65d3460,14,804c940,0,0) at kern_select+0x41f
select(c65d3460,eb36ad10,c035adb4,407,c65d27c8) at select+0x42
syscall(2f,2f,2f,804c9c0,804c8a0) at syscall+0x3c6
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (93, FreeBSD ELF32, select), eip = 0x280b3853, esp = 
0xbfbff90c, ebp = 0xbfbffd60 ---


I'm getting this too:



Have you gotten this before today? I haven't, and I'm wondering if I see 
it because I'm running with jhb@'s witness patch for the other LOR. Does 
your kernel have that patch, too?

(Excuse me if that doesn't make any sense, just wondering about the 
timing...)

Lars
--
Lars Eggert [EMAIL PROTECTED]   USC Information Sciences Institute


smime.p7s
Description: S/MIME Cryptographic Signature


Re: LOR: filedesc structure - pipe mutex

2002-12-06 Thread Kris Kennaway
On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote:

 I'm getting this too:
 
 
 Have you gotten this before today? I haven't, and I'm wondering if I see 
 it because I'm running with jhb@'s witness patch for the other LOR. Does 
 your kernel have that patch, too?
 
 (Excuse me if that doesn't make any sense, just wondering about the 
 timing...)

I'm seeing it after I updated to a recent current.  I was running
jhb's patch in an older kernel (~2 weeks old) without getting this
reversal.

Kris



msg48302/pgp0.pgp
Description: PGP signature


Re: LOR: filedesc structure - pipe mutex

2002-12-06 Thread Robert Watson

On Fri, 6 Dec 2002, Kris Kennaway wrote:

 I'm getting this too:
 
 Local package initialization:lock order reversal
  1st 0xc449ad34 filedesc structure (filedesc structure) @ 
/local0/src-client/sys/kern/sys_generic.c:901
  2nd 0xc4146780 pipe mutex (pipe mutex) @ /local0/src-client/sys/kern/sys_pipe.c:1239
 Debugger(witness_lock)
 Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
 db trace
 Debugger(c03efc85,c4146780,c04192dc,c04192dc,c0419d33) at Debugger+0x54
 witness_lock(c4146780,8,c0419d33,4d7,c0291b89) at witness_lock+0x667
 _mtx_lock_flags(c4146780,0,c0419d33,4d7,c418b4ec) at _mtx_lock_flags+0xb1
 pipe_poll(c418b4ec,40,c420ad00,c3f5cb60,c420ad00) at pipe_poll+0x40
 selscan(c3f5cb60,d7a29b98,d7a29b88,7,4) at selscan+0x12e
 kern_select(c3f5cb60,7,8076190,0,0) at kern_select+0x36f
 select(c3f5cb60,d7a29d10,c043094c,407,5) at select+0x66
 syscall(2f,2f,2f,8076190,1e) at syscall+0x28e
 Xint0x80_syscall() at Xint0x80_syscall+0x1d
 --- syscall (93, FreeBSD ELF32, select), eip = 0x2829a433, esp = 0xbfbff4dc, ebp = 
0xbfbffda0 ---
 db

Given that selscan() seems to rely on holding the file descriptor lock for
the duration of the scan, it seems that the file descriptor lock is
intended to be grabbed before the pipe mutex in the lock order.  The
reversal you're seeing is probably therefore the right order rather than
the wrong order.  To properly diagnose this, you probably want to hard
code that order in Witness's lock order list in subr_witness so that a
warning is generated earlier.  This reversal is presumably the second
order found by the kernel, which is the point at which (if the first order
isn't hard-coded) that a violation is first found.  Using show witness 
you can inspect how the lock order was constructed -- I've always found
the output a bit confusing, so if it's a simple order reversal (involving
direct lock relationships between two locks), hard-coding it is easier.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories



To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: LOR: filedesc structure - pipe mutex

2002-12-06 Thread Kris Kennaway
On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote:

 I'm getting this too:

After discussing this with various people on IRC, it was determined
that this is not the place where the reversal is occurring, but since
witness doesn't have the lock order defined it has to guess, and in
this instance it is guessing the wrong way around.  After adding the
lock order to subr_witness.c I now get this:

lock order reversal
 1st 0xc41bc418 process lock (process lock) @ 
/local0/src-client/sys/kern/kern_descrip.c:2094
 2nd 0xc42de934 filedesc structure (filedesc structure) @ 
/local0/src-client/sys/kern/kern_descrip.c:2101
Debugger(witness_lock)
Stopped at  Debugger+0x45:  xchgl   %ebx,in_Debugger.0
db trace
Debugger(c037a085) at Debugger+0x45
witness_lock(c42de934,8,c039e3cc,835,c41bc418) at witness_lock+0x532
_mtx_lock_flags(c42de934,0,c039e3cc,835,0) at _mtx_lock_flags+0x7f
sysctl_kern_file(c03d8a00,0,0,d7aaac08) at sysctl_kern_file+0x119
sysctl_root(0,d7aaacb4,2,d7aaac08,c0416240) at sysctl_root+0x116
userland_sysctl(c3f84a80,d7aaacb4,2,bfbfe588,bfbff3f8) at userland_sysctl+0xec
__sysctl(c3f84a80,d7aaad14,6,1,297) at __sysctl+0x71
syscall(2f,2f,2f,2,bfbfe588) at syscall+0x211
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x805a717, esp = 0xbfbfe53c, ebp = 
0xbfbfe568 ---
db

Index: sys/kern/subr_witness.c
===
RCS file: /home/ncvs/src/sys/kern/subr_witness.c,v
retrieving revision 1.130
diff -u -r1.130 subr_witness.c
--- sys/kern/subr_witness.c 11 Nov 2002 16:36:20 -  1.130
+++ sys/kern/subr_witness.c 7 Dec 2002 04:18:29 -
@@ -198,6 +198,8 @@
{ Giant, lock_class_mtx_sleep },
{ proctree, lock_class_sx },
{ allproc, lock_class_sx },
+   { filedesc structure, lock_class_mtx_sleep },
+   { pipe mutex, lock_class_mtx_sleep },
{ sigio lock, lock_class_mtx_sleep },
{ process group, lock_class_mtx_sleep },
{ process lock, lock_class_mtx_sleep },



msg48306/pgp0.pgp
Description: PGP signature