Re: LOR: filedesc structure - pipe mutex
On Fri, 6 Dec 2002, Kris Kennaway wrote: On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote: I'm getting this too: After discussing this with various people on IRC, it was determined that this is not the place where the reversal is occurring, but since witness doesn't have the lock order defined it has to guess, and in this instance it is guessing the wrong way around. After adding the lock order to subr_witness.c I now get this: Yeah, we're exchanging some out-of-band e-mail on this: the basic problem is that: filedescpipe pipesigio sigioproc procfiledesc We're talking about some possible solutions, including deferred signalling, etc, etc. Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: LOR: filedesc structure - pipe mutex
BTW, one upshot of this whole event is that we should probably be hard-coding the lock order of all important locks rather than allowing it to be automatically determined. We'd uncover problems of this sort much faster and much more easily, and it would provide better documentation of the intended lock order. Also, the authors of the following locking bits should document them in the SMPng architecture document: - File descriptor locking - Pipe locking - Select locking - Process locking I've gone ahead and documented the SIGIO locking, and I might get to the rest sometime also, but it would be a lot faster if the people who did the work documented it rather than having to rely on a lot of code reading to figure out the subtleties. Do we know if WITNESS supports the notion of a partial order, in which it is simply asserted that there is no valid locking order between two locks? Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories On Fri, 6 Dec 2002, Kris Kennaway wrote: On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote: I'm getting this too: After discussing this with various people on IRC, it was determined that this is not the place where the reversal is occurring, but since witness doesn't have the lock order defined it has to guess, and in this instance it is guessing the wrong way around. After adding the lock order to subr_witness.c I now get this: lock order reversal 1st 0xc41bc418 process lock (process lock) @ /local0/src-client/sys/kern/kern_descrip.c:2094 2nd 0xc42de934 filedesc structure (filedesc structure) @ /local0/src-client/sys/kern/kern_descrip.c:2101 Debugger(witness_lock) Stopped at Debugger+0x45: xchgl %ebx,in_Debugger.0 db trace Debugger(c037a085) at Debugger+0x45 witness_lock(c42de934,8,c039e3cc,835,c41bc418) at witness_lock+0x532 _mtx_lock_flags(c42de934,0,c039e3cc,835,0) at _mtx_lock_flags+0x7f sysctl_kern_file(c03d8a00,0,0,d7aaac08) at sysctl_kern_file+0x119 sysctl_root(0,d7aaacb4,2,d7aaac08,c0416240) at sysctl_root+0x116 userland_sysctl(c3f84a80,d7aaacb4,2,bfbfe588,bfbff3f8) at userland_sysctl+0xec __sysctl(c3f84a80,d7aaad14,6,1,297) at __sysctl+0x71 syscall(2f,2f,2f,2,bfbfe588) at syscall+0x211 Xint0x80_syscall() at Xint0x80_syscall+0x1d --- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x805a717, esp = 0xbfbfe53c, ebp = 0xbfbfe568 --- db Index: sys/kern/subr_witness.c === RCS file: /home/ncvs/src/sys/kern/subr_witness.c,v retrieving revision 1.130 diff -u -r1.130 subr_witness.c --- sys/kern/subr_witness.c 11 Nov 2002 16:36:20 - 1.130 +++ sys/kern/subr_witness.c 7 Dec 2002 04:18:29 - @@ -198,6 +198,8 @@ { Giant, lock_class_mtx_sleep }, { proctree, lock_class_sx }, { allproc, lock_class_sx }, + { filedesc structure, lock_class_mtx_sleep }, + { pipe mutex, lock_class_mtx_sleep }, { sigio lock, lock_class_mtx_sleep }, { process group, lock_class_mtx_sleep }, { process lock, lock_class_mtx_sleep }, To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: LOR: filedesc structure - pipe mutex
On Sat, Dec 07, 2002 at 08:58:26AM -0500, Robert Watson wrote: On Fri, 6 Dec 2002, Kris Kennaway wrote: On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote: I'm getting this too: After discussing this with various people on IRC, it was determined that this is not the place where the reversal is occurring, but since witness doesn't have the lock order defined it has to guess, and in this instance it is guessing the wrong way around. After adding the lock order to subr_witness.c I now get this: Yeah, we're exchanging some out-of-band e-mail on this: the basic problem is that: filedescpipe pipesigio sigioproc procfiledesc We're talking about some possible solutions, including deferred signalling, etc, etc. Here's another panic I got overnight: lock order reversal 1st 0xc4cbcd80 pipe mutex (pipe mutex) @ /local0/src-client/sys/kern/sys_pipe.c:465 2nd 0xc044e780 sigio lock (sigio lock) @ /local0/src-client/sys/kern/kern_sig.c:2225 Debugger(witness_lock) Stopped at Debugger+0x54: xchgl %ebx,in_Debugger.0 db trace Debugger(c03efc85,c044e780,c04141a3,c04141a3,c0416b5c) at Debugger+0x54 witness_lock(c044e780,8,c0416b5c,8b1,c4cbcd80) at witness_lock+0x667 _mtx_lock_flags(c044e780,0,c0416b5c,8b1,23) at _mtx_lock_flags+0xb1 pgsigio(c5cf3cf8,17,0,1ad,0) at pgsigio+0x30 pipe_read(c63b2c6c,d9539c7c,c4e91f00,0,c67608c0) at pipe_read+0x516 dofileread(c67608c0,c63b2c6c,4,bfbff533,1) at dofileread+0xd2 read(c67608c0,d9539d10,c043094c,407,3) at read+0x6b syscall(2f,2f,2f,0,bfbff534) at syscall+0x28e Xint0x80_syscall() at Xint0x80_syscall+0x1d --- syscall (3, FreeBSD ELF32, read), eip = 0x2812d1c8, esp = 0xbfbff500, ebp = 0xbfbff700 --- db Kris msg48337/pgp0.pgp Description: PGP signature
Re: LOR: filedesc structure - pipe mutex
On 07-Dec-2002 Robert Watson wrote: BTW, one upshot of this whole event is that we should probably be hard-coding the lock order of all important locks rather than allowing it to be automatically determined. We'd uncover problems of this sort much faster and much more easily, and it would provide better documentation of the intended lock order. Also, the authors of the following locking bits should document them in the SMPng architecture document: - File descriptor locking - Pipe locking - Select locking - Process locking I've gone ahead and documented the SIGIO locking, and I might get to the rest sometime also, but it would be a lot faster if the people who did the work documented it rather than having to rely on a lot of code reading to figure out the subtleties. Do we know if WITNESS supports the notion of a partial order, in which it is simply asserted that there is no valid locking order between two locks? You mean that it is not ok to obtain lock A while holdig lock B and it is not ok to obtain lock B while holding lock A? It doesn't really support that right now, no. Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories On Fri, 6 Dec 2002, Kris Kennaway wrote: On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote: I'm getting this too: After discussing this with various people on IRC, it was determined that this is not the place where the reversal is occurring, but since witness doesn't have the lock order defined it has to guess, and in this instance it is guessing the wrong way around. After adding the lock order to subr_witness.c I now get this: lock order reversal 1st 0xc41bc418 process lock (process lock) @ /local0/src-client/sys/kern/kern_descrip.c:2094 2nd 0xc42de934 filedesc structure (filedesc structure) @ /local0/src-client/sys/kern/kern_descrip.c:2101 Debugger(witness_lock) Stopped at Debugger+0x45: xchgl %ebx,in_Debugger.0 db trace Debugger(c037a085) at Debugger+0x45 witness_lock(c42de934,8,c039e3cc,835,c41bc418) at witness_lock+0x532 _mtx_lock_flags(c42de934,0,c039e3cc,835,0) at _mtx_lock_flags+0x7f sysctl_kern_file(c03d8a00,0,0,d7aaac08) at sysctl_kern_file+0x119 sysctl_root(0,d7aaacb4,2,d7aaac08,c0416240) at sysctl_root+0x116 userland_sysctl(c3f84a80,d7aaacb4,2,bfbfe588,bfbff3f8) at userland_sysctl+0xec __sysctl(c3f84a80,d7aaad14,6,1,297) at __sysctl+0x71 syscall(2f,2f,2f,2,bfbfe588) at syscall+0x211 Xint0x80_syscall() at Xint0x80_syscall+0x1d --- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x805a717, esp = 0xbfbfe53c, ebp = 0xbfbfe568 --- db Index: sys/kern/subr_witness.c === RCS file: /home/ncvs/src/sys/kern/subr_witness.c,v retrieving revision 1.130 diff -u -r1.130 subr_witness.c --- sys/kern/subr_witness.c 11 Nov 2002 16:36:20 - 1.130 +++ sys/kern/subr_witness.c 7 Dec 2002 04:18:29 - @@ -198,6 +198,8 @@ { Giant, lock_class_mtx_sleep }, { proctree, lock_class_sx }, { allproc, lock_class_sx }, +{ filedesc structure, lock_class_mtx_sleep }, +{ pipe mutex, lock_class_mtx_sleep }, { sigio lock, lock_class_mtx_sleep }, { process group, lock_class_mtx_sleep }, { process lock, lock_class_mtx_sleep }, -- John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/ Power Users Use the Power to Serve! - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: LOR: filedesc structure - pipe mutex
On Fri, Dec 06, 2002 at 02:36:49PM -0800, Lars Eggert wrote: On today's -current: lock order reversal 1st 0xc6948234 filedesc structure (filedesc structure) @ /usr/src/sys/kern/sys_generic.c:901 2nd 0xc691f380 pipe mutex (pipe mutex) @ /usr/src/sys/kern/sys_pipe.c:1239 Debugger(witness_lock) Stopped at Debugger+0x5a: xchgl %ebx,in_Debugger.0 db trace Debugger(c0320e4d,c691f380,c0341c32,c0341c32,c0342671) at Debugger+0x5a witness_lock(c691f380,8,c0342671,4d7,c01f3b4a) at witness_lock+0x142 _mtx_lock_flags(c691f380,0,c0342671,4d7,c65c76cc) at _mtx_lock_flags+0x63 pipe_poll(c65c76cc,40,c21af180,c65d3460,c21af180) at pipe_poll+0x3d selscan(c65d3460,eb36ab90,eb36ab80,14,4) at selscan+0xdf kern_select(c65d3460,14,804c940,0,0) at kern_select+0x41f select(c65d3460,eb36ad10,c035adb4,407,c65d27c8) at select+0x42 syscall(2f,2f,2f,804c9c0,804c8a0) at syscall+0x3c6 Xint0x80_syscall() at Xint0x80_syscall+0x1d --- syscall (93, FreeBSD ELF32, select), eip = 0x280b3853, esp = 0xbfbff90c, ebp = 0xbfbffd60 --- I'm getting this too: Local package initialization:lock order reversal 1st 0xc449ad34 filedesc structure (filedesc structure) @ /local0/src-client/sys/kern/sys_generic.c:901 2nd 0xc4146780 pipe mutex (pipe mutex) @ /local0/src-client/sys/kern/sys_pipe.c:1239 Debugger(witness_lock) Stopped at Debugger+0x54: xchgl %ebx,in_Debugger.0 db trace Debugger(c03efc85,c4146780,c04192dc,c04192dc,c0419d33) at Debugger+0x54 witness_lock(c4146780,8,c0419d33,4d7,c0291b89) at witness_lock+0x667 _mtx_lock_flags(c4146780,0,c0419d33,4d7,c418b4ec) at _mtx_lock_flags+0xb1 pipe_poll(c418b4ec,40,c420ad00,c3f5cb60,c420ad00) at pipe_poll+0x40 selscan(c3f5cb60,d7a29b98,d7a29b88,7,4) at selscan+0x12e kern_select(c3f5cb60,7,8076190,0,0) at kern_select+0x36f select(c3f5cb60,d7a29d10,c043094c,407,5) at select+0x66 syscall(2f,2f,2f,8076190,1e) at syscall+0x28e Xint0x80_syscall() at Xint0x80_syscall+0x1d --- syscall (93, FreeBSD ELF32, select), eip = 0x2829a433, esp = 0xbfbff4dc, ebp = 0xbfbffda0 --- db Kris To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: LOR: filedesc structure - pipe mutex
On 12/6/2002 6:22 PM, Kris Kennaway wrote: On Fri, Dec 06, 2002 at 02:36:49PM -0800, Lars Eggert wrote: On today's -current: lock order reversal 1st 0xc6948234 filedesc structure (filedesc structure) @ /usr/src/sys/kern/sys_generic.c:901 2nd 0xc691f380 pipe mutex (pipe mutex) @ /usr/src/sys/kern/sys_pipe.c:1239 Debugger(witness_lock) Stopped at Debugger+0x5a: xchgl %ebx,in_Debugger.0 db trace Debugger(c0320e4d,c691f380,c0341c32,c0341c32,c0342671) at Debugger+0x5a witness_lock(c691f380,8,c0342671,4d7,c01f3b4a) at witness_lock+0x142 _mtx_lock_flags(c691f380,0,c0342671,4d7,c65c76cc) at _mtx_lock_flags+0x63 pipe_poll(c65c76cc,40,c21af180,c65d3460,c21af180) at pipe_poll+0x3d selscan(c65d3460,eb36ab90,eb36ab80,14,4) at selscan+0xdf kern_select(c65d3460,14,804c940,0,0) at kern_select+0x41f select(c65d3460,eb36ad10,c035adb4,407,c65d27c8) at select+0x42 syscall(2f,2f,2f,804c9c0,804c8a0) at syscall+0x3c6 Xint0x80_syscall() at Xint0x80_syscall+0x1d --- syscall (93, FreeBSD ELF32, select), eip = 0x280b3853, esp = 0xbfbff90c, ebp = 0xbfbffd60 --- I'm getting this too: Have you gotten this before today? I haven't, and I'm wondering if I see it because I'm running with jhb@'s witness patch for the other LOR. Does your kernel have that patch, too? (Excuse me if that doesn't make any sense, just wondering about the timing...) Lars -- Lars Eggert [EMAIL PROTECTED] USC Information Sciences Institute smime.p7s Description: S/MIME Cryptographic Signature
Re: LOR: filedesc structure - pipe mutex
On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote: I'm getting this too: Have you gotten this before today? I haven't, and I'm wondering if I see it because I'm running with jhb@'s witness patch for the other LOR. Does your kernel have that patch, too? (Excuse me if that doesn't make any sense, just wondering about the timing...) I'm seeing it after I updated to a recent current. I was running jhb's patch in an older kernel (~2 weeks old) without getting this reversal. Kris msg48302/pgp0.pgp Description: PGP signature
Re: LOR: filedesc structure - pipe mutex
On Fri, 6 Dec 2002, Kris Kennaway wrote: I'm getting this too: Local package initialization:lock order reversal 1st 0xc449ad34 filedesc structure (filedesc structure) @ /local0/src-client/sys/kern/sys_generic.c:901 2nd 0xc4146780 pipe mutex (pipe mutex) @ /local0/src-client/sys/kern/sys_pipe.c:1239 Debugger(witness_lock) Stopped at Debugger+0x54: xchgl %ebx,in_Debugger.0 db trace Debugger(c03efc85,c4146780,c04192dc,c04192dc,c0419d33) at Debugger+0x54 witness_lock(c4146780,8,c0419d33,4d7,c0291b89) at witness_lock+0x667 _mtx_lock_flags(c4146780,0,c0419d33,4d7,c418b4ec) at _mtx_lock_flags+0xb1 pipe_poll(c418b4ec,40,c420ad00,c3f5cb60,c420ad00) at pipe_poll+0x40 selscan(c3f5cb60,d7a29b98,d7a29b88,7,4) at selscan+0x12e kern_select(c3f5cb60,7,8076190,0,0) at kern_select+0x36f select(c3f5cb60,d7a29d10,c043094c,407,5) at select+0x66 syscall(2f,2f,2f,8076190,1e) at syscall+0x28e Xint0x80_syscall() at Xint0x80_syscall+0x1d --- syscall (93, FreeBSD ELF32, select), eip = 0x2829a433, esp = 0xbfbff4dc, ebp = 0xbfbffda0 --- db Given that selscan() seems to rely on holding the file descriptor lock for the duration of the scan, it seems that the file descriptor lock is intended to be grabbed before the pipe mutex in the lock order. The reversal you're seeing is probably therefore the right order rather than the wrong order. To properly diagnose this, you probably want to hard code that order in Witness's lock order list in subr_witness so that a warning is generated earlier. This reversal is presumably the second order found by the kernel, which is the point at which (if the first order isn't hard-coded) that a violation is first found. Using show witness you can inspect how the lock order was constructed -- I've always found the output a bit confusing, so if it's a simple order reversal (involving direct lock relationships between two locks), hard-coding it is easier. Robert N M Watson FreeBSD Core Team, TrustedBSD Projects [EMAIL PROTECTED] Network Associates Laboratories To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: LOR: filedesc structure - pipe mutex
On Fri, Dec 06, 2002 at 07:18:03PM -0800, Lars Eggert wrote: I'm getting this too: After discussing this with various people on IRC, it was determined that this is not the place where the reversal is occurring, but since witness doesn't have the lock order defined it has to guess, and in this instance it is guessing the wrong way around. After adding the lock order to subr_witness.c I now get this: lock order reversal 1st 0xc41bc418 process lock (process lock) @ /local0/src-client/sys/kern/kern_descrip.c:2094 2nd 0xc42de934 filedesc structure (filedesc structure) @ /local0/src-client/sys/kern/kern_descrip.c:2101 Debugger(witness_lock) Stopped at Debugger+0x45: xchgl %ebx,in_Debugger.0 db trace Debugger(c037a085) at Debugger+0x45 witness_lock(c42de934,8,c039e3cc,835,c41bc418) at witness_lock+0x532 _mtx_lock_flags(c42de934,0,c039e3cc,835,0) at _mtx_lock_flags+0x7f sysctl_kern_file(c03d8a00,0,0,d7aaac08) at sysctl_kern_file+0x119 sysctl_root(0,d7aaacb4,2,d7aaac08,c0416240) at sysctl_root+0x116 userland_sysctl(c3f84a80,d7aaacb4,2,bfbfe588,bfbff3f8) at userland_sysctl+0xec __sysctl(c3f84a80,d7aaad14,6,1,297) at __sysctl+0x71 syscall(2f,2f,2f,2,bfbfe588) at syscall+0x211 Xint0x80_syscall() at Xint0x80_syscall+0x1d --- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x805a717, esp = 0xbfbfe53c, ebp = 0xbfbfe568 --- db Index: sys/kern/subr_witness.c === RCS file: /home/ncvs/src/sys/kern/subr_witness.c,v retrieving revision 1.130 diff -u -r1.130 subr_witness.c --- sys/kern/subr_witness.c 11 Nov 2002 16:36:20 - 1.130 +++ sys/kern/subr_witness.c 7 Dec 2002 04:18:29 - @@ -198,6 +198,8 @@ { Giant, lock_class_mtx_sleep }, { proctree, lock_class_sx }, { allproc, lock_class_sx }, + { filedesc structure, lock_class_mtx_sleep }, + { pipe mutex, lock_class_mtx_sleep }, { sigio lock, lock_class_mtx_sleep }, { process group, lock_class_mtx_sleep }, { process lock, lock_class_mtx_sleep }, msg48306/pgp0.pgp Description: PGP signature