Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
> Patch below should fix this (untested). Just tested 2.6.22-rc6: message is gone when patch is applied. But deleting some directories in /var/tmp (which lives on xfs) I got: BUG: MAX_LOCK_DEPTH too low! turning off the locking correctness validator. Thomas -- keep mailinglists in english, feel free to send PM in german - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [xfs-masters] Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
Patch looks good, Dave. (though, I stuffed up reviewing that bit of code previously:-) Oh, previous typo: s/inodes at the some time/inodes at the same time/ --Tim David Chinner wrote: On Tue, Jun 26, 2007 at 11:35:20AM +0200, Jarek Poplawski wrote: On 26-06-2007 04:16, David Chinner wrote: It does both - parent-first/child-second and ascending inode # order, which is where the problem is. standing alone, these seem fine, but they don't appear to work when the child has a lower inode number than the parent. ... >From xfs_inode.h: /* * Flags for lockdep annotations. * * XFS_I[O]LOCK_PARENT - for operations that require locking two inodes * (ie directory operations that require locking a directory inode and * an entry inode). The first inode gets locked with this flag so it * gets a lockdep subclass of 1 and the second lock will have a lockdep * subclass of 0. * * XFS_I[O]LOCK_INUMORDER - for locking several inodes at the some time * with xfs_lock_inodes(). This flag is used as the starting subclass * and each subsequent lock acquired will increment the subclass by one. * So the first lock acquired will have a lockdep subclass of 2, the * second lock will have a lockdep subclass of 3, and so on. */ I don't know xfs code, and probably miss something, but it seems there could be some inconsistency: lockdep warning shows mr_lock/1 taken both before and after mr_lock (i.e. /0). According to the above comment there should be always 1 before 0... That just fired some rusty neurons. #define XFS_IOLOCK_SHIFT16 #define XFS_IOLOCK_PARENT (1 << XFS_IOLOCK_SHIFT) #define XFS_IOLOCK_INUMORDER(2 << XFS_IOLOCK_SHIFT) #define XFS_ILOCK_SHIFT 24 #define XFS_ILOCK_PARENT(1 << XFS_ILOCK_SHIFT) #define XFS_ILOCK_INUMORDER (2 << XFS_ILOCK_SHIFT) So, in a lock_mode parameter, the upper 8 bits are for the ILOCK lockdep subclass, and the 16..23 bits are for the IOLOCK lockdep subclass. Where do we add them? static inline int xfs_lock_inumorder(int lock_mode, int subclass) { if (lock_mode & (XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL)) lock_mode |= (subclass + XFS_IOLOCK_INUMORDER) << XFS_IOLOCK_SHIFT; if (lock_mode & (XFS_ILOCK_SHARED|XFS_ILOCK_EXCL)) lock_mode |= (subclass + XFS_ILOCK_INUMORDER) << XFS_ILOCK_SHIFT; return lock_mode; } OH, look at those nice overflow bugs in that in that code. We shift the XFS_IOLOCK_INUMORDER and XFS_ILOCK_INUMORDER bits out the far side of the lock_mode variable result in lock subclasses of 0-3 instead of 2-5 Bugger, eh? Patch below should fix this (untested). Jarek - thanks for pointing what I should have seen earlier. Cheers, Dave. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [xfs-masters] Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
Patch looks good, Dave. (though, I stuffed up reviewing that bit of code previously:-) Oh, previous typo: s/inodes at the some time/inodes at the same time/ --Tim David Chinner wrote: On Tue, Jun 26, 2007 at 11:35:20AM +0200, Jarek Poplawski wrote: On 26-06-2007 04:16, David Chinner wrote: It does both - parent-first/child-second and ascending inode # order, which is where the problem is. standing alone, these seem fine, but they don't appear to work when the child has a lower inode number than the parent. ... From xfs_inode.h: /* * Flags for lockdep annotations. * * XFS_I[O]LOCK_PARENT - for operations that require locking two inodes * (ie directory operations that require locking a directory inode and * an entry inode). The first inode gets locked with this flag so it * gets a lockdep subclass of 1 and the second lock will have a lockdep * subclass of 0. * * XFS_I[O]LOCK_INUMORDER - for locking several inodes at the some time * with xfs_lock_inodes(). This flag is used as the starting subclass * and each subsequent lock acquired will increment the subclass by one. * So the first lock acquired will have a lockdep subclass of 2, the * second lock will have a lockdep subclass of 3, and so on. */ I don't know xfs code, and probably miss something, but it seems there could be some inconsistency: lockdep warning shows mr_lock/1 taken both before and after mr_lock (i.e. /0). According to the above comment there should be always 1 before 0... That just fired some rusty neurons. #define XFS_IOLOCK_SHIFT16 #define XFS_IOLOCK_PARENT (1 XFS_IOLOCK_SHIFT) #define XFS_IOLOCK_INUMORDER(2 XFS_IOLOCK_SHIFT) #define XFS_ILOCK_SHIFT 24 #define XFS_ILOCK_PARENT(1 XFS_ILOCK_SHIFT) #define XFS_ILOCK_INUMORDER (2 XFS_ILOCK_SHIFT) So, in a lock_mode parameter, the upper 8 bits are for the ILOCK lockdep subclass, and the 16..23 bits are for the IOLOCK lockdep subclass. Where do we add them? static inline int xfs_lock_inumorder(int lock_mode, int subclass) { if (lock_mode (XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL)) lock_mode |= (subclass + XFS_IOLOCK_INUMORDER) XFS_IOLOCK_SHIFT; if (lock_mode (XFS_ILOCK_SHARED|XFS_ILOCK_EXCL)) lock_mode |= (subclass + XFS_ILOCK_INUMORDER) XFS_ILOCK_SHIFT; return lock_mode; } OH, look at those nice overflow bugs in that in that code. We shift the XFS_IOLOCK_INUMORDER and XFS_ILOCK_INUMORDER bits out the far side of the lock_mode variable result in lock subclasses of 0-3 instead of 2-5 Bugger, eh? Patch below should fix this (untested). Jarek - thanks for pointing what I should have seen earlier. Cheers, Dave. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
Patch below should fix this (untested). Just tested 2.6.22-rc6: message is gone when patch is applied. But deleting some directories in /var/tmp (which lives on xfs) I got: BUG: MAX_LOCK_DEPTH too low! turning off the locking correctness validator. Thomas -- keep mailinglists in english, feel free to send PM in german - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
On Tue, Jun 26, 2007 at 11:35:20AM +0200, Jarek Poplawski wrote: > On 26-06-2007 04:16, David Chinner wrote: > > It does both - parent-first/child-second and ascending inode # order, > > which is where the problem is. standing alone, these seem fine, but > > they don't appear to work when the child has a lower inode number > > than the parent. > ... > > >From xfs_inode.h: > > /* > * Flags for lockdep annotations. > * > * XFS_I[O]LOCK_PARENT - for operations that require locking two inodes > * (ie directory operations that require locking a directory inode and > * an entry inode). The first inode gets locked with this flag so it > * gets a lockdep subclass of 1 and the second lock will have a lockdep > * subclass of 0. > * > * XFS_I[O]LOCK_INUMORDER - for locking several inodes at the some time > * with xfs_lock_inodes(). This flag is used as the starting subclass > * and each subsequent lock acquired will increment the subclass by one. > * So the first lock acquired will have a lockdep subclass of 2, the > * second lock will have a lockdep subclass of 3, and so on. > */ > > I don't know xfs code, and probably miss something, but it seems > there could be some inconsistency: lockdep warning shows mr_lock/1 > taken both before and after mr_lock (i.e. /0). According to the > above comment there should be always 1 before 0... That just fired some rusty neurons. #define XFS_IOLOCK_SHIFT16 #define XFS_IOLOCK_PARENT (1 << XFS_IOLOCK_SHIFT) #define XFS_IOLOCK_INUMORDER(2 << XFS_IOLOCK_SHIFT) #define XFS_ILOCK_SHIFT 24 #define XFS_ILOCK_PARENT(1 << XFS_ILOCK_SHIFT) #define XFS_ILOCK_INUMORDER (2 << XFS_ILOCK_SHIFT) So, in a lock_mode parameter, the upper 8 bits are for the ILOCK lockdep subclass, and the 16..23 bits are for the IOLOCK lockdep subclass. Where do we add them? static inline int xfs_lock_inumorder(int lock_mode, int subclass) { if (lock_mode & (XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL)) lock_mode |= (subclass + XFS_IOLOCK_INUMORDER) << XFS_IOLOCK_SHIFT; if (lock_mode & (XFS_ILOCK_SHARED|XFS_ILOCK_EXCL)) lock_mode |= (subclass + XFS_ILOCK_INUMORDER) << XFS_ILOCK_SHIFT; return lock_mode; } OH, look at those nice overflow bugs in that in that code. We shift the XFS_IOLOCK_INUMORDER and XFS_ILOCK_INUMORDER bits out the far side of the lock_mode variable result in lock subclasses of 0-3 instead of 2-5 Bugger, eh? Patch below should fix this (untested). Jarek - thanks for pointing what I should have seen earlier. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group --- fs/xfs/xfs_inode.h| 15 +-- fs/xfs/xfs_vnodeops.c |4 ++-- 2 files changed, 11 insertions(+), 8 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/xfs_vnodeops.c === --- 2.6.x-xfs-new.orig/fs/xfs/xfs_vnodeops.c2007-06-25 13:56:20.0 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_vnodeops.c 2007-06-26 22:46:55.412060598 +1000 @@ -2256,9 +2256,9 @@ static inline int xfs_lock_inumorder(int lock_mode, int subclass) { if (lock_mode & (XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL)) - lock_mode |= (subclass + XFS_IOLOCK_INUMORDER) << XFS_IOLOCK_SHIFT; + lock_mode |= (subclass + XFS_LOCK_INUMORDER) << XFS_IOLOCK_SHIFT; if (lock_mode & (XFS_ILOCK_SHARED|XFS_ILOCK_EXCL)) - lock_mode |= (subclass + XFS_ILOCK_INUMORDER) << XFS_ILOCK_SHIFT; + lock_mode |= (subclass + XFS_LOCK_INUMORDER) << XFS_ILOCK_SHIFT; return lock_mode; } Index: 2.6.x-xfs-new/fs/xfs/xfs_inode.h === --- 2.6.x-xfs-new.orig/fs/xfs/xfs_inode.h 2007-06-20 17:59:35.0 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_inode.h2007-06-26 22:46:50.104749916 +1000 @@ -386,19 +386,22 @@ xfs_iflags_test(xfs_inode_t *ip, unsigne * gets a lockdep subclass of 1 and the second lock will have a lockdep * subclass of 0. * - * XFS_I[O]LOCK_INUMORDER - for locking several inodes at the some time + * XFS_LOCK_INUMORDER - for locking several inodes at the some time * with xfs_lock_inodes(). This flag is used as the starting subclass * and each subsequent lock acquired will increment the subclass by one. * So the first lock acquired will have a lockdep subclass of 2, the - * second lock will have a lockdep subclass of 3, and so on. + * second lock will have a lockdep subclass of 3, and so on. It is + * the responsibility of the class builder to shift this to the correct + * portion of the lock_mode lockdep mask. */ +#define XFS_LOCK_PARENT1 +#define XFS_LOCK_INUMORDER 2 + #define XFS_IOLOCK_SHIFT 16 -#defineXFS_IOLOCK_PARENT (1 << XFS_IOLOCK_SHIFT) -#defineXFS_IOLOCK_INUMORDER(2 << XFS_IOLOCK_SHIFT) +#defineXFS_IOLOCK_PARENT (XFS_LOCK_PARENT <<
Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
On 26-06-2007 04:16, David Chinner wrote: > On Mon, Jun 25, 2007 at 11:01:11PM +0200, Ingo Molnar wrote: >> * Satyam Sharma <[EMAIL PROTECTED]> wrote: >> >>> [ Ok, so we know that XFS wants to lock inodes in ascending inode >>> number order and not strictly the parent-first-child-second order that >>> rest of the fs/ code does, so that makes it difficult to teach lockdep >>> about this kind of lock ordering ... ] > > It does both - parent-first/child-second and ascending inode # order, > which is where the problem is. standing alone, these seem fine, but > they don't appear to work when the child has a lower inode number > than the parent. ... >From xfs_inode.h: /* * Flags for lockdep annotations. * * XFS_I[O]LOCK_PARENT - for operations that require locking two inodes * (ie directory operations that require locking a directory inode and * an entry inode). The first inode gets locked with this flag so it * gets a lockdep subclass of 1 and the second lock will have a lockdep * subclass of 0. * * XFS_I[O]LOCK_INUMORDER - for locking several inodes at the some time * with xfs_lock_inodes(). This flag is used as the starting subclass * and each subsequent lock acquired will increment the subclass by one. * So the first lock acquired will have a lockdep subclass of 2, the * second lock will have a lockdep subclass of 3, and so on. */ I don't know xfs code, and probably miss something, but it seems there could be some inconsistency: lockdep warning shows mr_lock/1 taken both before and after mr_lock (i.e. /0). According to the above comment there should be always 1 before 0... Cheers, Jarek P. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
On 26-06-2007 04:16, David Chinner wrote: On Mon, Jun 25, 2007 at 11:01:11PM +0200, Ingo Molnar wrote: * Satyam Sharma [EMAIL PROTECTED] wrote: [ Ok, so we know that XFS wants to lock inodes in ascending inode number order and not strictly the parent-first-child-second order that rest of the fs/ code does, so that makes it difficult to teach lockdep about this kind of lock ordering ... ] It does both - parent-first/child-second and ascending inode # order, which is where the problem is. standing alone, these seem fine, but they don't appear to work when the child has a lower inode number than the parent. ... From xfs_inode.h: /* * Flags for lockdep annotations. * * XFS_I[O]LOCK_PARENT - for operations that require locking two inodes * (ie directory operations that require locking a directory inode and * an entry inode). The first inode gets locked with this flag so it * gets a lockdep subclass of 1 and the second lock will have a lockdep * subclass of 0. * * XFS_I[O]LOCK_INUMORDER - for locking several inodes at the some time * with xfs_lock_inodes(). This flag is used as the starting subclass * and each subsequent lock acquired will increment the subclass by one. * So the first lock acquired will have a lockdep subclass of 2, the * second lock will have a lockdep subclass of 3, and so on. */ I don't know xfs code, and probably miss something, but it seems there could be some inconsistency: lockdep warning shows mr_lock/1 taken both before and after mr_lock (i.e. /0). According to the above comment there should be always 1 before 0... Cheers, Jarek P. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
On Tue, Jun 26, 2007 at 11:35:20AM +0200, Jarek Poplawski wrote: On 26-06-2007 04:16, David Chinner wrote: It does both - parent-first/child-second and ascending inode # order, which is where the problem is. standing alone, these seem fine, but they don't appear to work when the child has a lower inode number than the parent. ... From xfs_inode.h: /* * Flags for lockdep annotations. * * XFS_I[O]LOCK_PARENT - for operations that require locking two inodes * (ie directory operations that require locking a directory inode and * an entry inode). The first inode gets locked with this flag so it * gets a lockdep subclass of 1 and the second lock will have a lockdep * subclass of 0. * * XFS_I[O]LOCK_INUMORDER - for locking several inodes at the some time * with xfs_lock_inodes(). This flag is used as the starting subclass * and each subsequent lock acquired will increment the subclass by one. * So the first lock acquired will have a lockdep subclass of 2, the * second lock will have a lockdep subclass of 3, and so on. */ I don't know xfs code, and probably miss something, but it seems there could be some inconsistency: lockdep warning shows mr_lock/1 taken both before and after mr_lock (i.e. /0). According to the above comment there should be always 1 before 0... That just fired some rusty neurons. #define XFS_IOLOCK_SHIFT16 #define XFS_IOLOCK_PARENT (1 XFS_IOLOCK_SHIFT) #define XFS_IOLOCK_INUMORDER(2 XFS_IOLOCK_SHIFT) #define XFS_ILOCK_SHIFT 24 #define XFS_ILOCK_PARENT(1 XFS_ILOCK_SHIFT) #define XFS_ILOCK_INUMORDER (2 XFS_ILOCK_SHIFT) So, in a lock_mode parameter, the upper 8 bits are for the ILOCK lockdep subclass, and the 16..23 bits are for the IOLOCK lockdep subclass. Where do we add them? static inline int xfs_lock_inumorder(int lock_mode, int subclass) { if (lock_mode (XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL)) lock_mode |= (subclass + XFS_IOLOCK_INUMORDER) XFS_IOLOCK_SHIFT; if (lock_mode (XFS_ILOCK_SHARED|XFS_ILOCK_EXCL)) lock_mode |= (subclass + XFS_ILOCK_INUMORDER) XFS_ILOCK_SHIFT; return lock_mode; } OH, look at those nice overflow bugs in that in that code. We shift the XFS_IOLOCK_INUMORDER and XFS_ILOCK_INUMORDER bits out the far side of the lock_mode variable result in lock subclasses of 0-3 instead of 2-5 Bugger, eh? Patch below should fix this (untested). Jarek - thanks for pointing what I should have seen earlier. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group --- fs/xfs/xfs_inode.h| 15 +-- fs/xfs/xfs_vnodeops.c |4 ++-- 2 files changed, 11 insertions(+), 8 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/xfs_vnodeops.c === --- 2.6.x-xfs-new.orig/fs/xfs/xfs_vnodeops.c2007-06-25 13:56:20.0 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_vnodeops.c 2007-06-26 22:46:55.412060598 +1000 @@ -2256,9 +2256,9 @@ static inline int xfs_lock_inumorder(int lock_mode, int subclass) { if (lock_mode (XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL)) - lock_mode |= (subclass + XFS_IOLOCK_INUMORDER) XFS_IOLOCK_SHIFT; + lock_mode |= (subclass + XFS_LOCK_INUMORDER) XFS_IOLOCK_SHIFT; if (lock_mode (XFS_ILOCK_SHARED|XFS_ILOCK_EXCL)) - lock_mode |= (subclass + XFS_ILOCK_INUMORDER) XFS_ILOCK_SHIFT; + lock_mode |= (subclass + XFS_LOCK_INUMORDER) XFS_ILOCK_SHIFT; return lock_mode; } Index: 2.6.x-xfs-new/fs/xfs/xfs_inode.h === --- 2.6.x-xfs-new.orig/fs/xfs/xfs_inode.h 2007-06-20 17:59:35.0 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_inode.h2007-06-26 22:46:50.104749916 +1000 @@ -386,19 +386,22 @@ xfs_iflags_test(xfs_inode_t *ip, unsigne * gets a lockdep subclass of 1 and the second lock will have a lockdep * subclass of 0. * - * XFS_I[O]LOCK_INUMORDER - for locking several inodes at the some time + * XFS_LOCK_INUMORDER - for locking several inodes at the some time * with xfs_lock_inodes(). This flag is used as the starting subclass * and each subsequent lock acquired will increment the subclass by one. * So the first lock acquired will have a lockdep subclass of 2, the - * second lock will have a lockdep subclass of 3, and so on. + * second lock will have a lockdep subclass of 3, and so on. It is + * the responsibility of the class builder to shift this to the correct + * portion of the lock_mode lockdep mask. */ +#define XFS_LOCK_PARENT1 +#define XFS_LOCK_INUMORDER 2 + #define XFS_IOLOCK_SHIFT 16 -#defineXFS_IOLOCK_PARENT (1 XFS_IOLOCK_SHIFT) -#defineXFS_IOLOCK_INUMORDER(2 XFS_IOLOCK_SHIFT) +#defineXFS_IOLOCK_PARENT (XFS_LOCK_PARENT XFS_IOLOCK_SHIFT) #define XFS_ILOCK_SHIFT24 -#define
Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
On Mon, Jun 25, 2007 at 11:01:11PM +0200, Ingo Molnar wrote: > > * Satyam Sharma <[EMAIL PROTECTED]> wrote: > > > [ Ok, so we know that XFS wants to lock inodes in ascending inode > > number order and not strictly the parent-first-child-second order that > > rest of the fs/ code does, so that makes it difficult to teach lockdep > > about this kind of lock ordering ... ] It does both - parent-first/child-second and ascending inode # order, which is where the problem is. standing alone, these seem fine, but they don't appear to work when the child has a lower inode number than the parent. > hm, there's a recent API addition to lockdep that should help with this: > have you looked into the patch below, could it be used to solve the XFS > problem? Basically when you are one step deeper into a recursive locking > scenario you can use lock_set_subclass() on the held lock to reset it > one level higher. Peter Zijlstra already pointed me at that patch, Ingo ;). I'm looking into it at the moment. Not being a lockdep expert, it's taking me a little time to understand exactly what is needed here. At first I wasn't sure that this would work, but now I've seen the patch that used it, I suspect that it can be used. That patch (http://lkml.org/lkml/2007/1/28/88) has three cases enumerated (prev,cur,next) to match the three node types in the list and that the "next" got set back to "cur" via lock_set_subclass() when walking the list so that lockdep always sees cur -> next transistions when walking the list. Have I read this correctly? Reading this, it seems to me that the xfs_lock_inodes() code needs to set the lock subclass of the first inode to "parent" (and lock it as a parent) and then as it walks across the remining inodes it sets the previous inode to a parent and locks the current inode as a child. It seems that I'd also need to make sure that this is done on the normal parent/child lock ordering as well. Does this make any sense? lockdep is not something I've paid much attention to up to this point (someone else did the current notation and now on leave for a couple more months), so I'm trying to pick up the pieces as quickly as possible... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
* Satyam Sharma <[EMAIL PROTECTED]> wrote: > [ Ok, so we know that XFS wants to lock inodes in ascending inode > number order and not strictly the parent-first-child-second order that > rest of the fs/ code does, so that makes it difficult to teach lockdep > about this kind of lock ordering ... ] hm, there's a recent API addition to lockdep that should help with this: have you looked into the patch below, could it be used to solve the XFS problem? Basically when you are one step deeper into a recursive locking scenario you can use lock_set_subclass() on the held lock to reset it one level higher. this preserves lockdep checking but allows arbitrary deep locking. Ingo --> Subject: [patch] lockdep: lock_set_subclass - reset a held lock's subclass From: Peter Zijlstra <[EMAIL PROTECTED]> this can be used to reset a held lock's subclass, for arbitrary-depth iterated data structures such as trees or lists which have per-node locks. Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]> --- include/linux/lockdep.h |4 ++ kernel/lockdep.c| 69 2 files changed, 73 insertions(+) Index: linux/include/linux/lockdep.h === --- linux.orig/include/linux/lockdep.h +++ linux/include/linux/lockdep.h @@ -243,6 +243,9 @@ extern void lock_acquire(struct lockdep_ extern void lock_release(struct lockdep_map *lock, int nested, unsigned long ip); +extern void lock_set_subclass(struct lockdep_map *lock, unsigned int subclass, + unsigned long ip); + # define INIT_LOCKDEP .lockdep_recursion = 0, #define lockdep_depth(tsk) (debug_locks ? (tsk)->lockdep_depth : 0) @@ -259,6 +262,7 @@ static inline void lockdep_on(void) # define lock_acquire(l, s, t, r, c, i)do { } while (0) # define lock_release(l, n, i) do { } while (0) +# define lock_set_subclass(l, s, i)do { } while (0) # define lockdep_init()do { } while (0) # define lockdep_info()do { } while (0) # define lockdep_init_map(lock, name, key, sub)do { (void)(key); } while (0) Index: linux/kernel/lockdep.c === --- linux.orig/kernel/lockdep.c +++ linux/kernel/lockdep.c @@ -2278,6 +2278,55 @@ static int check_unlock(struct task_stru return 1; } +static int +__lock_set_subclass(struct lockdep_map *lock, + unsigned int subclass, unsigned long ip) +{ + struct task_struct *curr = current; + struct held_lock *hlock, *prev_hlock; + struct lock_class *class; + unsigned int depth; + int i; + + depth = curr->lockdep_depth; + if (DEBUG_LOCKS_WARN_ON(!depth)) + return 0; + + prev_hlock = NULL; + for (i = depth-1; i >= 0; i--) { + hlock = curr->held_locks + i; + /* +* We must not cross into another context: +*/ + if (prev_hlock && prev_hlock->irq_context != hlock->irq_context) + break; + if (hlock->instance == lock) + goto found_it; + prev_hlock = hlock; + } + return print_unlock_inbalance_bug(curr, lock, ip); + +found_it: + class = register_lock_class(lock, subclass, 0); + hlock->class = class; + + curr->lockdep_depth = i; + curr->curr_chain_key = hlock->prev_chain_key; + + for (; i < depth; i++) { + hlock = curr->held_locks + i; + if (!__lock_acquire(hlock->instance, + hlock->class->subclass, hlock->trylock, + hlock->read, hlock->check, hlock->hardirqs_off, + hlock->acquire_ip)) + return 0; + } + + if (DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth)) + return 0; + return 1; +} + /* * Remove the lock to the list of currently held locks in a * potentially non-nested (out of order) manner. This is a @@ -2473,6 +2522,26 @@ void lock_release(struct lockdep_map *lo EXPORT_SYMBOL_GPL(lock_release); +void +lock_set_subclass(struct lockdep_map *lock, + unsigned int subclass, unsigned long ip) +{ + unsigned long flags; + + if (unlikely(current->lockdep_recursion)) + return; + + raw_local_irq_save(flags); + current->lockdep_recursion = 1; + check_flags(flags); + if (__lock_set_subclass(lock, subclass, ip)) + check_chain_key(current); + current->lockdep_recursion = 0; + raw_local_irq_restore(flags); +} + +EXPORT_SYMBOL_GPL(lock_set_subclass); + /* * Used by the testsuite, sanitize
Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
On 6/25/07, Johannes Weiner <[EMAIL PROTECTED]> wrote: [...] this is what just hit the ring buffer when I was surfing with elinks on a brand-new -rc6. Johannes: This is a known bogus warning. You can safely ignore it. David, Ingo: [ Ok, so we know that XFS wants to lock inodes in ascending inode number order and not strictly the parent-first-child-second order that rest of the fs/ code does, so that makes it difficult to teach lockdep about this kind of lock ordering ... ] However, this (bogus) warning still causes way too much noise on the lists (2-3 or more every week?) and most users wouldn't understand how or why this warning is bogus, so would get unnecessarily disturbed about it. Could there be a way to whitelist such "known bogus cases" in lockdep and stop it from complaining? Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[BUG] Lockdep warning with XFS on 2.6.22-rc6
Hi, this is what just hit the ring buffer when I was surfing with elinks on a brand-new -rc6. Hannes === [ INFO: possible circular locking dependency detected ] 2.6.22-rc6 #14 --- elinks/4461 is trying to acquire lock: (&(>i_lock)->mr_lock/1){}, at: [] xfs_ilock+0x50/0xd0 but task is already holding lock: (&(>i_lock)->mr_lock){}, at: [] xfs_ilock+0x50/0xd0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&(>i_lock)->mr_lock){}: [] __lock_acquire+0xe0d/0xfc0 [] lock_acquire+0x6d/0x90 [] down_write_nested+0x41/0x60 [] xfs_ilock+0x92/0xd0 [] xfs_iget_core+0x398/0x5c0 [] xfs_iget+0xbe/0x120 [] xfs_trans_iget+0xfd/0x160 [] xfs_ialloc+0xac/0x500 [] xfs_dir_ialloc+0x6c/0x2a0 [] xfs_create+0x33e/0x630 [] xfs_vn_mknod+0x19f/0x210 [] vfs_mknod+0xa2/0xf0 [] unix_bind+0x1bf/0x2e0 [] sys_bind+0x54/0x80 [] sys_socketcall+0x7d/0x260 [] sysenter_past_esp+0x5f/0x99 [] 0x -> #0 (&(>i_lock)->mr_lock/1){}: [] __lock_acquire+0xc85/0xfc0 [] lock_acquire+0x6d/0x90 [] down_read_nested+0x41/0x60 [] xfs_ilock+0x50/0xd0 [] xfs_lock_inodes+0x175/0x190 [] xfs_lock_for_rename+0x22f/0x280 [] xfs_rename+0xaf/0x8c0 [] xfs_vn_rename+0x34/0x80 [] vfs_rename+0x307/0x370 [] sys_renameat+0x1bb/0x1f0 [] sys_rename+0x28/0x30 [] sysenter_past_esp+0x5f/0x99 [] 0x other info that might help us debug this: 3 locks held by elinks/4461: #0: (>i_mutex/1){--..}, at: [] lock_rename+0xe9/0xf0 #1: (>i_mutex){--..}, at: [] mutex_lock+0x1c/0x20 #2: (&(>i_lock)->mr_lock){}, at: [] xfs_ilock+0x50/0xd0 stack backtrace: [] show_trace_log_lvl+0x1a/0x30 [] show_trace+0x12/0x20 [] dump_stack+0x15/0x20 [] print_circular_bug_tail+0x6c/0x80 [] __lock_acquire+0xc85/0xfc0 [] lock_acquire+0x6d/0x90 [] down_read_nested+0x41/0x60 [] xfs_ilock+0x50/0xd0 [] xfs_lock_inodes+0x175/0x190 [] xfs_lock_for_rename+0x22f/0x280 [] xfs_rename+0xaf/0x8c0 [] xfs_vn_rename+0x34/0x80 [] vfs_rename+0x307/0x370 [] sys_renameat+0x1bb/0x1f0 [] sys_rename+0x28/0x30 [] sysenter_past_esp+0x5f/0x99 ===
[BUG] Lockdep warning with XFS on 2.6.22-rc6
Hi, this is what just hit the ring buffer when I was surfing with elinks on a brand-new -rc6. Hannes === [ INFO: possible circular locking dependency detected ] 2.6.22-rc6 #14 --- elinks/4461 is trying to acquire lock: ((ip-i_lock)-mr_lock/1){}, at: [c01ef3d0] xfs_ilock+0x50/0xd0 but task is already holding lock: ((ip-i_lock)-mr_lock){}, at: [c01ef3d0] xfs_ilock+0x50/0xd0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: - #1 ((ip-i_lock)-mr_lock){}: [c0137e0d] __lock_acquire+0xe0d/0xfc0 [c013802d] lock_acquire+0x6d/0x90 [c012e4f1] down_write_nested+0x41/0x60 [c01ef412] xfs_ilock+0x92/0xd0 [c01eff98] xfs_iget_core+0x398/0x5c0 [c01f027e] xfs_iget+0xbe/0x120 [c02098bd] xfs_trans_iget+0xfd/0x160 [c01f379c] xfs_ialloc+0xac/0x500 [c020a46c] xfs_dir_ialloc+0x6c/0x2a0 [c02109ae] xfs_create+0x33e/0x630 [c021bf7f] xfs_vn_mknod+0x19f/0x210 [c016fd12] vfs_mknod+0xa2/0xf0 [c03c3dbf] unix_bind+0x1bf/0x2e0 [c0373ba4] sys_bind+0x54/0x80 [c037518d] sys_socketcall+0x7d/0x260 [c01029ce] sysenter_past_esp+0x5f/0x99 [] 0x - #0 ((ip-i_lock)-mr_lock/1){}: [c0137c85] __lock_acquire+0xc85/0xfc0 [c013802d] lock_acquire+0x6d/0x90 [c012e6e1] down_read_nested+0x41/0x60 [c01ef3d0] xfs_ilock+0x50/0xd0 [c020dd45] xfs_lock_inodes+0x175/0x190 [c02054af] xfs_lock_for_rename+0x22f/0x280 [c02055af] xfs_rename+0xaf/0x8c0 [c021bb64] xfs_vn_rename+0x34/0x80 [c0170237] vfs_rename+0x307/0x370 [c0171e3b] sys_renameat+0x1bb/0x1f0 [c0171e98] sys_rename+0x28/0x30 [c01029ce] sysenter_past_esp+0x5f/0x99 [] 0x other info that might help us debug this: 3 locks held by elinks/4461: #0: (inode-i_mutex/1){--..}, at: [c016f459] lock_rename+0xe9/0xf0 #1: (inode-i_mutex){--..}, at: [c03c8a8c] mutex_lock+0x1c/0x20 #2: ((ip-i_lock)-mr_lock){}, at: [c01ef3d0] xfs_ilock+0x50/0xd0 stack backtrace: [c010398a] show_trace_log_lvl+0x1a/0x30 [c01046a2] show_trace+0x12/0x20 [c0104765] dump_stack+0x15/0x20 [c0135d4c] print_circular_bug_tail+0x6c/0x80 [c0137c85] __lock_acquire+0xc85/0xfc0 [c013802d] lock_acquire+0x6d/0x90 [c012e6e1] down_read_nested+0x41/0x60 [c01ef3d0] xfs_ilock+0x50/0xd0 [c020dd45] xfs_lock_inodes+0x175/0x190 [c02054af] xfs_lock_for_rename+0x22f/0x280 [c02055af] xfs_rename+0xaf/0x8c0 [c021bb64] xfs_vn_rename+0x34/0x80 [c0170237] vfs_rename+0x307/0x370 [c0171e3b] sys_renameat+0x1bb/0x1f0 [c0171e98] sys_rename+0x28/0x30 [c01029ce] sysenter_past_esp+0x5f/0x99 ===
Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
On 6/25/07, Johannes Weiner [EMAIL PROTECTED] wrote: [...] this is what just hit the ring buffer when I was surfing with elinks on a brand-new -rc6. Johannes: This is a known bogus warning. You can safely ignore it. David, Ingo: [ Ok, so we know that XFS wants to lock inodes in ascending inode number order and not strictly the parent-first-child-second order that rest of the fs/ code does, so that makes it difficult to teach lockdep about this kind of lock ordering ... ] However, this (bogus) warning still causes way too much noise on the lists (2-3 or more every week?) and most users wouldn't understand how or why this warning is bogus, so would get unnecessarily disturbed about it. Could there be a way to whitelist such known bogus cases in lockdep and stop it from complaining? Satyam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
* Satyam Sharma [EMAIL PROTECTED] wrote: [ Ok, so we know that XFS wants to lock inodes in ascending inode number order and not strictly the parent-first-child-second order that rest of the fs/ code does, so that makes it difficult to teach lockdep about this kind of lock ordering ... ] hm, there's a recent API addition to lockdep that should help with this: have you looked into the patch below, could it be used to solve the XFS problem? Basically when you are one step deeper into a recursive locking scenario you can use lock_set_subclass() on the held lock to reset it one level higher. this preserves lockdep checking but allows arbitrary deep locking. Ingo -- Subject: [patch] lockdep: lock_set_subclass - reset a held lock's subclass From: Peter Zijlstra [EMAIL PROTECTED] this can be used to reset a held lock's subclass, for arbitrary-depth iterated data structures such as trees or lists which have per-node locks. Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] Signed-off-by: Ingo Molnar [EMAIL PROTECTED] --- include/linux/lockdep.h |4 ++ kernel/lockdep.c| 69 2 files changed, 73 insertions(+) Index: linux/include/linux/lockdep.h === --- linux.orig/include/linux/lockdep.h +++ linux/include/linux/lockdep.h @@ -243,6 +243,9 @@ extern void lock_acquire(struct lockdep_ extern void lock_release(struct lockdep_map *lock, int nested, unsigned long ip); +extern void lock_set_subclass(struct lockdep_map *lock, unsigned int subclass, + unsigned long ip); + # define INIT_LOCKDEP .lockdep_recursion = 0, #define lockdep_depth(tsk) (debug_locks ? (tsk)-lockdep_depth : 0) @@ -259,6 +262,7 @@ static inline void lockdep_on(void) # define lock_acquire(l, s, t, r, c, i)do { } while (0) # define lock_release(l, n, i) do { } while (0) +# define lock_set_subclass(l, s, i)do { } while (0) # define lockdep_init()do { } while (0) # define lockdep_info()do { } while (0) # define lockdep_init_map(lock, name, key, sub)do { (void)(key); } while (0) Index: linux/kernel/lockdep.c === --- linux.orig/kernel/lockdep.c +++ linux/kernel/lockdep.c @@ -2278,6 +2278,55 @@ static int check_unlock(struct task_stru return 1; } +static int +__lock_set_subclass(struct lockdep_map *lock, + unsigned int subclass, unsigned long ip) +{ + struct task_struct *curr = current; + struct held_lock *hlock, *prev_hlock; + struct lock_class *class; + unsigned int depth; + int i; + + depth = curr-lockdep_depth; + if (DEBUG_LOCKS_WARN_ON(!depth)) + return 0; + + prev_hlock = NULL; + for (i = depth-1; i = 0; i--) { + hlock = curr-held_locks + i; + /* +* We must not cross into another context: +*/ + if (prev_hlock prev_hlock-irq_context != hlock-irq_context) + break; + if (hlock-instance == lock) + goto found_it; + prev_hlock = hlock; + } + return print_unlock_inbalance_bug(curr, lock, ip); + +found_it: + class = register_lock_class(lock, subclass, 0); + hlock-class = class; + + curr-lockdep_depth = i; + curr-curr_chain_key = hlock-prev_chain_key; + + for (; i depth; i++) { + hlock = curr-held_locks + i; + if (!__lock_acquire(hlock-instance, + hlock-class-subclass, hlock-trylock, + hlock-read, hlock-check, hlock-hardirqs_off, + hlock-acquire_ip)) + return 0; + } + + if (DEBUG_LOCKS_WARN_ON(curr-lockdep_depth != depth)) + return 0; + return 1; +} + /* * Remove the lock to the list of currently held locks in a * potentially non-nested (out of order) manner. This is a @@ -2473,6 +2522,26 @@ void lock_release(struct lockdep_map *lo EXPORT_SYMBOL_GPL(lock_release); +void +lock_set_subclass(struct lockdep_map *lock, + unsigned int subclass, unsigned long ip) +{ + unsigned long flags; + + if (unlikely(current-lockdep_recursion)) + return; + + raw_local_irq_save(flags); + current-lockdep_recursion = 1; + check_flags(flags); + if (__lock_set_subclass(lock, subclass, ip)) + check_chain_key(current); + current-lockdep_recursion = 0; + raw_local_irq_restore(flags); +} + +EXPORT_SYMBOL_GPL(lock_set_subclass); + /* * Used by the testsuite, sanitize the validator state * after a
Re: [BUG] Lockdep warning with XFS on 2.6.22-rc6
On Mon, Jun 25, 2007 at 11:01:11PM +0200, Ingo Molnar wrote: * Satyam Sharma [EMAIL PROTECTED] wrote: [ Ok, so we know that XFS wants to lock inodes in ascending inode number order and not strictly the parent-first-child-second order that rest of the fs/ code does, so that makes it difficult to teach lockdep about this kind of lock ordering ... ] It does both - parent-first/child-second and ascending inode # order, which is where the problem is. standing alone, these seem fine, but they don't appear to work when the child has a lower inode number than the parent. hm, there's a recent API addition to lockdep that should help with this: have you looked into the patch below, could it be used to solve the XFS problem? Basically when you are one step deeper into a recursive locking scenario you can use lock_set_subclass() on the held lock to reset it one level higher. Peter Zijlstra already pointed me at that patch, Ingo ;). I'm looking into it at the moment. Not being a lockdep expert, it's taking me a little time to understand exactly what is needed here. At first I wasn't sure that this would work, but now I've seen the patch that used it, I suspect that it can be used. That patch (http://lkml.org/lkml/2007/1/28/88) has three cases enumerated (prev,cur,next) to match the three node types in the list and that the next got set back to cur via lock_set_subclass() when walking the list so that lockdep always sees cur - next transistions when walking the list. Have I read this correctly? Reading this, it seems to me that the xfs_lock_inodes() code needs to set the lock subclass of the first inode to parent (and lock it as a parent) and then as it walks across the remining inodes it sets the previous inode to a parent and locks the current inode as a child. It seems that I'd also need to make sure that this is done on the normal parent/child lock ordering as well. Does this make any sense? lockdep is not something I've paid much attention to up to this point (someone else did the current notation and now on leave for a couple more months), so I'm trying to pick up the pieces as quickly as possible... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/