On Thu, Nov 12, 2009 at 02:29:18PM +0000, Steven Whitehouse wrote:
> Hi,
>
> On Thu, 2009-11-12 at 15:22 +0100, Ingo Molnar wrote:
> > * Steven Whitehouse <swhit...@redhat.com> wrote:
> >
> > > I looked at possibly changing this to use completions, but
> > > it seems that the usage here is not easily adapted to that.
> > > This patch adds suitable annotation to the write side of
> > > the ls_in_recovery semaphore so that we don't get nasty
> > > messages from lockdep when mounting a gfs2 filesystem.
> >
> > What do those 'nasty messages' say? If they expose some bug and this
> > patch works around that bug by hiding it then NAK ...
> >
> > Ingo
> >
> The nasty messages are moaning that the lock is being taken in one
> thread and unlocked in another. I couldn't see any bugs in the code when
> I looked at it. Below are the messages that I get - to reproduce just
> mount a GFS2 filesystem with the dlm lock manager. It happens on every
> mount,
>
> Steve.
>
> Nov 12 15:10:01 chywoon kernel:
> =============================================
> Nov 12 15:10:01 chywoon kernel: [ INFO: possible recursive locking
> detected ]
That recursive locking trace is something different. up_write_non_owner()
addresses this trace, which as you say, is from doing the down and up from
different threads (which is the intention):

Nov 11 16:50:08 bull-02 kernel: GFS2: fsid=bull:foo.1: Joined cluster. Now mounting FS...
Nov 11 16:50:09 bull-02 kernel:
Nov 11 16:50:09 bull-02 kernel: =====================================
Nov 11 16:50:09 bull-02 kernel: [ BUG: bad unlock balance detected! ]
Nov 11 16:50:09 bull-02 kernel: -------------------------------------
Nov 11 16:50:09 bull-02 kernel: dlm_recoverd/8958 is trying to release lock (&ls->ls_in_recovery) at:
Nov 11 16:50:09 bull-02 kernel: [<ffffffffa02e0fa7>] dlm_recoverd+0x323/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: but there are no more locks to release!
Nov 11 16:50:09 bull-02 kernel:
Nov 11 16:50:09 bull-02 kernel: other info that might help us debug this:
Nov 11 16:50:09 bull-02 kernel: 3 locks held by dlm_recoverd/8958:
Nov 11 16:50:09 bull-02 kernel: #0: (&ls->ls_recoverd_active){......}, at: [<ffffffffa02e0d74>] dlm_recoverd+0xf0/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: #1: (&ls->ls_recv_active){......}, at: [<ffffffffa02e0f81>] dlm_recoverd+0x2fd/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: #2: (&ls->ls_recover_lock){......}, at: [<ffffffffa02e0f89>] dlm_recoverd+0x305/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel:
Nov 11 16:50:09 bull-02 kernel: stack backtrace:
Nov 11 16:50:09 bull-02 kernel: Pid: 8958, comm: dlm_recoverd Not tainted 2.6.32-rc5 #2
Nov 11 16:50:09 bull-02 kernel: Call Trace:
Nov 11 16:50:09 bull-02 kernel: [<ffffffffa02e0fa7>] ? dlm_recoverd+0x323/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8106949b>] print_unlock_inbalance_bug+0xd6/0xe0
Nov 11 16:50:09 bull-02 kernel: [<ffffffff81069563>] lock_release_non_nested+0xbe/0x259
Nov 11 16:50:09 bull-02 kernel: [<ffffffffa02e0fa7>] ? dlm_recoverd+0x323/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: [<ffffffffa02e0fa7>] ? dlm_recoverd+0x323/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8106b8ae>] lock_release+0x14a/0x16c
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8105fb71>] up_write+0x1e/0x2d
Nov 11 16:50:09 bull-02 kernel: [<ffffffffa02e0fa7>] dlm_recoverd+0x323/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: [<ffffffffa02e0c84>] ? dlm_recoverd+0x0/0x4ce [dlm]
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8105cad0>] kthread+0x7d/0x85
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8100ca1a>] child_rip+0xa/0x20
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8105ca32>] ? kthreadd+0xc2/0xe3
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8105ca53>] ? kthread+0x0/0x85
Nov 11 16:50:09 bull-02 kernel: [<ffffffff8100ca10>] ? child_rip+0x0/0x20
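
For anyone not familiar with the pattern behind the "bad unlock balance" report, here is a rough userspace analogy of taking a lock in one thread and releasing it from another. It is only a sketch: it uses a POSIX counting semaphore, not a kernel rwsem, it is not the dlm code, and it does not show up_write_non_owner() or lockdep. The names in_recovery, recovery_thread and handoff_demo are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

/* Userspace analogy only: a POSIX semaphore standing in for ls_in_recovery. */
static sem_t in_recovery;

/* Hypothetical stand-in for the recovery thread: it releases a lock
 * that it never acquired itself. */
static void *recovery_thread(void *arg)
{
    (void)arg;
    sem_post(&in_recovery);            /* the "up" happens in this thread */
    return NULL;
}

/* Returns 0 if the cross-thread down/up hand-off worked, -1 otherwise. */
int handoff_demo(void)
{
    pthread_t tid;
    int rc;
    int reacquired;

    if (sem_init(&in_recovery, 0, 1) != 0)
        return -1;

    sem_wait(&in_recovery);            /* the "down" happens in the caller */

    /* While held, a second non-blocking attempt must fail with EAGAIN. */
    if (sem_trywait(&in_recovery) == 0 || errno != EAGAIN) {
        sem_destroy(&in_recovery);
        return -1;
    }

    if (pthread_create(&tid, NULL, recovery_thread, NULL) != 0) {
        sem_destroy(&in_recovery);
        return -1;
    }
    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    /* The other thread released it, so the caller can take it again. */
    reacquired = (sem_trywait(&in_recovery) == 0);

    sem_destroy(&in_recovery);
    return reacquired ? 0 : -1;
}
```

A plain semaphore has no notion of an owner, so nothing complains here. The trace above comes from lockdep tracking held locks per task, which is why a release from a task that did not take the lock looks unbalanced to it.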