[PATCH 3.2 144/152] dcache: Fix locking bugs in backported "deal with deadlock in d_walk()"
3.2.67-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Hutchings <b...@decadent.org.uk>

Steven Rostedt reported:

> Porting -rt to the latest 3.2 stable tree I triggered this bug:
>
> =====================================
> [ BUG: bad unlock balance detected! ]
> -------------------------------------
> rm/1638 is trying to release lock (rcu_read_lock) at:
> [<c04fde6c>] rcu_read_unlock+0x0/0x23
> but there are no more locks to release!
>
> other info that might help us debug this:
> 2 locks held by rm/1638:
>  #0:  (&sb->s_type->i_mutex_key#9/1){+.+.+.}, at: [<c04f93eb>] do_rmdir+0x5f/0xd2
>  #1:  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<c04f9329>] vfs_rmdir+0x49/0xac
>
> stack backtrace:
> Pid: 1638, comm: rm Not tainted 3.2.66-test-rt96+ #2
> Call Trace:
>  [<c083f390>] ? printk+0x1d/0x1f
>  [<c0463cdf>] print_unlock_inbalance_bug+0xc3/0xcd
>  [<c04653a8>] lock_release_non_nested+0x98/0x1ec
>  [<c046228d>] ? trace_hardirqs_off_caller+0x18/0x90
>  [<c0456f1c>] ? local_clock+0x2d/0x50
>  [<c04fde6c>] ? d_hash+0x2f/0x2f
>  [<c04fde6c>] ? d_hash+0x2f/0x2f
>  [<c046568e>] lock_release+0x192/0x1ad
>  [<c04fde83>] rcu_read_unlock+0x17/0x23
>  [<c04ff344>] shrink_dcache_parent+0x227/0x270
>  [<c04f9348>] vfs_rmdir+0x68/0xac
>  [<c04f9424>] do_rmdir+0x98/0xd2
>  [<c04f03ad>] ? fput+0x1a3/0x1ab
>  [<c084dd42>] ? sysenter_exit+0xf/0x1a
>  [<c0465b58>] ? trace_hardirqs_on_caller+0x118/0x149
>  [<c04fa3e0>] sys_unlinkat+0x2b/0x35
>  [<c084dd13>] sysenter_do_call+0x12/0x12
>
> There's a path to calling rcu_read_unlock() without calling
> rcu_read_lock() in have_submounts().
>
> 	goto positive;
>
> positive:
> 	if (!locked && read_seqretry(&rename_lock, seq))
> 		goto rename_retry;
>
> rename_retry:
> 	rcu_read_unlock();
>
> in the above path, rcu_read_lock() is never done before calling
> rcu_read_unlock();

I reviewed locking contexts in all three functions that I changed when
backporting "deal with deadlock in d_walk()".  It's actually worse
than this:

- We don't hold this_parent->d_lock at the 'positive' label in
  have_submounts(), but it is unlocked after 'rename_retry'.
- There is an rcu_read_unlock() after the 'out' label in
  select_parent(), but it's not held at the 'goto out'.

Fix all three lock imbalances.

Reported-by: Steven Rostedt <rost...@goodmis.org>
Signed-off-by: Ben Hutchings <b...@decadent.org.uk>
Tested-by: Steven Rostedt <rost...@goodmis.org>
---
 fs/dcache.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1035,7 +1035,7 @@ ascend:
 	return 0; /* No mount points found in tree */
 positive:
 	if (!locked && read_seqretry(&rename_lock, seq))
-		goto rename_retry;
+		goto rename_retry_unlocked;
 	if (locked)
 		write_sequnlock(&rename_lock);
 	return 1;
@@ -1045,6 +1045,7 @@ rename_retry:
 	rcu_read_unlock();
 	if (locked)
 		goto again;
+rename_retry_unlocked:
 	locked = 1;
 	write_seqlock(&rename_lock);
 	goto again;
@@ -1109,6 +1110,7 @@ resume:
 		 */
 		if (found && need_resched()) {
 			spin_unlock(&dentry->d_lock);
+			rcu_read_lock();
 			goto out;
 		}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
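The have_submounts() part of the fix can be illustrated with a toy
user-space model of the two retry labels.  This is only a sketch under
stated assumptions, not kernel code: the counters in struct lock_state,
the helper put(), and the function model_positive_path() are all
hypothetical stand-ins, and the single simulated rename race is
hard-coded.  It counts how many unlocks are attempted on locks that were
never taken when the 'positive' label has to retry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: nesting depths stand in for d_lock and RCU. */
struct lock_state {
	int d_lock;		/* models this_parent->d_lock */
	int rcu;		/* models rcu_read_lock() nesting */
	int bad_unlocks;	/* unlocks attempted while depth was 0 */
};

/* Release one level, or record an imbalance if nothing is held. */
static void put(int *depth, int *bad)
{
	if (*depth > 0)
		(*depth)--;
	else
		(*bad)++;
}

/*
 * Models only the path where a mount point is found, so control
 * reaches 'positive' holding neither d_lock nor the RCU read lock,
 * and a rename raced with the first (lockless) pass.
 * Returns the number of unbalanced unlocks observed.
 */
static int model_positive_path(bool use_fix)
{
	struct lock_state s = { 0, 0, 0 };
	bool locked = false;
	bool raced = true;	/* pretend read_seqretry() fails once */

again:
	/* 'positive' label: nothing is held here. */
	if (!locked && raced) {
		raced = false;
		if (use_fix)
			goto rename_retry_unlocked;
		goto rename_retry;
	}
	return s.bad_unlocks;

rename_retry:
	/* Correct only for callers that hold both locks. */
	put(&s.d_lock, &s.bad_unlocks);
	put(&s.rcu, &s.bad_unlocks);
	if (locked)
		goto again;
rename_retry_unlocked:
	/* Retry with the (modelled) rename_lock held exclusively. */
	locked = true;
	goto again;
}
```

Calling model_positive_path(false) takes the old 'rename_retry' route
and reports two unbalanced unlocks (d_lock and RCU), whereas
model_positive_path(true) enters at 'rename_retry_unlocked' and reports
none.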