[PATCH 3.2 144/152] dcache: Fix locking bugs in backported "deal with deadlock in d_walk()"

2015-02-16 Thread Ben Hutchings
3.2.67-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Ben Hutchings <b...@decadent.org.uk>

Steven Rostedt reported:
> Porting -rt to the latest 3.2 stable tree I triggered this bug:
> 
> =====================================
> [ BUG: bad unlock balance detected! ]
> -------------------------------------
> rm/1638 is trying to release lock (rcu_read_lock) at:
> [<c04fde6c>] rcu_read_unlock+0x0/0x23
> but there are no more locks to release!
> 
> other info that might help us debug this:
> 2 locks held by rm/1638:
>  #0:  (&sb->s_type->i_mutex_key#9/1){+.+.+.}, at: [<c04f93eb>] do_rmdir+0x5f/0xd2
>  #1:  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<c04f9329>] vfs_rmdir+0x49/0xac
> 
> stack backtrace:
> Pid: 1638, comm: rm Not tainted 3.2.66-test-rt96+ #2
> Call Trace:
>  [<c083f390>] ? printk+0x1d/0x1f
>  [<c0463cdf>] print_unlock_inbalance_bug+0xc3/0xcd
>  [<c04653a8>] lock_release_non_nested+0x98/0x1ec
>  [<c046228d>] ? trace_hardirqs_off_caller+0x18/0x90
>  [<c0456f1c>] ? local_clock+0x2d/0x50
>  [<c04fde6c>] ? d_hash+0x2f/0x2f
>  [<c04fde6c>] ? d_hash+0x2f/0x2f
>  [<c046568e>] lock_release+0x192/0x1ad
>  [<c04fde83>] rcu_read_unlock+0x17/0x23
>  [<c04ff344>] shrink_dcache_parent+0x227/0x270
>  [<c04f9348>] vfs_rmdir+0x68/0xac
>  [<c04f9424>] do_rmdir+0x98/0xd2
>  [<c04f03ad>] ? fput+0x1a3/0x1ab
>  [<c084dd42>] ? sysenter_exit+0xf/0x1a
>  [<c0465b58>] ? trace_hardirqs_on_caller+0x118/0x149
>  [<c04fa3e0>] sys_unlinkat+0x2b/0x35
>  [<c084dd13>] sysenter_do_call+0x12/0x12
> 
> 
> 
> 
> There's a path to calling rcu_read_unlock() without calling
> rcu_read_lock() in have_submounts().
> 
>   goto positive;
> 
> positive:
>   if (!locked && read_seqretry(&rename_lock, seq))
>           goto rename_retry;
> 
> rename_retry:
>   rcu_read_unlock();
> 
> in the above path, rcu_read_lock() is never done before calling
> rcu_read_unlock();

I reviewed locking contexts in all three functions that I changed when
backporting "deal with deadlock in d_walk()".  It's actually worse
than this:

- We don't hold this_parent->d_lock at the 'positive' label in
  have_submounts(), but it is unlocked after 'rename_retry'.
- There is an rcu_read_unlock() after the 'out' label in
  select_parent(), but it's not held at the 'goto out'.

Fix all three lock imbalances.

Reported-by: Steven Rostedt <rost...@goodmis.org>
Signed-off-by: Ben Hutchings <b...@decadent.org.uk>
Tested-by: Steven Rostedt <rost...@goodmis.org>
---
 fs/dcache.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1035,7 +1035,7 @@ ascend:
 	return 0; /* No mount points found in tree */
 positive:
 	if (!locked && read_seqretry(&rename_lock, seq))
-		goto rename_retry;
+		goto rename_retry_unlocked;
 	if (locked)
 		write_sequnlock(&rename_lock);
 	return 1;
@@ -1045,6 +1045,7 @@ rename_retry:
 	rcu_read_unlock();
 	if (locked)
 		goto again;
+rename_retry_unlocked:
 	locked = 1;
 	write_seqlock(&rename_lock);
 	goto again;
@@ -1109,6 +1110,7 @@ resume:
 		 */
 		if (found && need_resched()) {
 			spin_unlock(&dentry->d_lock);
+			rcu_read_lock();
 			goto out;
 		}
 

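The lock imbalances described above can be checked outside the kernel
with a small userspace model. This is only a sketch, not kernel code:
struct lock_state, model_lock(), model_unlock(), balanced() and the two
path functions are hypothetical names, and the counters merely stand in
for lockdep's held-lock tracking. It assumes, as the description says,
that neither this_parent->d_lock nor the RCU read lock is held at
'positive', and that only dentry->d_lock is held before the
need_resched() exit in select_parent().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the locks involved; not a kernel structure. */
struct lock_state {
	int rcu_depth;    /* rcu_read_lock() nesting */
	int d_lock_depth; /* how many times the dentry's d_lock is held */
	bool underflow;   /* set when a lock is released while not held */
};

void model_lock(int *depth)
{
	(*depth)++;
}

void model_unlock(struct lock_state *s, int *depth)
{
	if (*depth == 0)
		s->underflow = true; /* lockdep would report a bad unlock balance */
	else
		(*depth)--;
}

bool balanced(struct lock_state s)
{
	return !s.underflow && s.rcu_depth == 0 && s.d_lock_depth == 0;
}

/*
 * Tail of have_submounts() when a mount point was found and
 * read_seqretry() asks for a retry. 'fixed' selects the patched target.
 */
struct lock_state positive_retry_path(bool fixed)
{
	struct lock_state s = { 0, 0, false }; /* nothing held at 'positive' */

	if (fixed)
		goto rename_retry_unlocked;
	goto rename_retry;

rename_retry:
	model_unlock(&s, &s.d_lock_depth); /* spin_unlock(&this_parent->d_lock) */
	model_unlock(&s, &s.rcu_depth);    /* rcu_read_unlock() */
rename_retry_unlocked:
	/* locked = 1; write_seqlock(&rename_lock); goto again; */
	return s;
}

/* Early exit from select_parent() when need_resched() is true. */
struct lock_state resched_exit_path(bool fixed)
{
	struct lock_state s = { 0, 1, false }; /* only dentry->d_lock held */

	model_unlock(&s, &s.d_lock_depth); /* spin_unlock(&dentry->d_lock) */
	if (fixed)
		model_lock(&s.rcu_depth);  /* rcu_read_lock() added by the patch */
	/* out: */
	model_unlock(&s, &s.rcu_depth);    /* rcu_read_unlock() */
	return s;
}

/* The unpatched paths underflow; the patched ones stay balanced. */
void run_checks(void)
{
	assert(!balanced(positive_retry_path(false)));
	assert(balanced(positive_retry_path(true)));
	assert(!balanced(resched_exit_path(false)));
	assert(balanced(resched_exit_path(true)));
}
```

Calling run_checks() returns silently. With fixed set to false, both
paths set the underflow flag, which corresponds to the "bad unlock
balance" report quoted above; the unpatched have_submounts() path
underflows twice, once for d_lock and once for RCU.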
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

