Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-04-16 Thread Mike Galbraith
On Fri, 2013-04-05 at 09:21 -0400, Rik van Riel wrote: 
> On 04/05/2013 12:38 AM, Mike Galbraith wrote:
> > On Tue, 2013-03-26 at 16:00 -0400, Rik van Riel wrote:
> 
> >> The ipc semaphore code has a nasty RCU locking tangle, with both
> >> find_alloc_undo and semtimedop taking the rcu_read_lock(). The
> >> code can be cleaned up somewhat by only taking the rcu_read_lock
> >> once.
> >>
> >> There are no other callers to find_alloc_undo.
> >>
> >> This should also solve the trinity issue reported by Sasha Levin.
> >
> > I take it this is on top of the patchlet Sasha submitted?
> 
> Indeed, and all the other fixes that got submitted :)

I plugged it into my 64 core rt box and beat on it again, no stalls or
any other troubles noted.

-Mike

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-04-16 Thread Mike Galbraith
On Fri, 2013-04-05 at 09:21 -0400, Rik van Riel wrote: 
 On 04/05/2013 12:38 AM, Mike Galbraith wrote:
  On Tue, 2013-03-26 at 16:00 -0400, Rik van Riel wrote:
 
  The ipc semaphore code has a nasty RCU locking tangle, with both
  find_alloc_undo and semtimedop taking the rcu_read_lock(). The
  code can be cleaned up somewhat by only taking the rcu_read_lock
  once.
 
  There are no other callers to find_alloc_undo.
 
  This should also solve the trinity issue reported by Sasha Levin.
 
  I take it this is on top of the patchlet Sasha submitted?
 
 Indeed, and all the other fixes that got submitted :)

I plugged it into my 64 core rt box and beat on it again, no stalls or
any other troubles noted.

-Mike

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-04-05 Thread Mike Galbraith
On Fri, 2013-04-05 at 09:21 -0400, Rik van Riel wrote: 
> On 04/05/2013 12:38 AM, Mike Galbraith wrote:
> > On Tue, 2013-03-26 at 16:00 -0400, Rik van Riel wrote:
> 
> >> The ipc semaphore code has a nasty RCU locking tangle, with both
> >> find_alloc_undo and semtimedop taking the rcu_read_lock(). The
> >> code can be cleaned up somewhat by only taking the rcu_read_lock
> >> once.
> >>
> >> There are no other callers to find_alloc_undo.
> >>
> >> This should also solve the trinity issue reported by Sasha Levin.
> >
> > I take it this is on top of the patchlet Sasha submitted?
> 
> Indeed, and all the other fixes that got submitted :)
> 
> > (I hit rcu stall banging on patch set in rt with 60 synchronized core
> > executive model if I let it run long enough, fwtw)
> 
> What are you using to trigger an rcu stall?

Running a model of a userspace task scheduler.  That was a fix or so ago
now though.  I'll try the set again on that box.

-Mike

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-04-05 Thread Rik van Riel

On 04/05/2013 12:38 AM, Mike Galbraith wrote:

On Tue, 2013-03-26 at 16:00 -0400, Rik van Riel wrote:



The ipc semaphore code has a nasty RCU locking tangle, with both
find_alloc_undo and semtimedop taking the rcu_read_lock(). The
code can be cleaned up somewhat by only taking the rcu_read_lock
once.

There are no other callers to find_alloc_undo.

This should also solve the trinity issue reported by Sasha Levin.


I take it this is on top of the patchlet Sasha submitted?


Indeed, and all the other fixes that got submitted :)


(I hit rcu stall banging on patch set in rt with 60 synchronized core
executive model if I let it run long enough, fwtw)


What are you using to trigger an rcu stall?


Reported-by: Sasha Levin 
Signed-off-by: Rik van Riel 


--
All rights reversed.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-04-05 Thread Rik van Riel

On 04/05/2013 12:38 AM, Mike Galbraith wrote:

On Tue, 2013-03-26 at 16:00 -0400, Rik van Riel wrote:



The ipc semaphore code has a nasty RCU locking tangle, with both
find_alloc_undo and semtimedop taking the rcu_read_lock(). The
code can be cleaned up somewhat by only taking the rcu_read_lock
once.

There are no other callers to find_alloc_undo.

This should also solve the trinity issue reported by Sasha Levin.


I take it this is on top of the patchlet Sasha submitted?


Indeed, and all the other fixes that got submitted :)


(I hit rcu stall banging on patch set in rt with 60 synchronized core
executive model if I let it run long enough, fwtw)


What are you using to trigger an rcu stall?


Reported-by: Sasha Levin sasha.le...@oracle.com
Signed-off-by: Rik van Riel r...@redhat.com


--
All rights reversed.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-04-05 Thread Mike Galbraith
On Fri, 2013-04-05 at 09:21 -0400, Rik van Riel wrote: 
 On 04/05/2013 12:38 AM, Mike Galbraith wrote:
  On Tue, 2013-03-26 at 16:00 -0400, Rik van Riel wrote:
 
  The ipc semaphore code has a nasty RCU locking tangle, with both
  find_alloc_undo and semtimedop taking the rcu_read_lock(). The
  code can be cleaned up somewhat by only taking the rcu_read_lock
  once.
 
  There are no other callers to find_alloc_undo.
 
  This should also solve the trinity issue reported by Sasha Levin.
 
  I take it this is on top of the patchlet Sasha submitted?
 
 Indeed, and all the other fixes that got submitted :)
 
  (I hit rcu stall banging on patch set in rt with 60 synchronized core
  executive model if I let it run long enough, fwtw)
 
 What are you using to trigger an rcu stall?

Running a model of a userspace task scheduler.  That was a fix or so ago
now though.  I'll try the set again on that box.

-Mike

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-04-04 Thread Mike Galbraith
On Tue, 2013-03-26 at 16:00 -0400, Rik van Riel wrote: 
> On Tue, 26 Mar 2013 14:07:14 -0400
> Sasha Levin  wrote:
> 
> > > Not necessarily, we do release everything at the end of the function: 
> > > out_unlock_free:
> > >   sem_unlock(sma, locknum);
> > 
> > Ow, there's a rcu_read_unlock() in sem_unlock()? This complicates things 
> > even
> > more I suspect. If un is non-NULL we'll be unlocking rcu lock twice?
> 
> Sasha, this patch should resolve the RCU tangle, by making sure
> we only ever take the rcu_read_lock once in semtimedop.
> 
> ---8<---
> 
> The ipc semaphore code has a nasty RCU locking tangle, with both
> find_alloc_undo and semtimedop taking the rcu_read_lock(). The
> code can be cleaned up somewhat by only taking the rcu_read_lock
> once.
> 
> There are no other callers to find_alloc_undo.
> 
> This should also solve the trinity issue reported by Sasha Levin.

I take it this is on top of the patchlet Sasha submitted?

(I hit rcu stall banging on patch set in rt with 60 synchronized core
executive model if I let it run long enough, fwtw)

> Reported-by: Sasha Levin 
> Signed-off-by: Rik van Riel 
> ---
>  ipc/sem.c |   31 +--
>  1 files changed, 9 insertions(+), 22 deletions(-)
> 
> diff --git a/ipc/sem.c b/ipc/sem.c
> index f46441a..2ec2945 100644
> --- a/ipc/sem.c
> +++ b/ipc/sem.c
> @@ -1646,22 +1646,23 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf 
> __user *, tsops,
>   alter = 1;
>   }
>  
> + INIT_LIST_HEAD();
> +
>   if (undos) {
> + /* On success, find_alloc_undo takes the rcu_read_lock */
>   un = find_alloc_undo(ns, semid);
>   if (IS_ERR(un)) {
>   error = PTR_ERR(un);
>   goto out_free;
>   }
> - } else
> + } else {
>   un = NULL;
> + rcu_read_lock();
> + }
>  
> - INIT_LIST_HEAD();
> -
> - rcu_read_lock();
>   sma = sem_obtain_object_check(ns, semid);
>   if (IS_ERR(sma)) {
> - if (un)
> - rcu_read_unlock();
> + rcu_read_unlock();
>   error = PTR_ERR(sma);
>   goto out_free;
>   }
> @@ -1693,22 +1694,8 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf 
> __user *, tsops,
>*/
>   error = -EIDRM;
>   locknum = sem_lock(sma, sops, nsops);
> - if (un) {
> - if (un->semid == -1) {
> - rcu_read_unlock();
> - goto out_unlock_free;
> - } else {
> - /*
> -  * rcu lock can be released, "un" cannot disappear:
> -  * - sem_lock is acquired, thus IPC_RMID is
> -  *   impossible.
> -  * - exit_sem is impossible, it always operates on
> -  *   current (or a dead task).
> -  */
> -
> - rcu_read_unlock();
> - }
> - }
> + if (un && un->semid == -1)
> + goto out_unlock_free;
>  
>   error = try_atomic_semop (sma, sops, nsops, un, task_tgid_vnr(current));
>   if (error <= 0) {
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-04-04 Thread Mike Galbraith
On Tue, 2013-03-26 at 16:00 -0400, Rik van Riel wrote: 
 On Tue, 26 Mar 2013 14:07:14 -0400
 Sasha Levin sasha.le...@oracle.com wrote:
 
   Not necessarily, we do release everything at the end of the function: 
   out_unlock_free:
 sem_unlock(sma, locknum);
  
  Ow, there's a rcu_read_unlock() in sem_unlock()? This complicates things 
  even
  more I suspect. If un is non-NULL we'll be unlocking rcu lock twice?
 
 Sasha, this patch should resolve the RCU tangle, by making sure
 we only ever take the rcu_read_lock once in semtimedop.
 
 ---8---
 
 The ipc semaphore code has a nasty RCU locking tangle, with both
 find_alloc_undo and semtimedop taking the rcu_read_lock(). The
 code can be cleaned up somewhat by only taking the rcu_read_lock
 once.
 
 There are no other callers to find_alloc_undo.
 
 This should also solve the trinity issue reported by Sasha Levin.

I take it this is on top of the patchlet Sasha submitted?

(I hit rcu stall banging on patch set in rt with 60 synchronized core
executive model if I let it run long enough, fwtw)

 Reported-by: Sasha Levin sasha.le...@oracle.com
 Signed-off-by: Rik van Riel r...@redhat.com
 ---
  ipc/sem.c |   31 +--
  1 files changed, 9 insertions(+), 22 deletions(-)
 
 diff --git a/ipc/sem.c b/ipc/sem.c
 index f46441a..2ec2945 100644
 --- a/ipc/sem.c
 +++ b/ipc/sem.c
 @@ -1646,22 +1646,23 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf 
 __user *, tsops,
   alter = 1;
   }
  
 + INIT_LIST_HEAD(tasks);
 +
   if (undos) {
 + /* On success, find_alloc_undo takes the rcu_read_lock */
   un = find_alloc_undo(ns, semid);
   if (IS_ERR(un)) {
   error = PTR_ERR(un);
   goto out_free;
   }
 - } else
 + } else {
   un = NULL;
 + rcu_read_lock();
 + }
  
 - INIT_LIST_HEAD(tasks);
 -
 - rcu_read_lock();
   sma = sem_obtain_object_check(ns, semid);
   if (IS_ERR(sma)) {
 - if (un)
 - rcu_read_unlock();
 + rcu_read_unlock();
   error = PTR_ERR(sma);
   goto out_free;
   }
 @@ -1693,22 +1694,8 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf 
 __user *, tsops,
*/
   error = -EIDRM;
   locknum = sem_lock(sma, sops, nsops);
 - if (un) {
 - if (un-semid == -1) {
 - rcu_read_unlock();
 - goto out_unlock_free;
 - } else {
 - /*
 -  * rcu lock can be released, un cannot disappear:
 -  * - sem_lock is acquired, thus IPC_RMID is
 -  *   impossible.
 -  * - exit_sem is impossible, it always operates on
 -  *   current (or a dead task).
 -  */
 -
 - rcu_read_unlock();
 - }
 - }
 + if (un  un-semid == -1)
 + goto out_unlock_free;
  
   error = try_atomic_semop (sma, sops, nsops, un, task_tgid_vnr(current));
   if (error = 0) {
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-03-30 Thread Davidlohr Bueso
On Sat, 2013-03-30 at 21:30 -0400, Rik van Riel wrote:
> On 03/30/2013 09:35 AM, Sasha Levin wrote:
> 
> > I'm thinking that the solution is as simple as:
> 
> Your patch is absolutely correct.  All it needs now is your
> signed-off-by, so Andrew can merge it into -mm :)
> 
> Reviewed-by: Rik van Riel 

Reviewed-by: Davidlohr Bueso 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-03-30 Thread Rik van Riel

On 03/30/2013 09:35 AM, Sasha Levin wrote:


I'm thinking that the solution is as simple as:


Your patch is absolutely correct.  All it needs now is your
signed-off-by, so Andrew can merge it into -mm :)

Reviewed-by: Rik van Riel 


diff --git a/ipc/sem.c b/ipc/sem.c
index 6e109ef..ac36671 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -1333,8 +1333,10 @@ static int semctl_main(struct ipc_namespace *ns, int 
semid, int semnum,
 /* GETVAL, GETPID, GETNCTN, GETZCNT: fall-through */
 }
 err = -EINVAL;
-   if(semnum < 0 || semnum >= nsems)
-   goto out_unlock;
+   if(semnum < 0 || semnum >= nsems) {
+   rcu_read_unlock();
+   goto out_wakeup;
+   }

 sem_lock(sma, NULL, -1);
 curr = >sem_base[semnum];

But I'm not 100% sure if I don't mess up anything else.


I checked the surrounding code, it all looks fine.

--
All rights reversed.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-03-30 Thread Sasha Levin
On 03/28/2013 11:32 AM, Rik van Riel wrote:
> On Tue, 26 Mar 2013 13:33:07 -0400
> Sasha Levin  wrote:
> 
>> > [   96.347341] 
>> > [   96.348085] [ BUG: lock held when returning to user space! ]
>> > [   96.348834] 3.9.0-rc4-next-20130326-sasha-00011-gbcb2313 #318 Tainted: 
>> > GW
>> > [   96.360300] 
>> > [   96.361084] trinity-child9/7583 is leaving the kernel with locks still 
>> > held!
>> > [   96.362019] 1 lock held by trinity-child9/7583:
>> > [   96.362610]  #0:  (rcu_read_lock){.+.+..}, at: [] 
>> > SYSC_semtimedop+0x1fb/0xec0
>> > 
>> > It seems that we can leave semtimedop without releasing the rcu read lock.
> Sasha, this patch untangles the RCU locking with find_alloc_undo,
> and should fix the above issue. As a side benefit, this makes the
> code a little cleaner.
> 
> Next up: implement locking in a way that does not trigger any 
> lockdep warnings...

The following is mostly unrelated to this patch but close enough:

semctl_main() has a snippet that looks like this:

err = -EINVAL;
if(semnum < 0 || semnum >= nsems)
goto out_unlock;

sem_lock(sma, NULL, -1);

Which means we'll try unlocking the sma without trying to lock it first.
It makes lockdep unhappy:

[   95.528492] =
[   95.529251] [ BUG: bad unlock balance detected! ]
[   95.529897] 3.9.0-rc4-next-20130328-sasha-00014-g91a3267 #319 Tainted: G 
   W
[   95.530190] -
[   95.530190] trinity-child14/9123 is trying to release lock 
(&(>lock)->rlock) at:
[   95.530190] [] semctl_main+0xe54/0xf00
[   95.530190] but there are no more locks to release!
[   95.530190]
[   95.530190] other info that might help us debug this:
[   95.530190] 1 lock held by trinity-child14/9123:
[   95.530190]  #0:  (rcu_read_lock){.+.+..}, at: [] 
semctl_main+0x0/0xf00
[   95.530190]
[   95.530190] stack backtrace:
[   95.530190] Pid: 9123, comm: trinity-child14 Tainted: GW
3.9.0-rc4-next-20130328-sasha-00014-g91a3267 #319
[   95.530190] Call Trace:
[   95.530190]  [] ? semctl_main+0xe54/0xf00
[   95.530190]  [] print_unlock_imbalance_bug+0xf6/0x110
[   95.530190]  [] ? semctl_main+0xe54/0xf00
[   95.530190]  [] lock_release_non_nested+0xd5/0x320
[   95.530190]  [] ? __do_fault+0x42b/0x530
[   95.530190]  [] ? get_lock_stats+0x22/0x70
[   95.530190]  [] ? put_lock_stats.isra.14+0xe/0x40
[   95.530190]  [] ? semctl_main+0xe54/0xf00
[   95.530190]  [] lock_release+0x29e/0x3b0
[   95.530190]  [] ? security_ipc_permission+0x14/0x20
[   95.530190]  [] _raw_spin_unlock+0x1e/0x60
[   95.530190]  [] semctl_main+0xe54/0xf00
[   95.530190]  [] ? SYSC_semtimedop+0xe30/0xe30
[   95.530190]  [] ? kvm_clock_read+0x38/0x70
[   95.530190]  [] ? sched_clock_local+0x25/0xa0
[   95.530190]  [] ? sched_clock_cpu+0xf8/0x110
[   95.530190]  [] ? __do_page_fault+0x514/0x5e0
[   95.530190]  [] ? get_lock_stats+0x22/0x70
[   95.530190]  [] ? put_lock_stats.isra.14+0xe/0x40
[   95.530190]  [] ? __do_page_fault+0x514/0x5e0
[   95.530190]  [] ? up_read+0x1e/0x40
[   95.530190]  [] ? __do_page_fault+0x514/0x5e0
[   95.530190]  [] ? rcu_eqs_exit_common+0x60/0x260
[   95.530190]  [] ? user_enter+0xfd/0x130
[   95.530190]  [] ? user_exit+0xb5/0xe0
[   95.530190]  [] SyS_semctl+0x69/0x430
[   95.530190]  [] ? syscall_trace_enter+0x20/0x2e0
[   95.530190]  [] tracesys+0xe1/0xe6

I'm thinking that the solution is as simple as:

diff --git a/ipc/sem.c b/ipc/sem.c
index 6e109ef..ac36671 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -1333,8 +1333,10 @@ static int semctl_main(struct ipc_namespace *ns, int 
semid, int semnum,
/* GETVAL, GETPID, GETNCTN, GETZCNT: fall-through */
}
err = -EINVAL;
-   if(semnum < 0 || semnum >= nsems)
-   goto out_unlock;
+   if(semnum < 0 || semnum >= nsems) {
+   rcu_read_unlock();
+   goto out_wakeup;
+   }

sem_lock(sma, NULL, -1);
curr = >sem_base[semnum];

But I'm not 100% sure if I don't mess up anything else.


Thanks,
Sasha
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-03-30 Thread Sasha Levin
On 03/28/2013 11:32 AM, Rik van Riel wrote:
 On Tue, 26 Mar 2013 13:33:07 -0400
 Sasha Levin sasha.le...@oracle.com wrote:
 
  [   96.347341] 
  [   96.348085] [ BUG: lock held when returning to user space! ]
  [   96.348834] 3.9.0-rc4-next-20130326-sasha-00011-gbcb2313 #318 Tainted: 
  GW
  [   96.360300] 
  [   96.361084] trinity-child9/7583 is leaving the kernel with locks still 
  held!
  [   96.362019] 1 lock held by trinity-child9/7583:
  [   96.362610]  #0:  (rcu_read_lock){.+.+..}, at: [8192eafb] 
  SYSC_semtimedop+0x1fb/0xec0
  
  It seems that we can leave semtimedop without releasing the rcu read lock.
 Sasha, this patch untangles the RCU locking with find_alloc_undo,
 and should fix the above issue. As a side benefit, this makes the
 code a little cleaner.
 
 Next up: implement locking in a way that does not trigger any 
 lockdep warnings...

The following is mostly unrelated to this patch but close enough:

semctl_main() has a snippet that looks like this:

err = -EINVAL;
if(semnum  0 || semnum = nsems)
goto out_unlock;

sem_lock(sma, NULL, -1);

Which means we'll try unlocking the sma without trying to lock it first.
It makes lockdep unhappy:

[   95.528492] =
[   95.529251] [ BUG: bad unlock balance detected! ]
[   95.529897] 3.9.0-rc4-next-20130328-sasha-00014-g91a3267 #319 Tainted: G 
   W
[   95.530190] -
[   95.530190] trinity-child14/9123 is trying to release lock 
((new-lock)-rlock) at:
[   95.530190] [8192f8e4] semctl_main+0xe54/0xf00
[   95.530190] but there are no more locks to release!
[   95.530190]
[   95.530190] other info that might help us debug this:
[   95.530190] 1 lock held by trinity-child14/9123:
[   95.530190]  #0:  (rcu_read_lock){.+.+..}, at: [8192ea90] 
semctl_main+0x0/0xf00
[   95.530190]
[   95.530190] stack backtrace:
[   95.530190] Pid: 9123, comm: trinity-child14 Tainted: GW
3.9.0-rc4-next-20130328-sasha-00014-g91a3267 #319
[   95.530190] Call Trace:
[   95.530190]  [8192f8e4] ? semctl_main+0xe54/0xf00
[   95.530190]  [8117b7e6] print_unlock_imbalance_bug+0xf6/0x110
[   95.530190]  [8192f8e4] ? semctl_main+0xe54/0xf00
[   95.530190]  [81180a35] lock_release_non_nested+0xd5/0x320
[   95.530190]  [8122e3ab] ? __do_fault+0x42b/0x530
[   95.530190]  [81179da2] ? get_lock_stats+0x22/0x70
[   95.530190]  [81179e5e] ? put_lock_stats.isra.14+0xe/0x40
[   95.530190]  [8192f8e4] ? semctl_main+0xe54/0xf00
[   95.530190]  [81180f1e] lock_release+0x29e/0x3b0
[   95.530190]  [819451f4] ? security_ipc_permission+0x14/0x20
[   95.530190]  [83daf33e] _raw_spin_unlock+0x1e/0x60
[   95.530190]  [8192f8e4] semctl_main+0xe54/0xf00
[   95.530190]  [8192ea90] ? SYSC_semtimedop+0xe30/0xe30
[   95.530190]  [8109d188] ? kvm_clock_read+0x38/0x70
[   95.530190]  [8114feb5] ? sched_clock_local+0x25/0xa0
[   95.530190]  [811500e8] ? sched_clock_cpu+0xf8/0x110
[   95.530190]  [83db3814] ? __do_page_fault+0x514/0x5e0
[   95.530190]  [81179da2] ? get_lock_stats+0x22/0x70
[   95.530190]  [81179e5e] ? put_lock_stats.isra.14+0xe/0x40
[   95.530190]  [83db3814] ? __do_page_fault+0x514/0x5e0
[   95.530190]  [8113dc0e] ? up_read+0x1e/0x40
[   95.530190]  [83db3814] ? __do_page_fault+0x514/0x5e0
[   95.530190]  [811c6500] ? rcu_eqs_exit_common+0x60/0x260
[   95.530190]  [81202b9d] ? user_enter+0xfd/0x130
[   95.530190]  [81202c85] ? user_exit+0xb5/0xe0
[   95.530190]  [8192faf9] SyS_semctl+0x69/0x430
[   95.530190]  [81076ea0] ? syscall_trace_enter+0x20/0x2e0
[   95.530190]  [83db7d18] tracesys+0xe1/0xe6

I'm thinking that the solution is as simple as:

diff --git a/ipc/sem.c b/ipc/sem.c
index 6e109ef..ac36671 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -1333,8 +1333,10 @@ static int semctl_main(struct ipc_namespace *ns, int 
semid, int semnum,
/* GETVAL, GETPID, GETNCTN, GETZCNT: fall-through */
}
err = -EINVAL;
-   if(semnum  0 || semnum = nsems)
-   goto out_unlock;
+   if(semnum  0 || semnum = nsems) {
+   rcu_read_unlock();
+   goto out_wakeup;
+   }

sem_lock(sma, NULL, -1);
curr = sma-sem_base[semnum];

But I'm not 100% sure if I don't mess up anything else.


Thanks,
Sasha
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-03-30 Thread Rik van Riel

On 03/30/2013 09:35 AM, Sasha Levin wrote:


I'm thinking that the solution is as simple as:


Your patch is absolutely correct.  All it needs now is your
signed-off-by, so Andrew can merge it into -mm :)

Reviewed-by: Rik van Riel r...@redhat.com


diff --git a/ipc/sem.c b/ipc/sem.c
index 6e109ef..ac36671 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -1333,8 +1333,10 @@ static int semctl_main(struct ipc_namespace *ns, int 
semid, int semnum,
 /* GETVAL, GETPID, GETNCTN, GETZCNT: fall-through */
 }
 err = -EINVAL;
-   if(semnum  0 || semnum = nsems)
-   goto out_unlock;
+   if(semnum  0 || semnum = nsems) {
+   rcu_read_unlock();
+   goto out_wakeup;
+   }

 sem_lock(sma, NULL, -1);
 curr = sma-sem_base[semnum];

But I'm not 100% sure if I don't mess up anything else.


I checked the surrounding code, it all looks fine.

--
All rights reversed.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-03-30 Thread Davidlohr Bueso
On Sat, 2013-03-30 at 21:30 -0400, Rik van Riel wrote:
 On 03/30/2013 09:35 AM, Sasha Levin wrote:
 
  I'm thinking that the solution is as simple as:
 
 Your patch is absolutely correct.  All it needs now is your
 signed-off-by, so Andrew can merge it into -mm :)
 
 Reviewed-by: Rik van Riel r...@redhat.com

Reviewed-by: Davidlohr Bueso davidlohr.bu...@hp.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-03-28 Thread Sasha Levin
On 03/28/2013 09:00 PM, Michel Lespinasse wrote:
> On Thu, Mar 28, 2013 at 8:32 AM, Rik van Riel  wrote:
>> The ipc semaphore code has a nasty RCU locking tangle, with both
>> find_alloc_undo and semtimedop taking the rcu_read_lock(). The
>> code can be cleaned up somewhat by only taking the rcu_read_lock
>> once.
>>
>> The only caller of find_alloc_undo is in semtimedop.
>>
>> This should solve the trinity issue reported by Sasha Levin.
>>
>> Reported-by: Sasha Levin 
>> Signed-off-by: Rik van Riel 
> 
> Looks good^Wbetter. I have nothing specific to say other than I've
> been staring at it for 10 minutes :)
> 
> Reviewed-by: Michel Lespinasse 

And the warnings are gone:

Tested-by: Sasha Levin 


Thanks,
Sasha
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-03-28 Thread Michel Lespinasse
On Thu, Mar 28, 2013 at 8:32 AM, Rik van Riel  wrote:
> The ipc semaphore code has a nasty RCU locking tangle, with both
> find_alloc_undo and semtimedop taking the rcu_read_lock(). The
> code can be cleaned up somewhat by only taking the rcu_read_lock
> once.
>
> The only caller of find_alloc_undo is in semtimedop.
>
> This should solve the trinity issue reported by Sasha Levin.
>
> Reported-by: Sasha Levin 
> Signed-off-by: Rik van Riel 

Looks good^Wbetter. I have nothing specific to say other than I've
been staring at it for 10 minutes :)

Reviewed-by: Michel Lespinasse 

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-03-28 Thread Davidlohr Bueso
On Thu, 2013-03-28 at 11:32 -0400, Rik van Riel wrote:
> On Tue, 26 Mar 2013 13:33:07 -0400
> Sasha Levin  wrote:
> 
> > [   96.347341] 
> > [   96.348085] [ BUG: lock held when returning to user space! ]
> > [   96.348834] 3.9.0-rc4-next-20130326-sasha-00011-gbcb2313 #318 Tainted: G 
> >W
> > [   96.360300] 
> > [   96.361084] trinity-child9/7583 is leaving the kernel with locks still 
> > held!
> > [   96.362019] 1 lock held by trinity-child9/7583:
> > [   96.362610]  #0:  (rcu_read_lock){.+.+..}, at: [] 
> > SYSC_semtimedop+0x1fb/0xec0
> > 
> > It seems that we can leave semtimedop without releasing the rcu read lock.
> 
> Sasha, this patch untangles the RCU locking with find_alloc_undo,
> and should fix the above issue. As a side benefit, this makes the
> code a little cleaner.
> 
> Next up: implement locking in a way that does not trigger any 
> lockdep warnings...
> 
> ---8<---
> 
> Subject: ipc,sem: untangle RCU locking with find_alloc_undo
> 
> The ipc semaphore code has a nasty RCU locking tangle, with both
> find_alloc_undo and semtimedop taking the rcu_read_lock(). The
> code can be cleaned up somewhat by only taking the rcu_read_lock
> once.

indeed!

> 
> The only caller of find_alloc_undo is in semtimedop.
> 
> This should solve the trinity issue reported by Sasha Levin.
> 
> Reported-by: Sasha Levin 
> Signed-off-by: Rik van Riel 
> ---
>  ipc/sem.c |   31 +--
>  1 files changed, 9 insertions(+), 22 deletions(-)
> 
> diff --git a/ipc/sem.c b/ipc/sem.c
> index f46441a..2ec2945 100644
> --- a/ipc/sem.c
> +++ b/ipc/sem.c
> @@ -1646,22 +1646,23 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf 
> __user *, tsops,
>   alter = 1;
>   }
>  
> + INIT_LIST_HEAD();
> +
>   if (undos) {
> + /* On success, find_alloc_undo takes the rcu_read_lock */
>   un = find_alloc_undo(ns, semid);

find_alloc_undo() has some nested rcu_read_locks of its own. We can
simplify that as well. Will look into it, but don't want to introduce
any more changes until we address all the issues with the patchset, and
know it to behave.

>   if (IS_ERR(un)) {
>   error = PTR_ERR(un);
>   goto out_free;
>   }
> - } else
> + } else {
>   un = NULL;
> + rcu_read_lock();
> + }
>  
> - INIT_LIST_HEAD();
> -
> - rcu_read_lock();
>   sma = sem_obtain_object_check(ns, semid);
>   if (IS_ERR(sma)) {
> - if (un)
> - rcu_read_unlock();
> + rcu_read_unlock();
>   error = PTR_ERR(sma);
>   goto out_free;
>   }
> @@ -1693,22 +1694,8 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf 
> __user *, tsops,
>*/
>   error = -EIDRM;
>   locknum = sem_lock(sma, sops, nsops);
> - if (un) {
> - if (un->semid == -1) {
> - rcu_read_unlock();
> - goto out_unlock_free;
> - } else {
> - /*
> -  * rcu lock can be released, "un" cannot disappear:
> -  * - sem_lock is acquired, thus IPC_RMID is
> -  *   impossible.
> -  * - exit_sem is impossible, it always operates on
> -  *   current (or a dead task).
> -  */
> -
> - rcu_read_unlock();
> - }
> - }
> + if (un && un->semid == -1)
> + goto out_unlock_free;

Yeah, I was tempted in doing something much like this, but didn't want
to change any existing logic. Hopefully we can get away with this and it
fixes Sasha's issue.

>  
>   error = try_atomic_semop (sma, sops, nsops, un, task_tgid_vnr(current));
>   if (error <= 0) {

Reviewed-by: Davidlohr Bueso 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-03-28 Thread Davidlohr Bueso
On Thu, 2013-03-28 at 11:32 -0400, Rik van Riel wrote:
 On Tue, 26 Mar 2013 13:33:07 -0400
 Sasha Levin sasha.le...@oracle.com wrote:
 
  [   96.347341] 
  [   96.348085] [ BUG: lock held when returning to user space! ]
  [   96.348834] 3.9.0-rc4-next-20130326-sasha-00011-gbcb2313 #318 Tainted: G 
 W
  [   96.360300] 
  [   96.361084] trinity-child9/7583 is leaving the kernel with locks still 
  held!
  [   96.362019] 1 lock held by trinity-child9/7583:
  [   96.362610]  #0:  (rcu_read_lock){.+.+..}, at: [8192eafb] 
  SYSC_semtimedop+0x1fb/0xec0
  
  It seems that we can leave semtimedop without releasing the rcu read lock.
 
 Sasha, this patch untangles the RCU locking with find_alloc_undo,
 and should fix the above issue. As a side benefit, this makes the
 code a little cleaner.
 
 Next up: implement locking in a way that does not trigger any 
 lockdep warnings...
 
 ---8---
 
 Subject: ipc,sem: untangle RCU locking with find_alloc_undo
 
 The ipc semaphore code has a nasty RCU locking tangle, with both
 find_alloc_undo and semtimedop taking the rcu_read_lock(). The
 code can be cleaned up somewhat by only taking the rcu_read_lock
 once.

indeed!

 
 The only caller of find_alloc_undo is in semtimedop.
 
 This should solve the trinity issue reported by Sasha Levin.
 
 Reported-by: Sasha Levin sasha.le...@oracle.com
 Signed-off-by: Rik van Riel r...@redhat.com
 ---
  ipc/sem.c |   31 +--
  1 files changed, 9 insertions(+), 22 deletions(-)
 
 diff --git a/ipc/sem.c b/ipc/sem.c
 index f46441a..2ec2945 100644
 --- a/ipc/sem.c
 +++ b/ipc/sem.c
 @@ -1646,22 +1646,23 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf 
 __user *, tsops,
   alter = 1;
   }
  
 + INIT_LIST_HEAD(tasks);
 +
   if (undos) {
 + /* On success, find_alloc_undo takes the rcu_read_lock */
   un = find_alloc_undo(ns, semid);

find_alloc_undo() has some nested rcu_read_locks of its own. We can
simplify that as well. Will look into it, but don't want to introduce
any more changes until we address all the issues with the patchset, and
know it to behave.

   if (IS_ERR(un)) {
   error = PTR_ERR(un);
   goto out_free;
   }
 - } else
 + } else {
   un = NULL;
 + rcu_read_lock();
 + }
  
 - INIT_LIST_HEAD(tasks);
 -
 - rcu_read_lock();
   sma = sem_obtain_object_check(ns, semid);
   if (IS_ERR(sma)) {
 - if (un)
 - rcu_read_unlock();
 + rcu_read_unlock();
   error = PTR_ERR(sma);
   goto out_free;
   }
 @@ -1693,22 +1694,8 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf 
 __user *, tsops,
*/
   error = -EIDRM;
   locknum = sem_lock(sma, sops, nsops);
 - if (un) {
 - if (un-semid == -1) {
 - rcu_read_unlock();
 - goto out_unlock_free;
 - } else {
 - /*
 -  * rcu lock can be released, un cannot disappear:
 -  * - sem_lock is acquired, thus IPC_RMID is
 -  *   impossible.
 -  * - exit_sem is impossible, it always operates on
 -  *   current (or a dead task).
 -  */
 -
 - rcu_read_unlock();
 - }
 - }
 + if (un  un-semid == -1)
 + goto out_unlock_free;

Yeah, I was tempted in doing something much like this, but didn't want
to change any existing logic. Hopefully we can get away with this and it
fixes Sasha's issue.

  
   error = try_atomic_semop (sma, sops, nsops, un, task_tgid_vnr(current));
   if (error = 0) {

Reviewed-by: Davidlohr Bueso davidlohr.bu...@hp.com


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-03-28 Thread Michel Lespinasse
On Thu, Mar 28, 2013 at 8:32 AM, Rik van Riel r...@surriel.com wrote:
 The ipc semaphore code has a nasty RCU locking tangle, with both
 find_alloc_undo and semtimedop taking the rcu_read_lock(). The
 code can be cleaned up somewhat by only taking the rcu_read_lock
 once.

 The only caller of find_alloc_undo is in semtimedop.

 This should solve the trinity issue reported by Sasha Levin.

 Reported-by: Sasha Levin sasha.le...@oracle.com
 Signed-off-by: Rik van Riel r...@redhat.com

Looks good^Wbetter. I have nothing specific to say other than I've
been staring at it for 10 minutes :)

Reviewed-by: Michel Lespinasse wal...@google.com

-- 
Michel Walken Lespinasse
A program is never fully debugged until the last user dies.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo

2013-03-28 Thread Sasha Levin
On 03/28/2013 09:00 PM, Michel Lespinasse wrote:
 On Thu, Mar 28, 2013 at 8:32 AM, Rik van Riel r...@surriel.com wrote:
 The ipc semaphore code has a nasty RCU locking tangle, with both
 find_alloc_undo and semtimedop taking the rcu_read_lock(). The
 code can be cleaned up somewhat by only taking the rcu_read_lock
 once.

 The only caller of find_alloc_undo is in semtimedop.

 This should solve the trinity issue reported by Sasha Levin.

 Reported-by: Sasha Levin sasha.le...@oracle.com
 Signed-off-by: Rik van Riel r...@redhat.com
 
 Looks good^Wbetter. I have nothing specific to say other than I've
 been staring at it for 10 minutes :)
 
 Reviewed-by: Michel Lespinasse wal...@google.com

And the warnings are gone:

Tested-by: Sasha Levin sasha.le...@oracle.com


Thanks,
Sasha
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/