Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-03 Thread Ingo Molnar

* Sergey Senozhatsky  wrote:

> On (02/03/16 09:04), Ingo Molnar wrote:
> > * Sergey Senozhatsky  wrote:
> > 
> > > On (02/03/16 08:28), Ingo Molnar wrote:
> > > [..]
> > > > So why not move printk away from semaphores? Semaphores are classical 
> > > > constructs 
> > > > that have legacies and are somewhat non-obvious to use, compared to 
> > > > modern, 
> > > > simpler locking primitives. I'd not touch their implementation, unless 
> > > > we are 
> > > > absolutely sure this is a safe optimization.
> > > 
> > > semaphore's spin_lock is not the only spin lock that printk acquires. it 
> > > also 
> > > takes the logbuf_lock (and different locks in console drivers (up to 
> > > console 
> > > driver)).
> > >
> > > Jan Kara posted a patch that offloads printing job 
> > > (console_trylock()-console_unlock()) from printk() call (when printk can 
> > > offload 
> > > it). so semaphore and console driver's locks will go away (mostly) with 
> > > Jan's 
> > > patch. logbug spin_lock, however, will stay.
> > 
> > Well, but this patch of yours only affects the semaphore code, so it does 
> > not 
> > change the logbuf_lock situation.
> 
> yes, correct. I just said for the info that there is already 'move printk 
> away from
> console_sem' work in progress. Well, the reason for that work is entirely 
> different,
> though, but this console_sem recursion and console driver's lock recursion 
> can be
> 'fixed as a side effect'.
> 
> > Furthermore, logbuf_lock already has recursion protection:
> > 
> > /*
> >  * Ouch, printk recursed into itself!
> >  */
> > if (unlikely(logbuf_cpu == this_cpu)) {
> 
> it's good, no doubt. but it doesn't work in all of the cases. a simple one is:
> 
> vprintk_emit()
> ...
>   raw_spin_lock(_lock);
>   logbuf_cpu = this_cpu;

Yes, I'm aware of that, and as I said:

> > (There are other ways to get the logbuf_lock - if those are still 
> > triggerable 
> > then they should be fixed.)

The proper way to fix it would be to factor out the recursion-safe logbuf_lock 
taking code into logbuf_lock()/logbuf_unlock() primitives and use those 
consistently in printk.c.

>   ...
>   logbuf_cpu = UINT_MAX;
>   raw_spin_unlock(_lock);  <<  SPIN_BUG_ON
> ...
> 
> if raw_spin_unlock() calls SPIN_BUG_ON, then logbuf_lock recursion detection 
> can't
> help. we recurse into vprintk_emit() with logbuf_lock locked and logbuf_cpu 
> != this_cpu.

If recursion-safe logbuf_lock taking is factored out and extended to other 
functions in printk.c then this problem is solved.

> Peter Hurley also posted the following case (I'll quote):
> 
>   serial8250_do_set_termios()
> spin_lock_irqsave()  ** claim port lock **
> ...
> serial_port_out(port, UART_LCR, );
>   dw8250_serial_out()
> dev_err()
>   vprintk_emit()
> console_trylock()
>   call_console_drivers()
> serial8250_console_write()
>   spin_lock_irqsave()  ** port lock **
>   ** DEADLOCK **

That too seems to be avoided if vprintk_emit() does not take logbuf_lock 
naively.

So I repeat my point:

> > In any case, recursion protection is generally done in the debugging 
> > facilities 
> > trying to behave lockless.

I.e. to address these deadlocks, printk() should be made fundamentally more 
robust, extending its already existing recursion protection logic to remaining 
parts of printk.c.

That was my expectation when I committed the first variant of printk() 
recursion 
protection code a few years ago, it just never happened.

Thanks,

Ingo


Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-03 Thread Sergey Senozhatsky
On (02/03/16 17:12), Byungchul Park wrote:
> On Wed, Feb 03, 2016 at 04:42:23PM +0900, Sergey Senozhatsky wrote:
> > On (02/03/16 08:28), Ingo Molnar wrote:
> > [..]
> > > So why not move printk away from semaphores? Semaphores are classical 
> > > constructs 
> > > that have legacies and are somewhat non-obvious to use, compared to 
> > > modern, 
> > > simpler locking primitives. I'd not touch their implementation, unless we 
> > > are 
> > > absolutely sure this is a safe optimization.
> > 
> > semaphore's spin_lock is not the only spin lock that printk acquires. it 
> > also takes the
> > logbuf_lock (and different locks in console drivers (up to console driver)).
> > 
> > Jan Kara posted a patch that offloads printing job 
> > (console_trylock()-console_unlock())
> > from printk() call (when printk can offload it). so semaphore and console 
> > driver's locks
> > will go away (mostly) with Jan's patch. logbug spin_lock, however, will 
> > stay.
> 
> It sounds good. Could you teach me how to see the patch by Jan?

sure,

https://lkml.org/lkml/2015/10/26/16
and
http://marc.info/?l=linux-kernel=145079206822115

-ss


Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-03 Thread Sergey Senozhatsky
On (02/03/16 09:04), Ingo Molnar wrote:
> * Sergey Senozhatsky  wrote:
> 
> > On (02/03/16 08:28), Ingo Molnar wrote:
> > [..]
> > > So why not move printk away from semaphores? Semaphores are classical 
> > > constructs 
> > > that have legacies and are somewhat non-obvious to use, compared to 
> > > modern, 
> > > simpler locking primitives. I'd not touch their implementation, unless we 
> > > are 
> > > absolutely sure this is a safe optimization.
> > 
> > semaphore's spin_lock is not the only spin lock that printk acquires. it 
> > also 
> > takes the logbuf_lock (and different locks in console drivers (up to 
> > console 
> > driver)).
> >
> > Jan Kara posted a patch that offloads printing job 
> > (console_trylock()-console_unlock()) from printk() call (when printk can 
> > offload 
> > it). so semaphore and console driver's locks will go away (mostly) with 
> > Jan's 
> > patch. logbug spin_lock, however, will stay.
> 
> Well, but this patch of yours only affects the semaphore code, so it does not 
> change the logbuf_lock situation.

yes, correct. I just said for the info that there is already 'move printk away 
from
console_sem' work in progress. Well, the reason for that work is entirely 
different,
though, but this console_sem recursion and console driver's lock recursion can 
be
'fixed as a side effect'.

> Furthermore, logbuf_lock already has recursion protection:
> 
> /*
>  * Ouch, printk recursed into itself!
>  */
> if (unlikely(logbuf_cpu == this_cpu)) {

it's good, no doubt. but it doesn't work in all of the cases. a simple one is:

vprintk_emit()
...
raw_spin_lock(_lock);
logbuf_cpu = this_cpu;
...
logbuf_cpu = UINT_MAX;
raw_spin_unlock(_lock);  <<  SPIN_BUG_ON
...

if raw_spin_unlock() calls SPIN_BUG_ON, then logbuf_lock recursion detection 
can't
help. we recurse into vprintk_emit() with logbuf_lock locked and logbuf_cpu != 
this_cpu.

Peter Hurley also posted the following case (I'll quote):

  serial8250_do_set_termios()
spin_lock_irqsave()  ** claim port lock **
...
serial_port_out(port, UART_LCR, );
  dw8250_serial_out()
dev_err()
  vprintk_emit()
console_trylock()
  call_console_drivers()
serial8250_console_write()
  spin_lock_irqsave()  ** port lock **
  ** DEADLOCK **

-ss

> so it should not be possible to re-enter the printk() logbuf_lock critical 
> section 
> from the spinlock code. (There are other ways to get the logbuf_lock - if 
> those 
> are still triggerable then they should be fixed.)
> 
> In any case, recursion protection is generally done in the debugging 
> facilities 
> trying to behave lockless.
> 
> Thanks,
> 
>   Ingo
> 


Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-03 Thread Byungchul Park
On Wed, Feb 03, 2016 at 04:42:23PM +0900, Sergey Senozhatsky wrote:
> On (02/03/16 08:28), Ingo Molnar wrote:
> [..]
> > So why not move printk away from semaphores? Semaphores are classical 
> > constructs 
> > that have legacies and are somewhat non-obvious to use, compared to modern, 
> > simpler locking primitives. I'd not touch their implementation, unless we 
> > are 
> > absolutely sure this is a safe optimization.
> 
> semaphore's spin_lock is not the only spin lock that printk acquires. it also 
> takes the
> logbuf_lock (and different locks in console drivers (up to console driver)).
> 
> Jan Kara posted a patch that offloads printing job 
> (console_trylock()-console_unlock())
> from printk() call (when printk can offload it). so semaphore and console 
> driver's locks
> will go away (mostly) with Jan's patch. logbug spin_lock, however, will stay.

It sounds good. Could you teach me how to see the patch by Jan?

> 
>   -ss


Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-03 Thread Ingo Molnar

* Sergey Senozhatsky  wrote:

> On (02/03/16 08:28), Ingo Molnar wrote:
> [..]
> > So why not move printk away from semaphores? Semaphores are classical 
> > constructs 
> > that have legacies and are somewhat non-obvious to use, compared to modern, 
> > simpler locking primitives. I'd not touch their implementation, unless we 
> > are 
> > absolutely sure this is a safe optimization.
> 
> semaphore's spin_lock is not the only spin lock that printk acquires. it also 
> takes the logbuf_lock (and different locks in console drivers (up to console 
> driver)).
>
> Jan Kara posted a patch that offloads printing job 
> (console_trylock()-console_unlock()) from printk() call (when printk can 
> offload 
> it). so semaphore and console driver's locks will go away (mostly) with Jan's 
> patch. logbug spin_lock, however, will stay.

Well, but this patch of yours only affects the semaphore code, so it does not 
change the logbuf_lock situation.

Furthermore, logbuf_lock already has recursion protection:

/*
 * Ouch, printk recursed into itself!
 */
if (unlikely(logbuf_cpu == this_cpu)) {

so it should not be possible to re-enter the printk() logbuf_lock critical 
section 
from the spinlock code. (There are other ways to get the logbuf_lock - if those 
are still triggerable then they should be fixed.)

In any case, recursion protection is generally done in the debugging facilities 
trying to behave lockless.

Thanks,

Ingo


Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-03 Thread Ingo Molnar

* Sergey Senozhatsky  wrote:

> On (02/03/16 08:28), Ingo Molnar wrote:
> [..]
> > So why not move printk away from semaphores? Semaphores are classical 
> > constructs 
> > that have legacies and are somewhat non-obvious to use, compared to modern, 
> > simpler locking primitives. I'd not touch their implementation, unless we 
> > are 
> > absolutely sure this is a safe optimization.
> 
> semaphore's spin_lock is not the only spin lock that printk acquires. it also 
> takes the logbuf_lock (and different locks in console drivers (up to console 
> driver)).
>
> Jan Kara posted a patch that offloads printing job 
> (console_trylock()-console_unlock()) from printk() call (when printk can 
> offload 
> it). so semaphore and console driver's locks will go away (mostly) with Jan's 
> patch. logbug spin_lock, however, will stay.

Well, but this patch of yours only affects the semaphore code, so it does not 
change the logbuf_lock situation.

Furthermore, logbuf_lock already has recursion protection:

/*
 * Ouch, printk recursed into itself!
 */
if (unlikely(logbuf_cpu == this_cpu)) {

so it should not be possible to re-enter the printk() logbuf_lock critical 
section 
from the spinlock code. (There are other ways to get the logbuf_lock - if those 
are still triggerable then they should be fixed.)

In any case, recursion protection is generally done in the debugging facilities 
trying to behave lockless.

Thanks,

Ingo


Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-03 Thread Sergey Senozhatsky
On (02/03/16 17:12), Byungchul Park wrote:
> On Wed, Feb 03, 2016 at 04:42:23PM +0900, Sergey Senozhatsky wrote:
> > On (02/03/16 08:28), Ingo Molnar wrote:
> > [..]
> > > So why not move printk away from semaphores? Semaphores are classical 
> > > constructs 
> > > that have legacies and are somewhat non-obvious to use, compared to 
> > > modern, 
> > > simpler locking primitives. I'd not touch their implementation, unless we 
> > > are 
> > > absolutely sure this is a safe optimization.
> > 
> > semaphore's spin_lock is not the only spin lock that printk acquires. it 
> > also takes the
> > logbuf_lock (and different locks in console drivers (up to console driver)).
> > 
> > Jan Kara posted a patch that offloads printing job 
> > (console_trylock()-console_unlock())
> > from printk() call (when printk can offload it). so semaphore and console 
> > driver's locks
> > will go away (mostly) with Jan's patch. logbug spin_lock, however, will 
> > stay.
> 
> It sounds good. Could you teach me how to see the patch by Jan?

sure,

https://lkml.org/lkml/2015/10/26/16
and
http://marc.info/?l=linux-kernel=145079206822115

-ss


Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-03 Thread Ingo Molnar

* Sergey Senozhatsky  wrote:

> On (02/03/16 09:04), Ingo Molnar wrote:
> > * Sergey Senozhatsky  wrote:
> > 
> > > On (02/03/16 08:28), Ingo Molnar wrote:
> > > [..]
> > > > So why not move printk away from semaphores? Semaphores are classical 
> > > > constructs 
> > > > that have legacies and are somewhat non-obvious to use, compared to 
> > > > modern, 
> > > > simpler locking primitives. I'd not touch their implementation, unless 
> > > > we are 
> > > > absolutely sure this is a safe optimization.
> > > 
> > > semaphore's spin_lock is not the only spin lock that printk acquires. it 
> > > also 
> > > takes the logbuf_lock (and different locks in console drivers (up to 
> > > console 
> > > driver)).
> > >
> > > Jan Kara posted a patch that offloads printing job 
> > > (console_trylock()-console_unlock()) from printk() call (when printk can 
> > > offload 
> > > it). so semaphore and console driver's locks will go away (mostly) with 
> > > Jan's 
> > > patch. logbug spin_lock, however, will stay.
> > 
> > Well, but this patch of yours only affects the semaphore code, so it does 
> > not 
> > change the logbuf_lock situation.
> 
> yes, correct. I just said for the info that there is already 'move printk 
> away from
> console_sem' work in progress. Well, the reason for that work is entirely 
> different,
> though, but this console_sem recursion and console driver's lock recursion 
> can be
> 'fixed as a side effect'.
> 
> > Furthermore, logbuf_lock already has recursion protection:
> > 
> > /*
> >  * Ouch, printk recursed into itself!
> >  */
> > if (unlikely(logbuf_cpu == this_cpu)) {
> 
> it's good, no doubt. but it doesn't work in all of the cases. a simple one is:
> 
> vprintk_emit()
> ...
>   raw_spin_lock(_lock);
>   logbuf_cpu = this_cpu;

Yes, I'm aware of that, and as I said:

> > (There are other ways to get the logbuf_lock - if those are still 
> > triggerable 
> > then they should be fixed.)

The proper way to fix it would be to factor out the recursion-safe logbuf_lock 
taking code into logbuf_lock()/logbuf_unlock() primitives and use those 
consistently in printk.c.

>   ...
>   logbuf_cpu = UINT_MAX;
>   raw_spin_unlock(_lock);  <<  SPIN_BUG_ON
> ...
> 
> if raw_spin_unlock() calls SPIN_BUG_ON, then logbuf_lock recursion detection 
> can't
> help. we recurse into vprintk_emit() with logbuf_lock locked and logbuf_cpu 
> != this_cpu.

If recursion-safe logbuf_lock taking is factored out and extended to other 
functions in printk.c then this problem is solved.

> Peter Hurley also posted the following case (I'll quote):
> 
>   serial8250_do_set_termios()
> spin_lock_irqsave()  ** claim port lock **
> ...
> serial_port_out(port, UART_LCR, );
>   dw8250_serial_out()
> dev_err()
>   vprintk_emit()
> console_trylock()
>   call_console_drivers()
> serial8250_console_write()
>   spin_lock_irqsave()  ** port lock **
>   ** DEADLOCK **

That too seems to be avoided if vprintk_emit() does not take logbuf_lock 
naively.

So I repeat my point:

> > In any case, recursion protection is generally done in the debugging 
> > facilities 
> > trying to behave lockless.

I.e. to address these deadlocks, printk() should be made fundamentally more 
robust, extending its already existing recursion protection logic to remaining 
parts of printk.c.

That was my expectation when I committed the first variant of printk() 
recursion 
protection code a few years ago, it just never happened.

Thanks,

Ingo


Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-03 Thread Byungchul Park
On Wed, Feb 03, 2016 at 04:42:23PM +0900, Sergey Senozhatsky wrote:
> On (02/03/16 08:28), Ingo Molnar wrote:
> [..]
> > So why not move printk away from semaphores? Semaphores are classical 
> > constructs 
> > that have legacies and are somewhat non-obvious to use, compared to modern, 
> > simpler locking primitives. I'd not touch their implementation, unless we 
> > are 
> > absolutely sure this is a safe optimization.
> 
> semaphore's spin_lock is not the only spin lock that printk acquires. it also 
> takes the
> logbuf_lock (and different locks in console drivers (up to console driver)).
> 
> Jan Kara posted a patch that offloads printing job 
> (console_trylock()-console_unlock())
> from printk() call (when printk can offload it). so semaphore and console 
> driver's locks
> will go away (mostly) with Jan's patch. logbug spin_lock, however, will stay.

It sounds good. Could you teach me how to see the patch by Jan?

> 
>   -ss


Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-03 Thread Sergey Senozhatsky
On (02/03/16 09:04), Ingo Molnar wrote:
> * Sergey Senozhatsky  wrote:
> 
> > On (02/03/16 08:28), Ingo Molnar wrote:
> > [..]
> > > So why not move printk away from semaphores? Semaphores are classical 
> > > constructs 
> > > that have legacies and are somewhat non-obvious to use, compared to 
> > > modern, 
> > > simpler locking primitives. I'd not touch their implementation, unless we 
> > > are 
> > > absolutely sure this is a safe optimization.
> > 
> > semaphore's spin_lock is not the only spin lock that printk acquires. it 
> > also 
> > takes the logbuf_lock (and different locks in console drivers (up to 
> > console 
> > driver)).
> >
> > Jan Kara posted a patch that offloads printing job 
> > (console_trylock()-console_unlock()) from printk() call (when printk can 
> > offload 
> > it). so semaphore and console driver's locks will go away (mostly) with 
> > Jan's 
> > patch. logbug spin_lock, however, will stay.
> 
> Well, but this patch of yours only affects the semaphore code, so it does not 
> change the logbuf_lock situation.

yes, correct. I just said for the info that there is already 'move printk away 
from
console_sem' work in progress. Well, the reason for that work is entirely 
different,
though, but this console_sem recursion and console driver's lock recursion can 
be
'fixed as a side effect'.

> Furthermore, logbuf_lock already has recursion protection:
> 
> /*
>  * Ouch, printk recursed into itself!
>  */
> if (unlikely(logbuf_cpu == this_cpu)) {

it's good, no doubt. but it doesn't work in all of the cases. a simple one is:

vprintk_emit()
...
raw_spin_lock(_lock);
logbuf_cpu = this_cpu;
...
logbuf_cpu = UINT_MAX;
raw_spin_unlock(_lock);  <<  SPIN_BUG_ON
...

if raw_spin_unlock() calls SPIN_BUG_ON, then logbuf_lock recursion detection 
can't
help. we recurse into vprintk_emit() with logbuf_lock locked and logbuf_cpu != 
this_cpu.

Peter Hurley also posted the following case (I'll quote):

  serial8250_do_set_termios()
spin_lock_irqsave()  ** claim port lock **
...
serial_port_out(port, UART_LCR, );
  dw8250_serial_out()
dev_err()
  vprintk_emit()
console_trylock()
  call_console_drivers()
serial8250_console_write()
  spin_lock_irqsave()  ** port lock **
  ** DEADLOCK **

-ss

> so it should not be possible to re-enter the printk() logbuf_lock critical 
> section 
> from the spinlock code. (There are other ways to get the logbuf_lock - if 
> those 
> are still triggerable then they should be fixed.)
> 
> In any case, recursion protection is generally done in the debugging 
> facilities 
> trying to behave lockless.
> 
> Thanks,
> 
>   Ingo
> 


Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-02 Thread Sergey Senozhatsky
On (02/03/16 08:28), Ingo Molnar wrote:
[..]
> So why not move printk away from semaphores? Semaphores are classical 
> constructs 
> that have legacies and are somewhat non-obvious to use, compared to modern, 
> simpler locking primitives. I'd not touch their implementation, unless we are 
> absolutely sure this is a safe optimization.

semaphore's spin_lock is not the only spin lock that printk acquires. it also 
takes the
logbuf_lock (and different locks in console drivers (up to console driver)).

Jan Kara posted a patch that offloads printing job 
(console_trylock()-console_unlock())
from printk() call (when printk can offload it). so semaphore and console 
driver's locks
will go away (mostly) with Jan's patch. logbug spin_lock, however, will stay.

-ss


Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-02 Thread Ingo Molnar

* Byungchul Park  wrote:

>  void up(struct semaphore *sem)
>  {
>   unsigned long flags;
> + struct task_struct *p = NULL;
>  
>   raw_spin_lock_irqsave(>lock, flags);
>   if (likely(list_empty(>wait_list)))
>   sem->count++;
>   else
> - __up(sem);
> + p = __up(sem);
>   raw_spin_unlock_irqrestore(>lock, flags);
> +
> + /*
> +  * wake_up_process() needs not to be protected by a spinlock.
> +  * Thus move it from the protected region to here. What is
> +  * worse, this unnecessary protection can cause a deadlock by
> +  * acquiring the same sem->lock within wake_up_process().
> +  */
> + if (unlikely(p))
> + wake_up_process(p);

So I'm not sure this is completely race free, for cases where a semaphore is 
attached to a task and is managed/destroyed on task exit.

Since we don't have a guaranteed reference to 'p' here, the task might wake up 
(via a signal) and exit (and its task struct might be freed and the semaphore 
might be freed), after we unlocked the semaphore but before we wake the task up.

So why not move printk away from semaphores? Semaphores are classical 
constructs 
that have legacies and are somewhat non-obvious to use, compared to modern, 
simpler locking primitives. I'd not touch their implementation, unless we are 
absolutely sure this is a safe optimization.

Thanks,

Ingo


[PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-02 Thread Byungchul Park
change from v1 to v2
- remove unnecessary overhead by the redundant spin(un)lock.

Since I faced a infinite recursive printk() bug, I've tried to propose
patches the title of which is "lib/spinlock_debug.c: prevent a recursive
cycle in the debug code". But I noticed the root problem cannot be fixed
by that, through some discussion thanks to Sergey and Peter. So I focused
on preventing the deadlock.

-8<-
>From 56ce4a9c4e9a089eff798fd17015f395751abb62 Mon Sep 17 00:00:00 2001
From: Byungchul Park 
Date: Wed, 3 Feb 2016 14:44:52 +0900
Subject: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

wake_up_process() is currently protected by spinlock though it's not
necessary. Furthermore, it can cause a deadlock when it's hit from within
printk() since the wake_up_process() can printk() again.

The scenario the bad thing can happen is,

printk
  console_trylock
  console_unlock
up_console_sem
  up
raw_spin_lock_irqsave(>lock, flags)
__up
  wake_up_process
try_to_wake_up
  raw_spin_lock_irqsave(>pi_lock)
__spin_lock_debug
  spin_dump
printk
  console_trylock
raw_spin_lock_irqsave(>lock, flags)

*** DEADLOCK ***

Signed-off-by: Byungchul Park 
---
 kernel/locking/semaphore.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
index b8120ab..14d0aca 100644
--- a/kernel/locking/semaphore.c
+++ b/kernel/locking/semaphore.c
@@ -37,7 +37,7 @@ static noinline void __down(struct semaphore *sem);
 static noinline int __down_interruptible(struct semaphore *sem);
 static noinline int __down_killable(struct semaphore *sem);
 static noinline int __down_timeout(struct semaphore *sem, long timeout);
-static noinline void __up(struct semaphore *sem);
+static noinline struct task_struct *__up(struct semaphore *sem);
 
 /**
  * down - acquire the semaphore
@@ -178,13 +178,23 @@ EXPORT_SYMBOL(down_timeout);
 void up(struct semaphore *sem)
 {
unsigned long flags;
+   struct task_struct *p = NULL;
 
raw_spin_lock_irqsave(>lock, flags);
if (likely(list_empty(>wait_list)))
sem->count++;
else
-   __up(sem);
+   p = __up(sem);
raw_spin_unlock_irqrestore(>lock, flags);
+
+   /*
+* wake_up_process() needs not to be protected by a spinlock.
+* Thus move it from the protected region to here. What is
+* worse, this unnecessary protection can cause a deadlock by
+* acquiring the same sem->lock within wake_up_process().
+*/
+   if (unlikely(p))
+   wake_up_process(p);
 }
 EXPORT_SYMBOL(up);
 
@@ -253,11 +263,11 @@ static noinline int __sched __down_timeout(struct 
semaphore *sem, long timeout)
return __down_common(sem, TASK_UNINTERRUPTIBLE, timeout);
 }
 
-static noinline void __sched __up(struct semaphore *sem)
+static noinline struct task_struct *__sched __up(struct semaphore *sem)
 {
struct semaphore_waiter *waiter = list_first_entry(>wait_list,
struct semaphore_waiter, list);
list_del(>list);
waiter->up = true;
-   wake_up_process(waiter->task);
+   return waiter->task;
 }
-- 
1.9.1



[PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-02 Thread Byungchul Park
change from v1 to v2
- remove unnecessary overhead by the redundant spin(un)lock.

Since I faced a infinite recursive printk() bug, I've tried to propose
patches the title of which is "lib/spinlock_debug.c: prevent a recursive
cycle in the debug code". But I noticed the root problem cannot be fixed
by that, through some discussion thanks to Sergey and Peter. So I focused
on preventing the deadlock.

-8<-
>From 56ce4a9c4e9a089eff798fd17015f395751abb62 Mon Sep 17 00:00:00 2001
From: Byungchul Park <byungchul.p...@lge.com>
Date: Wed, 3 Feb 2016 14:44:52 +0900
Subject: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

wake_up_process() is currently protected by spinlock though it's not
necessary. Furthermore, it can cause a deadlock when it's hit from within
printk() since the wake_up_process() can printk() again.

The scenario the bad thing can happen is,

printk
  console_trylock
  console_unlock
up_console_sem
  up
raw_spin_lock_irqsave(>lock, flags)
__up
  wake_up_process
try_to_wake_up
  raw_spin_lock_irqsave(>pi_lock)
__spin_lock_debug
  spin_dump
printk
  console_trylock
raw_spin_lock_irqsave(>lock, flags)

*** DEADLOCK ***

Signed-off-by: Byungchul Park <byungchul.p...@lge.com>
---
 kernel/locking/semaphore.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
index b8120ab..14d0aca 100644
--- a/kernel/locking/semaphore.c
+++ b/kernel/locking/semaphore.c
@@ -37,7 +37,7 @@ static noinline void __down(struct semaphore *sem);
 static noinline int __down_interruptible(struct semaphore *sem);
 static noinline int __down_killable(struct semaphore *sem);
 static noinline int __down_timeout(struct semaphore *sem, long timeout);
-static noinline void __up(struct semaphore *sem);
+static noinline struct task_struct *__up(struct semaphore *sem);
 
 /**
  * down - acquire the semaphore
@@ -178,13 +178,23 @@ EXPORT_SYMBOL(down_timeout);
 void up(struct semaphore *sem)
 {
unsigned long flags;
+   struct task_struct *p = NULL;
 
raw_spin_lock_irqsave(>lock, flags);
if (likely(list_empty(>wait_list)))
sem->count++;
else
-   __up(sem);
+   p = __up(sem);
raw_spin_unlock_irqrestore(>lock, flags);
+
+   /*
+* wake_up_process() needs not to be protected by a spinlock.
+* Thus move it from the protected region to here. What is
+* worse, this unnecessary protection can cause a deadlock by
+* acquiring the same sem->lock within wake_up_process().
+*/
+   if (unlikely(p))
+   wake_up_process(p);
 }
 EXPORT_SYMBOL(up);
 
@@ -253,11 +263,11 @@ static noinline int __sched __down_timeout(struct 
semaphore *sem, long timeout)
return __down_common(sem, TASK_UNINTERRUPTIBLE, timeout);
 }
 
-static noinline void __sched __up(struct semaphore *sem)
+static noinline struct task_struct *__sched __up(struct semaphore *sem)
 {
struct semaphore_waiter *waiter = list_first_entry(>wait_list,
struct semaphore_waiter, list);
list_del(>list);
waiter->up = true;
-   wake_up_process(waiter->task);
+   return waiter->task;
 }
-- 
1.9.1



Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-02 Thread Ingo Molnar

* Byungchul Park  wrote:

>  void up(struct semaphore *sem)
>  {
>   unsigned long flags;
> + struct task_struct *p = NULL;
>  
>   raw_spin_lock_irqsave(>lock, flags);
>   if (likely(list_empty(>wait_list)))
>   sem->count++;
>   else
> - __up(sem);
> + p = __up(sem);
>   raw_spin_unlock_irqrestore(>lock, flags);
> +
> + /*
> +  * wake_up_process() needs not to be protected by a spinlock.
> +  * Thus move it from the protected region to here. What is
> +  * worse, this unnecessary protection can cause a deadlock by
> +  * acquiring the same sem->lock within wake_up_process().
> +  */
> + if (unlikely(p))
> + wake_up_process(p);

So I'm not sure this is completely race free, for cases where a semaphore is 
attached to a task and is managed/destroyed on task exit.

Since we don't have a guaranteed reference to 'p' here, the task might wake up 
(via a signal) and exit (and its task struct might be freed and the semaphore 
might be freed), after we unlocked the semaphore but before we wake the task up.

So why not move printk away from semaphores? Semaphores are classical 
constructs 
that have legacies and are somewhat non-obvious to use, compared to modern, 
simpler locking primitives. I'd not touch their implementation, unless we are 
absolutely sure this is a safe optimization.

Thanks,

Ingo


Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

2016-02-02 Thread Sergey Senozhatsky
On (02/03/16 08:28), Ingo Molnar wrote:
[..]
> So why not move printk away from semaphores? Semaphores are classical 
> constructs 
> that have legacies and are somewhat non-obvious to use, compared to modern, 
> simpler locking primitives. I'd not touch their implementation, unless we are 
> absolutely sure this is a safe optimization.

semaphore's spin_lock is not the only spin lock that printk acquires. it also 
takes the
logbuf_lock (and different locks in console drivers (up to console driver)).

Jan Kara posted a patch that offloads printing job 
(console_trylock()-console_unlock())
from printk() call (when printk can offload it). so semaphore and console 
driver's locks
will go away (mostly) with Jan's patch. logbug spin_lock, however, will stay.

-ss