Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
On 12/12, David Hildenbrand wrote: > > > This is subjective, but how about > > > > static bool xxx(void) > > { > > mutex_lock(_hotplug.lock); > > if (atomic_read(_hotplug.refcount) == 0) > > return true; > > mutex_unlock(_hotplug.lock); > > return false; > > } > > > > void cpu_hotplug_begin(void) > > { > > cpu_hotplug.active_writer = current; > > > > cpuhp_lock_acquire(); > > wait_event(_hotplug.wq, xxx()); > > } > > > > instead? > > > > Oleg. > > > > [ 50.662459] do not call blocking ops when !TASK_RUNNING; state=2 set at > [<0017340e>] prepare_to_wait_event+0x7a/0x124 > [ 50.662472] [ cut here ] > [ 50.662475] WARNING: at kernel/sched/core.c:7301 > [ 50.662477] Modules linked in: > [ 50.662482] CPU: 5 PID: 225 Comm: cpu_start_stop. Not tainted 3.18.0+ #59 > [ 50.662485] task: 01f94b20 ti: 01ffc000 task.ti: > 01ffc000 > ... > > Looks like your suggestion won't work. We can only set the task to > TASK_UNINTERRUPTIBLE after taking the lock. Yeees, this warning (and wait_woken() helpers) was specially added to catch/fix the problem like this, sorry for confusion. Easy to fix, just - mutex_lock(_hotplug.lock); + if (!mutex_trylock(_hotplug.lock)) + return false; If .lock is locked then it is hold by get_online_cpus(), and it is going to increment the counter. I would like to say that this is what I actually meant but now I can not recall if this is true ;) But please ignore. Your next version looks simple/clear enough. Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
On 12/12, David Hildenbrand wrote: This is subjective, but how about static bool xxx(void) { mutex_lock(cpu_hotplug.lock); if (atomic_read(cpu_hotplug.refcount) == 0) return true; mutex_unlock(cpu_hotplug.lock); return false; } void cpu_hotplug_begin(void) { cpu_hotplug.active_writer = current; cpuhp_lock_acquire(); wait_event(cpu_hotplug.wq, xxx()); } instead? Oleg. [ 50.662459] do not call blocking ops when !TASK_RUNNING; state=2 set at [0017340e] prepare_to_wait_event+0x7a/0x124 [ 50.662472] [ cut here ] [ 50.662475] WARNING: at kernel/sched/core.c:7301 [ 50.662477] Modules linked in: [ 50.662482] CPU: 5 PID: 225 Comm: cpu_start_stop. Not tainted 3.18.0+ #59 [ 50.662485] task: 01f94b20 ti: 01ffc000 task.ti: 01ffc000 ... Looks like your suggestion won't work. We can only set the task to TASK_UNINTERRUPTIBLE after taking the lock. Yeees, this warning (and wait_woken() helpers) was specially added to catch/fix the problem like this, sorry for confusion. Easy to fix, just - mutex_lock(cpu_hotplug.lock); + if (!mutex_trylock(cpu_hotplug.lock)) + return false; If .lock is locked then it is hold by get_online_cpus(), and it is going to increment the counter. I would like to say that this is what I actually meant but now I can not recall if this is true ;) But please ignore. Your next version looks simple/clear enough. Oleg. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
> This is subjective, but how about > > static bool xxx(void) > { > mutex_lock(_hotplug.lock); > if (atomic_read(_hotplug.refcount) == 0) > return true; > mutex_unlock(_hotplug.lock); > return false; > } > > void cpu_hotplug_begin(void) > { > cpu_hotplug.active_writer = current; > > cpuhp_lock_acquire(); > wait_event(_hotplug.wq, xxx()); > } > > instead? > > Oleg. > [ 50.662459] do not call blocking ops when !TASK_RUNNING; state=2 set at [<0017340e>] prepare_to_wait_event+0x7a/0x124 [ 50.662472] [ cut here ] [ 50.662475] WARNING: at kernel/sched/core.c:7301 [ 50.662477] Modules linked in: [ 50.662482] CPU: 5 PID: 225 Comm: cpu_start_stop. Not tainted 3.18.0+ #59 [ 50.662485] task: 01f94b20 ti: 01ffc000 task.ti: 01ffc000 ... Looks like your suggestion won't work. We can only set the task to TASK_UNINTERRUPTIBLE after taking the lock. David -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
This is subjective, but how about static bool xxx(void) { mutex_lock(cpu_hotplug.lock); if (atomic_read(cpu_hotplug.refcount) == 0) return true; mutex_unlock(cpu_hotplug.lock); return false; } void cpu_hotplug_begin(void) { cpu_hotplug.active_writer = current; cpuhp_lock_acquire(); wait_event(cpu_hotplug.wq, xxx()); } instead? Oleg. [ 50.662459] do not call blocking ops when !TASK_RUNNING; state=2 set at [0017340e] prepare_to_wait_event+0x7a/0x124 [ 50.662472] [ cut here ] [ 50.662475] WARNING: at kernel/sched/core.c:7301 [ 50.662477] Modules linked in: [ 50.662482] CPU: 5 PID: 225 Comm: cpu_start_stop. Not tainted 3.18.0+ #59 [ 50.662485] task: 01f94b20 ti: 01ffc000 task.ti: 01ffc000 ... Looks like your suggestion won't work. We can only set the task to TASK_UNINTERRUPTIBLE after taking the lock. David -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
> This is subjective, but how about > > static bool xxx(void) > { > mutex_lock(_hotplug.lock); > if (atomic_read(_hotplug.refcount) == 0) > return true; > mutex_unlock(_hotplug.lock); > return false; > } > > void cpu_hotplug_begin(void) > { > cpu_hotplug.active_writer = current; > > cpuhp_lock_acquire(); > wait_event(_hotplug.wq, xxx()); > } > > instead? > > Oleg. > Just checked wait_event(). It is safe against such race conditions as the same pattern is used is applied internally. Thanks! David -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
This is subjective, but how about static bool xxx(void) { mutex_lock(cpu_hotplug.lock); if (atomic_read(cpu_hotplug.refcount) == 0) return true; mutex_unlock(cpu_hotplug.lock); return false; } void cpu_hotplug_begin(void) { cpu_hotplug.active_writer = current; cpuhp_lock_acquire(); wait_event(cpu_hotplug.wq, xxx()); } instead? Oleg. Just checked wait_event(). It is safe against such race conditions as the same pattern is used is applied internally. Thanks! David -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
only thing that is bugging me is this part. Without the lock we can't > > guarantee that another get_online_cpus() just arrived and bumped the > > refcount > > to 0. > > > > Of course this only applies to misuse of put/get_online_cpus. > > > > We could hack some loop that tries to cmp_xchng with the old value until it > > fits to work around this, but wouldn't make the code any better readable. > > Well, we didn't have this diagnostic in the original, so one approach > would be to simply leave it out. Another approach would be to just > have the WARN_ON() without the attempt to fix it up. > > Thanx, Paul Well, I think the WARN_ON would be sufficient for debugging purposes. Thanks! David -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
> On 12/10, David Hildenbrand wrote: > > > > @@ -127,20 +119,16 @@ void put_online_cpus(void) > > { > > if (cpu_hotplug.active_writer == current) > > return; > > - if (!mutex_trylock(_hotplug.lock)) { > > - atomic_inc(_hotplug.puts_pending); > > - cpuhp_lock_release(); > > - return; > > - } > > - > > - if (WARN_ON(!cpu_hotplug.refcount)) > > - cpu_hotplug.refcount++; /* try to fix things up */ > > > > - if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer)) > > - wake_up_process(cpu_hotplug.active_writer); > > - mutex_unlock(_hotplug.lock); > > - cpuhp_lock_release(); > > + if (atomic_dec_and_test(_hotplug.refcount) && > > + waitqueue_active(_hotplug.wq)) > > + wake_up(_hotplug.wq); > > OK, waitqueue_active() looks safe... prepare_to_wait() has a barrier. > > > void cpu_hotplug_begin(void) > > { > > + DEFINE_WAIT(wait); > > + > > cpu_hotplug.active_writer = current; > > > > - cpuhp_lock_acquire(); > > for (;;) { > > + cpuhp_lock_acquire(); > > not sure I understand why did you move cpuhp_lock_acquire() into > the loop, but this is minor. Well I got some lockdep issues and this way I was able to solve them. (complain about same thread that called cpu_hotplug_begin() calling put_online_cpus(), so we have to correctly tell lockdep when we get an release the lock). So I guess I also need that in the loop, or am I wrong (due to cpuhp_lock_release())? > > > mutex_lock(_hotplug.lock); > > - apply_puts_pending(1); > > - if (likely(!cpu_hotplug.refcount)) > > + prepare_to_wait(_hotplug.wq, , TASK_UNINTERRUPTIBLE); > > + if (likely(!atomic_read(_hotplug.refcount))) > > break; > > - __set_current_state(TASK_UNINTERRUPTIBLE); > > mutex_unlock(_hotplug.lock); > > + cpuhp_lock_release(); > > schedule(); > > } > > + > > + finish_wait(_hotplug.wq, ); > > } > > This is subjective, but how about > > static bool xxx(void) > { > mutex_lock(_hotplug.lock); > if (atomic_read(_hotplug.refcount) == 0) > return true; > mutex_unlock(_hotplug.lock); > return false; > } > > void cpu_hotplug_begin(void) > { > cpu_hotplug.active_writer = current; > > cpuhp_lock_acquire(); > wait_event(_hotplug.wq, xxx()); > } > > instead? > What I don't like about that suggestion is that the mutex_lock() happens in another level of indirection, so by looking at cpu_hotplug_begin() it isn't obvious that that lock remains locked after this function has been called. On the other hand this is really a compact one (+ possibly lockdep annotations) :) . > Oleg. > It is important that we do the state change to TASK_UNINTERRUPTIBLE prior to checking for the condition. Is it guaranteed with wait_event() that things like the following won't happen? 1. CPU1 wakes up the wq (refcount == 0) 2. CPU2 calls get_online_cpus() and increments refcount. (refcount == 1) 2. CPU3 executes xxx() up to "return false;" and gets scheduled away 3. CPU2 calls put_online_cpus(), decrementing the refcount (refcount == 0) -> waitqueue not active -> no wake up 4. CPU3 continues executing and sleeps -> refcount == 0 but writer is not woken up Saying, does wait_event() take care wakeups while executing xxx()? (w.g. activating the wait queue, setting TASK_UNINTERRUPTIBLE just before calling xxx()) In my code, this is guaranteed by calling prepare_to_wait(_hotplug.wq, , TASK_UNINTERRUPTIBLE); prior to checking for the condition. If that is guaranteed, this would work. Will verify that tomorrow. Thanks a lot! David -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
On 12/10, David Hildenbrand wrote: > > @@ -127,20 +119,16 @@ void put_online_cpus(void) > { > if (cpu_hotplug.active_writer == current) > return; > - if (!mutex_trylock(_hotplug.lock)) { > - atomic_inc(_hotplug.puts_pending); > - cpuhp_lock_release(); > - return; > - } > - > - if (WARN_ON(!cpu_hotplug.refcount)) > - cpu_hotplug.refcount++; /* try to fix things up */ > > - if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer)) > - wake_up_process(cpu_hotplug.active_writer); > - mutex_unlock(_hotplug.lock); > - cpuhp_lock_release(); > + if (atomic_dec_and_test(_hotplug.refcount) && > + waitqueue_active(_hotplug.wq)) > + wake_up(_hotplug.wq); OK, waitqueue_active() looks safe... prepare_to_wait() has a barrier. > void cpu_hotplug_begin(void) > { > + DEFINE_WAIT(wait); > + > cpu_hotplug.active_writer = current; > > - cpuhp_lock_acquire(); > for (;;) { > + cpuhp_lock_acquire(); not sure I understand why did you move cpuhp_lock_acquire() into the loop, but this is minor. > mutex_lock(_hotplug.lock); > - apply_puts_pending(1); > - if (likely(!cpu_hotplug.refcount)) > + prepare_to_wait(_hotplug.wq, , TASK_UNINTERRUPTIBLE); > + if (likely(!atomic_read(_hotplug.refcount))) > break; > - __set_current_state(TASK_UNINTERRUPTIBLE); > mutex_unlock(_hotplug.lock); > + cpuhp_lock_release(); > schedule(); > } > + > + finish_wait(_hotplug.wq, ); > } This is subjective, but how about static bool xxx(void) { mutex_lock(_hotplug.lock); if (atomic_read(_hotplug.refcount) == 0) return true; mutex_unlock(_hotplug.lock); return false; } void cpu_hotplug_begin(void) { cpu_hotplug.active_writer = current; cpuhp_lock_acquire(); wait_event(_hotplug.wq, xxx()); } instead? Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
On Wed, Dec 10, 2014 at 02:26:19PM +0100, David Hildenbrand wrote: > > Commit b2c4623dcd07 ("rcu: More on deadlock between CPU hotplug and > > expedited > > grace periods") introduced another problem that can easily be reproduced by > > starting/stopping cpus in a loop. > > > > E.g.: > > for i in `seq 5000`; do > > echo 1 > /sys/devices/system/cpu/cpu1/online > > echo 0 > /sys/devices/system/cpu/cpu1/online > > done > > > > Will result in: > > INFO: task /cpu_start_stop:1 blocked for more than 120 seconds. > > Call Trace: > > ([<006a028e>] __schedule+0x406/0x91c) > >[<00130f60>] cpu_hotplug_begin+0xd0/0xd4 > >[<00130ff6>] _cpu_up+0x3e/0x1c4 > >[<00131232>] cpu_up+0xb6/0xd4 > >[<004a5720>] device_online+0x80/0xc0 > >[<004a57f0>] online_store+0x90/0xb0 > > ... > > > > And a deadlock. > > > > Problem is that if the last ref in put_online_cpus() can't get the > > cpu_hotplug.lock the puts_pending count is incremented, but a sleeping > > active_writer might never be woken up, therefore never exiting the loop in > > cpu_hotplug_begin(). > > > > This fix removes puts_pending and turns refcount into an atomic variable. We > > also introduce a wait queue for the active_writer, to avoid possible races > > and > > use-after-free. There is no need to take the lock in put_online_cpus() > > anymore. > > > > Also rearrange the lockdep anotations so we won't get false positives. > > > > Can't reproduce it with this fix. > > > > Signed-off-by: David Hildenbrand > > --- > > kernel/cpu.c | 60 > > ++-- > > 1 file changed, 26 insertions(+), 34 deletions(-) > > > > diff --git a/kernel/cpu.c b/kernel/cpu.c > > index 5d22023..3f1d5ba 100644 > > --- a/kernel/cpu.c > > +++ b/kernel/cpu.c > > @@ -58,22 +58,23 @@ static int cpu_hotplug_disabled; > > > > static struct { > > struct task_struct *active_writer; > > - struct mutex lock; /* Synchronizes accesses to refcount, */ > > + /* wait queue to wake up the active_writer */ > > + wait_queue_head_t wq; > > + /* verifies that no writer will get active while readers are active */ > > + struct mutex lock; > > /* > > * Also blocks the new readers during > > * an ongoing cpu hotplug operation. > > */ > > - int refcount; > > - /* And allows lockless put_online_cpus(). */ > > - atomic_t puts_pending; > > + atomic_t refcount; > > > > #ifdef CONFIG_DEBUG_LOCK_ALLOC > > struct lockdep_map dep_map; > > #endif > > } cpu_hotplug = { > > .active_writer = NULL, > > + .wq = __WAIT_QUEUE_HEAD_INITIALIZER(cpu_hotplug.wq), > > .lock = __MUTEX_INITIALIZER(cpu_hotplug.lock), > > - .refcount = 0, > > #ifdef CONFIG_DEBUG_LOCK_ALLOC > > .dep_map = {.name = "cpu_hotplug.lock" }, > > #endif > > @@ -86,15 +87,6 @@ static struct { > > #define cpuhp_lock_acquire() lock_map_acquire(_hotplug.dep_map) > > #define cpuhp_lock_release() lock_map_release(_hotplug.dep_map) > > > > -static void apply_puts_pending(int max) > > -{ > > - int delta; > > - > > - if (atomic_read(_hotplug.puts_pending) >= max) { > > - delta = atomic_xchg(_hotplug.puts_pending, 0); > > - cpu_hotplug.refcount -= delta; > > - } > > -} > > > > void get_online_cpus(void) > > { > > @@ -103,9 +95,9 @@ void get_online_cpus(void) > > return; > > cpuhp_lock_acquire_read(); > > mutex_lock(_hotplug.lock); > > - apply_puts_pending(65536); > > - cpu_hotplug.refcount++; > > + atomic_inc(_hotplug.refcount); > > mutex_unlock(_hotplug.lock); > > + cpuhp_lock_release(); > > } > > EXPORT_SYMBOL_GPL(get_online_cpus); > > > > @@ -116,9 +108,9 @@ bool try_get_online_cpus(void) > > if (!mutex_trylock(_hotplug.lock)) > > return false; > > cpuhp_lock_acquire_tryread(); > > - apply_puts_pending(65536); > > - cpu_hotplug.refcount++; > > + atomic_inc(_hotplug.refcount); > > mutex_unlock(_hotplug.lock); > > + cpuhp_lock_release(); > > return true; > > } > > EXPORT_SYMBOL_GPL(try_get_online_cpus); > > @@ -127,20 +119,16 @@ void put_online_cpus(void) > > { > > if (cpu_hotplug.active_writer == current) > > return; > > - if (!mutex_trylock(_hotplug.lock)) { > > - atomic_inc(_hotplug.puts_pending); > > - cpuhp_lock_release(); > > - return; > > - } > > - > > - if (WARN_ON(!cpu_hotplug.refcount)) > > - cpu_hotplug.refcount++; /* try to fix things up */ > > > > - if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer)) > > - wake_up_process(cpu_hotplug.active_writer); > > - mutex_unlock(_hotplug.lock); > > - cpuhp_lock_release(); > > + if (atomic_dec_and_test(_hotplug.refcount) && > > + waitqueue_active(_hotplug.wq)) > > + wake_up(_hotplug.wq); > > > > + /* missing get - try to fix things up */ > > The only thing that is bugging
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
> Commit b2c4623dcd07 ("rcu: More on deadlock between CPU hotplug and expedited > grace periods") introduced another problem that can easily be reproduced by > starting/stopping cpus in a loop. > > E.g.: > for i in `seq 5000`; do > echo 1 > /sys/devices/system/cpu/cpu1/online > echo 0 > /sys/devices/system/cpu/cpu1/online > done > > Will result in: > INFO: task /cpu_start_stop:1 blocked for more than 120 seconds. > Call Trace: > ([<006a028e>] __schedule+0x406/0x91c) >[<00130f60>] cpu_hotplug_begin+0xd0/0xd4 >[<00130ff6>] _cpu_up+0x3e/0x1c4 >[<00131232>] cpu_up+0xb6/0xd4 >[<004a5720>] device_online+0x80/0xc0 >[<004a57f0>] online_store+0x90/0xb0 > ... > > And a deadlock. > > Problem is that if the last ref in put_online_cpus() can't get the > cpu_hotplug.lock the puts_pending count is incremented, but a sleeping > active_writer might never be woken up, therefore never exiting the loop in > cpu_hotplug_begin(). > > This fix removes puts_pending and turns refcount into an atomic variable. We > also introduce a wait queue for the active_writer, to avoid possible races and > use-after-free. There is no need to take the lock in put_online_cpus() > anymore. > > Also rearrange the lockdep anotations so we won't get false positives. > > Can't reproduce it with this fix. > > Signed-off-by: David Hildenbrand > --- > kernel/cpu.c | 60 > ++-- > 1 file changed, 26 insertions(+), 34 deletions(-) > > diff --git a/kernel/cpu.c b/kernel/cpu.c > index 5d22023..3f1d5ba 100644 > --- a/kernel/cpu.c > +++ b/kernel/cpu.c > @@ -58,22 +58,23 @@ static int cpu_hotplug_disabled; > > static struct { > struct task_struct *active_writer; > - struct mutex lock; /* Synchronizes accesses to refcount, */ > + /* wait queue to wake up the active_writer */ > + wait_queue_head_t wq; > + /* verifies that no writer will get active while readers are active */ > + struct mutex lock; > /* >* Also blocks the new readers during >* an ongoing cpu hotplug operation. >*/ > - int refcount; > - /* And allows lockless put_online_cpus(). */ > - atomic_t puts_pending; > + atomic_t refcount; > > #ifdef CONFIG_DEBUG_LOCK_ALLOC > struct lockdep_map dep_map; > #endif > } cpu_hotplug = { > .active_writer = NULL, > + .wq = __WAIT_QUEUE_HEAD_INITIALIZER(cpu_hotplug.wq), > .lock = __MUTEX_INITIALIZER(cpu_hotplug.lock), > - .refcount = 0, > #ifdef CONFIG_DEBUG_LOCK_ALLOC > .dep_map = {.name = "cpu_hotplug.lock" }, > #endif > @@ -86,15 +87,6 @@ static struct { > #define cpuhp_lock_acquire() lock_map_acquire(_hotplug.dep_map) > #define cpuhp_lock_release() lock_map_release(_hotplug.dep_map) > > -static void apply_puts_pending(int max) > -{ > - int delta; > - > - if (atomic_read(_hotplug.puts_pending) >= max) { > - delta = atomic_xchg(_hotplug.puts_pending, 0); > - cpu_hotplug.refcount -= delta; > - } > -} > > void get_online_cpus(void) > { > @@ -103,9 +95,9 @@ void get_online_cpus(void) > return; > cpuhp_lock_acquire_read(); > mutex_lock(_hotplug.lock); > - apply_puts_pending(65536); > - cpu_hotplug.refcount++; > + atomic_inc(_hotplug.refcount); > mutex_unlock(_hotplug.lock); > + cpuhp_lock_release(); > } > EXPORT_SYMBOL_GPL(get_online_cpus); > > @@ -116,9 +108,9 @@ bool try_get_online_cpus(void) > if (!mutex_trylock(_hotplug.lock)) > return false; > cpuhp_lock_acquire_tryread(); > - apply_puts_pending(65536); > - cpu_hotplug.refcount++; > + atomic_inc(_hotplug.refcount); > mutex_unlock(_hotplug.lock); > + cpuhp_lock_release(); > return true; > } > EXPORT_SYMBOL_GPL(try_get_online_cpus); > @@ -127,20 +119,16 @@ void put_online_cpus(void) > { > if (cpu_hotplug.active_writer == current) > return; > - if (!mutex_trylock(_hotplug.lock)) { > - atomic_inc(_hotplug.puts_pending); > - cpuhp_lock_release(); > - return; > - } > - > - if (WARN_ON(!cpu_hotplug.refcount)) > - cpu_hotplug.refcount++; /* try to fix things up */ > > - if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer)) > - wake_up_process(cpu_hotplug.active_writer); > - mutex_unlock(_hotplug.lock); > - cpuhp_lock_release(); > + if (atomic_dec_and_test(_hotplug.refcount) && > + waitqueue_active(_hotplug.wq)) > + wake_up(_hotplug.wq); > > + /* missing get - try to fix things up */ The only thing that is bugging me is this part. Without the lock we can't guarantee that another get_online_cpus() just arrived and bumped the refcount to 0. Of course this only applies to misuse of put/get_online_cpus. We could hack some loop that tries to cmp_xchng
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
Commit b2c4623dcd07 (rcu: More on deadlock between CPU hotplug and expedited grace periods) introduced another problem that can easily be reproduced by starting/stopping cpus in a loop. E.g.: for i in `seq 5000`; do echo 1 /sys/devices/system/cpu/cpu1/online echo 0 /sys/devices/system/cpu/cpu1/online done Will result in: INFO: task /cpu_start_stop:1 blocked for more than 120 seconds. Call Trace: ([006a028e] __schedule+0x406/0x91c) [00130f60] cpu_hotplug_begin+0xd0/0xd4 [00130ff6] _cpu_up+0x3e/0x1c4 [00131232] cpu_up+0xb6/0xd4 [004a5720] device_online+0x80/0xc0 [004a57f0] online_store+0x90/0xb0 ... And a deadlock. Problem is that if the last ref in put_online_cpus() can't get the cpu_hotplug.lock the puts_pending count is incremented, but a sleeping active_writer might never be woken up, therefore never exiting the loop in cpu_hotplug_begin(). This fix removes puts_pending and turns refcount into an atomic variable. We also introduce a wait queue for the active_writer, to avoid possible races and use-after-free. There is no need to take the lock in put_online_cpus() anymore. Also rearrange the lockdep anotations so we won't get false positives. Can't reproduce it with this fix. Signed-off-by: David Hildenbrand d...@linux.vnet.ibm.com --- kernel/cpu.c | 60 ++-- 1 file changed, 26 insertions(+), 34 deletions(-) diff --git a/kernel/cpu.c b/kernel/cpu.c index 5d22023..3f1d5ba 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -58,22 +58,23 @@ static int cpu_hotplug_disabled; static struct { struct task_struct *active_writer; - struct mutex lock; /* Synchronizes accesses to refcount, */ + /* wait queue to wake up the active_writer */ + wait_queue_head_t wq; + /* verifies that no writer will get active while readers are active */ + struct mutex lock; /* * Also blocks the new readers during * an ongoing cpu hotplug operation. */ - int refcount; - /* And allows lockless put_online_cpus(). */ - atomic_t puts_pending; + atomic_t refcount; #ifdef CONFIG_DEBUG_LOCK_ALLOC struct lockdep_map dep_map; #endif } cpu_hotplug = { .active_writer = NULL, + .wq = __WAIT_QUEUE_HEAD_INITIALIZER(cpu_hotplug.wq), .lock = __MUTEX_INITIALIZER(cpu_hotplug.lock), - .refcount = 0, #ifdef CONFIG_DEBUG_LOCK_ALLOC .dep_map = {.name = cpu_hotplug.lock }, #endif @@ -86,15 +87,6 @@ static struct { #define cpuhp_lock_acquire() lock_map_acquire(cpu_hotplug.dep_map) #define cpuhp_lock_release() lock_map_release(cpu_hotplug.dep_map) -static void apply_puts_pending(int max) -{ - int delta; - - if (atomic_read(cpu_hotplug.puts_pending) = max) { - delta = atomic_xchg(cpu_hotplug.puts_pending, 0); - cpu_hotplug.refcount -= delta; - } -} void get_online_cpus(void) { @@ -103,9 +95,9 @@ void get_online_cpus(void) return; cpuhp_lock_acquire_read(); mutex_lock(cpu_hotplug.lock); - apply_puts_pending(65536); - cpu_hotplug.refcount++; + atomic_inc(cpu_hotplug.refcount); mutex_unlock(cpu_hotplug.lock); + cpuhp_lock_release(); } EXPORT_SYMBOL_GPL(get_online_cpus); @@ -116,9 +108,9 @@ bool try_get_online_cpus(void) if (!mutex_trylock(cpu_hotplug.lock)) return false; cpuhp_lock_acquire_tryread(); - apply_puts_pending(65536); - cpu_hotplug.refcount++; + atomic_inc(cpu_hotplug.refcount); mutex_unlock(cpu_hotplug.lock); + cpuhp_lock_release(); return true; } EXPORT_SYMBOL_GPL(try_get_online_cpus); @@ -127,20 +119,16 @@ void put_online_cpus(void) { if (cpu_hotplug.active_writer == current) return; - if (!mutex_trylock(cpu_hotplug.lock)) { - atomic_inc(cpu_hotplug.puts_pending); - cpuhp_lock_release(); - return; - } - - if (WARN_ON(!cpu_hotplug.refcount)) - cpu_hotplug.refcount++; /* try to fix things up */ - if (!--cpu_hotplug.refcount unlikely(cpu_hotplug.active_writer)) - wake_up_process(cpu_hotplug.active_writer); - mutex_unlock(cpu_hotplug.lock); - cpuhp_lock_release(); + if (atomic_dec_and_test(cpu_hotplug.refcount) + waitqueue_active(cpu_hotplug.wq)) + wake_up(cpu_hotplug.wq); + /* missing get - try to fix things up */ The only thing that is bugging me is this part. Without the lock we can't guarantee that another get_online_cpus() just arrived and bumped the refcount to 0. Of course this only applies to misuse of put/get_online_cpus. We could hack some loop that tries to cmp_xchng with the old value until it fits to work around this, but wouldn't make the code any better
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
On Wed, Dec 10, 2014 at 02:26:19PM +0100, David Hildenbrand wrote: Commit b2c4623dcd07 (rcu: More on deadlock between CPU hotplug and expedited grace periods) introduced another problem that can easily be reproduced by starting/stopping cpus in a loop. E.g.: for i in `seq 5000`; do echo 1 /sys/devices/system/cpu/cpu1/online echo 0 /sys/devices/system/cpu/cpu1/online done Will result in: INFO: task /cpu_start_stop:1 blocked for more than 120 seconds. Call Trace: ([006a028e] __schedule+0x406/0x91c) [00130f60] cpu_hotplug_begin+0xd0/0xd4 [00130ff6] _cpu_up+0x3e/0x1c4 [00131232] cpu_up+0xb6/0xd4 [004a5720] device_online+0x80/0xc0 [004a57f0] online_store+0x90/0xb0 ... And a deadlock. Problem is that if the last ref in put_online_cpus() can't get the cpu_hotplug.lock the puts_pending count is incremented, but a sleeping active_writer might never be woken up, therefore never exiting the loop in cpu_hotplug_begin(). This fix removes puts_pending and turns refcount into an atomic variable. We also introduce a wait queue for the active_writer, to avoid possible races and use-after-free. There is no need to take the lock in put_online_cpus() anymore. Also rearrange the lockdep anotations so we won't get false positives. Can't reproduce it with this fix. Signed-off-by: David Hildenbrand d...@linux.vnet.ibm.com --- kernel/cpu.c | 60 ++-- 1 file changed, 26 insertions(+), 34 deletions(-) diff --git a/kernel/cpu.c b/kernel/cpu.c index 5d22023..3f1d5ba 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -58,22 +58,23 @@ static int cpu_hotplug_disabled; static struct { struct task_struct *active_writer; - struct mutex lock; /* Synchronizes accesses to refcount, */ + /* wait queue to wake up the active_writer */ + wait_queue_head_t wq; + /* verifies that no writer will get active while readers are active */ + struct mutex lock; /* * Also blocks the new readers during * an ongoing cpu hotplug operation. */ - int refcount; - /* And allows lockless put_online_cpus(). */ - atomic_t puts_pending; + atomic_t refcount; #ifdef CONFIG_DEBUG_LOCK_ALLOC struct lockdep_map dep_map; #endif } cpu_hotplug = { .active_writer = NULL, + .wq = __WAIT_QUEUE_HEAD_INITIALIZER(cpu_hotplug.wq), .lock = __MUTEX_INITIALIZER(cpu_hotplug.lock), - .refcount = 0, #ifdef CONFIG_DEBUG_LOCK_ALLOC .dep_map = {.name = cpu_hotplug.lock }, #endif @@ -86,15 +87,6 @@ static struct { #define cpuhp_lock_acquire() lock_map_acquire(cpu_hotplug.dep_map) #define cpuhp_lock_release() lock_map_release(cpu_hotplug.dep_map) -static void apply_puts_pending(int max) -{ - int delta; - - if (atomic_read(cpu_hotplug.puts_pending) = max) { - delta = atomic_xchg(cpu_hotplug.puts_pending, 0); - cpu_hotplug.refcount -= delta; - } -} void get_online_cpus(void) { @@ -103,9 +95,9 @@ void get_online_cpus(void) return; cpuhp_lock_acquire_read(); mutex_lock(cpu_hotplug.lock); - apply_puts_pending(65536); - cpu_hotplug.refcount++; + atomic_inc(cpu_hotplug.refcount); mutex_unlock(cpu_hotplug.lock); + cpuhp_lock_release(); } EXPORT_SYMBOL_GPL(get_online_cpus); @@ -116,9 +108,9 @@ bool try_get_online_cpus(void) if (!mutex_trylock(cpu_hotplug.lock)) return false; cpuhp_lock_acquire_tryread(); - apply_puts_pending(65536); - cpu_hotplug.refcount++; + atomic_inc(cpu_hotplug.refcount); mutex_unlock(cpu_hotplug.lock); + cpuhp_lock_release(); return true; } EXPORT_SYMBOL_GPL(try_get_online_cpus); @@ -127,20 +119,16 @@ void put_online_cpus(void) { if (cpu_hotplug.active_writer == current) return; - if (!mutex_trylock(cpu_hotplug.lock)) { - atomic_inc(cpu_hotplug.puts_pending); - cpuhp_lock_release(); - return; - } - - if (WARN_ON(!cpu_hotplug.refcount)) - cpu_hotplug.refcount++; /* try to fix things up */ - if (!--cpu_hotplug.refcount unlikely(cpu_hotplug.active_writer)) - wake_up_process(cpu_hotplug.active_writer); - mutex_unlock(cpu_hotplug.lock); - cpuhp_lock_release(); + if (atomic_dec_and_test(cpu_hotplug.refcount) + waitqueue_active(cpu_hotplug.wq)) + wake_up(cpu_hotplug.wq); + /* missing get - try to fix things up */ The only thing that is bugging me is this part. Without the lock we can't guarantee that another get_online_cpus() just arrived and bumped the refcount to 0. Of course this only applies to misuse of put/get_online_cpus. We could hack some loop that tries to
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
On 12/10, David Hildenbrand wrote: @@ -127,20 +119,16 @@ void put_online_cpus(void) { if (cpu_hotplug.active_writer == current) return; - if (!mutex_trylock(cpu_hotplug.lock)) { - atomic_inc(cpu_hotplug.puts_pending); - cpuhp_lock_release(); - return; - } - - if (WARN_ON(!cpu_hotplug.refcount)) - cpu_hotplug.refcount++; /* try to fix things up */ - if (!--cpu_hotplug.refcount unlikely(cpu_hotplug.active_writer)) - wake_up_process(cpu_hotplug.active_writer); - mutex_unlock(cpu_hotplug.lock); - cpuhp_lock_release(); + if (atomic_dec_and_test(cpu_hotplug.refcount) + waitqueue_active(cpu_hotplug.wq)) + wake_up(cpu_hotplug.wq); OK, waitqueue_active() looks safe... prepare_to_wait() has a barrier. void cpu_hotplug_begin(void) { + DEFINE_WAIT(wait); + cpu_hotplug.active_writer = current; - cpuhp_lock_acquire(); for (;;) { + cpuhp_lock_acquire(); not sure I understand why did you move cpuhp_lock_acquire() into the loop, but this is minor. mutex_lock(cpu_hotplug.lock); - apply_puts_pending(1); - if (likely(!cpu_hotplug.refcount)) + prepare_to_wait(cpu_hotplug.wq, wait, TASK_UNINTERRUPTIBLE); + if (likely(!atomic_read(cpu_hotplug.refcount))) break; - __set_current_state(TASK_UNINTERRUPTIBLE); mutex_unlock(cpu_hotplug.lock); + cpuhp_lock_release(); schedule(); } + + finish_wait(cpu_hotplug.wq, wait); } This is subjective, but how about static bool xxx(void) { mutex_lock(cpu_hotplug.lock); if (atomic_read(cpu_hotplug.refcount) == 0) return true; mutex_unlock(cpu_hotplug.lock); return false; } void cpu_hotplug_begin(void) { cpu_hotplug.active_writer = current; cpuhp_lock_acquire(); wait_event(cpu_hotplug.wq, xxx()); } instead? Oleg. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
On 12/10, David Hildenbrand wrote: @@ -127,20 +119,16 @@ void put_online_cpus(void) { if (cpu_hotplug.active_writer == current) return; - if (!mutex_trylock(cpu_hotplug.lock)) { - atomic_inc(cpu_hotplug.puts_pending); - cpuhp_lock_release(); - return; - } - - if (WARN_ON(!cpu_hotplug.refcount)) - cpu_hotplug.refcount++; /* try to fix things up */ - if (!--cpu_hotplug.refcount unlikely(cpu_hotplug.active_writer)) - wake_up_process(cpu_hotplug.active_writer); - mutex_unlock(cpu_hotplug.lock); - cpuhp_lock_release(); + if (atomic_dec_and_test(cpu_hotplug.refcount) + waitqueue_active(cpu_hotplug.wq)) + wake_up(cpu_hotplug.wq); OK, waitqueue_active() looks safe... prepare_to_wait() has a barrier. void cpu_hotplug_begin(void) { + DEFINE_WAIT(wait); + cpu_hotplug.active_writer = current; - cpuhp_lock_acquire(); for (;;) { + cpuhp_lock_acquire(); not sure I understand why did you move cpuhp_lock_acquire() into the loop, but this is minor. Well I got some lockdep issues and this way I was able to solve them. (complain about same thread that called cpu_hotplug_begin() calling put_online_cpus(), so we have to correctly tell lockdep when we get an release the lock). So I guess I also need that in the loop, or am I wrong (due to cpuhp_lock_release())? mutex_lock(cpu_hotplug.lock); - apply_puts_pending(1); - if (likely(!cpu_hotplug.refcount)) + prepare_to_wait(cpu_hotplug.wq, wait, TASK_UNINTERRUPTIBLE); + if (likely(!atomic_read(cpu_hotplug.refcount))) break; - __set_current_state(TASK_UNINTERRUPTIBLE); mutex_unlock(cpu_hotplug.lock); + cpuhp_lock_release(); schedule(); } + + finish_wait(cpu_hotplug.wq, wait); } This is subjective, but how about static bool xxx(void) { mutex_lock(cpu_hotplug.lock); if (atomic_read(cpu_hotplug.refcount) == 0) return true; mutex_unlock(cpu_hotplug.lock); return false; } void cpu_hotplug_begin(void) { cpu_hotplug.active_writer = current; cpuhp_lock_acquire(); wait_event(cpu_hotplug.wq, xxx()); } instead? What I don't like about that suggestion is that the mutex_lock() happens in another level of indirection, so by looking at cpu_hotplug_begin() it isn't obvious that that lock remains locked after this function has been called. On the other hand this is really a compact one (+ possibly lockdep annotations) :) . Oleg. It is important that we do the state change to TASK_UNINTERRUPTIBLE prior to checking for the condition. Is it guaranteed with wait_event() that things like the following won't happen? 1. CPU1 wakes up the wq (refcount == 0) 2. CPU2 calls get_online_cpus() and increments refcount. (refcount == 1) 2. CPU3 executes xxx() up to return false; and gets scheduled away 3. CPU2 calls put_online_cpus(), decrementing the refcount (refcount == 0) - waitqueue not active - no wake up 4. CPU3 continues executing and sleeps - refcount == 0 but writer is not woken up Saying, does wait_event() take care wakeups while executing xxx()? (w.g. activating the wait queue, setting TASK_UNINTERRUPTIBLE just before calling xxx()) In my code, this is guaranteed by calling prepare_to_wait(cpu_hotplug.wq, wait, TASK_UNINTERRUPTIBLE); prior to checking for the condition. If that is guaranteed, this would work. Will verify that tomorrow. Thanks a lot! David -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
only thing that is bugging me is this part. Without the lock we can't guarantee that another get_online_cpus() just arrived and bumped the refcount to 0. Of course this only applies to misuse of put/get_online_cpus. We could hack some loop that tries to cmp_xchng with the old value until it fits to work around this, but wouldn't make the code any better readable. Well, we didn't have this diagnostic in the original, so one approach would be to simply leave it out. Another approach would be to just have the WARN_ON() without the attempt to fix it up. Thanx, Paul Well, I think the WARN_ON would be sufficient for debugging purposes. Thanks! David -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/