Re: [PATCH 2/2] [RFC] cpuset: Fix potential deadlock w/ set_mems_allowed

2013-09-14 Thread Li Zefan
Cc Mel, who added seqcount to cpuset.

On 2013/9/14 8:19, John Stultz wrote:
> After adding lockdep support to seqlock/seqcount structures,
> I started seeing the following warning:
> 
> [1.070907] ==
> [1.072015] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
> [1.073181] 3.11.0+ #67 Not tainted
> [1.073801] --
> [1.074882] kworker/u4:2/708 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> [1.076088]  (>mems_allowed_seq){+.+...}, at: [] 
> new_slab+0x5f/0x280
> [1.077572]
> [1.077572] and this task is already holding:
> [1.078593]  (&(>__queue_lock)->rlock){..-...}, at: 
> [] blk_execute_rq_nowait+0x53/0xf0
> [1.080042] which would create a new lock dependency:
> [1.080042]  (&(>__queue_lock)->rlock){..-...} -> 
> (>mems_allowed_seq){+.+...}
> [1.080042]
> [1.080042] but this new dependency connects a SOFTIRQ-irq-safe lock:
> [1.080042]  (&(>__queue_lock)->rlock){..-...}
> [1.080042] ... which became SOFTIRQ-irq-safe at:
> [1.080042]   [] __lock_acquire+0x5b9/0x1db0
> [1.080042]   [] lock_acquire+0x95/0x130
> [1.080042]   [] _raw_spin_lock+0x41/0x80
> [1.080042]   [] scsi_device_unbusy+0x7e/0xd0
> [1.080042]   [] scsi_finish_command+0x32/0xf0
> [1.080042]   [] scsi_softirq_done+0xa1/0x130
> [1.080042]   [] blk_done_softirq+0x73/0x90
> [1.080042]   [] __do_softirq+0x110/0x2f0
> [1.080042]   [] run_ksoftirqd+0x2d/0x60
> [1.080042]   [] smpboot_thread_fn+0x156/0x1e0
> [1.080042]   [] kthread+0xd6/0xe0
> [1.080042]   [] ret_from_fork+0x7c/0xb0
> [1.080042]
> [1.080042] to a SOFTIRQ-irq-unsafe lock:
> [1.080042]  (>mems_allowed_seq){+.+...}
> [1.080042] ... which became SOFTIRQ-irq-unsafe at:
> [1.080042] ...  [] __lock_acquire+0x613/0x1db0
> [1.080042]   [] lock_acquire+0x95/0x130
> [1.080042]   [] kthreadd+0x82/0x180
> [1.080042]   [] ret_from_fork+0x7c/0xb0
> [1.080042]
> [1.080042] other info that might help us debug this:
> [1.080042]
> [1.080042]  Possible interrupt unsafe locking scenario:
> [1.080042]
> [1.080042]CPU0CPU1
> [1.080042]
> [1.080042]   lock(>mems_allowed_seq);
> [1.080042]local_irq_disable();
> [1.080042]
> lock(&(>__queue_lock)->rlock);
> [1.080042]lock(>mems_allowed_seq);
> [1.080042]   
> [1.080042] lock(&(>__queue_lock)->rlock);
> [1.080042]
> [1.080042]  *** DEADLOCK ***
> 
> The issue stems from the kthreadd() function calling set_mems_allowed
> with irqs enabled. While its possibly unlikely for the actual deadlock
> to trigger, a fix is fairly simple: disable irqs before taking the
> mems_allowed_seq lock.
> 

Now I get it. I'm fine with this change.

Acked-by: Li Zefan 

> Let me know if you have any other suggestions or alternative fixes you'd
> prefer.
> 
> Cc: Li Zefan 
> Cc: Mathieu Desnoyers 
> Cc: Steven Rostedt 
> Cc: Peter Zijlstra 
> Cc: Ingo Molnar 
> Cc: Thomas Gleixner 
> Signed-off-by: John Stultz 
> ---
>  include/linux/cpuset.h | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
> index cc1b01c..3fe661f 100644
> --- a/include/linux/cpuset.h
> +++ b/include/linux/cpuset.h
> @@ -110,10 +110,14 @@ static inline bool put_mems_allowed(unsigned int seq)
>  
>  static inline void set_mems_allowed(nodemask_t nodemask)
>  {
> + unsigned long flags;
> +
>   task_lock(current);
> + local_irq_save(flags);
>   write_seqcount_begin(>mems_allowed_seq);
>   current->mems_allowed = nodemask;
>   write_seqcount_end(>mems_allowed_seq);
> + local_irq_restore(flags);
>   task_unlock(current);
>  }
>  
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] [RFC] cpuset: Fix potential deadlock w/ set_mems_allowed

2013-09-14 Thread Li Zefan
Cc Mel, who added seqcount to cpuset.

On 2013/9/14 8:19, John Stultz wrote:
 After adding lockdep support to seqlock/seqcount structures,
 I started seeing the following warning:
 
 [1.070907] ==
 [1.072015] [ INFO: SOFTIRQ-safe - SOFTIRQ-unsafe lock order detected ]
 [1.073181] 3.11.0+ #67 Not tainted
 [1.073801] --
 [1.074882] kworker/u4:2/708 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
 [1.076088]  (p-mems_allowed_seq){+.+...}, at: [81187d7f] 
 new_slab+0x5f/0x280
 [1.077572]
 [1.077572] and this task is already holding:
 [1.078593]  ((q-__queue_lock)-rlock){..-...}, at: 
 [81339f03] blk_execute_rq_nowait+0x53/0xf0
 [1.080042] which would create a new lock dependency:
 [1.080042]  ((q-__queue_lock)-rlock){..-...} - 
 (p-mems_allowed_seq){+.+...}
 [1.080042]
 [1.080042] but this new dependency connects a SOFTIRQ-irq-safe lock:
 [1.080042]  ((q-__queue_lock)-rlock){..-...}
 [1.080042] ... which became SOFTIRQ-irq-safe at:
 [1.080042]   [810ec179] __lock_acquire+0x5b9/0x1db0
 [1.080042]   [810edfe5] lock_acquire+0x95/0x130
 [1.080042]   [818968a1] _raw_spin_lock+0x41/0x80
 [1.080042]   [81560c9e] scsi_device_unbusy+0x7e/0xd0
 [1.080042]   [8155a612] scsi_finish_command+0x32/0xf0
 [1.080042]   [81560e91] scsi_softirq_done+0xa1/0x130
 [1.080042]   [8133b0f3] blk_done_softirq+0x73/0x90
 [1.080042]   [81095dc0] __do_softirq+0x110/0x2f0
 [1.080042]   [81095fcd] run_ksoftirqd+0x2d/0x60
 [1.080042]   [810bc506] smpboot_thread_fn+0x156/0x1e0
 [1.080042]   [810b3916] kthread+0xd6/0xe0
 [1.080042]   [818980ac] ret_from_fork+0x7c/0xb0
 [1.080042]
 [1.080042] to a SOFTIRQ-irq-unsafe lock:
 [1.080042]  (p-mems_allowed_seq){+.+...}
 [1.080042] ... which became SOFTIRQ-irq-unsafe at:
 [1.080042] ...  [810ec1d3] __lock_acquire+0x613/0x1db0
 [1.080042]   [810edfe5] lock_acquire+0x95/0x130
 [1.080042]   [810b3df2] kthreadd+0x82/0x180
 [1.080042]   [818980ac] ret_from_fork+0x7c/0xb0
 [1.080042]
 [1.080042] other info that might help us debug this:
 [1.080042]
 [1.080042]  Possible interrupt unsafe locking scenario:
 [1.080042]
 [1.080042]CPU0CPU1
 [1.080042]
 [1.080042]   lock(p-mems_allowed_seq);
 [1.080042]local_irq_disable();
 [1.080042]
 lock((q-__queue_lock)-rlock);
 [1.080042]lock(p-mems_allowed_seq);
 [1.080042]   Interrupt
 [1.080042] lock((q-__queue_lock)-rlock);
 [1.080042]
 [1.080042]  *** DEADLOCK ***
 
 The issue stems from the kthreadd() function calling set_mems_allowed
 with irqs enabled. While its possibly unlikely for the actual deadlock
 to trigger, a fix is fairly simple: disable irqs before taking the
 mems_allowed_seq lock.
 

Now I get it. I'm fine with this change.

Acked-by: Li Zefan lize...@huawei.com

 Let me know if you have any other suggestions or alternative fixes you'd
 prefer.
 
 Cc: Li Zefan lize...@huawei.com
 Cc: Mathieu Desnoyers mathieu.desnoy...@efficios.com
 Cc: Steven Rostedt rost...@goodmis.org
 Cc: Peter Zijlstra pet...@infradead.org
 Cc: Ingo Molnar mi...@kernel.org
 Cc: Thomas Gleixner t...@linutronix.de
 Signed-off-by: John Stultz john.stu...@linaro.org
 ---
  include/linux/cpuset.h | 4 
  1 file changed, 4 insertions(+)
 
 diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
 index cc1b01c..3fe661f 100644
 --- a/include/linux/cpuset.h
 +++ b/include/linux/cpuset.h
 @@ -110,10 +110,14 @@ static inline bool put_mems_allowed(unsigned int seq)
  
  static inline void set_mems_allowed(nodemask_t nodemask)
  {
 + unsigned long flags;
 +
   task_lock(current);
 + local_irq_save(flags);
   write_seqcount_begin(current-mems_allowed_seq);
   current-mems_allowed = nodemask;
   write_seqcount_end(current-mems_allowed_seq);
 + local_irq_restore(flags);
   task_unlock(current);
  }
  
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] [RFC] cpuset: Fix potential deadlock w/ set_mems_allowed

2013-09-13 Thread John Stultz
After adding lockdep support to seqlock/seqcount structures,
I started seeing the following warning:

[1.070907] ==
[1.072015] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
[1.073181] 3.11.0+ #67 Not tainted
[1.073801] --
[1.074882] kworker/u4:2/708 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[1.076088]  (>mems_allowed_seq){+.+...}, at: [] 
new_slab+0x5f/0x280
[1.077572]
[1.077572] and this task is already holding:
[1.078593]  (&(>__queue_lock)->rlock){..-...}, at: [] 
blk_execute_rq_nowait+0x53/0xf0
[1.080042] which would create a new lock dependency:
[1.080042]  (&(>__queue_lock)->rlock){..-...} -> 
(>mems_allowed_seq){+.+...}
[1.080042]
[1.080042] but this new dependency connects a SOFTIRQ-irq-safe lock:
[1.080042]  (&(>__queue_lock)->rlock){..-...}
[1.080042] ... which became SOFTIRQ-irq-safe at:
[1.080042]   [] __lock_acquire+0x5b9/0x1db0
[1.080042]   [] lock_acquire+0x95/0x130
[1.080042]   [] _raw_spin_lock+0x41/0x80
[1.080042]   [] scsi_device_unbusy+0x7e/0xd0
[1.080042]   [] scsi_finish_command+0x32/0xf0
[1.080042]   [] scsi_softirq_done+0xa1/0x130
[1.080042]   [] blk_done_softirq+0x73/0x90
[1.080042]   [] __do_softirq+0x110/0x2f0
[1.080042]   [] run_ksoftirqd+0x2d/0x60
[1.080042]   [] smpboot_thread_fn+0x156/0x1e0
[1.080042]   [] kthread+0xd6/0xe0
[1.080042]   [] ret_from_fork+0x7c/0xb0
[1.080042]
[1.080042] to a SOFTIRQ-irq-unsafe lock:
[1.080042]  (>mems_allowed_seq){+.+...}
[1.080042] ... which became SOFTIRQ-irq-unsafe at:
[1.080042] ...  [] __lock_acquire+0x613/0x1db0
[1.080042]   [] lock_acquire+0x95/0x130
[1.080042]   [] kthreadd+0x82/0x180
[1.080042]   [] ret_from_fork+0x7c/0xb0
[1.080042]
[1.080042] other info that might help us debug this:
[1.080042]
[1.080042]  Possible interrupt unsafe locking scenario:
[1.080042]
[1.080042]CPU0CPU1
[1.080042]
[1.080042]   lock(>mems_allowed_seq);
[1.080042]local_irq_disable();
[1.080042]lock(&(>__queue_lock)->rlock);
[1.080042]lock(>mems_allowed_seq);
[1.080042]   
[1.080042] lock(&(>__queue_lock)->rlock);
[1.080042]
[1.080042]  *** DEADLOCK ***

The issue stems from the kthreadd() function calling set_mems_allowed
with irqs enabled. While its possibly unlikely for the actual deadlock
to trigger, a fix is fairly simple: disable irqs before taking the
mems_allowed_seq lock.

Let me know if you have any other suggestions or alternative fixes you'd
prefer.

Cc: Li Zefan 
Cc: Mathieu Desnoyers 
Cc: Steven Rostedt 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Signed-off-by: John Stultz 
---
 include/linux/cpuset.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index cc1b01c..3fe661f 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -110,10 +110,14 @@ static inline bool put_mems_allowed(unsigned int seq)
 
 static inline void set_mems_allowed(nodemask_t nodemask)
 {
+   unsigned long flags;
+
task_lock(current);
+   local_irq_save(flags);
write_seqcount_begin(>mems_allowed_seq);
current->mems_allowed = nodemask;
write_seqcount_end(>mems_allowed_seq);
+   local_irq_restore(flags);
task_unlock(current);
 }
 
-- 
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] [RFC] cpuset: Fix potential deadlock w/ set_mems_allowed

2013-09-13 Thread John Stultz
After adding lockdep support to seqlock/seqcount structures,
I started seeing the following warning:

[1.070907] ==
[1.072015] [ INFO: SOFTIRQ-safe - SOFTIRQ-unsafe lock order detected ]
[1.073181] 3.11.0+ #67 Not tainted
[1.073801] --
[1.074882] kworker/u4:2/708 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[1.076088]  (p-mems_allowed_seq){+.+...}, at: [81187d7f] 
new_slab+0x5f/0x280
[1.077572]
[1.077572] and this task is already holding:
[1.078593]  ((q-__queue_lock)-rlock){..-...}, at: [81339f03] 
blk_execute_rq_nowait+0x53/0xf0
[1.080042] which would create a new lock dependency:
[1.080042]  ((q-__queue_lock)-rlock){..-...} - 
(p-mems_allowed_seq){+.+...}
[1.080042]
[1.080042] but this new dependency connects a SOFTIRQ-irq-safe lock:
[1.080042]  ((q-__queue_lock)-rlock){..-...}
[1.080042] ... which became SOFTIRQ-irq-safe at:
[1.080042]   [810ec179] __lock_acquire+0x5b9/0x1db0
[1.080042]   [810edfe5] lock_acquire+0x95/0x130
[1.080042]   [818968a1] _raw_spin_lock+0x41/0x80
[1.080042]   [81560c9e] scsi_device_unbusy+0x7e/0xd0
[1.080042]   [8155a612] scsi_finish_command+0x32/0xf0
[1.080042]   [81560e91] scsi_softirq_done+0xa1/0x130
[1.080042]   [8133b0f3] blk_done_softirq+0x73/0x90
[1.080042]   [81095dc0] __do_softirq+0x110/0x2f0
[1.080042]   [81095fcd] run_ksoftirqd+0x2d/0x60
[1.080042]   [810bc506] smpboot_thread_fn+0x156/0x1e0
[1.080042]   [810b3916] kthread+0xd6/0xe0
[1.080042]   [818980ac] ret_from_fork+0x7c/0xb0
[1.080042]
[1.080042] to a SOFTIRQ-irq-unsafe lock:
[1.080042]  (p-mems_allowed_seq){+.+...}
[1.080042] ... which became SOFTIRQ-irq-unsafe at:
[1.080042] ...  [810ec1d3] __lock_acquire+0x613/0x1db0
[1.080042]   [810edfe5] lock_acquire+0x95/0x130
[1.080042]   [810b3df2] kthreadd+0x82/0x180
[1.080042]   [818980ac] ret_from_fork+0x7c/0xb0
[1.080042]
[1.080042] other info that might help us debug this:
[1.080042]
[1.080042]  Possible interrupt unsafe locking scenario:
[1.080042]
[1.080042]CPU0CPU1
[1.080042]
[1.080042]   lock(p-mems_allowed_seq);
[1.080042]local_irq_disable();
[1.080042]lock((q-__queue_lock)-rlock);
[1.080042]lock(p-mems_allowed_seq);
[1.080042]   Interrupt
[1.080042] lock((q-__queue_lock)-rlock);
[1.080042]
[1.080042]  *** DEADLOCK ***

The issue stems from the kthreadd() function calling set_mems_allowed
with irqs enabled. While its possibly unlikely for the actual deadlock
to trigger, a fix is fairly simple: disable irqs before taking the
mems_allowed_seq lock.

Let me know if you have any other suggestions or alternative fixes you'd
prefer.

Cc: Li Zefan lize...@huawei.com
Cc: Mathieu Desnoyers mathieu.desnoy...@efficios.com
Cc: Steven Rostedt rost...@goodmis.org
Cc: Peter Zijlstra pet...@infradead.org
Cc: Ingo Molnar mi...@kernel.org
Cc: Thomas Gleixner t...@linutronix.de
Signed-off-by: John Stultz john.stu...@linaro.org
---
 include/linux/cpuset.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index cc1b01c..3fe661f 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -110,10 +110,14 @@ static inline bool put_mems_allowed(unsigned int seq)
 
 static inline void set_mems_allowed(nodemask_t nodemask)
 {
+   unsigned long flags;
+
task_lock(current);
+   local_irq_save(flags);
write_seqcount_begin(current-mems_allowed_seq);
current-mems_allowed = nodemask;
write_seqcount_end(current-mems_allowed_seq);
+   local_irq_restore(flags);
task_unlock(current);
 }
 
-- 
1.8.1.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/