Re: [PATCH] mm: __set_page_dirty_nobuffers uses spin_lock_irqseve instead of spin_lock_irq

2014-02-06 Thread Tang Chen


Hi,

Tested-by: Tang Chen 

Have tested this patch, and the problem is fixed.

Thanks.

On 02/04/2014 12:49 AM, kosaki.motoh...@gmail.com wrote:

From: KOSAKI Motohiro

During aio stress test, we observed the following lockdep warning.
This mean AIO+numa_balancing is currently deadlockable.

The problem is, aio_migratepage disable interrupt, but 
__set_page_dirty_nobuffers
unintentionally enable it again.

Generally, all helper function should use spin_lock_irqsave()
instead of spin_lock_irq() because they don't know caller at all.

[  599.843948] other info that might help us debug this:
[  599.873748]  Possible unsafe locking scenario:
[  599.873748]
[  599.900902]CPU0
[  599.912701]
[  599.924929]   lock(&(>completion_lock)->rlock);
[  599.950299]
[  599.962576] lock(&(>completion_lock)->rlock);
[  599.985771]
[  599.985771]  *** DEADLOCK ***

[  600.375623]  [] dump_stack+0x19/0x1b
[  600.398769]  [] print_usage_bug+0x1f7/0x208
[  600.425092]  [] ? 
print_shortest_lock_dependencies+0x1d0/0x1d0
[  600.458981]  [] mark_lock+0x21d/0x2a0
[  600.482910]  [] mark_held_locks+0xb9/0x140
[  600.508956]  [] ? _raw_spin_unlock_irq+0x2c/0x50
[  600.536825]  [] trace_hardirqs_on_caller+0x105/0x1d0
[  600.566861]  [] trace_hardirqs_on+0xd/0x10
[  600.593210]  [] _raw_spin_unlock_irq+0x2c/0x50
[  600.620599]  [] __set_page_dirty_nobuffers+0x8c/0xf0
[  600.649992]  [] migrate_page_copy+0x434/0x540
[  600.676635]  [] aio_migratepage+0xb1/0x140
[  600.703126]  [] move_to_new_page+0x7d/0x230
[  600.729022]  [] migrate_pages+0x5e5/0x700
[  600.754705]  [] ? buffer_migrate_lock_buffers+0xb0/0xb0
[  600.785784]  [] migrate_misplaced_page+0xbc/0xf0
[  600.814029]  [] do_numa_page+0x102/0x190
[  600.839182]  [] handle_pte_fault+0x241/0x970
[  600.865875]  [] handle_mm_fault+0x265/0x370
[  600.892071]  [] __do_page_fault+0x172/0x5a0
[  600.918065]  [] ? retint_swapgs+0x13/0x1b
[  600.943493]  [] do_page_fault+0x1a/0x70
[  600.968081]  [] page_fault+0x28/0x30

Signed-off-by: KOSAKI Motohiro
Cc: Larry Woodman
Cc: Rik van Riel
Cc: Johannes Weiner
Cc: sta...@vger.kernel.org
---
  mm/page-writeback.c |5 +++--
  1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 2d30e2c..7106cb1 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2173,11 +2173,12 @@ int __set_page_dirty_nobuffers(struct page *page)
if (!TestSetPageDirty(page)) {
struct address_space *mapping = page_mapping(page);
struct address_space *mapping2;
+   unsigned long flags;

if (!mapping)
return 1;

-   spin_lock_irq(>tree_lock);
+   spin_lock_irqsave(>tree_lock, flags);
mapping2 = page_mapping(page);
if (mapping2) { /* Race with truncate? */
BUG_ON(mapping2 != mapping);
@@ -2186,7 +2187,7 @@ int __set_page_dirty_nobuffers(struct page *page)
radix_tree_tag_set(>page_tree,
page_index(page), PAGECACHE_TAG_DIRTY);
}
-   spin_unlock_irq(>tree_lock);
+   spin_unlock_irqrestore(>tree_lock, flags);
if (mapping->host) {
/* !PageAnon&&  !swapper_space */
__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm: __set_page_dirty_nobuffers uses spin_lock_irqseve instead of spin_lock_irq

2014-02-06 Thread Tang Chen


Hi,

Tested-by: Tang Chen tangc...@cn.fujitsu.com

Have tested this patch, and the problem is fixed.

Thanks.

On 02/04/2014 12:49 AM, kosaki.motoh...@gmail.com wrote:

From: KOSAKI Motohirokosaki.motoh...@jp.fujitsu.com

During aio stress test, we observed the following lockdep warning.
This mean AIO+numa_balancing is currently deadlockable.

The problem is, aio_migratepage disable interrupt, but 
__set_page_dirty_nobuffers
unintentionally enable it again.

Generally, all helper function should use spin_lock_irqsave()
instead of spin_lock_irq() because they don't know caller at all.

[  599.843948] other info that might help us debug this:
[  599.873748]  Possible unsafe locking scenario:
[  599.873748]
[  599.900902]CPU0
[  599.912701]
[  599.924929]   lock((ctx-completion_lock)-rlock);
[  599.950299]Interrupt
[  599.962576] lock((ctx-completion_lock)-rlock);
[  599.985771]
[  599.985771]  *** DEADLOCK ***

[  600.375623]  [81678d3c] dump_stack+0x19/0x1b
[  600.398769]  [816731aa] print_usage_bug+0x1f7/0x208
[  600.425092]  [810df370] ? 
print_shortest_lock_dependencies+0x1d0/0x1d0
[  600.458981]  [810e08dd] mark_lock+0x21d/0x2a0
[  600.482910]  [810e0a19] mark_held_locks+0xb9/0x140
[  600.508956]  [8168201c] ? _raw_spin_unlock_irq+0x2c/0x50
[  600.536825]  [810e0ba5] trace_hardirqs_on_caller+0x105/0x1d0
[  600.566861]  [810e0c7d] trace_hardirqs_on+0xd/0x10
[  600.593210]  [8168201c] _raw_spin_unlock_irq+0x2c/0x50
[  600.620599]  [8117f72c] __set_page_dirty_nobuffers+0x8c/0xf0
[  600.649992]  [811d1094] migrate_page_copy+0x434/0x540
[  600.676635]  [8123f5b1] aio_migratepage+0xb1/0x140
[  600.703126]  [811d126d] move_to_new_page+0x7d/0x230
[  600.729022]  [811d1b45] migrate_pages+0x5e5/0x700
[  600.754705]  [811d0070] ? buffer_migrate_lock_buffers+0xb0/0xb0
[  600.785784]  [811d29cc] migrate_misplaced_page+0xbc/0xf0
[  600.814029]  [8119eb62] do_numa_page+0x102/0x190
[  600.839182]  [8119ee31] handle_pte_fault+0x241/0x970
[  600.865875]  [811a0345] handle_mm_fault+0x265/0x370
[  600.892071]  [81686d82] __do_page_fault+0x172/0x5a0
[  600.918065]  [81682cd8] ? retint_swapgs+0x13/0x1b
[  600.943493]  [816871ca] do_page_fault+0x1a/0x70
[  600.968081]  [81682ff8] page_fault+0x28/0x30

Signed-off-by: KOSAKI Motohirokosaki.motoh...@jp.fujitsu.com
Cc: Larry Woodmanlwood...@redhat.com
Cc: Rik van Rielr...@redhat.com
Cc: Johannes Weinerjwei...@redhat.com
Cc: sta...@vger.kernel.org
---
  mm/page-writeback.c |5 +++--
  1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 2d30e2c..7106cb1 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2173,11 +2173,12 @@ int __set_page_dirty_nobuffers(struct page *page)
if (!TestSetPageDirty(page)) {
struct address_space *mapping = page_mapping(page);
struct address_space *mapping2;
+   unsigned long flags;

if (!mapping)
return 1;

-   spin_lock_irq(mapping-tree_lock);
+   spin_lock_irqsave(mapping-tree_lock, flags);
mapping2 = page_mapping(page);
if (mapping2) { /* Race with truncate? */
BUG_ON(mapping2 != mapping);
@@ -2186,7 +2187,7 @@ int __set_page_dirty_nobuffers(struct page *page)
radix_tree_tag_set(mapping-page_tree,
page_index(page), PAGECACHE_TAG_DIRTY);
}
-   spin_unlock_irq(mapping-tree_lock);
+   spin_unlock_irqrestore(mapping-tree_lock, flags);
if (mapping-host) {
/* !PageAnon  !swapper_space */
__mark_inode_dirty(mapping-host, I_DIRTY_PAGES);

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm: __set_page_dirty_nobuffers uses spin_lock_irqseve instead of spin_lock_irq

2014-02-05 Thread Yasuaki Ishimatsu

(2014/02/04 1:49), kosaki.motoh...@gmail.com wrote:

From: KOSAKI Motohiro 

During aio stress test, we observed the following lockdep warning.
This mean AIO+numa_balancing is currently deadlockable.

The problem is, aio_migratepage disable interrupt, but 
__set_page_dirty_nobuffers
unintentionally enable it again.

Generally, all helper function should use spin_lock_irqsave()
instead of spin_lock_irq() because they don't know caller at all.

[  599.843948] other info that might help us debug this:
[  599.873748]  Possible unsafe locking scenario:
[  599.873748]
[  599.900902]CPU0
[  599.912701]
[  599.924929]   lock(&(>completion_lock)->rlock);
[  599.950299]   
[  599.962576] lock(&(>completion_lock)->rlock);
[  599.985771]
[  599.985771]  *** DEADLOCK ***

[  600.375623]  [] dump_stack+0x19/0x1b
[  600.398769]  [] print_usage_bug+0x1f7/0x208
[  600.425092]  [] ? 
print_shortest_lock_dependencies+0x1d0/0x1d0
[  600.458981]  [] mark_lock+0x21d/0x2a0
[  600.482910]  [] mark_held_locks+0xb9/0x140
[  600.508956]  [] ? _raw_spin_unlock_irq+0x2c/0x50
[  600.536825]  [] trace_hardirqs_on_caller+0x105/0x1d0
[  600.566861]  [] trace_hardirqs_on+0xd/0x10
[  600.593210]  [] _raw_spin_unlock_irq+0x2c/0x50
[  600.620599]  [] __set_page_dirty_nobuffers+0x8c/0xf0
[  600.649992]  [] migrate_page_copy+0x434/0x540
[  600.676635]  [] aio_migratepage+0xb1/0x140
[  600.703126]  [] move_to_new_page+0x7d/0x230
[  600.729022]  [] migrate_pages+0x5e5/0x700
[  600.754705]  [] ? buffer_migrate_lock_buffers+0xb0/0xb0
[  600.785784]  [] migrate_misplaced_page+0xbc/0xf0
[  600.814029]  [] do_numa_page+0x102/0x190
[  600.839182]  [] handle_pte_fault+0x241/0x970
[  600.865875]  [] handle_mm_fault+0x265/0x370
[  600.892071]  [] __do_page_fault+0x172/0x5a0
[  600.918065]  [] ? retint_swapgs+0x13/0x1b
[  600.943493]  [] do_page_fault+0x1a/0x70
[  600.968081]  [] page_fault+0x28/0x30

Signed-off-by: KOSAKI Motohiro 
Cc: Larry Woodman 
Cc: Rik van Riel 
Cc: Johannes Weiner 
Cc: sta...@vger.kernel.org
---


Tested-by: Yasuaki Ishimatsu 

Thank you for posting the patch.
The same issue occurred on my box. And I confirmed that the issue
disappeared by the patch.

Thanks,
Yasuaki Ishimatsu


  mm/page-writeback.c |5 +++--
  1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 2d30e2c..7106cb1 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2173,11 +2173,12 @@ int __set_page_dirty_nobuffers(struct page *page)
if (!TestSetPageDirty(page)) {
struct address_space *mapping = page_mapping(page);
struct address_space *mapping2;
+   unsigned long flags;

if (!mapping)
return 1;

-   spin_lock_irq(>tree_lock);
+   spin_lock_irqsave(>tree_lock, flags);
mapping2 = page_mapping(page);
if (mapping2) { /* Race with truncate? */
BUG_ON(mapping2 != mapping);
@@ -2186,7 +2187,7 @@ int __set_page_dirty_nobuffers(struct page *page)
radix_tree_tag_set(>page_tree,
page_index(page), PAGECACHE_TAG_DIRTY);
}
-   spin_unlock_irq(>tree_lock);
+   spin_unlock_irqrestore(>tree_lock, flags);
if (mapping->host) {
/* !PageAnon && !swapper_space */
__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm: __set_page_dirty_nobuffers uses spin_lock_irqseve instead of spin_lock_irq

2014-02-05 Thread Yasuaki Ishimatsu

(2014/02/04 1:49), kosaki.motoh...@gmail.com wrote:

From: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com

During aio stress test, we observed the following lockdep warning.
This mean AIO+numa_balancing is currently deadlockable.

The problem is, aio_migratepage disable interrupt, but 
__set_page_dirty_nobuffers
unintentionally enable it again.

Generally, all helper function should use spin_lock_irqsave()
instead of spin_lock_irq() because they don't know caller at all.

[  599.843948] other info that might help us debug this:
[  599.873748]  Possible unsafe locking scenario:
[  599.873748]
[  599.900902]CPU0
[  599.912701]
[  599.924929]   lock((ctx-completion_lock)-rlock);
[  599.950299]   Interrupt
[  599.962576] lock((ctx-completion_lock)-rlock);
[  599.985771]
[  599.985771]  *** DEADLOCK ***

[  600.375623]  [81678d3c] dump_stack+0x19/0x1b
[  600.398769]  [816731aa] print_usage_bug+0x1f7/0x208
[  600.425092]  [810df370] ? 
print_shortest_lock_dependencies+0x1d0/0x1d0
[  600.458981]  [810e08dd] mark_lock+0x21d/0x2a0
[  600.482910]  [810e0a19] mark_held_locks+0xb9/0x140
[  600.508956]  [8168201c] ? _raw_spin_unlock_irq+0x2c/0x50
[  600.536825]  [810e0ba5] trace_hardirqs_on_caller+0x105/0x1d0
[  600.566861]  [810e0c7d] trace_hardirqs_on+0xd/0x10
[  600.593210]  [8168201c] _raw_spin_unlock_irq+0x2c/0x50
[  600.620599]  [8117f72c] __set_page_dirty_nobuffers+0x8c/0xf0
[  600.649992]  [811d1094] migrate_page_copy+0x434/0x540
[  600.676635]  [8123f5b1] aio_migratepage+0xb1/0x140
[  600.703126]  [811d126d] move_to_new_page+0x7d/0x230
[  600.729022]  [811d1b45] migrate_pages+0x5e5/0x700
[  600.754705]  [811d0070] ? buffer_migrate_lock_buffers+0xb0/0xb0
[  600.785784]  [811d29cc] migrate_misplaced_page+0xbc/0xf0
[  600.814029]  [8119eb62] do_numa_page+0x102/0x190
[  600.839182]  [8119ee31] handle_pte_fault+0x241/0x970
[  600.865875]  [811a0345] handle_mm_fault+0x265/0x370
[  600.892071]  [81686d82] __do_page_fault+0x172/0x5a0
[  600.918065]  [81682cd8] ? retint_swapgs+0x13/0x1b
[  600.943493]  [816871ca] do_page_fault+0x1a/0x70
[  600.968081]  [81682ff8] page_fault+0x28/0x30

Signed-off-by: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
Cc: Larry Woodman lwood...@redhat.com
Cc: Rik van Riel r...@redhat.com
Cc: Johannes Weiner jwei...@redhat.com
Cc: sta...@vger.kernel.org
---


Tested-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

Thank you for posting the patch.
The same issue occurred on my box. And I confirmed that the issue
disappeared by the patch.

Thanks,
Yasuaki Ishimatsu


  mm/page-writeback.c |5 +++--
  1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 2d30e2c..7106cb1 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2173,11 +2173,12 @@ int __set_page_dirty_nobuffers(struct page *page)
if (!TestSetPageDirty(page)) {
struct address_space *mapping = page_mapping(page);
struct address_space *mapping2;
+   unsigned long flags;

if (!mapping)
return 1;

-   spin_lock_irq(mapping-tree_lock);
+   spin_lock_irqsave(mapping-tree_lock, flags);
mapping2 = page_mapping(page);
if (mapping2) { /* Race with truncate? */
BUG_ON(mapping2 != mapping);
@@ -2186,7 +2187,7 @@ int __set_page_dirty_nobuffers(struct page *page)
radix_tree_tag_set(mapping-page_tree,
page_index(page), PAGECACHE_TAG_DIRTY);
}
-   spin_unlock_irq(mapping-tree_lock);
+   spin_unlock_irqrestore(mapping-tree_lock, flags);
if (mapping-host) {
/* !PageAnon  !swapper_space */
__mark_inode_dirty(mapping-host, I_DIRTY_PAGES);




--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm: __set_page_dirty_nobuffers uses spin_lock_irqseve instead of spin_lock_irq

2014-02-04 Thread KOSAKI Motohiro
> Indeed, good catch.  Do we need the same treatment for
> __set_page_dirty_buffers() that can be called by way of
> clear_page_dirty_for_io()?

Indeed. I posted a patch fixed __set_page_dirty() too. plz see

Subject: [PATCH] __set_page_dirty uses spin_lock_irqsave instead of
spin_lock_irq
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm: __set_page_dirty_nobuffers uses spin_lock_irqseve instead of spin_lock_irq

2014-02-04 Thread KOSAKI Motohiro
 Indeed, good catch.  Do we need the same treatment for
 __set_page_dirty_buffers() that can be called by way of
 clear_page_dirty_for_io()?

Indeed. I posted a patch fixed __set_page_dirty() too. plz see

Subject: [PATCH] __set_page_dirty uses spin_lock_irqsave instead of
spin_lock_irq
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm: __set_page_dirty_nobuffers uses spin_lock_irqseve instead of spin_lock_irq

2014-02-03 Thread David Rientjes
On Mon, 3 Feb 2014, kosaki.motoh...@gmail.com wrote:

> From: KOSAKI Motohiro 
> 
> During aio stress test, we observed the following lockdep warning.
> This mean AIO+numa_balancing is currently deadlockable.
> 
> The problem is, aio_migratepage disable interrupt, but 
> __set_page_dirty_nobuffers
> unintentionally enable it again.
> 
> Generally, all helper function should use spin_lock_irqsave()
> instead of spin_lock_irq() because they don't know caller at all.
> 
> [  599.843948] other info that might help us debug this:
> [  599.873748]  Possible unsafe locking scenario:
> [  599.873748]
> [  599.900902]CPU0
> [  599.912701]
> [  599.924929]   lock(&(>completion_lock)->rlock);
> [  599.950299]   
> [  599.962576] lock(&(>completion_lock)->rlock);
> [  599.985771]
> [  599.985771]  *** DEADLOCK ***
> 
> [  600.375623]  [] dump_stack+0x19/0x1b
> [  600.398769]  [] print_usage_bug+0x1f7/0x208
> [  600.425092]  [] ? 
> print_shortest_lock_dependencies+0x1d0/0x1d0
> [  600.458981]  [] mark_lock+0x21d/0x2a0
> [  600.482910]  [] mark_held_locks+0xb9/0x140
> [  600.508956]  [] ? _raw_spin_unlock_irq+0x2c/0x50
> [  600.536825]  [] trace_hardirqs_on_caller+0x105/0x1d0
> [  600.566861]  [] trace_hardirqs_on+0xd/0x10
> [  600.593210]  [] _raw_spin_unlock_irq+0x2c/0x50
> [  600.620599]  [] __set_page_dirty_nobuffers+0x8c/0xf0
> [  600.649992]  [] migrate_page_copy+0x434/0x540
> [  600.676635]  [] aio_migratepage+0xb1/0x140
> [  600.703126]  [] move_to_new_page+0x7d/0x230
> [  600.729022]  [] migrate_pages+0x5e5/0x700
> [  600.754705]  [] ? buffer_migrate_lock_buffers+0xb0/0xb0
> [  600.785784]  [] migrate_misplaced_page+0xbc/0xf0
> [  600.814029]  [] do_numa_page+0x102/0x190
> [  600.839182]  [] handle_pte_fault+0x241/0x970
> [  600.865875]  [] handle_mm_fault+0x265/0x370
> [  600.892071]  [] __do_page_fault+0x172/0x5a0
> [  600.918065]  [] ? retint_swapgs+0x13/0x1b
> [  600.943493]  [] do_page_fault+0x1a/0x70
> [  600.968081]  [] page_fault+0x28/0x30
> 
> Signed-off-by: KOSAKI Motohiro 
> Cc: Larry Woodman 
> Cc: Rik van Riel 
> Cc: Johannes Weiner 
> Cc: sta...@vger.kernel.org
> ---
>  mm/page-writeback.c |5 +++--
>  1 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 2d30e2c..7106cb1 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2173,11 +2173,12 @@ int __set_page_dirty_nobuffers(struct page *page)
>   if (!TestSetPageDirty(page)) {
>   struct address_space *mapping = page_mapping(page);
>   struct address_space *mapping2;
> + unsigned long flags;
>  
>   if (!mapping)
>   return 1;
>  
> - spin_lock_irq(>tree_lock);
> + spin_lock_irqsave(>tree_lock, flags);
>   mapping2 = page_mapping(page);
>   if (mapping2) { /* Race with truncate? */
>   BUG_ON(mapping2 != mapping);
> @@ -2186,7 +2187,7 @@ int __set_page_dirty_nobuffers(struct page *page)
>   radix_tree_tag_set(>page_tree,
>   page_index(page), PAGECACHE_TAG_DIRTY);
>   }
> - spin_unlock_irq(>tree_lock);
> + spin_unlock_irqrestore(>tree_lock, flags);
>   if (mapping->host) {
>   /* !PageAnon && !swapper_space */
>   __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);

Indeed, good catch.  Do we need the same treatment for 
__set_page_dirty_buffers() that can be called by way of 
clear_page_dirty_for_io()?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mm: __set_page_dirty_nobuffers uses spin_lock_irqseve instead of spin_lock_irq

2014-02-03 Thread kosaki . motohiro
From: KOSAKI Motohiro 

During aio stress test, we observed the following lockdep warning.
This mean AIO+numa_balancing is currently deadlockable.

The problem is, aio_migratepage disable interrupt, but 
__set_page_dirty_nobuffers
unintentionally enable it again.

Generally, all helper function should use spin_lock_irqsave()
instead of spin_lock_irq() because they don't know caller at all.

[  599.843948] other info that might help us debug this:
[  599.873748]  Possible unsafe locking scenario:
[  599.873748]
[  599.900902]CPU0
[  599.912701]
[  599.924929]   lock(&(>completion_lock)->rlock);
[  599.950299]   
[  599.962576] lock(&(>completion_lock)->rlock);
[  599.985771]
[  599.985771]  *** DEADLOCK ***

[  600.375623]  [] dump_stack+0x19/0x1b
[  600.398769]  [] print_usage_bug+0x1f7/0x208
[  600.425092]  [] ? 
print_shortest_lock_dependencies+0x1d0/0x1d0
[  600.458981]  [] mark_lock+0x21d/0x2a0
[  600.482910]  [] mark_held_locks+0xb9/0x140
[  600.508956]  [] ? _raw_spin_unlock_irq+0x2c/0x50
[  600.536825]  [] trace_hardirqs_on_caller+0x105/0x1d0
[  600.566861]  [] trace_hardirqs_on+0xd/0x10
[  600.593210]  [] _raw_spin_unlock_irq+0x2c/0x50
[  600.620599]  [] __set_page_dirty_nobuffers+0x8c/0xf0
[  600.649992]  [] migrate_page_copy+0x434/0x540
[  600.676635]  [] aio_migratepage+0xb1/0x140
[  600.703126]  [] move_to_new_page+0x7d/0x230
[  600.729022]  [] migrate_pages+0x5e5/0x700
[  600.754705]  [] ? buffer_migrate_lock_buffers+0xb0/0xb0
[  600.785784]  [] migrate_misplaced_page+0xbc/0xf0
[  600.814029]  [] do_numa_page+0x102/0x190
[  600.839182]  [] handle_pte_fault+0x241/0x970
[  600.865875]  [] handle_mm_fault+0x265/0x370
[  600.892071]  [] __do_page_fault+0x172/0x5a0
[  600.918065]  [] ? retint_swapgs+0x13/0x1b
[  600.943493]  [] do_page_fault+0x1a/0x70
[  600.968081]  [] page_fault+0x28/0x30

Signed-off-by: KOSAKI Motohiro 
Cc: Larry Woodman 
Cc: Rik van Riel 
Cc: Johannes Weiner 
Cc: sta...@vger.kernel.org
---
 mm/page-writeback.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 2d30e2c..7106cb1 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2173,11 +2173,12 @@ int __set_page_dirty_nobuffers(struct page *page)
if (!TestSetPageDirty(page)) {
struct address_space *mapping = page_mapping(page);
struct address_space *mapping2;
+   unsigned long flags;
 
if (!mapping)
return 1;
 
-   spin_lock_irq(>tree_lock);
+   spin_lock_irqsave(>tree_lock, flags);
mapping2 = page_mapping(page);
if (mapping2) { /* Race with truncate? */
BUG_ON(mapping2 != mapping);
@@ -2186,7 +2187,7 @@ int __set_page_dirty_nobuffers(struct page *page)
radix_tree_tag_set(>page_tree,
page_index(page), PAGECACHE_TAG_DIRTY);
}
-   spin_unlock_irq(>tree_lock);
+   spin_unlock_irqrestore(>tree_lock, flags);
if (mapping->host) {
/* !PageAnon && !swapper_space */
__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mm: __set_page_dirty_nobuffers uses spin_lock_irqseve instead of spin_lock_irq

2014-02-03 Thread kosaki . motohiro
From: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com

During aio stress test, we observed the following lockdep warning.
This mean AIO+numa_balancing is currently deadlockable.

The problem is, aio_migratepage disable interrupt, but 
__set_page_dirty_nobuffers
unintentionally enable it again.

Generally, all helper function should use spin_lock_irqsave()
instead of spin_lock_irq() because they don't know caller at all.

[  599.843948] other info that might help us debug this:
[  599.873748]  Possible unsafe locking scenario:
[  599.873748]
[  599.900902]CPU0
[  599.912701]
[  599.924929]   lock((ctx-completion_lock)-rlock);
[  599.950299]   Interrupt
[  599.962576] lock((ctx-completion_lock)-rlock);
[  599.985771]
[  599.985771]  *** DEADLOCK ***

[  600.375623]  [81678d3c] dump_stack+0x19/0x1b
[  600.398769]  [816731aa] print_usage_bug+0x1f7/0x208
[  600.425092]  [810df370] ? 
print_shortest_lock_dependencies+0x1d0/0x1d0
[  600.458981]  [810e08dd] mark_lock+0x21d/0x2a0
[  600.482910]  [810e0a19] mark_held_locks+0xb9/0x140
[  600.508956]  [8168201c] ? _raw_spin_unlock_irq+0x2c/0x50
[  600.536825]  [810e0ba5] trace_hardirqs_on_caller+0x105/0x1d0
[  600.566861]  [810e0c7d] trace_hardirqs_on+0xd/0x10
[  600.593210]  [8168201c] _raw_spin_unlock_irq+0x2c/0x50
[  600.620599]  [8117f72c] __set_page_dirty_nobuffers+0x8c/0xf0
[  600.649992]  [811d1094] migrate_page_copy+0x434/0x540
[  600.676635]  [8123f5b1] aio_migratepage+0xb1/0x140
[  600.703126]  [811d126d] move_to_new_page+0x7d/0x230
[  600.729022]  [811d1b45] migrate_pages+0x5e5/0x700
[  600.754705]  [811d0070] ? buffer_migrate_lock_buffers+0xb0/0xb0
[  600.785784]  [811d29cc] migrate_misplaced_page+0xbc/0xf0
[  600.814029]  [8119eb62] do_numa_page+0x102/0x190
[  600.839182]  [8119ee31] handle_pte_fault+0x241/0x970
[  600.865875]  [811a0345] handle_mm_fault+0x265/0x370
[  600.892071]  [81686d82] __do_page_fault+0x172/0x5a0
[  600.918065]  [81682cd8] ? retint_swapgs+0x13/0x1b
[  600.943493]  [816871ca] do_page_fault+0x1a/0x70
[  600.968081]  [81682ff8] page_fault+0x28/0x30

Signed-off-by: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
Cc: Larry Woodman lwood...@redhat.com
Cc: Rik van Riel r...@redhat.com
Cc: Johannes Weiner jwei...@redhat.com
Cc: sta...@vger.kernel.org
---
 mm/page-writeback.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 2d30e2c..7106cb1 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2173,11 +2173,12 @@ int __set_page_dirty_nobuffers(struct page *page)
if (!TestSetPageDirty(page)) {
struct address_space *mapping = page_mapping(page);
struct address_space *mapping2;
+   unsigned long flags;
 
if (!mapping)
return 1;
 
-   spin_lock_irq(mapping-tree_lock);
+   spin_lock_irqsave(mapping-tree_lock, flags);
mapping2 = page_mapping(page);
if (mapping2) { /* Race with truncate? */
BUG_ON(mapping2 != mapping);
@@ -2186,7 +2187,7 @@ int __set_page_dirty_nobuffers(struct page *page)
radix_tree_tag_set(mapping-page_tree,
page_index(page), PAGECACHE_TAG_DIRTY);
}
-   spin_unlock_irq(mapping-tree_lock);
+   spin_unlock_irqrestore(mapping-tree_lock, flags);
if (mapping-host) {
/* !PageAnon  !swapper_space */
__mark_inode_dirty(mapping-host, I_DIRTY_PAGES);
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm: __set_page_dirty_nobuffers uses spin_lock_irqseve instead of spin_lock_irq

2014-02-03 Thread David Rientjes
On Mon, 3 Feb 2014, kosaki.motoh...@gmail.com wrote:

 From: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
 
 During aio stress test, we observed the following lockdep warning.
 This mean AIO+numa_balancing is currently deadlockable.
 
 The problem is, aio_migratepage disable interrupt, but 
 __set_page_dirty_nobuffers
 unintentionally enable it again.
 
 Generally, all helper function should use spin_lock_irqsave()
 instead of spin_lock_irq() because they don't know caller at all.
 
 [  599.843948] other info that might help us debug this:
 [  599.873748]  Possible unsafe locking scenario:
 [  599.873748]
 [  599.900902]CPU0
 [  599.912701]
 [  599.924929]   lock((ctx-completion_lock)-rlock);
 [  599.950299]   Interrupt
 [  599.962576] lock((ctx-completion_lock)-rlock);
 [  599.985771]
 [  599.985771]  *** DEADLOCK ***
 
 [  600.375623]  [81678d3c] dump_stack+0x19/0x1b
 [  600.398769]  [816731aa] print_usage_bug+0x1f7/0x208
 [  600.425092]  [810df370] ? 
 print_shortest_lock_dependencies+0x1d0/0x1d0
 [  600.458981]  [810e08dd] mark_lock+0x21d/0x2a0
 [  600.482910]  [810e0a19] mark_held_locks+0xb9/0x140
 [  600.508956]  [8168201c] ? _raw_spin_unlock_irq+0x2c/0x50
 [  600.536825]  [810e0ba5] trace_hardirqs_on_caller+0x105/0x1d0
 [  600.566861]  [810e0c7d] trace_hardirqs_on+0xd/0x10
 [  600.593210]  [8168201c] _raw_spin_unlock_irq+0x2c/0x50
 [  600.620599]  [8117f72c] __set_page_dirty_nobuffers+0x8c/0xf0
 [  600.649992]  [811d1094] migrate_page_copy+0x434/0x540
 [  600.676635]  [8123f5b1] aio_migratepage+0xb1/0x140
 [  600.703126]  [811d126d] move_to_new_page+0x7d/0x230
 [  600.729022]  [811d1b45] migrate_pages+0x5e5/0x700
 [  600.754705]  [811d0070] ? buffer_migrate_lock_buffers+0xb0/0xb0
 [  600.785784]  [811d29cc] migrate_misplaced_page+0xbc/0xf0
 [  600.814029]  [8119eb62] do_numa_page+0x102/0x190
 [  600.839182]  [8119ee31] handle_pte_fault+0x241/0x970
 [  600.865875]  [811a0345] handle_mm_fault+0x265/0x370
 [  600.892071]  [81686d82] __do_page_fault+0x172/0x5a0
 [  600.918065]  [81682cd8] ? retint_swapgs+0x13/0x1b
 [  600.943493]  [816871ca] do_page_fault+0x1a/0x70
 [  600.968081]  [81682ff8] page_fault+0x28/0x30
 
 Signed-off-by: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
 Cc: Larry Woodman lwood...@redhat.com
 Cc: Rik van Riel r...@redhat.com
 Cc: Johannes Weiner jwei...@redhat.com
 Cc: sta...@vger.kernel.org
 ---
  mm/page-writeback.c |5 +++--
  1 files changed, 3 insertions(+), 2 deletions(-)
 
 diff --git a/mm/page-writeback.c b/mm/page-writeback.c
 index 2d30e2c..7106cb1 100644
 --- a/mm/page-writeback.c
 +++ b/mm/page-writeback.c
 @@ -2173,11 +2173,12 @@ int __set_page_dirty_nobuffers(struct page *page)
   if (!TestSetPageDirty(page)) {
   struct address_space *mapping = page_mapping(page);
   struct address_space *mapping2;
 + unsigned long flags;
  
   if (!mapping)
   return 1;
  
 - spin_lock_irq(mapping-tree_lock);
 + spin_lock_irqsave(mapping-tree_lock, flags);
   mapping2 = page_mapping(page);
   if (mapping2) { /* Race with truncate? */
   BUG_ON(mapping2 != mapping);
 @@ -2186,7 +2187,7 @@ int __set_page_dirty_nobuffers(struct page *page)
   radix_tree_tag_set(mapping-page_tree,
   page_index(page), PAGECACHE_TAG_DIRTY);
   }
 - spin_unlock_irq(mapping-tree_lock);
 + spin_unlock_irqrestore(mapping-tree_lock, flags);
   if (mapping-host) {
   /* !PageAnon  !swapper_space */
   __mark_inode_dirty(mapping-host, I_DIRTY_PAGES);

Indeed, good catch.  Do we need the same treatment for 
__set_page_dirty_buffers() that can be called by way of 
clear_page_dirty_for_io()?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/