Re: Bug#695182: [PATCH] Subtract min_free_kbytes from dirtyable memory

2013-01-26 Thread paul . szabo
Dear Jonathan,

>> If you can identify where it was fixed then your patch for older
>> versions should go to stable with a reference to the upstream fix (see
>> Documentation/stable_kernel_rules.txt).
>
> How about this patch?
>
> It was applied in mainline during the 3.3 merge window, so kernels
> newer than 3.2.y shouldn't need it.
>
> ...
> commit ab8fabd46f811d5153d8a0cd2fac9a0d41fb593d upstream.
> ...

Yes, I believe that is the correct patch, surely better than my simple
subtraction of min_free_kbytes.
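
For comparison, here is a minimal user-space sketch in C of the simpler approach, assuming (from the thread subject only) that it subtracts min_free_kbytes, converted to pages, from the free plus reclaimable page count. The names MODEL_PAGE_SHIFT, kbytes_to_pages and dirtyable_minus_min_free are hypothetical, 4 KiB pages are assumed, and the clamp at zero is an added safeguard rather than something taken from any posted patch.

```c
#include <assert.h>
#include <limits.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Convert a size in kilobytes to a number of whole pages. */
static unsigned long kbytes_to_pages(unsigned long kbytes)
{
	return kbytes >> (MODEL_PAGE_SHIFT - 10);
}

/*
 * Hypothetical model: dirtyable memory is free plus reclaimable pages,
 * minus the pages covered by min_free_kbytes, never going below zero.
 */
static unsigned long dirtyable_minus_min_free(unsigned long free_pages,
					      unsigned long reclaimable_pages,
					      unsigned long min_free_kbytes)
{
	unsigned long reserve = kbytes_to_pages(min_free_kbytes);
	unsigned long x;

	/* The sum below must not overflow. */
	assert(free_pages <= ULONG_MAX - reclaimable_pages);
	x = free_pages + reclaimable_pages;

	return x > reserve ? x - reserve : 0;
}
```

This model uses one global number, whereas the upstream commit tracks a reserve per zone that also covers the lowmem reserve and the high watermark.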

Note that this does not "solve" all problems: the latest 3.8 kernel
still crashes with OOM:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1098961/comments/18

Thanks, Paul

Paul Szabo   p...@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney    Australia
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Bug#695182: [PATCH] Subtract min_free_kbytes from dirtyable memory

2013-01-25 Thread Jonathan Nieder
Hi Paul,

Ben Hutchings wrote:

> If you can identify where it was fixed then your patch for older
> versions should go to stable with a reference to the upstream fix (see
> Documentation/stable_kernel_rules.txt).

How about this patch?

It was applied in mainline during the 3.3 merge window, so kernels
newer than 3.2.y shouldn't need it.

-- >8 --
From: Johannes Weiner <jwei...@redhat.com>
Date: Tue, 10 Jan 2012 15:07:42 -0800
Subject: mm: exclude reserved pages from dirtyable memory

commit ab8fabd46f811d5153d8a0cd2fac9a0d41fb593d upstream.

Per-zone dirty limits try to distribute page cache pages allocated for
writing across zones in proportion to the individual zone sizes, to reduce
the likelihood of reclaim having to write back individual pages from the
LRU lists in order to make progress.

This patch:

The amount of dirtyable pages should not include the full number of free
pages: there is a number of reserved pages that the page allocator and
kswapd always try to keep free.

The closer (reclaimable pages - dirty pages) is to the number of reserved
pages, the more likely it becomes for reclaim to run into dirty pages:

   +----------+ ---
   |   anon   |  |
   +----------+  |
   |          |  |
   |          |  -- dirty limit new    -- flusher new
   |   file   |  |                     |
   |          |  |                     |
   |          |  -- dirty limit old    -- flusher old
   |          |                        |
   +----------+                       --- reclaim
   | reserved |
   +----------+
   |  kernel  |
   +----------+

This patch introduces a per-zone dirty reserve that takes both the lowmem
reserve as well as the high watermark of the zone into account, and a
global sum of those per-zone values that is subtracted from the global
amount of dirtyable pages.  The lowmem reserve is unavailable to page
cache allocations and kswapd tries to keep the high watermark free.  We
don't want to end up in a situation where reclaim has to clean pages in
order to balance zones.
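
As a rough illustration of the bookkeeping described above, here is a user-space model in C. It is a sketch under stated assumptions, not the kernel code: struct model_zone, MODEL_NR_ZONES and all three function names are hypothetical. It also assumes that each zone's reserve is its largest lowmem reserve plus its high watermark, capped at the zone size, and that the global result is clamped at zero.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_ZONES 3	/* hypothetical: e.g. DMA, NORMAL, HIGHMEM */

/* Hypothetical user-space stand-in for the kernel's struct zone. */
struct model_zone {
	unsigned long present_pages;
	unsigned long high_wmark;
	unsigned long lowmem_reserve[MODEL_NR_ZONES];
	unsigned long dirty_balance_reserve;
};

/* Largest lowmem reserve plus the high watermark, capped at zone size. */
static unsigned long zone_dirty_reserve(const struct model_zone *z)
{
	unsigned long max = 0;
	int i;

	for (i = 0; i < MODEL_NR_ZONES; i++)
		if (z->lowmem_reserve[i] > max)
			max = z->lowmem_reserve[i];

	max += z->high_wmark;
	if (max > z->present_pages)
		max = z->present_pages;
	return max;
}

/* Store each zone's reserve and return the global sum of the reserves. */
static unsigned long update_dirty_reserves(struct model_zone *zones, int n)
{
	unsigned long total = 0;
	int i;

	assert(zones != NULL);
	for (i = 0; i < n; i++) {
		zones[i].dirty_balance_reserve = zone_dirty_reserve(&zones[i]);
		total += zones[i].dirty_balance_reserve;
	}
	return total;
}

/* Global dirtyable pages: free + reclaimable - reserve, clamped at zero. */
static unsigned long dirtyable_pages(unsigned long free_pages,
				     unsigned long reclaimable_pages,
				     unsigned long reserve)
{
	unsigned long x = free_pages + reclaimable_pages;

	return x > reserve ? x - reserve : 0;
}
```

With made-up zone sizes, a small zone whose reserve would exceed its size contributes only its own page count to the global sum, so the subtraction can never remove more pages than the zones actually hold.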

Not treating reserved pages as dirtyable on a global level is only a
conceptual fix.  In reality, dirty pages are not distributed equally
across zones and reclaim runs into dirty pages on a regular basis.

But it is important to get this right before tackling the problem on a
per-zone level, where the distance between reclaim and the dirty pages is
mostly much smaller in absolute numbers.

[a...@linux-foundation.org: fix highmem build]
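Signed-off-by: Johannes Weiner <jwei...@redhat.com>
Reviewed-by: Rik van Riel <r...@redhat.com>
Reviewed-by: Michal Hocko <mho...@suse.cz>
Reviewed-by: Minchan Kim <minchan@gmail.com>
Acked-by: Mel Gorman <mgor...@suse.de>
Cc: KAMEZAWA Hiroyuki <kamezawa.hir...@jp.fujitsu.com>
Cc: Christoph Hellwig <h...@infradead.org>
Cc: Wu Fengguang <fengguang...@intel.com>
Cc: Dave Chinner <da...@fromorbit.com>
Cc: Jan Kara <j...@suse.cz>
Cc: Shaohua Li <shaohua...@intel.com>
Cc: Chris Mason <chris.ma...@oracle.com>
Signed-off-by: Andrew Morton <a...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torva...@linux-foundation.org>
Signed-off-by: Jonathan Nieder <jrnie...@gmail.com>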
---
 include/linux/mmzone.h |  6 ++++++
 include/linux/swap.h   |  1 +
 mm/page-writeback.c    |  5 +++--
 mm/page_alloc.c        | 19 +++++++++++++++++++
 4 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 25842b6e72e1..a594af3278bc 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -319,6 +319,12 @@ struct zone {
 	 */
 	unsigned long		lowmem_reserve[MAX_NR_ZONES];
 
+	/*
+	 * This is a per-zone reserve of pages that should not be
+	 * considered dirtyable memory.
+	 */
+	unsigned long		dirty_balance_reserve;
+
 #ifdef CONFIG_NUMA
 	int node;
 	/*
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 67b3fa308988..3e60228e7299 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -207,6 +207,7 @@ struct swap_list_t {
 /* linux/mm/page_alloc.c */
 extern unsigned long totalram_pages;
 extern unsigned long totalreserve_pages;
+extern unsigned long dirty_balance_reserve;
 extern unsigned int nr_free_buffer_pages(void);
 extern unsigned int nr_free_pagecache_pages(void);
 
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 50f08241f981..f620e7b0dc26 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -320,7 +320,7 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
 			&NODE_DATA(node)->node_zones[ZONE_HIGHMEM];
 
 		x += zone_page_state(z, NR_FREE_PAGES) +
-		     zone_reclaimable_pages(z);
+		     zone_reclaimable_pages(z) - z->dirty_balance_reserve;
 	}
 	/*
 	 * Make sure that the number of highmem pages is never larger
@@ -344,7 +344,8 @@ unsigned long determine_dirtyable_memory(void)
 {
 	unsigned long x;
 
-	x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
+	x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
+	    dirty_balance_reserve;
 
 	if (!vm_highmem_is_dirtyable)
 		x -= highmem_dirtyable_memory(x);
diff --git a/mm/page_alloc.c 

Re: Bug#695182: [PATCH] Subtract min_free_kbytes from dirtyable memory

2013-01-25 Thread Ben Hutchings
On Sat, 2013-01-26 at 14:07 +1100, paul.sz...@sydney.edu.au wrote:
> Dear Ben,
> 
> > ... the mm maintainers are probably much better placed ...
> 
> Exactly. Now I wonder: are you one of them?

Hah, no.

Ben.

-- 
Ben Hutchings
Any smoothly functioning technology is indistinguishable from a rigged demo.


Re: Bug#695182: [PATCH] Subtract min_free_kbytes from dirtyable memory

2013-01-25 Thread paul . szabo
Dear Ben,

> ... the mm maintainers are probably much better placed ...

Exactly. Now I wonder: are you one of them?

Thanks, Paul

Paul Szabo   p...@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney    Australia


Re: Bug#695182: [PATCH] Subtract min_free_kbytes from dirtyable memory

2013-01-25 Thread Ben Hutchings
On Sat, 2013-01-26 at 10:49 +1100, paul.sz...@sydney.edu.au wrote:
> Dear Ben,
> 
> > If you can identify where it was fixed then ...
> 
> Sorry I cannot do that. I have no idea where kernel changelogs are kept.
> 
> I am happy to do some work. Please do not call me lazy.

The changelogs are in git repositories.  But the mm maintainers are
probably much better placed to identify which was the upstream fix.

Ben.

-- 
Ben Hutchings
Any smoothly functioning technology is indistinguishable from a rigged demo.


Re: Bug#695182: [PATCH] Subtract min_free_kbytes from dirtyable memory

2013-01-25 Thread paul . szabo
Dear Ben,

> If you can identify where it was fixed then ...

Sorry I cannot do that. I have no idea where kernel changelogs are kept.

I am happy to do some work. Please do not call me lazy.

Cheers, Paul

Paul Szabo   p...@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney    Australia


Re: Bug#695182: [PATCH] Subtract min_free_kbytes from dirtyable memory

2013-01-25 Thread Ben Hutchings
On Fri, 2013-01-25 at 20:53 +1100, paul.sz...@sydney.edu.au wrote:
> Dear Minchan,
> 
> > So what's the effect for user?
> > ...
> > It seems you saw old kernel.
> > ...
> > Current kernel includes ...
> > So I think we don't need this patch.
> 
> As I understand now, my patch is "right" and needed for older kernels;
> for newer kernels, the issue has been fixed in equivalent ways; it was
> an oversight that the change was not backported; and any justification
> you need, you can get from those "later better" patches.
[...]

If you can identify where it was fixed then your patch for older
versions should go to stable with a reference to the upstream fix (see
Documentation/stable_kernel_rules.txt).

Ben.

-- 
Ben Hutchings
Q.  Which is the greater problem in the world today, ignorance or apathy?
A.  I don't know and I couldn't care less.

