Re: [PATCH] mm: Fix calculation of dirtyable memory

2012-11-12 Thread Johannes Weiner
On Mon, Nov 12, 2012 at 11:35:28AM -0800, Sonny Rao wrote:
> The system uses global_dirtyable_memory() to calculate the number of
> dirtyable pages (pages that can be allocated to the page cache).  A
> bug causes an underflow, making the page count look like a huge
> unsigned number.  This in turn confuses the dirty writeback throttling
> into aggressively writing back pages as they become dirty (usually one
> page at a time).
> 
> The fix is to ensure there is no underflow while doing the math.
> 
> Signed-off-by: Sonny Rao 
> Signed-off-by: Puneet Kumar 

Thanks for debugging and sending in the patch.

It might be useful to note in the changelog that the crawling
writeback problem only affects highmem systems because of the way the
underflowed count of high memory is subtracted from the overall amount
of dirtyable memory.
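
To make the failure mode concrete, here is a minimal userspace sketch
(not kernel code; the page counts are invented) of the unsigned wrap
that occurs when the reserve exceeds a zone's free plus reclaimable
pages:

	#include <stdio.h>

	int main(void)
	{
		unsigned long free_and_reclaimable = 100;	/* hypothetical */
		unsigned long dirty_balance_reserve = 300;	/* exceeds the above */

		/* Wraps instead of going negative: 2^64 - 200 on LP64. */
		unsigned long x = free_and_reclaimable - dirty_balance_reserve;

		printf("dirtyable pages: %lu\n", x);	/* 18446744073709551416 */
		return 0;
	}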

And that the problem was introduced with v3.2-4896-gab8fabd (which
means that we should include Cc: sta...@kernel.org for 3.3+).

The diff itself looks good to me, thanks again:

Acked-by: Johannes Weiner 


[PATCH] mm: Fix calculation of dirtyable memory

2012-11-12 Thread Sonny Rao
The system uses global_dirtyable_memory() to calculate the number of
dirtyable pages (pages that can be allocated to the page cache).  A
bug causes an underflow, making the page count look like a huge
unsigned number.  This in turn confuses the dirty writeback throttling
into aggressively writing back pages as they become dirty (usually one
page at a time).

The fix is to ensure there is no underflow while doing the math.

Signed-off-by: Sonny Rao 
Signed-off-by: Puneet Kumar 
---
 v2: added akpm's suggestion to make the highmem calculation better
 v3: added Fengguang Wu's suggestions to fix zone_dirtyable_memory() and
 (offlist mail) to use min() in global_dirtyable_memory()
 mm/page-writeback.c |   25 ++++++++++++++++++++-----
 1 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 830893b..f9efbe8 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -201,6 +201,18 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
 		     zone_reclaimable_pages(z) - z->dirty_balance_reserve;
 	}
 	/*
+	 * Unreclaimable memory (kernel memory or anonymous memory
+	 * without swap) can bring down the dirtyable pages below
+	 * the zone's dirty balance reserve and the above calculation
+	 * will underflow.  However we still want to add in nodes
+	 * which are below threshold (negative values) to get a more
+	 * accurate calculation but make sure that the total never
+	 * underflows.
+	 */
+	if ((long)x < 0)
+		x = 0;
+
+	/*
 	 * Make sure that the number of highmem pages is never larger
 	 * than the number of the total dirtyable memory. This can only
 	 * occur in very strange VM situations but we want to make sure
@@ -222,8 +234,8 @@ static unsigned long global_dirtyable_memory(void)
 {
 	unsigned long x;
 
-	x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
-	    dirty_balance_reserve;
+	x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
+	x -= min(x, dirty_balance_reserve);
 
 	if (!vm_highmem_is_dirtyable)
 		x -= highmem_dirtyable_memory(x);
@@ -290,9 +302,12 @@ static unsigned long zone_dirtyable_memory(struct zone *zone)
 	 * highmem zone can hold its share of dirty pages, so we don't
 	 * care about vm_highmem_is_dirtyable here.
 	 */
-	return zone_page_state(zone, NR_FREE_PAGES) +
-	       zone_reclaimable_pages(zone) -
-	       zone->dirty_balance_reserve;
+	unsigned long nr_pages = zone_page_state(zone, NR_FREE_PAGES) +
+		zone_reclaimable_pages(zone);
+
+	/* don't allow this to underflow */
+	nr_pages -= min(nr_pages, zone->dirty_balance_reserve);
+	return nr_pages;
 }
 
 /**
-- 
1.7.7.3
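
For reference, the clamped subtraction the v3 diff relies on can be
read in isolation; this is a plain-C sketch (not kernel code) with the
kernel's min() macro spelled out:

	/* Subtract the reserve, saturating at zero instead of wrapping. */
	unsigned long sub_saturating(unsigned long x, unsigned long reserve)
	{
		unsigned long clamp = reserve < x ? reserve : x;	/* min(x, reserve) */

		return x - clamp;	/* x > reserve ? x - reserve : 0 */
	}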



Re: [PATCH] mm: Fix calculation of dirtyable memory

2012-11-08 Thread Andrew Morton
On Thu,  8 Nov 2012 15:25:35 -0800
Sonny Rao  wrote:

> The system uses global_dirtyable_memory() to calculate the number of
> dirtyable pages (pages that can be allocated to the page cache).  A
> bug causes an underflow, making the page count look like a huge
> unsigned number.  This in turn confuses the dirty writeback throttling
> into aggressively writing back pages as they become dirty (usually one
> page at a time).
> 
> The fix is to ensure there is no underflow while doing the math.
> 
> Signed-off-by: Sonny Rao 
> Signed-off-by: Puneet Kumar 
> ---
>  mm/page-writeback.c |   17 +++++++++++++----
>  1 files changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 830893b..2a6356c 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -194,11 +194,19 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
>  	unsigned long x = 0;
>  
>  	for_each_node_state(node, N_HIGH_MEMORY) {
> +		unsigned long nr_pages;
>  		struct zone *z =
>  			NODE_DATA(node)->node_zones[ZONE_HIGHMEM];
>  
> -		x += zone_page_state(z, NR_FREE_PAGES) +
> -		     zone_reclaimable_pages(z) - z->dirty_balance_reserve;
> +		nr_pages = zone_page_state(z, NR_FREE_PAGES) +
> +			zone_reclaimable_pages(z);
> +		/*
> +		 * Unreclaimable memory (kernel memory or anonymous memory
> +		 * without swap) can bring down the dirtyable pages below
> +		 * the zone's dirty balance reserve.
> +		 */
> +		if (nr_pages >= z->dirty_balance_reserve)
> +			x += nr_pages - z->dirty_balance_reserve;

If the system has two nodes and one is below its dirty_balance_reserve,
we could end up with something like:

x = 0;
...
x += 1000;
...
x += -100;

In this case, your fix would cause highmem_dirtyable_memory() to return
1000.  Would it be better to instead return 900?

IOW, we could instead add logic along the lines of

	if ((long)x < 0)
		x = 0;
	return x;
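
A standalone sketch (plain C, using the made-up numbers above, not
from the patch) of the two clamping strategies shows the difference:

	#include <stdio.h>

	/* Hypothetical per-node surpluses; the second node is below reserve. */
	static const long surplus[] = { 1000, -100 };

	/* Per-node clamp (the v1 patch): the -100 is dropped. */
	static unsigned long clamp_per_node(void)
	{
		unsigned long x = 0;

		for (int i = 0; i < 2; i++)
			if (surplus[i] > 0)
				x += surplus[i];
		return x;	/* 1000 */
	}

	/* Clamp the total once at the end: the -100 still counts. */
	static unsigned long clamp_total(void)
	{
		long x = 0;

		for (int i = 0; i < 2; i++)
			x += surplus[i];
		if (x < 0)
			x = 0;
		return x;	/* 900 */
	}

	int main(void)
	{
		printf("per-node clamp: %lu, total clamp: %lu\n",
		       clamp_per_node(), clamp_total());
		return 0;
	}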



[PATCH] mm: Fix calculation of dirtyable memory

2012-11-08 Thread Sonny Rao
The system uses global_dirtyable_memory() to calculate the number of
dirtyable pages (pages that can be allocated to the page cache).  A
bug causes an underflow, making the page count look like a huge
unsigned number.  This in turn confuses the dirty writeback throttling
into aggressively writing back pages as they become dirty (usually one
page at a time).

The fix is to ensure there is no underflow while doing the math.

Signed-off-by: Sonny Rao 
Signed-off-by: Puneet Kumar 
---
 mm/page-writeback.c |   17 +++++++++++++----
 1 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 830893b..2a6356c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -194,11 +194,19 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
 	unsigned long x = 0;
 
 	for_each_node_state(node, N_HIGH_MEMORY) {
+		unsigned long nr_pages;
 		struct zone *z =
 			NODE_DATA(node)->node_zones[ZONE_HIGHMEM];
 
-		x += zone_page_state(z, NR_FREE_PAGES) +
-		     zone_reclaimable_pages(z) - z->dirty_balance_reserve;
+		nr_pages = zone_page_state(z, NR_FREE_PAGES) +
+			zone_reclaimable_pages(z);
+		/*
+		 * Unreclaimable memory (kernel memory or anonymous memory
+		 * without swap) can bring down the dirtyable pages below
+		 * the zone's dirty balance reserve.
+		 */
+		if (nr_pages >= z->dirty_balance_reserve)
+			x += nr_pages - z->dirty_balance_reserve;
 	}
 	/*
 	 * Make sure that the number of highmem pages is never larger
@@ -222,8 +230,9 @@ static unsigned long global_dirtyable_memory(void)
 {
 	unsigned long x;
 
-	x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
-	    dirty_balance_reserve;
+	x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
+	if (x >= dirty_balance_reserve)
+		x -= dirty_balance_reserve;
 
 	if (!vm_highmem_is_dirtyable)
 		x -= highmem_dirtyable_memory(x);
-- 
1.7.7.3
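
Read in isolation, the v1 guard amounts to the following sketch
(hypothetical helper and inputs, not kernel code); note that when the
total is below the reserve it is left untouched rather than zeroed,
which v3 later tightens to x -= min(x, dirty_balance_reserve):

	static unsigned long dirtyable_sketch(unsigned long free_pages,
					      unsigned long reclaimable,
					      unsigned long reserve)
	{
		unsigned long x = free_pages + reclaimable;

		/* Subtract only when it cannot wrap the unsigned total. */
		if (x >= reserve)
			x -= reserve;
		return x;
	}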
