Re: [PATCH] mm: Fix calculation of dirtyable memory
On Mon, Nov 12, 2012 at 11:35:28AM -0800, Sonny Rao wrote:
> The system uses global_dirtyable_memory() to calculate
> number of dirtyable pages/pages that can be allocated
> to the page cache.  A bug causes an underflow thus making
> the page count look like a big unsigned number.  This in turn
> confuses the dirty writeback throttling to aggressively write
> back pages as they become dirty (usually 1 page at a time).
>
> Fix is to ensure there is no underflow while doing the math.
>
> Signed-off-by: Sonny Rao <sonny...@chromium.org>
> Signed-off-by: Puneet Kumar <puneets...@chromium.org>

Thanks for debugging and sending in the patch.

It might be useful to note in the changelog that the crawling writeback
problem only affects highmem systems because of the way the underflowed
count of high memory is subtracted from the overall amount of dirtyable
memory.  And that the problem was introduced with v3.2-4896-gab8fabd
(which means that we should include Cc: sta...@kernel.org for 3.3+).

The diff itself looks good to me, thanks again:

Acked-by: Johannes Weiner <han...@cmpxchg.org>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
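[Editor's note: the underflow described above comes from C's modular unsigned
arithmetic. A minimal standalone sketch, not the kernel code itself; the
function and parameter names are invented for illustration:]

```c
/*
 * Illustrative sketch only (not mm/page-writeback.c): subtracting a
 * dirty-balance reserve that exceeds the remaining page count wraps
 * the unsigned type around to a huge "dirtyable" count, which is what
 * made writeback throttle pages one at a time.
 */
unsigned long naive_dirtyable(unsigned long free_pages,
			      unsigned long reclaimable,
			      unsigned long reserve)
{
	/* mirrors the unguarded expression: free + reclaimable - reserve */
	return free_pages + reclaimable - reserve;
}
```

With free_pages + reclaimable = 150 and reserve = 200, the result is not a
small negative count but ULONG_MAX - 49 on LP64, so the throttling code
believes almost all memory is dirtyable headroom.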
[PATCH] mm: Fix calculation of dirtyable memory
The system uses global_dirtyable_memory() to calculate
number of dirtyable pages/pages that can be allocated
to the page cache.  A bug causes an underflow thus making
the page count look like a big unsigned number.  This in turn
confuses the dirty writeback throttling to aggressively write
back pages as they become dirty (usually 1 page at a time).

Fix is to ensure there is no underflow while doing the math.

Signed-off-by: Sonny Rao <sonny...@chromium.org>
Signed-off-by: Puneet Kumar <puneets...@chromium.org>
---
v2: added akpm's suggestion to make the highmem calculation better
v3: added Fengguang Wu's suggestions to fix zone_dirtyable_memory() and
    (offlist mail) to use min() in global_dirtyable_memory()

 mm/page-writeback.c |   25 ++++++++++++++++++++-----
 1 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 830893b..f9efbe8 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -201,6 +201,18 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
 			zone_reclaimable_pages(z) - z->dirty_balance_reserve;
 	}
 	/*
+	 * Unreclaimable memory (kernel memory or anonymous memory
+	 * without swap) can bring down the dirtyable pages below
+	 * the zone's dirty balance reserve and the above calculation
+	 * will underflow.  However we still want to add in nodes
+	 * which are below threshold (negative values) to get a more
+	 * accurate calculation but make sure that the total never
+	 * underflows.
+	 */
+	if ((long)x < 0)
+		x = 0;
+
+	/*
 	 * Make sure that the number of highmem pages is never larger
 	 * than the number of the total dirtyable memory.  This can only
 	 * occur in very strange VM situations but we want to make sure
@@ -222,8 +234,8 @@ static unsigned long global_dirtyable_memory(void)
 {
 	unsigned long x;
 
-	x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
-	    dirty_balance_reserve;
+	x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
+	x -= min(x, dirty_balance_reserve);
 
 	if (!vm_highmem_is_dirtyable)
 		x -= highmem_dirtyable_memory(x);
@@ -290,9 +302,12 @@ static unsigned long zone_dirtyable_memory(struct zone *zone)
 	 * highmem zone can hold its share of dirty pages, so we don't
 	 * care about vm_highmem_is_dirtyable here.
 	 */
-	return zone_page_state(zone, NR_FREE_PAGES) +
-	       zone_reclaimable_pages(zone) -
-	       zone->dirty_balance_reserve;
+	unsigned long nr_pages = zone_page_state(zone, NR_FREE_PAGES) +
+				 zone_reclaimable_pages(zone);
+
+	/* don't allow this to underflow */
+	nr_pages -= min(nr_pages, zone->dirty_balance_reserve);
+	return nr_pages;
 }
 
 /**
-- 
1.7.7.3
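[Editor's note: the clamping idiom used by the fix — subtract the reserve but
never let the unsigned count wrap below zero — can be sketched standalone.
min_ul() here is an illustrative stand-in for the kernel's min() macro:]

```c
/*
 * Illustrative sketch of the clamp idiom (not the kernel code itself).
 * min_ul() stands in for the kernel's type-checked min() macro.
 */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

unsigned long clamped_dirtyable(unsigned long free_pages,
				unsigned long reclaimable,
				unsigned long reserve)
{
	unsigned long x = free_pages + reclaimable;

	/* x -= min(x, reserve): saturates at 0 instead of wrapping */
	x -= min_ul(x, reserve);
	return x;
}
```

When the reserve exceeds the available pages this now yields 0 dirtyable
pages rather than a number near ULONG_MAX, which is the entire point of
the patch.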
Re: [PATCH] mm: Fix calculation of dirtyable memory
On Thu, 8 Nov 2012 15:25:35 -0800 Sonny Rao <sonny...@chromium.org> wrote:

> The system uses global_dirtyable_memory() to calculate
> number of dirtyable pages/pages that can be allocated
> to the page cache.  A bug causes an underflow thus making
> the page count look like a big unsigned number.  This in turn
> confuses the dirty writeback throttling to aggressively write
> back pages as they become dirty (usually 1 page at a time).
>
> Fix is to ensure there is no underflow while doing the math.
>
> Signed-off-by: Sonny Rao <sonny...@chromium.org>
> Signed-off-by: Puneet Kumar <puneets...@chromium.org>
> ---
>  mm/page-writeback.c |   17 +++++++++++++----
>  1 files changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 830893b..2a6356c 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -194,11 +194,19 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
>  	unsigned long x = 0;
>
>  	for_each_node_state(node, N_HIGH_MEMORY) {
> +		unsigned long nr_pages;
>  		struct zone *z =
>  			&NODE_DATA(node)->node_zones[ZONE_HIGHMEM];
>
> -		x += zone_page_state(z, NR_FREE_PAGES) +
> -		     zone_reclaimable_pages(z) - z->dirty_balance_reserve;
> +		nr_pages = zone_page_state(z, NR_FREE_PAGES) +
> +			   zone_reclaimable_pages(z);
> +		/*
> +		 * Unreclaimable memory (kernel memory or anonymous memory
> +		 * without swap) can bring down the dirtyable pages below
> +		 * the zone's dirty balance reserve.
> +		 */
> +		if (nr_pages >= z->dirty_balance_reserve)
> +			x += nr_pages - z->dirty_balance_reserve;

If the system has two nodes and one is below its dirty_balance_reserve,
we could end up with something like:

	x = 0;
	...
	x += 1000;
	...
	x += -100;

In this case, your fix would cause highmem_dirtyable_memory() to return
1000.  Would it be better to instead return 900?  IOW, we instead add
logic along the lines of

	if ((long)x < 0)
		x = 0;
	return x;
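[Editor's note: the two approaches being compared — clamping each node's
contribution at zero versus summing signed contributions and clamping only
the total — can be contrasted with a small standalone sketch. The function
names and per-node values below are invented for illustration:]

```c
/*
 * Illustrative comparison only (not the kernel code).  Each element of
 * contrib[] is a node's signed dirtyable count: free + reclaimable
 * minus the node's dirty balance reserve, so a node below its reserve
 * is negative.
 */

/* v1 approach: a node below its reserve simply contributes nothing */
unsigned long per_node_clamp(const long contrib[], int n)
{
	unsigned long x = 0;

	for (int i = 0; i < n; i++)
		if (contrib[i] > 0)
			x += contrib[i];
	return x;
}

/* suggested approach: sum signed values, clamp only the final total */
unsigned long total_clamp(const long contrib[], int n)
{
	long x = 0;

	for (int i = 0; i < n; i++)
		x += contrib[i];	/* negative nodes pull the sum down */
	return x < 0 ? 0 : (unsigned long)x;
}
```

For the two-node example above, {1000, -100}, per_node_clamp() returns 1000
while total_clamp() returns the more accurate 900, which is the behavior the
final version of the patch adopts.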
[PATCH] mm: Fix calculation of dirtyable memory
The system uses global_dirtyable_memory() to calculate
number of dirtyable pages/pages that can be allocated
to the page cache.  A bug causes an underflow thus making
the page count look like a big unsigned number.  This in turn
confuses the dirty writeback throttling to aggressively write
back pages as they become dirty (usually 1 page at a time).

Fix is to ensure there is no underflow while doing the math.

Signed-off-by: Sonny Rao <sonny...@chromium.org>
Signed-off-by: Puneet Kumar <puneets...@chromium.org>
---
 mm/page-writeback.c |   17 +++++++++++++----
 1 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 830893b..2a6356c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -194,11 +194,19 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
 	unsigned long x = 0;
 
 	for_each_node_state(node, N_HIGH_MEMORY) {
+		unsigned long nr_pages;
 		struct zone *z =
 			&NODE_DATA(node)->node_zones[ZONE_HIGHMEM];
 
-		x += zone_page_state(z, NR_FREE_PAGES) +
-		     zone_reclaimable_pages(z) - z->dirty_balance_reserve;
+		nr_pages = zone_page_state(z, NR_FREE_PAGES) +
+			   zone_reclaimable_pages(z);
+		/*
+		 * Unreclaimable memory (kernel memory or anonymous memory
+		 * without swap) can bring down the dirtyable pages below
+		 * the zone's dirty balance reserve.
+		 */
+		if (nr_pages >= z->dirty_balance_reserve)
+			x += nr_pages - z->dirty_balance_reserve;
 	}
 	/*
 	 * Make sure that the number of highmem pages is never larger
@@ -222,8 +230,9 @@ static unsigned long global_dirtyable_memory(void)
 {
 	unsigned long x;
 
-	x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages() -
-	    dirty_balance_reserve;
+	x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
+	if (x >= dirty_balance_reserve)
+		x -= dirty_balance_reserve;
 
 	if (!vm_highmem_is_dirtyable)
 		x -= highmem_dirtyable_memory(x);
-- 
1.7.7.3