Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-02-04 Thread Minchan Kim
On Tue, Jan 22, 2013 at 09:09:54AM +0900, Minchan Kim wrote:
> On Mon, Jan 21, 2013 at 09:39:06AM -0500, Rik van Riel wrote:
> > On 01/20/2013 08:52 PM, Minchan Kim wrote:
> > 
> > > From 94086dc7152359d052802c55c82ef19509fe8cce Mon Sep 17 00:00:00 2001
> > > From: Minchan Kim
> > > Date: Mon, 21 Jan 2013 10:43:43 +0900
> > > Subject: [PATCH] mm: Use up free swap space before reaching OOM kill
> > >
> > > Recently, Luigi reported that there is a lot of free swap space when
> > > OOM happens. It is easily reproduced on zram-over-swap, where many
> > > instances of memory hogs are running and laptop_mode is enabled.
> > > He said there was no problem when he disabled laptop_mode.
> > > The problem I found when investigating is as follows.
> > >
> > > Assumption for ease of explanation: there are no page cache pages in
> > > the system because they have all been reclaimed already.
> > >
> > > 1. try_to_free_pages disables may_writepage when laptop_mode is
> > >    enabled.
> > > 2. shrink_inactive_list isolates victim pages from the inactive anon
> > >    LRU list.
> > > 3. shrink_page_list adds them to the swap cache via add_to_swap but
> > >    does not page them out because sc->may_writepage is 0, so the pages
> > >    are rotated back into the inactive anon LRU list. add_to_swap made
> > >    the pages dirty via SetPageDirty.
> > > 4. Step 3 could not reclaim any pages, so do_try_to_free_pages raises
> > >    the priority and retries reclaim.
> > > 5. shrink_inactive_list tries to isolate victim pages from the inactive
> > >    anon LRU list but fails: it isolates pages in ISOLATE_CLEAN mode,
> > >    and the list is full of the pages dirtied in step 3, so it returns
> > >    without any reclaim progress.
> > > 6. do_try_to_free_pages does not set may_writepage because
> > >    total_scanned is zero: sc->nr_scanned is increased by
> > >    shrink_page_list, but shrink_page_list is not called in step 5 for
> > >    lack of isolated pages.
> > >
> > > This loop continues until OOM happens.
> > > The problem did not happen before [1] was merged because the old
> > > logic's isolation in shrink_inactive_list succeeded and
> > > shrink_page_list was called to page the pages out. That still failed
> > > because of may_writepage, but the important point is that
> > > sc->nr_scanned was increased even though the pages could not be
> > > swapped out, so do_try_to_free_pages could set may_writepage.
> > >
> > > Since [1] was introduced, it is no longer a good idea to depend only
> > > on the number of scanned pages for setting may_writepage. So this
> > > patch adds a new trigger point for setting may_writepage: a priority
> > > below DEF_PRIORITY - 2, which is used to indicate significant memory
> > > pressure in the VM. That is a good fit for our purpose: it is better
> > > to lose power saving or get clickety than to be OOM killed.
> > >
> > > [1] f80c067[mm: zone_reclaim: make isolate_lru_page() filter-aware]
> > >
> > > Reported-by: Luigi Semenzato
> > > Signed-off-by: Minchan Kim
> > 
> > Your patch is a nice simplification.  I am ok with the
> > change, provided it works for Luigi :)
> 
> Thanks, Rik.
> 
> Oops, I forgot to Cc Luigi. Adding him again.
> Luigi, could you test this patch?
> Thanks for your endless effort.
> 
> > 
> > Acked-by: Rik van Riel 
> > 


Andrew,
I hope Luigi can confirm this patch, but he seems to be very busy.
At a minimum, I tested this patch and it passed my test.
Could you apply this and remove [2]?
Otherwise, should I wait for Luigi?

[2] mm: prevent addition of pages to swap if may_writepage is unset

From 72cdf4159427c1ecdbd21a40b8bd1f13d5b8d5e2 Mon Sep 17 00:00:00 2001
From: Minchan Kim 
Date: Mon, 21 Jan 2013 10:52:22 +0900
Subject: [PATCH] mm: Use up free swap space before reaching OOM kill

Recently, Luigi reported that there is a lot of free swap space when
OOM happens. It is easily reproduced on zram-over-swap, where many
instances of memory hogs are running and laptop_mode is enabled.
He said there was no problem when he disabled laptop_mode.
The problem I found when investigating is as follows.

Assumption for ease of explanation: there are no page cache pages in
the system because they have all been reclaimed already.

1. try_to_free_pages disables may_writepage when laptop_mode is enabled.
2. shrink_inactive_list isolates victim pages from the inactive anon LRU
   list.
3. shrink_page_list adds them to the swap cache via add_to_swap but does
   not page them out because sc->may_writepage is 0, so the pages are
   rotated back into the inactive anon LRU list. add_to_swap made the
   pages dirty via SetPageDirty.
4. Step 3 could not reclaim any pages, so do_try_to_free_pages raises the
   priority and retries reclaim.
5. shrink_inactive_list tries to isolate victim pages from the inactive
   anon LRU list but fails: it isolates pages in ISOLATE_CLEAN mode, and
   the list is full of the pages dirtied in step 3, so it returns without
   any reclaim progress.
6. do_try_to_free_pages does not set may_writepage because total_scanned
   is zero.

Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-21 Thread Minchan Kim
On Mon, Jan 21, 2013 at 09:39:06AM -0500, Rik van Riel wrote:
> On 01/20/2013 08:52 PM, Minchan Kim wrote:
> 
> > From 94086dc7152359d052802c55c82ef19509fe8cce Mon Sep 17 00:00:00 2001
> > From: Minchan Kim
> > Date: Mon, 21 Jan 2013 10:43:43 +0900
> > Subject: [PATCH] mm: Use up free swap space before reaching OOM kill
> >
> > Recently, Luigi reported that there is a lot of free swap space when
> > OOM happens. It is easily reproduced on zram-over-swap, where many
> > instances of memory hogs are running and laptop_mode is enabled.
> > He said there was no problem when he disabled laptop_mode.
> > The problem I found when investigating is as follows.
> >
> > Assumption for ease of explanation: there are no page cache pages in
> > the system because they have all been reclaimed already.
> >
> > 1. try_to_free_pages disables may_writepage when laptop_mode is
> >    enabled.
> > 2. shrink_inactive_list isolates victim pages from the inactive anon
> >    LRU list.
> > 3. shrink_page_list adds them to the swap cache via add_to_swap but
> >    does not page them out because sc->may_writepage is 0, so the pages
> >    are rotated back into the inactive anon LRU list. add_to_swap made
> >    the pages dirty via SetPageDirty.
> > 4. Step 3 could not reclaim any pages, so do_try_to_free_pages raises
> >    the priority and retries reclaim.
> > 5. shrink_inactive_list tries to isolate victim pages from the inactive
> >    anon LRU list but fails: it isolates pages in ISOLATE_CLEAN mode,
> >    and the list is full of the pages dirtied in step 3, so it returns
> >    without any reclaim progress.
> > 6. do_try_to_free_pages does not set may_writepage because
> >    total_scanned is zero: sc->nr_scanned is increased by
> >    shrink_page_list, but shrink_page_list is not called in step 5 for
> >    lack of isolated pages.
> >
> > This loop continues until OOM happens.
> > The problem did not happen before [1] was merged because the old
> > logic's isolation in shrink_inactive_list succeeded and
> > shrink_page_list was called to page the pages out. That still failed
> > because of may_writepage, but the important point is that
> > sc->nr_scanned was increased even though the pages could not be
> > swapped out, so do_try_to_free_pages could set may_writepage.
> >
> > Since [1] was introduced, it is no longer a good idea to depend only
> > on the number of scanned pages for setting may_writepage. So this
> > patch adds a new trigger point for setting may_writepage: a priority
> > below DEF_PRIORITY - 2, which is used to indicate significant memory
> > pressure in the VM. That is a good fit for our purpose: it is better
> > to lose power saving or get clickety than to be OOM killed.
> >
> > [1] f80c067[mm: zone_reclaim: make isolate_lru_page() filter-aware]
> >
> > Reported-by: Luigi Semenzato
> > Signed-off-by: Minchan Kim
> 
> Your patch is a nice simplification.  I am ok with the
> change, provided it works for Luigi :)

Thanks, Rik.

Oops, I forgot to Cc Luigi. Adding him again.
Luigi, could you test this patch?
Thanks for your endless effort.

> 
> Acked-by: Rik van Riel 
> 
> 
> -- 
> All rights reversed
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: em...@kvack.org

-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-21 Thread Rik van Riel

On 01/20/2013 08:52 PM, Minchan Kim wrote:


 From 94086dc7152359d052802c55c82ef19509fe8cce Mon Sep 17 00:00:00 2001
From: Minchan Kim 
Date: Mon, 21 Jan 2013 10:43:43 +0900
Subject: [PATCH] mm: Use up free swap space before reaching OOM kill

Recently, Luigi reported that there is a lot of free swap space when
OOM happens. It is easily reproduced on zram-over-swap, where many
instances of memory hogs are running and laptop_mode is enabled.
He said there was no problem when he disabled laptop_mode.
The problem I found when investigating is as follows.

Assumption for ease of explanation: there are no page cache pages in
the system because they have all been reclaimed already.

1. try_to_free_pages disables may_writepage when laptop_mode is enabled.
2. shrink_inactive_list isolates victim pages from the inactive anon LRU
   list.
3. shrink_page_list adds them to the swap cache via add_to_swap but does
   not page them out because sc->may_writepage is 0, so the pages are
   rotated back into the inactive anon LRU list. add_to_swap made the
   pages dirty via SetPageDirty.
4. Step 3 could not reclaim any pages, so do_try_to_free_pages raises the
   priority and retries reclaim.
5. shrink_inactive_list tries to isolate victim pages from the inactive
   anon LRU list but fails: it isolates pages in ISOLATE_CLEAN mode, and
   the list is full of the pages dirtied in step 3, so it returns without
   any reclaim progress.
6. do_try_to_free_pages does not set may_writepage because total_scanned
   is zero: sc->nr_scanned is increased by shrink_page_list, but
   shrink_page_list is not called in step 5 for lack of isolated pages.

This loop continues until OOM happens.
The problem did not happen before [1] was merged because the old
logic's isolation in shrink_inactive_list succeeded and
shrink_page_list was called to page the pages out. That still failed
because of may_writepage, but the important point is that
sc->nr_scanned was increased even though the pages could not be
swapped out, so do_try_to_free_pages could set may_writepage.
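
For illustration only (this is not the patch): the six steps and the
pre-[1] behaviour can be sketched as a small user-space model. Every
toy_* name, TOY_NR_PAGES and the 1.5x scan threshold are assumptions of
this sketch, not kernel code; only DEF_PRIORITY mirrors the kernel value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEF_PRIORITY 12 /* starting scan priority, as in the kernel */
#define TOY_NR_PAGES 8  /* hypothetical LRU size for this model */

enum toy_state { TOY_ANON, TOY_SWAPCACHE_DIRTY, TOY_RECLAIMED };

struct toy_result {
	unsigned long total_scanned;
	unsigned long nr_reclaimed;
	bool may_writepage;
};

/*
 * Toy model of one try_to_free_pages() call with laptop_mode enabled.
 * filter_dirty = true models isolation that skips dirty pages while
 * may_writepage is unset (the behaviour described for after [1]).
 */
static struct toy_result toy_try_to_free_pages(enum toy_state *lru,
						bool filter_dirty)
{
	struct toy_result r = { 0, 0, false };	/* step 1 */
	/* assumed threshold: 1.5x the number of pages in the model */
	const unsigned long writeback_threshold =
		TOY_NR_PAGES + TOY_NR_PAGES / 2;

	for (int priority = DEF_PRIORITY; priority >= 0; priority--) {
		unsigned long nr_scanned = 0;

		for (size_t i = 0; i < TOY_NR_PAGES; i++) {
			if (lru[i] == TOY_RECLAIMED)
				continue;
			/* steps 2 and 5: dirty pages are not isolated */
			if (filter_dirty && !r.may_writepage &&
			    lru[i] == TOY_SWAPCACHE_DIRTY)
				continue;
			nr_scanned++;	/* counted only for isolated pages */
			if (r.may_writepage) {
				lru[i] = TOY_RECLAIMED;	/* paged out */
				r.nr_reclaimed++;
			} else {
				lru[i] = TOY_SWAPCACHE_DIRTY;	/* step 3 */
			}
		}

		r.total_scanned += nr_scanned;
		if (r.nr_reclaimed > 0)
			break;
		/* step 6: the only trigger depends on scanned pages */
		if (r.total_scanned > writeback_threshold)
			r.may_writepage = true;
	}
	return r;
}

#ifdef TOY_RECLAIM_DEMO
int main(void)
{
	enum toy_state lru[TOY_NR_PAGES] = { TOY_ANON };

	/* First call dirties every page and reclaims nothing. */
	assert(toy_try_to_free_pages(lru, true).nr_reclaimed == 0);
	/* Later calls cannot isolate anything, so nothing is scanned. */
	assert(toy_try_to_free_pages(lru, true).total_scanned == 0);
	return 0;
}
#endif
```

Built with -DTOY_RECLAIM_DEMO, main() runs the two checks. Calling the
function with filter_dirty = false on a fresh array models the old
logic: the dirty pages are still scanned, the threshold is crossed on the
second pass, and the third pass reclaims.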

Since [1] was introduced, it is no longer a good idea to depend only
on the number of scanned pages for setting may_writepage. So this
patch adds a new trigger point for setting may_writepage: a priority
below DEF_PRIORITY - 2, which is used to indicate significant memory
pressure in the VM. That is a good fit for our purpose: it is better
to lose power saving or get clickety than to be OOM killed.
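
As a rough sketch of the new trigger point (again a user-space model, not
the actual diff): the function below walks the priorities and reports
where may_writepage would be set. toy_writepage_priority, its parameters
and the threshold of 48 used in the demo are hypothetical; only the
"priority < DEF_PRIORITY - 2" condition comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>

#define DEF_PRIORITY 12 /* starting scan priority, as in the kernel */

/*
 * Returns the priority at which may_writepage would be set, or -1 if
 * it is never set. scanned_per_pass models how many pages each reclaim
 * pass manages to scan; it is 0 in the failure case from the report.
 */
static int toy_writepage_priority(unsigned long scanned_per_pass,
				  unsigned long writeback_threshold,
				  bool use_priority_trigger)
{
	unsigned long total_scanned = 0;

	for (int priority = DEF_PRIORITY; priority >= 0; priority--) {
		total_scanned += scanned_per_pass;

		/* new trigger: significant memory pressure */
		if (use_priority_trigger && priority < DEF_PRIORITY - 2)
			return priority;

		/* old trigger: enough pages were scanned */
		if (total_scanned > writeback_threshold)
			return priority;
	}
	return -1;
}

#ifdef TOY_TRIGGER_DEMO
int main(void)
{
	/* Nothing scanned: the old trigger alone never fires. */
	assert(toy_writepage_priority(0, 48, false) == -1);
	/* With the priority trigger, writing starts at DEF_PRIORITY - 3. */
	assert(toy_writepage_priority(0, 48, true) == DEF_PRIORITY - 3);
	return 0;
}
#endif
```

When scanning does make progress, the scan-count trigger can still fire
first, so in this model the priority check only matters once scanning has
stalled.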

[1] f80c067[mm: zone_reclaim: make isolate_lru_page() filter-aware]

Reported-by: Luigi Semenzato 
Signed-off-by: Minchan Kim 


Your patch is a nice simplification.  I am ok with the
change, provided it works for Luigi :)

Acked-by: Rik van Riel 


--
All rights reversed


Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-20 Thread Minchan Kim
Hi,

On Fri, Jan 18, 2013 at 08:36:42AM +0900, Minchan Kim wrote:
> On Thu, Jan 17, 2013 at 02:22:38PM -0800, Andrew Morton wrote:
> > On Thu, 17 Jan 2013 09:53:14 +0900
> > Minchan Kim  wrote:
> > 
> > > Recently, Luigi reported that there is a lot of free swap space when
> > > OOM happens. It is easily reproduced on zram-over-swap, where many
> > > instances of memory hogs are running and laptop_mode is enabled.
> > > He said there was no problem when he disabled laptop_mode.
> > >
> > > The problem I found when investigating is as follows.
> > >
> > > Assumption for ease of explanation: there are no page cache pages in
> > > the system because they have all been reclaimed already.
> > >
> > > 1. try_to_free_pages disables may_writepage when laptop_mode is
> > >    enabled.
> > > 2. shrink_inactive_list isolates victim pages from the inactive anon
> > >    LRU list.
> > > 3. shrink_page_list adds them to the swap cache via add_to_swap but
> > >    does not page them out because sc->may_writepage is 0, so the pages
> > >    are rotated back into the inactive anon LRU list. add_to_swap made
> > >    the pages dirty via SetPageDirty.
> > > 4. Step 3 could not reclaim any pages, so do_try_to_free_pages raises
> > >    the priority and retries reclaim.
> > > 5. shrink_inactive_list tries to isolate victim pages from the inactive
> > >    anon LRU list but fails: it isolates pages in ISOLATE_CLEAN mode,
> > >    and the list is full of the pages dirtied in step 3, so it returns
> > >    without any reclaim progress.
> > > 6. do_try_to_free_pages does not set may_write because total_scanned
> > >    is zero.
> > 
> > s/may_write/may_writepage/
> 
> Thanks!
> 
> > 
> > >    That is because sc->nr_scanned is increased by shrink_page_list, but
> > >    shrink_page_list is not called in step 5 for lack of isolated pages.
> > 
> > This is the bug, is it not?
> > 
> > In laptop mode, we still need to write out dirty swapcache at some
> > point.  An appropriate time to do this is when the scanning priority is
> 
> Yes, and exactly when that point comes is really important. Currently it
> depends on the number of pages scanned by shrink_page_list. That means we
> must isolate victim pages from the inactive LRU list and call
> shrink_page_list to increase sc->nr_scanned. Unfortunately, we have various
> filters that reduce CPU consumption and LRU churning when the VM tries to
> isolate victim pages, and they can prevent pages from being isolated from
> the LRU list.
> 
> > getting high.  But it seems that this ISOLATE_CLEAN->total_scanned
> 
> Yes. I absolutely agree that the point should depend on priority, NOT on
> the number of scanned pages. I already mentioned that to you:
> https://lkml.org/lkml/2013/1/10/643
>
> We already use that heuristic in several places in the VM, i.e.
> DEF_PRIORITY - 2. The reason I hesitated with this patch is that I think it
> should go to the stable tree, so it should be really small and have no side
> effects. I did not want to change laptop_mode behavior heavily by changing
> the condition that triggers may_writepage.
> 
> > interaction is preventing that.
> > 
> > (An enhancement to laptop mode would be to opportunistically write out
> > dirty swapcache in or around laptop_mode_timer_fn()).
> 
> It could, but that should be a separate patch, and the VM shouldn't rely
> ONLY on laptop_mode_timer_fn, IMHO. The VM should have its own rule for
> reclaiming pages, regardless of laptop_mode's help, to prevent OOM kills.
> 
> > 
> > > This loop continues until OOM happens.
> > > The problem did not happen before [1] was merged because the old logic's
> > > isolation in shrink_inactive_list succeeded and shrink_page_list was
> > > called to page the pages out. That still failed because of may_writepage,
> > > but the important point is that sc->nr_scanned was increased even though
> > > the pages could not be swapped out, so do_try_to_free_pages could set
> > > may_writepage.
> > > So this patch needs to go to the stable tree although it's a band-aid.
> > > Then, for the latest Linus tree, we should fix laptop_mode's fundamental
> > > problem.
> > 
> > Well.  Perhaps we can do that now.
> 
> Okay. If you don't object to my suggestion, I will send patches next week.
> Thanks for the review, Andrew!

Andrew, if nobody objects, I would like to drop [1] and add this patch
instead.
Luigi, the patch below passed my test. If nobody objects, could you test it?

Thanks!

[1] mm: prevent addition of pages to swap if may_writepage is unset

-- 8< --

From 94086dc7152359d052802c55c82ef19509fe8cce Mon Sep 17 00:00:00 2001
From: Minchan Kim 
Date: Mon, 21 Jan 2013 10:43:43 +0900
Subject: [PATCH] mm: Use up free swap space before reaching OOM kill

Recently, Luigi reported that there is a lot of free swap space when
OOM happens. It is easily reproduced on zram-over-swap, where many
instances of memory hogs are running and laptop_mode is enabled.
He said there was no problem when he disabled laptop_mode.
The problem 

Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-20 Thread Minchan Kim
Hi,

On Fri, Jan 18, 2013 at 08:36:42AM +0900, Minchan Kim wrote:
 On Thu, Jan 17, 2013 at 02:22:38PM -0800, Andrew Morton wrote:
  On Thu, 17 Jan 2013 09:53:14 +0900
  Minchan Kim minc...@kernel.org wrote:
  
   Recently, Luigi reported there are lots of free swap space when
   OOM happens. It's easily reproduced on zram-over-swap, where
   many instance of memory hogs are running and laptop_mode is enabled.
   He said there was no problem when he disabled laptop_mode.
   
   The problem when I investigate problem is following as.
   
   Assumption for easy explanation: There are no page cache page in system
   because they all are already reclaimed.
   
   1. try_to_free_pages disable may_writepage when laptop_mode is enabled.
   2. shrink_inactive_list isolates victim pages from inactive anon lru list.
   3. shrink_page_list adds them to swapcache via add_to_swap but it doesn't
  pageout because sc-may_writepage is 0 so the page is rotated back into
  inactive anon lru list. The add_to_swap made the page Dirty by 
   SetPageDirty
   4. 3 couldn't reclaim any pages so do_try_to_free_pages increase priority 
   and
  retry reclaim with higher priority.
   5. shrink_inactlive_list try to isolate victim pages from inactive anon 
   lru list
> > >    but got failed because it try to isolate pages with ISOLATE_CLEAN mode but
> > >    inactive anon lru list is full of dirty pages by 3 so it just returns
> > >    without  any reclaim progress.
> > > 6. do_try_to_free_pages doesn't set may_write due to zero total_scanned.
> >
> > s/may_write/may_writepage/
>
> Thanks!
>
> >
> > >    Because sc->nr_scanned is increased by shrink_page_list but we don't call
> > >    shrink_page_list in 5 due to short of isolated pages.
> >
> > This is the bug, is it not?
> >
> > In laptop mode, we still need to write out dirty swapcache at some
> > point.  An appropriate time to do this is when the scanning priority is
>
> Yes and when to some point is really important. Now, the point for that is
> depends on on the number of scanned pages by shrink_page_list. It means we
> must isolate victim pages from inactive LRU list and call shrink_page_list
> to increase sc->nr_scanned but unfortunately, we have various filters to
> decrease CPU consumption and LRU churning when VM try to isolate victim pages
> so it could prevent isolating victim pages from LRU list.
>
> > getting high.  But it seems that this ISOLATE_CLEAN->total_scanned
>
> Yes. I absolutely agree on that some point should depend on priority, NOT
> the number of scanned pages. And I already said to you about that.
> https://lkml.org/lkml/2013/1/10/643
>
> We used to use such heuristic in several places in VM, ie DEF_PRIORITY - 2
> But why I hesitate with the patch is that I think this patch should go to
> stable tree so the patch should be really small and have no side effect so
> I don't wanted to change laptop_mode behavior heavily caused by changing
> condition for may_writepage trigger point.
>
> > interaction is preventing that.
> >
> > (An enhancement to laptop mode would be to opportunistically write out
> > dirty swapcache in or around laptop_mode_timer_fn()).
>
> It could but it should be another patch and VM shouldn't rely on ONLY
> laptop_mode_timer_fn, IMHO. VM should have own rule to reclaim pages
> regardless of laptop_mode's help to prevent OOM kill.
>
> >
> > > Above loop is continued until OOM happens.
> > > The problem didn't happen before [1] was merged because old logic's isolatation
> > > in shrink_inactive_list was successful and tried to call shrink_page_list
> > > to pageout them but it still ends up failed to page out by may_writepage.
> > > But important point is that sc->nr_scanned was increased althoug we couldn't
> > > swap out them so do_try_to_free_pages could set may_writepages.
> > > So this patch need to go stable tree althoug it's a band-aid.
> > > Then, for latest linus tree, we should fix laptop_mode's fundamental
> > > problem.
> >
> > Well.  Perhaps we can do that now.
>
> Okay. If you don't object my suggestion, I will send patches next week.
> Thanks for the review, Andrew!

Andrew, if nobody objects, I would like to drop [1] and add this patch instead.
Luigi, the patch below passed my test. If nobody objects, could you test it?

Thanks!

[1] mm: prevent addition of pages to swap if may_writepage is unset

--  

From 94086dc7152359d052802c55c82ef19509fe8cce Mon Sep 17 00:00:00 2001
From: Minchan Kim <minc...@kernel.org>
Date: Mon, 21 Jan 2013 10:43:43 +0900
Subject: [PATCH] mm: Use up free swap space before reaching OOM kill

Recently, Luigi reported that there is lots of free swap space when
OOM happens. It's easily reproduced on zram-over-swap, where
many instances of memory hogs are running and laptop_mode is enabled.
He said there was no problem when he disabled laptop_mode.
What I found when I investigated the problem is as follows.

Assumption for easy explanation: There are no page cache pages in the
system because they have all already been reclaimed.

1. 

Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-17 Thread Minchan Kim
On Thu, Jan 17, 2013 at 02:22:38PM -0800, Andrew Morton wrote:
> On Thu, 17 Jan 2013 09:53:14 +0900
> Minchan Kim  wrote:
> 
> > Recently, Luigi reported there are lots of free swap space when
> > OOM happens. It's easily reproduced on zram-over-swap, where
> > many instance of memory hogs are running and laptop_mode is enabled.
> > He said there was no problem when he disabled laptop_mode.
> > 
> > The problem when I investigate problem is following as.
> > 
> > Assumption for easy explanation: There are no page cache page in system
> > because they all are already reclaimed.
> > 
> > 1. try_to_free_pages disable may_writepage when laptop_mode is enabled.
> > 2. shrink_inactive_list isolates victim pages from inactive anon lru list.
> > 3. shrink_page_list adds them to swapcache via add_to_swap but it doesn't
> >    pageout because sc->may_writepage is 0 so the page is rotated back into
> >    inactive anon lru list. The add_to_swap made the page Dirty by SetPageDirty
> > 4. 3 couldn't reclaim any pages so do_try_to_free_pages increase priority and
> >    retry reclaim with higher priority.
> > 5. shrink_inactlive_list try to isolate victim pages from inactive anon lru list
> >    but got failed because it try to isolate pages with ISOLATE_CLEAN mode but
> >    inactive anon lru list is full of dirty pages by 3 so it just returns
> >    without  any reclaim progress.
> > 6. do_try_to_free_pages doesn't set may_write due to zero total_scanned.
> 
> s/may_write/may_writepage/

Thanks!

> 
> >    Because sc->nr_scanned is increased by shrink_page_list but we don't call
> >    shrink_page_list in 5 due to short of isolated pages.
> 
> This is the bug, is it not?
> 
> In laptop mode, we still need to write out dirty swapcache at some
> point.  An appropriate time to do this is when the scanning priority is

Yes, and exactly when that point comes is really important. Right now it
depends on the number of pages scanned by shrink_page_list. That means we
must isolate victim pages from the inactive LRU list and call shrink_page_list
to increase sc->nr_scanned. Unfortunately, we have various filters that
reduce CPU consumption and LRU churning when the VM tries to isolate victim
pages, and they can prevent victim pages from being isolated from the LRU list.

> getting high.  But it seems that this ISOLATE_CLEAN->total_scanned

Yes. I absolutely agree that the point should depend on priority, NOT on
the number of scanned pages. I already told you that:
https://lkml.org/lkml/2013/1/10/643

We already use such a heuristic in several places in the VM, i.e.
DEF_PRIORITY - 2. The reason I hesitated with this patch is that I think it
should go to the stable tree, so it should be really small and have no side
effects. I didn't want to change laptop_mode behavior heavily by changing
the condition that triggers may_writepage.
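
The two trigger conditions discussed here can be put side by side in a small
standalone sketch. This is not kernel code. The function names are made up
for illustration, the threshold formula (nr_to_reclaim plus half of it) is
my recollection of the 3.x-era do_try_to_free_pages and should be treated as
an assumption, and DEF_PRIORITY = 12 is the usual kernel value.

```c
#include <assert.h>
#include <stdbool.h>

#define DEF_PRIORITY 12	/* usual starting scan priority (assumption) */

/*
 * Scan-count rule described in the thread: writeback is allowed only
 * after enough pages went through shrink_page_list.
 * The threshold formula is an assumption, not quoted from the thread.
 */
static bool may_writepage_by_scanned(unsigned long total_scanned,
				     unsigned long nr_to_reclaim)
{
	unsigned long writeback_threshold = nr_to_reclaim + nr_to_reclaim / 2;

	return total_scanned > writeback_threshold;
}

/*
 * Hypothetical priority rule: allow writeback once reclaim has failed
 * for a few rounds, no matter how many pages were scanned.
 */
static bool may_writepage_by_priority(int priority)
{
	assert(priority >= 0 && priority <= DEF_PRIORITY);
	return priority < DEF_PRIORITY - 2;
}
```

With zero scanned pages the first rule never fires, whatever the priority,
while the second one fires as soon as the priority has dropped below
DEF_PRIORITY - 2. That is the difference being argued about above.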

> interaction is preventing that.
> 
> (An enhancement to laptop mode would be to opportunistically write out
> dirty swapcache in or around laptop_mode_timer_fn()).

It could, but that should be a separate patch, and the VM shouldn't rely ONLY
on laptop_mode_timer_fn, IMHO. The VM should have its own rule for reclaiming
pages, regardless of laptop_mode's help, to prevent OOM kills.

> 
> > Above loop is continued until OOM happens.
> > The problem didn't happen before [1] was merged because old logic's isolatation
> > in shrink_inactive_list was successful and tried to call shrink_page_list
> > to pageout them but it still ends up failed to page out by may_writepage.
> > But important point is that sc->nr_scanned was increased althoug we couldn't
> > swap out them so do_try_to_free_pages could set may_writepages.
> > So this patch need to go stable tree althoug it's a band-aid.
> > Then, for latest linus tree, we should fix laptop_mode's fundamental
> > problem.
> 
> Well.  Perhaps we can do that now.

Okay. If you don't object to my suggestion, I will send patches next week.
Thanks for the review, Andrew!

> 
> > [1] f80c067[mm: zone_reclaim: make isolate_lru_page() filter-aware]
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"d...@kvack.org"> em...@kvack.org </a>

-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-17 Thread Andrew Morton
On Thu, 17 Jan 2013 09:53:14 +0900
Minchan Kim  wrote:

> Recently, Luigi reported there are lots of free swap space when
> OOM happens. It's easily reproduced on zram-over-swap, where
> many instance of memory hogs are running and laptop_mode is enabled.
> He said there was no problem when he disabled laptop_mode.
> 
> The problem when I investigate problem is following as.
> 
> Assumption for easy explanation: There are no page cache page in system
> because they all are already reclaimed.
> 
> 1. try_to_free_pages disable may_writepage when laptop_mode is enabled.
> 2. shrink_inactive_list isolates victim pages from inactive anon lru list.
> 3. shrink_page_list adds them to swapcache via add_to_swap but it doesn't
>    pageout because sc->may_writepage is 0 so the page is rotated back into
>    inactive anon lru list. The add_to_swap made the page Dirty by SetPageDirty
> 4. 3 couldn't reclaim any pages so do_try_to_free_pages increase priority and
>    retry reclaim with higher priority.
> 5. shrink_inactlive_list try to isolate victim pages from inactive anon lru list
>    but got failed because it try to isolate pages with ISOLATE_CLEAN mode but
>    inactive anon lru list is full of dirty pages by 3 so it just returns
>    without  any reclaim progress.
> 6. do_try_to_free_pages doesn't set may_write due to zero total_scanned.

s/may_write/may_writepage/

>    Because sc->nr_scanned is increased by shrink_page_list but we don't call
>    shrink_page_list in 5 due to short of isolated pages.

This is the bug, is it not?

In laptop mode, we still need to write out dirty swapcache at some
point.  An appropriate time to do this is when the scanning priority is
getting high.  But it seems that this ISOLATE_CLEAN->total_scanned
interaction is preventing that.

(An enhancement to laptop mode would be to opportunistically write out
dirty swapcache in or around laptop_mode_timer_fn()).

> Above loop is continued until OOM happens.
> The problem didn't happen before [1] was merged because old logic's isolatation
> in shrink_inactive_list was successful and tried to call shrink_page_list
> to pageout them but it still ends up failed to page out by may_writepage.
> But important point is that sc->nr_scanned was increased althoug we couldn't
> swap out them so do_try_to_free_pages could set may_writepages.
> So this patch need to go stable tree althoug it's a band-aid.
> Then, for latest linus tree, we should fix laptop_mode's fundamental
> problem.

Well.  Perhaps we can do that now.

> [1] f80c067[mm: zone_reclaim: make isolate_lru_page() filter-aware]


Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-16 Thread Minchan Kim
On Wed, Jan 16, 2013 at 01:41:55PM -0800, Andrew Morton wrote:
> On Wed,  9 Jan 2013 15:21:13 +0900
> Minchan Kim  wrote:
> >
> 
> This changelog is quite hard to understand :(
> 
> > Recently, Luigi reported there are lots of free swap space when
> > OOM happens. It's easily reproduced on zram-over-swap, where
> > many instance of memory hogs are running and laptop_mode is enabled.
> > 
> > Luigi reported there was no problem when he disabled laptop_mode.
> > The problem when I investigate problem is following as.
> > 
> > try_to_free_pages disable may_writepage if laptop_mode is enabled.
> > shrink_page_list adds lots of anon pages in swap cache by
> > add_to_swap, which makes pages Dirty and rotate them to head of
> > inactive LRU without pageout. If it is repeated, inactive anon LRU
> > is full of Dirty and SwapCache pages.
> 
> "Dirty and SwapCache" is ambigious.  Does it mean "dirty pages and
> swapcache pages" or does it mean "dirty swapcache pages".  The latter,
> I expect.

Yep.

> 
> > 
> > In case of that, isolate_lru_pages fails because it try to isolate
> > clean page due to may_writepage == 0.
> > 
> > The may_writepage could be 1 only if total_scanned is higher than
> > writeback_threshold in do_try_to_free_pages but unfortunately,
> > VM can't isolate anon pages from inactive anon lru list by
> > above reason and we already reclaimed all file-backed pages.
> > So it ends up OOM killing.
> 
> Here, please expand upon "by above reason".  Explain here exactly why
> scanning is unsuccessful.

Let me try again ;)

  &< 

Recently, Luigi reported that there is lots of free swap space when
OOM happens. It's easily reproduced on zram-over-swap, where
many instances of memory hogs are running and laptop_mode is enabled.
He said there was no problem when he disabled laptop_mode.

What I found when I investigated the problem is as follows.

Assumption for easy explanation: There are no page cache pages in the
system because they have all already been reclaimed.

1. try_to_free_pages disables may_writepage when laptop_mode is enabled.
2. shrink_inactive_list isolates victim pages from the inactive anon lru list.
3. shrink_page_list adds them to swapcache via add_to_swap but it doesn't
   page them out because sc->may_writepage is 0, so the pages are rotated
   back into the inactive anon lru list. add_to_swap made the pages Dirty
   via SetPageDirty.
4. Step 3 couldn't reclaim any pages, so do_try_to_free_pages increases the
   priority and retries reclaim with the higher priority.
5. shrink_inactive_list tries to isolate victim pages from the inactive anon
   lru list but fails because it isolates pages in ISOLATE_CLEAN mode and
   the inactive anon lru list is full of the dirty pages from step 3, so it
   just returns without any reclaim progress.
6. do_try_to_free_pages doesn't set may_write because total_scanned is zero:
   sc->nr_scanned is increased by shrink_page_list, but we don't call
   shrink_page_list in step 5 because no pages were isolated.

The above loop continues until OOM happens.
The problem didn't happen before [1] was merged because the old logic's
isolation in shrink_inactive_list succeeded and called shrink_page_list
to page the pages out. That still failed to page them out because of
may_writepage, but the important point is that sc->nr_scanned was increased
although we couldn't swap them out, so do_try_to_free_pages could set
may_writepage.
So this patch needs to go to the stable tree although it's a band-aid.
Then, for the latest Linus tree, we should fix laptop_mode's fundamental
problem.

[1] f80c067[mm: zone_reclaim: make isolate_lru_page() filter-aware]
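
The six steps above can be replayed in a small userspace model. This is only
a sketch under stated assumptions, not kernel code: the *_model names, the
list size of 4096 pages and the per-priority scan count (list size shifted
right by priority) are illustrative, and the writeback threshold formula is
my recollection of 3.x-era mm/vmscan.c. The list is assumed to be already
full of dirty swapcache pages, i.e. the state that step 3 leaves behind.

```c
#include <assert.h>
#include <stdbool.h>

#define DEF_PRIORITY     12	/* usual starting scan priority (assumption) */
#define SWAP_CLUSTER_MAX 32	/* assumed reclaim target per call */
#define NR_PAGES         4096	/* hypothetical inactive anon list size */

struct scan_control_model {
	unsigned long nr_scanned;
	unsigned long nr_to_reclaim;
	bool may_writepage;
};

/*
 * One pass over a list that holds only dirty swapcache pages.
 * With filter_clean set (ISOLATE_CLEAN behaviour), dirty pages cannot be
 * isolated while may_writepage is 0, so nothing reaches shrink_page_list
 * and nr_scanned does not move (step 5).
 */
static void shrink_inactive_list_model(struct scan_control_model *sc,
				       int priority, bool filter_clean)
{
	assert(priority >= 0 && priority <= DEF_PRIORITY);

	if (filter_clean && !sc->may_writepage)
		return;
	/* Without the filter the pages are scanned even if not written. */
	sc->nr_scanned += NR_PAGES >> priority;
}

/*
 * Priority loop with laptop_mode enabled (may_writepage starts at 0).
 * Returns the priority at which may_writepage gets set, or -1 if the
 * loop runs out of priorities first, which is the path towards OOM.
 */
static int do_try_to_free_pages_model(bool filter_clean)
{
	struct scan_control_model sc = { 0, SWAP_CLUSTER_MAX, false };
	unsigned long total_scanned = 0;
	unsigned long writeback_threshold =
		sc.nr_to_reclaim + sc.nr_to_reclaim / 2;
	int priority;

	for (priority = DEF_PRIORITY; priority >= 0; priority--) {
		sc.nr_scanned = 0;
		shrink_inactive_list_model(&sc, priority, filter_clean);
		total_scanned += sc.nr_scanned;
		if (total_scanned > writeback_threshold) {
			sc.may_writepage = true;	/* step 6 trigger */
			return priority;
		}
	}
	return -1;
}
```

In this model, do_try_to_free_pages_model(false), the behaviour before [1],
crosses the threshold after a few rounds, while
do_try_to_free_pages_model(true) never does because total_scanned stays
at zero.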

> 
> > This patch prevents to add a page to swap cache unnecessary when
> > may_writepage is unset so anoymous lru list isn't full of
> > Dirty/Swapcache page. So VM can isolate pages from anon lru list,
> > which ends up setting may_writepage to 1 and could swap out
> > anon lru pages. When OOM triggers, I confirmed swap space was full.
> > 
> > ...
> >
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -780,6 +780,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  		if (PageAnon(page) && !PageSwapCache(page)) {
> >  			if (!(sc->gfp_mask & __GFP_IO))
> >  				goto keep_locked;
> > +			if (!sc->may_writepage)
> > +				goto keep_locked;
> >  			if (!add_to_swap(page))
> >  				goto activate_locked;
> >  			may_enter_fs = 1;
> 
> Needs a comment explaining why we bale out in this case, please.


Okay. How about this?

/*
 * There is no point to add a page to swap cache if we can't swap out.
 */

> 
> If I'm understanding it correctly, this change causes the kernel to
> move less anonymous memory onto the inactive anon LRU and thereby

No. The amount of memory on the inactive anon LRU stays the same. The patch
just prevents pages from being added to the swap cache unnecessarily.
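
To make the effect of the two added lines concrete, here is a standalone
model of the patched branch. It is a sketch, not the kernel function: the
struct and function names are hypothetical, and add_to_swap_model only
mimics the two side effects mentioned in this thread (the page enters the
swap cache and is marked dirty).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Outcomes mirroring the goto labels in the quoted hunk. */
enum anon_action { KEEP_LOCKED, ACTIVATE_LOCKED, PROCEED };

struct page_model {
	bool anon;
	bool swapcache;
	bool dirty;
};

struct scan_model {
	bool gfp_io;		/* stands in for sc->gfp_mask & __GFP_IO */
	bool may_writepage;
};

/* Hypothetical stand-in for add_to_swap(): fails when swap is full. */
static bool add_to_swap_model(struct page_model *page, bool swap_full)
{
	if (swap_full)
		return false;
	page->swapcache = true;
	page->dirty = true;	/* add_to_swap dirties the page */
	return true;
}

static enum anon_action handle_anon_page(struct page_model *page,
					 const struct scan_model *sc,
					 bool swap_full)
{
	assert(page != NULL && sc != NULL);

	if (page->anon && !page->swapcache) {
		if (!sc->gfp_io)
			return KEEP_LOCKED;
		/*
		 * There is no point in adding a page to the swap cache
		 * if we can't swap it out.
		 */
		if (!sc->may_writepage)
			return KEEP_LOCKED;
		if (!add_to_swap_model(page, swap_full))
			return ACTIVATE_LOCKED;
	}
	return PROCEED;
}
```

With may_writepage unset the page is returned untouched, so it stays clean
and can still be isolated in ISOLATE_CLEAN mode on the next pass.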

> causes the scanner to be more successful in 

Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-16 Thread Andrew Morton
On Wed,  9 Jan 2013 15:21:13 +0900
Minchan Kim  wrote:
>

This changelog is quite hard to understand :(

> Recently, Luigi reported there are lots of free swap space when
> OOM happens. It's easily reproduced on zram-over-swap, where
> many instance of memory hogs are running and laptop_mode is enabled.
> 
> Luigi reported there was no problem when he disabled laptop_mode.
> The problem when I investigate problem is following as.
> 
> try_to_free_pages disable may_writepage if laptop_mode is enabled.
> shrink_page_list adds lots of anon pages in swap cache by
> add_to_swap, which makes pages Dirty and rotate them to head of
> inactive LRU without pageout. If it is repeated, inactive anon LRU
> is full of Dirty and SwapCache pages.

"Dirty and SwapCache" is ambigious.  Does it mean "dirty pages and
swapcache pages" or does it mean "dirty swapcache pages".  The latter,
I expect.

> 
> In case of that, isolate_lru_pages fails because it try to isolate
> clean page due to may_writepage == 0.
> 
> The may_writepage could be 1 only if total_scanned is higher than
> writeback_threshold in do_try_to_free_pages but unfortunately,
> VM can't isolate anon pages from inactive anon lru list by
> above reason and we already reclaimed all file-backed pages.
> So it ends up OOM killing.

Here, please expand upon "by above reason".  Explain here exactly why
scanning is unsuccessful.

> This patch prevents to add a page to swap cache unnecessary when
> may_writepage is unset so anoymous lru list isn't full of
> Dirty/Swapcache page. So VM can isolate pages from anon lru list,
> which ends up setting may_writepage to 1 and could swap out
> anon lru pages. When OOM triggers, I confirmed swap space was full.
> 
> ...
>
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -780,6 +780,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>  		if (PageAnon(page) && !PageSwapCache(page)) {
>  			if (!(sc->gfp_mask & __GFP_IO))
>  				goto keep_locked;
> +			if (!sc->may_writepage)
> +				goto keep_locked;
>  			if (!add_to_swap(page))
>  				goto activate_locked;
>  			may_enter_fs = 1;

Needs a comment explaining why we bale out in this case, please.

If I'm understanding it correctly, this change causes the kernel to
move less anonymous memory onto the inactive anon LRU and thereby
causes the scanner to be more successful in locating clean swapcache
pages on that list?  But that makes no sense, because from your
description it appears the intent of the patch is to use *more* swap.


Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-10 Thread Minchan Kim
Hi Luigi,

On Thu, Jan 10, 2013 at 03:24:21PM -0800, Luigi Semenzato wrote:
> For what it's worth, I tested this patch on my 3.4 kernel, and it works as
> advertised.  Here's my setup.
> 
> - 2 GB RAM
> - a 3 GB zram disk for swapping
> - start one "hog" process per second (each hog process mallocs and touches
> 200 MB of memory).
> - watch /proc/meminfo
> 
> 1. I verified that the problem still exists on my current 3.4 kernel.  With
> laptop_mode = 2, hog processes are oom-killed when about 1.8-1.9 (out of 3)
> GB of swap space are still left
> 
> 2. I double-checked that the problem does not exist with laptop_mode = 0:
> hog processes are oom-killed when swap space is exhausted (with good
> approximation).
> 
> 3. I added the two-line patch, put back laptop_mode = 2, and verified that
> hog processes are oom-killed when swap space is exhausted, same as case 2.
> 
> Let me know if I can run any more tests for you, and thanks for all the
> support so far!

Thanks very much! But it seems Andrew doesn't like this version.
I will discuss it more with him and ask you again once there is a confirmed version.

Thanks again!

FYI: after I resolve this issue, I will dive into the min_filelist_kbytes patch. :)
> 
> 
> 
> On Wed, Jan 9, 2013 at 6:03 PM, Minchan Kim  wrote:
> 
> > Hi Andrew,
> >
> > On Wed, Jan 09, 2013 at 04:18:54PM -0800, Andrew Morton wrote:
> > > On Wed,  9 Jan 2013 15:21:13 +0900
> > > Minchan Kim  wrote:
> > >
> > > > Recently, Luigi reported that there is a lot of free swap space when
> > > > OOM happens. It's easily reproduced on zram-over-swap, where
> > > > many instances of memory hogs are running and laptop_mode is enabled.
> > > >
> > > > Luigi reported there was no problem when he disabled laptop_mode.
> > > > The problem I found when investigating is as follows.
> > > >
> > > > try_to_free_pages disables may_writepage if laptop_mode is enabled.
> > > > shrink_page_list adds lots of anon pages to the swap cache via
> > > > add_to_swap, which makes the pages Dirty and rotates them to the head of
> > > > the inactive LRU without pageout. If this repeats, the inactive anon LRU
> > > > becomes full of Dirty and SwapCache pages.
> > > >
> > > > In that case, isolate_lru_pages fails because it tries to isolate
> > > > only clean pages due to may_writepage == 0.
> > > >
> > > > may_writepage can become 1 only if total_scanned is higher than
> > > > writeback_threshold in do_try_to_free_pages, but unfortunately the
> > > > VM can't isolate anon pages from the inactive anon LRU list for the
> > > > above reason, and we have already reclaimed all file-backed pages.
> > > > So it ends up OOM killing.
> > > >
> > > > This patch avoids adding a page to the swap cache unnecessarily when
> > > > may_writepage is unset, so the anonymous LRU list doesn't fill up with
> > > > Dirty/SwapCache pages. The VM can then isolate pages from the anon LRU list,
> > > > which ends up setting may_writepage to 1 so that it can swap out
> > > > anon LRU pages. When OOM triggered, I confirmed swap space was full.
> > > >
> > > > ...
> > > >
> > > > --- a/mm/vmscan.c
> > > > +++ b/mm/vmscan.c
> > > > @@ -780,6 +780,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> > > > if (PageAnon(page) && !PageSwapCache(page)) {
> > > > if (!(sc->gfp_mask & __GFP_IO))
> > > > goto keep_locked;
> > > > +   if (!sc->may_writepage)
> > > > +   goto keep_locked;
> > > > if (!add_to_swap(page))
> > > > goto activate_locked;
> > > > may_enter_fs = 1;
> > >
> > > I'm not really getting it, and the description is rather hard to follow :(
> >
> > It seems I don't have a talent for writing descriptions. :(
> > I hope to do better this year. :)
> >
> > >
> > > We should be adding anon pages to swapcache even when laptop_mode is
> > > set.  And we should be writing them to swap as well, then reclaiming
> > > them.  The only thing laptop_mode should do is make the disk spin up
> > > less frequently - that doesn't mean "not at all"!
> >
> > So it seems your rationale is that we should save power only when the
> > system has enough memory, so we should remove may_writepage from the
> > reclaim path?
> >
> > If so, I love it, because I haven't seen any numbers on power saving
> > through reclaim throttling (though surely there was a reason to add it),
> > and I'm not sure it still works well after such a long time because we
> > have tweaked the reclaim path so many times.
> >
> > >
> > > So something seems screwed up here and the patch looks like a
> > > heavy-handed workaround.  Why aren't these anon pages getting written
> > > out in laptop_mode?
> >
> > Don't know. It has been there for a long time and I don't want to screw it up.
> > If we decide to page out in the reclaim path regardless of laptop_mode,
> > the problem becomes easy to solve without an ugly workaround.
> >
> > Remove may_writepage? If that's too aggressive, we can remove it only in
> > the direct reclaim path.
> >

Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-10 Thread Luigi Semenzato
[I may have screwed up my previous message, sorry if this is a
duplicate.  (Content-Policy reject msg: The message contains HTML
subpart, therefore we consider it SPAM or Outlook Virus.)]

--

For what it's worth, I tested this patch on my 3.4 kernel, and it
works as advertised.  Here's my setup.

- 2 GB RAM
- a 3 GB zram disk for swapping
- start one "hog" process per second (each hog process mallocs and
touches 200 MB of memory).
- watch /proc/meminfo

1. I verified that the problem still exists on my current 3.4 kernel.
With laptop_mode = 2, hog processes are oom-killed when about 1.8-1.9
(out of 3) GB of swap space are still left

2. I double-checked that the problem does not exist with laptop_mode =
0: hog processes are oom-killed when swap space is exhausted (with
good approximation).

3. I added the two-line patch, put back laptop_mode = 2, and verified
that hog processes are oom-killed when swap space is exhausted, same
as case 2.

Let me know if I can run any more tests for you, and thanks for all
the support so far!
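
A hog process of the kind described in the setup above could look roughly
like the following. This is only a sketch under assumptions, not the program
that was actually used for the test: the helper name hog_alloc_and_touch() is
hypothetical, and writing one byte per page is just one simple way to make
sure every page of the allocation is really faulted in.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate `bytes` of anonymous memory and write one byte per page so
 * that every page is actually faulted in. Returns the buffer (which the
 * caller keeps around so the memory stays in use), or NULL if malloc fails.
 */
static char *hog_alloc_and_touch(size_t bytes)
{
	long page_size = sysconf(_SC_PAGESIZE);
	char *buf;

	assert(bytes > 0);
	buf = malloc(bytes);
	if (buf == NULL)
		return NULL;
	if (page_size <= 0)
		page_size = 4096;   /* fallback assumption */

	/* Touch each page; a written anonymous page can only be freed via swap. */
	for (size_t off = 0; off < bytes; off += (size_t)page_size)
		buf[off] = 1;
	return buf;
}
```

A driver would call hog_alloc_and_touch(200UL << 20), then sleep (for example
with pause()) so the memory stays allocated, with one such process started
per second while watching /proc/meminfo.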
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-09 Thread Minchan Kim
Hi Andrew,

On Wed, Jan 09, 2013 at 04:18:54PM -0800, Andrew Morton wrote:
> On Wed,  9 Jan 2013 15:21:13 +0900
> Minchan Kim  wrote:
> 
> > Recently, Luigi reported that there is a lot of free swap space when
> > OOM happens. It's easily reproduced on zram-over-swap, where
> > many instances of memory hogs are running and laptop_mode is enabled.
> >
> > Luigi reported there was no problem when he disabled laptop_mode.
> > The problem I found when investigating is as follows.
> >
> > try_to_free_pages disables may_writepage if laptop_mode is enabled.
> > shrink_page_list adds lots of anon pages to the swap cache via
> > add_to_swap, which makes the pages Dirty and rotates them to the head of
> > the inactive LRU without pageout. If this repeats, the inactive anon LRU
> > becomes full of Dirty and SwapCache pages.
> >
> > In that case, isolate_lru_pages fails because it tries to isolate
> > only clean pages due to may_writepage == 0.
> >
> > may_writepage can become 1 only if total_scanned is higher than
> > writeback_threshold in do_try_to_free_pages, but unfortunately the
> > VM can't isolate anon pages from the inactive anon LRU list for the
> > above reason, and we have already reclaimed all file-backed pages.
> > So it ends up OOM killing.
> >
> > This patch avoids adding a page to the swap cache unnecessarily when
> > may_writepage is unset, so the anonymous LRU list doesn't fill up with
> > Dirty/SwapCache pages. The VM can then isolate pages from the anon LRU list,
> > which ends up setting may_writepage to 1 so that it can swap out
> > anon LRU pages. When OOM triggered, I confirmed swap space was full.
> > 
> > ...
> >
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -780,6 +780,8 @@ static unsigned long shrink_page_list(struct list_head 
> > *page_list,
> > if (PageAnon(page) && !PageSwapCache(page)) {
> > if (!(sc->gfp_mask & __GFP_IO))
> > goto keep_locked;
> > +   if (!sc->may_writepage)
> > +   goto keep_locked;
> > if (!add_to_swap(page))
> > goto activate_locked;
> > may_enter_fs = 1;
> 
> I'm not really getting it, and the description is rather hard to follow :(

It seems I don't have a talent for writing descriptions. :(
I hope to do better this year. :)

> 
> We should be adding anon pages to swapcache even when laptop_mode is
> set.  And we should be writing them to swap as well, then reclaiming
> them.  The only thing laptop_mode should do is make the disk spin up
> less frequently - that doesn't mean "not at all"!

So it seems your rationale is that we should save power only when the
system has enough memory, so we should remove may_writepage from the
reclaim path?

If so, I love it, because I haven't seen any numbers on power saving
through reclaim throttling (though surely there was a reason to add it),
and I'm not sure it still works well after such a long time because we
have tweaked the reclaim path so many times.

> 
> So something seems screwed up here and the patch looks like a
> heavy-handed workaround.  Why aren't these anon pages getting written
> out in laptop_mode?

Don't know. It has been there for a long time and I don't want to screw it up.
If we decide to page out in the reclaim path regardless of laptop_mode,
the problem becomes easy to solve without an ugly workaround.

Remove may_writepage? If that's too aggressive, we can remove it only in
the direct reclaim path.


-- 
Kind regards,
Minchan Kim


Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-09 Thread Andrew Morton
On Wed,  9 Jan 2013 15:21:13 +0900
Minchan Kim  wrote:

> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -780,6 +780,8 @@ static unsigned long shrink_page_list(struct list_head 
> *page_list,
>   if (PageAnon(page) && !PageSwapCache(page)) {
>   if (!(sc->gfp_mask & __GFP_IO))
>   goto keep_locked;
> + if (!sc->may_writepage)
> + goto keep_locked;
>   if (!add_to_swap(page))
>   goto activate_locked;
>   may_enter_fs = 1;

We should add a comment here explaining what's going on.  But I can't
suggest anything which sounds rational because this looks so wrong :(



Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-09 Thread Andrew Morton
On Wed,  9 Jan 2013 15:21:13 +0900
Minchan Kim  wrote:

> Recently, Luigi reported that there is a lot of free swap space when
> OOM happens. It's easily reproduced on zram-over-swap, where
> many instances of memory hogs are running and laptop_mode is enabled.
>
> Luigi reported there was no problem when he disabled laptop_mode.
> The problem I found when investigating is as follows.
>
> try_to_free_pages disables may_writepage if laptop_mode is enabled.
> shrink_page_list adds lots of anon pages to the swap cache via
> add_to_swap, which makes the pages Dirty and rotates them to the head of
> the inactive LRU without pageout. If this repeats, the inactive anon LRU
> becomes full of Dirty and SwapCache pages.
>
> In that case, isolate_lru_pages fails because it tries to isolate
> only clean pages due to may_writepage == 0.
>
> may_writepage can become 1 only if total_scanned is higher than
> writeback_threshold in do_try_to_free_pages, but unfortunately the
> VM can't isolate anon pages from the inactive anon LRU list for the
> above reason, and we have already reclaimed all file-backed pages.
> So it ends up OOM killing.
>
> This patch avoids adding a page to the swap cache unnecessarily when
> may_writepage is unset, so the anonymous LRU list doesn't fill up with
> Dirty/SwapCache pages. The VM can then isolate pages from the anon LRU list,
> which ends up setting may_writepage to 1 so that it can swap out
> anon LRU pages. When OOM triggered, I confirmed swap space was full.
> 
> ...
>
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -780,6 +780,8 @@ static unsigned long shrink_page_list(struct list_head 
> *page_list,
>   if (PageAnon(page) && !PageSwapCache(page)) {
>   if (!(sc->gfp_mask & __GFP_IO))
>   goto keep_locked;
> + if (!sc->may_writepage)
> + goto keep_locked;
>   if (!add_to_swap(page))
>   goto activate_locked;
>   may_enter_fs = 1;

I'm not really getting it, and the description is rather hard to follow :(

We should be adding anon pages to swapcache even when laptop_mode is
set.  And we should be writing them to swap as well, then reclaiming
them.  The only thing laptop_mode should do is make the disk spin up
less frequently - that doesn't mean "not at all"!

So something seems screwed up here and the patch looks like a
heavy-handed workaround.  Why aren't these anon pages getting written
out in laptop_mode?




Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-08 Thread Minchan Kim
Hi Hannes,

On Wed, Jan 09, 2013 at 01:56:12AM -0500, Johannes Weiner wrote:
> On Wed, Jan 09, 2013 at 03:21:13PM +0900, Minchan Kim wrote:
> > Recently, Luigi reported that there is a lot of free swap space when
> > OOM happens. It's easily reproduced on zram-over-swap, where
> > many instances of memory hogs are running and laptop_mode is enabled.
> >
> > Luigi reported there was no problem when he disabled laptop_mode.
> > The problem I found when investigating is as follows.
> >
> > try_to_free_pages disables may_writepage if laptop_mode is enabled.
> > shrink_page_list adds lots of anon pages to the swap cache via
> > add_to_swap, which makes the pages Dirty and rotates them to the head of
> > the inactive LRU without pageout. If this repeats, the inactive anon LRU
> > becomes full of Dirty and SwapCache pages.
> >
> > In that case, isolate_lru_pages fails because it tries to isolate
> > only clean pages due to may_writepage == 0.
> >
> > may_writepage can become 1 only if total_scanned is higher than
> > writeback_threshold in do_try_to_free_pages, but unfortunately the
> > VM can't isolate anon pages from the inactive anon LRU list for the
> > above reason, and we have already reclaimed all file-backed pages.
> > So it ends up OOM killing.
> >
> > This patch avoids adding a page to the swap cache unnecessarily when
> > may_writepage is unset, so the anonymous LRU list doesn't fill up with
> > Dirty/SwapCache pages. The VM can then isolate pages from the anon LRU list,
> > which ends up setting may_writepage to 1 so that it can swap out
> > anon LRU pages. When OOM triggered, I confirmed swap space was full.
> > 
> > Reported-by: Luigi Semenzato 
> > Signed-off-by: Minchan Kim 
> 
> Acked-by: Johannes Weiner 
> 
> We used to ignore the page's writeback state on isolation in the past,
> could you include a reference to since when this problem has been in

Good idea.
It has existed since f80c067 [mm: zone_reclaim: make isolate_lru_page()
filter-aware].
I will note it in the changelog.

> the tree?  Also, would it make sense to tag it for one of the stable
> trees?

If Luigi confirms it, I will Cc sta...@vger.kernel.org in the next spin.
Thanks!


-- 
Kind regards,
Minchan Kim


Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-08 Thread Johannes Weiner
On Wed, Jan 09, 2013 at 03:21:13PM +0900, Minchan Kim wrote:
> Recently, Luigi reported that there is a lot of free swap space when
> OOM happens. It's easily reproduced on zram-over-swap, where
> many instances of memory hogs are running and laptop_mode is enabled.
>
> Luigi reported there was no problem when he disabled laptop_mode.
> The problem I found when investigating is as follows.
>
> try_to_free_pages disables may_writepage if laptop_mode is enabled.
> shrink_page_list adds lots of anon pages to the swap cache via
> add_to_swap, which makes the pages Dirty and rotates them to the head of
> the inactive LRU without pageout. If this repeats, the inactive anon LRU
> becomes full of Dirty and SwapCache pages.
>
> In that case, isolate_lru_pages fails because it tries to isolate
> only clean pages due to may_writepage == 0.
>
> may_writepage can become 1 only if total_scanned is higher than
> writeback_threshold in do_try_to_free_pages, but unfortunately the
> VM can't isolate anon pages from the inactive anon LRU list for the
> above reason, and we have already reclaimed all file-backed pages.
> So it ends up OOM killing.
>
> This patch avoids adding a page to the swap cache unnecessarily when
> may_writepage is unset, so the anonymous LRU list doesn't fill up with
> Dirty/SwapCache pages. The VM can then isolate pages from the anon LRU list,
> which ends up setting may_writepage to 1 so that it can swap out
> anon LRU pages. When OOM triggered, I confirmed swap space was full.
> 
> Reported-by: Luigi Semenzato 
> Signed-off-by: Minchan Kim 

Acked-by: Johannes Weiner 

We used to ignore the page's writeback state on isolation in the past,
could you include a reference to since when this problem has been in
the tree?  Also, would it make sense to tag it for one of the stable
trees?


[PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-08 Thread Minchan Kim
Recently, Luigi reported that lots of swap space is still free when
OOM happens. It's easily reproduced on zram-over-swap, where
many instances of memory hogs are running and laptop_mode is enabled.

Luigi reported there was no problem when he disabled laptop_mode.
My investigation found the following.

try_to_free_pages disables may_writepage if laptop_mode is enabled.
shrink_page_list adds lots of anon pages to the swap cache via
add_to_swap, which makes the pages Dirty and rotates them to the head
of the inactive LRU without pageout. If this repeats, the inactive
anon LRU fills up with Dirty and SwapCache pages.

In that case, isolate_lru_pages fails because it tries to isolate
only clean pages due to may_writepage == 0.

may_writepage can become 1 only if total_scanned is higher than
writeback_threshold in do_try_to_free_pages, but unfortunately the
VM can't isolate anon pages from the inactive anon LRU list for the
above reason, and we have already reclaimed all file-backed pages.
So it ends up OOM killing.

This patch avoids adding a page to the swap cache unnecessarily when
may_writepage is unset, so the anonymous LRU list doesn't fill up with
Dirty/SwapCache pages. The VM can then isolate pages from the anon LRU
list, which ends up setting may_writepage to 1 and allows anon LRU
pages to be swapped out. I confirmed that swap space was full when
OOM triggered.

Reported-by: Luigi Semenzato 
Signed-off-by: Minchan Kim 
---
 mm/vmscan.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index ff869d2..439cc47 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -780,6 +780,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		if (PageAnon(page) && !PageSwapCache(page)) {
 			if (!(sc->gfp_mask & __GFP_IO))
 				goto keep_locked;
+			if (!sc->may_writepage)
+				goto keep_locked;
 			if (!add_to_swap(page))
 				goto activate_locked;
 			may_enter_fs = 1;
-- 
1.7.9.5
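
For readers following the changelog, here is a toy model of the stall and of
the effect of the two added lines. It is only a sketch, not kernel code: the
toy_* names, the 32-page LRU, and the pass count are assumptions made for
illustration, and the LRU is deliberately sized to a single batch to stand in
for an inactive anon list that is already saturated. The threshold mirrors
the "one and a half batches" writeback_threshold mentioned in the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_LRU_PAGES   32  /* toy inactive anon LRU, one batch in size */
#define TOY_BATCH       32  /* pages a reclaim call wants to free */
#define TOY_PASSES      13  /* one pass per toy reclaim priority level */

struct toy_page {
	bool dirty;      /* models PageDirty */
	bool swapcache;  /* models PageSwapCache */
	bool reclaimed;  /* page was written out and freed */
};

struct toy_result {
	unsigned long total_scanned;
	unsigned long nr_reclaimed;
	bool may_writepage;
};

/* One toy reclaim invocation; 'patched' selects the behaviour with the fix. */
static struct toy_result toy_reclaim(bool patched)
{
	struct toy_page lru[TOY_LRU_PAGES] = { 0 };
	struct toy_result res = { 0, 0, false }; /* laptop_mode: writepage off */
	const unsigned long threshold = TOY_BATCH + TOY_BATCH / 2;

	assert(TOY_LRU_PAGES <= TOY_BATCH);

	for (int pass = 0; pass < TOY_PASSES; pass++) {
		unsigned long nr_scanned = 0;

		for (size_t i = 0; i < TOY_LRU_PAGES; i++) {
			struct toy_page *page = &lru[i];

			if (page->reclaimed)
				continue;
			/* Isolation filter: clean pages only while writepage is off. */
			if (!res.may_writepage && page->dirty)
				continue;

			nr_scanned++;	/* counted only for isolated pages */

			if (!page->swapcache) {
				/* The fix: leave the page alone while we cannot write. */
				if (patched && !res.may_writepage)
					continue;
				page->swapcache = true;	/* toy add_to_swap() ... */
				page->dirty = true;	/* ... which also dirties the page */
			}
			if (!res.may_writepage)
				continue;	/* rotated back, still dirty */

			page->reclaimed = true;	/* toy pageout succeeded */
			res.nr_reclaimed++;
		}

		res.total_scanned += nr_scanned;
		if (res.total_scanned > threshold)
			res.may_writepage = true;
		if (res.nr_reclaimed >= TOY_BATCH)
			break;
	}
	return res;
}
```

In this model, toy_reclaim(false) dirties every page on the first pass, so
later passes isolate nothing, total_scanned stops growing, and may_writepage
is never set. toy_reclaim(true) keeps the pages clean, so scanning continues
until the threshold is crossed and the pages can then be swapped out.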



Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-08 Thread Minchan Kim
Hi Hannes,

On Wed, Jan 09, 2013 at 01:56:12AM -0500, Johannes Weiner wrote:
> On Wed, Jan 09, 2013 at 03:21:13PM +0900, Minchan Kim wrote:
> > Recently, Luigi reported that lots of swap space is still free when
> > OOM happens. It's easily reproduced on zram-over-swap, where
> > many instances of memory hogs are running and laptop_mode is enabled.
> > 
> > Luigi reported there was no problem when he disabled laptop_mode.
> > My investigation found the following.
> > 
> > try_to_free_pages disables may_writepage if laptop_mode is enabled.
> > shrink_page_list adds lots of anon pages to the swap cache via
> > add_to_swap, which makes the pages Dirty and rotates them to the head
> > of the inactive LRU without pageout. If this repeats, the inactive
> > anon LRU fills up with Dirty and SwapCache pages.
> > 
> > In that case, isolate_lru_pages fails because it tries to isolate
> > only clean pages due to may_writepage == 0.
> > 
> > may_writepage can become 1 only if total_scanned is higher than
> > writeback_threshold in do_try_to_free_pages, but unfortunately the
> > VM can't isolate anon pages from the inactive anon LRU list for the
> > above reason, and we have already reclaimed all file-backed pages.
> > So it ends up OOM killing.
> > 
> > This patch avoids adding a page to the swap cache unnecessarily when
> > may_writepage is unset, so the anonymous LRU list doesn't fill up with
> > Dirty/SwapCache pages. The VM can then isolate pages from the anon LRU
> > list, which ends up setting may_writepage to 1 and allows anon LRU
> > pages to be swapped out. I confirmed that swap space was full when
> > OOM triggered.
> > 
> > Reported-by: Luigi Semenzato <semenz...@google.com>
> > Signed-off-by: Minchan Kim <minc...@kernel.org>
> 
> Acked-by: Johannes Weiner <han...@cmpxchg.org>
> 
> We used to ignore the page's writeback state on isolation in the past,
> could you include a reference to since when this problem has been in

Good idea.
It has existed since f80c067 [mm: zone_reclaim: make isolate_lru_page()
filter-aware].
I will note it in the changelog.

> the tree?  Also, would it make sense to tag it for one of the stable
> trees?

If Luigi confirms it, I will Cc sta...@vger.kernel.org in the next spin.
Thanks!

 

-- 
Kind regards,
Minchan Kim