Re: [PATCH 3/4] mm: move lazy free pages to inactive list

2015-04-01 Thread Rik van Riel
On 03/10/2015 10:14 PM, Wang, Yalin wrote:
>> -----Original Message-----
>> From: Minchan Kim [mailto:minc...@kernel.org]
>> Sent: Wednesday, March 11, 2015 9:21 AM
>> To: Andrew Morton
>> Cc: linux-kernel@vger.kernel.org; linux...@kvack.org; Michal Hocko;
>> Johannes Weiner; Mel Gorman; Rik van Riel; Shaohua Li; Wang, Yalin; Minchan
>> Kim
>> Subject: [PATCH 3/4] mm: move lazy free pages to inactive list
>>
>> MADV_FREE is a hint that it's okay to discard pages if there is
>> memory pressure, and we use the reclaimers (i.e., kswapd and direct
>> reclaim) to free them, so there is no point in keeping them on the
>> active anonymous LRU. This patch moves them to the head of the
>> inactive LRU list.
>>
>> This means that MADV_FREE-ed pages which were living on the inactive list
>> are reclaimed first because they are more likely to be cold than
>> recently active pages.
>>
>> An arguable issue with this approach is whether we should put the page
>> at the head or the tail of the inactive list. I selected *head* because
>> the kernel cannot be sure whether it's really cold or warm for every
>> MADV_FREE use case, but at least we know it's not *hot*, so landing at
>> the inactive head is a compromise for various use cases.
>>
>> This fixes a suboptimal behavior of MADV_FREE where pages living on
>> the active list sit there for a long time even under memory
>> pressure while the inactive list is reclaimed heavily. This basically
>> defeats the whole purpose of using MADV_FREE to help the system free
>> memory which might not be used.
>>
>> Acked-by: Michal Hocko <mho...@suse.cz>
>> Signed-off-by: Minchan Kim <minc...@kernel.org>
>> ---
>>  include/linux/swap.h |  1 +
>>  mm/madvise.c         |  2 ++
>>  mm/swap.c            | 35 +++++++++++++++++++++++++++++++++++
>>  3 files changed, 38 insertions(+)
>>
>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>> index cee108c..0428e4c 100644
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -308,6 +308,7 @@ extern void lru_add_drain_cpu(int cpu);
>>  extern void lru_add_drain_all(void);
>>  extern void rotate_reclaimable_page(struct page *page);
>>  extern void deactivate_file_page(struct page *page);
>> +extern void deactivate_page(struct page *page);
>>  extern void swap_setup(void);
>>
>>  extern void add_page_to_unevictable_list(struct page *page);
>> diff --git a/mm/madvise.c b/mm/madvise.c
>> index ebe692e..22e8f0c 100644
>> --- a/mm/madvise.c
>> +++ b/mm/madvise.c
>> @@ -340,6 +340,8 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned
>> long addr,
>>  ptent = pte_mkold(ptent);
>>  ptent = pte_mkclean(ptent);
>>  set_pte_at(mm, addr, pte, ptent);
>> +if (PageActive(page))
>> +deactivate_page(page);
>>  tlb_remove_tlb_entry(tlb, pte, addr);
>>  }
> 
> I think this place should be changed like this:
>   +   if (!page_referenced(page, false, NULL, NULL, NULL) && 
> PageActive(page))
>   +   deactivate_page(page);
> Because we don't know whether other processes are referencing this page;
> if they are, there is no need to deactivate it.

We never clear the page and pte referenced bits on an active
page; that is only done when the page is moved to the inactive
list through LRU movement.

In other words, the page_referenced() check will return true
most of the time, even if the page was last referenced half
an hour ago (but there was no memory pressure).

Minchan's code looks correct.

The code may even want a ClearPageReferenced(page) in there...
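
As a rough illustration of the point above, here is a toy userspace model
in C. It is not kernel code: struct toy_page, toy_touch() and the two
should_deactivate_*() helpers are hypothetical names invented for this
sketch, and the single "referenced" flag only stands in for what
page_referenced() would report. It assumes, as described above, that
nothing clears that state while the page stays on the active list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: two flags standing in for PG_active and for the
 * accumulated "referenced" state of a page. */
struct toy_page {
	bool active;      /* models PageActive() */
	bool referenced;  /* models what page_referenced() would report */
};

/* Any access sets the referenced state. In this model nothing clears it
 * again while the page remains on the active list. */
static void toy_touch(struct toy_page *page)
{
	assert(page != NULL);
	page->referenced = true;
}

/* Condition in the patch as posted: deactivate whenever the page is active. */
static bool should_deactivate_posted(const struct toy_page *page)
{
	return page->active;
}

/* Condition from the review suggestion: also require "not referenced". */
static bool should_deactivate_suggested(const struct toy_page *page)
{
	return !page->referenced && page->active;
}
```

In this model a page that was touched once, however long ago, still fails
the suggested check, while the condition in the patch as posted
deactivates it.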

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/4] mm: move lazy free pages to inactive list

2015-03-30 Thread Minchan Kim
On Mon, Mar 30, 2015 at 10:28:47PM -0700, Andrew Morton wrote:
> On Tue, 31 Mar 2015 13:45:25 +0900 Minchan Kim <minc...@kernel.org> wrote:
> > > 
> > > deactivate_page() doesn't look at or alter PageReferenced().  Should it?
> > 
> > Absolutely true. Thanks.
> > Here it goes.
> > 
> > From 2b2c92eb73a1cceac615b9abd4c0f5f0c3395ff5 Mon Sep 17 00:00:00 2001
> > From: Minchan Kim <minc...@kernel.org>
> > Date: Tue, 31 Mar 2015 13:38:46 +0900
> > Subject: [PATCH] mm: lru_deactivate_fn should clear PG_referenced
> > 
> > deactivate_page aims to accelerate reclaim by moving pages from the
> > active list to the inactive list, so it should also clear
> > PG_referenced to that end.
> > 
> > ...
> >
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -800,6 +800,7 @@ static void lru_deactivate_fn(struct page *page, struct 
> > lruvec *lruvec,
> >  
> > del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
> > ClearPageActive(page);
> > +   ClearPageReferenced(page);
> > add_page_to_lru_list(page, lruvec, lru);
> >  
> > __count_vm_event(PGDEACTIVATE);
> 
> What if we have
> 
>   PageLRU(page) && !PageActive(page) && PageReferenced(page)
> 
> if we really want to "accelerate the reclaim of @page" then we should
> clear PG_referenced there too.

The function's name is *deactivate*_page. IOW, I think it should only
operate on pages on the active list.

> 
> (And what about page_referenced(page) :))

Yes, I considered it when you mentioned PG_referenced. Currently, madvise_free
clears the access bit in the page table when the syscall is called, so
shrink_page_list can reclaim the pages easily.

Of course, we could clear the access bit via page_referenced for general use,
not only for madvise_free, but that would hurt madvise_free performance, so
I'd like to leave it as is unless a need for it arises.
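
For context, a minimal userspace sketch of the caller's side of this
discussion. It assumes a kernel and libc that expose MADV_FREE (the
series was still under review at the time of this thread). lazy_free()
is a hypothetical helper name, the fallback to MADV_DONTNEED is an
assumption of this sketch rather than anything from the patch, and the
fallback value 8 for MADV_FREE is the one used by the generic Linux
headers.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_FREE
#define MADV_FREE 8	/* assumption: value from the generic Linux uapi headers */
#endif

/*
 * Hint that [addr, addr + len) may be discarded under memory pressure.
 * Hypothetical helper: tries MADV_FREE first and, if the kernel rejects
 * it, falls back to MADV_DONTNEED, which drops the pages immediately.
 * Returns 0 on success, -1 on failure (errno set by madvise()).
 */
static int lazy_free(void *addr, size_t len)
{
	assert(addr != NULL && len > 0);

	if (madvise(addr, len, MADV_FREE) == 0)
		return 0;
	return madvise(addr, len, MADV_DONTNEED);
}
```

The range must be page-aligned private anonymous memory, for example a
region obtained from mmap() with MAP_PRIVATE | MAP_ANONYMOUS. Writing to
a page in the range after the call cancels the lazy free for that page.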

-- 
Kind regards,
Minchan Kim


Re: [PATCH 3/4] mm: move lazy free pages to inactive list

2015-03-30 Thread Andrew Morton
On Tue, 31 Mar 2015 13:45:25 +0900 Minchan Kim <minc...@kernel.org> wrote:
> > 
> > deactivate_page() doesn't look at or alter PageReferenced().  Should it?
> 
> Absolutely true. Thanks.
> Here it goes.
> 
> From 2b2c92eb73a1cceac615b9abd4c0f5f0c3395ff5 Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minc...@kernel.org>
> Date: Tue, 31 Mar 2015 13:38:46 +0900
> Subject: [PATCH] mm: lru_deactivate_fn should clear PG_referenced
> 
> deactivate_page aims to accelerate reclaim by moving pages from the
> active list to the inactive list, so it should also clear
> PG_referenced to that end.
> 
> ...
>
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -800,6 +800,7 @@ static void lru_deactivate_fn(struct page *page, struct 
> lruvec *lruvec,
>  
>   del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
>   ClearPageActive(page);
> + ClearPageReferenced(page);
>   add_page_to_lru_list(page, lruvec, lru);
>  
>   __count_vm_event(PGDEACTIVATE);

What if we have

PageLRU(page) && !PageActive(page) && PageReferenced(page)

if we really want to "accelerate the reclaim of @page" then we should
clear PG_referenced there too.

(And what about page_referenced(page) :))


Re: [PATCH 3/4] mm: move lazy free pages to inactive list

2015-03-30 Thread Minchan Kim
Hello Andrew,

On Mon, Mar 30, 2015 at 02:20:10PM -0700, Andrew Morton wrote:
> On Mon, 30 Mar 2015 14:35:02 +0900 Minchan Kim <minc...@kernel.org> wrote:
> 
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -866,6 +866,13 @@ void deactivate_file_page(struct page *page)
> > }
> >  }
> >  
> > +/**
> > + * deactivate_page - deactivate a page
> > + * @page: page to deactivate
> > + *
> > + * This function moves @page to the inactive list if @page was on the active
> > + * list and was not unevictable, to accelerate reclaim of @page.
> > + */
> >  void deactivate_page(struct page *page)
> >  {
> > if (PageLRU(page) && PageActive(page) && !PageUnevictable(page)) {
> 
> Thanks.
> 
> deactivate_page() doesn't look at or alter PageReferenced().  Should it?

Absolutely true. Thanks.
Here it goes.

From 2b2c92eb73a1cceac615b9abd4c0f5f0c3395ff5 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minc...@kernel.org>
Date: Tue, 31 Mar 2015 13:38:46 +0900
Subject: [PATCH] mm: lru_deactivate_fn should clear PG_referenced

deactivate_page aims to accelerate reclaim by moving pages from the
active list to the inactive list, so it should also clear
PG_referenced to that end.

Suggested-by: Andrew Morton <a...@linux-foundation.org>
Signed-off-by: Minchan Kim <minc...@kernel.org>
---
 mm/swap.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/swap.c b/mm/swap.c
index b65fc8c..6b420022 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -800,6 +800,7 @@ static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec,
 
 	del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
 	ClearPageActive(page);
+	ClearPageReferenced(page);
 	add_page_to_lru_list(page, lruvec, lru);
 
 	__count_vm_event(PGDEACTIVATE);
-- 
1.9.3
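
The effect of the one-line change above can be sketched with a toy
userspace model. This is not the kernel implementation: struct toy_page,
struct toy_lru and the toy_*() helpers are hypothetical names made up
for illustration, the two LRU lists are reduced to singly linked lists,
and locking, pagevecs and statistics are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page: two flags and a link, standing in for struct page. */
struct toy_page {
	bool active;		/* models PG_active */
	bool referenced;	/* models PG_referenced */
	struct toy_page *next;
};

/* Toy LRU: one active and one inactive list, heads only. */
struct toy_lru {
	struct toy_page *active_head;
	struct toy_page *inactive_head;
};

static void toy_push_head(struct toy_page **head, struct toy_page *page)
{
	page->next = *head;
	*head = page;
}

/* Remove @page from the list at @head; returns false if it is not there. */
static bool toy_unlink(struct toy_page **head, struct toy_page *page)
{
	for (struct toy_page **pp = head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == page) {
			*pp = page->next;
			page->next = NULL;
			return true;
		}
	}
	return false;
}

/*
 * Move an active page to the head of the inactive list, clearing both
 * flags on the way. Pages that are not active are left untouched, which
 * is the behaviour being debated in this thread.
 */
static bool toy_deactivate(struct toy_lru *lru, struct toy_page *page)
{
	assert(lru != NULL && page != NULL);

	if (!page->active || !toy_unlink(&lru->active_head, page))
		return false;

	page->active = false;
	page->referenced = false;
	toy_push_head(&lru->inactive_head, page);
	return true;
}
```

In this model an already-inactive page that still has its referenced
flag set keeps that flag, because toy_deactivate() returns early for it.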



-- 
Kind regards,
Minchan Kim


Re: [PATCH 3/4] mm: move lazy free pages to inactive list

2015-03-30 Thread Andrew Morton
On Mon, 30 Mar 2015 14:35:02 +0900 Minchan Kim <minc...@kernel.org> wrote:

> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -866,6 +866,13 @@ void deactivate_file_page(struct page *page)
>   }
>  }
>  
> +/**
> + * deactivate_page - deactivate a page
> + * @page: page to deactivate
> + *
> + * This function moves @page to the inactive list if @page was on the active
> + * list and was not unevictable, to accelerate reclaim of @page.
> + */
>  void deactivate_page(struct page *page)
>  {
>   if (PageLRU(page) && PageActive(page) && !PageUnevictable(page)) {

Thanks.

deactivate_page() doesn't look at or alter PageReferenced().  Should it?


Re: [PATCH 3/4] mm: move lazy free pages to inactive list

2015-03-29 Thread Minchan Kim
Hello Andrew,

On Fri, Mar 20, 2015 at 03:43:58PM -0700, Andrew Morton wrote:
> On Wed, 11 Mar 2015 10:20:37 +0900 Minchan Kim <minc...@kernel.org> wrote:
> 
> > MADV_FREE is a hint that it's okay to discard pages if there is
> > memory pressure, and we use the reclaimers (i.e., kswapd and direct
> > reclaim) to free them, so there is no point in keeping them on the
> > active anonymous LRU. This patch moves them to the head of the
> > inactive LRU list.
> > 
> > This means that MADV_FREE-ed pages which were living on the inactive list
> > are reclaimed first because they are more likely to be cold than
> > recently active pages.
> > 
> > An arguable issue with this approach is whether we should put the page
> > at the head or the tail of the inactive list. I selected *head* because
> > the kernel cannot be sure whether it's really cold or warm for every
> > MADV_FREE use case, but at least we know it's not *hot*, so landing at
> > the inactive head is a compromise for various use cases.
> > 
> > This fixes a suboptimal behavior of MADV_FREE where pages living on
> > the active list sit there for a long time even under memory
> > pressure while the inactive list is reclaimed heavily. This basically
> > defeats the whole purpose of using MADV_FREE to help the system free
> > memory which might not be used.
> > 
> > @@ -789,6 +790,23 @@ static void lru_deactivate_file_fn(struct page *page, 
> > struct lruvec *lruvec,
> > update_page_reclaim_stat(lruvec, file, 0);
> >  }
> >  
> > +
> > +static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec,
> > +   void *arg)
> >
> > ...
> >
> > @@ -844,6 +866,18 @@ void deactivate_file_page(struct page *page)
> > }
> >  }
> >  
> > +void deactivate_page(struct page *page)
> > +{
> 
> lru_deactivate_file_fn() and deactivate_file_page() are carefully
> documented and lru_deactivate_fn() and deactivate_page() should
> be as well.  In fact it becomes more important now that we have two
> similar-looking things.

Sorry, I missed this comment.

Actually, deactivate_file_page was very specific to file-backed page
invalidation when I first implemented it. That's why it had a long
description, but deactivate_page is quite general, so I think a short
comment is enough. :)

Here it goes.

Thanks.

From 1dbff1d18876962e5248346b59e41014561c09ac Mon Sep 17 00:00:00 2001
From: Minchan Kim <minc...@kernel.org>
Date: Mon, 30 Mar 2015 14:30:44 +0900
Subject: [PATCH] mm: document deactivate_page

This patch adds a function description for deactivate_page.

Signed-off-by: Minchan Kim <minc...@kernel.org>
---
 mm/swap.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/swap.c b/mm/swap.c
index 6b5adc7..b65fc8c 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -866,6 +866,13 @@ void deactivate_file_page(struct page *page)
}
 }
 
+/**
+ * deactivate_page - deactivate a page
+ * @page: page to deactivate
+ *
+ * This function moves @page to the inactive list if @page was on the active
+ * list and was not unevictable, to accelerate reclaim of @page.
+ */
 void deactivate_page(struct page *page)
 {
 	if (PageLRU(page) && PageActive(page) && !PageUnevictable(page)) {
-- 
1.9.3


> 
> 

-- 
Kind regards,
Minchan Kim


Re: [PATCH 3/4] mm: move lazy free pages to inactive list

2015-03-20 Thread Andrew Morton
On Wed, 11 Mar 2015 10:20:37 +0900 Minchan Kim <minc...@kernel.org> wrote:

> MADV_FREE is a hint that it's okay to discard pages if there is
> memory pressure, and we use the reclaimers (i.e., kswapd and direct
> reclaim) to free them, so there is no point in keeping them on the
> active anonymous LRU. This patch moves them to the head of the
> inactive LRU list.
> 
> This means that MADV_FREE-ed pages which were living on the inactive list
> are reclaimed first because they are more likely to be cold than
> recently active pages.
> 
> An arguable issue with this approach is whether we should put the page
> at the head or the tail of the inactive list. I selected *head* because
> the kernel cannot be sure whether it's really cold or warm for every
> MADV_FREE use case, but at least we know it's not *hot*, so landing at
> the inactive head is a compromise for various use cases.
> 
> This fixes a suboptimal behavior of MADV_FREE where pages living on
> the active list sit there for a long time even under memory
> pressure while the inactive list is reclaimed heavily. This basically
> defeats the whole purpose of using MADV_FREE to help the system free
> memory which might not be used.
> 
> @@ -789,6 +790,23 @@ static void lru_deactivate_file_fn(struct page *page, 
> struct lruvec *lruvec,
>   update_page_reclaim_stat(lruvec, file, 0);
>  }
>  
> +
> +static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec,
> + void *arg)
>
> ...
>
> @@ -844,6 +866,18 @@ void deactivate_file_page(struct page *page)
>   }
>  }
>  
> +void deactivate_page(struct page *page)
> +{

lru_deactivate_file_fn() and deactivate_file_page() are carefully
documented and lru_deactivate_fn() and deactivate_page() should
be as well.  In fact it becomes more important now that we have two
similar-looking things.




Re: [PATCH 3/4] mm: move lazy free pages to inactive list

2015-03-10 Thread Minchan Kim
On Wed, Mar 11, 2015 at 10:14:51AM +0800, Wang, Yalin wrote:
> > -----Original Message-----
> > From: Minchan Kim [mailto:minc...@kernel.org]
> > Sent: Wednesday, March 11, 2015 9:21 AM
> > To: Andrew Morton
> > Cc: linux-kernel@vger.kernel.org; linux...@kvack.org; Michal Hocko;
> > Johannes Weiner; Mel Gorman; Rik van Riel; Shaohua Li; Wang, Yalin; Minchan
> > Kim
> > Subject: [PATCH 3/4] mm: move lazy free pages to inactive list
> > 
> > MADV_FREE is a hint that it's okay to discard pages if there is
> > memory pressure, and we use the reclaimers (i.e., kswapd and direct
> > reclaim) to free them, so there is no point in keeping them on the
> > active anonymous LRU. This patch moves them to the head of the
> > inactive LRU list.
> > 
> > This means that MADV_FREE-ed pages which were living on the inactive list
> > are reclaimed first because they are more likely to be cold than
> > recently active pages.
> > 
> > An arguable issue with this approach is whether we should put the page
> > at the head or the tail of the inactive list. I selected *head* because
> > the kernel cannot be sure whether it's really cold or warm for every
> > MADV_FREE use case, but at least we know it's not *hot*, so landing at
> > the inactive head is a compromise for various use cases.
> > 
> > This fixes a suboptimal behavior of MADV_FREE where pages living on
> > the active list sit there for a long time even under memory
> > pressure while the inactive list is reclaimed heavily. This basically
> > defeats the whole purpose of using MADV_FREE to help the system free
> > memory which might not be used.
> > 
> > Acked-by: Michal Hocko <mho...@suse.cz>
> > Signed-off-by: Minchan Kim <minc...@kernel.org>
> > ---
> >  include/linux/swap.h |  1 +
> >  mm/madvise.c |  2 ++
> >  mm/swap.c| 35 +++
> >  3 files changed, 38 insertions(+)
> > 
> > diff --git a/include/linux/swap.h b/include/linux/swap.h
> > index cee108c..0428e4c 100644
> > --- a/include/linux/swap.h
> > +++ b/include/linux/swap.h
> > @@ -308,6 +308,7 @@ extern void lru_add_drain_cpu(int cpu);
> >  extern void lru_add_drain_all(void);
> >  extern void rotate_reclaimable_page(struct page *page);
> >  extern void deactivate_file_page(struct page *page);
> > +extern void deactivate_page(struct page *page);
> >  extern void swap_setup(void);
> > 
> >  extern void add_page_to_unevictable_list(struct page *page);
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index ebe692e..22e8f0c 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -340,6 +340,8 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
> > ptent = pte_mkold(ptent);
> > ptent = pte_mkclean(ptent);
> > set_pte_at(mm, addr, pte, ptent);
> > +   if (PageActive(page))
> > +   deactivate_page(page);
> > tlb_remove_tlb_entry(tlb, pte, addr);
> > }
> 
> I think this place should be changed like this:
>   +   if (!page_referenced(page, false, NULL, NULL, NULL) && PageActive(page))
>   +   deactivate_page(page);
> because we don't know whether other processes are referencing this page.
> If they are, we don't need to deactivate it.

page_referenced() is too heavy an operation to run in the
madvise_free fast path.
If other processes (parent or child) have referenced the page,
shrink_page_list() in the slow path can filter it out and
activate the page.

In addition, anonymous pages mostly become shared through fork,
so in many cases we can expect the child to exec soon.
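
The split between a cheap fast path and a thorough slow path can be
modeled in a few lines of userspace C. This is only a toy sketch, not
kernel code: struct toy_page, toy_madvise_free() and toy_shrink_page()
are hypothetical names, and the referenced_by counter stands in for the
reverse-mapping walk that page_referenced() performs in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page; these fields do not mirror the kernel's. */
struct toy_page {
	bool active;		/* is the page on the active list? */
	int referenced_by;	/* other mappers that touched the page */
};

/* Fast path (madvise time): a cheap flag test only, no reference walk. */
void toy_madvise_free(struct toy_page *p)
{
	if (p->active)
		p->active = false;	/* models deactivate_page() */
}

/*
 * Slow path (reclaim time): the expensive reference check happens here.
 * Returns true if the page would be reclaimed, false if it is rescued.
 */
bool toy_shrink_page(struct toy_page *p)
{
	assert(!p->active);	/* this model only scans inactive pages */
	if (p->referenced_by > 0) {
		p->active = true;	/* rescued: back to the active list */
		p->referenced_by = 0;
		return false;
	}
	return true;
}
```

In this model a page that another process still uses is deactivated by
the hint but rescued later at reclaim time, so the cost of the reference
check is only paid when memory pressure actually reaches the page.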

-- 
Kind regards,
Minchan Kim


RE: [PATCH 3/4] mm: move lazy free pages to inactive list

2015-03-10 Thread Wang, Yalin
> -Original Message-
> From: Minchan Kim [mailto:minc...@kernel.org]
> Sent: Wednesday, March 11, 2015 9:21 AM
> To: Andrew Morton
> Cc: linux-kernel@vger.kernel.org; linux...@kvack.org; Michal Hocko;
> Johannes Weiner; Mel Gorman; Rik van Riel; Shaohua Li; Wang, Yalin; Minchan
> Kim
> Subject: [PATCH 3/4] mm: move lazy free pages to inactive list
> 
> MADV_FREE is a hint that it is okay to discard pages under memory
> pressure, and we rely on the reclaimers (i.e., kswapd and direct reclaim)
> to free them. There is no point in keeping them on the active anonymous
> LRU list, so this patch moves them to the head of the inactive LRU list.
> 
> This means that MADV_FREE-ed pages which were living on the inactive list
> are reclaimed first because they are more likely to be cold than
> recently active pages.
> 
> An arguable issue with this approach is whether we should put the page at
> the head or the tail of the inactive list. I selected *head* because the
> kernel cannot be sure whether a page is really cold or warm for every
> MADV_FREE use case, but at least we know it is not *hot*, so landing at
> the inactive head is a compromise for various use cases.
> 
> This fixes a suboptimal behavior of MADV_FREE where pages living on
> the active list sit there for a long time even under memory
> pressure while the inactive list is reclaimed heavily. This basically
> breaks the whole purpose of using MADV_FREE to help the system free
> memory which might not be used.
> 
> Acked-by: Michal Hocko 
> Signed-off-by: Minchan Kim 
> ---
>  include/linux/swap.h |  1 +
>  mm/madvise.c |  2 ++
>  mm/swap.c| 35 +++
>  3 files changed, 38 insertions(+)
> 
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index cee108c..0428e4c 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -308,6 +308,7 @@ extern void lru_add_drain_cpu(int cpu);
>  extern void lru_add_drain_all(void);
>  extern void rotate_reclaimable_page(struct page *page);
>  extern void deactivate_file_page(struct page *page);
> +extern void deactivate_page(struct page *page);
>  extern void swap_setup(void);
> 
>  extern void add_page_to_unevictable_list(struct page *page);
> diff --git a/mm/madvise.c b/mm/madvise.c
> index ebe692e..22e8f0c 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -340,6 +340,8 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
>   ptent = pte_mkold(ptent);
>   ptent = pte_mkclean(ptent);
>   set_pte_at(mm, addr, pte, ptent);
> + if (PageActive(page))
> + deactivate_page(page);
>   tlb_remove_tlb_entry(tlb, pte, addr);
>   }

I think this place should be changed like this:
  + if (!page_referenced(page, false, NULL, NULL, NULL) && PageActive(page))
  + deactivate_page(page);
because we don't know whether other processes are referencing this page.
If they are, we don't need to deactivate it.

Thanks


[PATCH 3/4] mm: move lazy free pages to inactive list

2015-03-10 Thread Minchan Kim
MADV_FREE is a hint that it is okay to discard pages under memory
pressure, and we rely on the reclaimers (i.e., kswapd and direct reclaim)
to free them. There is no point in keeping them on the active anonymous
LRU list, so this patch moves them to the head of the inactive LRU list.

This means that MADV_FREE-ed pages which were living on the inactive list
are reclaimed first because they are more likely to be cold than
recently active pages.

An arguable issue with this approach is whether we should put the page at
the head or the tail of the inactive list. I selected *head* because the
kernel cannot be sure whether a page is really cold or warm for every
MADV_FREE use case, but at least we know it is not *hot*, so landing at
the inactive head is a compromise for various use cases.

This fixes a suboptimal behavior of MADV_FREE where pages living on
the active list sit there for a long time even under memory
pressure while the inactive list is reclaimed heavily. This basically
breaks the whole purpose of using MADV_FREE to help the system free
memory which might not be used.

Acked-by: Michal Hocko 
Signed-off-by: Minchan Kim 
---
 include/linux/swap.h |  1 +
 mm/madvise.c         |  2 ++
 mm/swap.c            | 35 +++
 3 files changed, 38 insertions(+)
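
For context, this is roughly how an application would issue the hint
that the patch acts on. A minimal userspace sketch, assuming a kernel
and libc that expose MADV_FREE: lazy_free_demo() is a hypothetical
helper, and the fallback value 8 is taken from the generic Linux uapi
headers (an assumption for libcs that do not define the constant).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_FREE
#define MADV_FREE 8	/* assumption: generic Linux uapi value */
#endif

/*
 * Hypothetical helper: map len bytes of anonymous memory, dirty it, then
 * mark it as lazily freeable.
 * Returns 0 on success, 1 if the kernel did not accept the hint,
 * and -1 if mapping or unmapping failed.
 */
int lazy_free_demo(size_t len)
{
	int ret = 0;
	char *buf;

	assert(len > 0);

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	memset(buf, 0xaa, len);		/* fault the pages in and dirty them */

	/* Tell the kernel the contents may be discarded under pressure. */
	if (madvise(buf, len, MADV_FREE) != 0)
		ret = 1;

	/* Writing to a page again cancels the lazy free for that page. */
	buf[0] = 1;

	if (munmap(buf, len) != 0)
		ret = -1;
	return ret;
}
```

A return value of 1 simply means the running kernel does not support
the hint; the mapping itself is unaffected in that case.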

diff --git a/include/linux/swap.h b/include/linux/swap.h
index cee108c..0428e4c 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -308,6 +308,7 @@ extern void lru_add_drain_cpu(int cpu);
 extern void lru_add_drain_all(void);
 extern void rotate_reclaimable_page(struct page *page);
 extern void deactivate_file_page(struct page *page);
+extern void deactivate_page(struct page *page);
 extern void swap_setup(void);
 
 extern void add_page_to_unevictable_list(struct page *page);
diff --git a/mm/madvise.c b/mm/madvise.c
index ebe692e..22e8f0c 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -340,6 +340,8 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 		ptent = pte_mkold(ptent);
 		ptent = pte_mkclean(ptent);
 		set_pte_at(mm, addr, pte, ptent);
+		if (PageActive(page))
+			deactivate_page(page);
 		tlb_remove_tlb_entry(tlb, pte, addr);
 	}
 
diff --git a/mm/swap.c b/mm/swap.c
index 5b2a605..393968c 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -43,6 +43,7 @@ int page_cluster;
 static DEFINE_PER_CPU(struct pagevec, lru_add_pvec);
 static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
 static DEFINE_PER_CPU(struct pagevec, lru_deactivate_file_pvecs);
+static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);
 
 /*
  * This path almost never happens for VM activity - pages are normally
@@ -789,6 +790,23 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec,
 	update_page_reclaim_stat(lruvec, file, 0);
 }
 
+
+static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec,
+			    void *arg)
+{
+	if (PageLRU(page) && PageActive(page) && !PageUnevictable(page)) {
+		int file = page_is_file_cache(page);
+		int lru = page_lru_base_type(page);
+
+		del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
+		ClearPageActive(page);
+		add_page_to_lru_list(page, lruvec, lru);
+
+		__count_vm_event(PGDEACTIVATE);
+		update_page_reclaim_stat(lruvec, file, 0);
+	}
+}
+
 /*
  * Drain pages out of the cpu's pagevecs.
  * Either "cpu" is the current CPU, and preemption has already been
@@ -815,6 +833,10 @@ void lru_add_drain_cpu(int cpu)
 	if (pagevec_count(pvec))
 		pagevec_lru_move_fn(pvec, lru_deactivate_file_fn, NULL);
 
+	pvec = &per_cpu(lru_deactivate_pvecs, cpu);
+	if (pagevec_count(pvec))
+		pagevec_lru_move_fn(pvec, lru_deactivate_fn, NULL);
+
 	activate_page_drain(cpu);
 }
 
@@ -844,6 +866,18 @@ void deactivate_file_page(struct page *page)
 	}
 }
 
+void deactivate_page(struct page *page)
+{
+	if (PageLRU(page) && PageActive(page) && !PageUnevictable(page)) {
+		struct pagevec *pvec = &get_cpu_var(lru_deactivate_pvecs);
+
+		page_cache_get(page);
+		if (!pagevec_add(pvec, page))
+			pagevec_lru_move_fn(pvec, lru_deactivate_fn, NULL);
+		put_cpu_var(lru_deactivate_pvecs);
+	}
+}
+
 void lru_add_drain(void)
 {
 	lru_add_drain_cpu(get_cpu());
@@ -873,6 +907,7 @@ void lru_add_drain_all(void)
 		if (pagevec_count(&per_cpu(lru_add_pvec, cpu)) ||
 		    pagevec_count(&per_cpu(lru_rotate_pvecs, cpu)) ||
 		    pagevec_count(&per_cpu(lru_deactivate_file_pvecs, cpu)) ||
+		    pagevec_count(&per_cpu(lru_deactivate_pvecs, cpu)) ||
 		    need_activate_page_drain(cpu)) {
 			INIT_WORK(work, lru_add_drain_per_cpu);
 			schedule_work_on(cpu, work);
-- 
1.9.3
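
The batching that deactivate_page() relies on can be illustrated outside
the kernel. The sketch below is a userspace toy model, not kernel code:
every toy_ name, the batch size of 4 and the singly linked lists are
assumptions made for illustration. It shows pages being queued in a
small vector and moved from the active list to the head of the inactive
list only when the vector fills up or is drained explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PVEC_SIZE 4		/* arbitrary; not the kernel's batch size */

struct toy_page {
	int id;
	bool active;
	struct toy_page *next;	/* list head is the most recently added */
};

struct toy_lru {
	struct toy_page *active;
	struct toy_page *inactive;
	int nr_deactivated;	/* models the PGDEACTIVATE counter */
};

struct toy_pvec {
	size_t nr;
	struct toy_page *pages[TOY_PVEC_SIZE];
};

/* Put a page at the head of the active list. */
void toy_lru_add_active(struct toy_lru *lru, struct toy_page *p)
{
	p->active = true;
	p->next = lru->active;
	lru->active = p;
}

/* Unlink p from the list at *head; returns true if it was found. */
static bool toy_list_del(struct toy_page **head, struct toy_page *p)
{
	for (; *head; head = &(*head)->next) {
		if (*head == p) {
			*head = p->next;
			p->next = NULL;
			return true;
		}
	}
	return false;
}

/* Models lru_deactivate_fn(): active list -> head of inactive list. */
static void toy_deactivate_fn(struct toy_lru *lru, struct toy_page *p)
{
	if (!p->active || !toy_list_del(&lru->active, p))
		return;
	p->active = false;
	p->next = lru->inactive;	/* insert at the head, not the tail */
	lru->inactive = p;
	lru->nr_deactivated++;
}

/* Models the drain: apply the move function to every queued page. */
void toy_pvec_drain(struct toy_lru *lru, struct toy_pvec *pv)
{
	for (size_t i = 0; i < pv->nr; i++)
		toy_deactivate_fn(lru, pv->pages[i]);
	pv->nr = 0;
}

/* Models deactivate_page(): queue the page, flush when the batch is full. */
void toy_deactivate_page(struct toy_lru *lru, struct toy_pvec *pv,
			 struct toy_page *p)
{
	if (!p->active)
		return;
	assert(pv->nr < TOY_PVEC_SIZE);
	pv->pages[pv->nr++] = p;
	if (pv->nr == TOY_PVEC_SIZE)
		toy_pvec_drain(lru, pv);
}
```

In this model, queuing three pages leaves both lists untouched; the
fourth fills the batch and moves all four at once, with the most
recently queued page ending up at the inactive head.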
