Re: [RFC V3] mm: change mm_advise_free to clear page dirty

2015-03-08 Thread Minchan Kim
Hello Michal,

On Thu, Mar 05, 2015 at 04:35:05PM +0100, Michal Hocko wrote:
> On Tue 03-03-15 12:25:51, Minchan Kim wrote:
> [...]
> > From 30c6d5b35a3dc7e451041183ce5efd6a6c42bf88 Mon Sep 17 00:00:00 2001
> > From: Minchan Kim 
> > Date: Tue, 3 Mar 2015 10:06:59 +0900
> > Subject: [RFC] mm: make every pte dirty on do_swap_page
> 
> Hi Minchan, could you resend this patch separately. I am afraid that
> this one got so convoluted with originally unrelated issues that
> people might miss it.
> 
> Thanks!

No problem. Thanks for the review.
I will resend it this week, but I'm afraid everybody will be at LSF/MM,
so they will be busy with hard work there. :)


> -- 
> Michal Hocko
> SUSE Labs

-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC V3] mm: change mm_advise_free to clear page dirty

2015-03-05 Thread Michal Hocko
On Tue 03-03-15 12:25:51, Minchan Kim wrote:
[...]
> From 30c6d5b35a3dc7e451041183ce5efd6a6c42bf88 Mon Sep 17 00:00:00 2001
> From: Minchan Kim 
> Date: Tue, 3 Mar 2015 10:06:59 +0900
> Subject: [RFC] mm: make every pte dirty on do_swap_page

Hi Minchan, could you resend this patch separately. I am afraid that
this one got so convoluted with originally unrelated issues that
people might miss it.

Thanks!
-- 
Michal Hocko
SUSE Labs


Re: [RFC V3] mm: change mm_advise_free to clear page dirty

2015-03-03 Thread Minchan Kim
On Tue, Mar 03, 2015 at 02:46:40PM +0800, Wang, Yalin wrote:
> > -----Original Message-----
> > From: Minchan Kim [mailto:minchan@gmail.com] On Behalf Of Minchan Kim
> > Sent: Tuesday, March 03, 2015 12:15 PM
> > To: Wang, Yalin
> > Cc: 'Michal Hocko'; 'Andrew Morton'; 'linux-kernel@vger.kernel.org';
> > 'linux...@kvack.org'; 'Rik van Riel'; 'Johannes Weiner'; 'Mel Gorman';
> > 'Shaohua Li'; Hugh Dickins; Cyrill Gorcunov
> > Subject: Re: [RFC V3] mm: change mm_advise_free to clear page dirty
> > 
> > On Tue, Mar 03, 2015 at 11:59:17AM +0800, Wang, Yalin wrote:
> > > > -----Original Message-----
> > > > From: Minchan Kim [mailto:minchan@gmail.com] On Behalf Of Minchan
> > Kim
> > > > Sent: Tuesday, March 03, 2015 11:26 AM
> > > > To: Wang, Yalin
> > > > Cc: 'Michal Hocko'; 'Andrew Morton'; 'linux-kernel@vger.kernel.org';
> > > > 'linux...@kvack.org'; 'Rik van Riel'; 'Johannes Weiner'; 'Mel Gorman';
> > > > 'Shaohua Li'; Hugh Dickins; Cyrill Gorcunov
> > > > Subject: Re: [RFC V3] mm: change mm_advise_free to clear page dirty
> > > >
> > > > Could you separte this patch in this patchset thread?
> > > > It's tackling differnt problem.
> > > >
> > > > As well, I had a question to previous thread about why shared page
> > > > has a problem now but you didn't answer and send a new patchset.
> > > > It makes reviewers/maintainer time waste/confuse. Please, don't
> > > > hurry to send a code. Before that, resolve reviewers's comments.
> > > >
> > > > On Tue, Mar 03, 2015 at 10:06:40AM +0800, Wang, Yalin wrote:
> > > > > This patch add ClearPageDirty() to clear AnonPage dirty flag,
> > > > > if not clear page dirty for this anon page, the page will never be
> > > > > treated as freeable. We also make sure the shared AnonPage is not
> > > > > freeable, we implement it by dirty all copyed AnonPage pte,
> > > > > so that make sure the Anonpage will not become freeable, unless
> > > > > all process which shared this page call madvise_free syscall.
> > > >
> > > > Please, spend more time to make description clear. I really doubt
> > > > who understand this description without code inspection. :(
> > > > Of course, I'm not a person to write description clear like native
> > > > , either but just I'm sure I spend a more time to write description
> > > > rather than coding, at least. :)
> > > >
> > > I see, I will send another mail for file private map pages.
> > > Sorry for my English expressions.
> > > I think your solution is ok,
> > > Your patch will make sure the anonpage pte will always be dirty.
> > > I add some comments for your patch:
> > >
> > > > ---
> > > >  mm/madvise.c | 1 -
> > > >  mm/memory.c  | 9 +++--
> > > >  mm/rmap.c| 2 +-
> > > >  mm/vmscan.c  | 3 +--
> > > >  4 files changed, 9 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/mm/madvise.c b/mm/madvise.c
> > > > index 6d0fcb8..d64200e 100644
> > > > --- a/mm/madvise.c
> > > > +++ b/mm/madvise.c
> > > > @@ -309,7 +309,6 @@ static int madvise_free_pte_range(pmd_t *pmd,
> > unsigned
> > > > long addr,
> > > > continue;
> > > > }
> > > >
> > > > -   ClearPageDirty(page);
> > > > unlock_page(page);
> > > > }
> > > >
> > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > index 8ae52c9..2f45e77 100644
> > > > --- a/mm/memory.c
> > > > +++ b/mm/memory.c
> > > > @@ -2460,9 +2460,14 @@ static int do_swap_page(struct mm_struct *mm,
> > struct
> > > > vm_area_struct *vma,
> > > >
> > > > inc_mm_counter_fast(mm, MM_ANONPAGES);
> > > > dec_mm_counter_fast(mm, MM_SWAPENTS);
> > > > -   pte = mk_pte(page, vma->vm_page_prot);
> > > > +
> > > > +   /*
> > > > +* Every page swapped-out was pte_dirty so we makes pte dirty 
> > > > again.
> > > > +* MADV_FREE relys on it.
> > > > +*/
> > > > +   pte = mk_pte(pte_mkdirty(page), vma->vm_page_prot);
> > > pte_mkdirty() usage s

RE: [RFC V3] mm: change mm_advise_free to clear page dirty

2015-03-02 Thread Wang, Yalin
> -----Original Message-----
> From: Minchan Kim [mailto:minchan@gmail.com] On Behalf Of Minchan Kim
> Sent: Tuesday, March 03, 2015 12:15 PM
> To: Wang, Yalin
> Cc: 'Michal Hocko'; 'Andrew Morton'; 'linux-kernel@vger.kernel.org';
> 'linux...@kvack.org'; 'Rik van Riel'; 'Johannes Weiner'; 'Mel Gorman';
> 'Shaohua Li'; Hugh Dickins; Cyrill Gorcunov
> Subject: Re: [RFC V3] mm: change mm_advise_free to clear page dirty
> 
> On Tue, Mar 03, 2015 at 11:59:17AM +0800, Wang, Yalin wrote:
> > > -----Original Message-----
> > > From: Minchan Kim [mailto:minchan@gmail.com] On Behalf Of Minchan
> Kim
> > > Sent: Tuesday, March 03, 2015 11:26 AM
> > > To: Wang, Yalin
> > > Cc: 'Michal Hocko'; 'Andrew Morton'; 'linux-kernel@vger.kernel.org';
> > > 'linux...@kvack.org'; 'Rik van Riel'; 'Johannes Weiner'; 'Mel Gorman';
> > > 'Shaohua Li'; Hugh Dickins; Cyrill Gorcunov
> > > Subject: Re: [RFC V3] mm: change mm_advise_free to clear page dirty
> > >
> > > Could you separte this patch in this patchset thread?
> > > It's tackling differnt problem.
> > >
> > > As well, I had a question to previous thread about why shared page
> > > has a problem now but you didn't answer and send a new patchset.
> > > It makes reviewers/maintainer time waste/confuse. Please, don't
> > > hurry to send a code. Before that, resolve reviewers's comments.
> > >
> > > On Tue, Mar 03, 2015 at 10:06:40AM +0800, Wang, Yalin wrote:
> > > > This patch add ClearPageDirty() to clear AnonPage dirty flag,
> > > > if not clear page dirty for this anon page, the page will never be
> > > > treated as freeable. We also make sure the shared AnonPage is not
> > > > freeable, we implement it by dirty all copyed AnonPage pte,
> > > > so that make sure the Anonpage will not become freeable, unless
> > > > all process which shared this page call madvise_free syscall.
> > >
> > > Please, spend more time to make description clear. I really doubt
> > > who understand this description without code inspection. :(
> > > Of course, I'm not a person to write description clear like native
> > > , either but just I'm sure I spend a more time to write description
> > > rather than coding, at least. :)
> > >
> > I see, I will send another mail for file private map pages.
> > Sorry for my English expressions.
> > I think your solution is ok,
> > Your patch will make sure the anonpage pte will always be dirty.
> > I add some comments for your patch:
> >
> > > ---
> > >  mm/madvise.c | 1 -
> > >  mm/memory.c  | 9 +++--
> > >  mm/rmap.c| 2 +-
> > >  mm/vmscan.c  | 3 +--
> > >  4 files changed, 9 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/mm/madvise.c b/mm/madvise.c
> > > index 6d0fcb8..d64200e 100644
> > > --- a/mm/madvise.c
> > > +++ b/mm/madvise.c
> > > @@ -309,7 +309,6 @@ static int madvise_free_pte_range(pmd_t *pmd,
> unsigned
> > > long addr,
> > >   continue;
> > >   }
> > >
> > > - ClearPageDirty(page);
> > >   unlock_page(page);
> > >   }
> > >
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 8ae52c9..2f45e77 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -2460,9 +2460,14 @@ static int do_swap_page(struct mm_struct *mm,
> struct
> > > vm_area_struct *vma,
> > >
> > >   inc_mm_counter_fast(mm, MM_ANONPAGES);
> > >   dec_mm_counter_fast(mm, MM_SWAPENTS);
> > > - pte = mk_pte(page, vma->vm_page_prot);
> > > +
> > > + /*
> > > +  * Every page swapped-out was pte_dirty so we makes pte dirty again.
> > > +  * MADV_FREE relys on it.
> > > +  */
> > > + pte = mk_pte(pte_mkdirty(page), vma->vm_page_prot);
> > pte_mkdirty() usage seems wrong here.
> 
> Argh, it reveals I didn't test even build. My shame.
> But RFC tag might mitigate my shame. :)
> I will fix it if I send a formal version.
> Thanks for the review.
> 
> >
> > >   if ((flags & FAULT_FLAG_WRITE) && reuse_swap_page(page)) {
> > > - pte = maybe_mkwrite(pte_mkdirty(pte), vma);
> > > + pte = maybe_mkwrite(pte, vma);
> > >   flags &= ~FAULT_FLAG_WRITE;
> > >   ret |= VM_FAULT_WRITE;
> > >   exclusive = 1;
> > > diff --git a/mm/rmap.c b/mm/rmap.c
&

Re: [RFC V3] mm: change mm_advise_free to clear page dirty

2015-03-02 Thread Minchan Kim
On Tue, Mar 03, 2015 at 11:59:17AM +0800, Wang, Yalin wrote:
> > -----Original Message-----
> > From: Minchan Kim [mailto:minchan@gmail.com] On Behalf Of Minchan Kim
> > Sent: Tuesday, March 03, 2015 11:26 AM
> > To: Wang, Yalin
> > Cc: 'Michal Hocko'; 'Andrew Morton'; 'linux-kernel@vger.kernel.org';
> > 'linux...@kvack.org'; 'Rik van Riel'; 'Johannes Weiner'; 'Mel Gorman';
> > 'Shaohua Li'; Hugh Dickins; Cyrill Gorcunov
> > Subject: Re: [RFC V3] mm: change mm_advise_free to clear page dirty
> > 
> > Could you separte this patch in this patchset thread?
> > It's tackling differnt problem.
> > 
> > As well, I had a question to previous thread about why shared page
> > has a problem now but you didn't answer and send a new patchset.
> > It makes reviewers/maintainer time waste/confuse. Please, don't
> > hurry to send a code. Before that, resolve reviewers's comments.
> > 
> > On Tue, Mar 03, 2015 at 10:06:40AM +0800, Wang, Yalin wrote:
> > > This patch add ClearPageDirty() to clear AnonPage dirty flag,
> > > if not clear page dirty for this anon page, the page will never be
> > > treated as freeable. We also make sure the shared AnonPage is not
> > > freeable, we implement it by dirty all copyed AnonPage pte,
> > > so that make sure the Anonpage will not become freeable, unless
> > > all process which shared this page call madvise_free syscall.
> > 
> > Please, spend more time to make description clear. I really doubt
> > who understand this description without code inspection. :(
> > Of course, I'm not a person to write description clear like native
> > , either but just I'm sure I spend a more time to write description
> > rather than coding, at least. :)
> > 
> I see, I will send another mail for file private map pages.
> Sorry for my English expressions.
> I think your solution is ok,
> Your patch will make sure the anonpage pte will always be dirty.
> I add some comments for your patch:
> 
> > ---
> >  mm/madvise.c | 1 -
> >  mm/memory.c  | 9 +++--
> >  mm/rmap.c| 2 +-
> >  mm/vmscan.c  | 3 +--
> >  4 files changed, 9 insertions(+), 6 deletions(-)
> > 
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index 6d0fcb8..d64200e 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -309,7 +309,6 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned
> > long addr,
> > continue;
> > }
> > 
> > -   ClearPageDirty(page);
> > unlock_page(page);
> > }
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 8ae52c9..2f45e77 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2460,9 +2460,14 @@ static int do_swap_page(struct mm_struct *mm, struct
> > vm_area_struct *vma,
> > 
> > inc_mm_counter_fast(mm, MM_ANONPAGES);
> > dec_mm_counter_fast(mm, MM_SWAPENTS);
> > -   pte = mk_pte(page, vma->vm_page_prot);
> > +
> > +   /*
> > +* Every page swapped-out was pte_dirty so we makes pte dirty again.
> > +* MADV_FREE relys on it.
> > +*/
> > +   pte = mk_pte(pte_mkdirty(page), vma->vm_page_prot);
> pte_mkdirty() usage seems wrong here.

Argh, it reveals I didn't test even build. My shame.
But RFC tag might mitigate my shame. :)
I will fix it if I send a formal version.
Thanks for the review.

> 
> > if ((flags & FAULT_FLAG_WRITE) && reuse_swap_page(page)) {
> > -   pte = maybe_mkwrite(pte_mkdirty(pte), vma);
> > +   pte = maybe_mkwrite(pte, vma);
> > flags &= ~FAULT_FLAG_WRITE;
> > ret |= VM_FAULT_WRITE;
> > exclusive = 1;
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 47b3ba8..34c1d66 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1268,7 +1268,7 @@ static int try_to_unmap_one(struct page *page, struct
> > vm_area_struct *vma,
> > 
> > if (flags & TTU_FREE) {
> > VM_BUG_ON_PAGE(PageSwapCache(page), page);
> > -   if (!dirty && !PageDirty(page)) {
> > +   if (!dirty) {
> > /* It's a freeable page by MADV_FREE */
> > dec_mm_counter(mm, MM_ANONPAGES);
> > goto discard;
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 671e47e..7f520c9 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -805,8 +805,7 @@ static enum page_references
> > pag

RE: [RFC V3] mm: change mm_advise_free to clear page dirty

2015-03-02 Thread Wang, Yalin
> -----Original Message-----
> From: Minchan Kim [mailto:minchan@gmail.com] On Behalf Of Minchan Kim
> Sent: Tuesday, March 03, 2015 11:26 AM
> To: Wang, Yalin
> Cc: 'Michal Hocko'; 'Andrew Morton'; 'linux-kernel@vger.kernel.org';
> 'linux...@kvack.org'; 'Rik van Riel'; 'Johannes Weiner'; 'Mel Gorman';
> 'Shaohua Li'; Hugh Dickins; Cyrill Gorcunov
> Subject: Re: [RFC V3] mm: change mm_advise_free to clear page dirty
> 
> Could you separte this patch in this patchset thread?
> It's tackling differnt problem.
> 
> As well, I had a question to previous thread about why shared page
> has a problem now but you didn't answer and send a new patchset.
> It makes reviewers/maintainer time waste/confuse. Please, don't
> hurry to send a code. Before that, resolve reviewers's comments.
> 
> On Tue, Mar 03, 2015 at 10:06:40AM +0800, Wang, Yalin wrote:
> > This patch add ClearPageDirty() to clear AnonPage dirty flag,
> > if not clear page dirty for this anon page, the page will never be
> > treated as freeable. We also make sure the shared AnonPage is not
> > freeable, we implement it by dirty all copyed AnonPage pte,
> > so that make sure the Anonpage will not become freeable, unless
> > all process which shared this page call madvise_free syscall.
> 
> Please, spend more time to make description clear. I really doubt
> who understand this description without code inspection. :(
> Of course, I'm not a person to write description clear like native
> , either but just I'm sure I spend a more time to write description
> rather than coding, at least. :)
> 
I see. I will send a separate mail about file-backed private mapped pages.
Sorry for my English.
I think your solution is OK:
your patch makes sure the anon page pte is always dirty.
I have added some comments on your patch:

> ---
>  mm/madvise.c | 1 -
>  mm/memory.c  | 9 +++--
>  mm/rmap.c| 2 +-
>  mm/vmscan.c  | 3 +--
>  4 files changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 6d0fcb8..d64200e 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -309,7 +309,6 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned
> long addr,
>   continue;
>   }
> 
> - ClearPageDirty(page);
>   unlock_page(page);
>   }
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 8ae52c9..2f45e77 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2460,9 +2460,14 @@ static int do_swap_page(struct mm_struct *mm, struct
> vm_area_struct *vma,
> 
>   inc_mm_counter_fast(mm, MM_ANONPAGES);
>   dec_mm_counter_fast(mm, MM_SWAPENTS);
> - pte = mk_pte(page, vma->vm_page_prot);
> +
> + /*
> +  * Every page swapped-out was pte_dirty so we makes pte dirty again.
> +  * MADV_FREE relys on it.
> +  */
> + pte = mk_pte(pte_mkdirty(page), vma->vm_page_prot);
pte_mkdirty() usage seems wrong here.
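
To spell out the comment above: in the kernel, mk_pte() takes a struct page *
plus a pgprot_t and returns a pte_t, while pte_mkdirty() takes and returns a
pte_t. So pte_mkdirty(page) is a type mismatch, and the dirty bit presumably
has to be applied to the result of mk_pte() instead. A minimal user-space
sketch of that ordering follows. Every toy_* type, helper body, and
bit position is a hypothetical stand-in for illustration, not kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for kernel types; the layouts are invented for illustration. */
struct toy_page { uint64_t pfn; };
typedef struct { uint64_t val; } toy_pte_t;
typedef struct { uint64_t bits; } toy_pgprot_t;

#define TOY_PTE_DIRTY (1ULL << 6)   /* hypothetical dirty-bit position */
#define TOY_PFN_SHIFT 12            /* hypothetical pfn offset inside the pte */

/* Shaped like mk_pte(): page + protection bits -> pte. */
static toy_pte_t toy_mk_pte(const struct toy_page *page, toy_pgprot_t prot)
{
    toy_pte_t pte = { (page->pfn << TOY_PFN_SHIFT) | prot.bits };
    return pte;
}

/* Shaped like pte_mkdirty(): pte -> pte with the dirty bit set. */
static toy_pte_t toy_pte_mkdirty(toy_pte_t pte)
{
    pte.val |= TOY_PTE_DIRTY;
    return pte;
}

/* Shaped like pte_dirty(): test the dirty bit. */
static bool toy_pte_dirty(toy_pte_t pte)
{
    return (pte.val & TOY_PTE_DIRTY) != 0;
}

/* Swap-in as presumably intended: build the pte first, then dirty it.
 * toy_mk_pte(toy_pte_mkdirty(page), prot) would not compile, because
 * toy_pte_mkdirty() expects a toy_pte_t, not a struct toy_page pointer. */
static toy_pte_t toy_swapin_pte(const struct toy_page *page, toy_pgprot_t prot)
{
    return toy_pte_mkdirty(toy_mk_pte(page, prot));
}
```

In kernel terms this would read pte = pte_mkdirty(mk_pte(page, vma->vm_page_prot));
whether the formal version of the patch used exactly that form is not shown in
this thread.
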

>   if ((flags & FAULT_FLAG_WRITE) && reuse_swap_page(page)) {
> - pte = maybe_mkwrite(pte_mkdirty(pte), vma);
> + pte = maybe_mkwrite(pte, vma);
>   flags &= ~FAULT_FLAG_WRITE;
>   ret |= VM_FAULT_WRITE;
>   exclusive = 1;
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 47b3ba8..34c1d66 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1268,7 +1268,7 @@ static int try_to_unmap_one(struct page *page, struct
> vm_area_struct *vma,
> 
>   if (flags & TTU_FREE) {
>   VM_BUG_ON_PAGE(PageSwapCache(page), page);
> - if (!dirty && !PageDirty(page)) {
> + if (!dirty) {
>   /* It's a freeable page by MADV_FREE */
>   dec_mm_counter(mm, MM_ANONPAGES);
>   goto discard;
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 671e47e..7f520c9 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -805,8 +805,7 @@ static enum page_references
> page_check_references(struct page *page,
>   return PAGEREF_KEEP;
>   }
> 
> - if (PageAnon(page) && !pte_dirty && !PageSwapCache(page) &&
> - !PageDirty(page))
> + if (PageAnon(page) && !pte_dirty && !PageSwapCache(page))
>   *freeable = true;
> 
>   /* Reclaim if clean, defer dirty pages to writeback */
> --
> 1.9.3
Could we also remove the SetPageDirty(page) call in try_to_free_swap() on top of
this patch?
Since your patch makes sure the pte is always dirty, we should not need
SetPageDirty() there: the try_to_unmap() path will re-dirty the page during
reclaim, won't it?

Thanks









Re: [RFC V3] mm: change mm_advise_free to clear page dirty

2015-03-02 Thread Minchan Kim
Could you separate this patch from this patchset thread?
It's tackling a different problem.

Also, I asked a question in the previous thread about why shared pages
have a problem now, but you didn't answer and sent a new patchset instead.
That wastes reviewers' and maintainers' time and confuses them. Please don't
hurry to send code. Resolve the reviewers' comments first.

On Tue, Mar 03, 2015 at 10:06:40AM +0800, Wang, Yalin wrote:
> This patch add ClearPageDirty() to clear AnonPage dirty flag,
> if not clear page dirty for this anon page, the page will never be
> treated as freeable. We also make sure the shared AnonPage is not
> freeable, we implement it by dirty all copyed AnonPage pte,
> so that make sure the Anonpage will not become freeable, unless
> all process which shared this page call madvise_free syscall.

Please spend more time making the description clear. I really doubt
anyone can understand this description without inspecting the code. :(
Of course, I can't write descriptions as clearly as a native speaker
either, but I'm sure I spend more time writing the description
than coding, at least. :)

> 
> Signed-off-by: Yalin Wang 
> ---
>  mm/madvise.c | 16 +---
>  mm/memory.c  | 12 ++--
>  2 files changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 6d0fcb8..b61070d 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -297,23 +297,25 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned 
> long addr,
>   continue;
>  
>   page = vm_normal_page(vma, addr, ptent);
> - if (!page)
> + if (!page || !trylock_page(page))
>   continue;
>  
>   if (PageSwapCache(page)) {
> - if (!trylock_page(page))
> - continue;
> -
>   if (!try_to_free_swap(page)) {
>   unlock_page(page);
>   continue;
>   }
> -
> - ClearPageDirty(page);
> - unlock_page(page);
>   }
>  
>   /*
> +  * we clear page dirty flag for AnonPage, no matter if this
> +  * page is in swapcahce or not, AnonPage not in swapcache also 
> set
> +  * dirty flag sometimes, this happened when a AnonPage is 
> removed
> +  * from swapcahce by try_to_free_swap()
> +  */
> + ClearPageDirty(page);
> + unlock_page(page);
> + /*

Parent:

ptrP = malloc();
*ptrP = 'a';
fork(); -> child process pte is made dirty by your patch
..
memory pressure -> So, the page is swapped out.
..
..
Child: var = *ptrP; assert(var == 'a'); -> So, swapin happens and child has
pte_clean
Parent: var = *ptrP; assert(var == 'a'); -> So, swapin happens and parent has
pte_clean
..
..
Parent:
madvise_free -> remove PageDirty
So, both parent and child have pte_clean and !PageDirty, which
makes the page a target for the VM to discard.
..
VM discards the page under memory pressure.
..
Child: var = *ptrP; assert(var == 'a'); <- oops.

Also, blindly calling ClearPageDirty causes duplicate swap-outs.
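
The sequence above can be replayed in a small user-space model. This is only a
sketch under simplifying assumptions: one anonymous page shared by two
processes, a "freeable" rule of "no mapping pte is dirty and PageDirty is
clear" (taken from the checks quoted in this thread), and a switch for the
proposed "make every pte dirty on do_swap_page" behaviour. All toy_* names are
hypothetical and do not exist in the kernel:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: one anonymous page shared by parent and child after fork(). */
enum { TOY_PARENT = 0, TOY_CHILD = 1, TOY_NR_MAPPERS = 2 };

struct toy_anon_page {
    bool page_dirty;                 /* stands in for PageDirty(page) */
    bool pte_dirty[TOY_NR_MAPPERS];  /* stands in for pte_dirty() per process */
};

/* Swap-in on a read fault. dirty_on_swapin selects the proposed behaviour
 * of marking the pte dirty in do_swap_page even for read faults. */
static void toy_swapin(struct toy_anon_page *pg, int who, bool dirty_on_swapin)
{
    pg->pte_dirty[who] = dirty_on_swapin;
}

/* MADV_FREE by one process: clean only its own pte, clear PageDirty. */
static void toy_madvise_free(struct toy_anon_page *pg, int who)
{
    pg->pte_dirty[who] = false;
    pg->page_dirty = false;
}

/* Reclaim may discard the page only if nobody holds a dirty pte. */
static bool toy_page_freeable(const struct toy_anon_page *pg)
{
    for (size_t i = 0; i < TOY_NR_MAPPERS; i++)
        if (pg->pte_dirty[i])
            return false;
    return !pg->page_dirty;
}

/* Replays the scenario; returns true if reclaim could discard the page
 * while the child still expects to read 'a' from it. */
static bool toy_child_can_lose_data(bool dirty_on_swapin)
{
    struct toy_anon_page pg = { .page_dirty = false };

    toy_swapin(&pg, TOY_CHILD, dirty_on_swapin);   /* child reads *ptrP  */
    toy_swapin(&pg, TOY_PARENT, dirty_on_swapin);  /* parent reads *ptrP */
    toy_madvise_free(&pg, TOY_PARENT);             /* only parent hints  */
    return toy_page_freeable(&pg);
}
```

In this model toy_child_can_lose_data(false) is true (the oops case), while
toy_child_can_lose_data(true) is false, because the child's pte stays dirty
until the child itself calls madvise_free.
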

>* Some of architecture(ex, PPC) don't update TLB
>* with set_pte_at and tlb_remove_tlb_entry so for
>* the portability, remap the pte with old|clean
> diff --git a/mm/memory.c b/mm/memory.c
> index 8068893..3d949b3 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -874,10 +874,18 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct 
> *src_mm,
>   if (page) {
>   get_page(page);
>   page_dup_rmap(page);
> - if (PageAnon(page))
> + if (PageAnon(page)) {
> + /*
> +  * we dirty the copyed pte for anon page,
> +  * this is useful for madvise_free_pte_range(),
> +  * this can prevent shared anon page freed by 
> madvise_free
> +  * syscall
> +  */
> + pte = pte_mkdirty(pte);

This defeats the MADV_FREE hint for every hinted page. IOW, if a process that
called MADV_FREE then calls fork, the VM cannot discard the pages unless the
child frees them or calls madvise_free itself.
So if the parent calls madvise_free before fork, we can no longer free those pages.
IOW, you are ignoring the example below.

parent:
ptr1 = malloc(len);
-> allocator calls mmap(len);
memset(ptr1, 'a', len);
free(ptr1);
-> allocator calls madvise_free(ptr1, len);
fork();
..
..
-> VM discard hinted pages
child:

ptr2 = malloc(len);
-> allocator reuses the chunk allocated by the parent.
So, the child may see zero pages through ptr2, but it hasn't written
anything there yet, so either garbage or zero pages are fine for it.
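
For readers who want to see the allocator pattern from the example in user
space, here is a hedged sketch. It assumes a Linux kernel and libc that define
MADV_FREE (the hint was still under review at the time of this thread), and it
leaves out fork(). The helper names demo_madv_free() and
bytes_are_old_or_zero() are made up for illustration:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Returns true if every byte is either the old fill value or zero. */
static bool bytes_are_old_or_zero(const unsigned char *p, size_t len,
                                  unsigned char old)
{
    for (size_t i = 0; i < len; i++)
        if (p[i] != old && p[i] != 0)
            return false;
    return true;
}

/* Mimics an allocator: map a chunk, fill it, hint it free, then reuse it.
 * Returns 0 on success, 1 on unexpected contents, -1 if mmap failed. */
static int demo_madv_free(size_t len)
{
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 'a', len);                 /* "malloc + memset" */
#ifdef MADV_FREE
    /* "free": the kernel may discard these pages lazily under memory
     * pressure. The result is ignored so older kernels still run this. */
    (void)madvise(p, len, MADV_FREE);
#endif
    /* Until the next write, each page reads as the old data or as zeroes. */
    bool ok = bytes_are_old_or_zero(p, len, 'a');

    p[0] = 'b';                          /* a new write must always stick */
    ok = ok && p[0] == 'b';

    munmap(p, len);
    return ok ? 0 : 1;
}
```

Calling demo_madv_free(1 << 16) should return 0 whether or not the kernel has
discarded any page yet, which is the point of the example: the reused chunk
may hold garbage or zeroes, and that is fine as long as nothing was written
after the hint.
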

Also, you are adding new instructions to fork, which is a very frequent syscall,
so I'd like to find another way that avoids adding instructions to such a hot path.

I will send a different patch. Please review it.

So, my 

[RFC V3] mm: change mm_advise_free to clear page dirty

2015-03-02 Thread Wang, Yalin
This patch add ClearPageDirty() to clear AnonPage dirty flag,
if not clear page dirty for this anon page, the page will never be
treated as freeable. We also make sure the shared AnonPage is not
freeable, we implement it by dirty all copyed AnonPage pte,
so that make sure the Anonpage will not become freeable, unless
all process which shared this page call madvise_free syscall.

Signed-off-by: Yalin Wang 
---
 mm/madvise.c | 16 +---
 mm/memory.c  | 12 ++--
 2 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 6d0fcb8..b61070d 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -297,23 +297,25 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned 
long addr,
continue;
 
page = vm_normal_page(vma, addr, ptent);
-   if (!page)
+   if (!page || !trylock_page(page))
continue;
 
if (PageSwapCache(page)) {
-   if (!trylock_page(page))
-   continue;
-
if (!try_to_free_swap(page)) {
unlock_page(page);
continue;
}
-
-   ClearPageDirty(page);
-   unlock_page(page);
}
 
/*
+* we clear page dirty flag for AnonPage, no matter if this
+* page is in swapcahce or not, AnonPage not in swapcache also 
set
+* dirty flag sometimes, this happened when a AnonPage is 
removed
+* from swapcahce by try_to_free_swap()
+*/
+   ClearPageDirty(page);
+   unlock_page(page);
+   /*
 * Some of architecture(ex, PPC) don't update TLB
 * with set_pte_at and tlb_remove_tlb_entry so for
 * the portability, remap the pte with old|clean
diff --git a/mm/memory.c b/mm/memory.c
index 8068893..3d949b3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -874,10 +874,18 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct 
*src_mm,
if (page) {
get_page(page);
page_dup_rmap(page);
-   if (PageAnon(page))
+   if (PageAnon(page)) {
+   /*
+* we dirty the copyed pte for anon page,
+* this is useful for madvise_free_pte_range(),
+* this can prevent shared anon page freed by 
madvise_free
+* syscall
+*/
+   pte = pte_mkdirty(pte);
rss[MM_ANONPAGES]++;
-   else
+   } else {
rss[MM_FILEPAGES]++;
+   }
}
 
 out_set_pte:
-- 
2.2.2



[RFC V3] mm: change mm_advise_free to clear page dirty

2015-03-02 Thread Wang, Yalin
This patch add ClearPageDirty() to clear AnonPage dirty flag,
if not clear page dirty for this anon page, the page will never be
treated as freeable. We also make sure the shared AnonPage is not
freeable, we implement it by dirty all copyed AnonPage pte,
so that make sure the Anonpage will not become freeable, unless
all process which shared this page call madvise_free syscall.

Signed-off-by: Yalin Wang yalin.w...@sonymobile.com
---
 mm/madvise.c | 16 +---
 mm/memory.c  | 12 ++--
 2 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 6d0fcb8..b61070d 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -297,23 +297,25 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
continue;
 
page = vm_normal_page(vma, addr, ptent);
-   if (!page)
+   if (!page || !trylock_page(page))
continue;
 
if (PageSwapCache(page)) {
-   if (!trylock_page(page))
-   continue;
-
if (!try_to_free_swap(page)) {
unlock_page(page);
continue;
}
-
-   ClearPageDirty(page);
-   unlock_page(page);
}
 
/*
+* we clear page dirty flag for AnonPage, no matter if this
+* page is in swapcache or not, AnonPage not in swapcache also set
+* dirty flag sometimes, this happened when a AnonPage is removed
+* from swapcache by try_to_free_swap()
+*/
+   ClearPageDirty(page);
+   unlock_page(page);
+   /*
 * Some of architecture(ex, PPC) don't update TLB
 * with set_pte_at and tlb_remove_tlb_entry so for
 * the portability, remap the pte with old|clean
diff --git a/mm/memory.c b/mm/memory.c
index 8068893..3d949b3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -874,10 +874,18 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
if (page) {
get_page(page);
page_dup_rmap(page);
-   if (PageAnon(page))
+   if (PageAnon(page)) {
+   /*
+* we dirty the copied pte for anon page,
+* this is useful for madvise_free_pte_range(),
+* this can prevent shared anon page freed by madvise_free
+* syscall
+*/
+   pte = pte_mkdirty(pte);
rss[MM_ANONPAGES]++;
-   else
+   } else {
rss[MM_FILEPAGES]++;
+   }
}
 
 out_set_pte:
-- 
2.2.2



Re: [RFC V3] mm: change mm_advise_free to clear page dirty

2015-03-02 Thread Minchan Kim
Could you separate this patch from this patchset thread?
It's tackling a different problem.

Also, I asked in the previous thread why a shared page is a problem
now, but you didn't answer and sent a new patchset instead.
That wastes reviewers' and maintainers' time and causes confusion. Please
don't rush to send code. Resolve the reviewers' comments first.

On Tue, Mar 03, 2015 at 10:06:40AM +0800, Wang, Yalin wrote:
 This patch add ClearPageDirty() to clear AnonPage dirty flag,
 if not clear page dirty for this anon page, the page will never be
 treated as freeable. We also make sure the shared AnonPage is not
 freeable, we implement it by dirty all copyed AnonPage pte,
 so that make sure the Anonpage will not become freeable, unless
 all process which shared this page call madvise_free syscall.

Please spend more time making the description clear. I really doubt
anyone can understand this description without inspecting the code. :(
Of course, I can't write descriptions as clearly as a native speaker
either, but I'm sure I spend more time writing the description
than writing the code, at least. :)

 
 Signed-off-by: Yalin Wang yalin.w...@sonymobile.com
 ---
  mm/madvise.c | 16 +---
  mm/memory.c  | 12 ++--
  2 files changed, 19 insertions(+), 9 deletions(-)
 
 diff --git a/mm/madvise.c b/mm/madvise.c
 index 6d0fcb8..b61070d 100644
 --- a/mm/madvise.c
 +++ b/mm/madvise.c
 @@ -297,23 +297,25 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
   continue;
  
   page = vm_normal_page(vma, addr, ptent);
 - if (!page)
 + if (!page || !trylock_page(page))
   continue;
  
   if (PageSwapCache(page)) {
 - if (!trylock_page(page))
 - continue;
 -
   if (!try_to_free_swap(page)) {
   unlock_page(page);
   continue;
   }
 -
 - ClearPageDirty(page);
 - unlock_page(page);
   }
  
   /*
 +  * we clear page dirty flag for AnonPage, no matter if this
 +  * page is in swapcache or not, AnonPage not in swapcache also set
 +  * dirty flag sometimes, this happened when a AnonPage is removed
 +  * from swapcache by try_to_free_swap()
 +  */
 + ClearPageDirty(page);
 + unlock_page(page);
 + /*

Parent:

ptrP = malloc();
*ptrP = 'a';
fork(); - the child process pte is dirty because of your patch
..
memory pressure - so the page is swapped out.
..
..
Child: var = *ptrP; assert(var == 'a') - swapin happens and the child has pte_clean
Parent: var = *ptrP; assert(var == 'a') - swapin happens and the parent has pte_clean
..
..
Parent:
madvise_free - removes PageDirty
So both parent and child have pte_clean and !PageDirty, which
makes the page a target for the VM to discard.
..
The VM discards the page under memory pressure.
..
Child: var = *ptrP; assert(var == 'a');  oops.

And a blind ClearPageDirty causes duplicate swap-out.
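
The parent/child sequence above can be replayed with a small userspace toy
model. This is only a sketch: struct toy_page, the toy_* helpers and the
dirty_on_swapin switch are hypothetical names invented for illustration, not
kernel code, and the discard rule is simplified to "every pte clean and
!PageDirty" as described in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: one anonymous page mapped by a parent and a child. */
enum { PARENT = 0, CHILD = 1, NR_MAPPERS = 2 };

struct toy_page {
    bool page_dirty;             /* stands in for PageDirty(page) */
    bool pte_dirty[NR_MAPPERS];  /* stands in for pte_dirty() per mapper */
};

/* Simplified discard rule: clean in every pte and !PageDirty. */
static bool toy_discardable(const struct toy_page *p)
{
    for (int i = 0; i < NR_MAPPERS; i++)
        if (p->pte_dirty[i])
            return false;
    return !p->page_dirty;
}

/* Swap-in rebuilds both ptes; dirty_on_swapin selects the behaviour. */
static void toy_swapin(struct toy_page *p, bool dirty_on_swapin)
{
    for (int i = 0; i < NR_MAPPERS; i++)
        p->pte_dirty[i] = dirty_on_swapin;
}

/* The hint from one mapper: clear its pte dirty bit and PageDirty. */
static void toy_madvise_free(struct toy_page *p, int who)
{
    p->pte_dirty[who] = false;
    p->page_dirty = false;
}

/* Replays the sequence; returns true if the child could lose its data. */
static bool toy_child_loses_data(bool dirty_on_swapin)
{
    /* State after the parent's write and fork() with the RFC V3 patch. */
    struct toy_page p = { .page_dirty = true, .pte_dirty = { true, true } };

    toy_swapin(&p, dirty_on_swapin);  /* swapped out, then both read it */
    toy_madvise_free(&p, PARENT);     /* only the parent gives the hint */
    return toy_discardable(&p);       /* the child never gave the hint */
}
```

In this model toy_child_loses_data(false) reports that the page is
discardable even though the child still needs it, while
toy_child_loses_data(true), i.e. making the pte dirty again at swap-in,
keeps the page.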

* Some of architecture(ex, PPC) don't update TLB
* with set_pte_at and tlb_remove_tlb_entry so for
* the portability, remap the pte with old|clean
 diff --git a/mm/memory.c b/mm/memory.c
 index 8068893..3d949b3 100644
 --- a/mm/memory.c
 +++ b/mm/memory.c
 @@ -874,10 +874,18 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
   if (page) {
   get_page(page);
   page_dup_rmap(page);
 - if (PageAnon(page))
 + if (PageAnon(page)) {
 + /*
 +  * we dirty the copied pte for anon page,
 +  * this is useful for madvise_free_pte_range(),
 +  * this can prevent shared anon page freed by madvise_free
 +  * syscall
 +  */
 + pte = pte_mkdirty(pte);

It voids every MADV_FREE hint. IOW, if a process that has called MADV_FREE
calls fork, the VM cannot discard the pages unless the child frees them or
calls madvise_free itself.
So if the parent calls madvise_free before fork, we can't free those pages.
IOW, you are ignoring the example below.

parent:
ptr1 = malloc(len);
- allocator calls mmap(len);
memset(ptr1, 'a', len);
free(ptr1);
- allocator calls madvise_free(ptr1, len);
fork();
..
..
- the VM discards the hinted pages
child:

ptr2 = malloc(len)
- the allocator reuses the chunk allocated by the parent.
So the child will see zero pages at ptr2, but it hasn't written
anything there, so either garbage or a zero page is fine for it.
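
A toy model of why dirtying the pte at copy time defeats this case. It is a
sketch under stated assumptions: struct toy_mapping, the toy_* helpers and the
mkdirty_on_copy switch are hypothetical, and the model assumes that without
the RFC V3 change the child simply inherits the parent's pte dirty bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: dirty state of one anon page mapped by parent and child. */
struct toy_mapping {
    bool parent_pte_dirty;
    bool child_pte_dirty;
    bool page_dirty;
};

/* Models the pte copy in fork; mkdirty_on_copy mimics the RFC V3 change. */
static void toy_fork(struct toy_mapping *m, bool mkdirty_on_copy)
{
    m->child_pte_dirty = mkdirty_on_copy ? true : m->parent_pte_dirty;
}

/* Simplified discard rule: every pte clean and the page not dirty. */
static bool toy_can_discard(const struct toy_mapping *m)
{
    return !m->parent_pte_dirty && !m->child_pte_dirty && !m->page_dirty;
}

/* free() leads to madvise_free, then fork(): can reclaim still drop it? */
static bool toy_hint_survives_fork(bool mkdirty_on_copy)
{
    /* State after memset(): parent pte dirty, no child yet, page dirty. */
    struct toy_mapping m = { true, false, true };

    m.parent_pte_dirty = false;   /* madvise_free hint from the allocator */
    m.page_dirty = false;
    toy_fork(&m, mkdirty_on_copy);
    return toy_can_discard(&m);
}
```

With mkdirty_on_copy set, the hinted page stops being discardable as soon as
the process forks, which is the objection raised above.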

Also, you are adding new instructions to fork, which is a very frequent syscall,
so I'd like to find another way that avoids adding instructions to such a hot path.

I will send a different patch. Please review it.

So, my suggestion is below. It always makes pte dirty so let's 

RE: [RFC V3] mm: change mm_advise_free to clear page dirty

2015-03-02 Thread Wang, Yalin
 -Original Message-
 From: Minchan Kim [mailto:minchan@gmail.com] On Behalf Of Minchan Kim
 Sent: Tuesday, March 03, 2015 11:26 AM
 To: Wang, Yalin
 Cc: 'Michal Hocko'; 'Andrew Morton'; 'linux-kernel@vger.kernel.org';
 'linux...@kvack.org'; 'Rik van Riel'; 'Johannes Weiner'; 'Mel Gorman';
 'Shaohua Li'; Hugh Dickins; Cyrill Gorcunov
 Subject: Re: [RFC V3] mm: change mm_advise_free to clear page dirty
 
 Could you separte this patch in this patchset thread?
 It's tackling differnt problem.
 
 As well, I had a question to previous thread about why shared page
 has a problem now but you didn't answer and send a new patchset.
 It makes reviewers/maintainer time waste/confuse. Please, don't
 hurry to send a code. Before that, resolve reviewers's comments.
 
 On Tue, Mar 03, 2015 at 10:06:40AM +0800, Wang, Yalin wrote:
  This patch add ClearPageDirty() to clear AnonPage dirty flag,
  if not clear page dirty for this anon page, the page will never be
  treated as freeable. We also make sure the shared AnonPage is not
  freeable, we implement it by dirty all copyed AnonPage pte,
  so that make sure the Anonpage will not become freeable, unless
  all process which shared this page call madvise_free syscall.
 
 Please, spend more time to make description clear. I really doubt
 who understand this description without code inspection. :(
 Of course, I'm not a person to write description clear like native
 , either but just I'm sure I spend a more time to write description
 rather than coding, at least. :)
 
I see. I will send a separate mail for the file private-mapped pages.
Sorry for my English.
I think your solution is OK:
your patch makes sure the anon page pte is always dirty.
I have added some comments on your patch:

 ---
  mm/madvise.c | 1 -
  mm/memory.c  | 9 +++--
  mm/rmap.c| 2 +-
  mm/vmscan.c  | 3 +--
  4 files changed, 9 insertions(+), 6 deletions(-)
 
 diff --git a/mm/madvise.c b/mm/madvise.c
 index 6d0fcb8..d64200e 100644
 --- a/mm/madvise.c
 +++ b/mm/madvise.c
 @@ -309,7 +309,6 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
   continue;
   }
 
 - ClearPageDirty(page);
   unlock_page(page);
   }
 
 diff --git a/mm/memory.c b/mm/memory.c
 index 8ae52c9..2f45e77 100644
 --- a/mm/memory.c
 +++ b/mm/memory.c
 @@ -2460,9 +2460,14 @@ static int do_swap_page(struct mm_struct *mm, struct vm_area_struct *vma,
 
   inc_mm_counter_fast(mm, MM_ANONPAGES);
   dec_mm_counter_fast(mm, MM_SWAPENTS);
 - pte = mk_pte(page, vma->vm_page_prot);
 +
 + /*
 +  * Every page swapped-out was pte_dirty so we make pte dirty again.
 +  * MADV_FREE relies on it.
 +  */
 + pte = mk_pte(pte_mkdirty(page), vma->vm_page_prot);
The pte_mkdirty() usage seems wrong here: it is applied to the page, not to a pte.

   if ((flags & FAULT_FLAG_WRITE) && reuse_swap_page(page)) {
 - pte = maybe_mkwrite(pte_mkdirty(pte), vma);
 + pte = maybe_mkwrite(pte, vma);
   flags &= ~FAULT_FLAG_WRITE;
   ret |= VM_FAULT_WRITE;
   exclusive = 1;
 diff --git a/mm/rmap.c b/mm/rmap.c
 index 47b3ba8..34c1d66 100644
 --- a/mm/rmap.c
 +++ b/mm/rmap.c
 @@ -1268,7 +1268,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 
   if (flags & TTU_FREE) {
   VM_BUG_ON_PAGE(PageSwapCache(page), page);
 - if (!dirty && !PageDirty(page)) {
 + if (!dirty) {
   /* It's a freeable page by MADV_FREE */
   dec_mm_counter(mm, MM_ANONPAGES);
   goto discard;
 diff --git a/mm/vmscan.c b/mm/vmscan.c
 index 671e47e..7f520c9 100644
 --- a/mm/vmscan.c
 +++ b/mm/vmscan.c
 @@ -805,8 +805,7 @@ static enum page_references page_check_references(struct page *page,
   return PAGEREF_KEEP;
   }
 
 - if (PageAnon(page) && !pte_dirty && !PageSwapCache(page) &&
 - !PageDirty(page))
 + if (PageAnon(page) && !pte_dirty && !PageSwapCache(page))
   *freeable = true;
 
   /* Reclaim if clean, defer dirty pages to writeback */
 --
 1.9.3
Could we remove SetPageDirty(page) in try_to_free_swap() on top of this patch?
Since your patch makes sure the pte is always dirty,
we don't need SetPageDirty() there.
The try_to_unmap() path will re-dirty the page during reclaim,
won't it?

Thanks









Re: [RFC V3] mm: change mm_advise_free to clear page dirty

2015-03-02 Thread Minchan Kim
On Tue, Mar 03, 2015 at 11:59:17AM +0800, Wang, Yalin wrote:
  -Original Message-
  From: Minchan Kim [mailto:minchan@gmail.com] On Behalf Of Minchan Kim
  Sent: Tuesday, March 03, 2015 11:26 AM
  To: Wang, Yalin
  Cc: 'Michal Hocko'; 'Andrew Morton'; 'linux-kernel@vger.kernel.org';
  'linux...@kvack.org'; 'Rik van Riel'; 'Johannes Weiner'; 'Mel Gorman';
  'Shaohua Li'; Hugh Dickins; Cyrill Gorcunov
  Subject: Re: [RFC V3] mm: change mm_advise_free to clear page dirty
  
  Could you separte this patch in this patchset thread?
  It's tackling differnt problem.
  
  As well, I had a question to previous thread about why shared page
  has a problem now but you didn't answer and send a new patchset.
  It makes reviewers/maintainer time waste/confuse. Please, don't
  hurry to send a code. Before that, resolve reviewers's comments.
  
  On Tue, Mar 03, 2015 at 10:06:40AM +0800, Wang, Yalin wrote:
   This patch add ClearPageDirty() to clear AnonPage dirty flag,
   if not clear page dirty for this anon page, the page will never be
   treated as freeable. We also make sure the shared AnonPage is not
   freeable, we implement it by dirty all copyed AnonPage pte,
   so that make sure the Anonpage will not become freeable, unless
   all process which shared this page call madvise_free syscall.
  
  Please, spend more time to make description clear. I really doubt
  who understand this description without code inspection. :(
  Of course, I'm not a person to write description clear like native
  , either but just I'm sure I spend a more time to write description
  rather than coding, at least. :)
  
 I see, I will send another mail for file private map pages.
 Sorry for my English expressions.
 I think your solution is ok,
 Your patch will make sure the anonpage pte will always be dirty.
 I add some comments for your patch:
 
  ---
   mm/madvise.c | 1 -
   mm/memory.c  | 9 +++--
   mm/rmap.c| 2 +-
   mm/vmscan.c  | 3 +--
   4 files changed, 9 insertions(+), 6 deletions(-)
  
  diff --git a/mm/madvise.c b/mm/madvise.c
  index 6d0fcb8..d64200e 100644
  --- a/mm/madvise.c
  +++ b/mm/madvise.c
  @@ -309,7 +309,6 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
  continue;
  }
  
  -   ClearPageDirty(page);
  unlock_page(page);
  }
  
  diff --git a/mm/memory.c b/mm/memory.c
  index 8ae52c9..2f45e77 100644
  --- a/mm/memory.c
  +++ b/mm/memory.c
  @@ -2460,9 +2460,14 @@ static int do_swap_page(struct mm_struct *mm, struct vm_area_struct *vma,
  
  inc_mm_counter_fast(mm, MM_ANONPAGES);
  dec_mm_counter_fast(mm, MM_SWAPENTS);
  -   pte = mk_pte(page, vma->vm_page_prot);
  +
  +   /*
  +* Every page swapped-out was pte_dirty so we make pte dirty again.
  +* MADV_FREE relies on it.
  +*/
  +   pte = mk_pte(pte_mkdirty(page), vma->vm_page_prot);
 pte_mkdirty() usage seems wrong here.

Argh, it reveals that I didn't even build-test it. Shame on me.
But the RFC tag might mitigate my shame. :)
I will fix it when I send a formal version.
Thanks for the review.
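
The mistake flagged above is that pte_mkdirty() was handed the page rather
than a pte. A small userspace sketch of the type distinction follows. All
toy_* names, TOY_PTE_DIRTY and TOY_PFN_SHIFT are hypothetical stand-ins
(the kernel's real pte layout is architecture-specific), and building the pte
first and then marking it dirty is only one plausible ordering; the formal fix
is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for struct page and pte_t. */
struct toy_page { unsigned long pfn; };
typedef struct { unsigned long val; } toy_pte_t;

#define TOY_PTE_DIRTY 0x1UL
#define TOY_PFN_SHIFT 12

/* Builds a pte value from a page and protection bits, in the spirit of mk_pte(). */
static toy_pte_t toy_mk_pte(const struct toy_page *page, unsigned long prot)
{
    toy_pte_t pte = { (page->pfn << TOY_PFN_SHIFT) | prot };
    return pte;
}

/* Takes and returns a pte value, never a page pointer. */
static toy_pte_t toy_pte_mkdirty(toy_pte_t pte)
{
    pte.val |= TOY_PTE_DIRTY;
    return pte;
}

static bool toy_pte_dirty(toy_pte_t pte)
{
    return (pte.val & TOY_PTE_DIRTY) != 0;
}

/* One plausible ordering: build the pte first, then mark it dirty. */
static toy_pte_t toy_swapin_pte(const struct toy_page *page, unsigned long prot)
{
    return toy_pte_mkdirty(toy_mk_pte(page, prot));
}
```

Because toy_pte_mkdirty() accepts only a toy_pte_t, passing a page pointer to
it fails to compile, which is the kind of error a build test would have caught.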

 
  if ((flags & FAULT_FLAG_WRITE) && reuse_swap_page(page)) {
  -   pte = maybe_mkwrite(pte_mkdirty(pte), vma);
  +   pte = maybe_mkwrite(pte, vma);
  flags &= ~FAULT_FLAG_WRITE;
  ret |= VM_FAULT_WRITE;
  exclusive = 1;
  diff --git a/mm/rmap.c b/mm/rmap.c
  index 47b3ba8..34c1d66 100644
  --- a/mm/rmap.c
  +++ b/mm/rmap.c
  @@ -1268,7 +1268,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
  
  if (flags & TTU_FREE) {
  VM_BUG_ON_PAGE(PageSwapCache(page), page);
  -   if (!dirty && !PageDirty(page)) {
  +   if (!dirty) {
  /* It's a freeable page by MADV_FREE */
  dec_mm_counter(mm, MM_ANONPAGES);
  goto discard;
  diff --git a/mm/vmscan.c b/mm/vmscan.c
  index 671e47e..7f520c9 100644
  --- a/mm/vmscan.c
  +++ b/mm/vmscan.c
  @@ -805,8 +805,7 @@ static enum page_references page_check_references(struct page *page,
  return PAGEREF_KEEP;
  }
  
  -   if (PageAnon(page) && !pte_dirty && !PageSwapCache(page) &&
  -   !PageDirty(page))
  +   if (PageAnon(page) && !pte_dirty && !PageSwapCache(page))
  *freeable = true;
  
  /* Reclaim if clean, defer dirty pages to writeback */
  --
  1.9.3
 Could we remove SetPageDirty(page); in try_to_free_swap() function based on 
 this patch?
 Because your patch will make sure the pte is always dirty,
 We don't need setpagedirty(),
 The try_to_unmap() path will re-dirty the page during reclaim path,
 Isn't it?

I don't know what side effects we would have if we removed SetPageDirty.
It might regress tmpfs, which can have pages without a pte.
I don't want to take that risk in this patch.
If you want it, you could propose it separately

RE: [RFC V3] mm: change mm_advise_free to clear page dirty

2015-03-02 Thread Wang, Yalin
 -Original Message-
 From: Minchan Kim [mailto:minchan@gmail.com] On Behalf Of Minchan Kim
 Sent: Tuesday, March 03, 2015 12:15 PM
 To: Wang, Yalin
 Cc: 'Michal Hocko'; 'Andrew Morton'; 'linux-kernel@vger.kernel.org';
 'linux...@kvack.org'; 'Rik van Riel'; 'Johannes Weiner'; 'Mel Gorman';
 'Shaohua Li'; Hugh Dickins; Cyrill Gorcunov
 Subject: Re: [RFC V3] mm: change mm_advise_free to clear page dirty
 
 On Tue, Mar 03, 2015 at 11:59:17AM +0800, Wang, Yalin wrote:
   -Original Message-
   From: Minchan Kim [mailto:minchan@gmail.com] On Behalf Of Minchan
 Kim
   Sent: Tuesday, March 03, 2015 11:26 AM
   To: Wang, Yalin
   Cc: 'Michal Hocko'; 'Andrew Morton'; 'linux-kernel@vger.kernel.org';
   'linux...@kvack.org'; 'Rik van Riel'; 'Johannes Weiner'; 'Mel Gorman';
   'Shaohua Li'; Hugh Dickins; Cyrill Gorcunov
   Subject: Re: [RFC V3] mm: change mm_advise_free to clear page dirty
  
   Could you separte this patch in this patchset thread?
   It's tackling differnt problem.
  
   As well, I had a question to previous thread about why shared page
   has a problem now but you didn't answer and send a new patchset.
   It makes reviewers/maintainer time waste/confuse. Please, don't
   hurry to send a code. Before that, resolve reviewers's comments.
  
   On Tue, Mar 03, 2015 at 10:06:40AM +0800, Wang, Yalin wrote:
This patch add ClearPageDirty() to clear AnonPage dirty flag,
if not clear page dirty for this anon page, the page will never be
treated as freeable. We also make sure the shared AnonPage is not
freeable, we implement it by dirty all copyed AnonPage pte,
so that make sure the Anonpage will not become freeable, unless
all process which shared this page call madvise_free syscall.
  
   Please, spend more time to make description clear. I really doubt
   who understand this description without code inspection. :(
   Of course, I'm not a person to write description clear like native
   , either but just I'm sure I spend a more time to write description
   rather than coding, at least. :)
  
  I see, I will send another mail for file private map pages.
  Sorry for my English expressions.
  I think your solution is ok,
  Your patch will make sure the anonpage pte will always be dirty.
  I add some comments for your patch:
 
   ---
mm/madvise.c | 1 -
mm/memory.c  | 9 +++--
mm/rmap.c| 2 +-
mm/vmscan.c  | 3 +--
4 files changed, 9 insertions(+), 6 deletions(-)
  
   diff --git a/mm/madvise.c b/mm/madvise.c
   index 6d0fcb8..d64200e 100644
   --- a/mm/madvise.c
   +++ b/mm/madvise.c
   @@ -309,7 +309,6 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 continue;
 }
  
   - ClearPageDirty(page);
 unlock_page(page);
 }
  
   diff --git a/mm/memory.c b/mm/memory.c
   index 8ae52c9..2f45e77 100644
   --- a/mm/memory.c
   +++ b/mm/memory.c
   @@ -2460,9 +2460,14 @@ static int do_swap_page(struct mm_struct *mm, struct vm_area_struct *vma,
  
 inc_mm_counter_fast(mm, MM_ANONPAGES);
 dec_mm_counter_fast(mm, MM_SWAPENTS);
   - pte = mk_pte(page, vma->vm_page_prot);
   +
   + /*
   +  * Every page swapped-out was pte_dirty so we make pte dirty again.
   +  * MADV_FREE relies on it.
   +  */
   + pte = mk_pte(pte_mkdirty(page), vma->vm_page_prot);
  pte_mkdirty() usage seems wrong here.
 
 Argh, it reveals I didn't test even build. My shame.
 But RFC tag might mitigate my shame. :)
 I will fix it if I send a formal version.
 Thanks for the review.
 
 
 if ((flags & FAULT_FLAG_WRITE) && reuse_swap_page(page)) {
   - pte = maybe_mkwrite(pte_mkdirty(pte), vma);
   + pte = maybe_mkwrite(pte, vma);
 flags &= ~FAULT_FLAG_WRITE;
 ret |= VM_FAULT_WRITE;
 exclusive = 1;
   diff --git a/mm/rmap.c b/mm/rmap.c
   index 47b3ba8..34c1d66 100644
   --- a/mm/rmap.c
   +++ b/mm/rmap.c
   @@ -1268,7 +1268,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
  
 if (flags & TTU_FREE) {
 VM_BUG_ON_PAGE(PageSwapCache(page), page);
   - if (!dirty && !PageDirty(page)) {
   + if (!dirty) {
 /* It's a freeable page by MADV_FREE */
 dec_mm_counter(mm, MM_ANONPAGES);
 goto discard;
   diff --git a/mm/vmscan.c b/mm/vmscan.c
   index 671e47e..7f520c9 100644
   --- a/mm/vmscan.c
   +++ b/mm/vmscan.c
   @@ -805,8 +805,7 @@ static enum page_references page_check_references(struct page *page,
 return PAGEREF_KEEP;
 }
  
   - if (PageAnon(page) && !pte_dirty && !PageSwapCache(page) &&
   - !PageDirty(page))
   + if (PageAnon(page) && !pte_dirty && !PageSwapCache(page))
 *freeable = true;
  
 /* Reclaim if clean, defer dirty pages to writeback */
   --
   1.9.3
  Could we remove SetPageDirty