Re: [PATCH v2] mm, hugetlb: set PageLRU for in-use/active hugepages

2015-02-17 Thread Naoya Horiguchi
On Tue, Feb 17, 2015 at 04:02:49PM -0800, Andrew Morton wrote:
> On Tue, 17 Feb 2015 15:57:44 -0800 Andrew Morton wrote:
> 
> > So if I'm understanding this correctly, hugepages never have PG_lru set
> > and so you are overloading that bit on hugepages to indicate that the
> > page is present on hstate->hugepage_activelist?
> 
> And maybe we don't need to overload PG_lru at all?  There's plenty of
> free space in the compound page's *(page + 1).

Right, overloading PG_lru isn't necessary. I'll use PG_private in *(page + 1)
instead; it is unused now, so there is no risk of conflicting with other users.
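
The idea of keeping the flag in the first tail page can be sketched as a small
user-space model. This is only an illustration under stated assumptions:
struct model_page, MODEL_PG_PRIVATE and the model_* helpers are hypothetical
stand-ins, not the kernel's struct page, PG_private, or any helper from the
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's struct page and PG_private. */
#define MODEL_PG_PRIVATE (1UL << 0)
#define MODEL_HPAGE_NR   512	/* base pages per hugepage in this model */

struct model_page {
	unsigned long flags;
};

/*
 * 'head' points at the first of MODEL_HPAGE_NR contiguous entries.
 * The "active" state is kept in the first tail page, head[1], so the
 * head page's own flag bits keep their usual meaning.
 */
static inline bool model_hugetlb_active(const struct model_page *head)
{
	return (head[1].flags & MODEL_PG_PRIVATE) != 0;
}

static inline void model_set_hugetlb_active(struct model_page *head)
{
	head[1].flags |= MODEL_PG_PRIVATE;
}

static inline void model_clear_hugetlb_active(struct model_page *head)
{
	head[1].flags &= ~MODEL_PG_PRIVATE;
}
```

In the kernel, setting and clearing would still be expected to happen under
hugetlb_lock, apart from the freshly allocated case discussed in the patch
description.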

Thanks,
Naoya Horiguchi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] mm, hugetlb: set PageLRU for in-use/active hugepages

2015-02-17 Thread Naoya Horiguchi
On Tue, Feb 17, 2015 at 03:57:44PM -0800, Andrew Morton wrote:
> On Tue, 17 Feb 2015 09:32:08 + Naoya Horiguchi wrote:
> 
> > Currently we are not safe from concurrent calls of isolate_huge_page(),
> > which can make the victim hugepage in invalid state and results in BUG_ON().
> > 
> > The root problem of this is that we don't have any information on struct page
> > (so easily accessible) about the hugepage's activeness. Note that hugepages'
> > activeness means just being linked to hstate->hugepage_activelist, which is
> > not the same as normal pages' activeness represented by PageActive flag.
> > 
> > Normal pages are isolated by isolate_lru_page() which prechecks PageLRU before
> > isolation, so let's do similarly for hugetlb. PageLRU is unused on hugetlb now,
> > so the change is mostly just inserting Set/ClearPageLRU (no conflict with
> > current usage.) And the other changes are justified like below:
> > - __put_compound_page() calls __page_cache_release() to do some LRU works,
> >   but this is obviously for thps and assumes that hugetlb has always !PageLRU.
> >   This assumption is not true any more, so this patch simply adds if (!PageHuge)
> >   to avoid calling __page_cache_release() for hugetlb.
> > - soft_offline_huge_page() now just calls list_move(), but generally callers
> >   of page migration should use the common routine in isolation, so let's
> >   replace the list_move() with isolate_huge_page() rather than inserting
> >   ClearPageLRU.
> > 
> > Set/ClearPageLRU should be called within hugetlb_lock, but hugetlb_cow() and
> > hugetlb_no_page() don't do this. This is justified because in these function
> > SetPageLRU is called right after the hugepage is allocated and no other thread
> > tries to isolate it.
> 
> Whoa.
> 
> So if I'm understanding this correctly, hugepages never have PG_lru set
> and so you are overloading that bit on hugepages to indicate that the
> page is present on hstate->hugepage_activelist?

Right, that's my intention.
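
The precheck described in the quoted commit message (test the "active" bit
under the lock and isolate only if it is set) can be sketched in user space
as follows. This is a model under stated assumptions, not the code in the
patch: struct model_hpage, the atomic_flag-based model_hugetlb_lock, and
model_isolate_huge_page() are all hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the per-hugepage state relevant to isolation. */
struct model_hpage {
	bool active;	/* stands in for "linked on hstate->hugepage_activelist" */
	int refcount;	/* stands in for the page reference count */
	bool isolated;	/* stands in for "moved to the caller's private list" */
};

/* Minimal spinlock standing in for hugetlb_lock in this model. */
static atomic_flag model_hugetlb_lock = ATOMIC_FLAG_INIT;

static inline void model_lock(void)
{
	while (atomic_flag_test_and_set(&model_hugetlb_lock))
		;	/* spin until the lock is released */
}

static inline void model_unlock(void)
{
	atomic_flag_clear(&model_hugetlb_lock);
}

/*
 * Check and clear the active state inside one critical section, so that
 * of two concurrent callers only the first can succeed.
 */
static inline bool model_isolate_huge_page(struct model_hpage *page)
{
	bool ret = false;

	model_lock();
	if (page->active && page->refcount > 0) {
		page->refcount++;	/* pin the page for the caller */
		page->active = false;	/* no longer on the active list */
		page->isolated = true;
		ret = true;
	}
	model_unlock();
	return ret;
}
```

A second call on the same page returns false instead of moving a page that
is no longer on the active list, which is the failure mode the patch
description attributes to concurrent isolate_huge_page() calls.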

> This is somewhat of a big deal and the patch doesn't make it very clear
> at all.  We should
> 
> - document PG_lru, for both of its identities

OK, I'll do this.

> - consider adding a new PG_hugepage_active(?) flag which has the same
>   value as PG_lru (see how PG_savepinned was done).

I considered this at first, but held off to keep the first patch simple.
I agree it is needed eventually, so I'll do it next.

I'll probably name it PG_hugetlb_active, because "hugepage" alone could
cause confusion between hugetlb and thp in the future.

> - create suitable helper functions for the new PG_lru meaning. 
>   Simply calling PageLRU/SetPageLRU for pages which *aren't on the LRU*
>   is lazy and misleading.  Create a name for the new concept
>   (hugepage_active?) and document it and use it consistently.

OK.
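
One way to picture the suggestion (a second flag name sharing PG_lru's bit,
plus dedicated helpers) is the user-space sketch below. Everything here is
an assumption for illustration: enum model_pageflags, struct model_page and
the model_* helpers are hypothetical, and the real enum pageflags has more
members and a different layout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a page-flag enum with an aliased name. */
enum model_pageflags {
	MODEL_PG_locked,
	MODEL_PG_lru,
	MODEL_PG_active,
	/* Alias: same bit as MODEL_PG_lru, different meaning on hugetlb pages. */
	MODEL_PG_hugetlb_active = MODEL_PG_lru,
};

struct model_page {
	unsigned long flags;
};

/* Helpers named after the concept, so callers never spell "LRU" for hugetlb. */
static inline bool model_page_hugetlb_active(const struct model_page *page)
{
	return (page->flags & (1UL << MODEL_PG_hugetlb_active)) != 0;
}

static inline void model_set_page_hugetlb_active(struct model_page *page)
{
	page->flags |= 1UL << MODEL_PG_hugetlb_active;
}

static inline void model_clear_page_hugetlb_active(struct model_page *page)
{
	page->flags &= ~(1UL << MODEL_PG_hugetlb_active);
}
```

The bit is shared, but each call site states which meaning it relies on,
which is the readability point raised in the review.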

> 
> > @@ -75,7 +76,8 @@ static void __put_compound_page(struct page *page)
> >  {
> >  	compound_page_dtor *dtor;
> >  
> > -	__page_cache_release(page);
> > +	if (!PageHuge(page))
> > +		__page_cache_release(page);
> >  	dtor = get_compound_page_dtor(page);
> >  	(*dtor)(page);
> 
> And this needs a good comment - there's no way that a reader can work
> out why this code is here unless he goes dumpster diving in the git
> history.

OK.

Thanks,
Naoya Horiguchi


Re: [PATCH v2] mm, hugetlb: set PageLRU for in-use/active hugepages

2015-02-17 Thread Andrew Morton
On Tue, 17 Feb 2015 15:57:44 -0800 Andrew Morton wrote:

> So if I'm understanding this correctly, hugepages never have PG_lru set
> and so you are overloading that bit on hugepages to indicate that the
> page is present on hstate->hugepage_activelist?

And maybe we don't need to overload PG_lru at all?  There's plenty of
free space in the compound page's *(page + 1).



Re: [PATCH v2] mm, hugetlb: set PageLRU for in-use/active hugepages

2015-02-17 Thread Andrew Morton
On Tue, 17 Feb 2015 09:32:08 + Naoya Horiguchi wrote:

> Currently we are not safe from concurrent calls of isolate_huge_page(),
> which can make the victim hugepage in invalid state and results in BUG_ON().
> 
> The root problem of this is that we don't have any information on struct page
> (so easily accessible) about the hugepage's activeness. Note that hugepages'
> activeness means just being linked to hstate->hugepage_activelist, which is
> not the same as normal pages' activeness represented by PageActive flag.
> 
> Normal pages are isolated by isolate_lru_page() which prechecks PageLRU before
> isolation, so let's do similarly for hugetlb. PageLRU is unused on hugetlb now,
> so the change is mostly just inserting Set/ClearPageLRU (no conflict with
> current usage.) And the other changes are justified like below:
> - __put_compound_page() calls __page_cache_release() to do some LRU works,
>   but this is obviously for thps and assumes that hugetlb has always !PageLRU.
>   This assumption is not true any more, so this patch simply adds if (!PageHuge)
>   to avoid calling __page_cache_release() for hugetlb.
> - soft_offline_huge_page() now just calls list_move(), but generally callers
>   of page migration should use the common routine in isolation, so let's
>   replace the list_move() with isolate_huge_page() rather than inserting
>   ClearPageLRU.
> 
> Set/ClearPageLRU should be called within hugetlb_lock, but hugetlb_cow() and
> hugetlb_no_page() don't do this. This is justified because in these function
> SetPageLRU is called right after the hugepage is allocated and no other thread
> tries to isolate it.

Whoa.

So if I'm understanding this correctly, hugepages never have PG_lru set
and so you are overloading that bit on hugepages to indicate that the
page is present on hstate->hugepage_activelist?

This is somewhat of a big deal and the patch doesn't make it very clear
at all.  We should

- document PG_lru, for both of its identities

- consider adding a new PG_hugepage_active(?) flag which has the same
  value as PG_lru (see how PG_savepinned was done).

- create suitable helper functions for the new PG_lru meaning. 
  Simply calling PageLRU/SetPageLRU for pages which *aren't on the LRU*
  is lazy and misleading.  Create a name for the new concept
  (hugepage_active?) and document it and use it consistently.


> @@ -75,7 +76,8 @@ static void __put_compound_page(struct page *page)
>  {
>  	compound_page_dtor *dtor;
>  
> -	__page_cache_release(page);
> +	if (!PageHuge(page))
> +		__page_cache_release(page);
>  	dtor = get_compound_page_dtor(page);
>  	(*dtor)(page);

And this needs a good comment - there's no way that a reader can work
out why this code is here unless he goes dumpster diving in the git
history.


[PATCH v2] mm, hugetlb: set PageLRU for in-use/active hugepages

2015-02-17 Thread Naoya Horiguchi
On Tue, Feb 17, 2015 at 03:22:45AM +, Horiguchi Naoya(堀口 直也) wrote:
> Currently we are not safe from concurrent calls of isolate_huge_page(),
> which can make the victim hugepage in invalid state and results in BUG_ON().
> 
> The root problem of this is that we don't have any information on struct page
> (so easily accessible) about the hugepage's activeness. Note that hugepages'
> activeness means just being linked to hstate->hugepage_activelist, which is
> not the same as normal pages' activeness represented by PageActive flag.
> 
> Normal pages are isolated by isolate_lru_page() which prechecks PageLRU before
> isolation, so let's do similarly for hugetlb. PageLRU is unused on hugetlb,
> so this change is mostly straightforward. One non-straightforward point is that
> __put_compound_page() calls __page_cache_release() to do some LRU works,
> but this is obviously for thps and assumes that hugetlb has always !PageLRU.
> This assumption is no more true, so this patch simply adds if (!PageHuge) to
> avoid calling __page_cache_release() for hugetlb.
> 
> Set/ClearPageLRU should be called within hugetlb_lock, but hugetlb_cow() and
> hugetlb_no_page() don't do this. This is justified because in these function
> SetPageLRU is called right after the hugepage is allocated and no other thread
> tries to isolate it.
> 
> Fixes: commit 31caf665e666 ("mm: migrate: make core migration code aware of hugepage")
> Signed-off-by: Naoya Horiguchi 
> Cc: [3.12+]

Sorry, my testing was insufficient and I found a bug in the soft offline code.
Here is the updated patch.

Thanks,
Naoya Horiguchi

From e69950011360f624e08712de4d541c7d686d6296 Mon Sep 17 00:00:00 2001
From: Naoya Horiguchi 
Date: Mon, 16 Feb 2015 18:33:35 +0900
Subject: [PATCH v2] mm, hugetlb: set PageLRU for in-use/active hugepages

Currently we are not safe from concurrent calls of isolate_huge_page(),
which can make the victim hugepage in invalid state and results in BUG_ON().

The root problem of this is that we don't have any information on struct page
(so easily accessible) about the hugepage's activeness. Note that hugepages'
activeness means just being linked to hstate->hugepage_activelist, which is
not the same as normal pages' activeness represented by PageActive flag.

Normal pages are isolated by isolate_lru_page() which prechecks PageLRU before
isolation, so let's do similarly for hugetlb. PageLRU is unused on hugetlb now,
so the change is mostly just inserting Set/ClearPageLRU (no conflict with
current usage.) And the other changes are justified like below:
- __put_compound_page() calls __page_cache_release() to do some LRU works,
  but this is obviously for thps and assumes that hugetlb has always !PageLRU.
  This assumption is not true any more, so this patch simply adds if (!PageHuge)
  to avoid calling __page_cache_release() for hugetlb.
- soft_offline_huge_page() now just calls list_move(), but generally callers
  of page migration should use the common routine in isolation, so let's
  replace the list_move() with isolate_huge_page() rather than inserting
  ClearPageLRU.

Set/ClearPageLRU should be called within hugetlb_lock, but hugetlb_cow() and
hugetlb_no_page() don't do this. This is justified because in these function
SetPageLRU is called right after the hugepage is allocated and no other thread
tries to isolate it.

Fixes: commit 31caf665e666 ("mm: migrate: make core migration code aware of hugepage")
Signed-off-by: Naoya Horiguchi 
Cc: [3.12+]
---
ChangeLog v1->v2:
- call isolate_huge_page() in soft_offline_huge_page() instead of list_move()
---
 mm/hugetlb.c        | 17 ++---
 mm/memory-failure.c | 14 --
 mm/swap.c           |  4 +++-
 3 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a2bfd02e289f..e28489270d9a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -830,7 +830,7 @@ static void update_and_free_page(struct hstate *h, struct page *page)
 		page[i].flags &= ~(1 << PG_locked | 1 << PG_error |
 				1 << PG_referenced | 1 << PG_dirty |
 				1 << PG_active | 1 << PG_private |
-				1 << PG_writeback);
+				1 << PG_writeback | 1 << PG_lru);
 	}
 	VM_BUG_ON_PAGE(hugetlb_cgroup_from_page(page), page);
 	set_compound_page_dtor(page, NULL);
@@ -875,6 +875,7 @@ void free_huge_page(struct page *page)
 	ClearPagePrivate(page);
 
 	spin_lock(&hugetlb_lock);
+	ClearPageLRU(page);
 	hugetlb_cgroup_uncharge_page(hstate_index(h),
 				     pages_per_huge_page(h), page);
 	if (restore_reserve)
@@ -2889,6 +2890,7 @@ static int