Re: [RFC PATCH 2/3] hugetlb: convert page_huge_active() to HPageMigratable flag

2021-01-15 Thread Mike Kravetz
On 1/15/21 12:29 PM, Oscar Salvador wrote:
> 
> About that alloc_contig_range topic, I would like to take a look unless
> someone is already on it or about to be.
> 
> Thanks Mike for the time ;-)

Feel free.

My first thought is that migration of a free hugetlb page would need to
be something like:
1) allocate a fresh hugetlb page from buddy
2) free the 'migrated' free huge page back to buddy

I do not think we can use the existing 'isolate-migrate' flow.  Isolating
a page would make it unavailable for allocation and that could cause
application issues.  
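
The two-step idea above can be sketched in user space. This is only a toy model under stated assumptions: `struct toy_pool`, `toy_alloc_fresh()` and `toy_replace_free_page()` are hypothetical names, not kernel interfaces. It shows the replacement happening in place, so the number of free pages never drops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pool of free huge pages; not a kernel type. */
struct toy_pool {
	int free_pages[8];	/* ids of pages currently on the free list */
	size_t nr_free;
	int next_id;		/* source of "fresh" page ids from the toy buddy */
	bool buddy_empty;	/* set to simulate an allocation failure */
};

/* Step 1: get a fresh page from the toy buddy allocator, or -1 on failure. */
static int toy_alloc_fresh(struct toy_pool *p)
{
	return p->buddy_empty ? -1 : p->next_id++;
}

/*
 * Replace the free page 'old' with a fresh one. The fresh page takes over
 * the slot of the old one, so nr_free is unchanged throughout. Returns
 * false if 'old' is no longer free or if no fresh page could be allocated.
 */
bool toy_replace_free_page(struct toy_pool *p, int old)
{
	size_t i;
	int fresh;

	for (i = 0; i < p->nr_free; i++)
		if (p->free_pages[i] == old)
			break;
	if (i == p->nr_free)
		return false;		/* page was handed out in the meantime */

	fresh = toy_alloc_fresh(p);
	if (fresh < 0)
		return false;		/* keep the old page where it is */

	p->free_pages[i] = fresh;	/* step 2: old page goes back to buddy */
	assert(p->nr_free > 0);
	return true;
}
```

Unlike an isolate-then-migrate flow, the toy pool never exposes a state with one page fewer available for allocation.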

-- 
Mike Kravetz


Re: [RFC PATCH 2/3] hugetlb: convert page_huge_active() to HPageMigratable flag

2021-01-15 Thread Oscar Salvador
On Fri, Jan 15, 2021 at 09:43:36AM -0800, Mike Kravetz wrote:
> > Before the page_huge_active() in scan_movable_pages() we have the
> > if (!PageHuge(page)) check, but could it be that between that check and
> > the page_huge_active(), the page gets dissolved, and so we are checking
> > a wrong page[1]? Am I making sense? 
> 
> Yes, you are making sense.
> 
> The reason I decided to drop the check is because it does not eliminate the
> race.  Even with that check in page_huge_active, the page could be dissolved
> between that check and check of page[1].  There really is no way to eliminate
> the race without holding a reference to the page (or hugetlb_lock).  That
> check in page_huge_active just shortens the race window.

Yeah, you are right, the race already exists.
Anyway, do_migrate_range should take care of making sure what it is
handling, so I think we are good.
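
As a rough illustration of why the unpinned check is only a hint, here is a single-threaded toy model. All names (`struct racy_hpage`, `toy_scan_hint()`, `toy_dissolve()`, `toy_try_isolate()`) are hypothetical and this is not how the kernel code is written; the "concurrent" dissolve is simply called between the two steps.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a huge page; not the kernel's struct page. */
struct racy_hpage {
	bool is_huge;		/* cleared when the page is dissolved */
	bool migratable;
	int refcount;
};

/* Racy hint, in the spirit of a scan: nothing pins the page here. */
bool toy_scan_hint(const struct racy_hpage *p)
{
	return p->is_huge && p->migratable;
}

/* Models a dissolve; in this toy it only succeeds on unreferenced pages. */
void toy_dissolve(struct racy_hpage *p)
{
	if (p->refcount == 0) {
		p->is_huge = false;
		p->migratable = false;
	}
}

/*
 * The consumer takes a reference first and only then rechecks the state.
 * Once the reference is held, the toy dissolve can no longer change it.
 */
bool toy_try_isolate(struct racy_hpage *p)
{
	p->refcount++;
	if (!p->is_huge || !p->migratable) {
		p->refcount--;
		return false;
	}
	return true;
}
```

A positive answer from `toy_scan_hint()` can be stale by the time the page is used, so the decision that matters is the one made after the page is pinned.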

-- 
Oscar Salvador
SUSE L3


Re: [RFC PATCH 2/3] hugetlb: convert page_huge_active() to HPageMigratable flag

2021-01-15 Thread Oscar Salvador
On Fri, Jan 15, 2021 at 12:05:29PM -0800, Mike Kravetz wrote:
> I went back and took a closer look.  Migration is the reason the existing
> page_huge_active interfaces were introduced.  And, the only use of the
> page_huge_active check is to determine if a page can be migrated.  So,
> I think 'Migratable' may be the most suitable name.

Ok, I did not know that. Let us stick with 'Migratable' then.

> To address the concern about not all hugetlb sizes are migratable, we can
> just make a check before setting the flag.  This should even help in the
> migration/offline paths as we will know sooner if the page can be
> migrated or not.

This sounds like a good idea to me.

> We can address naming in the 'migrating free hugetlb pages' issue when
> that code is written.

Sure, it was just a suggestion: when I thought about it, something
like 'InUse' or 'Active' made more sense to me, but your point is valid.

Sorry for the confusion.

About that alloc_contig_range topic, I would like to take a look unless
someone is already on it or about to be.

Thanks Mike for the time ;-)


-- 
Oscar Salvador
SUSE L3


Re: [RFC PATCH 2/3] hugetlb: convert page_huge_active() to HPageMigratable flag

2021-01-15 Thread Mike Kravetz
On 1/15/21 9:43 AM, Mike Kravetz wrote:
> On 1/15/21 1:17 AM, Oscar Salvador wrote:
>> On Mon, Jan 11, 2021 at 01:01:51PM -0800, Mike Kravetz wrote:
>>> Use the new hugetlb page specific flag to replace the page_huge_active
>>> interfaces.  By its name, page_huge_active implied that a huge page
>>> was on the active list.  However, that is not really what code checking
>>> the flag wanted to know.  It really wanted to determine if the huge
>>> page could be migrated.  This happens when the page is actually added to
>>> the page cache and/or task page table.  This is the reasoning behind the
>>> name change.
>>>
>>> The VM_BUG_ON_PAGE() calls in the interfaces were not really necessary
>>> as in all cases but one we KNOW the page is a hugetlb page.  Therefore,
>>> they are removed.  In one call to HPageMigratable() it is possible for
>>> the page to not be a hugetlb page due to a race.  However, the code
>>> making the call (scan_movable_pages) is inherently racy, and page state
>>> will be validated later in the migration process.
>>>
>>> Note:  Since HPageMigratable is used outside hugetlb.c, it can not be
>>> static.  Therefore, a new set of hugetlb page flag macros is added for
>>> non-static flag functions.
>>
>> Two things about this one:
>>
>> I am not sure about the name of this one.
>> It is true that page_huge_active() was only called by memory-hotplug and all
>> it wanted to know was whether the page was in-use and so if it made sense
>> to migrate it, so I see some value in the new HPageMigratable flag.
>>
>> However, not all in-use hugetlb can be migrated, e.g. we might have
>> constraints when it comes to migrating certain sizes of hugetlb, right?
>> So setting HPageMigratable on all active hugetlb pages might be a bit
>> misleading?
>> HPageActive maybe? (Sorry, don't have a replacement)
> 
> Your concerns about the name change are correct.
> 
> The reason for the change came about from discussions about Muchun's series
> of fixes and the need for a new 'page is freed' status to fix a race.  In
> that discussion, Michal asked 'Why can't we simply set page_huge_active when
> the page is allocated and put on the active list?'.  That is mentioned above,
> but we really do not want to try and migrate pages after they are allocated
> and before they are in use.  That causes problems in the fault handling code.
> 
> Anyway, that is how the suggestion for Migration came about.
> 
> In that discussion David Hildenbrand noted that code in alloc_contig_range
> should migrate free hugetlb pages, but there is no support for that today.
> I plan to look at that if nobody else does.  When such code is added, the
> name 'Migratable' will become less applicable.
> 
> I'm not great at naming.  Perhaps 'In_Use' as a flag name might fit better.
> 

I went back and took a closer look.  Migration is the reason the existing
page_huge_active interfaces were introduced.  And, the only use of the
page_huge_active check is to determine if a page can be migrated.  So,
I think 'Migratable' may be the most suitable name.

To address the concern about not all hugetlb sizes are migratable, we can
just make a check before setting the flag.  This should even help in the
migration/offline paths as we will know sooner if the page can be
migrated or not.
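
A minimal sketch of the "check before setting the flag" idea, in user space. The types and helpers here (`struct toy_hstate`, `struct sized_hpage`, `toy_set_migratable()`, `toy_migratable()`) are hypothetical and only model the behaviour described above. The flag is simply never set for a page whose size class cannot be migrated.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit numbers within the per-page flag word; hypothetical. */
enum toy_hpage_flags { TOY_HPAGE_MIGRATABLE = 0 };

/* Hypothetical per-size state: can pages of this size be migrated? */
struct toy_hstate {
	bool migration_supported;
};

struct sized_hpage {
	unsigned long private;		/* flag bits live here */
	const struct toy_hstate *h;
};

/* Set the flag only when the page's size class supports migration. */
void toy_set_migratable(struct sized_hpage *p)
{
	if (p->h->migration_supported)
		p->private |= 1UL << TOY_HPAGE_MIGRATABLE;
}

bool toy_migratable(const struct sized_hpage *p)
{
	return (p->private & (1UL << TOY_HPAGE_MIGRATABLE)) != 0;
}
```

With this shape, a later scan that tests the flag gets "not migratable" immediately for unsupported sizes instead of finding out deeper in the migration path.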

We can address naming in the 'migrating free hugetlb pages' issue when
that code is written.
-- 
Mike Kravetz


Re: [RFC PATCH 2/3] hugetlb: convert page_huge_active() to HPageMigratable flag

2021-01-15 Thread Mike Kravetz
On 1/15/21 1:17 AM, Oscar Salvador wrote:
> On Mon, Jan 11, 2021 at 01:01:51PM -0800, Mike Kravetz wrote:
>> Use the new hugetlb page specific flag to replace the page_huge_active
>> interfaces.  By its name, page_huge_active implied that a huge page
>> was on the active list.  However, that is not really what code checking
>> the flag wanted to know.  It really wanted to determine if the huge
>> page could be migrated.  This happens when the page is actually added to
>> the page cache and/or task page table.  This is the reasoning behind the
>> name change.
>>
>> The VM_BUG_ON_PAGE() calls in the interfaces were not really necessary
>> as in all cases but one we KNOW the page is a hugetlb page.  Therefore,
>> they are removed.  In one call to HPageMigratable() it is possible for
>> the page to not be a hugetlb page due to a race.  However, the code
>> making the call (scan_movable_pages) is inherently racy, and page state
>> will be validated later in the migration process.
>>
>> Note:  Since HPageMigratable is used outside hugetlb.c, it can not be
>> static.  Therefore, a new set of hugetlb page flag macros is added for
>> non-static flag functions.
> 
> Two things about this one:
> 
> I am not sure about the name of this one.
> It is true that page_huge_active() was only called by memory-hotplug and all
> it wanted to know was whether the page was in-use and so if it made sense
> to migrate it, so I see some value in the new HPageMigratable flag.
>
> However, not all in-use hugetlb can be migrated, e.g. we might have
> constraints when it comes to migrating certain sizes of hugetlb, right?
> So setting HPageMigratable on all active hugetlb pages might be a bit
> misleading?
> HPageActive maybe? (Sorry, don't have a replacement)

Your concerns about the name change are correct.

The reason for the change came about from discussions about Muchun's series
of fixes and the need for a new 'page is freed' status to fix a race.  In
that discussion, Michal asked 'Why can't we simply set page_huge_active when
the page is allocated and put on the active list?'.  That is mentioned above,
but we really do not want to try and migrate pages after they are allocated
and before they are in use.  That causes problems in the fault handling code.

Anyway, that is how the suggestion for Migration came about.

In that discussion David Hildenbrand noted that code in alloc_contig_range
should migrate free hugetlb pages, but there is no support for that today.
I plan to look at that if nobody else does.  When such code is added, the
name 'Migratable' will become less applicable.

I'm not great at naming.  Perhaps 'In_Use' as a flag name might fit better.

> The other thing is that you are right that scan_movable_pages is racy, but
> page_huge_active() was checking if the page had the Head flag set before
> retrieving page[1].
> 
> Before the page_huge_active() in scan_movable_pages() we have the
> if (!PageHuge(page)) check, but could it be that between that check and
> the page_huge_active(), the page gets dissolved, and so we are checking
> a wrong page[1]? Am I making sense? 

Yes, you are making sense.

The reason I decided to drop the check is because it does not eliminate the
race.  Even with that check in page_huge_active, the page could be dissolved
between that check and check of page[1].  There really is no way to eliminate
the race without holding a reference to the page (or hugetlb_lock).  That
check in page_huge_active just shortens the race window.

-- 
Mike Kravetz


Re: [RFC PATCH 2/3] hugetlb: convert page_huge_active() to HPageMigratable flag

2021-01-15 Thread Oscar Salvador
On Mon, Jan 11, 2021 at 01:01:51PM -0800, Mike Kravetz wrote:
> Use the new hugetlb page specific flag to replace the page_huge_active
> interfaces.  By its name, page_huge_active implied that a huge page
> was on the active list.  However, that is not really what code checking
> the flag wanted to know.  It really wanted to determine if the huge
> page could be migrated.  This happens when the page is actually added to
> the page cache and/or task page table.  This is the reasoning behind the
> name change.
>
> The VM_BUG_ON_PAGE() calls in the interfaces were not really necessary
> as in all cases but one we KNOW the page is a hugetlb page.  Therefore,
> they are removed.  In one call to HPageMigratable() it is possible for
> the page to not be a hugetlb page due to a race.  However, the code
> making the call (scan_movable_pages) is inherently racy, and page state
> will be validated later in the migration process.
> 
> Note:  Since HPageMigratable is used outside hugetlb.c, it can not be
> static.  Therefore, a new set of hugetlb page flag macros is added for
> non-static flag functions.

Two things about this one:

I am not sure about the name of this one.
It is true that page_huge_active() was only called by memory-hotplug and all
it wanted to know was whether the page was in-use and so if it made sense
to migrate it, so I see some value in the new HPageMigratable flag.

However, not all in-use hugetlb can be migrated, e.g. we might have constraints
when it comes to migrating certain sizes of hugetlb, right?
So setting HPageMigratable on all active hugetlb pages might be a bit
misleading?
HPageActive maybe? (Sorry, don't have a replacement)

The other thing is that you are right that scan_movable_pages is racy, but
page_huge_active() was checking if the page had the Head flag set before
retrieving page[1].

Before the page_huge_active() in scan_movable_pages() we have the
if (!PageHuge(page)) check, but could it be that between that check and
the page_huge_active(), the page gets dissolved, and so we are checking
a wrong page[1]? Am I making sense? 


-- 
Oscar Salvador
SUSE L3


Re: [External] [RFC PATCH 2/3] hugetlb: convert page_huge_active() to HPageMigratable flag

2021-01-11 Thread Muchun Song
On Tue, Jan 12, 2021 at 5:02 AM Mike Kravetz wrote:
>
> Use the new hugetlb page specific flag to replace the page_huge_active
> interfaces.  By its name, page_huge_active implied that a huge page
> was on the active list.  However, that is not really what code checking
> the flag wanted to know.  It really wanted to determine if the huge
> page could be migrated.  This happens when the page is actually added to
> the page cache and/or task page table.  This is the reasoning behind the
> name change.
>
> The VM_BUG_ON_PAGE() calls in the interfaces were not really necessary
> as in all cases but one we KNOW the page is a hugetlb page.  Therefore,
> they are removed.  In one call to HPageMigratable() it is possible for
> the page to not be a hugetlb page due to a race.  However, the code
> making the call (scan_movable_pages) is inherently racy, and page state
> will be validated later in the migration process.
>
> Note:  Since HPageMigratable is used outside hugetlb.c, it can not be
> static.  Therefore, a new set of hugetlb page flag macros is added for
> non-static flag functions.
>
> Signed-off-by: Mike Kravetz 
> ---
>  include/linux/hugetlb.h    | 17 +++
>  include/linux/page-flags.h |  6 
>  mm/hugetlb.c               | 60 +-
>  mm/memory_hotplug.c        |  2 +-
>  4 files changed, 45 insertions(+), 40 deletions(-)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 4f0159f1b9cc..46e590552d55 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -190,6 +190,9 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
>
>  bool is_hugetlb_entry_migration(pte_t pte);
>
> +int HPageMigratable(struct page *page);
> +void SetHPageMigratable(struct page *page);
> +void ClearHPageMigratable(struct page *page);
>  #else /* !CONFIG_HUGETLB_PAGE */
>
>  static inline void reset_vma_resv_huge_pages(struct vm_area_struct *vma)
> @@ -370,6 +373,20 @@ static inline vm_fault_t hugetlb_fault(struct mm_struct *mm,
>     return 0;
>  }
>
> +static inline int HPageMigratable(struct page *page)
> +{
> +   return(0);
> +}
> +
> +static inline void SetHPageMigratable(struct page *page)
> +{
> +   return;
> +}
> +
> +static inline void ClearHPageMigratable(struct page *page)
> +{
> +   return;
> +}

How about introducing the HPAGEFLAG_NOOP macro to do
that?

#define TESTHPAGEFLAG_FALSE(flname) \
static inline int HPage##flname(struct page *page) { return 0; }

#define SETHPAGEFLAG_NOOP(flname) \
static inline void SetHPage##flname(struct page *page) {}

#define CLEARHPAGEFLAG_NOOP(flname) \
static inline void ClearHPage##flname(struct page *page) {}

#define HPAGEFLAG_NOOP(flname) \
TESTHPAGEFLAG_FALSE(flname) \
SETHPAGEFLAG_NOOP(flname) \
CLEARHPAGEFLAG_NOOP(flname)

HPAGEFLAG_NOOP(Migratable)
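
For reference, the suggested macros can be tried outside the kernel with a toy `struct page`. The struct and the `toy_stub_roundtrip()` helper below are assumptions made only for this sketch; the macros themselves are the ones proposed above.

```c
#include <assert.h>

/* Toy stand-in so the stubs compile in user space; not the kernel's type. */
struct page {
	unsigned long private;
};

#define TESTHPAGEFLAG_FALSE(flname) \
static inline int HPage##flname(struct page *page) { return 0; }

#define SETHPAGEFLAG_NOOP(flname) \
static inline void SetHPage##flname(struct page *page) {}

#define CLEARHPAGEFLAG_NOOP(flname) \
static inline void ClearHPage##flname(struct page *page) {}

#define HPAGEFLAG_NOOP(flname) \
TESTHPAGEFLAG_FALSE(flname) \
SETHPAGEFLAG_NOOP(flname) \
CLEARHPAGEFLAG_NOOP(flname)

/* Expands to HPageMigratable(), SetHPageMigratable(), ClearHPageMigratable(). */
HPAGEFLAG_NOOP(Migratable)

/* Hypothetical helper: exercise the three stubs and report the test result. */
int toy_stub_roundtrip(void)
{
	struct page pg = { .private = 0 };

	SetHPageMigratable(&pg);	/* empty body, leaves pg untouched */
	ClearHPageMigratable(&pg);
	assert(pg.private == 0);
	return HPageMigratable(&pg);	/* the stub always returns 0 */
}
```

One line per flag then replaces three hand-written stubs in the !CONFIG_HUGETLB_PAGE branch.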

>  #endif /* !CONFIG_HUGETLB_PAGE */
>  /*
>   * hugepages at page global directory. If arch support
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 4f6ba9379112..167250466c9c 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -593,15 +593,9 @@ static inline void ClearPageCompound(struct page *page)
>  #ifdef CONFIG_HUGETLB_PAGE
>  int PageHuge(struct page *page);
>  int PageHeadHuge(struct page *page);
> -bool page_huge_active(struct page *page);
>  #else
>  TESTPAGEFLAG_FALSE(Huge)
>  TESTPAGEFLAG_FALSE(HeadHuge)
> -
> -static inline bool page_huge_active(struct page *page)
> -{
> -   return 0;
> -}
>  #endif
>
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 3eb3b102c589..34ce82f4823c 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -57,6 +57,7 @@ static unsigned long hugetlb_cma_size __initdata;
>   */
>  enum htlb_page_flags {
>     HPAGE_RestoreReserve = 0,
> +   HPAGE_Migratable,
>  };
>
>  /*
> @@ -79,7 +80,25 @@ static inline void ClearHPage##flname(struct page *page)      \
>     SETHPAGEFLAG(flname)                        \
>     CLEARHPAGEFLAG(flname)
>
> +#define EXT_TESTHPAGEFLAG(flname)  \
> +int HPage##flname(struct page *page)   \
> +   { return test_bit(HPAGE_##flname, &(page->private)); }
> +
> +#define EXT_SETHPAGEFLAG(flname)   \
> +void SetHPage##flname(struct page *page)   \
> +   { set_bit(HPAGE_##flname, &(page->private)); }
> +
> +#define EXT_CLEARHPAGEFLAG(flname) \
> +void ClearHPage##flname(struct page *page) \
> +   { clear_bit(HPAGE_##flname, &(page->private)); }
> +
> +#define EXT_HPAGEFLAG(flname)  \
> +   EXT_TESTHPAGEFLAG(flname)   \
> +   EXT_SETHPAGEFLAG(flname)\
> +   EXT_CLEARHPAGEFLAG(flname)
> +
>  HPAGEFLAG(RestoreReserve)
> +EXT_HPAGEFLAG(Migratable)

How about moving HPAGEFLAG to 

[RFC PATCH 2/3] hugetlb: convert page_huge_active() to HPageMigratable flag

2021-01-11 Thread Mike Kravetz
Use the new hugetlb page specific flag to replace the page_huge_active
interfaces.  By its name, page_huge_active implied that a huge page
was on the active list.  However, that is not really what code checking
the flag wanted to know.  It really wanted to determine if the huge
page could be migrated.  This happens when the page is actually added to
the page cache and/or task page table.  This is the reasoning behind the
name change.

The VM_BUG_ON_PAGE() calls in the interfaces were not really necessary
as in all cases but one we KNOW the page is a hugetlb page.  Therefore,
they are removed.  In one call to HPageMigratable() it is possible for
the page to not be a hugetlb page due to a race.  However, the code
making the call (scan_movable_pages) is inherently racy, and page state
will be validated later in the migration process.

Note:  Since HPageMigratable is used outside hugetlb.c, it can not be
static.  Therefore, a new set of hugetlb page flag macros is added for
non-static flag functions.

Signed-off-by: Mike Kravetz 
---
 include/linux/hugetlb.h    | 17 +++
 include/linux/page-flags.h |  6 
 mm/hugetlb.c               | 60 +-
 mm/memory_hotplug.c        |  2 +-
 4 files changed, 45 insertions(+), 40 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 4f0159f1b9cc..46e590552d55 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -190,6 +190,9 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 
 bool is_hugetlb_entry_migration(pte_t pte);
 
+int HPageMigratable(struct page *page);
+void SetHPageMigratable(struct page *page);
+void ClearHPageMigratable(struct page *page);
 #else /* !CONFIG_HUGETLB_PAGE */
 
 static inline void reset_vma_resv_huge_pages(struct vm_area_struct *vma)
@@ -370,6 +373,20 @@ static inline vm_fault_t hugetlb_fault(struct mm_struct *mm,
    return 0;
 }
 
+static inline int HPageMigratable(struct page *page)
+{
+   return(0);
+}
+
+static inline void SetHPageMigratable(struct page *page)
+{
+   return;
+}
+
+static inline void ClearHPageMigratable(struct page *page)
+{
+   return;
+}
 #endif /* !CONFIG_HUGETLB_PAGE */
 /*
  * hugepages at page global directory. If arch support
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 4f6ba9379112..167250466c9c 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -593,15 +593,9 @@ static inline void ClearPageCompound(struct page *page)
 #ifdef CONFIG_HUGETLB_PAGE
 int PageHuge(struct page *page);
 int PageHeadHuge(struct page *page);
-bool page_huge_active(struct page *page);
 #else
 TESTPAGEFLAG_FALSE(Huge)
 TESTPAGEFLAG_FALSE(HeadHuge)
-
-static inline bool page_huge_active(struct page *page)
-{
-   return 0;
-}
 #endif
 
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3eb3b102c589..34ce82f4823c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -57,6 +57,7 @@ static unsigned long hugetlb_cma_size __initdata;
  */
 enum htlb_page_flags {
    HPAGE_RestoreReserve = 0,
+   HPAGE_Migratable,
 };
 
 /*
@@ -79,7 +80,25 @@ static inline void ClearHPage##flname(struct page *page)      \
    SETHPAGEFLAG(flname)                        \
    CLEARHPAGEFLAG(flname)
 
+#define EXT_TESTHPAGEFLAG(flname)  \
+int HPage##flname(struct page *page)   \
+   { return test_bit(HPAGE_##flname, &(page->private)); }
+
+#define EXT_SETHPAGEFLAG(flname)   \
+void SetHPage##flname(struct page *page)   \
+   { set_bit(HPAGE_##flname, &(page->private)); }
+
+#define EXT_CLEARHPAGEFLAG(flname) \
+void ClearHPage##flname(struct page *page) \
+   { clear_bit(HPAGE_##flname, &(page->private)); }
+
+#define EXT_HPAGEFLAG(flname)  \
+   EXT_TESTHPAGEFLAG(flname)   \
+   EXT_SETHPAGEFLAG(flname)\
+   EXT_CLEARHPAGEFLAG(flname)
+
 HPAGEFLAG(RestoreReserve)
+EXT_HPAGEFLAG(Migratable)
 
 /*
  * hugetlb page subpool pointer located in hpage[1].private
@@ -1379,31 +1398,6 @@ struct hstate *size_to_hstate(unsigned long size)
    return NULL;
 }
 
-/*
- * Test to determine whether the hugepage is "active/in-use" (i.e. being linked
- * to hstate->hugepage_activelist.)
- *
- * This function can be called for tail pages, but never returns true for them.
- */
-bool page_huge_active(struct page *page)
-{
-   VM_BUG_ON_PAGE(!PageHuge(page), page);
-   return PageHead(page) && PagePrivate(&page[1]);
-}
-
-/* never called for tail page */
-static void set_page_huge_active(struct page *page)
-{
-   VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
-   SetPagePrivate(&page[1]);
-}
-
-static void clear_page_huge_active(struct page *page)
-{
-   VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
-