Naoya Horiguchi wrote:
> On Fri, Apr 21, 2017 at 10:55:49AM -0500, Zi Yan wrote:
>>
>> Anshuman Khandual wrote:
>>> On 04/21/2017 02:17 AM, Zi Yan wrote:
>>>> From: Naoya Horiguchi <n-horigu...@ah.jp.nec.com>
>>>>
>>>> This patch enables thp migration for soft offline.
>>>>
>>>> Signed-off-by: Naoya Horiguchi <n-horigu...@ah.jp.nec.com>
>>>>
>>>> ChangeLog: v1 -> v5:
>>>> - fix page isolation counting error
>>>>
>>>> Signed-off-by: Zi Yan <zi....@cs.rutgers.edu>
>>>> ---
>>>>  mm/memory-failure.c | 35 ++++++++++++++---------------------
>>>>  1 file changed, 14 insertions(+), 21 deletions(-)
>>>>
>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>>> index 9b77476ef31f..23ff02eb3ed4 100644
>>>> --- a/mm/memory-failure.c
>>>> +++ b/mm/memory-failure.c
>>>> @@ -1481,7 +1481,17 @@ static struct page *new_page(struct page *p, 
>>>> unsigned long private, int **x)
>>>>    if (PageHuge(p))
>>>>            return alloc_huge_page_node(page_hstate(compound_head(p)),
>>>>                                               nid);
>>>> -  else
>>>> +  else if (thp_migration_supported() && PageTransHuge(p)) {
>>>> +          struct page *thp;
>>>> +
>>>> +          thp = alloc_pages_node(nid,
>>>> +                  (GFP_TRANSHUGE | __GFP_THISNODE) & ~__GFP_RECLAIM,
>>> Why not __GFP_RECLAIM ? Its soft offline path we wait a bit before
>>> declaring that THP page cannot be allocated and hence should invoke
>>> reclaim methods as well.
>> I am not sure how much effort the kernel wants to put here to soft
>> offline a THP. Naoya knows more here.
> 
> What I thought at first was that soft offline is not an urgent user
> and no need to reclaim (i.e. give a little impact on other thread.)
> But that's not a strong opinion, so if you like __GFP_RECLAIM here,
> I'm fine about that.

OK, I will add __GFP_RECLAIM.

> 
>>
>>>> +                  HPAGE_PMD_ORDER);
>>>> +          if (!thp)
>>>> +                  return NULL;
>>>> +          prep_transhuge_page(thp);
>>>> +          return thp;
>>>> +  } else
>>>>            return __alloc_pages_node(nid, GFP_HIGHUSER_MOVABLE, 0);
>>>>  }
>>>>  
>>>> @@ -1665,8 +1675,8 @@ static int __soft_offline_page(struct page *page, 
>>>> int flags)
>>>>             * cannot have PAGE_MAPPING_MOVABLE.
>>>>             */
>>>>            if (!__PageMovable(page))
>>>> -                  inc_node_page_state(page, NR_ISOLATED_ANON +
>>>> -                                          page_is_file_cache(page));
>>>> +                  mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON +
>>>> +                                          page_is_file_cache(page), 
>>>> hpage_nr_pages(page));
>>>>            list_add(&page->lru, &pagelist);
>>>>            ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
>>>>                                    MIGRATE_SYNC, MR_MEMORY_FAILURE);
>>>> @@ -1689,28 +1699,11 @@ static int __soft_offline_page(struct page *page, 
>>>> int flags)
>>>>  static int soft_offline_in_use_page(struct page *page, int flags)
>>>>  {
>>>>    int ret;
>>>> -  struct page *hpage = compound_head(page);
>>>> -
>>>> -  if (!PageHuge(page) && PageTransHuge(hpage)) {
>>>> -          lock_page(hpage);
>>>> -          if (!PageAnon(hpage) || unlikely(split_huge_page(hpage))) {
>>>> -                  unlock_page(hpage);
>>>> -                  if (!PageAnon(hpage))
>>>> -                          pr_info("soft offline: %#lx: non anonymous 
>>>> thp\n", page_to_pfn(page));
>>>> -                  else
>>>> -                          pr_info("soft offline: %#lx: thp split 
>>>> failed\n", page_to_pfn(page));
>>>> -                  put_hwpoison_page(hpage);
>>>> -                  return -EBUSY;
>>>> -          }
>>>> -          unlock_page(hpage);
>>>> -          get_hwpoison_page(page);
>>>> -          put_hwpoison_page(hpage);
>>>> -  }
>>>>  
>>>>    if (PageHuge(page))
>>>>            ret = soft_offline_huge_page(page, flags);
>>>>    else
>>>> -          ret = __soft_offline_page(page, flags);
>>>> +          ret = __soft_offline_page(compound_head(page), flags);
>>> Hmm, what if the THP allocation fails in the new_page() path and
>>> we fallback for general page allocation. In that case we will
>>> always be still calling with the head page ? Because we dont
>>> split the huge page any more.
>> This could be a problem if the user wants to offline a TailPage but due
>> to THP allocation failure, the HeadPage is offlined.
> 
> Right, "retry with split" part is unfinished, so we need some improvement.
> 
>> It may be better to only soft offline THPs if page ==
>> compound_head(page). If page != compound_head(page), we still split THPs
>> like before.
>>
>> Because in migrate_pages(), we cannot guarantee any TailPages in that
>> THP are migrated (1. THP allocation failure causes THP splitting, then
>> only HeadPage is going to be migrated; 2. even if we change existing
>> migrate_pages() implementation to add all TailPages to migration list
>> instead of LRU list, we still cannot guarantee the TailPage we want to
>> migrate is migrated.).
>>
>> Naoya, what do you think?
> 
> Maybe soft offline is a special caller of page migration because it
> basically wants to migrate only one page, but thp migration still has
> a benefit because we can avoid thp split.
> So I like that we try thp migration at first, and if it fails we fall
> back to split and migrate (only) a raw error page. This should be done
> in caller side for soft offline, because it knows where the error page is.

Make sense. So when migrate_pages() sees the migrate reason is
MR_MEMORY_FAILURE, it will not split THP when newpage allocation fails.
Then, the soft offline caller will split failed THP and retry migrating
the error subpage. I can do that.

> 
> As for generic case (for other migration callers which mainly want to
> migrate multiple pages for their purpose,) thp split and retry can be
> done in common migration code. After thp split, all subpages are linked
> to migration list, then we retry without returning to the caller.
> So I think that split_huge_page() can be moved to (for example) for-loop
> in migrate_pages().
> 
> I tried to write a patch for it last year, but considering vm event
> accounting, the patch might be large (~100 lines).

Yes. I saw your code on your github. I can pick it up and send it for
review after this patchset is merged, if you are OK with it.


-- 
Best Regards,
Yan Zi

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to