Re: Infinite looping observed in __offline_pages
On Wed 22-08-18 11:58:02, Mike Kravetz wrote: > On 08/22/2018 02:30 AM, Aneesh Kumar K.V wrote: [...] > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > > index 4eb6e824a80c..f9bdea685cf4 100644 > > --- a/mm/memory_hotplug.c > > +++ b/mm/memory_hotplug.c > > @@ -1338,7 +1338,8 @@ static unsigned long scan_movable_pages(unsigned long > > start, unsigned long end) > > return pfn; > > if (__PageMovable(page)) > > return pfn; > > - if (PageHuge(page)) { > > + if (IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION) && > > + PageHuge(page)) { > > How about using hugepage_migration_supported instead? It would automatically > catch those non-migratable huge page sizes. Something like: > > if (PageHuge(page) && > hugepage_migration_supported(page_hstate(page))) { Ohh, definitely, this is much better. -- Michal Hocko SUSE Labs
Re: Infinite looping observed in __offline_pages
On 08/23/2018 12:28 AM, Mike Kravetz wrote: On 08/22/2018 02:30 AM, Aneesh Kumar K.V wrote: commit 2e9d754ac211f2af3731f15df3cd8cd070b4cc54 Author: Aneesh Kumar K.V Date: Tue Aug 21 14:17:55 2018 +0530 mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported. When scanning for movable pages, filter out Hugetlb pages if hugepage migration is not supported. Without this we hit infinte loop in __offline pages where we do pfn = scan_movable_pages(start_pfn, end_pfn); if (pfn) { /* We have movable pages */ ret = do_migrate_range(pfn, end_pfn); goto repeat; } We do support hugetlb migration ony if the hugetlb pages are at pmd level. Here I thought migration at pgd level was added for POWER? commit 94310cbcaa3c (mm/madvise: enable (soft|hard) offline of HugeTLB pages at PGD level). Only remember, because I did not fully understand the use case. :) yes. We hit the issue on older distro kernels. we just check for Kernel config. The gigantic page size check is done in page_huge_active. Reported-by: Haren Myneni CC: Naoya Horiguchi Signed-off-by: Aneesh Kumar K.V diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 4eb6e824a80c..f9bdea685cf4 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1338,7 +1338,8 @@ static unsigned long scan_movable_pages(unsigned long start, unsigned long end) return pfn; if (__PageMovable(page)) return pfn; - if (PageHuge(page)) { + if (IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION) && + PageHuge(page)) { How about using hugepage_migration_supported instead? It would automatically catch those non-migratable huge page sizes. Something like: Will do that. if (PageHuge(page) && hugepage_migration_supported(page_hstate(page))) { -aneesh
Re: Infinite looping observed in __offline_pages
On 08/22/2018 02:30 AM, Aneesh Kumar K.V wrote: > commit 2e9d754ac211f2af3731f15df3cd8cd070b4cc54 > Author: Aneesh Kumar K.V > Date: Tue Aug 21 14:17:55 2018 +0530 > > mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not > supported. > > When scanning for movable pages, filter out Hugetlb pages if hugepage > migration > is not supported. Without this we hit infinte loop in __offline pages > where we > do > pfn = scan_movable_pages(start_pfn, end_pfn); > if (pfn) { /* We have movable pages */ > ret = do_migrate_range(pfn, end_pfn); > goto repeat; > } > > We do support hugetlb migration ony if the hugetlb pages are at pmd > level. Here I thought migration at pgd level was added for POWER? commit 94310cbcaa3c (mm/madvise: enable (soft|hard) offline of HugeTLB pages at PGD level). Only remember, because I did not fully understand the use case. :) > we just check for Kernel config. The gigantic page size check is done in > page_huge_active. > > Reported-by: Haren Myneni > CC: Naoya Horiguchi > Signed-off-by: Aneesh Kumar K.V > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index 4eb6e824a80c..f9bdea685cf4 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -1338,7 +1338,8 @@ static unsigned long scan_movable_pages(unsigned long > start, unsigned long end) > return pfn; > if (__PageMovable(page)) > return pfn; > - if (PageHuge(page)) { > + if (IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION) && > + PageHuge(page)) { How about using hugepage_migration_supported instead? It would automatically catch those non-migratable huge page sizes. Something like: if (PageHuge(page) && hugepage_migration_supported(page_hstate(page))) { -- Mike Kravetz > if (page_huge_active(page)) > return pfn; > else > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 15ea511fb41c..a3f81e18c882 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -7649,6 +7649,10 @@ bool has_unmovable_pages(struct zone *zone, struct > page *page, int count, >* handle each tail page individually in migration. >*/ > if (PageHuge(page)) { > + > + if (!IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION)) > + goto unmovable; > + > iter = round_up(iter + 1, 1< continue; > } >
Re: Infinite looping observed in __offline_pages
On Wed 22-08-18 15:00:18, Aneesh Kumar K.V wrote: > > Hi Michal, > > Michal Hocko writes: > > > On Wed 25-07-18 13:11:15, John Allen wrote: > > [...] > >> Does a failure in do_migrate_range indicate that the range is unmigratable > >> and the loop in __offline_pages should terminate and goto failed_removal? > >> Or > >> should we allow a certain number of retrys before we > >> give up on migrating the range? > > > > Unfortunatelly not. Migration code doesn't tell a difference between > > ephemeral and permanent failures. We are relying on > > start_isolate_page_range to tell us this. So the question is, what kind > > of page is not migratable and for what reason. > > > > Are you able to add some debugging to give us more information. The > > current debugging code in the hotplug/migration sucks... > > Haren did most of the debugging, so at minimum we need a patch like this > I guess. > > modified mm/page_alloc.c > @@ -7649,6 +7649,10 @@ bool has_unmovable_pages(struct zone *zone, struct > page *page, int count, >* handle each tail page individually in migration. >*/ > if (PageHuge(page)) { > + > + if (!IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION)) > + goto unmovable; > + > iter = round_up(iter + 1, 1< continue; > } > > > The problem is start_isolate_range, doesn't look at hugetlbpage and > return error if the architecture didn't support HUGEPAGE migration. Yes this makes sense. I didn't really have arches without huge page migration in mind. > Now discussing with Naoya, I was suggsting whether we should add a > similar check in > > modified mm/memory_hotplug.c > @@ -1338,7 +1338,8 @@ static unsigned long scan_movable_pages(unsigned long > start, unsigned long end) > return pfn; > if (__PageMovable(page)) > return pfn; > - if (PageHuge(page)) { > + if (IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION) && > + PageHuge(page)) { > if (page_huge_active(page)) > return pfn; > > One of the thinking there is it possible to get new hugetlb pages > allocated in that range after start_isolate_range ran. But i guess since > we marked all the free pages as MIGRATE_ISOLATE that is not possible? I do not follow. You are usually allocating new pages outside of the offlined range. But the above change makes sense because it doesn't really make sense to migrate pfn if it is backed by a non-migrateable huge page. dissolve_free_huge_pages should then try to remove it completely and the offlining fails if that is not possible. > But then it is good to have scan_movable_pages also check for > HUGEPAGE_MIGRATION? Good question. It is not necessary right now because has_unmovable_pages called earlier should take care of it. But I guess it will not hurt to have it there as well for the clarity. > > Complete patch below. > > commit 2e9d754ac211f2af3731f15df3cd8cd070b4cc54 > Author: Aneesh Kumar K.V > Date: Tue Aug 21 14:17:55 2018 +0530 > > mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not > supported. > > When scanning for movable pages, filter out Hugetlb pages if hugepage > migration > is not supported. Without this we hit infinte loop in __offline pages > where we > do > pfn = scan_movable_pages(start_pfn, end_pfn); > if (pfn) { /* We have movable pages */ > ret = do_migrate_range(pfn, end_pfn); > goto repeat; > } > > We do support hugetlb migration ony if the hugetlb pages are at pmd > level. Here > we just check for Kernel config. The gigantic page size check is done in > page_huge_active. That being said. The patch makes sense to me. > > Reported-by: Haren Myneni > CC: Naoya Horiguchi > Signed-off-by: Aneesh Kumar K.V I guess we should mark it for stable even though I am not sure how often do we see CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=n Acked-by: Michal Hocko > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index 4eb6e824a80c..f9bdea685cf4 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -1338,7 +1338,8 @@ static unsigned long scan_movable_pages(unsigned long > start, unsigned long end) > return pfn; > if (__PageMovable(page)) > return pfn; > - if (PageHuge(page)) { > + if (IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION) && > + PageHuge(page)) { > if (page_huge_active(page)) > return pfn; > else >
Re: Infinite looping observed in __offline_pages
Hi Michal, Michal Hocko writes: > On Wed 25-07-18 13:11:15, John Allen wrote: > [...] >> Does a failure in do_migrate_range indicate that the range is unmigratable >> and the loop in __offline_pages should terminate and goto failed_removal? Or >> should we allow a certain number of retrys before we >> give up on migrating the range? > > Unfortunatelly not. Migration code doesn't tell a difference between > ephemeral and permanent failures. We are relying on > start_isolate_page_range to tell us this. So the question is, what kind > of page is not migratable and for what reason. > > Are you able to add some debugging to give us more information. The > current debugging code in the hotplug/migration sucks... Haren did most of the debugging, so at minimum we need a patch like this I guess. modified mm/page_alloc.c @@ -7649,6 +7649,10 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count, * handle each tail page individually in migration. */ if (PageHuge(page)) { + + if (!IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION)) + goto unmovable; + iter = round_up(iter + 1, 1< Date: Tue Aug 21 14:17:55 2018 +0530 mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported. When scanning for movable pages, filter out Hugetlb pages if hugepage migration is not supported. Without this we hit infinte loop in __offline pages where we do pfn = scan_movable_pages(start_pfn, end_pfn); if (pfn) { /* We have movable pages */ ret = do_migrate_range(pfn, end_pfn); goto repeat; } We do support hugetlb migration ony if the hugetlb pages are at pmd level. Here we just check for Kernel config. The gigantic page size check is done in page_huge_active. Reported-by: Haren Myneni CC: Naoya Horiguchi Signed-off-by: Aneesh Kumar K.V diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 4eb6e824a80c..f9bdea685cf4 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1338,7 +1338,8 @@ static unsigned long scan_movable_pages(unsigned long start, unsigned long end) return pfn; if (__PageMovable(page)) return pfn; - if (PageHuge(page)) { + if (IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION) && + PageHuge(page)) { if (page_huge_active(page)) return pfn; else diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 15ea511fb41c..a3f81e18c882 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -7649,6 +7649,10 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count, * handle each tail page individually in migration. */ if (PageHuge(page)) { + + if (!IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION)) + goto unmovable; + iter = round_up(iter + 1, 1<
Re: Infinite looping observed in __offline_pages
On Wed 01-08-18 21:09:39, Michael Ellerman wrote: > Michal Hocko writes: > > On Wed 25-07-18 13:11:15, John Allen wrote: > > [...] > >> Does a failure in do_migrate_range indicate that the range is unmigratable > >> and the loop in __offline_pages should terminate and goto failed_removal? > >> Or > >> should we allow a certain number of retrys before we > >> give up on migrating the range? > > > > Unfortunatelly not. Migration code doesn't tell a difference between > > ephemeral and permanent failures. > > What's to stop an ephemeral failure happening repeatedly? If there is a short term pin on the page that prevents the migration then the holder of the pin should realease it and the next retry will succeed the migration. If the page gets freed on the way then it will not be reallocated because they are isolated already. I can only see complete OOM to be the reason to fail allocation of the target place as the migration failure and that is highly unlikely and sooner or later trigger the oom killer and release some memory. The biggest problem here is that we cannot tell ephemeral and long term pins... -- Michal Hocko SUSE Labs
Re: Infinite looping observed in __offline_pages
Michal Hocko writes: > On Wed 25-07-18 13:11:15, John Allen wrote: > [...] >> Does a failure in do_migrate_range indicate that the range is unmigratable >> and the loop in __offline_pages should terminate and goto failed_removal? Or >> should we allow a certain number of retrys before we >> give up on migrating the range? > > Unfortunatelly not. Migration code doesn't tell a difference between > ephemeral and permanent failures. What's to stop an ephemeral failure happening repeatedly? cheers
Re: Infinite looping observed in __offline_pages
On 26/07/18 04:11, John Allen wrote: > Hi All, > > Under heavy stress and constant memory hot add/remove, I have observed > the following loop to occasionally loop infinitely: > > mm/memory_hotplug.c:__offline_pages > > repeat: > /* start memory hot removal */ > ret = -EINTR; > if (signal_pending(current)) > goto failed_removal; > > cond_resched(); > lru_add_drain_all(); > drain_all_pages(zone); > > pfn = scan_movable_pages(start_pfn, end_pfn); > if (pfn) { /* We have movable pages */ > ret = do_migrate_range(pfn, end_pfn); > goto repeat; > } > What is CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE set to for you? I have also observed this when hot removing and adding memory. However I only have only seen this when my kernel has CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=n (when it is set to online automatically I do not have this issue) so I assumed that I wasn't onlining the memory properly... > What appears to be happening in this case is that do_migrate_range > returns a failure code which is being ignored. The failure is stemming > from migrate_pages returning "1" which I'm guessing is the result of > us hitting the following case: > > mm/migrate.c: migrate_pages > > default: > /* > * Permanent failure (-EBUSY, -ENOSYS, etc.): > * unlike -EAGAIN case, the failed page is > * removed from migration page list and not > * retried in the next outer loop. > */ > nr_failed++; > break; > } > > Does a failure in do_migrate_range indicate that the range is > unmigratable and the loop in __offline_pages should terminate and goto > failed_removal? Or should we allow a certain number of retrys before we > give up on migrating the range? > > This issue was observed on a ppc64le lpar on a 4.18-rc6 kernel. > > -John >
Re: Infinite looping observed in __offline_pages
On Fri 27-07-18 12:32:59, John Allen wrote: > On Wed, Jul 25, 2018 at 10:03:36PM +0200, Michal Hocko wrote: > > On Wed 25-07-18 13:11:15, John Allen wrote: > > [...] > > > Does a failure in do_migrate_range indicate that the range is unmigratable > > > and the loop in __offline_pages should terminate and goto failed_removal? > > > Or > > > should we allow a certain number of retrys before we > > > give up on migrating the range? > > > > Unfortunatelly not. Migration code doesn't tell a difference between > > ephemeral and permanent failures. We are relying on > > start_isolate_page_range to tell us this. So the question is, what kind > > of page is not migratable and for what reason. > > > > Are you able to add some debugging to give us more information. The > > current debugging code in the hotplug/migration sucks... > > After reproducing the problem a couple times, it seems that it can occur for > different types of pages. Running page-types on the offending page over two > separate instances produced the following: > > # tools/vm/page-types -a 307968-308224 > flags page-count MB symbolic-flags > long-symbolic-flags > 0x0400 10 > __Bbuddy >total 10 Huh! How come a buddy page has non zero reference count. > > And the following on a separate run: > > # tools/vm/page-types -a 313088-313344 > flags page-count MB symbolic-flags > long-symbolic-flags > 0x006c 10 > __RU_lA > referenced,uptodate,lru,active > total 10 Hmm, what is the expected page count in this case? Seeing 1 doesn't look particularly wrong. -- Michal Hocko SUSE Labs
Re: Infinite looping observed in __offline_pages
On Wed, Jul 25, 2018 at 10:03:36PM +0200, Michal Hocko wrote: On Wed 25-07-18 13:11:15, John Allen wrote: [...] Does a failure in do_migrate_range indicate that the range is unmigratable and the loop in __offline_pages should terminate and goto failed_removal? Or should we allow a certain number of retrys before we give up on migrating the range? Unfortunatelly not. Migration code doesn't tell a difference between ephemeral and permanent failures. We are relying on start_isolate_page_range to tell us this. So the question is, what kind of page is not migratable and for what reason. Are you able to add some debugging to give us more information. The current debugging code in the hotplug/migration sucks... After reproducing the problem a couple times, it seems that it can occur for different types of pages. Running page-types on the offending page over two separate instances produced the following: # tools/vm/page-types -a 307968-308224 flags page-count MB symbolic-flags long-symbolic-flags 0x0400 10 __Bbuddy total 10 And the following on a separate run: # tools/vm/page-types -a 313088-313344 flags page-count MB symbolic-flags long-symbolic-flags 0x006c 10 __RU_lA referenced,uptodate,lru,active total10 The source of the failure in migrate_pages actually doesn't seem to be that we're hitting the case of the permanent failure, but instead the -EAGAIN case. I traced the EAGAIN return back to migrate_page_move_mapping which I've seen return EAGAIN in two places: mm/migrate.c:453 if (!mapping) { /* Anonymous page without mapping */ if (page_count(page) != expected_count) return -EAGAIN; mm/migrate.c:476 if (page_count(page) != expected_count || radix_tree_deref_slot_protected(pslot, >i_pages.xa_lock) != page) { xa_unlock_irq(>i_pages); return -EAGAIN; } So it seems in each case, the actual reference count for the page is not what it is expected to be.
Re: Infinite looping observed in __offline_pages
On Wed 25-07-18 13:11:15, John Allen wrote: [...] > Does a failure in do_migrate_range indicate that the range is unmigratable > and the loop in __offline_pages should terminate and goto failed_removal? Or > should we allow a certain number of retrys before we > give up on migrating the range? Unfortunatelly not. Migration code doesn't tell a difference between ephemeral and permanent failures. We are relying on start_isolate_page_range to tell us this. So the question is, what kind of page is not migratable and for what reason. Are you able to add some debugging to give us more information. The current debugging code in the hotplug/migration sucks... -- Michal Hocko SUSE Labs