Everything should be prepared to stop setting pages PG_reserved when
initializing the memmap on memory hotplug. Most importantly, we
stop marking ZONE_DEVICE pages PG_reserved.
a) We made sure that any code that relied on PG_reserved to detect
ZONE_DEVICE memory will no longer rely on PG_reserved (especially,
by relying on pfn_to_online_page() for now). Details can be found
below.
b) We made sure that memory blocks with holes cannot be offlined and
therefore also not onlined. We have quite some code that relies on
memory holes being marked PG_reserved. This is now not an issue
anymore.
generic_online_page() still calls __free_pages_core(), which performs
__ClearPageReserved(p). AFAIKS, this should not hurt.
It is worth nothing that the users of online_page_callback_t might see a
change. E.g., until now, pages not freed to the buddy by the HyperV
balloonm were set PG_reserved until freed via generic_online_page(). Now,
they would look like ordinarily allocated pages (refcount == 1). This
callback is used by the XEN balloon and the HyperV balloon. To not
introduce any silent errors, keep marking the pages PG_reserved. We can
most probably stop doing that, but have to double check if there are
issues (e.g., offlining code aborts right away in has_unmovable_pages()
when it runs into a PageReserved(page))
Update the documentation at various places in the MM core.
There are three PageReserved() users that might be affected by this change.
- drivers/staging/gasket/gasket_page_table.c:gasket_release_page()
-> We might (unlikely) set SetPageDirty() on a ZONE_DEVICE page
-> I assume "we don't care"
- drivers/staging/kpc2000/kpc_dma/fileops.c:transfer_complete_cb()
-> We might (unlikely) set SetPageDirty() on a ZONE_DEVICE page
-> I assume "we don't care"
- mm/usercopy.c: check_page_span()
-> According to Dan, non-HMM ZONE_DEVICE usage excluded this code since
commit 52f476a323f9 ("libnvdimm/pmem: Bypass CONFIG_HARDENED_USERCOPY
overhead")
-> It is unclear whether we rally cared about ZONE_DEVICE here (HMM) or
simply about "PG_reserved". The worst thing that could happen is a
false negative with CONFIG_HARDENED_USERCOPY we should be able to
identify easily.
-> There is a discussion to rip out that code completely
-> I assume "not relevant" / "we don't care"
I audited the other PageReserved() users. They don't affect ZONE_DEVICE:
- mm/page_owner.c:pagetypeinfo_showmixedcount_print()
-> Never called for ZONE_DEVICE, (+ pfn_to_online_page(pfn))
- mm/page_owner.c:init_pages_in_zone()
-> Never called for ZONE_DEVICE (!populated_zone(zone))
- mm/page_ext.c:free_page_ext()
-> Only a BUG_ON(PageReserved(page)), not relevant
- mm/page_ext.c:has_unmovable_pages()
-> Not releveant for ZONE_DEVICE
- mm/page_ext.c:pfn_range_valid_contig()
-> pfn_to_online_page() already guards us
- mm/mempolicy.c:queue_pages_pte_range()
-> vm_normal_page() checks against pte_devmap()
- mm/memory-failure.c:hwpoison_user_mappings()
-> Not reached via memory_failure() due to pfn_to_online_page()
-> Also not reached indirectly via memory_failure_hugetlb()
- mm/hugetlb.c:gather_bootmem_prealloc()
-> Only a WARN_ON(PageReserved(page)), not relevant
- kernel/power/snapshot.c:saveable_highmem_page()
-> pfn_to_online_page() already guards us
- kernel/power/snapshot.c:saveable_page()
-> pfn_to_online_page() already guards us
- fs/proc/task_mmu.c:can_gather_numa_stats()
-> vm_normal_page() checks against pte_devmap()
- fs/proc/task_mmu.c:can_gather_numa_stats_pmd
-> vm_normal_page_pmd() checks against pte_devmap()
- fs/proc/page.c:stable_page_flags()
-> The reserved bit is simply copied, irrelevant
- drivers/firmware/memmap.c:release_firmware_map_entry()
-> really only a check to detect bootmem. Not relevant for ZONE_DEVICE
- arch/ia64/kernel/mca_drv.c
- arch/mips/mm/init.c
- arch/mips/mm/ioremap.c
- arch/nios2/mm/ioremap.c
- arch/parisc/mm/ioremap.c
- arch/sparc/mm/tlb.c
- arch/xtensa/mm/cache.c
-> No ZONE_DEVICE support
- arch/powerpc/mm/init_64.c:vmemmap_free()
-> Special-cases memmap on altmap
-> Only a check for bootmem
- arch/x86/kernel/alternative.c:__text_poke()
-> Only a WARN_ON(!PageReserved(pages[0])) to verify it is bootmem
- arch/x86/mm/init_64.c
-> Only a check for bootmem
Cc: "K. Y. Srinivasan"
Cc: Haiyang Zhang
Cc: Stephen Hemminger
Cc: Sasha Levin
Cc: Boris Ostrovsky
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: Andrew Morton
Cc: Alexander Duyck
Cc: Pavel Tatashin
Cc: Vlastimil Babka
Cc: Johannes Weiner
Cc: Anthony Yznaga
Cc: Michal Hocko
Cc: Oscar Salvador
Cc: Dan Williams
Cc: Mel Gorman
Cc: Mike Rapoport
Cc: Anshuman Khandual
Cc: Matt Sickler
Cc: Kees Cook
Suggested-by: Michal Hocko
Signed-off-by: David Hildenbrand
---
drivers/hv/hv_balloon.c| 6 ++
drivers/xen/balloon.c | 7 +++
include/linux/page-flags.h | 8 +---