This series adds support for live updating hugetlb-backed memfd,
including support for 1G huge pages. This allows live updating VMs which
use hugepages to back VM memory.

Please take a look at this patch series [0] to learn more about the Live
Update Orchestrator (LUO). It also includes patches for live updating a
shmem-backed memfd. This series is a follow-up to that, adding huge page
support as well.

You can also read this LWN article [1] to learn more about KHO and Live
Update Orchestrator, though do note that this article is a bit
out-of-date. LUO has since evolved. For example, subsystems have been
replaced with FLB, and the state machine has been simplified.

This series is based on top of mm-non-unstable, which includes the LUO
FLB patches [2].

This series uses LUO FLB to track how many pages are preserved for each
hstate, to ensure the live updated kernel does not over-allocate
hugepages.
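
To illustrate the accounting idea, here is a minimal userspace sketch. It
assumes the new kernel subtracts the preserved count from the pool size
requested for each hstate before allocating fresh pages. The struct and
function names are hypothetical and are not the ABI or the code from this
series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-hstate record, for illustration only. */
struct hstate_count {
	unsigned int order;		/* huge page order for this hstate */
	unsigned long requested;	/* pool size requested in the new kernel */
	unsigned long preserved;	/* pages carried over by live update */
};

/*
 * Number of pages that still need a fresh allocation so that
 * preserved + fresh does not exceed the requested pool size.
 */
static unsigned long pages_to_allocate(const struct hstate_count *h)
{
	assert(h != NULL);

	/* Preserved pages already cover (or exceed) the request. */
	if (h->preserved >= h->requested)
		return 0;

	return h->requested - h->preserved;
}
```

For example, with 512 pages requested and 128 preserved, only 384 would be
allocated fresh. If more pages are preserved than requested, nothing more
is allocated.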

Areas for Discussion
====================

Why is this an RFC?
-------------------

While I believe the code is in decent shape, I have only done basic
testing. It has not gone through more intensive testing, and it has not
been tested on ARM64. I am also not completely confident in the handling
of reservations and cgroup charging, even though it appears to work on
the surface.

The goal of this RFC is to start discussion on the high-level points so
we can at least agree on the general direction. It also gives people some
time to look at the code before the session discussing this at LPC 2025
[3].

Disabling scratch-only earlier in boot
--------------------------------------

Patch 2 moves KHO memory initialization to earlier in boot. Detailed
discussion on the topic is in patch 2's message.

Allocating gigantic hugepages after paging_init() on x86
--------------------------------------------------------

To allow KHO to work with gigantic hugepages on x86, patch 2 moves
gigantic huge page allocation after paging_init(). This can have some
impact on the ability to allocate gigantic pages, but I believe the
impact should not be severe. See patch 2 for more detailed discussion and
test results.

Early-boot access to LUO FLB data
---------------------------------

To work with gigantic page allocation, LUO FLB data is needed in early
boot, before LUO is fully initialized. Patch 4 adds support for fetching
LUO FLB data in early boot.

Preserving the entire huge page pool vs only used
-------------------------------------------------

This series makes a design decision to preserve only the huge pages that
are actually used, tracking their count for each hstate, instead of
preserving the entire huge page pool. Both approaches were brought up in
the Live Update meetings. Patch 6 discusses the reasoning in more detail.

[0] https://lore.kernel.org/linux-mm/[email protected]/T/#u
[1] https://lwn.net/Articles/1033364/
[2] https://lore.kernel.org/linux-mm/[email protected]/T/#u
[3] https://lpc.events/event/19/contributions/2044/

Pratyush Yadav (10):
  kho: drop restriction on maximum page order
  kho: disable scratch-only earlier in boot
  liveupdate: do early initialization before hugepages are allocated
  liveupdate: flb: allow getting FLB data in early boot
  mm: hugetlb: export some functions to hugetlb-internal header
  liveupdate: hugetlb subsystem FLB state preservation
  mm: hugetlb: don't allocate pages already in live update
  mm: hugetlb: disable CMA if liveupdate is enabled
  mm: hugetlb: allow freezing the inode
  liveupdate: allow preserving hugetlb-backed memfd

 Documentation/mm/memfd_preservation.rst |   9 +
 MAINTAINERS                             |   2 +
 arch/x86/kernel/setup.c                 |  19 +-
 fs/hugetlbfs/inode.c                    |  14 +-
 include/linux/hugetlb.h                 |   8 +
 include/linux/kho/abi/hugetlb.h         |  98 ++++
 include/linux/liveupdate.h              |  12 +
 kernel/liveupdate/Kconfig               |  15 +
 kernel/liveupdate/kexec_handover.c      |  13 +-
 kernel/liveupdate/luo_core.c            |  30 +-
 kernel/liveupdate/luo_flb.c             |  69 ++-
 kernel/liveupdate/luo_internal.h        |   2 +
 mm/Makefile                             |   1 +
 mm/hugetlb.c                            | 113 ++--
 mm/hugetlb_cma.c                        |   7 +
 mm/hugetlb_internal.h                   |  50 ++
 mm/hugetlb_luo.c                        | 699 ++++++++++++++++++++++++
 mm/memblock.c                           |   1 -
 mm/memfd_luo.c                          |   4 -
 mm/mm_init.c                            |  15 +-
 20 files changed, 1099 insertions(+), 82 deletions(-)
 create mode 100644 include/linux/kho/abi/hugetlb.h
 create mode 100644 mm/hugetlb_internal.h
 create mode 100644 mm/hugetlb_luo.c


base-commit: 55b7d75112c25b3e2a5eadc11244c330a5c00a41
-- 
2.43.0

