On 20.05.25 12:28, Chenyi Qiang wrote:
Commit 852f0048f3 ("RAMBlock: make guest_memfd require uncoordinated
discard") highlighted that subsystems like VFIO may disable RAM block
discard. However, guest_memfd relies on discard operations for page
conversion between private and shared memory, which can lead to stale
IOMMU mappings when hardware devices are assigned to confidential VMs
via shared memory. To address this and allow shared device assignment,
it is crucial to ensure that VFIO refreshes its IOMMU mappings.

RamDiscardManager is an existing interface (used by virtio-mem) to
adjust VFIO mappings in relation to VM page assignment. In effect, page
conversion is similar to hot-removing a page in one mode and adding it
back in the other, so similar actions are required for page conversion
events. Introduce RamDiscardManager to guest_memfd to facilitate this
process.
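
For illustration only (not part of this patch), here is a rough sketch of
how a consumer such as VFIO hooks into a RamDiscardManager and reacts to
conversion events. The conv_* names are made up; only the
RamDiscardListener calls are the existing API, and the header path is
assumed for current trees:

#include "system/memory.h"   /* assumed header location */

static int conv_notify_populate(RamDiscardListener *rdl,
                                MemoryRegionSection *section)
{
    /* Range became shared ("populated"): (re)establish the IOMMU mapping. */
    return 0;
}

static void conv_notify_discard(RamDiscardListener *rdl,
                                MemoryRegionSection *section)
{
    /* Range became private ("discarded"): tear down the IOMMU mapping. */
}

static void conv_register_listener(RamDiscardListener *rdl,
                                   MemoryRegionSection *section)
{
    RamDiscardManager *rdm =
        memory_region_get_ram_discard_manager(section->mr);

    ram_discard_listener_init(rdl, conv_notify_populate,
                              conv_notify_discard, true);
    ram_discard_manager_register_listener(rdm, rdl, section);
}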

Since guest_memfd is not an object, it cannot directly implement the
RamDiscardManager interface. Implementing it in HostMemoryBackend is
not appropriate because guest_memfd is per RAMBlock, and some RAMBlocks
have a memory backend while others do not. Notably, virtual BIOS
RAMBlocks using memory_region_init_ram_guest_memfd() do not have a
backend.

To manage RAMBlocks with guest_memfd, define a new object named
RamBlockAttribute to implement the RamDiscardManager interface. This
object stores guest_memfd information such as the bitmap for shared
memory and handles page conversion notifications. In the context of
RamDiscardManager, the shared state is analogous to populated and the
private state is treated as discarded. The memory state is tracked at
host page size granularity, as the minimum memory conversion size can be
one page per request. Additionally, VFIO expects the DMA mapping for a
specific iova to be mapped and unmapped with the same granularity.
Confidential VMs may perform partial conversions, such as converting a
small region within a larger one. To prevent such invalid cases, and
until cut_mapping operation support is available, all operations are
performed at 4K granularity.
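
A minimal sketch of how such a bitmap can back the "shared == populated"
mapping at 4K granularity (field and helper names here are illustrative,
not necessarily the ones used in this patch; test_bit() is from
qemu/bitops.h):

static bool ram_block_attribute_range_is_shared(const RamBlockAttribute *attr,
                                                uint64_t offset, uint64_t size)
{
    const uint64_t block_size = 4096; /* fixed 4K granularity, see above */
    const uint64_t first = offset / block_size;
    const uint64_t last = (offset + size - 1) / block_size;
    uint64_t bit;

    for (bit = first; bit <= last; bit++) {
        /*
         * A set bit means shared, reported as "populated" to
         * RamDiscardManager; a clear bit means private, i.e. "discarded".
         */
        if (!test_bit(bit, attr->shared_bitmap)) {
            return false;
        }
    }
    return true;
}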

Signed-off-by: Chenyi Qiang <chenyi.qi...@intel.com>
---
Changes in v5:
     - Revert to using the RamDiscardManager interface instead of
       introducing a new class hierarchy to manage private/shared state,
       and keep the new name RamBlockAttribute (compared with
       MemoryAttributeManager in v3).
     - Use the *simple* versions of object_define and object_declare,
       since state_change() is changed to an exported function instead
       of a virtual function in a later patch.
     - Move the introduction of RamBlockAttribute field to this patch and
       rename it to ram_shared. (Alexey)
     - Call exit() when register/unregister fails. (Zhao)
     - Add ram-block-attribute.c to the Memory API section in
       MAINTAINERS.

Changes in v4:
     - Change the name from memory-attribute-manager to
       ram-block-attribute.
     - Implement the newly-introduced PrivateSharedManager instead of
       RamDiscardManager and change related commit message.
     - Define the new object in ramblock.h instead of adding a new file.

Changes in v3:
     - Some renames (bitmap_size->shared_bitmap_size,
       first_one/zero_bit->first_bit, etc.)
     - Change shared_bitmap_size from uint32_t to unsigned
     - Return mgr->mr->ram_block->page_size in get_block_size()
     - Move set_ram_discard_manager() up to avoid a g_free() in failure
       case.
     - Add const for the memory_attribute_manager_get_block_size()
     - Unify ReplayRamPopulate and ReplayRamDiscard and the related
       callbacks.

Changes in v2:
     - Rename the object name to MemoryAttributeManager
     - Rename the bitmap to shared_bitmap to make it more clear.
     - Remove block_size field and get it from a helper. In future, we
       can get the page_size from RAMBlock if necessary.
     - Remove the unnecessary "struct" before GuestMemfdReplayData
     - Remove the unnecessary g_free() for the bitmap
     - Add error reporting when a callback fails for a
       populated/discarded section.
     - Move the realize()/unrealize() definition to this patch.
---
  MAINTAINERS                  |   1 +
  include/system/ramblock.h    |  20 +++
  system/meson.build           |   1 +
  system/ram-block-attribute.c | 311 +++++++++++++++++++++++++++++++++++
  4 files changed, 333 insertions(+)
  create mode 100644 system/ram-block-attribute.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 6dacd6d004..3b4947dc74 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3149,6 +3149,7 @@ F: system/memory.c
  F: system/memory_mapping.c
  F: system/physmem.c
  F: system/memory-internal.h
+F: system/ram-block-attribute.c
  F: scripts/coccinelle/memory-region-housekeeping.cocci
Memory devices
diff --git a/include/system/ramblock.h b/include/system/ramblock.h
index d8a116ba99..09255e8495 100644
--- a/include/system/ramblock.h
+++ b/include/system/ramblock.h
@@ -22,6 +22,10 @@
  #include "exec/cpu-common.h"
  #include "qemu/rcu.h"
  #include "exec/ramlist.h"
+#include "system/hostmem.h"
+
+#define TYPE_RAM_BLOCK_ATTRIBUTE "ram-block-attribute"
+OBJECT_DECLARE_SIMPLE_TYPE(RamBlockAttribute, RAM_BLOCK_ATTRIBUTE)
struct RAMBlock {
      struct rcu_head rcu;
@@ -42,6 +46,8 @@ struct RAMBlock {
      int fd;
      uint64_t fd_offset;
      int guest_memfd;
+    /* 1-setting of the bitmap in ram_shared represents ram is shared */

That comment looks misplaced, and the variable misnamed.

The comment should go into RamBlockAttribute and the variable should likely be named "attributes".

Also, "ram_shared" is not used at all in this patch, it should be moved into the corresponding patch.

+    RamBlockAttribute *ram_shared;
      size_t page_size;
      /* dirty bitmap used during migration */
      unsigned long *bmap;
@@ -91,4 +97,18 @@ struct RAMBlock {
      ram_addr_t postcopy_length;
  };
+struct RamBlockAttribute {

Should this actually be "RamBlockAttributes"?

+    Object parent;
+
+    MemoryRegion *mr;


Should we link to the parent RAMBlock instead, and look up the MR from there?
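
Roughly what I have in mind, as an untested sketch (this also shows where
the bitmap comment from above would live):

struct RamBlockAttributes {
    Object parent;

    /* The RAMBlock we track attributes for; the MR is reachable via rb->mr */
    RAMBlock *ram_block;

    /* 1-setting of a bit means the corresponding page is shared */
    unsigned bitmap_size;
    unsigned long *bitmap;
};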


--
Cheers,

David / dhildenb

