On 24.09.25 12:33, fanhuang wrote:
Hi David,
Hi!
CCing Igor and Jonathan.
I hope this email finds you well. It's been several months since Zhigang's last
discussion with you about the Special Purpose Memory (SPM) implementation in
QEMU, and I wanted to provide some background context before presenting the new
patch based on your valuable suggestions.
Previous Discussion Summary
===========================
Back in December 2024, we had an extensive discussion regarding my original
patch that added the `hmem` option to `memory-backend-file`. During that
conversation, you raised several important concerns about the design approach:
Original Approach (December 2024)
----------------------------------
- Zhigang's patch: Added `hmem=on` option to `memory-backend-file`
- QEMU cmdline example:
-object memory-backend-file,size=16G,id=m1,mem-path=/dev/dax0.0,prealloc=on,align=1G,hmem=on
-numa node,nodeid=1,memdev=m1
Your Concerns and Suggestions
-----------------------------
You correctly identified some issues with the original approach:
- Configuration Safety: Users could create problematic configurations like:
-object memory-backend-file,size=16G,id=unused,mem-path=whatever,hmem=on
- Your Recommendation: You proposed a cleaner approach using NUMA node
configuration:
-numa node,nodeid=1,memdev=m1,spm=on
Oh my, I don't remember all the details from that discussion :)
I assume that any memory devices (DIMM/NVDIMM/virtio-mem) we would
cold/hotplug to such a NUMA node would not be indicated as spm, correct?
Project Context
===============
To refresh your memory on the use case:
- Objective: Pass `EFI_MEMORY_SP` (Special Purpose Memory) type memory from
the host to the QEMU virtual machine
- Application: Memory reserved for specific PCI devices (e.g., VFIO-PCI devices)
- Guest Behavior: The SPM memory should be recognized by the guest OS and
claimed by the hmem-dax driver
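As a rough way to check the guest-side behavior described above: Linux labels
soft-reserved (EFI_MEMORY_SP) ranges as "Soft Reserved" in /proc/iomem, so a
small script can pick them out. This is only a sketch. The helper name
`soft_reserved_ranges` and the sample text are made up for illustration and
were not captured from a real guest. On a real guest, /proc/iomem should be
read as root, since unprivileged reads show zeroed addresses.

```python
import re

# Matches a /proc/iomem line such as "  280000000-67fffffff : Soft Reserved".
_IOMEM_LINE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*(.+?)\s*$")


def soft_reserved_ranges(iomem_text):
    """Return inclusive (start, end) pairs for lines labelled 'Soft Reserved'."""
    ranges = []
    for line in iomem_text.splitlines():
        m = _IOMEM_LINE.match(line)
        if m and m.group(3) == "Soft Reserved":
            ranges.append((int(m.group(1), 16), int(m.group(2), 16)))
    return ranges


# Hypothetical sample text (assumption), shaped like /proc/iomem output.
SAMPLE = """\
00001000-0009ffff : System RAM
100000000-27fffffff : System RAM
280000000-67fffffff : Soft Reserved
  280000000-67fffffff : dax0.0
"""

if __name__ == "__main__":
    for start, end in soft_reserved_ranges(SAMPLE):
        size_gib = (end - start + 1) >> 30
        print(f"{start:#x}-{end:#x} ({size_gib} GiB)")
```

On a guest, the same function could be fed `open("/proc/iomem").read()`
instead of the sample string.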
Complete QEMU Configuration Example:
-object memory-backend-ram,size=8G,id=m0
-object memory-backend-file,size=16G,id=m1,mem-path=/dev/dax0.0,prealloc=on,align=1G
-numa node,nodeid=0,memdev=m0
-numa node,nodeid=1,memdev=m1,spm=on # <-- New approach based on your suggestion
The only alternative I could think of is gluing it to a memory device. For
example, have something like:
-numa node,nodeid=0,memdev=m0 \
-numa node,nodeid=1 \
-device pc-dimm,id=sp0,memdev=m1,sp=true
But we would not want (and cannot easily) use DIMMs for that purpose.
New Patch Implementation
========================
Following your recommendations, I have completely redesigned the implementation:
Key Changes:
1. Removed `hmem` option from `memory-backend-file`
2. Added `spm` (special-purpose) option to NUMA node configuration
That definitely sounds better to me: essentially "spm" would say: the boot
memory assigned to this node (through memdev=) will be indicated as
EFI_MEMORY_SP.
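To make that semantic concrete, here is a minimal Python sketch (not the
actual patch, which would live in QEMU's C code) of how boot memory could be
tagged per node when building a guest memory map. All names in it (`NumaNode`,
`build_memory_map`) are hypothetical. It assumes a flat, contiguous layout,
whereas real machine types split RAM around the hole below 4G. The
soft-reserved value is the one Linux uses for E820_TYPE_SOFT_RESERVED; that
the patch reports the range this way is an assumption. Only boot memory is
modeled; hotplugged memory devices are outside the sketch.

```python
from dataclasses import dataclass

E820_RAM = 1
# Value of E820_TYPE_SOFT_RESERVED in Linux; using it here is an assumption
# about how EFI_MEMORY_SP memory would be reported to the guest.
E820_SOFT_RESERVED = 0xEFFFFFFF

GiB = 1 << 30


@dataclass
class NumaNode:
    nodeid: int
    boot_mem_size: int  # bytes of boot memory assigned through memdev=
    spm: bool = False   # mirrors the proposed "spm=on" node option


def build_memory_map(nodes, base=0):
    """Return (start, size, type) entries, tagging spm nodes as soft reserved.

    Simplification: memory is placed contiguously in nodeid order.
    """
    entries = []
    addr = base
    for node in sorted(nodes, key=lambda n: n.nodeid):
        if node.boot_mem_size == 0:
            continue  # memory-less node, nothing to report
        mem_type = E820_SOFT_RESERVED if node.spm else E820_RAM
        entries.append((addr, node.boot_mem_size, mem_type))
        addr += node.boot_mem_size
    return entries


if __name__ == "__main__":
    # Same shape as the example configuration: 8G regular, 16G spm.
    nodes = [NumaNode(0, 8 * GiB), NumaNode(1, 16 * GiB, spm=True)]
    for start, size, mem_type in build_memory_map(nodes):
        print(f"start={start:#x} size={size:#x} type={mem_type:#x}")
```

The point of keying the type off the node rather than the backend is that a
backend that is never attached to a node has no effect on the memory map.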
I would appreciate your review of the new patch implementation. The design now
follows your suggested approach of using NUMA node configuration rather than
memory backend options, which should resolve the safety and scope issues we
discussed.
Thank you for your time and valuable guidance on this implementation.
Please note that I'm located in UTC+8 timezone, so there might be some delay in
my responses to your emails due to the time difference. I appreciate your
patience and understanding.
No worries :)
--
Cheers
David / dhildenb