On Thu, 5 Feb 2026 17:48:47 +0000 Jonathan Cameron <[email protected]> wrote:

Hi Jonathan,

Thanks for the clarifications. Quick thought inline.

> > > I'm not clear if sysram could be used for virtio, or even needed. I'm
> > > still figuring out how virtio of simple memory devices is a gain.
> > >
> >
> > Jonathan mentioned that he thinks it would be possible to just bring it
> > online as a private node and inform the consumer of this. I think
> > that's probably reasonable.
>
> Firstly VM == Application. If we have, say, a DB that wants to do
> everything itself, it would use the same interface as a VM to get the
> whole memory on offer. (I'm still trying to get that Application
> Specific Memory term adopted ;)
>
> This would be better if we didn't assume anything to do with virtio -
> that's just one option (and right now for CXL mem probably not the
> sensible one, as it's missing too many things we get for free by just
> emulating CXL devices - e.g. all the stuff you are describing here for
> the host is just as valid in the guest). We have a path to get that
> emulation and should have the big missing piece posted shortly (DCD
> backed by 'things - this discussion' that turn up after VM boot).
>
> The real topic is memory for a VM, and we need a way to tie a memory
> backend in qemu to it, so that whatever the fabric manager provided for
> that VM is given to the VM and not used for anything else.
>
> If it's for a specific VM, then it's tagged, as otherwise how else do
> we know the intent? (Let's ignore random other out-of-band paths.)
>
> Layering wise we can surface as many backing sources as we like at
> runtime via 1+ emulated DCD devices (to give perf information etc).
> They each show up in the guest as a contiguous (maybe tagged) single
> extent, and then we apply whatever comes out of the rest of this
> discussion on top of that.
>
> So all we care about is how the host presents it.
>
> A bunch of things might work for this:
>
> 1. Just put it in a NUMA node that requires specific selection to
>    allocate from. This is nice because it just looks like normal memory
>    and we can apply any type of front end on top of it. Not good if we
>    have a lot of these coming and going.
>
> 2. Provide it as something with an fd we can mmap. I was fine with dax
>    for this, but if it's normal RAM just for a VM, anything that gives
>    me a handle I can mmap is fine. Just need a way to know which one
>    (so tag).

I think both of these approaches are OK, but looking at it from a
developer's perspective: if someone wants specific memory for their
workload, they would rather get an fd and play with it in whichever way
they want. NUMA may not give that much flexibility. As a developer I
would prefer 2, though you may say "oh, dax then?" Not sure!
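
To make the contrast concrete, here is a minimal sketch of what the
consumer side looks like in each case (the dax path and node id are
made up, and error handling is trimmed down):

/*
 * Option 2: get an fd (e.g. a dax device) and mmap it directly.
 * Option 1: allocate anonymous memory, then mbind() it to the node.
 */
#include <fcntl.h>
#include <numaif.h>             /* mbind(); link with -lnuma */
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define SZ (1UL << 30)          /* 1 GiB, for illustration */

static void *map_fd_backed(const char *path)
{
        int fd = open(path, O_RDWR);
        if (fd < 0)
                return NULL;
        /* dax wants size/offset aligned to the device alignment */
        void *p = mmap(NULL, SZ, PROT_READ | PROT_WRITE, MAP_SHARED,
                       fd, 0);
        close(fd);              /* mapping stays valid after close */
        return p == MAP_FAILED ? NULL : p;
}

static void *map_node_bound(int node)
{
        void *p = mmap(NULL, SZ, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
                return NULL;
        unsigned long mask = 1UL << node;
        if (mbind(p, SZ, MPOL_BIND, &mask, sizeof(mask) * 8, 0)) {
                munmap(p, SZ);
                return NULL;
        }
        return p;
}

int main(void)
{
        void *a = map_fd_backed("/dev/dax0.0");  /* hypothetical path */
        void *b = map_node_bound(2);             /* hypothetical node */
        printf("fd-backed %p, node-bound %p\n", a, b);
        return 0;
}

With option 2 the policy lives entirely with the application; with
option 1 the application has to know a node id and trust the kernel's
NUMA machinery to keep everyone else off that node.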

> It's pretty similar for shared cases. Just need a handle to mmap. In
> that case, the tag goes straight up to the guest OS (we've just
> unwound the extent ordering in the host and presented it as a
> contiguous single extent).
>
> The assumption here is that we always provide all the capacity that
> was tagged for the VM to the VM. Things may get more entertaining if
> we have a bunch of capacity that was tagged to provide extra space for
> a set of VMs (e.g. we overcommit on top of the DCD extents) - to me
> that's a job for another day.
>
> So I'm not really envisioning anything special for the VM case; it's
> just a dedicated allocation of memory for a user who knows how to get
> it. We will want a way to get perf info though, so we can provide that
> in the VM. Maybe we can figure that out from the CXL HW backing it
> without needing anything special in what is being discussed here.
>
> Jonathan
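
FWIW, on the QEMU side I'd expect the tie-in to look roughly like the
existing volatile-memdev wiring until the DCD emulation you mention
lands. A rough sketch (ids, paths and sizes below are made up, and
whether the backend ends up file/dax-backed or fd-backed depends on
where this discussion goes):

qemu-system-x86_64 -machine q35,cxl=on \
  -object memory-backend-file,id=vm1-mem,share=on,mem-path=/dev/dax0.0,size=4G,align=2M \
  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
  -device cxl-rp,port=0,bus=cxl.1,id=rp0,chassis=0,slot=2 \
  -device cxl-type3,bus=rp0,volatile-memdev=vm1-mem,id=cxl-vmem0 \
  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G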

