On 5/6/26 1:23 PM, Ackerley Tng wrote:
> Dave Jiang <[email protected]> writes:
> 
>> On 4/24/26 10:13 AM, Frank van der Linden wrote:
>>> Dave Jiang <[email protected]> wrote:
>>>> This RFC series is a proof of concept that connects device DAX to guest
>>>> memory by riding on top of guest memfd, in order to prove that device DAX
>>>> can be used as guest memory. The series seeks to jump-start a discussion
>>>> on whether there is interest in creating a DAX bridge to utilize CXL
>>>> memory for guest memory until the N_PRIVATE implementation by Gregory [1]
>>>> is available upstream and DAX users are ready to move to the new scheme.
>>>> Once there is an established consensus of interest, we can move the
>>>> discussion to the best way to implement the DAX bridge and the future of
>>>> device DAX as guest memory.
>>>>
>>>> I did the bare minimum to get the PoC to pass a modified version of the
>>>> KVM gmem selftest (guest_memfd_test) in order to prove that DAX can go in
>>>> the gmem path. A DAX char dev is created and the fd is passed in from user
>>>> space with vm_set_user_memory_region2(). The DAX region is passed in as a
>>>> whole when used, unlike memfd, where any size can be passed in to be
>>>> allocated.
>>>>
>>>> The folks on the cc line are people Dan Williams mentioned may be
>>>> interested in this.
>>>>
> 
> Thanks for the PoC! I've been working on guest_memfd HugeTLB and I'm
> glad there is interest in other "backends" for guest_memfd :)
> 
>>>> [1]: https://lore.kernel.org/linux-cxl/aeWV1CvP9ImZ3eEG@gourry-fedora-PF4VCD3F/T/#t
>>>
>>> One of the main ideas behind guest_memfd is that the memory is managed
>>> by the kernel only, so it knows what it has and that it can trust
>>> the memory. This RFC passes an fd in via the ioctl(), which I think
>>> breaks that model.
> 
> Yup! One of guest_memfd's core purposes is to be able to block host
> accesses to guest private (in the CoCo sense) memory.
> 
>>
>> Don't we issue the KVM_CREATE_GUEST_MEMFD ioctl to get an fd in userspace
>> to be passed to the KVM_SET_USER_MEMORY_REGION2 ioctl later? We are just
>> passing in a DAX fd instead of a guest memfd.
>>
> 
> This RFC is passing a DAX fd instead of a guest_memfd when creating a
> memslot, so it's not really using guest_memfd, it's just reusing the
> functions that were first created for guest_memfd to support another
> kind of fd.
> 

Right. It was the fastest way to see if something would work. It isn't meant
to be the final design.

> What's the use case you're shooting for? Why not mmap() from the DAX
> fd and then pass the userspace address to KVM when setting up a memslot?

The use case is mainly to see whether people currently using DAX via mmap()
would use this for other purposes as a bridge, versus something like the
private node implementation Gregory is working on, which has a totally
different way of doing things. So yes, what you suggested could be another
way to do it. Mainly I want to see if there's any interest at all. If so, we
can talk about how we want it done, and I'm wide open on that.
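
For reference, the mmap() route suggested above could be sketched roughly as
follows. This is only an illustration, not code from the RFC: the helper name
dax_mmap_memslot() and its parameters are hypothetical, and it assumes a
device DAX node whose size and alignment suit the requested mapping. It uses
the plain KVM_SET_USER_MEMORY_REGION ioctl, so it gives no guest_memfd-style
protection from host accesses.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper (not part of the RFC): map a device DAX node and
 * register the mapping as a conventional userspace-address KVM memslot.
 * Returns 0 on success, -1 on failure.
 */
int dax_mmap_memslot(int vm_fd, const char *dax_path, uint32_t slot,
                     uint64_t gpa, size_t size)
{
    assert(dax_path != NULL);

    int dax_fd = open(dax_path, O_RDWR);
    if (dax_fd < 0)
        return -1;

    /* Device DAX accepts only shared mappings. */
    void *hva = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                     dax_fd, 0);
    close(dax_fd); /* the mapping keeps the device file referenced */
    if (hva == MAP_FAILED)
        return -1;

    struct kvm_userspace_memory_region region = {
        .slot = slot,
        .flags = 0,
        .guest_phys_addr = gpa,
        .memory_size = size,
        .userspace_addr = (uint64_t)(uintptr_t)hva,
    };

    /* Hand the host virtual address to KVM as the memslot backing. */
    if (ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region) < 0) {
        munmap(hva, size);
        return -1;
    }
    return 0;
}
```

A caller would pass the VM fd obtained from KVM_CREATE_VM and a path such as
/dev/dax0.0 (an example path, not one taken from the series).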

> 
> Is there a requirement to have the DAX memory usable by CoCo guests as
> well, and hence requiring guest_memfd-style protection from host
> accesses for private DAX memory?

I think so, at some point, if we are to implement this.

DJ
> 
>>>
>>> Since there is interest for several different allocation backends
>>> (default, hugetlb, zone_device), it might be better to use a model
>>> where guest_memfd has the option for backend allocators to register
>>> themselves in the kernel. The ioctl can then select one by their
>>> id/name (could be just a string). They can be configured using
>>> e.g. sysfs (like hugetlb already is).
>>>
>>> This would also allow easy experimentation with new allocators,
>>> having an allocator with BPF control, etc.
>>
>> Agreed. Although my main intent is to see if there's interest in providing
>> something for the usages already on the DAX path, to ease the transition
>> until something like what's proposed above shows up. But if what I proposed
>> would be a security issue, then maybe not.
>>
>>>
>>> - Frank
> 
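
The backend-registration model Frank describes above (allocators registering
themselves and being selected by name) could look roughly like the following.
This is a user-space model for illustration only, not kernel code: every
identifier (struct gmem_backend, gmem_backend_register(),
gmem_backend_find(), GMEM_MAX_BACKENDS) is hypothetical and does not exist in
KVM or guest_memfd today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GMEM_MAX_BACKENDS 8 /* arbitrary limit for this sketch */

/* Hypothetical descriptor for one allocation backend. */
struct gmem_backend {
    const char *name;            /* e.g. "default", "hugetlb", "zone_device" */
    void *(*alloc)(size_t size); /* backend-specific allocation hook */
};

static const struct gmem_backend *backends[GMEM_MAX_BACKENDS];
static size_t nr_backends;

/* Look a backend up by the name a caller would pass in via the ioctl. */
const struct gmem_backend *gmem_backend_find(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < nr_backends; i++)
        if (strcmp(backends[i]->name, name) == 0)
            return backends[i];
    return NULL;
}

/* Register a backend; fails on a duplicate name or a full table. */
int gmem_backend_register(const struct gmem_backend *b)
{
    if (!b || !b->name || nr_backends == GMEM_MAX_BACKENDS)
        return -1;
    if (gmem_backend_find(b->name))
        return -1;
    backends[nr_backends++] = b;
    return 0;
}
```

Selecting by string keeps the ioctl interface stable as backends are added;
a real implementation would also need locking and module reference counting,
which this sketch omits.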

