Hi,
Ackerley Tng <[email protected]> writes: > [email protected] writes: > >> Hi, >> >> Ackerley Tng <[email protected]> writes: >> >>> Sean Christopherson <[email protected]> writes: >>> >>>>> >>>>> [...snip...] >>>>> >>>>> we have one open Question left: >>>>> 1. How to check guest_memfd is fully shared. >>>>> >>>>> [...snip...] >>>>> >>>> >>>> Given that lack of support isn't going to be limited to _just_ guest_memfd, >>>> simply disallow preservation if the VM supports private memory: >>>> >>>> if (kvm_arch_has_private_mem(kvm)) >>>> return -EOPNOTSUPP; >>> >> This is a good idea, I have implemented it in V2 (sent it recently). >> > > Will look at RFC v2 in a bit :) Thanks! > Thankyou. >> Below, I have mentioned a detailed analysis. Please let me know your >> thoughts. >> >>> Makes sense. Tarun this was the other option that I was suggesting when >>> we discussed offline. >>> >>> I think (?) it is possible to create a fully-private guest_memfd for a >>> non-Confidential VM, and even after conversion lands, for both >>> vm_memory_attributes=true and vm_memory_attributes=false. >>> >>> In that case, your preservation series can still preserve memory tracked >>> as private by guest_memfd but not used as private, right? >>> >>> I don't think anyone will use this combination before guest_memfd >>> write() support lands, we just need to make sure there's no kernel crash >>> or corruption in this case. >> >> IIUC, currently, private memory definition is where cocovm with HW based >> private memory is supported. Which is directly checked by >> kvm_arch_has_private_mem and this return false incase ARCH does not >> support it in HW (SEV/SNP, TDX). >> >> So about the combination where a guest_memfd is tracked as private but >> not actually private. (Created without the INIT_SHARED flags). Even >> though kvm_arch_has_private_mem is false. >> >> In this case the luo will preserve the guest_memfd. But it will not >> preserve any attributes, because >> 1. during creation, we dont populate any attributes, so by default >> guest_memfd memory always considered to be shared. (Even though no >> INIT_SHARED is passed). Conversion to private IOCTL will fail as >> kvm_arch_has_private_mem check will fail for non-COCOVM. >> for COCOVM, >> presevation will not be supported and conversion to private memory will >> be succeded. (No corruption and kernel crash in this case) >> >> >> ==== WHAT IF CONVERSION SERIES LANDS ========= >> > > I think there are quite many cases to consider if we consider both > before and after conversion series lands. For preservation, shall we > assume it will go after conversion? :) This way we can avoid discussing > some cases. > No, I dont prefer to make any assumption and try to make both the series independent of each other. Landing any of the series first should not break another. I am keeping this goal. >> In this case, we have two scenerio >> 1. kvm->attributes >> In this case, Every logic remains same as above. > > This is the vm_memory_attributes=true case. > > Since kvm_supported_mem_attributes(kvm) returns 0 when > kvm_arch_has_private_mem(kvm) returns false, the vm's mem_attr_array > will always say shared for any vm where kvm_arch_has_private_mem() is > false. > > So yup, preservation doesn't preserve attributes but that's also > effectively preserving always-shared attributes, so all is good. > >> 2. gmem->attributes >> if INIT_SHARED: memory is initially shared and no entry in maple tree, >> and if >> kvm_arch_has_private_mem returns false (non-COCOVM), this will just >> fails any conversion request. hence preserving the guest_memfd wont >> have any problem (fully shared case). >> > > Yup, preservation doesn't preserve attributes but that's also > effectively preserving always-shared attributes, so all is good. > >> if INIT_SHARED not set: memory is initially private, and there is an >> entry in maple tree, marked the memory as private. During conversion: >> A. To private: it fails as kvm_arch_has_private_mem return false. >> Preservation is safe and there will be no problem. as in >> preservation >> we dont preserve any attributes, but on retrieval, when a >> new gmem >> file is created, identical entry to attributes will also >> be assigned >> as INIT_SHARED flag is not set. So we wont have any issues >> in this case. >> B. TO shared: it passes, and update the maple tree. preservation will >> not be preserving the attributes with current patches, and it will >> also allow the preservation as kvm_arch_has_private_mem is false. So >> in this case, on retrieval, we will lose the data regarding the >> private vs shared (non-cocovm). So if host want to access the shared >> memory, it will try MMAP and which will fail in new kernel (which was >> successful in old kernel as conversion happened). But I dont see any >> kernel crash or corruption. > > Perhaps it should be this way: the process of preserving should first > freeze further conversions, then check for private/shared status, and if > private, return error to userspace. > > I haven't looked at RFC v2 yet, but IIUC this is kind of in line with > the other issue where we freeze allocations: > > + When preserving, freeze further allocations, preserve all pages that > are already allocated. If there are pages that weren't allocated, too > bad - allocations are frozen until unpreserve. > + When preserving, freeze further conversions, preserve all shared > pages. If there are private pages, fail preservation. > For this series, I am strictly focusing on fully shared guest_memfd. And any guest_memfd that supports or will support conversion to private will not be allowed for preservation with this series. Freezing the conversion, is requirement for guest_memfd that will support private mapping. At this point I dont want to include this in this base support. We will look on this when we will work on preservation of private mapping. > This way, there's no loss of any private/shared state, since only shared > pages are preserved. > >> This is an issue with only non-cocoVM support with conversion series >> having the gmem attribute enabled without INIT_SHARED flag. I am not >> sure, if there will be any user for this very soon (is it SW_PROTECTED_VM >> ?? ). >> > > So if a non-CoCo VM does use an all-private guest_memfd (didn't specify > INIT_SHARED) (don't think anyone will use this before guest_memfd > write() support), it'll just always fail preservation unless before > preserving everything was converted to shared. > yes, Also We will never know if everything is converted to shared unless we walk the maple tree, But for this base support, we dont want to depend on conversion series, So depending on INIT_SHARED flag along with kvm_arch_has_private_mem check makes this series immune to conversion series, INIT_SHARED makes the initial mapping shared, and kvm_arch_has_private_mem avoid any further conversion which is also true before and after conversion series. WDYT? V2 does not have INIT_SHARED check, it has only kvm_arch_has_private_mem. I can comment there to be updated in next version, if we are aligned on this. Thanks.

