Re: [PATCH v1 2/3] x86/coco: Disable TDX module calls when TD partitioning is active
On Wed, Nov 22, 2023 at 06:01:05PM +0100, Jeremi Piotrowski wrote: > Introduce CC_ATTR_TDX_MODULE_CALLS to allow code to check whether TDX module > calls are available. When TD partitioning is enabled, a L1 TD VMM handles most > TDX facilities and the kernel running as an L2 TD VM does not have access to > TDX module calls. The kernel still has access to TDVMCALL(0) which is > forwarded > to the VMM for processing, which is the L1 TD VM in this case. Sounds like a problem introduced by patch 1/3 :/ -- Kiryl Shutsemau / Kirill A. Shutemov
Re: [PATCH v1 3/3] x86/tdx: Provide stub tdx_accept_memory() for non-TDX configs
On Wed, Nov 22, 2023 at 06:01:06PM +0100, Jeremi Piotrowski wrote: > When CONFIG_INTEL_TDX_GUEST is not defined but CONFIG_UNACCEPTED_MEMORY=y is, > the kernel fails to link with an undefined reference to tdx_accept_memory from > arch_accept_memory. Provide a stub for tdx_accept_memory to fix the build for > that configuration. > > CONFIG_UNACCEPTED_MEMORY is also selected by CONFIG_AMD_MEM_ENCRYPT, and there > are stubs for snp_accept_memory for when it is not defined. Previously this > did > not result in an error when CONFIG_INTEL_TDX_GUEST was not defined because the > branch that references tdx_accept_memory() was being discarded due to > DISABLE_TDX_GUEST being set. And who unsets it now? -- Kiryl Shutsemau / Kirill A. Shutemov
Re: [PATCH v1 1/3] x86/tdx: Check for TDX partitioning during early TDX init
On Wed, Nov 22, 2023 at 06:01:04PM +0100, Jeremi Piotrowski wrote: > Check for additional CPUID bits to identify TDX guests running with Trust > Domain (TD) partitioning enabled. TD partitioning is like nested > virtualization > inside the Trust Domain so there is a L1 TD VM(M) and there can be L2 TD > VM(s). > > In this arrangement we are not guaranteed that the TDX_CPUID_LEAF_ID is > visible > to Linux running as an L2 TD VM. This is because a majority of TDX facilities > are controlled by the L1 VMM and the L2 TDX guest needs to use TD partitioning > aware mechanisms for what's left. So currently such guests do not have > X86_FEATURE_TDX_GUEST set. > > We want the kernel to have X86_FEATURE_TDX_GUEST set for all TDX guests so we > need to check these additional CPUID bits, but we skip further initialization > in the function as we aren't guaranteed access to TDX module calls. I don't follow. The idea of partitioning is that L2 OS can be unenlightened and have no idea if it runs indide of TD. But this patch tries to enumerate TDX anyway. Why? -- Kiryl Shutsemau / Kirill A. Shutemov
Re: [PATCH v2 06/17] iommu: Add iommu_fwspec_alloc/dealloc()
On 2023/11/22 1:00, Jason Gunthorpe wrote: > On Tue, Nov 21, 2023 at 03:47:48PM +0900, Hector Martin wrote: >>> Which is sensitive only to !NULL fwspec, and if EPROBE_DEFER is >>> returned fwspec will be freed and dev->iommu->fwspec will be NULL >>> here. >>> >>> In the NULL case it does a 'bus probe' with a NULL fwspec and all the >>> fwspec drivers return immediately from their probe functions. >>> >>> Did I miss something? >> >> apple_dart is not a fwspec driver and doesn't do that :-) > > It implements of_xlate that makes it a driver using the fwspec probe > path. > > The issue is in apple-dart. Its logic for avoiding bus probe vs > fwspec probe is not correct. > > It does: > > static int apple_dart_of_xlate(struct device *dev, struct of_phandle_args > *args) > { > [..] > dev_iommu_priv_set(dev, cfg); > > > Then: > > static struct iommu_device *apple_dart_probe_device(struct device *dev) > { > struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev); > struct apple_dart_stream_map *stream_map; > int i; > > if (!cfg) > return ERR_PTR(-ENODEV); > > Which leaks the cfg memory on rare error cases and wrongly allows the > driver to probe without a fwspec, which I think is what you are > hitting. > > It should be > >if (!dev_iommu_fwspec_get(dev) || !cfg) > return ERR_PTR(-ENODEV); > > To ensure the driver never probes on the bus path. > > Clearing the dev->iommu in the core code has the side effect of > clearing (and leaking) the cfg which would hide this issue. Aha! Yes, that makes it work with only the first change. I'll throw the apple-dart fix into our tree (and submit it once I get to clearing out the branch; the affected consumer driver isn't upstream yet so this isn't particularly urgent). - Hector