Lisa Wang <[email protected]> writes:

> This patch series focuses on setting up a TDX VM and adding all code
> necessary to run a basic lifecycle test.
>
> Unlike standard KVM selftests can set up the VM through guest registers,
> TDX module protects TDs' register state from the host. This feature of
> TDX causes problems on VM boot state initialization and the ucall
> implementation.
>
> In standard KVM selftests, the host directly initializes the guest state
> by manipulating Special Registers (SREGs) and General Purpose Registers
> (GPRs) via IOCTLs (KVM_SET_SREGS, etc.) before the first KVM_RUN.
>
> To bypass direct register initialization by the host, we utilize the
> standard x86 reset vector as the default entry point.
>
> The mechanism works as follows:
> 1. The host places register values into a specific memory region and
> inserts boot code at the VM's default starting point.
> 2. When the VM starts, it executes this boot code to "pull" values from
> memory and manually set up its own SREGs and GPRs.
> 3. Once the environment is ready, the boot code jumps to the guest code.
>
> The standard x86 ucall() implementation uses PIO, but it does not
> actually transmit data through the 4-byte PIO data. Instead, it relies
> on the host reading the ucall address directly from the guest's RDI
> register.
>
> TDX selftests cannot utilize the standard x86 ucall implementation,
> because the host is unable to access the guest's RDI register. Based on
> this restriction, we considered these potential solutions for the TDX
> ucall implementation.
>
> 1. TDCALL PIO with RCX-bits Passthrough
> We first considered passing the RDI value through RCX bits to bypass the
> hardware's register protection, which could be the closest approach to
> the non-TDX implementation as per Sean's suggestion[1]. However, this
> approach is blocked by the software-side implementation: KVM_GET_REGS
> currently does not support TDX VMs and returns -EINVAL. To make this
> work, the KVM ioctl would need a test-only hack.
>
> 2. TDCALL PIO with buffer indexing
> To keep a PIO-based approach and unify the get_ucall implementation for
> both TDX and non-TDX VMs, we considered TDCALL PIO with buffer indexing.
> Since the ucall buffer is initialized prior to execution, the VM could
> just pass a buffer index rather than an 8-byte ucall address to fit
> within the 4-byte PIO data limit. The host, already knowing the ucall
> buffer's base address, could then resolve the ucall content via this
> index. We abandoned this solution because it would require changes to
> the common ucall structure and impact other non-x86 architectures.
>
> 3. TDCALL MMIO (Selected solution)
> We ultimately selected TDCALL with an 8-byte MMIO data. This method only
> requires initializing an MMIO GPA and adding TDCALL MMIO implementation
> for TDX under the original x86 ucall path. While this diverges from the
> non-TDX PIO, it provides the cleanest implementation with minimal
> disruption to the overall ucall architecture.
>
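
For anyone skimming the thread, the difference between options 2 and 3
comes down to what fits in the data payload. The sketch below is only an
illustration of that point, not code from the series: struct demo_ucall,
the pool, and all four helpers are hypothetical names, and guest and host
share one address space here, which is not the case in a real selftest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_POOL_SIZE 4	/* hypothetical pool size */

/* Hypothetical stand-in for the selftests' ucall structure. */
struct demo_ucall {
	uint64_t cmd;
};

/* Pre-initialized buffer that both "guest" and "host" know about. */
static struct demo_ucall demo_pool[DEMO_POOL_SIZE];

/* Option 2, guest side: shrink the address to a 4-byte pool index. */
static uint32_t guest_encode_index(const struct demo_ucall *uc)
{
	ptrdiff_t idx = uc - demo_pool;

	assert(idx >= 0 && idx < DEMO_POOL_SIZE);
	return (uint32_t)idx;
}

/* Option 2, host side: resolve the index against the known base. */
static struct demo_ucall *host_resolve_index(uint32_t idx)
{
	return idx < DEMO_POOL_SIZE ? &demo_pool[idx] : NULL;
}

/* Option 3, guest side: 8 bytes of MMIO data hold the whole address. */
static uint64_t guest_encode_addr(const struct demo_ucall *uc)
{
	return (uint64_t)(uintptr_t)uc;
}

/* Option 3, host side: the data is the address, no lookup table needed. */
static struct demo_ucall *host_resolve_addr(uint64_t data)
{
	return (struct demo_ucall *)(uintptr_t)data;
}
```

With the index variant the host has to know the pool layout, which is
where the changes to the common ucall structure would come from. With
8-byte data the address travels as-is.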
Sean, Lisa evaluated your suggestion [1] (summarized as option 1 above),
but we think TDCALL MMIO is better. What do you think?

+ Jump directly to where the MMIO is used: [2]
+ Here's [3] how tdx_mmio_write() is implemented, without throwing
  everything into a structure. It's also not macroed/prototyped like you
  suggested in [4], but I think those prototypes can evolve out of
  future TDX functions.

Let us know so Lisa can try another option (if necessary) while we
collect more reviews :)

[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/all/[email protected]/
[3] https://lore.kernel.org/all/[email protected]/
[4] https://lore.kernel.org/all/[email protected]/

> 4. A note on #VE and x86 ucall simplification
> It is worth noting that the use of a Virtualization Exception (#VE)
> is orthogonal to the PIO vs. MMIO discussion; rather, it is a question
> of how much we want to simplify the x86 ucall implementation. A #VE
> handler is one option to allow VMs use PIO/MMIO identical to the
> non-TDX case. Alternatively, having an MMIO_WRITE wrapper macro, as Sean
> suggested[2], is another option. Either way, discussion for this is
> likely a premature optimization right now, since the PIO/MMIO call is
> only used under ucall_arch_do_ucall(), and standard and TDX VMs use
> different ones now. We should optimize this in the future, but for now,
> invoking TDCALL directly is more robust and concise.
>
>
> [...snip...]
>
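
Going back to the boot flow at the top of the cover letter, the three
steps can be modeled in plain C roughly as below. This is a simplified
sketch for discussion, not the series' boot code: the struct layouts,
field selection, and function names are all hypothetical, and the "guest"
is an ordinary function rather than code running from the reset vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of the region the host fills before first run. */
struct demo_boot_params {
	uint64_t cr0, cr3, cr4;
	uint64_t rsp;
	uint64_t entry;		/* address of the guest code */
};

/* Hypothetical model of the register state the host cannot touch. */
struct demo_vcpu_regs {
	uint64_t cr0, cr3, cr4, rsp, rip;
};

/* Step 1 (host): place the register values into guest memory. */
static void host_write_boot_params(uint8_t *mem, size_t mem_size, size_t off,
				   const struct demo_boot_params *p)
{
	assert(off <= mem_size && sizeof(*p) <= mem_size - off);
	memcpy(mem + off, p, sizeof(*p));
}

/* Steps 2 and 3 (guest): pull the values, then jump to the guest code. */
static void guest_boot(const uint8_t *mem, size_t off,
		       struct demo_vcpu_regs *regs)
{
	struct demo_boot_params p;

	memcpy(&p, mem + off, sizeof(p));
	regs->cr0 = p.cr0;
	regs->cr3 = p.cr3;
	regs->cr4 = p.cr4;
	regs->rsp = p.rsp;
	regs->rip = p.entry;	/* models the final jump */
}
```

The point of the model is that the host only ever writes memory. Every
register write happens on the guest side, which is what makes the flow
usable when the TDX module hides register state from the host.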

