On 08/04/22 15:56, Laszlo Ersek wrote: > On 08/04/22 15:28, Jason A. Donenfeld wrote: >> On Thu, Aug 4, 2022 at 3:25 PM Laszlo Ersek <ler...@redhat.com> wrote: >>> >>> On 08/04/22 14:16, Ard Biesheuvel wrote: >>>> On Thu, 4 Aug 2022 at 14:11, Daniel P. Berrangé <berra...@redhat.com> >>>> wrote: >>>>> >>>>> On Thu, Aug 04, 2022 at 02:03:29PM +0200, Jason A. Donenfeld wrote: >>>>>> Hi Daniel, >>>>>> >>>>>> On Thu, Aug 04, 2022 at 10:25:36AM +0100, Daniel P. Berrangé wrote: >>>>>>> Yep, and ultimately the inability to distinguish UEFI vs other firmware >>>>>>> is arguably correct by design, as the QEMU <-> firmware interface is >>>>>>> supposed to be arbitrarily pluggable for any firmware implementation >>>>>>> not limited to merely UEFI + seabios. >>>>>> >>>>>> Indeed, I agree with this. >>>>>> >>>>>>> >>>>>>>> For now I suggest either reverting the original patch, or at least not >>>>>>>> enabling the knob by default for any machine types. In particular, when >>>>>>>> using MicroVM, the user must leave the knob disabled when direct >>>>>>>> booting >>>>>>>> a kernel on OVMF, and the user may or may not enable the knob when >>>>>>>> direct booting a kernel on SeaBIOS. >>>>>>> >>>>>>> Having it opt-in via a knob would defeat Jason's goal of having the seed >>>>>>> available automatically. >>>>>> >>>>>> Yes, adding a knob is absolutely out of the question. >>>>>> >>>>>> It also doesn't actually solve the problem: this triggers when QEMU >>>>>> passes a DTB too. It's not just for the new RNG seed thing. This bug >>>>>> isn't new. >>>>> >>>>> In the other thread I also mentioned that this RNG Seed addition has >>>>> caused a bug with AMD SEV too, making boot measurement attestation >>>>> fail because the kernel blob passed to the firmware no longer matches >>>>> what the tenant expects, due to the injected seed. >>>>> >>>> >>>> I was actually expecting this to be an issue in the >>>> signing/attestation department as well, and you just confirmed my >>>> suspicion. >>>> >>>> But does this mean that populating the setup_data pointer is out of >>>> the question altogether? Or only that putting the setup_data linked >>>> list nodes inside the image is a problem? >>> >>> QEMU already has to inject a whole bunch of stuff into confidential >>> computing guests. The way it's done (IIRC) is that the non-compressed, >>> trailing part of pflash (basically where the reset vector code lives >>> too) is populated at OVMF build time with a chain of GUID-ed structures, >>> and fields of those structures are filled in (at OVMF build time) from >>> various fixed PCDs. The fixed PCDs in turn are populated from the FD >>> files, using various MEMFD regions. When QEMU launches the guest, it can >>> parse the GPAs from the on-disk pflash image (traversing the list of >>> GUID-ed structs), at which addresses the guest firmware will then expect >>> the various crypto artifacts to be injected. >>> >>> The point is that "who's in control" is reversed. The firmware exposes >>> (at build time) at what GPAs it can accept data injections, and QEMU >>> follows that. Of course the firmware ensures that nothing else in the >>> firmware will try to own those GPAs. >>> >>> The only thing that needed to be hard-coded when this feature was >>> introduced was the "entry point", that is, the flash offset at which >>> QEMU starts the GUID-ed structure traversal. >>> >>> AMD and IBM developers can likely much better describe this mechanism, >>> as I've not dealt with it in over a year. The QEMU side code is in >>> "hw/i386/pc_sysfw_ovmf.c". >>> >>> We can make setup_data chaining work with OVMF, but the whole chain >>> should be located in a GPA range that OVMF dictates. >> >> It sounds like what you describe is pretty OVMF-specific though, >> right? > > Yes, completely OVMF specific. > >> Do we want to tie things together so tightly like that? > > It depends on what the goal is: > > - do we want setup_data chaining work generally? > > - or do we want only the random seed injection to stop crashing OVMF guests? > > In the latter case, we only need to teach QEMU to reliably recognize > OVMF from the on-disk firmware binary (regardless of pflash use), and > then not expose any setup_data in guest RAM. The UEFI guest kernel will > use EFI_RNG_PROTOCOL (when available, from virtio-rng-pci), and no > particular seed otherwise. > > (This is the "papering over" case, which I don't think is necessarily > wrong; only it should be robust. I thought pflash would be usable for > that; turns out it isn't. Thus, we could perhaps try identifying a > firmware binary as "OVMF" by contents.) > > In the former case though, I think the "tight coupling" is unavoidable. > As long as setup_data is needed by the kernel extremely early, no > "firmware hiding" or "firmware independence" is in place yet. > >> Given we only need 48 bytes or so, isn't there a more subtle place we >> could just throw this in ram that doesn't need such complex >> coordination? > > These tricks add up and go wrong after a while. The pedantic > reservations in the firmware have proved necessary. > > IIUC, with v2, the setup_data_base address would (most frequently) be 96 > KB. edk2 does have uses for very low memory. If OVMF's PlatformPei does > not reserve away the area, UefiCpuPkg or other drivers might allocate an > overlapping chunk, even if only temporarily. That might not break the > firmware, but it could overwrite the random seed. If the guest kernel > somehow consumed that seed (rather than getting one from > EFI_RNG_PROTOCOL -- what if no virtio-rng-pci device was configured?), > that could potentially destroy the guest's security, without any > obvious/immediate signs.
I must add however that I feel very much out of place making authoritative-sounding statements like this. The above is my honest opinion, and my (unpleasant) writing style, but it's just that: my opinion & style. So much has happened in edk2 and QEMU since last summer (without me following them closely) that I feel uncomfortable speaking on them. Take whatever I say with a grain of salt, and definitely research all options. Laszlo