On 5 September 2017 at 22:59, Laszlo Ersek <[email protected]> wrote:
> On 09/05/17 23:11, Ard Biesheuvel wrote:
>> On 5 September 2017 at 21:17, Laszlo Ersek <[email protected]> wrote:
[...]
>>> * Option #2: add an ExitBootServices() handler to the IOMMU driver, and
>>>   clean up mappings (= re-encrypt memory) after the PCI / Virtio drivers
>>>   are finalized in their ExitBootServices() handlers. Ignore mappings in
>>>   the drivers' ExitBootServices() handlers.
>>>
>>>   Problem: the keyword is "after". Under this approach, we *must* clean
>>>   up the mappings in the IOMMU driver, but we *must not* do that before
>>>   the device drivers are finished. And the UEFI spec does not allow us
>>>   to express a dependency order between ExitBootServices() handlers.
>>>
>>>   We could hack that around by deferring the actual cleanup to *another*
>>>   event handler, signaled from the IOMMU's "normal" ExitBootServices()
>>>   handler.
>>>
>>>   Problem: the UEFI spec does not promise that signaling events from
>>>   ExitBootServices() handlers is safe.
>>>
>>>   Problem: if some PCI driver does the same trick for whatever reason in
>>>   its ExitBootServices() handler, then we're back to square one.
>>>
>>>
>>
>> I think this is the only feasible option. Fortunately, we don't have
>> to solve this at the spec level, but only have to deal with the
>> implementation.
>>
>> What we need is a method in the IOMMU protocol that is invoked when
>> the ExitBootServices implementation is about to return EFI_SUCCESS
>
> Yes, after "everything else" is done.
>
> (Of course, this is the age-old problem with UEFI, when components that
> were originally meant to be independent now try to order themselves in
> some fashion. For example, "let me just install, or patch, this ACPI
> table or configuration table quickly at ReadyToBoot, *after* everyone
> else is done, and I get a good look at system state". Now combine two
> such DXE drivers into a firmware, and hilarity ensues as they fight for
> the last word.)
>
> No facility exists to my knowledge (on the spec level) that would enable
> such fine delaying or dependency ordering between EBS handlers. The only
> ordering is between notification functions enqueued at TPL_NOTIFY vs.
> TPL_CALLBACK, and there is a guarantee (I think?) that if you get
> signalled demonstrably later (i.e., not via an event group), then you
> get queued for later, within the same TPL.
>
> The problem (for this discussion) is that any random PCI driver can
> register its EBS event notification function at either TPL_NOTIFY or at
> TPL_CALLBACK, plus that all EBS events are signaled together as an event
> group. Consequently, if the IOMMU driver registers its EBS notifier at
> TPL_NOTIFY, it is guaranteed to run earlier than all notifiers with
> TPL_CALLBACK (which is exactly what we *don't* want). If the IOMMU
> registers its EBS handler at TPL_CALLBACK, then (due to being signaled
> through a large event group) the notifier will still be invoked
> somewhere in the middle of a bunch of other TPL_CALLBACK-level notifiers
> -- that is, not necessarily in the last position.
>
> Hence my floating of a hack to re-queue the notification (to another
> TPL_CALLBACK handler), so that we get to the "end of the low-prio
> queue". But calling SignalEvent() from an EBS handler is not explicitly
> permitted in the spec. (Below you even suggest that an EBS handler
> should not call any boot service.)
>
>
> Of course, if CoreExitBootServices() can be updated specifically for
> this use case, I won't protest.
>

I think this is the only solution, and not unreasonable given that the
IOMMU protocol is private to EDK2.

>>
>> (which means it could be the second time it was called).
>
> Side remark: the CoreExitBootServices() implementation does not notice
> memory map changes incurred by the ExitBootServices() handlers, in my
> interpretation.
>
> CoreExitBootServices() has a MapKey (= "memmap generation") check early
> on, in CoreTerminateMemoryMap(). This check catches memmap changes
> between the last GetMemoryMap(), and the call to ExitBootServices().
>
> After this check succeeds, the notify functions are kicked, and on this
> path, CoreExitBootServices() simply cannot return any other value than
> EFI_SUCCESS. So whatever mess the individual callbacks make, it doesn't
> notice or report.
>

CoreExitBootServices() disables the timer first, and so it will return
with event dispatch disabled if it fails, ensuring that it is no
longer possible for an event handler to be invoked between
GetMemoryMap and ExitBootServices.

> Perhaps this is better for your argument, actually. I'm not fully sure.
>

Not really :-)

>> This severely
>> limits what the method is actually capable of doing, and I think it
>> should not be allowed to call any boot or DXE services at all, but it
>> should still be sufficient for some linked list traversals and
>> CopyMem() operations, and whatever else is needed to re-encrypt the
>> memory.
>
> Yes, the necessary actions are sufficiently low-level for this. The
> problem is the ordering.
>
>>
>> And actually, this is not only useful for SEV, other IOMMU drivers
>> will have to deal with the same issue: in most cases, you'll want to
>> disable it before handing over to the OS, but this can never be done
>> safely in a EBS event handler for precisely the reasons you pointed
>> out. Most PCI devices probably deal with this gracefully, but tearing
>> down mappings should simply be avoided if a device could still be
>> accessing it.
>>
>
> Fully agreed. Once Jiewen adds the policy option to lock down VT-d at
> EBS (I hope I understood that plan right), dependent on OS support for
> the IOMMU, the IOMMU faults can show up just the same at EBS, until the
> rest of the PCI driver EBS handlers finish up.
>

Indeed. I think it is justified to treat the IOMMU protocol specially.
_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to