Re: [edk2] [RFC v1 0/3] Add VIRTIO_F_IOMMU_PLATFORM support

Brijesh Singh Fri, 28 Jul 2017 17:53:38 -0700


On 7/28/17 2:59 PM, Laszlo Ersek wrote:
> On 07/28/17 18:00, Brijesh Singh wrote:
>> On 07/28/2017 08:38 AM, Laszlo Ersek wrote:
> snip
>
>>> (b) Plus, approaching the question from the Map() direction, we need
>>> to consider two scenarios:
>>>
>>> - Client code calls AllocateBuffer(), then Map(), and it writes to
>>>   the buffer only then. This should be safe.
>>> - client code calls AllocateBuffer(), writes to it, and then calls
>>>   Map(). This will result in memory contents that look like garbage
>>>   to the hypervisor. Bad.
>>>
>>> I can imagine the following to handle these cases: in the Map() and
>>> Unmap() functions, we have to decrypt and encrypt the memory contents
>>> in-place, after changing the C bit (without allocating additional
>>> memory). Introduce a static UINT8 array with EFI_PAGE_SIZE bytes
>>> (this will always remain in encrypted memory). Update the C bit with
>>> a single function call for the entire range (like now) -- this will
>>> not affect the guest-readability of the pages --, then bounce each
>>> page within the range to the static buffer and back to its original
>>> place. In effect this will in-place encrypt or decrypt the memory,
>>> and will be faster than a byte-wise rewrite.
> snip
>
>>> * BusMasterCommonBuffer:
>>>    - Client calls AllocateBuffer(), and places some data in the
>>>      returned memory.
>>>    - Client calls Map(). Map() clears the C bit in one fell swoop,
>>>      and then decrypts the buffer in-place (by bouncing it page-wise
>>>      to the static array and back).
>>>    - Client communicates with the device.
>>>    - Client calls Unmap(). Unmap() restores the C bit in one fell
>>>      swoop, and encrypts the buffer in-place (by bouncing it
>>>      page-wise to the static array and back).
>>>    - Client reads some residual data from the buffer.
>>>    - Client calls FreeBuffer(). FreeBuffer() relases the pages.
>>>
>> Yes this works fine as long as the client uses
>> EFI_PCI_IO_PROTOCOL.AllocateBuffer() to allocate the buffer.
> Again, a performance-oriented thought:
>
> Above I suggested using a statically allocated page-sized buffer, for
> the in-place encryption/decryption. Ultimately this means *two*
> CopyMem()s for the entire buffer (just executed page-wise), in *each* of
> Map() and Unmap().
>
> Maybe we can do better: what if you perform the CopyMem() from the
> buffer right back to the same buffer? CopyMem() is *required* to work
> with overlapping source and target areas (similarly to memmove() in
> standard C).
>
> This would result in *one* CopyMem (for in-place de-/encryption) in each
> of Map() and Unmap(), and thereby it would have identical performance
> impact to the BusMasterRead and BusMasterWrite Map() operations (where
> copying / crypting takes place between distinct memory areas).
>
> The OVMF DSC files resolve "BaseMemoryLib" -- which provides CopyMem()
> -- to "MdePkg/Library/BaseMemoryLibRepStr/BaseMemoryLibRepStr.inf";
> regardless of module type. The actual implementation appears to reside
> in "MdePkg/Library/BaseMemoryLibRepStr/X64/CopyMem.nasm":
>


AMD APM document a procedure which must be used to perform in-place
encryption/decryption. We must follow those steps to ensure that data is
flush into memory using the correct C-bit. Not doing so  may result in
unpredictable results.

http://support.amd.com/TechDocs/24593.pdf (Section 7.10.8)

>> global ASM_PFX(InternalMemCopyMem)
>> ASM_PFX(InternalMemCopyMem):
>>     push    rsi
>>     push    rdi
>>     mov     rsi, rdx                    ; rsi <- Source
>>     mov     rdi, rcx                    ; rdi <- Destination
>>     lea     r9, [rsi + r8 - 1]          ; r9 <- End of Source
>>     cmp     rsi, rdi
>>     mov     rax, rdi                    ; rax <- Destination as return value
>>     jae     .0
>>     cmp     r9, rdi
>>     jae     @CopyBackward               ; Copy backward if overlapped
>> .0:
>>     mov     rcx, r8
>>     and     r8, 7
>>     shr     rcx, 3
>>     rep     movsq                       ; Copy as many Qwords as possible
>>     jmp     @CopyBytes
>> @CopyBackward:
>>     mov     rsi, r9                     ; rsi <- End of Source
>>     lea     rdi, [rdi + r8 - 1]         ; esi <- End of Destination
>>     std                                 ; set direction flag
>> @CopyBytes:
>>     mov     rcx, r8
>>     rep     movsb                       ; Copy bytes backward
>>     cld
>>     pop     rdi
>>     pop     rsi
>>     ret
>>
> However, I'm afraid even if this works on SEV (which I certainly
> expect!), this code won't be reached, due to the following CopyMem()
> wrapper implementation in
> "MdePkg/Library/BaseMemoryLibRepStr/CopyMemWrapper.c":
>
>> VOID *
>> EFIAPI
>> CopyMem (
>>   OUT VOID       *DestinationBuffer,
>>   IN CONST VOID  *SourceBuffer,
>>   IN UINTN       Length
>>   )
>> {
>>   if (Length == 0) {
>>     return DestinationBuffer;
>>   }
>>   ASSERT ((Length - 1) <= (MAX_ADDRESS - (UINTN)DestinationBuffer));
>>   ASSERT ((Length - 1) <= (MAX_ADDRESS - (UINTN)SourceBuffer));
>>
>>   if (DestinationBuffer == SourceBuffer) {
>>     return DestinationBuffer;
>>   }
>>   return InternalMemCopyMem (DestinationBuffer, SourceBuffer, Length);
>> }
> As you see, (DestinationBuffer == SourceBuffer) is handled as a no-op
> (quite justifiedly, except in the case of SEV).
>
> Personally I think it would be OK to copy the wrapper function and the
> assembly code to OvmfPkg/IoMmuDxe/X64, under the names SevCopyMem() and
> InternalSevCopyMem(), and call SevCopyMem() in the CommonBuffer cases of
> Map() and Unmap(), for the in-place flipping.
>
> For the 32-bit case (OvmfPkgIa32.dsc), my understanding is that guests
> cannot control the C bit at all (there is no C bit in the PTEs), and
> memory is always encrypted. Is that correct? If so, then we only need to
> ensure that SevCopyMem() compile, as it will never be called -- in the
> entry point function of OvmfPkg/IoMmuDxe, MemEncryptSevIsEnabled() will
> return FALSE, and so the IOMMU protocol will not be installed. Therefore
> the 32-bit version (under OvmfPkg/IoMmuDxe/Ia32) of SevCopyMem() can be
> stubbed out as an ASSERT(FALSE)+CpuDeadLoop().
>
> If you can think of a better location for SevCopyMem(), that's fine as
> well. For example, you could add it to
> "OvmfPkg/Library/BaseMemEncryptSevLib" as well.
>
> ... I don't think this functionality should be added under MdePkg,
> because it is *very* special to the IOMMU implementation, and
> practically no other module should use a "busy" in-place CopyMem().
>
> Thanks
> Laszlo

_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.01.org/mailman/listinfo/edk2-devel

Re: [edk2] [RFC v1 0/3] Add VIRTIO_F_IOMMU_PLATFORM support

Reply via email to