Hi,

FWIW I gave this a spin on Seattle and Juno and saw no regressions (both
are pre-2.5 EFI though).

I have some concerns below.

On Wed, Sep 09, 2015 at 08:06:54AM +0100, Ard Biesheuvel wrote:
> The new Properties Table feature introduced in UEFIv2.5 may split
> memory regions that cover PE/COFF memory images into separate code
> and data regions. Since these regions only differ in the type (runtime
> code vs runtime data) and the permission bits, but not in the memory
> type attributes (UC/WC/WT/WB), the spec does not require them to be
> aligned to 64 KB.

We should require those to be 64k-aligned for permissions too. I can
imagine vendors getting permissions wrong but things happening to work
for a 64k kernel (where I assume we have to use the superset of all
permissions within a 64k page).

> As the relative offset of PE/COFF .text and .data segments cannot be
> changed on the fly, this means that we can no longer pad out those
> regions to be mappable using 64 KB pages.
> Unfortunately, there is no annotation in the UEFI memory map that
> identifies data regions that were split off from a code region, so we
> must apply this logic to all adjacent runtime regions whose attributes
> only differ in the permission bits.
> 
> So instead of rounding each memory region to 64 KB alignment at both
> ends, only round down regions that are not directly preceded by another
> runtime region with the same type attributes. Since the UEFI spec does
> not mandate that the memory map be sorted, this means we also need to
> sort it first.
> 
> Signed-off-by: Ard Biesheuvel <[email protected]>
> ---
> 
> As discussed off list, this is the arm64 side of what we should backport
> to stable to prevent firmware that adheres to the current version of the
> UEFI v2.5 spec with the memprotect feature enabled from blowing up the system
> upon the first OS call into the runtime services.
> 
> For arm64, we already map things in order, but since the spec does not mandate
> a sorted memory map, we need to sort it to be sure. This also allows us to
> easily find adjacent regions with < 64 KB granularity, which the current 
> version
> of the spec allows if they only differ in permission bits (which the spec says
> are 'unused' on AArch64, which could be interpreted as 'allowed but ignored').
> 
> Changes since v1:
> - Ensure that we don't inadvertently set the XN bit on the preceding region at
>   mapping time if we the OS is running with >4 KB pages.
>   
>  arch/arm64/kernel/efi.c                 |  3 +-
>  drivers/firmware/efi/libstub/arm-stub.c | 62 +++++++++++++++-----
>  2 files changed, 49 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
> index e8ca6eaedd02..13671a9cf016 100644
> --- a/arch/arm64/kernel/efi.c
> +++ b/arch/arm64/kernel/efi.c
> @@ -258,7 +258,8 @@ static bool __init efi_virtmap_init(void)
>                */
>               if (!is_normal_ram(md))
>                       prot = __pgprot(PROT_DEVICE_nGnRE);
> -             else if (md->type == EFI_RUNTIME_SERVICES_CODE)
> +             else if (md->type == EFI_RUNTIME_SERVICES_CODE ||
> +                      !PAGE_ALIGNED(md->phys_addr))
>                       prot = PAGE_KERNEL_EXEC;

This looks coarser than necessary. For memory organised like:

0x00000000 - 0x0000F000 (60KiB) : EFI_RUNTIME_SERVICES_CODE 
0x0000F000 - 0x00020000 (68KiB) : EFI_RUNTIME_SERVICES_DATA

We should be able to make the last 64K non-executable, but with this all
128K is executable, unless I've missed something?

Maybe we could do a two-step pass, first mapping the data as
not-executable, then mapping any code pages executable (overriding any
overlapping portions, but only for the overlapping parts).

>               else
>                       prot = PAGE_KERNEL;
> diff --git a/drivers/firmware/efi/libstub/arm-stub.c 
> b/drivers/firmware/efi/libstub/arm-stub.c
> index e29560e6b40b..cb4e9c4de952 100644
> --- a/drivers/firmware/efi/libstub/arm-stub.c
> +++ b/drivers/firmware/efi/libstub/arm-stub.c
> @@ -13,6 +13,7 @@
>   */
>  
>  #include <linux/efi.h>
> +#include <linux/sort.h>

Sort isn't an inline in this header. I thought it wasn't safe to call
arbitary kernel functions from the stub?

>  #include <asm/efi.h>
>  
>  #include "efistub.h"
> @@ -305,6 +306,13 @@ fail:
>   */
>  #define EFI_RT_VIRTUAL_BASE  0x40000000
>  
> +static int cmp_mem_desc(const void *a, const void *b)
> +{
> +     const efi_memory_desc_t *left = a, *right = b;
> +
> +     return (left->phys_addr > right->phys_addr) ? 1 : -1;
> +}

Nit: please chose names to make the relationship between these clearer.
e.g. s/left/mem_a/, s/right/mem_b/.

> +
>  /*
>   * efi_get_virtmap() - create a virtual mapping for the EFI memory map
>   *
> @@ -316,34 +324,58 @@ void efi_get_virtmap(efi_memory_desc_t *memory_map, 
> unsigned long map_size,
>                    unsigned long desc_size, efi_memory_desc_t *runtime_map,
>                    int *count)
>  {
> +     static const u64 mem_type_mask = EFI_MEMORY_WB | EFI_MEMORY_WT |
> +                                      EFI_MEMORY_WC | EFI_MEMORY_UC |
> +                                      EFI_MEMORY_RUNTIME;
> +
>       u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
> -     efi_memory_desc_t *out = runtime_map;
> +     efi_memory_desc_t *in, *prev = NULL, *out = runtime_map;
>       int l;
>  
> -     for (l = 0; l < map_size; l += desc_size) {
> -             efi_memory_desc_t *in = (void *)memory_map + l;
> +     /*
> +      * To work around potential issues with the Properties Table feature
> +      * introduced in UEFI 2.5, which may split PE/COFF executable images
> +      * in memory into several RuntimeServicesCode and RuntimeServicesData
> +      * regions, we need to preserve the relative offsets between adjacent
> +      * EFI_MEMORY_RUNTIME regions with the same memory type attributes.
> +      * The easiest way to find adjacent regions is to sort the memory map
> +      * before traversing it.
> +      */
> +     sort(memory_map, map_size / desc_size, desc_size, cmp_mem_desc, NULL);
> +
> +     for (l = 0; l < map_size; l += desc_size, prev = in) {
>               u64 paddr, size;
>  
> +             in = (void *)memory_map + l;
>               if (!(in->attribute & EFI_MEMORY_RUNTIME))
>                       continue;
>  
> +             paddr = in->phys_addr;
> +             size = in->num_pages * EFI_PAGE_SIZE;
> +
>               /*
>                * Make the mapping compatible with 64k pages: this allows
>                * a 4k page size kernel to kexec a 64k page size kernel and
>                * vice versa.
>                */
> -             paddr = round_down(in->phys_addr, SZ_64K);
> -             size = round_up(in->num_pages * EFI_PAGE_SIZE +
> -                             in->phys_addr - paddr, SZ_64K);
> -
> -             /*
> -              * Avoid wasting memory on PTEs by choosing a virtual base that
> -              * is compatible with section mappings if this region has the
> -              * appropriate size and physical alignment. (Sections are 2 MB
> -              * on 4k granule kernels)
> -              */
> -             if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
> -                     efi_virt_base = round_up(efi_virt_base, SZ_2M);
> +             if (!prev ||
> +                 ((prev->attribute ^ in->attribute) & mem_type_mask) != 0 ||
> +                 paddr != (prev->phys_addr + prev->num_pages * 
> EFI_PAGE_SIZE)) {
> +

This looks correct, though slightly painful to read. It might be nicer
with helpers helpers like descs_have_same_attrs and
descs_are_contiguous.

Thanks,
Mark.
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to