On Mon, 22 Dec, at 10:59:02AM, Ard Biesheuvel wrote:
> In order to support kexec, the kernel needs to be able to deal with the
> state of the UEFI firmware after SetVirtualAddressMap() has been called.
> To avoid having separate code paths for non-kexec and kexec, let's move
> the call to SetVirtualAddressMap() to the stub: this will guarantee us
> that it will only be called once (since the stub is not executed during
> kexec), and ensures that the UEFI state is identical between kexec and
> normal boot.
> 
> This implies that the layout of the virtual mapping needs to be created
> by the stub as well. All regions are rounded up to a naturally aligned
> multiple of 64 KB (for compatibility with 64k pages kernels) and recorded
> in the UEFI memory map. The kernel proper reads those values and installs
> the mappings in a dedicated set of page tables that are swapped in during
> UEFI Runtime Services calls.
> 
> Signed-off-by: Ard Biesheuvel <[email protected]>
> ---
>  arch/arm64/include/asm/efi.h       |  20 +++-
>  arch/arm64/kernel/efi.c            | 223 
> ++++++++++++++++++++-----------------
>  arch/arm64/kernel/setup.c          |   1 +
>  drivers/firmware/efi/libstub/fdt.c | 137 ++++++++++++++++++++++-
>  4 files changed, 270 insertions(+), 111 deletions(-)

[...]

> @@ -45,5 +53,9 @@ extern void efi_idmap_init(void);
>  #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__)
>  
>  #define EFI_ALLOC_ALIGN              SZ_64K
> +#define EFI_VIRTMAP          EFI_ARCH_1

Any chance of documenting what EFI_VIRTMAP means for posterity and why
you want to use one of the EFI arch config bits? How is this different
from EFI_RUNTIME_SERVICES?

Take a look at EFI_OLD_MEMMAP in arch/x86/include/asm/efi.h for a
crackingly well documented example from Borislav.

[...]
  
> diff --git a/drivers/firmware/efi/libstub/fdt.c 
> b/drivers/firmware/efi/libstub/fdt.c
> index c846a9608cbd..76bc8abf41d1 100644
> --- a/drivers/firmware/efi/libstub/fdt.c
> +++ b/drivers/firmware/efi/libstub/fdt.c
> @@ -167,6 +167,94 @@ fdt_set_fail:
>  #define EFI_FDT_ALIGN EFI_PAGE_SIZE
>  #endif
>  
> +static efi_status_t get_memory_map(efi_system_table_t *sys_table_arg,
> +                                efi_memory_desc_t **map,
> +                                unsigned long *map_size,
> +                                unsigned long *desc_size,
> +                                u32 *desc_ver, unsigned long *key_ptr)
> +{
> +     efi_status_t status;
> +
> +     /*
> +      * Call get_memory_map() with 0 size to retrieve the size of the
> +      * required allocation.
> +      */
> +     *map_size = 0;
> +     status = efi_call_early(get_memory_map, map_size, NULL,
> +                             key_ptr, desc_size, desc_ver);
> +     if (status != EFI_BUFFER_TOO_SMALL)
> +             return EFI_LOAD_ERROR;
> +
> +     /*
> +      * Add an additional efi_memory_desc_t to map_size because we're doing
> +      * an allocation which may be in a new descriptor region. Then double it
> +      * to give us some scratch space to prepare the input virtmap to give
> +      * to SetVirtualAddressMap(). Note that this is EFI_LOADER_DATA memory,
> +      * and the kernel memblock_reserve()'s only the size of the actual
> +      * memory map, so the scratch space is freed again automatically.
> +      */
> +     *map_size += *desc_size;
> +     status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
> +                             *map_size * 2, (void **)map);
> +     if (status != EFI_SUCCESS)
> +             return status;
> +
> +     status = efi_call_early(get_memory_map, map_size, *map,
> +                             key_ptr, desc_size, desc_ver);
> +     if (status != EFI_SUCCESS)
> +             efi_call_early(free_pool, *map);
> +     return status;
> +}

We've now got two (slightly different) versions of this function. Is
there any way we could make do with just one?

> +/*
> + * This is the base address at which to start allocating virtual memory 
> ranges
> + * for UEFI Runtime Services. This is a userland range so that we can use any
> + * allocation we choose, and eliminate the risk of a conflict after kexec.
> + */
> +#define EFI_RT_VIRTUAL_BASE  0x40000000
> +
> +static void update_memory_map(efi_memory_desc_t *memory_map,
> +                           unsigned long map_size, unsigned long desc_size,
> +                           int *count)
> +{
> +     u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
> +     efi_memory_desc_t *out = (void *)memory_map + map_size;
> +     int l;
> +
> +     for (l = 0; l < map_size; l += desc_size) {
> +             efi_memory_desc_t *in = (void *)memory_map + l;
> +             u64 paddr, size;
> +
> +             if (!(in->attribute & EFI_MEMORY_RUNTIME))
> +                     continue;
> +
> +             /*
> +              * Make the mapping compatible with 64k pages: this allows
> +              * a 4k page size kernel to kexec a 64k page size kernel and
> +              * vice versa.
> +              */
> +             paddr = round_down(in->phys_addr, SZ_64K);
> +             size = round_up(in->num_pages * EFI_PAGE_SIZE +
> +                             in->phys_addr - paddr, SZ_64K);
> +
> +             /*
> +              * Avoid wasting memory on PTEs by choosing a virtual base that
> +              * is compatible with section mappings if this region has the
> +              * appropriate size and physical alignment. (Sections are 2 MB
> +              * on 4k granule kernels)
> +              */
> +             if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
> +                     efi_virt_base = round_up(efi_virt_base, SZ_2M);
> +
> +             in->virt_addr = efi_virt_base + in->phys_addr - paddr;
> +             efi_virt_base += size;
> +
> +             memcpy(out, in, desc_size);
> +             out = (void *)out + desc_size;
> +             ++*count;
> +     }
> +}
> +

fdt.c is starting to become a dumping ground for arm*-specific stuff :-(

Is there no general solution for sharing code between arm and aarch64 in
the kernel so we don't have to stick things like this in
drivers/firmware/efi/?

-- 
Matt Fleming, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to