Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On 10/30/13 at 11:45am, Borislav Petkov wrote: On Wed, Oct 30, 2013 at 05:32:27PM +0800, Dave Young wrote: Boris, thanks for update, it's very elaborate, I have still wonder if 32 bit case should be mentioned as well. Ah, so that's why is mfleming bugging me about it on IRC :) Well, I left out the 32-bit case simply because I don't think anyone cares about it. Ok, that's fine, thanks for telling me. Waiting for you next version of the patch series. I will redo my patches based on that. Since I'm doing only minor fixups, I didn't want to spam the lists again. The latest version is my 'efi' branch at git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git and you can pull it from there. Just pulled your git, the function comment has not yet to be updated, so could you send me privately your new patches if you would not update in list. -- Thanks a lot! Dave -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On 10/29/13 at 10:40am, Borislav Petkov wrote: On Tue, Oct 29, 2013 at 02:47:20PM +0800, Dave Young wrote: Boris, could you update the comment? it says below: update that memory descriptor with the virtual address obtained from ioremap(). Logiclly your patch should update it, then my patch update it again with the case of mapping to fixed address for kexec. Thanks for catching this, I ended up doing the following: /* * This function will switch the EFI runtime services to virtual mode. * Essentially, we look through the EFI memmap and map every region that * has the runtime attribute bit set in its memory descriptor into the * -trampoline_pgd page table using a top-down VA allocation scheme. * * The old method which used to update that memory descriptor with the * virtual address obtained from ioremap() is still supported when the * kernel is booted with efi=old_map on its command line. Same old * method enabled the runtime services to be called without having to * thunk back into physical mode for every invocation. * * The new method does a pagetable switch in a preemption-safe manner * so that we're in a different address space when calling a runtime * function. For function arguments passing we do copy the PGDs of the * kernel page table into -trampoline_pgd prior to each call. */ Boris, thanks for update, it's very elaborate, I have still wonder if 32 bit case should be mentioned as well. Waiting for you next version of the patch series. I will redo my patches based on that. Thanks Dave -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Wed, Oct 30, 2013 at 05:32:27PM +0800, Dave Young wrote: Boris, thanks for update, it's very elaborate, I have still wonder if 32 bit case should be mentioned as well. Ah, so that's why is mfleming bugging me about it on IRC :) Well, I left out the 32-bit case simply because I don't think anyone cares about it. Waiting for you next version of the patch series. I will redo my patches based on that. Since I'm doing only minor fixups, I didn't want to spam the lists again. The latest version is my 'efi' branch at git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git and you can pull it from there. -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. -- -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
/* * This function will switch the EFI runtime services to virtual mode. * Essentially, look through the EFI memmap and map every region that @@ -862,10 +906,10 @@ void efi_memory_uc(u64 addr, unsigned long size) void __init efi_enter_virtual_mode(void) Boris, could you update the comment? it says below: update that memory descriptor with the virtual address obtained from ioremap(). Logiclly your patch should update it, then my patch update it again with the case of mapping to fixed address for kexec. Thanks Dave -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Tue, 08 Oct, at 06:48:31PM, Borislav Petkov wrote: From: Borislav Petkov b...@suse.de We map the EFI regions needed for runtime services contiguously on virtual addresses starting from -4G down for a total max space of 64G. This way, we provide for stable runtime services addresses across kernels so that a kexec'd kernel can still use them. This way, they're mapped in a separate pagetable so that we don't pollute the kernel namespace (you can see how the whole ioremapping and saving and restoring of PGDs is gone now). Also, add a chicken bit called efi=old_map which can be used as a fallback to the old runtime services mapping method in case there's some b0rkage with a particular EFI implementation (haha, it is hard to hold up the sarcasm here...). Add UEFI RT VA space to Documentation/x86/x86_64/mm.txt, while at it. Signed-off-by: Borislav Petkov b...@suse.de --- Documentation/x86/x86_64/mm.txt | 7 +++ arch/x86/include/asm/efi.h | 47 --- arch/x86/include/asm/pgtable_types.h | 3 +- arch/x86/platform/efi/efi.c | 91 ++-- arch/x86/platform/efi/efi_32.c | 8 +++- arch/x86/platform/efi/efi_64.c | 83 arch/x86/platform/efi/efi_stub_64.S | 54 + include/linux/efi.h | 1 + 8 files changed, 251 insertions(+), 43 deletions(-) [...] @@ -949,8 +978,17 @@ void __init efi_enter_virtual_mode(void) count++; } +#ifdef CONFIG_X86_64 + efi_scratch.efi_pgt = (pgd_t *)(unsigned long)real_mode_header-trampoline_pgd; + + if (!test_bit(EFI_OLD_MEMMAP, x86_efi_facility)) + efi_scratch.use_pgd = true; +#endif + BUG_ON(!efi.systab); Could you use the efi_enabled() function to test for EFI_OLD_MEMMAP instead of test_bit()? [...] diff --git a/include/linux/efi.h b/include/linux/efi.h index fa47d80ab4b5..beff433aa8c0 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -632,6 +632,7 @@ extern int __init efi_setup_pcdp_console(char *); #define EFI_RUNTIME_SERVICES 3 /* Can we use runtime services? */ #define EFI_MEMMAP 4 /* Can we use EFI memory map? */ #define EFI_64BIT5 /* Is the firmware 64-bit? */ +#define EFI_OLD_MEMMAP 6 /* Use old mapping method */ Hmm... I'm wondering whether this should actually be, #define EFI_ARCH_1 6 /* Architecture-specific option */ and in arch/x86/include/ we could then do, /* * Lots of info about why we need to switch to a new mapping scheme, but * also why the old scheme might be desirable */ #define EFI_OLD_MEMMAP EFI_ARCH_1 This way we won't exhaust the bitspace quite so soon (since ARM/ARM64 can reuse EFI_ARCH_1 if they need it), plus this memory mapping method is a very architecture-specific thing and so makes sense to hide it in the bowels of arch/x86. If it turns out that ARM/ARM64 need the exact same config option we can delete EFI_ARCH_1 and move EFI_OLD_MEMMAP to include/linux/efi.h just like in your original patch. What do you think? -- Matt Fleming, Intel Open Source Technology Center -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Mon, Oct 28, 2013 at 11:22:46AM +, Matt Fleming wrote: Could you use the efi_enabled() function to test for EFI_OLD_MEMMAP instead of test_bit()? Sure. This way we won't exhaust the bitspace quite so soon (since ARM/ARM64 Yeah, very foresightful. can reuse EFI_ARCH_1 if they need it), plus this memory mapping method is a very architecture-specific thing and so makes sense to hide it in the bowels of arch/x86. If it turns out that ARM/ARM64 need the exact same config option we can delete EFI_ARCH_1 and move EFI_OLD_MEMMAP to include/linux/efi.h just like in your original patch. What do you think? Yep, done and pushed out. Thanks. -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. -- -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Mon, 21 Oct, at 03:37:41PM, Borislav Petkov wrote: On Mon, Oct 21, 2013 at 08:47:39PM +0800, Dave Young wrote: What's the status of this series? They should appear at some point in Matt's efi-next branch, I think. I need below patch for mapping to fixed virt addr passed from 1st kernel. You need this to map the runtime regions in the kexec kernel, right? Please write that in the commit message. Would you like to add it to your series or I send out it later? Yeah, just add it to your patchset. BTW, what tree should my patches based on? Matt's next tree? Yeah, I think efi-next. Matt? Yes please. -- Matt Fleming, Intel Open Source Technology Center -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Wed, Oct 23, 2013 at 10:17:31AM +0800, Dave Young wrote: The reason is that I only pass runtime regions from 1st kernel to kexec kernel, your efi mapping function uses the region size to determin the virtual address from top to down. Because the passed-in md ranges in kexec kernel are different from ranges booting from firmware so the virtual address will be different. Well, this shouldn't be because SetVirtualAddressMap has already fixed the virtual addresses for us. And if they're different, then runtime services won't work anyway. Or am I missing something...? Even I pass the whole untouched ranges including BOOT_SERVICE there's still chance the function for reserving boot regions overwrite the boot region size to 0, and 1st kernel will leave it to be used as normal memory after efi init. I think we have talked about this issue previously. Matt, didn't you question the need to keep boot services regions mapped indefinitely? What was the story there? Thanks. -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Wed, Oct 23, 2013 at 02:25:31PM +0200, Borislav Petkov wrote: Matt, didn't you question the need to keep boot services regions mapped indefinitely? What was the story there? We shouldn't need boot services regions to be mapped after SetVirtualAddressMap is called. -- Matthew Garrett | mj...@srcf.ucam.org -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On 10/23/13 at 02:25pm, Borislav Petkov wrote: On Wed, Oct 23, 2013 at 10:17:31AM +0800, Dave Young wrote: The reason is that I only pass runtime regions from 1st kernel to kexec kernel, your efi mapping function uses the region size to determin the virtual address from top to down. Because the passed-in md ranges in kexec kernel are different from ranges booting from firmware so the virtual address will be different. Well, this shouldn't be because SetVirtualAddressMap has already fixed the virtual addresses for us. And if they're different, then runtime services won't work anyway. Or am I missing something...? Maybe I did not explain clear enough. Say first kernel mapping below regions: Region A (boot service):phys_start_a size_a - virt_start_a size_a Region B (runtime): phys_start_b size_b - virt_start_b size_b I will pass Range B into 2nd kernel (phys_start_b, size_b, virt_start_b) In kexed 2nd kernel, phys_start_b need to be mapped to virt_start_b Simply use efi_map_region from your patch does not work because it will map phys_start_b to a different virt address, isn't it? So I need simply map according to the kexec passed in mapping addr. Thanks Dave -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Wed, Oct 23, 2013 at 08:51:31PM +0800, Dave Young wrote: In kexed 2nd kernel, phys_start_b need to be mapped to virt_start_b Simply use efi_map_region from your patch does not work because it will map phys_start_b to a different virt address, isn't it? Oh ok, in the second kernel we're not mapping *all* regions we do map in the first kernel, right. So I need simply map according to the kexec passed in mapping addr. Yes, thanks for elaborating. -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Mon, Oct 21, 2013 at 11:04:26PM +0800, Dave Young wrote: You need this to map the runtime regions in the kexec kernel, right? Please write that in the commit message. Yes, will do Ok, but but, why doesn't the normal code path in efi_enter_virtual_mode work anymore? I mean, why do you need another function instead of doing what you did previously: if (!kexec) phys_efi_set_virtual_address_map(...) The path up to here does the mapping already anyway so you only need to do the mapping in the kexec kernel and skip set set_virtual_map thing. Thanks. -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On 10/22/13 at 01:18pm, Borislav Petkov wrote: On Mon, Oct 21, 2013 at 11:04:26PM +0800, Dave Young wrote: You need this to map the runtime regions in the kexec kernel, right? Please write that in the commit message. Yes, will do Ok, but but, why doesn't the normal code path in efi_enter_virtual_mode work anymore? I mean, why do you need another function instead of doing what you did previously: if (!kexec) phys_efi_set_virtual_address_map(...) The path up to here does the mapping already anyway so you only need to do the mapping in the kexec kernel and skip set set_virtual_map thing. Hi, The reason is that I only pass runtime regions from 1st kernel to kexec kernel, your efi mapping function uses the region size to determin the virtual address from top to down. Because the passed-in md ranges in kexec kernel are different from ranges booting from firmware so the virtual address will be different. Even I pass the whole untouched ranges including BOOT_SERVICE there's still chance the function for reserving boot regions overwrite the boot region size to 0, and 1st kernel will leave it to be used as normal memory after efi init. I think we have talked about this issue previously. Thanks Dave Thanks. -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Mon, Oct 21, 2013 at 08:47:39PM +0800, Dave Young wrote: What's the status of this series? They should appear at some point in Matt's efi-next branch, I think. I need below patch for mapping to fixed virt addr passed from 1st kernel. You need this to map the runtime regions in the kexec kernel, right? Please write that in the commit message. Would you like to add it to your series or I send out it later? Yeah, just add it to your patchset. BTW, what tree should my patches based on? Matt's next tree? Yeah, I think efi-next. Matt? -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On 10/21/13 at 03:37pm, Borislav Petkov wrote: On Mon, Oct 21, 2013 at 08:47:39PM +0800, Dave Young wrote: What's the status of this series? They should appear at some point in Matt's efi-next branch, I think. I need below patch for mapping to fixed virt addr passed from 1st kernel. You need this to map the runtime regions in the kexec kernel, right? Please write that in the commit message. Yes, will do Would you like to add it to your series or I send out it later? Yeah, just add it to your patchset. Ok. BTW, what tree should my patches based on? Matt's next tree? Yeah, I think efi-next. Matt? -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On 10/14/13 at 11:57am, Peter Jones wrote: On Sat, Oct 12, 2013 at 10:14:39AM +0800, Dave Young wrote: CCing Peter Jones .., Peter, any idea about the grub related problem? What grub problem? As Matt was saying, grub2 isn't loading it as EfiBootServicesCode/Data. grub2 is loading it as EfiLoaderData . Today I did printk debug, it is in fact an off by one bug: text start: 100 md start: 80 md size: 80 Below is the code: if ((start+size = __pa_symbol(_text) start = __pa_symbol(_end)) || !e820_all_mapped(start, start+size, E820_RAM) || memblock_is_region_reserved(start, size)) { /* Could not reserve, skip it */ Will post a patch to fix it. On 10/11/13 at 09:42am, Dave Young wrote: Matt, The kernel I referring is the boot kernel aka the 1st kernel, the boot loader is grub2 from Fedora 19. [sorry for top reply because of using webmail] - Original Message - From: Matt Fleming m...@console-pimps.org To: Dave Young dyo...@redhat.com Cc: Borislav Petkov b...@alien8.de, X86 ML x...@kernel.org, LKML linux-ker...@vger.kernel.org, Borislav Petkov b...@suse.de, Matthew Garrett mj...@srcf.ucam.org, H. Peter Anvin h...@zytor.com, James Bottomley james.bottom...@hansenpartnership.com, Vivek Goyal vgo...@redhat.com, linux-efi@vger.kernel.org, fwts-de...@lists.ubuntu.com Sent: Friday, October 11, 2013 6:27:06 PM Subject: Re: [PATCH 12/12] EFI: Runtime services virtual mapping On Fri, 11 Oct, at 02:24:37PM, Dave Young wrote: For the boot efi_reserve_boot_services code, it's mainly for the SetVirtualAddressMap callback use, so boot regions should not be reused before SetVirtualAddressMap, but the overlapping happens before the efi_reserve_boot_services, isn't it a problem? Hang on, which kernel are you referring to here? The boot kernel or the kexec'd kernel? I thought you were saying you noticed the overlap when running in the second (kexec'd) kernel? The only reason that you would see this overlap in the first (boot) kernel is if the bootloader messed up and allocated the kernel text as EfiBootServicesCode/Data. I'd like to believe no bootloaders are still doing that. -- Matt Fleming, Intel Open Source Technology Center -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- Peter -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Sat, Oct 12, 2013 at 10:14:39AM +0800, Dave Young wrote: CCing Peter Jones .., Peter, any idea about the grub related problem? What grub problem? As Matt was saying, grub2 isn't loading it as EfiBootServicesCode/Data. grub2 is loading it as EfiLoaderData . On 10/11/13 at 09:42am, Dave Young wrote: Matt, The kernel I referring is the boot kernel aka the 1st kernel, the boot loader is grub2 from Fedora 19. [sorry for top reply because of using webmail] - Original Message - From: Matt Fleming m...@console-pimps.org To: Dave Young dyo...@redhat.com Cc: Borislav Petkov b...@alien8.de, X86 ML x...@kernel.org, LKML linux-ker...@vger.kernel.org, Borislav Petkov b...@suse.de, Matthew Garrett mj...@srcf.ucam.org, H. Peter Anvin h...@zytor.com, James Bottomley james.bottom...@hansenpartnership.com, Vivek Goyal vgo...@redhat.com, linux-efi@vger.kernel.org, fwts-de...@lists.ubuntu.com Sent: Friday, October 11, 2013 6:27:06 PM Subject: Re: [PATCH 12/12] EFI: Runtime services virtual mapping On Fri, 11 Oct, at 02:24:37PM, Dave Young wrote: For the boot efi_reserve_boot_services code, it's mainly for the SetVirtualAddressMap callback use, so boot regions should not be reused before SetVirtualAddressMap, but the overlapping happens before the efi_reserve_boot_services, isn't it a problem? Hang on, which kernel are you referring to here? The boot kernel or the kexec'd kernel? I thought you were saying you noticed the overlap when running in the second (kexec'd) kernel? The only reason that you would see this overlap in the first (boot) kernel is if the bootloader messed up and allocated the kernel text as EfiBootServicesCode/Data. I'd like to believe no bootloaders are still doing that. -- Matt Fleming, Intel Open Source Technology Center -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- Peter -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Sun, Oct 13, 2013 at 11:11:27AM +0800, Dave Young wrote: Boris, I think we have got the agreement about passing setup_data? Yes. Basically, we want to start with what hpa suggested and see where it gets us: http://marc.info/?l=linux-kernelm=138006799131051 I think it should be on top of your patch series, Yep. I can work on that along with other kexec related patches. Or if you would like to do it please let me know. Absolutely, please feel free to do so - it's not like I don't have anything else to do :-) In the meantime, I'll finish randconfigs testing of the patches and upload the latest version to k-org, I'll let you know. Thanks. -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. -- -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On 10/11/13 at 09:41am, Borislav Petkov wrote: On Fri, Oct 11, 2013 at 02:24:37PM +0800, Dave Young wrote: But for current implementation from Boris, getting same mapping between diffrent kernel depends on same md order (same start and size for each one) How about using this mapping solution but at the same time for kexec kernel we also pass the virtual mappings via setup_data, only thing diffrent is we only need map the non boot region and just use the boot region size to ensure the other regions are mapped with same virtual address. Actually, as hpa suggested, we will need to be passing the explicit virtual addresses to the kexec kernel in case we change the mapping algorithm in the future. So all should go through setup_data. OTOH, if we only passing ioremapped data without Boris's current patch the problem I worry about is how can we ensure the addresses are not used by other code before we mapping the in 2nd kernel efi_init. Right, the old method of mapping EFI runtime regions used ioremap and was mapping the regions in the same address space. Now we have reserved a 64G in the VA space ending at -4G (i.e. 0x___) which is reserved only for EFI RT usage. Boris: For the boot service region overlapping problem I have another idea, how about modify your mapping code to always mapping the RUNTIME region (non boot service region) firstly from the efi_va, then mapping other regions in order, in this way kexec 2nd kernel will be happy because it does not call SetVirtualAddressMap and it does not need the boot service area at all. Thanks Dave -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Sat, Oct 12, 2013 at 11:13:08AM +0100, Matt Fleming wrote: On Sat, 12 Oct, at 03:54:44PM, Dave Young wrote: Boris: For the boot service region overlapping problem I have another idea, how about modify your mapping code to always mapping the RUNTIME region (non boot service region) firstly from the efi_va, then mapping other regions in order, in this way kexec 2nd kernel will be happy because it does not call SetVirtualAddressMap and it does not need the boot service area at all. Coalescing the runtime regions together implies that the second kernel would care about the fragmentation caused by unmapping the boot service regions - it shouldn't. We've sliced up a considerable chunk of kernel virtual address space (64G) and fragmentation shouldn't be an issue right now. Even if we run out of address space in the future due to fragmentation, and end up needing to coalesce runtime regions, this would be transparent to the kexec kernel because it's passed the memmap entries through setup_data. Though we are defining an ABI around the EFI address range (0xffef - 0x), such that it needs to be the same between kernels, we must not make the layout of regions within that range part of the ABI. We need the freedom to change the layout in the future. Basically, to sum up what Matt so eloquently explained, we will be passing all the runtime regions *but* *not* the boot regions (because the kexec kernel doesn't need them anyway) through setup_data to the kexec kernel. I.e., boot services regions is a dont-care for kexec. And it is very important to restate that we want to reserve ourselves the most flexible way of passing regions to the kexec kernel in case we want to change the mapping algorithm in the future. Therefore, kexec should simply not know anything about the VA layout of the EFI regions but will get them spelled out through the boot header's setup_data. This is the picture so far, AFAICT. Matt, please make a lot of noise if I've misrepresented anything. Thanks. -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. -- -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Sat, Oct 12, 2013 at 12:30:55PM +0200, Borislav Petkov wrote: On Sat, Oct 12, 2013 at 11:13:08AM +0100, Matt Fleming wrote: On Sat, 12 Oct, at 03:54:44PM, Dave Young wrote: Boris: For the boot service region overlapping problem I have another idea, how about modify your mapping code to always mapping the RUNTIME region (non boot service region) firstly from the efi_va, then mapping other regions in order, in this way kexec 2nd kernel will be happy because it does not call SetVirtualAddressMap and it does not need the boot service area at all. Coalescing the runtime regions together implies that the second kernel would care about the fragmentation caused by unmapping the boot service regions - it shouldn't. We've sliced up a considerable chunk of kernel virtual address space (64G) and fragmentation shouldn't be an issue right now. Even if we run out of address space in the future due to fragmentation, and end up needing to coalesce runtime regions, this would be transparent to the kexec kernel because it's passed the memmap entries through setup_data. Though we are defining an ABI around the EFI address range (0xffef - 0x), such that it needs to be the same between kernels, we must not make the layout of regions within that range part of the ABI. We need the freedom to change the layout in the future. Basically, to sum up what Matt so eloquently explained, we will be passing all the runtime regions *but* *not* the boot regions (because the kexec kernel doesn't need them anyway) through setup_data to the kexec kernel. I.e., boot services regions is a dont-care for kexec. And it is very important to restate that we want to reserve ourselves the most flexible way of passing regions to the kexec kernel in case we want to change the mapping algorithm in the future. Therefore, kexec should simply not know anything about the VA layout of the EFI regions but will get them spelled out through the boot header's setup_data. Boris, I think we have got the agreement about passing setup_data? I think it should be on top of your patch series, I can work on that along with other kexec related patches. Or if you would like to do it please let me know. This is the picture so far, AFAICT. Matt, please make a lot of noise if I've misrepresented anything. Thanks. -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. -- -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On 10/10/13 at 01:34pm, Matt Fleming wrote: On Thu, 10 Oct, at 10:58:28AM, Borislav Petkov wrote: On Thu, Oct 10, 2013 at 04:14:34PM +0800, Dave Young wrote: Even though I still have no idea why kernel text overlap with efi boot region, anyway map the un-overlapped part is necessary though. I can post the kexec related patches after your mapping patches settle down Right, settle down being the key here. Matt just mentioned on IRC that we might not need boot services mappings by the time we have to start the kexec kernel, which would mean, you don't have to do anything in efi_reserve_boot_services(). Dave, apologies for not discussing the whole Boot Services thing sooner. I missed your questions. No problem, Thanks for clarifying the boot service issue. We really should not be passing the EFI Boot Service regions via the memmap to kexec at all, because by the time the kexec'd kernel is running those pages that previously contained Boot Service code/data will have likely been reused for something else. Which, to answer your question, is why the Boot Service regions overlap the kernel text in the kexec'd kernel - those regions have been reallocated by the first kernel and now happen to contain the kernel text of the kexec kernel. Ok, then I understand passing boot service regions to 2nd kernel make no sense. But for current implementation from Boris, getting same mapping between diffrent kernel depends on same md order (same start and size for each one) How about using this mapping solution but at the same time for kexec kernel we also pass the virtual mappings via setup_data, only thing diffrent is we only need map the non boot region and just use the boot region size to ensure the other regions are mapped with same virtual address. OTOH, if we only passing ioremapped data without Boris's current patch the problem I worry about is how can we ensure the addresses are not used by other code before we mapping the in 2nd kernel efi_init. For the boot efi_reserve_boot_services code, it's mainly for the SetVirtualAddressMap callback use, so boot regions should not be reused before SetVirtualAddressMap, but the overlapping happens before the efi_reserve_boot_services, isn't it a problem? The reason that we don't keep the Boot Service regions around forever is because they can take up a considerable amount of memory, so the current situation of free'ing them after we're sure the firmware isn't going to reference them is still the right way to go, and simply not including any Boot Service entries in the memory map passed to kexec should make everything work OK. The question which needs answering first though is, how the whole efi thing is going to handle any functionality like calling into efi boot regions from runtime functions and such. Which hasn't really been tested and fw vendors don't really want to support that. But this is all bits and pieces I heard yesterday so it is all pretty wet and I'll let efi guys, i.e. the Matts and a couple of others :-), figure out this whole issue. We currently treat the scenario where Runtime Services reference Boot Service regions as a bug and either work around it (where we do efi_reserve_boot_services() and efi_free_boot_services() around SetVirtualAddressMap()) or we avoid calling those services altogether. The spec is pretty clear that runtime drivers shouldn't be doing this, and so far this approach has served us well. There are only two reasons why we keep the Boot Services regions around (for a short period) at all, 1) To work around the aforementioned runtime firmware bugs 2) To copy a ACPI BGRT image into kernel memory I'm not sure whether the kexec kernel would care about the BGRT? I have no idea about BGRT previously, it's Boot Graphics Resource Table so it's only for boot time use, I guess kexec can safely ignore it. Thanks Dave -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Fri, Oct 11, 2013 at 02:24:37PM +0800, Dave Young wrote: But for current implementation from Boris, getting same mapping between diffrent kernel depends on same md order (same start and size for each one) How about using this mapping solution but at the same time for kexec kernel we also pass the virtual mappings via setup_data, only thing diffrent is we only need map the non boot region and just use the boot region size to ensure the other regions are mapped with same virtual address. Actually, as hpa suggested, we will need to be passing the explicit virtual addresses to the kexec kernel in case we change the mapping algorithm in the future. So all should go through setup_data. OTOH, if we only passing ioremapped data without Boris's current patch the problem I worry about is how can we ensure the addresses are not used by other code before we mapping the in 2nd kernel efi_init. Right, the old method of mapping EFI runtime regions used ioremap and was mapping the regions in the same address space. Now we have reserved a 64G in the VA space ending at -4G (i.e. 0x___) which is reserved only for EFI RT usage. -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. -- -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Fri, 11 Oct, at 02:24:37PM, Dave Young wrote: For the boot efi_reserve_boot_services code, it's mainly for the SetVirtualAddressMap callback use, so boot regions should not be reused before SetVirtualAddressMap, but the overlapping happens before the efi_reserve_boot_services, isn't it a problem? Hang on, which kernel are you referring to here? The boot kernel or the kexec'd kernel? I thought you were saying you noticed the overlap when running in the second (kexec'd) kernel? The only reason that you would see this overlap in the first (boot) kernel is if the bootloader messed up and allocated the kernel text as EfiBootServicesCode/Data. I'd like to believe no bootloaders are still doing that. -- Matt Fleming, Intel Open Source Technology Center -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
Matt, The kernel I referring is the boot kernel aka the 1st kernel, the boot loader is grub2 from Fedora 19. [sorry for top reply because of using webmail] - Original Message - From: Matt Fleming m...@console-pimps.org To: Dave Young dyo...@redhat.com Cc: Borislav Petkov b...@alien8.de, X86 ML x...@kernel.org, LKML linux-ker...@vger.kernel.org, Borislav Petkov b...@suse.de, Matthew Garrett mj...@srcf.ucam.org, H. Peter Anvin h...@zytor.com, James Bottomley james.bottom...@hansenpartnership.com, Vivek Goyal vgo...@redhat.com, linux-efi@vger.kernel.org, fwts-de...@lists.ubuntu.com Sent: Friday, October 11, 2013 6:27:06 PM Subject: Re: [PATCH 12/12] EFI: Runtime services virtual mapping On Fri, 11 Oct, at 02:24:37PM, Dave Young wrote: For the boot efi_reserve_boot_services code, it's mainly for the SetVirtualAddressMap callback use, so boot regions should not be reused before SetVirtualAddressMap, but the overlapping happens before the efi_reserve_boot_services, isn't it a problem? Hang on, which kernel are you referring to here? The boot kernel or the kexec'd kernel? I thought you were saying you noticed the overlap when running in the second (kexec'd) kernel? The only reason that you would see this overlap in the first (boot) kernel is if the bootloader messed up and allocated the kernel text as EfiBootServicesCode/Data. I'd like to believe no bootloaders are still doing that. -- Matt Fleming, Intel Open Source Technology Center -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
CCing Peter Jones .., Peter, any idea about the grub related problem? On 10/11/13 at 09:42am, Dave Young wrote: Matt, The kernel I referring is the boot kernel aka the 1st kernel, the boot loader is grub2 from Fedora 19. [sorry for top reply because of using webmail] - Original Message - From: Matt Fleming m...@console-pimps.org To: Dave Young dyo...@redhat.com Cc: Borislav Petkov b...@alien8.de, X86 ML x...@kernel.org, LKML linux-ker...@vger.kernel.org, Borislav Petkov b...@suse.de, Matthew Garrett mj...@srcf.ucam.org, H. Peter Anvin h...@zytor.com, James Bottomley james.bottom...@hansenpartnership.com, Vivek Goyal vgo...@redhat.com, linux-efi@vger.kernel.org, fwts-de...@lists.ubuntu.com Sent: Friday, October 11, 2013 6:27:06 PM Subject: Re: [PATCH 12/12] EFI: Runtime services virtual mapping On Fri, 11 Oct, at 02:24:37PM, Dave Young wrote: For the boot efi_reserve_boot_services code, it's mainly for the SetVirtualAddressMap callback use, so boot regions should not be reused before SetVirtualAddressMap, but the overlapping happens before the efi_reserve_boot_services, isn't it a problem? Hang on, which kernel are you referring to here? The boot kernel or the kexec'd kernel? I thought you were saying you noticed the overlap when running in the second (kexec'd) kernel? The only reason that you would see this overlap in the first (boot) kernel is if the bootloader messed up and allocated the kernel text as EfiBootServicesCode/Data. I'd like to believe no bootloaders are still doing that. -- Matt Fleming, Intel Open Source Technology Center -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On 10/08/13 at 06:48pm, Borislav Petkov wrote: From: Borislav Petkov b...@suse.de We map the EFI regions needed for runtime services contiguously on virtual addresses starting from -4G down for a total max space of 64G. This way, we provide for stable runtime services addresses across kernels so that a kexec'd kernel can still use them. This way, they're mapped in a separate pagetable so that we don't pollute the kernel namespace (you can see how the whole ioremapping and saving and restoring of PGDs is gone now). Also, add a chicken bit called efi=old_map which can be used as a fallback to the old runtime services mapping method in case there's some b0rkage with a particular EFI implementation (haha, it is hard to hold up the sarcasm here...). Add UEFI RT VA space to Documentation/x86/x86_64/mm.txt, while at it. Tested this new patch, the kexec kernel still get different mappings. Same reason, in first kernel reserve boot service function the size is set to 0. With a little hack patch below (upon my previous test patches for kexec) kexec and kdump works ok in qemu/ovmf, still not tried on real hardware. --- bp.orig/arch/x86/platform/efi/efi.c +++ bp/arch/x86/platform/efi/efi.c @@ -445,10 +445,18 @@ static void __init print_efi_memmap(void #endif /* EFI_DEBUG */ } +static bool inline overlap_with_ktext(u64 start, u64 size) +{ + return (start + size = __pa_symbol(_text) +start = __pa_symbol(_end)); +} + void __init efi_reserve_boot_services(void) { void *p; + if (kexecboot) + return; for (p = memmap.map; p memmap.map_end; p += memmap.desc_size) { efi_memory_desc_t *md = p; u64 start = md-phys_addr; @@ -463,13 +471,16 @@ void __init efi_reserve_boot_services(vo * - Not within any part of the kernel * - Not the bios reserved area */ - if ((start+size = __pa_symbol(_text) -start = __pa_symbol(_end)) || + if (overlap_with_ktext(start, size) || !e820_all_mapped(start, start+size, E820_RAM) || memblock_is_region_reserved(start, size)) { /* Could not reserve, skip it */ - md-num_pages = 0; - memblock_dbg(Could not reserve boot range + if (overlap_with_ktext(start, size)) { + u64 s = __pa_symbol(_text) - start; + memblock_reserve(start, s); + } else + md-num_pages = 0; + memblock_dbg(Could not reserve whole boot range [0x%010llx-0x%010llx]\n, start, start+size-1); } else @@ -490,6 +501,8 @@ void __init efi_free_boot_services(void) { void *p; + if (kexecboot) + return; if (!efi_is_native()) return; -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On 10/10/13 at 04:06pm, Dave Young wrote: On 10/08/13 at 06:48pm, Borislav Petkov wrote: From: Borislav Petkov b...@suse.de We map the EFI regions needed for runtime services contiguously on virtual addresses starting from -4G down for a total max space of 64G. This way, we provide for stable runtime services addresses across kernels so that a kexec'd kernel can still use them. This way, they're mapped in a separate pagetable so that we don't pollute the kernel namespace (you can see how the whole ioremapping and saving and restoring of PGDs is gone now). Also, add a chicken bit called efi=old_map which can be used as a fallback to the old runtime services mapping method in case there's some b0rkage with a particular EFI implementation (haha, it is hard to hold up the sarcasm here...). Add UEFI RT VA space to Documentation/x86/x86_64/mm.txt, while at it. Tested this new patch, the kexec kernel still get different mappings. Same reason, in first kernel reserve boot service function the size is set to 0. With a little hack patch below (upon my previous test patches for kexec) kexec and kdump works ok in qemu/ovmf, still not tried on real hardware. Even though I still have no idea why kernel text overlap with efi boot region, anyway map the un-overlapped part is necessary though. I can post the kexec related patches after your mapping patches settle down --- bp.orig/arch/x86/platform/efi/efi.c +++ bp/arch/x86/platform/efi/efi.c @@ -445,10 +445,18 @@ static void __init print_efi_memmap(void #endif /* EFI_DEBUG */ } +static bool inline overlap_with_ktext(u64 start, u64 size) +{ + return (start + size = __pa_symbol(_text) + start = __pa_symbol(_end)); +} + void __init efi_reserve_boot_services(void) { void *p; + if (kexecboot) + return; for (p = memmap.map; p memmap.map_end; p += memmap.desc_size) { efi_memory_desc_t *md = p; u64 start = md-phys_addr; @@ -463,13 +471,16 @@ void __init efi_reserve_boot_services(vo * - Not within any part of the kernel * - Not the bios reserved area */ - if ((start+size = __pa_symbol(_text) - start = __pa_symbol(_end)) || + if (overlap_with_ktext(start, size) || !e820_all_mapped(start, start+size, E820_RAM) || memblock_is_region_reserved(start, size)) { /* Could not reserve, skip it */ - md-num_pages = 0; - memblock_dbg(Could not reserve boot range + if (overlap_with_ktext(start, size)) { + u64 s = __pa_symbol(_text) - start; + memblock_reserve(start, s); + } else + md-num_pages = 0; + memblock_dbg(Could not reserve whole boot range [0x%010llx-0x%010llx]\n, start, start+size-1); } else @@ -490,6 +501,8 @@ void __init efi_free_boot_services(void) { void *p; + if (kexecboot) + return; if (!efi_is_native()) return; -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Thu, Oct 10, 2013 at 04:14:34PM +0800, Dave Young wrote: Even though I still have no idea why kernel text overlap with efi boot region, anyway map the un-overlapped part is necessary though. I can post the kexec related patches after your mapping patches settle down Right, settle down being the key here. Matt just mentioned on IRC that we might not need boot services mappings by the time we have to start the kexec kernel, which would mean, you don't have to do anything in efi_reserve_boot_services(). The question which needs answering first though is, how the whole efi thing is going to handle any functionality like calling into efi boot regions from runtime functions and such. Which hasn't really been tested and fw vendors don't really want to support that. But this is all bits and pieces I heard yesterday so it is all pretty wet and I'll let efi guys, i.e. the Matts and a couple of others :-), figure out this whole issue. -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. -- -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/12] EFI: Runtime services virtual mapping
On Thu, 10 Oct, at 10:58:28AM, Borislav Petkov wrote: On Thu, Oct 10, 2013 at 04:14:34PM +0800, Dave Young wrote: Even though I still have no idea why kernel text overlap with efi boot region, anyway map the un-overlapped part is necessary though. I can post the kexec related patches after your mapping patches settle down Right, settle down being the key here. Matt just mentioned on IRC that we might not need boot services mappings by the time we have to start the kexec kernel, which would mean, you don't have to do anything in efi_reserve_boot_services(). Dave, apologies for not discussing the whole Boot Services thing sooner. I missed your questions. We really should not be passing the EFI Boot Service regions via the memmap to kexec at all, because by the time the kexec'd kernel is running those pages that previously contained Boot Service code/data will have likely been reused for something else. Which, to answer your question, is why the Boot Service regions overlap the kernel text in the kexec'd kernel - those regions have been reallocated by the first kernel and now happen to contain the kernel text of the kexec kernel. The reason that we don't keep the Boot Service regions around forever is because they can take up a considerable amount of memory, so the current situation of free'ing them after we're sure the firmware isn't going to reference them is still the right way to go, and simply not including any Boot Service entries in the memory map passed to kexec should make everything work OK. The question which needs answering first though is, how the whole efi thing is going to handle any functionality like calling into efi boot regions from runtime functions and such. Which hasn't really been tested and fw vendors don't really want to support that. But this is all bits and pieces I heard yesterday so it is all pretty wet and I'll let efi guys, i.e. the Matts and a couple of others :-), figure out this whole issue. We currently treat the scenario where Runtime Services reference Boot Service regions as a bug and either work around it (where we do efi_reserve_boot_services() and efi_free_boot_services() around SetVirtualAddressMap()) or we avoid calling those services altogether. The spec is pretty clear that runtime drivers shouldn't be doing this, and so far this approach has served us well. There are only two reasons why we keep the Boot Services regions around (for a short period) at all, 1) To work around the aforementioned runtime firmware bugs 2) To copy a ACPI BGRT image into kernel memory I'm not sure whether the kexec kernel would care about the BGRT? -- Matt Fleming, Intel Open Source Technology Center -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html