Re: [PATCH v7 06/27] x86, 64bit: early #PF handler set page table
On Tue, Dec 25, 2012 at 3:20 AM, Borislav Petkov wrote:
> On Mon, Dec 24, 2012 at 08:04:18PM -0800, Yinghai Lu wrote:
>> well, I updated for-x86-boot-v7 that stop #PF handler after
>> init_mem_mapping.
>>
>> it has fix for AMD system aka reverting far jmp to ret.
>
> -v7?
>
> You told me yesterday -v8 is the current branch. Do you have -v7 which
> does break KGDB and -v8 which breaks it and both branches are current?
>

-v7: stop #PF handler after init_mem_mapping, so it could break KGDB,
if someone try to use mdump.
-v8: stop #PF handler before x86_64_start_reservations.

Now both have be updated and could work with AMD platform after drop
the change with lretq aka keep lretq.

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v7 06/27] x86, 64bit: early #PF handler set page table
On Mon, Dec 24, 2012 at 08:04:18PM -0800, Yinghai Lu wrote:
> well, I updated for-x86-boot-v7 that stop #PF handler after
> init_mem_mapping.
>
> it has fix for AMD system aka reverting far jmp to ret.

-v7?

You told me yesterday -v8 is the current branch. Do you have -v7 which
does break KGDB and -v8 which breaks it and both branches are current?

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
Re: [PATCH v7 06/27] x86, 64bit: early #PF handler set page table
On Mon, Dec 24, 2012 at 4:16 PM, H. Peter Anvin wrote:
> On 12/20/2012 08:56 AM, Yinghai Lu wrote:
>>>
>>> So in that case, kgdb is broken and will need to be fixed up. That
>>> happens all the time with debugging tools.
>>
>> If there is a way that we can make all parties happy, we really should
>> not break KGDB.
>>
>> Please reconsider to stop #PF handler in x86_64_start_kernel. in that case
>> 1. microcode update still can use #PF handler to find microcode in
>> ramdisk and use it.
>> 2. kernel that is loaded above 4G, could set mapping in C instead of
>> set that in head_64.S
>> and use ioremap to access zero_page
>> 3. KGDB still can call early_trap_init early before init_mem_mapping.
>>
>
> Yinghai, this is total and utter bullshit.
>
> We should *fix* kgdb, not pave around it. I refuse to have kgdb be yet
> another Xen turning random kernel internals into ABIs.

well, I updated for-x86-boot-v7 that stop #PF handler after init_mem_mapping.

it has fix for AMD system aka reverting far jmp to ret.

Yinghai
Re: [PATCH v7 06/27] x86, 64bit: early #PF handler set page table
On 12/20/2012 08:56 AM, Yinghai Lu wrote:
>>
>> So in that case, kgdb is broken and will need to be fixed up. That
>> happens all the time with debugging tools.
>
> If there is a way that we can make all parties happy, we really should
> not break KGDB.
>
> Please reconsider to stop #PF handler in x86_64_start_kernel. in that case
> 1. microcode update still can use #PF handler to find microcode in
> ramdisk and use it.
> 2. kernel that is loaded above 4G, could set mapping in C instead of
> set that in head_64.S
> and use ioremap to access zero_page
> 3. KGDB still can call early_trap_init early before init_mem_mapping.
>

Yinghai, this is total and utter bullshit.

We should *fix* kgdb, not pave around it. I refuse to have kgdb be yet
another Xen turning random kernel internals into ABIs.

	-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
Re: [PATCH v7 06/27] x86, 64bit: early #PF handler set page table
On Tue, Dec 18, 2012 at 1:07 PM, H. Peter Anvin wrote:
> On 12/18/2012 12:55 PM, Yinghai Lu wrote:
>> On Tue, Dec 18, 2012 at 12:49 PM, H. Peter Anvin wrote:
>>> On 12/18/2012 12:43 PM, Yinghai Lu wrote:
>>
>>>
>>> That is putting the cart before the horse. What is the specific requirement
>>> with kgdb here (I didn't see any email on that, please don't have private
>>> back conversations)? Either way, however, kgdb is a tool to debug the
>>> kernel... having it a barrier for proper functionality of the kernel is not
>>> acceptable.
>>
>> did not hear back from Jason or Jan.
>>
>> Looks like last mail in LKML from Jason is about Oct 20
>>
>> looks like kgdb is want DB, BP, and PF are set at first.
>>
>> and just after that early_param for kgdbwait will get into to hold the
>> kernel.
>>
>> then command from kgdb could dump ram etc.
>>
>
> So in that case, kgdb is broken and will need to be fixed up. That
> happens all the time with debugging tools.
>

If there is a way that we can make all parties happy, we really should
not break KGDB.

Please reconsider to stop #PF handler in x86_64_start_kernel. in that case
1. microcode update still can use #PF handler to find microcode in
ramdisk and use it.
2. kernel that is loaded above 4G, could set mapping in C instead of
set that in head_64.S
and use ioremap to access zero_page
3. KGDB still can call early_trap_init early before init_mem_mapping.

I put the change in for-x86-boot-v8 branch.

core patch is:
http://git.kernel.org/?p=linux/kernel/git/yinghai/linux-yinghai.git;a=commitdiff;h=6fa4f1e68f0b67d0dc13d30c5ce6c3932697d08f

Thanks

Yinghai
Re: [PATCH v7 06/27] x86, 64bit: early #PF handler set page table
On 12/18/2012 12:55 PM, Yinghai Lu wrote:
> On Tue, Dec 18, 2012 at 12:49 PM, H. Peter Anvin wrote:
>> On 12/18/2012 12:43 PM, Yinghai Lu wrote:
>
>>
>> That is putting the cart before the horse. What is the specific requirement
>> with kgdb here (I didn't see any email on that, please don't have private
>> back conversations)? Either way, however, kgdb is a tool to debug the
>> kernel... having it a barrier for proper functionality of the kernel is not
>> acceptable.
>
> did not hear back from Jason or Jan.
>
> Looks like last mail in LKML from Jason is about Oct 20
>
> looks like kgdb is want DB, BP, and PF are set at first.
>
> and just after that early_param for kgdbwait will get into to hold the kernel.
>
> then command from kgdb could dump ram etc.
>

So in that case, kgdb is broken and will need to be fixed up. That
happens all the time with debugging tools.

	-hpa
Re: [PATCH v7 06/27] x86, 64bit: early #PF handler set page table
On Tue, Dec 18, 2012 at 12:49 PM, H. Peter Anvin wrote:
> On 12/18/2012 12:43 PM, Yinghai Lu wrote:
>
> That is putting the cart before the horse. What is the specific requirement
> with kgdb here (I didn't see any email on that, please don't have private
> back conversations)? Either way, however, kgdb is a tool to debug the
> kernel... having it a barrier for proper functionality of the kernel is not
> acceptable.

did not hear back from Jason or Jan.

Looks like last mail in LKML from Jason is about Oct 20

looks like kgdb is want DB, BP, and PF are set at first.

and just after that early_param for kgdbwait will get into to hold the kernel.

then command from kgdb could dump ram etc.
Re: [PATCH v7 06/27] x86, 64bit: early #PF handler set page table
On 12/18/2012 12:43 PM, Yinghai Lu wrote:
> On Mon, Dec 17, 2012 at 11:15 PM, Yinghai Lu wrote:
>> -v8: we need to keep that handler alive until init_mem_mapping and don't
>> let early_trap_init to trash that early #PF handler.
>> So split early_trap_pf_init out and move it down. - Yinghai
>
> Peter,
>
> looks like moving down early_trap_init would break kgdb.
>
> we could make temporary early pgt cover 1G, and kernel and stop updating later.
>
> please check attached patch.
>
> init_mem_mapping need to be change a little: map BRK at first then switch pgt.
>
> Thanks
>
> Yinghai
>

That is putting the cart before the horse. What is the specific requirement
with kgdb here (I didn't see any email on that, please don't have private
back conversations)? Either way, however, kgdb is a tool to debug the
kernel... having it a barrier for proper functionality of the kernel is not
acceptable.

	-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
Re: [PATCH v7 06/27] x86, 64bit: early #PF handler set page table
On Mon, Dec 17, 2012 at 11:15 PM, Yinghai Lu wrote:
> -v8: we need to keep that handler alive until init_mem_mapping and don't
> let early_trap_init to trash that early #PF handler.
> So split early_trap_pf_init out and move it down. - Yinghai

Peter,

looks like moving down early_trap_init would break kgdb.

we could make temporary early pgt cover 1G, and kernel and stop updating later.

please check attached patch.

init_mem_mapping need to be change a little: map BRK at first then switch pgt.

Thanks

Yinghai

hpa_pf_set_page_table_5.patch
Description: Binary data
[PATCH v7 06/27] x86, 64bit: early #PF handler set page table
From: "H. Peter Anvin" <h...@zytor.com>

two use cases:
1. We will support load and run kernel above 4G, and zero_page, ramdisk
   will be above 4G, too
2. need to access ramdisk early to get microcode to update that as
   early possible.

We could use early_iomap to access them, but it will make code too messy
and hard to unify with 32bit.

So here comes #PF handler to set page table.

When #PF happen, handler will use pages in __initdata to set page table
to cover accessed page.

those code and page in __INIT sections, so will not increase ram usages.

The good point is: with help of #PF handler, we can set kernel mapping
from blank, and switch to init_level4_pgt later.

switchover in head_64.S is only using three page to handle kernel
crossing 1G, 512G with sharing page, most interesting part.

early_make_pgtable is using kernel high mapping address to access pages
to set page table.

-v4: Add phys_base offset to make kexec happy,
     and add init_mapping_kernel() - Yinghai
-v5: fix compiling with xen, and add back ident level3 and level2 for xen
     also move back init_level4_pgt from BSS to DATA again.
     because we have to clear it anyway. - Yinghai
-v6: switch to init_level4_pgt in init_mem_mapping. - Yinghai
-v7: remove not needed clear_page for init_level4_page
     it is with fill 512,8,0 already in head_64.S - Yinghai
-v8: we need to keep that handler alive until init_mem_mapping and don't
     let early_trap_init to trash that early #PF handler.
     So split early_trap_pf_init out and move it down. - Yinghai
-v9: switchover only cover kernel space instead of 1G so could avoid
     touch possible mem holes. - Yinghai

Signed-off-by: Yinghai Lu <ying...@kernel.org>
---
 arch/x86/include/asm/pgtable_64_types.h |    4 +
 arch/x86/include/asm/processor.h        |    1 +
 arch/x86/kernel/head64.c                |   79 ++--
 arch/x86/kernel/head_64.S               |  202 +--
 arch/x86/kernel/setup.c                 |    2 +
 arch/x86/kernel/traps.c                 |    9 ++
 arch/x86/mm/init.c                      |    3 +-
 7 files changed, 204 insertions(+), 96 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 766ea16..2d88344 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_X86_PGTABLE_64_DEFS_H
 #define _ASM_X86_PGTABLE_64_DEFS_H
 
+#include <asm/sparsemem.h>
+
 #ifndef __ASSEMBLY__
 #include <linux/types.h>
 
@@ -60,4 +62,6 @@ typedef struct { pteval_t pte; } pte_t;
 #define MODULES_END      _AC(0xff00, UL)
 #define MODULES_LEN   (MODULES_END - MODULES_VADDR)
 
+#define EARLY_DYNAMIC_PAGE_TABLES	64
+
 #endif /* _ASM_X86_PGTABLE_64_DEFS_H */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 888184b..a0b58dd 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -731,6 +731,7 @@ extern void enable_sep_cpu(void);
 extern int sysenter_setup(void);
 
 extern void early_trap_init(void);
+extern void early_trap_pf_init(void);
 
 /* Defined in head.S */
 extern struct desc_ptr early_gdt_descr;
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 7b215a5..cac61dc 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -26,11 +26,72 @@
 #include <asm/e820.h>
 #include <asm/bios_ebda.h>
 
-static void __init zap_identity_mappings(void)
+/*
+ * Manage page tables very early on.
+ */
+extern pgd_t early_level4_pgt[PTRS_PER_PGD];
+extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
+static unsigned int __initdata next_early_pgt = 2;
+
+/* Wipe all early page tables except for the kernel symbol map */
+static void __init reset_early_page_tables(void)
 {
-	pgd_t *pgd = pgd_offset_k(0UL);
-	pgd_clear(pgd);
-	__flush_tlb_all();
+	unsigned long i;
+
+	for (i = 0; i < PTRS_PER_PGD-1; i++)
+		early_level4_pgt[i].pgd = 0;
+
+	next_early_pgt = 0;
+
+	write_cr3(__pa(early_level4_pgt));
+}
+
+/* Create a new PMD entry */
+int __init early_make_pgtable(unsigned long address)
+{
+	unsigned long physaddr = address - __PAGE_OFFSET;
+	unsigned long i;
+	pgdval_t pgd, *pgd_p;
+	pudval_t *pud_p;
+	pmdval_t pmd, *pmd_p;
+
+
+	/* Invalid address or early pgt is done ? */
+	if (physaddr >= MAXMEM || read_cr3() != __pa(early_level4_pgt))
+		return -1;
+
+	pgd_p = &early_level4_pgt[pgd_index(address)].pgd;
+	pgd = *pgd_p;
+
+	/*
+	 * The use of __START_KERNEL_map rather than __PAGE_OFFSET here is
+	 * critical -- __PAGE_OFFSET would point us back into the dynamic
+	 * range and we might end up looping forever...
+	 */
+	if (pgd && next_early_pgt < EARLY_DYNAMIC_PAGE_TABLES) {
+		pud_p = (pudval_t *)((pgd & PTE_PFN_MASK) + __START_KERNEL_map - phys_base);
+	} else {
+		if (next_early_pgt >=
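
[Editor's sketch, not part of the posted patch.] The commit message describes
a #PF handler that builds mappings on demand out of a small fixed pool of
page-table pages. The user-space C model below only illustrates that idea. It
is not kernel code and does not reproduce the patch: table_t, child(),
model_fault() and reset_tables() are hypothetical names, table entries hold
pool indices rather than physical addresses, and the "wipe everything and
retry once the pool is exhausted" policy is an assumption read out of
reset_early_page_tables() and the truncated hunk above. Only the pool size
(64, as in EARLY_DYNAMIC_PAGE_TABLES) and the 2 MiB mapping granularity
implied by "Create a new PMD entry" come from the patch.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ENTRIES      512   /* entries per table with 4-level x86-64 paging */
#define POOL_TABLES  64    /* same count as EARLY_DYNAMIC_PAGE_TABLES */
#define PMD_SHIFT    21    /* one PMD entry maps 2 MiB */
#define PUD_SHIFT    30    /* one PUD entry covers 1 GiB */
#define PGD_SHIFT    39    /* one PGD entry covers 512 GiB */

typedef struct { uint64_t e[ENTRIES]; } table_t;

static table_t top;                /* stands in for early_level4_pgt */
static table_t pool[POOL_TABLES];  /* stands in for early_dynamic_pgts */
static unsigned int next_free;     /* stands in for next_early_pgt */

/* Index into one table level for a given address. */
static unsigned int idx(uint64_t addr, int shift)
{
	return (unsigned int)((addr >> shift) & (ENTRIES - 1));
}

/* Drop every mapping and hand the whole pool back. */
static void reset_tables(void)
{
	memset(&top, 0, sizeof(top));
	memset(pool, 0, sizeof(pool));
	next_free = 0;
}

/*
 * Return the table referenced by *slot, taking a fresh one from the pool
 * if the slot is empty. A slot stores "pool index + 1"; 0 means not present.
 * Returns NULL when the pool is exhausted.
 */
static table_t *child(uint64_t *slot)
{
	if (*slot == 0) {
		if (next_free >= POOL_TABLES)
			return NULL;
		*slot = ++next_free;
	}
	return &pool[*slot - 1];
}

/*
 * Model of the fault path: make the 2 MiB region containing phys mapped.
 * Returns 0 on success, -1 if no mapping could be created.
 */
static int model_fault(uint64_t phys)
{
	for (int attempt = 0; attempt < 2; attempt++) {
		table_t *pud = child(&top.e[idx(phys, PGD_SHIFT)]);
		table_t *pmd = pud ? child(&pud->e[idx(phys, PUD_SHIFT)]) : NULL;

		if (pmd) {
			/* Leaf entry: 2 MiB-aligned address plus a "present" bit. */
			pmd->e[idx(phys, PMD_SHIFT)] =
				(phys & ~((1ULL << PMD_SHIFT) - 1)) | 1;
			return 0;
		}
		reset_tables();	/* assumed policy: wipe and retry once */
	}
	return -1;
}

#ifdef MODEL_DEMO
int main(void)
{
	model_fault(0x100000000ULL);	/* first touch at 4 GiB */
	printf("tables used: %u\n", next_free);
	model_fault(0x100200000ULL);	/* neighbouring 2 MiB region */
	printf("tables used: %u\n", next_free);
	return 0;
}
#endif
```

Built with -DMODEL_DEMO, the first access at 4 GiB takes two tables from the
pool (one PUD-level, one PMD-level) and the access to the neighbouring 2 MiB
region reuses both, which is why a small pool can go a long way as long as
early accesses are clustered.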