Re: [PATCHv2 08/10] x86/mm: Replace compile-time checks for 5-level with runtime-time
On 02/08/17 10:17, Kirill A. Shutemov wrote:
> On Wed, Aug 02, 2017 at 09:44:54AM +0200, Juergen Gross wrote:
>> That did the trick!
>>
>> PV domU is coming up now with a 5-level paging enabled kernel.
>
> Thanks a lot for helping me out with it.
>
> I'll integrate the fixes into patchset.
>
> Just for clarification, XEN_PVH works too, right?

Yes.

Juergen
Re: [PATCHv2 08/10] x86/mm: Replace compile-time checks for 5-level with runtime-time
On Wed, Aug 02, 2017 at 09:44:54AM +0200, Juergen Gross wrote:
> That did the trick!
>
> PV domU is coming up now with a 5-level paging enabled kernel.

Thanks a lot for helping me out with it.

I'll integrate the fixes into patchset.

Just for clarification, XEN_PVH works too, right?

--
 Kirill A. Shutemov
Re: [PATCHv2 08/10] x86/mm: Replace compile-time checks for 5-level with runtime-time
On 01/08/17 21:11, Kirill A. Shutemov wrote:
> On Tue, Aug 01, 2017 at 07:14:57PM +0200, Juergen Gross wrote:
>> On 01/08/17 16:44, Kirill A. Shutemov wrote:
>>> On Tue, Aug 01, 2017 at 09:46:56AM +0200, Juergen Gross wrote:
>>>> On 26/07/17 18:43, Kirill A. Shutemov wrote:
>>>>> On Wed, Jul 26, 2017 at 09:28:16AM +0200, Juergen Gross wrote:
>>>>>> On 25/07/17 11:05, Kirill A. Shutemov wrote:
>>>>>>> On Tue, Jul 18, 2017 at 04:24:06PM +0200, Juergen Gross wrote:
>>>>>>>> Xen PV guests will never run with 5-level-paging enabled. So I
>>>>>>>> guess you can drop the complete
>>>>>>>> if (IS_ENABLED(CONFIG_X86_5LEVEL)) {} block.
>>>>>>>
>>>>>>> There is more code to drop from mmu_pv.c.
>>>>>>>
>>>>>>> But while there, I thought if with boot-time 5-level paging switching we
>>>>>>> can allow kernel to compile with XEN_PV and XEN_PVH, so the kernel image
>>>>>>> can be used in these XEN modes with 4-level paging.
>>>>>>>
>>>>>>> Could you check if with the patch below we can boot in XEN_PV and
>>>>>>> XEN_PVH modes?
>>>>>>
>>>>>> We can't. I have used your branch:
>>>>>>
>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git
>>>>>> la57/boot-switching/v2
>>>>>>
>>>>>> with this patch applied on top.
>>>>>>
>>>>>> Doesn't boot PV guest with X86_5LEVEL configured (very early crash).
>>>>>
>>>>> Hm. Okay.
>>>>>
>>>>> Have you tried PVH?
>>>>>
>>>>>> Doesn't build with X86_5LEVEL not configured:
>>>>>>
>>>>>> AS arch/x86/kernel/head_64.o
>>>>>
>>>>> I've fixed the patch and split the patch into two parts: cleanup and
>>>>> re-enabling XEN_PV and XEN_PVH for X86_5LEVEL.
>>>>>
>>>>> There's a chance that I screwed something up in the cleanup part.
>>>>> Could you check that?
>>>>
>>>> Build is working with and without X86_5LEVEL configured.
>>>>
>>>> PV domU boots without X86_5LEVEL configured.
>>>>
>>>> PV domU crashes with X86_5LEVEL configured:
>>>>
>>>> xen_start_kernel()
>>>> x86_64_start_reservations()
>>>> start_kernel()
>>>> setup_arch()
>>>> early_ioremap_init()
>>>> early_ioremap_pmd()
>>>>
>>>> In early_ioremap_pmd() there seems to be a call to p4d_val() which is an
>>>> uninitialized paravirt operation in the Xen pv case.
>>>
>>> Thanks for testing.
>>>
>>> Could you check if patch below makes a difference?
>>
>> A little bit better. I get a panic message with backtrace now:
>
> Are you running with 512m of ram or so?

Yes. :-)

> There's a known issue with sparse mem: it still allocates data structures
> as if there's 52-bit phys address space even for p4d_folded case.
>
> I'm looking into this.
>
> Try to bump memory size to 2g or so for now.

That did the trick!

PV domU is coming up now with a 5-level paging enabled kernel.

Juergen
Re: [PATCHv2 08/10] x86/mm: Replace compile-time checks for 5-level with runtime-time
On Tue, Aug 01, 2017 at 07:14:57PM +0200, Juergen Gross wrote:
> On 01/08/17 16:44, Kirill A. Shutemov wrote:
> > On Tue, Aug 01, 2017 at 09:46:56AM +0200, Juergen Gross wrote:
> >> On 26/07/17 18:43, Kirill A. Shutemov wrote:
> >>> On Wed, Jul 26, 2017 at 09:28:16AM +0200, Juergen Gross wrote:
> >>>> On 25/07/17 11:05, Kirill A. Shutemov wrote:
> >>>>> On Tue, Jul 18, 2017 at 04:24:06PM +0200, Juergen Gross wrote:
> >>>>>> Xen PV guests will never run with 5-level-paging enabled. So I
> >>>>>> guess you can drop the complete
> >>>>>> if (IS_ENABLED(CONFIG_X86_5LEVEL)) {} block.
> >>>>>
> >>>>> There is more code to drop from mmu_pv.c.
> >>>>>
> >>>>> But while there, I thought if with boot-time 5-level paging switching we
> >>>>> can allow kernel to compile with XEN_PV and XEN_PVH, so the kernel image
> >>>>> can be used in these XEN modes with 4-level paging.
> >>>>>
> >>>>> Could you check if with the patch below we can boot in XEN_PV and
> >>>>> XEN_PVH modes?
> >>>>
> >>>> We can't. I have used your branch:
> >>>>
> >>>> git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git
> >>>> la57/boot-switching/v2
> >>>>
> >>>> with this patch applied on top.
> >>>>
> >>>> Doesn't boot PV guest with X86_5LEVEL configured (very early crash).
> >>>
> >>> Hm. Okay.
> >>>
> >>> Have you tried PVH?
> >>>
> >>>> Doesn't build with X86_5LEVEL not configured:
> >>>>
> >>>> AS arch/x86/kernel/head_64.o
> >>>
> >>> I've fixed the patch and split the patch into two parts: cleanup and
> >>> re-enabling XEN_PV and XEN_PVH for X86_5LEVEL.
> >>>
> >>> There's a chance that I screwed something up in the cleanup part.
> >>> Could you check that?
> >>
> >> Build is working with and without X86_5LEVEL configured.
> >>
> >> PV domU boots without X86_5LEVEL configured.
> >>
> >> PV domU crashes with X86_5LEVEL configured:
> >>
> >> xen_start_kernel()
> >> x86_64_start_reservations()
> >> start_kernel()
> >> setup_arch()
> >> early_ioremap_init()
> >> early_ioremap_pmd()
> >>
> >> In early_ioremap_pmd() there seems to be a call to p4d_val() which is an
> >> uninitialized paravirt operation in the Xen pv case.
> >
> > Thanks for testing.
> >
> > Could you check if patch below makes a difference?
>
> A little bit better. I get a panic message with backtrace now:

Are you running with 512m of ram or so?

There's a known issue with sparse mem: it still allocates data structures as
if there's 52-bit phys address space even for p4d_folded case.

I'm looking into this.

Try to bump memory size to 2g or so for now.

--
 Kirill A. Shutemov
Re: [PATCHv2 08/10] x86/mm: Replace compile-time checks for 5-level with runtime-time
On 01/08/17 16:44, Kirill A. Shutemov wrote:
> On Tue, Aug 01, 2017 at 09:46:56AM +0200, Juergen Gross wrote:
>> On 26/07/17 18:43, Kirill A. Shutemov wrote:
>>> On Wed, Jul 26, 2017 at 09:28:16AM +0200, Juergen Gross wrote:
>>>> On 25/07/17 11:05, Kirill A. Shutemov wrote:
>>>>> On Tue, Jul 18, 2017 at 04:24:06PM +0200, Juergen Gross wrote:
>>>>>> Xen PV guests will never run with 5-level-paging enabled. So I guess you
>>>>>> can drop the complete if (IS_ENABLED(CONFIG_X86_5LEVEL)) {} block.
>>>>>
>>>>> There is more code to drop from mmu_pv.c.
>>>>>
>>>>> But while there, I thought if with boot-time 5-level paging switching we
>>>>> can allow kernel to compile with XEN_PV and XEN_PVH, so the kernel image
>>>>> can be used in these XEN modes with 4-level paging.
>>>>>
>>>>> Could you check if with the patch below we can boot in XEN_PV and XEN_PVH
>>>>> modes?
>>>>
>>>> We can't. I have used your branch:
>>>>
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git
>>>> la57/boot-switching/v2
>>>>
>>>> with this patch applied on top.
>>>>
>>>> Doesn't boot PV guest with X86_5LEVEL configured (very early crash).
>>>
>>> Hm. Okay.
>>>
>>> Have you tried PVH?
>>>
>>>> Doesn't build with X86_5LEVEL not configured:
>>>>
>>>> AS arch/x86/kernel/head_64.o
>>>
>>> I've fixed the patch and split the patch into two parts: cleanup and
>>> re-enabling XEN_PV and XEN_PVH for X86_5LEVEL.
>>>
>>> There's a chance that I screwed something up in the cleanup part.
>>> Could you check that?
>>
>> Build is working with and without X86_5LEVEL configured.
>>
>> PV domU boots without X86_5LEVEL configured.
>>
>> PV domU crashes with X86_5LEVEL configured:
>>
>> xen_start_kernel()
>> x86_64_start_reservations()
>> start_kernel()
>> setup_arch()
>> early_ioremap_init()
>> early_ioremap_pmd()
>>
>> In early_ioremap_pmd() there seems to be a call to p4d_val() which is an
>> uninitialized paravirt operation in the Xen pv case.
>
> Thanks for testing.
>
> Could you check if patch below makes a difference?

A little bit better. I get a panic message with backtrace now:

(early) [0.00] random: get_random_bytes called from start_kernel+0x33/0x495 with crng_init=0
(early) [0.00] Linux version 4.13.0-rc2-default+ (gross@g226) (gcc version 4.8.5 (SUSE Linux)) #135 SMP PREEMPT Tue Aug 1 17:43:57 CEST 2017
(early) [0.00] Command line: root=UUID=3fa1e04c-4741-46ca-a1cd-859cf0da92d0 resume=/dev/xvda1 splash=silent showopts earlyprintk=xen,keep
(early) [0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
(early) [0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
(early) [0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
(early) [0.00] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
(early) [0.00] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
(early) [0.00] ACPI in unprivileged domain disabled
(early) [0.00] Released 0 page(s)
(early) [0.00] e820: BIOS-provided physical RAM map:
(early) [0.00] Xen: [mem 0x-0x0009] usable
(early) [0.00] Xen: [mem 0x000a-0x000f] reserved
(early) [0.00] Xen: [mem 0x0010-0x1fff] usable
(early) [0.00] console [xenboot0] enabled
(early) [0.00] NX (Execute Disable) protection: active
(early) [0.00] DMI not present or invalid.
(early) [0.00] Hypervisor detected: Xen PV
(early) [0.00] tsc: Fast TSC calibration failed
(early) [0.00] tsc: Unable to calibrate against PIT
(early) [0.00] tsc: No reference (HPET/PMTIMER) available
(early) [0.00] e820: last_pfn = 0x2 max_arch_pfn = 0x4
(early) [0.00] MTRR: Disabled
(early) [0.00] x86/PAT: MTRRs disabled, skipping PAT initialization too.
(early) [0.00] x86/PAT: Configuration [0-7]: WB WT UC- UC WC WP UC UC
(early) [0.00] Scanning 1 areas for low memory corruption
(early) [0.00] RAMDISK: [mem 0x021dd000-0x034e4fff]
(early) [0.00] NUMA turned off
(early) [0.00] Faking a node at [mem 0x-0x1fff]
(early) [0.00] NODE_DATA(0) allocated [mem 0x1ff07000-0x1ff1cfff]
(early) [0.00] Section 1 and 3 (node 0) have a circular dependency on usemap and pgdat allocations
(early) [0.00] Kernel panic - not syncing: memblock_virt_alloc_try_nid: Failed to allocate 268435456 bytes align=0x0 nid=-1 from=0x0 max_addr=0x0 [0.00]
(early) [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.13.0-rc2-default+ #135
(early) [0.00] Call Trace:
(early) [0.00] dump_stack+0x63/0x89
(early) [0.00] panic+0xdb/0x235
(early) [0.00] memblock_virt_alloc_try_nid+0x95/0xa2
(early) [0.00] ? sparse_early_mem_maps_alloc_node+0x10/0x10
(early) [
Re: [PATCHv2 08/10] x86/mm: Replace compile-time checks for 5-level with runtime-time
On Tue, Aug 01, 2017 at 09:46:56AM +0200, Juergen Gross wrote:
> On 26/07/17 18:43, Kirill A. Shutemov wrote:
> > On Wed, Jul 26, 2017 at 09:28:16AM +0200, Juergen Gross wrote:
> >> On 25/07/17 11:05, Kirill A. Shutemov wrote:
> >>> On Tue, Jul 18, 2017 at 04:24:06PM +0200, Juergen Gross wrote:
> >>>> Xen PV guests will never run with 5-level-paging enabled. So I guess you
> >>>> can drop the complete if (IS_ENABLED(CONFIG_X86_5LEVEL)) {} block.
> >>>
> >>> There is more code to drop from mmu_pv.c.
> >>>
> >>> But while there, I thought if with boot-time 5-level paging switching we
> >>> can allow kernel to compile with XEN_PV and XEN_PVH, so the kernel image
> >>> can be used in these XEN modes with 4-level paging.
> >>>
> >>> Could you check if with the patch below we can boot in XEN_PV and XEN_PVH
> >>> modes?
> >>
> >> We can't. I have used your branch:
> >>
> >> git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git
> >> la57/boot-switching/v2
> >>
> >> with this patch applied on top.
> >>
> >> Doesn't boot PV guest with X86_5LEVEL configured (very early crash).
> >
> > Hm. Okay.
> >
> > Have you tried PVH?
> >
> >> Doesn't build with X86_5LEVEL not configured:
> >>
> >> AS arch/x86/kernel/head_64.o
> >
> > I've fixed the patch and split the patch into two parts: cleanup and
> > re-enabling XEN_PV and XEN_PVH for X86_5LEVEL.
> >
> > There's a chance that I screwed something up in the cleanup part.
> > Could you check that?
>
> Build is working with and without X86_5LEVEL configured.
>
> PV domU boots without X86_5LEVEL configured.
>
> PV domU crashes with X86_5LEVEL configured:
>
> xen_start_kernel()
> x86_64_start_reservations()
> start_kernel()
> setup_arch()
> early_ioremap_init()
> early_ioremap_pmd()
>
> In early_ioremap_pmd() there seems to be a call to p4d_val() which is an
> uninitialized paravirt operation in the Xen pv case.

Thanks for testing.

Could you check if patch below makes a difference?

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 8febaa318aa2..37e5ccc3890f 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -604,12 +604,12 @@ static inline p4dval_t p4d_val(p4d_t p4d)
 	return PVOP_CALLEE1(p4dval_t, pv_mmu_ops.p4d_val, p4d.p4d);
 }
 
-static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
-{
-	pgdval_t val = native_pgd_val(pgd);
-
-	PVOP_VCALL2(pv_mmu_ops.set_pgd, pgdp, val);
-}
+#define set_pgd(pgdp, pgdval) do {					\
+	if (p4d_folded)							\
+		set_p4d((p4d_t *)(pgdp), (p4d_t) { (pgdval).pgd });	\
+	else								\
+		PVOP_VCALL2(pv_mmu_ops.set_pgd, pgdp, native_pgd_val(pgdval)); \
+	} while (0)
 
 #define pgd_clear(pgdp) do {						\
 	if (!p4d_folded)						\
@@ -834,6 +834,7 @@ static inline notrace unsigned long arch_local_irq_save(void)
 }
 
 
+#if 0
 /* Make sure as little as possible of this mess escapes. */
 #undef PARAVIRT_CALL
 #undef __PVOP_CALL
@@ -848,6 +849,7 @@ static inline notrace unsigned long arch_local_irq_save(void)
 #undef PVOP_CALL3
 #undef PVOP_VCALL4
 #undef PVOP_CALL4
+#endif
 
 extern void default_banner(void);
 
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index 3116649302f2..ab1a4f0c65c5 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -558,6 +558,22 @@ static void xen_set_p4d(p4d_t *ptr, p4d_t val)
 
 	xen_mc_issue(PARAVIRT_LAZY_MMU);
 }
+
+#if CONFIG_PGTABLE_LEVELS >= 5
+__visible p4dval_t xen_p4d_val(p4d_t p4d)
+{
+	return pte_mfn_to_pfn(p4d.p4d);
+}
+PV_CALLEE_SAVE_REGS_THUNK(xen_p4d_val);
+
+__visible p4d_t xen_make_p4d(p4dval_t p4d)
+{
+	p4d = pte_pfn_to_mfn(p4d);
+
+	return native_make_p4d(p4d);
+}
+PV_CALLEE_SAVE_REGS_THUNK(xen_make_p4d);
+#endif /* CONFIG_PGTABLE_LEVELS >= 5 */
 #endif /* CONFIG_X86_64 */
 
 static int xen_pmd_walk(struct mm_struct *mm, pmd_t *pmd,
@@ -2431,6 +2447,11 @@ static const struct pv_mmu_ops xen_mmu_ops __initconst = {
 
 	.alloc_pud = xen_alloc_pmd_init,
 	.release_pud = xen_release_pmd_init,
+
+#if CONFIG_PGTABLE_LEVELS >= 5
+	.p4d_val = PV_CALLEE_SAVE(xen_p4d_val),
+	.make_p4d = PV_CALLEE_SAVE(xen_make_p4d),
+#endif
 #endif /* CONFIG_X86_64 */
 
 	.activate_mm = xen_activate_mm,

--
 Kirill A. Shutemov
Re: [PATCHv2 08/10] x86/mm: Replace compile-time checks for 5-level with runtime-time
On 26/07/17 18:43, Kirill A. Shutemov wrote:
> On Wed, Jul 26, 2017 at 09:28:16AM +0200, Juergen Gross wrote:
>> On 25/07/17 11:05, Kirill A. Shutemov wrote:
>>> On Tue, Jul 18, 2017 at 04:24:06PM +0200, Juergen Gross wrote:
>>>> Xen PV guests will never run with 5-level-paging enabled. So I guess you
>>>> can drop the complete if (IS_ENABLED(CONFIG_X86_5LEVEL)) {} block.
>>>
>>> There is more code to drop from mmu_pv.c.
>>>
>>> But while there, I thought if with boot-time 5-level paging switching we
>>> can allow kernel to compile with XEN_PV and XEN_PVH, so the kernel image
>>> can be used in these XEN modes with 4-level paging.
>>>
>>> Could you check if with the patch below we can boot in XEN_PV and XEN_PVH
>>> modes?
>>
>> We can't. I have used your branch:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git
>> la57/boot-switching/v2
>>
>> with this patch applied on top.
>>
>> Doesn't boot PV guest with X86_5LEVEL configured (very early crash).
>
> Hm. Okay.
>
> Have you tried PVH?
>
>> Doesn't build with X86_5LEVEL not configured:
>>
>> AS arch/x86/kernel/head_64.o
>
> I've fixed the patch and split the patch into two parts: cleanup and
> re-enabling XEN_PV and XEN_PVH for X86_5LEVEL.
>
> There's a chance that I screwed something up in the cleanup part.
> Could you check that?

Build is working with and without X86_5LEVEL configured.

PV domU boots without X86_5LEVEL configured.

PV domU crashes with X86_5LEVEL configured:

xen_start_kernel()
x86_64_start_reservations()
start_kernel()
setup_arch()
early_ioremap_init()
early_ioremap_pmd()

In early_ioremap_pmd() there seems to be a call to p4d_val() which is an
uninitialized paravirt operation in the Xen pv case.

HTH,

Juergen
Re: [PATCHv2 08/10] x86/mm: Replace compile-time checks for 5-level with runtime-time
On 26/07/17 18:43, Kirill A. Shutemov wrote:
> On Wed, Jul 26, 2017 at 09:28:16AM +0200, Juergen Gross wrote:
>> On 25/07/17 11:05, Kirill A. Shutemov wrote:
>>> On Tue, Jul 18, 2017 at 04:24:06PM +0200, Juergen Gross wrote:
>>>> Xen PV guests will never run with 5-level-paging enabled. So I guess you
>>>> can drop the complete if (IS_ENABLED(CONFIG_X86_5LEVEL)) {} block.
>>>
>>> There is more code to drop from mmu_pv.c.
>>>
>>> But while there, I thought if with boot-time 5-level paging switching we
>>> can allow kernel to compile with XEN_PV and XEN_PVH, so the kernel image
>>> can be used in these XEN modes with 4-level paging.
>>>
>>> Could you check if with the patch below we can boot in XEN_PV and XEN_PVH
>>> modes?
>>
>> We can't. I have used your branch:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git
>> la57/boot-switching/v2
>>
>> with this patch applied on top.
>>
>> Doesn't boot PV guest with X86_5LEVEL configured (very early crash).
>
> Hm. Okay.
>
> Have you tried PVH?

Now I have. It's coming up.

>
>> Doesn't build with X86_5LEVEL not configured:
>>
>>   AS      arch/x86/kernel/head_64.o
>
> I've fixed the patch and split the patch into two parts: cleanup and
> re-enabling XEN_PV and XEN_PVH for X86_5LEVEL.
>
> There's a chance that I screwed something up in the cleanup part. Could
> you check that?

Not sure I'll manage to do this today. Stay tuned...

Juergen
Re: [PATCHv2 08/10] x86/mm: Replace compile-time checks for 5-level with runtime-time
On Wed, Jul 26, 2017 at 09:28:16AM +0200, Juergen Gross wrote:
> On 25/07/17 11:05, Kirill A. Shutemov wrote:
> > On Tue, Jul 18, 2017 at 04:24:06PM +0200, Juergen Gross wrote:
> >> Xen PV guests will never run with 5-level-paging enabled. So I guess you
> >> can drop the complete if (IS_ENABLED(CONFIG_X86_5LEVEL)) {} block.
> >
> > There is more code to drop from mmu_pv.c.
> >
> > But while there, I thought if with boot-time 5-level paging switching we
> > can allow kernel to compile with XEN_PV and XEN_PVH, so the kernel image
> > can be used in these XEN modes with 4-level paging.
> >
> > Could you check if with the patch below we can boot in XEN_PV and XEN_PVH
> > modes?
>
> We can't. I have used your branch:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git
> la57/boot-switching/v2
>
> with this patch applied on top.
>
> Doesn't boot PV guest with X86_5LEVEL configured (very early crash).

Hm. Okay.

Have you tried PVH?

> Doesn't build with X86_5LEVEL not configured:
>
>   AS      arch/x86/kernel/head_64.o

I've fixed the patch and split the patch into two parts: cleanup and
re-enabling XEN_PV and XEN_PVH for X86_5LEVEL.

There's a chance that I screwed something up in the cleanup part. Could
you check that?

-- 
Kirill A. Shutemov
Re: [PATCHv2 08/10] x86/mm: Replace compile-time checks for 5-level with runtime-time
On 25/07/17 11:05, Kirill A. Shutemov wrote:
> On Tue, Jul 18, 2017 at 04:24:06PM +0200, Juergen Gross wrote:
>> Xen PV guests will never run with 5-level-paging enabled. So I guess you
>> can drop the complete if (IS_ENABLED(CONFIG_X86_5LEVEL)) {} block.
>
> There is more code to drop from mmu_pv.c.
>
> But while there, I thought if with boot-time 5-level paging switching we
> can allow kernel to compile with XEN_PV and XEN_PVH, so the kernel image
> can be used in these XEN modes with 4-level paging.
>
> Could you check if with the patch below we can boot in XEN_PV and XEN_PVH
> modes?

We can't. I have used your branch:

git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git
la57/boot-switching/v2

with this patch applied on top.

Doesn't boot PV guest with X86_5LEVEL configured (very early crash).

Doesn't build with X86_5LEVEL not configured:

  AS      arch/x86/kernel/head_64.o
/home/gross/linux/arch/x86/kernel/head_64.S: Assembler messages:
/home/gross/linux/arch/x86/kernel/head_64.S:350: Error: attempt to move .org backwards
/home/gross/linux/arch/x86/kernel/head_64.S:352: Error: attempt to move .org backwards
/home/gross/linux/arch/x86/kernel/head_64.S:43: Error: invalid operands (*ABS* and *UND* sections) for `>>'
/home/gross/linux/arch/x86/kernel/head_64.S:44: Error: invalid operands (*ABS* and *UND* sections) for `>>'
/home/gross/linux/scripts/Makefile.build:403: recipe for target 'arch/x86/kernel/head_64.o' failed
make[7]: *** [arch/x86/kernel/head_64.o] Error 1

Juergen
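The two `>>` errors point at lines 43 and 44 of head_64.S, which in the tested patch are the L4_PAGE_OFFSET and L4_START_KERNEL assignments built on l4_index(). One plausible reading, not confirmed in this thread, is that P4D_SHIFT is not visible to the assembler when X86_5LEVEL is not configured, so the shift count is an undefined symbol (the `*UND*` operand); the `.org` errors may then be a follow-on effect. The sketch below only shows what those expressions should evaluate to once the shift is known. The shift of 39 comes from the "2^39" in the patch's own comment; the two base addresses are the usual 4-level x86-64 layout values and are assumptions here, not quoted from the thread.

```c
#include <stdint.h>

/*
 * Assumed values: a shift of 39 matches the "2^39" in the patch's own
 * comment; the two base addresses are the usual x86-64 4-level layout
 * constants and are assumptions, not quoted from the thread.
 */
#define ASSUMED_L4_SHIFT		39
#define ASSUMED_PUD_SHIFT		30
#define ASSUMED_PAGE_OFFSET_BASE48	UINT64_C(0xffff880000000000)
#define ASSUMED_START_KERNEL_MAP	UINT64_C(0xffffffff80000000)

/* Same shape as the patch's l4_index(): take the 9 bits above the shift. */
static unsigned int l4_index(uint64_t addr)
{
	return (unsigned int)((addr >> ASSUMED_L4_SHIFT) & 511);
}

/* Same shape as pud_index(), assuming 512 entries per table. */
static unsigned int l3_index(uint64_t addr)
{
	return (unsigned int)((addr >> ASSUMED_PUD_SHIFT) & 511);
}
```

Under these assumptions l4_index(ASSUMED_START_KERNEL_MAP) is 511, which agrees with the "= 511" comment in the patch, l4_index(ASSUMED_PAGE_OFFSET_BASE48) is 272, and l3_index(ASSUMED_START_KERNEL_MAP) is 510.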
Re: [PATCHv2 08/10] x86/mm: Replace compile-time checks for 5-level with runtime-time
On Tue, Jul 18, 2017 at 04:24:06PM +0200, Juergen Gross wrote:
> Xen PV guests will never run with 5-level-paging enabled. So I guess you
> can drop the complete if (IS_ENABLED(CONFIG_X86_5LEVEL)) {} block.

There is more code to drop from mmu_pv.c.

But while there, I thought if with boot-time 5-level paging switching we
can allow kernel to compile with XEN_PV and XEN_PVH, so the kernel image
can be used in these XEN modes with 4-level paging.

Could you check if with the patch below we can boot in XEN_PV and XEN_PVH
modes?

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 7ebb56e99389..6d67d3530698 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -37,12 +37,12 @@
  *
  */
 
+#define l4_index(x)	(((x) >> P4D_SHIFT) & 511)
 #define pud_index(x)	(((x) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
 
-#if defined(CONFIG_XEN_PV) || defined(CONFIG_XEN_PVH)
-PGD_PAGE_OFFSET = pgd_index(__PAGE_OFFSET_BASE48)
-PGD_START_KERNEL = pgd_index(__START_KERNEL_map)
-#endif
+L4_PAGE_OFFSET = l4_index(__PAGE_OFFSET_BASE48)
+L4_START_KERNEL = l4_index(__START_KERNEL_map)
+
 L3_START_KERNEL = pud_index(__START_KERNEL_map)
 
 	.text
@@ -347,9 +347,9 @@ NEXT_PAGE(early_dynamic_pgts)
 #if defined(CONFIG_XEN_PV) || defined(CONFIG_XEN_PVH)
 NEXT_PAGE(init_top_pgt)
 	.quad   level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
-	.org    init_top_pgt + PGD_PAGE_OFFSET*8, 0
+	.org    init_top_pgt + L4_PAGE_OFFSET*8, 0
 	.quad   level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
-	.org    init_top_pgt + PGD_START_KERNEL*8, 0
+	.org    init_top_pgt + L4_START_KERNEL*8, 0
 	/* (2^48-(2*1024*1024*1024))/(2^39) = 511 */
 	.quad   level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE
 
diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index 1ecd419811a2..027987638e98 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -17,9 +17,6 @@ config XEN_PV
 	bool "Xen PV guest support"
 	default y
 	depends on XEN
-	# XEN_PV is not ready to work with 5-level paging.
-	# Changes to hypervisor are also required.
-	depends on !X86_5LEVEL
 	select XEN_HAVE_PVMMU
 	select XEN_HAVE_VPMU
 	help
@@ -78,6 +75,4 @@ config XEN_DEBUG_FS
 config XEN_PVH
 	bool "Support for running as a PVH guest"
 	depends on XEN && XEN_PVHVM && ACPI
-	# Pre-built page tables are not ready to handle 5-level paging.
-	depends on !X86_5LEVEL
 	def_bool n
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index b0530184c637..3116649302f2 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -469,7 +469,7 @@ __visible pmd_t xen_make_pmd(pmdval_t pmd)
 }
 PV_CALLEE_SAVE_REGS_THUNK(xen_make_pmd);
 
-#if CONFIG_PGTABLE_LEVELS == 4
+#ifdef CONFIG_X86_64
 __visible pudval_t xen_pud_val(pud_t pud)
 {
 	return pte_mfn_to_pfn(pud.pud);
@@ -558,7 +558,7 @@ static void xen_set_p4d(p4d_t *ptr, p4d_t val)
 
 	xen_mc_issue(PARAVIRT_LAZY_MMU);
 }
-#endif	/* CONFIG_PGTABLE_LEVELS == 4 */
+#endif	/* CONFIG_X86_64 */
 
 static int xen_pmd_walk(struct mm_struct *mm, pmd_t *pmd,
 		int (*func)(struct mm_struct *mm, struct page *, enum pt_level),
@@ -600,21 +600,17 @@ static int xen_p4d_walk(struct mm_struct *mm, p4d_t *p4d,
 		int (*func)(struct mm_struct *mm, struct page *, enum pt_level),
 		bool last, unsigned long limit)
 {
-	int i, nr, flush = 0;
+	int flush = 0;
+	pud_t *pud;
 
-	nr = last ? p4d_index(limit) + 1 : PTRS_PER_P4D;
-	for (i = 0; i < nr; i++) {
-		pud_t *pud;
 
-		if (p4d_none(p4d[i]))
-			continue;
+	if (p4d_none(*p4d))
+		return flush;
 
-		pud = pud_offset(&p4d[i], 0);
-		if (PTRS_PER_PUD > 1)
-			flush |= (*func)(mm, virt_to_page(pud), PT_PUD);
-		flush |= xen_pud_walk(mm, pud, func,
-				last && i == nr - 1, limit);
-	}
+	pud = pud_offset(p4d, 0);
+	if (PTRS_PER_PUD > 1)
+		flush |= (*func)(mm, virt_to_page(pud), PT_PUD);
+	flush |= xen_pud_walk(mm, pud, func, last, limit);
 	return flush;
 }
 
@@ -664,8 +660,6 @@ static int __xen_pgd_walk(struct mm_struct *mm, pgd_t *pgd,
 			continue;
 
 		p4d = p4d_offset(&pgd[i], 0);
-		if (PTRS_PER_P4D > 1)
-			flush |= (*func)(mm, virt_to_page(p4d), PT_P4D);
 		flush |= xen_p4d_walk(mm, p4d, func, i == nr - 1, limit);
 	}
 
@@ -1197,22 +1191,14 @@ static void __init xen_cleanmfnmap(unsigned long vaddr)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
-	unsigned int i;
 	bool unpin;
 
 	unpin = (vaddr == 2 * PGDIR_SIZE);
 	vaddr &= PMD_MASK;
 	pgd = pgd_offset_k(vaddr);
 	p4d = p4d_offset(pgd, 0);
-	for (i = 0; i < PTRS_PER_P4D; i++) {
-
Re: [PATCHv2 08/10] x86/mm: Replace compile-time checks for 5-level with runtime-time
On Tue, Jul 18, 2017 at 04:24:06PM +0200, Juergen Gross wrote:
> On 18/07/17 16:15, Kirill A. Shutemov wrote:
> > diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
> > index cab28cf2cffb..b0530184c637 100644
> > --- a/arch/x86/xen/mmu_pv.c
> > +++ b/arch/x86/xen/mmu_pv.c
> > @@ -1209,7 +1209,7 @@ static void __init xen_cleanmfnmap(unsigned long vaddr)
> >  			continue;
> >  		xen_cleanmfnmap_p4d(p4d + i, unpin);
> >  	}
> > -	if (IS_ENABLED(CONFIG_X86_5LEVEL)) {
> > +	if (!p4d_folded) {
> >  		set_pgd(pgd, __pgd(0));
> >  		xen_cleanmfnmap_free_pgtbl(p4d, unpin);
> >  	}
>
> Xen PV guests will never run with 5-level-paging enabled. So I guess you
> can drop the complete if (IS_ENABLED(CONFIG_X86_5LEVEL)) {} block.

Thanks. I'll do a separate cleanup patch for this.

-- 
Kirill A. Shutemov
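The quoted hunk replaces a compile-time test with a runtime one. As a minimal user-space sketch of that pattern, the dispatcher below picks a helper at run time instead of via `#ifdef`. It assumes, as the patch's usage suggests, that p4d_folded is a flag that is true when the p4d level is folded, i.e. when 4-level paging is in use. The helper names are borrowed from the init_64.c part of the same patch, but their bodies here are placeholders and not the kernel's code.

```c
#include <stdbool.h>

/* Assumed to be set once during early boot; true means 4-level paging. */
static bool p4d_folded = true;

/* Placeholder bodies: each returns the number of paging levels it serves. */
static int sync_global_pgds_57(void) { return 5; }
static int sync_global_pgds_48(void) { return 4; }

/*
 * Runtime dispatch: both variants are compiled into the image and the
 * flag decides which one runs, so a single binary can serve both
 * paging modes.
 */
static int sync_global_pgds(void)
{
	if (!p4d_folded)
		return sync_global_pgds_57();
	return sync_global_pgds_48();
}
```

The cost of this design is that both code paths are always built, which is why configurations that previously excluded one path at compile time (such as XEN_PV) have to be checked again.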
Re: [PATCHv2 08/10] x86/mm: Replace compile-time checks for 5-level with runtime-time
On 18/07/17 16:15, Kirill A. Shutemov wrote:
> This patch converts the CONFIG_X86_5LEVEL checks to runtime checks for
> p4d folding.
>
> Signed-off-by: Kirill A. Shutemov
> ---
>  arch/x86/mm/fault.c            |  2 +-
>  arch/x86/mm/ident_map.c        |  2 +-
>  arch/x86/mm/init_64.c          | 30 ++++++++++++++++++------------
>  arch/x86/mm/kasan_init_64.c    |  8 ++++----
>  arch/x86/mm/kaslr.c            |  6 +++---
>  arch/x86/platform/efi/efi_64.c |  2 +-
>  arch/x86/power/hibernate_64.c  |  6 +++---
>  arch/x86/xen/mmu_pv.c          |  2 +-
>  8 files changed, 32 insertions(+), 26 deletions(-)
>
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 2a1fa10c6a98..d3d8f10f0c10 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -459,7 +459,7 @@ static noinline int vmalloc_fault(unsigned long address)
>  	if (pgd_none(*pgd)) {
>  		set_pgd(pgd, *pgd_ref);
>  		arch_flush_lazy_mmu_mode();
> -	} else if (CONFIG_PGTABLE_LEVELS > 4) {
> +	} else if (!p4d_folded) {
>  		/*
>  		 * With folded p4d, pgd_none() is always false, so the pgd may
>  		 * point to an empty page table entry and pgd_page_vaddr()
> diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
> index adab1595f4bd..d2df33a2cbfb 100644
> --- a/arch/x86/mm/ident_map.c
> +++ b/arch/x86/mm/ident_map.c
> @@ -115,7 +115,7 @@ int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
>  		result = ident_p4d_init(info, p4d, addr, next);
>  		if (result)
>  			return result;
> -		if (IS_ENABLED(CONFIG_X86_5LEVEL)) {
> +		if (!p4d_folded) {
>  			set_pgd(pgd, __pgd(__pa(p4d) | _KERNPG_TABLE));
>  		} else {
>  			/*
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 649b8df485ad..6b97f6c1bf77 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -88,12 +88,7 @@ static int __init nonx32_setup(char *str)
>  }
>  __setup("noexec32=", nonx32_setup);
>
> -/*
> - * When memory was added make sure all the processes MM have
> - * suitable PGD entries in the local PGD level page.
> - */
> -#ifdef CONFIG_X86_5LEVEL
> -void sync_global_pgds(unsigned long start, unsigned long end)
> +static void sync_global_pgds_57(unsigned long start, unsigned long end)
>  {
>  	unsigned long addr;
>
> @@ -129,8 +124,8 @@ void sync_global_pgds(unsigned long start, unsigned long end)
>  		spin_unlock(&pgd_lock);
>  	}
>  }
> -#else
> -void sync_global_pgds(unsigned long start, unsigned long end)
> +
> +static void sync_global_pgds_48(unsigned long start, unsigned long end)
>  {
>  	unsigned long addr;
>
> @@ -173,7 +168,18 @@ void sync_global_pgds(unsigned long start, unsigned long end)
>  		spin_unlock(&pgd_lock);
>  	}
>  }
> -#endif
> +
> +/*
> + * When memory was added make sure all the processes MM have
> + * suitable PGD entries in the local PGD level page.
> + */
> +void sync_global_pgds(unsigned long start, unsigned long end)
> +{
> +	if (!p4d_folded)
> +		sync_global_pgds_57(start, end);
> +	else
> +		sync_global_pgds_48(start, end);
> +}
>
>  /*
>   * NOTE: This function is marked __ref because it calls __init function
> @@ -632,7 +638,7 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
>  	unsigned long vaddr = (unsigned long)__va(paddr);
>  	int i = p4d_index(vaddr);
>
> -	if (!IS_ENABLED(CONFIG_X86_5LEVEL))
> +	if (p4d_folded)
>  		return phys_pud_init((pud_t *) p4d_page, paddr, paddr_end, page_size_mask);
>
>  	for (; i < PTRS_PER_P4D; i++, paddr = paddr_next) {
> @@ -712,7 +718,7 @@ kernel_physical_mapping_init(unsigned long paddr_start,
>  					   page_size_mask);
>
>  		spin_lock(&init_mm.page_table_lock);
> -		if (IS_ENABLED(CONFIG_X86_5LEVEL))
> +		if (!p4d_folded)
>  			pgd_populate(&init_mm, pgd, p4d);
>  		else
>  			p4d_populate(&init_mm, p4d_offset(pgd, vaddr), (pud_t *) p4d);
> @@ -1078,7 +1084,7 @@ remove_p4d_table(p4d_t *p4d_start, unsigned long addr, unsigned long end,
>  	 * 5-level case we should free them. This code will have to change
>  	 * to adapt for boot-time switching between 4 and 5 level page tables.
>  	 */
> -	if (CONFIG_PGTABLE_LEVELS == 5)
> +	if (!p4d_folded)
>  		free_pud_table(pud_base, p4d);
>  }
>
> diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
> index cff8d85fef7b..ee12861e0609 100644
> --- a/arch/x86/mm/kasan_init_64.c
> +++ b/arch/x86/mm/kasan_init_64.c
> @@ -40,7 +40,7 @@ static void __init clear_pgds(unsigned long start,
>  		 * With folded p4d, pgd_clear() is nop, use p4d_clear()