Re: [PATCH v3 0/3] Off-load TLB invalidations to host for !GTSE
Bharata B Rao writes: > On Fri, Jul 17, 2020 at 12:44:00PM +1000, Nicholas Piggin wrote: >> Excerpts from Nicholas Piggin's message of July 17, 2020 12:08 pm: >> > Excerpts from Qian Cai's message of July 17, 2020 3:27 am: >> >> On Fri, Jul 03, 2020 at 11:06:05AM +0530, Bharata B Rao wrote: >> >>> Hypervisor may choose not to enable Guest Translation Shootdown Enable >> >>> (GTSE) option for the guest. When GTSE isn't ON, the guest OS isn't >> >>> permitted to use instructions like tblie and tlbsync directly, but is >> >>> expected to make hypervisor calls to get the TLB flushed. >> >>> >> >>> This series enables the TLB flush routines in the radix code to >> >>> off-load TLB flushing to hypervisor via the newly proposed hcall >> >>> H_RPT_INVALIDATE. >> >>> >> >>> To easily check the availability of GTSE, it is made an MMU feature. >> >>> The OV5 handling and H_REGISTER_PROC_TBL hcall are changed to >> >>> handle GTSE as an optionally available feature and to not assume GTSE >> >>> when radix support is available. >> >>> >> >>> The actual hcall implementation for KVM isn't included in this >> >>> patchset and will be posted separately. >> >>> >> >>> Changes in v3 >> >>> = >> >>> - Fixed a bug in the hcall wrapper code where we were missing setting >> >>> H_RPTI_TYPE_NESTED while retrying the failed flush request with >> >>> a full flush for the nested case. >> >>> - s/psize_to_h_rpti/psize_to_rpti_pgsize >> >>> >> >>> v2: >> >>> https://lore.kernel.org/linuxppc-dev/20200626131000.5207-1-bhar...@linux.ibm.com/T/#t >> >>> >> >>> Bharata B Rao (2): >> >>> powerpc/mm: Enable radix GTSE only if supported. >> >>> powerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if >> >>> enabled >> >>> >> >>> Nicholas Piggin (1): >> >>> powerpc/mm/book3s64/radix: Off-load TLB invalidations to host when >> >>> !GTSE >> >> >> >> Reverting the whole series fixed random memory corruptions during boot on >> >> POWER9 PowerNV systems below. >> > >> > If I s/mmu_has_feature(MMU_FTR_GTSE)/(1)/g in radix_tlb.c, then the .o >> > disasm is the same as reverting my patch. >> > >> > Feature bits not being set right? PowerNV should be pretty simple, seems >> > to do the same as FTR_TYPE_RADIX. >> >> Might need this fix >> >> --- >> >> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c >> index 9cc49f265c86..54c9bcea9d4e 100644 >> --- a/arch/powerpc/kernel/prom.c >> +++ b/arch/powerpc/kernel/prom.c >> @@ -163,7 +163,7 @@ static struct ibm_pa_feature { >> { .pabyte = 0, .pabit = 6, .cpu_features = CPU_FTR_NOEXECUTE }, >> { .pabyte = 1, .pabit = 2, .mmu_features = MMU_FTR_CI_LARGE_PAGE }, >> #ifdef CONFIG_PPC_RADIX_MMU >> -{ .pabyte = 40, .pabit = 0, .mmu_features = MMU_FTR_TYPE_RADIX }, >> +{ .pabyte = 40, .pabit = 0, .mmu_features = (MMU_FTR_TYPE_RADIX | >> MMU_FTR_GTSE) }, >> #endif >> { .pabyte = 1, .pabit = 1, .invert = 1, .cpu_features = >> CPU_FTR_NODSISRALIGN }, >> { .pabyte = 5, .pabit = 0, .cpu_features = CPU_FTR_REAL_LE, > > Michael - Let me know if this should be folded into 1/3 and the complete > series resent. No it's already merged. Please send a proper patch with a Fixes: tag and a change log that explains the details. cheers
Re: [PATCH v3 0/3] Off-load TLB invalidations to host for !GTSE
On Fri, Jul 17, 2020 at 12:44:00PM +1000, Nicholas Piggin wrote: > Excerpts from Nicholas Piggin's message of July 17, 2020 12:08 pm: > > Excerpts from Qian Cai's message of July 17, 2020 3:27 am: > >> On Fri, Jul 03, 2020 at 11:06:05AM +0530, Bharata B Rao wrote: > >>> Hypervisor may choose not to enable Guest Translation Shootdown Enable > >>> (GTSE) option for the guest. When GTSE isn't ON, the guest OS isn't > >>> permitted to use instructions like tblie and tlbsync directly, but is > >>> expected to make hypervisor calls to get the TLB flushed. > >>> > >>> This series enables the TLB flush routines in the radix code to > >>> off-load TLB flushing to hypervisor via the newly proposed hcall > >>> H_RPT_INVALIDATE. > >>> > >>> To easily check the availability of GTSE, it is made an MMU feature. > >>> The OV5 handling and H_REGISTER_PROC_TBL hcall are changed to > >>> handle GTSE as an optionally available feature and to not assume GTSE > >>> when radix support is available. > >>> > >>> The actual hcall implementation for KVM isn't included in this > >>> patchset and will be posted separately. > >>> > >>> Changes in v3 > >>> = > >>> - Fixed a bug in the hcall wrapper code where we were missing setting > >>> H_RPTI_TYPE_NESTED while retrying the failed flush request with > >>> a full flush for the nested case. > >>> - s/psize_to_h_rpti/psize_to_rpti_pgsize > >>> > >>> v2: > >>> https://lore.kernel.org/linuxppc-dev/20200626131000.5207-1-bhar...@linux.ibm.com/T/#t > >>> > >>> Bharata B Rao (2): > >>> powerpc/mm: Enable radix GTSE only if supported. > >>> powerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if > >>> enabled > >>> > >>> Nicholas Piggin (1): > >>> powerpc/mm/book3s64/radix: Off-load TLB invalidations to host when > >>> !GTSE > >> > >> Reverting the whole series fixed random memory corruptions during boot on > >> POWER9 PowerNV systems below. > > > > If I s/mmu_has_feature(MMU_FTR_GTSE)/(1)/g in radix_tlb.c, then the .o > > disasm is the same as reverting my patch. > > > > Feature bits not being set right? PowerNV should be pretty simple, seems > > to do the same as FTR_TYPE_RADIX. > > Might need this fix > > --- > > diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c > index 9cc49f265c86..54c9bcea9d4e 100644 > --- a/arch/powerpc/kernel/prom.c > +++ b/arch/powerpc/kernel/prom.c > @@ -163,7 +163,7 @@ static struct ibm_pa_feature { > { .pabyte = 0, .pabit = 6, .cpu_features = CPU_FTR_NOEXECUTE }, > { .pabyte = 1, .pabit = 2, .mmu_features = MMU_FTR_CI_LARGE_PAGE }, > #ifdef CONFIG_PPC_RADIX_MMU > - { .pabyte = 40, .pabit = 0, .mmu_features = MMU_FTR_TYPE_RADIX }, > + { .pabyte = 40, .pabit = 0, .mmu_features = (MMU_FTR_TYPE_RADIX | > MMU_FTR_GTSE) }, > #endif > { .pabyte = 1, .pabit = 1, .invert = 1, .cpu_features = > CPU_FTR_NODSISRALIGN }, > { .pabyte = 5, .pabit = 0, .cpu_features = CPU_FTR_REAL_LE, Michael - Let me know if this should be folded into 1/3 and the complete series resent. Regards, Bharata.
Re: [PATCH v3 0/3] Off-load TLB invalidations to host for !GTSE
Excerpts from Nicholas Piggin's message of July 17, 2020 12:08 pm: > Excerpts from Qian Cai's message of July 17, 2020 3:27 am: >> On Fri, Jul 03, 2020 at 11:06:05AM +0530, Bharata B Rao wrote: >>> Hypervisor may choose not to enable Guest Translation Shootdown Enable >>> (GTSE) option for the guest. When GTSE isn't ON, the guest OS isn't >>> permitted to use instructions like tblie and tlbsync directly, but is >>> expected to make hypervisor calls to get the TLB flushed. >>> >>> This series enables the TLB flush routines in the radix code to >>> off-load TLB flushing to hypervisor via the newly proposed hcall >>> H_RPT_INVALIDATE. >>> >>> To easily check the availability of GTSE, it is made an MMU feature. >>> The OV5 handling and H_REGISTER_PROC_TBL hcall are changed to >>> handle GTSE as an optionally available feature and to not assume GTSE >>> when radix support is available. >>> >>> The actual hcall implementation for KVM isn't included in this >>> patchset and will be posted separately. >>> >>> Changes in v3 >>> = >>> - Fixed a bug in the hcall wrapper code where we were missing setting >>> H_RPTI_TYPE_NESTED while retrying the failed flush request with >>> a full flush for the nested case. >>> - s/psize_to_h_rpti/psize_to_rpti_pgsize >>> >>> v2: >>> https://lore.kernel.org/linuxppc-dev/20200626131000.5207-1-bhar...@linux.ibm.com/T/#t >>> >>> Bharata B Rao (2): >>> powerpc/mm: Enable radix GTSE only if supported. >>> powerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if >>> enabled >>> >>> Nicholas Piggin (1): >>> powerpc/mm/book3s64/radix: Off-load TLB invalidations to host when >>> !GTSE >> >> Reverting the whole series fixed random memory corruptions during boot on >> POWER9 PowerNV systems below. > > If I s/mmu_has_feature(MMU_FTR_GTSE)/(1)/g in radix_tlb.c, then the .o > disasm is the same as reverting my patch. > > Feature bits not being set right? PowerNV should be pretty simple, seems > to do the same as FTR_TYPE_RADIX. Might need this fix --- diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c index 9cc49f265c86..54c9bcea9d4e 100644 --- a/arch/powerpc/kernel/prom.c +++ b/arch/powerpc/kernel/prom.c @@ -163,7 +163,7 @@ static struct ibm_pa_feature { { .pabyte = 0, .pabit = 6, .cpu_features = CPU_FTR_NOEXECUTE }, { .pabyte = 1, .pabit = 2, .mmu_features = MMU_FTR_CI_LARGE_PAGE }, #ifdef CONFIG_PPC_RADIX_MMU - { .pabyte = 40, .pabit = 0, .mmu_features = MMU_FTR_TYPE_RADIX }, + { .pabyte = 40, .pabit = 0, .mmu_features = (MMU_FTR_TYPE_RADIX | MMU_FTR_GTSE) }, #endif { .pabyte = 1, .pabit = 1, .invert = 1, .cpu_features = CPU_FTR_NODSISRALIGN }, { .pabyte = 5, .pabit = 0, .cpu_features = CPU_FTR_REAL_LE,
Re: [PATCH v3 0/3] Off-load TLB invalidations to host for !GTSE
Excerpts from Qian Cai's message of July 17, 2020 3:27 am: > On Fri, Jul 03, 2020 at 11:06:05AM +0530, Bharata B Rao wrote: >> Hypervisor may choose not to enable Guest Translation Shootdown Enable >> (GTSE) option for the guest. When GTSE isn't ON, the guest OS isn't >> permitted to use instructions like tblie and tlbsync directly, but is >> expected to make hypervisor calls to get the TLB flushed. >> >> This series enables the TLB flush routines in the radix code to >> off-load TLB flushing to hypervisor via the newly proposed hcall >> H_RPT_INVALIDATE. >> >> To easily check the availability of GTSE, it is made an MMU feature. >> The OV5 handling and H_REGISTER_PROC_TBL hcall are changed to >> handle GTSE as an optionally available feature and to not assume GTSE >> when radix support is available. >> >> The actual hcall implementation for KVM isn't included in this >> patchset and will be posted separately. >> >> Changes in v3 >> = >> - Fixed a bug in the hcall wrapper code where we were missing setting >> H_RPTI_TYPE_NESTED while retrying the failed flush request with >> a full flush for the nested case. >> - s/psize_to_h_rpti/psize_to_rpti_pgsize >> >> v2: >> https://lore.kernel.org/linuxppc-dev/20200626131000.5207-1-bhar...@linux.ibm.com/T/#t >> >> Bharata B Rao (2): >> powerpc/mm: Enable radix GTSE only if supported. >> powerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if >> enabled >> >> Nicholas Piggin (1): >> powerpc/mm/book3s64/radix: Off-load TLB invalidations to host when >> !GTSE > > Reverting the whole series fixed random memory corruptions during boot on > POWER9 PowerNV systems below. If I s/mmu_has_feature(MMU_FTR_GTSE)/(1)/g in radix_tlb.c, then the .o disasm is the same as reverting my patch. Feature bits not being set right? PowerNV should be pretty simple, seems to do the same as FTR_TYPE_RADIX. So... test being done before static keys are set up? Shouldn't be. Must be something obvious I just can't see it. Thanks, Nick
Re: [PATCH v3 0/3] Off-load TLB invalidations to host for !GTSE
Hi all, On Thu, 16 Jul 2020 13:27:14 -0400 Qian Cai wrote: > > Reverting the whole series fixed random memory corruptions during boot on > POWER9 PowerNV systems below. I will revert those commits from linux-next today as well (they revert cleanly). -- Cheers, Stephen Rothwell pgpFh42TZjw5z.pgp Description: OpenPGP digital signature
Re: [PATCH v3 0/3] Off-load TLB invalidations to host for !GTSE
On Fri, Jul 03, 2020 at 11:06:05AM +0530, Bharata B Rao wrote: > Hypervisor may choose not to enable Guest Translation Shootdown Enable > (GTSE) option for the guest. When GTSE isn't ON, the guest OS isn't > permitted to use instructions like tblie and tlbsync directly, but is > expected to make hypervisor calls to get the TLB flushed. > > This series enables the TLB flush routines in the radix code to > off-load TLB flushing to hypervisor via the newly proposed hcall > H_RPT_INVALIDATE. > > To easily check the availability of GTSE, it is made an MMU feature. > The OV5 handling and H_REGISTER_PROC_TBL hcall are changed to > handle GTSE as an optionally available feature and to not assume GTSE > when radix support is available. > > The actual hcall implementation for KVM isn't included in this > patchset and will be posted separately. > > Changes in v3 > = > - Fixed a bug in the hcall wrapper code where we were missing setting > H_RPTI_TYPE_NESTED while retrying the failed flush request with > a full flush for the nested case. > - s/psize_to_h_rpti/psize_to_rpti_pgsize > > v2: > https://lore.kernel.org/linuxppc-dev/20200626131000.5207-1-bhar...@linux.ibm.com/T/#t > > Bharata B Rao (2): > powerpc/mm: Enable radix GTSE only if supported. > powerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if > enabled > > Nicholas Piggin (1): > powerpc/mm/book3s64/radix: Off-load TLB invalidations to host when > !GTSE Reverting the whole series fixed random memory corruptions during boot on POWER9 PowerNV systems below. IBM 8335-GTH (ibm,witherspoon) POWER9, altivec supported 262144 MB memory, 2000 GB disk space .config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config [9.338996][ T925] BUG: Unable to handle kernel instruction fetch (NULL pointer?) [9.339026][ T925] Faulting instruction address: 0x [9.339051][ T925] Oops: Kernel access of bad area, sig: 11 [#1] [9.339064][ T925] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 NUMA PowerNV [9.339098][ T925] Modules linked in: dm_mirror dm_region_hash dm_log dm_mod [9.339150][ T925] CPU: 92 PID: 925 Comm: (md-udevd) Not tainted 5.8.0-rc5-next-20200716 #3 [9.339186][ T925] NIP: LR: c021f2cc CTR: [9.339210][ T925] REGS: c000201cb52d79b0 TRAP: 0400 Not tainted (5.8.0-rc5-next-20200716) [9.339244][ T925] MSR: 900040009033 CR: 2492 XER: [9.339278][ T925] CFAR: c021f2c8 IRQMASK: 0 [9.339278][ T925] GPR00: c021f2cc c000201cb52d7c40 c5901000 c000201cb52d7ca8 [9.339278][ T925] GPR04: c0080ea60038 7fff 7fff [9.339278][ T925] GPR08: c000201cb50bd500 0003 [9.339278][ T925] GPR12: c000201fff694500 7fffa4a8a940 7fffa4a8a6c8 [9.339278][ T925] GPR16: 7fffa4a8a8f8 7fffa4a8a650 7fffa4a8a488 [9.339278][ T925] GPR20: 00050001 7fffa4a8a984 7fff ca4545cc [9.339278][ T925] GPR24: c0affe28 0166 [9.339278][ T925] GPR28: c000201cb52d7ca8 c0080ea6 c000201cc3b72600 7fff [9.339493][ T925] NIP [] 0x0 [9.339516][ T925] LR [c021f2cc] __seccomp_filter+0xec/0x530 bpf_dispatcher_nop_func at include/linux/bpf.h:567 (inlined by) bpf_prog_run_pin_on_cpu at include/linux/filter.h:597 (inlined by) seccomp_run_filters at kernel/seccomp.c:324 (inlined by) __seccomp_filter at kernel/seccomp.c:937 [9.339538][ T925] Call Trace: [9.339548][ T925] [c000201cb52d7c40] [c021f2cc] __seccomp_filter+0xec/0x530 (unreliable) [9.339566][ T925] [c000201cb52d7d50] [c0025af8] do_syscall_trace_enter+0xb8/0x470 do_seccomp at arch/powerpc/kernel/ptrace/ptrace.c:252 (inlined by) do_syscall_trace_enter at arch/powerpc/kernel/ptrace/ptrace.c:327 [9.339600][ T925] [c000201cb52d7dc0] [c002c8f8] system_call_exception+0x138/0x180 [9.339625][ T925] [c000201cb52d7e20] [c000c9e8] system_call_common+0xe8/0x214 [9.339648][ T925] Instruction dump: [9.339667][ T925] [9.339706][ T925] [9.339748][ T925] ---[ end trace d89eb80f9a6bc141 ]--- [ OK ] Started Journal Service. [ 10.452364][ T925] Kernel panic - not syncing: Fatal exception [ 11.876655][ T925] ---[ end Kernel panic - not syncing: Fatal exception ]--- There could also be lots of random userspace segfault like, [ 16.463545][ T771] rngd[771]: segfault (11) at 0 nip 0 lr 0 code 1 in rngd[106d6+2] [ 16.463620][ T771] rngd[771]: code: