Re: [PATCH v3 0/3] Off-load TLB invalidations to host for !GTSE

2020-07-18 Thread Michael Ellerman
Bharata B Rao  writes:
> On Fri, Jul 17, 2020 at 12:44:00PM +1000, Nicholas Piggin wrote:
>> Excerpts from Nicholas Piggin's message of July 17, 2020 12:08 pm:
>> > Excerpts from Qian Cai's message of July 17, 2020 3:27 am:
>> >> On Fri, Jul 03, 2020 at 11:06:05AM +0530, Bharata B Rao wrote:
>> >>> Hypervisor may choose not to enable Guest Translation Shootdown Enable
>> >>> (GTSE) option for the guest. When GTSE isn't ON, the guest OS isn't
>> >>> permitted to use instructions like tblie and tlbsync directly, but is
>> >>> expected to make hypervisor calls to get the TLB flushed.
>> >>> 
>> >>> This series enables the TLB flush routines in the radix code to
>> >>> off-load TLB flushing to hypervisor via the newly proposed hcall
>> >>> H_RPT_INVALIDATE. 
>> >>> 
>> >>> To easily check the availability of GTSE, it is made an MMU feature.
>> >>> The OV5 handling and H_REGISTER_PROC_TBL hcall are changed to
>> >>> handle GTSE as an optionally available feature and to not assume GTSE
>> >>> when radix support is available.
>> >>> 
>> >>> The actual hcall implementation for KVM isn't included in this
>> >>> patchset and will be posted separately.
>> >>> 
>> >>> Changes in v3
>> >>> =
>> >>> - Fixed a bug in the hcall wrapper code where we were missing setting
>> >>>   H_RPTI_TYPE_NESTED while retrying the failed flush request with
>> >>>   a full flush for the nested case.
>> >>> - s/psize_to_h_rpti/psize_to_rpti_pgsize
>> >>> 
>> >>> v2: 
>> >>> https://lore.kernel.org/linuxppc-dev/20200626131000.5207-1-bhar...@linux.ibm.com/T/#t
>> >>> 
>> >>> Bharata B Rao (2):
>> >>>   powerpc/mm: Enable radix GTSE only if supported.
>> >>>   powerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if
>> >>> enabled
>> >>> 
>> >>> Nicholas Piggin (1):
>> >>>   powerpc/mm/book3s64/radix: Off-load TLB invalidations to host when
>> >>> !GTSE
>> >> 
>> >> Reverting the whole series fixed random memory corruptions during boot on
>> >> POWER9 PowerNV systems below.
>> > 
>> > If I s/mmu_has_feature(MMU_FTR_GTSE)/(1)/g in radix_tlb.c, then the .o
>> > disasm is the same as reverting my patch.
>> > 
>> > Feature bits not being set right? PowerNV should be pretty simple, seems
>> > to do the same as FTR_TYPE_RADIX.
>> 
>> Might need this fix
>> 
>> ---
>> 
>> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
>> index 9cc49f265c86..54c9bcea9d4e 100644
>> --- a/arch/powerpc/kernel/prom.c
>> +++ b/arch/powerpc/kernel/prom.c
>> @@ -163,7 +163,7 @@ static struct ibm_pa_feature {
>>  { .pabyte = 0,  .pabit = 6, .cpu_features  = CPU_FTR_NOEXECUTE },
>>  { .pabyte = 1,  .pabit = 2, .mmu_features  = MMU_FTR_CI_LARGE_PAGE },
>>  #ifdef CONFIG_PPC_RADIX_MMU
>> -{ .pabyte = 40, .pabit = 0, .mmu_features  = MMU_FTR_TYPE_RADIX },
>> +{ .pabyte = 40, .pabit = 0, .mmu_features  = (MMU_FTR_TYPE_RADIX | 
>> MMU_FTR_GTSE) },
>>  #endif
>>  { .pabyte = 1,  .pabit = 1, .invert = 1, .cpu_features = 
>> CPU_FTR_NODSISRALIGN },
>>  { .pabyte = 5,  .pabit = 0, .cpu_features  = CPU_FTR_REAL_LE,
>
> Michael - Let me know if this should be folded into 1/3 and the complete
> series resent.

No it's already merged.

Please send a proper patch with a Fixes: tag and a change log that
explains the details.

cheers


Re: [PATCH v3 0/3] Off-load TLB invalidations to host for !GTSE

2020-07-16 Thread Bharata B Rao
On Fri, Jul 17, 2020 at 12:44:00PM +1000, Nicholas Piggin wrote:
> Excerpts from Nicholas Piggin's message of July 17, 2020 12:08 pm:
> > Excerpts from Qian Cai's message of July 17, 2020 3:27 am:
> >> On Fri, Jul 03, 2020 at 11:06:05AM +0530, Bharata B Rao wrote:
> >>> Hypervisor may choose not to enable Guest Translation Shootdown Enable
> >>> (GTSE) option for the guest. When GTSE isn't ON, the guest OS isn't
> >>> permitted to use instructions like tblie and tlbsync directly, but is
> >>> expected to make hypervisor calls to get the TLB flushed.
> >>> 
> >>> This series enables the TLB flush routines in the radix code to
> >>> off-load TLB flushing to hypervisor via the newly proposed hcall
> >>> H_RPT_INVALIDATE. 
> >>> 
> >>> To easily check the availability of GTSE, it is made an MMU feature.
> >>> The OV5 handling and H_REGISTER_PROC_TBL hcall are changed to
> >>> handle GTSE as an optionally available feature and to not assume GTSE
> >>> when radix support is available.
> >>> 
> >>> The actual hcall implementation for KVM isn't included in this
> >>> patchset and will be posted separately.
> >>> 
> >>> Changes in v3
> >>> =
> >>> - Fixed a bug in the hcall wrapper code where we were missing setting
> >>>   H_RPTI_TYPE_NESTED while retrying the failed flush request with
> >>>   a full flush for the nested case.
> >>> - s/psize_to_h_rpti/psize_to_rpti_pgsize
> >>> 
> >>> v2: 
> >>> https://lore.kernel.org/linuxppc-dev/20200626131000.5207-1-bhar...@linux.ibm.com/T/#t
> >>> 
> >>> Bharata B Rao (2):
> >>>   powerpc/mm: Enable radix GTSE only if supported.
> >>>   powerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if
> >>> enabled
> >>> 
> >>> Nicholas Piggin (1):
> >>>   powerpc/mm/book3s64/radix: Off-load TLB invalidations to host when
> >>> !GTSE
> >> 
> >> Reverting the whole series fixed random memory corruptions during boot on
> >> POWER9 PowerNV systems below.
> > 
> > If I s/mmu_has_feature(MMU_FTR_GTSE)/(1)/g in radix_tlb.c, then the .o
> > disasm is the same as reverting my patch.
> > 
> > Feature bits not being set right? PowerNV should be pretty simple, seems
> > to do the same as FTR_TYPE_RADIX.
> 
> Might need this fix
> 
> ---
> 
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index 9cc49f265c86..54c9bcea9d4e 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -163,7 +163,7 @@ static struct ibm_pa_feature {
>   { .pabyte = 0,  .pabit = 6, .cpu_features  = CPU_FTR_NOEXECUTE },
>   { .pabyte = 1,  .pabit = 2, .mmu_features  = MMU_FTR_CI_LARGE_PAGE },
>  #ifdef CONFIG_PPC_RADIX_MMU
> - { .pabyte = 40, .pabit = 0, .mmu_features  = MMU_FTR_TYPE_RADIX },
> + { .pabyte = 40, .pabit = 0, .mmu_features  = (MMU_FTR_TYPE_RADIX | 
> MMU_FTR_GTSE) },
>  #endif
>   { .pabyte = 1,  .pabit = 1, .invert = 1, .cpu_features = 
> CPU_FTR_NODSISRALIGN },
>   { .pabyte = 5,  .pabit = 0, .cpu_features  = CPU_FTR_REAL_LE,

Michael - Let me know if this should be folded into 1/3 and the complete
series resent.

Regards,
Bharata.


Re: [PATCH v3 0/3] Off-load TLB invalidations to host for !GTSE

2020-07-16 Thread Nicholas Piggin
Excerpts from Nicholas Piggin's message of July 17, 2020 12:08 pm:
> Excerpts from Qian Cai's message of July 17, 2020 3:27 am:
>> On Fri, Jul 03, 2020 at 11:06:05AM +0530, Bharata B Rao wrote:
>>> Hypervisor may choose not to enable Guest Translation Shootdown Enable
>>> (GTSE) option for the guest. When GTSE isn't ON, the guest OS isn't
>>> permitted to use instructions like tblie and tlbsync directly, but is
>>> expected to make hypervisor calls to get the TLB flushed.
>>> 
>>> This series enables the TLB flush routines in the radix code to
>>> off-load TLB flushing to hypervisor via the newly proposed hcall
>>> H_RPT_INVALIDATE. 
>>> 
>>> To easily check the availability of GTSE, it is made an MMU feature.
>>> The OV5 handling and H_REGISTER_PROC_TBL hcall are changed to
>>> handle GTSE as an optionally available feature and to not assume GTSE
>>> when radix support is available.
>>> 
>>> The actual hcall implementation for KVM isn't included in this
>>> patchset and will be posted separately.
>>> 
>>> Changes in v3
>>> =
>>> - Fixed a bug in the hcall wrapper code where we were missing setting
>>>   H_RPTI_TYPE_NESTED while retrying the failed flush request with
>>>   a full flush for the nested case.
>>> - s/psize_to_h_rpti/psize_to_rpti_pgsize
>>> 
>>> v2: 
>>> https://lore.kernel.org/linuxppc-dev/20200626131000.5207-1-bhar...@linux.ibm.com/T/#t
>>> 
>>> Bharata B Rao (2):
>>>   powerpc/mm: Enable radix GTSE only if supported.
>>>   powerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if
>>> enabled
>>> 
>>> Nicholas Piggin (1):
>>>   powerpc/mm/book3s64/radix: Off-load TLB invalidations to host when
>>> !GTSE
>> 
>> Reverting the whole series fixed random memory corruptions during boot on
>> POWER9 PowerNV systems below.
> 
> If I s/mmu_has_feature(MMU_FTR_GTSE)/(1)/g in radix_tlb.c, then the .o
> disasm is the same as reverting my patch.
> 
> Feature bits not being set right? PowerNV should be pretty simple, seems
> to do the same as FTR_TYPE_RADIX.

Might need this fix

---

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 9cc49f265c86..54c9bcea9d4e 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -163,7 +163,7 @@ static struct ibm_pa_feature {
{ .pabyte = 0,  .pabit = 6, .cpu_features  = CPU_FTR_NOEXECUTE },
{ .pabyte = 1,  .pabit = 2, .mmu_features  = MMU_FTR_CI_LARGE_PAGE },
 #ifdef CONFIG_PPC_RADIX_MMU
-   { .pabyte = 40, .pabit = 0, .mmu_features  = MMU_FTR_TYPE_RADIX },
+   { .pabyte = 40, .pabit = 0, .mmu_features  = (MMU_FTR_TYPE_RADIX | 
MMU_FTR_GTSE) },
 #endif
{ .pabyte = 1,  .pabit = 1, .invert = 1, .cpu_features = 
CPU_FTR_NODSISRALIGN },
{ .pabyte = 5,  .pabit = 0, .cpu_features  = CPU_FTR_REAL_LE,


Re: [PATCH v3 0/3] Off-load TLB invalidations to host for !GTSE

2020-07-16 Thread Nicholas Piggin
Excerpts from Qian Cai's message of July 17, 2020 3:27 am:
> On Fri, Jul 03, 2020 at 11:06:05AM +0530, Bharata B Rao wrote:
>> Hypervisor may choose not to enable Guest Translation Shootdown Enable
>> (GTSE) option for the guest. When GTSE isn't ON, the guest OS isn't
>> permitted to use instructions like tblie and tlbsync directly, but is
>> expected to make hypervisor calls to get the TLB flushed.
>> 
>> This series enables the TLB flush routines in the radix code to
>> off-load TLB flushing to hypervisor via the newly proposed hcall
>> H_RPT_INVALIDATE. 
>> 
>> To easily check the availability of GTSE, it is made an MMU feature.
>> The OV5 handling and H_REGISTER_PROC_TBL hcall are changed to
>> handle GTSE as an optionally available feature and to not assume GTSE
>> when radix support is available.
>> 
>> The actual hcall implementation for KVM isn't included in this
>> patchset and will be posted separately.
>> 
>> Changes in v3
>> =
>> - Fixed a bug in the hcall wrapper code where we were missing setting
>>   H_RPTI_TYPE_NESTED while retrying the failed flush request with
>>   a full flush for the nested case.
>> - s/psize_to_h_rpti/psize_to_rpti_pgsize
>> 
>> v2: 
>> https://lore.kernel.org/linuxppc-dev/20200626131000.5207-1-bhar...@linux.ibm.com/T/#t
>> 
>> Bharata B Rao (2):
>>   powerpc/mm: Enable radix GTSE only if supported.
>>   powerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if
>> enabled
>> 
>> Nicholas Piggin (1):
>>   powerpc/mm/book3s64/radix: Off-load TLB invalidations to host when
>> !GTSE
> 
> Reverting the whole series fixed random memory corruptions during boot on
> POWER9 PowerNV systems below.

If I s/mmu_has_feature(MMU_FTR_GTSE)/(1)/g in radix_tlb.c, then the .o
disasm is the same as reverting my patch.

Feature bits not being set right? PowerNV should be pretty simple, seems
to do the same as FTR_TYPE_RADIX.

So... test being done before static keys are set up? Shouldn't be. Must
be something obvious I just can't see it.

Thanks,
Nick



Re: [PATCH v3 0/3] Off-load TLB invalidations to host for !GTSE

2020-07-16 Thread Stephen Rothwell
Hi all,

On Thu, 16 Jul 2020 13:27:14 -0400 Qian Cai  wrote:
>
> Reverting the whole series fixed random memory corruptions during boot on
> POWER9 PowerNV systems below.

I will revert those commits from linux-next today as well (they revert
cleanly).

-- 
Cheers,
Stephen Rothwell


pgpFh42TZjw5z.pgp
Description: OpenPGP digital signature


Re: [PATCH v3 0/3] Off-load TLB invalidations to host for !GTSE

2020-07-16 Thread Qian Cai
On Fri, Jul 03, 2020 at 11:06:05AM +0530, Bharata B Rao wrote:
> Hypervisor may choose not to enable Guest Translation Shootdown Enable
> (GTSE) option for the guest. When GTSE isn't ON, the guest OS isn't
> permitted to use instructions like tblie and tlbsync directly, but is
> expected to make hypervisor calls to get the TLB flushed.
> 
> This series enables the TLB flush routines in the radix code to
> off-load TLB flushing to hypervisor via the newly proposed hcall
> H_RPT_INVALIDATE. 
> 
> To easily check the availability of GTSE, it is made an MMU feature.
> The OV5 handling and H_REGISTER_PROC_TBL hcall are changed to
> handle GTSE as an optionally available feature and to not assume GTSE
> when radix support is available.
> 
> The actual hcall implementation for KVM isn't included in this
> patchset and will be posted separately.
> 
> Changes in v3
> =
> - Fixed a bug in the hcall wrapper code where we were missing setting
>   H_RPTI_TYPE_NESTED while retrying the failed flush request with
>   a full flush for the nested case.
> - s/psize_to_h_rpti/psize_to_rpti_pgsize
> 
> v2: 
> https://lore.kernel.org/linuxppc-dev/20200626131000.5207-1-bhar...@linux.ibm.com/T/#t
> 
> Bharata B Rao (2):
>   powerpc/mm: Enable radix GTSE only if supported.
>   powerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if
> enabled
> 
> Nicholas Piggin (1):
>   powerpc/mm/book3s64/radix: Off-load TLB invalidations to host when
> !GTSE

Reverting the whole series fixed random memory corruptions during boot on
POWER9 PowerNV systems below.

IBM 8335-GTH (ibm,witherspoon)
POWER9, altivec supported
262144 MB memory, 2000 GB disk space

.config:
https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config

[9.338996][  T925] BUG: Unable to handle kernel instruction fetch (NULL 
pointer?)
[9.339026][  T925] Faulting instruction address: 0x
[9.339051][  T925] Oops: Kernel access of bad area, sig: 11 [#1]
[9.339064][  T925] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 NUMA PowerNV
[9.339098][  T925] Modules linked in: dm_mirror dm_region_hash dm_log dm_mod
[9.339150][  T925] CPU: 92 PID: 925 Comm: (md-udevd) Not tainted 
5.8.0-rc5-next-20200716 #3
[9.339186][  T925] NIP:   LR: c021f2cc CTR: 

[9.339210][  T925] REGS: c000201cb52d79b0 TRAP: 0400   Not tainted  
(5.8.0-rc5-next-20200716)
[9.339244][  T925] MSR:  900040009033   CR: 
2492  XER: 
[9.339278][  T925] CFAR: c021f2c8 IRQMASK: 0 
[9.339278][  T925] GPR00: c021f2cc c000201cb52d7c40 
c5901000 c000201cb52d7ca8 
[9.339278][  T925] GPR04: c0080ea60038  
7fff 7fff 
[9.339278][  T925] GPR08:   
c000201cb50bd500 0003 
[9.339278][  T925] GPR12:  c000201fff694500 
7fffa4a8a940 7fffa4a8a6c8 
[9.339278][  T925] GPR16: 7fffa4a8a8f8 7fffa4a8a650 
7fffa4a8a488  
[9.339278][  T925] GPR20: 00050001 7fffa4a8a984 
7fff ca4545cc 
[9.339278][  T925] GPR24: c0affe28  
 0166 
[9.339278][  T925] GPR28: c000201cb52d7ca8 c0080ea6 
c000201cc3b72600 7fff 
[9.339493][  T925] NIP [] 0x0
[9.339516][  T925] LR [c021f2cc] __seccomp_filter+0xec/0x530
bpf_dispatcher_nop_func at include/linux/bpf.h:567
(inlined by) bpf_prog_run_pin_on_cpu at include/linux/filter.h:597
(inlined by) seccomp_run_filters at kernel/seccomp.c:324
(inlined by) __seccomp_filter at kernel/seccomp.c:937
[9.339538][  T925] Call Trace:
[9.339548][  T925] [c000201cb52d7c40] [c021f2cc] 
__seccomp_filter+0xec/0x530 (unreliable)
[9.339566][  T925] [c000201cb52d7d50] [c0025af8] 
do_syscall_trace_enter+0xb8/0x470
do_seccomp at arch/powerpc/kernel/ptrace/ptrace.c:252
(inlined by) do_syscall_trace_enter at arch/powerpc/kernel/ptrace/ptrace.c:327
[9.339600][  T925] [c000201cb52d7dc0] [c002c8f8] 
system_call_exception+0x138/0x180
[9.339625][  T925] [c000201cb52d7e20] [c000c9e8] 
system_call_common+0xe8/0x214
[9.339648][  T925] Instruction dump:
[9.339667][  T925]       
  
[9.339706][  T925]       
  
[9.339748][  T925] ---[ end trace d89eb80f9a6bc141 ]---
[  OK  ] Started Journal Service.
[   10.452364][  T925] Kernel panic - not syncing: Fatal exception
[   11.876655][  T925] ---[ end Kernel panic - not syncing: Fatal exception ]---

There could also be lots of random userspace segfault like,

[   16.463545][  T771] rngd[771]: segfault (11) at 0 nip 0 lr 0 code 1 in 
rngd[106d6+2]
[   16.463620][  T771] rngd[771]: code: