[PATCH] KVM: PPC: Book3S HV: Use GLOBAL_TOC for kvmppc_h_set_dabr/xdabr()

2021-09-23 Thread Michael Ellerman
kvmppc_h_set_dabr(), and kvmppc_h_set_xdabr() which jumps into
it, need to use _GLOBAL_TOC to setup the kernel TOC pointer, because
kvmppc_h_set_dabr() uses LOAD_REG_ADDR() to load dawr_force_enable.

When called from hcall_try_real_mode() we have the kernel TOC in r2,
established near the start of kvmppc_interrupt_hv(), so there is no
issue.

But they can also be called from kvmppc_pseries_do_hcall() which is
module code, so the access ends up happening with the kvm-hv module's
r2, which will not point at dawr_force_enable and could even cause a
fault.

With the current code layout and compilers we haven't observed a fault
in practice, the load hits somewhere in kvm-hv.ko and silently returns
some bogus value.

Note that we we expect p8/p9 guests to use the DAWR, but SLOF uses
h_set_dabr() to test if sc1 works correctly, see SLOF's
lib/libhvcall/brokensc1.c.

Fixes: c1fe190c0672 ("powerpc: Add force enable of DAWR on P9 option")
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 90484425a1e6..30a8a07cff18 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1999,7 +1999,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
.globl  hcall_real_table_end
 hcall_real_table_end:
 
-_GLOBAL(kvmppc_h_set_xdabr)
+_GLOBAL_TOC(kvmppc_h_set_xdabr)
 EXPORT_SYMBOL_GPL(kvmppc_h_set_xdabr)
andi.   r0, r5, DABRX_USER | DABRX_KERNEL
beq 6f
@@ -2009,7 +2009,7 @@ EXPORT_SYMBOL_GPL(kvmppc_h_set_xdabr)
 6: li  r3, H_PARAMETER
blr
 
-_GLOBAL(kvmppc_h_set_dabr)
+_GLOBAL_TOC(kvmppc_h_set_dabr)
 EXPORT_SYMBOL_GPL(kvmppc_h_set_dabr)
li  r5, DABRX_USER | DABRX_KERNEL
 3:
-- 
2.25.1



Re: [Bug 213837] "Kernel panic - not syncing: corrupted stack end detected inside scheduler" at building via distcc on a G5

2021-09-23 Thread Michael Ellerman
bugzilla-dae...@bugzilla.kernel.org writes:
> https://bugzilla.kernel.org/show_bug.cgi?id=213837
>
> --- Comment #7 from Erhard F. (erhar...@mailbox.org) ---
> Created attachment 298919
>   --> https://bugzilla.kernel.org/attachment.cgi?id=298919=edit
> dmesg (5.15-rc2 + patch, PowerMac G5 11,2)
>
> (In reply to mpe from comment #6)
>> Can you try this patch, it might help us work out what is corrupting the
>> stack.
> With your patch applied to recent v5.15-rc2 the output looks like this:
>
> [...]
> stack corrupted? stack end = 0xc00029fdc000
> stack: c00029fdbc00: 5a5a5a5a 5a5a5a5a    
...

> Can't make much sense out of it but hopefully you can. ;)

Thanks. Obvious isn't it? ;)

  stack corrupted? stack end = 0xc00029fdc000
  stack: c00029fdbc00: 5a5a5a5a 5a5a5a5a    
  stack: c00029fdbc10: 0ddc 7c10    |...
  stack: c00029fdbc20: 29fc4e41 673d4bb3 5a5a5a5a 5a5a5a5a  ).NAg=K.
  stack: c00029fdbc30:   0ddc 8e10  
  stack: c00029fdbc40:   41fc4e41 673d41a3  A.NAg=A.
  stack: c00029fdbc50: 5a5a5a5a 5a5a5a5a    
  stack: c00029fdbc60: 0ddc 8e0c    
  stack: c00029fdbc70: 79fc4e41 673d4dab 5a5a5a5a 5a5a5a5a  y.NAg=M.
  stack: c00029fdbc80:   0ddc 9008  
  stack: c00029fdbc90:   91fc4e41 673d4573  ..NAg=Es
  stack: c00029fdbca0: 5a5a5a5a 5a5a5a5a    
  stack: c00029fdbcb0: 0dd7 ac16    
  stack: c00029fdbcc0: c9fc4e41 673d4203 5a5a5a5a 5a5a5a5a  ..NAg=B.
  stack: c00029fdbcd0:   0ddc 6c04  l...
  stack: c00029fdbce0:   e1fc4e41 673d474b  ..NAg=GK
  stack: c00029fdbcf0: 5a5a5a5a 5a5a5a5a    
  stack: c00029fdbd00: 0ddc 8800    
  stack: c00029fdbd10: 19fd4e41 673d4143 5a5a5a5a 5a5a5a5a  ..NAg=AC
  stack: c00029fdbd20:   0ddb 6c0e  l...
  stack: c00029fdbd30:   31fd4e41 673d4f43  1.NAg=OC
  stack: c00029fdbd40: 5a5a5a5a 5a5a5a5a    
  stack: c00029fdbd50: 0ddc 8e08    
  stack: c00029fdbd60: 69fd4e41 673d407b 5a5a5a5a 5a5a5a5a  i.NAg=@{
  stack: c00029fdbd70:   0ddc 9208  
  stack: c00029fdbd80:   81fd4e41 673d4633  ..NAg=F3
  stack: c00029fdbd90: 5a5a5a5a 5a5a5a5a    
  stack: c00029fdbda0: 0ddb 4218    B...
  stack: c00029fdbdb0: b9fd4e41 673d42fb 5a5a5a5a 5a5a5a5a  ..NAg=B.
  stack: c00029fdbdc0:   0ddc 7e18  ~...
  stack: c00029fdbdd0:   d1fd4e41 673d4a1b  ..NAg=J.
  stack: c00029fdbde0: 5a5a5a5a 5a5a5a5a    
  stack: c00029fdbdf0: 0ddc 8e04    
  stack: c00029fdbe00: 09fe4e41 673d4ee3 5a5a5a5a 5a5a5a5a  ..NAg=N.
  stack: c00029fdbe10:   0dd9 721c  r...
  stack: c00029fdbe20:   21fe4e41 673d4fa3  !.NAg=O.

That's slab data.

It's not clear what the actual data is, but because you booted with
slub_debug=FZP we can see the red zones and poison.

The  is SLUB_RED_ACTIVE, and 5a5a5a5a is POISON_INUSE (see poison.h)


  stack: c00029fdbe30: c000 29fdbeb0    )...

But then here we have an obvious pointer (big endian FTW).

And it points nearby, just slightly higher in memory, so that looks
suspiciously like a stack back chain pointer. There's more similar
values if you look further.

But we shouldn't be seeing the stack yet, it's meant to start (end) at
c00029fdc000 ...

  stack: c00029fdbe40: 0ddc 9400    
  stack: c00029fdbe50: 59fe4e41 673d4933 5a5a5a5a 5a5a5a5a  Y.NAg=I3
  stack: c00029fdbe60:   0dd9 6024  `..$
  stack: c00029fdbe70:   71fe4e41 673d416b  q.NAg=Ak
  stack: c00029fdbe80: 5a5a5a5a 5a5a5a5a    
  stack: c00029fdbe90: 0ddc 600c    `...
  stack: c00029fdbea0: c000 29fdbf20  0002  ).. 
  stack: c00029fdbeb0: c000 29fdbf30 0ddc 7e1c )..0~... 
<---
  stack: c00029fdbec0: c000 29fdbf40 c1fe4e41 673d4723  )..@..NAg=G#
  stack: 

Re: [PATCH v2 1/2] powerpc/powermac: add missing g5_phy_disable_cpu1() declaration

2021-09-23 Thread Michael Ellerman
Krzysztof Kozlowski  writes:
> g5_phy_disable_cpu1() is used outside of platforms/powermac/feature.c,
> so it should have a declaration to fix W=1 warning:
>
>   arch/powerpc/platforms/powermac/feature.c:1533:6:
> error: no previous prototype for ‘g5_phy_disable_cpu1’ 
> [-Werror=missing-prototypes]
>
> Signed-off-by: Krzysztof Kozlowski 
>
> ---
>
> Changes since v1:
> 1. Drop declaration in powermac/smp.c

Sorry I missed v1, so I'm going to ask for more changes :}

>  arch/powerpc/include/asm/pmac_feature.h | 4 

Putting it here exposes it to the whole kernel, but it's only needed
inside arch/powerpc/platforms/powermac.

The right place for the prototype is arch/powerpc/platforms/powermac/pmac.h,
which is for platform internal prototypes.

>  arch/powerpc/platforms/powermac/smp.c   | 2 --
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/pmac_feature.h 
> b/arch/powerpc/include/asm/pmac_feature.h
> index e08e829261b6..7703e5bf1203 100644
> --- a/arch/powerpc/include/asm/pmac_feature.h
> +++ b/arch/powerpc/include/asm/pmac_feature.h
> @@ -143,6 +143,10 @@
>   */
>  struct device_node;
>  
> +#ifdef CONFIG_PPC64
> +void g5_phy_disable_cpu1(void);
> +#endif /* CONFIG_PPC64 */

The ifdef around the prototype doesn't gain much, and is extra visual
noise, so I'd rather without it.

cheers


Re: [PATCH v2 3/3] powerpc/numa: Fill distance_lookup_table for offline nodes

2021-09-23 Thread Michael Ellerman
Srikar Dronamraju  writes:
> * Michael Ellerman  [2021-08-26 23:36:53]:
>
>> Srikar Dronamraju  writes:
>> > Scheduler expects unique number of node distances to be available at
>> > boot.
...
>
>> > Fake the offline node's distance_lookup_table entries so that all
>> > possible node distances are updated.
>> 
>> Does this work if we have a single node offline at boot?
>> 
>
> It should.
>
>> Say we start with:
>> 
>> node distances:
>> node   0   1
>>   0:  10  20
>>   1:  20  10
>> 
>> And node 2 is offline at boot. We can only initialise that nodes entries
>> in the distance_lookup_table:
>> 
>>  while (i--)
>>  distance_lookup_table[node][i] = node;
>> 
>> By filling them all with 2 that causes node_distance(2, X) to return the
>> maximum distance for all other nodes X, because we won't break out of
>> the loop in __node_distance():
>> 
>>  for (i = 0; i < distance_ref_points_depth; i++) {
>>  if (distance_lookup_table[a][i] == distance_lookup_table[b][i])
>>  break;
>> 
>>  /* Double the distance for each NUMA level */
>>  distance *= 2;
>>  }
>> 
>> If distance_ref_points_depth was 4 we'd return 160.
>
> As you already know, distance 10, 20, .. are defined by Powerpc, form1
> affinity. PAPR doesn't define actual distances, it only provides us the
> associativity. If there are distance_ref_points_depth is 4,
> (distance_ref_points_depth doesn't take local distance into consideration)
> 10, 20, 40, 80, 160.
>
>> 
>> That'd leave us with 3 unique distances at boot, 10, 20, 160.
>> 
>
> So if there are unique distances, then the distances as per the current
> code has to be 10, 20, 40, 80.. I dont see a way in which we have a break in
> the series. like having 160 without 80.

I'm confused what you mean there.

If we have a node that's offline at boot then we get 160 for that node,
that's just the result of having no info for it, so we never break out
of the for loop.

So if we have two nodes, one hop apart, and then an offline node we get
10, 20, 160.

Or if you're using depth = 3 then it's 10, 20, 80.

>> But when node 2 comes online it might introduce more than 1 new distance
>> value, eg. it could be that the actual distances are:
>> 
>> node distances:
>> node   0   1   2
>>   0:  10  20  40
>>   1:  20  10  80
>>   2:  40  80  10
>> 
>> ie. we now have 4 distances, 10, 20, 40, 80.
>> 
>> What am I missing?
>
> As I said above, I am not sure how we can have a break in the series.
> If distance_ref_points_depth is 3, the distances has to be 10,20,40,80 as
> atleast for form1 affinity.

I agree for depth 3 we have to see 10, 20, 40, 80. But nothing
guarantees we see each value (other than 10).

We can have two nodes one hop apart, so we have 10 and 20, then a third
node is added 3 hops away, so we get 10, 20, 80.

The real problem is that the third node could be 3 hops from node 0
and 2 hops from node 1, and so the addition of the third node causes
two new distance values (40 & 80) to be required.

I think maybe what you're saying is that in practice we don't see setups
like that. But I don't know if I'm happy with a solution that doesn't
work in the general case, and relies on the particular properties of our
current set of systems.

Possibly we just need to detect that case and WARN about it. The only
problem is we won't know until the system is already up and running, ie.
we can't know at boot that the onlining of the third node will cause 2
new distance values to be added.

cheers


Re: [PATCH] powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()

2021-09-23 Thread Michael Ellerman
Nathan Lynch  writes:
> Srikar Dronamraju  writes:
>
>> * Nathan Lynch  [2021-09-22 11:01:12]:
>>
>>> Srikar Dronamraju  writes:
>>> > * Nathan Lynch  [2021-09-20 22:12:13]:
>>> >
>>> >> vcpu_is_preempted() can be used outside of preempt-disabled critical
>>> >> sections, yielding warnings such as:
>>> >> 
>>> >> BUG: using smp_processor_id() in preemptible [] code: 
>>> >> systemd-udevd/185
>>> >> caller is rwsem_spin_on_owner+0x1cc/0x2d0
>>> >> CPU: 1 PID: 185 Comm: systemd-udevd Not tainted 5.15.0-rc2+ #33
>>> >> Call Trace:
>>> >> [c00012907ac0] [c0aa30a8] dump_stack_lvl+0xac/0x108 
>>> >> (unreliable)
>>> >> [c00012907b00] [c1371f70] 
>>> >> check_preemption_disabled+0x150/0x160
>>> >> [c00012907b90] [c01e0e8c] rwsem_spin_on_owner+0x1cc/0x2d0
>>> >> [c00012907be0] [c01e1408] 
>>> >> rwsem_down_write_slowpath+0x478/0x9a0
>>> >> [c00012907ca0] [c0576cf4] filename_create+0x94/0x1e0
>>> >> [c00012907d10] [c057ac08] do_symlinkat+0x68/0x1a0
>>> >> [c00012907d70] [c057ae18] sys_symlink+0x58/0x70
>>> >> [c00012907da0] [c002e448] system_call_exception+0x198/0x3c0
>>> >> [c00012907e10] [c000c54c] system_call_common+0xec/0x250
>>> >> 
>>> >> The result of vcpu_is_preempted() is always subject to invalidation by
>>> >> events inside and outside of Linux; it's just a best guess at a point in
>>> >> time. Use raw_smp_processor_id() to avoid such warnings.
>>> >
>>> > Typically smp_processor_id() and raw_smp_processor_id() except for the
>>> > CONFIG_DEBUG_PREEMPT.
>>> 
>>> Sorry, I don't follow...
>>
>> I meant, Unless CONFIG_DEBUG_PREEMPT, smp_processor_id() is defined as
>> raw_processor_id().
>>
>>> 
>>> > In the CONFIG_DEBUG_PREEMPT case, smp_processor_id()
>>> > is actually debug_smp_processor_id(), which does all the checks.
>>> 
>>> Yes, OK.
>>> 
>>> > I believe these checks in debug_smp_processor_id() are only valid for x86
>>> > case (aka cases were they have __smp_processor_id() defined.)
>>> 
>>> Hmm, I am under the impression that the checks in
>>> debug_smp_processor_id() are valid regardless of whether the arch
>>> overrides __smp_processor_id().
>>
>> From include/linux/smp.h
>>
>> /*
>>  * Allow the architecture to differentiate between a stable and unstable 
>> read.
>>  * For example, x86 uses an IRQ-safe asm-volatile read for the unstable but a
>>  * regular asm read for the stable.
>>  */
>> #ifndef __smp_processor_id
>> #define __smp_processor_id(x) raw_smp_processor_id(x)
>> #endif
>>
>> As far as I see, only x86 has a definition of __smp_processor_id.
>> So for archs like Powerpc, __smp_processor_id(), is always
>> defined as raw_smp_processor_id(). Right?
>
> Sure, yes.
>
>> I would think debug_smp_processor_id() would be useful if 
>> __smp_processor_id()
>> is different from raw_smp_processor_id(). Do note debug_smp_processor_id() 
>> calls raw_smp_processor_id().

Agree.

> I do not think the utility of debug_smp_processor_id() is related to
> whether the arch defines __smp_processor_id().
>
>> Or can I understand how debug_smp_processor_id() is useful if
>> __smp_processor_id() is defined as raw_smp_processor_id()?

debug_smp_processor_id() is useful on powerpc, as well as other arches,
because it checks that we're in a context where the processor id won't
change out from under us.

eg. something like this is unsafe:

  int counts[NR_CPUS];
  int tmp, cpu;
  
  cpu = smp_processor_id();
  tmp = counts[cpu];
<- preempted here and migrated to another CPU
  counts[cpu] = tmp + 1;


> So, for powerpc with DEBUG_PREEMPT unset, a call to smp_procesor_id()
> expands to __smp_processor_id() which expands to raw_smp_processor_id(),
> avoiding the preempt safety checks. This is working as intended.
>
> For powerpc with DEBUG_PREEMPT=y, a call to smp_processor_id() expands
> to the out of line call to debug_smp_processor_id(), which calls
> raw_smp_processor_id() and performs the checks, warning if called in an
> inappropriate context, as seen here. Also working as intended.
>
> AFAICT __smp_processor_id() is a performance/codegen-oriented hook, and
> not really related to the debug facility. Please see 9ed7d75b2f09d
> ("x86/percpu: Relax smp_processor_id()").

Yeah good find.

cheers


Re: [PATCH] powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()

2021-09-22 Thread Michael Ellerman
Nathan Lynch  writes:
> vcpu_is_preempted() can be used outside of preempt-disabled critical
> sections, yielding warnings such as:
>
> BUG: using smp_processor_id() in preemptible [] code: 
> systemd-udevd/185
> caller is rwsem_spin_on_owner+0x1cc/0x2d0
> CPU: 1 PID: 185 Comm: systemd-udevd Not tainted 5.15.0-rc2+ #33
> Call Trace:
> [c00012907ac0] [c0aa30a8] dump_stack_lvl+0xac/0x108 (unreliable)
> [c00012907b00] [c1371f70] check_preemption_disabled+0x150/0x160
> [c00012907b90] [c01e0e8c] rwsem_spin_on_owner+0x1cc/0x2d0
> [c00012907be0] [c01e1408] rwsem_down_write_slowpath+0x478/0x9a0
> [c00012907ca0] [c0576cf4] filename_create+0x94/0x1e0
> [c00012907d10] [c057ac08] do_symlinkat+0x68/0x1a0
> [c00012907d70] [c057ae18] sys_symlink+0x58/0x70
> [c00012907da0] [c002e448] system_call_exception+0x198/0x3c0
> [c00012907e10] [c000c54c] system_call_common+0xec/0x250
>
> The result of vcpu_is_preempted() is always subject to invalidation by
> events inside and outside of Linux; it's just a best guess at a point in
> time. Use raw_smp_processor_id() to avoid such warnings.
>
> Signed-off-by: Nathan Lynch 
> Fixes: ca3f969dcb11 ("powerpc/paravirt: Use is_kvm_guest() in 
> vcpu_is_preempted()")
> ---
>  arch/powerpc/include/asm/paravirt.h | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/paravirt.h 
> b/arch/powerpc/include/asm/paravirt.h
> index bcb7b5f917be..e429aca566de 100644
> --- a/arch/powerpc/include/asm/paravirt.h
> +++ b/arch/powerpc/include/asm/paravirt.h
> @@ -97,7 +97,14 @@ static inline bool vcpu_is_preempted(int cpu)
>  
>  #ifdef CONFIG_PPC_SPLPAR
>   if (!is_kvm_guest()) {
> - int first_cpu = cpu_first_thread_sibling(smp_processor_id());
> + int first_cpu;
> +
> + /*
> +  * This is only a guess at best, and this function may be
> +  * called with preemption enabled. Using raw_smp_processor_id()
> +  * does not damage accuracy.
> +  */
> + first_cpu = cpu_first_thread_sibling(raw_smp_processor_id());

This change seems good, except I think the comment needs to be a lot
more explicit about what it's doing and why.

A casual reader is going to be confused about vcpu preemption vs
"preemption", which are basically unrelated yet use the same word.

It's not clear how raw_smp_processor_id() is related to (Linux)
preemption, unless you know that smp_processor_id() is the alternative
and it contains a preemption check.

And "this is only a guess" is not clear on what *this* is, you're
referring to the result of the whole function, but that's not obvious.

>   /*
>* Preemption can only happen at core granularity. This CPU
   ^^
   Means something different to "preemption" above.

I know you didn't write that comment, and maybe we need to rewrite some
of those existing comments to make it clear they're not talking about
Linux preemption.

cheers


Re: [PATCH 1/2] powerpc/perf: Expose instruction and data address registers as part of extended regs

2021-09-20 Thread Michael Ellerman
Athira Rajeev  writes:
>> On 08-Sep-2021, at 10:47 AM, Michael Ellerman  wrote:
>> 
>> Athira Rajeev  writes:
>>> Patch adds support to include Sampled Instruction Address Register
>>> (SIAR) and Sampled Data Address Register (SDAR) SPRs as part of extended
>>> registers. Update the definition of PERF_REG_PMU_MASK_300/31 and
>>> PERF_REG_EXTENDED_MAX to include these SPR's.
>>> 
>>> Signed-off-by: Athira Rajeev 
>>> ---
>>> arch/powerpc/include/uapi/asm/perf_regs.h | 12 +++-
>>> arch/powerpc/perf/perf_regs.c |  4 
>>> 2 files changed, 11 insertions(+), 5 deletions(-)
>>> 
>> ...
>>> diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
>>> index b931eed..51d31b6 100644
>>> --- a/arch/powerpc/perf/perf_regs.c
>>> +++ b/arch/powerpc/perf/perf_regs.c
>>> @@ -90,7 +90,11 @@ static u64 get_ext_regs_value(int idx)
>>> return mfspr(SPRN_SIER2);
>>> case PERF_REG_POWERPC_SIER3:
>>> return mfspr(SPRN_SIER3);
>>> +   case PERF_REG_POWERPC_SDAR:
>>> +   return mfspr(SPRN_SDAR);
>>> #endif
>>> +   case PERF_REG_POWERPC_SIAR:
>>> +   return mfspr(SPRN_SIAR);
>>> default: return 0;
>>> }
>> 
>> This file is built for all powerpc configs that have PERF_EVENTS. Which
>> includes CPUs that don't have SDAR or SIAR.
>> 
>> Don't we need checks in perf_reg_value() like we do for SIER?
>
> Hi Michael,
>
> Thanks for the review.
>
> SIER is part of PERF_REG_PMU_MASK and hence check is needed to see if 
> platform supports SIER.
> Incase of extended regs, they are part of PERF_REG_EXTENDED_MASK and this 
> mask is
> filled with supported registers while registering the PMU ( ie during 
> init_power9/10_pmu ). So these registers will be added
> only for supported platforms. The validity of extended mask is also done in 
> PMU common code 
> ( In kernel/events/core.c with PERF_REG_EXTENDED_MASK check ). So an 
> unsupported platform requesting for extended
> registers won’t get it.

Right, I'd forgotten how that works.

But I think part of the reason I didn't remember is that
PERF_REG_PMU_MASK_31 doesn't mention those regs by name, it's just a hex
constant, ie:

-#define PERF_REG_PMU_MASK_31   (0xfffULL << PERF_REG_POWERPC_MMCR0)
+#define PERF_REG_PMU_MASK_31   (0x3fffULL << PERF_REG_POWERPC_MMCR0)

Presumably you tested that the added 0x3 there sets the right bits for
SDAR and SIAR, but it's 1) not obvious and 2) fragile.

So I'd like it better if we constructed the PERF_REG_PMU_MASK_31, and
other similar masks, by or'ing together the actual register value
constants.

eg. something like:

#define PERF_REG_PMU_MASK_31\
((1ul << PERF_REG_POWERPC_MMCR0) | (1ul << PERF_REG_POWERPC_MMCR1) | \
(1ul << PERF_REG_POWERPC_MMCR2) | (1ul << PERF_REG_POWERPC_MMCR3) | \
(1ul << PERF_REG_POWERPC_SIER2) | (1ul << PERF_REG_POWERPC_SIER3) | \
(1ul << PERF_REG_POWERPC_PMC1) | (1ul << PERF_REG_POWERPC_PMC2) | \
(1ul << PERF_REG_POWERPC_PMC3) | (1ul << PERF_REG_POWERPC_PMC4) | \
(1ul << PERF_REG_POWERPC_PMC5) | (1ul << PERF_REG_POWERPC_PMC6))


Also PERF_REG_EXTENDED_MAX should be part of the enum, just like
PERF_REG_POWERPC_MAX.

cheers


Re: [PATCH 0/2] powerpc/perf: Add instruction and data address registers to extended regs

2021-09-20 Thread Michael Ellerman
Arnaldo Carvalho de Melo  writes:
> Em Mon, Sep 06, 2021 at 08:13:13AM +0530, Athira Rajeev escreveu:
>> > On 02-Sep-2021, at 1:04 PM, kajoljain  wrote:
>> > On 6/20/21 8:15 PM, Athira Rajeev wrote:
>> >> Patch set adds PMU registers namely Sampled Instruction Address Register
>> >> (SIAR) and Sampled Data Address Register (SDAR) as part of extended regs
>> >> in PowerPC. These registers provides the instruction/data address and
>> >> adding these to extended regs helps in debug purposes.
>> >> 
>> >> Patch 1/2 adds SIAR and SDAR as part of the extended regs mask.
>> >> Patch 2/2 includes perf tools side changes to add the SPRs to
>> >> sample_reg_mask to use with -I? option.
>> >> 
>> >> Athira Rajeev (2):
>> >>  powerpc/perf: Expose instruction and data address registers as part of
>> >>extended regs
>> >>  tools/perf: Add perf tools support to expose instruction and data
>> >>address registers as part of extended regs
>> >> 
>> > 
>> > Patchset looks good to me.
>> > 
>> > Reviewed-By: kajol Jain
>> 
>> Hi Arnaldo,
>> 
>> Requesting for your review on this patchset.
>
> So, this touches the kernel, usually I get a patchkit when the kernel
> bits landed, is that the case by now?

Not yet, I'd like some changes to the kernel patch.

cheers


[GIT PULL] Please pull powerpc/linux.git powerpc-5.15-2 tag

2021-09-19 Thread Michael Ellerman
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Hi Linus,

Please pull powerpc fixes for 5.15:

The following changes since commit 6880fa6c56601bb8ed59df6c30fd390cc5f6dd8f:

  Linux 5.15-rc1 (2021-09-12 16:28:37 -0700)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
tags/powerpc-5.15-2

for you to fetch changes up to c006a06508db4841d256d82f42da392d6391f3d9:

  powerpc/xics: Set the IRQ chip data for the ICS native backend (2021-09-15 
22:05:53 +1000)

- --
powerpc fixes for 5.15 #2

Fix crashes when scv (System Call Vectored) is used to make a syscall when a 
transaction
is active, on Power9 or later.

Fix bad interactions between rfscv (Return-from scv) and Power9 fake-suspend 
mode.

Fix crashes when handling machine checks in LPARs using the Hash MMU.

Partly revert a recent change to our XICS interrupt controller code, which 
broke the
recently added Microwatt support.

Thanks to: Cédric Le Goater, Eirik Fuller, Ganesh Goudar, Gustavo Romero, Joel 
Stanley,
Nicholas Piggin.

- --
Cédric Le Goater (1):
  powerpc/xics: Set the IRQ chip data for the ICS native backend

Ganesh Goudar (1):
  powerpc/mce: Fix access error in mce handler

Nicholas Piggin (4):
  powerpc/64s: system call scv tabort fix for corrupt irq soft-mask state
  selftests/powerpc: Add scv versions of the basic TM syscall tests
  powerpc/64s: system call rfscv workaround for TM bugs
  KVM: PPC: Book3S HV: Tolerate treclaim. in fake-suspend mode changing 
registers


 arch/powerpc/kernel/interrupt.c | 43 
 arch/powerpc/kernel/interrupt_64.S  | 41 ---
 arch/powerpc/kernel/mce.c   | 17 +++-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 36 +++-
 arch/powerpc/sysdev/xics/xics-common.c  |  4 +-
 tools/testing/selftests/powerpc/tm/tm-syscall-asm.S | 37 -
 tools/testing/selftests/powerpc/tm/tm-syscall.c | 36 
 7 files changed, 159 insertions(+), 55 deletions(-)
-BEGIN PGP SIGNATURE-

iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmFHMuoACgkQUevqPMjh
pYD59RAAsMPzDU+iT15Vm2dC9Ar4hmlLvC0Ew3I/JigC/XSf1bcyGPM4ybSUsoLj
wop5AeOyoaAtegWki3HxMM5iC99KRI9GuJd8Fa8yrSwKYz/O+6oqVJdq0oJY/rLl
NpHDgFCencKya/H0UV+Xobfb41ol25bGchMfcuhQw5G6JVCtWbYSFQXUefTcC3lt
GjYy9jLzvNT7DGlmJBJaeIU78vqRbXLmbHgBwLhxIIBD0us48BQX4elDWcP5Jjwr
UMyz962EaWLAb/nyac9BHzqO9OS6awYJGI5lx3CPJz3+VA6RI/Vu8WoentfejupP
GaiGnCcRBnUAkcUuapy0XQtcv9197yOLkLi5XUPgS82o9EuI2+TLtpoZCwLBOceu
HLpcjYbxf6rV79wYb+P5+oZkPIUi+JWEzF9hCMSFDIJO3mY8rqoDqbEnGEIdiQUi
6IhihFw6s4a15vd+sn0J1KsUvceeA98zE/7zFXW1tYTqxn2zlJFn923EJuHMr+24
mCQb4JvsWlUi+8YuEvr1659I0nstFJaUxY1BFDuFr5oR4/ZFDwnRUs2JbAw8No8Y
uzM7wuq1PMswnKLm5cnR3tsYOVgwTdR6O9wCe3Uva4PoOoB5Y97VpSU8ElfPI+c+
ePdHNteAjRiMLTggWzLryCUNw+2T8wny/r9nyx5tfI2JswXgSvg=
=KZ/U
-END PGP SIGNATURE-


Re: [PATCH v2] powerpc/mce: Fix access error in mce handler

2021-09-19 Thread Michael Ellerman
On Thu, 9 Sep 2021 12:13:30 +0530, Ganesh Goudar wrote:
> We queue an irq work for deferred processing of mce event
> in realmode mce handler, where translation is disabled.
> Queuing of the work may result in accessing memory outside
> RMO region, such access needs the translation to be enabled
> for an LPAR running with hash mmu else the kernel crashes.
> 
> After enabling translation in mce_handle_error() we used to
> leave it enabled to avoid crashing here, but now with the
> commit 74c3354bc1d89 ("powerpc/pseries/mce: restore msr before
> returning from handler") we are restoring the MSR to disable
> translation.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/mce: Fix access error in mce handler
  https://git.kernel.org/powerpc/c/3a1e92d0896e928ac2a5b58962d05a39afef2e23

cheers


Re: [PATCH] powerpc/xics: Set the IRQ chip data for the ICS native backend

2021-09-19 Thread Michael Ellerman
On Mon, 13 Sep 2021 15:40:56 +0200, Cédric Le Goater wrote:
> The ICS native driver relies on the IRQ chip data to find the struct
> 'ics_native' describing the ICS controller but it was removed by commit
> 248af248a8f4 ("powerpc/xics: Rename the map handler in a check handler").
> Revert this change to fix the Microwatt SoC platform.
> 
> 
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/xics: Set the IRQ chip data for the ICS native backend
  https://git.kernel.org/powerpc/c/c006a06508db4841d256d82f42da392d6391f3d9

cheers


Re: [PATCH v3 1/2] powerpc/64s: system call scv tabort fix for corrupt irq soft-mask state

2021-09-19 Thread Michael Ellerman
On Fri, 3 Sep 2021 22:57:06 +1000, Nicholas Piggin wrote:
> If a system call is made with a transaction active, the kernel
> immediately aborts it and returns. scv system calls disable irqs even
> earlier in their interrupt handler, and tabort_syscall does not fix this
> up.
> 
> This can result in irq soft-mask state being messed up on the next
> kernel entry, and crashing at BUG_ON(arch_irq_disabled_regs(regs)) in
> the kernel exit handlers, or possibly worse.
> 
> [...]

Applied to powerpc/fixes.

[1/2] powerpc/64s: system call scv tabort fix for corrupt irq soft-mask state
  https://git.kernel.org/powerpc/c/b871895b148256f1721bc565d803860242755a0b
[2/2] selftests/powerpc: Add scv versions of the basic TM syscall tests
  https://git.kernel.org/powerpc/c/5379ef2a60431232b9bb01c6d3580b875123d723

cheers


Re: [PATCH v1 1/2] powerpc/64s: system call rfscv workaround for TM bugs

2021-09-19 Thread Michael Ellerman
On Wed, 8 Sep 2021 20:17:17 +1000, Nicholas Piggin wrote:
> The rfscv instruction does not work correctly with the fake-suspend mode
> in POWER9, which can end up with the hypervisor restoring an incorrect
> checkpoint.
> 
> Work around this by setting the _TIF_RESTOREALL flag if a system call
> returns to a transaction active state, causing rfid to be used instead
> of rfscv to return, which will do the right thing. The contents of the
> registers are irrelevant because they will be overwritten in this case
> anyway.
> 
> [...]

Applied to powerpc/fixes.

[1/2] powerpc/64s: system call rfscv workaround for TM bugs
  https://git.kernel.org/powerpc/c/ae7aaecc3f2f78b76ab3a8d6178610f55aadfa56
[2/2] KVM: PPC: Book3S HV: Tolerate treclaim. in fake-suspend mode changing 
registers
  https://git.kernel.org/powerpc/c/267cdfa21385d78c794768233678756e32b39ead

cheers


Re: [PATCH v2] powerpc/32: Don't use a struct based type for pte_t

2021-09-17 Thread Michael Ellerman
Christophe Leroy  writes:
> Long time ago we had a config item called STRICT_MM_TYPECHECKS
> to build the kernel with pte_t defined as a structure in order
> to perform additional build checks or build it with pte_t
> defined as a simple type in order to get simpler generated code.
>
> Commit 670eea924198 ("powerpc/mm: Always use STRICT_MM_TYPECHECKS")
> made the struct based definition the only one, considering that the
> generated code was similar in both cases.
>
> That's right on ppc64 because the ABI is such that the content of a
> struct having a single simple type element is passed as register,
> but on ppc32 such a structure is passed via the stack like any
> structure.
>
> Simple test function:
>
>   pte_t test(pte_t pte)
>   {
>   return pte;
>   }
>
> Before this patch we get
>
>   c00108ec :
>   c00108ec:   81 24 00 00 lwz r9,0(r4)
>   c00108f0:   91 23 00 00 stw r9,0(r3)
>   c00108f4:   4e 80 00 20 blr
>
> So, for PPC32, restore the simple type behaviour we got before
> commit 670eea924198, but instead of adding a config option to
> activate type check, do it when __CHECKER__ is set so that type
> checking is performed by 'sparse' and provides feedback like:
>
>   arch/powerpc/mm/pgtable.c:466:16: warning: incorrect type in return 
> expression (different base types)
>   arch/powerpc/mm/pgtable.c:466:16:expected unsigned long
>   arch/powerpc/mm/pgtable.c:466:16:got struct pte_t [usertype] x

OK that's a good trade off.

One question below ...

> diff --git a/arch/powerpc/include/asm/pgtable-types.h 
> b/arch/powerpc/include/asm/pgtable-types.h
> index d11b4c61d686..c60199fc6fa6 100644
> --- a/arch/powerpc/include/asm/pgtable-types.h
> +++ b/arch/powerpc/include/asm/pgtable-types.h
> @@ -5,14 +5,26 @@
>  /* PTE level */
>  #if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
>  typedef struct { pte_basic_t pte, pte1, pte2, pte3; } pte_t;
> -#else
> +#elif defined(__CHECKER__) || !defined(CONFIG_PPC32)

It would be nicer if this logic was in Kconfig.

eg. restore config STRICT_MM_TYPECHECKS but make it always enabled for
64-bit, and depend on CHECKER for 32-bit.

The only thing is I'm not sure if we can test __CHECKER__ in Kconfig?

cheers


Re: [PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()

2021-09-16 Thread Michael Ellerman
Christoph Hellwig  writes:
> On Wed, Sep 15, 2021 at 07:18:34PM +0200, Christophe Leroy wrote:
>> Could you please provide more explicit explanation why inlining such an
>> helper is considered as bad practice and messy ?
>
> Because now we get architectures to all subly differ.  Look at the mess
> for ioremap and the ioremap* variant.
>
> The only good reason to allow for inlines if if they are used in a hot
> path.  Which cc_platform_has is not, especially not on powerpc.

Yes I agree, it's not a hot path so it doesn't really matter, which is
why I Acked it.

I think it is possible to do both, share the declaration across arches
but also give arches flexibility to use an inline if they prefer, see
patch below.

I'm not suggesting we actually do that for this series now, but I think
it would solve the problem if we ever needed to in future.

cheers


diff --git a/arch/powerpc/platforms/pseries/cc_platform.c 
b/arch/powerpc/include/asm/cc_platform.h
similarity index 74%
rename from arch/powerpc/platforms/pseries/cc_platform.c
rename to arch/powerpc/include/asm/cc_platform.h
index e8021af83a19..6285c3c385a6 100644
--- a/arch/powerpc/platforms/pseries/cc_platform.c
+++ b/arch/powerpc/include/asm/cc_platform.h
@@ -7,13 +7,10 @@
  * Author: Tom Lendacky 
  */
 
-#include 
 #include 
-
-#include 
 #include 
 
-bool cc_platform_has(enum cc_attr attr)
+static inline bool arch_cc_platform_has(enum cc_attr attr)
 {
switch (attr) {
case CC_ATTR_MEM_ENCRYPT:
@@ -23,4 +20,3 @@ bool cc_platform_has(enum cc_attr attr)
return false;
}
 }
-EXPORT_SYMBOL_GPL(cc_platform_has);
diff --git a/arch/x86/include/asm/cc_platform.h 
b/arch/x86/include/asm/cc_platform.h
new file mode 100644
index ..0a4220697043
--- /dev/null
+++ b/arch/x86/include/asm/cc_platform.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_CC_PLATFORM_H_
+#define _ASM_X86_CC_PLATFORM_H_
+
+bool arch_cc_platform_has(enum cc_attr attr);
+
+#endif // _ASM_X86_CC_PLATFORM_H_
diff --git a/arch/x86/kernel/cc_platform.c b/arch/x86/kernel/cc_platform.c
index 3c9bacd3c3f3..77e8c3465979 100644
--- a/arch/x86/kernel/cc_platform.c
+++ b/arch/x86/kernel/cc_platform.c
@@ -11,11 +11,11 @@
 #include 
 #include 
 
-bool cc_platform_has(enum cc_attr attr)
+bool arch_cc_platform_has(enum cc_attr attr)
 {
if (sme_me_mask)
return amd_cc_platform_has(attr);
 
return false;
 }
-EXPORT_SYMBOL_GPL(cc_platform_has);
+EXPORT_SYMBOL_GPL(arch_cc_platform_has);
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index 253f3ea66cd8..f3306647c5d9 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -65,6 +65,8 @@ enum cc_attr {
 
 #ifdef CONFIG_ARCH_HAS_CC_PLATFORM
 
+#include 
+
 /**
  * cc_platform_has() - Checks if the specified cc_attr attribute is active
  * @attr: Confidential computing attribute to check
@@ -77,7 +79,10 @@ enum cc_attr {
  * * TRUE  - Specified Confidential Computing attribute is active
  * * FALSE - Specified Confidential Computing attribute is not active
  */
-bool cc_platform_has(enum cc_attr attr);
+static inline bool cc_platform_has(enum cc_attr attr)
+{
+   return arch_cc_platform_has(attr);
+}
 
 #else  /* !CONFIG_ARCH_HAS_CC_PLATFORM */
 




Re: [PATCH 1/3] perf: Add macros to specify onchip L2/L3 accesses

2021-09-16 Thread Michael Ellerman
Peter Zijlstra  writes:
> On Tue, Sep 14, 2021 at 08:40:38PM +1000, Michael Ellerman wrote:
>> Peter Zijlstra  writes:
>
>> > I'm thinking we ought to keep hops as steps along the NUMA fabric, with
>> > 0 hops being the local node. That only gets us:
>> >
>> >  L2, remote=0, hops=HOPS_0 -- our L2
>> >  L2, remote=1, hops=HOPS_0 -- L2 on the local node but not ours
>> >  L2, remote=1, hops!=HOPS_0 -- L2 on a remote node
>> 
>> Hmm. I'm not sure about tying it directly to NUMA hops. I worry we're
>> going to see more and more systems where there's a hierarchy within the
>> chip/package, in addition to the traditional NUMA hierarchy.
>> 
>> Although then I guess it becomes a question of what exactly is a NUMA
>> hop, maybe the answer is that on those future systems those
>> intra-chip/package hops should be represented as NUMA hops.
>> 
>> It's not like we have a hard definition of what a NUMA hop is?
>
> Not really, typically whatever the BIOS/DT/whatever tables tell us. I
> think in case of Power you're mostly making things up in software :-)

Firmware is software so yes :)

> But yeah, I think we have plenty wriggle room there.

OK.

cheers


Re: [PATCH] powerpc/powernv/flash: Check OPAL flash calls exist before using

2021-09-15 Thread Michael Ellerman
Vasant Hegde  writes:
> Currently only FSP based powernv systems supports firmware update
> interfaces. Hence check that the token OPAL_FLASH_VALIDATE exists
> before initalising the flash driver.
>
> Signed-off-by: Vasant Hegde 
> ---
>  arch/powerpc/platforms/powernv/opal-flash.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/powerpc/platforms/powernv/opal-flash.c 
> b/arch/powerpc/platforms/powernv/opal-flash.c
> index 7e7d38b17420..05490fc22fae 100644
> --- a/arch/powerpc/platforms/powernv/opal-flash.c
> +++ b/arch/powerpc/platforms/powernv/opal-flash.c
> @@ -520,6 +520,10 @@ void __init opal_flash_update_init(void)
>  {
>   int ret;
>  
> + /* Firmware update is not supported by firmware */
> + if (!opal_check_token(OPAL_FLASH_VALIDATE))
> + return;
> +

That will mean the following files no longer appear on BMC systems:

  /sys/firmware/opal/image
  /sys/firmware/opal/validate_flash
  /sys/firmware/opal/manage_flash
  /sys/firmware/opal/update_flash

Presumably those files don't actually work correctly, but are we sure
their mere existence isn't used by anything at all?

We've had trouble in the past where removing sysfs files breaks tools
unexpectedly, see smt_snooze_delay.

cheers


Re: [PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()

2021-09-14 Thread Michael Ellerman
Borislav Petkov  writes:
> On Wed, Sep 08, 2021 at 05:58:35PM -0500, Tom Lendacky wrote:
>> Introduce a powerpc version of the cc_platform_has() function. This will
>> be used to replace the powerpc mem_encrypt_active() implementation, so
>> the implementation will initially only support the CC_ATTR_MEM_ENCRYPT
>> attribute.
>> 
>> Cc: Michael Ellerman 
>> Cc: Benjamin Herrenschmidt 
>> Cc: Paul Mackerras 
>> Signed-off-by: Tom Lendacky 
>> ---
>>  arch/powerpc/platforms/pseries/Kconfig   |  1 +
>>  arch/powerpc/platforms/pseries/Makefile  |  2 ++
>>  arch/powerpc/platforms/pseries/cc_platform.c | 26 
>>  3 files changed, 29 insertions(+)
>>  create mode 100644 arch/powerpc/platforms/pseries/cc_platform.c
>
> Michael,
>
> can I get an ACK for the ppc bits to carry them through the tip tree
> pls?

Yeah.

I don't love it, a new C file and an out-of-line call to then call back
to a static inline that for most configuration will return false ... but
whatever :)

Acked-by: Michael Ellerman  (powerpc)


> Btw, on a related note, cross-compiling this throws the following error here:
>
> $ make 
> CROSS_COMPILE=/home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/powerpc64-linux-
>  V=1 ARCH=powerpc
>
> ...
>
> /home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc
>  -Wp,-MD,arch/powerpc/boot/.crt0.o.d -D__ASSEMBLY__ -Wall -Wundef 
> -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -O2 -msoft-float 
> -mno-altivec -mno-vsx -pipe -fomit-frame-pointer -fno-builtin -fPIC -nostdinc 
> -include ./include/linux/compiler_attributes.h -I./arch/powerpc/include 
> -I./arch/powerpc/include/generated  -I./include -I./arch/powerpc/include/uapi 
> -I./arch/powerpc/include/generated/uapi -I./include/uapi 
> -I./include/generated/uapi -include ./include/linux/compiler-version.h 
> -include ./include/linux/kconfig.h -m32 -isystem 
> /home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/../lib/gcc/powerpc64-linux/9.4.0/include
>  -mbig-endian -nostdinc -c -o arch/powerpc/boot/crt0.o 
> arch/powerpc/boot/crt0.S
> In file included from :
> ././include/linux/compiler_attributes.h:62:5: warning: "__has_attribute" is 
> not defined, evaluates to 0 [-Wundef]
>62 | #if __has_attribute(__assume_aligned__)
>   | ^~~
> ././include/linux/compiler_attributes.h:62:20: error: missing binary operator 
> before token "("
>62 | #if __has_attribute(__assume_aligned__)
>   |^
> ././include/linux/compiler_attributes.h:88:5: warning: "__has_attribute" is 
> not defined, evaluates to 0 [-Wundef]
>88 | #if __has_attribute(__copy__)
>   | ^~~
> ...
>
> Known issue?

Yeah, fixed in mainline today, thanks for trying to cross compile :)

cheers


Re: [PATCH 1/1] powerpc: Drop superfluous pci_dev_is_added() calls

2021-09-14 Thread Michael Ellerman
Bjorn Helgaas  writes:
> On Fri, Sep 10, 2021 at 04:19:40PM +0200, Niklas Schnelle wrote:
>> On powerpc, pci_dev_is_added() is called as part of SR-IOV fixups
>> that are done under pcibios_add_device() which in turn is only called in
>> pci_device_add() whih is called when a PCI device is scanned.
>> 
>> Now pci_dev_assign_added() is called in pci_bus_add_device() which is
>> only called after scanning the device. Thus pci_dev_is_added() is always
>> false and can be dropped.
>> 
>> Signed-off-by: Niklas Schnelle 
>
> Reviewed-by: Bjorn Helgaas 
>
> This doesn't touch the PCI core, so maybe makes sense for you to take
> it, Michael?  But let me know if you think otherwise.

Yeah I'm happy to take it, thanks.

cheers


Re: linux-next: build failure after merge of the origin tree

2021-09-14 Thread Michael Ellerman
Linus Torvalds  writes:
> On Mon, Sep 13, 2021 at 7:08 PM Stephen Rothwell  
> wrote:
>>
>> That patch works for me - for the ppc64_defconfig build at least.
>
> Yeah, I just tested the allmodconfig case too, although I suspect it's
> essentially the same wrt the boot *.S files, so it probably doesn't
> matter.
>
> I'd like to have Michael or somebody who can actually run some tests
> on the end result ack that patch (or - even better - come up with
> something cleaner) before committing it.
>
> Because yeah, the build failure is annoying and I apologize, but I'd
> rather have the build fail overnight than commit something that builds
> but then is subtle buggy for some reason.
>
> But if I don't get any other comments by the time I'm up again
> tomorrow, I'll just commit it as "fixes the build".

I ended up doing a more minimal version of your change.

I sent it separately, or it's here:

  https://lore.kernel.org/lkml/20210914121723.3756817-1-...@ellerman.id.au/


cheers


[PATCH] powerpc/boot: Fix build failure since GCC 4.9 removal

2021-09-14 Thread Michael Ellerman
Stephen reported that the build was broken since commit
6d2ef226f2f1 ("compiler_attributes.h: drop __has_attribute() support for
gcc4"), with errors such as:

  include/linux/compiler_attributes.h:296:5: warning: "__has_attribute" is not 
defined, evaluates to 0 [-Wundef]
296 | #if __has_attribute(__warning__)
| ^~~
  make[2]: *** [arch/powerpc/boot/Makefile:225: arch/powerpc/boot/crt0.o] Error 
1

But we expect __has_attribute() to always be defined now that we've
stopped using GCC 4.

Linus debugged it to the point of reading the GCC sources, and noticing
that the problem is that __has_attribute() is not defined when
preprocessing assembly files, which is what we're doing here.

Our assembly files don't include, or need, compiler_attributes.h, but
they are getting it unconditionally from the -include in BOOT_CFLAGS,
which is then added in its entirety to BOOT_AFLAGS.

That -include was added in commit 77433830ed16 ("powerpc: boot: include
compiler_attributes.h") so that we'd have "fallthrough" and other
attributes defined for the C files in arch/powerpc/boot. But it's not
needed for assembly files.

The minimal fix is to move the addition to BOOT_CFLAGS of -include
compiler_attributes.h until after we've copied BOOT_CFLAGS into
BOOT_AFLAGS. That avoids including compiler_attributes.h for asm files,
but makes no other change to BOOT_CFLAGS or BOOT_AFLAGS.

Reported-by: Stephen Rothwell 
Debugged-by: Linus Torvalds 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/boot/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


This seemed safer as a minimal fix, rather than doing a more
comprehensive separation of CFLAGS/AFLAGS. We can do that in a future
patch.

It passed my usual build/boot tests, including booting the built zImage
on some real hardware, so this is good to go from my POV.

cheers

diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 6900d0ac2421..089ee3ea55c8 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -35,7 +35,6 @@ endif
 BOOTCFLAGS:= -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
 -fno-strict-aliasing -O2 -msoft-float -mno-altivec -mno-vsx \
 -pipe -fomit-frame-pointer -fno-builtin -fPIC -nostdinc \
--include $(srctree)/include/linux/compiler_attributes.h \
 $(LINUXINCLUDE)
 
 ifdef CONFIG_PPC64_BOOT_WRAPPER
@@ -70,6 +69,7 @@ ifeq ($(call cc-option-yn, -fstack-protector),y)
 BOOTCFLAGS += -fno-stack-protector
 endif
 
+BOOTCFLAGS += -include $(srctree)/include/linux/compiler_attributes.h
 BOOTCFLAGS += -I$(objtree)/$(obj) -I$(srctree)/$(obj)
 
 DTC_FLAGS  ?= -p 1024
-- 
2.25.1



Re: [PATCH 1/3] perf: Add macros to specify onchip L2/L3 accesses

2021-09-14 Thread Michael Ellerman
Peter Zijlstra  writes:
> On Thu, Sep 09, 2021 at 10:45:54PM +1000, Michael Ellerman wrote:
>
>> > The 'new' composite doesnt have a hops field because the hardware that
>> > nessecitated that change doesn't report it, but we could easily add a
>> > field there.
>> >
>> > Suppose we add, mem_hops:3 (would 6 hops be too small?) and the
>> > corresponding PERF_MEM_HOPS_{NA, 0..6}
>> 
>> It's really 7 if we use remote && hop = 0 to mean the first hop.
>
> I don't think we can do that, becaus of backward compat. Currently:
>
>   lvl_num=DRAM, remote=1
>
> denites: "Remote DRAM of any distance". Effectively it would have the new
> hops field filled with zeros though, so if you then decode with the hops
> field added it suddenly becomes:
>
>  lvl_num=DRAM, remote=1, hops=0
>
> and reads like: "Remote DRAM of 0 hops" which is quite daft. Therefore 0
> really must denote a 'N/A'.

Ah yeah, duh, it needs to be backward compatible.

>> If we're wanting to use some of the hop levels to represent
>> intra-chip/package hops then we could possibly use them all on a really
>> big system.
>> 
>> eg. you could imagine something like:
>> 
>>  L2 |- local L2
>>  L2 | REMOTE | HOPS_0- L2 of neighbour core
>>  L2 | REMOTE | HOPS_1- L2 of near core on same chip (same 1/2 of 
>> chip)
>>  L2 | REMOTE | HOPS_2- L2 of far core on same chip (other 1/2 of 
>> chip)
>>  L2 | REMOTE | HOPS_3- L2 of sibling chip in same package
>>  L2 | REMOTE | HOPS_4- L2 on separate package 1 hop away
>>  L2 | REMOTE | HOPS_5- L2 on separate package 2 hops away
>>  L2 | REMOTE | HOPS_6- L2 on separate package 3 hops away
>> 
>> 
>> Whether it's useful to represent all those levels I'm not sure, but it's
>> probably good if we have the ability.
>
> I'm thinking we ought to keep hops as steps along the NUMA fabric, with
> 0 hops being the local node. That only gets us:
>
>  L2, remote=0, hops=HOPS_0 -- our L2
>  L2, remote=1, hops=HOPS_0 -- L2 on the local node but not ours
>  L2, remote=1, hops!=HOPS_0 -- L2 on a remote node

Hmm. I'm not sure about tying it directly to NUMA hops. I worry we're
going to see more and more systems where there's a hierarchy within the
chip/package, in addition to the traditional NUMA hierarchy.

Although then I guess it becomes a question of what exactly is a NUMA
hop, maybe the answer is that on those future systems those
intra-chip/package hops should be represented as NUMA hops.

It's not like we have a hard definition of what a NUMA hop is?

>> I guess I'm 50/50 on whether that's enough levels, or whether we want
>> another bit to allow for future growth.
>
> Right, possibly safer to add one extra bit while we can I suppose.

Equally it's not _that_ hard to add another bit later (if there's still
one free), makes the API a little uglier to use, but not the end of the
world.

cheers


Re: linux-next: build failure after merge of the origin tree

2021-09-13 Thread Michael Ellerman
Linus Torvalds  writes:
> On Mon, Sep 13, 2021 at 7:08 PM Stephen Rothwell  
> wrote:
>>
>> That patch works for me - for the ppc64_defconfig build at least.
>
> Yeah, I just tested the allmodconfig case too, although I suspect it's
> essentially the same wrt the boot *.S files, so it probably doesn't
> matter.
>
> I'd like to have Michael or somebody who can actually run some tests
> on the end result ack that patch (or - even better - come up with
> something cleaner) before committing it.
>
> Because yeah, the build failure is annoying and I apologize, but I'd
> rather have the build fail overnight than commit something that builds
> but then is subtle buggy for some reason.
>
> But if I don't get any other comments by the time I'm up again
> tomorrow, I'll just commit it as "fixes the build".

I'll have a look and get back to you before tomorrow.

cheers


Re: [PATCH 1/1] selftests/powerpc: Add memmove_64 test

2021-09-11 Thread Michael Ellerman
Ritesh Harjani  writes:
> While debugging an issue, we wanted to check whether the arch specific
> kernel memmove implementation is correct. This selftest could help test that.
>
> Suggested-by: Aneesh Kumar K.V 
> Suggested-by: Vaibhav Jain 
> Signed-off-by: Ritesh Harjani 
> ---
>  tools/testing/selftests/powerpc/Makefile  |  1 +
>  .../selftests/powerpc/memmoveloop/.gitignore  |  2 +
>  .../selftests/powerpc/memmoveloop/Makefile| 19 +++
>  .../powerpc/memmoveloop/asm/asm-compat.h  |  0
>  .../powerpc/memmoveloop/asm/export.h  |  4 ++
>  .../powerpc/memmoveloop/asm/feature-fixups.h  |  0
>  .../selftests/powerpc/memmoveloop/asm/kasan.h |  0
>  .../powerpc/memmoveloop/asm/ppc_asm.h | 39 +
>  .../powerpc/memmoveloop/asm/processor.h   |  0
>  .../selftests/powerpc/memmoveloop/mem_64.S|  1 +
>  .../selftests/powerpc/memmoveloop/memcpy_64.S |  1 +
>  .../selftests/powerpc/memmoveloop/stubs.S |  8 +++
>  .../selftests/powerpc/memmoveloop/validate.c  | 56 +++
>  13 files changed, 131 insertions(+)
>  create mode 100644 tools/testing/selftests/powerpc/memmoveloop/.gitignore
>  create mode 100644 tools/testing/selftests/powerpc/memmoveloop/Makefile
>  create mode 100644 
> tools/testing/selftests/powerpc/memmoveloop/asm/asm-compat.h
>  create mode 100644 tools/testing/selftests/powerpc/memmoveloop/asm/export.h
>  create mode 100644 
> tools/testing/selftests/powerpc/memmoveloop/asm/feature-fixups.h
>  create mode 100644 tools/testing/selftests/powerpc/memmoveloop/asm/kasan.h
>  create mode 100644 tools/testing/selftests/powerpc/memmoveloop/asm/ppc_asm.h
>  create mode 100644 
> tools/testing/selftests/powerpc/memmoveloop/asm/processor.h
>  create mode 12 tools/testing/selftests/powerpc/memmoveloop/mem_64.S
>  create mode 12 tools/testing/selftests/powerpc/memmoveloop/memcpy_64.S
>  create mode 100644 tools/testing/selftests/powerpc/memmoveloop/stubs.S
>  create mode 100644 tools/testing/selftests/powerpc/memmoveloop/validate.c

You've copied a lot of tools/testing/selftests/powerpc/copyloops 

I'd be happier if you integrated the memmove test into that existing
code. I realise memmove is not technically a copy loop, but it's close
enough.

Did you try that and hit some roadblock?

cheers


> diff --git a/tools/testing/selftests/powerpc/Makefile 
> b/tools/testing/selftests/powerpc/Makefile
> index 0830e63818c1..d110b3e5cbcd 100644
> --- a/tools/testing/selftests/powerpc/Makefile
> +++ b/tools/testing/selftests/powerpc/Makefile
> @@ -16,6 +16,7 @@ export CFLAGS
>  SUB_DIRS = alignment \
>  benchmarks   \
>  cache_shape  \
> +memmoveloop  \
>  copyloops\
>  dscr \
>  mm   \
> diff --git a/tools/testing/selftests/powerpc/memmoveloop/.gitignore 
> b/tools/testing/selftests/powerpc/memmoveloop/.gitignore
> new file mode 100644
> index ..56c1426013d5
> --- /dev/null
> +++ b/tools/testing/selftests/powerpc/memmoveloop/.gitignore
> @@ -0,0 +1,2 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +memmove_64
> diff --git a/tools/testing/selftests/powerpc/memmoveloop/Makefile 
> b/tools/testing/selftests/powerpc/memmoveloop/Makefile
> new file mode 100644
> index ..d58d8c100099
> --- /dev/null
> +++ b/tools/testing/selftests/powerpc/memmoveloop/Makefile
> @@ -0,0 +1,19 @@
> +# SPDX-License-Identifier: GPL-2.0
> +CFLAGS += -m64
> +CFLAGS += -I$(CURDIR)
> +CFLAGS += -D SELFTEST
> +CFLAGS += -maltivec
> +
> +ASFLAGS = $(CFLAGS) -Wa,-mpower4
> +
> +TEST_GEN_PROGS := memmove_64
> +EXTRA_SOURCES := validate.c ../harness.c stubs.S
> +CPPFLAGS += -D TEST_MEMMOVE=test_memmove
> +
> +top_srcdir = ../../../../..
> +include ../../lib.mk
> +
> +$(OUTPUT)/memmove_64: mem_64.S memcpy_64.S $(EXTRA_SOURCES)
> + $(CC) $(CPPFLAGS) $(CFLAGS) \
> + -D TEST_MEMMOVE=test_memmove \
> + -o $@ $^
> diff --git a/tools/testing/selftests/powerpc/memmoveloop/asm/asm-compat.h 
> b/tools/testing/selftests/powerpc/memmoveloop/asm/asm-compat.h
> new file mode 100644
> index ..e69de29bb2d1
> diff --git a/tools/testing/selftests/powerpc/memmoveloop/asm/export.h 
> b/tools/testing/selftests/powerpc/memmoveloop/asm/export.h
> new file mode 100644
> index ..e6b80d5fbd14
> --- /dev/null
> +++ b/tools/testing/selftests/powerpc/memmoveloop/asm/export.h
> @@ -0,0 +1,4 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#define EXPORT_SYMBOL(x)
> +#define EXPORT_SYMBOL_GPL(x)
> +#define EXPORT_SYMBOL_KASAN(x)
> diff --git a/tools/testing/selftests/powerpc/memmoveloop/asm/feature-fixups.h 
> b/tools/testing/selftests/powerpc/memmoveloop/asm/feature-fixups.h
> new file mode 100644
> index ..e69de29bb2d1
> diff --git a/tools/testing/selftests/powerpc/memmoveloop/asm/kasan.h 
> b/tools/testing/selftests/powerpc/memmoveloop/asm/kasan.h
> new file mode 100644
> index 

Re: [PATCH 1/1] powerpc: Drop superfluous pci_dev_is_added() calls

2021-09-11 Thread Michael Ellerman
Niklas Schnelle  writes:
> On powerpc, pci_dev_is_added() is called as part of SR-IOV fixups
> that are done under pcibios_add_device() which in turn is only called in
> pci_device_add() whih is called when a PCI device is scanned.

Thanks for cleaning this up for us.

> Now pci_dev_assign_added() is called in pci_bus_add_device() which is
> only called after scanning the device. Thus pci_dev_is_added() is always
> false and can be dropped.

My only query is whether we can pin down when that changed.

Oliver said:

  The use of pci_dev_is_added() in arch/powerpc was because in the past
  pci_bus_add_device() could be called before pci_device_add(). That was
  fixed a while ago so It should be safe to remove those calls now.

I trawled back through the history a bit but I can't remember/find which
commit changed that, Oliver can you remember?

cheers

> diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c 
> b/arch/powerpc/platforms/powernv/pci-sriov.c
> index 28aac933a439..deddbb233fde 100644
> --- a/arch/powerpc/platforms/powernv/pci-sriov.c
> +++ b/arch/powerpc/platforms/powernv/pci-sriov.c
> @@ -9,9 +9,6 @@
>  
>  #include "pci.h"
>  
> -/* for pci_dev_is_added() */
> -#include "../../../../drivers/pci/pci.h"
> -
>  /*
>   * The majority of the complexity in supporting SR-IOV on PowerNV comes from
>   * the need to put the MMIO space for each VF into a separate PE. Internally
> @@ -228,9 +225,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
> pci_dev *pdev)
>  
>  void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
>  {
> - if (WARN_ON(pci_dev_is_added(pdev)))
> - return;
> -
>   if (pdev->is_virtfn) {
>   struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
>  
> diff --git a/arch/powerpc/platforms/pseries/setup.c 
> b/arch/powerpc/platforms/pseries/setup.c
> index f79126f16258..2188054470c1 100644
> --- a/arch/powerpc/platforms/pseries/setup.c
> +++ b/arch/powerpc/platforms/pseries/setup.c
> @@ -74,7 +74,6 @@
>  #include 
>  
>  #include "pseries.h"
> -#include "../../../../drivers/pci/pci.h"
>  
>  DEFINE_STATIC_KEY_FALSE(shared_processor);
>  EXPORT_SYMBOL(shared_processor);
> @@ -750,7 +749,7 @@ static void pseries_pci_fixup_iov_resources(struct 
> pci_dev *pdev)
>   const int *indexes;
>   struct device_node *dn = pci_device_to_OF_node(pdev);
>  
> - if (!pdev->is_physfn || pci_dev_is_added(pdev))
> + if (!pdev->is_physfn)
>   return;
>   /*Firmware must support open sriov otherwise dont configure*/
>   indexes = of_get_property(dn, "ibm,open-sriov-vf-bar-info", NULL);
> -- 
> 2.25.1


Re: [PATCH 06/10] powerpc: remove GCC version check for UPD_CONSTR

2021-09-11 Thread Michael Ellerman
Nathan Chancellor  writes:
> On 9/10/2021 4:40 PM, Nick Desaulniers wrote:
>> Now that GCC 5.1 is the minimum supported version, we can drop this
>> workaround for older versions of GCC. This adversely affected clang,
>> too.
>> 
>> Cc: Michael Ellerman 
>> Cc: Benjamin Herrenschmidt 
>> Cc: Paul Mackerras 
>> Cc: Segher Boessenkool 
>> Cc: Christophe Leroy 
>> Cc: linuxppc-dev@lists.ozlabs.org
>> Signed-off-by: Nick Desaulniers 
>> ---
>>   arch/powerpc/include/asm/asm-const.h | 10 --
>>   1 file changed, 10 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/asm-const.h 
>> b/arch/powerpc/include/asm/asm-const.h
>> index 0ce2368bd20f..dbfa5e1e3198 100644
>> --- a/arch/powerpc/include/asm/asm-const.h
>> +++ b/arch/powerpc/include/asm/asm-const.h
>> @@ -12,16 +12,6 @@
>>   #  define ASM_CONST(x) __ASM_CONST(x)
>>   #endif
>>   
>> -/*
>> - * Inline assembly memory constraint
>> - *
>> - * GCC 4.9 doesn't properly handle pre update memory constraint "m<>"
>> - *
>> - */
>> -#if defined(GCC_VERSION) && GCC_VERSION < 5
>> -#define UPD_CONSTR ""
>> -#else
>>   #define UPD_CONSTR "<>"
>> -#endif
>
> The only reason this exists is because of commit 592bbe9c505d 
> ("powerpc/uaccess: Don't use "m<>" constraint with GCC 4.9"). It is 
> probably just worth sinking "<>" into all of the callsites and removing
> UPD_CONSTR.

Yeah that would be great if you're doing a v2. Or we can do it as a
follow-up.

cheers


Re: [PATCH 1/3] perf: Add macros to specify onchip L2/L3 accesses

2021-09-09 Thread Michael Ellerman
Peter Zijlstra  writes:
> On Wed, Sep 08, 2021 at 05:17:53PM +1000, Michael Ellerman wrote:
>> Kajol Jain  writes:
>
>> > diff --git a/include/uapi/linux/perf_event.h 
>> > b/include/uapi/linux/perf_event.h
>> > index f92880a15645..030b3e990ac3 100644
>> > --- a/include/uapi/linux/perf_event.h
>> > +++ b/include/uapi/linux/perf_event.h
>> > @@ -1265,7 +1265,9 @@ union perf_mem_data_src {
>> >  #define PERF_MEM_LVLNUM_L20x02 /* L2 */
>> >  #define PERF_MEM_LVLNUM_L30x03 /* L3 */
>> >  #define PERF_MEM_LVLNUM_L40x04 /* L4 */
>> > -/* 5-0xa available */
>> > +#define PERF_MEM_LVLNUM_OC_L2 0x05 /* On Chip L2 */
>> > +#define PERF_MEM_LVLNUM_OC_L3 0x06 /* On Chip L3 */
>> 
>> The obvious use for 5 is for "L5" and so on.
>> 
>> I'm not sure adding new levels is the best idea, because these don't fit
>> neatly into the hierarchy, they are off to the side.
>> 
>> 
>> I wonder if we should use the remote field.
>> 
>> ie. for another core's L2 we set:
>> 
>>   mem_lvl = PERF_MEM_LVL_L2
>>   mem_remote = 1
>
> This mixes APIs (see below), IIUC the correct usage would be something
> like: lvl_num=L2 remote=1

Aha, I was wondering how lvl and lvl_num were supposed to interact.

>> Which would mean "remote L2", but not remote enough to be
>> lvl = PERF_MEM_LVL_REM_CCE1.
>> 
>> It would be printed by the existing tools/perf code as "Remote L2", vs
>> "Remote cache (1 hop)", which seems OK.
>> 
>> 
>> ie. we'd be able to express:
>> 
>>   Current core's L2: LVL_L2
>>   Other core's L2:   LVL_L2 | REMOTE
>>   Other chip's L2:   LVL_REM_CCE1 | REMOTE
>> 
>> And similarly for L3.
>> 
>> I think that makes sense? Unless people think remote should be reserved
>> to mean on another chip, though we already have REM_CCE1 for that.
>
> IIRC the PERF_MEM_LVL_* namespace is somewhat depricated in favour of
> the newer composite PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_} fields. Of
> course, ABIs being what they are, we get to support both :/ But I'm not
> sure mixing them is a great idea.

OK.

> Also, clearly this could use a comment...
>
> The 'new' composite doesnt have a hops field because the hardware that
> nessecitated that change doesn't report it, but we could easily add a
> field there.
>
> Suppose we add, mem_hops:3 (would 6 hops be too small?) and the
> corresponding PERF_MEM_HOPS_{NA, 0..6}

It's really 7 if we use remote && hop = 0 to mean the first hop.

If we're wanting to use some of the hop levels to represent
intra-chip/package hops then we could possibly use them all on a really
big system.

eg. you could imagine something like:

 L2 |   - local L2
 L2 | REMOTE | HOPS_0   - L2 of neighbour core
 L2 | REMOTE | HOPS_1   - L2 of near core on same chip (same 1/2 of chip)
 L2 | REMOTE | HOPS_2   - L2 of far core on same chip (other 1/2 of chip)
 L2 | REMOTE | HOPS_3   - L2 of sibling chip in same package
 L2 | REMOTE | HOPS_4   - L2 on separate package 1 hop away
 L2 | REMOTE | HOPS_5   - L2 on separate package 2 hops away
 L2 | REMOTE | HOPS_6   - L2 on separate package 3 hops away


Whether it's useful to represent all those levels I'm not sure, but it's
probably good if we have the ability.

I guess I'm 50/50 on whether that's enough levels, or whether we want
another bit to allow for future growth.

> Then I suppose you can encode things like:
> 
>   L2  - local L2
>   L2 | REMOTE - remote L2 at an unspecified distance (NA)
>   L2 | REMOTE | HOPS_0- remote L2 on the same node
>   L2 | REMOTE | HOPS_1- remote L2 on a node 1 removed
> 
> Would that work?

Yeah that looks good to me.

cheers


Re: [Bug 213837] "Kernel panic - not syncing: corrupted stack end detected inside scheduler" at building via distcc on a G5

2021-09-08 Thread Michael Ellerman
bugzilla-dae...@bugzilla.kernel.org writes:
> https://bugzilla.kernel.org/show_bug.cgi?id=213837
>
> Erhard F. (erhar...@mailbox.org) changed:
>
>What|Removed |Added
> 
>See Also|https://bugzilla.kernel.org |
>|/show_bug.cgi?id=213079 |
>
> --- Comment #4 from Erhard F. (erhar...@mailbox.org) ---
> Checked out whether this has really something to do with bug #213079 or not by
> copying this root partition to a regular HDD and use that one instead. As the
> issue still happens it seems these are two seperate bugs.
>
> [...]
> Kernel panic - not syncing: corrupted stack end detected inside scheduler

Can you try this patch, it might help us work out what is corrupting the
stack.

cheers

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c4462c454ab9..07bfa25c1b48 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5490,8 +5490,14 @@ static noinline void __schedule_bug(struct task_struct 
*prev)
 static inline void schedule_debug(struct task_struct *prev, bool preempt)
 {
 #ifdef CONFIG_SCHED_STACK_END_CHECK
-   if (task_stack_end_corrupted(prev))
+   if (task_stack_end_corrupted(prev)) {
+   char *start = (char *)end_of_stack(prev);
+   pr_err("stack corrupted? stack end = 0x%px\n", 
end_of_stack(prev));
+   print_hex_dump(KERN_ERR, "stack: ", DUMP_PREFIX_ADDRESS, 16, 4,
+  start - SZ_1K, THREAD_SIZE + SZ_1K, true);
+
panic("corrupted stack end detected inside scheduler\n");
+   }
 
if (task_scs_end_corrupted(prev))
panic("corrupted shadow stack detected inside scheduler\n");


Re: [PATCH 1/3] perf: Add macros to specify onchip L2/L3 accesses

2021-09-08 Thread Michael Ellerman
Kajol Jain  writes:
> Add couple of new macros to represent onchip L2 and onchip L3 accesses.

It would be "on chip". But I think this needs much more explanation,
this is a generic header so these definitions need to make sense, and
have an understood meaning, across all architectures.

I think most people are going to read "on chip" as differentiating
between an L2/L3 that is "on chip" vs "off chip".

But the case you're trying to express is "another core's L2/L3 on the
same chip as the CPU", vs "the current CPU's L2/L3".


> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index f92880a15645..030b3e990ac3 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -1265,7 +1265,9 @@ union perf_mem_data_src {
>  #define PERF_MEM_LVLNUM_L2   0x02 /* L2 */
>  #define PERF_MEM_LVLNUM_L3   0x03 /* L3 */
>  #define PERF_MEM_LVLNUM_L4   0x04 /* L4 */
> -/* 5-0xa available */
> +#define PERF_MEM_LVLNUM_OC_L20x05 /* On Chip L2 */
> +#define PERF_MEM_LVLNUM_OC_L30x06 /* On Chip L3 */

The obvious use for 5 is for "L5" and so on.

I'm not sure adding new levels is the best idea, because these don't fit
neatly into the hierarchy, they are off to the side.


I wonder if we should use the remote field.

ie. for another core's L2 we set:

  mem_lvl = PERF_MEM_LVL_L2
  mem_remote = 1

Which would mean "remote L2", but not remote enough to be
lvl = PERF_MEM_LVL_REM_CCE1.

It would be printed by the existing tools/perf code as "Remote L2", vs
"Remote cache (1 hop)", which seems OK.


ie. we'd be able to express:

  Current core's L2: LVL_L2
  Other core's L2:   LVL_L2 | REMOTE
  Other chip's L2:   LVL_REM_CCE1 | REMOTE

And similarly for L3.

I think that makes sense? Unless people think remote should be reserved
to mean on another chip, though we already have REM_CCE1 for that.

cheers




Re: [PATCH] powerpc/mce: Fix access error in mce handler

2021-09-07 Thread Michael Ellerman
Ganesh  writes:
> On 9/6/21 6:03 PM, Michael Ellerman wrote:
>
>> Ganesh Goudar  writes:
>>> We queue an irq work for deferred processing of mce event
>>> in realmode mce handler, where translation is disabled.
>>> Queuing of the work may result in accessing memory outside
>>> RMO region, such access needs the translation to be enabled
>>> for an LPAR running with hash mmu else the kernel crashes.
>>>
>>> So enable the translation before queuing the work.
>>>
>>> Without this change following trace is seen on injecting machine
>>> check error in an LPAR running with hash mmu.
>> What type of error are you injecting?
>
> SLB multihit in kernel mode.
>
>>
>>> Oops: Kernel access of bad area, sig: 11 [#1]
>>> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>>> CPU: 5 PID: 1883 Comm: insmod Tainted: GOE 5.14.0-mce+ #137
>>> NIP:  c0735d60 LR: c0318640 CTR: 
>>> REGS: c0001ebff9a0 TRAP: 0300   Tainted: G   OE  (5.14.0-mce+)
>>> MSR:  80001003   CR: 28008228  XER: 0001
>>> CFAR: c031863c DAR: c0027fa8fe08 DSISR: 4000 IRQMASK: 0
>>> GPR00: c03186d0 c0001ebffc40 c1b0df00 c16337e8
>>> GPR04: c16337e8 c0027fa8fe08 0023 c16337f0
>>> GPR08: 0023 c12ffe08  c00801460240
>>> GPR12:  c0001ec9a900 c0002ac4bd00 
>>> GPR16: 05a0 c008006b c008006b05a0 c0ff3068
>>> GPR20: c0002ac4bbc0 0001 c0002ac4bbc0 c00801490298
>>> GPR24: c00801490108 c1636198 c00801470090 c00801470058
>>> GPR28: 0510 c0080100 c0080819 0019
>>> NIP [c0735d60] llist_add_batch+0x0/0x40
>>> LR [c0318640] __irq_work_queue_local+0x70/0xc0
>>> Call Trace:
>>> [c0001ebffc40] [c0001ebffc0c] 0xc0001ebffc0c (unreliable)
>>> [c0001ebffc60] [c03186d0] irq_work_queue+0x40/0x70
>>> [c0001ebffc80] [c004425c] machine_check_queue_event+0xbc/0xd0
>>> [c0001ebffcf0] [c000838c] machine_check_early_common+0x16c/0x1f4
>>>
>>> Fixes: 74c3354bc1d89 ("powerpc/pseries/mce: restore msr before returning 
>>> from handler")
>> Please explain in more detail why that commit caused this breakage.
>
> After enabling translation in mce_handle_error() we used to leave it enabled 
> to avoid
> crashing here, but now with this commit we are restoring the MSR to disable 
> translation.

Are you sure we left the MMU enabled to avoid crashing there, or we just
left it enabled by accident?

But yeah, previously the MMU was enabled when we got here whereas now
it's not, because of that change.

> Missed to mention it in commit log, I will add it.

Thanks.

>>> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
>>> index 47a683cd00d2..9d1e39d42e3e 100644
>>> --- a/arch/powerpc/kernel/mce.c
>>> +++ b/arch/powerpc/kernel/mce.c
>>> @@ -249,6 +249,7 @@ void machine_check_queue_event(void)
>>>   {
>>> int index;
>>> struct machine_check_event evt;
>>> +   unsigned long msr;
>>>   
>>> if (!get_mce_event(, MCE_EVENT_RELEASE))
>>> return;
>>> @@ -262,8 +263,19 @@ void machine_check_queue_event(void)
>>> memcpy(_paca->mce_info->mce_event_queue[index],
>>>, sizeof(evt));
>>>   
>>> -   /* Queue irq work to process this event later. */
>>> -   irq_work_queue(_event_process_work);
>>> +   /* Queue irq work to process this event later. Before
>>> +* queuing the work enable translation for non radix LPAR,
>>> +* as irq_work_queue may try to access memory outside RMO
>>> +* region.
>>> +*/
>>> +   if (!radix_enabled() && firmware_has_feature(FW_FEATURE_LPAR)) {
>>> +   msr = mfmsr();
>>> +   mtmsr(msr | MSR_IR | MSR_DR);
>>> +   irq_work_queue(_event_process_work);
>>> +   mtmsr(msr);
>>> +   } else {
>>> +   irq_work_queue(_event_process_work);
>>> +   }
>>>   }
>> We already went to virtual mode and queued (different) irq work in
>> arch/powerpc/platforms/pseries/ras.c:mce_handle_error()
>>
>> We also called save_mce_event() which also might have queued irq work,
>> via machine_check_ue_ev

Re: [PATCH 1/2] powerpc/perf: Expose instruction and data address registers as part of extended regs

2021-09-07 Thread Michael Ellerman
Athira Rajeev  writes:
> Patch adds support to include Sampled Instruction Address Register
> (SIAR) and Sampled Data Address Register (SDAR) SPRs as part of extended
> registers. Update the definition of PERF_REG_PMU_MASK_300/31 and
> PERF_REG_EXTENDED_MAX to include these SPR's.
>
> Signed-off-by: Athira Rajeev 
> ---
>  arch/powerpc/include/uapi/asm/perf_regs.h | 12 +++-
>  arch/powerpc/perf/perf_regs.c |  4 
>  2 files changed, 11 insertions(+), 5 deletions(-)
>
...
> diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
> index b931eed..51d31b6 100644
> --- a/arch/powerpc/perf/perf_regs.c
> +++ b/arch/powerpc/perf/perf_regs.c
> @@ -90,7 +90,11 @@ static u64 get_ext_regs_value(int idx)
>   return mfspr(SPRN_SIER2);
>   case PERF_REG_POWERPC_SIER3:
>   return mfspr(SPRN_SIER3);
> + case PERF_REG_POWERPC_SDAR:
> + return mfspr(SPRN_SDAR);
>  #endif
> + case PERF_REG_POWERPC_SIAR:
> + return mfspr(SPRN_SIAR);
>   default: return 0;
>   }

This file is built for all powerpc configs that have PERF_EVENTS. Which
includes CPUs that don't have SDAR or SIAR.

Don't we need checks in perf_reg_value() like we do for SIER?

I guess we already got this wrong when we added the Power10 registers,
SIER2/3 etc.

cheers


Re: [RFC PATCH v2] powerpc/papr_scm: Move duplicate definitions to common header files

2021-09-07 Thread Michael Ellerman
Shivaprasad G Bhat  writes:
> papr_scm and ndtest share common PDSM payload structs like
> nd_papr_pdsm_health. Presently these structs are duplicated across papr_pdsm.h
> and ndtest.h header files. Since 'ndtest' is essentially arch independent and 
> can
> run on platforms other than PPC64, a way needs to be deviced to avoid 
> redundancy
> and duplication of PDSM structs in future.
>
> So the patch proposes moving the PDSM header from arch/powerpc/include/uapi/ 
> to
> the generic include/uapi/linux directory. Also, there are some #defines common
> between papr_scm and ndtest which are not exported to the user space. So, move
> them to a header file which can be shared across ndtest and papr_scm via newly
> introduced include/linux/papr_scm.h.
>
> Signed-off-by: Shivaprasad G Bhat 
> Signed-off-by: Vaibhav Jain 
> Suggested-by: "Aneesh Kumar K.V" 
> ---
> Changelog:
>
> Since v1:
> Link: 
> https://patchwork.kernel.org/project/linux-nvdimm/patch/162505488483.72147.12741153746322191381.stgit@56e104a48989/
> * Removed dependency on this patch for the other patches
>
>  MAINTAINERS   |2 
>  arch/powerpc/include/uapi/asm/papr_pdsm.h |  165 
> -
>  arch/powerpc/platforms/pseries/papr_scm.c |   43 
>  include/linux/papr_scm.h  |   48 
>  include/uapi/linux/papr_pdsm.h|  165 
> +

This doesn't make sense to me.

Anything with papr (or PAPR) in the name is fundamentally powerpc
specific, it doesn't belong in a generic header, or in a generic
location.

What's the actual problem you're trying to solve?

If it's just including papr_scm bits into ndtest.c then that should be
as simple as:

#ifdef __powerpc__
#include 
#endif

Shouldn't it?

cheers


Re: [PATCH v2] ftrace: Cleanup ftrace_dyn_arch_init()

2021-09-06 Thread Michael Ellerman
Weizhao Ouyang  writes:
> Most of ARCHs use empty ftrace_dyn_arch_init(), introduce a weak common
> ftrace_dyn_arch_init() to cleanup them.
>
> Signed-off-by: Weizhao Ouyang 
> Acked-by: Heiko Carstens  (s390)
>
> ---
>
> Changes in v2:
> -- correct CONFIG_DYNAMIC_FTRACE on PowerPC
> -- add Acked-by tag

> diff --git a/arch/powerpc/include/asm/ftrace.h 
> b/arch/powerpc/include/asm/ftrace.h
> index debe8c4f7062..d59f67c0225f 100644
> --- a/arch/powerpc/include/asm/ftrace.h
> +++ b/arch/powerpc/include/asm/ftrace.h
> @@ -61,6 +61,10 @@ struct dyn_arch_ftrace {
>  };
>  #endif /* __ASSEMBLY__ */
>  
> +#ifdef CONFIG_DYNAMIC_FTRACE
> +int __init ftrace_dyn_arch_init(void);
> +#endif /* CONFIG_DYNAMIC_FTRACE */
> +
>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
>  #define ARCH_SUPPORTS_FTRACE_OPS 1
>  #endif

That breaks the build for powerpc:

  /linux/arch/powerpc/include/asm/ftrace.h: Assembler messages:
  /linux/arch/powerpc/include/asm/ftrace.h:65: Error: unrecognized opcode: `int'
  make[4]: *** [/linux/scripts/Makefile.build:352: 
arch/powerpc/kernel/trace/ftrace_64.o] Error 1
  make[3]: *** [/linux/scripts/Makefile.build:514: arch/powerpc/kernel/trace] 
Error 2
  make[2]: *** [/linux/scripts/Makefile.build:514: arch/powerpc/kernel] Error 2
  make[1]: *** [/linux/Makefile:1861: arch/powerpc] Error 2
  make[1]: *** Waiting for unfinished jobs

It needs to be inside an #ifndef __ASSEMBLY__ section.

cheers


Re: [PATCH] powerpc/mce: Fix access error in mce handler

2021-09-06 Thread Michael Ellerman
Ganesh Goudar  writes:
> We queue an irq work for deferred processing of mce event
> in realmode mce handler, where translation is disabled.
> Queuing of the work may result in accessing memory outside
> RMO region, such access needs the translation to be enabled
> for an LPAR running with hash mmu else the kernel crashes.
>
> So enable the translation before queuing the work.
>
> Without this change following trace is seen on injecting machine
> check error in an LPAR running with hash mmu.

What type of error are you injecting?

> Oops: Kernel access of bad area, sig: 11 [#1]
> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> CPU: 5 PID: 1883 Comm: insmod Tainted: GOE 5.14.0-mce+ #137
> NIP:  c0735d60 LR: c0318640 CTR: 
> REGS: c0001ebff9a0 TRAP: 0300   Tainted: G   OE  (5.14.0-mce+)
> MSR:  80001003   CR: 28008228  XER: 0001
> CFAR: c031863c DAR: c0027fa8fe08 DSISR: 4000 IRQMASK: 0
> GPR00: c03186d0 c0001ebffc40 c1b0df00 c16337e8
> GPR04: c16337e8 c0027fa8fe08 0023 c16337f0
> GPR08: 0023 c12ffe08  c00801460240
> GPR12:  c0001ec9a900 c0002ac4bd00 
> GPR16: 05a0 c008006b c008006b05a0 c0ff3068
> GPR20: c0002ac4bbc0 0001 c0002ac4bbc0 c00801490298
> GPR24: c00801490108 c1636198 c00801470090 c00801470058
> GPR28: 0510 c0080100 c0080819 0019
> NIP [c0735d60] llist_add_batch+0x0/0x40
> LR [c0318640] __irq_work_queue_local+0x70/0xc0
> Call Trace:
> [c0001ebffc40] [c0001ebffc0c] 0xc0001ebffc0c (unreliable)
> [c0001ebffc60] [c03186d0] irq_work_queue+0x40/0x70
> [c0001ebffc80] [c004425c] machine_check_queue_event+0xbc/0xd0
> [c0001ebffcf0] [c000838c] machine_check_early_common+0x16c/0x1f4
>
> Fixes: 74c3354bc1d89 ("powerpc/pseries/mce: restore msr before returning from 
> handler")

Please explain in more detail why that commit caused this breakage.

> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> index 47a683cd00d2..9d1e39d42e3e 100644
> --- a/arch/powerpc/kernel/mce.c
> +++ b/arch/powerpc/kernel/mce.c
> @@ -249,6 +249,7 @@ void machine_check_queue_event(void)
>  {
>   int index;
>   struct machine_check_event evt;
> + unsigned long msr;
>  
>   if (!get_mce_event(, MCE_EVENT_RELEASE))
>   return;
> @@ -262,8 +263,19 @@ void machine_check_queue_event(void)
>   memcpy(_paca->mce_info->mce_event_queue[index],
>  , sizeof(evt));
>  
> - /* Queue irq work to process this event later. */
> - irq_work_queue(_event_process_work);
> + /* Queue irq work to process this event later. Before
> +  * queuing the work enable translation for non radix LPAR,
> +  * as irq_work_queue may try to access memory outside RMO
> +  * region.
> +  */
> + if (!radix_enabled() && firmware_has_feature(FW_FEATURE_LPAR)) {
> + msr = mfmsr();
> + mtmsr(msr | MSR_IR | MSR_DR);
> + irq_work_queue(_event_process_work);
> + mtmsr(msr);
> + } else {
> + irq_work_queue(_event_process_work);
> + }
>  }

We already went to virtual mode and queued (different) irq work in
arch/powerpc/platforms/pseries/ras.c:mce_handle_error()

We also called save_mce_event() which also might have queued irq work,
via machine_check_ue_event().

So it really feels like something about the design is wrong if we have
to go to virtual mode again and queue more irq work here.

I guess we can probably merge this as a backportable fix, doing anything
else would be a bigger change.

Looking at ras.c there's the comment:

 * Enable translation as we will be accessing per-cpu variables
 * in save_mce_event() which may fall outside RMO region, also

But AFAICS it's only irq_work_queue() that touches anything percpu?

So maybe we should just not be using irq_work_queue(). It's a pretty
thin wrapper around set_dec(1), perhaps we just need to hand-roll some
real-mode friendly way of doing that.

cheers


Re: [PATCH -next] powerpc/mm: check base flags in ioremap_prot

2021-09-04 Thread Michael Ellerman
Nanyong Sun  writes:
> On 2021/9/3 17:16, Christophe Leroy wrote:
>> Le 03/09/2021 à 11:03, Nanyong Sun a écrit :
>>> Some drivers who call ioremap_prot without setting base flags like
>>> ioremap_prot(addr, len, 0) may work well before
>>> commit 56f3c1413f5c ("powerpc/mm: properly set PAGE_KERNEL flags in
>>> ioremap()"), but now they will get a virtual address "successfully"
>>> from ioremap_prot and badly fault on memory access later because that
>>> patch also dropped the hack adding of base flags for ioremap_prot.
>>>
>>> So return NULL and throw a warning if the caller of ioremap_prot did
>>> not set base flags properly. Why not just hack adding PAGE_KERNEL flags
>>> in the ioremap_prot, because most scenarios can be covered by high level
>>> functions like ioremap(), ioremap_coherent(), ioremap_cache()...
>>> so it is better to keep max flexibility for this low level api.
>>
>> As far as I can see, there is no user of this fonction that sets flags 
>> to 0 in the kernel tree.
>>
>> Did you find any ? If you did, I think it is better to fix the caller.
>>
>> Christophe
>>
> I see some vendor's drivers which are not on upstream ...

Sorry, but we don't carry extraneous checks in upstream for the sake of
out-of-tree drivers.

cheers


Re: [PATCH kernel] KVM: PPC: Fix clearing never mapped TCEs in realmode

2021-09-03 Thread Michael Ellerman
On Fri, 27 Aug 2021 14:07:06 +1000, Alexey Kardashevskiy wrote:
> Since e1a1ef84cd07, pages for TCE tables for KVM guests are allocated
> only when needed. This allows skipping any update when clearing TCEs.
> This works mostly fine as TCE updates are handled when MMU is enabled.
> The realmode handlers fail with H_TOO_HARD when pages are not yet
> allocated except when clearing a TCE in which case KVM prints a warning
> but proceeds to dereference a NULL pointer which crashes the host OS.
> 
> [...]

Applied to powerpc/next.

[1/1] KVM: PPC: Fix clearing never mapped TCEs in realmode
  https://git.kernel.org/powerpc/c/1d78dfde33a02da1d816279c2e3452978b7abd39

cheers


Re: [PATCH v2] powerpc/bug: Cast to unsigned long before passing to inline asm

2021-09-03 Thread Michael Ellerman
On Wed, 1 Sep 2021 21:25:22 +1000, Michael Ellerman wrote:
> In commit 1e688dd2a3d6 ("powerpc/bug: Provide better flexibility to
> WARN_ON/__WARN_FLAGS() with asm goto") we changed WARN_ON(). Previously
> it would take the warning condition, x, and double negate it before
> converting the result to int, and passing that int to the underlying
> inline asm. ie:
> 
>   #define WARN_ON(x) ({
>   int __ret_warn_on = !!(x);
>   if (__builtin_constant_p(__ret_warn_on)) {
>   ...
>   } else {
>   BUG_ENTRY(PPC_TLNEI " %4, 0",
> BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN),
> "r" (__ret_warn_on));
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/bug: Cast to unsigned long before passing to inline asm
  https://git.kernel.org/powerpc/c/e432fe97f3e5de325b40021e505cce53877586c5

cheers


Re: [PATCH] powerpc/ptdump: Fix generic ptdump for 64-bit

2021-09-03 Thread Michael Ellerman
On Tue, 31 Aug 2021 23:51:51 +1000, Michael Ellerman wrote:
> Since the conversion to generic ptdump we see crashes on 64-bit:
> 
>   BUG: Unable to handle kernel data access on read at 0xc0eeff7f
>   Faulting instruction address: 0xc045e5fc
>   Oops: Kernel access of bad area, sig: 11 [#1]
>   ...
>   NIP __walk_page_range+0x2bc/0xce0
>   LR  __walk_page_range+0x240/0xce0
>   Call Trace:
> __walk_page_range+0x240/0xce0 (unreliable)
> walk_page_range_novma+0x74/0xb0
> ptdump_walk_pgd+0x98/0x170
> ptdump_check_wx+0x88/0xd0
> mark_rodata_ro+0x48/0x80
> kernel_init+0x74/0x1a0
> ret_from_kernel_thread+0x5c/0x64
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/ptdump: Fix generic ptdump for 64-bit
  https://git.kernel.org/powerpc/c/b14b8b1ed0e15b8f43fba9c25654278a31ee3c2f

cheers


Re: [PATCH v2 2/2] selftests/powerpc: Add scv versions of the basic TM syscall tests

2021-09-01 Thread Michael Ellerman
Nicholas Piggin  writes:
> The basic TM vs syscall test code hard codes an sc instruction for the
> system call, which fails to cover scv even when the userspace libc has
> support for it.
>
> Duplicate the tests with hard coded scv variants so both are tested
> when possible.
>
> Signed-off-by: Nicholas Piggin 
> ---
>  .../selftests/powerpc/tm/tm-syscall-asm.S | 46 +++
>  .../testing/selftests/powerpc/tm/tm-syscall.c | 36 ---
>  2 files changed, 75 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/powerpc/tm/tm-syscall-asm.S 
> b/tools/testing/selftests/powerpc/tm/tm-syscall-asm.S
> index bd1ca25febe4..849316831e6a 100644
> --- a/tools/testing/selftests/powerpc/tm/tm-syscall-asm.S
> +++ b/tools/testing/selftests/powerpc/tm/tm-syscall-asm.S
> @@ -2,6 +2,10 @@
>  #include 
>  #include 
>  
> +/* ppc-asm.h does not define r0 or r1 */
> +#define r0 0
> +#define r1 1
> +
>   .text
>  FUNC_START(getppid_tm_active)
>   tbegin.
> @@ -26,3 +30,45 @@ FUNC_START(getppid_tm_suspended)
>  1:
>   li  r3, -1
>   blr
> +
> +FUNC_START(getppid_scv_tm_active)
> + mflrr0
> + std r0,16(r1)
> + stdur1,-32(r1)
> + tbegin.
> + beq 1f
> + li  r0, __NR_getppid
> + scv 0
> + tend.
> + addir1,r1,32
> + ld  r0,16(r1)
> + mtlrr0
> + blr
> +1:
> + li  r3, -1
> + addir1,r1,32
> + ld  r0,16(r1)
> + mtlrr0
> + blr

There's some macros in tools/testing/selftests/powerpc/include/basic_asm.h
that can do some of this boiler plate stack setup/teardown.

Incremental diff below to use them, only build tested.

cheers


diff --git a/tools/testing/selftests/powerpc/tm/tm-syscall-asm.S 
b/tools/testing/selftests/powerpc/tm/tm-syscall-asm.S
index 849316831e6a..a73694daca71 100644
--- a/tools/testing/selftests/powerpc/tm/tm-syscall-asm.S
+++ b/tools/testing/selftests/powerpc/tm/tm-syscall-asm.S
@@ -1,10 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#include 
-#include 
-
-/* ppc-asm.h does not define r0 or r1 */
-#define r0 0
-#define r1 1
+#include 
 
.text
 FUNC_START(getppid_tm_active)
@@ -32,29 +27,21 @@ FUNC_START(getppid_tm_suspended)
blr
 
 FUNC_START(getppid_scv_tm_active)
-   mflrr0
-   std r0,16(r1)
-   stdur1,-32(r1)
+   PUSH_BASIC_STACK(0)
tbegin.
beq 1f
li  r0, __NR_getppid
scv 0
tend.
-   addir1,r1,32
-   ld  r0,16(r1)
-   mtlrr0
+   POP_BASIC_STACK(0)
blr
 1:
li  r3, -1
-   addir1,r1,32
-   ld  r0,16(r1)
-   mtlrr0
+   POP_BASIC_STACK(0)
blr
 
 FUNC_START(getppid_scv_tm_suspended)
-   mflrr0
-   std r0,16(r1)
-   stdur1,-32(r1)
+   PUSH_BASIC_STACK(0)
tbegin.
beq 1f
li  r0, __NR_getppid
@@ -62,13 +49,9 @@ FUNC_START(getppid_scv_tm_suspended)
scv 0
tresume.
tend.
-   addir1,r1,32
-   ld  r0,16(r1)
-   mtlrr0
+   POP_BASIC_STACK(0)
blr
 1:
li  r3, -1
-   addir1,r1,32
-   ld  r0,16(r1)
-   mtlrr0
+   POP_BASIC_STACK(0)
blr


Re: [PATCH v2 2/2] selftests/powerpc: Add scv versions of the basic TM syscall tests

2021-09-01 Thread Michael Ellerman
Nicholas Piggin  writes:
> Excerpts from Christophe Leroy's message of September 2, 2021 3:15 am:
>> Le 01/09/2021 à 18:54, Nicholas Piggin a écrit :
>>> The basic TM vs syscall test code hard codes an sc instruction for the
>>> system call, which fails to cover scv even when the userspace libc has
>>> support for it.
>>> 
>>> Duplicate the tests with hard coded scv variants so both are tested
>>> when possible.
>>> 
>>> Signed-off-by: Nicholas Piggin 
>>> ---
>>>   .../selftests/powerpc/tm/tm-syscall-asm.S | 46 +++
>>>   .../testing/selftests/powerpc/tm/tm-syscall.c | 36 ---
>>>   2 files changed, 75 insertions(+), 7 deletions(-)
>>> 
>>> diff --git a/tools/testing/selftests/powerpc/tm/tm-syscall-asm.S 
>>> b/tools/testing/selftests/powerpc/tm/tm-syscall-asm.S
>>> index bd1ca25febe4..849316831e6a 100644
>>> --- a/tools/testing/selftests/powerpc/tm/tm-syscall-asm.S
>>> +++ b/tools/testing/selftests/powerpc/tm/tm-syscall-asm.S
>>> @@ -2,6 +2,10 @@
>>>   #include 
>>>   #include 
>>>   
>>> +/* ppc-asm.h does not define r0 or r1 */
>>> +#define r0 0
>>> +#define r1 1
>>> +
>> 
>> See https://github.com/gcc-mirror/gcc/blob/master/gcc/config/rs6000/ppc-asm.h
>> 
>> It doesn't not define r1 but it defines r0.
>
> Oops, I'll fix that.
>
>> And it defines 'sp' as register 1.
>
> Does userspace code typically use that? Kernel code AFAIKS does not.

Some does, but it's not used consistently IME.

I'd prefer you just use %r1.

cheers


[PATCH v2] powerpc/bug: Cast to unsigned long before passing to inline asm

2021-09-01 Thread Michael Ellerman
In commit 1e688dd2a3d6 ("powerpc/bug: Provide better flexibility to
WARN_ON/__WARN_FLAGS() with asm goto") we changed WARN_ON(). Previously
it would take the warning condition, x, and double negate it before
converting the result to int, and passing that int to the underlying
inline asm. ie:

  #define WARN_ON(x) ({
int __ret_warn_on = !!(x);
if (__builtin_constant_p(__ret_warn_on)) {
...
} else {
BUG_ENTRY(PPC_TLNEI " %4, 0",
  BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN),
  "r" (__ret_warn_on));

The asm then does a full register width comparison with zero and traps
if it is non-zero (PPC_TLNEI).

The new code instead passes the full expression, x, with some arbitrary
type, to the inline asm:

  #define WARN_ON(x) ({
...
do {
if (__builtin_constant_p((x))) {
...
} else {
...
WARN_ENTRY(PPC_TLNEI " %4, 0",
   BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN),
   __label_warn_on, "r" (x));

As reported[1] by Nathan, when building with clang this can cause
spurious warnings to fire repeatedly at boot:

  WARNING: CPU: 0 PID: 1 at lib/klist.c:62 .klist_add_tail+0x3c/0x110
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW 
5.14.0-rc7-next-20210825 #1
  NIP:  c07ff81c LR: c090a038 CTR: 
  REGS: c73c32a0 TRAP: 0700   Tainted: GW  
(5.14.0-rc7-next-20210825)
  MSR:  82029032   CR: 22000a40  XER: 
  CFAR: c090a034 IRQMASK: 0
  GPR00: c090a038 c73c3540 c1be3200 0001
  GPR04: c72d65c0  c91ba798 c91bb0a0
  GPR08: 0001  c8581918 fc00
  GPR12: 44000240 c1dd c0012300 
  GPR16:    
  GPR20:    
  GPR24:  c17e3200  c1a0e778
  GPR28: c72d65b0 c72d65a8 c7de72c8 c73c35d0
  NIP .klist_add_tail+0x3c/0x110
  LR  .bus_add_driver+0x148/0x290
  Call Trace:
0xc73c35d0 (unreliable)
.bus_add_driver+0x148/0x290
.driver_register+0xb8/0x190
.__hid_register_driver+0x70/0xd0
.redragon_driver_init+0x34/0x58
.do_one_initcall+0x130/0x3b0
.do_initcall_level+0xd8/0x188
.do_initcalls+0x7c/0xdc
.kernel_init_freeable+0x178/0x21c
.kernel_init+0x34/0x220
.ret_from_kernel_thread+0x58/0x60
  Instruction dump:
  fba10078 7c7d1b78 3861 fb810070 3b9d0008 fbc10080 7c9e2378 389d0018
  fb9d0008 fb9d0010 9064 fbdd <0b1e> e87e0018 2823 41820024

The instruction dump shows that we are trapping because r30 is not zero:
  tdnei   r30,0

Where r30 = c7de72c8

The WARN_ON() comes from:

  static void knode_set_klist(struct klist_node *knode, struct klist *klist)
  {
knode->n_klist = klist;
/* no knode deserves to start its life dead */
WARN_ON(knode_dead(knode));
  ^

Where:

  #define KNODE_DEAD1LU

  static bool knode_dead(struct klist_node *knode)
  {
return (unsigned long)knode->n_klist & KNODE_DEAD;
  }

The full disassembly shows that clang has not generated any code to
apply the "& KNODE_DEAD" to the n_klist pointer, which is surprising.

Nathan filed an LLVM bug [2], in which Eli Friedman explained that clang
believes it is only passing a single bit to the asm (ie. a bool) and so
the mask of bit 0 with 1 can be omitted, and suggested that if we want
the full 64-bit value passed to the inline asm we should cast to a
64-bit type (or 32-bit on 32-bits).

In fact we already do that for BUG_ENTRY(), which was added to fix a
possibly similar bug in 2005 in commit 32818c2eb6b8 ("[PATCH] ppc64: Fix
issue with gcc 4.0 compiled kernels").

So cast the value we pass to the inline asm to long.

For GCC this appears to have no effect on code generation, other than
causing sign extension in some cases.

[1]: http://lore.kernel.org/r/YSa1O4fcX1nNKqN/@Ryzen-9-3900X.localdomain
[2]: https://bugs.llvm.org/show_bug.cgi?id=51634

Fixes: 1e688dd2a3d6 ("powerpc/bug: Provide better flexibility to 
WARN_ON/__WARN_FLAGS() with asm goto")
Reported-by: Nathan Chancellor 
Reviewed-by: Nathan Chancellor 
Tested-by: Nathan Chancellor 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/bug.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

v2: Reword the change log a bit to hopefully make it clearer that it's clang 
that believes
it only needs to pass a single bit for

Re: [PATCH] powerpc/bug: Cast to unsigned long before passing to inline asm

2021-09-01 Thread Michael Ellerman
Segher Boessenkool  writes:
> On Tue, Aug 31, 2021 at 11:27:20PM +1000, Michael Ellerman wrote:
>> Nathan filed an LLVM bug [2], in which Eli Friedman explained that "if
>> you pass a value of a type that's narrower than a register to an inline
>> asm, the high bits are undefined". In this case we are passing a bool
>> to the inline asm, which is a single bit value, and so the compiler
>> thinks it can leave the high bits of r30 unmasked, because only the
>> value of bit 0 matters.
>> 
>> Because the inline asm compares the full width of the register (32 or
>> 64-bit) we need to ensure the value passed to the inline asm has all
>> bits defined. So fix it by casting to long.
>> 
>> We also already cast to long for the similar case in BUG_ENTRY(), which
>> it turns out was added to fix a similar bug in 2005 in commit
>> 32818c2eb6b8 ("[PATCH] ppc64: Fix issue with gcc 4.0 compiled kernels").
>
> That points to <https://gcc.gnu.org/PR23422>, which shows the correct
> explanation.

That's a pity because I don't understand that explanation ^_^

Why does sign extension matter when we're comparing against zero?

> The code as it was did **not** pass a bool.  It perhaps passed an int
> (so many macros in play, it is hard to tell for sure, but it is int or
> long int, perhaps unsigned (which does not change anything here).

I don't understand that. It definitely is passing a bool at the source
level. Are you saying it's getting promoted somehow?

It expands to:

  asm goto(
  "1:   "
  "tdnei"
  "
  " " % 4,
  0 " "\n " ".section __ex_table,\"a\";"
  " "
  ".balign 4;"
  " "
  ".long (1b) - . ;"
  " "
  ".long (%l[__label_warn_on]) - . 
;"
  " "
  ".previous"
  " "
  ".section __bug_table,\"aw\"\n"
  "2:\t.4byte 1b - 2b, %0 - 2b\n"
  "\t.short %1, %2\n"
  ".org 2b+%3\n"
  ".previous\n"
  :
  : "i"("lib/klist.c"), "i"(62),
"i"((1 << 0) | ((9) << 8)),
"i"(sizeof(struct bug_entry)),
"r"(knode_dead(knode))
^
  :
  : __label_warn_on);

And knode_dead() returns bool:

  static bool knode_dead(struct klist_node *knode)
  {
return (unsigned long)knode->n_klist & KNODE_DEAD;
  }

So in my book that means the type there is bool. But I'm not a compiler
guy so I guessing I'm missing something.

> But td wants a 64-bit register, not a 32-bit one (which are the only two
> possibilities for the "r" constraint on PowerPC).  The cast to "long" is
> fine for powerpc64, but it has nothing to do with "narrower than a
> register".

If it's 32-bit vs 64-bit, and the clang explanation is correct, then
we'd expect the low 32-bits of the value passed to the asm to have the
correct value, ie. have been masked with KNODE_DEAD.

> If this is not the correct explanation for LLVM, it needs to solve some
> other bug.

OK. I just need to get this fixed in the kernel, so I'll do a new
version with a changelog that is ~= "shrug not sure what's going on" and
merge that. Then we can argue about what is really going on later :)

cheeers


Re: [PATCH] powerpc/head_check: Fix shellcheck errors

2021-08-31 Thread Michael Ellerman
On Tue, 17 Aug 2021 22:51:54 +1000, Michael Ellerman wrote:
> Replace "cat file | grep pattern" with "grep pattern file", and quote a
> few variables. Together that fixes all shellcheck errors.

Applied to powerpc/next.

[1/1] powerpc/head_check: Fix shellcheck errors
  https://git.kernel.org/powerpc/c/e95ad5f21693a37c0d318b5988c5a0de324eb3e3

cheers


Re: [PATCH v3 1/3] powerpc: Remove MSR_PR check in interrupt_exit_{user/kernel}_prepare()

2021-08-31 Thread Michael Ellerman
On Mon, 23 Aug 2021 08:24:20 + (UTC), Christophe Leroy wrote:
> In those hot functions that are called at every interrupt, any saved
> cycle is worth it.
> 
> interrupt_exit_user_prepare() and interrupt_exit_kernel_prepare() are
> called from three places:
> - From entry_32.S
> - From interrupt_64.S
> - From interrupt_exit_user_restart() and interrupt_exit_kernel_restart()
> 
> [...]

Applied to powerpc/next.

[1/3] powerpc: Remove MSR_PR check in interrupt_exit_{user/kernel}_prepare()
  https://git.kernel.org/powerpc/c/133c17a1788d68c9fff59d5f724a4ba14647a16d

cheers


Re: [PATCH v3 0/5] Updates to powerpc for robust CPU online/offline

2021-08-31 Thread Michael Ellerman
On Thu, 26 Aug 2021 15:35:16 +0530, Srikar Dronamraju wrote:
> Changelog v2 -> v3:
> v2: 
> https://lore.kernel.org/linuxppc-dev/20210821102535.169643-1-sri...@linux.vnet.ibm.com/t/#u
> Add patch 1: to drop dbg and numa=debug (Suggested by Michael Ellerman)
> Add patch 2: to convert printk to pr_xxx (Suggested by Michael Ellerman)
>   Use pr_warn instead of pr_debug(WARNING) (Suggested by Laurent)
> 
> Changelog v1 -> v2:
> Moved patch to this series: powerpc/numa: Fill distance_lookup_table for 
> offline nodes
> fixed a missing prototype warning
> 
> [...]

Patches 1-4 applied to powerpc/next.

[1/5] powerpc/numa: Drop dbg in favour of pr_debug
  https://git.kernel.org/powerpc/c/544af6429777cefae2f8af9a9866df5e8cb21763
[2/5] powerpc/numa: convert printk to pr_xxx
  https://git.kernel.org/powerpc/c/506c2075ffd8db352c53201ef166948a272e3bce
[3/5] powerpc/numa: Print debug statements only when required
  https://git.kernel.org/powerpc/c/544a09ee7434b949552266d20ece538d35535bd5
[4/5] powerpc/numa: Update cpu_cpu_map on CPU online/offline
  https://git.kernel.org/powerpc/c/9a245d0e1f006bc7ccf0285d0d520ed304d00c4a

cheers


Re: [PATCH v2 0/3] powerpc/smp: Misc fixes

2021-08-31 Thread Michael Ellerman
On Thu, 26 Aug 2021 15:33:58 +0530, Srikar Dronamraju wrote:
> Changelog : v1 -> v2:
> v1: 
> https://lore.kernel.org/linuxppc-dev/20210821092419.167454-1-sri...@linux.vnet.ibm.com/t/#u``
> [ patch 1: Updated to use DIV_ROUND_UP instead of max to handle more 
> situations ]
> [ patch 2: updated changelog to make it more generic powerpc issue ]
> 
> The 1st patch fixes a regression which causes a crash when booted with
> nr_cpus=2.
> 
> [...]

Applied to powerpc/next.

[1/3] powerpc/smp: Fix a crash while booting kvm guest with nr_cpus=2
  https://git.kernel.org/powerpc/c/8efd249babea2fec268cff90b9f5ca723dbb7499
[2/3] powerpc/smp: Update cpu_core_map on all PowerPc systems
  https://git.kernel.org/powerpc/c/b8b928030332a0ca16d42433eb2c3085600d8704
[3/3] powerpc/smp: Enable CACHE domain for shared processor
  https://git.kernel.org/powerpc/c/5bf63497b8ddf53d8973f68076119e482639b2bb

cheers


Re: [PATCH v6 00/11] DDW + Indirect Mapping

2021-08-31 Thread Michael Ellerman
On Tue, 17 Aug 2021 03:39:18 -0300, Leonardo Bras wrote:
> So far it's assumed possible to map the guest RAM 1:1 to the bus, which
> works with a small number of devices. SRIOV changes it as the user can
> configure hundreds VFs and since phyp preallocates TCEs and does not
> allow IOMMU pages bigger than 64K, it has to limit the number of TCEs
> per a PE to limit waste of physical pages.
> 
> As of today, if the assumed direct mapping is not possible, DDW creation
> is skipped and the default DMA window "ibm,dma-window" is used instead.
> 
> [...]

Applied to powerpc/next.

[01/11] powerpc/pseries/iommu: Replace hard-coded page shift

https://git.kernel.org/powerpc/c/0c634bafe3bbee7a36dca7f1277057e05bf14d91
[02/11] powerpc/kernel/iommu: Add new iommu_table_in_use() helper

https://git.kernel.org/powerpc/c/3c33066a21903076722a2881556a92aa3cd7d359
[03/11] powerpc/pseries/iommu: Add iommu_pseries_alloc_table() helper

https://git.kernel.org/powerpc/c/4ff8677a0b192a58d998d1d34fc5168203041a24
[04/11] powerpc/pseries/iommu: Add ddw_list_new_entry() helper

https://git.kernel.org/powerpc/c/92a23219299cedde52e3298788484f4875d5ce0f
[05/11] powerpc/pseries/iommu: Allow DDW windows starting at 0x00

https://git.kernel.org/powerpc/c/2ca73c54ce24489518a56d816331b774044c2445
[06/11] powerpc/pseries/iommu: Add ddw_property_create() and refactor 
enable_ddw()

https://git.kernel.org/powerpc/c/7ed2ed2db2685a285cb09ab330dc4efea0b64022
[07/11] powerpc/pseries/iommu: Reorganize iommu_table_setparms*() with new 
helper

https://git.kernel.org/powerpc/c/fc8cba8f989fb98e496b33a78476861e246c42a0
[08/11] powerpc/pseries/iommu: Update remove_dma_window() to accept property 
name

https://git.kernel.org/powerpc/c/a5fd95120c653962a9e75e260a35436b96d2c991
[09/11] powerpc/pseries/iommu: Find existing DDW with given property name

https://git.kernel.org/powerpc/c/8599395d34f2dd7b77bef42da1d99798e7a3d58f
[10/11] powerpc/pseries/iommu: Make use of DDW for indirect mapping

https://git.kernel.org/powerpc/c/381ceda88c4c4c8345cad1cffa6328892f15dca6
[11/11] powerpc/pseries/iommu: Rename "direct window" to "dma window"

https://git.kernel.org/powerpc/c/57dbbe590f152e5e8a3ff8bf5ba163df34eeae0b

cheers


Re: [PATCH 0/3] powerpc/microwatt: Device tree and config updates

2021-08-31 Thread Michael Ellerman
On Thu, 26 Aug 2021 21:56:50 +0930, Joel Stanley wrote:
> This enables the liteeth network device for microwatt which will be
> merged in v5.15.
> 
> It also turns on some options so the microwatt defconfig can be used to
> boot a userspace with systemd.
> 
> Joel Stanley (3):
>   powerpc/microwatt: Add Ethernet to device tree
>   powerpc/configs/microwattt: Enable Liteeth
>   powerpc/configs/microwatt: Enable options for systemd
> 
> [...]

Applied to powerpc/next.

[1/3] powerpc/microwatt: Add Ethernet to device tree
  https://git.kernel.org/powerpc/c/602d0f96563c2e0b8e1ddb22ac46bf7f58480d64
[2/3] powerpc/configs/microwattt: Enable Liteeth
  https://git.kernel.org/powerpc/c/ef4fcaf99cd27eb48790f2adc4eff456dbe1dec4
[3/3] powerpc/configs/microwatt: Enable options for systemd
  https://git.kernel.org/powerpc/c/3e18e271182206c996a3a7efbbe70c66307ef137

cheers


Re: [PATCH] powerpc: Redefine HMT_xxx macros as empty on PPC32

2021-08-31 Thread Michael Ellerman
On Wed, 25 Aug 2021 13:34:45 + (UTC), Christophe Leroy wrote:
> HMT_xxx macros are macros for adjusting thread priority
> (hardware multi-threading) are macros inherited from PPC64
> via commit 5f7c690728ac ("[PATCH] powerpc: Merged ppc_asm.h")
> 
> Those instructions are pointless on PPC32, but some common
> fonctions like arch_cpu_idle() use them.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: Redefine HMT_xxx macros as empty on PPC32
  https://git.kernel.org/powerpc/c/8149238ffd210875f5a77e3c654bb59b58da35e3

cheers


[PATCH] powerpc/ptdump: Fix generic ptdump for 64-bit

2021-08-31 Thread Michael Ellerman
Since the conversion to generic ptdump we see crashes on 64-bit:

  BUG: Unable to handle kernel data access on read at 0xc0eeff7f
  Faulting instruction address: 0xc045e5fc
  Oops: Kernel access of bad area, sig: 11 [#1]
  ...
  NIP __walk_page_range+0x2bc/0xce0
  LR  __walk_page_range+0x240/0xce0
  Call Trace:
__walk_page_range+0x240/0xce0 (unreliable)
walk_page_range_novma+0x74/0xb0
ptdump_walk_pgd+0x98/0x170
ptdump_check_wx+0x88/0xd0
mark_rodata_ro+0x48/0x80
kernel_init+0x74/0x1a0
ret_from_kernel_thread+0x5c/0x64

What's happening is that have walked off the end of the kernel page
tables, and started dereferencing junk values.

That happens because we initialised the ptdump_range to span all the way
up to 0x:

static struct ptdump_range ptdump_range[] __ro_after_init = {
{TASK_SIZE_MAX, ~0UL},

But the kernel page tables don't span that far. So on 64-bit set the end
of the range to be the address immediately past the end of the kernel
page tables, to limit the page table walk to valid addresses.

Fixes: e084728393a5 ("powerpc/ptdump: Convert powerpc to GENERIC_PTDUMP")
Reported-by: Nathan Chancellor 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/mm/ptdump/ptdump.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index 2d80d775d15e..bf251191e78d 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -359,6 +359,8 @@ static int __init ptdump_init(void)
ptdump_range[0].start = KERN_VIRT_START;
else
ptdump_range[0].start = PAGE_OFFSET;
+
+   ptdump_range[0].end = PAGE_OFFSET + (PGDIR_SIZE * PTRS_PER_PGD);
 #endif
 
populate_markers();

base-commit: e1ab9a730b426fadc018f91b7c98412473e542fb
prerequisite-patch-id: 942553bda7d83bbae8bf6b2b718033d488ee2410
prerequisite-patch-id: a14c44e671eba8648c4fe385a2552fd57875ec8a
prerequisite-patch-id: 94f5c890f54da2b46f06c60562e879171fab2be3
prerequisite-patch-id: 330af32f2aa34a432d450acc9f6e9fd1cec96417
prerequisite-patch-id: b46c65afa63944f3fb02f4b9bdf940507bb25de6
prerequisite-patch-id: c4ba00ee949f70d7745f75bad11bbb2416f329f1
prerequisite-patch-id: f479601944d0aa615716d5349d93bd6e3d5619c1
prerequisite-patch-id: 9523cde933393b2d68648cecb740efdba9dd8601
prerequisite-patch-id: 034afc97c841a6dcd2b9932406f391d65d18bf87
prerequisite-patch-id: effd7ac8a7db6b59a2677c9c3a7ef8b3ef8bdaf8
prerequisite-patch-id: 23883cf116ee69b452db3c6e10dd49e756e7b5d5
prerequisite-patch-id: 37b6695321c96db466b0faba9308bacfb79c7ced
prerequisite-patch-id: 83420e68ca4476c9ba5a67aa19e1fdc0b6d656a4
prerequisite-patch-id: 362219acf820b78b83c6c09071a636b28976a1ce
prerequisite-patch-id: 857513c5f431887d16a59d193834dcec636c73dc
prerequisite-patch-id: 49f6879a819e205b5361280ab923664fcd29daaf
prerequisite-patch-id: 5a37bcf70c5cb44d78de63a64e5ce920a0a7e419
prerequisite-patch-id: 2c06dd3833117b0498baa198694f6c7e84975840
prerequisite-patch-id: 5794a211ebbf7f0d416ae882443201621c00f615
prerequisite-patch-id: 19ed5ae34e233079c7f66376b8d309cac2b57dbc
prerequisite-patch-id: 1d4c82277473e8dbecf83faf6c4a6788538b064d
prerequisite-patch-id: 8cb5ecc4fe23dafb4a43192f93b669c80a548985
prerequisite-patch-id: 763b8d98c3aefd120862154b94814e3ef3578b5c
prerequisite-patch-id: f45e04e6d030eb157be550976b07dc891fa0836d
prerequisite-patch-id: 07b6fb682675845aca694deff1847bc7a40e1fec
prerequisite-patch-id: 7f1082effa12b1eba445cef90e4749155662888c
prerequisite-patch-id: 76743814dd8e6151c27676ae2e318579d658bf8b
prerequisite-patch-id: 8a6b12c11dbbcd5dda0ccc9877dee1be785e0173
prerequisite-patch-id: e98f013ce41c27d16f75ac3eb1c7eec4235cca0a
prerequisite-patch-id: 285e11f96169ec82702a69b2fca5318c0e307508
prerequisite-patch-id: 9fa89fb9f4ac839177307891bb240009f1d55e88
prerequisite-patch-id: feebaed3f6e0c15e8fa468d64129fe9aa4411d57
prerequisite-patch-id: 8f1093cf40180a623439d82e563e1dd18194cc19
prerequisite-patch-id: d042674595d0678e71e5258d55b93d54b5c4
prerequisite-patch-id: 286812aaed6630139583fd21d388137b8d5a6931
prerequisite-patch-id: 54af8aa735a12282bb40a0ed87455e268ae356d9
prerequisite-patch-id: cc5ee85759d99a6ebf18e39634dde65f15476f84
prerequisite-patch-id: 3f8437c8bfda23c45839596ec432d81a95505061
prerequisite-patch-id: f30d6fa2c7c7c417ee4bee0827c0ce587570db34
prerequisite-patch-id: fa402f5deaa301587ced629dfa523728aece4705
prerequisite-patch-id: 51f326f5de947cea58003cc8b988b54436689d1b
prerequisite-patch-id: 4003c9a6b2792e797c333875e63a184df8fcc7e7
prerequisite-patch-id: f73fd878eb9b65ecbed3c3ee8ca6725f7e55d5d2
prerequisite-patch-id: 5e55b3e9b3809da22b8742f0ed356df6d6fdd301
prerequisite-patch-id: 1fde98fffabd6313d1921d8b2f28691e9a191b1d
prerequisite-patch-id: 51c0595fe54ad077c736b7a4351c2f2700ab66d7
prerequisite-patch-id: e490360db8c2dc7cbf693258ca93e4597f165c6f
prerequisite-patch-id: c4354b3226d31d8ddb6992956cf0ed12ea97cb8e
prerequisite-patch-id: c67a26ed658da4b11a3319e0e99c4a84afb68d80
prerequi

Re: [PATCH v4 4/4] powerpc/ptdump: Convert powerpc to GENERIC_PTDUMP

2021-08-31 Thread Michael Ellerman
Christophe Leroy  writes:
> Le 30/08/2021 à 13:55, Michael Ellerman a écrit :
>> Christophe Leroy  writes:
>>> Le 30/08/2021 à 09:52, Michael Ellerman a écrit :
>>>> Christophe Leroy  writes:
>>>>> Le 29/08/2021 à 20:55, Nathan Chancellor a écrit :
>>>>>> On Thu, Jul 08, 2021 at 04:49:43PM +, Christophe Leroy wrote:
>>>>>>> This patch converts powerpc to the generic PTDUMP implementation.
>>>>>>>
>>>>>>
>>>>>> This patch as commit e084728393a5 ("powerpc/ptdump: Convert powerpc to
>>>>>> GENERIC_PTDUMP") in powerpc/next causes a panic with Fedora's ppc64le
>>>>>> config [1] when booting up in QEMU with [2]:
>>>>>>
>>>>>> [1.621864] BUG: Unable to handle kernel data access on read at 
>>>>>> 0xc0eeff7f
>>>>>> [1.623058] Faulting instruction address: 0xc045e5fc
>>>>>> [1.623832] Oops: Kernel access of bad area, sig: 11 [#1]
>>>>>> [1.624318] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
>>>>>> [1.625015] Modules linked in:
>>>>>> [1.625463] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
>>>>>> 5.14.0-rc7-next-20210827 #16
>>>>>> [1.626237] NIP:  c045e5fc LR: c045e580 CTR: 
>>>>>> c0518220
>>>>>> [1.626839] REGS: c752b820 TRAP: 0380   Not tainted  
>>>>>> (5.14.0-rc7-next-20210827)
>>>>>> [1.627528] MSR:  92009033   CR: 
>>>>>> 84002482  XER: 2000
>>>>>> [1.628449] CFAR: c0518300 IRQMASK: 0
>>>>>> [1.628449] GPR00: c045e580 c752bac0 c28a9300 
>>>>>> 
>>>>>> [1.628449] GPR04: c2008000  000a 
>>>>>> 0001
>>>>>> [1.628449] GPR08: c0eeff7f 0012  
>>>>>> 
>>>>>> [1.628449] GPR12:  c2b2 fffe 
>>>>>> c2971a70
>>>>>> [1.628449] GPR16: c2960040 c11a8f98 c752bbf0 
>>>>>> 
>>>>>> [1.628449] GPR20: c2008fff c0eeff7f c2971a68 
>>>>>> c00a0003ff00
>>>>>> [1.628449] GPR24: c2971a78 0002 0001 
>>>>>> c11a8f98
>>>>>> [1.628449] GPR28: c11a8f98 c28daef8 c2008000 
>>>>>> c2009000
>>>>>> [1.634090] NIP [c045e5fc] __walk_page_range+0x2bc/0xce0
>>>>>> [1.635117] LR [c045e580] __walk_page_range+0x240/0xce0
>>>>>> [1.635755] Call Trace:
>>>>>> [1.636018] [c752bac0] [c045e580] 
>>>>>> __walk_page_range+0x240/0xce0 (unreliable)
>>>>>> [1.636811] [c752bbd0] [c045f234] 
>>>>>> walk_page_range_novma+0x74/0xb0
>>>>>> [1.637459] [c752bc20] [c0518448] 
>>>>>> ptdump_walk_pgd+0x98/0x170
>>>>>> [1.638138] [c752bc70] [c00aa988] 
>>>>>> ptdump_check_wx+0x88/0xd0
>>>>>> [1.638738] [c752bd50] [c008d6d8] 
>>>>>> mark_rodata_ro+0x48/0x80
>>>>>> [1.639299] [c752bdb0] [c0012a34] 
>>>>>> kernel_init+0x74/0x1a0
>>>>>> [1.639842] [c752be10] [c000cfd4] 
>>>>>> ret_from_kernel_thread+0x5c/0x64
>>>>>> [1.640597] Instruction dump:
>>>>>> [1.641021] 38e7 39490010 7ce707b4 7fca5436 79081564 7d4a3838 
>>>>>> 7908f082 794a1f24
>>>>>> [1.641740] 78a8f00e 30e6 7ea85214 7ce73110 <7d48502a> 78f90fa4 
>>>>>> 2c2a 39290010
>>>>>> [1.642771] ---[ end trace 6cf72b085097ad52 ]---
>>>>>> [1.643220]
>>>>>> [2.644228] Kernel panic - not syncing: Attempted to kill init! 
>>>>>> exitcode=0x000b
>>>>>> [2.645523] ---[ end Kernel panic - not syncing: Attempted to kill 
>>>>>> init! exitcode=0x000b ]---
>>>>>>
>>>&

[PATCH] powerpc/bug: Cast to unsigned long before passing to inline asm

2021-08-31 Thread Michael Ellerman
In commit 1e688dd2a3d6 ("powerpc/bug: Provide better flexibility to
WARN_ON/__WARN_FLAGS() with asm goto") we changed WARN_ON(). Previously
it would take the warning condition, x, and double negate it before
converting the result to int, and passing that int to the underlying
inline asm. ie:

  #define WARN_ON(x) ({
int __ret_warn_on = !!(x);
if (__builtin_constant_p(__ret_warn_on)) {
...
} else {
BUG_ENTRY(PPC_TLNEI " %4, 0",
  BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN),
  "r" (__ret_warn_on));

The asm then does a full register width comparison with zero and traps
if it is non-zero (PPC_TLNEI).

The new code instead passes the full expression, x, with some unknown
type, to the inline asm:

  #define WARN_ON(x) ({
...
do {
if (__builtin_constant_p((x))) {
...
} else {
...
WARN_ENTRY(PPC_TLNEI " %4, 0",
   BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN),
   __label_warn_on, "r" (x));

This was not seen to cause any issues with GCC, however with clang in at
least one case it leads to a WARN_ON() that fires incorrectly and
repeatedly at boot, as reported[1] by Nathan:

  WARNING: CPU: 0 PID: 1 at lib/klist.c:62 .klist_add_tail+0x3c/0x110
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW 
5.14.0-rc7-next-20210825 #1
  NIP:  c07ff81c LR: c090a038 CTR: 
  REGS: c73c32a0 TRAP: 0700   Tainted: GW  
(5.14.0-rc7-next-20210825)
  MSR:  82029032   CR: 22000a40  XER: 
  CFAR: c090a034 IRQMASK: 0
  GPR00: c090a038 c73c3540 c1be3200 0001
  GPR04: c72d65c0  c91ba798 c91bb0a0
  GPR08: 0001  c8581918 fc00
  GPR12: 44000240 c1dd c0012300 
  GPR16:    
  GPR20:    
  GPR24:  c17e3200  c1a0e778
  GPR28: c72d65b0 c72d65a8 c7de72c8 c73c35d0
  NIP .klist_add_tail+0x3c/0x110
  LR  .bus_add_driver+0x148/0x290
  Call Trace:
0xc73c35d0 (unreliable)
.bus_add_driver+0x148/0x290
.driver_register+0xb8/0x190
.__hid_register_driver+0x70/0xd0
.redragon_driver_init+0x34/0x58
.do_one_initcall+0x130/0x3b0
.do_initcall_level+0xd8/0x188
.do_initcalls+0x7c/0xdc
.kernel_init_freeable+0x178/0x21c
.kernel_init+0x34/0x220
.ret_from_kernel_thread+0x58/0x60
  Instruction dump:
  fba10078 7c7d1b78 3861 fb810070 3b9d0008 fbc10080 7c9e2378 389d0018
  fb9d0008 fb9d0010 9064 fbdd <0b1e> e87e0018 2823 41820024

The instruction dump shows that we are trapping because r30 is not zero:
  tdnei   r30,0

Where r30 = c7de72c8

The WARN_ON() comes from:

  static void knode_set_klist(struct klist_node *knode, struct klist *klist)
  {
knode->n_klist = klist;
/* no knode deserves to start its life dead */
WARN_ON(knode_dead(knode));
^

Where:

  #define KNODE_DEAD1LU

  static bool knode_dead(struct klist_node *knode)
  {
return (unsigned long)knode->n_klist & KNODE_DEAD;
  }

The full disassembly shows that the compiler has not generated any code
to apply the "& KNODE_DEAD" to the n_klist pointer, which is surprising.

Nathan filed an LLVM bug [2], in which Eli Friedman explained that "if
you pass a value of a type that's narrower than a register to an inline
asm, the high bits are undefined". In this case we are passing a bool
to the inline asm, which is a single bit value, and so the compiler
thinks it can leave the high bits of r30 unmasked, because only the
value of bit 0 matters.

Because the inline asm compares the full width of the register (32 or
64-bit) we need to ensure the value passed to the inline asm has all
bits defined. So fix it by casting to long.

We also already cast to long for the similar case in BUG_ENTRY(), which
it turns out was added to fix a similar bug in 2005 in commit
32818c2eb6b8 ("[PATCH] ppc64: Fix issue with gcc 4.0 compiled kernels").

[1]: http://lore.kernel.org/r/YSa1O4fcX1nNKqN/@Ryzen-9-3900X.localdomain
[2]: https://bugs.llvm.org/show_bug.cgi?id=51634

Fixes: 1e688dd2a3d6 ("powerpc/bug: Provide better flexibility to 
WARN_ON/__WARN_FLAGS() with asm goto")
Reported-by: Nathan Chancellor 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/bug.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff -

Re: [PATCH v4 4/4] powerpc/ptdump: Convert powerpc to GENERIC_PTDUMP

2021-08-30 Thread Michael Ellerman
Christophe Leroy  writes:
> Le 30/08/2021 à 09:52, Michael Ellerman a écrit :
>> Christophe Leroy  writes:
>>> Le 29/08/2021 à 20:55, Nathan Chancellor a écrit :
>>>> On Thu, Jul 08, 2021 at 04:49:43PM +, Christophe Leroy wrote:
>>>>> This patch converts powerpc to the generic PTDUMP implementation.
>>>>>
>>>>
>>>> This patch as commit e084728393a5 ("powerpc/ptdump: Convert powerpc to
>>>> GENERIC_PTDUMP") in powerpc/next causes a panic with Fedora's ppc64le
>>>> config [1] when booting up in QEMU with [2]:
>>>>
>>>> [1.621864] BUG: Unable to handle kernel data access on read at 
>>>> 0xc0eeff7f
>>>> [1.623058] Faulting instruction address: 0xc045e5fc
>>>> [1.623832] Oops: Kernel access of bad area, sig: 11 [#1]
>>>> [1.624318] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
>>>> [1.625015] Modules linked in:
>>>> [1.625463] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
>>>> 5.14.0-rc7-next-20210827 #16
>>>> [1.626237] NIP:  c045e5fc LR: c045e580 CTR: 
>>>> c0518220
>>>> [1.626839] REGS: c752b820 TRAP: 0380   Not tainted  
>>>> (5.14.0-rc7-next-20210827)
>>>> [1.627528] MSR:  92009033   CR: 
>>>> 84002482  XER: 2000
>>>> [1.628449] CFAR: c0518300 IRQMASK: 0
>>>> [1.628449] GPR00: c045e580 c752bac0 c28a9300 
>>>> 
>>>> [1.628449] GPR04: c2008000  000a 
>>>> 0001
>>>> [1.628449] GPR08: c0eeff7f 0012  
>>>> 
>>>> [1.628449] GPR12:  c2b2 fffe 
>>>> c2971a70
>>>> [1.628449] GPR16: c2960040 c11a8f98 c752bbf0 
>>>> 
>>>> [1.628449] GPR20: c2008fff c0eeff7f c2971a68 
>>>> c00a0003ff00
>>>> [1.628449] GPR24: c2971a78 0002 0001 
>>>> c11a8f98
>>>> [1.628449] GPR28: c11a8f98 c28daef8 c2008000 
>>>> c2009000
>>>> [1.634090] NIP [c045e5fc] __walk_page_range+0x2bc/0xce0
>>>> [1.635117] LR [c045e580] __walk_page_range+0x240/0xce0
>>>> [1.635755] Call Trace:
>>>> [1.636018] [c752bac0] [c045e580] 
>>>> __walk_page_range+0x240/0xce0 (unreliable)
>>>> [1.636811] [c752bbd0] [c045f234] 
>>>> walk_page_range_novma+0x74/0xb0
>>>> [1.637459] [c752bc20] [c0518448] 
>>>> ptdump_walk_pgd+0x98/0x170
>>>> [1.638138] [c752bc70] [c00aa988] 
>>>> ptdump_check_wx+0x88/0xd0
>>>> [1.638738] [c752bd50] [c008d6d8] 
>>>> mark_rodata_ro+0x48/0x80
>>>> [1.639299] [c752bdb0] [c0012a34] kernel_init+0x74/0x1a0
>>>> [1.639842] [c752be10] [c000cfd4] 
>>>> ret_from_kernel_thread+0x5c/0x64
>>>> [1.640597] Instruction dump:
>>>> [1.641021] 38e7 39490010 7ce707b4 7fca5436 79081564 7d4a3838 
>>>> 7908f082 794a1f24
>>>> [1.641740] 78a8f00e 30e6 7ea85214 7ce73110 <7d48502a> 78f90fa4 
>>>> 2c2a 39290010
>>>> [1.642771] ---[ end trace 6cf72b085097ad52 ]---
>>>> [1.643220]
>>>> [2.644228] Kernel panic - not syncing: Attempted to kill init! 
>>>> exitcode=0x000b
>>>> [2.645523] ---[ end Kernel panic - not syncing: Attempted to kill 
>>>> init! exitcode=0x000b ]---
>>>>
>>>> This is not compiler specific, I can reproduce it with GCC 11.2.0 and
>>>> binutils 2.37. If there is any additional information I can provide,
>>>> please let me know.
>>>
>>> Can you provide a dissassembly of __walk_page_range() ? Or provide your 
>>> vmlinux binary.
>> 
>> It seems to be walking of the end of the pgd.
>> 
>> [3.373800] walk_p4d_range: addr c00fff00 end c00fff80
>> [3.373852] walk_p4d_range: addr c00fff80 end c010
>> <- end of pgd at PAGE_OFFSET + 4PB
>> [3.373905] walk_p4d_range: addr c010 end 

Re: [PATCH v4 4/4] powerpc/ptdump: Convert powerpc to GENERIC_PTDUMP

2021-08-30 Thread Michael Ellerman
Christophe Leroy  writes:
> Hi Nathan,
>
> Le 29/08/2021 à 20:55, Nathan Chancellor a écrit :
>> Hi Christophe,
>> 
>> On Thu, Jul 08, 2021 at 04:49:43PM +, Christophe Leroy wrote:
>>> This patch converts powerpc to the generic PTDUMP implementation.
>>>
>>> Signed-off-by: Christophe Leroy 
>> 
>> This patch as commit e084728393a5 ("powerpc/ptdump: Convert powerpc to
>> GENERIC_PTDUMP") in powerpc/next causes a panic with Fedora's ppc64le
>> config [1] when booting up in QEMU with [2]:
>> 
>> [1.621864] BUG: Unable to handle kernel data access on read at 
>> 0xc0eeff7f
>> [1.623058] Faulting instruction address: 0xc045e5fc
>> [1.623832] Oops: Kernel access of bad area, sig: 11 [#1]
>> [1.624318] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
>> [1.625015] Modules linked in:
>> [1.625463] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
>> 5.14.0-rc7-next-20210827 #16
>> [1.626237] NIP:  c045e5fc LR: c045e580 CTR: 
>> c0518220
>> [1.626839] REGS: c752b820 TRAP: 0380   Not tainted  
>> (5.14.0-rc7-next-20210827)
>> [1.627528] MSR:  92009033   CR: 
>> 84002482  XER: 2000
>> [1.628449] CFAR: c0518300 IRQMASK: 0
>> [1.628449] GPR00: c045e580 c752bac0 c28a9300 
>> 
>> [1.628449] GPR04: c2008000  000a 
>> 0001
>> [1.628449] GPR08: c0eeff7f 0012  
>> 
>> [1.628449] GPR12:  c2b2 fffe 
>> c2971a70
>> [1.628449] GPR16: c2960040 c11a8f98 c752bbf0 
>> 
>> [1.628449] GPR20: c2008fff c0eeff7f c2971a68 
>> c00a0003ff00
>> [1.628449] GPR24: c2971a78 0002 0001 
>> c11a8f98
>> [1.628449] GPR28: c11a8f98 c28daef8 c2008000 
>> c2009000
>> [1.634090] NIP [c045e5fc] __walk_page_range+0x2bc/0xce0
>> [1.635117] LR [c045e580] __walk_page_range+0x240/0xce0
>> [1.635755] Call Trace:
>> [1.636018] [c752bac0] [c045e580] 
>> __walk_page_range+0x240/0xce0 (unreliable)
>> [1.636811] [c752bbd0] [c045f234] 
>> walk_page_range_novma+0x74/0xb0
>> [1.637459] [c752bc20] [c0518448] 
>> ptdump_walk_pgd+0x98/0x170
>> [1.638138] [c752bc70] [c00aa988] 
>> ptdump_check_wx+0x88/0xd0
>> [1.638738] [c752bd50] [c008d6d8] mark_rodata_ro+0x48/0x80
>> [1.639299] [c752bdb0] [c0012a34] kernel_init+0x74/0x1a0
>> [1.639842] [c752be10] [c000cfd4] 
>> ret_from_kernel_thread+0x5c/0x64
>> [1.640597] Instruction dump:
>> [1.641021] 38e7 39490010 7ce707b4 7fca5436 79081564 7d4a3838 
>> 7908f082 794a1f24
>> [1.641740] 78a8f00e 30e6 7ea85214 7ce73110 <7d48502a> 78f90fa4 
>> 2c2a 39290010
>> [1.642771] ---[ end trace 6cf72b085097ad52 ]---
>> [1.643220]
>> [2.644228] Kernel panic - not syncing: Attempted to kill init! 
>> exitcode=0x000b
>> [2.645523] ---[ end Kernel panic - not syncing: Attempted to kill init! 
>> exitcode=0x000b ]---
>> 
>> This is not compiler specific, I can reproduce it with GCC 11.2.0 and
>> binutils 2.37. If there is any additional information I can provide,
>> please let me know.
>
> Can you provide a dissassembly of __walk_page_range() ? Or provide your 
> vmlinux binary.

It seems to be walking of the end of the pgd.

[3.373800] walk_p4d_range: addr c00fff00 end c00fff80
[3.373852] walk_p4d_range: addr c00fff80 end c010   
<- end of pgd at PAGE_OFFSET + 4PB
[3.373905] walk_p4d_range: addr c010 end c0100080
[3.373957] walk_p4d_range: addr c0100080 end c0100100
[3.374009] walk_p4d_range: addr c0100100 end c0100180
[3.374060] walk_p4d_range: addr c0100180 end c0100200
[3.376727] walk_p4d_range: addr c0100200 end c0100280
[3.376780] walk_p4d_range: addr c0100280 end c0100300
[3.376831] walk_p4d_range: addr c0100300 end c0100380
[3.376883] walk_p4d_range: addr c0100380 end c0100400
[3.376935] walk_p4d_range: addr c0100400 end c0100480
[3.376988] walk_p4d_range: addr c0100480 end c0100500
[3.377039] walk_p4d_range: addr c0100500 end c0100580
[3.377091] walk_p4d_range: addr c0100580 end c0100600
[3.377143] walk_p4d_range: addr c0100600 end c0100680
[3.377244] walk_pud_range: addr c0100600 end c0100680
[3.377374] walk_pmd_range: addr c0100601 end c01006014000
[3.377817] BUG: Unable to handle kernel data access on read at 

Re: [RFC PATCH 6/6] powerpc/microwatt: Stop building the hash MMU code

2021-08-29 Thread Michael Ellerman
Christophe Leroy  writes:
> Le 27/08/2021 à 18:34, Nicholas Piggin a écrit :
>> Microwatt is radix-only, so stop selecting the hash MMU code.
>> 
>> This saves 20kB compressed dtbImage and 56kB vmlinux size.
>> 
>> Signed-off-by: Nicholas Piggin 
>> ---
>>   arch/powerpc/configs/microwatt_defconfig | 1 -
>>   arch/powerpc/platforms/Kconfig.cputype   | 1 +
>>   arch/powerpc/platforms/microwatt/Kconfig | 1 -
>>   3 files changed, 1 insertion(+), 2 deletions(-)
>> 
>> diff --git a/arch/powerpc/configs/microwatt_defconfig 
>> b/arch/powerpc/configs/microwatt_defconfig
>> index bf5f2e5905eb..6fbad42f9e3d 100644
>> --- a/arch/powerpc/configs/microwatt_defconfig
>> +++ b/arch/powerpc/configs/microwatt_defconfig
>> @@ -26,7 +26,6 @@ CONFIG_PPC_MICROWATT=y
>>   # CONFIG_PPC_OF_BOOT_TRAMPOLINE is not set
>>   CONFIG_CPU_FREQ=y
>>   CONFIG_HZ_100=y
>> -# CONFIG_PPC_MEM_KEYS is not set
>>   # CONFIG_SECCOMP is not set
>>   # CONFIG_MQ_IOSCHED_KYBER is not set
>>   # CONFIG_COREDUMP is not set
>> diff --git a/arch/powerpc/platforms/Kconfig.cputype 
>> b/arch/powerpc/platforms/Kconfig.cputype
>> index 9b90fc501ed4..b9acd6c68c81 100644
>> --- a/arch/powerpc/platforms/Kconfig.cputype
>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>> @@ -371,6 +371,7 @@ config SPE
>>   config PPC_64S_HASH_MMU
>>  bool "Hash MMU Support"
>>  depends on PPC_BOOK3S_64
>> +depends on !PPC_MICROWATT
>
> Do we really want to start nit picking which platforms can select such or 
> such option ?
> Isn't it enough to get it off through the defconfig ?
>
> Is PPC_MICROWATT mutually exclusive with other book3s64 configs ? Don't we 
> build multiplatform kernels ?

Yeah that belongs in the microwatt defconfig, not as a forced exclusion
in Kconfig.

cheers


Re: [PATCH] net: spider_net: switch from 'pci_' to 'dma_' API

2021-08-29 Thread Michael Ellerman
Christophe Leroy  writes:
> Le 27/08/2021 à 21:56, Christophe JAILLET a écrit :
>> ---
>> It has *not* been compile tested because I don't have the needed
>> configuration or cross-compiler. However, the modification is completely
>> mechanical and done by coccinelle.
>
> All you need is at https://mirrors.edge.kernel.org/pub/tools/crosstool/

There's also some instructions here for using distro toolchains:

https://github.com/linuxppc/wiki/wiki/Building-powerpc-kernels

cheers


[GIT PULL] Please pull powerpc/linux.git powerpc-5.14-7 tag

2021-08-28 Thread Michael Ellerman
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Hi Linus,

Please pull two more powerpc fixes for 5.14:

The following changes since commit 9f7853d7609d59172eecfc5e7ccf503bc1b690bd:

  powerpc/mm: Fix set_memory_*() against concurrent accesses (2021-08-19 
09:41:54 +1000)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
tags/powerpc-5.14-7

for you to fetch changes up to 787c70f2f9990b5a197320152d2fc32cd8a6ad1a:

  powerpc/64s: Fix scv implicit soft-mask table for relocated kernels 
(2021-08-20 22:35:18 +1000)

- --
powerpc fixes for 5.14 #7

 - Fix scv implicit soft-mask table for relocated (eg. kdump) kernels.
 - Re-enable ARCH_ENABLE_SPLIT_PMD_PTLOCK, which was disabled due to a typo.

Thanks to: Lukas Bulwahn, Nicholas Piggin, Daniel Axtens.

- --
Lukas Bulwahn (1):
  powerpc: Re-enable ARCH_ENABLE_SPLIT_PMD_PTLOCK

Nicholas Piggin (1):
  powerpc/64s: Fix scv implicit soft-mask table for relocated kernels


 arch/powerpc/kernel/exceptions-64s.S   | 7 ---
 arch/powerpc/platforms/Kconfig.cputype | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)
-BEGIN PGP SIGNATURE-

iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmEqL8QACgkQUevqPMjh
pYDvjQ//dYPmb8eTLrWTRXUy6vLnem+CuJv6gyLEG+7s6T8qZMSA2uvKr4zXAvgF
2QOAoVYWDth6blxjfHc5zqpdERhZHhOq0o2ofuJYGz1J/HAymvIohGFmiIwvOu5n
rPI8NMmsw61W35jJS7dgRr86b3rBwZSzRpDL14g+zQRRzuCqnBdOCA3Ixxn5JQ/F
D1AhyWL61IpVdg0Tz6FRU8s+VKYHh4Yr/CsozkFRMgqZZ2k3zs7aTcluyN8JhEta
BoqejJStrgRt2KOJPr3HXk5fHaoz/9AJn3lSauhnEPFR/Li2ChjkDZPd9KQysHI/
f56HfO/jRx3lY/qhHQ3HeGVJ8rsQrzEILj7KqL0KHwfQoqAhP3E2sut6oqZBFWii
HzfNl0vDrVkjBW7WDV/Y1hlGYaeiGt6DgXwh6wifek6JhMSABwyrd+Uoi9efG7sg
fOUl11VHvBHHQoT+h526urQrdSvgNn+M2iwjElK3LC+tDStNVZTxEiS/KtvpK1vC
I7f7CuwS+z3y4kPs2lwr1t0qQOnmOsvJP82+IDb7Tq4LJRgajyFlGdUEtMrpxZvP
whpFxWrxlQabX6QEr3CGYTGjbiYaGUCfWLkwhwpcftp7QcadjYko7u6nxIyOjI2G
8XG3+CMcPkNB4fSo4Ijt5upaTjhjl4JUGEenLHbyzdcixfj9Z9w=
=yPpX
-END PGP SIGNATURE-


Re: [PATCH v2 0/2] Kconfig symbol fixes on powerpc

2021-08-27 Thread Michael Ellerman
On Thu, 19 Aug 2021 13:39:52 +0200, Lukas Bulwahn wrote:
> The script ./scripts/checkkconfigsymbols.py warns on invalid references to
> Kconfig symbols (often, minor typos, name confusions or outdated references).
> 
> This patch series addresses all issues reported by
> ./scripts/checkkconfigsymbols.py in ./drivers/usb/ for Kconfig and Makefile
> files. Issues in the Kconfig and Makefile files indicate some shortcomings in
> the overall build definitions, and often are true actionable issues to 
> address.
> 
> [...]

Patch 2 applied to powerpc/fixes.

[2/2] powerpc: rectify selection to ARCH_ENABLE_SPLIT_PMD_PTLOCK
  https://git.kernel.org/powerpc/c/310d2e83cb9b7f1e7232319880e3fcb57592fa10

cheers


Re: [PATCH v1] powerpc/64s: Fix scv implicit soft-mask table for relocated kernels

2021-08-27 Thread Michael Ellerman
On Fri, 20 Aug 2021 20:34:31 +1000, Nicholas Piggin wrote:
> The implict soft-mask table addresses get relocated if they use a
> relative symbol like a label. This is right for code that runs relocated
> but not for unrelocated. The scv interrupt vectors run unrelocated, so
> absolute addresses are required for their soft-mask table entry.
> 
> This fixes crashing with relocated kernels, usually an asynchronous
> interrupt hitting in the scv handler, then hitting the trap that checks
> whether r1 is in userspace.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/64s: Fix scv implicit soft-mask table for relocated kernels
  https://git.kernel.org/powerpc/c/787c70f2f9990b5a197320152d2fc32cd8a6ad1a

cheers


Re: [PATCH] powerpc/xive: Do not mark xive_request_ipi() as __init

2021-08-27 Thread Michael Ellerman
On Mon, 16 Aug 2021 11:57:11 -0700, Nathan Chancellor wrote:
> Compiling ppc64le_defconfig with clang-14 shows a modpost warning:
> 
> WARNING: modpost: vmlinux.o(.text+0xa74e0): Section mismatch in
> reference from the function xive_setup_cpu_ipi() to the function
> .init.text:xive_request_ipi()
> The function xive_setup_cpu_ipi() references
> the function __init xive_request_ipi().
> This is often because xive_setup_cpu_ipi lacks a __init
> annotation or the annotation of xive_request_ipi is wrong.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/xive: Do not mark xive_request_ipi() as __init
  https://git.kernel.org/powerpc/c/3f78c90f9eb2e228f44ecc8f4377753f0e11dbab

cheers


Re: [PATCH v2] powerpc/mm: Fix set_memory_*() against concurrent accesses

2021-08-27 Thread Michael Ellerman
On Wed, 18 Aug 2021 22:05:18 +1000, Michael Ellerman wrote:
> Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
> on one of his systems:
> 
>   kernel tried to execute exec-protected page (c00804073278) - exploit 
> attempt? (uid: 0)
>   BUG: Unable to handle kernel instruction fetch
>   Faulting instruction address: 0xc00804073278
>   Oops: Kernel access of bad area, sig: 11 [#1]
>   LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>   Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
>   CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
>   Workqueue: events control_work_handler [virtio_console]
>   NIP:  c00804073278 LR: c00804073278 CTR: c01e9de0
>   REGS: c0002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
>   MSR:  80004280b033   CR: 24002822 XER: 
> 200400cf
>   ...
>   NIP fill_queue+0xf0/0x210 [virtio_console]
>   LR  fill_queue+0xf0/0x210 [virtio_console]
>   Call Trace:
> fill_queue+0xb4/0x210 [virtio_console] (unreliable)
> add_port+0x1a8/0x470 [virtio_console]
> control_work_handler+0xbc/0x1e8 [virtio_console]
> process_one_work+0x290/0x590
> worker_thread+0x88/0x620
> kthread+0x194/0x1a0
> ret_from_kernel_thread+0x5c/0x64
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/mm: Fix set_memory_*() against concurrent accesses
  https://git.kernel.org/powerpc/c/9f7853d7609d59172eecfc5e7ccf503bc1b690bd

cheers


Re: [PATCH v2] powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP

2021-08-27 Thread Michael Ellerman
On Wed, 18 Aug 2021 06:49:29 + (UTC), Christophe Leroy wrote:
> Commit b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C")
> removed the 'isync' instruction after adding/removing NX bit in user
> segments. The reasoning behind this change was that when setting the
> NX bit we don't mind it taking effect with delay as the kernel never
> executes text from userspace, and when clearing the NX bit this is
> to return to userspace and then the 'rfi' should synchronise the
> context.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/32s: Fix random crashes by adding isync() after locking/unlocking 
KUEP
  https://git.kernel.org/powerpc/c/ef486bf448a057a6e2d50e40ae879f7add6585da

cheers


Re: [PATCH -next] selftests/powerpc: Remove duplicated include from tm-poison.c

2021-08-27 Thread Michael Ellerman
On Fri, 26 Mar 2021 14:48:08 +0800, Zheng Yongjun wrote:
> Remove duplicated include.
> 
> 
> 
> 

Applied to powerpc/next.

[1/1] selftests/powerpc: Remove duplicated include from tm-poison.c
  https://git.kernel.org/powerpc/c/6af0b5570b59ce8dd1608a8e48f59eff3f4bdd04

cheers


Re: [PATCH] [v2] arch: powerpc: Remove duplicate includes

2021-08-27 Thread Michael Ellerman
On Tue, 23 Mar 2021 14:29:05 +0800, Wan Jiabing wrote:
> mmu-hash.h: asm/bug.h has been included at line 12, so remove
> the duplicate one at line 21.
> interrupt.c: asm/interrupt.h has been included at line 12, so
> remove the duplicate one at line 10.
> time.c: linux/sched/clock.h has been included at line 33,so
> remove the duplicate one at line 56 and move sched/cputime.h
> under sched including segament.
> 
> [...]

Applied to powerpc/next.

[1/1] arch: powerpc: Remove duplicate includes
  https://git.kernel.org/powerpc/c/e225c4d6bc389701f2f63fc246420a1da3465ab5

cheers


Re: [PATCH v2 0/4] Some improvements on regs usage

2021-08-27 Thread Michael Ellerman
On Sat, 7 Aug 2021 09:02:35 +0800, sxwj...@me.com wrote:
> From: Xiongwei Song 
> 
> When CONFIG_4xx=y or CONFIG_BOOKE=y, currently in code we reference dsisr
> to get interrupt reasons and reference dar to get excepiton address.
> However, in reference manuals, esr is used for interrupt reasons and dear
> is used for excepiton address, so the patchset changes dsisr -> esr,
> dar -> dear for CONFIG_4xx=y or CONFIG_BOOKE=y.
> 
> [...]

Applied to powerpc/next.

[1/4] powerpc: Optimize register usage for esr register
  https://git.kernel.org/powerpc/c/4f8e78c0757e3c5a65d9d8ac76e2434c71a78f5a
[2/4] powerpc/64e: Get esr offset with _ESR macro
  https://git.kernel.org/powerpc/c/cfa47772ca8d53d7a6c9b331a7f6e7c4c9827214
[3/4] powerpc: Optimize register usage for dear register
  https://git.kernel.org/powerpc/c/4872cbd0ca35ca5b20d52e2539e7e1950f126e7b
[4/4] powerpc/64e: Get dear offset with _DEAR macro
  https://git.kernel.org/powerpc/c/d9db6e420268b2d561731468a31f0b15e2e9a145

cheers


Re: [PATCH] powerpc/head_check: use stdout for error messages

2021-08-27 Thread Michael Ellerman
On Sun, 15 Aug 2021 15:23:34 -0700, Randy Dunlap wrote:
> Prefer stderr instead of stdout for error messages.
> This is a good practice and can help CI error detecting and
> reporting (0day in this case).
> 
> 
> 
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/head_check: use stdout for error messages
  https://git.kernel.org/powerpc/c/47c258d71ebfc832a760a1dc6540cf3c33968023

cheers


Re: (subset) [PATCH v2 00/60] KVM: PPC: Book3S HV P9: entry/exit optimisations

2021-08-27 Thread Michael Ellerman
On Thu, 12 Aug 2021 02:00:34 +1000, Nicholas Piggin wrote:
> This reduces radix guest full entry/exit latency on POWER9 and POWER10
> by 2x.
> 
> Nested HV guests should see smaller improvements in their L1 entry/exit,
> but this is also combined with most L0 speedups also applying to nested
> entry. nginx localhost throughput test in a SMP nested guest is improved
> about 10% (in a direct guest it doesn't change much because it uses XIVE
> for IPIs) when L0 and L1 are patched.
> 
> [...]

Applied to powerpc/next.

[01/60] KVM: PPC: Book3S HV: Initialise vcpu MSR with MSR_ME

https://git.kernel.org/powerpc/c/fd42b7b09c602c904452c0c3e5955ca21d8e387a
[02/60] KVM: PPC: Book3S HV: Remove TM emulation from POWER7/8 path

https://git.kernel.org/powerpc/c/daac40e8d7a63ab8608132a7cfce411986feac32
[03/60] KVM: PPC: Book3S HV P9: Fixes for TM softpatch interrupt NIP

https://git.kernel.org/powerpc/c/4782e0cd0d184d727ad3b0cfe20d1d44d9f98239
[04/60] KVM: PPC: Book3S HV Nested: Fix TM softpatch HFAC interrupt emulation

https://git.kernel.org/powerpc/c/d82b392d9b3556b63e3f9916cf057ea847e173a9
[05/60] KVM: PPC: Book3S HV Nested: Sanitise vcpu registers

https://git.kernel.org/powerpc/c/7487cabc7ed2f7716bf304e4e97c57fd995cf70e
[06/60] KVM: PPC: Book3S HV Nested: Make nested HFSCR state accessible

https://git.kernel.org/powerpc/c/8b210a880b35ba75eb42b79dfd65e369c1feb119
[07/60] KVM: PPC: Book3S HV Nested: Stop forwarding all HFUs to L1

https://git.kernel.org/powerpc/c/7c3ded5735141ff4d049747c9f76672a8b737c49
[08/60] KVM: PPC: Book3S HV Nested: save_hv_return_state does not require trap 
argument

https://git.kernel.org/powerpc/c/f2e29db156523bf08a0524e0f4306a641912c6d9
[09/60] KVM: PPC: Book3S HV Nested: Reflect guest PMU in-use to L0 when guest 
SPRs are live

https://git.kernel.org/powerpc/c/1782663897945a5cf28e564ba5eed730098e9aa4
[10/60] powerpc/64s: Remove WORT SPR from POWER9/10

https://git.kernel.org/powerpc/c/0c8fb653d487d2873f5eefdcaf4cecff4e103828

cheers


Re: [PATCH] powerpc/pseries: Fix build error when NUMA=n

2021-08-27 Thread Michael Ellerman
On Mon, 16 Aug 2021 14:10:32 +1000, Michael Ellerman wrote:
> As reported by lkp, if NUMA=n we see a build error:
> 
>arch/powerpc/platforms/pseries/hotplug-cpu.c: In function 
> 'pseries_cpu_hotplug_init':
>arch/powerpc/platforms/pseries/hotplug-cpu.c:1022:8: error: 
> 'node_to_cpumask_map' undeclared
> 1022 |node_to_cpumask_map[node]);
> 
> Use cpumask_of_node() which has an empty stub for NUMA=n, and when
> NUMA=y does a lookup from node_to_cpumask_map[].
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/pseries: Fix build error when NUMA=n
  https://git.kernel.org/powerpc/c/8b893ef190b0c440877de04f767efca4bf4d6af8

cheers


Re: [PATCH v2] ppc: add "-z notext" flag to disable diagnostic

2021-08-27 Thread Michael Ellerman
On Fri, 13 Aug 2021 13:05:11 -0700, Bill Wendling wrote:
> Object files used to link .tmp_vmlinux.kallsyms1 have many R_PPC64_ADDR64
> relocations in non-SHF_WRITE sections. There are many text relocations (e.g. 
> in
> .rela___ksymtab_gpl+* and .rela__mcount_loc sections) in a -pie link and are
> disallowed by LLD:
> 
>   ld.lld: error: can't create dynamic relocation R_PPC64_ADDR64 against local 
> symbol in readonly segment; recompile object files with -fPIC or pass 
> '-Wl,-z,notext' to allow text relocations in the output
>   >>> defined in arch/powerpc/kernel/head_64.o
>   >>> referenced by arch/powerpc/kernel/head_64.o:(__restart_table+0x10)
> 
> [...]

Applied to powerpc/next.

[1/1] ppc: add "-z notext" flag to disable diagnostic
  https://git.kernel.org/powerpc/c/0355785313e2191be4e1108cdbda94ddb0238c48

cheers


Re: [PATCH v2 0/2] Kconfig symbol fixes on powerpc

2021-08-27 Thread Michael Ellerman
On Thu, 19 Aug 2021 13:39:52 +0200, Lukas Bulwahn wrote:
> The script ./scripts/checkkconfigsymbols.py warns on invalid references to
> Kconfig symbols (often, minor typos, name confusions or outdated references).
> 
> This patch series addresses all issues reported by
> ./scripts/checkkconfigsymbols.py in ./drivers/usb/ for Kconfig and Makefile
> files. Issues in the Kconfig and Makefile files indicate some shortcomings in
> the overall build definitions, and often are true actionable issues to 
> address.
> 
> [...]

Patch 1 applied to powerpc/next.

[1/2] powerpc: kvm: remove obsolete and unneeded select
  https://git.kernel.org/powerpc/c/c26d4c5d4f0df7207da3975458261927f9305465

cheers


Re: [PATCH v4 1/3] powerpc/perf: Use stack siar instead of mfspr

2021-08-27 Thread Michael Ellerman
On Wed, 18 Aug 2021 22:45:54 +0530, Kajol Jain wrote:
> Minor optimization in the 'perf_instruction_pointer' function code by
> making use of stack siar instead of mfspr.
> 
> 
> 
> 

Applied to powerpc/next.

[1/3] powerpc/perf: Use stack siar instead of mfspr
  https://git.kernel.org/powerpc/c/b1643084d164cea0c107a39bcdf0119fc52619af
[2/3] powerpc/perf: Drop the case of returning 0 as instruction pointer
  https://git.kernel.org/powerpc/c/cc90c6742ef5b438f4cb86029d7a794bd0a44a06
[3/3] powerpc/perf: Fix the check for SIAR value
  https://git.kernel.org/powerpc/c/3c69a5f3fa3e312689ec218b5059784d49d7

cheers


Re: [PATCH] powerpc/perf/hv-gpci: Fix the logic to compute counter value from the hcall result buffer.

2021-08-27 Thread Michael Ellerman
On Fri, 13 Aug 2021 13:51:58 +0530, Kajol Jain wrote:
> H_GetPerformanceCounterInfo (0xF080) hcall returns the counter data in the
> result buffer. Result buffer has specific format defined in the PAPR
> specification. One of the field is counter offset and width of the counter
> data returned.
> 
> Counter data are returned in a unsigned char array. To
> get the final counter data, these values should be left shifted
> byte at a time. But commit 220a0c609ad17 ("powerpc/perf: Add support
> for the hv gpci (get performance counter info) interface") made the
> shifting bitwise. Because of this, hcall counters values could end up
> in lower side, which messes the counter prev vs now calculation. This
> lead to huge counter value reporting
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/perf/hv-gpci: Fix the logic to compute counter value from the 
hcall result buffer.
  https://git.kernel.org/powerpc/c/f9addd85fbfacf0d155e83dbee8696d6df5ed0c7

cheers


Re: [PATCH v2 0/3] powerpc: mpc855_ads defconfig fixes

2021-08-27 Thread Michael Ellerman
On Tue, 17 Aug 2021 14:24:04 +0930, Joel Stanley wrote:
> v2: fix typos, split out mtd fix from savedefconfig patch
> 
> The first patch fixes a build warning I noticed when testing something
> unrelated.
> 
> The second fixes a regression where the MTD partition support dropped out
> of the config. I have enabled the dependency so this is still part of
> the config.
> 
> [...]

Applied to powerpc/next.

[1/3] powerpc/config: Fix IPV6 warning in mpc855_ads
  https://git.kernel.org/powerpc/c/c5ac55b6cbc604ad4144c380feae20f69453df91
[2/3] powerpc/config: Renable MTD_PHYSMAP_OF
  https://git.kernel.org/powerpc/c/d0e28a6145c3455b69991245e7f6147eb914b34a
[3/3] powerpc/configs: Regenerate mpc885_ads_defconfig
  https://git.kernel.org/powerpc/c/87e0d46bf68913f4c87dba94aadc00da658a874b

cheers


Re: [PATCH v2 1/2] selftests/powerpc: Add missing clobbered register to to ptrace TM tests

2021-08-27 Thread Michael Ellerman
On Thu, 29 Jul 2021 14:13:16 +1000, Jordan Niethe wrote:
> ISA v3.1 removes TM but includes a synthetic implementation for
> backwards compatibility.  With this implementation,  the tests
> ptrace-tm-spd-gpr and ptrace-tm-gpr should never be able to make any
> forward progress and eventually should be killed by the timeout.
> Instead on a P10 running in P9 mode, ptrace_tm_gpr fails like so:
> 
> test: ptrace_tm_gpr
> tags: git_version:unknown
> Starting the child
> ...
> ...
> GPR[27]: 1 Expected: 2
> GPR[28]: 1 Expected: 2
> GPR[29]: 1 Expected: 2
> GPR[30]: 1 Expected: 2
> GPR[31]: 1 Expected: 2
> [FAIL] Test FAILED on line 98
> failure: ptrace_tm_gpr
> selftests:  ptrace-tm-gpr [FAIL]
> 
> [...]

Applied to powerpc/next.

[1/2] selftests/powerpc: Add missing clobbered register to to ptrace TM tests
  https://git.kernel.org/powerpc/c/c95278a0534449efc64ac8169382bce217963be2
[2/2] selftests: Skip TM tests on synthetic TM implementations
  https://git.kernel.org/powerpc/c/e42edf9b9d126bb1c743f2e7984877ba27f09fe7

cheers


Re: [PATCH] powerpc/tau: Add 'static' storage qualifier to 'tau_work' definition

2021-08-27 Thread Michael Ellerman
On Thu, 19 Aug 2021 10:46:54 +1000, Finn Thain wrote:
> This patch prevents the following sparse warning.
> 
> arch/powerpc/kernel/tau_6xx.c:199:1: sparse: sparse: symbol 'tau_work'
> was not declared. Should it be static?
> 
> 
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/tau: Add 'static' storage qualifier to 'tau_work' definition
  https://git.kernel.org/powerpc/c/6cd717fe9b3a787f8e8f9d0bc9b7634a9c104b3e

cheers


Re: [PATCH v2 0/3] KVM: PPC: Book3S HV: kvmhv_copy_tofrom_guest_radix changes

2021-08-27 Thread Michael Ellerman
On Thu, 5 Aug 2021 18:26:13 -0300, Fabiano Rosas wrote:
> This series contains the fix for __kvmhv_copy_tofrom_guest_radix plus
> a couple of changes.
> 
> - Patch 1: The fix itself. I thought a smaller fix upfront would be
>better to facilitate any backports.
> 
> - Patch 2: Adds a sanity check to the low level function. Since the
>hcall API already enforces that the effective address must
>be on quadrant 0, moving the checks from the high level
>function would only add overhead to the hcall path, so I
>opted for a lightweight check at the low level.
> 
> [...]

Applied to powerpc/next.

[1/3] KVM: PPC: Book3S HV: Fix copy_tofrom_guest routines
  https://git.kernel.org/powerpc/c/5d7d6dac8fe99ed59eee2300e4a03370f94d5222
[2/3] KVM: PPC: Book3S HV: Add sanity check to copy_tofrom_guest
  https://git.kernel.org/powerpc/c/c232461c0c3b0aca637f0d7658a7f5d0bb393a4e
[3/3] KVM: PPC: Book3S HV: Stop exporting symbols from book3s_64_mmu_radix
  https://git.kernel.org/powerpc/c/0eb596f1e6103ebe122792a425b88c5dc21c4087

cheers


Re: [PATCH v2 0/2] W=1 fixes

2021-08-27 Thread Michael Ellerman
On Mon, 23 Aug 2021 11:00:37 +0200, Cédric Le Goater wrote:
> These are the remaining patches needed to compile the ppc kernel with
> W=1. Audit issues are now being addressed by Christophe in patch :
> 
> [v2] powerpc/audit: Convert powerpc to AUDIT_ARCH_COMPAT_GENERIC
> http://patchwork.ozlabs.org/project/linuxppc-dev/patch/dc14509a28a993738b1325211f412be72a4f9b1e.1629701132.git.christophe.le...@csgroup.eu/
> 
> Thanks,
> 
> [...]

Applied to powerpc/next.

[1/2] powerpc/prom: Fix unused variable ‘reserve_map’ when CONFIG_PPC32 is not 
set
  https://git.kernel.org/powerpc/c/3accc0faef081b6813967b34f7d05a3edb855cbd
[2/2] powerpc/compat_sys: Declare syscalls
  https://git.kernel.org/powerpc/c/cc47ad409ba9cc950e9c492c8ba653dabd392148

cheers


Re: [PATCH 0/6] W=1 fixes

2021-08-27 Thread Michael Ellerman
On Thu, 19 Aug 2021 14:56:50 +0200, Cédric Le Goater wrote:
> With this small series, I could compile the ppc kernel with W=1. There
> are certainly other configs which will need more fixes but it's a good
> start.
> 
> The last 2 patches look hacky. Christophe, could you help with these
> to find a better place to include the declarations ?
> 
> [...]

Patches 2-4 applied to powerpc/next.

[2/6] powerpc/pseries/vas: Declare pseries_vas_fault_thread_fn() as static
  https://git.kernel.org/powerpc/c/4cb266074aa17e9cafed3a92e9f43b161516569f
[3/6] KVM: PPC: Book3S PR: Declare kvmppc_handle_exit_pr()
  https://git.kernel.org/powerpc/c/cb53a93e33e108bfec00a13cd12696e1c0ccc8b6
[4/6] KVM: PPC: Book3S PR: Remove unused variable
  https://git.kernel.org/powerpc/c/b352ddae7b2ccd2cb29f495ca0a7c9b6848b623f

cheers


Re: [PATCH v4 1/4] powerpc/ptdump: Use DEFINE_SHOW_ATTRIBUTE()

2021-08-27 Thread Michael Ellerman
On Thu, 8 Jul 2021 16:49:40 + (UTC), Christophe Leroy wrote:
> Use DEFINE_SHOW_ATTRIBUTE() instead of open coding
> open() and fops.
> 
> 
> 
> 

Applied to powerpc/next.

[1/4] powerpc/ptdump: Use DEFINE_SHOW_ATTRIBUTE()
  https://git.kernel.org/powerpc/c/11f27a7fa4ca27935de74e3eb052bdc430d5f8d8
[2/4] powerpc/ptdump: Remove unused 'page_size' parameter
  https://git.kernel.org/powerpc/c/64b87b0c70e0fd28352895cba3c0a9631e0072dd
[3/4] powerpc/ptdump: Reduce level numbers by 1 in note_page() and add p4d level
  https://git.kernel.org/powerpc/c/cf98d2b6eea6a1b2c43f85680ad58fcc3ea9496b
[4/4] powerpc/ptdump: Convert powerpc to GENERIC_PTDUMP
  https://git.kernel.org/powerpc/c/e084728393a58e7fdafeee2e6b20e0faff09b557

cheers


Re: [PATCH v3] powerpc/booke: Avoid link stack corruption in several places

2021-08-27 Thread Michael Ellerman
On Tue, 24 Aug 2021 07:56:26 + (UTC), Christophe Leroy wrote:
> Use bcl 20,31,+4 instead of bl in order to preserve link stack.
> 
> See commit c974809a26a1 ("powerpc/vdso: Avoid link stack corruption
> in __get_datapage()") for details.
> 
> 
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/booke: Avoid link stack corruption in several places
  https://git.kernel.org/powerpc/c/f5007dbf4da729baa850b33a64dc3cc53757bdf8

cheers


Re: (subset) [PATCH v3 1/3] powerpc: Remove MSR_PR check in interrupt_exit_{user/kernel}_prepare()

2021-08-27 Thread Michael Ellerman
On Mon, 23 Aug 2021 08:24:20 + (UTC), Christophe Leroy wrote:
> In those hot functions that are called at every interrupt, any saved
> cycle is worth it.
> 
> interrupt_exit_user_prepare() and interrupt_exit_kernel_prepare() are
> called from three places:
> - From entry_32.S
> - From interrupt_64.S
> - From interrupt_exit_user_restart() and interrupt_exit_kernel_restart()
> 
> [...]

Applied to powerpc/next.

[2/3] powerpc: Refactor verification of MSR_RI
  https://git.kernel.org/powerpc/c/806c0e6e7e97adc17389c8dc1f52d4736f49299b

cheers


Re: [PATCH v2] powerpc: Avoid link stack corruption in misc asm functions

2021-08-27 Thread Michael Ellerman
On Tue, 24 Aug 2021 07:56:35 + (UTC), Christophe Leroy wrote:
> bl;mflr is used at several places to get code position.
> 
> Use bcl 20,31,+4 instead of bl in order to preserve link stack.
> 
> See commit c974809a26a1 ("powerpc/vdso: Avoid link stack corruption
> in __get_datapage()") for details.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: Avoid link stack corruption in misc asm functions
  https://git.kernel.org/powerpc/c/33e1402435cb9f3021439a15935ea2dc69ec1844

cheers


Re: [PATCH v1 1/2] powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros

2021-08-27 Thread Michael Ellerman
On Tue, 2 Mar 2021 08:48:11 + (UTC), Christophe Leroy wrote:
> Force the eh flag at 0 on PPC32.
> 

Patch 1 applied to powerpc/next.

[1/2] powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros
  https://git.kernel.org/powerpc/c/9401f4e46cf6965e23738f70e149172344a01eef

cheers


Re: [PATCH] powerpc/syscalls: Simplify do_mmap2()

2021-08-27 Thread Michael Ellerman
On Fri, 25 Jun 2021 10:58:33 + (UTC), Christophe Leroy wrote:
> When shift is nul, operations remain valid so no test needed.
> 
> And 'ret' is unnecessary.
> 
> And use IS_ALIGNED() to check alignment, that's more clear.
> 
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/syscalls: Simplify do_mmap2()
  https://git.kernel.org/powerpc/c/316389e904f968d24d44cd96a6d171ee1ef269cf

cheers


Re: [PATCH] powerpc/syscalls: Remove __NR__exit

2021-08-27 Thread Michael Ellerman
On Mon, 23 Aug 2021 06:45:20 + (UTC), Christophe Leroy wrote:
> __NR_exit is nowhere used. On most architectures it was removed by
> commit 135ab6ec8fda ("[PATCH] remove remaining errno and
> __KERNEL_SYSCALLS__ references") but not on powerpc.
> 
> powerpc removed __KERNEL_SYSCALLS__ in commit 3db03b4afb3e ("[PATCH]
> rename the provided execve functions to kernel_execve"), but __NR_exit
> was left over.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/syscalls: Remove __NR__exit
  https://git.kernel.org/powerpc/c/a00ea5b6f2bbef8b004b0b7228c61680a50c7c3f

cheers


Re: [PATCH] powerpc/ptrace: Make user_mode() common to PPC32 and PPC64

2021-08-27 Thread Michael Ellerman
On Tue, 17 Aug 2021 16:00:14 + (UTC), Christophe Leroy wrote:
> Today we have:
> 
>   #ifdef __powerpc64__
>   #define user_mode(regs) regs)->msr) >> MSR_PR_LG) & 0x1)
>   #else
>   #define user_mode(regs) (((regs)->msr & MSR_PR) != 0)
>   #endif
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/ptrace: Make user_mode() common to PPC32 and PPC64
  https://git.kernel.org/powerpc/c/19e932eb6ea47f4f37513eb2ae0daee19117765c

cheers


Re: [PATCH] powerpc/32: indirect function call use bctrl rather than blrl in ret_from_kernel_thread

2021-08-27 Thread Michael Ellerman
On Fri, 20 Aug 2021 05:16:05 + (UTC), Christophe Leroy wrote:
> Copied from commit 89bbe4c798bc ("powerpc/64: indirect function call
> use bctrl rather than blrl in ret_from_kernel_thread")
> 
> blrl is not recommended to use as an indirect function call, as it may
> corrupt the link stack predictor.
> 
> This is not a performance critical path but this should be fixed for
> consistency.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/32: indirect function call use bctrl rather than blrl in 
ret_from_kernel_thread
  https://git.kernel.org/powerpc/c/113ec9ccc8049c3772f0eab46b62c5d6654c09f7

cheers


Re: [PATCH] powerpc/audit: Simplify syscall_get_arch()

2021-08-27 Thread Michael Ellerman
On Fri, 20 Aug 2021 09:39:14 + (UTC), Christophe Leroy wrote:
> Make use of is_32bit_task() and CONFIG_CPU_LITTLE_ENDIAN
> to simplify syscall_get_arch().
> 
> 
> 
> 

Applied to powerpc/next.

[1/1] powerpc/audit: Simplify syscall_get_arch()
  https://git.kernel.org/powerpc/c/770cec16cdc9d1e67896dd2214a4fec9a3fa

cheers


Re: [PATCH] powerpc/audit: Avoid unneccessary #ifdef in syscall_get_arguments()

2021-08-27 Thread Michael Ellerman
On Fri, 20 Aug 2021 09:28:19 + (UTC), Christophe Leroy wrote:
> Use is_32bit_task() which already handles CONFIG_COMPAT.
> 
> 
> 
> 

Applied to powerpc/next.

[1/1] powerpc/audit: Avoid unneccessary #ifdef in syscall_get_arguments()
  https://git.kernel.org/powerpc/c/898a1ef06ad4a2a8e3c9490ce62624fcd1a7b1f8

cheers


Re: [PATCH] powerpc/32: Remove unneccessary calculations in load_up_{fpu/altivec}

2021-08-27 Thread Michael Ellerman
On Wed, 18 Aug 2021 08:47:28 + (UTC), Christophe Leroy wrote:
> No need to re-read SPRN_THREAD, we can calculate thread address
> from current (r2).
> 
> And remove a reload of value 1 into r4 as r4 is already 1.
> 
> 
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/32: Remove unneccessary calculations in load_up_{fpu/altivec}
  https://git.kernel.org/powerpc/c/51ed00e71f0130e0f3534b8e7d78facd16829426

cheers


Re: [PATCH] powerpc/doc: Fix htmldocs errors

2021-08-27 Thread Michael Ellerman
On Wed, 25 Aug 2021 09:54:47 +0530, Aneesh Kumar K.V wrote:
> Fix make htmldocs related errors with the newly added associativity.rst
> doc file.
> 
> 
> 
> 

Applied to powerpc/next.

[1/1] powerpc/doc: Fix htmldocs errors
  https://git.kernel.org/powerpc/c/f50da6edbf1ebf35dd8070847bfab5cb988d472b

cheers


Re: [PATCH v2 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto

2021-08-27 Thread Michael Ellerman
Nathan Chancellor  writes:
> On Thu, Aug 26, 2021 at 11:54:17AM -0700, Nathan Chancellor wrote:
>> On Thu, Aug 26, 2021 at 01:21:39PM +1000, Michael Ellerman wrote:
>> > Nathan Chancellor  writes:
>> > > On Tue, Apr 13, 2021 at 04:38:10PM +, Christophe Leroy wrote:
>> > >> Using asm goto in __WARN_FLAGS() and WARN_ON() allows more
>> > >> flexibility to GCC.
>> > ...
>> > >
>> > > This patch as commit 1e688dd2a3d6 ("powerpc/bug: Provide better
>> > > flexibility to WARN_ON/__WARN_FLAGS() with asm goto") cause a WARN_ON in
>> > > klist_add_tail to trigger over and over on boot when compiling with
>> > > clang:
>> > >
...
>> > 
>> > This patch seems to fix it. Not sure if that's just papering over it 
>> > though.
>> > 
>> > diff --git a/arch/powerpc/include/asm/bug.h 
>> > b/arch/powerpc/include/asm/bug.h
>> > index 1ee0f22313ee..75fcb4370d96 100644
>> > --- a/arch/powerpc/include/asm/bug.h
>> > +++ b/arch/powerpc/include/asm/bug.h
>> > @@ -119,7 +119,7 @@ __label_warn_on:   
>> > \
>> >\
>> >WARN_ENTRY(PPC_TLNEI " %4, 0",  \
>> >   BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN), 
>> > \
>> > - __label_warn_on, "r" (x));   \
>> > + __label_warn_on, "r" (!!(x))); \
>> >break;  \
>> >  __label_warn_on:  \
>> >__ret_warn_on = true;   \
>> > 
>> 
>> Thank you for the in-depth explanation and triage! I have gone ahead and
>> created a standalone reproducer that shows this based on the
>> preprocessed file and opened https://bugs.llvm.org/show_bug.cgi?id=51634
>> so the LLVM developers can take a look.
>
> Based on Eli Friedman's comment in the bug, would something like this
> work and not regress GCC? I noticed that the BUG_ON macro does a cast as
> well. Nick pointed out to me privately that we have run into what seems
> like a similar issue before in commit 6c58f25e6938 ("riscv/atomic: Fix
> sign extension for RV64I").

Yes, I read that comment this morning, and then landed at the same fix
via digging through the history of our bug code.

We in fact fixed a similar bug 16 years ago :}

32818c2eb6b8 ("[PATCH] ppc64: Fix issue with gcc 4.0 compiled kernels")

Which is when we first started adding the cast to long.


> diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
> index 1ee0f22313ee..35022667f57d 100644
> --- a/arch/powerpc/include/asm/bug.h
> +++ b/arch/powerpc/include/asm/bug.h
> @@ -119,7 +119,7 @@ __label_warn_on:  
>   \
> \
> WARN_ENTRY(PPC_TLNEI " %4, 0",  \
>BUGFLAG_WARNING | 
> BUGFLAG_TAINT(TAINT_WARN), \
> -  __label_warn_on, "r" (x));   \
> +  __label_warn_on, "r" ((__force long)(x))); 
>   \
> break;  \
>  __label_warn_on:   \
> __ret_warn_on = true;   \


Yeah that fixes the clang build for me.

For GCC it seems to generate the same code in the simple cases:

void test_warn_on_ulong(unsigned long b)
{
WARN_ON(b);
}

void test_warn_on_int(int b)
{
WARN_ON(b);
}

I get:

0020 <.test_warn_on_ulong>:
  20:   0b 03 00 00 tdnei   r3,0
  24:   4e 80 00 20 blr
  28:   60 00 00 00 nop
  2c:   60 00 00 00 nop

0030 <.test_warn_on_int>:
  30:   0b 03 00 00 tdnei   r3,0
  34:   4e 80 00 20 blr

Both before and after the change. So that's good.

For:

void test_warn_on_int_addition(int b)
{
WARN_ON(b+1);
}

Before I get:

0040 <.test_warn_on_int_addition>:
  40:   38 63 00 01 addir3,r3,1
  44:   0b 03 00 00 tdnei   r3,0
  48:   4e 80 00 20 blr

vs after:

0040 <.test_warn_on_int_addition>:
  40:   38 63 00 01 addir3,r3,1
  44:   7c 63 07 b4 extsw   r3,r3
  48:   0b 03 00 00 tdnei   r3,0
  4c:   4e 80 00 20 blr


So there's an extra sign extension we don't need, but I guess we can
probably live with that.

cheers


Re: [GIT PULL] retire legacy WR sbc8548 and sbc8641 platforms

2021-08-26 Thread Michael Ellerman
Paul Gortmaker  writes:
> This is unchanged from the original wr_sbc-delete branch sent in January,
> other than to add the Acks from Scott in July, and update the baseline.
>
> Built with ppc64 defconfig and mpc85xx_cds_defconfig and mpc86xx_defconfig
> just to make sure I didn't fat finger anything in the baseline update.

Thanks for following up on this.

I ended up cherry-picking the patches into my branch. I like to keep my
next based on rc2, and merging this would have pulled in everything up
to rc7 into my branch.

I don't think you were planning to merge this branch anywhere else, so
it shouldn't make any difference, but let me know if it's a problem.

It should appear here soon:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/log/?h=next


cheers
 

> Original v1 text follows below, from:
>
> https://lore.kernel.org/lkml/20210111082823.99562-1-paul.gortma...@windriver.com
>
> It would be nice to get this in and off our collective to-do list.
>
> Thanks,
> Paul.
>
>   ---
>
> In v2.6.27 (2008, 917f0af9e5a9) the sbc8260 support was implicitly
> retired by not being carried forward through the ppc --> powerpc
> device tree transition.
>
> Then, in v3.6 (2012, b048b4e17cbb) we retired the support for the
> sbc8560 boards.
>
> Next, in v4.18 (2017, 3bc6cf5a86e5) we retired the support for the
> 2006 vintage sbc834x boards.
>
> The sbc8548 and sbc8641d boards were maybe 1-2 years newer than the
> sbc834x boards, but it is also 3+ years later, so it makes sense to
> now retire them as well - which is what is done here.
>
> These two remaining WR boards were based on the Freescale MPC8548-CDS
> and the MPC8641D-HPCN reference board implementations.  Having had the
> chance to use these and many other Fsl ref boards, I know this:  The
> Freescale reference boards were typically produced in limited quantity
> and primarily available to BSP developers and hardware designers, and
> not likely to have found a 2nd life with hobbyists and/or collectors.
>
> It was good to have that BSP code subjected to mainline review and
> hence also widely available back in the day. But given the above, we
> should probably also be giving serious consideration to retiring
> additional similar age/type reference board platforms as well.
>
> I've always felt it is important for us to be proactive in retiring
> old code, since it has a genuine non-zero carrying cost, as described
> in the 930d52c012b8 merge log.  But for the here and now, we just
> clean up the remaining BSP code that I had added for SBC platforms.
>
> --- 
>
> The following changes since commit e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93:
>
>   Linux 5.14-rc7 (2021-08-22 14:24:56 -0700)
>
> are available in the Git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux.git 
> wr_sbc-delete-v2
>
> for you to fetch changes up to d44e2dc12ea2112e74cdd25090eeda2727ed09cc:
>
>   MAINTAINERS: update for Paul Gortmaker (2021-08-24 08:19:01 -0400)
>
> 
> Paul Gortmaker (3):
>   powerpc: retire sbc8548 board support
>   powerpc: retire sbc8641d board support
>   MAINTAINERS: update for Paul Gortmaker
>
>  MAINTAINERS |   1 -
>  arch/powerpc/boot/Makefile  |   1 -
>  arch/powerpc/boot/dts/fsl/sbc8641d.dts  | 176 -
>  arch/powerpc/boot/dts/sbc8548-altflash.dts  | 111 ---
>  arch/powerpc/boot/dts/sbc8548-post.dtsi | 289 
> 
>  arch/powerpc/boot/dts/sbc8548-pre.dtsi  |  48 -
>  arch/powerpc/boot/dts/sbc8548.dts   | 106 --
>  arch/powerpc/boot/wrapper   |   2 +-
>  arch/powerpc/configs/85xx/sbc8548_defconfig |  50 -
>  arch/powerpc/configs/mpc85xx_base.config|   1 -
>  arch/powerpc/configs/mpc86xx_base.config|   1 -
>  arch/powerpc/configs/ppc6xx_defconfig   |   1 -
>  arch/powerpc/platforms/85xx/Kconfig |   6 -
>  arch/powerpc/platforms/85xx/Makefile|   1 -
>  arch/powerpc/platforms/85xx/sbc8548.c   | 134 -
>  arch/powerpc/platforms/86xx/Kconfig |   8 +-
>  arch/powerpc/platforms/86xx/Makefile|   1 -
>  arch/powerpc/platforms/86xx/sbc8641d.c  |  87 -
>  18 files changed, 2 insertions(+), 1022 deletions(-)
>  delete mode 100644 arch/powerpc/boot/dts/fsl/sbc8641d.dts
>  delete mode 100644 arch/powerpc/boot/dts/sbc8548-altflash.dts
>  delete mode 100644 arch/powerpc/boot/dts/sbc8548-post.dtsi
>  delete mode 100644 arch/powerpc/boot/dts/sbc8548-pre.dtsi
>  delete mode 100644 arch/powerpc/boot/dts/sbc8548.dts
>  delete mode 100644 arch/powerpc/configs/85xx/sbc8548_defconfig
>  delete mode 100644 arch/powerpc/platforms/85xx/sbc8548.c
>  delete mode 100644 arch/powerpc/platforms/86xx/sbc8641d.c


Re: [PATCH v2 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto

2021-08-26 Thread Michael Ellerman
Christophe Leroy  writes:
> Le 26/08/2021 à 05:21, Michael Ellerman a écrit :
>> Nathan Chancellor  writes:
>>> On Tue, Apr 13, 2021 at 04:38:10PM +, Christophe Leroy wrote:
>>>> Using asm goto in __WARN_FLAGS() and WARN_ON() allows more
>>>> flexibility to GCC.
>> ...
>>>
>>> This patch as commit 1e688dd2a3d6 ("powerpc/bug: Provide better
>>> flexibility to WARN_ON/__WARN_FLAGS() with asm goto") cause a WARN_ON in
>>> klist_add_tail to trigger over and over on boot when compiling with
>>> clang:
>
> ...
>
>> 
>> This patch seems to fix it. Not sure if that's just papering over it though.
>> 
>> diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
>> index 1ee0f22313ee..75fcb4370d96 100644
>> --- a/arch/powerpc/include/asm/bug.h
>> +++ b/arch/powerpc/include/asm/bug.h
>> @@ -119,7 +119,7 @@ __label_warn_on: 
>> \
>>  \
>>  WARN_ENTRY(PPC_TLNEI " %4, 0",  \
>> BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN), 
>> \
>> -   __label_warn_on, "r" (x));   \
>> +   __label_warn_on, "r" (!!(x))); \
>>  break;  \
>>   __label_warn_on:   \
>>  __ret_warn_on = true;   \
>
> But for a simple WARN_ON() call:
>
> void test(unsigned long b)
> {
>   WARN_ON(b);
> }
>
> Without your change with GCC you get:
>
> 12d0 <.test>:
>  12d0:0b 03 00 00 tdnei   r3,0
>  12d4:4e 80 00 20 blr
>
>
> With the !! change you get:
>
> 12d0 <.test>:
>  12d0:31 23 ff ff addic   r9,r3,-1
>  12d4:7d 29 19 10 subfe   r9,r9,r3
>  12d8:0b 09 00 00 tdnei   r9,0
>  12dc:4e 80 00 20 blr

Yeah that's a pity.

We could do something like below, which is ugly, but would be better
than having to revert the whole thing.

Although this doesn't fix the strange warning in drivers/net/ethernet/sfc.

So possibly we need a CLANG ifdef around the whole thing, and use the
old style warn for clang.

cheers


diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index 1ee0f22313ee..d978d9004d0d 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -106,6 +106,12 @@ __label_warn_on:   
\
}   \
 } while (0)
 
+#ifdef CONFIG_CC_IS_CLANG
+#define __clang_warn_hack(x)   (!!(x))
+#else
+#define __clang_warn_hack(x)   (x)
+#endif
+
 #define WARN_ON(x) ({  \
bool __ret_warn_on = false; \
do {\
@@ -119,7 +125,8 @@ __label_warn_on:
\
\
WARN_ENTRY(PPC_TLNEI " %4, 0",  \
   BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN), 
\
-  __label_warn_on, "r" (x));   \
+  __label_warn_on, \
+  "r" __clang_warn_hack(x));   \
break;  \
 __label_warn_on:   \
__ret_warn_on = true;   \




Re: [PATCH v2 RESEND] powerpc/audit: Convert powerpc to AUDIT_ARCH_COMPAT_GENERIC

2021-08-26 Thread Michael Ellerman
Paul Moore  writes:
> On Tue, Aug 24, 2021 at 1:11 PM Christophe Leroy
>  wrote:
>> Le 24/08/2021 à 16:47, Paul Moore a écrit :
>> > On Tue, Aug 24, 2021 at 9:36 AM Christophe Leroy
>> >  wrote:
>> >>
>> >> Commit e65e1fc2d24b ("[PATCH] syscall class hookup for all normal
>> >> targets") added generic support for AUDIT but that didn't include
>> >> support for bi-arch like powerpc.
>> >>
>> >> Commit 4b58841149dc ("audit: Add generic compat syscall support")
>> >> added generic support for bi-arch.
>> >>
>> >> Convert powerpc to that bi-arch generic audit support.
>> >>
>> >> Cc: Paul Moore 
>> >> Cc: Eric Paris 
>> >> Signed-off-by: Christophe Leroy 
>> >> ---
>> >> Resending v2 with Audit people in Cc
>> >>
>> >> v2:
>> >> - Missing 'git add' for arch/powerpc/include/asm/unistd32.h
>> >> - Finalised commit description
>> >> ---
>> >>   arch/powerpc/Kconfig|  5 +-
>> >>   arch/powerpc/include/asm/unistd32.h |  7 +++
>> >>   arch/powerpc/kernel/Makefile|  3 --
>> >>   arch/powerpc/kernel/audit.c | 84 -
>> >>   arch/powerpc/kernel/compat_audit.c  | 44 ---
>> >>   5 files changed, 8 insertions(+), 135 deletions(-)
>> >>   create mode 100644 arch/powerpc/include/asm/unistd32.h
>> >>   delete mode 100644 arch/powerpc/kernel/audit.c
>> >>   delete mode 100644 arch/powerpc/kernel/compat_audit.c
>> >
>> > Can you explain, in detail please, the testing you have done to verify
>> > this patch?
>> >
>>
>> I built ppc64_defconfig and checked that the generated code is functionnaly 
>> equivalent.
>>
>> ppc32_classify_syscall() is exactly the same as 
>> audit_classify_compat_syscall() except that the
>> later takes the syscall as second argument (ie in r4) whereas the former 
>> takes it as first argument
>> (ie in r3).
>>
>> audit_classify_arch() and powerpc audit_classify_syscall() are slightly 
>> different between the
>> powerpc version and the generic version because the powerpc version checks 
>> whether it is
>> AUDIT_ARCH_PPC or not (ie value 20), while the generic one checks whether it 
>> has bit
>> __AUDIT_ARCH_64BIT set or not (__AUDIT_ARCH_64BIT is the sign bit of a 
>> word), but taking into
>> account that the abi is either AUDIT_ARCH_PPC, AUDIT_ARCH_PPC64 or 
>> AUDIT_ARCH_PPC64LE, the result is
>> the same.
>>
>> If you are asking I guess you saw something wrong ?
>
> I was asking because I didn't see any mention of testing, and when you
> are enabling something significant like this it is nice to see that it
> has been verified to work :)
>
> While binary dumps and comparisons are nice, it is always good to see
> verification from a test suite.  I don't have access to the necessary
> hardware to test this, but could you verify that the audit-testsuite
> passes on your test system with your patches applied?
>
>  * https://github.com/linux-audit/audit-testsuite

I tested on ppc64le. Both before and after the patch I get the result
below.

So I guess the patch is OK, but maybe we have some existing issue.

I had a bit of a look at the test code, but my perl is limited. I think
it was running the command below, and it returned "", but
not really sure what that means.

  $ sudo ausearch -i -m SYSCALL -p 216440 -ui 0 -gi 0 -ul 0 -su unconfined 
_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 -ts recent
  

cheers



Running as   userroot
with context unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
on   system  Fedora

backlog_wait_time_actual_reset/test .. ok
exec_execve/test . ok
exec_name/test ... ok
file_create/test . ok
file_delete/test . ok
file_rename/test . ok
filter_exclude/test .. 1/21
# Test 20 got: "256" (filter_exclude/test at line 167)
#Expected: "0"
#  filter_exclude/test line 167 is: ok( $result, 0 );
# Test 21 got: "0" (filter_exclude/test at line 179)
#Expected: "1"
#  filter_exclude/test line 179 is: ok( $found_msg, 1 );
filter_exclude/test .. Failed 2/21 subtests
filter_saddr_fam/test  ok
filter_sessionid/test  ok
login_tty/test ... ok
lost_reset/test .. ok
netfilter_pkt/test ... ok
syscalls_file/test ... ok
syscall_module/test .. ok
time_change/test . ok
user_msg/test  ok
fanotify/test  ok
bpf/test . ok

Test Summary Report
---
filter_exclude/test(Wstat: 0 Tests: 21 Failed: 2)
  Failed tests:  20-21
Files=18, Tests=202, 45 wallclock secs ( 0.18 usr  0.03 sys + 20.15 cusr  0.92 
csys = 21.28 CPU)
Result: FAIL
Failed 1/18 test programs. 2/202 subtests failed.


Re: [PATCH linux-next] powerpc/tm: remove duplicate include in tm-poison.c

2021-08-26 Thread Michael Ellerman
Christophe Leroy  writes:
> Le 24/08/2021 à 16:40, Shuah Khan a écrit :
>> On 8/5/21 12:52 AM, cgel@gmail.com wrote:
>>> From: yong yiran 
>>>
>>> 'inttypes.h' included in 'tm-poison.c' is duplicated.
>>> Remove all but the first include of inttypes.h from tm-poison.c.
>>>
>>> Reported-by: Zeal Robot 
>>> Signed-off-by: yong yiran 
>>> ---
>>>   tools/testing/selftests/powerpc/tm/tm-poison.c | 1 -
>>>   1 file changed, 1 deletion(-)
>>>
>>> diff --git a/tools/testing/selftests/powerpc/tm/tm-poison.c 
>>> b/tools/testing/selftests/powerpc/tm/tm-poison.c
>>> index 29e5f26af7b9..27c083a03d1f 100644
>>> --- a/tools/testing/selftests/powerpc/tm/tm-poison.c
>>> +++ b/tools/testing/selftests/powerpc/tm/tm-poison.c
>>> @@ -20,7 +20,6 @@
>>>   #include 
>>>   #include 
>>>   #include 
>>> -#include 
>>>   #include "tm.h"
>>>
>> 
>> We can't accept this patch. The from and Signed-off-by don't match.
>> 
>
> As far as I can see they match. You have:
>
> From: yong yiran 
> Signed-off-by: yong yiran 

Regardless I already have a patch queued to fix this, from a different
CI bot.

cheers


Re: [PATCH] powerpc: Make set_endian() return EINVAL when not supporting little endian

2021-08-26 Thread Michael Ellerman
Christophe Leroy  writes:
> Le 26/08/2021 à 05:41, Michael Ellerman a écrit :
>> Christophe Leroy  writes:
>>> There is no point in modifying MSR_LE bit on CPUs not supporting
>>> little endian.
>> 
>> Isn't that an ABI break?
>
> Or an ABI fix ? I don't know.

It could break existing applications, even if the new semantics make
more sense. So that's a break IMHO :)

> My first thought was that all other 32 bits architectures were returning 
> -EINVAL, but looking at the 
> man page of prctl, it is explicit that this is powerpc only.

It could be generic, but yeah seems we're the only arch that implements
it.

>> set_endian(PR_ENDIAN_BIG) should work on a big endian CPU, even if it
>> does nothing useful.
>
> Fair enough. But shouldn't in that case get_endian() return PR_ENDIAN_BIG 
> instead of returning EINVAL ?

> We can do one or the other, but I think it should at least be consistant 
> between them, shouldn't it ?

It should be consistent, but it isn't, and if we change it now we
potentially break existing userspace, which is bad.

I don't think it's widely used, and the risk of breakage would be
minimal, but it's not zero.

So I'm not sure it's worth changing it just for the sake of consistency.

cheers


Re: [PATCH v2 3/3] powerpc/numa: Fill distance_lookup_table for offline nodes

2021-08-26 Thread Michael Ellerman
Srikar Dronamraju  writes:
> Scheduler expects unique number of node distances to be available at
> boot.

I think it needs "the number of unique node distances" ?

> It uses node distance to calculate this unique node distances.

It iterates over all pairs of nodes and records node_distance() for that
pair, and then calculates the set of unique distances.

> On POWER, node distances for offline nodes is not available. However,
> POWER already knows unique possible node distances.

I think it would be more accurate to say PAPR rather than POWER there.
It's PAPR that defines the way we determine distances and imposes that
limitation.

> Fake the offline node's distance_lookup_table entries so that all
> possible node distances are updated.

Does this work if we have a single node offline at boot?

Say we start with:

node distances:
node   0   1
  0:  10  20
  1:  20  10

And node 2 is offline at boot. We can only initialise that nodes entries
in the distance_lookup_table:

while (i--)
distance_lookup_table[node][i] = node;

By filling them all with 2 that causes node_distance(2, X) to return the
maximum distance for all other nodes X, because we won't break out of
the loop in __node_distance():

for (i = 0; i < distance_ref_points_depth; i++) {
if (distance_lookup_table[a][i] == distance_lookup_table[b][i])
break;

/* Double the distance for each NUMA level */
distance *= 2;
}

If distance_ref_points_depth was 4 we'd return 160.

That'd leave us with 3 unique distances at boot, 10, 20, 160.

But when node 2 comes online it might introduce more than 1 new distance
value, eg. it could be that the actual distances are:

node distances:
node   0   1   2
  0:  10  20  40
  1:  20  10  80
  2:  40  80  10

ie. we now have 4 distances, 10, 20, 40, 80.

What am I missing?

> However this only needs to be done if the number of unique node
> distances that can be computed for online nodes is less than the
> number of possible unique node distances as represented by
> distance_ref_points_depth.

Looking at a few machines they all have distance_ref_points_depth = 2.

So maybe that explains it, in practice we only see 10, 20, 40.


> When the node is actually onlined, distance_lookup_table will be
> updated with actual entries.

> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: Nathan Lynch 
> Cc: Michael Ellerman 
> Cc: Ingo Molnar 
> Cc: Peter Zijlstra 
> Cc: Valentin Schneider 
> Cc: Gautham R Shenoy 
> Cc: Vincent Guittot 
> Cc: Geetika Moolchandani 
> Cc: Laurent Dufour 
> Cc: kernel test robot 
> Signed-off-by: Srikar Dronamraju 
> ---
>  arch/powerpc/mm/numa.c | 70 ++
>  1 file changed, 70 insertions(+)
>
> Changelog:
> v1: 
> https://lore.kernel.org/linuxppc-dev/20210701041552.112072-3-sri...@linux.vnet.ibm.com/t/#u
> [ Fixed a missing prototype warning Reported-by: kernel test robot 
> ]
>
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 3c124928a16d..0ee79a08c9e1 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -856,6 +856,75 @@ void __init dump_numa_cpu_topology(void)
>   }
>  }
>  
> +/*
> + * Scheduler expects unique number of node distances to be available at
> + * boot. It uses node distance to calculate this unique node distances. On
> + * POWER, node distances for offline nodes is not available. However, POWER
> + * already knows unique possible node distances. Fake the offline node's
> + * distance_lookup_table entries so that all possible node distances are
> + * updated.
> + */

> +static void __init fake_update_distance_lookup_table(void)
> +{
> + unsigned long distance_map;
> + int i, nr_levels, nr_depth, node;

Are they distances, depths, or levels? :)

Bit more consistency in the variable names might make the code easier to
follow.

> +
> + if (!numa_enabled)
> + return;
> +
> + if (!form1_affinity)
> + return;

That doesn't exist since Aneesh's FORM2 series, so that will need a
rebase, and possibly some more rework to interact with that series.

> + /*
> +  * distance_ref_points_depth lists the unique numa domains
> +  * available. However it ignore LOCAL_DISTANCE. So add +1
> +  * to get the actual number of unique distances.
> +  */
> + nr_depth = distance_ref_points_depth + 1;

num_depths would be a better name IMHO.

> +
> + WARN_ON(nr_depth > sizeof(distance_map));

Warn but then continue, and corrupt something on the stack? Seems like a
bad idea :)

I guess it's too early to use bitmap_alloc(). But can we at least return
if nr_depth is too big.

> +
> + bitmap_zero(_map, nr_depth);
&g

Re: [PATCH] powerpc: Redefine HMT_xxx macros as empty on PPC32

2021-08-25 Thread Michael Ellerman
Christophe Leroy  writes:
> HMT_xxx macros are macros for adjusting thread priority
> (hardware multi-threading) are macros inherited from PPC64
> via commit 5f7c690728ac ("[PATCH] powerpc: Merged ppc_asm.h")
>
> Those instructions are pointless on PPC32, but some common
> fonctions like arch_cpu_idle() use them.
>
> So make them empty on PPC32 to avoid those instructions.

I guess we're pretty sure no 32-bit CPUs do anything with those.

e6500 can run in 32-bit mode, and is 2-way threaded AIUI, so it's
*possible* it could use them.

But I can't find any mention of those special nops in the e6500 core
manual. And actually it does have documentation about thread priority
registers, PPR32, TPRI0/1, but it says they're not used in e6500.

So I guess this seems safe, I'll pick it up.

cheers

> diff --git a/arch/powerpc/include/asm/vdso/processor.h 
> b/arch/powerpc/include/asm/vdso/processor.h
> index e072577bc7c0..8d79f994b4aa 100644
> --- a/arch/powerpc/include/asm/vdso/processor.h
> +++ b/arch/powerpc/include/asm/vdso/processor.h
> @@ -5,12 +5,21 @@
>  #ifndef __ASSEMBLY__
>  
>  /* Macros for adjusting thread priority (hardware multi-threading) */
> +#ifdef CONFIG_PPC64
>  #define HMT_very_low()   asm volatile("or 31, 31, 31 # very 
> low priority")
>  #define HMT_low()asm volatile("or 1, 1, 1# low priority")
>  #define HMT_medium_low() asm volatile("or 6, 6, 6# medium low 
> priority")
>  #define HMT_medium() asm volatile("or 2, 2, 2# medium 
> priority")
>  #define HMT_medium_high()asm volatile("or 5, 5, 5# medium high 
> priority")
>  #define HMT_high()   asm volatile("or 3, 3, 3# high 
> priority")
> +#else
> +#define HMT_very_low()
> +#define HMT_low()
> +#define HMT_medium_low()
> +#define HMT_medium()
> +#define HMT_medium_high()
> +#define HMT_high()
> +#endif
>  
>  #ifdef CONFIG_PPC64
>  #define cpu_relax()  do { HMT_low(); HMT_medium(); barrier(); } while (0)
> -- 
> 2.25.0


Re: [PATCH] powerpc: Make set_endian() return EINVAL when not supporting little endian

2021-08-25 Thread Michael Ellerman
Christophe Leroy  writes:
> There is no point in modifying MSR_LE bit on CPUs not supporting
> little endian.

Isn't that an ABI break?

set_endian(PR_ENDIAN_BIG) should work on a big endian CPU, even if it
does nothing useful.

cheers

> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 185beb290580..b2b9919795a2 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -1995,6 +1995,10 @@ int set_endian(struct task_struct *tsk, unsigned int 
> val)
>  {
>   struct pt_regs *regs = tsk->thread.regs;
>  
> + if (!cpu_has_feature(CPU_FTR_PPC_LE) &&
> + !cpu_has_feature(CPU_FTR_REAL_LE))
> + return -EINVAL;
> +
>   if ((val == PR_ENDIAN_LITTLE && !cpu_has_feature(CPU_FTR_REAL_LE)) ||
>   (val == PR_ENDIAN_PPC_LITTLE && !cpu_has_feature(CPU_FTR_PPC_LE)))
>   return -EINVAL;
> -- 
> 2.25.0


  1   2   3   4   5   6   7   8   9   10   >