date:20210914

Re: [PATCH bpf-next v2] bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33

2021-09-14 Thread Daniel Borkmann


On 9/11/21 3:56 AM, Tiezhu Yang wrote:

In the current code, the actual max tail call count is 33, which is greater
than MAX_TAIL_CALL_CNT (defined as 32). The actual limit is not consistent
with the meaning of MAX_TAIL_CALL_CNT, which is confusing and takes some
time to understand at first glance.

We can see the historical evolution from commit 04fd61ab36ec ("bpf: allow
bpf programs to tail-call other bpf programs") and commit f9dabe016b63
("bpf: Undo off-by-one in interpreter tail call count limit").

In order to avoid changing existing behavior, the actual limit is 33 now,
this is reasonable.

After commit 874be05f525e ("bpf, tests: Add tail call test suite"), we can
see that a test case fails.

On all archs when CONFIG_BPF_JIT_ALWAYS_ON is not set:
  # echo 0 > /proc/sys/net/core/bpf_jit_enable
  # modprobe test_bpf
  # dmesg | grep -w FAIL
  Tail call error path, max count reached jited:0 ret 34 != 33 FAIL

On some archs:
  # echo 1 > /proc/sys/net/core/bpf_jit_enable
  # modprobe test_bpf
  # dmesg | grep -w FAIL
  Tail call error path, max count reached jited:1 ret 34 != 33 FAIL

So it is necessary to change the value of MAX_TAIL_CALL_CNT from 32 to 33,
and to make some small changes to the related code.

With this patch, it does not change the current limit 33, MAX_TAIL_CALL_CNT
can reflect the actual max tail call count, the tailcall selftests can work
well, and also the above failed testcase in test_bpf can be fixed for the
interpreter (all archs) and the JIT (all archs except for x86).

  # uname -m
  x86_64
  # echo 1 > /proc/sys/net/core/bpf_jit_enable
  # modprobe test_bpf
  # dmesg | grep -w FAIL
  Tail call error path, max count reached jited:1 ret 33 != 34 FAIL


Could you also state in here which archs you have tested with this change? I
presume /every/ arch which has a JIT?


Signed-off-by: Tiezhu Yang 
---

v2:
   -- fix the typos in the commit message and update the commit message.
   -- fix the failed tailcall selftests for x86 jit.
  I am not quite sure the change on x86 is proper, with this change,
  tailcall selftests passed, but tailcall limit test in test_bpf.ko
  failed, I do not know the reason now, I think this is another issue,
  maybe someone more versed in x86 jit could take a look.


There should be a series from Johan coming today with regards to test_bpf.ko
that will fix the "tail call error path, max count reached" test, which
assumed that R0 would always be valid for the fall-through and could be
passed to the bpf_exit insn, whereas this is not guaranteed and the verifier,
for example, forbids a subsequent access to R0 w/o reinit. For your testing, I
would suggest rechecking once this series is out.


  arch/arm/net/bpf_jit_32.c | 11 ++-
  arch/arm64/net/bpf_jit_comp.c |  7 ---
  arch/mips/net/ebpf_jit.c  |  4 ++--
  arch/powerpc/net/bpf_jit_comp32.c |  4 ++--
  arch/powerpc/net/bpf_jit_comp64.c | 12 ++--
  arch/riscv/net/bpf_jit_comp32.c   |  4 ++--
  arch/riscv/net/bpf_jit_comp64.c   |  4 ++--
  arch/sparc/net/bpf_jit_comp_64.c  |  8 
  arch/x86/net/bpf_jit_comp.c   | 10 +-
  include/linux/bpf.h   |  2 +-
  kernel/bpf/core.c |  4 ++--
  11 files changed, 36 insertions(+), 34 deletions(-)

[...]

/* prog = array->ptrs[index]
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 41c23f4..5d6c843 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -286,14 +286,15 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)
emit(A64_CMP(0, r3, tmp), ctx);
emit(A64_B_(A64_COND_CS, jmp_offset), ctx);
  
-	/* if (tail_call_cnt > MAX_TAIL_CALL_CNT)
-* goto out;
+   /*
 * tail_call_cnt++;
+* if (tail_call_cnt > MAX_TAIL_CALL_CNT)
+* goto out;
 */
+   emit(A64_ADD_I(1, tcc, tcc, 1), ctx);
emit_a64_mov_i64(tmp, MAX_TAIL_CALL_CNT, ctx);
emit(A64_CMP(1, tcc, tmp), ctx);
emit(A64_B_(A64_COND_HI, jmp_offset), ctx);
-   emit(A64_ADD_I(1, tcc, tcc, 1), ctx);
  
  	/* prog = array->ptrs[index];
 * if (prog == NULL)

[...]

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 0fe6aac..74a9e61 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -402,7 +402,7 @@ static int get_pop_bytes(bool *callee_regs_used)
   * ... bpf_tail_call(void *ctx, struct bpf_array *array, u64 index) ...
   *   if (index >= array->map.max_entries)
   * goto out;
- *   if (++tail_call_cnt > MAX_TAIL_CALL_CNT)
+ *   if (tail_call_cnt++ == MAX_TAIL_CALL_CNT)


Why such inconsistency to e.g. above with arm64 case but also compared to
x86 32 bit which uses JAE? If so, we should cleanly follow the reference
implementation (== interpreter) _everywhere_ and _not_ introduce additional
variants/implementations across JITs.


   * goto out;
   *   prog =

Re: [PATCH] powerpc/xics: Set the IRQ chip data for the ICS native backend

2021-09-14 Thread Gustavo Romero


Hi,

I confirm that if this fix is *not* applied to the current Linus
tree (c605c39677b9) and the kernel is booted on Microwatt (18eb029), the
kernel will crash with the following exception:



[1.846437] BUG: Kernel NULL pointer dereference on read at 0x0048
[1.853121] Faulting instruction address: 0xc003146c
Vector: 300 (Data Access) at [c10332c0]
pc: c003146c: ics_native_unmask_irq+0x10/0x60
lr: c00314d4: ics_native_startup+0x18/0x2c
sp: c1033560
   msr: 90009033
   dar: 48
 dsisr: 4000
  current = 0xc105
  paca= 0xc068c000   irqmask: 0x03   irq_happened: 0x01
pid   = 1, comm = swapper
Linux version 5.14.0-11101-gc605c39677b9 (gromero@amd) (powerpc64-linux-gnu-gcc 
(Ubuntu 10.3.0-1ubuntu1) 10.3.0, GNU ld (GNU Binutils for Ubuntu) 2.36.1) #5 
Sat Sep 11 22:01
enter ? for help
[link register   ] c00314d4 ics_native_startup+0x18/0x2c
[c1033560] 3000 (unreliable)
[c1033580] c0074c00 irq_startup+0x8c/0xd4
[c10335c0] c0072434 __setup_irq+0x534/0x6c4
[c1033660] c00728ac request_threaded_irq+0x130/0x154
[c10336d0] c0217478 univ8250_setup_irq+0x1b0/0x20c
[c1033720] c021af2c serial8250_do_startup+0x428/0x654
[c10337b0] c021475c uart_startup+0xd0/0x1a0
[c1033800] c021487c uart_port_activate+0x50/0x74
[c1033830] c020fc98 tty_port_open+0xa4/0x110
[c1033880] c0212810 uart_open+0x24/0x4c
[c10338a0] c0208b04 tty_open+0x2d4/0x394
[c1033920] c00e38a8 chrdev_open+0xd4/0x15c
[c1033980] c00dc080 do_dentry_open+0x24c/0x2d0
[c10339d0] c00edd60 path_openat+0x8fc/0xa2c
[c1033ab0] c00edee8 do_filp_open+0x58/0xbc
[c1033be0] c00dd530 file_open_name+0x54/0x7c
[c1033c50] c00dd5a0 filp_open+0x48/0x68
[c1033c90] c048e1d4 console_on_rootfs+0x2c/0x88
[c1033d00] c048e424 kernel_init_freeable+0x1f4/0x238
[c1033db0] c000e51c kernel_init+0x28/0x138
[c1033e10] c000b114 ret_from_kernel_thread+0x5c/0x64
mon>


Thanks for fixing it, Cédric.

Tested-by: Gustavo Romero 


Cheers,
Gustavo

Re: [PATCH 2/2] kvm: rename KVM_MAX_VCPU_ID to KVM_MAX_VCPU_IDS

2021-09-14 Thread Eduardo Habkost

On Mon, Sep 13, 2021 at 12:24 PM Sean Christopherson  wrote:
>
> On Mon, Sep 13, 2021, Juergen Gross wrote:
> > KVM_MAX_VCPU_ID is not specifying the highest allowed vcpu-id, but the
> > number of allowed vcpu-ids. This has already led to confusion, so
> > rename KVM_MAX_VCPU_ID to KVM_MAX_VCPU_IDS to make its semantics more
> > clear
>
> My hesitation with this rename is that the max _number_ of IDs is not the same
> thing as the max allowed ID.  E.g. on x86, given a capability that enumerates
> the max number of IDs, I would expect to be able to create vCPUs with
> arbitrary 32-bit x2APIC IDs so long as the total number of IDs is below the
> max.
>

What name would you suggest instead? KVM_VCPU_ID_LIMIT, maybe?

I'm assuming we are not going to redefine KVM_MAX_VCPU_ID to be an
inclusive limit.

--
Eduardo

Re: [PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()

2021-09-14 Thread Borislav Petkov

On Wed, Sep 08, 2021 at 05:58:35PM -0500, Tom Lendacky wrote:
> Introduce a powerpc version of the cc_platform_has() function. This will
> be used to replace the powerpc mem_encrypt_active() implementation, so
> the implementation will initially only support the CC_ATTR_MEM_ENCRYPT
> attribute.
> 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/powerpc/platforms/pseries/Kconfig   |  1 +
>  arch/powerpc/platforms/pseries/Makefile  |  2 ++
>  arch/powerpc/platforms/pseries/cc_platform.c | 26 
>  3 files changed, 29 insertions(+)
>  create mode 100644 arch/powerpc/platforms/pseries/cc_platform.c

Michael,

can I get an ACK for the ppc bits to carry them through the tip tree
pls?

Btw, on a related note, cross-compiling this throws the following error here:

$ make CROSS_COMPILE=/home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/powerpc64-linux- V=1 ARCH=powerpc

...

/home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc -Wp,-MD,arch/powerpc/boot/.crt0.o.d -D__ASSEMBLY__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -O2 -msoft-float -mno-altivec -mno-vsx -pipe -fomit-frame-pointer -fno-builtin -fPIC -nostdinc -include ./include/linux/compiler_attributes.h -I./arch/powerpc/include -I./arch/powerpc/include/generated  -I./include -I./arch/powerpc/include/uapi -I./arch/powerpc/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h -m32 -isystem /home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/../lib/gcc/powerpc64-linux/9.4.0/include -mbig-endian -nostdinc -c -o arch/powerpc/boot/crt0.o arch/powerpc/boot/crt0.S
In file included from <command-line>:
././include/linux/compiler_attributes.h:62:5: warning: "__has_attribute" is not defined, evaluates to 0 [-Wundef]
   62 | #if __has_attribute(__assume_aligned__)
      |     ^~~~~~~~~~~~~~~
././include/linux/compiler_attributes.h:62:20: error: missing binary operator before token "("
   62 | #if __has_attribute(__assume_aligned__)
      |                    ^
././include/linux/compiler_attributes.h:88:5: warning: "__has_attribute" is not defined, evaluates to 0 [-Wundef]
   88 | #if __has_attribute(__copy__)
      |     ^~~~~~~~~~~~~~~
...

Known issue?

This __has_attribute() thing is supposed to be supported
in gcc since 5.1 and I'm using the crosstool stuff from
https://www.kernel.org/pub/tools/crosstool/ and gcc-9.4 above is pretty
new so that should not happen actually.

But it does...

Hmmm.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

[RFC PATCH 5/8] sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y

2021-09-14 Thread Ard Biesheuvel

THREAD_INFO_IN_TASK moved the CPU field out of thread_info, but this
causes some issues on architectures that define raw_smp_processor_id()
in terms of this field, due to the fact that #include'ing linux/sched.h
to get at struct task_struct is problematic in terms of circular
dependencies.

Given that thread_info and task_struct are the same data structure
anyway when THREAD_INFO_IN_TASK=y, let's move it back so that having
access to the type definition of struct thread_info is sufficient to
reference the CPU number of the current task.

Signed-off-by: Ard Biesheuvel 
---
 arch/arm64/kernel/asm-offsets.c   | 1 -
 arch/arm64/kernel/head.S  | 2 +-
 arch/powerpc/kernel/asm-offsets.c | 2 +-
 arch/powerpc/kernel/smp.c | 2 +-
 include/linux/sched.h | 6 +-
 kernel/sched/sched.h  | 4 
 6 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index cee9f3e9f906..0bfc048221af 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -27,7 +27,6 @@
 int main(void)
 {
   DEFINE(TSK_ACTIVE_MM, offsetof(struct task_struct, active_mm));
-  DEFINE(TSK_CPU,  offsetof(struct task_struct, cpu));
   BLANK();
   DEFINE(TSK_TI_CPU,   offsetof(struct task_struct, thread_info.cpu));
   DEFINE(TSK_TI_FLAGS, offsetof(struct task_struct, thread_info.flags));
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 17962452e31d..6a98f1a38c29 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -412,7 +412,7 @@ SYM_FUNC_END(__create_page_tables)
scs_load \tsk
 
adr_l   \tmp1, __per_cpu_offset
-   ldr w\tmp2, [\tsk, #TSK_CPU]
+   ldr w\tmp2, [\tsk, #TSK_TI_CPU]
ldr \tmp1, [\tmp1, \tmp2, lsl #3]
set_this_cpu_offset \tmp1
.endm
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index e563d3222d69..e37e4546034e 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -93,7 +93,7 @@ int main(void)
 #endif /* CONFIG_PPC64 */
OFFSET(TASK_STACK, task_struct, stack);
 #ifdef CONFIG_SMP
-   OFFSET(TASK_CPU, task_struct, cpu);
+   OFFSET(TASK_CPU, task_struct, thread_info.cpu);
 #endif
 
 #ifdef CONFIG_LIVEPATCH
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 9cc7d3dbf439..512d875b45e0 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1223,7 +1223,7 @@ static void cpu_idle_thread_init(unsigned int cpu, struct task_struct *idle)
paca_ptrs[cpu]->kstack = (unsigned long)task_stack_page(idle) +
 THREAD_SIZE - STACK_FRAME_OVERHEAD;
 #endif
-   idle->cpu = cpu;
+   task_thread_info(idle)->cpu = cpu;
secondary_current = current_set[cpu] = idle;
 }
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index e12b524426b0..37aa521078e7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -750,10 +750,6 @@ struct task_struct {
 #ifdef CONFIG_SMP
int on_cpu;
struct __call_single_node   wake_entry;
-#ifdef CONFIG_THREAD_INFO_IN_TASK
-   /* Current CPU: */
-   unsigned intcpu;
-#endif
unsigned intwakee_flips;
unsigned long   wakee_flip_decay_ts;
struct task_struct  *last_wakee;
@@ -2114,7 +2110,7 @@ static __always_inline bool need_resched(void)
 static inline unsigned int task_cpu(const struct task_struct *p)
 {
 #ifdef CONFIG_THREAD_INFO_IN_TASK
-   return READ_ONCE(p->cpu);
+   return READ_ONCE(p->thread_info.cpu);
 #else
return READ_ONCE(task_thread_info(p)->cpu);
 #endif
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 3d3e5793e117..79fcbad11450 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1926,11 +1926,7 @@ static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
 * per-task data have been completed by this moment.
 */
smp_wmb();
-#ifdef CONFIG_THREAD_INFO_IN_TASK
-   WRITE_ONCE(p->cpu, cpu);
-#else
WRITE_ONCE(task_thread_info(p)->cpu, cpu);
-#endif
p->wake_cpu = cpu;
 #endif
 }
-- 
2.30.2

[RFC PATCH 4/8] powerpc: add CPU field to struct thread_info

2021-09-14 Thread Ard Biesheuvel

The CPU field will be moved back into thread_info even when
THREAD_INFO_IN_TASK is enabled, so add it back to powerpc's definition
of struct thread_info.

Signed-off-by: Ard Biesheuvel 
---
 arch/powerpc/include/asm/thread_info.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index b4ec6c7dd72e..5725029aaa29 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -47,6 +47,9 @@
 struct thread_info {
int preempt_count;  /* 0 => preemptable,
   <0 => BUG */
+#ifdef CONFIG_SMP
+   unsigned intcpu;
+#endif
unsigned long   local_flags;/* private flags for thread */
 #ifdef CONFIG_LIVEPATCH
unsigned long *livepatch_sp;
-- 
2.30.2

Re: [PATCH 1/3] perf: Add macros to specify onchip L2/L3 accesses

2021-09-14 Thread Peter Zijlstra

On Tue, Sep 14, 2021 at 08:40:38PM +1000, Michael Ellerman wrote:
> Peter Zijlstra  writes:

> > I'm thinking we ought to keep hops as steps along the NUMA fabric, with
> > 0 hops being the local node. That only gets us:
> >
> >  L2, remote=0, hops=HOPS_0 -- our L2
> >  L2, remote=1, hops=HOPS_0 -- L2 on the local node but not ours
> >  L2, remote=1, hops!=HOPS_0 -- L2 on a remote node
> 
> Hmm. I'm not sure about tying it directly to NUMA hops. I worry we're
> going to see more and more systems where there's a hierarchy within the
> chip/package, in addition to the traditional NUMA hierarchy.
> 
> Although then I guess it becomes a question of what exactly is a NUMA
> hop, maybe the answer is that on those future systems those
> intra-chip/package hops should be represented as NUMA hops.
> 
> It's not like we have a hard definition of what a NUMA hop is?

Not really, typically whatever the BIOS/DT/whatever tables tell us. I
think in case of Power you're mostly making things up in software :-)

But yeah, I think we have plenty wriggle room there.

[RFC PATCH 0/8] Move task_struct::cpu back into thread_info

2021-09-14 Thread Ard Biesheuvel

Commit c65eacbe290b ("sched/core: Allow putting thread_info into
task_struct") mentions that, along with moving thread_info into
task_struct, the cpu field is moved out of the former into the latter,
but does not explain why.

While collaborating with Keith on adding THREAD_INFO_IN_TASK support to
ARM, we noticed that keeping CPU in task_struct is problematic for
architectures that define raw_smp_processor_id() in terms of this field,
as it requires linux/sched.h to be included, which causes a lot of pain
in terms of circular dependencies (or 'header soup', as the original
commit refers to it).

For examples of how existing architectures work around this, please
refer to patches #6 or #7. In the former case, it uses an awful
asm-offsets hack to index thread_info/current without using its type
definition. The latter approach simply keeps a copy of the task_struct
CPU field in thread_info, and keeps it in sync at context switch time.

Patch #8 reverts this latter approach for ARM, but this code is still
under review so it does not currently apply to mainline.

We also discussed introducing yet another Kconfig symbol to indicate
that the arch has THREAD_INFO_IN_TASK enabled but still prefers to keep
its CPU field in thread_info, but simply keeping it in thread_info in
all cases seems to be the cleanest approach here.

Cc: Keith Packard 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Christophe Leroy 
Cc: Paul Mackerras 
Cc: Paul Walmsley 
Cc: Palmer Dabbelt 
Cc: Albert Ou 
Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Peter Zijlstra 
Cc: Kees Cook 
Cc: Andy Lutomirski 
Cc: Linus Torvalds 
Cc: Arnd Bergmann 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-ri...@lists.infradead.org
Cc: linux-s...@vger.kernel.org

Ard Biesheuvel (8):
  arm64: add CPU field to struct thread_info
  x86: add CPU field to struct thread_info
  s390: add CPU field to struct thread_info
  powerpc: add CPU field to struct thread_info
  sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y
  powerpc: smp: remove hack to obtain offset of task_struct::cpu
  riscv: rely on core code to keep thread_info::cpu updated
  ARM: rely on core code to keep thread_info::cpu updated

 arch/arm/include/asm/switch_to.h   | 14 --
 arch/arm/kernel/smp.c  |  3 ---
 arch/arm64/include/asm/thread_info.h   |  1 +
 arch/arm64/kernel/asm-offsets.c|  2 +-
 arch/arm64/kernel/head.S   |  2 +-
 arch/powerpc/Makefile  | 11 ---
 arch/powerpc/include/asm/smp.h | 17 +
 arch/powerpc/include/asm/thread_info.h |  3 +++
 arch/powerpc/kernel/asm-offsets.c  |  4 +---
 arch/powerpc/kernel/smp.c  |  2 +-
 arch/riscv/kernel/asm-offsets.c|  1 -
 arch/riscv/kernel/entry.S  |  5 -
 arch/riscv/kernel/head.S   |  1 -
 arch/s390/include/asm/thread_info.h|  1 +
 arch/x86/include/asm/thread_info.h |  3 +++
 include/linux/sched.h  |  6 +-
 kernel/sched/sched.h   |  4 
 17 files changed, 14 insertions(+), 66 deletions(-)

-- 
2.30.2

[RFC PATCH 1/8] arm64: add CPU field to struct thread_info

2021-09-14 Thread Ard Biesheuvel

The CPU field will be moved back into thread_info even when
THREAD_INFO_IN_TASK is enabled, so add it back to arm64's definition of
struct thread_info.

Signed-off-by: Ard Biesheuvel 
---
 arch/arm64/include/asm/thread_info.h | 1 +
 arch/arm64/kernel/asm-offsets.c  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 6623c99f0984..c02bc8c183c3 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -42,6 +42,7 @@ struct thread_info {
void*scs_base;
void*scs_sp;
 #endif
+   u32 cpu;
 };
 
 #define thread_saved_pc(tsk)   \
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 551427ae8cc5..cee9f3e9f906 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -29,6 +29,7 @@ int main(void)
   DEFINE(TSK_ACTIVE_MM, offsetof(struct task_struct, active_mm));
   DEFINE(TSK_CPU,  offsetof(struct task_struct, cpu));
   BLANK();
+  DEFINE(TSK_TI_CPU,   offsetof(struct task_struct, thread_info.cpu));
   DEFINE(TSK_TI_FLAGS, offsetof(struct task_struct, thread_info.flags));
   DEFINE(TSK_TI_PREEMPT,   offsetof(struct task_struct, thread_info.preempt_count));
 #ifdef CONFIG_ARM64_SW_TTBR0_PAN
-- 
2.30.2

Re: linux-next: build failure after merge of the origin tree

2021-09-14 Thread Michael Ellerman

Linus Torvalds  writes:
> On Mon, Sep 13, 2021 at 7:08 PM Stephen Rothwell wrote:
>>
>> That patch works for me - for the ppc64_defconfig build at least.
>
> Yeah, I just tested the allmodconfig case too, although I suspect it's
> essentially the same wrt the boot *.S files, so it probably doesn't
> matter.
>
> I'd like to have Michael or somebody who can actually run some tests
> on the end result ack that patch (or - even better - come up with
> something cleaner) before committing it.
>
> Because yeah, the build failure is annoying and I apologize, but I'd
> rather have the build fail overnight than commit something that builds
> but then is subtly buggy for some reason.
>
> But if I don't get any other comments by the time I'm up again
> tomorrow, I'll just commit it as "fixes the build".

I ended up doing a more minimal version of your change.

I sent it separately, or it's here:

  https://lore.kernel.org/lkml/20210914121723.3756817-1-...@ellerman.id.au/


cheers

[PATCH] nvmem: NVMEM_NINTENDO_OTP should depend on WII

2021-09-14 Thread Geert Uytterhoeven

The Nintendo Wii and Wii U OTP is only present on Nintendo Wii and Wii U
consoles.  Hence add a dependency on WII, to prevent asking the user
about this driver when configuring a kernel without Nintendo Wii and Wii
U console support.

Fixes: 3683b761fe3a10ad ("nvmem: nintendo-otp: Add new driver for the Wii and Wii U OTP")
Signed-off-by: Geert Uytterhoeven 
---
 drivers/nvmem/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvmem/Kconfig b/drivers/nvmem/Kconfig
index 39854d43758be3fb..da414617a54d4b99 100644
--- a/drivers/nvmem/Kconfig
+++ b/drivers/nvmem/Kconfig
@@ -109,6 +109,7 @@ config MTK_EFUSE
 
 config NVMEM_NINTENDO_OTP
tristate "Nintendo Wii and Wii U OTP Support"
+   depends on WII || COMPILE_TEST
help
  This is a driver exposing the OTP of a Nintendo Wii or Wii U console.
 
-- 
2.25.1

[RFC PATCH 8/8] ARM: rely on core code to keep thread_info::cpu updated

2021-09-14 Thread Ard Biesheuvel

Now that the core code switched back to using thread_info::cpu to keep
a task's CPU number, we no longer need to keep it in sync explicitly. So
just drop the code that does this.

Signed-off-by: Ard Biesheuvel 
---
This patch applies onto [0], which we hope to get merged for v5.16

[0] https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm32-ti-in-task-v5

 arch/arm/include/asm/switch_to.h | 14 --
 arch/arm/kernel/smp.c|  3 ---
 2 files changed, 17 deletions(-)

diff --git a/arch/arm/include/asm/switch_to.h b/arch/arm/include/asm/switch_to.h
index db2be1f6550d..61e4a3c4ca6e 100644
--- a/arch/arm/include/asm/switch_to.h
+++ b/arch/arm/include/asm/switch_to.h
@@ -23,23 +23,9 @@
  */
 extern struct task_struct *__switch_to(struct task_struct *, struct thread_info *, struct thread_info *);
 
-static inline void set_ti_cpu(struct task_struct *p)
-{
-#ifdef CONFIG_THREAD_INFO_IN_TASK
-   /*
-* The core code no longer maintains the thread_info::cpu field once
-* CONFIG_THREAD_INFO_IN_TASK is in effect, but we rely on it for
-* raw_smp_processor_id(), which cannot access struct task_struct*
-* directly for reasons of circular #inclusion hell.
-*/
-   task_thread_info(p)->cpu = p->cpu;
-#endif
-}
-
 #define switch_to(prev,next,last)  \
 do {   \
__complete_pending_tlbi();  \
-   set_ti_cpu(next);   \
if (IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO)) \
__this_cpu_write(__entry_task, next);   \
last = __switch_to(prev,task_thread_info(prev), task_thread_info(next));\
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index cde5b6d8bac5..97ee6b1567e9 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -154,9 +154,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
secondary_data.swapper_pg_dir = get_arch_pgd(swapper_pg_dir);
 #endif
secondary_data.task = idle;
-   if (IS_ENABLED(CONFIG_THREAD_INFO_IN_TASK))
-   task_thread_info(idle)->cpu = cpu;
-
sync_cache_w(&secondary_data);
 
/*
-- 
2.30.2

Re: [PATCH bpf-next v2] bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33

2021-09-14 Thread Tiezhu Yang


On 09/14/2021 03:30 PM, Daniel Borkmann wrote:

On 9/11/21 3:56 AM, Tiezhu Yang wrote:



[...]
With this patch, it does not change the current limit 33, MAX_TAIL_CALL_CNT
can reflect the actual max tail call count, the tailcall selftests can work
well, and also the above failed testcase in test_bpf can be fixed for the
interpreter (all archs) and the JIT (all archs except for x86).

  # uname -m
  x86_64
  # echo 1 > /proc/sys/net/core/bpf_jit_enable
  # modprobe test_bpf
  # dmesg | grep -w FAIL
  Tail call error path, max count reached jited:1 ret 33 != 34 FAIL


Could you also state in here which archs you have tested with this change? I
presume /every/ arch which has a JIT?


OK, will do it in v3.
I have tested on x86 and mips.




Signed-off-by: Tiezhu Yang 
---

v2:
   -- fix the typos in the commit message and update the commit message.
   -- fix the failed tailcall selftests for x86 jit.
  I am not quite sure the change on x86 is proper, with this change,
  tailcall selftests passed, but tailcall limit test in test_bpf.ko
  failed, I do not know the reason now, I think this is another issue,
  maybe someone more versed in x86 jit could take a look.


There should be a series from Johan coming today with regards to test_bpf.ko
that will fix the "tail call error path, max count reached" test, which
assumed that R0 would always be valid for the fall-through and could be
passed to the bpf_exit insn, whereas this is not guaranteed and the verifier,
for example, forbids a subsequent access to R0 w/o reinit. For your testing, I
would suggest rechecking once this series is out.


I will test the following patch on x86 and mips:

[PATCH bpf v4 13/14] bpf/tests: Fix error in tail call limit tests

[...]


diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 0fe6aac..74a9e61 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -402,7 +402,7 @@ static int get_pop_bytes(bool *callee_regs_used)
   * ... bpf_tail_call(void *ctx, struct bpf_array *array, u64 index) ...
   *   if (index >= array->map.max_entries)
   * goto out;
- *   if (++tail_call_cnt > MAX_TAIL_CALL_CNT)
+ *   if (tail_call_cnt++ == MAX_TAIL_CALL_CNT)


Why such inconsistency to e.g. above with arm64 case but also compared to
x86 32 bit which uses JAE? If so, we should cleanly follow the reference
implementation (== interpreter) _everywhere_ and _not_ introduce additional
variants/implementations across JITs.


In order to keep consistency and make as few changes as possible,
I will modify the check condition as follows:

#define MAX_TAIL_CALL_CNT 33
(1) for x86, arm64, ... (0 ~ 32)
tcc = 0;
if (tcc == MAX_TAIL_CALL_CNT)
goto out;
tcc++;

(2) for mips, riscv (33 ~ 1)
tcc = MAX_TAIL_CALL_CNT;
if (tcc == 0)
goto out;
tcc--;

[...]

[PATCH trivial] powerpc/powernv/dump: Fix typo in comment

2021-09-14 Thread Vasant Hegde

Signed-off-by: Vasant Hegde 
---
 arch/powerpc/platforms/powernv/opal-dump.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/opal-dump.c b/arch/powerpc/platforms/powernv/opal-dump.c
index 00c5a59d82d9..717d1d30ade5 100644
--- a/arch/powerpc/platforms/powernv/opal-dump.c
+++ b/arch/powerpc/platforms/powernv/opal-dump.c
@@ -419,7 +419,7 @@ void __init opal_platform_dump_init(void)
int rc;
int dump_irq;
 
-   /* ELOG not supported by firmware */
+   /* Dump not supported by firmware */
if (!opal_check_token(OPAL_DUMP_READ))
return;
 
-- 
2.31.1

[PATCH] powerpc/powernv/flash: Check OPAL flash calls exist before using

2021-09-14 Thread Vasant Hegde

Currently only FSP-based powernv systems support firmware update
interfaces. Hence check that the token OPAL_FLASH_VALIDATE exists
before initialising the flash driver.

Signed-off-by: Vasant Hegde 
---
 arch/powerpc/platforms/powernv/opal-flash.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/opal-flash.c b/arch/powerpc/platforms/powernv/opal-flash.c
index 7e7d38b17420..05490fc22fae 100644
--- a/arch/powerpc/platforms/powernv/opal-flash.c
+++ b/arch/powerpc/platforms/powernv/opal-flash.c
@@ -520,6 +520,10 @@ void __init opal_flash_update_init(void)
 {
int ret;
 
+   /* Firmware update is not supported by firmware */
+   if (!opal_check_token(OPAL_FLASH_VALIDATE))
+   return;
+
/* Allocate validate image buffer */
validate_flash_data.buf = kzalloc(VALIDATE_BUF_SIZE, GFP_KERNEL);
if (!validate_flash_data.buf) {
-- 
2.31.1

Re: [PATCH 1/3] perf: Add macros to specify onchip L2/L3 accesses

2021-09-14 Thread Michael Ellerman

Peter Zijlstra  writes:
> On Thu, Sep 09, 2021 at 10:45:54PM +1000, Michael Ellerman wrote:
>
>> > The 'new' composite doesn't have a hops field because the hardware that
>> > necessitated that change doesn't report it, but we could easily add a
>> > field there.
>> >
>> > Suppose we add, mem_hops:3 (would 6 hops be too small?) and the
>> > corresponding PERF_MEM_HOPS_{NA, 0..6}
>> 
>> It's really 7 if we use remote && hop = 0 to mean the first hop.
>
> I don't think we can do that, because of backward compat. Currently:
>
>   lvl_num=DRAM, remote=1
>
> denotes: "Remote DRAM of any distance". Effectively it would have the new
> hops field filled with zeros though, so if you then decode with the hops
> field added it suddenly becomes:
>
>  lvl_num=DRAM, remote=1, hops=0
>
> and reads like: "Remote DRAM of 0 hops" which is quite daft. Therefore 0
> really must denote a 'N/A'.

Ah yeah, duh, it needs to be backward compatible.

>> If we're wanting to use some of the hop levels to represent
>> intra-chip/package hops then we could possibly use them all on a really
>> big system.
>> 
>> eg. you could imagine something like:
>> 
>>  L2 |                   - local L2
>>  L2 | REMOTE | HOPS_0   - L2 of neighbour core
>>  L2 | REMOTE | HOPS_1   - L2 of near core on same chip (same 1/2 of chip)
>>  L2 | REMOTE | HOPS_2   - L2 of far core on same chip (other 1/2 of chip)
>>  L2 | REMOTE | HOPS_3   - L2 of sibling chip in same package
>>  L2 | REMOTE | HOPS_4   - L2 on separate package 1 hop away
>>  L2 | REMOTE | HOPS_5   - L2 on separate package 2 hops away
>>  L2 | REMOTE | HOPS_6   - L2 on separate package 3 hops away
>> 
>> 
>> Whether it's useful to represent all those levels I'm not sure, but it's
>> probably good if we have the ability.
>
> I'm thinking we ought to keep hops as steps along the NUMA fabric, with
> 0 hops being the local node. That only gets us:
>
>  L2, remote=0, hops=HOPS_0 -- our L2
>  L2, remote=1, hops=HOPS_0 -- L2 on the local node but not ours
>  L2, remote=1, hops!=HOPS_0 -- L2 on a remote node

Hmm. I'm not sure about tying it directly to NUMA hops. I worry we're
going to see more and more systems where there's a hierarchy within the
chip/package, in addition to the traditional NUMA hierarchy.

Although then I guess it becomes a question of what exactly is a NUMA
hop, maybe the answer is that on those future systems those
intra-chip/package hops should be represented as NUMA hops.

It's not like we have a hard definition of what a NUMA hop is?

>> I guess I'm 50/50 on whether that's enough levels, or whether we want
>> another bit to allow for future growth.
>
> Right, possibly safer to add one extra bit while we can I suppose.

Equally it's not _that_ hard to add another bit later (if there's still
one free), makes the API a little uglier to use, but not the end of the
world.

cheers
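
The backward-compatibility argument in this thread can be sketched in user-space C. This is only an illustration under stated assumptions: the struct layout, the field widths and the names `data_src`, `HOPS_NA`, `hops_known()` and `describe()` are hypothetical and are not the perf ABI. The point it shows is that an old producer leaves the new bits at zero, so zero has to decode as "N/A" rather than as "0 hops".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the composite data-source word. */
struct data_src {
	unsigned int lvl_num:4;  /* memory level; the value is not interpreted here */
	unsigned int remote:1;   /* 1 = remote access */
	unsigned int hops:3;     /* proposed new field; 0 is reserved for N/A */
};

enum { HOPS_NA = 0 };        /* what an old producer implicitly writes */

/* Return 1 only when a remote access carries real hop information. */
static int hops_known(struct data_src d)
{
	return d.remote && d.hops != HOPS_NA;
}

/* Write a short description of a DRAM access into buf. */
static void describe(struct data_src d, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	if (!d.remote)
		snprintf(buf, len, "local DRAM");
	else if (!hops_known(d))
		snprintf(buf, len, "remote DRAM of any distance");
	else
		snprintf(buf, len, "remote DRAM, hops value %u",
			 (unsigned int)d.hops);
}
```

With this decoding, a record written before the field existed (remote=1, hops bits zero) still reads as "remote DRAM of any distance", which is the behaviour Peter asks to preserve.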

[RFC PATCH 6/8] powerpc: smp: remove hack to obtain offset of task_struct::cpu

2021-09-14 Thread Ard Biesheuvel

Instead of relying on awful hacks to obtain the offset of the cpu field
in struct task_struct, move it back into struct thread_info, which does
not create the same level of circular dependency hell when trying to
include the header file that defines it.

Signed-off-by: Ard Biesheuvel 
---
 arch/powerpc/Makefile | 11 ---
 arch/powerpc/include/asm/smp.h| 17 +
 arch/powerpc/kernel/asm-offsets.c |  2 --
 3 files changed, 1 insertion(+), 29 deletions(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index aa6808e70647..54cad1faa5d0 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -446,17 +446,6 @@ else
 endif
 endif
 
-ifdef CONFIG_SMP
-ifdef CONFIG_PPC32
-prepare: task_cpu_prepare
-
-PHONY += task_cpu_prepare
-task_cpu_prepare: prepare0
-   $(eval KBUILD_CFLAGS += -D_TASK_CPU=$(shell awk '{if ($$2 == "TASK_CPU") print $$3;}' include/generated/asm-offsets.h))
-
-endif # CONFIG_PPC32
-endif # CONFIG_SMP
-
 PHONY += checkbin
 # Check toolchain versions:
 # - gcc-4.6 is the minimum kernel-wide version so nothing required.
diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 7ef1cd8168a0..007332a4a732 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -87,22 +87,7 @@ int is_cpu_dead(unsigned int cpu);
 /* 32-bit */
 extern int smp_hw_index[];
 
-/*
- * This is particularly ugly: it appears we can't actually get the definition
- * of task_struct here, but we need access to the CPU this task is running on.
- * Instead of using task_struct we're using _TASK_CPU which is extracted from
- * asm-offsets.h by kbuild to get the current processor ID.
- *
- * This also needs to be safeguarded when building asm-offsets.s because at
- * that time _TASK_CPU is not defined yet. It could have been guarded by
- * _TASK_CPU itself, but we want the build to fail if _TASK_CPU is missing
- * when building something else than asm-offsets.s
- */
-#ifdef GENERATING_ASM_OFFSETS
-#define raw_smp_processor_id() (0)
-#else
-#define raw_smp_processor_id() (*(unsigned int *)((void *)current + _TASK_CPU))
-#endif
+#define raw_smp_processor_id() (current_thread_info()->cpu)
 #define hard_smp_processor_id()(smp_hw_index[smp_processor_id()])
 
 static inline int get_hard_smp_processor_id(int cpu)
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index e37e4546034e..cc05522f50bf 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -9,8 +9,6 @@
  * #defines from the assembly-language output.
  */
 
-#define GENERATING_ASM_OFFSETS /* asm/smp.h */
-
 #include 
 #include 
 #include 
-- 
2.30.2
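
As a rough illustration of what the patch above replaces, the following user-space sketch contrasts a raw byte-offset read with a typed read through thread_info. The two structs are simplified mocks, and `cpu_via_offset()` and `cpu_via_thread_info()` are hypothetical helper names; in the real kernel the old offset came from asm-offsets.h rather than from `offsetof()`.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified mock types; the real kernel structures are far larger. */
struct thread_info {
	unsigned long flags;
	unsigned int cpu;
};

struct task_struct {
	struct thread_info thread_info; /* first member, as with THREAD_INFO_IN_TASK */
	int other_state;
};

/* Old style: untyped read at a byte offset supplied from outside. */
static unsigned int cpu_via_offset(const struct task_struct *tsk, size_t off)
{
	assert(tsk != NULL);
	return *(const unsigned int *)((const char *)tsk + off);
}

/* New style: typed access that only needs the thread_info definition. */
static unsigned int cpu_via_thread_info(const struct thread_info *ti)
{
	assert(ti != NULL);
	return ti->cpu;
}
```

Both helpers read the same field, but only the second lets the compiler check the type, and it does not need the definition of task_struct at all.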

[RFC PATCH 7/8] riscv: rely on core code to keep thread_info::cpu updated

2021-09-14 Thread Ard Biesheuvel

Now that the core code switched back to using thread_info::cpu to keep
a task's CPU number, we no longer need to keep it in sync explicitly. So
just drop the code that does this.

Signed-off-by: Ard Biesheuvel 
---
 arch/riscv/kernel/asm-offsets.c | 1 -
 arch/riscv/kernel/entry.S   | 5 -
 arch/riscv/kernel/head.S| 1 -
 3 files changed, 7 deletions(-)

diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index 90f8ce64fa6f..478d9f02dab5 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -33,7 +33,6 @@ void asm_offsets(void)
OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count);
OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp);
OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp);
-   OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu);
 
OFFSET(TASK_THREAD_F0,  task_struct, thread.fstate.f[0]);
OFFSET(TASK_THREAD_F1,  task_struct, thread.fstate.f[1]);
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 98f502654edd..459eb1714353 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -544,11 +544,6 @@ ENTRY(__switch_to)
REG_L s9,  TASK_THREAD_S9_RA(a4)
REG_L s10, TASK_THREAD_S10_RA(a4)
REG_L s11, TASK_THREAD_S11_RA(a4)
-   /* Swap the CPU entry around. */
-   lw a3, TASK_TI_CPU(a0)
-   lw a4, TASK_TI_CPU(a1)
-   sw a3, TASK_TI_CPU(a1)
-   sw a4, TASK_TI_CPU(a0)
/* The offset of thread_info in task_struct is zero. */
move tp, a1
ret
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index fce5184b22c3..d5ec30ef6f5d 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -317,7 +317,6 @@ clear_bss_done:
call setup_trap_vector
/* Restore C environment */
la tp, init_task
-   sw zero, TASK_TI_CPU(tp)
la sp, init_thread_union + THREAD_SIZE
 
 #ifdef CONFIG_KASAN
-- 
2.30.2

[PATCH] powerpc/boot: Fix build failure since GCC 4.9 removal

2021-09-14 Thread Michael Ellerman

Stephen reported that the build was broken since commit
6d2ef226f2f1 ("compiler_attributes.h: drop __has_attribute() support for
gcc4"), with errors such as:

  include/linux/compiler_attributes.h:296:5: warning: "__has_attribute" is not defined, evaluates to 0 [-Wundef]
296 | #if __has_attribute(__warning__)
| ^~~
  make[2]: *** [arch/powerpc/boot/Makefile:225: arch/powerpc/boot/crt0.o] Error 1

But we expect __has_attribute() to always be defined now that we've
stopped using GCC 4.

Linus debugged it to the point of reading the GCC sources, and noticing
that the problem is that __has_attribute() is not defined when
preprocessing assembly files, which is what we're doing here.

Our assembly files don't include, or need, compiler_attributes.h, but
they are getting it unconditionally from the -include in BOOT_CFLAGS,
which is then added in its entirety to BOOT_AFLAGS.

That -include was added in commit 77433830ed16 ("powerpc: boot: include
compiler_attributes.h") so that we'd have "fallthrough" and other
attributes defined for the C files in arch/powerpc/boot. But it's not
needed for assembly files.

The minimal fix is to move the addition to BOOT_CFLAGS of -include
compiler_attributes.h until after we've copied BOOT_CFLAGS into
BOOT_AFLAGS. That avoids including compiler_attributes.h for asm files,
but makes no other change to BOOT_CFLAGS or BOOT_AFLAGS.

Reported-by: Stephen Rothwell 
Debugged-by: Linus Torvalds 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/boot/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


This seemed safer as a minimal fix, rather than doing a more
comprehensive separation of CFLAGS/AFLAGS. We can do that in a future
patch.

It passed my usual build/boot tests, including booting the built zImage
on some real hardware, so this is good to go from my POV.

cheers

diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 6900d0ac2421..089ee3ea55c8 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -35,7 +35,6 @@ endif
 BOOTCFLAGS:= -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
 -fno-strict-aliasing -O2 -msoft-float -mno-altivec -mno-vsx \
 -pipe -fomit-frame-pointer -fno-builtin -fPIC -nostdinc \
--include $(srctree)/include/linux/compiler_attributes.h \
 $(LINUXINCLUDE)
 
 ifdef CONFIG_PPC64_BOOT_WRAPPER
@@ -70,6 +69,7 @@ ifeq ($(call cc-option-yn, -fstack-protector),y)
 BOOTCFLAGS += -fno-stack-protector
 endif
 
+BOOTCFLAGS += -include $(srctree)/include/linux/compiler_attributes.h
 BOOTCFLAGS += -I$(objtree)/$(obj) -I$(srctree)/$(obj)
 
 DTC_FLAGS  ?= -p 1024
-- 
2.25.1

Re: [PATCH trivial] powerpc/powernv/dump: Fix typo is comment

2021-09-14 Thread Joel Stanley

On Tue, 14 Sept 2021 at 10:17, Vasant Hegde
 wrote:
>
> Signed-off-by: Vasant Hegde 

There's a typo in your commit message :)

> ---
>  arch/powerpc/platforms/powernv/opal-dump.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/powernv/opal-dump.c b/arch/powerpc/platforms/powernv/opal-dump.c
> index 00c5a59d82d9..717d1d30ade5 100644
> --- a/arch/powerpc/platforms/powernv/opal-dump.c
> +++ b/arch/powerpc/platforms/powernv/opal-dump.c
> @@ -419,7 +419,7 @@ void __init opal_platform_dump_init(void)
> int rc;
> int dump_irq;
>
> -   /* ELOG not supported by firmware */
> +   /* Dump not supported by firmware */
> if (!opal_check_token(OPAL_DUMP_READ))
> return;
>
> --
> 2.31.1
>

[RFC PATCH 3/8] s390: add CPU field to struct thread_info

2021-09-14 Thread Ard Biesheuvel

The CPU field will be moved back into thread_info even when
THREAD_INFO_IN_TASK is enabled, so add it back to s390's definition of
struct thread_info.

Signed-off-by: Ard Biesheuvel 
---
 arch/s390/include/asm/thread_info.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/s390/include/asm/thread_info.h b/arch/s390/include/asm/thread_info.h
index e6674796aa6f..b2ffcb4fe000 100644
--- a/arch/s390/include/asm/thread_info.h
+++ b/arch/s390/include/asm/thread_info.h
@@ -37,6 +37,7 @@
 struct thread_info {
unsigned long   flags;  /* low level flags */
unsigned long   syscall_work;   /* SYSCALL_WORK_ flags */
+   unsigned intcpu;/* current CPU */
 };
 
 /*
-- 
2.30.2

[RFC PATCH 2/8] x86: add CPU field to struct thread_info

2021-09-14 Thread Ard Biesheuvel

The CPU field will be moved back into thread_info even when
THREAD_INFO_IN_TASK is enabled, so add it back to x86's definition of
struct thread_info.

Signed-off-by: Ard Biesheuvel 
---
 arch/x86/include/asm/thread_info.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index cf132663c219..ebec69c35e95 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -57,6 +57,9 @@ struct thread_info {
unsigned long   flags;  /* low level flags */
unsigned long   syscall_work;   /* SYSCALL_WORK_ flags */
u32 status; /* thread synchronous flags */
+#ifdef CONFIG_SMP
+   u32 cpu;/* current CPU */
+#endif
 };
 
 #define INIT_THREAD_INFO(tsk)  \
-- 
2.30.2

Re: [RFC PATCH 0/8] Move task_struct::cpu back into thread_info

2021-09-14 Thread Christophe Leroy





On 14/09/2021 at 14:10, Ard Biesheuvel wrote:
> Commit c65eacbe290b ("sched/core: Allow putting thread_info into
> task_struct") mentions that, along with moving thread_info into
> task_struct, the cpu field is moved out of the former into the latter,
> but does not explain why.


I think it does explain why (init/Kconfig): "an arch will need to remove 
all thread_info fields except flags".


IIUC the initial intention with THREAD_INFO_IN_TASK was to remove 
everything from thread_info, but in the end that didn't happen, it seems.




> While collaborating with Keith on adding THREAD_INFO_IN_TASK support to
> ARM, we noticed that keeping CPU in task_struct is problematic for
> architectures that define raw_smp_processor_id() in terms of this field,
> as it requires linux/sched.h to be included, which causes a lot of pain
> in terms of circular dependencies (or 'header soup', as the original
> commit refers to it).
>
> For examples of how existing architectures work around this, please
> refer to patches #6 or #7. In the former case, it uses an awful
> asm-offsets hack to index thread_info/current without using its type
> definition. The latter approach simply keeps a copy of the task_struct
> CPU field in thread_info, and keeps it in sync at context switch time.


It was a pain when implementing that on powerpc, so I really like your 
idea, the series looks good to me.





> Patch #8 reverts this latter approach for ARM, but this code is still
> under review so it does not currently apply to mainline.
>
> We also discussed introducing yet another Kconfig symbol to indicate
> that the arch has THREAD_INFO_IN_TASK enabled but still prefers to keep
> its CPU field in thread_info, but simply keeping it in thread_info in
> all cases seems to be the cleanest approach here.


Yes, if we can avoid yet another config, that's better. We already have 
so many configs that are supposed to be temporary and have lasted for 
years if not decades.


Christophe

Re: [RFC PATCH 0/8] Move task_struct::cpu back into thread_info

2021-09-14 Thread Mark Rutland

On Tue, Sep 14, 2021 at 02:10:28PM +0200, Ard Biesheuvel wrote:
> Commit c65eacbe290b ("sched/core: Allow putting thread_info into
> task_struct") mentions that, along with moving thread_info into
> task_struct, the cpu field is moved out of the former into the latter,
> but does not explain why.

From what I recall of talking to Andy around that time, when converting
arm64 over, the theory was that over time we'd move more and more out of
thread_info and into task_struct or thread_struct, until task_struct
supplanted thread_info entirely, and that all became generic.

I think the key gain there was making things more *generic*, and there
are other ways we could do that in future without moving more into
task_struct (e.g. with a generic thread_info and arch_thread_info inside
that).

With that in mind, and given the diffstat, I think this is worthwhile.

FWIW, for the series:

Acked-by: Mark Rutland 

Mark.

> While collaborating with Keith on adding THREAD_INFO_IN_TASK support to
> ARM, we noticed that keeping CPU in task_struct is problematic for
> architectures that define raw_smp_processor_id() in terms of this field,
> as it requires linux/sched.h to be included, which causes a lot of pain
> in terms of circular dependencies (or 'header soup', as the original
> commit refers to it).
> 
> For examples of how existing architectures work around this, please
> refer to patches #6 or #7. In the former case, it uses an awful
> asm-offsets hack to index thread_info/current without using its type
> definition. The latter approach simply keeps a copy of the task_struct
> CPU field in thread_info, and keeps it in sync at context switch time.
> 
> Patch #8 reverts this latter approach for ARM, but this code is still
> under review so it does not currently apply to mainline.
> 
> We also discussed introducing yet another Kconfig symbol to indicate
> that the arch has THREAD_INFO_IN_TASK enabled but still prefers to keep
> its CPU field in thread_info, but simply keeping it in thread_info in
> all cases seems to be the cleanest approach here.
> 
> Cc: Keith Packard 
> Cc: Russell King 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Christophe Leroy 
> Cc: Paul Mackerras 
> Cc: Paul Walmsley 
> Cc: Palmer Dabbelt 
> Cc: Albert Ou 
> Cc: Heiko Carstens 
> Cc: Vasily Gorbik 
> Cc: Christian Borntraeger 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: Borislav Petkov 
> Cc: Peter Zijlstra 
> Cc: Kees Cook 
> Cc: Andy Lutomirski 
> Cc: Linus Torvalds 
> Cc: Arnd Bergmann 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-ri...@lists.infradead.org
> Cc: linux-s...@vger.kernel.org
> 
> Ard Biesheuvel (8):
>   arm64: add CPU field to struct thread_info
>   x86: add CPU field to struct thread_info
>   s390: add CPU field to struct thread_info
>   powerpc: add CPU field to struct thread_info
>   sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y
>   powerpc: smp: remove hack to obtain offset of task_struct::cpu
>   riscv: rely on core code to keep thread_info::cpu updated
>   ARM: rely on core code to keep thread_info::cpu updated
> 
>  arch/arm/include/asm/switch_to.h   | 14 --
>  arch/arm/kernel/smp.c  |  3 ---
>  arch/arm64/include/asm/thread_info.h   |  1 +
>  arch/arm64/kernel/asm-offsets.c|  2 +-
>  arch/arm64/kernel/head.S   |  2 +-
>  arch/powerpc/Makefile  | 11 ---
>  arch/powerpc/include/asm/smp.h | 17 +
>  arch/powerpc/include/asm/thread_info.h |  3 +++
>  arch/powerpc/kernel/asm-offsets.c  |  4 +---
>  arch/powerpc/kernel/smp.c  |  2 +-
>  arch/riscv/kernel/asm-offsets.c|  1 -
>  arch/riscv/kernel/entry.S  |  5 -
>  arch/riscv/kernel/head.S   |  1 -
>  arch/s390/include/asm/thread_info.h|  1 +
>  arch/x86/include/asm/thread_info.h |  3 +++
>  include/linux/sched.h  |  6 +-
>  kernel/sched/sched.h   |  4 
>  17 files changed, 14 insertions(+), 66 deletions(-)
> 
> -- 
> 2.30.2
>

Re: [PATCH RESEND v3 6/6] powerpc/signal: Use unsafe_copy_siginfo_to_user()

2021-09-14 Thread Christophe Leroy





On 13/09/2021 at 21:11, Eric W. Biederman wrote:

Christophe Leroy  writes:


On 13/09/2021 at 18:21, Eric W. Biederman wrote:

ebied...@xmission.com (Eric W. Biederman) writes:


Christophe Leroy  writes:


Use unsafe_copy_siginfo_to_user() in order to do the copy
within the user access block.

On an mpc 8321 (book3s/32) the improvement is about 5% on a process
sending a signal to itself.


If you can't make function calls from an unsafe macro there is another
way to handle this that doesn't require everything to be inline.

  From a safety perspective it is probably even a better approach.


Yes but that's exactly what I wanted to avoid for the native ppc32 case: this
double hop means useless pressure on the cache. The siginfo_t structure is 128
bytes large, that means 8 lines of cache on powerpc 8xx.

But maybe it is acceptable to do that only for the compat case. Let me think
about it, it might be quite easy.


The places get_signal is called tend to be well known.  So I think we
are safe from a capacity standpoint.

I am not certain it makes a difference in capacity as there is a high
probability that the stack was deeper recently than it is now which
suggests the cache blocks might already be in the cache.

My sense is that it is worth benchmarking before optimizing out the extra copy
like that.

On the extreme side there is simply building the entire sigframe on the
stack and then just calling it copy_to_user.  As the stack cache lines
are likely to be hot, and copy_to_user is quite well optimized
there is a real possibility that it is faster to build everything
on the kernel stack, and then copy it to the user space stack.

It is also possible that I am wrong and we may want to figure out how
far up we can push the conversion to the 32bit siginfo format.

If we could move the work into collect_signal we could guarantee there
would be no extra work.  That would require adjusting the sigframe
generation code on all of the architectures.

There is a lot we can do but we need benchmarking to tell if it is
worth it.




Sure, I'm benchmarking all the work I have been doing on signal code 
with the following simple app that I run with 'perf stat':


#include <signal.h>
#include <stdlib.h>

void sigusr1(int sig) { }

int main(int argc, char **argv)
{
int i = 10;

signal(SIGUSR1, sigusr1);
for (;i--;)
raise(SIGUSR1);
exit(0);
}


This is on an mpc8321, a 32-bit powerpc, with KUAP enabled (KUAP is the 
equivalent of x86 SMAP).


Before changing copy_siginfo_to_user() to unsafe_copy_siginfo_to_user(), 
'perf stat' reports 1983 msec (task-clock)


After my change I get 1900 msec.

With your approach I get 1930 msec, so we are losing 36% of the benefit 
of converting to the 'unsafe_' alternative.


So I think it is worth it.

Christophe
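
The 36% figure quoted above follows directly from the three task-clock readings in the message. A small sketch of that arithmetic, where `lost_benefit_pct()` is a hypothetical helper name used only for this illustration:

```c
#include <assert.h>

/*
 * Percentage of the saving (before - best) that a slower variant gives up.
 * All arguments are task-clock times in milliseconds.
 */
static double lost_benefit_pct(double before_ms, double best_ms, double other_ms)
{
	double gain = before_ms - best_ms;

	assert(gain > 0.0);
	return 100.0 * (other_ms - best_ms) / gain;
}
```

Calling `lost_benefit_pct(1983, 1900, 1930)` evaluates 100 * 30 / 83, which is roughly 36.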

[PATCH v4 2/5] powerpc/signal: Include the new stack frame inside the user access block

2021-09-14 Thread Christophe Leroy

Include the new stack frame inside the user access block and set it up
using unsafe_put_user().

On an mpc 8321 (book3s/32) the improvement is about 4% on a process
sending a signal to itself.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/signal_32.c | 29 +
 arch/powerpc/kernel/signal_64.c | 14 +++---
 2 files changed, 20 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index 0608581967f0..ff101e2b3bab 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -726,7 +726,7 @@ int handle_rt_signal32(struct ksignal *ksig, sigset_t *oldset,
struct rt_sigframe __user *frame;
struct mcontext __user *mctx;
struct mcontext __user *tm_mctx = NULL;
-   unsigned long newsp = 0;
+   unsigned long __user *newsp;
unsigned long tramp;
struct pt_regs *regs = tsk->thread.regs;
/* Save the thread's msr before get_tm_stackpointer() changes it */
@@ -734,6 +734,7 @@ int handle_rt_signal32(struct ksignal *ksig, sigset_t *oldset,
 
/* Set up Signal Frame */
frame = get_sigframe(ksig, tsk, sizeof(*frame), 1);
+   newsp = (unsigned long __user *)((unsigned long)frame - (__SIGNAL_FRAMESIZE + 16));
mctx = &frame->uc.uc_mcontext;
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
tm_mctx = &frame->uc_transact.uc_mcontext;
@@ -743,7 +744,7 @@ int handle_rt_signal32(struct ksignal *ksig, sigset_t *oldset,
else
prepare_save_user_regs(1);
 
-   if (!user_access_begin(frame, sizeof(*frame)))
+   if (!user_access_begin(newsp, __SIGNAL_FRAMESIZE + 16 + sizeof(*frame)))
goto badframe;
 
/* Put the siginfo & fill in most of the ucontext */
@@ -779,6 +780,9 @@ int handle_rt_signal32(struct ksignal *ksig, sigset_t *oldset,
}
unsafe_put_sigset_t(&frame->uc.uc_sigmask, oldset, failed);
 
+   /* create a stack frame for the caller of the handler */
+   unsafe_put_user(regs->gpr[1], newsp, failed);
+
user_access_end();
 
if (copy_siginfo_to_user(&frame->info, &ksig->info))
@@ -790,13 +794,8 @@ int handle_rt_signal32(struct ksignal *ksig, sigset_t *oldset,
tsk->thread.fp_state.fpscr = 0; /* turn off all fp exceptions */
 #endif
 
-   /* create a stack frame for the caller of the handler */
-   newsp = ((unsigned long)frame) - (__SIGNAL_FRAMESIZE + 16);
-   if (put_user(regs->gpr[1], (u32 __user *)newsp))
-   goto badframe;
-
/* Fill registers for signal handler */
-   regs->gpr[1] = newsp;
+   regs->gpr[1] = (unsigned long)newsp;
regs->gpr[3] = ksig->sig;
regs->gpr[4] = (unsigned long)&frame->info;
regs->gpr[5] = (unsigned long)&frame->uc;
@@ -826,7 +825,7 @@ int handle_signal32(struct ksignal *ksig, sigset_t *oldset,
struct sigframe __user *frame;
struct mcontext __user *mctx;
struct mcontext __user *tm_mctx = NULL;
-   unsigned long newsp = 0;
+   unsigned long __user *newsp;
unsigned long tramp;
struct pt_regs *regs = tsk->thread.regs;
/* Save the thread's msr before get_tm_stackpointer() changes it */
@@ -834,6 +833,7 @@ int handle_signal32(struct ksignal *ksig, sigset_t *oldset,
 
/* Set up Signal Frame */
frame = get_sigframe(ksig, tsk, sizeof(*frame), 1);
+   newsp = (unsigned long __user *)((unsigned long)frame - __SIGNAL_FRAMESIZE);
mctx = &frame->mctx;
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
tm_mctx = &frame->mctx_transact;
@@ -843,7 +843,7 @@ int handle_signal32(struct ksignal *ksig, sigset_t *oldset,
else
prepare_save_user_regs(1);
 
-   if (!user_access_begin(frame, sizeof(*frame)))
+   if (!user_access_begin(newsp, __SIGNAL_FRAMESIZE + sizeof(*frame)))
goto badframe;
sc = (struct sigcontext __user *) &frame->sctx;
 
@@ -873,6 +873,8 @@ int handle_signal32(struct ksignal *ksig, sigset_t *oldset,
unsafe_put_user(PPC_RAW_SC(), &mctx->mc_pad[1], failed);
asm("dcbst %y0; sync; icbi %y0; sync" :: "Z" (mctx->mc_pad[0]));
}
+   /* create a stack frame for the caller of the handler */
+   unsafe_put_user(regs->gpr[1], newsp, failed);
user_access_end();
 
regs->link = tramp;
@@ -881,12 +883,7 @@ int handle_signal32(struct ksignal *ksig, sigset_t *oldset,
tsk->thread.fp_state.fpscr = 0; /* turn off all fp exceptions */
 #endif
 
-   /* create a stack frame for the caller of the handler */
-   newsp = ((unsigned long)frame) - __SIGNAL_FRAMESIZE;
-   if (put_user(regs->gpr[1], (u32 __user *)newsp))
-   goto badframe;
-
-   regs->gpr[1] = newsp;
+   regs->gpr[1] = (unsigned long)newsp;
regs->gpr[3] = ksig->sig;
regs->gpr[4] = (unsigned long) sc;
regs_set_return_ip(regs, (unsigned long) ksig->ka.sa.sa_handler);
diff --git a/arch/powerpc/kernel/signal_64.c

[PATCH v4 3/5] signal: Add unsafe_copy_siginfo_to_user()

2021-09-14 Thread Christophe Leroy

In the same spirit as commit fb05121fd6a2 ("signal: Add
unsafe_get_compat_sigset()"), implement an 'unsafe' version of
copy_siginfo_to_user() in order to use it within user access blocks.

For that, also add an 'unsafe' version of clear_user().

This commit adds the generic fallback for unsafe_clear_user().
Architectures wanting to use unsafe_copy_siginfo_to_user() within a
user_access_begin() section have to make sure they have their
own unsafe_clear_user().

Signed-off-by: Christophe Leroy 
---
v3: Added precision about unsafe_clear_user() in commit message.
---
 include/linux/signal.h  | 15 +++
 include/linux/uaccess.h |  1 +
 kernel/signal.c |  5 -
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/include/linux/signal.h b/include/linux/signal.h
index 3f96a6374e4f..70ea7e741427 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -35,6 +35,21 @@ static inline void copy_siginfo_to_external(siginfo_t *to,
 int copy_siginfo_to_user(siginfo_t __user *to, const kernel_siginfo_t *from);
 int copy_siginfo_from_user(kernel_siginfo_t *to, const siginfo_t __user *from);
 
+static __always_inline char __user *si_expansion(const siginfo_t __user *info)
+{
+   return ((char __user *)info) + sizeof(struct kernel_siginfo);
+}
+
+#define unsafe_copy_siginfo_to_user(to, from, label) do {  \
+   siginfo_t __user *__ucs_to = to;\
+   const kernel_siginfo_t *__ucs_from = from;  \
+   char __user *__ucs_expansion = si_expansion(__ucs_to);  \
+   \
+   unsafe_copy_to_user(__ucs_to, __ucs_from,   \
+   sizeof(struct kernel_siginfo), label);  \
+   unsafe_clear_user(__ucs_expansion, SI_EXPANSION_SIZE, label);   \
+} while (0)
+
 enum siginfo_layout {
SIL_KILL,
SIL_TIMER,
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index c05e903cef02..37073caac474 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -398,6 +398,7 @@ long strnlen_user_nofault(const void __user *unsafe_addr, long count);
 #define unsafe_put_user(x,p,e) unsafe_op_wrap(__put_user(x,p),e)
 #define unsafe_copy_to_user(d,s,l,e) unsafe_op_wrap(__copy_to_user(d,s,l),e)
 #define unsafe_copy_from_user(d,s,l,e) unsafe_op_wrap(__copy_from_user(d,s,l),e)
+#define unsafe_clear_user(d, l, e) unsafe_op_wrap(__clear_user(d, l), e)
 static inline unsigned long user_access_save(void) { return 0UL; }
 static inline void user_access_restore(unsigned long flags) { }
 #endif
diff --git a/kernel/signal.c b/kernel/signal.c
index 952741f6d0f9..23f168730b7e 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -3324,11 +3324,6 @@ enum siginfo_layout siginfo_layout(unsigned sig, int si_code)
return layout;
 }
 
-static inline char __user *si_expansion(const siginfo_t __user *info)
-{
-   return ((char __user *)info) + sizeof(struct kernel_siginfo);
-}
-
 int copy_siginfo_to_user(siginfo_t __user *to, const kernel_siginfo_t *from)
 {
char __user *expansion = si_expansion(to);
-- 
2.31.1
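
The two steps performed by the macro in the patch above (copy the compact kernel structure, then zero the expansion area) can be mimicked in plain user-space C. This is a sketch under assumptions: the 128-byte total is the siginfo_t size quoted elsewhere in this thread, the 48-byte kernel size is assumed for illustration, and `copy_siginfo_sketch()` is a hypothetical name. There is no fault handling here, whereas the real macro jumps to a label on failure.

```c
#include <assert.h>
#include <string.h>

#define USER_SIGINFO_SIZE   128 /* siginfo_t size quoted in the thread */
#define KERNEL_SIGINFO_SIZE  48 /* assumed size of the compact kernel copy */

/*
 * Copy the meaningful part of the signal info, then clear the remainder
 * of the destination so no stale bytes are left in the tail.
 */
static void copy_siginfo_sketch(unsigned char *to, const unsigned char *from)
{
	assert(to != NULL && from != NULL);
	/* stands in for unsafe_copy_to_user() */
	memcpy(to, from, KERNEL_SIGINFO_SIZE);
	/* stands in for unsafe_clear_user() over the expansion area */
	memset(to + KERNEL_SIGINFO_SIZE, 0,
	       USER_SIGINFO_SIZE - KERNEL_SIGINFO_SIZE);
}
```

The destination must be at least 128 bytes and the source at least 48 bytes for the calls above to stay in bounds.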

Re: [RFC PATCH 1/8] arm64: add CPU field to struct thread_info

2021-09-14 Thread Ard Biesheuvel

On Tue, 14 Sept 2021 at 17:41, Linus Torvalds
 wrote:
>
> On Tue, Sep 14, 2021 at 5:10 AM Ard Biesheuvel  wrote:
> >
> > The CPU field will be moved back into thread_info even when
> > THREAD_INFO_IN_TASK is enabled, so add it back to arm64's definition of
> > struct thread_info.
>
> The series looks sane to me, but it strikes me that it's inconsistent
> - here for arm64, you make it unconditional, but for the other
> architectures you end up putting it inside a #ifdef CONFIG_SMP.
>
> Was there some reason for this odd behavior?
>

Yes. CONFIG_SMP is a 'def_bool y' on arm64.

Re: [RFC PATCH 5/8] sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y

2021-09-14 Thread Linus Torvalds

On Tue, Sep 14, 2021 at 5:11 AM Ard Biesheuvel  wrote:
>
>  static inline unsigned int task_cpu(const struct task_struct *p)
>  {
>  #ifdef CONFIG_THREAD_INFO_IN_TASK
> -   return READ_ONCE(p->cpu);
> +   return READ_ONCE(p->thread_info.cpu);
>  #else
> return READ_ONCE(task_thread_info(p)->cpu);
>  #endif

Those two lines look different, but aren't.

Please just remove the CONFIG_THREAD_INFO_IN_TASK conditional, and use

  return READ_ONCE(task_thread_info(p)->cpu);

unconditionally, which now does the right thing regardless.

 Linus

[PATCH v2 00/29] Change wildcards on ABI files

2021-09-14 Thread Mauro Carvalho Chehab

The ABI files are meant to be parsed via a script (scripts/get_abi.pl).

An upcoming improvement will let it detect when an ABI description is
missing, or when a What: field doesn't match the actual location of the symbol.

In order for get_abi.pl to convert What: into regex, changes are needed on
existing ABI files, as the conversion should not be ambiguous.

One alternative would be to convert everything into regexes, but this
would generate a huge amount of patches/changes. So, instead, let's
touch only the ABI files that aren't following the de-facto wildcard 
standards already found on most of the ABI files, e.g.:

/.../
*

(option1|option2)
X
Y
Z
[0-9] (and variants)

A couple of the patches here came from v1, but most of the patches were
written to address things like rcN, where N is a wildcard.

We can't teach get_abi.pl to treat an uppercase "N" as a wildcard,
as the USB ABI already uses "N" inside some of their symbols, like 
bNumEndpoints.
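
The ambiguity can be shown with a deliberately naive converter. This sketch is not how get_abi.pl works; `naive_to_regex()` and `full_match()` are hypothetical helpers built on POSIX regcomp()/regexec(), and the short names "rcN" and "bNumEndpoints" stand in for full sysfs paths.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/*
 * Hypothetical, naive conversion: copy `what` into `out`, replacing every
 * occurrence of the character `wild` with the regex "[0-9]+".
 */
static void naive_to_regex(const char *what, char wild, char *out, size_t len)
{
	size_t used = 0;

	assert(what != NULL && out != NULL && len > 0);
	out[0] = '\0';
	/* Keep room for the longest piece ("[0-9]+") plus the terminator. */
	for (; *what && used + 7 < len; what++) {
		if (*what == wild)
			used += (size_t)snprintf(out + used, len - used, "[0-9]+");
		else
			used += (size_t)snprintf(out + used, len - used, "%c", *what);
	}
}

/* Return 1 if `name` fully matches the POSIX extended regex `re`, else 0. */
static int full_match(const char *re, const char *name)
{
	char anchored[256];
	regex_t rx;
	int ok;

	snprintf(anchored, sizeof(anchored), "^%s$", re);
	if (regcomp(&rx, anchored, REG_EXTENDED | REG_NOSUB) != 0)
		return 0;
	ok = (regexec(&rx, name, 0, NULL, 0) == 0);
	regfree(&rx);
	return ok;
}
```

Treating "N" as a wildcard turns "rcN" into a pattern that matches "rc0", as intended, but it also rewrites "bNumEndpoints" into a pattern that no longer matches the literal name "bNumEndpoints".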

Mauro Carvalho Chehab (29):
  ABI: sysfs-bus-usb: better document variable argument
  ABI: sysfs-tty: better document module name parameter
  ABI: sysfs-kernel-slab: use a wildcard for the cache name
  ABI: security: fix location for evm and ima_policy
  ABI: sysfs-class-tpm: use wildcards for pcr-* nodes
  ABI: sysfs-bus-rapidio: use wildcards on What definitions
  ABI: sysfs-class-cxl: place "not in a guest" at description
  ABI: sysfs-class-devfreq-event: use the right wildcards on What
  ABI: sysfs-class-mic: use the right wildcards on What definitions
  ABI: pstore: Fix What field
  ABI: sysfs-class-typec: fix a bad What field
  ABI: sysfs-ata: use a proper wildcard for ata_*
  ABI: sysfs-class-infiniband: use wildcards on What definitions
  ABI: sysfs-bus-pci: use wildcards on What definitions
  ABI: sysfs-bus-soundwire-master: use wildcards on What definitions
  ABI: sysfs-bus-soundwire-slave: use wildcards on What definitions
  ABI: sysfs-class-gnss: use wildcards on What definitions
  ABI: sysfs-class-mei: use wildcards on What definitions
  ABI: sysfs-class-mux: use wildcards on What definitions
  ABI: sysfs-class-pwm: use wildcards on What definitions
  ABI: sysfs-class-rc: use wildcards on What definitions
  ABI: sysfs-class-rc-nuvoton: use wildcards on What definitions
  ABI: sysfs-class-uwb_rc: use wildcards on What definitions
  ABI: sysfs-class-uwb_rc-wusbhc: use wildcards on What definitions
  ABI: sysfs-devices-platform-dock: use wildcards on What definitions
  ABI: sysfs-devices-system-cpu: use wildcards on What definitions
  ABI: sysfs-firmware-efi-esrt: use wildcards on What definitions
  ABI: sysfs-platform-sst-atom: use wildcards on What definitions
  ABI: sysfs-ptp: use wildcards on What definitions

 .../ABI/stable/sysfs-class-infiniband | 64 ++---
 Documentation/ABI/stable/sysfs-class-tpm  |  2 +-
 Documentation/ABI/testing/evm |  4 +-
 Documentation/ABI/testing/ima_policy  |  2 +-
 Documentation/ABI/testing/pstore  |  3 +-
 Documentation/ABI/testing/sysfs-ata   |  2 +-
 Documentation/ABI/testing/sysfs-bus-pci   |  2 +-
 Documentation/ABI/testing/sysfs-bus-rapidio   | 32 +++
 .../ABI/testing/sysfs-bus-soundwire-master|  2 +-
 .../ABI/testing/sysfs-bus-soundwire-slave |  2 +-
 Documentation/ABI/testing/sysfs-bus-usb   | 16 ++--
 Documentation/ABI/testing/sysfs-class-cxl | 15 ++-
 .../ABI/testing/sysfs-class-devfreq-event | 12 +--
 Documentation/ABI/testing/sysfs-class-gnss|  2 +-
 Documentation/ABI/testing/sysfs-class-mei | 18 ++--
 Documentation/ABI/testing/sysfs-class-mic | 24 ++---
 Documentation/ABI/testing/sysfs-class-mux |  2 +-
 Documentation/ABI/testing/sysfs-class-pwm | 20 ++--
 Documentation/ABI/testing/sysfs-class-rc  | 14 +--
 .../ABI/testing/sysfs-class-rc-nuvoton|  2 +-
 Documentation/ABI/testing/sysfs-class-typec   |  2 +-
 Documentation/ABI/testing/sysfs-class-uwb_rc  | 26 ++---
 .../ABI/testing/sysfs-class-uwb_rc-wusbhc | 10 +-
 .../ABI/testing/sysfs-devices-platform-dock   | 10 +-
 .../ABI/testing/sysfs-devices-system-cpu  | 16 ++--
 .../ABI/testing/sysfs-firmware-efi-esrt   | 16 ++--
 Documentation/ABI/testing/sysfs-kernel-slab   | 94 +--
 .../ABI/testing/sysfs-platform-sst-atom   |  2 +-
 Documentation/ABI/testing/sysfs-ptp   | 30 +++---
 Documentation/ABI/testing/sysfs-tty   | 32 +++
 30 files changed, 242 insertions(+), 236 deletions(-)

-- 
2.31.1

[PATCH v2 07/29] ABI: sysfs-class-cxl: place "not in a guest" at description

2021-09-14 Thread Mauro Carvalho Chehab

The What: field should have just the location of the ABI.
Anything else should be inside the description.

This fixes its parsing by get_abi.pl script.

Signed-off-by: Mauro Carvalho Chehab 
---
 Documentation/ABI/testing/sysfs-class-cxl | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-cxl 
b/Documentation/ABI/testing/sysfs-class-cxl
index 818f55970efb..3c77677e0ca7 100644
--- a/Documentation/ABI/testing/sysfs-class-cxl
+++ b/Documentation/ABI/testing/sysfs-class-cxl
@@ -166,10 +166,11 @@ Description:read only
 Decimal value of the Per Process MMIO space length.
 Users: https://github.com/ibm-capi/libcxl
 
-What:   /sys/class/cxl/m/pp_mmio_off (not in a guest)
+What:   /sys/class/cxl/m/pp_mmio_off
 Date:   September 2014
 Contact:linuxppc-dev@lists.ozlabs.org
 Description:read only
+(not in a guest)
 Decimal value of the Per Process MMIO space offset.
 Users: https://github.com/ibm-capi/libcxl
 
@@ -190,28 +191,31 @@ Description:read only
 Identifies the revision level of the PSL.
 Users: https://github.com/ibm-capi/libcxl
 
-What:   /sys/class/cxl//base_image (not in a guest)
+What:   /sys/class/cxl//base_image
 Date:   September 2014
 Contact:linuxppc-dev@lists.ozlabs.org
 Description:read only
+(not in a guest)
 Identifies the revision level of the base image for devices
 that support loadable PSLs. For FPGAs this field identifies
 the image contained in the on-adapter flash which is loaded
 during the initial program load.
 Users: https://github.com/ibm-capi/libcxl
 
-What:   /sys/class/cxl//image_loaded (not in a guest)
+What:   /sys/class/cxl//image_loaded
 Date:   September 2014
 Contact:linuxppc-dev@lists.ozlabs.org
 Description:read only
+(not in a guest)
 Will return "user" or "factory" depending on the image loaded
 onto the card.
 Users: https://github.com/ibm-capi/libcxl
 
-What:   /sys/class/cxl//load_image_on_perst (not in a guest)
+What:   /sys/class/cxl//load_image_on_perst
 Date:   December 2014
 Contact:linuxppc-dev@lists.ozlabs.org
 Description:read/write
+(not in a guest)
 Valid entries are "none", "user", and "factory".
 "none" means PERST will not cause image to be loaded to the
 card.  A power cycle is required to load the image.
@@ -235,10 +239,11 @@ Description:write only
 contexts on the card AFUs.
 Users: https://github.com/ibm-capi/libcxl
 
-What:  /sys/class/cxl//perst_reloads_same_image (not in a guest)
+What:  /sys/class/cxl//perst_reloads_same_image
 Date:  July 2015
 Contact:   linuxppc-dev@lists.ozlabs.org
 Description:   read/write
+(not in a guest)
Trust that when an image is reloaded via PERST, it will not
have changed.
 
-- 
2.31.1

Re: [PATCH 2/2] kvm: rename KVM_MAX_VCPU_ID to KVM_MAX_VCPU_IDS

2021-09-14 Thread Christian Zigotzky


Hello Juergen,
Hello All,

Since the RC1 of kernel 5.13, -smp 2 and -smp 4 don't work with a 
virtual e5500 QEMU KVM-HV machine anymore. [1]
I see in the serial console, that the uImage doesn't load. I use the 
following QEMU command for booting:


qemu-system-ppc64 -M ppce500 -cpu e5500 -enable-kvm -m 1024 -kernel 
uImage -drive format=raw,file=MintPPC32-X5000.img,index=0,if=virtio 
-netdev user,id=mynet0 -device virtio-net,netdev=mynet0 -append "rw 
root=/dev/vda" -device virtio-vga -device virtio-mouse-pci -device 
virtio-keyboard-pci -device pci-ohci,id=newusb -device 
usb-audio,bus=newusb.0 -smp 4


The kernels boot without KVM-HV.

Summary for KVM-HV:

-smp 1 -> works
-smp 2 -> doesn't work
-smp 3 -> works
-smp 4 -> doesn't work

I used -smp 4 before the RC1 of kernel 5.13 because my FSL P5040 BookE 
machine [2] has 4 cores.


Does this patch solve this issue? [3]

Thanks,
Christian

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-May/229103.html
[2] http://wiki.amiga.org/index.php?title=X5000
[3] 
https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-September/234152.html

Re: [RFC PATCH 5/8] sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y

2021-09-14 Thread Linus Torvalds

On Tue, Sep 14, 2021 at 8:53 AM Ard Biesheuvel  wrote:
>
> task_cpu() takes a 'const struct task_struct *', whereas
> task_thread_info() takes a 'struct task_struct *'.

Oh, annoying, but that's easily fixed. Just make that

   static inline struct thread_info *task_thread_info(struct
task_struct *task) ..

be a simple

  #define task_thread_info(tsk) (&(tsk)->thread_info)

instead. That actually then matches the !THREAD_INFO_IN_TASK case anyway.

Make the commit comment be about how that fixes the type problem.

Because while in many cases inline functions are superior to macros,
it clearly isn't the case in this case.

  Linus
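
The type problem and the macro fix can be sketched in plain C. This is a minimal userspace sketch, not kernel code: the two structs are hypothetical cut-down stand-ins, and task_cpu_model() and set_task_cpu_model() are illustrative names. Because the macro expands to an address-of expression, the type of its result follows the constness of the argument, which a function declared with a 'struct task_struct *' parameter cannot do.

```c
/* Hypothetical, cut-down stand-ins for the kernel structures. */
struct thread_info {
	unsigned int cpu;
};

struct task_struct {
	struct thread_info thread_info;
};

/* Macro form: the result type follows the constness of 'tsk'. */
#define task_thread_info(tsk) (&(tsk)->thread_info)

/* Reader side: 'p' is const, so the macro yields a const pointer. */
static inline unsigned int task_cpu_model(const struct task_struct *p)
{
	return task_thread_info(p)->cpu;
}

/* Writer side: with a non-const pointer the result is still an lvalue. */
static inline void set_task_cpu_model(struct task_struct *p, unsigned int cpu)
{
	task_thread_info(p)->cpu = cpu;
}
```

The same macro serves both the const reader and the non-const writer, so no second const-flavored accessor is needed in this model.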

Re: [RFC PATCH 5/8] sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y

2021-09-14 Thread Ard Biesheuvel

On Tue, 14 Sept 2021 at 17:59, Linus Torvalds
 wrote:
>
> On Tue, Sep 14, 2021 at 8:53 AM Ard Biesheuvel  wrote:
> >
> > task_cpu() takes a 'const struct task_struct *', whereas
> > task_thread_info() takes a 'struct task_struct *'.
>
> Oh, annoying, but that's easily fixed. Just make that
>
>static inline struct thread_info *task_thread_info(struct
> task_struct *task) ..
>
> be a simple
>
>   #define task_thread_info(tsk) (&(tsk)->thread_info)
>
> instead. That actually then matches the !THREAD_INFO_IN_TASK case anyway.
>
> Make the commit comment be about how that fixes the type problem.
>
> Because while in many cases inline functions are superior to macros,
> it clearly isn't the case in this case.
>

Works for me.

Re: [PATCH] powerpc/boot: Fix build failure since GCC 4.9 removal

2021-09-14 Thread Guenter Roeck

On Tue, Sep 14, 2021 at 10:17:23PM +1000, Michael Ellerman wrote:
> Stephen reported that the build was broken since commit
> 6d2ef226f2f1 ("compiler_attributes.h: drop __has_attribute() support for
> gcc4"), with errors such as:
> 
>   include/linux/compiler_attributes.h:296:5: warning: "__has_attribute" is 
> not defined, evaluates to 0 [-Wundef]
> 296 | #if __has_attribute(__warning__)
> | ^~~
>   make[2]: *** [arch/powerpc/boot/Makefile:225: arch/powerpc/boot/crt0.o] 
> Error 1
> 
> But we expect __has_attribute() to always be defined now that we've
> stopped using GCC 4.
> 
> Linus debugged it to the point of reading the GCC sources, and noticing
> that the problem is that __has_attribute() is not defined when
> preprocessing assembly files, which is what we're doing here.
> 
> Our assembly files don't include, or need, compiler_attributes.h, but
> they are getting it unconditionally from the -include in BOOT_CFLAGS,
> which is then added in its entirety to BOOT_AFLAGS.
> 
> That -include was added in commit 77433830ed16 ("powerpc: boot: include
> compiler_attributes.h") so that we'd have "fallthrough" and other
> attributes defined for the C files in arch/powerpc/boot. But it's not
> needed for assembly files.
> 
> The minimal fix is to move the addition to BOOT_CFLAGS of -include
> compiler_attributes.h until after we've copied BOOT_CFLAGS into
> BOOT_AFLAGS. That avoids including compiler_attributes.h for asm files,
> but makes no other change to BOOT_CFLAGS or BOOT_AFLAGS.
> 
> Reported-by: Stephen Rothwell 
> Debugged-by: Linus Torvalds 
> Signed-off-by: Michael Ellerman 

Tested-by: Guenter Roeck 

> ---
>  arch/powerpc/boot/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> 
> This seemed safer as a minimal fix, rather than doing a more
> comprehensive separation of CFLAGS/AFLAGS. We can do that in a future
> patch.
> 
> It passed my usual build/boot tests, including booting the built zImage
> on some real hardware, so this is good to go from my POV.
> 
> cheers
> 
> diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
> index 6900d0ac2421..089ee3ea55c8 100644
> --- a/arch/powerpc/boot/Makefile
> +++ b/arch/powerpc/boot/Makefile
> @@ -35,7 +35,6 @@ endif
>  BOOTCFLAGS:= -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
>-fno-strict-aliasing -O2 -msoft-float -mno-altivec -mno-vsx \
>-pipe -fomit-frame-pointer -fno-builtin -fPIC -nostdinc \
> -  -include $(srctree)/include/linux/compiler_attributes.h \
>$(LINUXINCLUDE)
>  
>  ifdef CONFIG_PPC64_BOOT_WRAPPER
> @@ -70,6 +69,7 @@ ifeq ($(call cc-option-yn, -fstack-protector),y)
>  BOOTCFLAGS   += -fno-stack-protector
>  endif
>  
> +BOOTCFLAGS   += -include $(srctree)/include/linux/compiler_attributes.h
>  BOOTCFLAGS   += -I$(objtree)/$(obj) -I$(srctree)/$(obj)
>  
>  DTC_FLAGS?= -p 1024
> -- 
> 2.25.1
>

Re: [PATCH] powerpc: clean up UPD_CONSTR

2021-09-14 Thread Nathan Chancellor


On 9/14/2021 9:17 AM, Nick Desaulniers wrote:

UPD_CONSTR was previously a preprocessor define for an old GCC 4.9 inline
asm bug with m<> constraints.

Fixes: 6563139d90ad ("powerpc: remove GCC version check for UPD_CONSTR")
Suggested-by: Nathan Chancellor 
Suggested-by: Christophe Leroy 
Suggested-by: Michael Ellerman 
Signed-off-by: Nick Desaulniers 


Reviewed-by: Nathan Chancellor 


---
  arch/powerpc/include/asm/asm-const.h | 2 --
  arch/powerpc/include/asm/atomic.h| 8 
  arch/powerpc/include/asm/io.h| 4 ++--
  arch/powerpc/include/asm/uaccess.h   | 6 +++---
  arch/powerpc/kvm/powerpc.c   | 4 ++--
  5 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/asm-const.h 
b/arch/powerpc/include/asm/asm-const.h
index dbfa5e1e3198..bfb3c3534877 100644
--- a/arch/powerpc/include/asm/asm-const.h
+++ b/arch/powerpc/include/asm/asm-const.h
@@ -12,6 +12,4 @@
  #  define ASM_CONST(x)__ASM_CONST(x)
  #endif
  
-#define UPD_CONSTR "<>"
-
  #endif /* _ASM_POWERPC_ASM_CONST_H */
diff --git a/arch/powerpc/include/asm/atomic.h 
b/arch/powerpc/include/asm/atomic.h
index 6a53ef178bfd..fd594fdbd84d 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -27,14 +27,14 @@ static __inline__ int arch_atomic_read(const atomic_t *v)
  {
  	int t;
  
-	__asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"UPD_CONSTR(v->counter));
+	__asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m<>"(v->counter));
  
  	return t;
  }
  
  static __inline__ void arch_atomic_set(atomic_t *v, int i)
  {
-	__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"UPD_CONSTR(v->counter) : "r"(i));
+	__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m<>"(v->counter) : "r"(i));
  }
  
  #define ATOMIC_OP(op, asm_op)		\
@@ -320,14 +320,14 @@ static __inline__ s64 arch_atomic64_read(const atomic64_t *v)
  {
  	s64 t;
  
-	__asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m"UPD_CONSTR(v->counter));
+	__asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m<>"(v->counter));
  
  	return t;
  }
  
  static __inline__ void arch_atomic64_set(atomic64_t *v, s64 i)
  {
-	__asm__ __volatile__("std%U0%X0 %1,%0" : "=m"UPD_CONSTR(v->counter) : "r"(i));
+	__asm__ __volatile__("std%U0%X0 %1,%0" : "=m<>"(v->counter) : "r"(i));
  }
  
  #define ATOMIC64_OP(op, asm_op)		\
diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index f130783c8301..beba4979bff9 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -122,7 +122,7 @@ static inline u##size name(const volatile u##size __iomem *addr)\
  { \
u##size ret;\
__asm__ __volatile__("sync;"#insn"%U1%X1 %0,%1;twi 0,%0,0;isync"\
-   : "=r" (ret) : "m"UPD_CONSTR (*addr) : "memory"); \
+   : "=r" (ret) : "m<>" (*addr) : "memory");   \
return ret; \
  }
  
@@ -130,7 +130,7 @@ static inline u##size name(const volatile u##size __iomem *addr) \
  static inline void name(volatile u##size __iomem *addr, u##size val)  \
  { \
__asm__ __volatile__("sync;"#insn"%U0%X0 %1,%0" \
-   : "=m"UPD_CONSTR (*addr) : "r" (val) : "memory"); \
+   : "=m<>" (*addr) : "r" (val) : "memory");   \
mmiowb_set_pending();   \
  }
  
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 22c79ab40006..63316100080c 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -86,7 +86,7 @@ __pu_failed:  \
"1:" op "%U1%X1 %0,%1# put_user\n"  \
EX_TABLE(1b, %l2)   \
:   \
-   : "r" (x), "m"UPD_CONSTR (*addr)\
+   : "r" (x), "m<>" (*addr)  \
:   \
: label)
  
@@ -143,7 +143,7 @@ do {\
"1:"op"%U1%X1 %0, %1 # get_user\n"  \
EX_TABLE(1b, %l2)   \
: "=r" (x)\
-   : "m"UPD_CONSTR (*addr)   \
+   : "m<>" (*addr) \
:   \
: label)
  
@@ -200,7 +200,7 @@ __gus_failed:\
".previous\n" \
EX_TABLE(1b, 3b)\
: "=r" (err),
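
The substitution made throughout this patch relies on C's concatenation of adjacent string literals: with UPD_CONSTR defined as "<>", the old spelling "m"UPD_CONSTR and the new spelling "m<>" produce the same constraint string. A minimal sketch of that equivalence follows; the array names and constraints_match() are illustrative only, and the define is reproduced solely to show the old form.

```c
#include <string.h>

/* The define removed by the patch, kept here only for comparison. */
#define UPD_CONSTR "<>"

/* Old spelling: two adjacent string literals, joined by the compiler. */
static const char old_constraint[] = "m"UPD_CONSTR;

/* New spelling: the same characters written as one literal. */
static const char new_constraint[] = "m<>";

/* Returns 1 when both spellings yield the identical string. */
static int constraints_match(void)
{
	return strcmp(old_constraint, new_constraint) == 0;
}
```

Since the strings are identical, the cleanup should not change the generated code; it only removes one level of macro indirection.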

Re: [PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()

2021-09-14 Thread Christophe Leroy





On 14/09/2021 at 13:58, Borislav Petkov wrote:

On Wed, Sep 08, 2021 at 05:58:35PM -0500, Tom Lendacky wrote:

Introduce a powerpc version of the cc_platform_has() function. This will
be used to replace the powerpc mem_encrypt_active() implementation, so
the implementation will initially only support the CC_ATTR_MEM_ENCRYPT
attribute.

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Signed-off-by: Tom Lendacky 
---
  arch/powerpc/platforms/pseries/Kconfig   |  1 +
  arch/powerpc/platforms/pseries/Makefile  |  2 ++
  arch/powerpc/platforms/pseries/cc_platform.c | 26 
  3 files changed, 29 insertions(+)
  create mode 100644 arch/powerpc/platforms/pseries/cc_platform.c


Michael,

can I get an ACK for the ppc bits to carry them through the tip tree
pls?

Btw, on a related note, cross-compiling this throws the following error here:

$ make CROSS_COMPILE=/home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/powerpc64-linux- V=1 ARCH=powerpc

...

/home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc
 -Wp,-MD,arch/powerpc/boot/.crt0.o.d -D__ASSEMBLY__ -Wall -Wundef 
-Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -O2 -msoft-float 
-mno-altivec -mno-vsx -pipe -fomit-frame-pointer -fno-builtin -fPIC -nostdinc 
-include ./include/linux/compiler_attributes.h -I./arch/powerpc/include 
-I./arch/powerpc/include/generated  -I./include -I./arch/powerpc/include/uapi 
-I./arch/powerpc/include/generated/uapi -I./include/uapi 
-I./include/generated/uapi -include ./include/linux/compiler-version.h -include 
./include/linux/kconfig.h -m32 -isystem 
/home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/../lib/gcc/powerpc64-linux/9.4.0/include
 -mbig-endian -nostdinc -c -o arch/powerpc/boot/crt0.o arch/powerpc/boot/crt0.S
In file included from <command-line>:
././include/linux/compiler_attributes.h:62:5: warning: "__has_attribute" is not 
defined, evaluates to 0 [-Wundef]
62 | #if __has_attribute(__assume_aligned__)
   | ^~~
././include/linux/compiler_attributes.h:62:20: error: missing binary operator before 
token "("
62 | #if __has_attribute(__assume_aligned__)
   |^
././include/linux/compiler_attributes.h:88:5: warning: "__has_attribute" is not 
defined, evaluates to 0 [-Wundef]
88 | #if __has_attribute(__copy__)
   | ^~~
...

Known issue?

This __has_attribute() thing is supposed to be supported
in gcc since 5.1 and I'm using the crosstool stuff from
https://www.kernel.org/pub/tools/crosstool/ and gcc-9.4 above is pretty
new so that should not happen actually.

But it does...

Hmmm.




Yes, see 
https://lore.kernel.org/linuxppc-dev/20210914123919.58203...@canb.auug.org.au/T/#t
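
As the powerpc/boot fix in this thread explains, GCC does not define __has_attribute() when it preprocesses assembly files, so a header that uses it unguarded fails exactly as in the log above. The sketch below shows a defensive guard for a header that might be pulled into such a build. It is an illustration only and is not what the actual fix does (the fix stops passing the -include to assembly files); steps_from() is a hypothetical function that merely exercises the resulting fallthrough macro.

```c
/*
 * Illustrative guard only: supply a fallback when the preprocessor
 * does not provide __has_attribute (e.g. while preprocessing assembly).
 */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

#if __has_attribute(__fallthrough__)
# define fallthrough __attribute__((__fallthrough__))
#else
# define fallthrough do {} while (0)	/* harmless no-op fallback */
#endif

/* Hypothetical user of the macro: counts the case bodies executed. */
static int steps_from(int start)
{
	int steps = 0;

	switch (start) {
	case 0:
		steps++;
		fallthrough;
	case 1:
		steps++;
		break;
	default:
		break;
	}
	return steps;
}
```

With the fallback in place, every __has_attribute() test evaluates to 0 instead of being a syntax error, at the cost of silently losing the attributes.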

Re: [PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()

2021-09-14 Thread Borislav Petkov

On Tue, Sep 14, 2021 at 04:47:41PM +0200, Christophe Leroy wrote:
> Yes, see 
> https://lore.kernel.org/linuxppc-dev/20210914123919.58203...@canb.auug.org.au/T/#t

Aha, more compiler magic stuff ;-\

Oh well, I guess that fix will land upstream soon.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Re: [PATCH] swiotlb: set IO TLB segment size via cmdline

2021-09-14 Thread Christoph Hellwig

On Tue, Sep 14, 2021 at 05:29:07PM +0200, Jan Beulich wrote:
> I'm not convinced the swiotlb use describe there falls under "intended
> use" - mapping a 1280x720 framebuffer in a single chunk? (As an aside,
> the bottom of this page is also confusing, as following "Then we can
> confirm the modified swiotlb size in the boot log:" there is a log
> fragment showing the same original size of 64Mb.

It doesn't.  We also do not add hacks to the kernel for whacky out
of tree modules.

Re: [RFC PATCH 5/8] sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y

2021-09-14 Thread Ard Biesheuvel

On Tue, 14 Sept 2021 at 17:49, Linus Torvalds
 wrote:
>
> On Tue, Sep 14, 2021 at 5:11 AM Ard Biesheuvel  wrote:
> >
> >  static inline unsigned int task_cpu(const struct task_struct *p)
> >  {
> >  #ifdef CONFIG_THREAD_INFO_IN_TASK
> > -   return READ_ONCE(p->cpu);
> > +   return READ_ONCE(p->thread_info.cpu);
> >  #else
> > return READ_ONCE(task_thread_info(p)->cpu);
> >  #endif
>
> Those two lines look different, but aren't.
>
> Please just remove the CONFIG_THREAD_INFO_IN_TASK conditional, and use
>
>   return READ_ONCE(task_thread_info(p)->cpu);
>
> unconditionally, which now does the right thing regardless.
>

Unfortunately not.

task_cpu() takes a 'const struct task_struct *', whereas
task_thread_info() takes a 'struct task_struct *'.

Since task_thread_info()-> is widely used as an lvalue, I would
need to update task_cpu()'s prototype and fix up all the callers, some
of which take the const flavor themselves. Or introduce
'const_task_thread_info()' which takes the const flavor, and cannot be
used to instantiate lvalues.

Suggestions welcome, but this is the cleanest I could come up with.

[PATCH v4 5/5] powerpc/signal: Use unsafe_copy_siginfo_to_user()

2021-09-14 Thread Christophe Leroy

Use unsafe_copy_siginfo_to_user() in order to do the copy
within the user access block.

On an mpc 8321 (book3s/32) the improvement is about 5% on a process
sending a signal to itself.

Signed-off-by: Christophe Leroy 
---
v4: Use another approach for compat: drop the unsafe_copy_siginfo_to_user32(), 
instead directly call copy_siginfo_to_external32() before user_access_begin()

v3: Don't leave compat aside, use the new unsafe_copy_siginfo_to_user32()
---
 arch/powerpc/kernel/signal_32.c | 17 -
 arch/powerpc/kernel/signal_64.c |  5 +
 2 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index ff101e2b3bab..f1f5dde0885f 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -710,12 +710,6 @@ static long restore_tm_user_regs(struct pt_regs *regs, 
struct mcontext __user *s
 }
 #endif
 
-#ifdef CONFIG_PPC64
-
-#define copy_siginfo_to_user   copy_siginfo_to_user32
-
-#endif /* CONFIG_PPC64 */
-
 /*
  * Set up a signal frame for a "real-time" signal handler
  * (one which gets siginfo).
@@ -731,6 +725,7 @@ int handle_rt_signal32(struct ksignal *ksig, sigset_t 
*oldset,
struct pt_regs *regs = tsk->thread.regs;
/* Save the thread's msr before get_tm_stackpointer() changes it */
unsigned long msr = regs->msr;
+   compat_siginfo_t uinfo;
 
/* Set up Signal Frame */
frame = get_sigframe(ksig, tsk, sizeof(*frame), 1);
@@ -744,6 +739,9 @@ int handle_rt_signal32(struct ksignal *ksig, sigset_t 
*oldset,
else
prepare_save_user_regs(1);
 
+   if (IS_ENABLED(CONFIG_COMPAT))
+   copy_siginfo_to_external32(&uinfo, &ksig->info);
+
if (!user_access_begin(newsp, __SIGNAL_FRAMESIZE + 16 + sizeof(*frame)))
goto badframe;
 
@@ -779,15 +777,16 @@ int handle_rt_signal32(struct ksignal *ksig, sigset_t 
*oldset,
asm("dcbst %y0; sync; icbi %y0; sync" :: "Z" (mctx->mc_pad[0]));
}
unsafe_put_sigset_t(&frame->uc.uc_sigmask, oldset, failed);
+   if (IS_ENABLED(CONFIG_COMPAT))
+   unsafe_copy_to_user(&frame->info, &uinfo, sizeof(frame->info), failed);
+   else
+   unsafe_copy_siginfo_to_user((void *)&frame->info, &ksig->info, failed);
 
/* create a stack frame for the caller of the handler */
unsafe_put_user(regs->gpr[1], newsp, failed);
 
user_access_end();
 
-   if (copy_siginfo_to_user(&frame->info, &ksig->info))
-   goto badframe;
-
regs->link = tramp;
 
 #ifdef CONFIG_PPC_FPU_REGS
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index d80ff83cacb9..56c0c74aa28c 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -901,15 +901,12 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t 
*set,
}
 
unsafe_copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set), badframe_block);
+   unsafe_copy_siginfo_to_user(&frame->info, &ksig->info, badframe_block);
/* Allocate a dummy caller frame for the signal handler. */
unsafe_put_user(regs->gpr[1], newsp, badframe_block);
 
user_write_access_end();
 
-   /* Save the siginfo outside of the unsafe block. */
-   if (copy_siginfo_to_user(&frame->info, &ksig->info))
-   goto badframe;
-
/* Make sure signal handler doesn't get spurious FP exceptions */
tsk->thread.fp_state.fpscr = 0;
 
-- 
2.31.1

[PATCH v4 4/5] powerpc/uaccess: Add unsafe_clear_user()

2021-09-14 Thread Christophe Leroy

Implement unsafe_clear_user() for powerpc.
It's a copy/paste of unsafe_copy_to_user() with value 0 as source.

It may be improved in a later patch by using 'dcbz' instruction
to zeroize full cache lines at once.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/uaccess.h | 20 
 1 file changed, 20 insertions(+)

diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 22c79ab40006..962b675485ff 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -467,6 +467,26 @@ do {   
\
unsafe_put_user(*(u8*)(_src + _i), (u8 __user *)(_dst + _i), 
e); \
 } while (0)
 
+#define unsafe_clear_user(d, l, e) \
+do {   \
+   u8 __user *_dst = (u8 __user *)(d); \
+   size_t _len = (l);  \
+   int _i; \
+   \
+   for (_i = 0; _i < (_len & ~(sizeof(u64) - 1)); _i += sizeof(u64)) \
+   unsafe_put_user(0, (u64 __user *)(_dst + _i), e);   \
+   if (_len & 4) { \
+   unsafe_put_user(0, (u32 __user *)(_dst + _i), e);   \
+   _i += 4;\
+   }   \
+   if (_len & 2) { \
+   unsafe_put_user(0, (u16 __user *)(_dst + _i), e);   \
+   _i += 2;\
+   }   \
+   if (_len & 1)   \
+   unsafe_put_user(0, (u8 __user *)(_dst + _i), e);\
+} while (0)
+
 #define HAVE_GET_KERNEL_NOFAULT
 
 #define __get_kernel_nofault(dst, src, type, err_label)
\
-- 
2.31.1
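
The chunking used by unsafe_clear_user() above (whole 8-byte stores, then a 4, 2 and 1 byte tail selected by the low bits of the length) can be modelled in userspace. This is a sketch under stated assumptions: put_zero() is a hypothetical stand-in for unsafe_put_user(0, ...) built on memset(), there is no fault handling, and the index is a size_t rather than the int used in the macro. clears_exactly() is a small self-check helper, not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for unsafe_put_user(0, ...): store 'width' zero bytes. */
static void put_zero(uint8_t *dst, size_t width)
{
	memset(dst, 0, width);
}

/* Same chunking as the macro: u64 stores first, then a 4/2/1 tail. */
static void clear_chunked(void *d, size_t len)
{
	uint8_t *dst = d;
	size_t i;

	for (i = 0; i < (len & ~(sizeof(uint64_t) - 1)); i += sizeof(uint64_t))
		put_zero(dst + i, sizeof(uint64_t));
	if (len & 4) {
		put_zero(dst + i, 4);
		i += 4;
	}
	if (len & 2) {
		put_zero(dst + i, 2);
		i += 2;
	}
	if (len & 1)
		put_zero(dst + i, 1);
}

/*
 * Returns 1 if exactly 'len' bytes were cleared and the guard bytes
 * on either side of the cleared range were left untouched.
 */
static int clears_exactly(size_t len)
{
	uint8_t buf[64];
	size_t i;

	if (len > sizeof(buf) - 2)
		return 0;
	memset(buf, 0xff, sizeof(buf));
	clear_chunked(buf + 1, len);
	if (buf[0] != 0xff || buf[len + 1] != 0xff)
		return 0;
	for (i = 0; i < len; i++)
		if (buf[1 + i] != 0)
			return 0;
	return 1;
}
```

For a length of 15, for example, the model issues one 8-byte store followed by 4, 2 and 1 byte stores, which together cover all 15 bytes.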

[PATCH v4 1/5] powerpc/signal64: Access function descriptor with user access block

2021-09-14 Thread Christophe Leroy

Access the function descriptor of the handler within a
user access block.

Signed-off-by: Christophe Leroy 
---
v3: Flatten the change to avoid nested gotos.
---
 arch/powerpc/kernel/signal_64.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index 1831bba0582e..7b1cd50bc4fb 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -936,8 +936,13 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
func_descr_t __user *funct_desc_ptr =
(func_descr_t __user *) ksig->ka.sa.sa_handler;
 
-   err |= get_user(regs->ctr, &funct_desc_ptr->entry);
-   err |= get_user(regs->gpr[2], &funct_desc_ptr->toc);
+   if (!user_read_access_begin(funct_desc_ptr, sizeof(func_descr_t)))
+   goto badfunc;
+
+   unsafe_get_user(regs->ctr, &funct_desc_ptr->entry, badfunc_block);
+   unsafe_get_user(regs->gpr[2], &funct_desc_ptr->toc, badfunc_block);
+
+   user_read_access_end();
}
 
/* enter the signal handler in native-endian mode */
@@ -962,5 +967,12 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 badframe:
signal_fault(current, regs, "handle_rt_signal64", frame);
 
+   return 1;
+
+badfunc_block:
+   user_read_access_end();
+badfunc:
+   signal_fault(current, regs, __func__, (void __user 
*)ksig->ka.sa.sa_handler);
+
return 1;
 }
-- 
2.31.1
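
The control flow this patch introduces (open an access window, do the reads with a jump label for faults, close the window on both the success and the failure path) can be sketched as a userspace model. Everything here is hypothetical scaffolding: model_access_begin(), model_access_end() and model_get() merely imitate the shape of user_read_access_begin(), user_read_access_end() and unsafe_get_user(), and a fault is simulated by a flag rather than by a real bad pointer.

```c
#include <stddef.h>

/* Cut-down stand-in for the ELF ABIv1 function descriptor. */
struct func_descr {
	unsigned long entry;
	unsigned long toc;
};

/* Number of currently open access windows (model bookkeeping only). */
static int access_depth;

static int model_access_begin(const void *p)
{
	if (!p)
		return 0;	/* window refused, nothing to close */
	access_depth++;
	return 1;
}

static void model_access_end(void)
{
	access_depth--;
}

/* Imitates unsafe_get_user(): jump to 'label' on a (simulated) fault. */
#define model_get(dst, src, fault, label)	\
do {						\
	if (fault)				\
		goto label;			\
	(dst) = *(src);				\
} while (0)

/* Returns 0 on success, 1 on failure, like the badfunc path above. */
static int load_descr(const struct func_descr *d, int fault,
		      unsigned long *ctr, unsigned long *r2)
{
	if (!model_access_begin(d))
		goto badfunc;

	model_get(*ctr, &d->entry, 0, badfunc_block);
	model_get(*r2, &d->toc, fault, badfunc_block);

	model_access_end();
	return 0;

badfunc_block:
	model_access_end();	/* a fault inside the window must close it */
badfunc:
	return 1;
}
```

The two labels keep the error path flat: a fault inside the window goes through badfunc_block and closes the window first, while a refused window jumps straight to badfunc.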

[PATCH trivial v2] powerpc/powernv/dump: Fix typo in comment

2021-09-14 Thread Vasant Hegde

Signed-off-by: Vasant Hegde 
---
 arch/powerpc/platforms/powernv/opal-dump.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/opal-dump.c 
b/arch/powerpc/platforms/powernv/opal-dump.c
index 00c5a59d82d9..717d1d30ade5 100644
--- a/arch/powerpc/platforms/powernv/opal-dump.c
+++ b/arch/powerpc/platforms/powernv/opal-dump.c
@@ -419,7 +419,7 @@ void __init opal_platform_dump_init(void)
int rc;
int dump_irq;
 
-   /* ELOG not supported by firmware */
+   /* Dump not supported by firmware */
if (!opal_check_token(OPAL_DUMP_READ))
return;
 
-- 
2.31.1

Re: [RFC PATCH 1/8] arm64: add CPU field to struct thread_info

2021-09-14 Thread Linus Torvalds

On Tue, Sep 14, 2021 at 5:10 AM Ard Biesheuvel  wrote:
>
> The CPU field will be moved back into thread_info even when
> THREAD_INFO_IN_TASK is enabled, so add it back to arm64's definition of
> struct thread_info.

The series looks sane to me, but it strikes me that it's inconsistent
- here for arm64, you make it unconditional, but for the other
architectures you end up putting it inside a #ifdef CONFIG_SMP.

Was there some reason for this odd behavior?

   Linus

Re: [5.15-rc1][PPC][bisected 6d2ef226] mainline build breaks at ./include/linux/compiler_attributes.h:62:5: warning: "__has_attribute"

2021-09-14 Thread Stephen Rothwell

Hi Abdul,

On Tue, 14 Sep 2021 11:39:44 +0530 Abdul Haleem wrote:
>
> Today's mainline kernel fails to compile on my powerpc box with below errors
> 
> ././include/linux/compiler_attributes.h:62:5: warning: "__has_attribute" is 
> not defined, evaluates to 0 [-Wundef]
>   #if __has_attribute(__assume_aligned__)
>   ^~~
> ././include/linux/compiler_attributes.h:62:20: error: missing binary operator 
> before token "("
>   #if __has_attribute(__assume_aligned__)
>      ^
> ././include/linux/compiler_attributes.h:88:5: warning: "__has_attribute" is 
> not defined, evaluates to 0 [-Wundef]
>   #if __has_attribute(__copy__)
>   ^~~
> ././include/linux/compiler_attributes.h:88:20: error: missing binary operator 
> before token "("
>   #if __has_attribute(__copy__)
> 
> Kernel builds fine when below patch is reverted
> 
> commit 6d2ef22 : compiler_attributes.h: drop __has_attribute() support for 
> gcc4

Thanks for your report.

This is known and being addressed.

-- 
Cheers,
Stephen Rothwell



[PATCH] swiotlb: set IO TLB segment size via cmdline

2021-09-14 Thread Roman Skakun

From: Roman Skakun 

The default IO TLB segment size may be too small to
fit long buffers, as described here [1].

This patch adds a way to set this parameter on the
kernel command line instead of recompiling the kernel.

[1] https://www.xilinx.com/support/answers/72694.html

Signed-off-by: Roman Skakun 
---
 .../admin-guide/kernel-parameters.txt |  5 +-
 arch/mips/cavium-octeon/dma-octeon.c  |  2 +-
 arch/powerpc/platforms/pseries/svm.c  |  2 +-
 drivers/xen/swiotlb-xen.c |  7 +--
 include/linux/swiotlb.h   |  1 +
 kernel/dma/swiotlb.c  | 51 ++-
 6 files changed, 48 insertions(+), 20 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 91ba391f9b32..f842a523a485 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5558,8 +5558,9 @@
it if 0 is given (See 
Documentation/admin-guide/cgroup-v1/memory.rst)
 
swiotlb=[ARM,IA-64,PPC,MIPS,X86]
-   Format: { <int> | force | noforce }
-   <int> -- Number of I/O TLB slabs
+   Format: {  [,] [,force | 
noforce] }
+-- Number of I/O TLB slabs
+-- Max IO TLB segment size
force -- force using of bounce buffers even if they
 wouldn't be automatically used by the kernel
noforce -- Never use bounce buffers (for debugging)
diff --git a/arch/mips/cavium-octeon/dma-octeon.c 
b/arch/mips/cavium-octeon/dma-octeon.c
index df70308db0e6..446c73bc936e 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -237,7 +237,7 @@ void __init plat_swiotlb_setup(void)
swiotlbsize = 64 * (1<<20);
 #endif
swiotlb_nslabs = swiotlbsize >> IO_TLB_SHIFT;
-   swiotlb_nslabs = ALIGN(swiotlb_nslabs, IO_TLB_SEGSIZE);
+   swiotlb_nslabs = ALIGN(swiotlb_nslabs, swiotlb_io_seg_size());
swiotlbsize = swiotlb_nslabs << IO_TLB_SHIFT;
 
octeon_swiotlb = memblock_alloc_low(swiotlbsize, PAGE_SIZE);
diff --git a/arch/powerpc/platforms/pseries/svm.c 
b/arch/powerpc/platforms/pseries/svm.c
index 87f001b4c4e4..2a1f09c722ac 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -47,7 +47,7 @@ void __init svm_swiotlb_init(void)
unsigned long bytes, io_tlb_nslabs;
 
io_tlb_nslabs = (swiotlb_size_or_default() >> IO_TLB_SHIFT);
-   io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+   io_tlb_nslabs = ALIGN(io_tlb_nslabs, swiotlb_io_seg_size());
 
bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 643fe440c46e..0fc9c6cb6815 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -110,12 +110,13 @@ static int xen_swiotlb_fixup(void *buf, unsigned long 
nslabs)
int dma_bits;
dma_addr_t dma_handle;
phys_addr_t p = virt_to_phys(buf);
+   unsigned long tlb_segment_size = swiotlb_io_seg_size();
 
-   dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
+   dma_bits = get_order(tlb_segment_size << IO_TLB_SHIFT) + PAGE_SHIFT;
 
i = 0;
do {
-   int slabs = min(nslabs - i, (unsigned long)IO_TLB_SEGSIZE);
+   int slabs = min(nslabs - i, (unsigned long)tlb_segment_size);
 
do {
rc = xen_create_contiguous_region(
@@ -153,7 +154,7 @@ static const char *xen_swiotlb_error(enum xen_swiotlb_err 
err)
return "";
 }
 
-#define DEFAULT_NSLABS ALIGN(SZ_64M >> IO_TLB_SHIFT, IO_TLB_SEGSIZE)
+#define DEFAULT_NSLABS ALIGN(SZ_64M >> IO_TLB_SHIFT, 
swiotlb_io_seg_size())
 
 int __ref xen_swiotlb_init(void)
 {
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index b0cb2a9973f4..35c3ffeda9fa 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -59,6 +59,7 @@ void swiotlb_sync_single_for_cpu(struct device *dev, 
phys_addr_t tlb_addr,
size_t size, enum dma_data_direction dir);
 dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys,
size_t size, enum dma_data_direction dir, unsigned long attrs);
+unsigned long swiotlb_io_seg_size(void);
 
 #ifdef CONFIG_SWIOTLB
 extern enum swiotlb_force swiotlb_force;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 87c40517e822..6b505206fc13 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -72,6 +72,11 @@ enum swiotlb_force swiotlb_force;
 
 struct io_tlb_mem io_tlb_default_mem;
 
+/*
+ * Maximum IO TLB segment size.
+ */
+static unsigned long io_tlb_seg_size = IO_TLB_SEGSIZE;
+
 /*
  * Max segment that we can provide which (if pages are contingous) will
  * not be bounced

Re: [PATCH 1/1] powerpc: Drop superfluous pci_dev_is_added() calls

2021-09-14 Thread Michael Ellerman

Bjorn Helgaas  writes:
> On Fri, Sep 10, 2021 at 04:19:40PM +0200, Niklas Schnelle wrote:
>> On powerpc, pci_dev_is_added() is called as part of SR-IOV fixups
>> that are done under pcibios_add_device() which in turn is only called in
>> pci_device_add() which is called when a PCI device is scanned.
>> 
>> Now pci_dev_assign_added() is called in pci_bus_add_device() which is
>> only called after scanning the device. Thus pci_dev_is_added() is always
>> false and can be dropped.
>> 
>> Signed-off-by: Niklas Schnelle 
>
> Reviewed-by: Bjorn Helgaas 
>
> This doesn't touch the PCI core, so maybe makes sense for you to take
> it, Michael?  But let me know if you think otherwise.

Yeah I'm happy to take it, thanks.

cheers

Re: [PATCH] swiotlb: set IO TLB segment size via cmdline

2021-09-14 Thread Stefano Stabellini

On Tue, 14 Sep 2021, Christoph Hellwig wrote:
> On Tue, Sep 14, 2021 at 05:29:07PM +0200, Jan Beulich wrote:
> > I'm not convinced the swiotlb use described there falls under "intended
> > use" - mapping a 1280x720 framebuffer in a single chunk? (As an aside,
> > the bottom of this page is also confusing, as following "Then we can
> > confirm the modified swiotlb size in the boot log:" there is a log
> > fragment showing the same original size of 64Mb.)
> 
> It doesn't.  We also do not add hacks to the kernel for whacky out
> of tree modules.

Also, Option 1 listed in the webpage seems to be a lot better. Any
reason you can't do that? Because that option both solves the problem
and increases performance.

Re: [PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()

2021-09-14 Thread Michael Ellerman

Borislav Petkov  writes:
> On Wed, Sep 08, 2021 at 05:58:35PM -0500, Tom Lendacky wrote:
>> Introduce a powerpc version of the cc_platform_has() function. This will
>> be used to replace the powerpc mem_encrypt_active() implementation, so
>> the implementation will initially only support the CC_ATTR_MEM_ENCRYPT
>> attribute.
>> 
>> Cc: Michael Ellerman 
>> Cc: Benjamin Herrenschmidt 
>> Cc: Paul Mackerras 
>> Signed-off-by: Tom Lendacky 
>> ---
>>  arch/powerpc/platforms/pseries/Kconfig   |  1 +
>>  arch/powerpc/platforms/pseries/Makefile  |  2 ++
>>  arch/powerpc/platforms/pseries/cc_platform.c | 26 
>>  3 files changed, 29 insertions(+)
>>  create mode 100644 arch/powerpc/platforms/pseries/cc_platform.c
>
> Michael,
>
> can I get an ACK for the ppc bits to carry them through the tip tree
> pls?

Yeah.

I don't love it, a new C file and an out-of-line call to then call back
to a static inline that for most configurations will return false ... but
whatever :)

Acked-by: Michael Ellerman  (powerpc)


> Btw, on a related note, cross-compiling this throws the following error here:
>
> $ make CROSS_COMPILE=/home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/powerpc64-linux- V=1 ARCH=powerpc
>
> ...
>
> /home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc
>  -Wp,-MD,arch/powerpc/boot/.crt0.o.d -D__ASSEMBLY__ -Wall -Wundef 
> -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -O2 -msoft-float 
> -mno-altivec -mno-vsx -pipe -fomit-frame-pointer -fno-builtin -fPIC -nostdinc 
> -include ./include/linux/compiler_attributes.h -I./arch/powerpc/include 
> -I./arch/powerpc/include/generated  -I./include -I./arch/powerpc/include/uapi 
> -I./arch/powerpc/include/generated/uapi -I./include/uapi 
> -I./include/generated/uapi -include ./include/linux/compiler-version.h 
> -include ./include/linux/kconfig.h -m32 -isystem 
> /home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/../lib/gcc/powerpc64-linux/9.4.0/include
>  -mbig-endian -nostdinc -c -o arch/powerpc/boot/crt0.o 
> arch/powerpc/boot/crt0.S
> In file included from :
> ././include/linux/compiler_attributes.h:62:5: warning: "__has_attribute" is 
> not defined, evaluates to 0 [-Wundef]
>62 | #if __has_attribute(__assume_aligned__)
>   | ^~~
> ././include/linux/compiler_attributes.h:62:20: error: missing binary operator 
> before token "("
>62 | #if __has_attribute(__assume_aligned__)
>   |^
> ././include/linux/compiler_attributes.h:88:5: warning: "__has_attribute" is 
> not defined, evaluates to 0 [-Wundef]
>88 | #if __has_attribute(__copy__)
>   | ^~~
> ...
>
> Known issue?

Yeah, fixed in mainline today, thanks for trying to cross compile :)

cheers

Re: [PATCH v2 07/29] ABI: sysfs-class-cxl: place "not in a guest" at description

2021-09-14 Thread Andrew Donnellan


On 15/9/21 12:32 am, Mauro Carvalho Chehab wrote:

The What: field should have just the location of the ABI.
Anything else should be inside the description.

This fixes its parsing by get_abi.pl script.

Signed-off-by: Mauro Carvalho Chehab 


Looks fine to me.

Acked-by: Andrew Donnellan 


---
  Documentation/ABI/testing/sysfs-class-cxl | 15 ++-
  1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-cxl 
b/Documentation/ABI/testing/sysfs-class-cxl
index 818f55970efb..3c77677e0ca7 100644
--- a/Documentation/ABI/testing/sysfs-class-cxl
+++ b/Documentation/ABI/testing/sysfs-class-cxl
@@ -166,10 +166,11 @@ Description:read only
  Decimal value of the Per Process MMIO space length.
  Users:https://github.com/ibm-capi/libcxl
  
-What:   /sys/class/cxl/m/pp_mmio_off (not in a guest)

+What:   /sys/class/cxl/m/pp_mmio_off
  Date:   September 2014
  Contact:linuxppc-dev@lists.ozlabs.org
  Description:read only
+(not in a guest)
  Decimal value of the Per Process MMIO space offset.
  Users:https://github.com/ibm-capi/libcxl
  
@@ -190,28 +191,31 @@ Description:read only

  Identifies the revision level of the PSL.
  Users:https://github.com/ibm-capi/libcxl
  
-What:   /sys/class/cxl//base_image (not in a guest)

+What:   /sys/class/cxl//base_image
  Date:   September 2014
  Contact:linuxppc-dev@lists.ozlabs.org
  Description:read only
+(not in a guest)
  Identifies the revision level of the base image for devices
  that support loadable PSLs. For FPGAs this field identifies
  the image contained in the on-adapter flash which is loaded
  during the initial program load.
  Users:https://github.com/ibm-capi/libcxl
  
-What:   /sys/class/cxl//image_loaded (not in a guest)

+What:   /sys/class/cxl//image_loaded
  Date:   September 2014
  Contact:linuxppc-dev@lists.ozlabs.org
  Description:read only
+(not in a guest)
  Will return "user" or "factory" depending on the image loaded
  onto the card.
  Users:https://github.com/ibm-capi/libcxl
  
-What:   /sys/class/cxl//load_image_on_perst (not in a guest)

+What:   /sys/class/cxl//load_image_on_perst
  Date:   December 2014
  Contact:linuxppc-dev@lists.ozlabs.org
  Description:read/write
+(not in a guest)
  Valid entries are "none", "user", and "factory".
  "none" means PERST will not cause image to be loaded to the
  card.  A power cycle is required to load the image.
@@ -235,10 +239,11 @@ Description:write only
  contexts on the card AFUs.
  Users:https://github.com/ibm-capi/libcxl
  
-What:		/sys/class/cxl//perst_reloads_same_image (not in a guest)

+What:  /sys/class/cxl//perst_reloads_same_image
  Date: July 2015
  Contact:  linuxppc-dev@lists.ozlabs.org
  Description:  read/write
+(not in a guest)
Trust that when an image is reloaded via PERST, it will not
have changed.
  



--
Andrew Donnellan  OzLabs, ADL Canberra
a...@linux.ibm.com IBM Australia Limited

Re: [PATCH 1/1] powerpc: Drop superfluous pci_dev_is_added() calls

2021-09-14 Thread Bjorn Helgaas

On Fri, Sep 10, 2021 at 04:19:40PM +0200, Niklas Schnelle wrote:
> On powerpc, pci_dev_is_added() is called as part of SR-IOV fixups
> that are done under pcibios_add_device() which in turn is only called in
> pci_device_add() which is called when a PCI device is scanned.
> 
> Now pci_dev_assign_added() is called in pci_bus_add_device() which is
> only called after scanning the device. Thus pci_dev_is_added() is always
> false and can be dropped.
> 
> Signed-off-by: Niklas Schnelle 

Reviewed-by: Bjorn Helgaas 

This doesn't touch the PCI core, so maybe makes sense for you to take
it, Michael?  But let me know if you think otherwise.

Thanks a lot for cleaning this up, Niklas.

> ---
>  arch/powerpc/platforms/powernv/pci-sriov.c | 6 --
>  arch/powerpc/platforms/pseries/setup.c | 3 +--
>  2 files changed, 1 insertion(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c 
> b/arch/powerpc/platforms/powernv/pci-sriov.c
> index 28aac933a439..deddbb233fde 100644
> --- a/arch/powerpc/platforms/powernv/pci-sriov.c
> +++ b/arch/powerpc/platforms/powernv/pci-sriov.c
> @@ -9,9 +9,6 @@
>  
>  #include "pci.h"
>  
> -/* for pci_dev_is_added() */
> -#include "../../../../drivers/pci/pci.h"
> -
>  /*
>   * The majority of the complexity in supporting SR-IOV on PowerNV comes from
>   * the need to put the MMIO space for each VF into a separate PE. Internally
> @@ -228,9 +225,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
> pci_dev *pdev)
>  
>  void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
>  {
> - if (WARN_ON(pci_dev_is_added(pdev)))
> - return;
> -
>   if (pdev->is_virtfn) {
>   struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
>  
> diff --git a/arch/powerpc/platforms/pseries/setup.c 
> b/arch/powerpc/platforms/pseries/setup.c
> index f79126f16258..2188054470c1 100644
> --- a/arch/powerpc/platforms/pseries/setup.c
> +++ b/arch/powerpc/platforms/pseries/setup.c
> @@ -74,7 +74,6 @@
>  #include 
>  
>  #include "pseries.h"
> -#include "../../../../drivers/pci/pci.h"
>  
>  DEFINE_STATIC_KEY_FALSE(shared_processor);
>  EXPORT_SYMBOL(shared_processor);
> @@ -750,7 +749,7 @@ static void pseries_pci_fixup_iov_resources(struct 
> pci_dev *pdev)
>   const int *indexes;
>   struct device_node *dn = pci_device_to_OF_node(pdev);
>  
> - if (!pdev->is_physfn || pci_dev_is_added(pdev))
> + if (!pdev->is_physfn)
>   return;
>   /*Firmware must support open sriov otherwise dont configure*/
>   indexes = of_get_property(dn, "ibm,open-sriov-vf-bar-info", NULL);
> -- 
> 2.25.1
>

[PATCH] powerpc: clean up UPD_CONSTR

2021-09-14 Thread Nick Desaulniers

UPD_CONSTR was previously a preprocessor define for an old GCC 4.9 inline
asm bug with m<> constraints.
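
Inlining the define is behavior-neutral because adjacent string literals
are concatenated by the compiler, so "m" followed by the macro already
produced the constraint string "m<>". A minimal userspace sketch of that
equivalence follows; the helper function names are hypothetical and exist
only for illustration, and no inline asm is involved:

```c
#include <string.h>

/* The define that the patch removes from asm-const.h. */
#define UPD_CONSTR "<>"

/* Old spelling: two adjacent string literals, joined at compile time. */
static const char *constraint_with_macro(void)
{
	return "m" UPD_CONSTR;
}

/* New spelling: the constraint written out in full. */
static const char *constraint_inlined(void)
{
	return "m<>";
}

/* Returns 1 when both spellings yield the same constraint string. */
static int constraints_match(void)
{
	return strcmp(constraint_with_macro(), constraint_inlined()) == 0;
}
```

Calling constraints_match() is expected to return 1, which is why the
conversion in the hunks below is purely textual.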

Fixes: 6563139d90ad ("powerpc: remove GCC version check for UPD_CONSTR")
Suggested-by: Nathan Chancellor 
Suggested-by: Christophe Leroy 
Suggested-by: Michael Ellerman 
Signed-off-by: Nick Desaulniers 
---
 arch/powerpc/include/asm/asm-const.h | 2 --
 arch/powerpc/include/asm/atomic.h| 8 
 arch/powerpc/include/asm/io.h| 4 ++--
 arch/powerpc/include/asm/uaccess.h   | 6 +++---
 arch/powerpc/kvm/powerpc.c   | 4 ++--
 5 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/asm-const.h 
b/arch/powerpc/include/asm/asm-const.h
index dbfa5e1e3198..bfb3c3534877 100644
--- a/arch/powerpc/include/asm/asm-const.h
+++ b/arch/powerpc/include/asm/asm-const.h
@@ -12,6 +12,4 @@
 #  define ASM_CONST(x) __ASM_CONST(x)
 #endif
 
-#define UPD_CONSTR "<>"
-
 #endif /* _ASM_POWERPC_ASM_CONST_H */
diff --git a/arch/powerpc/include/asm/atomic.h 
b/arch/powerpc/include/asm/atomic.h
index 6a53ef178bfd..fd594fdbd84d 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -27,14 +27,14 @@ static __inline__ int arch_atomic_read(const atomic_t *v)
 {
int t;
 
-   __asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : 
"m"UPD_CONSTR(v->counter));
+   __asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m<>"(v->counter));
 
return t;
 }
 
 static __inline__ void arch_atomic_set(atomic_t *v, int i)
 {
-   __asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"UPD_CONSTR(v->counter) : 
"r"(i));
+   __asm__ __volatile__("stw%U0%X0 %1,%0" : "=m<>"(v->counter) : "r"(i));
 }
 
 #define ATOMIC_OP(op, asm_op)  \
@@ -320,14 +320,14 @@ static __inline__ s64 arch_atomic64_read(const atomic64_t 
*v)
 {
s64 t;
 
-   __asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : 
"m"UPD_CONSTR(v->counter));
+   __asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m<>"(v->counter));
 
return t;
 }
 
 static __inline__ void arch_atomic64_set(atomic64_t *v, s64 i)
 {
-   __asm__ __volatile__("std%U0%X0 %1,%0" : "=m"UPD_CONSTR(v->counter) : 
"r"(i));
+   __asm__ __volatile__("std%U0%X0 %1,%0" : "=m<>"(v->counter) : "r"(i));
 }
 
 #define ATOMIC64_OP(op, asm_op)
\
diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index f130783c8301..beba4979bff9 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -122,7 +122,7 @@ static inline u##size name(const volatile u##size __iomem 
*addr)\
 {  \
u##size ret;\
__asm__ __volatile__("sync;"#insn"%U1%X1 %0,%1;twi 0,%0,0;isync"\
-   : "=r" (ret) : "m"UPD_CONSTR (*addr) : "memory");   \
+   : "=r" (ret) : "m<>" (*addr) : "memory");   \
return ret; \
 }
 
@@ -130,7 +130,7 @@ static inline u##size name(const volatile u##size __iomem 
*addr)\
 static inline void name(volatile u##size __iomem *addr, u##size val)   \
 {  \
__asm__ __volatile__("sync;"#insn"%U0%X0 %1,%0" \
-   : "=m"UPD_CONSTR (*addr) : "r" (val) : "memory");   \
+   : "=m<>" (*addr) : "r" (val) : "memory");   \
mmiowb_set_pending();   \
 }
 
diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 22c79ab40006..63316100080c 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -86,7 +86,7 @@ __pu_failed:  
\
"1: " op "%U1%X1 %0,%1  # put_user\n"   \
EX_TABLE(1b, %l2)   \
:   \
-   : "r" (x), "m"UPD_CONSTR (*addr)\
+   : "r" (x), "m<>" (*addr)\
:   \
: label)
 
@@ -143,7 +143,7 @@ do {
\
"1: "op"%U1%X1 %0, %1   # get_user\n"   \
EX_TABLE(1b, %l2)   \
: "=r" (x)  \
-   : "m"UPD_CONSTR (*addr) \
+   : "m<>" (*addr) \
:   \
: label)
 
@@ -200,7 +200,7 @@ __gus_failed:   
\
".previous\n"

Re: [PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()

2021-09-14 Thread Borislav Petkov

On Wed, Sep 08, 2021 at 05:58:36PM -0500, Tom Lendacky wrote:
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 18fe19916bc3..4b54a2377821 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -144,7 +144,7 @@ void __init sme_unmap_bootdata(char *real_mode_data)
>   struct boot_params *boot_data;
>   unsigned long cmdline_paddr;
>  
> - if (!sme_active())
> + if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
>   return;
>  
>   /* Get the command line address before unmapping the real_mode_data */
> @@ -164,7 +164,7 @@ void __init sme_map_bootdata(char *real_mode_data)
>   struct boot_params *boot_data;
>   unsigned long cmdline_paddr;
>  
> - if (!sme_active())
> + if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
>   return;
>  
>   __sme_early_map_unmap_mem(real_mode_data, sizeof(boot_params), true);
> @@ -377,11 +377,6 @@ bool sev_active(void)
>  {
>   return sev_status & MSR_AMD64_SEV_ENABLED;
>  }
> -
> -bool sme_active(void)
> -{
> - return sme_me_mask && !sev_active();
> -}
>  EXPORT_SYMBOL_GPL(sev_active);
>  
>  /* Needs to be called from non-instrumentable code */

You forgot this hunk:

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 5635ca9a1fbe..a3a2396362a5 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -364,8 +364,9 @@ int __init early_set_memory_encrypted(unsigned long vaddr, 
unsigned long size)
 /*
  * SME and SEV are very similar but they are not the same, so there are
  * times that the kernel will need to distinguish between SME and SEV. The
- * sme_active() and sev_active() functions are used for this.  When a
- * distinction isn't needed, the mem_encrypt_active() function can be used.
+ * PATTR_HOST_MEM_ENCRYPT and PATTR_GUEST_MEM_ENCRYPT flags to
+ * amd_prot_guest_has() are used for this. When a distinction isn't needed,
+ * the mem_encrypt_active() function can be used.
  *
  * The trampoline code is a good example for this requirement.  Before
  * paging is activated, SME will access all memory as decrypted, but SEV

because there's still a sme_active() mentioned there:

$ git grep sme_active
arch/x86/mm/mem_encrypt.c:367: * sme_active() and sev_active() functions are used for this.  When a

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Re: [PATCH] swiotlb: set IO TLB segment size via cmdline

2021-09-14 Thread Jan Beulich

On 14.09.2021 17:10, Roman Skakun wrote:
> From: Roman Skakun 
> 
> The default IO TLB size may not be large enough
> to fit long buffers, as described here [1].
> 
> This patch adds a way to set this parameter on the
> kernel command line instead of recompiling the kernel.
> 
> [1] https://www.xilinx.com/support/answers/72694.html

I'm not convinced the swiotlb use described there falls under "intended
use" - mapping a 1280x720 framebuffer in a single chunk? (As an aside,
the bottom of this page is also confusing, as following "Then we can
confirm the modified swiotlb size in the boot log:" there is a log
fragment showing the same original size of 64Mb.)

> --- a/arch/mips/cavium-octeon/dma-octeon.c
> +++ b/arch/mips/cavium-octeon/dma-octeon.c
> @@ -237,7 +237,7 @@ void __init plat_swiotlb_setup(void)
>   swiotlbsize = 64 * (1<<20);
>  #endif
>   swiotlb_nslabs = swiotlbsize >> IO_TLB_SHIFT;
> - swiotlb_nslabs = ALIGN(swiotlb_nslabs, IO_TLB_SEGSIZE);
> + swiotlb_nslabs = ALIGN(swiotlb_nslabs, swiotlb_io_seg_size());

In order to be sure to catch all uses like this one (including ones
which make it upstream in parallel to yours), I think you will want
to rename the original IO_TLB_SEGSIZE to e.g. IO_TLB_DEFAULT_SEGSIZE.

> @@ -81,15 +86,30 @@ static unsigned int max_segment;
>  static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
>  
>  static int __init
> -setup_io_tlb_npages(char *str)
> +setup_io_tlb_params(char *str)
>  {
> + unsigned long tmp;
> +
>   if (isdigit(*str)) {
> - /* avoid tail segment of size < IO_TLB_SEGSIZE */
> - default_nslabs =
> - ALIGN(simple_strtoul(str, &str, 0), IO_TLB_SEGSIZE);
> + default_nslabs = simple_strtoul(str, &str, 0);
>   }
>   if (*str == ',')
>   ++str;
> +
> + /* get max IO TLB segment size */
> + if (isdigit(*str)) {
> + tmp = simple_strtoul(str, &str, 0);
> + if (tmp)
> + io_tlb_seg_size = ALIGN(tmp, IO_TLB_SEGSIZE);

From all I can tell io_tlb_seg_size wants to be a power of 2. Merely
aligning to a multiple of IO_TLB_SEGSIZE isn't going to be enough.

Jan
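
To make the power-of-2 point concrete, here is a minimal userspace sketch
(not kernel code). It assumes IO_TLB_SEGSIZE is 128; ALIGN_UP,
is_power_of_two() and round_up_pow2() are hypothetical stand-ins written
for this example. In the kernel itself, ALIGN(), is_power_of_2() and
roundup_pow_of_two() would be the natural candidates.

```c
#include <stdbool.h>

/* Assumed value of the kernel's IO_TLB_SEGSIZE, for illustration only. */
#define IO_TLB_SEGSIZE 128UL

/* Userspace stand-in for the kernel's ALIGN(); 'a' must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/* True when exactly one bit of v is set. */
static bool is_power_of_two(unsigned long v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Round v up to the next power of two. Hypothetical helper: assumes v is
 * non-zero and small enough that the result does not overflow.
 */
static unsigned long round_up_pow2(unsigned long v)
{
	unsigned long p = 1;

	while (p < v)
		p <<= 1;
	return p;
}
```

With a requested value of 300, ALIGN_UP(300UL, IO_TLB_SEGSIZE) gives 384,
which is a multiple of 128 but not a power of two, whereas
round_up_pow2(300UL) gives 512.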

Re: [PATCH] pci: Rename pcibios_add_device to match

2021-09-14 Thread Bjorn Helgaas

On Tue, Sep 14, 2021 at 01:27:08AM +1000, Oliver O'Halloran wrote:
> The general convention for pcibios_* hooks is that they're named after
> the corresponding pci_* function they provide a hook for. The exception
> is pcibios_add_device() which provides a hook for pci_device_add(). This
> has been irritating me for years so rename it.
> 
> Also, remove the export of the microblaze version. The only caller
> must be compiled as a built-in so there's no reason for the export.
> 
> Signed-off-by: Oliver O'Halloran 

I fixed up the subject so it matches previous history and applied to
pci/enumeration for v5.16, thanks!

Stuff like this really annoys me, too.

> ---
>  arch/microblaze/pci/pci-common.c   | 3 +--
>  arch/powerpc/kernel/pci-common.c   | 2 +-
>  arch/powerpc/platforms/powernv/pci-sriov.c | 2 +-
>  arch/s390/pci/pci.c| 2 +-
>  arch/sparc/kernel/pci.c| 2 +-
>  arch/x86/pci/common.c  | 2 +-
>  drivers/pci/pci.c  | 4 ++--
>  drivers/pci/probe.c| 4 ++--
>  include/linux/pci.h| 2 +-
>  9 files changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/microblaze/pci/pci-common.c 
> b/arch/microblaze/pci/pci-common.c
> index 557585f1be41..622a4867f9e9 100644
> --- a/arch/microblaze/pci/pci-common.c
> +++ b/arch/microblaze/pci/pci-common.c
> @@ -587,13 +587,12 @@ static void pcibios_fixup_resources(struct pci_dev *dev)
>  }
>  DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pcibios_fixup_resources);
>  
> -int pcibios_add_device(struct pci_dev *dev)
> +int pcibios_device_add(struct pci_dev *dev)
>  {
>   dev->irq = of_irq_parse_and_map_pci(dev, 0, 0);
>  
>   return 0;
>  }
> -EXPORT_SYMBOL(pcibios_add_device);
>  
>  /*
>   * Reparent resource children of pr that conflict with res
> diff --git a/arch/powerpc/kernel/pci-common.c 
> b/arch/powerpc/kernel/pci-common.c
> index c3573430919d..6749905932f4 100644
> --- a/arch/powerpc/kernel/pci-common.c
> +++ b/arch/powerpc/kernel/pci-common.c
> @@ -1059,7 +1059,7 @@ void pcibios_bus_add_device(struct pci_dev *dev)
>   ppc_md.pcibios_bus_add_device(dev);
>  }
>  
> -int pcibios_add_device(struct pci_dev *dev)
> +int pcibios_device_add(struct pci_dev *dev)
>  {
>   struct irq_domain *d;
>  
> diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c 
> b/arch/powerpc/platforms/powernv/pci-sriov.c
> index 28aac933a439..486c2937b159 100644
> --- a/arch/powerpc/platforms/powernv/pci-sriov.c
> +++ b/arch/powerpc/platforms/powernv/pci-sriov.c
> @@ -54,7 +54,7 @@
>   * to "new_size", calculated above. Implementing this is a convoluted process
>   * which requires several hooks in the PCI core:
>   *
> - * 1. In pcibios_add_device() we call pnv_pci_ioda_fixup_iov().
> + * 1. In pcibios_device_add() we call pnv_pci_ioda_fixup_iov().
>   *
>   *At this point the device has been probed and the device's BARs are 
> sized,
>   *but no resource allocations have been done. The SR-IOV BARs are sized
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index e7e6788d75a8..ded3321b7208 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -561,7 +561,7 @@ static void zpci_cleanup_bus_resources(struct zpci_dev 
> *zdev)
>   zdev->has_resources = 0;
>  }
>  
> -int pcibios_add_device(struct pci_dev *pdev)
> +int pcibios_device_add(struct pci_dev *pdev)
>  {
>   struct zpci_dev *zdev = to_zpci(pdev);
>   struct resource *res;
> diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c
> index 9c2b720bfd20..31b0c1983286 100644
> --- a/arch/sparc/kernel/pci.c
> +++ b/arch/sparc/kernel/pci.c
> @@ -1010,7 +1010,7 @@ void pcibios_set_master(struct pci_dev *dev)
>  }
>  
>  #ifdef CONFIG_PCI_IOV
> -int pcibios_add_device(struct pci_dev *dev)
> +int pcibios_device_add(struct pci_dev *dev)
>  {
>   struct pci_dev *pdev;
>  
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index 3507f456fcd0..9e1e6b8d8876 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -632,7 +632,7 @@ static void set_dev_domain_options(struct pci_dev *pdev)
>   pdev->hotplug_user_indicators = 1;
>  }
>  
> -int pcibios_add_device(struct pci_dev *dev)
> +int pcibios_device_add(struct pci_dev *dev)
>  {
>   struct pci_setup_rom *rom;
>   struct irq_domain *msidom;
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index ce2ab62b64cf..c63598c1cdd8 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2091,14 +2091,14 @@ void pcim_pin_device(struct pci_dev *pdev)
>  EXPORT_SYMBOL(pcim_pin_device);
>  
>  /*
> - * pcibios_add_device - provide arch specific hooks when adding device dev
> + * pcibios_device_add - provide arch specific hooks when adding device dev
>   * @dev: the PCI device being added
>   *
>   * Permits the platform to provide architecture specific functionality when
>   * devices are added. This is the default

Re: [5.15-rc1][PPC][bisected 6d2ef226] mainline build breaks at ./include/linux/compiler_attributes.h:62:5: warning: "__has_attribute"

2021-09-14 Thread Nick Desaulniers

On Mon, Sep 13, 2021 at 11:22 PM Stephen Rothwell  wrote:
>
> Hi Abdul,
>
> On Tue, 14 Sep 2021 11:39:44 +0530 Abdul Haleem  
> wrote:
> >
> > Today's mainline kernel fails to compile on my powerpc box with below errors
> >
> > ././include/linux/compiler_attributes.h:62:5: warning: "__has_attribute" is 
> > not defined, evaluates to 0 [-Wundef]
> >   #if __has_attribute(__assume_aligned__)
> >   ^~~
> > ././include/linux/compiler_attributes.h:62:20: error: missing binary 
> > operator before token "("
> >   #if __has_attribute(__assume_aligned__)
> >  ^
> > ././include/linux/compiler_attributes.h:88:5: warning: "__has_attribute" is 
> > not defined, evaluates to 0 [-Wundef]
> >   #if __has_attribute(__copy__)
> >   ^~~
> > ././include/linux/compiler_attributes.h:88:20: error: missing binary 
> > operator before token "("
> >   #if __has_attribute(__copy__)
> >
> > Kernel builds fine when below patch is reverted
> >
> > commit 6d2ef22 : compiler_attributes.h: drop __has_attribute() support for 
> > gcc4
>
> Thanks for your report.
>
> This is known and being addressed.

Thanks for the report. Support for GCC 4.X has been dropped.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=76ae847497bc5207c479de5e2ac487270008b19b
-- 
Thanks,
~Nick Desaulniers
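
For readers puzzled by the diagnostics quoted above: they are what a C
preprocessor emits when __has_attribute is not available, since the name
then evaluates to 0 and the following "(" is a syntax error in the #if
expression. The sketch below shows a generic fallback guard. It is only an
illustration of the mechanism and is not the change that went into
mainline; HAVE_ASSUME_ALIGNED and have_assume_aligned() are hypothetical
names.

```c
/*
 * Fallback for preprocessors that lack __has_attribute: every query
 * then answers "not supported" instead of breaking the #if expression.
 */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* With the guard in place, this #if is well-formed everywhere. */
#if __has_attribute(__assume_aligned__)
# define HAVE_ASSUME_ALIGNED 1
#else
# define HAVE_ASSUME_ALIGNED 0
#endif

/* Reports whether the compiler claimed support for the attribute. */
static int have_assume_aligned(void)
{
	return HAVE_ASSUME_ALIGNED;
}
```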

Re: [RESEND PATCH v4 1/4] drivers/nvdimm: Add nvdimm pmu structure

2021-09-14 Thread Dan Williams

On Tue, Sep 14, 2021 at 9:08 PM Dan Williams  wrote:
>
> On Thu, Sep 9, 2021 at 12:56 AM kajoljain  wrote:
> >
> >
> >
> > On 9/8/21 3:29 AM, Dan Williams wrote:
> > > Hi Kajol,
> > >
> > > Apologies for the delay in responding to this series, some comments below:
> >
> > Hi Dan,
> > No issues, thanks for reviewing the patches.
> >
> > >
> > > On Thu, Sep 2, 2021 at 10:10 PM Kajol Jain  wrote:
> > >>
> > >> A structure is added, called nvdimm_pmu, for performance
> > >> stats reporting support of nvdimm devices. It can be used to add
> > >> nvdimm pmu data such as supported events and pmu event functions
> > >> like event_init/add/read/del with cpu hotplug support.
> > >>
> > >> Acked-by: Peter Zijlstra (Intel) 
> > >> Reviewed-by: Madhavan Srinivasan 
> > >> Tested-by: Nageswara R Sastry 
> > >> Signed-off-by: Kajol Jain 
> > >> ---
> > >>  include/linux/nd.h | 43 +++
> > >>  1 file changed, 43 insertions(+)
> > >>
> > >> diff --git a/include/linux/nd.h b/include/linux/nd.h
> > >> index ee9ad76afbba..712499cf7335 100644
> > >> --- a/include/linux/nd.h
> > >> +++ b/include/linux/nd.h
> > >> @@ -8,6 +8,8 @@
> > >>  #include 
> > >>  #include 
> > >>  #include 
> > >> +#include 
> > >> +#include 
> > >>
> > >>  enum nvdimm_event {
> > >> NVDIMM_REVALIDATE_POISON,
> > >> @@ -23,6 +25,47 @@ enum nvdimm_claim_class {
> > >> NVDIMM_CCLASS_UNKNOWN,
> > >>  };
> > >>
> > >> +/* Event attribute array index */
> > >> +#define NVDIMM_PMU_FORMAT_ATTR 0
> > >> +#define NVDIMM_PMU_EVENT_ATTR  1
> > >> +#define NVDIMM_PMU_CPUMASK_ATTR2
> > >> +#define NVDIMM_PMU_NULL_ATTR   3
> > >> +
> > >> +/**
> > >> + * struct nvdimm_pmu - data structure for nvdimm perf driver
> > >> + *
> > >> + * @name: name of the nvdimm pmu device.
> > >> + * @pmu: pmu data structure for nvdimm performance stats.
> > >> + * @dev: nvdimm device pointer.
> > >> + * @functions(event_init/add/del/read): platform specific pmu functions.
> > >
> > > This is not valid kernel-doc:
> > >
> > > include/linux/nd.h:67: warning: Function parameter or member
> > > 'event_init' not described in 'nvdimm_pmu'
> > > include/linux/nd.h:67: warning: Function parameter or member 'add' not
> > > described in 'nvdimm_pmu'
> > > include/linux/nd.h:67: warning: Function parameter or member 'del' not
> > > described in 'nvdimm_pmu'
> > > include/linux/nd.h:67: warning: Function parameter or member 'read'
> > > not described in 'nvdimm_pmu'
> > >
> > > ...but I think rather than fixing those up 'struct nvdimm_pmu' should be 
> > > pruned.
> > >
> > > It's not clear to me that it is worth the effort to describe these
> > > details to the nvdimm core which is just going to turn around and call
> > > the pmu core. I'd just as soon have the driver call the pmu core
> > > directly, optionally passing in attributes and callbacks that come
> > > from the nvdimm core and/or the nvdimm provider.
> >
> > > The intent of adding these callbacks (event_init/add/del/read) is to give
> > > flexibility to the nvdimm core to add some common checks/routines if
> > > required in the future. Those checks can be common to all architectures
> > > while still having the ability to call arch/platform specific driver code
> > > to use its own routines.
> > >
> > > But as you said, currently we don't have any common checks and it is
> > > directly calling platform specific code, so we can get rid of it.
> > > Should we remove this part for now?
>
> Yes, let's go direct to the perf API for now and await the need for a
> common core wrapper to present itself.
>
> >
> >
> > >
> > > Otherwise it's also not clear which of these structure members are
> > > used at runtime vs purely used as temporary storage to pass parameters
> > > to the pmu core.
> > >
> > >> + * @attr_groups: data structure for events, formats and cpumask
> > >> + * @cpu: designated cpu for counter access.
> > >> + * @node: node for cpu hotplug notifier link.
> > >> + * @cpuhp_state: state for cpu hotplug notification.
> > >> + * @arch_cpumask: cpumask to get designated cpu for counter access.
> > >> + */
> > >> +struct nvdimm_pmu {
> > >> +   const char *name;
> > >> +   struct pmu pmu;
> > >> +   struct device *dev;
> > >> +   int (*event_init)(struct perf_event *event);
> > >> +   int  (*add)(struct perf_event *event, int flags);
> > >> +   void (*del)(struct perf_event *event, int flags);
> > >> +   void (*read)(struct perf_event *event);
> > >> +   /*
> > >> +* Attribute groups for the nvdimm pmu. Index 0 used for
> > >> +* format attribute, index 1 used for event attribute,
> > >> +* index 2 used for cpusmask attribute and index 3 kept as NULL.
> > >> +*/
> > >> +   const struct attribute_group *attr_groups[4];
> > >
> > > Following from above, I'd rather this was organized as static
> > > attributes with an is_visible() helper for the groups for any dynamic
> > > aspects. That

Re: [PATCH v6 4/4] powerpc/64s: Initialize and use a temporary mm for patching on Radix

2021-09-14 Thread Jordan Niethe

On Sat, Sep 11, 2021 at 12:39 PM Christopher M. Riedl
 wrote:
>
> When code patching a STRICT_KERNEL_RWX kernel the page containing the
> address to be patched is temporarily mapped as writeable. Currently, a
> per-cpu vmalloc patch area is used for this purpose. While the patch
> area is per-cpu, the temporary page mapping is inserted into the kernel
> page tables for the duration of patching. The mapping is exposed to CPUs
> other than the patching CPU - this is undesirable from a hardening
> perspective. Use a temporary mm instead which keeps the mapping local to
> the CPU doing the patching.
>
> Use the `poking_init` init hook to prepare a temporary mm and patching
> address. Initialize the temporary mm by copying the init mm. Choose a
> randomized patching address inside the temporary mm userspace address
> space. The patching address is randomized between PAGE_SIZE and
> DEFAULT_MAP_WINDOW-PAGE_SIZE.
>
> Bits of entropy with 64K page size on BOOK3S_64:
>
> bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)
>
> PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB
> bits of entropy = log2(128TB / 64K)
> bits of entropy = 31
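
The arithmetic quoted above can be checked with a small userspace sketch.
The function names are hypothetical, and the inputs are the values stated
in the commit message (128TB window, 64K pages) rather than values read
from kernel headers:

```c
#include <stdint.h>

/* Integer log2; exact when v is a power of two, rounds down otherwise. */
static unsigned int ilog2_u64(uint64_t v)
{
	unsigned int bits = 0;

	while (v > 1) {
		v >>= 1;
		bits++;
	}
	return bits;
}

/* Number of page-aligned slots in the window, expressed in bits. */
static unsigned int patch_addr_entropy_bits(uint64_t map_window,
					    uint64_t page_size)
{
	return ilog2_u64(map_window / page_size);
}
```

Here 128TB is 2^47 and 64K is 2^16, so
patch_addr_entropy_bits(128ULL << 40, 64ULL << 10) evaluates to 31,
matching the figure in the commit message.
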
>
> The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash MMU
> operates - by default the space above DEFAULT_MAP_WINDOW is not
> available. Currently the Hash MMU does not use a temporary mm so
> technically this upper limit isn't necessary; however, a larger
> randomization range does not further "harden" this overall approach and
> future work may introduce patching with a temporary mm on Hash as well.
>
> Randomization occurs only once during initialization at boot for each
> possible CPU in the system.
>
> Introduce two new functions, map_patch_mm() and unmap_patch_mm(), to
> respectively create and remove the temporary mapping with write
> permissions at patching_addr. Map the page with PAGE_KERNEL to set
> EAA[0] for the PTE which ignores the AMR (so no need to unlock/lock
> KUAP) according to PowerISA v3.0b Figure 35 on Radix.
>
> Based on x86 implementation:
>
> commit 4fc19708b165
> ("x86/alternatives: Initialize temporary mm for patching")
>
> and:
>
> commit b3fd8e83ada0
> ("x86/alternatives: Use temporary mm for text poking")
>
> Signed-off-by: Christopher M. Riedl 
>
> ---
>
> v6:  * Small clean-ups (naming, formatting, style, etc).
>      * Call stop_using_temporary_mm() before pte_unmap_unlock() after
>        patching.
>      * Replace BUG_ON()s in poking_init() w/ WARN_ON()s.
>
> v5:  * Only support Book3s64 Radix MMU for now.
>      * Use a per-cpu datastructure to hold the patching_addr and
>        patching_mm to avoid the need for a synchronization lock/mutex.
>
> v4:  * In the previous series this was two separate patches: one to init
>        the temporary mm in poking_init() (unused in powerpc at the time)
>        and the other to use it for patching (which removed all the
>        per-cpu vmalloc code). Now that we use poking_init() in the
>        existing per-cpu vmalloc approach, that separation doesn't work
>        as nicely anymore so I just merged the two patches into one.
>      * Preload the SLB entry and hash the page for the patching_addr
>        when using Hash on book3s64 to avoid taking an SLB and Hash fault
>        during patching. The previous implementation was a hack which
>        changed current->mm to allow the SLB and Hash fault handlers to
>        work with the temporary mm since both of those code-paths always
>        assume mm == current->mm.
>      * Also (hmm - seeing a trend here) with the book3s64 Hash MMU we
>        have to manage the mm->context.active_cpus counter and mm cpumask
>        since they determine (via mm_is_thread_local()) if the TLB flush
>        in pte_clear() is local or not - it should always be local when
>        we're using the temporary mm. On book3s64's Radix MMU we can
>        just call local_flush_tlb_mm().
>      * Use HPTE_USE_KERNEL_KEY on Hash to avoid costly lock/unlock of
>        KUAP.
> ---
>  arch/powerpc/lib/code-patching.c | 119 +--
>  1 file changed, 112 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index e802e42c2789..af8e2a02a9dd 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -11,6 +11,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -103,6 +104,7 @@ static inline void stop_using_temporary_mm(struct temp_mm *temp_mm)
>
>  static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
>  static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
> +static DEFINE_PER_CPU(struct mm_struct *, cpu_patching_mm);
>
>  static int text_area_cpu_up(unsigned int cpu)
>  {
> @@ -126,8 +128,48 @@ static int text_area_cpu_down(unsigned int cpu)
> return 0;
>  }
>
> +static __always_inline void __poking_init_temp_mm(void)
> +{
> +   int cpu;
> +

Re: [RESEND PATCH v4 1/4] drivers/nvdimm: Add nvdimm pmu structure

2021-09-14 Thread Dan Williams

On Thu, Sep 9, 2021 at 12:56 AM kajoljain  wrote:
>
>
>
> On 9/8/21 3:29 AM, Dan Williams wrote:
> > Hi Kajol,
> >
> > Apologies for the delay in responding to this series, some comments below:
>
> Hi Dan,
> No issues, thanks for reviewing the patches.
>
> >
> > On Thu, Sep 2, 2021 at 10:10 PM Kajol Jain  wrote:
> >>
> >> A structure is added, called nvdimm_pmu, for performance
> >> stats reporting support of nvdimm devices. It can be used to add
> >> nvdimm pmu data such as supported events and pmu event functions
> >> like event_init/add/read/del with cpu hotplug support.
> >>
> >> Acked-by: Peter Zijlstra (Intel) 
> >> Reviewed-by: Madhavan Srinivasan 
> >> Tested-by: Nageswara R Sastry 
> >> Signed-off-by: Kajol Jain 
> >> ---
> >>  include/linux/nd.h | 43 +++
> >>  1 file changed, 43 insertions(+)
> >>
> >> diff --git a/include/linux/nd.h b/include/linux/nd.h
> >> index ee9ad76afbba..712499cf7335 100644
> >> --- a/include/linux/nd.h
> >> +++ b/include/linux/nd.h
> >> @@ -8,6 +8,8 @@
> >>  #include 
> >>  #include 
> >>  #include 
> >> +#include 
> >> +#include 
> >>
> >>  enum nvdimm_event {
> >> NVDIMM_REVALIDATE_POISON,
> >> @@ -23,6 +25,47 @@ enum nvdimm_claim_class {
> >> NVDIMM_CCLASS_UNKNOWN,
> >>  };
> >>
> >> +/* Event attribute array index */
> >> +#define NVDIMM_PMU_FORMAT_ATTR 0
> >> +#define NVDIMM_PMU_EVENT_ATTR  1
> >> +#define NVDIMM_PMU_CPUMASK_ATTR 2
> >> +#define NVDIMM_PMU_NULL_ATTR   3
> >> +
> >> +/**
> >> + * struct nvdimm_pmu - data structure for nvdimm perf driver
> >> + *
> >> + * @name: name of the nvdimm pmu device.
> >> + * @pmu: pmu data structure for nvdimm performance stats.
> >> + * @dev: nvdimm device pointer.
> >> + * @functions(event_init/add/del/read): platform specific pmu functions.
> >
> > This is not valid kernel-doc:
> >
> > include/linux/nd.h:67: warning: Function parameter or member
> > 'event_init' not described in 'nvdimm_pmu'
> > include/linux/nd.h:67: warning: Function parameter or member 'add' not
> > described in 'nvdimm_pmu'
> > include/linux/nd.h:67: warning: Function parameter or member 'del' not
> > described in 'nvdimm_pmu'
> > include/linux/nd.h:67: warning: Function parameter or member 'read'
> > not described in 'nvdimm_pmu'
> >
> > ...but I think rather than fixing those up 'struct nvdimm_pmu' should be 
> > pruned.
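
For reference, kernel-doc expects one `@member:` line per struct member, function pointers included, so the grouped `@functions(...)` line is what triggers the warnings above. A sketch of the shape that would satisfy it, using a hypothetical cut-down struct (`nvdimm_pmu_cb_example` is not part of the patch, and the member descriptions are illustrative wording only):

```c
#include <stddef.h>

struct perf_event;	/* opaque here; the real definition lives in the perf core */

/**
 * struct nvdimm_pmu_cb_example - hypothetical subset of the callbacks above
 *
 * @event_init: platform specific event initialisation.
 * @add: platform specific routine to add an event.
 * @del: platform specific routine to delete an event.
 * @read: platform specific routine to read an event counter.
 */
struct nvdimm_pmu_cb_example {
	int (*event_init)(struct perf_event *event);
	int  (*add)(struct perf_event *event, int flags);
	void (*del)(struct perf_event *event, int flags);
	void (*read)(struct perf_event *event);
};
```
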
> >
> > It's not clear to me that it is worth the effort to describe these
> > details to the nvdimm core which is just going to turn around and call
> > the pmu core. I'd just as soon have the driver call the pmu core
> > directly, optionally passing in attributes and callbacks that come
> > from the nvdimm core and/or the nvdimm provider.
>
> The intent of adding these callbacks (event_init/add/del/read) is to give
> the nvdimm core the flexibility to add common checks/routines if required
> in the future. Those checks can be common to all architectures while still
> retaining the ability to call arch/platform-specific driver code that uses
> its own routines.
>
> But as you said, we currently don't have any common checks and it directly
> calls platform-specific code, so we can get rid of it.
> Should we remove this part for now?

Yes, let's go direct to the perf API for now and await the need for a
common core wrapper to present itself.

>
>
> >
> > Otherwise it's also not clear which of these structure members are
> > used at runtime vs purely used as temporary storage to pass parameters
> > to the pmu core.
> >
> >> + * @attr_groups: data structure for events, formats and cpumask
> >> + * @cpu: designated cpu for counter access.
> >> + * @node: node for cpu hotplug notifier link.
> >> + * @cpuhp_state: state for cpu hotplug notification.
> >> + * @arch_cpumask: cpumask to get designated cpu for counter access.
> >> + */
> >> +struct nvdimm_pmu {
> >> +   const char *name;
> >> +   struct pmu pmu;
> >> +   struct device *dev;
> >> +   int (*event_init)(struct perf_event *event);
> >> +   int  (*add)(struct perf_event *event, int flags);
> >> +   void (*del)(struct perf_event *event, int flags);
> >> +   void (*read)(struct perf_event *event);
> >> +   /*
> >> +* Attribute groups for the nvdimm pmu. Index 0 used for
> >> +* format attribute, index 1 used for event attribute,
> >> +* index 2 used for cpumask attribute and index 3 kept as NULL.
> >> +*/
> >> +   const struct attribute_group *attr_groups[4];
> >
> > Following from above, I'd rather this was organized as static
> > attributes with an is_visible() helper for the groups for any dynamic
> > aspects. That mirrors the behavior of nvdimm_create() and allows for
> > device drivers to compose the attribute groups from a core set and /
> > or a provider specific set.
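
To illustrate the static-attributes-plus-is_visible() idea in isolation, here is a userspace model. Everything prefixed `ex_` is a hypothetical stand-in, not a kernel type or anything from this series; it only mimics the sysfs convention that an is_visible() callback returns the mode to expose an attribute with, or 0 to hide it.

```c
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins; names are hypothetical, not the kernel's types. */
struct ex_attr {
	const char *name;
	unsigned int mode;	/* permission bits, e.g. 0444 */
};

struct ex_provider {
	int has_cpumask;	/* dynamic property, known only at probe time */
};

struct ex_group {
	const char *name;
	/* Return the mode to expose the attribute with, or 0 to hide it. */
	unsigned int (*is_visible)(const struct ex_provider *p,
				   const struct ex_attr *a, int idx);
	const struct ex_attr *const *attrs;	/* NULL-terminated */
};

static const struct ex_attr ex_format = { "format", 0444 };
static const struct ex_attr ex_cpumask = { "cpumask", 0444 };
static const struct ex_attr *const ex_attrs[] = {
	&ex_format, &ex_cpumask, NULL
};

static unsigned int ex_is_visible(const struct ex_provider *p,
				  const struct ex_attr *a, int idx)
{
	(void)idx;
	if (!strcmp(a->name, "cpumask") && !p->has_cpumask)
		return 0;
	return a->mode;
}

static const struct ex_group ex_pmu_group = {
	.name = "pmu",
	.is_visible = ex_is_visible,
	.attrs = ex_attrs,
};

/* Count the attributes that would be created for this provider. */
static int ex_count_visible(const struct ex_group *g,
			    const struct ex_provider *p)
{
	int i, n = 0;

	for (i = 0; g->attrs[i]; i++)
		if (!g->is_visible || g->is_visible(p, g->attrs[i], i))
			n++;
	return n;
}
```

The attribute table stays static and const; only the callback consults per-provider state, so a provider without a cpumask simply ends up with one attribute fewer.
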
>
> Since we don't have any common events right now, Can I use papr
> attributes directly or should we create dummy events for common thing and
>
