[PATCH] powerpc/32: Preserve cr1 in exception prolog stack check to fix build error

2021-02-07 Thread Christophe Leroy
THREAD_ALIGN_SHIFT = THREAD_SHIFT + 1 = PAGE_SHIFT + 1
Maximum PAGE_SHIFT is 18 for 256k pages so
THREAD_ALIGN_SHIFT is 19 at the maximum.

There is no need to clobber cr1: it can be preserved when moving r1
into the CR while checking for stack overflow.
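
For illustration, a tiny stand-alone sketch (plain user-space C, not kernel
code) of the CR-field arithmetic behind the smaller mask: the
"bt 32 - THREAD_ALIGN_SHIFT" test reads CR bit 32 - THREAD_ALIGN_SHIFT, and
with THREAD_ALIGN_SHIFT at most 19 that bit is at least 13, i.e. in CR3 or
higher, so a 0x3f mask (CR2..CR7) still covers it while leaving CR1
untouched. The loop range below is just illustrative.

#include <assert.h>
#include <stdio.h>

int main(void)
{
	int shift;

	for (shift = 13; shift <= 19; shift++) {	/* THREAD_ALIGN_SHIFT candidates */
		int cr_bit = 32 - shift;		/* CR bit tested by "bt" */
		int cr_field = cr_bit / 4;		/* which of CR0..CR7 it lives in */
		unsigned int fxm = 0x3f;		/* new mtcrf mask: CR2..CR7 */

		/* mtcrf FXM bit for field n is bit (7 - n) of the 8-bit mask */
		assert(fxm & (1u << (7 - cr_field)));
		printf("THREAD_ALIGN_SHIFT=%d -> CR bit %d (CR%d) is covered by 0x3f\n",
		       shift, cr_bit, cr_field);
	}
	return 0;
}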

This reduces the number of instructions in the Machine Check Exception
prolog and fixes a build failure reported by the kernel test robot
on v5.10 stable when building with RTAS + VMAP_STACK + KVM. That
build failure is caused by the prolog having too many instructions to
fit between 0x200 and 0x300. Although the problem doesn't show up in
mainline, the change is still worthwhile.

Reported-by: kernel test robot 
Fixes: 98bf2d3f4970 ("powerpc/32s: Fix RTAS machine check with VMAP stack")
Cc: sta...@vger.kernel.org
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_32.h| 2 +-
 arch/powerpc/kernel/head_book3s_32.S | 6 --
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index a2f72c966baf..abc7b603ab65 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -47,7 +47,7 @@
lwz r1,TASK_STACK-THREAD(r1)
addi r1, r1, THREAD_SIZE - INT_FRAME_SIZE
 1:
-   mtcrf   0x7f, r1
+   mtcrf   0x3f, r1
bt  32 - THREAD_ALIGN_SHIFT, stack_overflow
 #else
subi r11, r1, INT_FRAME_SIZE /* use r1 if kernel */
diff --git a/arch/powerpc/kernel/head_book3s_32.S 
b/arch/powerpc/kernel/head_book3s_32.S
index 54140f4927e5..10e6aa88b1ff 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -278,12 +278,6 @@ MachineCheck:
 7: EXCEPTION_PROLOG_2
addi r3,r1,STACK_FRAME_OVERHEAD
 #ifdef CONFIG_PPC_CHRP
-#ifdef CONFIG_VMAP_STACK
-   mfspr   r4, SPRN_SPRG_THREAD
-   tovirt(r4, r4)
-   lwz r4, RTAS_SP(r4)
-   cmpwi   cr1, r4, 0
-#endif
beq cr1, machine_check_tramp
twi 31, 0, 0
 #else
-- 
2.25.0



[PATCH] powerpc/64s: Remove EXSLB interrupt save area

2021-02-07 Thread Nicholas Piggin
SLB faults should not be taken while the PACA save areas are live. Before
C code is called or any other accesses are made, all memory accesses
should be instruction fetches from the kernel text, or accesses to the
PACA and the current kernel stack.

All of these have pinned SLB entries, so they will not take an SLB fault.
Therefore EXSLB is not required.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/paca.h  | 3 +--
 arch/powerpc/kernel/asm-offsets.c| 1 -
 arch/powerpc/kernel/exceptions-64s.S | 5 -
 3 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 9454d29ff4b4..be0b00cb9fbb 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -108,8 +108,7 @@ struct paca_struct {
 */
/* used for most interrupts/exceptions */
u64 exgen[EX_SIZE] __attribute__((aligned(0x80)));
-   u64 exslb[EX_SIZE]; /* used for SLB/segment table misses
-* on the linear mapping */
+
/* SLB related definitions */
u16 vmalloc_sllp;
u8 slb_cache_ptr;
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index b12d7c049bfe..31edd9bbce75 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -255,7 +255,6 @@ int main(void)
 #endif /* CONFIG_PPC_MM_SLICES */
OFFSET(PACA_EXGEN, paca_struct, exgen);
OFFSET(PACA_EXMC, paca_struct, exmc);
-   OFFSET(PACA_EXSLB, paca_struct, exslb);
OFFSET(PACA_EXNMI, paca_struct, exnmi);
 #ifdef CONFIG_PPC_PSERIES
OFFSET(PACALPPACAPTR, paca_struct, lppaca_ptr);
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index dad35b59bcfb..27fa80248406 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1392,13 +1392,9 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
  *   on user-handler data structures.
  *
  *   KVM: Same as 0x300, DSLB must test for KVM guest.
- *
- * A dedicated save area EXSLB is used (XXX: but it actually need not be
- * these days, we could use EXGEN).
  */
 INT_DEFINE_BEGIN(data_access_slb)
IVEC=0x380
-   IAREA=PACA_EXSLB
IDAR=1
IKVM_SKIP=1
IKVM_REAL=1
@@ -1481,7 +1477,6 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
  */
 INT_DEFINE_BEGIN(instruction_access_slb)
IVEC=0x480
-   IAREA=PACA_EXSLB
IISIDE=1
IDAR=1
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
-- 
2.23.0



[PATCH] powerpc/64s: syscall real mode entry use mtmsrd rather than rfid

2021-02-07 Thread Nicholas Piggin
Have the real mode system call entry handler branch to the kernel
0xc000... address and then use mtmsrd to enable the MMU, rather than use
SRRs and rfid.

Commit 8729c26e675c ("powerpc/64s/exception: Move real to virt switch
into the common handler") implemented this style of real mode entry for
other interrupt handlers, so this brings system calls into line with
them, which is the main motivation for the change.

This tends to be slightly faster due to avoiding the mtsprs, and it also
does not clobber the SRR registers, which becomes important in a
subsequent change. The real mode entry points don't tend to be too
important for performance these days, but it is possible for a
hypervisor to run guests in AIL=0 mode for certain reasons.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kernel/entry_64.S   | 6 ++
 arch/powerpc/kernel/exceptions-64s.S | 9 +++--
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 33ddfeef4fe9..993ed95ed602 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -225,6 +225,12 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_emulate)
b   system_call_vectored_common
 #endif
 
+   .balign IFETCH_ALIGN_BYTES
+   .globl system_call_common_real
+system_call_common_real:
+   ld  r10,PACAKMSR(r13)   /* get MSR value for kernel */
+   mtmsrd  r10
+
.balign IFETCH_ALIGN_BYTES
.globl system_call_common
 system_call_common:
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 5478ffa85603..dad35b59bcfb 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1905,12 +1905,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
HMT_MEDIUM
 
.if ! \virt
-   __LOAD_HANDLER(r10, system_call_common)
-   mtspr   SPRN_SRR0,r10
-   ld  r10,PACAKMSR(r13)
-   mtspr   SPRN_SRR1,r10
-   RFI_TO_KERNEL
-   b   .   /* prevent speculative execution */
+   __LOAD_HANDLER(r10, system_call_common_real)
+   mtctr   r10
+   bctr
.else
li  r10,MSR_RI
mtmsrd  r10,1   /* Set RI (EE=0) */
-- 
2.23.0



Re: [RFC Qemu PATCH v2 1/2] spapr: drc: Add support for async hcalls at the drc level

2021-02-07 Thread David Gibson
On Tue, Jan 19, 2021 at 12:40:31PM +0530, Shivaprasad G Bhat wrote:
> Thanks for the comments!
> 
> 
> On 12/28/20 2:08 PM, David Gibson wrote:
> 
> > On Mon, Dec 21, 2020 at 01:08:53PM +0100, Greg Kurz wrote:
> ...
> > > The overall idea looks good but I think you should consider using
> > > a thread pool to implement it. See below.
> > I am not convinced, however.  Specifically, attaching this to the DRC
> > doesn't make sense to me.  We're adding exactly one DRC related async
> > hcall, and I can't really see much call for another one.  We could
> > have other async hcalls - indeed we already have one for HPT resizing
> > - but attaching this to DRCs doesn't help for those.
> 
> The semantics of the hcall made me think, if this is going to be
> re-usable for future if implemented at DRC level.

It would only be re-usable for operations that are actually connected
to DRCs.  It doesn't seem to me particularly likely that we'll ever
have more asynchronous hcalls that are also associated with DRCs.

> Other option
> is to move the async-hcall-state/list into the NVDIMMState structure
> in include/hw/mem/nvdimm.h and handle it with machine->nvdimms_state
> at a global level.

I'm ok with either of two options:

A) Implement this ad-hoc for this specific case, making whatever
simplifications you can based on this specific case.

B) Implement a general mechanism for async hcalls that is *not* tied
to DRCs.  Then use that for the existing H_RESIZE_HPT_PREPARE call as
well as this new one.
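
For concreteness, a minimal, QEMU-independent sketch of what a generic
mechanism along the lines of option (B) could look like: a table of pending
hcalls keyed by token, completed by a worker thread and polled by a "query
status" hcall. All names here are illustrative assumptions, not existing
spapr code.

#include <pthread.h>
#include <stdint.h>
#include <stdbool.h>

enum async_hcall_state { ASYNC_PENDING, ASYNC_DONE };

struct async_hcall {
	uint64_t token;
	enum async_hcall_state state;
	int64_t result;			/* value returned to the guest when done */
	struct async_hcall *next;
};

static pthread_mutex_t async_lock = PTHREAD_MUTEX_INITIALIZER;
static struct async_hcall *async_list;

/* Called from the hcall handler: record the request; a worker does the slow part. */
static void async_hcall_start(struct async_hcall *call, uint64_t token)
{
	call->token = token;
	call->state = ASYNC_PENDING;
	pthread_mutex_lock(&async_lock);
	call->next = async_list;
	async_list = call;
	pthread_mutex_unlock(&async_lock);
}

/* Called by the worker thread once the slow operation has finished. */
static void async_hcall_complete(struct async_hcall *call, int64_t result)
{
	pthread_mutex_lock(&async_lock);
	call->result = result;
	call->state = ASYNC_DONE;
	pthread_mutex_unlock(&async_lock);
}

/* Backs a "query status" hcall: returns true and the result once complete. */
static bool async_hcall_poll(uint64_t token, int64_t *result)
{
	struct async_hcall *c;
	bool done = false;

	pthread_mutex_lock(&async_lock);
	for (c = async_list; c; c = c->next) {
		if (c->token == token && c->state == ASYNC_DONE) {
			*result = c->result;
			done = true;
			break;
		}
	}
	pthread_mutex_unlock(&async_lock);
	return done;
}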

> Hope you are okay with using the pool based approach that Greg

Honestly a thread pool seems like it might be overkill for this
application.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH v5 03/10] powerpc/signal64: Move non-inline functions out of setup_sigcontext()

2021-02-07 Thread Daniel Axtens
Hi Chris,

These two paragraphs are a little confusing and they seem slightly
repetitive. But I get the general idea. Two specific comments below:

> There are non-inline functions which get called in setup_sigcontext() to
> save register state to the thread struct. Move these functions into a
> separate prepare_setup_sigcontext() function so that
> setup_sigcontext() can be refactored later into an "unsafe" version
> which assumes an open uaccess window. Non-inline functions should be
> avoided when uaccess is open.

Why do we want to avoid non-inline functions? We came up with:

 - we want KUAP protection for as much of the kernel as possible: each
   extra bit of code run with the window open is another piece of attack
   surface.
   
 - non-inline functions default to traceable, which means we could end
   up ftracing while uaccess is enabled. That's a pretty big hole in the
   defences that KUAP provides.

I think we've also had problems with the window being opened or closed
unexpectedly by various bits of code? So the less code runs in uaccess
context the less likely that is to occur.
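
To make the "window open" constraint concrete, here is a minimal sketch
(illustrative function name, not the actual signal64 code) of the pattern the
series moves towards: everything between user_access_begin() and
user_access_end() runs with the user-access (KUAP) window open, so it has to
stay small, inline and non-traceable.

#include <linux/uaccess.h>

/* Hypothetical helper, for illustration only. */
static long copy_words_to_user(u64 __user *dst, const u64 *src, int n)
{
	int i;

	if (!user_access_begin(dst, n * sizeof(u64)))
		return -EFAULT;

	/* KUAP window is open from here to user_access_end(): no calls out. */
	for (i = 0; i < n; i++)
		unsafe_put_user(src[i], &dst[i], efault);

	user_access_end();
	return 0;

efault:
	user_access_end();
	return -EFAULT;
}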
 
> The majority of setup_sigcontext() can be refactored to execute in an
> "unsafe" context (uaccess window is opened) except for some non-inline
> functions. Move these out into a separate prepare_setup_sigcontext()
> function which must be called first and before opening up a uaccess
> window. A follow-up commit converts setup_sigcontext() to be "unsafe".

This was a bit confusing until we realise that you're moving the _calls_
to the non-inline functions out, not the non-inline functions themselves.

> Signed-off-by: Christopher M. Riedl 
> ---
>  arch/powerpc/kernel/signal_64.c | 32 +---
>  1 file changed, 21 insertions(+), 11 deletions(-)
>
> diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> index f9e4a1ac440f..b211a8ea4f6e 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -79,6 +79,24 @@ static elf_vrreg_t __user *sigcontext_vmx_regs(struct 
> sigcontext __user *sc)
>  }
>  #endif
>  
> +static void prepare_setup_sigcontext(struct task_struct *tsk, int 
> ctx_has_vsx_region)

ctx_has_vsx_region should probably be a bool? Although setup_sigcontext
also has it as an int so I guess that's arguable, and maybe it's better
to stick with this for consistency.

> +{
> +#ifdef CONFIG_ALTIVEC
> + /* save altivec registers */
> + if (tsk->thread.used_vr)
> + flush_altivec_to_thread(tsk);
> + if (cpu_has_feature(CPU_FTR_ALTIVEC))
> + tsk->thread.vrsave = mfspr(SPRN_VRSAVE);
> +#endif /* CONFIG_ALTIVEC */
> +
> + flush_fp_to_thread(tsk);
> +
> +#ifdef CONFIG_VSX
> + if (tsk->thread.used_vsr && ctx_has_vsx_region)
> + flush_vsx_to_thread(tsk);
> +#endif /* CONFIG_VSX */

Alternatively, given that this is the only use of ctx_has_vsx_region,
mpe suggested that perhaps we could drop it entirely and always
flush_vsx if used_vsr. The function is only ever called with either
`current` or with ctx_has_vsx_region set to 1, so in either case I think
that's safe? I'm not sure if it would have performance implications.

Should we move this and the altivec ifdef to IS_ENABLED(CONFIG_VSX) etc?
I'm not sure if that runs into any problems with things like 'used_vsr'
only being defined if CONFIG_VSX is set, but I thought I'd ask.
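
In case it helps, a rough sketch of what the IS_ENABLED() form of the new
helper could look like (dropping ctx_has_vsx_region as suggested). This is
illustrative only and assumes the used_vr/used_vsr/vrsave fields exist in all
configurations, which is exactly the open question above - if they stay under
#ifdef, this does not build.

static void prepare_setup_sigcontext(struct task_struct *tsk)
{
	if (IS_ENABLED(CONFIG_ALTIVEC)) {
		/* save altivec registers */
		if (tsk->thread.used_vr)
			flush_altivec_to_thread(tsk);
		if (cpu_has_feature(CPU_FTR_ALTIVEC))
			tsk->thread.vrsave = mfspr(SPRN_VRSAVE);
	}

	flush_fp_to_thread(tsk);

	if (IS_ENABLED(CONFIG_VSX) && tsk->thread.used_vsr)
		flush_vsx_to_thread(tsk);
}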


> +}
> +
>  /*
>   * Set up the sigcontext for the signal frame.
>   */
> @@ -97,7 +115,6 @@ static long setup_sigcontext(struct sigcontext __user *sc,
>*/
>  #ifdef CONFIG_ALTIVEC
>   elf_vrreg_t __user *v_regs = sigcontext_vmx_regs(sc);
> - unsigned long vrsave;
>  #endif
>   struct pt_regs *regs = tsk->thread.regs;
>   unsigned long msr = regs->msr;
> @@ -112,7 +129,6 @@ static long setup_sigcontext(struct sigcontext __user *sc,
>  
>   /* save altivec registers */
>   if (tsk->thread.used_vr) {
> - flush_altivec_to_thread(tsk);
>   /* Copy 33 vec registers (vr0..31 and vscr) to the stack */
> err |= __copy_to_user(v_regs, &tsk->thread.vr_state,
> 33 * sizeof(vector128));
> @@ -124,17 +140,10 @@ static long setup_sigcontext(struct sigcontext __user 
> *sc,
>   /* We always copy to/from vrsave, it's 0 if we don't have or don't
>* use altivec.
>*/
> - vrsave = 0;
> - if (cpu_has_feature(CPU_FTR_ALTIVEC)) {
> - vrsave = mfspr(SPRN_VRSAVE);
> - tsk->thread.vrsave = vrsave;
> - }
> -
> - err |= __put_user(vrsave, (u32 __user *)&v_regs[33]);
> + err |= __put_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33]);

Previously, if !cpu_has_feature(ALTIVEC), v_regs[33] had vrsave stored,
which was set to 0 explicitly. Now we store thread.vrsave instead of the
local vrsave. That should be safe - it is initialised to 0 elsewhere.

So you don't have to do anything here, this is just letting you know
that we checked 

Re: [PATCH v16 11/12] powerpc: Use OF alloc and free for FDT

2021-02-07 Thread Michael Ellerman
Rob Herring  writes:
> On Thu, Feb 4, 2021 at 10:42 AM Lakshmi Ramasubramanian
>  wrote:
...
>> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
>> index d0e459bb2f05..51d2d8eb6c1b 100644
>> --- a/arch/powerpc/kexec/elf_64.c
>> +++ b/arch/powerpc/kexec/elf_64.c
>> @@ -19,6 +19,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -32,7 +33,7 @@ static void *elf64_load(struct kimage *image, char 
>> *kernel_buf,
>> unsigned int fdt_size;
>> unsigned long kernel_load_addr;
>> unsigned long initrd_load_addr = 0, fdt_load_addr;
>> -   void *fdt;
>> +   void *fdt = NULL;
>> const void *slave_code;
>> struct elfhdr ehdr;
>> char *modified_cmdline = NULL;
>> @@ -103,18 +104,12 @@ static void *elf64_load(struct kimage *image, char 
>> *kernel_buf,
>> }
>>
>> fdt_size = fdt_totalsize(initial_boot_params) * 2;
>> -   fdt = kmalloc(fdt_size, GFP_KERNEL);
>> +   fdt = of_alloc_and_init_fdt(fdt_size);
>> if (!fdt) {
>> pr_err("Not enough memory for the device tree.\n");
>> ret = -ENOMEM;
>> goto out;
>> }
>> -   ret = fdt_open_into(initial_boot_params, fdt, fdt_size);
>> -   if (ret < 0) {
>> -   pr_err("Error setting up the new device tree.\n");
>> -   ret = -EINVAL;
>> -   goto out;
>> -   }
>>
>> ret = setup_new_fdt_ppc64(image, fdt, initrd_load_addr,
>
> The first thing this function does is call setup_new_fdt() which first
> calls of_kexec_setup_new_fdt(). (Note, I really don't understand the
> PPC code split. It looks like there's a 32-bit and 64-bit split, but
> 32-bit looks broken to me. Nothing ever calls setup_new_fdt() except
> setup_new_fdt_ppc64()).

I think that's because 32-bit doesn't support kexec_file_load().

cheers


[PATCH] ASoC: imx-hdmi: no need to set .owner when using module_platform_driver

2021-02-07 Thread Tian Tao
module_platform_driver() will call platform_driver_register(),
which sets .owner to THIS_MODULE.
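
For reference, a simplified view (not the exact kernel macro text) of why the
explicit assignment is redundant: module_platform_driver() expands to
init/exit functions built around platform_driver_register(), and the
registration path already stores THIS_MODULE as the owner.

/* include/linux/platform_device.h (simplified) */
#define module_platform_driver(__platform_driver) \
	module_driver(__platform_driver, platform_driver_register, \
		      platform_driver_unregister)

#define platform_driver_register(drv) \
	__platform_driver_register(drv, THIS_MODULE)
/* ... and __platform_driver_register() sets drv->driver.owner = owner. */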

Signed-off-by: Tian Tao 
---
 sound/soc/fsl/imx-hdmi.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/sound/soc/fsl/imx-hdmi.c b/sound/soc/fsl/imx-hdmi.c
index dbbb7618..cd0235a 100644
--- a/sound/soc/fsl/imx-hdmi.c
+++ b/sound/soc/fsl/imx-hdmi.c
@@ -223,7 +223,6 @@ MODULE_DEVICE_TABLE(of, imx_hdmi_dt_ids);
 static struct platform_driver imx_hdmi_driver = {
.driver = {
.name = "imx-hdmi",
-   .owner = THIS_MODULE,
.pm = &snd_soc_pm_ops,
.of_match_table = imx_hdmi_dt_ids,
},
-- 
2.7.4



[PATCH v3 2/2] selftests/powerpc: Test for spurious kernel memory faults on radix

2021-02-07 Thread Jordan Niethe
Previously when mapping kernel memory on radix, no ptesync was included
which would periodically lead to unhandled spurious faults. Mapping
kernel memory is used when code patching with Strict RWX enabled.  As
suggested by Chris Riedl, turning ftrace on and off does a large amount
of code patching so is a convenient way to see this kind of fault.

Add a selftest that tries to trigger this kind of spurious fault. It
runs for 30 seconds, which is usually long enough for the issue to show
up.

Signed-off-by: Jordan Niethe 
---
v3: New to series
---
 tools/testing/selftests/powerpc/mm/Makefile   |  1 +
 .../selftests/powerpc/mm/spurious_fault.sh| 49 +++
 2 files changed, 50 insertions(+)
 create mode 100755 tools/testing/selftests/powerpc/mm/spurious_fault.sh

diff --git a/tools/testing/selftests/powerpc/mm/Makefile 
b/tools/testing/selftests/powerpc/mm/Makefile
index defe488d6bf1..56c2896bed53 100644
--- a/tools/testing/selftests/powerpc/mm/Makefile
+++ b/tools/testing/selftests/powerpc/mm/Makefile
@@ -5,6 +5,7 @@ noarg:
 TEST_GEN_PROGS := hugetlb_vs_thp_test subpage_prot prot_sao segv_errors 
wild_bctr \
  large_vm_fork_separation bad_accesses pkey_exec_prot \
  pkey_siginfo stack_expansion_signal stack_expansion_ldst
+TEST_PROGS := spurious_fault.sh
 
 TEST_GEN_PROGS_EXTENDED := tlbie_test
 TEST_GEN_FILES := tempfile
diff --git a/tools/testing/selftests/powerpc/mm/spurious_fault.sh 
b/tools/testing/selftests/powerpc/mm/spurious_fault.sh
new file mode 100755
index ..e454509659f6
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mm/spurious_fault.sh
@@ -0,0 +1,49 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+TIMEOUT=30
+
+DEBUFS_DIR=`cat /proc/mounts | grep debugfs | awk '{print $2}'`
+if [ ! -e "$DEBUFS_DIR" ]
+then
+   echo "debugfs not found, skipping" 1>&2
+   exit 4
+fi
+
+if [ ! -e "$DEBUFS_DIR/tracing/current_tracer" ]
+then
+   echo "Tracing files not found, skipping" 1>&2
+   exit 4
+fi
+
+
+echo "Testing for spurious faults when mapping kernel memory..."
+
+if grep -q "FUNCTION TRACING IS CORRUPTED" "$DEBUFS_DIR/tracing/trace"
+then
+   echo "FAILED: Ftrace already dead. Probably due to a spurious fault" 
1>&2
+   exit 1
+fi
+
+dmesg -C
+START_TIME=`date +%s`
+END_TIME=`expr $START_TIME + $TIMEOUT`
+while [ `date +%s` -lt $END_TIME ]
+do
+   echo function > $DEBUFS_DIR/tracing/current_tracer
+   echo nop > $DEBUFS_DIR/tracing/current_tracer
+   if dmesg | grep -q 'ftrace bug'
+   then
+   break
+   fi
+done
+
+echo nop > $DEBUFS_DIR/tracing/current_tracer
+if dmesg | grep -q 'ftrace bug'
+then
+   echo "FAILED: Mapping kernel memory causes spurious faults" 1>&2
+   exit 1
+else
+   echo "OK: Mapping kernel memory does not cause spurious faults"
+   exit 0
+fi
-- 
2.25.1



[PATCH v3 1/2] powerpc/64s: Fix pte update for kernel memory on radix

2021-02-07 Thread Jordan Niethe
When adding a pte a ptesync is needed to order the update of the pte
with subsequent accesses otherwise a spurious fault may be raised.

radix__set_pte_at() does not do this for performance gains. For
non-kernel memory this is not an issue as any faults of this kind are
corrected by the page fault handler.  For kernel memory these faults are
not handled.  The current solution is that there is a ptesync in
flush_cache_vmap() which should be called when mapping from the vmalloc
region.

However, map_kernel_page() does not call flush_cache_vmap(). This is
troublesome in particular for code patching with Strict RWX on radix. In
do_patch_instruction() the page frame that contains the instruction to
be patched is mapped and then immediately patched. With no ordering or
synchronization between setting up the pte and writing to the page, it
is possible for faults to occur.
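
Here is a minimal sketch of the ordering requirement (illustrative function
name, not the actual fix): once the pte store is published, a ptesync is
needed before storing through the new mapping, otherwise the store can take a
spurious fault that is not handled for kernel addresses.

/* Illustrative only: the shape of the required ordering. */
static void map_and_patch(pte_t *ptep, pte_t pte, u32 *patch_addr, u32 insn)
{
	/* install the kernel mapping for the page to be patched */
	set_pte_at(&init_mm, (unsigned long)patch_addr, ptep, pte);

	/* order the pte update against the following store on radix */
	asm volatile("ptesync" ::: "memory");

	*patch_addr = insn;		/* now safe from spurious faults */
}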

As the code patching is done using __put_user_asm_goto() the resulting
fault is obscured - but using a normal store instead it can be seen:

[  418.498768][  T757] BUG: Unable to handle kernel data access on write at 
0xc00808f24a3c
[  418.498790][  T757] Faulting instruction address: 0xc008bd74
[  418.498805][  T757] Oops: Kernel access of bad area, sig: 11 [#1]
[  418.498828][  T757] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
[  418.498843][  T757] Modules linked in: nop_module(PO+) [last unloaded: 
nop_module]
[  418.498872][  T757] CPU: 4 PID: 757 Comm: sh Tainted: P   O  
5.10.0-rc5-01361-ge3c1b78c8440-dirty #43
[  418.498936][  T757] NIP:  c008bd74 LR: c008bd50 CTR: 
c0025810
[  418.498979][  T757] REGS: c00016f634a0 TRAP: 0300   Tainted: P   
O   (5.10.0-rc5-01361-ge3c1b78c8440-dirty)
[  418.499033][  T757] MSR:  90009033   CR: 
44002884  XER: 
[  418.499084][  T757] CFAR: c007c68c DAR: c00808f24a3c DSISR: 
4200 IRQMASK: 1

This results in the kind of issue reported here:
https://lore.kernel.org/linuxppc-dev/15ac5b0e-a221-4b8c-9039-fa96b8ef7...@lca.pw/

Chris Riedl suggested a reliable way to reproduce the issue:
$ mount -t debugfs none /sys/kernel/debug
$ (while true; do echo function > /sys/kernel/debug/tracing/current_tracer ; 
echo nop > /sys/kernel/debug/tracing/current_tracer ; done)&

Turning ftrace on and off does a large amount of code patching, which
usually crashes in less than 5 minutes, giving a trace like:

[  146.668988][  T809] ftrace-powerpc: (ptrval): replaced (4b473b11) != 
old (6000)
[  146.668995][  T809] [ ftrace bug ]
[  146.669031][  T809] ftrace failed to modify
[  146.669039][  T809] [] napi_busy_loop+0xc/0x390
[  146.669045][  T809]  actual:   11:3b:47:4b
[  146.669070][  T809] Setting ftrace call site to call ftrace function
[  146.669075][  T809] ftrace record flags: 8001
[  146.669081][  T809]  (1)
[  146.669081][  T809]  expected tramp: c006c96c
[  146.669096][  T809] [ cut here ]
[  146.669104][  T809] WARNING: CPU: 4 PID: 809 at kernel/trace/ftrace.c:2065 
ftrace_bug+0x28c/0x2e8
[  146.669109][  T809] Modules linked in: nop_module(PO-) [last unloaded: 
nop_module]
[  146.669130][  T809] CPU: 4 PID: 809 Comm: sh Tainted: P   O  
5.10.0-rc5-01360-gf878ccaf250a #1
[  146.669136][  T809] NIP:  c024f334 LR: c024f330 CTR: 
c01a5af0
[  146.669142][  T809] REGS: c4c8b760 TRAP: 0700   Tainted: P   
O   (5.10.0-rc5-01360-gf878ccaf250a)
[  146.669147][  T809] MSR:  9282b033 
  CR: 28008848  XER: 2004
[  146.669208][  T809] CFAR: c01a9c98 IRQMASK: 0
[  146.669208][  T809] GPR00: c024f330 c4c8b9f0 
c2770600 0022
[  146.669208][  T809] GPR04: 7fff c4c8b6d0 
0027 c007fe9bcdd8
[  146.669208][  T809] GPR08: 0023 ffd8 
0027 c2613118
[  146.669208][  T809] GPR12: 8000 c007fffdca00 
 
[  146.669208][  T809] GPR16: 23ec37c5  
 0008
[  146.669208][  T809] GPR20: c4c8bc90 c27a2d20 
c4c8bcd0 c2612fe8
[  146.669208][  T809] GPR24: 0038 0030 
0028 0020
[  146.669208][  T809] GPR28: c0ff1b68 c0bf8e5c 
c312f700 c0fbb9b0
[  146.669384][  T809] NIP [c024f334] ftrace_bug+0x28c/0x2e8
[  146.669391][  T809] LR [c024f330] ftrace_bug+0x288/0x2e8
[  146.669396][  T809] Call Trace:
[  146.669403][  T809] [c4c8b9f0] [c024f330] 
ftrace_bug+0x288/0x2e8 (unreliable)
[  146.669418][  T809] [c4c8ba80] [c0248778] 
ftrace_modify_all_code+0x168/0x210
[  146.669429][  T809] [c4c8bab0] [c006c528] 
arch_ftrace_update_code+0x18/0x30
[  146.669440][  T809] [c4c8bad0] [c0248954] 
ftrace_run_update_code+0x44/0xc0
[  

Re: [PATCH] mm/memtest: Add ARCH_USE_MEMTEST

2021-02-07 Thread Anshuman Khandual



On 2/5/21 2:50 PM, Vladimir Murzin wrote:
> Hi Anshuman,
> 
> On 2/5/21 4:10 AM, Anshuman Khandual wrote:
>> early_memtest() does not get called from all architectures. Hence enabling
>> CONFIG_MEMTEST and providing a valid memtest=[1..N] kernel command line
>> option might not trigger the memory pattern tests as would be expected in
>> normal circumstances. This situation is misleading.
> 
> Documentation already mentions which architectures support that:
> 
> memtest=[KNL,X86,ARM,PPC] Enable memtest
> 
> yet I admit that not all reflected there

But there is nothing that prevents CONFIG_MEMTEST from being set on
other platforms where it has no effect, which is not optimal.

> 
>>
>> The change here prevents the above mentioned problem after introducing a
>> new config option ARCH_USE_MEMTEST that should be subscribed on platforms
>> that call early_memtest(), in order to enable the config CONFIG_MEMTEST.
>> Conversely CONFIG_MEMTEST cannot be enabled on platforms where it would
>> not be tested anyway.
>>
> 
> Is that generic pattern? What about other cross arch parameters? Do they 
> already
> use similar subscription or they rely on documentation?

Depending solely on the documentation should not be sufficient.

> 
> I'm not against the patch just want to check if things are consistent...
Not sure about other similar situations but those if present should
get fixed as well.
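
For reference, the contract being proposed is roughly: a platform selects
ARCH_USE_MEMTEST only if its early boot code actually calls early_memtest()
over the memory it wants covered. A sketch of such a call site (the hook name
and range are illustrative):

#include <linux/memblock.h>

void __init example_arch_mem_init(void)		/* illustrative hook name */
{
	/* Only has an effect when CONFIG_MEMTEST=y and memtest= is given. */
	early_memtest(memblock_start_of_DRAM(), memblock_end_of_DRAM());
}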


Re: [PATCH] mm/memtest: Add ARCH_USE_MEMTEST

2021-02-07 Thread Anshuman Khandual



On 2/5/21 1:05 PM, Max Filippov wrote:
> On Thu, Feb 4, 2021 at 8:10 PM Anshuman Khandual
>  wrote:
>>
>> early_memtest() does not get called from all architectures. Hence enabling
>> CONFIG_MEMTEST and providing a valid memtest=[1..N] kernel command line
>> option might not trigger the memory pattern tests as would be expected in
>> normal circumstances. This situation is misleading.
>>
>> The change here prevents the above mentioned problem after introducing a
>> new config option ARCH_USE_MEMTEST that should be subscribed on platforms
>> that call early_memtest(), in order to enable the config CONFIG_MEMTEST.
>> Conversely CONFIG_MEMTEST cannot be enabled on platforms where it would
>> not be tested anyway.
>>
>> Cc: Russell King 
>> Cc: Catalin Marinas 
>> Cc: Will Deacon 
>> Cc: Thomas Bogendoerfer 
>> Cc: Michael Ellerman 
>> Cc: Benjamin Herrenschmidt 
>> Cc: Paul Mackerras 
>> Cc: Thomas Gleixner 
>> Cc: Ingo Molnar 
>> Cc: Chris Zankel 
>> Cc: Max Filippov 
>> Cc: linux-arm-ker...@lists.infradead.org
>> Cc: linux-m...@vger.kernel.org
>> Cc: linuxppc-dev@lists.ozlabs.org
>> Cc: linux-xte...@linux-xtensa.org
>> Cc: linux...@kvack.org
>> Cc: linux-ker...@vger.kernel.org
>> Signed-off-by: Anshuman Khandual 
>> ---
>> This patch applies on v5.11-rc6 and has been tested on arm64 platform. But
>> it has been just build tested on all other platforms.
>>
>>  arch/arm/Kconfig | 1 +
>>  arch/arm64/Kconfig   | 1 +
>>  arch/mips/Kconfig| 1 +
>>  arch/powerpc/Kconfig | 1 +
>>  arch/x86/Kconfig | 1 +
>>  arch/xtensa/Kconfig  | 1 +
>>  lib/Kconfig.debug| 9 -
>>  7 files changed, 14 insertions(+), 1 deletion(-)
> 
> Anshuman, entries in arch/*/Konfig files are sorted in alphabetical order,
> please keep them that way.

Sure, will fix up and resend.

> 
> Reviewed-by: Max Filippov 
> 


[PATCH 8/8] xen-swiotlb: remove the unused size argument from xen_swiotlb_fixup

2021-02-07 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/xen/swiotlb-xen.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index b2d9e77059bf5a..621a20c1143597 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -104,8 +104,7 @@ static int is_xen_swiotlb_buffer(struct device *dev, 
dma_addr_t dma_addr)
return 0;
 }
 
-static int
-xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
+static int xen_swiotlb_fixup(void *buf, unsigned long nslabs)
 {
int i, rc;
int dma_bits;
@@ -195,7 +194,7 @@ int __ref xen_swiotlb_init(void)
/*
 * And replace that memory with pages under 4GB.
 */
-   rc = xen_swiotlb_fixup(start, bytes, nslabs);
+   rc = xen_swiotlb_fixup(start, nslabs);
if (rc) {
free_pages((unsigned long)start, order);
m_ret = XEN_SWIOTLB_EFIXUP;
@@ -243,7 +242,7 @@ void __init xen_swiotlb_init_early(void)
/*
 * And replace that memory with pages under 4GB.
 */
-   rc = xen_swiotlb_fixup(start, bytes, nslabs);
+   rc = xen_swiotlb_fixup(start, nslabs);
if (rc) {
memblock_free(__pa(start), PAGE_ALIGN(bytes));
if (repeat--) {
-- 
2.29.2



[PATCH 7/8] xen-swiotlb: split xen_swiotlb_init

2021-02-07 Thread Christoph Hellwig
Split xen_swiotlb_init into a normal and an early case.  That makes both
much simpler and more readable, and also allows marking the early
code as __init and x86-only.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/xen/mm.c  |   2 +-
 arch/x86/xen/pci-swiotlb-xen.c |   4 +-
 drivers/xen/swiotlb-xen.c  | 124 +++--
 include/xen/swiotlb-xen.h  |   3 +-
 4 files changed, 75 insertions(+), 58 deletions(-)

diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 467fa225c3d0ed..aae950cd053fea 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -140,7 +140,7 @@ static int __init xen_mm_init(void)
struct gnttab_cache_flush cflush;
if (!xen_initial_domain())
return 0;
-   xen_swiotlb_init(1, false);
+   xen_swiotlb_init();
 
cflush.op = 0;
cflush.a.dev_bus_addr = 0;
diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
index 19ae3e4fe4e98e..54f9aa7e845739 100644
--- a/arch/x86/xen/pci-swiotlb-xen.c
+++ b/arch/x86/xen/pci-swiotlb-xen.c
@@ -59,7 +59,7 @@ int __init pci_xen_swiotlb_detect(void)
 void __init pci_xen_swiotlb_init(void)
 {
if (xen_swiotlb) {
-   xen_swiotlb_init(1, true /* early */);
+   xen_swiotlb_init_early();
dma_ops = &xen_swiotlb_dma_ops;
 
 #ifdef CONFIG_PCI
@@ -76,7 +76,7 @@ int pci_xen_swiotlb_init_late(void)
if (xen_swiotlb)
return 0;
 
-   rc = xen_swiotlb_init(1, false /* late */);
+   rc = xen_swiotlb_init();
if (rc)
return rc;
 
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index e6c8556e879ee6..b2d9e77059bf5a 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -156,96 +156,112 @@ static const char *xen_swiotlb_error(enum 
xen_swiotlb_err err)
 
 #define DEFAULT_NSLABS ALIGN(SZ_64M >> IO_TLB_SHIFT, IO_TLB_SEGSIZE)
 
-int __ref xen_swiotlb_init(int verbose, bool early)
+int __ref xen_swiotlb_init(void)
 {
-   unsigned long bytes, order;
-   int rc = -ENOMEM;
enum xen_swiotlb_err m_ret = XEN_SWIOTLB_UNKNOWN;
+   unsigned long nslabs, bytes, order;
unsigned int repeat = 3;
+   int rc = -ENOMEM;
char *start;
-   unsigned long nslabs;
 
nslabs = swiotlb_nr_tbl();
-retry:
if (!nslabs)
nslabs = DEFAULT_NSLABS;
+retry:
+   m_ret = XEN_SWIOTLB_ENOMEM;
bytes = nslabs << IO_TLB_SHIFT;
order = get_order(bytes);
 
/*
 * Get IO TLB memory from any location.
 */
-   if (early) {
-   start = memblock_alloc(PAGE_ALIGN(bytes),
- PAGE_SIZE);
-   if (!start)
-   panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
- __func__, PAGE_ALIGN(bytes), PAGE_SIZE);
-   } else {
 #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
-   while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
-   start = (void *)xen_get_swiotlb_free_pages(order);
-   if (start)
-   break;
-   order--;
-   }
-   if (order != get_order(bytes)) {
-   pr_warn("Warning: only able to allocate %ld MB for 
software IO TLB\n",
-   (PAGE_SIZE << order) >> 20);
-   nslabs = SLABS_PER_PAGE << order;
-   bytes = nslabs << IO_TLB_SHIFT;
-   }
+   while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
+   start = (void *)xen_get_swiotlb_free_pages(order);
+   if (start)
+   break;
+   order--;
}
-   if (!start) {
-   m_ret = XEN_SWIOTLB_ENOMEM;
+   if (!start)
goto error;
+   if (order != get_order(bytes)) {
+   pr_warn("Warning: only able to allocate %ld MB for software IO 
TLB\n",
+   (PAGE_SIZE << order) >> 20);
+   nslabs = SLABS_PER_PAGE << order;
+   bytes = nslabs << IO_TLB_SHIFT;
}
+
/*
 * And replace that memory with pages under 4GB.
 */
-   rc = xen_swiotlb_fixup(start,
-  bytes,
-  nslabs);
+   rc = xen_swiotlb_fixup(start, bytes, nslabs);
if (rc) {
-   if (early)
-   memblock_free(__pa(start),
- PAGE_ALIGN(bytes));
-   else {
-   free_pages((unsigned long)start, order);
-   start = NULL;
-   }
+   free_pages((unsigned long)start, order);
m_ret = XEN_SWIOTLB_EFIXUP;
goto error;
}
-   if 

[PATCH 2/8] xen-swiotlb: use is_swiotlb_buffer in is_xen_swiotlb_buffer

2021-02-07 Thread Christoph Hellwig
Use the is_swiotlb_buffer to check if a physical address is
a swiotlb buffer.  This works because xen-swiotlb does use the
same buffer as the main swiotlb code, and xen_io_tlb_{start,end}
are just the addresses for it that went through phys_to_virt.

Signed-off-by: Christoph Hellwig 
---
 drivers/xen/swiotlb-xen.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 2b385c1b4a99cb..a4026822a889f7 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -111,10 +111,8 @@ static int is_xen_swiotlb_buffer(struct device *dev, 
dma_addr_t dma_addr)
 * have the same virtual address as another address
 * in our domain. Therefore _only_ check address within our domain.
 */
-   if (pfn_valid(PFN_DOWN(paddr))) {
-   return paddr >= virt_to_phys(xen_io_tlb_start) &&
-  paddr < virt_to_phys(xen_io_tlb_end);
-   }
+   if (pfn_valid(PFN_DOWN(paddr)))
+   return is_swiotlb_buffer(paddr);
return 0;
 }
 
-- 
2.29.2



[PATCH 6/8] swiotlb: lift the double initialization protection from xen-swiotlb

2021-02-07 Thread Christoph Hellwig
Lift the double initialization protection from xen-swiotlb to the core
code to avoid exposing too many swiotlb internals.  Also upgrade the
check to a warning as it should not happen.

Signed-off-by: Christoph Hellwig 
---
 drivers/xen/swiotlb-xen.c | 7 ---
 kernel/dma/swiotlb.c  | 8 
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 6e0f2c5ecd1a2f..e6c8556e879ee6 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -172,12 +172,6 @@ int __ref xen_swiotlb_init(int verbose, bool early)
bytes = nslabs << IO_TLB_SHIFT;
order = get_order(bytes);
 
-   /*
-* IO TLB memory already allocated. Just use it.
-*/
-   if (io_tlb_start != 0)
-   goto end;
-
/*
 * Get IO TLB memory from any location.
 */
@@ -232,7 +226,6 @@ int __ref xen_swiotlb_init(int verbose, bool early)
} else
rc = swiotlb_late_init_with_tbl(start, nslabs);
 
-end:
if (!rc)
swiotlb_set_max_segment(PAGE_SIZE);
 
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index adcc3c87e10078..a3737961ae5769 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -224,6 +224,10 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned long 
nslabs, int verbose)
unsigned long i, bytes;
size_t alloc_size;
 
+   /* protect against double initialization */
+   if (WARN_ON_ONCE(io_tlb_start))
+   return -ENOMEM;
+
bytes = nslabs << IO_TLB_SHIFT;
 
io_tlb_nslabs = nslabs;
@@ -355,6 +359,10 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
 {
unsigned long i, bytes;
 
+   /* protect against double initialization */
+   if (WARN_ON_ONCE(io_tlb_start))
+   return -ENOMEM;
+
bytes = nslabs << IO_TLB_SHIFT;
 
io_tlb_nslabs = nslabs;
-- 
2.29.2



[PATCH 3/8] xen-swiotlb: use io_tlb_end in xen_swiotlb_dma_supported

2021-02-07 Thread Christoph Hellwig
Use the existing variable that holds the physical address for
xen_io_tlb_end to simplify xen_swiotlb_dma_supported a bit, and remove
the otherwise unused xen_io_tlb_end variable and the xen_virt_to_bus
helper.

Signed-off-by: Christoph Hellwig 
---
 drivers/xen/swiotlb-xen.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index a4026822a889f7..4298f74a083985 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -46,7 +46,7 @@
  * API.
  */
 
-static char *xen_io_tlb_start, *xen_io_tlb_end;
+static char *xen_io_tlb_start;
 static unsigned long xen_io_tlb_nslabs;
 /*
  * Quick lookup value of the bus address of the IOTLB.
@@ -82,11 +82,6 @@ static inline phys_addr_t xen_dma_to_phys(struct device *dev,
return xen_bus_to_phys(dev, dma_to_phys(dev, dma_addr));
 }
 
-static inline dma_addr_t xen_virt_to_bus(struct device *dev, void *address)
-{
-   return xen_phys_to_dma(dev, virt_to_phys(address));
-}
-
 static inline int range_straddles_page_boundary(phys_addr_t p, size_t size)
 {
unsigned long next_bfn, xen_pfn = XEN_PFN_DOWN(p);
@@ -250,7 +245,6 @@ int __ref xen_swiotlb_init(int verbose, bool early)
rc = swiotlb_late_init_with_tbl(xen_io_tlb_start, 
xen_io_tlb_nslabs);
 
 end:
-   xen_io_tlb_end = xen_io_tlb_start + bytes;
if (!rc)
swiotlb_set_max_segment(PAGE_SIZE);
 
@@ -558,7 +552,7 @@ xen_swiotlb_sync_sg_for_device(struct device *dev, struct 
scatterlist *sgl,
 static int
 xen_swiotlb_dma_supported(struct device *hwdev, u64 mask)
 {
-   return xen_virt_to_bus(hwdev, xen_io_tlb_end - 1) <= mask;
+   return xen_phys_to_dma(hwdev, io_tlb_end - 1) <= mask;
 }
 
 const struct dma_map_ops xen_swiotlb_dma_ops = {
-- 
2.29.2



[PATCH 1/8] powerpc/svm: stop using io_tlb_start

2021-02-07 Thread Christoph Hellwig
Use the local variable that is passed to swiotlb_init_with_tbl for
freeing the memory in the failure case to isolate the code a little
better from swiotlb internals.

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/platforms/pseries/svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/svm.c 
b/arch/powerpc/platforms/pseries/svm.c
index 7b739cc7a8a93e..b9968ac7cc0789 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -56,7 +56,7 @@ void __init svm_swiotlb_init(void)
return;
 
if (io_tlb_start)
-   memblock_free_early(io_tlb_start,
+   memblock_free_early(__pa(vstart),
PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
panic("SVM: Cannot allocate SWIOTLB buffer");
 }
-- 
2.29.2



[PATCH 4/8] xen-swiotlb: remove xen_set_nslabs

2021-02-07 Thread Christoph Hellwig
The xen_set_nslabs function is a little weird: it has just one caller,
that caller passes a global variable as the argument, which is then
overridden in the function and a derivative of it returned.  Just add a
cpp symbol for the default size using a readable constant and open code
the remaining three lines in the caller.

Signed-off-by: Christoph Hellwig 
---
 drivers/xen/swiotlb-xen.c | 19 +++
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 4298f74a083985..57f8d5fadc1fcd 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -138,16 +138,6 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long 
nslabs)
} while (i < nslabs);
return 0;
 }
-static unsigned long xen_set_nslabs(unsigned long nr_tbl)
-{
-   if (!nr_tbl) {
-   xen_io_tlb_nslabs = (64 * 1024 * 1024 >> IO_TLB_SHIFT);
-   xen_io_tlb_nslabs = ALIGN(xen_io_tlb_nslabs, IO_TLB_SEGSIZE);
-   } else
-   xen_io_tlb_nslabs = nr_tbl;
-
-   return xen_io_tlb_nslabs << IO_TLB_SHIFT;
-}
 
 enum xen_swiotlb_err {
XEN_SWIOTLB_UNKNOWN = 0,
@@ -170,6 +160,9 @@ static const char *xen_swiotlb_error(enum xen_swiotlb_err 
err)
}
return "";
 }
+
+#define DEFAULT_NSLABS ALIGN(SZ_64M >> IO_TLB_SHIFT, IO_TLB_SEGSIZE)
+
 int __ref xen_swiotlb_init(int verbose, bool early)
 {
unsigned long bytes, order;
@@ -179,8 +172,10 @@ int __ref xen_swiotlb_init(int verbose, bool early)
 
xen_io_tlb_nslabs = swiotlb_nr_tbl();
 retry:
-   bytes = xen_set_nslabs(xen_io_tlb_nslabs);
-   order = get_order(xen_io_tlb_nslabs << IO_TLB_SHIFT);
+   if (!xen_io_tlb_nslabs)
+   xen_io_tlb_nslabs = DEFAULT_NSLABS;
+   bytes = xen_io_tlb_nslabs << IO_TLB_SHIFT;
+   order = get_order(bytes);
 
/*
 * IO TLB memory already allocated. Just use it.
-- 
2.29.2



swiotlb cleanups

2021-02-07 Thread Christoph Hellwig
Hi Konrad,

this series contains a bunch of swiotlb cleanups, mostly to reduce the
amount of internals exposed to code outside of swiotlb.c, which should
helper to prepare for supporting multiple different bounce buffer pools.


[PATCH 5/8] xen-swiotlb: remove xen_io_tlb_start and xen_io_tlb_nslabs

2021-02-07 Thread Christoph Hellwig
The xen_io_tlb_start and xen_io_tlb_nslabs variables are now only used in
xen_swiotlb_init, so replace them with local variables.

Signed-off-by: Christoph Hellwig 
---
 drivers/xen/swiotlb-xen.c | 57 +--
 1 file changed, 25 insertions(+), 32 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 57f8d5fadc1fcd..6e0f2c5ecd1a2f 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -40,14 +40,7 @@
 
 #include 
 #define MAX_DMA_BITS 32
-/*
- * Used to do a quick range check in swiotlb_tbl_unmap_single and
- * swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by 
this
- * API.
- */
 
-static char *xen_io_tlb_start;
-static unsigned long xen_io_tlb_nslabs;
 /*
  * Quick lookup value of the bus address of the IOTLB.
  */
@@ -169,75 +162,75 @@ int __ref xen_swiotlb_init(int verbose, bool early)
int rc = -ENOMEM;
enum xen_swiotlb_err m_ret = XEN_SWIOTLB_UNKNOWN;
unsigned int repeat = 3;
+   char *start;
+   unsigned long nslabs;
 
-   xen_io_tlb_nslabs = swiotlb_nr_tbl();
+   nslabs = swiotlb_nr_tbl();
 retry:
-   if (!xen_io_tlb_nslabs)
-   xen_io_tlb_nslabs = DEFAULT_NSLABS;
-   bytes = xen_io_tlb_nslabs << IO_TLB_SHIFT;
+   if (!nslabs)
+   nslabs = DEFAULT_NSLABS;
+   bytes = nslabs << IO_TLB_SHIFT;
order = get_order(bytes);
 
/*
 * IO TLB memory already allocated. Just use it.
 */
-   if (io_tlb_start != 0) {
-   xen_io_tlb_start = phys_to_virt(io_tlb_start);
+   if (io_tlb_start != 0)
goto end;
-   }
 
/*
 * Get IO TLB memory from any location.
 */
if (early) {
-   xen_io_tlb_start = memblock_alloc(PAGE_ALIGN(bytes),
+   start = memblock_alloc(PAGE_ALIGN(bytes),
  PAGE_SIZE);
-   if (!xen_io_tlb_start)
+   if (!start)
panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
  __func__, PAGE_ALIGN(bytes), PAGE_SIZE);
} else {
 #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
-   xen_io_tlb_start = (void 
*)xen_get_swiotlb_free_pages(order);
-   if (xen_io_tlb_start)
+   start = (void *)xen_get_swiotlb_free_pages(order);
+   if (start)
break;
order--;
}
if (order != get_order(bytes)) {
pr_warn("Warning: only able to allocate %ld MB for 
software IO TLB\n",
(PAGE_SIZE << order) >> 20);
-   xen_io_tlb_nslabs = SLABS_PER_PAGE << order;
-   bytes = xen_io_tlb_nslabs << IO_TLB_SHIFT;
+   nslabs = SLABS_PER_PAGE << order;
+   bytes = nslabs << IO_TLB_SHIFT;
}
}
-   if (!xen_io_tlb_start) {
+   if (!start) {
m_ret = XEN_SWIOTLB_ENOMEM;
goto error;
}
/*
 * And replace that memory with pages under 4GB.
 */
-   rc = xen_swiotlb_fixup(xen_io_tlb_start,
+   rc = xen_swiotlb_fixup(start,
   bytes,
-  xen_io_tlb_nslabs);
+  nslabs);
if (rc) {
if (early)
-   memblock_free(__pa(xen_io_tlb_start),
+   memblock_free(__pa(start),
  PAGE_ALIGN(bytes));
else {
-   free_pages((unsigned long)xen_io_tlb_start, order);
-   xen_io_tlb_start = NULL;
+   free_pages((unsigned long)start, order);
+   start = NULL;
}
m_ret = XEN_SWIOTLB_EFIXUP;
goto error;
}
if (early) {
-   if (swiotlb_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs,
+   if (swiotlb_init_with_tbl(start, nslabs,
 verbose))
panic("Cannot allocate SWIOTLB buffer");
rc = 0;
} else
-   rc = swiotlb_late_init_with_tbl(xen_io_tlb_start, 
xen_io_tlb_nslabs);
+   rc = swiotlb_late_init_with_tbl(start, nslabs);
 
 end:
if (!rc)
@@ -246,17 +239,17 @@ int __ref xen_swiotlb_init(int verbose, bool early)
return rc;
 error:
if (repeat--) {
-   xen_io_tlb_nslabs = max(1024UL, /* Min is 2MB */
-   (xen_io_tlb_nslabs >> 1));
+   nslabs = max(1024UL, /* Min is 2MB */
+ 

Re: [PATCH RFC v1 5/6] xen-swiotlb: convert variables to arrays

2021-02-07 Thread Christoph Hellwig
On Thu, Feb 04, 2021 at 09:40:23AM +0100, Christoph Hellwig wrote:
> So one thing that has been on my mind for a while:  I'd really like
> to kill the separate dma ops in Xen swiotlb.  If we compare xen-swiotlb
> to swiotlb the main difference seems to be:
> 
>  - additional reasons to bounce I/O vs the plain DMA capable
>  - the possibility to do a hypercall on arm/arm64
>  - an extra translation layer before doing the phys_to_dma and vice
>versa
>  - an special memory allocator
> 
> I wonder if inbetween a few jump labels or other no overhead enablement
> options and possibly better use of the dma_range_map we could kill
> off most of swiotlb-xen instead of maintaining all this code duplication?

So I looked at this a bit more.

For x86 with XENFEAT_auto_translated_physmap (how common is that?)
pfn_to_gfn is a nop, so plain phys_to_dma/dma_to_phys do work as-is.

xen_arch_need_swiotlb always returns true for x86, and
range_straddles_page_boundary should never be true for the
XENFEAT_auto_translated_physmap case.

So as far as I can tell the mapping fast path for the
XENFEAT_auto_translated_physmap can be trivially reused from swiotlb.

That leaves us with the next more complicated case, x86 or fully cache
coherent arm{,64} without XENFEAT_auto_translated_physmap.  In that case
we need to patch in a phys_to_dma/dma_to_phys that performs the MFN
lookup, which could be done using alternatives or jump labels.
I think if that is done right we should also be able to let that cover
the foreign pages in is_xen_swiotlb_buffer/is_swiotlb_buffer, but
in that worst case that would need another alternative / jump label.

For non-coherent arm{,64} we'd also need to use alternatives or jump
labels to for the cache maintainance ops, but that isn't a hard problem
either.
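
A rough sketch of the jump-label idea (names below, including
xen_dma_translate_key and example_phys_to_dma, are assumptions for the sake
of illustration, not existing symbols): the MFN-translating path is patched
in only when running in the relevant Xen mode, so native kernels pay nothing.

#include <linux/jump_label.h>
#include <linux/dma-direct.h>

static DEFINE_STATIC_KEY_FALSE(xen_dma_translate_key);

static inline dma_addr_t example_phys_to_dma(struct device *dev, phys_addr_t paddr)
{
	if (static_branch_unlikely(&xen_dma_translate_key))
		return xen_phys_to_dma(dev, paddr);	/* MFN lookup path */
	return phys_to_dma(dev, paddr);			/* plain direct path */
}

/* Flipped once, early at boot, only when Xen translation is needed. */
static void example_enable_xen_dma_translation(void)
{
	static_branch_enable(&xen_dma_translate_key);
}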




Re: [PATCH v7 42/42] powerpc/64s: power4 nap fixup in C

2021-02-07 Thread Nicholas Piggin
Excerpts from Michael Ellerman's message of February 2, 2021 8:31 pm:
> Nicholas Piggin  writes:
>> There is no need for this to be in asm, use the new intrrupt entry wrapper.
>>
>> Signed-off-by: Nicholas Piggin 
>> ---
>>  arch/powerpc/include/asm/interrupt.h   | 15 +
>>  arch/powerpc/include/asm/processor.h   |  1 +
>>  arch/powerpc/include/asm/thread_info.h |  6 
>>  arch/powerpc/kernel/exceptions-64s.S   | 45 --
>>  arch/powerpc/kernel/idle_book3s.S  |  4 +++
>>  5 files changed, 26 insertions(+), 45 deletions(-)
> 
> Something in here is making my G5 not boot.
> 
> I don't know what the problem is because that machine is in the office,
> and I am not, and it has no serial console.
> 
> I tried turning on pstore but it doesn't record anything :/

Here is an incremental patch that (hopefully) should fix it. It should
be folded into this one.

Thanks,
Nick
---
powerpc/64s: nap_adjust_return fix

This is a fix for patch "powerpc/64s: power4 nap fixup in C".

Non-maskable interrupts should not create any exit work, so there
should be no reason to come out of idle. So take the nap fixup out
of NMIs.

The problem with having them here is that an interrupt can come in
and then get interrupted by an NMI, the NMI will then set its return
to the idle wakeup and never complete the regular interrupt handling.

Before the "powerpc/64s: power4 nap fixup in C" patch, this wouldn't
occur because the true NMIs didn't call FIXUP_NAP, and the other
interrupts would call it before running their handler, so they would
not have a chance to call may_hard_irq_enable and allow soft-NMIs
(watchdog, perf) to come through.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/interrupt.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/interrupt.h 
b/arch/powerpc/include/asm/interrupt.h
index 25f2420b1965..afdf4aba2e6b 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -12,6 +12,7 @@ static inline void nap_adjust_return(struct pt_regs *regs)
 {
 #ifdef CONFIG_PPC_970_NAP
if (unlikely(test_thread_local_flags(_TLF_NAPPING))) {
+   /* Can avoid a test-and-clear because NMIs do not call this */
clear_thread_local_flags(_TLF_NAPPING);
regs->nip = (unsigned long)power4_idle_nap_return;
}
@@ -164,7 +165,10 @@ static inline void interrupt_nmi_exit_prepare(struct 
pt_regs *regs, struct inter
radix_enabled() || (mfmsr() & MSR_DR))
nmi_exit();
 
-   nap_adjust_return(regs);
+   /*
+* nmi does not call nap_adjust_return because nmi should not create
+* new work to do (must use irq_work for that).
+*/
 
 #ifdef CONFIG_PPC64
if (TRAP(regs) != 0x900 && TRAP(regs) != 0xf00 && TRAP(regs) != 0x260)
-- 
2.23.0



Re: [PATCH v7 28/42] powerpc: convert interrupt handlers to use wrappers

2021-02-07 Thread Nicholas Piggin
Excerpts from Christophe Leroy's message of February 5, 2021 6:09 pm:
> 
> 
> Le 30/01/2021 à 14:08, Nicholas Piggin a écrit :
>> Signed-off-by: Nicholas Piggin 
>> ---
> 
>> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
>> index f70d3f6174c8..7ff915aae8ec 100644
>> --- a/arch/powerpc/kernel/traps.c
>> +++ b/arch/powerpc/kernel/traps.c
> 
>> @@ -1462,7 +1474,7 @@ static int emulate_math(struct pt_regs *regs)
>>   static inline int emulate_math(struct pt_regs *regs) { return -1; }
>>   #endif
>>   
>> -void program_check_exception(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(program_check_exception)
>>   {
>>  enum ctx_state prev_state = exception_enter();
>>  unsigned int reason = get_reason(regs);
>> @@ -1587,14 +1599,14 @@ NOKPROBE_SYMBOL(program_check_exception);
>>* This occurs when running in hypervisor mode on POWER6 or later
>>* and an illegal instruction is encountered.
>>*/
>> -void emulation_assist_interrupt(struct pt_regs *regs)
>> +DEFINE_INTERRUPT_HANDLER(emulation_assist_interrupt)
>>   {
>>  regs->msr |= REASON_ILLEGAL;
>>  program_check_exception(regs);
> 
> Is it correct that an INTERRUPT_HANDLER calls another INTERRUPT_HANDLER ?

No, here is a patch for it; it should go any time before the interrupt
wrappers are introduced. It causes some conflicts later, but they are
not too complex. I can resend the series if necessary.

Thanks,
Nick

---
powerpc/traps: factor common code from program check and emulation assist

Move the program check handling into a function called by both, rather
than have the emulation assist handler call the program check handler.

This allows each of these handlers to be implemented with "interrupt
wrappers" in a later change.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kernel/traps.c | 38 +++--
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index f70d3f6174c8..2c5986109412 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1462,9 +1462,8 @@ static int emulate_math(struct pt_regs *regs)
 static inline int emulate_math(struct pt_regs *regs) { return -1; }
 #endif
 
-void program_check_exception(struct pt_regs *regs)
+static void do_program_check(struct pt_regs *regs)
 {
-   enum ctx_state prev_state = exception_enter();
unsigned int reason = get_reason(regs);
 
/* We can now get here via a FP Unavailable exception if the core
@@ -1473,22 +1472,22 @@ void program_check_exception(struct pt_regs *regs)
if (reason & REASON_FP) {
/* IEEE FP exception */
parse_fpe(regs);
-   goto bail;
+   return;
}
if (reason & REASON_TRAP) {
unsigned long bugaddr;
/* Debugger is first in line to stop recursive faults in
 * rcu_lock, notify_die, or atomic_notifier_call_chain */
if (debugger_bpt(regs))
-   goto bail;
+   return;
 
if (kprobe_handler(regs))
-   goto bail;
+   return;
 
/* trap exception */
if (notify_die(DIE_BPT, "breakpoint", regs, 5, 5, SIGTRAP)
== NOTIFY_STOP)
-   goto bail;
+   return;
 
bugaddr = regs->nip;
/*
@@ -1500,10 +1499,10 @@ void program_check_exception(struct pt_regs *regs)
if (!(regs->msr & MSR_PR) &&  /* not user-mode */
report_bug(bugaddr, regs) == BUG_TRAP_TYPE_WARN) {
regs->nip += 4;
-   goto bail;
+   return;
}
_exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip);
-   goto bail;
+   return;
}
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
if (reason & REASON_TM) {
@@ -1524,7 +1523,7 @@ void program_check_exception(struct pt_regs *regs)
 */
if (user_mode(regs)) {
_exception(SIGILL, regs, ILL_ILLOPN, regs->nip);
-   goto bail;
+   return;
} else {
printk(KERN_EMERG "Unexpected TM Bad Thing exception "
   "at %lx (msr 0x%lx) tm_scratch=%llx\n",
@@ -1557,7 +1556,7 @@ void program_check_exception(struct pt_regs *regs)
 * pattern to occurrences etc. -dgibson 31/Mar/2003
 */
if (!emulate_math(regs))
-   goto bail;
+   return;
 
/* Try to emulate it if we should. */
if (reason & (REASON_ILLEGAL | REASON_PRIVILEGED)) {
@@ -1565,10 +1564,10 @@ void program_check_exception(struct pt_regs *regs)
case 0:
regs->nip += 4;

Re: [PATCH v7 39/42] powerpc: move NMI entry/exit code into wrapper

2021-02-07 Thread Nicholas Piggin
Excerpts from Nicholas Piggin's message of February 6, 2021 12:46 pm:
> Excerpts from Michael Ellerman's message of February 6, 2021 9:38 am:
>> Nicholas Piggin  writes:
>>> Excerpts from Michael Ellerman's message of February 4, 2021 8:15 pm:
 Nicholas Piggin  writes:
> This moves the common NMI entry and exit code into the interrupt handler
> wrappers.
>
> This changes the behaviour of soft-NMI (watchdog) and HMI interrupts, and
> also MCE interrupts on 64e, by adding missing parts of the NMI entry to
> them.
>
> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/include/asm/interrupt.h | 28 ++
>  arch/powerpc/kernel/mce.c| 11 -
>  arch/powerpc/kernel/traps.c  | 35 +---
>  arch/powerpc/kernel/watchdog.c   | 10 
>  4 files changed, 38 insertions(+), 46 deletions(-)
 
 This is unhappy when injecting SLB multi-hits:
 
   root@p86-2:~# echo PPC_SLB_MULTIHIT > 
 /sys/kernel/debug/provoke-crash/DIRECT
   [  312.496026][ T1344] kernel BUG at 
 arch/powerpc/include/asm/interrupt.h:152!
   [  312.496037][ T1344] Oops: Exception in kernel mode, sig: 5 [#1]
   [  312.496045][ T1344] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA 
 pSeries
>>>
>>> pseries hash. Blast!
>> 
>> The worst kind.
>> 
 147 static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, 
 struct interrupt_nmi_state *state)
 148 {
 149if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64) ||
 150!firmware_has_feature(FW_FEATURE_LPAR) ||
 151radix_enabled() || (mfmsr() & MSR_DR))
 152nmi_exit();
 
 
 So presumably it's:
 
 #define __nmi_exit()   \
do {\
BUG_ON(!in_nmi());  \
>>>
>>> Yes that would be it, pseries machine check enables MMU half way through 
>>> so only one side of this triggers.
>>>
>>> The MSR_DR check is supposed to catch the other NMIs that run with MMU 
>>> on (perf, watchdog, etc). Suppose it could test TRAP(regs) explicitly
>>> although I wonder if we should also do this to keep things balanced
>> 
>> Yeah I think I like that. I'll give it a test.
> 
> The msr restore? Looking closer, pseries_machine_check_realmode may have
> expected mce_handle_error to enable the MMU, because irq_work_queue uses
> some per-cpu variables I think.
> 
> So the code might have to be rearranged a bit more than the patch below.

Here is a patch, it should go anywhere before this patch. Seems to
work with some test MCE injection on pseries hash.

Thanks,
Nick
--

powerpc/pseries/mce: restore msr before returning from handler

The pseries real-mode machine check handler can enable the MMU, and
return from the handler with the MMU still enabled.

This works, but real-mode handler wrapper exit handlers want to rely
on the MMU being in real-mode. So change the pseries handler to
restore the MSR after it has finished virtual mode tasks.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/platforms/pseries/ras.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c 
b/arch/powerpc/platforms/pseries/ras.c
index 2d9f985fd13a..377439e88598 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -722,6 +722,7 @@ static int mce_handle_error(struct pt_regs *regs, struct 
rtas_error_log *errp)
struct pseries_errorlog *pseries_log;
struct pseries_mc_errorlog *mce_log = NULL;
int disposition = rtas_error_disposition(errp);
+   unsigned long msr;
u8 error_type;
 
if (!rtas_error_extended(errp))
@@ -747,9 +748,21 @@ static int mce_handle_error(struct pt_regs *regs, struct 
rtas_error_log *errp)
 *   SLB multihit is done by now.
 */
 out:
-   mtmsr(mfmsr() | MSR_IR | MSR_DR);
+   msr = mfmsr();
+   mtmsr(msr | MSR_IR | MSR_DR);
+
disposition = mce_handle_err_virtmode(regs, errp, mce_log,
  disposition);
+
+   /*
+* Queue irq work to log this rtas event later.
+* irq_work_queue uses per-cpu variables, so do this in virt
+* mode as well.
+*/
+   irq_work_queue(&mce_errlog_process_work);
+
+   mtmsr(msr);
+
return disposition;
 }
 
@@ -865,10 +878,8 @@ long pseries_machine_check_realmode(struct pt_regs *regs)
 * virtual mode.
 */
disposition = mce_handle_error(regs, errp);
-   fwnmi_release_errinfo();
 
-   /* Queue irq work to log this rtas event later. */
-   irq_work_queue(&mce_errlog_process_work);
+   fwnmi_release_errinfo();
 
if 

Re: [PATCH] ASoC: fsl: constify static snd_soc_dai_ops structs

2021-02-07 Thread Shengjiu Wang
On Sun, Feb 7, 2021 at 6:58 AM Rikard Falkeborn
 wrote:
>
> The only usage of these is to assign their address to the 'ops' field in
> the snd_soc_dai_driver struct, which is a pointer to const. Make them
> const to allow the compiler to put them in read-only memory.
>
> Signed-off-by: Rikard Falkeborn 

Acked-by: Shengjiu Wang 


Re: [PATCH 2/8] powerpc/signal: Add unsafe_copy_{vsx, fpr}_from_user()

2021-02-07 Thread Christophe Leroy




On 06/02/2021 at 18:39, Christopher M. Riedl wrote:

On Sat Feb 6, 2021 at 10:32 AM CST, Christophe Leroy wrote:



On 20/10/2020 at 04:01, Christopher M. Riedl wrote:

On Fri Oct 16, 2020 at 10:48 AM CDT, Christophe Leroy wrote:



On 15/10/2020 at 17:01, Christopher M. Riedl wrote:

Reuse the "safe" implementation from signal.c except for calling
unsafe_copy_from_user() to copy into a local buffer. Unlike the
unsafe_copy_{vsx,fpr}_to_user() functions the "copy from" functions
cannot use unsafe_get_user() directly to bypass the local buffer since
doing so significantly reduces signal handling performance.


Why can't the functions use unsafe_get_user(), why does it significantly
reduces signal handling
performance ? How much significant ? I would expect that not going
through an intermediate memory
area would be more efficient



Here is a comparison; 'unsafe-signal64-regs' avoids the intermediate buffer:

|  | hash   | radix  |
|  | -- | -- |
| linuxppc/next| 289014 | 158408 |
| unsafe-signal64  | 298506 | 253053 |
| unsafe-signal64-regs | 254898 | 220831 |

I have not figured out the 'why' yet. As you mentioned in your series,
technically calling __copy_tofrom_user() is overkill for these
operations. The only obvious difference between unsafe_put_user() and
unsafe_get_user() is that we don't have asm-goto for the 'get' variant.
Instead we wrap with unsafe_op_wrap(), which inserts a conditional and
then a goto to the label.

Implementations:

#define unsafe_copy_fpr_from_user(task, from, label)   do {\
   struct task_struct *__t = task; \
   u64 __user *buf = (u64 __user *)from;   \
   int i;  \
   \
   for (i = 0; i < ELF_NFPREG - 1; i++)\
   unsafe_get_user(__t->thread.TS_FPR(i), &buf[i], label); \
   unsafe_get_user(__t->thread.fp_state.fpscr, &buf[i], label);\
} while (0)

#define unsafe_copy_vsx_from_user(task, from, label)   do {\
   struct task_struct *__t = task; \
   u64 __user *buf = (u64 __user *)from;   \
   int i;  \
   \
   for (i = 0; i < ELF_NVSRHALFREG; i++)  \
   unsafe_get_user(__t->thread.fp_state.fpr[i][TS_VSRLOWOFFSET], \
   &buf[i], label);\
} while (0)
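
For readers following the asm-goto point above, the distinction looks
roughly like this (a from-memory sketch of the 5.10-era
arch/powerpc/include/asm/uaccess.h definitions, not an exact quote):

/* Sketch only.  The "put" side can use asm goto and branch straight to
 * the error label from within the asm; the "get" side has no asm-goto
 * form, so unsafe_op_wrap() adds a test and a goto around every access. */
#define unsafe_op_wrap(op, label)	do { if (unlikely(op)) goto label; } while (0)
#define unsafe_put_user(x, p, e)	__put_user_goto(x, p, e)
#define unsafe_get_user(x, p, e)	unsafe_op_wrap(__get_user_allowed(x, p), e)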



Do you have CONFIG_PROVE_LOCKING or CONFIG_DEBUG_ATOMIC_SLEEP enabled in
your config?


I don't have these set in my config (ppc64le_defconfig). I think I
figured this out: the reason for the lower signal throughput is the
barrier_nospec() in __get_user_nocheck(). When looping we incur that
cost on every iteration. Commenting it out results in signal performance
of ~316K w/ hash on the unsafe-signal64-regs branch. Obviously the
barrier is there for a reason, but it is quite costly.
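
Schematically, the cost being described is about where the barrier sits
relative to the copy loop. The snippet below is only an illustration of
the two placements; both declarations are stand-ins (modelled on
barrier_nospec() and one unsafe_get_user() call), and n plays the role
of ELF_NFPREG:

/* Schematic cost comparison, not kernel code; both decls are stand-ins. */
extern void barrier_nospec(void);	/* stand-in for the speculation barrier */
extern void get_one_reg(int i);		/* stand-in for one user read */

static void copy_with_per_access_barrier(int n)
{
	for (int i = 0; i < n; i++) {
		barrier_nospec();	/* paid on every iteration */
		get_one_reg(i);
	}
}

static void copy_with_hoisted_barrier(int n)
{
	barrier_nospec();		/* paid once, when user access is opened */
	for (int i = 0; i < n; i++)
		get_one_reg(i);
}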


Interesting.

Can you try with the patch I just sent out 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/c72f014730823b413528e90ab6c4d3bcb79f8497.1612692067.git.christophe.le...@csgroup.eu/


Thanks
Christophe


[PATCH] powerpc/uaccess: Perform barrier_nospec() in KUAP allowance helpers

2021-02-07 Thread Christophe Leroy
barrier_nospec() in uaccess helpers is there to protect against
speculative accesses around access_ok().

When using user_access_begin() sequences together with
unsafe_get_user()-like macros, barrier_nospec() is called for
every single read even though we know the access_ok() check is
done only once.

Since all user accesses must be granted by a call to either
allow_read_from_user() or allow_read_write_user(), which always
happens after the access_ok() check, move the barrier_nospec()
there.

Reported-by: Christopher M. Riedl 
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/kup.h |  2 ++
 arch/powerpc/include/asm/uaccess.h | 12 +---
 2 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/kup.h b/arch/powerpc/include/asm/kup.h
index bf221a2a523e..7ec21af49a45 100644
--- a/arch/powerpc/include/asm/kup.h
+++ b/arch/powerpc/include/asm/kup.h
@@ -91,6 +91,7 @@ static __always_inline void setup_kup(void)
 
 static inline void allow_read_from_user(const void __user *from, unsigned long 
size)
 {
+   barrier_nospec();
allow_user_access(NULL, from, size, KUAP_READ);
 }
 
@@ -102,6 +103,7 @@ static inline void allow_write_to_user(void __user *to, 
unsigned long size)
 static inline void allow_read_write_user(void __user *to, const void __user 
*from,
 unsigned long size)
 {
+   barrier_nospec();
allow_user_access(to, from, size, KUAP_READ_WRITE);
 }
 
diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 501c9a79038c..46123ae6a4c9 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -315,7 +315,6 @@ do {
\
__chk_user_ptr(__gu_addr);  \
if (!is_kernel_addr((unsigned long)__gu_addr))  \
might_fault();  \
-   barrier_nospec();   \
if (do_allow)   
\
__get_user_size(__gu_val, __gu_addr, __gu_size, __gu_err);  
\
else
\
@@ -333,10 +332,8 @@ do {   
\
__typeof__(size) __gu_size = (size);\
\
might_fault();  \
-   if (access_ok(__gu_addr, __gu_size)) {  \
-   barrier_nospec();   \
+   if (access_ok(__gu_addr, __gu_size))\
__get_user_size(__gu_val, __gu_addr, __gu_size, __gu_err); \
-   }   \
(x) = (__force __typeof__(*(ptr)))__gu_val; 
\
\
__gu_err;   \
@@ -350,7 +347,6 @@ do {
\
__typeof__(size) __gu_size = (size);\
\
__chk_user_ptr(__gu_addr);  \
-   barrier_nospec();   \
__get_user_size(__gu_val, __gu_addr, __gu_size, __gu_err); \
(x) = (__force __typeof__(*(ptr)))__gu_val; \
\
@@ -395,7 +391,6 @@ raw_copy_in_user(void __user *to, const void __user *from, 
unsigned long n)
 {
unsigned long ret;
 
-   barrier_nospec();
allow_read_write_user(to, from, n);
ret = __copy_tofrom_user(to, from, n);
prevent_read_write_user(to, from, n);
@@ -412,19 +407,15 @@ static inline unsigned long raw_copy_from_user(void *to,
 
switch (n) {
case 1:
-   barrier_nospec();
__get_user_size(*(u8 *)to, from, 1, ret);
break;
case 2:
-   barrier_nospec();
__get_user_size(*(u16 *)to, from, 2, ret);
break;
case 4:
-   barrier_nospec();
__get_user_size(*(u32 *)to, from, 4, ret);
break;
case 8:
-   barrier_nospec();
__get_user_size(*(u64 *)to, from, 8, ret);
break;
}
@@ -432,7 +423,6 @@ static inline unsigned long raw_copy_from_user(void *to,
return 0;
}
 
-   barrier_nospec();
   

Re: [PATCH 5/7] ASoC: imx-pcm-rpmsg: Add platform driver for audio base on rpmsg

2021-02-07 Thread Shengjiu Wang
On Fri, Feb 5, 2021 at 11:00 PM Mark Brown  wrote:
>
> On Fri, Feb 05, 2021 at 02:57:28PM +0800, Shengjiu Wang wrote:
>
> > + if (params_format(params) == SNDRV_PCM_FORMAT_S16_LE)
> > + msg->s_msg.param.format   = RPMSG_S16_LE;
> > + else if (params_format(params) == SNDRV_PCM_FORMAT_S24_LE)
>
> Again this should be a switch statement.
>
> > + if (params_channels(params) == 1)
> > + msg->s_msg.param.channels = RPMSG_CH_LEFT;
> > + else
> > + msg->s_msg.param.channels = RPMSG_CH_STEREO;
>
> Shouldn't this be reporting an error if the number of channels is more
> than 2?
>
> > + /*
> > +  * if the data in the buffer is less than one period
> > +  * send message immediately.
> > +  * if there is more than one period data, delay one
> > +  * period (timer) to send the message.
> > +  */
> > + if ((avail - writen_num * period_size) <= period_size) {
> > + imx_rpmsg_insert_workqueue(substream, msg, info);
> > + } else if (rpmsg->force_lpa && !timer_pending(timer)) {
> > + int time_msec;
> > +
> > + time_msec = (int)(runtime->period_size * 1000 / 
> > runtime->rate);
> > + mod_timer(timer, jiffies + 
> > msecs_to_jiffies(time_msec));
> > + }
>
> The comment here is at least confusing - why would we not send a full
> buffer immediately if we have one?  This sounds like it's the opposite
> way round to what we'd do if we were trying to cut down the number of
> messages.  It might help to say which buffer and where?
>
> > + /**
> > +  * Every work in the work queue, first we check if there
>
> /** comments are only for kerneldoc.

Thanks Mark, I will update them.

Best regards
wang shengjiu
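
A hypothetical rework along the lines of the comments above could look
like the fragment below. The field names are taken from the quoted hunk;
RPMSG_S24_LE and the -EINVAL error paths are assumptions made only for
illustration:

	switch (params_format(params)) {
	case SNDRV_PCM_FORMAT_S16_LE:
		msg->s_msg.param.format = RPMSG_S16_LE;
		break;
	case SNDRV_PCM_FORMAT_S24_LE:
		msg->s_msg.param.format = RPMSG_S24_LE;
		break;
	default:
		return -EINVAL;
	}

	switch (params_channels(params)) {
	case 1:
		msg->s_msg.param.channels = RPMSG_CH_LEFT;
		break;
	case 2:
		msg->s_msg.param.channels = RPMSG_CH_STEREO;
		break;
	default:
		return -EINVAL;
	}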


Re: [PATCH 4/7] ASoC: imx-audio-rpmsg: Add rpmsg_driver for audio channel

2021-02-07 Thread Shengjiu Wang
On Fri, Feb 5, 2021 at 10:27 PM Mark Brown  wrote:
>
> On Fri, Feb 05, 2021 at 02:57:27PM +0800, Shengjiu Wang wrote:
>
> > + /* TYPE C is notification from M core */
> > + if (r_msg->header.type == MSG_TYPE_C) {
> > + if (r_msg->header.cmd == TX_PERIOD_DONE) {
>
> > + } else if (r_msg->header.cmd == RX_PERIOD_DONE) {
>
> A switch statement would be clearer and more extensible...
>
> > + /* TYPE B is response msg */
> > + if (r_msg->header.type == MSG_TYPE_B) {
> > + memcpy(>r_msg, r_msg, sizeof(struct rpmsg_r_msg));
> > + complete(>cmd_complete);
> > + }
>
> ...and make this flow clearer for example.  Do we need to warn on
> unknown messages?

Thanks for reviewing. I will update them.

Best regards
wang shengjiu
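
A hypothetical restructuring of the callback along the lines of the
review above, dispatching on the message type with switch statements and
warning on anything unrecognised. The MSG_TYPE_*, TX_PERIOD_DONE and
RX_PERIOD_DONE names come from the quoted hunks; "info", "rpdev" and the
handler bodies are placeholders:

	switch (r_msg->header.type) {
	case MSG_TYPE_C:	/* notification from the M core */
		switch (r_msg->header.cmd) {
		case TX_PERIOD_DONE:
			/* handle TX period completion (placeholder) */
			break;
		case RX_PERIOD_DONE:
			/* handle RX period completion (placeholder) */
			break;
		default:
			dev_warn(&rpdev->dev, "unknown cmd %d\n", r_msg->header.cmd);
			break;
		}
		break;
	case MSG_TYPE_B:	/* response message */
		memcpy(&info->r_msg, r_msg, sizeof(struct rpmsg_r_msg));
		complete(&info->cmd_complete);
		break;
	default:
		dev_warn(&rpdev->dev, "unknown msg type %d\n", r_msg->header.type);
		break;
	}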