[powerpc:next] BUILD SUCCESS c28c2d4abdf95655001992c4f52dc243ba00cac3

2022-09-07 Thread kernel test robot
defconfig
arc         defconfig
alpha       defconfig
s390        allyesconfig
nios2       allyesconfig
nios2       defconfig
parisc      defconfig
parisc64    defconfig
parisc      allyesconfig
ia64        allmodconfig

clang tested configs:
x86_64      randconfig-a005
x86_64      randconfig-a003
x86_64      randconfig-a001
x86_64      randconfig-a012
x86_64      randconfig-a014
x86_64      randconfig-a016
riscv       randconfig-r042-20220907
hexagon     randconfig-r041-20220907
hexagon     randconfig-r045-20220907
s390        randconfig-r044-20220907
i386        randconfig-a002
i386        randconfig-a006
i386        randconfig-a004
x86_64      randconfig-k001
powerpc     tqm8540_defconfig
arm         spitz_defconfig
powerpc     mpc8315_rdb_defconfig
mips        ip22_defconfig
i386        randconfig-a011
i386        randconfig-a013
i386        randconfig-a015

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp


Re: [PATCH v5] livepatch: Clear relocation targets on a module removal

2022-09-07 Thread Russell Currey
On Thu, 2022-09-01 at 08:42 -0400, Joe Lawrence wrote:
> On Thu, Sep 01, 2022 at 01:39:02PM +1000, Michael Ellerman wrote:
> > Joe Lawrence  writes:
> > > On Thu, Sep 01, 2022 at 08:30:44AM +1000, Michael Ellerman wrote:
> > > > Joe Lawrence  writes:
> > ...
> > > 
> > > Hi Michael,
> > > 
> > > While we're on the topic of klp-relocations and Power, I saw a similar
> > > access problem when writing (late) relocations into
> > > .data..ro_after_init.  I'm not entirely convinced this should be allowed
> > > (ie, is it really read-only after .init or ???), but it seems that other
> > > arches currently allow it ...
> > 
> > I guess that's because we didn't properly fix apply_relocate_add() in
> > https://github.com/linuxppc/issues/issues/375 ?
> > 
> > If other arches allow it then we don't want to be the odd one out :)
> > 
> > So I guess we need to implement it.
> > 
> 
> FWIW, I think this particular relocation is pretty rare, we haven't
> seen it in real patches nor do we have a kpatch test that generates it.
> I only hit a crash as I was trying to write a more exhaustive test for
> the klp-convert implementation.

I'll revive my proper fix.  I stopped working on it since my previous
version was hitting endian bugs with some relocations & it didn't seem
necessary at the time.  Shouldn't take too much to get it going again.

> 
> > > = TEST: klp-convert data relocations (late module patching)
> > > =
> > > % modprobe test_klp_convert_data
> > > livepatch: enabling patch 'test_klp_convert_data'
> > > livepatch: 'test_klp_convert_data': starting patching transition
> > > livepatch: 'test_klp_convert_data': patching complete
> > > % modprobe test_klp_convert_mod
> > > ...
> > > module_64: Applying ADD relocate section 54 to 20
> > > module_64: RELOC at 8482d02a: 38-type as
> > > .klp.sym.test_klp_convert_mod.static_ro_after_init,0
> > > (0xc008016d0084) + 0
> > > BUG: Unable to handle kernel data access on write at
> > > 0xc008021d
> > > Faulting instruction address: 0xc0055f14
> > > Oops: Kernel access of bad area, sig: 11 [#1]
> > > LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> > > Modules linked in: test_klp_convert_mod(+)
> > > test_klp_convert_data(K) bonding rfkill tls pseries_rng drm fuse
> > > drm_panel_orientation_quirks xfs libcrc32c sd_mod t10_pi sg
> > > ibmvscsi ibmveth scsi_transport_srp vmx_crypto dm_mirror
> > > dm_region_hash dm_log dm_mod [last unloaded:
> > > test_klp_convert_mod]
> > > CPU: 0 PID: 17089 Comm: modprobe Kdump: loaded Tainted:
> > > G  K   5.19.0+ #1
> > > NIP:  c0055f14 LR: c021ef28 CTR: c0055f14
> > > REGS: c000387af5a0 TRAP: 0300   Tainted: G  K   
> > > (5.19.0+)
> > > MSR:  82009033   CR: 88228444 
> > > XER: 
> > > CFAR: c0055e04 DAR: c008021d DSISR: 4200
> > > IRQMASK: 0
> > > GPR00: c021ef28 c000387af840 c2a68a00
> > > c88b3000
> > > GPR04: c00802230084 0026 0036
> > > c008021e0480
> > > GPR08: 7c426214 c0055f14 c0055e08
> > > 0d80
> > > GPR12: c021d9b0 c2d9 c88b3000
> > > c008021f0810
> > > GPR16: c008021c0638 c88b3d80 
> > > c1181e38
> > > GPR20: c29dc088 c008021e0480 c008021f0870
> > > aaab
> > > GPR24: c88b3c40 c008021d c008021f
> > > 
> > > GPR28: c008021d  c008021c0638
> > > 0810
> > > NIP [c0055f14] apply_relocate_add+0x474/0x9e0
> > > LR [c021ef28] klp_apply_section_relocs+0x208/0x2d0
> > > Call Trace:
> > > [c000387af840] [c000387af920] 0xc000387af920
> > > (unreliable)
> > > [c000387af940] [c021ef28]
> > > klp_apply_section_relocs+0x208/0x2d0
> > > [c000387afa30] [c021f080]
> > > klp_init_object_loaded+0x90/0x1e0
> > > [c000387afac0] [c02200ac]
> > > klp_module_coming+0x3dc/0x5c0
> > > [c000387afb70] [c0231414] load_module+0xf64/0x13a0
> > > [c000387afc90] [c0231b8c]
> > > __do_sys_finit_module+0xdc/0x180
> > > [c000387afdb0] [c002f004]
> > > system_call_exception+0x164/0x340
> > > [c000387afe10] [c000be68]
> > > system_call_vectored_common+0xe8/0x278
> > > --- interrupt: 3000 at 0x7fffb6af4710
> > > NIP:  7fffb6af4710 LR:  CTR: 
> > > REGS: c000387afe80 TRAP: 3000   Tainted: G  K   
> > > (5.19.0+)
> > > MSR:  8000f033   CR:
> > > 48224244  XER: 
> > > IRQMASK: 0
> > > GPR00: 0161 7fffe06f5550 7fffb6bf7200
> > > 0005
> > > GPR04: 000105f36ca0  0005
> > > 
> > > GPR08:   
> > > 
> > > GPR12: 

[powerpc:fixes-test] BUILD SUCCESS 5f22270db76e8f1726b4287f6c0032d2ad1b9c52

2022-09-07 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes-test
branch HEAD: 5f22270db76e8f1726b4287f6c0032d2ad1b9c52  powerpc/pseries: Fix plpks crash on non-pseries

elapsed time: 729m

configs tested: 112
configs skipped: 107

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
um          x86_64_defconfig
um          i386_defconfig
x86_64      defconfig
x86_64      allyesconfig
x86_64      rhel-8.3
csky        allnoconfig
alpha       allnoconfig
arc         allnoconfig
riscv       allnoconfig
x86_64      rhel-8.3-kvm
x86_64      rhel-8.3-func
x86_64      rhel-8.3-syz
x86_64      rhel-8.3-kselftests
x86_64      rhel-8.3-kunit
m68k        allyesconfig
m68k        allmodconfig
arc         allyesconfig
alpha       allyesconfig
i386        allyesconfig
i386        defconfig
powerpc     allnoconfig
mips        allyesconfig
powerpc     allmodconfig
sh          allmodconfig
powerpc     sam440ep_defconfig
m68k        amiga_defconfig
powerpc     tqm8xx_defconfig
m68k        stmark2_defconfig
i386        randconfig-a012
i386        randconfig-a014
i386        randconfig-a016
arm64       allyesconfig
arm         defconfig
arm         allyesconfig
xtensa      virt_defconfig
ia64        bigsur_defconfig
arm         keystone_defconfig
sh          sh7785lcr_32bit_defconfig
m68k        multi_defconfig
arm         pxa_defconfig
arc         axs103_defconfig
mips        gcw0_defconfig
parisc64    alldefconfig
sparc       defconfig
sh          sh7710voipgw_defconfig
riscv       nommu_virt_defconfig
riscv       rv32_defconfig
riscv       nommu_k210_defconfig
i386        debian-10.3-kselftests
i386        debian-10.3
sparc       allnoconfig
arm         cerfcube_defconfig
powerpc     arches_defconfig
openrisc    simple_smp_defconfig
powerpc     asp8347_defconfig
sparc       alldefconfig
i386        alldefconfig
powerpc     ep8248e_defconfig
m68k        hp300_defconfig
m68k        m5272c3_defconfig
arm         exynos_defconfig
arm         pxa3xx_defconfig
arm         s3c6400_defconfig
powerpc     stx_gp3_defconfig
arm64       alldefconfig
sh          se7722_defconfig
x86_64      randconfig-a006
x86_64      randconfig-a004
x86_64      randconfig-a002
sh          sh03_defconfig
sh          se7750_defconfig
s390        allmodconfig
xtensa      common_defconfig
i386        randconfig-c001
sh          r7785rp_defconfig
arm         iop32x_defconfig
powerpc     mpc83xx_defconfig
xtensa      generic_kc705_defconfig
csky        defconfig
um          defconfig
sh          titan_defconfig
arm         mps2_defconfig
loongarch   defconfig
loongarch   allnoconfig
sh          r7780mp_defconfig
arm         qcom_defconfig
ia64        tiger_defconfig
arc         alldefconfig
x86_64      randconfig-a011
x86_64      randconfig-a013
x86_64      randconfig-a015

clang tested configs:
x86_64      randconfig-a005
x86_64      randconfig-a003
x86_64      randconfig-a001
x86_64      randconfig-a012
x86_64      randconfig-a014
x86_64      randconfig-a016
riscv       randconfig-r042-20220907
hexagon     randconfig-r041-20220907
hexagon     randconfig-r045

Re: [PATCH] powerpc/lib/xor_vmx: Relax frame size for clang

2022-09-07 Thread Michael Ellerman
Christophe Leroy  writes:
> On 21/06/2019 at 10:58, Mathieu Malaterre wrote:
>> When building with clang-8 the frame size limit is hit:
>> 
>>    ../arch/powerpc/lib/xor_vmx.c:119:6: error: stack frame size of 1200 bytes in function '__xor_altivec_5' [-Werror,-Wframe-larger-than=]
>> 
>> Follow the same approach as commit 9c87156cce5a ("powerpc/xmon: Relax
>> frame size for clang") until a proper fix is implemented upstream in
>> clang and relax requirement for clang.
>
> With Clang 14 I get the following errors, but only with KASAN selected.
>
>    CC      arch/powerpc/lib/xor_vmx.o
> arch/powerpc/lib/xor_vmx.c:95:6: error: stack frame size (1040) exceeds limit (1024) in '__xor_altivec_4' [-Werror,-Wframe-larger-than]
> void __xor_altivec_4(unsigned long bytes,
>      ^
> arch/powerpc/lib/xor_vmx.c:124:6: error: stack frame size (1312) exceeds limit (1024) in '__xor_altivec_5' [-Werror,-Wframe-larger-than]
> void __xor_altivec_5(unsigned long bytes,
>      ^

That's a 32-bit build?

> Is this patch still relevant ?

The clang issue was closed because a different change fixed the issue:

  https://github.com/ClangBuiltLinux/linux/issues/563

> Or should frame size be relaxed when KASAN is selected ? After all the 
> stack size is multiplied by 2 when we have KASAN, so maybe the warning 
> limit should be increased as well ?

Yeah that would make some sense.

On 64-bit the largest frame in that file is 1424, which is below the
default 2048 byte limit.

So maybe just increase it for 32-bit && KASAN.

What would be nice is if the FRAME_WARN value could be calculated as a
percentage of the THREAD_SHIFT, but that's not easily doable with the
way things are structured in Kconfig.
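
As a rough illustration of the percentage idea above, here is a minimal C
sketch of the arithmetic. Everything in it is an assumption made for
illustration: the helper names are hypothetical, the thread shifts of 13
(32-bit) and 14 (64-bit) are assumed values, and the 1/8 ratio was picked
only because it reproduces the 1024 and 2048 byte limits mentioned in this
thread under those assumed shifts. Kconfig cannot express this today.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: derive a frame-size warning limit from the kernel
 * stack size instead of hard-coding it. Illustrates the arithmetic only.
 */
static unsigned int frame_warn_bytes(unsigned int thread_shift, bool kasan)
{
	/* Assumption: KASAN doubles the stack, i.e. adds one to the shift. */
	unsigned int shift = thread_shift + (kasan ? 1u : 0u);
	unsigned long stack_bytes = 1ul << shift;

	/* Assumption: warn at 1/8 (12.5%) of the stack size. */
	return (unsigned int)(stack_bytes / 8);
}

/* Does a frame of the given size stay within the scaled limit? */
static bool frame_fits(unsigned int frame_bytes, unsigned int thread_shift,
		       bool kasan)
{
	return frame_bytes <= frame_warn_bytes(thread_shift, kasan);
}
```

Under these assumptions the 1312 byte frame of __xor_altivec_5 exceeds the
limit on 32-bit without KASAN, but fits once the limit scales with the
doubled KASAN stack.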

cheers


Re: [PATCH v2 0/7] Implement inline static calls on PPC32 - v2

2022-09-07 Thread Benjamin Gray
On Thu, 2022-09-01 at 16:46 +0000, Christophe Leroy wrote:
> Surprisingly, I get worse performance with inline static call than
> with out of line static call:

I'm not sure what hackbench is doing, but when microbenchmarking 64 bit
out-of-line calls in a loop I saw a similar thing where adding more
indirection improved the performance despite doing more work. The cause
seemed to be a combination of using older hardware and the target being
too short (just an integer increment). Moving to a newer machine and
adding a lot of NOPs to the target made the performance make sense.




Re: [v2 PATCH 1/2] mm: gup: fix the fast GUP race against THP collapse

2022-09-07 Thread John Hubbard

On 9/7/22 11:01, Yang Shi wrote:

Since general RCU GUP fast was introduced in commit 2667f50e8b81 ("mm:
introduce a general RCU get_user_pages_fast()"), a TLB flush is no longer
sufficient to handle concurrent GUP-fast in all cases, it only handles
traditional IPI-based GUP-fast correctly.  On architectures that send
an IPI broadcast on TLB flush, it works as expected.  But on the
architectures that do not use IPI to broadcast TLB flush, it may have
the below race:

        CPU A                                          CPU B
THP collapse                                     fast GUP
                                              gup_pmd_range() <-- see valid pmd
                                                  gup_pte_range() <-- work on pte
pmdp_collapse_flush() <-- clear pmd and flush
__collapse_huge_page_isolate()
    check page pinned <-- before GUP bump refcount
                                                      pin the page
                                                      check PTE <-- no change
__collapse_huge_page_copy()
    copy data to huge page
    ptep_clear()
install huge pmd for the huge page
                                                      return the stale page
discard the stale page

The race could be fixed by checking whether PMD is changed or not after
taking the page pin in fast GUP, just like what it does for PTE.  If the
PMD is changed it means there may be parallel THP collapse, so GUP
should back off.
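
The back-off described above can be modelled in userspace as follows. This
is a minimal sketch and not the kernel code: the toy_* types and
toy_fast_pin() are hypothetical stand-ins, C11 atomics replace the kernel's
page table accessors, and the caller passes in the snapshots so that a
concurrent collapse can be simulated between the snapshot and the pin.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for a page-table walk; none of these are kernel types. */
struct toy_page {
	atomic_int refcount;
};

struct toy_walk {
	_Atomic uint64_t *pmdp;	/* live PMD entry */
	_Atomic uint64_t *ptep;	/* live PTE entry */
	struct toy_page *page;	/* page the PTE maps */
};

/*
 * Model of the lockless fast path: the caller snapshotted both entries
 * earlier. Pin the page, then re-read both entries. If either changed, a
 * concurrent collapse may be in flight, so drop the pin and back off.
 */
static bool toy_fast_pin(struct toy_walk *w, uint64_t pmd_snap,
			 uint64_t pte_snap)
{
	atomic_fetch_add(&w->page->refcount, 1);

	if (atomic_load(w->pmdp) != pmd_snap ||
	    atomic_load(w->ptep) != pte_snap) {
		atomic_fetch_sub(&w->page->refcount, 1);
		return false;
	}
	return true;
}
```

Checking only the PTE would miss the case in the diagram, where the PMD is
cleared while the PTE still holds its old value; comparing the PMD snapshot
as well is what makes the walker notice.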

Also update the stale comment about serializing against fast GUP in
khugepaged.

Fixes: 2667f50e8b81 ("mm: introduce a general RCU get_user_pages_fast()")
Acked-by: David Hildenbrand 
Acked-by: Peter Xu 
Signed-off-by: Yang Shi 
---
v2: * Incorporated the comment from Peter about the comment.
 * Moved the comment right before gup_pte_range() instead of in the
   body of the function, per John.
 * Added patch 2/2 per Aneesh.

  mm/gup.c| 34 --
  mm/khugepaged.c | 10 ++
  2 files changed, 34 insertions(+), 10 deletions(-)



Looks good.

Reviewed-by: John Hubbard 

thanks,
--
John Hubbard
NVIDIA


diff --git a/mm/gup.c b/mm/gup.c
index f3fc1f08d90c..40aa1c937212 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2380,8 +2380,28 @@ static void __maybe_unused undo_dev_pagemap(int *nr, int 
nr_start,
  }
  
  #ifdef CONFIG_ARCH_HAS_PTE_SPECIAL

-static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
-unsigned int flags, struct page **pages, int *nr)
+/*
+ * Fast-gup relies on pte change detection to avoid concurrent pgtable
+ * operations.
+ *
+ * To pin the page, fast-gup needs to do below in order:
+ * (1) pin the page (by prefetching pte), then (2) check pte not changed.
+ *
+ * For the rest of pgtable operations where pgtable updates can be racy
+ * with fast-gup, we need to do (1) clear pte, then (2) check whether page
+ * is pinned.
+ *
+ * Above will work for all pte-level operations, including THP split.
+ *
+ * For THP collapse, it's a bit more complicated because fast-gup may be
+ * walking a pgtable page that is being freed (pte is still valid but pmd
+ * can be cleared already).  To avoid race in such condition, we need to
+ * also check pmd here to make sure pmd doesn't change (corresponds to
+ * pmdp_collapse_flush() in the THP collapse code path).
+ */
+static int gup_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
+unsigned long end, unsigned int flags,
+struct page **pages, int *nr)
  {
struct dev_pagemap *pgmap = NULL;
int nr_start = *nr, ret = 0;
@@ -2423,7 +2443,8 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, 
unsigned long end,
goto pte_unmap;
}
  
-		if (unlikely(pte_val(pte) != pte_val(*ptep))) {

+   if (unlikely(pmd_val(pmd) != pmd_val(*pmdp)) ||
+   unlikely(pte_val(pte) != pte_val(*ptep))) {
gup_put_folio(folio, 1, flags);
goto pte_unmap;
}
@@ -2470,8 +2491,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, 
unsigned long end,
   * get_user_pages_fast_only implementation that can pin pages. Thus it's still
   * useful to have gup_huge_pmd even if we can't operate on ptes.
   */
-static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
-unsigned int flags, struct page **pages, int *nr)
+static int gup_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
+unsigned long end, unsigned int flags,
+struct page **pages, int *nr)
  {
return 0;
  }
@@ -2791,7 +2813,7 @@ static int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned 
long addr, unsigned lo
if (!gup_huge_pd(__hugepd(pmd_val(pmd)), addr,
 PMD_SHIFT, next, flags, pages, nr))
return 0;
- 

Re: [PATCH][next] powerpc: Fix fall-through warning for Clang

2022-09-07 Thread Kees Cook
On Tue, Sep 06, 2022 at 10:32:13PM +0100, Gustavo A. R. Silva wrote:
> Fix the following fallthrough warning:
> 
> arch/powerpc/platforms/85xx/mpc85xx_cds.c:161:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
> 
> Link: https://github.com/KSPP/linux/issues/198
> Reported-by: kernel test robot 
> Link: https://lore.kernel.org/lkml/202209061224.kxorrgvg-...@intel.com/
> Signed-off-by: Gustavo A. R. Silva 

Thanks!

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH] powerpc/83xx: update kmeter1 defconfig and dts

2022-09-07 Thread Michael Ellerman
Christophe Leroy  writes:
> On 16/12/2019 at 10:50, Holger Brunck wrote:
>> From: Matteo Ghidoni 
>> 
>> The defconfig is synchronized and the missing
>> MTD_PHYSMAP, DEVTMPFS and I2C MUX support are switched on.
>> 
>> Additionally the I2C mux device is added to the DTS with
>> its attached temperature sensors and I2C clock frequency
>> is lowered.
>
> This patch doesn't apply.
>
> Is it still relevant ?

If so it should be split into two patches.

cheers


[PATCH] Revert "powerpc/rtas: Implement reentrant rtas call"

2022-09-07 Thread Nathan Lynch
At the time this was submitted by Leonardo, I confirmed -- or thought
I had confirmed -- with PowerVM partition firmware development that
the following RTAS functions:

- ibm,get-xive
- ibm,int-off
- ibm,int-on
- ibm,set-xive

were safe to call on multiple CPUs simultaneously, not only with
respect to themselves as indicated by PAPR, but with arbitrary other
RTAS calls:

https://lore.kernel.org/linuxppc-dev/875zcy2v8o@linux.ibm.com/

Recent discussion with firmware development makes it clear that this
is not true, and that the code in commit b664db8e3f97 ("powerpc/rtas:
Implement reentrant rtas call") is unsafe, likely explaining several
strange bugs we've seen in internal testing involving DLPAR and
LPM. These scenarios use ibm,configure-connector, whose internal state
can be corrupted by the concurrent use of the "reentrant" functions,
leading to symptoms like endless busy statuses from RTAS.

Signed-off-by: Nathan Lynch 
Fixes: b664db8e3f97 ("powerpc/rtas: Implement reentrant rtas call")
Cc: sta...@vger.kernel.org
---
 arch/powerpc/include/asm/paca.h |  1 -
 arch/powerpc/include/asm/rtas.h |  1 -
 arch/powerpc/kernel/paca.c  | 32 -
 arch/powerpc/kernel/rtas.c  | 54 -
 arch/powerpc/sysdev/xics/ics-rtas.c | 22 ++--
 5 files changed, 11 insertions(+), 99 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 4d7aaab82702..3537b0500f4d 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -263,7 +263,6 @@ struct paca_struct {
u64 l1d_flush_size;
 #endif
 #ifdef CONFIG_PPC_PSERIES
-   struct rtas_args *rtas_args_reentrant;
u8 *mce_data_buf;   /* buffer to hold per cpu rtas errlog */
 #endif /* CONFIG_PPC_PSERIES */
 
diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 00531af17ce0..56319aea646e 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -240,7 +240,6 @@ extern struct rtas_t rtas;
 extern int rtas_token(const char *service);
 extern int rtas_service_present(const char *service);
 extern int rtas_call(int token, int, int, int *, ...);
-int rtas_call_reentrant(int token, int nargs, int nret, int *outputs, ...);
 void rtas_call_unlocked(struct rtas_args *args, int token, int nargs,
int nret, ...);
 extern void __noreturn rtas_restart(char *cmd);
diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index ba593fd60124..dfd097b79160 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -16,7 +16,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "setup.h"
 
@@ -170,30 +169,6 @@ static struct slb_shadow * __init new_slb_shadow(int cpu, 
unsigned long limit)
 }
 #endif /* CONFIG_PPC_64S_HASH_MMU */
 
-#ifdef CONFIG_PPC_PSERIES
-/**
- * new_rtas_args() - Allocates rtas args
- * @cpu:   CPU number
- * @limit: Memory limit for this allocation
- *
- * Allocates a struct rtas_args and return it's pointer,
- * if not in Hypervisor mode
- *
- * Return: Pointer to allocated rtas_args
- * NULL if CPU in Hypervisor Mode
- */
-static struct rtas_args * __init new_rtas_args(int cpu, unsigned long limit)
-{
-   limit = min_t(unsigned long, limit, RTAS_INSTANTIATE_MAX);
-
-   if (early_cpu_has_feature(CPU_FTR_HVMODE))
-   return NULL;
-
-   return alloc_paca_data(sizeof(struct rtas_args), L1_CACHE_BYTES,
-  limit, cpu);
-}
-#endif /* CONFIG_PPC_PSERIES */
-
 /* The Paca is an array with one entry per processor.  Each contains an
  * lppaca, which contains the information shared between the
  * hypervisor and Linux.
@@ -232,10 +207,6 @@ void __init initialise_paca(struct paca_struct *new_paca, 
int cpu)
/* For now -- if we have threads this will be adjusted later */
new_paca->tcd_ptr = _paca->tcd;
 #endif
-
-#ifdef CONFIG_PPC_PSERIES
-   new_paca->rtas_args_reentrant = NULL;
-#endif
 }
 
 /* Put the paca pointer into r13 and SPRG_PACA */
@@ -307,9 +278,6 @@ void __init allocate_paca(int cpu)
 #endif
 #ifdef CONFIG_PPC_64S_HASH_MMU
paca->slb_shadow_ptr = new_slb_shadow(cpu, limit);
-#endif
-#ifdef CONFIG_PPC_PSERIES
-   paca->rtas_args_reentrant = new_rtas_args(cpu, limit);
 #endif
paca_struct_size += sizeof(struct paca_struct);
 }
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 693133972294..0b8a858aa847 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -43,7 +43,6 @@
 #include 
 #include 
 #include 
-#include 
 
 /* This is here deliberately so it's only used in this file */
 void enter_rtas(unsigned long);
@@ -932,59 +931,6 @@ void rtas_activate_firmware(void)
pr_err("ibm,activate-firmware failed (%i)\n", fwrc);
 }
 
-#ifdef CONFIG_PPC_PSERIES
-/**
- * rtas_call_reentrant() - Used for reentrant rtas calls
- * @token:  

Re: [PATCH v4 4/4] selftests/hmm-tests: Add test for dirty bits

2022-09-07 Thread John Hubbard

On 9/7/22 04:13, Alistair Popple wrote:

+   /*
+* Attempt to migrate memory to device, which should fail because
+* hopefully some pages are backed by swap storage.
+*/
+   ASSERT_TRUE(hmm_migrate_sys_to_dev(self->fd, buffer, npages));


Are you really sure that you want to assert on that? Because doing so
guarantees a test failure if and when we ever upgrade the kernel to
be able to migrate swap-backed pages. And I seem to recall that this
current inability to migrate swap-backed pages is considered a flaw
to be fixed, right?


Right, that's a good point. I was using failure (ASSERT_TRUE) here as a
way of detecting that at least some pages are swap-backed, because if no
pages end up being swap-backed the test is invalid.


Yes. But "invalid" or "waived" is a much different test result than
"failed".



I'm not really sure what to do about it though. It's likely the fix for


Remove the assert. If the test framework allows and you prefer, you
can print a warning.


swap-backed migration may make this bug impossible to hit anyway,
because the obvious fix is to just drop the pages from the swapcache
during migration which would force writeback during subsequent reclaim.

So I'm inclined to leave this here even if it only serves to remind us
about it when we do fix migration of swap-backed pages, because we will
of course run hmm-tests before submitting that fix :-) We can then
either fix the test or drop it if we think it's no longer possible to
hit.


Oh no no no, please. This is not how to do tests. If you want a TODO
list somewhere, there are other ways. But tests that require maintenance
when you change something are an anti-pattern.
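
To make the "waived" versus "failed" distinction concrete, here is a minimal
C sketch. It is an illustration under stated assumptions, not the hmm-tests
code: classify() and its two boolean inputs are hypothetical, and reporting
a skip is one possible way to express "waived". The numeric values mirror
the pass, fail and skip exit codes defined in kselftest.h.

```c
#include <stdbool.h>

/* Mirrors the pass/fail/skip exit codes from kselftest.h. */
enum toy_result {
	TOY_PASS = 0,
	TOY_FAIL = 1,
	TOY_SKIP = 4,
};

/*
 * Hypothetical result classification for a test whose precondition is
 * "some pages are swap-backed", signalled here by migration failing.
 */
static enum toy_result classify(bool migration_failed,
				bool dirty_bits_preserved)
{
	/* Precondition not met: the test proves nothing, so skip it. */
	if (!migration_failed)
		return TOY_SKIP;

	/* Precondition met: the real check decides pass or fail. */
	return dirty_bits_preserved ? TOY_PASS : TOY_FAIL;
}
```

With this shape, a future kernel that can migrate swap-backed pages turns
the test into a skip rather than a failure, so the test needs no edit when
that change lands.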


thanks,
--
John Hubbard
NVIDIA



Re: [v2 PATCH 1/2] mm: gup: fix the fast GUP race against THP collapse

2022-09-07 Thread Yang Shi
On Wed, Sep 7, 2022 at 2:22 PM Andrew Morton  wrote:
>
> On Wed,  7 Sep 2022 11:01:43 -0700 Yang Shi  wrote:
>
> > Since general RCU GUP fast was introduced in commit 2667f50e8b81 ("mm:
> > introduce a general RCU get_user_pages_fast()"), a TLB flush is no longer
> > sufficient to handle concurrent GUP-fast in all cases, it only handles
> > traditional IPI-based GUP-fast correctly.  On architectures that send
> > an IPI broadcast on TLB flush, it works as expected.  But on the
> > architectures that do not use IPI to broadcast TLB flush, it may have
> > the below race:
> >
> >         CPU A                                          CPU B
> > THP collapse                                     fast GUP
> >                                               gup_pmd_range() <-- see valid pmd
> >                                                   gup_pte_range() <-- work on pte
> > pmdp_collapse_flush() <-- clear pmd and flush
> > __collapse_huge_page_isolate()
> >     check page pinned <-- before GUP bump refcount
> >                                                       pin the page
> >                                                       check PTE <-- no change
> > __collapse_huge_page_copy()
> >     copy data to huge page
> >     ptep_clear()
> > install huge pmd for the huge page
> >                                                       return the stale page
> > discard the stale page
> >
> > The race could be fixed by checking whether PMD is changed or not after
> > taking the page pin in fast GUP, just like what it does for PTE.  If the
> > PMD is changed it means there may be parallel THP collapse, so GUP
> > should back off.
> >
> > Also update the stale comment about serializing against fast GUP in
> > khugepaged.
> >
> > Fixes: 2667f50e8b81 ("mm: introduce a general RCU get_user_pages_fast()")
>
> Is this not worth a -stable backport?

Yes, I think it is.


Re: [v2 PATCH 1/2] mm: gup: fix the fast GUP race against THP collapse

2022-09-07 Thread Andrew Morton
On Wed,  7 Sep 2022 11:01:43 -0700 Yang Shi  wrote:

> Since general RCU GUP fast was introduced in commit 2667f50e8b81 ("mm:
> introduce a general RCU get_user_pages_fast()"), a TLB flush is no longer
> sufficient to handle concurrent GUP-fast in all cases, it only handles
> traditional IPI-based GUP-fast correctly.  On architectures that send
> an IPI broadcast on TLB flush, it works as expected.  But on the
> architectures that do not use IPI to broadcast TLB flush, it may have
> the below race:
> 
>         CPU A                                          CPU B
> THP collapse                                     fast GUP
>                                               gup_pmd_range() <-- see valid pmd
>                                                   gup_pte_range() <-- work on pte
> pmdp_collapse_flush() <-- clear pmd and flush
> __collapse_huge_page_isolate()
>     check page pinned <-- before GUP bump refcount
>                                                       pin the page
>                                                       check PTE <-- no change
> __collapse_huge_page_copy()
>     copy data to huge page
>     ptep_clear()
> install huge pmd for the huge page
>                                                       return the stale page
> discard the stale page
> 
> The race could be fixed by checking whether PMD is changed or not after
> taking the page pin in fast GUP, just like what it does for PTE.  If the
> PMD is changed it means there may be parallel THP collapse, so GUP
> should back off.
> 
> Also update the stale comment about serializing against fast GUP in
> khugepaged.
> 
> Fixes: 2667f50e8b81 ("mm: introduce a general RCU get_user_pages_fast()")

Is this not worth a -stable backport?


[powerpc:next 26/63] arch/powerpc/math-emu/math_efp.c:177:5: error: no previous prototype for function 'do_spe_mathemu'

2022-09-07 Thread kernel test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
head:   c28c2d4abdf95655001992c4f52dc243ba00cac3
commit: 7245fc5bb7a966852d5bd7779d1f5855530b461a [26/63] powerpc/math-emu: Remove -w build flag and fix warnings
config: powerpc-tqm8540_defconfig (https://download.01.org/0day-ci/archive/20220908/202209080432.farx6r6a-...@intel.com/config)
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project c55b41d5199d2394dd6cdb8f52180d8b81d809d4)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc cross compiling tool for clang build
        # apt-get install binutils-powerpc-linux-gnu
        # https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id=7245fc5bb7a966852d5bd7779d1f5855530b461a
        git remote add powerpc https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git
        git fetch --no-tags powerpc next
        git checkout 7245fc5bb7a966852d5bd7779d1f5855530b461a
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=powerpc SHELL=/bin/bash arch/powerpc/math-emu/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> arch/powerpc/math-emu/fre.c:6:5: error: no previous prototype for function 'fre' [-Werror,-Wmissing-prototypes]
   int fre(void *frD, void *frB)
       ^
   arch/powerpc/math-emu/fre.c:6:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int fre(void *frD, void *frB)
   ^
   static 
   1 error generated.
--
>> arch/powerpc/math-emu/fsqrt.c:11:1: error: no previous prototype for function 'fsqrt' [-Werror,-Wmissing-prototypes]
   fsqrt(void *frD, void *frB)
   ^
   arch/powerpc/math-emu/fsqrt.c:10:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int
   ^
   static 
   1 error generated.
--
>> arch/powerpc/math-emu/fsqrts.c:12:1: error: no previous prototype for function 'fsqrts' [-Werror,-Wmissing-prototypes]
   fsqrts(void *frD, void *frB)
   ^
   arch/powerpc/math-emu/fsqrts.c:11:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int
   ^
   static 
   1 error generated.
--
>> arch/powerpc/math-emu/frsqrtes.c:6:5: error: no previous prototype for function 'frsqrtes' [-Werror,-Wmissing-prototypes]
   int frsqrtes(void *frD, void *frB)
       ^
   arch/powerpc/math-emu/frsqrtes.c:6:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int frsqrtes(void *frD, void *frB)
   ^
   static 
   1 error generated.
--
>> arch/powerpc/math-emu/mtfsf.c:10:1: error: no previous prototype for function 'mtfsf' [-Werror,-Wmissing-prototypes]
   mtfsf(unsigned int FM, u32 *frB)
   ^
   arch/powerpc/math-emu/mtfsf.c:9:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int
   ^
   static 
   1 error generated.
--
>> arch/powerpc/math-emu/mtfsfi.c:10:1: error: no previous prototype for function 'mtfsfi' [-Werror,-Wmissing-prototypes]
   mtfsfi(unsigned int crfD, unsigned int IMM)
   ^
   arch/powerpc/math-emu/mtfsfi.c:9:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int
   ^
   static 
   1 error generated.
--
>> arch/powerpc/math-emu/fabs.c:7:1: error: no previous prototype for function 'fabs' [-Werror,-Wmissing-prototypes]
   fabs(u32 *frD, u32 *frB)
   ^
   arch/powerpc/math-emu/fabs.c:6:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int
   ^
   static 
   1 error generated.
--
>> arch/powerpc/math-emu/fadd.c:11:1: error: no previous prototype for function 'fadd' [-Werror,-Wmissing-prototypes]
   fadd(void *frD, void *frA, void *frB)
   ^
   arch/powerpc/math-emu/fadd.c:10:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int
   ^
   static 
   1 error generated.
--
>> arch/powerpc/math-emu/fadds.c:12:1: error: no previous prototype for function 'fadds' [-Werror,-Wmissing-prototypes]
   fadds(void *frD, void *frA, void *frB)
   ^
   arch/powerpc/math-emu/fadds.c:11:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int
   ^
   static 
   1 error generated.
--
>> arch/powerpc/math-emu/fcmpo.c:11:1: error: no previous prototype for function 'fcmpo' [-Werror,-Wmissing-prototypes]
   fcmpo(u32 *ccr, int crfD, void *frA, void *frB)
   ^
   arch/powerpc/math-emu/fcmpo.c:10:1: note: declare 'static' if the function is not intended to be used outside of this 

Re: [v2 PATCH 2/2] powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flush

2022-09-07 Thread Peter Xu
On Wed, Sep 07, 2022 at 11:01:44AM -0700, Yang Shi wrote:
> The IPI broadcast is used to serialize against fast-GUP, but fast-GUP
> will move to use RCU instead of disabling local interrupts in fast-GUP.
> Using an IPI is the old-styled way of serializing against fast-GUP
> although it still works as expected now.
> 
> And fast-GUP now fixed the potential race with THP collapse by checking
> whether PMD is changed or not.  So IPI broadcast in radix pmd collapse
> flush is not necessary anymore.  But it is still needed for hash TLB.
> 
> Suggested-by: Aneesh Kumar K.V 
> Signed-off-by: Yang Shi 

Acked-by: Peter Xu 

-- 
Peter Xu



Re: [v2 PATCH 2/2] powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flush

2022-09-07 Thread David Hildenbrand

On 07.09.22 20:01, Yang Shi wrote:

The IPI broadcast is used to serialize against fast-GUP, but fast-GUP
will move to use RCU instead of disabling local interrupts in fast-GUP.
Using an IPI is the old-styled way of serializing against fast-GUP
although it still works as expected now.

And fast-GUP now fixed the potential race with THP collapse by checking
whether PMD is changed or not.  So IPI broadcast in radix pmd collapse
flush is not necessary anymore.  But it is still needed for hash TLB.

Suggested-by: Aneesh Kumar K.V 
Signed-off-by: Yang Shi 
---
  arch/powerpc/mm/book3s64/radix_pgtable.c | 9 -
  1 file changed, 9 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c 
b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 698274109c91..e712f80fe189 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -937,15 +937,6 @@ pmd_t radix__pmdp_collapse_flush(struct vm_area_struct 
*vma, unsigned long addre
pmd = *pmdp;
pmd_clear(pmdp);
  
-	/*

-* pmdp collapse_flush need to ensure that there are no parallel gup
-* walk after this call. This is needed so that we can have stable
-* page ref count when collapsing a page. We don't allow a collapse page
-* if we have gup taken on the page. We can ensure that by sending IPI
-* because gup walk happens with IRQ disabled.
-*/
-   serialize_against_pte_lookup(vma->vm_mm);
-
radix__flush_tlb_collapsed_pmd(vma->vm_mm, address);
  
  	return pmd;


Makes sense to me

Acked-by: David Hildenbrand 

--
Thanks,

David / dhildenb



[v2 PATCH 1/2] mm: gup: fix the fast GUP race against THP collapse

2022-09-07 Thread Yang Shi
Since general RCU GUP fast was introduced in commit 2667f50e8b81 ("mm:
introduce a general RCU get_user_pages_fast()"), a TLB flush is no longer
sufficient to handle concurrent GUP-fast in all cases, it only handles
traditional IPI-based GUP-fast correctly.  On architectures that send
an IPI broadcast on TLB flush, it works as expected.  But on the
architectures that do not use IPI to broadcast TLB flush, it may have
the below race:

   CPU A  CPU B
THP collapse fast GUP
  gup_pmd_range() <-- see valid pmd
  gup_pte_range() <-- work on 
pte
pmdp_collapse_flush() <-- clear pmd and flush
__collapse_huge_page_isolate()
check page pinned <-- before GUP bump refcount
  pin the page
  check PTE <-- no change
__collapse_huge_page_copy()
copy data to huge page
ptep_clear()
install huge pmd for the huge page
  return the stale page
discard the stale page

The race could be fixed by checking whether PMD is changed or not after
taking the page pin in fast GUP, just like what it does for PTE.  If the
PMD is changed it means there may be parallel THP collapse, so GUP
should back off.

Also update the stale comment about serializing against fast GUP in
khugepaged.

Fixes: 2667f50e8b81 ("mm: introduce a general RCU get_user_pages_fast()")
Acked-by: David Hildenbrand 
Acked-by: Peter Xu 
Signed-off-by: Yang Shi 
---
v2: * Incorporated the comment from Peter about the comment.
* Moved the comment right before gup_pte_range() instead of in the
  body of the function, per John.
* Added patch 2/2 per Aneesh.

 mm/gup.c| 34 --
 mm/khugepaged.c | 10 ++
 2 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index f3fc1f08d90c..40aa1c937212 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2380,8 +2380,28 @@ static void __maybe_unused undo_dev_pagemap(int *nr, int 
nr_start,
 }
 
 #ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
-static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
-unsigned int flags, struct page **pages, int *nr)
+/*
+ * Fast-gup relies on pte change detection to avoid concurrent pgtable
+ * operations.
+ *
+ * To pin the page, fast-gup needs to do below in order:
+ * (1) pin the page (by prefetching pte), then (2) check pte not changed.
+ *
+ * For the rest of pgtable operations where pgtable updates can be racy
+ * with fast-gup, we need to do (1) clear pte, then (2) check whether page
+ * is pinned.
+ *
+ * Above will work for all pte-level operations, including THP split.
+ *
+ * For THP collapse, it's a bit more complicated because fast-gup may be
+ * walking a pgtable page that is being freed (pte is still valid but pmd
+ * can be cleared already).  To avoid race in such condition, we need to
+ * also check pmd here to make sure pmd doesn't change (corresponds to
+ * pmdp_collapse_flush() in the THP collapse code path).
+ */
+static int gup_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
+unsigned long end, unsigned int flags,
+struct page **pages, int *nr)
 {
struct dev_pagemap *pgmap = NULL;
int nr_start = *nr, ret = 0;
@@ -2423,7 +2443,8 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, 
unsigned long end,
goto pte_unmap;
}
 
-   if (unlikely(pte_val(pte) != pte_val(*ptep))) {
+   if (unlikely(pmd_val(pmd) != pmd_val(*pmdp)) ||
+   unlikely(pte_val(pte) != pte_val(*ptep))) {
gup_put_folio(folio, 1, flags);
goto pte_unmap;
}
@@ -2470,8 +2491,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, 
unsigned long end,
  * get_user_pages_fast_only implementation that can pin pages. Thus it's still
  * useful to have gup_huge_pmd even if we can't operate on ptes.
  */
-static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
-unsigned int flags, struct page **pages, int *nr)
+static int gup_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
+unsigned long end, unsigned int flags,
+struct page **pages, int *nr)
 {
return 0;
 }
@@ -2791,7 +2813,7 @@ static int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned 
long addr, unsigned lo
if (!gup_huge_pd(__hugepd(pmd_val(pmd)), addr,
 PMD_SHIFT, next, flags, pages, nr))
return 0;
-   } else if (!gup_pte_range(pmd, addr, next, flags, pages, nr))
+   } else if (!gup_pte_range(pmd, pmdp, 

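Illustration only, not part of the posted series: the ordering described
in the commit message above (snapshot the PMD and PTE, pin the page, then
re-check both) can be modelled in plain userspace C. Everything here is
hypothetical: struct toy_mm, toy_gup_fast() and the hook parameter are
invented for the sketch, and a single-threaded hook stands in for the
concurrent THP collapse running on the other CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: one PMD entry, one PTE entry, one page refcount. */
struct toy_mm {
	uint64_t pmd;		/* nonzero: points at a PTE table */
	uint64_t pte;		/* nonzero: maps the page */
	int page_refcount;	/* 1: only the mapping holds the page */
};

/* Lets a caller interleave a "collapse" between the walker's steps. */
typedef void (*toy_race_hook)(struct toy_mm *mm);

/* Stand-in for the PMD clear done by the collapse path. */
static void toy_collapse_clear_pmd(struct toy_mm *mm)
{
	mm->pmd = 0;
}

/* Returns true if the page was pinned and is handed back to the caller. */
static bool toy_gup_fast(struct toy_mm *mm, bool check_pmd, toy_race_hook hook)
{
	uint64_t pmd, pte;

	assert(mm != NULL);

	pmd = mm->pmd;		/* walker sees a valid pmd */
	if (!pmd)
		return false;
	pte = mm->pte;		/* walker starts working on the pte */
	if (!pte)
		return false;

	if (hook)
		hook(mm);	/* the other CPU clears the pmd here */

	mm->page_refcount++;	/* (1) pin the page */

	/* (2) re-check; back off and drop the pin if anything changed */
	if ((check_pmd && pmd != mm->pmd) || pte != mm->pte) {
		mm->page_refcount--;
		return false;
	}
	return true;
}
```

With check_pmd false the walker still returns true after the hook has
cleared the PMD, which is the stale-page case in the diagram. With
check_pmd true it drops the pin and backs off.
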
[v2 PATCH 2/2] powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flush

2022-09-07 Thread Yang Shi
The IPI broadcast is used to serialize against fast-GUP, but fast-GUP
will move to use RCU instead of disabling local interrupts in fast-GUP.
Using an IPI is the old-styled way of serializing against fast-GUP
although it still works as expected now.

And fast-GUP now fixed the potential race with THP collapse by checking
whether PMD is changed or not.  So IPI broadcast in radix pmd collapse
flush is not necessary anymore.  But it is still needed for hash TLB.

Suggested-by: Aneesh Kumar K.V 
Signed-off-by: Yang Shi 
---
 arch/powerpc/mm/book3s64/radix_pgtable.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c 
b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 698274109c91..e712f80fe189 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -937,15 +937,6 @@ pmd_t radix__pmdp_collapse_flush(struct vm_area_struct 
*vma, unsigned long addre
pmd = *pmdp;
pmd_clear(pmdp);
 
-   /*
-* pmdp collapse_flush need to ensure that there are no parallel gup
-* walk after this call. This is needed so that we can have stable
-* page ref count when collapsing a page. We don't allow a collapse page
-* if we have gup taken on the page. We can ensure that by sending IPI
-* because gup walk happens with IRQ disabled.
-*/
-   serialize_against_pte_lookup(vma->vm_mm);
-
radix__flush_tlb_collapsed_pmd(vma->vm_mm, address);
 
return pmd;
-- 
2.26.3



Re: [PATCH] powerpc/83xx: update kmeter1 defconfig and dts

2022-09-07 Thread Christophe Leroy


Le 16/12/2019 à 10:50, Holger Brunck a écrit :
> From: Matteo Ghidoni 
> 
> The defconfig is synchronized and the missing
> MTD_PHYSMAP, DEVTMPFS and I2C MUX support are switched on.
> 
> Additionally the I2C mux device is added to the DTS with
> its attached temperature sensors and I2C clock frequency
> is lowered.

This patch doesn't apply.

Is it still relevant ?

Thanks
Christophe

> 
> Signed-off-by: Matteo Ghidoni 
> Signed-off-by: Holger Brunck 
> CC: Heiko Schocher 
> CC: Michael Ellerman 
> ---
>   arch/powerpc/boot/dts/kmeter1.dts   | 40 
> -
>   arch/powerpc/configs/83xx/kmeter1_defconfig | 20 +--
>   2 files changed, 51 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/boot/dts/kmeter1.dts 
> b/arch/powerpc/boot/dts/kmeter1.dts
> index 154f5d293fd3..bc33f3ad19a3 100644
> --- a/arch/powerpc/boot/dts/kmeter1.dts
> +++ b/arch/powerpc/boot/dts/kmeter1.dts
> @@ -70,7 +70,45 @@
>   reg = <0x3000 0x100>;
>   interrupts = <14 0x8>;
>   interrupt-parent = <>;
> - clock-frequency = <40>;
> + clock-frequency = <10>;
> +
> + mux@70 {
> + compatible = "nxp,pca9547";
> + reg = <0x70>;
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + i2c@2 {
> + reg = <2>;
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + /* Temperature sensors */
> + temp@48 {
> + label = "Top";
> + compatible = "national,lm75";
> + reg = <0x48>;
> + };
> +
> + temp@49 {
> + label = "Control";
> + compatible = "national,lm75";
> + reg = <0x49>;
> + };
> +
> + temp@4a {
> + label = "Power";
> + compatible = "national,lm75";
> + reg = <0x4a>;
> + };
> +
> + temp@4b {
> + label = "Front";
> + compatible = "national,lm75";
> + reg = <0x4b>;
> + };
> + };
> + };
>   };
>   
>   serial0: serial@4500 {
> diff --git a/arch/powerpc/configs/83xx/kmeter1_defconfig 
> b/arch/powerpc/configs/83xx/kmeter1_defconfig
> index 648c6b3dccf9..72abd8ae654a 100644
> --- a/arch/powerpc/configs/83xx/kmeter1_defconfig
> +++ b/arch/powerpc/configs/83xx/kmeter1_defconfig
> @@ -3,22 +3,20 @@ CONFIG_SYSVIPC=y
>   CONFIG_POSIX_MQUEUE=y
>   CONFIG_NO_HZ=y
>   CONFIG_HIGH_RES_TIMERS=y
> +CONFIG_PREEMPT=y
>   CONFIG_LOG_BUF_SHIFT=14
>   CONFIG_EXPERT=y
>   CONFIG_SLAB=y
> -CONFIG_MODULES=y
> -CONFIG_MODULE_UNLOAD=y
> -# CONFIG_BLK_DEV_BSG is not set
> -CONFIG_PARTITION_ADVANCED=y
> -# CONFIG_MSDOS_PARTITION is not set
> -# CONFIG_IOSCHED_DEADLINE is not set
> -# CONFIG_IOSCHED_CFQ is not set
>   # CONFIG_PPC_CHRP is not set
>   # CONFIG_PPC_PMAC is not set
>   CONFIG_PPC_83xx=y
>   CONFIG_KMETER1=y
> -CONFIG_PREEMPT=y
>   # CONFIG_SECCOMP is not set
> +CONFIG_MODULES=y
> +CONFIG_MODULE_UNLOAD=y
> +# CONFIG_BLK_DEV_BSG is not set
> +CONFIG_PARTITION_ADVANCED=y
> +# CONFIG_MSDOS_PARTITION is not set
>   CONFIG_NET=y
>   CONFIG_PACKET=y
>   CONFIG_UNIX=y
> @@ -29,12 +27,15 @@ CONFIG_IP_PNP=y
>   CONFIG_TIPC=y
>   CONFIG_BRIDGE=m
>   CONFIG_VLAN_8021Q=y
> +CONFIG_DEVTMPFS=y
> +CONFIG_DEVTMPFS_MOUNT=y
>   CONFIG_MTD=y
>   CONFIG_MTD_CMDLINE_PARTS=y
>   CONFIG_MTD_BLOCK=y
>   CONFIG_MTD_CFI=y
>   CONFIG_MTD_CFI_INTELEXT=y
>   CONFIG_MTD_CFI_AMDSTD=y
> +CONFIG_MTD_PHYSMAP=y
>   CONFIG_MTD_PHYSMAP_OF=y
>   CONFIG_MTD_PHRAM=y
>   CONFIG_MTD_UBI=y
> @@ -57,7 +58,10 @@ CONFIG_SERIAL_8250_CONSOLE=y
>   CONFIG_HW_RANDOM=y
>   CONFIG_I2C=y
>   CONFIG_I2C_CHARDEV=y
> +CONFIG_I2C_MUX=y
> +CONFIG_I2C_MUX_PCA954x=y
>   CONFIG_I2C_MPC=y
> +CONFIG_GPIOLIB=y
>   # CONFIG_HWMON is not set
>   # CONFIG_USB_SUPPORT is not set
>   CONFIG_UIO=y

Re: [PATCH] powerpc/lib/xor_vmx: Relax frame size for clang

2022-09-07 Thread Christophe Leroy




Le 21/06/2019 à 10:58, Mathieu Malaterre a écrit :

When building with clang-8 the frame size limit is hit:

   ../arch/powerpc/lib/xor_vmx.c:119:6: error: stack frame size of 1200 bytes 
in function '__xor_altivec_5' [-Werror,-Wframe-larger-than=]

Follow the same approach as commit 9c87156cce5a ("powerpc/xmon: Relax
frame size for clang") until a proper fix is implemented upstream in
clang and relax requirement for clang.


With Clang 14 I get the following errors, but only with KASAN selected.

  CC  arch/powerpc/lib/xor_vmx.o
arch/powerpc/lib/xor_vmx.c:95:6: error: stack frame size (1040) exceeds 
limit (1024) in '__xor_altivec_4' [-Werror,-Wframe-larger-than]

void __xor_altivec_4(unsigned long bytes,
 ^
arch/powerpc/lib/xor_vmx.c:124:6: error: stack frame size (1312) exceeds 
limit (1024) in '__xor_altivec_5' [-Werror,-Wframe-larger-than]

void __xor_altivec_5(unsigned long bytes,
 ^


Is this patch still relevant ?

Or should frame size be relaxed when KASAN is selected ? After all the 
stack size is multiplied by 2 when we have KASAN, so maybe the warning 
limit should be increased as well ?


Thanks
Christophe



Link: https://github.com/ClangBuiltLinux/linux/issues/563
Cc: Joel Stanley 
Signed-off-by: Mathieu Malaterre 
---
  arch/powerpc/lib/Makefile | 4 
  1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index c55f9c27bf79..b3f7d64caaf0 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -58,5 +58,9 @@ obj-$(CONFIG_FTR_FIXUP_SELFTEST) += feature-fixups-test.o
  
  obj-$(CONFIG_ALTIVEC)	+= xor_vmx.o xor_vmx_glue.o

  CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec)
+ifdef CONFIG_CC_IS_CLANG
+# See https://github.com/ClangBuiltLinux/linux/issues/563
+CFLAGS_xor_vmx.o += -Wframe-larger-than=4096
+endif
  
  obj-$(CONFIG_PPC64) += $(obj64-y)


Re: [PATCH v2] powerpc: add compile-time support for lbarx, lharx

2022-09-07 Thread Christophe Leroy




Le 23/06/2021 à 05:28, Nicholas Piggin a écrit :

ISA v2.06 (POWER7 and up) as well as e6500 support lbarx and lharx.
Add a compile option that allows code to use it, and add support in
cmpxchg and xchg 8 and 16 bit values without shifting and masking.


Is this patch still relevant ?

If so, it should be rebased because it badly conflicts.

Thanks
Christophe



Signed-off-by: Nicholas Piggin 
---
v2: Fixed lwarx->lharx typo, switched to PPC_HAS_

  arch/powerpc/Kconfig   |   3 +
  arch/powerpc/include/asm/cmpxchg.h | 236 -
  arch/powerpc/lib/sstep.c   |  21 +--
  arch/powerpc/platforms/Kconfig.cputype |   5 +
  4 files changed, 254 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 088dd2afcfe4..dc17f4d51a79 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -278,6 +278,9 @@ config PPC_BARRIER_NOSPEC
default y
depends on PPC_BOOK3S_64 || PPC_FSL_BOOK3E
  
+config PPC_HAS_LBARX_LHARX

+   bool
+
  config EARLY_PRINTK
bool
default y
diff --git a/arch/powerpc/include/asm/cmpxchg.h 
b/arch/powerpc/include/asm/cmpxchg.h
index cf091c4c22e5..28fbd57db1ec 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -77,10 +77,76 @@ u32 __cmpxchg_##type##sfx(volatile void *p, u32 old, u32 
new)   \
   * the previous value stored there.
   */
  
+#ifndef CONFIG_PPC_HAS_LBARX_LHARX

  XCHG_GEN(u8, _local, "memory");
  XCHG_GEN(u8, _relaxed, "cc");
  XCHG_GEN(u16, _local, "memory");
  XCHG_GEN(u16, _relaxed, "cc");
+#else
+static __always_inline unsigned long
+__xchg_u8_local(volatile void *p, unsigned long val)
+{
+   unsigned long prev;
+
+   __asm__ __volatile__(
+"1:   lbarx   %0,0,%2 \n"
+" stbcx.  %3,0,%2 \n\
+   bne-    1b"
+   : "=&r" (prev), "+m" (*(volatile unsigned char *)p)
+   : "r" (p), "r" (val)
+   : "cc", "memory");
+
+   return prev;
+}
+
+static __always_inline unsigned long
+__xchg_u8_relaxed(u8 *p, unsigned long val)
+{
+   unsigned long prev;
+
+   __asm__ __volatile__(
+"1:   lbarx   %0,0,%2\n"
+" stbcx.  %3,0,%2\n"
+" bne-    1b"
+   : "=&r" (prev), "+m" (*p)
+   : "r" (p), "r" (val)
+   : "cc");
+
+   return prev;
+}
+
+static __always_inline unsigned long
+__xchg_u16_local(volatile void *p, unsigned long val)
+{
+   unsigned long prev;
+
+   __asm__ __volatile__(
+"1:   lharx   %0,0,%2 \n"
+" sthcx.  %3,0,%2 \n\
+   bne-    1b"
+   : "=&r" (prev), "+m" (*(volatile unsigned short *)p)
+   : "r" (p), "r" (val)
+   : "cc", "memory");
+
+   return prev;
+}
+
+static __always_inline unsigned long
+__xchg_u16_relaxed(u16 *p, unsigned long val)
+{
+   unsigned long prev;
+
+   __asm__ __volatile__(
+"1:   lharx   %0,0,%2\n"
+" sthcx.  %3,0,%2\n"
+" bne-    1b"
+   : "=&r" (prev), "+m" (*p)
+   : "r" (p), "r" (val)
+   : "cc");
+
+   return prev;
+}
+#endif
  
  static __always_inline unsigned long

  __xchg_u32_local(volatile void *p, unsigned long val)
@@ -198,11 +264,12 @@ __xchg_relaxed(void *ptr, unsigned long x, unsigned int 
size)
(__typeof__(*(ptr))) __xchg_relaxed((ptr),  \
(unsigned long)_x_, sizeof(*(ptr)));\
  })
+
  /*
   * Compare and exchange - if *p == old, set it to new,
   * and return the old value of *p.
   */
-
+#ifndef CONFIG_PPC_HAS_LBARX_LHARX
  CMPXCHG_GEN(u8, , PPC_ATOMIC_ENTRY_BARRIER, PPC_ATOMIC_EXIT_BARRIER, 
"memory");
  CMPXCHG_GEN(u8, _local, , , "memory");
  CMPXCHG_GEN(u8, _acquire, , PPC_ACQUIRE_BARRIER, "memory");
@@ -211,6 +278,173 @@ CMPXCHG_GEN(u16, , PPC_ATOMIC_ENTRY_BARRIER, 
PPC_ATOMIC_EXIT_BARRIER, "memory");
  CMPXCHG_GEN(u16, _local, , , "memory");
  CMPXCHG_GEN(u16, _acquire, , PPC_ACQUIRE_BARRIER, "memory");
  CMPXCHG_GEN(u16, _relaxed, , , "cc");
+#else
+static __always_inline unsigned long
+__cmpxchg_u8(volatile unsigned char *p, unsigned long old, unsigned long new)
+{
+   unsigned int prev;
+
+   __asm__ __volatile__ (
+   PPC_ATOMIC_ENTRY_BARRIER
+"1:   lbarx   %0,0,%2 # __cmpxchg_u8\n\
+   cmpw    0,%0,%3\n\
+   bne-    2f\n"
+" stbcx.  %4,0,%2\n\
+   bne-    1b"
+   PPC_ATOMIC_EXIT_BARRIER
+   "\n\
+2:"
+   : "=&r" (prev), "+m" (*p)
+   : "r" (p), "r" (old), "r" (new)
+   : "cc", "memory");
+
+   return prev;
+}
+
+static __always_inline unsigned long
+__cmpxchg_u8_local(volatile unsigned char *p, unsigned long old,
+   unsigned long new)
+{
+   unsigned int prev;
+
+   __asm__ __volatile__ (
+"1:   lbarx   %0,0,%2 # __cmpxchg_u8\n\
+   cmpw    0,%0,%3\n\
+   bne-    2f\n"
+" stbcx.  %4,0,%2\n\
+   bne-    1b"
+   "\n\
+2:"
+   : "=&r" (prev), "+m" (*p)
+   : "r" (p), "r" (old), "r" (new)
+   : "cc", "memory");
+
+   return prev;
+}
+
+static __always_inline 

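Illustration only, not taken from the patch: the "shifting and masking"
that byte-sized lbarx/stbcx. avoids can be sketched with C11 atomics in
userspace. The helper names are hypothetical, and a 32-bit
compare-and-swap stands in for the word-sized lwarx/stwcx. loop, so this
shows the idea rather than the kernel's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Emulate an 8-bit compare-and-exchange with only a 32-bit CAS.
 * byte_idx selects the byte by significance (0 = least significant),
 * which keeps endianness out of this toy model.
 * Returns the byte value observed before the operation.
 */
static uint8_t toy_cmpxchg_u8_masked(_Atomic uint32_t *word, unsigned int byte_idx,
				     uint8_t oldval, uint8_t newval)
{
	unsigned int shift;
	uint32_t mask, cur;

	assert(byte_idx < 4);
	shift = byte_idx * 8;
	mask = (uint32_t)0xff << shift;
	cur = atomic_load_explicit(word, memory_order_relaxed);

	for (;;) {
		uint8_t prev = (uint8_t)((cur & mask) >> shift);
		uint32_t want;

		if (prev != oldval)
			return prev;	/* compare failed, nothing stored */

		want = (cur & ~mask) | ((uint32_t)newval << shift);
		/* On failure cur is refreshed with the current word; retry. */
		if (atomic_compare_exchange_weak_explicit(word, &cur, want,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return prev;
	}
}

/* With a byte-sized atomic the same operation needs no mask or shift. */
static uint8_t toy_cmpxchg_u8_native(_Atomic uint8_t *p, uint8_t oldval,
				     uint8_t newval)
{
	uint8_t expected = oldval;

	atomic_compare_exchange_strong_explicit(p, &expected, newval,
						memory_order_relaxed,
						memory_order_relaxed);
	/* Equals oldval on success, the observed value on failure. */
	return expected;
}
```

Both helpers return the previous value, as cmpxchg does. The masked
variant also has to retry when a neighbouring byte in the same word
changes, which the byte-sized form does not.
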
Re: [PATCH] powerpc/prom: move the device tree to the right space

2022-09-07 Thread Christophe Leroy


Le 03/03/2021 à 06:00, Youlin Song a écrit :
> If the device tree has been allocated memory and it will
> be in the memblock reserved space. Obviously it is in a
> valid memory declaration and will be mapped by the kernel.

Could you please provide clearer explanation ? I don't understand what 
you are doing and why.

Especially, the Subject says you move the device tree, but I can't see 
any move in your patch, only some change in the 'if'.

Thanks
Christophe

> 
> Signed-off-by: Youlin Song 
> ---
>   arch/powerpc/kernel/prom.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index 9a4797d1d40d..ef5f93e7d7f2 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -121,7 +121,7 @@ static void __init move_device_tree(void)
>   size = fdt_totalsize(initial_boot_params);
>   
>   if ((memory_limit && (start + size) > PHYSICAL_START + memory_limit) ||
> - !memblock_is_memory(start + size - 1) ||
> + (!memblock_is_memory(start + size - 1) && 
> !memblock_is_reserved(start + size - 1)) ||
>   overlaps_crashkernel(start, size) || overlaps_initrd(start, size)) {
>   p = memblock_alloc_raw(size, PAGE_SIZE);
>   if (!p)

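Illustration only: the hunk above changes one term of the condition that
decides whether move_device_tree() relocates the blob. Modelled as two
pure functions over hypothetical boolean inputs (standing for the results
of memblock_is_memory() and memblock_is_reserved() on the last byte of
the tree), with the memory_limit, crashkernel and initrd terms left out:

```c
#include <assert.h>
#include <stdbool.h>

/* Term before the patch: relocate when the last byte is not in memory. */
static bool toy_end_needs_move_old(bool end_is_memory, bool end_is_reserved)
{
	(void)end_is_reserved;	/* not consulted before the patch */
	return !end_is_memory;
}

/* Term after the patch: relocate only if it is neither memory nor reserved. */
static bool toy_end_needs_move_new(bool end_is_memory, bool end_is_reserved)
{
	return !end_is_memory && !end_is_reserved;
}

/* Walk the four input combinations and count where the versions differ. */
static int toy_count_differences(void)
{
	int n = 0;

	for (int m = 0; m < 2; m++)
		for (int r = 0; r < 2; r++)
			if (toy_end_needs_move_old(m, r) !=
			    toy_end_needs_move_new(m, r))
				n++;

	/* The single differing case is "not memory, but reserved". */
	assert(toy_end_needs_move_old(false, true));
	assert(!toy_end_needs_move_new(false, true));
	return n;
}
```

So the patch only changes behaviour for a tree whose last byte is
reserved in memblock but not registered as memory: such a tree was
relocated before and is left in place afterwards.
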
Re: [PATCH] mm: gup: fix the fast GUP race against THP collapse

2022-09-07 Thread Yang Shi
On Tue, Sep 6, 2022 at 9:51 PM Aneesh Kumar K V
 wrote:
>
> On 9/7/22 12:37 AM, Yang Shi wrote:
> > On Mon, Sep 5, 2022 at 1:56 AM Aneesh Kumar K.V
> >  wrote:
> >>
> >> Yang Shi  writes:
> >>
> >>>
> >>> On Fri, Sep 2, 2022 at 9:00 AM Peter Xu  wrote:
> 
>  On Thu, Sep 01, 2022 at 04:50:45PM -0700, Yang Shi wrote:
> > On Thu, Sep 1, 2022 at 4:26 PM Peter Xu  wrote:
> >>
> >> Hi, Yang,
> >>
> >> On Thu, Sep 01, 2022 at 03:27:07PM -0700, Yang Shi wrote:
> >>> Since general RCU GUP fast was introduced in commit 2667f50e8b81 ("mm:
> >>> introduce a general RCU get_user_pages_fast()"), a TLB flush is no 
> >>> longer
> >>> sufficient to handle concurrent GUP-fast in all cases, it only handles
> >>> traditional IPI-based GUP-fast correctly.
> >>
> >> If TLB flush (or, IPI broadcasts) used to work to protect against 
> >> gup-fast,
> >> I'm kind of confused why it's not sufficient even if with RCU gup?  
> >> Isn't
> >> that'll keep working as long as interrupt disabled (which current 
> >> fast-gup
> >> will still do)?
> >
> > Actually the wording was copied from David's commit log for his
> > PageAnonExclusive fix. My understanding is the IPI broadcast still
> > works, but it may not be supported by all architectures and not
> > preferred anymore. So we should avoid depending on IPI broadcast IIUC.
> >
> >>
> >> IIUC the issue is you suspect not all archs correctly implemented
> >> pmdp_collapse_flush(), or am I wrong?
> >
> > This is a possible fix, please see below for details.
> >
> >>
> >>> On architectures that send
> >>> an IPI broadcast on TLB flush, it works as expected.  But on the
> >>> architectures that do not use IPI to broadcast TLB flush, it may have
> >>> the below race:
> >>>
> >>>CPU A  CPU B
> >>> THP collapse fast GUP
> >>>   gup_pmd_range() <-- see 
> >>> valid pmd
> >>>   gup_pte_range() <-- 
> >>> work on pte
> >>> pmdp_collapse_flush() <-- clear pmd and flush
> >>> __collapse_huge_page_isolate()
> >>> check page pinned <-- before GUP bump refcount
> >>>   pin the page
> >>>   check PTE <-- 
> >>> no change
> >>> __collapse_huge_page_copy()
> >>> copy data to huge page
> >>> ptep_clear()
> >>> install huge pmd for the huge page
> >>>   return the 
> >>> stale page
> >>> discard the stale page
> >>>
> >>> The race could be fixed by checking whether PMD is changed or not 
> >>> after
> >>> taking the page pin in fast GUP, just like what it does for PTE.  If 
> >>> the
> >>> PMD is changed it means there may be parallel THP collapse, so GUP
> >>> should back off.
> >>
> >> Could the race also be fixed by impl pmdp_collapse_flush() correctly 
> >> for
> >> the archs that are missing? Do you know which arch(s) is broken with 
> >> it?
> >
> > Yes, and this was suggested by me in the first place, but per the
> > suggestion from John and David, this is not the preferred way. I think
> > it is because:
> >
> > Firstly, using IPI to serialize against fast GUP is not recommended
> > anymore since fast GUP does check PTE then back off so we should avoid
> > it.
> > Secondly, if checking PMD then backing off could solve the problem,
> > why do we still need broadcast IPI? It doesn't sound performant.
> >
> >>
> >> It's just not clear to me whether this patch is an optimization or a 
> >> fix,
> >> if it's a fix whether the IPI broadcast in ppc pmdp_collapse_flush() 
> >> would
> >> still be needed.
> >
> > It is a fix and the fix will make IPI broadcast not useful anymore.
> 
>  How about another patch to remove the ppc impl too?  Then it can be a two
>  patches series.
> >>>
> >>> BTW, I don't think we could remove the ppc implementation since it is
> >>> different from the generic pmdp_collapse_flush(), particularly for the
> >>> hash part IIUC.
> >>>
> >>> The generic version calls flush_tlb_range() -> hash__flush_tlb_range()
> >>> for hash, but the hash call is actually no-op. The ppc version calls
> >>> hash__pmdp_collapse_flush() -> flush_tlb_pmd_range(), which does
> >>> something useful.
> >>>
> >>
> >> We should actually rename flush_tlb_pmd_range(). It actually flush the
> >> hash page table entries.
> >>
> >> I will do the below patch for ppc64 to clarify this better
> >
> > Thanks, Aneesh. It looks more readable. A follow-up question, I think
> > we could remove serialize_against_pte_lookup(), 

Re: [RFC PATCH v1] spi: fsl_spi: Convert to transfer_one

2022-09-07 Thread Christophe Leroy


Le 18/08/2022 à 15:38, Christophe Leroy a écrit :
> Let the core handle all the chipselect bakery and replace
> transfer_one_message() by transfer_one() and prepare_message().
> 
> At the time being, there is fsl_spi_cs_control() to handle
> chipselects. That function handles both GPIO and non-GPIO
> chipselects. The GPIO chipselects will now be handled by
> the core directly, so only handle non-GPIO chipselects and
> hook it to ->set_cs

Any comment for/about this conversion ?
Did I do it the right way ? Any recommendation ?

Thanks
Christophe


> 
> Signed-off-by: Christophe Leroy 
> ---
> Sending as an RFC as I'm not 100% sure of the correctness.
> I successfully tested it on the hardware I have though.
> Not sure about the change from m->is_dma_mapped to !!t->tx_dma || !!t->rx_dma
> ---
>   drivers/spi/spi-fsl-spi.c | 157 +++---
>   1 file changed, 43 insertions(+), 114 deletions(-)
> 
> diff --git a/drivers/spi/spi-fsl-spi.c b/drivers/spi/spi-fsl-spi.c
> index bdf94cc7be1a..731624f157fc 100644
> --- a/drivers/spi/spi-fsl-spi.c
> +++ b/drivers/spi/spi-fsl-spi.c
> @@ -111,32 +111,6 @@ static void fsl_spi_change_mode(struct spi_device *spi)
>   local_irq_restore(flags);
>   }
>   
> -static void fsl_spi_chipselect(struct spi_device *spi, int value)
> -{
> - struct mpc8xxx_spi *mpc8xxx_spi = spi_master_get_devdata(spi->master);
> - struct fsl_spi_platform_data *pdata;
> - struct spi_mpc8xxx_cs   *cs = spi->controller_state;
> -
> - pdata = spi->dev.parent->parent->platform_data;
> -
> - if (value == BITBANG_CS_INACTIVE) {
> - if (pdata->cs_control)
> - pdata->cs_control(spi, false);
> - }
> -
> - if (value == BITBANG_CS_ACTIVE) {
> - mpc8xxx_spi->rx_shift = cs->rx_shift;
> - mpc8xxx_spi->tx_shift = cs->tx_shift;
> - mpc8xxx_spi->get_rx = cs->get_rx;
> - mpc8xxx_spi->get_tx = cs->get_tx;
> -
> - fsl_spi_change_mode(spi);
> -
> - if (pdata->cs_control)
> - pdata->cs_control(spi, true);
> - }
> -}
> -
>   static void fsl_spi_qe_cpu_set_shifts(u32 *rx_shift, u32 *tx_shift,
> int bits_per_word, int msb_first)
>   {
> @@ -354,15 +328,11 @@ static int fsl_spi_bufs(struct spi_device *spi, struct 
> spi_transfer *t,
>   return mpc8xxx_spi->count;
>   }
>   
> -static int fsl_spi_do_one_msg(struct spi_master *master,
> -   struct spi_message *m)
> +static int fsl_spi_prepare_message(struct spi_controller *ctlr,
> +struct spi_message *m)
>   {
> - struct mpc8xxx_spi *mpc8xxx_spi = spi_master_get_devdata(master);
> - struct spi_device *spi = m->spi;
> - struct spi_transfer *t, *first;
> - unsigned int cs_change;
> - const int nsecs = 50;
> - int status, last_bpw;
> + struct mpc8xxx_spi *mpc8xxx_spi = spi_controller_get_devdata(ctlr);
> + struct spi_transfer *t;
>   
>   /*
>* In CPU mode, optimize large byte transfers to use larger
> @@ -378,62 +348,30 @@ static int fsl_spi_do_one_msg(struct spi_master *master,
>   t->bits_per_word = 16;
>   }
>   }
> + return 0;
> +}
>   
> - /* Don't allow changes if CS is active */
> - cs_change = 1;
> - list_for_each_entry(t, &m->transfers, transfer_list) {
> - if (cs_change)
> - first = t;
> - cs_change = t->cs_change;
> - if (first->speed_hz != t->speed_hz) {
> - dev_err(&spi->dev,
> - "speed_hz cannot change while CS is active\n");
> - return -EINVAL;
> - }
> - }
> -
> - last_bpw = -1;
> - cs_change = 1;
> - status = -EINVAL;
> - list_for_each_entry(t, &m->transfers, transfer_list) {
> - if (cs_change || last_bpw != t->bits_per_word)
> - status = fsl_spi_setup_transfer(spi, t);
> - if (status < 0)
> - break;
> - last_bpw = t->bits_per_word;
> -
> - if (cs_change) {
> - fsl_spi_chipselect(spi, BITBANG_CS_ACTIVE);
> - ndelay(nsecs);
> - }
> - cs_change = t->cs_change;
> - if (t->len)
> - status = fsl_spi_bufs(spi, t, m->is_dma_mapped);
> - if (status) {
> - status = -EMSGSIZE;
> - break;
> - }
> - m->actual_length += t->len;
> -
> - spi_transfer_delay_exec(t);
> -
> - if (cs_change) {
> - ndelay(nsecs);
> - fsl_spi_chipselect(spi, BITBANG_CS_INACTIVE);
> - ndelay(nsecs);
> - }
> - }
> +static int fsl_spi_transfer_one(struct spi_controller *controller,
> + struct 

Re: [PATCH v3 1/3] dt-bindings: reset: syscon-reboot: Add priority property

2022-09-07 Thread Pali Rohár
On Wednesday 07 September 2022 14:38:42 Krzysztof Kozlowski wrote:
> On 31/08/2022 10:17, Pali Rohár wrote:
> > This new optional priority property allows to specify custom priority level
> > of reset device. Prior this change priority level was hardcoded to 192 and
> > not possible to specify or change. Specifying other value is needed for
> > some boards. Default level when not specified stays at 192 as before.
> > 
> > Signed-off-by: Pali Rohár 
> 
> Thanks for the changes. Explanation looks good.
> 
> I sent a patch adding the common schema with priority. If it gets
> ack/review from Rob and Sebastian, please kindly rebase on top of it and
> use same way as I did for gpio-restart.yaml
> 
> Best regards,
> Krzysztof

Ok, so just by adding "allOf: - $ref: restart-handler.yaml#" right?


[RFC PATCH] powerpc/64s: early boot machine check handler

2022-09-07 Thread Nicholas Piggin
This patch re-uses the same trick from the program interrupt in early
boot to allow the machine check handler to run before interrupt endian
is set up, and branch to an early boot handler (e.g., before ppc_md is
set up).

MSR[ME] is enabled on the boot CPU earlier, and the machine check stack
is temporarily in the middle of the init task stack.

This allows machine checks (e.g., due to invalid data access in real
mode) to be caught and printed out earlier in boot, as soon as udbg is
set up when CONFIG_PPC_EARLY_DEBUG=y.

---
per cpu data offset poisoning patch was rejected because it causes
checkstops or other strange things in early boot. Maybe this can
help that situation.

XXX: haven't tested with pseries yet.

XXX: should consolidate the interrupt entry code. gas macro should
be able to build the immediate value for the rfid offset.

 arch/powerpc/include/asm/asm-prototypes.h |  1 +
 arch/powerpc/kernel/exceptions-64s.S  | 31 +++
 arch/powerpc/kernel/setup_64.c|  6 +
 arch/powerpc/kernel/traps.c   | 14 ++
 4 files changed, 52 insertions(+)

diff --git a/arch/powerpc/include/asm/asm-prototypes.h 
b/arch/powerpc/include/asm/asm-prototypes.h
index 81631e64dbeb..a1039b9da42e 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -36,6 +36,7 @@ int64_t __opal_call(int64_t a0, int64_t a1, int64_t a2, 
int64_t a3,
int64_t opcode, uint64_t msr);
 
 /* misc runtime */
+void enable_machine_check(void);
 extern u64 __bswapdi2(u64);
 extern s64 __lshrdi3(s64, int);
 extern s64 __ashldi3(s64, int);
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index d98732a33afe..004a08cb48f2 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1079,6 +1079,33 @@ INT_DEFINE_BEGIN(machine_check)
 INT_DEFINE_END(machine_check)
 
 EXC_REAL_BEGIN(machine_check, 0x200, 0x100)
+#ifdef CONFIG_CPU_LITTLE_ENDIAN
+   /*
+* See comment in program_check, this code is identical except
+* the SRR0 rfid offset.
+*/
+BEGIN_FTR_SECTION
+   tdi   0,0,0x48// Trap never, or in reverse endian: b . + 8
+   b 1f  // Skip trampoline if endian is correct
+   .long 0xa643707d  // mtsprg  0, r11  Backup r11
+   .long 0xa6027a7d  // mfsrr0  r11
+   .long 0xa643727d  // mtsprg  2, r11  Backup SRR0 in SPRG2
+   .long 0xa6027b7d  // mfsrr1  r11
+   .long 0xa643737d  // mtsprg  3, r11  Backup SRR1 in SPRG3
+   .long 0xa600607d  // mfmsr   r11
+   .long 0x01006b69  // xori    r11, r11, 1  Invert MSR[LE]
+   .long 0xa6037b7d  // mtsrr1  r11
+   .long 0x34026039  // li  r11, 0x234
+   .long 0xa6037a7d  // mtsrr0  r11
+   .long 0x2400004c  // rfid
+   mfsprg r11, 3
+   mtsrr1 r11// Restore SRR1
+   mfsprg r11, 2
+   mtsrr0 r11// Restore SRR0
+   mfsprg r11, 0 // Restore r11
+1:
+END_FTR_SECTION(0, 1) // nop out after boot
+#endif
GEN_INT_ENTRY machine_check_early, virt=0
 EXC_REAL_END(machine_check, 0x200, 0x100)
 EXC_VIRT_NONE(0x4200, 0x100)
@@ -1143,6 +1170,9 @@ BEGIN_FTR_SECTION
bl  enable_machine_check
 END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
addir3,r1,STACK_FRAME_OVERHEAD
+BEGIN_FTR_SECTION
+   bl  machine_check_early_boot
+END_FTR_SECTION(0, 1) // nop out after boot
bl  machine_check_early
std r3,RESULT(r1)   /* Save result */
ld  r12,_MSR(r1)
@@ -3087,6 +3117,7 @@ CLOSE_FIXED_SECTION(virt_trampolines);
 USE_TEXT_SECTION()
 
 /* MSR[RI] should be clear because this uses SRR[01] */
+   .globl enable_machine_check
 enable_machine_check:
mflrr0
bcl 20,31,$+4
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 2b2d0b0fbb30..19a6c9ca934e 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -180,6 +181,8 @@ static void __init fixup_boot_paca(void)
 {
/* The boot cpu is started */
get_paca()->cpu_start = 1;
+   /* Give the early boot machine check stack somewhere to use */
+   get_paca()->mc_emergency_sp = (void *)_thread_union + 
(THREAD_SIZE/2);
/* Allow percpu accesses to work until we setup percpu data */
get_paca()->data_offset = 0;
/* Mark interrupts disabled in PACA */
@@ -355,6 +358,9 @@ void __init early_setup(unsigned long dt_ptr)
 
/*  printk is now safe to use --- */
 
+   if (mfmsr() & MSR_HV)
+   enable_machine_check();
+
/* Try new device tree based feature discovery ... */
if (!dt_cpu_ftrs_init(__va(dt_ptr)))
/* Otherwise use the old style CPU table */
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
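
The hand-encoded .long words in the trampoline above are ordinary
instructions with their bytes reversed, so that a CPU running in the
opposite endianness fetches them as valid code. A minimal sketch in C of
that byte reversal, checked against two encodings quoted in the patch
comments. The helper names are made up for illustration and are not
kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reverse the four bytes of a 32-bit instruction word. */
uint32_t swab32_sketch(uint32_t insn)
{
	return ((insn & 0x000000ffu) << 24) |
	       ((insn & 0x0000ff00u) << 8)  |
	       ((insn & 0x00ff0000u) >> 8)  |
	       ((insn & 0xff000000u) >> 24);
}

/* Check the byte-swapped pairs named in the trampoline comments. */
void check_trampoline_encodings(void)
{
	/* "tdi 0,0,0x48" fetched in the other endianness is "b . + 8". */
	assert(swab32_sketch(0x08000048u) == 0x48000008u);

	/* .long 0xa643707d is the reversed form of "mtsprg 0, r11". */
	assert(swab32_sketch(0xa643707du) == 0x7d7043a6u);
}
```

Calling check_trampoline_encodings() from a small test program is enough
to confirm the two pairs; the same check applies to the other .long words.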

Re: [PATCH] powerpc/pseries: Fix plpks crash on non-pseries

2022-09-07 Thread Nathan Chancellor
On Wed, Sep 07, 2022 at 04:50:38PM +1000, Michael Ellerman wrote:
> As reported[1] by Nathan, the recently added plpks driver will crash if
> it's built into the kernel and booted on a non-pseries machine, eg
> powernv:
> 
>   kernel BUG at arch/powerpc/kernel/syscall.c:39!
>   Oops: Exception in kernel mode, sig: 5 [#1]
>   LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
>   ...
>   NIP system_call_exception+0x90/0x3d0
>   LR  system_call_common+0xec/0x250
>   Call Trace:
> 0xc35c3e10 (unreliable)
> system_call_common+0xec/0x250
>   --- interrupt: c00 at plpar_hcall+0x38/0x60
>   NIP:  c00e4300 LR: c202945c CTR: 
>   REGS: c35c3e80 TRAP: 0c00   Not tainted  (6.0.0-rc4)
>   MSR:  92009033   CR: 28000284  XER: 
> 
>   ...
>   NIP plpar_hcall+0x38/0x60
>   LR  pseries_plpks_init+0x64/0x23c
>   --- interrupt: c00
> 
> On powernv Linux is the hypervisor, so a hypercall just ends up going to
> the syscall path, which BUGs if the syscall (hypercall) didn't come from
> userspace.
> 
> The fix is simply to not probe the plpks driver on non-pseries machines.
> 
> [1] 
> https://lore.kernel.org/linuxppc-dev/Yxe06fbq18Wv9y3W@dev-arch.thelio-3990X/
> 
> Fixes: 2454a7af0f2a ("powerpc/pseries: define driver for Platform KeyStore")
> Reported-by: Nathan Chancellor 
> Signed-off-by: Michael Ellerman 

Tested-by: Nathan Chancellor 

Thanks!

> ---
>  arch/powerpc/platforms/pseries/plpks.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/plpks.c 
> b/arch/powerpc/platforms/pseries/plpks.c
> index 52aaa2894606..f4b5b5a64db3 100644
> --- a/arch/powerpc/platforms/pseries/plpks.c
> +++ b/arch/powerpc/platforms/pseries/plpks.c
> @@ -17,6 +17,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "plpks.h"
>  
> @@ -457,4 +458,4 @@ static __init int pseries_plpks_init(void)
>  
>   return rc;
>  }
> -arch_initcall(pseries_plpks_init);
> +machine_arch_initcall(pseries, pseries_plpks_init);
> -- 
> 2.37.2
> 
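
The difference between the two initcall flavours in the patch above can
be modelled in userspace. This is only a sketch under assumptions: the
machine name string, the counter and the run_*() helpers are hypothetical
stand-ins, not the kernel's initcall machinery.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the platform the kernel booted on. */
static const char *booted_machine = "powernv";

/* Counts how often the init routine actually ran. */
static int plpks_probes;

static int plpks_init_sketch(void)
{
	plpks_probes++;		/* the real driver would issue an hcall here */
	return 0;
}

/* Unconditional initcall: runs on every platform. */
int run_arch_initcall(int (*fn)(void))
{
	assert(fn);
	return fn();
}

/* Machine-gated initcall: skipped unless the booted machine matches. */
int run_machine_initcall(const char *machine, int (*fn)(void))
{
	assert(machine && fn);
	if (strcmp(booted_machine, machine) != 0)
		return 0;
	return fn();
}
```

With booted_machine set to "powernv", run_machine_initcall("pseries", ...)
returns 0 without calling the init routine, which is the behaviour the
fix relies on.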


Re: [PATCH v3 1/3] dt-bindings: reset: syscon-reboot: Add priority property

2022-09-07 Thread Krzysztof Kozlowski
On 31/08/2022 10:17, Pali Rohár wrote:
> This new optional priority property allows specifying a custom priority level
> for the reset device. Prior to this change the priority level was hardcoded to
> 192 and could not be specified or changed. Some boards need a different value.
> When the property is not specified, the default level stays at 192 as before.
> 
> Signed-off-by: Pali Rohár 

Thanks for the changes. Explanation looks good.

I sent a patch adding the common schema with priority. If it gets an
ack/review from Rob and Sebastian, please rebase on top of it and
use it the same way as I did for gpio-restart.yaml.

Best regards,
Krzysztof


Re: [PATCH 1/3] dt-bindings: reset: syscon-reboot: Add priority property

2022-09-07 Thread Krzysztof Kozlowski
On 02/09/2022 22:37, Rob Herring wrote:
>>
>> Sorry, I do not understand.
> 
> So just keep sending new versions instead?
> 
> syscon-reboot is not the only binding for a system reset device, right? 
> So those others reset devices will need 'priority' too. For a given 
> property, there should only be one schema definition defining the type 
> for the property. Otherwise, there might be conflicts. So you need a 
> common schema doing that. And here you would just have 'priority: true' 
> or possibly some binding specific constraints.

I'll propose a patch for this.


Best regards,
Krzysztof


[PATCH 0/2] powerpc/64: more soft-mask improvements

2022-09-07 Thread Nicholas Piggin
These are a couple more improvements while I'm here. I had to
rediscover why disabling softirqs during irq replay is hard, and
ran into some bugs while trying to find a way to do it.

This doesn't solve that; it just documents it better and tidies
up the code a bit.

Thanks,
Nick

Nicholas Piggin (2):
  powerpc/64/interrupt: avoid BUG/WARN recursion in interrupt entry
  powerpc/64/irq: tidy soft-masked irq replay and improve documentation

 arch/powerpc/include/asm/interrupt.h | 33 ++
 arch/powerpc/kernel/irq_64.c | 93 ++--
 2 files changed, 81 insertions(+), 45 deletions(-)

-- 
2.37.2



[PATCH 1/2] powerpc/64/interrupt: avoid BUG/WARN recursion in interrupt entry

2022-09-07 Thread Nicholas Piggin
BUG/WARN are handled with a program interrupt which can turn into an
infinite recursion when there are bugs in interrupt handler entry
(which can be triggered by bugs in other parts of the code).

There is one feeble attempt to avoid this recursion, but it misses
several cases. Make a tidier macro for this and switch most bugs in
the interrupt entry wrapper over to use it.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/interrupt.h | 33 +---
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/interrupt.h 
b/arch/powerpc/include/asm/interrupt.h
index 8069dbc4b8d1..690ca27d8dd1 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -74,6 +74,19 @@
 #include 
 #include 
 
+#ifdef CONFIG_PPC64
+/*
+ * WARN/BUG is handled with a program interrupt so minimise checks here to
+ * avoid recursion and maximise the chance of getting the first oops handled.
+ */
+#define INT_SOFT_MASK_BUG_ON(regs, cond)   \
+do {   \
+   if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG) &&   \
+   (user_mode(regs) || (TRAP(regs) != INTERRUPT_PROGRAM))) \
+   BUG_ON(cond);   \
+} while (0)
+#endif
+
 #ifdef CONFIG_PPC_BOOK3S_64
 extern char __end_soft_masked[];
 bool search_kernel_soft_mask_table(unsigned long addr);
@@ -170,8 +183,7 @@ static inline void interrupt_enter_prepare(struct pt_regs 
*regs)
 * context.
 */
if (!(local_paca->irq_happened & PACA_IRQ_HARD_DIS)) {
-   if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
-   BUG_ON(!(regs->msr & MSR_EE));
+   INT_SOFT_MASK_BUG_ON(regs, !(regs->msr & MSR_EE));
__hard_irq_enable();
} else {
__hard_RI_enable();
@@ -194,19 +206,14 @@ static inline void interrupt_enter_prepare(struct pt_regs 
*regs)
 * CT_WARN_ON comes here via program_check_exception,
 * so avoid recursion.
 */
-   if (TRAP(regs) != INTERRUPT_PROGRAM) {
+   if (TRAP(regs) != INTERRUPT_PROGRAM)
CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
-   if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
-   BUG_ON(is_implicit_soft_masked(regs));
-   }
-
-   /* Move this under a debugging check */
-   if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG) &&
-   arch_irq_disabled_regs(regs))
-   BUG_ON(search_kernel_restart_table(regs->nip));
+   INT_SOFT_MASK_BUG_ON(regs, is_implicit_soft_masked(regs));
+   INT_SOFT_MASK_BUG_ON(regs, arch_irq_disabled_regs(regs) &&
+  
search_kernel_restart_table(regs->nip));
}
-   if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
-   BUG_ON(!arch_irq_disabled_regs(regs) && !(regs->msr & MSR_EE));
+   INT_SOFT_MASK_BUG_ON(regs, !arch_irq_disabled_regs(regs) &&
+  !(regs->msr & MSR_EE));
 #endif
 
booke_restore_dbcr0();
-- 
2.37.2
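
The gating that INT_SOFT_MASK_BUG_ON applies before it evaluates a check
can be sketched in plain C. This is a simplified model under assumptions:
struct regs_sketch and soft_mask_check_armed() are hypothetical, and the
debug option is passed in as a parameter rather than read from Kconfig.

```c
#include <assert.h>
#include <stdbool.h>

/* Program interrupt vector on powerpc (INTERRUPT_PROGRAM). */
#define TRAP_PROGRAM_SKETCH 0x700

/* Hypothetical reduced view of the interrupted context. */
struct regs_sketch {
	bool user_mode;		/* interrupt came from userspace */
	unsigned int trap;	/* interrupt vector being entered */
};

/*
 * Returns true when a debug check is allowed to fire. A program
 * interrupt taken from kernel mode may itself be a BUG/WARN, so the
 * check is skipped there to avoid recursing into the same failure.
 */
bool soft_mask_check_armed(bool debug_enabled, const struct regs_sketch *regs)
{
	assert(regs);
	return debug_enabled &&
	       (regs->user_mode || regs->trap != TRAP_PROGRAM_SKETCH);
}
```

A kernel-mode program interrupt is the only case that disarms the check
when debugging is enabled; every other interrupt still gets it.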



[PATCH 2/2] powerpc/64/irq: tidy soft-masked irq replay and improve documentation

2022-09-07 Thread Nicholas Piggin
irq replay is quite complicated because of softirq processing which
itself enables and disables irqs. Several considerations need to be
accounted for due to this, and they are not clearly documented.

Refactor the irq replay code a bit to tidy and deduplicate some common
functions. Add comments, debug checks.

This has a minor functional change that irq tracing enable/disable is
done after each interrupt replayed, rather than after a batch. It also
re-sets state to IRQS_ALL_DISABLED after an interrupt, which doesn't
matter much because interrupts are hard disabled at this point, but it
is more consistent with how interrupt handlers are called.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kernel/irq_64.c | 93 +++-
 1 file changed, 61 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/kernel/irq_64.c b/arch/powerpc/kernel/irq_64.c
index 01645e03e9f0..eb2b380e52a0 100644
--- a/arch/powerpc/kernel/irq_64.c
+++ b/arch/powerpc/kernel/irq_64.c
@@ -68,6 +68,35 @@
 
 int distribute_irqs = 1;
 
+static inline void next_interrupt(struct pt_regs *regs)
+{
+   /*
+* Softirq processing can enable/disable irqs, which will leave
+* MSR[EE] enabled and the soft mask set to IRQS_DISABLED. Fix
+* this up.
+*/
+   if (!(local_paca->irq_happened & PACA_IRQ_HARD_DIS))
+   hard_irq_disable();
+   else
+   irq_soft_mask_set(IRQS_ALL_DISABLED);
+
+   /*
+* We are responding to the next interrupt, so interrupt-off
+* latencies should be reset here.
+*/
+   trace_hardirqs_on();
+   trace_hardirqs_off();
+}
+
+static inline bool irq_happened_test_and_clear(u8 irq)
+{
+   if (local_paca->irq_happened & irq) {
+   local_paca->irq_happened &= ~irq;
+   return true;
+   }
+   return false;
+}
+
 void replay_soft_interrupts(void)
 {
struct pt_regs regs;
@@ -79,18 +108,25 @@ void replay_soft_interrupts(void)
 * recurse into this function. Don't keep any state across
 * interrupt handler calls which may change underneath us.
 *
+* Softirqs can not be disabled over replay to stop this recursion
+* because interrupts taken in idle code may require RCU softirq
+* to run in the irq RCU tracking context. This is a hard problem
+* to fix without changes to the softirq or idle layer.
+*
 * We use local_paca rather than get_paca() to avoid all the
 * debug_smp_processor_id() business in this low level function.
 */
 
+   if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG)) {
+   WARN_ON_ONCE(mfmsr() & MSR_EE);
+   WARN_ON(!(local_paca->irq_happened & PACA_IRQ_HARD_DIS));
+   }
+
ppc_save_regs();
regs.softe = IRQS_ENABLED;
regs.msr |= MSR_EE;
 
 again:
-   if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
-   WARN_ON_ONCE(mfmsr() & MSR_EE);
-
/*
 * Force the delivery of pending soft-disabled interrupts on PS3.
 * Any HV call will have this side effect.
@@ -105,56 +141,47 @@ void replay_soft_interrupts(void)
 * This is a higher priority interrupt than the others, so
 * replay it first.
 */
-   if (IS_ENABLED(CONFIG_PPC_BOOK3S) && (local_paca->irq_happened & 
PACA_IRQ_HMI)) {
-   local_paca->irq_happened &= ~PACA_IRQ_HMI;
+   if (IS_ENABLED(CONFIG_PPC_BOOK3S) &&
+   irq_happened_test_and_clear(PACA_IRQ_HMI)) {
regs.trap = INTERRUPT_HMI;
handle_hmi_exception();
-   if (!(local_paca->irq_happened & PACA_IRQ_HARD_DIS))
-   hard_irq_disable();
+   next_interrupt();
}
 
-   if (local_paca->irq_happened & PACA_IRQ_DEC) {
-   local_paca->irq_happened &= ~PACA_IRQ_DEC;
+   if (irq_happened_test_and_clear(PACA_IRQ_DEC)) {
regs.trap = INTERRUPT_DECREMENTER;
timer_interrupt();
-   if (!(local_paca->irq_happened & PACA_IRQ_HARD_DIS))
-   hard_irq_disable();
+   next_interrupt();
}
 
-   if (local_paca->irq_happened & PACA_IRQ_EE) {
-   local_paca->irq_happened &= ~PACA_IRQ_EE;
+   if (irq_happened_test_and_clear(PACA_IRQ_EE)) {
regs.trap = INTERRUPT_EXTERNAL;
do_IRQ();
-   if (!(local_paca->irq_happened & PACA_IRQ_HARD_DIS))
-   hard_irq_disable();
+   next_interrupt();
}
 
-   if (IS_ENABLED(CONFIG_PPC_DOORBELL) && (local_paca->irq_happened & 
PACA_IRQ_DBELL)) {
-   local_paca->irq_happened &= ~PACA_IRQ_DBELL;
+   if (IS_ENABLED(CONFIG_PPC_DOORBELL) &&
+   irq_happened_test_and_clear(PACA_IRQ_DBELL)) {
regs.trap = INTERRUPT_DOORBELL;
doorbell_exception();
-   if (!(local_paca->irq_happened 
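
The test-and-clear helper and the fixed replay order used in the patch
above can be modelled in userspace. A minimal sketch, assuming
illustrative bit values and a plain global in place of
local_paca->irq_happened; unlike the kernel function it does a single
pass and does not loop back when a handler raises new pending bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values; the real ones live in the powerpc headers. */
enum { IRQ_DBELL = 0x02, IRQ_EE = 0x04, IRQ_DEC = 0x08, IRQ_HMI = 0x10 };

/* Stand-in for local_paca->irq_happened. */
static uint8_t irq_happened;

/* Clear the bit and report whether it was pending. */
bool test_and_clear_sketch(uint8_t irq)
{
	if (irq_happened & irq) {
		irq_happened &= (uint8_t)~irq;
		return true;
	}
	return false;
}

/*
 * Replay pending interrupts in the order the patch uses (HMI first,
 * then decrementer, external, doorbell). Records the order in out[]
 * and returns how many were handled.
 */
int replay_sketch(uint8_t pending, uint8_t out[4])
{
	static const uint8_t order[] = { IRQ_HMI, IRQ_DEC, IRQ_EE, IRQ_DBELL };
	int n = 0;

	assert(out);
	irq_happened = pending;
	for (unsigned int i = 0; i < 4; i++)
		if (test_and_clear_sketch(order[i]))
			out[n++] = order[i];
	return n;
}
```

Folding the test and the clear into one helper is what lets each replay
branch shrink to a single condition in the patch.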

Re: [PATCH] perf: Rewrite core context handling

2022-09-07 Thread Ravi Bangoria
> -static void
> -ctx_flexible_sched_in(struct perf_event_context *ctx,
> -   struct perf_cpu_context *cpuctx)
> +/* XXX .busy thingy from Peter's patch */
> +static void ctx_flexible_sched_in(struct perf_event_context *ctx, struct pmu 
> *pmu)

This one turned out to be very easy. Given that we iterate over each
pmu, we can just return an error if we fail to schedule any flexible event.
(It wouldn't be as straightforward as this if we needed to implement the
pmu=NULL optimization.)

---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index e0232e0bb74e..923656af73fe 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3751,6 +3751,7 @@ static int merge_sched_in(struct perf_event *event, void 
*data)
cpc = 
this_cpu_ptr(event->pmu_ctx->pmu->cpu_pmu_context);
perf_mux_hrtimer_restart(cpc);
group_update_userpage(event);
+   return -EBUSY;
}
}
 
@@ -3776,7 +3777,6 @@ static void ctx_pinned_sched_in(struct perf_event_context 
*ctx, struct pmu *pmu)
}
 }
 
-/* XXX .busy thingy from Peter's patch */
 static void ctx_flexible_sched_in(struct perf_event_context *ctx, struct pmu 
*pmu)
 {
struct perf_event_pmu_context *pmu_ctx;
---

Thanks,
Ravi
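
The effect of returning -EBUSY from the per-event callback can be
sketched as a walk that stops at the first non-zero return. This is a
simplified userspace model under assumptions: the struct layouts, the
counter accounting and both function names are hypothetical and only
stand in for the perf core's group iteration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical event: how many hardware counters it needs. */
struct event_sketch {
	int counters_needed;
};

/* Hypothetical per-PMU scheduling state. */
struct sched_state {
	int counters_free;
	int scheduled;
};

/* Visitor: schedule one event, or report -EBUSY when it does not fit. */
int sched_in_sketch(const struct event_sketch *ev, struct sched_state *st)
{
	if (ev->counters_needed > st->counters_free)
		return -EBUSY;
	st->counters_free -= ev->counters_needed;
	st->scheduled++;
	return 0;
}

/* Walk the events in order and stop at the first non-zero return. */
int visit_events_sketch(const struct event_sketch *evs, size_t n,
			struct sched_state *st)
{
	assert(evs && st);
	for (size_t i = 0; i < n; i++) {
		int ret = sched_in_sketch(&evs[i], st);

		if (ret)
			return ret;
	}
	return 0;
}
```

Once one flexible event fails to fit, the later events are not tried,
even if a smaller one would still have fitted.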


Re: [PATCH v4 4/4] selftests/hmm-tests: Add test for dirty bits

2022-09-07 Thread Alistair Popple


John Hubbard  writes:

> On 9/1/22 17:35, Alistair Popple wrote:

[...]

>> +/*
>> + * Try and migrate a dirty page that has previously been swapped to disk. 
>> This
>> + * checks that we don't loose dirty bits.
>
> s/loose/lose/

Thanks.

>> + */
>> +TEST_F(hmm, migrate_dirty_page)
>> +{
>> +struct hmm_buffer *buffer;
>> +unsigned long npages;
>> +unsigned long size;
>> +unsigned long i;
>> +int *ptr;
>> +int tmp = 0;
>> +
>> +npages = ALIGN(HMM_BUFFER_SIZE, self->page_size) >> self->page_shift;
>> +ASSERT_NE(npages, 0);
>> +size = npages << self->page_shift;
>> +
>> +buffer = malloc(sizeof(*buffer));
>> +ASSERT_NE(buffer, NULL);
>> +
>> +buffer->fd = -1;
>> +buffer->size = size;
>> +buffer->mirror = malloc(size);
>> +ASSERT_NE(buffer->mirror, NULL);
>> +
>> +ASSERT_EQ(setup_cgroup(), 0);
>> +
>> +buffer->ptr = mmap(NULL, size,
>> +   PROT_READ | PROT_WRITE,
>> +   MAP_PRIVATE | MAP_ANONYMOUS,
>> +   buffer->fd, 0);
>> +ASSERT_NE(buffer->ptr, MAP_FAILED);
>> +
>> +/* Initialize buffer in system memory. */
>> +for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
>> +ptr[i] = 0;
>> +
>> +ASSERT_FALSE(write_cgroup_param(cgroup, "memory.reclaim", 1UL<<30));
>> +
>> +/* Fault pages back in from swap as clean pages */
>> +for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
>> +tmp += ptr[i];
>> +
>> +/* Dirty the pte */
>> +for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
>> +ptr[i] = i;
>> +
>> +/*
>> + * Attempt to migrate memory to device, which should fail because
>> + * hopefully some pages are backed by swap storage.
>> + */
>> +ASSERT_TRUE(hmm_migrate_sys_to_dev(self->fd, buffer, npages));
>
> Are you really sure that you want to assert on that? Because doing so
> guarantees a test failure if and when we ever upgrade the kernel to
> be able to migrate swap-backed pages. And I seem to recall that this
> current inability to migrate swap-backed pages is considered a flaw
> to be fixed, right?

Right, that's a good point. I was using failure (ASSERT_TRUE) here as a
way of detecting that at least some pages are swap-backed, because if no
pages end up being swap-backed the test is invalid.

I'm not really sure what to do about it though. It's likely the fix for
swap-backed migration may make this bug impossible to hit anyway,
because the obvious fix is to just drop the pages from the swapcache
during migration which would force writeback during subsequent reclaim.

So I'm inclined to leave this here even if it only serves to remind us
about it when we do fix migration of swap-backed pages, because we will
of course run hmm-tests before submitting that fix :-) We can then
either fix the test or drop it if we think it's no longer possible to
hit.

>> +
>> +ASSERT_FALSE(write_cgroup_param(cgroup, "memory.reclaim", 1UL<<30));
>> +
>> +/* Check we still see the updated data after restoring from swap. */
>> +for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
>> +ASSERT_EQ(ptr[i], i);
>> +
>> +hmm_buffer_free(buffer);
>> +destroy_cgroup();
>> +}
>> +
>>  /*
>>   * Read anonymous memory multiple times.
>>   */
>
> thanks,


[PATCH] powerpc: Make PAGE_KERNEL_xxx macros grep-friendly

2022-09-07 Thread Christophe Leroy
Avoid multi-line definitions so that grep shows each macro in full.
The lines still remain under the 100 chars limit.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 3 +--
 arch/powerpc/include/asm/book3s/64/pgtable.h | 9 +++--
 arch/powerpc/include/asm/nohash/pgtable.h| 3 +--
 3 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index be9e3fd2a9bc..9da1ee5f9201 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -112,8 +112,7 @@ static inline bool pte_user(pte_t pte)
 /* Permission masks used for kernel mappings */
 #define PAGE_KERNEL     __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
 #define PAGE_KERNEL_NC  __pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | _PAGE_NO_CACHE)
-#define PAGE_KERNEL_NCG __pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
-                                _PAGE_NO_CACHE | _PAGE_GUARDED)
+#define PAGE_KERNEL_NCG __pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | _PAGE_NO_CACHE | _PAGE_GUARDED)
 #define PAGE_KERNEL_X   __pgprot(_PAGE_BASE | _PAGE_KERNEL_RWX)
 #define PAGE_KERNEL_RO  __pgprot(_PAGE_BASE | _PAGE_KERNEL_RO)
 #define PAGE_KERNEL_ROX __pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index c09ca4d5ba49..4243f4af3d14 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -117,8 +117,7 @@
 #define _PAGE_KERNEL_RW  (_PAGE_PRIVILEGED | _PAGE_RW | _PAGE_DIRTY)
 #define _PAGE_KERNEL_RO  (_PAGE_PRIVILEGED | _PAGE_READ)
 #define _PAGE_KERNEL_ROX (_PAGE_PRIVILEGED | _PAGE_READ | _PAGE_EXEC)
-#define _PAGE_KERNEL_RWX (_PAGE_PRIVILEGED | _PAGE_DIRTY | \
-                         _PAGE_RW | _PAGE_EXEC)
+#define _PAGE_KERNEL_RWX (_PAGE_PRIVILEGED | _PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC)
 /*
  * _PAGE_CHG_MASK masks of bits that are to be preserved across
  * pgprot changes
@@ -156,10 +155,8 @@
 
 /* Permission masks used for kernel mappings */
 #define PAGE_KERNEL     __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
-#define PAGE_KERNEL_NC  __pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
-                                _PAGE_TOLERANT)
-#define PAGE_KERNEL_NCG __pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
-                                _PAGE_NON_IDEMPOTENT)
+#define PAGE_KERNEL_NC  __pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | _PAGE_TOLERANT)
+#define PAGE_KERNEL_NCG __pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | _PAGE_NON_IDEMPOTENT)
 #define PAGE_KERNEL_X   __pgprot(_PAGE_BASE | _PAGE_KERNEL_RWX)
 #define PAGE_KERNEL_RO  __pgprot(_PAGE_BASE | _PAGE_KERNEL_RO)
 #define PAGE_KERNEL_ROX __pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 18b29cfee0d6..4fd73c7412d0 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -11,8 +11,7 @@
 /* Permission masks used for kernel mappings */
 #define PAGE_KERNEL     __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
 #define PAGE_KERNEL_NC  __pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | _PAGE_NO_CACHE)
-#define PAGE_KERNEL_NCG __pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
-                                _PAGE_NO_CACHE | _PAGE_GUARDED)
+#define PAGE_KERNEL_NCG __pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | _PAGE_NO_CACHE | _PAGE_GUARDED)
 #define PAGE_KERNEL_X   __pgprot(_PAGE_BASE | _PAGE_KERNEL_RWX)
 #define PAGE_KERNEL_RO  __pgprot(_PAGE_BASE | _PAGE_KERNEL_RO)
 #define PAGE_KERNEL_ROX __pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
-- 
2.37.1



[PATCH] powerpc: Reduce redundancy in pgtable.h

2022-09-07 Thread Christophe Leroy
PAGE_KERNEL_TEXT, PAGE_KERNEL_EXEC and PAGE_AGP are the same
for all powerpcs.

Remove duplicated definitions.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 19 ---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 19 ---
 arch/powerpc/include/asm/nohash/pgtable.h| 19 ---
 arch/powerpc/include/asm/pgtable.h   | 19 +++
 4 files changed, 19 insertions(+), 57 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 2a0ca1f9a1ff..be9e3fd2a9bc 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -118,25 +118,6 @@ static inline bool pte_user(pte_t pte)
 #define PAGE_KERNEL_RO __pgprot(_PAGE_BASE | _PAGE_KERNEL_RO)
 #define PAGE_KERNEL_ROX__pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
 
-/*
- * Protection used for kernel text. We want the debuggers to be able to
- * set breakpoints anywhere, so don't write protect the kernel text
- * on platforms where such control is possible.
- */
-#if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) 
||\
-   defined(CONFIG_KPROBES) || defined(CONFIG_DYNAMIC_FTRACE)
-#define PAGE_KERNEL_TEXT   PAGE_KERNEL_X
-#else
-#define PAGE_KERNEL_TEXT   PAGE_KERNEL_ROX
-#endif
-
-/* Make modules code happy. We don't set RO yet */
-#define PAGE_KERNEL_EXEC   PAGE_KERNEL_X
-
-/* Advertise special mapping type for AGP */
-#define PAGE_AGP   (PAGE_KERNEL_NC)
-#define HAVE_PAGE_AGP
-
 #define PTE_INDEX_SIZE PTE_SHIFT
 #define PMD_INDEX_SIZE 0
 #define PUD_INDEX_SIZE 0
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 486902aff040..c09ca4d5ba49 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -164,22 +164,6 @@
 #define PAGE_KERNEL_RO __pgprot(_PAGE_BASE | _PAGE_KERNEL_RO)
 #define PAGE_KERNEL_ROX__pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
 
-/*
- * Protection used for kernel text. We want the debuggers to be able to
- * set breakpoints anywhere, so don't write protect the kernel text
- * on platforms where such control is possible.
- */
-#if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) 
|| \
-   defined(CONFIG_KPROBES) || defined(CONFIG_DYNAMIC_FTRACE)
-#define PAGE_KERNEL_TEXT   PAGE_KERNEL_X
-#else
-#define PAGE_KERNEL_TEXT   PAGE_KERNEL_ROX
-#endif
-
-/* Make modules code happy. We don't set RO yet */
-#define PAGE_KERNEL_EXEC   PAGE_KERNEL_X
-#define PAGE_AGP   (PAGE_KERNEL_NC)
-
 #ifndef __ASSEMBLY__
 /*
  * page table defines
@@ -335,9 +319,6 @@ extern unsigned long pci_io_base;
 #define IOREMAP_END(KERN_IO_END - FIXADDR_SIZE)
 #define FIXADDR_SIZE   SZ_32M
 
-/* Advertise special mapping type for AGP */
-#define HAVE_PAGE_AGP
-
 #ifndef __ASSEMBLY__
 
 /*
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 08429c612cdf..18b29cfee0d6 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -17,25 +17,6 @@
 #define PAGE_KERNEL_RO __pgprot(_PAGE_BASE | _PAGE_KERNEL_RO)
 #define PAGE_KERNEL_ROX__pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
 
-/*
- * Protection used for kernel text. We want the debuggers to be able to
- * set breakpoints anywhere, so don't write protect the kernel text
- * on platforms where such control is possible.
- */
-#if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) 
||\
-   defined(CONFIG_KPROBES) || defined(CONFIG_DYNAMIC_FTRACE)
-#define PAGE_KERNEL_TEXT   PAGE_KERNEL_X
-#else
-#define PAGE_KERNEL_TEXT   PAGE_KERNEL_ROX
-#endif
-
-/* Make modules code happy. We don't set RO yet */
-#define PAGE_KERNEL_EXEC   PAGE_KERNEL_X
-
-/* Advertise special mapping type for AGP */
-#define PAGE_AGP   (PAGE_KERNEL_NC)
-#define HAVE_PAGE_AGP
-
 #ifndef __ASSEMBLY__
 
 /* Generic accessors to PTE bits */
diff --git a/arch/powerpc/include/asm/pgtable.h 
b/arch/powerpc/include/asm/pgtable.h
index 33f4bf8d22b0..283f40d05a4d 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -20,6 +20,25 @@ struct mm_struct;
 #include 
 #endif /* !CONFIG_PPC_BOOK3S */
 
+/*
+ * Protection used for kernel text. We want the debuggers to be able to
+ * set breakpoints anywhere, so don't write protect the kernel text
+ * on platforms where such control is possible.
+ */
+#if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) 
|| \
+   defined(CONFIG_KPROBES) || defined(CONFIG_DYNAMIC_FTRACE)
+#define PAGE_KERNEL_TEXT   PAGE_KERNEL_X
+#else
+#define PAGE_KERNEL_TEXT   PAGE_KERNEL_ROX
+#endif
+
+/* Make modules code happy. We don't set RO yet */
+#define PAGE_KERNEL_EXEC   PAGE_KERNEL_X
+
+/* 

Re: [PATCH] powerpc/pseries: Fix plpks crash on non-pseries

2022-09-07 Thread Dan Horák
Hi,
I have tested the fix on top of Fedora's
kernel-6.0.0-0.rc4.20220906git53e99dcff61e.32.fc38 and systems are
booting again.

Tested-by: Dan Horák 
Reviewed-by: Dan Horák 


With regards,

Dan


[PATCH 2/2] powerpc: Rely on generic definition of hugepd_t and is_hugepd when unused

2022-09-07 Thread Christophe Leroy
CONFIG_ARCH_HAS_HUGEPD is used to tell core mm when huge page
directories are used.

When they are not used, there is no need to provide hugepd_t or
is_hugepd(); just rely on the core mm fallback definitions.

For that, change the core mm so that it checks CONFIG_ARCH_HAS_HUGEPD
directly instead of inferring it from the existence of the is_hugepd macro.

powerpc being the only user of huge page directories, there is no
impact on other architectures.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/page.h | 5 -
 arch/powerpc/include/asm/pgtable-be-types.h | 2 ++
 arch/powerpc/include/asm/pgtable-types.h| 2 ++
 include/linux/hugetlb.h | 2 +-
 4 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index c67eb9531a3f..7f20636d13ed 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -308,11 +308,6 @@ static inline bool pfn_valid(unsigned long pfn)
 #include 
 #endif
 
-
-#ifndef CONFIG_HUGETLB_PAGE
-#define is_hugepd(pdep)(0)
-#endif /* CONFIG_HUGETLB_PAGE */
-
 struct page;
 extern void clear_user_page(void *page, unsigned long vaddr, struct page *pg);
 extern void copy_user_page(void *to, void *from, unsigned long vaddr,
diff --git a/arch/powerpc/include/asm/pgtable-be-types.h 
b/arch/powerpc/include/asm/pgtable-be-types.h
index b169bbf95fcb..82633200b500 100644
--- a/arch/powerpc/include/asm/pgtable-be-types.h
+++ b/arch/powerpc/include/asm/pgtable-be-types.h
@@ -101,6 +101,7 @@ static inline bool pmd_xchg(pmd_t *pmdp, pmd_t old, pmd_t 
new)
return pmd_raw(old) == prev;
 }
 
+#ifdef CONFIG_ARCH_HAS_HUGEPD
 typedef struct { __be64 pdbe; } hugepd_t;
 #define __hugepd(x) ((hugepd_t) { cpu_to_be64(x) })
 
@@ -108,5 +109,6 @@ static inline unsigned long hpd_val(hugepd_t x)
 {
return be64_to_cpu(x.pdbe);
 }
+#endif
 
 #endif /* _ASM_POWERPC_PGTABLE_BE_TYPES_H */
diff --git a/arch/powerpc/include/asm/pgtable-types.h 
b/arch/powerpc/include/asm/pgtable-types.h
index efed0db7b1db..082c85cc09b1 100644
--- a/arch/powerpc/include/asm/pgtable-types.h
+++ b/arch/powerpc/include/asm/pgtable-types.h
@@ -83,11 +83,13 @@ static inline bool pte_xchg(pte_t *ptep, pte_t old, pte_t 
new)
 }
 #endif
 
+#ifdef CONFIG_ARCH_HAS_HUGEPD
 typedef struct { unsigned long pd; } hugepd_t;
 #define __hugepd(x) ((hugepd_t) { (x) })
 static inline unsigned long hpd_val(hugepd_t x)
 {
return x.pd;
 }
+#endif
 
 #endif /* _ASM_POWERPC_PGTABLE_TYPES_H */
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 3ec981a0d8b3..1ec1535be04f 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -17,7 +17,7 @@ struct ctl_table;
 struct user_struct;
 struct mmu_gather;
 
-#ifndef is_hugepd
+#ifndef CONFIG_ARCH_HAS_HUGEPD
 typedef struct { unsigned long pd; } hugepd_t;
 #define is_hugepd(hugepd) (0)
 #define __hugepd(x) ((hugepd_t) { (x) })
-- 
2.37.1
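
The config-gated fallback this patch moves to can be tried out in a
standalone file. A minimal sketch, assuming the symbol is simply left
undefined to model an architecture without huge page directories;
hugepd_fallback_in_use() is a hypothetical helper added only to exercise
the definitions.

```c
#include <assert.h>

/* Uncomment to model an architecture that provides its own hugepd_t. */
/* #define CONFIG_ARCH_HAS_HUGEPD 1 */

#ifndef CONFIG_ARCH_HAS_HUGEPD
/* Generic fallback, following the shape of the hunk in linux/hugetlb.h. */
typedef struct { unsigned long pd; } hugepd_t;
#define is_hugepd(hugepd) (0)
#define __hugepd(x) ((hugepd_t) { (x) })
#endif

/* Returns 1 when the fallback is active: nothing is ever a hugepd. */
int hugepd_fallback_in_use(void)
{
	hugepd_t hpd = __hugepd(0x1000UL);

	assert(hpd.pd == 0x1000UL);
	return !is_hugepd(hpd);
}
```

Keying the fallback on the config symbol means the architecture no
longer has to define is_hugepd just to suppress the generic version.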



[PATCH 1/2] powerpc/nohash: Remove pgd_huge() stub

2022-09-07 Thread Christophe Leroy
linux/hugetlb.h has a fallback pgd_huge() macro for when
pgd_huge is not defined.

Remove the redundant powerpc definitions.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/pgtable.h | 6 --
 arch/powerpc/include/asm/page.h   | 1 -
 2 files changed, 7 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index b499da6c1a99..08429c612cdf 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -277,12 +277,6 @@ static inline int pud_huge(pud_t pud)
return 0;
 }
 
-static inline int pgd_huge(pgd_t pgd)
-{
-   return 0;
-}
-#define pgd_huge   pgd_huge
-
 #define is_hugepd(hpd) (hugepd_ok(hpd))
 #endif
 
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index e5f75c70eda8..c67eb9531a3f 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -311,7 +311,6 @@ static inline bool pfn_valid(unsigned long pfn)
 
 #ifndef CONFIG_HUGETLB_PAGE
 #define is_hugepd(pdep)(0)
-#define pgd_huge(pgd)  (0)
 #endif /* CONFIG_HUGETLB_PAGE */
 
 struct page;
-- 
2.37.1
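
The fallback idiom the commit message refers to can be shown in a
standalone file. This is a sketch under assumptions: the exact form of
the generic macro and the pgd_is_huge_sketch() wrapper are illustrative,
not copied from linux/hugetlb.h.

```c
#include <assert.h>

/*
 * An architecture that implements pgd_huge() as a function announces it
 * with "#define pgd_huge pgd_huge". Nothing is announced here, which
 * models powerpc nohash after the stub is removed.
 */

/* Generic fallback: only used when no architecture definition exists. */
#ifndef pgd_huge
#define pgd_huge(pgd) ((void)(pgd), 0)
#endif

/* Hypothetical wrapper so the macro can be exercised like a function. */
int pgd_is_huge_sketch(unsigned long pgd)
{
	return pgd_huge(pgd);
}

void check_pgd_fallback(void)
{
	/* With the fallback active, no entry is ever reported as huge. */
	assert(pgd_is_huge_sketch(0x1234UL) == 0);
}
```

Because the stub being removed also returned 0 unconditionally, dropping
it leaves behaviour unchanged as long as the generic fallback is visible
to every user.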



Re: [PATCH v2 1/3] powerpc/pseries: define driver for Platform KeyStore

2022-09-07 Thread Michael Ellerman
Nathan Chancellor  writes:
> On Wed, Sep 07, 2022 at 09:23:02AM +1000, Michael Ellerman wrote:
>> Nathan Chancellor  writes:
>> > On Sat, Jul 23, 2022 at 07:30:46AM -0400, Nayna Jain wrote:
>> >> PowerVM provides an isolated Platform Keystore(PKS) storage allocation
>> >> for each LPAR with individually managed access controls to store
>> >> sensitive information securely. It provides a new set of hypervisor
>> >> calls for Linux kernel to access PKS storage.
>> >> 
>> >> Define POWER LPAR Platform KeyStore(PLPKS) driver using H_CALL interface
>> >> to access PKS storage.
>> >> 
>> >> Signed-off-by: Nayna Jain 
>> >
>> > This commit is now in mainline as commit 2454a7af0f2a ("powerpc/pseries:
>> > define driver for Platform KeyStore") and I just bisected a crash while
>> > boot testing Fedora's configuration [1] in QEMU to it. I initially
>> > noticed this in ClangBuiltLinux's CI but this doesn't appear to be clang
>> > specific since I can reproduce with GCC 12.2.1 from Fedora. I can
>> > reproduce with just powernv_defconfig + CONFIG_PPC_PSERIES=y +
>> > CONFIG_PSERIES_PLPKS=y. Our firmware and rootfs are available in our
>> > boot-utils repository [2].
>> 
>> Thanks, classic bug I should have spotted.
>> 
>> I didn't catch it in my testing because PLPKS isn't enabled in
>> our defconfigs.
>> 
>> Does your CI enable new options by default? Or are you booting
>> allyesconfig?
>
> Neither actually. We just test a bunch of in-tree and distribution
> configurations. The distribution configurations are fetched straight
> from their URLs on gitweb so we get any updates that they do, which is
> how we noticed this (CONFIG_PSERIES_PLPKS was recently enabled in
> Fedora):
>
> https://src.fedoraproject.org/rpms/kernel/c/a73f6858a2cbd16bbcc6d305d6c43aab6f59d0b1

Aha, neat trick.

>> I'll send a fix.
>
> Thanks for the quick response!

Thanks for the bug report :)

cheers


Re: [PATCH] powerpc/mm/book3s/hash: Rename flush_tlb_pmd_range

2022-09-07 Thread Christophe Leroy


Le 07/09/2022 à 10:19, Aneesh Kumar K.V a écrit :
> This function does the hash page table update. Hence rename it to
> indicate this better to avoid confusion with flush_pmd_tlb_range()
> 
> Signed-off-by: Aneesh Kumar K.V 
> ---
>   arch/powerpc/include/asm/book3s/64/tlbflush-hash.h | 5 ++---
>   arch/powerpc/mm/book3s64/hash_pgtable.c| 2 +-
>   arch/powerpc/mm/book3s64/hash_tlb.c| 2 +-
>   3 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h 
> b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> index 8b762f282190..fd30fa20c392 100644
> --- a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> @@ -112,13 +112,12 @@ static inline void 
> hash__flush_tlb_kernel_range(unsigned long start,
>   
>   struct mmu_gather;
>   extern void hash__tlb_flush(struct mmu_gather *tlb);
> -void flush_tlb_pmd_range(struct mm_struct *mm, pmd_t *pmd, unsigned long 
> addr);
>   
>   #ifdef CONFIG_PPC_64S_HASH_MMU
>   /* Private function for use by PCI IO mapping code */
>   extern void __flush_hash_table_range(unsigned long start, unsigned long 
> end);
> -extern void flush_tlb_pmd_range(struct mm_struct *mm, pmd_t *pmd,
> - unsigned long addr);
> +extern void flush_hash_table_pmd_range(struct mm_struct *mm, pmd_t *pmd,
> +unsigned long addr);

The 'extern' keyword is pointless and deprecated on function prototypes; it
would be better to remove it.

>   #else
>   static inline void __flush_hash_table_range(unsigned long start, unsigned 
> long end) { }
>   #endif
> diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c 
> b/arch/powerpc/mm/book3s64/hash_pgtable.c
> index ae008b9df0e6..f30131933a01 100644
> --- a/arch/powerpc/mm/book3s64/hash_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
> @@ -256,7 +256,7 @@ pmd_t hash__pmdp_collapse_flush(struct vm_area_struct 
> *vma, unsigned long addres
>* the __collapse_huge_page_copy can result in copying
>* the old content.
>*/
> - flush_tlb_pmd_range(vma->vm_mm, &pmd, address);
> + flush_hash_table_pmd_range(vma->vm_mm, &pmd, address);
>   return pmd;
>   }
>   
> diff --git a/arch/powerpc/mm/book3s64/hash_tlb.c 
> b/arch/powerpc/mm/book3s64/hash_tlb.c
> index eb0bccaf221e..a64ea0a7ef96 100644
> --- a/arch/powerpc/mm/book3s64/hash_tlb.c
> +++ b/arch/powerpc/mm/book3s64/hash_tlb.c
> @@ -221,7 +221,7 @@ void __flush_hash_table_range(unsigned long start, 
> unsigned long end)
>   local_irq_restore(flags);
>   }
>   
> -void flush_tlb_pmd_range(struct mm_struct *mm, pmd_t *pmd, unsigned long 
> addr)
> +void flush_hash_table_pmd_range(struct mm_struct *mm, pmd_t *pmd, unsigned 
> long addr)
>   {
>   pte_t *pte;
>   pte_t *start_pte;
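
On the 'extern' remark above: in C, a function declared at file scope has external linkage by default, so the keyword changes nothing on a prototype. A minimal sketch, using hypothetical names (flush_range_a, flush_range_b) that are not taken from the patch:

```c
/*
 * Both prototypes declare a function with external linkage.
 * The 'extern' on the first one is redundant.
 * The names here are hypothetical, for illustration only.
 */
extern int flush_range_a(unsigned long start, unsigned long end);
int flush_range_b(unsigned long start, unsigned long end);

/* Each definition just returns the length of the range. */
int flush_range_a(unsigned long start, unsigned long end)
{
	return (int)(end - start);
}

int flush_range_b(unsigned long start, unsigned long end)
{
	return (int)(end - start);
}
```

Both functions can be called from another translation unit in exactly the same way, which is why dropping 'extern' from the prototype is a pure cleanup.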

[PATCH] powerpc/mm/book3s/hash: Rename flush_tlb_pmd_range

2022-09-07 Thread Aneesh Kumar K.V
This function updates the hash page table. Rename it to make that
clear and to avoid confusion with flush_pmd_tlb_range().

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/tlbflush-hash.h | 5 ++---
 arch/powerpc/mm/book3s64/hash_pgtable.c| 2 +-
 arch/powerpc/mm/book3s64/hash_tlb.c| 2 +-
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
index 8b762f282190..fd30fa20c392 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
@@ -112,13 +112,12 @@ static inline void hash__flush_tlb_kernel_range(unsigned 
long start,
 
 struct mmu_gather;
 extern void hash__tlb_flush(struct mmu_gather *tlb);
-void flush_tlb_pmd_range(struct mm_struct *mm, pmd_t *pmd, unsigned long addr);
 
 #ifdef CONFIG_PPC_64S_HASH_MMU
 /* Private function for use by PCI IO mapping code */
 extern void __flush_hash_table_range(unsigned long start, unsigned long end);
-extern void flush_tlb_pmd_range(struct mm_struct *mm, pmd_t *pmd,
-   unsigned long addr);
+extern void flush_hash_table_pmd_range(struct mm_struct *mm, pmd_t *pmd,
+  unsigned long addr);
 #else
 static inline void __flush_hash_table_range(unsigned long start, unsigned long 
end) { }
 #endif
diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c 
b/arch/powerpc/mm/book3s64/hash_pgtable.c
index ae008b9df0e6..f30131933a01 100644
--- a/arch/powerpc/mm/book3s64/hash_pgtable.c
+++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
@@ -256,7 +256,7 @@ pmd_t hash__pmdp_collapse_flush(struct vm_area_struct *vma, 
unsigned long addres
 * the __collapse_huge_page_copy can result in copying
 * the old content.
 */
-   flush_tlb_pmd_range(vma->vm_mm, &pmd, address);
+   flush_hash_table_pmd_range(vma->vm_mm, &pmd, address);
return pmd;
 }
 
diff --git a/arch/powerpc/mm/book3s64/hash_tlb.c 
b/arch/powerpc/mm/book3s64/hash_tlb.c
index eb0bccaf221e..a64ea0a7ef96 100644
--- a/arch/powerpc/mm/book3s64/hash_tlb.c
+++ b/arch/powerpc/mm/book3s64/hash_tlb.c
@@ -221,7 +221,7 @@ void __flush_hash_table_range(unsigned long start, unsigned 
long end)
local_irq_restore(flags);
 }
 
-void flush_tlb_pmd_range(struct mm_struct *mm, pmd_t *pmd, unsigned long addr)
+void flush_hash_table_pmd_range(struct mm_struct *mm, pmd_t *pmd, unsigned 
long addr)
 {
pte_t *pte;
pte_t *start_pte;
-- 
2.37.3



Re: [PATCH linux-next] ocxl: Remove the unneeded result variable

2022-09-07 Thread Andrew Donnellan
On Tue, 2022-09-06 at 07:20 +, cgel@gmail.com wrote:
> From: ye xingchen 
> 
> Return the value of opal_npu_spa_clear_cache() directly instead of
> storing it in a redundant variable.
> 
> Reported-by: Zeal Robot 
> Signed-off-by: ye xingchen 

Acked-by: Andrew Donnellan 

> ---
>  arch/powerpc/platforms/powernv/ocxl.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/ocxl.c
> b/arch/powerpc/platforms/powernv/ocxl.c
> index 27c936075031..629067781cec 100644
> --- a/arch/powerpc/platforms/powernv/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/ocxl.c
> @@ -478,10 +478,8 @@ EXPORT_SYMBOL_GPL(pnv_ocxl_spa_release);
>  int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int
> pe_handle)
>  {
> struct spa_data *data = (struct spa_data *) platform_data;
> -   int rc;
>  
> -   rc = opal_npu_spa_clear_cache(data->phb_opal_id, data->bdfn,
> pe_handle);
> -   return rc;
> +   return opal_npu_spa_clear_cache(data->phb_opal_id, data->bdfn, pe_handle);
>  }
>  EXPORT_SYMBOL_GPL(pnv_ocxl_spa_remove_pe_from_cache);
>  

-- 
Andrew Donnellan              OzLabs, ADL Canberra
a...@linux.ibm.com            IBM Australia Limited
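
The cleanup acked above follows a common pattern: a local that only carries a return value to the next line can be dropped. A minimal sketch of the before and after shapes, where clear_cache_stub() is a hypothetical stand-in and not the real OPAL call:

```c
/* Hypothetical stand-in for a firmware call; not the real OPAL API. */
static int clear_cache_stub(int pe_handle)
{
	return pe_handle < 0 ? -1 : 0;
}

/* Before: the result is copied into a local only to be returned. */
int remove_pe_before(int pe_handle)
{
	int rc;

	rc = clear_cache_stub(pe_handle);
	return rc;
}

/* After: return the callee's value directly; behavior is unchanged. */
int remove_pe_after(int pe_handle)
{
	return clear_cache_stub(pe_handle);
}
```

Both variants return the same value for every input, so the change only affects readability.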



[PATCH] powerpc/pseries: Fix plpks crash on non-pseries

2022-09-07 Thread Michael Ellerman
As reported[1] by Nathan, the recently added plpks driver will crash if
it's built into the kernel and booted on a non-pseries machine, e.g.
powernv:

  kernel BUG at arch/powerpc/kernel/syscall.c:39!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
  ...
  NIP system_call_exception+0x90/0x3d0
  LR  system_call_common+0xec/0x250
  Call Trace:
0xc35c3e10 (unreliable)
system_call_common+0xec/0x250
  --- interrupt: c00 at plpar_hcall+0x38/0x60
  NIP:  c00e4300 LR: c202945c CTR: 
  REGS: c35c3e80 TRAP: 0c00   Not tainted  (6.0.0-rc4)
  MSR:  92009033   CR: 28000284  XER: 

  ...
  NIP plpar_hcall+0x38/0x60
  LR  pseries_plpks_init+0x64/0x23c
  --- interrupt: c00

On powernv, Linux is the hypervisor, so a hypercall just ends up going to
the syscall path, which BUGs if the syscall (hypercall) didn't come from
userspace.

The fix is simply to not probe the plpks driver on non-pseries machines.

[1] https://lore.kernel.org/linuxppc-dev/Yxe06fbq18Wv9y3W@dev-arch.thelio-3990X/

Fixes: 2454a7af0f2a ("powerpc/pseries: define driver for Platform KeyStore")
Reported-by: Nathan Chancellor 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/pseries/plpks.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/plpks.c 
b/arch/powerpc/platforms/pseries/plpks.c
index 52aaa2894606..f4b5b5a64db3 100644
--- a/arch/powerpc/platforms/pseries/plpks.c
+++ b/arch/powerpc/platforms/pseries/plpks.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "plpks.h"
 
@@ -457,4 +458,4 @@ static __init int pseries_plpks_init(void)
 
return rc;
 }
-arch_initcall(pseries_plpks_init);
+machine_arch_initcall(pseries, pseries_plpks_init);
-- 
2.37.2
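
The effect of gating an initcall on the machine type, as the patch above does, can be modelled in plain userspace C. This is only a sketch under stated assumptions: booted_machine, plpks_init_stub() and run_machine_initcall() are hypothetical names for illustration, not the kernel's implementation of machine_arch_initcall().

```c
#include <string.h>

/* Hypothetical model of the platform the kernel booted on. */
static const char *booted_machine = "powernv";

/* Counts how often the stand-in hypercall path was entered. */
static int hcall_count;

/* Stand-in for a driver init function that issues a hypercall. */
static int plpks_init_stub(void)
{
	hcall_count++;
	return 0;
}

/*
 * Run fn only when the booted machine matches 'machine'.
 * A skipped initcall reports success (0) without calling fn.
 */
static int run_machine_initcall(const char *machine, int (*fn)(void))
{
	if (strcmp(booted_machine, machine) != 0)
		return 0;
	return fn();
}
```

With booted_machine set to "powernv", calling run_machine_initcall("pseries", plpks_init_stub) returns 0 without ever reaching the stub, which mirrors how the fix keeps the driver from issuing a hypercall on a non-pseries machine.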



[PATCH] powerpc/math-emu: Inhibit W=1 warnings

2022-09-07 Thread Christophe Leroy
When building with W=1 you get:

arch/powerpc/math-emu/fre.c:6:5: error: no previous prototype for 'fre' 
[-Werror=missing-prototypes]
arch/powerpc/math-emu/fsqrt.c:11:1: error: no previous prototype for 
'fsqrt' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fsqrts.c:12:1: error: no previous prototype for 
'fsqrts' [-Werror=missing-prototypes]
arch/powerpc/math-emu/frsqrtes.c:6:5: error: no previous prototype for 
'frsqrtes' [-Werror=missing-prototypes]
arch/powerpc/math-emu/mtfsf.c:10:1: error: no previous prototype for 
'mtfsf' [-Werror=missing-prototypes]
arch/powerpc/math-emu/mtfsfi.c:10:1: error: no previous prototype for 
'mtfsfi' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fabs.c:7:1: error: no previous prototype for 
'fabs' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fadd.c:11:1: error: no previous prototype for 
'fadd' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fadds.c:12:1: error: no previous prototype for 
'fadds' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fcmpo.c:11:1: error: no previous prototype for 
'fcmpo' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fcmpu.c:11:1: error: no previous prototype for 
'fcmpu' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fcmpu.c:14:19: error: variable 'B_c' set but not 
used [-Werror=unused-but-set-variable]
arch/powerpc/math-emu/fcmpu.c:13:19: error: variable 'A_c' set but not 
used [-Werror=unused-but-set-variable]
arch/powerpc/math-emu/fctiw.c:11:1: error: no previous prototype for 
'fctiw' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fctiwz.c:11:1: error: no previous prototype for 
'fctiwz' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fdiv.c:11:1: error: no previous prototype for 
'fdiv' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fdivs.c:12:1: error: no previous prototype for 
'fdivs' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fmadd.c:11:1: error: no previous prototype for 
'fmadd' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fmadds.c:12:1: error: no previous prototype for 
'fmadds' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fmsub.c:11:1: error: no previous prototype for 
'fmsub' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fmsubs.c:12:1: error: no previous prototype for 
'fmsubs' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fmul.c:11:1: error: no previous prototype for 
'fmul' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fmuls.c:12:1: error: no previous prototype for 
'fmuls' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fnabs.c:7:1: error: no previous prototype for 
'fnabs' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fneg.c:7:1: error: no previous prototype for 
'fneg' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fnmadd.c:11:1: error: no previous prototype for 
'fnmadd' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fnmadds.c:12:1: error: no previous prototype for 
'fnmadds' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fnmsub.c:11:1: error: no previous prototype for 
'fnmsub' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fnmsubs.c:12:1: error: no previous prototype for 
'fnmsubs' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fres.c:7:1: error: no previous prototype for 
'fres' [-Werror=missing-prototypes]
arch/powerpc/math-emu/frsp.c:12:1: error: no previous prototype for 
'frsp' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fsel.c:11:1: error: no previous prototype for 
'fsel' [-Werror=missing-prototypes]
arch/powerpc/math-emu/lfs.c:12:1: error: no previous prototype for 
'lfs' [-Werror=missing-prototypes]
arch/powerpc/math-emu/frsqrte.c:7:1: error: no previous prototype for 
'frsqrte' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fsub.c:11:1: error: no previous prototype for 
'fsub' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fsubs.c:12:1: error: no previous prototype for 
'fsubs' [-Werror=missing-prototypes]
arch/powerpc/math-emu/mcrfs.c:10:1: error: no previous prototype for 
'mcrfs' [-Werror=missing-prototypes]
arch/powerpc/math-emu/mffs.c:10:1: error: no previous prototype for 
'mffs' [-Werror=missing-prototypes]
arch/powerpc/math-emu/mtfsb0.c:10:1: error: no previous prototype for 
'mtfsb0' [-Werror=missing-prototypes]
arch/powerpc/math-emu/mtfsb1.c:10:1: error: no previous prototype for 
'mtfsb1' [-Werror=missing-prototypes]
arch/powerpc/math-emu/stfiwx.c:7:1: error: no previous prototype for 
'stfiwx' [-Werror=missing-prototypes]
arch/powerpc/math-emu/stfs.c:12:1: error: no previous prototype for 
'stfs' [-Werror=missing-prototypes]
arch/powerpc/math-emu/fmr.c:7:1: error: no previous prototype for 'fmr' 
[-Werror=missing-prototypes]