Re: [PATCH] powerpc/kprobes: Use probe_address() to read instructions

2020-06-08 Thread Christoph Hellwig
On Tue, Jun 09, 2020 at 03:28:38PM +1000, Michael Ellerman wrote:
> On Mon, 24 Feb 2020 18:02:10 +0000 (UTC), Christophe Leroy wrote:
> > In order to avoid Oopses, use probe_address() to read the
> > instruction at the address where the trap happened.
> 
> Applied to powerpc/next.
> 
> [1/1] powerpc/kprobes: Use probe_address() to read instructions
>   https://git.kernel.org/powerpc/c/9ed5df69b79a22b40b20bc2132ba2495708b19c4

probe_address() has been renamed to get_kernel_nofault() in the -mm
queue that Andrew sent off to Linus last night.


Re: [PATCH v8 22.5/30] powerpc/optprobes: Add register argument to patch_imm64_load_insns()

2020-06-08 Thread Michael Ellerman
On Sat, 2020-05-16 at 11:54:49 UTC, Michael Ellerman wrote:
> From: Jordan Niethe 
> 
> Currently patch_imm32_load_insns() is used to load an instruction to
> r4 to be emulated by emulate_step(). For prefixed instructions we
> would like to be able to load a 64bit immediate to r4. To prepare for
> this make patch_imm64_load_insns() take an argument that decides which
> register to load an immediate to - rather than hardcoding r3.
> 
> Signed-off-by: Jordan Niethe 
> Signed-off-by: Michael Ellerman 

Applied to powerpc/next.

https://git.kernel.org/powerpc/c/7a8818e0df5c6b53c89c7c928498668a2bbb3de0

cheers
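
The thread does not show which instructions patch_imm64_load_insns() emits. As a rough user-space sketch, assuming the usual PowerPC lis/ori/rldicr/oris/ori sequence for building a 64-bit immediate, the effect on the target register can be modelled as below. The function name load_imm64_model() is hypothetical, not a kernel API.

```c
#include <stdint.h>

/*
 * Model of loading a 64-bit immediate into one register with five
 * instructions. Assumption: the sequence is lis/ori/rldicr/oris/ori,
 * each consuming one 16-bit chunk of the value (rldicr only shifts).
 */
uint64_t load_imm64_model(uint64_t val)
{
	uint16_t highest = (val >> 48) & 0xffff;
	uint16_t higher  = (val >> 32) & 0xffff;
	uint16_t high    = (val >> 16) & 0xffff;
	uint16_t low     = val & 0xffff;
	uint64_t reg;

	/* lis reg,highest: places the chunk in bits 16..31 and sign-extends */
	reg = (uint64_t)(int64_t)(int32_t)((uint32_t)highest << 16);
	/* ori reg,reg,higher */
	reg |= higher;
	/* rldicr reg,reg,32,31: equivalent to a left shift by 32,
	 * which also discards the sign-extension bits from lis */
	reg <<= 32;
	/* oris reg,reg,high */
	reg |= (uint64_t)high << 16;
	/* ori reg,reg,low */
	reg |= low;

	return reg;
}
```

For any input the model returns the value it was given, which is the point of the sequence. Passing the register number as an argument only changes which register the five instructions name, not the chunking.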


Re: [PATCH v2] powerpc/pseries: Make vio and ibmebus initcalls pseries specific

2020-06-08 Thread Michael Ellerman
On Tue, 21 Apr 2020 18:15:39 +1000, Oliver O'Halloran wrote:
> The vio and ibmebus buses are used for pseries specific paravirtualised
> devices and currently they're initialised by the generic initcall types.
> This is mostly fine, but it can result in some nuisance errors in dmesg
> when booting on PowerNV on some OSes, e.g.
> 
> [2.984439] synth uevent: /devices/vio: failed to send uevent
> [2.984442] vio vio: uevent: failed to send synthetic uevent
> [   17.968551] synth uevent: /devices/vio: failed to send uevent
> [   17.968554] vio vio: uevent: failed to send synthetic uevent
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/pseries: Make vio and ibmebus initcalls pseries specific
  https://git.kernel.org/powerpc/c/4336b9337824a60a0b10013c622caeee99460db5

cheers


Re: [PATCH] hw_breakpoint: Fix build warnings with clang

2020-06-08 Thread Michael Ellerman
On Tue, 2 Jun 2020 09:42:08 +0530, Ravi Bangoria wrote:
> kbuild test robot reported a few build warnings with hw_breakpoint code
> when compiled with clang[1]. Fix those.
> 
> [1]: 
> https://lore.kernel.org/linuxppc-dev/202005192233.oi9cjrta%25...@intel.com/

Applied to powerpc/next.

[1/1] hw-breakpoints: Fix build warnings with clang
  https://git.kernel.org/powerpc/c/ef3534a94fdbdeab4c89d18d0164be2ad5d6dbb7

cheers


Re: [PATCH 1/7] powerpc/powernv/npu: Clean up compound table group initialisation

2020-06-08 Thread Michael Ellerman
On Mon, 6 Apr 2020 13:07:39 +1000, Oliver O'Halloran wrote:
> Re-work the control flow a bit so what's going on is a little clearer.
> This also ensures the table_group is only initialised once in the P9
> case. This shouldn't be a functional change since all the GPU PCI
> devices should have the same table_group configuration, but it does
> look strange.

Applied to powerpc/next.

[1/7] powerpc/powernv/npu: Clean up compound table group initialisation
  https://git.kernel.org/powerpc/c/6984856865b55c9c1ee0814c30296119cd8ba511
[2/7] powerpc/powernv/iov: Don't add VFs to iommu group during PE config
  https://git.kernel.org/powerpc/c/6cff91b2b97b1b40a52971c9b1e99980dd49fd54
[3/7] powerpc/powernv/pci: Register iommu group at PE DMA setup
  https://git.kernel.org/powerpc/c/9b9408c55935ecc3b1c27b3eeb5a507394113cbb
[4/7] powerpc/powernv/pci: Add device to iommu group during dma_dev_setup()
  https://git.kernel.org/powerpc/c/84d8cc076723058cc294f4360db6ff7758c25b74
[5/7] powerpc/powernv/pci: Delete old iommu recursive iommu setup
  https://git.kernel.org/powerpc/c/f39b8b10fcc5d4617d2be5f2910e017a55444b43
[6/7] powerpc/powernv/pci: Move tce size parsing to pci-ioda-tce.c
  https://git.kernel.org/powerpc/c/96e2006a9dbc02cb1c103521405d457438a2e260
[7/7] powerpc/powernv/npu: Move IOMMU group setup into npu-dma.c
  https://git.kernel.org/powerpc/c/03b7bf341c18ff19129cc2825b62bb0e212463f1

cheers


Re: [PATCH] powerpc/powernv/pci: Add an explaination for PNV_IODA_PE_BUS_ALL

2020-06-08 Thread Michael Ellerman
On Wed, 15 Apr 2020 09:35:02 +1000, Oliver O'Halloran wrote:
> It's pretty obscure and confused me for a long time so I figured it's
> worth documenting properly.

Applied to powerpc/next.

[1/1] powerpc/powernv/pci: Add an explaination for PNV_IODA_PE_BUS_ALL
  https://git.kernel.org/powerpc/c/9d0879a2dbc3d0c15f8c71490079c1c38f9f3800

cheers


Re: [PATCH] powerpc/powernv: Add a print indicating when an IODA PE is released

2020-06-08 Thread Michael Ellerman
On Wed, 8 Apr 2020 21:22:13 +1000, Oliver O'Halloran wrote:
> Quite useful to know in some cases.

Applied to powerpc/next.

[1/1] powerpc/powernv: Add a print indicating when an IODA PE is released
  https://git.kernel.org/powerpc/c/e5500ab657c51bec5af8dcf564a096de48e7a132

cheers


Re: [PATCH] powerpc/64s: Fix early_init_mmu section mismatch

2020-06-08 Thread Michael Ellerman
On Wed, 29 Apr 2020 17:02:47 +1000, Nicholas Piggin wrote:
> Christian reports:
> 
>   MODPOST vmlinux.o
>   WARNING: modpost: vmlinux.o(.text.unlikely+0x1a0): Section mismatch in
>   reference from the function .early_init_mmu() to the function
>   .init.text:.radix__early_init_mmu()
>   The function .early_init_mmu() references
>   the function __init .radix__early_init_mmu().
>   This is often because .early_init_mmu lacks a __init
>   annotation or the annotation of .radix__early_init_mmu is wrong.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/64s: Fix early_init_mmu section mismatch
  https://git.kernel.org/powerpc/c/9384e552aabb647ec22acb00181ca1715b0fcdfe

cheers


Re: [PATCH] powerpc/64: refactor interrupt exit irq disabling sequence

2020-06-08 Thread Michael Ellerman
On Wed, 29 Apr 2020 16:24:21 +1000, Nicholas Piggin wrote:
> The same complicated sequence for juggling EE, RI, soft mask, and
> irq tracing is repeated 3 times, tidy these up into one function.
> 
> This differs quite a bit between sub architectures, so this makes
> the ppc32 port cleaner as well.

Applied to powerpc/next.

[1/1] powerpc/64: Refactor interrupt exit irq disabling sequence
  https://git.kernel.org/powerpc/c/0bdad33d6bd7b80722e2f9e588d3d7c6d6e34978

cheers


Re: [PATCH] powerpc/64s/radix: Don't prefetch DAR in update_mmu_cache

2020-06-08 Thread Michael Ellerman
On Mon, 4 May 2020 22:29:07 +1000, Nicholas Piggin wrote:
> The idea behind this prefetch was to kick off a page table walk before
> returning from the fault, getting some pipelining advantage.
> 
> But this never showed up any noticeable performance advantage, and in
> fact with KUAP the prefetches are actually blocked and cause some
> kind of micro-architectural fault. Removing this improves page fault
> microbenchmark performance by about 9%.

Applied to powerpc/next.

[1/1] powerpc/64s/radix: Don't prefetch DAR in update_mmu_cache
  https://git.kernel.org/powerpc/c/18594f9b8c45484bd527ebc6b08383b95f58ba73

cheers


Re: [PATCH 1/4] powerpc/powernv/pci: Add helper to find ioda_pe from BDFN

2020-06-08 Thread Michael Ellerman
On Fri, 17 Apr 2020 17:35:05 +1000, Oliver O'Halloran wrote:
> For each PHB we maintain a reverse-map that can be used to find the
> PE that a BDFN is currently mapped to. Add a helper for doing this
> lookup so we can check if a PE has been configured without looking
> at pdn->pe_number.

Applied to powerpc/next.

[1/4] powerpc/powernv/pci: Add helper to find ioda_pe from BDFN
  https://git.kernel.org/powerpc/c/a8d7d5fc2e1672924a391aa37ef8c02d1ec84a4e
[2/4] powerpc/powernv/pci: Re-work bus PE configuration
  https://git.kernel.org/powerpc/c/dc3d8f85bb571c3640ebba24b82a527cf2cb3f24
[3/4] powerpc/powernv/pci: Reserve the root bus PE during init
  https://git.kernel.org/powerpc/c/718d249aeadff058f79c2e6b25212dd45bd711ae
[4/4] powerpc/powernv/pci: Sprinkle around some WARN_ON()s
  https://git.kernel.org/powerpc/c/6ae8aedf8fa932541f48a85219d75ca041c22080

cheers
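
The reverse-map lookup described in patch 1/4 can be sketched in user-space C as follows. This is an illustration only: struct phb_model, INVALID_PE, MAX_PE and bdfn_to_pe() are hypothetical names, and the layout (a flat array indexed by the 16-bit BDFN) is an assumption rather than the kernel's actual data structure.

```c
#include <stddef.h>
#include <stdint.h>

#define INVALID_PE 0xffffffffu	/* hypothetical "no PE mapped" marker */
#define RMAP_SIZE  0x10000	/* one slot per 16-bit BDFN */
#define MAX_PE     16		/* arbitrary size for this sketch */

struct pe_model {
	unsigned int pe_number;
};

struct phb_model {
	uint32_t pe_rmap[RMAP_SIZE];	/* BDFN -> PE number */
	struct pe_model pe_array[MAX_PE];
};

/* Build a BDFN from bus (8 bits), device (5 bits) and function (3 bits). */
uint16_t make_bdfn(unsigned int bus, unsigned int dev, unsigned int fn)
{
	return (uint16_t)(((bus & 0xff) << 8) | ((dev & 0x1f) << 3) | (fn & 0x7));
}

/* Mark every BDFN as unmapped. */
void phb_model_init(struct phb_model *phb)
{
	for (size_t i = 0; i < RMAP_SIZE; i++)
		phb->pe_rmap[i] = INVALID_PE;
	for (unsigned int i = 0; i < MAX_PE; i++)
		phb->pe_array[i].pe_number = i;
}

/* Return the PE a BDFN is mapped to, or NULL if none is configured. */
struct pe_model *bdfn_to_pe(struct phb_model *phb, uint16_t bdfn)
{
	uint32_t pe_number = phb->pe_rmap[bdfn];

	if (pe_number == INVALID_PE || pe_number >= MAX_PE)
		return NULL;
	return &phb->pe_array[pe_number];
}
```

A NULL return doubles as the "has a PE been configured?" check, so a caller does not need a separate per-device field such as pdn->pe_number.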


Re: [PATCH 0/3] powerpc/module_64: Fix _mcount() stub

2020-06-08 Thread Michael Ellerman
On Tue, 21 Apr 2020 23:05:42 +0530, Naveen N. Rao wrote:
> This series addresses the crash reported by Qian Cai on ppc64le with
> -mprofile-kernel here:
> https://lore.kernel.org/r/15ac5b0e-a221-4b8c-9039-fa96b8ef7...@lca.pw
> 
> While fixing patch_instruction() should address the crash, we should
> still change the default stub we setup for _mcount() for cases where a
> kernel is built without ftrace.
> 
> [...]

Applied to powerpc/next.

[1/3] powerpc/module_64: Consolidate ftrace code
  https://git.kernel.org/powerpc/c/03b51416e876aea5e7638947e50831b6c988c246
[2/3] powerpc/module_64: Simplify check for -mprofile-kernel ftrace relocations
  https://git.kernel.org/powerpc/c/1f2aaed2db03150428dbcd2ddee02ae6cb4bac52
[3/3] powerpc/module_64: Use special stub for _mcount() with -mprofile-kernel
  https://git.kernel.org/powerpc/c/bd55e792de0844631d34487d43eaf3f13294ebfe

cheers


Re: [PATCH] powerpc/wii: Fix declaration made after definition

2020-06-08 Thread Michael Ellerman
On Mon, 13 Apr 2020 12:06:45 -0700, Nathan Chancellor wrote:
> A 0day randconfig uncovered an error with clang, trimmed for brevity:
> 
> arch/powerpc/platforms/embedded6xx/wii.c:195:7: error: attribute
> declaration must precede definition [-Werror,-Wignored-attributes]
> if (!machine_is(wii))
>  ^
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/wii: Fix declaration made after definition
  https://git.kernel.org/powerpc/c/91ffeaa7e5dd62753e23a1204dc7ecd11f26eadc

cheers


Re: [PATCH] powerpc/xmon: Show task->thread.regs in process display

2020-06-08 Thread Michael Ellerman
On Wed, 20 May 2020 21:17:40 +1000, Michael Ellerman wrote:
> Show the address of the task's regs in the process listing in xmon. The
> regs should always be on the stack page that we also print the address
> of, but it's still helpful not to have to find them by hand.

Applied to powerpc/next.

[1/1] powerpc/xmon: Show task->thread.regs in process display
  https://git.kernel.org/powerpc/c/0e7e92efe11bc5993def689e10f7bcb36f127651

cheers


Re: [PATCH v4 1/2] powerpc/64s/hash: Add stress_slb kernel boot option to increase SLB faults

2020-06-08 Thread Michael Ellerman
On Mon, 11 May 2020 22:58:24 +1000, Michael Ellerman wrote:
> This option increases the number of SLB misses by limiting the number
> of kernel SLB entries, and increased flushing of cached lookaside
> information. This helps stress test difficult to hit paths in the
> kernel.
> 
> [mpe: Relocate the code into arch/powerpc/mm, s/torture/stress/]

Applied to powerpc/next.

[1/1] powerpc/64s/hash: Add stress_slb kernel boot option to increase SLB faults
  https://git.kernel.org/powerpc/c/82a1b8ed5604cccf30b6ff03bcd61640cd26369b

cheers


Re: [RFC PATCH 1/4] powerpc/64s: Don't init FSCR_DSCR in __init_FSCR()

2020-06-08 Thread Michael Ellerman
On Thu, 28 May 2020 00:58:40 +1000, Michael Ellerman wrote:
> __init_FSCR() was added originally in commit 2468dcf641e4 ("powerpc:
> Add support for context switching the TAR register") (Feb 2013), and
> only set FSCR_TAR.
> 
> At that point FSCR (Facility Status and Control Register) was not
> context switched, so the setting was permanent after boot.
> 
> [...]

Applied to powerpc/next.

[1/4] powerpc/64s: Don't init FSCR_DSCR in __init_FSCR()
  https://git.kernel.org/powerpc/c/0828137e8f16721842468e33df0460044a0c588b
[2/4] powerpc/64s: Don't let DT CPU features set FSCR_DSCR
  https://git.kernel.org/powerpc/c/993e3d96fd08c3ebf7566e43be9b8cd622063e6d
[3/4] powerpc/64s: Save FSCR to init_task.thread.fscr after feature init
  https://git.kernel.org/powerpc/c/912c0a7f2b5daa3cbb2bc10f303981e493de73bd
[4/4] powerpc/64s: Don't set FSCR bits in INIT_THREAD
  https://git.kernel.org/powerpc/c/c887ef5707591e84f80271e95e99ff9fb38987b5

cheers


Re: [PATCH v2] powerpc: Add ppc_inst_as_u64()

2020-06-08 Thread Michael Ellerman
On Tue, 26 May 2020 17:26:30 +1000, Michael Ellerman wrote:
> The code patching code wants to get the value of a struct ppc_inst as
> a u64 when the instruction is prefixed, so we can pass the u64 down to
> __put_user_asm() and write it with a single store.
> 
> The optprobes code wants to load a struct ppc_inst as an immediate
> into a register so it is useful to have it as a u64 to use the
> existing helper function.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: Add ppc_inst_as_u64()
  https://git.kernel.org/powerpc/c/16ef9767e4dc5cf03a71ae7bc2bc588dbbe7983e

cheers


Re: [PATCH] input: i8042: Remove special PowerPC handling

2020-06-08 Thread Michael Ellerman
On Mon, 18 May 2020 11:10:43 -0700, Nathan Chancellor wrote:
> This causes a build error with CONFIG_WALNUT because kb_cs and kb_data
> were removed in commit 917f0af9e5a9 ("powerpc: Remove arch/ppc and
> include/asm-ppc").
> 
> ld.lld: error: undefined symbol: kb_cs
> > referenced by i8042-ppcio.h:28 (drivers/input/serio/i8042-ppcio.h:28)
> > input/serio/i8042.o:(__i8042_command) in archive drivers/built-in.a
> > referenced by i8042-ppcio.h:28 (drivers/input/serio/i8042-ppcio.h:28)
> > input/serio/i8042.o:(__i8042_command) in archive drivers/built-in.a
> > referenced by i8042-ppcio.h:28 (drivers/input/serio/i8042-ppcio.h:28)
> > input/serio/i8042.o:(__i8042_command) in archive drivers/built-in.a
> 
> [...]

Applied to powerpc/next.

[1/1] input: i8042 - Remove special PowerPC handling
  https://git.kernel.org/powerpc/c/e4f4ffa8a98c24a4ab482669b1e2b4cfce3f52f4

cheers


Re: [PATCH v2] powerpc: Add ppc_inst_next()

2020-06-08 Thread Michael Ellerman
On Fri, 22 May 2020 23:33:18 +1000, Michael Ellerman wrote:
> In a few places we want to calculate the address of the next
> instruction. Previously that was simple, we just added 4 bytes, or if
> using a u32 * we incremented that pointer by 1.
> 
> But prefixed instructions make it more complicated, we need to advance
> by either 4 or 8 bytes depending on the actual instruction. We also
> can't do pointer arithmetic using struct ppc_inst, because it is
> always 8 bytes in size on 64-bit, even though we might only need to
> advance by 4 bytes.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: Add ppc_inst_next()
  https://git.kernel.org/powerpc/c/c5ff46d69c410f7fac173e4fde3eea484b4b4eda

cheers
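
The "advance by 4 or 8 bytes" rule can be sketched in plain C. This is a user-space model, not the kernel helper: inst_len() and inst_next() are hypothetical names, and the code assumes that a prefixed instruction is recognised by primary opcode 1 in the top six bits of its first word.

```c
#include <stddef.h>
#include <stdint.h>

#define OP_PREFIX 1	/* assumed primary opcode of a prefix word */

/* Length in bytes of the instruction whose first word is 'word'. */
size_t inst_len(uint32_t word)
{
	return (word >> 26) == OP_PREFIX ? 8 : 4;
}

/*
 * Address of the instruction following the one at 'location'.
 * Works on u32 words, so a prefixed instruction advances by two
 * words and a plain one by a single word.
 */
const uint32_t *inst_next(const uint32_t *location)
{
	return location + inst_len(*location) / sizeof(uint32_t);
}
```

Walking in 32-bit words sidesteps the problem the commit message mentions: pointer arithmetic on an 8-byte struct would always step by 8, even past a 4-byte instruction.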


Re: [PATCH] powerpc/configs/64s: Enable CONFIG_PRINTK_CALLER

2020-06-08 Thread Michael Ellerman
On Wed, 20 May 2020 22:12:57 +1000, Michael Ellerman wrote:
> This adds the CPU or thread number to printk messages. This helps a
> lot when deciphering concurrent oopses that have been interleaved.
> 
> Example output, of PID1 (T1) triggering a warning:
> 
>   [1.581678][T1] WARNING: CPU: 0 PID: 1 at crypto/rsa-pkcs1pad.c:539 pkcs1pad_verify+0x38/0x140
>   [1.581681][T1] Modules linked in:
>   [1.581693][T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.5.0-rc5-gcc-8.2.0-00121-gf84c2e595927-dirty #1515
>   [1.581700][T1] NIP:  c0207d64 LR: c0207d3c CTR: c0207d2c
>   [1.581708][T1] REGS: c000fd2e7560 TRAP: 0700   Not tainted  (5.5.0-rc5-gcc-8.2.0-00121-gf84c2e595927-dirty)
>   [1.581712][T1] MSR:  90029033   CR: 44000222  XER: 0004

Applied to powerpc/next.

[1/1] powerpc/configs/64s: Enable CONFIG_PRINTK_CALLER
  https://git.kernel.org/powerpc/c/598c01b5b2fca3a9de8ad3400edbff98ec22f0b2

cheers
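
For readers unfamiliar with the "[T1]" field in the example output, here is a small sketch of how such a caller tag could be produced. It is a user-space approximation under stated assumptions: the top bit of the id selects a CPU ('C') versus a thread ('T') tag, and the tag is right-aligned in a six-character field. format_caller() and CALLER_CPU_FLAG are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CALLER_CPU_FLAG 0x80000000u	/* assumed: set when the id is a CPU number */

/* Format a caller id as "[    T1]" (thread 1) or "[    C3]" (CPU 3). */
int format_caller(char *buf, size_t len, uint32_t id)
{
	char caller[12];	/* tag letter + up to 10 digits + NUL */

	snprintf(caller, sizeof(caller), "%c%u",
		 (id & CALLER_CPU_FLAG) ? 'C' : 'T',
		 (unsigned int)(id & ~CALLER_CPU_FLAG));
	return snprintf(buf, len, "[%6s]", caller);
}
```

With a tag on every line, lines from two interleaved oopses can be separated by filtering on the bracketed field.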


Re: [PATCH] powerpc: Add ppc_inst_as_u64()

2020-06-08 Thread Michael Ellerman
On Mon, 25 May 2020 15:50:04 +1000, Michael Ellerman wrote:
> The code patching code wants to get the value of a struct ppc_inst as
> a u64 when the instruction is prefixed, so we can pass the u64 down to
> __put_user_asm() and write it with a single store.
> 
> This is a bit awkward because the value differs based on the CPU
> endianness, so add a helper to do the conversion.

Applied to powerpc/next.

[1/1] powerpc: Add ppc_inst_as_u64()
  https://git.kernel.org/powerpc/c/16ef9767e4dc5cf03a71ae7bc2bc588dbbe7983e

cheers
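
The endianness dependence mentioned above can be illustrated with a short sketch. It is a user-space model, not the kernel helper: inst_as_u64() and host_is_little_endian() are hypothetical names, and the endianness is passed as a parameter rather than selected at build time.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Combine prefix and suffix so that a single 64-bit store places the
 * prefix word at the lower address and the suffix word right after it.
 */
uint64_t inst_as_u64(uint32_t prefix, uint32_t suffix, bool little_endian)
{
	if (little_endian)
		/* low half of the u64 is stored first on little-endian */
		return ((uint64_t)suffix << 32) | prefix;
	/* high half of the u64 is stored first on big-endian */
	return ((uint64_t)prefix << 32) | suffix;
}

/* Detect the byte order of the machine running this sketch. */
bool host_is_little_endian(void)
{
	uint16_t probe = 1;

	return *(const uint8_t *)&probe == 1;
}
```

Storing inst_as_u64(prefix, suffix, host_is_little_endian()) to memory and reading it back as two 32-bit words yields the prefix first and the suffix second on either byte order.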


Re: [PATCH] powerpc/4xx: Don't unmap NULL mbase

2020-06-08 Thread Michael Ellerman
On Thu, 21 May 2020 17:26:48 +1000, Michael Ellerman wrote:
> 


Applied to powerpc/next.

[1/1] powerpc/4xx: Don't unmap NULL mbase
  https://git.kernel.org/powerpc/c/bcec081ecc940fc38730b29c743bbee661164161

cheers


Re: [PATCH] powerpc/tm: Document h/rfid and mtmsrd quirk

2020-06-08 Thread Michael Ellerman
On Wed, 25 Mar 2020 15:05:46 +1100, Michael Neuling wrote:
> The ISA has a quirk that's useful for the Linux implementation.
> Document it here so others are less likely to trip over it.

Applied to powerpc/next.

[1/1] powerpc/tm: Document h/rfid and mtmsrd quirk
  https://git.kernel.org/powerpc/c/b8707e2374f68cac79de553ae1ee5c35913813bd

cheers


Re: [PATCH] powerpc: Fix misleading small cores print

2020-06-08 Thread Michael Ellerman
On Fri, 29 May 2020 09:07:31 +1000, Michael Neuling wrote:
> Currently when we boot on a big core system, we get this print:
>   [0.040500] Using small cores at SMT level
> 
> This is misleading as we've actually detected big cores.
> 
> This patch clears up the print to say we've detected big cores but are
> using small cores for scheduling.

Applied to powerpc/next.

[1/1] powerpc: Fix misleading small cores print
  https://git.kernel.org/powerpc/c/82a7cebdd95cffa55449d6c1d97cc9b743a66056

cheers


Re: [PATCH] powerpc/configs: Add LIBNVDIMM to ppc64_defconfig

2020-06-08 Thread Michael Ellerman
On Tue, 19 May 2020 14:30:09 +1000, Michael Neuling wrote:
> This gives us OF_PMEM which is useful in mambo.
> 
> This adds 153K to the text of ppc64le_defconfig which is 0.8% of the
> total text.
> 
>   LIBNVDIMM  text      data     bss      dec       hex
>   Without    18574833  5518150  1539240  25632223  1871ddf
>   With       18727834  5546206  1539368  25813408  189e1a0

Applied to powerpc/next.

[1/1] powerpc/configs: Add LIBNVDIMM to ppc64_defconfig
  https://git.kernel.org/powerpc/c/08b1add150a8863665676d0ac9c3ad2d34b2540c

cheers


Re: [PATCH v2 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms

2020-06-08 Thread Michael Ellerman
On Mon, 30 Mar 2020 15:32:15 +0200, Michal Simek wrote:
> Recently we wanted to update the xilinx intc driver and found that a function
> we wanted to remove is still wired up by ancient Xilinx PowerPC
> platforms. Here is the thread about it.
> https://lore.kernel.org/linux-next/48d3232d-0f1d-42ea-3109-f44bbabfa...@xilinx.com/
> 
> I have been talking about it internally and there is no interest in these
> platforms, and they have also been orphaned for quite a long time. Nobody is
> really running/testing these platforms regularly, which is why I think it makes
> sense to remove them along with the drivers which are specific to this platform.
> 
> [...]

Applied to powerpc/next.

[1/2] sound: ac97: Remove sound driver for ancient platform
  https://git.kernel.org/powerpc/c/f16dca3e30c14aff545a834a7c1a1bb02b9edb48
[2/2] powerpc: Remove Xilinx PPC405/PPC440 support
  https://git.kernel.org/powerpc/c/7ade8495dcfd788a76e6877c9ea86f5207369ea4

cheers


Re: [PATCH v3 1/1] powerpc/kernel: Enables memory hot-remove after reboot on pseries guests

2020-06-08 Thread Michael Ellerman
On Thu, 2 Apr 2020 16:51:57 -0300, Leonardo Bras wrote:
> When providing guests, it's desirable to resize their memory on demand.
> 
> By now, it's possible to do so by creating a guest with a small base
> memory, hot-plugging all the rest, and using 'movable_node' kernel
> command-line parameter, which puts all hot-plugged memory in
> ZONE_MOVABLE, allowing it to be removed whenever needed.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/kernel: Enables memory hot-remove after reboot on pseries guests
  https://git.kernel.org/powerpc/c/b6eca183e23e7a6625a0d2cdb806b7cd1abcd2d2

cheers


Re: [PATCH v3] powerpc/XIVE: SVM: share the event-queue page with the Hypervisor.

2020-06-08 Thread Michael Ellerman
On Sat, 25 Apr 2020 19:05:18 -0700, Ram Pai wrote:
> From 10ea2eaf492ca3f22f67a5a63a2b7865e45299ad Mon Sep 17 00:00:00 2001
> From: Ram Pai 
> Date: Mon, 24 Feb 2020 01:09:48 -0500
> Subject: [PATCH v3] powerpc/XIVE: SVM: share the event-queue page with the
>  Hypervisor.
> 
> XIVE interrupt controller uses an Event Queue (EQ) to enqueue event
> notifications when an exception occurs. The EQ is a single memory page
> provided by the O/S defining a circular buffer, one per server and
> priority couple.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/xive: Share the event-queue page with the Hypervisor.
  https://git.kernel.org/powerpc/c/094235222d41d68d35de18170058d94a96a82628

cheers


Re: [PATCH v6 0/2] Implement reentrant rtas call

2020-06-08 Thread Michael Ellerman
On Mon, 18 May 2020 20:42:43 -0300, Leonardo Bras wrote:
> Patch 2 implements rtas_call_reentrant() for reentrant rtas-calls:
> "ibm,int-on", "ibm,int-off", "ibm,get-xive" and "ibm,set-xive",
> according to LoPAPR Version 1.1 (March 24, 2016).
> 
> For that, it's necessary that every call uses a different
> rtas buffer (rtas_args). Paul Mackerras suggested using the PACA
> structure for creating a per-cpu buffer for these calls.
> 
> [...]

Applied to powerpc/next.

[1/2] powerpc/rtas: Move type/struct definitions from rtas.h into rtas-types.h
  https://git.kernel.org/powerpc/c/783a015b747f606e803b798eb8b50c73c548691d
[2/2] powerpc/rtas: Implement reentrant rtas call
  https://git.kernel.org/powerpc/c/b664db8e3f976d9233cc9ea5e3f8a8c0bcabeb48

cheers


Re: [PATCH v2 1/1] powerpc/crash: Use NMI context for printk when starting to crash

2020-06-08 Thread Michael Ellerman
On Tue, 12 May 2020 18:45:35 -0300, Leonardo Bras wrote:
> Currently, if the printk lock (logbuf_lock) is held by another thread during
> a crash, there is a chance of deadlocking the crash on the next printk, and
> blocking a possibly desired kdump.
> 
> At the start of default_machine_crash_shutdown, make printk enter
> NMI context, as it will use per-cpu buffers to store the message,
> and avoid locking logbuf_lock.

Applied to powerpc/next.

[1/1] powerpc/crash: Use NMI context for printk when starting to crash
  https://git.kernel.org/powerpc/c/af2876b501e42c3fb5174cac9dd02598436f0fdf

cheers


Re: [PATCH v10 0/5] powerpc/hv-24x7: Expose chip/sockets info to add json file metric support for the hv_24x7 socket/chip level events

2020-06-08 Thread Michael Ellerman
On Mon, 25 May 2020 16:13:02 +0530, Kajol Jain wrote:
> This patchset fixes the inconsistent results we get when
> we run multiple 24x7 events.
> 
> "hv_24x7" pmu interface events need system-dependent parameters
> like socket/chip/core. For example, hv_24x7 chip-level events need
> the specific chip-id for which the data is requested to be added as part
> of the pmu events.
> 
> [...]

Applied to powerpc/next.

[1/5] powerpc/perf/hv-24x7: Fix inconsistent output values incase multiple hv-24x7 events run
  https://git.kernel.org/powerpc/c/b4ac18eead28611ff470d0f47a35c4e0ac080d9c
[2/5] powerpc/hv-24x7: Add rtas call in hv-24x7 driver to get processor details
  https://git.kernel.org/powerpc/c/8ba21426738207711347335b2cf3e99c690fc777
[3/5] powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show processor details
  https://git.kernel.org/powerpc/c/60beb65da1efd4cc23d05141181c39b98487950f
[4/5] Documentation/ABI: Add ABI documentation for chips and sockets
  https://git.kernel.org/powerpc/c/15cd1d35ba4a59832df693858ef046457107bd8d
[5/5] powerpc/pseries: Update hv-24x7 information after migration
  https://git.kernel.org/powerpc/c/373b373053384f12951ae9f916043d955501d482

cheers


Re: [PATCHv4] powerpc/crashkernel: take "mem=" option into account

2020-06-08 Thread Michael Ellerman
On Wed, 1 Apr 2020 22:00:44 +0800, Pingfan Liu wrote:
> The "mem=" option is an easy way to put high pressure on memory during
> testing. Hence after applying the memory limit, the actual usable memory,
> instead of total mem, should be considered when reserving mem for
> crashkernel. Otherwise the boot may hit OOM issues.
> 
> E.g. it would reserve 4G prior to the change and 512M afterward, if passing
> crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and
> mem=5G on a 256G machine.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/crashkernel: Take "mem=" option into account
  https://git.kernel.org/powerpc/c/be5470e0c285a68dc3afdea965032f5ddc8269d7

cheers
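
The 4G versus 512M example above can be reproduced with a small table lookup. This is a sketch under assumptions, not the kernel's parser: the ranges are hard-coded from the quoted crashkernel= string, each range is treated as start-inclusive and end-exclusive, and usable_ram() and crashkernel_size() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define MiB (1ULL << 20)
#define GiB (1ULL << 30)

struct crash_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive; 0 means "no upper bound" */
	uint64_t size;	/* reservation for RAM sizes in this range */
};

/* crashkernel=2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G */
static const struct crash_range ranges[] = {
	{   2 * GiB,   4 * GiB, 384 * MiB },
	{   4 * GiB,  16 * GiB, 512 * MiB },
	{  16 * GiB,  64 * GiB,   1 * GiB },
	{  64 * GiB, 128 * GiB,   2 * GiB },
	{ 128 * GiB,         0,   4 * GiB },
};

/* RAM the kernel may actually use; mem_limit == 0 means no "mem=" given. */
uint64_t usable_ram(uint64_t total_ram, uint64_t mem_limit)
{
	return (mem_limit && mem_limit < total_ram) ? mem_limit : total_ram;
}

/* Reservation size for a given amount of RAM, or 0 if no range matches. */
uint64_t crashkernel_size(uint64_t ram)
{
	for (size_t i = 0; i < sizeof(ranges) / sizeof(ranges[0]); i++) {
		if (ram >= ranges[i].start &&
		    (ranges[i].end == 0 || ram < ranges[i].end))
			return ranges[i].size;
	}
	return 0;
}
```

Feeding the full 256G into crashkernel_size() selects the 128G- range and reserves 4G, which is most of a machine limited to mem=5G. Passing usable_ram(256G, 5G) instead selects the 4G-16G range and reserves 512M.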


Re: [PATCH] powerpc/fadump: account for memory_limit while reserving memory

2020-06-08 Thread Michael Ellerman
On Wed, 27 May 2020 15:14:35 +0530, Hari Bathini wrote:
> If the memory chunk found for reserving memory overshoots the memory
> limit imposed, do not proceed with reserving memory. Default behavior
> was this until commit 140777a3d8df ("powerpc/fadump: consider reserved
> ranges while reserving memory") changed it unwittingly.

Applied to powerpc/next.

[1/1] powerpc/fadump: Account for memory_limit while reserving memory
  https://git.kernel.org/powerpc/c/9a2921e5baca1d25eb8d21f21d1e90581a6d0f68

cheers


Re: [PATCH] macintosh/ams-input: switch to using input device polling mode

2020-06-08 Thread Michael Ellerman
On Wed, 2 Oct 2019 14:48:54 -0700, Dmitry Torokhov wrote:
> Now that instances of input_dev support polling mode natively,
> we no longer need to create input_polled_dev instance.

Applied to powerpc/next.

[1/1] macintosh/ams-input: switch to using input device polling mode
  https://git.kernel.org/powerpc/c/0c444d98efad89e2a189d1a5a188e0385edac647

cheers


Re: [PATCH v5 00/13] Modernise powerpc 40x

2020-06-08 Thread Michael Ellerman
On Thu, 21 May 2020 16:55:51 +0000 (UTC), Christophe Leroy wrote:
> v1 and v2 of this series were aiming at removing 40x entirely,
> but it led to protests.
> 
> v3 is trying to start modernising powerpc 40x:
> - Rework TLB miss handlers to not use PTE_ATOMIC_UPDATES and _PAGE_HWWRITE
> - Remove old versions of 40x processors, namely 403 and 405GP and associated
> errata.
> - Last two patches are trivial changes in TLB miss handlers to reduce number
> of scratch registers.
> 
> [...]

Applied to powerpc/next.

[01/13] powerpc: Remove Xilinx PPC405/PPC440 support
  https://git.kernel.org/powerpc/c/7ade8495dcfd788a76e6877c9ea86f5207369ea4
[02/13] powerpc/40x: Rework 40x PTE access and TLB miss
  https://git.kernel.org/powerpc/c/2c74e2586bb96012ffc05f1c819b05d9cad86d6e
[03/13] powerpc/pgtable: Drop PTE_ATOMIC_UPDATES
  https://git.kernel.org/powerpc/c/4e1df545e2fae53e07c93b835c3dcc9d4917c849
[04/13] powerpc/40x: Remove support for IBM 403GCX
  https://git.kernel.org/powerpc/c/1b5c0967ab8aa9424cdd5108de4e055d8aeaa9d0
[05/13] powerpc/40x: Remove STB03xxx
  https://git.kernel.org/powerpc/c/7583b63c343c1076c89b2012fd8758473f046f5f
[06/13] powerpc/40x: Remove WALNUT
  https://git.kernel.org/powerpc/c/5786074b96e38691a0cb3d3644ca2aa5d6d8830d
[07/13] powerpc/40x: Remove EP405
  https://git.kernel.org/powerpc/c/548f5244f1064c9facb19c5e97c21e1e80102ea0
[08/13] powerpc/40x: Remove support for ISS Simulator
  https://git.kernel.org/powerpc/c/2874ec75708eed59a47a9a986c02add747ae6e9b
[09/13] powerpc/40x: Remove support for IBM 405GP
  https://git.kernel.org/powerpc/c/7d372d4ccdd55d5ead4d4ecbc336af4dd7d04344
[10/13] powerpc/40x: Remove IBM405 Erratum #51
  https://git.kernel.org/powerpc/c/59fb463b48e904dfdfff64c7dd4d67f20ae27170
[11/13] powerpc: Remove IBM405 Erratum #77
  https://git.kernel.org/powerpc/c/455531e9d88048c025ff9099796413df748d92b9
[12/13] powerpc/40x: Avoid using r12 in TLB miss handlers
  https://git.kernel.org/powerpc/c/797f4016f6da4a90ac83e32b213b68ff7be3812b
[13/13] powerpc/40x: Don't save CR in SPRN_SPRG_SCRATCH6
  https://git.kernel.org/powerpc/c/3aacaa719b7bf135551cabde2480e8f7bfdf7c7d

cheers


Re: [PATCH 0/3] powerpc/xive: PCI hotplug fixes under PowerVM

2020-06-08 Thread Michael Ellerman
On Wed, 29 Apr 2020 09:51:19 +0200, Cédric Le Goater wrote:
> Here are a couple of fixes for PCI hotplug issues for machines running
> under the POWER hypervisor using hash MMU and the XIVE interrupt mode.
> 
> Commit 1ca3dec2b2df ("powerpc/xive: Prevent page fault issues in the
> machine crash handler") forced the mapping of the XIVE ESB page and
> this is now blocking the removal of a passthrough IO adapter because
> the PCI isolation fails with "valid outstanding translations". Under
> KVM, the ESB pages for the adapter interrupts are un-mapped from the
> guest by the hypervisor in the KVM XIVE native device. This is now
> redundant but it's harmless.
> 
> [...]

Patches 1 & 3 applied to powerpc/next.

[1/3] powerpc/xive: Clear the page tables for the ESB IO mapping
  https://git.kernel.org/powerpc/c/a101950fcb78b0ba20cd487be6627dea58d55c2b
[3/3] powerpc/xive: Do not expose a debugfs file when XIVE is disabled
  https://git.kernel.org/powerpc/c/0755e85570a4615ca674ad6489d44d63916f1f3e

cheers


Re: [PATCH v4 00/45] Use hugepages to map kernel mem on 8xx

2020-06-08 Thread Michael Ellerman
On Tue, 19 May 2020 05:48:42 +0000 (UTC), Christophe Leroy wrote:
> The main purpose of this big series is to:
> - reorganise huge page handling to avoid using mm_slices.
> - use huge pages to map kernel memory on the 8xx.
> 
> The 8xx supports 4 page sizes: 4k, 16k, 512k and 8M.
> It uses 2 Level page tables, PGD having 1024 entries, each entry
> covering 4M address space. Then each page table has 1024 entries.
> 
> [...]
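
The two-level layout described above (1024 PGD entries of 4M each, then 1024 entries per page table) implies the following index arithmetic for a 32-bit address. This is a sketch assuming a 4k base page size; the macro and function names mirror common kernel naming but are defined here only for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT	12	/* 4k pages (assumed base page size) */
#define PGDIR_SHIFT	22	/* each PGD entry covers 4M */
#define PTRS_PER_PTE	1024
#define PTRS_PER_PGD	1024

/* Index of the PGD entry covering 'addr'. */
unsigned int pgd_index(uint32_t addr)
{
	return addr >> PGDIR_SHIFT;
}

/* Index of the PTE for 'addr' within its page table. */
unsigned int pte_index(uint32_t addr)
{
	return (addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}
```

With these numbers, 1024 PGD entries times 4M cover the full 4G address space. By the same arithmetic, a 512k page spans 128 consecutive 4k entries in one page table, and an 8M page spans two PGD entries.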

Patches 1-6 and 9-45 applied to powerpc/next.

[01/45] powerpc/kasan: Fix error detection on memory allocation
  https://git.kernel.org/powerpc/c/d132443a73d7a131775df46f33000f67ed92de1e
[02/45] powerpc/kasan: Fix issues by lowering KASAN_SHADOW_END
  https://git.kernel.org/powerpc/c/3a66a24f6060e6775f8c02ac52329ea0152d7e58
[03/45] powerpc/kasan: Fix shadow pages allocation failure
  https://git.kernel.org/powerpc/c/d2a91cef9bbdeb87b7449fdab1a6be6000930210
[04/45] powerpc/kasan: Remove unnecessary page table locking
  https://git.kernel.org/powerpc/c/7c31c05e00fc5ff2067332c5f80e525573e7269c
[05/45] powerpc/kasan: Refactor update of early shadow mappings
  https://git.kernel.org/powerpc/c/7dec42ab57f2f59feba82abf0353164479bfde4c
[06/45] powerpc/kasan: Declare kasan_init_region() weak
  https://git.kernel.org/powerpc/c/ec97d022f621c6c850aec46d8818b49c6aae95ad
[09/45] powerpc/ptdump: Add _PAGE_COHERENT flag
  https://git.kernel.org/powerpc/c/3af4786eb429b2df76cbd7ce3bae21467ac3e4fb
[10/45] powerpc/ptdump: Display size of BATs
  https://git.kernel.org/powerpc/c/6b30830e2003d9d77696084ebe2fc19dbe7d6f70
[11/45] powerpc/ptdump: Standardise display of BAT flags
  https://git.kernel.org/powerpc/c/8961a2a5353cca5451f648f4838cd848a3b2354c
[12/45] powerpc/ptdump: Properly handle non standard page size
  https://git.kernel.org/powerpc/c/b00ff6d8c1c3898b0f768cbb38ef722d25bd2f39
[13/45] powerpc/ptdump: Handle hugepd at PGD level
  https://git.kernel.org/powerpc/c/6b789a26d7da2e0256d199da980369ef8fb49ec6
[14/45] powerpc/32s: Don't warn when mapping RO data ROX.
  https://git.kernel.org/powerpc/c/4b19f96a81bceaf0bcf44d79c0855c61158065ec
[15/45] powerpc/mm: Allocate static page tables for fixmap
  https://git.kernel.org/powerpc/c/925ac141d106b55acbe112a9272f970631a3c082
[16/45] powerpc/mm: Fix conditions to perform MMU specific management by blocks on PPC32.
  https://git.kernel.org/powerpc/c/4e3319c23a66dabfd6c35f4d2633d64d99b68096
[17/45] powerpc/mm: PTE_ATOMIC_UPDATES is only for 40x
  https://git.kernel.org/powerpc/c/fadaac67c9007cad9fc485e36dcc54460d6d5886
[18/45] powerpc/mm: Refactor pte_update() on nohash/32
  https://git.kernel.org/powerpc/c/2db99aeb63dd6e8808dc054d181c4d0e8645bbe0
[19/45] powerpc/mm: Refactor pte_update() on book3s/32
  https://git.kernel.org/powerpc/c/1c1bf294882bd12669e39ccd7680c4ce34b7c15c
[20/45] powerpc/mm: Standardise __ptep_test_and_clear_young() params between PPC32 and PPC64
  https://git.kernel.org/powerpc/c/c7fa77016eb6093df38fdabdb7a89bb9617e7185
[21/45] powerpc/mm: Standardise pte_update() prototype between PPC32 and PPC64
  https://git.kernel.org/powerpc/c/06f52524870122fb43b214d27e8f4546da36f8ba
[22/45] powerpc/mm: Create a dedicated pte_update() for 8xx
  https://git.kernel.org/powerpc/c/6ad41bfbc907be0cd414f09fa5382d2133376595
[23/45] powerpc/mm: Reduce hugepd size for 8M hugepages on 8xx
  https://git.kernel.org/powerpc/c/b12c07a4bb064c0a8db7554557b89d40f57c936f
[24/45] powerpc/8xx: Drop CONFIG_8xx_COPYBACK option
  https://git.kernel.org/powerpc/c/d3efcd38c0b99162d889e36a30425345a18edb33
[25/45] powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.
  https://git.kernel.org/powerpc/c/a891c43b97d315ee5f9fe8e797d3d48fc351e053
[26/45] powerpc/8xx: Manage 512k huge pages as standard pages.
  https://git.kernel.org/powerpc/c/b250c8c08c79d1eb5354c7eaa84b7505f5f2d921
[27/45] powerpc/8xx: Only 8M pages are hugepte pages now
  https://git.kernel.org/powerpc/c/d4870b89acd7c362ded08f9295e8d143cf7e0024
[28/45] powerpc/8xx: MM_SLICE is not needed anymore
  https://git.kernel.org/powerpc/c/555904d07eef3a2e5fc458419edf6174362c4ddd
[29/45] powerpc/8xx: Move PPC_PIN_TLB options into 8xx Kconfig
  https://git.kernel.org/powerpc/c/5d4656696c30cef56b2ab506b203533c818af04d
[30/45] powerpc/8xx: Add function to set pinned TLBs
  https://git.kernel.org/powerpc/c/f76c8f6d257cefda60221c83af7f97d9f74cb3ce
[31/45] powerpc/8xx: Don't set IMMR map anymore at boot
  https://git.kernel.org/powerpc/c/136a9a0f74d2e0d9de5515190fe80344b86b45cf
[32/45] powerpc/8xx: Always pin TLBs at startup.
  https://git.kernel.org/powerpc/c/684c1664e0de63398aceb748343541b48d398710
[33/45] powerpc/8xx: Drop special handling of Linear and IMMR mappings in I/D TLB handlers
  https://git.kernel.org/powerpc/c/400dc0f86102d2ad11d3601f1948fbb02e926431
[34/45] powerpc/8xx: Remove now unused TLB m

Re: [PATCH v2] powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG

2020-06-08 Thread Michael Ellerman
On Sat, 30 May 2020 17:16:33 + (UTC), Christophe Leroy wrote:
> 'thread' doesn't exist in kuap_check() macro.
> 
> Use 'current' instead.

Applied to powerpc/next.

[1/1] powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG
  https://git.kernel.org/powerpc/c/74016701fe5f873ae23bf02835407227138d874d

cheers


Re: [PATCH] powerpc/32: disable KASAN with pages bigger than 16k

2020-06-08 Thread Michael Ellerman
On Thu, 28 May 2020 10:17:04 + (UTC), Christophe Leroy wrote:
> Mapping of early shadow area is implemented by using a single static
> page table having all entries pointing to the same early shadow page.
> The shadow area must therefore occupy full PGD entries.
> 
> The shadow area has a size of 128Mbytes starting at 0xf800.
> With 4k pages, a PGD entry is 4Mbytes
> With 16k pages, a PGD entry is 64Mbytes
> With 64k pages, a PGD entry is 256Mbytes which is too big.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/32: Disable KASAN with pages bigger than 16k
  https://git.kernel.org/powerpc/c/888468ce725a4cd56d72dc7e5096078f7a9251a0

cheers
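
For readers following along, the constraint in the commit message can be checked with a few lines of C. This is only a sketch built from the sizes quoted above (128 Mbytes of shadow, PGD entries of 4, 64 and 256 Mbytes); the helper name early_shadow_fits() is made up for illustration and does not exist in the kernel.

```c
#include <stdbool.h>

#define MB (1024UL * 1024UL)

/* Shadow area size quoted in the commit message. */
#define KASAN_SHADOW_SIZE (128 * MB)

/*
 * Hypothetical helper: the early shadow mapping points every PGD entry
 * it covers at one static page table, so the shadow area has to span a
 * whole number of PGD entries.
 */
static bool early_shadow_fits(unsigned long pgd_entry_size)
{
	return KASAN_SHADOW_SIZE >= pgd_entry_size &&
	       KASAN_SHADOW_SIZE % pgd_entry_size == 0;
}
```

With the quoted sizes, the check passes for 4k and 16k pages and fails for 64k pages, which matches the decision to disable KASAN there.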


Re: [PATCH v2 01/12] powerpc/52xx: Blacklist functions running with MMU disabled for kprobe

2020-06-08 Thread Michael Ellerman
On Tue, 31 Mar 2020 16:03:36 + (UTC), Christophe Leroy wrote:
> kprobe does not handle events happening in real mode, all
> functions running with MMU disabled have to be blacklisted.

Applied to powerpc/next.

[01/12] powerpc/52xx: Blacklist functions running with MMU disabled for kprobe

https://git.kernel.org/powerpc/c/e83f01fdb9143a4f90b17fbf7d8b8b21efb2f968
[02/12] powerpc/82xx: Blacklist pq2_restart() for kprobe

https://git.kernel.org/powerpc/c/1740f15a99d30a5e2710b2b0754e65fc5ba68d1d
[03/12] powerpc/83xx: Blacklist mpc83xx_deep_resume() for kprobe

https://git.kernel.org/powerpc/c/7aa85127b1a170694b042cbc35a07afe3904173e
[04/12] powerpc/powermac: Blacklist functions running with MMU disabled for 
kprobe

https://git.kernel.org/powerpc/c/32a820670fa00419375a964ca8bc569e1499b90d
[05/12] powerpc/mem: Blacklist flush_dcache_icache_phys() for kprobe

https://git.kernel.org/powerpc/c/a64371b5d4fb37199dcd04cb7bf0132894018e33
[06/12] powerpc/32s: Make local symbols non visible in hash_low.

https://git.kernel.org/powerpc/c/f892c21d2efb3b86ecbf8f5a95ea4abeedcc91b0
[07/12] powerpc/32s: Blacklist functions running with MMU disabled for kprobe

https://git.kernel.org/powerpc/c/e6209318d63e2774c5ab214b14b948079e040064
[08/12] powerpc/rtas: Remove machine_check_in_rtas()

https://git.kernel.org/powerpc/c/32746dfe4cf37f4077929601e8877a7fd02676e8
[09/12] powerpc/32: Blacklist functions running with MMU disabled for kprobe

https://git.kernel.org/powerpc/c/5f32e8361cba8c58c4f272a389296f489ecc2823
[10/12] powerpc/entry32: Blacklist exception entry points for kprobe.

https://git.kernel.org/powerpc/c/a616c442119f2ea5641e6abc215d7255b73b982b
[11/12] powerpc/entry32: Blacklist syscall exit points for kprobe.

https://git.kernel.org/powerpc/c/7cdf4401388572f720403a7038a178a4b30ac14c
[12/12] powerpc/entry32: Blacklist exception exit points for kprobe.

https://git.kernel.org/powerpc/c/e51c3e13709fe55d4d0eb50ba435bc53a64152bf

cheers


Re: [PATCH] powerpc/uaccess: Don't set KUEP by default on book3s/32

2020-06-08 Thread Michael Ellerman
On Wed, 15 Apr 2020 14:57:11 + (UTC), Christophe Leroy wrote:
> On book3s/32, KUEP is a heavy process as it requires
> setting/unsetting the NX bit in each of the 12 user segments
> every time the kernel is entered/exited from/to user space.
> 
> Don't select KUEP by default on book3s/32.

Applied to powerpc/next.

[1/1] powerpc/uaccess: Don't set KUEP by default on book3s/32
  https://git.kernel.org/powerpc/c/c3ba4dbbd1d05b49ec01efe098e0a78857d3ce22

cheers


Re: [PATCH] powerpc/uaccess: Don't set KUAP by default on book3s/32

2020-06-08 Thread Michael Ellerman
On Wed, 15 Apr 2020 14:57:09 + (UTC), Christophe Leroy wrote:
> On book3s/32, KUAP is a heavy process as it requires
> determining which segments are impacted and unlocking/locking
> each of them.
> 
> And since the implementation of user_access_begin/end, it
> is even worse for the time being because unlike __get_user(),
> user_access_begin doesn't differentiate between read and write
> and unlocks access also for read, although that's unneeded
> on book3s/32.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/uaccess: Don't set KUAP by default on book3s/32
  https://git.kernel.org/powerpc/c/547e687b2981a115814962506068873d24983af7

cheers


Re: [PATCH] powerpc/kprobes: Use probe_address() to read instructions

2020-06-08 Thread Michael Ellerman
On Mon, 24 Feb 2020 18:02:10 + (UTC), Christophe Leroy wrote:
> In order to avoid Oopses, use probe_address() to read the
> instruction at the address where the trap happened.

Applied to powerpc/next.

[1/1] powerpc/kprobes: Use probe_address() to read instructions
  https://git.kernel.org/powerpc/c/9ed5df69b79a22b40b20bc2132ba2495708b19c4

cheers


Re: [PATCH] powerpc/8xx: Reduce time spent in allow_user_access() and friends

2020-06-08 Thread Michael Ellerman
On Wed, 15 Apr 2020 10:06:09 + (UTC), Christophe Leroy wrote:
> To enable/disable kernel access to user space, the 8xx has to
> modify the properties of access group 1. This is done by writing
> predefined values into SPRN_Mx_AP registers.
> 
> As of today, a __put_user() gives:
> 
> 0d64 :
>  d64: 3d 20 4f ff lis r9,20479
>  d68: 61 29 ff ff ori r9,r9,65535
>  d6c: 7d 3a c3 a6 mtspr   794,r9
>  d70: 39 20 00 00 li  r9,0
>  d74: 90 83 00 00 stw r4,0(r3)
>  d78: 3d 20 6f ff lis r9,28671
>  d7c: 61 29 ff ff ori r9,r9,65535
>  d80: 7d 3a c3 a6 mtspr   794,r9
>  d84: 4e 80 00 20 blr
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/8xx: Reduce time spent in allow_user_access() and friends
  https://git.kernel.org/powerpc/c/332ce969b763553e9c4d55069e1e15aba4ea560f

cheers


Re: [PATCH -next] powerpc/powernv: add NULL check after kzalloc

2020-06-08 Thread Michael Ellerman
On Sat, 9 May 2020 10:08:38 +0800, Chen Zhou wrote:
> Fixes coccicheck warning:
> 
> ./arch/powerpc/platforms/powernv/opal.c:813:1-5:
>   alloc with no test, possible model on line 814
> 
> Add NULL check after kzalloc.

Applied to powerpc/next.

[1/1] powerpc/powernv: add NULL check after kzalloc
  https://git.kernel.org/powerpc/c/ceffa63acce7165c442395b7d64a11ab8b5c5dca

cheers


Re: [PATCH v3] powerpc/64s/pgtable: fix an undefined behaviour

2020-06-08 Thread Michael Ellerman
On Thu, 5 Mar 2020 23:48:52 -0500, Qian Cai wrote:
> Booting a power9 server with hash MMU could trigger an undefined
> behaviour because pud_offset(p4d, 0) will do,
> 
> 0 >> (PAGE_SHIFT:16 + PTE_INDEX_SIZE:8 + H_PMD_INDEX_SIZE:10)
> 
> Fix it by converting pud_index() and friends to static inline
> functions.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/64s/pgtable: fix an undefined behaviour
  https://git.kernel.org/powerpc/c/c2e929b18cea6cbf71364f22d742d9aad7f4677a

cheers
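
To illustrate why the macro form is a problem: a literal 0 has type int, and shifting a 32-bit int by 34 bits is undefined in C. A static inline function with a 64-bit parameter converts the argument first. The sketch below uses the shift values quoted in the commit message; the function name and SKETCH_PTRS_PER_PUD are assumptions for illustration, not the kernel's definitions.

```c
#include <stdint.h>

/* Values quoted in the commit message (hash MMU, 64K pages). */
#define PAGE_SHIFT        16
#define PTE_INDEX_SIZE    8
#define H_PMD_INDEX_SIZE  10
#define PUD_SHIFT (PAGE_SHIFT + PTE_INDEX_SIZE + H_PMD_INDEX_SIZE) /* 34 */

/* Hypothetical table size, chosen only so the mask is easy to check. */
#define SKETCH_PTRS_PER_PUD 1024

/*
 * A macro such as
 *   #define pud_index(addr) (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
 * evaluates pud_index(0) as an int shifted by 34 bits, which is
 * undefined behaviour. The inline function below widens the argument to
 * 64 bits before shifting, so the same call is well defined.
 */
static inline uint64_t pud_index_sketch(uint64_t address)
{
	return (address >> PUD_SHIFT) & (SKETCH_PTRS_PER_PUD - 1);
}
```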


Re: [PATCH] powerpc/book3s64/radix/tlb: Determine hugepage flush correctly

2020-06-08 Thread Michael Ellerman
On Wed, 13 May 2020 08:36:16 +0530, Aneesh Kumar K.V wrote:
> With a 64K page size flush with start and end value as below
> (start, end) = (721f680d, 721f680e) results in
> (hstart, hend) = (721f6820, 721f6800)
> 
> Avoid doing a __tlbie_va_range with the wrong hstart and hend value in this
> case.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/book3s64/radix/tlb: Determine hugepage flush correctly
  https://git.kernel.org/powerpc/c/8f53f9c0f68ab2168f637494b9e24034899c1310

cheers
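
The patch body is elided above, so the following is only a sketch of the arithmetic the commit message describes, not the applied change. It assumes a 2M huge page size and uses made-up addresses; the SKETCH_ names and needs_hugepage_flush() are hypothetical. Rounding start up and end down to a huge page boundary can leave hstart above hend, in which case no huge page lies fully inside the range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed huge page (PMD) size of 2M, for illustration only. */
#define SKETCH_PMD_SIZE (2ULL * 1024 * 1024)
#define SKETCH_PMD_MASK (~(SKETCH_PMD_SIZE - 1))

/*
 * Round start up and end down to a huge page boundary. A huge page
 * flush only makes sense when the rounded range is non-empty, that is
 * when hstart is strictly below hend.
 */
static bool needs_hugepage_flush(uint64_t start, uint64_t end,
				 uint64_t *hstart, uint64_t *hend)
{
	*hstart = (start + SKETCH_PMD_SIZE - 1) & SKETCH_PMD_MASK;
	*hend = end & SKETCH_PMD_MASK;
	return *hstart < *hend;
}
```

A single 64K page in the middle of a 2M region gives hstart > hend, so a test for plain inequality would wrongly request a flush; comparing with "<" avoids that.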


Re: [PATCH] powerpc/book3s64/kvm: Fix secondary page table walk warning during migration

2020-06-08 Thread Michael Ellerman
On Thu, 28 May 2020 13:34:56 +0530, Aneesh Kumar K.V wrote:
> This patch fixes the warning below, reported during migration.
> 
>  find_kvm_secondary_pte called with kvm mmu_lock not held
>  CPU: 23 PID: 5341 Comm: qemu-system-ppc Tainted: GW 
> 5.7.0-rc5-kvm-00211-g9ccf10d6d088 #432
>  NIP:  c00800fe848c LR: c00800fe8488 CTR: 
>  REGS: c01e19f077e0 TRAP: 0700   Tainted: GW  
> (5.7.0-rc5-kvm-00211-g9ccf10d6d088)
>  MSR:  90029033   CR: 4422  XER: 2004
>  CFAR: c012f5ac IRQMASK: 0
>  GPR00: c00800fe8488 c01e19f07a70 c00800ffe200 0039
>  GPR04: 0001 c01ffc8b4900 00018840 0007
>  GPR08: 0003 0001 0007 0001
>  GPR12: 2000 c01fff6d9400 00011f884678 7fff70b7
>  GPR16: 7fff7137cb90 7fff7dcb4410 0001 
>  GPR20: 0ffe  0001 
>  GPR24: 8000 0001 c01e1f67e600 c01e1fd82410
>  GPR28: 1000 c01e2e41 0fff 0ffe
>  NIP [c00800fe848c] kvmppc_hv_get_dirty_log_radix+0x2e4/0x340 [kvm_hv]
>  LR [c00800fe8488] kvmppc_hv_get_dirty_log_radix+0x2e0/0x340 [kvm_hv]
>  Call Trace:
>  [c01e19f07a70] [c00800fe8488] 
> kvmppc_hv_get_dirty_log_radix+0x2e0/0x340 [kvm_hv] (unreliable)
>  [c01e19f07b50] [c00800fd42e4] 
> kvm_vm_ioctl_get_dirty_log_hv+0x33c/0x3c0 [kvm_hv]
>  [c01e19f07be0] [c00800eea878] kvm_vm_ioctl_get_dirty_log+0x30/0x50 
> [kvm]
>  [c01e19f07c00] [c00800edc818] kvm_vm_ioctl+0x2b0/0xc00 [kvm]
>  [c01e19f07d50] [c046e148] ksys_ioctl+0xf8/0x150
>  [c01e19f07da0] [c046e1c8] sys_ioctl+0x28/0x80
>  [c01e19f07dc0] [c003652c] system_call_exception+0x16c/0x240
>  [c01e19f07e20] [c000d070] system_call_common+0xf0/0x278
>  Instruction dump:
>  7d3a512a 4200ffd0 7ffefb78 4bfffdc4 6000 3c82 e8848468 3c62
>  e86384a8 38840010 4800673d e8410018 <0fe0> 4bfffdd4 6000 6000

Applied to powerpc/next.

[1/1] powerpc/book3s64/kvm: Fix secondary page table walk warning during 
migration
  https://git.kernel.org/powerpc/c/bf8036a4098d1548cdccf9ed5c523ef4e83e3c68

cheers


Re: [PATCH v3 0/7] Base support for POWER10

2020-06-08 Thread Michael Ellerman
On Thu, 21 May 2020 11:43:34 +1000, Alistair Popple wrote:
> This series brings together several previously posted patches required for
> POWER10 support and introduces a new patch enabling POWER10 architected
> mode to enable booting as a POWER10 pseries guest.
> 
> It includes support for enabling facilities related to MMA and prefix
> instructions.
> 
> [...]

Patches 1-3 and 5-7 applied to powerpc/next.

[1/7] powerpc: Add new HWCAP bits
  https://git.kernel.org/powerpc/c/ee988c11acf6f9464b7b44e9a091bf6afb3b3a49
[2/7] powerpc: Add support for ISA v3.1
  https://git.kernel.org/powerpc/c/3fd5836ee801ab9ac5b314c26550e209bafa5eaa
[3/7] powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selected
  https://git.kernel.org/powerpc/c/43d0d37acbe40a9a93d9891ca670638cd22116b1
[5/7] powerpc/dt_cpu_ftrs: Enable Prefixed Instructions
  https://git.kernel.org/powerpc/c/c63d688c3dabca973c5a7da73d17422ad13f3737
[6/7] powerpc/dt_cpu_ftrs: Add MMA feature
  https://git.kernel.org/powerpc/c/87939d50e5888bd78478d9aa9455f56b919df658
[7/7] powerpc: Add POWER10 architected mode
  https://git.kernel.org/powerpc/c/a3ea40d5c7365e7e5c7c85b6f30b15142b397571

cheers


Re: [PATCH] ocxl: Fix misleading comment

2020-06-08 Thread Michael Ellerman
On Wed, 26 Feb 2020 15:39:23 +1100, Andrew Donnellan wrote:
> In ocxl_context_free() we note that the AFU reference we're releasing was
> taken in "ocxl_context_init", a function that doesn't actually exist.
> 
> Fix it to say ocxl_context_alloc() instead, which I expect was what was
> intended.

Applied to powerpc/next.

[1/1] ocxl: Fix misleading comment
  https://git.kernel.org/powerpc/c/a0594e89c9dc8e37883cc0d6642d1baad9c0744e

cheers


Re: [PATCH] cxl: Remove dead Kconfig options

2020-06-08 Thread Michael Ellerman
On Tue, 2 Jun 2020 14:03:41 +1000, Andrew Donnellan wrote:
> The CXL_AFU_DRIVER_OPS and CXL_LIB Kconfig options were added to coordinate
> merging of new features. They no longer serve any purpose, so remove them.

Applied to powerpc/next.

[1/1] cxl: Remove dead Kconfig options
  https://git.kernel.org/powerpc/c/f44b85da5e7450d0308695ba6f503d75fe6cc166

cheers


Re: [PATCH 5/5] powerpc: Add LKDTM test to hijack a patch mapping

2020-06-08 Thread Christopher M. Riedl
On Wed Jun 3, 2020 at 9:20 AM, Christophe Leroy wrote:
> On 03/06/2020 at 07:19, Christopher M. Riedl wrote:
> > When live patching with STRICT_KERNEL_RWX, the CPU doing the patching
> > must use a temporary mapping which allows for writing to kernel text.
> > During the entire window of time when this temporary mapping is in use,
> > another CPU could write to the same mapping and maliciously alter kernel
> > text. Implement a LKDTM test to attempt to exploit such an opening when
> > a CPU is patching under STRICT_KERNEL_RWX. The test is only implemented
> > on powerpc for now.
> > 
> > The LKDTM "hijack" test works as follows:
> > 
> > 1. A CPU executes an infinite loop to patch an instruction.
> >This is the "patching" CPU.
> > 2. Another CPU attempts to write to the address of the temporary
> >mapping used by the "patching" CPU. This other CPU is the
> >"hijacker" CPU. The hijack either fails with a segfault or
> >succeeds, in which case some kernel text is now overwritten.
> > 
> > How to run the test:
> > 
> > mount -t debugfs none /sys/kernel/debug
> > (echo HIJACK_PATCH > /sys/kernel/debug/provoke-crash/DIRECT)
> > 
> > Signed-off-by: Christopher M. Riedl 
> > ---
> >   drivers/misc/lkdtm/core.c  |   1 +
> >   drivers/misc/lkdtm/lkdtm.h |   1 +
> >   drivers/misc/lkdtm/perms.c | 101 +
> >   3 files changed, 103 insertions(+)
> > 
> > diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
> > index a5e344df9166..482e72f6a1e1 100644
> > --- a/drivers/misc/lkdtm/core.c
> > +++ b/drivers/misc/lkdtm/core.c
> > @@ -145,6 +145,7 @@ static const struct crashtype crashtypes[] = {
> > CRASHTYPE(WRITE_RO),
> > CRASHTYPE(WRITE_RO_AFTER_INIT),
> > CRASHTYPE(WRITE_KERN),
> > +   CRASHTYPE(HIJACK_PATCH),
> > CRASHTYPE(REFCOUNT_INC_OVERFLOW),
> > CRASHTYPE(REFCOUNT_ADD_OVERFLOW),
> > CRASHTYPE(REFCOUNT_INC_NOT_ZERO_OVERFLOW),
> > diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
> > index 601a2156a0d4..bfcf3542370d 100644
> > --- a/drivers/misc/lkdtm/lkdtm.h
> > +++ b/drivers/misc/lkdtm/lkdtm.h
> > @@ -62,6 +62,7 @@ void lkdtm_EXEC_USERSPACE(void);
> >   void lkdtm_EXEC_NULL(void);
> >   void lkdtm_ACCESS_USERSPACE(void);
> >   void lkdtm_ACCESS_NULL(void);
> > +void lkdtm_HIJACK_PATCH(void);
> >   
> >   /* lkdtm_refcount.c */
> >   void lkdtm_REFCOUNT_INC_OVERFLOW(void);
> > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> > index 62f76d506f04..8bda3b56bc78 100644
> > --- a/drivers/misc/lkdtm/perms.c
> > +++ b/drivers/misc/lkdtm/perms.c
> > @@ -9,6 +9,7 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >   #include 
> >   
> >   /* Whether or not to fill the target memory area with do_nothing(). */
> > @@ -213,6 +214,106 @@ void lkdtm_ACCESS_NULL(void)
> > *ptr = tmp;
> >   }
> >   
> > +#if defined(CONFIG_PPC) && defined(CONFIG_STRICT_KERNEL_RWX)
>
> 
> Why only PPC? I understood that this also applies to x86. And
> regardless, the test should be able to run on other architectures,
> although for sure it will fail. That's the case for other tests.
>

I think the code patching details are different between architectures
and (for now) I am only comfortable enough with PPC to implement
something meaningful. The intent of the RFC versions was to try to get
some interest (hence the distribution to the hardening list) or feedback
about how this could work on other architectures.

There are a few other tests which are arch specific in LKDTM so it's not
completely unheard of :)

> 
> > +#include 
> > +
> > +extern unsigned long read_cpu_patching_addr(unsigned int cpu);
>
> 
> 'extern' keyword is useless for functions and shall be banned.
>
> 
> Shouldn't this declaration be in asm/code-patching.h ?
>

Yes, left-over from the RFC version, this will be fixed in the next
spin.
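
As a side note on the 'extern' remark: in C, a function declaration at file scope without a storage class already has external linkage, so the two declarations below are equivalent. This is a standalone sketch; read_patching_addr_sketch() and its return value are placeholders, not the accessor from this series.

```c
/* Both declarations name the same function with external linkage. */
extern unsigned long read_patching_addr_sketch(unsigned int cpu);
unsigned long read_patching_addr_sketch(unsigned int cpu);

/* Placeholder definition so the sketch links; the value is arbitrary. */
unsigned long read_patching_addr_sketch(unsigned int cpu)
{
	(void)cpu;
	return 0x1000UL;
}
```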

> 
> > +
> > +static struct ppc_inst * const patch_site = (struct ppc_inst *)&do_nothing;
> > +
> > +static int lkdtm_patching_cpu(void *data)
> > +{
> > +   int err = 0;
> > +   struct ppc_inst insn = ppc_inst(0xdeadbeef);
> > +
> > +   pr_info("starting patching_cpu=%d\n", smp_processor_id());
> > +   do {
> > +   err = patch_instruction(patch_site, insn);
> > +   } while (ppc_inst_equal(ppc_inst_read(READ_ONCE(patch_site)), insn) &&
> > +   !err && !kthread_should_stop());
> > +
> > +   if (err)
> > +   pr_warn("patch_instruction returned error: %d\n", err);
> > +
> > +   set_current_state(TASK_INTERRUPTIBLE);
> > +   while (!kthread_should_stop()) {
> > +   schedule();
> > +   set_current_state(TASK_INTERRUPTIBLE);
> > +   }
> > +
> > +   return err;
> > +}
> > +
> > +void lkdtm_HIJACK_PATCH(void)
> > +{
> > +   struct task_struct *patching_kthrd;
> > +   struct ppc_inst original_insn;
> > +   int patching_cpu, hijacker_cpu, attempts;
> > +   unsigned long addr;
> > +   bool hijacked;
> > +
> > +   if (n

Re: [PATCH 4/5] powerpc/lib: Add LKDTM accessor for patching addr

2020-06-08 Thread Christopher M. Riedl
On Wed Jun 3, 2020 at 9:14 AM, Christophe Leroy wrote:
>
> 
>
> 
> On 03/06/2020 at 07:19, Christopher M. Riedl wrote:
> > When live patching a STRICT_RWX kernel, a mapping is installed at a
> > "patching address" with temporary write permissions. Provide a
> > LKDTM-only accessor function for this address in preparation for a LKDTM
> > test which attempts to "hijack" this mapping by writing to it from
> > another CPU.
> > 
> > Signed-off-by: Christopher M. Riedl 
> > ---
> >   arch/powerpc/lib/code-patching.c | 7 +++
> >   1 file changed, 7 insertions(+)
> > 
> > diff --git a/arch/powerpc/lib/code-patching.c 
> > b/arch/powerpc/lib/code-patching.c
> > index df0765845204..c23453049116 100644
> > --- a/arch/powerpc/lib/code-patching.c
> > +++ b/arch/powerpc/lib/code-patching.c
> > @@ -52,6 +52,13 @@ int raw_patch_instruction(struct ppc_inst *addr, struct 
> > ppc_inst instr)
> >   static struct mm_struct *patching_mm __ro_after_init;
> >   static unsigned long patching_addr __ro_after_init;
> >   
> > +#ifdef CONFIG_LKDTM
> > +unsigned long read_cpu_patching_addr(unsigned int cpu)
>
> 
> If this function is not static, it means it is intended to be used from
> some other C file, so it should be declared in a .h too.
>
Yup agreed. This was left-over from the RFC to simplify using the LKDTM
test on a tree without this series. Will fix this in the next spin.
> 
> Christophe
>
> 
> > +{
> > +   return patching_addr;
> > +}
> > +#endif
> > +
> >   void __init poking_init(void)
> >   {
> > spinlock_t *ptl; /* for protecting pte table */
> > 
>
> 
>
> 



[PATCH v2] selftests: powerpc: Fix CPU affinity for child process

2020-06-08 Thread Harish
On systems with a large number of CPUs, the test fails when trying to
set affinity for the child process, because sched_setaffinity() is
called with a cpuset that is too small. Fix this by making the size of
the allocated CPU set depend on the number of CPUs reported by
get_nprocs().

Fixes: 00b7ec5c9cf3 ("selftests/powerpc: Import Anton's context_switch2 
benchmark")
Reported-by: Shirisha Ganta 
Signed-off-by: Harish 
Signed-off-by: Sandipan Das 
---
 .../powerpc/benchmarks/context_switch.c| 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/powerpc/benchmarks/context_switch.c 
b/tools/testing/selftests/powerpc/benchmarks/context_switch.c
index a2e8c9da7fa5..de6c49d6f88f 100644
--- a/tools/testing/selftests/powerpc/benchmarks/context_switch.c
+++ b/tools/testing/selftests/powerpc/benchmarks/context_switch.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -104,8 +105,9 @@ static void start_thread_on(void *(*fn)(void *), void *arg, 
unsigned long cpu)
 
 static void start_process_on(void *(*fn)(void *), void *arg, unsigned long cpu)
 {
-   int pid;
-   cpu_set_t cpuset;
+   int pid, ncpus;
+   cpu_set_t *cpuset;
+   size_t size;
 
pid = fork();
if (pid == -1) {
@@ -116,12 +118,16 @@ static void start_process_on(void *(*fn)(void *), void 
*arg, unsigned long cpu)
if (pid)
return;
 
-   CPU_ZERO(&cpuset);
-   CPU_SET(cpu, &cpuset);
+   ncpus = get_nprocs();
+   size = CPU_ALLOC_SIZE(ncpus);
+   cpuset = CPU_ALLOC(ncpus);
+   CPU_ZERO_S(size, cpuset);
+   CPU_SET_S(cpu, size, cpuset);
 
-   if (sched_setaffinity(0, sizeof(cpuset), &cpuset)) {
+   if (sched_setaffinity(0, size, cpuset)) {
perror("sched_setaffinity");
-   exit(1);
+   CPU_FREE(cpuset);
+   exit(-1);
}
 
fn(arg);
-- 
2.24.1



Re: [PATCH v2] mm/debug_vm_pgtable: Fix kernel crash by checking for THP support

2020-06-08 Thread Anshuman Khandual



On 06/08/2020 06:22 PM, Aneesh Kumar K.V wrote:
> Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled but
> no THP support enabled based on platforms. For ex: with 4K
> PAGE_SIZE ppc64 supports THP only with radix translation.
> 
> This results in below crash when running with hash translation and
> 4K PAGE_SIZE.
> 
> kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140!
> cpu 0x61: Vector: 700 (Program Check) at [c00ff948f860]
> pc: c18810f8: debug_vm_pgtable+0x480/0x8b0
> lr: c18810ec: debug_vm_pgtable+0x474/0x8b0
> ...
> [c00ff948faf0] c1880fec debug_vm_pgtable+0x374/0x8b0 (unreliable)
> [c00ff948fbf0] c0011648 do_one_initcall+0x98/0x4f0
> [c00ff948fcd0] c1843928 kernel_init_freeable+0x330/0x3fc
> [c00ff948fdb0] c00122ac kernel_init+0x24/0x148
> [c00ff948fe20] c000cc44 ret_from_kernel_thread+0x5c/0x78
> 
> Check for THP support correctly
> 
> Cc: anshuman.khand...@arm.com
> Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table 
> helpers")
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  mm/debug_vm_pgtable.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 188c18908964..df3a3a08f4f8 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -61,6 +61,9 @@ static void __init pmd_basic_tests(unsigned long pfn, 
> pgprot_t prot)
>  {
>   pmd_t pmd = pfn_pmd(pfn, prot);
>  
> + if (!has_transparent_hugepage())
> + return;
> +
>   WARN_ON(!pmd_same(pmd, pmd));
>   WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
>   WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
> @@ -80,6 +83,9 @@ static void __init pud_basic_tests(unsigned long pfn, 
> pgprot_t prot)
>  {
>   pud_t pud = pfn_pud(pfn, prot);
>  
> + if (!has_transparent_hugepage())
> + return;
> +
>   WARN_ON(!pud_same(pud, pud));
>   WARN_ON(!pud_young(pud_mkyoung(pud_mkold(pud))));
>   WARN_ON(!pud_write(pud_mkwrite(pud_wrprotect(pud))));
> 

Builds with THP on arc, s390 and runs with THP on x86 and arm64 platforms.

Reviewed-by: Anshuman Khandual 


[PATCH kernel] KVM: PPC: Protect kvm_vcpu_read_guest with srcu locks

2020-06-08 Thread Alexey Kardashevskiy
The kvm_vcpu_read_guest/kvm_vcpu_write_guest used for nested guests
eventually call srcu_dereference_check to dereference a memslot and
lockdep produces a warning as neither kvm->slots_lock nor
kvm->srcu lock is held and kvm->users_count is above zero (>100 in fact).

This wraps the mentioned VCPU read/write helpers in srcu read lock/unlock,
as is done in other places. This uses vcpu->srcu_idx when possible.

These helpers are only used for nested KVM so this may explain why
we did not see these before.

Here is an example of a warning:

=
WARNING: suspicious RCU usage
5.7.0-rc3-le_dma-bypass.3.2_a+fstn1 #897 Not tainted
-
include/linux/kvm_host.h:633 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by qemu-system-ppc/2752:
 #0: c000200359016be0 (&vcpu->mutex){+.+.}-{3:3}, at: 
kvm_vcpu_ioctl+0x144/0xd80 [kvm]

stack backtrace:
CPU: 80 PID: 2752 Comm: qemu-system-ppc Not tainted 
5.7.0-rc3-le_dma-bypass.3.2_a+fstn1 #897
Call Trace:
[c0002003591ab240] [c0b23ab4] dump_stack+0x190/0x25c (unreliable)
[c0002003591ab2b0] [c023f954] lockdep_rcu_suspicious+0x140/0x164
[c0002003591ab330] [c00804a445f8] kvm_vcpu_gfn_to_memslot+0x4c0/0x510 [kvm]
[c0002003591ab3a0] [c00804a44c18] kvm_vcpu_read_guest+0xa0/0x180 [kvm]
[c0002003591ab410] [c00804ff9bd8] kvmhv_enter_nested_guest+0x90/0xb80 
[kvm_hv]
[c0002003591ab980] [c00804fe07bc] kvmppc_pseries_do_hcall+0x7b4/0x1c30 
[kvm_hv]
[c0002003591aba10] [c00804fe5d30] kvmppc_vcpu_run_hv+0x10a8/0x1a30 [kvm_hv]
[c0002003591abae0] [c00804a5d954] kvmppc_vcpu_run+0x4c/0x70 [kvm]
[c0002003591abb10] [c00804a56e54] kvm_arch_vcpu_ioctl_run+0x56c/0x7c0 [kvm]
[c0002003591abba0] [c00804a3ddc4] kvm_vcpu_ioctl+0x4ac/0xd80 [kvm]
[c0002003591abd20] [c06ebb58] ksys_ioctl+0x188/0x210
[c0002003591abd70] [c06ebc28] sys_ioctl+0x48/0xb0
[c0002003591abdb0] [c0042764] system_call_exception+0x1d4/0x2e0
[c0002003591abe20] [c000cce8] system_call_common+0xe8/0x214

Signed-off-by: Alexey Kardashevskiy 
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c |  4 
 arch/powerpc/kvm/book3s_hv_nested.c| 30 --
 arch/powerpc/kvm/book3s_rtas.c |  2 ++
 arch/powerpc/kvm/powerpc.c |  5 -
 4 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c 
b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index aa12cd4078b3..ef7fcc2e7c96 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -160,7 +160,9 @@ int kvmppc_mmu_walk_radix_tree(struct kvm_vcpu *vcpu, gva_t 
eaddr,
return -EINVAL;
/* Read the entry from guest memory */
addr = base + (index * sizeof(rpte));
+   vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
ret = kvm_read_guest(kvm, addr, &rpte, sizeof(rpte));
+   srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
if (ret) {
if (pte_ret_p)
*pte_ret_p = addr;
@@ -236,7 +238,9 @@ int kvmppc_mmu_radix_translate_table(struct kvm_vcpu *vcpu, 
gva_t eaddr,
 
/* Read the table to find the root of the radix tree */
ptbl = (table & PRTB_MASK) + (table_index * sizeof(entry));
+   vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
ret = kvm_read_guest(kvm, ptbl, &entry, sizeof(entry));
+   srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
if (ret)
return ret;
 
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c 
b/arch/powerpc/kvm/book3s_hv_nested.c
index dc97e5be76f6..1d3ab6fb00a7 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -233,20 +233,21 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
 
/* copy parameters in */
hv_ptr = kvmppc_get_gpr(vcpu, 4);
+   regs_ptr = kvmppc_get_gpr(vcpu, 5);
+   vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
err = kvm_vcpu_read_guest(vcpu, hv_ptr, &l2_hv,
- sizeof(struct hv_guest_state));
+ sizeof(struct hv_guest_state)) ||
+   kvm_vcpu_read_guest(vcpu, regs_ptr, &l2_regs,
+   sizeof(struct pt_regs));
+   srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
if (err)
return H_PARAMETER;
+
if (kvmppc_need_byteswap(vcpu))
byteswap_hv_regs(&l2_hv);
if (l2_hv.version != HV_GUEST_STATE_VERSION)
return H_P2;
 
-   regs_ptr = kvmppc_get_gpr(vcpu, 5);
-   err = kvm_vcpu_read_guest(vcpu, regs_ptr, &l2_regs,
- sizeof(struct pt_regs));
-   if (err)
-   return H_PARAMETER;
if (kvmppc_need_byteswap(vcpu))
byteswap_pt

Re: [PATCH v11 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods

2020-06-08 Thread Dan Williams
On Mon, Jun 8, 2020 at 5:16 PM kernel test robot  wrote:
>
> Hi Vaibhav,
>
> Thank you for the patch! Perhaps something to improve:
>
> [auto build test WARNING on powerpc/next]
> [also build test WARNING on linus/master v5.7 next-20200605]
> [cannot apply to linux-nvdimm/libnvdimm-for-next scottwood/next]
> [if your patch is applied to the wrong git tree, please drop us a note to help
> improve the system. BTW, we also suggest to use '--base' option to specify the
> base tree in git format-patch, please see 
> https://stackoverflow.com/a/37406982]
>
> url:
> https://github.com/0day-ci/linux/commits/Vaibhav-Jain/powerpc-papr_scm-Add-support-for-reporting-nvdimm-health/20200607-211653
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
> config: powerpc-randconfig-r016-20200607 (attached as .config)
> compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project 
> e429cffd4f228f70c1d9df0e5d77c08590dd9766)
> reproduce (this is a W=1 build):
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # install powerpc cross compiling tool for clang build
> # apt-get install binutils-powerpc-linux-gnu
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross 
> ARCH=powerpc
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
>
> All warnings (new ones prefixed by >>, old ones prefixed by <<):
>
> In file included from :1:
> >> ./usr/include/asm/papr_pdsm.h:69:20: warning: field 'hdr' with variable 
> >> sized type 'struct nd_cmd_pkg' not at the end of a struct or class is a 
> >> GNU extension [-Wgnu-variable-sized-type-not-at-end]
> struct nd_cmd_pkg hdr;  /* Package header containing sub-cmd */

Hi Vaibhav,

This looks like it's going to need another round to get this fixed. I
don't think 'struct nd_pdsm_cmd_pkg' should embed a definition of
'struct nd_cmd_pkg'. An instance of 'struct nd_cmd_pkg' carries a
payload that is the 'pdsm' specifics. As the code has it now it's
defined as a superset of 'struct nd_cmd_pkg' and the compiler warning
is pointing out a real 'struct' organization problem.

Given the soak time needed in -next after the code is finalized,
there's no time to do another round of updates and still make the v5.8
merge window.
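
To make the layout point concrete, here is a rough sketch of what I mean, with made-up names (pkg_sketch, pdsm_payload_sketch and the helpers are hypothetical stand-ins, not the real uapi definitions). The package header ends in a flexible array member, so it has to stay last; the 'pdsm' data lives in that payload rather than in a wrapper struct that embeds the header as its first field.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the package header with trailing payload. */
struct pkg_sketch {
	uint32_t command;
	uint32_t size_in;
	unsigned char payload[];	/* flexible array member: must be last */
};

/* Hypothetical PDSM-specific data carried inside the payload. */
struct pdsm_payload_sketch {
	uint32_t cmd_status;
	uint32_t health;
};

/*
 * Allocate one package large enough for the PDSM payload, instead of
 * declaring 'struct pkg_sketch hdr;' at the start of a larger struct,
 * which is what the compiler warning complains about.
 */
static struct pkg_sketch *alloc_pdsm_pkg(uint32_t command)
{
	struct pkg_sketch *pkg;

	pkg = calloc(1, sizeof(*pkg) + sizeof(struct pdsm_payload_sketch));
	if (!pkg)
		return NULL;

	pkg->command = command;
	pkg->size_in = sizeof(struct pdsm_payload_sketch);
	return pkg;
}

/* View the package payload as the PDSM-specific structure. */
static struct pdsm_payload_sketch *pdsm_payload(struct pkg_sketch *pkg)
{
	return (struct pdsm_payload_sketch *)pkg->payload;
}
```

The caller fills the fields through pdsm_payload(pkg) and releases the whole package with free(pkg).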


Re: [PATCH v11 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods

2020-06-08 Thread kernel test robot
Hi Vaibhav,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on powerpc/next]
[also build test WARNING on linus/master v5.7 next-20200605]
[cannot apply to linux-nvdimm/libnvdimm-for-next scottwood/next]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Vaibhav-Jain/powerpc-papr_scm-Add-support-for-reporting-nvdimm-health/20200607-211653
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-randconfig-r016-20200607 (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project e429cffd4f228f70c1d9df0e5d77c08590dd9766)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc cross compiling tool for clang build
# apt-get install binutils-powerpc-linux-gnu
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>, old ones prefixed by <<):

In file included from :1:
>> ./usr/include/asm/papr_pdsm.h:69:20: warning: field 'hdr' with variable sized type 'struct nd_cmd_pkg' not at the end of a struct or class is a GNU extension [-Wgnu-variable-sized-type-not-at-end]
struct nd_cmd_pkg hdr;  /* Package header containing sub-cmd */
^
1 warning generated.

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




[PATCH AUTOSEL 4.4 21/37] powerpc/spufs: fix copy_to_user while atomic

2020-06-08 Thread Sasha Levin
From: Jeremy Kerr 

[ Upstream commit 88413a6bfbbe2f648df399b62f85c934460b7a4d ]

Currently, we may perform a copy_to_user (through
simple_read_from_buffer()) while holding a context's register_lock,
while accessing the context save area.

This change uses a temporary buffer for the context save area data,
which we then pass to simple_read_from_buffer.
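
The pattern can be sketched in userspace as follows. This is an analogy
under stated assumptions, not the spufs code: 'struct ctx', 'ctx_lock()',
'slow_copy_out()' and 'mbox_read()' are hypothetical names, a C11
atomic_flag stands in for the register_lock spinlock, and a plain
memcpy() stands in for the copy to userspace that may fault and sleep.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MBOX_SLOTS 4

/* Hypothetical shared state, standing in for the context save area. */
struct ctx {
	atomic_flag lock;		/* stand-in for register_lock */
	uint32_t stat;			/* low byte: number of valid entries */
	uint32_t mbox[MBOX_SLOTS];
};

static void ctx_lock(struct ctx *c)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;
}

static void ctx_unlock(struct ctx *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Stand-in for a copy that may block; must not run under the spinlock. */
static size_t slow_copy_out(void *dst, size_t dst_len,
			    const void *src, size_t src_len)
{
	size_t n = dst_len < src_len ? dst_len : src_len;

	memcpy(dst, src, n);
	return n;
}

static size_t mbox_read(struct ctx *c, void *dst, size_t dst_len)
{
	uint32_t snap[MBOX_SLOTS];
	uint32_t count;

	/* Under the lock: only cheap, non-blocking copies into locals. */
	ctx_lock(c);
	count = c->stat & 0xff;
	if (count > MBOX_SLOTS)
		count = MBOX_SLOTS;
	memcpy(snap, c->mbox, sizeof(snap));
	ctx_unlock(c);

	/* After unlocking: the potentially blocking copy uses the snapshot. */
	return slow_copy_out(dst, dst_len, snap, count * sizeof(uint32_t));
}
```

The snapshot costs a few extra bytes of stack, but the critical section
no longer contains anything that can sleep.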

Includes changes from Christoph Hellwig.

Fixes: bf1ab978be23 ("[POWERPC] coredump: Add SPU elf notes to coredump.")
Signed-off-by: Jeremy Kerr 
Reviewed-by: Arnd Bergmann 
[hch: renamed to function to avoid ___-prefixes]
Signed-off-by: Christoph Hellwig 
Signed-off-by: Al Viro 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/cell/spufs/file.c | 113 +++
 1 file changed, 75 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c
index 5038fd578e65..e708c163fd6d 100644
--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -2044,8 +2044,9 @@ static ssize_t __spufs_mbox_info_read(struct spu_context *ctx,
 static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
-   int ret;
struct spu_context *ctx = file->private_data;
+   u32 stat, data;
+   int ret;
 
if (!access_ok(VERIFY_WRITE, buf, len))
return -EFAULT;
@@ -2054,11 +2055,16 @@ static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_mbox_info_read(ctx, buf, len, pos);
+   stat = ctx->csa.prob.mb_stat_R;
+   data = ctx->csa.prob.pu_mb_R;
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   /* EOF if there's no entry in the mbox */
+   if (!(stat & 0xff))
+   return 0;
+
+   return simple_read_from_buffer(buf, len, pos, &data, sizeof(data));
 }
 
 static const struct file_operations spufs_mbox_info_fops = {
@@ -2085,6 +2091,7 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
struct spu_context *ctx = file->private_data;
+   u32 stat, data;
int ret;
 
if (!access_ok(VERIFY_WRITE, buf, len))
@@ -2094,11 +2101,16 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_ibox_info_read(ctx, buf, len, pos);
+   stat = ctx->csa.prob.mb_stat_R;
+   data = ctx->csa.priv2.puint_mb_R;
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   /* EOF if there's no entry in the ibox */
+   if (!(stat & 0xff))
+   return 0;
+
+   return simple_read_from_buffer(buf, len, pos, &data, sizeof(data));
 }
 
 static const struct file_operations spufs_ibox_info_fops = {
@@ -2107,6 +2119,11 @@ static const struct file_operations spufs_ibox_info_fops = {
.llseek  = generic_file_llseek,
 };
 
+static size_t spufs_wbox_info_cnt(struct spu_context *ctx)
+{
+   return (4 - ((ctx->csa.prob.mb_stat_R & 0x00ff00) >> 8)) * sizeof(u32);
+}
+
 static ssize_t __spufs_wbox_info_read(struct spu_context *ctx,
char __user *buf, size_t len, loff_t *pos)
 {
@@ -2115,7 +2132,7 @@ static ssize_t __spufs_wbox_info_read(struct spu_context *ctx,
u32 wbox_stat;
 
wbox_stat = ctx->csa.prob.mb_stat_R;
-   cnt = 4 - ((wbox_stat & 0x00ff00) >> 8);
+   cnt = spufs_wbox_info_cnt(ctx);
for (i = 0; i < cnt; i++) {
data[i] = ctx->csa.spu_mailbox_data[i];
}
@@ -2128,7 +2145,8 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
struct spu_context *ctx = file->private_data;
-   int ret;
+   u32 data[ARRAY_SIZE(ctx->csa.spu_mailbox_data)];
+   int ret, count;
 
if (!access_ok(VERIFY_WRITE, buf, len))
return -EFAULT;
@@ -2137,11 +2155,13 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_wbox_info_read(ctx, buf, len, pos);
+   count = spufs_wbox_info_cnt(ctx);
+   memcpy(&data, &ctx->csa.spu_mailbox_data, sizeof(data));
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   return simple_read_from_buffer(buf, len, pos, &data,
+   count * sizeof(u32));
 }
 
 static const struct file_operations spufs_wbox_info_fops = {
@@ -2150,27 +2170,33 @@ static const struct file_operations spufs_wbox_info_fops = {
.llseek  = gener

[PATCH AUTOSEL 4.9 29/50] powerpc/spufs: fix copy_to_user while atomic

2020-06-08 Thread Sasha Levin
From: Jeremy Kerr 

[ Upstream commit 88413a6bfbbe2f648df399b62f85c934460b7a4d ]

Currently, we may perform a copy_to_user (through
simple_read_from_buffer()) while holding a context's register_lock,
while accessing the context save area.

This change uses a temporary buffer for the context save area data,
which we then pass to simple_read_from_buffer.

Includes changes from Christoph Hellwig.

Fixes: bf1ab978be23 ("[POWERPC] coredump: Add SPU elf notes to coredump.")
Signed-off-by: Jeremy Kerr 
Reviewed-by: Arnd Bergmann 
[hch: renamed to function to avoid ___-prefixes]
Signed-off-by: Christoph Hellwig 
Signed-off-by: Al Viro 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/cell/spufs/file.c | 113 +++
 1 file changed, 75 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c
index 06254467e4dd..f12b00a056cb 100644
--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -2044,8 +2044,9 @@ static ssize_t __spufs_mbox_info_read(struct spu_context *ctx,
 static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
-   int ret;
struct spu_context *ctx = file->private_data;
+   u32 stat, data;
+   int ret;
 
if (!access_ok(VERIFY_WRITE, buf, len))
return -EFAULT;
@@ -2054,11 +2055,16 @@ static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_mbox_info_read(ctx, buf, len, pos);
+   stat = ctx->csa.prob.mb_stat_R;
+   data = ctx->csa.prob.pu_mb_R;
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   /* EOF if there's no entry in the mbox */
+   if (!(stat & 0xff))
+   return 0;
+
+   return simple_read_from_buffer(buf, len, pos, &data, sizeof(data));
 }
 
 static const struct file_operations spufs_mbox_info_fops = {
@@ -2085,6 +2091,7 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
struct spu_context *ctx = file->private_data;
+   u32 stat, data;
int ret;
 
if (!access_ok(VERIFY_WRITE, buf, len))
@@ -2094,11 +2101,16 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_ibox_info_read(ctx, buf, len, pos);
+   stat = ctx->csa.prob.mb_stat_R;
+   data = ctx->csa.priv2.puint_mb_R;
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   /* EOF if there's no entry in the ibox */
+   if (!(stat & 0xff))
+   return 0;
+
+   return simple_read_from_buffer(buf, len, pos, &data, sizeof(data));
 }
 
 static const struct file_operations spufs_ibox_info_fops = {
@@ -2107,6 +2119,11 @@ static const struct file_operations spufs_ibox_info_fops = {
.llseek  = generic_file_llseek,
 };
 
+static size_t spufs_wbox_info_cnt(struct spu_context *ctx)
+{
+   return (4 - ((ctx->csa.prob.mb_stat_R & 0x00ff00) >> 8)) * sizeof(u32);
+}
+
 static ssize_t __spufs_wbox_info_read(struct spu_context *ctx,
char __user *buf, size_t len, loff_t *pos)
 {
@@ -2115,7 +2132,7 @@ static ssize_t __spufs_wbox_info_read(struct spu_context *ctx,
u32 wbox_stat;
 
wbox_stat = ctx->csa.prob.mb_stat_R;
-   cnt = 4 - ((wbox_stat & 0x00ff00) >> 8);
+   cnt = spufs_wbox_info_cnt(ctx);
for (i = 0; i < cnt; i++) {
data[i] = ctx->csa.spu_mailbox_data[i];
}
@@ -2128,7 +2145,8 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
struct spu_context *ctx = file->private_data;
-   int ret;
+   u32 data[ARRAY_SIZE(ctx->csa.spu_mailbox_data)];
+   int ret, count;
 
if (!access_ok(VERIFY_WRITE, buf, len))
return -EFAULT;
@@ -2137,11 +2155,13 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_wbox_info_read(ctx, buf, len, pos);
+   count = spufs_wbox_info_cnt(ctx);
+   memcpy(&data, &ctx->csa.spu_mailbox_data, sizeof(data));
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   return simple_read_from_buffer(buf, len, pos, &data,
+   count * sizeof(u32));
 }
 
 static const struct file_operations spufs_wbox_info_fops = {
@@ -2150,27 +2170,33 @@ static const struct file_operations spufs_wbox_info_fops = {
.llseek  = gener

[PATCH AUTOSEL 4.14 42/72] powerpc/spufs: fix copy_to_user while atomic

2020-06-08 Thread Sasha Levin
From: Jeremy Kerr 

[ Upstream commit 88413a6bfbbe2f648df399b62f85c934460b7a4d ]

Currently, we may perform a copy_to_user (through
simple_read_from_buffer()) while holding a context's register_lock,
while accessing the context save area.

This change uses a temporary buffer for the context save area data,
which we then pass to simple_read_from_buffer.

Includes changes from Christoph Hellwig.

Fixes: bf1ab978be23 ("[POWERPC] coredump: Add SPU elf notes to coredump.")
Signed-off-by: Jeremy Kerr 
Reviewed-by: Arnd Bergmann 
[hch: renamed to function to avoid ___-prefixes]
Signed-off-by: Christoph Hellwig 
Signed-off-by: Al Viro 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/cell/spufs/file.c | 113 +++
 1 file changed, 75 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c
index 5ffcdeb1eb17..9d9fffaedeef 100644
--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -1988,8 +1988,9 @@ static ssize_t __spufs_mbox_info_read(struct spu_context *ctx,
 static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
-   int ret;
struct spu_context *ctx = file->private_data;
+   u32 stat, data;
+   int ret;
 
if (!access_ok(VERIFY_WRITE, buf, len))
return -EFAULT;
@@ -1998,11 +1999,16 @@ static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_mbox_info_read(ctx, buf, len, pos);
+   stat = ctx->csa.prob.mb_stat_R;
+   data = ctx->csa.prob.pu_mb_R;
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   /* EOF if there's no entry in the mbox */
+   if (!(stat & 0xff))
+   return 0;
+
+   return simple_read_from_buffer(buf, len, pos, &data, sizeof(data));
 }
 
 static const struct file_operations spufs_mbox_info_fops = {
@@ -2029,6 +2035,7 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
struct spu_context *ctx = file->private_data;
+   u32 stat, data;
int ret;
 
if (!access_ok(VERIFY_WRITE, buf, len))
@@ -2038,11 +2045,16 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_ibox_info_read(ctx, buf, len, pos);
+   stat = ctx->csa.prob.mb_stat_R;
+   data = ctx->csa.priv2.puint_mb_R;
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   /* EOF if there's no entry in the ibox */
+   if (!(stat & 0xff))
+   return 0;
+
+   return simple_read_from_buffer(buf, len, pos, &data, sizeof(data));
 }
 
 static const struct file_operations spufs_ibox_info_fops = {
@@ -2051,6 +2063,11 @@ static const struct file_operations spufs_ibox_info_fops = {
.llseek  = generic_file_llseek,
 };
 
+static size_t spufs_wbox_info_cnt(struct spu_context *ctx)
+{
+   return (4 - ((ctx->csa.prob.mb_stat_R & 0x00ff00) >> 8)) * sizeof(u32);
+}
+
 static ssize_t __spufs_wbox_info_read(struct spu_context *ctx,
char __user *buf, size_t len, loff_t *pos)
 {
@@ -2059,7 +2076,7 @@ static ssize_t __spufs_wbox_info_read(struct spu_context *ctx,
u32 wbox_stat;
 
wbox_stat = ctx->csa.prob.mb_stat_R;
-   cnt = 4 - ((wbox_stat & 0x00ff00) >> 8);
+   cnt = spufs_wbox_info_cnt(ctx);
for (i = 0; i < cnt; i++) {
data[i] = ctx->csa.spu_mailbox_data[i];
}
@@ -2072,7 +2089,8 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
struct spu_context *ctx = file->private_data;
-   int ret;
+   u32 data[ARRAY_SIZE(ctx->csa.spu_mailbox_data)];
+   int ret, count;
 
if (!access_ok(VERIFY_WRITE, buf, len))
return -EFAULT;
@@ -2081,11 +2099,13 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_wbox_info_read(ctx, buf, len, pos);
+   count = spufs_wbox_info_cnt(ctx);
+   memcpy(&data, &ctx->csa.spu_mailbox_data, sizeof(data));
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   return simple_read_from_buffer(buf, len, pos, &data,
+   count * sizeof(u32));
 }
 
 static const struct file_operations spufs_wbox_info_fops = {
@@ -2094,27 +2114,33 @@ static const struct file_operations spufs_wbox_info_fops = {
.llseek  = gener

[PATCH AUTOSEL 4.19 054/106] powerpc/spufs: fix copy_to_user while atomic

2020-06-08 Thread Sasha Levin
From: Jeremy Kerr 

[ Upstream commit 88413a6bfbbe2f648df399b62f85c934460b7a4d ]

Currently, we may perform a copy_to_user (through
simple_read_from_buffer()) while holding a context's register_lock,
while accessing the context save area.

This change uses a temporary buffer for the context save area data,
which we then pass to simple_read_from_buffer.

Includes changes from Christoph Hellwig.

Fixes: bf1ab978be23 ("[POWERPC] coredump: Add SPU elf notes to coredump.")
Signed-off-by: Jeremy Kerr 
Reviewed-by: Arnd Bergmann 
[hch: renamed to function to avoid ___-prefixes]
Signed-off-by: Christoph Hellwig 
Signed-off-by: Al Viro 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/cell/spufs/file.c | 113 +++
 1 file changed, 75 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c
index 43e7b93f27c7..d16adcd93921 100644
--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -1991,8 +1991,9 @@ static ssize_t __spufs_mbox_info_read(struct spu_context *ctx,
 static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
-   int ret;
struct spu_context *ctx = file->private_data;
+   u32 stat, data;
+   int ret;
 
if (!access_ok(VERIFY_WRITE, buf, len))
return -EFAULT;
@@ -2001,11 +2002,16 @@ static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_mbox_info_read(ctx, buf, len, pos);
+   stat = ctx->csa.prob.mb_stat_R;
+   data = ctx->csa.prob.pu_mb_R;
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   /* EOF if there's no entry in the mbox */
+   if (!(stat & 0xff))
+   return 0;
+
+   return simple_read_from_buffer(buf, len, pos, &data, sizeof(data));
 }
 
 static const struct file_operations spufs_mbox_info_fops = {
@@ -2032,6 +2038,7 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
struct spu_context *ctx = file->private_data;
+   u32 stat, data;
int ret;
 
if (!access_ok(VERIFY_WRITE, buf, len))
@@ -2041,11 +2048,16 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_ibox_info_read(ctx, buf, len, pos);
+   stat = ctx->csa.prob.mb_stat_R;
+   data = ctx->csa.priv2.puint_mb_R;
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   /* EOF if there's no entry in the ibox */
+   if (!(stat & 0xff))
+   return 0;
+
+   return simple_read_from_buffer(buf, len, pos, &data, sizeof(data));
 }
 
 static const struct file_operations spufs_ibox_info_fops = {
@@ -2054,6 +2066,11 @@ static const struct file_operations spufs_ibox_info_fops = {
.llseek  = generic_file_llseek,
 };
 
+static size_t spufs_wbox_info_cnt(struct spu_context *ctx)
+{
+   return (4 - ((ctx->csa.prob.mb_stat_R & 0x00ff00) >> 8)) * sizeof(u32);
+}
+
 static ssize_t __spufs_wbox_info_read(struct spu_context *ctx,
char __user *buf, size_t len, loff_t *pos)
 {
@@ -2062,7 +2079,7 @@ static ssize_t __spufs_wbox_info_read(struct spu_context *ctx,
u32 wbox_stat;
 
wbox_stat = ctx->csa.prob.mb_stat_R;
-   cnt = 4 - ((wbox_stat & 0x00ff00) >> 8);
+   cnt = spufs_wbox_info_cnt(ctx);
for (i = 0; i < cnt; i++) {
data[i] = ctx->csa.spu_mailbox_data[i];
}
@@ -2075,7 +2092,8 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
struct spu_context *ctx = file->private_data;
-   int ret;
+   u32 data[ARRAY_SIZE(ctx->csa.spu_mailbox_data)];
+   int ret, count;
 
if (!access_ok(VERIFY_WRITE, buf, len))
return -EFAULT;
@@ -2084,11 +2102,13 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_wbox_info_read(ctx, buf, len, pos);
+   count = spufs_wbox_info_cnt(ctx);
+   memcpy(&data, &ctx->csa.spu_mailbox_data, sizeof(data));
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   return simple_read_from_buffer(buf, len, pos, &data,
+   count * sizeof(u32));
 }
 
 static const struct file_operations spufs_wbox_info_fops = {
@@ -2097,27 +2117,33 @@ static const struct file_operations spufs_wbox_info_fops = {
.llseek  = gener

[PATCH AUTOSEL 4.19 049/106] sched/core: Fix illegal RCU from offline CPUs

2020-06-08 Thread Sasha Levin
From: Peter Zijlstra 

[ Upstream commit bf2c59fce4074e55d622089b34be3a6bc95484fb ]

In the CPU-offline process, it calls mmdrop() after idle entry and the
subsequent call to cpuhp_report_idle_dead(). Once execution passes the
call to rcu_report_dead(), RCU is ignoring the CPU, which results in
lockdep complaining when mmdrop() uses RCU from either memcg or
debugobjects below.

Fix it by cleaning up the active_mm state from BP instead. Every arch
which has CONFIG_HOTPLUG_CPU should have already called idle_task_exit()
from AP. The only exception is parisc because it switches them to
&init_mm unconditionally (see smp_boot_one_cpu() and smp_cpu_init()),
but the patch will still work there because it calls mmgrab(&init_mm) in
smp_cpu_init() and then should call mmdrop(&init_mm) in finish_cpu().

  WARNING: suspicious RCU usage
  -
  kernel/workqueue.c:710 RCU or wq_pool_mutex should be held!

  other info that might help us debug this:

  RCU used illegally from offline CPU!
  Call Trace:
   dump_stack+0xf4/0x164 (unreliable)
   lockdep_rcu_suspicious+0x140/0x164
   get_work_pool+0x110/0x150
   __queue_work+0x1bc/0xca0
   queue_work_on+0x114/0x120
   css_release+0x9c/0xc0
   percpu_ref_put_many+0x204/0x230
   free_pcp_prepare+0x264/0x570
   free_unref_page+0x38/0xf0
   __mmdrop+0x21c/0x2c0
   idle_task_exit+0x170/0x1b0
   pnv_smp_cpu_kill_self+0x38/0x2e0
   cpu_die+0x48/0x64
   arch_cpu_idle_dead+0x30/0x50
   do_idle+0x2f4/0x470
   cpu_startup_entry+0x38/0x40
   start_secondary+0x7a8/0xa80
   start_secondary_resume+0x10/0x14

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Qian Cai 
Signed-off-by: Peter Zijlstra (Intel) 
Acked-by: Michael Ellerman  (powerpc)
Link: https://lkml.kernel.org/r/20200401214033.8448-1-...@lca.pw
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/powernv/smp.c |  1 -
 include/linux/sched/mm.h |  2 ++
 kernel/cpu.c | 18 +-
 kernel/sched/core.c  |  5 +++--
 4 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 3d3c989e44dd..8d49ba370c50 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -171,7 +171,6 @@ static void pnv_smp_cpu_kill_self(void)
/* Standard hot unplug procedure */
 
idle_task_exit();
-   current->active_mm = NULL; /* for sanity */
cpu = smp_processor_id();
DBG("CPU%d offline\n", cpu);
generic_set_cpu_dead(cpu);
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index e9d4e389aed9..766bbe813861 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -49,6 +49,8 @@ static inline void mmdrop(struct mm_struct *mm)
__mmdrop(mm);
 }
 
+void mmdrop(struct mm_struct *mm);
+
 /*
  * This has to be called after a get_task_mm()/mmget_not_zero()
  * followed by taking the mmap_sem for writing before modifying the
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6d6c106a495c..08b9d6ba0807 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -3,6 +3,7 @@
  *
  * This code is licenced under the GPL.
  */
+#include 
 #include 
 #include 
 #include 
@@ -532,6 +533,21 @@ static int bringup_cpu(unsigned int cpu)
return bringup_wait_for_ap(cpu);
 }
 
+static int finish_cpu(unsigned int cpu)
+{
+   struct task_struct *idle = idle_thread_get(cpu);
+   struct mm_struct *mm = idle->active_mm;
+
+   /*
+* idle_task_exit() will have switched to &init_mm, now
+* clean up any remaining active_mm state.
+*/
+   if (mm != &init_mm)
+   idle->active_mm = &init_mm;
+   mmdrop(mm);
+   return 0;
+}
+
 /*
  * Hotplug state machine related functions
  */
@@ -1379,7 +1395,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
[CPUHP_BRINGUP_CPU] = {
.name   = "cpu:bringup",
.startup.single = bringup_cpu,
-   .teardown.single= NULL,
+   .teardown.single= finish_cpu,
.cant_stop  = true,
},
/* Final state before CPU kills itself */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2befd2c4ce9e..0325ccf3a8e4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5571,13 +5571,14 @@ void idle_task_exit(void)
struct mm_struct *mm = current->active_mm;
 
BUG_ON(cpu_online(smp_processor_id()));
+   BUG_ON(current != this_rq()->idle);
 
if (mm != &init_mm) {
switch_mm(mm, &init_mm, current);
-   current->active_mm = &init_mm;
finish_arch_post_lock_switch();
}
-   mmdrop(mm);
+
+   /* finish_cpu(), as ran on the BP, will clean up the active_mm state */
 }
 
 /*
-- 
2.25.1



[PATCH AUTOSEL 5.4 097/175] powerpc/spufs: fix copy_to_user while atomic

2020-06-08 Thread Sasha Levin
From: Jeremy Kerr 

[ Upstream commit 88413a6bfbbe2f648df399b62f85c934460b7a4d ]

Currently, we may perform a copy_to_user (through
simple_read_from_buffer()) while holding a context's register_lock,
while accessing the context save area.

This change uses a temporary buffer for the context save area data,
which we then pass to simple_read_from_buffer.

Includes changes from Christoph Hellwig.

Fixes: bf1ab978be23 ("[POWERPC] coredump: Add SPU elf notes to coredump.")
Signed-off-by: Jeremy Kerr 
Reviewed-by: Arnd Bergmann 
[hch: renamed to function to avoid ___-prefixes]
Signed-off-by: Christoph Hellwig 
Signed-off-by: Al Viro 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/cell/spufs/file.c | 113 +++
 1 file changed, 75 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c
index c0f950a3f4e1..f4a4dfb191e7 100644
--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -1978,8 +1978,9 @@ static ssize_t __spufs_mbox_info_read(struct spu_context *ctx,
 static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
-   int ret;
struct spu_context *ctx = file->private_data;
+   u32 stat, data;
+   int ret;
 
if (!access_ok(buf, len))
return -EFAULT;
@@ -1988,11 +1989,16 @@ static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_mbox_info_read(ctx, buf, len, pos);
+   stat = ctx->csa.prob.mb_stat_R;
+   data = ctx->csa.prob.pu_mb_R;
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   /* EOF if there's no entry in the mbox */
+   if (!(stat & 0xff))
+   return 0;
+
+   return simple_read_from_buffer(buf, len, pos, &data, sizeof(data));
 }
 
 static const struct file_operations spufs_mbox_info_fops = {
@@ -2019,6 +2025,7 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
struct spu_context *ctx = file->private_data;
+   u32 stat, data;
int ret;
 
if (!access_ok(buf, len))
@@ -2028,11 +2035,16 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_ibox_info_read(ctx, buf, len, pos);
+   stat = ctx->csa.prob.mb_stat_R;
+   data = ctx->csa.priv2.puint_mb_R;
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   /* EOF if there's no entry in the ibox */
+   if (!(stat & 0xff))
+   return 0;
+
+   return simple_read_from_buffer(buf, len, pos, &data, sizeof(data));
 }
 
 static const struct file_operations spufs_ibox_info_fops = {
@@ -2041,6 +2053,11 @@ static const struct file_operations spufs_ibox_info_fops = {
.llseek  = generic_file_llseek,
 };
 
+static size_t spufs_wbox_info_cnt(struct spu_context *ctx)
+{
+   return (4 - ((ctx->csa.prob.mb_stat_R & 0x00ff00) >> 8)) * sizeof(u32);
+}
+
 static ssize_t __spufs_wbox_info_read(struct spu_context *ctx,
char __user *buf, size_t len, loff_t *pos)
 {
@@ -2049,7 +2066,7 @@ static ssize_t __spufs_wbox_info_read(struct spu_context *ctx,
u32 wbox_stat;
 
wbox_stat = ctx->csa.prob.mb_stat_R;
-   cnt = 4 - ((wbox_stat & 0x00ff00) >> 8);
+   cnt = spufs_wbox_info_cnt(ctx);
for (i = 0; i < cnt; i++) {
data[i] = ctx->csa.spu_mailbox_data[i];
}
@@ -2062,7 +2079,8 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
struct spu_context *ctx = file->private_data;
-   int ret;
+   u32 data[ARRAY_SIZE(ctx->csa.spu_mailbox_data)];
+   int ret, count;
 
if (!access_ok(buf, len))
return -EFAULT;
@@ -2071,11 +2089,13 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_wbox_info_read(ctx, buf, len, pos);
+   count = spufs_wbox_info_cnt(ctx);
+   memcpy(&data, &ctx->csa.spu_mailbox_data, sizeof(data));
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   return simple_read_from_buffer(buf, len, pos, &data,
+   count * sizeof(u32));
 }
 
 static const struct file_operations spufs_wbox_info_fops = {
@@ -2084,27 +2104,33 @@ static const struct file_operations spufs_wbox_info_fops = {
.llseek  = generic_file_llseek,
 };
 
-static ssize_t __sp

[PATCH AUTOSEL 5.4 089/175] sched/core: Fix illegal RCU from offline CPUs

2020-06-08 Thread Sasha Levin
From: Peter Zijlstra 

[ Upstream commit bf2c59fce4074e55d622089b34be3a6bc95484fb ]

In the CPU-offline process, it calls mmdrop() after idle entry and the
subsequent call to cpuhp_report_idle_dead(). Once execution passes the
call to rcu_report_dead(), RCU is ignoring the CPU, which results in
lockdep complaining when mmdrop() uses RCU from either memcg or
debugobjects below.

Fix it by cleaning up the active_mm state from BP instead. Every arch
which has CONFIG_HOTPLUG_CPU should have already called idle_task_exit()
from AP. The only exception is parisc because it switches them to
&init_mm unconditionally (see smp_boot_one_cpu() and smp_cpu_init()),
but the patch will still work there because it calls mmgrab(&init_mm) in
smp_cpu_init() and then should call mmdrop(&init_mm) in finish_cpu().

  WARNING: suspicious RCU usage
  -
  kernel/workqueue.c:710 RCU or wq_pool_mutex should be held!

  other info that might help us debug this:

  RCU used illegally from offline CPU!
  Call Trace:
   dump_stack+0xf4/0x164 (unreliable)
   lockdep_rcu_suspicious+0x140/0x164
   get_work_pool+0x110/0x150
   __queue_work+0x1bc/0xca0
   queue_work_on+0x114/0x120
   css_release+0x9c/0xc0
   percpu_ref_put_many+0x204/0x230
   free_pcp_prepare+0x264/0x570
   free_unref_page+0x38/0xf0
   __mmdrop+0x21c/0x2c0
   idle_task_exit+0x170/0x1b0
   pnv_smp_cpu_kill_self+0x38/0x2e0
   cpu_die+0x48/0x64
   arch_cpu_idle_dead+0x30/0x50
   do_idle+0x2f4/0x470
   cpu_startup_entry+0x38/0x40
   start_secondary+0x7a8/0xa80
   start_secondary_resume+0x10/0x14

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Qian Cai 
Signed-off-by: Peter Zijlstra (Intel) 
Acked-by: Michael Ellerman  (powerpc)
Link: https://lkml.kernel.org/r/20200401214033.8448-1-...@lca.pw
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/powernv/smp.c |  1 -
 include/linux/sched/mm.h |  2 ++
 kernel/cpu.c | 18 +-
 kernel/sched/core.c  |  5 +++--
 4 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 13e251699346..b2ba3e95bda7 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -167,7 +167,6 @@ static void pnv_smp_cpu_kill_self(void)
/* Standard hot unplug procedure */
 
idle_task_exit();
-   current->active_mm = NULL; /* for sanity */
cpu = smp_processor_id();
DBG("CPU%d offline\n", cpu);
generic_set_cpu_dead(cpu);
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index c49257a3b510..a132d875d351 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -49,6 +49,8 @@ static inline void mmdrop(struct mm_struct *mm)
__mmdrop(mm);
 }
 
+void mmdrop(struct mm_struct *mm);
+
 /*
  * This has to be called after a get_task_mm()/mmget_not_zero()
  * followed by taking the mmap_sem for writing before modifying the
diff --git a/kernel/cpu.c b/kernel/cpu.c
index d7890c1285bf..7527825ac7da 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -3,6 +3,7 @@
  *
  * This code is licenced under the GPL.
  */
+#include 
 #include 
 #include 
 #include 
@@ -564,6 +565,21 @@ static int bringup_cpu(unsigned int cpu)
return bringup_wait_for_ap(cpu);
 }
 
+static int finish_cpu(unsigned int cpu)
+{
+   struct task_struct *idle = idle_thread_get(cpu);
+   struct mm_struct *mm = idle->active_mm;
+
+   /*
+* idle_task_exit() will have switched to &init_mm, now
+* clean up any remaining active_mm state.
+*/
+   if (mm != &init_mm)
+   idle->active_mm = &init_mm;
+   mmdrop(mm);
+   return 0;
+}
+
 /*
  * Hotplug state machine related functions
  */
@@ -1434,7 +1450,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
[CPUHP_BRINGUP_CPU] = {
.name   = "cpu:bringup",
.startup.single = bringup_cpu,
-   .teardown.single= NULL,
+   .teardown.single= finish_cpu,
.cant_stop  = true,
},
/* Final state before CPU kills itself */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e99d326fa569..4874e1468279 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6177,13 +6177,14 @@ void idle_task_exit(void)
struct mm_struct *mm = current->active_mm;
 
BUG_ON(cpu_online(smp_processor_id()));
+   BUG_ON(current != this_rq()->idle);
 
if (mm != &init_mm) {
switch_mm(mm, &init_mm, current);
-   current->active_mm = &init_mm;
finish_arch_post_lock_switch();
}
-   mmdrop(mm);
+
+   /* finish_cpu(), as ran on the BP, will clean up the active_mm state */
 }
 
 /*
-- 
2.25.1



[PATCH AUTOSEL 5.6 150/606] powerpc/64s: Disable STRICT_KERNEL_RWX

2020-06-08 Thread Sasha Levin
From: Michael Ellerman 

commit 8659a0e0efdd975c73355dbc033f79ba3b31e82c upstream.

Several strange crashes have been eventually traced back to
STRICT_KERNEL_RWX and its interaction with code patching.

Various paths in our ftrace, kprobes and other patching code need to
be hardened against patching failures, otherwise we can end up running
with partially/incorrectly patched ftrace paths, kprobes or jump
labels, which can then cause strange crashes.

Although fixes for those are in development, they're not -rc material.

There also seem to be problems with the underlying strict RWX logic,
which needs further debugging.

So for now disable STRICT_KERNEL_RWX on 64-bit to prevent people from
enabling the option and tripping over the bugs.

Fixes: 1e0fc9d1eb2b ("powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs")
Cc: sta...@vger.kernel.org # v4.13+
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200520133605.972649-1-...@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 497b7d0b2d7e..b0fb42b0bf4b 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -129,7 +129,7 @@ config PPC
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_MEMBARRIER_CALLBACKS
select ARCH_HAS_SCALED_CPUTIME  if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
-   select ARCH_HAS_STRICT_KERNEL_RWX   if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION)
+   select ARCH_HAS_STRICT_KERNEL_RWX   if (PPC32 && !HIBERNATION)
select ARCH_HAS_TICK_BROADCAST  if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UACCESS_FLUSHCACHE
select ARCH_HAS_UACCESS_MCSAFE  if PPC64
-- 
2.25.1



[PATCH AUTOSEL 5.6 117/606] ibmvnic: Skip fatal error reset after passive init

2020-06-08 Thread Sasha Levin
From: Juliet Kim 

[ Upstream commit f9c6cea0b38518741c8dcf26ac056d26ee2fd61d ]

During an MTU change, the following sequence of events may happen.
Client-driven CRQ initialization fails because the partner's CRQ is
closed, causing the client to enqueue a reset task for FATAL_ERROR. Then
passive (server-driven) CRQ initialization succeeds, causing the client
to release the CRQ and enqueue a reset task for failover. If the passive
CRQ initialization occurs before the FATAL reset task is processed,
the FATAL error reset task tries to access a CRQ message queue
that was freed, causing an oops. The problem is most likely to
occur during DLPAR add vNIC with a non-default MTU, because the DLPAR
process automatically issues a change MTU request.

Fix this by not processing fatal error reset if CRQ is passively
initialized after client-driven CRQ initialization fails.
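
The decision described above can be sketched as a small predicate. This
is a standalone model, not the driver code: the enum, the struct and the
function name are hypothetical stand-ins for rwi->reset_reason,
VNIC_RESET_FATAL and adapter->from_passive_init.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's reset reasons. */
enum model_reset_reason {
	MODEL_RESET_FAILOVER,
	MODEL_RESET_FATAL,
	MODEL_RESET_COUNT
};

/* Hypothetical slice of the adapter state that matters for the fix. */
struct model_adapter {
	bool from_passive_init;	/* CRQ was initialised by the server side */
};

/*
 * Returns true when a queued reset should still be processed.
 * A FATAL reset that was queued before a successful passive init is
 * stale: the CRQ it refers to has already been released.
 */
bool model_should_reset(const struct model_adapter *a,
			enum model_reset_reason reason)
{
	assert(a != NULL);
	assert(reason < MODEL_RESET_COUNT);
	return !(reason == MODEL_RESET_FATAL && a->from_passive_init);
}
```

Only the combination "fatal reset plus passive init" is skipped in this
sketch; every other queued reset is still processed.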

Signed-off-by: Juliet Kim 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 4bd33245bad6..3de549c6c693 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2189,7 +2189,8 @@ static void __ibmvnic_reset(struct work_struct *work)
rc = do_hard_reset(adapter, rwi, reset_state);
rtnl_unlock();
}
-   } else {
+   } else if (!(rwi->reset_reason == VNIC_RESET_FATAL &&
+   adapter->from_passive_init)) {
rc = do_reset(adapter, rwi, reset_state);
}
kfree(rwi);
-- 
2.25.1



[PATCH AUTOSEL 5.6 115/606] scsi: ibmvscsi: Fix WARN_ON during event pool release

2020-06-08 Thread Sasha Levin
From: Tyrel Datwyler 

[ Upstream commit b36522150e5b85045f868768d46fbaaa034174b2 ]

While removing an ibmvscsi client adapter a WARN_ON like the following is
seen in the kernel log:

drmgr: drmgr: -r -c slot -s U9080.M9S.783AEC8-V11-C11 -w 5 -d 1
WARNING: CPU: 9 PID: 24062 at ../kernel/dma/mapping.c:311 dma_free_attrs+0x78/0x110
Supported: No, Unreleased kernel
CPU: 9 PID: 24062 Comm: drmgr Kdump: loaded Tainted: G   X 5.3.18-12-default
NIP:  c01fa758 LR: c01fa744 CTR: c01fa6e0
REGS: c002173375d0 TRAP: 0700   Tainted: G   X (5.3.18-12-default)
MSR:  80029033   CR: 28088282  XER: 2000
CFAR: c01fbf0c IRQMASK: 1
GPR00: c01fa744 c00217337860 c161ab00 
GPR04:  c11e1225 1801 
GPR08:  0001 0001 c008190f4fa8
GPR12: c01fa6e0 c7fc2a00  
GPR16:    
GPR20:    
GPR24: 00011420e310   1801
GPR28: c159de50 c11e1225 6600 c11e5c994848
NIP [c01fa758] dma_free_attrs+0x78/0x110
LR [c01fa744] dma_free_attrs+0x64/0x110
Call Trace:
[c00217337860] [00011420e310] 0x11420e310 (unreliable)
[c002173378b0] [c008190f0280] release_event_pool+0xd8/0x120 [ibmvscsi]
[c00217337930] [c008190f3f74] ibmvscsi_remove+0x6c/0x160 [ibmvscsi]
[c00217337960] [c00f3cac] vio_bus_remove+0x5c/0x100
[c002173379a0] [c087a0a4] device_release_driver_internal+0x154/0x280
[c002173379e0] [c08777cc] bus_remove_device+0x11c/0x220
[c00217337a60] [c0870fc4] device_del+0x1c4/0x470
[c00217337b10] [c08712a0] device_unregister+0x30/0xa0
[c00217337b80] [c00f39ec] vio_unregister_device+0x2c/0x60
[c00217337bb0] [c0081a1d0964] dlpar_remove_slot+0x14c/0x250 [rpadlpar_io]
[c00217337c50] [c0081a1d0bcc] remove_slot_store+0xa4/0x110 [rpadlpar_io]
[c00217337cd0] [c0c091a0] kobj_attr_store+0x30/0x50
[c00217337cf0] [c057c934] sysfs_kf_write+0x64/0x90
[c00217337d10] [c057be10] kernfs_fop_write+0x1b0/0x290
[c00217337d60] [c0488c4c] __vfs_write+0x3c/0x70
[c00217337d80] [c048c648] vfs_write+0xd8/0x260
[c00217337dd0] [c048ca8c] ksys_write+0xdc/0x130
[c00217337e20] [c000b488] system_call+0x5c/0x70
Instruction dump:
7c840074 f8010010 f821ffb1 20840040 eb830218 7c8407b4 48002019 6000
2fa3 409e003c 892d0988 792907e0 <0b09> 2fbd 419e0028 2fbc
---[ end trace 5955b3c0cc079942 ]---
rpadlpar_io: slot U9080.M9S.783AEC8-V11-C11 removed

This is tripped as a result of irqs being disabled during the call to
dma_free_coherent() by release_event_pool(). At this point in the code path
we have quiesced the adapter and it is overly paranoid to be holding the
host lock.

[mkp: fixed build warning reported by sfr]

Link: https://lore.kernel.org/r/1588027793-17952-1-git-send-email-tyr...@linux.ibm.com
Signed-off-by: Tyrel Datwyler 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 7f66a7783209..59f0f1030c54 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -2320,16 +2320,12 @@ static int ibmvscsi_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 static int ibmvscsi_remove(struct vio_dev *vdev)
 {
struct ibmvscsi_host_data *hostdata = dev_get_drvdata(&vdev->dev);
-   unsigned long flags;
 
srp_remove_host(hostdata->host);
scsi_remove_host(hostdata->host);
 
purge_requests(hostdata, DID_ERROR);
-
-   spin_lock_irqsave(hostdata->host->host_lock, flags);
release_event_pool(&hostdata->pool, hostdata);
-   spin_unlock_irqrestore(hostdata->host->host_lock, flags);
 
ibmvscsi_release_crq_queue(&hostdata->queue, hostdata,
max_events);
-- 
2.25.1



[PATCH AUTOSEL 5.6 069/606] powerpc/uaccess: Evaluate macro arguments once, before user access is allowed

2020-06-08 Thread Sasha Levin
From: Nicholas Piggin 

commit d02f6b7dab8228487268298ea1f21081c0b4b3eb upstream.

get/put_user() can be called with nontrivial arguments. fs/proc/page.c
has a good example:

if (put_user(stable_page_flags(ppage), out)) {

stable_page_flags() is quite a lot of code, including spin locks in
the page allocator.

Ensure these arguments are evaluated before user access is allowed.

This improves security by reducing code with access to userspace, but
it also fixes a PREEMPT bug with KUAP on powerpc/64s:
stable_page_flags() is currently called with AMR set to allow writes,
it ends up calling spin_unlock(), which can call preempt_schedule. But
the task switch code can not be called with AMR set (it relies on
interrupts saving the register), so this blows up.

It's fine if the code inside allow_user_access() is preemptible,
because a timer or IPI will save the AMR, but it's not okay to
explicitly cause a reschedule.
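
The idea can be sketched in plain C. This is a standalone model under
stated assumptions, not the powerpc macros: window_open stands in for
"user access allowed", model_expensive_value() stands in for a
nontrivial argument such as stable_page_flags(), and all MODEL_* names
are hypothetical. __typeof__ is a GNU C extension, as in the patch.

```c
#include <assert.h>

static int window_open;		/* 1 while "user access" is allowed */
static int evals_inside_window;	/* argument evaluations seen inside */

/* Hypothetical nontrivial argument, like stable_page_flags(). */
int model_expensive_value(void)
{
	if (window_open)
		evals_inside_window++;
	return 42;
}

/* Old shape: (x) is expanded, and so evaluated, inside the window. */
#define MODEL_PUT_OLD(x, ptr) do {		\
	window_open = 1;			\
	*(ptr) = (x);				\
	window_open = 0;			\
} while (0)

/* New shape: (x) is evaluated into a temporary before the window opens. */
#define MODEL_PUT_NEW(x, ptr) do {		\
	__typeof__(*(ptr)) val_ = (x);		\
	window_open = 1;			\
	*(ptr) = val_;				\
	window_open = 0;			\
} while (0)

/* Number of argument evaluations that happened inside the window. */
int model_count_old(void)
{
	int out = 0;

	evals_inside_window = 0;
	MODEL_PUT_OLD(model_expensive_value(), &out);
	assert(out == 42);
	return evals_inside_window;
}

int model_count_new(void)
{
	int out = 0;

	evals_inside_window = 0;
	MODEL_PUT_NEW(model_expensive_value(), &out);
	assert(out == 42);
	return evals_inside_window;
}
```

With the old shape the argument runs while the window is open; with the
new shape only the store itself does.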

Fixes: de78a9c42a79 ("powerpc: Add a framework for Kernel Userspace Access Protection")
Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200407041245.600651-1-npig...@gmail.com
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/include/asm/uaccess.h | 49 +-
 1 file changed, 35 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 2f500debae21..0969285996cb 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -166,13 +166,17 @@ do { \
 ({ \
long __pu_err;  \
__typeof__(*(ptr)) __user *__pu_addr = (ptr);   \
+   __typeof__(*(ptr)) __pu_val = (x);  \
+   __typeof__(size) __pu_size = (size);\
+   \
if (!is_kernel_addr((unsigned long)__pu_addr))  \
might_fault();  \
-   __chk_user_ptr(ptr);\
+   __chk_user_ptr(__pu_addr);  \
if (do_allow)   \
-   __put_user_size((x), __pu_addr, (size), __pu_err);  \
+   __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err);  \
else    \
-   __put_user_size_allowed((x), __pu_addr, (size), __pu_err);  \
+   __put_user_size_allowed(__pu_val, __pu_addr, __pu_size, __pu_err); \
+   \
__pu_err;   \
 })
 
@@ -180,9 +184,13 @@ do { \
 ({ \
long __pu_err = -EFAULT;\
__typeof__(*(ptr)) __user *__pu_addr = (ptr);   \
+   __typeof__(*(ptr)) __pu_val = (x);  \
+   __typeof__(size) __pu_size = (size);\
+   \
might_fault();  \
-   if (access_ok(__pu_addr, size)) \
-   __put_user_size((x), __pu_addr, (size), __pu_err);  \
+   if (access_ok(__pu_addr, __pu_size))\
+   __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \
+   \
__pu_err;   \
 })
 
@@ -190,8 +198,12 @@ do { \
 ({ \
long __pu_err;  \
__typeof__(*(ptr)) __user *__pu_addr = (ptr);   \
-   __chk_user_ptr(ptr);\
-   __put_user_size((x), __pu_addr, (size), __pu_err);  \
+   __typeof__(*(ptr)) __pu_val = (x);  \
+   __typeof__(size) __pu_size = (size);\
+   \
+   __chk_user_ptr(__pu_addr);  \
+   __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \
+   \
__pu_err;   \
 })
 
@@ -283,15 +295,18 @@ do {  
\
long __gu_err;   

[PATCH AUTOSEL 5.6 070/606] powerpc/ima: Fix secure boot rules in ima arch policy

2020-06-08 Thread Sasha Levin
From: Nayna Jain 

commit fa4f3f56ccd28ac031ab275e673ed4098855fed4 upstream.

To prevent verifying the kernel module appended signature
twice (finit_module), once by the module_sig_check() and again by IMA,
powerpc secure boot rules define an IMA architecture specific policy
rule only if CONFIG_MODULE_SIG_FORCE is not enabled. This,
unfortunately, does not take into account the ability of enabling
"sig_enforce" on the boot command line (module.sig_enforce=1).

Including the IMA module appraise rule results in failing the
finit_module syscall, unless the module signing public key is loaded
onto the IMA keyring.

This patch fixes secure boot policy rules to be based on
CONFIG_MODULE_SIG instead.

Fixes: 4238fad366a6 ("powerpc/ima: Add support to initialize ima policy rules")
Signed-off-by: Nayna Jain 
Signed-off-by: Michael Ellerman 
Signed-off-by: Mimi Zohar 
Link: https://lore.kernel.org/r/1588342612-14532-1-git-send-email-na...@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/kernel/ima_arch.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/ima_arch.c b/arch/powerpc/kernel/ima_arch.c
index e34116255ced..957abd592075 100644
--- a/arch/powerpc/kernel/ima_arch.c
+++ b/arch/powerpc/kernel/ima_arch.c
@@ -19,12 +19,12 @@ bool arch_ima_get_secureboot(void)
  * to be stored as an xattr or as an appended signature.
  *
  * To avoid duplicate signature verification as much as possible, the IMA
- * policy rule for module appraisal is added only if CONFIG_MODULE_SIG_FORCE
+ * policy rule for module appraisal is added only if CONFIG_MODULE_SIG
  * is not enabled.
  */
 static const char *const secure_rules[] = {
"appraise func=KEXEC_KERNEL_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig",
-#ifndef CONFIG_MODULE_SIG_FORCE
+#ifndef CONFIG_MODULE_SIG
"appraise func=MODULE_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig",
 #endif
NULL
@@ -50,7 +50,7 @@ static const char *const secure_and_trusted_rules[] = {
"measure func=KEXEC_KERNEL_CHECK template=ima-modsig",
"measure func=MODULE_CHECK template=ima-modsig",
"appraise func=KEXEC_KERNEL_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig",
-#ifndef CONFIG_MODULE_SIG_FORCE
+#ifndef CONFIG_MODULE_SIG
"appraise func=MODULE_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig",
 #endif
NULL
-- 
2.25.1



[PATCH AUTOSEL 5.6 039/606] powerpc/32s: Fix build failure with CONFIG_PPC_KUAP_DEBUG

2020-06-08 Thread Sasha Levin
From: Christophe Leroy 

commit 4833ce06e6855d526234618b746ffb71d6612c9a upstream.

gpr2 is not a parameter of kuap_check(); it doesn't exist.

Use gpr instead.

Fixes: a68c31fc01ef ("powerpc/32s: Implement Kernel Userspace Access Protection")
Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Cc: sta...@vger.kernel.org
Link: https://lore.kernel.org/r/ea599546f2a7771bde551393889e44e6b2632332.1587368807.git.christophe.le...@c-s.fr
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/include/asm/book3s/32/kup.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index 3c0ba22dc360..db0a1c281587 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -75,7 +75,7 @@
 
 .macro kuap_check  current, gpr
 #ifdef CONFIG_PPC_KUAP_DEBUG
-   lwz \gpr2, KUAP(thread)
+   lwz \gpr, KUAP(thread)
 999:   twnei   \gpr, 0
EMIT_BUG_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | BUGFLAG_ONCE)
 #endif
-- 
2.25.1



[PATCH AUTOSEL 5.6 038/606] powerpc/vdso32: Fallback on getres syscall when clock is unknown

2020-06-08 Thread Sasha Levin
From: Christophe Leroy 

commit e963b7a28b2bf2416304e1a15df967fcf662aff5 upstream.

There are other clocks than the standard ones, for instance
per-process clocks. Therefore, being above the last standard clock
doesn't mean it is a bad clock. So fall back to the syscall instead
of returning -EINVAL unconditionally.
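
The behaviour change can be modelled in C. This is a sketch under
stated assumptions, not the vDSO assembly: the clock ids, the
resolutions and every model_* name are hypothetical, and
model_syscall_getres() merely stands in for the real system call.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_NR_FAST_CLOCKS 2	/* hypothetical: clocks the fast path knows */
#define MODEL_EXTRA_CLOCK    100	/* hypothetical valid non-standard clock */

/* Fast path: returns a resolution in ns, or -1 if the clock is unknown. */
long model_fast_getres(int clock_id)
{
	if (clock_id >= 0 && clock_id < MODEL_NR_FAST_CLOCKS)
		return 1;
	return -1;
}

/* Stand-in for the system call, which knows every valid clock. */
long model_syscall_getres(int clock_id)
{
	if (clock_id == MODEL_EXTRA_CLOCK)
		return 1000;
	if (clock_id >= 0 && clock_id < MODEL_NR_FAST_CLOCKS)
		return 1;
	return -EINVAL;
}

/* Old behaviour: an unknown clock is rejected by the fast path itself. */
long model_getres_old(int clock_id)
{
	long r = model_fast_getres(clock_id);

	return r >= 0 ? r : -EINVAL;
}

/* New behaviour: an unknown clock is handed to the system call. */
long model_getres_new(int clock_id)
{
	long r = model_fast_getres(clock_id);

	assert(r == -1 || r > 0);	/* fast-path contract in this model */
	return r >= 0 ? r : model_syscall_getres(clock_id);
}
```

In this model the extra clock is rejected by the old path and resolved
by the new one, while a truly invalid id still yields -EINVAL.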

Fixes: e33ffc956b08 ("powerpc/vdso32: implement clock_getres entirely")
Cc: sta...@vger.kernel.org # v5.6+
Reported-by: Aurelien Jarno 
Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Tested-by: Aurelien Jarno 
Link: https://lore.kernel.org/r/7316a9e2c0c2517923eb4b0411c4a08d15e675a4.1589017281.git.christophe.le...@csgroup.eu
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/kernel/vdso32/gettimeofday.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
index a3951567118a..e7f8f9f1b3f4 100644
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
@@ -218,11 +218,11 @@ V_FUNCTION_BEGIN(__kernel_clock_getres)
blr
 
/*
-* invalid clock
+* syscall fallback
 */
 99:
-   li  r3, EINVAL
-   crset   so
+   li  r0,__NR_clock_getres
+   sc
blr
   .cfi_endproc
 V_FUNCTION_END(__kernel_clock_getres)
-- 
2.25.1



[PATCH AUTOSEL 5.7 161/274] powerpc/spufs: fix copy_to_user while atomic

2020-06-08 Thread Sasha Levin
From: Jeremy Kerr 

[ Upstream commit 88413a6bfbbe2f648df399b62f85c934460b7a4d ]

Currently, we may perform a copy_to_user (through
simple_read_from_buffer()) while holding a context's register_lock,
while accessing the context save area.

This change uses a temporary buffer for the context save area data,
which we then pass to simple_read_from_buffer.
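
The pattern can be sketched in userspace C. This is a model under
stated assumptions, not the spufs code: lock_held stands in for the
register_lock spinlock, model_copy_to_user() for a copy that may sleep,
and model_read_from_buffer() only imitates the positional-read
behaviour of simple_read_from_buffer(). All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int lock_held;	/* model of the spinlock: 1 while held */

/* Model of a copy that may sleep: must not run with the lock held. */
static void model_copy_to_user(void *to, const void *from, size_t n)
{
	assert(!lock_held);
	memcpy(to, from, n);
}

/*
 * Model of a positional read from a kernel buffer.
 * Returns the number of bytes copied, 0 at end of data, -1 on a bad
 * position, and advances *pos by the amount copied.
 */
long model_read_from_buffer(void *to, size_t count, long *pos,
			    const void *from, size_t available)
{
	size_t off;

	if (*pos < 0)
		return -1;
	off = (size_t)*pos;
	if (off >= available || count == 0)
		return 0;
	if (count > available - off)
		count = available - off;
	model_copy_to_user(to, (const char *)from + off, count);
	*pos += (long)count;
	return (long)count;
}

/* Hypothetical slice of the context save area. */
struct model_ctx {
	unsigned int mb_stat;
	unsigned int mbox_data;
};

long model_mbox_info_read(struct model_ctx *ctx, void *buf, size_t len,
			  long *pos)
{
	unsigned int stat, data;

	lock_held = 1;			/* spin_lock() */
	stat = ctx->mb_stat;		/* snapshot under the lock */
	data = ctx->mbox_data;
	lock_held = 0;			/* spin_unlock() */

	/* EOF if there's no entry in the mbox */
	if (!(stat & 0xff))
		return 0;

	/* The copy runs on the snapshot, after the lock is dropped. */
	return model_read_from_buffer(buf, len, pos, &data, sizeof(data));
}
```

The assertion in model_copy_to_user() would fire if the copy were moved
back inside the locked region, which is the bug class being fixed.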

Includes changes from Christoph Hellwig.

Fixes: bf1ab978be23 ("[POWERPC] coredump: Add SPU elf notes to coredump.")
Signed-off-by: Jeremy Kerr 
Reviewed-by: Arnd Bergmann 
[hch: renamed the function to avoid ___-prefixes]
Signed-off-by: Christoph Hellwig 
Signed-off-by: Al Viro 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/cell/spufs/file.c | 113 +++
 1 file changed, 75 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c
index c0f950a3f4e1..f4a4dfb191e7 100644
--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -1978,8 +1978,9 @@ static ssize_t __spufs_mbox_info_read(struct spu_context *ctx,
 static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
-   int ret;
struct spu_context *ctx = file->private_data;
+   u32 stat, data;
+   int ret;
 
if (!access_ok(buf, len))
return -EFAULT;
@@ -1988,11 +1989,16 @@ static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_mbox_info_read(ctx, buf, len, pos);
+   stat = ctx->csa.prob.mb_stat_R;
+   data = ctx->csa.prob.pu_mb_R;
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   /* EOF if there's no entry in the mbox */
+   if (!(stat & 0xff))
+   return 0;
+
+   return simple_read_from_buffer(buf, len, pos, &data, sizeof(data));
 }
 
 static const struct file_operations spufs_mbox_info_fops = {
@@ -2019,6 +2025,7 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
struct spu_context *ctx = file->private_data;
+   u32 stat, data;
int ret;
 
if (!access_ok(buf, len))
@@ -2028,11 +2035,16 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_ibox_info_read(ctx, buf, len, pos);
+   stat = ctx->csa.prob.mb_stat_R;
+   data = ctx->csa.priv2.puint_mb_R;
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   /* EOF if there's no entry in the ibox */
+   if (!(stat & 0xff))
+   return 0;
+
+   return simple_read_from_buffer(buf, len, pos, &data, sizeof(data));
 }
 
 static const struct file_operations spufs_ibox_info_fops = {
@@ -2041,6 +2053,11 @@ static const struct file_operations spufs_ibox_info_fops = {
.llseek  = generic_file_llseek,
 };
 
+static size_t spufs_wbox_info_cnt(struct spu_context *ctx)
+{
+   return (4 - ((ctx->csa.prob.mb_stat_R & 0x00ff00) >> 8)) * sizeof(u32);
+}
+
 static ssize_t __spufs_wbox_info_read(struct spu_context *ctx,
char __user *buf, size_t len, loff_t *pos)
 {
@@ -2049,7 +2066,7 @@ static ssize_t __spufs_wbox_info_read(struct spu_context *ctx,
u32 wbox_stat;
 
wbox_stat = ctx->csa.prob.mb_stat_R;
-   cnt = 4 - ((wbox_stat & 0x00ff00) >> 8);
+   cnt = spufs_wbox_info_cnt(ctx);
for (i = 0; i < cnt; i++) {
data[i] = ctx->csa.spu_mailbox_data[i];
}
@@ -2062,7 +2079,8 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf,
   size_t len, loff_t *pos)
 {
struct spu_context *ctx = file->private_data;
-   int ret;
+   u32 data[ARRAY_SIZE(ctx->csa.spu_mailbox_data)];
+   int ret, count;
 
if (!access_ok(buf, len))
return -EFAULT;
@@ -2071,11 +2089,13 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf,
if (ret)
return ret;
spin_lock(&ctx->csa.register_lock);
-   ret = __spufs_wbox_info_read(ctx, buf, len, pos);
+   count = spufs_wbox_info_cnt(ctx);
+   memcpy(&data, &ctx->csa.spu_mailbox_data, sizeof(data));
spin_unlock(&ctx->csa.register_lock);
spu_release_saved(ctx);
 
-   return ret;
+   return simple_read_from_buffer(buf, len, pos, &data,
+   count * sizeof(u32));
 }
 
 static const struct file_operations spufs_wbox_info_fops = {
@@ -2084,27 +2104,33 @@ static const struct file_operations spufs_wbox_info_fops = {
.llseek  = generic_file_llseek,
 };
 
-static ssize_t __sp

[PATCH AUTOSEL 5.7 145/274] sched/core: Fix illegal RCU from offline CPUs

2020-06-08 Thread Sasha Levin
From: Peter Zijlstra 

[ Upstream commit bf2c59fce4074e55d622089b34be3a6bc95484fb ]

During CPU offlining, the outgoing CPU calls mmdrop() after idle entry
and after the subsequent call to cpuhp_report_idle_dead(). Once
execution passes the call to rcu_report_dead(), RCU is ignoring the CPU,
which results in lockdep complaining when mmdrop() uses RCU from either
memcg or debugobjects below.

Fix it by cleaning up the active_mm state from BP instead. Every arch
which has CONFIG_HOTPLUG_CPU should have already called idle_task_exit()
from AP. The only exception is parisc because it switches them to
&init_mm unconditionally (see smp_boot_one_cpu() and smp_cpu_init()),
but the patch will still work there because it calls mmgrab(&init_mm) in
smp_cpu_init() and then should call mmdrop(&init_mm) in finish_cpu().
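
The hand-off described above can be modelled with a plain reference
count. This is a sketch under stated assumptions, not the scheduler
code: struct model_mm, struct model_task and all model_* functions are
hypothetical stand-ins for mm_struct, the idle task, idle_task_exit()
and finish_cpu().

```c
#include <assert.h>
#include <stddef.h>

struct model_mm {
	int refcount;
};

struct model_task {
	struct model_mm *active_mm;
};

struct model_mm model_init_mm = { 1 };

void model_mmgrab(struct model_mm *mm)
{
	mm->refcount++;
}

void model_mmdrop(struct model_mm *mm)
{
	assert(mm->refcount > 0);
	mm->refcount--;
}

/*
 * Runs on the outgoing CPU (AP). It only switches address space; the
 * reference is deliberately left for the BP to drop, and active_mm is
 * left alone so the BP can find it.
 */
void model_idle_task_exit(struct model_task *idle)
{
	(void)idle;	/* switch_mm() to init_mm would happen here */
}

/* Runs on the BP, where RCU is still usable. */
void model_finish_cpu(struct model_task *idle)
{
	struct model_mm *mm = idle->active_mm;

	if (mm != &model_init_mm)
		idle->active_mm = &model_init_mm;
	model_mmdrop(mm);
}

/* Offline one CPU whose idle task had borrowed user_mm. */
int model_offline(struct model_mm *user_mm)
{
	struct model_task idle = { user_mm };

	model_mmgrab(user_mm);		/* reference taken when borrowed */
	model_idle_task_exit(&idle);	/* on the AP */
	model_finish_cpu(&idle);	/* on the BP */
	return idle.active_mm == &model_init_mm;
}
```

In this model the borrowed reference is released exactly once, and only
by the code that runs on the BP.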

  WARNING: suspicious RCU usage
  -
  kernel/workqueue.c:710 RCU or wq_pool_mutex should be held!

  other info that might help us debug this:

  RCU used illegally from offline CPU!
  Call Trace:
   dump_stack+0xf4/0x164 (unreliable)
   lockdep_rcu_suspicious+0x140/0x164
   get_work_pool+0x110/0x150
   __queue_work+0x1bc/0xca0
   queue_work_on+0x114/0x120
   css_release+0x9c/0xc0
   percpu_ref_put_many+0x204/0x230
   free_pcp_prepare+0x264/0x570
   free_unref_page+0x38/0xf0
   __mmdrop+0x21c/0x2c0
   idle_task_exit+0x170/0x1b0
   pnv_smp_cpu_kill_self+0x38/0x2e0
   cpu_die+0x48/0x64
   arch_cpu_idle_dead+0x30/0x50
   do_idle+0x2f4/0x470
   cpu_startup_entry+0x38/0x40
   start_secondary+0x7a8/0xa80
   start_secondary_resume+0x10/0x14

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Qian Cai 
Signed-off-by: Peter Zijlstra (Intel) 
Acked-by: Michael Ellerman  (powerpc)
Link: https://lkml.kernel.org/r/20200401214033.8448-1-...@lca.pw
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/powernv/smp.c |  1 -
 include/linux/sched/mm.h |  2 ++
 kernel/cpu.c | 18 +-
 kernel/sched/core.c  |  5 +++--
 4 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 13e251699346..b2ba3e95bda7 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -167,7 +167,6 @@ static void pnv_smp_cpu_kill_self(void)
/* Standard hot unplug procedure */
 
idle_task_exit();
-   current->active_mm = NULL; /* for sanity */
cpu = smp_processor_id();
DBG("CPU%d offline\n", cpu);
generic_set_cpu_dead(cpu);
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index c49257a3b510..a132d875d351 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -49,6 +49,8 @@ static inline void mmdrop(struct mm_struct *mm)
__mmdrop(mm);
 }
 
+void mmdrop(struct mm_struct *mm);
+
 /*
  * This has to be called after a get_task_mm()/mmget_not_zero()
  * followed by taking the mmap_sem for writing before modifying the
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 2371292f30b0..244d30544377 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -3,6 +3,7 @@
  *
  * This code is licenced under the GPL.
  */
+#include 
 #include 
 #include 
 #include 
@@ -564,6 +565,21 @@ static int bringup_cpu(unsigned int cpu)
return bringup_wait_for_ap(cpu);
 }
 
+static int finish_cpu(unsigned int cpu)
+{
+   struct task_struct *idle = idle_thread_get(cpu);
+   struct mm_struct *mm = idle->active_mm;
+
+   /*
+* idle_task_exit() will have switched to &init_mm, now
+* clean up any remaining active_mm state.
+*/
+   if (mm != &init_mm)
+   idle->active_mm = &init_mm;
+   mmdrop(mm);
+   return 0;
+}
+
 /*
  * Hotplug state machine related functions
  */
@@ -1549,7 +1565,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
[CPUHP_BRINGUP_CPU] = {
.name   = "cpu:bringup",
.startup.single = bringup_cpu,
-   .teardown.single= NULL,
+   .teardown.single= finish_cpu,
.cant_stop  = true,
},
/* Final state before CPU kills itself */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9a2fbf98fd6f..0bbf387d0f19 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6190,13 +6190,14 @@ void idle_task_exit(void)
struct mm_struct *mm = current->active_mm;
 
BUG_ON(cpu_online(smp_processor_id()));
+   BUG_ON(current != this_rq()->idle);
 
if (mm != &init_mm) {
switch_mm(mm, &init_mm, current);
-   current->active_mm = &init_mm;
finish_arch_post_lock_switch();
}
-   mmdrop(mm);
+
+   /* finish_cpu(), as ran on the BP, will clean up the active_mm state */
 }
 
 /*
-- 
2.25.1



[PATCH AUTOSEL 5.7 038/274] soc: fsl: dpio: properly compute the consumer index

2020-06-08 Thread Sasha Levin
From: Ioana Ciornei 

[ Upstream commit 7596ac9d19a9df25707ecaac0675881f62dd8c18 ]

Mask the consumer index before using it. Without this, we would be
writing frame descriptors beyond the ring size supported by the QBMAN
block.
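
The effect of the mask can be sketched as follows. This is a model
under stated assumptions, not the QBMAN driver: it assumes a
power-of-two ring whose index carries one extra wrap bit, so the mask
covers twice the ring size, and it assumes the cyclic difference is
taken over that doubled range. All model_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mask shape: ring index plus one wrap bit. */
uint32_t model_full_mask(uint32_t ring_size)
{
	assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
	return (ring_size << 1) - 1;
}

/*
 * Entries between 'first' (included) and 'last' (excluded) on a cycle
 * of length 2 * ring_size.
 */
uint32_t model_cyc_diff(uint32_t ring_size, uint32_t first, uint32_t last)
{
	if (first <= last)
		return last - first;
	return (2 * ring_size) - (first - last);
}

/* Mask the raw register value before it is stored or compared. */
uint32_t model_masked_ci(uint32_t ring_size, uint32_t raw_ci)
{
	return raw_ci & model_full_mask(ring_size);
}

uint32_t model_available(uint32_t ring_size, uint32_t old_ci,
			 uint32_t raw_ci)
{
	uint32_t ci = model_masked_ci(ring_size, raw_ci);

	return model_cyc_diff(ring_size, old_ci, ci);
}
```

For a ring of 8 entries, a raw value of 0x13 is reduced to 3, so the
stored consumer index stays inside the range the ring arithmetic
expects.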

Fixes: 3b2abda7d28c ("soc: fsl: dpio: Replace QMAN array mode with ring mode enqueue")
Signed-off-by: Ioana Ciornei 
Acked-by: Li Yang 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/soc/fsl/dpio/qbman-portal.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/soc/fsl/dpio/qbman-portal.c b/drivers/soc/fsl/dpio/qbman-portal.c
index 804b8ba9bf5c..23a1377971f4 100644
--- a/drivers/soc/fsl/dpio/qbman-portal.c
+++ b/drivers/soc/fsl/dpio/qbman-portal.c
@@ -669,6 +669,7 @@ int qbman_swp_enqueue_multiple_direct(struct qbman_swp *s,
eqcr_ci = s->eqcr.ci;
p = s->addr_cena + QBMAN_CENA_SWP_EQCR_CI;
s->eqcr.ci = qbman_read_register(s, QBMAN_CINH_SWP_EQCR_CI);
+   s->eqcr.ci &= full_mask;
 
s->eqcr.available = qm_cyc_diff(s->eqcr.pi_ring_size,
eqcr_ci, s->eqcr.ci);
-- 
2.25.1



[PATCH v12 2/6] seq_buf: Export seq_buf_printf

2020-06-08 Thread Vaibhav Jain
'seq_buf' provides a very useful abstraction for writing to a string
buffer without needing to worry about it overflowing. However, even
though the API has been stable for a couple of years now, it is still
not exported to loadable kernel modules, which limits its usage.

Hence this patch proposes an update to 'seq_buf.c' to mark
seq_buf_printf(), which is part of the seq_buf API, to be exported to
GPL loadable kernel modules. This symbol will be used in later parts
of this patch-set to simplify content creation for a sysfs attribute.
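
For readers unfamiliar with the abstraction, here is a userspace sketch
of a bounded printf-style buffer. It is a model only, not the kernel's
seq_buf implementation: struct model_buf and the model_buf_* functions
are hypothetical, and the choice to discard a write that does not fit
is an assumption of this sketch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

struct model_buf {
	char *buf;		/* caller-provided storage */
	size_t size;		/* total bytes in buf */
	size_t len;		/* bytes used, excluding the NUL */
	int overflowed;		/* set once a write did not fit */
};

void model_buf_init(struct model_buf *s, char *buf, size_t size)
{
	assert(buf != NULL && size > 0);
	s->buf = buf;
	s->size = size;
	s->len = 0;
	s->overflowed = 0;
	buf[0] = '\0';
}

/* Returns 0 on success, -1 if the formatted text did not fit. */
int model_buf_printf(struct model_buf *s, const char *fmt, ...)
{
	va_list ap;
	size_t room;
	int n;

	if (s->overflowed)
		return -1;

	room = s->size - s->len;	/* always >= 1 here */
	va_start(ap, fmt);
	n = vsnprintf(s->buf + s->len, room, fmt, ap);
	va_end(ap);

	if (n < 0 || (size_t)n >= room) {
		s->buf[s->len] = '\0';	/* discard the partial write */
		s->overflowed = 1;
		return -1;
	}
	s->len += (size_t)n;
	return 0;
}
```

The caller never computes remaining space by hand; it only checks the
return value or the overflow flag once at the end.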

Cc: Piotr Maziarz 
Cc: Cezary Rojewski 
Cc: Christoph Hellwig 
Cc: Steven Rostedt 
Cc: Borislav Petkov 
Acked-by: Steven Rostedt (VMware) 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v11..v12:
* None

v10..v11:
* None

v9..v10:
* None

Resend:
* Added ack from Steven Rostedt

v8..v9:
* None

v7..v8:
* Updated the patch title [ Christoph Hellwig ]
* Updated patch description to replace the confusing term 'external
  kernel modules' with 'kernel loadable modules'.

Resend:
* Added ack from Steven Rostedt

v6..v7:
* New patch in the series
---
 lib/seq_buf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 4e865d42ab03..707453f5d58e 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -91,6 +91,7 @@ int seq_buf_printf(struct seq_buf *s, const char *fmt, ...)
 
return ret;
 }
+EXPORT_SYMBOL_GPL(seq_buf_printf);
 
 #ifdef CONFIG_BINARY_PRINTF
 /**
-- 
2.26.2



[PATCH v12 6/6] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH

2020-06-08 Thread Vaibhav Jain
This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
that returns a newly introduced 'struct nd_papr_pdsm_health' instance
containing dimm health information back to user space in response to
ND_CMD_CALL. This functionality is implemented in newly introduced
papr_pdsm_health() that queries the nvdimm health information and
then copies this information to the package payload whose layout is
defined by 'struct nd_papr_pdsm_health'.

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Reviewed-by: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v11..v12:
* Minor: Reordered the initialization of 'struct nd_papr_pdsm_health'
  fields to match the order present in its definition. [ Ira ]
* Added ack from Ira

v10..v11:
* Changed the definition of 'struct nd_papr_pdsm_health' to a maximal
  struct 184 bytes in size [ Dan Williams ]
* Added new field 'extension_flags' to 'struct nd_papr_pdsm_health'
  [ Dan Williams ]
* Updated papr_pdsm_health() to set field 'extension_flags' to 0.
* Introduced a define ND_PDSM_PAYLOAD_MAX_SIZE that indicates the
  maximum size of a payload.
* Fixed a suspicious conversion from u64 to u8 in papr_pdsm_health
  that was preventing correct initialization of 'struct
  nd_papr_pdsm_health'. [ Ira ]

v9..v10:
* Removed code in papr_pdsm_health that performed validation on pdsm
  payload version and the corresponding struct and defines used for
  validation of payload version.
* Dropped usage of struct papr_pdsm_health in 'struct
  papr_scm_priv'. Instead papr_psdm_health() now uses
  'papr_scm_priv.health_bitmap' to populate the pdsm payload.
* Above change also fixes the problem where this patch was removing
  the code that was previously introduced in this patch-series.
  [ Ira ]
* Introduced a new def ND_PDSM_ENVELOPE_HDR_SIZE that indicates the
  space allocated to 'struct nd_pdsm_cmd_pkg' fields except 'struct
  nd_cmd_pkg'. This def is useful in validating payload sizes.
* Reworked papr_pdsm_health() to enforce a specific payload size for
  'PAPR_PDSM_HEALTH' pdsm request.

Resend:
* Added ack from Aneesh.

v8..v9:
* s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g  [ Dan , Aneesh ]
* s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
* Renamed papr_scm_get_health() to papr_psdm_health()
* Updated patch description to replace papr-scm dimm with nvdimm.

v7..v8:
* None

Resend:
* None

v6..v7:
* Updated flags_show() to use seq_buf_printf(). [Mpe]
* Updated papr_scm_get_health() to use newly introduced
  __drc_pmem_query_health() bypassing the cache [Mpe].

v5..v6:
* Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
  guard against the possibility of different compilers adding different
  paddings to the struct [ Dan Williams ]

* Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
  'bool' and also updated drc_pmem_query_health() to take this into
  account. [ Dan Williams ]

v4..v5:
* None

v3..v4:
* Call the DSM_PAPR_SCM_HEALTH service function from
  papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]

v2..v3:
* Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
  as it's exported to userspace [Aneesh]
* Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
  from enum to #defines [Aneesh]

v1..v2:
* New patch in the series
---
 arch/powerpc/include/uapi/asm/papr_pdsm.h | 43 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 71 +++
 2 files changed, 114 insertions(+)

diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
index 34d1a41d2406..d453baea13c4 100644
--- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -70,13 +70,56 @@ struct nd_pdsm_cmd_pkg {
__u8 payload[]; /* In/Out: Sub-cmd data buffer */
 } __packed;
 
+/* Calculate size used by the pdsm header fields minus 'struct nd_cmd_pkg' */
+#define ND_PDSM_HDR_SIZE \
+   (sizeof(struct nd_pdsm_cmd_pkg) - sizeof(struct nd_cmd_pkg))
+
+/* Max payload size that we can handle */
+#define ND_PDSM_PAYLOAD_MAX_SIZE 184
+
 /*
  * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
  * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
  */
 enum papr_pdsm {
PAPR_PDSM_MIN = 0x0,
+   PAPR_PDSM_HEALTH,
PAPR_PDSM_MAX,
 };
 
+/* Various nvdimm health indicators */
+#define PAPR_PDSM_DIMM_HEALTHY   0
+#define PAPR_PDSM_DIMM_UNHEALTHY 1
+#define PAPR_PDSM_DIMM_CRITICAL  2
+#define PAPR_PDSM_DIMM_FATAL 3
+
+/*
+ * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH
+ * Various flags indicate the health status of the dimm.
+ *
+ * extension_flags : Any extension fields present in the struct.
+ * dimm_unarmed: Dimm not armed. So contents wont persist.
+ * dimm_bad_shutdown   : Previous shutdown did not persist contents.
+ * dimm_bad_restore: Contents from previous shutdown werent restored.
+ * dimm_scrubbe

[PATCH v12 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods

2020-06-08 Thread Vaibhav Jain
Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
module and add the command family NVDIMM_FAMILY_PAPR to the white list
of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the
nvdimm command mask and implement necessary scaffolding in the module
to handle ND_CMD_CALL ioctl and PDSM requests that we receive.

The layout of the PDSM request as we expect it from libnvdimm/libndctl
is described in the newly introduced uapi header 'papr_pdsm.h', which
defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used
to communicate the PDSM request via the member
'nd_cmd_pkg.nd_command' and the size of the payload that needs to be
sent/received for servicing the PDSM.
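
As a rough illustration of the envelope described above, the layout can
be sketched in plain C as follows. This is only a sketch under stated
assumptions, not the contents of 'papr_pdsm.h': all 'sketch_' names are
hypothetical, the first struct is a 64-byte stand-in for the real
'struct nd_cmd_pkg', and the types chosen for the status and reserved
fields are assumed so that the header comes out at 64 + 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for the generic libnvdimm package header.
 * ASSUMPTION: only the 64-byte size matters for this sketch; the real
 * definition lives in include/uapi/linux/ndctl.h.
 */
struct sketch_nd_cmd_pkg {
	uint64_t nd_family;
	uint64_t nd_command;	/* carries the PDSM sub-command */
	uint32_t nd_size_in;
	uint32_t nd_size_out;
	uint32_t nd_reserved2[9];
	uint32_t nd_fw_size;
};

/* Hypothetical PDSM envelope: generic header + 8 bytes + payload */
struct sketch_pdsm_cmd_pkg {
	struct sketch_nd_cmd_pkg hdr;
	int32_t  cmd_status;	/* ASSUMED type: out, sub-command status */
	uint16_t reserved[2];	/* ASSUMED type: must be zero */
	uint8_t  payload[];	/* in/out: sub-command data buffer */
} __attribute__((packed));

/* Size of the PDSM-specific header fields, excluding the generic header */
#define SKETCH_PDSM_HDR_SIZE \
	(sizeof(struct sketch_pdsm_cmd_pkg) - sizeof(struct sketch_nd_cmd_pkg))

static_assert(sizeof(struct sketch_nd_cmd_pkg) == 64,
	      "stand-in generic header must be 64 bytes");
static_assert(sizeof(struct sketch_pdsm_cmd_pkg) == 64 + 8,
	      "envelope header must be 64 + 8 bytes");
```

Marking the envelope as packed keeps the payload offset identical for
every compiler that builds user-space code against the header.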

A new function is_cmd_valid() is implemented that reads the args to
papr_scm_ndctl() and performs sanity tests on them. A new function
papr_scm_service_pdsm() is introduced and is called from
papr_scm_ndctl() in case a PDSM request is received via the ND_CMD_CALL
command from libnvdimm.
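
The kind of sanity test described above might look roughly like the
sketch below. It is not the is_cmd_valid() from this patch: the
'sketch_' names and the request struct are hypothetical, and the two
checks shown (sub-command strictly between the MIN and MAX markers,
reserved fields all zero) are assumptions based on the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the sub-command enum; MIN and MAX are markers */
enum sketch_pdsm {
	SKETCH_PDSM_MIN = 0x0,
	SKETCH_PDSM_HEALTH,
	SKETCH_PDSM_MAX,
};

static_assert(SKETCH_PDSM_HEALTH > SKETCH_PDSM_MIN &&
	      SKETCH_PDSM_HEALTH < SKETCH_PDSM_MAX,
	      "a real sub-command must sit between the markers");

/* Hypothetical, flattened view of the fields that get validated */
struct sketch_request {
	uint64_t command;	/* requested sub-command */
	uint16_t reserved[2];	/* must be zero */
};

/* Return true only if the request passes every sanity test */
bool sketch_is_cmd_valid(const struct sketch_request *req)
{
	if (!req)
		return false;

	/* Reject the markers themselves and anything outside them */
	if (req->command <= SKETCH_PDSM_MIN || req->command >= SKETCH_PDSM_MAX)
		return false;

	/* Reserved fields are kept at zero so they can be given meaning later */
	if (req->reserved[0] || req->reserved[1])
		return false;

	return true;
}
```

Doing all validation up front means the servicing function only ever
sees well-formed requests.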

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v11..v12:
* Updated a misleading comment in 'papr_pdsm.h' regarding payload
  size. [ Ira ]

v10..v11:
* Moved inlines 'nd_pdsm_cmd_pkg()' and 'pdsm_cmd_to_payload()' from
  'papr_pdsm.h' header to 'papr_scm.c'. This avoids a potential license
  incompatibility issue with non-GPL-2.0 user-space code trying to
  include the header in its code. [ Ira ]
* Verified papr_pdsm.h with UAPI_HEADER_TEST config.
* Moved the is_cmd_valid() check in papr_scm_ndctl() before check for
  cmd_rc == NULL. This prevents cmd_rc from being updated in case the
  nd-cmd is invalid or unknown.

v9..v10:
* Simplified 'struct nd_pdsm_cmd_pkg' by removing the
  'payload_version' field.
* Removed the corresponding documentation on versioning and backward
  compatibility from 'papr_pdsm.h'
* Reduced the size of reserved fields to 4-bytes making 'struct
  nd_pdsm_cmd_pkg' 64 + 8 bytes long.
* Updated is_cmd_valid() to enforce validation checks on pdsm
  commands. [ Dan Williams ]
* Added check for reserved fields being set to '0' in is_cmd_valid()
  [ Ira ]
* Moved changes for checking cmd_rc == NULL and logging improvements
  to a separate prelim patch [ Ira ].
* Moved  pdsm package validation checks from papr_scm_service_pdsm()
  to is_cmd_valid().
* Marked papr_scm_service_pdsm() return type as 'void' since errors
  are reported in nd_pdsm_cmd_pkg.cmd_status field.

Resend:
* Added ack from Aneesh.

v8..v9:
* Reduced the usage of term SCM replacing it with appropriate
  replacement [ Dan Williams, Aneesh ]
* Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
* s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
* s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
* Minor updates to 'papr_pdsm.h' to replace usage of term 'SCM'.
* Minor update to patch description.

v7..v8:
* Removed the 'payload_offset' field from 'struct
  nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start
  at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
* To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
  'reserved' field of 10-bytes is introduced. [ Aneesh ]
* Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
  [ Ira ]

Resend:
* None

v6..v7 :
* Removed the re-definitions of __packed macro from papr_scm_pdsm.h
  [Mpe].
* Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe].
* Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h
  [Mpe].
* Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]

v5..v6 :
* Changed the usage of the term DSM to PDSM to distinguish it from the
  ACPI term [ Dan Williams ]
* Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct
  to reflect the new terminology.
* Updated the patch description and title to reflect the new terminology.
* Squashed patch to introduce new command family in 'ndctl.h' with
  this patch [ Dan Williams ]
* Updated the papr_scm_pdsm method starting index from 0x1 to 0x0
  [ Dan Williams ]
* Removed redundant license text from the papr_scm_pdsm.h file.
  [ Dan Williams ]
* s/envelop/envelope/ at various places [ Dan Williams ]
* Added '__packed' attribute to command package header to guard
  against different compilers adding padding between the fields.
  [ Dan Williams]
* Converted various pr_debug to dev_dbg [ Dan Williams ]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Updated the patch prefix to 'ndctl/uapi' [Aneesh]

v1..v2 :
* None
---
 arch/powerpc/include/uapi/asm/papr_pdsm.h |  82 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 126 +-
 include/uapi/linux/ndctl.h|   1 +
 3 files changed, 205 insertions(+), 4 deletions(-)
 create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h

diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
new file mode 100644
index ..34d1a41d2406
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -0,0 +1,82 @@
+/* 

[PATCH v12 4/6] powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()

2020-06-08 Thread Vaibhav Jain
Since papr_scm_ndctl() can be called from outside papr_scm, it's
exposed to the possibility of receiving NULL as the value of the
'cmd_rc' argument. This patch updates papr_scm_ndctl() to protect
against that possibility by pointing 'cmd_rc' at a local variable when
cmd_rc == NULL.

Finally, the patch also updates the 'default' case to add a debug log
for unknown 'cmd' values.
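
Outside the kernel, the same defensive pattern can be sketched in a few
lines of standalone C. The function and command names below are
hypothetical and only mimic the shape of papr_scm_ndctl(): a NULL
'cmd_rc' is redirected to a local variable, handled commands return 0,
and unknown commands return -EINVAL without touching the caller's
status.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical command number, for illustration only */
enum { SKETCH_CMD_KNOWN = 1 };

int sketch_ndctl(unsigned int cmd, int *cmd_rc)
{
	int rc;

	/* Use a local variable in case the caller passed a NULL cmd_rc */
	if (!cmd_rc)
		cmd_rc = &rc;

	switch (cmd) {
	case SKETCH_CMD_KNOWN:
		*cmd_rc = 0;	/* command-specific status */
		break;
	default:
		/* Unknown command: the caller's status is left untouched */
		return -EINVAL;
	}

	return 0;
}

/* Usage: both a NULL and a non-NULL cmd_rc are accepted */
void sketch_ndctl_example(void)
{
	int status = -1;

	assert(sketch_ndctl(SKETCH_CMD_KNOWN, NULL) == 0);
	assert(sketch_ndctl(SKETCH_CMD_KNOWN, &status) == 0 && status == 0);
}
```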

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Reviewed-by: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v11..v12:
* Added ack from Ira

v10..v11:
* Instead of returning *cmd_rc just return '0' in case nd_cmd is
  handled. In case of unknown nd-cmd return -EINVAL
  [ Ira and Dan Williams ]
* Updated patch description.

v9..v10:
* New patch in the series
---
 arch/powerpc/platforms/pseries/papr_scm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 0c091622b15e..692ad3d79826 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -355,11 +355,16 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 {
 	struct nd_cmd_get_config_size *get_size_hdr;
 	struct papr_scm_priv *p;
+	int rc;
 
 	/* Only dimm-specific calls are supported atm */
 	if (!nvdimm)
 		return -EINVAL;
 
+	/* Use a local variable in case cmd_rc pointer is NULL */
+	if (!cmd_rc)
+		cmd_rc = &rc;
+
 	p = nvdimm_provider_data(nvdimm);
 
 	switch (cmd) {
@@ -381,6 +386,7 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 		break;
 
 	default:
+		dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
 		return -EINVAL;
 	}
 
-- 
2.26.2



[PATCH v12 3/6] powerpc/papr_scm: Fetch nvdimm health information from PHYP

2020-06-08 Thread Vaibhav Jain
Implement support for fetching nvdimm health information via the
H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
of 64-bit bitmaps, the bitwise AND of which is then stored in
'struct papr_scm_priv' and subsequently partially exposed to
user-space via the newly introduced dimm-specific attribute
'papr/flags'. Since the hcall is costly, the health information is
cached and only re-queried 60s after the previous successful hcall.
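
The combine-and-cache behaviour described above can be sketched in
user-space C as shown below. This is an illustration under assumptions,
not the code from this patch: every 'sketch_' name is hypothetical, time
is passed in as plain seconds instead of jiffies, the hcall is replaced
by a function pointer, and the locking that the driver needs is left
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Minimum gap between two real queries, taken from the description */
#define SKETCH_MIN_QUERY_INTERVAL_S 60

struct sketch_priv {
	uint64_t health_bitmap;	/* cached (bitmap & valid) */
	uint64_t last_query_s;	/* time of the last successful query */
	bool     have_result;	/* false until the first success */
};

/* Stand-in for the hcall: fills both bitmaps, returns 0 on success */
typedef int (*sketch_hcall_fn)(uint64_t *bitmap, uint64_t *valid);

int sketch_query_health(struct sketch_priv *p, uint64_t now_s,
			sketch_hcall_fn hcall)
{
	uint64_t bitmap = 0, valid = 0;
	int rc;

	assert(p && hcall);

	/* Serve from the cache if the last success is recent enough */
	if (p->have_result &&
	    now_s - p->last_query_s < SKETCH_MIN_QUERY_INTERVAL_S)
		return 0;

	rc = hcall(&bitmap, &valid);
	if (rc)
		return rc;	/* keep the old cache on failure */

	/* Only bits flagged as valid are kept */
	p->health_bitmap = bitmap & valid;
	p->last_query_s = now_s;
	p->have_result = true;
	return 0;
}

/* Fake hcall used to exercise the sketch; counts how often it runs */
int sketch_hcall_count;

int sketch_fake_hcall(uint64_t *bitmap, uint64_t *valid)
{
	sketch_hcall_count++;
	*bitmap = 0xF0;
	*valid = 0x3C;
	return 0;
}
```

Calling sketch_query_health() twice within the interval triggers the
fake hcall only once, which is the saving the cache is meant to provide.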

The patch also adds documentation text describing the flags reported
by the new sysfs attribute 'papr/flags' at
Documentation/ABI/testing/sysfs-bus-papr-pmem.

[1] commit 58b278f568f0 ("powerpc: Provide initial documentation for
PAPR hcalls")

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v11..v12:
* None

v10..v11:
* None

v9..v10:
* Removed an avoidable 'goto' in __drc_pmem_query_health. [ Ira ].

Resend:
* Added ack from Aneesh.

v8..v9:
* Rename some variables and defines to reduce usage of term SCM
  replacing it with PMEM [Dan Williams, Aneesh]
* s/PAPR_SCM_DIMM/PAPR_PMEM/g
* s/papr_scm_nd_attributes/papr_nd_attributes/g
* s/papr_scm_nd_attribute_group/papr_nd_attribute_group/g
* s/papr_scm_dimm_attr_groups/papr_nd_attribute_groups/g
* Renamed file sysfs-bus-papr-scm to sysfs-bus-papr-pmem

v7..v8:
* Update type of variable 'rc' in __drc_pmem_query_health() and
  drc_pmem_query_health() to long and int respectively. [ Ira ]
* Updated the patch description to s/64 bit Big Endian Number/64-bit
  bitmap/ [ Ira, Aneesh ].

Resend:
* None

v6..v7 :
* Used the exported seq_buf_printf() function to generate content for
  'papr/flags'
* Moved the PAPR_SCM_DIMM_* bit-flags macro definitions to papr_scm.c
  and removed the papr_scm.h file [Mpe]
* Some minor consistency issues in sysfs-bus-papr-scm
  documentation. [Mpe]
* s/dimm_mutex/health_mutex/g [Mpe]
* Split drc_pmem_query_health() into two functions, one of which takes
  care of caching and locking. [Mpe]
* Fixed a local copy creation of dimm health information using
  READ_ONCE(). [Mpe]

v5..v6 :
* Change the flags sysfs attribute from 'papr_flags' to 'papr/flags'
  [Dan Williams]
* Include documentation for 'papr/flags' attr [Dan Williams]
* Change flag 'save_fail' to 'flush_fail' [Dan Williams]
* Caching of health bitmap to reduce expensive hcalls [Dan Williams]
* Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe]
* Replaced two __be64 integers from papr_scm_priv to a single u64
  integer [Mpe]
* Updated patch description to reflect the changes made in this
  version.
* Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from
  flags_show() [Dan Williams]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
 NVDIMM unarmed [Aneesh]

v1..v2 :
* New patch in the series.
---
 Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 +++
 arch/powerpc/platforms/pseries/papr_scm.c | 168 +-
 2 files changed, 193 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem

diff --git a/Documentation/ABI/testing/sysfs-bus-papr-pmem b/Documentation/ABI/testing/sysfs-bus-papr-pmem
new file mode 100644
index ..5b10d036a8d4
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-papr-pmem
@@ -0,0 +1,27 @@
+What:  /sys/bus/nd/devices/nmemX/papr/flags
+Date:  Apr, 2020
+KernelVersion: v5.8
+Contact:   linuxppc-dev , 
linux-nvd...@lists.01.org,
+Description:
+   (RO) Report flags indicating various states of a
+   papr-pmem NVDIMM device. Each flag maps to a one or
+   more bits set in the dimm-health-bitmap retrieved in
+   response to H_SCM_HEALTH hcall. The details of the bit
+   flags returned in response to this hcall is available
+   at 'Documentation/powerpc/papr_hcalls.rst' . Below are
+   the flags reported in this sysfs file:
+
+   * "not_armed"   : Indicates that NVDIMM contents will not
+ survive a power cycle.
+   * "flush_fail"  : Indicates that NVDIMM contents
+ couldn't be flushed during last
+ shut-down event.
+   * "restore_fail": Indicates that NVDIMM contents
+ couldn't be restored during NVDIMM
+ initialization.
+   * "encrypted"   : NVDIMM contents are encrypted.
+   * "smart_notify": There is health event for the NVDIMM.
+   * "scrubbed": Indicating that contents of the
+ NVDIMM have been scrubbed.
+   * "locked"  : Indicating that NVDIMM contents cant
+ be modified until next power cycle.
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c 
b/arch/powerpc/platforms/ps

[PATCH v12 1/6] powerpc: Document details on H_SCM_HEALTH hcall

2020-06-08 Thread Vaibhav Jain
Add documentation to 'papr_hcalls.rst' describing the bitmap flags
that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
specification.
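
The documentation added by this patch numbers the bitmap bits starting
from the most significant bit. A small helper along the following lines
shows how such a bit number maps to a mask. The 'sketch_' names are
hypothetical, and the full 64-bit value used in the usage note is an
assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask for bit 'n' when bit 0 is the most significant bit of 64 */
static inline uint64_t sketch_msb0_bit(unsigned int n)
{
	assert(n < 64);
	return UINT64_C(1) << (63 - n);
}

/* Test whether bit 'n' (MSB-first numbering) is asserted in 'bitmap' */
static inline bool sketch_msb0_test(uint64_t bitmap, unsigned int n)
{
	return (bitmap & sketch_msb0_bit(n)) != 0;
}
```

With this numbering, an assumed value of 0xC400000000000000 has bits 0,
1 and 5 asserted and every other bit clear.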

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Acked-by: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v11..v12:
* None

v10..v11:
* None

v9..v10:
* Added ack from Ira.

Resend:
* None

v8..v9:
* s/SCM/PMEM device. [ Dan Williams, Aneesh ]

v7..v8:
* Added a clarification on bit-ordering of Health Bitmap

Resend:
* None

v6..v7:
* None

v5..v6:
* New patch in the series
---
 Documentation/powerpc/papr_hcalls.rst | 46 ---
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
index 3493631a60f8..48fcf1255a33 100644
--- a/Documentation/powerpc/papr_hcalls.rst
+++ b/Documentation/powerpc/papr_hcalls.rst
@@ -220,13 +220,51 @@ from the LPAR memory.
 **H_SCM_HEALTH**
 
 | Input: drcIndex
-| Out: *health-bitmap, health-bit-valid-bitmap*
+| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
 | Return Value: *H_Success, H_Parameter, H_Hardware*
 
 Given a DRC Index return the info on predictive failure and overall health of
-the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
-failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
-valid.
+the PMEM device. The asserted bits in the health-bitmap indicate one or more states
+(described in table below) of the PMEM device and health-bit-valid-bitmap indicate
+which bits in health-bitmap are valid. The bits are reported in
+reverse bit ordering for example a value of 0xC400
+indicates bits 0, 1, and 5 are valid.
+
+Health Bitmap Flags:
+
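++------+-----------------------------------------------------------------------+
+|  Bit |                              Definition                               |
++======+=======================================================================+
+|  00  | PMEM device is unable to persist memory contents.                     |
+|      | If the system is powered down, nothing will be saved.                 |
++------+-----------------------------------------------------------------------+
+|  01  | PMEM device failed to persist memory contents. Either contents were   |
+|      | not saved successfully on power down or were not restored properly on |
+|      | power up.                                                             |
++------+-----------------------------------------------------------------------+
+|  02  | PMEM device contents are persisted from previous IPL. The data from   |
+|      | the last boot were successfully restored.                             |
++------+-----------------------------------------------------------------------+
+|  03  | PMEM device contents are not persisted from previous IPL. There was no|
+|      | data to restore from the last boot.                                   |
++------+-----------------------------------------------------------------------+
+|  04  | PMEM device memory life remaining is critically low                   |
++------+-----------------------------------------------------------------------+
+|  05  | PMEM device will be garded off next IPL due to failure                |
++------+-----------------------------------------------------------------------+
+|  06  | PMEM device contents cannot persist due to current platform health    |
+|      | status. A hardware failure may prevent data from being saved or       |
+|      | restored.                                                             |
++------+-----------------------------------------------------------------------+
+|  07  | PMEM device is unable to persist memory contents in certain conditions|
++------+-----------------------------------------------------------------------+
+|  08  | PMEM device is encrypted                                              |
++------+-----------------------------------------------------------------------+
+|  09  | PMEM device has successfully completed a requested erase or secure    |
+|      | erase procedure.                                                      |
++------+-----------------------------------------------------------------------+
+|10:63 | Reserved / Unused                                                     |
++------+-----------------------------------------------------------------------+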
 
 **H_SCM_PERFORMANCE_STATS**
 
-- 
2.26.2



[PATCH v12 0/6] powerpc/papr_scm: Add support for reporting nvdimm health

2020-06-08 Thread Vaibhav Jain
Changes since v11 [1]:
* Minor update to 'papr_pdsm.h' fixing a misleading comment about
  'possible' padding being added by GCC which doesn't apply in case
  structs are marked as __packed.
* Fix the order of initialization of 'struct nd_papr_pdsm_health' in
  papr_pdsm_health().
* Added acks from Ira for various patches.

[1] https://lore.kernel.org/linux-nvdimm/20200607131339.476036-1-vaib...@linux.ibm.com
---

The PAPR standard[2][4] provides mechanisms to query the health and
performance stats of an NVDIMM via various hcalls as described in
Ref[3].  Until now these stats were never available nor exposed to the
user-space tools like 'ndctl'. This is partly due to PAPR platform not
having support for ACPI and NFIT. Hence 'ndctl' is unable to query and
report the dimm health status and a user had no way to determine the
current health status of an NVDIMM.

To overcome this limitation, this patch-set updates the papr_scm kernel
module to query and fetch NVDIMM health stats using hcalls described
in Ref[3].  These health and performance stats are then exposed to
userspace via sysfs and PAPR-NVDIMM-Specific-Methods (PDSM) issued by
libndctl.

These changes coupled with proposed ndctl changes located at Ref[5]
should provide a way for the user to retrieve NVDIMM health status
using ndctl.

Below is a sample output using proposed kernel + ndctl for PAPR NVDIMM
in an emulation environment:

 # ndctl list -DH
[
  {
    "dev":"nmem0",
    "health":{
      "health_state":"fatal",
      "shutdown_state":"dirty"
    }
  }
]

Dimm health report output on a pseries guest lpar with vPMEM or HMS
based NVDIMMs that are in perfectly healthy condition:

 # ndctl list -d nmem0 -H
[
  {
    "dev":"nmem0",
    "health":{
      "health_state":"ok",
      "shutdown_state":"clean"
    }
  }
]

PAPR NVDIMM-Specific-Methods(PDSM)
==================================

PDSM requests are issued by vendor specific code in libndctl to
execute certain operations or fetch information from NVDIMMs. PDSM
requests can be sent to the papr_scm module via libndctl (userspace) and
libnvdimm (kernel) using the ND_CMD_CALL ioctl command which can be
handled in the dimm control function papr_scm_ndctl(). The current
patchset proposes a single PDSM to retrieve NVDIMM health, defined in
the newly introduced uapi header named 'papr_pdsm.h'. Support for
more PDSMs will be added in the future.

Structure of the patch-set
==========================

The patch-set starts with a doc patch documenting details of hcall
H_SCM_HEALTH. Second patch exports kernel symbol seq_buf_printf()
that's used in subsequent patches to generate sysfs attribute content.

Third patch implements support for fetching NVDIMM health information
from PHYP and partially exposing it to user-space via a NVDIMM sysfs
flag.

Fourth patch updates papr_scm_ndctl() to handle a possible error case
and also improve debug logging.

Fifth patch deals with implementing support for servicing PDSM
commands in papr_scm module.

Finally the last patch implements support for servicing PDSM
'PAPR_PDSM_HEALTH' that returns the NVDIMM health information to
libndctl.

References:
[2] "Power Architecture Platform Reference"
  https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[3] commit 58b278f568f0
 ("powerpc: Provide initial documentation for PAPR hcalls")
[4] "Linux on Power Architecture Platform Reference"
 https://members.openpowerfoundation.org/document/dl/469
[5] https://github.com/vaibhav92/ndctl/tree/papr_scm_health_v12

---

Vaibhav Jain (6):
  powerpc: Document details on H_SCM_HEALTH hcall
  seq_buf: Export seq_buf_printf
  powerpc/papr_scm: Fetch nvdimm health information from PHYP
  powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
  ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
  powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH

 Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 ++
 Documentation/powerpc/papr_hcalls.rst |  46 ++-
 arch/powerpc/include/uapi/asm/papr_pdsm.h | 125 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 373 +-
 include/uapi/linux/ndctl.h|   1 +
 lib/seq_buf.c |   1 +
 6 files changed, 562 insertions(+), 11 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem
 create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h

-- 
2.26.2



Re: [PATCH v11 6/6] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH

2020-06-08 Thread Vaibhav Jain
Thanks Ira,

Ira Weiny  writes:

> On Sun, Jun 07, 2020 at 06:43:39PM +0530, Vaibhav Jain wrote:
>> This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
>> that returns a newly introduced 'struct nd_papr_pdsm_health' instance
>> containing dimm health information back to user space in response to
>> ND_CMD_CALL. This functionality is implemented in newly introduced
>> papr_pdsm_health() that queries the nvdimm health information and
>> then copies this information to the package payload whose layout is
>> defined by 'struct nd_papr_pdsm_health'.
>> 
>> Cc: "Aneesh Kumar K . V" 
>> Cc: Dan Williams 
>> Cc: Michael Ellerman 
>> Cc: Ira Weiny 
>> Signed-off-by: Vaibhav Jain 
>> ---
>> Changelog:
>> 
>> v10..v11:
>> * Changed the definition of 'struct nd_papr_pdsm_health' to a maximal
>>   struct 184 bytes in size [ Dan Williams ]
>> * Added new field 'extension_flags' to 'struct nd_papr_pdsm_health'
>>   [ Dan Williams ]
>> * Updated papr_pdsm_health() to set field 'extension_flags' to 0.
>> * Introduced a define ND_PDSM_PAYLOAD_MAX_SIZE that indicates the
>>   maximum size of a payload.
>> * Fixed a suspicious conversion from u64 to u8 in papr_pdsm_health
>>   that was preventing correct initialization of 'struct
>>   nd_papr_pdsm_health'. [ Ira ]
>> 
>> v9..v10:
>> * Removed code in papr_pdsm_health that performed validation on pdsm
>>   payload version and corrosponding struct and defines used for
>>   validation of payload version.
>> * Dropped usage of struct papr_pdsm_health in 'struct
>>   papr_scm_priv'. Instead papr_psdm_health() now uses
>>   'papr_scm_priv.health_bitmap' to populate the pdsm payload.
>> * Above change also fixes the problem where this patch was removing
>>   the code that was previously introduced in this patch-series.
>>   [ Ira ]
>> * Introduced a new def ND_PDSM_ENVELOPE_HDR_SIZE that indicates the
>>   space allocated to 'struct nd_pdsm_cmd_pkg' fields except 'struct
>>   nd_cmd_pkg'. This def is useful in validating payload sizes.
>> * Reworked papr_pdsm_health() to enforce a specific payload size for
>>   'PAPR_PDSM_HEALTH' pdsm request.
>> 
>> Resend:
>> * Added ack from Aneesh.
>> 
>> v8..v9:
>> * s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g  [ Dan , Aneesh ]
>> * s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
>> * Renamed papr_scm_get_health() to papr_psdm_health()
>> * Updated patch description to replace papr-scm dimm with nvdimm.
>> 
>> v7..v8:
>> * None
>> 
>> Resend:
>> * None
>> 
>> v6..v7:
>> * Updated flags_show() to use seq_buf_printf(). [Mpe]
>> * Updated papr_scm_get_health() to use newly introduced
>>   __drc_pmem_query_health() bypassing the cache [Mpe].
>> 
>> v5..v6:
>> * Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
>>   gaurd against possibility of different compilers adding different
>>   paddings to the struct [ Dan Williams ]
>> 
>> * Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
>>   'bool' and also updated drc_pmem_query_health() to take this into
>>   account. [ Dan Williams ]
>> 
>> v4..v5:
>> * None
>> 
>> v3..v4:
>> * Call the DSM_PAPR_SCM_HEALTH service function from
>>   papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]
>> 
>> v2..v3:
>> * Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
>>   as its exported to the userspace [Aneesh]
>> * Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
>>   from enum to #defines [Aneesh]
>> 
>> v1..v2:
>> * New patch in the series
>> ---
>>  arch/powerpc/include/uapi/asm/papr_pdsm.h | 43 ++
>>  arch/powerpc/platforms/pseries/papr_scm.c | 71 +++
>>  2 files changed, 114 insertions(+)
>> 
>> diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h 
>> b/arch/powerpc/include/uapi/asm/papr_pdsm.h
>> index df2447455cfe..12c7aa5ee8bf 100644
>> --- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
>> +++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
>> @@ -72,13 +72,56 @@ struct nd_pdsm_cmd_pkg {
>>  __u8 payload[]; /* In/Out: Sub-cmd data buffer */
>>  } __packed;
>>  
>> +/* Calculate size used by the pdsm header fields minus 'struct nd_cmd_pkg' 
>> */
>> +#define ND_PDSM_HDR_SIZE \
>> +(sizeof(struct nd_pdsm_cmd_pkg) - sizeof(struct nd_cmd_pkg))
>> +
>> +/* Max payload size that we can handle */
>> +#define ND_PDSM_PAYLOAD_MAX_SIZE 184
>> +
>>  /*
>>   * Methods to be embedded in ND_CMD_CALL request. These are sent to the 
>> kernel
>>   * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
>>   */
>>  enum papr_pdsm {
>>  PAPR_PDSM_MIN = 0x0,
>> +PAPR_PDSM_HEALTH,
>>  PAPR_PDSM_MAX,
>>  };
>>  
>> +/* Various nvdimm health indicators */
>> +#define PAPR_PDSM_DIMM_HEALTHY   0
>> +#define PAPR_PDSM_DIMM_UNHEALTHY 1
>> +#define PAPR_PDSM_DIMM_CRITICAL  2
>> +#define PAPR_PDSM_DIMM_FATAL 3
>> +
>> +/*
>> + * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH
>> + * Various flags indicate the health status of the dimm.
>> + *
>> + * extensi

Re: [PATCH v11 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods

2020-06-08 Thread Vaibhav Jain


Ira Weiny  writes:

> On Sun, Jun 07, 2020 at 06:43:38PM +0530, Vaibhav Jain wrote:
>> Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
>> module and add the command family NVDIMM_FAMILY_PAPR to the white list
>> of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the
>> nvdimm command mask and implement necessary scaffolding in the module
>> to handle ND_CMD_CALL ioctl and PDSM requests that we receive.
>> 
>> The layout of the PDSM request as we expect from libnvdimm/libndctl is
>> described in newly introduced uapi header 'papr_pdsm.h' which
>> defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used
>> to communicate the PDSM request via member
>> 'nd_cmd_pkg.nd_command' and size of payload that need to be
>> sent/received for servicing the PDSM.
>> 
>> A new function is_cmd_valid() is implemented that reads the args to
>> papr_scm_ndctl() and performs sanity tests on them. A new function
>> papr_scm_service_pdsm() is introduced and is called from
>> papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
>> command from libnvdimm.
>> 
>> Cc: "Aneesh Kumar K . V" 
>> Cc: Dan Williams 
>> Cc: Michael Ellerman 
>> Cc: Ira Weiny 
>> Signed-off-by: Vaibhav Jain 
>> ---
>> Changelog:
>> 
>> v10..v11:
>> * Moved in-lines 'nd_pdsm_cmd_pkg()' and 'pdsm_cmd_to_payload()' from
>>   'papr_pdsm.h' header to 'papr_scm.c'. The avoids a potential license
>>   incompatibility issue with non-GPL-2.0 user-space code trying to
>>   include the header in its code. [ Ira ]
>> * Verified papr_pdsm.h with UAPI_HEADER_TEST config.
>> * Moved the is_cmd_valid() check in papr_scm_ndctl() before check for
>>   cmd_rc == NULL. This prevents cmd_rc to be updated in case the
>>   nd-cmd is invalid or unknown.
>> 
>> v9..v10:
>> * Simplified 'struct nd_pdsm_cmd_pkg' by removing the
>>   'payload_version' field.
>> * Removed the corrosponding documentation on versioning and backward
>>   compatibility from 'papr_pdsm.h'
>> * Reduced the size of reserved fields to 4-bytes making 'struct
>>   nd_pdsm_cmd_pkg' 64 + 8 bytes long.
>> * Updated is_cmd_valid() to enforce validation checks on pdsm
>>   commands. [ Dan Williams ]
>> * Added check for reserved fields being set to '0' in is_cmd_valid()
>>   [ Ira ]
>> * Moved changes for checking cmd_rc == NULL and logging improvements
>>   to a separate prelim patch [ Ira ].
>> * Moved  pdsm package validation checks from papr_scm_service_pdsm()
>>   to is_cmd_valid().
>> * Marked papr_scm_service_pdsm() return type as 'void' since errors
>>   are reported in nd_pdsm_cmd_pkg.cmd_status field.
>> 
>> Resend:
>> * Added ack from Aneesh.
>> 
>> v8..v9:
>> * Reduced the usage of term SCM replacing it with appropriate
>>   replacement [ Dan Williams, Aneesh ]
>> * Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
>> * s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
>> * s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
>> * Minor updates to 'papr_psdm.h' to replace usage of term 'SCM'.
>> * Minor update to patch description.
>> 
>> v7..v8:
>> * Removed the 'payload_offset' field from 'struct
>>   nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start
>>   at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
>> * To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
>>   'reserved' field of 10-bytes is introduced. [ Aneesh ]
>> * Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
>>   [ Ira ]
>> 
>> Resend:
>> * None
>> 
>> v6..v7 :
>> * Removed the re-definitions of __packed macro from papr_scm_pdsm.h
>>   [Mpe].
>> * Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe].
>> * Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h
>>   [Mpe].
>> * Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]
>> 
>> v5..v6 :
>> * Changed the usage of the term DSM to PDSM to distinguish it from the
>>   ACPI term [ Dan Williams ]
>> * Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct
>>   to reflect the new terminology.
>> * Updated the patch description and title to reflect the new terminology.
>> * Squashed patch to introduce new command family in 'ndctl.h' with
>>   this patch [ Dan Williams ]
>> * Updated the papr_scm_pdsm method starting index from 0x1 to 0x0
>>   [ Dan Williams ]
>> * Removed redundant license text from the papr_scm_psdm.h file.
>>   [ Dan Williams ]
>> * s/envelop/envelope/ at various places [ Dan Williams ]
>> * Added '__packed' attribute to command package header to gaurd
>>   against different compiler adding paddings between the fields.
>>   [ Dan Williams]
>> * Converted various pr_debug to dev_debug [ Dan Williams ]
>> 
>> v4..v5 :
>> * None
>> 
>> v3..v4 :
>> * None
>> 
>> v2..v3 :
>> * Updated the patch prefix to 'ndctl/uapi' [Aneesh]
>> 
>> v1..v2 :
>> * None
>> ---
>>  arch/powerpc/include/uapi/asm/papr_pdsm.h |  84 +++
>>  arch/powerpc/platforms/pseries/papr_scm.c | 126 +-
>>  include/uapi/linux/ndctl.h|   1 

Re: [PATCH v11 3/6] powerpc/papr_scm: Fetch nvdimm health information from PHYP

2020-06-08 Thread Vaibhav Jain
Hi Ira,

During v9 you had provided your ack to this patch [1] and had also made
a review comment on a later patch regarding an avoidable 'goto'
statement [2]. I have since updated the patch to address that review
comment. Can you please provide your ack to this patch too?

[1]
https://lore.kernel.org/linux-nvdimm/20200603231814.gk1505...@iweiny-desk2.sc.intel.com/T/#m668d7b35a2394104f11afdae5951e420a8ccffe6
[2] "I missed this...  probably did not need the goto in the first patch?"
https://lore.kernel.org/linux-nvdimm/20200603231814.gk1505...@iweiny-desk2.sc.intel.com/T/#m1ebdd309ac0cb6f47d3b574b8d05374b21ff75df


Thanks,
~ Vaibhav


Vaibhav Jain  writes:

> Implement support for fetching nvdimm health information via
> H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
> of 64-bit bitmap, bitwise-and of which is then stored in
> 'struct papr_scm_priv' and subsequently partially exposed to
> user-space via newly introduced dimm specific attribute
> 'papr/flags'. Since the hcall is costly, the health information is
> cached and only re-queried, 60s after the previous successful hcall.
>
> The patch also adds a  documentation text describing flags reported by
> the the new sysfs attribute 'papr/flags' is also introduced at
> Documentation/ABI/testing/sysfs-bus-papr-pmem.
>
> [1] commit 58b278f568f0 ("powerpc: Provide initial documentation for
> PAPR hcalls")
>
> Cc: "Aneesh Kumar K . V" 
> Cc: Dan Williams 
> Cc: Michael Ellerman 
> Cc: Ira Weiny 
> Signed-off-by: Vaibhav Jain 
> ---
> Changelog:
>
> v10..v11:
> * None
>
> v9..v10:
> * Removed an avoidable 'goto' in __drc_pmem_query_health. [ Ira ].
>
> Resend:
> * Added ack from Aneesh.
>
> v8..v9:
> * Rename some variables and defines to reduce usage of term SCM
>   replacing it with PMEM [Dan Williams, Aneesh]
> * s/PAPR_SCM_DIMM/PAPR_PMEM/g
> * s/papr_scm_nd_attributes/papr_nd_attributes/g
> * s/papr_scm_nd_attribute_group/papr_nd_attribute_group/g
> * s/papr_scm_dimm_attr_groups/papr_nd_attribute_groups/g
> * Renamed file sysfs-bus-papr-scm to sysfs-bus-papr-pmem
>
> v7..v8:
> * Update type of variable 'rc' in __drc_pmem_query_health() and
>   drc_pmem_query_health() to long and int respectively. [ Ira ]
> * Updated the patch description to s/64 bit Big Endian Number/64-bit
>   bitmap/ [ Ira, Aneesh ].
>
> Resend:
> * None
>
> v6..v7 :
> * Used the exported buf_seq_printf() function to generate content for
>   'papr/flags'
> * Moved the PAPR_SCM_DIMM_* bit-flags macro definitions to papr_scm.c
>   and removed the papr_scm.h file [Mpe]
> * Fixed some minor consistency issues in sysfs-bus-papr-scm
>   documentation. [Mpe]
> * s/dimm_mutex/health_mutex/g [Mpe]
> * Split drc_pmem_query_health() into two function one of which takes
>   care of caching and locking. [Mpe]
> * Fixed a local copy creation of dimm health information using
>   READ_ONCE(). [Mpe]
>
> v5..v6 :
> * Change the flags sysfs attribute from 'papr_flags' to 'papr/flags'
>   [Dan Williams]
> * Include documentation for 'papr/flags' attr [Dan Williams]
> * Change flag 'save_fail' to 'flush_fail' [Dan Williams]
> * Caching of health bitmap to reduce expensive hcalls [Dan Williams]
> * Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe]
> * Replaced two __be64 integers from papr_scm_priv to a single u64
>   integer [Mpe]
> * Updated patch description to reflect the changes made in this
>   version.
> * Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from
>   flags_show() [Dan Williams]
>
> v4..v5 :
> * None
>
> v3..v4 :
> * None
>
> v2..v3 :
> * Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
>NVDIMM unarmed [Aneesh]
>
> v1..v2 :
> * New patch in the series.
> ---
>  Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 +++
>  arch/powerpc/platforms/pseries/papr_scm.c | 168 +-
>  2 files changed, 193 insertions(+), 2 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-papr-pmem b/Documentation/ABI/testing/sysfs-bus-papr-pmem
> new file mode 100644
> index ..5b10d036a8d4
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-bus-papr-pmem
> @@ -0,0 +1,27 @@
> +What:/sys/bus/nd/devices/nmemX/papr/flags
> +Date:Apr, 2020
> +KernelVersion:   v5.8
> +Contact: linuxppc-dev , 
> linux-nvd...@lists.01.org,
> +Description:
> + (RO) Report flags indicating various states of a
> + papr-pmem NVDIMM device. Each flag maps to a one or
> + more bits set in the dimm-health-bitmap retrieved in
> + response to H_SCM_HEALTH hcall. The details of the bit
> + flags returned in response to this hcall is available
> + at 'Documentation/powerpc/papr_hcalls.rst' . Below are
> + the flags reported in this sysfs file:
> +
> + * "not_armed"   : Indicates that NVDIMM contents will not
> +   survive a 

[PATCH] selftests: powerpc: Fix online CPU selection

2020-06-08 Thread Harish
On systems with large number of cpus, test fails trying to set
affinity by calling sched_setaffinity() with smaller size for
cpuset. This patch fixes it by making sure that the size of
allocated cpu set is dependent on the number of CPUs as reported
by get_nprocs().

Reported-by: Shirisha Ganta 
Signed-off-by: Harish 
Signed-off-by: Sandipan Das 
---
 .../powerpc/benchmarks/context_switch.c| 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/powerpc/benchmarks/context_switch.c b/tools/testing/selftests/powerpc/benchmarks/context_switch.c
index a2e8c9da7fa5..de6c49d6f88f 100644
--- a/tools/testing/selftests/powerpc/benchmarks/context_switch.c
+++ b/tools/testing/selftests/powerpc/benchmarks/context_switch.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include <sys/sysinfo.h>
 #include 
 #include 
 #include 
@@ -104,8 +105,9 @@ static void start_thread_on(void *(*fn)(void *), void *arg, unsigned long cpu)
 
 static void start_process_on(void *(*fn)(void *), void *arg, unsigned long cpu)
 {
-   int pid;
-   cpu_set_t cpuset;
+   int pid, ncpus;
+   cpu_set_t *cpuset;
+   size_t size;
 
pid = fork();
if (pid == -1) {
@@ -116,12 +118,16 @@ static void start_process_on(void *(*fn)(void *), void *arg, unsigned long cpu)
if (pid)
return;
 
-   CPU_ZERO(&cpuset);
-   CPU_SET(cpu, &cpuset);
+   ncpus = get_nprocs();
+   size = CPU_ALLOC_SIZE(ncpus);
+   cpuset = CPU_ALLOC(ncpus);
+   CPU_ZERO_S(size, cpuset);
+   CPU_SET_S(cpu, size, cpuset);
 
-   if (sched_setaffinity(0, sizeof(cpuset), &cpuset)) {
+   if (sched_setaffinity(0, size, cpuset)) {
perror("sched_setaffinity");
-   exit(1);
+   CPU_FREE(cpuset);
+   exit(-1);
}
 
fn(arg);
-- 
2.24.1
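
For readers following the thread, here is a minimal user-space sketch of the technique the patch describes: size the CPU set from the CPU count, then pass that size to sched_setaffinity(). The helper name pin_self_to_cpu() and the bounds check are assumptions added for illustration and are not part of the patch. Note that the CPU count has to be read before the set size is computed from it.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>
#include <sys/sysinfo.h>

/*
 * Hypothetical helper (not from the patch): pin the calling process to
 * 'cpu' using a dynamically sized CPU set.
 * Returns 0 on success and -1 on failure.
 */
static int pin_self_to_cpu(int cpu)
{
	/* Read the CPU count first; the set size below is derived from it. */
	int ncpus = get_nprocs();
	size_t size;
	cpu_set_t *cpuset;
	int rc;

	/* Reject CPU numbers that cannot fit in the set we allocate. */
	if (cpu < 0 || cpu >= ncpus)
		return -1;

	size = CPU_ALLOC_SIZE(ncpus);
	cpuset = CPU_ALLOC(ncpus);
	if (!cpuset)
		return -1;

	CPU_ZERO_S(size, cpuset);
	CPU_SET_S(cpu, size, cpuset);

	/* Pass the allocated size, not sizeof(cpu_set_t). */
	rc = sched_setaffinity(0, size, cpuset);
	CPU_FREE(cpuset);

	return rc ? -1 : 0;
}
```

Unlike the patch, this sketch frees the set on both the success and failure paths, which may be preferable in a helper that returns to its caller instead of running a benchmark loop.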



Re: [PATCH v11 6/6] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH

2020-06-08 Thread Ira Weiny
On Sun, Jun 07, 2020 at 06:43:39PM +0530, Vaibhav Jain wrote:
> This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
> that returns a newly introduced 'struct nd_papr_pdsm_health' instance
> containing dimm health information back to user space in response to
> ND_CMD_CALL. This functionality is implemented in newly introduced
> papr_pdsm_health() that queries the nvdimm health information and
> then copies this information to the package payload whose layout is
> defined by 'struct nd_papr_pdsm_health'.
> 
> Cc: "Aneesh Kumar K . V" 
> Cc: Dan Williams 
> Cc: Michael Ellerman 
> Cc: Ira Weiny 
> Signed-off-by: Vaibhav Jain 
> ---
> Changelog:
> 
> v10..v11:
> * Changed the definition of 'struct nd_papr_pdsm_health' to a maximal
>   struct 184 bytes in size [ Dan Williams ]
> * Added new field 'extension_flags' to 'struct nd_papr_pdsm_health'
>   [ Dan Williams ]
> * Updated papr_pdsm_health() to set field 'extension_flags' to 0.
> * Introduced a define ND_PDSM_PAYLOAD_MAX_SIZE that indicates the
>   maximum size of a payload.
> * Fixed a suspicious conversion from u64 to u8 in papr_pdsm_health
>   that was preventing correct initialization of 'struct
>   nd_papr_pdsm_health'. [ Ira ]
> 
> v9..v10:
> * Removed code in papr_pdsm_health that performed validation on pdsm
>   payload version and corresponding struct and defines used for
>   validation of payload version.
> * Dropped usage of struct papr_pdsm_health in 'struct
>   papr_scm_priv'. Instead papr_pdsm_health() now uses
>   'papr_scm_priv.health_bitmap' to populate the pdsm payload.
> * Above change also fixes the problem where this patch was removing
>   the code that was previously introduced in this patch-series.
>   [ Ira ]
> * Introduced a new def ND_PDSM_ENVELOPE_HDR_SIZE that indicates the
>   space allocated to 'struct nd_pdsm_cmd_pkg' fields except 'struct
>   nd_cmd_pkg'. This def is useful in validating payload sizes.
> * Reworked papr_pdsm_health() to enforce a specific payload size for
>   'PAPR_PDSM_HEALTH' pdsm request.
> 
> Resend:
> * Added ack from Aneesh.
> 
> v8..v9:
> * s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g  [ Dan , Aneesh ]
> * s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
> * Renamed papr_scm_get_health() to papr_pdsm_health()
> * Updated patch description to replace papr-scm dimm with nvdimm.
> 
> v7..v8:
> * None
> 
> Resend:
> * None
> 
> v6..v7:
> * Updated flags_show() to use seq_buf_printf(). [Mpe]
> * Updated papr_scm_get_health() to use newly introduced
>   __drc_pmem_query_health() bypassing the cache [Mpe].
> 
> v5..v6:
> * Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
>   guard against the possibility of different compilers adding different
>   paddings to the struct [ Dan Williams ]
> 
> * Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
>   'bool' and also updated drc_pmem_query_health() to take this into
>   account. [ Dan Williams ]
> 
> v4..v5:
> * None
> 
> v3..v4:
> * Call the DSM_PAPR_SCM_HEALTH service function from
>   papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]
> 
> v2..v3:
> * Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
>   as its exported to the userspace [Aneesh]
> * Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
>   from enum to #defines [Aneesh]
> 
> v1..v2:
> * New patch in the series
> ---
>  arch/powerpc/include/uapi/asm/papr_pdsm.h | 43 ++
>  arch/powerpc/platforms/pseries/papr_scm.c | 71 +++
>  2 files changed, 114 insertions(+)
> 
> diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> index df2447455cfe..12c7aa5ee8bf 100644
> --- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
> +++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> @@ -72,13 +72,56 @@ struct nd_pdsm_cmd_pkg {
>   __u8 payload[]; /* In/Out: Sub-cmd data buffer */
>  } __packed;
>  
> +/* Calculate size used by the pdsm header fields minus 'struct nd_cmd_pkg' */
> +#define ND_PDSM_HDR_SIZE \
> + (sizeof(struct nd_pdsm_cmd_pkg) - sizeof(struct nd_cmd_pkg))
> +
> +/* Max payload size that we can handle */
> +#define ND_PDSM_PAYLOAD_MAX_SIZE 184
> +
>  /*
>   * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
>   * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
>   */
>  enum papr_pdsm {
>   PAPR_PDSM_MIN = 0x0,
> + PAPR_PDSM_HEALTH,
>   PAPR_PDSM_MAX,
>  };
>  
> +/* Various nvdimm health indicators */
> +#define PAPR_PDSM_DIMM_HEALTHY   0
> +#define PAPR_PDSM_DIMM_UNHEALTHY 1
> +#define PAPR_PDSM_DIMM_CRITICAL  2
> +#define PAPR_PDSM_DIMM_FATAL 3
> +
> +/*
> + * Struct exchanged between kernel & ndctl for PAPR_PDSM_HEALTH
> + * Various flags indicate the health status of the dimm.
> + *
> + * extension_flags   : Any extension fields present in the struct.
> + * dimm_unarmed  : Dimm not armed. So contents wont persist.
> + * dimm_bad_shutdown : Previo

Re: [PATCH v11 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods

2020-06-08 Thread Ira Weiny
On Sun, Jun 07, 2020 at 06:43:38PM +0530, Vaibhav Jain wrote:
> Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
> module and add the command family NVDIMM_FAMILY_PAPR to the white list
> of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the
> nvdimm command mask and implement necessary scaffolding in the module
> to handle ND_CMD_CALL ioctl and PDSM requests that we receive.
> 
> The layout of the PDSM request as we expect from libnvdimm/libndctl is
> described in newly introduced uapi header 'papr_pdsm.h' which
> defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used
> to communicate the PDSM request via member
> 'nd_cmd_pkg.nd_command' and size of payload that need to be
> sent/received for servicing the PDSM.
> 
> A new function is_cmd_valid() is implemented that reads the args to
> papr_scm_ndctl() and performs sanity tests on them. A new function
> papr_scm_service_pdsm() is introduced and is called from
> papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
> command from libnvdimm.
> 
> Cc: "Aneesh Kumar K . V" 
> Cc: Dan Williams 
> Cc: Michael Ellerman 
> Cc: Ira Weiny 
> Signed-off-by: Vaibhav Jain 
> ---
> Changelog:
> 
> v10..v11:
> * Moved in-lines 'nd_pdsm_cmd_pkg()' and 'pdsm_cmd_to_payload()' from
>   'papr_pdsm.h' header to 'papr_scm.c'. This avoids a potential license
>   incompatibility issue with non-GPL-2.0 user-space code trying to
>   include the header in its code. [ Ira ]
> * Verified papr_pdsm.h with UAPI_HEADER_TEST config.
> * Moved the is_cmd_valid() check in papr_scm_ndctl() before check for
>   cmd_rc == NULL. This prevents cmd_rc to be updated in case the
>   nd-cmd is invalid or unknown.
> 
> v9..v10:
> * Simplified 'struct nd_pdsm_cmd_pkg' by removing the
>   'payload_version' field.
> * Removed the corresponding documentation on versioning and backward
>   compatibility from 'papr_pdsm.h'
> * Reduced the size of reserved fields to 4-bytes making 'struct
>   nd_pdsm_cmd_pkg' 64 + 8 bytes long.
> * Updated is_cmd_valid() to enforce validation checks on pdsm
>   commands. [ Dan Williams ]
> * Added check for reserved fields being set to '0' in is_cmd_valid()
>   [ Ira ]
> * Moved changes for checking cmd_rc == NULL and logging improvements
>   to a separate prelim patch [ Ira ].
> * Moved  pdsm package validation checks from papr_scm_service_pdsm()
>   to is_cmd_valid().
> * Marked papr_scm_service_pdsm() return type as 'void' since errors
>   are reported in nd_pdsm_cmd_pkg.cmd_status field.
> 
> Resend:
> * Added ack from Aneesh.
> 
> v8..v9:
> * Reduced the usage of term SCM replacing it with appropriate
>   replacement [ Dan Williams, Aneesh ]
> * Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
> * s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
> * s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
> * Minor updates to 'papr_pdsm.h' to replace usage of term 'SCM'.
> * Minor update to patch description.
> 
> v7..v8:
> * Removed the 'payload_offset' field from 'struct
>   nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start
>   at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
> * To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
>   'reserved' field of 10-bytes is introduced. [ Aneesh ]
> * Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
>   [ Ira ]
> 
> Resend:
> * None
> 
> v6..v7 :
> * Removed the re-definitions of __packed macro from papr_scm_pdsm.h
>   [Mpe].
> * Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe].
> * Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h
>   [Mpe].
> * Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]
> 
> v5..v6 :
> * Changed the usage of the term DSM to PDSM to distinguish it from the
>   ACPI term [ Dan Williams ]
> * Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct
>   to reflect the new terminology.
> * Updated the patch description and title to reflect the new terminology.
> * Squashed patch to introduce new command family in 'ndctl.h' with
>   this patch [ Dan Williams ]
> * Updated the papr_scm_pdsm method starting index from 0x1 to 0x0
>   [ Dan Williams ]
> * Removed redundant license text from the papr_scm_pdsm.h file.
>   [ Dan Williams ]
> * s/envelop/envelope/ at various places [ Dan Williams ]
> * Added '__packed' attribute to command package header to guard
>   against different compilers adding padding between the fields.
>   [ Dan Williams]
> * Converted various pr_debug to dev_debug [ Dan Williams ]
> 
> v4..v5 :
> * None
> 
> v3..v4 :
> * None
> 
> v2..v3 :
> * Updated the patch prefix to 'ndctl/uapi' [Aneesh]
> 
> v1..v2 :
> * None
> ---
>  arch/powerpc/include/uapi/asm/papr_pdsm.h |  84 +++
>  arch/powerpc/platforms/pseries/papr_scm.c | 126 +-
>  include/uapi/linux/ndctl.h|   1 +
>  3 files changed, 207 insertions(+), 4 deletions(-)
>  create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h
> 
> diff --git a/arch/p

Re: [PATCH v11 4/6] powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()

2020-06-08 Thread Ira Weiny
On Sun, Jun 07, 2020 at 06:43:37PM +0530, Vaibhav Jain wrote:
> Since papr_scm_ndctl() can be called from outside papr_scm, it's
> exposed to the possibility of receiving NULL as the value of the 'cmd_rc'
> argument. This patch updates papr_scm_ndctl() to protect against such
> a possibility by pointing it at a local variable in case cmd_rc
> == NULL.
> 
> Finally the patch also updates the 'default' case to add a debug log for
> unknown 'cmd' values.
> 
> Cc: "Aneesh Kumar K . V" 
> Cc: Dan Williams 
> Cc: Michael Ellerman 
> Cc: Ira Weiny 

Reviewed-by: Ira Weiny 

> Signed-off-by: Vaibhav Jain 
> ---
> Changelog:
> 
> v10..v11:
> * Instead of returning *cmd_rc just return '0' in case nd_cmd is
>   handled. In case of unknown nd-cmd return -EINVAL
>   [ Ira and Dan Williams ]
> * Updated patch description.
> 
> v9..v10
> * New patch in the series
> ---
>  arch/powerpc/platforms/pseries/papr_scm.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> index 0c091622b15e..692ad3d79826 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -355,11 +355,16 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
>  {
>   struct nd_cmd_get_config_size *get_size_hdr;
>   struct papr_scm_priv *p;
> + int rc;
>  
>   /* Only dimm-specific calls are supported atm */
>   if (!nvdimm)
>   return -EINVAL;
>  
> + /* Use a local variable in case cmd_rc pointer is NULL */
> + if (!cmd_rc)
> + cmd_rc = &rc;
> +
>   p = nvdimm_provider_data(nvdimm);
>  
>   switch (cmd) {
> @@ -381,6 +386,7 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
>   break;
>  
>   default:
> + dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
>   return -EINVAL;
>   }
>  
> -- 
> 2.26.2
> 
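
The pattern in the patch above, redirecting an optional out-parameter to a local variable so later stores are always safe, can be sketched in plain C as follows. This is an illustration only: demo_ndctl() and the DEMO_CMD_* codes are hypothetical names, and the sketch does not model the libnvdimm interfaces.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command codes, for illustration only. */
enum demo_cmd {
	DEMO_CMD_GET_SIZE = 1,
	DEMO_CMD_UNKNOWN = 99,
};

/*
 * Hypothetical handler: returns 0 when the command is recognised and
 * -EINVAL otherwise. The command-level status goes to *cmd_rc, which
 * callers are allowed to leave as NULL.
 */
static int demo_ndctl(int cmd, int *cmd_rc)
{
	int rc;

	/* Use a local variable in case the caller passed NULL. */
	if (!cmd_rc)
		cmd_rc = &rc;

	switch (cmd) {
	case DEMO_CMD_GET_SIZE:
		*cmd_rc = 0;	/* safe even when the caller passed NULL */
		break;
	default:
		return -EINVAL;	/* unknown command */
	}

	return 0;
}
```

The benefit of this shape is that the body of the switch never needs its own NULL checks before writing the status.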


Re: [PATCH] selftests: powerpc: Fix online CPU selection

2020-06-08 Thread Kamalesh Babulal
On 6/8/20 8:12 PM, Sandipan Das wrote:
> The size of the cpu set must be large enough for systems
> with a very large number of CPUs. Otherwise, tests which
> try to determine the first online CPU by calling
> sched_getaffinity() will fail. This makes sure that the
> size of the allocated cpu set is dependent on the number
> of CPUs as reported by get_nprocs().
> 
> Fixes: 3752e453f6ba ("selftests/powerpc: Add tests of PMU EBBs")
> Reported-by: Shirisha Ganta 
> Signed-off-by: Sandipan Das 

LGTM,

Reviewed-by: Kamalesh Babulal 

-- 
Kamalesh


[PATCH] selftests: powerpc: Fix online CPU selection

2020-06-08 Thread Sandipan Das
The size of the cpu set must be large enough for systems
with a very large number of CPUs. Otherwise, tests which
try to determine the first online CPU by calling
sched_getaffinity() will fail. This makes sure that the
size of the allocated cpu set is dependent on the number
of CPUs as reported by get_nprocs().

Fixes: 3752e453f6ba ("selftests/powerpc: Add tests of PMU EBBs")
Reported-by: Shirisha Ganta 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/powerpc/utils.c | 33 -
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/powerpc/utils.c b/tools/testing/selftests/powerpc/utils.c
index 933678f1ed0a..bb8e402752c0 100644
--- a/tools/testing/selftests/powerpc/utils.c
+++ b/tools/testing/selftests/powerpc/utils.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include <sys/sysinfo.h>
 #include 
 #include 
 #include 
@@ -88,28 +89,36 @@ void *get_auxv_entry(int type)
 
 int pick_online_cpu(void)
 {
-   cpu_set_t mask;
-   int cpu;
+   int ncpus, cpu = -1;
+   cpu_set_t *mask;
+   size_t size;
 
-   CPU_ZERO(&mask);
+   ncpus = get_nprocs();
+   size = CPU_ALLOC_SIZE(ncpus);
+   mask = CPU_ALLOC(ncpus);
 
-   if (sched_getaffinity(0, sizeof(mask), &mask)) {
+   CPU_ZERO_S(size, mask);
+
+   if (sched_getaffinity(0, size, mask)) {
perror("sched_getaffinity");
-   return -1;
+   goto done;
}
 
/* We prefer a primary thread, but skip 0 */
-   for (cpu = 8; cpu < CPU_SETSIZE; cpu += 8)
-   if (CPU_ISSET(cpu, &mask))
-   return cpu;
+   for (cpu = 8; cpu < ncpus; cpu += 8)
+   if (CPU_ISSET_S(cpu, size, mask))
+   goto done;
 
/* Search for anything, but in reverse */
-   for (cpu = CPU_SETSIZE - 1; cpu >= 0; cpu--)
-   if (CPU_ISSET(cpu, &mask))
-   return cpu;
+   for (cpu = ncpus - 1; cpu >= 0; cpu--)
+   if (CPU_ISSET_S(cpu, size, mask))
+   goto done;
 
printf("No cpus in affinity mask?!\n");
-   return -1;
+
+done:
+   CPU_FREE(mask);
+   return cpu;
 }
 
 bool is_ppc64le(void)
-- 
2.25.1
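
As a companion to the patch above, here is a hedged sketch of reading the affinity mask with a dynamically sized set. It starts from get_nprocs() as the patch does, but it also swaps in a retry loop: if the kernel rejects the mask as too small (EINVAL), the size is doubled and the call repeated. That retry is an addition for illustration, not something the patch does, and first_allowed_cpu() is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <sys/sysinfo.h>

/*
 * Hypothetical helper: return the lowest CPU number present in the
 * caller's affinity mask, or -1 on error.
 */
static int first_allowed_cpu(void)
{
	int ncpus = get_nprocs();

	/* The upper bound only keeps the retry loop finite. */
	while (ncpus > 0 && ncpus <= (1 << 20)) {
		size_t size = CPU_ALLOC_SIZE(ncpus);
		cpu_set_t *mask = CPU_ALLOC(ncpus);
		int cpu, err;

		if (!mask)
			return -1;
		CPU_ZERO_S(size, mask);

		if (sched_getaffinity(0, size, mask) == 0) {
			int found = -1;

			/* Scan every bit the allocation can hold. */
			for (cpu = 0; cpu < (int)(size * 8); cpu++) {
				if (CPU_ISSET_S(cpu, size, mask)) {
					found = cpu;
					break;
				}
			}
			CPU_FREE(mask);
			return found;
		}

		err = errno;	/* save before freeing */
		CPU_FREE(mask);
		if (err != EINVAL)
			return -1;

		ncpus *= 2;	/* mask too small for this kernel: retry larger */
	}

	return -1;
}
```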



[PATCH v2] mm/debug_vm_pgtable: Fix kernel crash by checking for THP support

2020-06-08 Thread Aneesh Kumar K.V
Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled but
no THP support enabled based on platforms. For ex: with 4K
PAGE_SIZE ppc64 supports THP only with radix translation.

This results in below crash when running with hash translation and
4K PAGE_SIZE.

kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140!
cpu 0x61: Vector: 700 (Program Check) at [c00ff948f860]
pc: c18810f8: debug_vm_pgtable+0x480/0x8b0
lr: c18810ec: debug_vm_pgtable+0x474/0x8b0
...
[c00ff948faf0] c1880fec debug_vm_pgtable+0x374/0x8b0 (unreliable)
[c00ff948fbf0] c0011648 do_one_initcall+0x98/0x4f0
[c00ff948fcd0] c1843928 kernel_init_freeable+0x330/0x3fc
[c00ff948fdb0] c00122ac kernel_init+0x24/0x148
[c00ff948fe20] c000cc44 ret_from_kernel_thread+0x5c/0x78

Check for THP support correctly

Cc: anshuman.khand...@arm.com
Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers")
Signed-off-by: Aneesh Kumar K.V 
---
 mm/debug_vm_pgtable.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 188c18908964..df3a3a08f4f8 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -61,6 +61,9 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
 {
pmd_t pmd = pfn_pmd(pfn, prot);
 
+   if (!has_transparent_hugepage())
+   return;
+
WARN_ON(!pmd_same(pmd, pmd));
WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
@@ -80,6 +83,9 @@ static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot)
 {
pud_t pud = pfn_pud(pfn, prot);
 
+   if (!has_transparent_hugepage())
+   return;
+
WARN_ON(!pud_same(pud, pud));
WARN_ON(!pud_young(pud_mkyoung(pud_mkold(pud))));
WARN_ON(!pud_write(pud_mkwrite(pud_wrprotect(pud))));
-- 
2.26.2



[PATCH] KVM: PPC: Book3S HV: increase KVMPPC_NR_LPIDS on POWER8 and POWER9

2020-06-08 Thread Cédric Le Goater
POWER8 and POWER9 have 12-bit LPIDs. Change LPID_RSVD to support up to
(4096 - 2) guests on these processors. POWER7 is kept the same with a
limitation of (1024 - 2), but it might be time to drop KVM support for
POWER7.

Tested with 2048 guests * 4 vCPUs on a witherspoon system with 512G
RAM and a bit of swap.

Signed-off-by: Cédric Le Goater 
---
 arch/powerpc/include/asm/reg.h  | 3 ++-
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 8 ++--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 88e6c78100d9..b70bbfb0ea3c 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -473,7 +473,8 @@
 #ifndef SPRN_LPID
 #define SPRN_LPID  0x13F   /* Logical Partition Identifier */
 #endif
-#define   LPID_RSVD0x3ff   /* Reserved LPID for partn switching */
+#define   LPID_RSVD_POWER7 0x3ff   /* Reserved LPID for partn switching */
+#define   LPID_RSVD0xfff   /* Reserved LPID for partn switching */
 #define SPRN_HMER   0x150   /* Hypervisor maintenance exception reg */
 #define   HMER_DEBUG_TRIG  (1ul << (63 - 17)) /* Debug trigger */
 #define SPRN_HMEER  0x151   /* Hyp maintenance exception enable reg */
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 18aed9775a3c..23035ab2ec50 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -260,11 +260,15 @@ int kvmppc_mmu_hv_init(void)
if (!mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE))
return -EINVAL;
 
-   /* POWER7 has 10-bit LPIDs (12-bit in POWER8) */
host_lpid = 0;
if (cpu_has_feature(CPU_FTR_HVMODE))
host_lpid = mfspr(SPRN_LPID);
-   rsvd_lpid = LPID_RSVD;
+
+   /* POWER8 and above have 12-bit LPIDs (10-bit in POWER7) */
+   if (cpu_has_feature(CPU_FTR_ARCH_207S))
+   rsvd_lpid = LPID_RSVD;
+   else
+   rsvd_lpid = LPID_RSVD_POWER7;
 
kvmppc_init_lpid(rsvd_lpid + 1);
 
-- 
2.25.4
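
The arithmetic behind the numbers in the commit message can be checked with a short sketch. The helper names below are hypothetical and are not kernel APIs. The sketch assumes the reserved LPID is the all-ones value of the LPID field, which matches 0x3ff and 0xfff in the patch, and that the two IDs subtracted in "(4096 - 2)" are the reserved LPID and the host's own LPID.

```c
/*
 * Hypothetical helper: the reserved LPID is taken to be the all-ones
 * value of an LPID field that is 'lpid_bits' wide.
 */
static unsigned int lpid_rsvd(unsigned int lpid_bits)
{
	return (1u << lpid_bits) - 1;
}

/*
 * Hypothetical helper: LPIDs left for guests, assuming two IDs (the
 * reserved one and the host's) are unavailable.
 */
static unsigned int max_guest_lpids(unsigned int lpid_bits)
{
	return (1u << lpid_bits) - 2;
}
```

With 10-bit LPIDs this gives a reserved value of 0x3ff and 1022 guests; with 12-bit LPIDs it gives 0xfff and 4094 guests.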



Re: [PATCH] mm/debug_vm_pgtable: Fix kernel crash with page table validate

2020-06-08 Thread Anshuman Khandual



On 06/08/2020 04:46 PM, Aneesh Kumar K.V wrote:
> On 6/8/20 4:31 PM, Anshuman Khandual wrote:
>> Hi Aneesh,
>>
>> On 06/08/2020 11:57 AM, Aneesh Kumar K.V wrote:
>>> Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled but
>>> no THP support enabled based on platforms. For ex: with 4K
>>> PAGE_SIZE ppc64 supports THP only with radix translation.
>>
>> Good catch, never hit this before.
>>
>>>
>>> This results in below crash when running with hash translation and
>>> 4K PAGE_SIZE.
>>>
>>> kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140!
>>> cpu 0x61: Vector: 700 (Program Check) at [c00ff948f860]
>>>  pc: c18810f8: debug_vm_pgtable+0x480/0x8b0
>>>  lr: c18810ec: debug_vm_pgtable+0x474/0x8b0
>>> ...
>>> [c00ff948faf0] c1880fec debug_vm_pgtable+0x374/0x8b0 (unreliable)
>>> [c00ff948fbf0] c0011648 do_one_initcall+0x98/0x4f0
>>> [c00ff948fcd0] c1843928 kernel_init_freeable+0x330/0x3fc
>>> [c00ff948fdb0] c00122ac kernel_init+0x24/0x148
>>> [c00ff948fe20] c000cc44 ret_from_kernel_thread+0x5c/0x78
>>>
>>> Check for THP support correctly
>>
>> Makes sense, is this the only configuration which hit the problem ?
> 
> 4K hash ppc64 is the only config i guess.

Okay.

> 
>>
>>>
>>> Cc: anshuman.khand...@arm.com
>>> Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers")
>>> Signed-off-by: Aneesh Kumar K.V 
>>> ---
>>>   mm/debug_vm_pgtable.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>>> index 188c18908964..e60151c5e997 100644
>>> --- a/mm/debug_vm_pgtable.c
>>> +++ b/mm/debug_vm_pgtable.c
>>> @@ -61,6 +61,9 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
>>>   {
>>>   pmd_t pmd = pfn_pmd(pfn, prot);
>>>   +    if (!has_transparent_hugepage())
>>> +    return;
>>> +
>>
>> We should also add this check to pud_basic_tests() as well.
> 
> 
> Do we have a function that check for runtime support for pud level THP? ppc64 
> don't do pud level THP yet. So  we have 
> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=n

I believe we don't have such a generic function. Please correct me if I am
missing something here.

> 
> are you suggesting we do the same check for pud level THP too?

Yes. Because regardless of CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD, could there
be any THP at PUD level when has_transparent_hugepage() returns negative? The
current dependency between THP and PUD THP configs seems somewhat confusing,
but having this check at PUD level should protect against similar problems. A
quick test (after adding this check to PUD level) on x86 does not indicate any
problem on the normal path.

> 
> 
>>
>>>   WARN_ON(!pmd_same(pmd, pmd));
>>>   WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
>>>   WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
>>>
>>
>> The subject line here should mention about correct THP support
>> detection which fixes the problem. Probably something like this
>> or similar ("Fix kernel crash with correct THP support check").
> 
> 
> Not sure about that. This fix a kernel crash with page table validate code.

What this fixes is very clear from the prefix itself - "mm/debug_vm_pgtable:",
making "page table validate" somewhat redundant. Instead, it could just
accommodate the method of the fix, i.e. "via correct THP support check". Nonetheless,
it is just a small nit.


Re: [v1 PATCH 1/2] Refactoring carrying over IMA measuremnet logs over Kexec.

2020-06-08 Thread Mimi Zohar
Hi Prakhar,

On Sun, 2020-06-07 at 16:33 -0700, Prakhar Srivastava wrote:
> This patch moves the non-architecture specific code out of powerpc and
>  adds to security/ima. 
> Update the arm64 and powerpc kexec file load paths to carry the IMA
> measurement logs.

From your patch description, this patch should be broken up.  Moving
the non-architecture specific code out of powerpc should be one patch.
 Additional support should be in another patch.  After each patch, the
code should work properly.

Before posting patches, please review them, making sure
unnecessary/unwanted changes haven't crept in - commenting out code,
moving code without removing the original code.

thanks,

Mimi


Re: [PATCH] mm/debug_vm_pgtable: Fix kernel crash with page table validate

2020-06-08 Thread Aneesh Kumar K.V

On 6/8/20 4:31 PM, Anshuman Khandual wrote:
> Hi Aneesh,
>
> On 06/08/2020 11:57 AM, Aneesh Kumar K.V wrote:
>> Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled but
>> no THP support enabled based on platforms. For ex: with 4K
>> PAGE_SIZE ppc64 supports THP only with radix translation.
>
> Good catch, never hit this before.
>
>>
>> This results in below crash when running with hash translation and
>> 4K PAGE_SIZE.
>>
>> kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140!
>> cpu 0x61: Vector: 700 (Program Check) at [c00ff948f860]
>>  pc: c18810f8: debug_vm_pgtable+0x480/0x8b0
>>  lr: c18810ec: debug_vm_pgtable+0x474/0x8b0
>> ...
>> [c00ff948faf0] c1880fec debug_vm_pgtable+0x374/0x8b0 (unreliable)
>> [c00ff948fbf0] c0011648 do_one_initcall+0x98/0x4f0
>> [c00ff948fcd0] c1843928 kernel_init_freeable+0x330/0x3fc
>> [c00ff948fdb0] c00122ac kernel_init+0x24/0x148
>> [c00ff948fe20] c000cc44 ret_from_kernel_thread+0x5c/0x78
>>
>> Check for THP support correctly
>
> Makes sense, is this the only configuration which hit the problem ?

4K hash ppc64 is the only config i guess.

>
>>
>> Cc: anshuman.khand...@arm.com
>> Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers")
>> Signed-off-by: Aneesh Kumar K.V 
>> ---
>>   mm/debug_vm_pgtable.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>> index 188c18908964..e60151c5e997 100644
>> --- a/mm/debug_vm_pgtable.c
>> +++ b/mm/debug_vm_pgtable.c
>> @@ -61,6 +61,9 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
>>   {
>>   pmd_t pmd = pfn_pmd(pfn, prot);
>>
>> +	if (!has_transparent_hugepage())
>> +		return;
>> +
>
> We should also add this check to pud_basic_tests() as well.

Do we have a function that check for runtime support for pud level THP?
ppc64 don't do pud level THP yet. So  we have
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=n

are you suggesting we do the same check for pud level THP too?

>
>>   WARN_ON(!pmd_same(pmd, pmd));
>>   WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
>>   WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
>>
>
> The subject line here should mention about correct THP support
> detection which fixes the problem. Probably something like this
> or similar ("Fix kernel crash with correct THP support check").

Not sure about that. This fix a kernel crash with page table validate code.


-aneesh


Re: [PATCH] mm/debug_vm_pgtable: Fix kernel crash with page table validate

2020-06-08 Thread Anshuman Khandual
Hi Aneesh,

On 06/08/2020 11:57 AM, Aneesh Kumar K.V wrote:
> Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled but
> no THP support enabled based on platforms. For ex: with 4K
> PAGE_SIZE ppc64 supports THP only with radix translation.

Good catch, never hit this before.

> 
> This results in below crash when running with hash translation and
> 4K PAGE_SIZE.
> 
> kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140!
> cpu 0x61: Vector: 700 (Program Check) at [c00ff948f860]
> pc: c18810f8: debug_vm_pgtable+0x480/0x8b0
> lr: c18810ec: debug_vm_pgtable+0x474/0x8b0
> ...
> [c00ff948faf0] c1880fec debug_vm_pgtable+0x374/0x8b0 (unreliable)
> [c00ff948fbf0] c0011648 do_one_initcall+0x98/0x4f0
> [c00ff948fcd0] c1843928 kernel_init_freeable+0x330/0x3fc
> [c00ff948fdb0] c00122ac kernel_init+0x24/0x148
> [c00ff948fe20] c000cc44 ret_from_kernel_thread+0x5c/0x78
> 
> Check for THP support correctly

Makes sense, is this the only configuration which hit the problem ?

> 
> Cc: anshuman.khand...@arm.com
> Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers")
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  mm/debug_vm_pgtable.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 188c18908964..e60151c5e997 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -61,6 +61,9 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
>  {
>   pmd_t pmd = pfn_pmd(pfn, prot);
>  
> + if (!has_transparent_hugepage())
> + return;
> +

We should also add this check to pud_basic_tests() as well.

>   WARN_ON(!pmd_same(pmd, pmd));
>   WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
>   WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
> 

The subject line here should mention about correct THP support
detection which fixes the problem. Probably something like this
or similar ("Fix kernel crash with correct THP support check").

- Anshuman


[RFC PATCH v0 3/4] powerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if enabled

2020-06-08 Thread Bharata B Rao
H_REGISTER_PROC_TBL asks for GTSE by default. The GTSE flag bit should
be set only when GTSE is supported.

Signed-off-by: Bharata B Rao 
---
 arch/powerpc/platforms/pseries/lpar.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index e4ed5317f117..58ba76bc1964 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -1680,9 +1680,11 @@ static int pseries_lpar_register_process_table(unsigned long base,
 
if (table_size)
flags |= PROC_TABLE_NEW;
-   if (radix_enabled())
-   flags |= PROC_TABLE_RADIX | PROC_TABLE_GTSE;
-   else
+   if (radix_enabled()) {
+   flags |= PROC_TABLE_RADIX;
+   if (mmu_has_feature(MMU_FTR_GTSE))
+   flags |= PROC_TABLE_GTSE;
+   } else
flags |= PROC_TABLE_HPT_SLB;
for (;;) {
rc = plpar_hcall_norets(H_REGISTER_PROC_TBL, flags, base,
-- 
2.21.3
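
For readers following the flag logic, the selection made in the hunk
above can be modelled in standalone userspace C. This is only a sketch
under stated assumptions: the EX_-prefixed names and their bit values
are placeholders chosen for illustration (they are not taken from the
kernel headers), and ex_proc_table_flags() is a hypothetical helper
that stands in for the body of pseries_lpar_register_process_table().

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values, for illustration only (not the kernel's). */
#define EX_PROC_TABLE_NEW	0x18UL
#define EX_PROC_TABLE_HPT_SLB	0x00UL
#define EX_PROC_TABLE_RADIX	0x04UL
#define EX_PROC_TABLE_GTSE	0x01UL

/*
 * Hypothetical helper mirroring the patched logic: GTSE is requested
 * only for radix, and only when the GTSE feature is present.
 */
static unsigned long ex_proc_table_flags(bool new_table, bool radix, bool gtse)
{
	unsigned long flags = 0;

	if (new_table)
		flags |= EX_PROC_TABLE_NEW;
	if (radix) {
		flags |= EX_PROC_TABLE_RADIX;
		if (gtse)
			flags |= EX_PROC_TABLE_GTSE;
	} else {
		/* Hash: the GTSE argument is ignored on this path. */
		flags |= EX_PROC_TABLE_HPT_SLB;
	}
	return flags;
}
```

With these placeholder values, a radix guest without GTSE would pass
RADIX but not GTSE, whereas before the patch both bits were always set
together.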



[RFC PATCH v0 4/4] powerpc/mm/book3s64/radix: Off-load TLB invalidations to host when !GTSE

2020-06-08 Thread Bharata B Rao
From: Nicholas Piggin 

When the platform doesn't support GTSE, let TLB invalidation requests
for radix guests be off-loaded to the host using the H_RPT_INVALIDATE
hcall.

Signed-off-by: Nicholas Piggin 
Signed-off-by: Bharata B Rao 
---
 arch/powerpc/include/asm/hvcall.h |   1 +
 arch/powerpc/include/asm/plpar_wrappers.h |  14 +++
 arch/powerpc/mm/book3s64/radix_tlb.c  | 105 --
 3 files changed, 113 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index e90c073e437e..08917147415b 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -335,6 +335,7 @@
 #define H_GET_24X7_CATALOG_PAGE0xF078
 #define H_GET_24X7_DATA0xF07C
 #define H_GET_PERF_COUNTER_INFO0xF080
+#define H_RPT_INVALIDATE   0xF084
 
 /* Platform-specific hcalls used for nested HV KVM */
 #define H_SET_PARTITION_TABLE  0xF800
diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index 4497c8afb573..e952139b0e47 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -334,6 +334,13 @@ static inline long plpar_get_cpu_characteristics(struct h_cpu_char_result *p)
return rc;
 }
 
+static inline long pseries_rpt_invalidate(u32 pid, u64 target, u64 what,
+ u64 pages, u64 start, u64 end)
+{
+   return plpar_hcall_norets(H_RPT_INVALIDATE, pid, target, what,
+ pages, start, end);
+}
+
 #else /* !CONFIG_PPC_PSERIES */
 
 static inline long plpar_set_ciabr(unsigned long ciabr)
@@ -346,6 +353,13 @@ static inline long plpar_pte_read_4(unsigned long flags, unsigned long ptex,
 {
return 0;
 }
+
+static inline long pseries_rpt_invalidate(u32 pid, u64 target, u64 what,
+ u64 pages, u64 start, u64 end)
+{
+   return 0;
+}
+
 #endif /* CONFIG_PPC_PSERIES */
 
 #endif /* _ASM_POWERPC_PLPAR_WRAPPERS_H */
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index b5cc9b23cf02..4dd1d3c75562 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -16,11 +16,39 @@
 #include 
 #include 
 #include 
+#include 
 
 #define RIC_FLUSH_TLB 0
 #define RIC_FLUSH_PWC 1
 #define RIC_FLUSH_ALL 2
 
+#define H_TLBI_TLB 0x0001
+#define H_TLBI_PWC 0x0002
+#define H_TLBI_PRS 0x0004
+
+#define H_TLBI_TARGET_CMMU 0x01
+#define H_TLBI_TARGET_CMMU_LOCAL 0x02
+#define H_TLBI_TARGET_NMMU 0x04
+
+#define H_TLBI_PAGE_ALL (-1UL)
+#define H_TLBI_PAGE_4K 0x01
+#define H_TLBI_PAGE_64K0x02
+#define H_TLBI_PAGE_2M 0x04
+#define H_TLBI_PAGE_1G 0x08
+
+static inline u64 psize_to_h_tlbi(unsigned long psize)
+{
+   if (psize == MMU_PAGE_4K)
+   return H_TLBI_PAGE_4K;
+   if (psize == MMU_PAGE_64K)
+   return H_TLBI_PAGE_64K;
+   if (psize == MMU_PAGE_2M)
+   return H_TLBI_PAGE_2M;
+   if (psize == MMU_PAGE_1G)
+   return H_TLBI_PAGE_1G;
+   return H_TLBI_PAGE_ALL;
+}
+
 /*
  * tlbiel instruction for radix, set invalidation
  * i.e., r=1 and is=01 or is=10 or is=11
@@ -694,7 +722,14 @@ void radix__flush_tlb_mm(struct mm_struct *mm)
goto local;
}
 
-   if (cputlb_use_tlbie()) {
+   if (!mmu_has_feature(MMU_FTR_GTSE)) {
+   unsigned long targ = H_TLBI_TARGET_CMMU;
+
+   if (atomic_read(&mm->context.copros) > 0)
+   targ |= H_TLBI_TARGET_NMMU;
+   pseries_rpt_invalidate(pid, targ, H_TLBI_TLB,
+  H_TLBI_PAGE_ALL, 0, -1UL);
+   } else if (cputlb_use_tlbie()) {
if (mm_needs_flush_escalation(mm))
_tlbie_pid(pid, RIC_FLUSH_ALL);
else
@@ -727,7 +762,16 @@ static void __flush_all_mm(struct mm_struct *mm, bool fullmm)
goto local;
}
}
-   if (cputlb_use_tlbie())
+   if (!mmu_has_feature(MMU_FTR_GTSE)) {
+   unsigned long targ = H_TLBI_TARGET_CMMU;
+   unsigned long what = H_TLBI_TLB | H_TLBI_PWC |
+H_TLBI_PRS;
+
+   if (atomic_read(&mm->context.copros) > 0)
+   targ |= H_TLBI_TARGET_NMMU;
+   pseries_rpt_invalidate(pid, targ, what,
+  H_TLBI_PAGE_ALL, 0, -1UL);
+   } else if (cputlb_use_tlbie())
_tlbie_pid(pid, RIC_FLUSH_ALL);
else
_tlbiel_pid_multicast(mm, pid, RIC_FLUSH_ALL);
@@ -760,7 +804,17 @@ void radix__flush_t
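
The message is cut off at this point in the archive. The target and
type selection visible in the two complete hunks above can be sketched
in standalone C as follows. The H_TLBI_* values are copied from the
patch; the ex_-prefixed helpers and their parameters are hypothetical
and exist only for this illustration, and the sketch does not issue
any hcall.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as defined in the radix_tlb.c hunk of the patch. */
#define H_TLBI_TLB		0x0001
#define H_TLBI_PWC		0x0002
#define H_TLBI_PRS		0x0004

#define H_TLBI_TARGET_CMMU	0x01
#define H_TLBI_TARGET_NMMU	0x04

/*
 * Hypothetical helper: always target the core MMU, and add the nest
 * MMU when the mm has coprocessor users (copros > 0 in the patch).
 */
static unsigned long ex_rpt_target(int copros)
{
	unsigned long targ = H_TLBI_TARGET_CMMU;

	if (copros > 0)
		targ |= H_TLBI_TARGET_NMMU;
	return targ;
}

/*
 * Hypothetical helper: radix__flush_tlb_mm() asks for the TLB only,
 * while __flush_all_mm() asks for TLB, PWC and PRS together.
 */
static unsigned long ex_rpt_what(bool flush_all)
{
	if (flush_all)
		return H_TLBI_TLB | H_TLBI_PWC | H_TLBI_PRS;
	return H_TLBI_TLB;
}
```

In both hunks the resulting values are passed to
pseries_rpt_invalidate() together with H_TLBI_PAGE_ALL and the range
0 to -1UL, that is, a flush of the whole address space for that PID.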

[RFC PATCH v0 1/4] powerpc/mm: Make GTSE as MMU FTR

2020-06-08 Thread Bharata B Rao
Make GTSE an MMU feature and enable it by default for radix.
For guests, however, enable it only if the hypervisor supports it
via the OV5 vector.

Making GTSE an MMU feature makes it easy to enable radix
without GTSE.

Signed-off-by: Bharata B Rao 
---
 arch/powerpc/include/asm/mmu.h    | 4 ++++
 arch/powerpc/kernel/dt_cpu_ftrs.c | 2 ++
 arch/powerpc/mm/init_64.c         | 6 +++++-
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index f4ac25d4df05..884d51995934 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -28,6 +28,9 @@
  * Individual features below.
  */
 
+/* Guest Translation Shootdown Enable */
+#define MMU_FTR_GTSE   ASM_CONST(0x1000)
+
 /*
  * Support for 68 bit VA space. We added that from ISA 2.05
  */
@@ -173,6 +176,7 @@ enum {
 #endif
 #ifdef CONFIG_PPC_RADIX_MMU
MMU_FTR_TYPE_RADIX |
+   MMU_FTR_GTSE |
 #ifdef CONFIG_PPC_KUAP
MMU_FTR_RADIX_KUAP |
 #endif /* CONFIG_PPC_KUAP */
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 3a409517c031..571aa39e35d5 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -337,6 +337,8 @@ static int __init feat_enable_mmu_radix(struct dt_cpu_feature *f)
 #ifdef CONFIG_PPC_RADIX_MMU
cur_cpu_spec->mmu_features |= MMU_FTR_TYPE_RADIX;
cur_cpu_spec->mmu_features |= MMU_FTRS_HASH_BASE;
+   /* TODO: Does this need a separate cpu dt feature? */
+   cur_cpu_spec->mmu_features |= MMU_FTR_GTSE;
cur_cpu_spec->cpu_user_features |= PPC_FEATURE_HAS_MMU;
 
return 1;
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index c7ce4ec5060e..feb9bed9177c 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -408,13 +408,17 @@ static void __init early_check_vec5(void)
if (!(vec5[OV5_INDX(OV5_RADIX_GTSE)] &
OV5_FEAT(OV5_RADIX_GTSE))) {
pr_warn("WARNING: Hypervisor doesn't support RADIX with GTSE\n");
-   }
+   cur_cpu_spec->mmu_features &= ~MMU_FTR_GTSE;
+   } else
+   cur_cpu_spec->mmu_features |= MMU_FTR_GTSE;
/* Do radix anyway - the hypervisor said we had to */
cur_cpu_spec->mmu_features |= MMU_FTR_TYPE_RADIX;
} else if (mmu_supported == OV5_FEAT(OV5_MMU_HASH)) {
/* Hypervisor only supports hash - disable radix */
cur_cpu_spec->mmu_features &= ~MMU_FTR_TYPE_RADIX;
+   cur_cpu_spec->mmu_features &= ~MMU_FTR_GTSE;
}
+
 }
 
 void __init mmu_early_init_devtree(void)
-- 
2.21.3
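
The feature-bit handling added to early_check_vec5() can be modelled
in standalone C as below. This is a sketch under stated assumptions:
EX_MMU_FTR_GTSE uses the value shown in the mmu.h hunk, while
EX_MMU_FTR_TYPE_RADIX is a placeholder value, and enum ex_hv_mmu and
ex_early_check_vec5() are hypothetical names that stand in for the
OV5 vector parsing, which is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

#define EX_MMU_FTR_GTSE		0x1000UL	/* value shown in the patch */
#define EX_MMU_FTR_TYPE_RADIX	0x0040UL	/* placeholder value */

/* Hypothetical summary of what the hypervisor advertises in OV5. */
enum ex_hv_mmu { EX_HV_RADIX, EX_HV_HASH };

/*
 * Hypothetical helper mirroring the patched branches:
 *  - radix: GTSE follows the hypervisor's OV5 bit, radix stays on;
 *  - hash only: both radix and GTSE are cleared.
 */
static unsigned long ex_early_check_vec5(unsigned long features,
					 enum ex_hv_mmu mmu, bool hv_gtse)
{
	if (mmu == EX_HV_RADIX) {
		if (!hv_gtse)
			features &= ~EX_MMU_FTR_GTSE;
		else
			features |= EX_MMU_FTR_GTSE;
		/* Do radix anyway, as the comment in the hunk says. */
		features |= EX_MMU_FTR_TYPE_RADIX;
	} else {
		features &= ~EX_MMU_FTR_TYPE_RADIX;
		features &= ~EX_MMU_FTR_GTSE;
	}
	return features;
}
```

Later code can then test the single GTSE bit, as patches 3/4 and 4/4
do with mmu_has_feature(MMU_FTR_GTSE), instead of re-reading the OV5
vector.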


