Re: [PATCH] powerpc/kprobes: Use probe_address() to read instructions
On Tue, Jun 09, 2020 at 03:28:38PM +1000, Michael Ellerman wrote: > On Mon, 24 Feb 2020 18:02:10 +0000 (UTC), Christophe Leroy wrote: > > In order to avoid Oopses, use probe_address() to read the > > instruction at the address where the trap happened. > > Applied to powerpc/next. > > [1/1] powerpc/kprobes: Use probe_address() to read instructions > > https://git.kernel.org/powerpc/c/9ed5df69b79a22b40b20bc2132ba2495708b19c4 probe_address() has been renamed to get_kernel_nofault() in the -mm queue that Andrew sent off to Linus last night.
Re: [PATCH v8 22.5/30] powerpc/optprobes: Add register argument to patch_imm64_load_insns()
On Sat, 2020-05-16 at 11:54:49 UTC, Michael Ellerman wrote: > From: Jordan Niethe > > Currently patch_imm32_load_insns() is used to load an instruction to > r4 to be emulated by emulate_step(). For prefixed instructions we > would like to be able to load a 64bit immediate to r4. To prepare for > this make patch_imm64_load_insns() take an argument that decides which > register to load an immediate to - rather than hardcoding r3. > > Signed-off-by: Jordan Niethe > Signed-off-by: Michael Ellerman Applied to powerpc next. https://git.kernel.org/powerpc/c/7a8818e0df5c6b53c89c7c928498668a2bbb3de0 cheers
Re: [PATCH v2] powerpc/pseries: Make vio and ibmebus initcalls pseries specific
On Tue, 21 Apr 2020 18:15:39 +1000, Oliver O'Halloran wrote: > The vio and ibmebus buses are used for pseries specific paravirtualised > devices and currently they're initialised by the generic initcall types. > This is mostly fine, but it can result in some nuisance errors in dmesg > when booting on PowerNV on some OSes, e.g. > > [2.984439] synth uevent: /devices/vio: failed to send uevent > [2.984442] vio vio: uevent: failed to send synthetic uevent > [ 17.968551] synth uevent: /devices/vio: failed to send uevent > [ 17.968554] vio vio: uevent: failed to send synthetic uevent > > [...] Applied to powerpc/next. [1/1] powerpc/pseries: Make vio and ibmebus initcalls pseries specific https://git.kernel.org/powerpc/c/4336b9337824a60a0b10013c622caeee99460db5 cheers
Re: [PATCH] hw_breakpoint: Fix build warnings with clang
On Tue, 2 Jun 2020 09:42:08 +0530, Ravi Bangoria wrote: > kbuild test robot reported few build warnings with hw_breakpoint code > when compiled with clang[1]. Fix those. > > [1]: > https://lore.kernel.org/linuxppc-dev/202005192233.oi9cjrta%25...@intel.com/ Applied to powerpc/next. [1/1] hw-breakpoints: Fix build warnings with clang https://git.kernel.org/powerpc/c/ef3534a94fdbdeab4c89d18d0164be2ad5d6dbb7 cheers
Re: [PATCH 1/7] powerpc/powernv/npu: Clean up compound table group initialisation
On Mon, 6 Apr 2020 13:07:39 +1000, Oliver O'Halloran wrote: > Re-work the control flow a bit so what's going on is a little clearer. > This also ensures the table_group is only initialised once in the P9 > case. This shouldn't be a functional change since all the GPU PCI > devices should have the same table_group configuration, but it does > look strange. Applied to powerpc/next. [1/7] powerpc/powernv/npu: Clean up compound table group initialisation https://git.kernel.org/powerpc/c/6984856865b55c9c1ee0814c30296119cd8ba511 [2/7] powerpc/powernv/iov: Don't add VFs to iommu group during PE config https://git.kernel.org/powerpc/c/6cff91b2b97b1b40a52971c9b1e99980dd49fd54 [3/7] powerpc/powernv/pci: Register iommu group at PE DMA setup https://git.kernel.org/powerpc/c/9b9408c55935ecc3b1c27b3eeb5a507394113cbb [4/7] powerpc/powernv/pci: Add device to iommu group during dma_dev_setup() https://git.kernel.org/powerpc/c/84d8cc076723058cc294f4360db6ff7758c25b74 [5/7] powerpc/powernv/pci: Delete old iommu recursive iommu setup https://git.kernel.org/powerpc/c/f39b8b10fcc5d4617d2be5f2910e017a55444b43 [6/7] powerpc/powernv/pci: Move tce size parsing to pci-ioda-tce.c https://git.kernel.org/powerpc/c/96e2006a9dbc02cb1c103521405d457438a2e260 [7/7] powerpc/powernv/npu: Move IOMMU group setup into npu-dma.c https://git.kernel.org/powerpc/c/03b7bf341c18ff19129cc2825b62bb0e212463f1 cheers
Re: [PATCH] powerpc/powernv/pci: Add an explaination for PNV_IODA_PE_BUS_ALL
On Wed, 15 Apr 2020 09:35:02 +1000, Oliver O'Halloran wrote: > It's pretty obscure and confused me for a long time so I figured it's > worth documenting properly. Applied to powerpc/next. [1/1] powerpc/powernv/pci: Add an explaination for PNV_IODA_PE_BUS_ALL https://git.kernel.org/powerpc/c/9d0879a2dbc3d0c15f8c71490079c1c38f9f3800 cheers
Re: [PATCH] powerpc/powernv: Add a print indicating when an IODA PE is released
On Wed, 8 Apr 2020 21:22:13 +1000, Oliver O'Halloran wrote: > Quite useful to know in some cases. Applied to powerpc/next. [1/1] powerpc/powernv: Add a print indicating when an IODA PE is released https://git.kernel.org/powerpc/c/e5500ab657c51bec5af8dcf564a096de48e7a132 cheers
Re: [PATCH] powerpc/64s: Fix early_init_mmu section mismatch
On Wed, 29 Apr 2020 17:02:47 +1000, Nicholas Piggin wrote: > Christian reports: > > MODPOST vmlinux.o > WARNING: modpost: vmlinux.o(.text.unlikely+0x1a0): Section mismatch in > reference from the function .early_init_mmu() to the function > .init.text:.radix__early_init_mmu() > The function .early_init_mmu() references > the function __init .radix__early_init_mmu(). > This is often because .early_init_mmu lacks a __init > annotation or the annotation of .radix__early_init_mmu is wrong. > > [...] Applied to powerpc/next. [1/1] powerpc/64s: Fix early_init_mmu section mismatch https://git.kernel.org/powerpc/c/9384e552aabb647ec22acb00181ca1715b0fcdfe cheers
Re: [PATCH] powerpc/64: refactor interrupt exit irq disabling sequence
On Wed, 29 Apr 2020 16:24:21 +1000, Nicholas Piggin wrote: > The same complicated sequence for juggling EE, RI, soft mask, and > irq tracing is repeated 3 times, tidy these up into one function. > > This differs quite a bit between sub architectures, so this makes > the ppc32 port cleaner as well. Applied to powerpc/next. [1/1] powerpc/64: Refactor interrupt exit irq disabling sequence https://git.kernel.org/powerpc/c/0bdad33d6bd7b80722e2f9e588d3d7c6d6e34978 cheers
Re: [PATCH] powerpc/64s/radix: Don't prefetch DAR in update_mmu_cache
On Mon, 4 May 2020 22:29:07 +1000, Nicholas Piggin wrote: > The idea behind this prefetch was to kick off a page table walk before > returning from the fault, getting some pipelining advantage. > > But this never showed up any noticeable performance advantage, and in > fact with KUAP the prefetches are actually blocked and cause some > kind of micro-architectural fault. Removing this improves page fault > microbenchmark performance by about 9%. Applied to powerpc/next. [1/1] powerpc/64s/radix: Don't prefetch DAR in update_mmu_cache https://git.kernel.org/powerpc/c/18594f9b8c45484bd527ebc6b08383b95f58ba73 cheers
Re: [PATCH 1/4] powerpc/powernv/pci: Add helper to find ioda_pe from BDFN
On Fri, 17 Apr 2020 17:35:05 +1000, Oliver O'Halloran wrote: > For each PHB we maintain a reverse-map that can be used to find the > PE that a BDFN is currently mapped to. Add a helper for doing this > lookup so we can check if a PE has been configured without looking > at pdn->pe_number. Applied to powerpc/next. [1/4] powerpc/powernv/pci: Add helper to find ioda_pe from BDFN https://git.kernel.org/powerpc/c/a8d7d5fc2e1672924a391aa37ef8c02d1ec84a4e [2/4] powerpc/powernv/pci: Re-work bus PE configuration https://git.kernel.org/powerpc/c/dc3d8f85bb571c3640ebba24b82a527cf2cb3f24 [3/4] powerpc/powernv/pci: Reserve the root bus PE during init https://git.kernel.org/powerpc/c/718d249aeadff058f79c2e6b25212dd45bd711ae [4/4] powerpc/powernv/pci: Sprinkle around some WARN_ON()s https://git.kernel.org/powerpc/c/6ae8aedf8fa932541f48a85219d75ca041c22080 cheers
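The reverse-map lookup described in the quoted text can be sketched in user-space C as follows. This is only an illustration under stated assumptions, not the helper added by the patch: the struct layout, the `INVALID_PE` marker and the names `make_bdfn()`, `phb_init_rmap()` and `bdfn_to_pe()` are all hypothetical. The only fixed fact it relies on is that a BDFN packs an 8-bit bus, 5-bit device and 3-bit function number into 16 bits.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_BDFN   0x10000u     /* 8-bit bus, 5-bit device, 3-bit function */
#define INVALID_PE UINT32_MAX   /* hypothetical "not mapped" marker */

struct pe {
	uint32_t pe_number;
};

struct phb {
	uint32_t pe_rmap[NUM_BDFN]; /* BDFN -> PE number, or INVALID_PE */
	struct pe *pe_array;
	uint32_t total_pe_num;
};

/* Pack bus/device/function into the 16-bit BDFN used as the map index. */
static uint16_t make_bdfn(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x7));
}

/* Mark every BDFN as unmapped. */
static void phb_init_rmap(struct phb *phb)
{
	for (size_t i = 0; i < NUM_BDFN; i++)
		phb->pe_rmap[i] = INVALID_PE;
}

/* Return the PE a BDFN is mapped to, or NULL if none is configured. */
static struct pe *bdfn_to_pe(struct phb *phb, uint16_t bdfn)
{
	uint32_t pe_num = phb->pe_rmap[bdfn];

	if (pe_num == INVALID_PE || pe_num >= phb->total_pe_num)
		return NULL;
	return &phb->pe_array[pe_num];
}
```

A NULL return plays the role of "no PE has been configured for this device", which is the check the quoted text wants to make without consulting pdn->pe_number.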
Re: [PATCH 0/3] powerpc/module_64: Fix _mcount() stub
On Tue, 21 Apr 2020 23:05:42 +0530, Naveen N. Rao wrote: > This series addresses the crash reported by Qian Cai on ppc64le with > -mprofile-kernel here: > https://lore.kernel.org/r/15ac5b0e-a221-4b8c-9039-fa96b8ef7...@lca.pw > > While fixing patch_instruction() should address the crash, we should > still change the default stub we setup for _mcount() for cases where a > kernel is built without ftrace. > > [...] Applied to powerpc/next. [1/3] powerpc/module_64: Consolidate ftrace code https://git.kernel.org/powerpc/c/03b51416e876aea5e7638947e50831b6c988c246 [2/3] powerpc/module_64: Simplify check for -mprofile-kernel ftrace relocations https://git.kernel.org/powerpc/c/1f2aaed2db03150428dbcd2ddee02ae6cb4bac52 [3/3] powerpc/module_64: Use special stub for _mcount() with -mprofile-kernel https://git.kernel.org/powerpc/c/bd55e792de0844631d34487d43eaf3f13294ebfe cheers
Re: [PATCH] powerpc/wii: Fix declaration made after definition
On Mon, 13 Apr 2020 12:06:45 -0700, Nathan Chancellor wrote: > A 0day randconfig uncovered an error with clang, trimmed for brevity: > > arch/powerpc/platforms/embedded6xx/wii.c:195:7: error: attribute > declaration must precede definition [-Werror,-Wignored-attributes] > if (!machine_is(wii)) > ^ > > [...] Applied to powerpc/next. [1/1] powerpc/wii: Fix declaration made after definition https://git.kernel.org/powerpc/c/91ffeaa7e5dd62753e23a1204dc7ecd11f26eadc cheers
Re: [PATCH] powerpc/xmon: Show task->thread.regs in process display
On Wed, 20 May 2020 21:17:40 +1000, Michael Ellerman wrote: > Show the address of the tasks regs in the process listing in xmon. The > regs should always be on the stack page that we also print the address > of, but it's still helpful not to have to find them by hand. Applied to powerpc/next. [1/1] powerpc/xmon: Show task->thread.regs in process display https://git.kernel.org/powerpc/c/0e7e92efe11bc5993def689e10f7bcb36f127651 cheers
Re: [PATCH v4 1/2] powerpc/64s/hash: Add stress_slb kernel boot option to increase SLB faults
On Mon, 11 May 2020 22:58:24 +1000, Michael Ellerman wrote: > This option increases the number of SLB misses by limiting the number > of kernel SLB entries, and increased flushing of cached lookaside > information. This helps stress test difficult to hit paths in the > kernel. > > [mpe: Relocate the code into arch/powerpc/mm, s/torture/stress/] Applied to powerpc/next. [1/1] powerpc/64s/hash: Add stress_slb kernel boot option to increase SLB faults https://git.kernel.org/powerpc/c/82a1b8ed5604cccf30b6ff03bcd61640cd26369b cheers
Re: [RFC PATCH 1/4] powerpc/64s: Don't init FSCR_DSCR in __init_FSCR()
On Thu, 28 May 2020 00:58:40 +1000, Michael Ellerman wrote: > __init_FSCR() was added originally in commit 2468dcf641e4 ("powerpc: > Add support for context switching the TAR register") (Feb 2013), and > only set FSCR_TAR. > > At that point FSCR (Facility Status and Control Register) was not > context switched, so the setting was permanent after boot. > > [...] Applied to powerpc/next. [1/4] powerpc/64s: Don't init FSCR_DSCR in __init_FSCR() https://git.kernel.org/powerpc/c/0828137e8f16721842468e33df0460044a0c588b [2/4] powerpc/64s: Don't let DT CPU features set FSCR_DSCR https://git.kernel.org/powerpc/c/993e3d96fd08c3ebf7566e43be9b8cd622063e6d [3/4] powerpc/64s: Save FSCR to init_task.thread.fscr after feature init https://git.kernel.org/powerpc/c/912c0a7f2b5daa3cbb2bc10f303981e493de73bd [4/4] powerpc/64s: Don't set FSCR bits in INIT_THREAD https://git.kernel.org/powerpc/c/c887ef5707591e84f80271e95e99ff9fb38987b5 cheers
Re: [PATCH v2] powerpc: Add ppc_inst_as_u64()
On Tue, 26 May 2020 17:26:30 +1000, Michael Ellerman wrote: > The code patching code wants to get the value of a struct ppc_inst as > a u64 when the instruction is prefixed, so we can pass the u64 down to > __put_user_asm() and write it with a single store. > > The optprobes code wants to load a struct ppc_inst as an immediate > into a register so it is useful to have it as a u64 to use the > existing helper function. > > [...] Applied to powerpc/next. [1/1] powerpc: Add ppc_inst_as_u64() https://git.kernel.org/powerpc/c/16ef9767e4dc5cf03a71ae7bc2bc588dbbe7983e cheers
Re: [PATCH] input: i8042: Remove special PowerPC handling
On Mon, 18 May 2020 11:10:43 -0700, Nathan Chancellor wrote: > This causes a build error with CONFIG_WALNUT because kb_cs and kb_data > were removed in commit 917f0af9e5a9 ("powerpc: Remove arch/ppc and > include/asm-ppc"). > > ld.lld: error: undefined symbol: kb_cs > > referenced by i8042-ppcio.h:28 (drivers/input/serio/i8042-ppcio.h:28) > > input/serio/i8042.o:(__i8042_command) in archive drivers/built-in.a > > referenced by i8042-ppcio.h:28 (drivers/input/serio/i8042-ppcio.h:28) > > input/serio/i8042.o:(__i8042_command) in archive drivers/built-in.a > > referenced by i8042-ppcio.h:28 (drivers/input/serio/i8042-ppcio.h:28) > > input/serio/i8042.o:(__i8042_command) in archive drivers/built-in.a > > [...] Applied to powerpc/next. [1/1] input: i8042 - Remove special PowerPC handling https://git.kernel.org/powerpc/c/e4f4ffa8a98c24a4ab482669b1e2b4cfce3f52f4 cheers
Re: [PATCH v2] powerpc: Add ppc_inst_next()
On Fri, 22 May 2020 23:33:18 +1000, Michael Ellerman wrote: > In a few places we want to calculate the address of the next > instruction. Previously that was simple, we just added 4 bytes, or if > using a u32 * we incremented that pointer by 1. > > But prefixed instructions make it more complicated, we need to advance > by either 4 or 8 bytes depending on the actual instruction. We also > can't do pointer arithmetic using struct ppc_inst, because it is > always 8 bytes in size on 64-bit, even though we might only need to > advance by 4 bytes. > > [...] Applied to powerpc/next. [1/1] powerpc: Add ppc_inst_next() https://git.kernel.org/powerpc/c/c5ff46d69c410f7fac173e4fde3eea484b4b4eda cheers
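The length rule from the quoted text (advance by 4 or 8 bytes depending on the instruction) can be sketched in plain C. This is a user-space illustration, not the kernel's ppc_inst_next(): the helper names are hypothetical, and it assumes instructions are read as host-endian 32-bit words and that a prefixed instruction is recognised by primary opcode 1 in the top six bits of its first word.

```c
#include <stddef.h>
#include <stdint.h>

#define OP_PREFIX 1u	/* primary opcode (top 6 bits) of a prefix word */

/* A prefixed instruction starts with a word whose primary opcode is 1. */
static int insn_is_prefixed(uint32_t first_word)
{
	return (first_word >> 26) == OP_PREFIX;
}

/* Length in bytes: 8 for prefix + suffix, 4 for a plain word instruction. */
static size_t insn_len(uint32_t first_word)
{
	return insn_is_prefixed(first_word) ? 8 : 4;
}

/* Step over one instruction, working in u32 units rather than structs. */
static const uint32_t *insn_next(const uint32_t *insn)
{
	return insn + insn_len(*insn) / sizeof(uint32_t);
}
```

Doing the arithmetic on a `uint32_t *` sidesteps the problem mentioned in the quoted text: pointer arithmetic on an 8-byte struct would always advance by 8, even over a 4-byte instruction.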
Re: [PATCH] powerpc/configs/64s: Enable CONFIG_PRINTK_CALLER
On Wed, 20 May 2020 22:12:57 +1000, Michael Ellerman wrote: > This adds the CPU or thread number to printk messages. This helps a > lot when deciphering concurrent oopses that have been interleaved. > > Example output, of PID1 (T1) triggering a warning: > > [1.581678][T1] WARNING: CPU: 0 PID: 1 at crypto/rsa-pkcs1pad.c:539 > pkcs1pad_verify+0x38/0x140 > [1.581681][T1] Modules linked in: > [1.581693][T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted > 5.5.0-rc5-gcc-8.2.0-00121-gf84c2e595927-dirty #1515 > [1.581700][T1] NIP: c0207d64 LR: c0207d3c CTR: > c0207d2c > [1.581708][T1] REGS: c000fd2e7560 TRAP: 0700 Not tainted > (5.5.0-rc5-gcc-8.2.0-00121-gf84c2e595927-dirty) > [1.581712][T1] MSR: 90029033 > CR: 44000222 XER: 0004 Applied to powerpc/next. [1/1] powerpc/configs/64s: Enable CONFIG_PRINTK_CALLER https://git.kernel.org/powerpc/c/598c01b5b2fca3a9de8ad3400edbff98ec22f0b2 cheers
Re: [PATCH] powerpc: Add ppc_inst_as_u64()
On Mon, 25 May 2020 15:50:04 +1000, Michael Ellerman wrote: > The code patching code wants to get the value of a struct ppc_inst as > a u64 when the instruction is prefixed, so we can pass the u64 down to > __put_user_asm() and write it with a single store. > > This is a bit awkward because the value differs based on the CPU > endianness, so add a helper to do the conversion. Applied to powerpc/next. [1/1] powerpc: Add ppc_inst_as_u64() https://git.kernel.org/powerpc/c/16ef9767e4dc5cf03a71ae7bc2bc588dbbe7983e cheers
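To illustrate why the u64 value depends on endianness, here is a minimal user-space sketch. It is not the kernel's ppc_inst_as_u64(); the names `host_is_little_endian()` and `insn_as_u64()` are hypothetical, and endianness is detected at run time here instead of through a kernel config option. The goal is that a single 64-bit store places the prefix word at the lower address and the suffix word right after it.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 when the host stores the least significant byte first. */
static int host_is_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

/*
 * Combine prefix and suffix so that storing the result as one u64 leaves
 * the prefix in the first 4 bytes of memory and the suffix in the next 4.
 */
static uint64_t insn_as_u64(uint32_t prefix, uint32_t suffix)
{
	if (host_is_little_endian())
		return ((uint64_t)suffix << 32) | prefix;	/* low half lands first */
	return ((uint64_t)prefix << 32) | suffix;		/* high half lands first */
}
```

On a little-endian CPU the low 32 bits of a u64 are stored at the lower address, so the prefix has to go in the low half; on big-endian it is the other way round.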
Re: [PATCH] powerpc/4xx: Don't unmap NULL mbase
On Thu, 21 May 2020 17:26:48 +1000, Michael Ellerman wrote: > Applied to powerpc/next. [1/1] powerpc/4xx: Don't unmap NULL mbase https://git.kernel.org/powerpc/c/bcec081ecc940fc38730b29c743bbee661164161 cheers
Re: [PATCH] powerpc/tm: Document h/rfid and mtmsrd quirk
On Wed, 25 Mar 2020 15:05:46 +1100, Michael Neuling wrote: > The ISA has a quirk that's useful for the Linux implementation. > Document it here so others are less likely to trip over it. Applied to powerpc/next. [1/1] powerpc/tm: Document h/rfid and mtmsrd quirk https://git.kernel.org/powerpc/c/b8707e2374f68cac79de553ae1ee5c35913813bd cheers
Re: [PATCH] powerpc: Fix misleading small cores print
On Fri, 29 May 2020 09:07:31 +1000, Michael Neuling wrote: > Currently when we boot on a big core system, we get this print: > [0.040500] Using small cores at SMT level > > This is misleading as we've actually detected big cores. > > This patch clears up the print to say we've detected big cores but are > using small cores for scheduling. Applied to powerpc/next. [1/1] powerpc: Fix misleading small cores print https://git.kernel.org/powerpc/c/82a7cebdd95cffa55449d6c1d97cc9b743a66056 cheers
Re: [PATCH] powerpc/configs: Add LIBNVDIMM to ppc64_defconfig
On Tue, 19 May 2020 14:30:09 +1000, Michael Neuling wrote:
> This gives us OF_PMEM which is useful in mambo.
>
> This adds 153K to the text of ppc64le_defconfig which is 0.8% of the
> total text.
>
> LIBNVDIMM      text     data      bss       dec      hex
> Without    18574833  5518150  1539240  25632223  1871ddf
> With       18727834  5546206  1539368  25813408  189e1a0

Applied to powerpc/next.

[1/1] powerpc/configs: Add LIBNVDIMM to ppc64_defconfig
      https://git.kernel.org/powerpc/c/08b1add150a8863665676d0ac9c3ad2d34b2540c

cheers
Re: [PATCH v2 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms
On Mon, 30 Mar 2020 15:32:15 +0200, Michal Simek wrote: > recently we wanted to update xilinx intc driver and we found that function > which we wanted to remove is still wired by ancient Xilinx PowerPC > platforms. Here is the thread about it. > https://lore.kernel.org/linux-next/48d3232d-0f1d-42ea-3109-f44bbabfa...@xilinx.com/ > > I have been talking about it internally and there is no interest in these > platforms and it is also orphan for quite a long time. None is really > running/testing these platforms regularly that's why I think it makes sense > to remove them also with drivers which are specific to this platform. > > [...] Applied to powerpc/next. [1/2] sound: ac97: Remove sound driver for ancient platform https://git.kernel.org/powerpc/c/f16dca3e30c14aff545a834a7c1a1bb02b9edb48 [2/2] powerpc: Remove Xilinx PPC405/PPC440 support https://git.kernel.org/powerpc/c/7ade8495dcfd788a76e6877c9ea86f5207369ea4 cheers
Re: [PATCH v3 1/1] powerpc/kernel: Enables memory hot-remove after reboot on pseries guests
On Thu, 2 Apr 2020 16:51:57 -0300, Leonardo Bras wrote: > While providing guests, it's desirable to resize their memory on demand. > > By now, it's possible to do so by creating a guest with a small base > memory, hot-plugging all the rest, and using 'movable_node' kernel > command-line parameter, which puts all hot-plugged memory in > ZONE_MOVABLE, allowing it to be removed whenever needed. > > [...] Applied to powerpc/next. [1/1] powerpc/kernel: Enables memory hot-remove after reboot on pseries guests https://git.kernel.org/powerpc/c/b6eca183e23e7a6625a0d2cdb806b7cd1abcd2d2 cheers
Re: [PATCH v3] powerpc/XIVE: SVM: share the event-queue page with the Hypervisor.
On Sat, 25 Apr 2020 19:05:18 -0700, Ram Pai wrote: > From 10ea2eaf492ca3f22f67a5a63a2b7865e45299ad Mon Sep 17 00:00:00 2001 > From: Ram Pai > Date: Mon, 24 Feb 2020 01:09:48 -0500 > Subject: [PATCH v3] powerpc/XIVE: SVM: share the event-queue page with the > Hypervisor. > > XIVE interrupt controller uses an Event Queue (EQ) to enqueue event > notifications when an exception occurs. The EQ is a single memory page > provided by the O/S defining a circular buffer, one per server and > priority couple. > > [...] Applied to powerpc/next. [1/1] powerpc/xive: Share the event-queue page with the Hypervisor. https://git.kernel.org/powerpc/c/094235222d41d68d35de18170058d94a96a82628 cheers
Re: [PATCH v6 0/2] Implement reentrant rtas call
On Mon, 18 May 2020 20:42:43 -0300, Leonardo Bras wrote: > Patch 2 implements rtas_call_reentrant() for reentrant rtas-calls: > "ibm,int-on", "ibm,int-off", "ibm,get-xive" and "ibm,set-xive", > according to LoPAPR Version 1.1 (March 24, 2016). > > For that, it's necessary that every call uses a different > rtas buffer (rtas_args). Paul Mackerras suggested using the PACA > structure for creating a per-cpu buffer for these calls. > > [...] Applied to powerpc/next. [1/2] powerpc/rtas: Move type/struct definitions from rtas.h into rtas-types.h https://git.kernel.org/powerpc/c/783a015b747f606e803b798eb8b50c73c548691d [2/2] powerpc/rtas: Implement reentrant rtas call https://git.kernel.org/powerpc/c/b664db8e3f976d9233cc9ea5e3f8a8c0bcabeb48 cheers
Re: [PATCH v2 1/1] powerpc/crash: Use NMI context for printk when starting to crash
On Tue, 12 May 2020 18:45:35 -0300, Leonardo Bras wrote: > Currently, if printk lock (logbuf_lock) is held by other thread during > crash, there is a chance of deadlocking the crash on next printk, and > blocking a possibly desired kdump. > > At the start of default_machine_crash_shutdown, make printk enter > NMI context, as it will use per-cpu buffers to store the message, > and avoid locking logbuf_lock. Applied to powerpc/next. [1/1] powerpc/crash: Use NMI context for printk when starting to crash https://git.kernel.org/powerpc/c/af2876b501e42c3fb5174cac9dd02598436f0fdf cheers
Re: [PATCH v10 0/5] powerpc/hv-24x7: Expose chip/sockets info to add json file metric support for the hv_24x7 socket/chip level events
On Mon, 25 May 2020 16:13:02 +0530, Kajol Jain wrote: > Patchset fixes the inconsistent results we are getting when > we run multiple 24x7 events. > > "hv_24x7" pmu interface events need system-dependent parameters > like socket/chip/core. For example, hv_24x7 chip-level events need > the specific chip-id for which the data is requested to be added as part > of the pmu event. > > [...] Applied to powerpc/next. [1/5] powerpc/perf/hv-24x7: Fix inconsistent output values incase multiple hv-24x7 events run https://git.kernel.org/powerpc/c/b4ac18eead28611ff470d0f47a35c4e0ac080d9c [2/5] powerpc/hv-24x7: Add rtas call in hv-24x7 driver to get processor details https://git.kernel.org/powerpc/c/8ba21426738207711347335b2cf3e99c690fc777 [3/5] powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show processor details https://git.kernel.org/powerpc/c/60beb65da1efd4cc23d05141181c39b98487950f [4/5] Documentation/ABI: Add ABI documentation for chips and sockets https://git.kernel.org/powerpc/c/15cd1d35ba4a59832df693858ef046457107bd8d [5/5] powerpc/pseries: Update hv-24x7 information after migration https://git.kernel.org/powerpc/c/373b373053384f12951ae9f916043d955501d482 cheers
Re: [PATCHv4] powerpc/crashkernel: take "mem=" option into account
On Wed, 1 Apr 2020 22:00:44 +0800, Pingfan Liu wrote: > 'mem=' option is an easy way to put high pressure on memory during some > test. Hence after applying the memory limit, instead of total mem, the > actual usable memory should be considered when reserving mem for > crashkernel. Otherwise the boot up may experience OOM issue. > > E.g. it would reserve 4G prior to the change and 512M afterward, if passing > crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and > mem=5G on a 256G machine. > > [...] Applied to powerpc/next. [1/1] powerpc/crashkernel: Take "mem=" option into account https://git.kernel.org/powerpc/c/be5470e0c285a68dc3afdea965032f5ddc8269d7 cheers
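The 4G versus 512M figures in the example above can be reproduced with a small user-space sketch of range-based crashkernel selection. This is not the kernel's own parser: `parse_size()` and `crashkernel_size()` are hypothetical helpers, only the `start-[end]:size` list form with K/M/G suffixes is handled, and a range is assumed to match when `start <= mem < end`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse "<number>[K|M|G]" and advance *s past it (hypothetical helper). */
static uint64_t parse_size(const char **s)
{
	char *end;
	uint64_t v = strtoull(*s, &end, 10);

	switch (*end) {
	case 'G': v <<= 10; /* fall through */
	case 'M': v <<= 10; /* fall through */
	case 'K': v <<= 10; end++; break;
	default: break;
	}
	*s = end;
	return v;
}

/*
 * Return the reservation size selected by a "start-[end]:size,..." list
 * for the given amount of memory, or 0 if no range matches.
 */
static uint64_t crashkernel_size(const char *spec, uint64_t mem)
{
	const char *p = spec;

	while (*p) {
		uint64_t start = parse_size(&p);
		uint64_t end = UINT64_MAX;	/* open-ended range by default */
		uint64_t size;

		if (*p != '-')
			return 0;		/* malformed entry */
		p++;
		if (*p != ':')
			end = parse_size(&p);
		if (*p != ':')
			return 0;		/* malformed entry */
		p++;
		size = parse_size(&p);

		if (mem >= start && mem < end)
			return size;
		if (*p != ',')
			break;
		p++;
	}
	return 0;
}
```

Feeding it the total memory (256G) selects the `128G-:4G` entry, while feeding it the memory left after `mem=5G` selects `4G-16G:512M`, which is the difference the patch is about.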
Re: [PATCH] powerpc/fadump: account for memory_limit while reserving memory
On Wed, 27 May 2020 15:14:35 +0530, Hari Bathini wrote: > If the memory chunk found for reserving memory overshoots the memory > limit imposed, do not proceed with reserving memory. Default behavior > was this until commit 140777a3d8df ("powerpc/fadump: consider reserved > ranges while reserving memory") changed it unwittingly. Applied to powerpc/next. [1/1] powerpc/fadump: Account for memory_limit while reserving memory https://git.kernel.org/powerpc/c/9a2921e5baca1d25eb8d21f21d1e90581a6d0f68 cheers
Re: [PATCH] macintosh/ams-input: switch to using input device polling mode
On Wed, 2 Oct 2019 14:48:54 -0700, Dmitry Torokhov wrote: > Now that instances of input_dev support polling mode natively, > we no longer need to create input_polled_dev instance. Applied to powerpc/next. [1/1] macintosh/ams-input: switch to using input device polling mode https://git.kernel.org/powerpc/c/0c444d98efad89e2a189d1a5a188e0385edac647 cheers
Re: [PATCH v5 00/13] Modernise powerpc 40x
On Thu, 21 May 2020 16:55:51 +0000 (UTC), Christophe Leroy wrote: > v1 and v2 of this series were aiming at removing 40x entirely, > but it led to protests. > > v3 is trying to start modernising powerpc 40x: > - Rework TLB miss handlers to not use PTE_ATOMIC_UPDATES and _PAGE_HWWRITE > - Remove old versions of 40x processors, namely 403 and 405GP and associated > errata. > - Last two patches are trivial changes in TLB miss handlers to reduce number > of scratch registers. > > [...] Applied to powerpc/next. [01/13] powerpc: Remove Xilinx PPC405/PPC440 support https://git.kernel.org/powerpc/c/7ade8495dcfd788a76e6877c9ea86f5207369ea4 [02/13] powerpc/40x: Rework 40x PTE access and TLB miss https://git.kernel.org/powerpc/c/2c74e2586bb96012ffc05f1c819b05d9cad86d6e [03/13] powerpc/pgtable: Drop PTE_ATOMIC_UPDATES https://git.kernel.org/powerpc/c/4e1df545e2fae53e07c93b835c3dcc9d4917c849 [04/13] powerpc/40x: Remove support for IBM 403GCX https://git.kernel.org/powerpc/c/1b5c0967ab8aa9424cdd5108de4e055d8aeaa9d0 [05/13] powerpc/40x: Remove STB03xxx https://git.kernel.org/powerpc/c/7583b63c343c1076c89b2012fd8758473f046f5f [06/13] powerpc/40x: Remove WALNUT https://git.kernel.org/powerpc/c/5786074b96e38691a0cb3d3644ca2aa5d6d8830d [07/13] powerpc/40x: Remove EP405 https://git.kernel.org/powerpc/c/548f5244f1064c9facb19c5e97c21e1e80102ea0 [08/13] powerpc/40x: Remove support for ISS Simulator https://git.kernel.org/powerpc/c/2874ec75708eed59a47a9a986c02add747ae6e9b [09/13] powerpc/40x: Remove support for IBM 405GP https://git.kernel.org/powerpc/c/7d372d4ccdd55d5ead4d4ecbc336af4dd7d04344 [10/13] powerpc/40x: Remove IBM405 Erratum #51 https://git.kernel.org/powerpc/c/59fb463b48e904dfdfff64c7dd4d67f20ae27170 [11/13] powerpc: Remove IBM405 Erratum #77 https://git.kernel.org/powerpc/c/455531e9d88048c025ff9099796413df748d92b9 [12/13] powerpc/40x: Avoid using r12 in TLB miss handlers https://git.kernel.org/powerpc/c/797f4016f6da4a90ac83e32b213b68ff7be3812b [13/13] powerpc/40x: Don't save CR in SPRN_SPRG_SCRATCH6 https://git.kernel.org/powerpc/c/3aacaa719b7bf135551cabde2480e8f7bfdf7c7d cheers
Re: [PATCH 0/3] powerpc/xive: PCI hotplug fixes under PowerVM
On Wed, 29 Apr 2020 09:51:19 +0200, Cédric Le Goater wrote: > Here are a couple of fixes for PCI hotplug issues for machines running > under the POWER hypervisor using hash MMU and the XIVE interrupt mode. > > Commit 1ca3dec2b2df ("powerpc/xive: Prevent page fault issues in the > machine crash handler") forced the mapping of the XIVE ESB page and > this is now blocking the removal of a passthrough IO adapter because > the PCI isolation fails with "valid outstanding translations". Under > KVM, the ESB pages for the adapter interrupts are un-mapped from the > guest by the hypervisor in the KVM XIVE native device. This is now > redundant but it's harmless. > > [...] Patches 1 & 3 applied to powerpc/next. [1/3] powerpc/xive: Clear the page tables for the ESB IO mapping https://git.kernel.org/powerpc/c/a101950fcb78b0ba20cd487be6627dea58d55c2b [3/3] powerpc/xive: Do not expose a debugfs file when XIVE is disabled https://git.kernel.org/powerpc/c/0755e85570a4615ca674ad6489d44d63916f1f3e cheers
Re: [PATCH v4 00/45] Use hugepages to map kernel mem on 8xx
On Tue, 19 May 2020 05:48:42 +0000 (UTC), Christophe Leroy wrote: > The main purpose of this big series is to: > - reorganise huge page handling to avoid using mm_slices. > - use huge pages to map kernel memory on the 8xx. > > The 8xx supports 4 page sizes: 4k, 16k, 512k and 8M. > It uses 2 Level page tables, PGD having 1024 entries, each entry > covering 4M address space. Then each page table has 1024 entries. > > [...] Patches 1-6 and 9-45 applied to powerpc/next. [01/45] powerpc/kasan: Fix error detection on memory allocation https://git.kernel.org/powerpc/c/d132443a73d7a131775df46f33000f67ed92de1e [02/45] powerpc/kasan: Fix issues by lowering KASAN_SHADOW_END https://git.kernel.org/powerpc/c/3a66a24f6060e6775f8c02ac52329ea0152d7e58 [03/45] powerpc/kasan: Fix shadow pages allocation failure https://git.kernel.org/powerpc/c/d2a91cef9bbdeb87b7449fdab1a6be6000930210 [04/45] powerpc/kasan: Remove unnecessary page table locking https://git.kernel.org/powerpc/c/7c31c05e00fc5ff2067332c5f80e525573e7269c [05/45] powerpc/kasan: Refactor update of early shadow mappings https://git.kernel.org/powerpc/c/7dec42ab57f2f59feba82abf0353164479bfde4c [06/45] powerpc/kasan: Declare kasan_init_region() weak https://git.kernel.org/powerpc/c/ec97d022f621c6c850aec46d8818b49c6aae95ad [09/45] powerpc/ptdump: Add _PAGE_COHERENT flag https://git.kernel.org/powerpc/c/3af4786eb429b2df76cbd7ce3bae21467ac3e4fb [10/45] powerpc/ptdump: Display size of BATs https://git.kernel.org/powerpc/c/6b30830e2003d9d77696084ebe2fc19dbe7d6f70 [11/45] powerpc/ptdump: Standardise display of BAT flags https://git.kernel.org/powerpc/c/8961a2a5353cca5451f648f4838cd848a3b2354c [12/45] powerpc/ptdump: Properly handle non standard page size https://git.kernel.org/powerpc/c/b00ff6d8c1c3898b0f768cbb38ef722d25bd2f39 [13/45] powerpc/ptdump: Handle hugepd at PGD level https://git.kernel.org/powerpc/c/6b789a26d7da2e0256d199da980369ef8fb49ec6 [14/45] powerpc/32s: Don't warn when mapping RO data ROX. https://git.kernel.org/powerpc/c/4b19f96a81bceaf0bcf44d79c0855c61158065ec [15/45] powerpc/mm: Allocate static page tables for fixmap https://git.kernel.org/powerpc/c/925ac141d106b55acbe112a9272f970631a3c082 [16/45] powerpc/mm: Fix conditions to perform MMU specific management by blocks on PPC32. https://git.kernel.org/powerpc/c/4e3319c23a66dabfd6c35f4d2633d64d99b68096 [17/45] powerpc/mm: PTE_ATOMIC_UPDATES is only for 40x https://git.kernel.org/powerpc/c/fadaac67c9007cad9fc485e36dcc54460d6d5886 [18/45] powerpc/mm: Refactor pte_update() on nohash/32 https://git.kernel.org/powerpc/c/2db99aeb63dd6e8808dc054d181c4d0e8645bbe0 [19/45] powerpc/mm: Refactor pte_update() on book3s/32 https://git.kernel.org/powerpc/c/1c1bf294882bd12669e39ccd7680c4ce34b7c15c [20/45] powerpc/mm: Standardise __ptep_test_and_clear_young() params between PPC32 and PPC64 https://git.kernel.org/powerpc/c/c7fa77016eb6093df38fdabdb7a89bb9617e7185 [21/45] powerpc/mm: Standardise pte_update() prototype between PPC32 and PPC64 https://git.kernel.org/powerpc/c/06f52524870122fb43b214d27e8f4546da36f8ba [22/45] powerpc/mm: Create a dedicated pte_update() for 8xx https://git.kernel.org/powerpc/c/6ad41bfbc907be0cd414f09fa5382d2133376595 [23/45] powerpc/mm: Reduce hugepd size for 8M hugepages on 8xx https://git.kernel.org/powerpc/c/b12c07a4bb064c0a8db7554557b89d40f57c936f [24/45] powerpc/8xx: Drop CONFIG_8xx_COPYBACK option https://git.kernel.org/powerpc/c/d3efcd38c0b99162d889e36a30425345a18edb33 [25/45] powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages. https://git.kernel.org/powerpc/c/a891c43b97d315ee5f9fe8e797d3d48fc351e053 [26/45] powerpc/8xx: Manage 512k huge pages as standard pages. https://git.kernel.org/powerpc/c/b250c8c08c79d1eb5354c7eaa84b7505f5f2d921 [27/45] powerpc/8xx: Only 8M pages are hugepte pages now https://git.kernel.org/powerpc/c/d4870b89acd7c362ded08f9295e8d143cf7e0024 [28/45] powerpc/8xx: MM_SLICE is not needed anymore https://git.kernel.org/powerpc/c/555904d07eef3a2e5fc458419edf6174362c4ddd [29/45] powerpc/8xx: Move PPC_PIN_TLB options into 8xx Kconfig https://git.kernel.org/powerpc/c/5d4656696c30cef56b2ab506b203533c818af04d [30/45] powerpc/8xx: Add function to set pinned TLBs https://git.kernel.org/powerpc/c/f76c8f6d257cefda60221c83af7f97d9f74cb3ce [31/45] powerpc/8xx: Don't set IMMR map anymore at boot https://git.kernel.org/powerpc/c/136a9a0f74d2e0d9de5515190fe80344b86b45cf [32/45] powerpc/8xx: Always pin TLBs at startup. https://git.kernel.org/powerpc/c/684c1664e0de63398aceb748343541b48d398710 [33/45] powerpc/8xx: Drop special handling of Linear and IMMR mappings in I/D TLB handlers https://git.kernel.org/powerpc/c/400dc0f86102d2ad11d3601f1948fbb02e926431 [34/45] powerpc/8xx: Remove now unused TLB m
Re: [PATCH v2] powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG
On Sat, 30 May 2020 17:16:33 + (UTC), Christophe Leroy wrote: > 'thread' doesn't exist in kuap_check() macro. > > Use 'current' instead. Applied to powerpc/next. [1/1] powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG https://git.kernel.org/powerpc/c/74016701fe5f873ae23bf02835407227138d874d cheers
Re: [PATCH] powerpc/32: disable KASAN with pages bigger than 16k
On Thu, 28 May 2020 10:17:04 + (UTC), Christophe Leroy wrote: > Mapping of early shadow area is implemented by using a single static > page table having all entries pointing to the same early shadow page. > The shadow area must therefore occupy full PGD entries. > > The shadow area has a size of 128Mbytes starting at 0xf800. > With 4k pages, a PGD entry is 4Mbytes > With 16k pages, a PGD entry is 64Mbytes > With 64k pages, a PGD entry is 256Mbytes which is too big. > > [...] Applied to powerpc/next. [1/1] powerpc/32: Disable KASAN with pages bigger than 16k https://git.kernel.org/powerpc/c/888468ce725a4cd56d72dc7e5096078f7a9251a0 cheers
Re: [PATCH v2 01/12] powerpc/52xx: Blacklist functions running with MMU disabled for kprobe
On Tue, 31 Mar 2020 16:03:36 + (UTC), Christophe Leroy wrote: > kprobe does not handle events happening in real mode, all > functions running with MMU disabled have to be blacklisted. Applied to powerpc/next. [01/12] powerpc/52xx: Blacklist functions running with MMU disabled for kprobe https://git.kernel.org/powerpc/c/e83f01fdb9143a4f90b17fbf7d8b8b21efb2f968 [02/12] powerpc/82xx: Blacklist pq2_restart() for kprobe https://git.kernel.org/powerpc/c/1740f15a99d30a5e2710b2b0754e65fc5ba68d1d [03/12] powerpc/83xx: Blacklist mpc83xx_deep_resume() for kprobe https://git.kernel.org/powerpc/c/7aa85127b1a170694b042cbc35a07afe3904173e [04/12] powerpc/powermac: Blacklist functions running with MMU disabled for kprobe https://git.kernel.org/powerpc/c/32a820670fa00419375a964ca8bc569e1499b90d [05/12] powerpc/mem: Blacklist flush_dcache_icache_phys() for kprobe https://git.kernel.org/powerpc/c/a64371b5d4fb37199dcd04cb7bf0132894018e33 [06/12] powerpc/32s: Make local symbols non visible in hash_low. https://git.kernel.org/powerpc/c/f892c21d2efb3b86ecbf8f5a95ea4abeedcc91b0 [07/12] powerpc/32s: Blacklist functions running with MMU disabled for kprobe https://git.kernel.org/powerpc/c/e6209318d63e2774c5ab214b14b948079e040064 [08/12] powerpc/rtas: Remove machine_check_in_rtas() https://git.kernel.org/powerpc/c/32746dfe4cf37f4077929601e8877a7fd02676e8 [09/12] powerpc/32: Blacklist functions running with MMU disabled for kprobe https://git.kernel.org/powerpc/c/5f32e8361cba8c58c4f272a389296f489ecc2823 [10/12] powerpc/entry32: Blacklist exception entry points for kprobe. https://git.kernel.org/powerpc/c/a616c442119f2ea5641e6abc215d7255b73b982b [11/12] powerpc/entry32: Blacklist syscall exit points for kprobe. https://git.kernel.org/powerpc/c/7cdf4401388572f720403a7038a178a4b30ac14c [12/12] powerpc/entry32: Blacklist exception exit points for kprobe. https://git.kernel.org/powerpc/c/e51c3e13709fe55d4d0eb50ba435bc53a64152bf cheers
Re: [PATCH] powerpc/uaccess: Don't set KUEP by default on book3s/32
On Wed, 15 Apr 2020 14:57:11 + (UTC), Christophe Leroy wrote:
> On book3s/32, KUEP is a heavy process as it requires
> setting/unsetting the NX bit in each of the 12 user segments
> every time the kernel is entered/exited from/to user space.
>
> Don't select KUEP by default on book3s/32.

Applied to powerpc/next.

[1/1] powerpc/uaccess: Don't set KUEP by default on book3s/32
      https://git.kernel.org/powerpc/c/c3ba4dbbd1d05b49ec01efe098e0a78857d3ce22

cheers
Re: [PATCH] powerpc/uaccess: Don't set KUAP by default on book3s/32
On Wed, 15 Apr 2020 14:57:09 + (UTC), Christophe Leroy wrote:
> On book3s/32, KUAP is a heavy process as it requires
> determining which segments are impacted and unlocking/locking
> each of them.
>
> And since the implementation of user_access_begin/end, it
> is even worse for the time being because unlike __get_user(),
> user_access_begin doesn't differentiate between read and write
> and unlocks access also for read, although that's unneeded
> on book3s/32.
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/uaccess: Don't set KUAP by default on book3s/32
      https://git.kernel.org/powerpc/c/547e687b2981a115814962506068873d24983af7

cheers
Re: [PATCH] powerpc/kprobes: Use probe_address() to read instructions
On Mon, 24 Feb 2020 18:02:10 + (UTC), Christophe Leroy wrote: > In order to avoid Oopses, use probe_address() to read the > instruction at the address where the trap happened. Applied to powerpc/next. [1/1] powerpc/kprobes: Use probe_address() to read instructions https://git.kernel.org/powerpc/c/9ed5df69b79a22b40b20bc2132ba2495708b19c4 cheers
Re: [PATCH] powerpc/8xx: Reduce time spent in allow_user_access() and friends
On Wed, 15 Apr 2020 10:06:09 + (UTC), Christophe Leroy wrote: > To enable/disable kernel access to user space, the 8xx has to > modify the properties of access group 1. This is done by writing > predefined values into SPRN_Mx_AP registers. > > As of today, a __put_user() gives: > > 0d64 : > d64: 3d 20 4f ff lis r9,20479 > d68: 61 29 ff ff ori r9,r9,65535 > d6c: 7d 3a c3 a6 mtspr 794,r9 > d70: 39 20 00 00 li r9,0 > d74: 90 83 00 00 stw r4,0(r3) > d78: 3d 20 6f ff lis r9,28671 > d7c: 61 29 ff ff ori r9,r9,65535 > d80: 7d 3a c3 a6 mtspr 794,r9 > d84: 4e 80 00 20 blr > > [...] Applied to powerpc/next. [1/1] powerpc/8xx: Reduce time spent in allow_user_access() and friends https://git.kernel.org/powerpc/c/332ce969b763553e9c4d55069e1e15aba4ea560f cheers
Re: [PATCH -next] powerpc/powernv: add NULL check after kzalloc
On Sat, 9 May 2020 10:08:38 +0800, Chen Zhou wrote: > Fixes coccicheck warning: > > ./arch/powerpc/platforms/powernv/opal.c:813:1-5: > alloc with no test, possible model on line 814 > > Add NULL check after kzalloc. Applied to powerpc/next. [1/1] powerpc/powernv: add NULL check after kzalloc https://git.kernel.org/powerpc/c/ceffa63acce7165c442395b7d64a11ab8b5c5dca cheers
Re: [PATCH v3] powerpc/64s/pgtable: fix an undefined behaviour
On Thu, 5 Mar 2020 23:48:52 -0500, Qian Cai wrote:
> Booting a power9 server with hash MMU could trigger an undefined
> behaviour because pud_offset(p4d, 0) will do,
>
> 0 >> (PAGE_SHIFT:16 + PTE_INDEX_SIZE:8 + H_PMD_INDEX_SIZE:10)
>
> Fix it by converting pud_index() and friends to static inline
> functions.
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/64s/pgtable: fix an undefined behaviour
      https://git.kernel.org/powerpc/c/c2e929b18cea6cbf71364f22d742d9aad7f4677a

cheers
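The undefined behaviour quoted above comes from shifting a plain int constant by more bits than the type holds. Below is a minimal userspace sketch of why the inline-function form helps. The shift constants mirror the values quoted in the commit message; PTRS_PER_PUD_DEMO and the name pud_index_demo() are made up for illustration and are not the kernel's definitions.

```c
#include <stdint.h>

/* Values quoted in the commit message (hash MMU, 64K pages). */
#define PAGE_SHIFT        16
#define PTE_INDEX_SIZE    8
#define H_PMD_INDEX_SIZE  10
#define PUD_SHIFT_DEMO    (PAGE_SHIFT + PTE_INDEX_SIZE + H_PMD_INDEX_SIZE) /* 34 */

/* Hypothetical table size, chosen only so there is a mask to apply. */
#define PTRS_PER_PUD_DEMO 1024

/*
 * Macro form, left commented out on purpose: pud_index_macro(0) would
 * expand to ((0) >> 34) & ..., and shifting a 32-bit int by 34 bits is
 * undefined in C.
 *
 * #define pud_index_macro(addr) \
 *	(((addr) >> PUD_SHIFT_DEMO) & (PTRS_PER_PUD_DEMO - 1))
 */

/* Function form: the argument is converted to a 64-bit type before the shift. */
static inline uint64_t pud_index_demo(uint64_t address)
{
	return (address >> PUD_SHIFT_DEMO) & (PTRS_PER_PUD_DEMO - 1);
}
```

With the function form, passing a literal 0 is well defined because the value is widened to 64 bits at the call boundary, which is presumably the point of the conversion described in the patch.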
Re: [PATCH] powerpc/book3s64/radix/tlb: Determine hugepage flush correctly
On Wed, 13 May 2020 08:36:16 +0530, Aneesh Kumar K.V wrote:
> With a 64K page size flush with start and end value as below
> (start, end) = (721f680d, 721f680e) results in
> (hstart, hend) = (721f6820, 721f6800)
>
> Avoid doing a __tlbie_va_range with the wrong hstart and hend value in this
> case.
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/book3s64/radix/tlb: Determine hugepage flush correctly
      https://git.kernel.org/powerpc/c/8f53f9c0f68ab2168f637494b9e24034899c1310

cheers
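The arithmetic behind the report can be sketched in userspace as follows. This is only an illustration under stated assumptions: a 2M huge page size (PMD_SIZE_DEMO), a made-up helper name huge_flush_range(), and an "hstart < hend" test as one plausible way to decide whether a huge-page flush is needed. It is not taken from the applied patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define PMD_SIZE_DEMO  (UINT64_C(1) << 21)      /* assumed 2M huge page */
#define PMD_MASK_DEMO  (~(PMD_SIZE_DEMO - 1))

/*
 * Round start up and end down to a huge-page boundary. For a range
 * smaller than one huge page, the rounded start can land above the
 * rounded end, so a huge-page flush only makes sense if hstart < hend.
 */
static inline bool huge_flush_range(uint64_t start, uint64_t end,
				    uint64_t *hstart, uint64_t *hend)
{
	*hstart = (start + PMD_SIZE_DEMO - 1) & PMD_MASK_DEMO;
	*hend = end & PMD_MASK_DEMO;
	return *hstart < *hend;
}
```

For a single 64K page in the middle of a 2M region, the rounded start ends up one huge page above the rounded end, which is the inverted (hstart, hend) pair the commit message describes.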
Re: [PATCH] powerpc/book3s64/kvm: Fix secondary page table walk warning during migration
On Thu, 28 May 2020 13:34:56 +0530, Aneesh Kumar K.V wrote: > This patch fix the below warning reported during migration. > > find_kvm_secondary_pte called with kvm mmu_lock not held > CPU: 23 PID: 5341 Comm: qemu-system-ppc Tainted: GW > 5.7.0-rc5-kvm-00211-g9ccf10d6d088 #432 > NIP: c00800fe848c LR: c00800fe8488 CTR: > REGS: c01e19f077e0 TRAP: 0700 Tainted: GW > (5.7.0-rc5-kvm-00211-g9ccf10d6d088) > MSR: 90029033 CR: 4422 XER: 2004 > CFAR: c012f5ac IRQMASK: 0 > GPR00: c00800fe8488 c01e19f07a70 c00800ffe200 0039 > GPR04: 0001 c01ffc8b4900 00018840 0007 > GPR08: 0003 0001 0007 0001 > GPR12: 2000 c01fff6d9400 00011f884678 7fff70b7 > GPR16: 7fff7137cb90 7fff7dcb4410 0001 > GPR20: 0ffe 0001 > GPR24: 8000 0001 c01e1f67e600 c01e1fd82410 > GPR28: 1000 c01e2e41 0fff 0ffe > NIP [c00800fe848c] kvmppc_hv_get_dirty_log_radix+0x2e4/0x340 [kvm_hv] > LR [c00800fe8488] kvmppc_hv_get_dirty_log_radix+0x2e0/0x340 [kvm_hv] > Call Trace: > [c01e19f07a70] [c00800fe8488] > kvmppc_hv_get_dirty_log_radix+0x2e0/0x340 [kvm_hv] (unreliable) > [c01e19f07b50] [c00800fd42e4] > kvm_vm_ioctl_get_dirty_log_hv+0x33c/0x3c0 [kvm_hv] > [c01e19f07be0] [c00800eea878] kvm_vm_ioctl_get_dirty_log+0x30/0x50 > [kvm] > [c01e19f07c00] [c00800edc818] kvm_vm_ioctl+0x2b0/0xc00 [kvm] > [c01e19f07d50] [c046e148] ksys_ioctl+0xf8/0x150 > [c01e19f07da0] [c046e1c8] sys_ioctl+0x28/0x80 > [c01e19f07dc0] [c003652c] system_call_exception+0x16c/0x240 > [c01e19f07e20] [c000d070] system_call_common+0xf0/0x278 > Instruction dump: > 7d3a512a 4200ffd0 7ffefb78 4bfffdc4 6000 3c82 e8848468 3c62 > e86384a8 38840010 4800673d e8410018 <0fe0> 4bfffdd4 6000 6000 Applied to powerpc/next. [1/1] powerpc/book3s64/kvm: Fix secondary page table walk warning during migration https://git.kernel.org/powerpc/c/bf8036a4098d1548cdccf9ed5c523ef4e83e3c68 cheers
Re: [PATCH v3 0/7] Base support for POWER10
On Thu, 21 May 2020 11:43:34 +1000, Alistair Popple wrote: > This series brings together several previously posted patches required for > POWER10 support and introduces a new patch enabling POWER10 architected > mode to enable booting as a POWER10 pseries guest. > > It includes support for enabling facilities related to MMA and prefix > instructions. > > [...] Patches 1-3 and 5-7 applied to powerpc/next. [1/7] powerpc: Add new HWCAP bits https://git.kernel.org/powerpc/c/ee988c11acf6f9464b7b44e9a091bf6afb3b3a49 [2/7] powerpc: Add support for ISA v3.1 https://git.kernel.org/powerpc/c/3fd5836ee801ab9ac5b314c26550e209bafa5eaa [3/7] powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selected https://git.kernel.org/powerpc/c/43d0d37acbe40a9a93d9891ca670638cd22116b1 [5/7] powerpc/dt_cpu_ftrs: Enable Prefixed Instructions https://git.kernel.org/powerpc/c/c63d688c3dabca973c5a7da73d17422ad13f3737 [6/7] powerpc/dt_cpu_ftrs: Add MMA feature https://git.kernel.org/powerpc/c/87939d50e5888bd78478d9aa9455f56b919df658 [7/7] powerpc: Add POWER10 architected mode https://git.kernel.org/powerpc/c/a3ea40d5c7365e7e5c7c85b6f30b15142b397571 cheers
Re: [PATCH] ocxl: Fix misleading comment
On Wed, 26 Feb 2020 15:39:23 +1100, Andrew Donnellan wrote: > In ocxl_context_free() we note that the AFU reference we're releasing was > taken in "ocxl_context_init", a function that doesn't actually exist. > > Fix it to say ocxl_context_alloc() instead, which I expect was what was > intended. Applied to powerpc/next. [1/1] ocxl: Fix misleading comment https://git.kernel.org/powerpc/c/a0594e89c9dc8e37883cc0d6642d1baad9c0744e cheers
Re: [PATCH] cxl: Remove dead Kconfig options
On Tue, 2 Jun 2020 14:03:41 +1000, Andrew Donnellan wrote: > The CXL_AFU_DRIVER_OPS and CXL_LIB Kconfig options were added to coordinate > merging of new features. They no longer serve any purpose, so remove them. Applied to powerpc/next. [1/1] cxl: Remove dead Kconfig options https://git.kernel.org/powerpc/c/f44b85da5e7450d0308695ba6f503d75fe6cc166 cheers
Re: [PATCH 5/5] powerpc: Add LKDTM test to hijack a patch mapping
On Wed Jun 3, 2020 at 9:20 AM, Christophe Leroy wrote: > > > > > Le 03/06/2020 à 07:19, Christopher M. Riedl a écrit : > > When live patching with STRICT_KERNEL_RWX, the CPU doing the patching > > must use a temporary mapping which allows for writing to kernel text. > > During the entire window of time when this temporary mapping is in use, > > another CPU could write to the same mapping and maliciously alter kernel > > text. Implement a LKDTM test to attempt to exploit such a openings when > > a CPU is patching under STRICT_KERNEL_RWX. The test is only implemented > > on powerpc for now. > > > > The LKDTM "hijack" test works as follows: > > > > 1. A CPU executes an infinite loop to patch an instruction. > >This is the "patching" CPU. > > 2. Another CPU attempts to write to the address of the temporary > >mapping used by the "patching" CPU. This other CPU is the > >"hijacker" CPU. The hijack either fails with a segfault or > >succeeds, in which case some kernel text is now overwritten. > > > > How to run the test: > > > > mount -t debugfs none /sys/kernel/debug > > (echo HIJACK_PATCH > /sys/kernel/debug/provoke-crash/DIRECT) > > > > Signed-off-by: Christopher M. 
Riedl > > --- > > drivers/misc/lkdtm/core.c | 1 + > > drivers/misc/lkdtm/lkdtm.h | 1 + > > drivers/misc/lkdtm/perms.c | 101 + > > 3 files changed, 103 insertions(+) > > > > diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c > > index a5e344df9166..482e72f6a1e1 100644 > > --- a/drivers/misc/lkdtm/core.c > > +++ b/drivers/misc/lkdtm/core.c > > @@ -145,6 +145,7 @@ static const struct crashtype crashtypes[] = { > > CRASHTYPE(WRITE_RO), > > CRASHTYPE(WRITE_RO_AFTER_INIT), > > CRASHTYPE(WRITE_KERN), > > + CRASHTYPE(HIJACK_PATCH), > > CRASHTYPE(REFCOUNT_INC_OVERFLOW), > > CRASHTYPE(REFCOUNT_ADD_OVERFLOW), > > CRASHTYPE(REFCOUNT_INC_NOT_ZERO_OVERFLOW), > > diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h > > index 601a2156a0d4..bfcf3542370d 100644 > > --- a/drivers/misc/lkdtm/lkdtm.h > > +++ b/drivers/misc/lkdtm/lkdtm.h > > @@ -62,6 +62,7 @@ void lkdtm_EXEC_USERSPACE(void); > > void lkdtm_EXEC_NULL(void); > > void lkdtm_ACCESS_USERSPACE(void); > > void lkdtm_ACCESS_NULL(void); > > +void lkdtm_HIJACK_PATCH(void); > > > > /* lkdtm_refcount.c */ > > void lkdtm_REFCOUNT_INC_OVERFLOW(void); > > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c > > index 62f76d506f04..8bda3b56bc78 100644 > > --- a/drivers/misc/lkdtm/perms.c > > +++ b/drivers/misc/lkdtm/perms.c > > @@ -9,6 +9,7 @@ > > #include > > #include > > #include > > +#include > > #include > > > > /* Whether or not to fill the target memory area with do_nothing(). */ > > @@ -213,6 +214,106 @@ void lkdtm_ACCESS_NULL(void) > > *ptr = tmp; > > } > > > > +#if defined(CONFIG_PPC) && defined(CONFIG_STRICT_KERNEL_RWX) > > > Why only PPC ? I understood that this applies also to x86. And > regarless, the test should be able to run on other architectures, > allthought for sure it will fail. That's the case for other tests. > I think the code patching details are different between architectures and (for now) I am only comfortable enough with PPC to implement something meaningful. 
The intent of the RFC versions was to try to get some interest (hence the distribution to the hardening list) or feedback about how this could work on other architectures. There are a few other tests which are arch specific in LKDTM so it's not completely unheard of :) > > > +#include > > + > > +extern unsigned long read_cpu_patching_addr(unsigned int cpu); > > > 'extern' keyword is useless for functions and shall be banned. > > > Shouldn't this declaration be in asm/code-patching.h ? > Yes, left-over from the RFC version, this will be fixed in the next spin. > > > + > > +static struct ppc_inst * const patch_site = (struct ppc_inst *)&do_nothing; > > + > > +static int lkdtm_patching_cpu(void *data) > > +{ > > + int err = 0; > > + struct ppc_inst insn = ppc_inst(0xdeadbeef); > > + > > + pr_info("starting patching_cpu=%d\n", smp_processor_id()); > > + do { > > + err = patch_instruction(patch_site, insn); > > + } while (ppc_inst_equal(ppc_inst_read(READ_ONCE(patch_site)), insn) && > > + !err && !kthread_should_stop()); > > + > > + if (err) > > + pr_warn("patch_instruction returned error: %d\n", err); > > + > > + set_current_state(TASK_INTERRUPTIBLE); > > + while (!kthread_should_stop()) { > > + schedule(); > > + set_current_state(TASK_INTERRUPTIBLE); > > + } > > + > > + return err; > > +} > > + > > +void lkdtm_HIJACK_PATCH(void) > > +{ > > + struct task_struct *patching_kthrd; > > + struct ppc_inst original_insn; > > + int patching_cpu, hijacker_cpu, attempts; > > + unsigned long addr; > > + bool hijacked; > > + > > + if (n
Re: [PATCH 4/5] powerpc/lib: Add LKDTM accessor for patching addr
On Wed Jun 3, 2020 at 9:14 AM, Christophe Leroy wrote:
>
>
> Le 03/06/2020 à 07:19, Christopher M. Riedl a écrit :
> > When live patching a STRICT_RWX kernel, a mapping is installed at a
> > "patching address" with temporary write permissions. Provide a
> > LKDTM-only accessor function for this address in preparation for a LKDTM
> > test which attempts to "hijack" this mapping by writing to it from
> > another CPU.
> >
> > Signed-off-by: Christopher M. Riedl
> > ---
> >  arch/powerpc/lib/code-patching.c | 7 +++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/arch/powerpc/lib/code-patching.c
> > b/arch/powerpc/lib/code-patching.c
> > index df0765845204..c23453049116 100644
> > --- a/arch/powerpc/lib/code-patching.c
> > +++ b/arch/powerpc/lib/code-patching.c
> > @@ -52,6 +52,13 @@ int raw_patch_instruction(struct ppc_inst *addr, struct
> > ppc_inst instr)
> >  static struct mm_struct *patching_mm __ro_after_init;
> >  static unsigned long patching_addr __ro_after_init;
> >
> > +#ifdef CONFIG_LKDTM
> > +unsigned long read_cpu_patching_addr(unsigned int cpu)
>
>
> If this function is not static, it means it is intended to be used from
> some other C file, so it should be declared in a .h too.
>

Yup agreed. This was left-over from the RFC to simplify using the LKDTM
test on a tree without this series. Will fix this in the next spin.

>
> Christophe
>
>
> > +{
> > +	return patching_addr;
> > +}
> > +#endif
> > +
> >  void __init poking_init(void)
> >  {
> >  	spinlock_t *ptl; /* for protecting pte table */
> >
>
>
>
[PATCH v2] selftests: powerpc: Fix CPU affinity for child process
On systems with a large number of CPUs, the test fails when trying to
set affinity for the child process, because sched_setaffinity() is called
with a cpuset size that is too small. This patch fixes it by making sure
that the size of the allocated CPU set depends on the number of CPUs as
reported by get_nprocs().

Fixes: 00b7ec5c9cf3 ("selftests/powerpc: Import Anton's context_switch2 benchmark")
Reported-by: Shirisha Ganta
Signed-off-by: Harish
Signed-off-by: Sandipan Das
---
 .../powerpc/benchmarks/context_switch.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/powerpc/benchmarks/context_switch.c b/tools/testing/selftests/powerpc/benchmarks/context_switch.c
index a2e8c9da7fa5..de6c49d6f88f 100644
--- a/tools/testing/selftests/powerpc/benchmarks/context_switch.c
+++ b/tools/testing/selftests/powerpc/benchmarks/context_switch.c
@@ -19,6 +19,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -104,8 +105,9 @@ static void start_thread_on(void *(*fn)(void *), void *arg, unsigned long cpu)
 
 static void start_process_on(void *(*fn)(void *), void *arg, unsigned long cpu)
 {
-	int pid;
-	cpu_set_t cpuset;
+	int pid, ncpus;
+	cpu_set_t *cpuset;
+	size_t size;
 
 	pid = fork();
 	if (pid == -1) {
@@ -116,12 +118,16 @@ static void start_process_on(void *(*fn)(void *), void *arg, unsigned long cpu)
 	if (pid)
 		return;
 
-	CPU_ZERO(&cpuset);
-	CPU_SET(cpu, &cpuset);
+	ncpus = get_nprocs();
+	size = CPU_ALLOC_SIZE(ncpus);
+	cpuset = CPU_ALLOC(ncpus);
+	CPU_ZERO_S(size, cpuset);
+	CPU_SET_S(cpu, size, cpuset);
 
-	if (sched_setaffinity(0, sizeof(cpuset), &cpuset)) {
+	if (sched_setaffinity(0, size, cpuset)) {
 		perror("sched_setaffinity");
-		exit(1);
+		CPU_FREE(cpuset);
+		exit(-1);
 	}
 
 	fn(arg);
-- 
2.24.1
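For readers unfamiliar with the dynamically sized CPU set macros, here is a standalone sketch of the same technique. The helper names single_cpu_set() and pin_self_to_cpu() are invented for this example and do not appear in the selftest. The caller passes in the CPU count; the patch takes it from get_nprocs(), though a system with sparse CPU numbering might need the configured count instead (an assumption, not something the thread states).

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>

/*
 * Allocate a CPU set able to hold `ncpus` CPU ids and mark only `cpu`.
 * Returns NULL if cpu is out of range or allocation fails. The caller
 * releases the set with CPU_FREE(); *size receives the byte size to
 * pass to sched_setaffinity().
 */
cpu_set_t *single_cpu_set(int cpu, int ncpus, size_t *size)
{
	cpu_set_t *set;

	if (cpu < 0 || cpu >= ncpus)
		return NULL;

	set = CPU_ALLOC(ncpus);
	if (!set)
		return NULL;

	*size = CPU_ALLOC_SIZE(ncpus);
	CPU_ZERO_S(*size, set);
	CPU_SET_S(cpu, *size, set);
	return set;
}

/* Pin the calling process to `cpu`. Returns 0 on success, -1 on error. */
int pin_self_to_cpu(int cpu, int ncpus)
{
	size_t size;
	cpu_set_t *set = single_cpu_set(cpu, ncpus, &size);
	int ret;

	if (!set)
		return -1;

	ret = sched_setaffinity(0, size, set);
	CPU_FREE(set);
	return ret;
}
```

Unlike the fixed-size cpu_set_t, which holds CPU_SETSIZE ids, a set from CPU_ALLOC() grows with the requested count, and the _S macro variants take the matching byte size explicitly. The sketch also checks the CPU_ALLOC() result, which the posted diff does not.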
Re: [PATCH v2] mm/debug_vm_pgtable: Fix kernel crash by checking for THP support
On 06/08/2020 06:22 PM, Aneesh Kumar K.V wrote:
> Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled but
> no THP support enabled based on platforms. For ex: with 4K
> PAGE_SIZE ppc64 supports THP only with radix translation.
>
> This results in below crash when running with hash translation and
> 4K PAGE_SIZE.
>
> kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140!
> cpu 0x61: Vector: 700 (Program Check) at [c00ff948f860]
> pc: c18810f8: debug_vm_pgtable+0x480/0x8b0
> lr: c18810ec: debug_vm_pgtable+0x474/0x8b0
> ...
> [c00ff948faf0] c1880fec debug_vm_pgtable+0x374/0x8b0 (unreliable)
> [c00ff948fbf0] c0011648 do_one_initcall+0x98/0x4f0
> [c00ff948fcd0] c1843928 kernel_init_freeable+0x330/0x3fc
> [c00ff948fdb0] c00122ac kernel_init+0x24/0x148
> [c00ff948fe20] c000cc44 ret_from_kernel_thread+0x5c/0x78
>
> Check for THP support correctly
>
> Cc: anshuman.khand...@arm.com
> Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table
> helpers")
> Signed-off-by: Aneesh Kumar K.V
> ---
>  mm/debug_vm_pgtable.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 188c18908964..df3a3a08f4f8 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -61,6 +61,9 @@ static void __init pmd_basic_tests(unsigned long pfn,
> pgprot_t prot)
>  {
>  	pmd_t pmd = pfn_pmd(pfn, prot);
>
> +	if (!has_transparent_hugepage())
> +		return;
> +
>  	WARN_ON(!pmd_same(pmd, pmd));
>  	WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
>  	WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
> @@ -80,6 +83,9 @@ static void __init pud_basic_tests(unsigned long pfn,
> pgprot_t prot)
>  {
>  	pud_t pud = pfn_pud(pfn, prot);
>
> +	if (!has_transparent_hugepage())
> +		return;
> +
>  	WARN_ON(!pud_same(pud, pud));
>  	WARN_ON(!pud_young(pud_mkyoung(pud_mkold(pud))));
>  	WARN_ON(!pud_write(pud_mkwrite(pud_wrprotect(pud))));
>

Builds with THP on arc, s390 and runs with THP on x86 and arm64 platforms.

Reviewed-by: Anshuman Khandual
[PATCH kernel] KVM: PPC: Protect kvm_vcpu_read_guest with srcu locks
The kvm_vcpu_read_guest/kvm_vcpu_write_guest used for nested guests eventually call srcu_dereference_check to dereference a memslot and lockdep produces a warning as neither kvm->slots_lock nor kvm->srcu lock is held and kvm->users_count is above zero (>100 in fact). This wraps mentioned VCPU read/write helpers in srcu read lock/unlock as it is done in other places. This uses vcpu->srcu_idx when possible. These helpers are only used for nested KVM so this may explain why we did not see these before. Here is an example of a warning: = WARNING: suspicious RCU usage 5.7.0-rc3-le_dma-bypass.3.2_a+fstn1 #897 Not tainted - include/linux/kvm_host.h:633 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by qemu-system-ppc/2752: #0: c000200359016be0 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x144/0xd80 [kvm] stack backtrace: CPU: 80 PID: 2752 Comm: qemu-system-ppc Not tainted 5.7.0-rc3-le_dma-bypass.3.2_a+fstn1 #897 Call Trace: [c0002003591ab240] [c0b23ab4] dump_stack+0x190/0x25c (unreliable) [c0002003591ab2b0] [c023f954] lockdep_rcu_suspicious+0x140/0x164 [c0002003591ab330] [c00804a445f8] kvm_vcpu_gfn_to_memslot+0x4c0/0x510 [kvm] [c0002003591ab3a0] [c00804a44c18] kvm_vcpu_read_guest+0xa0/0x180 [kvm] [c0002003591ab410] [c00804ff9bd8] kvmhv_enter_nested_guest+0x90/0xb80 [kvm_hv] [c0002003591ab980] [c00804fe07bc] kvmppc_pseries_do_hcall+0x7b4/0x1c30 [kvm_hv] [c0002003591aba10] [c00804fe5d30] kvmppc_vcpu_run_hv+0x10a8/0x1a30 [kvm_hv] [c0002003591abae0] [c00804a5d954] kvmppc_vcpu_run+0x4c/0x70 [kvm] [c0002003591abb10] [c00804a56e54] kvm_arch_vcpu_ioctl_run+0x56c/0x7c0 [kvm] [c0002003591abba0] [c00804a3ddc4] kvm_vcpu_ioctl+0x4ac/0xd80 [kvm] [c0002003591abd20] [c06ebb58] ksys_ioctl+0x188/0x210 [c0002003591abd70] [c06ebc28] sys_ioctl+0x48/0xb0 [c0002003591abdb0] [c0042764] system_call_exception+0x1d4/0x2e0 [c0002003591abe20] [c000cce8] system_call_common+0xe8/0x214 Signed-off-by: Alexey 
Kardashevskiy --- arch/powerpc/kvm/book3s_64_mmu_radix.c | 4 arch/powerpc/kvm/book3s_hv_nested.c| 30 -- arch/powerpc/kvm/book3s_rtas.c | 2 ++ arch/powerpc/kvm/powerpc.c | 5 - 4 files changed, 29 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c index aa12cd4078b3..ef7fcc2e7c96 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c @@ -160,7 +160,9 @@ int kvmppc_mmu_walk_radix_tree(struct kvm_vcpu *vcpu, gva_t eaddr, return -EINVAL; /* Read the entry from guest memory */ addr = base + (index * sizeof(rpte)); + vcpu->srcu_idx = srcu_read_lock(&kvm->srcu); ret = kvm_read_guest(kvm, addr, &rpte, sizeof(rpte)); + srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx); if (ret) { if (pte_ret_p) *pte_ret_p = addr; @@ -236,7 +238,9 @@ int kvmppc_mmu_radix_translate_table(struct kvm_vcpu *vcpu, gva_t eaddr, /* Read the table to find the root of the radix tree */ ptbl = (table & PRTB_MASK) + (table_index * sizeof(entry)); + vcpu->srcu_idx = srcu_read_lock(&kvm->srcu); ret = kvm_read_guest(kvm, ptbl, &entry, sizeof(entry)); + srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx); if (ret) return ret; diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c index dc97e5be76f6..1d3ab6fb00a7 100644 --- a/arch/powerpc/kvm/book3s_hv_nested.c +++ b/arch/powerpc/kvm/book3s_hv_nested.c @@ -233,20 +233,21 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu) /* copy parameters in */ hv_ptr = kvmppc_get_gpr(vcpu, 4); + regs_ptr = kvmppc_get_gpr(vcpu, 5); + vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu); err = kvm_vcpu_read_guest(vcpu, hv_ptr, &l2_hv, - sizeof(struct hv_guest_state)); + sizeof(struct hv_guest_state)) || + kvm_vcpu_read_guest(vcpu, regs_ptr, &l2_regs, + sizeof(struct pt_regs)); + srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx); if (err) return H_PARAMETER; + if (kvmppc_need_byteswap(vcpu)) byteswap_hv_regs(&l2_hv); if (l2_hv.version != 
HV_GUEST_STATE_VERSION) return H_P2; - regs_ptr = kvmppc_get_gpr(vcpu, 5); - err = kvm_vcpu_read_guest(vcpu, regs_ptr, &l2_regs, - sizeof(struct pt_regs)); - if (err) - return H_PARAMETER; if (kvmppc_need_byteswap(vcpu)) byteswap_pt
Re: [PATCH v11 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods
On Mon, Jun 8, 2020 at 5:16 PM kernel test robot wrote: > > Hi Vaibhav, > > Thank you for the patch! Perhaps something to improve: > > [auto build test WARNING on powerpc/next] > [also build test WARNING on linus/master v5.7 next-20200605] > [cannot apply to linux-nvdimm/libnvdimm-for-next scottwood/next] > [if your patch is applied to the wrong git tree, please drop us a note to help > improve the system. BTW, we also suggest to use '--base' option to specify the > base tree in git format-patch, please see > https://stackoverflow.com/a/37406982] > > url: > https://github.com/0day-ci/linux/commits/Vaibhav-Jain/powerpc-papr_scm-Add-support-for-reporting-nvdimm-health/20200607-211653 > base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next > config: powerpc-randconfig-r016-20200607 (attached as .config) > compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project > e429cffd4f228f70c1d9df0e5d77c08590dd9766) > reproduce (this is a W=1 build): > wget > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O > ~/bin/make.cross > chmod +x ~/bin/make.cross > # install powerpc cross compiling tool for clang build > # apt-get install binutils-powerpc-linux-gnu > # save the attached .config to linux build tree > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross > ARCH=powerpc > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot > > All warnings (new ones prefixed by >>, old ones prefixed by <<): > > In file included from :1: > >> ./usr/include/asm/papr_pdsm.h:69:20: warning: field 'hdr' with variable > >> sized type 'struct nd_cmd_pkg' not at the end of a struct or class is a > >> GNU extension [-Wgnu-variable-sized-type-not-at-end] > struct nd_cmd_pkg hdr; /* Package header containing sub-cmd */ Hi Vaibhav, This looks like it's going to need another round to get this fixed. I don't think 'struct nd_pdsm_cmd_pkg' should embed a definition of 'struct nd_cmd_pkg'. 
An instance of 'struct nd_cmd_pkg' carries a payload that is the 'pdsm' specifics. As the code has it now it's defined as a superset of 'struct nd_cmd_pkg' and the compiler warning is pointing out a real 'struct' organization problem. Given the soak time needed in -next after the code is finalized, there's no time to do another round of updates and still make the v5.8 merge window.
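The layout being suggested can be sketched in plain C. Everything here is a hypothetical stand-in: struct pkg_hdr plays the role of a header ending in a flexible array member (as 'struct nd_cmd_pkg' does), struct pdsm_payload is an invented example payload, and the helper names are made up. The real uapi structures have different fields.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header type ending in a flexible array member. */
struct pkg_hdr {
	uint32_t command;
	uint32_t size_in;        /* number of payload bytes that follow */
	unsigned char payload[]; /* must be the last member */
};

/* Hypothetical command-specific data carried inside pkg_hdr.payload. */
struct pdsm_payload {
	int32_t cmd_status;
	uint32_t health;
};

/*
 * What the warning objects to, shown only as a comment: a wrapper that
 * places more members after a struct that ends in a flexible array.
 *
 *	struct wrapper {
 *		struct pkg_hdr hdr;
 *		int32_t cmd_status;
 *	};
 */

/* Allocate one header with the payload stored in its trailing array. */
struct pkg_hdr *pkg_alloc(uint32_t command, const struct pdsm_payload *p)
{
	struct pkg_hdr *pkg = calloc(1, sizeof(*pkg) + sizeof(*p));

	if (!pkg)
		return NULL;

	pkg->command = command;
	pkg->size_in = sizeof(*p);
	memcpy(pkg->payload, p, sizeof(*p));
	return pkg;
}

/* Copy the payload back out; memcpy sidesteps alignment assumptions. */
void pkg_payload_get(const struct pkg_hdr *pkg, struct pdsm_payload *out)
{
	memcpy(out, pkg->payload, sizeof(*out));
}
```

In this arrangement the command-specific data lives in the header's trailing payload instead of in a struct that wraps the header, so nothing follows the flexible array member.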
Re: [PATCH v11 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods
Hi Vaibhav, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on powerpc/next] [also build test WARNING on linus/master v5.7 next-20200605] [cannot apply to linux-nvdimm/libnvdimm-for-next scottwood/next] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Vaibhav-Jain/powerpc-papr_scm-Add-support-for-reporting-nvdimm-health/20200607-211653 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc-randconfig-r016-20200607 (attached as .config) compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project e429cffd4f228f70c1d9df0e5d77c08590dd9766) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install powerpc cross compiling tool for clang build # apt-get install binutils-powerpc-linux-gnu # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All warnings (new ones prefixed by >>, old ones prefixed by <<): In file included from :1: >> ./usr/include/asm/papr_pdsm.h:69:20: warning: field 'hdr' with variable >> sized type 'struct nd_cmd_pkg' not at the end of a struct or class is a GNU >> extension [-Wgnu-variable-sized-type-not-at-end] struct nd_cmd_pkg hdr; /* Package header containing sub-cmd */ ^ 1 warning generated. --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip
[PATCH AUTOSEL 4.4 21/37] powerpc/spufs: fix copy_to_user while atomic
From: Jeremy Kerr [ Upstream commit 88413a6bfbbe2f648df399b62f85c934460b7a4d ] Currently, we may perform a copy_to_user (through simple_read_from_buffer()) while holding a context's register_lock, while accessing the context save area. This change uses a temporary buffer for the context save area data, which we then pass to simple_read_from_buffer. Includes changes from Christoph Hellwig . Fixes: bf1ab978be23 ("[POWERPC] coredump: Add SPU elf notes to coredump.") Signed-off-by: Jeremy Kerr Reviewed-by: Arnd Bergmann [hch: renamed to function to avoid ___-prefixes] Signed-off-by: Christoph Hellwig Signed-off-by: Al Viro Signed-off-by: Sasha Levin --- arch/powerpc/platforms/cell/spufs/file.c | 113 +++ 1 file changed, 75 insertions(+), 38 deletions(-) diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c index 5038fd578e65..e708c163fd6d 100644 --- a/arch/powerpc/platforms/cell/spufs/file.c +++ b/arch/powerpc/platforms/cell/spufs/file.c @@ -2044,8 +2044,9 @@ static ssize_t __spufs_mbox_info_read(struct spu_context *ctx, static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { - int ret; struct spu_context *ctx = file->private_data; + u32 stat, data; + int ret; if (!access_ok(VERIFY_WRITE, buf, len)) return -EFAULT; @@ -2054,11 +2055,16 @@ static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_mbox_info_read(ctx, buf, len, pos); + stat = ctx->csa.prob.mb_stat_R; + data = ctx->csa.prob.pu_mb_R; spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + /* EOF if there's no entry in the mbox */ + if (!(stat & 0xff)) + return 0; + + return simple_read_from_buffer(buf, len, pos, &data, sizeof(data)); } static const struct file_operations spufs_mbox_info_fops = { @@ -2085,6 +2091,7 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { 
struct spu_context *ctx = file->private_data; + u32 stat, data; int ret; if (!access_ok(VERIFY_WRITE, buf, len)) @@ -2094,11 +2101,16 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_ibox_info_read(ctx, buf, len, pos); + stat = ctx->csa.prob.mb_stat_R; + data = ctx->csa.priv2.puint_mb_R; spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + /* EOF if there's no entry in the ibox */ + if (!(stat & 0xff)) + return 0; + + return simple_read_from_buffer(buf, len, pos, &data, sizeof(data)); } static const struct file_operations spufs_ibox_info_fops = { @@ -2107,6 +2119,11 @@ static const struct file_operations spufs_ibox_info_fops = { .llseek = generic_file_llseek, }; +static size_t spufs_wbox_info_cnt(struct spu_context *ctx) +{ + return (4 - ((ctx->csa.prob.mb_stat_R & 0x00ff00) >> 8)) * sizeof(u32); +} + static ssize_t __spufs_wbox_info_read(struct spu_context *ctx, char __user *buf, size_t len, loff_t *pos) { @@ -2115,7 +2132,7 @@ static ssize_t __spufs_wbox_info_read(struct spu_context *ctx, u32 wbox_stat; wbox_stat = ctx->csa.prob.mb_stat_R; - cnt = 4 - ((wbox_stat & 0x00ff00) >> 8); + cnt = spufs_wbox_info_cnt(ctx); for (i = 0; i < cnt; i++) { data[i] = ctx->csa.spu_mailbox_data[i]; } @@ -2128,7 +2145,8 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { struct spu_context *ctx = file->private_data; - int ret; + u32 data[ARRAY_SIZE(ctx->csa.spu_mailbox_data)]; + int ret, count; if (!access_ok(VERIFY_WRITE, buf, len)) return -EFAULT; @@ -2137,11 +2155,13 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_wbox_info_read(ctx, buf, len, pos); + count = spufs_wbox_info_cnt(ctx); + memcpy(&data, &ctx->csa.spu_mailbox_data, sizeof(data)); spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - 
return ret; + return simple_read_from_buffer(buf, len, pos, &data, + count * sizeof(u32)); } static const struct file_operations spufs_wbox_info_fops = { @@ -2150,27 +2170,33 @@ static const struct file_operations spufs_wbox_info_fops = { .llseek = gener
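The pattern this patch applies can be shown in a minimal userspace sketch. This is not the spufs code itself: state is snapshotted into locals while the lock is held, and the potentially slow copy into the caller's buffer happens only after the lock is dropped. The struct mailbox layout, the mailbox_read() helper, and the use of a pthread mutex and memcpy() as stand-ins for register_lock and copy_to_user() are all assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the context save area. */
struct mailbox {
	pthread_mutex_t lock;	/* plays the role of csa.register_lock */
	uint32_t stat;		/* low byte: number of valid entries (assumed) */
	uint32_t data;		/* the mailbox word itself */
};

/*
 * Returns the number of bytes copied, or 0 for "EOF" when the
 * mailbox is empty. Only the snapshot is taken under the lock.
 */
static size_t mailbox_read(struct mailbox *mb, void *dst, size_t len)
{
	uint32_t stat, data;

	assert(mb != NULL && dst != NULL);

	pthread_mutex_lock(&mb->lock);
	stat = mb->stat;	/* cheap copies into locals ... */
	data = mb->data;
	pthread_mutex_unlock(&mb->lock);	/* ... then drop the lock */

	if (!(stat & 0xff))
		return 0;

	if (len > sizeof(data))
		len = sizeof(data);

	/* Stand-in for copy_to_user(): may be slow, so it runs unlocked. */
	memcpy(dst, &data, len);
	return len;
}
```

The cost of this design is one extra copy through the stack; in exchange, nothing that can fault or sleep runs inside the locked region.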
[PATCH AUTOSEL 4.9 29/50] powerpc/spufs: fix copy_to_user while atomic
From: Jeremy Kerr [ Upstream commit 88413a6bfbbe2f648df399b62f85c934460b7a4d ] Currently, we may perform a copy_to_user (through simple_read_from_buffer()) while holding a context's register_lock, while accessing the context save area. This change uses a temporary buffer for the context save area data, which we then pass to simple_read_from_buffer. Includes changes from Christoph Hellwig . Fixes: bf1ab978be23 ("[POWERPC] coredump: Add SPU elf notes to coredump.") Signed-off-by: Jeremy Kerr Reviewed-by: Arnd Bergmann [hch: renamed to function to avoid ___-prefixes] Signed-off-by: Christoph Hellwig Signed-off-by: Al Viro Signed-off-by: Sasha Levin --- arch/powerpc/platforms/cell/spufs/file.c | 113 +++ 1 file changed, 75 insertions(+), 38 deletions(-) diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c index 06254467e4dd..f12b00a056cb 100644 --- a/arch/powerpc/platforms/cell/spufs/file.c +++ b/arch/powerpc/platforms/cell/spufs/file.c @@ -2044,8 +2044,9 @@ static ssize_t __spufs_mbox_info_read(struct spu_context *ctx, static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { - int ret; struct spu_context *ctx = file->private_data; + u32 stat, data; + int ret; if (!access_ok(VERIFY_WRITE, buf, len)) return -EFAULT; @@ -2054,11 +2055,16 @@ static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_mbox_info_read(ctx, buf, len, pos); + stat = ctx->csa.prob.mb_stat_R; + data = ctx->csa.prob.pu_mb_R; spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + /* EOF if there's no entry in the mbox */ + if (!(stat & 0xff)) + return 0; + + return simple_read_from_buffer(buf, len, pos, &data, sizeof(data)); } static const struct file_operations spufs_mbox_info_fops = { @@ -2085,6 +2091,7 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { 
struct spu_context *ctx = file->private_data; + u32 stat, data; int ret; if (!access_ok(VERIFY_WRITE, buf, len)) @@ -2094,11 +2101,16 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_ibox_info_read(ctx, buf, len, pos); + stat = ctx->csa.prob.mb_stat_R; + data = ctx->csa.priv2.puint_mb_R; spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + /* EOF if there's no entry in the ibox */ + if (!(stat & 0xff)) + return 0; + + return simple_read_from_buffer(buf, len, pos, &data, sizeof(data)); } static const struct file_operations spufs_ibox_info_fops = { @@ -2107,6 +2119,11 @@ static const struct file_operations spufs_ibox_info_fops = { .llseek = generic_file_llseek, }; +static size_t spufs_wbox_info_cnt(struct spu_context *ctx) +{ + return (4 - ((ctx->csa.prob.mb_stat_R & 0x00ff00) >> 8)) * sizeof(u32); +} + static ssize_t __spufs_wbox_info_read(struct spu_context *ctx, char __user *buf, size_t len, loff_t *pos) { @@ -2115,7 +2132,7 @@ static ssize_t __spufs_wbox_info_read(struct spu_context *ctx, u32 wbox_stat; wbox_stat = ctx->csa.prob.mb_stat_R; - cnt = 4 - ((wbox_stat & 0x00ff00) >> 8); + cnt = spufs_wbox_info_cnt(ctx); for (i = 0; i < cnt; i++) { data[i] = ctx->csa.spu_mailbox_data[i]; } @@ -2128,7 +2145,8 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { struct spu_context *ctx = file->private_data; - int ret; + u32 data[ARRAY_SIZE(ctx->csa.spu_mailbox_data)]; + int ret, count; if (!access_ok(VERIFY_WRITE, buf, len)) return -EFAULT; @@ -2137,11 +2155,13 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_wbox_info_read(ctx, buf, len, pos); + count = spufs_wbox_info_cnt(ctx); + memcpy(&data, &ctx->csa.spu_mailbox_data, sizeof(data)); spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - 
return ret; + return simple_read_from_buffer(buf, len, pos, &data, + count * sizeof(u32)); } static const struct file_operations spufs_wbox_info_fops = { @@ -2150,27 +2170,33 @@ static const struct file_operations spufs_wbox_info_fops = { .llseek = gener
[PATCH AUTOSEL 4.14 42/72] powerpc/spufs: fix copy_to_user while atomic
From: Jeremy Kerr [ Upstream commit 88413a6bfbbe2f648df399b62f85c934460b7a4d ] Currently, we may perform a copy_to_user (through simple_read_from_buffer()) while holding a context's register_lock, while accessing the context save area. This change uses a temporary buffer for the context save area data, which we then pass to simple_read_from_buffer. Includes changes from Christoph Hellwig . Fixes: bf1ab978be23 ("[POWERPC] coredump: Add SPU elf notes to coredump.") Signed-off-by: Jeremy Kerr Reviewed-by: Arnd Bergmann [hch: renamed to function to avoid ___-prefixes] Signed-off-by: Christoph Hellwig Signed-off-by: Al Viro Signed-off-by: Sasha Levin --- arch/powerpc/platforms/cell/spufs/file.c | 113 +++ 1 file changed, 75 insertions(+), 38 deletions(-) diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c index 5ffcdeb1eb17..9d9fffaedeef 100644 --- a/arch/powerpc/platforms/cell/spufs/file.c +++ b/arch/powerpc/platforms/cell/spufs/file.c @@ -1988,8 +1988,9 @@ static ssize_t __spufs_mbox_info_read(struct spu_context *ctx, static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { - int ret; struct spu_context *ctx = file->private_data; + u32 stat, data; + int ret; if (!access_ok(VERIFY_WRITE, buf, len)) return -EFAULT; @@ -1998,11 +1999,16 @@ static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_mbox_info_read(ctx, buf, len, pos); + stat = ctx->csa.prob.mb_stat_R; + data = ctx->csa.prob.pu_mb_R; spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + /* EOF if there's no entry in the mbox */ + if (!(stat & 0xff)) + return 0; + + return simple_read_from_buffer(buf, len, pos, &data, sizeof(data)); } static const struct file_operations spufs_mbox_info_fops = { @@ -2029,6 +2035,7 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { 
struct spu_context *ctx = file->private_data; + u32 stat, data; int ret; if (!access_ok(VERIFY_WRITE, buf, len)) @@ -2038,11 +2045,16 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_ibox_info_read(ctx, buf, len, pos); + stat = ctx->csa.prob.mb_stat_R; + data = ctx->csa.priv2.puint_mb_R; spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + /* EOF if there's no entry in the ibox */ + if (!(stat & 0xff)) + return 0; + + return simple_read_from_buffer(buf, len, pos, &data, sizeof(data)); } static const struct file_operations spufs_ibox_info_fops = { @@ -2051,6 +2063,11 @@ static const struct file_operations spufs_ibox_info_fops = { .llseek = generic_file_llseek, }; +static size_t spufs_wbox_info_cnt(struct spu_context *ctx) +{ + return (4 - ((ctx->csa.prob.mb_stat_R & 0x00ff00) >> 8)) * sizeof(u32); +} + static ssize_t __spufs_wbox_info_read(struct spu_context *ctx, char __user *buf, size_t len, loff_t *pos) { @@ -2059,7 +2076,7 @@ static ssize_t __spufs_wbox_info_read(struct spu_context *ctx, u32 wbox_stat; wbox_stat = ctx->csa.prob.mb_stat_R; - cnt = 4 - ((wbox_stat & 0x00ff00) >> 8); + cnt = spufs_wbox_info_cnt(ctx); for (i = 0; i < cnt; i++) { data[i] = ctx->csa.spu_mailbox_data[i]; } @@ -2072,7 +2089,8 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { struct spu_context *ctx = file->private_data; - int ret; + u32 data[ARRAY_SIZE(ctx->csa.spu_mailbox_data)]; + int ret, count; if (!access_ok(VERIFY_WRITE, buf, len)) return -EFAULT; @@ -2081,11 +2099,13 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_wbox_info_read(ctx, buf, len, pos); + count = spufs_wbox_info_cnt(ctx); + memcpy(&data, &ctx->csa.spu_mailbox_data, sizeof(data)); spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - 
return ret; + return simple_read_from_buffer(buf, len, pos, &data, + count * sizeof(u32)); } static const struct file_operations spufs_wbox_info_fops = { @@ -2094,27 +2114,33 @@ static const struct file_operations spufs_wbox_info_fops = { .llseek = gener
[PATCH AUTOSEL 4.19 054/106] powerpc/spufs: fix copy_to_user while atomic
From: Jeremy Kerr [ Upstream commit 88413a6bfbbe2f648df399b62f85c934460b7a4d ] Currently, we may perform a copy_to_user (through simple_read_from_buffer()) while holding a context's register_lock, while accessing the context save area. This change uses a temporary buffer for the context save area data, which we then pass to simple_read_from_buffer. Includes changes from Christoph Hellwig . Fixes: bf1ab978be23 ("[POWERPC] coredump: Add SPU elf notes to coredump.") Signed-off-by: Jeremy Kerr Reviewed-by: Arnd Bergmann [hch: renamed to function to avoid ___-prefixes] Signed-off-by: Christoph Hellwig Signed-off-by: Al Viro Signed-off-by: Sasha Levin --- arch/powerpc/platforms/cell/spufs/file.c | 113 +++ 1 file changed, 75 insertions(+), 38 deletions(-) diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c index 43e7b93f27c7..d16adcd93921 100644 --- a/arch/powerpc/platforms/cell/spufs/file.c +++ b/arch/powerpc/platforms/cell/spufs/file.c @@ -1991,8 +1991,9 @@ static ssize_t __spufs_mbox_info_read(struct spu_context *ctx, static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { - int ret; struct spu_context *ctx = file->private_data; + u32 stat, data; + int ret; if (!access_ok(VERIFY_WRITE, buf, len)) return -EFAULT; @@ -2001,11 +2002,16 @@ static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_mbox_info_read(ctx, buf, len, pos); + stat = ctx->csa.prob.mb_stat_R; + data = ctx->csa.prob.pu_mb_R; spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + /* EOF if there's no entry in the mbox */ + if (!(stat & 0xff)) + return 0; + + return simple_read_from_buffer(buf, len, pos, &data, sizeof(data)); } static const struct file_operations spufs_mbox_info_fops = { @@ -2032,6 +2038,7 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { 
struct spu_context *ctx = file->private_data; + u32 stat, data; int ret; if (!access_ok(VERIFY_WRITE, buf, len)) @@ -2041,11 +2048,16 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_ibox_info_read(ctx, buf, len, pos); + stat = ctx->csa.prob.mb_stat_R; + data = ctx->csa.priv2.puint_mb_R; spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + /* EOF if there's no entry in the ibox */ + if (!(stat & 0xff)) + return 0; + + return simple_read_from_buffer(buf, len, pos, &data, sizeof(data)); } static const struct file_operations spufs_ibox_info_fops = { @@ -2054,6 +2066,11 @@ static const struct file_operations spufs_ibox_info_fops = { .llseek = generic_file_llseek, }; +static size_t spufs_wbox_info_cnt(struct spu_context *ctx) +{ + return (4 - ((ctx->csa.prob.mb_stat_R & 0x00ff00) >> 8)) * sizeof(u32); +} + static ssize_t __spufs_wbox_info_read(struct spu_context *ctx, char __user *buf, size_t len, loff_t *pos) { @@ -2062,7 +2079,7 @@ static ssize_t __spufs_wbox_info_read(struct spu_context *ctx, u32 wbox_stat; wbox_stat = ctx->csa.prob.mb_stat_R; - cnt = 4 - ((wbox_stat & 0x00ff00) >> 8); + cnt = spufs_wbox_info_cnt(ctx); for (i = 0; i < cnt; i++) { data[i] = ctx->csa.spu_mailbox_data[i]; } @@ -2075,7 +2092,8 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { struct spu_context *ctx = file->private_data; - int ret; + u32 data[ARRAY_SIZE(ctx->csa.spu_mailbox_data)]; + int ret, count; if (!access_ok(VERIFY_WRITE, buf, len)) return -EFAULT; @@ -2084,11 +2102,13 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_wbox_info_read(ctx, buf, len, pos); + count = spufs_wbox_info_cnt(ctx); + memcpy(&data, &ctx->csa.spu_mailbox_data, sizeof(data)); spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - 
return ret; + return simple_read_from_buffer(buf, len, pos, &data, + count * sizeof(u32)); } static const struct file_operations spufs_wbox_info_fops = { @@ -2097,27 +2117,33 @@ static const struct file_operations spufs_wbox_info_fops = { .llseek = gener
[PATCH AUTOSEL 4.19 049/106] sched/core: Fix illegal RCU from offline CPUs
From: Peter Zijlstra [ Upstream commit bf2c59fce4074e55d622089b34be3a6bc95484fb ] In the CPU-offline process, it calls mmdrop() after idle entry and the subsequent call to cpuhp_report_idle_dead(). Once execution passes the call to rcu_report_dead(), RCU is ignoring the CPU, which results in lockdep complaining when mmdrop() uses RCU from either memcg or debugobjects below. Fix it by cleaning up the active_mm state from BP instead. Every arch which has CONFIG_HOTPLUG_CPU should have already called idle_task_exit() from AP. The only exception is parisc because it switches them to &init_mm unconditionally (see smp_boot_one_cpu() and smp_cpu_init()), but the patch will still work there because it calls mmgrab(&init_mm) in smp_cpu_init() and then should call mmdrop(&init_mm) in finish_cpu(). WARNING: suspicious RCU usage - kernel/workqueue.c:710 RCU or wq_pool_mutex should be held! other info that might help us debug this: RCU used illegally from offline CPU! Call Trace: dump_stack+0xf4/0x164 (unreliable) lockdep_rcu_suspicious+0x140/0x164 get_work_pool+0x110/0x150 __queue_work+0x1bc/0xca0 queue_work_on+0x114/0x120 css_release+0x9c/0xc0 percpu_ref_put_many+0x204/0x230 free_pcp_prepare+0x264/0x570 free_unref_page+0x38/0xf0 __mmdrop+0x21c/0x2c0 idle_task_exit+0x170/0x1b0 pnv_smp_cpu_kill_self+0x38/0x2e0 cpu_die+0x48/0x64 arch_cpu_idle_dead+0x30/0x50 do_idle+0x2f4/0x470 cpu_startup_entry+0x38/0x40 start_secondary+0x7a8/0xa80 start_secondary_resume+0x10/0x14 Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Qian Cai Signed-off-by: Peter Zijlstra (Intel) Acked-by: Michael Ellerman (powerpc) Link: https://lkml.kernel.org/r/20200401214033.8448-1-...@lca.pw Signed-off-by: Sasha Levin --- arch/powerpc/platforms/powernv/smp.c | 1 - include/linux/sched/mm.h | 2 ++ kernel/cpu.c | 18 +- kernel/sched/core.c | 5 +++-- 4 files changed, 22 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c index 
3d3c989e44dd..8d49ba370c50 100644 --- a/arch/powerpc/platforms/powernv/smp.c +++ b/arch/powerpc/platforms/powernv/smp.c @@ -171,7 +171,6 @@ static void pnv_smp_cpu_kill_self(void) /* Standard hot unplug procedure */ idle_task_exit(); - current->active_mm = NULL; /* for sanity */ cpu = smp_processor_id(); DBG("CPU%d offline\n", cpu); generic_set_cpu_dead(cpu); diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index e9d4e389aed9..766bbe813861 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -49,6 +49,8 @@ static inline void mmdrop(struct mm_struct *mm) __mmdrop(mm); } +void mmdrop(struct mm_struct *mm); + /* * This has to be called after a get_task_mm()/mmget_not_zero() * followed by taking the mmap_sem for writing before modifying the diff --git a/kernel/cpu.c b/kernel/cpu.c index 6d6c106a495c..08b9d6ba0807 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -3,6 +3,7 @@ * * This code is licenced under the GPL. */ +#include #include #include #include @@ -532,6 +533,21 @@ static int bringup_cpu(unsigned int cpu) return bringup_wait_for_ap(cpu); } +static int finish_cpu(unsigned int cpu) +{ + struct task_struct *idle = idle_thread_get(cpu); + struct mm_struct *mm = idle->active_mm; + + /* +* idle_task_exit() will have switched to &init_mm, now +* clean up any remaining active_mm state. 
+*/ + if (mm != &init_mm) + idle->active_mm = &init_mm; + mmdrop(mm); + return 0; +} + /* * Hotplug state machine related functions */ @@ -1379,7 +1395,7 @@ static struct cpuhp_step cpuhp_hp_states[] = { [CPUHP_BRINGUP_CPU] = { .name = "cpu:bringup", .startup.single = bringup_cpu, - .teardown.single= NULL, + .teardown.single= finish_cpu, .cant_stop = true, }, /* Final state before CPU kills itself */ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 2befd2c4ce9e..0325ccf3a8e4 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -5571,13 +5571,14 @@ void idle_task_exit(void) struct mm_struct *mm = current->active_mm; BUG_ON(cpu_online(smp_processor_id())); + BUG_ON(current != this_rq()->idle); if (mm != &init_mm) { switch_mm(mm, &init_mm, current); - current->active_mm = &init_mm; finish_arch_post_lock_switch(); } - mmdrop(mm); + + /* finish_cpu(), as ran on the BP, will clean up the active_mm state */ } /* -- 2.25.1
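To make the hand-off easier to follow, here is a small userspace model of the finish_cpu() logic from the hunk above. It is only a sketch: struct mm_model, struct task_model, init_mm_model and the plain integer reference count are hypothetical stand-ins for mm_struct, task_struct, init_mm and the real mmdrop() machinery. The point it illustrates is that the final reference drop is done by the surviving (boot) processor, not by the CPU that is going offline.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for mm_struct and task_struct. */
struct mm_model {
	int refcount;
};

struct task_model {
	struct mm_model *active_mm;
};

/* Plays the role of init_mm; starts with one reference. */
static struct mm_model init_mm_model = { 1 };

/* Simplified mmdrop(): just decrement, no freeing in this model. */
static void mm_drop(struct mm_model *mm)
{
	assert(mm->refcount > 0);
	mm->refcount--;
}

/*
 * Model of finish_cpu(): runs on the boot processor after the dying
 * CPU has gone idle. It repoints the idle task at init_mm if needed
 * and drops the reference the idle task was still holding.
 */
static int finish_cpu_model(struct task_model *idle)
{
	struct mm_model *mm = idle->active_mm;

	if (mm != &init_mm_model)
		idle->active_mm = &init_mm_model;
	mm_drop(mm);
	return 0;
}
```

In this model the dying CPU's own exit path does nothing to the reference count, which mirrors the removal of mmdrop() from idle_task_exit() in the patch.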
[PATCH AUTOSEL 5.4 097/175] powerpc/spufs: fix copy_to_user while atomic
From: Jeremy Kerr [ Upstream commit 88413a6bfbbe2f648df399b62f85c934460b7a4d ] Currently, we may perform a copy_to_user (through simple_read_from_buffer()) while holding a context's register_lock, while accessing the context save area. This change uses a temporary buffer for the context save area data, which we then pass to simple_read_from_buffer. Includes changes from Christoph Hellwig . Fixes: bf1ab978be23 ("[POWERPC] coredump: Add SPU elf notes to coredump.") Signed-off-by: Jeremy Kerr Reviewed-by: Arnd Bergmann [hch: renamed to function to avoid ___-prefixes] Signed-off-by: Christoph Hellwig Signed-off-by: Al Viro Signed-off-by: Sasha Levin --- arch/powerpc/platforms/cell/spufs/file.c | 113 +++ 1 file changed, 75 insertions(+), 38 deletions(-) diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c index c0f950a3f4e1..f4a4dfb191e7 100644 --- a/arch/powerpc/platforms/cell/spufs/file.c +++ b/arch/powerpc/platforms/cell/spufs/file.c @@ -1978,8 +1978,9 @@ static ssize_t __spufs_mbox_info_read(struct spu_context *ctx, static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { - int ret; struct spu_context *ctx = file->private_data; + u32 stat, data; + int ret; if (!access_ok(buf, len)) return -EFAULT; @@ -1988,11 +1989,16 @@ static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_mbox_info_read(ctx, buf, len, pos); + stat = ctx->csa.prob.mb_stat_R; + data = ctx->csa.prob.pu_mb_R; spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + /* EOF if there's no entry in the mbox */ + if (!(stat & 0xff)) + return 0; + + return simple_read_from_buffer(buf, len, pos, &data, sizeof(data)); } static const struct file_operations spufs_mbox_info_fops = { @@ -2019,6 +2025,7 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { struct 
spu_context *ctx = file->private_data; + u32 stat, data; int ret; if (!access_ok(buf, len)) @@ -2028,11 +2035,16 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_ibox_info_read(ctx, buf, len, pos); + stat = ctx->csa.prob.mb_stat_R; + data = ctx->csa.priv2.puint_mb_R; spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + /* EOF if there's no entry in the ibox */ + if (!(stat & 0xff)) + return 0; + + return simple_read_from_buffer(buf, len, pos, &data, sizeof(data)); } static const struct file_operations spufs_ibox_info_fops = { @@ -2041,6 +2053,11 @@ static const struct file_operations spufs_ibox_info_fops = { .llseek = generic_file_llseek, }; +static size_t spufs_wbox_info_cnt(struct spu_context *ctx) +{ + return (4 - ((ctx->csa.prob.mb_stat_R & 0x00ff00) >> 8)) * sizeof(u32); +} + static ssize_t __spufs_wbox_info_read(struct spu_context *ctx, char __user *buf, size_t len, loff_t *pos) { @@ -2049,7 +2066,7 @@ static ssize_t __spufs_wbox_info_read(struct spu_context *ctx, u32 wbox_stat; wbox_stat = ctx->csa.prob.mb_stat_R; - cnt = 4 - ((wbox_stat & 0x00ff00) >> 8); + cnt = spufs_wbox_info_cnt(ctx); for (i = 0; i < cnt; i++) { data[i] = ctx->csa.spu_mailbox_data[i]; } @@ -2062,7 +2079,8 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { struct spu_context *ctx = file->private_data; - int ret; + u32 data[ARRAY_SIZE(ctx->csa.spu_mailbox_data)]; + int ret, count; if (!access_ok(buf, len)) return -EFAULT; @@ -2071,11 +2089,13 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_wbox_info_read(ctx, buf, len, pos); + count = spufs_wbox_info_cnt(ctx); + memcpy(&data, &ctx->csa.spu_mailbox_data, sizeof(data)); spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + return 
simple_read_from_buffer(buf, len, pos, &data, + count * sizeof(u32)); } static const struct file_operations spufs_wbox_info_fops = { @@ -2084,27 +2104,33 @@ static const struct file_operations spufs_wbox_info_fops = { .llseek = generic_file_llseek, }; -static ssize_t __sp
[PATCH AUTOSEL 5.4 089/175] sched/core: Fix illegal RCU from offline CPUs
From: Peter Zijlstra [ Upstream commit bf2c59fce4074e55d622089b34be3a6bc95484fb ] In the CPU-offline process, it calls mmdrop() after idle entry and the subsequent call to cpuhp_report_idle_dead(). Once execution passes the call to rcu_report_dead(), RCU is ignoring the CPU, which results in lockdep complaining when mmdrop() uses RCU from either memcg or debugobjects below. Fix it by cleaning up the active_mm state from BP instead. Every arch which has CONFIG_HOTPLUG_CPU should have already called idle_task_exit() from AP. The only exception is parisc because it switches them to &init_mm unconditionally (see smp_boot_one_cpu() and smp_cpu_init()), but the patch will still work there because it calls mmgrab(&init_mm) in smp_cpu_init() and then should call mmdrop(&init_mm) in finish_cpu(). WARNING: suspicious RCU usage - kernel/workqueue.c:710 RCU or wq_pool_mutex should be held! other info that might help us debug this: RCU used illegally from offline CPU! Call Trace: dump_stack+0xf4/0x164 (unreliable) lockdep_rcu_suspicious+0x140/0x164 get_work_pool+0x110/0x150 __queue_work+0x1bc/0xca0 queue_work_on+0x114/0x120 css_release+0x9c/0xc0 percpu_ref_put_many+0x204/0x230 free_pcp_prepare+0x264/0x570 free_unref_page+0x38/0xf0 __mmdrop+0x21c/0x2c0 idle_task_exit+0x170/0x1b0 pnv_smp_cpu_kill_self+0x38/0x2e0 cpu_die+0x48/0x64 arch_cpu_idle_dead+0x30/0x50 do_idle+0x2f4/0x470 cpu_startup_entry+0x38/0x40 start_secondary+0x7a8/0xa80 start_secondary_resume+0x10/0x14 Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Qian Cai Signed-off-by: Peter Zijlstra (Intel) Acked-by: Michael Ellerman (powerpc) Link: https://lkml.kernel.org/r/20200401214033.8448-1-...@lca.pw Signed-off-by: Sasha Levin --- arch/powerpc/platforms/powernv/smp.c | 1 - include/linux/sched/mm.h | 2 ++ kernel/cpu.c | 18 +- kernel/sched/core.c | 5 +++-- 4 files changed, 22 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c index 
13e251699346..b2ba3e95bda7 100644 --- a/arch/powerpc/platforms/powernv/smp.c +++ b/arch/powerpc/platforms/powernv/smp.c @@ -167,7 +167,6 @@ static void pnv_smp_cpu_kill_self(void) /* Standard hot unplug procedure */ idle_task_exit(); - current->active_mm = NULL; /* for sanity */ cpu = smp_processor_id(); DBG("CPU%d offline\n", cpu); generic_set_cpu_dead(cpu); diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index c49257a3b510..a132d875d351 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -49,6 +49,8 @@ static inline void mmdrop(struct mm_struct *mm) __mmdrop(mm); } +void mmdrop(struct mm_struct *mm); + /* * This has to be called after a get_task_mm()/mmget_not_zero() * followed by taking the mmap_sem for writing before modifying the diff --git a/kernel/cpu.c b/kernel/cpu.c index d7890c1285bf..7527825ac7da 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -3,6 +3,7 @@ * * This code is licenced under the GPL. */ +#include #include #include #include @@ -564,6 +565,21 @@ static int bringup_cpu(unsigned int cpu) return bringup_wait_for_ap(cpu); } +static int finish_cpu(unsigned int cpu) +{ + struct task_struct *idle = idle_thread_get(cpu); + struct mm_struct *mm = idle->active_mm; + + /* +* idle_task_exit() will have switched to &init_mm, now +* clean up any remaining active_mm state. 
+*/ + if (mm != &init_mm) + idle->active_mm = &init_mm; + mmdrop(mm); + return 0; +} + /* * Hotplug state machine related functions */ @@ -1434,7 +1450,7 @@ static struct cpuhp_step cpuhp_hp_states[] = { [CPUHP_BRINGUP_CPU] = { .name = "cpu:bringup", .startup.single = bringup_cpu, - .teardown.single= NULL, + .teardown.single= finish_cpu, .cant_stop = true, }, /* Final state before CPU kills itself */ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index e99d326fa569..4874e1468279 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6177,13 +6177,14 @@ void idle_task_exit(void) struct mm_struct *mm = current->active_mm; BUG_ON(cpu_online(smp_processor_id())); + BUG_ON(current != this_rq()->idle); if (mm != &init_mm) { switch_mm(mm, &init_mm, current); - current->active_mm = &init_mm; finish_arch_post_lock_switch(); } - mmdrop(mm); + + /* finish_cpu(), as ran on the BP, will clean up the active_mm state */ } /* -- 2.25.1
[PATCH AUTOSEL 5.6 150/606] powerpc/64s: Disable STRICT_KERNEL_RWX
From: Michael Ellerman

commit 8659a0e0efdd975c73355dbc033f79ba3b31e82c upstream.

Several strange crashes have been eventually traced back to
STRICT_KERNEL_RWX and its interaction with code patching.

Various paths in our ftrace, kprobes and other patching code need to
be hardened against patching failures, otherwise we can end up running
with partially/incorrectly patched ftrace paths, kprobes or jump
labels, which can then cause strange crashes.

Although fixes for those are in development, they're not -rc material.

There also seem to be problems with the underlying strict RWX logic,
which needs further debugging.

So for now disable STRICT_KERNEL_RWX on 64-bit to prevent people from
enabling the option and tripping over the bugs.

Fixes: 1e0fc9d1eb2b ("powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs")
Cc: sta...@vger.kernel.org # v4.13+
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20200520133605.972649-1-...@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 497b7d0b2d7e..b0fb42b0bf4b 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -129,7 +129,7 @@ config PPC
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_MEMBARRIER_CALLBACKS
 	select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
-	select ARCH_HAS_STRICT_KERNEL_RWX if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION)
+	select ARCH_HAS_STRICT_KERNEL_RWX if (PPC32 && !HIBERNATION)
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAS_UACCESS_FLUSHCACHE
 	select ARCH_HAS_UACCESS_MCSAFE if PPC64
-- 
2.25.1
[PATCH AUTOSEL 5.6 117/606] ibmvnic: Skip fatal error reset after passive init
From: Juliet Kim

[ Upstream commit f9c6cea0b38518741c8dcf26ac056d26ee2fd61d ]

During MTU change, the following events may happen.
Client-driven CRQ initialization fails due to partner’s CRQ closed,
causing client to enqueue a reset task for FATAL_ERROR. Then passive
(server-driven) CRQ initialization succeeds, causing client to
release CRQ and enqueue a reset task for failover. If the passive
CRQ initialization occurs before the FATAL reset task is processed,
the FATAL error reset task would try to access a CRQ message queue
that was freed, causing an oops. The problem may be most likely to
occur during DLPAR add vNIC with a non-default MTU, because the DLPAR
process will automatically issue a change MTU request.

Fix this by not processing fatal error reset if CRQ is passively
initialized after client-driven CRQ initialization fails.

Signed-off-by: Juliet Kim
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
---
 drivers/net/ethernet/ibm/ibmvnic.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 4bd33245bad6..3de549c6c693 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2189,7 +2189,8 @@ static void __ibmvnic_reset(struct work_struct *work)
 				rc = do_hard_reset(adapter, rwi, reset_state);
 				rtnl_unlock();
 			}
-		} else {
+		} else if (!(rwi->reset_reason == VNIC_RESET_FATAL &&
+				adapter->from_passive_init)) {
 			rc = do_reset(adapter, rwi, reset_state);
 		}
 		kfree(rwi);
-- 
2.25.1
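The new branch condition is compact, so it may help to see it pulled out as a standalone predicate. The sketch below is illustrative only: enum reset_reason_model and should_run_soft_reset() are hypothetical names, not part of the ibmvnic driver, and the enum models just the two reasons mentioned in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of reset reasons, for illustration only. */
enum reset_reason_model {
	MODEL_RESET_FAILOVER,
	MODEL_RESET_FATAL,
};

/*
 * Mirrors the condition added in the patch: a queued FATAL reset is
 * skipped once the CRQ has been passively (server-driven) initialized,
 * because the failover reset queued by that path supersedes it.
 */
static bool should_run_soft_reset(enum reset_reason_model reason,
				  bool from_passive_init)
{
	assert(reason == MODEL_RESET_FAILOVER || reason == MODEL_RESET_FATAL);

	return !(reason == MODEL_RESET_FATAL && from_passive_init);
}
```

Only the FATAL-after-passive-init combination is filtered out; every other combination still reaches the reset path.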
[PATCH AUTOSEL 5.6 115/606] scsi: ibmvscsi: Fix WARN_ON during event pool release
From: Tyrel Datwyler

[ Upstream commit b36522150e5b85045f868768d46fbaaa034174b2 ]

While removing an ibmvscsi client adapter a WARN_ON like the following
is seen in the kernel log:

drmgr: drmgr: -r -c slot -s U9080.M9S.783AEC8-V11-C11 -w 5 -d 1
WARNING: CPU: 9 PID: 24062 at ../kernel/dma/mapping.c:311 dma_free_attrs+0x78/0x110
Supported: No, Unreleased kernel
CPU: 9 PID: 24062 Comm: drmgr Kdump: loaded Tainted: G X 5.3.18-12-default
NIP: c01fa758 LR: c01fa744 CTR: c01fa6e0
REGS: c002173375d0 TRAP: 0700 Tainted: G X (5.3.18-12-default)
MSR: 80029033 CR: 28088282 XER: 2000
CFAR: c01fbf0c IRQMASK: 1
GPR00: c01fa744 c00217337860 c161ab00
GPR04: c11e1225 1801
GPR08: 0001 0001 c008190f4fa8
GPR12: c01fa6e0 c7fc2a00
GPR16:
GPR20:
GPR24: 00011420e310 1801
GPR28: c159de50 c11e1225 6600 c11e5c994848
NIP [c01fa758] dma_free_attrs+0x78/0x110
LR [c01fa744] dma_free_attrs+0x64/0x110
Call Trace:
[c00217337860] [00011420e310] 0x11420e310 (unreliable)
[c002173378b0] [c008190f0280] release_event_pool+0xd8/0x120 [ibmvscsi]
[c00217337930] [c008190f3f74] ibmvscsi_remove+0x6c/0x160 [ibmvscsi]
[c00217337960] [c00f3cac] vio_bus_remove+0x5c/0x100
[c002173379a0] [c087a0a4] device_release_driver_internal+0x154/0x280
[c002173379e0] [c08777cc] bus_remove_device+0x11c/0x220
[c00217337a60] [c0870fc4] device_del+0x1c4/0x470
[c00217337b10] [c08712a0] device_unregister+0x30/0xa0
[c00217337b80] [c00f39ec] vio_unregister_device+0x2c/0x60
[c00217337bb0] [c0081a1d0964] dlpar_remove_slot+0x14c/0x250 [rpadlpar_io]
[c00217337c50] [c0081a1d0bcc] remove_slot_store+0xa4/0x110 [rpadlpar_io]
[c00217337cd0] [c0c091a0] kobj_attr_store+0x30/0x50
[c00217337cf0] [c057c934] sysfs_kf_write+0x64/0x90
[c00217337d10] [c057be10] kernfs_fop_write+0x1b0/0x290
[c00217337d60] [c0488c4c] __vfs_write+0x3c/0x70
[c00217337d80] [c048c648] vfs_write+0xd8/0x260
[c00217337dd0] [c048ca8c] ksys_write+0xdc/0x130
[c00217337e20] [c000b488] system_call+0x5c/0x70
Instruction dump:
7c840074 f8010010 f821ffb1 20840040 eb830218 7c8407b4 48002019 6000
2fa3 409e003c 892d0988 792907e0 <0b09> 2fbd 419e0028 2fbc
---[ end trace 5955b3c0cc079942 ]---
rpadlpar_io: slot U9080.M9S.783AEC8-V11-C11 removed

This is tripped as a result of irqs being disabled during the call to
dma_free_coherent() by release_event_pool(). At this point in the code
path we have quiesced the adapter and it is overly paranoid to be
holding the host lock.

[mkp: fixed build warning reported by sfr]

Link: https://lore.kernel.org/r/1588027793-17952-1-git-send-email-tyr...@linux.ibm.com
Signed-off-by: Tyrel Datwyler
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 7f66a7783209..59f0f1030c54 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -2320,16 +2320,12 @@ static int ibmvscsi_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 static int ibmvscsi_remove(struct vio_dev *vdev)
 {
 	struct ibmvscsi_host_data *hostdata = dev_get_drvdata(&vdev->dev);
-	unsigned long flags;
 
 	srp_remove_host(hostdata->host);
 	scsi_remove_host(hostdata->host);
 
 	purge_requests(hostdata, DID_ERROR);
-
-	spin_lock_irqsave(hostdata->host->host_lock, flags);
 	release_event_pool(&hostdata->pool, hostdata);
-	spin_unlock_irqrestore(hostdata->host->host_lock, flags);
 
 	ibmvscsi_release_crq_queue(&hostdata->queue, hostdata,
 					max_events);
-- 
2.25.1
[PATCH AUTOSEL 5.6 069/606] powerpc/uaccess: Evaluate macro arguments once, before user access is allowed
From: Nicholas Piggin commit d02f6b7dab8228487268298ea1f21081c0b4b3eb upstream. get/put_user() can be called with nontrivial arguments. fs/proc/page.c has a good example: if (put_user(stable_page_flags(ppage), out)) { stable_page_flags() is quite a lot of code, including spin locks in the page allocator. Ensure these arguments are evaluated before user access is allowed. This improves security by reducing code with access to userspace, but it also fixes a PREEMPT bug with KUAP on powerpc/64s: stable_page_flags() is currently called with AMR set to allow writes, it ends up calling spin_unlock(), which can call preempt_schedule. But the task switch code can not be called with AMR set (it relies on interrupts saving the register), so this blows up. It's fine if the code inside allow_user_access() is preemptible, because a timer or IPI will save the AMR, but it's not okay to explicitly cause a reschedule. Fixes: de78a9c42a79 ("powerpc: Add a framework for Kernel Userspace Access Protection") Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200407041245.600651-1-npig...@gmail.com Signed-off-by: Greg Kroah-Hartman --- arch/powerpc/include/asm/uaccess.h | 49 +- 1 file changed, 35 insertions(+), 14 deletions(-) diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h index 2f500debae21..0969285996cb 100644 --- a/arch/powerpc/include/asm/uaccess.h +++ b/arch/powerpc/include/asm/uaccess.h @@ -166,13 +166,17 @@ do { \ ({ \ long __pu_err; \ __typeof__(*(ptr)) __user *__pu_addr = (ptr); \ + __typeof__(*(ptr)) __pu_val = (x); \ + __typeof__(size) __pu_size = (size);\ + \ if (!is_kernel_addr((unsigned long)__pu_addr)) \ might_fault(); \ - __chk_user_ptr(ptr);\ + __chk_user_ptr(__pu_addr); \ if (do_allow) \ - __put_user_size((x), __pu_addr, (size), __pu_err); \ + __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \ else \ - __put_user_size_allowed((x), __pu_addr, (size), __pu_err); \ + 
__put_user_size_allowed(__pu_val, __pu_addr, __pu_size, __pu_err); \ + \ __pu_err; \ }) @@ -180,9 +184,13 @@ do { \ ({ \ long __pu_err = -EFAULT;\ __typeof__(*(ptr)) __user *__pu_addr = (ptr); \ + __typeof__(*(ptr)) __pu_val = (x); \ + __typeof__(size) __pu_size = (size);\ + \ might_fault(); \ - if (access_ok(__pu_addr, size)) \ - __put_user_size((x), __pu_addr, (size), __pu_err); \ + if (access_ok(__pu_addr, __pu_size))\ + __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \ + \ __pu_err; \ }) @@ -190,8 +198,12 @@ do { \ ({ \ long __pu_err; \ __typeof__(*(ptr)) __user *__pu_addr = (ptr); \ - __chk_user_ptr(ptr);\ - __put_user_size((x), __pu_addr, (size), __pu_err); \ + __typeof__(*(ptr)) __pu_val = (x); \ + __typeof__(size) __pu_size = (size);\ + \ + __chk_user_ptr(__pu_addr); \ + __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \ + \ __pu_err; \ }) @@ -283,15 +295,18 @@ do { \ long __gu_err;
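The hazard this patch describes can be sketched in plain user-space C. This is only an illustration under stated assumptions: `window_open`, `open_window()`, `close_window()`, `expensive_value()`, `PUT_UNSAFE` and `PUT_SAFE` are hypothetical stand-ins for the user access window and the put_user() macros, not kernel APIs. Like the kernel macros, the sketch relies on GNU C statement expressions and `__typeof__`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for "user access is currently allowed". */
static bool window_open;
/* Set if an argument expression ever ran while the window was open. */
static bool evaluated_inside;

static void open_window(void)  { window_open = true; }
static void close_window(void) { window_open = false; }

/* A nontrivial argument, playing the role of stable_page_flags(). */
static long expensive_value(void)
{
	if (window_open)
		evaluated_inside = true;
	return 42;
}

/* Before the fix: (x) is expanded, and so evaluated, inside the window. */
#define PUT_UNSAFE(x, ptr)			\
({						\
	open_window();				\
	*(ptr) = (x);				\
	close_window();				\
	0;					\
})

/* After the fix: arguments are evaluated once, before the window opens. */
#define PUT_SAFE(x, ptr)			\
({						\
	__typeof__(ptr) __addr = (ptr);		\
	__typeof__(*(ptr)) __val = (x);		\
	open_window();				\
	*__addr = __val;			\
	close_window();				\
	0;					\
})

/* Returns true if the argument was evaluated while the window was open. */
static bool demo(bool use_safe)
{
	long out = 0;

	evaluated_inside = false;
	if (use_safe)
		(void)PUT_SAFE(expensive_value(), &out);
	else
		(void)PUT_UNSAFE(expensive_value(), &out);
	assert(out == 42);
	return evaluated_inside;
}
```

Calling `demo(false)` reports that the argument ran inside the window, while `demo(true)` reports that it did not. Copying the arguments into locals first is what keeps arbitrary caller code out of the privileged region.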
[PATCH AUTOSEL 5.6 070/606] powerpc/ima: Fix secure boot rules in ima arch policy
From: Nayna Jain commit fa4f3f56ccd28ac031ab275e673ed4098855fed4 upstream. To prevent verifying the kernel module appended signature twice (finit_module), once by the module_sig_check() and again by IMA, powerpc secure boot rules define an IMA architecture specific policy rule only if CONFIG_MODULE_SIG_FORCE is not enabled. This, unfortunately, does not take into account the ability of enabling "sig_enforce" on the boot command line (module.sig_enforce=1). Including the IMA module appraise rule results in failing the finit_module syscall, unless the module signing public key is loaded onto the IMA keyring. This patch fixes secure boot policy rules to be based on CONFIG_MODULE_SIG instead. Fixes: 4238fad366a6 ("powerpc/ima: Add support to initialize ima policy rules") Signed-off-by: Nayna Jain Signed-off-by: Michael Ellerman Signed-off-by: Mimi Zohar Link: https://lore.kernel.org/r/1588342612-14532-1-git-send-email-na...@linux.ibm.com Signed-off-by: Greg Kroah-Hartman --- arch/powerpc/kernel/ima_arch.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/ima_arch.c b/arch/powerpc/kernel/ima_arch.c index e34116255ced..957abd592075 100644 --- a/arch/powerpc/kernel/ima_arch.c +++ b/arch/powerpc/kernel/ima_arch.c @@ -19,12 +19,12 @@ bool arch_ima_get_secureboot(void) * to be stored as an xattr or as an appended signature. * * To avoid duplicate signature verification as much as possible, the IMA - * policy rule for module appraisal is added only if CONFIG_MODULE_SIG_FORCE + * policy rule for module appraisal is added only if CONFIG_MODULE_SIG * is not enabled. 
*/ static const char *const secure_rules[] = { "appraise func=KEXEC_KERNEL_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig", -#ifndef CONFIG_MODULE_SIG_FORCE +#ifndef CONFIG_MODULE_SIG "appraise func=MODULE_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig", #endif NULL @@ -50,7 +50,7 @@ static const char *const secure_and_trusted_rules[] = { "measure func=KEXEC_KERNEL_CHECK template=ima-modsig", "measure func=MODULE_CHECK template=ima-modsig", "appraise func=KEXEC_KERNEL_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig", -#ifndef CONFIG_MODULE_SIG_FORCE +#ifndef CONFIG_MODULE_SIG "appraise func=MODULE_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig", #endif NULL -- 2.25.1
[PATCH AUTOSEL 5.6 039/606] powerpc/32s: Fix build failure with CONFIG_PPC_KUAP_DEBUG
From: Christophe Leroy

commit 4833ce06e6855d526234618b746ffb71d6612c9a upstream.

gpr2 is not a parameter of kuap_check(); it doesn't exist. Use gpr
instead.

Fixes: a68c31fc01ef ("powerpc/32s: Implement Kernel Userspace Access Protection")
Signed-off-by: Christophe Leroy
Signed-off-by: Michael Ellerman
Cc: sta...@vger.kernel.org
Link: https://lore.kernel.org/r/ea599546f2a7771bde551393889e44e6b2632332.1587368807.git.christophe.le...@c-s.fr
Signed-off-by: Greg Kroah-Hartman
---
 arch/powerpc/include/asm/book3s/32/kup.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index 3c0ba22dc360..db0a1c281587 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -75,7 +75,7 @@

 .macro kuap_check	current, gpr
 #ifdef CONFIG_PPC_KUAP_DEBUG
-	lwz	\gpr2, KUAP(thread)
+	lwz	\gpr, KUAP(thread)
 999:	twnei	\gpr, 0
 	EMIT_BUG_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | BUGFLAG_ONCE)
 #endif
--
2.25.1
[PATCH AUTOSEL 5.6 038/606] powerpc/vdso32: Fallback on getres syscall when clock is unknown
From: Christophe Leroy

commit e963b7a28b2bf2416304e1a15df967fcf662aff5 upstream.

There are other clocks than the standard ones, for instance
per-process clocks. Therefore, being above the last standard clock
doesn't mean it is a bad clock. So, fall back to the syscall instead
of returning -EINVAL unconditionally.

Fixes: e33ffc956b08 ("powerpc/vdso32: implement clock_getres entirely")
Cc: sta...@vger.kernel.org # v5.6+
Reported-by: Aurelien Jarno
Signed-off-by: Christophe Leroy
Signed-off-by: Michael Ellerman
Tested-by: Aurelien Jarno
Link: https://lore.kernel.org/r/7316a9e2c0c2517923eb4b0411c4a08d15e675a4.1589017281.git.christophe.le...@csgroup.eu
Signed-off-by: Greg Kroah-Hartman
---
 arch/powerpc/kernel/vdso32/gettimeofday.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
index a3951567118a..e7f8f9f1b3f4 100644
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
@@ -218,11 +218,11 @@ V_FUNCTION_BEGIN(__kernel_clock_getres)
 	blr

 	/*
-	 * invalid clock
+	 * syscall fallback
 	 */
 99:
-	li	r3, EINVAL
-	crset	so
+	li	r0,__NR_clock_getres
+	sc
 	blr
   .cfi_endproc
 V_FUNCTION_END(__kernel_clock_getres)
--
2.25.1
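The same "fast path for known clocks, syscall for everything else" shape can be sketched in user-space C. This is a Linux-only illustration, not the vDSO code: `fast_clock_getres()` is a hypothetical name, and `ASSUMED_HRES_NSEC` is a placeholder, since a real vDSO reads the resolution from kernel-provided data.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Assumed placeholder; a real vDSO reads this from kernel-provided data. */
#define ASSUMED_HRES_NSEC 1L

/*
 * Hypothetical fast path: answer directly for the clocks handled here,
 * and fall back to the real system call for everything else instead of
 * rejecting the clock id outright.
 */
static int fast_clock_getres(clockid_t id, struct timespec *res)
{
	switch (id) {
	case CLOCK_REALTIME:
	case CLOCK_MONOTONIC:
		if (res) {
			res->tv_sec = 0;
			res->tv_nsec = ASSUMED_HRES_NSEC;
		}
		return 0;
	default:
		/* Unknown to the fast path does not mean invalid. */
		return (int)syscall(SYS_clock_getres, id, res);
	}
}
```

With this shape, a per-process clock such as CLOCK_PROCESS_CPUTIME_ID is answered by the kernel, and only ids the kernel itself rejects produce an error.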
[PATCH AUTOSEL 5.7 161/274] powerpc/spufs: fix copy_to_user while atomic
From: Jeremy Kerr [ Upstream commit 88413a6bfbbe2f648df399b62f85c934460b7a4d ] Currently, we may perform a copy_to_user (through simple_read_from_buffer()) while holding a context's register_lock, while accessing the context save area. This change uses a temporary buffer for the context save area data, which we then pass to simple_read_from_buffer. Includes changes from Christoph Hellwig . Fixes: bf1ab978be23 ("[POWERPC] coredump: Add SPU elf notes to coredump.") Signed-off-by: Jeremy Kerr Reviewed-by: Arnd Bergmann [hch: renamed to function to avoid ___-prefixes] Signed-off-by: Christoph Hellwig Signed-off-by: Al Viro Signed-off-by: Sasha Levin --- arch/powerpc/platforms/cell/spufs/file.c | 113 +++ 1 file changed, 75 insertions(+), 38 deletions(-) diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c index c0f950a3f4e1..f4a4dfb191e7 100644 --- a/arch/powerpc/platforms/cell/spufs/file.c +++ b/arch/powerpc/platforms/cell/spufs/file.c @@ -1978,8 +1978,9 @@ static ssize_t __spufs_mbox_info_read(struct spu_context *ctx, static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { - int ret; struct spu_context *ctx = file->private_data; + u32 stat, data; + int ret; if (!access_ok(buf, len)) return -EFAULT; @@ -1988,11 +1989,16 @@ static ssize_t spufs_mbox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_mbox_info_read(ctx, buf, len, pos); + stat = ctx->csa.prob.mb_stat_R; + data = ctx->csa.prob.pu_mb_R; spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + /* EOF if there's no entry in the mbox */ + if (!(stat & 0xff)) + return 0; + + return simple_read_from_buffer(buf, len, pos, &data, sizeof(data)); } static const struct file_operations spufs_mbox_info_fops = { @@ -2019,6 +2025,7 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { struct 
spu_context *ctx = file->private_data; + u32 stat, data; int ret; if (!access_ok(buf, len)) @@ -2028,11 +2035,16 @@ static ssize_t spufs_ibox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_ibox_info_read(ctx, buf, len, pos); + stat = ctx->csa.prob.mb_stat_R; + data = ctx->csa.priv2.puint_mb_R; spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + /* EOF if there's no entry in the ibox */ + if (!(stat & 0xff)) + return 0; + + return simple_read_from_buffer(buf, len, pos, &data, sizeof(data)); } static const struct file_operations spufs_ibox_info_fops = { @@ -2041,6 +2053,11 @@ static const struct file_operations spufs_ibox_info_fops = { .llseek = generic_file_llseek, }; +static size_t spufs_wbox_info_cnt(struct spu_context *ctx) +{ + return (4 - ((ctx->csa.prob.mb_stat_R & 0x00ff00) >> 8)) * sizeof(u32); +} + static ssize_t __spufs_wbox_info_read(struct spu_context *ctx, char __user *buf, size_t len, loff_t *pos) { @@ -2049,7 +2066,7 @@ static ssize_t __spufs_wbox_info_read(struct spu_context *ctx, u32 wbox_stat; wbox_stat = ctx->csa.prob.mb_stat_R; - cnt = 4 - ((wbox_stat & 0x00ff00) >> 8); + cnt = spufs_wbox_info_cnt(ctx); for (i = 0; i < cnt; i++) { data[i] = ctx->csa.spu_mailbox_data[i]; } @@ -2062,7 +2079,8 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf, size_t len, loff_t *pos) { struct spu_context *ctx = file->private_data; - int ret; + u32 data[ARRAY_SIZE(ctx->csa.spu_mailbox_data)]; + int ret, count; if (!access_ok(buf, len)) return -EFAULT; @@ -2071,11 +2089,13 @@ static ssize_t spufs_wbox_info_read(struct file *file, char __user *buf, if (ret) return ret; spin_lock(&ctx->csa.register_lock); - ret = __spufs_wbox_info_read(ctx, buf, len, pos); + count = spufs_wbox_info_cnt(ctx); + memcpy(&data, &ctx->csa.spu_mailbox_data, sizeof(data)); spin_unlock(&ctx->csa.register_lock); spu_release_saved(ctx); - return ret; + return 
simple_read_from_buffer(buf, len, pos, &data, + count * sizeof(u32)); } static const struct file_operations spufs_wbox_info_fops = { @@ -2084,27 +2104,33 @@ static const struct file_operations spufs_wbox_info_fops = { .llseek = generic_file_llseek, }; -static ssize_t __sp
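The pattern used throughout this patch (snapshot the shared state under the lock, drop the lock, then do the copy that may sleep) can be sketched in user-space C. Everything here is a hypothetical miniature: `struct mbox_ctx`, `ctx_lock()`, `ctx_unlock()` and `read_from_buffer()` only stand in for the context save area, register_lock and simple_read_from_buffer(), and the copy is a plain memcpy() rather than a user copy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature of the context save area. */
struct mbox_ctx {
	atomic_flag lock;	/* stands in for csa.register_lock */
	uint32_t stat;		/* low byte: number of mailbox entries */
	uint32_t data;
};

static void ctx_init(struct mbox_ctx *c, uint32_t stat, uint32_t data)
{
	atomic_flag_clear(&c->lock);
	c->stat = stat;
	c->data = data;
}

static void ctx_lock(struct mbox_ctx *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;	/* spin */
}

static void ctx_unlock(struct mbox_ctx *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/*
 * Simplified stand-in for simple_read_from_buffer(): copy up to len
 * bytes starting at *pos and advance *pos. In the kernel this copy can
 * fault and sleep, which is why it must not run under a spinlock.
 */
static long read_from_buffer(void *to, size_t len, long *pos,
			     const void *from, size_t avail)
{
	size_t off;

	if (*pos < 0)
		return -1;
	off = (size_t)*pos;
	if (off >= avail || len == 0)
		return 0;
	if (len > avail - off)
		len = avail - off;
	memcpy(to, (const char *)from + off, len);
	*pos += (long)len;
	return (long)len;
}

static long mbox_info_read(struct mbox_ctx *c, void *buf, size_t len, long *pos)
{
	uint32_t stat, data;

	/* Only the snapshot happens under the lock. */
	ctx_lock(c);
	stat = c->stat;
	data = c->data;
	ctx_unlock(c);

	/* EOF if there is no entry in the mailbox. */
	if (!(stat & 0xff))
		return 0;

	return read_from_buffer(buf, len, pos, &data, sizeof(data));
}
```

The cost of this design is a small temporary copy on the stack. In exchange, the critical section contains only loads from the shared structure.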
[PATCH AUTOSEL 5.7 145/274] sched/core: Fix illegal RCU from offline CPUs
From: Peter Zijlstra [ Upstream commit bf2c59fce4074e55d622089b34be3a6bc95484fb ] In the CPU-offline process, it calls mmdrop() after idle entry and the subsequent call to cpuhp_report_idle_dead(). Once execution passes the call to rcu_report_dead(), RCU is ignoring the CPU, which results in lockdep complaining when mmdrop() uses RCU from either memcg or debugobjects below. Fix it by cleaning up the active_mm state from BP instead. Every arch which has CONFIG_HOTPLUG_CPU should have already called idle_task_exit() from AP. The only exception is parisc because it switches them to &init_mm unconditionally (see smp_boot_one_cpu() and smp_cpu_init()), but the patch will still work there because it calls mmgrab(&init_mm) in smp_cpu_init() and then should call mmdrop(&init_mm) in finish_cpu(). WARNING: suspicious RCU usage - kernel/workqueue.c:710 RCU or wq_pool_mutex should be held! other info that might help us debug this: RCU used illegally from offline CPU! Call Trace: dump_stack+0xf4/0x164 (unreliable) lockdep_rcu_suspicious+0x140/0x164 get_work_pool+0x110/0x150 __queue_work+0x1bc/0xca0 queue_work_on+0x114/0x120 css_release+0x9c/0xc0 percpu_ref_put_many+0x204/0x230 free_pcp_prepare+0x264/0x570 free_unref_page+0x38/0xf0 __mmdrop+0x21c/0x2c0 idle_task_exit+0x170/0x1b0 pnv_smp_cpu_kill_self+0x38/0x2e0 cpu_die+0x48/0x64 arch_cpu_idle_dead+0x30/0x50 do_idle+0x2f4/0x470 cpu_startup_entry+0x38/0x40 start_secondary+0x7a8/0xa80 start_secondary_resume+0x10/0x14 Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Qian Cai Signed-off-by: Peter Zijlstra (Intel) Acked-by: Michael Ellerman (powerpc) Link: https://lkml.kernel.org/r/20200401214033.8448-1-...@lca.pw Signed-off-by: Sasha Levin --- arch/powerpc/platforms/powernv/smp.c | 1 - include/linux/sched/mm.h | 2 ++ kernel/cpu.c | 18 +- kernel/sched/core.c | 5 +++-- 4 files changed, 22 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c index 
13e251699346..b2ba3e95bda7 100644 --- a/arch/powerpc/platforms/powernv/smp.c +++ b/arch/powerpc/platforms/powernv/smp.c @@ -167,7 +167,6 @@ static void pnv_smp_cpu_kill_self(void) /* Standard hot unplug procedure */ idle_task_exit(); - current->active_mm = NULL; /* for sanity */ cpu = smp_processor_id(); DBG("CPU%d offline\n", cpu); generic_set_cpu_dead(cpu); diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index c49257a3b510..a132d875d351 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -49,6 +49,8 @@ static inline void mmdrop(struct mm_struct *mm) __mmdrop(mm); } +void mmdrop(struct mm_struct *mm); + /* * This has to be called after a get_task_mm()/mmget_not_zero() * followed by taking the mmap_sem for writing before modifying the diff --git a/kernel/cpu.c b/kernel/cpu.c index 2371292f30b0..244d30544377 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -3,6 +3,7 @@ * * This code is licenced under the GPL. */ +#include #include #include #include @@ -564,6 +565,21 @@ static int bringup_cpu(unsigned int cpu) return bringup_wait_for_ap(cpu); } +static int finish_cpu(unsigned int cpu) +{ + struct task_struct *idle = idle_thread_get(cpu); + struct mm_struct *mm = idle->active_mm; + + /* +* idle_task_exit() will have switched to &init_mm, now +* clean up any remaining active_mm state. 
+*/ + if (mm != &init_mm) + idle->active_mm = &init_mm; + mmdrop(mm); + return 0; +} + /* * Hotplug state machine related functions */ @@ -1549,7 +1565,7 @@ static struct cpuhp_step cpuhp_hp_states[] = { [CPUHP_BRINGUP_CPU] = { .name = "cpu:bringup", .startup.single = bringup_cpu, - .teardown.single= NULL, + .teardown.single= finish_cpu, .cant_stop = true, }, /* Final state before CPU kills itself */ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 9a2fbf98fd6f..0bbf387d0f19 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6190,13 +6190,14 @@ void idle_task_exit(void) struct mm_struct *mm = current->active_mm; BUG_ON(cpu_online(smp_processor_id())); + BUG_ON(current != this_rq()->idle); if (mm != &init_mm) { switch_mm(mm, &init_mm, current); - current->active_mm = &init_mm; finish_arch_post_lock_switch(); } - mmdrop(mm); + + /* finish_cpu(), as ran on the BP, will clean up the active_mm state */ } /* -- 2.25.1
[PATCH AUTOSEL 5.7 038/274] soc: fsl: dpio: properly compute the consumer index
From: Ioana Ciornei

[ Upstream commit 7596ac9d19a9df25707ecaac0675881f62dd8c18 ]

Mask the consumer index before using it. Without this, we would be
writing frame descriptors beyond the ring size supported by the QBMAN
block.

Fixes: 3b2abda7d28c ("soc: fsl: dpio: Replace QMAN array mode with ring mode enqueue")
Signed-off-by: Ioana Ciornei
Acked-by: Li Yang
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
---
 drivers/soc/fsl/dpio/qbman-portal.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/soc/fsl/dpio/qbman-portal.c b/drivers/soc/fsl/dpio/qbman-portal.c
index 804b8ba9bf5c..23a1377971f4 100644
--- a/drivers/soc/fsl/dpio/qbman-portal.c
+++ b/drivers/soc/fsl/dpio/qbman-portal.c
@@ -669,6 +669,7 @@ int qbman_swp_enqueue_multiple_direct(struct qbman_swp *s,
 		eqcr_ci = s->eqcr.ci;
 		p = s->addr_cena + QBMAN_CENA_SWP_EQCR_CI;
 		s->eqcr.ci = qbman_read_register(s, QBMAN_CINH_SWP_EQCR_CI);
+		s->eqcr.ci &= full_mask;

 		s->eqcr.available = qm_cyc_diff(s->eqcr.pi_ring_size,
 					eqcr_ci, s->eqcr.ci);
--
2.25.1
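A small sketch of why the mask matters, under stated assumptions: the ring depth of 8, the index layout (index bits plus one wrap bit, so the mask is 2 * size - 1) and the body of `cyc_diff()` are illustrative guesses at the scheme, and `entries_freed()` is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u				/* hypothetical ring depth */
#define FULL_MASK (2u * RING_SIZE - 1u)		/* index bits plus one wrap bit */

/* Cyclic distance from 'first' (included) to 'last' (excluded). */
static uint32_t cyc_diff(uint32_t ring_size, uint32_t first, uint32_t last)
{
	if (first <= last)
		return last - first;
	return 2u * ring_size - (first - last);
}

/*
 * Number of entries the consumer has released since 'old_ci', given a
 * raw register value that may carry unrelated upper bits.
 */
static uint32_t entries_freed(uint32_t old_ci, uint32_t raw_reg)
{
	uint32_t ci = raw_reg & FULL_MASK;	/* the step the patch adds */

	return cyc_diff(RING_SIZE, old_ci, ci);
}
```

With a raw value of 0x105 and an old index of 3, the masked result is 2 entries. Feeding the unmasked value into `cyc_diff()` instead yields 258, far more than the ring can hold, which is the kind of overrun the commit message describes.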
[PATCH v12 2/6] seq_buf: Export seq_buf_printf
'seq_buf' provides a very useful abstraction for writing to a string
buffer without needing to worry about it overflowing. However, even
though the API has been stable for a couple of years now, it's still
not exported to kernel loadable modules, limiting its usage.

Hence this patch proposes an update to 'seq_buf.c' to mark
seq_buf_printf(), which is part of the seq_buf API, to be exported to
kernel loadable GPL modules. This symbol will be used in later parts
of this patch-set to simplify content creation for a sysfs attribute.

Cc: Piotr Maziarz
Cc: Cezary Rojewski
Cc: Christoph Hellwig
Cc: Steven Rostedt
Cc: Borislav Petkov
Acked-by: Steven Rostedt (VMware)
Signed-off-by: Vaibhav Jain
---
Changelog:

v11..v12:
* None

v10..v11:
* None

v9..v10:
* None

Resend:
* Added ack from Steven Rostedt

v8..v9:
* None

v7..v8:
* Updated the patch title [ Christoph Hellwig ]
* Updated patch description to replace confusing term 'external kernel
  modules' with 'kernel loadable modules'.

Resend:
* Added ack from Steven Rostedt

v6..v7:
* New patch in the series
---
 lib/seq_buf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 4e865d42ab03..707453f5d58e 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -91,6 +91,7 @@ int seq_buf_printf(struct seq_buf *s, const char *fmt, ...)

 	return ret;
 }
+EXPORT_SYMBOL_GPL(seq_buf_printf);

 #ifdef CONFIG_BINARY_PRINTF
 /**
--
2.26.2
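For readers unfamiliar with the idea, here is a user-space analogue of an overflow-safe formatted buffer. It is not the kernel's seq_buf: `struct sbuf`, `sbuf_init()` and `sbuf_printf()` are hypothetical names, and discarding a partial write on overflow is a choice made for this sketch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space analogue of a bounded string builder. */
struct sbuf {
	char *buf;
	size_t size;		/* total capacity, including the NUL */
	size_t len;		/* bytes written so far, excluding the NUL */
	int overflowed;
};

static void sbuf_init(struct sbuf *s, char *buf, size_t size)
{
	assert(size > 0);
	s->buf = buf;
	s->size = size;
	s->len = 0;
	s->overflowed = 0;
	buf[0] = '\0';
}

/* Returns 0 on success, -1 if the formatted text does not fit. */
static int sbuf_printf(struct sbuf *s, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (s->overflowed)
		return -1;

	va_start(ap, fmt);
	n = vsnprintf(s->buf + s->len, s->size - s->len, fmt, ap);
	va_end(ap);

	if (n >= 0 && (size_t)n < s->size - s->len) {
		s->len += (size_t)n;
		return 0;
	}

	/* Discard the partial write and remember that we overflowed. */
	s->buf[s->len] = '\0';
	s->overflowed = 1;
	return -1;
}
```

The caller never computes remaining space by hand. It keeps appending and checks the return value once at the end, which is the convenience the commit message refers to.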
[PATCH v12 6/6] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
that returns a newly introduced 'struct nd_papr_pdsm_health' instance
containing dimm health information back to user space in response to
ND_CMD_CALL. This functionality is implemented in newly introduced
papr_pdsm_health() that queries the nvdimm health information and
then copies this information to the package payload whose layout is
defined by 'struct nd_papr_pdsm_health'.

Cc: "Aneesh Kumar K . V"
Cc: Dan Williams
Cc: Michael Ellerman
Cc: Ira Weiny
Reviewed-by: Ira Weiny
Signed-off-by: Vaibhav Jain
---
Changelog:

v11..v12:
* Minor: Reordered the initialization of 'struct nd_papr_pdsm_health'
  fields to match order present in its definition. [ Ira ]
* Added ack from Ira

v10..v11:
* Changed the definition of 'struct nd_papr_pdsm_health' to a maximal
  struct 184 bytes in size [ Dan Williams ]
* Added new field 'extension_flags' to 'struct nd_papr_pdsm_health'
  [ Dan Williams ]
* Updated papr_pdsm_health() to set field 'extension_flags' to 0.
* Introduced a define ND_PDSM_PAYLOAD_MAX_SIZE that indicates the
  maximum size of a payload.
* Fixed a suspicious conversion from u64 to u8 in papr_pdsm_health
  that was preventing correct initialization of 'struct
  nd_papr_pdsm_health'. [ Ira ]

v9..v10:
* Removed code in papr_pdsm_health that performed validation on pdsm
  payload version and corresponding struct and defines used for
  validation of payload version.
* Dropped usage of struct papr_pdsm_health in 'struct
  papr_scm_priv'. Instead papr_pdsm_health() now uses
  'papr_scm_priv.health_bitmap' to populate the pdsm payload.
* Above change also fixes the problem where this patch was removing
  the code that was previously introduced in this patch-series.
  [ Ira ]
* Introduced a new def ND_PDSM_ENVELOPE_HDR_SIZE that indicates the
  space allocated to 'struct nd_pdsm_cmd_pkg' fields except 'struct
  nd_cmd_pkg'. This def is useful in validating payload sizes.
* Reworked papr_pdsm_health() to enforce a specific payload size for
  'PAPR_PDSM_HEALTH' pdsm request.

Resend:
* Added ack from Aneesh.

v8..v9:
* s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g [ Dan , Aneesh ]
* s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
* Renamed papr_scm_get_health() to papr_pdsm_health()
* Updated patch description to replace papr-scm dimm with nvdimm.

v7..v8:
* None

Resend:
* None

v6..v7:
* Updated flags_show() to use seq_buf_printf(). [Mpe]
* Updated papr_scm_get_health() to use newly introduced
  __drc_pmem_query_health() bypassing the cache [Mpe].

v5..v6:
* Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
  guard against the possibility of different compilers adding
  different padding to the struct [ Dan Williams ]
* Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
  'bool' and also updated drc_pmem_query_health() to take this into
  account. [ Dan Williams ]

v4..v5:
* None

v3..v4:
* Call the DSM_PAPR_SCM_HEALTH service function from
  papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]

v2..v3:
* Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
  as it's exported to userspace [Aneesh]
* Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
  from enum to #defines [Aneesh]

v1..v2:
* New patch in the series
---
 arch/powerpc/include/uapi/asm/papr_pdsm.h | 43 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 71 +++
 2 files changed, 114 insertions(+)

diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
index 34d1a41d2406..d453baea13c4 100644
--- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -70,13 +70,56 @@ struct nd_pdsm_cmd_pkg {
 	__u8 payload[];		/* In/Out: Sub-cmd data buffer */
 } __packed;

+/* Calculate size used by the pdsm header fields minus 'struct nd_cmd_pkg' */
+#define ND_PDSM_HDR_SIZE \
+	(sizeof(struct nd_pdsm_cmd_pkg) - sizeof(struct nd_cmd_pkg))
+
+/* Max payload size that we can handle */
+#define ND_PDSM_PAYLOAD_MAX_SIZE 184
+
 /*
  * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
  * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
  */
 enum papr_pdsm {
 	PAPR_PDSM_MIN = 0x0,
+	PAPR_PDSM_HEALTH,
 	PAPR_PDSM_MAX,
 };

+/* Various nvdimm health indicators */
+#define PAPR_PDSM_DIMM_HEALTHY       0
+#define PAPR_PDSM_DIMM_UNHEALTHY     1
+#define PAPR_PDSM_DIMM_CRITICAL      2
+#define PAPR_PDSM_DIMM_FATAL         3
+
+/*
+ * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH
+ * Various flags indicate the health status of the dimm.
+ *
+ * extension_flags	: Any extension fields present in the struct.
+ * dimm_unarmed		: Dimm not armed. So contents wont persist.
+ * dimm_bad_shutdown	: Previous shutdown did not persist contents.
+ * dimm_bad_restore	: Contents from previous shutdown werent restored.
+ * dimm_scrubbe
[PATCH v12 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods
Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
module and add the command family NVDIMM_FAMILY_PAPR to the white
list of NVDIMM command sets. Also advertise support for ND_CMD_CALL
for the nvdimm command mask and implement necessary scaffolding in the
module to handle ND_CMD_CALL ioctl and PDSM requests that we receive.

The layout of the PDSM request as we expect from libnvdimm/libndctl
is described in newly introduced uapi header 'papr_pdsm.h' which
defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used to
communicate the PDSM request via member 'nd_cmd_pkg.nd_command' and
the size of the payload that needs to be sent/received for servicing
the PDSM.

A new function is_cmd_valid() is implemented that reads the args to
papr_scm_ndctl() and performs sanity tests on them. A new function
papr_scm_service_pdsm() is introduced and is called from
papr_scm_ndctl() in case a PDSM request is received via ND_CMD_CALL
command from libnvdimm.

Cc: "Aneesh Kumar K . V"
Cc: Dan Williams
Cc: Michael Ellerman
Cc: Ira Weiny
Signed-off-by: Vaibhav Jain
---
Changelog:

v11..v12:
* Updated a misleading comment in 'papr_pdsm.h' regarding payload
  size. [ Ira ]

v10..v11:
* Moved in-lines 'nd_pdsm_cmd_pkg()' and 'pdsm_cmd_to_payload()' from
  'papr_pdsm.h' header to 'papr_scm.c'. This avoids a potential
  license incompatibility issue with non-GPL-2.0 user-space code
  trying to include the header in its code. [ Ira ]
* Verified papr_pdsm.h with UAPI_HEADER_TEST config.
* Moved the is_cmd_valid() check in papr_scm_ndctl() before check for
  cmd_rc == NULL. This prevents cmd_rc from being updated in case the
  nd-cmd is invalid or unknown.

v9..v10:
* Simplified 'struct nd_pdsm_cmd_pkg' by removing the
  'payload_version' field.
* Removed the corresponding documentation on versioning and backward
  compatibility from 'papr_pdsm.h'
* Reduced the size of reserved fields to 4-bytes making 'struct
  nd_pdsm_cmd_pkg' 64 + 8 bytes long.
* Updated is_cmd_valid() to enforce validation checks on pdsm
  commands. [ Dan Williams ]
* Added check for reserved fields being set to '0' in is_cmd_valid()
  [ Ira ]
* Moved changes for checking cmd_rc == NULL and logging improvements
  to a separate prelim patch [ Ira ].
* Moved pdsm package validation checks from papr_scm_service_pdsm()
  to is_cmd_valid().
* Marked papr_scm_service_pdsm() return type as 'void' since errors
  are reported in nd_pdsm_cmd_pkg.cmd_status field.

Resend:
* Added ack from Aneesh.

v8..v9:
* Reduced the usage of term SCM replacing it with appropriate
  replacement [ Dan Williams, Aneesh ]
* Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
* s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
* s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
* Minor updates to 'papr_pdsm.h' to replace usage of term 'SCM'.
* Minor update to patch description.

v7..v8:
* Removed the 'payload_offset' field from 'struct
  nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start
  at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
* To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
  'reserved' field of 10-bytes is introduced. [ Aneesh ]
* Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
  [ Ira ]

Resend:
* None

v6..v7 :
* Removed the re-definitions of __packed macro from papr_scm_pdsm.h
  [Mpe].
* Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe].
* Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h
  [Mpe].
* Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]

v5..v6 :
* Changed the usage of the term DSM to PDSM to distinguish it from the
  ACPI term [ Dan Williams ]
* Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct
  to reflect the new terminology.
* Updated the patch description and title to reflect the new
  terminology.
* Squashed patch to introduce new command family in 'ndctl.h' with
  this patch [ Dan Williams ]
* Updated the papr_scm_pdsm method starting index from 0x1 to 0x0
  [ Dan Williams ]
* Removed redundant license text from the papr_scm_pdsm.h file.
  [ Dan Williams ]
* s/envelop/envelope/ at various places [ Dan Williams ]
* Added '__packed' attribute to command package header to guard
  against different compilers adding padding between the fields.
  [ Dan Williams]
* Converted various pr_debug to dev_debug [ Dan Williams ]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Updated the patch prefix to 'ndctl/uapi' [Aneesh]

v1..v2 :
* None
---
 arch/powerpc/include/uapi/asm/papr_pdsm.h | 82 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 126 +-
 include/uapi/linux/ndctl.h                | 1 +
 3 files changed, 205 insertions(+), 4 deletions(-)
 create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h

diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
new file mode 100644
index ..34d1a41d2406
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -0,0 +1,82 @@
+/*
[PATCH v12 4/6] powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
Since papr_scm_ndctl() can be called from outside papr_scm, it's
exposed to the possibility of receiving NULL as the value of the
'cmd_rc' argument. This patch updates papr_scm_ndctl() to protect
against such a possibility by assigning it a pointer to a local
variable in case cmd_rc == NULL.

Finally, the patch also updates the 'default' case to add a debug log
for unknown 'cmd' values.

Cc: "Aneesh Kumar K . V"
Cc: Dan Williams
Cc: Michael Ellerman
Cc: Ira Weiny
Reviewed-by: Ira Weiny
Signed-off-by: Vaibhav Jain
---
Changelog:

v11..v12:
* Added ack from Ira

v10..v11:
* Instead of returning *cmd_rc just return '0' in case nd_cmd is
  handled. In case of unknown nd-cmd return -EINVAL [ Ira and Dan
  Williams ]
* Updated patch description.

v9..v10
* New patch in the series
---
 arch/powerpc/platforms/pseries/papr_scm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 0c091622b15e..692ad3d79826 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -355,11 +355,16 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 {
 	struct nd_cmd_get_config_size *get_size_hdr;
 	struct papr_scm_priv *p;
+	int rc;

 	/* Only dimm-specific calls are supported atm */
 	if (!nvdimm)
 		return -EINVAL;

+	/* Use a local variable in case cmd_rc pointer is NULL */
+	if (!cmd_rc)
+		cmd_rc = &rc;
+
 	p = nvdimm_provider_data(nvdimm);

 	switch (cmd) {
@@ -381,6 +386,7 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 		break;

 	default:
+		dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
 		return -EINVAL;
 	}

--
2.26.2
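The optional out-parameter idiom used here can be shown in isolation. This is a minimal sketch, assuming a hypothetical `handle_cmd()` with a single made-up command `CMD_GET_SIZE`. It mirrors only the NULL handling and the return convention described above, not the real papr_scm_ndctl() logic.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum { CMD_GET_SIZE = 1 };	/* hypothetical command number */

/*
 * 'cmd_rc' is optional: callers that do not care about the per-command
 * status may pass NULL. Redirecting it to a local once at the top means
 * the rest of the function can write through it unconditionally.
 */
static int handle_cmd(int cmd, int *cmd_rc)
{
	int rc;

	if (!cmd_rc)
		cmd_rc = &rc;

	switch (cmd) {
	case CMD_GET_SIZE:
		*cmd_rc = 0;	/* command-level status */
		break;
	default:
		return -EINVAL;	/* unknown command: status left untouched */
	}

	return 0;		/* the command was handled */
}
```

The alternative, checking `if (cmd_rc)` before every store, is easy to forget when a new case is added. The single redirection keeps each case free of NULL checks.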
[PATCH v12 3/6] powerpc/papr_scm: Fetch nvdimm health information from PHYP
Implement support for fetching nvdimm health information via
H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
of 64-bit bitmaps, the bitwise-and of which is then stored in
'struct papr_scm_priv' and subsequently partially exposed to
user-space via the newly introduced dimm-specific attribute
'papr/flags'. Since the hcall is costly, the health information is
cached and only re-queried 60s after the previous successful hcall.

The patch also adds documentation text describing the flags reported
by the new sysfs attribute 'papr/flags' at
Documentation/ABI/testing/sysfs-bus-papr-pmem.

[1] commit 58b278f568f0 ("powerpc: Provide initial documentation for
PAPR hcalls")

Cc: "Aneesh Kumar K . V"
Cc: Dan Williams
Cc: Michael Ellerman
Cc: Ira Weiny
Signed-off-by: Vaibhav Jain
---
Changelog:

v11..v12:
* None

v10..v11:
* None

v9..v10:
* Removed an avoidable 'goto' in __drc_pmem_query_health. [ Ira ].

Resend:
* Added ack from Aneesh.

v8..v9:
* Rename some variables and defines to reduce usage of term SCM
  replacing it with PMEM [Dan Williams, Aneesh]
* s/PAPR_SCM_DIMM/PAPR_PMEM/g
* s/papr_scm_nd_attributes/papr_nd_attributes/g
* s/papr_scm_nd_attribute_group/papr_nd_attribute_group/g
* s/papr_scm_dimm_attr_groups/papr_nd_attribute_groups/g
* Renamed file sysfs-bus-papr-scm to sysfs-bus-papr-pmem

v7..v8:
* Update type of variable 'rc' in __drc_pmem_query_health() and
  drc_pmem_query_health() to long and int respectively. [ Ira ]
* Updated the patch description to s/64 bit Big Endian Number/64-bit
  bitmap/ [ Ira, Aneesh ].

Resend:
* None

v6..v7 :
* Used the exported seq_buf_printf() function to generate content for
  'papr/flags'
* Moved the PAPR_SCM_DIMM_* bit-flags macro definitions to papr_scm.c
  and removed the papr_scm.h file [Mpe]
* Some minor consistency issues in sysfs-bus-papr-scm
  documentation. [Mpe]
* s/dimm_mutex/health_mutex/g [Mpe]
* Split drc_pmem_query_health() into two functions, one of which takes
  care of caching and locking. [Mpe]
* Fixed a local copy creation of dimm health information using
  READ_ONCE(). [Mpe]

v5..v6 :
* Change the flags sysfs attribute from 'papr_flags' to 'papr/flags'
  [Dan Williams]
* Include documentation for 'papr/flags' attr [Dan Williams]
* Change flag 'save_fail' to 'flush_fail' [Dan Williams]
* Caching of health bitmap to reduce expensive hcalls [Dan Williams]
* Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe]
* Replaced two __be64 integers from papr_scm_priv to a single u64
  integer [Mpe]
* Updated patch description to reflect the changes made in this
  version.
* Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from
  flags_show() [Dan Williams]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
  NVDIMM unarmed [Aneesh]

v1..v2 :
* New patch in the series.
---
 Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 +++
 arch/powerpc/platforms/pseries/papr_scm.c     | 168 +-
 2 files changed, 193 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem

diff --git a/Documentation/ABI/testing/sysfs-bus-papr-pmem b/Documentation/ABI/testing/sysfs-bus-papr-pmem
new file mode 100644
index ..5b10d036a8d4
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-papr-pmem
@@ -0,0 +1,27 @@
+What:		/sys/bus/nd/devices/nmemX/papr/flags
+Date:		Apr, 2020
+KernelVersion:	v5.8
+Contact:	linuxppc-dev , linux-nvd...@lists.01.org,
+Description:
+		(RO) Report flags indicating various states of a
+		papr-pmem NVDIMM device. Each flag maps to one or
+		more bits set in the dimm-health-bitmap retrieved in
+		response to H_SCM_HEALTH hcall. The details of the bit
+		flags returned in response to this hcall are available
+		at 'Documentation/powerpc/papr_hcalls.rst' . Below are
+		the flags reported in this sysfs file:
+
+		* "not_armed"	: Indicates that NVDIMM contents will not
+				  survive a power cycle.
+		* "flush_fail"	: Indicates that NVDIMM contents
+				  couldn't be flushed during last
+				  shut-down event.
+		* "restore_fail": Indicates that NVDIMM contents
+				  couldn't be restored during NVDIMM
+				  initialization.
+		* "encrypted"	: NVDIMM contents are encrypted.
+		* "smart_notify": There is a health event for the NVDIMM.
+		* "scrubbed"	: Indicating that contents of the
+				  NVDIMM have been scrubbed.
+		* "locked"	: Indicating that NVDIMM contents can't
+				  be modified until next power cycle.
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/ps
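The caching behaviour described in this patch (combine the two bitmaps with a bitwise AND, then re-query only once 60s have passed since the last successful hcall) can be sketched in plain user-space C. This is an illustration under stated assumptions, not the code from papr_scm.c: struct health_cache, stub_query_health() and cached_health() are hypothetical names, the stub returns made-up values in place of the real H_SCM_HEALTH hcall, and time is passed in as plain seconds rather than jiffies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Re-query interval taken from the patch description (60s). */
#define HEALTH_CACHE_SECS 60

struct health_cache {
	uint64_t bitmap;		/* health bits masked by validity */
	uint64_t last_query;		/* time of last successful query (s) */
	bool primed;			/* false until first successful query */
	unsigned int nr_queries;	/* how often the stub was called */
};

/* Stand-in for the H_SCM_HEALTH hcall; the values are made up. */
static int stub_query_health(uint64_t *health, uint64_t *valid)
{
	*health = 0xC400000000000000ULL;
	*valid  = 0x8400000000000000ULL;
	return 0;
}

/* Return the cached bitmap, refreshing it first when it is stale. */
static int cached_health(struct health_cache *c, uint64_t now, uint64_t *out)
{
	uint64_t health, valid;
	int rc;

	assert(c && out);

	if (!c->primed || now - c->last_query >= HEALTH_CACHE_SECS) {
		rc = stub_query_health(&health, &valid);
		c->nr_queries++;
		if (rc)
			return rc;	/* keep the old cache on failure */
		c->bitmap = health & valid;	/* only trust valid bits */
		c->last_query = now;
		c->primed = true;
	}

	*out = c->bitmap;
	return 0;
}
```

Only a successful query updates the timestamp in this sketch, so a failed query is retried on the next call instead of being hidden for another 60s.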
[PATCH v12 1/6] powerpc: Document details on H_SCM_HEALTH hcall
Add documentation to 'papr_hcalls.rst' describing the bitmap flags
that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
specification.

Cc: "Aneesh Kumar K . V"
Cc: Dan Williams
Cc: Michael Ellerman
Cc: Ira Weiny
Acked-by: Ira Weiny
Signed-off-by: Vaibhav Jain
---
Changelog:

v11..v12:
* None

v10..v11:
* None

v9..v10:
* Added ack from Ira.

Resend:
* None

v8..v9:
* s/SCM/PMEM device. [ Dan Williams, Aneesh ]

v7..v8:
* Added a clarification on bit-ordering of Health Bitmap

Resend:
* None

v6..v7:
* None

v5..v6:
* New patch in the series
---
 Documentation/powerpc/papr_hcalls.rst | 46 ---
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
index 3493631a60f8..48fcf1255a33 100644
--- a/Documentation/powerpc/papr_hcalls.rst
+++ b/Documentation/powerpc/papr_hcalls.rst
@@ -220,13 +220,51 @@ from the LPAR memory.
 **H_SCM_HEALTH**
 
 | Input: drcIndex
-| Out: *health-bitmap, health-bit-valid-bitmap*
+| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
 | Return Value: *H_Success, H_Parameter, H_Hardware*
 
 Given a DRC Index return the info on predictive failure and overall health of
-the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
-failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
-valid.
+the PMEM device. The asserted bits in the health-bitmap indicate one or more states
+(described in table below) of the PMEM device and health-bit-valid-bitmap indicate
+which bits in health-bitmap are valid. The bits are reported in
+reverse bit ordering for example a value of 0xC400000000000000
+indicates bits 0, 1, and 5 are valid.
+
+Health Bitmap Flags:
+
++------+-----------------------------------------------------------------------+
+| Bit  | Definition                                                            |
++======+=======================================================================+
+|  00  | PMEM device is unable to persist memory contents.                     |
+|      | If the system is powered down, nothing will be saved.                 |
++------+-----------------------------------------------------------------------+
+|  01  | PMEM device failed to persist memory contents. Either contents were   |
+|      | not saved successfully on power down or were not restored properly on |
+|      | power up.                                                             |
++------+-----------------------------------------------------------------------+
+|  02  | PMEM device contents are persisted from previous IPL. The data from   |
+|      | the last boot were successfully restored.                             |
++------+-----------------------------------------------------------------------+
+|  03  | PMEM device contents are not persisted from previous IPL. There was no|
+|      | data to restore from the last boot.                                   |
++------+-----------------------------------------------------------------------+
+|  04  | PMEM device memory life remaining is critically low                   |
++------+-----------------------------------------------------------------------+
+|  05  | PMEM device will be garded off next IPL due to failure                |
++------+-----------------------------------------------------------------------+
+|  06  | PMEM device contents cannot persist due to current platform health    |
+|      | status. A hardware failure may prevent data from being saved or       |
+|      | restored.                                                             |
++------+-----------------------------------------------------------------------+
+|  07  | PMEM device is unable to persist memory contents in certain conditions|
++------+-----------------------------------------------------------------------+
+|  08  | PMEM device is encrypted                                              |
++------+-----------------------------------------------------------------------+
+|  09  | PMEM device has successfully completed a requested erase or secure    |
+|      | erase procedure.                                                      |
++------+-----------------------------------------------------------------------+
+|10:63 | Reserved / Unused                                                     |
++------+-----------------------------------------------------------------------+
 
 **H_SCM_PERFORMANCE_STATS**
 
-- 
2.26.2
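To make the reverse bit ordering concrete: bit 0 is the most significant bit of the 64-bit value, so bits 0, 1 and 5 together give 0xC400000000000000. The helpers below are a small user-space sketch of that numbering; msb0_bit() and msb0_test() are hypothetical names chosen here for illustration (the kernel's PPC_BIT() macro appears to follow the same convention on 64-bit builds).

```c
#include <assert.h>
#include <stdint.h>

/* Mask for bit 'n' when bit 0 is the most significant of 64 bits. */
static uint64_t msb0_bit(unsigned int n)
{
	assert(n < 64);
	return UINT64_C(1) << (63 - n);
}

/* Test bit 'n' (MSB-0 numbering) in a health or validity bitmap. */
static int msb0_test(uint64_t bitmap, unsigned int n)
{
	return (bitmap & msb0_bit(n)) != 0;
}
```

For example, msb0_test(bitmap, 8) would check the "PMEM device is encrypted" row of the table, assuming the bitmap was read from the hcall as a native 64-bit integer.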
[PATCH v12 0/6] powerpc/papr_scm: Add support for reporting nvdimm health
Changes since v11 [1]:
* Minor update to 'papr_pdsm.h' fixing a misleading comment about
  'possible' padding being added by GCC which doesn't apply in case
  structs are marked as __packed.
* Fix the order of initialization of 'struct nd_papr_pdsm_health' in
  papr_pdsm_health().
* Added acks from Ira for various patches.

[1] https://lore.kernel.org/linux-nvdimm/20200607131339.476036-1-vaib...@linux.ibm.com
---

The PAPR standard[2][4] provides mechanisms to query the health and
performance stats of an NVDIMM via various hcalls as described in
Ref[3]. Until now these stats were never available nor exposed to the
user-space tools like 'ndctl'. This is partly due to PAPR platform not
having support for ACPI and NFIT. Hence 'ndctl' is unable to query and
report the dimm health status and a user had no way to determine the
current health status of an NVDIMM.

To overcome this limitation, this patch-set updates papr_scm kernel
module to query and fetch NVDIMM health stats using hcalls described
in Ref[3]. These health and performance stats are then exposed to
userspace via sysfs and PAPR-NVDIMM-Specific-Methods(PDSM) issued by
libndctl.

These changes coupled with proposed ndctl changes located at Ref[5]
should provide a way for the user to retrieve NVDIMM health status
using ndctl.

Below is a sample output using proposed kernel + ndctl for PAPR NVDIMM
in an emulation environment:

 # ndctl list -DH
[
  {
    "dev":"nmem0",
    "health":{
      "health_state":"fatal",
      "shutdown_state":"dirty"
    }
  }
]

Dimm health report output on a pseries guest lpar with vPMEM or HMS
based NVDIMMs that are in perfectly healthy conditions:

 # ndctl list -d nmem0 -H
[
  {
    "dev":"nmem0",
    "health":{
      "health_state":"ok",
      "shutdown_state":"clean"
    }
  }
]

PAPR NVDIMM-Specific-Methods(PDSM)
==================================

PDSM requests are issued by vendor specific code in libndctl to
execute certain operations or fetch information from NVDIMMs. PDSM
requests can be sent to papr_scm module via libndctl(userspace) and
libnvdimm (kernel) using the ND_CMD_CALL ioctl command which can be
handled in the dimm control function papr_scm_ndctl(). Current
patchset proposes a single PDSM to retrieve NVDIMM health, defined in
the newly introduced uapi header named 'papr_pdsm.h'. Support for
more PDSMs will be added in future.

Structure of the patch-set
==========================

The patch-set starts with a doc patch documenting details of hcall
H_SCM_HEALTH. Second patch exports kernel symbol seq_buf_printf()
that's used in subsequent patches to generate sysfs attribute content.

Third patch implements support for fetching NVDIMM health information
from PHYP and partially exposing it to user-space via a NVDIMM sysfs
flag.

Fourth patch updates papr_scm_ndctl() to handle a possible error case
and also improve debug logging.

Fifth patch deals with implementing support for servicing PDSM
commands in papr_scm module.

Finally the last patch implements support for servicing PDSM
'PAPR_PDSM_HEALTH' that returns the NVDIMM health information to
libndctl.

References:
[2] "Power Architecture Platform Reference"
    https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[3] commit 58b278f568f0
    ("powerpc: Provide initial documentation for PAPR hcalls")
[4] "Linux on Power Architecture Platform Reference"
    https://members.openpowerfoundation.org/document/dl/469
[5] https://github.com/vaibhav92/ndctl/tree/papr_scm_health_v12
---

Vaibhav Jain (6):
  powerpc: Document details on H_SCM_HEALTH hcall
  seq_buf: Export seq_buf_printf
  powerpc/papr_scm: Fetch nvdimm health information from PHYP
  powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
  ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
  powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH

 Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 ++
 Documentation/powerpc/papr_hcalls.rst         |  46 ++-
 arch/powerpc/include/uapi/asm/papr_pdsm.h     | 125 ++
 arch/powerpc/platforms/pseries/papr_scm.c     | 373 +-
 include/uapi/linux/ndctl.h                    |   1 +
 lib/seq_buf.c                                 |   1 +
 6 files changed, 562 insertions(+), 11 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem
 create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h

-- 
2.26.2
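As a rough illustration of the kind of sanity check a PDSM request goes through before it is serviced, here is a small user-space sketch. The enum values and ND_PDSM_PAYLOAD_MAX_SIZE are copied from the v11 'papr_pdsm.h' hunks quoted elsewhere in this thread; pdsm_request_ok() itself is a hypothetical helper written for this example and is not the series' is_cmd_valid().

```c
#include <stddef.h>
#include <stdint.h>

/* Values as quoted from the v11 papr_pdsm.h hunks in this thread. */
#define ND_PDSM_PAYLOAD_MAX_SIZE 184

enum papr_pdsm {
	PAPR_PDSM_MIN = 0x0,
	PAPR_PDSM_HEALTH,
	PAPR_PDSM_MAX,
};

/*
 * Hypothetical check: accept only command numbers strictly between
 * MIN and MAX, and only payloads that fit the maximum payload size.
 */
static int pdsm_request_ok(uint64_t pdsm, size_t payload_size)
{
	if (pdsm <= PAPR_PDSM_MIN || pdsm >= PAPR_PDSM_MAX)
		return 0;
	if (payload_size > ND_PDSM_PAYLOAD_MAX_SIZE)
		return 0;
	return 1;
}
```

Keeping MIN and MAX as sentinels means a newly added PDSM only needs a new enum entry between them; the range check itself does not change.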
Re: [PATCH v11 6/6] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
Thanks Ira, Ira Weiny writes: > On Sun, Jun 07, 2020 at 06:43:39PM +0530, Vaibhav Jain wrote: >> This patch implements support for PDSM request 'PAPR_PDSM_HEALTH' >> that returns a newly introduced 'struct nd_papr_pdsm_health' instance >> containing dimm health information back to user space in response to >> ND_CMD_CALL. This functionality is implemented in newly introduced >> papr_pdsm_health() that queries the nvdimm health information and >> then copies this information to the package payload whose layout is >> defined by 'struct nd_papr_pdsm_health'. >> >> Cc: "Aneesh Kumar K . V" >> Cc: Dan Williams >> Cc: Michael Ellerman >> Cc: Ira Weiny >> Signed-off-by: Vaibhav Jain >> --- >> Changelog: >> >> v10..v11: >> * Changed the definition of 'struct nd_papr_pdsm_health' to a maximal >> struct 184 bytes in size [ Dan Williams ] >> * Added new field 'extension_flags' to 'struct nd_papr_pdsm_health' >> [ Dan Williams ] >> * Updated papr_pdsm_health() to set field 'extension_flags' to 0. >> * Introduced a define ND_PDSM_PAYLOAD_MAX_SIZE that indicates the >> maximum size of a payload. >> * Fixed a suspicious conversion from u64 to u8 in papr_pdsm_health >> that was preventing correct initialization of 'struct >> nd_papr_pdsm_health'. [ Ira ] >> >> v9..v10: >> * Removed code in papr_pdsm_health that performed validation on pdsm >> payload version and corrosponding struct and defines used for >> validation of payload version. >> * Dropped usage of struct papr_pdsm_health in 'struct >> papr_scm_priv'. Instead papr_psdm_health() now uses >> 'papr_scm_priv.health_bitmap' to populate the pdsm payload. >> * Above change also fixes the problem where this patch was removing >> the code that was previously introduced in this patch-series. >> [ Ira ] >> * Introduced a new def ND_PDSM_ENVELOPE_HDR_SIZE that indicates the >> space allocated to 'struct nd_pdsm_cmd_pkg' fields except 'struct >> nd_cmd_pkg'. This def is useful in validating payload sizes. 
>> * Reworked papr_pdsm_health() to enforce a specific payload size for >> 'PAPR_PDSM_HEALTH' pdsm request. >> >> Resend: >> * Added ack from Aneesh. >> >> v8..v9: >> * s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g [ Dan , Aneesh ] >> * s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g >> * Renamed papr_scm_get_health() to papr_psdm_health() >> * Updated patch description to replace papr-scm dimm with nvdimm. >> >> v7..v8: >> * None >> >> Resend: >> * None >> >> v6..v7: >> * Updated flags_show() to use seq_buf_printf(). [Mpe] >> * Updated papr_scm_get_health() to use newly introduced >> __drc_pmem_query_health() bypassing the cache [Mpe]. >> >> v5..v6: >> * Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to >> gaurd against possibility of different compilers adding different >> paddings to the struct [ Dan Williams ] >> >> * Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of >> 'bool' and also updated drc_pmem_query_health() to take this into >> account. [ Dan Williams ] >> >> v4..v5: >> * None >> >> v3..v4: >> * Call the DSM_PAPR_SCM_HEALTH service function from >> papr_scm_service_dsm() instead of papr_scm_ndctl(). 
[Aneesh] >> >> v2..v3: >> * Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types >> as its exported to the userspace [Aneesh] >> * Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health >> from enum to #defines [Aneesh] >> >> v1..v2: >> * New patch in the series >> --- >> arch/powerpc/include/uapi/asm/papr_pdsm.h | 43 ++ >> arch/powerpc/platforms/pseries/papr_scm.c | 71 +++ >> 2 files changed, 114 insertions(+) >> >> diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h >> b/arch/powerpc/include/uapi/asm/papr_pdsm.h >> index df2447455cfe..12c7aa5ee8bf 100644 >> --- a/arch/powerpc/include/uapi/asm/papr_pdsm.h >> +++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h >> @@ -72,13 +72,56 @@ struct nd_pdsm_cmd_pkg { >> __u8 payload[]; /* In/Out: Sub-cmd data buffer */ >> } __packed; >> >> +/* Calculate size used by the pdsm header fields minus 'struct nd_cmd_pkg' >> */ >> +#define ND_PDSM_HDR_SIZE \ >> +(sizeof(struct nd_pdsm_cmd_pkg) - sizeof(struct nd_cmd_pkg)) >> + >> +/* Max payload size that we can handle */ >> +#define ND_PDSM_PAYLOAD_MAX_SIZE 184 >> + >> /* >> * Methods to be embedded in ND_CMD_CALL request. These are sent to the >> kernel >> * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct >> */ >> enum papr_pdsm { >> PAPR_PDSM_MIN = 0x0, >> +PAPR_PDSM_HEALTH, >> PAPR_PDSM_MAX, >> }; >> >> +/* Various nvdimm health indicators */ >> +#define PAPR_PDSM_DIMM_HEALTHY 0 >> +#define PAPR_PDSM_DIMM_UNHEALTHY 1 >> +#define PAPR_PDSM_DIMM_CRITICAL 2 >> +#define PAPR_PDSM_DIMM_FATAL 3 >> + >> +/* >> + * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH >> + * Various flags indicate the health status of the dimm. >> + * >> + * extensi
Re: [PATCH v11 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods
Ira Weiny writes: > On Sun, Jun 07, 2020 at 06:43:38PM +0530, Vaibhav Jain wrote: >> Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm >> module and add the command family NVDIMM_FAMILY_PAPR to the white list >> of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the >> nvdimm command mask and implement necessary scaffolding in the module >> to handle ND_CMD_CALL ioctl and PDSM requests that we receive. >> >> The layout of the PDSM request as we expect from libnvdimm/libndctl is >> described in newly introduced uapi header 'papr_pdsm.h' which >> defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used >> to communicate the PDSM request via member >> 'nd_cmd_pkg.nd_command' and size of payload that need to be >> sent/received for servicing the PDSM. >> >> A new function is_cmd_valid() is implemented that reads the args to >> papr_scm_ndctl() and performs sanity tests on them. A new function >> papr_scm_service_pdsm() is introduced and is called from >> papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL >> command from libnvdimm. >> >> Cc: "Aneesh Kumar K . V" >> Cc: Dan Williams >> Cc: Michael Ellerman >> Cc: Ira Weiny >> Signed-off-by: Vaibhav Jain >> --- >> Changelog: >> >> v10..v11: >> * Moved in-lines 'nd_pdsm_cmd_pkg()' and 'pdsm_cmd_to_payload()' from >> 'papr_pdsm.h' header to 'papr_scm.c'. The avoids a potential license >> incompatibility issue with non-GPL-2.0 user-space code trying to >> include the header in its code. [ Ira ] >> * Verified papr_pdsm.h with UAPI_HEADER_TEST config. >> * Moved the is_cmd_valid() check in papr_scm_ndctl() before check for >> cmd_rc == NULL. This prevents cmd_rc to be updated in case the >> nd-cmd is invalid or unknown. >> >> v9..v10: >> * Simplified 'struct nd_pdsm_cmd_pkg' by removing the >> 'payload_version' field. 
>> * Removed the corrosponding documentation on versioning and backward >> compatibility from 'papr_pdsm.h' >> * Reduced the size of reserved fields to 4-bytes making 'struct >> nd_pdsm_cmd_pkg' 64 + 8 bytes long. >> * Updated is_cmd_valid() to enforce validation checks on pdsm >> commands. [ Dan Williams ] >> * Added check for reserved fields being set to '0' in is_cmd_valid() >> [ Ira ] >> * Moved changes for checking cmd_rc == NULL and logging improvements >> to a separate prelim patch [ Ira ]. >> * Moved pdsm package validation checks from papr_scm_service_pdsm() >> to is_cmd_valid(). >> * Marked papr_scm_service_pdsm() return type as 'void' since errors >> are reported in nd_pdsm_cmd_pkg.cmd_status field. >> >> Resend: >> * Added ack from Aneesh. >> >> v8..v9: >> * Reduced the usage of term SCM replacing it with appropriate >> replacement [ Dan Williams, Aneesh ] >> * Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h' >> * s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g >> * s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g >> * Minor updates to 'papr_psdm.h' to replace usage of term 'SCM'. >> * Minor update to patch description. >> >> v7..v8: >> * Removed the 'payload_offset' field from 'struct >> nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start >> at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ] >> * To enable introducing new fields to 'struct nd_pdsm_cmd_pkg', >> 'reserved' field of 10-bytes is introduced. [ Aneesh ] >> * Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h >> [ Ira ] >> >> Resend: >> * None >> >> v6..v7 : >> * Removed the re-definitions of __packed macro from papr_scm_pdsm.h >> [Mpe]. >> * Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe]. >> * Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h >> [Mpe]. >> * Made functions defined in papr_scm_pdsm.h as static inline. 
[Mpe] >> >> v5..v6 : >> * Changed the usage of the term DSM to PDSM to distinguish it from the >> ACPI term [ Dan Williams ] >> * Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct >> to reflect the new terminology. >> * Updated the patch description and title to reflect the new terminology. >> * Squashed patch to introduce new command family in 'ndctl.h' with >> this patch [ Dan Williams ] >> * Updated the papr_scm_pdsm method starting index from 0x1 to 0x0 >> [ Dan Williams ] >> * Removed redundant license text from the papr_scm_psdm.h file. >> [ Dan Williams ] >> * s/envelop/envelope/ at various places [ Dan Williams ] >> * Added '__packed' attribute to command package header to gaurd >> against different compiler adding paddings between the fields. >> [ Dan Williams] >> * Converted various pr_debug to dev_debug [ Dan Williams ] >> >> v4..v5 : >> * None >> >> v3..v4 : >> * None >> >> v2..v3 : >> * Updated the patch prefix to 'ndctl/uapi' [Aneesh] >> >> v1..v2 : >> * None >> --- >> arch/powerpc/include/uapi/asm/papr_pdsm.h | 84 +++ >> arch/powerpc/platforms/pseries/papr_scm.c | 126 +- >> include/uapi/linux/ndctl.h| 1
Re: [PATCH v11 3/6] powerpc/papr_scm: Fetch nvdimm health information from PHYP
Hi Ira, During v9 you had provided your ack to this patch [1] and also had made a review comment in a later patch regarding an avoidable 'goto' statement. I have since updated the patch addressing that review comment. Can you please provide your ack to this patch too. [1] https://lore.kernel.org/linux-nvdimm/20200603231814.gk1505...@iweiny-desk2.sc.intel.com/T/#m668d7b35a2394104f11afdae5951e420a8ccffe6 [2] "I missed this... probably did not need the goto in the first patch?" https://lore.kernel.org/linux-nvdimm/20200603231814.gk1505...@iweiny-desk2.sc.intel.com/T/#m1ebdd309ac0cb6f47d3b574b8d05374b21ff75df Thanks, ~ Vaibhav Vaibhav Jain writes: > Implement support for fetching nvdimm health information via > H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair > of 64-bit bitmap, bitwise-and of which is then stored in > 'struct papr_scm_priv' and subsequently partially exposed to > user-space via newly introduced dimm specific attribute > 'papr/flags'. Since the hcall is costly, the health information is > cached and only re-queried, 60s after the previous successful hcall. > > The patch also adds a documentation text describing flags reported by > the the new sysfs attribute 'papr/flags' is also introduced at > Documentation/ABI/testing/sysfs-bus-papr-pmem. > > [1] commit 58b278f568f0 ("powerpc: Provide initial documentation for > PAPR hcalls") > > Cc: "Aneesh Kumar K . V" > Cc: Dan Williams > Cc: Michael Ellerman > Cc: Ira Weiny > Signed-off-by: Vaibhav Jain > --- > Changelog: > > v10..v11: > * None > > v9..v10: > * Removed an avoidable 'goto' in __drc_pmem_query_health. [ Ira ]. > > Resend: > * Added ack from Aneesh. 
> > v8..v9: > * Rename some variables and defines to reduce usage of term SCM > replacing it with PMEM [Dan Williams, Aneesh] > * s/PAPR_SCM_DIMM/PAPR_PMEM/g > * s/papr_scm_nd_attributes/papr_nd_attributes/g > * s/papr_scm_nd_attribute_group/papr_nd_attribute_group/g > * s/papr_scm_dimm_attr_groups/papr_nd_attribute_groups/g > * Renamed file sysfs-bus-papr-scm to sysfs-bus-papr-pmem > > v7..v8: > * Update type of variable 'rc' in __drc_pmem_query_health() and > drc_pmem_query_health() to long and int respectively. [ Ira ] > * Updated the patch description to s/64 bit Big Endian Number/64-bit > bitmap/ [ Ira, Aneesh ]. > > Resend: > * None > > v6..v7 : > * Used the exported buf_seq_printf() function to generate content for > 'papr/flags' > * Moved the PAPR_SCM_DIMM_* bit-flags macro definitions to papr_scm.c > and removed the papr_scm.h file [Mpe] > * Some minor consistency issued in sysfs-bus-papr-scm > documentation. [Mpe] > * s/dimm_mutex/health_mutex/g [Mpe] > * Split drc_pmem_query_health() into two function one of which takes > care of caching and locking. [Mpe] > * Fixed a local copy creation of dimm health information using > READ_ONCE(). [Mpe] > > v5..v6 : > * Change the flags sysfs attribute from 'papr_flags' to 'papr/flags' > [Dan Williams] > * Include documentation for 'papr/flags' attr [Dan Williams] > * Change flag 'save_fail' to 'flush_fail' [Dan Williams] > * Caching of health bitmap to reduce expensive hcalls [Dan Williams] > * Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe] > * Replaced two __be64 integers from papr_scm_priv to a single u64 > integer [Mpe] > * Updated patch description to reflect the changes made in this > version. > * Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from > flags_show() [Dan Williams] > > v4..v5 : > * None > > v3..v4 : > * None > > v2..v3 : > * Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for >NVDIMM unarmed [Aneesh] > > v1..v2 : > * New patch in the series. 
> --- > Documentation/ABI/testing/sysfs-bus-papr-pmem | 27 +++ > arch/powerpc/platforms/pseries/papr_scm.c | 168 +- > 2 files changed, 193 insertions(+), 2 deletions(-) > create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem > > diff --git a/Documentation/ABI/testing/sysfs-bus-papr-pmem > b/Documentation/ABI/testing/sysfs-bus-papr-pmem > new file mode 100644 > index ..5b10d036a8d4 > --- /dev/null > +++ b/Documentation/ABI/testing/sysfs-bus-papr-pmem > @@ -0,0 +1,27 @@ > +What:/sys/bus/nd/devices/nmemX/papr/flags > +Date:Apr, 2020 > +KernelVersion: v5.8 > +Contact: linuxppc-dev , > linux-nvd...@lists.01.org, > +Description: > + (RO) Report flags indicating various states of a > + papr-pmem NVDIMM device. Each flag maps to a one or > + more bits set in the dimm-health-bitmap retrieved in > + response to H_SCM_HEALTH hcall. The details of the bit > + flags returned in response to this hcall is available > + at 'Documentation/powerpc/papr_hcalls.rst' . Below are > + the flags reported in this sysfs file: > + > + * "not_armed" : Indicates that NVDIMM contents will not > + survive a
[PATCH] selftests: powerpc: Fix online CPU selection
On systems with a large number of CPUs, the test fails when it tries
to set affinity by calling sched_setaffinity() with a cpuset size that
is too small. This patch fixes it by making sure that the size of the
allocated cpu set depends on the number of CPUs as reported by
get_nprocs().

Reported-by: Shirisha Ganta
Signed-off-by: Harish
Signed-off-by: Sandipan Das
---
 .../powerpc/benchmarks/context_switch.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/powerpc/benchmarks/context_switch.c b/tools/testing/selftests/powerpc/benchmarks/context_switch.c
index a2e8c9da7fa5..de6c49d6f88f 100644
--- a/tools/testing/selftests/powerpc/benchmarks/context_switch.c
+++ b/tools/testing/selftests/powerpc/benchmarks/context_switch.c
@@ -19,6 +19,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -104,8 +105,9 @@ static void start_thread_on(void *(*fn)(void *), void *arg, unsigned long cpu)
 
 static void start_process_on(void *(*fn)(void *), void *arg, unsigned long cpu)
 {
-	int pid;
-	cpu_set_t cpuset;
+	int pid, ncpus;
+	cpu_set_t *cpuset;
+	size_t size;
 
 	pid = fork();
 	if (pid == -1) {
@@ -116,12 +118,16 @@ static void start_process_on(void *(*fn)(void *), void *arg, unsigned long cpu)
 	if (pid)
 		return;
 
-	CPU_ZERO(&cpuset);
-	CPU_SET(cpu, &cpuset);
+	ncpus = get_nprocs();
+	size = CPU_ALLOC_SIZE(ncpus);
+	cpuset = CPU_ALLOC(ncpus);
+	CPU_ZERO_S(size, cpuset);
+	CPU_SET_S(cpu, size, cpuset);
 
-	if (sched_setaffinity(0, sizeof(cpuset), &cpuset)) {
+	if (sched_setaffinity(0, size, cpuset)) {
 		perror("sched_setaffinity");
-		exit(1);
+		CPU_FREE(cpuset);
+		exit(-1);
 	}
 
 	fn(arg);
-- 
2.24.1
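The dynamically sized CPU set used by this patch can be exercised on its own. Below is a minimal sketch for Linux/glibc; alloc_single_cpu_set() is a hypothetical helper name, not something from the selftest. Note that the CPU count has to be known before CPU_ALLOC_SIZE() is evaluated. The sketch follows the patch in using get_nprocs(); on systems with offline or sparsely numbered CPUs, get_nprocs_conf() might be the safer choice, but that is an assumption and not something this thread establishes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/sysinfo.h>

/*
 * Build a dynamically sized CPU set containing only 'cpu'.
 * On success the caller owns the set and releases it with CPU_FREE();
 * '*size' is the byte size to hand to sched_setaffinity().
 */
static cpu_set_t *alloc_single_cpu_set(unsigned long cpu, size_t *size)
{
	int ncpus = get_nprocs();
	cpu_set_t *set;

	assert(ncpus > 0);

	/* The size depends on ncpus, so compute it only after the query. */
	*size = CPU_ALLOC_SIZE(ncpus);
	set = CPU_ALLOC(ncpus);
	if (!set)
		return NULL;

	CPU_ZERO_S(*size, set);
	CPU_SET_S(cpu, *size, set);
	return set;
}
```

A caller would then pass the result to sched_setaffinity(0, size, set) and call CPU_FREE(set) once the affinity has been applied or the call has failed.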
Re: [PATCH v11 6/6] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
On Sun, Jun 07, 2020 at 06:43:39PM +0530, Vaibhav Jain wrote: > This patch implements support for PDSM request 'PAPR_PDSM_HEALTH' > that returns a newly introduced 'struct nd_papr_pdsm_health' instance > containing dimm health information back to user space in response to > ND_CMD_CALL. This functionality is implemented in newly introduced > papr_pdsm_health() that queries the nvdimm health information and > then copies this information to the package payload whose layout is > defined by 'struct nd_papr_pdsm_health'. > > Cc: "Aneesh Kumar K . V" > Cc: Dan Williams > Cc: Michael Ellerman > Cc: Ira Weiny > Signed-off-by: Vaibhav Jain > --- > Changelog: > > v10..v11: > * Changed the definition of 'struct nd_papr_pdsm_health' to a maximal > struct 184 bytes in size [ Dan Williams ] > * Added new field 'extension_flags' to 'struct nd_papr_pdsm_health' > [ Dan Williams ] > * Updated papr_pdsm_health() to set field 'extension_flags' to 0. > * Introduced a define ND_PDSM_PAYLOAD_MAX_SIZE that indicates the > maximum size of a payload. > * Fixed a suspicious conversion from u64 to u8 in papr_pdsm_health > that was preventing correct initialization of 'struct > nd_papr_pdsm_health'. [ Ira ] > > v9..v10: > * Removed code in papr_pdsm_health that performed validation on pdsm > payload version and corrosponding struct and defines used for > validation of payload version. > * Dropped usage of struct papr_pdsm_health in 'struct > papr_scm_priv'. Instead papr_psdm_health() now uses > 'papr_scm_priv.health_bitmap' to populate the pdsm payload. > * Above change also fixes the problem where this patch was removing > the code that was previously introduced in this patch-series. > [ Ira ] > * Introduced a new def ND_PDSM_ENVELOPE_HDR_SIZE that indicates the > space allocated to 'struct nd_pdsm_cmd_pkg' fields except 'struct > nd_cmd_pkg'. This def is useful in validating payload sizes. 
> * Reworked papr_pdsm_health() to enforce a specific payload size for > 'PAPR_PDSM_HEALTH' pdsm request. > > Resend: > * Added ack from Aneesh. > > v8..v9: > * s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g [ Dan , Aneesh ] > * s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g > * Renamed papr_scm_get_health() to papr_psdm_health() > * Updated patch description to replace papr-scm dimm with nvdimm. > > v7..v8: > * None > > Resend: > * None > > v6..v7: > * Updated flags_show() to use seq_buf_printf(). [Mpe] > * Updated papr_scm_get_health() to use newly introduced > __drc_pmem_query_health() bypassing the cache [Mpe]. > > v5..v6: > * Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to > gaurd against possibility of different compilers adding different > paddings to the struct [ Dan Williams ] > > * Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of > 'bool' and also updated drc_pmem_query_health() to take this into > account. [ Dan Williams ] > > v4..v5: > * None > > v3..v4: > * Call the DSM_PAPR_SCM_HEALTH service function from > papr_scm_service_dsm() instead of papr_scm_ndctl(). 
[Aneesh] > > v2..v3: > * Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types > as its exported to the userspace [Aneesh] > * Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health > from enum to #defines [Aneesh] > > v1..v2: > * New patch in the series > --- > arch/powerpc/include/uapi/asm/papr_pdsm.h | 43 ++ > arch/powerpc/platforms/pseries/papr_scm.c | 71 +++ > 2 files changed, 114 insertions(+) > > diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h > b/arch/powerpc/include/uapi/asm/papr_pdsm.h > index df2447455cfe..12c7aa5ee8bf 100644 > --- a/arch/powerpc/include/uapi/asm/papr_pdsm.h > +++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h > @@ -72,13 +72,56 @@ struct nd_pdsm_cmd_pkg { > __u8 payload[]; /* In/Out: Sub-cmd data buffer */ > } __packed; > > +/* Calculate size used by the pdsm header fields minus 'struct nd_cmd_pkg' */ > +#define ND_PDSM_HDR_SIZE \ > + (sizeof(struct nd_pdsm_cmd_pkg) - sizeof(struct nd_cmd_pkg)) > + > +/* Max payload size that we can handle */ > +#define ND_PDSM_PAYLOAD_MAX_SIZE 184 > + > /* > * Methods to be embedded in ND_CMD_CALL request. These are sent to the > kernel > * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct > */ > enum papr_pdsm { > PAPR_PDSM_MIN = 0x0, > + PAPR_PDSM_HEALTH, > PAPR_PDSM_MAX, > }; > > +/* Various nvdimm health indicators */ > +#define PAPR_PDSM_DIMM_HEALTHY 0 > +#define PAPR_PDSM_DIMM_UNHEALTHY 1 > +#define PAPR_PDSM_DIMM_CRITICAL 2 > +#define PAPR_PDSM_DIMM_FATAL 3 > + > +/* > + * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH > + * Various flags indicate the health status of the dimm. > + * > + * extension_flags : Any extension fields present in the struct. > + * dimm_unarmed : Dimm not armed. So contents wont persist. > + * dimm_bad_shutdown : Previo
Re: [PATCH v11 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods
On Sun, Jun 07, 2020 at 06:43:38PM +0530, Vaibhav Jain wrote: > Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm > module and add the command family NVDIMM_FAMILY_PAPR to the white list > of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the > nvdimm command mask and implement necessary scaffolding in the module > to handle ND_CMD_CALL ioctl and PDSM requests that we receive. > > The layout of the PDSM request as we expect from libnvdimm/libndctl is > described in newly introduced uapi header 'papr_pdsm.h' which > defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used > to communicate the PDSM request via member > 'nd_cmd_pkg.nd_command' and size of payload that need to be > sent/received for servicing the PDSM. > > A new function is_cmd_valid() is implemented that reads the args to > papr_scm_ndctl() and performs sanity tests on them. A new function > papr_scm_service_pdsm() is introduced and is called from > papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL > command from libnvdimm. > > Cc: "Aneesh Kumar K . V" > Cc: Dan Williams > Cc: Michael Ellerman > Cc: Ira Weiny > Signed-off-by: Vaibhav Jain > --- > Changelog: > > v10..v11: > * Moved in-lines 'nd_pdsm_cmd_pkg()' and 'pdsm_cmd_to_payload()' from > 'papr_pdsm.h' header to 'papr_scm.c'. The avoids a potential license > incompatibility issue with non-GPL-2.0 user-space code trying to > include the header in its code. [ Ira ] > * Verified papr_pdsm.h with UAPI_HEADER_TEST config. > * Moved the is_cmd_valid() check in papr_scm_ndctl() before check for > cmd_rc == NULL. This prevents cmd_rc to be updated in case the > nd-cmd is invalid or unknown. > > v9..v10: > * Simplified 'struct nd_pdsm_cmd_pkg' by removing the > 'payload_version' field. 
> * Removed the corrosponding documentation on versioning and backward > compatibility from 'papr_pdsm.h' > * Reduced the size of reserved fields to 4-bytes making 'struct > nd_pdsm_cmd_pkg' 64 + 8 bytes long. > * Updated is_cmd_valid() to enforce validation checks on pdsm > commands. [ Dan Williams ] > * Added check for reserved fields being set to '0' in is_cmd_valid() > [ Ira ] > * Moved changes for checking cmd_rc == NULL and logging improvements > to a separate prelim patch [ Ira ]. > * Moved pdsm package validation checks from papr_scm_service_pdsm() > to is_cmd_valid(). > * Marked papr_scm_service_pdsm() return type as 'void' since errors > are reported in nd_pdsm_cmd_pkg.cmd_status field. > > Resend: > * Added ack from Aneesh. > > v8..v9: > * Reduced the usage of term SCM replacing it with appropriate > replacement [ Dan Williams, Aneesh ] > * Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h' > * s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g > * s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g > * Minor updates to 'papr_psdm.h' to replace usage of term 'SCM'. > * Minor update to patch description. > > v7..v8: > * Removed the 'payload_offset' field from 'struct > nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start > at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ] > * To enable introducing new fields to 'struct nd_pdsm_cmd_pkg', > 'reserved' field of 10-bytes is introduced. [ Aneesh ] > * Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h > [ Ira ] > > Resend: > * None > > v6..v7 : > * Removed the re-definitions of __packed macro from papr_scm_pdsm.h > [Mpe]. > * Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe]. > * Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h > [Mpe]. > * Made functions defined in papr_scm_pdsm.h as static inline. 
[Mpe] > > v5..v6 : > * Changed the usage of the term DSM to PDSM to distinguish it from the > ACPI term [ Dan Williams ] > * Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct > to reflect the new terminology. > * Updated the patch description and title to reflect the new terminology. > * Squashed patch to introduce new command family in 'ndctl.h' with > this patch [ Dan Williams ] > * Updated the papr_scm_pdsm method starting index from 0x1 to 0x0 > [ Dan Williams ] > * Removed redundant license text from the papr_scm_psdm.h file. > [ Dan Williams ] > * s/envelop/envelope/ at various places [ Dan Williams ] > * Added '__packed' attribute to command package header to gaurd > against different compiler adding paddings between the fields. > [ Dan Williams] > * Converted various pr_debug to dev_debug [ Dan Williams ] > > v4..v5 : > * None > > v3..v4 : > * None > > v2..v3 : > * Updated the patch prefix to 'ndctl/uapi' [Aneesh] > > v1..v2 : > * None > --- > arch/powerpc/include/uapi/asm/papr_pdsm.h | 84 +++ > arch/powerpc/platforms/pseries/papr_scm.c | 126 +- > include/uapi/linux/ndctl.h| 1 + > 3 files changed, 207 insertions(+), 4 deletions(-) > create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h > > diff --git a/arch/p
Re: [PATCH v11 4/6] powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
On Sun, Jun 07, 2020 at 06:43:37PM +0530, Vaibhav Jain wrote: > Since papr_scm_ndctl() can be called from outside papr_scm, it's > exposed to the possibility of receiving NULL as value of 'cmd_rc' > argument. This patch updates papr_scm_ndctl() to protect against such > possibility by pointing it at a local variable in case cmd_rc > == NULL. > > Finally the patch also updates the 'default' case to add a debug log for unknown > 'cmd' values. > > Cc: "Aneesh Kumar K . V" > Cc: Dan Williams > Cc: Michael Ellerman > Cc: Ira Weiny Reviewed-by: Ira Weiny > Signed-off-by: Vaibhav Jain > --- > Changelog: > > v10..v11: > * Instead of returning *cmd_rc just return '0' in case nd_cmd is > handled. In case of unknown nd-cmd return -EINVAL > [ Ira and Dan Williams ] > * Updated patch description. > > v9..v10 > * New patch in the series > --- > arch/powerpc/platforms/pseries/papr_scm.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/arch/powerpc/platforms/pseries/papr_scm.c > b/arch/powerpc/platforms/pseries/papr_scm.c > index 0c091622b15e..692ad3d79826 100644 > --- a/arch/powerpc/platforms/pseries/papr_scm.c > +++ b/arch/powerpc/platforms/pseries/papr_scm.c > @@ -355,11 +355,16 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor > *nd_desc, > { > struct nd_cmd_get_config_size *get_size_hdr; > struct papr_scm_priv *p; > + int rc; > > /* Only dimm-specific calls are supported atm */ > if (!nvdimm) > return -EINVAL; > > + /* Use a local variable in case cmd_rc pointer is NULL */ > + if (!cmd_rc) > + cmd_rc = &rc; > + > p = nvdimm_provider_data(nvdimm); > > switch (cmd) { > @@ -381,6 +386,7 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor > *nd_desc, > break; > > default: > + dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd); > return -EINVAL; > } > > -- > 2.26.2 >
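For readers following the thread, the NULL-tolerant out-parameter pattern in the quoted hunk can be tried outside the kernel. The following is only a minimal userspace sketch: demo_ndctl() and the DEMO_CMD_* values are hypothetical stand-ins, not the real papr_scm_ndctl() or ND_CMD_* definitions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command ids; the real ND_CMD_* values are not reproduced here. */
enum demo_cmd { DEMO_CMD_GET_SIZE = 1, DEMO_CMD_GET_DATA = 2 };

/*
 * Same shape as the patched function: the caller may pass NULL for
 * cmd_rc, so redirect it to a local before any handler writes through it.
 */
int demo_ndctl(int cmd, int *cmd_rc)
{
	int rc;

	/* Use a local variable in case the cmd_rc pointer is NULL */
	if (!cmd_rc)
		cmd_rc = &rc;

	switch (cmd) {
	case DEMO_CMD_GET_SIZE:
	case DEMO_CMD_GET_DATA:
		*cmd_rc = 0;	/* safe even when the caller passed NULL */
		break;
	default:
		/* Unknown command: report it without touching *cmd_rc */
		return -EINVAL;
	}

	return 0;
}
```

With this shape, demo_ndctl(DEMO_CMD_GET_SIZE, NULL) returns 0 rather than dereferencing NULL, and an unknown command returns -EINVAL and leaves the caller's status untouched, which matches the v11 changelog entry.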
Re: [PATCH] selftests: powerpc: Fix online CPU selection
On 6/8/20 8:12 PM, Sandipan Das wrote: > The size of the cpu set must be large enough for systems > with a very large number of CPUs. Otherwise, tests which > try to determine the first online CPU by calling > sched_getaffinity() will fail. This makes sure that the > size of the allocated cpu set is dependent on the number > of CPUs as reported by get_nprocs(). > > Fixes: 3752e453f6ba ("selftests/powerpc: Add tests of PMU EBBs") > Reported-by: Shirisha Ganta > Signed-off-by: Sandipan Das LGTM, Reviewed-by: Kamalesh Babulal -- Kamalesh
[PATCH] selftests: powerpc: Fix online CPU selection
The size of the cpu set must be large enough for systems with a very large number of CPUs. Otherwise, tests which try to determine the first online CPU by calling sched_getaffinity() will fail. This makes sure that the size of the allocated cpu set is dependent on the number of CPUs as reported by get_nprocs(). Fixes: 3752e453f6ba ("selftests/powerpc: Add tests of PMU EBBs") Reported-by: Shirisha Ganta Signed-off-by: Sandipan Das --- tools/testing/selftests/powerpc/utils.c | 33 - 1 file changed, 21 insertions(+), 12 deletions(-) diff --git a/tools/testing/selftests/powerpc/utils.c b/tools/testing/selftests/powerpc/utils.c index 933678f1ed0a..bb8e402752c0 100644 --- a/tools/testing/selftests/powerpc/utils.c +++ b/tools/testing/selftests/powerpc/utils.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include #include @@ -88,28 +89,36 @@ void *get_auxv_entry(int type) int pick_online_cpu(void) { - cpu_set_t mask; - int cpu; + int ncpus, cpu = -1; + cpu_set_t *mask; + size_t size; - CPU_ZERO(&mask); + ncpus = get_nprocs(); + size = CPU_ALLOC_SIZE(ncpus); + mask = CPU_ALLOC(ncpus); - if (sched_getaffinity(0, sizeof(mask), &mask)) { + CPU_ZERO_S(size, mask); + + if (sched_getaffinity(0, size, mask)) { perror("sched_getaffinity"); - return -1; + goto done; } /* We prefer a primary thread, but skip 0 */ - for (cpu = 8; cpu < CPU_SETSIZE; cpu += 8) - if (CPU_ISSET(cpu, &mask)) - return cpu; + for (cpu = 8; cpu < ncpus; cpu += 8) + if (CPU_ISSET_S(cpu, size, mask)) + goto done; /* Search for anything, but in reverse */ - for (cpu = CPU_SETSIZE - 1; cpu >= 0; cpu--) - if (CPU_ISSET(cpu, &mask)) - return cpu; + for (cpu = ncpus - 1; cpu >= 0; cpu--) + if (CPU_ISSET_S(cpu, size, mask)) + goto done; printf("No cpus in affinity mask?!\n"); - return -1; + +done: + CPU_FREE(mask); + return cpu; } bool is_ppc64le(void) -- 2.25.1
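The dynamically sized cpu set used by the patch can be exercised on its own. The sketch below assumes glibc on Linux; demo_pick_cpu() is a hypothetical name, and the "prefer a primary thread" loop from the patch is left out to keep it short. It is not the selftest helper itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <sys/sysinfo.h>

/*
 * Return the highest-numbered CPU in the caller's affinity mask, or -1.
 * The mask is sized from get_nprocs() rather than the fixed CPU_SETSIZE.
 */
int demo_pick_cpu(void)
{
	int ncpus, cpu = -1;
	cpu_set_t *mask;
	size_t size;

	ncpus = get_nprocs();
	assert(ncpus > 0);

	size = CPU_ALLOC_SIZE(ncpus);
	mask = CPU_ALLOC(ncpus);
	if (!mask) {
		perror("CPU_ALLOC");
		return -1;
	}

	CPU_ZERO_S(size, mask);

	if (sched_getaffinity(0, size, mask)) {
		perror("sched_getaffinity");
		goto done;	/* cpu is still -1 here */
	}

	/* Search in reverse; the loop ends with cpu == -1 if nothing is set */
	for (cpu = ncpus - 1; cpu >= 0; cpu--)
		if (CPU_ISSET_S(cpu, size, mask))
			goto done;

done:
	CPU_FREE(mask);
	return cpu;
}
```

The single exit label is what lets every path free the allocated mask, which is why the patch replaces the early returns with goto done.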
[PATCH v2] mm/debug_vm_pgtable: Fix kernel crash by checking for THP support
Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled while THP support is still unavailable at runtime on some platforms. For example, with 4K PAGE_SIZE, ppc64 supports THP only with radix translation. This results in the crash below when running with hash translation and 4K PAGE_SIZE. kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140! cpu 0x61: Vector: 700 (Program Check) at [c00ff948f860] pc: c18810f8: debug_vm_pgtable+0x480/0x8b0 lr: c18810ec: debug_vm_pgtable+0x474/0x8b0 ... [c00ff948faf0] c1880fec debug_vm_pgtable+0x374/0x8b0 (unreliable) [c00ff948fbf0] c0011648 do_one_initcall+0x98/0x4f0 [c00ff948fcd0] c1843928 kernel_init_freeable+0x330/0x3fc [c00ff948fdb0] c00122ac kernel_init+0x24/0x148 [c00ff948fe20] c000cc44 ret_from_kernel_thread+0x5c/0x78 Check for THP support correctly. Cc: anshuman.khand...@arm.com Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers") Signed-off-by: Aneesh Kumar K.V --- mm/debug_vm_pgtable.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index 188c18908964..df3a3a08f4f8 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -61,6 +61,9 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) { pmd_t pmd = pfn_pmd(pfn, prot); + if (!has_transparent_hugepage()) + return; + WARN_ON(!pmd_same(pmd, pmd)); WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd)))); WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd)))); @@ -80,6 +83,9 @@ static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { pud_t pud = pfn_pud(pfn, prot); + if (!has_transparent_hugepage()) + return; + WARN_ON(!pud_same(pud, pud)); WARN_ON(!pud_young(pud_mkyoung(pud_mkold(pud)))); WARN_ON(!pud_write(pud_mkwrite(pud_wrprotect(pud)))); -- 2.26.2
[PATCH] KVM: PPC: Book3S HV: increase KVMPPC_NR_LPIDS on POWER8 and POWER9
POWER8 and POWER9 have 12-bit LPIDs. Change LPID_RSVD to support up to (4096 - 2) guests on these processors. POWER7 is kept the same with a limitation of (1024 - 2), but it might be time to drop KVM support for POWER7. Tested with 2048 guests * 4 vCPUs on a witherspoon system with 512G RAM and a bit of swap. Signed-off-by: Cédric Le Goater --- arch/powerpc/include/asm/reg.h | 3 ++- arch/powerpc/kvm/book3s_64_mmu_hv.c | 8 ++-- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index 88e6c78100d9..b70bbfb0ea3c 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -473,7 +473,8 @@ #ifndef SPRN_LPID #define SPRN_LPID 0x13F /* Logical Partition Identifier */ #endif -#define LPID_RSVD 0x3ff /* Reserved LPID for partn switching */ +#define LPID_RSVD_POWER7 0x3ff /* Reserved LPID for partn switching */ +#define LPID_RSVD 0xfff /* Reserved LPID for partn switching */ #define SPRN_HMER 0x150 /* Hypervisor maintenance exception reg */ #define HMER_DEBUG_TRIG (1ul << (63 - 17)) /* Debug trigger */ #define SPRN_HMEER 0x151 /* Hyp maintenance exception enable reg */ diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index 18aed9775a3c..23035ab2ec50 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -260,11 +260,15 @@ int kvmppc_mmu_hv_init(void) if (!mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) return -EINVAL; - /* POWER7 has 10-bit LPIDs (12-bit in POWER8) */ host_lpid = 0; if (cpu_has_feature(CPU_FTR_HVMODE)) host_lpid = mfspr(SPRN_LPID); - rsvd_lpid = LPID_RSVD; + + /* POWER8 and above have 12-bit LPIDs (10-bit in POWER7) */ + if (cpu_has_feature(CPU_FTR_ARCH_207S)) + rsvd_lpid = LPID_RSVD; + else + rsvd_lpid = LPID_RSVD_POWER7; kvmppc_init_lpid(rsvd_lpid + 1); -- 2.25.4
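The guest counts quoted in the description follow directly from the reserved LPID value. The sketch below only illustrates that arithmetic. The DEMO_* names and both helpers are hypothetical, and reading the "- 2" as "the host's LPID plus the reserved LPID" is an assumption about the commit message rather than something the patch states.

```c
#include <stdbool.h>

#define DEMO_LPID_RSVD_POWER7	0x3ffUL	/* top of a 10-bit LPID space */
#define DEMO_LPID_RSVD		0xfffUL	/* top of a 12-bit LPID space */

/* Mirrors the patched selection: 12-bit LPIDs on POWER8 and above. */
unsigned long demo_rsvd_lpid(bool has_12bit_lpids)
{
	return has_12bit_lpids ? DEMO_LPID_RSVD : DEMO_LPID_RSVD_POWER7;
}

/*
 * rsvd_lpid + 1 is the size of the LPID space handed to the allocator.
 * Two values are assumed unavailable to guests: the host's LPID and the
 * reserved one.
 */
unsigned long demo_max_guests(unsigned long rsvd_lpid)
{
	return (rsvd_lpid + 1) - 2;
}
```

Under those assumptions demo_max_guests(demo_rsvd_lpid(true)) gives 4096 - 2 and the POWER7 case gives 1024 - 2, the two limits named in the description.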
Re: [PATCH] mm/debug_vm_pgtable: Fix kernel crash with page table validate
On 06/08/2020 04:46 PM, Aneesh Kumar K.V wrote: > On 6/8/20 4:31 PM, Anshuman Khandual wrote: >> Hi Aneesh, >> >> On 06/08/2020 11:57 AM, Aneesh Kumar K.V wrote: >>> Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled but >>> no THP support enabled based on platforms. For ex: with 4K >>> PAGE_SIZE ppc64 supports THP only with radix translation. >> >> Good catch, never hit this before. >> >>> >>> This results in below crash when running with hash translation and >>> 4K PAGE_SIZE. >>> >>> kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140! >>> cpu 0x61: Vector: 700 (Program Check) at [c00ff948f860] >>> pc: c18810f8: debug_vm_pgtable+0x480/0x8b0 >>> lr: c18810ec: debug_vm_pgtable+0x474/0x8b0 >>> ... >>> [c00ff948faf0] c1880fec debug_vm_pgtable+0x374/0x8b0 >>> (unreliable) >>> [c00ff948fbf0] c0011648 do_one_initcall+0x98/0x4f0 >>> [c00ff948fcd0] c1843928 kernel_init_freeable+0x330/0x3fc >>> [c00ff948fdb0] c00122ac kernel_init+0x24/0x148 >>> [c00ff948fe20] c000cc44 ret_from_kernel_thread+0x5c/0x78 >>> >>> Check for THP support correctly >> >> Makes sense, is this the only configuration which hit the problem ? > > 4K hash ppc64 is the only config i guess. Okay. > >> >>> >>> Cc: anshuman.khand...@arm.com >>> Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page >>> table helpers") >>> Signed-off-by: Aneesh Kumar K.V >>> --- >>> mm/debug_vm_pgtable.c | 3 +++ >>> 1 file changed, 3 insertions(+) >>> >>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c >>> index 188c18908964..e60151c5e997 100644 >>> --- a/mm/debug_vm_pgtable.c >>> +++ b/mm/debug_vm_pgtable.c >>> @@ -61,6 +61,9 @@ static void __init pmd_basic_tests(unsigned long pfn, >>> pgprot_t prot) >>> { >>> pmd_t pmd = pfn_pmd(pfn, prot); >>> + if (!has_transparent_hugepage()) >>> + return; >>> + >> >> We should also add this check to pud_basic_tests() as well. > > > Do we have a function that check for runtime support for pud level THP? ppc64 > don't do pud level THP yet. 
So we have > CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=n I believe we don't have such a generic function. Please correct me if I am missing something here. > > are you suggesting we do the same check for pud level THP too? Yes. Because regardless of CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD, could there be any THP at PUD level when has_transparent_hugepage() returns negative ? The current dependency between THP and PUD THP configs seems somewhat confusing but having this check at PUD level should protect against similar problems. A quick test (after adding this check to PUD level) on x86 does not indicate any problem on the normal path. > > >> >>> WARN_ON(!pmd_same(pmd, pmd)); >>> WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd)))); >>> WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd)))); >>> >> >> The subject line here should mention about correct THP support >> detection which fixes the problem. Probably something like this >> or similar ("Fix kernel crash with correct THP support check"). > > > Not sure about that. This fix a kernel crash with page table validate code. What this fixes is very clear from the prefix itself - "mm/debug_vm_pgtable:", making "page table validate" somewhat redundant. Instead, it could just accommodate the method of the fix, i.e. "via correct THP support check". Nonetheless, it is just a small nit.
Re: [v1 PATCH 1/2] Refactoring carrying over IMA measurement logs over Kexec.
Hi Prakhar, On Sun, 2020-06-07 at 16:33 -0700, Prakhar Srivastava wrote: > This patch moves the non-architecture specific code out of powerpc and > adds to security/ima. > Update the arm64 and powerpc kexec file load paths to carry the IMA > measurement logs. From your patch description, this patch should be broken up. Moving the non-architecture specific code out of powerpc should be one patch. Additional support should be in another patch. After each patch, the code should work properly. Before posting patches, please review them, making sure unnecessary/unwanted changes haven't crept in - commenting out code, moving code without removing the original code. thanks, Mimi
Re: [PATCH] mm/debug_vm_pgtable: Fix kernel crash with page table validate
On 6/8/20 4:31 PM, Anshuman Khandual wrote: Hi Aneesh, On 06/08/2020 11:57 AM, Aneesh Kumar K.V wrote: Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled but no THP support enabled based on platforms. For ex: with 4K PAGE_SIZE ppc64 supports THP only with radix translation. Good catch, never hit this before. This results in below crash when running with hash translation and 4K PAGE_SIZE. kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140! cpu 0x61: Vector: 700 (Program Check) at [c00ff948f860] pc: c18810f8: debug_vm_pgtable+0x480/0x8b0 lr: c18810ec: debug_vm_pgtable+0x474/0x8b0 ... [c00ff948faf0] c1880fec debug_vm_pgtable+0x374/0x8b0 (unreliable) [c00ff948fbf0] c0011648 do_one_initcall+0x98/0x4f0 [c00ff948fcd0] c1843928 kernel_init_freeable+0x330/0x3fc [c00ff948fdb0] c00122ac kernel_init+0x24/0x148 [c00ff948fe20] c000cc44 ret_from_kernel_thread+0x5c/0x78 Check for THP support correctly Makes sense, is this the only configuration which hit the problem ? 4K hash ppc64 is the only config i guess. Cc: anshuman.khand...@arm.com Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers") Signed-off-by: Aneesh Kumar K.V --- mm/debug_vm_pgtable.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index 188c18908964..e60151c5e997 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -61,6 +61,9 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) { pmd_t pmd = pfn_pmd(pfn, prot); + if (!has_transparent_hugepage()) + return; + We should also add this check to pud_basic_tests() as well. Do we have a function that check for runtime support for pud level THP? ppc64 don't do pud level THP yet. So we have CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=n are you suggesting we do the same check for pud level THP too? 
WARN_ON(!pmd_same(pmd, pmd)); WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd)))); WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd)))); The subject line here should mention about correct THP support detection which fixes the problem. Probably something like this or similar ("Fix kernel crash with correct THP support check"). Not sure about that. This fixes a kernel crash with page table validate code. -aneesh
Re: [PATCH] mm/debug_vm_pgtable: Fix kernel crash with page table validate
Hi Aneesh, On 06/08/2020 11:57 AM, Aneesh Kumar K.V wrote: > Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled but > no THP support enabled based on platforms. For ex: with 4K > PAGE_SIZE ppc64 supports THP only with radix translation. Good catch, never hit this before. > > This results in below crash when running with hash translation and > 4K PAGE_SIZE. > > kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140! > cpu 0x61: Vector: 700 (Program Check) at [c00ff948f860] > pc: c18810f8: debug_vm_pgtable+0x480/0x8b0 > lr: c18810ec: debug_vm_pgtable+0x474/0x8b0 > ... > [c00ff948faf0] c1880fec debug_vm_pgtable+0x374/0x8b0 (unreliable) > [c00ff948fbf0] c0011648 do_one_initcall+0x98/0x4f0 > [c00ff948fcd0] c1843928 kernel_init_freeable+0x330/0x3fc > [c00ff948fdb0] c00122ac kernel_init+0x24/0x148 > [c00ff948fe20] c000cc44 ret_from_kernel_thread+0x5c/0x78 > > Check for THP support correctly Makes sense, is this the only configuration which hit the problem ? > > Cc: anshuman.khand...@arm.com > Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table > helpers") > Signed-off-by: Aneesh Kumar K.V > --- > mm/debug_vm_pgtable.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c > index 188c18908964..e60151c5e997 100644 > --- a/mm/debug_vm_pgtable.c > +++ b/mm/debug_vm_pgtable.c > @@ -61,6 +61,9 @@ static void __init pmd_basic_tests(unsigned long pfn, > pgprot_t prot) > { > pmd_t pmd = pfn_pmd(pfn, prot); > > + if (!has_transparent_hugepage()) > + return; > + We should also add this check to pud_basic_tests() as well. > WARN_ON(!pmd_same(pmd, pmd)); > WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd)))); > WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd)))); > The subject line here should mention about correct THP support detection which fixes the problem. Probably something like this or similar ("Fix kernel crash with correct THP support check"). - Anshuman
[RFC PATCH v0 3/4] powerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if enabled
H_REGISTER_PROC_TBL asks for GTSE by default. GTSE flag bit should be set only when GTSE is supported. Signed-off-by: Bharata B Rao --- arch/powerpc/platforms/pseries/lpar.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c index e4ed5317f117..58ba76bc1964 100644 --- a/arch/powerpc/platforms/pseries/lpar.c +++ b/arch/powerpc/platforms/pseries/lpar.c @@ -1680,9 +1680,11 @@ static int pseries_lpar_register_process_table(unsigned long base, if (table_size) flags |= PROC_TABLE_NEW; - if (radix_enabled()) - flags |= PROC_TABLE_RADIX | PROC_TABLE_GTSE; - else + if (radix_enabled()) { + flags |= PROC_TABLE_RADIX; + if (mmu_has_feature(MMU_FTR_GTSE)) + flags |= PROC_TABLE_GTSE; + } else flags |= PROC_TABLE_HPT_SLB; for (;;) { rc = plpar_hcall_norets(H_REGISTER_PROC_TBL, flags, base, -- 2.21.3
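The flag selection after this change can be written as a small pure function, which makes the three cases easy to see. This is a sketch only: the DEMO_PROC_TABLE_* bit values are placeholders chosen so the example compiles and are not taken from lpar.c, and demo_proc_table_flags() is a hypothetical helper with the radix and GTSE state passed in as plain booleans.

```c
#include <stdbool.h>

/* Placeholder bit values for illustration; not the real PROC_TABLE_* values. */
#define DEMO_PROC_TABLE_NEW	0x10UL
#define DEMO_PROC_TABLE_HPT_SLB	0x08UL
#define DEMO_PROC_TABLE_RADIX	0x04UL
#define DEMO_PROC_TABLE_GTSE	0x01UL

unsigned long demo_proc_table_flags(bool new_table, bool radix, bool gtse)
{
	unsigned long flags = 0;

	if (new_table)
		flags |= DEMO_PROC_TABLE_NEW;

	if (radix) {
		flags |= DEMO_PROC_TABLE_RADIX;
		/* Ask for GTSE only when the feature is actually present */
		if (gtse)
			flags |= DEMO_PROC_TABLE_GTSE;
	} else {
		/* Hash never requests GTSE, whatever the feature bit says */
		flags |= DEMO_PROC_TABLE_HPT_SLB;
	}

	return flags;
}
```

Before the patch the radix branch set both bits unconditionally; the nested check is the whole behavioural change.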
[RFC PATCH v0 4/4] powerpc/mm/book3s64/radix: Off-load TLB invalidations to host when !GTSE
From: Nicholas Piggin When platform doesn't support GTSE, let TLB invalidation requests for radix guests be off-loaded to the host using H_RPT_INVALIDATE hcall Signed-off-by: Nicholas Piggin Signed-off-by: Bharata B Rao --- arch/powerpc/include/asm/hvcall.h | 1 + arch/powerpc/include/asm/plpar_wrappers.h | 14 +++ arch/powerpc/mm/book3s64/radix_tlb.c | 105 -- 3 files changed, 113 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h index e90c073e437e..08917147415b 100644 --- a/arch/powerpc/include/asm/hvcall.h +++ b/arch/powerpc/include/asm/hvcall.h @@ -335,6 +335,7 @@ #define H_GET_24X7_CATALOG_PAGE0xF078 #define H_GET_24X7_DATA0xF07C #define H_GET_PERF_COUNTER_INFO0xF080 +#define H_RPT_INVALIDATE 0xF084 /* Platform-specific hcalls used for nested HV KVM */ #define H_SET_PARTITION_TABLE 0xF800 diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h index 4497c8afb573..e952139b0e47 100644 --- a/arch/powerpc/include/asm/plpar_wrappers.h +++ b/arch/powerpc/include/asm/plpar_wrappers.h @@ -334,6 +334,13 @@ static inline long plpar_get_cpu_characteristics(struct h_cpu_char_result *p) return rc; } +static inline long pseries_rpt_invalidate(u32 pid, u64 target, u64 what, + u64 pages, u64 start, u64 end) +{ + return plpar_hcall_norets(H_RPT_INVALIDATE, pid, target, what, + pages, start, end); +} + #else /* !CONFIG_PPC_PSERIES */ static inline long plpar_set_ciabr(unsigned long ciabr) @@ -346,6 +353,13 @@ static inline long plpar_pte_read_4(unsigned long flags, unsigned long ptex, { return 0; } + +static inline long pseries_rpt_invalidate(u32 pid, u64 target, u64 what, + u64 pages, u64 start, u64 end) +{ + return 0; +} + #endif /* CONFIG_PPC_PSERIES */ #endif /* _ASM_POWERPC_PLPAR_WRAPPERS_H */ diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c index b5cc9b23cf02..4dd1d3c75562 100644 --- a/arch/powerpc/mm/book3s64/radix_tlb.c +++ 
b/arch/powerpc/mm/book3s64/radix_tlb.c @@ -16,11 +16,39 @@ #include #include #include +#include #define RIC_FLUSH_TLB 0 #define RIC_FLUSH_PWC 1 #define RIC_FLUSH_ALL 2 +#define H_TLBI_TLB 0x0001 +#define H_TLBI_PWC 0x0002 +#define H_TLBI_PRS 0x0004 + +#define H_TLBI_TARGET_CMMU 0x01 +#define H_TLBI_TARGET_CMMU_LOCAL 0x02 +#define H_TLBI_TARGET_NMMU 0x04 + +#define H_TLBI_PAGE_ALL (-1UL) +#define H_TLBI_PAGE_4K 0x01 +#define H_TLBI_PAGE_64K0x02 +#define H_TLBI_PAGE_2M 0x04 +#define H_TLBI_PAGE_1G 0x08 + +static inline u64 psize_to_h_tlbi(unsigned long psize) +{ + if (psize == MMU_PAGE_4K) + return H_TLBI_PAGE_4K; + if (psize == MMU_PAGE_64K) + return H_TLBI_PAGE_64K; + if (psize == MMU_PAGE_2M) + return H_TLBI_PAGE_2M; + if (psize == MMU_PAGE_1G) + return H_TLBI_PAGE_1G; + return H_TLBI_PAGE_ALL; +} + /* * tlbiel instruction for radix, set invalidation * i.e., r=1 and is=01 or is=10 or is=11 @@ -694,7 +722,14 @@ void radix__flush_tlb_mm(struct mm_struct *mm) goto local; } - if (cputlb_use_tlbie()) { + if (!mmu_has_feature(MMU_FTR_GTSE)) { + unsigned long targ = H_TLBI_TARGET_CMMU; + + if (atomic_read(&mm->context.copros) > 0) + targ |= H_TLBI_TARGET_NMMU; + pseries_rpt_invalidate(pid, targ, H_TLBI_TLB, + H_TLBI_PAGE_ALL, 0, -1UL); + } else if (cputlb_use_tlbie()) { if (mm_needs_flush_escalation(mm)) _tlbie_pid(pid, RIC_FLUSH_ALL); else @@ -727,7 +762,16 @@ static void __flush_all_mm(struct mm_struct *mm, bool fullmm) goto local; } } - if (cputlb_use_tlbie()) + if (!mmu_has_feature(MMU_FTR_GTSE)) { + unsigned long targ = H_TLBI_TARGET_CMMU; + unsigned long what = H_TLBI_TLB | H_TLBI_PWC | +H_TLBI_PRS; + + if (atomic_read(&mm->context.copros) > 0) + targ |= H_TLBI_TARGET_NMMU; + pseries_rpt_invalidate(pid, targ, what, + H_TLBI_PAGE_ALL, 0, -1UL); + } else if (cputlb_use_tlbie()) _tlbie_pid(pid, RIC_FLUSH_ALL); else _tlbiel_pid_multicast(mm, pid, RIC_FLUSH_ALL); @@ -760,7 +804,17 @@ void radix__flush_t
[RFC PATCH v0 1/4] powerpc/mm: Make GTSE as MMU FTR
Make GTSE an MMU feature and enable it by default for radix. However, for guests, enable it only if the hypervisor supports it via the OV5 vector. Making GTSE an MMU feature will make it easy to enable radix without GTSE. Signed-off-by: Bharata B Rao --- arch/powerpc/include/asm/mmu.h | 4 arch/powerpc/kernel/dt_cpu_ftrs.c | 2 ++ arch/powerpc/mm/init_64.c | 6 +- 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h index f4ac25d4df05..884d51995934 100644 --- a/arch/powerpc/include/asm/mmu.h +++ b/arch/powerpc/include/asm/mmu.h @@ -28,6 +28,9 @@ * Individual features below. */ +/* Guest Translation Shootdown Enable */ +#define MMU_FTR_GTSE ASM_CONST(0x1000) + /* * Support for 68 bit VA space. We added that from ISA 2.05 */ @@ -173,6 +176,7 @@ enum { #endif #ifdef CONFIG_PPC_RADIX_MMU MMU_FTR_TYPE_RADIX | + MMU_FTR_GTSE | #ifdef CONFIG_PPC_KUAP MMU_FTR_RADIX_KUAP | #endif /* CONFIG_PPC_KUAP */ diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c index 3a409517c031..571aa39e35d5 100644 --- a/arch/powerpc/kernel/dt_cpu_ftrs.c +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c @@ -337,6 +337,8 @@ static int __init feat_enable_mmu_radix(struct dt_cpu_feature *f) #ifdef CONFIG_PPC_RADIX_MMU cur_cpu_spec->mmu_features |= MMU_FTR_TYPE_RADIX; cur_cpu_spec->mmu_features |= MMU_FTRS_HASH_BASE; + /* TODO: Does this need a separate cpu dt feature?
*/ + cur_cpu_spec->mmu_features |= MMU_FTR_GTSE; cur_cpu_spec->cpu_user_features |= PPC_FEATURE_HAS_MMU; return 1; diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c index c7ce4ec5060e..feb9bed9177c 100644 --- a/arch/powerpc/mm/init_64.c +++ b/arch/powerpc/mm/init_64.c @@ -408,13 +408,17 @@ static void __init early_check_vec5(void) if (!(vec5[OV5_INDX(OV5_RADIX_GTSE)] & OV5_FEAT(OV5_RADIX_GTSE))) { pr_warn("WARNING: Hypervisor doesn't support RADIX with GTSE\n"); - } + cur_cpu_spec->mmu_features &= ~MMU_FTR_GTSE; + } else + cur_cpu_spec->mmu_features |= MMU_FTR_GTSE; /* Do radix anyway - the hypervisor said we had to */ cur_cpu_spec->mmu_features |= MMU_FTR_TYPE_RADIX; } else if (mmu_supported == OV5_FEAT(OV5_MMU_HASH)) { /* Hypervisor only supports hash - disable radix */ cur_cpu_spec->mmu_features &= ~MMU_FTR_TYPE_RADIX; + cur_cpu_spec->mmu_features &= ~MMU_FTR_GTSE; } + } void __init mmu_early_init_devtree(void) -- 2.21.3
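The feature-bit handling in early_check_vec5() reduces to a small state update, sketched here in userspace. Everything in it is illustrative: the DEMO_FTR_* bit values, the demo_hv_mmu enum and demo_check_vec5() are hypothetical, and only the two branches visible in the hunk (hypervisor requires radix, hypervisor supports only hash) are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder feature bits; not the real MMU_FTR_* values. */
#define DEMO_FTR_TYPE_RADIX	0x1u
#define DEMO_FTR_GTSE		0x2u

enum demo_hv_mmu { DEMO_HV_RADIX, DEMO_HV_HASH };

uint32_t demo_check_vec5(uint32_t features, enum demo_hv_mmu mmu, bool hv_gtse)
{
	switch (mmu) {
	case DEMO_HV_RADIX:
		/* GTSE follows what the hypervisor advertises */
		if (hv_gtse)
			features |= DEMO_FTR_GTSE;
		else
			features &= ~DEMO_FTR_GTSE;
		/* Do radix anyway, the hypervisor said we had to */
		features |= DEMO_FTR_TYPE_RADIX;
		break;
	case DEMO_HV_HASH:
		/* Hash only: drop radix and, with it, GTSE */
		features &= ~(DEMO_FTR_TYPE_RADIX | DEMO_FTR_GTSE);
		break;
	}

	return features;
}
```

Radix without GTSE, i.e. DEMO_FTR_TYPE_RADIX set and DEMO_FTR_GTSE clear, is the state the rest of the series keys off when it routes TLB invalidations to the hypervisor.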