[powerpc:next] BUILD SUCCESS 90bae4d99beb1f31d8bde7c438a36e8875ae6090
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
branch HEAD: 90bae4d99beb1f31d8bde7c438a36e8875ae6090  powerpc/xmon: Reapply "Relax frame size for clang"

elapsed time: 2357m

configs tested: 191
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

tested configs:
alpha allnoconfig gcc
alpha allyesconfig gcc
alpha defconfig gcc
alpha randconfig-r011-20230829 gcc
alpha randconfig-r036-20230829 gcc
arc allmodconfig gcc
arc allnoconfig gcc
arc allyesconfig gcc
arc defconfig gcc
arc randconfig-001-20230829 gcc
arc randconfig-r014-20230829 gcc
arc randconfig-r016-20230829 gcc
arm allmodconfig gcc
arm allnoconfig gcc
arm allyesconfig gcc
arm defconfig gcc
arm randconfig-001-20230829 clang
arm randconfig-r016-20230829 clang
arm randconfig-r035-20230829 gcc
arm64 allmodconfig gcc
arm64 allnoconfig gcc
arm64 allyesconfig gcc
arm64 defconfig gcc
arm64 randconfig-r004-20230829 clang
csky allmodconfig gcc
csky allnoconfig gcc
csky allyesconfig gcc
csky defconfig gcc
csky randconfig-r004-20230829 gcc
csky randconfig-r005-20230829 gcc
csky randconfig-r021-20230829 gcc
csky randconfig-r035-20230829 gcc
hexagon randconfig-001-20230829 clang
hexagon randconfig-002-20230829 clang
hexagon randconfig-r002-20230829 clang
i386 allmodconfig gcc
i386 allnoconfig gcc
i386 allyesconfig gcc
i386 buildonly-randconfig-001-20230829 clang
i386 buildonly-randconfig-001-20230830 clang
i386 buildonly-randconfig-002-20230829 clang
i386 buildonly-randconfig-002-20230830 clang
i386 buildonly-randconfig-003-20230829 clang
i386 buildonly-randconfig-003-20230830 clang
i386 buildonly-randconfig-004-20230829 clang
i386 buildonly-randconfig-004-20230830 clang
i386 buildonly-randconfig-005-20230829 clang
i386 buildonly-randconfig-005-20230830 clang
i386 buildonly-randconfig-006-20230829 clang
i386 buildonly-randconfig-006-20230830 clang
i386 debian-10.3 gcc
i386 defconfig gcc
i386 randconfig-001-20230830 clang
i386 randconfig-002-20230830 clang
i386 randconfig-003-20230830 clang
i386 randconfig-004-20230830 clang
i386 randconfig-005-20230830 clang
i386 randconfig-006-20230830 clang
i386 randconfig-011-20230829 gcc
i386 randconfig-012-20230829 gcc
i386 randconfig-013-20230829 gcc
i386 randconfig-014-20230829 gcc
i386 randconfig-015-20230829 gcc
i386 randconfig-016-20230829 gcc
i386 randconfig-r011-20230829 gcc
i386 randconfig-r034-20230829 clang
loongarch allmodconfig gcc
loongarch allnoconfig gcc
loongarch allyesconfig gcc
loongarch defconfig gcc
loongarch randconfig-001-20230829 gcc
loongarch randconfig-r003-20230829 gcc
loongarch randconfig-r034-20230829 gcc
m68k allmodconfig gcc
m68k allnoconfig gcc
m68k allyesconfig gcc
m68k defconfig gcc
microblaze allmodconfig gcc
microblaze allnoconfig gcc
microblaze allyesconfig gcc
microblaze defconfig gcc
microblaze randconfig-r013-20230829 gcc
microblaze randconfig-r015-20230829 gcc
microblaze randconfig-r024-20230829 gcc
microblaze randconfig-r026-20230829
[powerpc:merge] BUILD SUCCESS 54ab446adbe66e7b3bf489afbba93ce402d67884
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: 54ab446adbe66e7b3bf489afbba93ce402d67884  powerpc/ci: Enable DEBUG_ATOMIC_SLEEP for pmac32

elapsed time: 2347m

configs tested: 178
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

tested configs:
alpha allnoconfig gcc
alpha allyesconfig gcc
alpha defconfig gcc
alpha randconfig-r002-20230829 gcc
arc allmodconfig gcc
arc allnoconfig gcc
arc allyesconfig gcc
arc defconfig gcc
arc randconfig-001-20230829 gcc
arc randconfig-r003-20230829 gcc
arc randconfig-r014-20230829 gcc
arm allmodconfig gcc
arm allnoconfig gcc
arm allyesconfig gcc
arm defconfig gcc
arm multi_v7_defconfig gcc
arm randconfig-001-20230829 clang
arm shmobile_defconfig gcc
arm64 allmodconfig gcc
arm64 allnoconfig gcc
arm64 allyesconfig gcc
arm64 defconfig gcc
arm64 randconfig-r013-20230829 gcc
csky allmodconfig gcc
csky allnoconfig gcc
csky allyesconfig gcc
csky defconfig gcc
csky randconfig-r001-20230829 gcc
csky randconfig-r011-20230829 gcc
hexagon randconfig-001-20230829 clang
hexagon randconfig-002-20230829 clang
hexagon randconfig-r021-20230829 clang
hexagon randconfig-r022-20230829 clang
hexagon randconfig-r023-20230829 clang
i386 allmodconfig gcc
i386 allnoconfig gcc
i386 allyesconfig gcc
i386 buildonly-randconfig-001-20230830 clang
i386 buildonly-randconfig-002-20230830 clang
i386 buildonly-randconfig-003-20230830 clang
i386 buildonly-randconfig-004-20230830 clang
i386 buildonly-randconfig-005-20230830 clang
i386 buildonly-randconfig-006-20230830 clang
i386 debian-10.3 gcc
i386 defconfig gcc
i386 randconfig-001-20230829 clang
i386 randconfig-002-20230829 clang
i386 randconfig-003-20230829 clang
i386 randconfig-004-20230829 clang
i386 randconfig-005-20230829 clang
i386 randconfig-006-20230829 clang
i386 randconfig-011-20230829 gcc
i386 randconfig-012-20230829 gcc
i386 randconfig-013-20230829 gcc
i386 randconfig-014-20230829 gcc
i386 randconfig-015-20230829 gcc
i386 randconfig-016-20230829 gcc
loongarch allmodconfig gcc
loongarch allnoconfig gcc
loongarch allyesconfig gcc
loongarch defconfig gcc
loongarch randconfig-001-20230829 gcc
m68k allmodconfig gcc
m68k allnoconfig gcc
m68k allyesconfig gcc
m68k defconfig gcc
m68k m5208evb_defconfig gcc
microblaze allmodconfig gcc
microblaze allnoconfig gcc
microblaze allyesconfig gcc
microblaze defconfig gcc
mips allmodconfig gcc
mips allnoconfig gcc
mips allyesconfig gcc
mips ar7_defconfig gcc
mips vocore2_defconfig gcc
nios2 allmodconfig gcc
nios2 allnoconfig gcc
nios2 allyesconfig gcc
nios2 defconfig gcc
nios2 randconfig-r024-20230829 gcc
openrisc allmodconfig gcc
openrisc allnoconfig gcc
openrisc allyesconfig gcc
openrisc defconfig gcc
openrisc randconfig
Re: [PATCH v2 1/2] maple_tree: Disable mas_wr_append() when other readers are possible
Andreas Schwab writes:
> This breaks booting on ppc32:

Does enabling CONFIG_DEBUG_ATOMIC_SLEEP fix the crash? It did for me on
qemu.

cheers

> Kernel attempted to write user page (1ff0) - exploit attempt? (uid: 0)
> BUG: Unable to handle kernel data access on write at 0x1ff0
> Faulting instruction address: 0xc0009554
> Vector: 300 (Data Access) at [c0b09d10]
>     pc: c0009554: do_softirq_own_stack+0x18/0x30
>     lr: c004f480: __irq_exit_rcu+0x70/0xc0
>     sp: c0b09dd0
>    msr: 1032
>    dar: 1ff0
>  dsisr: 4200
>   current = 0xc0a08360
>     pid   = 0, comm = swapper
> Linux version 6.5.0 ...
> enter ? for help
> [c0b09de0] c00ff480 __irq_exit_rcu+0x70/0xc0
> [c0b09df0] c0005a98 Decrementer_virt+0x108/0x10c
> --- Exception: 900 (Decrementer) at c06cfa0c __schedule+0x4fc/0x510
> [c0b09ec0] c06cf75c __schedule+0x1cc/0x510 (unreliable)
> [c0b09ef0] c06cfc90 __cond_resched+0x2c/0x54
> [c0b09f00] c06d07f8 mutex_lock_killable+0x18/0x5c
> [c0b09f10] c013c404 pcpu_alloc+0x110/0x4dc
> [c0b09f70] c000cc34 alloc_descr.isra.18+0x48/0x144
> [c0b09f90] c0988aa0 early_irq_init+0x64/0x8c
> [c0b09fa0] c097a5a4 start_kernel+0x5b4/0x7b0
> [c0b09ff0] 3dc0
> mon>
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
> "And now for something completely different."
Re: [PATCH RFC 1/2] powerpc/pseries: papr-vpd char driver for VPD retrieval
Michal Suchánek writes:
> Hello,
>
> thanks for working on this.
>
> On Tue, Aug 22, 2023 at 04:33:39PM -0500, Nathan Lynch via B4 Relay wrote:
>> From: Nathan Lynch
>>
>> PowerVM LPARs may retrieve Vital Product Data (VPD) for system
>> components using the ibm,get-vpd RTAS function.
>>
>> We can expose this to user space with a /dev/papr-vpd character
>> device, where the programming model is:
>>
>>   struct papr_location_code plc = { .str = "", }; /* obtain all VPD */
>>   int devfd = open("/dev/papr-vpd", O_WRONLY);
>>   int vpdfd = ioctl(devfd, PAPR_VPD_CREATE_HANDLE, &plc);
>>   size_t size = lseek(vpdfd, 0, SEEK_END);
>>   char *buf = malloc(size);
>>   pread(devfd, buf, size, 0);
>>
>> When a file descriptor is obtained from ioctl(PAPR_VPD_CREATE_HANDLE),
>> the file contains the result of a complete ibm,get-vpd sequence. The
>
> Could this be somewhat less obfuscated?
>
> What the caller wants is the result of "ibm,get-vpd", which is a
> well-known string identifier of the rtas call.

Not really. What the caller wants is *the VPD*.

Currently that's done by calling the RTAS "ibm,get-vpd" function, but
that could change in future. There are RTAS calls that have been replaced
with a "version 2" in the past, and that could happen here too. Or the
RTAS call could be replaced by a hypercall (though unlikely).

But hopefully if the underlying mechanism changed the kernel would be
able to hide that detail behind this new API, and users would not need
to change at all.

> Yet this identifier is never passed in. Instead we have this new
> PAPR_VPD_CREATE_HANDLE. This is a completely new identifier, specific to
> this call only as is the /dev/papr-vpd device name, another new
> identifier.
>
> Maybe the interface could provide a way to specify the service name?
>
>> file contents are immutable from the POV of user space. To get a new
>> view of VPD, clients must create a new handle.
>
> Which is basically the same as creating a file descriptor with open().

Sort of. But much cleaner because you don't need to create a file in the
filesystem and tell userspace how to find it.

This pattern of creating file descriptors from existing file descriptors
to model a hierarchy of objects is well established in e.g. the KVM and
DRM APIs.

> Maybe creating a directory in sysfs or procfs with filenames
> corresponding to rtas services would do the same job without extra
> obfuscation?

It's not obfuscation, it's abstraction. The kernel talks to firmware to
do things, and provides an API to user space. Not all the details of how
the firmware works are relevant to user space, including the exact
firmware calls required to implement a certain feature.

>> This design choice insulates user space from most of the complexities
>> that ibm,get-vpd brings:
>>
>> * ibm,get-vpd must be called more than once to obtain complete
>>   results.
>> * Only one ibm,get-vpd call sequence should be in progress at a time;
>>   concurrent sequences will disrupt each other. Callers must have a
>>   protocol for serializing their use of the function.
>> * A call sequence in progress may receive a "VPD changed, try again"
>>   status, requiring the client to start over. (The driver does not yet
>>   handle this, but it should be easy to add.)
>
> That certainly reduces the complexity of the user interface making it
> much saner.

Yes :)

>> The memory required for the VPD buffers seems acceptable, around 20KB
>> for all VPD on one of my systems. And the value of the
>> /rtas/ibm,vpd-size DT property (the estimated maximum size of VPD) is
>> consistently 300KB across various systems I've checked.
>>
>> I've implemented support for this new ABI in the rtas_get_vpd()
>> function in librtas, which the vpdupdate command currently uses to
>> populate its VPD database. I've verified that an unmodified vpdupdate
>> binary generates an identical database when using a librtas.so that
>> prefers the new ABI.
>>
>> Likely remaining work:
>>
>> * Handle RTAS call status -4 (VPD changed) during ibm,get-vpd call
>>   sequence.
>> * Prevent ibm,get-vpd calls via rtas(2) from disrupting ibm,get-vpd
>>   call sequences in this driver.
>> * (Maybe) implement a poll method for delivering notifications of
>>   potential changes to VPD, e.g. after a partition migration.
>
> That sounds like something for netlink. If that is desired maybe it
> should be used in the first place?

I don't see why that is related to netlink. It's entirely normal for
file descriptor based APIs to implement poll.

netlink adds a lot of complexity for zero gain IMO.

cheers
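The multi-call sequence and the "VPD changed, try again" status discussed in this thread can be sketched in user-space C. This is only an illustration against a mock firmware: `mock_get_vpd()`, `struct mock_fw`, `vpd_read_all()`, the continuation token, the chunk size, and every status value except -4 are assumptions made for the example, not the RTAS interface or the proposed driver.

```c
#include <stddef.h>
#include <string.h>

/* Status codes for this sketch. Only "-4 means VPD changed" comes from
 * the thread; the other two values are made up. */
enum { VPD_DONE = 0, VPD_MORE = 1, VPD_CHANGED = -4 };
enum { CHUNK_MAX = 64 };

/* Mock firmware: hands out `total` bytes in chunks and reports
 * VPD_CHANGED on call number `change_at` (0 = never). */
struct mock_fw {
    const char *data;
    size_t total;
    size_t chunk;
    int calls;
    int change_at;
};

/* One step of the call sequence. *seq is a hypothetical continuation
 * token, here simply a byte offset that starts at 0. */
int mock_get_vpd(struct mock_fw *fw, size_t *seq, char *out, size_t *len)
{
    size_t left, n;

    fw->calls++;
    if (fw->change_at && fw->calls == fw->change_at)
        return VPD_CHANGED;

    left = fw->total - *seq;
    n = left < fw->chunk ? left : fw->chunk;
    if (n > CHUNK_MAX)
        n = CHUNK_MAX;
    memcpy(out, fw->data + *seq, n);
    *len = n;
    *seq += n;
    return *seq < fw->total ? VPD_MORE : VPD_DONE;
}

/* Run a complete sequence into buf, restarting from scratch whenever the
 * mock reports a change. Returns the number of bytes stored, or -1 if
 * buf is too small or the retry budget runs out. */
long vpd_read_all(struct mock_fw *fw, char *buf, size_t cap)
{
    int attempt;

    for (attempt = 0; attempt < 8; attempt++) {
        size_t seq = 0, used = 0;

        for (;;) {
            char chunk[CHUNK_MAX];
            size_t len = 0;
            int rc = mock_get_vpd(fw, &seq, chunk, &len);

            if (rc == VPD_CHANGED)
                break;          /* drop partial data, start over */
            if (used + len > cap)
                return -1;
            memcpy(buf + used, chunk, len);
            used += len;
            if (rc == VPD_DONE)
                return (long)used;
        }
    }
    return -1;
}
```

A caller only ever sees a complete, consistent buffer or an error, which is the property the thread attributes to the handle-based design. Serializing concurrent sequences is not modelled here.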
Re: KASAN debug kernel fails to boot at early stage when CONFIG_SMP=y is set (kernel 6.5-rc5, PowerMac G4 3,6)
On 28/08/2023 at 01:17, Erhard Furtner wrote:
> On Thu, 24 Aug 2023 21:36:26 +1000
> Michael Ellerman wrote:
>
>>> printk: bootconsole [udbg0] enabled
>>> Total memory = 2048MB; using 4096kB for hash table
>>> mapin_ram:125
>>> mmu_mapin_ram:169 0 3000 140 200
>>> __mmu_mapin_ram:146 0 140
>>> __mmu_mapin_ram:155 140
>>> __mmu_mapin_ram:146 140 3000
>>> __mmu_mapin_ram:155 2000
>>> __mapin_ram_chunk:107 2000 3000
>>> __mapin_ram_chunk:117
>>> mapin_ram:134
>>> kasan_mmu_init:129
>>> kasan_mmu_init:132 0
>>> kasan_mmu_init:137
>>> ioremap() called early from btext_map+0x64/0xdc. Use early_ioremap() instead
>>> Linux version 6.5.0-rc7-PMacG4-dirty (root@T1000) (gcc (Gentoo
>>> 12.3.1_p20230526 p2) 12.3.1 20230526, GNU ld (Gentoo 2.40 p7) 2.40.0) #4
>>> SMP Wed Aug 23 12:59:11 CEST 2023
>>>
>>> which shows one line (Linux version...) more than before. Most of the time
>>> I get this more interesting output however:
>>>
>>> kasan_mmu_init:129
>>> kasan_mmu_init:132 0
>>> kasan_mmu_init:137
>>> Linux version 6.5.0-rc7-PMacG4-dirty (root@T1000) (gcc (Gentoo
>>> 12.3.1_p20230526 p2) 12.3.1 20230526, GNU ld (Gentoo 2.40 p7) 2.40.0) #4
>>> SMP Wed Aug 23 12:59:11 CEST 2023
>>> KASAN init done
>>> list_add corruption. prev->next should be next (c17100c0), but was
>>> 2c03. (prev=c036ac7c).
>>> ------------[ cut here ]------------
>>> kernel BUG at lib/list_debug.c:30!
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 0 PID: 0 at arch/powerpc/include/asm/machdep.h:227
>>> die+0xd8/0x39c
>>
>> This is a WARN hit while handling the original bug.
>>
>> Can you apply this patch to avoid that happening, so we can see the
>> original bug better.
>>
>> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
>> index eeff136b83d9..341a0635e131 100644
>> --- a/arch/powerpc/kernel/traps.c
>> +++ b/arch/powerpc/kernel/traps.c
>> @@ -198,8 +198,6 @@ static unsigned long oops_begin(struct pt_regs *regs)
>>  	die_owner = cpu;
>>  	console_verbose();
>>  	bust_spinlocks(1);
>> -	if (machine_is(powermac))
>> -		pmac_backlight_unblank();
>>  	return flags;
>>  }
>>  NOKPROBE_SYMBOL(oops_begin);
>>
>>
>> cheers
>
> Ok, so I tested now:
> Replace btext_unmap() with btext_map() at the end of MMU_init() +
> Michael's patch.
>
> With the patch I get interesting output less often, but when I do it's:
>
> printk: bootconsole [udbg0] enabled
> Total memory = 2048MB; using 4096kB for hash table
> mapin_ram:125
> mmu_mapin_ram:169 0 3000 140 200
> __mmu_mapin_ram:146 0 140
> __mmu_mapin_ram:155 140
> __mmu_mapin_ram:146 140 3000
> __mmu_mapin_ram:155 2000
> __mapin_ram_chunk:107 2000 3000
> __mapin_ram_chunk:117
> mapin_ram:134
> kasan_mmu_init:129
> kasan_mmu_init:132 0
> kasan_mmu_init:137
> Linux version 6.5.0-rc7-PMacG4-dirty (root@T1000) (gcc (Gentoo
> 12.3.1_p20230526 p2) 12.3.1 20230526, GNU ld (Gentoo 2.40 p7) 2.40.0) #4 SMP
> Wed Aug 23 12:59:11 CEST 2023
> KASAN init done
> BUG: spinlock bad magic on CPU#0, swapper/0
> lock: 0xc16cbc60, .magic: c036ab84, .owner: /-1, .owner_cpu: -1
> CPU: 0 PID: 0 Comm: swapper Tainted: GT xxx
> Call Trace:
> [c1717c20] [c0f4e288] dump_stack_lvl+0x60/0xa4 (unreliable)
> [c1717c40] [c01065e8] do_raw_spin_lock+0x15c/0x1a8
> [c1717c70] [c0fa3890] _raw_spin_lock_irqsave+0x20/0x40
> [c1717c90] [c0c140ec] of_find_property+0x3c/0x140
> [c1717cc0] [c0c14204] of_get_property+0x14/0x4c
> [c1717ce0] [c0c22c6c] unflatten_dt_nodes+0x76c/0x894
> [c1717f10] [c0c22e88] __unflatten_device_tree+0xf4/0x244
> [c1717f50] [c1458050] unflatten_device_tree+0x48/0x84
> [c1717f70] [c140b100] setup_arch+0x78/0x44c
> [c1717fc0] [c14045b8] start_kernel+0x78/0x2d8
> [c1717ff0] [35d0] 0x35d0

Ok so there is some corrupted memory somewhere.

Can you try what happens when you remove the call to kasan_init() at the
start of setup_arch() in arch/powerpc/kernel/setup-common.c?

I'd also be curious to know what happens when CONFIG_DEBUG_SPINLOCK is
disabled.

Another question which I'm not sure I asked already: Is it a new problem
you have got with recent kernels or is it just that you never tried such
a config with older kernels?

Also, when you say you need to start with another SMP kernel first and
then you don't have the problem anymore until the next cold reboot, do
you mean you have some old kernel with KASAN that works, or is it a
kernel without KASAN that you have to start first?

Thanks
Christophe

>
>
> and then the freeze. Or less often I get:
>
> [...]
> Modules linked in: _various ASCII chars_ |(EK) _various ASCII chars_ §=(EKTN)
> BUG: Unable to handle kernel data access on read at 0x813f0200
> Faulting instruction address: 0xc014e444
> Thread overran stack, or stack corrupted
> Oops: Kernel access of bad area, sig: 11 [#3544]
> BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2
> Modules linked in: _various ASCII chars_ §=(EKTN)
> BUG:
Re: (subset) [PATCH v2 4/4] powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT
On Wed, 23 Aug 2023 15:53:17 +1000, Michael Ellerman wrote:
> lppaca_shared_proc() takes a pointer to the lppaca which is typically
> accessed through get_lppaca(). With DEBUG_PREEMPT enabled, this leads
> to checking if preemption is enabled, for example:
>
>   BUG: using smp_processor_id() in preemptible [] code: grep/10693
>   caller is lparcfg_data+0x408/0x19a0
>   CPU: 4 PID: 10693 Comm: grep Not tainted 6.5.0-rc3 #2
>   Call Trace:
>     dump_stack_lvl+0x154/0x200 (unreliable)
>     check_preemption_disabled+0x214/0x220
>     lparcfg_data+0x408/0x19a0
>   ...
>
> [...]

Applied to powerpc/next.

[4/4] powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT
      https://git.kernel.org/powerpc/c/eac030b22ea12cdfcbb2e941c21c03964403c63f

cheers
Re: [PATCH] powerpc/powermac: Fix unused function warning
On Tue, 22 Aug 2023 00:09:49 +1000, Michael Ellerman wrote:
> Clang reports:
>   arch/powerpc/platforms/powermac/feature.c:137:19: error: unused function
>   'simple_feature_tweak'
>
> It's only used inside the #ifndef CONFIG_PPC64 block, so move it in
> there to fix the warning. While at it drop the inline, the compiler will
> decide whether it should be inlined or not.
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/powermac: Fix unused function warning
      https://git.kernel.org/powerpc/c/1eafbd8764b10798934344bd40395b27cec63145

cheers
Re: [PATCH] powerpc/eeh: Use pci_dev_id() to simplify the code
On Tue, 15 Aug 2023 10:33:03 +0800, Jialin Zhang wrote:
> PCI core API pci_dev_id() can be used to get the BDF number for a pci
> device. We don't need to compose it manually. Use pci_dev_id() to
> simplify the code a little bit.
>
>

Applied to powerpc/next.

[1/1] powerpc/eeh: Use pci_dev_id() to simplify the code
      https://git.kernel.org/powerpc/c/664ec38673bef18bbcc4ede02274c8977470823f

cheers
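For readers unfamiliar with the BDF encoding the patch description refers to, the manual composition can be sketched as below. This is a user-space illustration of the usual layout (bus number in the high byte, device and function packed into the low byte); `bdf_devfn()` and `bdf_compose()` are hypothetical names for this example, not kernel helpers.

```c
#include <stdint.h>

/* devfn packs the 5-bit device (slot) number and the 3-bit function
 * number into one byte. */
uint8_t bdf_devfn(unsigned int slot, unsigned int func)
{
    return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}

/* 16-bit BDF: bus number in the high byte, devfn in the low byte. */
uint16_t bdf_compose(uint8_t bus, uint8_t devfn)
{
    return (uint16_t)(((uint16_t)bus << 8) | devfn);
}
```

Having one helper do the shift and OR means callers cannot get the bit positions wrong, which is the kind of simplification the patch describes.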
Re: [PATCH] powerpc/mpc5xxx: Add missing fwnode_handle_put()
On Wed, 22 Mar 2023 11:04:23 +0800, Liang He wrote:
> In mpc5xxx_fwnode_get_bus_frequency(), we should add
> fwnode_handle_put() when break out of the iteration
> fwnode_for_each_parent_node() as it will automatically
> increase and decrease the refcounter.
>
>

Applied to powerpc/next.

[1/1] powerpc/mpc5xxx: Add missing fwnode_handle_put()
      https://git.kernel.org/powerpc/c/b9bbbf4979073d5536b7650decd37fcb901e6556

cheers
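The reference-count issue described above can be illustrated with a small mock. This is a sketch under assumptions: `struct node`, `node_get()`, `node_put()`, `next_parent()` and `find_freq()` are invented for the example and only imitate the general shape of a refcounting iterator; they are not the fwnode API, and unlike the real macro this loop starts at the given node rather than its parent.

```c
#include <stddef.h>

struct node {
    struct node *parent;
    int refs;       /* reference count */
    int has_freq;   /* stand-in for "this node carries the property" */
};

struct node *node_get(struct node *n)
{
    if (n)
        n->refs++;
    return n;
}

void node_put(struct node *n)
{
    if (n)
        n->refs--;
}

/* Iterator step: take a reference on the parent, drop the one held on
 * the current node. A loop that runs to the end leaks nothing. */
struct node *next_parent(struct node *cur)
{
    struct node *p = node_get(cur->parent);

    node_put(cur);
    return p;
}

/* Walk towards the root looking for the property. When the loop is left
 * early, the reference on the current node is still held, so it must be
 * dropped explicitly (put_on_exit != 0) or it leaks. */
int find_freq(struct node *start, int put_on_exit)
{
    struct node *n;

    for (n = node_get(start); n; n = next_parent(n)) {
        if (n->has_freq) {
            if (put_on_exit)
                node_put(n);
            return 1;
        }
    }
    return 0;
}
```

With `put_on_exit` set to 0 the node where the search stopped keeps an extra reference after the function returns, which mirrors the leak the patch description reports.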
Re: [PATCH] powerpc/64s: Move CPU -mtune options into Kconfig
On Thu, 30 Mar 2023 10:43:08 +1100, Michael Ellerman wrote:
> Currently the -mtune options are set in the Makefile, depending on what
> the compiler supports.
>
> One downside of doing it that way is that the chosen -mtune option is
> not recorded in the .config.
>
> Another downside is that if there's ever a need to do more complicated
> logic to calculate the correct option, that gets messy in the Makefile.
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/64s: Move CPU -mtune options into Kconfig
      https://git.kernel.org/powerpc/c/50832720ec54c39ab189cd5e057aec1c514978ce

cheers
Re: [PATCH] powerpc/iommu: Fix notifiers being shared by PCI and VIO buses
On Wed, 22 Mar 2023 14:53:22 +1100, Russell Currey wrote:
> fail_iommu_setup() registers the fail_iommu_bus_notifier struct to both
> PCI and VIO buses. struct notifier_block is a linked list node, so this
> causes any notifiers later registered to either bus type to also be
> registered to the other since they share the same node.
>
> This causes issues in (at least) the vgaarb code, which registers a
> notifier for PCI buses. pci_notify() ends up being called on a vio
> device, converted with to_pci_dev() even though it's not a PCI device,
> and finally makes a bad access in vga_arbiter_add_pci_device() as
> discovered with KASAN:
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/iommu: Fix notifiers being shared by PCI and VIO buses
      https://git.kernel.org/powerpc/c/c37b6908f7b2bd24dcaaf14a180e28c9132b9c58

cheers
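Why a single list node cannot safely sit on two chains can be shown with a simplified user-space model. The struct below borrows the name `notifier_block` for readability, but `chain_register()` and `chain_contains()` are stand-ins written for this example (no locking, no duplicate detection), not the kernel's notifier implementation.

```c
#include <stddef.h>

struct notifier_block {
    struct notifier_block *next;
    int priority;
};

/* Simplified priority-ordered insert into a singly linked chain. */
void chain_register(struct notifier_block **head, struct notifier_block *n)
{
    while (*head && n->priority <= (*head)->priority)
        head = &(*head)->next;
    n->next = *head;
    *head = n;
}

/* Returns 1 if node n is reachable from head, 0 otherwise. */
int chain_contains(const struct notifier_block *head,
                   const struct notifier_block *n)
{
    for (; head; head = head->next)
        if (head == n)
            return 1;
    return 0;
}
```

If the same node is registered on a "PCI" chain and a "VIO" chain, a later registration on the PCI chain is appended behind the shared node's `next` pointer and therefore becomes reachable from the VIO chain as well. Using a separate node per chain avoids this in the model; whether the applied patch does exactly that is not shown in the excerpt above.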
Re: [PATCH 1/2] powerpc/pseries: Fix hcall tracepoints with JUMP_LABEL=n
On Tue, 09 May 2023 19:15:59 +1000, Nicholas Piggin wrote:
> With JUMP_LABEL=n, hcall_tracepoint_refcount's address is being tested
> instead of its value. This results in the tracing slowpath always being
> taken unnecessarily.
>
>

Applied to powerpc/next.

[1/2] powerpc/pseries: Fix hcall tracepoints with JUMP_LABEL=n
      https://git.kernel.org/powerpc/c/750bd41aeaeb1f0e0128aa4f8fcd6dd759713641
[2/2] powerpc/pseries: Remove unused hcall tracing instruction
      https://git.kernel.org/powerpc/c/61d7ebe0376e2640ba77be16e186b1a6c77eb3f7

cheers
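The address-versus-value mix-up can be shown with a C analogue. The actual bug is in the hcall entry path rather than in C, so treat this purely as an illustration; `demo_refcount` and both functions are hypothetical names for this example.

```c
#include <stdint.h>

/* Hypothetical stand-in for a tracepoint reference count. */
long demo_refcount;

/* Buggy shape: tests where the counter lives. The address of an object
 * is never zero, so this always selects the slow path. */
int slowpath_by_address(void)
{
    uintptr_t addr = (uintptr_t)&demo_refcount;

    return addr != 0;
}

/* Intended shape: tests whether anyone has enabled the tracepoint. */
int slowpath_by_value(void)
{
    return demo_refcount != 0;
}
```

With the counter at zero, the address test still reports "take the slow path" while the value test does not, which matches the symptom described in the patch.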
Re: [PATCH v2] reapply: powerpc/xmon: Relax frame size for clang
On Mon, 28 Aug 2023 13:39:06 -0700, Nick Desaulniers wrote:
> This is a manual revert of commit
> 7f3c5d099b6f8452dc4dcfe4179ea48e6a13d0eb, but using
> ccflags-$(CONFIG_CC_IS_CLANG) which is shorter.
>
> Turns out that this is reproducible still under specific compiler
> versions (mea culpa: I did not test every supported version of clang),
> and even a few randconfigs bots found.
>
> [...]

Applied to powerpc/next.

[1/1] reapply: powerpc/xmon: Relax frame size for clang
      https://git.kernel.org/powerpc/c/90bae4d99beb1f31d8bde7c438a36e8875ae6090

cheers
Re: [PATCH] powerpc: Drop zalloc_maybe_bootmem()
On Wed, 23 Aug 2023 15:54:30 +1000, Michael Ellerman wrote:
> The only callers of zalloc_maybe_bootmem() are PCI setup routines. These
> used to be called early during boot before slab setup, and also during
> runtime due to hotplug.
>
> But commit 5537fcb319d0 ("powerpc/pci: Add ppc_md.discover_phbs()")
> moved the boot-time calls later, after slab setup, meaning there's no
> longer any need for zalloc_maybe_bootmem(), kzalloc() can be used in all
> cases.
>
> [...]

Applied to powerpc/next.

[1/1] powerpc: Drop zalloc_maybe_bootmem()
      https://git.kernel.org/powerpc/c/fabdb27da78afb93b0a83c0579025cb8d05c0d2d

cheers
Re: [PATCH] cxl: Drop unused detach_spa()
On Wed, 23 Aug 2023 14:48:03 +1000, Michael Ellerman wrote:
> Clang warns:
>   drivers/misc/cxl/native.c:272:20: error: unused function 'detach_spa'
>   [-Werror,-Wunused-function]
>
> It was created as part of some refactoring in commit 05155772f642 ("cxl:
> Allocate and release the SPA with the AFU"), but has never been called
> in its current form. Drop it.
>
> [...]

Applied to powerpc/next.

[1/1] cxl: Drop unused detach_spa()
      https://git.kernel.org/powerpc/c/fe32945203ffc8d6fed815f7ed7729219f8b0ab6

cheers
Re: [PATCH 1/4] powerpc/pseries: Move VPHN constants into vphn.h
On Wed, 23 Aug 2023 15:53:14 +1000, Michael Ellerman wrote:
> These don't have any particularly good reason to belong in lppaca.h,
> move them into their own header.
>
>

Applied to powerpc/next.

[1/4] powerpc/pseries: Move VPHN constants into vphn.h
      https://git.kernel.org/powerpc/c/c040c7488b6a89c98dd0f6dd5f001101413779e2
[2/4] powerpc/pseries: Move hcall_vphn() prototype into vphn.h
      https://git.kernel.org/powerpc/c/9a6c05fe9a998386a61b5e70ce07d31ec47a01a0
[3/4] powerpc: Don't include lppaca.h in paca.h
      https://git.kernel.org/powerpc/c/1aa000667669fa855853decbb1c69e974d8ff716

cheers
Re: [PATCH 1/2] powerpc/powernv: Fix fortify source warnings in opal-prd.c
On Tue, 22 Aug 2023 00:28:19 +1000, Michael Ellerman wrote:
> As reported by Mahesh & Aneesh, opal_prd_msg_notifier() triggers a
> FORTIFY_SOURCE warning:
>
>   memcpy: detected field-spanning write (size 32) of single field
>   "&item->msg" at arch/powerpc/platforms/powernv/opal-prd.c:355 (size 4)
>   WARNING: CPU: 9 PID: 660 at arch/powerpc/platforms/powernv/opal-prd.c:355
>   opal_prd_msg_notifier+0x174/0x188 [opal_prd]
>   NIP opal_prd_msg_notifier+0x174/0x188 [opal_prd]
>   LR opal_prd_msg_notifier+0x170/0x188 [opal_prd]
>   Call Trace:
>     opal_prd_msg_notifier+0x170/0x188 [opal_prd] (unreliable)
>     notifier_call_chain+0xc0/0x1b0
>     atomic_notifier_call_chain+0x2c/0x40
>     opal_message_notify+0xf4/0x2c0
>
> [...]

Applied to powerpc/next.

[1/2] powerpc/powernv: Fix fortify source warnings in opal-prd.c
      https://git.kernel.org/powerpc/c/feea65a338e52297b68ceb688eaf0ffc50310a83
[2/2] powerpc/powernv: Use struct opal_prd_msg in more places
      https://git.kernel.org/powerpc/c/22b165617b779418166319a19fd926a9c6feb9a3

cheers
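A field-spanning write happens when a copy names a small fixed-size member as its destination but writes past it into whatever follows. One general way to avoid that shape is to make the destination a flexible array member that is declared to hold the whole message. The sketch below shows that pattern in plain user-space C; `struct item` and `item_from_msg()` are hypothetical and are not taken from opal-prd.c or from the applied patches.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical queue item: bookkeeping fields followed by the complete
 * message (header and payload) in a flexible array member. */
struct item {
    size_t len;
    unsigned char msg[];
};

/* Allocate an item large enough for the whole message and copy it in.
 * The destination named in memcpy() is the member that really holds all
 * `size` bytes, so the copy does not span fields. Returns NULL on
 * allocation failure; the caller frees the result. */
struct item *item_from_msg(const void *src, size_t size)
{
    struct item *it = malloc(sizeof(*it) + size);

    if (!it)
        return NULL;
    it->len = size;
    memcpy(it->msg, src, size);
    return it;
}
```

The 32-byte size from the warning above makes a convenient test input: the item is allocated as `sizeof(struct item) + 32` and the copy stays inside `msg`.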
Re: [PATCH] powerpc: fix debugfs_create_dir error checking
On Sun, 28 May 2023 13:16:44 +0530, mirim...@outlook.com wrote:
> The debugfs_create_dir returns ERR_PTR in case of an error and the
> correct way of checking it is by using the IS_ERR inline function, and
> not the simple null comparison. This patch fixes this.
>
>

Applied to powerpc/next.

[1/1] powerpc: fix debugfs_create_dir error checking
      https://git.kernel.org/powerpc/c/429356fac0440b962aaa6d3688709813a21dd122

cheers
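The reason a NULL comparison misses this kind of failure is that an error pointer is a small negative errno value cast to a pointer, so it is never NULL. The following user-space sketch imitates the idea behind ERR_PTR/IS_ERR; `err_ptr()`, `ptr_err()` and `is_err()` are re-implementations written for this example, assuming the conventional limit of 4095 error codes, and are not the kernel's definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if ptr falls in the top MAX_ERRNO addresses, i.e. encodes an
 * error rather than pointing at a real object. */
int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}
```

A caller that writes `if (!dir)` treats `err_ptr(-ENOMEM)` as success because the value is non-NULL; `if (is_err(dir))` catches it.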
Re: [PATCH] powerpc: dts: add missing space before {
On Wed, 05 Jul 2023 16:57:43 +0200, Krzysztof Kozlowski wrote:
> Add missing whitespace between node name/label and opening {.
>
>

Applied to powerpc/next.

[1/1] powerpc: dts: add missing space before {
      https://git.kernel.org/powerpc/c/11073886cc4a2746845e8d113cadec2578c85033

cheers
Re: [PATCH] powerpc/config: Disable SLAB_DEBUG_ON in skiroot
On Wed, 05 Jul 2023 12:00:56 +0930, Joel Stanley wrote:
> In 5.10 commit 5e84dd547bce ("powerpc/configs/skiroot: Enable some more
> hardening options") set SLUB_DEBUG_ON.
>
> When 5.14 came around, commit 792702911f58 ("slub: force on
> no_hash_pointers when slub_debug is enabled") print all the
> pointers when SLUB_DEBUG_ON is set. This was fine, but in 5.12 commit
> 5ead723a20e0 ("lib/vsprintf: no_hash_pointers prints all addresses as
> unhashed") added the warning at boot.
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/config: Disable SLAB_DEBUG_ON in skiroot
      https://git.kernel.org/powerpc/c/cdebfd27292ecdebe7d493830354e302368b3188

cheers
Re: [PATCH] powerpc/85xx: Mark some functions static and add missing includes to fix no previous prototype error
On Tue, 22 Aug 2023 08:13:13 +0200, Christophe Leroy wrote:
> corenet{32/64}_smp_defconfig leads to:
>
>   CC      arch/powerpc/sysdev/ehv_pic.o
> arch/powerpc/sysdev/ehv_pic.c:45:6: error: no previous prototype for
> 'ehv_pic_unmask_irq' [-Werror=missing-prototypes]
>    45 | void ehv_pic_unmask_irq(struct irq_data *d)
>       | ^~
> arch/powerpc/sysdev/ehv_pic.c:52:6: error: no previous prototype for
> 'ehv_pic_mask_irq' [-Werror=missing-prototypes]
>    52 | void ehv_pic_mask_irq(struct irq_data *d)
>       | ^~~~
> arch/powerpc/sysdev/ehv_pic.c:59:6: error: no previous prototype for
> 'ehv_pic_end_irq' [-Werror=missing-prototypes]
>    59 | void ehv_pic_end_irq(struct irq_data *d)
>       | ^~~
> arch/powerpc/sysdev/ehv_pic.c:66:6: error: no previous prototype for
> 'ehv_pic_direct_end_irq' [-Werror=missing-prototypes]
>    66 | void ehv_pic_direct_end_irq(struct irq_data *d)
>       | ^~
> arch/powerpc/sysdev/ehv_pic.c:71:5: error: no previous prototype for
> 'ehv_pic_set_affinity' [-Werror=missing-prototypes]
>    71 | int ehv_pic_set_affinity(struct irq_data *d, const struct cpumask
> *dest,
>       | ^~~~
> arch/powerpc/sysdev/ehv_pic.c:112:5: error: no previous prototype for
> 'ehv_pic_set_irq_type' [-Werror=missing-prototypes]
>   112 | int ehv_pic_set_irq_type(struct irq_data *d, unsigned int flow_type)
>       | ^~~~
>   CC      arch/powerpc/sysdev/fsl_rio.o
> arch/powerpc/sysdev/fsl_rio.c:102:5: error: no previous prototype for
> 'fsl_rio_mcheck_exception' [-Werror=missing-prototypes]
>   102 | int fsl_rio_mcheck_exception(struct pt_regs *regs)
>       | ^~~~
> arch/powerpc/sysdev/fsl_rio.c:306:5: error: no previous prototype for
> 'fsl_map_inb_mem' [-Werror=missing-prototypes]
>   306 | int fsl_map_inb_mem(struct rio_mport *mport, dma_addr_t lstart,
>       | ^~~
> arch/powerpc/sysdev/fsl_rio.c:357:6: error: no previous prototype for
> 'fsl_unmap_inb_mem' [-Werror=missing-prototypes]
>   357 | void fsl_unmap_inb_mem(struct rio_mport *mport, dma_addr_t lstart)
>       | ^
> arch/powerpc/sysdev/fsl_rio.c:445:5: error: no previous prototype for
> 'fsl_rio_setup' [-Werror=missing-prototypes]
>   445 | int fsl_rio_setup(struct platform_device *dev)
>       | ^
>   CC      arch/powerpc/sysdev/fsl_rmu.o
> arch/powerpc/sysdev/fsl_rmu.c:362:6: error: no previous prototype for
> 'msg_unit_error_handler' [-Werror=missing-prototypes]
>   362 | void msg_unit_error_handler(void)
>       | ^~
>   CC      arch/powerpc/platforms/85xx/corenet_generic.o
> arch/powerpc/platforms/85xx/corenet_generic.c:33:13: error: no previous
> prototype for 'corenet_gen_pic_init' [-Werror=missing-prototypes]
>    33 | void __init corenet_gen_pic_init(void)
>       | ^~~~
> arch/powerpc/platforms/85xx/corenet_generic.c:51:13: error: no previous
> prototype for 'corenet_gen_setup_arch' [-Werror=missing-prototypes]
>    51 | void __init corenet_gen_setup_arch(void)
>       | ^~
> arch/powerpc/platforms/85xx/corenet_generic.c:104:12: error: no previous
> prototype for 'corenet_gen_publish_devices' [-Werror=missing-prototypes]
>   104 | int __init corenet_gen_publish_devices(void)
>       |^~~
>   CC      arch/powerpc/platforms/85xx/qemu_e500.o
> arch/powerpc/platforms/85xx/qemu_e500.c:28:13: error: no previous prototype
> for 'qemu_e500_pic_init' [-Werror=missing-prototypes]
>    28 | void __init qemu_e500_pic_init(void)
>       | ^~
>   CC      arch/powerpc/kernel/pmc.o
> arch/powerpc/kernel/pmc.c:78:6: error: no previous prototype for
> 'power4_enable_pmcs' [-Werror=missing-prototypes]
>    78 | void power4_enable_pmcs(void)
>       | ^~
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/85xx: Mark some functions static and add missing includes to fix no previous prototype error
      https://git.kernel.org/powerpc/c/c265735ff5b1f13272e2bfb196f5c55f9b3c9bac

cheers
Re: [PATCH] powerpc/64e: Fix circular dependency with CONFIG_SMP disabled
On Tue, 22 Aug 2023 08:07:50 +0200, Christophe Leroy wrote:
> asm/percpu.h includes asm/paca.h which needs struct tlb_core_data
> which is defined in mmu-e500.h
>
> asm/percpu.h is included from asm/mmu.h in a #ifdef CONFIG_E500
> before the inclusion of mmu-e500.h
>
> To fix that, move the inclusion of asm/percpu.h into mmu-e500.h
> after the definition of struct tlb_core_data
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/64e: Fix circular dependency with CONFIG_SMP disabled
      https://git.kernel.org/powerpc/c/0e2a34c467a0de2b0309d033e2700ce608e3fbf4

cheers
Re: [PATCH 1/2] powerpc/mm/book3s64: Fix build error with SPARSEMEM disabled
On Mon, 28 Aug 2023 13:16:57 +0530, Aneesh Kumar K.V wrote:
> With CONFIG_SPARSEMEM disabled the below kernel build error is observed.
>
>  arch/powerpc/mm/init_64.c:477:38: error: use of undeclared identifier
>  'SECTION_SIZE_BITS'
>
> CONFIG_MEMORY_HOTPLUG depends on CONFIG_SPARSEMEM and it is more clear
> to describe the code dependency in terms of MEMORY_HOTPLUG. Outside
> memory hotplug the kernel uses memory_block_size for kernel directmap.
> Instead of depending on SECTION_SIZE_BITS to compute the direct map
> page size, add a new #define which defaults to 16M(same as existing
> SECTION_SIZE)
>
> [...]

Applied to powerpc/next.

[1/2] powerpc/mm/book3s64: Fix build error with SPARSEMEM disabled
      https://git.kernel.org/powerpc/c/f1424755db913c5971686537381588261cdfd1ee
[2/2] powerpc/mm/book3s64: Use 256M as the upper limit with coherent device memory attached
      https://git.kernel.org/powerpc/c/4c33bf147249ebbf3dded016996a8a24c5737254

cheers
Re: [PATCH v2 0/2] kbuild: Show Kconfig fragments in "help"
On Tue, Aug 29, 2023 at 11:57:19PM +0900, Masahiro Yamada wrote:
> The attached patch will work too.
>
> I dropped the "in the first line" restriction
> because SPDX might be placed in the first line
> of config fragments.

Good call. Yes, this looks excellent; thank you! Do you want to send a
formal patch? Please consider it:

Reviewed-by: Kees Cook

--
Kees Cook
Re: Recent Power changes and stack_trace_save_tsk_reliable?
On 8/30/23 02:37, Michael Ellerman wrote: > Michael Ellerman writes: >> Joe Lawrence writes: >>> Hi ppc-dev list, >>> >>> We noticed that our kpatch integration tests started failing on ppc64le >>> when targeting the upstream v6.4 kernel, and then confirmed that the >>> in-tree livepatching kselftests similarly fail, too. From the kselftest >>> results, it appears that livepatch transitions are no longer completing. >> >> Hi Joe, >> >> Thanks for the report. >> >> I thought I was running the livepatch tests, but looks like somewhere >> along the line my kernel .config lost CONFIG_TEST_LIVEPATCH=m, so I have >> been running the test but it just skips. :/ >> That config option is easy to drop if you use `make localmodconfig` to try and expedite the builds :D Been there, done that too many times. >> I can reproduce the failure, and will see if I can bisect it more >> successfully. > > It's caused by: > > eed7c420aac7 ("powerpc: copy_thread differentiate kthreads and user mode > threads") > > Which is obvious in hindsight :) > > The diff below fixes it for me, can you test that on your setup? > Thanks for the fast triage of this one. The proposed fix works well on our setup. I have yet to try the kpatch integration tests with this, but I can verify that all of the kernel livepatching kselftests now happily run. -- Joe > A proper fix will need to be a bit bigger because the comments in there > are all slightly wrong now since the above commit. > > Possibly we can also rework that code more substantially now that > copy_thread() is more careful about setting things up, but that would be > a follow-up. 
> > diff --git a/arch/powerpc/kernel/stacktrace.c > b/arch/powerpc/kernel/stacktrace.c > index 5de8597eaab8..d0b3509f13ee 100644 > --- a/arch/powerpc/kernel/stacktrace.c > +++ b/arch/powerpc/kernel/stacktrace.c > @@ -73,7 +73,7 @@ int __no_sanitize_address > arch_stack_walk_reliable(stack_trace_consume_fn consum > bool firstframe; > > stack_end = stack_page + THREAD_SIZE; > - if (!is_idle_task(task)) { > + if (!(task->flags & PF_KTHREAD)) { > /* >* For user tasks, this is the SP value loaded on >* kernel entry, see "PACAKSAVE(r13)" in _switch() and > > > cheers >
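The one-line change in the quoted diff swaps the test that decides whether a task is treated as a user task when the stack end is computed. A toy model of the two predicates, as a minimal sketch: `struct toy_task` and the `TOY_PF_*` bit values are hypothetical stand-ins chosen for illustration, not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical flag bits; only their distinctness matters for this sketch. */
#define TOY_PF_IDLE    0x00000002u
#define TOY_PF_KTHREAD 0x00200000u

/* Toy stand-in for struct task_struct: only the flags word is modelled. */
struct toy_task {
	unsigned int flags;
};

/* Old condition: everything that is not the idle task is a "user task". */
static bool old_is_user_task(const struct toy_task *t)
{
	return !(t->flags & TOY_PF_IDLE);
}

/* New condition: everything that is not a kernel thread is a "user task". */
static bool new_is_user_task(const struct toy_task *t)
{
	return !(t->flags & TOY_PF_KTHREAD);
}
```

For a non-idle kernel thread (`flags == TOY_PF_KTHREAD`) the old predicate answers true and the new one answers false, which is the case the diff changes; an ordinary user task (`flags == 0`) is classified the same way by both.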
Re: [PATCH 1/9] ARM: Remove
On Thu, Aug 17, 2023, at 12:07, Geert Uytterhoeven wrote: > As of commit b7fb14d3ac63117e ("ide: remove the legacy ide driver") in > v5.14, there are no more generic users of . > > Signed-off-by: Geert Uytterhoeven > --- Acked-by: Arnd Bergmann
Re: [PATCH v2 1/2] maple_tree: Disable mas_wr_append() when other readers are possible
This breaks booting on ppc32:

Kernel attempted to write user page (1ff0) - exploit attempt? (uid: 0)
BUG: Unable to handle kernel data access on write at 0x1ff0
Faulting instruction address: 0xc0009554
Vector: 300 (Data Access) at [c0b09d10]
    pc: c0009554: do_softirq_own_stack+0x18/0x30
    lr: c004f480: __irq_exit_rcu+0x70/0xc0
    sp: c0b09dd0
   msr: 1032
   dar: 1ff0
 dsisr: 4200
  current = 0xc0a08360
    pid   = 0, comm = swapper
Linux version 6.5.0 ...
enter ? for help
[c0b09de0] c00ff480 __irq_exit_rcu+0x70/0xc0
[c0b09df0] c0005a98 Decrementer_virt+0x108/0x10c
--- Exception: 900 (Decrementer) at c06cfa0c __schedule+0x4fc/0x510
[c0b09ec0] c06cf75c __schedule+0x1cc/0x510 (unreliable)
[c0b09ef0] c06cfc90 __cond_resched+0x2c/0x54
[c0b09f00] c06d07f8 mutex_lock_killable+0x18/0x5c
[c0b09f10] c013c404 pcpu_alloc+0x110/0x4dc
[c0b09f70] c000cc34 alloc_descr.isra.18+0x48/0x144
[c0b09f90] c0988aa0 early_irq_init+0x64/0x8c
[c0b09fa0] c097a5a4 start_kernel+0x5b4/0x7b0
[c0b09ff0] 3dc0
mon>

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
Re: [RFC PATCH v11 12/29] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory
Binbin Wu writes:

>> 
>>
>> +static long kvm_gmem_allocate(struct inode *inode, loff_t offset, loff_t len)
>> +{
>> +	struct address_space *mapping = inode->i_mapping;
>> +	pgoff_t start, index, end;
>> +	int r;
>> +
>> +	/* Dedicated guest is immutable by default. */
>> +	if (offset + len > i_size_read(inode))
>> +		return -EINVAL;
>> +
>> +	filemap_invalidate_lock_shared(mapping);
>> +
>> +	start = offset >> PAGE_SHIFT;
>> +	end = (offset + len) >> PAGE_SHIFT;
>> +
>> +	r = 0;
>> +	for (index = start; index < end; ) {
>> +		struct folio *folio;
>> +
>> +		if (signal_pending(current)) {
>> +			r = -EINTR;
>> +			break;
>> +		}
>> +
>> +		folio = kvm_gmem_get_folio(inode, index);
>> +		if (!folio) {
>> +			r = -ENOMEM;
>> +			break;
>> +		}
>> +
>> +		index = folio_next_index(folio);
>> +
>> +		folio_unlock(folio);
>> +		folio_put(folio);
> May be a dumb question, why we get the folio and then put it immediately?
> Will it make the folio be released back to the page allocator?
>

I was wondering this too, but it is correct.

In filemap_grab_folio(), the refcount is incremented in three places:

+ When the folio is created in filemap_alloc_folio(), it is given a
  refcount of 1 in

    filemap_alloc_folio() -> folio_alloc() -> __folio_alloc_node() ->
    __folio_alloc() -> __alloc_pages() -> get_page_from_freelist() ->
    prep_new_page() -> post_alloc_hook() -> set_page_refcounted()

+ Then, in filemap_add_folio(), the refcount is incremented twice:

    + The first is from the filemap (1 refcount per page if this is a
      hugepage):

        filemap_add_folio() -> __filemap_add_folio() -> folio_ref_add()

    + The second is a refcount from the lru list

        filemap_add_folio() -> folio_add_lru() -> folio_get() ->
        folio_ref_inc()

In the other path, if the folio exists in the page cache (filemap), the
refcount is also incremented through

    filemap_grab_folio() -> __filemap_get_folio() -> filemap_get_entry() ->
    folio_try_get_rcu()

I believe all the branches in kvm_gmem_get_folio() are taking a refcount
on the folio while the kernel does some work on the folio like clearing
the folio in clear_highpage() or getting the next index, and then when
done, the kernel does folio_put().

This pattern is also used in shmem and hugetlb. :)

I'm not sure whose refcount the folio_put() in kvm_gmem_allocate() is
dropping though:

+ The refcount for the filemap depends on whether this is a hugepage or
  not, but folio_put() strictly drops a refcount of 1.
+ The refcount for the lru list is just 1, but doesn't the page still
  remain in the lru list?

>> +
>> +		/* 64-bit only, wrapping the index should be impossible. */
>> +		if (WARN_ON_ONCE(!index))
>> +			break;
>> +
>> +		cond_resched();
>> +	}
>> +
>> +	filemap_invalidate_unlock_shared(mapping);
>> +
>> +	return r;
>> +}
>> +
>>
>>
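The grab-then-put pattern discussed above can be modelled with a plain counter. This is a toy sketch, not kernel code: `struct toy_folio`, `toy_grab_new()`, `toy_put()` and `toy_remove_from_cache()` are hypothetical names, and the model assumes only two owners (the caller and the page cache), leaving out the LRU reference that the message asks about.

```c
#include <stdbool.h>

/* Toy stand-in for struct folio: only the reference count is modelled. */
struct toy_folio {
	int refcount;
	bool in_page_cache;
	bool freed;
};

/* Model of "allocate and insert": one reference for the caller, one for the cache. */
static void toy_grab_new(struct toy_folio *f)
{
	f->freed = false;
	f->refcount = 1;         /* reference handed to the caller */
	f->in_page_cache = true;
	f->refcount += 1;        /* reference owned by the page cache */
}

/* Model of a put: drop one reference and free the object on the last one. */
static void toy_put(struct toy_folio *f)
{
	if (--f->refcount == 0)
		f->freed = true;
}

/* Model of removal from the cache, which drops the cache's own reference. */
static void toy_remove_from_cache(struct toy_folio *f)
{
	f->in_page_cache = false;
	toy_put(f);
}
```

Under these assumptions, calling `toy_put()` right after `toy_grab_new()` leaves the count at 1 and the object alive, because the cache still holds its reference; the object is only freed once `toy_remove_from_cache()` runs.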
Re: [PATCH] macintosh/ams: linux/platform_device.h is needed
Hi, On 8/29/23 22:46, Christophe Leroy wrote: > > > Le 30/08/2023 à 00:58, Randy Dunlap a écrit : >> ams.h uses struct platform_device, so the header should be used >> to prevent build errors: >> >> drivers/macintosh/ams/ams-input.c: In function 'ams_input_enable': >> drivers/macintosh/ams/ams-input.c:68:45: error: invalid use of undefined >> type 'struct platform_device' >> 68 | input->dev.parent = _info.of_dev->dev; >> drivers/macintosh/ams/ams-input.c: In function 'ams_input_init': >> drivers/macintosh/ams/ams-input.c:146:51: error: invalid use of undefined >> type 'struct platform_device' >>146 | return device_create_file(_info.of_dev->dev, >> _attr_joystick); >> drivers/macintosh/ams/ams-input.c: In function 'ams_input_exit': >> drivers/macintosh/ams/ams-input.c:151:44: error: invalid use of undefined >> type 'struct platform_device' >>151 | device_remove_file(_info.of_dev->dev, >> _attr_joystick); >> drivers/macintosh/ams/ams-input.c: In function 'ams_input_init': >> drivers/macintosh/ams/ams-input.c:147:1: error: control reaches end of >> non-void function [-Werror=return-type] >>147 | } >> >> Fixes: 233d687d1b78 ("macintosh: Explicitly include correct DT includes") >> Signed-off-by: Randy Dunlap >> Cc: Rob Herring >> Cc: linuxppc-dev@lists.ozlabs.org >> Cc: Michael Ellerman >> --- >> drivers/macintosh/ams/ams.h |1 + >> 1 file changed, 1 insertion(+) >> >> diff -- a/drivers/macintosh/ams/ams.h b/drivers/macintosh/ams/ams.h >> --- a/drivers/macintosh/ams/ams.h >> +++ b/drivers/macintosh/ams/ams.h >> @@ -6,6 +6,7 @@ >> #include >> #include >> #include >> +#include > > You modify ams.h to fix a problem in ams-input.c > Is that correct ? > struct platform_device is used in ams.h so I think the change needs to be there. > Shouldn't the include be in ams-input.c instead ? > >> #include >> #include >> > > Christophe -- ~Randy
Re: [RFC PATCH v11 12/29] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory
On 7/19/2023 7:44 AM, Sean Christopherson wrote:

[...]

+
+static struct folio *kvm_gmem_get_folio(struct file *file, pgoff_t index)
+{
+	struct folio *folio;
+
+	/* TODO: Support huge pages. */
+	folio = filemap_grab_folio(file->f_mapping, index);
+	if (!folio)

Should use if (IS_ERR(folio)) instead.

+		return NULL;
+
+	/*
+	 * Use the up-to-date flag to track whether or not the memory has been
+	 * zeroed before being handed off to the guest.  There is no backing
+	 * storage for the memory, so the folio will remain up-to-date until
+	 * it's removed.
+	 *
+	 * TODO: Skip clearing pages when trusted firmware will do it when
+	 * assigning memory to the guest.
+	 */
+	if (!folio_test_uptodate(folio)) {
+		unsigned long nr_pages = folio_nr_pages(folio);
+		unsigned long i;
+
+		for (i = 0; i < nr_pages; i++)
+			clear_highpage(folio_page(folio, i));
+
+		folio_mark_uptodate(folio);
+	}
+
+	/*
+	 * Ignore accessed, referenced, and dirty flags.  The memory is
+	 * unevictable and there is no storage to write back to.
+	 */
+	return folio;
+}

[...]

+
+static long kvm_gmem_allocate(struct inode *inode, loff_t offset, loff_t len)
+{
+	struct address_space *mapping = inode->i_mapping;
+	pgoff_t start, index, end;
+	int r;
+
+	/* Dedicated guest is immutable by default. */
+	if (offset + len > i_size_read(inode))
+		return -EINVAL;
+
+	filemap_invalidate_lock_shared(mapping);
+
+	start = offset >> PAGE_SHIFT;
+	end = (offset + len) >> PAGE_SHIFT;
+
+	r = 0;
+	for (index = start; index < end; ) {
+		struct folio *folio;
+
+		if (signal_pending(current)) {
+			r = -EINTR;
+			break;
+		}
+
+		folio = kvm_gmem_get_folio(inode, index);
+		if (!folio) {
+			r = -ENOMEM;
+			break;
+		}
+
+		index = folio_next_index(folio);
+
+		folio_unlock(folio);
+		folio_put(folio);

May be a dumb question, why we get the folio and then put it immediately?
Will it make the folio be released back to the page allocator?

+
+		/* 64-bit only, wrapping the index should be impossible. */
+		if (WARN_ON_ONCE(!index))
+			break;
+
+		cond_resched();
+	}
+
+	filemap_invalidate_unlock_shared(mapping);
+
+	return r;
+}
+

[...]

+
+int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
+		  unsigned int fd, loff_t offset)
+{
+	loff_t size = slot->npages << PAGE_SHIFT;
+	unsigned long start, end, flags;
+	struct kvm_gmem *gmem;
+	struct inode *inode;
+	struct file *file;
+
+	BUILD_BUG_ON(sizeof(gfn_t) != sizeof(slot->gmem.pgoff));
+
+	file = fget(fd);
+	if (!file)
+		return -EINVAL;
+
+	if (file->f_op != _gmem_fops)
+		goto err;
+
+	gmem = file->private_data;
+	if (gmem->kvm != kvm)
+		goto err;
+
+	inode = file_inode(file);
+	flags = (unsigned long)inode->i_private;
+
+	/*
+	 * For simplicity, require the offset into the file and the size of the
+	 * memslot to be aligned to the largest possible page size used to back
+	 * the file (same as the size of the file itself).
+	 */
+	if (!kvm_gmem_is_valid_size(offset, flags) ||
+	    !kvm_gmem_is_valid_size(size, flags))
+		goto err;
+
+	if (offset + size > i_size_read(inode))
+		goto err;
+
+	filemap_invalidate_lock(inode->i_mapping);
+
+	start = offset >> PAGE_SHIFT;
+	end = start + slot->npages;
+
+	if (!xa_empty(>bindings) &&
+	    xa_find(>bindings, , end - 1, XA_PRESENT)) {
+		filemap_invalidate_unlock(inode->i_mapping);
+		goto err;
+	}
+
+	/*
+	 * No synchronize_rcu() needed, any in-flight readers are guaranteed to
+	 * be see either a NULL file or this new file, no need for them to go
+	 * away.
+	 */
+	rcu_assign_pointer(slot->gmem.file, file);
+	slot->gmem.pgoff = start;
+
+	xa_store_range(>bindings, start, end - 1, slot, GFP_KERNEL);
+	filemap_invalidate_unlock(inode->i_mapping);
+
+	/*
+	 * Drop the reference to the file, even on success.  The file pins KVM,
+	 * not the other way 'round.  Active bindings are invalidated if the

an extra ', or maybe around?

+	 * file is closed before memslots are destroyed.
+	 */
+	fput(file);
+	return 0;
+
+err:
+	fput(file);
+	return -EINVAL;
+}
+

[...]
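The review comment about checking with IS_ERR() rather than testing for NULL can be illustrated outside the kernel. The sketch below re-implements the error-pointer idea in plain C; `toy_err_ptr()`, `toy_ptr_err()`, `toy_is_err()` and `toy_grab()` are hypothetical stand-ins, not the kernel's helpers or filemap_grab_folio(). It assumes, as the comment implies, that the lookup reports failure through an encoded error pointer instead of NULL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest errno value that this toy encoding reserves at the top of the address space. */
#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Decode the errno value carried by an error pointer. */
static inline long toy_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True if the pointer falls in the reserved error range. */
static inline bool toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

static int toy_object;

/* Hypothetical lookup that fails with an error pointer, never with NULL. */
static void *toy_grab(bool fail)
{
	return fail ? toy_err_ptr(-ENOMEM) : (void *)&toy_object;
}
```

With such a convention a failed `toy_grab(true)` is non-NULL, so a `!ptr` test never fires and the caller would go on to use an invalid pointer; `toy_is_err()` catches it, and `toy_ptr_err()` recovers the error code.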
Re: [PATCH] crypto: vmx: Improved AES/XTS performance of 6-way unrolling for ppc.
Hi Michael, I just submitted the v2 patch. Thanks. -Danny On 8/29/23 11:37 PM, Michael Ellerman wrote: Danny Tsen writes: Improve AES/XTS performance of 6-way unrolling for PowerPC up to 17% with tcrypt. This is done by using one instruction, vpermxor, to replace xor and vsldoi. This patch has been tested with the kernel crypto module tcrypt.ko and has passed the selftest. The patch is also tested with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS enabled. Signed-off-by: Danny Tsen --- drivers/crypto/vmx/aesp8-ppc.pl | 141 +--- 1 file changed, 92 insertions(+), 49 deletions(-) That's CRYPTOGAMS code, and is so far largely unchanged from the original. I see you've sent the same change to openssl, but it's not merged yet. Please document that in the change log, we want to keep the code in sync as much as possible, and document any divergences. cheers diff --git a/drivers/crypto/vmx/aesp8-ppc.pl b/drivers/crypto/vmx/aesp8-ppc.pl index 50a0a18f35da..f729589d792e 100644 --- a/drivers/crypto/vmx/aesp8-ppc.pl +++ b/drivers/crypto/vmx/aesp8-ppc.pl @@ -132,11 +132,12 @@ rcon: .long 0x1b00, 0x1b00, 0x1b00, 0x1b00 ?rev .long 0x0d0e0f0c, 0x0d0e0f0c, 0x0d0e0f0c, 0x0d0e0f0c ?rev .long 0,0,0,0 ?asis +.long 0x0f102132, 0x43546576, 0x8798a9ba, 0xcbdcedfe Lconsts: mflrr0 bcl 20,31,\$+4 mflr$ptr #v "distance between . and rcon - addi$ptr,$ptr,-0x48 + addi$ptr,$ptr,-0x58 mtlrr0 blr .long 0 @@ -2495,6 +2496,17 @@ _aesp8_xts_encrypt6x: li $x70,0x70 mtspr 256,r0 + xxlor 2, 32+$eighty7, 32+$eighty7 + vsldoi $eighty7,$tmp,$eighty7,1# 0x010101..87 + xxlor 1, 32+$eighty7, 32+$eighty7 + + # Load XOR Lconsts. + mr $x70, r6 + bl Lconsts + lxvw4x 0, $x40, r6 # load XOR contents + mr r6, $x70 + li $x70,0x70 + subi$rounds,$rounds,3 # -4 in total lvx $rndkey0,$x00,$key1 # load key schedule @@ -2537,69 +2549,77 @@ Load_xts_enc_key: ?vperm v31,v31,$twk5,$keyperm lvx v25,$x10,$key_ # pre-load round[2] + # Switch to use the following codes with 0x010101..87 to generate tweak. 
+ # eighty7 = 0x010101..87 + # vsrab tmp, tweak, seven # next tweak value, right shift 7 bits + # vand tmp, tmp, eighty7 # last byte with carry + # vaddubm tweak, tweak, tweak # left shift 1 bit (x2) + # xxlor vsx, 0, 0 + # vpermxor tweak, tweak, tmp, vsx + vperm $in0,$inout,$inptail,$inpperm subi $inp,$inp,31# undo "caller" vxor$twk0,$tweak,$rndkey0 vsrab $tmp,$tweak,$seven # next tweak value vaddubm $tweak,$tweak,$tweak - vsldoi $tmp,$tmp,$tmp,15 vand$tmp,$tmp,$eighty7 vxor $out0,$in0,$twk0 - vxor$tweak,$tweak,$tmp + xxlor 32+$in1, 0, 0 + vpermxor$tweak, $tweak, $tmp, $in1 lvx_u $in1,$x10,$inp vxor$twk1,$tweak,$rndkey0 vsrab $tmp,$tweak,$seven # next tweak value vaddubm $tweak,$tweak,$tweak - vsldoi $tmp,$tmp,$tmp,15 le?vperm $in1,$in1,$in1,$leperm vand$tmp,$tmp,$eighty7 vxor $out1,$in1,$twk1 - vxor$tweak,$tweak,$tmp + xxlor 32+$in2, 0, 0 + vpermxor$tweak, $tweak, $tmp, $in2 lvx_u $in2,$x20,$inp andi. $taillen,$len,15 vxor$twk2,$tweak,$rndkey0 vsrab $tmp,$tweak,$seven # next tweak value vaddubm $tweak,$tweak,$tweak - vsldoi $tmp,$tmp,$tmp,15 le?vperm $in2,$in2,$in2,$leperm vand$tmp,$tmp,$eighty7 vxor $out2,$in2,$twk2 - vxor$tweak,$tweak,$tmp + xxlor 32+$in3, 0, 0 + vpermxor$tweak, $tweak, $tmp, $in3 lvx_u $in3,$x30,$inp sub$len,$len,$taillen vxor$twk3,$tweak,$rndkey0 vsrab $tmp,$tweak,$seven # next tweak value vaddubm $tweak,$tweak,$tweak - vsldoi $tmp,$tmp,$tmp,15 le?vperm $in3,$in3,$in3,$leperm vand$tmp,$tmp,$eighty7 vxor $out3,$in3,$twk3 - vxor$tweak,$tweak,$tmp + xxlor 32+$in4, 0, 0 + vpermxor$tweak, $tweak, $tmp, $in4 lvx_u $in4,$x40,$inp subi $len,$len,0x60 vxor
[PATCH v2] crypto: vmx: Improved AES/XTS performance of 6-way unrolling for ppc.
Improve AES/XTS performance of 6-way unrolling for PowerPC up to 17% with tcrypt. This is done by using one instruction, vpermxor, to replace xor and vsldoi. The same changes were applied to OpenSSL code and a pull request was submitted. This patch has been tested with the kernel crypto module tcrypt.ko and has passed the selftest. The patch is also tested with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS enabled. Signed-off-by: Danny Tsen --- drivers/crypto/vmx/aesp8-ppc.pl | 141 +--- 1 file changed, 92 insertions(+), 49 deletions(-) diff --git a/drivers/crypto/vmx/aesp8-ppc.pl b/drivers/crypto/vmx/aesp8-ppc.pl index 50a0a18f35da..f729589d792e 100644 --- a/drivers/crypto/vmx/aesp8-ppc.pl +++ b/drivers/crypto/vmx/aesp8-ppc.pl @@ -132,11 +132,12 @@ rcon: .long 0x1b00, 0x1b00, 0x1b00, 0x1b00 ?rev .long 0x0d0e0f0c, 0x0d0e0f0c, 0x0d0e0f0c, 0x0d0e0f0c ?rev .long 0,0,0,0 ?asis +.long 0x0f102132, 0x43546576, 0x8798a9ba, 0xcbdcedfe Lconsts: mflrr0 bcl 20,31,\$+4 mflr$ptr #v "distance between . and rcon - addi$ptr,$ptr,-0x48 + addi$ptr,$ptr,-0x58 mtlrr0 blr .long 0 @@ -2495,6 +2496,17 @@ _aesp8_xts_encrypt6x: li $x70,0x70 mtspr 256,r0 + xxlor 2, 32+$eighty7, 32+$eighty7 + vsldoi $eighty7,$tmp,$eighty7,1# 0x010101..87 + xxlor 1, 32+$eighty7, 32+$eighty7 + + # Load XOR Lconsts. + mr $x70, r6 + bl Lconsts + lxvw4x 0, $x40, r6 # load XOR contents + mr r6, $x70 + li $x70,0x70 + subi$rounds,$rounds,3 # -4 in total lvx $rndkey0,$x00,$key1 # load key schedule @@ -2537,69 +2549,77 @@ Load_xts_enc_key: ?vperm v31,v31,$twk5,$keyperm lvx v25,$x10,$key_ # pre-load round[2] + # Switch to use the following codes with 0x010101..87 to generate tweak. 
+ # eighty7 = 0x010101..87 + # vsrab tmp, tweak, seven # next tweak value, right shift 7 bits + # vand tmp, tmp, eighty7 # last byte with carry + # vaddubm tweak, tweak, tweak # left shift 1 bit (x2) + # xxlor vsx, 0, 0 + # vpermxor tweak, tweak, tmp, vsx + vperm $in0,$inout,$inptail,$inpperm subi $inp,$inp,31# undo "caller" vxor$twk0,$tweak,$rndkey0 vsrab $tmp,$tweak,$seven # next tweak value vaddubm $tweak,$tweak,$tweak - vsldoi $tmp,$tmp,$tmp,15 vand$tmp,$tmp,$eighty7 vxor $out0,$in0,$twk0 - vxor$tweak,$tweak,$tmp + xxlor 32+$in1, 0, 0 + vpermxor$tweak, $tweak, $tmp, $in1 lvx_u $in1,$x10,$inp vxor$twk1,$tweak,$rndkey0 vsrab $tmp,$tweak,$seven # next tweak value vaddubm $tweak,$tweak,$tweak - vsldoi $tmp,$tmp,$tmp,15 le?vperm $in1,$in1,$in1,$leperm vand$tmp,$tmp,$eighty7 vxor $out1,$in1,$twk1 - vxor$tweak,$tweak,$tmp + xxlor 32+$in2, 0, 0 + vpermxor$tweak, $tweak, $tmp, $in2 lvx_u $in2,$x20,$inp andi. $taillen,$len,15 vxor$twk2,$tweak,$rndkey0 vsrab $tmp,$tweak,$seven # next tweak value vaddubm $tweak,$tweak,$tweak - vsldoi $tmp,$tmp,$tmp,15 le?vperm $in2,$in2,$in2,$leperm vand$tmp,$tmp,$eighty7 vxor $out2,$in2,$twk2 - vxor$tweak,$tweak,$tmp + xxlor 32+$in3, 0, 0 + vpermxor$tweak, $tweak, $tmp, $in3 lvx_u $in3,$x30,$inp sub$len,$len,$taillen vxor$twk3,$tweak,$rndkey0 vsrab $tmp,$tweak,$seven # next tweak value vaddubm $tweak,$tweak,$tweak - vsldoi $tmp,$tmp,$tmp,15 le?vperm $in3,$in3,$in3,$leperm vand$tmp,$tmp,$eighty7 vxor $out3,$in3,$twk3 - vxor$tweak,$tweak,$tmp + xxlor 32+$in4, 0, 0 + vpermxor$tweak, $tweak, $tmp, $in4 lvx_u $in4,$x40,$inp subi $len,$len,0x60 vxor$twk4,$tweak,$rndkey0 vsrab $tmp,$tweak,$seven # next tweak value vaddubm $tweak,$tweak,$tweak - vsldoi $tmp,$tmp,$tmp,15 le?vperm $in4,$in4,$in4,$leperm vand
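The comment block in the patch describes the per-block tweak update as a shift left by one bit plus a conditional fold of the carry using the constant 0x87. As a scalar reference for what that update computes in XTS (multiplication of the 128-bit tweak by x in GF(2^128), least significant byte first), here is a minimal sketch. It is not a translation of the vector instructions in the patch, and `xts_tweak_double()` is a name chosen for this illustration.

```c
#include <stdint.h>

/*
 * Multiply a 16-byte XTS tweak by x in GF(2^128).
 * Byte 0 is the least significant byte; a bit shifted out of byte 15
 * is folded back into byte 0 with the reduction constant 0x87.
 */
static void xts_tweak_double(uint8_t t[16])
{
	uint8_t carry = 0;

	for (int i = 0; i < 16; i++) {
		uint8_t next = (uint8_t)(t[i] >> 7);   /* bit leaving this byte */

		t[i] = (uint8_t)((t[i] << 1) | carry);
		carry = next;
	}
	if (carry)
		t[0] ^= 0x87;
}
```

The constant 0x87 comes from the reduction polynomial x^128 + x^7 + x^2 + x + 1: when the top bit is shifted out, x^128 is replaced by x^7 + x^2 + x + 1, whose bit pattern is 0x87.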
RE: [PATCH] kbuild: single-quote the format string of printf
... > All of "\\n", "\n", '\n' are the same. and \\n (without any quote characters at all). Personally I'd use 'format' for printf. To make it obvious that nothing is to be expanded. But it isn't really worth changing existing 'stuff'. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
[PATCH] powerpc/smp: Dynamically build powerpc topology
Currently there are four powerpc specific sched topologies. These are
all statically defined. However not all these topologies are used by
all powerpc systems.

To avoid unnecessary degenerations by the scheduler, masks and flags
are compared. However if the sched topologies are built dynamically then
the code is simpler and there are greater chances of avoiding
degenerations.

Even x86 builds its sched topologies dynamically, and the new changes are
very similar to the way x86 builds its topologies.

System Configuration
type=Shared mode=Uncapped smt=8 lcpu=128 mem=1063126592 kB cpus=96 ent=40.00

$ lscpu
Architecture:         ppc64le
Byte Order:           Little Endian
CPU(s):               1024
On-line CPU(s) list:  0-1023
Model name:           POWER10 (architected), altivec supported
Model:                2.0 (pvr 0080 0200)
Thread(s) per core:   8
Core(s) per socket:   32
Socket(s):            4
Hypervisor vendor:    pHyp
Virtualization type:  para
L1d cache:            8 MiB (256 instances)
L1i cache:            12 MiB (256 instances)
NUMA node(s):         4

From dmesg of v6.5
[0.17] smp: Bringing up secondary CPUs ...
[3.918535] smp: Brought up 4 nodes, 1024 CPUs
[ 38.001402] sysrq: Changing Loglevel
[ 38.001446] sysrq: Loglevel set to 9

From dmesg of v6.5 + patch
[0.174462] smp: Bringing up secondary CPUs ...
[3.421462] smp: Brought up 4 nodes, 1024 CPUs
[ 35.417917] sysrq: Changing Loglevel
[ 35.417959] sysrq: Loglevel set to 9

5 runs of ppc64_cpu --smt=1 (time measured: lower is better)
Kernel  N  Min     Max     Median  Avg      Stddev     %Change
v6.5    5  518.08  574.27  528.61  535.388  22.341542
+patch  5  481.73  495.47  484.21  486.402  5.7997     -9.14963

5 runs of ppc64_cpu --smt=8 (time measured: lower is better)
Kernel  N  Min      Max      Median   Avg       Stddev     %Change
v6.5    5  1094.12  1117.1   1108.97  1106.3    8.606361
+patch  5  1067.5   1090.03  1073.89  1076.574  9.4189347  -2.68697

Signed-off-by: Srikar Dronamraju
---
 arch/powerpc/kernel/smp.c | 78 ++-
 1 file changed, 28 insertions(+), 50 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 48b8161179a8..c16443a04c26 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -92,15 +92,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_l2_cache_map);
 EXPORT_PER_CPU_SYMBOL(cpu_core_map);
 EXPORT_SYMBOL_GPL(has_big_cores);
 
-enum {
-#ifdef CONFIG_SCHED_SMT
-	smt_idx,
-#endif
-	cache_idx,
-	mc_idx,
-	die_idx,
-};
-
 #define MAX_THREAD_LIST_SIZE 8
 #define THREAD_GROUP_SHARE_L1 1
 #define THREAD_GROUP_SHARE_L2_L3 2
@@ -1048,16 +1039,6 @@ static const struct cpumask *cpu_mc_mask(int cpu)
 	return cpu_coregroup_mask(cpu);
 }
 
-static struct sched_domain_topology_level powerpc_topology[] = {
-#ifdef CONFIG_SCHED_SMT
-	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
-#endif
-	{ shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
-	{ cpu_mc_mask, powerpc_shared_proc_flags, SD_INIT_NAME(MC) },
-	{ cpu_cpu_mask, powerpc_shared_proc_flags, SD_INIT_NAME(DIE) },
-	{ NULL, },
-};
-
 static int __init init_big_cores(void)
 {
 	int cpu;
@@ -1676,9 +1657,11 @@ void start_secondary(void *unused)
 	BUG();
 }
 
-static void __init fixup_topology(void)
+static struct sched_domain_topology_level powerpc_topology[6];
+
+static void __init build_sched_topology(void)
 {
-	int i;
+	int i = 0;
 
 	if (is_shared_processor()) {
 		asym_pack_flag = SD_ASYM_PACKING;
@@ -1690,36 +1673,33 @@ static void __init fixup_topology(void)
 #ifdef CONFIG_SCHED_SMT
 	if (has_big_cores) {
 		pr_info("Big cores detected but using small core scheduling\n");
-		powerpc_topology[smt_idx].mask = smallcore_smt_mask;
+		powerpc_topology[i++] = (struct sched_domain_topology_level){
+			smallcore_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT)
+		};
+	} else {
+		powerpc_topology[i++] = (struct sched_domain_topology_level){
+			cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT)
+		};
 	}
 #endif
+	if (shared_caches) {
+		powerpc_topology[i++] = (struct sched_domain_topology_level){
+			shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE)
+		};
+	}
+	if (has_coregroup_support()) {
+		powerpc_topology[i++] = (struct sched_domain_topology_level){
+			cpu_mc_mask, powerpc_shared_proc_flags, SD_INIT_NAME(MC)
+		};
+	}
+	powerpc_topology[i++] = (struct sched_domain_topology_level){
+		cpu_cpu_mask,
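The approach in the patch above, appending only the levels a system actually has to a fixed-size array through a running index, can be sketched in isolation. This is a simplified illustration under stated assumptions: `struct toy_level` and `toy_build_topology()` are hypothetical, each level is reduced to a name, and the NULL terminator written at the end is an assumption of this sketch (the quoted diff is cut off before that point).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a sched domain topology level. */
struct toy_level {
	const char *name;
};

/* Room for four levels plus a terminating entry, with one slot to spare. */
#define TOY_MAX_LEVELS 6

/*
 * Fill 'out' with only the levels that apply and return how many were
 * written, not counting the terminating entry whose name is NULL.
 */
static size_t toy_build_topology(struct toy_level out[TOY_MAX_LEVELS],
				 bool smt, bool shared_caches, bool coregroup)
{
	size_t i = 0;

	if (smt)
		out[i++] = (struct toy_level){ "SMT" };
	if (shared_caches)
		out[i++] = (struct toy_level){ "CACHE" };
	if (coregroup)
		out[i++] = (struct toy_level){ "MC" };

	out[i++] = (struct toy_level){ "DIE" };   /* always present */
	out[i] = (struct toy_level){ NULL };      /* terminator */

	return i;
}
```

Compared with a static table that is patched up afterwards, levels that do not apply never enter the array, so nothing later has to detect and discard them.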
Re: [PATCH] kbuild: single-quote the format string of printf
On Wed, Aug 30, 2023 at 10:00 AM Nicolas Schier wrote:
>
> On Tue 29 Aug 2023 20:35:31 GMT, Masahiro Yamada wrote:
> > Use single-quotes to avoid escape sequences (\\n).
> >
> > Signed-off-by: Masahiro Yamada
> > ---
>
> Is this really necessary?  Testing w/ GNU Make 4.3, bash 5.2.15 or
> dash 0.5.12-6 and a stupid Makefile snippet I cannot see any difference
> between these three:
>
> print:
> 	@printf "hello med single-backslash and double quotes\n"
> 	@printf 'hello med single-backslash and single quotes\n'
> 	@printf "hello med double-backslash and double quotes\\n"
>
> Only double-backslash+n in single-quotes does not work, for obvious
> reasons.

You are right.

I was misunderstanding the backslash-escaping in double-quotes.
I always used single-quotes when I wanted to avoid escape sequences.

The following POSIX spec applies here:

  2.2.3 Double-Quotes

  The <backslash> shall retain its special meaning as an escape character
  (see Escape Character (Backslash)) only when followed by one of the
  following characters when considered special:

    $ ` " \ <newline>

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html

All of "\\n", "\n", '\n' are the same.

-- 
Best Regards
Masahiro Yamada
[PATCH 4/4] powerpc/smp: Disable MC domain for shared processor
Like L2-cache info, coregroup information which is used to determine MC sched domains is only present on dedicated LPARs. i.e PowerVM doesn't export coregroup information for shared processor LPARs. Hence disable creating MC domains on shared LPAR Systems. Signed-off-by: Srikar Dronamraju --- arch/powerpc/kernel/smp.c | 4 1 file changed, 4 insertions(+) diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index 51403640440c..48b8161179a8 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -1036,6 +1036,10 @@ static struct cpumask *cpu_coregroup_mask(int cpu) static bool has_coregroup_support(void) { + /* Coregroup identification not available on shared systems */ + if (is_shared_processor()) + return 0; + return coregroup_enabled; } -- 2.41.0
[PATCH 2/4] powerpc/smp: Move shared_processor static key to smp.h
The ability to detect if the system is running in a shared processor mode is helpful in few more generic cases not just in paravirtualization. For example: At boot time, different scheduler/ topology flags may be set based on the processor mode. Hence move it to a more generic file. Signed-off-by: Srikar Dronamraju --- arch/powerpc/include/asm/paravirt.h | 12 arch/powerpc/include/asm/smp.h | 14 ++ 2 files changed, 14 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h index f5ba1a3c41f8..3bc8cff97f9a 100644 --- a/arch/powerpc/include/asm/paravirt.h +++ b/arch/powerpc/include/asm/paravirt.h @@ -14,13 +14,6 @@ #include #include -DECLARE_STATIC_KEY_FALSE(shared_processor); - -static inline bool is_shared_processor(void) -{ - return static_branch_unlikely(_processor); -} - #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING extern struct static_key paravirt_steal_enabled; extern struct static_key paravirt_steal_rq_enabled; @@ -71,11 +64,6 @@ static inline void yield_to_any(void) plpar_hcall_norets_notrace(H_CONFER, -1, 0); } #else -static inline bool is_shared_processor(void) -{ - return false; -} - static inline u32 yield_count_of(int cpu) { return 0; diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h index 576d0e15..08631b2a4528 100644 --- a/arch/powerpc/include/asm/smp.h +++ b/arch/powerpc/include/asm/smp.h @@ -34,6 +34,20 @@ extern bool coregroup_enabled; extern int cpu_to_chip_id(int cpu); extern int *chip_id_lookup_table; +#ifdef CONFIG_PPC_SPLPAR +DECLARE_STATIC_KEY_FALSE(shared_processor); + +static inline bool is_shared_processor(void) +{ + return static_branch_unlikely(_processor); +} +#else +static inline bool is_shared_processor(void) +{ + return false; +} +#endif + DECLARE_PER_CPU(cpumask_var_t, thread_group_l1_cache_map); DECLARE_PER_CPU(cpumask_var_t, thread_group_l2_cache_map); DECLARE_PER_CPU(cpumask_var_t, thread_group_l3_cache_map); -- 2.41.0
[PATCH 3/4] powerpc/smp: Enable Asym packing for cores on shared processor
If there are shared processor LPARs, the underlying hypervisor can have
more virtual cores to handle than actual physical cores.

Starting with Power 9, a core has 2 nearly independent thread groups.
On shared processor LPARs, it helps to pack threads onto fewer cores so
that the overall system performance and utilization improve. PowerVM
schedules at a core level. Hence packing to fewer cores helps.

For example: let's say there are two 8-core shared LPARs that are actually
sharing an 8-core shared physical pool, each running 8 threads. Then
consolidating 8 threads to 4 cores on each LPAR would help them to
perform better.

To achieve this, enable the SD_ASYM_PACKING flag at the CACHE, MC and DIE
levels.

Signed-off-by: Srikar Dronamraju
---
 arch/powerpc/kernel/smp.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index c7d1484ed230..51403640440c 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1005,7 +1005,12 @@ static int powerpc_smt_flags(void)
  */
 static int powerpc_shared_cache_flags(void)
 {
-	return SD_SHARE_PKG_RESOURCES;
+	return SD_SHARE_PKG_RESOURCES | asym_pack_flag;
+}
+
+static int powerpc_shared_proc_flags(void)
+{
+	return asym_pack_flag;
 }
 
 /*
@@ -1044,8 +1049,8 @@ static struct sched_domain_topology_level powerpc_topology[] = {
 	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
 #endif
 	{ shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
-	{ cpu_mc_mask, SD_INIT_NAME(MC) },
-	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
+	{ cpu_mc_mask, powerpc_shared_proc_flags, SD_INIT_NAME(MC) },
+	{ cpu_cpu_mask, powerpc_shared_proc_flags, SD_INIT_NAME(DIE) },
 	{ NULL, },
 };
 
@@ -1671,7 +1676,9 @@ static void __init fixup_topology(void)
 {
 	int i;
 
-	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
+	if (is_shared_processor()) {
+		asym_pack_flag = SD_ASYM_PACKING;
+	} else if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
 		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
 		asym_pack_flag = SD_ASYM_PACKING;
 	}
-- 
2.41.0
[PATCH 0/4] powerpc/smp: Shared processor sched optimizations
PowerVM systems configured in shared processor mode have some unique
challenges. Some device-tree properties will be missing on a shared
processor. Hence some sched domains may not make sense for shared processor
systems.

Most shared processor systems are over-provisioned. The underlying PowerVM
hypervisor schedules at Big Core granularity. The most recent Power
processors support two almost independent cores. In a lightly loaded
condition, it helps the overall system performance if we pack onto fewer
Big Cores.

System Configuration
type=Shared mode=Capped smt=8 lcpu=128 mem=1066732224 kB cpus=96 ent=40.00
So *40 Entitled cores / 128 Virtual processors* scenario.

lscpu
Architecture:         ppc64le
Byte Order:           Little Endian
CPU(s):               1024
On-line CPU(s) list:  0-1023
Model name:           POWER10 (architected), altivec supported
Model:                2.0 (pvr 0080 0200)
Thread(s) per core:   8
Core(s) per socket:   16
Socket(s):            8
Hypervisor vendor:    pHyp
Virtualization type:  para
L1d cache:            8 MiB (256 instances)
L1i cache:            12 MiB (256 instances)
NUMA node(s):         8
NUMA node0 CPU(s):    0-7,64-71,128-135,192-199,256-263,320-327,384-391,448-455,512-519,576-583,640-647,704-711,768-775,832-839,896-903,960-967
NUMA node1 CPU(s):    8-15,72-79,136-143,200-207,264-271,328-335,392-399,456-463,520-527,584-591,648-655,712-719,776-783,840-847,904-911,968-975
NUMA node2 CPU(s):    16-23,80-87,144-151,208-215,272-279,336-343,400-407,464-471,528-535,592-599,656-663,720-727,784-791,848-855,912-919,976-983
NUMA node3 CPU(s):    24-31,88-95,152-159,216-223,280-287,344-351,408-415,472-479,536-543,600-607,664-671,728-735,792-799,856-863,920-927,984-991
NUMA node4 CPU(s):    32-39,96-103,160-167,224-231,288-295,352-359,416-423,480-487,544-551,608-615,672-679,736-743,800-807,864-871,928-935,992-999
NUMA node5 CPU(s):    40-47,104-111,168-175,232-239,296-303,360-367,424-431,488-495,552-559,616-623,680-687,744-751,808-815,872-879,936-943,1000-1007
NUMA node6 CPU(s):    48-55,112-119,176-183,240-247,304-311,368-375,432-439,496-503,560-567,624-631,688-695,752-759,816-823,880-887,944-951,1008-1015
NUMA node7 CPU(s):    56-63,120-127,184-191,248-255,312-319,376-383,440-447,504-511,568-575,632-639,696-703,760-767,824-831,888-895,952-959,1016-1023

ebizzy -t 40 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel  N  Min      Max      Median   Avg        Stddev     %Change
v6.5    5  4664647  5148125  5130549  5043050.2  211756.06
+patch  5  4769453  5220808  5137476  5040333.8  193586.43  -0.0538642

From lparstat (when the workload stabilized)
Kernel  %user  %sys  %wait  %idle  physc  %entc   lbusy  app    vcsw       phint
v6.5    6.23   0.00  0.00   93.77  40.06  100.15  6.23   55.92  138699651  100
+patch  6.26   0.01  0.00   93.73  21.15  52.87   6.27   74.78  71743299   148

ebizzy -t 80 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel  N  Min      Max      Median   Avg        Stddev     %Change
v6.5    5  8735907  9121401  8986218  8967125.6  152793.38
+patch  5  9636679  9990229  9765958  9770081.8  143913.29  8.95444

From lparstat (when the workload stabilized)
Kernel  %user  %sys  %wait  %idle  physc  %entc   lbusy  app    vcsw      phint
v6.5    12.40  0.01  0.00   87.60  71.05  177.62  12.40  24.61  98047012  85
+patch  12.47  0.02  0.00   87.50  41.06  102.65  12.50  54.90  77821678  158

ebizzy -t 160 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel  N  Min       Max       Median    Avg       Stddev     %Change
v6.5    5  12378356  12946633  12780732  12682369  266135.73
+patch  5  16756702  17676670  17406971  17341585  346054.89  36.7377

From lparstat (when the workload stabilized)
Kernel  %user  %sys  %wait  %idle  physc  %entc   lbusy  app    vcsw       phint
v6.5    24.56  0.09  0.15   75.19  77.42  193.55  24.65  17.94  135625276  98
+patch  24.78  0.03  0.00   75.19  78.33  195.83  24.81  17.17  107826112  215

-

System Configuration
type=Shared mode=Capped smt=8 lcpu=40 mem=1066732672 kB cpus=96 ent=40.00
So *40 Entitled cores / 40 Virtual processors* scenario.
lscpu
Architecture:        ppc64le
Byte Order:          Little Endian
CPU(s):              320
On-line CPU(s) list: 0-319
Model name:          POWER10 (architected), altivec supported
Model:               2.0 (pvr 0080 0200)
Thread(s) per core:  8
Core(s) per socket:  10
Socket(s):           4
Hypervisor vendor:   pHyp
Virtualization type: para
L1d cache:           2.5 MiB (80 instances)
L1i cache:           3.8
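
As a cross-check on the tables above: the %Change column appears to be the relative change of the Avg column between v6.5 and +patch, and %entc appears consistent with physc divided by the entitlement (ent=40.00). A minimal sketch of both calculations follows; the function names are illustrative and are not taken from ebizzy or lparstat.

```c
#include <assert.h>

/* Relative change of `patched` versus `base`, in percent. */
double pct_change(double base, double patched)
{
	assert(base != 0.0);	/* a zero baseline has no relative change */
	return (patched - base) / base * 100.0;
}

/* Share of the entitlement consumed, in percent (assumed meaning of %entc). */
double entc_percent(double physc, double ent)
{
	assert(ent > 0.0);
	return physc / ent * 100.0;
}
```

For the -t 80 run, pct_change(8967125.6, 9770081.8) lands close to the reported 8.95444, and entc_percent(71.05, 40.0) lands close to the reported 177.62.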
[PATCH 1/4] powerpc/smp: Cache CPU has Asymmetric SMP
Currently cpu feature flag is checked whenever powerpc_smt_flags gets
called. This is an unnecessary overhead. CPU_FTR_ASYM_SMT is set based
on the processor and all processors will either have this set or will
have it unset.

Hence only check for the feature flag once and cache it to be used
subsequently. This commit will help avoid a branch in powerpc_smt_flags.

Signed-off-by: Srikar Dronamraju
---
 arch/powerpc/kernel/smp.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index fbbb695bae3d..c7d1484ed230 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -987,18 +987,13 @@ static int __init init_thread_group_cache_map(int cpu, int cache_property)
 }
 
 static bool shared_caches;
+static int asym_pack_flag;
 
 #ifdef CONFIG_SCHED_SMT
 /* cpumask of CPUs with asymmetric SMT dependency */
 static int powerpc_smt_flags(void)
 {
-	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
-
-	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
-		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
-		flags |= SD_ASYM_PACKING;
-	}
-	return flags;
+	return SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES | asym_pack_flag;
 }
 #endif
 
@@ -1676,6 +1671,11 @@ static void __init fixup_topology(void)
 {
 	int i;
 
+	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
+		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
+		asym_pack_flag = SD_ASYM_PACKING;
+	}
+
 #ifdef CONFIG_SCHED_SMT
 	if (has_big_cores) {
 		pr_info("Big cores detected but using small core scheduling\n");
--
2.41.0
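
The pattern in this patch (test a boot-time invariant once, then OR the cached result into every later return value) can be tried outside the kernel. The sketch below is a user-space illustration only: every demo_* name and every flag value is made up for the example, and demo_cpu_has_asym_smt merely stands in for cpu_has_feature(CPU_FTR_ASYM_SMT).

```c
#include <stdbool.h>

/* Illustrative flag values; these are NOT the kernel's SD_* constants. */
enum {
	DEMO_SHARE_CPUCAPACITY   = 1 << 0,
	DEMO_SHARE_PKG_RESOURCES = 1 << 1,
	DEMO_ASYM_PACKING        = 1 << 2,
};

/* Stand-in for the CPU feature test; fixed for the lifetime of the system. */
bool demo_cpu_has_asym_smt;

/* Cached result: 0, or DEMO_ASYM_PACKING once demo_init() has run. */
static int demo_asym_pack_flag;

/* One-time setup: evaluate the feature test and remember the outcome. */
void demo_init(void)
{
	demo_asym_pack_flag = demo_cpu_has_asym_smt ? DEMO_ASYM_PACKING : 0;
}

/* Hot path: no branch, only an OR with the cached value. */
int demo_smt_flags(void)
{
	return DEMO_SHARE_CPUCAPACITY | DEMO_SHARE_PKG_RESOURCES |
	       demo_asym_pack_flag;
}
```

The design relies on the feature being uniform across all processors, as the commit message states; if it could differ per CPU, a single cached value would not be sufficient.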
Re: [PATCH] powerpc: Hide empty pt_regs at base of the stack
On Thu, 24 Aug 2023 at 06:42, Michael Ellerman wrote:
>
> A thread started via eg. user_mode_thread() runs in the kernel to begin
> with and then may later return to userspace. While it's running in the
> kernel it has a pt_regs at the base of its kernel stack, but that
> pt_regs is all zeroes.
>
> If the thread oopses in that state, it leads to an ugly stack trace with
> a big block of zero GPRs, as reported by Joel:
>
>   Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
>   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc7-4-gf7757129e3de-dirty #3
>   Hardware name: IBM PowerNV (emulated by qemu) POWER9 0x4e1200 opal:v7.0 PowerNV
>   Call Trace:
>   [c36afb00] [c10dd058] dump_stack_lvl+0x6c/0x9c (unreliable)
>   [c36afb30] [c013c524] panic+0x178/0x424
>   [c36afbd0] [c2005100] mount_root_generic+0x250/0x324
>   [c36afca0] [c20057d0] prepare_namespace+0x2d4/0x344
>   [c36afd20] [c20049c0] kernel_init_freeable+0x358/0x3ac
>   [c36afdf0] [c00111b0] kernel_init+0x30/0x1a0
>   [c36afe50] [c000debc] ret_from_kernel_user_thread+0x14/0x1c
>   --- interrupt: 0 at 0x0
>   NIP: LR: CTR:
>   REGS: c36afe80 TRAP: Not tainted (6.5.0-rc7-4-gf7757129e3de-dirty)
>   MSR: <> CR: XER:
>   CFAR: IRQMASK: 0
>   GPR00:
>   GPR04:
>   GPR08:
>   GPR12:
>   GPR16:
>   GPR20:
>   GPR24:
>   GPR28:
>   NIP [] 0x0
>   LR [] 0x0
>   --- interrupt: 0
>
> The all-zero pt_regs looks ugly and conveys no useful information, other
> than its presence. So detect that case and just show the presence of the
> frame by printing the interrupt marker, eg:
>
>   Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
>   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc3-00126-g18e9506562a0-dirty #301
>   Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf05 of:SLOF,HEAD hv:linux,kvm pSeries
>   Call Trace:
>   [c3aabb00] [c1143db8] dump_stack_lvl+0x6c/0x9c (unreliable)
>   [c3aabb30] [c014c624] panic+0x178/0x424
>   [c3aabbd0] [c20050fc] mount_root_generic+0x250/0x324
>   [c3aabca0] [c20057cc] prepare_namespace+0x2d4/0x344
>   [c3aabd20] [c20049bc] kernel_init_freeable+0x358/0x3ac
>   [c3aabdf0] [c00111b0] kernel_init+0x30/0x1a0
>   [c3aabe50] [c000debc] ret_from_kernel_user_thread+0x14/0x1c
>   --- interrupt: 0 at 0x0
>
> To avoid ever suppressing a valid pt_regs make sure the pt_regs has a
> zero MSR and TRAP value, and is located at the very base of the stack.
>
> Reported-by: Joel Stanley
> Reported-by: Nicholas Piggin
> Signed-off-by: Michael Ellerman

Reviewed-by: Joel Stanley

Thanks for the explanation and the patch.

> ---
>  arch/powerpc/kernel/process.c | 26 +++---
>  1 file changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index b68898ac07e1..392404688cec 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -2258,6 +2258,22 @@ unsigned long __get_wchan(struct task_struct *p)
>  	return ret;
>  }
>
> +static bool empty_user_regs(struct pt_regs *regs, struct task_struct *tsk)
> +{
> +	unsigned long stack_page;
> +
> +	// A non-empty pt_regs should never have a zero MSR or TRAP value.
> +	if (regs->msr || regs->trap)
> +		return false;
> +
> +	// Check it sits at the very base of the stack
> +	stack_page = (unsigned long)task_stack_page(tsk);
> +	if ((unsigned long)(regs + 1) != stack_page + THREAD_SIZE)
> +		return false;
> +
> +	return true;
> +}
> +
>  static int kstack_depth_to_print = CONFIG_PRINT_STACK_DEPTH;
>
>  void __no_sanitize_address show_stack(struct task_struct *tsk,
> @@ -2322,9 +2338,13 @@ void __no_sanitize_address show_stack(struct task_struct *tsk,
>  			lr = regs->link;
>  			printk("%s--- interrupt: %lx at %pS\n",
>  			       loglvl, regs->trap, (void *)regs->nip);
> -			__show_regs(regs);
> -
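
The two conditions in empty_user_regs() (zero MSR and TRAP, and a frame that ends exactly where the stack ends) can be exercised in user space. The following is a rough sketch under stated assumptions: struct demo_regs is a simplified stand-in for struct pt_regs, DEMO_THREAD_SIZE is an arbitrary value rather than the kernel's THREAD_SIZE, and the demo_* helpers are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_THREAD_SIZE 16384u	/* arbitrary; not the kernel's THREAD_SIZE */

/* Simplified stand-in for struct pt_regs. */
struct demo_regs {
	unsigned long msr;
	unsigned long trap;
	unsigned long gpr[32];
};

/* Zero-filled fake kernel stack; the caller frees it. */
void *demo_alloc_stack(void)
{
	return calloc(1, DEMO_THREAD_SIZE);
}

/* The frame whose end coincides with the end of the stack. */
struct demo_regs *demo_base_regs(void *stack_page)
{
	unsigned char *end = (unsigned char *)stack_page + DEMO_THREAD_SIZE;

	return (struct demo_regs *)end - 1;
}

/* Mirrors the quoted check: all-zero MSR/TRAP and located at the stack base. */
bool demo_empty_user_regs(const struct demo_regs *regs, const void *stack_page)
{
	if (regs->msr || regs->trap)
		return false;

	return (uintptr_t)(regs + 1) == (uintptr_t)stack_page + DEMO_THREAD_SIZE;
}
```

With a fresh stack, the frame returned by demo_base_regs() is reported as empty; a frame one slot lower, or the same frame with a non-zero trap value, is not. The position test is what keeps a zeroed frame elsewhere on the stack from being suppressed.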
Re: [PATCH RFC 1/2] powerpc/pseries: papr-vpd char driver for VPD retrieval
Hello,

thanks for working on this.

On Tue, Aug 22, 2023 at 04:33:39PM -0500, Nathan Lynch via B4 Relay wrote:
> From: Nathan Lynch
>
> PowerVM LPARs may retrieve Vital Product Data (VPD) for system
> components using the ibm,get-vpd RTAS function.
>
> We can expose this to user space with a /dev/papr-vpd character
> device, where the programming model is:
>
>   struct papr_location_code plc = { .str = "", }; /* obtain all VPD */
>   int devfd = open("/dev/papr-vpd", O_WRONLY);
>   int vpdfd = ioctl(devfd, PAPR_VPD_CREATE_HANDLE, &plc);
>   size_t size = lseek(vpdfd, 0, SEEK_END);
>   char *buf = malloc(size);
>   pread(vpdfd, buf, size, 0);
>
> When a file descriptor is obtained from ioctl(PAPR_VPD_CREATE_HANDLE),
> the file contains the result of a complete ibm,get-vpd sequence. The

Could this be somewhat less obfuscated?

What the caller wants is the result of "ibm,get-vpd", which is a
well-known string identifier of the RTAS call.

Yet this identifier is never passed in. Instead we have this new
PAPR_VPD_CREATE_HANDLE. This is a completely new identifier, specific to
this call only, as is the /dev/papr-vpd device name, another new
identifier.

Maybe the interface could provide a way to specify the service name?

> file contents are immutable from the POV of user space. To get a new
> view of VPD, clients must create a new handle.

Which is basically the same as creating a file descriptor with open().

Maybe creating a directory in sysfs or procfs with filenames
corresponding to RTAS services would do the same job without extra
obfuscation?

> This design choice insulates user space from most of the complexities
> that ibm,get-vpd brings:
>
> * ibm,get-vpd must be called more than once to obtain complete
>   results.
> * Only one ibm,get-vpd call sequence should be in progress at a time;
>   concurrent sequences will disrupt each other. Callers must have a
>   protocol for serializing their use of the function.
> * A call sequence in progress may receive a "VPD changed, try again"
>   status, requiring the client to start over. (The driver does not yet
>   handle this, but it should be easy to add.)

That certainly reduces the complexity of the user interface, making it
much saner.

> The memory required for the VPD buffers seems acceptable, around 20KB
> for all VPD on one of my systems. And the value of the
> /rtas/ibm,vpd-size DT property (the estimated maximum size of VPD) is
> consistently 300KB across various systems I've checked.
>
> I've implemented support for this new ABI in the rtas_get_vpd()
> function in librtas, which the vpdupdate command currently uses to
> populate its VPD database. I've verified that an unmodified vpdupdate
> binary generates an identical database when using a librtas.so that
> prefers the new ABI.
>
> Likely remaining work:
>
> * Handle RTAS call status -4 (VPD changed) during ibm,get-vpd call
>   sequence.
> * Prevent ibm,get-vpd calls via rtas(2) from disrupting ibm,get-vpd
>   call sequences in this driver.
> * (Maybe) implement a poll method for delivering notifications of
>   potential changes to VPD, e.g. after a partition migration.

That sounds like something for netlink. If that is desired, maybe it
should be used in the first place?

> Questions, points for discussion:
>
> * Am I allocating the ioctl numbers correctly?
> * The only way to discover the size of a VPD buffer is
>   lseek(SEEK_END). fstat() doesn't work for anonymous fds like
>   this. Is this OK, or should the buffer length be discoverable some
>   other way?

So long as users have /rtas/ibm,vpd-size as the upper bound of the data
they can receive, I don't think it's critical to know the current VPD
size.

Thanks

Michal
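
The size-discovery question above comes down to the lseek(SEEK_END) plus pread() idiom. Below is a minimal sketch of that idiom for any seekable file descriptor, using only POSIX calls. It deliberately does not touch /dev/papr-vpd or the proposed ioctl, since that interface is still an RFC; demo_read_all is a hypothetical helper name.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the whole content of a seekable fd into a malloc'ed buffer.
 * The size comes from lseek(SEEK_END), as in the quoted programming model.
 * Returns NULL on error; on success *len_out holds the number of bytes read
 * and the caller frees the buffer.
 */
char *demo_read_all(int fd, size_t *len_out)
{
	off_t end = lseek(fd, 0, SEEK_END);
	size_t size, done = 0;
	char *buf;

	if (end < 0)
		return NULL;
	size = (size_t)end;

	buf = malloc(size ? size : 1);	/* avoid a zero-sized allocation */
	if (!buf)
		return NULL;

	while (done < size) {
		/* pread() takes an explicit offset and ignores the file position */
		ssize_t n = pread(fd, buf + done, size - done, (off_t)done);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			free(buf);
			return NULL;
		}
		if (n == 0)	/* file is shorter than lseek() reported */
			break;
		done += (size_t)n;
	}

	*len_out = done;
	return buf;
}
```

Because pread() carries its own offset, leaving the file position at the end after lseek(SEEK_END) does no harm. In the proposed model the helper would be handed the handle fd returned by the ioctl, not the device fd.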
Re: Recent Power changes and stack_trace_save_tsk_reliable?
Michael Ellerman writes:
> Joe Lawrence writes:
>> Hi ppc-dev list,
>>
>> We noticed that our kpatch integration tests started failing on ppc64le
>> when targeting the upstream v6.4 kernel, and then confirmed that the
>> in-tree livepatching kselftests similarly fail, too. From the kselftest
>> results, it appears that livepatch transitions are no longer completing.
>
> Hi Joe,
>
> Thanks for the report.
>
> I thought I was running the livepatch tests, but looks like somewhere
> along the line my kernel .config lost CONFIG_TEST_LIVEPATCH=m, so I have
> been running the test but it just skips. :/
>
> I can reproduce the failure, and will see if I can bisect it more
> successfully.

It's caused by:

  eed7c420aac7 ("powerpc: copy_thread differentiate kthreads and user mode threads")

Which is obvious in hindsight :)

The diff below fixes it for me, can you test that on your setup?

A proper fix will need to be a bit bigger because the comments in there
are all slightly wrong now since the above commit.

Possibly we can also rework that code more substantially now that
copy_thread() is more careful about setting things up, but that would be
a follow-up.

diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index 5de8597eaab8..d0b3509f13ee 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -73,7 +73,7 @@ int __no_sanitize_address arch_stack_walk_reliable(stack_trace_consume_fn consum
 	bool firstframe;
 
 	stack_end = stack_page + THREAD_SIZE;
-	if (!is_idle_task(task)) {
+	if (!(task->flags & PF_KTHREAD)) {
 		/*
 		 * For user tasks, this is the SP value loaded on
 		 * kernel entry, see "PACAKSAVE(r13)" in _switch() and

cheers
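
To make the one-line change concrete: the old condition sent every non-idle task down the "user task" branch, which includes ordinary kernel threads; the new condition keys on the kthread flag instead. The sketch below is a user-space illustration only. The flag values, struct demo_task and both predicates are invented for the example and do not reproduce the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative flag bits; NOT the kernel's PF_* values. */
enum {
	DEMO_PF_IDLE    = 1u << 0,
	DEMO_PF_KTHREAD = 1u << 1,
};

/* Minimal stand-in for struct task_struct. */
struct demo_task {
	unsigned int flags;
};

/* Old condition: anything that is not the idle task takes the user-task branch. */
bool demo_old_takes_user_branch(const struct demo_task *t)
{
	return !(t->flags & DEMO_PF_IDLE);
}

/* New condition: only tasks that are not kernel threads take it. */
bool demo_new_takes_user_branch(const struct demo_task *t)
{
	return !(t->flags & DEMO_PF_KTHREAD);
}
```

The two predicates agree for a plain user task and for the idle task (modelled here as an idle kthread), and disagree only for a non-idle kernel thread, which is the case the quoted commit eed7c420aac7 changed.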