Re: [PATCH v2 RESEND] powerpc/audit: Convert powerpc to AUDIT_ARCH_COMPAT_GENERIC
Konstantin Ryabitsev writes:
> On Wed, Nov 03, 2021 at 10:18:57AM +1100, Michael Ellerman wrote:
>> It's not in next, that notification is from the b4 thanks script, which
>> didn't notice that the commit has since been reverted.
>
> Yeah... I'm not sure how to catch that, but I'm open to suggestions.

I think that's probably the first time I've had a commit and a revert of
the commit in the same batch of thanks mails. And the notification is not
wrong, the commit was applied with that SHA, it is in the tree.

So I'm not sure it's very common to have a commit & a revert in the tree
at the same time.

On the other hand being able to generate a mail for an arbitrary revert
would be helpful, ie. independent of any thanks state.

eg, picking a random commit from the past:

  e95ad5f21693 ("powerpc/head_check: Fix shellcheck errors")

If I revert that in my tree today, it'd be cool if I could run something
that would detect the revert, backtrack to the reverted commit, extract
the message-id from the Link: tag, and generate a reply to the original
submission noting that it's now been reverted.

In fact we could write a bot to do that across all commits ever ...

cheers
[PATCH] powerpc: use swap() to make code cleaner
From: Yang Guang

Use the macro 'swap()' defined in 'include/linux/minmax.h' to avoid
opencoding it.

Reported-by: Zeal Robot
Signed-off-by: Yang Guang
---
 arch/powerpc/kernel/fadump.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index b7ceb041743c..5b40e2d46090 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1265,7 +1265,6 @@ static void fadump_release_reserved_area(u64 start, u64 end)
 static void sort_and_merge_mem_ranges(struct fadump_mrange_info *mrange_info)
 {
 	struct fadump_memory_range *mem_ranges;
-	struct fadump_memory_range tmp_range;
 	u64 base, size;
 	int i, j, idx;
@@ -1281,9 +1280,7 @@ static void sort_and_merge_mem_ranges(struct fadump_mrange_info *mrange_info)
 			idx = j;
 		}
 		if (idx != i) {
-			tmp_range = mem_ranges[idx];
-			mem_ranges[idx] = mem_ranges[i];
-			mem_ranges[i] = tmp_range;
+			swap(mem_ranges[idx], mem_ranges[i]);
 		}
 	}
-- 
2.30.2
Re: [PATCH 2/2] soc: fsl: guts: Add a missing memory allocation failure check
Christophe JAILLET writes:
> If 'devm_kstrdup()' fails, we should return -ENOMEM.
>
> While at it, move the 'of_node_put()' call in the error handling path and
> after the 'machine' has been copied.
> Better safe than sorry.
>
> Suggested-by: Tyrel Datwyler
> Signed-off-by: Christophe JAILLET
> ---
> Not sure of which Fixes tag to add. Should be a6fc3b698130, but since
> another commit needs to be reverted for this patch to make sense, I'm
> unsure of what to do. :(
> So, none is given.

I think it's still correct to add:

  Fixes: a6fc3b698130 ("soc: fsl: add GUTS driver for QorIQ platforms")

That is where the bug was introduced, and adding the tag creates a link
between the fix and the bug, which is what we want.

The fact that it also requires the revert in order to apply is kind of
orthogonal, it means an automated backport of this commit will probably
fail, but that's OK it just means someone might have to do it manually.

There is some use of "Depends-on:" to flag a commit that is depended on,
but you can't use that in a patch submission because you don't know the
SHA of the parent commit. Possibly whoever applies this can add a
Depends-on: pointing to patch 1.
cheers

> ---
>  drivers/soc/fsl/guts.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/soc/fsl/guts.c b/drivers/soc/fsl/guts.c
> index af7741eafc57..5ed2fc1c53a0 100644
> --- a/drivers/soc/fsl/guts.c
> +++ b/drivers/soc/fsl/guts.c
> @@ -158,9 +158,14 @@ static int fsl_guts_probe(struct platform_device *pdev)
>  	root = of_find_node_by_path("/");
>  	if (of_property_read_string(root, "model", &machine))
>  		of_property_read_string_index(root, "compatible", 0, &machine);
> -	of_node_put(root);
> -	if (machine)
> +	if (machine) {
>  		soc_dev_attr.machine = devm_kstrdup(dev, machine, GFP_KERNEL);
> +		if (!soc_dev_attr.machine) {
> +			of_node_put(root);
> +			return -ENOMEM;
> +		}
> +	}
> +	of_node_put(root);
>
>  	svr = fsl_guts_get_svr();
>  	soc_die = fsl_soc_die_match(svr, fsl_soc_die);
> --
> 2.30.2
Re: [V3] powerpc/perf: Enable PMU counters post partition migration if PMU is active
Nathan Lynch writes:
> Nicholas Piggin writes:
>> Excerpts from Michael Ellerman's message of October 29, 2021 11:15 pm:
>>> Nicholas Piggin writes:
>>>> Excerpts from Athira Rajeev's message of October 29, 2021 1:05 pm:
>>>>> @@ -631,12 +632,18 @@ static int pseries_migrate_partition(u64 handle)
>>>>>  	if (ret)
>>>>>  		return ret;
>>>>>
>>>>> +	/* Disable PMU before suspend */
>>>>> +	on_each_cpu(&mobility_pmu_disable, NULL, 0);
>>>>
>>>> Why was this moved out of stop machine and to an IPI?
>>>>
>>>> My concern would be, what are the other CPUs doing at this time? Is it
>>>> possible they could take interrupts and schedule? Could that mess up
>>>> the perf state here?
>>>
>>> pseries_migrate_partition() is called directly from migration_store(),
>>> which is the sysfs store function, which can be called concurrently by
>>> different CPUs.
>>>
>>> It's also potentially called from rtas_syscall_dispatch_ibm_suspend_me(),
>>> from sys_rtas(), again with no locking.
>>>
>>> So we could have two CPUs calling into here at the same time, which
>>> might not crash, but is unlikely to work well.
>>>
>>> I think the lack of locking might have been OK in the past because only
>>> one CPU will successfully get the other CPUs to call do_join() in
>>> pseries_suspend(). But I could be wrong.
>>>
>>> Anyway, now that we're mutating the PMU state before suspending we need
>>> to be more careful. So I think we need a lock around the whole
>>> sequence.
>
> Regardless of the outcome here, generally agreed that some serialization
> should be imposed in this path. The way the platform works (and some
> extra measures by the drmgr utility) make it so that this code isn't
> entered concurrently in usual operation, but it's possible to make it
> happen if you are root.

Yeah I agree it's unlikely to be a problem in practice.

> A file-static mutex should be OK.

Ack.

>> My concern is still that we wouldn't necessarily have the other CPUs
>> under control at that point even if we serialize the migrate path.
>> They could take interrupts, possibly call into perf subsystem after
>> the mobility_pmu_disable (e.g., via syscall or context switch) which
>> might mess things up.
>>
>> I think the stop machine is a reasonable place for the code in this
>> case. It's a low level disabling of hardware facility and saving off
>> registers.
>
> That makes sense, but I can't help feeling concerned still. For this to
> be safe, power_pmu_enable() and power_pmu_disable() must never sleep or
> re-enable interrupts or send IPIs. I don't see anything obviously unsafe
> right now, but is that already part of their contract? Is there much
> risk they could change in the future to violate those constraints?
>
> That aside, the proposed change seems like we would be hacking around a
> more generic perf/pmu limitation in a powerpc-specific way. I see the
> same behavior on x86 across suspend/resume.
>
> # perf stat -a -e cache-misses -I 1000 & sleep 2 ; systemctl suspend ; sleep 20 ; kill $(jobs -p)
> [1] 189806
> #           time             counts unit events
>      1.000501710          9,983,649      cache-misses
>      2.002620321         14,131,072      cache-misses
>      3.004579071         23,010,971      cache-misses
>      9.971854783 140,737,491,680,853      cache-misses
>     10.982669250                  0      cache-misses
>     11.984660498                  0      cache-misses
>     12.986648392                  0      cache-misses
>     13.988561766                  0      cache-misses
>     14.992670615                  0      cache-misses
>     15.994938111                  0      cache-misses
>     16.996703952                  0      cache-misses
>     17.999092812                  0      cache-misses
>     19.000602677                  0      cache-misses
>     20.003272216                  0      cache-misses
>     21.004770295                  0      cache-misses
> # uname -r
> 5.13.19-100.fc33.x86_64

That is interesting. Athira, I guess we should bring that to the perf
maintainers and see if there's any interest in solving the issue in a
generic fashion.

cheers
[Bug 214913] [xfstests generic/051] BUG: Kernel NULL pointer dereference on read at 0x00000108 NIP [c0000000000372e4] tm_cgpr_active+0x14/0x40
https://bugzilla.kernel.org/show_bug.cgi?id=214913

Michael Ellerman (mich...@ellerman.id.au) changed:

           What    |Removed  |Added
           ------------------------------------------
           Status  |NEW      |ASSIGNED
           CC      |         |mich...@ellerman.id.au

--- Comment #2 from Michael Ellerman (mich...@ellerman.id.au) ---
Thanks for the report, I agree this looks like a powerpc bug not an XFS bug.

I won't have time to look at this until next week probably, unless someone
beats me to it.

-- 
You may reply to this email to add a comment.
You are receiving this mail because: You are watching the assignee of the bug.
[PATCH] use swap() to make code cleaner
From: Yang Guang

Use the macro 'swap()' defined in 'include/linux/minmax.h' to avoid
opencoding it.

Reported-by: Zeal Robot
Signed-off-by: Yang Guang
---
 drivers/macintosh/adbhid.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/macintosh/adbhid.c b/drivers/macintosh/adbhid.c
index 994ba5cb3678..379ab49a8107 100644
--- a/drivers/macintosh/adbhid.c
+++ b/drivers/macintosh/adbhid.c
@@ -817,9 +817,7 @@ adbhid_input_register(int id, int default_id, int original_handler_id,
 	case 0xC4: case 0xC7:
 		keyboard_type = "ISO, swapping keys";
 		input_dev->id.version = ADB_KEYBOARD_ISO;
-		i = hid->keycode[10];
-		hid->keycode[10] = hid->keycode[50];
-		hid->keycode[50] = i;
+		swap(hid->keycode[10], hid->keycode[50]);
 		break;

 	case 0x12: case 0x15: case 0x16: case 0x17: case 0x1A:
-- 
2.30.2
[PATCH] powerpc: use swap() to make code cleaner
From: Yang Guang

Use the macro 'swap()' defined in 'include/linux/minmax.h' to avoid
opencoding it.

Reported-by: Zeal Robot
Signed-off-by: Yang Guang
---
 arch/powerpc/platforms/powermac/pic.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powermac/pic.c b/arch/powerpc/platforms/powermac/pic.c
index 4921bccf0376..75d8d7ec53db 100644
--- a/arch/powerpc/platforms/powermac/pic.c
+++ b/arch/powerpc/platforms/powermac/pic.c
@@ -311,11 +311,8 @@ static void __init pmac_pic_probe_oldstyle(void)

 	/* Check ordering of master & slave */
 	if (of_device_is_compatible(master, "gatwick")) {
-		struct device_node *tmp;
 		BUG_ON(slave == NULL);
-		tmp = master;
-		master = slave;
-		slave = tmp;
+		swap(master, slave);
 	}

 	/* We found a slave */
-- 
2.30.2
Re: Fwd: Fwd: X stopped working with 5.14 on iBook
On Nov 04 2021, Finn Thain wrote:
> Does your Xorg installation use --enable-suid-wrapper, Andreas?

Doesn't look like.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
Re: Fwd: Fwd: X stopped working with 5.14 on iBook
On Wed, 3 Nov 2021, Andreas Schwab wrote:
> On Nov 02 2021, Finn Thain wrote:
>
>> After many builds and tests, Stan and I were able to determine that this
>> regression only affects builds with CONFIG_USER_NS=y. That is,
>>
>> d3ccc9781560  + CONFIG_USER_NS=y --> fail
>> d3ccc9781560  + CONFIG_USER_NS=n --> okay
>> d3ccc9781560~ + CONFIG_USER_NS=y --> okay
>> d3ccc9781560~ + CONFIG_USER_NS=n --> okay
>
> On my iBook G4, X is working alright with 5.15 and CONFIG_USER_NS=y.

Stan said his Cube has these packages installed:

# dpkg --list | grep Xorg
ii  xserver-xorg-core    2:1.20.11-1  powerpc  Xorg X server - core server
ii  xserver-xorg-legacy  2:1.20.11-1  powerpc  setuid root Xorg server wrapper

I gather that Riccardo also runs Debian SID. Perhaps there is some
interaction between d3ccc9781560, CONFIG_USER_NS and the SUID wrapper...

Does your Xorg installation use --enable-suid-wrapper, Andreas?
Re: [PATCH -next] powerpc/44x/fsp2: add missing of_node_put
On 2021/11/2 6:12 pm, Michael Ellerman wrote:
> Early exits from for_each_compatible_node() should decrement the node
> reference counter. Reported by Coccinelle:
>
> ./arch/powerpc/platforms/44x/fsp2.c:206:1-25: WARNING: Function
> "for_each_compatible_node" should have of_node_put() before return
> around line 218.
> [...]
> Applied to powerpc/next.
>
> [1/1] powerpc/44x/fsp2: add missing of_node_put
>       https://git.kernel.org/powerpc/c/290fe8aa69ef5c51c778c0bb33f8ef0181c769f5

Thanks. :-)

Bixuan Cui
Re: ppc64le STRICT_MODULE_RWX and livepatch apply_relocate_add() crashes
Hi Russell,

On Mon, 2021-11-01 at 19:20 +1000, Russell Currey wrote:
> On Sun, 2021-10-31 at 22:43 -0400, Joe Lawrence wrote:
>> Starting with 5.14 kernels, I can reliably reproduce a crash [1] on
>> ppc64le when loading livepatches containing late klp-relocations [2].
>> These are relocations, specific to livepatching, that are resolved not
>> when a livepatch module is loaded, but only when a livepatch-target
>> module is loaded.
>
> Hey Joe, thanks for the report.
>
>> I haven't started looking at a fix yet, but in the case of the x86 code
>> update, its apply_relocate_add() implementation was modified to use a
>> common text_poke() function to allow us to drop module_{en,dis}able_ro()
>> games by the livepatching code.
>
> It should be a similar fix for Power, our patch_instruction() uses a
> text poke area but apply_relocate_add() doesn't use it and does its own
> raw patching instead.
>
>> I can take a closer look this week, but thought I'd send out a report
>> in case this may be a known todo for STRICT_MODULE_RWX on Power.
>
> I'm looking into this now, will update when there's progress. I
> personally wasn't aware but Jordan flagged this as an issue back in
> August [0]. Are the selftests in the klp-convert tree sufficient for
> testing? I'm not especially familiar with livepatching & haven't used
> the userspace tools.

You can test this by livepatching any module, since this only occurs when
writing relocations for modules; the vmlinux relocations are written
earlier, before the module text is mapped read-only.

- Suraj

> - Russell
>
> [0] https://github.com/linuxppc/issues/issues/375
>
>> -- Joe
Re: [V3] powerpc/perf: Enable PMU counters post partition migration if PMU is active
Nicholas Piggin writes:
> Excerpts from Michael Ellerman's message of October 29, 2021 11:15 pm:
>> Nicholas Piggin writes:
>>> Excerpts from Athira Rajeev's message of October 29, 2021 1:05 pm:
>>>> @@ -631,12 +632,18 @@ static int pseries_migrate_partition(u64 handle)
>>>>  	if (ret)
>>>>  		return ret;
>>>>
>>>> +	/* Disable PMU before suspend */
>>>> +	on_each_cpu(&mobility_pmu_disable, NULL, 0);
>>>
>>> Why was this moved out of stop machine and to an IPI?
>>>
>>> My concern would be, what are the other CPUs doing at this time? Is it
>>> possible they could take interrupts and schedule? Could that mess up
>>> the perf state here?
>>
>> pseries_migrate_partition() is called directly from migration_store(),
>> which is the sysfs store function, which can be called concurrently by
>> different CPUs.
>>
>> It's also potentially called from rtas_syscall_dispatch_ibm_suspend_me(),
>> from sys_rtas(), again with no locking.
>>
>> So we could have two CPUs calling into here at the same time, which
>> might not crash, but is unlikely to work well.
>>
>> I think the lack of locking might have been OK in the past because only
>> one CPU will successfully get the other CPUs to call do_join() in
>> pseries_suspend(). But I could be wrong.
>>
>> Anyway, now that we're mutating the PMU state before suspending we need
>> to be more careful. So I think we need a lock around the whole
>> sequence.

Regardless of the outcome here, generally agreed that some serialization
should be imposed in this path. The way the platform works (and some
extra measures by the drmgr utility) make it so that this code isn't
entered concurrently in usual operation, but it's possible to make it
happen if you are root.

A file-static mutex should be OK.

> My concern is still that we wouldn't necessarily have the other CPUs
> under control at that point even if we serialize the migrate path.
> They could take interrupts, possibly call into perf subsystem after
> the mobility_pmu_disable (e.g., via syscall or context switch) which
> might mess things up.
>
> I think the stop machine is a reasonable place for the code in this
> case. It's a low level disabling of hardware facility and saving off
> registers.

That makes sense, but I can't help feeling concerned still. For this to
be safe, power_pmu_enable() and power_pmu_disable() must never sleep or
re-enable interrupts or send IPIs. I don't see anything obviously unsafe
right now, but is that already part of their contract? Is there much
risk they could change in the future to violate those constraints?

That aside, the proposed change seems like we would be hacking around a
more generic perf/pmu limitation in a powerpc-specific way. I see the
same behavior on x86 across suspend/resume.

# perf stat -a -e cache-misses -I 1000 & sleep 2 ; systemctl suspend ; sleep 20 ; kill $(jobs -p)
[1] 189806
#           time             counts unit events
     1.000501710          9,983,649      cache-misses
     2.002620321         14,131,072      cache-misses
     3.004579071         23,010,971      cache-misses
     9.971854783 140,737,491,680,853      cache-misses
    10.982669250                  0      cache-misses
    11.984660498                  0      cache-misses
    12.986648392                  0      cache-misses
    13.988561766                  0      cache-misses
    14.992670615                  0      cache-misses
    15.994938111                  0      cache-misses
    16.996703952                  0      cache-misses
    17.999092812                  0      cache-misses
    19.000602677                  0      cache-misses
    20.003272216                  0      cache-misses
    21.004770295                  0      cache-misses
# uname -r
5.13.19-100.fc33.x86_64
[PATCH stable 4.9 2/2] arch: pgtable: define MAX_POSSIBLE_PHYSMEM_BITS where needed
From: Arnd Bergmann

[ Upstream commit cef397038167ac15d085914493d6c86385773709 ]

Stefan Agner reported a bug when using zsram on 32-bit Arm machines
with RAM above the 4GB address boundary:

  Unable to handle kernel NULL pointer dereference at virtual address
  pgd = a27bd01c
  [] *pgd=236a0003, *pmd=1ffa64003
  Internal error: Oops: 207 [#1] SMP ARM
  Modules linked in: mdio_bcm_unimac(+) brcmfmac cfg80211 brcmutil
  raspberrypi_hwmon hci_uart crc32_arm_ce bcm2711_thermal phy_generic genet
  CPU: 0 PID: 123 Comm: mkfs.ext4 Not tainted 5.9.6 #1
  Hardware name: BCM2711
  PC is at zs_map_object+0x94/0x338
  LR is at zram_bvec_rw.constprop.0+0x330/0xa64
  pc : []    lr : []    psr: 6013
  sp : e376bbe0  ip :           fp : c1e2921c
  r10: 0002  r9 : c1dda730  r8 :
  r7 : e8ff7a00  r6 :           r5 : 02f9ffa0  r4 : e371
  r3 : 000fdffe  r2 : c1e0ce80  r1 : ebf979a0  r0 :
  Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
  Control: 30c5383d  Table: 235c2a80  DAC: fffd
  Process mkfs.ext4 (pid: 123, stack limit = 0x495a22e6)
  Stack: (0xe376bbe0 to 0xe376c000)

As it turns out, zsram needs to know the maximum memory size, which
is defined in MAX_PHYSMEM_BITS when CONFIG_SPARSEMEM is set, or in
MAX_POSSIBLE_PHYSMEM_BITS on the x86 architecture.

The same problem will be hit on all 32-bit architectures that have a
physical address space larger than 4GB and happen to not enable
sparsemem and include asm/sparsemem.h from asm/pgtable.h.

After the initial discussion, I suggested just always defining
MAX_POSSIBLE_PHYSMEM_BITS whenever CONFIG_PHYS_ADDR_T_64BIT is set, or
provoking a build error otherwise. This addresses all configurations
that can currently have this runtime bug, but leaves all other
configurations unchanged.

I looked up the possible number of bits in source code and datasheets,
here is what I found:

 - on ARC, CONFIG_ARC_HAS_PAE40 controls whether 32 or 40 bits are used
 - on ARM, CONFIG_LPAE enables 40 bit addressing, without it we never
   support more than 32 bits, even though supersections in theory allow
   up to 40 bits as well.
 - on MIPS, some MIPS32r1 or later chips support 36 bits, and MIPS32r5
   XPA supports up to 60 bits in theory, but 40 bits are more than
   anyone will ever ship
 - On PowerPC, there are three different implementations of 36 bit
   addressing, but 32-bit is used without CONFIG_PTE_64BIT
 - On RISC-V, the normal page table format can support 34 bit
   addressing. There is no highmem support on RISC-V, so anything above
   2GB is unused, but it might be useful to eventually support
   CONFIG_ZRAM for high pages.

Fixes: 61989a80fb3a ("staging: zsmalloc: zsmalloc memory allocation library")
Fixes: 02390b87a945 ("mm/zsmalloc: Prepare to variable MAX_PHYSMEM_BITS")
Acked-by: Thomas Bogendoerfer
Reviewed-by: Stefan Agner
Tested-by: Stefan Agner
Acked-by: Mike Rapoport
Link: https://lore.kernel.org/linux-mm/bdfa44bf1c570b05d6c70898e2bbb0acf234ecdf.1604762181.git.ste...@agner.ch/
Signed-off-by: Arnd Bergmann
Signed-off-by: Sasha Levin
[florian: patch arch/powerpc/include/asm/pte-common.h for 4.9.y
 removed arch/riscv/include/asm/pgtable.h which does not exist]
Signed-off-by: Florian Fainelli
---
 arch/arc/include/asm/pgtable.h        |  2 ++
 arch/arm/include/asm/pgtable-2level.h |  2 ++
 arch/arm/include/asm/pgtable-3level.h |  2 ++
 arch/mips/include/asm/pgtable-32.h    |  3 +++
 arch/powerpc/include/asm/pte-common.h |  2 ++
 include/asm-generic/pgtable.h         | 13 +++++++++++++
 6 files changed, 24 insertions(+)

diff --git a/arch/arc/include/asm/pgtable.h b/arch/arc/include/asm/pgtable.h
index c10f5cb203e6..81198a6773c6 100644
--- a/arch/arc/include/asm/pgtable.h
+++ b/arch/arc/include/asm/pgtable.h
@@ -137,8 +137,10 @@

 #ifdef CONFIG_ARC_HAS_PAE40
 #define PTE_BITS_NON_RWX_IN_PD1	(0xff | PAGE_MASK | _PAGE_CACHEABLE)
+#define MAX_POSSIBLE_PHYSMEM_BITS	40
 #else
 #define PTE_BITS_NON_RWX_IN_PD1	(PAGE_MASK | _PAGE_CACHEABLE)
+#define MAX_POSSIBLE_PHYSMEM_BITS	32
 #endif

 /**
diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index 92fd2c8a9af0..6154902bed83 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -78,6 +78,8 @@
 #define PTE_HWTABLE_OFF		(PTE_HWTABLE_PTRS * sizeof(pte_t))
 #define PTE_HWTABLE_SIZE	(PTRS_PER_PTE * sizeof(u32))

+#define MAX_POSSIBLE_PHYSMEM_BITS	32
+
 /*
  * PMD_SHIFT determines the size of the area a second-level page table can map
  * PGDIR_SHIFT determines what a third-level page table entry can map
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 2a029bceaf2f..35807e611b6e 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -37,6 +37,8 @@
 #define PTE_HWTABLE_
[PATCH stable 4.9 1/2] mm/zsmalloc: Prepare to variable MAX_PHYSMEM_BITS
From: "Kirill A. Shutemov" commit 02390b87a9459937cdb299e6b34ff33992512ec7 upstream With boot-time switching between paging mode we will have variable MAX_PHYSMEM_BITS. Let's use the maximum variable possible for CONFIG_X86_5LEVEL=y configuration to define zsmalloc data structures. The patch introduces MAX_POSSIBLE_PHYSMEM_BITS to cover such case. It also suits well to handle PAE special case. Signed-off-by: Kirill A. Shutemov Reviewed-by: Nitin Gupta Acked-by: Minchan Kim Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Sergey Senozhatsky Cc: Thomas Gleixner Cc: linux...@kvack.org Link: http://lkml.kernel.org/r/20180214111656.88514-3-kirill.shute...@linux.intel.com Signed-off-by: Ingo Molnar [florian: drop arch/x86/include/asm/pgtable_64_types.h changes since there is no CONFIG_X86_5LEVEL] Signed-off-by: Florian Fainelli --- arch/x86/include/asm/pgtable-3level_types.h | 1 + mm/zsmalloc.c | 13 +++-- 2 files changed, 8 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/pgtable-3level_types.h b/arch/x86/include/asm/pgtable-3level_types.h index bcc89625ebe5..f3f719d59e61 100644 --- a/arch/x86/include/asm/pgtable-3level_types.h +++ b/arch/x86/include/asm/pgtable-3level_types.h @@ -42,5 +42,6 @@ typedef union { */ #define PTRS_PER_PTE 512 +#define MAX_POSSIBLE_PHYSMEM_BITS 36 #endif /* _ASM_X86_PGTABLE_3LEVEL_DEFS_H */ diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index 8db3c2b27a17..2b7bfd97587a 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -83,18 +83,19 @@ * This is made more complicated by various memory models and PAE. 
*/ -#ifndef MAX_PHYSMEM_BITS -#ifdef CONFIG_HIGHMEM64G -#define MAX_PHYSMEM_BITS 36 -#else /* !CONFIG_HIGHMEM64G */ +#ifndef MAX_POSSIBLE_PHYSMEM_BITS +#ifdef MAX_PHYSMEM_BITS +#define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS +#else /* * If this definition of MAX_PHYSMEM_BITS is used, OBJ_INDEX_BITS will just * be PAGE_SHIFT */ -#define MAX_PHYSMEM_BITS BITS_PER_LONG +#define MAX_POSSIBLE_PHYSMEM_BITS BITS_PER_LONG #endif #endif -#define _PFN_BITS (MAX_PHYSMEM_BITS - PAGE_SHIFT) + +#define _PFN_BITS (MAX_POSSIBLE_PHYSMEM_BITS - PAGE_SHIFT) /* * Memory for allocating for handle keeps object position by -- 2.25.1
[PATCH stable 4.9 0/2] zsmalloc MAX_POSSIBLE_PHYSMEM_BITS
This patch series is a backport necessary to address the problem
reported by Stefan Agner:

https://lore.kernel.org/linux-mm/bdfa44bf1c570b05d6c70898e2bbb0acf234ecdf.1604762181.git.ste...@agner.ch/

but which ended up being addressed by Arnd in a slightly different way
from Stefan's submission.

The first patch from Kirill is backported in order to have
MAX_POSSIBLE_PHYSMEM_BITS be acted on by the zsmalloc.c code.

Arnd Bergmann (1):
  arch: pgtable: define MAX_POSSIBLE_PHYSMEM_BITS where needed

Kirill A. Shutemov (1):
  mm/zsmalloc: Prepare to variable MAX_PHYSMEM_BITS

 arch/arc/include/asm/pgtable.h              |  2 ++
 arch/arm/include/asm/pgtable-2level.h       |  2 ++
 arch/arm/include/asm/pgtable-3level.h       |  2 ++
 arch/mips/include/asm/pgtable-32.h          |  3 +++
 arch/powerpc/include/asm/pte-common.h       |  2 ++
 arch/x86/include/asm/pgtable-3level_types.h |  1 +
 include/asm-generic/pgtable.h               | 13 +++++++++++++
 mm/zsmalloc.c                               | 13 +++++++------
 8 files changed, 32 insertions(+), 6 deletions(-)

-- 
2.25.1
[PATCH stable 4.14 2/2] arch: pgtable: define MAX_POSSIBLE_PHYSMEM_BITS where needed
From: Arnd Bergmann

[ Upstream commit cef397038167ac15d085914493d6c86385773709 ]

Stefan Agner reported a bug when using zsram on 32-bit Arm machines
with RAM above the 4GB address boundary:

  Unable to handle kernel NULL pointer dereference at virtual address
  pgd = a27bd01c
  [] *pgd=236a0003, *pmd=1ffa64003
  Internal error: Oops: 207 [#1] SMP ARM
  Modules linked in: mdio_bcm_unimac(+) brcmfmac cfg80211 brcmutil
  raspberrypi_hwmon hci_uart crc32_arm_ce bcm2711_thermal phy_generic genet
  CPU: 0 PID: 123 Comm: mkfs.ext4 Not tainted 5.9.6 #1
  Hardware name: BCM2711
  PC is at zs_map_object+0x94/0x338
  LR is at zram_bvec_rw.constprop.0+0x330/0xa64
  pc : []    lr : []    psr: 6013
  sp : e376bbe0  ip :           fp : c1e2921c
  r10: 0002  r9 : c1dda730  r8 :
  r7 : e8ff7a00  r6 :           r5 : 02f9ffa0  r4 : e371
  r3 : 000fdffe  r2 : c1e0ce80  r1 : ebf979a0  r0 :
  Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
  Control: 30c5383d  Table: 235c2a80  DAC: fffd
  Process mkfs.ext4 (pid: 123, stack limit = 0x495a22e6)
  Stack: (0xe376bbe0 to 0xe376c000)

As it turns out, zsram needs to know the maximum memory size, which
is defined in MAX_PHYSMEM_BITS when CONFIG_SPARSEMEM is set, or in
MAX_POSSIBLE_PHYSMEM_BITS on the x86 architecture.

The same problem will be hit on all 32-bit architectures that have a
physical address space larger than 4GB and happen to not enable
sparsemem and include asm/sparsemem.h from asm/pgtable.h.

After the initial discussion, I suggested just always defining
MAX_POSSIBLE_PHYSMEM_BITS whenever CONFIG_PHYS_ADDR_T_64BIT is set, or
provoking a build error otherwise. This addresses all configurations
that can currently have this runtime bug, but leaves all other
configurations unchanged.

I looked up the possible number of bits in source code and datasheets,
here is what I found:

 - on ARC, CONFIG_ARC_HAS_PAE40 controls whether 32 or 40 bits are used
 - on ARM, CONFIG_LPAE enables 40 bit addressing, without it we never
   support more than 32 bits, even though supersections in theory allow
   up to 40 bits as well.
 - on MIPS, some MIPS32r1 or later chips support 36 bits, and MIPS32r5
   XPA supports up to 60 bits in theory, but 40 bits are more than
   anyone will ever ship
 - On PowerPC, there are three different implementations of 36 bit
   addressing, but 32-bit is used without CONFIG_PTE_64BIT
 - On RISC-V, the normal page table format can support 34 bit
   addressing. There is no highmem support on RISC-V, so anything above
   2GB is unused, but it might be useful to eventually support
   CONFIG_ZRAM for high pages.

Fixes: 61989a80fb3a ("staging: zsmalloc: zsmalloc memory allocation library")
Fixes: 02390b87a945 ("mm/zsmalloc: Prepare to variable MAX_PHYSMEM_BITS")
Acked-by: Thomas Bogendoerfer
Reviewed-by: Stefan Agner
Tested-by: Stefan Agner
Acked-by: Mike Rapoport
Link: https://lore.kernel.org/linux-mm/bdfa44bf1c570b05d6c70898e2bbb0acf234ecdf.1604762181.git.ste...@agner.ch/
Signed-off-by: Arnd Bergmann
Signed-off-by: Sasha Levin
[florian: patch arch/powerpc/include/asm/pte-common.h for 4.14.y
 removed arch/riscv/include/asm/pgtable.h which does not exist]
Signed-off-by: Florian Fainelli
---
 arch/arc/include/asm/pgtable.h        |  2 ++
 arch/arm/include/asm/pgtable-2level.h |  2 ++
 arch/arm/include/asm/pgtable-3level.h |  2 ++
 arch/mips/include/asm/pgtable-32.h    |  3 +++
 arch/powerpc/include/asm/pte-common.h |  2 ++
 include/asm-generic/pgtable.h         | 13 +++++++++++++
 6 files changed, 24 insertions(+)

diff --git a/arch/arc/include/asm/pgtable.h b/arch/arc/include/asm/pgtable.h
index 77676e18da69..a31ae69da639 100644
--- a/arch/arc/include/asm/pgtable.h
+++ b/arch/arc/include/asm/pgtable.h
@@ -138,8 +138,10 @@

 #ifdef CONFIG_ARC_HAS_PAE40
 #define PTE_BITS_NON_RWX_IN_PD1	(0xff | PAGE_MASK | _PAGE_CACHEABLE)
+#define MAX_POSSIBLE_PHYSMEM_BITS	40
 #else
 #define PTE_BITS_NON_RWX_IN_PD1	(PAGE_MASK | _PAGE_CACHEABLE)
+#define MAX_POSSIBLE_PHYSMEM_BITS	32
 #endif

 /**
diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index 92fd2c8a9af0..6154902bed83 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -78,6 +78,8 @@
 #define PTE_HWTABLE_OFF		(PTE_HWTABLE_PTRS * sizeof(pte_t))
 #define PTE_HWTABLE_SIZE	(PTRS_PER_PTE * sizeof(u32))

+#define MAX_POSSIBLE_PHYSMEM_BITS	32
+
 /*
  * PMD_SHIFT determines the size of the area a second-level page table can map
  * PGDIR_SHIFT determines what a third-level page table entry can map
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 2a029bceaf2f..35807e611b6e 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -37,6 +37,8 @@
 #define PTE_HWTABLE
[PATCH stable 4.14 1/2] mm/zsmalloc: Prepare to variable MAX_PHYSMEM_BITS
From: "Kirill A. Shutemov" commit 02390b87a9459937cdb299e6b34ff33992512ec7 upstream With boot-time switching between paging mode we will have variable MAX_PHYSMEM_BITS. Let's use the maximum variable possible for CONFIG_X86_5LEVEL=y configuration to define zsmalloc data structures. The patch introduces MAX_POSSIBLE_PHYSMEM_BITS to cover such case. It also suits well to handle PAE special case. Signed-off-by: Kirill A. Shutemov Reviewed-by: Nitin Gupta Acked-by: Minchan Kim Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Sergey Senozhatsky Cc: Thomas Gleixner Cc: linux...@kvack.org Link: http://lkml.kernel.org/r/20180214111656.88514-3-kirill.shute...@linux.intel.com Signed-off-by: Ingo Molnar Signed-off-by: Florian Fainelli --- arch/x86/include/asm/pgtable-3level_types.h | 1 + arch/x86/include/asm/pgtable_64_types.h | 2 ++ mm/zsmalloc.c | 13 +++-- 3 files changed, 10 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/pgtable-3level_types.h b/arch/x86/include/asm/pgtable-3level_types.h index 876b4c77d983..6a59a6d0cc50 100644 --- a/arch/x86/include/asm/pgtable-3level_types.h +++ b/arch/x86/include/asm/pgtable-3level_types.h @@ -44,5 +44,6 @@ typedef union { */ #define PTRS_PER_PTE 512 +#define MAX_POSSIBLE_PHYSMEM_BITS 36 #endif /* _ASM_X86_PGTABLE_3LEVEL_DEFS_H */ diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h index bf6d2692fc60..2bd79b7ae9d6 100644 --- a/arch/x86/include/asm/pgtable_64_types.h +++ b/arch/x86/include/asm/pgtable_64_types.h @@ -40,6 +40,8 @@ typedef struct { pteval_t pte; } pte_t; #define P4D_SIZE (_AC(1, UL) << P4D_SHIFT) #define P4D_MASK (~(P4D_SIZE - 1)) +#define MAX_POSSIBLE_PHYSMEM_BITS 52 + #else /* CONFIG_X86_5LEVEL */ /* diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index 6ed736ea9b59..633ebcac82f8 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -83,18 +83,19 @@ * This is made more complicated by various memory models and PAE. 
*/ -#ifndef MAX_PHYSMEM_BITS -#ifdef CONFIG_HIGHMEM64G -#define MAX_PHYSMEM_BITS 36 -#else /* !CONFIG_HIGHMEM64G */ +#ifndef MAX_POSSIBLE_PHYSMEM_BITS +#ifdef MAX_PHYSMEM_BITS +#define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS +#else /* * If this definition of MAX_PHYSMEM_BITS is used, OBJ_INDEX_BITS will just * be PAGE_SHIFT */ -#define MAX_PHYSMEM_BITS BITS_PER_LONG +#define MAX_POSSIBLE_PHYSMEM_BITS BITS_PER_LONG #endif #endif -#define _PFN_BITS (MAX_PHYSMEM_BITS - PAGE_SHIFT) + +#define _PFN_BITS (MAX_POSSIBLE_PHYSMEM_BITS - PAGE_SHIFT) /* * Memory for allocating for handle keeps object position by -- 2.25.1
[PATCH stable 4.14 0/2] zsmalloc MAX_POSSIBLE_PHYSMEM_BITS
This patch series is a backport necessary to address the problem reported by Stefan Agner: https://lore.kernel.org/linux-mm/bdfa44bf1c570b05d6c70898e2bbb0acf234ecdf.1604762181.git.ste...@agner.ch/ but which ended up being addressed by Arnd in a slightly different way from Stefan's submission. The first patch from Kirill is backported in order to have MAX_POSSIBLE_PHYSMEM_BITS be acted on by the zsmalloc.c code. Arnd Bergmann (1): arch: pgtable: define MAX_POSSIBLE_PHYSMEM_BITS where needed Kirill A. Shutemov (1): mm/zsmalloc: Prepare to variable MAX_PHYSMEM_BITS arch/arc/include/asm/pgtable.h | 2 ++ arch/arm/include/asm/pgtable-2level.h | 2 ++ arch/arm/include/asm/pgtable-3level.h | 2 ++ arch/mips/include/asm/pgtable-32.h | 3 +++ arch/powerpc/include/asm/pte-common.h | 2 ++ arch/x86/include/asm/pgtable-3level_types.h | 1 + arch/x86/include/asm/pgtable_64_types.h | 2 ++ include/asm-generic/pgtable.h | 13 + mm/zsmalloc.c | 13 +++-- 9 files changed, 34 insertions(+), 6 deletions(-) -- 2.25.1
[PATCH stable 4.19 1/1] arch: pgtable: define MAX_POSSIBLE_PHYSMEM_BITS where needed
From: Arnd Bergmann [ Upstream commit cef397038167ac15d085914493d6c86385773709 ] Stefan Agner reported a bug when using zsram on 32-bit Arm machines with RAM above the 4GB address boundary: Unable to handle kernel NULL pointer dereference at virtual address pgd = a27bd01c [] *pgd=236a0003, *pmd=1ffa64003 Internal error: Oops: 207 [#1] SMP ARM Modules linked in: mdio_bcm_unimac(+) brcmfmac cfg80211 brcmutil raspberrypi_hwmon hci_uart crc32_arm_ce bcm2711_thermal phy_generic genet CPU: 0 PID: 123 Comm: mkfs.ext4 Not tainted 5.9.6 #1 Hardware name: BCM2711 PC is at zs_map_object+0x94/0x338 LR is at zram_bvec_rw.constprop.0+0x330/0xa64 pc : []lr : []psr: 6013 sp : e376bbe0 ip : fp : c1e2921c r10: 0002 r9 : c1dda730 r8 : r7 : e8ff7a00 r6 : r5 : 02f9ffa0 r4 : e371 r3 : 000fdffe r2 : c1e0ce80 r1 : ebf979a0 r0 : Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 30c5383d Table: 235c2a80 DAC: fffd Process mkfs.ext4 (pid: 123, stack limit = 0x495a22e6) Stack: (0xe376bbe0 to 0xe376c000) As it turns out, zsram needs to know the maximum memory size, which is defined in MAX_PHYSMEM_BITS when CONFIG_SPARSEMEM is set, or in MAX_POSSIBLE_PHYSMEM_BITS on the x86 architecture. The same problem will be hit on all 32-bit architectures that have a physical address space larger than 4GB and happen to not enable sparsemem and include asm/sparsemem.h from asm/pgtable.h. After the initial discussion, I suggested just always defining MAX_POSSIBLE_PHYSMEM_BITS whenever CONFIG_PHYS_ADDR_T_64BIT is set, or provoking a build error otherwise. This addresses all configurations that can currently have this runtime bug, but leaves all other configurations unchanged. 
I looked up the possible number of bits in source code and datasheets, here is what I found: - on ARC, CONFIG_ARC_HAS_PAE40 controls whether 32 or 40 bits are used - on ARM, CONFIG_LPAE enables 40 bit addressing, without it we never support more than 32 bits, even though supersections in theory allow up to 40 bits as well. - on MIPS, some MIPS32r1 or later chips support 36 bits, and MIPS32r5 XPA supports up to 60 bits in theory, but 40 bits are more than anyone will ever ship - On PowerPC, there are three different implementations of 36 bit addressing, but 32-bit is used without CONFIG_PTE_64BIT - On RISC-V, the normal page table format can support 34 bit addressing. There is no highmem support on RISC-V, so anything above 2GB is unused, but it might be useful to eventually support CONFIG_ZRAM for high pages. Fixes: 61989a80fb3a ("staging: zsmalloc: zsmalloc memory allocation library") Fixes: 02390b87a945 ("mm/zsmalloc: Prepare to variable MAX_PHYSMEM_BITS") Acked-by: Thomas Bogendoerfer Reviewed-by: Stefan Agner Tested-by: Stefan Agner Acked-by: Mike Rapoport Link: https://lore.kernel.org/linux-mm/bdfa44bf1c570b05d6c70898e2bbb0acf234ecdf.1604762181.git.ste...@agner.ch/ Signed-off-by: Arnd Bergmann Signed-off-by: Sasha Levin [florian: patch arch/powerpc/include/asm/pte-common.h for 4.19.y] Signed-off-by: Florian Fainelli --- arch/arc/include/asm/pgtable.h| 2 ++ arch/arm/include/asm/pgtable-2level.h | 2 ++ arch/arm/include/asm/pgtable-3level.h | 2 ++ arch/mips/include/asm/pgtable-32.h| 3 +++ arch/powerpc/include/asm/pte-common.h | 2 ++ arch/riscv/include/asm/pgtable-32.h | 2 ++ include/asm-generic/pgtable.h | 13 + 7 files changed, 26 insertions(+) diff --git a/arch/arc/include/asm/pgtable.h b/arch/arc/include/asm/pgtable.h index cf4be70d5892..f231963b4011 100644 --- a/arch/arc/include/asm/pgtable.h +++ b/arch/arc/include/asm/pgtable.h @@ -138,8 +138,10 @@ #ifdef CONFIG_ARC_HAS_PAE40 #define PTE_BITS_NON_RWX_IN_PD1(0xff | PAGE_MASK | _PAGE_CACHEABLE) +#define 
MAX_POSSIBLE_PHYSMEM_BITS 40 #else #define PTE_BITS_NON_RWX_IN_PD1(PAGE_MASK | _PAGE_CACHEABLE) +#define MAX_POSSIBLE_PHYSMEM_BITS 32 #endif /** diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h index 12659ce5c1f3..90bf19d99378 100644 --- a/arch/arm/include/asm/pgtable-2level.h +++ b/arch/arm/include/asm/pgtable-2level.h @@ -78,6 +78,8 @@ #define PTE_HWTABLE_OFF(PTE_HWTABLE_PTRS * sizeof(pte_t)) #define PTE_HWTABLE_SIZE (PTRS_PER_PTE * sizeof(u32)) +#define MAX_POSSIBLE_PHYSMEM_BITS 32 + /* * PMD_SHIFT determines the size of the area a second-level page table can map * PGDIR_SHIFT determines what a third-level page table entry can map diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h index 6d50a11d7793..7ba08dd650e3 100644 --- a/arch/arm/include/asm/pgtable-3level.h +++ b/arch/arm/include/asm/pgtable-3level.h @@ -37,6 +37,8 @@ #define PTE_HWTABLE_OFF
[PATCH 2/2] soc: fsl: guts: Add a missing memory allocation failure check
If 'devm_kstrdup()' fails, we should return -ENOMEM. While at it, move the 'of_node_put()' call into the error handling path and after the 'machine' string has been copied. Better safe than sorry. Suggested-by: Tyrel Datwyler Signed-off-by: Christophe JAILLET --- Not sure of which Fixes tag to add. Should be a6fc3b698130, but since another commit needs to be reverted for this patch to make sense, I'm unsure of what to do. :( So, none is given. --- drivers/soc/fsl/guts.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/soc/fsl/guts.c b/drivers/soc/fsl/guts.c index af7741eafc57..5ed2fc1c53a0 100644 --- a/drivers/soc/fsl/guts.c +++ b/drivers/soc/fsl/guts.c @@ -158,9 +158,14 @@ static int fsl_guts_probe(struct platform_device *pdev) root = of_find_node_by_path("/"); if (of_property_read_string(root, "model", &machine)) of_property_read_string_index(root, "compatible", 0, &machine); - of_node_put(root); - if (machine) + if (machine) { soc_dev_attr.machine = devm_kstrdup(dev, machine, GFP_KERNEL); + if (!soc_dev_attr.machine) { + of_node_put(root); + return -ENOMEM; + } + } + of_node_put(root); svr = fsl_guts_get_svr(); soc_die = fsl_soc_die_match(svr, fsl_soc_die); -- 2.30.2
[PATCH 1/2] soc: fsl: guts: Revert commit 3c0d64e867ed
This reverts commit 3c0d64e867ed ("soc: fsl: guts: reuse machine name from device tree"). A following patch will fix the missing memory allocation failure check instead. Suggested-by: Tyrel Datwyler Signed-off-by: Christophe JAILLET --- This is a follow-up of discussion in: https://lore.kernel.org/kernel-janitors/b12e8c5c5d6ab3061d9504de8fbaefcad6bbc385.1629321668.git.christophe.jail...@wanadoo.fr/ --- drivers/soc/fsl/guts.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/soc/fsl/guts.c b/drivers/soc/fsl/guts.c index 072473a16f4d..af7741eafc57 100644 --- a/drivers/soc/fsl/guts.c +++ b/drivers/soc/fsl/guts.c @@ -28,7 +28,6 @@ struct fsl_soc_die_attr { static struct guts *guts; static struct soc_device_attribute soc_dev_attr; static struct soc_device *soc_dev; -static struct device_node *root; /* SoC die attribute definition for QorIQ platform */ @@ -138,7 +137,7 @@ static u32 fsl_guts_get_svr(void) static int fsl_guts_probe(struct platform_device *pdev) { - struct device_node *np = pdev->dev.of_node; + struct device_node *root, *np = pdev->dev.of_node; struct device *dev = &pdev->dev; const struct fsl_soc_die_attr *soc_die; const char *machine; @@ -159,8 +158,9 @@ static int fsl_guts_probe(struct platform_device *pdev) root = of_find_node_by_path("/"); if (of_property_read_string(root, "model", &machine)) of_property_read_string_index(root, "compatible", 0, &machine); + of_node_put(root); if (machine) - soc_dev_attr.machine = machine; + soc_dev_attr.machine = devm_kstrdup(dev, machine, GFP_KERNEL); svr = fsl_guts_get_svr(); soc_die = fsl_soc_die_match(svr, fsl_soc_die); @@ -195,7 +195,6 @@ static int fsl_guts_probe(struct platform_device *pdev) static int fsl_guts_remove(struct platform_device *dev) { soc_device_unregister(soc_dev); - of_node_put(root); return 0; } -- 2.30.2
Re: Fwd: Fwd: X stopped working with 5.14 on iBook
On Nov 02 2021, Finn Thain wrote: > After many builds and tests, Stan and I were able to determine that this > regression only affects builds with CONFIG_USER_NS=y. That is, > > d3ccc9781560 + CONFIG_USER_NS=y --> fail > d3ccc9781560 + CONFIG_USER_NS=n --> okay > d3ccc9781560~ + CONFIG_USER_NS=y --> okay > d3ccc9781560~ + CONFIG_USER_NS=n --> okay On my iBook G4, X is working alright with 5.15 and CONFIG_USER_NS=y. Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different."
[PATCH 2/3] module: strip the signature marker in the verification function.
It is stripped by each caller separately. Signed-off-by: Michal Suchanek --- arch/s390/kernel/machine_kexec_file.c | 9 - kernel/module.c | 7 +-- kernel/module_signing.c | 12 ++-- 3 files changed, 11 insertions(+), 17 deletions(-) diff --git a/arch/s390/kernel/machine_kexec_file.c b/arch/s390/kernel/machine_kexec_file.c index 634e641cd8aa..82260bb61060 100644 --- a/arch/s390/kernel/machine_kexec_file.c +++ b/arch/s390/kernel/machine_kexec_file.c @@ -26,20 +26,11 @@ const struct kexec_file_ops * const kexec_file_loaders[] = { int s390_verify_sig(const char *kernel, unsigned long length) { size_t kernel_len = length; - const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1; /* Skip signature verification when not secure IPLed. */ if (!ipl_secure_flag) return 0; - if (marker_len > kernel_len) - return -EKEYREJECTED; - - if (memcmp(kernel + kernel_len - marker_len, MODULE_SIG_STRING, - marker_len)) - return -EKEYREJECTED; - kernel_len -= marker_len; - return verify_appended_signature(kernel, &kernel_len, VERIFY_USE_PLATFORM_KEYRING, "kexec_file"); } diff --git a/kernel/module.c b/kernel/module.c index 137b3661be75..1c421b0442e3 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2882,7 +2882,6 @@ static inline void kmemleak_load_module(const struct module *mod, static int module_sig_check(struct load_info *info, int flags) { int err = -ENODATA; - const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1; const char *reason; const void *mod = info->hdr; @@ -2890,11 +2889,7 @@ static int module_sig_check(struct load_info *info, int flags) * Require flags == 0, as a module with version information * removed is no longer the module that was signed */ - if (flags == 0 && - info->len > markerlen && - memcmp(mod + info->len - markerlen, MODULE_SIG_STRING, markerlen) == 0) { - /* We truncate the module to discard the signature */ - info->len -= markerlen; + if (flags == 0) { err = verify_appended_signature(mod, &info->len, VERIFY_USE_SECONDARY_KEYRING, "module"); 
if (!err) { diff --git a/kernel/module_signing.c b/kernel/module_signing.c index f492e410564d..4c28cb55275f 100644 --- a/kernel/module_signing.c +++ b/kernel/module_signing.c @@ -15,8 +15,7 @@ #include "module-internal.h" /** - * verify_appended_signature - Verify the signature on a module with the - * signature marker stripped. + * verify_appended_signature - Verify the signature on a module * @data: The data to be verified * @len: Size of @data. * @trusted_keys: Keyring to use for verification @@ -25,12 +24,21 @@ int verify_appended_signature(const void *data, size_t *len, struct key *trusted_keys, const char *what) { + const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1; struct module_signature ms; size_t sig_len, modlen = *len; int ret; pr_devel("==>%s(,%zu)\n", __func__, modlen); + if (markerlen > modlen) + return -ENODATA; + + if (memcmp(data + modlen - markerlen, MODULE_SIG_STRING, + markerlen)) + return -ENODATA; + modlen -= markerlen; + if (modlen <= sizeof(ms)) return -EBADMSG; -- 2.31.1
[PATCH 1/3] s390/kexec_file: Don't opencode appended signature verification.
Module verification already implements appended signature verification. Reuse it for kexec_file. Signed-off-by: Michal Suchanek --- arch/s390/kernel/machine_kexec_file.c | 33 --- include/linux/verification.h | 3 +++ kernel/module-internal.h | 2 -- kernel/module.c | 4 +++- kernel/module_signing.c | 24 +++ 5 files changed, 25 insertions(+), 41 deletions(-) diff --git a/arch/s390/kernel/machine_kexec_file.c b/arch/s390/kernel/machine_kexec_file.c index f9e4baa64b67..634e641cd8aa 100644 --- a/arch/s390/kernel/machine_kexec_file.c +++ b/arch/s390/kernel/machine_kexec_file.c @@ -23,11 +23,10 @@ const struct kexec_file_ops * const kexec_file_loaders[] = { }; #ifdef CONFIG_KEXEC_SIG -int s390_verify_sig(const char *kernel, unsigned long kernel_len) +int s390_verify_sig(const char *kernel, unsigned long length) { + size_t kernel_len = length; const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1; - struct module_signature *ms; - unsigned long sig_len; /* Skip signature verification when not secure IPLed. 
*/ if (!ipl_secure_flag) @@ -41,32 +40,8 @@ int s390_verify_sig(const char *kernel, unsigned long kernel_len) return -EKEYREJECTED; kernel_len -= marker_len; - ms = (void *)kernel + kernel_len - sizeof(*ms); - kernel_len -= sizeof(*ms); - - sig_len = be32_to_cpu(ms->sig_len); - if (sig_len >= kernel_len) - return -EKEYREJECTED; - kernel_len -= sig_len; - - if (ms->id_type != PKEY_ID_PKCS7) - return -EKEYREJECTED; - - if (ms->algo != 0 || - ms->hash != 0 || - ms->signer_len != 0 || - ms->key_id_len != 0 || - ms->__pad[0] != 0 || - ms->__pad[1] != 0 || - ms->__pad[2] != 0) { - return -EBADMSG; - } - - return verify_pkcs7_signature(kernel, kernel_len, - kernel + kernel_len, sig_len, - VERIFY_USE_PLATFORM_KEYRING, - VERIFYING_MODULE_SIGNATURE, - NULL, NULL); + return verify_appended_signature(kernel, &kernel_len, VERIFY_USE_PLATFORM_KEYRING, + "kexec_file"); } #endif /* CONFIG_KEXEC_SIG */ diff --git a/include/linux/verification.h b/include/linux/verification.h index a655923335ae..c1cf0582012a 100644 --- a/include/linux/verification.h +++ b/include/linux/verification.h @@ -60,5 +60,8 @@ extern int verify_pefile_signature(const void *pebuf, unsigned pelen, enum key_being_used_for usage); #endif +int verify_appended_signature(const void *data, size_t *len, struct key *trusted_keys, + const char *what); + #endif /* CONFIG_SYSTEM_DATA_VERIFICATION */ #endif /* _LINUX_VERIFY_PEFILE_H */ diff --git a/kernel/module-internal.h b/kernel/module-internal.h index 33783abc377b..80461e14bf29 100644 --- a/kernel/module-internal.h +++ b/kernel/module-internal.h @@ -27,5 +27,3 @@ struct load_info { unsigned int sym, str, mod, vers, info, pcpu; } index; }; - -extern int mod_verify_sig(const void *mod, struct load_info *info); diff --git a/kernel/module.c b/kernel/module.c index 5c26a76e800b..137b3661be75 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -57,6 +57,7 @@ #include #include #include +#include #include #include "module-internal.h" @@ -2894,7 +2895,8 @@ static int 
module_sig_check(struct load_info *info, int flags) memcmp(mod + info->len - markerlen, MODULE_SIG_STRING, markerlen) == 0) { /* We truncate the module to discard the signature */ info->len -= markerlen; - err = mod_verify_sig(mod, info); + err = verify_appended_signature(mod, &info->len, + VERIFY_USE_SECONDARY_KEYRING, "module"); if (!err) { info->sig_ok = true; return 0; diff --git a/kernel/module_signing.c b/kernel/module_signing.c index 8723ae70ea1f..f492e410564d 100644 --- a/kernel/module_signing.c +++ b/kernel/module_signing.c @@ -14,13 +14,19 @@ #include #include "module-internal.h" -/* - * Verify the signature on a module. +/** + * verify_appended_signature - Verify the signature on a module with the + * signature marker stripped. + * @data: The data to be verified + * @len: Size of @data. + * @trusted_keys: Keyring to use for verification + * @what: Informational string for log messages */ -int mod_verify_sig(const void *mod, struct load_info *info) +int verify_appended_signature(const void *data, size_t *len, + struct key *trusted_keys, const char *what) { struct module_signature ms; -
[PATCH 3/3] powerpc/kexec_file: Add KEXEC_SIG support.
Use the module verifier for the kernel image verification. Signed-off-by: Michal Suchanek --- arch/powerpc/Kconfig| 11 +++ arch/powerpc/kexec/elf_64.c | 14 ++ 2 files changed, 25 insertions(+) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 743c9783c64f..27bffafa9e79 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -558,6 +558,17 @@ config KEXEC_FILE config ARCH_HAS_KEXEC_PURGATORY def_bool KEXEC_FILE +config KEXEC_SIG + bool "Verify kernel signature during kexec_file_load() syscall" + depends on KEXEC_FILE && MODULE_SIG_FORMAT + help + This option makes kernel signature verification mandatory for + the kexec_file_load() syscall. + + In addition to that option, you need to enable signature + verification for the corresponding kernel image type being + loaded in order for this to work. + config PPC64_BUILD_ELF_V2_ABI bool diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c index eeb258002d1e..e8dff6b23ac5 100644 --- a/arch/powerpc/kexec/elf_64.c +++ b/arch/powerpc/kexec/elf_64.c @@ -23,6 +23,7 @@ #include #include #include +#include static void *elf64_load(struct kimage *image, char *kernel_buf, unsigned long kernel_len, char *initrd, @@ -151,7 +152,20 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, return ret ? ERR_PTR(ret) : NULL; } +#ifdef CONFIG_KEXEC_SIG +int elf64_verify_sig(const char *kernel, unsigned long length) +{ + size_t kernel_len = length; + + return verify_appended_signature(kernel, &kernel_len, VERIFY_USE_PLATFORM_KEYRING, +"kexec_file"); +} +#endif /* CONFIG_KEXEC_SIG */ + const struct kexec_file_ops kexec_elf64_ops = { .probe = kexec_elf_probe, .load = elf64_load, +#ifdef CONFIG_KEXEC_SIG + .verify_sig = elf64_verify_sig, +#endif }; -- 2.31.1
[PATCH 0/3] KEXEC_SIG with appended signature
S390 uses an appended signature for the kernel but implements the check separately from the module loader. Support for secure boot on powerpc with an appended signature is planned - grub patches have been submitted upstream but are not yet merged. This is an attempt at unified appended signature verification. Thanks Michal Michal Suchanek (3): s390/kexec_file: Don't opencode appended signature verification. module: strip the signature marker in the verification function. powerpc/kexec_file: Add KEXEC_SIG support. arch/powerpc/Kconfig | 11 +++ arch/powerpc/kexec/elf_64.c | 14 + arch/s390/kernel/machine_kexec_file.c | 42 +++ include/linux/verification.h | 3 ++ kernel/module-internal.h | 2 -- kernel/module.c | 11 +++ kernel/module_signing.c | 32 ++-- 7 files changed, 59 insertions(+), 56 deletions(-) -- 2.31.1
Re: [PATCH 06/13] nvdimm/blk: avoid calling del_gendisk() on early failures
On Tue, Nov 02, 2021 at 07:28:02PM -0600, Jens Axboe wrote: > On 11/2/21 6:49 PM, Dan Williams wrote: > > On Tue, Nov 2, 2021 at 5:10 PM Luis Chamberlain wrote: > >> > >> On Fri, Oct 15, 2021 at 05:13:48PM -0700, Dan Williams wrote: > >>> On Fri, Oct 15, 2021 at 4:53 PM Luis Chamberlain > >>> wrote: > > If nd_integrity_init() fails we'd get del_gendisk() called, > but that's not correct as we should only call that if we're > done with device_add_disk(). Fix this by providing unwinding > prior to the devm call being registered and moving the devm > registration to the very end. > > This should fix calling del_gendisk() if nd_integrity_init() > fails. I only spotted this issue through code inspection. It > does not fix any real world bug. > > >>> > >>> Just fyi, I'm preparing patches to delete this driver completely as it > >>> is unused by any shipping platform. I hope to get that removal into > >>> v5.16. > >> > >> Curious if are you going to nuking it on v5.16? Otherwise it would stand > >> in the way of the last few patches to add __must_check for the final > >> add_disk() error handling changes. > > > > True, I don't think I can get it nuked in time, so you can add my > > Reviewed-by for this one. > > Luis, I lost track of the nv* patches from this discussion. If you want > them in 5.16 and they are reviewed, please do resend and I'll pick them > up for the middle-of-merge-window push. Sure thing, I'll resend whatever is left. I also noticed for some reason I forgot to convert nvdimm/pmem and so I'll roll those new patches in, but I suspect that those might be too late unless we get them reviewed in time. Luis
Re: [PATCH 06/13] nvdimm/blk: avoid calling del_gendisk() on early failures
On Tue, Nov 02, 2021 at 05:49:12PM -0700, Dan Williams wrote: > On Tue, Nov 2, 2021 at 5:10 PM Luis Chamberlain wrote: > > > > On Fri, Oct 15, 2021 at 05:13:48PM -0700, Dan Williams wrote: > > > On Fri, Oct 15, 2021 at 4:53 PM Luis Chamberlain > > > wrote: > > > > > > > > If nd_integrity_init() fails we'd get del_gendisk() called, > > > > but that's not correct as we should only call that if we're > > > > done with device_add_disk(). Fix this by providing unwinding > > > > prior to the devm call being registered and moving the devm > > > > registration to the very end. > > > > > > > > This should fix calling del_gendisk() if nd_integrity_init() > > > > fails. I only spotted this issue through code inspection. It > > > > does not fix any real world bug. > > > > > > > > > > Just fyi, I'm preparing patches to delete this driver completely as it > > > is unused by any shipping platform. I hope to get that removal into > > > v5.16. > > > > Curious if are you going to nuking it on v5.16? Otherwise it would stand > > in the way of the last few patches to add __must_check for the final > > add_disk() error handling changes. > > True, I don't think I can get it nuked in time, so you can add my > Reviewed-by for this one. This patch required the previous patch in this series to also be applied. Can I apply your Reviewed-by there too? Luis