Re: [PATCH v1 4/4] watchdog/pseries-wdt: initial support for PAPR H_WATCHDOG timers
On 5/21/22 04:35, Scott Cheloha wrote: PAPR v2.12 defines a new hypercall, H_WATCHDOG. The hypercall permits guest control of one or more virtual watchdog timers. The timers have millisecond granularity. The guest is terminated when a timer expires. This patch adds a watchdog driver for these timers, "pseries-wdt". pseries_wdt_probe() currently assumes the existence of only one platform device and always assigns it watchdogNumber 1. If we ever expose more than one timer to userspace we will need to devise a way to assign a distinct watchdogNumber to each platform device at device registration time. This one should go before 4/4 in the series for bisectability. What is platform_device_register_simple("pseries-wdt",...) going to do without the driver? Signed-off-by: Scott Cheloha --- .../watchdog/watchdog-parameters.rst | 12 + drivers/watchdog/Kconfig | 8 + drivers/watchdog/Makefile | 1 + drivers/watchdog/pseries-wdt.c| 337 ++ 4 files changed, 358 insertions(+) create mode 100644 drivers/watchdog/pseries-wdt.c diff --git a/Documentation/watchdog/watchdog-parameters.rst b/Documentation/watchdog/watchdog-parameters.rst index 223c99361a30..4ffe725e796c 100644 --- a/Documentation/watchdog/watchdog-parameters.rst +++ b/Documentation/watchdog/watchdog-parameters.rst @@ -425,6 +425,18 @@ pnx833x_wdt: - +pseries-wdt: +action: + Action taken when watchdog expires: 1 (power off), 2 (restart), + 3 (dump and restart). (default=2) +timeout: + Initial watchdog timeout in seconds. (default=60) +nowayout: + Watchdog cannot be stopped once started. 
+ (default=kernel config parameter) + +- + rc32434_wdt: timeout: Watchdog timeout value, in seconds (default=20) diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig index c4e82a8d863f..06b412603f3e 100644 --- a/drivers/watchdog/Kconfig +++ b/drivers/watchdog/Kconfig @@ -1932,6 +1932,14 @@ config MEN_A21_WDT # PPC64 Architecture +config PSERIES_WDT + tristate "POWER Architecture Platform Watchdog Timer" + depends on PPC_PSERIES + select WATCHDOG_CORE + help + Driver for virtual watchdog timers provided by PAPR + hypervisors (e.g. PowerVM, KVM). + config WATCHDOG_RTAS tristate "RTAS watchdog" depends on PPC_RTAS diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile index f7da867e8782..f35660409f17 100644 --- a/drivers/watchdog/Makefile +++ b/drivers/watchdog/Makefile @@ -184,6 +184,7 @@ obj-$(CONFIG_BOOKE_WDT) += booke_wdt.o obj-$(CONFIG_MEN_A21_WDT) += mena21_wdt.o # PPC64 Architecture +obj-$(CONFIG_PSERIES_WDT) += pseries-wdt.o obj-$(CONFIG_WATCHDOG_RTAS) += wdrtas.o # S390 Architecture diff --git a/drivers/watchdog/pseries-wdt.c b/drivers/watchdog/pseries-wdt.c new file mode 100644 index ..f41bc4d3b7a2 --- /dev/null +++ b/drivers/watchdog/pseries-wdt.c @@ -0,0 +1,337 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2022 International Business Machines, Inc. + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#define DRV_NAME "pseries-wdt" + +/* + * The PAPR's MSB->LSB bit ordering is 0->63. These macros simplify + * defining bitfields as described in the PAPR without needing to + * transpose values to the more C-like 63->0 ordering. 
+ */ +#define SETFIELD(_v, _b, _e) \ + (((unsigned long)(_v) << PPC_BITLSHIFT(_e)) & PPC_BITMASK((_b), (_e))) +#define GETFIELD(_v, _b, _e) \ + (((unsigned long)(_v) & PPC_BITMASK((_b), (_e))) >> PPC_BITLSHIFT(_e)) + +/* + * H_WATCHDOG Hypercall Input + * + * R4: "flags": + * + * A 64-bit value structured as follows: + * + * Bits 0-46: Reserved (must be zero). + */ +#define PSERIES_WDTF_RESERVED PPC_BITMASK(0, 46) + +/* + * Bit 47: "leaveOtherWatchdogsRunningOnTimeout" + * + * 0 Stop outstanding watchdogs on timeout. + * 1 Leave outstanding watchdogs running on timeout. + */ +#define PSERIES_WDTF_LEAVE_OTHER PPC_BIT(47) + +/* + * Bits 48-55: "operation" + * + * 0x01 Start Watchdog + * 0x02 Stop Watchdog + * 0x03 Query Watchdog Capabilities + * 0x04 Query Watchdog LPM Requirement + */ +#define PSERIES_WDTF_OP(op)SETFIELD((op), 48, 55) +#define PSERIES_WDTF_OP_START PSERIES_WDTF_OP(0x1) +#define PSERIES_WDTF_OP_STOP PSERIES_WDTF_OP(0x2) +#define PSERIES_WDTF_OP_QUERY PSERIES_WDTF_OP(0x3) +#define PSERIES_WDTF_OP_QUERY_LPM PSERIES_WDTF_OP(0x4) + +/* + * Bits 56-63: "timeoutAction" + * + * 0x01 Hard poweroff + * 0x02 Hard restart + * 0x03 Dump restart + */ +#define PSERIES
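For anyone following the bit-numbering macros quoted above, here is a minimal user-space sketch of how they compose an H_WATCHDOG "flags" value. The PPC_* helpers are re-implemented locally for a 64-bit word (an assumption, meant to mirror the powerpc kernel helpers), and WDTF_ACTION() and wdt_start_flags() are hypothetical names used only for illustration, not part of the patch.

```c
#include <stdint.h>

/* Local stand-ins for the powerpc kernel helpers, assuming a 64-bit word. */
#define PPC_BITLSHIFT(be)   (63 - (be))
#define PPC_BIT(bit)        (UINT64_C(1) << PPC_BITLSHIFT(bit))
#define PPC_BITMASK(bs, be) ((PPC_BIT(bs) - PPC_BIT(be)) | PPC_BIT(bs))

/* Same shape as the macros in the patch, with PAPR bit 0 as the MSB. */
#define SETFIELD(_v, _b, _e) \
	(((uint64_t)(_v) << PPC_BITLSHIFT(_e)) & PPC_BITMASK((_b), (_e)))
#define GETFIELD(_v, _b, _e) \
	(((uint64_t)(_v) & PPC_BITMASK((_b), (_e))) >> PPC_BITLSHIFT(_e))

#define WDTF_LEAVE_OTHER  PPC_BIT(47)            /* bit 47 */
#define WDTF_OP(op)       SETFIELD((op), 48, 55) /* bits 48-55: "operation" */
#define WDTF_ACTION(ac)   SETFIELD((ac), 56, 63) /* bits 56-63: "timeoutAction" (hypothetical name) */

/* Hypothetical helper: flags for "Start Watchdog" with a given timeoutAction. */
static uint64_t wdt_start_flags(unsigned int action)
{
	return WDTF_OP(0x1) | WDTF_ACTION(action);
}
```

With the documented default action=2 (restart), wdt_start_flags(2) evaluates to 0x102: operation 0x01 lands in bits 48-55 and timeoutAction 0x02 in bits 56-63.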
Re: [PATCH v2] of: check previous kernel's ima-kexec-buffer against memory bounds
Just a minor nit which I noticed. On 22/05/24 11:20AM, Vaibhav Jain wrote: > Presently ima_get_kexec_buffer() doesn't check if the previous kernel's > ima-kexec-buffer lies outside the addressable memory range. This can result > in a kernel panic if the new kernel is booted with 'mem=X' arg and the > ima-kexec-buffer was allocated beyond that range by the previous kernel. > The panic is usually of the form below: > > $ sudo kexec --initrd initrd vmlinux --append='mem=16G' > > > BUG: Unable to handle kernel data access on read at 0xc000c01fff7f > Faulting instruction address: 0xc0837974 > Oops: Kernel access of bad area, sig: 11 [#1] > > NIP [c0837974] ima_restore_measurement_list+0x94/0x6c0 > LR [c083b55c] ima_load_kexec_buffer+0xac/0x160 > Call Trace: > [c371fa80] [c083b55c] ima_load_kexec_buffer+0xac/0x160 > [c371fb00] [c20512c4] ima_init+0x80/0x108 > [c371fb70] [c20514dc] init_ima+0x4c/0x120 > [c371fbf0] [c0012240] do_one_initcall+0x60/0x2c0 > [c371fcc0] [c2004ad0] kernel_init_freeable+0x344/0x3ec > [c371fda0] [c00128a4] kernel_init+0x34/0x1b0 > [c371fe10] [c000ce64] ret_from_kernel_thread+0x5c/0x64 > Instruction dump: > f92100b8 f92100c0 90e10090 910100a0 4182050c 282a0017 3bc0 40810330 > 7c0802a6 fb610198 7c9b2378 f80101d0 2c090001 40820614 e9240010 > ---[ end trace ]--- > > Fix this issue by checking returned PFN range of previous kernel's > ima-kexec-buffer with pfn_valid to ensure correct memory bounds. > > Fixes: 467d27824920 ("powerpc: ima: get the kexec buffer passed by the > previous kernel") > Cc: Frank Rowand > Cc: Prakhar Srivastava > Cc: Lakshmi Ramasubramanian > Cc: Thiago Jung Bauermann > Cc: Rob Herring > Signed-off-by: Vaibhav Jain > > --- > Changelog > == > > v2: > * Instead of using memblock to determine the valid bounds use pfn_valid() to > do > so since memblock may not be available late after the kernel init. 
[ Mpe ] > * Changed the patch prefix from 'powerpc' to 'of' [ Mpe ] > * Updated the 'Fixes' tag to point to correct commit that introduced this > function. [ Rob ] > * Fixed some whitespace/tab issues in the patch description [ Rob ] > * Added another check for checking if 'tmp_size' for ima-kexec-buffer is > 0 > --- > drivers/of/kexec.c | 17 + > 1 file changed, 17 insertions(+) > > diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c > index 8d374cc552be..879e984fe901 100644 > --- a/drivers/of/kexec.c > +++ b/drivers/of/kexec.c > @@ -126,6 +126,7 @@ int ima_get_kexec_buffer(void **addr, size_t *size) > { > int ret, len; > unsigned long tmp_addr; > + unsigned int start_pfn, end_pfn; ^^^ Shouldn't this be unsigned long? -ritesh > size_t tmp_size; > const void *prop; > > @@ -140,6 +141,22 @@ int ima_get_kexec_buffer(void **addr, size_t *size) > if (ret) > return ret; > > + /* Do some sanity on the returned size for the ima-kexec buffer */ > + if (!tmp_size) > + return -ENOENT; > + > + /* > + * Calculate the PFNs for the buffer and ensure > + * they are within addressable memory. > + */ > + start_pfn = PHYS_PFN(tmp_addr); > + end_pfn = PHYS_PFN(tmp_addr + tmp_size - 1); > + if (!pfn_valid(start_pfn) || !pfn_valid(end_pfn)) { > + pr_warn("IMA buffer at 0x%lx, size = 0x%zx beyond memory\n", > + tmp_addr, tmp_size); > + return -EINVAL; > + } > + > *addr = __va(tmp_addr); > *size = tmp_size; > > -- > 2.35.1 >
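To make the type question above concrete, here is a minimal user-space sketch. It assumes 4 KiB pages (a shift of 12) and uses a hypothetical max_pfn bound in place of the kernel's pfn_valid(); all toy_* names are made up for illustration. It only shows that a PFN kept in a 32-bit variable can lose its upper bits for large physical addresses, and what a wrap-safe range check might look like.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Same idea as PHYS_PFN(): physical address to page frame number. */
static uint64_t toy_phys_pfn(uint64_t phys)
{
	return phys >> TOY_PAGE_SHIFT;
}

/* What is left of the PFN if it is stored in a 32-bit variable. */
static uint32_t toy_truncated_pfn(uint64_t phys)
{
	return (uint32_t)toy_phys_pfn(phys);
}

/*
 * Hypothetical bounds check standing in for pfn_valid(): true if the
 * buffer is non-empty, its end address does not wrap, and both its
 * first and last page frame lie below max_pfn.
 */
static bool toy_buffer_in_range(uint64_t addr, uint64_t size, uint64_t max_pfn)
{
	uint64_t end;

	if (size == 0)
		return false;
	end = addr + size - 1;
	if (end < addr)         /* address arithmetic wrapped around */
		return false;
	return toy_phys_pfn(addr) < max_pfn && toy_phys_pfn(end) < max_pfn;
}
```

Under these assumptions a buffer at physical address 2^44 has PFN 2^32, which becomes 0 once stored in 32 bits, so a validity check on the truncated value would look at the wrong page frame.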
[PATCH] powerpc: Don't select HAVE_IRQ_EXIT_ON_IRQ_STACK
The HAVE_IRQ_EXIT_ON_IRQ_STACK option tells generic code that irq_exit() is called while still running on the hard irq stack (hardirq_ctx[] in the powerpc code). Selecting the option means the generic code will *not* switch to the softirq stack before running softirqs, because the code is already running on the (mostly empty) hard irq stack. But since commit 1b1b6a6f4cc0 ("powerpc: handle irq_enter/irq_exit in interrupt handler wrappers"), irq_exit() is now called on the regular task stack, not the hard irq stack. That's because previously irq_exit() was called in __do_irq() which is run on the hard irq stack, but now it is called in interrupt_async_exit_prepare() which is called from do_irq() constructed by the wrapper macro, which is after the switch back to the task stack. So drop HAVE_IRQ_EXIT_ON_IRQ_STACK from the Kconfig. This will mean an extra stack switch when processing some interrupts, but should significantly reduce the likelihood of stack overflow. It also means the softirq stack will be used for running softirqs from other interrupts that don't use the hard irq stack, eg. timer interrupts. Fixes: 1b1b6a6f4cc0 ("powerpc: handle irq_enter/irq_exit in interrupt handler wrappers") Cc: sta...@vger.kernel.org # v5.12+ Signed-off-by: Michael Ellerman --- arch/powerpc/Kconfig | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index f5ed355e75e6..7e7d12ae9ecb 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -221,7 +221,6 @@ config PPC select HAVE_HARDLOCKUP_DETECTOR_PERFif PERF_EVENTS && HAVE_PERF_EVENTS_NMI && !HAVE_HARDLOCKUP_DETECTOR_ARCH select HAVE_HW_BREAKPOINT if PERF_EVENTS && (PPC_BOOK3S || PPC_8xx) select HAVE_IOREMAP_PROT - select HAVE_IRQ_EXIT_ON_IRQ_STACK select HAVE_IRQ_TIME_ACCOUNTING select HAVE_KERNEL_GZIP select HAVE_KERNEL_LZMA if DEFAULT_UIMAGE -- 2.35.3
Re: [PATCH 2/2] drm/tiny: Add ofdrm for Open Firmware framebuffers
On Sun, 2022-05-22 at 21:35 +0200, Thomas Zimmermann wrote: > > Interesting. Did you find some formats that were not supported ? > > We still don't support XRGB1555. If the native buffer uses this format, > we'd have no conversion helper. In this case, we rely on userspace/fbcon > to use the native format exclusively. (BTW, I asked one of my coworkers > to implement XRGB1555 to make her familiar with DRM. That's why I > haven't sent a patch yet.) > Various old macs do 1555 ... Cheers, Ben.
[PATCH 3/3] powerpc/64s: Remove spurious fault flushing for NMMU
Commit 6d8278c414cb2 ("powerpc/64s/radix: do not flush TLB on spurious fault") removed the TLB flush for spurious faults, except when a coprocessor (nest MMU) maps the address space. This is not needed because the NMMU workaround in the PTE permission upgrade paths prevents PTEs existing with less restrictive access permissions than their corresponding TLB entries have. Remove it and replace with a comment. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/book3s/64/tlbflush.h | 28 +-- 1 file changed, 25 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h index d2e80f178b6d..ab01938f6c82 100644 --- a/arch/powerpc/include/asm/book3s/64/tlbflush.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h @@ -138,9 +138,31 @@ static inline void flush_all_mm(struct mm_struct *mm) static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma, unsigned long address) { - /* See ptep_set_access_flags comment */ - if (atomic_read(&vma->vm_mm->context.copros) > 0) - flush_tlb_page(vma, address); + /* +* Book3S 64 does not require spurious fault flushes because the PTE +* must be re-fetched in case of an access permission problem. So the +* only reason for a spurious fault should be concurrent modification +* to the PTE, in which case the PTE will eventually be re-fetched by +* the MMU when it attempts the access again. 
+* +* See: Power ISA Version 3.1B, 6.10.1.2 Modifying a Translation Table +* Entry, Setting a Reference or Change Bit or Upgrading Access +* Authority (PTE Subject to Atomic Hardware Updates): +* + * "If the only change being made to a valid PTE that is subject to + * atomic hardware updates is to set the Reference or Change bit to + * 1 or to upgrade access authority, a simpler sequence suffices + * because the translation hardware will refetch the PTE if an + * access is attempted for which the only problems were reference + * and/or change bits needing to be set or insufficient access + * authority." +*/ + + /* +* The nest MMU in POWER9 does not perform this PTE re-fetch, but +* it avoids the spurious fault problem by flushing the TLB before +* upgrading PTE permissions, see radix__ptep_set_access_flags. +*/ } extern bool tlbie_capable; -- 2.35.1
[PATCH 2/3] powerpc/64s: POWER10 nest MMU can upgrade PTE access authority without TLB flush
The nest MMU in POWER9 does not re-fetch the PTE in response to permission mismatch, contrary to the architecture[*] and unlike the core MMU. This requires a TLB flush before upgrading permissions of valid PTEs, for any address space with a coprocessor attached. Per (non-public) Nest MMU Workbook, POWER10 nest MMU conforms to the architecture in this regard, so skip the workaround. [*] See: Power ISA Version 3.1B, 6.10.1.2 Modifying a Translation Table Entry, Setting a Reference or Change Bit or Upgrading Access Authority (PTE Subject to Atomic Hardware Updates): "If the only change being made to a valid PTE that is subject to atomic hardware updates is to set the Reference or Change bit to 1 or to upgrade access authority, a simpler sequence suffices because the translation hardware will refetch the PTE if an access is attempted for which the only problems were reference and/or change bits needing to be set or insufficient access authority." Signed-off-by: Nicholas Piggin --- arch/powerpc/mm/book3s64/radix_hugetlbpage.c | 10 +++--- arch/powerpc/mm/book3s64/radix_pgtable.c | 35 2 files changed, 28 insertions(+), 17 deletions(-) diff --git a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c b/arch/powerpc/mm/book3s64/radix_hugetlbpage.c index 23d3e08911d3..78618c9b618b 100644 --- a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c +++ b/arch/powerpc/mm/book3s64/radix_hugetlbpage.c @@ -103,11 +103,13 @@ void radix__huge_ptep_modify_prot_commit(struct vm_area_struct *vma, struct mm_struct *mm = vma->vm_mm; /* -* To avoid NMMU hang while relaxing access we need to flush the tlb before -* we set the new value. +* POWER9 NMMU must flush the TLB after clearing the PTE before +* installing a PTE with more relaxed access permissions, see +* radix__ptep_set_access_flags. 
*/ - if (is_pte_rw_upgrade(pte_val(old_pte), pte_val(pte)) && - (atomic_read(&mm->context.copros) > 0)) + if (!cpu_has_feature(CPU_FTR_ARCH_31) && + is_pte_rw_upgrade(pte_val(old_pte), pte_val(pte)) && + atomic_read(&mm->context.copros) > 0) radix__flush_hugetlb_page(vma, addr); set_huge_pte_at(vma->vm_mm, addr, ptep, pte); diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c index def04631a74d..195719a6c41c 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -1018,16 +1018,21 @@ void radix__ptep_set_access_flags(struct vm_area_struct *vma, pte_t *ptep, unsigned long change = pte_val(entry) ^ pte_val(*ptep); /* -* To avoid NMMU hang while relaxing access, we need mark -* the pte invalid in between. +* On POWER9, the NMMU is not able to relax PTE access permissions +* for a translation with a TLB. The PTE must be invalidated, TLB +* flushed before the new PTE is installed. +* +* This only needs to be done for radix, because hash translation does +* flush when updating the linux pte (and we don't support NMMU +* accelerators on HPT on POWER9 anyway XXX: do we?). +* +* POWER10 (and P9P) NMMU does behave as per ISA. 
*/ - if ((change & _PAGE_RW) && atomic_read(&mm->context.copros) > 0) { + if (!cpu_has_feature(CPU_FTR_ARCH_31) && (change & _PAGE_RW) && + atomic_read(&mm->context.copros) > 0) { unsigned long old_pte, new_pte; old_pte = __radix_pte_update(ptep, _PAGE_PRESENT, _PAGE_INVALID); - /* -* new value of pte -*/ new_pte = old_pte | set; radix__flush_tlb_page_psize(mm, address, psize); __radix_pte_update(ptep, _PAGE_INVALID, new_pte); @@ -1035,9 +1040,12 @@ void radix__ptep_set_access_flags(struct vm_area_struct *vma, pte_t *ptep, __radix_pte_update(ptep, 0, set); /* * Book3S does not require a TLB flush when relaxing access -* restrictions when the address space is not attached to a -* NMMU, because the core MMU will reload the pte after taking -* an access fault, which is defined by the architecture. +* restrictions when the address space (modulo the POWER9 nest +* MMU issue above) because the MMU will reload the PTE after +* taking an access fault, as defined by the architecture. See +* "Setting a Reference or Change Bit or Upgrading Access +* Authority (PTE Subject to Atomic Hardware Updates)" in +* Power ISA Version 3.1B. */ } /* See ptesync comment in radix__set_pte_at */ @@ -1050,11 +1058,12 @@ void radix__ptep_modify_prot_commit(struct vm_area_s
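The two paths in radix__ptep_set_access_flags() described above can be sketched as a small user-space model. This is a toy under stated assumptions: the bit values, struct toy_mm and every toy_* function are hypothetical and do not match the real radix PTE layout; the arch_31 flag stands in for cpu_has_feature(CPU_FTR_ARCH_31).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy bit layout for illustration only (not the real radix PTE bits). */
#define TOY_PAGE_PRESENT UINT64_C(0x8)
#define TOY_PAGE_INVALID UINT64_C(0x4)
#define TOY_PAGE_WRITE   UINT64_C(0x2)
#define TOY_PAGE_READ    UINT64_C(0x1)
#define TOY_PAGE_RW      (TOY_PAGE_READ | TOY_PAGE_WRITE)

struct toy_mm {
	uint64_t pte;      /* the single PTE being modelled */
	int copros;        /* coprocessors attached to the address space */
	int tlb_flushes;   /* TLB flushes issued so far */
};

/* Clear 'clr', set 'set', return the old value. */
static uint64_t toy_pte_update(struct toy_mm *mm, uint64_t clr, uint64_t set)
{
	uint64_t old = mm->pte;

	mm->pte = (old & ~clr) | set;
	return old;
}

static void toy_set_access_flags(struct toy_mm *mm, uint64_t entry, bool arch_31)
{
	uint64_t set = entry & TOY_PAGE_RW;
	uint64_t change = entry ^ mm->pte;

	if (!arch_31 && (change & TOY_PAGE_RW) && mm->copros > 0) {
		/* POWER9 NMMU path: mark invalid, flush, then install. */
		uint64_t old = toy_pte_update(mm, TOY_PAGE_PRESENT, TOY_PAGE_INVALID);

		mm->tlb_flushes++;
		toy_pte_update(mm, TOY_PAGE_INVALID, old | set);
	} else {
		/* Architected path: set the bits, the MMU refetches the PTE. */
		toy_pte_update(mm, 0, set);
	}
}
```

In this model both paths end with the same PTE value; the only observable difference is the extra invalidate and flush taken when a coprocessor is attached on a pre-ISA-3.1 CPU.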
[PATCH 1/3] powerpc/64s: POWER10 nest MMU does not require flush escalation workaround
Per (non-public) Nest MMU Workbook, POWER10 and POWER9P NMMU does not cache PTEs in PWC, so does not require PWC flush to invalidate these translations. Skip the workaround on POWER10 and later. Signed-off-by: Nicholas Piggin --- arch/powerpc/mm/book3s64/radix_tlb.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c index 7724af19ed7e..7e233829b453 100644 --- a/arch/powerpc/mm/book3s64/radix_tlb.c +++ b/arch/powerpc/mm/book3s64/radix_tlb.c @@ -755,10 +755,18 @@ EXPORT_SYMBOL(radix__local_flush_tlb_page); static bool mm_needs_flush_escalation(struct mm_struct *mm) { /* -* P9 nest MMU has issues with the page walk cache -* caching PTEs and not flushing them properly when -* RIC = 0 for a PID/LPID invalidate +* The P9 nest MMU has issues with the page walk cache caching PTEs +* and not flushing them when RIC = 0 for a PID/LPID invalidate. +* +* This may have been fixed in shipping firmware (by disabling PWC +* or preventing it from caching PTEs), but until that is confirmed, +* this workaround is required - escalate all RIC=0 IS=1/2/3 flushes +* to RIC=2. +* +* POWER10 (and P9P) does not have this problem. */ + if (cpu_has_feature(CPU_FTR_ARCH_31)) + return false; if (atomic_read(&mm->context.copros) > 0) return true; return false; -- 2.35.1
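The decision made by mm_needs_flush_escalation() after this change can be sketched in user space as follows. This is a minimal model, not kernel code: struct toy_ctx and the toy_* functions are hypothetical, with arch_31 standing in for cpu_has_feature(CPU_FTR_ARCH_31) and copros for mm->context.copros. The RIC numbering (0 = TLB, 2 = all) follows the comment in the patch.

```c
#include <stdbool.h>

/* Toy stand-ins for the CPU feature bit and the per-mm coprocessor count. */
struct toy_ctx {
	bool arch_31;   /* ISA v3.1 CPU (POWER10 and later) */
	int copros;     /* coprocessors attached to the address space */
};

enum toy_ric {
	TOY_RIC_FLUSH_TLB = 0,
	TOY_RIC_FLUSH_PWC = 1,
	TOY_RIC_FLUSH_ALL = 2,
};

/* Escalation is only needed on pre-v3.1 CPUs with a coprocessor attached. */
static bool toy_needs_flush_escalation(const struct toy_ctx *c)
{
	if (c->arch_31)
		return false;
	return c->copros > 0;
}

/* Escalate a RIC=0 invalidate to RIC=2 when the workaround applies. */
static enum toy_ric toy_pick_ric(const struct toy_ctx *c, enum toy_ric wanted)
{
	if (wanted == TOY_RIC_FLUSH_TLB && toy_needs_flush_escalation(c))
		return TOY_RIC_FLUSH_ALL;
	return wanted;
}
```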
[PATCH 0/3] powerpc/64s: Restrict NMMU workarounds
POWER10 doesn't require the two Nest MMU workarounds according to the workbook. Also remove the last vestige of the spurious fault flushing for NMMU which shouldn't have been required anyway. Thanks, Nick Nicholas Piggin (3): powerpc/64s: POWER10 nest MMU does not require flush escalation workaround powerpc/64s: POWER10 nest MMU can upgrade PTE access authority without TLB flush powerpc/64s: Remove spurious fault flushing for NMMU arch/powerpc/include/asm/book3s/64/tlbflush.h | 28 +-- arch/powerpc/mm/book3s64/radix_hugetlbpage.c | 10 +++--- arch/powerpc/mm/book3s64/radix_pgtable.c | 35 --- arch/powerpc/mm/book3s64/radix_tlb.c | 14 ++-- 4 files changed, 64 insertions(+), 23 deletions(-) -- 2.35.1
[PATCH v3] mm: Avoid unnecessary page fault retries on shared memory types
I observed that for each of the shared file-backed page faults, we're very likely to retry one more time for the 1st write fault upon no page. It's because we'll need to release the mmap lock for dirty rate limit purpose with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()). Then after that throttling we return VM_FAULT_RETRY. We did that probably because VM_FAULT_RETRY is the only way we can return to the fault handler at that time telling it we've released the mmap lock. However that's not ideal because it's very likely the fault does not need to be retried at all since the pgtable was well installed before the throttling, so the next continuous fault (including taking mmap read lock, walk the pgtable, etc.) could be in most cases unnecessary. It's not only slowing down page faults for shared file-backed, but also add more mmap lock contention which is in most cases not needed at all. To observe this, one could try to write to some shmem page and look at "pgfault" value in /proc/vmstat, then we should expect 2 counts for each shmem write simply because we retried, and vm event "pgfault" will capture that. To make it more efficient, add a new VM_FAULT_COMPLETED return code just to show that we've completed the whole fault and released the lock. It's also a hint that we should very possibly not need another fault immediately on this page because we've just completed it. This patch provides a ~12% perf boost on my aarch64 test VM with a simple program sequentially dirtying 400MB shmem file being mmap()ed and these are the time it needs: Before: 650.980 ms (+-1.94%) After: 569.396 ms (+-1.38%) I believe it could help more than that. We need some special care on GUP and the s390 pgfault handler (for gmap code before returning from pgfault), the rest changes in the page fault handlers should be relatively straightforward. 
Another thing to mention is that mm_account_fault() does take this new fault as a generic fault to be accounted, unlike VM_FAULT_RETRY. I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping them as-is. Signed-off-by: Peter Xu --- v3: - Rebase to akpm/mm-unstable - Copy arch maintainers --- arch/alpha/mm/fault.c | 4 arch/arc/mm/fault.c | 4 arch/arm/mm/fault.c | 4 arch/arm64/mm/fault.c | 4 arch/csky/mm/fault.c | 4 arch/hexagon/mm/vm_fault.c| 4 arch/ia64/mm/fault.c | 4 arch/m68k/mm/fault.c | 4 arch/microblaze/mm/fault.c| 4 arch/mips/mm/fault.c | 4 arch/nios2/mm/fault.c | 4 arch/openrisc/mm/fault.c | 4 arch/parisc/mm/fault.c| 4 arch/powerpc/mm/copro_fault.c | 5 + arch/powerpc/mm/fault.c | 5 + arch/riscv/mm/fault.c | 4 arch/s390/mm/fault.c | 12 +++- arch/sh/mm/fault.c| 4 arch/sparc/mm/fault_32.c | 4 arch/sparc/mm/fault_64.c | 5 + arch/um/kernel/trap.c | 4 arch/x86/mm/fault.c | 4 arch/xtensa/mm/fault.c| 4 include/linux/mm_types.h | 2 ++ mm/gup.c | 34 +- mm/memory.c | 2 +- 26 files changed, 138 insertions(+), 3 deletions(-) diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c index ec20c1004abf..ef427a6bdd1a 100644 --- a/arch/alpha/mm/fault.c +++ b/arch/alpha/mm/fault.c @@ -155,6 +155,10 @@ do_page_fault(unsigned long address, unsigned long mmcsr, if (fault_signal_pending(fault, regs)) return; + /* The fault is fully completed (including releasing mmap lock) */ + if (fault & VM_FAULT_COMPLETED) + return; + if (unlikely(fault & VM_FAULT_ERROR)) { if (fault & VM_FAULT_OOM) goto out_of_memory; diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c index dad27e4d69ff..5ca59a482632 100644 --- a/arch/arc/mm/fault.c +++ b/arch/arc/mm/fault.c @@ -146,6 +146,10 @@ void do_page_fault(unsigned long address, struct pt_regs *regs) return; } + /* The fault is fully completed (including releasing mmap lock) */ + if (fault & VM_FAULT_COMPLETED) + return; + /* * Fault retry nuances, 
mmap_lock already relinquished by core mm */ diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index a062e07516dd..46cccd6bf705 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -322,6 +322,10 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) return 0; } + /* The fault is fully completed (including releasing mmap lock) */ + if (fault & VM_FAULT_COMPLETED) + return 0; + if (!(fault & VM_FAULT_ERROR)) {
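The ordering of checks that the arch fault handlers follow after this patch can be sketched as a small user-space model. The flag values and every toy_* name below are hypothetical (the real flags live in include/linux/mm_types.h); the sketch only illustrates why VM_FAULT_COMPLETED lets the handler return without retrying and without touching the mmap lock again.

```c
/* Toy flag values for illustration; not the kernel's actual bit assignments. */
#define TOY_FAULT_ERROR     0x1u
#define TOY_FAULT_RETRY     0x2u
#define TOY_FAULT_COMPLETED 0x4u

enum toy_next {
	TOY_RETURN,             /* fault done, mmap lock already released */
	TOY_HANDLE_ERROR,       /* OOM, SIGBUS, SIGSEGV handling */
	TOY_RETRY_FAULT,        /* lock was released, take it again and redo */
	TOY_UNLOCK_AND_RETURN,  /* normal completion, handler drops the lock */
};

/* What an arch handler does with the return value of the core fault code. */
static enum toy_next toy_after_handle_mm_fault(unsigned int fault)
{
	if (fault & TOY_FAULT_COMPLETED)
		return TOY_RETURN;
	if (fault & TOY_FAULT_ERROR)
		return TOY_HANDLE_ERROR;
	if (fault & TOY_FAULT_RETRY)
		return TOY_RETRY_FAULT;
	return TOY_UNLOCK_AND_RETURN;
}
```

In this model the first shared-file write fault that used to come back as a retry (and so walk the page tables a second time) now comes back as completed and simply returns.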
Re: [PATCH V9 20/20] riscv: compat: Add COMPAT Kbuild skeletal support
On Wed, May 25, 2022 at 01:46:38AM +0800, Guo Ren wrote: [ ... ] > > The problem is come from "__dls3's vdso decode part in musl's > > ldso/dynlink.c". The ehdr->e_phnum & ehdr->e_phentsize are wrong. > > > > I think the root cause is from musl's implementation with the wrong > > elf parser. I would fix that soon. > Not elf parser, it's "aux vector just past environ[]". I think I could > solve this, but anyone who could help dig in is welcome. > I am not sure I understand what you are saying here. Point is that my root file system, generated with musl a year or so ago, crashes with your patch set applied. That is a regression, even if there is a bug in musl. Guenter
Re: [PATCH V2] platforms/83xx: Use of_device_get_match_data()
On 25/02/2022 at 02:07, cgel@gmail.com wrote: From: Minghao Chi (CGEL ZTE) Use of_device_get_match_data() to simplify the code. v1->v2: Add a judgment on the return value of the A function as NULL Reported-by: Zeal Robot Signed-off-by: Minghao Chi (CGEL ZTE) --- arch/powerpc/platforms/83xx/suspend.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/platforms/83xx/suspend.c b/arch/powerpc/platforms/83xx/suspend.c index bb147d34d4a6..6d47a5b81485 100644 --- a/arch/powerpc/platforms/83xx/suspend.c +++ b/arch/powerpc/platforms/83xx/suspend.c @@ -322,18 +322,15 @@ static const struct platform_suspend_ops mpc83xx_suspend_ops = { static const struct of_device_id pmc_match[]; Hi, I think that the line above can now be removed as well. Just my 2c. CJ static int pmc_probe(struct platform_device *ofdev) { - const struct of_device_id *match; struct device_node *np = ofdev->dev.of_node; struct resource res; const struct pmc_type *type; int ret = 0; - match = of_match_device(pmc_match, &ofdev->dev); - if (!match) + type = of_device_get_match_data(&ofdev->dev); + if (!type) return -EINVAL; - type = match->data; - if (!of_device_is_available(np)) return -ENODEV;
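The pattern the patch switches to (fetch the per-device data directly instead of fetching the match entry and then dereferencing ->data) can be sketched with a user-space toy. Everything here is hypothetical: the toy_* types, the "vendor,..." compatible strings and the lookup helper are made up for illustration and are not the kernel's of_device_get_match_data() implementation.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct toy_pmc_type {
	int has_deep_sleep;
};

struct toy_device_id {
	const char *compatible;
	const void *data;
};

static const struct toy_pmc_type toy_type_a = { 1 };
static const struct toy_pmc_type toy_type_b = { 0 };

/* Sentinel-terminated match table, as device ID tables usually are. */
static const struct toy_device_id toy_match[] = {
	{ "vendor,chip-a-pmc", &toy_type_a },
	{ "vendor,chip-b-pmc", &toy_type_b },
	{ NULL, NULL },
};

/* Return the data pointer of the matching entry, or NULL if none matches. */
static const void *toy_get_match_data(const struct toy_device_id *tbl,
				      const char *compatible)
{
	for (; tbl->compatible; tbl++)
		if (strcmp(tbl->compatible, compatible) == 0)
			return tbl->data;
	return NULL;
}

/* Probe-like caller: one lookup, one NULL check, as in the v2 patch. */
static int toy_probe(const char *compatible)
{
	const struct toy_pmc_type *type = toy_get_match_data(toy_match, compatible);

	if (!type)
		return -EINVAL;
	return type->has_deep_sleep;
}
```

The NULL check added in v2 covers both "no entry matched" and "entry matched but carries no data", which is why the separate match pointer is no longer needed in the caller.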
Re: [PATCH] powerpc: e500: Fix compilation with gcc e500 compiler
On Tue, May 24, 2022 at 09:16:10PM +0200, Pali Rohár wrote: > On Tuesday 24 May 2022 13:52:47 Segher Boessenkool wrote: > > Aha. Right, because this config forces -mspe it requires one of these > > CPUs. > > > > You can use a powerpc-linux compiler instead, and everything will just > > work. These CPUs are still supported, in all of GCC 9 .. GCC 12 :-) > > Ok. I can use different "generic" powerpc compiler (It should work fine > as you said, as it has also -mcpu=8540 option). But I think that > compilation of kernel should be supported also by that gcc 8.5.0 e500 > compiler. That linuxspe compiler you mean. Sure, why not, the more the merrier, as long as it doesn't get in the way of other stuff, I won't protest. But please don't confuse people: you are talking about a powerpc-linuxspe compiler, not e500, which is supported just fine by current GCC trunk still, and does not have such limitations :-) Segher
Re: [RFC PATCH v2 5/7] objtool: Enable objtool to run only on files with ftrace enabled
On Tue, May 24, 2022 at 06:59:50PM +, Christophe Leroy wrote: > > > On 24/05/2022 at 20:02, Peter Zijlstra wrote: > > On Tue, May 24, 2022 at 08:01:39PM +0200, Peter Zijlstra wrote: > >> On Tue, May 24, 2022 at 03:17:45PM +0200, Christophe Leroy wrote: > >>> From: Sathvika Vasireddy > >>> > >>> This patch makes sure objtool runs only on the object files > >>> that have ftrace enabled, instead of running on all the object > >>> files. > >>> > >>> Signed-off-by: Naveen N. Rao > >>> Signed-off-by: Sathvika Vasireddy > >>> Signed-off-by: Christophe Leroy > >>> --- > >>> scripts/Makefile.build | 4 ++-- > >>> 1 file changed, 2 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/scripts/Makefile.build b/scripts/Makefile.build > >>> index 2e0c3f9c1459..06ceffd92921 100644 > >>> --- a/scripts/Makefile.build > >>> +++ b/scripts/Makefile.build > >>> @@ -258,8 +258,8 @@ else > >>> # 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a > >>> file > >>> # 'OBJECT_FILES_NON_STANDARD_foo.o := 'n': override directory skip for > >>> a file > >>> > >>> -$(obj)/%.o: objtool-enabled = $(if $(filter-out y%, \ > >>> - > >>> $(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n),y) > >>> +$(obj)/%.o: objtool-enabled = $(and $(if $(filter-out y%, > >>> $(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n),y), > >>> \ > >>> +$(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)),y),y) > >> > >> I think this breaks x86, quite a bit of files have ftrace disabled but > >> very much must run objtool anyway. > > > > Also; since the Changelog gives 0 clue as to what problem it's trying to > > solve, I can't suggest anything. 
> > I asked Sathvika on the previous series, see > https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20220523175548.922671-3...@linux.ibm.com/ > > He says it is to solve the problem I reported at > https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20220318105140.43914-4...@linux.ibm.com/#2861128 So on x86 we have: arch/x86/entry/vdso/Makefile:OBJECT_FILES_NON_STANDARD := y to kill objtool for the whole of the VDSO. When we run objtool on vmlinux it isn't a problem, because the VDSO ends up as a data section through linker scripts.
Re: [PATCH] powerpc: e500: Fix compilation with gcc e500 compiler
On Tuesday 24 May 2022 13:52:47 Segher Boessenkool wrote: > On Tue, May 24, 2022 at 08:12:55PM +0200, Pali Rohár wrote: > > On Tuesday 24 May 2022 12:59:55 Segher Boessenkool wrote: > > > On Tue, May 24, 2022 at 11:39:39AM +0200, Pali Rohár wrote: > > > > gcc e500 compiler does not support -mcpu=powerpc option. When it is > > > > specified then gcc throws compile error: > > > > > > > > gcc: error: unrecognized argument in option ‘-mcpu=powerpc’ > > > > gcc: note: valid arguments to ‘-mcpu=’ are: 8540 8548 native > > > > > > What? Are you using some modified version of GCC, perhaps? > > > > Hello! I'm using official gcc version, no special modification. > > > > > No version of GCC that isn't hamstrung can have this output. > > > > gcc for e500 cores has really this output when you pass -mcpu=powerpc. > > > > Upstream gcc dropped support for e500 cores during development of > > version 9. > > This isn't true. The SPE instruction extension is no longer supported > (because it wasn't maintained). Everything else still works. > > > But you can still compile and install gcc 8.5.0 (last version > > of gcc 8) which has this full e500 support. > > > > Really, you can easily try it. Debian 10 (Buster) has gcc 8.3.0 in its > > default installation and also provides packages with cross compilers. > > Just run 'sudo apt install gcc-powerpc-linux-gnuspe' on desktop amd64 > > version of Debian 10, it will install e500 cross compiler. > > > > -mcpu=8540 specify e500v1 and -mcpu=8548 specify e500v2 > > Aha. Right, because this config forces -mspe it requires one of these > CPUs. > > You can use a powerpc-linux compiler instead, and everything will just > work. These CPUs are still supported, in all of GCC 9 .. GCC 12 :-) > > > Segher Ok. I can use different "generic" powerpc compiler (It should work fine as you said, as it has also -mcpu=8540 option). But I think that compilation of kernel should be supported also by that gcc 8.5.0 e500 compiler. 
It is really annoying if compiling the kernel requires a different compiler than compiling the rest of the system (userspace and bootloader). And for user applications an e500 SPE-capable compiler really should be used, for performance reasons.
Re: [RFC PATCH v2 5/7] objtool: Enable objtool to run only on files with ftrace enabled
On 24/05/2022 at 20:02, Peter Zijlstra wrote: > On Tue, May 24, 2022 at 08:01:39PM +0200, Peter Zijlstra wrote: >> On Tue, May 24, 2022 at 03:17:45PM +0200, Christophe Leroy wrote: >>> From: Sathvika Vasireddy >>> >>> This patch makes sure objtool runs only on the object files >>> that have ftrace enabled, instead of running on all the object >>> files. >>> >>> Signed-off-by: Naveen N. Rao >>> Signed-off-by: Sathvika Vasireddy >>> Signed-off-by: Christophe Leroy >>> --- >>> scripts/Makefile.build | 4 ++-- >>> 1 file changed, 2 insertions(+), 2 deletions(-) >>> >>> diff --git a/scripts/Makefile.build b/scripts/Makefile.build >>> index 2e0c3f9c1459..06ceffd92921 100644 >>> --- a/scripts/Makefile.build >>> +++ b/scripts/Makefile.build >>> @@ -258,8 +258,8 @@ else >>> # 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a >>> file >>> # 'OBJECT_FILES_NON_STANDARD_foo.o := 'n': override directory skip for a >>> file >>> >>> -$(obj)/%.o: objtool-enabled = $(if $(filter-out y%, \ >>> - >>> $(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n),y) >>> +$(obj)/%.o: objtool-enabled = $(and $(if $(filter-out y%, >>> $(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n),y), >>> \ >>> +$(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)),y),y) >> >> I think this breaks x86, quite a bit of files have ftrace disabled but >> very much must run objtool anyway. > > Also; since the Changelog gives 0 clue as to what problem it's trying to > solve, I can't suggest anything. I asked Sathvika on the previous series, see https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20220523175548.922671-3...@linux.ibm.com/ He says it is to solve the problem I reported at https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20220318105140.43914-4...@linux.ibm.com/#2861128 Christophe
Re: [PATCH] powerpc: e500: Fix compilation with gcc e500 compiler
On Tue, May 24, 2022 at 08:12:55PM +0200, Pali Rohár wrote: > On Tuesday 24 May 2022 12:59:55 Segher Boessenkool wrote: > > On Tue, May 24, 2022 at 11:39:39AM +0200, Pali Rohár wrote: > > > gcc e500 compiler does not support -mcpu=powerpc option. When it is > > > specified then gcc throws compile error: > > > > > > gcc: error: unrecognized argument in option ‘-mcpu=powerpc’ > > > gcc: note: valid arguments to ‘-mcpu=’ are: 8540 8548 native > > > > What? Are you using some modified version of GCC, perhaps? > > Hello! I'm using official gcc version, no special modification. > > > No version of GCC that isn't hamstrung can have this output. > > gcc for e500 cores has really this output when you pass -mcpu=powerpc. > > Upstream gcc dropped support for e500 cores during development of > version 9. This isn't true. The SPE instruction extension is no longer supported (because it wasn't maintained). Everything else still works. > But you can still compile and install gcc 8.5.0 (last version > of gcc 8) which has this full e500 support. > > Really, you can easily try it. Debian 10 (Buster) has gcc 8.3.0 in its > default installation and also provides packages with cross compilers. > Just run 'sudo apt install gcc-powerpc-linux-gnuspe' on desktop amd64 > version of Debian 10, it will install e500 cross compiler. > > -mcpu=8540 specify e500v1 and -mcpu=8548 specify e500v2 Aha. Right, because this config forces -mspe it requires one of these CPUs. You can use a powerpc-linux compiler instead, and everything will just work. These CPUs are still supported, in all of GCC 9 .. GCC 12 :-) Segher
Re: [PATCH Linux] powerpc: add documentation for HWCAPs
On Tue, May 24, 2022 at 11:52:00AM +0200, Florian Weimer wrote: > * Nicholas Piggin: > > > +2. Facilities > > +- > > +The Power ISA uses the term "facility" to describe a class of instructions, > > +registers, interrupts, etc. The presence or absence of a facility indicates > > +whether this class is available to be used, but the specifics depend on the > > +ISA version. For example, if the VSX facility is available, the VSX > > +instructions that can be used differ between the v3.0B and v3.1B ISA > > +verstions. > > The 2.07 ISA manual also has categories. ISA 3.0 made a lot of things > mandatory. It may make sense to clarify that feature bits for mandatory > aspects of the ISA are still set, to help with backwards compatibility. Linux runs on ISA 1.xx and ISA 2.01 machines still. "Category" wasn't invented for either yet either, but similar concepts did exist of course. Segher
Re: [PATCH] powerpc: e500: Fix compilation with gcc e500 compiler
On Tuesday 24 May 2022 12:59:55 Segher Boessenkool wrote: > Hi! > > On Tue, May 24, 2022 at 11:39:39AM +0200, Pali Rohár wrote: > > gcc e500 compiler does not support -mcpu=powerpc option. When it is > > specified then gcc throws compile error: > > > > gcc: error: unrecognized argument in option ‘-mcpu=powerpc’ > > gcc: note: valid arguments to ‘-mcpu=’ are: 8540 8548 native > > What? Are you using some modified version of GCC, perhaps? Hello! I'm using official gcc version, no special modification. > No version of GCC that isn't hamstrung can have this output. gcc for e500 cores has really this output when you pass -mcpu=powerpc. Upstream gcc dropped support for e500 cores during development of version 9. But you can still compile and install gcc 8.5.0 (last version of gcc 8) which has this full e500 support. Really, you can easily try it. Debian 10 (Buster) has gcc 8.3.0 in its default installation and also provides packages with cross compilers. Just run 'sudo apt install gcc-powerpc-linux-gnuspe' on desktop amd64 version of Debian 10, it will install e500 cross compiler. -mcpu=8540 specify e500v1 and -mcpu=8548 specify e500v2
Re: [RFC PATCH v2 5/7] objtool: Enable objtool to run only on files with ftrace enabled
On Tue, May 24, 2022 at 08:01:39PM +0200, Peter Zijlstra wrote: > On Tue, May 24, 2022 at 03:17:45PM +0200, Christophe Leroy wrote: > > From: Sathvika Vasireddy > > > > This patch makes sure objtool runs only on the object files > > that have ftrace enabled, instead of running on all the object > > files. > > > > Signed-off-by: Naveen N. Rao > > Signed-off-by: Sathvika Vasireddy > > Signed-off-by: Christophe Leroy > > --- > > scripts/Makefile.build | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/scripts/Makefile.build b/scripts/Makefile.build > > index 2e0c3f9c1459..06ceffd92921 100644 > > --- a/scripts/Makefile.build > > +++ b/scripts/Makefile.build > > @@ -258,8 +258,8 @@ else > > # 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a file > > # 'OBJECT_FILES_NON_STANDARD_foo.o := 'n': override directory skip for a > > file > > > > -$(obj)/%.o: objtool-enabled = $(if $(filter-out y%, \ > > - > > $(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n),y) > > +$(obj)/%.o: objtool-enabled = $(and $(if $(filter-out y%, > > $(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n),y), > > \ > > +$(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)),y),y) > > I think this breaks x86, quite a bit of files have ftrace disabled but > very much must run objtool anyway. Also; since the Changelog gives 0 clue as to what problem it's trying to solve, I can't suggest anything.
Re: [PATCH] powerpc: e500: Fix compilation with gcc e500 compiler
Hi! On Tue, May 24, 2022 at 11:39:39AM +0200, Pali Rohár wrote: > gcc e500 compiler does not support -mcpu=powerpc option. When it is > specified then gcc throws compile error: > > gcc: error: unrecognized argument in option ‘-mcpu=powerpc’ > gcc: note: valid arguments to ‘-mcpu=’ are: 8540 8548 native What? Are you using some modified version of GCC, perhaps? No version of GCC that isn't hamstrung can have this output. Segher
Re: [RFC PATCH v2 5/7] objtool: Enable objtool to run only on files with ftrace enabled
On Tue, May 24, 2022 at 03:17:45PM +0200, Christophe Leroy wrote: > From: Sathvika Vasireddy > > This patch makes sure objtool runs only on the object files > that have ftrace enabled, instead of running on all the object > files. > > Signed-off-by: Naveen N. Rao > Signed-off-by: Sathvika Vasireddy > Signed-off-by: Christophe Leroy > --- > scripts/Makefile.build | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/scripts/Makefile.build b/scripts/Makefile.build > index 2e0c3f9c1459..06ceffd92921 100644 > --- a/scripts/Makefile.build > +++ b/scripts/Makefile.build > @@ -258,8 +258,8 @@ else > # 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a file > # 'OBJECT_FILES_NON_STANDARD_foo.o := 'n': override directory skip for a file > > -$(obj)/%.o: objtool-enabled = $(if $(filter-out y%, \ > - > $(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n),y) > +$(obj)/%.o: objtool-enabled = $(and $(if $(filter-out y%, > $(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n),y), > \ > +$(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)),y),y) I think this breaks x86, quite a bit of files have ftrace disabled but very much must run objtool anyway.
Re: [RFC PATCH v2 3/7] objtool: Use target file class size instead of a compiled constant
On Tue, May 24, 2022 at 03:17:43PM +0200, Christophe Leroy wrote:
> -	sec = elf_create_section(elf, relocname, 0, sizeof(GElf_Rela), 0);
> +	if (size == sizeof(u32))
> +		sec = elf_create_section(elf, relocname, 0, sizeof(Elf32_Rela), 0);
> +	else
> +		sec = elf_create_section(elf, relocname, 0, sizeof(GElf_Rela), 0);

Probably best to use Elf64_* here instead of GElf_*.
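To make the review comment above concrete, here is a minimal sketch of picking the relocation entry size from the ELF class of the target file rather than from a compile-time constant. The helper name `rela_entry_size` is hypothetical and is not part of objtool; the only facts relied on are the `Elf32_Rela` and `Elf64_Rela` layouts from `<elf.h>`. Since libelf's `GElf_Rela` is a typedef of `Elf64_Rela`, spelling out `Elf64_Rela` next to `Elf32_Rela` makes the symmetry explicit.

```c
#include <elf.h>      /* Elf32_Rela, Elf64_Rela, ELFCLASS32, ELFCLASS64 */
#include <stddef.h>

/*
 * Hypothetical helper (not an objtool function): return the size of one
 * RELA entry for the given ELF class, so a 64-bit host tool can build
 * sections for a 32-bit target.
 */
size_t rela_entry_size(unsigned char elf_class)
{
	if (elf_class == ELFCLASS32)
		return sizeof(Elf32_Rela);	/* three 32-bit fields */

	return sizeof(Elf64_Rela);		/* three 64-bit fields */
}
```

A caller would pass the `EI_CLASS` byte of the target's ELF header, for example `rela_entry_size(ehdr.e_ident[EI_CLASS])`.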
Re: [PATCH V9 20/20] riscv: compat: Add COMPAT Kbuild skeletal support
On Wed, May 25, 2022 at 1:42 AM Guo Ren wrote: > > On Mon, May 23, 2022 at 1:45 PM Guenter Roeck wrote: > > > > On Tue, Mar 22, 2022 at 10:40:03PM +0800, guo...@kernel.org wrote: > > > From: Guo Ren > > > > > > Adds initial skeletal COMPAT Kbuild (Running 32bit U-mode on > > > 64bit S-mode) support. > > > - Setup kconfig & dummy functions for compiling. > > > - Implement compat_start_thread by the way. > > > > > > Signed-off-by: Guo Ren > > > Signed-off-by: Guo Ren > > > Reviewed-by: Arnd Bergmann > > > Tested-by: Heiko Stuebner > > > Cc: Palmer Dabbelt > > > > With this patch in linux-next, all my riscv64 emulations crash. > > > > [ 11.600082] Run /sbin/init as init process > > [ 11.628561] init[1]: unhandled signal 11 code 0x1 at 0x > > in libc.so[ff8ad39000+a4000] > > [ 11.629398] CPU: 0 PID: 1 Comm: init Not tainted > > 5.18.0-rc7-next-20220520 #1 > > [ 11.629462] Hardware name: riscv-virtio,qemu (DT) > > [ 11.629546] epc : 00ff8ada1100 ra : 00ff8ada13c8 sp : > > 00ffc58199f0 > > [ 11.629586] gp : 00ff8ad39000 tp : 00ff8ade0998 t0 : > > > > [ 11.629598] t1 : 00ffc5819fd0 t2 : s0 : > > 00ff8ade0cc0 > > [ 11.629610] s1 : 00ff8ade0cc0 a0 : a1 : > > 00ffc5819a00 > > [ 11.629622] a2 : 0001 a3 : 001e a4 : > > 00ffc5819b00 > > [ 11.629634] a5 : 00ffc5819b00 a6 : a7 : > > > > [ 11.629645] s2 : 00ff8ade0ac8 s3 : 00ff8ade0ec8 s4 : > > 00ff8ade0728 > > [ 11.629656] s5 : 00ff8ade0a90 s6 : s7 : > > 00ffc5819e40 > > [ 11.629667] s8 : 00ff8ade0ca0 s9 : 00ff8addba50 s10: > > > > [ 11.629678] s11: t3 : 0002 t4 : > > 0001 > > [ 11.629688] t5 : 0002 t6 : > > [ 11.629699] status: 4020 badaddr: cause: > > 000d > > [ 11.633421] Kernel panic - not syncing: Attempted to kill init! 
> > exitcode=0x000b > > [ 11.633664] CPU: 0 PID: 1 Comm: init Not tainted > > 5.18.0-rc7-next-20220520 #1 > > [ 11.633784] Hardware name: riscv-virtio,qemu (DT) > > [ 11.633881] Call Trace: > > [ 11.633960] [] dump_backtrace+0x1c/0x24 > > [ 11.634162] [] show_stack+0x2c/0x38 > > [ 11.634274] [] dump_stack_lvl+0x60/0x8e > > [ 11.634386] [] dump_stack+0x14/0x1c > > [ 11.634491] [] panic+0x116/0x2e2 > > [ 11.634596] [] do_exit+0x7ce/0x7d4 > > [ 11.634707] [] do_group_exit+0x24/0x7c > > [ 11.634817] [] get_signal+0x7ee/0x830 > > [ 11.634924] [] do_notify_resume+0x6c/0x41c > > [ 11.635037] [] ret_from_exception+0x0/0x10 > The problem comes from the vdso decoding part of __dls3 in musl's > ldso/dynlink.c: ehdr->e_phnum and ehdr->e_phentsize are wrong. > > I think the root cause is musl's implementation, with a wrong > ELF parser. I will fix that soon. It is not the ELF parser; it is the "aux vector just past environ[]" handling. I think I can solve this, but anyone who can help dig in is welcome. > > If you set CONFIG_COMPAT=n, the bug is bypassed. 
> > > > > Guenter > > > > --- > > # bad: [18ecd30af1a8402c162cca1bd58771c0e5be7815] Add linux-next specific > > files for 20220520 > > # good: [42226c989789d8da4af1de0c31070c96726d990c] Linux 5.18-rc7 > > git bisect start 'HEAD' 'v5.18-rc7' > > # bad: [f9b63740b666dd9887eb0282d21b5f65bb0cadd0] Merge branch 'master' of > > git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git > > git bisect bad f9b63740b666dd9887eb0282d21b5f65bb0cadd0 > > # bad: [7db97132097c5973ff77466d0ee681650af653de] Merge branch 'for-next' > > of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux > > git bisect bad 7db97132097c5973ff77466d0ee681650af653de > > # good: [2b7d17d4b7c1ff40f58b0d32be40fc0bb6c582fb] soc: document merges > > git bisect good 2b7d17d4b7c1ff40f58b0d32be40fc0bb6c582fb > > # good: [69c9668f853fdd409bb8abbb37d615785510b29a] Merge branch 'clk-next' > > of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git > > git bisect good 69c9668f853fdd409bb8abbb37d615785510b29a > > # bad: [1577f290aa0d4c5b29c03c46ef52e4952a21bfbb] Merge branch 'for-next' > > of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git > > git bisect bad 1577f290aa0d4c5b29c03c46ef52e4952a21bfbb > > # good: [34f0971f8ca73d7e5502b4cf299788a9402120f7] powerpc/powernv/flash: > > Check OPAL flash calls exist before using > > git bisect good 34f0971f8ca73d7e5502b4cf299788a9402120f7 > > # good: [0349d7dfc70a26b3facd8ca97de34980d4b60954] Merge branch 'mips-next' > > of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git > > git bisect good 0349d7dfc70a26b3facd8ca97de34980d4b60954 > > # bad: [20bfb54d3b121699674c17a854c5ebc7a8f97d81] Merge branch 'for-next' > > of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git > > git bisect bad 20bfb54d3b121699674c17a854c5ebc7a8f97d81
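For readers unfamiliar with the "aux vector just past environ[]" convention mentioned in this reply, the following is a minimal sketch of how a dynamic loader can locate auxiliary vector entries on Linux. It is not musl's code. It assumes a native process in which every auxv entry is a pair of `unsigned long` values (type, value), which is exactly the kind of assumption that can go wrong when entry sizes differ between native and compat tasks. The function name `auxv_lookup` is hypothetical.

```c
#include <elf.h>      /* AT_NULL and the other AT_* constants */
#include <stddef.h>

/*
 * Hypothetical helper: find an auxv entry by walking the initial process
 * stack layout. envp[] is terminated by a NULL pointer, and the auxiliary
 * vector starts in the slot right after that NULL.
 *
 * Assumption: each auxv entry is two unsigned longs (native word size).
 */
unsigned long auxv_lookup(char **envp, unsigned long type)
{
	char **p = envp;
	const unsigned long *auxv;

	while (*p != NULL)		/* skip all environment strings */
		p++;

	auxv = (const unsigned long *)(p + 1);
	for (; auxv[0] != AT_NULL; auxv += 2) {
		if (auxv[0] == type)
			return auxv[1];
	}

	return 0;			/* entry not present */
}
```

Called with the original `envp` and `AT_SYSINFO_EHDR`, this returns the address of the vDSO's ELF header, whose `e_phnum` and `e_phentsize` fields are the ones reported as wrong above.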
Re: [PATCH V9 20/20] riscv: compat: Add COMPAT Kbuild skeletal support
On Mon, May 23, 2022 at 1:45 PM Guenter Roeck wrote: > > On Tue, Mar 22, 2022 at 10:40:03PM +0800, guo...@kernel.org wrote: > > From: Guo Ren > > > > Adds initial skeletal COMPAT Kbuild (Running 32bit U-mode on > > 64bit S-mode) support. > > - Setup kconfig & dummy functions for compiling. > > - Implement compat_start_thread by the way. > > > > Signed-off-by: Guo Ren > > Signed-off-by: Guo Ren > > Reviewed-by: Arnd Bergmann > > Tested-by: Heiko Stuebner > > Cc: Palmer Dabbelt > > With this patch in linux-next, all my riscv64 emulations crash. > > [ 11.600082] Run /sbin/init as init process > [ 11.628561] init[1]: unhandled signal 11 code 0x1 at 0x in > libc.so[ff8ad39000+a4000] > [ 11.629398] CPU: 0 PID: 1 Comm: init Not tainted 5.18.0-rc7-next-20220520 > #1 > [ 11.629462] Hardware name: riscv-virtio,qemu (DT) > [ 11.629546] epc : 00ff8ada1100 ra : 00ff8ada13c8 sp : > 00ffc58199f0 > [ 11.629586] gp : 00ff8ad39000 tp : 00ff8ade0998 t0 : > > [ 11.629598] t1 : 00ffc5819fd0 t2 : s0 : > 00ff8ade0cc0 > [ 11.629610] s1 : 00ff8ade0cc0 a0 : a1 : > 00ffc5819a00 > [ 11.629622] a2 : 0001 a3 : 001e a4 : > 00ffc5819b00 > [ 11.629634] a5 : 00ffc5819b00 a6 : a7 : > > [ 11.629645] s2 : 00ff8ade0ac8 s3 : 00ff8ade0ec8 s4 : > 00ff8ade0728 > [ 11.629656] s5 : 00ff8ade0a90 s6 : s7 : > 00ffc5819e40 > [ 11.629667] s8 : 00ff8ade0ca0 s9 : 00ff8addba50 s10: > > [ 11.629678] s11: t3 : 0002 t4 : > 0001 > [ 11.629688] t5 : 0002 t6 : > [ 11.629699] status: 4020 badaddr: cause: > 000d > [ 11.633421] Kernel panic - not syncing: Attempted to kill init! 
> exitcode=0x000b > [ 11.633664] CPU: 0 PID: 1 Comm: init Not tainted 5.18.0-rc7-next-20220520 > #1 > [ 11.633784] Hardware name: riscv-virtio,qemu (DT) > [ 11.633881] Call Trace: > [ 11.633960] [] dump_backtrace+0x1c/0x24 > [ 11.634162] [] show_stack+0x2c/0x38 > [ 11.634274] [] dump_stack_lvl+0x60/0x8e > [ 11.634386] [] dump_stack+0x14/0x1c > [ 11.634491] [] panic+0x116/0x2e2 > [ 11.634596] [] do_exit+0x7ce/0x7d4 > [ 11.634707] [] do_group_exit+0x24/0x7c > [ 11.634817] [] get_signal+0x7ee/0x830 > [ 11.634924] [] do_notify_resume+0x6c/0x41c > [ 11.635037] [] ret_from_exception+0x0/0x10 The problem is come from "__dls3's vdso decode part in musl's ldso/dynlink.c". The ehdr->e_phnum & ehdr->e_phentsize are wrong. I think the root cause is from musl's implementation with the wrong elf parser. I would fix that soon. If you CONFIG_COMPAT=n, the bug would be bypassed. > > Guenter > > --- > # bad: [18ecd30af1a8402c162cca1bd58771c0e5be7815] Add linux-next specific > files for 20220520 > # good: [42226c989789d8da4af1de0c31070c96726d990c] Linux 5.18-rc7 > git bisect start 'HEAD' 'v5.18-rc7' > # bad: [f9b63740b666dd9887eb0282d21b5f65bb0cadd0] Merge branch 'master' of > git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git > git bisect bad f9b63740b666dd9887eb0282d21b5f65bb0cadd0 > # bad: [7db97132097c5973ff77466d0ee681650af653de] Merge branch 'for-next' of > git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux > git bisect bad 7db97132097c5973ff77466d0ee681650af653de > # good: [2b7d17d4b7c1ff40f58b0d32be40fc0bb6c582fb] soc: document merges > git bisect good 2b7d17d4b7c1ff40f58b0d32be40fc0bb6c582fb > # good: [69c9668f853fdd409bb8abbb37d615785510b29a] Merge branch 'clk-next' of > git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git > git bisect good 69c9668f853fdd409bb8abbb37d615785510b29a > # bad: [1577f290aa0d4c5b29c03c46ef52e4952a21bfbb] Merge branch 'for-next' of > git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git > git bisect bad 
1577f290aa0d4c5b29c03c46ef52e4952a21bfbb > # good: [34f0971f8ca73d7e5502b4cf299788a9402120f7] powerpc/powernv/flash: > Check OPAL flash calls exist before using > git bisect good 34f0971f8ca73d7e5502b4cf299788a9402120f7 > # good: [0349d7dfc70a26b3facd8ca97de34980d4b60954] Merge branch 'mips-next' > of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git > git bisect good 0349d7dfc70a26b3facd8ca97de34980d4b60954 > # bad: [20bfb54d3b121699674c17a854c5ebc7a8f97d81] Merge branch 'for-next' of > git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git > git bisect bad 20bfb54d3b121699674c17a854c5ebc7a8f97d81 > # bad: [9be8459298eadb39b9fe9974b890239e9c123107] riscv: compat: Add COMPAT > Kbuild skeletal support > git bisect bad 9be8459298eadb39b9fe9974b890239e9c123107 > # good: [01abdfeac81b5f56062d0a78f2cdc805db937a75] riscv: compat: Support > TASK_SIZE for compat mode > git bisect good 01abdfeac81b5f56062d0a78f2cdc805db937a75 > # good: [f4b395e6f1a588ed6c9a30474e58cf6b27b65783]
Re: [PATCH Linux] powerpc: add documentation for HWCAPs
Hi! On Tue, May 24, 2022 at 07:38:28PM +1000, Nicholas Piggin wrote: > Thanks for all the comments and corrections. It should be nearing the > point where it is useful now. Yes I do think it would be useful to align > this more with OpenPOWER docs (and possibly eventually move it into the > ABI, given that's the allocator of these numbers) but that's not > done yet. The auxiliary vector is a Linux/glibc thing, it should not be described in more generic ABI documents. It is fine where you have it now afaics. > +Where software relies on a feature described by a HWCAP, it should check the > +relevant HWCAP flag to verify that the feature is present before attempting > to > +make use of the feature. > + > +Features should not be probed through other means. When a feature is not > +available, attempting to use it may result in unpredictable behaviour, and > +may not be guaranteed to result in any reliable indication that the feature > +is unavailable. Traditionally VMX was tested for by simply executing an instruction and catching SIGILL. This is portable even. This has worked fine for over two decades, it's a bit weird to declare this a forbidden practice now :-) It certainly isn't recommended for more complex and/or newer things. > +verstions. (typo. spellcheck maybe?) Segher
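As an illustration of the HWCAP-based check that the quoted documentation recommends (as opposed to executing an instruction and catching SIGILL), here is a minimal sketch using `getauxval()`. The helper names `hwcap_has` and `cpu_has_altivec` are hypothetical. The fallback definition of `PPC_FEATURE_HAS_ALTIVEC` mirrors the value in the powerpc uapi headers and is only there so the sketch also builds on non-powerpc hosts, where the result is of course meaningless.

```c
#include <stdbool.h>
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */

/* Fallback for hosts whose headers do not provide the powerpc bit. */
#ifndef PPC_FEATURE_HAS_ALTIVEC
#define PPC_FEATURE_HAS_ALTIVEC 0x10000000UL
#endif

/* Hypothetical helper: true if every bit in 'feature' is set in 'hwcap'. */
bool hwcap_has(unsigned long hwcap, unsigned long feature)
{
	return (hwcap & feature) == feature;
}

/* Hypothetical helper: ask the kernel-provided HWCAP word about VMX. */
bool cpu_has_altivec(void)
{
	return hwcap_has(getauxval(AT_HWCAP), PPC_FEATURE_HAS_ALTIVEC);
}
```

Code that wants to use VMX would call `cpu_has_altivec()` once at startup and select a scalar fallback when it returns false.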
Re: [PATCH v7 21/25] Kbuild: add Rust support
On Mon, May 23, 2022 at 8:45 PM Nick Desaulniers wrote: > > I'm super not into having the rust optimization level differ from the > C optimization level. This is just someone having too much fun > wrapping every compiler flag in a kbuild option. Either folks wan't I mean, `Makefile`s are not my favorite pastime... :) > smaller size or more optimizations. Allowing for RUST_OPT_LEVEL_S and > CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE or RUST_OPT_LEVEL_3 and > CONFIG_CC_OPTIMIZE_FOR_SIZE is just wacky nonsense that's going to > make randconfig bug reports more confusing to tease out. I think what is important is to decide whether extra levels, for C and Rust, should be kept compile-able/maintained or not (I also replied in the `-O1` for C thread [1]). Note that the Rust side can be compiled as `-O0` or `-O1` at the moment, which is something we do not have for the C side; thus having only the C == Rust option means we will not have a configuration with those anymore. For me it is less complex to not have them, and I have not heard more opinions on this, either for or against (apart from that thread suggesting `-O1` for the C side), so if nobody else chimes in, I will remove them. [1] https://lore.kernel.org/lkml/caniq72kysvvoq7eqwe0jzz3v0jttrcqodhr9ty4-sfdmdzp...@mail.gmail.com/ Cheers, Miguel
Re: [PATCH 24/30] panic: Refactor the panic path
"Guilherme G. Piccoli" writes: > The panic() function is somewhat convoluted - a lot of changes were > made over the years, adding comments that might be misleading/outdated > now, it has a code structure that is a bit complex to follow, with > lots of conditionals, for example. The panic notifier list is something > else - a single list, with multiple callbacks of different purposes, > that run in a non-deterministic order and may affect hardly kdump > reliability - see the "crash_kexec_post_notifiers" workaround-ish flag. > > This patch proposes a major refactor on the panic path based on Petr's > idea [0] - basically we split the notifiers list in three, having a set > of different call points in the panic path. Below a list of changes > proposed in this patch, culminating in the panic notifiers level > concept: > > (a) First of all, we improved comments all over the function > and removed useless variables / includes. Also, as part of this > clean-up we concentrate the console flushing functions in a helper. > > (b) As mentioned before, there is a split of the panic notifier list > in three, based on the purpose of the callback. The code contains > good documentation in form of comments, but a summary of the three > lists follows: > > - the hypervisor list aims low-risk procedures to inform hypervisors > or firmware about the panic event, also includes LED-related functions; > > - the informational list contains callbacks that provide more details, > like kernel offset or trace dump (if enabled) and also includes the > callbacks aimed at reducing log pollution or warns, like the RCU and > hung task disable callbacks; > > - finally, the pre_reboot list is the old notifier list renamed, > containing the more risky callbacks that didn't fit the previous > lists. There is also a 4th list (the post_reboot one), but it's not > related with the original list - it contains late time architecture > callbacks aimed at stopping the machine, for example. 
> > The 3 notifiers lists execute in different moments, hypervisor being > the first, followed by informational and finally the pre_reboot list. > > (c) But then, there is the ordering problem of the notifiers against > the crash_kernel() call - kdump must be as reliable as possible. > For that, a simple binary "switch" as "crash_kexec_post_notifiers" > is not enough, hence we introduce here concept of panic notifier > levels: there are 5 levels, from 0 (no notifier executes before > kdump) until 4 (all notifiers run before kdump); the default level > is 2, in which the hypervisor and (iff we have any kmsg dumper) > the informational notifiers execute before kdump. > > The detailed documentation of the levels is present in code comments > and in the kernel-parameters.txt file; as an analogy with the previous > panic() implementation, the level 0 is exactly the same as the old > behavior of notifiers, running all after kdump, and the level 4 is > the same as "crash_kexec_post_notifiers=Y" (we kept this parameter as > a deprecated one). > > (d) Finally, an important change made here: we now use only the > function "crash_smp_send_stop()" to shut all the secondary CPUs > in the panic path. Before, there was a case of using the regular > "smp_send_stop()", but the better approach is to simplify the > code and try to use the function which was created exclusively > for the panic path. Experiments showed that it works fine, and > code was very simplified with that. > > Functional change is expected from this refactor, since now we > call some notifiers by default before kdump, but the goal here > besides code clean-up is to have a better panic path, more > reliable and deterministic, but also very customizable. > > [0] https://lore.kernel.org/lkml/YfPxvzSzDLjO5ldp@alley/ I am late to this discussion. My apologies. Unfortunately I am also very grouchy. Notifiers before kexec on panic are fundamentally broken. 
So please just remove crash_kexec_post notifiers and be done with it. Part of the deep issue is that firmware always has a common and broken implementation for anything that is not mission critical to motherboards. Notifiers in any sense on these paths are just bollocks. Any kind of notifier list is fundamentally fragile in the face of memory corruption and very very difficult to review. So I am going to refresh my ancient NACK on this. I can certainly appreciate that there are pieces of the reboot paths that can be improved. I don't think making anything more feature full or flexible is any kind of real improvement. Eric > > Suggested-by: Petr Mladek > Signed-off-by: Guilherme G. Piccoli > --- > > Special thanks to Petr and Baoquan for the suggestion and feedback in a > previous > email thread. There's some important design decisions that worth mentioning > and > discussing: > > * The default panic notifiers level is 2, based on Petr Mladek's suggestion, > which makes a lot of sense. Of course, this is customizable through the > parameter, but
Re: [PATCH V9 20/20] riscv: compat: Add COMPAT Kbuild skeletal support
On Tuesday, 24 May 2022, 01:00:39 CEST, Guenter Roeck wrote: > On Tue, May 24, 2022 at 12:40:06AM +0200, Heiko Stübner wrote: > > Hi Guenter, > > > > On Monday, 23 May 2022, 18:18:47 CEST, Guenter Roeck wrote: > > > On 5/23/22 08:18, Guo Ren wrote: > > > > I tested Palmer's branch, it's okay: > > > > 8810d7feee5a (HEAD -> for-next, palmer/for-next) riscv: Don't output a > > > > bogus mmu-type on a no MMU kernel > > > > > > > > I also tested linux-next, it's okay: > > > > > > > > rv64_rootfs: > > > > # uname -a > > > > Linux buildroot 5.18.0-next-20220523 #7 SMP Mon May 23 11:15:17 EDT > > > > 2022 riscv64 GNU/Linux > > > > # > > > > > > That it is ok with one setup doesn't mean it is ok with > > > all setups. It is not ok with my root file system (from > > > https://github.com/groeck/linux-build-test/tree/master/rootfs/riscv64), > > > with qemu v6.2. > > > > That is very true that it shouldn't fail on any existing (qemu-)platform, > > but as I remember also testing Guo's series on both riscv32 and riscv64 > > qemu platforms in the past, I guess it would be really helpful to get more > > information about the failing platform you're experiencing so that we can > > find the source of the issue. > > > > As it looks like you both tested the same kernel source, I guess the only > > differences could be in the qemu-version, kernel config and rootfs. > > Is your rootfs something you can share or that can be rebuilt easily? > > > I provided a link to my root file system above. The link points to two > root file systems, for initrd (cpio archive) and for ext2. > I also mentioned above that I used qemu v6.2. And below I said > > > My root file system uses musl. > > I attached the buildroot configuration below. The buildroot version, > if I remember correctly, was 2021.02. > > Kernel configuration is basically defconfig. However, I do see one > detail that is possibly special in my configuration. 
> > # The latest kernel assumes SBI version 0.3, but that doesn't match qemu > # at least up to version 6.2 and results in hangup/crashes during reboot > # with sifive_u emulations. > enable_config "${defconfig}" CONFIG_RISCV_SBI_V01 > > Hope that helps, Actually it doesn't seem rootfs-specific at all. Merged was this v9, but the version I last tested was one of the earlier ones, so it looks like something really broke meanwhile. I tested both linux-next-20220524 and palmer's for-next on a very recent qemu - master from april if I remember correctly together with a Debian-based rootfs mounted as nfsroot inside qemu. With CONFIG_COMPAT=y (aka using defconfig directly) I get: [ 12.957612] VFS: Mounted root (nfs filesystem) on device 0:15. [ 12.967260] devtmpfs: mounted [ 13.101186] Freeing unused kernel image (initmem) memory: 2168K [ 13.110914] Run /sbin/init as init process [ 13.343810] Unable to handle kernel paging request at virtual address ff60007265776f78 [ 13.347271] Oops [#1] [ 13.347749] Modules linked in: [ 13.348689] CPU: 0 PID: 1 Comm: init Not tainted 5.18.0-next-20220524 #1 [ 13.349864] Hardware name: riscv-virtio,qemu (DT) [ 13.350706] epc : special_mapping_fault+0x4c/0x8e [ 13.351639] ra : __do_fault+0x28/0x11c [ 13.352311] epc : 801210e6 ra : 8011712a sp : ff200060bd10 [ 13.353276] gp : 810df030 tp : ff60012a8000 t0 : 80008acc [ 13.354063] t1 : 80c001d8 t2 : 80c00258 s0 : ff200060bd20 [ 13.354886] s1 : ff200060bd68 a0 : ff60013165f0 a1 : ff6001ec2450 [ 13.355675] a2 : ff200060bd68 a3 : a4 : ff603f0337c8 [ 13.356822] a5 : ff60007265776f70 a6 : ff6001ec2450 a7 : 0007 [ 13.357689] s2 : ff6001ec2450 s3 : ff6001ec2450 s4 : ff200060bd68 [ 13.358487] s5 : ff6001ec2450 s6 : 0254 s7 : 000c [ 13.359305] s8 : 000f s9 : 000d s10: ff6001e4c080 [ 13.360102] s11: 000d t3 : 00ffbbeab000 t4 : 6dff [ 13.361557] t5 : 6e35 t6 : 000a [ 13.362229] status: 00020120 badaddr: ff60007265776f78 cause: 000d [ 13.363504] [] __do_fault+0x28/0x11c [ 13.364464] [] 
__handle_mm_fault+0x71c/0x9ea [ 13.365577] [] handle_mm_fault+0x82/0x136 [ 13.366275] [] do_page_fault+0x120/0x31c [ 13.366906] [] ret_from_exception+0x0/0xc [ 13.368763] ---[ end trace ]--- [ 13.368763] ---[ end trace ]--- [ 13.369598] Kernel panic - not syncing: Attempted to
Re: [PATCH] powerpc/papr_scm: don't requests stats with '0' sized stats buffer
> On 24-May-2022, at 4:53 PM, Vaibhav Jain wrote: > > Sachin reported [1] that on a POWER-10 lpar he is seeing a kernel panic being > reported with vPMEM when papr_scm probe is being called. The panic is of the > form below and is observed only with following option disabled(profile) for > the > said LPAR 'Enable Performance Information Collection' in the HMC: > > Kernel attempted to write user page (1c) - exploit attempt? (uid: 0) > BUG: Kernel NULL pointer dereference on write at 0x001c > Faulting instruction address: 0xc00801b90844 > Oops: Kernel access of bad area, sig: 11 [#1] > > NIP [c00801b90844] drc_pmem_query_stats+0x5c/0x270 [papr_scm] > LR [c00801b92794] papr_scm_probe+0x2ac/0x6ec [papr_scm] > Call Trace: > 0xc941bca0 (unreliable) > papr_scm_probe+0x2ac/0x6ec [papr_scm] > platform_probe+0x98/0x150 > really_probe+0xfc/0x510 > __driver_probe_device+0x17c/0x230 > > ---[ end trace ]--- > Kernel panic - not syncing: Fatal exception > > On investigation looks like this panic was caused due to a 'stat_buffer' of > size==0 being provided to drc_pmem_query_stats() to fetch all performance > stats-ids of an NVDIMM. However drc_pmem_query_stats() shouldn't have been > called > since the vPMEM NVDIMM doesn't support and performance stat-id's. This was > caused > due to missing check for 'p->stat_buffer_len' at the beginning of > papr_scm_pmu_check_events() which indicates that the NVDIMM doesn't support > performance-stats. > > Fix this by introducing the check for 'p->stat_buffer_len' at the beginning of > papr_scm_pmu_check_events(). > > [1] > https://lore.kernel.org/all/6b3a522a-6a5f-4cc9-b268-0c63aa6e0...@linux.ibm.com > > Fixes: 0e0946e22f3665d2732 ("powerpc/papr_scm: Fix leaking nvdimm_events_map > elements") > Reported-by: Sachin Sant > Signed-off-by: Vaibhav Jain > --- Thanks Vaibhav for the patch. With the patch the reported problem is fixed. Tested-by: Sachin Sant -Sachin
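The fix described in the quoted changelog is an early guard: do not query statistics when the stats buffer length is zero. The sketch below shows only that pattern. The struct, the function name, and the `-ENOENT` return value are hypothetical stand-ins chosen for illustration; they are not copied from the papr_scm driver.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's private data. */
struct scm_priv_sketch {
	size_t stat_buffer_len;	/* 0 means the NVDIMM exposes no perf stats */
};

/*
 * Hypothetical version of the event check: bail out before any stats
 * query when there is no buffer to receive the results. The error code
 * is an assumption made for this sketch.
 */
int pmu_check_events_sketch(const struct scm_priv_sketch *p)
{
	if (p->stat_buffer_len == 0)
		return -ENOENT;

	/* ... the stats query would only be issued from here on ... */
	return 0;
}
```

Placing the check at the very top of the function keeps the zero-length case from ever reaching code that writes through the buffer pointer, which is where the reported NULL dereference occurred.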
Re: [RFC PATCH 4/4] objtool/powerpc: Add --mcount specific implementation
On 24/05/2022 at 13:00, Sathvika Vasireddy wrote: > On 24/05/22 15:05, Christophe Leroy wrote: >> >> On 23/05/2022 at 19:55, Sathvika Vasireddy wrote: >>> This patch enables objtool --mcount on powerpc, and >>> adds implementation specific to powerpc. >>> >>> Signed-off-by: Sathvika Vasireddy >>> --- >>> arch/powerpc/Kconfig | 1 + >>> tools/objtool/arch/powerpc/decode.c | 14 ++ >>> tools/objtool/check.c | 12 +++- >>> tools/objtool/elf.c | 13 + >>> tools/objtool/include/objtool/elf.h | 1 + >>> 5 files changed, 36 insertions(+), 5 deletions(-) >>> >>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >>> index 732a3f91ee5e..3373d44a1298 100644 >>> --- a/arch/powerpc/Kconfig >>> +++ b/arch/powerpc/Kconfig >>> @@ -233,6 +233,7 @@ config PPC >>> select HAVE_NMI if PERF_EVENTS || (PPC64 >>> && PPC_BOOK3S) >>> select HAVE_OPTPROBES >>> select HAVE_OBJTOOL if PPC64 >>> + select HAVE_OBJTOOL_MCOUNT if HAVE_OBJTOOL >>> select HAVE_PERF_EVENTS >>> select HAVE_PERF_EVENTS_NMI if PPC64 >>> select HAVE_PERF_REGS >>> diff --git a/tools/objtool/arch/powerpc/decode.c >>> b/tools/objtool/arch/powerpc/decode.c >>> index e3b77a6ce357..ad3d79fffac2 100644 >>> --- a/tools/objtool/arch/powerpc/decode.c >>> +++ b/tools/objtool/arch/powerpc/decode.c >>> @@ -40,12 +40,26 @@ int arch_decode_instruction(struct objtool_file >>> *file, const struct section *sec >>> struct list_head *ops_list) >>> { >>> u32 insn; >>> + unsigned int opcode; >>> >>> *immediate = 0; >>> memcpy(&insn, sec->data->d_buf+offset, 4); >>> *len = 4; >>> *type = INSN_OTHER; >>> >>> + opcode = (insn >> 26); >> You don't need the parentheses here. 
>> >>> + >>> + switch (opcode) { >>> + case 18: /* bl */ >>> + if ((insn & 3) == 1) { >>> + *type = INSN_CALL; >>> + *immediate = insn & 0x3fc; >>> + if (*immediate & 0x200) >>> + *immediate -= 0x400; >>> + } >>> + break; >>> + } >>> + >>> return 0; >>> } >>> >>> diff --git a/tools/objtool/check.c b/tools/objtool/check.c >>> index 056302d58e23..fd8bad092f89 100644 >>> --- a/tools/objtool/check.c >>> +++ b/tools/objtool/check.c >>> @@ -832,7 +832,7 @@ static int create_mcount_loc_sections(struct >>> objtool_file *file) >>> >>> if (elf_add_reloc_to_insn(file->elf, sec, >>> idx * sizeof(unsigned long), >>> - R_X86_64_64, >>> + elf_reloc_type_long(file->elf), >>> insn->sec, insn->offset)) >>> return -1; >>> >>> @@ -2183,7 +2183,7 @@ static int classify_symbols(struct objtool_file >>> *file) >>> if (arch_is_retpoline(func)) >>> func->retpoline_thunk = true; >>> >>> - if (!strcmp(func->name, "__fentry__")) >>> + if ((!strcmp(func->name, "__fentry__")) || >>> (!strcmp(func->name, "_mcount"))) >>> func->fentry = true; >>> >>> if (is_profiling_func(func->name)) >>> @@ -2259,9 +2259,11 @@ static int decode_sections(struct objtool_file >>> *file) >>> * Must be before add_jump_destinations(), which depends on 'func' >>> * being set for alternatives, to enable proper sibling call >>> detection. >>> */ >>> - ret = add_special_section_alts(file); >>> - if (ret) >>> - return ret; >>> + if (opts.stackval || opts.orc || opts.uaccess || opts.noinstr) { >>> + ret = add_special_section_alts(file); >>> + if (ret) >>> + return ret; >>> + } >> I think this change should be a patch by itself, it's not related to >> powerpc. > Makes sense. I'll make this a separate patch in the next revision. Great. Can you base your next revision on the one I just sent out ? I will now start looking at inline static calls for PPC32. 
>> >>> >>> ret = add_jump_destinations(file); >>> if (ret) >>> diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c >>> index c25e957c1e52..95763060d551 100644 >>> --- a/tools/objtool/elf.c >>> +++ b/tools/objtool/elf.c >>> @@ -793,6 +793,19 @@ elf_create_section_symbol(struct elf *elf, >>> struct section *sec) >>> return sym; >>> } >>> >>> +int elf_reloc_type_long(struct elf *elf) >> Not sure it's a good name, because for 32 bits we have to use 'int'. > Sure, I'll rename it to elf_reloc_type() or some such.
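For readers following the `bl` decoding in the quoted arch_decode_instruction() hunk, here is a minimal standalone sketch of the same idea. It is not the patch's code: `decode_bl()` is a hypothetical helper, and the masks are written out for the full 24-bit LI field of the Power ISA I-form branch encoding (6-bit primary opcode, 24-bit LI, AA bit, LK bit), which is an assumption about what the constants in the quoted diff are meant to express.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decode a PowerPC I-form branch. Returns true and stores the signed byte
 * displacement in *disp when the word is a relative "bl" (AA=0, LK=1).
 * Hypothetical helper, for illustration only.
 */
static bool decode_bl(uint32_t insn, int32_t *disp)
{
	uint32_t opcode = insn >> 26;	/* primary opcode: top 6 bits */
	int32_t imm;

	if (opcode != 18 || (insn & 3) != 1)
		return false;

	imm = insn & 0x03fffffc;	/* LI field followed by two zero bits */
	if (imm & 0x02000000)		/* sign bit of the 26-bit displacement */
		imm -= 0x04000000;

	*disp = imm;
	return true;
}
```

For example, the word 0x48000005 encodes `bl` with a displacement of +4 bytes and 0x4bfffffd encodes `bl` with a displacement of -4 bytes, while 0x48000004 is a plain `b` (LK=0) and is rejected.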
Re: [RFC PATCH 2/4] objtool: Enable objtool to run only on files with ftrace enabled
Hi Sathvika

On 24/05/2022 at 12:53, Sathvika Vasireddy wrote:
> [You don't often get email from s...@linux.vnet.ibm.com. Learn why this is
> important at https://aka.ms/LearnAboutSenderIdentification.]
>
> Hi Christophe,
>
> On 24/05/22 14:27, Christophe Leroy wrote:
>>
>> On 23/05/2022 at 19:55, Sathvika Vasireddy wrote:
>>> This patch makes sure objtool runs only on the object files
>>> that have ftrace enabled, instead of running on all the object
>>> files.
>> Why do that ?
> This was done to address the issue discussed here:
> https://lore.kernel.org/all/b06bb9bc-22d1-acce-fe68-c7c4cb7c1...@csgroup.eu/

Ah ? Ok. But how does x86 do it at the moment ? Shouldn't we use OBJECT_FILES_NON_STANDARD instead ?

>
>
>>
>> What about static_calls ? There may be files without ftrace but with
>> static calls.
> Yes, this prevents objtool from running on those files. We can
> restrict this change to FTRACE_MCOUNT_USE_OBJTOOL
>>
>> By the way, it would be nice if we could use it only on C files.
>> I get the following errors for ASM files:
>>
>> arch/powerpc/kernel/entry_32.o: warning: objtool: .text+0x1b4:
>> unannotated intra-function call
>
> I'm looking into ways to address this.
>

Nice.

Christophe
[RFC PATCH v2 2/7] objtool: Use target file endianness instead of a compiled constant
Some architectures like powerpc support both endianness, it's therefore not possible to fix the endianness via arch/endianness.h because there is no easy way to get the target endianness at build time. Use the endianness recorded in the file objtool is working on. Signed-off-by: Christophe Leroy --- .../arch/x86/include/arch/endianness.h| 9 -- tools/objtool/check.c | 2 +- tools/objtool/include/objtool/endianness.h| 29 +-- tools/objtool/orc_dump.c | 11 +-- tools/objtool/orc_gen.c | 4 +-- tools/objtool/special.c | 3 +- 6 files changed, 27 insertions(+), 31 deletions(-) delete mode 100644 tools/objtool/arch/x86/include/arch/endianness.h diff --git a/tools/objtool/arch/x86/include/arch/endianness.h b/tools/objtool/arch/x86/include/arch/endianness.h deleted file mode 100644 index 7c362527da20.. --- a/tools/objtool/arch/x86/include/arch/endianness.h +++ /dev/null @@ -1,9 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0-or-later */ -#ifndef _ARCH_ENDIANNESS_H -#define _ARCH_ENDIANNESS_H - -#include - -#define __TARGET_BYTE_ORDER __LITTLE_ENDIAN - -#endif /* _ARCH_ENDIANNESS_H */ diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 6cb07e151588..cef1dd54d505 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1971,7 +1971,7 @@ static int read_unwind_hints(struct objtool_file *file) return -1; } - cfi.cfa.offset = bswap_if_needed(hint->sp_offset); + cfi.cfa.offset = bswap_if_needed(file->elf, hint->sp_offset); cfi.type = hint->type; cfi.end = hint->end; diff --git a/tools/objtool/include/objtool/endianness.h b/tools/objtool/include/objtool/endianness.h index 10241341eff3..ab0515ba0538 100644 --- a/tools/objtool/include/objtool/endianness.h +++ b/tools/objtool/include/objtool/endianness.h @@ -2,33 +2,30 @@ #ifndef _OBJTOOL_ENDIANNESS_H #define _OBJTOOL_ENDIANNESS_H -#include #include #include - -#ifndef __TARGET_BYTE_ORDER -#error undefined arch __TARGET_BYTE_ORDER -#endif - -#if __BYTE_ORDER != __TARGET_BYTE_ORDER -#define __NEED_BSWAP 1 -#else 
-#define __NEED_BSWAP 0 -#endif +#include /* - * Does a byte swap if target endianness doesn't match the host, i.e. cross + * Does a byte swap if target file endianness doesn't match the host, i.e. cross * compilation for little endian on big endian and vice versa. * To be used for multi-byte values conversion, which are read from / about * to be written to a target native endianness ELF file. */ -#define bswap_if_needed(val) \ +static inline bool need_bswap(struct elf *elf) +{ + return (__BYTE_ORDER == __LITTLE_ENDIAN) ^ + (elf->ehdr.e_ident[EI_DATA] == ELFDATA2LSB); +} + +#define bswap_if_needed(elf, val) \ ({ \ __typeof__(val) __ret; \ + bool __need_bswap = need_bswap(elf);\ switch (sizeof(val)) { \ - case 8: __ret = __NEED_BSWAP ? bswap_64(val) : (val); break;\ - case 4: __ret = __NEED_BSWAP ? bswap_32(val) : (val); break;\ - case 2: __ret = __NEED_BSWAP ? bswap_16(val) : (val); break;\ + case 8: __ret = __need_bswap ? bswap_64(val) : (val); break;\ + case 4: __ret = __need_bswap ? bswap_32(val) : (val); break;\ + case 2: __ret = __need_bswap ? bswap_16(val) : (val); break;\ default:\ BUILD_BUG(); break; \ } \ diff --git a/tools/objtool/orc_dump.c b/tools/objtool/orc_dump.c index f5a8508c42d6..4f1211fec82c 100644 --- a/tools/objtool/orc_dump.c +++ b/tools/objtool/orc_dump.c @@ -76,6 +76,7 @@ int orc_dump(const char *_objname) GElf_Rela rela; GElf_Sym sym; Elf_Data *data, *symtab = NULL, *rela_orc_ip = NULL; + struct elf dummy_elf = {}; objname = _objname; @@ -94,6 +95,12 @@ int orc_dump(const char *_objname) return -1; } + if (!elf64_getehdr(elf)) { + WARN_ELF("elf64_getehdr"); + return -1; + } + memcpy(&dummy_elf.ehdr, elf64_getehdr(elf), sizeof(dummy_elf.ehdr)); + if (elf_getshdrnum(elf, &nr_sections)) { WARN_ELF("elf_getshdrnum"); return -1; @@ -198,11 +205,11 @@ int orc_dump(const char *_objname) printf(" sp:"); - print_reg(orc[i].sp_reg, bswap_if_needed(orc[i].sp_offset)); + print_reg(orc[i].sp_reg, bswap_if_needed(&dummy_elf, orc[i]
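The approach in this patch, deciding at run time whether to byte-swap from the ELF file's own header rather than from a compile-time constant, can be sketched outside objtool as follows. This is a hedged illustration, not the patch itself: the helper names are hypothetical, the local constants mirror the ELF specification's EI_DATA (5), ELFDATA2LSB (1) and ELFDATA2MSB (2), and the host byte order is probed at run time instead of using `__BYTE_ORDER`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of ELF ident constants (EI_DATA, ELFDATA2LSB, ELFDATA2MSB). */
enum { ELF_IDENT_DATA = 5, ELF_DATA_LSB = 1, ELF_DATA_MSB = 2 };

/* True when the host stores the least significant byte first. */
static bool host_is_little_endian(void)
{
	const uint16_t probe = 1;

	return *(const uint8_t *)&probe == 1;
}

/* Compare the byte order recorded in the ELF ident bytes with the host's. */
static bool need_bswap(const unsigned char *e_ident)
{
	return host_is_little_endian() != (e_ident[ELF_IDENT_DATA] == ELF_DATA_LSB);
}

/* Swap a 32-bit value only when file and host byte order differ. */
static uint32_t file32_to_host(const unsigned char *e_ident, uint32_t val)
{
	if (!need_bswap(e_ident))
		return val;

	return ((val & 0x000000ffu) << 24) | ((val & 0x0000ff00u) << 8) |
	       ((val & 0x00ff0000u) >> 8)  | ((val & 0xff000000u) >> 24);
}
```

Because the decision depends only on the ident bytes passed in, the same binary can process both little-endian and big-endian object files, which is the point of replacing the per-architecture constant.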
[RFC PATCH v2 6/7] objtool/powerpc: Enable objtool to be built on ppc
From: Sathvika Vasireddy

This patch adds [stub] implementations for required functions, in order to
enable objtool build on powerpc.

Signed-off-by: Sathvika Vasireddy
Signed-off-by: Christophe Leroy
---
 arch/powerpc/Kconfig                          |  1 +
 tools/objtool/arch/powerpc/Build              |  2 +
 tools/objtool/arch/powerpc/decode.c           | 74 +++
 .../arch/powerpc/include/arch/cfi_regs.h      | 11 +++
 tools/objtool/arch/powerpc/include/arch/elf.h |  8 ++
 .../arch/powerpc/include/arch/special.h       | 21 ++
 tools/objtool/arch/powerpc/special.c          | 19 +
 7 files changed, 136 insertions(+)
 create mode 100644 tools/objtool/arch/powerpc/Build
 create mode 100644 tools/objtool/arch/powerpc/decode.c
 create mode 100644 tools/objtool/arch/powerpc/include/arch/cfi_regs.h
 create mode 100644 tools/objtool/arch/powerpc/include/arch/elf.h
 create mode 100644 tools/objtool/arch/powerpc/include/arch/special.h
 create mode 100644 tools/objtool/arch/powerpc/special.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 174edabb74fa..7c01229dd2e3 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -232,6 +232,7 @@ config PPC
 	select HAVE_MOD_ARCH_SPECIFIC
 	select HAVE_NMI if PERF_EVENTS || (PPC64 && PPC_BOOK3S)
 	select HAVE_OPTPROBES
+	select HAVE_OBJTOOL
 	select HAVE_PERF_EVENTS
 	select HAVE_PERF_EVENTS_NMI if PPC64
 	select HAVE_PERF_REGS
diff --git a/tools/objtool/arch/powerpc/Build b/tools/objtool/arch/powerpc/Build
new file mode 100644
index ..d24d5636a5b8
--- /dev/null
+++ b/tools/objtool/arch/powerpc/Build
@@ -0,0 +1,2 @@
+objtool-y += decode.o
+objtool-y += special.o
diff --git a/tools/objtool/arch/powerpc/decode.c b/tools/objtool/arch/powerpc/decode.c
new file mode 100644
index ..eb1542d155fe
--- /dev/null
+++ b/tools/objtool/arch/powerpc/decode.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+unsigned long arch_dest_reloc_offset(int addend)
+{
+	return addend;
+}
+
+bool arch_callee_saved_reg(unsigned char reg)
+{ + return false; +} + +int arch_decode_hint_reg(u8 sp_reg, int *base) +{ + return 0; +} + +const char *arch_nop_insn(int len) +{ + return NULL; +} + +const char *arch_ret_insn(int len) +{ + return NULL; +} + +int arch_decode_instruction(struct objtool_file *file, const struct section *sec, + unsigned long offset, unsigned int maxlen, + unsigned int *len, enum insn_type *type, + unsigned long *immediate, + struct list_head *ops_list) +{ + u32 insn; + + *immediate = 0; + insn = bswap_if_needed(file->elf, *(u32 *)(sec->data->d_buf + offset)); + *len = 4; + *type = INSN_OTHER; + + return 0; +} + +unsigned long arch_jump_destination(struct instruction *insn) +{ + return insn->offset + insn->immediate; +} + +void arch_initial_func_cfi_state(struct cfi_init_state *state) +{ + int i; + + for (i = 0; i < CFI_NUM_REGS; i++) { + state->regs[i].base = CFI_UNDEFINED; + state->regs[i].offset = 0; + } + + /* initial CFA (call frame address) */ + state->cfa.base = CFI_SP; + state->cfa.offset = 0; + + /* initial LR (return address) */ + state->regs[CFI_RA].base = CFI_CFA; + state->regs[CFI_RA].offset = 0; +} diff --git a/tools/objtool/arch/powerpc/include/arch/cfi_regs.h b/tools/objtool/arch/powerpc/include/arch/cfi_regs.h new file mode 100644 index ..59638ebeafc8 --- /dev/null +++ b/tools/objtool/arch/powerpc/include/arch/cfi_regs.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ + +#ifndef _OBJTOOL_CFI_REGS_H +#define _OBJTOOL_CFI_REGS_H + +#define CFI_BP 1 +#define CFI_SP CFI_BP +#define CFI_RA 32 +#define CFI_NUM_REGS 33 + +#endif diff --git a/tools/objtool/arch/powerpc/include/arch/elf.h b/tools/objtool/arch/powerpc/include/arch/elf.h new file mode 100644 index ..3c8ebb7d2a6b --- /dev/null +++ b/tools/objtool/arch/powerpc/include/arch/elf.h @@ -0,0 +1,8 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ + +#ifndef _OBJTOOL_ARCH_ELF +#define _OBJTOOL_ARCH_ELF + +#define R_NONE R_PPC_NONE + +#endif /* _OBJTOOL_ARCH_ELF */ diff --git 
a/tools/objtool/arch/powerpc/include/arch/special.h b/tools/objtool/arch/powerpc/include/arch/special.h new file mode 100644 index ..ffef9ada7133 --- /dev/null +++ b/tools/objtool/arch/powerpc/include/arch/special.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +#ifndef _PPC_ARCH_SPECIAL_H +#define _PPC_ARCH_SPECIAL_H + +#define EX_ENTRY_SIZE 8 +#define EX_ORIG_OFFSET 0 +#define EX_NEW_OFFSET 4 + +#define JUMP_ENTRY_SIZE 16 +#define JUMP_ORIG_OFFSET 0 +#define JUMP
[RFC PATCH v2 4/7] objtool: Add --mnop as an option to --mcount
From: Sathvika Vasireddy Architectures can select HAVE_NOP_MCOUNT if they choose to nop out mcount call sites. If that config option is selected, then --mnop is passed as an option to objtool, along with --mcount. Also, make sure that --mnop can be passed as an option to objtool only when --mcount is passed. Signed-off-by: Sathvika Vasireddy Signed-off-by: Christophe Leroy --- Makefile| 4 +++- arch/x86/Kconfig| 1 + scripts/Makefile.build | 1 + tools/objtool/builtin-check.c | 14 ++ tools/objtool/check.c | 19 ++- tools/objtool/include/objtool/builtin.h | 1 + 6 files changed, 30 insertions(+), 10 deletions(-) diff --git a/Makefile b/Makefile index 250707647359..acaf88e3c694 100644 --- a/Makefile +++ b/Makefile @@ -851,7 +851,9 @@ ifdef CONFIG_FTRACE_MCOUNT_USE_CC endif endif ifdef CONFIG_FTRACE_MCOUNT_USE_OBJTOOL - CC_FLAGS_USING += -DCC_USING_NOP_MCOUNT + ifdef CONFIG_HAVE_NOP_MCOUNT +CC_FLAGS_USING += -DCC_USING_NOP_MCOUNT + endif endif ifdef CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT ifdef CONFIG_HAVE_C_RECORDMCOUNT diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 1847d6e974a1..4a41bfb644f0 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -189,6 +189,7 @@ config X86 select HAVE_CONTEXT_TRACKING_OFFSTACK if HAVE_CONTEXT_TRACKING select HAVE_C_RECORDMCOUNT select HAVE_OBJTOOL_MCOUNT if HAVE_OBJTOOL + select HAVE_NOP_MCOUNT if HAVE_OBJTOOL_MCOUNT select HAVE_BUILDTIME_MCOUNT_SORT select HAVE_DEBUG_KMEMLEAK select HAVE_DMA_CONTIGUOUS diff --git a/scripts/Makefile.build b/scripts/Makefile.build index ac8167227bc0..2e0c3f9c1459 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -231,6 +231,7 @@ objtool_args = \ $(if $(CONFIG_HAVE_NOINSTR_HACK), --hacks=noinstr) \ $(if $(CONFIG_X86_KERNEL_IBT), --ibt) \ $(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL), --mcount) \ + $(if $(CONFIG_HAVE_NOP_MCOUNT), --mnop) \ $(if $(CONFIG_UNWINDER_ORC), --orc) \ $(if $(CONFIG_RETPOLINE), --retpoline) \ $(if $(CONFIG_SLS), --sls) \ diff --git a/tools/objtool/builtin-check.c 
b/tools/objtool/builtin-check.c index f4c3a5091737..b05e2108c0c3 100644 --- a/tools/objtool/builtin-check.c +++ b/tools/objtool/builtin-check.c @@ -80,6 +80,7 @@ const struct option check_options[] = { OPT_BOOLEAN(0, "dry-run", &opts.dryrun, "don't write modifications"), OPT_BOOLEAN(0, "link", &opts.link, "object is a linked object"), OPT_BOOLEAN(0, "module", &opts.module, "object is part of a kernel module"), + OPT_BOOLEAN(0, "mnop", &opts.mnop, "nop out mcount call sites"), OPT_BOOLEAN(0, "no-unreachable", &opts.no_unreachable, "skip 'unreachable instruction' warnings"), OPT_BOOLEAN(0, "sec-address", &opts.sec_address, "print section addresses in warnings"), OPT_BOOLEAN(0, "stats", &opts.stats, "print statistics"), @@ -142,6 +143,16 @@ static bool opts_valid(void) return false; } +static bool mnop_opts_valid(void) +{ + if (opts.mnop && !opts.mcount) { + ERROR("--mnop requires --mcount"); + return false; + } + + return true; +} + static bool link_opts_valid(struct objtool_file *file) { if (opts.link) @@ -185,6 +196,9 @@ int objtool_run(int argc, const char **argv) if (!file) return 1; + if (!mnop_opts_valid()) + return 1; + if (!link_opts_valid(file)) return 1; diff --git a/tools/objtool/check.c b/tools/objtool/check.c index fabc0ea88747..7f0dc504dd92 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1177,17 +1177,18 @@ static void annotate_call_site(struct objtool_file *file, if (opts.mcount && sym->fentry) { if (sibling) WARN_FUNC("Tail call to __fentry__ !?!?", insn->sec, insn->offset); + if (opts.mnop) { + if (reloc) { + reloc->type = R_NONE; + elf_write_reloc(file->elf, reloc); + } - if (reloc) { - reloc->type = R_NONE; - elf_write_reloc(file->elf, reloc); - } - - elf_write_insn(file->elf, insn->sec, - insn->offset, insn->len, - arch_nop_insn(insn->len)); + elf_write_insn(file->elf, insn->sec, + insn->offset, insn->len, +
[RFC PATCH v2 0/7] objtool: Enable and implement --mcount option on powerpc
This draft series adds PPC32 support to Sathvika's series.
Verified on pmac32 on QEMU.

It should in principle also work for PPC64 BE, but for the time being
something goes wrong. In the beginning I had a segfault, hence the first
patch. But I still get no mcount section in the files.

Christophe Leroy (3):
  objtool: Fix SEGFAULT
  objtool: Use target file endianness instead of a compiled constant
  objtool: Use target file class size instead of a compiled constant

Sathvika Vasireddy (4):
  objtool: Add --mnop as an option to --mcount
  objtool: Enable objtool to run only on files with ftrace enabled
  objtool/powerpc: Enable objtool to be built on ppc
  objtool/powerpc: Add --mcount specific implementation

 Makefile                                      |  4 +-
 arch/powerpc/Kconfig                          |  2 +
 arch/x86/Kconfig                              |  1 +
 scripts/Makefile.build                        |  5 +-
 tools/objtool/arch/powerpc/Build              |  2 +
 tools/objtool/arch/powerpc/decode.c           | 88 +++
 .../arch/powerpc/include/arch/cfi_regs.h      | 11 +++
 tools/objtool/arch/powerpc/include/arch/elf.h |  8 ++
 .../arch/powerpc/include/arch/special.h       | 21 +
 tools/objtool/arch/powerpc/special.c          | 19
 .../arch/x86/include/arch/endianness.h        |  9 --
 tools/objtool/builtin-check.c                 | 14 +++
 tools/objtool/check.c                         | 51 ++-
 tools/objtool/elf.c                           | 23 -
 tools/objtool/include/objtool/builtin.h       |  1 +
 tools/objtool/include/objtool/elf.h           |  9 ++
 tools/objtool/include/objtool/endianness.h    | 29 +++---
 tools/objtool/orc_dump.c                      | 11 ++-
 tools/objtool/orc_gen.c                       |  4 +-
 tools/objtool/special.c                       |  3 +-
 20 files changed, 257 insertions(+), 58 deletions(-)
 create mode 100644 tools/objtool/arch/powerpc/Build
 create mode 100644 tools/objtool/arch/powerpc/decode.c
 create mode 100644 tools/objtool/arch/powerpc/include/arch/cfi_regs.h
 create mode 100644 tools/objtool/arch/powerpc/include/arch/elf.h
 create mode 100644 tools/objtool/arch/powerpc/include/arch/special.h
 create mode 100644 tools/objtool/arch/powerpc/special.c
 delete mode 100644 tools/objtool/arch/x86/include/arch/endianness.h

-- 
2.35.3
[RFC PATCH v2 7/7] objtool/powerpc: Add --mcount specific implementation
From: Sathvika Vasireddy This patch enables objtool --mcount on powerpc, and adds implementation specific to powerpc. Signed-off-by: Sathvika Vasireddy Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig| 1 + tools/objtool/arch/powerpc/decode.c | 14 ++ tools/objtool/check.c | 12 +++- tools/objtool/elf.c | 15 +++ tools/objtool/include/objtool/elf.h | 1 + 5 files changed, 38 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 7c01229dd2e3..5ef8bf8eb202 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -233,6 +233,7 @@ config PPC select HAVE_NMI if PERF_EVENTS || (PPC64 && PPC_BOOK3S) select HAVE_OPTPROBES select HAVE_OBJTOOL + select HAVE_OBJTOOL_MCOUNT if HAVE_OBJTOOL select HAVE_PERF_EVENTS select HAVE_PERF_EVENTS_NMI if PPC64 select HAVE_PERF_REGS diff --git a/tools/objtool/arch/powerpc/decode.c b/tools/objtool/arch/powerpc/decode.c index eb1542d155fe..048bb4cd2838 100644 --- a/tools/objtool/arch/powerpc/decode.c +++ b/tools/objtool/arch/powerpc/decode.c @@ -41,12 +41,26 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec struct list_head *ops_list) { u32 insn; + unsigned int opcode; *immediate = 0; insn = bswap_if_needed(file->elf, *(u32 *)(sec->data->d_buf + offset)); *len = 4; *type = INSN_OTHER; + opcode = (insn >> 26); + + switch (opcode) { + case 18: /* bl */ + if ((insn & 3) == 1) { + *type = INSN_CALL; + *immediate = insn & 0x3fc; + if (*immediate & 0x200) + *immediate -= 0x400; + } + break; + } + return 0; } diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 7f0dc504dd92..70be5a72e838 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -834,7 +834,7 @@ static int create_mcount_loc_sections(struct objtool_file *file) memset(loc, 0, size); if (elf_add_reloc_to_insn(file->elf, sec, idx, - R_X86_64_64, + elf_reloc_type_long(file->elf), insn->sec, insn->offset)) return -1; @@ -2185,7 +2185,7 @@ static int classify_symbols(struct objtool_file 
*file) if (arch_is_retpoline(func)) func->retpoline_thunk = true; - if (!strcmp(func->name, "__fentry__")) + if ((!strcmp(func->name, "__fentry__")) || (!strcmp(func->name, "_mcount"))) func->fentry = true; if (is_profiling_func(func->name)) @@ -2261,9 +2261,11 @@ static int decode_sections(struct objtool_file *file) * Must be before add_jump_destinations(), which depends on 'func' * being set for alternatives, to enable proper sibling call detection. */ - ret = add_special_section_alts(file); - if (ret) - return ret; + if (opts.stackval || opts.orc || opts.uaccess || opts.noinstr) { + ret = add_special_section_alts(file); + if (ret) + return ret; + } ret = add_jump_destinations(file); if (ret) diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c index 63218f5799c2..34b1c6817a5e 100644 --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -793,6 +793,21 @@ elf_create_section_symbol(struct elf *elf, struct section *sec) return sym; } +int elf_reloc_type_long(struct elf *elf) +{ + switch (elf->ehdr.e_machine) { + case EM_X86_64: + return R_X86_64_64; + case EM_PPC64: + return R_PPC64_ADDR64; + case EM_PPC: + return R_PPC_ADDR32; + default: + WARN("unknown machine..."); + exit(-1); + } +} + int elf_add_reloc_to_insn(struct elf *elf, struct section *sec, unsigned long offset, unsigned int type, struct section *insn_sec, unsigned long insn_off) diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h index c720c4476828..d10f4701715b 100644 --- a/tools/objtool/include/objtool/elf.h +++ b/tools/objtool/include/objtool/elf.h @@ -152,6 +152,7 @@ static inline int elf_class_size(struct elf *elf) struct elf *elf_open_read(const char *name, int flags); struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr); +int elf_reloc_type_long(struct elf *elf); int elf_add_reloc(st
[RFC PATCH v2 3/7] objtool: Use target file class size instead of a compiled constant
In order to allow using objtool on cross-built kernels, determine size of long from elf data instead of using sizeof(long) at build time. For the time being this covers only mcount. Signed-off-by: Christophe Leroy --- tools/objtool/check.c | 16 +--- tools/objtool/elf.c | 8 ++-- tools/objtool/include/objtool/elf.h | 8 3 files changed, 23 insertions(+), 9 deletions(-) diff --git a/tools/objtool/check.c b/tools/objtool/check.c index cef1dd54d505..fabc0ea88747 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -802,9 +802,9 @@ static int create_ibt_endbr_seal_sections(struct objtool_file *file) static int create_mcount_loc_sections(struct objtool_file *file) { struct section *sec; - unsigned long *loc; struct instruction *insn; int idx; + int size = elf_class_size(file->elf); sec = find_section_by_name(file->elf, "__mcount_loc"); if (sec) { @@ -820,23 +820,25 @@ static int create_mcount_loc_sections(struct objtool_file *file) list_for_each_entry(insn, &file->mcount_loc_list, call_node) idx++; - sec = elf_create_section(file->elf, "__mcount_loc", 0, sizeof(unsigned long), idx); + sec = elf_create_section(file->elf, "__mcount_loc", 0, size, idx); if (!sec) return -1; + sec->sh.sh_addralign = size; + idx = 0; list_for_each_entry(insn, &file->mcount_loc_list, call_node) { + void *loc; - loc = (unsigned long *)sec->data->d_buf + idx; - memset(loc, 0, sizeof(unsigned long)); + loc = sec->data->d_buf + idx; + memset(loc, 0, size); - if (elf_add_reloc_to_insn(file->elf, sec, - idx * sizeof(unsigned long), + if (elf_add_reloc_to_insn(file->elf, sec, idx, R_X86_64_64, insn->sec, insn->offset)) return -1; - idx++; + idx += size; } return 0; diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c index c25e957c1e52..63218f5799c2 100644 --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -1124,6 +1124,7 @@ static struct section *elf_create_rela_reloc_section(struct elf *elf, struct sec { char *relocname; struct section *sec; + int size = elf_class_size(elf); 
relocname = malloc(strlen(base->name) + strlen(".rela") + 1); if (!relocname) { @@ -1133,7 +1134,10 @@ static struct section *elf_create_rela_reloc_section(struct elf *elf, struct sec strcpy(relocname, ".rela"); strcat(relocname, base->name); - sec = elf_create_section(elf, relocname, 0, sizeof(GElf_Rela), 0); + if (size == sizeof(u32)) + sec = elf_create_section(elf, relocname, 0, sizeof(Elf32_Rela), 0); + else + sec = elf_create_section(elf, relocname, 0, sizeof(GElf_Rela), 0); free(relocname); if (!sec) return NULL; @@ -1142,7 +1146,7 @@ static struct section *elf_create_rela_reloc_section(struct elf *elf, struct sec sec->base = base; sec->sh.sh_type = SHT_RELA; - sec->sh.sh_addralign = 8; + sec->sh.sh_addralign = size; sec->sh.sh_link = find_section_by_name(elf, ".symtab")->idx; sec->sh.sh_info = base->idx; sec->sh.sh_flags = SHF_INFO_LINK; diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h index adebfbc2b518..c720c4476828 100644 --- a/tools/objtool/include/objtool/elf.h +++ b/tools/objtool/include/objtool/elf.h @@ -141,6 +141,14 @@ static inline bool has_multiple_files(struct elf *elf) return elf->num_files > 1; } +static inline int elf_class_size(struct elf *elf) +{ + if (elf->ehdr.e_ident[EI_CLASS] == ELFCLASS32) + return sizeof(u32); + else + return sizeof(u64); +} + struct elf *elf_open_read(const char *name, int flags); struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr); -- 2.35.3
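The core of this patch is that entry sizes and offsets in `__mcount_loc` follow the word size of the target file, not `sizeof(long)` of the machine running objtool. A minimal standalone sketch of that calculation is shown below. It is an illustration under assumptions, not objtool code: `elf_word_size()` and `entry_offset()` are hypothetical names, and the local constants mirror the ELF specification's EI_CLASS (4), ELFCLASS32 (1) and ELFCLASS64 (2).

```c
#include <stddef.h>

/* Local copies of ELF ident constants (EI_CLASS, ELFCLASS32, ELFCLASS64). */
enum { ELF_IDENT_CLASS = 4, ELF_CLASS_32 = 1, ELF_CLASS_64 = 2 };

/* Size in bytes of a target pointer, taken from the file's ELF class. */
static size_t elf_word_size(const unsigned char *e_ident)
{
	return e_ident[ELF_IDENT_CLASS] == ELF_CLASS_32 ? 4 : 8;
}

/* Byte offset of the n-th entry in a table of target-sized pointers. */
static size_t entry_offset(const unsigned char *e_ident, size_t n)
{
	return n * elf_word_size(e_ident);
}
```

On a 64-bit build host processing a 32-bit object, the third entry then lands at offset 12 rather than 24, which is what lets a cross-built objtool emit a correctly sized table.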
[RFC PATCH v2 1/7] objtool: Fix SEGFAULT
Signed-off-by: Christophe Leroy
---
 tools/objtool/check.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 190b2f6e360a..6cb07e151588 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -203,7 +203,7 @@ static bool __dead_end_function(struct objtool_file *file, struct symbol *func,
 		return false;

 	insn = find_insn(file, func->sec, func->offset);
-	if (!insn->func)
+	if (!insn || !insn->func)
 		return false;

 	func_for_each_insn(file, func, insn) {
-- 
2.35.3
[RFC PATCH v2 5/7] objtool: Enable objtool to run only on files with ftrace enabled
From: Sathvika Vasireddy

This patch makes sure objtool runs only on the object files that have
ftrace enabled, instead of running on all the object files.

Signed-off-by: Naveen N. Rao
Signed-off-by: Sathvika Vasireddy
Signed-off-by: Christophe Leroy
---
 scripts/Makefile.build | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 2e0c3f9c1459..06ceffd92921 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -258,8 +258,8 @@ else
 # 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a file
 # 'OBJECT_FILES_NON_STANDARD_foo.o := 'n': override directory skip for a file

-$(obj)/%.o: objtool-enabled = $(if $(filter-out y%, \
-	$(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n),y)
+$(obj)/%.o: objtool-enabled = $(and $(if $(filter-out y%, $(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n),y), \
+	$(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)),y),y)

 endif

-- 
2.35.3
Re: [powerpc] linux-next 20220520 boot failure (drc_pmem_query_stats)
Thanks for reporting this Sachin, I have posted a fix for this at https://lore.kernel.org/nvdimm/20220524112353.1718454-1-vaib...@linux.ibm.com Sachin Sant writes: > While booting linux-next (5.18.0-rc7-next-20220520) on a Power10 LPAR > configure with pmem following oops is seen. The LPAR fails to boot to > login prompt. > > [ 10.948211] papr_scm ibm,persistent-memory:ibm,pmemory@44104001: > Permission denied while accessing performance stats > [ 10.948536] Kernel attempted to write user page (1c) - exploit attempt? > (uid: 0) > [ 10.948539] BUG: Kernel NULL pointer dereference on write at 0x001c > [ 10.948540] Faulting instruction address: 0xc00801b90844 > [ 10.948542] Oops: Kernel access of bad area, sig: 11 [#1] > [ 10.948563] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries > [ 10.948568] Modules linked in: papr_scm(E+) libnvdimm(E) vmx_crypto(E) > ext4(E) mbcache(E) jbd2(E) sd_mod(E) t10_pi(E) crc64_rocksoft(E) crc64(E) > sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) fuse(E) > [ 10.948587] CPU: 25 PID: 796 Comm: systemd-udevd Tainted: GE > 5.18.0-rc7-next-20220520 #2 > [ 10.948592] NIP: c00801b90844 LR: c00801b92794 CTR: > c00801b907f8 > [ 10.948595] REGS: c0003082b110 TRAP: 0300 Tainted: GE > (5.18.0-rc7-next-20220520) > [ 10.948600] MSR: 8280b033 CR: > 44222822 XER: 0001 > [ 10.948613] CFAR: c007c744 DAR: 001c DSISR: 4200 > IRQMASK: 0 > [ 10.948613] GPR00: c00801b92794 c0003082b3b0 c00801bc8000 > c941bc00 > [ 10.948613] GPR04: 0010 c00016001800 > c0003082b420 > [ 10.948613] GPR08: 001c 0100 53544154 > c00801b92c98 > [ 10.948613] GPR12: c00801b907f8 c00abfd02b00 c0003082bd00 > 0001372bd8b0 > [ 10.948613] GPR16: ff20 c008008911b8 c0080089 > 11d0 > [ 10.948613] GPR20: 0001 c0003082bbc0 c00801bc0a88 > > [ 10.948613] GPR24: c2950e30 > 0010 > [ 10.948613] GPR28: c941bc00 0010 0020 > c941bc00 > [ 10.948660] NIP [c00801b90844] drc_pmem_query_stats+0x5c/0x270 > [papr_scm] > [ 10.948667] LR [c00801b92794] papr_scm_probe+0x2ac/0x6ec [papr_scm] > [ 
10.948673] Call Trace: > [ 10.948675] [c0003082b3b0] [c941bca0] 0xc941bca0 > (unreliable) > [ 10.948680] [c0003082b460] [c00801b92794] > papr_scm_probe+0x2ac/0x6ec [papr_scm] > [ 10.948687] [c0003082b550] [c09809b8] platform_probe+0x98/0x150 > [ 10.948694] [c0003082b5d0] [c097bf2c] really_probe+0xfc/0x510 > [ 10.948699] [c0003082b650] [c097c4bc] > __driver_probe_device+0x17c/0x230 > [ 10.948704] [c0003082b6d0] [c097c5c8] > driver_probe_device+0x58/0x120 > [ 10.948709] [c0003082b710] [c097ce0c] > __driver_attach+0xfc/0x230 > [ 10.948714] [c0003082b790] [c0978458] > bus_for_each_dev+0xa8/0x130 > [ 10.948718] [c0003082b7f0] [c097b2c4] driver_attach+0x34/0x50 > [ 10.948722] [c0003082b810] [c097a508] > bus_add_driver+0x1e8/0x350 > [ 10.948729] [c0003082b8a0] [c097def8] > driver_register+0x98/0x1a0 > [ 10.948736] [c0003082b910] [c09804a8] > __platform_driver_register+0x38/0x50 > [ 10.948741] [c0003082b930] [c00801b92c10] papr_scm_init+0x3c/0x78 > [papr_scm] > [ 10.948747] [c0003082b960] [c0011ff0] > do_one_initcall+0x60/0x2d0 > [ 10.948753] [c0003082ba30] [c023627c] do_init_module+0x6c/0x2d0 > [ 10.948760] [c0003082bab0] [c0239650] load_module+0x1e90/0x2290 > [ 10.948765] [c0003082bc90] [c0239d9c] > __do_sys_finit_module+0xdc/0x180 > [ 10.948771] [c0003082bdb0] [c00335fc] > system_call_exception+0x17c/0x350 > [ 10.948777] [c0003082be10] [c000c53c] > system_call_common+0xec/0x270 > [ 10.948782] --- interrupt: c00 at 0x7fffa3f2f1d4 > [ 10.948785] NIP: 7fffa3f2f1d4 LR: 7fffa456ea9c CTR: > > [ 10.948789] REGS: c0003082be80 TRAP: 0c00 Tainted: GE > (5.18.0-rc7-next-20220520) > [ 10.948793] MSR: 8280f033 > CR: 2804 XER: > [ 10.948805] IRQMASK: 0 > [ 10.948805] GPR00: 0161 7fffd70550b0 7fffa4007300 > 0011 > [ 10.948805] GPR04: 7fffa457ad30 0011 > > [ 10.948805] GPR08: > > [ 10.948805] GPR12: 7fffa4656580 0002 > 0001372bd8b0 > [ 10.948805] GPR16: 000137300108 0001372c5c68 > > [ 10.948805] GPR20: 000
Re: [PATCH -next] powerpc/pseries/vas: Call misc_deregister if sysfs init fails
On Wed, 11 May 2022 11:35:07 +0800, Zheng Bin wrote: > Undo effects of misc_register if sysfs init fails after > misc_register. > > Applied to powerpc/next. [1/1] powerpc/pseries/vas: Call misc_deregister if sysfs init fails https://git.kernel.org/powerpc/c/426e5805226358dbe9af233347c5bf3c81f2125c cheers
Re: [PATCH -next] powerpc/kaslr_booke: Fix build error
On Tue, 17 May 2022 17:49:00 +0800, YueHaibing wrote: > arch/powerpc/mm/nohash/kaslr_booke.c: In function ‘kaslr_get_cmdline’: > arch/powerpc/mm/nohash/kaslr_booke.c:46:2: error: implicit declaration of > function ‘early_init_dt_scan_chosen’; did you mean > ‘early_init_mmu_secondary’? [-Werror=implicit-function-declaration] > early_init_dt_scan_chosen(boot_command_line); > ^ > early_init_mmu_secondary > arch/powerpc/mm/nohash/kaslr_booke.c: In function ‘get_initrd_range’: > arch/powerpc/mm/nohash/kaslr_booke.c:210:10: error: implicit declaration of > function ‘of_read_number’; did you mean ‘seq_read_iter’? > [-Werror=implicit-function-declaration] > start = of_read_number(prop, len / 4); > ^~ > seq_read_iter > > [...] Applied to powerpc/next. [1/1] powerpc/kaslr_booke: Fix build error https://git.kernel.org/powerpc/c/cdf87d2bd12cf3ea760a1fa35907a31e5177f425 cheers
Re: [PATCH -next] powerpc/book3e: Fix build error
On Tue, 17 May 2022 17:48:30 +0800, YueHaibing wrote: > arch/powerpc/mm/nohash/fsl_book3e.c: In function ‘relocate_init’: > arch/powerpc/mm/nohash/fsl_book3e.c:348:2: error: implicit declaration of > function ‘early_get_first_memblock_info’ > [-Werror=implicit-function-declaration] > early_get_first_memblock_info(__va(dt_ptr), &size); > ^ > > Add missing include file linux/of_fdt.h to fix this. > > [...] Applied to powerpc/next. [1/1] powerpc/book3e: Fix build error https://git.kernel.org/powerpc/c/7574dd080ee0a1e8a9c6312dc7c8fe97f73415ff cheers
Re: [PATCH -next] powerpc/iommu: Add missing of_node_put in iommu_init_early_dart
On Mon, 25 Apr 2022 08:12:45 +, Peng Wu wrote: > The device_node pointer is returned by of_find_compatible_node > with refcount incremented. We should use of_node_put() to avoid > the refcount leak. > > Applied to powerpc/next. [1/1] powerpc/iommu: Add missing of_node_put in iommu_init_early_dart https://git.kernel.org/powerpc/c/57b742a5b8945118022973e6416b71351df512fb cheers
Re: [PATCH v2] powerpc/papr_scm: Fix leaking nvdimm_events_map elements
On Wed, 11 May 2022 13:56:36 +0530, Vaibhav Jain wrote: > Right now 'char *' elements allocated for individual 'stat_id' in > 'papr_scm_priv.nvdimm_events_map[]' during papr_scm_pmu_check_events() get > leaked in papr_scm_remove() and in the papr_scm_pmu_register(), > papr_scm_pmu_check_events() error paths. > > Also individual 'stat_id' aren't NULL terminated 'char *'; instead they are > fixed > 8-byte sized identifiers. However papr_scm_pmu_register() assumes it to be a > NULL terminated 'char *' and at other places it assumes it to be a > 'papr_scm_perf_stat.stat_id' sized string which is 8 bytes in size. > > [...] Applied to powerpc/next. [1/1] powerpc/papr_scm: Fix leaking nvdimm_events_map elements https://git.kernel.org/powerpc/c/0e0946e22f3665d27325d389ff45ade6e93f3678 cheers
Re: [PATCH v2 1/2] powerpc/powernv: Get L1D flush requirements from device-tree
On Mon, 4 Apr 2022 20:15:35 +1000, Russell Currey wrote: > The device-tree properties no-need-l1d-flush-msr-pr-1-to-0 and > no-need-l1d-flush-kernel-on-user-access are the equivalents of > H_CPU_BEHAV_NO_L1D_FLUSH_ENTRY and H_CPU_BEHAV_NO_L1D_FLUSH_UACCESS > from the H_GET_CPU_CHARACTERISTICS hcall on pseries respectively. > > In commit d02fa40d759f ("powerpc/powernv: Remove POWER9 PVR version > check for entry and uaccess flushes") the condition for disabling the > L1D flush on kernel entry and user access was changed from any non-P9 > CPU to only checking P7 and P8. Without the appropriate device-tree > checks for newer processors on powernv, these flushes are unnecessarily > enabled on those systems. This patch corrects this. > > [...] Applied to powerpc/next. [1/2] powerpc/powernv: Get L1D flush requirements from device-tree https://git.kernel.org/powerpc/c/2efee6adb56159288bce9d1ab51fc9056d7007d4 [2/2] powerpc/powernv: Get STF barrier requirements from device-tree https://git.kernel.org/powerpc/c/d2a3c131981d4498571908df95c3c9393a00adf5 cheers
Re: [PATCH] selftests/powerpc: Handle more misreporting cases in spectre_v2
On Tue, 8 Jun 2021 16:48:09 +1000, Russell Currey wrote: > In commit f3054ffd71b5 ("selftests/powerpc: Return skip code for > spectre_v2"), the spectre_v2 selftest is updated to be aware of cases > where the vulnerability status reported in sysfs is incorrect, skipping > the test instead. > > This happens because qemu can misrepresent the mitigation status of the > host to the guest. If the count cache is disabled in the host, and this > is correctly reported to the guest, then the guest won't apply > mitigations. If the guest is then migrated to a new host where > mitigations are necessary, it is now vulnerable because it has not > applied mitigations. > > [...] Applied to powerpc/next. [1/1] selftests/powerpc: Handle more misreporting cases in spectre_v2 https://git.kernel.org/powerpc/c/48482f4dd3432e5e62873bf0f2e254cfb8ce2ac2 cheers
Re: [PATCH v2] macintosh: via-pmu and via-cuda need RTC_LIB
On Sun, 10 Apr 2022 09:10:35 -0700, Randy Dunlap wrote: > Fix build when RTC_LIB is not set/enabled. > Eliminates these build errors: > > m68k-linux-ld: drivers/macintosh/via-pmu.o: in function `pmu_set_rtc_time': > drivers/macintosh/via-pmu.c:1769: undefined reference to `rtc_tm_to_time64' > m68k-linux-ld: drivers/macintosh/via-cuda.o: in function `cuda_set_rtc_time': > drivers/macintosh/via-cuda.c:797: undefined reference to `rtc_tm_to_time64' > > [...] Applied to powerpc/next. [1/1] macintosh: via-pmu and via-cuda need RTC_LIB https://git.kernel.org/powerpc/c/9a9c5ff5fff87eb1a43db0d899473554e408fd7b cheers
Re: [PATCH v2 0/6] KASAN support for 64-bit Book 3S powerpc
On Wed, 18 May 2022 20:03:27 +1000, Paul Mackerras wrote: > This patch series implements KASAN on 64-bit POWER with radix MMU, > such as POWER9 or POWER10. Daniel Axtens posted previous versions of > these patches, but is no longer working on KASAN, and I have been > asked to get them ready for inclusion. > > Because of various technical difficulties, mostly around the need to > allow for code that runs in real mode, we only support "outline" mode > (as opposed to "inline" mode), where the compiler adds a call to > a checking procedure before every store to memory. > > [...] Patches 1-5 applied to powerpc/next. [1/6] kasan: Document support on 32-bit powerpc https://git.kernel.org/powerpc/c/60e832def18de7a0753393034c6ae459b3bee70a [2/6] powerpc/mm/kasan: rename kasan_init_32.c to init_32.c https://git.kernel.org/powerpc/c/f08aed52412c860f68e30da148da58ad8e40a43b [3/6] powerpc: Book3S 64-bit outline-only KASAN support https://git.kernel.org/powerpc/c/41b7a347bf1491e7300563bb224432608b41f62a [4/6] powerpc/kasan: Don't instrument non-maskable or raw interrupts https://git.kernel.org/powerpc/c/5352090a999570c6e8a701bcb755fd91e8c5a2cd [5/6] powerpc/kasan: Disable address sanitization in kexec paths https://git.kernel.org/powerpc/c/2ab2d5794f14c08676690bf0859f16cc768bb3a4 cheers
Re: [PATCH] powerpc/85xx: P2020: Add fsl,mpc8548-pmc node
On Fri, 6 May 2022 22:36:21 +0200, Pali Rohár wrote: > P2020 also contains Power Management Controller and their registers at > offset 0xe0070 compatible with mpc8548. So add PMC node into DTS include > file fsl/p2020si-post.dtsi > > Applied to powerpc/next. [1/1] powerpc/85xx: P2020: Add fsl,mpc8548-pmc node https://git.kernel.org/powerpc/c/294299b3d39e8b4ae10c12ef1ed71405ef7b1e43 cheers
Re: [PATCH] powerpc/numa: Associate numa node to its cpu earlier
On Mon, 11 Apr 2022 09:49:34 +0200, Oscar Salvador wrote: > powerpc is the only platform that does not rely on > cpu_up()->try_online_node() to bring up a numa node, > and special cases it, instead, deep in its own machinery: > > dlpar_online_cpu > find_and_online_cpu_nid > try_online_node > > [...] Applied to powerpc/next. [1/1] powerpc/numa: Associate numa node to its cpu earlier https://git.kernel.org/powerpc/c/48b63961c84671df4c4a4451c40057e34b26df1a cheers
Re: [PATCH] powerpc/powernv/pci: Drop VF MPS fixup
On Wed, 2 Sep 2020 13:51:59 +1000, Oliver O'Halloran wrote: > The MPS field in the VF config space is marked as reserved in current > versions of the SR-IOV spec. In other words, this fixup doesn't do > anything. > > Applied to powerpc/next. [1/1] powerpc/powernv/pci: Drop VF MPS fixup https://git.kernel.org/powerpc/c/a5d28039ecb288f4788ae98c8291e092961e8742 cheers
Re: [PATCH 1/2] powerpc/64: Bump SIGSTKSZ and MINSIGSTKSZ
On Tue, 8 Mar 2022 04:27:33 +1000, Nicholas Piggin wrote: > The sad tale of SIGSTKSZ and MINSIGSTKSZ is documented in glibc.git > commit f7c399cff5bd ("PowerPC SIGSTKSZ"), which explains why glibc > does not use the kernel defines for these constants. Since then in > fact there has been a further expansion of the signal stack frame size > on little-endian with linux commit 573ebfa6601f ("powerpc: Increase > stack redzone for 64-bit userspace to 512 bytes"), which has caused > it to exceed even the glibc defines. > > [...] Applied to powerpc/next. [1/2] powerpc/64: Bump SIGSTKSZ and MINSIGSTKSZ https://git.kernel.org/powerpc/c/2f82ec19757f58549467db568c56e7dfff8af283 [2/2] powerpc/signal: Report minimum signal frame size to userspace via AT_MINSIGSTKSZ https://git.kernel.org/powerpc/c/2896b2dff49d0377e4372f470dcddbcb26f2be59 cheers
Re: [PATCH 00/14] powerpc/rtas: various cleanups and improvements
On Tue, 8 Mar 2022 23:50:33 +1000, Nicholas Piggin wrote: > I had a bunch of random little fixes and cleanups around and > was prompted to put them together and make a change to call > RTAS with MSR[RI] enabled because of a report of the hard > lockup watchdog NMI IPI hitting in an rtas call which then > crashed because it's unrecoverable. > > Could possibly move patch 9 earlier if it would help with > backporting. > > [...] Patches 1-4, 7, 9 & 13 applied to powerpc/next. [01/14] powerpc/rtas: Move rtas entry assembly into its own file https://git.kernel.org/powerpc/c/838ee286ecc9a3c76e6bd8f5aaad0c8c5c66b9ca [02/14] powerpc/rtas: Make enter_rtas a nokprobe symbol on 64-bit https://git.kernel.org/powerpc/c/07940b4b61cf0cbcfb9e4226c07318f737157c42 [03/14] powerpc/rtas: Fix whitespace in rtas_entry.S https://git.kernel.org/powerpc/c/4e949faae2bd42783a2b2b732b7bf17557d94cfb [04/14] powerpc/rtas: Call enter_rtas with MSR[EE] disabled https://git.kernel.org/powerpc/c/c5a65e0a420d50655bf692fc7386813683c0cd81 [07/14] powerpc/rtas: PACA can be restored directly from SPRG https://git.kernel.org/powerpc/c/5c86bd02b3c3ef68a109fa7e690ad62d3091f6d4 [09/14] powerpc/rtas: Leave MSR[RI] enabled over RTAS call https://git.kernel.org/powerpc/c/014b2e896cc8445fcc04636e69bf5f9e24281daa [13/14] powerpc/rtas: enture rtas_call is called with MMU enabled https://git.kernel.org/powerpc/c/804c0a166ffea628eb7ef72b9fd710883cb1fa8f cheers
Re: [PATCH v2] powerpc/ftrace: Remove ftrace init tramp once kernel init is complete
On Mon, 16 May 2022 12:44:22 +0530, Naveen N. Rao wrote: > Stop using the ftrace trampoline for init section once kernel init is > complete. > > Applied to powerpc/next. [1/1] powerpc/ftrace: Remove ftrace init tramp once kernel init is complete https://git.kernel.org/powerpc/c/84ade0a6655bee803d176525ef457175cbf4df22 cheers
[PATCH] powerpc/papr_scm: don't requests stats with '0' sized stats buffer
Sachin reported [1] that on a POWER10 LPAR he is seeing a kernel panic with vPMEM when papr_scm probe is called. The panic is of the form below and is observed only when the 'Enable Performance Information Collection' option is disabled in the HMC profile for the said LPAR:

 Kernel attempted to write user page (1c) - exploit attempt? (uid: 0)
 BUG: Kernel NULL pointer dereference on write at 0x001c
 Faulting instruction address: 0xc00801b90844
 Oops: Kernel access of bad area, sig: 11 [#1]

 NIP [c00801b90844] drc_pmem_query_stats+0x5c/0x270 [papr_scm]
 LR [c00801b92794] papr_scm_probe+0x2ac/0x6ec [papr_scm]
 Call Trace:
   0xc941bca0 (unreliable)
   papr_scm_probe+0x2ac/0x6ec [papr_scm]
   platform_probe+0x98/0x150
   really_probe+0xfc/0x510
   __driver_probe_device+0x17c/0x230

 ---[ end trace ]---
 Kernel panic - not syncing: Fatal exception

On investigation, it looks like this panic was caused by a 'stat_buffer' of size==0 being provided to drc_pmem_query_stats() to fetch all performance stat-ids of an NVDIMM. However, drc_pmem_query_stats() shouldn't have been called at all, since the vPMEM NVDIMM doesn't support any performance stat-ids. The call happened because of a missing check for 'p->stat_buffer_len' at the beginning of papr_scm_pmu_check_events(); a zero length indicates that the NVDIMM doesn't support performance stats.

Fix this by introducing the check for 'p->stat_buffer_len' at the beginning of papr_scm_pmu_check_events().

[1] https://lore.kernel.org/all/6b3a522a-6a5f-4cc9-b268-0c63aa6e0...@linux.ibm.com

Fixes: 0e0946e22f3665d2732 ("powerpc/papr_scm: Fix leaking nvdimm_events_map elements")
Reported-by: Sachin Sant
Signed-off-by: Vaibhav Jain
---
 arch/powerpc/platforms/pseries/papr_scm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 181b855b3050..82cae08976bc 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -465,6 +465,9 @@ static int papr_scm_pmu_check_events(struct papr_scm_priv *p, struct nvdimm_pmu
 	u32 available_events;
 	int index, rc = 0;
 
+	if (!p->stat_buffer_len)
+		return -ENOENT;
+
 	available_events = (p->stat_buffer_len - sizeof(struct papr_scm_perf_stats))
 			/ sizeof(struct papr_scm_perf_stat);
 	if (available_events == 0)
-- 
2.35.1
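One way to see why the pre-existing 'available_events == 0' test does not catch an empty buffer is the userspace sketch below. It assumes the length is held in an unsigned type such as size_t; the two structs are simplified stand-ins invented for illustration (not the kernel definitions), and events_unguarded() / events_guarded() are hypothetical names. The guarded variant uses a slightly broader '<' comparison than the patch's '!p->stat_buffer_len' check, purely to keep the sketch self-consistent.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel structures; the layout is an
 * assumption made for illustration only. */
struct perf_stat {
	uint8_t  stat_id[8];
	uint64_t stat_val;
};

struct perf_stats_hdr {
	uint8_t  eye_catcher[8];
	uint32_t stats_version;
	uint32_t num_statistics;
};

/* Mirrors the unguarded arithmetic: when buf_len is smaller than the
 * header, the unsigned subtraction wraps to a very large value, so the
 * result is non-zero even for an empty buffer. */
static size_t events_unguarded(size_t buf_len)
{
	return (buf_len - sizeof(struct perf_stats_hdr)) /
	       sizeof(struct perf_stat);
}

/* Guarded variant: an empty or too-short buffer means "no events". */
static size_t events_guarded(size_t buf_len)
{
	if (buf_len < sizeof(struct perf_stats_hdr))
		return 0;
	return (buf_len - sizeof(struct perf_stats_hdr)) /
	       sizeof(struct perf_stat);
}
```

With a zero length the unguarded helper reports a non-zero event count, which is consistent with the later query being issued against an empty buffer.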
Re: [PATCH v2 0/2] Link the PowerPC vDSO with ld.lld
On Wed, 11 May 2022 11:49:59 -0700, Nathan Chancellor wrote: > This series is an alternative to the one proposed by Nick before the > PowerPC vDSO unification in commit fd1feade75fb ("powerpc/vdso: Merge > vdso64 and vdso32 into a single directory"): > > https://lore.kernel.org/20200901222523.1941988-1-ndesaulni...@google.com/ > > Normally, we try to make compiling and linking two separate stages so > that they can be done by $(CC) and $(LD) respectively, which is more in > line with what the user expects, versus using the compiler as a linker > driver and relying on the implicit default linker value. However, as > shown in the above thread, getting this right for the PowerPC vDSO is a > little tricky due to the linker emulation values. > > [...] Applied to powerpc/next. [1/2] powerpc/vdso: Remove unused ENTRY in linker scripts https://git.kernel.org/powerpc/c/e247172854a57d1a7213bb835ecb4a40ce9bb2b9 [2/2] powerpc/vdso: Link with ld.lld when requested https://git.kernel.org/powerpc/c/4406b12214f6592909b63dabdea86d69f1b5ba2e cheers
Re: [PATCH] powerpc: Fix all occurences of "the the"
On Thu, 19 May 2022 00:26:29 +1000, Michael Ellerman wrote: > Rather than waiting for the bots to fix these one-by-one, fix all > occurences of "the the" throughout arch/powerpc. > > Applied to powerpc/next. [1/1] powerpc: Fix all occurences of "the the" https://git.kernel.org/powerpc/c/87c78b612f4feccdcf4019351206ebe0e3dfe828 cheers
Re: [PATCH 1/2] powerpc: Add generic PAGE_SIZE config symbols
On Thu, 5 May 2022 22:51:22 +1000, Michael Ellerman wrote: > Other arches (sh, mips, hexagon) use standard names for PAGE_SIZE > related config symbols. > > Add matching symbols for powerpc, which are enabled by default but > depend on our architecture specific PAGE_SIZE symbols. > > This allows generic/driver code to express dependencies on the PAGE_SIZE > without needing to refer to architecture specific config symbols. > > [...] Applied to powerpc/next. [1/2] powerpc: Add generic PAGE_SIZE config symbols https://git.kernel.org/powerpc/c/d036dc79cccd748e2a101c80c31efada7be8bb7c [2/2] arch/Kconfig: Drop references to powerpc PAGE_SIZE symbols https://git.kernel.org/powerpc/c/aa06530a535ffe8ba8b68054003b6fb262a8ec6f cheers
Re: [PATCH] selftest/powerpc/pmu/ebb: remove fixed_instruction.S
On Tue, 22 Mar 2022 10:26:38 +0530, Madhavan Srinivasan wrote: > Commit 3752e453f6ba ("selftests/powerpc: Add tests of PMU EBBs") added > selftest testcases to verify EBB interface. instruction_count_test.c > testcase needs a fixed loop function to count overhead. Instead of > using the thirty_two_instruction_loop() in fixed_instruction_loop.S > in ebb folder, file is linked with thirty_two_instruction_loop() in > loop.S from top folder. Since fixed_instruction_loop.S not used, patch > removes the file. > > [...] Applied to powerpc/next. [1/1] selftest/powerpc/pmu/ebb: remove fixed_instruction.S https://git.kernel.org/powerpc/c/079e5fd3a1e41c186c1bc4166d409d22e70729bf cheers
Re: [PATCH] powerpc/xive: Fix refcount leak in xive_spapr_init
On Thu, 12 May 2022 13:05:33 +0400, Miaoqian Lin wrote: > of_find_compatible_node() returns a node pointer with refcount > incremented, we should use of_node_put() on it when done. > Add missing of_node_put() to avoid refcount leak. > > Applied to powerpc/next. [1/1] powerpc/xive: Fix refcount leak in xive_spapr_init https://git.kernel.org/powerpc/c/1d1fb9618bdd5a5fbf9a9eb75133da301d33721c cheers
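The leak pattern fixed by this and the similar of_node_put() patches in this batch can be modelled in plain userspace C. This is only a toy sketch: struct node, node_get(), node_put(), find_node(), init_leaky() and init_fixed() are all hypothetical names, not kernel APIs. The lookup hands back a counted reference, and any exit path that skips the matching put leaves the count elevated.

```c
#include <stddef.h>

/* Toy refcounted node; a hypothetical stand-in for a device-tree node. */
struct node {
	int refcount;
	int has_feature;
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/* Hypothetical lookup that returns a reference the caller must drop. */
static struct node *find_node(struct node *table, size_t n_entries)
{
	return n_entries ? node_get(&table[0]) : NULL;
}

/* Leaky version: the early error return skips node_put(). */
static int init_leaky(struct node *table, size_t n_entries)
{
	struct node *np = find_node(table, n_entries);

	if (!np)
		return -1;
	if (!np->has_feature)
		return -1;	/* reference is never dropped here */
	node_put(np);
	return 0;
}

/* Fixed version: every path after a successful lookup drops the reference. */
static int init_fixed(struct node *table, size_t n_entries)
{
	struct node *np = find_node(table, n_entries);
	int ret = 0;

	if (!np)
		return -1;
	if (!np->has_feature)
		ret = -1;
	node_put(np);
	return ret;
}
```

Funnelling the error path through a single put, as in init_fixed(), is one common way to keep the get and put balanced.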
Re: [PATCH] powerpc/fsl_rio: Fix refcount leak in fsl_rio_setup
On Thu, 12 May 2022 16:37:18 +0400, Miaoqian Lin wrote: > of_parse_phandle() returns a node pointer with refcount > incremented, we should use of_node_put() on it when not need anymore. > Add missing of_node_put() to avoid refcount leak. > > Applied to powerpc/next. [1/1] powerpc/fsl_rio: Fix refcount leak in fsl_rio_setup https://git.kernel.org/powerpc/c/fcee96924ba1596ca80a6770b2567ca546f9a482 cheers
Re: [PATCH v3 1/2] powerpc/powermac: add missing g5_phy_disable_cpu1() declaration
On Fri, 24 Sep 2021 12:56:52 +0200, Krzysztof Kozlowski wrote: > g5_phy_disable_cpu1() is used outside of platforms/powermac/feature.c, > so it should have a declaration to fix W=1 warning: > > arch/powerpc/platforms/powermac/feature.c:1533:6: > error: no previous prototype for ‘g5_phy_disable_cpu1’ > [-Werror=missing-prototypes] > > > [...] Applied to powerpc/next. [1/2] powerpc/powermac: add missing g5_phy_disable_cpu1() declaration https://git.kernel.org/powerpc/c/cc025916b12a452df7932da528d25b2ef2b05072 [2/2] powerpc/powermac: constify device_node in of_irq_parse_oldworld() https://git.kernel.org/powerpc/c/bb12dd42d20f5513a8d1da225232af0a0743fd79 cheers
Re: [PATCH 1/2] powerpc/perf: Fix the threshold compare group constraint for power10
On Fri, 6 May 2022 11:40:14 +0530, Kajol Jain wrote: > Thresh compare bits for an event are used to program the thresh compare > field in Monitor Mode Control Register A (MMCRA: 8-18 bits for power10). > When scheduling events as a group, all events in that group should > match value in threshold bits. Otherwise event open for the sibling > events should fail. But in the current code, in case thresh compare bits are > not valid, we are not failing in group_constraint function which can result > in invalid group scheduling. > > [...] Applied to powerpc/next. [1/2] powerpc/perf: Fix the threshold compare group constraint for power10 https://git.kernel.org/powerpc/c/505d31650ba96d6032313480fdb566d289a4698c [2/2] powerpc/perf: Fix the threshold compare group constraint for power9 https://git.kernel.org/powerpc/c/ab0cc6bbf0c812731c703ec757fcc3fc3a457a34 cheers
Re: [PATCH] powerpc/microwatt: Add mmu bits to device tree
On Thu, 19 May 2022 22:27:06 +0930, Joel Stanley wrote: > In commit 5402e239d09f ("powerpc/64s: Get LPID bit width from device > tree") the kernel tried to determine the pid and lpid bits from the > device tree. If they are not found, there is a fallback, but Microwatt > wasn't covered as has the unusual configuration of being both !HV and bare > metal. > > Set the values in the device tree to avoid having to add a special case. > The lpid value is the only one required, but add both for completeness. > > [...] Applied to powerpc/next. [1/1] powerpc/microwatt: Add mmu bits to device tree https://git.kernel.org/powerpc/c/0ef1ffc7189521e293b4c5532659385dfddf8939 cheers
Re: [PATCH] powerpc/powernv/flash: Check OPAL flash calls exist before using
On Tue, 14 Sep 2021 15:46:30 +0530, Vasant Hegde wrote: > Currently only FSP based powernv systems support firmware update > interfaces. Hence check that the token OPAL_FLASH_VALIDATE exists > before initialising the flash driver. > > Applied to powerpc/next. [1/1] powerpc/powernv/flash: Check OPAL flash calls exist before using https://git.kernel.org/powerpc/c/25e69962efdbbbd8bf52e6b4d7852c49717923a2 cheers
Re: [PATCH] powerpc/pseries/vas: sysfs comments with the correct entries
On Sat, 09 Apr 2022 01:46:15 -0700, Haren Myneni wrote: > VAS entry is created as a misc device and the sysfs comments > should list the proper entries > > Applied to powerpc/next. [1/1] powerpc/pseries/vas: sysfs comments with the correct entries https://git.kernel.org/powerpc/c/657ac633302b9d694958a82654363cb559277759 cheers
Re: [PATCH] powerpc/powernv/vas: Assign real address to rx_fifo in vas_rx_win_attr
On Sat, 09 Apr 2022 01:44:16 -0700, Haren Myneni wrote: > In init_winctx_regs(), __pa() is called on winctx->rx_fifo and this > function is called to initialize registers for receive and fault > windows. But the real address is passed in winctx->rx_fifo for > receive windows and the virtual address for fault windows which > causes errors with DEBUG_VIRTUAL enabled. Fixes this issue by > assigning only real address to rx_fifo in vas_rx_win_attr struct > for both receive and fault windows. > > [...] Applied to powerpc/next. [1/1] powerpc/powernv/vas: Assign real address to rx_fifo in vas_rx_win_attr https://git.kernel.org/powerpc/c/c127d130f6d59fa81701f6b04023cf7cd1972fb3 cheers
Re: [PATCH] powerpc: Export mmu_feature_keys[] as non-GPL
On Tue, 29 Mar 2022 16:57:09 +0800, Kevin Hao wrote: > When the mmu_feature_keys[] was introduced in the commit c12e6f24d413 > ("powerpc: Add option to use jump label for mmu_has_feature()"), > it is unlikely that it would be used either directly or indirectly in > the out of tree modules. So we export it as GPL only. But with the > evolution of the codes, especially the PPC_KUAP support, it may be > indirectly referenced by some primitive macro or inline functions such > as get_user() or __copy_from_user_inatomic(), this will make it > impossible to build many non GPL modules (such as ZFS) on ppc > architecture. Fix this by exposing the mmu_feature_keys[] to the > non-GPL modules too. > > [...] Applied to powerpc/next. [1/1] powerpc: Export mmu_feature_keys[] as non-GPL https://git.kernel.org/powerpc/c/d9e5c3e9e75162f845880535957b7fd0b4637d23 cheers
Re: (subset) [PATCH 00/30] The panic notifiers refactor
On Wed, 27 Apr 2022 19:48:54 -0300, Guilherme G. Piccoli wrote: > Hey folks, this is an attempt to improve/refactor the dated panic notifiers > infrastructure. This is strongly based on a suggestion made by Petr Mladek [0] > some time ago, and it's finally ready. Below I'll detail the patch ordering, > testing made, etc. > First, a bit about the reason behind this. > > The panic notifiers list is an infrastructure that allows callbacks to execute > during panic time. It happens that anybody can add functions there, no ordering > is enforced (by default) and the decision to execute or not such notifiers > before kdump may lead to high risk of failure in crash scenarios - default is > not to execute any of them. There is a parameter acting as a switch for that. > But some architectures require some notifiers, so... it's messy. > > [...] Patch 8 applied to powerpc/next. [08/30] powerpc/setup: Refactor/untangle panic notifiers https://git.kernel.org/powerpc/c/e2aa34ce80a26d24a0333da9402d533885f239c9 cheers
Re: [PATCH v2] macintosh/via-pmu: Fix build failure when CONFIG_INPUT is disabled
On Thu, 07 Apr 2022 20:11:32 +1000, Finn Thain wrote: > drivers/macintosh/via-pmu-event.o: In function `via_pmu_event': > via-pmu-event.c:(.text+0x44): undefined reference to `input_event' > via-pmu-event.c:(.text+0x68): undefined reference to `input_event' > via-pmu-event.c:(.text+0x94): undefined reference to `input_event' > via-pmu-event.c:(.text+0xb8): undefined reference to `input_event' > drivers/macintosh/via-pmu-event.o: In function `via_pmu_event_init': > via-pmu-event.c:(.init.text+0x20): undefined reference to > `input_allocate_device' > via-pmu-event.c:(.init.text+0xc4): undefined reference to > `input_register_device' > via-pmu-event.c:(.init.text+0xd4): undefined reference to `input_free_device' > make[1]: *** [Makefile:1155: vmlinux] Error 1 > make: *** [Makefile:350: __build_one_by_one] Error 2 > > [...] Applied to powerpc/next. [1/1] macintosh/via-pmu: Fix build failure when CONFIG_INPUT is disabled https://git.kernel.org/powerpc/c/86ce436e30d86327c9f5260f718104ae7b21f506 cheers
Re: [PATCH] selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch"
On Sat, 19 Mar 2022 23:20:25 +, Colin Ian King wrote: > There are a few spelling mistakes in error messages. Fix them. > > Applied to powerpc/next. [1/1] selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch" https://git.kernel.org/powerpc/c/7801cb1dc60f7348687ca1c3aa03a944687563f0 cheers
Re: [PATCH v3 00/25] powerpc: ftrace optimisation and cleanup and more [v3]
On Mon, 9 May 2022 07:35:58 +0200, Christophe Leroy wrote: > This series provides optimisation and cleanup of ftrace on powerpc. > > With this series ftrace activation is about 20% faster on an 8xx. > > At the end of the series come additional cleanups around ppc-opcode, > that would likely conflict with this series if posted separately. > > [...] Applied to powerpc/next. [01/25] powerpc/ftrace: Refactor prepare_ftrace_return() https://git.kernel.org/powerpc/c/d996d5053eb5c0abc0358e5670014a62ada6967e [02/25] powerpc/ftrace: Remove redundant create_branch() calls https://git.kernel.org/powerpc/c/ae3a2a2188214adc355a5bdf6deb29120886c96f [03/25] powerpc/code-patching: Inline is_offset_in_{cond}_branch_range() https://git.kernel.org/powerpc/c/1acbf27e8a5843911d122ad0008e79ec5f7b6382 [04/25] powerpc/ftrace: Use is_offset_in_branch_range() https://git.kernel.org/powerpc/c/a1facd2578b312770aaea384adc7de0ed3f543d1 [05/25] powerpc/code-patching: Inline create_branch() https://git.kernel.org/powerpc/c/d2f47dabf1252520a88d257133e6bdec474fd935 [06/25] powerpc/ftrace: Inline ftrace_modify_code() https://git.kernel.org/powerpc/c/2c920fca8c70287c4448f2653a388ecca7b32e83 [07/25] powerpc/ftrace: Use patch_instruction() return directly https://git.kernel.org/powerpc/c/bbffdd2fc743bdc529f9a8264bdb5d3491f58c95 [08/25] powerpc: Add CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2 https://git.kernel.org/powerpc/c/661aa880398add5c27943cb077c451a45cc112a1 [09/25] powerpc: Replace PPC64_ELF_ABI_v{1/2} by CONFIG_PPC64_ELF_ABI_V{1/2} https://git.kernel.org/powerpc/c/7d40aff8213c92e64a1576ba9dfebcd201c0564d [10/25] powerpc: Finalise cleanup around ABI use https://git.kernel.org/powerpc/c/5b89492c03e5c0a2c259b97d7d4c1bb9b02860aa [11/25] powerpc/ftrace: Make __ftrace_make_{nop/call}() common to PPC32 and PPC64 https://git.kernel.org/powerpc/c/23b44fc248f420bbcd0dcd290c3399885360984d [12/25] powerpc/ftrace: Don't include ftrace.o for CONFIG_FTRACE_SYSCALLS 
https://git.kernel.org/powerpc/c/a3d0f5b4b7e425b8abeadda1e76496bda88989bd [13/25] powerpc/ftrace: Use CONFIG_FUNCTION_TRACER instead of CONFIG_DYNAMIC_FTRACE https://git.kernel.org/powerpc/c/c2cba93d1a5e2475a636b5cb974da6b73d7a72df [14/25] powerpc/ftrace: Remove ftrace_plt_tramps[] https://git.kernel.org/powerpc/c/ccf6607e45aaf5e0ceabfe018aeb01818a936697 [15/25] powerpc/ftrace: Use BRANCH_SET_LINK instead of value 1 https://git.kernel.org/powerpc/c/cf9df92a823ce24c19c4c64b334dc5cadd74fa98 [16/25] powerpc/ftrace: Use PPC_RAW_xxx() macros instead of opencoding. https://git.kernel.org/powerpc/c/e89aa642be21b14e53bab40a37b8c6b0cf05143d [17/25] powerpc/ftrace: Use size macro instead of opencoding https://git.kernel.org/powerpc/c/c8deb28095f9cd2ee2f4d16e948c9e816a22811b [18/25] powerpc/ftrace: Simplify expected_nop_sequence() https://git.kernel.org/powerpc/c/b97d0e3dcfba07590ec3d2ca2b95b2f029962d16 [19/25] powerpc/ftrace: Minimise number of #ifdefs https://git.kernel.org/powerpc/c/af8b9f352ffd435734ab8f94f99ccb922da916b4 [20/25] powerpc/inst: Add __copy_inst_from_kernel_nofault() https://git.kernel.org/powerpc/c/8dfdbe4368c09d9eeae2df8968ee6c345ec8c1b5 [21/25] powerpc/ftrace: Don't use copy_from_kernel_nofault() in module_trampoline_target() https://git.kernel.org/powerpc/c/8052d043a48f733905e8ea8f900bf58b441a317f [22/25] powerpc/inst: Remove PPC_INST_BRANCH https://git.kernel.org/powerpc/c/4390a58ee1c37dc915dcf44fabe925b160f5bcf0 [23/25] powerpc/modules: Use PPC_LI macros instead of opencoding https://git.kernel.org/powerpc/c/e0c2ef43210b023ed9a58c520c2fbede7010c592 [24/25] powerpc/inst: Remove PPC_INST_BL https://git.kernel.org/powerpc/c/ae2c760fa10ba2475aa46fffa6be42050586c604 [25/25] powerpc/opcodes: Remove unused PPC_INST_XXX macros https://git.kernel.org/powerpc/c/6bdc81eca9519a85d36b3915136640ef9cba1a23 cheers
Re: [PATCH] powerpc/irq: Remove arch_local_irq_restore() for !CONFIG_CC_HAS_ASM_GOTO
On Mon, 16 May 2022 17:36:04 +0200, Christophe Leroy wrote: > All supported versions of GCC support asm goto. > > Remove the !CONFIG_CC_HAS_ASM_GOTO version of arch_local_irq_restore() > > Applied to powerpc/next. [1/1] powerpc/irq: Remove arch_local_irq_restore() for !CONFIG_CC_HAS_ASM_GOTO https://git.kernel.org/powerpc/c/5fe855169f9782c669f640b66242662209ffb72a cheers
Re: [PATCH] powerpc/fsl_book3e: Don't set rodata RO too early
On Thu, 19 May 2022 19:24:15 +0200, Christophe Leroy wrote: > On fsl_book3e, rodata is set read-only at the same time as > init text is set NX at the end of init. That's too early. > > As both actions are performed at the same time, delay both > actions to the time rodata is expected to be made read-only. > > It means we will have a small window with init mem freed but > still executable. It shouldn't be an issue though, especially > because the said memory gets poisoned and should therefore > result in a bad instruction fault in case it gets executed. > > [...] Applied to powerpc/next. [1/1] powerpc/fsl_book3e: Don't set rodata RO too early https://git.kernel.org/powerpc/c/ad91f66f5fa7c6f9346e721c3159ce818568028b cheers
Re: [PATCH] powerpc/85xx: Remove FSL_85XX_CACHE_SRAM
On Thu, 31 Mar 2022 12:03:06 +0200, Christophe Leroy wrote: > CONFIG_FSL_85XX_CACHE_SRAM is an option that is not > user selectable and which is not selected by any driver > nor any defconfig. > > Remove it and all associated code. > > > [...] Applied to powerpc/next. [1/1] powerpc/85xx: Remove FSL_85XX_CACHE_SRAM https://git.kernel.org/powerpc/c/dc21ed2aef4150fc2fcf58227a4ff24502015c03 cheers
Re: [PATCH V2] platforms/83xx: Use of_device_get_match_data()
On Fri, 25 Feb 2022 01:07:37 +, cgel@gmail.com wrote: > From: Minghao Chi (CGEL ZTE) > > Use of_device_get_match_data() to simplify the code. > v1->v2: > Add a judgment on the return value of the A function as NULL > > > [...] Applied to powerpc/next. [1/1] platforms/83xx: Use of_device_get_match_data() https://git.kernel.org/powerpc/c/8a57c3cc2bcb8df98c239d6804fd01834960b7d2 cheers
Re: [PATCH] powerpc/sysdev: fix refcount leak in icp_opal_init()
On Sat, 2 Apr 2022 01:34:19 +, cgel@gmail.com wrote: > From: Lv Ruyi > > The of_find_compatible_node() function returns a node pointer with > refcount incremented, use of_node_put() on it when done. > > Applied to powerpc/next. [1/1] powerpc/sysdev: fix refcount leak in icp_opal_init() https://git.kernel.org/powerpc/c/5dd9e27ea4a39f7edd4bf81e9e70208e7ac0b7c9 cheers
Re: [PATCH] powerpc/powernv: fix missing of_node_put in uv_init()
On Thu, 7 Apr 2022 09:00:43 +, cgel@gmail.com wrote: > From: Lv Ruyi > > of_find_compatible_node() returns node pointer with refcount incremented, > use of_node_put() on it when done. > > Applied to powerpc/next. [1/1] powerpc/powernv: fix missing of_node_put in uv_init() https://git.kernel.org/powerpc/c/3ffa9fd471f57f365bc54fc87824c530422f64a5 cheers
Re: [PATCH V2] powerpc/eeh: Drop redundant spinlock initialization
On Wed, 11 May 2022 09:27:56 +0800, Haowen Bai wrote: > slot_errbuf_lock has declared and initialized by DEFINE_SPINLOCK, > so we don't need to spin_lock_init again, drop it. > > Applied to powerpc/next. [1/1] powerpc/eeh: Drop redundant spinlock initialization https://git.kernel.org/powerpc/c/3def164a5cedad9117859dd4610cae2cc59cb6d2 cheers
Re: [PATCH] powerpc: Enable the DAWR on POWER9 DD2.3 and above
On Tue, 3 May 2022 12:01:52 -0500, Reza Arbab wrote: > The hardware bug in POWER9 preventing use of the DAWR was fixed in > DD2.3. Set the CPU_FTR_DAWR feature bit on these newer systems to start > using it again, and update the documentation accordingly. > > The CPU features for DD2.3 are currently determined by "DD2.2 or later" > logic. In adding DD2.3 as a discrete case for the first time here, I'm > carrying the quirks of DD2.2 forward to keep all behavior outside of > this DAWR change the same. This leaves the assessment and potential > removal of those quirks on DD2.3 for later. > > [...] Applied to powerpc/next. [1/1] powerpc: Enable the DAWR on POWER9 DD2.3 and above https://git.kernel.org/powerpc/c/26b78c81e84c5133b299fb11bf17ec1a3d2ad217 cheers
Re: [RFC PATCH 4/4] objtool/powerpc: Add --mcount specific implementation
On 24/05/22 15:05, Christophe Leroy wrote:
> Le 23/05/2022 à 19:55, Sathvika Vasireddy a écrit :
>> This patch enables objtool --mcount on powerpc, and
>> adds implementation specific to powerpc.
>>
>> Signed-off-by: Sathvika Vasireddy
>> ---
>>  arch/powerpc/Kconfig                |  1 +
>>  tools/objtool/arch/powerpc/decode.c | 14 ++
>>  tools/objtool/check.c               | 12 +++-
>>  tools/objtool/elf.c                 | 13 +
>>  tools/objtool/include/objtool/elf.h |  1 +
>>  5 files changed, 36 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index 732a3f91ee5e..3373d44a1298 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -233,6 +233,7 @@ config PPC
>>  	select HAVE_NMI if PERF_EVENTS || (PPC64 && PPC_BOOK3S)
>>  	select HAVE_OPTPROBES
>>  	select HAVE_OBJTOOL if PPC64
>> +	select HAVE_OBJTOOL_MCOUNT if HAVE_OBJTOOL
>>  	select HAVE_PERF_EVENTS
>>  	select HAVE_PERF_EVENTS_NMI if PPC64
>>  	select HAVE_PERF_REGS
>> diff --git a/tools/objtool/arch/powerpc/decode.c b/tools/objtool/arch/powerpc/decode.c
>> index e3b77a6ce357..ad3d79fffac2 100644
>> --- a/tools/objtool/arch/powerpc/decode.c
>> +++ b/tools/objtool/arch/powerpc/decode.c
>> @@ -40,12 +40,26 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec
>>  			    struct list_head *ops_list)
>>  {
>>  	u32 insn;
>> +	unsigned int opcode;
>>  	*immediate = 0;
>>  	memcpy(&insn, sec->data->d_buf+offset, 4);
>>  	*len = 4;
>>  	*type = INSN_OTHER;
>> +	opcode = (insn >> 26);
>
> You don't need the brackets here.
>
>> +
>> +	switch (opcode) {
>> +	case 18: /* bl */
>> +		if ((insn & 3) == 1) {
>> +			*type = INSN_CALL;
>> +			*immediate = insn & 0x3fc;
>> +			if (*immediate & 0x200)
>> +				*immediate -= 0x400;
>> +		}
>> +		break;
>> +	}
>> +
>>  	return 0;
>>  }
>> diff --git a/tools/objtool/check.c b/tools/objtool/check.c
>> index 056302d58e23..fd8bad092f89 100644
>> --- a/tools/objtool/check.c
>> +++ b/tools/objtool/check.c
>> @@ -832,7 +832,7 @@ static int create_mcount_loc_sections(struct objtool_file *file)
>>  		if (elf_add_reloc_to_insn(file->elf, sec,
>>  					  idx * sizeof(unsigned long),
>> -					  R_X86_64_64,
>> +					  elf_reloc_type_long(file->elf),
>>  					  insn->sec, insn->offset))
>>  			return -1;
>> @@ -2183,7 +2183,7 @@ static int classify_symbols(struct objtool_file *file)
>>  		if (arch_is_retpoline(func))
>>  			func->retpoline_thunk = true;
>> -		if (!strcmp(func->name, "__fentry__"))
>> +		if ((!strcmp(func->name, "__fentry__")) || (!strcmp(func->name, "_mcount")))
>>  			func->fentry = true;
>>  		if (is_profiling_func(func->name))
>> @@ -2259,9 +2259,11 @@ static int decode_sections(struct objtool_file *file)
>>  	 * Must be before add_jump_destinations(), which depends on 'func'
>>  	 * being set for alternatives, to enable proper sibling call detection.
>>  	 */
>> -	ret = add_special_section_alts(file);
>> -	if (ret)
>> -		return ret;
>> +	if (opts.stackval || opts.orc || opts.uaccess || opts.noinstr) {
>> +		ret = add_special_section_alts(file);
>> +		if (ret)
>> +			return ret;
>> +	}
>
> I think this change should be a patch by itself, it's not related to
> powerpc.

Makes sense. I'll make this a separate patch in the next revision.

>>  	ret = add_jump_destinations(file);
>>  	if (ret)
>> diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
>> index c25e957c1e52..95763060d551 100644
>> --- a/tools/objtool/elf.c
>> +++ b/tools/objtool/elf.c
>> @@ -793,6 +793,19 @@ elf_create_section_symbol(struct elf *elf, struct section *sec)
>>  	return sym;
>>  }
>> +int elf_reloc_type_long(struct elf *elf)
>
> Not sure it's a good name, because for 32 bits we have to use 'int'.

Sure, I'll rename it to elf_reloc_type() or some such.

>> +{
>> +	switch (elf->ehdr.e_machine) {
>> +	case EM_X86_64:
>> +		return R_X86_64_64;
>> +	case EM_PPC64:
>> +		return R_PPC64_ADDR64;
>> +	default:
>> +		WARN("unknown machine...");
>> +		exit(-1);
>> +	}
>> +}
>
> Wouldn't it be better to make that function arch specific ?

This is so that we can support cross architecture builds.

Thanks for reviewing!

- Sathvika
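The decode hunk reviewed above classifies an instruction by its primary opcode and then extracts a branch displacement. A minimal standalone sketch of that decoding follows, assuming the I-form layout from the Power ISA: 6-bit primary opcode, 24-bit LI field, then the AA and LK bits. The function names are hypothetical, not objtool's. The sketch masks and sign-extends the full LI field, which is wider than the `0x3fc` / `0x200` constants in the quoted RFC hunk.

```c
#include <stdint.h>

#define PPC_OPCODE_BRANCH 18u	/* primary opcode of I-form branches (b, ba, bl, bla) */

/* Primary opcode: the top 6 bits of the 32-bit instruction word. */
static unsigned int ppc_primary_opcode(uint32_t insn)
{
	return insn >> 26;
}

/* True for "bl": I-form branch with AA=0 (relative) and LK=1 (sets the link register). */
static int ppc_is_bl(uint32_t insn)
{
	return ppc_primary_opcode(insn) == PPC_OPCODE_BRANCH && (insn & 3) == 1;
}

/* Sign-extended byte displacement: the 24-bit LI field followed by two zero bits. */
static int32_t ppc_branch_offset(uint32_t insn)
{
	int32_t off = (int32_t)(insn & 0x03fffffc);

	if (off & 0x02000000)	/* sign bit of the 26-bit displacement */
		off -= 0x04000000;
	return off;
}
```

No parentheses are needed around the shift in `ppc_primary_opcode()`, which is the point of the first review comment.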
Re: [RFC PATCH 2/4] objtool: Enable objtool to run only on files with ftrace enabled
Hi Christophe,

On 24/05/22 14:27, Christophe Leroy wrote:
> Le 23/05/2022 à 19:55, Sathvika Vasireddy a écrit :
>> This patch makes sure objtool runs only on the object files
>> that have ftrace enabled, instead of running on all the object
>> files.
>
> Why do that ?

This was done to address the issue discussed here:
https://lore.kernel.org/all/b06bb9bc-22d1-acce-fe68-c7c4cb7c1...@csgroup.eu/

> What about static_calls ? There may be files without ftrace but with
> static calls.

Yes, this prevents objtool from running on those files. We can restrict
this change to FTRACE_MCOUNT_USE_OBJTOOL.

> By the way, it would be nice if we could use it only on C files. I get
> the following errors for ASM files:
>
> arch/powerpc/kernel/entry_32.o: warning: objtool: .text+0x1b4: unannotated intra-function call

I'm looking into ways to address this.

- Sathvika
Re: [RESEND][PATCH] KVM: PPC: Book3S HV: fix incorrect NULL check on list iterator
On Thu, 14 Apr 2022 14:21:03 +0800, Xiaomeng Tong wrote: > The bug is here: > if (!p) > return ret; > > The list iterator value 'p' will *always* be set and non-NULL by > list_for_each_entry(), so it is incorrect to assume that the iterator > value will be NULL if the list is empty or no element is found. > > [...] Applied to powerpc/topic/ppc-kvm. [1/1] KVM: PPC: Book3S HV: fix incorrect NULL check on list iterator https://git.kernel.org/powerpc/c/300981abddcb13f8f06ad58f52358b53a8096775 cheers
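The commit message above rests on the fact that a `list_for_each_entry()`-style iterator is never NULL after the loop: when nothing matches, it ends up computed from the list head itself. A minimal userspace sketch of the usual remedy is to record the match in a separate pointer. The list helpers below are simplified stand-ins written for this example (the macro takes the type explicitly instead of using `typeof`), and `struct item` / `find_item()` are hypothetical names.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-in for list_for_each_entry(); takes the type explicitly. */
#define for_each_entry(pos, head, type, member)				\
	for (pos = container_of((head)->next, type, member);		\
	     &pos->member != (head);					\
	     pos = container_of(pos->member.next, type, member))

struct item {
	int key;
	struct list_head link;
};

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/*
 * Returns the matching item, or NULL. The iterator 'p' is never NULL after
 * the loop, so the result is kept in 'found', which is set only on a match.
 */
static struct item *find_item(struct list_head *head, int key)
{
	struct item *p, *found = NULL;

	for_each_entry(p, head, struct item, link) {
		if (p->key == key) {
			found = p;
			break;
		}
	}
	return found;
}
```

Testing `found` rather than `p` gives the "empty list or no element" case a well-defined NULL result.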
Re: [PATCH] KVM: PPC: Book3S HV P9: Optimise loads around context switch
On Sun, 23 Jan 2022 21:47:25 +1000, Nicholas Piggin wrote: > It is better to get all loads for the register values in flight > before starting to switch LPID, PID, and LPCR because those > mtSPRs are expensive and serialising. > > This also just tidies up the code for a potential future change > to the context switching sequence. > > [...] Applied to powerpc/topic/ppc-kvm. [1/1] KVM: PPC: Book3S HV P9: Optimise loads around context switch https://git.kernel.org/powerpc/c/361234d7a1c9a5290d33e35d49821b7a32a32854 cheers
Re: [PATCH] KVM: PPC: Book3S HV: HFSCR[PREFIX] does not exist
On Sat, 22 Jan 2022 20:56:39 +1000, Nicholas Piggin wrote: > This facility is controlled by FSCR only. Reserved bits should not be > set in the HFSCR register (although it's likely harmless as this > position would not be re-used, and the L0 is forgiving here too). > > Applied to powerpc/topic/ppc-kvm. [1/1] KVM: PPC: Book3S HV: HFSCR[PREFIX] does not exist https://git.kernel.org/powerpc/c/861604614a94a7aabc111e4a18aaf5d56d270e8a cheers
Re: [PATCH 0/6] KVM: PPC: Book3S: Make LPID/nested LPID allocations dynamic
On Sun, 23 Jan 2022 22:00:37 +1000, Nicholas Piggin wrote: > With LPID width plumbed through from firmware, LPID allocations can now > be dynamic, which requires changing the fixed sized bitmap. Rather than > just dynamically sizing it, switch to IDA allocator. > > Nested KVM stays with a fixed 12-bit LPID width for now, but it is also > moved to a more dynamic allocator. In future if nested LPID width is > advertised to a guest it will be simple to take advantage of it. > > [...] Applied to powerpc/topic/ppc-kvm. [1/6] KVM: PPC: Remove kvmppc_claim_lpid https://git.kernel.org/powerpc/c/18827eeef022df43c1fdeca0fde00ca09405dff1 [2/6] KVM: PPC: Book3S HV: Update LPID allocator init for POWER9, Nested https://git.kernel.org/powerpc/c/5d506f159b2b9d0c9bee9bb43ccafb4f291143c2 [3/6] KVM: PPC: Book3S HV: Use IDA allocator for LPID allocator https://git.kernel.org/powerpc/c/6ba2a2924dcf6026de5078ba7025248a580d8bde [4/6] KVM: PPC: Book3S HV Nested: Change nested guest lookup to use idr https://git.kernel.org/powerpc/c/c0f00a18e2a8c350a9d263aaf9a2c8bc86caa1b0 [5/6] KVM: PPC: Book3S Nested: Use explicit 4096 LPID maximum https://git.kernel.org/powerpc/c/03a2e65f54b3acae37f0992133d2f4d1d35f4200 [6/6] KVM: PPC: Book3S HV: Remove KVMPPC_NR_LPIDS https://git.kernel.org/powerpc/c/f104df7d519ff1aa92c7ec87e124c88d4e7574cd cheers
Re: [PATCH 0/6] KVM: PPC: Book3S HV interrupt fixes
On Thu, 3 Mar 2022 15:33:09 +1000, Nicholas Piggin wrote: > This series fixes up a bunch of little interrupt issues which were found > by inspection haven't seem to have caused big problems but possibly > could or could cause the occasional latency spike from a temporarily lost > interrupt. > > The big thing is the xive context change. Currently we run an L2 with > its L1's xive OS context pushed. I'm proposing that we instead treat > that as an escalation similar to cede. > > [...] Patches 2-6 applied to powerpc/topic/ppc-kvm. [2/6] KVM: PPC: Book3S HV P9: Inject pending xive interrupts at guest entry https://git.kernel.org/powerpc/c/026728dc5d41f830e8194fe01e432dd4eb9b3d9a [3/6] KVM: PPC: Book3S HV P9: Move cede logic out of XIVE escalation rearming https://git.kernel.org/powerpc/c/ad5ace91c55e7bd16813617f67bcb7619d51a295 [4/6] KVM: PPC: Book3S HV P9: Split !nested case out from guest entry https://git.kernel.org/powerpc/c/42b4a2b347b09e7ee4c86f7121e3b45214b63e69 [5/6] KVM: PPC: Book3S HV Nested: L2 must not run with L1 xive context https://git.kernel.org/powerpc/c/11681b79b1ab52e7625844d7ce52c4d5201a43b2 [6/6] KVM: PPC: Book3S HV Nested: L2 LPCR should inherit L1 LPES setting https://git.kernel.org/powerpc/c/2852ebfa10afdcefff35ec72c8da97141df9845c cheers
Re: [PATCH] KVM: PPC: Book3S HV: fix the return value of kvm_age_rmapp()
On Fri, 1 Apr 2022 02:52:52 -0400, Bo Liu wrote: > The return value type defined in the function kvm_age_rmapp() is > "bool", but the return value type defined in the implementation of the > function kvm_age_rmapp() is "int". > > Change the return value type to "bool". > > > [...] Applied to powerpc/topic/ppc-kvm. [1/1] KVM: PPC: Book3S HV: fix the return value of kvm_age_rmapp() https://git.kernel.org/powerpc/c/15eb1b6afc3c73bcd44b5d265d43db666950b5af cheers
Re: [PATCH] KVM: PPC: Book3S HV: Initialize AMOR in nested entry
On Mon, 25 Apr 2022 11:21:51 -0300, Fabiano Rosas wrote: > The hypervisor always sets AMOR to ~0, but let's ensure we're not > passing stale values around. > > Applied to powerpc/topic/ppc-kvm. [1/1] KVM: PPC: Book3S HV: Initialize AMOR in nested entry https://git.kernel.org/powerpc/c/1d1cd0f12a3ab5d7f79ae6cca28e7d23dd351ce3 cheers
Re: [PATCH] KVM: PPC: Book3S HV: Fix vcore_blocked tracepoint
On Mon, 28 Mar 2022 18:58:31 -0300, Fabiano Rosas wrote: > We removed most of the vcore logic from the P9 path but there's still > a tracepoint that tried to dereference vc->runner. > > Applied to powerpc/topic/ppc-kvm. [1/1] KVM: PPC: Book3S HV: Fix vcore_blocked tracepoint https://git.kernel.org/powerpc/c/ad55bae7dc364417434b69dd6c30104f20d0f84d cheers
Re: [PATCH RESEND] KVM: powerpc: remove extraneous asterisk from rm_host_ipi_action comment
On Fri, 6 May 2022 14:07:47 +0700, Bagas Sanjaya wrote: > kernel test robot reported kernel-doc warning for rm_host_ipi_action(): > > >> arch/powerpc/kvm/book3s_hv_rm_xics.c:887: warning: This comment starts > >> with '/**', but isn't a kernel-doc comment. Refer > >> Documentation/doc-guide/kernel-doc.rst > * Host Operations poked by RM KVM > > Since the function is static, remove the extraneous (second) asterisk at > the head of function comment. > > [...] Applied to powerpc/topic/ppc-kvm. [1/1] KVM: powerpc: remove extraneous asterisk from rm_host_ipi_action comment https://git.kernel.org/powerpc/c/d53c36e6c83863fde4a2748411c31bc4853a0936 cheers
Re: [PATCH kernel] KVM: PPC: Book3s: PR: Enable default TCE hypercalls
On Fri, 6 May 2022 17:37:37 +1000, Alexey Kardashevskiy wrote: > When KVM_CAP_PPC_ENABLE_HCALL was introduced, H_GET_TCE and H_PUT_TCE > were already implemented and enabled by default; however H_GET_TCE > was missed out on PR KVM (probably because the handler was in > the real mode code at the time). > > This enables H_GET_TCE by default. While at this, this wraps > the checks in ifdef CONFIG_SPAPR_TCE_IOMMU just like HV KVM. > > [...] Applied to powerpc/topic/ppc-kvm. [1/1] KVM: PPC: Book3s: PR: Enable default TCE hypercalls https://git.kernel.org/powerpc/c/29592181c5496d93697a23e6dbb9d7cc317ff5ee cheers
Re: [PATCH kernel] KVM: PPC: Book3s: Remove real mode interrupt controller hcalls handlers
On Mon, 9 May 2022 17:11:50 +1000, Alexey Kardashevskiy wrote: > Currently we have 2 sets of interrupt controller hypercalls handlers > for real and virtual modes, this is from POWER8 times when switching > MMU on was considered an expensive operation. > > POWER9 however does not have dependent threads and MMU is enabled for > handling hcalls so the XIVE native or XICS-on-XIVE real mode handlers > never execute on real P9 and later CPUs. > > [...] Applied to powerpc/topic/ppc-kvm. [1/1] KVM: PPC: Book3s: Remove real mode interrupt controller hcalls handlers https://git.kernel.org/powerpc/c/b22af9041927075b82bcaf4b6c7a354688198d47 cheers
Re: [PATCH kernel v2] KVM: PPC: Book3s: Retire H_PUT_TCE/etc real mode handlers
On Fri, 6 May 2022 15:37:55 +1000, Alexey Kardashevskiy wrote: > LoPAPR defines guest visible IOMMU with hypercalls to use it - > H_PUT_TCE/etc. Implemented first on POWER7 where hypercalls would trap > in the KVM in the real mode (with MMU off). The problem with the real mode > is some memory is not available and some API usage crashed the host but > enabling MMU was an expensive operation. > > The problems with the real mode handlers are: > 1. Occasionally these cannot complete the request so the code is > copied+modified to work in the virtual mode, very little is shared; > 2. The real mode handlers have to be linked into vmlinux to work; > 3. An exception in real mode immediately reboots the machine. > > [...] Applied to powerpc/topic/ppc-kvm. [1/1] KVM: PPC: Book3s: Retire H_PUT_TCE/etc real mode handlers https://git.kernel.org/powerpc/c/cad32d9d42e8e6a659786f8a730b221a9fbee227 cheers
Re: [RFC PATCH 1/4] objtool: Add --mnop as an option to --mcount
Christophe Leroy wrote:
> Le 24/05/2022 à 12:15, Naveen N. Rao a écrit :
>> Christophe Leroy wrote:
>>> Le 23/05/2022 à 19:55, Sathvika Vasireddy a écrit :
>>>> Architectures can select HAVE_NOP_MCOUNT if they choose
>>>> to nop out mcount call sites. If that config option is
>>>> selected, then --mnop is passed as an option to objtool,
>>>> along with --mcount.
>>>
>>> Is there a reason not to nop out mcount call sites on powerpc as well ?
>>
>> Yes, if there are functions that are out of range of _mcount(), then the
>> linker would have inserted long branch trampolines. We detect such cases
>> during boot. But, if we nop out the _mcount call sites during build
>> time, we will need some other way to identify these.
>
> But does it really matter whether _mcount is reachable or not ?
>
> _mcount is never used, and the function we want to call in lieu of
> _mcount might be reachable while _mcount is not or might be unreachable
> while _mcount is.

For the most part, we will end up having to go to ftrace_caller or
ftrace_regs_caller, both of which will usually be close to _mcount.

We need to know for sure either way. Nop'ing out the _mcount locations
at boot allows us to discover existing long branch trampolines. If we
want to avoid it, we need to note down those locations during build
time.

Do you have a different approach in mind?

- Naveen
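"Out of range" in this exchange refers to the reach of a relative `bl`: the instruction carries a signed 26-bit, word-aligned byte displacement, so a call site can reach targets roughly 32 MB either side of itself. A minimal sketch of that check follows, with hypothetical helper names written for this example. Beyond that distance a direct branch cannot be encoded, which is where linker-inserted long branch trampolines come in.

```c
#include <stdint.h>

/* A relative I-form branch encodes a signed 26-bit, word-aligned byte offset. */
static int offset_in_branch_range(int64_t offset)
{
	return offset >= -0x2000000 && offset <= 0x1fffffc && !(offset & 0x3);
}

/* Can a "bl" placed at 'from' reach 'to' directly, without a trampoline? */
static int bl_can_reach(uint64_t from, uint64_t to)
{
	return offset_in_branch_range((int64_t)(to - from));
}
```

Applied to the discussion above, the question is whether such a check holds for the function that actually gets called (for example ftrace_caller), not only for _mcount.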
Re: [RFC PATCH 1/4] objtool: Add --mnop as an option to --mcount
Le 24/05/2022 à 12:15, Naveen N. Rao a écrit :
> Christophe Leroy wrote:
>>
>>
>> Le 23/05/2022 à 19:55, Sathvika Vasireddy a écrit :
>>> Architectures can select HAVE_NOP_MCOUNT if they choose
>>> to nop out mcount call sites. If that config option is
>>> selected, then --mnop is passed as an option to objtool,
>>> along with --mcount.
>>>
>>
>> Is there a reason not to nop out mcount call sites on powerpc as well ?
>
> Yes, if there are functions that are out of range of _mcount(), then the
> linker would have inserted long branch trampolines. We detect such cases
> during boot. But, if we nop out the _mcount call sites during build
> time, we will need some other way to identify these.
>

But does it really matter whether _mcount is reachable or not ?

_mcount is never used, and the function we want to call in lieu of
_mcount might be reachable while _mcount is not or might be unreachable
while _mcount is.

Christophe
Re: [PATCH 24/30] panic: Refactor the panic path
On 05/24/22 at 10:01am, Petr Mladek wrote: > On Fri 2022-05-20 08:23:33, Guilherme G. Piccoli wrote: > > On 19/05/2022 20:45, Baoquan He wrote: > > > [...] > > >> I really appreciate the summary skill you have, to convert complex > > >> problems in very clear and concise ideas. Thanks for that, very useful! > > >> I agree with what was summarized above. > > > > > > I want to say the similar words to Petr's reviewing comment when I went > > > through the patches and traced each reviewing sub-thread to try to > > > catch up. Petr has reivewed this series so carefully and given many > > > comments I want to ack immediately. > > > > > > I agree with most of the suggestions from Petr to this patch, except of > > > one tiny concern, please see below inline comment. > > > > Hi Baoquan, thanks! I'm glad you're also reviewing that =) > > > > > > > [...] > > > > > > I like the proposed skeleton of panic() and code style suggested by > > > Petr very much. About panic_prefer_crash_dump which might need be added, > > > I hope it has a default value true. This makes crash_dump execute at > > > first by default just as before, unless people specify > > > panic_prefer_crash_dump=0|n|off to disable it. Otherwise we need add > > > panic_prefer_crash_dump=1 in kernel and in our distros to enable kdump, > > > this is inconsistent with the old behaviour. > > > > I'd like to understand better why the crash_kexec() must always be the > > first thing in your use case. If we keep that behavior, we'll see all > > sorts of workarounds - see the last patches of this series, Hyper-V and > > PowerPC folks hardcoded "crash_kexec_post_notifiers" in order to force > > execution of their relevant notifiers (like the vmbus disconnect, > > specially in arm64 that has no custom machine_crash_shutdown, or the > > fadump case in ppc). This led to more risk in kdump. 
> > > > The thing is: with the notifiers' split, we tried to keep only the most > > relevant/necessary stuff in this first list, things that ultimately > > should improve kdump reliability or if not, at least not break it. My > > feeling is that, with this series, we should change the idea/concept > > that kdump must run first nevertheless, not matter what. We're here > > trying to accommodate the antagonistic goals of hypervisors that need > > some clean-up (even for kdump to work) VS. kdump users, that wish a > > "pristine" system reboot ASAP after the crash. > > Good question. I wonder if Baoquan knows about problems caused by the > particular notifiers that will end up in the hypervisor list. Note > that there will be some shuffles and the list will be slightly > different in V2. Yes, I knew some of them. Please check my response to Guilherme. We have bug to track the issue on Hyper-V in which failure happened during panic notifiers running, haven't come to kdump. Seems both of us sent mail replying to Guilherme at the same time. > > Anyway, I see four possible solutions: > > 1. The most conservative approach is to keep the current behavior > and call kdump first by default. > > 2. A medium conservative approach to change the default default > behavior and call hypervisor and eventually the info notifiers > before kdump. There still would be the possibility to call kdump > first by the command line parameter. > > 3. Remove the possibility to call kdump first completely. It would > assume that all the notifiers in the info list are super safe > or that they make kdump actually more safe. > > 4. Create one more notifier list for operations that always should > be called before crash_dump. I would vote for 1 or 4 without any hesitation, and prefer 4. I ever suggest the variant of solution 4 in v1 reviewing. That's taking those notifiers out of list and enforcing to execute them before kdump. E.g the one on HyperV to terminate VMbus connection. 
Maybe solution 4 is better to provide a determinate way for people to add necessary code at the earliest part. > > Regarding the extra notifier list (4th solution). It is not clear to > me whether it would be always called even before hypervisor list or > when kdump is not enabled. We must not over-engineer it. One thing I would like to note is, no matter how perfectly we split the lists this time, we can't guarantee people will add notifiers reasonably in the future. And people from different sub-components may not do sufficient investigation and add them to fulfil their local purpose. The current panic notifiers list is the best example. Hyper-V actually wants to run some necessary code before kdump, but not all of them, they just add it, ignoring the original purpose of crash_kexec_post_notifiers. I guess they do like this just because it's easy to do, no need to bother changing code in generic place. Solution 4 can make this no doubt, that's why I like it better. > > 2nd proposal looks like a good compromise. But maybe we could do > this change few releases later. The noti
Re: [RFC PATCH 1/4] objtool: Add --mnop as an option to --mcount
Christophe Leroy wrote:
> Le 23/05/2022 à 19:55, Sathvika Vasireddy a écrit :
>> Architectures can select HAVE_NOP_MCOUNT if they choose
>> to nop out mcount call sites. If that config option is
>> selected, then --mnop is passed as an option to objtool,
>> along with --mcount.
>
> Is there a reason not to nop out mcount call sites on powerpc as well ?

Yes, if there are functions that are out of range of _mcount(), then the
linker would have inserted long branch trampolines. We detect such cases
during boot. But, if we nop out the _mcount call sites during build
time, we will need some other way to identify these.

- Naveen
Re: [PATCH Linux] powerpc: add documentation for HWCAPs
* Nicholas Piggin:

> +2. Facilities
> +-------------
> +The Power ISA uses the term "facility" to describe a class of instructions,
> +registers, interrupts, etc. The presence or absence of a facility indicates
> +whether this class is available to be used, but the specifics depend on the
> +ISA version. For example, if the VSX facility is available, the VSX
> +instructions that can be used differ between the v3.0B and v3.1B ISA
> +versions.

The 2.07 ISA manual also has categories. ISA 3.0 made a lot of things
mandatory. It may make sense to clarify that feature bits for mandatory
aspects of the ISA are still set, to help with backwards compatibility.

Thanks,
Florian
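The backwards-compatibility point can be seen from how userspace usually consumes HWCAPs: each feature is tested as an individual bit in `AT_HWCAP` or `AT_HWCAP2`, so a binary that checks the VSX bit keeps working only if that bit stays set on newer ISA levels where the facility is mandatory. A minimal sketch using glibc's `getauxval()` follows. The bit values are copied from the powerpc uapi `cputable.h` definitions as an assumption for illustration, and `has_feature()`, `cpu_has_vsx()` and `cpu_has_isa_3_00()` are hypothetical helper names.

```c
#include <sys/auxv.h>

/*
 * Bit values as defined for powerpc in the kernel's uapi cputable.h. They are
 * repeated here (an assumption for illustration) so the sketch builds on any
 * Linux host, but they are only meaningful when run on powerpc.
 */
#ifndef PPC_FEATURE_HAS_VSX
#define PPC_FEATURE_HAS_VSX	0x00000080UL	/* reported in AT_HWCAP */
#endif
#ifndef PPC_FEATURE2_ARCH_3_00
#define PPC_FEATURE2_ARCH_3_00	0x00800000UL	/* reported in AT_HWCAP2 */
#endif

/* Test one feature bit in a HWCAP word. */
static int has_feature(unsigned long hwcap, unsigned long bit)
{
	return (hwcap & bit) != 0;
}

/* Checks the VSX facility bit directly, without inferring it from the ISA level. */
static int cpu_has_vsx(void)
{
	return has_feature(getauxval(AT_HWCAP), PPC_FEATURE_HAS_VSX);
}

/* Checks the ISA v3.0 level bit, which lives in the second HWCAP word. */
static int cpu_has_isa_3_00(void)
{
	return has_feature(getauxval(AT_HWCAP2), PPC_FEATURE2_ARCH_3_00);
}
```

A program built against an older ISA would typically call only something like `cpu_has_vsx()`, which is why the suggestion above asks the documentation to state that such bits remain set.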