Re: [RFC PATCH 3/3] objtool/mcount: Add powerpc specific functions
On Tue, Mar 29, 2022 at 05:32:18PM +, Christophe Leroy wrote:
> Le 29/03/2022 à 14:01, Michael Ellerman a écrit :
> > Josh Poimboeuf writes:
> >> On Sun, Mar 27, 2022 at 09:09:20AM +, Christophe Leroy wrote:
> >>> Second point is the endianess and 32/64 selection, especially when
> >>> crossbuilding. There is already some stuff regarding endianess based on
> >>> bswap_if_needed() but that's based on constant selection at build time
> >>> and I couldn't find an easy way to set it conditionaly based on the
> >>> target being built.
> >>>
> >>> Regarding 32/64 selection, there is almost nothing, it's based on using
> >>> type 'long' which means that at the time being the target and the build
> >>> platform must both be 32 bits or 64 bits.
> >>>
> >>> For both cases (endianess and 32/64) I think the solution should
> >>> probably be to start with the fileformat of the object file being
> >>> reworked by objtool.
> >>
> >> Do we really need to detect the endianness/bitness at runtime? Objtool
> >> is built with the kernel, why not just build-in the same target
> >> assumptions as the kernel itself?
> >
> > I don't think we need runtime detection. But it will need to support
> > basically most combinations of objtool running as 32-bit/64-bit LE/BE
> > while the kernel it's analysing is 32-bit/64-bit LE/BE.
>
> Exactly, the way it is done today with a constant in
> objtool/endianness.h is too simple, we need to be able to select it
> based on kernel's config. Is there a way to get the CONFIG_ macros from
> the kernel ? If yes then we could use CONFIG_64BIT and
> CONFIG_CPU_LITTLE_ENDIAN to select the correct options in objtool.

As of now, there's no good way to get CONFIG options from the kernel.
That's pretty much by design, since objtool is meant to be a standalone
tool. In fact there are people who've used objtool for other projects.

The objtool Makefile does at least have access to HOSTARCH/SRCARCH, but
I guess that doesn't help here.
We could maybe export the endian/bit details in env variables to the
objtool build somehow.

But, I managed to forget that objtool can already be cross-compiled for
an x86-64 target, from a 32-bit x86 LE host or a 64-bit powerpc BE host.
There are some people out there doing x86 kernel builds on such systems
who reported bugs, which have since been fixed. And the fixes were
pretty trivial, IIRC. Libelf actually does a decent job of abstracting
those details from objtool.

So, forget what I said, it might be ok to just detect endian/bit (and
possibly even arch) at runtime like you originally suggested. For
example bswap_if_needed() could be reworked to be a runtime check.

-- 
Josh
Re: [PATCH] ibmvscsis: increase INITIAL_SRP_LIMIT to 1024
Tyrel,

> The adapter request_limit is hardcoded to be INITIAL_SRP_LIMIT which
> is currently an arbitrary value of 800. Increase this value to 1024
> which better matches the characteristics of the typical IBMi Initiator
> that supports 32 LUNs and a queue depth of 32.

Applied to 5.18/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH v2] ftrace: Make ftrace_graph_is_dead() a static branch
On Fri, 25 Mar 2022 09:03:08 +0100 Christophe Leroy wrote:

> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -10,6 +10,7 @@
>  #include
>  #include
>  #include
> +#include

Small nit. Please order the includes in "upside-down x-mas tree"
fashion:

	#include
	#include
	#include
	#include

Thanks,

-- Steve
Re: [PATCH v2] MAINTAINERS: Enlarge coverage of TRACING inside architectures
On Fri, 25 Mar 2022 07:32:21 +0100 Christophe Leroy wrote:

> Most architectures have ftrace related stuff in arch/*/kernel/ftrace.c
> but powerpc has it spread in multiple files located in
> arch/powerpc/kernel/trace/
>
> In several architectures, there are also additional files containing
> 'ftrace' as part of the name but with some prefix or suffix.

Acked-by: Steven Rostedt (Google)

-- Steve
[GIT PULL] LIBNVDIMM update for v5.18
Hi Linus, please pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm tags/libnvdimm-for-5.18

...to receive the libnvdimm update for this cycle which includes the
deprecation of block-aperture mode and a new perf events interface for
the papr_scm nvdimm driver. The perf events approach was acked by
PeterZ.

You will notice the top commit is less than a week old as linux-next
exposure identified some build failure scenarios. Kajol turned around a
fix and it has appeared in linux-next with no additional reports. Some
other fixups for the removal of block-aperture mode also generated some
follow-on fixes from -next exposure. I am not aware of anything else
outstanding, please pull.

---

The following changes since commit 754e0b0e35608ed5206d6a67a791563c631cec07:

  Linux 5.17-rc4 (2022-02-13 12:13:30 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm tags/libnvdimm-for-5.18

for you to fetch changes up to ada8d8d337ee970860c9844126e634df8076aa11:

  nvdimm/blk: Fix title level (2022-03-23 17:52:33 -0700)

----------------------------------------------------------------
libnvdimm for 5.18

- Add perf support for nvdimm events, initially only for 'papr_scm'
  devices.

- Deprecate the 'block aperture' support in libnvdimm, it only ever
  existed in the specification, not in shipping product.
Dan Williams (6):
      nvdimm/region: Fix default alignment for small regions
      nvdimm/blk: Delete the block-aperture window driver
      nvdimm/namespace: Delete blk namespace consideration in shared paths
      nvdimm/namespace: Delete nd_namespace_blk
      ACPI: NFIT: Remove block aperture support
      nvdimm/region: Delete nd_blk_region infrastructure

Kajol Jain (6):
      drivers/nvdimm: Add nvdimm pmu structure
      drivers/nvdimm: Add perf interface to expose nvdimm performance stats
      powerpc/papr_scm: Add perf interface support
      docs: ABI: sysfs-bus-nvdimm: Document sysfs event format entries for nvdimm pmu
      drivers/nvdimm: Fix build failure when CONFIG_PERF_EVENTS is not set
      powerpc/papr_scm: Fix build failure when

Lukas Bulwahn (1):
      MAINTAINERS: remove section LIBNVDIMM BLK: MMIO-APERTURE DRIVER

Tom Rix (1):
      nvdimm/blk: Fix title level

 Documentation/ABI/testing/sysfs-bus-nvdimm |  35 ++
 Documentation/driver-api/nvdimm/nvdimm.rst | 406 +--
 MAINTAINERS                                |  11 -
 arch/powerpc/include/asm/device.h          |   5 +
 arch/powerpc/platforms/pseries/papr_scm.c  | 230 +
 drivers/acpi/nfit/core.c                   | 387 +-
 drivers/acpi/nfit/nfit.h                   |   6 -
 drivers/nvdimm/Kconfig                     |  25 +-
 drivers/nvdimm/Makefile                    |   4 +-
 drivers/nvdimm/blk.c                       | 335 ---
 drivers/nvdimm/bus.c                       |   2 -
 drivers/nvdimm/dimm_devs.c                 | 204 +---
 drivers/nvdimm/label.c                     | 346 +---
 drivers/nvdimm/label.h                     |   5 +-
 drivers/nvdimm/namespace_devs.c            | 506 ++---
 drivers/nvdimm/nd-core.h                   |  27 +-
 drivers/nvdimm/nd.h                        |  13 -
 drivers/nvdimm/nd_perf.c                   | 329 +++
 drivers/nvdimm/region.c                    |  31 +-
 drivers/nvdimm/region_devs.c               | 157 ++---
 include/linux/libnvdimm.h                  |  24 --
 include/linux/nd.h                         |  78 +++--
 include/uapi/linux/ndctl.h                 |   2 -
 tools/testing/nvdimm/Kbuild                |   4 -
 tools/testing/nvdimm/config_check.c        |   1 -
 tools/testing/nvdimm/test/ndtest.c         |  67 +---
 tools/testing/nvdimm/test/nfit.c           |  23 --

 27 files changed, 833 insertions(+), 2430 deletions(-)
[PATCH] i2c: pasemi: Wait for write xfers to finish
Wait for completion of write transfers before returning from the
driver. At first sight it may seem advantageous to leave write
transfers queued for the controller to carry out on its own time, but
there's a couple of issues with it:

 * The driver doesn't check for FIFO space.

 * The queued writes can complete while the driver is in its I2C read
   transfer path, which means it will get confused by the raising of
   XEN (the 'transaction ended' signal). This can cause a spurious
   ENODATA error due to premature reading of the MRXFIFO register.

Adding the wait fixes some unreliability issues with the driver.
There's some efficiency cost to it (especially with
pasemi_smb_waitready doing its polling), but that will be alleviated
once the driver receives interrupt support.

Fixes: beb58aa39e6e ("i2c: PA Semi SMBus driver")
Signed-off-by: Martin Povišer
---
Tested on Apple's t8103 chip. To my knowledge the PA Semi controller in
its pre-Apple occurrences behaves the same as far as this patch is
concerned.

 drivers/i2c/busses/i2c-pasemi-core.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/i2c/busses/i2c-pasemi-core.c b/drivers/i2c/busses/i2c-pasemi-core.c
index 7728c8460dc0..9028ffb58cc0 100644
--- a/drivers/i2c/busses/i2c-pasemi-core.c
+++ b/drivers/i2c/busses/i2c-pasemi-core.c
@@ -137,6 +137,12 @@ static int pasemi_i2c_xfer_msg(struct i2c_adapter *adapter,
 		TXFIFO_WR(smbus, msg->buf[msg->len-1] |
 			  (stop ? MTXFIFO_STOP : 0));
+
+		if (stop) {
+			err = pasemi_smb_waitready(smbus);
+			if (err)
+				goto reset_out;
+		}
 	}

 	return 0;
--
2.33.0
Re: [RFC PATCH 3/3] objtool/mcount: Add powerpc specific functions
Le 29/03/2022 à 14:01, Michael Ellerman a écrit :
> Josh Poimboeuf writes:
>> On Sun, Mar 27, 2022 at 09:09:20AM +, Christophe Leroy wrote:
>>> Second point is the endianess and 32/64 selection, especially when
>>> crossbuilding. There is already some stuff regarding endianess based on
>>> bswap_if_needed() but that's based on constant selection at build time
>>> and I couldn't find an easy way to set it conditionaly based on the
>>> target being built.
>>>
>>> Regarding 32/64 selection, there is almost nothing, it's based on using
>>> type 'long' which means that at the time being the target and the build
>>> platform must both be 32 bits or 64 bits.
>>>
>>> For both cases (endianess and 32/64) I think the solution should
>>> probably be to start with the fileformat of the object file being
>>> reworked by objtool.
>>
>> Do we really need to detect the endianness/bitness at runtime? Objtool
>> is built with the kernel, why not just build-in the same target
>> assumptions as the kernel itself?
>
> I don't think we need runtime detection. But it will need to support
> basically most combinations of objtool running as 32-bit/64-bit LE/BE
> while the kernel it's analysing is 32-bit/64-bit LE/BE.

Exactly, the way it is done today with a constant in
objtool/endianness.h is too simple, we need to be able to select it
based on kernel's config. Is there a way to get the CONFIG_ macros from
the kernel ? If yes then we could use CONFIG_64BIT and
CONFIG_CPU_LITTLE_ENDIAN to select the correct options in objtool.

>
>>> What are current works in progress on objtool ? Should I wait Josh's
>>> changes before starting looking at all this ? Should I wait for anything
>>> else ?
>>
>> I'm not making any major changes to the code, just shuffling things
>> around to make the interface more modular. I hope to have something
>> soon (this week). Peter recently added a big feature (Intel IBT) which
>> is already in -next.
>>
>> Contributions are welcome, with the understanding that you'll help
>> maintain it ;-)
>>
>> Some years ago Kamalesh Babulal had a prototype of objtool for ppc64le
>> which did the full stack validation. I'm not sure what ever became of
>> that.
>
> From memory he was starting to clean the patches up in late 2019, but I
> guess that probably got derailed by COVID. AFAIK he never posted
> anything. Maybe someone at IBM has a copy internally (Naveen?).
>
>> FWIW, there have been some objtool patches for arm64 stack validation,
>> but the arm64 maintainers have been hesitant to get on board with
>> objtool, as it brings a certain maintenance burden. Especially for the
>> full stack validation and ORC unwinder. But if you only want inline
>> static calls and/or mcount then it'd probably be much easier to
>> maintain.
>
> I would like to have the stack validation, but I am also worried about
> the maintenance burden.
>
> I guess we start with mcount, which looks pretty minimal judging by this
> series, and see how we go from there.

I'm not sure mcount is really needed as we have recordmcount, but at
least it is an easy one to start with, and as we have recordmcount we
can easily compare the results and check that it works as expected.

Then it should be straightforward to provide static calls.

Then I'd like to go with uaccess block checks as suggested by Christoph
at
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/a94be61f008ab29c231b805e1a97e9dab35cb0cc.1629732940.git.christophe.le...@csgroup.eu/,
though it might be less easy.

Christophe
[PATCH v2 8/8] powerpc/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE for book3s
Right now, the last 5 bits (0x1f) of the swap entry are used for the
type and the bit before that (0x20) is used for _PAGE_SWP_SOFT_DIRTY.
We cannot use 0x40, as that collides with _RPAGE_RSV1 -- contained in
_PAGE_HPTEFLAGS. The next candidate would be _RPAGE_SW3 (0x200) --
which is used for _PAGE_SOFT_DIRTY for !swp ptes.

So let's just use _PAGE_SOFT_DIRTY for _PAGE_SWP_SOFT_DIRTY (to make it
easier to grasp) and use 0x20 now for _PAGE_SWP_EXCLUSIVE.

Signed-off-by: David Hildenbrand
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 21 +++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 8e98375d5c4a..eecff2036869 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -752,6 +752,7 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
	 */ \
	BUILD_BUG_ON(_PAGE_HPTEFLAGS & SWP_TYPE_MASK); \
	BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_SOFT_DIRTY); \
+	BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_EXCLUSIVE);	\
 } while (0)

 #define SWP_TYPE_BITS 5
@@ -772,11 +773,13 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 #define __swp_entry_to_pmd(x)	(pte_pmd(__swp_entry_to_pte(x)))

 #ifdef CONFIG_MEM_SOFT_DIRTY
-#define _PAGE_SWP_SOFT_DIRTY	_PAGE_NON_IDEMPOTENT
+#define _PAGE_SWP_SOFT_DIRTY	_PAGE_SOFT_DIRTY
 #else
 #define _PAGE_SWP_SOFT_DIRTY	0UL
 #endif /* CONFIG_MEM_SOFT_DIRTY */

+#define _PAGE_SWP_EXCLUSIVE	_PAGE_NON_IDEMPOTENT
+
 #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
 static inline pte_t pte_swp_mksoft_dirty(pte_t pte)
 {
@@ -794,6 +797,22 @@ static inline pte_t pte_swp_clear_soft_dirty(pte_t pte)
 }
 #endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */

+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+	return __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_SWP_EXCLUSIVE));
+}
+
+static inline int pte_swp_exclusive(pte_t pte)
+{
+	return !!(pte_raw(pte) & cpu_to_be64(_PAGE_SWP_EXCLUSIVE));
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+	return __pte_raw(pte_raw(pte) & cpu_to_be64(~_PAGE_SWP_EXCLUSIVE));
+}
+
 static inline bool check_pte_access(unsigned long access, unsigned long ptev)
 {
	/*
--
2.35.1
[PATCH v2 7/8] powerpc/pgtable: remove _PAGE_BIT_SWAP_TYPE for book3s
The swap type is simply stored in bits 0x1f of the swap pte. Let's
simplify by just getting rid of _PAGE_BIT_SWAP_TYPE. It's not as if we
could simply change it: _PAGE_SWP_SOFT_DIRTY would suddenly fall into
_RPAGE_RSV1, which isn't possible and would make the
BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_SOFT_DIRTY) angry.

While at it, make it clearer which bit we're actually using for
_PAGE_SWP_SOFT_DIRTY by just using the proper define, and introduce and
use SWP_TYPE_MASK.

Signed-off-by: David Hildenbrand
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 875730d5af40..8e98375d5c4a 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -13,7 +13,6 @@
 /*
  * Common bits between hash and Radix page table
  */
-#define _PAGE_BIT_SWAP_TYPE	0

 #define _PAGE_EXEC		0x1 /* execute permission */
 #define _PAGE_WRITE		0x2 /* write access allowed */
@@ -751,17 +750,16 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
	 * Don't have overlapping bits with _PAGE_HPTEFLAGS \
	 * We filter HPTEFLAGS on set_pte. \
	 */ \
-	BUILD_BUG_ON(_PAGE_HPTEFLAGS & (0x1f << _PAGE_BIT_SWAP_TYPE)); \
+	BUILD_BUG_ON(_PAGE_HPTEFLAGS & SWP_TYPE_MASK); \
	BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_SOFT_DIRTY); \
 } while (0)

 #define SWP_TYPE_BITS 5
-#define __swp_type(x)		(((x).val >> _PAGE_BIT_SWAP_TYPE) \
-				& ((1UL << SWP_TYPE_BITS) - 1))
+#define SWP_TYPE_MASK		((1UL << SWP_TYPE_BITS) - 1)
+#define __swp_type(x)		((x).val & SWP_TYPE_MASK)
 #define __swp_offset(x)		(((x).val & PTE_RPN_MASK) >> PAGE_SHIFT)
 #define __swp_entry(type, offset)	((swp_entry_t) { \
-				((type) << _PAGE_BIT_SWAP_TYPE) \
-				| (((offset) << PAGE_SHIFT) & PTE_RPN_MASK)})
+				(type) | (((offset) << PAGE_SHIFT) & PTE_RPN_MASK)})

 /*
  * swp_entry_t must be independent of pte bits. We build a swp_entry_t from
  * swap type and offset we get from swap and convert that to pte to find a
@@ -774,7 +772,7 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 #define __swp_entry_to_pmd(x)	(pte_pmd(__swp_entry_to_pte(x)))

 #ifdef CONFIG_MEM_SOFT_DIRTY
-#define _PAGE_SWP_SOFT_DIRTY	(1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYPE))
+#define _PAGE_SWP_SOFT_DIRTY	_PAGE_NON_IDEMPOTENT
 #else
 #define _PAGE_SWP_SOFT_DIRTY	0UL
 #endif /* CONFIG_MEM_SOFT_DIRTY */
--
2.35.1
[PATCH v2 6/8] s390/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's use bit 52, which is unused.

Signed-off-by: David Hildenbrand
---
 arch/s390/include/asm/pgtable.h | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 3982575bb586..a397b072a580 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -181,6 +181,8 @@ static inline int is_module_addr(void *addr)
 #define _PAGE_SOFT_DIRTY 0x000
 #endif

+#define _PAGE_SWP_EXCLUSIVE _PAGE_LARGE	/* SW pte exclusive swap bit */
+
 /* Set of bits not changed in pte_modify */
 #define _PAGE_CHG_MASK		(PAGE_MASK | _PAGE_SPECIAL | _PAGE_DIRTY | \
				 _PAGE_YOUNG | _PAGE_SOFT_DIRTY)
@@ -826,6 +828,22 @@ static inline int pmd_protnone(pmd_t pmd)
 }
 #endif

+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline int pte_swp_exclusive(pte_t pte)
+{
+	return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+	return set_pte_bit(pte, __pgprot(_PAGE_SWP_EXCLUSIVE));
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+	return clear_pte_bit(pte, __pgprot(_PAGE_SWP_EXCLUSIVE));
+}
+
 static inline int pte_soft_dirty(pte_t pte)
 {
 	return pte_val(pte) & _PAGE_SOFT_DIRTY;
@@ -1715,14 +1733,15 @@ static inline int has_transparent_hugepage(void)
  * Bits 54 and 63 are used to indicate the page type. Bit 53 marks the pte
  * as invalid.
  * A swap pte is indicated by bit pattern (pte & 0x201) == 0x200
- * | offset|X11XX|type |S0|
+ * | offset|E11XX|type |S0|
  * |001122334455|5|55566|66|
  * |0123456789012345678901234567890123456789012345678901|23456|78901|23|
  *
  * Bits 0-51 store the offset.
+ * Bit 52 (E) is used to remember PG_anon_exclusive.
  * Bits 57-61 store the type.
  * Bit 62 (S) is used for softdirty tracking.
- * Bits 52, 55 and 56 (X) are unused.
+ * Bits 55 and 56 (X) are unused.
  */
 #define __SWP_OFFSET_MASK ((1UL << 52) - 1)
--
2.35.1
[PATCH v2 5/8] s390/pgtable: cleanup description of swp pte layout
Bit 52 and bit 55 don't have to be zero: they only trigger a
translation-specification exception if the PTE is marked as valid,
which is not the case for swap ptes.

Document which bits are used for what, and which ones are unused.

Signed-off-by: David Hildenbrand
---
 arch/s390/include/asm/pgtable.h | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 9df679152620..3982575bb586 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1712,18 +1712,17 @@ static inline int has_transparent_hugepage(void)
 /*
  * 64 bit swap entry format:
  * A page-table entry has some bits we have to treat in a special way.
- * Bits 52 and bit 55 have to be zero, otherwise a specification
- * exception will occur instead of a page translation exception. The
- * specification exception has the bad habit not to store necessary
- * information in the lowcore.
- * Bits 54 and 63 are used to indicate the page type.
+ * Bits 54 and 63 are used to indicate the page type. Bit 53 marks the pte
+ * as invalid.
  * A swap pte is indicated by bit pattern (pte & 0x201) == 0x200
- * This leaves the bits 0-51 and bits 56-62 to store type and offset.
- * We use the 5 bits from 57-61 for the type and the 52 bits from 0-51
- * for the offset.
- * | offset|01100|type |00|
+ * | offset|X11XX|type |S0|
  * |001122334455|5|55566|66|
  * |0123456789012345678901234567890123456789012345678901|23456|78901|23|
+ *
+ * Bits 0-51 store the offset.
+ * Bits 57-61 store the type.
+ * Bit 62 (S) is used for softdirty tracking.
+ * Bits 52, 55 and 56 (X) are unused.
  */
 #define __SWP_OFFSET_MASK ((1UL << 52) - 1)
--
2.35.1
[PATCH v2 4/8] arm64/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's use one of the type bits: core-mm only supports 5, so there is no
need to consume 6.

Note that we might be able to reuse bit 1, but reusing bit 1 turned out
problematic in the past for PROT_NONE handling; so let's play safe and
use another bit.

Reviewed-by: Catalin Marinas
Signed-off-by: David Hildenbrand
---
 arch/arm64/include/asm/pgtable-prot.h |  1 +
 arch/arm64/include/asm/pgtable.h      | 23 ++++++++++++++++++++---
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index b1e1b74d993c..62e0ebeed720 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -14,6 +14,7 @@
  * Software defined PTE bits definition.
  */
 #define PTE_WRITE		(PTE_DBM)	/* same as DBM (51) */
+#define PTE_SWP_EXCLUSIVE	(_AT(pteval_t, 1) << 2)	/* only for swp ptes */
 #define PTE_DIRTY		(_AT(pteval_t, 1) << 55)
 #define PTE_SPECIAL		(_AT(pteval_t, 1) << 56)
 #define PTE_DEVMAP		(_AT(pteval_t, 1) << 57)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 94e147e5456c..ad9b221963d4 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -402,6 +402,22 @@ static inline pgprot_t mk_pmd_sect_prot(pgprot_t prot)
 	return __pgprot((pgprot_val(prot) & ~PMD_TABLE_BIT) | PMD_TYPE_SECT);
 }

+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+	return set_pte_bit(pte, __pgprot(PTE_SWP_EXCLUSIVE));
+}
+
+static inline int pte_swp_exclusive(pte_t pte)
+{
+	return pte_val(pte) & PTE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+	return clear_pte_bit(pte, __pgprot(PTE_SWP_EXCLUSIVE));
+}
+
 #ifdef CONFIG_NUMA_BALANCING
 /*
  * See the comment in include/linux/pgtable.h
@@ -909,12 +925,13 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 /*
  * Encode and decode a swap entry:
  *	bits 0-1:	present (must be zero)
- *	bits 2-7:	swap type
+ *	bits 2:		remember PG_anon_exclusive
+ *	bits 3-7:	swap type
  *	bits 8-57:	swap offset
  *	bit  58:	PTE_PROT_NONE (must be zero)
  */
-#define __SWP_TYPE_SHIFT	2
-#define __SWP_TYPE_BITS		6
+#define __SWP_TYPE_SHIFT	3
+#define __SWP_TYPE_BITS		5
 #define __SWP_OFFSET_BITS	50
 #define __SWP_TYPE_MASK		((1 << __SWP_TYPE_BITS) - 1)
 #define __SWP_OFFSET_SHIFT	(__SWP_TYPE_BITS + __SWP_TYPE_SHIFT)
--
2.35.1
[PATCH v2 3/8] x86/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's use bit 3 to remember PG_anon_exclusive in swap ptes.

Signed-off-by: David Hildenbrand
---
 arch/x86/include/asm/pgtable.h       | 16 ++++++++++++++++
 arch/x86/include/asm/pgtable_64.h    |  4 +++-
 arch/x86/include/asm/pgtable_types.h |  5 +++++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 62ab07e24aef..e42e668153e9 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1292,6 +1292,22 @@ static inline void update_mmu_cache_pud(struct vm_area_struct *vma,
 {
 }

+#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+	return pte_set_flags(pte, _PAGE_SWP_EXCLUSIVE);
+}
+
+static inline int pte_swp_exclusive(pte_t pte)
+{
+	return pte_flags(pte) & _PAGE_SWP_EXCLUSIVE;
+}
+
+static inline pte_t pte_swp_clear_exclusive(pte_t pte)
+{
+	return pte_clear_flags(pte, _PAGE_SWP_EXCLUSIVE);
+}
+
 #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
 static inline pte_t pte_swp_mksoft_dirty(pte_t pte)
 {
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 56d0399a0cd1..e479491da8d5 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -186,7 +186,7 @@ static inline void native_pgd_clear(pgd_t *pgd)
  *
  * | ...            | 11| 10|  9|8|7|6|5| 4| 3|2| 1|0| <- bit number
  * | ...            |SW3|SW2|SW1|G|L|D|A|CD|WT|U| W|P| <- bit names
- * | TYPE (59-63) | ~OFFSET (9-58)  |0|0|X|X| X| X|F|SD|0| <- swp entry
+ * | TYPE (59-63) | ~OFFSET (9-58)  |0|0|X|X| X| E|F|SD|0| <- swp entry
  *
  * G (8) is aliased and used as a PROT_NONE indicator for
  * !present ptes.  We need to start storing swap entries above
@@ -203,6 +203,8 @@ static inline void native_pgd_clear(pgd_t *pgd)
  * F (2) in swp entry is used to record when a pagetable is
  * writeprotected by userfaultfd WP support.
  *
+ * E (3) in swp entry is used to remember PG_anon_exclusive.
+ *
  * Bit 7 in swp entry should be 0 because pmd_present checks not only P,
  * but also L and G.
  *
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 40497a9020c6..54a8f370046d 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -83,6 +83,11 @@
 #define _PAGE_SOFT_DIRTY	(_AT(pteval_t, 0))
 #endif

+/*
+ * We borrow bit 3 to remember PG_anon_exclusive.
+ */
+#define _PAGE_SWP_EXCLUSIVE	_PAGE_PWT
+
 /*
  * Tracking soft dirty bit when a page goes to a swap is tricky.
  * We need a bit which can be stored in pte _and_ not conflict
--
2.35.1
[PATCH v2 1/8] mm/swap: remember PG_anon_exclusive via a swp pte bit
Currently, we clear PG_anon_exclusive in try_to_unmap() and forget
about it. We do this, to keep fork() logic on swap entries easy and
efficient: for example, if we wouldn't clear it when unmapping, we'd
have to lookup the page in the swapcache for each and every swap entry
during fork() and clear PG_anon_exclusive if set.

Instead, we want to store that information directly in the swap pte,
protected by the page table lock, similarly to how we handle
SWP_MIGRATION_READ_EXCLUSIVE for migration entries. However, for actual
swap entries, we don't want to mess with the swap type (e.g., still one
bit) because it overcomplicates swap code.

In try_to_unmap(), we already reject to unmap in case the page might be
pinned, because we must not lose PG_anon_exclusive on pinned pages
ever. Checking if there are other unexpected references reliably
*before* completely unmapping a page is unfortunately not really
possible: THPs heavily overcomplicate the situation. Once fully
unmapped it's easier -- we, for example, make sure that there are no
unexpected references *after* unmapping a page before starting
writeback on that page.

So, we currently might end up unmapping a page and clearing
PG_anon_exclusive if that page has additional references, for example,
due to a FOLL_GET.

do_swap_page() has to re-determine if a page is exclusive, which will
easily fail if there are other references on a page, most prominently
GUP references via FOLL_GET. This can currently result in memory
corruptions when taking a FOLL_GET | FOLL_WRITE reference on a page
even when fork() is never involved: try_to_unmap() will succeed, and
when refaulting the page, it cannot be marked exclusive and will get
replaced by a copy in the page tables on the next write access,
resulting in writes via the GUP reference to the page being lost.

In an ideal world, everybody that uses GUP and wants to modify page
content, such as O_DIRECT, would properly use FOLL_PIN. However, that
conversion will take a while.
It's easier to fix what used to work in the past (FOLL_GET | FOLL_WRITE)
remembering PG_anon_exclusive. In addition, by remembering
PG_anon_exclusive we can further reduce unnecessary COW in some cases,
so it's the natural thing to do.

So let's transfer the PG_anon_exclusive information to the swap pte and
store it via an architecture-dependent pte bit; use that information
when restoring the swap pte in do_swap_page() and unuse_pte(). During
fork(), we simply have to clear the pte bit and are done.

Of course, there is one corner case to handle: swap backends that don't
support concurrent page modifications while the page is under
writeback. Special case these, and drop the exclusive marker. Add a
comment why that is just fine (also, reuse_swap_page() would have done
the same in the past).

In the future, we'll hopefully have all architectures support
__HAVE_ARCH_PTE_SWP_EXCLUSIVE, such that we can get rid of the empty
stubs and the define completely. Then, we can also convert
SWP_MIGRATION_READ_EXCLUSIVE. For architectures it's fairly easy to
support: either simply use a yet unused pte bit that can be used for
swap entries, steal one from the arch type bits if they exceed 5, or
steal one from the offset bits.

Note: R/O FOLL_GET references were never really reliable, especially
when taking one on a shared page and then writing to the page (e.g.,
GUP after fork()). FOLL_GET, including R/W references, were never
really reliable once fork was involved (e.g., GUP before fork(), GUP
during fork()). KSM steps back in case it stumbles over unexpected
references and is, therefore, fine.
Signed-off-by: David Hildenbrand
---
 include/linux/pgtable.h | 29 ++
 include/linux/swapops.h |  2 ++
 mm/memory.c             | 55 ++---
 mm/rmap.c               | 19 --
 mm/swapfile.c           | 13 +-
 5 files changed, 105 insertions(+), 13 deletions(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index f4f4077b97aa..53750224e176 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1003,6 +1003,35 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot)
 #define arch_start_context_switch(prev)	do {} while (0)
 #endif

+/*
+ * When replacing an anonymous page by a real (!non) swap entry, we clear
+ * PG_anon_exclusive from the page and instead remember whether the flag was
+ * set in the swp pte. During fork(), we have to mark the entry as !exclusive
+ * (possibly shared). On swapin, we use that information to restore
+ * PG_anon_exclusive, which is very helpful in cases where we might have
+ * additional (e.g., FOLL_GET) references on a page and wouldn't be able to
+ * detect exclusivity.
+ *
+ * These functions don't apply to non-swap entries (e.g., migration, hwpoison,
+ * ...).
+ */
+#ifndef __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+static inline pte_t pte_swp_mkexclusive(pte_t pte)
+{
+	return pte;
+}
+
+static inline int
[PATCH v2 2/8] mm/debug_vm_pgtable: add tests for __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Let's test that __HAVE_ARCH_PTE_SWP_EXCLUSIVE works as expected.

Signed-off-by: David Hildenbrand
---
 mm/debug_vm_pgtable.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index db2abd9e415b..55f1a8dc716f 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -837,6 +837,19 @@ static void __init pmd_soft_dirty_tests(struct pgtable_debug_args *args) { }
 static void __init pmd_swap_soft_dirty_tests(struct pgtable_debug_args *args) { }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */

+static void __init pte_swap_exclusive_tests(struct pgtable_debug_args *args)
+{
+#ifdef __HAVE_ARCH_PTE_SWP_EXCLUSIVE
+	pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot);
+
+	pr_debug("Validating PTE swap exclusive\n");
+	pte = pte_swp_mkexclusive(pte);
+	WARN_ON(!pte_swp_exclusive(pte));
+	pte = pte_swp_clear_exclusive(pte);
+	WARN_ON(pte_swp_exclusive(pte));
+#endif /* __HAVE_ARCH_PTE_SWP_EXCLUSIVE */
+}
+
 static void __init pte_swap_tests(struct pgtable_debug_args *args)
 {
 	swp_entry_t swp;
@@ -1288,6 +1301,8 @@ static int __init debug_vm_pgtable(void)
 	pte_swap_soft_dirty_tests();
 	pmd_swap_soft_dirty_tests();

+	pte_swap_exclusive_tests();
+
 	pte_swap_tests();
 	pmd_swap_tests();
--
2.35.1
[PATCH v2 0/8] mm: COW fixes part 3: reliable GUP R/W FOLL_GET of anonymous pages
More information on the general COW issues can be found at [2]. This series is based on latest linus/master and [1]: [PATCH v3 00/16] mm: COW fixes part 2: reliable GUP pins of anonymous pages v2 is located at: https://github.com/davidhildenbrand/linux/tree/cow_fixes_part_3_v2 This series fixes memory corruptions when a GUP R/W reference (FOLL_WRITE | FOLL_GET) was taken on an anonymous page and COW logic fails to detect exclusivity of the page to then replace the anonymous page by a copy in the page table: The GUP reference lost synchronicity with the pages mapped into the page tables. This series focuses on x86, arm64, s390x and ppc64/book3s -- other architectures are fairly easy to support by implementing __HAVE_ARCH_PTE_SWP_EXCLUSIVE. This primarily fixes the O_DIRECT memory corruptions that can happen on concurrent swapout, whereby we lose DMA reads to a page (modifying the user page by writing to it). O_DIRECT currently uses FOLL_GET for short-term (!FOLL_LONGTERM) DMA from/to a user page. In the long run, we want to convert it to properly use FOLL_PIN, and John is working on it, but that might take a while and might not be easy to backport. In the meantime, let's restore what used to work before we started modifying our COW logic: make R/W FOLL_GET references reliable as long as there is no fork() after GUP involved. This is just the natural follow-up of part 2, that will also further reduce "wrong COW" on the swapin path, for example, when we cannot remove a page from the swapcache due to concurrent writeback, or if we have two threads faulting on the same swapped-out page. Fixing O_DIRECT is just a nice side-product. This issue, including other related COW issues, has been summarized in [3] under 2): " 2. 
Intra Process Memory Corruptions due to Wrong COW (FOLL_GET) It was discovered that we can create a memory corruption by reading a file via O_DIRECT to a part (e.g., first 512 bytes) of a page, concurrently writing to an unrelated part (e.g., last byte) of the same page, and concurrently write-protecting the page via clear_refs SOFTDIRTY tracking [6]. For the reproducer, the issue is that O_DIRECT grabs a reference of the target page (via FOLL_GET) and clear_refs write-protects the relevant page table entry. On successive write access to the page from the process itself, we wrongly COW the page when resolving the write fault, resulting in a loss of synchronicity and consequently a memory corruption. While some people might think that using clear_refs in this combination is a corner case, it turns out to be a more generic problem unfortunately. For example, it was just recently discovered that we can similarly create a memory corruption without clear_refs, simply by concurrently swapping out the buffer pages [7]. Note that we nowadays even use the swap infrastructure in Linux without an actual swap disk/partition: the prime example is zram which is enabled as default under Fedora [10]. The root issue is that a write-fault on a page that has additional references results in a COW and thereby a loss of synchronicity and consequently a memory corruption if two parties believe they are referencing the same page. " We don't particularly care about R/O FOLL_GET references: they were never reliable and O_DIRECT doesn't expect to observe modifications from a page after DMA was started. Note that: * this only fixes the issue on x86, arm64, s390x and ppc64/book3s ("enterprise architectures"). Other architectures have to implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE to achieve the same. * this does *not* consider any kind of fork() after taking the reference: fork() after GUP never worked reliably with FOLL_GET. 
* Not losing PG_anon_exclusive during swapout was the last remaining piece. KSM already makes sure that there are no other references on a page before considering it for sharing. Page migration maintains PG_anon_exclusive and simply fails when there are additional references (freezing the refcount fails). Only swapout code dropped the PG_anon_exclusive flag because it requires more work to remember + restore it. With this series in place, most COW issues of [3] are fixed on said architectures. Other architectures can implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE fairly easily. [1] https://lkml.kernel.org/r/20220329160440.193848-1-da...@redhat.com [2] https://lkml.kernel.org/r/20211217113049.23850-1-da...@redhat.com [3] https://lore.kernel.org/r/3ae33b08-d9ef-f846-56fb-645e3b9b4...@redhat.com v2 -> v3: * Rebased and retested * "arm64/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE" -> Add RB and a comment to the patch description * "s390/pgtable: cleanup description of swp pte layout" -> Added * "s390/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE" -> Use new set_pte_bit()/clear_pte_bit() -> Fixups comments/patch description David Hildenbrand (8): mm/swap: remember PG_anon_exclusive via a swp pte
[PATCH] powerpc/pseries/vas: use default_groups in kobj_type
There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the pseries vas sysfs code to use default_groups field which has been the preferred way since aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Haren Myneni Cc: Nicholas Piggin Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Greg Kroah-Hartman --- Note, I would like to take this through my driver-core tree for 5.18-rc2 as this is the last hold-out of the default_attrs field. It "snuck" in as new code for 5.18-rc1, any objection to me taking it? thanks, greg k-h arch/powerpc/platforms/pseries/vas-sysfs.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/pseries/vas-sysfs.c b/arch/powerpc/platforms/pseries/vas-sysfs.c index 4a7fcde5afc0..909535ca513a 100644 --- a/arch/powerpc/platforms/pseries/vas-sysfs.c +++ b/arch/powerpc/platforms/pseries/vas-sysfs.c @@ -99,6 +99,7 @@ static struct attribute *vas_def_capab_attrs[] = { _used_credits_attribute.attr, NULL, }; +ATTRIBUTE_GROUPS(vas_def_capab); static struct attribute *vas_qos_capab_attrs[] = { _total_credits_attribute.attr, @@ -106,6 +107,7 @@ static struct attribute *vas_qos_capab_attrs[] = { _total_credits_attribute.attr, NULL, }; +ATTRIBUTE_GROUPS(vas_qos_capab); static ssize_t vas_type_show(struct kobject *kobj, struct attribute *attr, char *buf) @@ -154,13 +156,13 @@ static const struct sysfs_ops vas_sysfs_ops = { static struct kobj_type vas_def_attr_type = { .release= vas_type_release, .sysfs_ops = _sysfs_ops, - .default_attrs = vas_def_capab_attrs, + .default_groups = vas_def_capab_groups, }; static struct kobj_type vas_qos_attr_type = { .release= vas_type_release, .sysfs_ops = _sysfs_ops, - .default_attrs = vas_qos_capab_attrs, + 
.default_groups = vas_qos_capab_groups, }; static char *vas_caps_kobj_name(struct vas_caps_entry *centry, -- 2.35.1
Re: [PATCH 09/22] gpio-winbond: Use C99 initializers
On Sat, Mar 26, 2022 at 6:00 PM Benjamin Stürz wrote: > > This replaces comments with C99's designated > initializers because the kernel supports them now. > > Signed-off-by: Benjamin Stürz > --- > drivers/gpio/gpio-winbond.c | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpio/gpio-winbond.c b/drivers/gpio/gpio-winbond.c > index 7f8f5b02e31d..0b637fdb407c 100644 > --- a/drivers/gpio/gpio-winbond.c > +++ b/drivers/gpio/gpio-winbond.c > @@ -249,7 +249,7 @@ struct winbond_gpio_info { > }; > > static const struct winbond_gpio_info winbond_gpio_infos[6] = { > - { /* 0 */ > + [0] = { > .dev = WB_SIO_DEV_GPIO12, > .enablereg = WB_SIO_GPIO12_REG_ENABLE, > .enablebit = WB_SIO_GPIO12_ENABLE_1, > @@ -266,7 +266,7 @@ static const struct winbond_gpio_info > winbond_gpio_infos[6] = { > .warnonly = true > } > }, > - { /* 1 */ > + [1] = { > .dev = WB_SIO_DEV_GPIO12, > .enablereg = WB_SIO_GPIO12_REG_ENABLE, > .enablebit = WB_SIO_GPIO12_ENABLE_2, > @@ -277,7 +277,7 @@ static const struct winbond_gpio_info > winbond_gpio_infos[6] = { > .datareg = WB_SIO_GPIO12_REG_DATA2 > /* special conflict handling so doesn't use conflict data */ > }, > - { /* 2 */ > + [2] = { > .dev = WB_SIO_DEV_GPIO34, > .enablereg = WB_SIO_GPIO34_REG_ENABLE, > .enablebit = WB_SIO_GPIO34_ENABLE_3, > @@ -294,7 +294,7 @@ static const struct winbond_gpio_info > winbond_gpio_infos[6] = { > .warnonly = true > } > }, > - { /* 3 */ > + [3] = { > .dev = WB_SIO_DEV_GPIO34, > .enablereg = WB_SIO_GPIO34_REG_ENABLE, > .enablebit = WB_SIO_GPIO34_ENABLE_4, > @@ -311,7 +311,7 @@ static const struct winbond_gpio_info > winbond_gpio_infos[6] = { > .warnonly = true > } > }, > - { /* 4 */ > + [4] = { > .dev = WB_SIO_DEV_WDGPIO56, > .enablereg = WB_SIO_WDGPIO56_REG_ENABLE, > .enablebit = WB_SIO_WDGPIO56_ENABLE_5, > @@ -328,7 +328,7 @@ static const struct winbond_gpio_info > winbond_gpio_infos[6] = { > .warnonly = true > } > }, > - { /* 5 */ > + [5] = { > .dev = WB_SIO_DEV_WDGPIO56, > .enablereg 
= WB_SIO_WDGPIO56_REG_ENABLE, > .enablebit = WB_SIO_WDGPIO56_ENABLE_6, > -- > 2.35.1 > Acked-by: Bartosz Golaszewski
Re: [RFC PATCH 3/3] objtool/mcount: Add powerpc specific functions
Josh Poimboeuf writes: > On Sun, Mar 27, 2022 at 09:09:20AM +, Christophe Leroy wrote: >> Second point is the endianess and 32/64 selection, especially when >> crossbuilding. There is already some stuff regarding endianess based on >> bswap_if_needed() but that's based on constant selection at build time >> and I couldn't find an easy way to set it conditionaly based on the >> target being built. >> >> Regarding 32/64 selection, there is almost nothing, it's based on using >> type 'long' which means that at the time being the target and the build >> platform must both be 32 bits or 64 bits. >> >> For both cases (endianess and 32/64) I think the solution should >> probably be to start with the fileformat of the object file being >> reworked by objtool. > > Do we really need to detect the endianness/bitness at runtime? Objtool > is built with the kernel, why not just build-in the same target > assumptions as the kernel itself? I don't think we need runtime detection. But it will need to support basically most combinations of objtool running as 32-bit/64-bit LE/BE while the kernel it's analysing is 32-bit/64-bit LE/BE. >> What are current works in progress on objtool ? Should I wait Josh's >> changes before starting looking at all this ? Should I wait for anything >> else ? > > I'm not making any major changes to the code, just shuffling things > around to make the interface more modular. I hope to have something > soon (this week). Peter recently added a big feature (Intel IBT) which > is already in -next. > > Contributions are welcome, with the understanding that you'll help > maintain it ;-) > > Some years ago Kamalesh Babulal had a prototype of objtool for ppc64le > which did the full stack validation. I'm not sure what ever became of > that. From memory he was starting to clean the patches up in late 2019, but I guess that probably got derailed by COVID. AFAIK he never posted anything. Maybe someone at IBM has a copy internally (Naveen?). 
> FWIW, there have been some objtool patches for arm64 stack validation, > but the arm64 maintainers have been hesitant to get on board with > objtool, as it brings a certain maintenance burden. Especially for the > full stack validation and ORC unwinder. But if you only want inline > static calls and/or mcount then it'd probably be much easier to > maintain. I would like to have the stack validation, but I am also worried about the maintenance burden. I guess we start with mcount, which looks pretty minimal judging by this series, and see how we go from there. cheers
Re: [PATCH] powerpc/rtas: Keep MSR RI set when calling RTAS
Laurent Dufour writes: > On 29/03/2022, 10:31:33, Nicholas Piggin wrote: >> Excerpts from Laurent Dufour's message of March 17, 2022 9:06 pm: >>> RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32bits >>> mode (MSR[SF] unset). >>> >>> The change in MSR is done in enter_rtas() in a relatively complex way, >>> since the MSR value could be hardcoded. >>> >>> Furthermore, a panic has been reported when hitting the watchdog interrupt >>> while running in RTAS, this leads to the following stack trace: >>> >>> [69244.027433][ C24] watchdog: CPU 24 Hard LOCKUP >>> [69244.027442][ C24] watchdog: CPU 24 TB:997512652051031, last heartbeat >>> TB:997504470175378 (15980ms ago) >>> [69244.027451][ C24] Modules linked in: chacha_generic(E) libchacha(E) >>> xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) >>> libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) >>> algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) >>> fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) >>> cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) >>> algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) >>> rpcsec_gss_krb5(E) auth_rpcgss(E) >>> nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) >>> udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) >>> netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) >>> fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) >>> crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) >>> fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) >>> dm_service_time(E) sd_mod(E) t10_pi(E) >>> [69244.027555][ C24] ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) >>> gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) >>> xor(E) raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) >>> dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) 
scsi_dh_emc(E) scsi_dh_alua(E) >>> scsi_mod(E) >>> [69244.027587][ C24] Supported: No, Unreleased kernel >>> [69244.027600][ C24] CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded >>> Tainted: GE X5.14.21-150400.71.1.bz196362_2-default #1 >>> SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c >>> [69244.027609][ C24] NIP: 1fb41050 LR: 1fb4104c CTR: >>> >>> [69244.027612][ C24] REGS: cfc33d60 TRAP: 0100 Tainted: G >>> E X (5.14.21-150400.71.1.bz196362_2-default) >>> [69244.027615][ C24] MSR: 82981000 CR: 4882 >>> XER: 20040020 >>> [69244.027625][ C24] CFAR: 011c IRQMASK: 1 >>> [69244.027625][ C24] GPR00: 0003 >>> 0001 50dc >>> [69244.027625][ C24] GPR04: 1ffb6100 0020 >>> 0001 1fb09010 >>> [69244.027625][ C24] GPR08: 2000 >>> >>> [69244.027625][ C24] GPR12: 8004072a40a8 cff8b680 >>> 0007 0034 >>> [69244.027625][ C24] GPR16: 1fbf6e94 1fbf6d84 >>> 1fbd1db0 1fb3f008 >>> [69244.027625][ C24] GPR20: 1fb41018 >>> 017f f68f >>> [69244.027625][ C24] GPR24: 1fb18fe8 1fb3e000 >>> 1fb1adc0 1fb1cf40 >>> [69244.027625][ C24] GPR28: 1fb26000 1fb460f0 >>> 1fb17f18 1fb17000 >>> [69244.027663][ C24] NIP [1fb41050] 0x1fb41050 >>> [69244.027696][ C24] LR [1fb4104c] 0x1fb4104c >>> [69244.027699][ C24] Call Trace: >>> [69244.027701][ C24] Instruction dump: >>> [69244.027723][ C24] >>> >>> [69244.027728][ C24] >>> >>> [69244.027762][T87504] Oops: Unrecoverable System Reset, sig: 6 [#1] >>> [69244.028044][T87504] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA >>> pSeries >>> [69244.028089][T87504] Modules linked in: chacha_generic(E) libchacha(E) >>> xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) >>> libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) >>> algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) >>> fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) >>> cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) >>> algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) 
tun(E) >>> rpcsec_gss_krb5(E) auth_rpcgss(E) >>> nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) >>> udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) >>> netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) >>> fscache(E) netfs(E) af_packet(E)
Re: [PATCH] livepatch: Remove klp_arch_set_pc() and asm/livepatch.h
On Mon, 28 Mar 2022, Christophe Leroy wrote: > All three versions of klp_arch_set_pc() do exactly the same: they > call ftrace_instruction_pointer_set(). > > Call ftrace_instruction_pointer_set() directly and remove > klp_arch_set_pc(). > > As klp_arch_set_pc() was the only thing remaining in asm/livepatch.h > on x86 and s390, remove asm/livepatch.h > > livepatch.h remains on powerpc but its content is exclusively used > by powerpc specific code. > > Signed-off-by: Christophe Leroy Acked-by: Miroslav Benes M
Re: [PATCH 1/3] sched: topology: add input parameter for sched_domain_flags_f()
On Tue, Mar 29, 2022 at 02:15:19AM -0700, Qing Wang wrote: > From: Wang Qing > > sched_domain_flags_f() are statically set now, but actually, we can get a lot > of necessary information based on the cpu_map. e.g. we can know whether its > cache is shared. > > Allows custom extension without affecting current. Still NAK
[PATCH 3/3] arm64: add arm64 default topology
From: Wang Qing default_topology does not fit arm64, especially CPU and cache topology. Add arm64_topology, so we can do more based on CONFIG_GENERIC_ARCH_TOPOLOGY. arm64_xxx_flags() prefer to get the cache attribute from DT. Signed-off-by: Wang Qing --- arch/arm64/kernel/smp.c | 56 + 1 file changed, 56 insertions(+) diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index 27df5c1..d245012 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -715,6 +715,60 @@ void __init smp_init_cpus(void) } } +#ifdef CONFIG_SCHED_CLUSTER +static int arm64_cluster_flags(const struct cpumask *cpu_map) +{ + int flag = cpu_cluster_flags(); + int ret = cpu_share_private_cache(cpu_map); + if (ret == 1) + flag |= SD_SHARE_PKG_RESOURCES; + else if (ret == 0) + flag &= ~SD_SHARE_PKG_RESOURCES; + + return flag; +} +#endif + +#ifdef CONFIG_SCHED_MC +static int arm64_core_flags(const struct cpumask *cpu_map) +{ + int flag = cpu_core_flags(); + int ret = cpu_share_private_cache(cpu_map); + if (ret == 1) + flag |= SD_SHARE_PKG_RESOURCES; + else if (ret == 0) + flag &= ~SD_SHARE_PKG_RESOURCES; + + return flag; +} +#endif + +static int arm64_die_flags(const struct cpumask *cpu_map) +{ + int flag = 0; + int ret = cpu_share_private_cache(cpu_map); + if (ret == 1) + flag |= SD_SHARE_PKG_RESOURCES; + else if (ret == 0) + flag &= ~SD_SHARE_PKG_RESOURCES; + + return flag; +} + +static struct sched_domain_topology_level arm64_topology[] = { +#ifdef CONFIG_SCHED_SMT + { cpu_smt_mask, cpu_smt_flags, SD_INIT_NAME(SMT) }, +#endif +#ifdef CONFIG_SCHED_CLUSTER + { cpu_clustergroup_mask, arm64_cluster_flags, SD_INIT_NAME(CLS) }, +#endif +#ifdef CONFIG_SCHED_MC + { cpu_coregroup_mask, arm64_core_flags, SD_INIT_NAME(MC) }, +#endif + { cpu_cpu_mask, arm64_die_flags, SD_INIT_NAME(DIE) }, + { NULL, }, +}; + void __init smp_prepare_cpus(unsigned int max_cpus) { const struct cpu_operations *ops; @@ -723,6 +777,8 @@ void __init smp_prepare_cpus(unsigned int max_cpus) unsigned int this_cpu; 
init_cpu_topology(); + init_cpu_cache_topology(); + set_sched_topology(arm64_topology); this_cpu = smp_processor_id(); store_cpu_topology(this_cpu); -- 2.7.4
[PATCH 2/3] arch_topology: support for describing cache topology from DT
From: Wang Qing When ACPI is not enabled, we can get cache topolopy from DT like: * cpu0: cpu@000 { * next-level-cache = <_1>; * L2_1: l2-cache { * compatible = "cache"; * next-level-cache = <_1>; * }; * L3_1: l3-cache { * compatible = "cache"; * }; * }; * * cpu1: cpu@001 { * next-level-cache = <_1>; * cpu-idle-states = <_l * _mem _pll _bus * >; * }; * cpu2: cpu@002 { * L2_2: l2-cache { * compatible = "cache"; * next-level-cache = <_1>; * }; * }; * * cpu3: cpu@003 { * next-level-cache = <_2>; * }; cache_topology hold the pointer describing "next-level-cache", it can describe the cache topology of every level. Signed-off-by: Wang Qing --- drivers/base/arch_topology.c | 89 ++- include/linux/arch_topology.h | 4 ++ 2 files changed, 92 insertions(+), 1 deletion(-) diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index 1d6636e..41e0301 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -647,6 +647,92 @@ static int __init parse_dt_topology(void) } #endif + +/* + * cpu cache topology table + */ +#define MAX_CACHE_LEVEL 7 +struct device_node *cache_topology[NR_CPUS][MAX_CACHE_LEVEL]; + +void init_cpu_cache_topology(void) +{ + struct device_node *node_cpu, *node_cache; + int cpu; + int level = 0; + + for_each_possible_cpu(cpu) { + node_cpu = of_get_cpu_node(cpu, NULL); + if (!node_cpu) + continue; + + level = 0; + node_cache = node_cpu; + while (level < MAX_CACHE_LEVEL) { + node_cache = of_parse_phandle(node_cache, "next-level-cache", 0); + if (!node_cache) + break; + + cache_topology[cpu][level++] = node_cache; + } + of_node_put(node_cpu); + } +} + +/* + * private means only shared within cpu_mask + * Returns -1 if not described int DT. 
+ */ +int cpu_share_private_cache(const struct cpumask *cpu_mask) +{ + int cache_level, cpu_id; + struct cpumask cache_mask; + int cpu = cpumask_first(cpu_mask); + + for (cache_level = 0; cache_level < MAX_CACHE_LEVEL; cache_level++) { + if (!cache_topology[cpu][cache_level]) + return -1; + + cpumask_clear(_mask); + for (cpu_id = 0; cpu_id < NR_CPUS; cpu_id++) { + if (cache_topology[cpu][cache_level] == cache_topology[cpu_id][cache_level]) + cpumask_set_cpu(cpu_id, _mask); + } + + if (cpumask_equal(cpu_mask, _mask)) + return 1; + } + + return 0; +} + +bool cpu_share_llc(int cpu1, int cpu2) +{ + int cache_level; + + for (cache_level = MAX_CACHE_LEVEL - 1; cache_level > 0; cache_level--) { + if (!cache_topology[cpu1][cache_level]) + continue; + + if (cache_topology[cpu1][cache_level] == cache_topology[cpu2][cache_level]) + return true; + + return false; + } + + return false; +} + +bool cpu_share_l2c(int cpu1, int cpu2) +{ + if (!cache_topology[cpu1][0]) + return false; + + if (cache_topology[cpu1][0] == cache_topology[cpu2][0]) + return true; + + return false; +} + /* * cpu topology table */ @@ -684,7 +770,8 @@ void update_siblings_masks(unsigned int cpuid) for_each_online_cpu(cpu) { cpu_topo = _topology[cpu]; - if (cpuid_topo->llc_id == cpu_topo->llc_id) { + if ((cpuid_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) + || (cpuid_topo->llc_id == -1 && cpu_share_llc(cpu, cpuid))) { cpumask_set_cpu(cpu, _topo->llc_sibling); cpumask_set_cpu(cpuid, _topo->llc_sibling); } diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h index 58cbe18..a402ff6 --- a/include/linux/arch_topology.h +++ b/include/linux/arch_topology.h @@ -86,6 +86,10 @@ extern struct cpu_topology cpu_topology[NR_CPUS]; #define topology_cluster_cpumask(cpu) (_topology[cpu].cluster_sibling) #define topology_llc_cpumask(cpu) (_topology[cpu].llc_sibling) void
[PATCH 1/3] sched: topology: add input parameter for sched_domain_flags_f()
From: Wang Qing sched_domain_flags_f() are statically set now, but actually, we can get a lot of necessary information based on the cpu_map. e.g. we can know whether its cache is shared. Allows custom extension without affecting current. Signed-off-by: Wang Qing --- arch/powerpc/kernel/smp.c | 4 ++-- arch/x86/kernel/smpboot.c | 8 include/linux/sched/topology.h | 10 +- kernel/sched/topology.c| 2 +- 4 files changed, 12 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index de0f6f0..e503d23 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -1000,7 +1000,7 @@ static bool shared_caches; #ifdef CONFIG_SCHED_SMT /* cpumask of CPUs with asymmetric SMT dependency */ -static int powerpc_smt_flags(void) +static int powerpc_smt_flags(const struct cpumask *cpu_map) { int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES; @@ -1018,7 +1018,7 @@ static int powerpc_smt_flags(void) * since the migrated task remains cache hot. We want to take advantage of this * at the scheduler level so an extra topology level is required. */ -static int powerpc_shared_cache_flags(void) +static int powerpc_shared_cache_flags(const struct cpumask *cpu_map) { return SD_SHARE_PKG_RESOURCES; } diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 2ef1477..c005a8e --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -535,25 +535,25 @@ static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o) #if defined(CONFIG_SCHED_SMT) || defined(CONFIG_SCHED_CLUSTER) || defined(CONFIG_SCHED_MC) -static inline int x86_sched_itmt_flags(void) +static inline int x86_sched_itmt_flags(const struct cpumask *cpu_map) { return sysctl_sched_itmt_enabled ? 
SD_ASYM_PACKING : 0; } #ifdef CONFIG_SCHED_MC -static int x86_core_flags(void) +static int x86_core_flags(const struct cpumask *cpu_map) { return cpu_core_flags() | x86_sched_itmt_flags(); } #endif #ifdef CONFIG_SCHED_SMT -static int x86_smt_flags(void) +static int x86_smt_flags(const struct cpumask *cpu_map) { return cpu_smt_flags() | x86_sched_itmt_flags(); } #endif #ifdef CONFIG_SCHED_CLUSTER -static int x86_cluster_flags(void) +static int x86_cluster_flags(const struct cpumask *cpu_map) { return cpu_cluster_flags() | x86_sched_itmt_flags(); } diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h index 56cffe4..6aa985a --- a/include/linux/sched/topology.h +++ b/include/linux/sched/topology.h @@ -36,28 +36,28 @@ extern const struct sd_flag_debug sd_flag_debug[]; #endif #ifdef CONFIG_SCHED_SMT -static inline int cpu_smt_flags(void) +static inline int cpu_smt_flags(const struct cpumask *cpu_map) { return SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES; } #endif #ifdef CONFIG_SCHED_CLUSTER -static inline int cpu_cluster_flags(void) +static inline int cpu_cluster_flags(const struct cpumask *cpu_map) { return SD_SHARE_PKG_RESOURCES; } #endif #ifdef CONFIG_SCHED_MC -static inline int cpu_core_flags(void) +static inline int cpu_core_flags(const struct cpumask *cpu_map) { return SD_SHARE_PKG_RESOURCES; } #endif #ifdef CONFIG_NUMA -static inline int cpu_numa_flags(void) +static inline int cpu_numa_flags(const struct cpumask *cpu_map) { return SD_NUMA; } @@ -180,7 +180,7 @@ void free_sched_domains(cpumask_var_t doms[], unsigned int ndoms); bool cpus_share_cache(int this_cpu, int that_cpu); typedef const struct cpumask *(*sched_domain_mask_f)(int cpu); -typedef int (*sched_domain_flags_f)(void); +typedef int (*sched_domain_flags_f)(const struct cpumask *cpu_map); #define SDTL_OVERLAP 0x01 diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c index 05b6c2a..34dfec4 --- a/kernel/sched/topology.c +++ b/kernel/sched/topology.c @@ -1556,7 +1556,7 @@ 
sd_init(struct sched_domain_topology_level *tl, sd_weight = cpumask_weight(tl->mask(cpu)); if (tl->sd_flags) - sd_flags = (*tl->sd_flags)(); + sd_flags = (*tl->sd_flags)(tl->mask(cpu)); if (WARN_ONCE(sd_flags & ~TOPOLOGY_SD_FLAGS, "wrong sd_flags in topology description\n")) sd_flags &= TOPOLOGY_SD_FLAGS; -- 2.7.4
[PATCH 0/3] support for describing cache topology from DT
From: Wang Qing We don't know anything about the cache topology info without ACPI, but in fact we can get it from DT like: * cpu0: cpu@000 { * next-level-cache = <_1>; * L2_1: l2-cache { * compatible = "cache"; * next-level-cache = <_1>; * }; * L3_1: l3-cache { * compatible = "cache"; * }; * }; * * cpu1: cpu@001 { * next-level-cache = <_1>; * cpu-idle-states = <_l * _mem _pll _bus * >; * }; * cpu2: cpu@002 { * L2_2: l2-cache { * compatible = "cache"; * next-level-cache = <_1>; * }; * }; * * cpu3: cpu@003 { * next-level-cache = <_2>; * }; Building the cache topology has many benefits, here is a part of useage. Wang Qing (3): sched: topology: add input parameter for sched_domain_flags_f() arch_topology: support for describing cache topology from DT arm64: add arm64 default topology arch/arm64/kernel/smp.c| 56 ++ arch/powerpc/kernel/smp.c | 4 +- arch/x86/kernel/smpboot.c | 8 ++-- drivers/base/arch_topology.c | 89 +- include/linux/arch_topology.h | 4 ++ include/linux/sched/topology.h | 10 ++--- kernel/sched/topology.c| 2 +- 7 files changed, 160 insertions(+), 13 deletions(-) -- 2.7.4
[PATCH] powerpc: Export mmu_feature_keys[] as non-GPL
When the mmu_feature_keys[] was introduced in the commit c12e6f24d413 ("powerpc: Add option to use jump label for mmu_has_feature()"), it was unlikely that it would be used either directly or indirectly by out-of-tree modules, so we exported it as GPL-only. But with the evolution of the code, especially the PPC_KUAP support, it may be indirectly referenced by some primitive macros or inline functions such as get_user() or __copy_from_user_inatomic(); this makes it impossible to build many non-GPL modules (such as ZFS) on the ppc architecture. Fix this by exposing mmu_feature_keys[] to non-GPL modules too. Reported-by: Nathaniel Filardo Signed-off-by: Kevin Hao --- arch/powerpc/kernel/cputable.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c index ae0fdef0ac11..3a8cd40b6368 100644 --- a/arch/powerpc/kernel/cputable.c +++ b/arch/powerpc/kernel/cputable.c @@ -2119,7 +2119,7 @@ void __init cpu_feature_keys_init(void) struct static_key_true mmu_feature_keys[NUM_MMU_FTR_KEYS] = { [0 ... NUM_MMU_FTR_KEYS - 1] = STATIC_KEY_TRUE_INIT }; -EXPORT_SYMBOL_GPL(mmu_feature_keys); +EXPORT_SYMBOL(mmu_feature_keys); void __init mmu_feature_keys_init(void) { -- 2.34.1
Re: [PATCH] powerpc/rtas: Keep MSR RI set when calling RTAS
On 29/03/2022, 10:31:33, Nicholas Piggin wrote: > Excerpts from Laurent Dufour's message of March 17, 2022 9:06 pm: >> RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32bits >> mode (MSR[SF] unset). >> >> The change in MSR is done in enter_rtas() in a relatively complex way, >> since the MSR value could be hardcoded. >> >> Furthermore, a panic has been reported when hitting the watchdog interrupt >> while running in RTAS, this leads to the following stack trace: >> >> [69244.027433][ C24] watchdog: CPU 24 Hard LOCKUP >> [69244.027442][ C24] watchdog: CPU 24 TB:997512652051031, last heartbeat >> TB:997504470175378 (15980ms ago) >> [69244.027451][ C24] Modules linked in: chacha_generic(E) libchacha(E) >> xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) >> libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) >> algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) >> fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) >> cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) >> algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) >> rpcsec_gss_krb5(E) auth_rpcgss(E) >> nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) >> udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) >> netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) >> fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) >> crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) >> fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) >> sd_mod(E) t10_pi(E) >> [69244.027555][ C24] ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) >> gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) >> raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) >> dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) >> [69244.027587][ 
[PATCH v3 2/2] PCI/DPC: Disable DPC service when link is in L2/L3 ready, L2 and L3 state
On some Intel AlderLake platforms, Thunderbolt entering D3cold can cause
some errors reported by AER:

[   30.100211] pcieport :00:1d.0: AER: Uncorrected (Non-Fatal) error received: :00:1d.0
[   30.100251] pcieport :00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[   30.100256] pcieport :00:1d.0: device [8086:7ab0] error status/mask=0010/4000
[   30.100262] pcieport :00:1d.0: [20] UnsupReq (First)
[   30.100267] pcieport :00:1d.0: AER: TLP Header: 3400 0852
[   30.100372] thunderbolt :0a:00.0: AER: can't recover (no error_detected callback)
[   30.100401] xhci_hcd :3e:00.0: AER: can't recover (no error_detected callback)
[   30.100427] pcieport :00:1d.0: AER: device recovery failed

Since AER was disabled in the previous patch for a Link in L2/L3 Ready,
L2 and L3, also disable DPC here, as DPC depends on AER to work.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215453
Reviewed-by: Mika Westerberg
Signed-off-by: Kai-Heng Feng
---
v3:
 - Wording change to make the patch clearer.
v2:
 - Wording change.
 - Empty line dropped.
 drivers/pci/pcie/dpc.c | 60 +++---
 1 file changed, 44 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
index 3e9afee02e8d1..414258967f08e 100644
--- a/drivers/pci/pcie/dpc.c
+++ b/drivers/pci/pcie/dpc.c
@@ -343,13 +343,33 @@ void pci_dpc_init(struct pci_dev *pdev)
 	}
 }
 
+static void dpc_enable(struct pcie_device *dev)
+{
+	struct pci_dev *pdev = dev->port;
+	u16 ctl;
+
+	pci_read_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CTL, &ctl);
+	ctl = (ctl & 0xfff4) | PCI_EXP_DPC_CTL_EN_FATAL | PCI_EXP_DPC_CTL_INT_EN;
+	pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CTL, ctl);
+}
+
+static void dpc_disable(struct pcie_device *dev)
+{
+	struct pci_dev *pdev = dev->port;
+	u16 ctl;
+
+	pci_read_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CTL, &ctl);
+	ctl &= ~(PCI_EXP_DPC_CTL_EN_FATAL | PCI_EXP_DPC_CTL_INT_EN);
+	pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CTL, ctl);
+}
+
 #define FLAG(x, y) (((x) & (y)) ? '+' : '-')
 static int dpc_probe(struct pcie_device *dev)
 {
 	struct pci_dev *pdev = dev->port;
 	struct device *device = &dev->device;
 	int status;
-	u16 ctl, cap;
+	u16 cap;
 
 	if (!pcie_aer_is_native(pdev) && !pcie_ports_dpc_native)
 		return -ENOTSUPP;
@@ -364,10 +384,7 @@ static int dpc_probe(struct pcie_device *dev)
 	}
 
 	pci_read_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CAP, &cap);
-	pci_read_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CTL, &ctl);
-
-	ctl = (ctl & 0xfff4) | PCI_EXP_DPC_CTL_EN_FATAL | PCI_EXP_DPC_CTL_INT_EN;
-	pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CTL, ctl);
+	dpc_enable(dev);
 
 	pci_info(pdev, "enabled with IRQ %d\n", dev->irq);
 	pci_info(pdev, "error containment capabilities: Int Msg #%d, RPExt%c PoisonedTLP%c SwTrigger%c RP PIO Log %d, DL_ActiveErr%c\n",
@@ -380,22 +397,33 @@ static int dpc_probe(struct pcie_device *dev)
 	return status;
 }
 
-static void dpc_remove(struct pcie_device *dev)
+static int dpc_suspend(struct pcie_device *dev)
 {
-	struct pci_dev *pdev = dev->port;
-	u16 ctl;
+	dpc_disable(dev);
+	return 0;
+}
 
-	pci_read_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CTL, &ctl);
-	ctl &= ~(PCI_EXP_DPC_CTL_EN_FATAL | PCI_EXP_DPC_CTL_INT_EN);
-	pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CTL, ctl);
+static int dpc_resume(struct pcie_device *dev)
+{
+	dpc_enable(dev);
+	return 0;
+}
+
+static void dpc_remove(struct pcie_device *dev)
+{
+	dpc_disable(dev);
 }
 
 static struct pcie_port_service_driver dpcdriver = {
-	.name		= "dpc",
-	.port_type	= PCIE_ANY_PORT,
-	.service	= PCIE_PORT_SERVICE_DPC,
-	.probe		= dpc_probe,
-	.remove		= dpc_remove,
+	.name			= "dpc",
+	.port_type		= PCIE_ANY_PORT,
+	.service		= PCIE_PORT_SERVICE_DPC,
+	.probe			= dpc_probe,
+	.suspend		= dpc_suspend,
+	.resume			= dpc_resume,
+	.runtime_suspend	= dpc_suspend,
+	.runtime_resume		= dpc_resume,
+	.remove			= dpc_remove,
 };
 
 int __init pcie_dpc_init(void)
--
2.34.1
[PATCH v3 1/2] PCI/AER: Disable AER service when link is in L2/L3 ready, L2 and L3 state
On some Intel AlderLake platforms, Thunderbolt entering D3cold can cause
some errors reported by AER:

[   30.100211] pcieport :00:1d.0: AER: Uncorrected (Non-Fatal) error received: :00:1d.0
[   30.100251] pcieport :00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[   30.100256] pcieport :00:1d.0: device [8086:7ab0] error status/mask=0010/4000
[   30.100262] pcieport :00:1d.0: [20] UnsupReq (First)
[   30.100267] pcieport :00:1d.0: AER: TLP Header: 3400 0852
[   30.100372] thunderbolt :0a:00.0: AER: can't recover (no error_detected callback)
[   30.100401] xhci_hcd :3e:00.0: AER: can't recover (no error_detected callback)
[   30.100427] pcieport :00:1d.0: AER: device recovery failed

So disable the AER service to avoid the noise from turning power rails
on/off when the device is in low power states (D3hot and D3cold), as the
PCIe spec "5.2 Link State Power Management" states that TLP and DLLP
transmission is disabled for a Link in L2/L3 Ready (D3hot), L2 (D3cold
with aux power) and L3 (D3cold).

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215453
Reviewed-by: Mika Westerberg
Signed-off-by: Kai-Heng Feng
---
v3:
 - Remove reference to ACS.
 - Wording change.
v2:
 - Wording change.
 drivers/pci/pcie/aer.c | 31 +--
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 9fa1f97e5b270..e4e9d4a3098d7 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1367,6 +1367,22 @@ static int aer_probe(struct pcie_device *dev)
 	return 0;
 }
 
+static int aer_suspend(struct pcie_device *dev)
+{
+	struct aer_rpc *rpc = get_service_data(dev);
+
+	aer_disable_rootport(rpc);
+	return 0;
+}
+
+static int aer_resume(struct pcie_device *dev)
+{
+	struct aer_rpc *rpc = get_service_data(dev);
+
+	aer_enable_rootport(rpc);
+	return 0;
+}
+
 /**
  * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
  * @dev: pointer to Root Port, RCEC, or RCiEP
@@ -1433,12 +1449,15 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
 }
 
 static struct pcie_port_service_driver aerdriver = {
-	.name		= "aer",
-	.port_type	= PCIE_ANY_PORT,
-	.service	= PCIE_PORT_SERVICE_AER,
-
-	.probe		= aer_probe,
-	.remove		= aer_remove,
+	.name			= "aer",
+	.port_type		= PCIE_ANY_PORT,
+	.service		= PCIE_PORT_SERVICE_AER,
+	.probe			= aer_probe,
+	.suspend		= aer_suspend,
+	.resume			= aer_resume,
+	.runtime_suspend	= aer_suspend,
+	.runtime_resume		= aer_resume,
+	.remove			= aer_remove,
 };
 
 /**
--
2.34.1
Re: [PATCH] powerpc/rtas: Keep MSR RI set when calling RTAS
Excerpts from Laurent Dufour's message of March 17, 2022 9:06 pm:
> RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32bits mode
> (MSR[SF] unset).
>
> The change in MSR is done in enter_rtas() in a relatively complex way,
> since the MSR value could be hardcoded.
>
> Furthermore, a panic has been reported when hitting the watchdog
> interrupt while running in RTAS, this leads to the following stack
> trace:
>
> [69244.027433][ C24] watchdog: CPU 24 Hard LOCKUP
> [69244.027442][ C24] watchdog: CPU 24 TB:997512652051031, last heartbeat TB:997504470175378 (15980ms ago)
> [69244.027451][ C24] Modules linked in: chacha_generic(E) libchacha(E) xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) auth_rpcgss(E) nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) sd_mod(E) t10_pi(E)
> [69244.027555][ C24] ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E)
> [69244.027587][ C24] Supported: No, Unreleased kernel
> [69244.027600][ C24] CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G E X 5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
> [69244.027609][ C24] NIP: 1fb41050 LR: 1fb4104c CTR:
> [69244.027612][ C24] REGS: cfc33d60 TRAP: 0100 Tainted: G E X (5.14.21-150400.71.1.bz196362_2-default)
> [69244.027615][ C24] MSR: 82981000 CR: 4882 XER: 20040020
> [69244.027625][ C24] CFAR: 011c IRQMASK: 1
> [69244.027625][ C24] GPR00: 0003 0001 50dc
> [69244.027625][ C24] GPR04: 1ffb6100 0020 0001 1fb09010
> [69244.027625][ C24] GPR08: 2000
> [69244.027625][ C24] GPR12: 8004072a40a8 cff8b680 0007 0034
> [69244.027625][ C24] GPR16: 1fbf6e94 1fbf6d84 1fbd1db0 1fb3f008
> [69244.027625][ C24] GPR20: 1fb41018 017f f68f
> [69244.027625][ C24] GPR24: 1fb18fe8 1fb3e000 1fb1adc0 1fb1cf40
> [69244.027625][ C24] GPR28: 1fb26000 1fb460f0 1fb17f18 1fb17000
> [69244.027663][ C24] NIP [1fb41050] 0x1fb41050
> [69244.027696][ C24] LR [1fb4104c] 0x1fb4104c
> [69244.027699][ C24] Call Trace:
> [69244.027701][ C24] Instruction dump:
> [69244.027723][ C24]
> [69244.027728][ C24]
> [69244.027762][T87504] Oops: Unrecoverable System Reset, sig: 6 [#1]
> [69244.028044][T87504] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> [69244.028089][T87504] Modules linked in: chacha_generic(E) libchacha(E) xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) auth_rpcgss(E) nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) sd_mod(E) t10_pi(E)
> [69244.028171][T87504]
Re: [PATCH] powerpc/64: Fix build failure with allyesconfig in book3s_64_entry.S
Excerpts from Christophe Leroy's message of March 27, 2022 5:32 pm:
> Using conditional branches between two files is hazardous,
> they may get linked too far from each other.
>
> arch/powerpc/kvm/book3s_64_entry.o:(.text+0x3ec): relocation truncated
> to fit: R_PPC64_REL14 (stub) against symbol `system_reset_common'
> defined in .text section in arch/powerpc/kernel/head_64.o
>
> Reorganise the code to use non conditional branches.

Thanks for the fix, I agree this is better.

Reviewed-by: Nicholas Piggin

>
> Cc: Nicholas Piggin
> Fixes: 89d35b239101 ("KVM: PPC: Book3S HV P9: Implement the rest of the P9 path in C")
> Signed-off-by: Christophe Leroy
> ---
>  arch/powerpc/kvm/book3s_64_entry.S | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_64_entry.S b/arch/powerpc/kvm/book3s_64_entry.S
> index 05e003eb5d90..99fa36df36fa 100644
> --- a/arch/powerpc/kvm/book3s_64_entry.S
> +++ b/arch/powerpc/kvm/book3s_64_entry.S
> @@ -414,10 +414,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_DAWR1)
>  	 */
>  	ld	r10,HSTATE_SCRATCH0(r13)
>  	cmpwi	r10,BOOK3S_INTERRUPT_MACHINE_CHECK
> -	beq	machine_check_common
> +	beq	1f
>
>  	cmpwi	r10,BOOK3S_INTERRUPT_SYSTEM_RESET
> -	beq	system_reset_common
> +	bne	.
>
> -	b	.
> +	b	system_reset_common
> +1:	b	machine_check_common
>  #endif
> --
> 2.35.1
Re: [PATCH] KVM: PPC: Book3S HV: Fix vcore_blocked tracepoint
Excerpts from Fabiano Rosas's message of March 29, 2022 7:58 am:
> We removed most of the vcore logic from the P9 path but there's still
> a tracepoint that tried to dereference vc->runner.

Thanks for the fix.

Reviewed-by: Nicholas Piggin

>
> Fixes: ecb6a7207f92 ("KVM: PPC: Book3S HV P9: Remove most of the vcore logic")
> Signed-off-by: Fabiano Rosas
> ---
>  arch/powerpc/kvm/book3s_hv.c | 8 ++++----
>  arch/powerpc/kvm/trace_hv.h  | 8 ++++----
>  2 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index c886557638a1..5f5b2d0dee8c 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -4218,13 +4218,13 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>  	start_wait = ktime_get();
>
>  	vc->vcore_state = VCORE_SLEEPING;
> -	trace_kvmppc_vcore_blocked(vc, 0);
> +	trace_kvmppc_vcore_blocked(vc->runner, 0);
>  	spin_unlock(&vc->lock);
>  	schedule();
>  	finish_rcuwait(&vc->wait);
>  	spin_lock(&vc->lock);
>  	vc->vcore_state = VCORE_INACTIVE;
> -	trace_kvmppc_vcore_blocked(vc, 1);
> +	trace_kvmppc_vcore_blocked(vc->runner, 1);
>  	++vc->runner->stat.halt_successful_wait;
>
>  	cur = ktime_get();
> @@ -4596,9 +4596,9 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>  		if (kvmppc_vcpu_check_block(vcpu))
>  			break;
>
> -		trace_kvmppc_vcore_blocked(vc, 0);
> +		trace_kvmppc_vcore_blocked(vcpu, 0);
>  		schedule();
> -		trace_kvmppc_vcore_blocked(vc, 1);
> +		trace_kvmppc_vcore_blocked(vcpu, 1);
>  	}
>  	finish_rcuwait(wait);
>  }
> diff --git a/arch/powerpc/kvm/trace_hv.h b/arch/powerpc/kvm/trace_hv.h
> index 38cd0ed0a617..32e2cb5811cc 100644
> --- a/arch/powerpc/kvm/trace_hv.h
> +++ b/arch/powerpc/kvm/trace_hv.h
> @@ -409,9 +409,9 @@ TRACE_EVENT(kvmppc_run_core,
>  );
>
>  TRACE_EVENT(kvmppc_vcore_blocked,
> -	TP_PROTO(struct kvmppc_vcore *vc, int where),
> +	TP_PROTO(struct kvm_vcpu *vcpu, int where),
>
> -	TP_ARGS(vc, where),
> +	TP_ARGS(vcpu, where),
>
>  	TP_STRUCT__entry(
>  		__field(int,	n_runnable)
> @@ -421,8 +421,8 @@ TRACE_EVENT(kvmppc_vcore_blocked,
>  	),
>
>  	TP_fast_assign(
> -		__entry->runner_vcpu = vc->runner->vcpu_id;
> -		__entry->n_runnable  = vc->n_runnable;
> +		__entry->runner_vcpu = vcpu->vcpu_id;
> +		__entry->n_runnable  = vcpu->arch.vcore->n_runnable;
>  		__entry->where = where;
>  		__entry->tgid  = current->tgid;
>  	),
> --
> 2.35.1