Re: [PATCH v5 06/11] powerpc/pseries/iommu: Add ddw_property_create() and refactor enable_ddw()
Hello Fred, thanks for this feedback! On Tue, 2021-07-20 at 19:49 +0200, Frederic Barrat wrote: > > > kfree(window); > > > > -out_clear_window: > > - remove_ddw(pdn, true); > > +out_del_prop: > > + of_remove_property(pdn, win64); > > > > out_free_prop: > > kfree(win64->name); > > kfree(win64->value); > > kfree(win64); > > > > +out_remove_win: > > + remove_ddw(pdn, true); > > > I believe there's a small problem here. We jump directly to > out_remove_win if allocating the property failed. Yet, the first > thing > remove_ddw() does is look for the property. So it will never find it > and > the window is never removed by the hypervisor. > > Fred That makes sense, thanks for catching this one! What I intended here was just removing the DDW, so I think it should be ok replacing remove_ddw() by a new helper that only does the rtas-call. I will send a v6 with this change soon. > > > > + > > out_failed: > > if (default_win_removed) > > reset_dma_window(dev, pdn); > >
Re: [PATCH v2 2/2] powerpc/perf: Return regs->nip as instruction pointer value when SIAR is 0
On 8/16/21 12:26 PM, Christophe Leroy wrote: Le 16/08/2021 à 08:44, kajoljain a écrit : On 8/14/21 6:14 PM, Michael Ellerman wrote: Christophe Leroy writes: Le 13/08/2021 à 10:24, Kajol Jain a écrit : Incase of random sampling, there can be scenarios where SIAR is not latching sample address and results in 0 value. Since current code directly returning the siar value, we could see multiple instruction pointer values as 0 in perf report. Can you please give more detail on that? What scenarios? On what CPUs? Hi Michael, Sure I will update these details in my next patch-set. Patch resolves this issue by adding a ternary condition to return regs->nip incase SIAR is 0. Your description seems rather similar to https://github.com/linuxppc/linux/commit/2ca13a4cc56c920a6c9fc8ee45d02bccacd7f46c Does it mean that the problem occurs on more than the power10 DD1 ? In that case, can the solution be common instead of doing something for power10 DD1 and something for others ? Agreed. This change would seem to make that P10 DD1 logic superfluous. Also we already have a fallback to regs->nip in the else case of the if, so we should just use that rather than adding a ternary condition. eg. if (use_siar && siar_valid(regs) && siar) return siar + perf_ip_adjust(regs); else if (use_siar) return 0; // no valid instruction pointer else return regs->nip; I'm also not sure why we have that return 0 case, I can't think of why we'd ever want to do that rather than using nip. So maybe we should do another patch to drop that case. Yeah make sense. I will remove return 0 case in my next version. This was added by commit https://github.com/linuxppc/linux/commit/e6878835ac4794f25385522d29c634b7bbb7cca9 Are we sure it was an error to add it and it can be removed ? pc having 0 is wrong (kernel does not execute at 0x0 or userspace). yeah we should drop it. Maddy Christophe
Re: [PATCH v2 3/3] powerpc/configs: Regenerate mpc885_ads_defconfig
Le 17/08/2021 à 06:54, Joel Stanley a écrit : Regenerate atop v5.14-rc6 by doing a make savedefconfig. The changes a re-ordering except for the following (which are still set indirectly): - CONFIG_DEBUG_KERNEL=y selected by EXPERT - CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008 which is the default setting Signed-off-by: Joel Stanley Acked-by: Christophe Leroy --- arch/powerpc/configs/mpc885_ads_defconfig | 46 +++ 1 file changed, 22 insertions(+), 24 deletions(-) diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig index cd08f9ed2c8d..c74dc76b1d0d 100644 --- a/arch/powerpc/configs/mpc885_ads_defconfig +++ b/arch/powerpc/configs/mpc885_ads_defconfig @@ -1,19 +1,30 @@ -CONFIG_PPC_8xx=y # CONFIG_SWAP is not set CONFIG_SYSVIPC=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y +CONFIG_BPF_JIT=y +CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y CONFIG_LOG_BUF_SHIFT=14 CONFIG_EXPERT=y # CONFIG_ELF_CORE is not set # CONFIG_BASE_FULL is not set # CONFIG_FUTEX is not set +CONFIG_PERF_EVENTS=y # CONFIG_VM_EVENT_COUNTERS is not set -# CONFIG_BLK_DEV_BSG is not set -CONFIG_PARTITION_ADVANCED=y +CONFIG_PPC_8xx=y +CONFIG_8xx_GPIO=y +CONFIG_SMC_UCODE_PATCH=y +CONFIG_PIN_TLB=y CONFIG_GEN_RTC=y CONFIG_HZ_100=y +CONFIG_MATH_EMULATION=y +CONFIG_PPC_16K_PAGES=y +CONFIG_ADVANCED_OPTIONS=y # CONFIG_SECCOMP is not set +CONFIG_STRICT_KERNEL_RWX=y +CONFIG_MODULES=y +# CONFIG_BLK_DEV_BSG is not set +CONFIG_PARTITION_ADVANCED=y CONFIG_NET=y CONFIG_PACKET=y CONFIG_UNIX=y @@ -46,38 +57,25 @@ CONFIG_DAVICOM_PHY=y # CONFIG_LEGACY_PTYS is not set CONFIG_SERIAL_CPM=y CONFIG_SERIAL_CPM_CONSOLE=y +CONFIG_SPI=y +CONFIG_SPI_FSL_SPI=y # CONFIG_HWMON is not set +CONFIG_WATCHDOG=y +CONFIG_8xxx_WDT=y # CONFIG_USB_SUPPORT is not set # CONFIG_DNOTIFY is not set CONFIG_TMPFS=y CONFIG_CRAMFS=y CONFIG_NFS_FS=y CONFIG_ROOT_NFS=y +CONFIG_CRYPTO=y +CONFIG_CRYPTO_DEV_TALITOS=y CONFIG_CRC32_SLICEBY4=y CONFIG_DEBUG_INFO=y CONFIG_MAGIC_SYSRQ=y -CONFIG_DETECT_HUNG_TASK=y -CONFIG_PPC_16K_PAGES=y -CONFIG_DEBUG_KERNEL=y CONFIG_DEBUG_FS=y -CONFIG_PPC_PTDUMP=y -CONFIG_MODULES=y -CONFIG_SPI=y -CONFIG_SPI_FSL_SPI=y -CONFIG_CRYPTO=y -CONFIG_CRYPTO_DEV_TALITOS=y -CONFIG_8xx_GPIO=y -CONFIG_WATCHDOG=y -CONFIG_8xxx_WDT=y -CONFIG_SMC_UCODE_PATCH=y -CONFIG_ADVANCED_OPTIONS=y -CONFIG_PIN_TLB=y -CONFIG_PERF_EVENTS=y -CONFIG_MATH_EMULATION=y -CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y -CONFIG_STRICT_KERNEL_RWX=y -CONFIG_BPF_JIT=y CONFIG_DEBUG_VM_PGTABLE=y +CONFIG_DETECT_HUNG_TASK=y CONFIG_BDI_SWITCH=y CONFIG_PPC_EARLY_DEBUG=y -CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008 +CONFIG_PPC_PTDUMP=y
Re: [PATCH v2 2/3] powerpc/config: Renable MTD_PHYSMAP_OF
Le 17/08/2021 à 06:54, Joel Stanley a écrit : CONFIG_MTD_PHYSMAP_OF is not longer enabled as it depends on MTD_PHYSMAP which is not enabled. This is a regression from commit 642b1e8dbed7 ("mtd: maps: Merge physmap_of.c into physmap-core.c"), which added the extra dependency. Add CONFIG_MTD_PHYSMAP=y so this stays in the config, as Christophe said it is useful for build coverage. Fixes: 642b1e8dbed7 ("mtd: maps: Merge physmap_of.c into physmap-core.c") Signed-off-by: Joel Stanley Acked-by: Christophe Leroy --- arch/powerpc/configs/mpc885_ads_defconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig index 5cd17adf903f..cd08f9ed2c8d 100644 --- a/arch/powerpc/configs/mpc885_ads_defconfig +++ b/arch/powerpc/configs/mpc885_ads_defconfig @@ -33,6 +33,7 @@ CONFIG_MTD_CFI_GEOMETRY=y # CONFIG_MTD_CFI_I2 is not set CONFIG_MTD_CFI_I4=y CONFIG_MTD_CFI_AMDSTD=y +CONFIG_MTD_PHYSMAP=y CONFIG_MTD_PHYSMAP_OF=y # CONFIG_BLK_DEV is not set CONFIG_NETDEVICES=y
[PATCH v2 3/3] powerpc/configs: Regenerate mpc885_ads_defconfig
Regenerate atop v5.14-rc6 by doing a make savedefconfig. The changes a re-ordering except for the following (which are still set indirectly): - CONFIG_DEBUG_KERNEL=y selected by EXPERT - CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008 which is the default setting Signed-off-by: Joel Stanley --- arch/powerpc/configs/mpc885_ads_defconfig | 46 +++ 1 file changed, 22 insertions(+), 24 deletions(-) diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig index cd08f9ed2c8d..c74dc76b1d0d 100644 --- a/arch/powerpc/configs/mpc885_ads_defconfig +++ b/arch/powerpc/configs/mpc885_ads_defconfig @@ -1,19 +1,30 @@ -CONFIG_PPC_8xx=y # CONFIG_SWAP is not set CONFIG_SYSVIPC=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y +CONFIG_BPF_JIT=y +CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y CONFIG_LOG_BUF_SHIFT=14 CONFIG_EXPERT=y # CONFIG_ELF_CORE is not set # CONFIG_BASE_FULL is not set # CONFIG_FUTEX is not set +CONFIG_PERF_EVENTS=y # CONFIG_VM_EVENT_COUNTERS is not set -# CONFIG_BLK_DEV_BSG is not set -CONFIG_PARTITION_ADVANCED=y +CONFIG_PPC_8xx=y +CONFIG_8xx_GPIO=y +CONFIG_SMC_UCODE_PATCH=y +CONFIG_PIN_TLB=y CONFIG_GEN_RTC=y CONFIG_HZ_100=y +CONFIG_MATH_EMULATION=y +CONFIG_PPC_16K_PAGES=y +CONFIG_ADVANCED_OPTIONS=y # CONFIG_SECCOMP is not set +CONFIG_STRICT_KERNEL_RWX=y +CONFIG_MODULES=y +# CONFIG_BLK_DEV_BSG is not set +CONFIG_PARTITION_ADVANCED=y CONFIG_NET=y CONFIG_PACKET=y CONFIG_UNIX=y @@ -46,38 +57,25 @@ CONFIG_DAVICOM_PHY=y # CONFIG_LEGACY_PTYS is not set CONFIG_SERIAL_CPM=y CONFIG_SERIAL_CPM_CONSOLE=y +CONFIG_SPI=y +CONFIG_SPI_FSL_SPI=y # CONFIG_HWMON is not set +CONFIG_WATCHDOG=y +CONFIG_8xxx_WDT=y # CONFIG_USB_SUPPORT is not set # CONFIG_DNOTIFY is not set CONFIG_TMPFS=y CONFIG_CRAMFS=y CONFIG_NFS_FS=y CONFIG_ROOT_NFS=y +CONFIG_CRYPTO=y +CONFIG_CRYPTO_DEV_TALITOS=y CONFIG_CRC32_SLICEBY4=y CONFIG_DEBUG_INFO=y CONFIG_MAGIC_SYSRQ=y -CONFIG_DETECT_HUNG_TASK=y -CONFIG_PPC_16K_PAGES=y -CONFIG_DEBUG_KERNEL=y CONFIG_DEBUG_FS=y -CONFIG_PPC_PTDUMP=y -CONFIG_MODULES=y -CONFIG_SPI=y -CONFIG_SPI_FSL_SPI=y -CONFIG_CRYPTO=y -CONFIG_CRYPTO_DEV_TALITOS=y -CONFIG_8xx_GPIO=y -CONFIG_WATCHDOG=y -CONFIG_8xxx_WDT=y -CONFIG_SMC_UCODE_PATCH=y -CONFIG_ADVANCED_OPTIONS=y -CONFIG_PIN_TLB=y -CONFIG_PERF_EVENTS=y -CONFIG_MATH_EMULATION=y -CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y -CONFIG_STRICT_KERNEL_RWX=y -CONFIG_BPF_JIT=y CONFIG_DEBUG_VM_PGTABLE=y +CONFIG_DETECT_HUNG_TASK=y CONFIG_BDI_SWITCH=y CONFIG_PPC_EARLY_DEBUG=y -CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008 +CONFIG_PPC_PTDUMP=y -- 2.32.0
[PATCH v2 2/3] powerpc/config: Renable MTD_PHYSMAP_OF
CONFIG_MTD_PHYSMAP_OF is not longer enabled as it depends on MTD_PHYSMAP which is not enabled. This is a regression from commit 642b1e8dbed7 ("mtd: maps: Merge physmap_of.c into physmap-core.c"), which added the extra dependency. Add CONFIG_MTD_PHYSMAP=y so this stays in the config, as Christophe said it is useful for build coverage. Fixes: 642b1e8dbed7 ("mtd: maps: Merge physmap_of.c into physmap-core.c") Signed-off-by: Joel Stanley --- arch/powerpc/configs/mpc885_ads_defconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig index 5cd17adf903f..cd08f9ed2c8d 100644 --- a/arch/powerpc/configs/mpc885_ads_defconfig +++ b/arch/powerpc/configs/mpc885_ads_defconfig @@ -33,6 +33,7 @@ CONFIG_MTD_CFI_GEOMETRY=y # CONFIG_MTD_CFI_I2 is not set CONFIG_MTD_CFI_I4=y CONFIG_MTD_CFI_AMDSTD=y +CONFIG_MTD_PHYSMAP=y CONFIG_MTD_PHYSMAP_OF=y # CONFIG_BLK_DEV is not set CONFIG_NETDEVICES=y -- 2.32.0
[PATCH v2 1/3] powerpc/config: Fix IPV6 warning in mpc855_ads
When building this config there's a warning: 79:warning: override: reassigning to symbol IPV6 Commit 9a1762a4a4ff ("powerpc/8xx: Update mpc885_ads_defconfig to improve CI") added CONFIG_IPV6=y, but left '# CONFIG_IPV6 is not set' in. IPV6 is default y, so remove both to clean up the build. Acked-by: Christophe Leroy Fixes: 9a1762a4a4ff ("powerpc/8xx: Update mpc885_ads_defconfig to improve CI") Signed-off-by: Joel Stanley --- arch/powerpc/configs/mpc885_ads_defconfig | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig index d21f266cea9a..5cd17adf903f 100644 --- a/arch/powerpc/configs/mpc885_ads_defconfig +++ b/arch/powerpc/configs/mpc885_ads_defconfig @@ -21,7 +21,6 @@ CONFIG_INET=y CONFIG_IP_MULTICAST=y CONFIG_IP_PNP=y CONFIG_SYN_COOKIES=y -# CONFIG_IPV6 is not set # CONFIG_FW_LOADER is not set CONFIG_MTD=y CONFIG_MTD_BLOCK=y @@ -76,7 +75,6 @@ CONFIG_PERF_EVENTS=y CONFIG_MATH_EMULATION=y CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y CONFIG_STRICT_KERNEL_RWX=y -CONFIG_IPV6=y CONFIG_BPF_JIT=y CONFIG_DEBUG_VM_PGTABLE=y CONFIG_BDI_SWITCH=y -- 2.32.0
[PATCH v2 0/3] powerpc: mpc855_ads defconfig fixes
v2: fix typos, split out mtd fix from savedefconfig patch The first patch fixes a build warning I noticed when testing something unrelated. The second fixes a regression where the MTD partition support dropped out of the config. I have enabled the dependency so this is still part of the config. The third patch is the result of doing a make savedefconfig with the first two patches applied. Joel Stanley (3): powerpc/config: Fix IPV6 warning in mpc855_ads powerpc/config: Renable MTD_PHYSMAP_OF powerpc/configs: Regenerate mpc885_ads_defconfig arch/powerpc/configs/mpc885_ads_defconfig | 49 +++ 1 file changed, 23 insertions(+), 26 deletions(-) -- 2.32.0
Re: [PATCH 2/2] powerpc/configs: Regenerate mpc885_ads_defconfig
On Mon, 16 Aug 2021 at 08:49, Christophe Leroy wrote: > > > > Le 16/08/2021 à 10:31, Joel Stanley a écrit : > > Regenrate atop v5.14-rc6. > > Typos. > > You mean you did redo a "make savedefconfig" ? Yes, I did.
[PATCH 3/7] powerpc: replace cc-option-yn uses with cc-option
cc-option-yn can be replaced with cc-option. ie. Checking for support: ifeq ($(call cc-option-yn,$(FLAG)),y) becomes: ifneq ($(call cc-option,$(FLAG)),) Checking for lack of support: ifeq ($(call cc-option-yn,$(FLAG)),n) becomes: ifeq ($(call cc-option,$(FLAG)),) This allows us to pursue removing cc-option-yn. Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Nick Desaulniers --- arch/powerpc/Makefile | 12 ++-- arch/powerpc/boot/Makefile | 5 + 2 files changed, 7 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 9aaf1abbc641..85e224536cf7 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -12,12 +12,12 @@ # Rewritten by Cort Dougan and Paul Mackerras # -HAS_BIARCH := $(call cc-option-yn, -m32) +HAS_BIARCH := $(call cc-option,-m32) # Set default 32 bits cross compilers for vdso and boot wrapper CROSS32_COMPILE ?= -ifeq ($(HAS_BIARCH),y) +ifeq ($(HAS_BIARCH),-m32) ifeq ($(CROSS32_COMPILE),) ifdef CONFIG_PPC32 # These options will be overridden by any -mcpu option that the CPU @@ -107,7 +107,7 @@ cflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mlittle-endian aflags-$(CONFIG_CPU_BIG_ENDIAN)+= $(call cc-option,-mbig-endian) aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mlittle-endian -ifeq ($(HAS_BIARCH),y) +ifeq ($(HAS_BIARCH),-m32) KBUILD_CFLAGS += -m$(BITS) KBUILD_AFLAGS += -m$(BITS) -Wl,-a$(BITS) KBUILD_LDFLAGS += -m elf$(BITS)$(LDEMULATION) @@ -125,7 +125,9 @@ LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) := -pie LDFLAGS_vmlinux:= $(LDFLAGS_vmlinux-y) ifdef CONFIG_PPC64 -ifeq ($(call cc-option-yn,-mcmodel=medium),y) +ifeq ($(call cc-option,-mcmodel=medium),) + export NO_MINIMAL_TOC := -mno-minimal-toc +else # -mcmodel=medium breaks modules because it uses 32bit offsets from # the TOC pointer to create pointers where possible. Pointers into the # percpu data area are created by this method. @@ -135,8 +137,6 @@ ifeq ($(call cc-option-yn,-mcmodel=medium),y) # kernel percpu data space (starting with 0xc...). We need a full # 64bit relocation for this to work, hence -mcmodel=large. KBUILD_CFLAGS_MODULE += -mcmodel=large -else - export NO_MINIMAL_TOC := -mno-minimal-toc endif endif diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile index 10c0fb306f15..33e1de5d1c95 100644 --- a/arch/powerpc/boot/Makefile +++ b/arch/powerpc/boot/Makefile @@ -66,10 +66,7 @@ ifdef CONFIG_DEBUG_INFO BOOTCFLAGS += -g endif -ifeq ($(call cc-option-yn, -fstack-protector),y) -BOOTCFLAGS += -fno-stack-protector -endif - +BOOTCFLAGS += $(call cc-option,-fstack-protector) BOOTCFLAGS += -I$(objtree)/$(obj) -I$(srctree)/$(obj) DTC_FLAGS ?= -p 1024 -- 2.33.0.rc1.237.g0d66db33f3-goog
Re: [PATCH v8 5/5] powerpc/pseries: Add support for FORM2 associativity
On Thu, Aug 12, 2021 at 06:52:23PM +0530, Aneesh Kumar K.V wrote: > PAPR interface currently supports two different ways of communicating resource > grouping details to the OS. These are referred to as Form 0 and Form 1 > associativity grouping. Form 0 is the older format and is now considered > deprecated. This patch adds another resource grouping named FORM2. > > Signed-off-by: Daniel Henrique Barboza > Signed-off-by: Aneesh Kumar K.V Reviewed-by: David Gibson Though there a couple of cosmetic issues and one bad memory access issue (though only in the case of buggy firmware). [snip] > +Form 2 > +--- > +Form 2 associativity format adds separate device tree properties > representing NUMA node distance > +thereby making the node distance computation flexible. Form 2 also allows > flexible primary > +domain numbering. With numa distance computation now detached from the index > value in > +"ibm,associativity-reference-points" property, Form 2 allows a large number > of primary domain > +ids at the same domainID index representing resource groups of different > performance/latency > +characteristics. > + > +Hypervisor indicates the usage of FORM2 associativity using bit 2 of byte 5 > in the > +"ibm,architecture-vec-5" property. > + > +"ibm,numa-lookup-index-table" property contains a list of one or more > numbers representing > +the domainIDs present in the system. The offset of the domainID in this > property is > +used as an index while computing numa distance information via > "ibm,numa-distance-table". > + > +prop-encoded-array: The number N of the domainIDs encoded as with > encode-int, followed by > +N domainID encoded as with encode-int > + > +For ex: > +"ibm,numa-lookup-index-table" = {4, 0, 8, 250, 252}. The offset of domainID > 8 (2) is used when Since you're using dts syntax below, it probably makes sense to use it here as well. > +computing the distance of domain 8 from other domains present in the system. > For the rest of > +this document, this offset will be referred to as domain distance offset. > + > +"ibm,numa-distance-table" property contains a list of one or more numbers > representing the NUMA > +distance between resource groups/domains present in the system. > + > +prop-encoded-array: The number N of the distance values encoded as with > encode-int, followed by > +N distance values encoded as with encode-bytes. The max distance value we > could encode is 255. > +The number N must be equal to the square of m where m is the number of > domainIDs in the > +numa-lookup-index-table. > + > +For ex: > +ibm,numa-lookup-index-table = <3 0 8 40>; > +ibm,numa-distace-table = <9>, /bits/ 8 < 10 20 80 > + 20 10 160 > + 80 160 10>; [snip] > + > + /* FORM2 affinity */ > + nid = of_node_to_nid_single(node); > + if (nid == NUMA_NO_NODE) > + return; > + > + /* > + * With FORM2 we expect NUMA distance of all possible NUMA > + * nodes to be provided during boot. > + */ > + WARN(numa_distance_table[nid][nid] == -1, > + "NUMA distance details for node %d not provided\n", nid); > +} > + > +/* > + * ibm,numa-lookup-index-table= {N, domainid1, domainid2, . domainidN} > + * ibm,numa-distance-table = { N, 1, 2, 4, 5, 1, 6, N elements} .. and here too. > + */ > +static void initialize_form2_numa_distance_lookup_table(void) > +{ > + int i, j; > + struct device_node *root; > + const __u8 *numa_dist_table; > + const __be32 *numa_lookup_index; > + int numa_dist_table_length; > + int max_numa_index, distance_index; > + > + if (firmware_has_feature(FW_FEATURE_OPAL)) > + root = of_find_node_by_path("/ibm,opal"); > + else > + root = of_find_node_by_path("/rtas"); > + if (!root) > + root = of_find_node_by_path("/"); > + > + numa_lookup_index = of_get_property(root, > "ibm,numa-lookup-index-table", NULL); > + max_numa_index = of_read_number(_lookup_index[0], 1); > + > + /* first element of the array is the size and is encode-int */ > + numa_dist_table = of_get_property(root, "ibm,numa-distance-table", > NULL); > + numa_dist_table_length = of_read_number((const __be32 > *)_dist_table[0], 1); > + /* Skip the size which is encoded int */ > + numa_dist_table += sizeof(__be32); > + > + pr_debug("numa_dist_table_len = %d, numa_dist_indexes_len = %d\n", > + numa_dist_table_length, max_numa_index); You validate numa_dist_table_length below. However, AFAICT you don't anywhere check that the property actually has the length its first element claims it does. Yes, that represents a firmware bug, but it's probably best if we don't ready past the end of the array in that case, which I think is what will happen now. Same applies for the lookup-index-table. > + for (i = 0; i < max_numa_index; i++) > + /* +1 skip the
Re: [PATCH v8 3/5] powerpc/pseries: Consolidate different NUMA distance update code paths
On Thu, Aug 12, 2021 at 06:52:21PM +0530, Aneesh Kumar K.V wrote: > The associativity details of the newly added resourced are collected from > the hypervisor via "ibm,configure-connector" rtas call. Update the numa > distance details of the newly added numa node after the above call. > > Instead of updating NUMA distance every time we lookup a node id > from the associativity property, add helpers that can be used > during boot which does this only once. Also remove the distance > update from node id lookup helpers. > > Currently, we duplicate parsing code for ibm,associativity and > ibm,associativity-lookup-arrays in the kernel. The associativity array > provided > by these device tree properties are very similar and hence can use > a helper to parse the node id and numa distance details. > > Signed-off-by: Aneesh Kumar K.V Reviewed-by: David Gibson There are a handful of nits it would be nice to clean up as followups, though: [snip] > +static int get_nid_and_numa_distance(struct drmem_lmb *lmb) > +{ > + struct assoc_arrays aa = { .arrays = NULL }; > + int default_nid = NUMA_NO_NODE; I don't think there's any point to the 'default_nid' variable. > + int nid = default_nid; > + int rc, index; > + > + if ((primary_domain_index < 0) || !numa_enabled) > + return default_nid; > + > + rc = of_get_assoc_arrays(); > + if (rc) > + return default_nid; > + > + if (primary_domain_index <= aa.array_sz && You don't need this test any more - it's included in __associativity_to_nid(). -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [PATCH 2/3] powerpc: Fix undefined static key
Joel Stanley writes: > When CONFIG_PPC_BARRIER_NOSPEC=n, security.c is not built leading to a > missing definition of uaccess_flush_key. > > LD vmlinux.o > MODPOST vmlinux.symvers > MODINFO modules.builtin.modinfo > GEN modules.builtin > LD .tmp_vmlinux.kallsyms1 > powerpc64le-linux-gnu-ld: arch/powerpc/kernel/align.o:(.toc+0x0): undefined > reference to `uaccess_flush_key' > powerpc64le-linux-gnu-ld: arch/powerpc/kernel/signal_64.o:(.toc+0x0): > undefined reference to `uaccess_flush_key' > powerpc64le-linux-gnu-ld: arch/powerpc/kernel/process.o:(.toc+0x0): undefined > reference to `uaccess_flush_key' > powerpc64le-linux-gnu-ld: arch/powerpc/kernel/traps.o:(.toc+0x0): undefined > reference to `uaccess_flush_key' > powerpc64le-linux-gnu-ld: > arch/powerpc/kernel/hw_breakpoint_constraints.o:(.toc+0x0): undefined > reference to `uaccess_flush_key' > powerpc64le-linux-gnu-ld: arch/powerpc/kernel/ptrace/ptrace.o:(.toc+0x0): > more undefined references to `uaccess_flush_key' follow > make[1]: *** [Makefile:1176: vmlinux] Error 1 > > Hack one in to fix the build. Yeah sorry that is a bit of a hack :) I think the root cause here is that we don't have a CONFIG for "want security workaround stuff", because so far we haven't had a CPU that wants to turn that all off. The generic code uses CONFIG_GENERIC_CPU_VULNERABILITIES, so I guess we should follow that example and add PPC_CPU_VULNERABILITIES. Then we'd use that to disable all the security.c stuff, and PPC_BARRIER_NOSPEC would depend on it also. Then we could allow configuring that off for Microwatt, or possibly for all platforms if people want to do that. cheers
Re: [PATCH 2/3] powerpc: Fix undefined static key
On Mon, 16 Aug 2021 at 08:39, Christophe Leroy wrote: > > > > Le 16/08/2021 à 10:24, Joel Stanley a écrit : > > When CONFIG_PPC_BARRIER_NOSPEC=n, security.c is not built leading to a > > missing definition of uaccess_flush_key. > > > >LD vmlinux.o > >MODPOST vmlinux.symvers > >MODINFO modules.builtin.modinfo > >GEN modules.builtin > >LD .tmp_vmlinux.kallsyms1 > > powerpc64le-linux-gnu-ld: arch/powerpc/kernel/align.o:(.toc+0x0): undefined > > reference to `uaccess_flush_key' > > powerpc64le-linux-gnu-ld: arch/powerpc/kernel/signal_64.o:(.toc+0x0): > > undefined reference to `uaccess_flush_key' > > powerpc64le-linux-gnu-ld: arch/powerpc/kernel/process.o:(.toc+0x0): > > undefined reference to `uaccess_flush_key' > > powerpc64le-linux-gnu-ld: arch/powerpc/kernel/traps.o:(.toc+0x0): undefined > > reference to `uaccess_flush_key' > > powerpc64le-linux-gnu-ld: > > arch/powerpc/kernel/hw_breakpoint_constraints.o:(.toc+0x0): undefined > > reference to `uaccess_flush_key' > > powerpc64le-linux-gnu-ld: arch/powerpc/kernel/ptrace/ptrace.o:(.toc+0x0): > > more undefined references to `uaccess_flush_key' follow > > make[1]: *** [Makefile:1176: vmlinux] Error 1 > > > > Hack one in to fix the build. > > > > Signed-off-by: Joel Stanley > > --- > > arch/powerpc/include/asm/security_features.h | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/arch/powerpc/include/asm/security_features.h > > b/arch/powerpc/include/asm/security_features.h > > index 792eefaf230b..46ade7927a4c 100644 > > --- a/arch/powerpc/include/asm/security_features.h > > +++ b/arch/powerpc/include/asm/security_features.h > > @@ -39,6 +39,9 @@ static inline bool security_ftr_enabled(u64 feature) > > return !!(powerpc_security_features & feature); > > } > > > > +#ifndef CONFIG_PPC_BARRIER_NOSPEC > > +DEFINE_STATIC_KEY_FALSE(uaccess_flush_key); > > +#endif > > It will then be re-defined by each file that includes asm/security_features.h > > > You can't use a DEFINE_ in a .h > > You have to fix the problem at its source. > > Cleanest way I see it to modify asm/book3s/64/kup.h and something like > > if (IS_ENABLED(CONFIG_PPC_BARRIER_NOSPEC) && > static_branch_unlikely(_flush_key) This won't work either as there's no declaration for uaccess_flush_key: arch/powerpc/include/asm/book3s/64/kup.h:411:78: error: ‘uaccess_flush_key’ undeclared (first use in this function) We could add an extern for it, but that is distasteful as the static key API suggests using the structs directly is deprecated, and the macros we're supposed to use perform initialisation. Could we assume microwatt-like platforms will not gain pkeys support, and have a stubbed out set of prevent/restore_user_access for systems that turn off either pkeys or BARRIER_NOSPEC? Or do we get rid of PPC_BARRIER_NOSPEC as an option, and have machines rely on disabling it at runtime?
Re: [PATCH 3/7] powerpc: replace cc-option-yn uses with cc-option
Nick Desaulniers writes: > cc-option-yn can be replaced with cc-option. ie. > Checking for support: > ifeq ($(call cc-option-yn,$(FLAG)),y) > becomes: > ifneq ($(call cc-option,$(FLAG)),) > > Checking for lack of support: > ifeq ($(call cc-option-yn,$(FLAG)),n) > becomes: > ifeq ($(call cc-option,$(FLAG)),) > > This allows us to pursue removing cc-option-yn. > > Cc: Michael Ellerman > Cc: Benjamin Herrenschmidt > Cc: Paul Mackerras > Cc: linuxppc-dev@lists.ozlabs.org > Signed-off-by: Nick Desaulniers > --- > arch/powerpc/Makefile | 12 ++-- > arch/powerpc/boot/Makefile | 5 + > 2 files changed, 7 insertions(+), 10 deletions(-) > > diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile > index 9aaf1abbc641..85e224536cf7 100644 > --- a/arch/powerpc/Makefile > +++ b/arch/powerpc/Makefile > @@ -12,12 +12,12 @@ > # Rewritten by Cort Dougan and Paul Mackerras > # > > -HAS_BIARCH := $(call cc-option-yn, -m32) > +HAS_BIARCH := $(call cc-option,-m32) > > # Set default 32 bits cross compilers for vdso and boot wrapper > CROSS32_COMPILE ?= > > -ifeq ($(HAS_BIARCH),y) > +ifeq ($(HAS_BIARCH),-m32) I don't love that we have to repeat "-m32" in each check. I'm pretty sure you can use ifdef here, because HAS_BIARCH is a simple variable (assigned with ":="). ie, this can be: ifdef HAS_BIARCH And that avoids having to spell out "-m32" everywhere. cheers
Re: [PATCH] powerpc/xive: Do not mark xive_request_ipi() as __init
On Mon, Aug 16, 2021 at 12:06 PM Nathan Chancellor wrote: > > Compiling ppc64le_defconfig with clang-14 shows a modpost warning: > > WARNING: modpost: vmlinux.o(.text+0xa74e0): Section mismatch in > reference from the function xive_setup_cpu_ipi() to the function > .init.text:xive_request_ipi() > The function xive_setup_cpu_ipi() references > the function __init xive_request_ipi(). > This is often because xive_setup_cpu_ipi lacks a __init > annotation or the annotation of xive_request_ipi is wrong. > > xive_request_ipi() is called from xive_setup_cpu_ipi(), which is not > __init, so xive_request_ipi() should not be marked __init. Remove the > attribute so there is no more warning. > > Fixes: cbc06f051c52 ("powerpc/xive: Do not skip CPU-less nodes when creating > the IPIs") > Signed-off-by: Nathan Chancellor Thanks for the patch! Reviewed-by: Nick Desaulniers > --- > arch/powerpc/sysdev/xive/common.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/powerpc/sysdev/xive/common.c > b/arch/powerpc/sysdev/xive/common.c > index 943fd30095af..8183ca343675 100644 > --- a/arch/powerpc/sysdev/xive/common.c > +++ b/arch/powerpc/sysdev/xive/common.c > @@ -1170,7 +1170,7 @@ static int __init xive_init_ipis(void) > return ret; > } > > -static int __init xive_request_ipi(unsigned int cpu) > +static int xive_request_ipi(unsigned int cpu) > { > struct xive_ipi_desc *xid = _ipis[early_cpu_to_node(cpu)]; > int ret; > > base-commit: a2824f19e6065a0d3735acd9fe7155b104e7edf5 > -- > 2.33.0.rc2 > -- Thanks, ~Nick Desaulniers
Re: [PATCH 3/7] powerpc: replace cc-option-yn uses with cc-option
On 8/16/2021 5:21 PM, 'Nick Desaulniers' via Clang Built Linux wrote: cc-option-yn can be replaced with cc-option. ie. Checking for support: ifeq ($(call cc-option-yn,$(FLAG)),y) becomes: ifneq ($(call cc-option,$(FLAG)),) Checking for lack of support: ifeq ($(call cc-option-yn,$(FLAG)),n) becomes: ifeq ($(call cc-option,$(FLAG)),) This allows us to pursue removing cc-option-yn. Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Nick Desaulniers --- arch/powerpc/Makefile | 12 ++-- arch/powerpc/boot/Makefile | 5 + 2 files changed, 7 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 9aaf1abbc641..85e224536cf7 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -12,12 +12,12 @@ # Rewritten by Cort Dougan and Paul Mackerras # -HAS_BIARCH := $(call cc-option-yn, -m32) +HAS_BIARCH := $(call cc-option,-m32) # Set default 32 bits cross compilers for vdso and boot wrapper CROSS32_COMPILE ?= -ifeq ($(HAS_BIARCH),y) +ifeq ($(HAS_BIARCH),-m32) ifeq ($(CROSS32_COMPILE),) ifdef CONFIG_PPC32 # These options will be overridden by any -mcpu option that the CPU @@ -107,7 +107,7 @@ cflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mlittle-endian aflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call cc-option,-mbig-endian) aflags-$(CONFIG_CPU_LITTLE_ENDIAN)+= -mlittle-endian -ifeq ($(HAS_BIARCH),y) +ifeq ($(HAS_BIARCH),-m32) KBUILD_CFLAGS += -m$(BITS) KBUILD_AFLAGS += -m$(BITS) -Wl,-a$(BITS) KBUILD_LDFLAGS+= -m elf$(BITS)$(LDEMULATION) @@ -125,7 +125,9 @@ LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) := -pie LDFLAGS_vmlinux := $(LDFLAGS_vmlinux-y) ifdef CONFIG_PPC64 -ifeq ($(call cc-option-yn,-mcmodel=medium),y) +ifeq ($(call cc-option,-mcmodel=medium),) + export NO_MINIMAL_TOC := -mno-minimal-toc +else # -mcmodel=medium breaks modules because it uses 32bit offsets from # the TOC pointer to create pointers where possible. Pointers into the # percpu data area are created by this method. @@ -135,8 +137,6 @@ ifeq ($(call cc-option-yn,-mcmodel=medium),y) # kernel percpu data space (starting with 0xc...). We need a full # 64bit relocation for this to work, hence -mcmodel=large. KBUILD_CFLAGS_MODULE += -mcmodel=large -else - export NO_MINIMAL_TOC := -mno-minimal-toc endif endif diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile index 10c0fb306f15..33e1de5d1c95 100644 --- a/arch/powerpc/boot/Makefile +++ b/arch/powerpc/boot/Makefile @@ -66,10 +66,7 @@ ifdef CONFIG_DEBUG_INFO BOOTCFLAGS+= -g endif -ifeq ($(call cc-option-yn, -fstack-protector),y) -BOOTCFLAGS += -fno-stack-protector -endif - +BOOTCFLAGS += $(call cc-option,-fstack-protector) This was previously disabling the stack protector but now it is enabling it. Just remove the ifeq conditional altogether as the kernel assumes -fno-stack-protector is always supported after commit 893ab00439a4 ("kbuild: remove cc-option test of -fno-stack-protector"). BOOTCFLAGS+= -I$(objtree)/$(obj) -I$(srctree)/$(obj) DTC_FLAGS ?= -p 1024
Re: [PATCH v2 1/2] sched/topology: Skip updating masks for non-online nodes
On 16/08/21 16:03, Srikar Dronamraju wrote: >> >> Your version is much much better than mine. >> And I have verified that it works as expected. >> >> > > Hey Peter/Valentin > > Are we waiting for any more feedback/testing for this? > I'm not overly fond of that last one, but AFAICT the only alternative is doing a full-fledged NUMA topology rebuild on new-node onlining (i.e. make calling sched_init_numa() more than once work). It's a lot more work for a very particular usecase. > > -- > Thanks and Regards > Srikar Dronamraju
[PATCH] powerpc/xive: Do not mark xive_request_ipi() as __init
Compiling ppc64le_defconfig with clang-14 shows a modpost warning: WARNING: modpost: vmlinux.o(.text+0xa74e0): Section mismatch in reference from the function xive_setup_cpu_ipi() to the function .init.text:xive_request_ipi() The function xive_setup_cpu_ipi() references the function __init xive_request_ipi(). This is often because xive_setup_cpu_ipi lacks a __init annotation or the annotation of xive_request_ipi is wrong. xive_request_ipi() is called from xive_setup_cpu_ipi(), which is not __init, so xive_request_ipi() should not be marked __init. Remove the attribute so there is no more warning. Fixes: cbc06f051c52 ("powerpc/xive: Do not skip CPU-less nodes when creating the IPIs") Signed-off-by: Nathan Chancellor --- arch/powerpc/sysdev/xive/common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c index 943fd30095af..8183ca343675 100644 --- a/arch/powerpc/sysdev/xive/common.c +++ b/arch/powerpc/sysdev/xive/common.c @@ -1170,7 +1170,7 @@ static int __init xive_init_ipis(void) return ret; } -static int __init xive_request_ipi(unsigned int cpu) +static int xive_request_ipi(unsigned int cpu) { struct xive_ipi_desc *xid = _ipis[early_cpu_to_node(cpu)]; int ret; base-commit: a2824f19e6065a0d3735acd9fe7155b104e7edf5 -- 2.33.0.rc2
Re: [PATCH v1 2/4] powerpc/64s/perf: add power_pmu_running to query whether perf is being used
Hi Nicholas, I love your patch! Yet something to improve: [auto build test ERROR on powerpc/next] [also build test ERROR on v5.14-rc6 next-20210816] [cannot apply to scottwood/next] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-64s-interrupt-speedups/20210816-153424 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc-wii_defconfig (attached as .config) compiler: powerpc-linux-gcc (GCC) 11.2.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/98b8a35aa718c93ace9df66df6274fe392633f80 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Nicholas-Piggin/powerpc-64s-interrupt-speedups/20210816-153424 git checkout 98b8a35aa718c93ace9df66df6274fe392633f80 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=powerpc If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>): >> arch/powerpc/perf/core-book3s.c:2392:6: error: no previous prototype for >> 'power_pmu_running' [-Werror=missing-prototypes] 2392 | bool power_pmu_running(void) | ^ cc1: all warnings being treated as errors vim +/power_pmu_running +2392 arch/powerpc/perf/core-book3s.c 2391 > 2392 bool power_pmu_running(void) 2393 { 2394 struct cpu_hw_events *cpuhw; 2395 2396 /* Could this simply test local_paca->pmcregs_in_use? */ 2397 2398 if (!ppmu) 2399 return false; 2400 2401 cpuhw = this_cpu_ptr(_hw_events); 2402 return cpuhw->n_events; 2403 } 2404 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip
Re: [PATCH] powerpc/pseries/cpuhp: fix non-NUMA build
Christophe Leroy writes: > Le 16/08/2021 à 17:17, Nathan Lynch a écrit : >> With CONFIG_NUMA unset, direct references to node_to_cpumask_map don't >> work: >> >> arch/powerpc/platforms/pseries/hotplug-cpu.c: In function >> 'pseries_cpu_hotplug_init': arch/powerpc/platforms/pseries/hotplug-cpu.c:1022:8: error: 'node_to_cpumask_map' undeclared (first use in this function) >> 1022 |node_to_cpumask_map[node]); >> |^~~ >> >> Use cpumask_of_node() here instead. > > > Isn't it the same as > https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210816041032.2839343-1-...@ellerman.id.au/ > ? Yes, thanks.
Re: [PATCH] powerpc/pseries/cpuhp: fix non-NUMA build
Le 16/08/2021 à 17:17, Nathan Lynch a écrit : With CONFIG_NUMA unset, direct references to node_to_cpumask_map don't work: arch/powerpc/platforms/pseries/hotplug-cpu.c: In function 'pseries_cpu_hotplug_init': arch/powerpc/platforms/pseries/hotplug-cpu.c:1022:8: error: 'node_to_cpumask_map' undeclared (first use in this function) 1022 |node_to_cpumask_map[node]); |^~~ Use cpumask_of_node() here instead. Isn't it the same as https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210816041032.2839343-1-...@ellerman.id.au/ ? Christophe Fixes: bd1dd4c5f528 ("powerpc/pseries: Prevent free CPU ids being reused on another node") Reported-by: kernel test robot Signed-off-by: Nathan Lynch --- arch/powerpc/platforms/pseries/hotplug-cpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 1ef40ef699a6..d646c22e94ab 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -1021,7 +1021,7 @@ static int __init pseries_cpu_hotplug_init(void) /* Record ids of CPU added at boot time */ cpumask_or(node_recorded_ids_map[node], node_recorded_ids_map[node], - node_to_cpumask_map[node]); + cpumask_of_node(node)); } of_reconfig_notifier_register(_smp_nb);
[PATCH] powerpc/pseries/cpuhp: fix non-NUMA build
With CONFIG_NUMA unset, direct references to node_to_cpumask_map don't work: arch/powerpc/platforms/pseries/hotplug-cpu.c: In function 'pseries_cpu_hotplug_init': >> arch/powerpc/platforms/pseries/hotplug-cpu.c:1022:8: error: >> 'node_to_cpumask_map' undeclared (first use in this >> function) 1022 |node_to_cpumask_map[node]); |^~~ Use cpumask_of_node() here instead. Fixes: bd1dd4c5f528 ("powerpc/pseries: Prevent free CPU ids being reused on another node") Reported-by: kernel test robot Signed-off-by: Nathan Lynch --- arch/powerpc/platforms/pseries/hotplug-cpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 1ef40ef699a6..d646c22e94ab 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -1021,7 +1021,7 @@ static int __init pseries_cpu_hotplug_init(void) /* Record ids of CPU added at boot time */ cpumask_or(node_recorded_ids_map[node], node_recorded_ids_map[node], - node_to_cpumask_map[node]); + cpumask_of_node(node)); } of_reconfig_notifier_register(_smp_nb); -- 2.31.1
Re: [FSL P50x0] lscpu reports wrong values since the RC1 of kernel 5.13
Hi All, I tested the RC6 of kernel 5.14 today and unfortunately the issue still exists. We have figured out that only P5040 SoCs are affected. [1] P5020 SoCs display the correct values. Please check the CPU changes in the PowerPC updates 5.13-1 and 5.13-2. Thanks, Christian [1] https://forum.hyperion-entertainment.com/viewtopic.php?p=53775#p53775 On 09 August 2021 um 02:37 pm, Christian Zigotzky wrote: Hi All, Lscpu reports wrong values [1] since the RC1 of kernel 5.13 on my FSL P5040 Cyrus+ board (A-EON AmigaOne X5000). [2] The differences are: Since the RC1 of kernel 5.13 (wrong values): Core(s) per socket: 1 Socket(s): 3 Before (correct values): Core(s) per socket: 4 Socket(s): 1 Through the wrong values, I can't use "-smp 4" with a virtual e5500 QEMU machine with KVM HV anymore. [3] "-smp 3" works with KVM HV. Maybe the file_load_64 commit from the PowerPC updates 5.13-2 is the problem ( powerpc/kexec_file: Use current CPU info while setting up FDT). [4] Or maybe this change (PowerPC updates 5.13-1): -#ifdef CONFIG_PPC_BOOK3E_64 - state->ctx_state = exception_enter(); - if (user_mode(regs)) - account_cpu_user_entry(); -#endif --- Or maybe this change (PowerPC updates 5.13-1): diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h index 7a13bc20f0a0c..03b3d010cbab6 100644 --- a/arch/powerpc/include/asm/smp.h +++ b/arch/powerpc/include/asm/smp.h @@ -31,6 +31,7 @@ extern u32 *cpu_to_phys_id; extern bool coregroup_enabled; extern int cpu_to_chip_id(int cpu); +extern int *chip_id_lookup_table; #ifdef CONFIG_SMP @@ -121,6 +122,11 @@ static inline struct cpumask *cpu_sibling_mask(int cpu) return per_cpu(cpu_sibling_map, cpu); } +static inline struct cpumask *cpu_core_mask(int cpu) +{ + return per_cpu(cpu_core_map, cpu); +} + static inline struct cpumask *cpu_l2_cache_mask(int cpu) { return per_cpu(cpu_l2_cache_map, cpu); --- I have found a lot of other changes in the PowerPC updates 5.13-1 regarding the CPU. Could you please check the CPU changes in the PowerPC updates 5.13-1 and 5.13-2? Please find attached the kernel 5.14-rc5 config. Thanks, Christian [1] lscpu with the correct values before the RC1 of kernel 5.13: Architecture: ppc64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Big Endian CPU(s): 4 On-line CPU(s) list: 0-3 Thread(s) per core: 1 Core(s) per socket: 4 Socket(s): 1 Model: 1.2 (pvr 8024 0012) Model name: e5500 L1d cache: 128 KiB L1i cache: 128 KiB L2 cache: 2 MiB L3 cache: 2 MiB Vulnerability Itlb multihit: Not affected Vulnerability L1tf: Not affected Vulnerability Mds: Not affected Vulnerability Meltdown: Not affected Vulnerability Spec store bypass: Not affected Vulnerability Spectre v1: Mitigation; __user pointer sanitization Vulnerability Spectre v2: Mitigation; Branch predictor state flush Vulnerability Srbds: Not affected Vulnerability Tsx async abort: Not affected --- lscpu with the wrong values since the RC1 of kernel 5.13: Architecture: ppc64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Big Endian CPU(s): 4 On-line CPU(s) list: 0-3 Thread(s) per core: 1 Core(s) per socket: 1 Socket(s): 3 Model: 1.2 (pvr 8024 0012) Model name: e5500 L1d cache: 128 KiB L1i cache: 128 KiB L2 cache: 2 MiB L3 cache: 2 MiB Vulnerability Itlb multihit: Not affected Vulnerability L1tf: Not affected Vulnerability Mds: Not affected Vulnerability Meltdown: Not affected Vulnerability Spec store bypass: Not affected Vulnerability Spectre v1: Mitigation; __user pointer sanitization Vulnerability Spectre v2: Mitigation; Branch predictor state flush Vulnerability Srbds: Not affected Vulnerability Tsx async abort: Not affected --- [2] http://wiki.amiga.org/index.php?title=X5000 [3] https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-May/229103.html [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/arch/powerpc/kexec/file_load_64.c?id=ab159ac569fddf812c0a217d6dbffaa5d93ef88f
Re: [PATCH linux-next] module: remove duplicate include in interrupt.c
Le 16/08/2021 à 13:34, cgel@gmail.com a écrit : From: Lv Ruyi 'asm/interrupt.h' included in 'interrupt.c' is duplicated. This patch has been submitted at least half a dozen of times already. You should maintain alphabetic order in the include list. But please don't post it again, we have it in the pipe already, see https://patchwork.ozlabs.org/project/linuxppc-dev/patch/1624329437-84730-1-git-send-email-jiapeng.ch...@linux.alibaba.com/ Next time please check at https://patchwork.ozlabs.org/project/linuxppc-dev/list/ before submitting a new patch. Thanks Christophe Reported-by: Zeal Robot Signed-off-by: Lv Ruyi --- arch/powerpc/kernel/interrupt.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c index 21bbd615ca41..8a936515e4e4 100644 --- a/arch/powerpc/kernel/interrupt.c +++ b/arch/powerpc/kernel/interrupt.c @@ -10,7 +10,6 @@ #include #include #include -#include #include #include #include
[PATCH linux-next] module: remove duplicate include in interrupt.c
From: Lv Ruyi 'asm/interrupt.h' included in 'interrupt.c' is duplicated. Reported-by: Zeal Robot Signed-off-by: Lv Ruyi --- arch/powerpc/kernel/interrupt.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c index 21bbd615ca41..8a936515e4e4 100644 --- a/arch/powerpc/kernel/interrupt.c +++ b/arch/powerpc/kernel/interrupt.c @@ -10,7 +10,6 @@ #include #include #include -#include #include #include #include -- 2.25.1
Re: [PATCH v2 1/2] sched/topology: Skip updating masks for non-online nodes
> > Your version is much much better than mine. > And I have verified that it works as expected. > > Hey Peter/Valentin Are we waiting for any more feedback/testing for this? -- Thanks and Regards Srikar Dronamraju
Re: [PATCH 0/2] powerpc: mpc855_ads defconfig fixes
Le 16/08/2021 à 10:31, Joel Stanley a écrit : The first was a build warning I noticed when testing something unrelated. I took a moment to look into it, and came up with the second patch which updates the defconfig to make it easier to maintain in the future It also fixes a regression where the MTD partition support dropped out of the config. Given noone noticed the regression since v4.20 was released, perhaps it could be left disabled? Most likely nobody is using this board anymore. But that's a good to have it to perform CI builds. So we should leave that kind of config. Joel Stanley (2): powerpc/config: Fix IPV6 warning in mpc855_ads powerpc/configs: Regenerate mpc885_ads_defconfig arch/powerpc/configs/mpc885_ads_defconfig | 49 +++ 1 file changed, 23 insertions(+), 26 deletions(-)
Re: [PATCH 2/2] powerpc/configs: Regenerate mpc885_ads_defconfig
Le 16/08/2021 à 10:31, Joel Stanley a écrit : Regenrate atop v5.14-rc6. Typos. You mean you did redo a "make savedefconfig" ? The chagnes are mostly re-ordering, except for the following which fall out due to dependenacies: - CONFIG_DEBUG_KERNEL=y selected by EXPERT - CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008 which is the default setting CONFIG_MTD_PHYSMAP_OF is not longer enabled, as it depends on MTD_PHYSMAP which is not enabled. This is a regression from commit 642b1e8dbed7 ("mtd: maps: Merge physmap_of.c into physmap-core.c"), which added the extra dependency. Add CONFIG_MTD_PHYSMAP=y so this stays in the config. Signed-off-by: Joel Stanley --- arch/powerpc/configs/mpc885_ads_defconfig | 47 +++ 1 file changed, 23 insertions(+), 24 deletions(-) diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig index 5cd17adf903f..c74dc76b1d0d 100644 --- a/arch/powerpc/configs/mpc885_ads_defconfig +++ b/arch/powerpc/configs/mpc885_ads_defconfig @@ -1,19 +1,30 @@ -CONFIG_PPC_8xx=y # CONFIG_SWAP is not set CONFIG_SYSVIPC=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y +CONFIG_BPF_JIT=y +CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y CONFIG_LOG_BUF_SHIFT=14 CONFIG_EXPERT=y # CONFIG_ELF_CORE is not set # CONFIG_BASE_FULL is not set # CONFIG_FUTEX is not set +CONFIG_PERF_EVENTS=y # CONFIG_VM_EVENT_COUNTERS is not set -# CONFIG_BLK_DEV_BSG is not set -CONFIG_PARTITION_ADVANCED=y +CONFIG_PPC_8xx=y +CONFIG_8xx_GPIO=y +CONFIG_SMC_UCODE_PATCH=y +CONFIG_PIN_TLB=y CONFIG_GEN_RTC=y CONFIG_HZ_100=y +CONFIG_MATH_EMULATION=y +CONFIG_PPC_16K_PAGES=y +CONFIG_ADVANCED_OPTIONS=y # CONFIG_SECCOMP is not set +CONFIG_STRICT_KERNEL_RWX=y +CONFIG_MODULES=y +# CONFIG_BLK_DEV_BSG is not set +CONFIG_PARTITION_ADVANCED=y CONFIG_NET=y CONFIG_PACKET=y CONFIG_UNIX=y @@ -33,6 +44,7 @@ CONFIG_MTD_CFI_GEOMETRY=y # CONFIG_MTD_CFI_I2 is not set CONFIG_MTD_CFI_I4=y CONFIG_MTD_CFI_AMDSTD=y +CONFIG_MTD_PHYSMAP=y CONFIG_MTD_PHYSMAP_OF=y # CONFIG_BLK_DEV is not set CONFIG_NETDEVICES=y @@ -45,38 +57,25 @@ CONFIG_DAVICOM_PHY=y # CONFIG_LEGACY_PTYS is not set CONFIG_SERIAL_CPM=y CONFIG_SERIAL_CPM_CONSOLE=y +CONFIG_SPI=y +CONFIG_SPI_FSL_SPI=y # CONFIG_HWMON is not set +CONFIG_WATCHDOG=y +CONFIG_8xxx_WDT=y # CONFIG_USB_SUPPORT is not set # CONFIG_DNOTIFY is not set CONFIG_TMPFS=y CONFIG_CRAMFS=y CONFIG_NFS_FS=y CONFIG_ROOT_NFS=y +CONFIG_CRYPTO=y +CONFIG_CRYPTO_DEV_TALITOS=y CONFIG_CRC32_SLICEBY4=y CONFIG_DEBUG_INFO=y CONFIG_MAGIC_SYSRQ=y -CONFIG_DETECT_HUNG_TASK=y -CONFIG_PPC_16K_PAGES=y -CONFIG_DEBUG_KERNEL=y CONFIG_DEBUG_FS=y -CONFIG_PPC_PTDUMP=y -CONFIG_MODULES=y -CONFIG_SPI=y -CONFIG_SPI_FSL_SPI=y -CONFIG_CRYPTO=y -CONFIG_CRYPTO_DEV_TALITOS=y -CONFIG_8xx_GPIO=y -CONFIG_WATCHDOG=y -CONFIG_8xxx_WDT=y -CONFIG_SMC_UCODE_PATCH=y -CONFIG_ADVANCED_OPTIONS=y -CONFIG_PIN_TLB=y -CONFIG_PERF_EVENTS=y -CONFIG_MATH_EMULATION=y -CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y -CONFIG_STRICT_KERNEL_RWX=y -CONFIG_BPF_JIT=y CONFIG_DEBUG_VM_PGTABLE=y +CONFIG_DETECT_HUNG_TASK=y CONFIG_BDI_SWITCH=y CONFIG_PPC_EARLY_DEBUG=y -CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008 +CONFIG_PPC_PTDUMP=y
Re: [PATCH 1/2] powerpc/config: Fix IPV6 warning in mpc855_ads
Le 16/08/2021 à 10:31, Joel Stanley a écrit : When building this config there's a warning: 79:warning: override: reassigning to symbol IPV6 Commit 9a1762a4a4ff ("powerpc/8xx: Update mpc885_ads_defconfig to improve CI") added CONFIG_IPV6=y, but left '# CONFIG_IPV6 is not set' in. IPV6 is default y, so remove both to clean up the build. Signed-off-by: Joel Stanley Acked-by: Christophe Leroy --- arch/powerpc/configs/mpc885_ads_defconfig | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig index d21f266cea9a..5cd17adf903f 100644 --- a/arch/powerpc/configs/mpc885_ads_defconfig +++ b/arch/powerpc/configs/mpc885_ads_defconfig @@ -21,7 +21,6 @@ CONFIG_INET=y CONFIG_IP_MULTICAST=y CONFIG_IP_PNP=y CONFIG_SYN_COOKIES=y -# CONFIG_IPV6 is not set # CONFIG_FW_LOADER is not set CONFIG_MTD=y CONFIG_MTD_BLOCK=y @@ -76,7 +75,6 @@ CONFIG_PERF_EVENTS=y CONFIG_MATH_EMULATION=y CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y CONFIG_STRICT_KERNEL_RWX=y -CONFIG_IPV6=y CONFIG_BPF_JIT=y CONFIG_DEBUG_VM_PGTABLE=y CONFIG_BDI_SWITCH=y
Re: [PATCH 3/3] powerpc/microwatt: CPU doesn't (yet) have speculation bugs
Le 16/08/2021 à 10:24, Joel Stanley a écrit : Signed-off-by: Joel Stanley --- arch/powerpc/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 663766fbf505..d5af6667c206 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -279,6 +279,7 @@ config PPC_BARRIER_NOSPEC bool default y depends on PPC_BOOK3S_64 || PPC_FSL_BOOK3E + depends on !PPC_MICROWATT Not sure it is a good idea to disable it completely when PPC_MICROWATT is selected. Don't we want to be able to build generic kernels that can run on any book3s/64 ? Maybe you should change the default instead, something like: bool default y if !PPC_MICROWATT depends on PPC_BOOK3S_64 || PPC_FSL_BOOK3E config EARLY_PRINTK bool
Re: [PATCH v2 00/60] KVM: PPC: Book3S HV P9: entry/exit optimisations
> On 11-Aug-2021, at 9:30 PM, Nicholas Piggin wrote: > > This reduces radix guest full entry/exit latency on POWER9 and POWER10 > by 2x. > > Nested HV guests should see smaller improvements in their L1 entry/exit, > but this is also combined with most L0 speedups also applying to nested > entry. nginx localhost throughput test in a SMP nested guest is improved > about 10% (in a direct guest it doesn't change much because it uses XIVE > for IPIs) when L0 and L1 are patched. > > It does this in several main ways: > > - Rearrange code to optimise SPR accesses. Mainly, avoid scoreboard > stalls. > > - Test SPR values to avoid mtSPRs where possible. mtSPRs are expensive. > > - Reduce mftb. mftb is expensive. > > - Demand fault certain facilities to avoid saving and/or restoring them > (at the cost of fault when they are used, but this is mitigated over > a number of entries, like the facilities when context switching > processes). PM, TM, and EBB so far. > > - Defer some sequences that are made just in case a guest is interrupted > in the middle of a critical section to the case where the guest is > scheduled on a different CPU, rather than every time (at the cost of > an extra IPI in this case). Namely the tlbsync sequence for radix with > GTSE, which is very expensive. > > - Reduce locking, barriers, atomics related to the vcpus-per-vcore > 1 > handling that the P9 path does not require. > > Changes since v1: > - Verified DPDES changes still work with msgsndp SMT emulation. > - Fixed HMI handling bug. > - Split softpatch handling fixes into smaller pieces. > - Rebased with Fabiano's latest HV sanitising patches. > - Fix TM demand faulting bug causing nested guest TM tests to TM Bad > Thing the host in rare cases. > - Re-name new "pmu=" command line option to "pmu_override=" and update > documentation wording. Hi Nick, For the PMU related changes, Reviewed-by: Athira Rajeev Thanks Athira > - Add default=y config option rather than unconditionally removing the > L0 nested PMU workaround. > - Remove unnecessary MSR[RI] updates in entry/exit. Down to about 4700 > cycles now. > - Another bugfix from Alexey's testing. > > Changes since RFC: > - Rebased with Fabiano's HV sanitising patches at the front. > - Several demand faulting bug fixes mostly relating to nested guests. > - Removed facility demand-faulting from L0 nested entry/exit handler. > Demand faulting is still done in the L1, but not the L0. The reason > is to reduce complexity (although it's only a small amount of > complexity), reduce demand faulting overhead that may require several > > Fabiano Rosas (3): > KVM: PPC: Book3S HV Nested: Sanitise vcpu registers > KVM: PPC: Book3S HV Nested: Stop forwarding all HFUs to L1 > KVM: PPC: Book3S HV Nested: save_hv_return_state does not require trap >argument > > Nicholas Piggin (57): > KVM: PPC: Book3S HV: Initialise vcpu MSR with MSR_ME > KVM: PPC: Book3S HV: Remove TM emulation from POWER7/8 path > KVM: PPC: Book3S HV P9: Fixes for TM softpatch interrupt NIP > KVM: PPC: Book3S HV Nested: Fix TM softpatch HFAC interrupt emulation > KVM: PPC: Book3S HV Nested: Make nested HFSCR state accessible > KVM: PPC: Book3S HV Nested: Reflect guest PMU in-use to L0 when guest >SPRs are live > powerpc/64s: Remove WORT SPR from POWER9/10 > KMV: PPC: Book3S HV P9: Use set_dec to set decrementer to host > KVM: PPC: Book3S HV P9: Use host timer accounting to avoid decrementer >read > KVM: PPC: Book3S HV P9: Use large decrementer for HDEC > KVM: PPC: Book3S HV P9: Reduce mftb per guest entry/exit > powerpc/time: add API for KVM to re-arm the host timer/decrementer > KVM: PPC: Book3S HV: POWER10 enable HAIL when running radix guests > powerpc/64s: Keep AMOR SPR a constant ~0 at runtime > KVM: PPC: Book3S HV: Don't always save PMU for guest capable of >nesting > powerpc/64s: Always set PMU control registers to frozen/disabled when >not in use > powerpc/64s: Implement PMU override command line option > KVM: PPC: Book3S HV P9: Implement PMU save/restore in C > KVM: PPC: Book3S HV P9: Factor PMU save/load into context switch >functions > KVM: PPC: Book3S HV P9: Demand fault PMU SPRs when marked not inuse > KVM: PPC: Book3S HV P9: Factor out yield_count increment > KVM: PPC: Book3S HV: CTRL SPR does not require read-modify-write > KVM: PPC: Book3S HV P9: Move SPRG restore to restore_p9_host_os_sprs > KVM: PPC: Book3S HV P9: Reduce mtmsrd instructions required to save >host SPRs > KVM: PPC: Book3S HV P9: Improve mtmsrd scheduling by delaying MSR[EE] >disable > KVM: PPC: Book3S HV P9: Add kvmppc_stop_thread to match >kvmppc_start_thread > KVM: PPC: Book3S HV: Change dec_expires to be relative to guest >timebase > KVM: PPC: Book3S HV P9: Move TB updates > KVM: PPC: Book3S HV P9: Optimise timebase reads > KVM: PPC: Book3S HV P9: Avoid SPR scoreboard stalls > KVM: PPC: Book3S HV P9: Only execute mtSPR if the value changed >
Re: [PATCH 2/3] powerpc: Fix undefined static key
Le 16/08/2021 à 10:24, Joel Stanley a écrit : When CONFIG_PPC_BARRIER_NOSPEC=n, security.c is not built leading to a missing definition of uaccess_flush_key. LD vmlinux.o MODPOST vmlinux.symvers MODINFO modules.builtin.modinfo GEN modules.builtin LD .tmp_vmlinux.kallsyms1 powerpc64le-linux-gnu-ld: arch/powerpc/kernel/align.o:(.toc+0x0): undefined reference to `uaccess_flush_key' powerpc64le-linux-gnu-ld: arch/powerpc/kernel/signal_64.o:(.toc+0x0): undefined reference to `uaccess_flush_key' powerpc64le-linux-gnu-ld: arch/powerpc/kernel/process.o:(.toc+0x0): undefined reference to `uaccess_flush_key' powerpc64le-linux-gnu-ld: arch/powerpc/kernel/traps.o:(.toc+0x0): undefined reference to `uaccess_flush_key' powerpc64le-linux-gnu-ld: arch/powerpc/kernel/hw_breakpoint_constraints.o:(.toc+0x0): undefined reference to `uaccess_flush_key' powerpc64le-linux-gnu-ld: arch/powerpc/kernel/ptrace/ptrace.o:(.toc+0x0): more undefined references to `uaccess_flush_key' follow make[1]: *** [Makefile:1176: vmlinux] Error 1 Hack one in to fix the build. Signed-off-by: Joel Stanley --- arch/powerpc/include/asm/security_features.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/powerpc/include/asm/security_features.h b/arch/powerpc/include/asm/security_features.h index 792eefaf230b..46ade7927a4c 100644 --- a/arch/powerpc/include/asm/security_features.h +++ b/arch/powerpc/include/asm/security_features.h @@ -39,6 +39,9 @@ static inline bool security_ftr_enabled(u64 feature) return !!(powerpc_security_features & feature); } +#ifndef CONFIG_PPC_BARRIER_NOSPEC +DEFINE_STATIC_KEY_FALSE(uaccess_flush_key); +#endif It will then be re-defined by each file that includes asm/security_features.h You can't use a DEFINE_ in a .h You have to fix the problem at its source. Cleanest way I see it to modify asm/book3s/64/kup.h and something like if (IS_ENABLED(CONFIG_PPC_BARRIER_NOSPEC) && static_branch_unlikely(_flush_key) // Features indicating support for Spectre/Meltdown mitigations
[PATCH 2/2] powerpc/configs: Regenerate mpc885_ads_defconfig
Regenrate atop v5.14-rc6. The chagnes are mostly re-ordering, except for the following which fall out due to dependenacies: - CONFIG_DEBUG_KERNEL=y selected by EXPERT - CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008 which is the default setting CONFIG_MTD_PHYSMAP_OF is not longer enabled, as it depends on MTD_PHYSMAP which is not enabled. This is a regression from commit 642b1e8dbed7 ("mtd: maps: Merge physmap_of.c into physmap-core.c"), which added the extra dependency. Add CONFIG_MTD_PHYSMAP=y so this stays in the config. Signed-off-by: Joel Stanley --- arch/powerpc/configs/mpc885_ads_defconfig | 47 +++ 1 file changed, 23 insertions(+), 24 deletions(-) diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig index 5cd17adf903f..c74dc76b1d0d 100644 --- a/arch/powerpc/configs/mpc885_ads_defconfig +++ b/arch/powerpc/configs/mpc885_ads_defconfig @@ -1,19 +1,30 @@ -CONFIG_PPC_8xx=y # CONFIG_SWAP is not set CONFIG_SYSVIPC=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y +CONFIG_BPF_JIT=y +CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y CONFIG_LOG_BUF_SHIFT=14 CONFIG_EXPERT=y # CONFIG_ELF_CORE is not set # CONFIG_BASE_FULL is not set # CONFIG_FUTEX is not set +CONFIG_PERF_EVENTS=y # CONFIG_VM_EVENT_COUNTERS is not set -# CONFIG_BLK_DEV_BSG is not set -CONFIG_PARTITION_ADVANCED=y +CONFIG_PPC_8xx=y +CONFIG_8xx_GPIO=y +CONFIG_SMC_UCODE_PATCH=y +CONFIG_PIN_TLB=y CONFIG_GEN_RTC=y CONFIG_HZ_100=y +CONFIG_MATH_EMULATION=y +CONFIG_PPC_16K_PAGES=y +CONFIG_ADVANCED_OPTIONS=y # CONFIG_SECCOMP is not set +CONFIG_STRICT_KERNEL_RWX=y +CONFIG_MODULES=y +# CONFIG_BLK_DEV_BSG is not set +CONFIG_PARTITION_ADVANCED=y CONFIG_NET=y CONFIG_PACKET=y CONFIG_UNIX=y @@ -33,6 +44,7 @@ CONFIG_MTD_CFI_GEOMETRY=y # CONFIG_MTD_CFI_I2 is not set CONFIG_MTD_CFI_I4=y CONFIG_MTD_CFI_AMDSTD=y +CONFIG_MTD_PHYSMAP=y CONFIG_MTD_PHYSMAP_OF=y # CONFIG_BLK_DEV is not set CONFIG_NETDEVICES=y @@ -45,38 +57,25 @@ CONFIG_DAVICOM_PHY=y # CONFIG_LEGACY_PTYS is not set CONFIG_SERIAL_CPM=y CONFIG_SERIAL_CPM_CONSOLE=y +CONFIG_SPI=y +CONFIG_SPI_FSL_SPI=y # CONFIG_HWMON is not set +CONFIG_WATCHDOG=y +CONFIG_8xxx_WDT=y # CONFIG_USB_SUPPORT is not set # CONFIG_DNOTIFY is not set CONFIG_TMPFS=y CONFIG_CRAMFS=y CONFIG_NFS_FS=y CONFIG_ROOT_NFS=y +CONFIG_CRYPTO=y +CONFIG_CRYPTO_DEV_TALITOS=y CONFIG_CRC32_SLICEBY4=y CONFIG_DEBUG_INFO=y CONFIG_MAGIC_SYSRQ=y -CONFIG_DETECT_HUNG_TASK=y -CONFIG_PPC_16K_PAGES=y -CONFIG_DEBUG_KERNEL=y CONFIG_DEBUG_FS=y -CONFIG_PPC_PTDUMP=y -CONFIG_MODULES=y -CONFIG_SPI=y -CONFIG_SPI_FSL_SPI=y -CONFIG_CRYPTO=y -CONFIG_CRYPTO_DEV_TALITOS=y -CONFIG_8xx_GPIO=y -CONFIG_WATCHDOG=y -CONFIG_8xxx_WDT=y -CONFIG_SMC_UCODE_PATCH=y -CONFIG_ADVANCED_OPTIONS=y -CONFIG_PIN_TLB=y -CONFIG_PERF_EVENTS=y -CONFIG_MATH_EMULATION=y -CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y -CONFIG_STRICT_KERNEL_RWX=y -CONFIG_BPF_JIT=y CONFIG_DEBUG_VM_PGTABLE=y +CONFIG_DETECT_HUNG_TASK=y CONFIG_BDI_SWITCH=y CONFIG_PPC_EARLY_DEBUG=y -CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008 +CONFIG_PPC_PTDUMP=y -- 2.32.0
[PATCH 1/2] powerpc/config: Fix IPV6 warning in mpc855_ads
When building this config there's a warning: 79:warning: override: reassigning to symbol IPV6 Commit 9a1762a4a4ff ("powerpc/8xx: Update mpc885_ads_defconfig to improve CI") added CONFIG_IPV6=y, but left '# CONFIG_IPV6 is not set' in. IPV6 is default y, so remove both to clean up the build. Signed-off-by: Joel Stanley --- arch/powerpc/configs/mpc885_ads_defconfig | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig index d21f266cea9a..5cd17adf903f 100644 --- a/arch/powerpc/configs/mpc885_ads_defconfig +++ b/arch/powerpc/configs/mpc885_ads_defconfig @@ -21,7 +21,6 @@ CONFIG_INET=y CONFIG_IP_MULTICAST=y CONFIG_IP_PNP=y CONFIG_SYN_COOKIES=y -# CONFIG_IPV6 is not set # CONFIG_FW_LOADER is not set CONFIG_MTD=y CONFIG_MTD_BLOCK=y @@ -76,7 +75,6 @@ CONFIG_PERF_EVENTS=y CONFIG_MATH_EMULATION=y CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y CONFIG_STRICT_KERNEL_RWX=y -CONFIG_IPV6=y CONFIG_BPF_JIT=y CONFIG_DEBUG_VM_PGTABLE=y CONFIG_BDI_SWITCH=y -- 2.32.0
Re: [PATCH 1/3] powerpc/64s: Fix build when !PPC_BARRIER_NOSPEC
Le 16/08/2021 à 10:24, Joel Stanley a écrit : When disabling PPC_BARRIER_NOSPEC the do_barrier_nospec_fixups_range definition is still included, as well as a stub in asm/setup.h: ../arch/powerpc/lib/feature-fixups.c:502:6: error: redefinition of ‘do_barrier_nospec_fixu> 502 | void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_en> | ^~ In file included from ../arch/powerpc/lib/feature-fixups.c:23: ../arch/powerpc/include/asm/setup.h:70:20: note: previous definition of ‘do_barrier_nospec> 70 | static inline void do_barrier_nospec_fixups_range(bool enable, void *start, void *> |^~ I assume the intent was to put the just do_barrier_nospec_fixups behind PPC_BARRIER_NOSPEC and let the compiler drop _range when there are no users. (There is a caller in module.c, but this is behind PPC_BARRIER_NOSPEC). The compiler won't drop do_barrier_nospec_fixups_range() because it is not static. This makes PPC_BOOK3S_64 match how the PPC_FSL_BOOK3E build works. Signed-off-by: Joel Stanley --- arch/powerpc/include/asm/setup.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h index 6c1a7d217d1a..71012284c044 100644 --- a/arch/powerpc/include/asm/setup.h +++ b/arch/powerpc/include/asm/setup.h @@ -66,8 +66,6 @@ extern bool barrier_nospec_enabled; #ifdef CONFIG_PPC_BARRIER_NOSPEC void do_barrier_nospec_fixups_range(bool enable, void *start, void *end); -#else -static inline void do_barrier_nospec_fixups_range(bool enable, void *start, void *end) { } #endif #ifdef CONFIG_PPC_FSL_BOOK3E
[PATCH 0/2] powerpc: mpc855_ads defconfig fixes
The first was a build warning I noticed when testing something unrelated. I took a moment to look into it, and came up with the second patch which updates the defconfig to make it easier to maintain in the future It also fixes a regression where the MTD partition support dropped out of the config. Given noone noticed the regression since v4.20 was released, perhaps it could be left disabled? Joel Stanley (2): powerpc/config: Fix IPV6 warning in mpc855_ads powerpc/configs: Regenerate mpc885_ads_defconfig arch/powerpc/configs/mpc885_ads_defconfig | 49 +++ 1 file changed, 23 insertions(+), 26 deletions(-) -- 2.32.0
[PATCH 3/3] powerpc/microwatt: CPU doesn't (yet) have speculation bugs
Signed-off-by: Joel Stanley --- arch/powerpc/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 663766fbf505..d5af6667c206 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -279,6 +279,7 @@ config PPC_BARRIER_NOSPEC bool default y depends on PPC_BOOK3S_64 || PPC_FSL_BOOK3E + depends on !PPC_MICROWATT config EARLY_PRINTK bool -- 2.32.0
[PATCH 2/3] powerpc: Fix undefined static key
When CONFIG_PPC_BARRIER_NOSPEC=n, security.c is not built leading to a missing definition of uaccess_flush_key. LD vmlinux.o MODPOST vmlinux.symvers MODINFO modules.builtin.modinfo GEN modules.builtin LD .tmp_vmlinux.kallsyms1 powerpc64le-linux-gnu-ld: arch/powerpc/kernel/align.o:(.toc+0x0): undefined reference to `uaccess_flush_key' powerpc64le-linux-gnu-ld: arch/powerpc/kernel/signal_64.o:(.toc+0x0): undefined reference to `uaccess_flush_key' powerpc64le-linux-gnu-ld: arch/powerpc/kernel/process.o:(.toc+0x0): undefined reference to `uaccess_flush_key' powerpc64le-linux-gnu-ld: arch/powerpc/kernel/traps.o:(.toc+0x0): undefined reference to `uaccess_flush_key' powerpc64le-linux-gnu-ld: arch/powerpc/kernel/hw_breakpoint_constraints.o:(.toc+0x0): undefined reference to `uaccess_flush_key' powerpc64le-linux-gnu-ld: arch/powerpc/kernel/ptrace/ptrace.o:(.toc+0x0): more undefined references to `uaccess_flush_key' follow make[1]: *** [Makefile:1176: vmlinux] Error 1 Hack one in to fix the build. Signed-off-by: Joel Stanley --- arch/powerpc/include/asm/security_features.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/powerpc/include/asm/security_features.h b/arch/powerpc/include/asm/security_features.h index 792eefaf230b..46ade7927a4c 100644 --- a/arch/powerpc/include/asm/security_features.h +++ b/arch/powerpc/include/asm/security_features.h @@ -39,6 +39,9 @@ static inline bool security_ftr_enabled(u64 feature) return !!(powerpc_security_features & feature); } +#ifndef CONFIG_PPC_BARRIER_NOSPEC +DEFINE_STATIC_KEY_FALSE(uaccess_flush_key); +#endif // Features indicating support for Spectre/Meltdown mitigations -- 2.32.0
[PATCH 1/3] powerpc/64s: Fix build when !PPC_BARRIER_NOSPEC
When disabling PPC_BARRIER_NOSPEC the do_barrier_nospec_fixups_range definition is still included, as well as a stub in asm/setup.h: ../arch/powerpc/lib/feature-fixups.c:502:6: error: redefinition of ‘do_barrier_nospec_fixu> 502 | void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_en> | ^~ In file included from ../arch/powerpc/lib/feature-fixups.c:23: ../arch/powerpc/include/asm/setup.h:70:20: note: previous definition of ‘do_barrier_nospec> 70 | static inline void do_barrier_nospec_fixups_range(bool enable, void *start, void *> |^~ I assume the intent was to put the just do_barrier_nospec_fixups behind PPC_BARRIER_NOSPEC and let the compiler drop _range when there are no users. (There is a caller in module.c, but this is behind PPC_BARRIER_NOSPEC). This makes PPC_BOOK3S_64 match how the PPC_FSL_BOOK3E build works. Signed-off-by: Joel Stanley --- arch/powerpc/include/asm/setup.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h index 6c1a7d217d1a..71012284c044 100644 --- a/arch/powerpc/include/asm/setup.h +++ b/arch/powerpc/include/asm/setup.h @@ -66,8 +66,6 @@ extern bool barrier_nospec_enabled; #ifdef CONFIG_PPC_BARRIER_NOSPEC void do_barrier_nospec_fixups_range(bool enable, void *start, void *end); -#else -static inline void do_barrier_nospec_fixups_range(bool enable, void *start, void *end) { } #endif #ifdef CONFIG_PPC_FSL_BOOK3E -- 2.32.0
[PATCH 0/3] powerpc/64s: Fix PPC_BARRIER_NOSPEC=n
When disabling PPC_BARRIER_NOSPEC on Microwatt to see if it improved boot time, I discovered the build was broken (first patch). This got worse between when I first tried and now (second patch). The third patch disables PPC_BARRIER_NOSPEC when building for Microwatt. This one is optional, as it doesn't seem to change boot speed with the current Microwatt design on an Arty. Joel Stanley (3): powerpc/64s: Fix build when !PPC_BARRIER_NOSPEC powerpc: Fix undefined static key powerpc/microwatt: CPU doesn't (yet) have speculation bugs arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/security_features.h | 3 +++ arch/powerpc/include/asm/setup.h | 2 -- 3 files changed, 4 insertions(+), 2 deletions(-) -- 2.32.0
[PATCH v1 4/4] powerpc/64s/interrupt: avoid saving CFAR in some asynchronous interrupts
Reading the CFAR register is quite costly (~20 cycles on POWER9). It is a good idea to have for most synchronous interrupts, but for async ones it is much less important. Doorbell, external, and decrementer interrupts are the important asynchronous ones. HV interrupts can't skip CFAR if KVM HV is possible, because it might be a guest exit that requires CFAR preserved. But for now the important pseries interrupts can avoid loading CFAR. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 63 1 file changed, 63 insertions(+) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 69a472c38f62..42badd7beaf0 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -111,6 +111,8 @@ name: #define IAREA .L_IAREA_\name\() /* PACA save area */ #define IVIRT .L_IVIRT_\name\() /* Has virt mode entry point */ #define IISIDE .L_IISIDE_\name\() /* Uses SRR0/1 not DAR/DSISR */ +#define ICFAR .L_ICFAR_\name\() /* Uses CFAR */ +#define ICFAR_IF_HVMODE.L_ICFAR_IF_HVMODE_\name\() /* Uses CFAR if HV */ #define IDAR .L_IDAR_\name\()/* Uses DAR (or SRR0) */ #define IDSISR .L_IDSISR_\name\() /* Uses DSISR (or SRR1) */ #define IBRANCH_TO_COMMON .L_IBRANCH_TO_COMMON_\name\() /* ENTRY branch to common */ @@ -150,6 +152,12 @@ do_define_int n .ifndef IISIDE IISIDE=0 .endif + .ifndef ICFAR + ICFAR=1 + .endif + .ifndef ICFAR_IF_HVMODE + ICFAR_IF_HVMODE=0 + .endif .ifndef IDAR IDAR=0 .endif @@ -287,9 +295,21 @@ BEGIN_FTR_SECTION END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) HMT_MEDIUM std r10,IAREA+EX_R10(r13) /* save r10 - r12 */ + .if ICFAR BEGIN_FTR_SECTION mfspr r10,SPRN_CFAR END_FTR_SECTION_IFSET(CPU_FTR_CFAR) + .elseif ICFAR_IF_HVMODE +BEGIN_FTR_SECTION + BEGIN_FTR_SECTION_NESTED(69) + mfspr r10,SPRN_CFAR + END_FTR_SECTION_NESTED(CPU_FTR_CFAR, CPU_FTR_CFAR, 69) +FTR_SECTION_ELSE + BEGIN_FTR_SECTION_NESTED(69) + li r10,0 + END_FTR_SECTION_NESTED(CPU_FTR_CFAR, CPU_FTR_CFAR, 69) +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) + .endif .if \ool .if !\virt b tramp_real_\name @@ -305,9 +325,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR) BEGIN_FTR_SECTION std r9,IAREA+EX_PPR(r13) END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) + .if ICFAR || ICFAR_IF_HVMODE BEGIN_FTR_SECTION std r10,IAREA+EX_CFAR(r13) END_FTR_SECTION_IFSET(CPU_FTR_CFAR) + .endif INTERRUPT_TO_KERNEL mfctr r10 std r10,IAREA+EX_CTR(r13) @@ -559,7 +581,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) .endif BEGIN_FTR_SECTION + .if ICFAR || ICFAR_IF_HVMODE ld r10,IAREA+EX_CFAR(r13) + .else + li r10,0 + .endif std r10,ORIG_GPR3(r1) END_FTR_SECTION_IFSET(CPU_FTR_CFAR) ld r10,IAREA+EX_CTR(r13) @@ -1501,6 +1527,12 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX) * * If soft masked, the masked handler will note the pending interrupt for * replay, and clear MSR[EE] in the interrupted context. + * + * CFAR is not required because this is an asynchronous interrupt that in + * general won't have much bearing on the state of the CPU, with the possible + * exception of crash/debug IPIs, but those are generally moving to use SRESET + * IPIs. Unless this is an HV interrupt and KVM HV is possible, in which case + * it may be exiting the guest and need CFAR to be saved. */ INT_DEFINE_BEGIN(hardware_interrupt) IVEC=0x500 @@ -1508,6 +1540,10 @@ INT_DEFINE_BEGIN(hardware_interrupt) IMASK=IRQS_DISABLED IKVM_REAL=1 IKVM_VIRT=1 + ICFAR=0 +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE + ICFAR_IF_HVMODE=1 +#endif INT_DEFINE_END(hardware_interrupt) EXC_REAL_BEGIN(hardware_interrupt, 0x500, 0x100) @@ -1726,6 +1762,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM) * If PPC_WATCHDOG is configured, the soft masked handler will actually set * things back up to run soft_nmi_interrupt as a regular interrupt handler * on the emergency stack. + * + * CFAR is not required because this is asynchronous (see hardware_interrupt). + * A watchdog interrupt may like to have CFAR, but usually the interesting + * branch is long gone by that point (e.g., infinite loop). */ INT_DEFINE_BEGIN(decrementer) IVEC=0x900 @@ -1733,6 +1773,7 @@ INT_DEFINE_BEGIN(decrementer) #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE IKVM_REAL=1 #endif + ICFAR=0 INT_DEFINE_END(decrementer) EXC_REAL_BEGIN(decrementer, 0x900, 0x80) @@ -1808,6 +1849,8 @@ EXC_COMMON_BEGIN(hdecrementer_common) * If soft masked, the masked handler will note the pending interrupt for *
[PATCH v1 3/4] powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless perf is in use
Enabling MSR[EE] in interrupt handlers while interrupts are still soft masked allows PMIs to profile interrupt handlers to some degree, beyond what SIAR latching allows. When perf is not being used, this is almost useless work. It requires an extra mtmsrd in the irq handler, and it also opens the door to masked interrupts hitting and requiring replay, which is more expensive than just taking them directly. This effect can be noticable in high IRQ workloads. Avoid enabling MSR[EE] unless perf is currently in use. This saves about 60 cycles (or 8%) on a simple decrementer interrupt microbenchmark. Replayed interrupts drop from 1.4% of interrupts to 0.003%. This does prevent the soft-nmi interrupt being taken in these handlers, but that's not too reliable anyway. The SMP watchdog will continue to be the reliable way to catch lockups. Cc: Madhavan Srinivasan Cc: Athira Rajeev Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/hw_irq.h | 47 +-- arch/powerpc/kernel/dbell.c | 3 +- arch/powerpc/kernel/irq.c | 3 +- arch/powerpc/kernel/time.c| 30 ++-- 4 files changed, 57 insertions(+), 26 deletions(-) diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index 2d5c0d3ccbb6..e6644509c7af 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -309,17 +309,46 @@ static inline bool lazy_irq_pending_nocheck(void) bool power_pmu_running(void); /* - * This is called by asynchronous interrupts to conditionally - * re-enable hard interrupts after having cleared the source - * of the interrupt. They are kept disabled if there is a different - * soft-masked interrupt pending that requires hard masking. + * This is called by asynchronous interrupts to check whether to + * conditionally re-enable hard interrupts after having cleared + * the source of the interrupt. They are kept disabled if there + * is a different soft-masked interrupt pending that requires hard + * masking. */ -static inline void may_hard_irq_enable(void) +static inline bool may_hard_irq_enable(void) { - if (!(get_paca()->irq_happened & PACA_IRQ_MUST_HARD_MASK)) { - get_paca()->irq_happened &= ~PACA_IRQ_HARD_DIS; - __hard_irq_enable(); - } +#ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG + BUG_ON(mfmsr() & MSR_EE); +#endif +#ifdef CONFIG_PERF_EVENTS + if (!power_pmu_running()) + return false; + + if (get_paca()->irq_happened & PACA_IRQ_MUST_HARD_MASK) + return false; + + return true; +#else + return false; +#endif +} + +/* + * Do the hard enabling, only call this if may_hard_irq_enable is true. + */ +static inline void do_hard_irq_enable(void) +{ +#ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG + WARN_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED); + WARN_ON(get_paca()->irq_happened & PACA_IRQ_MUST_HARD_MASK); + BUG_ON(mfmsr() & MSR_EE); +#endif + /* +* This allows PMI interrupts (and watchdog soft-NMIs) through. +* There is no other reason to enable this way. +*/ + get_paca()->irq_happened &= ~PACA_IRQ_HARD_DIS; + __hard_irq_enable(); } static inline bool arch_irq_disabled_regs(struct pt_regs *regs) diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c index 5545c9cd17c1..0edeb5e9fede 100644 --- a/arch/powerpc/kernel/dbell.c +++ b/arch/powerpc/kernel/dbell.c @@ -27,7 +27,8 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(doorbell_exception) ppc_msgsync(); - may_hard_irq_enable(); + if (may_hard_irq_enable()) + do_hard_irq_enable(); kvmppc_clear_host_ipi(smp_processor_id()); __this_cpu_inc(irq_stat.doorbell_irqs); diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index 551b653228c4..745becbcd1ad 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -739,7 +739,8 @@ void __do_irq(struct pt_regs *regs) irq = ppc_md.get_irq(); /* We can hard enable interrupts now to allow perf interrupts */ - may_hard_irq_enable(); + if (may_hard_irq_enable()) + do_hard_irq_enable(); /* And finally process it */ if (unlikely(!irq)) diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index c487ba5a6e11..ac67ec57f129 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -567,22 +567,22 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt) return; } - /* Ensure a positive value is written to the decrementer, or else -* some CPUs will continue to take decrementer exceptions. When the -* PPC_WATCHDOG (decrementer based) is configured, keep this at most -* 31 bits, which is about 4 seconds on most systems, which gives -* the watchdog a chance of catching timer interrupt hard lockups. -*/ - if
[PATCH v1 2/4] powerpc/64s/perf: add power_pmu_running to query whether perf is being used
Interrupt handling code would like to know whether perf is enabled, to know whether it should enable MSR[EE] to improve PMI coverage. Cc: Madhavan Srinivasan Cc: Athira Rajeev Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/hw_irq.h | 2 ++ arch/powerpc/perf/core-book3s.c | 13 + 2 files changed, 15 insertions(+) diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index 21cc571ea9c2..2d5c0d3ccbb6 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -306,6 +306,8 @@ static inline bool lazy_irq_pending_nocheck(void) return __lazy_irq_pending(local_paca->irq_happened); } +bool power_pmu_running(void); + /* * This is called by asynchronous interrupts to conditionally * re-enable hard interrupts after having cleared the source diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c index bb0ee716de91..76114a9afb2b 100644 --- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -2380,6 +2380,19 @@ static void perf_event_interrupt(struct pt_regs *regs) perf_sample_event_took(sched_clock() - start_clock); } +bool power_pmu_running(void) +{ + struct cpu_hw_events *cpuhw; + + /* Could this simply test local_paca->pmcregs_in_use? */ + + if (!ppmu) + return false; + + cpuhw = this_cpu_ptr(_hw_events); + return cpuhw->n_events; +} + static int power_pmu_prepare_cpu(unsigned int cpu) { struct cpu_hw_events *cpuhw = _cpu(cpu_hw_events, cpu); -- 2.23.0
[PATCH v1 0/4] powerpc/64s: interrupt speedups
Here's a few stragglers. The first patch was submitted already but had some bugs with unrecoverable exceptions on HPT (current->blah being accessed before MSR[RI] was enabled). Those should be fixed now. The others are generally for helping asynch interrupts, which are a bit harder to measure well but important for IO and IPIs. After this series, the SPR accesses of the interrupt handlers for radix are becoming pretty optimal except for PPR which we could improve on, and virt CPU accounting which is very costly -- we might disable that by default unless someone comes up with a good reason to keep it. Thanks, Nick Nicholas Piggin (4): powerpc/64: handle MSR EE and RI in interrupt entry wrapper powerpc/64s/perf: add power_pmu_running to query whether perf is being used powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless perf is in use powerpc/64s/interrupt: avoid saving CFAR in some asynchronous interrupts arch/powerpc/include/asm/hw_irq.h| 49 --- arch/powerpc/include/asm/interrupt.h | 31 -- arch/powerpc/kernel/dbell.c | 3 +- arch/powerpc/kernel/exceptions-64s.S | 93 +++- arch/powerpc/kernel/fpu.S| 5 ++ arch/powerpc/kernel/irq.c| 3 +- arch/powerpc/kernel/time.c | 30 - arch/powerpc/kernel/vector.S | 8 +++ arch/powerpc/perf/core-book3s.c | 13 9 files changed, 175 insertions(+), 60 deletions(-) -- 2.23.0
[PATCH v1 1/4] powerpc/64: handle MSR EE and RI in interrupt entry wrapper
Similarly to the system call change in the previous patch, the mtmsrd to enable RI can be combined with the mtmsrd to enable EE for interrupts which enable the latter, which tends to be the important synchronous interrupts (i.e., page faults). Do this by enabling EE and RI together at the beginning of the entry wrapper if PACA_IRQ_HARD_DIS is clear, and just enabling RI if it is set (which means something wanted EE=0). Asynchronous interrupts set PACA_IRQ_HARD_DIS, but synchronous ones leave it unchanged, so by default they always get EE=1 unless they interrupt a caller that has hard disabled. When the sync interrupt later calls interrupt_cond_local_irq_enable(), that will not require another mtmsrd because we already enabled here. 64e is conceptually unchanged, but it also sets MSR[EE]=1 now in the interrupt wrapper for synchronous interrupts with the same code. On 64s, saves one mtmsrd L=1 for synchronous interrupts on 64s, which saves about 20 cycles. For kernel-mode interrupts, both synchronous and asynchronous, this saves an additional ~40 cycles due to the mtmsrd being moved ahead of mfspr SPRN_AMR, which prevents a SPR scoreboard stall (on POWER9). Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/interrupt.h | 31 arch/powerpc/kernel/exceptions-64s.S | 30 --- arch/powerpc/kernel/fpu.S| 5 + arch/powerpc/kernel/vector.S | 8 +++ 4 files changed, 40 insertions(+), 34 deletions(-) diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h index 6b800d3e2681..e3228a911b35 100644 --- a/arch/powerpc/include/asm/interrupt.h +++ b/arch/powerpc/include/asm/interrupt.h @@ -148,9 +148,21 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup #endif #ifdef CONFIG_PPC64 - if (irq_soft_mask_set_return(IRQS_ALL_DISABLED) == IRQS_ENABLED) + bool trace_enable = false; + + if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS)) { + if (irq_soft_mask_set_return(IRQS_DISABLED) == IRQS_ENABLED) + trace_enable = true; + } else { + irq_soft_mask_set(IRQS_DISABLED); + } + /* If the interrupt was taken with HARD_DIS set, don't enable MSR[EE] */ + if (local_paca->irq_happened & PACA_IRQ_HARD_DIS) + __hard_RI_enable(); + else + __hard_irq_enable(); + if (trace_enable) trace_hardirqs_off(); - local_paca->irq_happened |= PACA_IRQ_HARD_DIS; if (user_mode(regs)) { CT_WARN_ON(ct_state() != CONTEXT_USER); @@ -200,13 +212,20 @@ static inline void interrupt_exit_prepare(struct pt_regs *regs, struct interrupt static inline void interrupt_async_enter_prepare(struct pt_regs *regs, struct interrupt_state *state) { +#ifdef CONFIG_PPC64 + /* Ensure interrupt_enter_prepare does not enable MSR[EE] */ + local_paca->irq_happened |= PACA_IRQ_HARD_DIS; +#endif + interrupt_enter_prepare(regs, state); #ifdef CONFIG_PPC_BOOK3S_64 + /* +* MSR[RI] is only enabled after interrupt_enter_prepare, so this +* thread flags access has to come afterward. +*/ if (cpu_has_feature(CPU_FTR_CTRL) && !test_thread_local_flags(_TLF_RUNLATCH)) __ppc64_runlatch_on(); #endif - - interrupt_enter_prepare(regs, state); irq_enter(); } @@ -273,6 +292,8 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG)) BUG_ON(!arch_irq_disabled_regs(regs) && !(regs->msr & MSR_EE)); + __hard_RI_enable(); + /* Don't do any per-CPU operations until interrupt state is fixed */ if (nmi_disables_ftrace(regs)) { @@ -370,6 +391,8 @@ interrupt_handler long func(struct pt_regs *regs) \ { \ long ret; \ \ + __hard_RI_enable(); \ + \ ret = ##func (regs);\ \ return ret; \ diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 4aec59a77d4c..69a472c38f62 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -113,7 +113,6 @@ name: #define IISIDE .L_IISIDE_\name\() /* Uses SRR0/1 not DAR/DSISR */ #define IDAR .L_IDAR_\name\()/* Uses DAR (or SRR0) */ #define IDSISR .L_IDSISR_\name\() /* Uses DSISR (or
Re: [RFC PATCH] powerpc/book3s64/radix: Upgrade va tlbie to PID tlbie if we cross PMD_SIZE
"Aneesh Kumar K.V" writes: > On 8/12/21 6:19 PM, Michael Ellerman wrote: >> "Puvichakravarthy Ramachandran" writes: With shared mapping, even though we are unmapping a large range, the kernel will force a TLB flush with ptl lock held to avoid the race mentioned in commit 1cf35d47712d ("mm: split 'tlb_flush_mmu()' into tlb flushing and memory freeing parts") This results in the kernel issuing a high number of TLB flushes even for a large range. This can be improved by making sure the kernel switch to pid based flush if the kernel is unmapping a 2M range. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/mm/book3s64/radix_tlb.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c > b/arch/powerpc/mm/book3s64/radix_tlb.c index aefc100d79a7..21d0f098e43b 100644 --- a/arch/powerpc/mm/book3s64/radix_tlb.c +++ b/arch/powerpc/mm/book3s64/radix_tlb.c @@ -1106,7 +1106,7 @@ EXPORT_SYMBOL(radix__flush_tlb_kernel_range); * invalidating a full PID, so it has a far lower threshold to change > from * individual page flushes to full-pid flushes. */ -static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33; +static unsigned long tlb_single_page_flush_ceiling __read_mostly = 32; static unsigned long tlb_local_single_page_flush_ceiling __read_mostly > = POWER9_TLB_SETS_RADIX * 2; static inline void __radix__flush_tlb_range(struct mm_struct *mm, @@ -1133,7 +1133,7 @@ static inline void __radix__flush_tlb_range(struct > mm_struct *mm, if (fullmm) flush_pid = true; else if (type == FLUSH_TYPE_GLOBAL) - flush_pid = nr_pages > tlb_single_page_flush_ceiling; + flush_pid = nr_pages >= tlb_single_page_flush_ceiling; else flush_pid = nr_pages > tlb_local_single_page_flush_ceiling; >>> >>> Additional details on the test environment. This was tested on a 2 Node/8 >>> socket Power10 system. >>> The LPAR had 105 cores and the LPAR spanned across all the sockets. >>> >>> # perf stat -I 1000 -a -e cycles,instructions -e >>> "{cpu/config=0x030008,name=PM_EXEC_STALL/}" -e >>> "{cpu/config=0x02E01C,name=PM_EXEC_STALL_TLBIE/}" ./tlbie -i 10 -c 1 -t 1 >>> Rate of work: = 176 >>> # time counts unit events >>> 1.029206442 4198594519 cycles >>> 1.029206442 2458254252 instructions # 0.59 >>> insn per cycle >>> 1.029206442 3004031488 PM_EXEC_STALL >>> 1.029206442 1798186036 PM_EXEC_STALL_TLBIE >>> Rate of work: = 181 >>> 2.054288539 4183883450 cycles >>> 2.054288539 2472178171 instructions # 0.59 >>> insn per cycle >>> 2.054288539 3014609313 PM_EXEC_STALL >>> 2.054288539 1797851642 PM_EXEC_STALL_TLBIE >>> Rate of work: = 180 >>> 3.078306883 4171250717 cycles >>> 3.078306883 2468341094 instructions # 0.59 >>> insn per cycle >>> 3.078306883 2993036205 PM_EXEC_STALL >>> 3.078306883 1798181890 PM_EXEC_STALL_TLBIE >>> . >>> . >>> >>> # cat /sys/kernel/debug/powerpc/tlb_single_page_flush_ceiling >>> 34 >>> >>> # echo 32 > /sys/kernel/debug/powerpc/tlb_single_page_flush_ceiling >>> >>> # perf stat -I 1000 -a -e cycles,instructions -e >>> "{cpu/config=0x030008,name=PM_EXEC_STALL/}" -e >>> "{cpu/config=0x02E01C,name=PM_EXEC_STALL_TLBIE/}" ./tlbie -i 10 -c 1 -t 1 >>> Rate of work: = 313 >>> # time counts unit events >>> 1.030310506 4206071143 cycles >>> 1.030310506 4314716958 instructions # 1.03 >>> insn per cycle >>> 1.030310506 2157762167 PM_EXEC_STALL >>> 1.030310506 110825573 PM_EXEC_STALL_TLBIE >>> Rate of work: = 322 >>> 2.056034068 4331745630 cycles >>> 2.056034068 4531658304 instructions # 1.05 >>> insn per cycle >>> 2.056034068 2288971361 PM_EXEC_STALL >>> 2.056034068 111267927 PM_EXEC_STALL_TLBIE >>> Rate of work: = 321 >>> 3.081216434 4327050349 cycles >>> 3.081216434 4379679508 instructions # 1.01 >>> insn per cycle >>> 3.081216434 2252602550 PM_EXEC_STALL >>> 3.081216434 110974887 PM_EXEC_STALL_TLBIE >> >> >> What is the tlbie test actually doing? >> >> Does it do anything to measure the cost of refilling after the full mm flush? > > That is essentially > > for () > { >shmat() >fillshm() >shmdt() > > } > > for a 256MB range. So it is not really a fair benchmark because it >
Re: [PATCH v2 2/2] powerpc/perf: Return regs->nip as instruction pointer value when SIAR is 0
Le 16/08/2021 à 08:44, kajoljain a écrit : On 8/14/21 6:14 PM, Michael Ellerman wrote: Christophe Leroy writes: Le 13/08/2021 à 10:24, Kajol Jain a écrit : Incase of random sampling, there can be scenarios where SIAR is not latching sample address and results in 0 value. Since current code directly returning the siar value, we could see multiple instruction pointer values as 0 in perf report. Can you please give more detail on that? What scenarios? On what CPUs? Hi Michael, Sure I will update these details in my next patch-set. Patch resolves this issue by adding a ternary condition to return regs->nip incase SIAR is 0. Your description seems rather similar to https://github.com/linuxppc/linux/commit/2ca13a4cc56c920a6c9fc8ee45d02bccacd7f46c Does it mean that the problem occurs on more than the power10 DD1 ? In that case, can the solution be common instead of doing something for power10 DD1 and something for others ? Agreed. This change would seem to make that P10 DD1 logic superfluous. Also we already have a fallback to regs->nip in the else case of the if, so we should just use that rather than adding a ternary condition. eg. if (use_siar && siar_valid(regs) && siar) return siar + perf_ip_adjust(regs); else if (use_siar) return 0; // no valid instruction pointer else return regs->nip; I'm also not sure why we have that return 0 case, I can't think of why we'd ever want to do that rather than using nip. So maybe we should do another patch to drop that case. Yeah make sense. I will remove return 0 case in my next version. This was added by commit https://github.com/linuxppc/linux/commit/e6878835ac4794f25385522d29c634b7bbb7cca9 Are we sure it was an error to add it and it can be removed ? Christophe
Re: [PATCH v2 2/2] powerpc/perf: Return regs->nip as instruction pointer value when SIAR is 0
On 8/13/21 3:04 PM, Christophe Leroy wrote: > > > Le 13/08/2021 à 10:24, Kajol Jain a écrit : >> Incase of random sampling, there can be scenarios where SIAR is not >> latching sample address and results in 0 value. Since current code >> directly returning the siar value, we could see multiple instruction >> pointer values as 0 in perf report. >> Patch resolves this issue by adding a ternary condition to return >> regs->nip incase SIAR is 0. > > Your description seems rather similar to > https://github.com/linuxppc/linux/commit/2ca13a4cc56c920a6c9fc8ee45d02bccacd7f46c > > Does it mean that the problem occurs on more than the power10 DD1 ? > > In that case, can the solution be common instead of doing something for > power10 DD1 and something for others ? Hi Christophe, Yes its better to have common check. I will make these updates. Thanks, Kajol Jain > >> >> Fixes: 75382aa72f06 ("powerpc/perf: Move code to select SIAR or pt_regs >> into perf_read_regs") >> Signed-off-by: Kajol Jain >> --- >> arch/powerpc/perf/core-book3s.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/arch/powerpc/perf/core-book3s.c >> b/arch/powerpc/perf/core-book3s.c >> index 1b464aad29c4..aeecaaf6810f 100644 >> --- a/arch/powerpc/perf/core-book3s.c >> +++ b/arch/powerpc/perf/core-book3s.c >> @@ -2260,7 +2260,7 @@ unsigned long perf_instruction_pointer(struct pt_regs >> *regs) >> else >> return regs->nip; >> } else if (use_siar && siar_valid(regs)) >> - return siar + perf_ip_adjust(regs); >> + return siar ? siar + perf_ip_adjust(regs) : regs->nip; >> else if (use_siar) >> return 0; // no valid instruction pointer >> else >>
Re: [PATCH v2 2/2] powerpc/perf: Return regs->nip as instruction pointer value when SIAR is 0
On 8/14/21 6:14 PM, Michael Ellerman wrote: > Christophe Leroy writes: >> Le 13/08/2021 à 10:24, Kajol Jain a écrit : >>> Incase of random sampling, there can be scenarios where SIAR is not >>> latching sample address and results in 0 value. Since current code >>> directly returning the siar value, we could see multiple instruction >>> pointer values as 0 in perf report. > > Can you please give more detail on that? What scenarios? On what CPUs? > Hi Michael, Sure I will update these details in my next patch-set. >>> Patch resolves this issue by adding a ternary condition to return >>> regs->nip incase SIAR is 0. >> >> Your description seems rather similar to >> https://github.com/linuxppc/linux/commit/2ca13a4cc56c920a6c9fc8ee45d02bccacd7f46c >> >> Does it mean that the problem occurs on more than the power10 DD1 ? >> >> In that case, can the solution be common instead of doing something for >> power10 DD1 and something >> for others ? > > Agreed. > > This change would seem to make that P10 DD1 logic superfluous. > > Also we already have a fallback to regs->nip in the else case of the if, > so we should just use that rather than adding a ternary condition. > > eg. > > if (use_siar && siar_valid(regs) && siar) > return siar + perf_ip_adjust(regs); > else if (use_siar) > return 0; // no valid instruction pointer > else > return regs->nip; > > > I'm also not sure why we have that return 0 case, I can't think of why > we'd ever want to do that rather than using nip. So maybe we should do > another patch to drop that case. Yeah make sense. I will remove return 0 case in my next version. Thanks, Kajol Jain > > cheers >