[PATCH v5 04/10] libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier
Architectures like ppc64 provide persistent-memory-specific barriers which
ensure that all stores whose modifications were written to persistent storage
by preceding dcbfps and dcbstps instructions have updated persistent storage
before any data access or data transfer caused by subsequent instructions is
initiated. This is in addition to the ordering done by wmb().

Update the nvdimm core so that an architecture can use a barrier other than
wmb() to ensure all previous writes are architecturally visible for the
platform buffer flush.

Signed-off-by: Aneesh Kumar K.V
---
 drivers/md/dm-writecache.c   | 2 +-
 drivers/nvdimm/region_devs.c | 8
 include/linux/libnvdimm.h    | 4
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 613c171b1b6d..904fdbf2b089 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -540,7 +540,7 @@ static void ssd_commit_superblock(struct dm_writecache *wc)
 static void writecache_commit_flushed(struct dm_writecache *wc, bool wait_for_ios)
 {
 	if (WC_MODE_PMEM(wc))
-		wmb();
+		arch_pmem_flush_barrier();
 	else
 		ssd_commit_flushed(wc, wait_for_ios);
 }
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index ccbb5b43b8b2..88ea34a9c7fd 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -1216,13 +1216,13 @@ int generic_nvdimm_flush(struct nd_region *nd_region)
 	idx = this_cpu_add_return(flush_idx, hash_32(current->pid + idx, 8));
 
 	/*
-	 * The first wmb() is needed to 'sfence' all previous writes
-	 * such that they are architecturally visible for the platform
-	 * buffer flush. Note that we've already arranged for pmem
+	 * The first arch_pmem_flush_barrier() is needed to 'sfence' all
+	 * previous writes such that they are architecturally visible for
+	 * the platform buffer flush. Note that we've already arranged for pmem
 	 * writes to avoid the cache via memcpy_flushcache(). The final
 	 * wmb() ensures ordering for the NVDIMM flush write.
 	 */
-	wmb();
+	arch_pmem_flush_barrier();
 	for (i = 0; i < nd_region->ndr_mappings; i++)
 		if (ndrd_get_flush_wpq(ndrd, i, 0))
 			writeq(1, ndrd_get_flush_wpq(ndrd, i, idx));
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index 18da4059be09..66f6c65bd789 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -286,4 +286,8 @@ static inline void arch_invalidate_pmem(void *addr, size_t size)
 }
 #endif
 
+#ifndef arch_pmem_flush_barrier
+#define arch_pmem_flush_barrier() wmb()
+#endif
+
 #endif /* __LIBNVDIMM_H__ */
-- 
2.26.2
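The override mechanism in this patch (a generic `#ifndef` default that an architecture header can pre-empt by defining a macro of the same name) can be tried outside the kernel. The following is only a userspace sketch: the counters, `generic_wmb()`, `old_flush_region()`, `new_flush_region()` and `flush_barrier_demo()` are hypothetical stand-ins invented for illustration, and no real barrier is issued.

```c
#include <assert.h>

/* Hypothetical stand-ins: count calls instead of issuing real barriers. */
static int generic_barrier_calls;
static int arch_barrier_calls;

static inline void generic_wmb(void)
{
	generic_barrier_calls++;
}

/* "Arch header": provide the function and a macro of the same name. */
#define arch_pmem_flush_barrier arch_pmem_flush_barrier
static inline void arch_pmem_flush_barrier(void)
{
	arch_barrier_calls++;
}

/* "Generic header": fall back to wmb() only if the arch defined nothing. */
#ifndef arch_pmem_flush_barrier
#define arch_pmem_flush_barrier() generic_wmb()
#endif

/* Caller before the patch: the barrier is hard-coded. */
static void old_flush_region(void)
{
	generic_wmb();
}

/* Caller after the patch: the barrier is whatever the arch selected. */
static void new_flush_region(void)
{
	arch_pmem_flush_barrier();
}

/* Self-check: each caller must reach the barrier it is expected to use. */
static void flush_barrier_demo(void)
{
	old_flush_region();
	new_flush_region();
	assert(generic_barrier_calls == 1);
	assert(arch_barrier_calls == 1);
}
```

Removing the two "arch header" definitions makes `new_flush_region()` fall back to `generic_wmb()`, which is the behaviour every architecture other than the overriding one would keep.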
[PATCH v5 01/10] powerpc/pmem: Restrict papr_scm to P8 and above.
The PAPR based virtualized persistent memory devices are only supported on
POWER9 and above. In the followup patch, the kernel will switch the persistent
memory cache flush functions to use a new `dcbf` variant instruction. The new
instructions, even though they were added in ISA 3.1, also work on P8 and P9
because they are implemented as variants of the existing `dcbf` and `hwsync`
and behave as such on P8 and P9.

Considering these devices are only supported on P8 and above, update the driver
to prevent a P7-compat guest from using persistent memory devices.

We don't update the of_pmem driver with the same condition because, on
bare-metal, the firmware enables pmem support only on P9 and above. There the
kernel depends on OPAL firmware to restrict exposing persistent memory related
device tree entries on older hardware. of_pmem.ko is written without any arch
dependency and we don't want to add a ppc64 specific cpu feature check to the
of_pmem driver.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/platforms/pseries/pmem.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/pmem.c b/arch/powerpc/platforms/pseries/pmem.c
index f860a897a9e0..2347e1038f58 100644
--- a/arch/powerpc/platforms/pseries/pmem.c
+++ b/arch/powerpc/platforms/pseries/pmem.c
@@ -147,6 +147,12 @@ const struct of_device_id drc_pmem_match[] = {
 
 static int pseries_pmem_init(void)
 {
+	/*
+	 * Only supported on POWER8 and above.
+	 */
+	if (!cpu_has_feature(CPU_FTR_ARCH_207S))
+		return 0;
+
 	pmem_node = of_find_node_by_type(NULL, "ibm,persistent-memory");
 	if (!pmem_node)
 		return 0;
-- 
2.26.2
[PATCH v5 10/10] powerpc/pmem: Initialize pmem device on newer hardware
With the kernel now supporting the new pmem flush/sync instructions, we can
enable the kernel to initialize the device. On P10 these devices appear with
a new compatible string. For the PAPR device the compatible string is
"ibm,pmemory-v2" and for the OF pmem device it is "pmem-region-v2".

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/platforms/pseries/papr_scm.c | 1 +
 drivers/nvdimm/of_pmem.c                  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index b970d2dbe589..3efd827fe0ac 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -498,6 +498,7 @@ static int papr_scm_remove(struct platform_device *pdev)
 
 static const struct of_device_id papr_scm_match[] = {
 	{ .compatible = "ibm,pmemory" },
+	{ .compatible = "ibm,pmemory-v2" },
 	{ },
 };
 
diff --git a/drivers/nvdimm/of_pmem.c b/drivers/nvdimm/of_pmem.c
index a6cc3488e552..1e1585ab07c7 100644
--- a/drivers/nvdimm/of_pmem.c
+++ b/drivers/nvdimm/of_pmem.c
@@ -97,6 +97,7 @@ static int of_pmem_region_remove(struct platform_device *pdev)
 
 static const struct of_device_id of_pmem_region_match[] = {
 	{ .compatible = "pmem-region" },
+	{ .compatible = "pmem-region-v2" },
 	{ },
 };
 
-- 
2.26.2
[PATCH v5 09/10] powerpc/pmem: Disable synchronous fault by default
This adds a kernel config option that controls whether MAP_SYNC is enabled by
default. With POWER10, the architecture is adding new pmem flush and sync
instructions. The kernel should prevent the usage of MAP_SYNC if applications
are not using the new instructions on newer hardware.

This config allows the user to control whether MAP_SYNC should be enabled by
default or not.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/platforms/Kconfig.cputype    |  9 +
 arch/powerpc/platforms/pseries/papr_scm.c | 17 -
 drivers/nvdimm/of_pmem.c                  |  7 +++
 3 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index d349603fb889..abcc163b8dc6 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -383,6 +383,15 @@ config PPC_KUEP
 
 	  If you're unsure, say Y.
 
+config ARCH_MAP_SYNC_DISABLE
+	bool "Disable synchronous fault support (MAP_SYNC)"
+	default y
+	help
+	  Disable support for synchronous fault with nvdimm namespaces.
+
+	  If you're unsure, say Y.
+
+
 config PPC_HAVE_KUAP
 	bool
 
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index ad506e7003c9..b970d2dbe589 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -30,6 +30,7 @@ struct papr_scm_priv {
 	uint64_t block_size;
 	int metadata_size;
 	bool is_volatile;
+	bool disable_map_sync;
 
 	uint64_t bound_addr;
 
@@ -353,11 +354,18 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
 	ndr_desc.num_mappings = 1;
 	ndr_desc.nd_set = &p->nd_set;
 	ndr_desc.flush = papr_scm_flush_sync;
+	set_bit(ND_REGION_SYNC_ENABLED, &ndr_desc.flags);
 
 	if (p->is_volatile)
 		p->region = nvdimm_volatile_region_create(p->bus, &ndr_desc);
 	else {
 		set_bit(ND_REGION_PERSIST_MEMCTRL, &ndr_desc.flags);
+		/*
+		 * for a persistent region, check if the platform needs to
+		 * force MAP_SYNC disable.
+		 */
+		if (p->disable_map_sync)
+			clear_bit(ND_REGION_SYNC_ENABLED, &ndr_desc.flags);
 		p->region = nvdimm_pmem_region_create(p->bus, &ndr_desc);
 	}
 	if (!p->region) {
@@ -378,7 +386,7 @@ err:	nvdimm_bus_unregister(p->bus);
 
 static int papr_scm_probe(struct platform_device *pdev)
 {
-	struct device_node *dn = pdev->dev.of_node;
+	struct device_node *dn;
 	u32 drc_index, metadata_size;
 	u64 blocks, block_size;
 	struct papr_scm_priv *p;
@@ -386,6 +394,10 @@ static int papr_scm_probe(struct platform_device *pdev)
 	u64 uuid[2];
 	int rc;
 
+	dn = dev_of_node(&pdev->dev);
+	if (!dn)
+		return -ENXIO;
+
 	/* check we have all the required DT properties */
 	if (of_property_read_u32(dn, "ibm,my-drc-index", &drc_index)) {
 		dev_err(&pdev->dev, "%pOF: missing drc-index!\n", dn);
@@ -415,6 +427,9 @@ static int papr_scm_probe(struct platform_device *pdev)
 	/* optional DT properties */
 	of_property_read_u32(dn, "ibm,metadata-size", &metadata_size);
 
+	if (of_device_is_compatible(dn, "ibm,pmemory-v2"))
+		p->disable_map_sync = true;
+
 	p->dn = dn;
 	p->drc_index = drc_index;
 	p->block_size = block_size;
diff --git a/drivers/nvdimm/of_pmem.c b/drivers/nvdimm/of_pmem.c
index 6826a274a1f1..a6cc3488e552 100644
--- a/drivers/nvdimm/of_pmem.c
+++ b/drivers/nvdimm/of_pmem.c
@@ -59,12 +59,19 @@ static int of_pmem_region_probe(struct platform_device *pdev)
 		ndr_desc.res = &pdev->resource[i];
 		ndr_desc.of_node = np;
 		set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
+		set_bit(ND_REGION_SYNC_ENABLED, &ndr_desc.flags);
 
 		if (is_volatile)
 			region = nvdimm_volatile_region_create(bus, &ndr_desc);
 		else {
 			set_bit(ND_REGION_PERSIST_MEMCTRL, &ndr_desc.flags);
+			/*
+			 * for a persistent region, check for newer device
+			 */
+			if (of_device_is_compatible(np, "pmem-region-v2"))
+				clear_bit(ND_REGION_SYNC_ENABLED, &ndr_desc.flags);
 			region = nvdimm_pmem_region_create(bus, &ndr_desc);
+
 		}
 
 		if (!region)
-- 
2.26.2
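The decision made in the papr_scm hunk above can be summarised as a small truth table: the region starts with synchronous faults enabled, and only a persistent region carrying the newer compatible string has them cleared. The sketch below is a userspace illustration under simplifying assumptions: `region_sync_enabled()` is a hypothetical helper, and it compares a single string where the kernel's `of_device_is_compatible()` checks a device-tree node.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the papr_scm logic in the patch:
 * start with sync enabled, clear it only for a persistent region
 * that carries the newer compatible string.
 */
static bool region_sync_enabled(bool is_volatile, const char *compatible)
{
	bool sync_enabled = true;	/* set_bit(ND_REGION_SYNC_ENABLED, ...) */

	assert(compatible != NULL);

	/* Simplification: a plain string compare replaces the DT lookup. */
	if (!is_volatile && strcmp(compatible, "ibm,pmemory-v2") == 0)
		sync_enabled = false;	/* clear_bit(ND_REGION_SYNC_ENABLED, ...) */

	return sync_enabled;
}
```

A volatile region keeps the flag even with the v2 compatible string, which matches the cover letter's remark that volatile regions do not need any flush instructions.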
[PATCH v5 08/10] libnvdimm/dax: Add a dax flag to control synchronous fault support
With POWER10, the architecture is adding new pmem flush and sync instructions.
The kernel should prevent the usage of MAP_SYNC if applications are not using
the new instructions on newer hardware.

This patch adds a dax attribute
(/sys/bus/nd/devices/region0/pfn0.1/block/pmem0/dax/sync_fault) which can be
used to control this flag. If the device supports synchronous flush then
userspace can update this attribute to enable/disable the synchronous fault.
The attribute is only visible if the write cache is enabled on the device.

In a followup patch, ppc64 devices with compat string "ibm,pmemory-v2" will
disable the sync fault feature.

Signed-off-by: Aneesh Kumar K.V
---
 drivers/dax/bus.c            |  2 +-
 drivers/dax/super.c          | 73
 drivers/nvdimm/pmem.c        |  4 ++
 drivers/nvdimm/region_devs.c | 16
 include/linux/dax.h          | 16
 include/linux/libnvdimm.h    |  4 ++
 mm/Kconfig                   |  3 ++
 7 files changed, 117 insertions(+), 1 deletion(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index df238c8b6ef2..8a825ecff49b 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -420,7 +420,7 @@ struct dev_dax *__devm_create_dev_dax(struct dax_region *dax_region, int id,
 	 * No 'host' or dax_operations since there is no access to this
 	 * device outside of mmap of the resulting character device.
 	 */
-	dax_dev = alloc_dax(dev_dax, NULL, NULL, DAXDEV_F_SYNC);
+	dax_dev = alloc_dax(dev_dax, NULL, NULL, DAXDEV_F_SYNC | DAXDEV_F_SYNC_ENABLED);
 	if (IS_ERR(dax_dev)) {
 		rc = PTR_ERR(dax_dev);
 		goto err;
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 8e32345be0f7..f93e6649d452 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -198,6 +198,12 @@ enum dax_device_flags {
 	DAXDEV_WRITE_CACHE,
 	/* flag to check if device supports synchronous flush */
 	DAXDEV_SYNC,
+	/*
+	 * flag to indicate whether synchronous flush is enabled.
+	 * Some platform may want to disable synchronous flush support
+	 * even though device supports the same.
+	 */
+	DAXDEV_SYNC_ENABLED,
 };
 
 /**
@@ -254,6 +260,63 @@ static ssize_t write_cache_store(struct device *dev,
 }
 static DEVICE_ATTR_RW(write_cache);
 
+bool __dax_synchronous_enabled(struct dax_device *dax_dev)
+{
+	return test_bit(DAXDEV_SYNC_ENABLED, &dax_dev->flags);
+}
+EXPORT_SYMBOL_GPL(__dax_synchronous_enabled);
+
+static void set_dax_synchronous_enable(struct dax_device *dax_dev, bool enable)
+{
+	if (!test_bit(DAXDEV_SYNC, &dax_dev->flags))
+		return;
+
+	if (enable)
+		set_bit(DAXDEV_SYNC_ENABLED, &dax_dev->flags);
+	else
+		clear_bit(DAXDEV_SYNC_ENABLED, &dax_dev->flags);
+}
+
+
+static ssize_t sync_fault_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	int enabled;
+	struct dax_device *dax_dev = dax_get_by_host(dev_name(dev));
+	ssize_t rc;
+
+	WARN_ON_ONCE(!dax_dev);
+	if (!dax_dev)
+		return -ENXIO;
+
+	enabled = (dax_synchronous(dax_dev) && dax_synchronous_enabled(dax_dev));
+	rc = sprintf(buf, "%d\n", enabled);
+	put_dax(dax_dev);
+	return rc;
+}
+
+static ssize_t sync_fault_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t len)
+{
+	bool enable_sync;
+	int rc = strtobool(buf, &enable_sync);
+	struct dax_device *dax_dev = dax_get_by_host(dev_name(dev));
+
+	WARN_ON_ONCE(!dax_dev);
+	if (!dax_dev)
+		return -ENXIO;
+
+	if (rc)
+		len = rc;
+	else
+		set_dax_synchronous_enable(dax_dev, enable_sync);
+
+	put_dax(dax_dev);
+	return len;
+}
+
+static DEVICE_ATTR_RW(sync_fault);
+
 static umode_t dax_visible(struct kobject *kobj, struct attribute *a, int n)
 {
 	struct device *dev = container_of(kobj, typeof(*dev), kobj);
@@ -267,11 +330,18 @@ static umode_t dax_visible(struct kobject *kobj, struct attribute *a, int n)
 	if (a == &dev_attr_write_cache.attr)
 		return 0;
 #endif
+	if (a == &dev_attr_sync_fault.attr) {
+		if (dax_write_cache_enabled(dax_dev))
+			return a->mode;
+		return 0;
+	}
+
 	return a->mode;
 }
 
 static struct attribute *dax_attributes[] = {
 	&dev_attr_write_cache.attr,
+	&dev_attr_sync_fault.attr,
 	NULL,
 };
 
@@ -594,6 +664,9 @@ struct dax_device *alloc_dax(void *private, const char *__host,
 	if (flags & DAXDEV_F_SYNC)
 		set_dax_synchronous(dax_dev);
 
+	if (flags & DAXDEV_F_SYNC_ENABLED)
+		set_dax_synchronous_enable(dax_dev, true);
+
 	return dax_dev;
 
  err_dev:
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 97f948f8f4e6..a738b237a
[PATCH v5 07/10] powerpc/book3s/pmem: Add WARN_ONCE to catch the wrong usage of pmem flush functions.
We only support persistent memory on P8 and above. This is enforced by the
firmware and further checked on virtualized platforms during platform init.
Add WARN_ONCE to the pmem flush routines to catch wrong usage of them.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/cacheflush.h | 2 ++
 arch/powerpc/lib/pmem.c               | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index bb56a49c9a66..6dad92bd4be3 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -126,6 +126,8 @@ static inline void arch_pmem_flush_barrier(void)
 {
 	if (cpu_has_feature(CPU_FTR_ARCH_207S))
 		asm volatile(PPC_PHWSYNC ::: "memory");
+	else
+		WARN_ONCE(1, "Using pmem flush on older hardware.");
 }
 #endif /* __KERNEL__ */
 
diff --git a/arch/powerpc/lib/pmem.c b/arch/powerpc/lib/pmem.c
index 21210fa676e5..f40bd908d28d 100644
--- a/arch/powerpc/lib/pmem.c
+++ b/arch/powerpc/lib/pmem.c
@@ -37,12 +37,14 @@ static inline void clean_pmem_range(unsigned long start, unsigned long stop)
 {
 	if (cpu_has_feature(CPU_FTR_ARCH_207S))
 		return __clean_pmem_range(start, stop);
+	WARN_ONCE(1, "Using pmem flush on older hardware.");
 }
 
 static inline void flush_pmem_range(unsigned long start, unsigned long stop)
 {
 	if (cpu_has_feature(CPU_FTR_ARCH_207S))
 		return __flush_pmem_range(start, stop);
+	WARN_ONCE(1, "Using pmem flush on older hardware.");
 }
 
 /*
-- 
2.26.2
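The effect of the added WARN_ONCE calls is that a flush attempted on unsupported hardware does nothing but is reported exactly once, however many times it is repeated. The following userspace sketch imitates that; `WARN_ONCE_SKETCH`, `cpu_supports_pmem_flush`, the counters and `flush_pmem_range_sketch()` are all hypothetical stand-ins, and the macro is a simplification rather than the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int warnings_printed;
static int flushes_done;

/* Simplified stand-in for WARN_ONCE(): one static latch per call site. */
#define WARN_ONCE_SKETCH(cond, msg)				\
	do {							\
		static bool warned;				\
		if ((cond) && !warned) {			\
			warned = true;				\
			warnings_printed++;			\
			fprintf(stderr, "WARNING: %s\n", msg);	\
		}						\
	} while (0)

/* Stand-in for cpu_has_feature(CPU_FTR_ARCH_207S). */
static bool cpu_supports_pmem_flush;

static void flush_pmem_range_sketch(void)
{
	if (cpu_supports_pmem_flush) {
		flushes_done++;		/* the real code issues dcbfps here */
		return;
	}
	WARN_ONCE_SKETCH(1, "Using pmem flush on older hardware.");
}

/* Self-check: repeated misuse warns once; supported hardware flushes. */
static void warn_once_demo(void)
{
	cpu_supports_pmem_flush = false;
	flush_pmem_range_sketch();
	flush_pmem_range_sketch();
	flush_pmem_range_sketch();
	assert(warnings_printed == 1);
	assert(flushes_done == 0);

	cpu_supports_pmem_flush = true;
	flush_pmem_range_sketch();
	assert(flushes_done == 1);
}
```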
[PATCH v5 06/10] powerpc/pmem: Avoid the barrier in flush routines
nvdimm expects the flush routines to just mark the cache clean. The barrier
that makes the stores globally visible is done in nvdimm_flush().

Update the papr_scm driver to use a simplified nvdimm_flush callback that does
only the required barrier.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/lib/pmem.c                   |  6 --
 arch/powerpc/platforms/pseries/papr_scm.c | 13 +
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/lib/pmem.c b/arch/powerpc/lib/pmem.c
index 5a61aaeb6930..21210fa676e5 100644
--- a/arch/powerpc/lib/pmem.c
+++ b/arch/powerpc/lib/pmem.c
@@ -19,9 +19,6 @@ static inline void __clean_pmem_range(unsigned long start, unsigned long stop)
 
 	for (i = 0; i < size >> shift; i++, addr += bytes)
 		asm volatile(PPC_DCBSTPS(%0, %1): :"i"(0), "r"(addr): "memory");
-
-
-	asm volatile(PPC_PHWSYNC ::: "memory");
 }
 
 static inline void __flush_pmem_range(unsigned long start, unsigned long stop)
@@ -34,9 +31,6 @@ static inline void __flush_pmem_range(unsigned long start, unsigned long stop)
 
 	for (i = 0; i < size >> shift; i++, addr += bytes)
 		asm volatile(PPC_DCBFPS(%0, %1): :"i"(0), "r"(addr): "memory");
-
-
-	asm volatile(PPC_PHWSYNC ::: "memory");
 }
 
 static inline void clean_pmem_range(unsigned long start, unsigned long stop)
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index f35592423380..ad506e7003c9 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -285,6 +285,18 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 
 	return 0;
 }
+/*
+ * We have made sure the pmem writes are done such that before calling this
+ * all the caches are flushed/clean. We use dcbf/dcbfps to ensure this. Here
+ * we just need to add the necessary barrier to make sure the above flushes
+ * are have updated persistent storage before any data access or data transfer
+ * caused by subsequent instructions is initiated.
+ */
+static int papr_scm_flush_sync(struct nd_region *nd_region, struct bio *bio)
+{
+	arch_pmem_flush_barrier();
+	return 0;
+}
 
 static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
 {
@@ -340,6 +352,7 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
 	ndr_desc.mapping = &mapping;
 	ndr_desc.num_mappings = 1;
 	ndr_desc.nd_set = &p->nd_set;
+	ndr_desc.flush = papr_scm_flush_sync;
 
 	if (p->is_volatile)
 		p->region = nvdimm_volatile_region_create(p->bus, &ndr_desc);
-- 
2.26.2
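The split this patch makes (range routines only clean cache lines, a single barrier is issued later by the region flush callback) can be illustrated by counting operations. The sketch below is a userspace model under stated assumptions: the 128-byte `SK_LINE` size, the counters and every function name are hypothetical, and nothing here touches real caches.

```c
#include <assert.h>

#define SK_LINE 128UL	/* assumed cache line size, for illustration only */

static unsigned long line_flushes;
static unsigned long barriers;

static void sk_flush_line(unsigned long addr)
{
	(void)addr;
	line_flushes++;		/* stands in for dcbstps/dcbfps */
}

static void sk_barrier(void)
{
	barriers++;		/* stands in for phwsync */
}

/* Before the patch: every range flush ends with its own barrier. */
static void sk_flush_range_with_barrier(unsigned long start, unsigned long stop)
{
	unsigned long addr;

	for (addr = start & ~(SK_LINE - 1); addr < stop; addr += SK_LINE)
		sk_flush_line(addr);
	sk_barrier();
}

/* After the patch: the range flush only cleans the lines ... */
static void sk_flush_range(unsigned long start, unsigned long stop)
{
	unsigned long addr;

	for (addr = start & ~(SK_LINE - 1); addr < stop; addr += SK_LINE)
		sk_flush_line(addr);
}

/* ... and the region flush callback supplies the one barrier. */
static void sk_region_flush(void)
{
	sk_barrier();
}

/* Self-check: three two-line ranges need 3 barriers before, 1 after. */
static void barrier_split_demo(void)
{
	int i;

	for (i = 0; i < 3; i++)
		sk_flush_range_with_barrier(0, 2 * SK_LINE);
	assert(line_flushes == 6 && barriers == 3);

	line_flushes = barriers = 0;
	for (i = 0; i < 3; i++)
		sk_flush_range(0, 2 * SK_LINE);
	sk_region_flush();
	assert(line_flushes == 6 && barriers == 1);
}
```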
[PATCH v5 05/10] powerpc/pmem/of_pmem: Update of_pmem to use the new barrier instruction.
of_pmem on POWER10 can now use phwsync instead of hwsync to ensure
all previous writes are architecturally visible for the platform
buffer flush.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/cacheflush.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index 81808d1b54ca..bb56a49c9a66 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -120,6 +120,13 @@ static inline void invalidate_dcache_range(unsigned long start,
 #define copy_from_user_page(vma, page, vaddr, dst, src, len) \
 	memcpy(dst, src, len)
 
+
+#define arch_pmem_flush_barrier arch_pmem_flush_barrier
+static inline void arch_pmem_flush_barrier(void)
+{
+	if (cpu_has_feature(CPU_FTR_ARCH_207S))
+		asm volatile(PPC_PHWSYNC ::: "memory");
+}
 #endif /* __KERNEL__ */
 
 #endif /* _ASM_POWERPC_CACHEFLUSH_H */
-- 
2.26.2
[PATCH v5 02/10] powerpc/pmem: Add new instructions for persistent storage and sync
POWER10 introduces two new variants of dcbf instructions (dcbstps and dcbfps)
that can be used to write modified locations back to persistent storage.

Additionally, POWER10 also introduces phwsync and plwsync which can be used
to establish the order of these writes to persistent storage.

This patch exposes these instructions to the rest of the kernel. The existing
dcbf and hwsync instructions in P8 and P9 are adequate to enable appropriate
synchronization with OpenCAPI-hosted persistent storage. Hence the new
instructions are added as variants of the old ones that old hardware
won't differentiate.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/ppc-opcode.h | 12
 1 file changed, 12 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 2a39c716c343..1ad014e4633e 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -219,6 +219,8 @@
 #define PPC_INST_STWCX			0x7c00012d
 #define PPC_INST_LWSYNC			0x7c2004ac
 #define PPC_INST_SYNC			0x7c0004ac
+#define PPC_INST_PHWSYNC		0x7c8004ac
+#define PPC_INST_PLWSYNC		0x7ca004ac
 #define PPC_INST_SYNC_MASK		0xfc0007fe
 #define PPC_INST_ISYNC			0x4c00012c
 #define PPC_INST_LXVD2X			0x7c000698
@@ -284,6 +286,8 @@
 #define PPC_INST_TABORT			0x7c00071d
 #define PPC_INST_TSR			0x7c0005dd
 
+#define PPC_INST_DCBF			0x7c0000ac
+
 #define PPC_INST_NAP			0x4c000364
 #define PPC_INST_SLEEP			0x4c0003a4
 #define PPC_INST_WINKLE			0x4c0003e4
@@ -532,6 +536,14 @@
 #define STBCIX(s,a,b)		stringify_in_c(.long PPC_INST_STBCIX | \
 				       __PPC_RS(s) | __PPC_RA(a) | __PPC_RB(b))
 
+#define	PPC_DCBFPS(a, b)	stringify_in_c(.long PPC_INST_DCBF | \
+				       ___PPC_RA(a) | ___PPC_RB(b) | (4 << 21))
+#define	PPC_DCBSTPS(a, b)	stringify_in_c(.long PPC_INST_DCBF | \
+				       ___PPC_RA(a) | ___PPC_RB(b) | (6 << 21))
+
+#define	PPC_PHWSYNC		stringify_in_c(.long PPC_INST_PHWSYNC)
+#define	PPC_PLWSYNC		stringify_in_c(.long PPC_INST_PLWSYNC)
+
 /*
  * Define what the VSX XX1 form instructions will look like, then add
  * the 128 bit load store instructions based on that.
-- 
2.26.2
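The macros above build each new instruction by OR-ing a small field value into an existing opcode, which is why the commit message can say that older hardware won't differentiate them. The sketch below recomputes the encodings in plain C so the arithmetic can be checked. It rests on assumptions that should be stated plainly: the base dcbf opcode is taken as 0x7c0000ac, the register fields are assumed to sit at bit 16 (RA) and bit 11 (RB) as in the kernel's `___PPC_RA`/`___PPC_RB` helpers, and all `sk_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SK_INST_SYNC	0x7c0004acu	/* PPC_INST_SYNC from the patch context */
#define SK_INST_DCBF	0x7c0000acu	/* assumed base dcbf opcode */

/* Assumed field positions: RA at bit 16, RB at bit 11, 5 bits each. */
static uint32_t sk_ra(unsigned int r) { return (uint32_t)(r & 0x1f) << 16; }
static uint32_t sk_rb(unsigned int r) { return (uint32_t)(r & 0x1f) << 11; }

/* The variant selector used by the patch: a small value shifted to bit 21. */
static uint32_t sk_variant(unsigned int v) { return (uint32_t)(v & 0x7) << 21; }

static uint32_t sk_encode_dcbfps(unsigned int ra, unsigned int rb)
{
	return SK_INST_DCBF | sk_ra(ra) | sk_rb(rb) | sk_variant(4);
}

static uint32_t sk_encode_dcbstps(unsigned int ra, unsigned int rb)
{
	return SK_INST_DCBF | sk_ra(ra) | sk_rb(rb) | sk_variant(6);
}

/* Self-check against the constants that appear in the patch itself. */
static void encoding_demo(void)
{
	/* phwsync and plwsync differ from sync only in the bit-21 field. */
	assert((SK_INST_SYNC | sk_variant(4)) == 0x7c8004acu);
	assert((SK_INST_SYNC | sk_variant(5)) == 0x7ca004acu);

	/* dcbfps/dcbstps with RA = 0 and RB = r3. */
	assert(sk_encode_dcbfps(0, 3) == 0x7c8018acu);
	assert(sk_encode_dcbstps(0, 3) == 0x7cc018acu);
}
```

The first two assertions use only values quoted in the diff; the last two depend on the assumed dcbf base opcode and field positions.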
[PATCH v5 03/10] powerpc/pmem: Add flush routines using new pmem store and sync instruction
Start using the dcbstps; phwsync; sequence for flushing a persistent memory
range. The new instructions are implemented as variants of dcbf and hwsync,
and on P8 and P9 they will be executed as those instructions. We avoid using
them on older hardware. This helps to avoid hard-to-debug bugs.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/cacheflush.h |  1 +
 arch/powerpc/lib/pmem.c               | 50 ---
 2 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index e92191b390f3..81808d1b54ca 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -8,6 +8,7 @@
 
 #include
 #include
+#include
 
 /*
  * No cache flushing is required when address mappings are changed,
diff --git a/arch/powerpc/lib/pmem.c b/arch/powerpc/lib/pmem.c
index 0666a8d29596..5a61aaeb6930 100644
--- a/arch/powerpc/lib/pmem.c
+++ b/arch/powerpc/lib/pmem.c
@@ -9,20 +9,62 @@
 
 #include
 
+static inline void __clean_pmem_range(unsigned long start, unsigned long stop)
+{
+	unsigned long shift = l1_dcache_shift();
+	unsigned long bytes = l1_dcache_bytes();
+	void *addr = (void *)(start & ~(bytes - 1));
+	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
+	unsigned long i;
+
+	for (i = 0; i < size >> shift; i++, addr += bytes)
+		asm volatile(PPC_DCBSTPS(%0, %1): :"i"(0), "r"(addr): "memory");
+
+
+	asm volatile(PPC_PHWSYNC ::: "memory");
+}
+
+static inline void __flush_pmem_range(unsigned long start, unsigned long stop)
+{
+	unsigned long shift = l1_dcache_shift();
+	unsigned long bytes = l1_dcache_bytes();
+	void *addr = (void *)(start & ~(bytes - 1));
+	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
+	unsigned long i;
+
+	for (i = 0; i < size >> shift; i++, addr += bytes)
+		asm volatile(PPC_DCBFPS(%0, %1): :"i"(0), "r"(addr): "memory");
+
+
+	asm volatile(PPC_PHWSYNC ::: "memory");
+}
+
+static inline void clean_pmem_range(unsigned long start, unsigned long stop)
+{
+	if (cpu_has_feature(CPU_FTR_ARCH_207S))
+		return __clean_pmem_range(start, stop);
+}
+
+static inline void flush_pmem_range(unsigned long start, unsigned long stop)
+{
+	if (cpu_has_feature(CPU_FTR_ARCH_207S))
+		return __flush_pmem_range(start, stop);
+}
+
 /*
  * CONFIG_ARCH_HAS_PMEM_API symbols
  */
 void arch_wb_cache_pmem(void *addr, size_t size)
 {
 	unsigned long start = (unsigned long) addr;
-	flush_dcache_range(start, start + size);
+	clean_pmem_range(start, start + size);
 }
 EXPORT_SYMBOL_GPL(arch_wb_cache_pmem);
 
 void arch_invalidate_pmem(void *addr, size_t size)
 {
 	unsigned long start = (unsigned long) addr;
-	flush_dcache_range(start, start + size);
+	flush_pmem_range(start, start + size);
 }
 EXPORT_SYMBOL_GPL(arch_invalidate_pmem);
 
@@ -35,7 +77,7 @@ long __copy_from_user_flushcache(void *dest, const void __user *src,
 	unsigned long copied, start = (unsigned long) dest;
 
 	copied = __copy_from_user(dest, src, size);
-	flush_dcache_range(start, start + size);
+	clean_pmem_range(start, start + size);
 
 	return copied;
 }
@@ -45,7 +87,7 @@ void *memcpy_flushcache(void *dest, const void *src, size_t size)
 	unsigned long start = (unsigned long) dest;
 
 	memcpy(dest, src, size);
-	flush_dcache_range(start, start + size);
+	clean_pmem_range(start, start + size);
 
 	return dest;
 }
-- 
2.26.2
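The loop bounds in `__clean_pmem_range()` and `__flush_pmem_range()` come from a little arithmetic: round the start down to a cache-line boundary, then divide the padded length by the line size. A userspace sketch of just that calculation is shown below; `pmem_lines_in_range()` is a hypothetical helper, and the line size is passed in as a shift rather than read from `l1_dcache_shift()`.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many cache lines the patch's loop would
 * touch for the half-open byte range [start, stop).
 */
static unsigned long pmem_lines_in_range(unsigned long start,
					 unsigned long stop,
					 unsigned long shift)
{
	unsigned long bytes = 1UL << shift;		/* cache line size */
	unsigned long addr = start & ~(bytes - 1);	/* align start down */
	unsigned long size;

	assert(stop >= start);

	/* Pad by (bytes - 1) so a partial trailing line is still counted. */
	size = stop - addr + (bytes - 1);
	return size >> shift;
}
```

For example, with 128-byte lines (shift 7) the range [100, 300) touches the lines at 0, 128 and 256, so the helper returns 3, while an empty range starting on a line boundary returns 0.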
[PATCH v5 00/10] Support new pmem flush and sync instructions for POWER
This patch series enables the usage of new pmem flush and sync instructions on
the POWER architecture. POWER10 introduces two new variants of dcbf
instructions (dcbstps and dcbfps) that can be used to write modified locations
back to persistent storage. Additionally, POWER10 also introduces phwsync and
plwsync which can be used to establish the order of these writes to persistent
storage.

This series exposes these instructions to the rest of the kernel. The existing
dcbf and hwsync instructions in P8 and P9 are adequate to enable appropriate
synchronization with OpenCAPI-hosted persistent storage. Hence the new
instructions are added as variants of the old ones that old hardware
won't differentiate.

On POWER10, pmem devices will be represented by different device tree compat
strings. This ensures that older kernels won't initialize pmem devices on
POWER10.

W.r.t. userspace, we want to make sure applications are enabled to use
MAP_SYNC only if they are using the new instructions. To avoid the wrong usage
of MAP_SYNC on newer hardware, we disable MAP_SYNC by default on newer
hardware. The namespace specific attribute /sys/block/pmem0/dax/sync_fault can
be used to enable MAP_SYNC later.

With this:
1) vPMEM continues to work since it is a volatile region. That
   doesn't need any flush instructions.

2) pmdk and other user applications get updated to use new instructions
   and updated packages are made available to all distributions.

3) On newer hardware, the device will appear with a new compat string.
   Hence older distributions won't initialize pmem on newer hardware.

4) If we have a newer kernel with an older distro, we use the per
   namespace sysfs knob that prevents the usage of MAP_SYNC.

5) Sometime in the future, we mark CONFIG_ARCH_MAP_SYNC_DISABLE=n
   on ppc64 when we are confident that everybody is using the new flush
   instruction.

Changes from V4:
* Add namespace specific synchronous fault control.

Changes from V3:
* Add new compat string to be used for the device.
* Use arch_pmem_flush_barrier() in dm-writecache.

Aneesh Kumar K.V (10):
  powerpc/pmem: Restrict papr_scm to P8 and above.
  powerpc/pmem: Add new instructions for persistent storage and sync
  powerpc/pmem: Add flush routines using new pmem store and sync
    instruction
  libnvdimm/nvdimm/flush: Allow architecture to override the flush
    barrier
  powerpc/pmem/of_pmem: Update of_pmem to use the new barrier
    instruction.
  powerpc/pmem: Avoid the barrier in flush routines
  powerpc/book3s/pmem: Add WARN_ONCE to catch the wrong usage of pmem
    flush functions.
  libnvdimm/dax: Add a dax flag to control synchronous fault support
  powerpc/pmem: Disable synchronous fault by default
  powerpc/pmem: Initialize pmem device on newer hardware

 arch/powerpc/include/asm/cacheflush.h     | 10
 arch/powerpc/include/asm/ppc-opcode.h     | 12
 arch/powerpc/lib/pmem.c                   | 46 --
 arch/powerpc/platforms/Kconfig.cputype    |  9 +++
 arch/powerpc/platforms/pseries/papr_scm.c | 31 +-
 arch/powerpc/platforms/pseries/pmem.c     |  6 ++
 drivers/dax/bus.c                         |  2 +-
 drivers/dax/super.c                       | 73 +++
 drivers/md/dm-writecache.c                |  2 +-
 drivers/nvdimm/of_pmem.c                  |  8 +++
 drivers/nvdimm/pmem.c                     |  4 ++
 drivers/nvdimm/region_devs.c              | 24 ++--
 include/linux/dax.h                       | 16 +
 include/linux/libnvdimm.h                 |  8 +++
 mm/Kconfig                                |  3 +
 15 files changed, 243 insertions(+), 11 deletions(-)

-- 
2.26.2
Re: ipr crashes due to NULL dma_need_drain since cc97923a5bcc ("block: move dma drain handling to scsi")
Christoph Hellwig writes:
> Can you try this patch?
>
> ---
> From 1c9913360a0494375c5655b133899cb4323bceb4 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig
> Date: Tue, 9 Jun 2020 14:07:31 +0200
> Subject: scsi: wire up ata_scsi_dma_need_drain for SAS HBA drivers
>
> We need ata_scsi_dma_need_drain for all drivers wired up to drive ATAPI
> devices through libata. That also includes the SAS HBA drivers in
> addition to native libata HBA drivers.
>
> Fixes: cc97923a5bcc ("block: move dma drain handling to scsi")
> Reported-by: Michael Ellerman
> Signed-off-by: Christoph Hellwig

Yep that works for me here with ipr.

Tested-by: Michael Ellerman

cheers
Re: [PATCH v3 0/7] Base support for POWER10
Murilo Opsfelder Araújo writes:
> On Tue, Jun 09, 2020 at 03:28:31PM +1000, Michael Ellerman wrote:
>> On Thu, 21 May 2020 11:43:34 +1000, Alistair Popple wrote:
>> > This series brings together several previously posted patches required for
>> > POWER10 support and introduces a new patch enabling POWER10 architected
>> > mode to enable booting as a POWER10 pseries guest.
>> >
>> > It includes support for enabling facilities related to MMA and prefix
>> > instructions.
>> >
>> > [...]
>>
>> Patches 1-3 and 5-7 applied to powerpc/next.
>>
>> [1/7] powerpc: Add new HWCAP bits
>>       https://git.kernel.org/powerpc/c/ee988c11acf6f9464b7b44e9a091bf6afb3b3a49
>> [2/7] powerpc: Add support for ISA v3.1
>>       https://git.kernel.org/powerpc/c/3fd5836ee801ab9ac5b314c26550e209bafa5eaa
>> [3/7] powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selected
>>       https://git.kernel.org/powerpc/c/43d0d37acbe40a9a93d9891ca670638cd22116b1
>
> Just out of curiosity, why do we define ISA_V3_0B and ISA_V3_1 macros
> and don't use them anywhere else in the code?

Because we're sloppy :/

> Can't they be used in cpufeatures_setup_start() instead of 3000 and
> 3100 literals?

Yes please.

cheers
Re: [PATCH 2/6] powerpc/ppc-opcode: move ppc instruction encoding from test_emulate_step
On 26/05/20 1:45 pm, Balamuruhan S wrote:
> Few ppc instructions are encoded in test_emulate_step.c, consolidate
> them and use it from ppc-opcode.h
>
> Signed-off-by: Balamuruhan S
> Acked-by: Naveen N. Rao
> Tested-by: Naveen N. Rao
> ---
>  arch/powerpc/include/asm/ppc-opcode.h |  35 ++
>  arch/powerpc/lib/test_emulate_step.c  | 155 ++
>  2 files changed, 91 insertions(+), 99 deletions(-)
>
[...]
>

Acked-by: Sandipan Das
Re: [PATCH 4/6] powerpc/ppc-opcode: consolidate powerpc instructions from bpf_jit.h
On 26/05/20 1:45 pm, Balamuruhan S wrote:
> move macro definitions of powerpc instructions from bpf_jit.h to ppc-opcode.h
> and adopt the users of the macros accordingly. `PPC_MR()` is defined twice in
> bpf_jit.h, remove the duplicate one.
>
> Signed-off-by: Balamuruhan S
> Acked-by: Naveen N. Rao
> Tested-by: Naveen N. Rao
> ---
>  arch/powerpc/include/asm/ppc-opcode.h | 139 +
>  arch/powerpc/net/bpf_jit.h            | 166 ++-
>  arch/powerpc/net/bpf_jit32.h          |  24 +--
>  arch/powerpc/net/bpf_jit64.h          |  12 +-
>  arch/powerpc/net/bpf_jit_comp.c       | 132 ++--
>  arch/powerpc/net/bpf_jit_comp64.c     | 278 +-
>  6 files changed, 378 insertions(+), 373 deletions(-)
>
[...]
>

Acked-by: Sandipan Das
Re: [PATCH 3/6] powerpc/bpf_jit: reuse instruction macros from ppc-opcode.h
On 26/05/20 1:45 pm, Balamuruhan S wrote:
> remove duplicate macro definitions from bpf_jit.h and reuse the macros from
> ppc-opcode.h
>
> Signed-off-by: Balamuruhan S
> Acked-by: Naveen N. Rao
> Tested-by: Naveen N. Rao
> ---
>  arch/powerpc/net/bpf_jit.h        | 18 +-
>  arch/powerpc/net/bpf_jit32.h      | 10 +-
>  arch/powerpc/net/bpf_jit64.h      |  4 ++--
>  arch/powerpc/net/bpf_jit_comp.c   |  2 +-
>  arch/powerpc/net/bpf_jit_comp64.c | 20 ++--
>  5 files changed, 19 insertions(+), 35 deletions(-)
>
[...]
>

Acked-by: Sandipan Das
Re: [PATCH] powerpc/pseries/svm: Fixup align argument in alloc_shared_lppaca() function
Satheesh Rajendran writes:
> Argument "align" in alloc_shared_lppaca() function was unused inside the
> function. Let's fix it and update code comment.
>
> Cc: linux-ker...@vger.kernel.org
> Cc: Thiago Jung Bauermann
> Cc: Ram Pai
> Cc: Sukadev Bhattiprolu
> Cc: Laurent Dufour
> Signed-off-by: Satheesh Rajendran
> ---
>  arch/powerpc/kernel/paca.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)

Nice. I agree it's a good code cleanup.

Reviewed-by: Thiago Jung Bauermann

-- 
Thiago Jung Bauermann
IBM Linux Technology Center
Re: [PATCH] powerpc/pseries/svm: Remove unwanted check for shared_lppaca_size
Satheesh Rajendran writes: > Early secure guest boot hits the below crash while booting with > vcpus numbers aligned with page boundary for PAGE size of 64k > and LPPACA size of 1k i.e 64, 128 etc, due to the BUG_ON assert > for shared_lppaca_total_size equal to shared_lppaca_size, > > [0.00] Partition configured for 64 cpus. > [0.00] CPU maps initialized for 1 thread per core > [0.00] [ cut here ] > [0.00] kernel BUG at arch/powerpc/kernel/paca.c:89! > [0.00] Oops: Exception in kernel mode, sig: 5 [#1] > [0.00] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries > > which is not necessary, let's remove it. > > Cc: linux-ker...@vger.kernel.org > Cc: Thiago Jung Bauermann > Cc: Ram Pai > Cc: Sukadev Bhattiprolu > Cc: Laurent Dufour > Signed-off-by: Satheesh Rajendran > --- > arch/powerpc/kernel/paca.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Thanks for fixing this bug! I would only add: Fixes: bd104e6db6f0 ("powerpc/pseries/svm: Use shared memory for LPPACA structures") In any case: Reviewed-by: Thiago Jung Bauermann -- Thiago Jung Bauermann IBM Linux Technology Center
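The boundary case in the report above can be sketched in userspace. This is a minimal model, not the kernel's code: it assumes a bump allocator that hands out 1k slots from a pool rounded up to whole 64k pages and checks the running size after each allocation. The names `struct pool`, `pool_take`, `count_ok` and the `strict` flag are hypothetical. With 64 CPUs, 64 x 1k = 64k is exactly one page, so the last allocation makes the used size equal to the total; a check that also rejects equality fires, while a check that only rejects a real overflow does not.

```c
#include <stdbool.h>
#include <stddef.h>

#define SLOT_SIZE   1024u          /* LPPACA size of 1k, per the report */
#define PAGE_BYTES  (64u * 1024u)  /* 64k page size, per the report */

/* Hypothetical model of a bump allocator over a page-aligned pool. */
struct pool {
	size_t used;   /* bytes handed out so far */
	size_t total;  /* pool capacity, rounded up to whole pages */
};

/* Round n up to a multiple of PAGE_BYTES. */
static size_t page_align(size_t n)
{
	return (n + PAGE_BYTES - 1) / PAGE_BYTES * PAGE_BYTES;
}

/*
 * Take one slot. Returns false where the modelled assert would fire.
 * strict == true models a check that also rejects used == total;
 * strict == false only rejects a real overflow (used > total).
 */
static bool pool_take(struct pool *p, bool strict)
{
	p->used += SLOT_SIZE;
	return strict ? p->used < p->total : p->used <= p->total;
}

/* Count how many of nr_cpus allocations succeed before the check fires. */
static unsigned int count_ok(unsigned int nr_cpus, bool strict)
{
	struct pool p = { 0, page_align((size_t)nr_cpus * SLOT_SIZE) };
	unsigned int ok = 0;

	while (ok < nr_cpus && pool_take(&p, strict))
		ok++;
	return ok;
}
```

In this model only CPU counts whose total is an exact multiple of the page size (64, 128, ...) trip the strict check, which matches the configurations named in the report; a count such as 63 leaves slack after page alignment and passes either way.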
Re: [PATCH] ibmvscsi: don't send host info in adapter info MAD after LPM
On Wed, 3 Jun 2020 15:36:32 -0500, Tyrel Datwyler wrote: > The adapter info MAD is used to send the client info and receive the > host info as a response. A persistent buffer is used and as such the > client info is overwritten after the response. During the course of > a normal adapter reset the client info is refreshed in the buffer in > preparation for sending the adapter info MAD. > > However, in the special case of LPM where we reenable the CRQ instead > of a full CRQ teardown and reset we fail to refresh the client info in > the adapter info buffer. As a result after Live Partition Migration > (LPM) we erroneously report the host's info as our own. Applied to 5.8/scsi-queue, thanks! [1/1] scsi: ibmvscsi: Don't send host info in adapter info MAD after LPM https://git.kernel.org/mkp/scsi/c/4919b33b63c8 -- Martin K. Petersen Oracle Linux Engineering
PowerPC KVM-PR issue
Hello, KVM-PR doesn't work anymore on my Nemo board [1]. I figured out that the Git kernels and the kernel 5.7 are affected. Error message: Fienix kernel: kvmppc_exit_pr_progint: emulation at 700 failed () I can boot virtual QEMU PowerPC machines with KVM-PR with the kernel 5.6 without any problems on my Nemo board. I tested it with QEMU 2.5.0 and QEMU 5.0.0 today. Could you please check KVM-PR on your PowerPC machine? Thanks, Christian [1] https://en.wikipedia.org/wiki/AmigaOne_X1000
Re: [PATCH v11 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods
On Tue, Jun 9, 2020 at 10:54 AM Vaibhav Jain wrote: > > Thanks Dan for the consideration and taking time to look into this. > > My responses below: > > Dan Williams writes: > > > On Mon, Jun 8, 2020 at 5:16 PM kernel test robot wrote: > >> > >> Hi Vaibhav, > >> > >> Thank you for the patch! Perhaps something to improve: > >> > >> [auto build test WARNING on powerpc/next] > >> [also build test WARNING on linus/master v5.7 next-20200605] > >> [cannot apply to linux-nvdimm/libnvdimm-for-next scottwood/next] > >> [if your patch is applied to the wrong git tree, please drop us a note to > >> help > >> improve the system. BTW, we also suggest to use '--base' option to specify > >> the > >> base tree in git format-patch, please see > >> https://stackoverflow.com/a/37406982] > >> > >> url: > >> https://github.com/0day-ci/linux/commits/Vaibhav-Jain/powerpc-papr_scm-Add-support-for-reporting-nvdimm-health/20200607-211653 > >> base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git > >> next > >> config: powerpc-randconfig-r016-20200607 (attached as .config) > >> compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project > >> e429cffd4f228f70c1d9df0e5d77c08590dd9766) > >> reproduce (this is a W=1 build): > >> wget > >> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross > >> -O ~/bin/make.cross > >> chmod +x ~/bin/make.cross > >> # install powerpc cross compiling tool for clang build > >> # apt-get install binutils-powerpc-linux-gnu > >> # save the attached .config to linux build tree > >> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross > >> ARCH=powerpc > >> > >> If you fix the issue, kindly add following tag as appropriate > >> Reported-by: kernel test robot > >> > >> All warnings (new ones prefixed by >>, old ones prefixed by <<): > >> > >> In file included from :1: > >> >> ./usr/include/asm/papr_pdsm.h:69:20: warning: field 'hdr' with variable > >> >> sized type 'struct nd_cmd_pkg' not at the end of a struct or 
class is a > >> >> GNU extension [-Wgnu-variable-sized-type-not-at-end] > >> struct nd_cmd_pkg hdr; /* Package header containing sub-cmd */ > > > > Hi Vaibhav, > > > [.] > > This looks like it's going to need another round to get this fixed. I > > don't think 'struct nd_pdsm_cmd_pkg' should embed a definition of > > 'struct nd_cmd_pkg'. An instance of 'struct nd_cmd_pkg' carries a > > payload that is the 'pdsm' specifics. As the code has it now it's > > defined as a superset of 'struct nd_cmd_pkg' and the compiler warning > > is pointing out a real 'struct' organization problem. > > > > Given the soak time needed in -next after the code is finalized this > > there's no time to do another round of updates and still make the v5.8 > > merge window. > > Agreed that this looks bad, a solution will probably need some more > review cycles resulting in this series missing the merge window. > > I am investigating into the possible solutions for this reported issue > and made few observations: > > I see command pkg for Intel, Hpe, Msft and Hyperv families using a > similar layout of embedding nd_cmd_pkg at the head of the > command-pkg. struct nd_pdsm_cmd_pkg is following the same pattern. > > struct nd_pdsm_cmd_pkg { > struct nd_cmd_pkg hdr; > /* other members */ > }; > > struct ndn_pkg_msft { > struct nd_cmd_pkg gen; > /* other members */ > }; > struct nd_pkg_intel { > struct nd_cmd_pkg gen; > /* other members */ > }; > struct ndn_pkg_hpe1 { > struct nd_cmd_pkg gen; > /* other members */ In those cases the other members are a union and there is no second variable length array. Perhaps that is why those definitions are not getting flagged? I'm not seeing anything in ndctl build options that would explicitly disable this warning, but I'm not sure if the ndctl build environment is missing this build warning by accident. Those variable size payloads are also not being used in any code paths that would look at the size of the command payload, like the kernel ioctl() path. 
The payload validation code needs static sizes and the payload parsing code wants to cast the payload to a known type. I don't think you can use the same struct definition for both those cases which is why the ndctl parsing code uses the union layout, but the kernel command marshaling code does strict layering. > }; > > Even though other command families implement similar command-package > layout they were not flagged (yet) as they are (I am guessing) serviced > in vendor acpi drivers rather than in kernel like in case of papr-scm > command family. I sincerely hope there are no vendor acpi kernel drivers outside of the upstream one. > > So, I think this issue is not just specific to papr-scm command family > introduced in this patch series but rather across all other command > families. Every other command family assumes 'struct nd_cmd_pkg_hdr' to > be embeddable and puts it at the beginnin
Re: [PATCH v11 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods
Thanks Dan for the consideration and taking time to look into this. My responses below: Dan Williams writes: > On Mon, Jun 8, 2020 at 5:16 PM kernel test robot wrote: >> >> Hi Vaibhav, >> >> Thank you for the patch! Perhaps something to improve: >> >> [auto build test WARNING on powerpc/next] >> [also build test WARNING on linus/master v5.7 next-20200605] >> [cannot apply to linux-nvdimm/libnvdimm-for-next scottwood/next] >> [if your patch is applied to the wrong git tree, please drop us a note to >> help >> improve the system. BTW, we also suggest to use '--base' option to specify >> the >> base tree in git format-patch, please see >> https://stackoverflow.com/a/37406982] >> >> url: >> https://github.com/0day-ci/linux/commits/Vaibhav-Jain/powerpc-papr_scm-Add-support-for-reporting-nvdimm-health/20200607-211653 >> base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git >> next >> config: powerpc-randconfig-r016-20200607 (attached as .config) >> compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project >> e429cffd4f228f70c1d9df0e5d77c08590dd9766) >> reproduce (this is a W=1 build): >> wget >> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O >> ~/bin/make.cross >> chmod +x ~/bin/make.cross >> # install powerpc cross compiling tool for clang build >> # apt-get install binutils-powerpc-linux-gnu >> # save the attached .config to linux build tree >> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross >> ARCH=powerpc >> >> If you fix the issue, kindly add following tag as appropriate >> Reported-by: kernel test robot >> >> All warnings (new ones prefixed by >>, old ones prefixed by <<): >> >> In file included from :1: >> >> ./usr/include/asm/papr_pdsm.h:69:20: warning: field 'hdr' with variable >> >> sized type 'struct nd_cmd_pkg' not at the end of a struct or class is a >> >> GNU extension [-Wgnu-variable-sized-type-not-at-end] >> struct nd_cmd_pkg hdr; /* Package header containing sub-cmd */ > > Hi Vaibhav, > 
[.] > This looks like it's going to need another round to get this fixed. I > don't think 'struct nd_pdsm_cmd_pkg' should embed a definition of > 'struct nd_cmd_pkg'. An instance of 'struct nd_cmd_pkg' carries a > payload that is the 'pdsm' specifics. As the code has it now it's > defined as a superset of 'struct nd_cmd_pkg' and the compiler warning > is pointing out a real 'struct' organization problem. > > Given the soak time needed in -next after the code is finalized this > there's no time to do another round of updates and still make the v5.8 > merge window. Agreed that this looks bad, a solution will probably need some more review cycles resulting in this series missing the merge window. I am investigating into the possible solutions for this reported issue and made few observations: I see command pkg for Intel, Hpe, Msft and Hyperv families using a similar layout of embedding nd_cmd_pkg at the head of the command-pkg. struct nd_pdsm_cmd_pkg is following the same pattern. struct nd_pdsm_cmd_pkg { struct nd_cmd_pkg hdr; /* other members */ }; struct ndn_pkg_msft { struct nd_cmd_pkg gen; /* other members */ }; struct nd_pkg_intel { struct nd_cmd_pkg gen; /* other members */ }; struct ndn_pkg_hpe1 { struct nd_cmd_pkg gen; /* other members */ }; Even though other command families implement similar command-package layout they were not flagged (yet) as they are (I am guessing) serviced in vendor acpi drivers rather than in kernel like in case of papr-scm command family. So, I think this issue is not just specific to papr-scm command family introduced in this patch series but rather across all other command families. Every other command family assumes 'struct nd_cmd_pkg_hdr' to be embeddable and puts it at the beginning of their corresponding command-packages. And its only a matter of time when someone tries filtering/handling of vendor specific commands in nfit module when they hit similar issue. 
Possible Solutions: * One way would be to redefine 'struct nd_cmd_pkg' to mark field 'nd_payload[]' from a flexible array to zero sized array as 'nd_payload[0]'. This should make 'struct nd_cmd_pkg' embeddable and clang shouldn't report 'gnu-variable-sized-type-not-at-end' warning. Also I think this change shouldn't introduce any ABI change. * Another way to solve this issue might be to redefine 'struct nd_pdsm_cmd_pkg' to below removing the 'struct nd_cmd_pkg' member. This struct should immediately follow the 'struct nd_cmd_pkg' command package when sent to libnvdimm: struct nd_pdsm_cmd_pkg { __s32 cmd_status; /* Out: Sub-cmd status returned back */ __u16 reserved[2]; /* Ignored and to be used in future */ __u8 payload[]; }; This should remove the flexible member nc_cmd_pkg.nd_payload from the struct with just one remaining at the end. However this would make accessing the
Re: [PATCH v3 0/7] Base support for POWER10
On Tue, Jun 09, 2020 at 03:28:31PM +1000, Michael Ellerman wrote: > On Thu, 21 May 2020 11:43:34 +1000, Alistair Popple wrote: > > This series brings together several previously posted patches required for > > POWER10 support and introduces a new patch enabling POWER10 architected > > mode to enable booting as a POWER10 pseries guest. > > > > It includes support for enabling facilities related to MMA and prefix > > instructions. > > > > [...] > > Patches 1-3 and 5-7 applied to powerpc/next. > > [1/7] powerpc: Add new HWCAP bits > > https://git.kernel.org/powerpc/c/ee988c11acf6f9464b7b44e9a091bf6afb3b3a49 > [2/7] powerpc: Add support for ISA v3.1 > > https://git.kernel.org/powerpc/c/3fd5836ee801ab9ac5b314c26550e209bafa5eaa > [3/7] powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selected > > https://git.kernel.org/powerpc/c/43d0d37acbe40a9a93d9891ca670638cd22116b1 Just out of curiosity, why do we define ISA_V3_0B and ISA_V3_1 macros and don't use them anywhere else in the code? Can't they be used in cpufeatures_setup_start() instead of 3000 and 3100 literals? > [5/7] powerpc/dt_cpu_ftrs: Enable Prefixed Instructions > > https://git.kernel.org/powerpc/c/c63d688c3dabca973c5a7da73d17422ad13f3737 > [6/7] powerpc/dt_cpu_ftrs: Add MMA feature > > https://git.kernel.org/powerpc/c/87939d50e5888bd78478d9aa9455f56b919df658 > [7/7] powerpc: Add POWER10 architected mode > > https://git.kernel.org/powerpc/c/a3ea40d5c7365e7e5c7c85b6f30b15142b397571 > > cheers -- Murilo
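What the question above is asking for can be sketched as follows. This is only an illustration, assuming the macros carry the same values as the literals mentioned in the message (3000 and 3100); the feature bits `FTR_ARCH_300` and `FTR_ARCH_31` and the function `isa_to_features` are hypothetical stand-ins, not the kernel's `cpufeatures_setup_start()`.

```c
#include <stdint.h>

/* Values assumed from the literals mentioned in the question above. */
#define ISA_V3_0B 3000
#define ISA_V3_1  3100

/* Hypothetical feature bits, for illustration only. */
#define FTR_ARCH_300 (1u << 0)
#define FTR_ARCH_31  (1u << 1)

/* Map an ISA version number to a feature mask using the named constants. */
static uint32_t isa_to_features(uint32_t isa)
{
	uint32_t features = 0;

	if (isa >= ISA_V3_0B)
		features |= FTR_ARCH_300;
	if (isa >= ISA_V3_1)
		features |= FTR_ARCH_31;
	return features;
}
```

Using the named constants keeps the comparison and the definition in one place, so a later ISA level only needs one new macro and one new branch.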
Re: [PATCH v12 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods
Hi Vaibhav, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on powerpc/next] [also build test WARNING on linus/master v5.7 next-20200608] [cannot apply to linux-nvdimm/libnvdimm-for-next scottwood/next] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Vaibhav-Jain/powerpc-papr_scm-Add-support-for-reporting-nvdimm-health/20200609-051451 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc-randconfig-r031-20200608 (attached as .config) compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project bc2b70982be8f5250cd0082a7190f8b417bd4dfe) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install powerpc cross compiling tool for clang build # apt-get install binutils-powerpc-linux-gnu # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All warnings (new ones prefixed by >>, old ones prefixed by <<): In file included from :1: >> ./usr/include/asm/papr_pdsm.h:67:20: warning: field 'hdr' with variable >> sized type 'struct nd_cmd_pkg' not at the end of a struct or class is a GNU >> extension [-Wgnu-variable-sized-type-not-at-end] struct nd_cmd_pkg hdr; /* Package header containing sub-cmd */ ^ 1 warning generated. --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [musl] ppc64le and 32-bit LE userland compatibility
On Tue, Jun 09, 2020 at 10:29:57AM +, Will Springer wrote: > On Saturday, May 30, 2020 3:56:47 PM PDT you wrote: > > On Friday, May 29, 2020 12:24:27 PM PDT Rich Felker wrote: > > > The argument passing for pread/pwrite is historically a mess and > > > differs between archs. musl has a dedicated macro that archs can > > > define to override it. But it looks like it should match regardless of > > > BE vs LE, and musl already defines it for powerpc with the default > > > definition, adding a zero arg to start on an even arg-slot index, > > > which is an odd register (since ppc32 args start with an odd one, r3). > > > > > > > [6]: > > > > https://gist.github.com/Skirmisher/02891c1a8cafa0ff18b2460933ef4f3c > > > > > > I don't think this is correct, but I'm confused about where it's > > > getting messed up because it looks like it should already be right. > > > > Hmm, interesting. Will have to go back to it I guess... > > > > > > This was enough to fix up the `file` bug. I'm no seasoned kernel > > > > hacker, though, and there is still concern over the right way to > > > > approach this, whether it should live in the kernel or libc, etc. > > > > Frankly, I don't know the ABI structure enough to understand why the > > > > register padding has to be different in this case, or what > > > > lower-level component is responsible for it.. For comparison, I had > > > > a > > > > look at the mips tree, since it's bi-endian and has a similar 32/64 > > > > situation. There is a macro conditional upon endianness that is > > > > responsible for munging long longs; it uses __MIPSEB__ and > > > > __MIPSEL__ > > > > instead of an if/else on the generic __LITTLE_ENDIAN__. Not sure > > > > what > > > > to make of that. (It also simply swaps registers for LE, unlike what > > > > I did for ppc.) > > > > > > Indeed the problem is probably that you need to swap registers for LE, > > > not remove the padding slot. Did you check what happens if you pass a > > > value larger than 32 bits? 
> > > > > > If so, the right way to fix this on the kernel side would be to > > > construct the value as a union rather than by bitwise ops so it's > > > > > > endian-agnostic: > > > (union { u32 parts[2]; u64 val; }){{ arg1, arg2 }}.val > > > > > > But the kernel folks might prefer endian ifdefs for some odd reason... > > > > You are right, this does seem odd considering what the other archs do. > > It's quite possible I made a silly mistake, of course... > > > > I haven't tested with values outside the 32-bit range yet; again, this > > is new territory for me, so I haven't exactly done exhaustive tests on > > everything. I'll give it a closer look. > > I took some cues from the mips linux32 syscall setup, and drafted a new > patch defining a macro to compose the hi/lo parts within the function, > instead of swapping the args at the function definition. `file /bin/bash` > and `truncate -s 5G test` both work correctly now. This appears to be the > correct solution, so I'm not sure what silly mistake I made before, but > apologies for the confusion. I've updated my gist with the new patch [1]. > [...] > > [1]: https://gist.github.com/Skirmisher/02891c1a8cafa0ff18b2460933ef4f3c This patch looks correct. I prefer the union approach with no #ifdef but I'm fine with either. Rich
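The union expression quoted in this thread can be tried out in userspace. This is a minimal sketch, assuming `uint32_t`/`uint64_t` as stand-ins for the kernel's `u32`/`u64`; the function names `merge_64`, `merge_64_shift` and `host_is_little_endian` are hypothetical. The union takes the two halves in memory order and lets the host's byte order decide which one is the high word, whereas the shift version has to be told explicitly.

```c
#include <stdint.h>

/*
 * Compose a 64-bit value from two 32-bit halves given in memory order,
 * i.e. the order the register pair would have if stored to memory.
 */
static uint64_t merge_64(uint32_t first, uint32_t second)
{
	union { uint32_t parts[2]; uint64_t val; } u = { { first, second } };

	return u.val;
}

/* Explicit shift version for comparison: needs to know which half is high. */
static uint64_t merge_64_shift(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Runtime endianness probe: returns 1 on little-endian hosts. */
static int host_is_little_endian(void)
{
	union { uint32_t word; uint8_t bytes[4]; } probe = { 1 };

	return probe.bytes[0] == 1;
}
```

On a little-endian host `merge_64(a, b)` equals `merge_64_shift(b, a)`, and on a big-endian host it equals `merge_64_shift(a, b)`, which is the register swap discussed above expressed without an endian `#ifdef`.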
Re: [PATCH] mm: Move p?d_alloc_track to separate header file
Hi Christophe, On Tue, 9 Jun 2020 17:24:14 +0200 Christophe Leroy wrote: > > On 09/06/2020 at 14:05, Joerg Roedel wrote: > > From: Joerg Roedel > > > > The functions are only used in two source files, so there is no need > > for them to be in the global header. Move them to the new > > header and include it only where needed. > > Do you mean we will now create a new header file for any new couple on > functions based on where they are used ? > > Can you explain why this change is needed or is a plus ? Well at a minimum, it means 45 lines less to be parsed every time the linux/mm is included (in at last count, 1996 places some of which are include files included by other files). And, as someone who does a lot of builds every day, I am in favour of that :-) -- Cheers, Stephen Rothwell
Re: ipr crashes due to NULL dma_need_drain since cc97923a5bcc ("block: move dma drain handling to scsi")
Can you try this patch? --- >From 1c9913360a0494375c5655b133899cb4323bceb4 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Tue, 9 Jun 2020 14:07:31 +0200 Subject: scsi: wire up ata_scsi_dma_need_drain for SAS HBA drivers We need ata_scsi_dma_need_drain for all drivers wired up to drive ATAPI devices through libata. That also includes the SAS HBA drivers in addition to native libata HBA drivers. Fixes: cc97923a5bcc ("block: move dma drain handling to scsi") Reported-by: Michael Ellerman Signed-off-by: Christoph Hellwig --- drivers/scsi/aic94xx/aic94xx_init.c| 1 + drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 1 + drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 1 + drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 1 + drivers/scsi/ipr.c | 1 + drivers/scsi/isci/init.c | 1 + drivers/scsi/mvsas/mv_init.c | 1 + drivers/scsi/pm8001/pm8001_init.c | 1 + 8 files changed, 8 insertions(+) diff --git a/drivers/scsi/aic94xx/aic94xx_init.c b/drivers/scsi/aic94xx/aic94xx_init.c index d022407e5645c7..bef47f38dd0dbc 100644 --- a/drivers/scsi/aic94xx/aic94xx_init.c +++ b/drivers/scsi/aic94xx/aic94xx_init.c @@ -40,6 +40,7 @@ static struct scsi_host_template aic94xx_sht = { /* .name is initialized */ .name = "aic94xx", .queuecommand = sas_queuecommand, + .dma_need_drain = ata_scsi_dma_need_drain, .target_alloc = sas_target_alloc, .slave_configure= sas_slave_configure, .scan_finished = asd_scan_finished, diff --git a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c index 2e1718f9ade218..09a7669dad4c67 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c @@ -1756,6 +1756,7 @@ static struct scsi_host_template sht_v1_hw = { .proc_name = DRV_NAME, .module = THIS_MODULE, .queuecommand = sas_queuecommand, + .dma_need_drain = ata_scsi_dma_need_drain, .target_alloc = sas_target_alloc, .slave_configure= hisi_sas_slave_configure, .scan_finished = hisi_sas_scan_finished, diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c index e7e7849a4c14e2..968d3870235359 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c @@ -3532,6 +3532,7 @@ static struct scsi_host_template sht_v2_hw = { .proc_name = DRV_NAME, .module = THIS_MODULE, .queuecommand = sas_queuecommand, + .dma_need_drain = ata_scsi_dma_need_drain, .target_alloc = sas_target_alloc, .slave_configure= hisi_sas_slave_configure, .scan_finished = hisi_sas_scan_finished, diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index 3e6b78a1f993b9..55e2321a65bc5f 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -3075,6 +3075,7 @@ static struct scsi_host_template sht_v3_hw = { .proc_name = DRV_NAME, .module = THIS_MODULE, .queuecommand = sas_queuecommand, + .dma_need_drain = ata_scsi_dma_need_drain, .target_alloc = sas_target_alloc, .slave_configure= hisi_sas_slave_configure, .scan_finished = hisi_sas_scan_finished, diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c index 7d77997d26d457..7d86f4ca266c86 100644 --- a/drivers/scsi/ipr.c +++ b/drivers/scsi/ipr.c @@ -6731,6 +6731,7 @@ static struct scsi_host_template driver_template = { .compat_ioctl = ipr_ioctl, #endif .queuecommand = ipr_queuecommand, + .dma_need_drain = ata_scsi_dma_need_drain, .eh_abort_handler = ipr_eh_abort, .eh_device_reset_handler = ipr_eh_dev_reset, .eh_host_reset_handler = ipr_eh_host_reset, diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c index 974c3b9116d5ba..085e285f427d93 100644 --- a/drivers/scsi/isci/init.c +++ b/drivers/scsi/isci/init.c @@ -153,6 +153,7 @@ static struct scsi_host_template isci_sht = { .name = DRV_NAME, .proc_name = DRV_NAME, .queuecommand = sas_queuecommand, + .dma_need_drain = ata_scsi_dma_need_drain, .target_alloc = sas_target_alloc, .slave_configure= sas_slave_configure, .scan_finished = isci_host_scan_finished, diff --git a/drivers/scsi/mvsas/mv_init.c 
b/drivers/scsi/mvsas/mv_init.c index 5973eed9493820..b0de3bdb01db06 100644 --- a/drivers/scsi/mvsas/mv_init.c +++ b/drivers/scsi/mvsas/mv_init.c @@ -33,6 +33,7 @@ static struct scsi_host_template mvs_sht = {
Re: [PATCH] mm: Move p?d_alloc_track to separate header file
Le 09/06/2020 à 14:05, Joerg Roedel a écrit : From: Joerg Roedel The functions are only used in two source files, so there is no need for them to be in the global header. Move them to the new header and include it only where needed. Do you mean we will now create a new header file for any new couple on functions based on where they are used ? Can you explain why this change is needed or is a plus ? Christophe Signed-off-by: Joerg Roedel --- include/linux/mm.h| 45 --- include/linux/pgalloc-track.h | 51 +++ lib/ioremap.c | 1 + mm/vmalloc.c | 1 + 4 files changed, 53 insertions(+), 45 deletions(-) create mode 100644 include/linux/pgalloc-track.h diff --git a/include/linux/mm.h b/include/linux/mm.h index 9d6042178ca7..22d8b2a2c9bc 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2092,51 +2092,11 @@ static inline pud_t *pud_alloc(struct mm_struct *mm, p4d_t *p4d, NULL : pud_offset(p4d, address); } -static inline p4d_t *p4d_alloc_track(struct mm_struct *mm, pgd_t *pgd, -unsigned long address, -pgtbl_mod_mask *mod_mask) - -{ - if (unlikely(pgd_none(*pgd))) { - if (__p4d_alloc(mm, pgd, address)) - return NULL; - *mod_mask |= PGTBL_PGD_MODIFIED; - } - - return p4d_offset(pgd, address); -} - -static inline pud_t *pud_alloc_track(struct mm_struct *mm, p4d_t *p4d, -unsigned long address, -pgtbl_mod_mask *mod_mask) -{ - if (unlikely(p4d_none(*p4d))) { - if (__pud_alloc(mm, p4d, address)) - return NULL; - *mod_mask |= PGTBL_P4D_MODIFIED; - } - - return pud_offset(p4d, address); -} - static inline pmd_t *pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address) { return (unlikely(pud_none(*pud)) && __pmd_alloc(mm, pud, address))? 
NULL: pmd_offset(pud, address); } - -static inline pmd_t *pmd_alloc_track(struct mm_struct *mm, pud_t *pud, -unsigned long address, -pgtbl_mod_mask *mod_mask) -{ - if (unlikely(pud_none(*pud))) { - if (__pmd_alloc(mm, pud, address)) - return NULL; - *mod_mask |= PGTBL_PUD_MODIFIED; - } - - return pmd_offset(pud, address); -} #endif /* CONFIG_MMU */ #if USE_SPLIT_PTE_PTLOCKS @@ -2252,11 +2212,6 @@ static inline void pgtable_pte_page_dtor(struct page *page) ((unlikely(pmd_none(*(pmd))) && __pte_alloc_kernel(pmd))? \ NULL: pte_offset_kernel(pmd, address)) -#define pte_alloc_kernel_track(pmd, address, mask) \ - ((unlikely(pmd_none(*(pmd))) && \ - (__pte_alloc_kernel(pmd) || ({*(mask)|=PGTBL_PMD_MODIFIED;0;})))?\ - NULL: pte_offset_kernel(pmd, address)) - #if USE_SPLIT_PMD_PTLOCKS static struct page *pmd_to_page(pmd_t *pmd) diff --git a/include/linux/pgalloc-track.h b/include/linux/pgalloc-track.h new file mode 100644 index ..1dcc865029a2 --- /dev/null +++ b/include/linux/pgalloc-track.h @@ -0,0 +1,51 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_PGALLLC_TRACK_H +#define _LINUX_PGALLLC_TRACK_H + +#if defined(CONFIG_MMU) +static inline p4d_t *p4d_alloc_track(struct mm_struct *mm, pgd_t *pgd, +unsigned long address, +pgtbl_mod_mask *mod_mask) +{ + if (unlikely(pgd_none(*pgd))) { + if (__p4d_alloc(mm, pgd, address)) + return NULL; + *mod_mask |= PGTBL_PGD_MODIFIED; + } + + return p4d_offset(pgd, address); +} + +static inline pud_t *pud_alloc_track(struct mm_struct *mm, p4d_t *p4d, +unsigned long address, +pgtbl_mod_mask *mod_mask) +{ + if (unlikely(p4d_none(*p4d))) { + if (__pud_alloc(mm, p4d, address)) + return NULL; + *mod_mask |= PGTBL_P4D_MODIFIED; + } + + return pud_offset(p4d, address); +} + +static inline pmd_t *pmd_alloc_track(struct mm_struct *mm, pud_t *pud, +unsigned long address, +pgtbl_mod_mask *mod_mask) +{ + if (unlikely(pud_none(*pud))) { + if (__pmd_alloc(mm, pud, address)) + return NULL; + *mod_mask |= PGTBL_PUD_MODIFIED; + } + + return 
pmd_offset(pud, address); +} +#endif /* CONFIG_MMU */ + +#define pte_alloc_kernel_track(pmd, address, mask) \ + ((unlikely(pmd_none(*(pmd))) &&
Re: [PATCH] mm: Move p?d_alloc_track to separate header file
On Tue, Jun 09, 2020 at 02:05:33PM +0200, Joerg Roedel wrote: > From: Joerg Roedel > > The functions are only used in two source files, so there is no need > for them to be in the global header. Move them to the new > header and include it only where needed. > > Signed-off-by: Joerg Roedel Acked-by: Mike Rapoport > --- > include/linux/mm.h| 45 --- > include/linux/pgalloc-track.h | 51 +++ > lib/ioremap.c | 1 + > mm/vmalloc.c | 1 + > 4 files changed, 53 insertions(+), 45 deletions(-) > create mode 100644 include/linux/pgalloc-track.h > > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 9d6042178ca7..22d8b2a2c9bc 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -2092,51 +2092,11 @@ static inline pud_t *pud_alloc(struct mm_struct *mm, > p4d_t *p4d, > NULL : pud_offset(p4d, address); > } > > -static inline p4d_t *p4d_alloc_track(struct mm_struct *mm, pgd_t *pgd, > - unsigned long address, > - pgtbl_mod_mask *mod_mask) > - > -{ > - if (unlikely(pgd_none(*pgd))) { > - if (__p4d_alloc(mm, pgd, address)) > - return NULL; > - *mod_mask |= PGTBL_PGD_MODIFIED; > - } > - > - return p4d_offset(pgd, address); > -} > - > -static inline pud_t *pud_alloc_track(struct mm_struct *mm, p4d_t *p4d, > - unsigned long address, > - pgtbl_mod_mask *mod_mask) > -{ > - if (unlikely(p4d_none(*p4d))) { > - if (__pud_alloc(mm, p4d, address)) > - return NULL; > - *mod_mask |= PGTBL_P4D_MODIFIED; > - } > - > - return pud_offset(p4d, address); > -} > - > static inline pmd_t *pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned > long address) > { > return (unlikely(pud_none(*pud)) && __pmd_alloc(mm, pud, address))? 
> NULL: pmd_offset(pud, address); > } > - > -static inline pmd_t *pmd_alloc_track(struct mm_struct *mm, pud_t *pud, > - unsigned long address, > - pgtbl_mod_mask *mod_mask) > -{ > - if (unlikely(pud_none(*pud))) { > - if (__pmd_alloc(mm, pud, address)) > - return NULL; > - *mod_mask |= PGTBL_PUD_MODIFIED; > - } > - > - return pmd_offset(pud, address); > -} > #endif /* CONFIG_MMU */ > > #if USE_SPLIT_PTE_PTLOCKS > @@ -2252,11 +2212,6 @@ static inline void pgtable_pte_page_dtor(struct page > *page) > ((unlikely(pmd_none(*(pmd))) && __pte_alloc_kernel(pmd))? \ > NULL: pte_offset_kernel(pmd, address)) > > -#define pte_alloc_kernel_track(pmd, address, mask) \ > - ((unlikely(pmd_none(*(pmd))) && \ > - (__pte_alloc_kernel(pmd) || ({*(mask)|=PGTBL_PMD_MODIFIED;0;})))?\ > - NULL: pte_offset_kernel(pmd, address)) > - > #if USE_SPLIT_PMD_PTLOCKS > > static struct page *pmd_to_page(pmd_t *pmd) > diff --git a/include/linux/pgalloc-track.h b/include/linux/pgalloc-track.h > new file mode 100644 > index ..1dcc865029a2 > --- /dev/null > +++ b/include/linux/pgalloc-track.h > @@ -0,0 +1,51 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#ifndef _LINUX_PGALLLC_TRACK_H > +#define _LINUX_PGALLLC_TRACK_H > + > +#if defined(CONFIG_MMU) > +static inline p4d_t *p4d_alloc_track(struct mm_struct *mm, pgd_t *pgd, > + unsigned long address, > + pgtbl_mod_mask *mod_mask) > +{ > + if (unlikely(pgd_none(*pgd))) { > + if (__p4d_alloc(mm, pgd, address)) > + return NULL; > + *mod_mask |= PGTBL_PGD_MODIFIED; > + } > + > + return p4d_offset(pgd, address); > +} > + > +static inline pud_t *pud_alloc_track(struct mm_struct *mm, p4d_t *p4d, > + unsigned long address, > + pgtbl_mod_mask *mod_mask) > +{ > + if (unlikely(p4d_none(*p4d))) { > + if (__pud_alloc(mm, p4d, address)) > + return NULL; > + *mod_mask |= PGTBL_P4D_MODIFIED; > + } > + > + return pud_offset(p4d, address); > +} > + > +static inline pmd_t *pmd_alloc_track(struct mm_struct *mm, pud_t *pud, > + unsigned long address, > + pgtbl_mod_mask 
*mod_mask) > +{ > + if (unlikely(pud_none(*pud))) { > + if (__pmd_alloc(mm, pud, address)) > + return NULL; > + *mod_mask |= PGTBL_PUD_MODIFIED; > + } > + > + return pmd_offset(pud, address); > +} > +#endif /* CONFIG_MMU */ > + > +#define pte_alloc_kernel_track(pmd, address, mask) \ > + ((unlikely(pmd_none(*(pmd))) &&
[PATCH 00/17] spelling.txt: /decriptors/descriptors/
I wouldn't normally go through spelling fixes, but I caught sight of this typo twice, and then foolishly grepped the tree for it, and saw how pervasive it was. so here I am ... fixing a typo globally... but with an addition in scripts/spelling.txt so it shouldn't re-appear ;-) Cc: linux-arm-ker...@lists.infradead.org (moderated list:TI DAVINCI MACHINE SUPPORT) Cc: linux-ker...@vger.kernel.org (open list) Cc: linux...@vger.kernel.org (open list:DEVICE FREQUENCY EVENT (DEVFREQ-EVENT)) Cc: linux-g...@vger.kernel.org (open list:GPIO SUBSYSTEM) Cc: dri-de...@lists.freedesktop.org (open list:DRM DRIVERS) Cc: linux-r...@vger.kernel.org (open list:HFI1 DRIVER) Cc: linux-in...@vger.kernel.org (open list:INPUT (KEYBOARD, MOUSE, JOYSTICK, TOUCHSCREEN)...) Cc: linux-...@lists.infradead.org (open list:NAND FLASH SUBSYSTEM) Cc: net...@vger.kernel.org (open list:NETWORKING DRIVERS) Cc: ath...@lists.infradead.org (open list:QUALCOMM ATHEROS ATH10K WIRELESS DRIVER) Cc: linux-wirel...@vger.kernel.org (open list:NETWORKING DRIVERS (WIRELESS)) Cc: linux-s...@vger.kernel.org (open list:IBM Power Virtual FC Device Drivers) Cc: linuxppc-dev@lists.ozlabs.org (open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)) Cc: linux-...@vger.kernel.org (open list:USB SUBSYSTEM) Cc: virtualizat...@lists.linux-foundation.org (open list:VIRTIO CORE AND NET DRIVERS) Cc: linux...@kvack.org (open list:MEMORY MANAGEMENT) Kieran Bingham (17): arch: arm: mach-davinci: Fix trivial spelling drivers: infiniband: Fix trivial spelling drivers: gpio: Fix trivial spelling drivers: mtd: nand: raw: Fix trivial spelling drivers: net: Fix trivial spelling drivers: scsi: Fix trivial spelling drivers: usb: Fix trivial spelling drivers: gpu: drm: Fix trivial spelling drivers: regulator: Fix trivial spelling drivers: input: joystick: Fix trivial spelling drivers: infiniband: Fix trivial spelling drivers: devfreq: Fix trivial spelling include: dynamic_debug.h: Fix trivial spelling kernel: trace: Fix trivial spelling mm: Fix 
trivial spelling regulator: gpio: Fix trivial spelling scripts/spelling.txt: Add descriptors correction arch/arm/mach-davinci/board-da830-evm.c | 2 +- drivers/devfreq/devfreq-event.c | 4 ++-- drivers/gpio/TODO| 2 +- drivers/gpu/drm/drm_dp_helper.c | 2 +- drivers/infiniband/hw/hfi1/iowait.h | 2 +- drivers/infiniband/hw/hfi1/ipoib_tx.c| 2 +- drivers/infiniband/hw/hfi1/verbs_txreq.h | 2 +- drivers/input/joystick/spaceball.c | 2 +- drivers/mtd/nand/raw/mxc_nand.c | 2 +- drivers/mtd/nand/raw/nand_bbt.c | 2 +- drivers/net/wan/lmc/lmc_main.c | 2 +- drivers/net/wireless/ath/ath10k/usb.c| 2 +- drivers/net/wireless/ath/ath6kl/usb.c| 2 +- drivers/net/wireless/cisco/airo.c| 2 +- drivers/regulator/fixed.c| 2 +- drivers/regulator/gpio-regulator.c | 2 +- drivers/scsi/ibmvscsi/ibmvfc.c | 2 +- drivers/scsi/ibmvscsi/ibmvscsi.c | 2 +- drivers/scsi/qla2xxx/qla_inline.h| 2 +- drivers/scsi/qla2xxx/qla_iocb.c | 6 +++--- drivers/usb/core/of.c| 2 +- include/drm/drm_dp_helper.h | 2 +- include/linux/dynamic_debug.h| 2 +- kernel/trace/trace_events.c | 2 +- mm/balloon_compaction.c | 4 ++-- scripts/spelling.txt | 1 + 26 files changed, 30 insertions(+), 29 deletions(-) -- 2.25.1
[PATCH 06/17] drivers: scsi: Fix trivial spelling
The word 'descriptor' is misspelled throughout the tree. Fix it up accordingly: decriptors -> descriptors Signed-off-by: Kieran Bingham --- drivers/scsi/ibmvscsi/ibmvfc.c| 2 +- drivers/scsi/ibmvscsi/ibmvscsi.c | 2 +- drivers/scsi/qla2xxx/qla_inline.h | 2 +- drivers/scsi/qla2xxx/qla_iocb.c | 6 +++--- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index 635f6f9cffc4..77f4d37d5bd6 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -1344,7 +1344,7 @@ static void ibmvfc_map_sg_list(struct scsi_cmnd *scmd, int nseg, } /** - * ibmvfc_map_sg_data - Maps dma for a scatterlist and initializes decriptor fields + * ibmvfc_map_sg_data - Maps dma for a scatterlist and initializes descriptor fields * @scmd: struct scsi_cmnd with the scatterlist * @evt: ibmvfc event struct * @vfc_cmd: vfc_cmd that contains the memory descriptor diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c index 44e64aa21194..a92587624c72 100644 --- a/drivers/scsi/ibmvscsi/ibmvscsi.c +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c @@ -667,7 +667,7 @@ static int map_sg_list(struct scsi_cmnd *cmd, int nseg, } /** - * map_sg_data: - Maps dma for a scatterlist and initializes decriptor fields + * map_sg_data: - Maps dma for a scatterlist and initializes descriptor fields * @cmd: struct scsi_cmnd with the scatterlist * @srp_cmd: srp_cmd that contains the memory descriptor * @dev: device for which to map dma memory diff --git a/drivers/scsi/qla2xxx/qla_inline.h b/drivers/scsi/qla2xxx/qla_inline.h index 1fb6ccac07cc..861dc522723c 100644 --- a/drivers/scsi/qla2xxx/qla_inline.h +++ b/drivers/scsi/qla2xxx/qla_inline.h @@ -11,7 +11,7 @@ * Continuation Type 1 IOCBs to allocate. * * @vha: HA context - * @dsds: number of data segment decriptors needed + * @dsds: number of data segment descriptors needed * * Returns the number of IOCB entries needed to store @dsds. 
*/ diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c index 8865c35d3421..1d3c58c5f0e2 100644 --- a/drivers/scsi/qla2xxx/qla_iocb.c +++ b/drivers/scsi/qla2xxx/qla_iocb.c @@ -44,7 +44,7 @@ qla2x00_get_cmd_direction(srb_t *sp) * qla2x00_calc_iocbs_32() - Determine number of Command Type 2 and * Continuation Type 0 IOCBs to allocate. * - * @dsds: number of data segment decriptors needed + * @dsds: number of data segment descriptors needed * * Returns the number of IOCB entries needed to store @dsds. */ @@ -66,7 +66,7 @@ qla2x00_calc_iocbs_32(uint16_t dsds) * qla2x00_calc_iocbs_64() - Determine number of Command Type 3 and * Continuation Type 1 IOCBs to allocate. * - * @dsds: number of data segment decriptors needed + * @dsds: number of data segment descriptors needed * * Returns the number of IOCB entries needed to store @dsds. */ @@ -669,7 +669,7 @@ qla24xx_build_scsi_type_6_iocbs(srb_t *sp, struct cmd_type_6 *cmd_pkt, * qla24xx_calc_dsd_lists() - Determine number of DSD list required * for Command Type 6. * - * @dsds: number of data segment decriptors needed + * @dsds: number of data segment descriptors needed * * Returns the number of dsd list needed to store @dsds. */ -- 2.25.1
Re: Add a new fchmodat4() syscall, v2
* Palmer Dabbelt: > This patch set adds fchmodat4(), a new syscall. The actual > implementation is super simple: essentially it's just the same as > fchmodat(), but LOOKUP_FOLLOW is conditionally set based on the flags. > I've attempted to make this match "man 2 fchmodat" as closely as > possible, which says EINVAL is returned for invalid flags (as opposed to > ENOTSUPP, which is currently returned by glibc for AT_SYMLINK_NOFOLLOW). > I have a sketch of a glibc patch that I haven't even compiled yet, but > seems fairly straight-forward: What's the status here? We'd really like to see this system call in the kernel because our emulation in glibc has its problems (especially if /proc is not mounted). Thanks, Florian
Re: [RFC PATCH] ASoC: fsl_asrc_dma: Fix warning "Cannot create DMA dma:tx symlink"
On Mon, Jun 08, 2020 at 03:07:00PM +0800, Shengjiu Wang wrote: > The issue log is: > > [ 48.021506] CPU: 0 PID: 664 Comm: aplay Not tainted > 5.7.0-rc1-13120-g12b434cbbea0 #343 > [ 48.031063] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) > [ 48.037638] [] (unwind_backtrace) from [] > (show_stack+0x10/0x14) > [ 48.045413] [] (show_stack) from [] > (dump_stack+0xe4/0x118) Please think hard before including complete backtraces in upstream reports, they are very large and contain almost no useful information relative to their size so often obscure the relevant content in your message. If part of the backtrace is usefully illustrative (it often is for search engines if nothing else) then it's usually better to pull out the relevant sections. > --- > include/sound/dmaengine_pcm.h | 11 ++ > include/sound/soc.h | 2 ++ > sound/soc/fsl/fsl_asrc_common.h | 2 ++ > sound/soc/fsl/fsl_asrc_dma.c | 49 +-- > sound/soc/soc-core.c | 3 +- > sound/soc/soc-generic-dmaengine-pcm.c | 12 --- > 6 files changed, 55 insertions(+), 24 deletions(-) Please split the core changes you are adding from the driver changes that use them. The change does look reasonable for the issue, it's not ideal but I'm not sure it's avoidable with DPCM.
Re: [PATCH] powerpc/pseries/svm: Remove unwanted check for shared_lppaca_size
On 09/06/2020 at 12:57, Satheesh Rajendran wrote: Early secure guest boot hits the below crash while booting with vcpus numbers aligned with page boundary for PAGE size of 64k and LPPACA size of 1k i.e 64, 128 etc, due to the BUG_ON assert for shared_lppaca_total_size equal to shared_lppaca_size, [0.00] Partition configured for 64 cpus. [0.00] CPU maps initialized for 1 thread per core [0.00] [ cut here ] [0.00] kernel BUG at arch/powerpc/kernel/paca.c:89! [0.00] Oops: Exception in kernel mode, sig: 5 [#1] [0.00] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries which is not necessary, let's remove it. Reviewed-by: Laurent Dufour Cc: linux-ker...@vger.kernel.org Cc: Thiago Jung Bauermann Cc: Ram Pai Cc: Sukadev Bhattiprolu Cc: Laurent Dufour Signed-off-by: Satheesh Rajendran --- arch/powerpc/kernel/paca.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c index 949eceb25..10b7c54a7 100644 --- a/arch/powerpc/kernel/paca.c +++ b/arch/powerpc/kernel/paca.c @@ -86,7 +86,7 @@ static void *__init alloc_shared_lppaca(unsigned long size, unsigned long align, * This is very early in boot, so no harm done if the kernel crashes at * this point. */ - BUG_ON(shared_lppaca_size >= shared_lppaca_total_size); + BUG_ON(shared_lppaca_size > shared_lppaca_total_size); return ptr; }
[PATCH] mm: Move p?d_alloc_track to separate header file
From: Joerg Roedel The functions are only used in two source files, so there is no need for them to be in the global header. Move them to the new header and include it only where needed. Signed-off-by: Joerg Roedel --- include/linux/mm.h| 45 --- include/linux/pgalloc-track.h | 51 +++ lib/ioremap.c | 1 + mm/vmalloc.c | 1 + 4 files changed, 53 insertions(+), 45 deletions(-) create mode 100644 include/linux/pgalloc-track.h diff --git a/include/linux/mm.h b/include/linux/mm.h index 9d6042178ca7..22d8b2a2c9bc 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2092,51 +2092,11 @@ static inline pud_t *pud_alloc(struct mm_struct *mm, p4d_t *p4d, NULL : pud_offset(p4d, address); } -static inline p4d_t *p4d_alloc_track(struct mm_struct *mm, pgd_t *pgd, -unsigned long address, -pgtbl_mod_mask *mod_mask) - -{ - if (unlikely(pgd_none(*pgd))) { - if (__p4d_alloc(mm, pgd, address)) - return NULL; - *mod_mask |= PGTBL_PGD_MODIFIED; - } - - return p4d_offset(pgd, address); -} - -static inline pud_t *pud_alloc_track(struct mm_struct *mm, p4d_t *p4d, -unsigned long address, -pgtbl_mod_mask *mod_mask) -{ - if (unlikely(p4d_none(*p4d))) { - if (__pud_alloc(mm, p4d, address)) - return NULL; - *mod_mask |= PGTBL_P4D_MODIFIED; - } - - return pud_offset(p4d, address); -} - static inline pmd_t *pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address) { return (unlikely(pud_none(*pud)) && __pmd_alloc(mm, pud, address))? NULL: pmd_offset(pud, address); } - -static inline pmd_t *pmd_alloc_track(struct mm_struct *mm, pud_t *pud, -unsigned long address, -pgtbl_mod_mask *mod_mask) -{ - if (unlikely(pud_none(*pud))) { - if (__pmd_alloc(mm, pud, address)) - return NULL; - *mod_mask |= PGTBL_PUD_MODIFIED; - } - - return pmd_offset(pud, address); -} #endif /* CONFIG_MMU */ #if USE_SPLIT_PTE_PTLOCKS @@ -2252,11 +2212,6 @@ static inline void pgtable_pte_page_dtor(struct page *page) ((unlikely(pmd_none(*(pmd))) && __pte_alloc_kernel(pmd))? 
\ NULL: pte_offset_kernel(pmd, address)) -#define pte_alloc_kernel_track(pmd, address, mask) \ - ((unlikely(pmd_none(*(pmd))) && \ - (__pte_alloc_kernel(pmd) || ({*(mask)|=PGTBL_PMD_MODIFIED;0;})))?\ - NULL: pte_offset_kernel(pmd, address)) - #if USE_SPLIT_PMD_PTLOCKS static struct page *pmd_to_page(pmd_t *pmd) diff --git a/include/linux/pgalloc-track.h b/include/linux/pgalloc-track.h new file mode 100644 index ..1dcc865029a2 --- /dev/null +++ b/include/linux/pgalloc-track.h @@ -0,0 +1,51 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_PGALLLC_TRACK_H +#define _LINUX_PGALLLC_TRACK_H + +#if defined(CONFIG_MMU) +static inline p4d_t *p4d_alloc_track(struct mm_struct *mm, pgd_t *pgd, +unsigned long address, +pgtbl_mod_mask *mod_mask) +{ + if (unlikely(pgd_none(*pgd))) { + if (__p4d_alloc(mm, pgd, address)) + return NULL; + *mod_mask |= PGTBL_PGD_MODIFIED; + } + + return p4d_offset(pgd, address); +} + +static inline pud_t *pud_alloc_track(struct mm_struct *mm, p4d_t *p4d, +unsigned long address, +pgtbl_mod_mask *mod_mask) +{ + if (unlikely(p4d_none(*p4d))) { + if (__pud_alloc(mm, p4d, address)) + return NULL; + *mod_mask |= PGTBL_P4D_MODIFIED; + } + + return pud_offset(p4d, address); +} + +static inline pmd_t *pmd_alloc_track(struct mm_struct *mm, pud_t *pud, +unsigned long address, +pgtbl_mod_mask *mod_mask) +{ + if (unlikely(pud_none(*pud))) { + if (__pmd_alloc(mm, pud, address)) + return NULL; + *mod_mask |= PGTBL_PUD_MODIFIED; + } + + return pmd_offset(pud, address); +} +#endif /* CONFIG_MMU */ + +#define pte_alloc_kernel_track(pmd, address, mask) \ + ((unlikely(pmd_none(*(pmd))) && \ + (__pte_alloc_kernel(pmd) || ({*(mask)|=PGTBL_PMD_MODIFIED;0;})))?\ + NULL: pte_offset_kernel(pmd, address)) + +#endif /* _LINUX_PGALLLC_TRACK_H */ diff --git a/lib/ioremap.c b/lib/ioremap.c index ad485f081
Re: ipr crashes due to NULL dma_need_drain since cc97923a5bcc ("block: move dma drain handling to scsi")
On Tue, Jun 09, 2020 at 08:00:35PM +1000, Michael Ellerman wrote: > Hi all, > > I'm seeing crashes on powerpc with the ipr driver, which I'm fairly sure > are due to dma_need_drain being NULL. Ooops, my changes completely forgot about SAS attached ATAPI devices. I'll cook up a fix in a bit.
[PATCH] powerpc/pseries/svm: Fixup align argument in alloc_shared_lppaca() function
Argument "align" in alloc_shared_lppaca() function was unused inside the function. Let's fix it and update the code comment. Cc: linux-ker...@vger.kernel.org Cc: Thiago Jung Bauermann Cc: Ram Pai Cc: Sukadev Bhattiprolu Cc: Laurent Dufour Signed-off-by: Satheesh Rajendran --- arch/powerpc/kernel/paca.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c index 8d96169c597e..9088e107fb43 100644 --- a/arch/powerpc/kernel/paca.c +++ b/arch/powerpc/kernel/paca.c @@ -70,7 +70,7 @@ static void *__init alloc_shared_lppaca(unsigned long size, unsigned long align, shared_lppaca = memblock_alloc_try_nid(shared_lppaca_total_size, - PAGE_SIZE, MEMBLOCK_LOW_LIMIT, + align, MEMBLOCK_LOW_LIMIT, limit, NUMA_NO_NODE); if (!shared_lppaca) panic("cannot allocate shared data"); @@ -122,7 +122,14 @@ static struct lppaca * __init new_lppaca(int cpu, unsigned long limit) return NULL; if (is_secure_guest()) - lp = alloc_shared_lppaca(LPPACA_SIZE, 0x400, limit, cpu); + /* +* See Documentation/powerpc/ultravisor.rst for more details +* +* UV/HV data share is in PAGE granularity. In order to minimize +* the number of pages shared and maximize the use of a page, +* let's use page align. +*/ + lp = alloc_shared_lppaca(LPPACA_SIZE, PAGE_SIZE, limit, cpu); else lp = alloc_paca_data(LPPACA_SIZE, 0x400, limit, cpu); -- 2.26.2
[PATCH] powerpc/pseries/svm: Remove unwanted check for shared_lppaca_size
Early secure guest boot hits the below crash while booting with vcpus numbers aligned with page boundary for PAGE size of 64k and LPPACA size of 1k i.e 64, 128 etc, due to the BUG_ON assert for shared_lppaca_total_size equal to shared_lppaca_size, [0.00] Partition configured for 64 cpus. [0.00] CPU maps initialized for 1 thread per core [0.00] [ cut here ] [0.00] kernel BUG at arch/powerpc/kernel/paca.c:89! [0.00] Oops: Exception in kernel mode, sig: 5 [#1] [0.00] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries which is not necessary, let's remove it. Cc: linux-ker...@vger.kernel.org Cc: Thiago Jung Bauermann Cc: Ram Pai Cc: Sukadev Bhattiprolu Cc: Laurent Dufour Signed-off-by: Satheesh Rajendran --- arch/powerpc/kernel/paca.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c index 949eceb25..10b7c54a7 100644 --- a/arch/powerpc/kernel/paca.c +++ b/arch/powerpc/kernel/paca.c @@ -86,7 +86,7 @@ static void *__init alloc_shared_lppaca(unsigned long size, unsigned long align, * This is very early in boot, so no harm done if the kernel crashes at * this point. */ - BUG_ON(shared_lppaca_size >= shared_lppaca_total_size); + BUG_ON(shared_lppaca_size > shared_lppaca_total_size); return ptr; } -- 2.26.2
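The off-by-one described in the patch above can be reproduced with a few lines of userspace arithmetic. This is only a sketch, not the kernel code: it assumes the shared area is the per-CPU lppaca size times the CPU count, rounded up to a whole page, and that one slot is handed out per CPU before the check runs. The sketch_* names and constants are made up for illustration; the 64k page and 1k lppaca sizes come from the commit message.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SIZE   (64UL * 1024) /* 64k pages, as in the report */
#define SKETCH_LPPACA_SIZE 1024UL        /* 1k per-CPU lppaca, as in the report */

/* Round x up to a multiple of the page size (assumed total-size rounding). */
static unsigned long sketch_page_align(unsigned long x)
{
	return (x + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Hand out one slot per CPU and report whether the size check would
 * fire at any point. strict selects the old '>=' comparison, otherwise
 * the '>' comparison from the patch is used.
 */
static bool sketch_check_fires(unsigned long ncpus, bool strict)
{
	unsigned long total = sketch_page_align(ncpus * SKETCH_LPPACA_SIZE);
	unsigned long used = 0;

	for (unsigned long cpu = 0; cpu < ncpus; cpu++) {
		used += SKETCH_LPPACA_SIZE;
		if (strict ? used >= total : used > total)
			return true;
	}
	return false;
}
```

With 64 CPUs the last slot brings the used size to exactly 64k, which equals the total, so the '>=' form fires on a perfectly valid allocation while the '>' form does not. With 63 or 65 CPUs the rounding leaves slack and neither form fires.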
Re: [musl] ppc64le and 32-bit LE userland compatibility
On Saturday, May 30, 2020 3:56:47 PM PDT you wrote: > On Friday, May 29, 2020 12:24:27 PM PDT Rich Felker wrote: > > The argument passing for pread/pwrite is historically a mess and > > differs between archs. musl has a dedicated macro that archs can > > define to override it. But it looks like it should match regardless of > > BE vs LE, and musl already defines it for powerpc with the default > > definition, adding a zero arg to start on an even arg-slot index, > > which is an odd register (since ppc32 args start with an odd one, r3). > > > > > [6]: > > > https://gist.github.com/Skirmisher/02891c1a8cafa0ff18b2460933ef4f3c > > > > I don't think this is correct, but I'm confused about where it's > > getting messed up because it looks like it should already be right. > > Hmm, interesting. Will have to go back to it I guess... > > > > This was enough to fix up the `file` bug. I'm no seasoned kernel > > > hacker, though, and there is still concern over the right way to > > > approach this, whether it should live in the kernel or libc, etc. > > > Frankly, I don't know the ABI structure enough to understand why the > > > register padding has to be different in this case, or what > > > lower-level component is responsible for it.. For comparison, I had > > > a > > > look at the mips tree, since it's bi-endian and has a similar 32/64 > > > situation. There is a macro conditional upon endianness that is > > > responsible for munging long longs; it uses __MIPSEB__ and > > > __MIPSEL__ > > > instead of an if/else on the generic __LITTLE_ENDIAN__. Not sure > > > what > > > to make of that. (It also simply swaps registers for LE, unlike what > > > I did for ppc.) > > > > Indeed the problem is probably that you need to swap registers for LE, > > not remove the padding slot. Did you check what happens if you pass a > > value larger than 32 bits? 
> > > > If so, the right way to fix this on the kernel side would be to > > construct the value as a union rather than by bitwise ops so it's > > > > endian-agnostic: > > (union { u32 parts[2]; u64 val; }){{ arg1, arg2 }}.val > > > > But the kernel folks might prefer endian ifdefs for some odd reason... > > You are right, this does seem odd considering what the other archs do. > It's quite possible I made a silly mistake, of course... > > I haven't tested with values outside the 32-bit range yet; again, this > is new territory for me, so I haven't exactly done exhaustive tests on > everything. I'll give it a closer look. I took some cues from the mips linux32 syscall setup, and drafted a new patch defining a macro to compose the hi/lo parts within the function, instead of swapping the args at the function definition. `file /bin/bash` and `truncate -s 5G test` both work correctly now. This appears to be the correct solution, so I'm not sure what silly mistake I made before, but apologies for the confusion. I've updated my gist with the new patch [1]. > > > Also worth noting is the one other outstanding bug, where the > > > time-related syscalls in the 32-bit vDSO seem to return garbage. It > > > doesn't look like an endian bug to me, and it doesn't affect > > > standard > > > syscalls (which is why if you run `date` on musl it prints the > > > correct time, unlike on glibc). The vDSO time functions are > > > implemented in ppc asm (arch/powerpc/kernel/vdso32/ gettimeofday.S), > > > and I've never touched the stuff, so if anyone has a clue I'm all > > > ears. > > > > Not sure about this. Worst-case, just leave it disabled until someone > > finds a fix. > > Apparently these asm implementations are being replaced by the generic C > ones [1], so it may be this fixes itself on its own. 
> > Thanks, > Will [she/her] > > [1]: > https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=173231 I mentioned in Christophe's thread the other day, but his patchset does solve the vdso32 issues, though it introduced problems in vdso64 in my testing. With that solved and the syscall situation established, I think the kernel state is stable enough to start looking at solidifying libc/ compiler stuff. I'll try to get a larger userland built in the near future to try to catch any remaining problems (before rebuilding it all when libc/ABI support becomes explicit). Cheers, Will [she/her] [1]: https://gist.github.com/Skirmisher/02891c1a8cafa0ff18b2460933ef4f3c
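For readers following the hi/lo discussion, the union idea Rich suggested in the quoted text can be tried in plain userspace C. This is a rough sketch only: sketch_compose_u64 is a hypothetical helper name, it is not the macro from the gist, and it says nothing about which registers the ppc32 syscall ABI actually uses.

```c
#include <stdint.h>

/*
 * Compose a 64-bit value from two 32-bit argument slots in memory order:
 * 'first' lands at the lower address and 'second' at the higher one, so
 * the host byte order decides which of them becomes the high half.
 */
static uint64_t sketch_compose_u64(uint32_t first, uint32_t second)
{
	union {
		uint32_t parts[2];
		uint64_t val;
	} u = { .parts = { first, second } };

	return u.val;
}
```

On a little-endian host the first slot ends up as the low 32 bits, on a big-endian host as the high 32 bits, which is why no endian ifdef is needed when the caller splits the value the same way.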
Re: [PATCH v3] selftests: powerpc: Fix CPU affinity for child process
On Tue, Jun 09, 2020 at 01:44:23PM +0530, Harish wrote: > On systems with large number of cpus, test fails trying to set > affinity by calling sched_setaffinity() with smaller size for > affinity mask. This patch fixes it by making sure that the size of > allocated affinity mask is dependent on the number of CPUs as > reported by get_nprocs(). > > Fixes: 00b7ec5c9cf3 ("selftests/powerpc: Import Anton's context_switch2 > benchmark") > Reported-by: Shirisha Ganta > Signed-off-by: Sandipan Das > Signed-off-by: Harish > --- Reviewed-by: Satheesh Rajendran > v2: > https://lore.kernel.org/linuxppc-dev/20200609034005.520137-1-har...@linux.ibm.com/ > > Changes from v2: > - Interchanged size and ncpus as suggested by Satheesh > - Revert the exit code as suggested by Satheesh > - Added NULL check for the affinity mask as suggested by Kamalesh > - Freed the affinity mask allocation after affinity is set > as suggested by Kamalesh > - Changed "cpu set" to "affinity mask" in the commit message > > --- > .../powerpc/benchmarks/context_switch.c | 21 ++- > 1 file changed, 16 insertions(+), 5 deletions(-) > > diff --git a/tools/testing/selftests/powerpc/benchmarks/context_switch.c > b/tools/testing/selftests/powerpc/benchmarks/context_switch.c > index a2e8c9da7fa5..d50cc05df495 100644 > --- a/tools/testing/selftests/powerpc/benchmarks/context_switch.c > +++ b/tools/testing/selftests/powerpc/benchmarks/context_switch.c > @@ -19,6 +19,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -104,8 +105,9 @@ static void start_thread_on(void *(*fn)(void *), void > *arg, unsigned long cpu) > > static void start_process_on(void *(*fn)(void *), void *arg, unsigned long > cpu) > { > - int pid; > - cpu_set_t cpuset; > + int pid, ncpus; > + cpu_set_t *cpuset; > + size_t size; > > pid = fork(); > if (pid == -1) { > @@ -116,14 +118,23 @@ static void start_process_on(void *(*fn)(void *), void > *arg, unsigned long cpu) > if (pid) > return; > > - CPU_ZERO(&cpuset); 
> - CPU_SET(cpu, &cpuset); > + ncpus = get_nprocs(); > + size = CPU_ALLOC_SIZE(ncpus); > + cpuset = CPU_ALLOC(ncpus); > + if (!cpuset) { > + perror("malloc"); > + exit(1); > + } > + CPU_ZERO_S(size, cpuset); > + CPU_SET_S(cpu, size, cpuset); > > - if (sched_setaffinity(0, sizeof(cpuset), &cpuset)) { > + if (sched_setaffinity(0, size, cpuset)) { > perror("sched_setaffinity"); > + CPU_FREE(cpuset); > exit(1); > } > > + CPU_FREE(cpuset); > fn(arg); > > exit(0); > -- > 2.24.1 >
ipr crashes due to NULL dma_need_drain since cc97923a5bcc ("block: move dma drain handling to scsi")
Hi all, I'm seeing crashes on powerpc with the ipr driver, which I'm fairly sure are due to dma_need_drain being NULL. The backtrace is: scsi_init_io+0x1d8/0x350 scsi_queue_rq+0x7a4/0xc30 blk_mq_dispatch_rq_list+0x1b0/0x910 blk_mq_sched_dispatch_requests+0x154/0x270 __blk_mq_run_hw_queue+0xa0/0x160 __blk_mq_delay_run_hw_queue+0x244/0x250 blk_mq_sched_insert_request+0x13c/0x250 blk_execute_rq_nowait+0x88/0xb0 blk_execute_rq+0x5c/0xf0 __scsi_execute+0x10c/0x270 scsi_mode_sense+0x144/0x440 sr_probe+0x2e8/0x810 really_probe+0x12c/0x580 driver_probe_device+0x88/0x170 device_driver_attach+0x11c/0x130 __driver_attach+0xac/0x190 bus_for_each_dev+0xa8/0x130 driver_attach+0x34/0x50 bus_add_driver+0x170/0x2b0 driver_register+0xb4/0x1c0 scsi_register_driver+0x2c/0x40 init_sr+0x4c/0x80 do_one_initcall+0x60/0x2b0 kernel_init_freeable+0x2e0/0x3a0 kernel_init+0x2c/0x148 ret_from_kernel_thread+0x5c/0x74 And looking at the disassembly I think it's coming from: static inline bool scsi_cmd_needs_dma_drain(struct scsi_device *sdev, struct request *rq) { return sdev->dma_drain_len && blk_rq_is_passthrough(rq) && !op_is_write(req_op(rq)) && sdev->host->hostt->dma_need_drain(rq); ^^ } Bisect agrees: # first bad commit: [cc97923a5bccc776851c242b61015faf288d5c22] block: move dma drain handling to scsi And looking at ipr.c, it constructs its scsi_host_template manually, without using any of the macros that end up calling __ATA_BASE_SHT, which populates dma_need_drain. The obvious fix below works, the system boots and seems to be operating normally, but I don't know enough (anything) about SCSI to say if it's actually the correct fix. 
cheers diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c index 7d77997d26d4..7d86f4ca266c 100644 --- a/drivers/scsi/ipr.c +++ b/drivers/scsi/ipr.c @@ -6731,6 +6731,7 @@ static struct scsi_host_template driver_template = { .compat_ioctl = ipr_ioctl, #endif .queuecommand = ipr_queuecommand, + .dma_need_drain = ata_scsi_dma_need_drain, .eh_abort_handler = ipr_eh_abort, .eh_device_reset_handler = ipr_eh_dev_reset, .eh_host_reset_handler = ipr_eh_host_reset,
Re: [PATCH v3] selftests: powerpc: Fix CPU affinity for child process
On 6/9/20 1:44 PM, Harish wrote: > On systems with large number of cpus, test fails trying to set > affinity by calling sched_setaffinity() with smaller size for > affinity mask. This patch fixes it by making sure that the size of > allocated affinity mask is dependent on the number of CPUs as > reported by get_nprocs(). > > Fixes: 00b7ec5c9cf3 ("selftests/powerpc: Import Anton's context_switch2 > benchmark") > Reported-by: Shirisha Ganta > Signed-off-by: Sandipan Das > Signed-off-by: Harish LGTM, Reviewed-by: Kamalesh Babulal -- Kamalesh
[PATCH] ASoC: fsl_ssi: Fix bclk calculation for mono channel
For mono channel, ssi will switch to normal mode. In normal mode, the Word Length Control bits control the word length divider in the clock generator, which is different from I2S master mode, where the word length is fixed to 32bit. So we refine the formula for mono channel; otherwise there will be a sound issue for S24_LE. Fixes: b0a7043d5c2c ("ASoC: fsl_ssi: Caculate bit clock rate using slot number and width") Signed-off-by: Shengjiu Wang --- sound/soc/fsl/fsl_ssi.c | 5 + 1 file changed, 5 insertions(+) diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c index bad89b0d129e..e347776590f7 100644 --- a/sound/soc/fsl/fsl_ssi.c +++ b/sound/soc/fsl/fsl_ssi.c @@ -695,6 +695,11 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream *substream, /* Generate bit clock based on the slot number and slot width */ freq = slots * slot_width * params_rate(hw_params); + /* The slot_width is not fixed to 32 for normal mode */ + if (params_channels(hw_params) == 1) + freq = (slots <= 1 ? 2 : slots) * params_width(hw_params) * + params_rate(hw_params); + /* Don't apply it to any non-baudclk circumstance */ if (IS_ERR(ssi->baudclk)) return -EINVAL; -- 2.21.0
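As a worked example of the mono-channel adjustment in the patch above, the two frequency expressions can be evaluated outside the driver. This is a sketch under assumptions: sketch_bclk_hz is a hypothetical helper that takes plain integers in place of the params_*() accessors, and the values used afterwards (S24_LE, 48 kHz, 32-bit slots) are illustrative, not taken from a real board.

```c
/*
 * Mirror of the two expressions in the patch: the generic
 * slots * slot_width * rate product, overridden for mono streams by
 * max(slots, 2) * sample_width * rate.
 */
static unsigned long sketch_bclk_hz(unsigned int channels, unsigned int slots,
				    unsigned int slot_width,
				    unsigned int sample_width,
				    unsigned int rate)
{
	unsigned long freq = (unsigned long)slots * slot_width * rate;

	/* Normal mode: the word length follows the sample width, not 32 */
	if (channels == 1)
		freq = (unsigned long)(slots <= 1 ? 2 : slots) *
		       sample_width * rate;

	return freq;
}
```

For a mono S24_LE stream at 48000 Hz with one slot, the adjusted expression gives 2 * 24 * 48000 = 2304000 Hz. A stereo stream with two 32-bit slots is unaffected and stays at 2 * 32 * 48000 = 3072000 Hz.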
[PATCH v3] selftests: powerpc: Fix CPU affinity for child process
On systems with large number of cpus, test fails trying to set affinity by calling sched_setaffinity() with smaller size for affinity mask. This patch fixes it by making sure that the size of allocated affinity mask is dependent on the number of CPUs as reported by get_nprocs(). Fixes: 00b7ec5c9cf3 ("selftests/powerpc: Import Anton's context_switch2 benchmark") Reported-by: Shirisha Ganta Signed-off-by: Sandipan Das Signed-off-by: Harish --- v2: https://lore.kernel.org/linuxppc-dev/20200609034005.520137-1-har...@linux.ibm.com/ Changes from v2: - Interchanged size and ncpus as suggested by Satheesh - Revert the exit code as suggested by Satheesh - Added NULL check for the affinity mask as suggested by Kamalesh - Freed the affinity mask allocation after affinity is set as suggested by Kamalesh - Changed "cpu set" to "affinity mask" in the commit message --- .../powerpc/benchmarks/context_switch.c | 21 ++- 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/powerpc/benchmarks/context_switch.c b/tools/testing/selftests/powerpc/benchmarks/context_switch.c index a2e8c9da7fa5..d50cc05df495 100644 --- a/tools/testing/selftests/powerpc/benchmarks/context_switch.c +++ b/tools/testing/selftests/powerpc/benchmarks/context_switch.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include #include @@ -104,8 +105,9 @@ static void start_thread_on(void *(*fn)(void *), void *arg, unsigned long cpu) static void start_process_on(void *(*fn)(void *), void *arg, unsigned long cpu) { - int pid; - cpu_set_t cpuset; + int pid, ncpus; + cpu_set_t *cpuset; + size_t size; pid = fork(); if (pid == -1) { @@ -116,14 +118,23 @@ static void start_process_on(void *(*fn)(void *), void *arg, unsigned long cpu) if (pid) return; - CPU_ZERO(&cpuset); - CPU_SET(cpu, &cpuset); + ncpus = get_nprocs(); + size = CPU_ALLOC_SIZE(ncpus); + cpuset = CPU_ALLOC(ncpus); + if (!cpuset) { + perror("malloc"); + exit(1); + } + CPU_ZERO_S(size, cpuset); + CPU_SET_S(cpu, 
size, cpuset); - if (sched_setaffinity(0, sizeof(cpuset), &cpuset)) { + if (sched_setaffinity(0, size, cpuset)) { perror("sched_setaffinity"); + CPU_FREE(cpuset); exit(1); } + CPU_FREE(cpuset); fn(arg); exit(0); -- 2.24.1
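The dynamically sized mask handling from the patch above can be exercised on its own, roughly as follows. This is a minimal sketch, assuming glibc's CPU_ALLOC family of macros; sketch_single_cpu_mask is a hypothetical helper and not part of the selftest.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for the CPU_*_S macros in <sched.h> */
#endif
#include <sched.h>
#include <stddef.h>

/*
 * Allocate a mask with room for ncpus CPUs and set only 'cpu' in it.
 * On success the mask is returned (release it with CPU_FREE) and its
 * size in bytes is stored in *size. NULL is returned if the allocation
 * fails or cpu is out of range.
 */
static cpu_set_t *sketch_single_cpu_mask(int ncpus, int cpu, size_t *size)
{
	cpu_set_t *mask;

	if (ncpus <= 0 || cpu < 0 || cpu >= ncpus)
		return NULL;

	mask = CPU_ALLOC(ncpus);
	if (!mask)
		return NULL;

	*size = CPU_ALLOC_SIZE(ncpus);
	CPU_ZERO_S(*size, mask);
	CPU_SET_S(cpu, *size, mask);
	return mask;
}
```

The returned size is the value that would be passed to sched_setaffinity(0, size, mask). A fixed cpu_set_t only has room for CPU_SETSIZE CPUs, which is too small on the large systems described in the commit message.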
[PATCH v2] selftests: powerpc: Fix online CPU selection
The size of the CPU affinity mask must be large enough for systems with a very large number of CPUs. Otherwise, tests which try to determine the first online CPU by calling sched_getaffinity() will fail. This makes sure that the size of the allocated affinity mask is dependent on the number of CPUs as reported by get_nprocs(). Fixes: 3752e453f6ba ("selftests/powerpc: Add tests of PMU EBBs") Reported-by: Shirisha Ganta Signed-off-by: Sandipan Das Reviewed-by: Kamalesh Babulal --- Previous versions can be found at: v1: https://lore.kernel.org/linuxppc-dev/20200608144212.985144-1-sandi...@linux.ibm.com/ Changes in v2: - Added NULL check for the affinity mask as suggested by Kamalesh. - Changed "cpu set" to "CPU affinity mask" in the commit message. --- tools/testing/selftests/powerpc/utils.c | 37 + 1 file changed, 25 insertions(+), 12 deletions(-) diff --git a/tools/testing/selftests/powerpc/utils.c b/tools/testing/selftests/powerpc/utils.c index 933678f1ed0a..798fa8fdd5f4 100644 --- a/tools/testing/selftests/powerpc/utils.c +++ b/tools/testing/selftests/powerpc/utils.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include #include @@ -88,28 +89,40 @@ void *get_auxv_entry(int type) int pick_online_cpu(void) { - cpu_set_t mask; - int cpu; + int ncpus, cpu = -1; + cpu_set_t *mask; + size_t size; + + ncpus = get_nprocs(); + size = CPU_ALLOC_SIZE(ncpus); + mask = CPU_ALLOC(ncpus); + if (!mask) { + perror("malloc"); + return -1; + } - CPU_ZERO(&mask); + CPU_ZERO_S(size, mask); - if (sched_getaffinity(0, sizeof(mask), &mask)) { + if (sched_getaffinity(0, size, mask)) { perror("sched_getaffinity"); - return -1; + goto done; } /* We prefer a primary thread, but skip 0 */ - for (cpu = 8; cpu < CPU_SETSIZE; cpu += 8) - if (CPU_ISSET(cpu, &mask)) - return cpu; + for (cpu = 8; cpu < ncpus; cpu += 8) + if (CPU_ISSET_S(cpu, size, mask)) + goto done; /* Search for anything, but in reverse */ - for (cpu = CPU_SETSIZE - 1; cpu >= 0; cpu--) - if (CPU_ISSET(cpu, 
&mask)) - return cpu; + for (cpu = ncpus - 1; cpu >= 0; cpu--) + if (CPU_ISSET_S(cpu, size, mask)) + goto done; printf("No cpus in affinity mask?!\n"); - return -1; + +done: + CPU_FREE(mask); + return cpu; } bool is_ppc64le(void) -- 2.25.1
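For reference, the dynamically sized mask pattern used by this patch can be tried outside the selftest harness. What follows is only a minimal sketch, not the selftest code: the name pick_online_cpu_sketch is made up for illustration, and glibc is assumed (for CPU_ALLOC() and get_nprocs()).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <sys/sysinfo.h>

/*
 * Hypothetical stand-alone variant of pick_online_cpu(): returns a CPU
 * number taken from the caller's affinity mask, or -1 if none is found.
 */
int pick_online_cpu_sketch(void)
{
	int ncpus, cpu = -1;
	cpu_set_t *mask;
	size_t size;

	ncpus = get_nprocs();
	assert(ncpus > 0);
	size = CPU_ALLOC_SIZE(ncpus);
	mask = CPU_ALLOC(ncpus);
	if (!mask) {
		perror("CPU_ALLOC");
		return -1;
	}

	CPU_ZERO_S(size, mask);

	if (sched_getaffinity(0, size, mask)) {
		perror("sched_getaffinity");
		goto done;	/* cpu is still -1 here */
	}

	/* Prefer a primary thread (multiple of 8), but skip CPU 0 */
	for (cpu = 8; cpu < ncpus; cpu += 8)
		if (CPU_ISSET_S(cpu, size, mask))
			goto done;

	/* Otherwise take anything, searching in reverse */
	for (cpu = ncpus - 1; cpu >= 0; cpu--)
		if (CPU_ISSET_S(cpu, size, mask))
			goto done;

	/* The reverse loop leaves cpu == -1 when no bit is set */
done:
	CPU_FREE(mask);
	return cpu;
}
```

Every exit path after a successful allocation goes through the done label, so the mask is freed exactly once.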
Re: [PATCH] powerpc/pseries/svm: Remove unwanted check for shared_lppaca_size
On 09/06/2020 at 07:38, sathn...@linux.vent.ibm.com wrote:
From: Satheesh Rajendran

Early secure guest boot hits the below crash while booting with vcpus numbers aligned with page boundary for PAGE size of 64k and LPPACA size of 1k i.e 64, 128 etc, due to the BUG_ON assert for shared_lppaca_total_size equal to shared_lppaca_size,

[0.00] Partition configured for 64 cpus.
[0.00] CPU maps initialized for 1 thread per core
[0.00] [ cut here ]
[0.00] kernel BUG at arch/powerpc/kernel/paca.c:89!
[0.00] Oops: Exception in kernel mode, sig: 5 [#1]
[0.00] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries

which is not necessary, let's remove it.

Reviewed-by: Laurent Dufour

Cc: linuxppc-dev@lists.ozlabs.org
Cc: Thiago Jung Bauermann
Cc: Ram Pai
Cc: Sukadev Bhattiprolu
Cc: Laurent Dufour
Signed-off-by: Satheesh Rajendran
---
 arch/powerpc/kernel/paca.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 949eceb25..10b7c54a7 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -86,7 +86,7 @@ static void *__init alloc_shared_lppaca(unsigned long size, unsigned long align,
 	 * This is very early in boot, so no harm done if the kernel crashes at
 	 * this point.
 	 */
-	BUG_ON(shared_lppaca_size >= shared_lppaca_total_size);
+	BUG_ON(shared_lppaca_size > shared_lppaca_total_size);

 	return ptr;
 }
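The arithmetic behind the crash can be checked with a small user-space model. This is a sketch under stated assumptions: the helper names are invented, the pool is assumed to be the per-CPU size times the CPU count rounded up to a page, and only the 64k page and 1k LPPACA sizes come from the report above.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE   (64 * 1024UL)	/* 64k pages, as in the report */
#define MODEL_LPPACA_SIZE (1024UL)	/* 1k per-CPU lppaca */

/* Round x up to the next multiple of the (power-of-two) page size */
static unsigned long model_page_align(unsigned long x)
{
	return (x + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
}

/*
 * Hand out one lppaca per CPU from a page-aligned pool and report
 * whether the bounds check would fire. 'strict' selects the old '>='
 * comparison; otherwise the new '>' comparison is used.
 */
bool lppaca_check_fires(unsigned long nr_cpus, bool strict)
{
	unsigned long total = model_page_align(nr_cpus * MODEL_LPPACA_SIZE);
	unsigned long used = 0;

	assert(nr_cpus > 0);

	for (unsigned long cpu = 0; cpu < nr_cpus; cpu++) {
		used += MODEL_LPPACA_SIZE;
		if (strict ? used >= total : used > total)
			return true;
	}
	return false;
}
```

With 64 CPUs the pool is exactly one 64k page, so the last allocation makes the used size equal to the total. That is a full pool, not an overflow, and only the '>=' form treats it as a bug.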
Re: [PATCH v2] selftests: powerpc: Fix CPU affinity for child process
On 6/9/20 9:10 AM, Harish wrote:
> On systems with large number of cpus, test fails trying to set
> affinity for child process by calling sched_setaffinity() with
> smaller size for cpuset. This patch fixes it by making sure that
> the size of allocated cpu set is dependent on the number of CPUs
> as reported by get_nprocs().
>
> Fixes: 00b7ec5c9cf3 ("selftests/powerpc: Import Anton's context_switch2 benchmark")
> Reported-by: Shirisha Ganta
> Signed-off-by: Harish
> Signed-off-by: Sandipan Das
> ---
> .../powerpc/benchmarks/context_switch.c | 18 --
> 1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/tools/testing/selftests/powerpc/benchmarks/context_switch.c b/tools/testing/selftests/powerpc/benchmarks/context_switch.c
> index a2e8c9da7fa5..de6c49d6f88f 100644
> --- a/tools/testing/selftests/powerpc/benchmarks/context_switch.c
> +++ b/tools/testing/selftests/powerpc/benchmarks/context_switch.c
> @@ -19,6 +19,7 @@
> #include
> #include
> #include
> +#include
> #include
> #include
> #include
> @@ -104,8 +105,9 @@ static void start_thread_on(void *(*fn)(void *), void *arg, unsigned long cpu)
>
> static void start_process_on(void *(*fn)(void *), void *arg, unsigned long cpu)
> {
> -	int pid;
> -	cpu_set_t cpuset;
> +	int pid, ncpus;
> +	cpu_set_t *cpuset;
> +	size_t size;
>
> 	pid = fork();
> 	if (pid == -1) {
> @@ -116,12 +118,16 @@ static void start_process_on(void *(*fn)(void *), void *arg, unsigned long cpu)
> 	if (pid)
> 		return;
>
> -	CPU_ZERO(&cpuset);
> -	CPU_SET(cpu, &cpuset);
> +	size = CPU_ALLOC_SIZE(ncpus);
> +	ncpus = get_nprocs();
> +	cpuset = CPU_ALLOC(ncpus);

CPU_ALLOC() allocation failure needs to be checked, like malloc() allocations.

> +	CPU_ZERO_S(size, cpuset);
> +	CPU_SET_S(cpu, size, cpuset);
>
> -	if (sched_setaffinity(0, sizeof(cpuset), &cpuset)) {
> +	if (sched_setaffinity(0, size, cpuset)) {
> 		perror("sched_setaffinity");
> -		exit(1);
> +		CPU_FREE(cpuset);
> +		exit(-1);
> 	}

once the cpu affinity is set, you probably want to free the cpuset mask.

>
> 	fn(arg);
> --

Kamalesh
[PATCH 7/7] powerpc/64s: advertise hardware link stack flush
For testing only at the moment, firmware does not define these bits. --- arch/powerpc/include/asm/hvcall.h | 1 + arch/powerpc/include/uapi/asm/kvm.h | 1 + arch/powerpc/kvm/powerpc.c| 9 +++-- arch/powerpc/platforms/powernv/setup.c| 3 +++ arch/powerpc/platforms/pseries/setup.c| 3 +++ tools/arch/powerpc/include/uapi/asm/kvm.h | 1 + 6 files changed, 16 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h index e90c073e437e..a92a07c89b6f 100644 --- a/arch/powerpc/include/asm/hvcall.h +++ b/arch/powerpc/include/asm/hvcall.h @@ -373,6 +373,7 @@ #define H_CPU_CHAR_THREAD_RECONFIG_CTRL(1ull << 57) // IBM bit 6 #define H_CPU_CHAR_COUNT_CACHE_DISABLED(1ull << 56) // IBM bit 7 #define H_CPU_CHAR_BCCTR_FLUSH_ASSIST (1ull << 54) // IBM bit 9 +#define H_CPU_CHAR_BCCTR_LINK_FLUSH_ASSIST (1ull << 53) // IBM bit 10 #define H_CPU_BEHAV_FAVOUR_SECURITY(1ull << 63) // IBM bit 0 #define H_CPU_BEHAV_L1D_FLUSH_PR (1ull << 62) // IBM bit 1 diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h index 264e266a85bf..dd229d5f46ee 100644 --- a/arch/powerpc/include/uapi/asm/kvm.h +++ b/arch/powerpc/include/uapi/asm/kvm.h @@ -464,6 +464,7 @@ struct kvm_ppc_cpu_char { #define KVM_PPC_CPU_CHAR_MTTRIG_THR_RECONF (1ULL << 57) #define KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS (1ULL << 56) #define KVM_PPC_CPU_CHAR_BCCTR_FLUSH_ASSIST(1ull << 54) +#define KVM_PPC_CPU_CHAR_BCCTR_LINK_FLUSH_ASSIST (1ull << 53) #define KVM_PPC_CPU_BEHAV_FAVOUR_SECURITY (1ULL << 63) #define KVM_PPC_CPU_BEHAV_L1D_FLUSH_PR (1ULL << 62) diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 27ccff612903..fa981ee09dec 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -2221,7 +2221,8 @@ static int pseries_get_cpu_char(struct kvm_ppc_cpu_char *cp) KVM_PPC_CPU_CHAR_BR_HINT_HONOURED | KVM_PPC_CPU_CHAR_MTTRIG_THR_RECONF | KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS | - KVM_PPC_CPU_CHAR_BCCTR_FLUSH_ASSIST; + 
KVM_PPC_CPU_CHAR_BCCTR_FLUSH_ASSIST | + KVM_PPC_CPU_CHAR_BCCTR_LINK_FLUSH_ASSIST; cp->behaviour_mask = KVM_PPC_CPU_BEHAV_FAVOUR_SECURITY | KVM_PPC_CPU_BEHAV_L1D_FLUSH_PR | KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR | @@ -2287,13 +2288,17 @@ static int kvmppc_get_cpu_char(struct kvm_ppc_cpu_char *cp) if (have_fw_feat(fw_features, "enabled", "fw-count-cache-flush-bcctr2,0,0")) cp->character |= KVM_PPC_CPU_CHAR_BCCTR_FLUSH_ASSIST; + if (have_fw_feat(fw_features, "enabled", +"fw-link-stack-flush-bcctr2,0,0")) + cp->character |= KVM_PPC_CPU_CHAR_BCCTR_LINK_FLUSH_ASSIST; cp->character_mask = KVM_PPC_CPU_CHAR_SPEC_BAR_ORI31 | KVM_PPC_CPU_CHAR_BCCTRL_SERIALISED | KVM_PPC_CPU_CHAR_L1D_FLUSH_ORI30 | KVM_PPC_CPU_CHAR_L1D_FLUSH_TRIG2 | KVM_PPC_CPU_CHAR_L1D_THREAD_PRIV | KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS | - KVM_PPC_CPU_CHAR_BCCTR_FLUSH_ASSIST; + KVM_PPC_CPU_CHAR_BCCTR_FLUSH_ASSIST | + KVM_PPC_CPU_CHAR_BCCTR_LINK_FLUSH_ASSIST; if (have_fw_feat(fw_features, "enabled", "speculation-policy-favor-security")) diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c index 3bc188da82ba..1a06d3b4c0a9 100644 --- a/arch/powerpc/platforms/powernv/setup.c +++ b/arch/powerpc/platforms/powernv/setup.c @@ -78,6 +78,9 @@ static void init_fw_feat_flags(struct device_node *np) if (fw_feature_is("enabled", "fw-count-cache-flush-bcctr2,0,0", np)) security_ftr_set(SEC_FTR_BCCTR_FLUSH_ASSIST); + if (fw_feature_is("enabled", "fw-link-stack-flush-bcctr2,0,0", np)) + security_ftr_set(SEC_FTR_BCCTR_LINK_FLUSH_ASSIST); + if (fw_feature_is("enabled", "needs-count-cache-flush-on-context-switch", np)) security_ftr_set(SEC_FTR_FLUSH_COUNT_CACHE); diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 64d18f4bf093..70c9264f23c5 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -517,6 +517,9 @@ static void init_cpu_char_feature_flags(struct h_cpu_char_result *result) if (result->character & 
H_CPU_CHAR_BCCTR_FLUSH_ASSIST) security_ftr_set(SEC_FTR_BCCTR_FLUSH_ASSIST); + if (result->character & H_CPU_CHAR_BCCTR_LINK_FLUSH_ASSIST) + security_ftr_set(S
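The "IBM bit N" comments next to the H_CPU_CHAR_* definitions use big-endian bit numbering, where bit 0 is the most significant bit of the 64-bit word. A minimal sketch of the conversion follows; the helper name ibm_bit64 is hypothetical and not a kernel API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * IBM (big-endian) bit numbering counts from the most significant bit:
 * IBM bit 0 of a 64-bit word is 1ull << 63, IBM bit 63 is 1ull << 0.
 */
uint64_t ibm_bit64(unsigned int ibm_bit)
{
	assert(ibm_bit < 64);
	return 1ull << (63 - ibm_bit);
}
```

For example, IBM bit 10 maps to 1ull << 53, which matches the value of the new H_CPU_CHAR_BCCTR_LINK_FLUSH_ASSIST bit.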
[PATCH 6/7] powerpc/security: Allow for processors that flush the link stack using the special bcctr
If both count cache and link stack are to be flushed, and can be flushed with the special bcctr, patch that in directly to the flush/branch nop site. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/security_features.h | 2 ++ arch/powerpc/kernel/security.c | 27 ++-- 2 files changed, 21 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/include/asm/security_features.h b/arch/powerpc/include/asm/security_features.h index 7c05e95a5c44..fbb8fa32150f 100644 --- a/arch/powerpc/include/asm/security_features.h +++ b/arch/powerpc/include/asm/security_features.h @@ -63,6 +63,8 @@ static inline bool security_ftr_enabled(u64 feature) // bcctr 2,0,0 triggers a hardware assisted count cache flush #define SEC_FTR_BCCTR_FLUSH_ASSIST 0x0800ull +// bcctr 2,0,0 triggers a hardware assisted link stack flush +#define SEC_FTR_BCCTR_LINK_FLUSH_ASSIST0x2000ull // Features indicating need for Spectre/Meltdown mitigations diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c index 2a413af21124..6ad5c753d47c 100644 --- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c @@ -219,24 +219,25 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c if (ccd) seq_buf_printf(&s, "Indirect branch cache disabled"); - if (link_stack_flush_type == BRANCH_CACHE_FLUSH_SW) - seq_buf_printf(&s, ", Software link stack flush"); - } else if (count_cache_flush_type != BRANCH_CACHE_FLUSH_NONE) { seq_buf_printf(&s, "Mitigation: Software count cache flush"); if (count_cache_flush_type == BRANCH_CACHE_FLUSH_HW) seq_buf_printf(&s, " (hardware accelerated)"); - if (link_stack_flush_type == BRANCH_CACHE_FLUSH_SW) - seq_buf_printf(&s, ", Software link stack flush"); - } else if (btb_flush_enabled) { seq_buf_printf(&s, "Mitigation: Branch predictor state flush"); } else { seq_buf_printf(&s, "Vulnerable"); } + if (bcs || ccd || count_cache_flush_type != BRANCH_CACHE_FLUSH_NONE) { + if (link_stack_flush_type != BRANCH_CACHE_FLUSH_NONE) + 
seq_buf_printf(&s, ", Software link stack flush"); + if (link_stack_flush_type == BRANCH_CACHE_FLUSH_HW) + seq_buf_printf(&s, " (hardware accelerated)"); + } + seq_buf_printf(&s, "\n"); return s.len; @@ -435,6 +436,7 @@ static void update_branch_cache_flush(void) patch_instruction_site(&patch__call_kvm_flush_link_stack, ppc_inst(PPC_INST_NOP)); } else { + // Could use HW flush, but that could also flush count cache patch_branch_site(&patch__call_kvm_flush_link_stack, (u64)&kvm_flush_link_stack, BRANCH_SET_LINK); } @@ -445,6 +447,10 @@ static void update_branch_cache_flush(void) link_stack_flush_type == BRANCH_CACHE_FLUSH_NONE) { patch_instruction_site(&patch__call_flush_branch_caches, ppc_inst(PPC_INST_NOP)); + } else if (count_cache_flush_type == BRANCH_CACHE_FLUSH_HW && + link_stack_flush_type == BRANCH_CACHE_FLUSH_HW) { + patch_instruction_site(&patch__call_flush_branch_caches, + ppc_inst(PPC_INST_BCCTR_FLUSH)); } else { patch_branch_site(&patch__call_flush_branch_caches, (u64)&flush_branch_caches, BRANCH_SET_LINK); @@ -485,8 +491,13 @@ static void toggle_branch_cache_flush(bool enable) pr_info("link-stack-flush: flush disabled.\n"); } } else { - link_stack_flush_type = BRANCH_CACHE_FLUSH_SW; - pr_info("link-stack-flush: software flush enabled.\n"); + if (security_ftr_enabled(SEC_FTR_BCCTR_LINK_FLUSH_ASSIST)) { + link_stack_flush_type = BRANCH_CACHE_FLUSH_HW; + pr_info("link-stack-flush: hardware flush enabled.\n"); + } else { + link_stack_flush_type = BRANCH_CACHE_FLUSH_SW; + pr_info("link-stack-flush: software flush enabled.\n"); + } } update_branch_cache_flush(); -- 2.23.0
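The decision made in update_branch_cache_flush() for the _switch patch site can be summarised as a pure function. This is a simplified model for discussion only: the patch_site_action labels and choose_switch_site() are hypothetical names, and the early-return patching inside flush_branch_caches is left out.

```c
#include <assert.h>

/* Values mirror the enum used in the series */
enum branch_cache_flush_type {
	BRANCH_CACHE_FLUSH_NONE	= 0x1,
	BRANCH_CACHE_FLUSH_SW	= 0x2,
	BRANCH_CACHE_FLUSH_HW	= 0x4,
};

/* Hypothetical labels for what ends up at the _switch patch site */
enum patch_site_action {
	SITE_NOP,		/* no flushing at all */
	SITE_BCCTR_FLUSH,	/* bcctr 2,0,0 patched in directly */
	SITE_BRANCH,		/* bl flush_branch_caches */
};

static int valid_flush_type(enum branch_cache_flush_type t)
{
	return t == BRANCH_CACHE_FLUSH_NONE ||
	       t == BRANCH_CACHE_FLUSH_SW ||
	       t == BRANCH_CACHE_FLUSH_HW;
}

enum patch_site_action
choose_switch_site(enum branch_cache_flush_type count_cache,
		   enum branch_cache_flush_type link_stack)
{
	assert(valid_flush_type(count_cache));
	assert(valid_flush_type(link_stack));

	/* Nothing to flush: leave a nop at the site */
	if (count_cache == BRANCH_CACHE_FLUSH_NONE &&
	    link_stack == BRANCH_CACHE_FLUSH_NONE)
		return SITE_NOP;

	/* Both flushable by the special bcctr: patch it in directly */
	if (count_cache == BRANCH_CACHE_FLUSH_HW &&
	    link_stack == BRANCH_CACHE_FLUSH_HW)
		return SITE_BCCTR_FLUSH;

	/* Any other combination goes through flush_branch_caches */
	return SITE_BRANCH;
}
```

Note that a hardware link stack flush on its own still takes the branch path in this model. The direct bcctr is only used when both structures are to be flushed in hardware.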
[PATCH 5/7] powerpc/64s: Move branch cache flushing bcctr variant to ppc-ops.h
Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/ppc-opcode.h | 2 ++ arch/powerpc/kernel/entry_64.S| 6 ++ 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h index 2a39c716c343..79d511a38bbb 100644 --- a/arch/powerpc/include/asm/ppc-opcode.h +++ b/arch/powerpc/include/asm/ppc-opcode.h @@ -195,6 +195,7 @@ #define OP_LQ56 /* sorted alphabetically */ +#define PPC_INST_BCCTR_FLUSH 0x4c400420 #define PPC_INST_BHRBE 0x7c00025c #define PPC_INST_CLRBHRB 0x7c00035c #define PPC_INST_COPY 0x7c20060c @@ -432,6 +433,7 @@ #endif /* Deal with instructions that older assemblers aren't aware of */ +#definePPC_BCCTR_FLUSH stringify_in_c(.long PPC_INST_BCCTR_FLUSH) #definePPC_CP_ABORTstringify_in_c(.long PPC_INST_CP_ABORT) #definePPC_COPY(a, b) stringify_in_c(.long PPC_INST_COPY | \ ___PPC_RA(a) | ___PPC_RB(b)) diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 2ba25b3b701e..a115aeb2983a 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -261,8 +261,6 @@ _ASM_NOKPROBE_SYMBOL(save_nvgprs); 1: nop;\ patch_site 1b, patch__call_flush_branch_caches -#define BCCTR_FLUSH.long 0x4c400420 - .macro nops number .rept \number nop @@ -293,7 +291,7 @@ flush_branch_caches: li r9,0x7fff mtctr r9 - BCCTR_FLUSH + PPC_BCCTR_FLUSH 2: nop patch_site 2b patch__flush_count_cache_return @@ -302,7 +300,7 @@ flush_branch_caches: .rept 278 .balign 32 - BCCTR_FLUSH + PPC_BCCTR_FLUSH nops7 .endr -- 2.23.0
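As a sanity check on the constant, 0x4c400420 can be decoded by hand as an XL-form branch. The sketch below is illustrative only: the struct and function names are made up, and the field layout is the standard XL-form one (primary opcode 19, extended opcode 528 for bcctr).

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an XL-form branch-conditional instruction */
struct xl_branch {
	unsigned int opcd;	/* primary opcode, IBM bits 0-5 */
	unsigned int bo;	/* IBM bits 6-10 */
	unsigned int bi;	/* IBM bits 11-15 */
	unsigned int bh;	/* IBM bits 19-20 */
	unsigned int xo;	/* extended opcode, IBM bits 21-30 */
	unsigned int lk;	/* IBM bit 31 */
};

struct xl_branch decode_xl_branch(uint32_t insn)
{
	struct xl_branch f = {
		.opcd	= (insn >> 26) & 0x3f,
		.bo	= (insn >> 21) & 0x1f,
		.bi	= (insn >> 16) & 0x1f,
		.bh	= (insn >> 11) & 0x3,
		.xo	= (insn >> 1) & 0x3ff,
		.lk	= insn & 0x1,
	};
	return f;
}

/* Self-check: the flush constant decodes to bcctr 2,0,0 */
void check_bcctr_flush_encoding(void)
{
	struct xl_branch f = decode_xl_branch(0x4c400420u);

	assert(f.opcd == 19 && f.xo == 528);	/* bcctr */
	assert(f.bo == 2 && f.bi == 0 && f.bh == 0);
	assert(f.lk == 0);
}
```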
[PATCH 4/7] powerpc/security: split branch cache flush toggle from code patching
Branch cache flushing code patching has inter-dependencies on both the link stack and the count cache flushing state. To make the code clearer and to separate the link stack and count cache handling, split the "toggle" (setting up variables and printing enable/disable) from the code patching. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/security.c | 94 ++ 1 file changed, 51 insertions(+), 43 deletions(-) diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c index 659ef6a92bb9..2a413af21124 100644 --- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c @@ -427,61 +427,69 @@ static __init int stf_barrier_debugfs_init(void) device_initcall(stf_barrier_debugfs_init); #endif /* CONFIG_DEBUG_FS */ -static void no_count_cache_flush(void) +static void update_branch_cache_flush(void) { - count_cache_flush_type = BRANCH_CACHE_FLUSH_NONE; - pr_info("count-cache-flush: flush disabled.\n"); -} - -static void toggle_branch_cache_flush(bool enable) -{ - if (!security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE) && - !security_ftr_enabled(SEC_FTR_FLUSH_LINK_STACK)) - enable = false; - - if (!enable) { - patch_instruction_site(&patch__call_flush_branch_caches, - ppc_inst(PPC_INST_NOP)); #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE + // This controls the branch from guest_exit_cont to kvm_flush_link_stack + if (link_stack_flush_type == BRANCH_CACHE_FLUSH_NONE) { patch_instruction_site(&patch__call_kvm_flush_link_stack, ppc_inst(PPC_INST_NOP)); -#endif - pr_info("link-stack-flush: flush disabled.\n"); - link_stack_flush_type = BRANCH_CACHE_FLUSH_NONE; - no_count_cache_flush(); - return; + } else { + patch_branch_site(&patch__call_kvm_flush_link_stack, + (u64)&kvm_flush_link_stack, BRANCH_SET_LINK); } - - // This enables the branch from _switch to flush_branch_caches - patch_branch_site(&patch__call_flush_branch_caches, - (u64)&flush_branch_caches, BRANCH_SET_LINK); - -#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE - // This enables the branch from 
guest_exit_cont to kvm_flush_link_stack - patch_branch_site(&patch__call_kvm_flush_link_stack, - (u64)&kvm_flush_link_stack, BRANCH_SET_LINK); #endif - pr_info("link-stack-flush: software flush enabled.\n"); - link_stack_flush_type = BRANCH_CACHE_FLUSH_SW; + // This controls the branch from _switch to flush_branch_caches + if (count_cache_flush_type == BRANCH_CACHE_FLUSH_NONE && + link_stack_flush_type == BRANCH_CACHE_FLUSH_NONE) { + patch_instruction_site(&patch__call_flush_branch_caches, + ppc_inst(PPC_INST_NOP)); + } else { + patch_branch_site(&patch__call_flush_branch_caches, + (u64)&flush_branch_caches, BRANCH_SET_LINK); + + // If we just need to flush the link stack, early return + if (count_cache_flush_type == BRANCH_CACHE_FLUSH_NONE) { + patch_instruction_site(&patch__flush_link_stack_return, + ppc_inst(PPC_INST_BLR)); + + // If we have flush instruction, early return + } else if (count_cache_flush_type == BRANCH_CACHE_FLUSH_HW) { + patch_instruction_site(&patch__flush_count_cache_return, + ppc_inst(PPC_INST_BLR)); + } + } +} - // If we just need to flush the link stack, patch an early return - if (!security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) { - patch_instruction_site(&patch__flush_link_stack_return, - ppc_inst(PPC_INST_BLR)); - no_count_cache_flush(); - return; +static void toggle_branch_cache_flush(bool enable) +{ + if (!enable || !security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) { + if (count_cache_flush_type != BRANCH_CACHE_FLUSH_NONE) { + count_cache_flush_type = BRANCH_CACHE_FLUSH_NONE; + pr_info("count-cache-flush: flush disabled.\n"); + } + } else { + if (security_ftr_enabled(SEC_FTR_BCCTR_FLUSH_ASSIST)) { + count_cache_flush_type = BRANCH_CACHE_FLUSH_HW; + pr_info("count-cache-flush: hardware flush enabled.\n"); + } else { + count_cache_flush_type = BRANCH_CACHE_FLUSH_SW; + pr_info("count-cache-flush: software flush enabled.\n"); + } } - if (!security_ftr_enabled(SEC_FTR_BCCTR_FLUSH
[PATCH 3/7] powerpc/security: make display of branch cache flush more consistent
Make the count-cache and link-stack messages look the same Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/security.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c index 28f4cb062f69..659ef6a92bb9 100644 --- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c @@ -430,7 +430,7 @@ device_initcall(stf_barrier_debugfs_init); static void no_count_cache_flush(void) { count_cache_flush_type = BRANCH_CACHE_FLUSH_NONE; - pr_info("count-cache-flush: software flush disabled.\n"); + pr_info("count-cache-flush: flush disabled.\n"); } static void toggle_branch_cache_flush(bool enable) @@ -446,7 +446,7 @@ static void toggle_branch_cache_flush(bool enable) patch_instruction_site(&patch__call_kvm_flush_link_stack, ppc_inst(PPC_INST_NOP)); #endif - pr_info("link-stack-flush: software flush disabled.\n"); + pr_info("link-stack-flush: flush disabled.\n"); link_stack_flush_type = BRANCH_CACHE_FLUSH_NONE; no_count_cache_flush(); return; @@ -475,13 +475,13 @@ static void toggle_branch_cache_flush(bool enable) if (!security_ftr_enabled(SEC_FTR_BCCTR_FLUSH_ASSIST)) { count_cache_flush_type = BRANCH_CACHE_FLUSH_SW; - pr_info("count-cache-flush: full software flush sequence enabled.\n"); + pr_info("count-cache-flush: software flush enabled.\n"); return; } patch_instruction_site(&patch__flush_count_cache_return, ppc_inst(PPC_INST_BLR)); count_cache_flush_type = BRANCH_CACHE_FLUSH_HW; - pr_info("count-cache-flush: hardware assisted flush sequence enabled\n"); + pr_info("count-cache-flush: hardware flush enabled.\n"); } void setup_count_cache_flush(void) -- 2.23.0
[PATCH 2/7] powerpc/security: change link stack flush state to the flush type enum
Prepare to allow for hardware link stack flushing by using the none/sw/hw type, same as the count cache state. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/security.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c index df2a3eff950b..28f4cb062f69 100644 --- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c @@ -27,7 +27,7 @@ enum branch_cache_flush_type { BRANCH_CACHE_FLUSH_HW = 0x4, }; static enum branch_cache_flush_type count_cache_flush_type = BRANCH_CACHE_FLUSH_NONE; -static bool link_stack_flush_enabled; +static enum branch_cache_flush_type link_stack_flush_type = BRANCH_CACHE_FLUSH_NONE; bool barrier_nospec_enabled; static bool no_nospec; @@ -219,7 +219,7 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c if (ccd) seq_buf_printf(&s, "Indirect branch cache disabled"); - if (link_stack_flush_enabled) + if (link_stack_flush_type == BRANCH_CACHE_FLUSH_SW) seq_buf_printf(&s, ", Software link stack flush"); } else if (count_cache_flush_type != BRANCH_CACHE_FLUSH_NONE) { @@ -228,7 +228,7 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c if (count_cache_flush_type == BRANCH_CACHE_FLUSH_HW) seq_buf_printf(&s, " (hardware accelerated)"); - if (link_stack_flush_enabled) + if (link_stack_flush_type == BRANCH_CACHE_FLUSH_SW) seq_buf_printf(&s, ", Software link stack flush"); } else if (btb_flush_enabled) { @@ -447,7 +447,7 @@ static void toggle_branch_cache_flush(bool enable) ppc_inst(PPC_INST_NOP)); #endif pr_info("link-stack-flush: software flush disabled.\n"); - link_stack_flush_enabled = false; + link_stack_flush_type = BRANCH_CACHE_FLUSH_NONE; no_count_cache_flush(); return; } @@ -463,7 +463,7 @@ static void toggle_branch_cache_flush(bool enable) #endif pr_info("link-stack-flush: software flush enabled.\n"); - link_stack_flush_enabled = true; + link_stack_flush_type = 
BRANCH_CACHE_FLUSH_SW; // If we just need to flush the link stack, patch an early return if (!security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) { -- 2.23.0
[PATCH 1/7] powerpc/security: re-name count cache flush to branch cache flush
The count cache flush mostly refers to both count cache and link stack flushing. As a first step to untangling these a bit, re-name the bits that apply to both. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/asm-prototypes.h | 4 +-- arch/powerpc/kernel/entry_64.S| 7 ++--- arch/powerpc/kernel/security.c| 36 +++ 3 files changed, 23 insertions(+), 24 deletions(-) diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h index 7d81e86a1e5d..fa9057360e88 100644 --- a/arch/powerpc/include/asm/asm-prototypes.h +++ b/arch/powerpc/include/asm/asm-prototypes.h @@ -144,13 +144,13 @@ void _kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu, u64 guest_msr); void _kvmppc_save_tm_pr(struct kvm_vcpu *vcpu, u64 guest_msr); /* Patch sites */ -extern s32 patch__call_flush_count_cache; +extern s32 patch__call_flush_branch_caches; extern s32 patch__flush_count_cache_return; extern s32 patch__flush_link_stack_return; extern s32 patch__call_kvm_flush_link_stack; extern s32 patch__memset_nocache, patch__memcpy_nocache; -extern long flush_count_cache; +extern long flush_branch_caches; extern long kvm_flush_link_stack; #ifdef CONFIG_PPC_TRANSACTIONAL_MEM diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 9d49338e0c85..2ba25b3b701e 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -259,8 +259,7 @@ _ASM_NOKPROBE_SYMBOL(save_nvgprs); #define FLUSH_COUNT_CACHE \ 1: nop;\ - patch_site 1b, patch__call_flush_count_cache - + patch_site 1b, patch__call_flush_branch_caches #define BCCTR_FLUSH.long 0x4c400420 @@ -271,8 +270,8 @@ _ASM_NOKPROBE_SYMBOL(save_nvgprs); .endm .balign 32 -.global flush_count_cache -flush_count_cache: +.global flush_branch_caches +flush_branch_caches: /* Save LR into r9 */ mflrr9 diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c index d86701ce116b..df2a3eff950b 100644 --- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c 
@@ -21,12 +21,12 @@ u64 powerpc_security_features __read_mostly = SEC_FTR_DEFAULT; -enum count_cache_flush_type { - COUNT_CACHE_FLUSH_NONE = 0x1, - COUNT_CACHE_FLUSH_SW= 0x2, - COUNT_CACHE_FLUSH_HW= 0x4, +enum branch_cache_flush_type { + BRANCH_CACHE_FLUSH_NONE = 0x1, + BRANCH_CACHE_FLUSH_SW = 0x2, + BRANCH_CACHE_FLUSH_HW = 0x4, }; -static enum count_cache_flush_type count_cache_flush_type = COUNT_CACHE_FLUSH_NONE; +static enum branch_cache_flush_type count_cache_flush_type = BRANCH_CACHE_FLUSH_NONE; static bool link_stack_flush_enabled; bool barrier_nospec_enabled; @@ -222,10 +222,10 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c if (link_stack_flush_enabled) seq_buf_printf(&s, ", Software link stack flush"); - } else if (count_cache_flush_type != COUNT_CACHE_FLUSH_NONE) { + } else if (count_cache_flush_type != BRANCH_CACHE_FLUSH_NONE) { seq_buf_printf(&s, "Mitigation: Software count cache flush"); - if (count_cache_flush_type == COUNT_CACHE_FLUSH_HW) + if (count_cache_flush_type == BRANCH_CACHE_FLUSH_HW) seq_buf_printf(&s, " (hardware accelerated)"); if (link_stack_flush_enabled) @@ -429,18 +429,18 @@ device_initcall(stf_barrier_debugfs_init); static void no_count_cache_flush(void) { - count_cache_flush_type = COUNT_CACHE_FLUSH_NONE; + count_cache_flush_type = BRANCH_CACHE_FLUSH_NONE; pr_info("count-cache-flush: software flush disabled.\n"); } -static void toggle_count_cache_flush(bool enable) +static void toggle_branch_cache_flush(bool enable) { if (!security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE) && !security_ftr_enabled(SEC_FTR_FLUSH_LINK_STACK)) enable = false; if (!enable) { - patch_instruction_site(&patch__call_flush_count_cache, + patch_instruction_site(&patch__call_flush_branch_caches, ppc_inst(PPC_INST_NOP)); #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE patch_instruction_site(&patch__call_kvm_flush_link_stack, @@ -452,9 +452,9 @@ static void toggle_count_cache_flush(bool enable) return; } - // This enables the branch from 
_switch to flush_count_cache - patch_branch_site(&patch__call_flush_count_cache, - (u64)&flush_count_cache, BRANCH_SET_LINK); + // This enables the branch from _switch to flush_branch_caches + patch_branch_site(&patch__call_flush_branch_caches, + (u64)&flush_branch_caches, BRANCH_SET_LINK); #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE // This enables the branch from guest_exit_cont to kvm
[PATCH 0/7] powerpc: branch cache flush changes
This series allows the link stack to be flushed with the special bcctr 2,0,0 flush instruction that also flushes the count cache if the processor supports it. Firmware does not support this at the moment, but I've tested it in simulator with a patched firmware to advertise support.

Thanks,
Nick

Nicholas Piggin (7):
  powerpc/security: re-name count cache flush to branch cache flush
  powerpc/security: change link stack flush state to the flush type enum
  powerpc/security: make display of branch cache flush more consistent
  powerpc/security: split branch cache flush toggle from code patching
  powerpc/64s: Move branch cache flushing bcctr variant to ppc-ops.h
  powerpc/security: Allow for processors that flush the link stack using the special bcctr
  powerpc/64s: advertise hardware link stack flush

 arch/powerpc/include/asm/asm-prototypes.h    | 4 +-
 arch/powerpc/include/asm/hvcall.h            | 1 +
 arch/powerpc/include/asm/ppc-opcode.h        | 2 +
 arch/powerpc/include/asm/security_features.h | 2 +
 arch/powerpc/include/uapi/asm/kvm.h          | 1 +
 arch/powerpc/kernel/entry_64.S               | 13 +-
 arch/powerpc/kernel/security.c               | 139 +++
 arch/powerpc/kvm/powerpc.c                   | 9 +-
 arch/powerpc/platforms/powernv/setup.c       | 3 +
 arch/powerpc/platforms/pseries/setup.c       | 3 +
 tools/arch/powerpc/include/uapi/asm/kvm.h    | 1 +
 11 files changed, 106 insertions(+), 72 deletions(-)

--
2.23.0
Re: [PATCH v2] selftests: powerpc: Fix CPU affinity for child process
On Tue, Jun 09, 2020 at 09:10:05AM +0530, Harish wrote:
> On systems with large number of cpus, test fails trying to set
> affinity for child process by calling sched_setaffinity() with
> smaller size for cpuset. This patch fixes it by making sure that
> the size of allocated cpu set is dependent on the number of CPUs
> as reported by get_nprocs().
>
> Fixes: 00b7ec5c9cf3 ("selftests/powerpc: Import Anton's context_switch2 benchmark")
> Reported-by: Shirisha Ganta
> Signed-off-by: Harish
> Signed-off-by: Sandipan Das
> ---
> .../powerpc/benchmarks/context_switch.c | 18 --
> 1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/tools/testing/selftests/powerpc/benchmarks/context_switch.c b/tools/testing/selftests/powerpc/benchmarks/context_switch.c
> index a2e8c9da7fa5..de6c49d6f88f 100644
> --- a/tools/testing/selftests/powerpc/benchmarks/context_switch.c
> +++ b/tools/testing/selftests/powerpc/benchmarks/context_switch.c
> @@ -19,6 +19,7 @@
> #include
> #include
> #include
> +#include
> #include
> #include
> #include
> @@ -104,8 +105,9 @@ static void start_thread_on(void *(*fn)(void *), void *arg, unsigned long cpu)
>
> static void start_process_on(void *(*fn)(void *), void *arg, unsigned long cpu)
> {
> -	int pid;
> -	cpu_set_t cpuset;
> +	int pid, ncpus;
> +	cpu_set_t *cpuset;
> +	size_t size;
>
> 	pid = fork();
> 	if (pid == -1) {
> @@ -116,12 +118,16 @@ static void start_process_on(void *(*fn)(void *), void *arg, unsigned long cpu)
> 	if (pid)
> 		return;
>
> -	CPU_ZERO(&cpuset);
> -	CPU_SET(cpu, &cpuset);
> +	size = CPU_ALLOC_SIZE(ncpus);
> +	ncpus = get_nprocs();

The above two lines should be interchanged: ncpus is used to compute size before it has been assigned.

> +	cpuset = CPU_ALLOC(ncpus);
> +	CPU_ZERO_S(size, cpuset);
> +	CPU_SET_S(cpu, size, cpuset);
>
> -	if (sched_setaffinity(0, sizeof(cpuset), &cpuset)) {
> +	if (sched_setaffinity(0, size, cpuset)) {
> 		perror("sched_setaffinity");
> -		exit(1);
> +		CPU_FREE(cpuset);
> +		exit(-1);

Do we need to change the exit value here? Other frameworks might rely on the previous value.

Regards,
-Satheesh.

> 	}
>
> 	fn(arg);
> --
> 2.24.1
>
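Putting the review comments on this patch together (assign ncpus before computing size, check the CPU_ALLOC() result, free the mask once the affinity is set), the affinity step might look roughly like the sketch below. The helper name pin_self_to_cpu and its return convention are assumptions made for illustration and are not part of the posted patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <sys/sysinfo.h>

/*
 * Hypothetical helper: pin the calling process to a single CPU using a
 * dynamically sized mask. Returns 0 on success, -1 on failure.
 */
int pin_self_to_cpu(unsigned long cpu)
{
	cpu_set_t *cpuset;
	size_t size;
	int ncpus, rc;

	/* Assign ncpus first; size is derived from it */
	ncpus = get_nprocs();
	assert(ncpus > 0);
	size = CPU_ALLOC_SIZE(ncpus);

	cpuset = CPU_ALLOC(ncpus);
	if (!cpuset) {
		perror("CPU_ALLOC");
		return -1;
	}

	CPU_ZERO_S(size, cpuset);
	CPU_SET_S(cpu, size, cpuset);	/* out-of-range cpu leaves mask empty */

	rc = sched_setaffinity(0, size, cpuset);
	if (rc)
		perror("sched_setaffinity");

	/* The kernel copies the mask, so it can be freed on both paths */
	CPU_FREE(cpuset);
	return rc ? -1 : 0;
}
```

Returning a status instead of exiting leaves the choice of exit code with the caller, so start_process_on() could keep its original exit(1) if other frameworks depend on it.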
Re: [PATCH] selftests: powerpc: Fix online CPU selection
On 08/06/20 8:12 pm, Sandipan Das wrote:
> The size of the cpu set must be large enough for systems
> with a very large number of CPUs. Otherwise, tests which
> try to determine the first online CPU by calling
> sched_getaffinity() will fail. This makes sure that the
> size of the allocated cpu set is dependent on the number
> of CPUs as reported by get_nprocs().
>
> Fixes: 3752e453f6ba ("selftests/powerpc: Add tests of PMU EBBs")
> Reported-by: Shirisha Ganta
> Signed-off-by: Sandipan Das
> ---
> tools/testing/selftests/powerpc/utils.c | 33 -
> 1 file changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/tools/testing/selftests/powerpc/utils.c b/tools/testing/selftests/powerpc/utils.c
> index 933678f1ed0a..bb8e402752c0 100644
> --- a/tools/testing/selftests/powerpc/utils.c
> +++ b/tools/testing/selftests/powerpc/utils.c
> @@ -16,6 +16,7 @@
> [...]
>
> int pick_online_cpu(void)
> {
> -	cpu_set_t mask;
> -	int cpu;
> +	int ncpus, cpu = -1;
> +	cpu_set_t *mask;
> +	size_t size;
>
> -	CPU_ZERO(&mask);
> +	ncpus = get_nprocs();
> +	size = CPU_ALLOC_SIZE(ncpus);
> +	mask = CPU_ALLOC(ncpus);
>
> -	if (sched_getaffinity(0, sizeof(mask), &mask)) {
> +	CPU_ZERO_S(size, mask);
> +
> +	if (sched_getaffinity(0, size, mask)) {
> 		perror("sched_getaffinity");
> -		return -1;
> +		goto done;
> 	}
>
> 	/* We prefer a primary thread, but skip 0 */
> -	for (cpu = 8; cpu < CPU_SETSIZE; cpu += 8)
> -		if (CPU_ISSET(cpu, &mask))
> -			return cpu;
> +	for (cpu = 8; cpu < ncpus; cpu += 8)
> +		if (CPU_ISSET_S(cpu, size, mask))
> +			goto done;
>
> 	/* Search for anything, but in reverse */
> -	for (cpu = CPU_SETSIZE - 1; cpu >= 0; cpu--)
> -		if (CPU_ISSET(cpu, &mask))
> -			return cpu;
> +	for (cpu = ncpus - 1; cpu >= 0; cpu--)
> +		if (CPU_ISSET_S(cpu, size, mask))
> +			goto done;
>
> 	printf("No cpus in affinity mask?!\n");
> -	return -1;

Missed the fact that the for loop before this would anyway make 'cpu' count down to -1 if no online CPU is found. Please ignore the previous message.

> +
> +done:
> +	CPU_FREE(mask);
> +	return cpu;
> }
>
> bool is_ppc64le(void)
>

- Sandipan
H_CPU_CHAR_BCCTR_FLUSH_ASSIST) security_ftr_set(SEC_FTR_BCCTR_FLUSH_ASSIST); + if (result->character & H_CPU_CHAR_BCCTR_LINK_FLUSH_ASSIST) + security_ftr_set(S
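The "// IBM bit N" comments in the hvcall.h hunk above use IBM's big-endian bit numbering, in which bit 0 is the most significant bit of the 64-bit word. IBM bit 10 therefore corresponds to (1ull << 53). A minimal sketch of that conversion follows; the helper name ibm_bit is hypothetical and is not part of the patch or the kernel API.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not kernel API): convert an IBM-numbered bit,
 * where bit 0 is the MSB of a 64-bit doubleword, into a mask.
 */
static inline uint64_t ibm_bit(unsigned int n)
{
	return 1ull << (63 - n);
}
```

With this, ibm_bit(9) matches the existing H_CPU_CHAR_BCCTR_FLUSH_ASSIST value and ibm_bit(10) matches the new H_CPU_CHAR_BCCTR_LINK_FLUSH_ASSIST value.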
[PATCH 6/7] powerpc/security: Allow for processors that flush the link stack using the special bcctr
If both count cache and link stack are to be flushed, and can be flushed with the special bcctr, patch that in directly to the flush/branch nop site. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/security_features.h | 2 ++ arch/powerpc/kernel/security.c | 27 ++-- 2 files changed, 21 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/include/asm/security_features.h b/arch/powerpc/include/asm/security_features.h index 7c05e95a5c44..fbb8fa32150f 100644 --- a/arch/powerpc/include/asm/security_features.h +++ b/arch/powerpc/include/asm/security_features.h @@ -63,6 +63,8 @@ static inline bool security_ftr_enabled(u64 feature) // bcctr 2,0,0 triggers a hardware assisted count cache flush #define SEC_FTR_BCCTR_FLUSH_ASSIST 0x0800ull +// bcctr 2,0,0 triggers a hardware assisted link stack flush +#define SEC_FTR_BCCTR_LINK_FLUSH_ASSIST0x2000ull // Features indicating need for Spectre/Meltdown mitigations diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c index 2a413af21124..6ad5c753d47c 100644 --- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c @@ -219,24 +219,25 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c if (ccd) seq_buf_printf(&s, "Indirect branch cache disabled"); - if (link_stack_flush_type == BRANCH_CACHE_FLUSH_SW) - seq_buf_printf(&s, ", Software link stack flush"); - } else if (count_cache_flush_type != BRANCH_CACHE_FLUSH_NONE) { seq_buf_printf(&s, "Mitigation: Software count cache flush"); if (count_cache_flush_type == BRANCH_CACHE_FLUSH_HW) seq_buf_printf(&s, " (hardware accelerated)"); - if (link_stack_flush_type == BRANCH_CACHE_FLUSH_SW) - seq_buf_printf(&s, ", Software link stack flush"); - } else if (btb_flush_enabled) { seq_buf_printf(&s, "Mitigation: Branch predictor state flush"); } else { seq_buf_printf(&s, "Vulnerable"); } + if (bcs || ccd || count_cache_flush_type != BRANCH_CACHE_FLUSH_NONE) { + if (link_stack_flush_type != BRANCH_CACHE_FLUSH_NONE) + 
seq_buf_printf(&s, ", Software link stack flush"); + if (link_stack_flush_type == BRANCH_CACHE_FLUSH_HW) + seq_buf_printf(&s, " (hardware accelerated)"); + } + seq_buf_printf(&s, "\n"); return s.len; @@ -435,6 +436,7 @@ static void update_branch_cache_flush(void) patch_instruction_site(&patch__call_kvm_flush_link_stack, ppc_inst(PPC_INST_NOP)); } else { + // Could use HW flush, but that could also flush count cache patch_branch_site(&patch__call_kvm_flush_link_stack, (u64)&kvm_flush_link_stack, BRANCH_SET_LINK); } @@ -445,6 +447,10 @@ static void update_branch_cache_flush(void) link_stack_flush_type == BRANCH_CACHE_FLUSH_NONE) { patch_instruction_site(&patch__call_flush_branch_caches, ppc_inst(PPC_INST_NOP)); + } else if (count_cache_flush_type == BRANCH_CACHE_FLUSH_HW && + link_stack_flush_type == BRANCH_CACHE_FLUSH_HW) { + patch_instruction_site(&patch__call_flush_branch_caches, + ppc_inst(PPC_INST_BCCTR_FLUSH)); } else { patch_branch_site(&patch__call_flush_branch_caches, (u64)&flush_branch_caches, BRANCH_SET_LINK); @@ -485,8 +491,13 @@ static void toggle_branch_cache_flush(bool enable) pr_info("link-stack-flush: flush disabled.\n"); } } else { - link_stack_flush_type = BRANCH_CACHE_FLUSH_SW; - pr_info("link-stack-flush: software flush enabled.\n"); + if (security_ftr_enabled(SEC_FTR_BCCTR_LINK_FLUSH_ASSIST)) { + link_stack_flush_type = BRANCH_CACHE_FLUSH_HW; + pr_info("link-stack-flush: hardware flush enabled.\n"); + } else { + link_stack_flush_type = BRANCH_CACHE_FLUSH_SW; + pr_info("link-stack-flush: software flush enabled.\n"); + } } update_branch_cache_flush(); -- 2.23.0
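The choice made for the patch__call_flush_branch_caches site in update_branch_cache_flush() above can be modelled as a pure function of the two flush types. This is only a user-space sketch of the decision, under the assumption that the three outcomes are a nop, the inline special bcctr, or a branch-and-link to flush_branch_caches. The enum call_site_patch and the function pick_call_site_patch are hypothetical names; the real code patches instructions rather than returning a value.

```c
enum branch_cache_flush_type {
	BRANCH_CACHE_FLUSH_NONE	= 0x1,
	BRANCH_CACHE_FLUSH_SW	= 0x2,
	BRANCH_CACHE_FLUSH_HW	= 0x4,
};

/* Hypothetical: what would be written at the _switch call site. */
enum call_site_patch {
	SITE_NOP,		/* nothing to flush */
	SITE_BCCTR_FLUSH,	/* inline the special bcctr 2,0,0 */
	SITE_BRANCH,		/* bl flush_branch_caches */
};

static enum call_site_patch
pick_call_site_patch(enum branch_cache_flush_type count_cache,
		     enum branch_cache_flush_type link_stack)
{
	/* Neither structure needs flushing: leave a nop. */
	if (count_cache == BRANCH_CACHE_FLUSH_NONE &&
	    link_stack == BRANCH_CACHE_FLUSH_NONE)
		return SITE_NOP;

	/* Both can be flushed by the one instruction: patch it in directly. */
	if (count_cache == BRANCH_CACHE_FLUSH_HW &&
	    link_stack == BRANCH_CACHE_FLUSH_HW)
		return SITE_BCCTR_FLUSH;

	/* Any other mix goes through the out-of-line flush routine. */
	return SITE_BRANCH;
}
```

Note that a hardware count cache flush combined with a software link stack flush still takes the branch, since the link stack needs the software sequence in flush_branch_caches.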
[PATCH 5/7] powerpc/64s: Move branch cache flushing bcctr variant to ppc-ops.h
Signed-off-by: Nicholas Piggin
---
 arch/powerpc/include/asm/ppc-opcode.h | 2 ++
 arch/powerpc/kernel/entry_64.S        | 6 ++
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 2a39c716c343..79d511a38bbb 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -195,6 +195,7 @@
 #define OP_LQ    56
 /* sorted alphabetically */
+#define PPC_INST_BCCTR_FLUSH		0x4c400420
 #define PPC_INST_BHRBE			0x7c00025c
 #define PPC_INST_CLRBHRB		0x7c00035c
 #define PPC_INST_COPY			0x7c20060c
@@ -432,6 +433,7 @@
 #endif
 /* Deal with instructions that older assemblers aren't aware of */
+#define	PPC_BCCTR_FLUSH		stringify_in_c(.long PPC_INST_BCCTR_FLUSH)
 #define	PPC_CP_ABORT		stringify_in_c(.long PPC_INST_CP_ABORT)
 #define	PPC_COPY(a, b)		stringify_in_c(.long PPC_INST_COPY | \
 					___PPC_RA(a) | ___PPC_RB(b))
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 2ba25b3b701e..a115aeb2983a 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -261,8 +261,6 @@ _ASM_NOKPROBE_SYMBOL(save_nvgprs);
 1:	nop;						\
 	patch_site 1b, patch__call_flush_branch_caches
-#define BCCTR_FLUSH	.long 0x4c400420
-
 .macro nops number
 	.rept \number
 	nop
@@ -293,7 +291,7 @@ flush_branch_caches:
 	li	r9,0x7fff
 	mtctr	r9
-	BCCTR_FLUSH
+	PPC_BCCTR_FLUSH
 2:	nop
 	patch_site 2b patch__flush_count_cache_return
@@ -302,7 +300,7 @@ flush_branch_caches:
 	.rept 278
 	.balign 32
-	BCCTR_FLUSH
+	PPC_BCCTR_FLUSH
 	nops	7
 	.endr
-- 
2.23.0
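For reference, the constant 0x4c400420 is the encoding of bcctr 2,0,0. The sketch below rebuilds it from its fields, assuming the standard Power ISA XL-form layout for bcctr (primary opcode 19, BO, BI, BH, extended opcode 528, LK=0). The function name encode_bcctr is hypothetical and exists only for this illustration.

```c
#include <stdint.h>

/*
 * Hypothetical encoder for "bcctr BO,BI,BH" (XL-form, LK = 0):
 * primary opcode 19 in the top 6 bits, extended opcode 528 in bits 1..10.
 */
static inline uint32_t encode_bcctr(uint32_t bo, uint32_t bi, uint32_t bh)
{
	return (19u << 26) |
	       ((bo & 0x1f) << 21) |
	       ((bi & 0x1f) << 16) |
	       ((bh & 0x3) << 11) |
	       (528u << 1);
}
```

encode_bcctr(2, 0, 0) yields the value used for PPC_INST_BCCTR_FLUSH, while encode_bcctr(20, 0, 0) yields the ordinary unconditional bctr.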
[PATCH 4/7] powerpc/security: split branch cache flush toggle from code patching
Branch cache flushing code patching has inter-dependencies on both the link stack and the count cache flushing state. To make the code clearer and to separate the link stack and count cache handling, split the "toggle" (setting up variables and printing enable/disable) from the code patching. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/security.c | 94 ++ 1 file changed, 51 insertions(+), 43 deletions(-) diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c index 659ef6a92bb9..2a413af21124 100644 --- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c @@ -427,61 +427,69 @@ static __init int stf_barrier_debugfs_init(void) device_initcall(stf_barrier_debugfs_init); #endif /* CONFIG_DEBUG_FS */ -static void no_count_cache_flush(void) +static void update_branch_cache_flush(void) { - count_cache_flush_type = BRANCH_CACHE_FLUSH_NONE; - pr_info("count-cache-flush: flush disabled.\n"); -} - -static void toggle_branch_cache_flush(bool enable) -{ - if (!security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE) && - !security_ftr_enabled(SEC_FTR_FLUSH_LINK_STACK)) - enable = false; - - if (!enable) { - patch_instruction_site(&patch__call_flush_branch_caches, - ppc_inst(PPC_INST_NOP)); #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE + // This controls the branch from guest_exit_cont to kvm_flush_link_stack + if (link_stack_flush_type == BRANCH_CACHE_FLUSH_NONE) { patch_instruction_site(&patch__call_kvm_flush_link_stack, ppc_inst(PPC_INST_NOP)); -#endif - pr_info("link-stack-flush: flush disabled.\n"); - link_stack_flush_type = BRANCH_CACHE_FLUSH_NONE; - no_count_cache_flush(); - return; + } else { + patch_branch_site(&patch__call_kvm_flush_link_stack, + (u64)&kvm_flush_link_stack, BRANCH_SET_LINK); } - - // This enables the branch from _switch to flush_branch_caches - patch_branch_site(&patch__call_flush_branch_caches, - (u64)&flush_branch_caches, BRANCH_SET_LINK); - -#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE - // This enables the branch from 
guest_exit_cont to kvm_flush_link_stack - patch_branch_site(&patch__call_kvm_flush_link_stack, - (u64)&kvm_flush_link_stack, BRANCH_SET_LINK); #endif - pr_info("link-stack-flush: software flush enabled.\n"); - link_stack_flush_type = BRANCH_CACHE_FLUSH_SW; + // This controls the branch from _switch to flush_branch_caches + if (count_cache_flush_type == BRANCH_CACHE_FLUSH_NONE && + link_stack_flush_type == BRANCH_CACHE_FLUSH_NONE) { + patch_instruction_site(&patch__call_flush_branch_caches, + ppc_inst(PPC_INST_NOP)); + } else { + patch_branch_site(&patch__call_flush_branch_caches, + (u64)&flush_branch_caches, BRANCH_SET_LINK); + + // If we just need to flush the link stack, early return + if (count_cache_flush_type == BRANCH_CACHE_FLUSH_NONE) { + patch_instruction_site(&patch__flush_link_stack_return, + ppc_inst(PPC_INST_BLR)); + + // If we have flush instruction, early return + } else if (count_cache_flush_type == BRANCH_CACHE_FLUSH_HW) { + patch_instruction_site(&patch__flush_count_cache_return, + ppc_inst(PPC_INST_BLR)); + } + } +} - // If we just need to flush the link stack, patch an early return - if (!security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) { - patch_instruction_site(&patch__flush_link_stack_return, - ppc_inst(PPC_INST_BLR)); - no_count_cache_flush(); - return; +static void toggle_branch_cache_flush(bool enable) +{ + if (!enable || !security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) { + if (count_cache_flush_type != BRANCH_CACHE_FLUSH_NONE) { + count_cache_flush_type = BRANCH_CACHE_FLUSH_NONE; + pr_info("count-cache-flush: flush disabled.\n"); + } + } else { + if (security_ftr_enabled(SEC_FTR_BCCTR_FLUSH_ASSIST)) { + count_cache_flush_type = BRANCH_CACHE_FLUSH_HW; + pr_info("count-cache-flush: hardware flush enabled.\n"); + } else { + count_cache_flush_type = BRANCH_CACHE_FLUSH_SW; + pr_info("count-cache-flush: software flush enabled.\n"); + } } - if (!security_ftr_enabled(SEC_FTR_BCCTR_FLUSH
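The "toggle" half of this split only selects a flush type; the code patching then keys off that state. The count cache selection shown above can be sketched as a pure function. This is a simplified model: the three boolean parameters stand in for the enable argument and the security_ftr_enabled() checks on SEC_FTR_FLUSH_COUNT_CACHE and SEC_FTR_BCCTR_FLUSH_ASSIST, the pr_info() calls are left out, and the function name select_count_cache_flush is hypothetical.

```c
#include <stdbool.h>

enum branch_cache_flush_type {
	BRANCH_CACHE_FLUSH_NONE	= 0x1,
	BRANCH_CACHE_FLUSH_SW	= 0x2,
	BRANCH_CACHE_FLUSH_HW	= 0x4,
};

/*
 * Hypothetical model of the count cache branch of
 * toggle_branch_cache_flush(): state selection only, no code patching.
 */
static enum branch_cache_flush_type
select_count_cache_flush(bool enable, bool needs_flush, bool has_bcctr_assist)
{
	/* Disabled by the caller, or the CPU does not need the flush. */
	if (!enable || !needs_flush)
		return BRANCH_CACHE_FLUSH_NONE;

	/* Prefer the hardware-assisted flush when it is advertised. */
	return has_bcctr_assist ? BRANCH_CACHE_FLUSH_HW : BRANCH_CACHE_FLUSH_SW;
}
```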
[PATCH 3/7] powerpc/security: make display of branch cache flush more consistent
Make the count-cache and link-stack messages look the same.

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/kernel/security.c | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index 28f4cb062f69..659ef6a92bb9 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -430,7 +430,7 @@ device_initcall(stf_barrier_debugfs_init);
 static void no_count_cache_flush(void)
 {
 	count_cache_flush_type = BRANCH_CACHE_FLUSH_NONE;
-	pr_info("count-cache-flush: software flush disabled.\n");
+	pr_info("count-cache-flush: flush disabled.\n");
 }
 static void toggle_branch_cache_flush(bool enable)
@@ -446,7 +446,7 @@ static void toggle_branch_cache_flush(bool enable)
 		patch_instruction_site(&patch__call_kvm_flush_link_stack,
 				       ppc_inst(PPC_INST_NOP));
 #endif
-		pr_info("link-stack-flush: software flush disabled.\n");
+		pr_info("link-stack-flush: flush disabled.\n");
 		link_stack_flush_type = BRANCH_CACHE_FLUSH_NONE;
 		no_count_cache_flush();
 		return;
@@ -475,13 +475,13 @@ static void toggle_branch_cache_flush(bool enable)
 	if (!security_ftr_enabled(SEC_FTR_BCCTR_FLUSH_ASSIST)) {
 		count_cache_flush_type = BRANCH_CACHE_FLUSH_SW;
-		pr_info("count-cache-flush: full software flush sequence enabled.\n");
+		pr_info("count-cache-flush: software flush enabled.\n");
 		return;
 	}
 	patch_instruction_site(&patch__flush_count_cache_return, ppc_inst(PPC_INST_BLR));
 	count_cache_flush_type = BRANCH_CACHE_FLUSH_HW;
-	pr_info("count-cache-flush: hardware assisted flush sequence enabled\n");
+	pr_info("count-cache-flush: hardware flush enabled.\n");
 }
 void setup_count_cache_flush(void)
-- 
2.23.0
[PATCH 2/7] powerpc/security: change link stack flush state to the flush type enum
Prepare to allow for hardware link stack flushing by using the
none/sw/hw type, same as the count cache state.

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/kernel/security.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index df2a3eff950b..28f4cb062f69 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -27,7 +27,7 @@ enum branch_cache_flush_type {
 	BRANCH_CACHE_FLUSH_HW	= 0x4,
 };
 static enum branch_cache_flush_type count_cache_flush_type = BRANCH_CACHE_FLUSH_NONE;
-static bool link_stack_flush_enabled;
+static enum branch_cache_flush_type link_stack_flush_type = BRANCH_CACHE_FLUSH_NONE;
 bool barrier_nospec_enabled;
 static bool no_nospec;
@@ -219,7 +219,7 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c
 		if (ccd)
 			seq_buf_printf(&s, "Indirect branch cache disabled");
-		if (link_stack_flush_enabled)
+		if (link_stack_flush_type == BRANCH_CACHE_FLUSH_SW)
 			seq_buf_printf(&s, ", Software link stack flush");
 	} else if (count_cache_flush_type != BRANCH_CACHE_FLUSH_NONE) {
@@ -228,7 +228,7 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c
 		if (count_cache_flush_type == BRANCH_CACHE_FLUSH_HW)
 			seq_buf_printf(&s, " (hardware accelerated)");
-		if (link_stack_flush_enabled)
+		if (link_stack_flush_type == BRANCH_CACHE_FLUSH_SW)
 			seq_buf_printf(&s, ", Software link stack flush");
 	} else if (btb_flush_enabled) {
@@ -447,7 +447,7 @@ static void toggle_branch_cache_flush(bool enable)
 				       ppc_inst(PPC_INST_NOP));
 #endif
 		pr_info("link-stack-flush: software flush disabled.\n");
-		link_stack_flush_enabled = false;
+		link_stack_flush_type = BRANCH_CACHE_FLUSH_NONE;
 		no_count_cache_flush();
 		return;
 	}
@@ -463,7 +463,7 @@ static void toggle_branch_cache_flush(bool enable)
 #endif
 	pr_info("link-stack-flush: software flush enabled.\n");
-	link_stack_flush_enabled = true;
+	link_stack_flush_type = BRANCH_CACHE_FLUSH_SW;
 	// If we just need to flush the link stack, patch an early return
 	if (!security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) {
-- 
2.23.0
[PATCH 1/7] powerpc/security: re-name count cache flush to branch cache flush
The count cache flush mostly refers to both count cache and link stack flushing. As a first step to untangling these a bit, re-name the bits that apply to both. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/asm-prototypes.h | 4 +-- arch/powerpc/kernel/entry_64.S| 7 ++--- arch/powerpc/kernel/security.c| 36 +++ 3 files changed, 23 insertions(+), 24 deletions(-) diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h index 7d81e86a1e5d..fa9057360e88 100644 --- a/arch/powerpc/include/asm/asm-prototypes.h +++ b/arch/powerpc/include/asm/asm-prototypes.h @@ -144,13 +144,13 @@ void _kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu, u64 guest_msr); void _kvmppc_save_tm_pr(struct kvm_vcpu *vcpu, u64 guest_msr); /* Patch sites */ -extern s32 patch__call_flush_count_cache; +extern s32 patch__call_flush_branch_caches; extern s32 patch__flush_count_cache_return; extern s32 patch__flush_link_stack_return; extern s32 patch__call_kvm_flush_link_stack; extern s32 patch__memset_nocache, patch__memcpy_nocache; -extern long flush_count_cache; +extern long flush_branch_caches; extern long kvm_flush_link_stack; #ifdef CONFIG_PPC_TRANSACTIONAL_MEM diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 9d49338e0c85..2ba25b3b701e 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -259,8 +259,7 @@ _ASM_NOKPROBE_SYMBOL(save_nvgprs); #define FLUSH_COUNT_CACHE \ 1: nop;\ - patch_site 1b, patch__call_flush_count_cache - + patch_site 1b, patch__call_flush_branch_caches #define BCCTR_FLUSH.long 0x4c400420 @@ -271,8 +270,8 @@ _ASM_NOKPROBE_SYMBOL(save_nvgprs); .endm .balign 32 -.global flush_count_cache -flush_count_cache: +.global flush_branch_caches +flush_branch_caches: /* Save LR into r9 */ mflrr9 diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c index d86701ce116b..df2a3eff950b 100644 --- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c 
@@ -21,12 +21,12 @@ u64 powerpc_security_features __read_mostly = SEC_FTR_DEFAULT; -enum count_cache_flush_type { - COUNT_CACHE_FLUSH_NONE = 0x1, - COUNT_CACHE_FLUSH_SW= 0x2, - COUNT_CACHE_FLUSH_HW= 0x4, +enum branch_cache_flush_type { + BRANCH_CACHE_FLUSH_NONE = 0x1, + BRANCH_CACHE_FLUSH_SW = 0x2, + BRANCH_CACHE_FLUSH_HW = 0x4, }; -static enum count_cache_flush_type count_cache_flush_type = COUNT_CACHE_FLUSH_NONE; +static enum branch_cache_flush_type count_cache_flush_type = BRANCH_CACHE_FLUSH_NONE; static bool link_stack_flush_enabled; bool barrier_nospec_enabled; @@ -222,10 +222,10 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c if (link_stack_flush_enabled) seq_buf_printf(&s, ", Software link stack flush"); - } else if (count_cache_flush_type != COUNT_CACHE_FLUSH_NONE) { + } else if (count_cache_flush_type != BRANCH_CACHE_FLUSH_NONE) { seq_buf_printf(&s, "Mitigation: Software count cache flush"); - if (count_cache_flush_type == COUNT_CACHE_FLUSH_HW) + if (count_cache_flush_type == BRANCH_CACHE_FLUSH_HW) seq_buf_printf(&s, " (hardware accelerated)"); if (link_stack_flush_enabled) @@ -429,18 +429,18 @@ device_initcall(stf_barrier_debugfs_init); static void no_count_cache_flush(void) { - count_cache_flush_type = COUNT_CACHE_FLUSH_NONE; + count_cache_flush_type = BRANCH_CACHE_FLUSH_NONE; pr_info("count-cache-flush: software flush disabled.\n"); } -static void toggle_count_cache_flush(bool enable) +static void toggle_branch_cache_flush(bool enable) { if (!security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE) && !security_ftr_enabled(SEC_FTR_FLUSH_LINK_STACK)) enable = false; if (!enable) { - patch_instruction_site(&patch__call_flush_count_cache, + patch_instruction_site(&patch__call_flush_branch_caches, ppc_inst(PPC_INST_NOP)); #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE patch_instruction_site(&patch__call_kvm_flush_link_stack, @@ -452,9 +452,9 @@ static void toggle_count_cache_flush(bool enable) return; } - // This enables the branch from 
_switch to flush_count_cache - patch_branch_site(&patch__call_flush_count_cache, - (u64)&flush_count_cache, BRANCH_SET_LINK); + // This enables the branch from _switch to flush_branch_caches + patch_branch_site(&patch__call_flush_branch_caches, + (u64)&flush_branch_caches, BRANCH_SET_LINK); #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE // This enables the branch from guest_exit_cont to kvm
[PATCH 0/7] powerpc: branch cache flush changes
This series allows the link stack to be flushed with the special
bcctr 2,0,0 flush instruction that also flushes the count cache if
the processor supports it.

Firmware does not support this at the moment, but I've tested it in
simulator with a patched firmware to advertise support.

Thanks,
Nick

Nicholas Piggin (7):
  powerpc/security: re-name count cache flush to branch cache flush
  powerpc/security: change link stack flush state to the flush type enum
  powerpc/security: make display of branch cache flush more consistent
  powerpc/security: split branch cache flush toggle from code patching
  powerpc/64s: Move branch cache flushing bcctr variant to ppc-ops.h
  powerpc/security: Allow for processors that flush the link stack using the special bcctr
  powerpc/64s: advertise hardware link stack flush

 arch/powerpc/include/asm/asm-prototypes.h    |   4 +-
 arch/powerpc/include/asm/hvcall.h            |   1 +
 arch/powerpc/include/asm/ppc-opcode.h        |   2 +
 arch/powerpc/include/asm/security_features.h |   2 +
 arch/powerpc/include/uapi/asm/kvm.h          |   1 +
 arch/powerpc/kernel/entry_64.S               |  13 +-
 arch/powerpc/kernel/security.c               | 139 +++
 arch/powerpc/kvm/powerpc.c                   |   9 +-
 arch/powerpc/platforms/powernv/setup.c       |   3 +
 arch/powerpc/platforms/pseries/setup.c       |   3 +
 tools/arch/powerpc/include/uapi/asm/kvm.h    |   1 +
 11 files changed, 106 insertions(+), 72 deletions(-)

-- 
2.23.0
Re: [PATCH] selftests: powerpc: Fix online CPU selection
On 08/06/20 8:12 pm, Sandipan Das wrote:
> The size of the cpu set must be large enough for systems
> with a very large number of CPUs. Otherwise, tests which
> try to determine the first online CPU by calling
> sched_getaffinity() will fail. This makes sure that the
> size of the allocated cpu set is dependent on the number
> of CPUs as reported by get_nprocs().
>
> Fixes: 3752e453f6ba ("selftests/powerpc: Add tests of PMU EBBs")
> Reported-by: Shirisha Ganta
> Signed-off-by: Sandipan Das
> ---
>  tools/testing/selftests/powerpc/utils.c | 33 -
>  1 file changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/tools/testing/selftests/powerpc/utils.c
> b/tools/testing/selftests/powerpc/utils.c
> index 933678f1ed0a..bb8e402752c0 100644
> --- a/tools/testing/selftests/powerpc/utils.c
> +++ b/tools/testing/selftests/powerpc/utils.c
> @@ -16,6 +16,7 @@
> @@ -88,28 +89,36 @@ void *get_auxv_entry(int type)
> [...]
>  int pick_online_cpu(void)
>  {
> -	cpu_set_t mask;
> -	int cpu;
> +	int ncpus, cpu = -1;
> +	cpu_set_t *mask;
> +	size_t size;
>
> -	CPU_ZERO(&mask);
> +	ncpus = get_nprocs();
> +	size = CPU_ALLOC_SIZE(ncpus);
> +	mask = CPU_ALLOC(ncpus);
>
> -	if (sched_getaffinity(0, sizeof(mask), &mask)) {
> +	CPU_ZERO_S(size, mask);
> +
> +	if (sched_getaffinity(0, size, mask)) {
>  		perror("sched_getaffinity");
> -		return -1;
> +		goto done;
>  	}
>
>  	/* We prefer a primary thread, but skip 0 */
> -	for (cpu = 8; cpu < CPU_SETSIZE; cpu += 8)
> -		if (CPU_ISSET(cpu, &mask))
> -			return cpu;
> +	for (cpu = 8; cpu < ncpus; cpu += 8)
> +		if (CPU_ISSET_S(cpu, size, mask))
> +			goto done;
>
>  	/* Search for anything, but in reverse */
> -	for (cpu = CPU_SETSIZE - 1; cpu >= 0; cpu--)
> -		if (CPU_ISSET(cpu, &mask))
> -			return cpu;
> +	for (cpu = ncpus - 1; cpu >= 0; cpu--)
> +		if (CPU_ISSET_S(cpu, size, mask))
> +			goto done;
>
>  	printf("No cpus in affinity mask?!\n");
> -	return -1;

There's a bug here as cpu should have been set to -1.
Will send v2 with this fix.

> +
> +done:
> +	CPU_FREE(mask);
> +	return cpu;
>  }
>
>  bool is_ppc64le(void)
>

- Sandipan
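One way to make the failure value independent of where a loop variable ends up is to keep the search in a helper that returns -1 explicitly. The following is only a sketch under that assumption and is not the v2 patch; the helper name pick_cpu_from_mask is hypothetical, and it expects the caller to have allocated the mask with CPU_ALLOC() and filled it in with sched_getaffinity().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical helper (not in the patch): the search part of
 * pick_online_cpu(), operating on an already-filled dynamic cpu set.
 */
static int pick_cpu_from_mask(const cpu_set_t *mask, size_t size, int ncpus)
{
	int cpu;

	assert(mask != NULL && ncpus >= 0);

	/* We prefer a primary thread, but skip 0 */
	for (cpu = 8; cpu < ncpus; cpu += 8)
		if (CPU_ISSET_S(cpu, size, mask))
			return cpu;

	/* Search for anything, but in reverse */
	for (cpu = ncpus - 1; cpu >= 0; cpu--)
		if (CPU_ISSET_S(cpu, size, mask))
			return cpu;

	/* Nothing is set in the mask: report failure explicitly */
	return -1;
}
```

The caller would then free the mask with CPU_FREE() on every path and return whatever the helper returned.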
Re: [PATCH] powerpc/powernv: Fix a warning message
On Sat, 2020-05-02 at 11:59:49 UTC, Christophe JAILLET wrote:
> Fix a cut'n'paste error in a warning message. This should be
> 'cpu-idle-state-residency-ns' to match the property searched in the
> previous 'of_property_read_u32_array()'
>
> Fixes: 9c7b185ab2fe ("powernv/cpuidle: Parse dt idle properties into global structure")
> Signed-off-by: Christophe JAILLET

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/2f62870ca5bc9d305f3c212192320c29e9dbdc54

cheers
Re: [PATCH v3 2/5] powerpc: module_[32|64].c: replace swap function with built-in one
On Tue, 2019-04-02 at 20:47:22 UTC, Andrey Abramov wrote:
> Replace relaswap with built-in one, because relaswap
> does a simple byte to byte swap.
>
> Since Spectre mitigations have made indirect function calls more
> expensive, and the default simple byte copies swap is implemented
> without them, an "optimized" custom swap function is now
> a waste of time as well as code.
>
> Signed-off-by: Andrey Abramov
> Reviewed by: George Spelvin
> Acked-by: Michael Ellerman (powerpc)

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/bac7ca7b985b72873bd4ac2553b13b5af5b1f08a

cheers
Re: [PATCH v3 7/9] powerpc/ps3: Add check for otheros image size
On Sat, 2020-05-16 at 16:20:46 UTC, Geoff Levand wrote:
> The ps3's otheros flash loader has a size limit of 16 MiB for the
> uncompressed image. If that limit will be reached output the
> flash image file as 'otheros-too-big.bld'.
>
> Signed-off-by: Geoff Levand

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/aa3bc365ee73765af5059678bf55b0f3e4a3e6c4

cheers
Re: [PATCH 0/6] assorted kuap fixes (try again)
On Wed, 29 Apr 2020 16:56:48 +1000, Nicholas Piggin wrote:
> Well the last series was a disaster, I'll try again sending the
> patches with proper subject and changelogs written.
>
> Nicholas Piggin (6):
>   powerpc/64/kuap: move kuap checks out of MSR[RI]=0 regions of exit
>     code
>   powerpc/64s/kuap: kuap_restore missing isync
>   powerpc/64/kuap: interrupt exit conditionally restore AMR
>   powerpc/64s/kuap: restore AMR in system reset exception
>   powerpc/64s/kuap: restore AMR in fast_interrupt_return
>   powerpc/64s/kuap: conditionally restore AMR in kuap_restore_amr asm
>
> [...]

Patches 2, 3 and 6 applied to powerpc/next.

[2/6] powerpc/64s/kuap: Add missing isync to KUAP restore paths
      https://git.kernel.org/powerpc/c/cb2b53cbffe3c388cd676b63f34e54ceb2643ae2
[3/6] powerpc/64/kuap: Conditionally restore AMR in interrupt exit
      https://git.kernel.org/powerpc/c/579940bb451c2dd33396d2d56ce6ef5d92154b3b
[6/6] powerpc/64s/kuap: Conditionally restore AMR in kuap_restore_amr asm
      https://git.kernel.org/powerpc/c/d4539074b0e9c5fa6508e8c33aaf51abc8ff6e91

cheers