[tip: timers/core] clocksource: mips-gic-timer: Register as sched_clock
The following commit has been merged into the timers/core branch of tip:

Commit-ID:     48016e78d328998b1f00bcfb639adeabca51abe5
Gitweb:        https://git.kernel.org/tip/48016e78d328998b1f00bcfb639adeabca51abe5
Author:        Paul Burton
AuthorDate:    Thu, 21 May 2020 23:48:16 +03:00
Committer:     Daniel Lezcano
CommitterDate: Sat, 23 May 2020 00:03:08 +02:00

clocksource: mips-gic-timer: Register as sched_clock

The MIPS GIC timer is well suited for use as sched_clock, so register it as such.

Whilst the existing gic_read_count() function already matches the prototype needed by sched_clock_register(), we split it into 2 functions in order to remove the need to evaluate the mips_cm_is64 condition within each call, since sched_clock should be as fast as possible.

Note the sched clock framework needs the clock source to be stable in order to rely on it. So we register the MIPS GIC timer as a sched clock only if it is stable, that is if either the system doesn't have CPU-frequency scaling enabled or the CPU frequency is changed by means of the CPC core clock divider available on platforms with CM3 or newer.
Signed-off-by: Paul Burton
Co-developed-by: Serge Semin
[sergey.se...@baikalelectronics.ru: Register sched-clock if CM3 or !CPU-freq]
Signed-off-by: Serge Semin
Cc: Alexey Malahov
Cc: Thomas Bogendoerfer
Cc: Ralf Baechle
Cc: Alessandro Zummo
Cc: Alexandre Belloni
Cc: Arnd Bergmann
Cc: Rob Herring
Cc: linux-m...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Signed-off-by: Daniel Lezcano
Link: https://lore.kernel.org/r/20200521204818.25436-8-sergey.se...@baikalelectronics.ru
---
 drivers/clocksource/mips-gic-timer.c | 31 +++
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/drivers/clocksource/mips-gic-timer.c b/drivers/clocksource/mips-gic-timer.c
index 8b5f8ae..ef12c12 100644
--- a/drivers/clocksource/mips-gic-timer.c
+++ b/drivers/clocksource/mips-gic-timer.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -24,13 +25,10 @@ static DEFINE_PER_CPU(struct clock_event_device, gic_clockevent_device);
 static int gic_timer_irq;
 static unsigned int gic_frequency;

-static u64 notrace gic_read_count(void)
+static u64 notrace gic_read_count_2x32(void)
 {
 	unsigned int hi, hi2, lo;

-	if (mips_cm_is64)
-		return read_gic_counter();
-
 	do {
 		hi = read_gic_counter_32h();
 		lo = read_gic_counter_32l();
@@ -40,6 +38,19 @@ static u64 notrace gic_read_count(void)
 	return (((u64) hi) << 32) + lo;
 }

+static u64 notrace gic_read_count_64(void)
+{
+	return read_gic_counter();
+}
+
+static u64 notrace gic_read_count(void)
+{
+	if (mips_cm_is64)
+		return gic_read_count_64();
+
+	return gic_read_count_2x32();
+}
+
 static int gic_next_event(unsigned long delta, struct clock_event_device *evt)
 {
 	int cpu = cpumask_first(evt->cpumask);
@@ -228,6 +239,18 @@ static int __init gic_clocksource_of_init(struct device_node *node)
 	/* And finally start the counter */
 	clear_gic_config(GIC_CONFIG_COUNTSTOP);

+	/*
+	 * It's safe to use the MIPS GIC timer as a sched clock source only if
+	 * its ticks are stable, which is true on either the platforms with
+	 * stable CPU frequency or on the platforms with CM3 and CPU frequency
+	 * change performed by the CPC core clocks divider.
+	 */
+	if (mips_cm_revision() >= CM_REV_CM3 || !IS_ENABLED(CONFIG_CPU_FREQ)) {
+		sched_clock_register(mips_cm_is64 ?
+				     gic_read_count_64 : gic_read_count_2x32,
+				     64, gic_frequency);
+	}
+
 	return 0;
 }
 TIMER_OF_DECLARE(mips_gic_timer, "mti,gic-timer",
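A minimal sketch of the idea behind the split, in plain C rather than kernel code. The simulated counter (`sim_counter`, `sim_read_hi`, `sim_read_lo`) and `pick_reader` are hypothetical stand-ins, and the retry loop is an assumption about the part of the 2x32 reader that the hunk boundaries hide. The point it illustrates is that the 64-bit check happens once, when the reader is chosen, instead of on every read.

```c
#include <stdint.h>

/* Hypothetical simulated 64-bit counter standing in for the GIC hardware. */
static uint64_t sim_counter;

static uint32_t sim_read_hi(void)
{
	return (uint32_t)(sim_counter >> 32);
}

/* The simulated counter ticks once just before every low-word read. */
static uint32_t sim_read_lo(void)
{
	sim_counter += 1;
	return (uint32_t)sim_counter;
}

/* Assumed shape of the 2x32 read: retry until the high word is stable. */
static uint64_t read_count_2x32(void)
{
	uint32_t hi, hi2, lo;

	do {
		hi = sim_read_hi();
		lo = sim_read_lo();
		hi2 = sim_read_hi();
	} while (hi2 != hi);

	return ((uint64_t)hi << 32) + lo;
}

/* Single 64-bit access: no retry loop needed. */
static uint64_t read_count_64(void)
{
	return sim_counter;
}

typedef uint64_t (*read_fn)(void);

/* Decide once, at registration time, which reader the hot path will use. */
static read_fn pick_reader(int is64)
{
	return is64 ? read_count_64 : read_count_2x32;
}
```

If the low word wraps between the two high-word reads, the loop discards the torn sample and reads again, so the caller never sees a value with a stale high half.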
Re: piix4-poweroff.c I/O BAR usage
Hello,

On Thu, May 21, 2020 at 6:04 PM Maciej W. Rozycki wrote:
> Paul may or may not be reachable anymore, so I'll step in.

I'm reachable but lacking free time & with no access to Malta hardware I can't claim to be too useful here, so thanks for responding :)

Before being moved to a driver (which was mostly driven by a desire to migrate Malta to a multi-platform/generic kernel using DT) this code was part of arch/mips/mti-malta/ where I added it in commit b6911bba598f ("MIPS: Malta: add suspend state entry code"). My main motivation at the time was to make QEMU exit after running poweroff, but I did ensure it worked on real Malta boards too (at least Malta-R with CoreFPGA6). Over the years since then it shocked a couple of hardware people to see software power off a Malta - if the original hardware designers had intended that to work then the knowledge had been lost over time :)

I suspect the code was based on visws_machine_power_off():

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/platform/visws/visws_quirks.c?h=v3.10#n125

> > pci_request_region() takes a BAR number (0-5), but here we're passing
> > PCI_BRIDGE_RESOURCES (13 if CONFIG_PCI_IOV, or 7 otherwise), which is
> > the bridge I/O window.
> >
> > I don't think this device ([8086:7113]) is a bridge, so that resource
> > should be empty.
>
> Hmm, isn't the resource actually set up by `quirk_piix4_acpi' though?

I agree that the region used is meant to match that set up by quirk_piix4_acpi(), which also refers to it using the PCI_BRIDGE_RESOURCES macro.

Thanks,
    Paul
[PATCH] MIPS: tlbex: Fix build_restore_pagemask KScratch restore
build_restore_pagemask() will restore the value of register $1/$at when its restore_scratch argument is non-zero, and aims to do so by filling a branch delay slot. Commit 0b24cae4d535 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.") added an EHB instruction (Execution Hazard Barrier) prior to restoring $1 from a KScratch register, in order to resolve a hazard that can result in stale values of the KScratch register being observed. In particular, P-class CPUs from MIPS with out of order execution pipelines such as the P5600 & P6600 are affected.

Unfortunately this EHB instruction was inserted in the branch delay slot, causing the MFC0 instruction which performs the restoration to no longer execute along with the branch. The result is that the $1 register isn't actually restored, ie. the TLB refill exception handler clobbers it - which is exactly the problem the EHB is meant to avoid for the P-class CPUs.

Similarly build_get_pgd_vmalloc64() will restore the value of $1/$at when its mode argument equals refill_scratch, and suffers from the same problem.

Fix this in both cases by moving the EHB earlier in the emitted code. There's no reason it needs to immediately precede the MFC0 - it simply needs to be between the MTC0 & MFC0.

This bug only affects Cavium Octeon systems which use build_fast_tlb_refill_handler().

Signed-off-by: Paul Burton
Fixes: 0b24cae4d535 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.")
Cc: Dmitry Korotin
Cc: sta...@vger.kernel.org # v3.15+
---
 arch/mips/mm/tlbex.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
index e01cb33bfa1a..41bb91f05688 100644
--- a/arch/mips/mm/tlbex.c
+++ b/arch/mips/mm/tlbex.c
@@ -653,6 +653,13 @@ static void build_restore_pagemask(u32 **p, struct uasm_reloc **r,
 				   int restore_scratch)
 {
 	if (restore_scratch) {
+		/*
+		 * Ensure the MFC0 below observes the value written to the
+		 * KScratch register by the prior MTC0.
+		 */
+		if (scratch_reg >= 0)
+			uasm_i_ehb(p);
+
 		/* Reset default page size */
 		if (PM_DEFAULT_MASK >> 16) {
 			uasm_i_lui(p, tmp, PM_DEFAULT_MASK >> 16);
@@ -667,12 +674,10 @@ static void build_restore_pagemask(u32 **p, struct uasm_reloc **r,
 			uasm_i_mtc0(p, 0, C0_PAGEMASK);
 			uasm_il_b(p, r, lid);
 		}
-		if (scratch_reg >= 0) {
-			uasm_i_ehb(p);
+		if (scratch_reg >= 0)
 			UASM_i_MFC0(p, 1, c0_kscratch(), scratch_reg);
-		} else {
+		else
 			UASM_i_LW(p, 1, scratchpad_offset(0), 0);
-		}
 	} else {
 		/* Reset default page size */
 		if (PM_DEFAULT_MASK >> 16) {
@@ -921,6 +926,10 @@ build_get_pgd_vmalloc64(u32 **p, struct uasm_label **l, struct uasm_reloc **r,
 	}
 	if (mode != not_refill && check_for_high_segbits) {
 		uasm_l_large_segbits_fault(l, *p);
+
+		if (mode == refill_scratch && scratch_reg >= 0)
+			uasm_i_ehb(p);
+
 		/*
 		 * We get here if we are an xsseg address, or if we are
 		 * an xuseg address above (PGDIR_SHIFT+PGDIR_BITS) boundary.
@@ -939,12 +948,10 @@ build_get_pgd_vmalloc64(u32 **p, struct uasm_label **l, struct uasm_reloc **r,
 		uasm_i_jr(p, ptr);

 		if (mode == refill_scratch) {
-			if (scratch_reg >= 0) {
-				uasm_i_ehb(p);
+			if (scratch_reg >= 0)
 				UASM_i_MFC0(p, 1, c0_kscratch(), scratch_reg);
-			} else {
+			else
 				UASM_i_LW(p, 1, scratchpad_offset(0), 0);
-			}
 		} else {
 			uasm_i_nop(p);
 		}
--
2.23.0
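The ordering constraint described in this patch can be illustrated with a toy model. This is only a sketch under stated assumptions: the `enum insn` values, the two example sequences, and `restore_is_valid()` are all hypothetical and have nothing to do with the real uasm API. It checks the two conditions the commit message names: an EHB must sit somewhere between the MTC0 and the MFC0, and the MFC0 must be the instruction in the branch delay slot, that is the one immediately after the branch.

```c
#include <stddef.h>

/* Hypothetical instruction kinds; not real MIPS encodings. */
enum insn {
	INSN_MTC0_KSCRATCH,	/* save $1 into KScratch */
	INSN_EHB,		/* execution hazard barrier */
	INSN_OTHER,		/* any unrelated instruction */
	INSN_BRANCH,		/* branch with one delay slot */
	INSN_MFC0_KSCRATCH	/* restore $1 from KScratch */
};

/* Emitted order before the fix: the EHB occupies the delay slot. */
static const enum insn buggy_seq[] = {
	INSN_MTC0_KSCRATCH, INSN_OTHER, INSN_BRANCH, INSN_EHB,
	INSN_MFC0_KSCRATCH
};

/* Emitted order after the fix: the EHB is hoisted above the branch. */
static const enum insn fixed_seq[] = {
	INSN_MTC0_KSCRATCH, INSN_EHB, INSN_OTHER, INSN_BRANCH,
	INSN_MFC0_KSCRATCH
};

/*
 * Return 1 when an EHB lies between the first MTC0 and the first MFC0,
 * and that MFC0 directly follows the first branch (delay slot).
 */
static int restore_is_valid(const enum insn *seq, size_t n)
{
	size_t i, mtc0 = n, mfc0 = n, branch = n;
	int ehb_between = 0;

	for (i = 0; i < n; i++) {
		if (seq[i] == INSN_MTC0_KSCRATCH && mtc0 == n)
			mtc0 = i;
		else if (seq[i] == INSN_BRANCH && branch == n)
			branch = i;
		else if (seq[i] == INSN_MFC0_KSCRATCH && mfc0 == n)
			mfc0 = i;
	}
	if (mtc0 == n || mfc0 == n || branch == n)
		return 0;

	for (i = mtc0 + 1; i < mfc0; i++)
		if (seq[i] == INSN_EHB)
			ehb_between = 1;

	return ehb_between && mfc0 == branch + 1;
}
```

In the pre-fix order the barrier is present but the MFC0 falls one instruction past the delay slot, so it is skipped when the branch is taken; hoisting the EHB satisfies both conditions.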
Re: [PATCH] MAINTAINERS: Use @kernel.org address for Paul Burton
Hello, Paul Burton wrote: > From: Paul Burton > > Switch to using my paulbur...@kernel.org email address in order to avoid > subject mangling that's being imposed on my previous address. Applied to mips-fixes. > commit 0ad8f7aa9f7e > https://git.kernel.org/mips/c/0ad8f7aa9f7e > > Signed-off-by: Paul Burton > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paulbur...@kernel.org to report it. ]
[PATCH] MAINTAINERS: Use @kernel.org address for Paul Burton
From: Paul Burton Switch to using my paulbur...@kernel.org email address in order to avoid subject mangling that's being imposed on my previous address. Signed-off-by: Paul Burton Signed-off-by: Paul Burton --- .mailmap| 3 ++- MAINTAINERS | 10 +- 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/.mailmap b/.mailmap index edcac87e76c8..10b27ecb61c0 100644 --- a/.mailmap +++ b/.mailmap @@ -196,7 +196,8 @@ Oleksij Rempel Oleksij Rempel Paolo 'Blaisorblade' Giarrusso Patrick Mochel -Paul Burton +Paul Burton +Paul Burton Peter A Jonsson Peter Oruba Peter Oruba diff --git a/MAINTAINERS b/MAINTAINERS index a69e6db80c79..6c4dc607074a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3096,7 +3096,7 @@ S:Supported F: arch/arm64/net/ BPF JIT for MIPS (32-BIT AND 64-BIT) -M: Paul Burton +M: Paul Burton L: net...@vger.kernel.org L: b...@vger.kernel.org S: Maintained @@ -8001,7 +8001,7 @@ S:Maintained F: drivers/usb/atm/ueagle-atm.c IMGTEC ASCII LCD DRIVER -M: Paul Burton +M: Paul Burton S: Maintained F: Documentation/devicetree/bindings/auxdisplay/img-ascii-lcd.txt F: drivers/auxdisplay/img-ascii-lcd.c @@ -10828,7 +10828,7 @@ F: drivers/usb/image/microtek.* MIPS M: Ralf Baechle -M: Paul Burton +M: Paul Burton M: James Hogan L: linux-m...@vger.kernel.org W: http://www.linux-mips.org/ @@ -10842,7 +10842,7 @@ F: arch/mips/ F: drivers/platform/mips/ MIPS BOSTON DEVELOPMENT BOARD -M: Paul Burton +M: Paul Burton L: linux-m...@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/clock/img,boston-clock.txt @@ -10852,7 +10852,7 @@ F: drivers/clk/imgtec/clk-boston.c F: include/dt-bindings/clock/boston-clock.h MIPS GENERIC PLATFORM -M: Paul Burton +M: Paul Burton L: linux-m...@vger.kernel.org S: Supported F: Documentation/devicetree/bindings/power/mti,mips-cpc.txt -- 2.23.0
Re: [PATCH] MIPS: Loongson: Make default kernel log buffer size as 128KB for Loongson3
Hi Tiezhu & Huacai,

On Tue, Oct 15, 2019 at 12:00:25PM +0800, Tiezhu Yang wrote:
> On 10/15/2019 11:36 AM, Huacai Chen wrote:
> > On Tue, Oct 15, 2019 at 10:12 AM Tiezhu Yang wrote:
> > > When I update kernel with loongson3_defconfig based on the Loongson 3A3000
> > > platform, then using dmesg command to show kernel ring buffer, the initial
> > > kernel messages have disappeared due to the log buffer is too small, it is
> > > better to change the default kernel log buffer size from 16KB to 128KB.
> > >
> > > Signed-off-by: Tiezhu Yang
> > > ---
> > >  arch/mips/configs/loongson3_defconfig | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/arch/mips/configs/loongson3_defconfig
> > > b/arch/mips/configs/loongson3_defconfig
> > > index 90ee008..3aa2201 100644
> > > --- a/arch/mips/configs/loongson3_defconfig
> > > +++ b/arch/mips/configs/loongson3_defconfig
> > > @@ -12,7 +12,7 @@ CONFIG_TASKSTATS=y
> > >  CONFIG_TASK_DELAY_ACCT=y
> > >  CONFIG_TASK_XACCT=y
> > >  CONFIG_TASK_IO_ACCOUNTING=y
> > > -CONFIG_LOG_BUF_SHIFT=14
> > > +CONFIG_LOG_BUF_SHIFT=17
> > Hi, Tiezhu,
> >
> > Why you choose 128KB but not 64KB or 256KB? I found 64KB is enough for
> > our cases. And if you really need more, I think 256KB could be better
> > because there are many platforms choose 256KB.
>
> Hi Huacai,
>
> Thanks for your reply and suggestion, I will send a v2 patch.

Thanks for the patches. I actually have a slight preference for 128KB if you've no specific need, since 128KB is the default. Some quick grepping says that of 405 defconfigs in tree (as of v5.4-rc3), we have:

  LOG_BUF_SHIFT  Count
             12      1
             13      3
             14    235
             15     18
             16     39
             17     90
             18     13
             19      2
             20      4

ie. 16KiB is by far the most common, then second most common is the default 128KiB. 256KiB is comparatively rare.

However, I don't think your v1 patch is quite right Tiezhu - since 17 is the default it shouldn't be specified in the defconfig at all. Did you manually make the change in the loongson3_defconfig file? If so please take a look at the savedefconfig make target & try something like this:

  make ARCH=mips loongson3_defconfig
  make ARCH=mips menuconfig
  # Change LOG_BUF_SHIFT
  make ARCH=mips savedefconfig
  mv defconfig arch/mips/configs/loongson3_defconfig
  git add -i arch/mips/configs/loongson3_defconfig
  # Stage the relevant changes, drop the others

You should end up with the CONFIG_LOG_BUF_SHIFT line just getting deleted.

If on the other hand you really do prefer 256KiB for these systems please describe why in the commit message. It could be something as simple as "we have lots of memory so using 256KiB isn't a big deal, and gives us a better chance of preserving boot messages until they're examined". But if your log is getting this big before you look at it (or before something like systemd copies it into its journal), there's probably something fishy going on.

Thanks,
    Paul
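For readers mapping the shift values in this thread to buffer sizes: the log buffer holds 2^LOG_BUF_SHIFT bytes. The small sketch below spells out that arithmetic; the helper names `log_buf_bytes` and `log_buf_kib` are made up for illustration and are not kernel functions.

```c
#include <stddef.h>

/* Buffer size in bytes for a given shift: 2 to the power of shift. */
static size_t log_buf_bytes(unsigned int shift)
{
	return (size_t)1 << shift;
}

/* Same size expressed in KiB; meaningful for shift values of 10 or more. */
static size_t log_buf_kib(unsigned int shift)
{
	return log_buf_bytes(shift) >> 10;
}
```

So a shift of 14 gives 16 KiB, 16 gives 64 KiB, 17 gives 128 KiB and 18 gives 256 KiB, which matches the sizes discussed above.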
Re: [EXTERNAL]Re: Build regressions/improvements in v5.4-rc3
Hi Geert, Greg, On Mon, Oct 14, 2019 at 09:04:21AM +0200, Geert Uytterhoeven wrote: > On Mon, Oct 14, 2019 at 8:53 AM Geert Uytterhoeven > wrote: > > JFYI, when comparing v5.4-rc3[1] to v5.4-rc2[3], the summaries are: > > - build errors: +1/-0 > > + /kisskb/src/drivers/staging/octeon/ethernet-spi.c: error: > 'OCTEON_IRQ_RML' undeclared (first use in this function): => 198:19, > 224:12 > > mips-allmodconfig > > > [1] > > http://kisskb.ellerman.id.au/kisskb/branch/linus/head/4f5cafb5cb8471e54afdc9054d973535614f7675/ > > (232 out of 242 configs) > > [3] > > http://kisskb.ellerman.id.au/kisskb/branch/linus/head/da0c9ea146cbe92b832f1b0f694840ea8eb33cce/ > > (233 out of 242 configs) I believe this should be fixed by this patch: https://lore.kernel.org/lkml/20191007231741.2012860-1-paul.bur...@mips.com/ It's currently in staging-next as commit 17a29fea086b ("staging/octeon: Use stubs for MIPS && !CAVIUM_OCTEON_SOC"). Could we get that merged in the 5.4 cycle instead of 5.5? Thanks, Paul
[GIT PULL] MIPS fixes
Hi Linus,

Here are a few MIPS fixes for 5.4; please pull.

Thanks,
    Paul


The following changes since commit da0c9ea146cbe92b832f1b0f694840ea8eb33cce:

  Linux 5.4-rc2 (2019-10-06 14:27:30 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git tags/mips_fixes_5.4_2

for you to fetch changes up to 2f2b4fd674cadd8c6b40eb629e140a14db4068fd:

  MIPS: Disable Loongson MMI instructions for kernel build (2019-10-10 11:58:52 -0700)

A few MIPS fixes for 5.4:

- Build fixes for CONFIG_OPTIMIZE_INLINING=y builds in which the
  compiler may choose not to inline __xchg() & __cmpxchg().

- A build fix for Loongson configurations with GCC 9.x.

- Expose some extra HWCAP bits to indicate support for various
  instruction set extensions to userland.

- Fix bad stack access in firmware handling code for old SNI
  RM200/300/400 machines.

Jiaxun Yang (1):
      MIPS: elf_hwcap: Export userspace ASEs

Paul Burton (1):
      MIPS: Disable Loongson MMI instructions for kernel build

Thomas Bogendoerfer (3):
      MIPS: include: Mark __cmpxchg as __always_inline
      MIPS: include: Mark __xchg as __always_inline
      MIPS: fw: sni: Fix out of bounds init of o32 stack

 arch/mips/fw/sni/sniprom.c         |  2 +-
 arch/mips/include/asm/cmpxchg.h    |  9 +
 arch/mips/include/uapi/asm/hwcap.h | 11 +++
 arch/mips/kernel/cpu-probe.c       | 33 +
 arch/mips/loongson64/Platform      |  4
 arch/mips/vdso/Makefile            |  1 +
 6 files changed, 55 insertions(+), 5 deletions(-)
Re: [PATCH] mips: Fix unroll macro when building with Clang
Hello, Nathan Chancellor wrote: > Building with Clang errors after commit 6baaeadae911 ("MIPS: Provide > unroll() macro, use it for cache ops") since the GCC_VERSION macro > is defined in include/linux/compiler-gcc.h, which is only included > in compiler.h when using GCC: > > In file included from arch/mips/kernel/mips-mt.c:20: > ./arch/mips/include/asm/r4kcache.h:254:1: error: use of undeclared > identifier 'GCC_VERSION'; did you mean 'S_VERSION'? > __BUILD_BLAST_CACHE(i, icache, Index_Invalidate_I, Hit_Invalidate_I, 32, > ) > ^ > ./arch/mips/include/asm/r4kcache.h:219:4: note: expanded from macro > '__BUILD_BLAST_CACHE' > cache_unroll(32, kernel_cache, indexop, > ^ > ./arch/mips/include/asm/r4kcache.h:203:2: note: expanded from macro > 'cache_unroll' > unroll(times, _cache_op, insn, op, (addr) + (i++ * (lsize))); > ^ > ./arch/mips/include/asm/unroll.h:28:15: note: expanded from macro > 'unroll' > BUILD_BUG_ON(GCC_VERSION >= 40700 &&\ > ^ > > Use CONFIG_GCC_VERSION, which will always be set by Kconfig. > Additionally, Clang 8 had improvements around __builtin_constant_p so > use that as a lower limit for this check with Clang (although MIPS > wasn't buildable until Clang 9); building a kernel with Clang 9.0.0 > has no issues after this change. Applied to mips-next. > commit df3da04880b4 > https://git.kernel.org/mips/c/df3da04880b4 > > Fixes: 6baaeadae911 ("MIPS: Provide unroll() macro, use it for cache ops") > Link: https://github.com/ClangBuiltLinux/linux/issues/736 > Signed-off-by: Nathan Chancellor > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH 0/6] Clean up ARC code and fix IP22/28 early printk
Hello, Thomas Bogendoerfer wrote: > While fixing the problem of not working EARLY_PRINTK on IP22/IP28 > I've removed not used ARC function and made 32bit ARC PROMs working > with 64bit kernels. By switching to memory detection via PROM calls > EARLY_PRINTK works now. And by using the regular 64bit spaces > maximum memory of 384MB on Indigo2 R4k machines is working, too. > > Thomas Bogendoerfer (6): > MIPS: fw: arc: remove unused ARC code > MIPS: fw: arc: use call_o32 to call ARC prom from 64bit kernel > MIPS: Kconfig: always select ARC_MEMORY and ARC_PROMLIB for platform > MIPS: fw: arc: workaround 64bit kernel/32bit ARC problems > MIPS: SGI-IP22: set PHYS_OFFSET to memory start > MIPS: SGI-IP22/28: Use PROM for memory detection Series applied to mips-next. > MIPS: fw: arc: remove unused ARC code > commit d11646b5ce93 > https://git.kernel.org/mips/c/d11646b5ce93 > > Signed-off-by: Thomas Bogendoerfer > Signed-off-by: Paul Burton > > MIPS: fw: arc: use call_o32 to call ARC prom from 64bit kernel > commit ce6c0a593b3c > https://git.kernel.org/mips/c/ce6c0a593b3c > > Signed-off-by: Thomas Bogendoerfer > Signed-off-by: Paul Burton > > MIPS: Kconfig: always select ARC_MEMORY and ARC_PROMLIB for platform > commit 39b2d7565a47 > https://git.kernel.org/mips/c/39b2d7565a47 > > Signed-off-by: Thomas Bogendoerfer > Signed-off-by: Paul Burton > > MIPS: fw: arc: workaround 64bit kernel/32bit ARC problems > commit 351889d35629 > https://git.kernel.org/mips/c/351889d35629 > > Signed-off-by: Thomas Bogendoerfer > Signed-off-by: Paul Burton > > MIPS: SGI-IP22: set PHYS_OFFSET to memory start > commit 931e1bfea403 > https://git.kernel.org/mips/c/931e1bfea403 > > Signed-off-by: Thomas Bogendoerfer > Signed-off-by: Paul Burton > > MIPS: SGI-IP22/28: Use PROM for memory detection > commit c0de00b286ed > https://git.kernel.org/mips/c/c0de00b286ed > > Signed-off-by: Thomas Bogendoerfer > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is 
incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH] MIPS: fw: sni: Fix out of bounds init of o32 stack
Hello, Thomas Bogendoerfer wrote: > Use ARRAY_SIZE to calculate the top of the o32 stack. Applied to mips-fixes. > commit efcb529694c3 > https://git.kernel.org/mips/c/efcb529694c3 > > Signed-off-by: Thomas Bogendoerfer > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
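The message above does not show the code, so the following is only a generic sketch of why an element count, not a byte count, is the right index for the top of a stack array. The array `stack_words`, its 8-byte element type, its length of 512, and both helper functions are hypothetical; only the `ARRAY_SIZE` idiom itself is taken from the commit subject.

```c
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stack made of 8-byte words. */
static uint64_t stack_words[512];

/* Byte count used as an index: lands far past the end of the array. */
static size_t top_index_by_sizeof(void)
{
	return sizeof(stack_words);
}

/* Element count used as an index: exactly one past the last element. */
static size_t top_index_by_array_size(void)
{
	return ARRAY_SIZE(stack_words);
}
```

With 8-byte elements the byte count is eight times the element count, so indexing by `sizeof` would point well outside the array, whereas `ARRAY_SIZE` yields the one-past-the-end position a downward-growing stack starts from.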
Re: [PATCH] MIPS: include: Mark __xchg as __always_inline
Hello, Thomas Bogendoerfer wrote: > Commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING > forcibly") allows compiler to uninline functions marked as 'inline'. > In case of __xchg this would cause a reference to the function > __xchg_called_with_bad_pointer, which is an error case > for catching bugs and will not happen for correct code, if > __xchg is inlined. Applied to mips-fixes. > commit 46f1619500d0 > https://git.kernel.org/mips/c/46f1619500d0 > > Signed-off-by: Thomas Bogendoerfer > Reviewed-by: Philippe Mathieu-Daudé > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH v2] MIPS: generic: Use __initconst for const init data
Hello, Tiezhu Yang wrote: > Fix the following checkpatch errors: > > $ ./scripts/checkpatch.pl --no-tree -f arch/mips/generic/init.c > ERROR: Use of const init definition must use __initconst > #23: FILE: arch/mips/generic/init.c:23: > +static __initdata const void *fdt; > > ERROR: Use of const init definition must use __initconst > #24: FILE: arch/mips/generic/init.c:24: > +static __initdata const struct mips_machine *mach; > > ERROR: Use of const init definition must use __initconst > #25: FILE: arch/mips/generic/init.c:25: > +static __initdata const void *mach_match_data; Applied to mips-next. > commit a14bf1dc494a > https://git.kernel.org/mips/c/a14bf1dc494a > > Fixes: eed0eabd12ef ("MIPS: generic: Introduce generic DT-based board > support") > Signed-off-by: Tiezhu Yang > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
[PATCH] staging/octeon: Use stubs for MIPS && !CAVIUM_OCTEON_SOC
When building for a non-Cavium MIPS system with COMPILE_TEST=y, the Octeon ethernet driver hits a number of issues due to use of macros provided only for CONFIG_CAVIUM_OCTEON_SOC=y configurations. For example:

  drivers/staging/octeon/ethernet-rx.c:190:6: error: 'CONFIG_CAVIUM_OCTEON_CVMSEG_SIZE' undeclared (first use in this function)
  drivers/staging/octeon/ethernet-rx.c:472:25: error: 'OCTEON_IRQ_WORKQ0' undeclared (first use in this function)

These come from various asm/ headers that a non-Octeon build will be using a non-Octeon version of.

Fix this by using the octeon-stubs.h header for non-Cavium MIPS builds, and only using the real asm/octeon/ headers when building a Cavium Octeon kernel configuration.

This requires that octeon-stubs.h doesn't redefine XKPHYS_TO_PHYS, which is defined for MIPS by asm/addrspace.h which is pulled in by many other common asm/ headers.

Signed-off-by: Paul Burton
Reported-by: Geert Uytterhoeven
URL: https://lore.kernel.org/linux-mips/CAMuHMdXvu+BppwzsU9imNWVKea_hoLcRt9N+a29Q-QsjW=i...@mail.gmail.com/
Fixes: 171a9bae68c7 ("staging/octeon: Allow test build on !MIPS")
Cc: Matthew Wilcox (Oracle)
Cc: Greg Kroah-Hartman
Cc: David S. Miller
---
 drivers/staging/octeon/octeon-ethernet.h | 2 +-
 drivers/staging/octeon/octeon-stubs.h    | 5 -
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/octeon/octeon-ethernet.h b/drivers/staging/octeon/octeon-ethernet.h
index a8a864b40913..042220d86d33 100644
--- a/drivers/staging/octeon/octeon-ethernet.h
+++ b/drivers/staging/octeon/octeon-ethernet.h
@@ -14,7 +14,7 @@
 #include
 #include
-#ifdef CONFIG_MIPS
+#ifdef CONFIG_CAVIUM_OCTEON_SOC
 #include
diff --git a/drivers/staging/octeon/octeon-stubs.h b/drivers/staging/octeon/octeon-stubs.h
index a4ac3bfb62a8..c7ff90207f8a 100644
--- a/drivers/staging/octeon/octeon-stubs.h
+++ b/drivers/staging/octeon/octeon-stubs.h
@@ -1,5 +1,8 @@
 #define CONFIG_CAVIUM_OCTEON_CVMSEG_SIZE	512
-#define XKPHYS_TO_PHYS(p)			(p)
+
+#ifndef XKPHYS_TO_PHYS
+# define XKPHYS_TO_PHYS(p)			(p)
+#endif

 #define OCTEON_IRQ_WORKQ0 0
 #define OCTEON_IRQ_RML 0
--
2.23.0
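The `#ifndef` guard added to octeon-stubs.h can be shown in isolation. In this sketch, a single file plays the role of both headers; the first definition stands in for whatever an architecture header already provides, and its mask value and the `to_phys()` helper are hypothetical choices made only so the example has something to compute.

```c
/* Stand-in for an arch header that already defines the macro (mask is hypothetical). */
#define XKPHYS_TO_PHYS(p) ((p) & 0x07ffffffffffffffULL)

/* Stand-in for the stub header: only supply a fallback if nothing exists yet. */
#ifndef XKPHYS_TO_PHYS
# define XKPHYS_TO_PHYS(p) (p)
#endif

/* The earlier definition wins, and no redefinition warning or error occurs. */
static unsigned long long to_phys(unsigned long long addr)
{
	return XKPHYS_TO_PHYS(addr);
}
```

Without the guard, the second `#define` would clash with the first on any build where both headers are included, which is the situation the patch describes for non-Cavium MIPS builds.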
Re: [PATCH] mips: check for dsp presence only once before save/restore
Hello, Aurabindo Jayamohanan wrote: > {save,restore}_dsp() internally checks if the cpu has dsp support. > Therefore, explicit check is not required before calling them in > {save,restore}_processor_state() Applied to mips-next. > commit 9662dd752c14 > https://git.kernel.org/mips/c/9662dd752c14 > > Signed-off-by: Aurabindo Jayamohanan > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH v2 4/5] MIPS: CI20: DTS: Add Leds
Hello, Alexandre GRIVEAUX wrote: > Adding leds and related triggers. Applied to mips-next. > commit 24b0cb4f883a > https://git.kernel.org/mips/c/24b0cb4f883a > > Signed-off-by: Alexandre GRIVEAUX > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH v2 3/5] MIPS: CI20: DTS: Add IW8103 Wifi + bluetooth
Hello, Alexandre GRIVEAUX wrote: > Add IW8103 Wifi + bluetooth module to device tree and related power domain. Applied to mips-next. > commit 948f2708f945 > https://git.kernel.org/mips/c/948f2708f945 > > Signed-off-by: Alexandre GRIVEAUX > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH v2 2/5] MIPS: CI20: DTS: Add I2C nodes
Hello, Alexandre GRIVEAUX wrote: > Adding missing I2C nodes and some peripheral: > - PMU > - RTC Applied to mips-next. > commit 73f2b940474d > https://git.kernel.org/mips/c/73f2b940474d > > Signed-off-by: Alexandre GRIVEAUX > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH 1/2] MIPS: SGI-IP27: remove not used stuff inherited from IRIX
Hello, Thomas Bogendoerfer wrote: > Most of the SN/SN0 header files are inherited from IRIX header files, > but not all of that stuff is useful for Linux. Remove not used parts. Series applied to mips-next. > MIPS: SGI-IP27: remove not used stuff inherited from IRIX > commit 46a73e9e6ccc > https://git.kernel.org/mips/c/46a73e9e6ccc > > Signed-off-by: Thomas Bogendoerfer > Signed-off-by: Paul Burton > > MIPS: SGI-IP27: get rid of compact node ids > commit 4bf841ebf17a > https://git.kernel.org/mips/c/4bf841ebf17a > > Signed-off-by: Thomas Bogendoerfer > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH v2 1/5] MIPS: JZ4780: DTS: Add I2C nodes
Hello, Alexandre GRIVEAUX wrote: > Add the devicetree nodes for the I2C core of the JZ4780 SoC, disabled > by default. Applied to mips-next. > commit f56a040c9faf > https://git.kernel.org/mips/c/f56a040c9faf > > Signed-off-by: Alexandre GRIVEAUX > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH v2] mips: sgi-ip27: switch from DISCONTIGMEM to SPARSEMEM
Hello, Mike Rapoport wrote: > From: Mike Rapoport > > The memory initialization of SGI-IP27 is already half-way to support > SPARSEMEM. It only had free_bootmem_with_active_regions() left-overs > interfering with sparse_memory_present_with_active_regions(). > > Replace these calls with simpler memblocks_present() call in prom_meminit() > and adjust arch/mips/Kconfig to enable SPARSEMEM and SPARSEMEM_EXTREME for > SGI-IP27. Applied to mips-next. > commit 397dc00e249e > https://git.kernel.org/mips/c/397dc00e249e > > Co-developed-by: Thomas Bogendoerfer > Signed-off-by: Thomas Bogendoerfer > Signed-off-by: Mike Rapoport > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH v2 00/36] MIPS: barriers & atomics cleanups
Hello, Paul Burton wrote: > This series consists of a bunch of cleanups to the way we handle memory > barriers (though no changes to the sync instructions we use to implement > them) & atomic memory accesses. One major goal was to ensure the > Loongson3 LL/SC errata workarounds are applied in a safe manner from > within inline-asm & that we can automatically verify the resulting > kernel binary looks reasonable. Many patches are cleanups found along > the way. > > Applies atop v5.4-rc1. > > Changes in v2: > - Keep our fls/ffs implementations. Turns out GCC's builtins call > intrinsics in some configurations, and if we'd need to go implement > those then using the generic fls/ffs doesn't seem like such a win. > - De-string __WEAK_LLSC_MB to allow use with __SYNC_ELSE(). > - Only try to build the loongson3-llsc-check tool from > arch/mips/Makefile when CONFIG_CPU_LOONGSON3_WORKAROUNDS is enabled. > > Paul Burton (36): > MIPS: Unify sc beqz definition > MIPS: Use compact branch for LL/SC loops on MIPSr6+ > MIPS: barrier: Add __SYNC() infrastructure > MIPS: barrier: Clean up rmb() & wmb() definitions > MIPS: barrier: Clean up __smp_mb() definition > MIPS: barrier: Remove fast_mb() Octeon #ifdef'ery > MIPS: barrier: Clean up __sync() definition > MIPS: barrier: Clean up sync_ginv() > MIPS: atomic: Fix whitespace in ATOMIC_OP macros > MIPS: atomic: Handle !kernel_uses_llsc first > MIPS: atomic: Use one macro to generate 32b & 64b functions > MIPS: atomic: Emit Loongson3 sync workarounds within asm > MIPS: atomic: Use _atomic barriers in atomic_sub_if_positive() > MIPS: atomic: Unify 32b & 64b sub_if_positive > MIPS: atomic: Deduplicate 32b & 64b read, set, xchg, cmpxchg > MIPS: bitops: Handle !kernel_uses_llsc first > MIPS: bitops: Only use ins for bit 16 or higher > MIPS: bitops: Use MIPS_ISA_REV, not #ifdefs > MIPS: bitops: ins start position is always an immediate > MIPS: bitops: Implement test_and_set_bit() in terms of _lock variant > MIPS: bitops: Allow immediates 
in test_and_{set,clear,change}_bit > MIPS: bitops: Use the BIT() macro > MIPS: bitops: Avoid redundant zero-comparison for non-LLSC > MIPS: bitops: Abstract LL/SC loops > MIPS: bitops: Use BIT_WORD() & BITS_PER_LONG > MIPS: bitops: Emit Loongson3 sync workarounds within asm > MIPS: bitops: Use smp_mb__before_atomic in test_* ops > MIPS: cmpxchg: Emit Loongson3 sync workarounds within asm > MIPS: cmpxchg: Omit redundant barriers for Loongson3 > MIPS: futex: Emit Loongson3 sync workarounds within asm > MIPS: syscall: Emit Loongson3 sync workarounds within asm > MIPS: barrier: Remove loongson_llsc_mb() > MIPS: barrier: Make __smp_mb__before_atomic() a no-op for Loongson3 > MIPS: genex: Add Loongson3 LL/SC workaround to ejtag_debug_handler > MIPS: genex: Don't reload address unnecessarily > MIPS: Check Loongson3 LL/SC errata workaround correctness > > arch/mips/Makefile | 3 + > arch/mips/Makefile.postlink| 10 +- Series applied to mips-next. > MIPS: Unify sc beqz definition > commit 878f75c7a253 > https://git.kernel.org/mips/c/878f75c7a253 > > Signed-off-by: Paul Burton > > MIPS: Use compact branch for LL/SC loops on MIPSr6+ > commit ef85d057a605 > https://git.kernel.org/mips/c/ef85d057a605 > > Signed-off-by: Paul Burton > > MIPS: barrier: Add __SYNC() infrastructure > commit bf92927251b3 > https://git.kernel.org/mips/c/bf92927251b3 > > Signed-off-by: Paul Burton > > MIPS: barrier: Clean up rmb() & wmb() definitions > commit 21e3134b3ec0 > https://git.kernel.org/mips/c/21e3134b3ec0 > > Signed-off-by: Paul Burton > > MIPS: barrier: Clean up __smp_mb() definition > commit 05e6da742b5b > https://git.kernel.org/mips/c/05e6da742b5b > > Signed-off-by: Paul Burton > > MIPS: barrier: Remove fast_mb() Octeon #ifdef'ery > commit 5c12a6eff6ae > https://git.kernel.org/mips/c/5c12a6eff6ae > > Signed-off-by: Paul Burton > > MIPS: barrier: Clean up __sync() definition > commit fe0065e56227 > https://git.kernel.org/mips/c/fe0065e56227 > > Signed-off-by: Paul Burton > > MIPS: barrier: 
Clean up sync_ginv() > commit 185d7d7a5819 > https://git.kernel.org/mips/c/185d7d7a5819 > > Signed-off-by: Paul Burton > > MIPS: atomic: Fix whitespace in ATOMIC_OP macros > commit 36d3295c5a0d > https://git.kernel.org/mips/c/36d3295c5a0d > > Signed-off-by: Paul Burton > > MIPS: atomic: Handle !kernel_uses_llsc first > commit 9537db24c65a > https://git.kernel.org/mips/c/9537db24c65a > >
Re: [PATCH] MIPS: include: Mark __cmpxchd as __always_inline
Hello, Thomas Bogendoerfer wrote: > Commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING > forcibly") allows the compiler to uninline functions marked as 'inline'. > In case of cmpxchg this would cause a reference to the function > __cmpxchg_called_with_bad_pointer, which is an error case > for catching bugs and will not happen for correct code, if > __cmpxchg is inlined. Applied to mips-fixes. > commit 88356d09904b > https://git.kernel.org/mips/c/88356d09904b > > Signed-off-by: Thomas Bogendoerfer > [paul.bur...@mips.com: s/__cmpxchd/__cmpxchg in subject] > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
[GIT PULL] MIPS fixes
Hi Linus, Here is a selection of fixes for arch/mips, mostly handling regressions introduced during the v5.4 merge window; please pull. Thanks, Paul The following changes since commit 54ecb8f7028c5eb3d740bb82b0f1d90f2df63c5c: Linux 5.4-rc1 (2019-09-30 10:35:40 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git tags/mips_fixes_5.4_1 for you to fetch changes up to 6822c29ddbbdeafd8d1b79ebe6c51b83efd55ae1: MIPS: fw/arc: Remove unused addr variable (2019-10-04 11:46:22 -0700) Some MIPS fixes for the 5.4 cycle: - Build fixes for Cavium Octeon & PMC-Sierra MSP systems, as well as all pre-MIPSr6 configurations built with binutils < 2.25. - Boot fixes for 64-bit Loongson systems & SGI IP28 systems. - Wire up the new clone3 syscall. - Clean ups for a few build-time warnings. Christophe JAILLET (1): mips: Loongson: Fix the link time qualifier of 'serial_exit()' Huacai Chen (1): MIPS: Loongson64: Fix boot failure after dropping boot_mem_map Jiaxun Yang (1): MIPS: cpu-bugs64: Mark inline functions as __always_inline Oleksij Rempel (1): MIPS: dts: ar9331: fix interrupt-controller size Paul Burton (7): MIPS: octeon: Include required header; fix octeon ethernet build MIPS: Wire up clone3 syscall MIPS: VDSO: Remove unused gettimeofday.c MIPS: VDSO: Fix build for binutils < 2.25 MIPS: pmcs-msp71xx: Add missing MAX_PROM_MEM definition MIPS: pmcs-msp71xx: Remove unused addr variable MIPS: fw/arc: Remove unused addr variable Thomas Bogendoerfer (2): MIPS: init: Fix reservation of memory between PHYS_OFFSET and mem start MIPS: init: Prevent adding memory before PHYS_OFFSET arch/mips/boot/dts/qca/ar9331.dtsi| 2 +- arch/mips/fw/arc/memory.c | 1 - arch/mips/include/asm/octeon/cvmx-ipd.h | 1 + arch/mips/include/asm/unistd.h| 1 + arch/mips/kernel/cpu-bugs64.c | 14 +- arch/mips/kernel/setup.c | 5 +- arch/mips/kernel/syscall.c| 1 + arch/mips/kernel/syscalls/syscall_n32.tbl | 2 +- arch/mips/kernel/syscalls/syscall_n64.tbl | 2 +- 
arch/mips/kernel/syscalls/syscall_o32.tbl | 2 +- arch/mips/loongson64/common/mem.c | 35 ++-- arch/mips/loongson64/common/serial.c | 2 +- arch/mips/loongson64/loongson-3/numa.c| 11 +- arch/mips/pmcs-msp71xx/msp_prom.c | 4 +- arch/mips/vdso/Makefile | 2 +- arch/mips/vdso/gettimeofday.c | 269 -- 16 files changed, 41 insertions(+), 313 deletions(-) delete mode 100644 arch/mips/vdso/gettimeofday.c
[PATCH v2] mtd: rawnand: au1550nd: Fix au_read_buf16() prototype
Commit 7e534323c416 ("mtd: rawnand: Pass a nand_chip object to chip->read_xxx() hooks") modified the prototype of the struct nand_chip read_buf function pointer. In the au1550nd driver we have 2 implementations of read_buf. The previously mentioned commit modified the au_read_buf() implementation to match the function pointer, but not au_read_buf16(). This results in a compiler warning for MIPS db1xxx_defconfig builds: drivers/mtd/nand/raw/au1550nd.c:443:57: warning: pointer type mismatch in conditional expression Fix this by updating the prototype of au_read_buf16() to take a struct nand_chip pointer as its first argument, as is expected after commit 7e534323c416 ("mtd: rawnand: Pass a nand_chip object to chip->read_xxx() hooks"). Note that this shouldn't have caused any functional issues at runtime, since the offset of the struct mtd_info within struct nand_chip is 0 making mtd_to_nand() effectively a type-cast. Signed-off-by: Paul Burton Fixes: 7e534323c416 ("mtd: rawnand: Pass a nand_chip object to chip->read_xxx() hooks") Cc: Boris Brezillon Cc: Miquel Raynal Cc: David Woodhouse Cc: Brian Norris Cc: Marek Vasut Cc: Vignesh Raghavendra Cc: linux-...@lists.infradead.org Cc: linux-m...@vger.kernel.org Cc: sta...@vger.kernel.org # v4.20+ --- Changes in v2: - Update kerneldoc comment too... 
drivers/mtd/nand/raw/au1550nd.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/mtd/nand/raw/au1550nd.c b/drivers/mtd/nand/raw/au1550nd.c index 97a97a9ccc36..e10b76089048 100644 --- a/drivers/mtd/nand/raw/au1550nd.c +++ b/drivers/mtd/nand/raw/au1550nd.c @@ -134,16 +134,15 @@ static void au_write_buf16(struct nand_chip *this, const u_char *buf, int len) /** * au_read_buf16 - read chip data into buffer - * @mtd: MTD device structure + * @this: NAND chip object * @buf: buffer to store date * @len: number of bytes to read * * read function for 16bit buswidth */ -static void au_read_buf16(struct mtd_info *mtd, u_char *buf, int len) +static void au_read_buf16(struct nand_chip *this, u_char *buf, int len) { int i; - struct nand_chip *this = mtd_to_nand(mtd); u16 *p = (u16 *) buf; len >>= 1; -- 2.23.0
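For readers wondering why the mismatched prototype was harmless at runtime, here is a minimal user-space sketch of the "offset 0 makes mtd_to_nand() effectively a type-cast" remark. The demo_* structs and the simplified container_of() are hypothetical stand-ins written for illustration, not the kernel's definitions; the only assumption taken from the commit message is that the mtd_info member sits at offset 0 of the chip structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct mtd_info and struct nand_chip. */
struct demo_mtd_info {
	int size;
};

struct demo_nand_chip {
	struct demo_mtd_info mtd;	/* first member, so its offset is 0 */
	int options;
};

/* Simplified container_of(): step back from a member to its container. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Illustrative equivalent of mtd_to_nand() for the demo structs. */
static struct demo_nand_chip *demo_mtd_to_nand(struct demo_mtd_info *mtd)
{
	return demo_container_of(mtd, struct demo_nand_chip, mtd);
}

/* Returns 1 when converting the embedded mtd pointer yields the chip itself. */
static int demo_cast_is_identity(void)
{
	struct demo_nand_chip chip = { .mtd = { .size = 0 }, .options = 0 };

	/* With the member at offset 0 the subtraction above is a no-op. */
	assert(offsetof(struct demo_nand_chip, mtd) == 0);
	return (void *)demo_mtd_to_nand(&chip.mtd) == (void *)&chip;
}
```

Because both pointers hold the same address, a callee that treats its first argument as the mtd pointer and converts it back still ends up with the right chip. The patch is still needed so that the types match and the warning goes away.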
[PATCH] mtd: rawnand: au1550nd: Fix au_read_buf16() prototype
Commit 7e534323c416 ("mtd: rawnand: Pass a nand_chip object to chip->read_xxx() hooks") modified the prototype of the struct nand_chip read_buf function pointer. In the au1550nd driver we have 2 implementations of read_buf. The previously mentioned commit modified the au_read_buf() implementation to match the function pointer, but not au_read_buf16(). This results in a compiler warning for MIPS db1xxx_defconfig builds: drivers/mtd/nand/raw/au1550nd.c:443:57: warning: pointer type mismatch in conditional expression Fix this by updating the prototype of au_read_buf16() to take a struct nand_chip pointer as its first argument, as is expected after commit 7e534323c416 ("mtd: rawnand: Pass a nand_chip object to chip->read_xxx() hooks"). Note that this shouldn't have caused any functional issues at runtime, since the offset of the struct mtd_info within struct nand_chip is 0 making mtd_to_nand() effectively a type-cast. Signed-off-by: Paul Burton Fixes: 7e534323c416 ("mtd: rawnand: Pass a nand_chip object to chip->read_xxx() hooks") Cc: Boris Brezillon Cc: Miquel Raynal Cc: David Woodhouse Cc: Brian Norris Cc: Marek Vasut Cc: Vignesh Raghavendra Cc: linux-...@lists.infradead.org Cc: linux-m...@vger.kernel.org Cc: sta...@vger.kernel.org # v4.20+ --- drivers/mtd/nand/raw/au1550nd.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/mtd/nand/raw/au1550nd.c b/drivers/mtd/nand/raw/au1550nd.c index 97a97a9ccc36..2bc818dea2a8 100644 --- a/drivers/mtd/nand/raw/au1550nd.c +++ b/drivers/mtd/nand/raw/au1550nd.c @@ -140,10 +140,9 @@ static void au_write_buf16(struct nand_chip *this, const u_char *buf, int len) * * read function for 16bit buswidth */ -static void au_read_buf16(struct mtd_info *mtd, u_char *buf, int len) +static void au_read_buf16(struct nand_chip *this, u_char *buf, int len) { int i; - struct nand_chip *this = mtd_to_nand(mtd); u16 *p = (u16 *) buf; len >>= 1; -- 2.23.0
Re: [PATCH] MIPS: init: Prevent adding memory before PHYS_OFFSET
Hello, Thomas Bogendoerfer wrote: > On some SGI machines (IP28 and IP30) a small region of memory is mirrored > to physical address 0 for exception vectors while rest of the memory > is reachable at a higher physical address. ARC PROM marks this > region as reserved, but with commit a94e4f24ec83 ("MIPS: init: Drop > boot_mem_map") this chunk is used, when searching for start of ram, > which breaks at least IP28 and IP30 machines. To fix this > add_region_memory() checks for start address < PHYS_OFFSET and ignores > these chunks. Applied to mips-fixes. > commit bd848d1b9235 > https://git.kernel.org/mips/c/bd848d1b9235 > > Fixes: a94e4f24ec83 ("MIPS: init: Drop boot_mem_map") > Signed-off-by: Thomas Bogendoerfer > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH] MIPS: init: Fix reservation of memory between PHYS_OFFSET and mem start
Hello, Thomas Bogendoerfer wrote: > Fix calculation of the size for reserving memory between PHYS_OFFSET > and real memory start. Applied to mips-fixes. > commit 66b416ee41ed > https://git.kernel.org/mips/c/66b416ee41ed > > Fixes: a94e4f24ec83 ("MIPS: init: Drop boot_mem_map") > Signed-off-by: Thomas Bogendoerfer > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: [PATCH] mips: Loongson: Fix the link time qualifier of 'serial_exit()'
Hello, Christophe JAILLET wrote: > 'exit' functions should be marked as __exit, not __init. Applied to mips-fixes. > commit 25b69a889b63 > https://git.kernel.org/mips/c/25b69a889b63 > > Fixes: 85cc028817ef ("mips: make loongsoon serial driver explicitly modular") > Signed-off-by: Christophe JAILLET > Signed-off-by: Paul Burton Thanks, Paul [ This message was auto-generated; if you believe anything is incorrect then please email paul.bur...@mips.com to report it. ]
Re: Build regressions/improvements in v5.4-rc1
Hi Geert, On Wed, Oct 02, 2019 at 11:17:26AM +0200, Geert Uytterhoeven wrote: > > 15 error regressions: > > + /kisskb/build/tmp/cc1Or5dj.s: Error: can't resolve `_start' {*UND* > > section} - `L0 ' {.text section}: => 663, 1200, 222, 873, 1420 > > + /kisskb/build/tmp/cc2uWmof.s: Error: can't resolve `_start' {*UND* > > section} - `L0 ' {.text section}: => 1213, 919, 688, 1434, 226 > > + /kisskb/build/tmp/ccc6hBqd.s: Error: can't resolve `_start' {*UND* > > section} - `L0 ' {.text section}: => 513, 1279, 1058, 727 > > + /kisskb/build/tmp/cclSQ19p.s: Error: can't resolve `_start' {*UND* > > section} - `L0 ' {.text section}: => 1396, 881, 1175, 671, 226 > > + /kisskb/build/tmp/ccu3SlxY.s: Error: can't resolve `_start' {*UND* > > section} - `L0 ' {.text section}: => 1238, 911, 222, 680, 1457 > > Various mips (allmodconfig, allnoconfig, malta_defconfig, ip22_defconfig) > > Related to > > /kisskb/src/arch/mips/vdso/Makefile:61: MIPS VDSO requires binutils >= > 2.25 > > ? Hmm, this looks like fallout from the conversion to the generic VDSO infrastructure. This patch resolves it: https://lore.kernel.org/linux-mips/20191002174438.127127-2-paul.bur...@mips.com/ > > + /kisskb/src/arch/mips/include/asm/octeon/cvmx-ipd.h: error: > > 'CVMX_PIP_SFT_RST' undeclared (first use in this function): => 331:36 > > + /kisskb/src/arch/mips/include/asm/octeon/cvmx-ipd.h: error: > > 'CVMX_PIP_SFT_RST' undeclared (first use in this function); did you mean > > 'CVMX_CIU_SOFT_RST'?: => 331:36 > > + /kisskb/src/arch/mips/include/asm/octeon/cvmx-ipd.h: error: storage > > size of 'pip_sft_rst' isn't known: => 330:27 > > mips-allmodconfig (CC Matthew Wilcox) That one's triggered by a change in the ordering of some include directives in the drivers/staging/octeon code, and fixed by commit 0228ecf6128c ("MIPS: octeon: Include required header; fix octeon ethernet build") in mips-next. Thanks, Paul
Re: [PATCH v2 5/5] MIPS: JZ4780: DTS: Add CPU nodes
Hi Alexandre, On Tue, Oct 01, 2019 at 09:09:48PM +0200, Alexandre GRIVEAUX wrote: > The JZ4780 have 2 core, adding to DT. > > Signed-off-by: Alexandre GRIVEAUX > --- > arch/mips/boot/dts/ingenic/jz4780.dtsi | 17 + > 1 file changed, 17 insertions(+) > > diff --git a/arch/mips/boot/dts/ingenic/jz4780.dtsi > b/arch/mips/boot/dts/ingenic/jz4780.dtsi > index f928329b034b..9c7346724f1f 100644 > --- a/arch/mips/boot/dts/ingenic/jz4780.dtsi > +++ b/arch/mips/boot/dts/ingenic/jz4780.dtsi > @@ -7,6 +7,23 @@ > #size-cells = <1>; > compatible = "ingenic,jz4780"; > > + cpus { > + #address-cells = <1>; > + #size-cells = <0>; > + > + cpu@0 { > + compatible = "ingenic,jz4780"; This should probably be something like ingenic,xburst2. JZ4780 is the SoC. It also should be a documented binding, but I think it would be worth holding off on the whole thing until we actually get SMP support merged - just in case we come up with a binding that doesn't actually work out. So I expect I'll just apply patches 1-4 for now. Thanks for working on it! Paul > + device_type = "cpu"; > + reg = <0>; > + }; > + > + cpu@1 { > + compatible = "ingenic,jz4780"; > + device_type = "cpu"; > + reg = <1>; > + }; > + }; > + > cpuintc: interrupt-controller { > #address-cells = <0>; > #interrupt-cells = <1>; > -- > 2.20.1 >
[PATCH v2 03/36] MIPS: barrier: Add __SYNC() infrastructure
Introduce an asm/sync.h header which provides infrastructure that can be used to generate sync instructions of various types, and for various reasons. For example if we need a sync instruction that provides a full completion barrier but only on systems which have weak memory ordering, we can generate the appropriate assembly code using: __SYNC(full, weak_ordering) When the kernel is configured to run on systems with weak memory ordering (ie. CONFIG_WEAK_ORDERING is selected) we'll emit a sync instruction. When the kernel is configured to run on systems with strong memory ordering (ie. CONFIG_WEAK_ORDERING is not selected) we'll emit nothing. The caller doesn't need to know which happened - it simply says what it needs & when, with no concern for checking the kernel configuration. There are some scenarios in which we may want to emit code only when we *didn't* emit a sync instruction. For example, some Loongson3 CPUs suffer from a bug that requires us to emit a sync instruction prior to each ll instruction (enabled by CONFIG_CPU_LOONGSON3_WORKAROUNDS). In cases where this bug workaround is enabled, it's wasteful to then have more generic code emit another sync instruction to provide barriers we need in general. A __SYNC_ELSE() macro allows for this, providing an extra argument that contains code to be assembled only in cases where the sync instruction was not emitted. For example if we have a scenario in which we generally want to emit a release barrier but for affected Loongson3 configurations upgrade that to a full completion barrier, we can do that like so: __SYNC_ELSE(full, loongson3_war, __SYNC(rl, always)) The assembly generated by these macros can be used either as inline assembly or in assembly source files. Differing types of sync as provided by MIPSr6 are defined, but currently they all generate a full completion barrier except in kernels configured for Cavium Octeon systems. 
There the wmb sync-type is used, and rmb syncs are omitted, as has been the case since commit 6b07d38aaa52 ("MIPS: Octeon: Use optimized memory barrier primitives."). Using __SYNC() with the wmb or rmb types will abstract away the Octeon specific behavior and allow us to later clean up asm/barrier.h code that currently includes a plethora of #ifdef's. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/barrier.h | 113 + arch/mips/include/asm/sync.h| 207 arch/mips/kernel/pm-cps.c | 20 +-- 3 files changed, 219 insertions(+), 121 deletions(-) create mode 100644 arch/mips/include/asm/sync.h diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h index 9228f7386220..5ad39bfd3b6d 100644 --- a/arch/mips/include/asm/barrier.h +++ b/arch/mips/include/asm/barrier.h @@ -9,116 +9,7 @@ #define __ASM_BARRIER_H #include - -/* - * Sync types defined by the MIPS architecture (document MD00087 table 6.5) - * These values are used with the sync instruction to perform memory barriers. - * Types of ordering guarantees available through the SYNC instruction: - * - Completion Barriers - * - Ordering Barriers - * As compared to the completion barrier, the ordering barrier is a - * lighter-weight operation as it does not require the specified instructions - * before the SYNC to be already completed. Instead it only requires that those - * specified instructions which are subsequent to the SYNC in the instruction - * stream are never re-ordered for processing ahead of the specified - * instructions which are before the SYNC in the instruction stream. - * This potentially reduces how many cycles the barrier instruction must stall - * before it completes. - * Implementations that do not use any of the non-zero values of stype to define - * different barriers, such as ordering barriers, must make those stype values - * act the same as stype zero. 
- */ - -/* - * Completion barriers: - * - Every synchronizable specified memory instruction (loads or stores or both) - * that occurs in the instruction stream before the SYNC instruction must be - * already globally performed before any synchronizable specified memory - * instructions that occur after the SYNC are allowed to be performed, with - * respect to any other processor or coherent I/O module. - * - * - The barrier does not guarantee the order in which instruction fetches are - * performed. - * - * - A stype value of zero will always be defined such that it performs the most - * complete set of synchronization operations that are defined.This means - * stype zero always does a completion barrier that affects both loads and - * stores preceding the SYNC instruction and both loads and stores that are - * subsequent to the SYNC instruction. Non-zero values of stype may be defined - * by the architecture or specific implementations to perform synchronization - * behaviors that are less complete than that of stype zero. If an - * imple
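As a rough illustration of the __SYNC() / __SYNC_ELSE() behaviour described in the commit message for this patch, the sketch below selects between "emit a sync" and "emit the fallback code" at preprocessing time. Everything prefixed DEMO_ is hypothetical, the configuration values are hard-coded assumptions, and the token-pasting mechanism is only one way to get this effect; it is not taken from asm/sync.h, which may well do it differently.

```c
#include <assert.h>
#include <string.h>

/* Assumed configuration for the demo: weak ordering on, Loongson3 WAR off. */
#define DEMO_WEAK_ORDERING	1
#define DEMO_LOONGSON3_WAR	0

/* Each "reason" evaluates to 1 (emit the sync) or 0 (emit the fallback). */
#define DEMO_REASON_always		1
#define DEMO_REASON_weak_ordering	DEMO_WEAK_ORDERING
#define DEMO_REASON_loongson3_war	DEMO_LOONGSON3_WAR

#define DEMO_PICK_1(sync, other)	sync
#define DEMO_PICK_0(sync, other)	other
#define DEMO_CAT_(a, b)			a##b
#define DEMO_CAT(a, b)			DEMO_CAT_(a, b)

/* Expands to a sync string when the reason applies, else to "other". */
#define DEMO_SYNC_ELSE(type, reason, other) \
	DEMO_CAT(DEMO_PICK_, DEMO_REASON_##reason)("sync\t# " #type "\n", other)

/* Same, but emits nothing at all when the reason does not apply. */
#define DEMO_SYNC(type, reason)	DEMO_SYNC_ELSE(type, reason, "")

/* Full barrier, only on weakly ordered configurations. */
static const char *demo_weak_barrier(void)
{
	return DEMO_SYNC(full, weak_ordering);
}

/* Release barrier, upgraded to a full one when the workaround is enabled. */
static const char *demo_release_or_war(void)
{
	return DEMO_SYNC_ELSE(full, loongson3_war, DEMO_SYNC(rl, always));
}

static void demo_sync_selfcheck(void)
{
	assert(strcmp(demo_weak_barrier(), "sync\t# full\n") == 0);
	assert(strcmp(demo_release_or_war(), "sync\t# rl\n") == 0);
}
```

The point of the design is visible in the two helper functions: the caller states which barrier it needs and why, and never tests the configuration itself. Flipping DEMO_LOONGSON3_WAR to 1 would turn the second string into the full barrier with no change at the call site.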
[PATCH v2 18/36] MIPS: bitops: Use MIPS_ISA_REV, not #ifdefs
Rather than #ifdef on CONFIG_CPU_* to determine whether the ins instruction is supported we can simply check MIPS_ISA_REV to discover whether we're targeting MIPSr2 or higher. Do so in order to clean up the code. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/bitops.h | 13 - 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 1e5739191ddf..0f5329e32e87 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -19,6 +19,7 @@ #include /* sigh ... */ #include #include +#include #include #include #include @@ -76,8 +77,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) return; } -#if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6) - if (__builtin_constant_p(bit) && (bit >= 16)) { + if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit) && (bit >= 16)) { loongson_llsc_mb(); do { __asm__ __volatile__( @@ -90,7 +90,6 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) } while (unlikely(!temp)); return; } -#endif /* CONFIG_CPU_MIPSR2 || CONFIG_CPU_MIPSR6 */ loongson_llsc_mb(); do { @@ -143,8 +142,7 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr) return; } -#if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6) - if (__builtin_constant_p(bit)) { + if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit)) { loongson_llsc_mb(); do { __asm__ __volatile__( @@ -157,7 +155,6 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr) } while (unlikely(!temp)); return; } -#endif /* CONFIG_CPU_MIPSR2 || CONFIG_CPU_MIPSR6 */ loongson_llsc_mb(); do { @@ -377,8 +374,7 @@ static inline int test_and_clear_bit(unsigned long nr, : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) : "r" (1UL << bit) : __LLSC_CLOBBER); -#if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6) - } else if (__builtin_constant_p(nr)) { + } else if ((MIPS_ISA_REV 
>= 2) && __builtin_constant_p(nr)) { loongson_llsc_mb(); do { __asm__ __volatile__( @@ -390,7 +386,6 @@ static inline int test_and_clear_bit(unsigned long nr, : "ir" (bit) : __LLSC_CLOBBER); } while (unlikely(!temp)); -#endif } else { loongson_llsc_mb(); do { -- 2.23.0
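The change from #ifdef blocks to a plain C condition on MIPS_ISA_REV can be sketched in isolation as below. DEMO_MIPS_ISA_REV is a hypothetical stand-in with an assumed value of 2, and the __builtin_constant_p() part of the real condition is left out to keep the example portable. Since the revision is a compile-time constant, the compiler is free to discard the branch that can never be taken, which is what lets the #ifdef go away.

```c
#include <assert.h>

/* Hypothetical stand-in for MIPS_ISA_REV; the value 2 is an assumption. */
#define DEMO_MIPS_ISA_REV 2

/* Returns 1 if the ins-based fast path would be chosen for this bit. */
static int demo_use_ins(unsigned int bit)
{
	/* Constant first operand: on a pre-r2 build this folds to "if (0)". */
	if ((DEMO_MIPS_ISA_REV >= 2) && (bit >= 16))
		return 1;

	return 0;
}

static void demo_isa_rev_selfcheck(void)
{
	assert(demo_use_ins(16) == 1);
	assert(demo_use_ins(15) == 0);
}
```

Unlike the #ifdef form, both branches are always parsed and type-checked, so a build for one ISA revision cannot silently rot the code used by another.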
[PATCH v2 07/36] MIPS: barrier: Clean up __sync() definition
Implement __sync() using the new __SYNC() infrastructure, which will take care of not emitting an instruction for old R3k CPUs that don't support it. The only behavioral difference is that __sync() will now provide a compiler barrier on these old CPUs, but that seems like reasonable behavior anyway. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/barrier.h | 18 -- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h index 657ec01120a4..a117c6d95038 100644 --- a/arch/mips/include/asm/barrier.h +++ b/arch/mips/include/asm/barrier.h @@ -11,20 +11,10 @@ #include #include -#ifdef CONFIG_CPU_HAS_SYNC -#define __sync() \ - __asm__ __volatile__( \ - ".set push\n\t" \ - ".set noreorder\n\t" \ - ".set mips2\n\t" \ - "sync\n\t" \ - ".set pop"\ - : /* no output */ \ - : /* no input */\ - : "memory") -#else -#define __sync() do { } while(0) -#endif +static inline void __sync(void) +{ + asm volatile(__SYNC(full, always) ::: "memory"); +} static inline void rmb(void) { -- 2.23.0
[PATCH v2 09/36] MIPS: atomic: Fix whitespace in ATOMIC_OP macros
We define macros in asm/atomic.h which end each line with space characters before a backslash to continue on the next line. Remove the space characters leaving tabs as the whitespace used for conformity with coding convention. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/atomic.h | 184 - 1 file changed, 92 insertions(+), 92 deletions(-) diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h index 7578c807ef98..2d2a8a74c51b 100644 --- a/arch/mips/include/asm/atomic.h +++ b/arch/mips/include/asm/atomic.h @@ -42,102 +42,102 @@ */ #define atomic_set(v, i) WRITE_ONCE((v)->counter, (i)) -#define ATOMIC_OP(op, c_op, asm_op) \ -static __inline__ void atomic_##op(int i, atomic_t * v) \ -{\ - if (kernel_uses_llsc) { \ - int temp; \ - \ - loongson_llsc_mb(); \ - __asm__ __volatile__( \ - " .setpush\n" \ - " .set"MIPS_ISA_LEVEL"\n" \ - "1: ll %0, %1 # atomic_" #op "\n" \ - " " #asm_op " %0, %2 \n" \ - " sc %0, %1 \n" \ - "\t" __SC_BEQZ "%0, 1b \n" \ - " .setpop \n" \ - : "=" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter) \ - : "Ir" (i) : __LLSC_CLOBBER); \ - } else { \ - unsigned long flags; \ - \ - raw_local_irq_save(flags);\ - v->counter c_op i;\ - raw_local_irq_restore(flags); \ - } \ +#define ATOMIC_OP(op, c_op, asm_op)\ +static __inline__ void atomic_##op(int i, atomic_t * v) \ +{ \ + if (kernel_uses_llsc) { \ + int temp; \ + \ + loongson_llsc_mb(); \ + __asm__ __volatile__( \ + " .setpush\n" \ + " .set"MIPS_ISA_LEVEL"\n" \ + "1: ll %0, %1 # atomic_" #op "\n" \ + " " #asm_op " %0, %2 \n" \ + " sc %0, %1 \n" \ + "\t" __SC_BEQZ "%0, 1b \n" \ + " .setpop \n" \ + : "=" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)\ + : "Ir" (i) : __LLSC_CLOBBER); \ + } else {\ + unsigned long flags;\ + \ + raw_local_irq_save(flags); \ + v->counter c_op i; \ + raw_local_irq_restore(flags); \ + } \ } -#define ATOMIC_OP_RETURN(op, c_op, asm_op) \ -static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \ -{
[PATCH v2 05/36] MIPS: barrier: Clean up __smp_mb() definition
We #ifdef on Cavium Octeon CPUs, but emit the same sync instruction in both cases. Remove the #ifdef & simply expand to the __sync() macro. Whilst here indent the strong ordering case definitions to match the indentation of the weak ordering ones, helping readability. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/barrier.h | 12 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h index f36cab87cfde..8a5abc1c85a6 100644 --- a/arch/mips/include/asm/barrier.h +++ b/arch/mips/include/asm/barrier.h @@ -89,17 +89,13 @@ static inline void wmb(void) #endif /* !CONFIG_CPU_HAS_WB */ #if defined(CONFIG_WEAK_ORDERING) -# ifdef CONFIG_CPU_CAVIUM_OCTEON -# define __smp_mb() __sync() -# else -# define __smp_mb() __asm__ __volatile__("sync" : : :"memory") -# endif +# define __smp_mb()__sync() # define __smp_rmb() rmb() # define __smp_wmb() wmb() #else -#define __smp_mb() barrier() -#define __smp_rmb()barrier() -#define __smp_wmb()barrier() +# define __smp_mb()barrier() +# define __smp_rmb() barrier() +# define __smp_wmb() barrier() #endif /* -- 2.23.0
[PATCH v2 02/36] MIPS: Use compact branch for LL/SC loops on MIPSr6+
When targeting MIPSr6 or higher make use of a compact branch in LL/SC loops, preventing the insertion of a delay slot nop that only serves to waste space. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/llsc.h | 4 1 file changed, 4 insertions(+) diff --git a/arch/mips/include/asm/llsc.h b/arch/mips/include/asm/llsc.h index 9b19f38562ac..d240a4a2d1c4 100644 --- a/arch/mips/include/asm/llsc.h +++ b/arch/mips/include/asm/llsc.h @@ -9,6 +9,8 @@ #ifndef __ASM_LLSC_H #define __ASM_LLSC_H +#include + #if _MIPS_SZLONG == 32 #define SZLONG_LOG 5 #define SZLONG_MASK 31UL @@ -32,6 +34,8 @@ */ #if R1_LLSC_WAR # define __SC_BEQZ "beqzl " +#elif MIPS_ISA_REV >= 6 +# define __SC_BEQZ "beqzc " #else # define __SC_BEQZ "beqz " #endif -- 2.23.0
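To show how the added #elif slots into the existing selection, here is a small stand-alone sketch of the same three-way choice. The DEMO_ names and their values are assumptions made for the example (workaround off, targeting ISA revision 6); only the branch mnemonics come from the diff above.

```c
#include <assert.h>
#include <string.h>

/* Assumed demo configuration: no LL/SC workaround, targeting MIPSr6. */
#define DEMO_LLSC_WAR		0
#define DEMO_TARGET_ISA_REV	6

#if DEMO_LLSC_WAR
# define DEMO_SC_BEQZ	"beqzl "	/* branch-likely for the workaround */
#elif DEMO_TARGET_ISA_REV >= 6
# define DEMO_SC_BEQZ	"beqzc "	/* compact branch, no delay slot */
#else
# define DEMO_SC_BEQZ	"beqz "		/* classic branch with delay slot */
#endif

/* Builds the loop-closing branch the way the asm templates paste it in. */
static const char *demo_llsc_branch(void)
{
	return DEMO_SC_BEQZ "%0, 1b";
}

static void demo_branch_selfcheck(void)
{
	assert(strcmp(demo_llsc_branch(), "beqzc %0, 1b") == 0);
}
```

Because the mnemonic is a string literal, adjacent-literal concatenation lets every LL/SC loop pick up the new branch without any change to the loops themselves.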
[PATCH v2 15/36] MIPS: atomic: Deduplicate 32b & 64b read, set, xchg, cmpxchg
Remove the remaining duplication between 32b & 64b in asm/atomic.h by making use of an ATOMIC_OPS() macro to generate: - atomic_read()/atomic64_read() - atomic_set()/atomic64_set() - atomic_cmpxchg()/atomic64_cmpxchg() - atomic_xchg()/atomic64_xchg() This is consistent with the way all other functions in asm/atomic.h are generated, and ensures consistency between the 32b & 64b functions. Of note is that this results in the above now being static inline functions rather than macros. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/atomic.h | 70 +- 1 file changed, 27 insertions(+), 43 deletions(-) diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h index 96ef50fa2817..e5ac88392d1f 100644 --- a/arch/mips/include/asm/atomic.h +++ b/arch/mips/include/asm/atomic.h @@ -24,24 +24,34 @@ #include #include -#define ATOMIC_INIT(i) { (i) } +#define ATOMIC_OPS(pfx, type) \ +static __always_inline type pfx##_read(const pfx##_t *v) \ +{ \ + return READ_ONCE(v->counter); \ +} \ + \ +static __always_inline void pfx##_set(pfx##_t *v, type i) \ +{ \ + WRITE_ONCE(v->counter, i); \ +} \ + \ +static __always_inline type pfx##_cmpxchg(pfx##_t *v, type o, type n) \ +{ \ + return cmpxchg(&v->counter, o, n); \ +} \ + \ +static __always_inline type pfx##_xchg(pfx##_t *v, type n) \ +{ \ + return xchg(&v->counter, n);\ +} -/* - * atomic_read - read atomic variable - * @v: pointer of type atomic_t - * - * Atomically reads the value of @v. - */ -#define atomic_read(v) READ_ONCE((v)->counter) +#define ATOMIC_INIT(i) { (i) } +ATOMIC_OPS(atomic, int) -/* - * atomic_set - set atomic variable - * @v: pointer of type atomic_t - * @i: required value - * - * Atomically sets the value of @v to @i. 
- */ -#define atomic_set(v, i) WRITE_ONCE((v)->counter, (i)) +#ifdef CONFIG_64BIT +# define ATOMIC64_INIT(i) { (i) } +ATOMIC_OPS(atomic64, s64) +#endif #define ATOMIC_OP(pfx, op, type, c_op, asm_op, ll, sc) \ static __inline__ void pfx##_##op(type i, pfx##_t * v) \ @@ -135,6 +145,7 @@ static __inline__ type pfx##_fetch_##op##_relaxed(type i, pfx##_t * v) \ return result; \ } +#undef ATOMIC_OPS #define ATOMIC_OPS(pfx, op, type, c_op, asm_op, ll, sc) \ ATOMIC_OP(pfx, op, type, c_op, asm_op, ll, sc) \ ATOMIC_OP_RETURN(pfx, op, type, c_op, asm_op, ll, sc) \ @@ -254,31 +265,4 @@ ATOMIC_SIP_OP(atomic64, s64, dsubu, lld, scd) #undef ATOMIC_SIP_OP -#define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n))) -#define atomic_xchg(v, new) (xchg(&((v)->counter), (new))) - -#ifdef CONFIG_64BIT - -#define ATOMIC64_INIT(i){ (i) } - -/* - * atomic64_read - read atomic variable - * @v: pointer of type atomic64_t - * - */ -#define atomic64_read(v) READ_ONCE((v)->counter) - -/* - * atomic64_set - set atomic variable - * @v: pointer of type atomic64_t - * @i: required value - */ -#define atomic64_set(v, i) WRITE_ONCE((v)->counter, (i)) - -#define atomic64_cmpxchg(v, o, n) \ - ((__typeof__((v)->counter))cmpxchg(&((v)->counter), (o), (n))) -#define atomic64_xchg(v, new) (xchg(&((v)->counter), (new))) - -#endif /* CONFIG_64BIT */ - #endif /* _ASM_ATOMIC_H */ -- 2.23.0
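The "one macro generates both the 32b and 64b functions" idea from this patch can be tried outside the kernel with the sketch below. The demo_ types and DEMO_ATOMIC_OPS() are hypothetical, the volatile accesses are only a rough user-space approximation of READ_ONCE()/WRITE_ONCE(), and the cmpxchg/xchg variants are left out because they need real atomic primitives.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counterparts of atomic_t and atomic64_t. */
typedef struct { int counter; } demo_atomic_t;
typedef struct { int64_t counter; } demo_atomic64_t;

/* Generates pfx##_read() and pfx##_set() as typed static inline functions. */
#define DEMO_ATOMIC_OPS(pfx, type)					\
static inline type pfx##_read(const pfx##_t *v)				\
{									\
	/* volatile access: rough approximation of READ_ONCE() */	\
	return *(const volatile type *)&v->counter;			\
}									\
									\
static inline void pfx##_set(pfx##_t *v, type i)			\
{									\
	/* volatile access: rough approximation of WRITE_ONCE() */	\
	*(volatile type *)&v->counter = i;				\
}

DEMO_ATOMIC_OPS(demo_atomic, int)
DEMO_ATOMIC_OPS(demo_atomic64, int64_t)
#undef DEMO_ATOMIC_OPS

static void demo_atomic_selfcheck(void)
{
	demo_atomic_t a = { 0 };
	demo_atomic64_t b = { 0 };

	demo_atomic_set(&a, 5);
	demo_atomic64_set(&b, (int64_t)1 << 40);
	assert(demo_atomic_read(&a) == 5);
	assert(demo_atomic64_read(&b) == ((int64_t)1 << 40));
}
```

One practical consequence of functions replacing macros, as the commit message notes, is that the argument types are now checked: passing a demo_atomic64_t pointer to demo_atomic_read() draws a compiler diagnostic where a macro would have accepted it.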
[PATCH v2 10/36] MIPS: atomic: Handle !kernel_uses_llsc first
Handle the !kernel_uses_llsc path first in our ATOMIC_OP(), ATOMIC_OP_RETURN() & ATOMIC_FETCH_OP() macros & return from within the block. This allows us to de-indent the kernel_uses_llsc path by one level which will be useful when making further changes. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/atomic.h | 99 +- 1 file changed, 49 insertions(+), 50 deletions(-) diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h index 2d2a8a74c51b..ace2ea005588 100644 --- a/arch/mips/include/asm/atomic.h +++ b/arch/mips/include/asm/atomic.h @@ -45,51 +45,36 @@ #define ATOMIC_OP(op, c_op, asm_op)\ static __inline__ void atomic_##op(int i, atomic_t * v) \ { \ - if (kernel_uses_llsc) { \ - int temp; \ + int temp; \ \ - loongson_llsc_mb(); \ - __asm__ __volatile__( \ - " .setpush\n" \ - " .set"MIPS_ISA_LEVEL"\n" \ - "1: ll %0, %1 # atomic_" #op "\n" \ - " " #asm_op " %0, %2 \n" \ - " sc %0, %1 \n" \ - "\t" __SC_BEQZ "%0, 1b \n" \ - " .setpop \n" \ - : "=" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)\ - : "Ir" (i) : __LLSC_CLOBBER); \ - } else {\ + if (!kernel_uses_llsc) {\ unsigned long flags;\ \ raw_local_irq_save(flags); \ v->counter c_op i; \ raw_local_irq_restore(flags); \ + return; \ } \ + \ + loongson_llsc_mb(); \ + __asm__ __volatile__( \ + " .setpush\n" \ + " .set" MIPS_ISA_LEVEL " \n" \ + "1: ll %0, %1 # atomic_" #op "\n" \ + " " #asm_op " %0, %2 \n" \ + " sc %0, %1 \n" \ + "\t" __SC_BEQZ "%0, 1b \n" \ + " .setpop \n" \ + : "=" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)\ + : "Ir" (i) : __LLSC_CLOBBER); \ } #define ATOMIC_OP_RETURN(op, c_op, asm_op) \ static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \ { \ - int result; \ - \ - if (kernel_uses_llsc) { \ - int temp; \ + int temp, result; \ \ - loongson_llsc_mb(); \ - __asm__ __volatile__( \ - " .setpush\n" \ - " .set"MIPS_ISA_LEVEL"\n" \ - "1: ll %1, %2 # atomic_" #op "_return \n" \ - " " #asm_op " %0, %1, %3
[PATCH v2 23/36] MIPS: bitops: Avoid redundant zero-comparison for non-LLSC
The IRQ-disabling non-LLSC fallbacks for bitops on UP systems already return a zero or one, so there's no need to perform another comparison against zero. Move these comparisons into the LLSC paths to avoid the redundant work. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/bitops.h | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 0f8ff896e86b..7671db2a7b73 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -264,6 +264,8 @@ static inline int test_and_set_bit_lock(unsigned long nr, : "=" (temp), "+m" (*m), "=" (res) : "ir" (BIT(bit)) : __LLSC_CLOBBER); + + res = res != 0; } else { loongson_llsc_mb(); do { @@ -279,12 +281,12 @@ static inline int test_and_set_bit_lock(unsigned long nr, : __LLSC_CLOBBER); } while (unlikely(!res)); - res = temp & BIT(bit); + res = (temp & BIT(bit)) != 0; } smp_llsc_mb(); - return res != 0; + return res; } /* @@ -335,6 +337,8 @@ static inline int test_and_clear_bit(unsigned long nr, : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) : "ir" (BIT(bit)) : __LLSC_CLOBBER); + + res = res != 0; } else if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(nr)) { loongson_llsc_mb(); do { @@ -363,12 +367,12 @@ static inline int test_and_clear_bit(unsigned long nr, : __LLSC_CLOBBER); } while (unlikely(!res)); - res = temp & BIT(bit); + res = (temp & BIT(bit)) != 0; } smp_llsc_mb(); - return res != 0; + return res; } /* @@ -403,6 +407,8 @@ static inline int test_and_change_bit(unsigned long nr, : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) : "ir" (BIT(bit)) : __LLSC_CLOBBER); + + res = res != 0; } else { loongson_llsc_mb(); do { @@ -418,12 +424,12 @@ static inline int test_and_change_bit(unsigned long nr, : __LLSC_CLOBBER); } while (unlikely(!res)); - res = temp & BIT(bit); + res = (temp & BIT(bit)) != 0; } smp_llsc_mb(); - return res != 0; + return res; } #include -- 2.23.0
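The pattern this patch moves into each LL/SC path, normalising the masked old value to 0 or 1 before the common return, looks like this in a plain, non-atomic model. demo_test_and_set_bit() and DEMO_BIT() are illustrative names only; no LL/SC loop or barrier is modelled here.

```c
#include <assert.h>

#define DEMO_BIT(nr)	(1UL << (nr))

/* Non-atomic model of test_and_set_bit(): returns the old bit as 0 or 1. */
static int demo_test_and_set_bit(unsigned int bit, unsigned long *word)
{
	unsigned long old = *word;
	int res;

	*word = old | DEMO_BIT(bit);

	/* Compare inside this path, so the shared return needs no "!= 0". */
	res = (old & DEMO_BIT(bit)) != 0;

	return res;
}

static void demo_bitop_selfcheck(void)
{
	unsigned long w = 0;

	assert(demo_test_and_set_bit(20, &w) == 0);	/* bit was clear */
	assert(demo_test_and_set_bit(20, &w) == 1);	/* bit is now set */
	assert(w == DEMO_BIT(20));
}
```

A path that already produces 0 or 1, as the IRQ-disabling fallback is said to do, can then return its result unchanged.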
[PATCH v2 32/36] MIPS: barrier: Remove loongson_llsc_mb()
The loongson_llsc_mb() macro is no longer used - instead barriers are emitted as part of inline asm using the __SYNC() macro. Remove the now-defunct loongson_llsc_mb() macro. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/barrier.h | 40 - arch/mips/loongson64/Platform | 2 +- 2 files changed, 1 insertion(+), 41 deletions(-) diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h index 133afd565067..6d92d5ccdafa 100644 --- a/arch/mips/include/asm/barrier.h +++ b/arch/mips/include/asm/barrier.h @@ -122,46 +122,6 @@ static inline void wmb(void) #define __smp_mb__before_atomic() __smp_mb__before_llsc() #define __smp_mb__after_atomic() smp_llsc_mb() -/* - * Some Loongson 3 CPUs have a bug wherein execution of a memory access (load, - * store or prefetch) in between an LL & SC can cause the SC instruction to - * erroneously succeed, breaking atomicity. Whilst it's unusual to write code - * containing such sequences, this bug bites harder than we might otherwise - * expect due to reordering & speculation: - * - * 1) A memory access appearing prior to the LL in program order may actually - *be executed after the LL - this is the reordering case. - * - *In order to avoid this we need to place a memory barrier (ie. a SYNC - *instruction) prior to every LL instruction, in between it and any earlier - *memory access instructions. - * - *This reordering case is fixed by 3A R2 CPUs, ie. 3A2000 models and later. - * - * 2) If a conditional branch exists between an LL & SC with a target outside - *of the LL-SC loop, for example an exit upon value mismatch in cmpxchg() - *or similar, then misprediction of the branch may allow speculative - *execution of memory accesses from outside of the LL-SC loop. - * - *In order to avoid this we need a memory barrier (ie. a SYNC instruction) - *at each affected branch target, for which we also use loongson_llsc_mb() - *defined below. - * - *This case affects all current Loongson 3 CPUs. 
- * - * The above described cases cause an error in the cache coherence protocol; - * such that the Invalidate of a competing LL-SC goes 'missing' and SC - * erroneously observes its core still has Exclusive state and lets the SC - * proceed. - * - * Therefore the error only occurs on SMP systems. - */ -#ifdef CONFIG_CPU_LOONGSON3_WORKAROUNDS /* Loongson-3's LLSC workaround */ -#define loongson_llsc_mb() __asm__ __volatile__("sync" : : :"memory") -#else -#define loongson_llsc_mb() do { } while (0) -#endif - static inline void sync_ginv(void) { asm volatile(__SYNC(ginv, always)); diff --git a/arch/mips/loongson64/Platform b/arch/mips/loongson64/Platform index c1a4d4dc4665..28172500f95a 100644 --- a/arch/mips/loongson64/Platform +++ b/arch/mips/loongson64/Platform @@ -27,7 +27,7 @@ cflags-$(CONFIG_CPU_LOONGSON3)+= -Wa,--trap # # Some versions of binutils, not currently mainline as of 2019/02/04, support # an -mfix-loongson3-llsc flag which emits a sync prior to each ll instruction -# to work around a CPU bug (see loongson_llsc_mb() in asm/barrier.h for a +# to work around a CPU bug (see __SYNC_loongson3_war in asm/sync.h for a # description). # # We disable this in order to prevent the assembler meddling with the -- 2.23.0
[PATCH v2 30/36] MIPS: futex: Emit Loongson3 sync workarounds within asm
Generate the sync instructions required to workaround Loongson3 LL/SC errata within inline asm blocks, which feels a little safer than doing it from C where strictly speaking the compiler would be well within its rights to insert a memory access between the separate asm statements we previously had, containing sync & ll instructions respectively. Signed-off-by: Paul Burton --- Changes in v2: - De-string __WEAK_LLSC_MB to allow its use with __SYNC_ELSE(). arch/mips/include/asm/barrier.h | 13 +++-- arch/mips/include/asm/futex.h | 15 +++ 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h index c7e05e832da9..133afd565067 100644 --- a/arch/mips/include/asm/barrier.h +++ b/arch/mips/include/asm/barrier.h @@ -95,13 +95,14 @@ static inline void wmb(void) * ordering will be done by smp_llsc_mb() and friends. */ #if defined(CONFIG_WEAK_REORDERING_BEYOND_LLSC) && defined(CONFIG_SMP) -#define __WEAK_LLSC_MB " sync\n" -#define smp_llsc_mb() __asm__ __volatile__(__WEAK_LLSC_MB : : :"memory") -#define __LLSC_CLOBBER +# define __WEAK_LLSC_MBsync +# define smp_llsc_mb() \ + __asm__ __volatile__(__stringify(__WEAK_LLSC_MB) : : :"memory") +# define __LLSC_CLOBBER #else -#define __WEAK_LLSC_MB " \n" -#define smp_llsc_mb() do { } while (0) -#define __LLSC_CLOBBER "memory" +# define __WEAK_LLSC_MB +# define smp_llsc_mb() do { } while (0) +# define __LLSC_CLOBBER"memory" #endif #ifdef CONFIG_CPU_CAVIUM_OCTEON diff --git a/arch/mips/include/asm/futex.h b/arch/mips/include/asm/futex.h index b83b0397462d..54cf20530931 100644 --- a/arch/mips/include/asm/futex.h +++ b/arch/mips/include/asm/futex.h @@ -16,6 +16,7 @@ #include #include #include +#include #include #define __futex_atomic_op(insn, ret, oldval, uaddr, oparg) \ @@ -32,7 +33,7 @@ " .setarch=r4000 \n" \ "2: sc $1, %2 \n" \ " beqzl $1, 1b \n" \ - __WEAK_LLSC_MB \ + __stringify(__WEAK_LLSC_MB) \ "3: \n" \ " .insn \n" \ " .setpop \n" \ @@ -50,19 +51,19 @@ "i" 
(-EFAULT) \ : "memory");\ } else if (cpu_has_llsc) { \ - loongson_llsc_mb(); \ __asm__ __volatile__( \ " .setpush\n" \ " .setnoat\n" \ " .setpush\n" \ " .set"MIPS_ISA_ARCH_LEVEL" \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: "user_ll("%1", "%4")" # __futex_atomic_op\n"\ " .setpop \n" \ " " insn " \n" \ " .set"MIPS_ISA_ARCH_LEVEL" \n" \ "2: "user_sc("$1", "%2")" \n" \ " beqz$1, 1b \n" \ - __WEAK_LLSC_MB \ + __stringify(__WEAK_LLSC_MB) \ "3: \n" \ " .insn \n" \ " .setpop \n" \ @@ -147,7 +148,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, " .setarch=r4000 \n" "2: sc $1, %2 \n" " beqzl $1, 1b \n" - __WEAK_LLSC_MB + __stringify(__WEAK_LLSC_MB) &
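The "de-string" change noted for v2 can be illustrated with a small standalone sketch. The SK_ names below are hypothetical, and SK_STRINGIFY() is a local two-level macro written for this example in the same spirit as the kernel's __stringify(), not a copy of it. Keeping the barrier as a bare token lets other macros consume it, while string pasting still works wherever asm text is needed.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification so that the argument is expanded first. */
#define SK_STRINGIFY_1(...)	#__VA_ARGS__
#define SK_STRINGIFY(...)	SK_STRINGIFY_1(__VA_ARGS__)

/* The barrier as a bare token (weak ordering) or as nothing at all. */
#define SK_MB_WEAK	sync
#define SK_MB_NONE

/* An empty macro stringifies to "", which is one byte: the terminator. */
static_assert(sizeof(SK_STRINGIFY(SK_MB_NONE)) == 1,
	      "empty barrier must stringify to an empty string");

/* Tail of an LL/SC loop with the barrier pasted in as asm text. */
static const char *sk_loop_tail_weak(void)
{
	return "	beqz	$1, 1b\n"
	       "	" SK_STRINGIFY(SK_MB_WEAK) "\n";
}

/* The same tail when no barrier is configured. */
static const char *sk_loop_tail_none(void)
{
	return "	beqz	$1, 1b\n"
	       "	" SK_STRINGIFY(SK_MB_NONE) "\n";
}
```

With the macro defined as a string literal instead, it could only ever be pasted into other strings, which is presumably what prevented its use as an argument to __SYNC_ELSE().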
[PATCH v2 36/36] MIPS: Check Loongson3 LL/SC errata workaround correctness
When Loongson3 LL/SC errata workarounds are enabled (ie. CONFIG_CPU_LOONGSON3_WORKAROUNDS=y) run a tool to scan through the compiled kernel & ensure that the workaround is applied correctly. That is, ensure that: - Every LL or LLD instruction is preceded by a sync instruction. - Any branches from within an LL/SC loop to outside of that loop target a sync instruction. Reasoning for these conditions can be found by reading the comment above the definition of __SYNC_loongson3_war in arch/mips/include/asm/sync.h. This tool will help ensure that we don't inadvertently introduce code paths that miss the required workarounds. Signed-off-by: Paul Burton --- Changes in v2: - Only try to build loongson3-llsc-check from arch/mips/Makefile when CONFIG_CPU_LOONGSON3_WORKAROUNDS is enabled. arch/mips/Makefile | 3 + arch/mips/Makefile.postlink| 10 +- arch/mips/tools/.gitignore | 1 + arch/mips/tools/Makefile | 5 + arch/mips/tools/loongson3-llsc-check.c | 307 + 5 files changed, 325 insertions(+), 1 deletion(-) create mode 100644 arch/mips/tools/loongson3-llsc-check.c diff --git a/arch/mips/Makefile b/arch/mips/Makefile index cdc09b71febe..0a5eab626260 100644 --- a/arch/mips/Makefile +++ b/arch/mips/Makefile @@ -14,6 +14,9 @@ archscripts: scripts_basic $(Q)$(MAKE) $(build)=arch/mips/tools elf-entry +ifeq ($(CONFIG_CPU_LOONGSON3_WORKAROUNDS),y) + $(Q)$(MAKE) $(build)=arch/mips/tools loongson3-llsc-check +endif $(Q)$(MAKE) $(build)=arch/mips/boot/tools relocs KBUILD_DEFCONFIG := 32r2el_defconfig diff --git a/arch/mips/Makefile.postlink b/arch/mips/Makefile.postlink index 4eea4188cb20..f03fdc95143e 100644 --- a/arch/mips/Makefile.postlink +++ b/arch/mips/Makefile.postlink @@ -3,7 +3,8 @@ # Post-link MIPS pass # === # -# 1. Insert relocations into vmlinux +# 1. Check that Loongson3 LL/SC workarounds are applied correctly +# 2. 
Insert relocations into vmlinux PHONY := __archpost __archpost: @@ -11,6 +12,10 @@ __archpost: -include include/config/auto.conf include scripts/Kbuild.include +CMD_LS3_LLSC = arch/mips/tools/loongson3-llsc-check +quiet_cmd_ls3_llsc = LLSCCHK $@ + cmd_ls3_llsc = $(CMD_LS3_LLSC) $@ + CMD_RELOCS = arch/mips/boot/tools/relocs quiet_cmd_relocs = RELOCS $@ cmd_relocs = $(CMD_RELOCS) $@ @@ -19,6 +24,9 @@ quiet_cmd_relocs = RELOCS $@ vmlinux: FORCE @true +ifeq ($(CONFIG_CPU_LOONGSON3_WORKAROUNDS),y) + $(call if_changed,ls3_llsc) +endif ifeq ($(CONFIG_RELOCATABLE),y) $(call if_changed,relocs) endif diff --git a/arch/mips/tools/.gitignore b/arch/mips/tools/.gitignore index 56d34ce4..b0209450d9ff 100644 --- a/arch/mips/tools/.gitignore +++ b/arch/mips/tools/.gitignore @@ -1 +1,2 @@ elf-entry +loongson3-llsc-check diff --git a/arch/mips/tools/Makefile b/arch/mips/tools/Makefile index 3baee4bc6775..aaef688749f5 100644 --- a/arch/mips/tools/Makefile +++ b/arch/mips/tools/Makefile @@ -3,3 +3,8 @@ hostprogs-y := elf-entry PHONY += elf-entry elf-entry: $(obj)/elf-entry @: + +hostprogs-$(CONFIG_CPU_LOONGSON3_WORKAROUNDS) += loongson3-llsc-check +PHONY += loongson3-llsc-check +loongson3-llsc-check: $(obj)/loongson3-llsc-check + @: diff --git a/arch/mips/tools/loongson3-llsc-check.c b/arch/mips/tools/loongson3-llsc-check.c new file mode 100644 index ..0ebddd0ae46f --- /dev/null +++ b/arch/mips/tools/loongson3-llsc-check.c @@ -0,0 +1,307 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#ifdef be32toh +/* If libc provides le{16,32,64}toh() then we'll use them */ +#elif BYTE_ORDER == LITTLE_ENDIAN +# define le16toh(x)(x) +# define le32toh(x)(x) +# define le64toh(x)(x) +#elif BYTE_ORDER == BIG_ENDIAN +# define le16toh(x)bswap_16(x) +# define le32toh(x)bswap_32(x) +# define le64toh(x)bswap_64(x) +#endif + +/* MIPS opcodes, in bits 31:26 of an 
instruction */ +#define OP_SPECIAL 0x00 +#define OP_REGIMM 0x01 +#define OP_BEQ 0x04 +#define OP_BNE 0x05 +#define OP_BLEZ0x06 +#define OP_BGTZ0x07 +#define OP_BEQL0x14 +#define OP_BNEL0x15 +#define OP_BLEZL 0x16 +#define OP_BGTZL 0x17 +#define OP_LL 0x30 +#define OP_LLD 0x34 +#define OP_SC 0x38 +#define OP_SCD 0x3c + +/* Bits 20:16 of OP_REGIMM instructions */ +#define REGIMM_BLTZ0x00 +#define REGIMM_BGEZ0x01 +#define REGIMM_BLTZL 0x02 +#define REGIMM_BGEZL 0x03 +#define REGIMM_BLTZAL 0x10 +#define REGIMM_BGEZAL 0x11 +#define REGIMM_BLTZALL 0x12 +#define REGIMM_BGEZALL 0x13 + +/* Bits 5:0 of OP_SPECIAL instructions */ +#define SPECIAL_SYNC 0x0f + +static void usage(
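The first condition from the commit message (every LL or LLD is preceded by a sync) can be sketched on its own. The following is a simplified, hypothetical stand-in and not the contents of loongson3-llsc-check.c: it reuses the opcode values defined above, works on an in-memory array of already byte-swapped instruction words instead of an ELF file, ignores the branch-target condition, and assumes the sync has to be the instruction immediately before the LL or LLD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in the patch above */
#define OP_SPECIAL	0x00
#define OP_LL		0x30
#define OP_LLD		0x34
#define SPECIAL_SYNC	0x0f

/* Major opcode, bits 31:26 of an instruction */
static uint32_t sk_opcode(uint32_t insn)
{
	return insn >> 26;
}

static bool sk_is_ll(uint32_t insn)
{
	return sk_opcode(insn) == OP_LL || sk_opcode(insn) == OP_LLD;
}

/* SYNC is an OP_SPECIAL instruction whose function field (bits 5:0) is 0x0f */
static bool sk_is_sync(uint32_t insn)
{
	return sk_opcode(insn) == OP_SPECIAL && (insn & 0x3f) == SPECIAL_SYNC;
}

/*
 * Return the index of the first LL/LLD that is not immediately preceded by
 * a sync, or -1 if every LL/LLD in the buffer is guarded.
 */
static long sk_first_unguarded_ll(const uint32_t *code, size_t count)
{
	size_t i;

	assert(code != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (!sk_is_ll(code[i]))
			continue;
		if (i == 0 || !sk_is_sync(code[i - 1]))
			return (long)i;
	}

	return -1;
}
```

For example, the sequence 0x0000000f (sync), 0xc1490000 (ll t1, 0(t2)), 0xe1490000 (sc t1, 0(t2)) passes, while the same sequence with a nop (0x00000000) in place of the sync is reported at index 1.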
[PATCH v2 21/36] MIPS: bitops: Allow immediates in test_and_{set,clear,change}_bit
The logical operations or & xor used in the test_and_set_bit_lock(), test_and_clear_bit() & test_and_change_bit() functions currently force the value 1< --- Changes in v2: None arch/mips/include/asm/bitops.h | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index ea35a2e87b6d..7314ba5a3683 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -261,7 +261,7 @@ static inline int test_and_set_bit_lock(unsigned long nr, " and %2, %0, %3 \n" " .setpop \n" : "=" (temp), "+m" (*m), "=" (res) - : "r" (1UL << bit) + : "ir" (1UL << bit) : __LLSC_CLOBBER); } else { loongson_llsc_mb(); @@ -274,7 +274,7 @@ static inline int test_and_set_bit_lock(unsigned long nr, " " __SC "%2, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) - : "r" (1UL << bit) + : "ir" (1UL << bit) : __LLSC_CLOBBER); } while (unlikely(!res)); @@ -332,7 +332,7 @@ static inline int test_and_clear_bit(unsigned long nr, " and %2, %0, %3 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) - : "r" (1UL << bit) + : "ir" (1UL << bit) : __LLSC_CLOBBER); } else if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(nr)) { loongson_llsc_mb(); @@ -358,7 +358,7 @@ static inline int test_and_clear_bit(unsigned long nr, " " __SC "%2, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) - : "r" (1UL << bit) + : "ir" (1UL << bit) : __LLSC_CLOBBER); } while (unlikely(!res)); @@ -400,7 +400,7 @@ static inline int test_and_change_bit(unsigned long nr, " and %2, %0, %3 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) - : "r" (1UL << bit) + : "ir" (1UL << bit) : __LLSC_CLOBBER); } else { loongson_llsc_mb(); @@ -413,7 +413,7 @@ static inline int test_and_change_bit(unsigned long nr, " " __SC "\t%2, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) - : "r" (1UL << bit) + : "ir" (1UL << bit) : __LLSC_CLOBBER); } 
while (unlikely(!res)); -- 2.23.0
[PATCH v2 31/36] MIPS: syscall: Emit Loongson3 sync workarounds within asm
Generate the sync instructions required to work around Loongson3 LL/SC
errata within inline asm blocks, which feels a little safer than doing it
from C where strictly speaking the compiler would be well within its rights
to insert a memory access between the separate asm statements we previously
had, containing sync & ll instructions respectively.

Signed-off-by: Paul Burton
---
Changes in v2: None

 arch/mips/kernel/syscall.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/mips/kernel/syscall.c b/arch/mips/kernel/syscall.c
index b0e25e913bdb..3ea288ca35f1 100644
--- a/arch/mips/kernel/syscall.c
+++ b/arch/mips/kernel/syscall.c
@@ -37,6 +37,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -132,12 +133,12 @@ static inline int mips_atomic_set(unsigned long addr, unsigned long new)
 		  [efault] "i" (-EFAULT)
 		: "memory");
 	} else if (cpu_has_llsc) {
-		loongson_llsc_mb();
 		__asm__ __volatile__ (
 		"	.set	push					\n"
 		"	.set	"MIPS_ISA_ARCH_LEVEL"			\n"
 		"	li	%[err], 0				\n"
 		"1:							\n"
+		"	" __SYNC(full, loongson3_war) "			\n"
 		user_ll("%[old]", "(%[addr])")
 		"	move	%[tmp], %[new]				\n"
 		"2:							\n"
-- 
2.23.0
[PATCH v2 16/36] MIPS: bitops: Handle !kernel_uses_llsc first
Reorder conditions in our various bitops functions that check kernel_uses_llsc such that they handle the !kernel_uses_llsc case first. This allows us to avoid the need to duplicate the kernel_uses_llsc check in all the other cases. For functions that don't involve barriers common to the various implementations, we switch to returning from within each if block making each case easier to read in isolation. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/bitops.h | 213 - 1 file changed, 105 insertions(+), 108 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 985d6a02f9ea..e300960717e0 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -52,11 +52,16 @@ int __mips_test_and_change_bit(unsigned long nr, */ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG); + unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); int bit = nr & SZLONG_MASK; unsigned long temp; - if (kernel_uses_llsc && R1_LLSC_WAR) { + if (!kernel_uses_llsc) { + __mips_set_bit(nr, addr); + return; + } + + if (R1_LLSC_WAR) { __asm__ __volatile__( " .setpush\n" " .setarch=r4000 \n" @@ -68,8 +73,11 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) : "=" (temp), "=" GCC_OFF_SMALL_ASM() (*m) : "ir" (1UL << bit), GCC_OFF_SMALL_ASM() (*m) : __LLSC_CLOBBER); + return; + } + #if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6) - } else if (kernel_uses_llsc && __builtin_constant_p(bit)) { + if (__builtin_constant_p(bit)) { loongson_llsc_mb(); do { __asm__ __volatile__( @@ -80,23 +88,23 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) : "ir" (bit), "r" (~0) : __LLSC_CLOBBER); } while (unlikely(!temp)); + return; + } #endif /* CONFIG_CPU_MIPSR2 || CONFIG_CPU_MIPSR6 */ - } else if (kernel_uses_llsc) { - loongson_llsc_mb(); - do { - __asm__ 
__volatile__( - " .setpush\n" - " .set"MIPS_ISA_ARCH_LEVEL" \n" - " " __LL "%0, %1 # set_bit \n" - " or %0, %2 \n" - " " __SC "%0, %1 \n" - " .setpop \n" - : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) - : "ir" (1UL << bit) - : __LLSC_CLOBBER); - } while (unlikely(!temp)); - } else - __mips_set_bit(nr, addr); + + loongson_llsc_mb(); + do { + __asm__ __volatile__( + " .setpush\n" + " .set"MIPS_ISA_ARCH_LEVEL" \n" + " " __LL "%0, %1 # set_bit \n" + " or %0, %2 \n" + " " __SC "%0, %1 \n" + " .setpop \n" + : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) + : "ir" (1UL << bit) + : __LLSC_CLOBBER); + } while (unlikely(!temp)); } /* @@ -111,11 +119,16 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) */ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG); + unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); int bit = nr & SZLONG_MASK; unsigned long temp; - if (kernel_uses_llsc && R1_LLSC_WAR) { + if (!kernel_uses_llsc) { + __mips_clear_bit(nr, addr); + return; + } + + if (R1_LLSC_WAR) { __asm__ __volatile__( " .setpush
[PATCH v2 27/36] MIPS: bitops: Use smp_mb__before_atomic in test_* ops
Use smp_mb__before_atomic() rather than smp_mb__before_llsc() in
test_and_set_bit(), test_and_clear_bit() & test_and_change_bit(). The
_atomic() versions make semantic sense in these cases, and will allow a
later patch to omit redundant barriers for Loongson3 systems that already
include a barrier within __test_bit_op().

Signed-off-by: Paul Burton
---
Changes in v2: None

 arch/mips/include/asm/bitops.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index c08b6d225f10..a74769940fbd 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -209,7 +209,7 @@ static inline int test_and_set_bit_lock(unsigned long nr,
 static inline int test_and_set_bit(unsigned long nr,
 	volatile unsigned long *addr)
 {
-	smp_mb__before_llsc();
+	smp_mb__before_atomic();
 	return test_and_set_bit_lock(nr, addr);
 }
@@ -228,7 +228,7 @@ static inline int test_and_clear_bit(unsigned long nr,
 	int bit = nr % BITS_PER_LONG;
 	unsigned long res, orig;
 
-	smp_mb__before_llsc();
+	smp_mb__before_atomic();
 
 	if (!kernel_uses_llsc) {
 		res = __mips_test_and_clear_bit(nr, addr);
@@ -265,7 +265,7 @@ static inline int test_and_change_bit(unsigned long nr,
 	int bit = nr % BITS_PER_LONG;
 	unsigned long res, orig;
 
-	smp_mb__before_llsc();
+	smp_mb__before_atomic();
 
 	if (!kernel_uses_llsc) {
 		res = __mips_test_and_change_bit(nr, addr);
-- 
2.23.0
[PATCH v2 33/36] MIPS: barrier: Make __smp_mb__before_atomic() a no-op for Loongson3
Loongson3 systems with CONFIG_CPU_LOONGSON3_WORKAROUNDS enabled already
emit a full completion barrier as part of the inline assembly containing
LL/SC loops for atomic operations. As such the barrier emitted by
__smp_mb__before_atomic() is redundant, and we can remove it.

Signed-off-by: Paul Burton
---
Changes in v2: None

 arch/mips/include/asm/barrier.h | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 6d92d5ccdafa..49ff172a72b9 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -119,7 +119,17 @@ static inline void wmb(void)
 #define nudge_writes() mb()
 #endif
 
-#define __smp_mb__before_atomic()	__smp_mb__before_llsc()
+/*
+ * In the Loongson3 LL/SC workaround case, all of our LL/SC loops already have
+ * a completion barrier immediately preceding the LL instruction. Therefore we
+ * can skip emitting a barrier from __smp_mb__before_atomic().
+ */
+#ifdef CONFIG_CPU_LOONGSON3_WORKAROUNDS
+# define __smp_mb__before_atomic()
+#else
+# define __smp_mb__before_atomic()	__smp_mb__before_llsc()
+#endif
+
 #define __smp_mb__after_atomic()	smp_llsc_mb()
 
 static inline void sync_ginv(void)
-- 
2.23.0
[PATCH v2 26/36] MIPS: bitops: Emit Loongson3 sync workarounds within asm
Generate the sync instructions required to workaround Loongson3 LL/SC errata within inline asm blocks, which feels a little safer than doing it from C where strictly speaking the compiler would be well within its rights to insert a memory access between the separate asm statements we previously had, containing sync & ll instructions respectively. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/bitops.h | 11 ++- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index d39fca2def60..c08b6d225f10 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -31,6 +31,7 @@ asm volatile( \ " .setpush\n" \ " .set" MIPS_ISA_LEVEL " \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " __LL "%0, %1 \n" \ " " insn " \n" \ " " __SC "%0, %1 \n" \ @@ -47,6 +48,7 @@ asm volatile( \ " .setpush\n" \ " .set" MIPS_ISA_LEVEL " \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " __LL ll_dst ", %2\n" \ " " insn " \n" \ " " __SC "%1, %2 \n" \ @@ -96,12 +98,10 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) } if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit) && (bit >= 16)) { - loongson_llsc_mb(); __bit_op(*m, __INS "%0, %3, %2, 1", "i"(bit), "r"(~0)); return; } - loongson_llsc_mb(); __bit_op(*m, "or\t%0, %2", "ir"(BIT(bit))); } @@ -126,12 +126,10 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr) } if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit)) { - loongson_llsc_mb(); __bit_op(*m, __INS "%0, $0, %2, 1", "i"(bit)); return; } - loongson_llsc_mb(); __bit_op(*m, "and\t%0, %2", "ir"(~BIT(bit))); } @@ -168,7 +166,6 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr) return; } - loongson_llsc_mb(); __bit_op(*m, "xor\t%0, %2", "ir"(BIT(bit))); } @@ -190,7 +187,6 @@ static inline int test_and_set_bit_lock(unsigned long nr, if (!kernel_uses_llsc) { res = 
__mips_test_and_set_bit_lock(nr, addr); } else { - loongson_llsc_mb(); orig = __test_bit_op(*m, "%0", "or\t%1, %0, %3", "ir"(BIT(bit))); @@ -237,13 +233,11 @@ static inline int test_and_clear_bit(unsigned long nr, if (!kernel_uses_llsc) { res = __mips_test_and_clear_bit(nr, addr); } else if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(nr)) { - loongson_llsc_mb(); res = __test_bit_op(*m, "%1", __EXT "%0, %1, %3, 1;" __INS "%1, $0, %3, 1", "i"(bit)); } else { - loongson_llsc_mb(); orig = __test_bit_op(*m, "%0", "or\t%1, %0, %3;" "xor\t%1, %1, %3", @@ -276,7 +270,6 @@ static inline int test_and_change_bit(unsigned long nr, if (!kernel_uses_llsc) { res = __mips_test_and_change_bit(nr, addr); } else { - loongson_llsc_mb(); orig = __test_bit_op(*m, "%0", "xor\t%1, %0, %3", "ir"(BIT(bit))); -- 2.23.0
[PATCH v2 17/36] MIPS: bitops: Only use ins for bit 16 or higher
set_bit() can set bits 0-15 using an ori instruction, rather than loading
the value -1 into a register & then using an ins instruction.

That is, rather than the following:

  li   t0, -1
  ll   t1, 0(t2)
  ins  t1, t0, 4, 1
  sc   t1, 0(t2)

We can have the simpler:

  ll   t1, 0(t2)
  ori  t1, t1, 0x10
  sc   t1, 0(t2)

The or path already allows immediates to be used, so simply restricting
the ins path to bits that don't fit in immediates is sufficient to take
advantage of this.

Signed-off-by: Paul Burton
---
Changes in v2: None

 arch/mips/include/asm/bitops.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index e300960717e0..1e5739191ddf 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -77,7 +77,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
 	}
 
 #if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6)
-	if (__builtin_constant_p(bit)) {
+	if (__builtin_constant_p(bit) && (bit >= 16)) {
 		loongson_llsc_mb();
 		do {
 			__asm__ __volatile__(
-- 
2.23.0
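The selection rule described above can be modelled in a few lines of C. This is a hedged sketch of the reasoning only: the sk_ names are hypothetical, the function models which instruction sequence one would expect for set_bit() after this patch and does not inspect what a compiler actually emits. It relies on the fact that ori takes a 16-bit zero-extended immediate, so a single-bit mask fits exactly when the bit index is below 16.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SK_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

enum sk_set_bit_path {
	SK_PATH_ORI_IMMEDIATE,	/* ll; ori with immediate; sc */
	SK_PATH_INS,		/* li -1; ll; ins; sc */
	SK_PATH_OR_REGISTER,	/* mask built in a register; ll; or; sc */
};

/* A mask fits the immediate field of ori if it needs at most 16 bits. */
static bool sk_mask_fits_ori(unsigned long mask)
{
	return mask <= 0xffffUL;
}

/* Model of the path expected for a given bit, constness and ISA revision. */
static enum sk_set_bit_path sk_choose_path(unsigned int bit,
					   bool bit_is_constant, int isa_rev)
{
	assert(bit < SK_BITS_PER_LONG);

	if (bit_is_constant && sk_mask_fits_ori(1UL << bit))
		return SK_PATH_ORI_IMMEDIATE;
	if (bit_is_constant && isa_rev >= 2)
		return SK_PATH_INS;
	return SK_PATH_OR_REGISTER;
}
```

In this model a constant bit 4 takes the ori path on any ISA revision, a constant bit 20 takes the ins path on release 2 or later, and a bit only known at run time always uses a register operand.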
[PATCH v2 25/36] MIPS: bitops: Use BIT_WORD() & BITS_PER_LONG
Rather than using custom SZLONG_LOG & SZLONG_MASK macros to shift & mask a bit index to form word & bit offsets respectively, make use of the standard BIT_WORD() & BITS_PER_LONG macros for the same purpose. volatile is added to the definition of pointers to the long-sized word we'll operate on, in order to prevent the compiler complaining that we cast away the volatile qualifier of the addr argument. This should have no effect on generated code, which in the LL/SC case is inline asm anyway & in the non-LLSC case access is constrained by compiler barriers provided by raw_local_irq_{save,restore}(). Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/bitops.h | 24 arch/mips/include/asm/llsc.h | 4 arch/mips/lib/bitops.c | 31 +-- 3 files changed, 25 insertions(+), 34 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index fba0a842b98a..d39fca2def60 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -87,8 +87,8 @@ int __mips_test_and_change_bit(unsigned long nr, */ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); - int bit = nr & SZLONG_MASK; + volatile unsigned long *m = [BIT_WORD(nr)]; + int bit = nr % BITS_PER_LONG; if (!kernel_uses_llsc) { __mips_set_bit(nr, addr); @@ -117,8 +117,8 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) */ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); - int bit = nr & SZLONG_MASK; + volatile unsigned long *m = [BIT_WORD(nr)]; + int bit = nr % BITS_PER_LONG; if (!kernel_uses_llsc) { __mips_clear_bit(nr, addr); @@ -160,8 +160,8 @@ static inline void clear_bit_unlock(unsigned long nr, volatile unsigned long *ad */ static inline void change_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long 
*)addr) + (nr >> SZLONG_LOG); - int bit = nr & SZLONG_MASK; + volatile unsigned long *m = [BIT_WORD(nr)]; + int bit = nr % BITS_PER_LONG; if (!kernel_uses_llsc) { __mips_change_bit(nr, addr); @@ -183,8 +183,8 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr) static inline int test_and_set_bit_lock(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); - int bit = nr & SZLONG_MASK; + volatile unsigned long *m = [BIT_WORD(nr)]; + int bit = nr % BITS_PER_LONG; unsigned long res, orig; if (!kernel_uses_llsc) { @@ -228,8 +228,8 @@ static inline int test_and_set_bit(unsigned long nr, static inline int test_and_clear_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); - int bit = nr & SZLONG_MASK; + volatile unsigned long *m = [BIT_WORD(nr)]; + int bit = nr % BITS_PER_LONG; unsigned long res, orig; smp_mb__before_llsc(); @@ -267,8 +267,8 @@ static inline int test_and_clear_bit(unsigned long nr, static inline int test_and_change_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); - int bit = nr & SZLONG_MASK; + volatile unsigned long *m = [BIT_WORD(nr)]; + int bit = nr % BITS_PER_LONG; unsigned long res, orig; smp_mb__before_llsc(); diff --git a/arch/mips/include/asm/llsc.h b/arch/mips/include/asm/llsc.h index d240a4a2d1c4..c49738bc3bda 100644 --- a/arch/mips/include/asm/llsc.h +++ b/arch/mips/include/asm/llsc.h @@ -12,15 +12,11 @@ #include #if _MIPS_SZLONG == 32 -#define SZLONG_LOG 5 -#define SZLONG_MASK 31UL #define __LL "ll " #define __SC "sc " #define __INS "ins" #define __EXT "ext" #elif _MIPS_SZLONG == 64 -#define SZLONG_LOG 6 -#define SZLONG_MASK 63UL #define __LL "lld" #define __SC "scd" #define __INS "dins " diff --git a/arch/mips/lib/bitops.c b/arch/mips/lib/bitops.c index fba402c0879d..116d0bd8b2ae 100644 --- a/arch/mips/lib/bitops.c 
+++ b/arch/mips/lib/bitops.c @@ -7,6 +7,7 @@ * Copyright (c) 1999, 2000 Silicon Graphics, Inc. */ #include +#include #include #include @@ -19,12 +20,11 @@ */ void __mips_set_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *a = (unsigned long *)addr; - unsigned bit = nr & SZLONG_MASK;
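The equivalence this patch relies on can be checked with a short standalone sketch. The SK_ and sk_ names are local stand-ins defined for the example (assumptions, not the kernel's headers), and the sketch assumes long is 32 or 64 bits wide, matching the two _MIPS_SZLONG cases in asm/llsc.h.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SK_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SK_BIT_WORD(nr)		((nr) / SK_BITS_PER_LONG)

/* Stand-ins for the removed macros: 5 & 31UL, or 6 & 63UL */
#define SK_SZLONG_LOG		(SK_BITS_PER_LONG == 64 ? 6 : 5)
#define SK_SZLONG_MASK		(SK_BITS_PER_LONG - 1)

static_assert(sizeof(unsigned long) * CHAR_BIT == 32 ||
	      sizeof(unsigned long) * CHAR_BIT == 64,
	      "sketch assumes a 32-bit or 64-bit long");

/* Old form: shift & mask by hand */
static volatile unsigned long *sk_word_old(volatile unsigned long *addr,
					   unsigned long nr)
{
	return addr + (nr >> SK_SZLONG_LOG);
}

static int sk_bit_old(unsigned long nr)
{
	return nr & SK_SZLONG_MASK;
}

/* New form: index with BIT_WORD(), reduce modulo BITS_PER_LONG */
static volatile unsigned long *sk_word_new(volatile unsigned long *addr,
					   unsigned long nr)
{
	return &addr[SK_BIT_WORD(nr)];
}

static int sk_bit_new(unsigned long nr)
{
	return nr % SK_BITS_PER_LONG;
}

/* True if both forms pick the same word and the same bit for nr. */
static bool sk_forms_agree(volatile unsigned long *addr, unsigned long nr)
{
	return sk_word_old(addr, nr) == sk_word_new(addr, nr) &&
	       sk_bit_old(nr) == sk_bit_new(nr);
}
```

Because BITS_PER_LONG is a power of two, the division and modulo reduce to the same shift and mask, which is consistent with the commit message's expectation that generated code is unaffected. Note that the pointers stay volatile-qualified throughout, so no cast is needed.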
[PATCH v2 34/36] MIPS: genex: Add Loongson3 LL/SC workaround to ejtag_debug_handler
In ejtag_debug_handler we use LL & SC instructions to acquire & release an
open-coded spinlock. For Loongson3 systems affected by LL/SC errata this
requires that we insert a sync instruction prior to the LL in order to
ensure correct behavior of the LL/SC loop.

Signed-off-by: Paul Burton
---
Changes in v2: None

 arch/mips/kernel/genex.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
index efde27c99414..ac4f2b835165 100644
--- a/arch/mips/kernel/genex.S
+++ b/arch/mips/kernel/genex.S
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -353,6 +354,7 @@ NESTED(ejtag_debug_handler, PT_SIZE, sp)
 #ifdef CONFIG_SMP
 1:	PTR_LA	k0, ejtag_debug_buffer_spinlock
+	__SYNC(full, loongson3_war)
 	ll	k0, 0(k0)
 	bnez	k0, 1b
 	PTR_LA	k0, ejtag_debug_buffer_spinlock
-- 
2.23.0
[PATCH v2 22/36] MIPS: bitops: Use the BIT() macro
Use the BIT() macro in asm/bitops.h rather than open-coding its equivalent. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/bitops.h | 31 --- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 7314ba5a3683..0f8ff896e86b 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -13,6 +13,7 @@ #error only can be included directly #endif +#include #include #include #include @@ -70,7 +71,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) " beqzl %0, 1b \n" " .setpop \n" : "=" (temp), "=" GCC_OFF_SMALL_ASM() (*m) - : "ir" (1UL << bit), GCC_OFF_SMALL_ASM() (*m) + : "ir" (BIT(bit)), GCC_OFF_SMALL_ASM() (*m) : __LLSC_CLOBBER); return; } @@ -99,7 +100,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) " " __SC "%0, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) - : "ir" (1UL << bit) + : "ir" (BIT(bit)) : __LLSC_CLOBBER); } while (unlikely(!temp)); } @@ -135,7 +136,7 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr) " beqzl %0, 1b \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) - : "ir" (~(1UL << bit)) + : "ir" (~(BIT(bit))) : __LLSC_CLOBBER); return; } @@ -164,7 +165,7 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr) " " __SC "%0, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) - : "ir" (~(1UL << bit)) + : "ir" (~(BIT(bit))) : __LLSC_CLOBBER); } while (unlikely(!temp)); } @@ -213,7 +214,7 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr) " beqzl %0, 1b \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) - : "ir" (1UL << bit) + : "ir" (BIT(bit)) : __LLSC_CLOBBER); return; } @@ -228,7 +229,7 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr) " " __SC "%0, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() 
(*m) - : "ir" (1UL << bit) + : "ir" (BIT(bit)) : __LLSC_CLOBBER); } while (unlikely(!temp)); } @@ -261,7 +262,7 @@ static inline int test_and_set_bit_lock(unsigned long nr, " and %2, %0, %3 \n" " .setpop \n" : "=" (temp), "+m" (*m), "=" (res) - : "ir" (1UL << bit) + : "ir" (BIT(bit)) : __LLSC_CLOBBER); } else { loongson_llsc_mb(); @@ -274,11 +275,11 @@ static inline int test_and_set_bit_lock(unsigned long nr, " " __SC "%2, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) - : "ir" (1UL << bit) + : "ir" (BIT(bit)) : __LLSC_CLOBBER); } while (unlikely(!res)); - res = temp & (1UL << bit); + res = temp & BIT(bit); } smp_llsc_mb(); @@ -332,7 +333,7 @@ static inline int test_and_clear_bit(unsigned long nr, "
[PATCH v2 24/36] MIPS: bitops: Abstract LL/SC loops
Introduce __bit_op() & __test_bit_op() macros which abstract away the implementation of LL/SC loops. This cuts down on a lot of duplicate boilerplate code, and also allows R1_LLSC_WAR to be handled outside of the individual bitop functions. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/bitops.h | 267 - 1 file changed, 63 insertions(+), 204 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 7671db2a7b73..fba0a842b98a 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -25,6 +25,41 @@ #include #include +#define __bit_op(mem, insn, inputs...) do {\ + unsigned long temp; \ + \ + asm volatile( \ + " .setpush\n" \ + " .set" MIPS_ISA_LEVEL " \n" \ + "1: " __LL "%0, %1 \n" \ + " " insn " \n" \ + " " __SC "%0, %1 \n" \ + " " __SC_BEQZ "%0, 1b \n" \ + " .setpop \n" \ + : "="(temp), "+" GCC_OFF_SMALL_ASM()(mem) \ + : inputs\ + : __LLSC_CLOBBER); \ +} while (0) + +#define __test_bit_op(mem, ll_dst, insn, inputs...) ({ \ + unsigned long orig, temp; \ + \ + asm volatile( \ + " .setpush\n" \ + " .set" MIPS_ISA_LEVEL " \n" \ + "1: " __LL ll_dst ", %2\n" \ + " " insn " \n" \ + " " __SC "%1, %2 \n" \ + " " __SC_BEQZ "%1, 1b \n" \ + " .setpop \n" \ + : "="(orig), "="(temp), \ + "+" GCC_OFF_SMALL_ASM()(mem) \ + : inputs\ + : __LLSC_CLOBBER); \ + \ + orig; \ +}) + /* * These are the "slower" versions of the functions and are in bitops.c. * These functions call raw_local_irq_{save,restore}(). 
@@ -54,55 +89,20 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) { unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); int bit = nr & SZLONG_MASK; - unsigned long temp; if (!kernel_uses_llsc) { __mips_set_bit(nr, addr); return; } - if (R1_LLSC_WAR) { - __asm__ __volatile__( - " .setpush\n" - " .setarch=r4000 \n" - "1: " __LL "%0, %1 # set_bit \n" - " or %0, %2 \n" - " " __SC "%0, %1 \n" - " beqzl %0, 1b \n" - " .setpop \n" - : "=" (temp), "=" GCC_OFF_SMALL_ASM() (*m) - : "ir" (BIT(bit)), GCC_OFF_SMALL_ASM() (*m) - : __LLSC_CLOBBER); - return; - } - if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit) && (bit >= 16)) { loongson_llsc_mb(); - do { - __asm__ __volatile__( - " " __LL "%0, %1 # set_bit \n" - " " __INS "%0, %3, %2, 1 \n" - " " __SC "%0, %1 \n" - : "=" (temp), "+" GCC_OFF_SM
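For readers without a MIPS toolchain, the idea of hoisting the retry loop into one macro can be sketched in portable C11. This is only an illustration under stated assumptions: the SK_* and sk_* names are hypothetical, a compare-exchange loop stands in for the LL/SC pair, and relaxed ordering is used throughout, so none of the kernel's barriers or errata workarounds are modelled.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define SK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Shared retry loop, playing the role __bit_op() plays in the patch.
 * `expr` computes the new word from the local variable `old`; it is
 * re-evaluated on every retry because a failed compare-exchange
 * refreshes `old` with the current contents of *word.
 */
#define SK_BIT_OP(word, expr) do {					\
	unsigned long old = atomic_load_explicit((word),		\
						 memory_order_relaxed);	\
	while (!atomic_compare_exchange_weak_explicit((word), &old,	\
			(expr), memory_order_relaxed,			\
			memory_order_relaxed))				\
		;							\
} while (0)

/*
 * Variant that hands the original word back through `orig`, in the
 * spirit of __test_bit_op(). `expr` is written in terms of `orig`.
 */
#define SK_TEST_BIT_OP(word, orig, expr) do {				\
	(orig) = atomic_load_explicit((word), memory_order_relaxed);	\
	while (!atomic_compare_exchange_weak_explicit((word), &(orig),	\
			(expr), memory_order_relaxed,			\
			memory_order_relaxed))				\
		;							\
} while (0)

static inline void sk_set_bit(unsigned long nr, _Atomic unsigned long *addr)
{
	_Atomic unsigned long *word = addr + nr / SK_BITS_PER_LONG;
	unsigned long mask = 1UL << (nr % SK_BITS_PER_LONG);

	SK_BIT_OP(word, old | mask);
}

static inline int sk_test_and_clear_bit(unsigned long nr,
					_Atomic unsigned long *addr)
{
	_Atomic unsigned long *word = addr + nr / SK_BITS_PER_LONG;
	unsigned long mask = 1UL << (nr % SK_BITS_PER_LONG);
	unsigned long orig;

	SK_TEST_BIT_OP(word, orig, orig & ~mask);
	return (orig & mask) != 0;
}

/* Small usage example: set a bit, then observe and clear it. */
static inline void sk_bitops_selfcheck(void)
{
	_Atomic unsigned long map[1] = { 0 };

	sk_set_bit(0, map);
	assert(sk_test_and_clear_bit(0, map) == 1);
	assert(sk_test_and_clear_bit(0, map) == 0);
}
```

Each bitop shrinks to the expression that differs between operations, which is the same saving the patch gets from passing only the instruction string to __bit_op().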
[PATCH v2 29/36] MIPS: cmpxchg: Omit redundant barriers for Loongson3
When building a kernel configured to support Loongson3 LL/SC workarounds (ie. CONFIG_CPU_LOONGSON3_WORKAROUNDS=y) the inline assembly in __xchg_asm() & __cmpxchg_asm() already emits completion barriers, and as such we don't need to emit extra barriers from the xchg() or cmpxchg() macros. Add compile-time constant checks causing us to omit the redundant memory barriers. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/cmpxchg.h | 26 +++--- 1 file changed, 23 insertions(+), 3 deletions(-) diff --git a/arch/mips/include/asm/cmpxchg.h b/arch/mips/include/asm/cmpxchg.h index fc121d20a980..820df68e32e1 100644 --- a/arch/mips/include/asm/cmpxchg.h +++ b/arch/mips/include/asm/cmpxchg.h @@ -94,7 +94,13 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x, ({ \ __typeof__(*(ptr)) __res; \ \ - smp_mb__before_llsc(); \ + /* \ +* In the Loongson3 workaround case __xchg_asm() already\ +* contains a completion barrier prior to the LL, so we don't \ +* need to emit an extra one here. \ +*/ \ + if (!__SYNC_loongson3_war) \ + smp_mb__before_llsc(); \ \ __res = (__typeof__(*(ptr)))\ __xchg((ptr), (unsigned long)(x), sizeof(*(ptr))); \ @@ -179,9 +185,23 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old, ({ \ __typeof__(*(ptr)) __res; \ \ - smp_mb__before_llsc(); \ + /* \ +* In the Loongson3 workaround case __cmpxchg_asm() already \ +* contains a completion barrier prior to the LL, so we don't \ +* need to emit an extra one here. \ +*/ \ + if (!__SYNC_loongson3_war) \ + smp_mb__before_llsc(); \ + \ __res = cmpxchg_local((ptr), (old), (new)); \ - smp_llsc_mb(); \ + \ + /* \ +* In the Loongson3 workaround case __cmpxchg_asm() already \ +* contains a completion barrier after the SC, so we don't \ +* need to emit an extra one here. \ +*/ \ + if (!__SYNC_loongson3_war) \ + smp_llsc_mb(); \ \ __res; \ }) -- 2.23.0
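The compile-time check can be illustrated outside the kernel with a small C11 sketch. The names here (SK_LOONGSON3_WAR, sk_cmpxchg) are hypothetical stand-ins, and C11 fences only approximate the kernel barriers; the point is just that a constant condition lets the compiler drop the outer fences when the inner operation is assumed to carry its own ordering.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for __SYNC_loongson3_war: a build-time 0 or 1. */
#ifndef SK_LOONGSON3_WAR
#define SK_LOONGSON3_WAR 0
#endif

/*
 * Returns the value found in *ptr before the operation, like cmpxchg().
 * When the workaround is enabled the inner operation is assumed to be
 * fully ordered already (modelled here with seq_cst), so the outer
 * fences are skipped; the condition is constant, so the unused branch
 * is removed at compile time.
 */
static inline long sk_cmpxchg(_Atomic long *ptr, long old, long new_val)
{
	const memory_order inner = SK_LOONGSON3_WAR ? memory_order_seq_cst
						    : memory_order_relaxed;

	if (!SK_LOONGSON3_WAR)
		atomic_thread_fence(memory_order_seq_cst);

	/* On failure `old` is overwritten with the current value. */
	atomic_compare_exchange_strong_explicit(ptr, &old, new_val,
						inner, inner);

	if (!SK_LOONGSON3_WAR)
		atomic_thread_fence(memory_order_seq_cst);

	return old;
}

/* Small usage example: one successful and one failed exchange. */
static inline void sk_cmpxchg_selfcheck(void)
{
	_Atomic long v = 1;

	assert(sk_cmpxchg(&v, 1, 2) == 1);	/* matched, v becomes 2 */
	assert(sk_cmpxchg(&v, 1, 3) == 2);	/* mismatch, v stays 2 */
}
```

Building with -DSK_LOONGSON3_WAR=1 changes which barriers are emitted but not the observable result of the exchange.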
[PATCH v2 28/36] MIPS: cmpxchg: Emit Loongson3 sync workarounds within asm
Generate the sync instructions required to workaround Loongson3 LL/SC errata within inline asm blocks, which feels a little safer than doing it from C where strictly speaking the compiler would be well within its rights to insert a memory access between the separate asm statements we previously had, containing sync & ll instructions respectively. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/cmpxchg.h | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/arch/mips/include/asm/cmpxchg.h b/arch/mips/include/asm/cmpxchg.h index 5d3f0e3513b4..fc121d20a980 100644 --- a/arch/mips/include/asm/cmpxchg.h +++ b/arch/mips/include/asm/cmpxchg.h @@ -12,6 +12,7 @@ #include #include #include +#include #include /* @@ -36,12 +37,12 @@ extern unsigned long __xchg_called_with_bad_pointer(void) __typeof(*(m)) __ret; \ \ if (kernel_uses_llsc) { \ - loongson_llsc_mb(); \ __asm__ __volatile__( \ " .setpush\n" \ " .setnoat\n" \ " .setpush\n" \ " .set" MIPS_ISA_ARCH_LEVEL " \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " ld " %0, %2 # __xchg_asm\n" \ " .setpop \n" \ " move$1, %z3 \n" \ @@ -108,12 +109,12 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x, __typeof(*(m)) __ret; \ \ if (kernel_uses_llsc) { \ - loongson_llsc_mb(); \ __asm__ __volatile__( \ " .setpush\n" \ " .setnoat\n" \ " .setpush\n" \ " .set"MIPS_ISA_ARCH_LEVEL" \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " ld " %0, %2 # __cmpxchg_asm \n" \ " bne %0, %z3, 2f \n" \ " .setpop \n" \ @@ -122,11 +123,10 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x, " " st " $1, %1 \n" \ "\t" __SC_BEQZ "$1, 1b \n" \ " .setpop \n" \ - "2: \n" \ + "2: " __SYNC(full, loongson3_war) " \n" \ : "=" (__ret), "=" GCC_OFF_SMALL_ASM() (*m) \ : GCC_OFF_SMALL_ASM() (*m), "Jr" (old), "Jr" (new) \ : __LLSC_CLOBBER); \ - loongson_llsc_mb(); \ } else {\ unsigned long __flags; \ \ @@ -222,11 +222,11 @@ static inline unsigned long __cmpxchg64(volatile 
void *ptr, */ local_irq_save(flags); - loongson_llsc_mb(); asm volatile( " .setpush\n" " .set" MIPS_ISA_ARCH_LEVEL " \n" /* Load 64 bits from ptr */ + " " __SYNC(full, loongson3_war) " \n" "1: lld %L0, %3 # __cmpxchg64 \n" /* * Split the 64 bit value we loaded into the 2 registers that hold the @@ -260,7 +260,7 @@ static inline unsigned long __cmpxchg64(volatile void *ptr, /* If we failed, loop! */ "\t" __SC_BEQZ "%L1, 1b \n&quo
[PATCH v2 20/36] MIPS: bitops: Implement test_and_set_bit() in terms of _lock variant
The only difference between test_and_set_bit() & test_and_set_bit_lock() is memory ordering barrier semantics - the former provides a full barrier whilst the latter only provides acquire semantics. We can therefore implement test_and_set_bit() in terms of test_and_set_bit_lock() with the addition of the extra memory barrier. Do this in order to avoid duplicating logic. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/bitops.h | 66 +++--- arch/mips/lib/bitops.c | 26 -- 2 files changed, 13 insertions(+), 79 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 03532ae9f528..ea35a2e87b6d 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -31,8 +31,6 @@ void __mips_set_bit(unsigned long nr, volatile unsigned long *addr); void __mips_clear_bit(unsigned long nr, volatile unsigned long *addr); void __mips_change_bit(unsigned long nr, volatile unsigned long *addr); -int __mips_test_and_set_bit(unsigned long nr, - volatile unsigned long *addr); int __mips_test_and_set_bit_lock(unsigned long nr, volatile unsigned long *addr); int __mips_test_and_clear_bit(unsigned long nr, @@ -236,24 +234,22 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr) } /* - * test_and_set_bit - Set a bit and return its old value + * test_and_set_bit_lock - Set a bit and return its old value * @nr: Bit to set * @addr: Address to count from * - * This operation is atomic and cannot be reordered. - * It also implies a memory barrier. + * This operation is atomic and implies acquire ordering semantics + * after the memory operation. 
*/ -static inline int test_and_set_bit(unsigned long nr, +static inline int test_and_set_bit_lock(unsigned long nr, volatile unsigned long *addr) { unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); int bit = nr & SZLONG_MASK; unsigned long res, temp; - smp_mb__before_llsc(); - if (!kernel_uses_llsc) { - res = __mips_test_and_set_bit(nr, addr); + res = __mips_test_and_set_bit_lock(nr, addr); } else if (R1_LLSC_WAR) { __asm__ __volatile__( " .setpush\n" @@ -264,7 +260,7 @@ static inline int test_and_set_bit(unsigned long nr, " beqzl %2, 1b \n" " and %2, %0, %3 \n" " .setpop \n" - : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) + : "=" (temp), "+m" (*m), "=" (res) : "r" (1UL << bit) : __LLSC_CLOBBER); } else { @@ -291,56 +287,20 @@ static inline int test_and_set_bit(unsigned long nr, } /* - * test_and_set_bit_lock - Set a bit and return its old value + * test_and_set_bit - Set a bit and return its old value * @nr: Bit to set * @addr: Address to count from * - * This operation is atomic and implies acquire ordering semantics - * after the memory operation. + * This operation is atomic and cannot be reordered. + * It also implies a memory barrier. */ -static inline int test_and_set_bit_lock(unsigned long nr, +static inline int test_and_set_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); - int bit = nr & SZLONG_MASK; - unsigned long res, temp; - - if (!kernel_uses_llsc) { - res = __mips_test_and_set_bit_lock(nr, addr); - } else if (R1_LLSC_WAR) { - __asm__ __volatile__( - " .setpush\n" - " .setarch=r4000 \n" - "1: " __LL "%0, %1 # test_and_set_bit \n" - " or %2, %0, %3 \n" - " " __SC "%2, %1 \n" - " beqzl %2, 1b \n" - " and %2, %0, %3 \n" - " .setpop \n" - : "=" (temp), "+m" (*m), "=" (res) - : "r" (1UL << bit) - : __LLSC_CLOBBER); - } else { - do { - __asm__ __volatile__( -
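The layering described in the commit message, a fully ordered operation built from the acquire-only one plus an extra barrier, can be sketched in portable C11. The sk_* names are hypothetical, and a seq_cst fence ahead of an acquire read-modify-write approximates, rather than reproduces, the kernel's full-barrier guarantee.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Acquire-only variant: sets the bit and reports whether it was set. */
static inline int sk_test_and_set_bit_lock(unsigned long nr,
					   _Atomic unsigned long *addr)
{
	const unsigned long bits = sizeof(unsigned long) * CHAR_BIT;
	unsigned long mask = 1UL << (nr % bits);
	unsigned long old;

	old = atomic_fetch_or_explicit(addr + nr / bits, mask,
				       memory_order_acquire);
	return (old & mask) != 0;
}

/*
 * Stronger variant expressed in terms of the _lock one: the only
 * addition is a fence ahead of the operation, so the bit-manipulation
 * logic exists in exactly one place.
 */
static inline int sk_test_and_set_bit(unsigned long nr,
				      _Atomic unsigned long *addr)
{
	atomic_thread_fence(memory_order_seq_cst);
	return sk_test_and_set_bit_lock(nr, addr);
}

/* Small usage example: the second attempt sees the bit already set. */
static inline void sk_tas_selfcheck(void)
{
	_Atomic unsigned long map[1] = { 0 };

	assert(sk_test_and_set_bit(1, map) == 0);
	assert(sk_test_and_set_bit(1, map) == 1);
}
```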
[PATCH v2 35/36] MIPS: genex: Don't reload address unnecessarily
In ejtag_debug_handler() we must reload the address of
ejtag_debug_buffer_spinlock if an sc fails, since the address in k0 will
have been clobbered by the result of the sc instruction. In the case
where we simply load a non-zero value (ie. there's contention for the
lock) the address will not be clobbered & we can simply branch back to
repeat the load from memory without reloading the address into k0.

The primary motivation for this change is that it moves the target of
the bnez instruction to an instruction within the LL/SC loop (the LL
itself), which we know contains no other memory accesses & therefore
isn't affected by Loongson3 LL/SC errata.

Signed-off-by: Paul Burton
---
Changes in v2: None

 arch/mips/kernel/genex.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
index ac4f2b835165..60ede6b75a3b 100644
--- a/arch/mips/kernel/genex.S
+++ b/arch/mips/kernel/genex.S
@@ -355,8 +355,8 @@ NESTED(ejtag_debug_handler, PT_SIZE, sp)
 #ifdef CONFIG_SMP
 1:	PTR_LA	k0, ejtag_debug_buffer_spinlock
 	__SYNC(full, loongson3_war)
-	ll	k0, 0(k0)
-	bnez	k0, 1b
+2:	ll	k0, 0(k0)
+	bnez	k0, 2b
 	PTR_LA	k0, ejtag_debug_buffer_spinlock
 	sc	k0, 0(k0)
 	beqz	k0, 1b
--
2.23.0
[PATCH v2 12/36] MIPS: atomic: Emit Loongson3 sync workarounds within asm
Generate the sync instructions required to workaround Loongson3 LL/SC errata within inline asm blocks, which feels a little safer than doing it from C where strictly speaking the compiler would be well within its rights to insert a memory access between the separate asm statements we previously had, containing sync & ll instructions respectively. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/atomic.h | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h index b834af5a7382..841ff274ada6 100644 --- a/arch/mips/include/asm/atomic.h +++ b/arch/mips/include/asm/atomic.h @@ -21,6 +21,7 @@ #include #include #include +#include #include #define ATOMIC_INIT(i) { (i) } @@ -56,10 +57,10 @@ static __inline__ void pfx##_##op(type i, pfx##_t * v) \ return; \ } \ \ - loongson_llsc_mb(); \ __asm__ __volatile__( \ " .setpush\n" \ " .set" MIPS_ISA_LEVEL " \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " #ll " %0, %1 # " #pfx "_" #op " \n" \ " " #asm_op " %0, %2 \n" \ " " #sc " %0, %1 \n" \ @@ -85,10 +86,10 @@ static __inline__ type pfx##_##op##_return_relaxed(type i, pfx##_t * v) \ return result; \ } \ \ - loongson_llsc_mb(); \ __asm__ __volatile__( \ " .setpush\n" \ " .set" MIPS_ISA_LEVEL " \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " #ll " %1, %2 # " #pfx "_" #op "_return\n"\ " " #asm_op " %0, %1, %3 \n" \ " " #sc " %0, %2 \n" \ @@ -117,10 +118,10 @@ static __inline__ type pfx##_fetch_##op##_relaxed(type i, pfx##_t * v)\ return result; \ } \ \ - loongson_llsc_mb(); \ __asm__ __volatile__( \ " .setpush\n" \ " .set" MIPS_ISA_LEVEL " \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " #ll " %1, %2 # " #pfx "_fetch_" #op "\n" \ " " #asm_op " %0, %1, %3 \n" \ " " #sc " %0, %2 \n" \ @@ -200,10 +201,10 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v) if (kernel_uses_llsc) { int temp; - loongson_llsc_mb(); __asm__ __volatile__( " .setpush\n" " 
.set"MIPS_ISA_LEVEL"\n" + " " __SYNC(full, loongson3_war) " \n" "1: ll %1, %2 # atomic_sub_if_positive\n" " .setpop \n" " subu%0, %1, %3 \n" @@ -213,7 +214,7 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v) " .set"MIPS_ISA_LEVEL"\n" " sc %1, %2 \n" "\t
[PATCH v2 13/36] MIPS: atomic: Use _atomic barriers in atomic_sub_if_positive()
Use smp_mb__before_atomic() & smp_mb__after_atomic() in
atomic_sub_if_positive() rather than the equivalent
smp_mb__before_llsc() & smp_llsc_mb(). The former are more standard &
this preps us for avoiding redundant duplicate barriers on Loongson3 in
a later patch.

Signed-off-by: Paul Burton
---
Changes in v2: None

 arch/mips/include/asm/atomic.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index 841ff274ada6..24443ef29337 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -196,7 +196,7 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
 {
 	int result;
 
-	smp_mb__before_llsc();
+	smp_mb__before_atomic();
 
 	if (kernel_uses_llsc) {
 		int temp;
@@ -237,7 +237,7 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
 	 * another barrier here.
 	 */
 	if (!__SYNC_loongson3_war)
-		smp_llsc_mb();
+		smp_mb__after_atomic();
 
 	return result;
 }
--
2.23.0
[PATCH v2 19/36] MIPS: bitops: ins start position is always an immediate
The start position for an ins instruction is always encoded as an
immediate, so allowing registers to be used by the inline asm makes no
sense. It should never happen anyway since a bit index should always be
small enough to be treated as an immediate, but remove the nonsensical
"r" for sanity.

Signed-off-by: Paul Burton
---
Changes in v2: None

 arch/mips/include/asm/bitops.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 0f5329e32e87..03532ae9f528 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -85,7 +85,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
 			" " __INS "%0, %3, %2, 1 \n"
 			" " __SC "%0, %1 \n"
 			: "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
-			: "ir" (bit), "r" (~0)
+			: "i" (bit), "r" (~0)
 			: __LLSC_CLOBBER);
 		} while (unlikely(!temp));
 		return;
@@ -150,7 +150,7 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
 			" " __INS "%0, $0, %2, 1 \n"
 			" " __SC "%0, %1 \n"
 			: "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
-			: "ir" (bit)
+			: "i" (bit)
 			: __LLSC_CLOBBER);
 		} while (unlikely(!temp));
 		return;
@@ -383,7 +383,7 @@ static inline int test_and_clear_bit(unsigned long nr,
 			" " __INS "%0, $0, %3, 1 \n"
 			" " __SC "%0, %1 \n"
 			: "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res)
-			: "ir" (bit)
+			: "i" (bit)
 			: __LLSC_CLOBBER);
 		} while (unlikely(!temp));
 	} else {
--
2.23.0
[PATCH v2 08/36] MIPS: barrier: Clean up sync_ginv()
Use the new __SYNC() infrastructure to implement sync_ginv(), for
consistency with much of the rest of the asm/barrier.h.

Signed-off-by: Paul Burton
---
Changes in v2: None

 arch/mips/include/asm/barrier.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index a117c6d95038..c7e05e832da9 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -163,7 +163,7 @@ static inline void wmb(void)
 
 static inline void sync_ginv(void)
 {
-	asm volatile("sync\t%0" :: "i"(__SYNC_ginv));
+	asm volatile(__SYNC(ginv, always));
 }
 
 #include
--
2.23.0
[PATCH v2 14/36] MIPS: atomic: Unify 32b & 64b sub_if_positive
Unify the definitions of atomic_sub_if_positive() & atomic64_sub_if_positive() using a macro like we do for most other atomic functions. This allows us to share the implementation ensuring consistency between the two. Notably this provides the appropriate loongson3_war barriers in the atomic64_sub_if_positive() case which were previously missing. The code is rearranged a little to handle the !kernel_uses_llsc case first in order to de-indent the LL/SC case & allow us not to go over 80 characters per line. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/atomic.h | 164 - 1 file changed, 58 insertions(+), 106 deletions(-) diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h index 24443ef29337..96ef50fa2817 100644 --- a/arch/mips/include/asm/atomic.h +++ b/arch/mips/include/asm/atomic.h @@ -192,65 +192,71 @@ ATOMIC_OPS(atomic64, xor, s64, ^=, xor, lld, scd) * Atomically test @v and subtract @i if @v is greater or equal than @i. * The function returns the old value of @v minus @i. 
*/ -static __inline__ int atomic_sub_if_positive(int i, atomic_t * v) -{ - int result; - - smp_mb__before_atomic(); - - if (kernel_uses_llsc) { - int temp; - - __asm__ __volatile__( - " .setpush\n" - " .set"MIPS_ISA_LEVEL"\n" - " " __SYNC(full, loongson3_war) " \n" - "1: ll %1, %2 # atomic_sub_if_positive\n" - " .setpop \n" - " subu%0, %1, %3 \n" - " move%1, %0 \n" - " bltz%0, 2f \n" - " .setpush\n" - " .set"MIPS_ISA_LEVEL"\n" - " sc %1, %2 \n" - "\t" __SC_BEQZ "%1, 1b \n" - "2: " __SYNC(full, loongson3_war) " \n" - " .setpop \n" - : "=" (result), "=" (temp), - "+" GCC_OFF_SMALL_ASM() (v->counter) - : "Ir" (i) : __LLSC_CLOBBER); - } else { - unsigned long flags; +#define ATOMIC_SIP_OP(pfx, type, op, ll, sc) \ +static __inline__ int pfx##_sub_if_positive(type i, pfx##_t * v) \ +{ \ + type temp, result; \ + \ + smp_mb__before_atomic();\ + \ + if (!kernel_uses_llsc) {\ + unsigned long flags;\ + \ + raw_local_irq_save(flags); \ + result = v->counter;\ + result -= i;\ + if (result >= 0)\ + v->counter = result;\ + raw_local_irq_restore(flags); \ + smp_mb__after_atomic(); \ + return result; \ + } \ + \ + __asm__ __volatile__( \ + " .setpush\n" \ + " .set" MIPS_ISA_LEVEL " \n" \ + " " __SYNC(full, loongson3_war) " \n" \ + "1: " #ll " %1, %2 # atomic_sub_if_positive\n" \ + " .setpop \n" \ + " " #op " %0, %1, %3 \n" \ + " move%1, %0
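The unification can be pictured with a portable C11 sketch in which one macro stamps out both widths. Everything named SK_* or sk_* is hypothetical, a compare-exchange loop stands in for the LL/SC sequence, and the generated functions return `type` (an assumption made here so the 64-bit result is not truncated).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Generates <pfx>_sub_if_positive() for one integer width. The store
 * only happens when the result is non-negative; otherwise the loop is
 * left early and *v stays untouched. Either way the function returns
 * the old value minus i, matching the documented contract.
 */
#define SK_ATOMIC_SIP_OP(pfx, type)					\
static inline type pfx##_sub_if_positive(type i, _Atomic type *v)	\
{									\
	type old = atomic_load_explicit(v, memory_order_relaxed);	\
	type result;							\
									\
	do {								\
		result = old - i;					\
		if (result < 0)						\
			break;						\
	} while (!atomic_compare_exchange_weak(v, &old, result));	\
									\
	return result;							\
}

SK_ATOMIC_SIP_OP(sk_atomic, int32_t)
SK_ATOMIC_SIP_OP(sk_atomic64, int64_t)

/* Small usage example: a successful and a refused subtraction. */
static inline void sk_sip_selfcheck(void)
{
	_Atomic int32_t a = 5;

	assert(sk_atomic_sub_if_positive(3, &a) == 2);	/* 5 -> 2 */
	assert(sk_atomic_sub_if_positive(3, &a) == -1);	/* refused */
	assert(atomic_load(&a) == 2);
}
```

Because both variants come from one body, a fix such as the missing loongson3_war barriers mentioned in the commit message cannot land in one width and be forgotten in the other.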
[PATCH v2 04/36] MIPS: barrier: Clean up rmb() & wmb() definitions
Simplify our definitions of rmb() & wmb() using the new __SYNC() infrastructure. The fast_rmb() & fast_wmb() macros are removed, since they only provided a level of indirection that made the code less readable & weren't directly used anywhere in the kernel tree. The Octeon #ifdef'ery is removed, since the "syncw" instruction previously used is merely an alias for "sync 4" which __SYNC() will emit for the wmb sync type when the kernel is configured for an Octeon CPU. Similarly __SYNC() will emit nothing for the rmb sync type in Octeon configurations. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/barrier.h | 28 ++-- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h index 5ad39bfd3b6d..f36cab87cfde 100644 --- a/arch/mips/include/asm/barrier.h +++ b/arch/mips/include/asm/barrier.h @@ -26,6 +26,18 @@ #define __sync() do { } while(0) #endif +static inline void rmb(void) +{ + asm volatile(__SYNC(rmb, always) ::: "memory"); +} +#define rmb rmb + +static inline void wmb(void) +{ + asm volatile(__SYNC(wmb, always) ::: "memory"); +} +#define wmb wmb + #define __fast_iob() \ __asm__ __volatile__( \ ".set push\n\t" \ @@ -37,16 +49,9 @@ : "m" (*(int *)CKSEG1) \ : "memory") #ifdef CONFIG_CPU_CAVIUM_OCTEON -# define OCTEON_SYNCW_STR ".set push\n.set arch=octeon\nsyncw\nsyncw\n.set pop\n" -# define __syncw() __asm__ __volatile__(OCTEON_SYNCW_STR : : : "memory") - -# define fast_wmb()__syncw() -# define fast_rmb()barrier() # define fast_mb() __sync() # define fast_iob()do { } while (0) #else /* ! 
CONFIG_CPU_CAVIUM_OCTEON */ -# define fast_wmb()__sync() -# define fast_rmb()__sync() # define fast_mb() __sync() # ifdef CONFIG_SGI_IP28 # define fast_iob() \ @@ -83,19 +88,14 @@ #endif /* !CONFIG_CPU_HAS_WB */ -#define wmb() fast_wmb() -#define rmb() fast_rmb() - #if defined(CONFIG_WEAK_ORDERING) # ifdef CONFIG_CPU_CAVIUM_OCTEON # define __smp_mb() __sync() -# define __smp_rmb() barrier() -# define __smp_wmb() __syncw() # else # define __smp_mb() __asm__ __volatile__("sync" : : :"memory") -# define __smp_rmb() __asm__ __volatile__("sync" : : :"memory") -# define __smp_wmb() __asm__ __volatile__("sync" : : :"memory") # endif +# define __smp_rmb() rmb() +# define __smp_wmb() wmb() #else #define __smp_mb() barrier() #define __smp_rmb()barrier() -- 2.23.0
[PATCH v2 11/36] MIPS: atomic: Use one macro to generate 32b & 64b functions
Cut down on duplication by generalizing the ATOMIC_OP(), ATOMIC_OP_RETURN() & ATOMIC_FETCH_OP() macros to work for both 32b & 64b atomics, and removing the ATOMIC64_ variants. This ensures consistency between our atomic_* & atomic64_* functions. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/atomic.h | 196 - 1 file changed, 45 insertions(+), 151 deletions(-) diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h index ace2ea005588..b834af5a7382 100644 --- a/arch/mips/include/asm/atomic.h +++ b/arch/mips/include/asm/atomic.h @@ -42,10 +42,10 @@ */ #define atomic_set(v, i) WRITE_ONCE((v)->counter, (i)) -#define ATOMIC_OP(op, c_op, asm_op)\ -static __inline__ void atomic_##op(int i, atomic_t * v) \ +#define ATOMIC_OP(pfx, op, type, c_op, asm_op, ll, sc) \ +static __inline__ void pfx##_##op(type i, pfx##_t * v) \ { \ - int temp; \ + type temp; \ \ if (!kernel_uses_llsc) {\ unsigned long flags;\ @@ -60,19 +60,19 @@ static __inline__ void atomic_##op(int i, atomic_t * v) \ __asm__ __volatile__( \ " .setpush\n" \ " .set" MIPS_ISA_LEVEL " \n" \ - "1: ll %0, %1 # atomic_" #op "\n" \ + "1: " #ll " %0, %1 # " #pfx "_" #op " \n" \ " " #asm_op " %0, %2 \n" \ - " sc %0, %1 \n" \ + " " #sc " %0, %1 \n" \ "\t" __SC_BEQZ "%0, 1b \n" \ " .setpop \n" \ : "=" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)\ : "Ir" (i) : __LLSC_CLOBBER); \ } -#define ATOMIC_OP_RETURN(op, c_op, asm_op) \ -static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \ +#define ATOMIC_OP_RETURN(pfx, op, type, c_op, asm_op, ll, sc) \ +static __inline__ type pfx##_##op##_return_relaxed(type i, pfx##_t * v) \ { \ - int temp, result; \ + type temp, result; \ \ if (!kernel_uses_llsc) {\ unsigned long flags;\ @@ -89,9 +89,9 @@ static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \ __asm__ __volatile__( \ " .setpush\n" \ " .set" MIPS_ISA_LEVEL " \n" \ - "1: ll %1, %2 # atomic_" #op "_return \n" \ + "1: " #ll " %1, %2 # " #pfx "_" #op 
"_return\n"\ " " #asm_op " %0, %1, %3 \n" \ - " sc %0, %2 \n" \ + " " #sc " %0, %2 \n" \ "\t" __SC_BEQZ "%0, 1b \n" \ " " #asm_op " %0, %1, %3 \n" \ " .setpop \n" \ @@ -102,8 +102,8 @@ static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \ return result; \ } -#define ATOMIC_FETCH_OP(op, c_op, asm_op) \ -static __inline__ int atomic_fetch_##op##_relaxed(int i, atomic_t * v) \ +#define ATOMIC_FETCH_OP(pfx, op, type, c_op, asm_op, ll, sc) \ +static __i
[PATCH v2 01/36] MIPS: Unify sc beqz definition
We currently duplicate the definition of __scbeqz in asm/atomic.h & asm/cmpxchg.h. Move it to asm/llsc.h & rename it to __SC_BEQZ to fit better with the existing __SC macro provided there. We include a tab in the string in order to avoid the need for users to indent code any further to include whitespace of their own after the instruction mnemonic. Signed-off-by: Paul Burton --- Changes in v2: None arch/mips/include/asm/atomic.h | 28 +--- arch/mips/include/asm/cmpxchg.h | 20 arch/mips/include/asm/llsc.h| 11 +++ 3 files changed, 24 insertions(+), 35 deletions(-) diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h index bb8658cc7f12..7578c807ef98 100644 --- a/arch/mips/include/asm/atomic.h +++ b/arch/mips/include/asm/atomic.h @@ -20,19 +20,9 @@ #include #include #include +#include #include -/* - * Using a branch-likely instruction to check the result of an sc instruction - * works around a bug present in R1 CPUs prior to revision 3.0 that could - * cause ll-sc sequences to execute non-atomically. 
- */ -#if R1_LLSC_WAR -# define __scbeqz "beqzl" -#else -# define __scbeqz "beqz" -#endif - #define ATOMIC_INIT(i) { (i) } /* @@ -65,7 +55,7 @@ static __inline__ void atomic_##op(int i, atomic_t * v) \ "1: ll %0, %1 # atomic_" #op "\n" \ " " #asm_op " %0, %2 \n" \ " sc %0, %1 \n" \ - "\t" __scbeqz " %0, 1b \n" \ + "\t" __SC_BEQZ "%0, 1b \n" \ " .setpop \n" \ : "=" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter) \ : "Ir" (i) : __LLSC_CLOBBER); \ @@ -93,7 +83,7 @@ static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \ "1: ll %1, %2 # atomic_" #op "_return \n" \ " " #asm_op " %0, %1, %3 \n" \ " sc %0, %2 \n" \ - "\t" __scbeqz " %0, 1b \n" \ + "\t" __SC_BEQZ "%0, 1b \n" \ " " #asm_op " %0, %1, %3 \n" \ " .setpop \n" \ : "=" (result), "=" (temp), \ @@ -127,7 +117,7 @@ static __inline__ int atomic_fetch_##op##_relaxed(int i, atomic_t * v)\ "1: ll %1, %2 # atomic_fetch_" #op " \n" \ " " #asm_op " %0, %1, %3 \n" \ " sc %0, %2 \n" \ - "\t" __scbeqz " %0, 1b \n" \ + "\t" __SC_BEQZ "%0, 1b \n" \ " .setpop \n" \ " move%0, %1 \n" \ : "=" (result), "=" (temp), \ @@ -205,7 +195,7 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v) " .setpush\n" " .set"MIPS_ISA_LEVEL"\n" " sc %1, %2 \n" - "\t" __scbeqz " %1, 1b \n" + "\t" __SC_BEQZ "%1, 1b \n" "2: \n" " .setpop \n" : "=" (result), "=" (temp), @@ -267,7 +257,7 @@ static __inline__ void atomic64_##op(s64 i, atomic64_t * v) \ "1: lld %0, %1 # atomic64_" #op " \n" \ " " #asm_op " %0, %2 \n" \
[PATCH v2 00/36] MIPS: barriers & atomics cleanups
This series consists of a bunch of cleanups to the way we handle memory
barriers (though no changes to the sync instructions we use to implement
them) & atomic memory accesses. One major goal was to ensure the
Loongson3 LL/SC errata workarounds are applied in a safe manner from
within inline-asm & that we can automatically verify the resulting
kernel binary looks reasonable. Many patches are cleanups found along
the way.

Applies atop v5.4-rc1.

Changes in v2:

- Keep our fls/ffs implementations. Turns out GCC's builtins call
  intrinsics in some configurations, and if we'd need to go implement
  those then using the generic fls/ffs doesn't seem like such a win.

- De-string __WEAK_LLSC_MB to allow use with __SYNC_ELSE().

- Only try to build the loongson3-llsc-check tool from
  arch/mips/Makefile when CONFIG_CPU_LOONGSON3_WORKAROUNDS is enabled.

Paul Burton (36):
  MIPS: Unify sc beqz definition
  MIPS: Use compact branch for LL/SC loops on MIPSr6+
  MIPS: barrier: Add __SYNC() infrastructure
  MIPS: barrier: Clean up rmb() & wmb() definitions
  MIPS: barrier: Clean up __smp_mb() definition
  MIPS: barrier: Remove fast_mb() Octeon #ifdef'ery
  MIPS: barrier: Clean up __sync() definition
  MIPS: barrier: Clean up sync_ginv()
  MIPS: atomic: Fix whitespace in ATOMIC_OP macros
  MIPS: atomic: Handle !kernel_uses_llsc first
  MIPS: atomic: Use one macro to generate 32b & 64b functions
  MIPS: atomic: Emit Loongson3 sync workarounds within asm
  MIPS: atomic: Use _atomic barriers in atomic_sub_if_positive()
  MIPS: atomic: Unify 32b & 64b sub_if_positive
  MIPS: atomic: Deduplicate 32b & 64b read, set, xchg, cmpxchg
  MIPS: bitops: Handle !kernel_uses_llsc first
  MIPS: bitops: Only use ins for bit 16 or higher
  MIPS: bitops: Use MIPS_ISA_REV, not #ifdefs
  MIPS: bitops: ins start position is always an immediate
  MIPS: bitops: Implement test_and_set_bit() in terms of _lock variant
  MIPS: bitops: Allow immediates in test_and_{set,clear,change}_bit
  MIPS: bitops: Use the BIT() macro
  MIPS: bitops: Avoid redundant zero-comparison for non-LLSC
  MIPS: bitops: Abstract LL/SC loops
  MIPS: bitops: Use BIT_WORD() & BITS_PER_LONG
  MIPS: bitops: Emit Loongson3 sync workarounds within asm
  MIPS: bitops: Use smp_mb__before_atomic in test_* ops
  MIPS: cmpxchg: Emit Loongson3 sync workarounds within asm
  MIPS: cmpxchg: Omit redundant barriers for Loongson3
  MIPS: futex: Emit Loongson3 sync workarounds within asm
  MIPS: syscall: Emit Loongson3 sync workarounds within asm
  MIPS: barrier: Remove loongson_llsc_mb()
  MIPS: barrier: Make __smp_mb__before_atomic() a no-op for Loongson3
  MIPS: genex: Add Loongson3 LL/SC workaround to ejtag_debug_handler
  MIPS: genex: Don't reload address unnecessarily
  MIPS: Check Loongson3 LL/SC errata workaround correctness

 arch/mips/Makefile                     |   3 +
 arch/mips/Makefile.postlink            |  10 +-
 arch/mips/include/asm/atomic.h         | 571 +
 arch/mips/include/asm/barrier.h        | 228 ++
 arch/mips/include/asm/bitops.h         | 443 ++-
 arch/mips/include/asm/cmpxchg.h        |  59 +--
 arch/mips/include/asm/futex.h          |  15 +-
 arch/mips/include/asm/llsc.h           |  19 +-
 arch/mips/include/asm/sync.h           | 207 +
 arch/mips/kernel/genex.S               |   6 +-
 arch/mips/kernel/pm-cps.c              |  20 +-
 arch/mips/kernel/syscall.c             |   3 +-
 arch/mips/lib/bitops.c                 |  57 +--
 arch/mips/loongson64/Platform          |   2 +-
 arch/mips/tools/.gitignore             |   1 +
 arch/mips/tools/Makefile               |   5 +
 arch/mips/tools/loongson3-llsc-check.c | 307 +
 17 files changed, 981 insertions(+), 975 deletions(-)
 create mode 100644 arch/mips/include/asm/sync.h
 create mode 100644 arch/mips/tools/loongson3-llsc-check.c

--
2.23.0
[PATCH v2 06/36] MIPS: barrier: Remove fast_mb() Octeon #ifdef'ery
The definition of fast_mb() is the same in both the Octeon & non-Octeon
cases, so remove the duplication & define it only once.

Signed-off-by: Paul Burton
---
Changes in v2: None

 arch/mips/include/asm/barrier.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 8a5abc1c85a6..657ec01120a4 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -38,6 +38,8 @@ static inline void wmb(void)
 }
 #define wmb wmb
 
+#define fast_mb() __sync()
+
 #define __fast_iob() \
 	__asm__ __volatile__( \
 	".set push\n\t" \
@@ -49,10 +51,8 @@ static inline void wmb(void)
 	: "m" (*(int *)CKSEG1) \
 	: "memory")
 #ifdef CONFIG_CPU_CAVIUM_OCTEON
-# define fast_mb() __sync()
 # define fast_iob() do { } while (0)
 #else /* ! CONFIG_CPU_CAVIUM_OCTEON */
-# define fast_mb() __sync()
 # ifdef CONFIG_SGI_IP28
 # define fast_iob() \
 	__asm__ __volatile__( \
--
2.23.0
[PATCH 02/37] MIPS: Use compact branch for LL/SC loops on MIPSr6+
When targeting MIPSr6 or higher make use of a compact branch in LL/SC
loops, preventing the insertion of a delay slot nop that only serves to
waste space.

Signed-off-by: Paul Burton
---
 arch/mips/include/asm/llsc.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/mips/include/asm/llsc.h b/arch/mips/include/asm/llsc.h
index 9b19f38562ac..d240a4a2d1c4 100644
--- a/arch/mips/include/asm/llsc.h
+++ b/arch/mips/include/asm/llsc.h
@@ -9,6 +9,8 @@
 #ifndef __ASM_LLSC_H
 #define __ASM_LLSC_H

+#include
+
 #if _MIPS_SZLONG == 32
 #define SZLONG_LOG 5
 #define SZLONG_MASK 31UL
@@ -32,6 +34,8 @@
  */
 #if R1_LLSC_WAR
 # define __SC_BEQZ "beqzl "
+#elif MIPS_ISA_REV >= 6
+# define __SC_BEQZ "beqzc "
 #else
 # define __SC_BEQZ "beqz "
 #endif
--
2.23.0
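For readers who want to poke at the selection logic outside the kernel, here
is a minimal sketch of the same three-way choice written as an ordinary C
function. The helper name and its parameters are hypothetical (the patch
itself makes the choice in the preprocessor, not at run time); only the three
mnemonics come from the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper, not kernel code: mirrors the #if/#elif/#else
 * chain that picks __SC_BEQZ, but as a run-time function so that it
 * can be exercised in a test.
 */
static const char *sc_branch_mnemonic(int needs_branch_likely, int isa_rev)
{
	assert(isa_rev >= 0);

	if (needs_branch_likely)
		return "beqzl";	/* workaround case keeps the branch-likely form */
	if (isa_rev >= 6)
		return "beqzc";	/* compact branch: no delay slot to pad with a nop */
	return "beqz";		/* classic branch, followed by a delay slot */
}
```

The ordering matters: the workaround case is checked first, so a
configuration that needs beqzl never falls through to the compact branch.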
[PATCH 03/37] MIPS: barrier: Add __SYNC() infrastructure
Introduce an asm/sync.h header which provides infrastructure that can be used to generate sync instructions of various types, and for various reasons. For example if we need a sync instruction that provides a full completion barrier but only on systems which have weak memory ordering, we can generate the appropriate assembly code using: __SYNC(full, weak_ordering) When the kernel is configured to run on systems with weak memory ordering (ie. CONFIG_WEAK_ORDERING is selected) we'll emit a sync instruction. When the kernel is configured to run on systems with strong memory ordering (ie. CONFIG_WEAK_ORDERING is not selected) we'll emit nothing. The caller doesn't need to know which happened - it simply says what it needs & when, with no concern for checking the kernel configuration. There are some scenarios in which we may want to emit code only when we *didn't* emit a sync instruction. For example, some Loongson3 CPUs suffer from a bug that requires us to emit a sync instruction prior to each ll instruction (enabled by CONFIG_CPU_LOONGSON3_WORKAROUNDS). In cases where this bug workaround is enabled, it's wasteful to then have more generic code emit another sync instruction to provide barriers we need in general. A __SYNC_ELSE() macro allows for this, providing an extra argument that contains code to be assembled only in cases where the sync instruction was not emitted. For example if we have a scenario in which we generally want to emit a release barrier but for affected Loongson3 configurations upgrade that to a full completion barrier, we can do that like so: __SYNC_ELSE(full, loongson3_war, __SYNC(rl, always)) The assembly generated by these macros can be used either as inline assembly or in assembly source files. Differing types of sync as provided by MIPSr6 are defined, but currently they all generate a full completion barrier except in kernels configured for Cavium Octeon systems. 
There the wmb sync-type is used, and rmb syncs are omitted, as has been the case since commit 6b07d38aaa52 ("MIPS: Octeon: Use optimized memory barrier primitives."). Using __SYNC() with the wmb or rmb types will abstract away the Octeon specific behavior and allow us to later clean up asm/barrier.h code that currently includes a plethora of #ifdef's. Signed-off-by: Paul Burton --- arch/mips/include/asm/barrier.h | 113 + arch/mips/include/asm/sync.h| 207 arch/mips/kernel/pm-cps.c | 20 +-- 3 files changed, 219 insertions(+), 121 deletions(-) create mode 100644 arch/mips/include/asm/sync.h diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h index 9228f7386220..5ad39bfd3b6d 100644 --- a/arch/mips/include/asm/barrier.h +++ b/arch/mips/include/asm/barrier.h @@ -9,116 +9,7 @@ #define __ASM_BARRIER_H #include - -/* - * Sync types defined by the MIPS architecture (document MD00087 table 6.5) - * These values are used with the sync instruction to perform memory barriers. - * Types of ordering guarantees available through the SYNC instruction: - * - Completion Barriers - * - Ordering Barriers - * As compared to the completion barrier, the ordering barrier is a - * lighter-weight operation as it does not require the specified instructions - * before the SYNC to be already completed. Instead it only requires that those - * specified instructions which are subsequent to the SYNC in the instruction - * stream are never re-ordered for processing ahead of the specified - * instructions which are before the SYNC in the instruction stream. - * This potentially reduces how many cycles the barrier instruction must stall - * before it completes. - * Implementations that do not use any of the non-zero values of stype to define - * different barriers, such as ordering barriers, must make those stype values - * act the same as stype zero. 
- */ - -/* - * Completion barriers: - * - Every synchronizable specified memory instruction (loads or stores or both) - * that occurs in the instruction stream before the SYNC instruction must be - * already globally performed before any synchronizable specified memory - * instructions that occur after the SYNC are allowed to be performed, with - * respect to any other processor or coherent I/O module. - * - * - The barrier does not guarantee the order in which instruction fetches are - * performed. - * - * - A stype value of zero will always be defined such that it performs the most - * complete set of synchronization operations that are defined.This means - * stype zero always does a completion barrier that affects both loads and - * stores preceding the SYNC instruction and both loads and stores that are - * subsequent to the SYNC instruction. Non-zero values of stype may be defined - * by the architecture or specific implementations to perform synchronization - * behaviors that are less complete than that of stype zero. If an - * implementation does not use on
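The behaviour the commit message describes for __SYNC() and __SYNC_ELSE()
can be imitated in hosted C with token pasting. What follows is a sketch
under stated assumptions, not the contents of asm/sync.h: every SKETCH_*
name is made up for illustration, the two configuration values are
hard-coded, and the "instructions" are plain strings with a comment naming
the requested type.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for kernel configuration; edit to experiment. */
#define SKETCH_WEAK_ORDERING	0
#define SKETCH_LOONGSON3_WAR	1

/* A reason expands to 1 when the sync must be emitted, 0 otherwise. */
#define SKETCH_REASON_always		1
#define SKETCH_REASON_weak_ordering	SKETCH_WEAK_ORDERING
#define SKETCH_REASON_loongson3_war	SKETCH_LOONGSON3_WAR

/* A type expands to the text to assemble (illustrative strings only). */
#define SKETCH_TYPE_full	"sync # full"
#define SKETCH_TYPE_rl		"sync # rl"

/* Select insn when the reason is 1, else the alternative. */
#define SKETCH_EMIT_1(insn, alt)	insn
#define SKETCH_EMIT_0(insn, alt)	alt
#define SKETCH_PASTE(a, b)		a##b
#define SKETCH_EMIT(cond, insn, alt)	SKETCH_PASTE(SKETCH_EMIT_, cond)(insn, alt)

#define SKETCH_SYNC_ELSE(type, reason, alt) \
	SKETCH_EMIT(SKETCH_REASON_##reason, SKETCH_TYPE_##type, alt)
#define SKETCH_SYNC(type, reason) \
	SKETCH_SYNC_ELSE(type, reason, "")

/* Shows what each request expands to under the settings above. */
static void sketch_self_check(void)
{
	/* Strong ordering configured: nothing is emitted. */
	assert(strcmp(SKETCH_SYNC(full, weak_ordering), "") == 0);
	/* Workaround enabled: the full barrier replaces the release one. */
	assert(strcmp(SKETCH_SYNC_ELSE(full, loongson3_war,
				       SKETCH_SYNC(rl, always)),
		      "sync # full") == 0);
}
```

Setting SKETCH_LOONGSON3_WAR to 0 makes the second request fall back to the
release barrier string, which is the "emit code only when we didn't emit a
sync" case from the commit message.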
[PATCH 18/37] MIPS: bitops: Only use ins for bit 16 or higher
set_bit() can set bits 0-15 using an ori instruction, rather than
loading the value -1 into a register & then using an ins instruction.

That is, rather than the following:

  li   t0, -1
  ll   t1, 0(t2)
  ins  t1, t0, 4, 1
  sc   t1, 0(t2)

We can have the simpler:

  ll   t1, 0(t2)
  ori  t1, t1, 0x10
  sc   t1, 0(t2)

The or path already allows immediates to be used, so simply restricting
the ins path to bits that don't fit in immediates is sufficient to take
advantage of this.

Signed-off-by: Paul Burton
---
 arch/mips/include/asm/bitops.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index d3f3f37ca0b1..3ea4f172ac08 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -77,7 +77,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
 	}

 #if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6)
-	if (__builtin_constant_p(bit)) {
+	if (__builtin_constant_p(bit) && (bit >= 16)) {
 		loongson_llsc_mb();
 		do {
 			__asm__ __volatile__(
--
2.23.0
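The cut-off at bit 16 follows from ori taking a 16-bit zero-extended
immediate, so a single-bit mask fits only for bits 0-15. A small sketch of
that check, plus a non-atomic model of what the ll/ori/sc sequence stores;
both function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: true when the mask (1 << bit) can be encoded as
 * the 16-bit zero-extended immediate of an ori instruction.
 */
static bool bit_fits_ori_immediate(unsigned int bit)
{
	return bit < 16;
}

/*
 * Plain C model of the value the ll/ori/sc sequence writes back.
 * This is NOT atomic; it only shows the arithmetic.
 */
static unsigned long set_bit_model(unsigned long word, unsigned int bit)
{
	assert(bit < sizeof(unsigned long) * CHAR_BIT);
	return word | (1UL << bit);
}
```

Bit 4 corresponds to the 0x10 immediate in the example above; bit 16 and
higher would keep using the ins path.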
[PATCH 22/37] MIPS: bitops: Allow immediates in test_and_{set,clear,change}_bit
The logical operations or & xor used in the test_and_set_bit_lock(), test_and_clear_bit() & test_and_change_bit() functions currently force the value 1< --- arch/mips/include/asm/bitops.h | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 34d6fe3f18d0..0b0ce0adce8f 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -261,7 +261,7 @@ static inline int test_and_set_bit_lock(unsigned long nr, " and %2, %0, %3 \n" " .setpop \n" : "=" (temp), "+m" (*m), "=" (res) - : "r" (1UL << bit) + : "ir" (1UL << bit) : __LLSC_CLOBBER); } else { loongson_llsc_mb(); @@ -274,7 +274,7 @@ static inline int test_and_set_bit_lock(unsigned long nr, " " __SC "%2, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) - : "r" (1UL << bit) + : "ir" (1UL << bit) : __LLSC_CLOBBER); } while (unlikely(!res)); @@ -332,7 +332,7 @@ static inline int test_and_clear_bit(unsigned long nr, " and %2, %0, %3 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) - : "r" (1UL << bit) + : "ir" (1UL << bit) : __LLSC_CLOBBER); } else if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(nr)) { loongson_llsc_mb(); @@ -358,7 +358,7 @@ static inline int test_and_clear_bit(unsigned long nr, " " __SC "%2, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) - : "r" (1UL << bit) + : "ir" (1UL << bit) : __LLSC_CLOBBER); } while (unlikely(!res)); @@ -400,7 +400,7 @@ static inline int test_and_change_bit(unsigned long nr, " and %2, %0, %3 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) - : "r" (1UL << bit) + : "ir" (1UL << bit) : __LLSC_CLOBBER); } else { loongson_llsc_mb(); @@ -413,7 +413,7 @@ static inline int test_and_change_bit(unsigned long nr, " " __SC "\t%2, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) - : "r" (1UL << bit) + : "ir" (1UL << bit) : __LLSC_CLOBBER); } while 
(unlikely(!res)); -- 2.23.0
[PATCH 07/37] MIPS: barrier: Clean up __sync() definition
Implement __sync() using the new __SYNC() infrastructure, which will
take care of not emitting an instruction for old R3k CPUs that don't
support it. The only behavioral difference is that __sync() will now
provide a compiler barrier on these old CPUs, but that seems like
reasonable behavior anyway.

Signed-off-by: Paul Burton
---
 arch/mips/include/asm/barrier.h | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 657ec01120a4..a117c6d95038 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -11,20 +11,10 @@
 #include
 #include

-#ifdef CONFIG_CPU_HAS_SYNC
-#define __sync() \
-	__asm__ __volatile__( \
-		".set push\n\t" \
-		".set noreorder\n\t" \
-		".set mips2\n\t" \
-		"sync\n\t" \
-		".set pop" \
-		: /* no output */ \
-		: /* no input */ \
-		: "memory")
-#else
-#define __sync() do { } while(0)
-#endif
+static inline void __sync(void)
+{
+	asm volatile(__SYNC(full, always) ::: "memory");
+}

 static inline void rmb(void)
 {
--
2.23.0
[PATCH 05/37] MIPS: barrier: Clean up __smp_mb() definition
We #ifdef on Cavium Octeon CPUs, but emit the same sync instruction in
both cases. Remove the #ifdef & simply expand to the __sync() macro.

Whilst here indent the strong ordering case definitions to match the
indentation of the weak ordering ones, helping readability.

Signed-off-by: Paul Burton
---
 arch/mips/include/asm/barrier.h | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index f36cab87cfde..8a5abc1c85a6 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -89,17 +89,13 @@ static inline void wmb(void)
 #endif /* !CONFIG_CPU_HAS_WB */

 #if defined(CONFIG_WEAK_ORDERING)
-# ifdef CONFIG_CPU_CAVIUM_OCTEON
-# define __smp_mb() __sync()
-# else
-# define __smp_mb() __asm__ __volatile__("sync" : : :"memory")
-# endif
+# define __smp_mb()	__sync()
 # define __smp_rmb() rmb()
 # define __smp_wmb() wmb()
 #else
-#define __smp_mb() barrier()
-#define __smp_rmb()	barrier()
-#define __smp_wmb()	barrier()
+# define __smp_mb()	barrier()
+# define __smp_rmb() barrier()
+# define __smp_wmb() barrier()
 #endif

 /*
--
2.23.0
[PATCH 13/37] MIPS: atomic: Use _atomic barriers in atomic_sub_if_positive()
Use smp_mb__before_atomic() & smp_mb__after_atomic() in
atomic_sub_if_positive() rather than the equivalent
smp_mb__before_llsc() & smp_llsc_mb(). The former are more standard &
this preps us for avoiding redundant duplicate barriers on Loongson3 in
a later patch.

Signed-off-by: Paul Burton
---
 arch/mips/include/asm/atomic.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index 841ff274ada6..24443ef29337 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -196,7 +196,7 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
 {
 	int result;

-	smp_mb__before_llsc();
+	smp_mb__before_atomic();

 	if (kernel_uses_llsc) {
 		int temp;
@@ -237,7 +237,7 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
 	 * another barrier here.
 	 */
 	if (!__SYNC_loongson3_war)
-		smp_llsc_mb();
+		smp_mb__after_atomic();

 	return result;
 }
--
2.23.0
[PATCH 12/37] MIPS: atomic: Emit Loongson3 sync workarounds within asm
Generate the sync instructions required to workaround Loongson3 LL/SC errata within inline asm blocks, which feels a little safer than doing it from C where strictly speaking the compiler would be well within its rights to insert a memory access between the separate asm statements we previously had, containing sync & ll instructions respectively. Signed-off-by: Paul Burton --- arch/mips/include/asm/atomic.h | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h index b834af5a7382..841ff274ada6 100644 --- a/arch/mips/include/asm/atomic.h +++ b/arch/mips/include/asm/atomic.h @@ -21,6 +21,7 @@ #include #include #include +#include #include #define ATOMIC_INIT(i) { (i) } @@ -56,10 +57,10 @@ static __inline__ void pfx##_##op(type i, pfx##_t * v) \ return; \ } \ \ - loongson_llsc_mb(); \ __asm__ __volatile__( \ " .setpush\n" \ " .set" MIPS_ISA_LEVEL " \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " #ll " %0, %1 # " #pfx "_" #op " \n" \ " " #asm_op " %0, %2 \n" \ " " #sc " %0, %1 \n" \ @@ -85,10 +86,10 @@ static __inline__ type pfx##_##op##_return_relaxed(type i, pfx##_t * v) \ return result; \ } \ \ - loongson_llsc_mb(); \ __asm__ __volatile__( \ " .setpush\n" \ " .set" MIPS_ISA_LEVEL " \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " #ll " %1, %2 # " #pfx "_" #op "_return\n"\ " " #asm_op " %0, %1, %3 \n" \ " " #sc " %0, %2 \n" \ @@ -117,10 +118,10 @@ static __inline__ type pfx##_fetch_##op##_relaxed(type i, pfx##_t * v)\ return result; \ } \ \ - loongson_llsc_mb(); \ __asm__ __volatile__( \ " .setpush\n" \ " .set" MIPS_ISA_LEVEL " \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " #ll " %1, %2 # " #pfx "_fetch_" #op "\n" \ " " #asm_op " %0, %1, %3 \n" \ " " #sc " %0, %2 \n" \ @@ -200,10 +201,10 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v) if (kernel_uses_llsc) { int temp; - loongson_llsc_mb(); __asm__ __volatile__( " .setpush\n" " .set"MIPS_ISA_LEVEL"\n" 
+ " " __SYNC(full, loongson3_war) " \n" "1: ll %1, %2 # atomic_sub_if_positive\n" " .setpop \n" " subu%0, %1, %3 \n" @@ -213,7 +214,7 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v) " .set"MIPS_ISA_LEVEL"\n" " sc %1, %2 \n" "\t"
[PATCH 33/37] MIPS: barrier: Remove loongson_llsc_mb()
The loongson_llsc_mb() macro is no longer used - instead barriers are emitted as part of inline asm using the __SYNC() macro. Remove the now-defunct loongson_llsc_mb() macro. Signed-off-by: Paul Burton --- arch/mips/include/asm/barrier.h | 40 - arch/mips/loongson64/Platform | 2 +- 2 files changed, 1 insertion(+), 41 deletions(-) diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h index c7e05e832da9..1a99a6c5b5dd 100644 --- a/arch/mips/include/asm/barrier.h +++ b/arch/mips/include/asm/barrier.h @@ -121,46 +121,6 @@ static inline void wmb(void) #define __smp_mb__before_atomic() __smp_mb__before_llsc() #define __smp_mb__after_atomic() smp_llsc_mb() -/* - * Some Loongson 3 CPUs have a bug wherein execution of a memory access (load, - * store or prefetch) in between an LL & SC can cause the SC instruction to - * erroneously succeed, breaking atomicity. Whilst it's unusual to write code - * containing such sequences, this bug bites harder than we might otherwise - * expect due to reordering & speculation: - * - * 1) A memory access appearing prior to the LL in program order may actually - *be executed after the LL - this is the reordering case. - * - *In order to avoid this we need to place a memory barrier (ie. a SYNC - *instruction) prior to every LL instruction, in between it and any earlier - *memory access instructions. - * - *This reordering case is fixed by 3A R2 CPUs, ie. 3A2000 models and later. - * - * 2) If a conditional branch exists between an LL & SC with a target outside - *of the LL-SC loop, for example an exit upon value mismatch in cmpxchg() - *or similar, then misprediction of the branch may allow speculative - *execution of memory accesses from outside of the LL-SC loop. - * - *In order to avoid this we need a memory barrier (ie. a SYNC instruction) - *at each affected branch target, for which we also use loongson_llsc_mb() - *defined below. - * - *This case affects all current Loongson 3 CPUs. 
- * - * The above described cases cause an error in the cache coherence protocol; - * such that the Invalidate of a competing LL-SC goes 'missing' and SC - * erroneously observes its core still has Exclusive state and lets the SC - * proceed. - * - * Therefore the error only occurs on SMP systems. - */ -#ifdef CONFIG_CPU_LOONGSON3_WORKAROUNDS /* Loongson-3's LLSC workaround */ -#define loongson_llsc_mb() __asm__ __volatile__("sync" : : :"memory") -#else -#define loongson_llsc_mb() do { } while (0) -#endif - static inline void sync_ginv(void) { asm volatile(__SYNC(ginv, always)); diff --git a/arch/mips/loongson64/Platform b/arch/mips/loongson64/Platform index c1a4d4dc4665..28172500f95a 100644 --- a/arch/mips/loongson64/Platform +++ b/arch/mips/loongson64/Platform @@ -27,7 +27,7 @@ cflags-$(CONFIG_CPU_LOONGSON3)+= -Wa,--trap # # Some versions of binutils, not currently mainline as of 2019/02/04, support # an -mfix-loongson3-llsc flag which emits a sync prior to each ll instruction -# to work around a CPU bug (see loongson_llsc_mb() in asm/barrier.h for a +# to work around a CPU bug (see __SYNC_loongson3_war in asm/sync.h for a # description). # # We disable this in order to prevent the assembler meddling with the -- 2.23.0
[PATCH 20/37] MIPS: bitops: ins start position is always an immediate
The start position for an ins instruction is always encoded as an immediate, so allowing registers to be used by the inline asm makes no sense. It should never happen anyway since a bit index should always be small enough to be treated as an immediate, but remove the nonsensical "r" for sanity. Signed-off-by: Paul Burton --- arch/mips/include/asm/bitops.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index b8785bdf3507..83fd1f1c3ab4 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -85,7 +85,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) " " __INS "%0, %3, %2, 1 \n" " " __SC "%0, %1 \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) - : "ir" (bit), "r" (~0) + : "i" (bit), "r" (~0) : __LLSC_CLOBBER); } while (unlikely(!temp)); return; @@ -150,7 +150,7 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr) " " __INS "%0, $0, %2, 1 \n" " " __SC "%0, %1 \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) - : "ir" (bit) + : "i" (bit) : __LLSC_CLOBBER); } while (unlikely(!temp)); return; @@ -383,7 +383,7 @@ static inline int test_and_clear_bit(unsigned long nr, " " __INS "%0, $0, %3, 1 \n" " " __SC "%0, %1 \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) - : "ir" (bit) + : "i" (bit) : __LLSC_CLOBBER); } while (unlikely(!temp)); } else { -- 2.23.0
[PATCH 36/37] MIPS: genex: Don't reload address unnecessarily
In ejtag_debug_handler() we must reload the address of
ejtag_debug_buffer_spinlock if an sc fails, since the address in k0 will
have been clobbered by the result of the sc instruction. In the case
where we simply load a non-zero value (ie. there's contention for the
lock) the address will not be clobbered & we can simply branch back to
repeat the load from memory without reloading the address into k0.

The primary motivation for this change is that it moves the target of
the bnez instruction to an instruction within the LL/SC loop (the LL
itself), which we know contains no other memory accesses & therefore
isn't affected by Loongson3 LL/SC errata.

Signed-off-by: Paul Burton
---
 arch/mips/kernel/genex.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
index ac4f2b835165..60ede6b75a3b 100644
--- a/arch/mips/kernel/genex.S
+++ b/arch/mips/kernel/genex.S
@@ -355,8 +355,8 @@ NESTED(ejtag_debug_handler, PT_SIZE, sp)
 #ifdef CONFIG_SMP
 1:	PTR_LA	k0, ejtag_debug_buffer_spinlock
 	__SYNC(full, loongson3_war)
-	ll	k0, 0(k0)
-	bnez	k0, 1b
+2:	ll	k0, 0(k0)
+	bnez	k0, 2b
 	PTR_LA	k0, ejtag_debug_buffer_spinlock
 	sc	k0, 0(k0)
 	beqz	k0, 1b
--
2.23.0
[PATCH 35/37] MIPS: genex: Add Loongson3 LL/SC workaround to ejtag_debug_handler
In ejtag_debug_handler we use LL & SC instructions to acquire & release
an open-coded spinlock. For Loongson3 systems affected by LL/SC errata
this requires that we insert a sync instruction prior to the LL in order
to ensure correct behavior of the LL/SC loop.

Signed-off-by: Paul Burton
---
 arch/mips/kernel/genex.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
index efde27c99414..ac4f2b835165 100644
--- a/arch/mips/kernel/genex.S
+++ b/arch/mips/kernel/genex.S
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 #include

@@ -353,6 +354,7 @@ NESTED(ejtag_debug_handler, PT_SIZE, sp)

 #ifdef CONFIG_SMP
 1:	PTR_LA	k0, ejtag_debug_buffer_spinlock
+	__SYNC(full, loongson3_war)
 	ll	k0, 0(k0)
 	bnez	k0, 1b
 	PTR_LA	k0, ejtag_debug_buffer_spinlock
--
2.23.0
[PATCH 30/37] MIPS: cmpxchg: Omit redundant barriers for Loongson3
When building a kernel configured to support Loongson3 LL/SC workarounds (ie. CONFIG_CPU_LOONGSON3_WORKAROUNDS=y) the inline assembly in __xchg_asm() & __cmpxchg_asm() already emits completion barriers, and as such we don't need to emit extra barriers from the xchg() or cmpxchg() macros. Add compile-time constant checks causing us to omit the redundant memory barriers. Signed-off-by: Paul Burton --- arch/mips/include/asm/cmpxchg.h | 26 +++--- 1 file changed, 23 insertions(+), 3 deletions(-) diff --git a/arch/mips/include/asm/cmpxchg.h b/arch/mips/include/asm/cmpxchg.h index fc121d20a980..820df68e32e1 100644 --- a/arch/mips/include/asm/cmpxchg.h +++ b/arch/mips/include/asm/cmpxchg.h @@ -94,7 +94,13 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x, ({ \ __typeof__(*(ptr)) __res; \ \ - smp_mb__before_llsc(); \ + /* \ +* In the Loongson3 workaround case __xchg_asm() already\ +* contains a completion barrier prior to the LL, so we don't \ +* need to emit an extra one here. \ +*/ \ + if (!__SYNC_loongson3_war) \ + smp_mb__before_llsc(); \ \ __res = (__typeof__(*(ptr)))\ __xchg((ptr), (unsigned long)(x), sizeof(*(ptr))); \ @@ -179,9 +185,23 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old, ({ \ __typeof__(*(ptr)) __res; \ \ - smp_mb__before_llsc(); \ + /* \ +* In the Loongson3 workaround case __cmpxchg_asm() already \ +* contains a completion barrier prior to the LL, so we don't \ +* need to emit an extra one here. \ +*/ \ + if (!__SYNC_loongson3_war) \ + smp_mb__before_llsc(); \ + \ __res = cmpxchg_local((ptr), (old), (new)); \ - smp_llsc_mb(); \ + \ + /* \ +* In the Loongson3 workaround case __cmpxchg_asm() already \ +* contains a completion barrier after the SC, so we don't \ +* need to emit an extra one here. \ +*/ \ + if (!__SYNC_loongson3_war) \ + smp_llsc_mb(); \ \ __res; \ }) -- 2.23.0
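To make the "omit the redundant barrier" reasoning concrete, here is a
user-space model that simply counts barriers. It is a sketch under loud
assumptions: all sketch_* names are invented, SKETCH_LOONGSON3_WAR stands in
for __SYNC_loongson3_war, the compare-and-swap is a plain non-atomic C
sequence, and a "barrier" is just a counter increment.

```c
#include <assert.h>

/* Hypothetical stand-in for __SYNC_loongson3_war (1 = workaround on). */
#define SKETCH_LOONGSON3_WAR	1

static int sketch_barriers_emitted;	/* illustration only */

static void sketch_barrier(void)
{
	sketch_barriers_emitted++;
}

/*
 * Models the LL/SC asm body. With the workaround enabled it already
 * carries a barrier before the LL and one after the SC. Not atomic.
 */
static int sketch_cmpxchg_local(int *ptr, int old, int newval)
{
	int cur;

	assert(ptr);
	if (SKETCH_LOONGSON3_WAR)
		sketch_barrier();	/* barrier before the "LL" */
	cur = *ptr;
	if (cur == old)
		*ptr = newval;
	if (SKETCH_LOONGSON3_WAR)
		sketch_barrier();	/* barrier after the "SC" */
	return cur;
}

/* Models the cmpxchg() wrapper after the patch. */
static int sketch_cmpxchg(int *ptr, int old, int newval)
{
	int res;

	if (!SKETCH_LOONGSON3_WAR)
		sketch_barrier();	/* only when the body has none */
	res = sketch_cmpxchg_local(ptr, old, newval);
	if (!SKETCH_LOONGSON3_WAR)
		sketch_barrier();
	return res;
}
```

Either setting of SKETCH_LOONGSON3_WAR yields two barriers per call in this
model; without the constant checks the workaround configuration would count
four.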
[PATCH 37/37] MIPS: Check Loongson3 LL/SC errata workaround correctness
When Loongson3 LL/SC errata workarounds are enabled (ie. CONFIG_CPU_LOONGSON3_WORKAROUNDS=y) run a tool to scan through the compiled kernel & ensure that the workaround is applied correctly. That is, ensure that: - Every LL or LLD instruction is preceded by a sync instruction. - Any branches from within an LL/SC loop to outside of that loop target a sync instruction. Reasoning for these conditions can be found by reading the comment above the definition of __SYNC_loongson3_war in arch/mips/include/asm/sync.h. This tool will help ensure that we don't inadvertently introduce code paths that miss the required workarounds. Signed-off-by: Paul Burton --- arch/mips/Makefile | 2 +- arch/mips/Makefile.postlink| 10 +- arch/mips/tools/.gitignore | 1 + arch/mips/tools/Makefile | 5 + arch/mips/tools/loongson3-llsc-check.c | 307 + 5 files changed, 323 insertions(+), 2 deletions(-) create mode 100644 arch/mips/tools/loongson3-llsc-check.c diff --git a/arch/mips/Makefile b/arch/mips/Makefile index cdc09b71febe..4ac0974cf902 100644 --- a/arch/mips/Makefile +++ b/arch/mips/Makefile @@ -13,7 +13,7 @@ # archscripts: scripts_basic - $(Q)$(MAKE) $(build)=arch/mips/tools elf-entry + $(Q)$(MAKE) $(build)=arch/mips/tools elf-entry loongson3-llsc-check $(Q)$(MAKE) $(build)=arch/mips/boot/tools relocs KBUILD_DEFCONFIG := 32r2el_defconfig diff --git a/arch/mips/Makefile.postlink b/arch/mips/Makefile.postlink index 4eea4188cb20..f03fdc95143e 100644 --- a/arch/mips/Makefile.postlink +++ b/arch/mips/Makefile.postlink @@ -3,7 +3,8 @@ # Post-link MIPS pass # === # -# 1. Insert relocations into vmlinux +# 1. Check that Loongson3 LL/SC workarounds are applied correctly +# 2. 
Insert relocations into vmlinux PHONY := __archpost __archpost: @@ -11,6 +12,10 @@ __archpost: -include include/config/auto.conf include scripts/Kbuild.include +CMD_LS3_LLSC = arch/mips/tools/loongson3-llsc-check +quiet_cmd_ls3_llsc = LLSCCHK $@ + cmd_ls3_llsc = $(CMD_LS3_LLSC) $@ + CMD_RELOCS = arch/mips/boot/tools/relocs quiet_cmd_relocs = RELOCS $@ cmd_relocs = $(CMD_RELOCS) $@ @@ -19,6 +24,9 @@ quiet_cmd_relocs = RELOCS $@ vmlinux: FORCE @true +ifeq ($(CONFIG_CPU_LOONGSON3_WORKAROUNDS),y) + $(call if_changed,ls3_llsc) +endif ifeq ($(CONFIG_RELOCATABLE),y) $(call if_changed,relocs) endif diff --git a/arch/mips/tools/.gitignore b/arch/mips/tools/.gitignore index 56d34ce4..b0209450d9ff 100644 --- a/arch/mips/tools/.gitignore +++ b/arch/mips/tools/.gitignore @@ -1 +1,2 @@ elf-entry +loongson3-llsc-check diff --git a/arch/mips/tools/Makefile b/arch/mips/tools/Makefile index 3baee4bc6775..aaef688749f5 100644 --- a/arch/mips/tools/Makefile +++ b/arch/mips/tools/Makefile @@ -3,3 +3,8 @@ hostprogs-y := elf-entry PHONY += elf-entry elf-entry: $(obj)/elf-entry @: + +hostprogs-$(CONFIG_CPU_LOONGSON3_WORKAROUNDS) += loongson3-llsc-check +PHONY += loongson3-llsc-check +loongson3-llsc-check: $(obj)/loongson3-llsc-check + @: diff --git a/arch/mips/tools/loongson3-llsc-check.c b/arch/mips/tools/loongson3-llsc-check.c new file mode 100644 index ..0ebddd0ae46f --- /dev/null +++ b/arch/mips/tools/loongson3-llsc-check.c @@ -0,0 +1,307 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#ifdef be32toh +/* If libc provides le{16,32,64}toh() then we'll use them */ +#elif BYTE_ORDER == LITTLE_ENDIAN +# define le16toh(x)(x) +# define le32toh(x)(x) +# define le64toh(x)(x) +#elif BYTE_ORDER == BIG_ENDIAN +# define le16toh(x)bswap_16(x) +# define le32toh(x)bswap_32(x) +# define le64toh(x)bswap_64(x) +#endif + +/* MIPS opcodes, in bits 31:26 of an 
instruction */ +#define OP_SPECIAL 0x00 +#define OP_REGIMM 0x01 +#define OP_BEQ 0x04 +#define OP_BNE 0x05 +#define OP_BLEZ0x06 +#define OP_BGTZ0x07 +#define OP_BEQL0x14 +#define OP_BNEL0x15 +#define OP_BLEZL 0x16 +#define OP_BGTZL 0x17 +#define OP_LL 0x30 +#define OP_LLD 0x34 +#define OP_SC 0x38 +#define OP_SCD 0x3c + +/* Bits 20:16 of OP_REGIMM instructions */ +#define REGIMM_BLTZ0x00 +#define REGIMM_BGEZ0x01 +#define REGIMM_BLTZL 0x02 +#define REGIMM_BGEZL 0x03 +#define REGIMM_BLTZAL 0x10 +#define REGIMM_BGEZAL 0x11 +#define REGIMM_BLTZALL 0x12 +#define REGIMM_BGEZALL 0x13 + +/* Bits 5:0 of OP_SPECIAL instructions */ +#define SPECIAL_SYNC 0x0f + +static void usage(FILE *f) +{ + fprintf(f, "Usage: loongson3-llsc-check /path/to/vmlinux\n"); +} + +static int se16(uint16_t x) +{ + return (int16_t)x; +} + +sta
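The first of the two checks described in the commit message (every LL or
LLD is preceded by a sync) can be sketched from the opcode definitions
quoted above. This is not the tool's code, which is cut off in this
archive: the function names are invented, "preceded" is assumed to mean
"the immediately preceding instruction word", and the input is assumed to
be an array of instruction words already converted to host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode values as listed in the patch (bits 31:26 and bits 5:0). */
#define OP_SPECIAL	0x00
#define OP_LL		0x30
#define OP_LLD		0x34
#define SPECIAL_SYNC	0x0f

static unsigned int insn_opcode(uint32_t insn)
{
	return insn >> 26;		/* major opcode, bits 31:26 */
}

static int is_ll(uint32_t insn)
{
	unsigned int op = insn_opcode(insn);

	return op == OP_LL || op == OP_LLD;
}

static int is_sync(uint32_t insn)
{
	/* SPECIAL major opcode with function field (bits 5:0) == sync */
	return insn_opcode(insn) == OP_SPECIAL &&
	       (insn & 0x3f) == SPECIAL_SYNC;
}

/*
 * Hypothetical check: returns the index of the first LL/LLD that is not
 * immediately preceded by a sync, or -1 if every one is guarded.
 */
static long first_unguarded_ll(const uint32_t *code, size_t n)
{
	size_t i;

	assert(code || n == 0);
	for (i = 0; i < n; i++) {
		if (!is_ll(code[i]))
			continue;
		if (i == 0 || !is_sync(code[i - 1]))
			return (long)i;
	}
	return -1;
}
```

For example, the word 0x0000000f decodes as sync and 0xc1490000 as
"ll t1, 0(t2)", so the pair passes, while a nop (0x00000000) followed by
the same ll is reported at index 1.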
[PATCH 23/37] MIPS: bitops: Use the BIT() macro
Use the BIT() macro in asm/bitops.h rather than open-coding its equivalent. Signed-off-by: Paul Burton --- arch/mips/include/asm/bitops.h | 31 --- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 0b0ce0adce8f..35582afc057b 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -13,6 +13,7 @@ #error only can be included directly #endif +#include #include #include #include @@ -70,7 +71,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) " beqzl %0, 1b \n" " .setpop \n" : "=" (temp), "=" GCC_OFF_SMALL_ASM() (*m) - : "ir" (1UL << bit), GCC_OFF_SMALL_ASM() (*m) + : "ir" (BIT(bit)), GCC_OFF_SMALL_ASM() (*m) : __LLSC_CLOBBER); return; } @@ -99,7 +100,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) " " __SC "%0, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) - : "ir" (1UL << bit) + : "ir" (BIT(bit)) : __LLSC_CLOBBER); } while (unlikely(!temp)); } @@ -135,7 +136,7 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr) " beqzl %0, 1b \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) - : "ir" (~(1UL << bit)) + : "ir" (~(BIT(bit))) : __LLSC_CLOBBER); return; } @@ -164,7 +165,7 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr) " " __SC "%0, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) - : "ir" (~(1UL << bit)) + : "ir" (~(BIT(bit))) : __LLSC_CLOBBER); } while (unlikely(!temp)); } @@ -213,7 +214,7 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr) " beqzl %0, 1b \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) - : "ir" (1UL << bit) + : "ir" (BIT(bit)) : __LLSC_CLOBBER); return; } @@ -228,7 +229,7 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr) " " __SC "%0, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) - : "ir" (1UL << 
bit) + : "ir" (BIT(bit)) : __LLSC_CLOBBER); } while (unlikely(!temp)); } @@ -261,7 +262,7 @@ static inline int test_and_set_bit_lock(unsigned long nr, " and %2, %0, %3 \n" " .setpop \n" : "=" (temp), "+m" (*m), "=" (res) - : "ir" (1UL << bit) + : "ir" (BIT(bit)) : __LLSC_CLOBBER); } else { loongson_llsc_mb(); @@ -274,11 +275,11 @@ static inline int test_and_set_bit_lock(unsigned long nr, " " __SC "%2, %1 \n" " .setpop \n" : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) - : "ir" (1UL << bit) + : "ir" (BIT(bit)) : __LLSC_CLOBBER); } while (unlikely(!res)); - res = temp & (1UL << bit); + res = temp & BIT(bit); } smp_llsc_mb(); @@ -332,7 +333,7 @@ static inline int test_and_clear_bit(unsigned long nr, "
[PATCH 28/37] MIPS: bitops: Use smp_mb__before_atomic in test_* ops
Use smp_mb__before_atomic() rather than smp_mb__before_llsc() in test_and_set_bit(), test_and_clear_bit() & test_and_change_bit(). The _atomic() versions make semantic sense in these cases, and will allow a later patch to omit redundant barriers for Loongson3 systems that already include a barrier within __test_bit_op(). Signed-off-by: Paul Burton --- arch/mips/include/asm/bitops.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 9e967d6622c8..e6d97238a321 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -209,7 +209,7 @@ static inline int test_and_set_bit_lock(unsigned long nr, static inline int test_and_set_bit(unsigned long nr, volatile unsigned long *addr) { - smp_mb__before_llsc(); + smp_mb__before_atomic(); return test_and_set_bit_lock(nr, addr); } @@ -228,7 +228,7 @@ static inline int test_and_clear_bit(unsigned long nr, int bit = nr % BITS_PER_LONG; unsigned long res, orig; - smp_mb__before_llsc(); + smp_mb__before_atomic(); if (!kernel_uses_llsc) { res = __mips_test_and_clear_bit(nr, addr); @@ -265,7 +265,7 @@ static inline int test_and_change_bit(unsigned long nr, int bit = nr % BITS_PER_LONG; unsigned long res, orig; - smp_mb__before_llsc(); + smp_mb__before_atomic(); if (!kernel_uses_llsc) { res = __mips_test_and_change_bit(nr, addr); -- 2.23.0
[PATCH 16/37] MIPS: bitops: Use generic builtin ffs/fls; drop cpu_has_clo_clz
The MIPS-specific implementations of __ffs(), ffs(), __fls() & fls() make use of the MIPS clz instruction where possible. They do this via inline asm, but in any configuration in which the kernel is built for a MIPS32 or MIPS64 release 1 or higher instruction set we know that these instructions are available & can be emitted using the __builtin_clz() function & other associated builtins which are provided by all currently supported versions of gcc. When targeting an older instruction set GCC will generate a longer code sequence similar to the fallback cases we have in our implementations. As such, remove our custom implementations of these functions & use the generic versions built atop compiler builtins. This allows us to drop a significant chunk of code, along with the cpu_has_clo_clz feature macro which was only used by these functions. The only thing we lose here is the ability for kernels built to target a pre-r1 ISA to opportunistically make use of clz when running on a CPU that implements it. This seems like a small cost, and well worth paying to simplify the code. 
Signed-off-by: Paul Burton --- arch/mips/include/asm/bitops.h| 146 +- arch/mips/include/asm/cpu-features.h | 10 -- .../asm/mach-malta/cpu-feature-overrides.h| 2 - 3 files changed, 4 insertions(+), 154 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 985d6a02f9ea..4b618afbfa5b 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -491,149 +491,11 @@ static inline void __clear_bit_unlock(unsigned long nr, volatile unsigned long * nudge_writes(); } -/* - * Return the bit position (0..63) of the most significant 1 bit in a word - * Returns -1 if no 1 bit exists - */ -static __always_inline unsigned long __fls(unsigned long word) -{ - int num; - - if (BITS_PER_LONG == 32 && !__builtin_constant_p(word) && - __builtin_constant_p(cpu_has_clo_clz) && cpu_has_clo_clz) { - __asm__( - " .setpush\n" - " .set"MIPS_ISA_LEVEL"\n" - " clz %0, %1 \n" - " .setpop \n" - : "=r" (num) - : "r" (word)); - - return 31 - num; - } - - if (BITS_PER_LONG == 64 && !__builtin_constant_p(word) && - __builtin_constant_p(cpu_has_mips64) && cpu_has_mips64) { - __asm__( - " .setpush\n" - " .set"MIPS_ISA_LEVEL"\n" - " dclz%0, %1 \n" - " .setpop \n" - : "=r" (num) - : "r" (word)); - - return 63 - num; - } - - num = BITS_PER_LONG - 1; - -#if BITS_PER_LONG == 64 - if (!(word & (~0ul << 32))) { - num -= 32; - word <<= 32; - } -#endif - if (!(word & (~0ul << (BITS_PER_LONG-16 { - num -= 16; - word <<= 16; - } - if (!(word & (~0ul << (BITS_PER_LONG-8 { - num -= 8; - word <<= 8; - } - if (!(word & (~0ul << (BITS_PER_LONG-4 { - num -= 4; - word <<= 4; - } - if (!(word & (~0ul << (BITS_PER_LONG-2 { - num -= 2; - word <<= 2; - } - if (!(word & (~0ul << (BITS_PER_LONG-1 - num -= 1; - return num; -} - -/* - * __ffs - find first bit in word. - * @word: The word to search - * - * Returns 0..SZLONG-1 - * Undefined if no bit exists, so code should check against 0 first. 
- */ -static __always_inline unsigned long __ffs(unsigned long word) -{ - return __fls(word & -word); -} - -/* - * fls - find last bit set. - * @word: The word to search - * - * This is defined the same way as ffs. - * Note fls(0) = 0, fls(1) = 1, fls(0x8000) = 32. - */ -static inline int fls(unsigned int x) -{ - int r; - - if (!__builtin_constant_p(x) && - __builtin_constant_p(cpu_has_clo_clz) && cpu_has_clo_clz) { - __asm__( - " .setpush\n" - " .set"MIPS_ISA_LEVEL"\n" - " clz %0, %1
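For reference, the builtin-based approach the commit message describes can be sketched roughly as follows. This is a minimal illustration under the assumption of a GCC-compatible compiler, not the kernel's generic headers themselves; every sketch_* name is made up for the example.

```c
#include <limits.h>

#define SKETCH_BITS_PER_LONG	((int)(sizeof(unsigned long) * CHAR_BIT))

/* 0-based index of the most significant set bit; undefined for word == 0. */
static inline unsigned long sketch_fls_long(unsigned long word)
{
	return SKETCH_BITS_PER_LONG - 1 - __builtin_clzl(word);
}

/* 0-based index of the least significant set bit; undefined for word == 0. */
static inline unsigned long sketch_ffs_long(unsigned long word)
{
	return __builtin_ctzl(word);
}

/* 1-based position of the most significant set bit; returns 0 for x == 0. */
static inline int sketch_fls(unsigned int x)
{
	return x ? (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x) : 0;
}

/* 1-based position of the least significant set bit; returns 0 for x == 0. */
static inline int sketch_ffs(int x)
{
	return __builtin_ffs(x);
}
```

When the target ISA provides clz the compiler can emit it directly for these builtins; otherwise it falls back to a longer sequence, which is the trade-off described in the commit message.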
[PATCH 27/37] MIPS: bitops: Emit Loongson3 sync workarounds within asm
Generate the sync instructions required to workaround Loongson3 LL/SC errata within inline asm blocks, which feels a little safer than doing it from C where strictly speaking the compiler would be well within its rights to insert a memory access between the separate asm statements we previously had, containing sync & ll instructions respectively. Signed-off-by: Paul Burton --- arch/mips/include/asm/bitops.h | 11 ++- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 59fe1d5d4fc9..9e967d6622c8 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -31,6 +31,7 @@ asm volatile( \ " .setpush\n" \ " .set" MIPS_ISA_LEVEL " \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " __LL "%0, %1 \n" \ " " insn " \n" \ " " __SC "%0, %1 \n" \ @@ -47,6 +48,7 @@ asm volatile( \ " .setpush\n" \ " .set" MIPS_ISA_LEVEL " \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " __LL ll_dst ", %2\n" \ " " insn " \n" \ " " __SC "%1, %2 \n" \ @@ -96,12 +98,10 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) } if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit) && (bit >= 16)) { - loongson_llsc_mb(); __bit_op(*m, __INS "%0, %3, %2, 1", "i"(bit), "r"(~0)); return; } - loongson_llsc_mb(); __bit_op(*m, "or\t%0, %2", "ir"(BIT(bit))); } @@ -126,12 +126,10 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr) } if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit)) { - loongson_llsc_mb(); __bit_op(*m, __INS "%0, $0, %2, 1", "i"(bit)); return; } - loongson_llsc_mb(); __bit_op(*m, "and\t%0, %2", "ir"(~BIT(bit))); } @@ -168,7 +166,6 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr) return; } - loongson_llsc_mb(); __bit_op(*m, "xor\t%0, %2", "ir"(BIT(bit))); } @@ -190,7 +187,6 @@ static inline int test_and_set_bit_lock(unsigned long nr, if (!kernel_uses_llsc) { res = __mips_test_and_set_bit_lock(nr, addr); } 
else { - loongson_llsc_mb(); orig = __test_bit_op(*m, "%0", "or\t%1, %0, %3", "ir"(BIT(bit))); @@ -237,13 +233,11 @@ static inline int test_and_clear_bit(unsigned long nr, if (!kernel_uses_llsc) { res = __mips_test_and_clear_bit(nr, addr); } else if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(nr)) { - loongson_llsc_mb(); res = __test_bit_op(*m, "%1", __EXT "%0, %1, %3, 1;" __INS "%1, $0, %3, 1", "i"(bit)); } else { - loongson_llsc_mb(); orig = __test_bit_op(*m, "%0", "or\t%1, %0, %3;" "xor\t%1, %1, %3", @@ -276,7 +270,6 @@ static inline int test_and_change_bit(unsigned long nr, if (!kernel_uses_llsc) { res = __mips_test_and_change_bit(nr, addr); } else { - loongson_llsc_mb(); orig = __test_bit_op(*m, "%0", "xor\t%1, %0, %3", "ir"(BIT(bit))); -- 2.23.0
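The idea of keeping the barrier and the LL/SC loop in one asm statement can be illustrated without any MIPS toolchain by looking only at how the template string is assembled. The sketch below is an assumption-laden stand-in: SKETCH_SYNC_WAR plays the role of __SYNC(full, loongson3_war) and is assumed to expand to a sync line only when the workaround is configured; none of these names exist in the kernel.

```c
#include <string.h>

/*
 * Hypothetical stand-in for __SYNC(full, loongson3_war): a "sync" line when
 * the workaround is enabled, an empty string otherwise.
 */
#ifdef SKETCH_LOONGSON3_WORKAROUNDS
# define SKETCH_SYNC_WAR	"\tsync\n"
#else
# define SKETCH_SYNC_WAR	""
#endif

/*
 * Adjacent string literals are concatenated into a single template, so the
 * barrier and the LL/SC loop end up in the same asm statement and the
 * compiler has no opportunity to place a memory access between them.
 */
#define SKETCH_BIT_OP_TEMPLATE(insn)		\
	SKETCH_SYNC_WAR				\
	"1:\tll\t%0, %1\n"			\
	"\t" insn "\n"				\
	"\tsc\t%0, %1\n"			\
	"\tbeqz\t%0, 1b\n"

/* Returns the template that a set_bit-style operation would hand to asm(). */
static const char *sketch_set_bit_template(void)
{
	return SKETCH_BIT_OP_TEMPLATE("or\t%0, %2");
}
```

Building with -DSKETCH_LOONGSON3_WORKAROUNDS prepends the sync line to the same string; callers do not change either way.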
[PATCH 32/37] MIPS: syscall: Emit Loongson3 sync workarounds within asm
Generate the sync instructions required to workaround Loongson3 LL/SC errata within inline asm blocks, which feels a little safer than doing it from C where strictly speaking the compiler would be well within its rights to insert a memory access between the separate asm statements we previously had, containing sync & ll instructions respectively. Signed-off-by: Paul Burton --- arch/mips/kernel/syscall.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/mips/kernel/syscall.c b/arch/mips/kernel/syscall.c index b0e25e913bdb..3ea288ca35f1 100644 --- a/arch/mips/kernel/syscall.c +++ b/arch/mips/kernel/syscall.c @@ -37,6 +37,7 @@ #include #include #include +#include #include #include @@ -132,12 +133,12 @@ static inline int mips_atomic_set(unsigned long addr, unsigned long new) [efault] "i" (-EFAULT) : "memory"); } else if (cpu_has_llsc) { - loongson_llsc_mb(); __asm__ __volatile__ ( " .setpush\n" " .set"MIPS_ISA_ARCH_LEVEL" \n" " li %[err], 0 \n" "1: \n" + " " __SYNC(full, loongson3_war) " \n" user_ll("%[old]", "(%[addr])") " move%[tmp], %[new] \n" "2: \n" -- 2.23.0
[PATCH 17/37] MIPS: bitops: Handle !kernel_uses_llsc first
Reorder conditions in our various bitops functions that check kernel_uses_llsc such that they handle the !kernel_uses_llsc case first. This allows us to avoid the need to duplicate the kernel_uses_llsc check in all the other cases. For functions that don't involve barriers common to the various implementations, we switch to returning from within each if block making each case easier to read in isolation. Signed-off-by: Paul Burton --- arch/mips/include/asm/bitops.h | 213 - 1 file changed, 105 insertions(+), 108 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 4b618afbfa5b..d3f3f37ca0b1 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -52,11 +52,16 @@ int __mips_test_and_change_bit(unsigned long nr, */ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG); + unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); int bit = nr & SZLONG_MASK; unsigned long temp; - if (kernel_uses_llsc && R1_LLSC_WAR) { + if (!kernel_uses_llsc) { + __mips_set_bit(nr, addr); + return; + } + + if (R1_LLSC_WAR) { __asm__ __volatile__( " .setpush\n" " .setarch=r4000 \n" @@ -68,8 +73,11 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) : "=" (temp), "=" GCC_OFF_SMALL_ASM() (*m) : "ir" (1UL << bit), GCC_OFF_SMALL_ASM() (*m) : __LLSC_CLOBBER); + return; + } + #if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6) - } else if (kernel_uses_llsc && __builtin_constant_p(bit)) { + if (__builtin_constant_p(bit)) { loongson_llsc_mb(); do { __asm__ __volatile__( @@ -80,23 +88,23 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) : "ir" (bit), "r" (~0) : __LLSC_CLOBBER); } while (unlikely(!temp)); + return; + } #endif /* CONFIG_CPU_MIPSR2 || CONFIG_CPU_MIPSR6 */ - } else if (kernel_uses_llsc) { - loongson_llsc_mb(); - do { - __asm__ __volatile__( - " .setpush\n" - 
" .set"MIPS_ISA_ARCH_LEVEL" \n" - " " __LL "%0, %1 # set_bit \n" - " or %0, %2 \n" - " " __SC "%0, %1 \n" - " .setpop \n" - : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) - : "ir" (1UL << bit) - : __LLSC_CLOBBER); - } while (unlikely(!temp)); - } else - __mips_set_bit(nr, addr); + + loongson_llsc_mb(); + do { + __asm__ __volatile__( + " .setpush\n" + " .set"MIPS_ISA_ARCH_LEVEL" \n" + " " __LL "%0, %1 # set_bit \n" + " or %0, %2 \n" + " " __SC "%0, %1 \n" + " .setpop \n" + : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m) + : "ir" (1UL << bit) + : __LLSC_CLOBBER); + } while (unlikely(!temp)); } /* @@ -111,11 +119,16 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) */ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG); + unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); int bit = nr & SZLONG_MASK; unsigned long temp; - if (kernel_uses_llsc && R1_LLSC_WAR) { + if (!kernel_uses_llsc) { + __mips_clear_bit(nr, addr); + return; + } + + if (R1_LLSC_WAR) { __asm__ __volatile__( " .setpush\
[PATCH 26/37] MIPS: bitops: Use BIT_WORD() & BITS_PER_LONG
Rather than using custom SZLONG_LOG & SZLONG_MASK macros to shift & mask a bit index to form word & bit offsets respectively, make use of the standard BIT_WORD() & BITS_PER_LONG macros for the same purpose. volatile is added to the definition of pointers to the long-sized word we'll operate on, in order to prevent the compiler complaining that we cast away the volatile qualifier of the addr argument. This should have no effect on generated code, which in the LL/SC case is inline asm anyway & in the non-LLSC case access is constrained by compiler barriers provided by raw_local_irq_{save,restore}(). Signed-off-by: Paul Burton --- arch/mips/include/asm/bitops.h | 24 arch/mips/include/asm/llsc.h | 4 arch/mips/lib/bitops.c | 31 +-- 3 files changed, 25 insertions(+), 34 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 5701f8b41e87..59fe1d5d4fc9 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -87,8 +87,8 @@ int __mips_test_and_change_bit(unsigned long nr, */ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); - int bit = nr & SZLONG_MASK; + volatile unsigned long *m = &addr[BIT_WORD(nr)]; + int bit = nr % BITS_PER_LONG; if (!kernel_uses_llsc) { __mips_set_bit(nr, addr); @@ -117,8 +117,8 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr) */ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); - int bit = nr & SZLONG_MASK; + volatile unsigned long *m = &addr[BIT_WORD(nr)]; + int bit = nr % BITS_PER_LONG; if (!kernel_uses_llsc) { __mips_clear_bit(nr, addr); @@ -160,8 +160,8 @@ static inline void clear_bit_unlock(unsigned long nr, volatile unsigned long *ad */ static inline void change_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *)addr) + (nr >>
SZLONG_LOG); - int bit = nr & SZLONG_MASK; + volatile unsigned long *m = &addr[BIT_WORD(nr)]; + int bit = nr % BITS_PER_LONG; if (!kernel_uses_llsc) { __mips_change_bit(nr, addr); @@ -183,8 +183,8 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr) static inline int test_and_set_bit_lock(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); - int bit = nr & SZLONG_MASK; + volatile unsigned long *m = &addr[BIT_WORD(nr)]; + int bit = nr % BITS_PER_LONG; unsigned long res, orig; if (!kernel_uses_llsc) { @@ -228,8 +228,8 @@ static inline int test_and_set_bit(unsigned long nr, static inline int test_and_clear_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); - int bit = nr & SZLONG_MASK; + volatile unsigned long *m = &addr[BIT_WORD(nr)]; + int bit = nr % BITS_PER_LONG; unsigned long res, orig; smp_mb__before_llsc(); @@ -267,8 +267,8 @@ static inline int test_and_clear_bit(unsigned long nr, static inline int test_and_change_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); - int bit = nr & SZLONG_MASK; + volatile unsigned long *m = &addr[BIT_WORD(nr)]; + int bit = nr % BITS_PER_LONG; unsigned long res, orig; smp_mb__before_llsc(); diff --git a/arch/mips/include/asm/llsc.h b/arch/mips/include/asm/llsc.h index d240a4a2d1c4..c49738bc3bda 100644 --- a/arch/mips/include/asm/llsc.h +++ b/arch/mips/include/asm/llsc.h @@ -12,15 +12,11 @@ #include #if _MIPS_SZLONG == 32 -#define SZLONG_LOG 5 -#define SZLONG_MASK 31UL #define __LL "ll " #define __SC "sc " #define __INS "ins" #define __EXT "ext" #elif _MIPS_SZLONG == 64 -#define SZLONG_LOG 6 -#define SZLONG_MASK 63UL #define __LL "lld" #define __SC "scd" #define __INS "dins " diff --git a/arch/mips/lib/bitops.c b/arch/mips/lib/bitops.c index fba402c0879d..116d0bd8b2ae 100644 --- a/arch/mips/lib/bitops.c +++ 
b/arch/mips/lib/bitops.c @@ -7,6 +7,7 @@ * Copyright (c) 1999, 2000 Silicon Graphics, Inc. */ #include +#include #include #include @@ -19,12 +20,11 @@ */ void __mips_set_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *a = (unsigned long *)addr; - unsigned bit = nr & SZLONG_MASK; + volatile un
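As a rough illustration of the word/bit split described in this patch, the sketch below indexes a multi-word bitmap with locally defined equivalents of BIT_WORD(), BIT() and BITS_PER_LONG. It is deliberately non-atomic, and the SKETCH_* macros and sketch_* functions are assumptions made for the example rather than the kernel's definitions.

```c
#include <limits.h>

#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BIT_WORD(nr)	((nr) / SKETCH_BITS_PER_LONG)
#define SKETCH_BIT(nr)		(1UL << (nr))

/* Non-atomic illustration only: set bit nr in a bitmap spanning several longs. */
static void sketch_set_bit(unsigned long nr, volatile unsigned long *addr)
{
	/* Keeping the volatile qualifier avoids casting it away from addr. */
	volatile unsigned long *m = &addr[SKETCH_BIT_WORD(nr)];
	int bit = nr % SKETCH_BITS_PER_LONG;

	*m |= SKETCH_BIT(bit);
}

/* Returns 1 if bit nr is set in the bitmap, 0 otherwise. */
static int sketch_test_bit(unsigned long nr, const volatile unsigned long *addr)
{
	return (addr[SKETCH_BIT_WORD(nr)] >> (nr % SKETCH_BITS_PER_LONG)) & 1UL;
}
```

Because the word size is a power of two, the division and modulo reduce to the same shift and mask that SZLONG_LOG and SZLONG_MASK expressed by hand.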
[PATCH 31/37] MIPS: futex: Emit Loongson3 sync workarounds within asm
Generate the sync instructions required to workaround Loongson3 LL/SC errata within inline asm blocks, which feels a little safer than doing it from C where strictly speaking the compiler would be well within its rights to insert a memory access between the separate asm statements we previously had, containing sync & ll instructions respectively. Signed-off-by: Paul Burton --- arch/mips/include/asm/futex.h | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/arch/mips/include/asm/futex.h b/arch/mips/include/asm/futex.h index b83b0397462d..45c3e3652f48 100644 --- a/arch/mips/include/asm/futex.h +++ b/arch/mips/include/asm/futex.h @@ -16,6 +16,7 @@ #include #include #include +#include #include #define __futex_atomic_op(insn, ret, oldval, uaddr, oparg) \ @@ -50,12 +51,12 @@ "i" (-EFAULT) \ : "memory");\ } else if (cpu_has_llsc) { \ - loongson_llsc_mb(); \ __asm__ __volatile__( \ " .setpush\n" \ " .setnoat\n" \ " .setpush\n" \ " .set"MIPS_ISA_ARCH_LEVEL" \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: "user_ll("%1", "%4")" # __futex_atomic_op\n"\ " .setpop \n" \ " " insn " \n" \ @@ -164,13 +165,13 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, "i" (-EFAULT) : "memory"); } else if (cpu_has_llsc) { - loongson_llsc_mb(); __asm__ __volatile__( "# futex_atomic_cmpxchg_inatomic\n" " .setpush\n" " .setnoat\n" " .setpush\n" " .set"MIPS_ISA_ARCH_LEVEL" \n" + " " __SYNC(full, loongson3_war) " \n" "1: "user_ll("%1", "%3")" \n" " bne %1, %z4, 3f \n" " .setpop \n" @@ -178,8 +179,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, " .set"MIPS_ISA_ARCH_LEVEL" \n" "2: "user_sc("$1", "%2")" \n" " beqz$1, 1b \n" - __WEAK_LLSC_MB - "3: \n" + "3: " __SYNC_ELSE(full, loongson3_war, __WEAK_LLSC_MB) "\n" " .insn \n" " .setpop \n" " .section .fixup,\"ax\" \n" @@ -194,7 +194,6 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, : GCC_OFF_SMALL_ASM() (*uaddr), "Jr" (oldval), "Jr" (newval), "i" (-EFAULT) : "memory"); - loongson_llsc_mb(); 
} else return -ENOSYS; -- 2.23.0
[PATCH 29/37] MIPS: cmpxchg: Emit Loongson3 sync workarounds within asm
Generate the sync instructions required to workaround Loongson3 LL/SC errata within inline asm blocks, which feels a little safer than doing it from C where strictly speaking the compiler would be well within its rights to insert a memory access between the separate asm statements we previously had, containing sync & ll instructions respectively. Signed-off-by: Paul Burton --- arch/mips/include/asm/cmpxchg.h | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/arch/mips/include/asm/cmpxchg.h b/arch/mips/include/asm/cmpxchg.h index 5d3f0e3513b4..fc121d20a980 100644 --- a/arch/mips/include/asm/cmpxchg.h +++ b/arch/mips/include/asm/cmpxchg.h @@ -12,6 +12,7 @@ #include #include #include +#include #include /* @@ -36,12 +37,12 @@ extern unsigned long __xchg_called_with_bad_pointer(void) __typeof(*(m)) __ret; \ \ if (kernel_uses_llsc) { \ - loongson_llsc_mb(); \ __asm__ __volatile__( \ " .setpush\n" \ " .setnoat\n" \ " .setpush\n" \ " .set" MIPS_ISA_ARCH_LEVEL " \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " ld " %0, %2 # __xchg_asm\n" \ " .setpop \n" \ " move$1, %z3 \n" \ @@ -108,12 +109,12 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x, __typeof(*(m)) __ret; \ \ if (kernel_uses_llsc) { \ - loongson_llsc_mb(); \ __asm__ __volatile__( \ " .setpush\n" \ " .setnoat\n" \ " .setpush\n" \ " .set"MIPS_ISA_ARCH_LEVEL" \n" \ + " " __SYNC(full, loongson3_war) " \n" \ "1: " ld " %0, %2 # __cmpxchg_asm \n" \ " bne %0, %z3, 2f \n" \ " .setpop \n" \ @@ -122,11 +123,10 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x, " " st " $1, %1 \n" \ "\t" __SC_BEQZ "$1, 1b \n" \ " .setpop \n" \ - "2: \n" \ + "2: " __SYNC(full, loongson3_war) " \n" \ : "=" (__ret), "=" GCC_OFF_SMALL_ASM() (*m) \ : GCC_OFF_SMALL_ASM() (*m), "Jr" (old), "Jr" (new) \ : __LLSC_CLOBBER); \ - loongson_llsc_mb(); \ } else {\ unsigned long __flags; \ \ @@ -222,11 +222,11 @@ static inline unsigned long __cmpxchg64(volatile void *ptr, */ 
local_irq_save(flags); - loongson_llsc_mb(); asm volatile( " .setpush\n" " .set" MIPS_ISA_ARCH_LEVEL " \n" /* Load 64 bits from ptr */ + " " __SYNC(full, loongson3_war) " \n" "1: lld %L0, %3 # __cmpxchg64 \n" /* * Split the 64 bit value we loaded into the 2 registers that hold the @@ -260,7 +260,7 @@ static inline unsigned long __cmpxchg64(volatile void *ptr, /* If we failed, loop! */ "\t" __SC_BEQZ "%L1, 1b \n&quo
[PATCH 34/37] MIPS: barrier: Make __smp_mb__before_atomic() a no-op for Loongson3
Loongson3 systems with CONFIG_CPU_LOONGSON3_WORKAROUNDS enabled already emit a full completion barrier as part of the inline assembly containing LL/SC loops for atomic operations. As such the barrier emitted by __smp_mb__before_atomic() is redundant, and we can remove it.

Signed-off-by: Paul Burton
---
 arch/mips/include/asm/barrier.h | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 1a99a6c5b5dd..f3b5aa0938c1 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -118,7 +118,17 @@ static inline void wmb(void)
 #define nudge_writes() mb()
 #endif
 
-#define __smp_mb__before_atomic() __smp_mb__before_llsc()
+/*
+ * In the Loongson3 LL/SC workaround case, all of our LL/SC loops already have
+ * a completion barrier immediately preceding the LL instruction. Therefore we
+ * can skip emitting a barrier from __smp_mb__before_atomic().
+ */
+#ifdef CONFIG_CPU_LOONGSON3_WORKAROUNDS
+# define __smp_mb__before_atomic()
+#else
+# define __smp_mb__before_atomic() __smp_mb__before_llsc()
+#endif
+
 #define __smp_mb__after_atomic() smp_llsc_mb()
 
 static inline void sync_ginv(void)
-- 
2.23.0
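The effect of the conditional definition can be seen in a small portable sketch. The barrier here is replaced by a counter so the behaviour is observable; SKETCH_LOONGSON3_WORKAROUNDS and the sketch_* names are hypothetical stand-ins, not kernel symbols.

```c
/* Counts how many times the stand-in barrier would have been emitted. */
static int sketch_barriers_emitted;

/* Hypothetical stand-in for the real barrier primitive. */
#define sketch_mb_before_llsc()	((void)(sketch_barriers_emitted++))

#ifdef SKETCH_LOONGSON3_WORKAROUNDS
/* The LL/SC loop already begins with a sync, so emit nothing here. */
# define sketch_mb_before_atomic()	do { } while (0)
#else
# define sketch_mb_before_atomic()	sketch_mb_before_llsc()
#endif
```

Callers are written once against sketch_mb_before_atomic(); whether a barrier is actually emitted is decided entirely at build time.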
[PATCH 08/37] MIPS: barrier: Clean up sync_ginv()
Use the new __SYNC() infrastructure to implement sync_ginv(), for consistency with much of the rest of asm/barrier.h.

Signed-off-by: Paul Burton
---
 arch/mips/include/asm/barrier.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index a117c6d95038..c7e05e832da9 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -163,7 +163,7 @@ static inline void wmb(void)
 
 static inline void sync_ginv(void)
 {
-	asm volatile("sync\t%0" :: "i"(__SYNC_ginv));
+	asm volatile(__SYNC(ginv, always));
 }
 
 #include
-- 
2.23.0
[PATCH 10/37] MIPS: atomic: Handle !kernel_uses_llsc first
Handle the !kernel_uses_llsc path first in our ATOMIC_OP(), ATOMIC_OP_RETURN() & ATOMIC_FETCH_OP() macros & return from within the block. This allows us to de-indent the kernel_uses_llsc path by one level which will be useful when making further changes. Signed-off-by: Paul Burton --- arch/mips/include/asm/atomic.h | 99 +- 1 file changed, 49 insertions(+), 50 deletions(-) diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h index 2d2a8a74c51b..ace2ea005588 100644 --- a/arch/mips/include/asm/atomic.h +++ b/arch/mips/include/asm/atomic.h @@ -45,51 +45,36 @@ #define ATOMIC_OP(op, c_op, asm_op)\ static __inline__ void atomic_##op(int i, atomic_t * v) \ { \ - if (kernel_uses_llsc) { \ - int temp; \ + int temp; \ \ - loongson_llsc_mb(); \ - __asm__ __volatile__( \ - " .setpush\n" \ - " .set"MIPS_ISA_LEVEL"\n" \ - "1: ll %0, %1 # atomic_" #op "\n" \ - " " #asm_op " %0, %2 \n" \ - " sc %0, %1 \n" \ - "\t" __SC_BEQZ "%0, 1b \n" \ - " .setpop \n" \ - : "=" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)\ - : "Ir" (i) : __LLSC_CLOBBER); \ - } else {\ + if (!kernel_uses_llsc) {\ unsigned long flags;\ \ raw_local_irq_save(flags); \ v->counter c_op i; \ raw_local_irq_restore(flags); \ + return; \ } \ + \ + loongson_llsc_mb(); \ + __asm__ __volatile__( \ + " .setpush\n" \ + " .set" MIPS_ISA_LEVEL " \n" \ + "1: ll %0, %1 # atomic_" #op "\n" \ + " " #asm_op " %0, %2 \n" \ + " sc %0, %1 \n" \ + "\t" __SC_BEQZ "%0, 1b \n" \ + " .setpop \n" \ + : "=" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)\ + : "Ir" (i) : __LLSC_CLOBBER); \ } #define ATOMIC_OP_RETURN(op, c_op, asm_op) \ static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \ { \ - int result; \ - \ - if (kernel_uses_llsc) { \ - int temp; \ + int temp, result; \ \ - loongson_llsc_mb(); \ - __asm__ __volatile__( \ - " .setpush\n" \ - " .set"MIPS_ISA_LEVEL"\n" \ - "1: ll %1, %2 # atomic_" #op "_return \n" \ - " " #asm_op " %0, %1, %3
[PATCH 24/37] MIPS: bitops: Avoid redundant zero-comparison for non-LLSC
The IRQ-disabling non-LLSC fallbacks for bitops on UP systems already return a zero or one, so there's no need to perform another comparison against zero. Move these comparisons into the LLSC paths to avoid the redundant work. Signed-off-by: Paul Burton --- arch/mips/include/asm/bitops.h | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 35582afc057b..3e5589320e83 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -264,6 +264,8 @@ static inline int test_and_set_bit_lock(unsigned long nr, : "=" (temp), "+m" (*m), "=" (res) : "ir" (BIT(bit)) : __LLSC_CLOBBER); + + res = res != 0; } else { loongson_llsc_mb(); do { @@ -279,12 +281,12 @@ static inline int test_and_set_bit_lock(unsigned long nr, : __LLSC_CLOBBER); } while (unlikely(!res)); - res = temp & BIT(bit); + res = (temp & BIT(bit)) != 0; } smp_llsc_mb(); - return res != 0; + return res; } /* @@ -335,6 +337,8 @@ static inline int test_and_clear_bit(unsigned long nr, : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) : "ir" (BIT(bit)) : __LLSC_CLOBBER); + + res = res != 0; } else if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(nr)) { loongson_llsc_mb(); do { @@ -363,12 +367,12 @@ static inline int test_and_clear_bit(unsigned long nr, : __LLSC_CLOBBER); } while (unlikely(!res)); - res = temp & BIT(bit); + res = (temp & BIT(bit)) != 0; } smp_llsc_mb(); - return res != 0; + return res; } /* @@ -403,6 +407,8 @@ static inline int test_and_change_bit(unsigned long nr, : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) : "ir" (BIT(bit)) : __LLSC_CLOBBER); + + res = res != 0; } else { loongson_llsc_mb(); do { @@ -418,12 +424,12 @@ static inline int test_and_change_bit(unsigned long nr, : __LLSC_CLOBBER); } while (unlikely(!res)); - res = temp & BIT(bit); + res = (temp & BIT(bit)) != 0; } smp_llsc_mb(); - return res != 0; + return res; } #include -- 2.23.0
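To make the reasoning concrete: the LL/SC paths end up holding the whole old word, so they are the ones that need to reduce it to 0 or 1. A minimal sketch, with SKETCH_BIT and sketch_old_bit as illustrative names only:

```c
#include <limits.h>

#define SKETCH_BITS_PER_LONG	((int)(sizeof(unsigned long) * CHAR_BIT))
#define SKETCH_BIT(nr)		(1UL << (nr))

/*
 * Stand-in for an LL/SC path: it holds the whole old word in temp and has to
 * reduce it to 0 or 1 itself before returning.
 */
static int sketch_old_bit(unsigned long temp, int bit)
{
	unsigned long res;

	res = (temp & SKETCH_BIT(bit)) != 0;
	return res;
}
```

Without the comparison, the masked word would be converted to int on return, which for high bit numbers on a 64-bit build would not yield 1.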
[PATCH 21/37] MIPS: bitops: Implement test_and_set_bit() in terms of _lock variant
The only difference between test_and_set_bit() & test_and_set_bit_lock() is memory ordering barrier semantics - the former provides a full barrier whilst the latter only provides acquire semantics. We can therefore implement test_and_set_bit() in terms of test_and_set_bit_lock() with the addition of the extra memory barrier. Do this in order to avoid duplicating logic. Signed-off-by: Paul Burton --- arch/mips/include/asm/bitops.h | 66 +++--- arch/mips/lib/bitops.c | 26 -- 2 files changed, 13 insertions(+), 79 deletions(-) diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h index 83fd1f1c3ab4..34d6fe3f18d0 100644 --- a/arch/mips/include/asm/bitops.h +++ b/arch/mips/include/asm/bitops.h @@ -31,8 +31,6 @@ void __mips_set_bit(unsigned long nr, volatile unsigned long *addr); void __mips_clear_bit(unsigned long nr, volatile unsigned long *addr); void __mips_change_bit(unsigned long nr, volatile unsigned long *addr); -int __mips_test_and_set_bit(unsigned long nr, - volatile unsigned long *addr); int __mips_test_and_set_bit_lock(unsigned long nr, volatile unsigned long *addr); int __mips_test_and_clear_bit(unsigned long nr, @@ -236,24 +234,22 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr) } /* - * test_and_set_bit - Set a bit and return its old value + * test_and_set_bit_lock - Set a bit and return its old value * @nr: Bit to set * @addr: Address to count from * - * This operation is atomic and cannot be reordered. - * It also implies a memory barrier. + * This operation is atomic and implies acquire ordering semantics + * after the memory operation. 
*/ -static inline int test_and_set_bit(unsigned long nr, +static inline int test_and_set_bit_lock(unsigned long nr, volatile unsigned long *addr) { unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); int bit = nr & SZLONG_MASK; unsigned long res, temp; - smp_mb__before_llsc(); - if (!kernel_uses_llsc) { - res = __mips_test_and_set_bit(nr, addr); + res = __mips_test_and_set_bit_lock(nr, addr); } else if (R1_LLSC_WAR) { __asm__ __volatile__( " .setpush\n" @@ -264,7 +260,7 @@ static inline int test_and_set_bit(unsigned long nr, " beqzl %2, 1b \n" " and %2, %0, %3 \n" " .setpop \n" - : "=" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=" (res) + : "=" (temp), "+m" (*m), "=" (res) : "r" (1UL << bit) : __LLSC_CLOBBER); } else { @@ -291,56 +287,20 @@ static inline int test_and_set_bit(unsigned long nr, } /* - * test_and_set_bit_lock - Set a bit and return its old value + * test_and_set_bit - Set a bit and return its old value * @nr: Bit to set * @addr: Address to count from * - * This operation is atomic and implies acquire ordering semantics - * after the memory operation. + * This operation is atomic and cannot be reordered. + * It also implies a memory barrier. */ -static inline int test_and_set_bit_lock(unsigned long nr, +static inline int test_and_set_bit(unsigned long nr, volatile unsigned long *addr) { - unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG); - int bit = nr & SZLONG_MASK; - unsigned long res, temp; - - if (!kernel_uses_llsc) { - res = __mips_test_and_set_bit_lock(nr, addr); - } else if (R1_LLSC_WAR) { - __asm__ __volatile__( - " .setpush\n" - " .setarch=r4000 \n" - "1: " __LL "%0, %1 # test_and_set_bit \n" - " or %2, %0, %3 \n" - " " __SC "%2, %1 \n" - " beqzl %2, 1b \n" - " and %2, %0, %3 \n" - " .setpop \n" - : "=" (temp), "+m" (*m), "=" (res) - : "r" (1UL << bit) - : __LLSC_CLOBBER); - } else { - do { - __asm__ __volatile__( - " .setpush
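The layering described in this patch can be approximated in portable C11 as below. This is only an analogy: C11 fences and acquire ordering are not identical to the kernel's barrier semantics, and the sketch_* functions are invented for the example.

```c
#include <limits.h>
#include <stdatomic.h>

#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/* Acquire-only variant: later accesses are not reordered before the RMW. */
static int sketch_test_and_set_bit_lock(unsigned long nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << (nr % SKETCH_BITS_PER_LONG);
	unsigned long old;

	old = atomic_fetch_or_explicit(&addr[nr / SKETCH_BITS_PER_LONG], mask,
				       memory_order_acquire);
	return (old & mask) != 0;
}

/* Stronger variant: a leading full fence, then the acquire variant. */
static int sketch_test_and_set_bit(unsigned long nr, atomic_ulong *addr)
{
	atomic_thread_fence(memory_order_seq_cst);
	return sketch_test_and_set_bit_lock(nr, addr);
}
```

The point of the structure is that the read-modify-write logic lives in one place and the stronger operation only adds ordering on top of it.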
[PATCH 15/37] MIPS: atomic: Deduplicate 32b & 64b read, set, xchg, cmpxchg
Remove the remaining duplication between 32b & 64b in asm/atomic.h by making use of an ATOMIC_OPS() macro to generate: - atomic_read()/atomic64_read() - atomic_set()/atomic64_set() - atomic_cmpxchg()/atomic64_cmpxchg() - atomic_xchg()/atomic64_xchg() This is consistent with the way all other functions in asm/atomic.h are generated, and ensures consistency between the 32b & 64b functions. Of note is that this results in the above now being static inline functions rather than macros. Signed-off-by: Paul Burton --- arch/mips/include/asm/atomic.h | 70 +- 1 file changed, 27 insertions(+), 43 deletions(-) diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h index 96ef50fa2817..e5ac88392d1f 100644 --- a/arch/mips/include/asm/atomic.h +++ b/arch/mips/include/asm/atomic.h @@ -24,24 +24,34 @@ #include #include -#define ATOMIC_INIT(i) { (i) } +#define ATOMIC_OPS(pfx, type) \ +static __always_inline type pfx##_read(const pfx##_t *v) \ +{ \ + return READ_ONCE(v->counter); \ +} \ + \ +static __always_inline void pfx##_set(pfx##_t *v, type i) \ +{ \ + WRITE_ONCE(v->counter, i); \ +} \ + \ +static __always_inline type pfx##_cmpxchg(pfx##_t *v, type o, type n) \ +{ \ + return cmpxchg(&v->counter, o, n); \ +} \ + \ +static __always_inline type pfx##_xchg(pfx##_t *v, type n) \ +{ \ + return xchg(&v->counter, n);\ +} -/* - * atomic_read - read atomic variable - * @v: pointer of type atomic_t - * - * Atomically reads the value of @v. - */ -#define atomic_read(v) READ_ONCE((v)->counter) +#define ATOMIC_INIT(i) { (i) } +ATOMIC_OPS(atomic, int) -/* - * atomic_set - set atomic variable - * @v: pointer of type atomic_t - * @i: required value - * - * Atomically sets the value of @v to @i. 
- */ -#define atomic_set(v, i) WRITE_ONCE((v)->counter, (i)) +#ifdef CONFIG_64BIT +# define ATOMIC64_INIT(i) { (i) } +ATOMIC_OPS(atomic64, s64) +#endif #define ATOMIC_OP(pfx, op, type, c_op, asm_op, ll, sc) \ static __inline__ void pfx##_##op(type i, pfx##_t * v) \ @@ -135,6 +145,7 @@ static __inline__ type pfx##_fetch_##op##_relaxed(type i, pfx##_t * v) \ return result; \ } +#undef ATOMIC_OPS #define ATOMIC_OPS(pfx, op, type, c_op, asm_op, ll, sc) \ ATOMIC_OP(pfx, op, type, c_op, asm_op, ll, sc) \ ATOMIC_OP_RETURN(pfx, op, type, c_op, asm_op, ll, sc) \ @@ -254,31 +265,4 @@ ATOMIC_SIP_OP(atomic64, s64, dsubu, lld, scd) #undef ATOMIC_SIP_OP -#define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n))) -#define atomic_xchg(v, new) (xchg(&((v)->counter), (new))) - -#ifdef CONFIG_64BIT - -#define ATOMIC64_INIT(i){ (i) } - -/* - * atomic64_read - read atomic variable - * @v: pointer of type atomic64_t - * - */ -#define atomic64_read(v) READ_ONCE((v)->counter) - -/* - * atomic64_set - set atomic variable - * @v: pointer of type atomic64_t - * @i: required value - */ -#define atomic64_set(v, i) WRITE_ONCE((v)->counter, (i)) - -#define atomic64_cmpxchg(v, o, n) \ - ((__typeof__((v)->counter))cmpxchg(&((v)->counter), (o), (n))) -#define atomic64_xchg(v, new) (xchg(&((v)->counter), (new))) - -#endif /* CONFIG_64BIT */ - #endif /* _ASM_ATOMIC_H */ -- 2.23.0
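The generation pattern used by ATOMIC_OPS() can be sketched in a self-contained form as follows. The types, the SKETCH_* macros and the simplified READ_ONCE()/WRITE_ONCE() stand-ins are assumptions made for illustration (they rely on the GNU __typeof__ extension) and cover only the read and set operations.

```c
#include <stdint.h>

typedef struct { int counter; } sketch_atomic_t;
typedef struct { int64_t counter; } sketch_atomic64_t;

/* Simplified stand-ins for READ_ONCE()/WRITE_ONCE(): force a volatile access. */
#define SKETCH_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define SKETCH_WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

/* One macro body generates the same accessors for each prefix/type pair. */
#define SKETCH_ATOMIC_OPS(pfx, type)					\
static inline type pfx##_read(const pfx##_t *v)				\
{									\
	return SKETCH_READ_ONCE(v->counter);				\
}									\
									\
static inline void pfx##_set(pfx##_t *v, type i)			\
{									\
	SKETCH_WRITE_ONCE(v->counter, i);				\
}

SKETCH_ATOMIC_OPS(sketch_atomic, int)
SKETCH_ATOMIC_OPS(sketch_atomic64, int64_t)
```

Because the results are real functions rather than macros, arguments are type-checked against the prefix's type, which is the behavioural change the commit message calls out.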
[PATCH 00/37] MIPS: barriers & atomics cleanups
This series consists of a bunch of cleanups to the way we handle memory
barriers (though no changes to the sync instructions we use to implement
them) & atomic memory accesses. One major goal was to ensure the
Loongson3 LL/SC errata workarounds are applied in a safe manner from
within inline-asm & that we can automatically verify the resulting
kernel binary looks reasonable. Many patches are cleanups found along
the way.

Applies atop v5.4-rc1.

Paul Burton (37):
  MIPS: Unify sc beqz definition
  MIPS: Use compact branch for LL/SC loops on MIPSr6+
  MIPS: barrier: Add __SYNC() infrastructure
  MIPS: barrier: Clean up rmb() & wmb() definitions
  MIPS: barrier: Clean up __smp_mb() definition
  MIPS: barrier: Remove fast_mb() Octeon #ifdef'ery
  MIPS: barrier: Clean up __sync() definition
  MIPS: barrier: Clean up sync_ginv()
  MIPS: atomic: Fix whitespace in ATOMIC_OP macros
  MIPS: atomic: Handle !kernel_uses_llsc first
  MIPS: atomic: Use one macro to generate 32b & 64b functions
  MIPS: atomic: Emit Loongson3 sync workarounds within asm
  MIPS: atomic: Use _atomic barriers in atomic_sub_if_positive()
  MIPS: atomic: Unify 32b & 64b sub_if_positive
  MIPS: atomic: Deduplicate 32b & 64b read, set, xchg, cmpxchg
  MIPS: bitops: Use generic builtin ffs/fls; drop cpu_has_clo_clz
  MIPS: bitops: Handle !kernel_uses_llsc first
  MIPS: bitops: Only use ins for bit 16 or higher
  MIPS: bitops: Use MIPS_ISA_REV, not #ifdefs
  MIPS: bitops: ins start position is always an immediate
  MIPS: bitops: Implement test_and_set_bit() in terms of _lock variant
  MIPS: bitops: Allow immediates in test_and_{set,clear,change}_bit
  MIPS: bitops: Use the BIT() macro
  MIPS: bitops: Avoid redundant zero-comparison for non-LLSC
  MIPS: bitops: Abstract LL/SC loops
  MIPS: bitops: Use BIT_WORD() & BITS_PER_LONG
  MIPS: bitops: Emit Loongson3 sync workarounds within asm
  MIPS: bitops: Use smp_mb__before_atomic in test_* ops
  MIPS: cmpxchg: Emit Loongson3 sync workarounds within asm
  MIPS: cmpxchg: Omit redundant barriers for Loongson3
  MIPS: futex: Emit Loongson3 sync workarounds within asm
  MIPS: syscall: Emit Loongson3 sync workarounds within asm
  MIPS: barrier: Remove loongson_llsc_mb()
  MIPS: barrier: Make __smp_mb__before_atomic() a no-op for Loongson3
  MIPS: genex: Add Loongson3 LL/SC workaround to ejtag_debug_handler
  MIPS: genex: Don't reload address unnecessarily
  MIPS: Check Loongson3 LL/SC errata workaround correctness

 arch/mips/Makefile                            |   2 +-
 arch/mips/Makefile.postlink                   |  10 +-
 arch/mips/include/asm/atomic.h                | 571 ++---
 arch/mips/include/asm/barrier.h               | 215 +--
 arch/mips/include/asm/bitops.h                | 593 --
 arch/mips/include/asm/cmpxchg.h               |  59 +-
 arch/mips/include/asm/cpu-features.h          |  10 -
 arch/mips/include/asm/futex.h                 |   9 +-
 arch/mips/include/asm/llsc.h                  |  19 +-
 .../asm/mach-malta/cpu-feature-overrides.h    |   2 -
 arch/mips/include/asm/sync.h                  | 207 ++
 arch/mips/kernel/genex.S                      |   6 +-
 arch/mips/kernel/pm-cps.c                     |  20 +-
 arch/mips/kernel/syscall.c                    |   3 +-
 arch/mips/lib/bitops.c                        |  57 +-
 arch/mips/loongson64/Platform                 |   2 +-
 arch/mips/tools/.gitignore                    |   1 +
 arch/mips/tools/Makefile                      |   5 +
 arch/mips/tools/loongson3-llsc-check.c        | 307 +
 19 files changed, 975 insertions(+), 1123 deletions(-)
 create mode 100644 arch/mips/include/asm/sync.h
 create mode 100644 arch/mips/tools/loongson3-llsc-check.c

-- 
2.23.0
[PATCH 25/37] MIPS: bitops: Abstract LL/SC loops
Introduce __bit_op() & __test_bit_op() macros which abstract away the
implementation of LL/SC loops. This cuts down on a lot of duplicate
boilerplate code, and also allows R10000_LLSC_WAR to be handled outside
of the individual bitop functions.

Signed-off-by: Paul Burton
---
 arch/mips/include/asm/bitops.h | 267 -
 1 file changed, 63 insertions(+), 204 deletions(-)

diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 3e5589320e83..5701f8b41e87 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -25,6 +25,41 @@
 #include
 #include
 
+#define __bit_op(mem, insn, inputs...) do {			\
+	unsigned long temp;					\
+								\
+	asm volatile(						\
+	"	.set	push			\n"		\
+	"	.set	" MIPS_ISA_LEVEL "	\n"		\
+	"1:	" __LL	"%0, %1			\n"		\
+	"	" insn	"			\n"		\
+	"	" __SC	"%0, %1			\n"		\
+	"	" __SC_BEQZ "%0, 1b		\n"		\
+	"	.set	pop			\n"		\
+	: "=&r"(temp), "+" GCC_OFF_SMALL_ASM()(mem)		\
+	: inputs						\
+	: __LLSC_CLOBBER);					\
+} while (0)
+
+#define __test_bit_op(mem, ll_dst, insn, inputs...) ({		\
+	unsigned long orig, temp;				\
+								\
+	asm volatile(						\
+	"	.set	push			\n"		\
+	"	.set	" MIPS_ISA_LEVEL "	\n"		\
+	"1:	" __LL	ll_dst ", %2		\n"		\
+	"	" insn	"			\n"		\
+	"	" __SC	"%1, %2			\n"		\
+	"	" __SC_BEQZ "%1, 1b		\n"		\
+	"	.set	pop			\n"		\
+	: "=&r"(orig), "=&r"(temp),				\
+	  "+" GCC_OFF_SMALL_ASM()(mem)				\
+	: inputs						\
+	: __LLSC_CLOBBER);					\
+								\
+	orig;							\
+})
+
 /*
  * These are the "slower" versions of the functions and are in bitops.c.
  * These functions call raw_local_irq_{save,restore}().
@@ -54,55 +89,20 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
 	int bit = nr & SZLONG_MASK;
-	unsigned long temp;
 
 	if (!kernel_uses_llsc) {
 		__mips_set_bit(nr, addr);
 		return;
 	}
 
-	if (R10000_LLSC_WAR) {
-		__asm__ __volatile__(
-		"	.set	push				\n"
-		"	.set	arch=r4000			\n"
-		"1:	" __LL "%0, %1		# set_bit	\n"
-		"	or	%0, %2				\n"
-		"	" __SC	"%0, %1				\n"
-		"	beqzl	%0, 1b				\n"
-		"	.set	pop				\n"
-		: "=&r" (temp), "=" GCC_OFF_SMALL_ASM() (*m)
-		: "ir" (BIT(bit)), GCC_OFF_SMALL_ASM() (*m)
-		: __LLSC_CLOBBER);
-		return;
-	}
-
 	if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit) && (bit >= 16)) {
 		loongson_llsc_mb();
-		do {
-			__asm__ __volatile__(
-			"	" __LL "%0, %1		# set_bit	\n"
-			"	" __INS "%0, %3, %2, 1			\n"
-			"	" __SC "%0, %1				\n"
-			: "=&r" (temp), "+" GCC_OFF_SM
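To illustrate the shape of the abstraction without MIPS inline asm, the following is a loose user-space analogue, not the patch's implementation. It assumes GCC or Clang (statement expressions and __atomic builtins), substitutes a weak compare-exchange retry loop for the LL/SC pair, and uses simplified, hypothetical macro signatures (bit_op(mem, op, mask) and test_bit_op(mem, op, mask)) in place of the insn/inputs form above.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/*
 * Analogue of __bit_op(): read the word, apply "op mask", and retry
 * until the conditional store succeeds. The retry loop lives here
 * once instead of being repeated in every bitop.
 */
#define bit_op(mem, op, mask) do {					\
	unsigned long old_ = __atomic_load_n((mem), __ATOMIC_RELAXED);	\
	unsigned long new_;						\
									\
	do {								\
		new_ = old_ op (mask);					\
	} while (!__atomic_compare_exchange_n((mem), &old_, new_, 1,	\
			__ATOMIC_RELAXED, __ATOMIC_RELAXED));		\
} while (0)

/* Analogue of __test_bit_op(): same loop, but yields the old word. */
#define test_bit_op(mem, op, mask) ({					\
	unsigned long old_ = __atomic_load_n((mem), __ATOMIC_RELAXED);	\
	unsigned long new_;						\
									\
	do {								\
		new_ = old_ op (mask);					\
	} while (!__atomic_compare_exchange_n((mem), &old_, new_, 1,	\
			__ATOMIC_SEQ_CST, __ATOMIC_RELAXED));		\
									\
	old_;								\
})

static inline void set_bit(unsigned long nr, unsigned long *addr)
{
	bit_op(&addr[nr / BITS_PER_LONG], |, 1UL << (nr % BITS_PER_LONG));
}

static inline void clear_bit(unsigned long nr, unsigned long *addr)
{
	bit_op(&addr[nr / BITS_PER_LONG], &, ~(1UL << (nr % BITS_PER_LONG)));
}

static inline int test_and_set_bit(unsigned long nr, unsigned long *addr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_LONG);
	unsigned long orig = test_bit_op(&addr[nr / BITS_PER_LONG], |, mask);

	/* Report whether the bit was already set before this call. */
	return (orig & mask) != 0;
}
```

Each bitop shrinks to a one-line use of the shared macro, which is also why a workaround applied inside the macro reaches every caller at once.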
[PATCH 09/37] MIPS: atomic: Fix whitespace in ATOMIC_OP macros
We define macros in asm/atomic.h which end each line with space
characters before a backslash to continue on the next line. Remove the
space characters leaving tabs as the whitespace used for conformity with
coding convention.

Signed-off-by: Paul Burton
---
 arch/mips/include/asm/atomic.h | 184 -
 1 file changed, 92 insertions(+), 92 deletions(-)

diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index 7578c807ef98..2d2a8a74c51b 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -42,102 +42,102 @@
  */
 #define atomic_set(v, i)	WRITE_ONCE((v)->counter, (i))
 
-#define ATOMIC_OP(op, c_op, asm_op)                                      \
-static __inline__ void atomic_##op(int i, atomic_t * v)                  \
-{                                                                        \
-	if (kernel_uses_llsc) {                                          \
-		int temp;                                                \
-                                                                         \
-		loongson_llsc_mb();                                      \
-		__asm__ __volatile__(                                    \
-		"	.set	push				\n"      \
-		"	.set	"MIPS_ISA_LEVEL"		\n"      \
-		"1:	ll	%0, %1		# atomic_" #op "\n"      \
-		"	" #asm_op " %0, %2			\n"      \
-		"	sc	%0, %1				\n"      \
-		"\t" __SC_BEQZ "%0, 1b				\n"      \
-		"	.set	pop				\n"      \
-		: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)     \
-		: "Ir" (i) : __LLSC_CLOBBER);                            \
-	} else {                                                         \
-		unsigned long flags;                                     \
-                                                                         \
-		raw_local_irq_save(flags);                               \
-		v->counter c_op i;                                       \
-		raw_local_irq_restore(flags);                            \
-	}                                                                \
+#define ATOMIC_OP(op, c_op, asm_op)					\
+static __inline__ void atomic_##op(int i, atomic_t * v)			\
+{									\
+	if (kernel_uses_llsc) {						\
+		int temp;						\
+									\
+		loongson_llsc_mb();					\
+		__asm__ __volatile__(					\
+		"	.set	push				\n"	\
+		"	.set	"MIPS_ISA_LEVEL"		\n"	\
+		"1:	ll	%0, %1		# atomic_" #op "\n"	\
+		"	" #asm_op " %0, %2			\n"	\
+		"	sc	%0, %1				\n"	\
+		"\t" __SC_BEQZ "%0, 1b				\n"	\
+		"	.set	pop				\n"	\
+		: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)	\
+		: "Ir" (i) : __LLSC_CLOBBER);				\
+	} else {							\
+		unsigned long flags;					\
+									\
+		raw_local_irq_save(flags);				\
+		v->counter c_op i;					\
+		raw_local_irq_restore(flags);				\
+	}								\
 }
 
-#define ATOMIC_OP_RETURN(op, c_op, asm_op)                               \
-static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v)  \
-{