Re: [PATCH 1/1] futex: remove duplicated code
Hi Jiri,

On Mon, May 15, 2017 at 03:07:42PM +0200, Jiri Slaby wrote:
> There is code duplicated over all architecture's headers for
> futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
> and comparison of the result.
>
> Remove this duplication and leave up to the arches only the needed
> assembly which is now in arch_futex_atomic_op_inuser.
>
> Note that s390 removed access_ok check in d12a29703 ("s390/uaccess:
> remove pointless access_ok() checks") as access_ok there returns true.
> We introduce it back to the helper for the sake of simplicity (it gets
> optimized away anyway).

Whilst I think this is a good idea, the code in question actually results
in undefined behaviour per the C spec and is reported by UBSAN. See my
patch fixing arm64 here (which I'd forgotten about):

https://www.spinics.net/lists/linux-arch/msg38564.html

But, as stated in the thread above, I think we should go a step further
and remove FUTEX_OP_{OR,ANDN,XOR,OPARG_SHIFT} altogether. They don't
appear to be used by userspace, and this whole thing is a total mess.

Any thoughts?

Will

_______________________________________________
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH 1/1] futex: remove duplicated code and fix UB
On Mon, Jun 26, 2017 at 02:02:31PM +0200, Jiri Slaby wrote:
> On 06/23/2017, 09:51 AM, Thomas Gleixner wrote:
> > On Wed, 21 Jun 2017, Jiri Slaby wrote:
> >> diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
> >> index f32b42e8725d..5bb2fd4674e7 100644
> >> --- a/arch/arm64/include/asm/futex.h
> >> +++ b/arch/arm64/include/asm/futex.h
> >> @@ -48,20 +48,10 @@ do {						\
> >> } while (0)
> >>
> >> static inline int
> >> -futex_atomic_op_inuser(unsigned int encoded_op, u32 __user *uaddr)
> >
> > That unsigned int seems to be a change from the arm64 tree in next. It's
> > not upstream and it'll cause a (easy to resolve) conflict.
>
> Ugh, I thought the arm64 is in upstream already. Note that this patch
> just takes what is in this arm64 fix and makes it effective for all
> architectures. So I will wait with v2 until it merges upstream.
>
> So, Will, will you incorporate Thomas' comments into your arm64 fix?

I wasn't planning to (it's already queued and I think they're just
cosmetic changes). The easiest thing is probably for you to make the
changes in the generic version when you post v2.

Will
Re: [PATCH 1/1] futex: remove duplicated code
On Wed, May 17, 2017 at 10:01:29AM +0200, Jiri Slaby wrote:
> On 05/15/2017, 03:16 PM, Will Deacon wrote:
> > Whilst I think this is a good idea, the code in question actually results
> > in undefined behaviour per the C spec and is reported by UBSAN.
>
> Hi, yes, I know -- this patch was the 1st from the series of 3 which I
> sent a long time ago to fix that up too. But I remember your patch, so I
> sent only this one this time.
>
> > See my patch fixing arm64 here (which I'd forgotten about):
> >
> > https://www.spinics.net/lists/linux-arch/msg38564.html
> >
> > But, as stated in the thread above, I think we should go a step further
> > and remove FUTEX_OP_{OR,ANDN,XOR,OPARG_SHIFT} altogether. They don't
> > appear to be used by userspace, and this whole thing is a total mess.
> >
> > Any thoughts?
>
> Ok, I am all for that. I think the only question is who is going to do
> the work and submit it :)? Do I understand correctly to eliminate all
> these functions and the path into the kernel? But won't this break API
> -- are there really no users of this interface?

That's the million-dollar question, really. I don't know of any code
using it, and I couldn't find any when I looked (also nothing reported
by Debian Codesearch afaict), but I was hoping linux-arch might have
some thoughts on this too.

For now, I'll queue my arm64 patch before I forget about it again!

Will
Re: [PATCH 1/1] futex: remove duplicated code
On Mon, May 22, 2017 at 11:11:33PM +0200, Thomas Gleixner wrote:
> On Mon, 15 May 2017, Will Deacon wrote:
> > On Mon, May 15, 2017 at 03:07:42PM +0200, Jiri Slaby wrote:
> > > There is code duplicated over all architecture's headers for
> > > futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
> > > and comparison of the result.
> > >
> > > Remove this duplication and leave up to the arches only the needed
> > > assembly which is now in arch_futex_atomic_op_inuser.
> > >
> > > Note that s390 removed access_ok check in d12a29703 ("s390/uaccess:
> > > remove pointless access_ok() checks") as access_ok there returns true.
> > > We introduce it back to the helper for the sake of simplicity (it gets
> > > optimized away anyway).
> >
> > Whilst I think this is a good idea, the code in question actually results
> > in undefined behaviour per the C spec and is reported by UBSAN. See my
> > patch fixing arm64 here (which I'd forgotten about):
> >
> > https://www.spinics.net/lists/linux-arch/msg38564.html
> >
> > But, as stated in the thread above, I think we should go a step further
> > and remove FUTEX_OP_{OR,ANDN,XOR,OPARG_SHIFT} altogether. They don't
> > appear to be used by userspace, and this whole thing is a total mess.
>
> You wish. The constants are not used, but FUTEX_WAKE_OP _IS_ used by
> glibc. They only have one argument it seems:
>
>	#define FUTEX_OP_CLEAR_WAKE_IF_GT_ONE	((4 << 24) | 1)
>
> but I'm pretty sure that there is enough (probably horrible) code (think
> java) out there using FUTEX_WAKE_OP for whatever (non)sensical reasons in
> any available combination.

Indeed, and I'm not proposing to get rid of that. It's the grossly
over-engineered array of operations and the FUTEX_OP_OPARG_SHIFT modifier
that I think we should kill. The latter likely behaves differently across
different architectures and potentially depending on the toolchain you
used to build the kernel.

Does anybody know the history behind the interface design?
Will
Re: [PATCH v2 1/1] futex: remove duplicated code and fix UB
Hi Jiri,

On Thu, Aug 24, 2017 at 09:31:05AM +0200, Jiri Slaby wrote:
> There is code duplicated over all architecture's headers for
> futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
> and comparison of the result.
>
> Remove this duplication and leave up to the arches only the needed
> assembly which is now in arch_futex_atomic_op_inuser.
>
> This effectively distributes Will Deacon's arm64 fix for undefined
> behaviour reported by UBSAN to all architectures. The fix was done in
> commit 5f16a046f8e1 (arm64: futex: Fix undefined behaviour with
> FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump.
>
> And as suggested by Thomas, check for negative oparg too, because it was
> also reported to cause an undefined behaviour report.
>
> Note that s390 removed access_ok check in d12a29703 ("s390/uaccess:
> remove pointless access_ok() checks") as access_ok there returns true.
> We introduce it back to the helper for the sake of simplicity (it gets
> optimized away anyway).
For arm64 and the core code:

Reviewed-by: Will Deacon <will.dea...@arm.com>

Although one minor thing on the core part:

> diff --git a/kernel/futex.c b/kernel/futex.c
> index 0939255fc750..3d38eaf05492 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -1551,6 +1551,45 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
> 	return ret;
> }
>
> +static int futex_atomic_op_inuser(unsigned int encoded_op, u32 __user *uaddr)
> +{
> +	unsigned int op =	  (encoded_op & 0x70000000) >> 28;
> +	unsigned int cmp =	  (encoded_op & 0x0f000000) >> 24;
> +	int oparg = sign_extend32((encoded_op & 0x00fff000) >> 12, 11);
> +	int cmparg = sign_extend32(encoded_op & 0x00000fff, 11);
> +	int oldval, ret;
> +
> +	if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28)) {
> +		if (oparg < 0 || oparg > 31)
> +			return -EINVAL;
> +		oparg = 1 << oparg;
> +	}
> +
> +	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
> +		return -EFAULT;
> +
> +	ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr);
> +	if (ret)
> +		return ret;

We could move the pagefault_{disable,enable} calls here, and then remove
them from the futex_atomic_op_inuser callsites elsewhere in futex.c

Will
Re: [PATCH v2 6/9] kbuild: consolidate Devicetree dtb build rules
On Wed, Sep 05, 2018 at 06:53:24PM -0500, Rob Herring wrote:
> There is nothing arch specific about building dtb files other than their
> location under /arch/*/boot/dts/. Keeping each arch aligned is a pain.
> The dependencies and supported targets are all slightly different.
> Also, a cross-compiler for each arch is needed, but really the host
> compiler preprocessor is perfectly fine for building dtbs. Move the
> build rules to a common location and remove the arch specific ones. This
> is done in a single step to avoid warnings about overriding rules.
>
> The build dependencies had been a mixture of 'scripts' and/or 'prepare'.
> These pull in several dependencies some of which need a target compiler
> (specifically devicetable-offsets.h) and aren't needed to build dtbs.
> All that is really needed is dtc, so adjust the dependencies to only be
> dtc.
>
> This change enables support for 'dtbs_install' on some arches which were
> missing the target.
>
> Cc: Masahiro Yamada
> Cc: Michal Marek
> Cc: Vineet Gupta
> Cc: Russell King
> Cc: Catalin Marinas
> Cc: Will Deacon
> Cc: Yoshinori Sato
> Cc: Michal Simek
> Cc: Ralf Baechle
> Cc: Paul Burton
> Cc: James Hogan
> Cc: Ley Foon Tan
> Cc: Benjamin Herrenschmidt
> Cc: Paul Mackerras
> Cc: Michael Ellerman
> Cc: Chris Zankel
> Cc: Max Filippov
> Cc: linux-kbu...@vger.kernel.org
> Cc: linux-snps-arc@lists.infradead.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: uclinux-h8-de...@lists.sourceforge.jp
> Cc: linux-m...@linux-mips.org
> Cc: nios2-...@lists.rocketboards.org
> Cc: linuxppc-...@lists.ozlabs.org
> Cc: linux-xte...@linux-xtensa.org
> Signed-off-by: Rob Herring
> ---
> Please ack so I can take the whole series via the DT tree.

For arm64:

Acked-by: Will Deacon

Will
Re: Patch "asm-generic/bitops/lock.h: Rewrite using atomic_fetch_" causes kernel crash
On Fri, Aug 31, 2018 at 12:30:50AM +0000, Vineet Gupta wrote:
> On 08/30/2018 01:45 PM, Peter Zijlstra wrote:
> >>
> >> Indeed this is the mother of all issues, I tried and system is clearly
> >> hosed with and works after.
> >> What's amazing is the commit 4aef66c8ae9 which introduced it is from 2016 ;-)
> >> Back then we had a retry branch with backoff stuff which I'd reverted
> >> for new cores and the merge conflict somehow missed it.
> >>
> >> @PeterZ I'll create a patch with you as author ? do I need any formal
> >> sign offs, acks etc ?
> >
> > Well, Will spotted it, give authorship to him, you have my ack per the
> > above.
>
> Oops, sorry for the mixup. I have him as author now and pushed to ARC
> for-curr (will trickle into linux-next eventually).

Okey doke: you can have my Signed-off-by for all the diffs I sent. Just
stick something like [vineet: wrote commit message] in there so I don't
read it in the future and confuse myself.

Will
Re: Patch "asm-generic/bitops/lock.h: Rewrite using atomic_fetch_" causes kernel crash
On Wed, Aug 29, 2018 at 09:16:43PM +0000, Vineet Gupta wrote:
> On 08/29/2018 11:33 AM, Eugeniy Paltsev wrote:
> > Hi Guys,
> > Since v4.19-rc1 we are getting a serious regression on platforms with
> > ARC architecture. The kernel has become unstable and spontaneously
> > crashes on LTP test execution / IO tests or even on boot.
> >
> > I don't know exactly what breaks, but bisect clearly assigns the blame
> > to this commit:
> > 84c6591103db ("locking/atomics, asm-generic/bitops/lock.h: Rewrite using
> > atomic_fetch_*()")
> > https://github.com/torvalds/linux/commit/84c6591103dbeaf393a092a3fc7b09510825f6b9
> >
> > Reverting the commit solves this problem.
> >
> > I tested v4.19-rc1 on ARM (wandboard, i.mx6, 32-bit, quad core, ARMv7)
> > which uses the same generic bitops implementation and it works fine.
> >
> > Do you have any ideas what went wrong?
>
> Back in 2016, Peter had fixed this file due to a problem I reported on
> ARC. See commit f75d48644c56a ("bitops: Do not default to __clear_bit()
> for __clear_bit_unlock()").
> That made __clear_bit_unlock() use the atomic clear_bit() vs. non-atomic
> __clear_bit(), effectively making clear_bit_unlock() and
> __clear_bit_unlock() the same.
>
> This patch undoes that, which could explain the issues you see.
> @Peter, @Will?

/me grabs arc toolchain

(incidentally, make.cross fuzzy matches "arc" to "sparc", so that was
fun for a few minutes).

I'll take a look today, thanks for the report.

Will
Re: Patch "asm-generic/bitops/lock.h: Rewrite using atomic_fetch_" causes kernel crash
On Thu, Aug 30, 2018 at 11:44:11AM +0200, Peter Zijlstra wrote:
> On Wed, Aug 29, 2018 at 09:16:43PM +0000, Vineet Gupta wrote:
> > On 08/29/2018 11:33 AM, Eugeniy Paltsev wrote:
> > > Hi Guys,
> > > Since v4.19-rc1 we are getting a serious regression on platforms with
> > > ARC architecture. The kernel has become unstable and spontaneously
> > > crashes on LTP test execution / IO tests or even on boot.
> > >
> > > I don't know exactly what breaks, but bisect clearly assigns the blame
> > > to this commit:
> > > 84c6591103db ("locking/atomics, asm-generic/bitops/lock.h: Rewrite
> > > using atomic_fetch_*()")
> > > https://github.com/torvalds/linux/commit/84c6591103dbeaf393a092a3fc7b09510825f6b9
> > >
> > > Reverting the commit solves this problem.
> > >
> > > I tested v4.19-rc1 on ARM (wandboard, i.mx6, 32-bit, quad core, ARMv7)
> > > which uses the same generic bitops implementation and it works fine.
> > >
> > > Do you have any ideas what went wrong?
> >
> > Back in 2016, Peter had fixed this file due to a problem I reported on
> > ARC. See commit f75d48644c56a ("bitops: Do not default to __clear_bit()
> > for __clear_bit_unlock()").
> > That made __clear_bit_unlock() use the atomic clear_bit() vs. non-atomic
> > __clear_bit(), effectively making clear_bit_unlock() and
> > __clear_bit_unlock() the same.
> >
> > This patch undoes that, which could explain the issues you see.
> > @Peter, @Will?
>
> Right, so the thinking is that on platforms that suffer that issue,
> atomic_set*() should DTRT. And if you look at your spinlock based atomic
> implementation, you'll note that atomic_set() does indeed do the right
> thing.
>
> arch/arc/include/asm/atomic.h:108

Yeah, the bit_spin_lock()/__bit_spin_unlock() race described in
f75d48644c56a boils down to concurrent atomic_long_set_release() vs
atomic_long_fetch_or_acquire(), which really needs to work.

I'll keep digging.
In the meantime, Vineet, do you have any useful crash logs, and do you
only see the crashes in certain configurations (e.g. SMP but
!CONFIG_ARC_HAS_LLSC)?

Will
Re: [PATCH] atomic{64}_t: Explicitly specify data storage length and alignment
On Mon, Jul 09, 2018 at 03:47:41PM +0300, Alexey Brodkin wrote:
> Atomic instructions require the data they operate on to be aligned
> according to the data size. I.e. 32-bit atomic values must be 32-bit
> aligned while 64-bit values must be 64-bit aligned.
>
> Otherwise, even if the CPU may handle unaligned normal data accesses,
> atomic instructions still fail and typically raise an exception,
> leaving us dead in the water.
>
> This came up during a lengthy discussion here:
> http://lists.infradead.org/pipermail/linux-snps-arc/2018-July/004022.html
>
> Signed-off-by: Alexey Brodkin
> Cc: Will Deacon
> Cc: Peter Zijlstra
> Cc: Boqun Feng
> Cc: Russell King
> Cc: Arnd Bergmann
> Cc: Thomas Gleixner
> Cc: Ingo Molnar
> Cc: Darren Hart
> Cc: Shuah Khan
> Cc: "Paul E. McKenney"
> Cc: Josh Triplett
> Cc: Steven Rostedt
> Cc: Mathieu Desnoyers
> Cc: Lai Jiangshan
> Cc: David Laight
> Cc: Geert Uytterhoeven
> Cc: Greg Kroah-Hartman
> ---
> arch/arm/include/asm/atomic.h                         | 2 +-
> include/asm-generic/atomic64.h                        | 2 +-
> include/linux/types.h                                 | 4 ++--
> tools/include/linux/types.h                           | 2 +-
> tools/testing/selftests/futex/include/atomic.h        | 2 +-
> .../rcutorture/formal/srcu-cbmc/include/linux/types.h | 4 ++--
> 6 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm/include/asm/atomic.h b/arch/arm/include/asm/atomic.h
> index 66d0e215a773..2ed6d7cf1407 100644
> --- a/arch/arm/include/asm/atomic.h
> +++ b/arch/arm/include/asm/atomic.h
> @@ -267,7 +267,7 @@ ATOMIC_OPS(xor, ^=, eor)
>
> #ifndef CONFIG_GENERIC_ATOMIC64
> typedef struct {
> -	long long counter;
> +	u64 __aligned(8) counter;
> } atomic64_t;

long long is 8-byte aligned per the ARM EABI, and we use the generic
atomic64 infrastructure for OABI, so we don't need to change anything
here afaict.

Will
Re: [PATCH 2/4] mm: speed up mremap by 500x on large regions (v2)
On Fri, Oct 26, 2018 at 02:11:48PM -0700, Joel Fernandes wrote:
> My thinking is to take it slow and get the patch in in its current state,
> since it improves x86. Then, as a next step, look into why the arm64 TLB
> flushes are that expensive and look into optimizing that. On arm64 I am
> testing on a 4.9 kernel, so I'm wondering if there are any optimizations
> since 4.9 that can help speed it up there. After that, if all else fails
> at speeding up arm64, I'll look into developing the cleanest possible
> solution where we can keep the lock held for longer and flush less.

We rewrote a good chunk of the arm64 TLB invalidation and core
mmu_gather code this merge window, so please do have another look at
-rc1!

Will
Re: [PATCH v4 4/6] arm64: Utilize phys_initrd_start/phys_initrd_size
On Mon, Nov 05, 2018 at 02:54:29PM -0800, Florian Fainelli wrote:
> ARM64 is the only architecture that re-defines
> __early_init_dt_declare_initrd() in order for that function to populate
> initrd_start/initrd_end with physical addresses instead of virtual
> addresses. Instead of having an override, we can leverage
> drivers/of/fdt.c populating phys_initrd_start/phys_initrd_size to
> populate those variables for us.
>
> Signed-off-by: Florian Fainelli
> ---
> arch/arm64/mm/init.c | 20
> 1 file changed, 8 insertions(+), 12 deletions(-)

Looks ok to me:

Acked-by: Will Deacon

Will
Re: [PATCH v4 1/4] kgdb: Remove irq flags from roundup
On Mon, Nov 12, 2018 at 10:26:55AM -0800, Douglas Anderson wrote:
> The function kgdb_roundup_cpus() was passed a parameter that was
> documented as:
>
> > the flags that will be used when restoring the interrupts. There is
> > local_irq_save() call before kgdb_roundup_cpus().
>
> Nobody used those flags. Anyone who wanted to temporarily turn on
> interrupts just did local_irq_enable() and local_irq_disable() without
> looking at them. So we can definitely remove the flags.
>
> Signed-off-by: Douglas Anderson
> ---

Acked-by: Will Deacon

I'm hopeful that you'll keep hacking on kgdb, because it definitely
needs some love in its current state.

Will
Re: [PATCH v4 2/4] kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function()
On Mon, Nov 12, 2018 at 10:26:56AM -0800, Douglas Anderson wrote:
> When I had lockdep turned on and dropped into kgdb I got a nice splat
> on my system. Specifically it hit:
>   DEBUG_LOCKS_WARN_ON(current->hardirq_context)
>
> Specifically it looked like this:
>   sysrq: SysRq : DEBUG
>   [ cut here ]
>   DEBUG_LOCKS_WARN_ON(current->hardirq_context)
>   WARNING: CPU: 0 PID: 0 at .../kernel/locking/lockdep.c:2875 lockdep_hardirqs_on+0xf0/0x160
>   CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0 #27
>   pstate: 604003c9 (nZCv DAIF +PAN -UAO)
>   pc : lockdep_hardirqs_on+0xf0/0x160
>   ...
>   Call trace:
>    lockdep_hardirqs_on+0xf0/0x160
>    trace_hardirqs_on+0x188/0x1ac
>    kgdb_roundup_cpus+0x14/0x3c
>    kgdb_cpu_enter+0x53c/0x5cc
>    kgdb_handle_exception+0x180/0x1d4
>    kgdb_compiled_brk_fn+0x30/0x3c
>    brk_handler+0x134/0x178
>    do_debug_exception+0xfc/0x178
>    el1_dbg+0x18/0x78
>    kgdb_breakpoint+0x34/0x58
>    sysrq_handle_dbg+0x54/0x5c
>    __handle_sysrq+0x114/0x21c
>    handle_sysrq+0x30/0x3c
>    qcom_geni_serial_isr+0x2dc/0x30c
>   ...
>   ...
>   irq event stamp: ...45
>   hardirqs last enabled at (...44): [...] __do_softirq+0xd8/0x4e4
>   hardirqs last disabled at (...45): [...] el1_irq+0x74/0x130
>   softirqs last enabled at (...42): [...] _local_bh_enable+0x2c/0x34
>   softirqs last disabled at (...43): [...] irq_exit+0xa8/0x100
>   ---[ end trace adf21f830c46e638 ]---
>
> Looking closely at it, it seems like a really bad idea to be calling
> local_irq_enable() in kgdb_roundup_cpus(). If nothing else that seems
> like it could violate spinlock semantics and cause a deadlock.
>
> Instead, let's use a private csd alongside
> smp_call_function_single_async() to round up the other CPUs. Using
> smp_call_function_single_async() doesn't require interrupts to be
> enabled so we can remove the offending bit of code.
>
> In order to avoid duplicating this across all the architectures that
> use the default kgdb_roundup_cpus(), we'll add a "weak" implementation
> to debug_core.c.
>
> Looking at all the people who previously had copies of this code,
> there were a few variants. I've attempted to keep the variants
> working like they used to. Specifically:
> * For arch/arc we passed NULL to kgdb_nmicallback() instead of
>   get_irq_regs().
> * For arch/mips there was a bit of extra code around kgdb_nmicallback()
>
> NOTE: In this patch we will still get into trouble if we try to round
> up a CPU that failed to round up before. We'll try to round it up
> again and potentially hang when we try to grab the csd lock. That's
> not new behavior but we'll still try to do better in a future patch.
>
> Suggested-by: Daniel Thompson
> Signed-off-by: Douglas Anderson
> ---
>
> Changes in v4: None
> Changes in v3:
> - No separate init call.
> - Don't round up the CPU that is doing the rounding up.
> - Add "#ifdef CONFIG_SMP" to match the rest of the file.
> - Updated desc saying we don't solve the "failed to roundup" case.
> - Document the ignored parameter.
>
> Changes in v2:
> - Removing irq flags separated from fixing lockdep splat.
> - Don't use smp_call_function (Daniel).
>
> arch/arc/kernel/kgdb.c     | 10 ++
> arch/arm/kernel/kgdb.c     | 12
> arch/arm64/kernel/kgdb.c   | 12
> arch/hexagon/kernel/kgdb.c | 27 ---
> arch/mips/kernel/kgdb.c    |  9 +
> arch/powerpc/kernel/kgdb.c |  4 ++--
> arch/sh/kernel/kgdb.c      | 12
> include/linux/kgdb.h       | 15 +--
> kernel/debug/debug_core.c  | 35 +++
> 9 files changed, 53 insertions(+), 83 deletions(-)

[...]
> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> index f3cadda45f07..23f2b5613afa 100644
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -55,6 +55,7 @@
> #include
> #include
> #include
> +#include
>
> #include
> #include
> @@ -220,6 +221,40 @@ int __weak kgdb_skipexception(int exception, struct pt_regs *regs)
> 	return 0;
> }
>
> +#ifdef CONFIG_SMP
> +
> +/*
> + * Default (weak) implementation for kgdb_roundup_cpus
> + */
> +
> +static DEFINE_PER_CPU(call_single_data_t, kgdb_roundup_csd);
> +
> +void __weak kgdb_call_nmi_hook(void *ignored)
> +{
> +	kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs());
> +}

I suppose you could pass the cpu as an argument, but it doesn't really
matter. Also, I think there are cases where the CSD callback can run
without having received an IPI, so we could potentially end up passing
NULL for the regs here, which probably goes boom.

> +
> +void __weak kgdb_roundup_cpus(void)
> +{
> +	call_single_data_t *csd;
> +	int this_cpu = get_cpu();

Do you actually need to disable preemption here? afaict, irqs are
already disabled by the kgdb core.

> +	int cpu;
> +
> +	for_each_cpu(cpu, cpu_online_mask) {

for_each_online_cpu(cpu) ? I'm assuming
Re: Patch "asm-generic/bitops/lock.h: Rewrite using atomic_fetch_" causes kernel crash
On Thu, Aug 30, 2018 at 04:17:13PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 30, 2018 at 11:53:17AM +0000, Eugeniy Paltsev wrote:
> > I can see crashes with LLSC enabled in both SMP running on 4 cores
> > and SMP running on 1 core.
>
> So you're running on LL/SC enabled hardware; that would make Will's
> patch irrelevant (although still a good idea for the hardware that does
> care about that spinlocked atomic crud).

Yeah, that's a good point. I think the !LLSC case is broken without my
patch, so we're looking at two bugs...

> Does something like the below cure things? That would confirm the
> suggestion that the change to __clear_bit_unlock() is the culprit.
>
> If that doesn't cure things, then we've been looking in entirely the
> wrong place.

Yes, that would be worth trying. However, I also just noticed that the
fetch-ops (which are now used to implement test_and_set_bit_lock()) seem
to be missing the backwards branch in the LL/SC case. Yet another diff
below.

Will

--->8

diff --git a/arch/arc/include/asm/atomic.h b/arch/arc/include/asm/atomic.h
index 4e0072730241..f06c5ed672b3 100644
--- a/arch/arc/include/asm/atomic.h
+++ b/arch/arc/include/asm/atomic.h
@@ -84,7 +84,7 @@ static inline int atomic_fetch_##op(int i, atomic_t *v)	\
 	"1:	llock   %[orig], [%[ctr]]		\n"	\
 	"	" #asm_op " %[val], %[orig], %[i]	\n"	\
 	"	scond   %[val], [%[ctr]]		\n"	\
-	"						\n"	\
+	"	bnz     1b				\n"	\
 	: [val]	"=&r"	(val),					\
 	  [orig] "=&r"	(orig)					\
 	: [ctr]	"r"	(&v->counter),				\
Re: Patch "asm-generic/bitops/lock.h: Rewrite using atomic_fetch_" causes kernel crash
Hi Eugeniy,

On Thu, Aug 30, 2018 at 11:53:17AM +0000, Eugeniy Paltsev wrote:
> On Thu, 2018-08-30 at 10:51 +0100, Will Deacon wrote:
> > On Thu, Aug 30, 2018 at 11:44:11AM +0200, Peter Zijlstra wrote:
> > > On Wed, Aug 29, 2018 at 09:16:43PM +0000, Vineet Gupta wrote:
> > > > On 08/29/2018 11:33 AM, Eugeniy Paltsev wrote:
> > > > > Hi Guys,
> > > > > Since v4.19-rc1 we are getting a serious regression on platforms
> > > > > with ARC architecture. The kernel has become unstable and
> > > > > spontaneously crashes on LTP test execution / IO tests or even
> > > > > on boot.
> > > > >
> > > > > I don't know exactly what breaks, but bisect clearly assigns the
> > > > > blame to this commit:
> > > > > 84c6591103db ("locking/atomics, asm-generic/bitops/lock.h:
> > > > > Rewrite using atomic_fetch_*()")
> > > > > https://github.com/torvalds/linux/commit/84c6591103dbeaf393a092a3fc7b09510825f6b9
> > > > >
> > > > > Reverting the commit solves this problem.
> > > > >
> > > > > I tested v4.19-rc1 on ARM (wandboard, i.mx6, 32-bit, quad core,
> > > > > ARMv7) which uses the same generic bitops implementation and it
> > > > > works fine.
> > > > >
> > > > > Do you have any ideas what went wrong?
> > > >
> > > > Back in 2016, Peter had fixed this file due to a problem I reported
> > > > on ARC. See commit f75d48644c56a ("bitops: Do not default to
> > > > __clear_bit() for __clear_bit_unlock()").
> > > > That made __clear_bit_unlock() use the atomic clear_bit() vs.
> > > > non-atomic __clear_bit(), effectively making clear_bit_unlock() and
> > > > __clear_bit_unlock() the same.
> > > >
> > > > This patch undoes that, which could explain the issues you see.
> > > > @Peter, @Will?
> > >
> > > Right, so the thinking is that on platforms that suffer that issue,
> > > atomic_set*() should DTRT. And if you look at your spinlock based
> > > atomic implementation, you'll note that atomic_set() does indeed do
> > > the right thing.
> > >
> > > arch/arc/include/asm/atomic.h:108
> >
> > Yeah, the bit_spin_lock()/__bit_spin_unlock() race described in
> > f75d48644c56a boils down to concurrent atomic_long_set_release() vs
> > atomic_long_fetch_or_acquire(), which really needs to work.
> >
> > I'll keep digging. In the meantime, Vineet, do you have any useful
> > crash logs and do you only see the crashes in certain configurations
> > (e.g. SMP but !CONFIG_ARC_HAS_LLSC)?
>
> We don't have such configuration (SMP with !CONFIG_ARC_HAS_LLSC).

Ok, thanks for confirming. If you could put your .config somewhere that
would be helpful, since the build fails for me with defconfig.

> I can see crashes with LLSC enabled in both SMP running on 4 cores
> and SMP running on 1 core.
>
> There are some crash logs (not sure if they are really useful):
> Crashes are quite spontaneous and mostly happen in IO-related code:

Aha: arc seems to have separate spinlocks for protecting bitops and
atomics. This is a seriously bad idea, because you only implement some
of the bitops API in the architecture and rely on asm-generic to give
you bitops/lock.h, assuming that it's going to be implemented in terms
of the other parts of the bitops API (it isn't anymore).

Here's a quick hack (totally untested, since I can't even build an arc
kernel) which moves arc over to asm-generic for all of the bitops API
and ditches the bitops lock. You could probably also drop the ffs()
stuff, but I'll leave that up to you. The "downside" is serialisation
between bitops and atomics, but I actually think you probably want that
for xchg/cmpxchg anyway.
Will

--->8

diff --git a/arch/arc/include/asm/bitops.h b/arch/arc/include/asm/bitops.h
index 8da87feec59a..50b0d5a56e32 100644
--- a/arch/arc/include/asm/bitops.h
+++ b/arch/arc/include/asm/bitops.h
@@ -17,242 +17,6 @@
 #include
 #include
-#include
-#ifndef CONFIG_ARC_HAS_LLSC
-#include
-#endif
-
-#ifdef CONFIG_ARC_HAS_LLSC
-
-/*
- * Hardware assisted Atomic-R-M-W
- */
-
-#define BIT_OP(op, c_op, asm_op)				\
-static inline void op##_bit(unsigned long nr, vola
Re: [PATCH 3/3] bitops.h: set_mask_bits() to return old value
On Thu, Jan 10, 2019 at 04:26:27PM -0800, Vineet Gupta wrote:
> | > Also, set_mask_bits is used in fs quite a bit and we can possibly come
> | > up with a generic llsc based implementation (w/o the cmpxchg loop)
> |
> | May I also suggest changing the return value of set_mask_bits() to old.
> |
> | You can compute the new value given old, but you cannot compute the old
> | value given new, therefore old is the better return value. Also, no
> | current user seems to use the return value, so changing it is without
> | risk.
>
> Link: http://lkml.kernel.org/g/20150807110955.gh16...@twins.programming.kicks-ass.net
> Suggested-by: Peter Zijlstra
> Cc: Miklos Szeredi
> Cc: Ingo Molnar
> Cc: Jani Nikula
> Cc: Chris Wilson
> Cc: Andrew Morton
> Cc: Will Deacon
> Signed-off-by: Vineet Gupta
> ---
> include/linux/bitops.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index 705f7c442691..602af23b98c7 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -246,7 +246,7 @@ static __always_inline void __assign_bit(long nr, volatile unsigned long *addr,
> 		new__ = (old__ & ~mask__) | bits__;		\
> 	} while (cmpxchg(ptr, old__, new__) != old__);		\
> 								\
> -	new__;							\
> +	old__;							\
> })
> #endif

Acked-by: Will Deacon

May also explain why no in-tree users appear to use the return value!

Will
Re: [PATCH v5 2/4] kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function()
On Mon, Nov 26, 2018 at 07:51:31PM -0800, Douglas Anderson wrote: > When I had lockdep turned on and dropped into kgdb I got a nice splat > on my system. Specifically it hit: > DEBUG_LOCKS_WARN_ON(current->hardirq_context) > > Specifically it looked like this: > sysrq: SysRq : DEBUG > [ cut here ] > DEBUG_LOCKS_WARN_ON(current->hardirq_context) > WARNING: CPU: 0 PID: 0 at .../kernel/locking/lockdep.c:2875 > lockdep_hardirqs_on+0xf0/0x160 > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0 #27 > pstate: 604003c9 (nZCv DAIF +PAN -UAO) > pc : lockdep_hardirqs_on+0xf0/0x160 > ... > Call trace: >lockdep_hardirqs_on+0xf0/0x160 >trace_hardirqs_on+0x188/0x1ac >kgdb_roundup_cpus+0x14/0x3c >kgdb_cpu_enter+0x53c/0x5cc >kgdb_handle_exception+0x180/0x1d4 >kgdb_compiled_brk_fn+0x30/0x3c >brk_handler+0x134/0x178 >do_debug_exception+0xfc/0x178 >el1_dbg+0x18/0x78 >kgdb_breakpoint+0x34/0x58 >sysrq_handle_dbg+0x54/0x5c >__handle_sysrq+0x114/0x21c >handle_sysrq+0x30/0x3c >qcom_geni_serial_isr+0x2dc/0x30c > ... > ... > irq event stamp: ...45 > hardirqs last enabled at (...44): [...] __do_softirq+0xd8/0x4e4 > hardirqs last disabled at (...45): [...] el1_irq+0x74/0x130 > softirqs last enabled at (...42): [...] _local_bh_enable+0x2c/0x34 > softirqs last disabled at (...43): [...] irq_exit+0xa8/0x100 > ---[ end trace adf21f830c46e638 ]--- > > Looking closely at it, it seems like a really bad idea to be calling > local_irq_enable() in kgdb_roundup_cpus(). If nothing else that seems > like it could violate spinlock semantics and cause a deadlock. > > Instead, let's use a private csd alongside > smp_call_function_single_async() to round up the other CPUs. Using > smp_call_function_single_async() doesn't require interrupts to be > enabled so we can remove the offending bit of code. > > In order to avoid duplicating this across all the architectures that > use the default kgdb_roundup_cpus(), we'll add a "weak" implementation > to debug_core.c. 
> > Looking at all the people who previously had copies of this code, > there were a few variants. I've attempted to keep the variants > working like they used to. Specifically: > * For arch/arc we passed NULL to kgdb_nmicallback() instead of > get_irq_regs(). > * For arch/mips there was a bit of extra code around > kgdb_nmicallback() > > NOTE: In this patch we will still get into trouble if we try to round > up a CPU that failed to round up before. We'll try to round it up > again and potentially hang when we try to grab the csd lock. That's > not new behavior but we'll still try to do better in a future patch. > > Suggested-by: Daniel Thompson > Signed-off-by: Douglas Anderson > --- > > Changes in v5: > - Add a comment about get_irq_regs(). > - get_cpu() => raw_smp_processor_id() in kgdb_roundup_cpus(). > - for_each_cpu() => for_each_online_cpu() > - Error check smp_call_function_single_async() For the arm64 bit: > diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c > index 12c339ff6e75..da880247c734 100644 > --- a/arch/arm64/kernel/kgdb.c > +++ b/arch/arm64/kernel/kgdb.c > @@ -284,18 +284,6 @@ static struct step_hook kgdb_step_hook = { > .fn = kgdb_step_brk_fn > }; > > -static void kgdb_call_nmi_hook(void *ignored) > -{ > - kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs()); > -} > - > -void kgdb_roundup_cpus(void) > -{ > - local_irq_enable(); > - smp_call_function(kgdb_call_nmi_hook, NULL, 0); > - local_irq_disable(); > -} > - Acked-by: Will Deacon Will ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
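The shape of the new weak implementation can be sketched in userspace. The csd structure and smp_call_function_single_async() below are simplified stand-ins for the kernel APIs (the real async call raises an IPI rather than running the hook synchronously), but the key property is preserved: no local_irq_enable() anywhere on the roundup path:

```c
#define NR_CPUS 4

struct csd { void (*func)(void *); int queued; };

static struct csd kgdb_roundup_csd[NR_CPUS];	/* per-cpu csd in the patch */
static int rounded_up[NR_CPUS];

/* stands in for kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs()) */
static void kgdb_call_nmi_hook(void *cpu)
{
	rounded_up[(int)(long)cpu] = 1;
}

/*
 * Stand-in for the kernel API: the real one queues the csd and IPIs
 * @cpu. A still-queued csd models the NOTE above, where a CPU that
 * failed to round up before would make us block on the csd lock.
 */
static int smp_call_function_single_async(int cpu, struct csd *csd)
{
	if (csd->queued)
		return -16;	/* -EBUSY */
	csd->queued = 1;
	csd->func((void *)(long)cpu);
	csd->queued = 0;
	return 0;
}

static void kgdb_roundup_cpus(void)
{
	int this_cpu = 0;	/* raw_smp_processor_id() stand-in */
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {	/* for_each_online_cpu() */
		if (cpu == this_cpu)
			continue;
		kgdb_roundup_csd[cpu].func = kgdb_call_nmi_hook;
		smp_call_function_single_async(cpu, &kgdb_roundup_csd[cpu]);
	}
}
```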
Re: single copy atomicity for double load/stores on 32-bit systems
On Mon, Jul 01, 2019 at 08:05:51PM +, Vineet Gupta wrote: > On 5/31/19 1:21 AM, Peter Zijlstra wrote: > > On Thu, May 30, 2019 at 11:22:42AM -0700, Vineet Gupta wrote: > >> Had an interesting lunch time discussion with our hardware architects > >> pertinent to > >> "minimal guarantees expected of a CPU" section of memory-barriers.txt > >> > >> > >> | (*) These guarantees apply only to properly aligned and sized scalar > >> | variables. "Properly sized" currently means variables that are > >> | the same size as "char", "short", "int" and "long". "Properly > >> | aligned" means the natural alignment, thus no constraints for > >> | "char", two-byte alignment for "short", four-byte alignment for > >> | "int", and either four-byte or eight-byte alignment for "long", > >> | on 32-bit and 64-bit systems, respectively. > >> > >> > >> I'm not sure how to interpret "natural alignment" for the case of double > >> load/stores on 32-bit systems where the hardware and ABI allow for 4 byte > >> alignment (ARCv2 LDD/STD, ARM LDRD/STRD ) > > > > Natural alignment: !((uintptr_t)ptr % sizeof(*ptr)) > > > > For any u64 type, that would give 8 byte alignment. the problem > > otherwise being that your data spans two lines/pages etc.. > > > >> I presume (and the question) that lkmm doesn't expect such 8 byte > >> load/stores to > >> be atomic unless 8-byte aligned > >> > >> ARMv7 arch ref manual seems to confirm this. Quoting > >> > >> | LDM, LDC, LDC2, LDRD, STM, STC, STC2, STRD, PUSH, POP, RFE, SRS, VLDM, > >> VLDR, > >> | VSTM, and VSTR instructions are executed as a sequence of word-aligned > >> word > >> | accesses. Each 32-bit word access is guaranteed to be single-copy > >> atomic. A > >> | subsequence of two or more word accesses from the sequence might not > >> exhibit > >> | single-copy atomicity > >> > >> While it seems reasonable form hardware pov to not implement such > >> atomicity by > >> default it seems there's an additional burden on application writers. 
They > >> could > >> be happily using a lockless algorithm with just a shared flag between 2 > >> threads > >> w/o need for any explicit synchronization. > > > > If you're that careless with lockless code, you deserve all the pain you > > get. > > > >> But upgrade to a new compiler which > >> aggressively "packs" struct rendering long long 32-bit aligned (vs. 64-bit > >> before) > >> causing the code to suddenly stop working. Is the onus on them to declare > >> such > >> memory as c11 atomic or some such. > > > > When a programmer wants guarantees they already need to know wth they're > > doing. > > > > And I'll stand by my earlier conviction that any architecture that has a > > native u64 (be it a 64bit arch or a 32bit with double-width > > instructions) but has an ABI that allows u32 alignment on them is daft. > > So I agree with Paul's assertion that it is strange for 8-byte type being > 4-byte > aligned on a 64-bit system, but is it totally broken even if the ISA of the > said > 64-bit arch allows LD/ST to be augmented with acq/rel respectively. > > Say the ISA guarantees single-copy atomicity for aligned cases (i.e. for > 8-byte > data only if it is naturally aligned) and in lack thereof programmer needs to > use > the proper acq/release Apologies if I'm missing some context here, but it's not clear to me why the use of acquire/release instructions has anything to do with single-copy atomicity of unaligned accesses. The ordering they provide doesn't necessarily prevent tearing, although a CPU architecture could obviously provide that guarantee if it wanted to. Generally though, I wouldn't expect the two to go hand-in-hand like you're suggesting. Will ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
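Peter's natural-alignment predicate from earlier in the thread can be written out directly; a u64 sitting at a 4-byte boundary fails it, which is exactly the case where memory-barriers.txt makes no single-copy-atomicity guarantee (the access may span two bus transactions, cache lines, or pages):

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Natural alignment: !((uintptr_t)ptr % sizeof(*ptr)).
 * An access is naturally aligned iff its address is a multiple
 * of the access size.
 */
static int naturally_aligned(const void *ptr, size_t size)
{
	return ((uintptr_t)ptr % size) == 0;
}
```

So an 8-byte load at address 0x1004 is not naturally aligned even though it is perfectly legal for ARCv2 LDD or ARM LDRD, and the LKMM promises nothing about its atomicity.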
Re: [PATCH 19/26] arm64: remove __iounmap
On Sat, Aug 17, 2019 at 09:32:46AM +0200, Christoph Hellwig wrote: > No need to indirect iounmap for arm64. > > Signed-off-by: Christoph Hellwig > --- > arch/arm64/include/asm/io.h | 3 +-- > arch/arm64/mm/ioremap.c | 4 ++-- > 2 files changed, 3 insertions(+), 4 deletions(-) Not sure why we did it like this... Acked-by: Will Deacon Will
Re: [PATCH 19/26] arm64: remove __iounmap
Hi Christoph, On Fri, Aug 30, 2019 at 06:05:15PM +0200, Christoph Hellwig wrote: > On Mon, Aug 19, 2019 at 08:36:02AM +0100, Will Deacon wrote: > > On Sat, Aug 17, 2019 at 09:32:46AM +0200, Christoph Hellwig wrote: > > > No need to indirect iounmap for arm64. > > > > > > Signed-off-by: Christoph Hellwig > > > --- > > > arch/arm64/include/asm/io.h | 3 +-- > > > arch/arm64/mm/ioremap.c | 4 ++-- > > > 2 files changed, 3 insertions(+), 4 deletions(-) > > > > Not sure why we did it like this... > > > > Acked-by: Will Deacon > > Can you just pick this one up through the arm64 tree for 5.4? Unfortunately, it doesn't apply because the tree you've based it on has removed ioremap_wt(). If you send a version based on mainline, I can queue it. Cheers, Will ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH 2/2] mm/thp: Rename pmd_mknotpresent() as pmd_mknotvalid()
On Fri, Mar 20, 2020 at 10:24:17AM +0530, Anshuman Khandual wrote: > pmd_present() is expected to test positive after pmdp_mknotpresent() as the > PMD entry still points to a valid huge page in memory. pmdp_mknotpresent() > implies that given PMD entry is just invalidated from MMU perspective while > still holding on to pmd_page() referred valid huge page thus also clearing > pmd_present() test. This creates the following situation which is counter > intuitive. > > [pmd_present(pmd_mknotpresent(pmd)) = true] > > This renames pmd_mknotpresent() as pmd_mknotvalid() reflecting the helper's > functionality more accurately while changing the above mentioned situation > as follows. This does not create any functional change. > > [pmd_present(pmd_mknotvalid(pmd)) = true] > > This is not applicable for platforms that define own pmdp_invalidate() via > __HAVE_ARCH_PMDP_INVALIDATE. Suggestion for renaming came during a previous > discussion here. Bikeshed alert: maybe pmd_mkinvalid() would be better, given that this is a one-trick pony for pmdp_invalidate()? Will ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH 2/2] mm/thp: Rename pmd_mknotpresent() as pmd_mknotvalid()
On Tue, Apr 21, 2020 at 04:57:26AM +0530, Anshuman Khandual wrote: > > > On 04/21/2020 02:33 AM, Will Deacon wrote: > > On Fri, Mar 20, 2020 at 10:24:17AM +0530, Anshuman Khandual wrote: > >> pmd_present() is expected to test positive after pmdp_mknotpresent() as the > >> PMD entry still points to a valid huge page in memory. pmdp_mknotpresent() > >> implies that given PMD entry is just invalidated from MMU perspective while > >> still holding on to pmd_page() referred valid huge page thus also clearing > >> pmd_present() test. This creates the following situation which is counter > >> intuitive. > >> > >> [pmd_present(pmd_mknotpresent(pmd)) = true] > >> > >> This renames pmd_mknotpresent() as pmd_mknotvalid() reflecting the helper's > >> functionality more accurately while changing the above mentioned situation > >> as follows. This does not create any functional change. > >> > >> [pmd_present(pmd_mknotvalid(pmd)) = true] > >> > >> This is not applicable for platforms that define own pmdp_invalidate() via > >> __HAVE_ARCH_PMDP_INVALIDATE. Suggestion for renaming came during a previous > >> discussion here. > > > > Bikeshed alert: maybe pmd_mkinvalid() would be better, given that this is > > a one-trick pony for pmdp_invalidate()? > > I had thought about making it pmd_mkinvalid() earlier. But as we were > replacing > pmd_mknotpresent(), hence went with similar pattern pmd_mknotvalid() which was > originally suggested by Catalin. There is an existing pte_mknotpresent() in > arc > platform as well. I dont have a very strong opinion either way, will be happy > to rename. But then still wondering if we really need to. I just think that having pmdp_invalidate() call pmd_mkinvalid() makes a lot of sense and, since this is a pure renaming patch, then that's worth taking into consideration. 
If you go with pmd_mkinvalid(), then: Acked-by: Will Deacon Will ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
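For illustration, the counter-intuitive identity from the commit message can be modelled with a toy pmd_t. The bit names below are hypothetical (not the real arm64 layout): pmd_mkinvalid() clears the hardware-walked valid bit while leaving enough software state behind that pmd_present() keeps returning true, since the entry still refers to a live huge page:

```c
#include <stdint.h>

typedef struct { uint64_t val; } pmd_t;

#define PMD_SECT_VALID		(1ULL << 0)	/* hardware-walked valid bit */
#define PMD_PRESENT_INVALID	(1ULL << 59)	/* hypothetical software bit */

/* Invalidate from the MMU's perspective; pmd_page() stays intact. */
static pmd_t pmd_mkinvalid(pmd_t pmd)
{
	pmd.val &= ~PMD_SECT_VALID;
	pmd.val |= PMD_PRESENT_INVALID;
	return pmd;
}

/* "Present" means a huge page still backs the entry. */
static int pmd_present(pmd_t pmd)
{
	return !!(pmd.val & (PMD_SECT_VALID | PMD_PRESENT_INVALID));
}
```

With this reading, [pmd_present(pmd_mkinvalid(pmd)) = true] is the intended behaviour rather than a contradiction, which is the argument for the rename.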
Re: Flushing transparent hugepages
On Tue, Aug 18, 2020 at 04:07:36PM +0100, Matthew Wilcox wrote: > For example, arm64 seems confused in this scenario: > > void flush_dcache_page(struct page *page) > { > if (test_bit(PG_dcache_clean, &page->flags)) > clear_bit(PG_dcache_clean, &page->flags); > } > > ... > > void __sync_icache_dcache(pte_t pte) > { > struct page *page = pte_page(pte); > > if (!test_and_set_bit(PG_dcache_clean, &page->flags)) > sync_icache_aliases(page_address(page), page_size(page)); > } > > So arm64 keeps track on a per-page basis which ones have been flushed. > page_size() will return PAGE_SIZE if called on a tail page or regular > page, but will return PAGE_SIZE << compound_order if called on a head > page. So this will either over-flush, or it's missing the opportunity > to clear the bits on all the subpages which have now been flushed. Hmm, that seems to go all the way back to 2014 as the result of a bug fix in 923b8f5044da ("arm64: mm: Make icache synchronisation logic huge page aware") which has a Reported-by Mark and a CC stable, suggesting something _was_ going wrong at the time :/ Was there a point where the tail pages could end up with PG_arch_1 uncleared on allocation? > What would you _like_ to see? Would you rather flush_dcache_page() > were called once for each subpage, or would you rather maintain > the page-needs-flushing state once per compound page? We could also > introduce flush_dcache_thp() if some architectures would prefer it one > way and one the other, although that brings into question what to do > for hugetlbfs pages. For arm64, we'd like to see PG_arch_1 preserved during huge page splitting [1], but there was a worry that it might break x86 and s390. It's also not clear to me that we can change __sync_icache_dcache() as it's called when we're installing the entry in the page-table, so why would it be called again for the tail pages? 
Will [1] https://lore.kernel.org/linux-arch/20200703153718.16973-8-catalin.mari...@arm.com/ ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
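A toy model of the behaviour Matthew describes (the names and the byte accounting are stand-ins, not the real arm64 code): a sync through the head page flushes the whole compound page but only marks the head clean, so the tail pages' flags stay stale and a later sync on a tail flushes a region that was already covered:

```c
#include <stdbool.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)

struct page { bool dcache_clean; unsigned int compound_order; };

static unsigned long flushed_bytes;	/* bytes handed to the flush routine */

/* head page: whole compound size; tail/regular page: PAGE_SIZE */
static unsigned long page_size(struct page *page)
{
	return PAGE_SIZE << page->compound_order;
}

/* models __sync_icache_dcache(): flush once, then mark *this* page clean */
static void sync_icache_dcache(struct page *page)
{
	if (!page->dcache_clean) {
		page->dcache_clean = true;
		flushed_bytes += page_size(page);	/* sync_icache_aliases() */
	}
}
```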
Re: [PATCH v2] arch: consolidate pm_power_off callback
On Sun, Dec 27, 2020 at 03:01:28PM +0100, Enrico Weigelt, metux IT consult wrote: > Move the pm_power_off callback into one global place and also add an > function for conditionally calling it (when not NULL), in order to remove > code duplication in all individual archs. > > Reported-by: kernel test robot > Signed-off-by: Enrico Weigelt, metux IT consult [...] > diff --git a/kernel/reboot.c b/kernel/reboot.c > index eb1b15850761..ec4cd66dd1ae 100644 > --- a/kernel/reboot.c > +++ b/kernel/reboot.c > @@ -53,6 +53,16 @@ int reboot_force; > void (*pm_power_off_prepare)(void); > EXPORT_SYMBOL_GPL(pm_power_off_prepare); > > +void (*pm_power_off)(void); > +EXPORT_SYMBOL_GPL(pm_power_off); > + > +void do_power_off(void) > +{ > + if (pm_power_off) > + pm_power_off(); > +} > +EXPORT_SYMBOL_GPL(do_power_off); Could this just live as a static inline in pm.h to avoid having to export the extra symbol? Will ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
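What that suggestion would look like, as a sketch (assuming the header is an acceptable home for it and nothing needs an out-of-line address of do_power_off()): only the pm_power_off pointer itself remains an exported object, and the NULL check is inlined at every call site:

```c
/* the one global definition (kernel/reboot.c in the patch) */
void (*pm_power_off)(void);

/* static inline in the header: no extra exported symbol needed */
static inline void do_power_off(void)
{
	if (pm_power_off)
		pm_power_off();
}

/* demo handler so the behaviour can be exercised below */
static int power_off_calls;
static void demo_power_off_handler(void)
{
	power_off_calls++;
}
```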
Re: [PATCH 00/11] ARC atomics update
On Thu, Aug 05, 2021 at 07:04:32PM +0200, Peter Zijlstra wrote: > On Thu, Aug 05, 2021 at 04:18:29PM +, Vineet Gupta wrote: > > On 8/5/21 2:02 AM, Peter Zijlstra wrote: > > > On Wed, Aug 04, 2021 at 12:15:43PM -0700, Vineet Gupta wrote: > > > > > >> Vineet Gupta (10): > > >>ARC: atomics: disintegrate header > > >>ARC: atomic: !LLSC: remove hack in atomic_set() for for UP > > >>ARC: atomic: !LLSC: use int data type consistently > > >>ARC: atomic64: LLSC: elide unused atomic_{and,or,xor,andnot}_return > > >>ARC: atomics: implement relaxed variants > > >>ARC: bitops: fls/ffs to take int (vs long) per asm-generic defines > > >>ARC: xchg: !LLSC: remove UP micro-optimization/hack > > >>ARC: cmpxchg/xchg: rewrite as macros to make type safe > > >>ARC: cmpxchg/xchg: implement relaxed variants (LLSC config only) > > >>ARC: atomic_cmpxchg/atomic_xchg: implement relaxed variants > > >> > > >> Will Deacon (1): > > >>ARC: switch to generic bitops > > > > > > Didn't see any weird things: > > > > > > Acked-by: Peter Zijlstra (Intel) > > > > Thx Peter. A lot of this is your code anyways ;-) > > > > Any initial thoughts/comments on patch 06/11 - is there an obvious > > reason that generic bitops take signed @nr or the hurdle is need to be > > done consistently cross-arch. > > That does indeed seem daft and ready for a cleanup. Will any > recollection from when you touched this? I had a patch to fix this but it blew up in the robot and I didn't get round to reworking it: https://lore.kernel.org/patchwork/patch/124/ Will ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [RFC] bitops/non-atomic: make @nr unsigned to avoid any DIV
On Thu, Aug 05, 2021 at 12:14:08PM -0700, Vineet Gupta wrote: > signed math causes generation of costlier instructions such as DIV when > they could be done by barrerl shifter. > > Worse part is this is not caught by things like bloat-o-meter since > instruction length / symbols are typically same size. > > e.g. > > stock (signed math) > __ > > 919b4614 : > 919b4614: div r2,r0,0x20 > ^^^ > 919b4618: add2r2,0x920f6050,r2 > 919b4620: ld_sr2,[r2,0] > 919b4622: lsr r0,r2,r0 > 919b4626: j_s.d [blink] > 919b4628: bmsk_s r0,r0,0 > 919b462a: nop_s > > (patched) unsigned math > __ > > 919b4614 : > 919b4614: lsr r2,r0,0x5 @nr/32 > ^^^ > 919b4618: add2r2,0x920f6050,r2 > 919b4620: ld_sr2,[r2,0] > 919b4622: lsr r0,r2,r0 #test_bit() > 919b4626: j_s.d [blink] > 919b4628: bmsk_s r0,r0,0 > 919b462a: nop_s Just FYI, but on arm64 the existing codegen is alright as we have both arithmetic and logical shifts. > Signed-off-by: Vineet Gupta > --- > This is an RFC for feeback, I understand this impacts every arch, > but as of now it is only buld/run tested on ARC. > --- > --- > include/asm-generic/bitops/non-atomic.h | 14 +++--- > 1 file changed, 7 insertions(+), 7 deletions(-) Acked-by: Will Deacon We should really move test_bit() into the atomic header, but I failed to fix the resulting include mess last time I tried that. Will ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
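The codegen difference follows straight from the prototype: with an unsigned @nr, the word/bit split in test_bit() is a logical shift and a remainder by a power of two, which any barrel shifter can do; with a signed nr the compiler must emit a signed-division sequence (a real DIV on ARC). A sketch of the generic non-atomic test_bit() with the RFC's unsigned parameter:

```c
#include <limits.h>

#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/*
 * With unsigned nr, nr / BITS_PER_LONG compiles to nr >> 5 (on
 * 32-bit) and nr % BITS_PER_LONG to nr & 31 -- no division needed.
 */
static int test_bit_sketch(unsigned int nr,
			   const volatile unsigned long *addr)
{
	return 1UL & (addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG));
}
```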