Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
Le 11/02/2021 à 15:30, Segher Boessenkool a écrit : On Thu, Feb 11, 2021 at 03:09:43PM +0100, Christophe Leroy wrote: Le 11/02/2021 à 12:49, Segher Boessenkool a écrit : On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote: powerpc BUG_ON() is based on using twnei or tdnei instruction, which obliges gcc to format the condition into a 0 or 1 value in a register. Huh? Why is that? Will it work better if this used __builtin_trap? Or does the kernel only detect very specific forms of trap instructions? We already made a try with __builtin_trap() 1,5 year ago, see https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.le...@c-s.fr/ The main problems encountered are: - It is only possible to use it for BUG_ON, not for WARN_ON because GCC considers it as noreturn. Is there any workaround ? A trap is noreturn by definition: -- Built-in Function: void __builtin_trap (void) This function causes the program to exit abnormally. - The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify the source file and line corresponding to the trap. How can that be done with __builtin_trap() ? The DWARF debug info should be sufficient. Perhaps you can post-process some way? You can create a trap that falls through yourself (by having a trap-on condition with a condition that is always true, but make the compiler not see that). This isn't efficient though. Could you file a feature request (in bugzilla)? It is probably useful for generic code as well, but we could implement this for powerpc only if needed. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99299 Christophe
Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
Excerpts from Segher Boessenkool's message of February 11, 2021 9:50 pm: > On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote: >> It would be nice if we could have a __builtin_trap_if that gcc would use >> conditional traps with, (and which never assumes following code is >> unreachable even for constant true, so we can use it with WARN and put >> explicit unreachable for BUG). > > It automatically does that with just __builtin_trap, see my other mail :-) If that is generated without branches (or at least with no more branches than existing asm implementation), then it could be usable without trashing CFAR. Unfortunately I don't think we will be parsing the dwarf information to get line numbers from it any time soon, so not a drop in replacement but maybe one day someone would find a solution. Thanks, Nick
Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
Le 11/02/2021 à 15:30, Segher Boessenkool a écrit : On Thu, Feb 11, 2021 at 03:09:43PM +0100, Christophe Leroy wrote: Le 11/02/2021 à 12:49, Segher Boessenkool a écrit : On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote: powerpc BUG_ON() is based on using twnei or tdnei instruction, which obliges gcc to format the condition into a 0 or 1 value in a register. Huh? Why is that? Will it work better if this used __builtin_trap? Or does the kernel only detect very specific forms of trap instructions? We already made a try with __builtin_trap() 1,5 year ago, see https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.le...@c-s.fr/ The main problems encountered are: - It is only possible to use it for BUG_ON, not for WARN_ON because GCC considers it as noreturn. Is there any workaround ? A trap is noreturn by definition: -- Built-in Function: void __builtin_trap (void) This function causes the program to exit abnormally. - The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify the source file and line corresponding to the trap. How can that be done with __builtin_trap() ? The DWARF debug info should be sufficient. Perhaps you can post-process some way? You can create a trap that falls through yourself (by having a trap-on condition with a condition that is always true, but make the compiler not see that). This isn't efficient though. Could you file a feature request (in bugzilla)? It is probably useful for generic code as well, but we could implement this for powerpc only if needed. Ok I will. For sure using __builtin_trap() would be the best. unsigned long test1(unsigned long msr) { WARN_ON((msr & (MSR_DR | MSR_IR | MSR_RI)) != (MSR_DR | MSR_IR | MSR_RI)); return msr; } unsigned long test2(unsigned long msr) { if ((msr & (MSR_DR | MSR_IR | MSR_RI)) != (MSR_DR | MSR_IR | MSR_RI)) __builtin_trap(); return msr; } 03c0 : 3c0: 70 69 00 32 andi. r9,r3,50 3c4: 69 29 00 32 xorir9,r9,50 3c8: 31 49 ff ff addic r10,r9,-1 3cc: 7d 2a 49 10 subfe r9,r10,r9 3d0: 0f 09 00 00 twnei r9,0 3d4: 4e 80 00 20 blr 03d8 : 3d8: 70 69 00 32 andi. r9,r3,50 3dc: 0f 09 00 32 twnei r9,50 3e0: 4e 80 00 20 blr Christophe
Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
On Thu, Feb 11, 2021 at 03:09:43PM +0100, Christophe Leroy wrote: > Le 11/02/2021 à 12:49, Segher Boessenkool a écrit : > >On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote: > >>powerpc BUG_ON() is based on using twnei or tdnei instruction, > >>which obliges gcc to format the condition into a 0 or 1 value > >>in a register. > > > >Huh? Why is that? > > > >Will it work better if this used __builtin_trap? Or does the kernel only > >detect very specific forms of trap instructions? > > We already made a try with __builtin_trap() 1,5 year ago, see > https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.le...@c-s.fr/ > > The main problems encountered are: > - It is only possible to use it for BUG_ON, not for WARN_ON because GCC > considers it as noreturn. Is there any workaround ? A trap is noreturn by definition: -- Built-in Function: void __builtin_trap (void) This function causes the program to exit abnormally. > - The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify > the source file and line corresponding to the trap. How can that be done > with __builtin_trap() ? The DWARF debug info should be sufficient. Perhaps you can post-process some way? You can create a trap that falls through yourself (by having a trap-on condition with a condition that is always true, but make the compiler not see that). This isn't efficient though. Could you file a feature request (in bugzilla)? It is probably useful for generic code as well, but we could implement this for powerpc only if needed. Segher
Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
Le 11/02/2021 à 12:49, Segher Boessenkool a écrit : On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote: powerpc BUG_ON() is based on using twnei or tdnei instruction, which obliges gcc to format the condition into a 0 or 1 value in a register. Huh? Why is that? Will it work better if this used __builtin_trap? Or does the kernel only detect very specific forms of trap instructions? We already made a try with __builtin_trap() 1,5 year ago, see https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.le...@c-s.fr/ The main problems encountered are: - It is only possible to use it for BUG_ON, not for WARN_ON because GCC considers it as noreturn. Is there any workaround ? - The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify the source file and line corresponding to the trap. How can that be done with __builtin_trap() ? Christophe
Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
On Thu, Feb 11, 2021 at 01:26:12PM +0100, Christophe Leroy wrote: > >What PowerPC cpus implement branch folding? I know none. > > Extract from powerpc mpc8323 reference manual: > — Zero-cycle branch capability (branch folding) Yeah, this is not what is traditionally called branch folding (which stores the instruction being branched to in some cache, originally the instruction cache itself; somewhat similar (but different) to the BTIC on 750). Overloaded terminology :-) 6xx/7xx CPUs had the branch execution unit in the frontend, and it would not issue a branch at all if it could be resolved then already. Power4 and later predict all branches, and most are not issued at all (those that do need to be executed, like bdnz, are). At completion time it is checked if the prediction was correct (and corrective action is taken if not). Segher
Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
Le 11/02/2021 à 12:49, Segher Boessenkool a écrit : On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote: powerpc BUG_ON() is based on using twnei or tdnei instruction, which obliges gcc to format the condition into a 0 or 1 value in a register. Huh? Why is that? Will it work better if this used __builtin_trap? Or does the kernel only detect very specific forms of trap instructions? By using a generic implementation, gcc will generate a branch to the unconditional trap generated by BUG(). That is many more instructions than ideal. As modern powerpc implement branch folding, that's even more efficient. What PowerPC cpus implement branch folding? I know none. Extract from powerpc mpc8323 reference manual: High instruction and data throughput — Zero-cycle branch capability (branch folding) — Programmable static branch prediction on unresolved conditional branches — Two integer units with enhanced multipliers in thee300c2 for increased integer instruction throughput and a maximum two-cycle latency for multiply instructions — Instruction fetch unit capable of fetching two instructions per clock from the instruction cache — A six-entry instruction queue (IQ) that provides lookahead capability — Independent pipelines with feed-forwarding that reduces data dependencies in hardware — 16-Kbyte, four-way set-associative instruction and data caches on the e300c2. — Cache write-back or write-through operation programmable on a per-page or per-block basis — Features for instruction and data cache locking and protection — BPU that performs CR lookahead operations — Address translation facilities for 4-Kbyte page size, variable block size, and 256-Mbyte segment size — A 64-entry, two-way, set-associative ITLB and DTLB — Eight-entry data and instruction BAT arrays providing 128-Kbyte to 256-Mbyte blocks — Software table search operations and updates supported through fast trap mechanism — 52-bit virtual address; 32-bit physical address Christophe
Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote: > Excerpts from Christophe Leroy's message of February 11, 2021 5:41 pm: > > As modern powerpc implement branch folding, that's even more efficient. Ah, it seems you mean what Arm calls branch folding. Sure, power4 already did that, and this has not changed. > I think POWER will speculate conditional traps as non faulting always > so it should be just as good if not better than the branch. Right, these are not branch instructions, so are not branch predicted; all trap instructions are assumed to fall through, like all other non-branch instructions. Segher
Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote: > It would be nice if we could have a __builtin_trap_if that gcc would use > conditional traps with, (and which never assumes following code is > unreachable even for constant true, so we can use it with WARN and put > explicit unreachable for BUG). It automatically does that with just __builtin_trap, see my other mail :-) Segher
Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote: > powerpc BUG_ON() is based on using twnei or tdnei instruction, > which obliges gcc to format the condition into a 0 or 1 value > in a register. Huh? Why is that? Will it work better if this used __builtin_trap? Or does the kernel only detect very specific forms of trap instructions? > By using a generic implementation, gcc will generate a branch > to the unconditional trap generated by BUG(). That is many more instructions than ideal. > As modern powerpc implement branch folding, that's even more efficient. What PowerPC cpus implement branch folding? I know none. Some example code generated via __builtin_trap: void trap(void) { __builtin_trap(); } void trap_if_0(int x) { if (x == 0) __builtin_trap(); } void trap_if_not_0(int x) { if (x != 0) __builtin_trap(); } -m64: trap: trap trap_if_0: tdeqi 3,0 blr trap_if_not_0: tdnei 3,0 blr -m32: trap: trap trap_if_0: tweqi 3,0 blr trap_if_not_0: twnei 3,0 blr Segher
Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
Excerpts from Christophe Leroy's message of February 11, 2021 5:41 pm: > powerpc BUG_ON() is based on using twnei or tdnei instruction, > which obliges gcc to format the condition into a 0 or 1 value > in a register. > > By using a generic implementation, gcc will generate a branch > to the unconditional trap generated by BUG(). We don't want to do this on 64s because that will lose the useful CFAR contents. Unfortunately the code generation is not great and the registers that give some useful information about the condition are often mangled :( It would be nice if we could have a __builtin_trap_if that gcc would use conditional traps with, (and which never assumes following code is unreachable even for constant true, so we can use it with WARN and put explicit unreachable for BUG). > > As modern powerpc implement branch folding, that's even more efficient. I think POWER will speculate conditional traps as non faulting always so it should be just as good if not better than the branch. Thanks, Nick
[PATCH] powerpc/bug: Remove specific powerpc BUG_ON()
powerpc BUG_ON() is based on using twnei or tdnei instruction, which obliges gcc to format the condition into a 0 or 1 value in a register. By using a generic implementation, gcc will generate a branch to the unconditional trap generated by BUG(). As modern powerpc implement branch folding, that's even more efficient. See below the difference at the entry of system_call_exception. With the patch: : 0: 81 6a 00 84 lwz r11,132(r10) 4: 90 6a 00 88 stw r3,136(r10) 8: 71 60 00 02 andi. r0,r11,2 c: 41 82 00 70 beq 7c 10: 71 60 40 00 andi. r0,r11,16384 14: 41 82 00 6c beq 80 18: 71 6b 80 00 andi. r11,r11,32768 1c: 41 82 00 68 beq 84 20: 94 21 ff e0 stwur1,-32(r1) 24: 93 e1 00 1c stw r31,28(r1) 28: 7d 8c 42 e6 mftbr12 ... 7c: 0f e0 00 00 twuir0,0 80: 0f e0 00 00 twuir0,0 84: 0f e0 00 00 twuir0,0 Without the patch: : 0: 94 21 ff e0 stwur1,-32(r1) 4: 93 e1 00 1c stw r31,28(r1) 8: 90 6a 00 88 stw r3,136(r10) c: 81 6a 00 84 lwz r11,132(r10) 10: 69 60 00 02 xorir0,r11,2 14: 54 00 ff fe rlwinm r0,r0,31,31,31 18: 0f 00 00 00 twnei r0,0 1c: 69 60 40 00 xorir0,r11,16384 20: 54 00 97 fe rlwinm r0,r0,18,31,31 24: 0f 00 00 00 twnei r0,0 28: 69 6b 80 00 xorir11,r11,32768 2c: 55 6b 8f fe rlwinm r11,r11,17,31,31 30: 0f 0b 00 00 twnei r11,0 34: 7d 8c 42 e6 mftbr12 Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/bug.h | 10 -- 1 file changed, 10 deletions(-) diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h index d1635ffbb179..21103d3e1f29 100644 --- a/arch/powerpc/include/asm/bug.h +++ b/arch/powerpc/include/asm/bug.h @@ -69,15 +69,6 @@ unreachable(); \ } while (0) -#define BUG_ON(x) do { \ - if (__builtin_constant_p(x)) { \ - if (x) \ - BUG(); \ - } else {\ - BUG_ENTRY(PPC_TLNEI " %4, 0", 0, "r" ((__force long)(x))); \ - } \ -} while (0) - #define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags)) #define WARN_ON(x) ({ \ @@ -94,7 +85,6 @@ }) #define HAVE_ARCH_BUG -#define HAVE_ARCH_BUG_ON #define HAVE_ARCH_WARN_ON #endif /* __ASSEMBLY __ */ #else -- 2.25.0