Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-27 Thread Christophe Leroy




Le 11/02/2021 à 15:30, Segher Boessenkool a écrit :

On Thu, Feb 11, 2021 at 03:09:43PM +0100, Christophe Leroy wrote:

Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :

On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote:

powerpc BUG_ON() is based on using twnei or tdnei instruction,
which obliges gcc to format the condition into a 0 or 1 value
in a register.


Huh?  Why is that?

Will it work better if this used __builtin_trap?  Or does the kernel only
detect very specific forms of trap instructions?


We already made a try with __builtin_trap() 1,5 year ago, see
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.le...@c-s.fr/

The main problems encountered are:
- It is only possible to use it for BUG_ON, not for WARN_ON because GCC
considers it as noreturn. Is there any workaround ?


A trap is noreturn by definition:

  -- Built-in Function: void __builtin_trap (void)
  This function causes the program to exit abnormally.


- The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify
the source file and line corresponding to the trap. How can that be done
with __builtin_trap() ?


The DWARF debug info should be sufficient.  Perhaps you can post-process
some way?

You can create a trap that falls through yourself (by having a trap-on
condition with a condition that is always true, but make the compiler
not see that).  This isn't efficient though.

Could you file a feature request (in bugzilla)?  It is probably useful
for generic code as well, but we could implement this for powerpc only
if needed.



https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99299

Christophe


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Nicholas Piggin
Excerpts from Segher Boessenkool's message of February 11, 2021 9:50 pm:
> On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote:
>> It would be nice if we could have a __builtin_trap_if that gcc would use 
>> conditional traps with, (and which never assumes following code is 
>> unreachable even for constant true, so we can use it with WARN and put 
>> explicit unreachable for BUG).
> 
> It automatically does that with just __builtin_trap, see my other mail :-)

If that is generated without branches (or at least with no more
branches than existing asm implementation), then it could be usable 
without trashing CFAR.

Unfortunately I don't think we will be parsing the dwarf information
to get line numbers from it any time soon, so not a drop in replacement 
but maybe one day someone would find a solution.

Thanks,
Nick


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Christophe Leroy




Le 11/02/2021 à 15:30, Segher Boessenkool a écrit :

On Thu, Feb 11, 2021 at 03:09:43PM +0100, Christophe Leroy wrote:

Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :

On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote:

powerpc BUG_ON() is based on using twnei or tdnei instruction,
which obliges gcc to format the condition into a 0 or 1 value
in a register.


Huh?  Why is that?

Will it work better if this used __builtin_trap?  Or does the kernel only
detect very specific forms of trap instructions?


We already made a try with __builtin_trap() 1,5 year ago, see
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.le...@c-s.fr/

The main problems encountered are:
- It is only possible to use it for BUG_ON, not for WARN_ON because GCC
considers it as noreturn. Is there any workaround ?


A trap is noreturn by definition:

  -- Built-in Function: void __builtin_trap (void)
  This function causes the program to exit abnormally.


- The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify
the source file and line corresponding to the trap. How can that be done
with __builtin_trap() ?


The DWARF debug info should be sufficient.  Perhaps you can post-process
some way?

You can create a trap that falls through yourself (by having a trap-on
condition with a condition that is always true, but make the compiler
not see that).  This isn't efficient though.

Could you file a feature request (in bugzilla)?  It is probably useful
for generic code as well, but we could implement this for powerpc only
if needed.



Ok I will.


For sure using __builtin_trap() would be the best.

unsigned long test1(unsigned long msr)
{
WARN_ON((msr & (MSR_DR | MSR_IR | MSR_RI)) != (MSR_DR | MSR_IR | 
MSR_RI));

return msr;
}

unsigned long test2(unsigned long msr)
{
if ((msr & (MSR_DR | MSR_IR | MSR_RI)) != (MSR_DR | MSR_IR | MSR_RI))
__builtin_trap();

return msr;
}

03c0 :
 3c0:   70 69 00 32 andi.   r9,r3,50
 3c4:   69 29 00 32 xorir9,r9,50
 3c8:   31 49 ff ff addic   r10,r9,-1
 3cc:   7d 2a 49 10 subfe   r9,r10,r9
 3d0:   0f 09 00 00 twnei   r9,0
 3d4:   4e 80 00 20 blr

03d8 :
 3d8:   70 69 00 32 andi.   r9,r3,50
 3dc:   0f 09 00 32 twnei   r9,50
 3e0:   4e 80 00 20 blr

Christophe


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Segher Boessenkool
On Thu, Feb 11, 2021 at 03:09:43PM +0100, Christophe Leroy wrote:
> Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :
> >On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote:
> >>powerpc BUG_ON() is based on using twnei or tdnei instruction,
> >>which obliges gcc to format the condition into a 0 or 1 value
> >>in a register.
> >
> >Huh?  Why is that?
> >
> >Will it work better if this used __builtin_trap?  Or does the kernel only
> >detect very specific forms of trap instructions?
> 
> We already made a try with __builtin_trap() 1,5 year ago, see 
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.le...@c-s.fr/
> 
> The main problems encountered are:
> - It is only possible to use it for BUG_ON, not for WARN_ON because GCC 
> considers it as noreturn. Is there any workaround ?

A trap is noreturn by definition:

 -- Built-in Function: void __builtin_trap (void)
 This function causes the program to exit abnormally.

> - The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify 
> the source file and line corresponding to the trap. How can that be done 
> with __builtin_trap() ?

The DWARF debug info should be sufficient.  Perhaps you can post-process
some way?

You can create a trap that falls through yourself (by having a trap-on
condition with a condition that is always true, but make the compiler
not see that).  This isn't efficient though.

Could you file a feature request (in bugzilla)?  It is probably useful
for generic code as well, but we could implement this for powerpc only
if needed.


Segher


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Christophe Leroy




Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :

On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote:

powerpc BUG_ON() is based on using twnei or tdnei instruction,
which obliges gcc to format the condition into a 0 or 1 value
in a register.


Huh?  Why is that?

Will it work better if this used __builtin_trap?  Or does the kernel only
detect very specific forms of trap instructions?


We already made a try with __builtin_trap() 1,5 year ago, see 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20510ce03cc9463f1c9e743c1d93b939de501b53.1566219503.git.christophe.le...@c-s.fr/


The main problems encountered are:
- It is only possible to use it for BUG_ON, not for WARN_ON because GCC considers it as noreturn. Is 
there any workaround ?
- The kernel (With CONFIG_DEBUG_BUGVERBOSE) needs to be able to identify the source file and line 
corresponding to the trap. How can that be done with __builtin_trap() ?


Christophe


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Segher Boessenkool
On Thu, Feb 11, 2021 at 01:26:12PM +0100, Christophe Leroy wrote:
> >What PowerPC cpus implement branch folding?  I know none.
> 
> Extract from powerpc mpc8323 reference manual:

> — Zero-cycle branch capability (branch folding)

Yeah, this is not what is traditionally called branch folding (which
stores the instruction being branched to in some cache, originally the
instruction cache itself; somewhat similar (but different) to the BTIC
on 750).  Overloaded terminology :-)

6xx/7xx CPUs had the branch execution unit in the frontend, and it would
not issue a branch at all if it could be resolved then already.  Power4
and later predict all branches, and most are not issued at all (those
that do need to be executed, like bdnz, are).  At completion time it is
checked if the prediction was correct (and corrective action is taken if
not).


Segher


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Christophe Leroy




Le 11/02/2021 à 12:49, Segher Boessenkool a écrit :

On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote:

powerpc BUG_ON() is based on using twnei or tdnei instruction,
which obliges gcc to format the condition into a 0 or 1 value
in a register.


Huh?  Why is that?

Will it work better if this used __builtin_trap?  Or does the kernel only
detect very specific forms of trap instructions?


By using a generic implementation, gcc will generate a branch
to the unconditional trap generated by BUG().


That is many more instructions than ideal.


As modern powerpc implement branch folding, that's even more efficient.


What PowerPC cpus implement branch folding?  I know none.


Extract from powerpc mpc8323 reference manual:

High instruction and data throughput
— Zero-cycle branch capability (branch folding)
— Programmable static branch prediction on unresolved conditional branches
— Two integer units with enhanced multipliers in thee300c2 for increased 
integer instruction
throughput and a maximum two-cycle latency for multiply instructions
— Instruction fetch unit capable of fetching two instructions per clock from 
the instruction cache
— A six-entry instruction queue (IQ) that provides lookahead capability
— Independent pipelines with feed-forwarding that reduces data dependencies in 
hardware
— 16-Kbyte, four-way set-associative instruction and data caches on the e300c2.
— Cache write-back or write-through operation programmable on a per-page or 
per-block basis
— Features for instruction and data cache locking and protection
— BPU that performs CR lookahead operations
— Address translation facilities for 4-Kbyte page size, variable block size, 
and 256-Mbyte
segment size
— A 64-entry, two-way, set-associative ITLB and DTLB
— Eight-entry data and instruction BAT arrays providing 128-Kbyte to 256-Mbyte 
blocks
— Software table search operations and updates supported through fast trap 
mechanism
— 52-bit virtual address; 32-bit physical address

Christophe


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Segher Boessenkool
On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote:
> Excerpts from Christophe Leroy's message of February 11, 2021 5:41 pm:
> > As modern powerpc implement branch folding, that's even more efficient.

Ah, it seems you mean what Arm calls branch folding.  Sure, power4
already did that, and this has not changed.

> I think POWER will speculate conditional traps as non faulting always
> so it should be just as good if not better than the branch.

Right, these are not branch instructions, so are not branch predicted;
all trap instructions are assumed to fall through, like all other
non-branch instructions.


Segher


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Segher Boessenkool
On Thu, Feb 11, 2021 at 08:04:55PM +1000, Nicholas Piggin wrote:
> It would be nice if we could have a __builtin_trap_if that gcc would use 
> conditional traps with, (and which never assumes following code is 
> unreachable even for constant true, so we can use it with WARN and put 
> explicit unreachable for BUG).

It automatically does that with just __builtin_trap, see my other mail :-)


Segher


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Segher Boessenkool
On Thu, Feb 11, 2021 at 07:41:52AM +, Christophe Leroy wrote:
> powerpc BUG_ON() is based on using twnei or tdnei instruction,
> which obliges gcc to format the condition into a 0 or 1 value
> in a register.

Huh?  Why is that?

Will it work better if this used __builtin_trap?  Or does the kernel only
detect very specific forms of trap instructions?

> By using a generic implementation, gcc will generate a branch
> to the unconditional trap generated by BUG().

That is many more instructions than ideal.

> As modern powerpc implement branch folding, that's even more efficient.

What PowerPC cpus implement branch folding?  I know none.

Some example code generated via __builtin_trap:

void trap(void) { __builtin_trap(); }
void trap_if_0(int x) { if (x == 0) __builtin_trap(); }
void trap_if_not_0(int x) { if (x != 0) __builtin_trap(); }

-m64:

trap:
trap
trap_if_0:
tdeqi 3,0
blr
trap_if_not_0:
tdnei 3,0
blr

-m32:

trap:
trap
trap_if_0:
tweqi 3,0
blr
trap_if_not_0:
twnei 3,0
blr


Segher


Re: [PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-11 Thread Nicholas Piggin
Excerpts from Christophe Leroy's message of February 11, 2021 5:41 pm:
> powerpc BUG_ON() is based on using twnei or tdnei instruction,
> which obliges gcc to format the condition into a 0 or 1 value
> in a register.
> 
> By using a generic implementation, gcc will generate a branch
> to the unconditional trap generated by BUG().

We don't want to do this on 64s because that will lose the useful CFAR
contents.

Unfortunately the code generation is not great and the registers that 
give some useful information about the condition are often mangled :(

It would be nice if we could have a __builtin_trap_if that gcc would use 
conditional traps with, (and which never assumes following code is 
unreachable even for constant true, so we can use it with WARN and put 
explicit unreachable for BUG).

> 
> As modern powerpc implement branch folding, that's even more efficient.

I think POWER will speculate conditional traps as non faulting always
so it should be just as good if not better than the branch.

Thanks,
Nick


[PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

2021-02-10 Thread Christophe Leroy
powerpc BUG_ON() is based on using twnei or tdnei instruction,
which obliges gcc to format the condition into a 0 or 1 value
in a register.

By using a generic implementation, gcc will generate a branch
to the unconditional trap generated by BUG().

As modern powerpc implement branch folding, that's even more efficient.

See below the difference at the entry of system_call_exception.

With the patch:

 :
   0:   81 6a 00 84 lwz r11,132(r10)
   4:   90 6a 00 88 stw r3,136(r10)
   8:   71 60 00 02 andi.   r0,r11,2
   c:   41 82 00 70 beq 7c 
  10:   71 60 40 00 andi.   r0,r11,16384
  14:   41 82 00 6c beq 80 
  18:   71 6b 80 00 andi.   r11,r11,32768
  1c:   41 82 00 68 beq 84 
  20:   94 21 ff e0 stwur1,-32(r1)
  24:   93 e1 00 1c stw r31,28(r1)
  28:   7d 8c 42 e6 mftbr12
...
  7c:   0f e0 00 00 twuir0,0
  80:   0f e0 00 00 twuir0,0
  84:   0f e0 00 00 twuir0,0

Without the patch:

 :
   0:   94 21 ff e0 stwur1,-32(r1)
   4:   93 e1 00 1c stw r31,28(r1)
   8:   90 6a 00 88 stw r3,136(r10)
   c:   81 6a 00 84 lwz r11,132(r10)
  10:   69 60 00 02 xorir0,r11,2
  14:   54 00 ff fe rlwinm  r0,r0,31,31,31
  18:   0f 00 00 00 twnei   r0,0
  1c:   69 60 40 00 xorir0,r11,16384
  20:   54 00 97 fe rlwinm  r0,r0,18,31,31
  24:   0f 00 00 00 twnei   r0,0
  28:   69 6b 80 00 xorir11,r11,32768
  2c:   55 6b 8f fe rlwinm  r11,r11,17,31,31
  30:   0f 0b 00 00 twnei   r11,0
  34:   7d 8c 42 e6 mftbr12

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/bug.h | 10 --
 1 file changed, 10 deletions(-)

diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index d1635ffbb179..21103d3e1f29 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -69,15 +69,6 @@
unreachable();  \
 } while (0)
 
-#define BUG_ON(x) do { \
-   if (__builtin_constant_p(x)) {  \
-   if (x)  \
-   BUG();  \
-   } else {\
-   BUG_ENTRY(PPC_TLNEI " %4, 0", 0, "r" ((__force long)(x)));  
\
-   }   \
-} while (0)
-
 #define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | 
(flags))
 
 #define WARN_ON(x) ({  \
@@ -94,7 +85,6 @@
 })
 
 #define HAVE_ARCH_BUG
-#define HAVE_ARCH_BUG_ON
 #define HAVE_ARCH_WARN_ON
 #endif /* __ASSEMBLY __ */
 #else
-- 
2.25.0