Mark,

__builtin_dwarf_cfa() is lowered in clang to the LLVM intrinsic eh_dwarf_cfa.
The intrinsic takes a depth argument, which defaults to 0, with a note that
this is correct for most targets.

The intrinsic is then lowered in SelectionDAG by
PPCTargetLowering::LowerFRAMEADDR().


Can you check 1) that the depth should be 0 for ppc64/ppc32, and 2) that
LowerFRAMEADDR() does something sensible?
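
One way to sanity-check the result from the user side, independent of the
backend, is to compare the builtin's value with the address of a local in the
same function. This is only a sketch: it assumes a downward-growing stack, and
local_is_below_cfa is a name made up for illustration. If the CFA really is
the boundary with the caller's frame, every local has to sit below it. The
clang 3.8.0 output quoted below hands back the already-decremented stack
pointer, so I would expect the check to fail there.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true when a local of this function lies
// strictly below the value __builtin_dwarf_cfa() reports for this function.
// Assumes the stack grows toward lower addresses.
__attribute__((noinline)) bool local_is_below_cfa() {
  volatile char local = 0;  // volatile so the object really lives on the stack
  auto cfa = reinterpret_cast<std::uintptr_t>(__builtin_dwarf_cfa());
  auto loc = reinterpret_cast<std::uintptr_t>(&local);
  assert(cfa != 0);  // the builtin should never yield a null address
  return loc < cfa;
}
```

Calling local_is_below_cfa() from main() and printing the result on each
target/compiler pair should be enough to see whether they agree.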

There's a loop there that runs depth times, so I wonder if that makes a
difference.
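
To make the question concrete, here is a toy model of what a depth loop does.
It is not the LLVM code: Frame and walk_back are hypothetical names, and the
only thing modelled is a frame whose first word holds the caller's frame
address (the back chain on ppc). Each iteration loads through that word once.
With depth 0 the loop body never runs and the starting address comes back
unchanged, which would be consistent with the clang output quoted below
(r3 = r31) rather than the caller-side boundary that g++ produces.

```cpp
#include <cassert>

// Toy stand-in for a stack frame: only the back-chain word is modelled.
struct Frame {
  const Frame* back_chain;  // caller's frame, nullptr for the outermost frame
};

// Follow the back chain `depth` times, starting from `start`.
// Stops early if the chain runs out.
const Frame* walk_back(const Frame* start, unsigned depth) {
  assert(start != nullptr);  // the model needs a valid starting frame
  const Frame* f = start;
  while (depth-- > 0 && f != nullptr) {
    f = f->back_chain;
  }
  return f;
}
```

For three nested frames outer <- caller <- callee, walk_back(&callee, 0)
gives &callee and walk_back(&callee, 1) gives &caller.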

Thanks, Roman


On Sat, Feb 27, 2016 at 05:55:02PM -0800, Mark Millard wrote:
> I discovered on powerpc that __builtin_dwarf_cfa() for clang 3.8.0 and g++ do 
> not agree. For powerpc this breaks C++ exception handling (via the use in 
> libgcc_s's unwind handling), resulting in uncaught exceptions and SEGVs. 
> objdump -d for the two-line source file below shows the low-level differences.
> 
> > extern void g(void*);
> > void f() { g(__builtin_dwarf_cfa()); }
> 
> I've also shown the same issue for powerpc64.
> 
> The issue is where g's argument value points relative to f's frame and f's 
> caller's frame (since __builtin_dwarf_cfa() is called by f, not g).
> 
> And now for armv6 . . .
> 
> > # clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> > 
> > builtin_dwarf_cfa.o:     file format elf32-littlearm
> > 
> > 
> > Disassembly of section .text:
> > 00000000 <_Z1fv> push       {fp, lr}
> > 00000004 <_Z1fv+0x4> mov    fp, sp
> > 00000008 <_Z1fv+0x8> mov    r0, fp
> > 0000000c <_Z1fv+0xc> bl     00000000 <_Z1gPv>
> > 00000010 <_Z1fv+0x10> pop   {fp, pc}
> 
> vs.
> 
> > # g++5 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> > 
> > builtin_dwarf_cfa.o:     file format elf32-littlearm
> > 
> > 
> > Disassembly of section .text:
> > 00000000 <_Z1fv> push       {fp, lr}
> > 00000004 <_Z1fv+0x4> add    fp, sp, #4, 0
> > 00000008 <_Z1fv+0x8> add    r3, fp, #4, 0
> > 0000000c <_Z1fv+0xc> mov    r0, r3
> > 00000010 <_Z1fv+0x10> bl    00000000 <_Z1gPv>
> > 00000014 <_Z1fv+0x14> nop                   ; (mov r0, r0)
> > 00000018 <_Z1fv+0x18> pop   {fp, pc}
> 
> 
> They do not agree.
> 
> So any infrastructure based on __builtin_dwarf_cfa() use will be compiler 
> sensitive for armv6 as well.
> 
> [It is my understanding that what g++ does is what the normal sort of 
> .eh_frame infrastructure is designed for: pointing at the boundary between 
> the caller's and callee's frames.]
> 
> 
> For reference: powerpc64 and powerpc results. . .
> 
> > # clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> > 
> > builtin_dwarf_cfa.o:     file format elf64-powerpc-freebsd
> > 
> > 
> > Disassembly of section .text:
> > 0000000000000000 <._Z1fv> mflr    r0
> > 0000000000000004 <._Z1fv+0x4> std     r31,-8(r1)
> > 0000000000000008 <._Z1fv+0x8> std     r0,16(r1)
> > 000000000000000c <._Z1fv+0xc> stdu    r1,-128(r1)
> > 0000000000000010 <._Z1fv+0x10> mr      r31,r1
> > 0000000000000014 <._Z1fv+0x14> mr      r3,r31
> > 0000000000000018 <._Z1fv+0x18> bl      0000000000000018 <._Z1fv+0x18>
> > 000000000000001c <._Z1fv+0x1c> nop
> > 0000000000000020 <._Z1fv+0x20> addi    r1,r1,128
> > 0000000000000024 <._Z1fv+0x24> ld      r0,16(r1)
> > 0000000000000028 <._Z1fv+0x28> ld      r31,-8(r1)
> > 000000000000002c <._Z1fv+0x2c> mtlr    r0
> > 0000000000000030 <._Z1fv+0x30> blr
> >         ...
> 
> r3 does not point to a boundary with f's caller's stack frame.
> 
> By contrast for g++49:
> 
> > # g++49 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o | more
> > 
> > builtin_dwarf_cfa.o:     file format elf64-powerpc-freebsd
> > 
> > 
> > Disassembly of section .text:
> > 0000000000000000 <._Z1fv> mflr    r0
> > 0000000000000004 <._Z1fv+0x4> std     r0,16(r1)
> > 0000000000000008 <._Z1fv+0x8> std     r31,-8(r1)
> > 000000000000000c <._Z1fv+0xc> stdu    r1,-128(r1)
> > 0000000000000010 <._Z1fv+0x10> mr      r31,r1
> > 0000000000000014 <._Z1fv+0x14> addi    r9,r31,128
> > 0000000000000018 <._Z1fv+0x18> mr      r3,r9
> > 000000000000001c <._Z1fv+0x1c> bl      000000000000001c <._Z1fv+0x1c>
> > 0000000000000020 <._Z1fv+0x20> nop
> > 0000000000000024 <._Z1fv+0x24> addi    r1,r31,128
> > 0000000000000028 <._Z1fv+0x28> ld      r0,16(r1)
> > 000000000000002c <._Z1fv+0x2c> mtlr    r0
> > 0000000000000030 <._Z1fv+0x30> ld      r31,-8(r1)
> > 0000000000000034 <._Z1fv+0x34> blr
> > 0000000000000038 <._Z1fv+0x38> .long 0x0
> > 000000000000003c <._Z1fv+0x3c> .long 0x90001
> > 0000000000000040 <._Z1fv+0x40> lwz     r0,1(r1)
> 
> r3 does point to a boundary with f's caller's stack frame.
> 
> For TARGET_ARCH=powerpc, clang 3.8.0 first:
> 
> > # clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> > 
> > builtin_dwarf_cfa.o:     file format elf32-powerpc-freebsd
> > 
> > 
> > Disassembly of section .text:
> > 00000000 <_Z1fv> mflr    r0
> > 00000004 <_Z1fv+0x4> stw     r31,-4(r1)
> > 00000008 <_Z1fv+0x8> stw     r0,4(r1)
> > 0000000c <_Z1fv+0xc> stwu    r1,-16(r1)
> > 00000010 <_Z1fv+0x10> mr      r31,r1
> > 00000014 <_Z1fv+0x14> mr      r3,r31
> > 00000018 <_Z1fv+0x18> bl      00000018 <_Z1fv+0x18>
> > 0000001c <_Z1fv+0x1c> addi    r1,r1,16
> > 00000020 <_Z1fv+0x20> lwz     r0,4(r1)
> > 00000024 <_Z1fv+0x24> lwz     r31,-4(r1)
> > 00000028 <_Z1fv+0x28> mtlr    r0
> > 0000002c <_Z1fv+0x2c> blr
> 
> Then g++5 (5.3):
> 
> > # g++5 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> > 
> > builtin_dwarf_cfa.o:     file format elf32-powerpc-freebsd
> > 
> > 
> > Disassembly of section .text:
> > 00000000 <_Z1fv> stwu    r1,-16(r1)
> > 00000004 <_Z1fv+0x4> mflr    r0
> > 00000008 <_Z1fv+0x8> stw     r0,20(r1)
> > 0000000c <_Z1fv+0xc> stw     r31,12(r1)
> > 00000010 <_Z1fv+0x10> mr      r31,r1
> > 00000014 <_Z1fv+0x14> addi    r9,r31,16
> > 00000018 <_Z1fv+0x18> mr      r3,r9
> > 0000001c <_Z1fv+0x1c> bl      0000001c <_Z1fv+0x1c>
> > 00000020 <_Z1fv+0x20> nop
> > 00000024 <_Z1fv+0x24> addi    r11,r31,16
> > 00000028 <_Z1fv+0x28> lwz     r0,4(r11)
> > 0000002c <_Z1fv+0x2c> mtlr    r0
> > 00000030 <_Z1fv+0x30> lwz     r31,-4(r11)
> > 00000034 <_Z1fv+0x34> mr      r1,r11
> > 00000038 <_Z1fv+0x38> blr
> 
> 
> The historical note below is from before I'd discovered powerpc64 or armv6 
> have the same sort of issue. But it gives an example use that is broken for 
> powerpc and powerpc64. (I do not know if armv6 uses the same infrastructure.)
> 
> ===
> Mark Millard
> markmi at dsl-only.net
> 
> On 2016-Feb-27, at 3:31 PM, Mark Millard <markmi at dsl-only.net> wrote:
> > 
> > [Top post for dinging the low level problem that directly breaks c++ 
> > exception handling for TARGET_ARCH=powerpc for clang 3.8.0 generated code.]
> > 
> > I've tracked down the c++ exception problem for TARGET_ARCH=powerpc via 
> > clang 3.8.0: misbehavior of clang 3.8.0 code generation for 
> > __builtin_dwarf_cfa () as used in:
> > 
> > #define uw_init_context(CONTEXT)                                           \
> >  do                                                                       \
> >    {                                                                      \
> >      /* Do any necessary initialization to access arbitrary stack frames. \
> >         On the SPARC, this means flushing the register windows.  */       \
> >      __builtin_unwind_init ();                                            \
> >      uw_init_context_1 (CONTEXT, __builtin_dwarf_cfa (),                  \
> >                         __builtin_return_address (0));                    \
> >    }                                                                      \
> >  while (0)
> > . . .
> > _Unwind_Reason_Code
> > _Unwind_RaiseException(struct _Unwind_Exception *exc)
> > {
> >   struct _Unwind_Context this_context, cur_context;
> >   _Unwind_Reason_Code code;
> > 
> >   /* Set up this_context to describe the current stack frame.  */
> >   uw_init_context (&this_context);
> > 
> > In the below r4 ends up with the __builtin_dwarf_cfa () value supplied to 
> > uw_init_context_1:
> > 
> > Dump of assembler code for function _Unwind_RaiseException:
> >   0x419a8fd8 <+0>:  mflr    r0
> >   0x419a8fdc <+4>:  stw     r31,-148(r1)
> >   0x419a8fe0 <+8>:  stw     r30,-152(r1)
> >   0x419a8fe4 <+12>: stw     r0,4(r1)
> >   0x419a8fe8 <+16>: stwu    r1,-2992(r1)
> >   0x419a8fec <+20>: mr      r31,r1
> > . . .
> >   0x419a9094 <+188>:        mr      r4,r31
> >   0x419a9098 <+192>:        mflr    r30
> >   0x419a909c <+196>:        lwz     r5,2996(r31)
> >   0x419a90a0 <+200>:        mr      r3,r28
> >   0x419a90a4 <+204>:        bl      0x419a929c <uw_init_context_1>
> > 
> > So r4 ends up holding the stack pointer value from after it has been 
> > decremented. r4 is not pointing at the boundary with the caller's frame.
> > 
> > The .eh_frame information and unwind code are set up for a CFA that points 
> > at the boundary with the caller's frame. So CFA-relative addressing ends up 
> > reading from the wrong locations.
> > 
> > Contrast this with gcc/g++ 5.3's TARGET_ARCH=powerpc64 code where r4 is  
> > made to be at the boundary with the caller's frame:
> > 
> > Dump of assembler code for function _Unwind_RaiseException:
> >   0x00000000501cb810 <+0>:  mflr    r0
> >   0x00000000501cb814 <+4>:  stdu    r1,-5648(r1)
> > . . .
> >   0x00000000501cb8d0 <+192>:        addi    r4,r1,5648
> >   0x00000000501cb8d4 <+196>:        stw     r12,5656(r1)
> >   0x00000000501cb8d8 <+200>:        mr      r28,r3
> >   0x00000000501cb8dc <+204>:        addi    r31,r1,2544
> >   0x00000000501cb8e0 <+208>:        mr      r3,r27
> >   0x00000000501cb8e4 <+212>:        addi    r29,r1,112
> >   0x00000000501cb8e8 <+216>:        bl      0x501cae60 <uw_init_context_1>
> > 
> > 
> > NOTE: The powerpc (32-bit) issue may in some way be associated with the 
> > clang 3.8.0 powerpc ABI violation in how it handles the stack pointer for 
> > FreeBSD: TARGET_ARCH=powerpc is currently using a "red zone", decrementing 
> > the stack pointer late, and incrementing the stack pointer early compared 
> > to the FreeBSD ABI rules. (This is similar to the official FreeBSD ABI for 
> > TARGET_ARCH=powerpc64.)
> > 
> > 
> > 
> > 
> > ===
> > Mark Millard
> > markmi at dsl-only.net
