Re: [RS6000] asynch exceptions and unwind info

2011-08-01 Thread Alan Modra
On Fri, Jul 29, 2011 at 10:28:28PM +0930, Alan Modra wrote:
 libgcc/
   * config/rs6000/linux-unwind.h (frob_update_context __powerpc64__):
   Restore for indirect call bctrl from correct stack slot, and only
   if cfa+40 isn't valid.
 gcc/
   * config/rs6000/rs6000-protos.h (rs6000_save_toc_in_prologue_p): Delete.
   * config/rs6000/rs6000.c (rs6000_save_toc_in_prologue_p): Make static.
   (rs6000_emit_prologue): Don't prematurely return when
   TARGET_SINGLE_PIC_BASE.  Don't emit eh_frame info in
   save_toc_in_prologue case.
   (rs6000_call_indirect_aix): Only disallow save_toc_in_prologue for
   calls_alloca.

Approved offline and applied with a comment change.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] asynch exceptions and unwind info

2011-07-29 Thread Alan Modra
On Fri, Jul 29, 2011 at 10:57:48AM +0930, Alan Modra wrote:
 Except that any info about r2 in an indirect call sequence really
 belongs to the *called* function frame, not the caller.  I woke up
 this morning with the realization that what I'd done in
 frob_update_context for indirect call sequences was wrong.  Ditto for
 the r2 store that Michael moved into the prologue.  The only time we
 want the unwinder to restore from that particular save is if r2 isn't
 saved in the current frame.
 
 Untested patch follows.

Here's a tested patch that fixes an issue with TARGET_SINGLE_PIC_BASE and
enables Michael's save_toc_in_prologue optimization for all functions
except those that make dynamic stack adjustments.

Incidentally, the rs6000_emit_prologue comment I added below suggests
another solution.  Since all we need is the toc pointer for the frame,
it would be possible to tell the unwinder to simply load r2 from the
.opd entry.  I think.
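
For readers unfamiliar with .opd, here is a minimal sketch of what "load r2 from the .opd entry" would read. It assumes the 64-bit PowerPC ELF (ELFv1) function descriptor layout of three doublewords. The struct and helper names are illustrative and do not come from GCC, and how the unwinder would locate the descriptor for a given frame is not addressed.

```c
#include <stdint.h>

/* Function descriptor as placed in .opd under the 64-bit PowerPC ELF
   (ELFv1) ABI.  Names here are hypothetical, chosen for illustration.  */
struct opd_entry
{
  uint64_t entry;  /* address of the function's first instruction */
  uint64_t toc;    /* TOC base the function expects in r2 */
  uint64_t env;    /* environment pointer, unused by C */
};

/* Return the r2 value a function runs with, given its descriptor.  */
static uint64_t
toc_from_opd (const struct opd_entry *desc)
{
  return desc->toc;
}
```

The appeal of this approach is that the descriptor is fixed for the lifetime of the function, so no stack slot has to be valid at any particular instruction.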

libgcc/
* config/rs6000/linux-unwind.h (frob_update_context __powerpc64__):
Restore for indirect call bctrl from correct stack slot, and only
if cfa+40 isn't valid.
gcc/
* config/rs6000/rs6000-protos.h (rs6000_save_toc_in_prologue_p): Delete.
* config/rs6000/rs6000.c (rs6000_save_toc_in_prologue_p): Make static.
(rs6000_emit_prologue): Don't prematurely return when
TARGET_SINGLE_PIC_BASE.  Don't emit eh_frame info in
save_toc_in_prologue case.
(rs6000_call_indirect_aix): Only disallow save_toc_in_prologue for
calls_alloca.

Index: libgcc/config/rs6000/linux-unwind.h
===
--- libgcc/config/rs6000/linux-unwind.h (revision 176905)
+++ libgcc/config/rs6000/linux-unwind.h (working copy)
@@ -354,20 +354,22 @@ frob_update_context (struct _Unwind_Cont
  /* We are in a plt call stub or r2 adjusting long branch stub,
 before r2 has been saved.  Keep REG_UNSAVED.  */
}
-  else if (pc[0] == 0x4E800421
-   && pc[1] == 0xE8410028)
-   {
- /* We are at the bctrl instruction in a call via function
-pointer.  gcc always emits the load of the new r2 just
-before the bctrl.  */
- _Unwind_SetGRPtr (context, 2, context->cfa + 40);
-   }
   else
{
  unsigned int *insn
= (unsigned int *) _Unwind_GetGR (context, R_LR);
  if (insn && *insn == 0xE8410028)
_Unwind_SetGRPtr (context, 2, context->cfa + 40);
+ else if (pc[0] == 0x4E800421
+   && pc[1] == 0xE8410028)
+   {
+ /* We are at the bctrl instruction in a call via function
+pointer.  gcc always emits the load of the new R2 just
+before the bctrl so this is the first and only place
+we need to use the stored R2.  */
+ _Unwind_Word sp = _Unwind_GetGR (context, 1);
+ _Unwind_SetGRPtr (context, 2, sp + 40);
+   }
}
 }
 #endif
Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   (revision 176905)
+++ gcc/config/rs6000/rs6000-protos.h   (working copy)
@@ -172,8 +172,6 @@ extern void rs6000_emit_epilogue (int);
 extern void rs6000_emit_eh_reg_restore (rtx, rtx);
 extern const char * output_isel (rtx *);
 extern void rs6000_call_indirect_aix (rtx, rtx, rtx);
-extern bool rs6000_save_toc_in_prologue_p (void);
-
 extern void rs6000_aix_asm_output_dwarf_table_ref (char *);
 
 /* Declare functions in rs6000-c.c */
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 176905)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -1178,6 +1178,7 @@ static void rs6000_conditional_register_
 static void rs6000_trampoline_init (rtx, tree, rtx);
 static bool rs6000_cannot_force_const_mem (enum machine_mode, rtx);
 static bool rs6000_legitimate_constant_p (enum machine_mode, rtx);
+static bool rs6000_save_toc_in_prologue_p (void);
 
 /* Hash table stuff for keeping track of TOC entries.  */
 
@@ -20478,14 +20504,12 @@ rs6000_emit_prologue (void)
   insn = emit_insn (generate_set_vrsave (reg, info, 0));
 }
 
-  if (TARGET_SINGLE_PIC_BASE)
-return; /* Do not set PIC register */
-
   /* If we are using RS6000_PIC_OFFSET_TABLE_REGNUM, we need to set it up.  */
-  if ((TARGET_TOC && TARGET_MINIMAL_TOC && get_pool_size () != 0)
-  || (DEFAULT_ABI == ABI_V4
-  && (flag_pic == 1 || (flag_pic && TARGET_SECURE_PLT))
-  && df_regs_ever_live_p (RS6000_PIC_OFFSET_TABLE_REGNUM)))
+  if (!TARGET_SINGLE_PIC_BASE
+  && ((TARGET_TOC && TARGET_MINIMAL_TOC && get_pool_size () != 0)
+ || (DEFAULT_ABI == ABI_V4
+  && (flag_pic == 1 || (flag_pic && TARGET_SECURE_PLT))
+  && df_regs_ever_live_p (RS6000_PIC_OFFSET_TABLE_REGNUM))))
 {
   /* If 

Re: [RS6000] asynch exceptions and unwind info

2011-07-29 Thread David Edelsohn
On Thu, Jul 28, 2011 at 9:27 PM, Alan Modra amo...@gmail.com wrote:

 Right, but I was talking about the normal case, where the unwinder
 won't even look at .glink unwind info.

 The whole problem is that toc pointer copy in 40(1) is only valid
 during indirect call sequences, and iff ld inserted a stub?  I.e.
 direct calls between functions that share toc pointers never save
 the copy?

 Yes.

 Would it make sense, if a function has any indirect call, to move
 the toc pointer save into the prologue?  You'd get to avoid that
 store all the time.  Of course you'd not be able to sink the load
 after the call, but it might still be a win.  And in that special
 case you can annotate the r2 save slot just once, correctly.

 Except that any info about r2 in an indirect call sequence really
 belongs to the *called* function frame, not the caller.  I woke up
 this morning with the realization that what I'd done in
 frob_update_context for indirect call sequences was wrong.  Ditto for
 the r2 store that Michael moved into the prologue.  The only time we
 want the unwinder to restore from that particular save is if r2 isn't
 saved in the current frame.

This discussion seems to be referencing both PLT stubs and pointer
glue.  Indirect calls through a function pointer create a frame, save
R2, and the unwinder can visit that frame.  PLT stub calls are tail
calls, save R2, and the unwinder only would visit the frame if an
exception occurs in the middle of a call.  One also can add lazy
resolution using the glink code, which performs additional work in the
dynamic linker on the first call.

Which has the problem?  Which are you trying to solve?  And how is
your change solving it?

Thanks, David


Re: [RS6000] asynch exceptions and unwind info

2011-07-29 Thread Alan Modra
On Fri, Jul 29, 2011 at 09:16:09AM -0400, David Edelsohn wrote:
 Which has the problem?  Which are you trying to solve?  And how is
 your change solving it?

Michael's save_toc_in_prologue emit_frame_save writes unwind info for
the wrong frame.  That r2 save is the current r2.  What we need is
info about the previous r2, so we can restore when unwinding.

I made a similar mistake in frob_update_context in that the value
saved by an indirect function call sequence is the r2 for the current
function.  I also restored from the wrong location.
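
To make the two cases concrete, here is a simplified model of the decision the patched frob_update_context makes. It is a sketch only: the function name and the plain integer arguments are invented for illustration, a zero word stands in for "no instruction available at LR", and the earlier branch that handles plt call stubs is left out.

```c
#include <stdint.h>

#define INSN_BCTRL     0x4E800421u  /* bctrl */
#define INSN_LD_R2_40  0xE8410028u  /* ld r2,40(r1) */

/* Return the address r2 should be restored from, or 0 to leave r2
   marked as unsaved.
   insn_at_lr: instruction word at this frame's return address.
   pc:         the two instruction words at the current pc.
   cfa, sp:    canonical frame address and r1 of the current frame.  */
static uint64_t
r2_restore_slot (uint32_t insn_at_lr, const uint32_t pc[2],
                 uint64_t cfa, uint64_t sp)
{
  /* The caller reloads r2 after the call, so it saved its r2 in its
     own frame, which is at cfa+40 from here.  */
  if (insn_at_lr == INSN_LD_R2_40)
    return cfa + 40;

  /* Stopped on the bctrl of an indirect call: r2 already holds the
     callee's TOC, and this function's own r2 sits at 40(r1).  */
  if (pc[0] == INSN_BCTRL && pc[1] == INSN_LD_R2_40)
    return sp + 40;

  return 0;
}
```

Note the ordering: the cfa+40 test comes first, so the sp+40 slot is only consulted when cfa+40 is not valid, which matches the ChangeLog wording.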

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread Alan Modra
On Wed, Jul 27, 2011 at 03:00:45PM +0930, Alan Modra wrote:
 Ideally what I'd like to
 do is have ld and gcc emit accurate r2 tracking unwind info and
 dispense with hacks like frob_update_context.  If ld did emit accurate
 unwind info for .glink, then the justification for frob_update_context
 disappears.

For the record, this statement of mine doesn't make sense.  A .glink
stub doesn't make a frame, so a backtrace won't normally pass through a
stub, thus having accurate unwind info for .glink doesn't help at all.

ld would need to insert unwind info for r2 on the call, but that
involves editing .eh_frame and in any case isn't accurate since
the r2 save doesn't happen until one or two instructions after the
call, in the stub.  I think we are stuck with frob_update_context.
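
For context, frob_update_context works by matching raw instruction words. The two constants that appear in the patches in this thread can be decoded as below. This is a sketch based on the standard PowerPC DS-form and XL-form field layouts; the helper names are made up for illustration.

```c
#include <stdint.h>

/* Primary opcode: the top 6 bits of a 32-bit PowerPC instruction.  */
static unsigned primary_opcode (uint32_t insn) { return insn >> 26; }

/* DS-form fields, as used by ld and std.  */
static unsigned ds_rt (uint32_t insn) { return (insn >> 21) & 31; }
static unsigned ds_ra (uint32_t insn) { return (insn >> 16) & 31; }
static unsigned ds_xo (uint32_t insn) { return insn & 3; }
static int ds_disp (uint32_t insn) { return (int16_t) (insn & 0xFFFC); }

/* ld r2,40(r1): primary opcode 58, extended opcode 0.  */
static int
is_ld_r2_40_r1 (uint32_t insn)
{
  return primary_opcode (insn) == 58 && ds_xo (insn) == 0
         && ds_rt (insn) == 2 && ds_ra (insn) == 1
         && ds_disp (insn) == 40;
}

/* bctrl: XL-form bcctr with BO=20 (branch always), BI=0, LK=1.
   Primary opcode 19, extended opcode 528.  */
static int
is_bctrl (uint32_t insn)
{
  return primary_opcode (insn) == 19
         && ((insn >> 21) & 31) == 20     /* BO */
         && ((insn >> 16) & 31) == 0      /* BI */
         && ((insn >> 1) & 0x3FF) == 528  /* extended opcode */
         && (insn & 1) == 1;              /* LK */
}
```

By the same encoding, the matching store std r2,40(r1) has primary opcode 62 and comes out as 0xF8410028.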

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread Richard Henderson
On 07/28/2011 12:27 AM, Alan Modra wrote:
 On Wed, Jul 27, 2011 at 03:00:45PM +0930, Alan Modra wrote:
 Ideally what I'd like to
 do is have ld and gcc emit accurate r2 tracking unwind info and
 dispense with hacks like frob_update_context.  If ld did emit accurate
 unwind info for .glink, then the justification for frob_update_context
 disappears.
 
 For the record, this statement of mine doesn't make sense.  A .glink
 stub doesn't make a frame, so a backtrace won't normally pass through a
 stub, thus having accurate unwind info for .glink doesn't help at all.

It does, for the duration of the stub.

The whole problem is that toc pointer copy in 40(1) is only valid
during indirect call sequences, and iff ld inserted a stub?  I.e.
direct calls between functions that share toc pointers never save
the copy?

Would it make sense, if a function has any indirect call, to move
the toc pointer save into the prologue?  You'd get to avoid that
store all the time.  Of course you'd not be able to sink the load
after the call, but it might still be a win.  And in that special
case you can annotate the r2 save slot just once, correctly.

For functions that do not contain an indirect function call, I
don't believe that there's any way to use DW_CFA_offset that
is always correct.

One could, however, move the code in frob_update_context into a
(series of) DW_CFA_val_expression's.

  DW_CFA_val_expression
    DW_OP_reg2                // Default to the value currently in R2
    DW_OP_regx LR             // Test the insn following the call, as per frob_update_context
    DW_OP_deref_size 4
    DW_OP_const4u 0xE8410028
    DW_OP_ne
    DW_OP_bra L1
    DW_OP_drop                // Could be omitted, given that we only examine top-of-stack at the end
    DW_OP_breg1 40            // Pull the value from *(R1+40)
    DW_OP_deref
  L1:

This version could appear in the CIE.  You'd have to adjust it
once LR gets saved to the stack, and R2 isn't itself being saved
as per above.

There isn't currently a hook in dwarf2cfi to add extra stuff to
the CIE program, but that wouldn't be hard to add.  The version
that gets emitted after LR is saved would need a new note as well.
But it all seems fairly tractable to actually implement, if we
think it'll actually solve the problem.
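
As a reading aid, the expression above can be traced with a tiny stack machine in C. This is a sketch under stated assumptions: the frame_state struct is hypothetical and stands in for the register and memory reads the unwinder would perform, and the comments map each step back to the DWARF operations as written in the message.

```c
#include <stdint.h>

/* Hypothetical snapshot of the state the expression reads.  */
struct frame_state
{
  uint64_t r2;          /* value currently in r2 */
  uint32_t insn_at_lr;  /* 4-byte word at the return address */
  uint64_t slot_r1_40;  /* doubleword stored at r1+40 */
};

static uint64_t
eval_r2_expression (const struct frame_state *f)
{
  uint64_t stack[4];
  uint64_t a, b;
  int top = -1;

  stack[++top] = f->r2;          /* DW_OP_reg2 */
  stack[++top] = f->insn_at_lr;  /* DW_OP_regx LR; DW_OP_deref_size 4 */
  stack[++top] = 0xE8410028u;    /* DW_OP_const4u 0xE8410028 */

  b = stack[top--];              /* DW_OP_ne pops two, pushes a != b */
  a = stack[top--];
  stack[++top] = (a != b);

  if (stack[top--] == 0)         /* DW_OP_bra L1: branch if nonzero */
    {
      top--;                         /* DW_OP_drop */
      stack[++top] = f->slot_r1_40;  /* DW_OP_breg1 40; DW_OP_deref */
    }

  return stack[top];             /* L1: result is the top of stack */
}
```

With insn_at_lr equal to 0xE8410028 the result is the saved copy at r1+40; with any other word the result is the live r2.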


r~


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread David Edelsohn
On Thu, Jul 28, 2011 at 2:49 PM, Richard Henderson r...@redhat.com wrote:

 The whole problem is that toc pointer copy in 40(1) is only valid
 during indirect call sequences, and iff ld inserted a stub?  I.e.
 direct calls between functions that share toc pointers never save
 the copy?

 Would it make sense, if a function has any indirect call, to move
 the toc pointer save into the prologue?  You'd get to avoid that
 store all the time.  Of course you'd not be able to sink the load
 after the call, but it might still be a win.  And in that special
 case you can annotate the r2 save slot just once, correctly.

Michael Meissner recently did move R2 save into the prologue, under
certain circumstances.  See TARGET_SAVE_TOC_INDIRECT.  Limitations
include alloca (unless one re-copies the R2).  Mike also encountered
some problems with EH, which may be related to this discussion.

The other problem is hoisting the store into the prologue is not
always profitable for performance.  It should be better once shrink
wrapping is implemented.  Currently the PPC ABI may perform a lot of
stores in the prologue if the function *may* make a call.  R2 adds yet
another store to the common path.

- David


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread Richard Henderson
On 07/28/2011 12:02 PM, David Edelsohn wrote:
 The other problem is hoisting the store into the prologue is not
 always profitable for performance.  It should be better once shrink
 wrapping is implemented.  Currently the PPC ABI may perform a lot of
 stores in the prologue if the function *may* make a call.  R2 adds yet
 another store to the common path.

Well, even if we're not able to hoist the R2 store, we may be able
to simply add REG_CFA_OFFSET and REG_CFA_RESTORE notes to the insns
in the stream.


r~


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread Alan Modra
On Thu, Jul 28, 2011 at 11:49:16AM -0700, Richard Henderson wrote:
 On 07/28/2011 12:27 AM, Alan Modra wrote:
  On Wed, Jul 27, 2011 at 03:00:45PM +0930, Alan Modra wrote:
  Ideally what I'd like to
  do is have ld and gcc emit accurate r2 tracking unwind info and
  dispense with hacks like frob_update_context.  If ld did emit accurate
  unwind info for .glink, then the justification for frob_update_context
  disappears.
  
  For the record, this statement of mine doesn't make sense.  A .glink
  stub doesn't make a frame, so a backtrace won't normally pass through a
  stub, thus having accurate unwind info for .glink doesn't help at all.
 
 It does, for the duration of the stub.

Right, but I was talking about the normal case, where the unwinder
won't even look at .glink unwind info.

 The whole problem is that toc pointer copy in 40(1) is only valid
 during indirect call sequences, and iff ld inserted a stub?  I.e.
 direct calls between functions that share toc pointers never save
 the copy?

Yes.

 Would it make sense, if a function has any indirect call, to move
 the toc pointer save into the prologue?  You'd get to avoid that
 store all the time.  Of course you'd not be able to sink the load
 after the call, but it might still be a win.  And in that special
 case you can annotate the r2 save slot just once, correctly.

Except that any info about r2 in an indirect call sequence really
belongs to the *called* function frame, not the caller.  I woke up
this morning with the realization that what I'd done in
frob_update_context for indirect call sequences was wrong.  Ditto for
the r2 store that Michael moved into the prologue.  The only time we
want the unwinder to restore from that particular save is if r2 isn't
saved in the current frame.

Untested patch follows.

libgcc/
* config/rs6000/linux-unwind.h (frob_update_context __powerpc64__):
Restore for indirect call bctrl from correct stack slot, and only
if cfa+40 isn't valid.
gcc/
* config/rs6000/rs6000-protos.h (rs6000_save_toc_in_prologue_p): Delete.
* config/rs6000/rs6000.c (rs6000_save_toc_in_prologue_p): Make static.
(rs6000_emit_prologue): Don't emit eh_frame info in
save_toc_in_prologue case.
(rs6000_call_indirect_aix): Formatting.

Index: libgcc/config/rs6000/linux-unwind.h
===
--- libgcc/config/rs6000/linux-unwind.h (revision 176905)
+++ libgcc/config/rs6000/linux-unwind.h (working copy)
@@ -354,20 +354,22 @@ frob_update_context (struct _Unwind_Cont
  /* We are in a plt call stub or r2 adjusting long branch stub,
 before r2 has been saved.  Keep REG_UNSAVED.  */
}
-  else if (pc[0] == 0x4E800421
-   && pc[1] == 0xE8410028)
-   {
- /* We are at the bctrl instruction in a call via function
-pointer.  gcc always emits the load of the new r2 just
-before the bctrl.  */
- _Unwind_SetGRPtr (context, 2, context->cfa + 40);
-   }
   else
{
  unsigned int *insn
= (unsigned int *) _Unwind_GetGR (context, R_LR);
  if (insn && *insn == 0xE8410028)
_Unwind_SetGRPtr (context, 2, context->cfa + 40);
+ else if (pc[0] == 0x4E800421
+   && pc[1] == 0xE8410028)
+   {
+ /* We are at the bctrl instruction in a call via function
+pointer.  gcc always emits the load of the new R2 just
+before the bctrl so this is the first and only place
+we need to use the stored R2.  */
+ _Unwind_Word sp = _Unwind_GetGR (context, 1);
+ _Unwind_SetGRPtr (context, 2, sp + 40);
+   }
}
 }
 #endif
Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   (revision 176905)
+++ gcc/config/rs6000/rs6000-protos.h   (working copy)
@@ -172,8 +172,6 @@ extern void rs6000_emit_epilogue (int);
 extern void rs6000_emit_eh_reg_restore (rtx, rtx);
 extern const char * output_isel (rtx *);
 extern void rs6000_call_indirect_aix (rtx, rtx, rtx);
-extern bool rs6000_save_toc_in_prologue_p (void);
-
 extern void rs6000_aix_asm_output_dwarf_table_ref (char *);
 
 /* Declare functions in rs6000-c.c */
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 176905)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -1178,6 +1178,7 @@ static void rs6000_conditional_register_
 static void rs6000_trampoline_init (rtx, tree, rtx);
 static bool rs6000_cannot_force_const_mem (enum machine_mode, rtx);
 static bool rs6000_legitimate_constant_p (enum machine_mode, rtx);
+static bool rs6000_save_toc_in_prologue_p (void);
 
 /* Hash table stuff for keeping track of TOC entries.  */
 
@@ -20536,8 +20562,11 @@ rs6000_emit_prologue (void)
 
   /* If we 

Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread Alan Modra
On Thu, Jul 28, 2011 at 12:09:51PM -0700, Richard Henderson wrote:
 Well, even if we're not able to hoist the R2 store, we may be able
 to simply add REG_CFA_OFFSET and REG_CFA_RESTORE notes to the insns
 in the stream.

You'd need to mark every non-local call with something that says
R2 may be saved, effectively duplicating md_frob_update in dwarf.
I guess that is possible even without extending our eh encoding, but
each call would have at least 6 bytes added to eh_frame:
   DW_CFA_expression, 2, 3, DW_OP_skip, offset_to_r2_prog
and you'd need to emit multiple copies of r2_prog for functions that
have a lot of calls, since the offset is limited to +/-32k.  I think
that would inflate the size of .eh_frame too much, and slow down
handling of exceptions dramatically.
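
The 6-byte figure can be checked by writing the note out byte by byte. The sketch below uses the standard DWARF opcode values for DW_CFA_expression and DW_OP_skip. The function names are invented for illustration, and big-endian byte order for the 2-byte skip operand is an assumption; the real order follows the target.

```c
#include <stddef.h>
#include <stdint.h>

#define DW_CFA_expression 0x10
#define DW_OP_skip        0x2f

/* Encode "DW_CFA_expression, 2, 3, DW_OP_skip, offset_to_r2_prog".
   DW_OP_skip takes a signed 2-byte operand, hence the +/-32k reach.
   Returns the number of bytes written.  */
static size_t
encode_r2_skip_note (uint8_t out[6], int16_t offset_to_r2_prog)
{
  uint16_t u = (uint16_t) offset_to_r2_prog;

  out[0] = DW_CFA_expression;
  out[1] = 2;                   /* ULEB128 register number: r2 */
  out[2] = 3;                   /* ULEB128 length of the expression */
  out[3] = DW_OP_skip;
  out[4] = (uint8_t) (u >> 8);  /* operand, big-endian (assumed) */
  out[5] = (uint8_t) (u & 0xff);
  return 6;
}

/* Lower bound on added .eh_frame bytes: one note per non-local call,
   not counting the copies of r2_prog or any advance_loc opcodes.  */
static size_t
eh_frame_overhead (size_t n_calls)
{
  return 6 * n_calls;
}
```

So a function with 1000 non-local calls would add at least 6000 bytes before counting the r2_prog copies themselves.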

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] asynch exceptions and unwind info

2011-07-27 Thread David Edelsohn
On Wed, Jul 27, 2011 at 1:30 AM, Alan Modra amo...@gmail.com wrote:

        * config/rs6000/linux-unwind.h (frob_update_context __powerpc64__):
        Leave r2 REG_UNSAVED if stopped on the instruction that saves r2
        in a plt call stub.  Do restore r2 if stopped on bctrl.

Okay.

Thanks, David