Re: var-tracking and s390's LM(G)

2014-02-06 Thread Richard Sandiford
Richard Henderson r...@redhat.com writes:
 On 02/05/2014 02:30 PM, Ulrich Weigand wrote:
 Jakub Jelinek wrote:
 On Wed, Feb 05, 2014 at 10:26:16PM +0100, Ulrich Weigand wrote:
 Actually, now I think the problem originally described there is still
 valid: on s390 the CFA is *not* equal to the value at function entry,
 but biased by 96/160 bytes.  So setting the SP to the CFA is wrong ...

 Such biasing happens on most of the targets, e.g. x86_64 has upon function
 entry CFA set to %rsp + 8, i?86 to %esp + 4.
 
 Ah, but that's a different bias.  In those cases it is still correct
 to *unwind* SP to the CFA, since that's the value SP needs to have
 back in the caller immediately after the call site.
 
 On s390, the call/return instructions do not modify the SP so the
 SP at function entry is equal to the SP in the caller after return,
 but neither of these is equal to the CFA.  Instead, SP points to
 96/160 bytes below the CFA ...   Therefore you cannot simply
 unwind SP to the CFA.

 I was about to reply the same to Jakub, Ulrich beat me to it.

 Basically, the CFA has been defined incorrectly for s390, but it hasn't
 mattered since they record the save of the SP.  But the result is that you
 can't stop recording the save of SP without also adjusting how the CFA
 is defined.

 Which _can_ be done, in a way that's fully compatible with all of the existing
 unwinding.  But certainly shouldn't be attempted at this stage of development.

OK, I agree that's not 4.9 material.  What about the other change
of replacing:

   REF_CFA_DEF_CFA (plus (stack_pointer_rtx) (const_int 160/96))

with:

   REF_CFA_ADJUST_CFA (set (stack_pointer_rtx)
   (plus (current_cfa_base) (const_int offset)))

?  That works on its own, but having both a REG_CFA_ADJUST_CFA that assigns
to stack_pointer_rtx and a REG_CFA_RESTORE for stack_pointer_rtx feels like
a double assignment.  Personally I'd prefer to leave the REG_CFA_DEF_CFA
as-is for now and change all three (the save, the restore and the CFA
definition) at the same time.

I'm also a bit worried about handling REG_CFA_ADJUST_CFA in
var-tracking.c at this stage since AIUI it used to be valid for
a backend to use CFA notes to summarise the effect of several
instructions, with only the last of them being marked frame-related.
If we look at non-frame-related insn patterns as well as CFA notes
then we could end up double-counting.  But I suppose the same goes for
REG_FRAME_RELATED_EXPRs and insn_stack_adjust_offset_pre_post has been
using those without any known problems.

FWIW here's the patch I tested.

Thanks,
Richard


gcc/
* var-tracking.c (insn_stack_adjust_offset_pre_post): Handle
REG_CFA_ADJUST_CFA.
* config/s390/s390.c (s390_add_restore_cfa_note): New function.
(s390_emit_epilogue): Use it.

Index: gcc/var-tracking.c
===
--- gcc/var-tracking.c  2014-02-06 09:47:55.564375661 +
+++ gcc/var-tracking.c  2014-02-06 09:47:57.259364367 +
@@ -807,6 +807,14 @@ insn_stack_adjust_offset_pre_post (rtx i
   rtx expr = find_reg_note (insn, REG_FRAME_RELATED_EXPR, NULL_RTX);
   if (expr)
pattern = XEXP (expr, 0);
+  else
+   {
+ expr = find_reg_note (insn, REG_CFA_ADJUST_CFA, NULL_RTX);
+ if (expr
+  XEXP (expr, 0)
+  SET_DEST (XEXP (expr, 0)) == stack_pointer_rtx)
+   pattern = XEXP (expr, 0);
+   }
 }
 
   if (GET_CODE (pattern) == SET)
Index: gcc/config/s390/s390.c
===
--- gcc/config/s390/s390.c  2014-02-06 09:47:55.564375661 +
+++ gcc/config/s390/s390.c  2014-02-06 09:48:47.378031557 +
@@ -8775,6 +8775,22 @@ s390_emit_stack_tie (void)
   emit_insn (gen_stack_tie (mem));
 }
 
+/* INSN is the epilogue instruction that sets the stack pointer to its
+   final end-of-function value.  Add a REG_CFA_ADJUST_CFA to INSN to
+   describe that final value.  FP_OFFSET is the amount that needs to be
+   added to the current hard frame pointer or stack pointer in order to
+   get the final value.  */
+
+static void
+s390_add_restore_cfa_note (rtx insn, HOST_WIDE_INT fp_offset)
+{
+  rtx base = frame_pointer_needed ? hard_frame_pointer_rtx : stack_pointer_rtx;
+  if (base != stack_pointer_rtx || fp_offset != 0)
+add_reg_note (insn, REG_CFA_ADJUST_CFA,
+ gen_rtx_SET (VOIDmode, stack_pointer_rtx,
+  plus_constant (Pmode, base, fp_offset)));
+}
+
 /* Copy GPRS into FPR save slots.  */
 
 static void
@@ -9316,9 +9332,7 @@ s390_emit_epilogue (bool sibcall)
   cfun_frame_layout.last_restore_gpr);
   insn = emit_insn (insn);
   REG_NOTES (insn) = cfa_restores;
-  add_reg_note (insn, REG_CFA_DEF_CFA,
-   plus_constant (Pmode, stack_pointer_rtx,
-  STACK_POINTER_OFFSET));
+  

Re: var-tracking and s390's LM(G)

2014-02-06 Thread Richard Henderson
On 02/06/2014 01:55 AM, Richard Sandiford wrote:
 OK, I agree that's not 4.9 material.  What about the other change
 of replacing:
 
REF_CFA_DEF_CFA (plus (stack_pointer_rtx) (const_int 160/96))
 
 with:
 
REF_CFA_ADJUST_CFA (set (stack_pointer_rtx)
(plus (current_cfa_base) (const_int offset)))
 
 ?  That works on its own, but having both a REG_CFA_ADJUST_CFA that assigns
 to stack_pointer_rtx and a REG_CFA_RESTORE for stack_pointer_rtx feels like
 a double assignment.

It does seem like it.  I suppose it would be easy to suppress the RESTORE of
the stack pointer, without changing the save at all.

 Personally I'd prefer to leave the REG_CFA_DEF_CFA
 as-is for now and change all three (the save, the restore and the CFA
 definition) at the same time.

Yeah, well...

   * var-tracking.c (insn_stack_adjust_offset_pre_post): Handle
   REG_CFA_ADJUST_CFA.
   * config/s390/s390.c (s390_add_restore_cfa_note): New function.
   (s390_emit_epilogue): Use it.

Looks fine to me, as far as it goes.


r~


Re: var-tracking and s390's LM(G)

2014-02-06 Thread Richard Sandiford
Richard Henderson r...@redhat.com writes:
 On 02/06/2014 01:55 AM, Richard Sandiford wrote:
 OK, I agree that's not 4.9 material.  What about the other change
 of replacing:
 
REF_CFA_DEF_CFA (plus (stack_pointer_rtx) (const_int 160/96))
 
 with:
 
REF_CFA_ADJUST_CFA (set (stack_pointer_rtx)
(plus (current_cfa_base) (const_int offset)))
 
 ?  That works on its own, but having both a REG_CFA_ADJUST_CFA that assigns
 to stack_pointer_rtx and a REG_CFA_RESTORE for stack_pointer_rtx feels like
 a double assignment.

 It does seem like it.  I suppose it would be easy to suppress the RESTORE of
 the stack pointer, without changing the save at all.

But if having a restore in the presence of a save doesn't matter, why do
we have restores for the other registers?  If the idea is that we never
care what the CFI state is after the LM(G) then why not omit all of them?

Or do you mean that REG_CFA_ADJUST_CFA would act as a REG_CFA_RESTORE too,
if there had been a previous save?

Thanks,
Richard



Re: var-tracking and s390's LM(G)

2014-02-06 Thread Richard Henderson
On 02/06/2014 06:48 AM, Richard Sandiford wrote:
 Richard Henderson r...@redhat.com writes:
 On 02/06/2014 01:55 AM, Richard Sandiford wrote:
 OK, I agree that's not 4.9 material.  What about the other change
 of replacing:

REF_CFA_DEF_CFA (plus (stack_pointer_rtx) (const_int 160/96))

 with:

REF_CFA_ADJUST_CFA (set (stack_pointer_rtx)
(plus (current_cfa_base) (const_int offset)))

 ?  That works on its own, but having both a REG_CFA_ADJUST_CFA that assigns
 to stack_pointer_rtx and a REG_CFA_RESTORE for stack_pointer_rtx feels like
 a double assignment.

 It does seem like it.  I suppose it would be easy to suppress the RESTORE of
 the stack pointer, without changing the save at all.
 
 But if having a restore in the presence of a save doesn't matter, why do
 we have restores for the other registers?  If the idea is that we never
 care what the CFI state is after the LM(G) then why not omit all of them?
 
 Or do you mean that REG_CFA_ADJUST_CFA would act as a REG_CFA_RESTORE too,
 if there had been a previous save?

Hum.  Your response does make it clear that I may need more coffee.
What I wrote above doesn't really make sense.


r~




Re: var-tracking and s390's LM(G)

2014-02-05 Thread Richard Sandiford
Jakub Jelinek ja...@redhat.com writes:
 On Tue, Feb 04, 2014 at 12:21:14PM +, Richard Sandiford wrote:
 Jakub Jelinek ja...@redhat.com writes:
  But then we wouldn't be able to use var-tracking when __builtin_eh_return
  is used, since in that case replacing the (set (reg ...) (mem ...))
  with a (plus ...) would be incorrect -- the value we're loading from the
  stack will have had a variable adjustment applied.  And I know from 
  painful
  experience that being able to debug the unwind code is very useful. :-)
 
  Aren't functions using EH_RETURN typically using frame pointer?
 
 Sorry, forgot to respond to this bit.  On s390, _Unwind_RaiseException and
 _Unwind_ForcedUnwind don't use a frame pointer, at least not on trunk.
 %r11 is used as a general scratch register instead.  E.g.:
 
 2ba8 _Unwind_ForcedUnwind:
 2ba8:   90 6f f0 18 stm %r6,%r15,24(%r15)
 2bac:   0d d0   basr%r13,%r0
 2bae:   60 40 f0 50 std %f4,80(%r15)
 2bb2:   60 60 f0 58 std %f6,88(%r15)
 2bb6:   a7 fa fd f8 ahi %r15,-520
 2bba:   58 c0 d0 9e l   %r12,158(%r13)
 2bbe:   58 10 d0 9a l   %r1,154(%r13)
 2bc2:   18 b2   lr  %r11,%r2
 ...
 2c10:   98 6f f2 20 lm  %r6,%r15,544(%r15)
 2c14:   07 f4   br  %r4

 Looking at pr54693-2.c -O2 -g -m64 -march=z196 (was that the one you meant)
 with your patches, I see:
 (insn/f:TI 122 30 31 4 (parallel [
 (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 80 [0x50])) [3  S8 A8])
 (reg:DI 10 %r10))
 (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 88 [0x58])) [3  S8 A8])
 (reg:DI 11 %r11))
 (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 96 [0x60])) [3  S8 A8])
 (reg:DI 12 %r12))
 (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 104 [0x68])) [3  S8 A8])
 (reg:DI 13 %r13))
 (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 112 [0x70])) [3  S8 A8])
 (reg:DI 14 %r14))
 (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 120 [0x78])) [3  S8 A8])
 (reg/f:DI 15 %r15))
 ]) pr54693-2.c:16 94 {*store_multiple_di}
  (expr_list:REG_DEAD (reg:DI 14 %r14)
 (expr_list:REG_DEAD (reg:DI 13 %r13)
 (expr_list:REG_DEAD (reg:DI 12 %r12)
 (expr_list:REG_DEAD (reg:DI 11 %r11)
 (expr_list:REG_DEAD (reg:DI 10 %r10)
 (nil)))
 ...
 (insn/f 123 31 124 4 (set (reg/f:DI 15 %r15)
 (plus:DI (reg/f:DI 15 %r15)
 (const_int -160 [0xff60]))) pr54693-2.c:16 65 {*la_64}
  (expr_list:REG_FRAME_RELATED_EXPR (set (reg/f:DI 15 %r15)
 (plus:DI (reg/f:DI 15 %r15)
 (const_int -160 [0xff60])))
 (nil)))
 ...
 (insn/f:TI 135 134 136 7 (parallel [
 (set (reg:DI 10 %r10)
 (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 240 [0xf0])) [3  S8 A8]))
 (set (reg:DI 11 %r11)
 (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 248 [0xf8])) [3  S8 A8]))
 (set (reg:DI 12 %r12)
 (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 256 [0x100])) [3  S8 A8]))
 (set (reg:DI 13 %r13)
 (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 264 [0x108])) [3  S8 A8]))
 (set (reg:DI 14 %r14)
 (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 272 [0x110])) [3  S8 A8]))
 (set (reg/f:DI 15 %r15)
 (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 280 [0x118])) [3  S8 A8]))
 ]) pr54693-2.c:25 92 {*load_multiple_di}
  (expr_list:REG_CFA_DEF_CFA (plus:DI (reg/f:DI 15 %r15)
 (const_int 160 [0xa0]))
 (expr_list:REG_CFA_RESTORE (reg/f:DI 15 %r15)
 (expr_list:REG_CFA_RESTORE (reg:DI 14 %r14)
 (expr_list:REG_CFA_RESTORE (reg:DI 13 %r13)
 (expr_list:REG_CFA_RESTORE (reg:DI 12 %r12)
 (expr_list:REG_CFA_RESTORE (reg:DI 11 %r11)
 (expr_list:REG_CFA_RESTORE (reg:DI 10 %r10)
 (nil)
 In a function that doesn't need frame pointer, I'd say this is a serious
 bloat of the unwind info, you are saving/restoring %r15 not because you have
 to, but just that it seems to be cheaper for the target.  So, I'd say you
 shouldn't emit DW_CFA_offset 15, 5 on the insn 122 

Re: var-tracking and s390's LM(G)

2014-02-05 Thread Jakub Jelinek
On Wed, Feb 05, 2014 at 03:32:22PM +, Richard Sandiford wrote:
  In a function that doesn't need frame pointer, I'd say this is a serious
  bloat of the unwind info, you are saving/restoring %r15 not because you have
  to, but just that it seems to be cheaper for the target.  So, I'd say you
  shouldn't emit DW_CFA_offset 15, 5 on the insn 122 in the prologue and
  DW_CFA_restore 15 in the epilogue (twice in this case), that just waste
  of .eh_frame/.debug_frame space.  I'd say you should represent in
  this case the *store_multiple_di as REG_FRAME_RELATED_EXPR
  with all but the last set in the PARALLEL, and on the *load_multiple_di
  remove REG_CFA_RESTORE (r15) note and change REG_CFA_DEF_CFA note to
  REG_CFA_ADJUST_CFA note which would say that stack pointer has been
  incremented by 160 bytes (epilogue expansion knows that value).
 
 I think at the moment we're relying on the DW_CFA_offset 15s to
 provide correct %r15 values on eh_return.  Every non-leaf function
 saves the stack pointer and the unwind routines track the %r15 save
 slots like they would track any call-saved register.  In other words,
 the final eh_return %r15 comes directly from a save slot, without any
 CFA-specific handling.  If some frames omit the CFI for the %r15 save
 then we'll just keep the save slot from the previous (inner) frame instead.

CCing Richard Henderson into the discussion.

 The new unwind routines would handle 4.9+ and pre-4.9 frames, but pre-4.9
 unwind routines wouldn't handle 4.9+ frames, so would that require a version
 bump on the unwind symbols?

No need to version anything IMHO.  We've never done that for any port
where we've started to emit something the old unwinder wouldn't cope with,
it is user's responsibility to use new enough libgcc_s.

BTW, what is the reason why %r15 is saved/restored from the stack at all say
for simple:
void
foo (void)
{
  int a = 6;
  asm volatile (# %0 : +m (a));
}

Is that some ABI matter where it effectively always uses requires to use
some kind of frame pointer (which the saved previous stack pointer is)?
It can surely make debugging without unwind info easier, ditto backtraces,
on the other side it must have a runtime cost.

Jakub


Re: var-tracking and s390's LM(G)

2014-02-05 Thread Richard Henderson
On 02/05/2014 07:58 AM, Jakub Jelinek wrote:
 On Wed, Feb 05, 2014 at 03:32:22PM +, Richard Sandiford wrote:
 I think at the moment we're relying on the DW_CFA_offset 15s to
 provide correct %r15 values on eh_return.  Every non-leaf function
 saves the stack pointer and the unwind routines track the %r15 save
 slots like they would track any call-saved register.  In other words,
 the final eh_return %r15 comes directly from a save slot, without any
 CFA-specific handling.  If some frames omit the CFI for the %r15 save
 then we'll just keep the save slot from the previous (inner) frame instead.
 
 CCing Richard Henderson into the discussion.

If the sp is saved for a given frame, we'll use that instead of the CFA when
unwinding to a previous frame.  That should be unaffected by whether or not sp
is saved for any particular frame.

There appears to be code in uw_install_context_1 to handle a mix of sp-saving
and non-sp-saving frames.  Given that s390 is pretty much the only target that
uses these paths, I suppose it could be broken.  But if it is, it would surely
be better to fix than just say it doesn't work.

 BTW, what is the reason why %r15 is saved/restored from the stack at all say
 for simple:
 void
 foo (void)
 {
   int a = 6;
   asm volatile (# %0 : +m (a));
 }
 
 Is that some ABI matter where it effectively always uses requires to use
 some kind of frame pointer (which the saved previous stack pointer is)?
 It can surely make debugging without unwind info easier, ditto backtraces,
 on the other side it must have a runtime cost.

As far as I recall, the return address is in r14 so the ABI also saves r15 for
all non-leaf functions simply because it is practically free to do so with
store-multiple.  This creates the familiar ra/fp unwinding chain.  Given how
cheap it is to maintain this chain on s390, I see no particular call to break 
it.

As for why r15 gets saved for this leaf example... I have no idea.
Seems like a bug to me, frankly.


r~


Re: var-tracking and s390's LM(G)

2014-02-05 Thread Richard Sandiford
Richard Henderson r...@redhat.com writes:
 On 02/05/2014 07:58 AM, Jakub Jelinek wrote:
 On Wed, Feb 05, 2014 at 03:32:22PM +, Richard Sandiford wrote:
 I think at the moment we're relying on the DW_CFA_offset 15s to
 provide correct %r15 values on eh_return.  Every non-leaf function
 saves the stack pointer and the unwind routines track the %r15 save
 slots like they would track any call-saved register.  In other words,
 the final eh_return %r15 comes directly from a save slot, without any
 CFA-specific handling.  If some frames omit the CFI for the %r15 save
 then we'll just keep the save slot from the previous (inner) frame instead.
 
 CCing Richard Henderson into the discussion.

 If the sp is saved for a given frame, we'll use that instead of the CFA when
 unwinding to a previous frame.  That should be unaffected by whether or not sp
 is saved for any particular frame.

 There appears to be code in uw_install_context_1 to handle a mix of sp-saving
 and non-sp-saving frames.  Given that s390 is pretty much the only target that
 uses these paths, I suppose it could be broken.  But if it is, it would surely
 be better to fix than just say it doesn't work.

OK.  When I was looking at it earlier today, the reason it didn't work
seemed to be that if %r15 wasn't saved for a particular frame, the context
would inherit the %r15 save slot for the previous frame, just like when
some functions use and save a particular call-saved register and some don't.
So we would only get to uw_install_context_1 without a %r15 slot if none
of the frames had one.

In other words, the reason seemed to be that the _Unwind_SetGRPtr in:

#ifdef EH_RETURN_STACKADJ_RTX
  /* Special handling here: Many machines do not use a frame pointer,
 and track the CFA only through offsets from the stack pointer from
 one frame to the next.  In this case, the stack pointer is never
 stored, so it has no saved address in the context.  What we do
 have is the CFA from the previous stack frame.

 In very special situations (such as unwind info for signal return),
 there may be location expressions that use the stack pointer as well.

 Do this conditionally for one frame.  This allows the unwind info
 for one frame to save a copy of the stack pointer from the previous
 frame, and be able to use much easier CFA mechanisms to do it.
 Always zap the saved stack pointer value for the next frame; carrying
 the value over from one frame to another doesn't make sense.  */

  _Unwind_SpTmp tmp_sp;

  if (!_Unwind_GetGRPtr (orig_context, __builtin_dwarf_sp_column ()))
_Unwind_SetSpColumn (orig_context, context-cfa, tmp_sp);
  _Unwind_SetGRPtr (context, __builtin_dwarf_sp_column (), NULL);
#endif

was conditional on EH_RETURN_STACKADJ_RTX being defined.  It looked at
first glance like things would work if that call was moved outside,
so that we never inherit save slots for the stack pointer.  Does that
make sense?  I can give it a go tomorrow if so.

Thanks,
Richard


Re: var-tracking and s390's LM(G)

2014-02-05 Thread Ulrich Weigand
Richard Sandiford wrote:

 In other words, the reason seemed to be that the _Unwind_SetGRPtr in:
 
 #ifdef EH_RETURN_STACKADJ_RTX
   /* Special handling here: Many machines do not use a frame pointer,
  and track the CFA only through offsets from the stack pointer from
  one frame to the next.  In this case, the stack pointer is never
  stored, so it has no saved address in the context.  What we do
  have is the CFA from the previous stack frame.
 
  In very special situations (such as unwind info for signal return),
  there may be location expressions that use the stack pointer as well.
 
  Do this conditionally for one frame.  This allows the unwind info
  for one frame to save a copy of the stack pointer from the previous
  frame, and be able to use much easier CFA mechanisms to do it.
  Always zap the saved stack pointer value for the next frame; carrying
  the value over from one frame to another doesn't make sense.  */
 
   _Unwind_SpTmp tmp_sp;
 
   if (!_Unwind_GetGRPtr (orig_context, __builtin_dwarf_sp_column ()))
 _Unwind_SetSpColumn (orig_context, context-cfa, tmp_sp);
   _Unwind_SetGRPtr (context, __builtin_dwarf_sp_column (), NULL);
 #endif
 
 was conditional on EH_RETURN_STACKADJ_RTX being defined.  It looked at
 first glance like things would work if that call was moved outside,
 so that we never inherit save slots for the stack pointer.  Does that
 make sense?  I can give it a go tomorrow if so.

I haven't looked into this in detail right now, but I recall that I had
to specifically add that #ifdef a long time ago since unwinding wouldn't
work correctly on s390 otherwise:
http://gcc.gnu.org/ml/gcc-patches/2003-05/msg00904.html

And the background of the bug here:
http://gcc.gnu.org/ml/gcc/2003-05/msg00536.html

Actually, now I think the problem originally described there is still
valid: on s390 the CFA is *not* equal to the value at function entry,
but biased by 96/160 bytes.  So setting the SP to the CFA is wrong ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: var-tracking and s390's LM(G)

2014-02-05 Thread Ulrich Weigand
I wrote and forgot to proof-read:

 Actually, now I think the problem originally described there is still
 valid: on s390 the CFA is *not* equal to the value at function entry,

... CFA is not equal to the *SP* value at function entry ...

 but biased by 96/160 bytes.  So setting the SP to the CFA is wrong ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: var-tracking and s390's LM(G)

2014-02-05 Thread Jakub Jelinek
On Wed, Feb 05, 2014 at 10:26:16PM +0100, Ulrich Weigand wrote:
 Actually, now I think the problem originally described there is still
 valid: on s390 the CFA is *not* equal to the value at function entry,
 but biased by 96/160 bytes.  So setting the SP to the CFA is wrong ...

Such biasing happens on most of the targets, e.g. x86_64 has upon function
entry CFA set to %rsp + 8, i?86 to %esp + 4.

Jakub


Re: var-tracking and s390's LM(G)

2014-02-05 Thread Ulrich Weigand
Jakub Jelinek wrote:
 On Wed, Feb 05, 2014 at 10:26:16PM +0100, Ulrich Weigand wrote:
  Actually, now I think the problem originally described there is still
  valid: on s390 the CFA is *not* equal to the value at function entry,
  but biased by 96/160 bytes.  So setting the SP to the CFA is wrong ...
 
 Such biasing happens on most of the targets, e.g. x86_64 has upon function
 entry CFA set to %rsp + 8, i?86 to %esp + 4.

Ah, but that's a different bias.  In those cases it is still correct
to *unwind* SP to the CFA, since that's the value SP needs to have
back in the caller immediately after the call site.

On s390, the call/return instructions do not modify the SP so the
SP at function entry is equal to the SP in the caller after return,
but neither of these is equal to the CFA.  Instead, SP points to
96/160 bytes below the CFA ...   Therefore you cannot simply
unwind SP to the CFA.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: var-tracking and s390's LM(G)

2014-02-05 Thread Richard Henderson
On 02/05/2014 02:30 PM, Ulrich Weigand wrote:
 Jakub Jelinek wrote:
 On Wed, Feb 05, 2014 at 10:26:16PM +0100, Ulrich Weigand wrote:
 Actually, now I think the problem originally described there is still
 valid: on s390 the CFA is *not* equal to the value at function entry,
 but biased by 96/160 bytes.  So setting the SP to the CFA is wrong ...

 Such biasing happens on most of the targets, e.g. x86_64 has upon function
 entry CFA set to %rsp + 8, i?86 to %esp + 4.
 
 Ah, but that's a different bias.  In those cases it is still correct
 to *unwind* SP to the CFA, since that's the value SP needs to have
 back in the caller immediately after the call site.
 
 On s390, the call/return instructions do not modify the SP so the
 SP at function entry is equal to the SP in the caller after return,
 but neither of these is equal to the CFA.  Instead, SP points to
 96/160 bytes below the CFA ...   Therefore you cannot simply
 unwind SP to the CFA.

I was about to reply the same to Jakub, Ulrich beat me to it.

Basically, the CFA has been defined incorrectly for s390, but it hasn't
mattered since they record the save of the SP.  But the result is that you
can't stop recording the save of SP without also adjusting how the CFA is 
defined.

Which _can_ be done, in a way that's fully compatible with all of the existing
unwinding.  But certainly shouldn't be attempted at this stage of development.


r~



var-tracking and s390's LM(G)

2014-02-04 Thread Richard Sandiford
Sorry, looks like I missed the stage 3 deadline by a day, but:
we'd like to add support for shrink-wrapping and conditional returns
to s390(x) for 4.9.  Doing this showed up a problem with the way that
var-tracking handles the load-multiple instruction that s390 uses in
many function epilogues.

stack_adjust_offset_pre_post punts on any assignment to the stack
pointer that it doesn't understand and assumes that no adjustment
is made.  And the only kind of direct set that it understands is
(reasonably enough) one of the form (plus (sp) (const_int X)).
(There's a MINUS case too but it should never trigger.)

But if s390 needs to save and restore several registers, it will
use LM(G) to restore the stack pointer too, rather than generating
a separate addition.  stack_adjust_offset_pre_post doesn't understand
the load and so the stack_adjust after the epilogue is the same as
before it, rather than the correct post-epilogue value.

When shrink-wrapping is enabled, we have other edges to the exit block
that have the correct stack_adjust and so we fail the test:

  /* Check whether the adjustments on the edges are the same.  */
  if (VTI (dest)-in.stack_adjust != VTI (src)-out.stack_adjust)
{
  free (stack);
  return false;
}

No var-tracking is then performed on the function, leading to some new
guality.exp failures.

I wondered whether we could model the load of the stack pointer as an
addition in cases where that is safe (e.g. it would require !calls_eh_return
at least).  But that would only work in functions that don't need a frame
pointer.  In functions that do need a frame pointer we'd presumably have
(plus (hfp) (const_int X)) instead, which stack_adjust_offset_pre_post
would again not understand.

Another option would be to work out the offset from REG_CFA_DEF_CFA notes,
where present, but I'm a bit uncomfortable with the idea of mixing two
different methods of calculating the stack offset.

The simplest fix seems to be to disable this check for the exit block.
We never use its stack_adjust anyway, and dwarf2cfi already checks
(using CFA information) that the offsets in a shrink-wrapped function
are consistent.

Tested on s390-linux-gnu and s390x-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* var-tracking.c (vt_stack_adjustments): Don't require stack_adjusts
to match for the exit block.

Index: gcc/var-tracking.c
===
--- gcc/var-tracking.c  2014-02-03 12:16:49.009149939 +
+++ gcc/var-tracking.c  2014-02-04 09:58:59.924381497 +
@@ -886,8 +886,25 @@ vt_stack_adjustments (void)
}
   else
{
- /* Check whether the adjustments on the edges are the same.  */
- if (VTI (dest)-in.stack_adjust != VTI (src)-out.stack_adjust)
+ /* We can end up with different stack adjustments for the exit block
+of a shrink-wrapped function if stack_adjust_offset_pre_post
+doesn't understand the rtx pattern used to restore the stack
+pointer in the epilogue.  For example, on s390(x), the stack
+pointer is often restored via a load-multiple instruction
+and so no stack_adjust offset is recorded for it.  This means
+that the stack offset at the end of the epilogue block is the
+the same as the offset before the epilogue, whereas other paths
+to the exit block will have the correct stack_adjust.
+
+It is safe to ignore these differences because (a) we never
+use the stack_adjust for the exit block in this pass and
+(b) dwarf2cfi checks whether the CFA notes in a shrink-wrapped
+function are correct.
+
+We must check whether the adjustments on other edges are
+the same though.  */
+ if (dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
+  VTI (dest)-in.stack_adjust != VTI (src)-out.stack_adjust)
{
  free (stack);
  return false;



Re: var-tracking and s390's LM(G)

2014-02-04 Thread Jakub Jelinek
On Tue, Feb 04, 2014 at 10:47:49AM +, Richard Sandiford wrote:
 I wondered whether we could model the load of the stack pointer as an
 addition in cases where that is safe (e.g. it would require !calls_eh_return
 at least).  But that would only work in functions that don't need a frame
 pointer.  In functions that do need a frame pointer we'd presumably have
 (plus (hfp) (const_int X)) instead, which stack_adjust_offset_pre_post
 would again not understand.

But vt_stack_adjustments is only called if frame pointer isn't used.

 Another option would be to work out the offset from REG_CFA_DEF_CFA notes,
 where present, but I'm a bit uncomfortable with the idea of mixing two
 different methods of calculating the stack offset.

So, how does the lmg insn look like in RTL dump on some problematic
testcase?
insn_stack_adjust_offset_pre_post already uses REG_FRAME_RELATED_EXPR,
which is also a kind of CFI note (the oldest one), so likely the issue
is just that it hasn't been adjusted to handle other newer REG_CFA_* notes
that tell how the stack pointer is adjusted.

 The simplest fix seems to be to disable this check for the exit block.
 We never use its stack_adjust anyway, and dwarf2cfi already checks
 (using CFA information) that the offsets in a shrink-wrapped function
 are consistent.
 
 Tested on s390-linux-gnu and s390x-linux-gnu.  OK to install?

I don't like this, my strong preference is to handle REG_CFA_* notes.

Jakub


Re: var-tracking and s390's LM(G)

2014-02-04 Thread Richard Sandiford
Jakub Jelinek ja...@redhat.com writes:
 On Tue, Feb 04, 2014 at 10:47:49AM +, Richard Sandiford wrote:
 I wondered whether we could model the load of the stack pointer as an
 addition in cases where that is safe (e.g. it would require !calls_eh_return
 at least).  But that would only work in functions that don't need a frame
 pointer.  In functions that do need a frame pointer we'd presumably have
 (plus (hfp) (const_int X)) instead, which stack_adjust_offset_pre_post
 would again not understand.

 But vt_stack_adjustments is only called if frame pointer isn't used.

Sorry, missed that.

 Another option would be to work out the offset from REG_CFA_DEF_CFA notes,
 where present, but I'm a bit uncomfortable with the idea of mixing two
 different methods of calculating the stack offset.

 So, how does the lmg insn look like in RTL dump on some problematic
 testcase?
 insn_stack_adjust_offset_pre_post already uses REG_FRAME_RELATED_EXPR,
 which is also a kind of CFI note (the oldest one), so likely the issue
 is just that it hasn't been adjusted to handle other newer REG_CFA_* notes
 that tell how the stack pointer is adjusted.

It's just a (mem ...) access:

 (parallel
   [...
(set (reg %r14) (mem:[SD]I (plus (reg ...) (const_int X1
(set (reg %r15) (mem:[SD]I (plus (reg ...) (const_int X2])

 The simplest fix seems to be to disable this check for the exit block.
 We never use its stack_adjust anyway, and dwarf2cfi already checks
 (using CFA information) that the offsets in a shrink-wrapped function
 are consistent.
 
 Tested on s390-linux-gnu and s390x-linux-gnu.  OK to install?

 I don't like this, my strong preference is to handle REG_CFA_* notes.

But then we wouldn't be able to use var-tracking when __builtin_eh_return
is used, since in that case replacing the (set (reg ...) (mem ...))
with a (plus ...) would be incorrect -- the value we're loading from the
stack will have had a variable adjustment applied.  And I know from painful
experience that being able to debug the unwind code is very useful. :-)

Thanks,
Richard



Re: var-tracking and s390's LM(G)

2014-02-04 Thread Jakub Jelinek
On Tue, Feb 04, 2014 at 11:33:57AM +, Richard Sandiford wrote:
  So, how does the lmg insn look like in RTL dump on some problematic
  testcase?
  insn_stack_adjust_offset_pre_post already uses REG_FRAME_RELATED_EXPR,
  which is also a kind of CFI note (the oldest one), so likely the issue
  is just that it hasn't been adjusted to handle other newer REG_CFA_* notes
  that tell how the stack pointer is adjusted.
 
 It's just a (mem ...) access:
 
  (parallel
[...
 (set (reg %r14) (mem:[SD]I (plus (reg ...) (const_int X1
 (set (reg %r15) (mem:[SD]I (plus (reg ...) (const_int X2])

I meant what reg notes it has (and why it doesn't use
REG_FRAME_RELATED_EXPR).

  The simplest fix seems to be to disable this check for the exit block.
  We never use its stack_adjust anyway, and dwarf2cfi already checks
  (using CFA information) that the offsets in a shrink-wrapped function
  are consistent.
  
  Tested on s390-linux-gnu and s390x-linux-gnu.  OK to install?
 
  I don't like this, my strong preference is to handle REG_CFA_* notes.
 
 But then we wouldn't be able to use var-tracking when __builtin_eh_return
 is used, since in that case replacing the (set (reg ...) (mem ...))
 with a (plus ...) would be incorrect -- the value we're loading from the
 stack will have had a variable adjustment applied.  And I know from painful
 experience that being able to debug the unwind code is very useful. :-)

Aren't functions using EH_RETURN typically using frame pointer?
And, var-tracking disabling doesn't really mean no debug info, just worse
debug info.  IMHO the sanity check in var-tracking is worth much more than
var-tracking in unwind-dw2.o in the case where you wouldn't use frame
pointer.  Why doesn't dwarf2cfi ICE on it then when the CFA changes can't be
described properly?

Jakub


Re: var-tracking and s390's LM(G)

2014-02-04 Thread Richard Sandiford
Jakub Jelinek ja...@redhat.com writes:
 On Tue, Feb 04, 2014 at 11:33:57AM +, Richard Sandiford wrote:
  So, how does the lmg insn look like in RTL dump on some problematic
  testcase?
  insn_stack_adjust_offset_pre_post already uses REG_FRAME_RELATED_EXPR,
  which is also a kind of CFI note (the oldest one), so likely the issue
  is just that it hasn't been adjusted to handle other newer REG_CFA_* notes
  that tell how the stack pointer is adjusted.
 
 It's just a (mem ...) access:
 
  (parallel
[...
 (set (reg %r14) (mem:[SD]I (plus (reg ...) (const_int X1
 (set (reg %r15) (mem:[SD]I (plus (reg ...) (const_int X2])

 I meant what reg notes it has (and why it doesn't use
 REG_FRAME_RELATED_EXPR).

It has a REG_CFA_RESTORE for each loaded register.  It also has
a REG_CFA_DEF_CFA with (plus (sp) (const_int ...)) in lieu of a
REG_FRAME_RELATED_EXPR of the form (set (sp) (plus (...) (const_int ...))),
which like I say wouldn't always be correct.  And I think that's a valid
use of REG_CFA_DEF_CFA:

/* Attached to insns that are RTX_FRAME_RELATED_P, but are too complex
   for FRAME_RELATED_EXPR intuition.  ... */
REG_NOTE (CFA_DEF_CFA)

(although the assignment to %r15 isn't first in the parallel, despite
the snipped part of the comment implying that it should be.  I don't
understand why we'd require that though.)

FWIW this form comes from:

2009-06-04  Jakub Jelinek  ja...@redhat.com

   * config/s390/s390.c (global_not_special_regno_p): New static inline.
   (save_gprs): Don't tell unwinder when a global register is saved.
   (s390_emit_epilogue): Emit needed epilogue unwind info.

  The simplest fix seems to be to disable this check for the exit block.
  We never use its stack_adjust anyway, and dwarf2cfi already checks
  (using CFA information) that the offsets in a shrink-wrapped function
  are consistent.
  
  Tested on s390-linux-gnu and s390x-linux-gnu.  OK to install?
 
  I don't like this, my strong preference is to handle REG_CFA_* notes.
 
 But then we wouldn't be able to use var-tracking when __builtin_eh_return
 is used, since in that case replacing the (set (reg ...) (mem ...))
 with a (plus ...) would be incorrect -- the value we're loading from the
 stack will have had a variable adjustment applied.  And I know from painful
 experience that being able to debug the unwind code is very useful. :-)

 Aren't functions using EH_RETURN typically using frame pointer?
 And, var-tracking disabling doesn't really mean no debug info, just worse
 debug info.  IMHO the sanity check in var-tracking is worth much more than
 var-tracking in unwind-dw2.o in the case where you wouldn't use frame
 pointer.  Why doesn't dwarf2cfi ICE on it then when the CFA changes can't be
 described properly?

Because the CFA information (as provided by REG_CFA... notes) is correct.
It's just that var-tracking doesn't use those notes.  FWIW, using them was
one of the options I mentioned in the original mail.

I don't see why you think relaxing this sanity check for the exit block
is so bad though.  If your argument is that var-tracking must understand
every assignment to the stack pointer then I think it should bail out
whenever it sees an assignment to sp that it doesn't understand,
rather than silently ignoring it.

Thanks,
Richard


Re: var-tracking and s390's LM(G)

2014-02-04 Thread Richard Sandiford
Jakub Jelinek ja...@redhat.com writes:
 But then we wouldn't be able to use var-tracking when __builtin_eh_return
 is used, since in that case replacing the (set (reg ...) (mem ...))
 with a (plus ...) would be incorrect -- the value we're loading from the
 stack will have had a variable adjustment applied.  And I know from painful
 experience that being able to debug the unwind code is very useful. :-)

 Aren't functions using EH_RETURN typically using frame pointer?

Sorry, forgot to respond to this bit.  On s390, _Unwind_RaiseException and
_Unwind_ForcedUnwind don't use a frame pointer, at least not on trunk.
%r11 is used as a general scratch register instead.  E.g.:

2ba8 _Unwind_ForcedUnwind:
2ba8:   90 6f f0 18 stm %r6,%r15,24(%r15)
2bac:   0d d0   basr%r13,%r0
2bae:   60 40 f0 50 std %f4,80(%r15)
2bb2:   60 60 f0 58 std %f6,88(%r15)
2bb6:   a7 fa fd f8 ahi %r15,-520
2bba:   58 c0 d0 9e l   %r12,158(%r13)
2bbe:   58 10 d0 9a l   %r1,154(%r13)
2bc2:   18 b2   lr  %r11,%r2
...
2c10:   98 6f f2 20 lm  %r6,%r15,544(%r15)
2c14:   07 f4   br  %r4

Thanks,
Richard


Re: var-tracking and s390's LM(G)

2014-02-04 Thread Jakub Jelinek
On Tue, Feb 04, 2014 at 12:21:14PM +, Richard Sandiford wrote:
 Jakub Jelinek ja...@redhat.com writes:
  But then we wouldn't be able to use var-tracking when __builtin_eh_return
  is used, since in that case replacing the (set (reg ...) (mem ...))
  with a (plus ...) would be incorrect -- the value we're loading from the
  stack will have had a variable adjustment applied.  And I know from painful
  experience that being able to debug the unwind code is very useful. :-)
 
  Aren't functions using EH_RETURN typically using frame pointer?
 
 Sorry, forgot to respond to this bit.  On s390, _Unwind_RaiseException and
 _Unwind_ForcedUnwind don't use a frame pointer, at least not on trunk.
 %r11 is used as a general scratch register instead.  E.g.:
 
 2ba8 _Unwind_ForcedUnwind:
 2ba8:   90 6f f0 18 stm %r6,%r15,24(%r15)
 2bac:   0d d0   basr%r13,%r0
 2bae:   60 40 f0 50 std %f4,80(%r15)
 2bb2:   60 60 f0 58 std %f6,88(%r15)
 2bb6:   a7 fa fd f8 ahi %r15,-520
 2bba:   58 c0 d0 9e l   %r12,158(%r13)
 2bbe:   58 10 d0 9a l   %r1,154(%r13)
 2bc2:   18 b2   lr  %r11,%r2
 ...
 2c10:   98 6f f2 20 lm  %r6,%r15,544(%r15)
 2c14:   07 f4   br  %r4

Looking at pr54693-2.c -O2 -g -m64 -march=z196 (was that the one you meant)
with your patches, I see:
(insn/f:TI 122 30 31 4 (parallel [
(set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
(const_int 80 [0x50])) [3  S8 A8])
(reg:DI 10 %r10))
(set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
(const_int 88 [0x58])) [3  S8 A8])
(reg:DI 11 %r11))
(set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
(const_int 96 [0x60])) [3  S8 A8])
(reg:DI 12 %r12))
(set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
(const_int 104 [0x68])) [3  S8 A8])
(reg:DI 13 %r13))
(set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
(const_int 112 [0x70])) [3  S8 A8])
(reg:DI 14 %r14))
(set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
(const_int 120 [0x78])) [3  S8 A8])
(reg/f:DI 15 %r15))
]) pr54693-2.c:16 94 {*store_multiple_di}
 (expr_list:REG_DEAD (reg:DI 14 %r14)
(expr_list:REG_DEAD (reg:DI 13 %r13)
(expr_list:REG_DEAD (reg:DI 12 %r12)
(expr_list:REG_DEAD (reg:DI 11 %r11)
(expr_list:REG_DEAD (reg:DI 10 %r10)
(nil)))
...
(insn/f 123 31 124 4 (set (reg/f:DI 15 %r15)
(plus:DI (reg/f:DI 15 %r15)
(const_int -160 [0xff60]))) pr54693-2.c:16 65 {*la_64}
 (expr_list:REG_FRAME_RELATED_EXPR (set (reg/f:DI 15 %r15)
(plus:DI (reg/f:DI 15 %r15)
(const_int -160 [0xff60])))
(nil)))
...
(insn/f:TI 135 134 136 7 (parallel [
(set (reg:DI 10 %r10)
(mem:DI (plus:DI (reg/f:DI 15 %r15)
(const_int 240 [0xf0])) [3  S8 A8]))
(set (reg:DI 11 %r11)
(mem:DI (plus:DI (reg/f:DI 15 %r15)
(const_int 248 [0xf8])) [3  S8 A8]))
(set (reg:DI 12 %r12)
(mem:DI (plus:DI (reg/f:DI 15 %r15)
(const_int 256 [0x100])) [3  S8 A8]))
(set (reg:DI 13 %r13)
(mem:DI (plus:DI (reg/f:DI 15 %r15)
(const_int 264 [0x108])) [3  S8 A8]))
(set (reg:DI 14 %r14)
(mem:DI (plus:DI (reg/f:DI 15 %r15)
(const_int 272 [0x110])) [3  S8 A8]))
(set (reg/f:DI 15 %r15)
(mem:DI (plus:DI (reg/f:DI 15 %r15)
(const_int 280 [0x118])) [3  S8 A8]))
]) pr54693-2.c:25 92 {*load_multiple_di}
 (expr_list:REG_CFA_DEF_CFA (plus:DI (reg/f:DI 15 %r15)
(const_int 160 [0xa0]))
(expr_list:REG_CFA_RESTORE (reg/f:DI 15 %r15)
(expr_list:REG_CFA_RESTORE (reg:DI 14 %r14)
(expr_list:REG_CFA_RESTORE (reg:DI 13 %r13)
(expr_list:REG_CFA_RESTORE (reg:DI 12 %r12)
(expr_list:REG_CFA_RESTORE (reg:DI 11 %r11)
(expr_list:REG_CFA_RESTORE (reg:DI 10 %r10)
(nil)
In a function that doesn't need frame pointer, I'd say this is a serious
bloat of the unwind info, you are saving/restoring %r15 not because you have
to, but just that it seems to be cheaper for the target.  So, I'd say you
shouldn't emit DW_CFA_offset 15, 5 on the insn 122 in the prologue and
DW_CFA_restore 15 in the epilogue (twice in this case), that just waste
of 

Re: var-tracking and s390's LM(G)

2014-02-04 Thread Richard Sandiford
Jakub Jelinek ja...@redhat.com writes:
 On Tue, Feb 04, 2014 at 12:21:14PM +, Richard Sandiford wrote:
 Jakub Jelinek ja...@redhat.com writes:
  But then we wouldn't be able to use var-tracking when __builtin_eh_return
  is used, since in that case replacing the (set (reg ...) (mem ...))
  with a (plus ...) would be incorrect -- the value we're loading from the
  stack will have had a variable adjustment applied.  And I know from 
  painful
  experience that being able to debug the unwind code is very useful. :-)
 
  Aren't functions using EH_RETURN typically using frame pointer?
 
 Sorry, forgot to respond to this bit.  On s390, _Unwind_RaiseException and
 _Unwind_ForcedUnwind don't use a frame pointer, at least not on trunk.
 %r11 is used as a general scratch register instead.  E.g.:
 
 2ba8 _Unwind_ForcedUnwind:
 2ba8:   90 6f f0 18 stm %r6,%r15,24(%r15)
 2bac:   0d d0   basr%r13,%r0
 2bae:   60 40 f0 50 std %f4,80(%r15)
 2bb2:   60 60 f0 58 std %f6,88(%r15)
 2bb6:   a7 fa fd f8 ahi %r15,-520
 2bba:   58 c0 d0 9e l   %r12,158(%r13)
 2bbe:   58 10 d0 9a l   %r1,154(%r13)
 2bc2:   18 b2   lr  %r11,%r2
 ...
 2c10:   98 6f f2 20 lm  %r6,%r15,544(%r15)
 2c14:   07 f4   br  %r4

 Looking at pr54693-2.c -O2 -g -m64 -march=z196 (was that the one you meant)
 with your patches, I see:
 (insn/f:TI 122 30 31 4 (parallel [
 (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 80 [0x50])) [3  S8 A8])
 (reg:DI 10 %r10))
 (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 88 [0x58])) [3  S8 A8])
 (reg:DI 11 %r11))
 (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 96 [0x60])) [3  S8 A8])
 (reg:DI 12 %r12))
 (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 104 [0x68])) [3  S8 A8])
 (reg:DI 13 %r13))
 (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 112 [0x70])) [3  S8 A8])
 (reg:DI 14 %r14))
 (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 120 [0x78])) [3  S8 A8])
 (reg/f:DI 15 %r15))
 ]) pr54693-2.c:16 94 {*store_multiple_di}
  (expr_list:REG_DEAD (reg:DI 14 %r14)
 (expr_list:REG_DEAD (reg:DI 13 %r13)
 (expr_list:REG_DEAD (reg:DI 12 %r12)
 (expr_list:REG_DEAD (reg:DI 11 %r11)
 (expr_list:REG_DEAD (reg:DI 10 %r10)
 (nil)))
 ...
 (insn/f 123 31 124 4 (set (reg/f:DI 15 %r15)
 (plus:DI (reg/f:DI 15 %r15)
 (const_int -160 [0xff60]))) pr54693-2.c:16 65 {*la_64}
  (expr_list:REG_FRAME_RELATED_EXPR (set (reg/f:DI 15 %r15)
 (plus:DI (reg/f:DI 15 %r15)
 (const_int -160 [0xff60])))
 (nil)))
 ...
 (insn/f:TI 135 134 136 7 (parallel [
 (set (reg:DI 10 %r10)
 (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 240 [0xf0])) [3  S8 A8]))
 (set (reg:DI 11 %r11)
 (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 248 [0xf8])) [3  S8 A8]))
 (set (reg:DI 12 %r12)
 (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 256 [0x100])) [3  S8 A8]))
 (set (reg:DI 13 %r13)
 (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 264 [0x108])) [3  S8 A8]))
 (set (reg:DI 14 %r14)
 (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 272 [0x110])) [3  S8 A8]))
 (set (reg/f:DI 15 %r15)
 (mem:DI (plus:DI (reg/f:DI 15 %r15)
 (const_int 280 [0x118])) [3  S8 A8]))
 ]) pr54693-2.c:25 92 {*load_multiple_di}
  (expr_list:REG_CFA_DEF_CFA (plus:DI (reg/f:DI 15 %r15)
 (const_int 160 [0xa0]))
 (expr_list:REG_CFA_RESTORE (reg/f:DI 15 %r15)
 (expr_list:REG_CFA_RESTORE (reg:DI 14 %r14)
 (expr_list:REG_CFA_RESTORE (reg:DI 13 %r13)
 (expr_list:REG_CFA_RESTORE (reg:DI 12 %r12)
 (expr_list:REG_CFA_RESTORE (reg:DI 11 %r11)
 (expr_list:REG_CFA_RESTORE (reg:DI 10 %r10)
 (nil)
 In a function that doesn't need frame pointer, I'd say this is a serious
 bloat of the unwind info, you are saving/restoring %r15 not because you have
 to, but just that it seems to be cheaper for the target.  So, I'd say you
 shouldn't emit DW_CFA_offset 15, 5 on the insn 122 

Re: var-tracking and s390's LM(G)

2014-02-04 Thread Jakub Jelinek
On Tue, Feb 04, 2014 at 03:38:51PM +, Richard Sandiford wrote:
 What do you want var-tracking to do about the __builtin_eh_return case
 though?  It isn't correct to say that the instruction adds the frame size
 to the stack pointer then, since the loaded value includes the eh_return
 adjustment.  Pretending it is a constant addition in order to satisfy the
 sanity check seems a bit hackish, but there again it'd be a shame to lose
 tracking for the whole function just because of that.

For __builtin_eh_return the unwind info starting with restoring the stack
pointer to the landing pad's stack pointer until you actually return/jump to the
landing pad is wrong in either case, not just on s390, but on various other
targets.  Saying REG_CFA_ADJUST_CFA by the frame size will result in CFA
from that point being stack_pointer + 160, which is exactly what you are
doing anyway, the DW_CFA_offset 15, ... and DW_CFA_restore aren't helpful to
that at all and don't change anything on it.

Unless you have the old value of the stack pointer saved in some hard
register, providing correct unwind info there is not possible.

Jakub


Re: var-tracking and s390's LM(G)

2014-02-04 Thread Richard Sandiford
Jakub Jelinek ja...@redhat.com writes:
 On Tue, Feb 04, 2014 at 03:38:51PM +, Richard Sandiford wrote:
 What do you want var-tracking to do about the __builtin_eh_return case
 though?  It isn't correct to say that the instruction adds the frame size
 to the stack pointer then, since the loaded value includes the eh_return
 adjustment.  Pretending it is a constant addition in order to satisfy the
 sanity check seems a bit hackish, but there again it'd be a shame to lose
 tracking for the whole function just because of that.

 For __builtin_eh_return the unwind info starting with restoring the stack
 pointer to the landing pad's stack pointer until you actually return/jump to 
 the
 landing pad is wrong in either case, not just on s390, but on various other
 targets.  Saying REG_CFA_ADJUST_CFA by the frame size will result in CFA
 from that point being stack_pointer + 160, which is exactly what you are
 doing anyway, the DW_CFA_offset 15, ... and DW_CFA_restore aren't helpful to
 that at all and don't change anything on it.

 Unless you have the old value of the stack pointer saved in some hard
 register, providing correct unwind info there is not possible.

Right, that was my point.  So by relying on unwind info in var-tracking.c
we're getting the wrong answer for the stack offset after the LM(G)
instruction.  It sounds like we're going to pretend it's right anyway
for expediency.  But in that case why not also allow the exit block
offsets to vary on the same basis?  It's not like the current test is
an assert -- it's just silently refusing to do any var-tracking on the
function if the epilogue isn't fully understood.

If these offsets must match on exit for correctness then we should
assert rather than silently bailing out.  But if they're allowed to vary
then we should do our best to carry on.  And since the exit block offset
is never used, it seems expedient to allow varying offsets in that case.

Thanks,
Richard



Re: var-tracking and s390's LM(G)

2014-02-04 Thread Jakub Jelinek
On Tue, Feb 04, 2014 at 04:59:54PM +, Richard Sandiford wrote:
 Right, that was my point.  So by relying on unwind info in var-tracking.c
 we're getting the wrong answer for the stack offset after the LM(G)
 instruction.  It sounds like we're going to pretend it's right anyway
 for expediency.  But in that case why not also allow the exit block
 offsets to vary on the same basis?  It's not like the current test is
 an assert -- it's just silently refusing to do any var-tracking on the
 function if the epilogue isn't fully understood.
 
 If these offsets must match on exit for correctness then we should
 assert rather than silently bailing out.  But if they're allowed to vary
 then we should do our best to carry on.  And since the exit block offset
 is never used, it seems expedient to allow varying offsets in that case.

Ok then, but please do follow-up on the changes to not save/restore
r15 in the unwind info, use REG_CFA_ADJUST_CFA where possible and teach
var-tracking about that note.

Jakub


Re: var-tracking and s390's LM(G)

2014-02-04 Thread Richard Sandiford
Jakub Jelinek ja...@redhat.com writes:
 On Tue, Feb 04, 2014 at 04:59:54PM +, Richard Sandiford wrote:
 Right, that was my point.  So by relying on unwind info in var-tracking.c
 we're getting the wrong answer for the stack offset after the LM(G)
 instruction.  It sounds like we're going to pretend it's right anyway
 for expediency.  But in that case why not also allow the exit block
 offsets to vary on the same basis?  It's not like the current test is
 an assert -- it's just silently refusing to do any var-tracking on the
 function if the epilogue isn't fully understood.
 
 If these offsets must match on exit for correctness then we should
 assert rather than silently bailing out.  But if they're allowed to vary
 then we should do our best to carry on.  And since the exit block offset
 is never used, it seems expedient to allow varying offsets in that case.

 Ok then, but please do follow-up on the changes to not save/restore
 r15 in the unwind info, use REG_CFA_ADJUST_CFA where possible and teach
 var-tracking about that note.

Thanks, will do.  I know Uli was worried about the CFI size too.

richard