Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving

2017-08-16 Thread Segher Boessenkool
On Thu, Aug 17, 2017 at 11:25:27AM +0930, Alan Modra wrote:
> On Wed, Aug 16, 2017 at 06:23:13PM -0500, Segher Boessenkool wrote:
> > Maybe whatever is creating those instructions should set RTX_FRAME_RELATED_P
> > by itself?  Not sure if that is nicer.
> > 
> > Both this CR2 and R0 handling are pretty nasty hacks.  Could you add a
> > comment saying that?
> 
> I wouldn't say the R0 handling is a nasty hack at all.  You can't save
> LR directly, storing to the stack must go via a gpr.  I'm 100% sure
> you know that, and so would anyone else working on powerpc gcc
> support.  It so happens that we use r0 in every case we hit this 
> code.  *That* fact is commented.  I don't really know what else you
> want.

But R0 isn't necessarily used for LR here.  That is true currently in
all cases I can see, but it can be used for other things.  The ABI
doesn't force things here.

The separate shrink-wrapping code does use R0 for other things, but it
manually sets the CFI notes anyway, so no problem there.  Maybe I should
just stop worrying.

> (Incidentally, the dwarf2cfi.c behaviour is described by
> 
>   In addition, if a register has previously been saved to a different
>   register,
> 
> Yup, great comment that one!  Dates back to 2004, commit 60ea93bb72.)

Perhaps inspired by the REG_CFA_REGISTER name for the RTL note ;-)

> The CR2 thing is a long-standing convention for frame info about CR, a
> wart fixed by ELFv2.  See elsewhere

That ELFv2 fix is why I still haven't submitted the separate shrink-wrapping
for CR fields code -- it complicates things a lot :-/

> I'l go with:
> 
>   /* If we see CR2 then we are here on a Darwin world save.  Saves of
>  CR2 signify the whole CR is being saved.  This is a long-standing
>  ABI wart fixed by ELFv2.  As for r0/lr there is no need to check
>  that CR needs to be saved.  */

That is fine, thanks again.


Segher


Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving

2017-08-16 Thread Alan Modra
On Wed, Aug 16, 2017 at 06:23:13PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Aug 16, 2017 at 08:05:04AM +0930, Alan Modra wrote:
> > Repost with requested changes.  I've extracted the logic that omits
> > frame saves from rs6000_frame_related to a new function, because it's
> > easier to document that way.  The logic has been simplified a little
> > too: fixed_reg_p doesn't need to be called there.
> > 
> > PR target/80938
> > * config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09.
> > Don't use store multiple if only one reg needs saving.
> > (interesting_frame_related_regno): New function.
> > (rs6000_frame_related): Don't emit eh_frame for regs that
> > don't need saving.
> > (rs6000_emit_epilogue): Likewise.
> 
> It's not just eh_frame, it's all call frame information.

Sure, I meant to fix that.

> > +/* This function is called when rs6000_frame_related is processing
> > +   SETs within a PARALLEL, and returns whether the REGNO save ought to
> > +   be marked RTX_FRAME_RELATED_P.  The PARALLELs involved are those
> > +   for out-of-line register save functions, store multiple, and the
> > +   Darwin world_save.  They may contain registers that don't really
> > +   need saving.  */
> > +
> > +static bool
> > +interesting_frame_related_regno (unsigned int regno)
> > +{
> > +  /* Saves apparently of r0 are actually saving LR.  */
> > +  if (regno == 0)
> > +return true;
> > +  /* If we see CR2 then we are here on a Darwin world save.  Saves of
> > + CR2 signify the whole CR is being saved.  */
> > +  if (regno == CR2_REGNO)
> > +return true;
> > +  /* Omit frame info for any user-defined global regs.  If frame info
> > + is supplied for them, frame unwinding will restore a user reg.
> > + Also omit frame info for any reg we don't need to save, as that
> > + bloats eh_frame and can cause problems with shrink wrapping.
> > + Since global regs won't be seen as needing to be saved, both of
> > + these conditions are covered by save_reg_p.  */
> > +  return save_reg_p (regno);
> > +}
> 
> The function name isn't so great, doesn't say what it does at all.  Not
> that I can think of anything better.
> 
> Maybe whatever is creating those instructions should set RTX_FRAME_RELATED_P
> by itself?  Not sure if that is nicer.
> 
> Both this CR2 and R0 handling are pretty nasty hacks.  Could you add a
> comment saying that?

I wouldn't say the R0 handling is a nasty hack at all.  You can't save
LR directly, storing to the stack must go via a gpr.  I'm 100% sure
you know that, and so would anyone else working on powerpc gcc
support.  It so happens that we use r0 in every case we hit this 
code.  *That* fact is commented.  I don't really know what else you
want.  Hmm, maybe I'm just too close to this code.  I'll go with
expounding some of the things known, as follows. 

  /* Saves apparently of r0 are actually saving LR.  It doesn't make
 sense to substitute the regno here to test save_reg_p (LR_REGNO).
 We *know* LR needs saving, and dwarf2cfi.c is able to deduce that
 (set (mem) (r0)) is saving LR from a prior (set (r0) (lr)) marked
 as frame related.  */

(Incidentally, the dwarf2cfi.c behaviour is described by

  In addition, if a register has previously been saved to a different
  register,

Yup, great comment that one!  Dates back to 2004, commit 60ea93bb72.)

The CR2 thing is a long-standing convention for frame info about CR, a
wart fixed by ELFv2.  See elsewhere

  /* CR register traditionally saved as CR2.  */
and
  /* In other ABIs, by convention, we use a single CR regnum to
 represent the fact that all call-saved CR fields are saved.
 We use CR2_REGNO to be compatible with gcc-2.95 on Linux.  */

I'l go with:

  /* If we see CR2 then we are here on a Darwin world save.  Saves of
 CR2 signify the whole CR is being saved.  This is a long-standing
 ABI wart fixed by ELFv2.  As for r0/lr there is no need to check
 that CR needs to be saved.  */

> Okay for trunk with those improvements (eh_frame and hack comment).
> Thanks!
> 
> 
> Segher

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving

2017-08-16 Thread Segher Boessenkool
Hi!

On Wed, Aug 16, 2017 at 08:05:04AM +0930, Alan Modra wrote:
> Repost with requested changes.  I've extracted the logic that omits
> frame saves from rs6000_frame_related to a new function, because it's
> easier to document that way.  The logic has been simplified a little
> too: fixed_reg_p doesn't need to be called there.
> 
>   PR target/80938
>   * config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09.
>   Don't use store multiple if only one reg needs saving.
>   (interesting_frame_related_regno): New function.
>   (rs6000_frame_related): Don't emit eh_frame for regs that
>   don't need saving.
>   (rs6000_emit_epilogue): Likewise.

It's not just eh_frame, it's all call frame information.

> +/* This function is called when rs6000_frame_related is processing
> +   SETs within a PARALLEL, and returns whether the REGNO save ought to
> +   be marked RTX_FRAME_RELATED_P.  The PARALLELs involved are those
> +   for out-of-line register save functions, store multiple, and the
> +   Darwin world_save.  They may contain registers that don't really
> +   need saving.  */
> +
> +static bool
> +interesting_frame_related_regno (unsigned int regno)
> +{
> +  /* Saves apparently of r0 are actually saving LR.  */
> +  if (regno == 0)
> +return true;
> +  /* If we see CR2 then we are here on a Darwin world save.  Saves of
> + CR2 signify the whole CR is being saved.  */
> +  if (regno == CR2_REGNO)
> +return true;
> +  /* Omit frame info for any user-defined global regs.  If frame info
> + is supplied for them, frame unwinding will restore a user reg.
> + Also omit frame info for any reg we don't need to save, as that
> + bloats eh_frame and can cause problems with shrink wrapping.
> + Since global regs won't be seen as needing to be saved, both of
> + these conditions are covered by save_reg_p.  */
> +  return save_reg_p (regno);
> +}

The function name isn't so great, doesn't say what it does at all.  Not
that I can think of anything better.

Maybe whatever is creating those instructions should set RTX_FRAME_RELATED_P
by itself?  Not sure if that is nicer.

Both this CR2 and R0 handling are pretty nasty hacks.  Could you add a
comment saying that?

Okay for trunk with those improvements (eh_frame and hack comment).
Thanks!


Segher


Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving

2017-08-15 Thread Alan Modra
Repost with requested changes.  I've extracted the logic that omits
frame saves from rs6000_frame_related to a new function, because it's
easier to document that way.  The logic has been simplified a little
too: fixed_reg_p doesn't need to be called there.

PR target/80938
* config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09.
Don't use store multiple if only one reg needs saving.
(interesting_frame_related_regno): New function.
(rs6000_frame_related): Don't emit eh_frame for regs that
don't need saving.
(rs6000_emit_epilogue): Likewise.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index f9aa13b..178632e 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -24445,20 +24445,36 @@ rs6000_savres_strategy (rs6000_stack_t *info,
   && flag_shrink_wrap_separate
   && optimize_function_for_speed_p (cfun)))
 {
-  /* Prefer store multiple for saves over out-of-line routines,
-since the store-multiple instruction will always be smaller.  */
-  strategy |= SAVE_INLINE_GPRS | SAVE_MULTIPLE;
-
-  /* The situation is more complicated with load multiple.  We'd
-prefer to use the out-of-line routines for restores, since the
-"exit" out-of-line routines can handle the restore of LR and the
-frame teardown.  However if doesn't make sense to use the
-out-of-line routine if that is the only reason we'd need to save
-LR, and we can't use the "exit" out-of-line gpr restore if we
-have saved some fprs; In those cases it is advantageous to use
-load multiple when available.  */
-  if (info->first_fp_reg_save != 64 || !lr_save_p)
-   strategy |= REST_INLINE_GPRS | REST_MULTIPLE;
+  int count = 0;
+  for (int i = info->first_gp_reg_save; i < 32; i++)
+   if (save_reg_p (i))
+ count++;
+
+  if (count <= 1)
+   /* Don't use store multiple if only one reg needs to be
+  saved.  This can occur for example when the ABI_V4 pic reg
+  (r30) needs to be saved to make calls, but r31 is not
+  used.  */
+   strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS;
+  else
+   {
+ /* Prefer store multiple for saves over out-of-line
+routines, since the store-multiple instruction will
+always be smaller.  */
+ strategy |= SAVE_INLINE_GPRS | SAVE_MULTIPLE;
+
+ /* The situation is more complicated with load multiple.
+We'd prefer to use the out-of-line routines for restores,
+since the "exit" out-of-line routines can handle the
+restore of LR and the frame teardown.  However if doesn't
+make sense to use the out-of-line routine if that is the
+only reason we'd need to save LR, and we can't use the
+"exit" out-of-line gpr restore if we have saved some
+fprs; In those cases it is advantageous to use load
+multiple when available.  */
+ if (info->first_fp_reg_save != 64 || !lr_save_p)
+   strategy |= REST_INLINE_GPRS | REST_MULTIPLE;
+   }
 }
 
   /* Using the "exit" out-of-line routine does not improve code size
@@ -24467,21 +24483,6 @@ rs6000_savres_strategy (rs6000_stack_t *info,
   else if (!lr_save_p && info->first_gp_reg_save > 29)
 strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS;
 
-  /* We can only use save multiple if we need to save all the registers from
- first_gp_reg_save.  Otherwise, the CFI gets messed up (we save some
- register we do not restore).  */
-  if (strategy & SAVE_MULTIPLE)
-{
-  int i;
-
-  for (i = info->first_gp_reg_save; i < 32; i++)
-   if (fixed_reg_p (i) || !save_reg_p (i))
- {
-   strategy &= ~SAVE_MULTIPLE;
-   break;
- }
-}
-
   /* Don't ever restore fixed regs.  */
   if ((strategy & (REST_INLINE_GPRS | REST_MULTIPLE)) != REST_INLINE_GPRS)
 for (int i = info->first_gp_reg_save; i < 32; i++)
@@ -25654,6 +25655,32 @@ output_probe_stack_range (rtx reg1, rtx reg2)
   return "";
 }
 
+/* This function is called when rs6000_frame_related is processing
+   SETs within a PARALLEL, and returns whether the REGNO save ought to
+   be marked RTX_FRAME_RELATED_P.  The PARALLELs involved are those
+   for out-of-line register save functions, store multiple, and the
+   Darwin world_save.  They may contain registers that don't really
+   need saving.  */
+
+static bool
+interesting_frame_related_regno (unsigned int regno)
+{
+  /* Saves apparently of r0 are actually saving LR.  */
+  if (regno == 0)
+return true;
+  /* If we see CR2 then we are here on a Darwin world save.  Saves of
+ CR2 signify the whole CR is being saved.  */
+  if (regno == CR2_REGNO)
+return true;
+  /* Omit frame info for any user-defined global regs.  If frame info
+ is supplied for them, frame unwinding will restore a user reg.
+  

Re: [RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving

2017-08-15 Thread Segher Boessenkool
Hi!

On Fri, Aug 11, 2017 at 12:40:11PM +0930, Alan Modra wrote:
> It is possible when using out-of-line register saves or store multiple
> to save some registers unnecessarily, for example one reg in the block
> saved might be unused.  We don't need to emit eh_frame info for those
> registers as that just bloats the eh_frame info, and also can result
> in an ICE when shrink-wrap gives multiple paths through the function
> saving different sets of registers.  All the join points need to have
> identical eh_frame register save state.

It isn't just eh_frame, we might have never noticed if it was.  It is
also the DWARF call frame info used for debugging.

>   PR target/80938
>   * config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09.
>   Don't use store multiple if only one reg needs saving.
>   (rs6000_frame_related): Don't emit eh_frame for regs that
>   don't need saving.
>   (rs6000_emit_epilogue): Likewise.

> +  int count;
> +
> +  for (count = 0, i = info->first_gp_reg_save; i < 32; i++)
> + if (save_reg_p (i))
> +   count += 1;

  int count = 0;
  for (i = info->first_gp_reg_save; i < 32; i++)
if (save_reg_p (i))
  count++;

> @@ -25681,9 +25683,15 @@ rs6000_frame_related (rtx_insn *insn, rtx reg, 
> HOST_WIDE_INT val,
>register save functions, or store multiple, then omit
>eh_frame info for any user-defined global regs.  If
>eh_frame info is supplied, frame unwinding will
> -  restore a user reg.  */
> +  restore a user reg.  Also omit eh_frame info for any
> +  reg we don't need to save, as that bloats eh_frame
> +  and can cause problems with shrink wrapping.  Saves
> +  of r0 are actually saving LR, so don't omit those.   */
> if (!REG_P (SET_SRC (set))
> -   || !fixed_reg_p (REGNO (SET_SRC (set
> +   || REGNO (SET_SRC (set)) == 0
> +   || REGNO (SET_SRC (set)) == CR2_REGNO
> +   || (!fixed_reg_p (REGNO (SET_SRC (set)))
> +   && save_reg_p (REGNO (SET_SRC (set)
>   RTX_FRAME_RELATED_P (set) = 1;
>   }
>RTX_FRAME_RELATED_P (insn) = 1;

That "0" and "CR2_REGNO" is non-obvious, it needs a big fat comment (it's
not clear that just CR2_REGNO is correct, either).  Oh, and please factor
out "REGNO (SET_SRC (set))".

The rest looks good, but please repost.


Segher


[RS6000] PR 80938, Don't emit eh_frame for regs that don't need saving

2017-08-10 Thread Alan Modra
It is possible when using out-of-line register saves or store multiple
to save some registers unnecessarily, for example one reg in the block
saved might be unused.  We don't need to emit eh_frame info for those
registers as that just bloats the eh_frame info, and also can result
in an ICE when shrink-wrap gives multiple paths through the function
saving different sets of registers.  All the join points need to have
identical eh_frame register save state.

This patch reverts the previous fix for PR80939 "Use SAVE_MULTIPLE
only if we restore what it saves (PR80938)" and instead fixes the PR
by correcting the eh_frame info.  The change to rs6000_savres_strategy
is an optimization, but note that it hides the underlying problem in
the PR testcase.

Bootstrapped and regression tested powerpc64-linux (-m32 too) and
powerpc64le-linux, with
https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00774.html and
https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00775.html applied.
OK to apply?

PR target/80938
* config/rs6000/rs6000.c (rs6000_savres_strategy): Revert 2017-08-09.
Don't use store multiple if only one reg needs saving.
(rs6000_frame_related): Don't emit eh_frame for regs that
don't need saving.
(rs6000_emit_epilogue): Likewise.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 2070648..abc55bd 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -24432,20 +24432,37 @@ rs6000_savres_strategy (rs6000_stack_t *info,
   && flag_shrink_wrap_separate
   && optimize_function_for_speed_p (cfun)))
 {
-  /* Prefer store multiple for saves over out-of-line routines,
-since the store-multiple instruction will always be smaller.  */
-  strategy |= SAVE_INLINE_GPRS | SAVE_MULTIPLE;
-
-  /* The situation is more complicated with load multiple.  We'd
-prefer to use the out-of-line routines for restores, since the
-"exit" out-of-line routines can handle the restore of LR and the
-frame teardown.  However if doesn't make sense to use the
-out-of-line routine if that is the only reason we'd need to save
-LR, and we can't use the "exit" out-of-line gpr restore if we
-have saved some fprs; In those cases it is advantageous to use
-load multiple when available.  */
-  if (info->first_fp_reg_save != 64 || !lr_save_p)
-   strategy |= REST_INLINE_GPRS | REST_MULTIPLE;
+  int count;
+
+  for (count = 0, i = info->first_gp_reg_save; i < 32; i++)
+   if (save_reg_p (i))
+ count += 1;
+
+  if (count <= 1)
+   /* Don't use store multiple if only one reg needs to be
+  saved.  This can occur for example when the ABI_V4 pic reg
+  (r30) needs to be saved to make calls, but r31 is not
+  used.  */
+   strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS;
+  else
+   {
+ /* Prefer store multiple for saves over out-of-line
+routines, since the store-multiple instruction will
+always be smaller.  */
+ strategy |= SAVE_INLINE_GPRS | SAVE_MULTIPLE;
+
+ /* The situation is more complicated with load multiple.
+We'd prefer to use the out-of-line routines for restores,
+since the "exit" out-of-line routines can handle the
+restore of LR and the frame teardown.  However if doesn't
+make sense to use the out-of-line routine if that is the
+only reason we'd need to save LR, and we can't use the
+"exit" out-of-line gpr restore if we have saved some
+fprs; In those cases it is advantageous to use load
+multiple when available.  */
+ if (info->first_fp_reg_save != 64 || !lr_save_p)
+   strategy |= REST_INLINE_GPRS | REST_MULTIPLE;
+   }
 }
 
   /* Using the "exit" out-of-line routine does not improve code size
@@ -24454,21 +24471,6 @@ rs6000_savres_strategy (rs6000_stack_t *info,
   else if (!lr_save_p && info->first_gp_reg_save > 29)
 strategy |= SAVE_INLINE_GPRS | REST_INLINE_GPRS;
 
-  /* We can only use save multiple if we need to save all the registers from
- first_gp_reg_save.  Otherwise, the CFI gets messed up (we save some
- register we do not restore).  */
-  if (strategy & SAVE_MULTIPLE)
-{
-  int i;
-
-  for (i = info->first_gp_reg_save; i < 32; i++)
-   if (fixed_reg_p (i) || !save_reg_p (i))
- {
-   strategy &= ~SAVE_MULTIPLE;
-   break;
- }
-}
-
   /* Don't ever restore fixed regs.  */
   if ((strategy & (REST_INLINE_GPRS | REST_MULTIPLE)) != REST_INLINE_GPRS)
 for (i = info->first_gp_reg_save; i < 32; i++)
@@ -25681,9 +25683,15 @@ rs6000_frame_related (rtx_insn *insn, rtx reg, 
HOST_WIDE_INT val,
 register save functions, or store multiple, then omit
 eh_frame info for any user-defined global regs.  If
 eh_frame