Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)

2012-02-04 Thread Ayal Zaks
SMS changes are ok.


* common.opt (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose):
   New flags.

We should document what the different verbosity levels are, or
at-least their range.


Thanks,
Ayal.



On Tue, Jan 10, 2012 at 7:48 PM, Vladimir Makarov vmaka...@redhat.com wrote:
 On 01/03/2012 04:25 AM, Revital1 Eres wrote:


 Attached is an updated version with the two changes mentioned above taken
 from the previous patch.

 Tested and bootstrap with the other patch in the series on
 ppc64-redhat-linux, enabling SMS on loops with SC 1.

 Thanks again,
 Revital


 IRA changes are ok for me.
 Thanks, Revital.

 2012-01-03  Richard Sandifordrichard.sandif...@linaro.org
             Revital Eresrevital.e...@linaro.org

         * loop-invariant.c (get_regno_pressure_class): Move function to...
         * ira.c: Here.
         * common.opt (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose):
         New flags.
         * doc/invoke.texi (fmodulo-sched-reg-pressure,
         -fmodulo-sched-verbose): Document the flags.
         * ira.h (get_regno_pressure_class,
         reset_pseudo_classes_defined_p): Declare.
         * ira-costs.c (reset_pseudo_classes_defined_p): New function.
         * Makefile.in (modulo-sched.o): Include ira.h and modulo-sched.h.
         (modulo-sched-pressure.o): New.
         * modulo-sched.c (ira.h, modulo-sched.h): New includes.
         (partial_schedule_ptr, ps_insn_ptr, struct ps_insn,
         struct ps_reg_move_info, struct partial_schedule): Move to
         modulo-sched.h.
         (ps_rtl_insn, ps_reg_move): Remove static.
         (apply_reg_moves): Remove static and call df_insn_rescan only
         if PS is final.
         (undo_reg_moves): New function.
         (sms_schedule): Call register pressure estimation.
         * modulo-sched.h: New file.
         * modulo-sched-pressure.c: New file.

 (See attached file: patch_pressure_3_1_12.txt)




Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)

2012-01-10 Thread Vladimir Makarov

On 01/03/2012 04:25 AM, Revital1 Eres wrote:


Attached is an updated version with the two changes mentioned above taken
from the previous patch.

Tested and bootstrap with the other patch in the series on
ppc64-redhat-linux, enabling SMS on loops with SC 1.

Thanks again,
Revital



IRA changes are ok for me.
Thanks, Revital.

2012-01-03  Richard Sandifordrichard.sandif...@linaro.org
 Revital Eresrevital.e...@linaro.org

 * loop-invariant.c (get_regno_pressure_class): Move function to...
 * ira.c: Here.
 * common.opt (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose):
 New flags.
 * doc/invoke.texi (fmodulo-sched-reg-pressure,
 -fmodulo-sched-verbose): Document the flags.
 * ira.h (get_regno_pressure_class,
 reset_pseudo_classes_defined_p): Declare.
 * ira-costs.c (reset_pseudo_classes_defined_p): New function.
 * Makefile.in (modulo-sched.o): Include ira.h and modulo-sched.h.
 (modulo-sched-pressure.o): New.
 * modulo-sched.c (ira.h, modulo-sched.h): New includes.
 (partial_schedule_ptr, ps_insn_ptr, struct ps_insn,
 struct ps_reg_move_info, struct partial_schedule): Move to
 modulo-sched.h.
 (ps_rtl_insn, ps_reg_move): Remove static.
 (apply_reg_moves): Remove static and call df_insn_rescan only
 if PS is final.
 (undo_reg_moves): New function.
 (sms_schedule): Call register pressure estimation.
 * modulo-sched.h: New file.
 * modulo-sched-pressure.c: New file.

(See attached file: patch_pressure_3_1_12.txt)




Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)

2012-01-03 Thread Revital1 Eres

Hello,

 On Mon, Jan 2, 2012 at 3:30 PM, Richard Sandiford
 rdsandif...@googlemail.com wrote:
  Ayal Zaks ayal.z...@gmail.com writes:
  +  for (i = 0; i  ira_pressure_classes_num; i++)
  +    {
  +      enum reg_class pressure_class;
  +
  +      pressure_class = ira_pressure_classes[i];
  +
  +      if (max_reg_pressure[pressure_class] == 0)
  +     continue;
  +
  +      if (dump_file)
  +     fprintf (dump_file, %s=%d  %d , reg_class_names
[pressure_class],
  +              max_reg_pressure[pressure_class],
  +              ira_available_class_regs[pressure_class]);
  +
  +      if (max_reg_pressure[pressure_class]
  +        ira_class_hard_regs_num[pressure_class])
  +     {
  +       if (dump_file)
  +         fprintf (dump_file, (pressure)\n);
  +
  +       pressure_p = true;
 
  you can break; now.
 
  FWIW, I thought the same thing at first, but I think Revital's wayis
better.
  It's nice to know when looking at dumps whether there is excess
pressure
  in more than one class.  This isn't performance-critical code after
all.
 

 ok

  however, you have everything setup to compute the amount of spill, so
  suggest to do the following instead:
 
            pressure += max_reg_pressure[pressure_class]
                        - ira_class_hard_regs_num[pressure_class]);
 
  or better call the variable spillage instead of pressure, and
  return it. Current use will only check if it's zero or positive.
 
  I read this suggestion in the same way as Revital seems to have done:
  that we sum the pressure change over all classes.  But that isn't the
idea.
  Using N too many registers in one pressure class cannot be mitigated by
  leaving N registers in another pressure class unused.
 

 of-course (wasn't thinking of spilling from one register file to
 another); meant to suggest doing:

 if (max_reg_pressure[pressure_class]  ira_class_hard_regs_num
 [pressure_class])
   spillage += max_reg_pressure[pressure_class] -
 ira_class_hard_regs_num[pressure_class]);


  TBH, the original version of this function looked better to me.
 

 ok, it would surely suffice for now.


Attached is an updated version with the two changes mentioned above taken
from the previous patch.

Tested and bootstrap with the other patch in the series on
ppc64-redhat-linux, enabling SMS on loops with SC 1.

Thanks again,
Revital

2012-01-03  Richard Sandiford  richard.sandif...@linaro.org
Revital Eres  revital.e...@linaro.org

* loop-invariant.c (get_regno_pressure_class): Move function to...
* ira.c: Here.
* common.opt (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose):
New flags.
* doc/invoke.texi (fmodulo-sched-reg-pressure,
-fmodulo-sched-verbose): Document the flags.
* ira.h (get_regno_pressure_class,
reset_pseudo_classes_defined_p): Declare.
* ira-costs.c (reset_pseudo_classes_defined_p): New function.
* Makefile.in (modulo-sched.o): Include ira.h and modulo-sched.h.
(modulo-sched-pressure.o): New.
* modulo-sched.c (ira.h, modulo-sched.h): New includes.
(partial_schedule_ptr, ps_insn_ptr, struct ps_insn,
struct ps_reg_move_info, struct partial_schedule): Move to
modulo-sched.h.
(ps_rtl_insn, ps_reg_move): Remove static.
(apply_reg_moves): Remove static and call df_insn_rescan only
if PS is final.
(undo_reg_moves): New function.
(sms_schedule): Call register pressure estimation.
* modulo-sched.h: New file.
* modulo-sched-pressure.c: New file.

(See attached file: patch_pressure_3_1_12.txt)Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 182766)
+++ doc/invoke.texi (working copy)
@@ -374,6 +374,7 @@ Objective-C and Objective-C++ Dialects}.
 -floop-parallelize-all -flto -flto-compression-level @gol
 -flto-partition=@var{alg} -flto-report -fmerge-all-constants @gol
 -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol
+-fmodulo-sched-reg-pressure -fmodulo-sched-verbose=@var{n} @gol
 -fmove-loop-invariants fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg 
@gol
 -fno-default-inline @gol
 -fno-defer-pop -fno-function-cse -fno-guess-branch-probability @gol
@@ -6476,6 +6477,16 @@ deleted which will trigger the generatio
 life-range analysis.  This option is effective only with
 @option{-fmodulo-sched} enabled.
 
+@item -fmodulo-sched-reg-pressure
+@opindex fmodulo-sched-reg-pressure
+Do not apply @option{-fmodulo-sched} to loops if the result would lead
+to register spilling within the loop.
+This option is effective only with @option{-fmodulo-sched} enabled.
+
+@item -fmodulo-sched-verbose=@var{n}
+@opindex fmodulo-sched-verbose
+Set up how verbose dump file for the SMS will be.  
+
 @item -fno-branch-count-reg
 @opindex fno-branch-count-reg
 Do not use ``decrement and branch'' instructions on a count register,
Index: modulo-sched.h

Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)

2012-01-02 Thread Richard Sandiford
Ayal Zaks ayal.z...@gmail.com writes:
 +  for (i = 0; i  ira_pressure_classes_num; i++)
 +{
 +  enum reg_class pressure_class;
 +
 +  pressure_class = ira_pressure_classes[i];
 +
 +  if (max_reg_pressure[pressure_class] == 0)
 + continue;
 +
 +  if (dump_file)
 + fprintf (dump_file, %s=%d  %d , reg_class_names[pressure_class],
 +  max_reg_pressure[pressure_class],
 +  ira_available_class_regs[pressure_class]);
 +
 +  if (max_reg_pressure[pressure_class]
 +ira_class_hard_regs_num[pressure_class])
 + {
 +   if (dump_file)
 + fprintf (dump_file, (pressure)\n);
 +
 +   pressure_p = true;

 you can break; now.

FWIW, I thought the same thing at first, but I think Revital's way is better.
It's nice to know when looking at dumps whether there is excess pressure
in more than one class.  This isn't performance-critical code after all.

 however, you have everything setup to compute the amount of spill, so
 suggest to do the following instead:

   pressure += max_reg_pressure[pressure_class]
   - ira_class_hard_regs_num[pressure_class]);

 or better call the variable spillage instead of pressure, and
 return it. Current use will only check if it's zero or positive.

I read this suggestion in the same way as Revital seems to have done:
that we sum the pressure change over all classes.  But that isn't the idea.
Using N too many registers in one pressure class cannot be mitigated by
leaving N registers in another pressure class unused.

TBH, the original version of this function looked better to me.

 Future use, as discussed offline, should compare this to the amount of
 spillage the original loop (w/o modulo-scheduling) is expected to
 have.

I wonder if that's really worth pursuing though.  For one thing,
we've no control over where the spill code is going to be inserted,
so the final ii is going to be a little arbitrary.  That's less of a
problem without SMS because we're able to reschedule the code after
register allocation in order to move spills around.

For another thing, the pressure of the original (unscheduled) code isn't
necessarily indicative of what can be achieved by the normal scheduler
with something like -fsched-pressure.  We can't assume that the original
block has minimum pressure.

If we're serious about wanting to use SMS in high-pressure loops,
perhaps we should consider trying to split live ranges in SMS itself.
I'm not sure it makes sense to leave an SMSed loop that we know
is going to need spill code.

Richard


Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)

2012-01-02 Thread Ayal Zaks
On Mon, Jan 2, 2012 at 3:30 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Ayal Zaks ayal.z...@gmail.com writes:
 +  for (i = 0; i  ira_pressure_classes_num; i++)
 +    {
 +      enum reg_class pressure_class;
 +
 +      pressure_class = ira_pressure_classes[i];
 +
 +      if (max_reg_pressure[pressure_class] == 0)
 +     continue;
 +
 +      if (dump_file)
 +     fprintf (dump_file, %s=%d  %d , reg_class_names[pressure_class],
 +              max_reg_pressure[pressure_class],
 +              ira_available_class_regs[pressure_class]);
 +
 +      if (max_reg_pressure[pressure_class]
 +        ira_class_hard_regs_num[pressure_class])
 +     {
 +       if (dump_file)
 +         fprintf (dump_file, (pressure)\n);
 +
 +       pressure_p = true;

 you can break; now.

 FWIW, I thought the same thing at first, but I think Revital's way is better.
 It's nice to know when looking at dumps whether there is excess pressure
 in more than one class.  This isn't performance-critical code after all.


ok

 however, you have everything setup to compute the amount of spill, so
 suggest to do the following instead:

           pressure += max_reg_pressure[pressure_class]
                       - ira_class_hard_regs_num[pressure_class]);

 or better call the variable spillage instead of pressure, and
 return it. Current use will only check if it's zero or positive.

 I read this suggestion in the same way as Revital seems to have done:
 that we sum the pressure change over all classes.  But that isn't the idea.
 Using N too many registers in one pressure class cannot be mitigated by
 leaving N registers in another pressure class unused.


of-course (wasn't thinking of spilling from one register file to
another); meant to suggest doing:

if (max_reg_pressure[pressure_class]  ira_class_hard_regs_num[pressure_class])
  spillage += max_reg_pressure[pressure_class] -
ira_class_hard_regs_num[pressure_class]);


 TBH, the original version of this function looked better to me.


ok, it would surely suffice for now.


 Future use, as discussed offline, should compare this to the amount of
 spillage the original loop (w/o modulo-scheduling) is expected to
 have.

 I wonder if that's really worth pursuing though.  For one thing,
 we've no control over where the spill code is going to be inserted,
 so the final ii is going to be a little arbitrary.  That's less of a
 problem without SMS because we're able to reschedule the code after
 register allocation in order to move spills around.

Indeed; that's the main motivation for the second sched pass.


 For another thing, the pressure of the original (unscheduled) code isn't
 necessarily indicative of what can be achieved by the normal scheduler
 with something like -fsched-pressure.  We can't assume that the original
 block has minimum pressure.

 If we're serious about wanting to use SMS in high-pressure loops,
 perhaps we should consider trying to split live ranges in SMS itself.
 I'm not sure it makes sense to leave an SMSed loop that we know
 is going to need spill code.

 Richard

Agreed. Suggest to have concrete testcases guide further development.
Thanks,
Ayal.


Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)

2012-01-01 Thread Revital Eres
Hello,

Thanks for the comments! I incorporated them in the attached patch.

Currently testing and bootstrap with the other patch in the series on
ppc64-redhat-linux, enabling SMS on loops with SC 1.

Thanks again,
Revital

2012-01-01  Richard Sandiford  richard.sandif...@linaro.org
Revital Eres  revital.e...@linaro.org

* loop-invariant.c (get_regno_pressure_class): Move function to...
* ira.c: Here.
* common.opt (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose):
New flags.
* doc/invoke.texi (fmodulo-sched-reg-pressure,
-fmodulo-sched-verbose): Document the flags.
* ira.h (get_regno_pressure_class,
reset_pseudo_classes_defined_p): Declare.
* ira-costs.c (reset_pseudo_classes_defined_p): New function.
* Makefile.in (modulo-sched.o): Include ira.h and modulo-sched.h.
(modulo-sched-pressure.o): New.
* modulo-sched.c (ira.h, modulo-sched.h): New includes.
(partial_schedule_ptr, ps_insn_ptr, struct ps_insn,
struct ps_reg_move_info, struct partial_schedule): Move to
modulo-sched.h.
(ps_rtl_insn, ps_reg_move): Remove static.
(apply_reg_moves): Remove static and call df_insn_rescan only
if PS is final.
(undo_reg_moves): New function.
(sms_schedule): Call register pressure estimation.
* modulo-sched.h: New file.
* modulo-sched-pressure.c: New file.
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 182766)
+++ doc/invoke.texi (working copy)
@@ -374,6 +374,7 @@ Objective-C and Objective-C++ Dialects}.
 -floop-parallelize-all -flto -flto-compression-level @gol
 -flto-partition=@var{alg} -flto-report -fmerge-all-constants @gol
 -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol
+-fmodulo-sched-reg-pressure -fmodulo-sched-verbose=@var{n} @gol
 -fmove-loop-invariants fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg 
@gol
 -fno-default-inline @gol
 -fno-defer-pop -fno-function-cse -fno-guess-branch-probability @gol
@@ -6476,6 +6477,16 @@ deleted which will trigger the generatio
 life-range analysis.  This option is effective only with
 @option{-fmodulo-sched} enabled.
 
+@item -fmodulo-sched-reg-pressure
+@opindex fmodulo-sched-reg-pressure
+Do not apply @option{-fmodulo-sched} to loops if the result would lead
+to register spilling within the loop.
+This option is effective only with @option{-fmodulo-sched} enabled.
+
+@item -fmodulo-sched-verbose=@var{n}
+@opindex fmodulo-sched-verbose
+Set up how verbose dump file for the SMS will be.  
+
 @item -fno-branch-count-reg
 @opindex fno-branch-count-reg
 Do not use ``decrement and branch'' instructions on a count register,
Index: modulo-sched.h
===
--- modulo-sched.h  (revision 0)
+++ modulo-sched.h  (revision 0)
@@ -0,0 +1,120 @@
+/* Swing Modulo Scheduling implementation.
+   Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 
+   Free Software Foundation, Inc.
+   Contributed by Revital Eres revital.e...@linaro.org 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+http://www.gnu.org/licenses/.  */
+
+#ifndef GCC_SMS_H
+#define GCC_SMS_H
+
+#include ddg.h
+
+extern HARD_REG_SET eliminable_regset;
+
+typedef struct partial_schedule *partial_schedule_ptr;
+
+typedef struct ps_insn *ps_insn_ptr;
+
+/* A single instruction in the partial schedule.  */
+struct ps_insn
+{
+  /* Identifies the instruction to be scheduled.  Values smaller than
+ the ddg's num_nodes refer directly to ddg nodes.  A value of
+ X - num_nodes refers to register move X.  */
+  int id;
+
+  /* The (absolute) cycle in which the PS instruction is scheduled.
+ Same as SCHED_TIME (node).  */
+  int cycle;
+
+  /* The next/prev PS_INSN in the same row.  */
+  ps_insn_ptr next_in_row,
+ prev_in_row;
+
+};
+
+/* Information about a register move that has been added to a partial
+   schedule.  */
+struct ps_reg_move_info
+{
+  /* The source of the move is defined by the ps_insn with id DEF.
+ The destination is used by the ps_insns with the ids in USES.  */
+  int def;
+  sbitmap uses;
+
+  /* The original form of USES' instructions used OLD_REG, but they
+ should now use NEW_REG.  */
+  rtx old_reg;
+  rtx new_reg;
+
+  /* The number of consecutive stages that the 

Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)

2011-12-31 Thread Ayal Zaks
On Mon, Dec 19, 2011 at 4:28 PM, Richard Sandiford
richard.sandif...@linaro.org wrote:
 Hi Revital,

 Revital Eres revital.e...@linaro.org writes:
 The attached patch is a resubmission following comments made by Ayal
 and Richard.

 Tested and bootstrap with the other patches in the series on
 ppc64-redhat-linux, enabling SMS on loops with SC 1.


The modulo-sched parts are approved, with some comments provided below.
Moving get_regno_pressure_class() from loop-invariant.c to ira.c and
other ira.h/ira-costs.c changes are not for me to approve, but they
make sense to me.

 Looks really good to me.  I'll leave any approval to Ayal though.
 Just one suggestion:

 +       /* Handle register moves.  */

while we're at it, the comment above should be assigned to the then
part below:

 +       if (ps_i-id = ps-g-num_nodes)
 +         {
 +           int old_regno = REGNO (ps_reg_move (ps, ps_i-id)-old_reg);
 +           int new_regno = REGNO (ps_reg_move (ps, ps_i-id)-new_reg);
 +
 +           change_pressure (old_regno, true);
 +           change_pressure (new_regno, false);
 +           change_pressure (old_regno, true);
 +         }
 +       else
 +         {
 +           for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++)
 +             change_pressure (DF_REF_REGNO (*use_rec), true);
 +
 +           for (def_rec = DF_INSN_DEFS (insn); *def_rec; def_rec++)
 +             change_pressure (DF_REF_REGNO (*def_rec), false);
 +
 +           for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++)
 +             change_pressure (DF_REF_REGNO (*use_rec), true);
 +         }

 It might be worth adding a comment to say why the code is doing it this way.

Indeed, this does raise an eyebrow, and your explanation below is
great. Suggest to start by saying that The last two steps are the
natural way one would go about updating live registers in a bottom-up
scan, except that in some cases (iiuc) the same physical register
cannot be assigned to both use and def on same insn, so the first step
is added conservatively.

 E.g.:

      /* Process all uses, all defs, and then all uses again.  The first
         two steps give us an estimate of the pressure when dying inputs
         cannot be tied to outputs (which is the worst case).  The second
         two steps update the set of live registers ready for the next
         instruction.  */

 Also, as a general future direction comment, I think we should seriously
 consider turning both this and -fmodulo-sched-allow-regmoves on by default,
 so that -fmodulo-sched uses them unless explicitly told not to.  The results
 for ARM seemed to suggest that it is now the best SMS mode (although the
 results can't be shared, unfortunately).  It'd be unfortunate if users
 had to write:

    -fmodulo-sched -fmodulo-sched-allow-regmoves -fmodulo-sched-reg-pressure

 instead of plain:

    -fmodulo-sched

 It might make sense as a follow-on patch rather than part of the main commit.


I'd be in favor, provided this works well for other platforms as well
of-course. Ignoring the potential increase of register pressure inside
a loop is being incautiously optimistic...

It would be great to document testcases where spillage is detected;
the original motivation of SMS is to reduce size of register live
ranges so as to prevent spillage. There may be other means of
preventing or mitigating spills other than bumping the II (e.g.,
Minimum register requirements for a module schedule Micro'94), which
could be devised if concrete examples are analyzed.

 There's also the separate debate about whether SMS should now be enabled
 by default for ARM at -O3, but that's for another day...


Guess so... ;-)
Thanks.

 Richard


Here are some more comments:

+change_pressure (int regno, bool incr_p)
 ^^
I realize this was taken from elsewhere, but maybe update_pressure
or update_current_pressure would better fit here.

+{
+  int nregs;
+  enum reg_class pressure_class;
+
+  if (regno  FIRST_PSEUDO_REGISTER
+   (TEST_HARD_REG_BIT (ira_no_alloc_regs, regno)
+ || TEST_HARD_REG_BIT (eliminable_regset, regno)))
+return;
+
+  /* Update the current set of live registers.  Exit early if nothing
+ has changed.  */

please clarify: we want to increase curr_reg_pressure as we scan
upwards and encounter a last use; if regno is already live, this use
is not last.

likewise, we want to decrease curr_reg_pressure as we encounter a def;
regno might not be live when !incr_p, if the def feeds only uses in
next iteration.

+  if (incr_p
+  ? !bitmap_set_bit (curr_regs_live, regno)
+  : !bitmap_clear_bit (curr_regs_live, regno))
+return;
+
+  pressure_class = get_regno_pressure_class (regno, nregs);
+
+  if (!incr_p)
+curr_reg_pressure[pressure_class] -= nregs;
+  else
+curr_reg_pressure[pressure_class] += nregs;
+

sounds a bit more natural to ask if (incr_p)...; and only if so,
check also if the increased current pressure exceeds max:

+  if (max_reg_pressure[pressure_class]  

Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)

2011-12-19 Thread Richard Sandiford
Hi Revital,

Revital Eres revital.e...@linaro.org writes:
 The attached patch is a resubmission following comments made by Ayal
 and Richard.

 Tested and bootstrap with the other patches in the series on
 ppc64-redhat-linux, enabling SMS on loops with SC 1.

Looks really good to me.  I'll leave any approval to Ayal though.
Just one suggestion:

 +   /* Handle register moves.  */
 +   if (ps_i-id = ps-g-num_nodes)
 + {
 +   int old_regno = REGNO (ps_reg_move (ps, ps_i-id)-old_reg);
 +   int new_regno = REGNO (ps_reg_move (ps, ps_i-id)-new_reg);
 +
 +   change_pressure (old_regno, true);
 +   change_pressure (new_regno, false);
 +   change_pressure (old_regno, true);
 + }
 +   else
 + {
 +   for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++)
 + change_pressure (DF_REF_REGNO (*use_rec), true);
 +
 +   for (def_rec = DF_INSN_DEFS (insn); *def_rec; def_rec++)
 + change_pressure (DF_REF_REGNO (*def_rec), false);
 +
 +   for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++)
 + change_pressure (DF_REF_REGNO (*use_rec), true);
 + }

It might be worth adding a comment to say why the code is doing it this way.
E.g.:

  /* Process all uses, all defs, and then all uses again.  The first
 two steps give us an estimate of the pressure when dying inputs
 cannot be tied to outputs (which is the worst case).  The second
 two steps update the set of live registers ready for the next
 instruction.  */

Also, as a general future direction comment, I think we should seriously
consider turning both this and -fmodulo-sched-allow-regmoves on by default,
so that -fmodulo-sched uses them unless explicitly told not to.  The results
for ARM seemed to suggest that it is now the best SMS mode (although the
results can't be shared, unfortunately).  It'd be unfortunate if users
had to write:

-fmodulo-sched -fmodulo-sched-allow-regmoves -fmodulo-sched-reg-pressure

instead of plain:

-fmodulo-sched

It might make sense as a follow-on patch rather than part of the main commit.

There's also the separate debate about whether SMS should now be enabled
by default for ARM at -O3, but that's for another day...

Richard


[PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)

2011-12-17 Thread Revital Eres
Hello,

The attached patch is a resubmission following comments made by Ayal
and Richard.

Tested and bootstrap with the other patches in the series on
ppc64-redhat-linux, enabling SMS on loops with SC 1.

Comments are welcome.

Thanks,
Revital


   2011-12-18  Richard Sandiford  richard.sandif...@linaro.org
Revital Eres  revital.e...@linaro.org

* loop-invariant.c (get_regno_pressure_class): Move function to...
* ira.c: Here.
* common.opt (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose):
New flags.
* doc/invoke.texi (fmodulo-sched-reg-pressure,
-fmodulo-sched-verbose): Document the flags.
* ira.h (get_regno_pressure_class,
reset_pseudo_classes_defined_p): Declare.
* ira-costs.c (reset_pseudo_classes_defined_p): New function.
* Makefile.in (modulo-sched.o): Include ira.h and modulo-sched.h.
(modulo-sched-pressure.o): New.
* modulo-sched.c (ira.h, modulo-sched.h): New includes.
(partial_schedule_ptr, ps_insn_ptr, struct ps_insn,
struct ps_reg_move_info, struct partial_schedule): Move to
modulo-sched.h.
(ps_rtl_insn, ps_reg_move): Remove static.
(apply_reg_moves): Remove static and call df_insn_rescan only
if PS is final.
(undo_reg_moves): New function.
(sms_schedule): Call register pressure estimation.
* modulo-sched.h: New file.
* modulo-sched-pressure.c: New file.
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 182357)
+++ doc/invoke.texi (working copy)
@@ -373,6 +373,7 @@ Objective-C and Objective-C++ Dialects}.
 -floop-parallelize-all -flto -flto-compression-level @gol
 -flto-partition=@var{alg} -flto-report -fmerge-all-constants @gol
 -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol
+-fmodulo-sched-reg-pressure -fmodulo-sched-verbose=@var{n} @gol
 -fmove-loop-invariants fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg 
@gol
 -fno-default-inline @gol
 -fno-defer-pop -fno-function-cse -fno-guess-branch-probability @gol
@@ -6474,6 +6475,16 @@ deleted which will trigger the generatio
 life-range analysis.  This option is effective only with
 @option{-fmodulo-sched} enabled.
 
+@item -fmodulo-sched-reg-pressure
+@opindex fmodulo-sched-reg-pressure
+Do not apply @option{-fmodulo-sched} to loops if the result would lead
+to register spilling within the loop.
+This option is effective only with @option{-fmodulo-sched} enabled.
+
+@item -fmodulo-sched-verbose=@var{n}
+@opindex fmodulo-sched-verbose
+Set up how verbose dump file for the SMS will be.  
+
 @item -fno-branch-count-reg
 @opindex fno-branch-count-reg
 Do not use ``decrement and branch'' instructions on a count register,
Index: modulo-sched.h
===
--- modulo-sched.h  (revision 0)
+++ modulo-sched.h  (revision 0)
@@ -0,0 +1,120 @@
+/* Swing Modulo Scheduling implementation.
+   Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 
+   Free Software Foundation, Inc.
+   Contributed by Revital Eres revital.e...@linaro.org 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+http://www.gnu.org/licenses/.  */
+
+#ifndef GCC_SMS_H
+#define GCC_SMS_H
+
+#include ddg.h
+
+extern HARD_REG_SET eliminable_regset;
+
+typedef struct partial_schedule *partial_schedule_ptr;
+
+typedef struct ps_insn *ps_insn_ptr;
+
+/* A single instruction in the partial schedule.  */
+struct ps_insn
+{
+  /* Identifies the instruction to be scheduled.  Values smaller than
+ the ddg's num_nodes refer directly to ddg nodes.  A value of
+ X - num_nodes refers to register move X.  */
+  int id;
+
+  /* The (absolute) cycle in which the PS instruction is scheduled.
+ Same as SCHED_TIME (node).  */
+  int cycle;
+
+  /* The next/prev PS_INSN in the same row.  */
+  ps_insn_ptr next_in_row,
+ prev_in_row;
+
+};
+
+/* Information about a register move that has been added to a partial
+   schedule.  */
+struct ps_reg_move_info
+{
+  /* The source of the move is defined by the ps_insn with id DEF.
+ The destination is used by the ps_insns with the ids in USES.  */
+  int def;
+  sbitmap uses;
+
+  /* The original form of USES' instructions used OLD_REG, but they
+ should now use NEW_REG.  */
+  rtx old_reg;
+  rtx new_reg;
+
+  /* The number of