Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)
SMS changes are ok. * common.opt (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose): New flags. We should document what the different verbosity levels are, or at-least their range. Thanks, Ayal. On Tue, Jan 10, 2012 at 7:48 PM, Vladimir Makarov vmaka...@redhat.com wrote: On 01/03/2012 04:25 AM, Revital1 Eres wrote: Attached is an updated version with the two changes mentioned above taken from the previous patch. Tested and bootstrap with the other patch in the series on ppc64-redhat-linux, enabling SMS on loops with SC 1. Thanks again, Revital IRA changes are ok for me. Thanks, Revital. 2012-01-03 Richard Sandifordrichard.sandif...@linaro.org Revital Eresrevital.e...@linaro.org * loop-invariant.c (get_regno_pressure_class): Move function to... * ira.c: Here. * common.opt (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose): New flags. * doc/invoke.texi (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose): Document the flags. * ira.h (get_regno_pressure_class, reset_pseudo_classes_defined_p): Declare. * ira-costs.c (reset_pseudo_classes_defined_p): New function. * Makefile.in (modulo-sched.o): Include ira.h and modulo-sched.h. (modulo-sched-pressure.o): New. * modulo-sched.c (ira.h, modulo-sched.h): New includes. (partial_schedule_ptr, ps_insn_ptr, struct ps_insn, struct ps_reg_move_info, struct partial_schedule): Move to modulo-sched.h. (ps_rtl_insn, ps_reg_move): Remove static. (apply_reg_moves): Remove static and call df_insn_rescan only if PS is final. (undo_reg_moves): New function. (sms_schedule): Call register pressure estimation. * modulo-sched.h: New file. * modulo-sched-pressure.c: New file. (See attached file: patch_pressure_3_1_12.txt)
Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)
On 01/03/2012 04:25 AM, Revital1 Eres wrote: Attached is an updated version with the two changes mentioned above taken from the previous patch. Tested and bootstrap with the other patch in the series on ppc64-redhat-linux, enabling SMS on loops with SC 1. Thanks again, Revital IRA changes are ok for me. Thanks, Revital. 2012-01-03 Richard Sandifordrichard.sandif...@linaro.org Revital Eresrevital.e...@linaro.org * loop-invariant.c (get_regno_pressure_class): Move function to... * ira.c: Here. * common.opt (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose): New flags. * doc/invoke.texi (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose): Document the flags. * ira.h (get_regno_pressure_class, reset_pseudo_classes_defined_p): Declare. * ira-costs.c (reset_pseudo_classes_defined_p): New function. * Makefile.in (modulo-sched.o): Include ira.h and modulo-sched.h. (modulo-sched-pressure.o): New. * modulo-sched.c (ira.h, modulo-sched.h): New includes. (partial_schedule_ptr, ps_insn_ptr, struct ps_insn, struct ps_reg_move_info, struct partial_schedule): Move to modulo-sched.h. (ps_rtl_insn, ps_reg_move): Remove static. (apply_reg_moves): Remove static and call df_insn_rescan only if PS is final. (undo_reg_moves): New function. (sms_schedule): Call register pressure estimation. * modulo-sched.h: New file. * modulo-sched-pressure.c: New file. (See attached file: patch_pressure_3_1_12.txt)
Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)
Hello, On Mon, Jan 2, 2012 at 3:30 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Ayal Zaks ayal.z...@gmail.com writes: + for (i = 0; i ira_pressure_classes_num; i++) + { + enum reg_class pressure_class; + + pressure_class = ira_pressure_classes[i]; + + if (max_reg_pressure[pressure_class] == 0) + continue; + + if (dump_file) + fprintf (dump_file, %s=%d %d , reg_class_names [pressure_class], + max_reg_pressure[pressure_class], + ira_available_class_regs[pressure_class]); + + if (max_reg_pressure[pressure_class] + ira_class_hard_regs_num[pressure_class]) + { + if (dump_file) + fprintf (dump_file, (pressure)\n); + + pressure_p = true; you can break; now. FWIW, I thought the same thing at first, but I think Revital's wayis better. It's nice to know when looking at dumps whether there is excess pressure in more than one class. This isn't performance-critical code after all. ok however, you have everything setup to compute the amount of spill, so suggest to do the following instead: pressure += max_reg_pressure[pressure_class] - ira_class_hard_regs_num[pressure_class]); or better call the variable spillage instead of pressure, and return it. Current use will only check if it's zero or positive. I read this suggestion in the same way as Revital seems to have done: that we sum the pressure change over all classes. But that isn't the idea. Using N too many registers in one pressure class cannot be mitigated by leaving N registers in another pressure class unused. of-course (wasn't thinking of spilling from one register file to another); meant to suggest doing: if (max_reg_pressure[pressure_class] ira_class_hard_regs_num [pressure_class]) spillage += max_reg_pressure[pressure_class] - ira_class_hard_regs_num[pressure_class]); TBH, the original version of this function looked better to me. ok, it would surely suffice for now. Attached is an updated version with the two changes mentioned above taken from the previous patch. Tested and bootstrap with the other patch in the series on ppc64-redhat-linux, enabling SMS on loops with SC 1. Thanks again, Revital 2012-01-03 Richard Sandiford richard.sandif...@linaro.org Revital Eres revital.e...@linaro.org * loop-invariant.c (get_regno_pressure_class): Move function to... * ira.c: Here. * common.opt (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose): New flags. * doc/invoke.texi (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose): Document the flags. * ira.h (get_regno_pressure_class, reset_pseudo_classes_defined_p): Declare. * ira-costs.c (reset_pseudo_classes_defined_p): New function. * Makefile.in (modulo-sched.o): Include ira.h and modulo-sched.h. (modulo-sched-pressure.o): New. * modulo-sched.c (ira.h, modulo-sched.h): New includes. (partial_schedule_ptr, ps_insn_ptr, struct ps_insn, struct ps_reg_move_info, struct partial_schedule): Move to modulo-sched.h. (ps_rtl_insn, ps_reg_move): Remove static. (apply_reg_moves): Remove static and call df_insn_rescan only if PS is final. (undo_reg_moves): New function. (sms_schedule): Call register pressure estimation. * modulo-sched.h: New file. * modulo-sched-pressure.c: New file. (See attached file: patch_pressure_3_1_12.txt)Index: doc/invoke.texi === --- doc/invoke.texi (revision 182766) +++ doc/invoke.texi (working copy) @@ -374,6 +374,7 @@ Objective-C and Objective-C++ Dialects}. -floop-parallelize-all -flto -flto-compression-level @gol -flto-partition=@var{alg} -flto-report -fmerge-all-constants @gol -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol +-fmodulo-sched-reg-pressure -fmodulo-sched-verbose=@var{n} @gol -fmove-loop-invariants fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg @gol -fno-default-inline @gol -fno-defer-pop -fno-function-cse -fno-guess-branch-probability @gol @@ -6476,6 +6477,16 @@ deleted which will trigger the generatio life-range analysis. This option is effective only with @option{-fmodulo-sched} enabled. +@item -fmodulo-sched-reg-pressure +@opindex fmodulo-sched-reg-pressure +Do not apply @option{-fmodulo-sched} to loops if the result would lead +to register spilling within the loop. +This option is effective only with @option{-fmodulo-sched} enabled. + +@item -fmodulo-sched-verbose=@var{n} +@opindex fmodulo-sched-verbose +Set up how verbose dump file for the SMS will be. + @item -fno-branch-count-reg @opindex fno-branch-count-reg Do not use ``decrement and branch'' instructions on a count register, Index: modulo-sched.h
Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)
Ayal Zaks ayal.z...@gmail.com writes: + for (i = 0; i ira_pressure_classes_num; i++) +{ + enum reg_class pressure_class; + + pressure_class = ira_pressure_classes[i]; + + if (max_reg_pressure[pressure_class] == 0) + continue; + + if (dump_file) + fprintf (dump_file, %s=%d %d , reg_class_names[pressure_class], + max_reg_pressure[pressure_class], + ira_available_class_regs[pressure_class]); + + if (max_reg_pressure[pressure_class] +ira_class_hard_regs_num[pressure_class]) + { + if (dump_file) + fprintf (dump_file, (pressure)\n); + + pressure_p = true; you can break; now. FWIW, I thought the same thing at first, but I think Revital's way is better. It's nice to know when looking at dumps whether there is excess pressure in more than one class. This isn't performance-critical code after all. however, you have everything setup to compute the amount of spill, so suggest to do the following instead: pressure += max_reg_pressure[pressure_class] - ira_class_hard_regs_num[pressure_class]); or better call the variable spillage instead of pressure, and return it. Current use will only check if it's zero or positive. I read this suggestion in the same way as Revital seems to have done: that we sum the pressure change over all classes. But that isn't the idea. Using N too many registers in one pressure class cannot be mitigated by leaving N registers in another pressure class unused. TBH, the original version of this function looked better to me. Future use, as discussed offline, should compare this to the amount of spillage the original loop (w/o modulo-scheduling) is expected to have. I wonder if that's really worth pursuing though. For one thing, we've no control over where the spill code is going to be inserted, so the final ii is going to be a little arbitrary. That's less of a problem without SMS because we're able to reschedule the code after register allocation in order to move spills around. For another thing, the pressure of the original (unscheduled) code isn't necessarily indicative of what can be achieved by the normal scheduler with something like -fsched-pressure. We can't assume that the original block has minimum pressure. If we're serious about wanting to use SMS in high-pressure loops, perhaps we should consider trying to split live ranges in SMS itself. I'm not sure it makes sense to leave an SMSed loop that we know is going to need spill code. Richard
Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)
On Mon, Jan 2, 2012 at 3:30 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Ayal Zaks ayal.z...@gmail.com writes: + for (i = 0; i ira_pressure_classes_num; i++) + { + enum reg_class pressure_class; + + pressure_class = ira_pressure_classes[i]; + + if (max_reg_pressure[pressure_class] == 0) + continue; + + if (dump_file) + fprintf (dump_file, %s=%d %d , reg_class_names[pressure_class], + max_reg_pressure[pressure_class], + ira_available_class_regs[pressure_class]); + + if (max_reg_pressure[pressure_class] + ira_class_hard_regs_num[pressure_class]) + { + if (dump_file) + fprintf (dump_file, (pressure)\n); + + pressure_p = true; you can break; now. FWIW, I thought the same thing at first, but I think Revital's way is better. It's nice to know when looking at dumps whether there is excess pressure in more than one class. This isn't performance-critical code after all. ok however, you have everything setup to compute the amount of spill, so suggest to do the following instead: pressure += max_reg_pressure[pressure_class] - ira_class_hard_regs_num[pressure_class]); or better call the variable spillage instead of pressure, and return it. Current use will only check if it's zero or positive. I read this suggestion in the same way as Revital seems to have done: that we sum the pressure change over all classes. But that isn't the idea. Using N too many registers in one pressure class cannot be mitigated by leaving N registers in another pressure class unused. of-course (wasn't thinking of spilling from one register file to another); meant to suggest doing: if (max_reg_pressure[pressure_class] ira_class_hard_regs_num[pressure_class]) spillage += max_reg_pressure[pressure_class] - ira_class_hard_regs_num[pressure_class]); TBH, the original version of this function looked better to me. ok, it would surely suffice for now. Future use, as discussed offline, should compare this to the amount of spillage the original loop (w/o modulo-scheduling) is expected to have. I wonder if that's really worth pursuing though. For one thing, we've no control over where the spill code is going to be inserted, so the final ii is going to be a little arbitrary. That's less of a problem without SMS because we're able to reschedule the code after register allocation in order to move spills around. Indeed; that's the main motivation for the second sched pass. For another thing, the pressure of the original (unscheduled) code isn't necessarily indicative of what can be achieved by the normal scheduler with something like -fsched-pressure. We can't assume that the original block has minimum pressure. If we're serious about wanting to use SMS in high-pressure loops, perhaps we should consider trying to split live ranges in SMS itself. I'm not sure it makes sense to leave an SMSed loop that we know is going to need spill code. Richard Agreed. Suggest to have concrete testcases guide further development. Thanks, Ayal.
Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)
Hello, Thanks for the comments! I incorporated them in the attached patch. Currently testing and bootstrap with the other patch in the series on ppc64-redhat-linux, enabling SMS on loops with SC 1. Thanks again, Revital 2012-01-01 Richard Sandiford richard.sandif...@linaro.org Revital Eres revital.e...@linaro.org * loop-invariant.c (get_regno_pressure_class): Move function to... * ira.c: Here. * common.opt (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose): New flags. * doc/invoke.texi (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose): Document the flags. * ira.h (get_regno_pressure_class, reset_pseudo_classes_defined_p): Declare. * ira-costs.c (reset_pseudo_classes_defined_p): New function. * Makefile.in (modulo-sched.o): Include ira.h and modulo-sched.h. (modulo-sched-pressure.o): New. * modulo-sched.c (ira.h, modulo-sched.h): New includes. (partial_schedule_ptr, ps_insn_ptr, struct ps_insn, struct ps_reg_move_info, struct partial_schedule): Move to modulo-sched.h. (ps_rtl_insn, ps_reg_move): Remove static. (apply_reg_moves): Remove static and call df_insn_rescan only if PS is final. (undo_reg_moves): New function. (sms_schedule): Call register pressure estimation. * modulo-sched.h: New file. * modulo-sched-pressure.c: New file. Index: doc/invoke.texi === --- doc/invoke.texi (revision 182766) +++ doc/invoke.texi (working copy) @@ -374,6 +374,7 @@ Objective-C and Objective-C++ Dialects}. -floop-parallelize-all -flto -flto-compression-level @gol -flto-partition=@var{alg} -flto-report -fmerge-all-constants @gol -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol +-fmodulo-sched-reg-pressure -fmodulo-sched-verbose=@var{n} @gol -fmove-loop-invariants fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg @gol -fno-default-inline @gol -fno-defer-pop -fno-function-cse -fno-guess-branch-probability @gol @@ -6476,6 +6477,16 @@ deleted which will trigger the generatio life-range analysis. This option is effective only with @option{-fmodulo-sched} enabled. +@item -fmodulo-sched-reg-pressure +@opindex fmodulo-sched-reg-pressure +Do not apply @option{-fmodulo-sched} to loops if the result would lead +to register spilling within the loop. +This option is effective only with @option{-fmodulo-sched} enabled. + +@item -fmodulo-sched-verbose=@var{n} +@opindex fmodulo-sched-verbose +Set up how verbose dump file for the SMS will be. + @item -fno-branch-count-reg @opindex fno-branch-count-reg Do not use ``decrement and branch'' instructions on a count register, Index: modulo-sched.h === --- modulo-sched.h (revision 0) +++ modulo-sched.h (revision 0) @@ -0,0 +1,120 @@ +/* Swing Modulo Scheduling implementation. + Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 + Free Software Foundation, Inc. + Contributed by Revital Eres revital.e...@linaro.org + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +http://www.gnu.org/licenses/. */ + +#ifndef GCC_SMS_H +#define GCC_SMS_H + +#include ddg.h + +extern HARD_REG_SET eliminable_regset; + +typedef struct partial_schedule *partial_schedule_ptr; + +typedef struct ps_insn *ps_insn_ptr; + +/* A single instruction in the partial schedule. */ +struct ps_insn +{ + /* Identifies the instruction to be scheduled. Values smaller than + the ddg's num_nodes refer directly to ddg nodes. A value of + X - num_nodes refers to register move X. */ + int id; + + /* The (absolute) cycle in which the PS instruction is scheduled. + Same as SCHED_TIME (node). */ + int cycle; + + /* The next/prev PS_INSN in the same row. */ + ps_insn_ptr next_in_row, + prev_in_row; + +}; + +/* Information about a register move that has been added to a partial + schedule. */ +struct ps_reg_move_info +{ + /* The source of the move is defined by the ps_insn with id DEF. + The destination is used by the ps_insns with the ids in USES. */ + int def; + sbitmap uses; + + /* The original form of USES' instructions used OLD_REG, but they + should now use NEW_REG. */ + rtx old_reg; + rtx new_reg; + + /* The number of consecutive stages that the
Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)
On Mon, Dec 19, 2011 at 4:28 PM, Richard Sandiford richard.sandif...@linaro.org wrote: Hi Revital, Revital Eres revital.e...@linaro.org writes: The attached patch is a resubmission following comments made by Ayal and Richard. Tested and bootstrap with the other patches in the series on ppc64-redhat-linux, enabling SMS on loops with SC 1. The modulo-sched parts are approved, with some comments provided below. Moving get_regno_pressure_class() from loop-invariant.c to ira.c and other ira.h/ira-costs.c changes are not for me to approve, but they make sense to me. Looks really good to me. I'll leave any approval to Ayal though. Just one suggestion: + /* Handle register moves. */ while we're at it, the comment above should be assigned to the then part below: + if (ps_i-id = ps-g-num_nodes) + { + int old_regno = REGNO (ps_reg_move (ps, ps_i-id)-old_reg); + int new_regno = REGNO (ps_reg_move (ps, ps_i-id)-new_reg); + + change_pressure (old_regno, true); + change_pressure (new_regno, false); + change_pressure (old_regno, true); + } + else + { + for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++) + change_pressure (DF_REF_REGNO (*use_rec), true); + + for (def_rec = DF_INSN_DEFS (insn); *def_rec; def_rec++) + change_pressure (DF_REF_REGNO (*def_rec), false); + + for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++) + change_pressure (DF_REF_REGNO (*use_rec), true); + } It might be worth adding a comment to say why the code is doing it this way. Indeed, this does raise an eyebrow, and your explanation below is great. Suggest to start by saying that The last two steps are the natural way one would go about updating live registers in a bottom-up scan, except that in some cases (iiuc) the same physical register cannot be assigned to both use and def on same insn, so the first step is added conservatively. E.g.: /* Process all uses, all defs, and then all uses again. The first two steps give us an estimate of the pressure when dying inputs cannot be tied to outputs (which is the worst case). The second two steps update the set of live registers ready for the next instruction. */ Also, as a general future direction comment, I think we should seriously consider turning both this and -fmodulo-sched-allow-regmoves on by default, so that -fmodulo-sched uses them unless explicitly told not to. The results for ARM seemed to suggest that it is now the best SMS mode (although the results can't be shared, unfortunately). It'd be unfortunate if users had to write: -fmodulo-sched -fmodulo-sched-allow-regmoves -fmodulo-sched-reg-pressure instead of plain: -fmodulo-sched It might make sense as a follow-on patch rather than part of the main commit. I'd be in favor, provided this works well for other platforms as well of-course. Ignoring the potential increase of register pressure inside a loop is being incautiously optimistic... It would be great to document testcases where spillage is detected; the original motivation of SMS is to reduce size of register live ranges so as to prevent spillage. There may be other means of preventing or mitigating spills other than bumping the II (e.g., Minimum register requirements for a module schedule Micro'94), which could be devised if concrete examples are analyzed. There's also the separate debate about whether SMS should now be enabled by default for ARM at -O3, but that's for another day... Guess so... ;-) Thanks. Richard Here are some more comments: +change_pressure (int regno, bool incr_p) ^^ I realize this was taken from elsewhere, but maybe update_pressure or update_current_pressure would better fit here. +{ + int nregs; + enum reg_class pressure_class; + + if (regno FIRST_PSEUDO_REGISTER + (TEST_HARD_REG_BIT (ira_no_alloc_regs, regno) + || TEST_HARD_REG_BIT (eliminable_regset, regno))) +return; + + /* Update the current set of live registers. Exit early if nothing + has changed. */ please clarify: we want to increase curr_reg_pressure as we scan upwards and encounter a last use; if regno is already live, this use is not last. likewise, we want to decrease curr_reg_pressure as we encounter a def; regno might not be live when !incr_p, if the def feeds only uses in next iteration. + if (incr_p + ? !bitmap_set_bit (curr_regs_live, regno) + : !bitmap_clear_bit (curr_regs_live, regno)) +return; + + pressure_class = get_regno_pressure_class (regno, nregs); + + if (!incr_p) +curr_reg_pressure[pressure_class] -= nregs; + else +curr_reg_pressure[pressure_class] += nregs; + sounds a bit more natural to ask if (incr_p)...; and only if so, check also if the increased current pressure exceeds max: + if (max_reg_pressure[pressure_class]
Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)
Hi Revital, Revital Eres revital.e...@linaro.org writes: The attached patch is a resubmission following comments made by Ayal and Richard. Tested and bootstrap with the other patches in the series on ppc64-redhat-linux, enabling SMS on loops with SC 1. Looks really good to me. I'll leave any approval to Ayal though. Just one suggestion: + /* Handle register moves. */ + if (ps_i-id = ps-g-num_nodes) + { + int old_regno = REGNO (ps_reg_move (ps, ps_i-id)-old_reg); + int new_regno = REGNO (ps_reg_move (ps, ps_i-id)-new_reg); + + change_pressure (old_regno, true); + change_pressure (new_regno, false); + change_pressure (old_regno, true); + } + else + { + for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++) + change_pressure (DF_REF_REGNO (*use_rec), true); + + for (def_rec = DF_INSN_DEFS (insn); *def_rec; def_rec++) + change_pressure (DF_REF_REGNO (*def_rec), false); + + for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++) + change_pressure (DF_REF_REGNO (*use_rec), true); + } It might be worth adding a comment to say why the code is doing it this way. E.g.: /* Process all uses, all defs, and then all uses again. The first two steps give us an estimate of the pressure when dying inputs cannot be tied to outputs (which is the worst case). The second two steps update the set of live registers ready for the next instruction. */ Also, as a general future direction comment, I think we should seriously consider turning both this and -fmodulo-sched-allow-regmoves on by default, so that -fmodulo-sched uses them unless explicitly told not to. The results for ARM seemed to suggest that it is now the best SMS mode (although the results can't be shared, unfortunately). It'd be unfortunate if users had to write: -fmodulo-sched -fmodulo-sched-allow-regmoves -fmodulo-sched-reg-pressure instead of plain: -fmodulo-sched It might make sense as a follow-on patch rather than part of the main commit. There's also the separate debate about whether SMS should now be enabled by default for ARM at -O3, but that's for another day... Richard
[PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)
Hello, The attached patch is a resubmission following comments made by Ayal and Richard. Tested and bootstrap with the other patches in the series on ppc64-redhat-linux, enabling SMS on loops with SC 1. Comments are welcome. Thanks, Revital 2011-12-18 Richard Sandiford richard.sandif...@linaro.org Revital Eres revital.e...@linaro.org * loop-invariant.c (get_regno_pressure_class): Move function to... * ira.c: Here. * common.opt (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose): New flags. * doc/invoke.texi (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose): Document the flags. * ira.h (get_regno_pressure_class, reset_pseudo_classes_defined_p): Declare. * ira-costs.c (reset_pseudo_classes_defined_p): New function. * Makefile.in (modulo-sched.o): Include ira.h and modulo-sched.h. (modulo-sched-pressure.o): New. * modulo-sched.c (ira.h, modulo-sched.h): New includes. (partial_schedule_ptr, ps_insn_ptr, struct ps_insn, struct ps_reg_move_info, struct partial_schedule): Move to modulo-sched.h. (ps_rtl_insn, ps_reg_move): Remove static. (apply_reg_moves): Remove static and call df_insn_rescan only if PS is final. (undo_reg_moves): New function. (sms_schedule): Call register pressure estimation. * modulo-sched.h: New file. * modulo-sched-pressure.c: New file. Index: doc/invoke.texi === --- doc/invoke.texi (revision 182357) +++ doc/invoke.texi (working copy) @@ -373,6 +373,7 @@ Objective-C and Objective-C++ Dialects}. -floop-parallelize-all -flto -flto-compression-level @gol -flto-partition=@var{alg} -flto-report -fmerge-all-constants @gol -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol +-fmodulo-sched-reg-pressure -fmodulo-sched-verbose=@var{n} @gol -fmove-loop-invariants fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg @gol -fno-default-inline @gol -fno-defer-pop -fno-function-cse -fno-guess-branch-probability @gol @@ -6474,6 +6475,16 @@ deleted which will trigger the generatio life-range analysis. This option is effective only with @option{-fmodulo-sched} enabled. +@item -fmodulo-sched-reg-pressure +@opindex fmodulo-sched-reg-pressure +Do not apply @option{-fmodulo-sched} to loops if the result would lead +to register spilling within the loop. +This option is effective only with @option{-fmodulo-sched} enabled. + +@item -fmodulo-sched-verbose=@var{n} +@opindex fmodulo-sched-verbose +Set up how verbose dump file for the SMS will be. + @item -fno-branch-count-reg @opindex fno-branch-count-reg Do not use ``decrement and branch'' instructions on a count register, Index: modulo-sched.h === --- modulo-sched.h (revision 0) +++ modulo-sched.h (revision 0) @@ -0,0 +1,120 @@ +/* Swing Modulo Scheduling implementation. + Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 + Free Software Foundation, Inc. + Contributed by Revital Eres revital.e...@linaro.org + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +http://www.gnu.org/licenses/. */ + +#ifndef GCC_SMS_H +#define GCC_SMS_H + +#include ddg.h + +extern HARD_REG_SET eliminable_regset; + +typedef struct partial_schedule *partial_schedule_ptr; + +typedef struct ps_insn *ps_insn_ptr; + +/* A single instruction in the partial schedule. */ +struct ps_insn +{ + /* Identifies the instruction to be scheduled. Values smaller than + the ddg's num_nodes refer directly to ddg nodes. A value of + X - num_nodes refers to register move X. */ + int id; + + /* The (absolute) cycle in which the PS instruction is scheduled. + Same as SCHED_TIME (node). */ + int cycle; + + /* The next/prev PS_INSN in the same row. */ + ps_insn_ptr next_in_row, + prev_in_row; + +}; + +/* Information about a register move that has been added to a partial + schedule. */ +struct ps_reg_move_info +{ + /* The source of the move is defined by the ps_insn with id DEF. + The destination is used by the ps_insns with the ids in USES. */ + int def; + sbitmap uses; + + /* The original form of USES' instructions used OLD_REG, but they + should now use NEW_REG. */ + rtx old_reg; + rtx new_reg; + + /* The number of