Thanks Revital; it may actually be better to include the scheduling of the
branch even where the doloop pattern is simple, as it rightfully accounts
for its resources.
> OK for mainline?
Yes. I have only the following minor comments:
+/* Bump the SCHED_TIMEs of all nodes by AMOUNT. Set the values of
+ SCHED_ROW and SCHED_STAGE. */
please clarify that, e.g., instruction scheduled on cycle AMOUNT will move
to cycle zero.
+ /* The calculation of stage count is done adding the number
+ of stages before cycle zero and after cycle zero. */
+ sc_until_cycle_zero = CALC_STAGE_COUNT (-1, new_min_cycle, ii);
+
+ if (SCHED_TIME (u) < 0)
+ {
+ stage = CALC_STAGE_COUNT (-1, SCHED_TIME (u), ii);
+ SCHED_STAGE (u) = sc_until_cycle_zero - stage;
+ }
+ else
+ {
+ stage = CALC_STAGE_COUNT (SCHED_TIME (u), 0, ii);
+ SCHED_STAGE (u) = sc_until_cycle_zero + stage - 1;
+ }
shouldn't normalized_time be used here instead of SCHED_TIME (u)?
In any case, an alternative way of calculating the stage count of u can be
(iinm):
CALC_STAGE_COUNT (normalized_time, new_zero_kelvin, ii);
where
new_zero_kelvin = new_min_cycle - SMODULO (new_min_cycle, ii);
As you've pointed out, this change may effect code size (by increasing the
stage count) and/or performance (by increasing the ii, although the branch
might not have had an available slot originally). So it may be interesting
to share comparative measurements when available.
Ayal.
Revital Eres <[email protected]> wrote on 08/05/2011 07:37:07 AM:
> From: Revital Eres <[email protected]>
> To: Ayal Zaks/Haifa/IBM@IBMIL
> Cc: [email protected], Patch Tracking <[email protected]>
> Date: 08/05/2011 07:37 AM
> Subject: [PATCH, SMS 1/3] Support closing_branch_deps (second try)
>
> Hello,
>
> The attached patch includes enhancements for SMS to support targets
> that their doloop part is not decoupled from the rest of the loop's
> instructions, as SMS currently requires. In this case, the branch can
> not be placed wherever we want (as is currently done) due to the fact
> it must honor dependencies and thus we schedule the branch instruction
> with the rest of the loop's instructions and rotate it to be in row
> ii-1 at the end of the scheduling procedure to make sure it's the last
> instruction in the iteration.
>
> The attached patch changes the current implementation to always schedule
> the branch in order to support the above case.
>
> As explained in http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00250.html
> by always scheduling the branch the code size might be effected due to
> the fact SC can be increased by 1, which means adding instructions from
> at most one iteration to the prologue and epilogue. Also, it might
> be that ii will be increased by one due to resources constraints --
> unavailability of free slots to schedule the branch.
>
> The patch was tested together with the rest of the patches in this series
> and on top of the patch to support do-loop for ARM (not yet in mainline,
> but approved http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01718.html).
> On ppc64-redhat-linux regtest as well as bootstrap with SMS flags
> enabling SMS also on loops with stage count 1. Regtested on SPU.
> On arm-linux-gnueabi regtseted on c,c++. Bootstrap c language with SMS
> flags enabling SMS also on loops with stage count 1.
>
> OK for mainline?
>
> Thanks,
> Revital
>
> ChangeLog:
>
> * ddg.c (create_ddg_dep_from_intra_loop_link): If a true dep edge
> enters the branch create an anti edge in the opposite direction
> to prevent the creation of reg-moves.
> * modulo-sched.c: Adjust comment to reflect the fact we are
> scheduling closing branch.
> (PS_STAGE_COUNT): Rename to CALC_STAGE_COUNT and redefine.
> (stage_count): New field in struct partial_schedule.
> (calculate_stage_count): New function.
> (normalize_sched_times): Rename to reset_sched_times and handle
> incrementing the sched time of the nodes by a constant value
> passed as parameter.
> (duplicate_insns_of_cycles): Skip closing branch.
> (sms_schedule_by_order): Schedule closing branch.
> (ps_insn_find_column): Handle closing branch.
> (sms_schedule): Call reset_sched_times and adjust the code to
> support scheduling of the closing branch.
> (ps_insert_empty_row): Update calls to normalize_sched_times
> and rotate_partial_schedule functions.
>
> testsuite Changlog:
>
> * gcc.target/arm/sms-9.c: New file.
> * gcc.target/arm/sms-10.c: New file.
> [attachment "patch_final_linaro_6_5.txt" deleted by Ayal Zaks/Haifa/IBM]
> [attachment "sms-9.c" deleted by Ayal Zaks/Haifa/IBM] [attachment
"sms-10.c"
> deleted by Ayal Zaks/Haifa/IBM]