On Tue, Jul 9, 2019 at 12:23 PM Richard Biener <richard.guent...@gmail.com> wrote: > > On Tue, Jul 9, 2019 at 12:22 PM Richard Biener > <richard.guent...@gmail.com> wrote: > > > > On Tue, Jul 9, 2019 at 11:56 AM Jan Hubicka <hubi...@ucw.cz> wrote: > > > > > > > Hi. > > > > > > > > I'm suggesting to restrict LOOP_ALIGN to only loop headers. That are the > > > > basic blocks for which it makes the biggest sense. I quite some binary > > > > size reductions on SPEC2006 and SPEC2017. Speed numbers are also > > > > slightly > > > > positive. > > > > > > > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > > > > > > > Ready to be installed? > > > The original idea of distinction between jump alignment and loop > > > alignment was that they have two basic meanings: > > > 1) jump alignment is there to avoid jumping just to the end of decode > > > window (if the window is aligned) so CPU will get stuck after reaching > > > the jump and also to possibly reduce code cache polution by populating > > > by code that is not executed > > > 2) loop alignment aims to fit loop in as few cache windows as possible > > > > > > Now if you have loop laid in a way that header of loop is not first > > > basic block, 2) IMO still apply. I.e. > > > > > > jump loop > > > :loopback > > > loop body > > > :loop > > > if cond jump to loopback > > > > > > So dropping loop alignment for those does not seem to make much sense > > > from high level. We may want to have differnt alignment for loops > > > starting by header and loops starting in the middle, but I still liked > > > more your patch which did bundles for loops. > > > > > > modern x86 chips are not very good testing targets on it. I guess > > > generic changes to alignment needs to be tested on other chips too. > > > > > > Honza > > > > Thanks, > > > > Martin > > > > > > > > gcc/ChangeLog: > > > > > > > > 2019-07-09 Martin Liska <mli...@suse.cz> > > > > > > > > * final.c (compute_alignments): Apply the LOOP_ALIGN only > > > > to basic blocks that all loop headers. > > > > --- > > > > gcc/final.c | 1 + > > > > 1 file changed, 1 insertion(+) > > > > > > > > > > > > > > > diff --git a/gcc/final.c b/gcc/final.c > > > > index fefc4874b24..ce2678da988 100644 > > > > --- a/gcc/final.c > > > > +++ b/gcc/final.c > > > > @@ -739,6 +739,7 @@ compute_alignments (void) > > > > if (has_fallthru > > > > && !(single_succ_p (bb) > > > > && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun)) > > > > + && bb->loop_father->header == bb > > > > I agree that the above is the wrong condition - but I'm not sure we > > only end up using LOOP_ALIGN for blocks reached by a DFS_BACK > > edge. Note that DFS_BACK would have to be applied considering > > the current CFG layout, simply doing mark_dfs_back_edges doesn't > > work (we're in CFG layout mode here, no?). > > So a "backedge" in this sense would be e->dest->index < e->src->index. > No?
To me the following would make sense. Index: gcc/final.c =================================================================== --- gcc/final.c (revision 273294) +++ gcc/final.c (working copy) @@ -669,6 +669,7 @@ compute_alignments (void) { rtx_insn *label = BB_HEAD (bb); bool has_fallthru = 0; + bool has_backedge = 0; edge e; edge_iterator ei; @@ -693,6 +694,8 @@ compute_alignments (void) has_fallthru = 1, fallthru_count += e->count (); else branch_count += e->count (); + if (e->src->index > bb->index) + has_backedge = 1; } if (dump_file) { @@ -736,7 +739,7 @@ compute_alignments (void) } /* In case block is frequent and reached mostly by non-fallthru edge, align it. It is most likely a first block of loop. */ - if (has_fallthru + if (has_backedge && !(single_succ_p (bb) && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun)) && optimize_bb_for_speed_p (bb)