On Tue, Jul 9, 2019 at 12:22 PM Richard Biener <richard.guent...@gmail.com> wrote: > > On Tue, Jul 9, 2019 at 11:56 AM Jan Hubicka <hubi...@ucw.cz> wrote: > > > > > Hi. > > > > > > I'm suggesting to restrict LOOP_ALIGN to only loop headers. That are the > > > basic blocks for which it makes the biggest sense. I quite some binary > > > size reductions on SPEC2006 and SPEC2017. Speed numbers are also slightly > > > positive. > > > > > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > > > > > Ready to be installed? > > The original idea of distinction between jump alignment and loop > > alignment was that they have two basic meanings: > > 1) jump alignment is there to avoid jumping just to the end of decode > > window (if the window is aligned) so CPU will get stuck after reaching > > the jump and also to possibly reduce code cache polution by populating > > by code that is not executed > > 2) loop alignment aims to fit loop in as few cache windows as possible > > > > Now if you have loop laid in a way that header of loop is not first > > basic block, 2) IMO still apply. I.e. > > > > jump loop > > :loopback > > loop body > > :loop > > if cond jump to loopback > > > > So dropping loop alignment for those does not seem to make much sense > > from high level. We may want to have differnt alignment for loops > > starting by header and loops starting in the middle, but I still liked > > more your patch which did bundles for loops. > > > > modern x86 chips are not very good testing targets on it. I guess > > generic changes to alignment needs to be tested on other chips too. > > > > Honza > > > Thanks, > > > Martin > > > > > > gcc/ChangeLog: > > > > > > 2019-07-09 Martin Liska <mli...@suse.cz> > > > > > > * final.c (compute_alignments): Apply the LOOP_ALIGN only > > > to basic blocks that all loop headers. > > > --- > > > gcc/final.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > > > > > > diff --git a/gcc/final.c b/gcc/final.c > > > index fefc4874b24..ce2678da988 100644 > > > --- a/gcc/final.c > > > +++ b/gcc/final.c > > > @@ -739,6 +739,7 @@ compute_alignments (void) > > > if (has_fallthru > > > && !(single_succ_p (bb) > > > && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun)) > > > + && bb->loop_father->header == bb > > I agree that the above is the wrong condition - but I'm not sure we > only end up using LOOP_ALIGN for blocks reached by a DFS_BACK > edge. Note that DFS_BACK would have to be applied considering > the current CFG layout, simply doing mark_dfs_back_edges doesn't > work (we're in CFG layout mode here, no?).
So a "backedge" in this sense would be e->dest->index < e->src->index. No? > Eventually the code > counting brances effectively already does this though. > > The odd thing is that we apply LOOP_ALIGN only to blocks that > have a fallthru incoming edge. I don't see Honzas example > above having one. > > > > && optimize_bb_for_speed_p (bb) > > > && branch_count + fallthru_count > count_threshold > > > && (branch_count > > > > > > > > >