On Tue, Jul 9, 2019 at 11:56 AM Jan Hubicka <hubi...@ucw.cz> wrote: > > > Hi. > > > > I'm suggesting to restrict LOOP_ALIGN to only loop headers. That are the > > basic blocks for which it makes the biggest sense. I quite some binary > > size reductions on SPEC2006 and SPEC2017. Speed numbers are also slightly > > positive. > > > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > > > Ready to be installed? > The original idea of distinction between jump alignment and loop > alignment was that they have two basic meanings: > 1) jump alignment is there to avoid jumping just to the end of decode > window (if the window is aligned) so CPU will get stuck after reaching > the jump and also to possibly reduce code cache polution by populating > by code that is not executed > 2) loop alignment aims to fit loop in as few cache windows as possible > > Now if you have loop laid in a way that header of loop is not first > basic block, 2) IMO still apply. I.e. > > jump loop > :loopback > loop body > :loop > if cond jump to loopback > > So dropping loop alignment for those does not seem to make much sense > from high level. We may want to have differnt alignment for loops > starting by header and loops starting in the middle, but I still liked > more your patch which did bundles for loops. > > modern x86 chips are not very good testing targets on it. I guess > generic changes to alignment needs to be tested on other chips too. > > Honza > > Thanks, > > Martin > > > > gcc/ChangeLog: > > > > 2019-07-09 Martin Liska <mli...@suse.cz> > > > > * final.c (compute_alignments): Apply the LOOP_ALIGN only > > to basic blocks that all loop headers. > > --- > > gcc/final.c | 1 + > > 1 file changed, 1 insertion(+) > > > > > > > diff --git a/gcc/final.c b/gcc/final.c > > index fefc4874b24..ce2678da988 100644 > > --- a/gcc/final.c > > +++ b/gcc/final.c > > @@ -739,6 +739,7 @@ compute_alignments (void) > > if (has_fallthru > > && !(single_succ_p (bb) > > && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun)) > > + && bb->loop_father->header == bb
I agree that the above is the wrong condition - but I'm not sure we only end up using LOOP_ALIGN for blocks reached by a DFS_BACK edge. Note that DFS_BACK would have to be applied considering the current CFG layout, simply doing mark_dfs_back_edges doesn't work (we're in CFG layout mode here, no?). Eventually the code counting brances effectively already does this though. The odd thing is that we apply LOOP_ALIGN only to blocks that have a fallthru incoming edge. I don't see Honzas example above having one. > > && optimize_bb_for_speed_p (bb) > > && branch_count + fallthru_count > count_threshold > > && (branch_count > > > > >