On Fri, Dec 02, 2016 at 09:47:00AM +0100, Richard Biener wrote: > >> STC tries to make as large as possible consecutive "traces", mainly to > >> help with instruction cache utilization and hit rate etc. It cannot do > >> a very good job if it isn't allowed to copy blocks. > >> > >> "simple" tries to (dynamically) have as many fall-throughs as possible, > >> i.e. as few jumps as possible. > > I hope it special-cases backedges, that is, not make those fallthru.
It doesn't, and that is a good thing. Firstly, what is classified as backedge doesn't make too much sense in some cases; quite often the cfg isn't reducible. Secondly, for simple loops it does not matter: the backedge has lower frequency than the forward edges, so everything works out as you want. Thirdly, consider this loop: | S<---- | | A | / \ | B C | \ / | D | | | E----- | F where the backedge now has higher frequency than A->B and A->C. With simple this ends up as: S: ...; goto A D: ... E: ...; if (not again) goto F S: ... A: ...; if () goto C B: ...; goto D C: ...; goto D and this is cheaper than the alternative: S: ... A: ...; if () goto C B: ... D: ... E: ...; if (again) goto A F: C: ...; goto D If a fraction f of the loop iterations does C (so 1-f does B), the alternative has 1+2f jumps per iteration; the code simple makes has 1+f. > > I think we've probably discussed this more than is really necessary. We > > just need to pick an alternative and go with it, I think either alternative > > is reasonable (avoid copying when STC has no length information or fall back > > to simple when there is no length information). > > The description of both algorithms doesn't make it an obvious pick. So lets > leave the decision of the algorithm to the target and make STC beave sane > with no length information (whether that means disallowing any copying or > substituting a "default" length is another question -- but obviously it's the > targets fault to not provide lenght information). I agree. Segher