11.02.2020 11:01, Richard Biener wrote:
> On Tue, 11 Feb 2020, Segher Boessenkool wrote:
>
>> On Tue, Feb 11, 2020 at 08:34:15AM +0100, Richard Biener wrote:
>>> On Mon, 10 Feb 2020, Segher Boessenkool wrote:
>>>> Yes, we should decide how often we want to unroll things somewhere before
>>>> ivopts already, and just use that info here.
>>>>
>>>> Or are there advantage to doing it *in* ivopts? It sounds like doing
>>>> it there is probably expensive, but maybe not, and we need to do similar
>>>> analysis there anyway.
>>> Well, if the only benefit of doing the unrolling is that IVs get
>>> cheaper then yes, IVOPTs should drive it.
>> We need to know much earlier in the pass pipeline how often a loop will
>> be unrolled. We don't have to *do* it early.
>>
>> If we want to know it before ivopts, then obviously it has to be done
>> earlier. Otherwise, maybe it is a good idea to do it in ivopts itself.
>> Or maybe not. It's just an idea :-)
>>
>> We know we do not want it *later*, ivopts needs to know this to make
>> good decisions of its own.
>>
>>> But usually unrolling exposes redundancies (catched by predictive
>>> commoning which drives some unrolling) or it enables better use
>>> of CPU resources via scheduling (only catched later in RTL).
>>> For scheduling we have the additional complication that the RTL
>>> side doesn't have as much of a fancy data dependence analysis
>>> framework as on the GIMPLE side. So I'd put my bet on trying to
>>> move something like SMS to GIMPLE and combine it with unrolling
>>> (IIRC SMS at most interleaves 1 1/2 loop iterations).

To clarify, without specifying -fmodulo-sched-allow-regmoves it only
interleaves 2 iterations. With register moves enabled more iterations
can be considered.

> SMS on RTL always was quite disappointing...

Hmm, even when trying to move it just a few passes earlier many years ago,
I got another opinion:
https://gcc.gnu.org/ml/gcc-patches/2011-10/msg01526.html
Although without such a move we still have annoying issues which RTL
folks can't solve, see e.g.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93264#c2

> It originally came with "data dependence export from GIMPLE to RTL"
> that never materialized so I'm not surprised ;) It also relies
> on doloop detection.

My current attempt to drop the doloop dependency is still WIP; hopefully
I'll create a branch in refs/users/ in a month or so. But older (gcc-7 and
earlier) versions are available, see
https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01647.html
Doloops are still supported for some kind of backward compatibility, but
many more loops (those which loop-iv can analyze) are considered in the
new SMS.

>> Do you expect it will be more useful on Gimple? Moving it there is a
>> good idea in any case ;-)
>>
>> I don't quite see the synergy between SMS and loop unrolling, but maybe
>> I need to look harder.
> As said elsewhere I don't believe in actual unrolling doing much good
> but in removing data dependences in the CPU pipeline. SMS rotates
> the loop, peeling N iterations (and somehow I think for N > 1 that
> should better mean unrolling the loop body).

Yes, this is what theory tells us.

> Of course doing "scheduling" on GIMPLE is "interesting" in its own
> but OTOH our pipeline DFAs are imprecise enough that one could even
> devise some basic GIMPLE <-> "RTL" mapping to make use of it. But
> then scheduling without IVs or register pressure in mind is somewhat
> pointless as well.

Unfortunately, even with -fmodulo-sched-allow-regmoves it doesn't
interact much with register pressure.

> That said - if I had enough time I'd still thing that investigating
> "scheduling on GIMPLE" as replacement for sched1 is an interesting
> thing to do.

Sounds good, but IMHO the modulo scheduler is not the best choice as the
first step in implementing such a concept.

Roman
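
Illustration (hand-written, not GCC output): for readers who want to see
what "rotating the loop" so that two iterations are interleaved looks like
in the thread above, here is a minimal C sketch. The function names
scale_plain and scale_pipelined, and the split of the body into a "load"
stage and a "compute and store" stage, are assumptions made up for this
example; the real SMS pass works on RTL and picks stages from its own
dependence and resource analysis.

```c
#include <assert.h>
#include <stddef.h>

/* Reference loop: each iteration runs load -> compute/store in sequence. */
void scale_plain(const int *x, int *y, size_t n, int a, int b)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++) {
        int v = x[i];        /* stage 0: load */
        y[i] = a * v + b;    /* stage 1: compute and store */
    }
}

/* Hand-rotated version (hypothetical illustration): the load of
   iteration i+1 is issued alongside the compute/store of iteration i,
   so two iterations are in flight inside the kernel. */
void scale_pipelined(const int *x, int *y, size_t n, int a, int b)
{
    assert(x != NULL && y != NULL);
    if (n == 0)
        return;

    int v = x[0];                 /* prologue: stage 0 of iteration 0 */
    for (size_t i = 0; i + 1 < n; i++) {
        int next = x[i + 1];      /* kernel: stage 0 of iteration i+1 */
        y[i] = a * v + b;         /* kernel: stage 1 of iteration i   */
        v = next;                 /* copy the loaded value forward    */
    }
    y[n - 1] = a * v + b;         /* epilogue: stage 1 of last iteration */
}
```

Both functions produce the same output array; the second one only changes
the order in which work from neighbouring iterations is issued, at the cost
of one peeled prologue, one peeled epilogue, and one extra live value.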