Robin Dapp via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> Thanks for looking at it in detail.
>
>> Yeah, I think this is potentially a blocker for propagating A into B
>> when A is used elsewhere. Combine is able to combine A and B while
>> keeping A in parallel with the result. I think either fwprop would
>> need to try that too, or it would need to be restricted to cases where A
>> is only used in B.
>
> That seems a rather severe limitation and my original use case would
> not get optimized considerably anymore. The intention was to replace
> all uses (if register pressure allows). Of course the example is simple
> enough that a propagation is always useful if the costs allow it, so
> it might not be representative.
>
> I'm wondering if we could (my original misunderstanding) tentatively
> try to propagate into all uses of a definition and, when reaching
> a certain ratio, decide that it might be worth it, otherwise revert.
> Would be very crude though, and not driven by the actual problem we're
> trying to avoid.
>
>> I think the summary is:
>>
>> IMO, we have to be mindful that combine is still to run. We need to
>> avoid making equal-cost changes if the new form is more complex, or
>> otherwise likely to interfere with combine.
>
> I guess we don't have a good measure for complexity or "combinability"
> and even lower-cost changes could result in worse options later.
> Would it make sense to have a strict less-than cost policy for those
> more complex propagations? Or do you consider the approach in its
> current shape "hopeless", given the complications we discussed?
>
>> Alternatively, we could delay the optimisation until after combine
>> and have freer rein, since we're then just mopping up opportunities
>> that other passes left behind.
>>
>> A while back I was experimenting with a second combine pass. That was
>> the original motivation for rtl-ssa. I never got chance to finish it
>> off though.
>
> This doesn't sound like something that would still materialize before
> the end of stage 1 :)
> Do you see any way of restricting the current approach to make it less
> intrusive and still worthwhile? Limiting to vec_duplicate might be
> much too arbitrary but would still help for my original example.

FWIW, I sent an RFC for a late-combine pass that might help:

  https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631406.html

I think it'll need some tweaking for your use case, but hopefully
it's "just" a case of expanding the register pressure tests.

Thanks,
Richard