> Hi,
>
> similarly to PR123629, the issue again stems from the fact that when
> propagating polymorphic contexts, when there are no known "values" in the
> corresponding lattice of the caller we use just the information on the
> edge and when there are some we combine them with that information, and
> from the fact that we iterate the propagation in strongly connected
> components of the call graph (SCCs).
>
> In the first iteration over such an SCC, we process the edge from
> unmark/1097720 to onChild/1097719 before we have determined the lattices
> of the caller.  In the second iteration, we already know what context
> there will be in unmark's first parameter and so can add a more precise
> value to the corresponding lattice of onChild.  Because we always add
> values to the lattices and never "improve" them, we get two values for
> the call.
>
> In PR123629, that actually described reality well because the caller's
> lattice had the variable flag set, without cloning the caller we could not
> assume we could use the caller's lattice value and so both values were
> possible (and in fact both cases happened, the problem was that their meet
> failed).
>
> In this case however, one could argue that the lattices contain wrong info
> or at least information that is misleading because the caller's lattice
> contains just that single "constant" and the variable flag is not set.  We
> know that regardless of cloning decisions for the caller the more precise
> derived value will be the case.  And indeed since I changed the cloning
> code to re-gather all constants for the given set of callers, that code
> arrives at the more precise context.  For the record they only differ in
> the fact that the more precise one has the dynamic flag cleared.
>
> I have thought about how to fix up the lattices in one way or another but
> so far it has always turned ugly.  Therefore this patch simply changes the
> verification to allow this situation because even though the final
> result is a bit more precise than what was expected, it is nevertheless
> correct.  There will not be any attempt to clone for the more precise
> context because all the call graph edges will have been redirected away.
> The only "issue" is that the less precise contexts take up space in the
> lattice, which has a limited length.  That should not be a problem in
> practice.
>
> I do not have a simple testcase, the issue was discovered by Sam when
> building Firefox with both LTO and PGO.  However Sam has reported in
> Bugzilla that with this patch the build succeeds.  Needless to say, I
> have bootstrapped, tested and LTO-profiledbootstrapped the patch.
>
> OK for master?
>
> Thanks,
>
> Martin
>
>
> gcc/ChangeLog:
>
> 2026-03-06  Martin Jambor  <[email protected]>
>
> 	PR ipa/124291
> 	* ipa-cp.cc (ipcp_val_replacement_ok_p): Allow more precise
> 	contexts than what the clone was originally intended for.

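For anyone following the thread without the IPA-CP internals in their head,
the effect described above can be reproduced in a toy model.  The sketch
below is not code from ipa-cp.cc: Context, Lattice, combine, propagate_edge
and at_least_as_precise are all hypothetical names, and the combine rule
(the result is dynamic only if both inputs are) is an assumption chosen to
mirror the "dynamic flag cleared" difference mentioned in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Toy stand-in for a polymorphic context (hypothetical)."""
    outer_type: str
    dynamic: bool  # True means the less precise "may be dynamic" state


class Lattice:
    """Toy lattice: values are only ever added, never improved in place."""

    def __init__(self):
        self.values = []

    def add(self, ctx):
        if ctx in self.values:
            return False
        self.values.append(ctx)
        return True


def combine(edge_ctx, caller_ctx):
    # Assumed rule: the combined context is dynamic only if both are.
    return Context(edge_ctx.outer_type,
                   edge_ctx.dynamic and caller_ctx.dynamic)


def propagate_edge(caller, callee, edge_ctx):
    # No known caller values: use the edge information alone.
    if not caller.values:
        return callee.add(edge_ctx)
    # Otherwise combine each caller value with the edge information.
    changed = False
    for value in caller.values:
        changed |= callee.add(combine(edge_ctx, value))
    return changed


def at_least_as_precise(actual, expected):
    # Assumed relaxed check: same type, and actual may only be dynamic
    # if expected was dynamic too.
    return (actual.outer_type == expected.outer_type
            and (not actual.dynamic or expected.dynamic))


unmark, on_child = Lattice(), Lattice()
edge = Context("Node", dynamic=True)

# First SCC iteration: the edge is processed before unmark's lattice is known.
propagate_edge(unmark, on_child, edge)
# Later in the same iteration unmark's lattice receives its single value.
unmark.add(Context("Node", dynamic=False))
# Second SCC iteration: the same edge now yields a more precise context.
propagate_edge(unmark, on_child, edge)
```

In this model on_child ends up holding both the dynamic and the non-dynamic
context, and a check in the spirit of at_least_as_precise accepts the second
where the first was expected, but not the other way round.
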
We discussed the patch in person yesterday.  I think it is the best
option for stage3, so the patch is OK.

We probably want to revisit this for stage1, especially if we are to add
cloning for value ranges, which will hit similar issues of possibly
improving the estimate across an SCC.

Honza
