https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95001
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2025-09-01

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Comment #0 has been improved on the trunk.
A tail part of the loop is still being missed:
```
  bnd.12_53 = count_23(D) >> 2;
...
  <bb 7> [local count: 94607391]:
  niters_vector_mult_vf.13_54 = bnd.12_53 * 4;
  if (count_23(D) == niters_vector_mult_vf.13_54)
    goto <bb 9>; [25.00%]
  else
    goto <bb 8>; [75.00%]

  <bb 8> [local count: 81467477]:
  _60 = bnd.12_53 * 16;
```

I wonder if that is because we don't combine `count_23(D) >> 2` with
`bnd.12_53 * 4` to make `count_23(D) & ~0x3`, which should be just
`count_23(D)`.
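
For reference, the shift/multiply pair in the dump can be checked against the mask form in plain C. This is only a sketch and not taken from the testcase: it assumes `count` is an unsigned 32-bit value (the dump does not show the type), and the function names `round_down_shift_mul`, `round_down_mask`, `tail_is_empty` and `check_equivalence` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the GIMPLE above: bnd = count >> 2; niters_vector_mult_vf = bnd * 4.
   Assumption: count is unsigned 32-bit; the dump does not show its type. */
uint32_t round_down_shift_mul(uint32_t count) {
    uint32_t bnd = count >> 2;
    return bnd * 4u;
}

/* The combined form suggested in the comment: count & ~0x3. */
uint32_t round_down_mask(uint32_t count) {
    return count & ~UINT32_C(3);
}

/* count == (count & ~0x3) holds exactly when the low two bits are clear,
   i.e. when there are no leftover scalar iterations. */
int tail_is_empty(uint32_t count) {
    return (count & 3u) == 0;
}

/* Checks both identities over a small range plus the largest value. */
void check_equivalence(void) {
    for (uint32_t count = 0; count < 1024u; ++count) {
        assert(round_down_shift_mul(count) == round_down_mask(count));
        assert((count == round_down_shift_mul(count)) == tail_is_empty(count));
    }
    assert(round_down_shift_mul(UINT32_MAX) == round_down_mask(UINT32_MAX));
}
```

Calling `check_equivalence()` from a `main` exercises both forms. For an unsigned `count` the two always agree, so the comparison in `<bb 7>` is equivalent to testing the low two bits of `count`.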