https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121993
--- Comment #6 from Filip Kastl <pheeck at gcc dot gnu.org> ---
(In reply to cuilili from comment #5)
> command line), Both binary executed 1.01s. Since this is a "-i test",
> runtime is very short, even slight fluctuations can have a significant
> impact on performance. I also used "-i ref", both binary executed 166s. So I

I actually use "-i ref" when benchmarking :). I've mentioned using "-i test" in comment #4 only because I was comparing binaries and didn't need to run the full benchmark.

If I understand correctly, you're saying that the slowdowns happen in sections of the benchmark binary where both r16-3484 and r16-3485 versions contain the same instructions except for sometimes using different registers. That's what I see in the example you've provided.

In that case I don't know if anything can be done. I'm inclined to leave this bug open for now, though. Maybe later (stage 3/4?) someone will be able to decide if anything can be improved here or if we should just close this bug.
