https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70773
wilco at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING
                 CC|                            |wilco at gcc dot gnu.org

--- Comment #10 from wilco at gcc dot gnu.org ---
I can't reproduce any of this. GCC6 and GCC7 always use smull for the
divisions on ARM, even with profile-use. I could only make GCC emit a library
call by using -Os on a CPU that doesn't have divide, but that is expected and
correct.

On AArch64 I get > 20% speedup with -fprofile-use vs plain -O3, so it works as
expected. With -mcpu=cortex-a53 there are more uses of sdiv, but the profiled
version is still faster.

So without more details I don't see any issue here.
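
For readers unfamiliar with why a division would show up as smull: a signed division by a compile-time constant can be replaced by a 32x32->64 multiply with a precomputed reciprocal, followed by a shift and a sign correction. The sketch below is only an illustration of that general technique, not the test case from this report. The function name div10 and the choice of divisor 10 are assumptions made for the example, and it relies on arithmetic right shift of negative values, which GCC provides but ISO C leaves implementation-defined.

```c
#include <stdint.h>

/* Hypothetical illustration: signed n / 10 without a divide instruction.
 * 0x66666667 is ceil(2^34 / 10); the 64-bit product is what a
 * 32x32->64 signed multiply such as smull produces. */
int32_t div10(int32_t n)
{
    int64_t prod = (int64_t)n * 0x66666667LL; /* widening signed multiply */
    int32_t q = (int32_t)(prod >> 34);        /* floor(n / 10) for n >= 0 */
    /* n >> 31 is -1 for negative n (arithmetic shift assumed), so this
     * adds 1 and turns the floor into C's round-toward-zero result. */
    return q - (n >> 31);
}
```

With this definition, div10(100) gives 10 and div10(-7) gives 0, matching what n / 10 yields in C. A division by a run-time value cannot be rewritten this way, which is where sdiv or a library call comes in.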