https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922
--- Comment #9 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(In reply to Jan Kratochvil from comment #8)
> The revert makes it 13x faster. But the produced code still falls back to
> calling glibc fmod() as shown in the disassembly in Comment 0.
> If I use the "fprem" instruction directly it gets 15x faster - but I did not
> figure out some (easy) way for me how to patch GCC to no longer produce the
> call to fmod() at all and produce only the "fprem" instruction.

You just need to pass -fno-math-errno (the call is for setting errno, similar
to how gcc emits the sqrt() sequence).

> (In reply to Alexander Monakov from comment #4)
> > Plus, Glibc does use fprem/fprem1 for fmodl/remainderl on x86_64,
>
> It is true replacing fmod() with fmodl() makes it 5x faster (but only 5x).
> There is still some infinity check and I haven't found any real
> justification in glibc sources for it:
>   if (__builtin_expect (isinf (x) || y == 0.0L, 0)
>       && _LIB_VERSION != _IEEE_ && !isnan (y) && !isnan (x))
>     /* fmod(+-Inf,y) or fmod(x,0) */
>     return __kernel_standard_l (x, y, 227);

This is for legacy/fancy error handling beyond setting IEEE exception flags.