https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89774
JunMa <JunMa at linux dot alibaba.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |JunMa at linux dot alibaba.com

--- Comment #10 from JunMa <JunMa at linux dot alibaba.com> ---
(In reply to Segher Boessenkool from comment #9)
> We currently only do it for trivial cases, as the example in comment 6 shows
> as well.  This is done during expand, which is the wrong place for it.
>
> PR90070 is asking for better optimisation of this: do the operation in single
> precision, and use single-precision constants, if this does not change the
> result (or there is some -ffast-math option).
>
> PR22326 is also closely related.  I don't think we can close any of these PRs
> as a dup of another, they are all asking for slightly different things :-)

Clang can do this optimization in its instcombine pass. See this case:

float f4( float x ) {double t = x + 2.0; return t; }
float f5( float x ) {return x + 2.0; }

Compiled with -O2 -march=native, GCC gives:

f4:
        vcvtss2sd       %xmm0, %xmm0, %xmm0
        vaddsd  .LC1(%rip), %xmm0, %xmm0
        vcvtsd2ss       %xmm0, %xmm0, %xmm0
        ret
f5:
        vaddss  .LC3(%rip), %xmm0, %xmm0
        ret

while clang always emits a vaddss instruction.
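
As a rough sanity check on why the narrowing should be safe here even without -ffast-math: 2.0 is exactly representable in single precision, and double carries enough extra precision that rounding a float+float sum first to double and then to float should match a single float add. The sketch below compares the two forms bit for bit over a strided sample of finite inputs. It assumes IEEE 754 float/double and the default round-to-nearest mode; the names add2_wide, add2_narrow, float_bits, count_mismatches and check_narrowing are made up for illustration and are not taken from GCC or clang.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Wide form, the shape of f4: convert to double, add, convert back.
   The volatile temporary keeps the double-precision step explicit so the
   compiler cannot narrow it away before we compare. */
static float add2_wide(float x)
{
    volatile double t = (double)x + 2.0;
    return (float)t;
}

/* Narrow form: one single-precision add with a single-precision constant. */
static float add2_narrow(float x)
{
    volatile float t = x + 2.0f;
    return t;
}

/* Illustrative helper: raw bit pattern of a float. */
static uint32_t float_bits(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Count inputs where the two forms differ, visiting every stride-th
   bit pattern and skipping infinities and NaNs. */
static unsigned long count_mismatches(uint32_t stride)
{
    unsigned long bad = 0;
    for (uint64_t i = 0; i <= 0xFFFFFFFFu; i += stride) {
        uint32_t u = (uint32_t)i;
        float x;
        if ((u & 0x7F800000u) == 0x7F800000u)
            continue;                       /* exponent all ones: inf or NaN */
        memcpy(&x, &u, sizeof x);
        if (float_bits(add2_wide(x)) != float_bits(add2_narrow(x)))
            bad++;
    }
    return bad;
}

/* Sampled check only; this is evidence, not a proof. */
static void check_narrowing(void)
{
    assert(count_mismatches(65537u) == 0);
}
```

Calling check_narrowing() from a small driver exercises about 65 thousand inputs; a smaller stride covers more of the input space at the cost of run time. This says nothing about non-default rounding modes or about constants that are not exactly representable as float.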