https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89774

JunMa <JunMa at linux dot alibaba.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |JunMa at linux dot alibaba.com

--- Comment #10 from JunMa <JunMa at linux dot alibaba.com> ---
(In reply to Segher Boessenkool from comment #9)
> We currently only do it for trivial cases, as the example in comment 6 shows
> as well.  This is done during expand, which is the wrong place for it.
> 
> PR90070 is asking for better optimisation of this: do the operation in single
> precision, and use single-precision constants, if this does not change the
> result (or there is some -ffast-math option).
> 
> PR22326 is also closely related.  I don't think we can close any of these PRs
> as a dup of another, they are all asking for slightly different things :-)

Clang does this optimization in its instcombine pass. See this case:

  float f4( float x ) {double t = x + 2.0; return  t; }
  float f5( float x ) {return  x + 2.0;  }

Compiled with -O2 -march=native, GCC gives:

f4:
        vcvtss2sd  %xmm0, %xmm0, %xmm0
        vaddsd     .LC1(%rip), %xmm0, %xmm0
        vcvtsd2ss  %xmm0, %xmm0, %xmm0
        ret

f5:
        vaddss     .LC3(%rip), %xmm0, %xmm0
        ret

while clang emits a single vaddss instruction for both functions.
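
For what it is worth, the narrowing looks safe for this particular case even
without -ffast-math: 2.0 is exactly representable as a float, and double has
more than 2*24+2 significand bits, so rounding the double sum back to float
should give the same value as a single-precision add. Below is a small sketch
in C that checks this on a strided sample of float bit patterns. It assumes
IEEE 754 binary32/binary64, round-to-nearest, and FLT_EVAL_METHOD == 0 (e.g.
x86-64 with SSE). add_single, same_float and count_mismatches are names made
up for illustration; they are not taken from GCC or LLVM.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The two functions from the comment above. */
float f4(float x) { double t = x + 2.0; return t; }
float f5(float x) { return x + 2.0; }

/* Hypothetical reference: the same addition done directly in single precision. */
float add_single(float x) { return x + 2.0f; }

/* Compare bit patterns so that +0.0 and -0.0 are told apart;
   any two NaNs are treated as equal, since payloads may differ. */
static int same_float(float a, float b)
{
    uint32_t ua, ub;

    if (a != a && b != b)
        return 1;
    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);
    return ua == ub;
}

/* Visit every stride-th 32-bit pattern, reinterpret it as a float, and count
   the inputs where f4 or f5 disagrees with the single-precision add. */
unsigned long count_mismatches(uint32_t stride)
{
    unsigned long bad = 0;
    uint64_t bits;

    assert(stride != 0); /* a zero stride would never terminate */
    for (bits = 0; bits <= UINT32_MAX; bits += stride) {
        uint32_t u = (uint32_t)bits;
        float x;

        memcpy(&x, &u, sizeof x);
        if (!same_float(f4(x), add_single(x)) ||
            !same_float(f5(x), add_single(x)))
            bad++;
    }
    return bad;
}
```

Under the assumptions above, count_mismatches(9973) is expected to return 0.
A stride of 1 covers every bit pattern but takes noticeably longer.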
