https://llvm.org/bugs/show_bug.cgi?id=31872
Bug ID:         31872
Summary:        Complex division is not optimised with -ffast-math
Product:        new-bugs
Version:        trunk
Hardware:       PC
OS:             Linux
Status:         NEW
Severity:       normal
Priority:       P
Component:      new bugs
Assignee:       unassignedb...@nondot.org
Reporter:       drr...@gmail.com
CC:             llvm-bugs@lists.llvm.org
Classification: Unclassified

Consider:

    #include <complex.h>
    complex float f(complex float x, complex float y) {
      return x/y;
    }

clang trunk with -O3 -march=core-avx2 but with or without -ffast-math gives:

    f:                                      # @f
            vmovaps   xmm2, xmm1
            vmovshdup xmm1, xmm0            # xmm1 = xmm0[1,1,3,3]
            vmovshdup xmm3, xmm2            # xmm3 = xmm2[1,1,3,3]
            jmp       __divsc3              # TAILCALL

However both gcc and ICC attempt to optimise this code when -ffast-math (or
equivalent) is enabled. ICC appears to give the fastest code which is:

    f:
            vcvtps2pd      xmm2, xmm1            #3.12
            vcvtps2pd      xmm4, xmm0            #3.12
            vmulpd         xmm8, xmm2, xmm2      #3.12
            vunpckhpd      xmm3, xmm2, xmm2      #3.12
            vmulpd         xmm6, xmm3, xmm4      #3.12
            vmovddup       xmm7, xmm2            #3.12
            vshufpd        xmm5, xmm4, xmm4, 1   #3.12
            vshufpd        xmm9, xmm8, xmm8, 1   #3.12
            vfmaddsub213pd xmm7, xmm5, xmm6      #3.12
            vaddpd         xmm11, xmm8, xmm9     #3.12
            vshufpd        xmm10, xmm7, xmm7, 1  #3.12
            vdivpd         xmm12, xmm10, xmm11   #3.12
            vcvtpd2ps      xmm0, xmm12           #3.12
            ret
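
For reference, the ICC sequence above appears to correspond to the textbook
formula (a+bi)/(c+di) = ((ac+bd) + (bc-ad)i) / (c^2+d^2), evaluated in double
precision and rounded back to float. The following is only a rough C sketch of
that reading; the name div_naive is hypothetical and is not part of the report
or of any compiler runtime:

```c
#include <complex.h>

/* Hypothetical helper (an assumption for illustration, not from the report):
 * textbook complex division, widened to double as the vcvtps2pd/vcvtpd2ps
 * pair in the ICC listing suggests. Unlike __divsc3, it does no scaling and
 * no NaN/Inf recovery, so it is only reasonable under fast-math-style
 * assumptions (finite inputs, no overflow in c*c + d*d). */
float complex div_naive(float complex x, float complex y) {
    double a = crealf(x), b = cimagf(x);   /* x = a + bi */
    double c = crealf(y), d = cimagf(y);   /* y = c + di */
    double denom = c * c + d * d;          /* |y|^2 */
    double re = (a * c + b * d) / denom;   /* real part of x/y */
    double im = (b * c - a * d) / denom;   /* imaginary part of x/y */
    /* Build the result from finite parts; rounding to float happens here. */
    return (float)re + (float)im * I;
}
```

For finite inputs such as (1+2i)/(3+4i) this should agree with the result of
x/y up to float rounding; the two are expected to differ for inputs where
__divsc3 performs its special-case handling.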