http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47754

--- Comment #3 from Matthias Kretz <kretz at kde dot org> 2011-02-15 16:40:38 UTC ---
ICC??? Whatever, I stopped trusting that compiler long ago:
<unaligned()>:
vmovups 0x2039b8(%rip),%xmm0
vmovups 0x2039b4(%rip),%xmm1
vinsertf128 $0x1,0x2039b6(%rip),%ymm0,%ymm2
vinsertf128 $0x1,0x2039b0(%rip),%ymm1,%ymm3
vaddps %ymm3,%ymm2,%ymm4
vmovups %ymm4,0x20399c(%rip)
vzeroupper
retq

<aligned()>:
vmovups 0x203978(%rip),%ymm0
vaddps 0x203974(%rip),%ymm0,%ymm1
vmovups %ymm1,0x203974(%rip)
vzeroupper
retq

Nice optimization of unaligned loads there... not.


Just a small side note for your enjoyment: I wrote a C++ abstraction for SSE,
and with GCC it gives an almost four-fold speedup for Mandelbrot. ICC, on the
other hand, compiles it into such awful code that, even though it uses SSE, the
result is a four-fold slowdown compared to the non-SSE code.
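
For readers who have not seen this kind of wrapper, here is a minimal sketch of
what a C++ abstraction over SSE could look like. It is an illustration only, not
the abstraction mentioned above: the names float4, le_mask and mandelbrot4 are
hypothetical, and it assumes an x86 target where <xmmintrin.h> is available.

```cpp
#include <xmmintrin.h>  // SSE intrinsics; assumes an x86/x86-64 target

// Hypothetical wrapper (not the author's actual class): four packed floats.
struct float4 {
    __m128 v;
    // Same value in all four lanes.
    static float4 broadcast(float x) { return float4{_mm_set1_ps(x)}; }
    // Lane 0 = a, lane 1 = b, lane 2 = c, lane 3 = d.
    static float4 set(float a, float b, float c, float d) {
        return float4{_mm_setr_ps(a, b, c, d)};
    }
};

inline float4 operator+(float4 a, float4 b) { return float4{_mm_add_ps(a.v, b.v)}; }
inline float4 operator-(float4 a, float4 b) { return float4{_mm_sub_ps(a.v, b.v)}; }
inline float4 operator*(float4 a, float4 b) { return float4{_mm_mul_ps(a.v, b.v)}; }

// Bit i of the result is set when lane i satisfies a <= b.
inline int le_mask(float4 a, float4 b) {
    return _mm_movemask_ps(_mm_cmple_ps(a.v, b.v));
}

// Iterates z = z*z + c for four points at once. out[i] receives the number
// of iterations for which lane i still had |z| <= 2, capped at max_iter.
inline void mandelbrot4(float4 cr, float4 ci, int max_iter, int out[4]) {
    float4 zr = float4::broadcast(0.0f);
    float4 zi = float4::broadcast(0.0f);
    const float4 two = float4::broadcast(2.0f);
    const float4 four = float4::broadcast(4.0f);
    for (int i = 0; i < 4; ++i) out[i] = 0;

    int active = 0xF;  // lanes that have not escaped yet
    for (int n = 0; n < max_iter; ++n) {
        const float4 zr2 = zr * zr;
        const float4 zi2 = zi * zi;
        active &= le_mask(zr2 + zi2, four);  // |z|^2 <= 4
        if (active == 0) break;              // every lane has escaped
        for (int i = 0; i < 4; ++i) out[i] += (active >> i) & 1;
        const float4 next_zi = two * zr * zi + ci;
        zr = zr2 - zi2 + cr;
        zi = next_zi;
    }
}
```

The inner loop reads like scalar code while each operator maps to a single SSE
instruction, so how fast it runs depends almost entirely on how well the
compiler inlines the wrapper and keeps the values in registers.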

GCC really is a nice compiler! Keep on rocking!
