http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57954

Yuri Rumyantsev <ysrumyan at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ysrumyan at gmail dot com

--- Comment #9 from Yuri Rumyantsev <ysrumyan at gmail dot com> ---
Uros,

I think this fix is not good and must be reverted; I will prepare
another fix for your review. There are at least 2 problems:

1. The new split for int --> fp conversions is done under TARGET_SSE2 and
TARGET_SSE_PARTIAL_REG_DEPENDENCY, which includes both Atom chips - SLT and SLM.
I checked that zeroing the xmm register before the conversion leads to a
performance slowdown on SLM (-5%) for the provided test-case. I think that
TARGET_AVX must be used instead of TARGET_SSE2.
2. This zeroing can be redundant, in which case it should not be inserted,
e.g. for the following simple test-case:

void foo (float* p, int n)
{
  int i;
  for (i=0; i<n; i++)
    p[i] = (float) i;
}

With H.J.'s patch we get the following assembly (I compiled it for slm, but
that does not matter):

.L3:
    xorps    %xmm0, %xmm0
    cvtsi2ss    %eax, %xmm0
    movss    %xmm0, (%ecx,%eax,4)
    addl    $1, %eax
    cmpl    %edx, %eax
    jne    .L3

It is clear that the zeroing is redundant here and must be removed.
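
For anyone who wants to reproduce this, a minimal sketch of a checkable
version of the test case above might look as follows. Only foo comes from
this comment; the helper count_mismatches, the MAX_N limit and the static
buffer are hypothetical additions for illustration.

```c
/* Test case from this comment: each iteration converts the loop
   counter to float, which is where cvtsi2ss (and the xorps under
   discussion) shows up in the generated loop. */
void foo (float *p, int n)
{
  int i;
  for (i = 0; i < n; i++)
    p[i] = (float) i;
}

/* Hypothetical helper, not part of the original test case.
   MAX_N is an arbitrary size chosen for illustration; every value
   below it is exactly representable as a float. */
#define MAX_N 4096

/* Runs foo on a static buffer and returns the number of elements
   that do not convert back to their index, or -1 if n is out of
   range for the buffer. */
int count_mismatches (int n)
{
  static float buf[MAX_N];
  int i, bad = 0;

  if (n < 0 || n > MAX_N)
    return -1;

  foo (buf, n);
  for (i = 0; i < n; i++)
    if ((int) buf[i] != i)
      bad++;
  return bad;
}
```

Compiling this with optimization and -S (the exact options used for the
listing above are not given in this comment) should show whether the xorps
is emitted inside the loop of foo.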
