[Bug target/36222] x86 fails to optimize out __v4si -> __m128i move

2009-02-13 Thread hjl dot tools at gmail dot com
--- Comment #10 from hjl dot tools at gmail dot com 2009-02-13 15:57 --- Fixed. Gcc 4.4.0 revision 144128 generates:
foo2:
        movd    %edi, %xmm0
        movd    %esi, %xmm1
        movd    %edx, %xmm2
        punpckldq       %xmm0, %xmm1
        movd    %ecx, %xmm0

[Bug target/36222] x86 fails to optimize out __v4si -> __m128i move

2008-05-18 Thread uweigand at gcc dot gnu dot org
--- Comment #8 from uweigand at gcc dot gnu dot org 2008-05-18 15:58 --- That special case in find_reloads is really about a different situation. We do not have a simple move here. The problem is also not really related to vector instructions in particular; reload doesn't at all care

[Bug target/36222] x86 fails to optimize out __v4si -> __m128i move

2008-05-18 Thread rguenth at gcc dot gnu dot org
--- Comment #9 from rguenth at gcc dot gnu dot org 2008-05-18 16:58 --- Did you investigate whether IRA fixes this issue? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36222

[Bug target/36222] x86 fails to optimize out __v4si -> __m128i move

2008-05-16 Thread ubizjak at gmail dot com
--- Comment #6 from ubizjak at gmail dot com 2008-05-16 19:09 --- There is still an underlying RA problem left. Compiling the testcase from Comment 0 on 32-bit i686 (-O2 -msse2), we get:
_mm_set_epi32:
        pushl   %ebp
        movl    %esp, %ebp
        movd    8(%ebp), %xmm0
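
For readers who want to reproduce this kind of output, here is a minimal sketch of the __v4si -> __m128i pattern named in the bug title. It is not the testcase from Comment 0, which is not quoted here. The names v4si_t, set_epi32_sketch, lane32 and check_set_epi32_sketch are hypothetical, and the code assumes GCC or Clang vector extensions on an x86 target with SSE2.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_set_epi32 */

/* Hypothetical stand-in for the compiler's internal __v4si type. */
typedef int v4si_t __attribute__((vector_size(16)));

/* Build a four-int vector, then reinterpret it as __m128i.
   Arguments run from the highest lane (q3) to the lowest (q0),
   the same order _mm_set_epi32 uses. */
__m128i set_epi32_sketch(int q3, int q2, int q1, int q0)
{
    v4si_t v = { q0, q1, q2, q3 };
    return (__m128i)v;  /* same-size vector cast, no value conversion */
}

/* Hypothetical helper: read back 32-bit lane idx (0 = lowest). */
int lane32(__m128i v, int idx)
{
    int out[4];
    _mm_storeu_si128((__m128i *)out, v);
    return out[idx & 3];
}

/* Self-check: the sketch should match the real intrinsic lane by lane. */
void check_set_epi32_sketch(void)
{
    __m128i a = set_epi32_sketch(4, 3, 2, 1);
    __m128i b = _mm_set_epi32(4, 3, 2, 1);
    for (int i = 0; i < 4; i++)
        assert(lane32(a, i) == lane32(b, i));
}
```

Compiling only set_epi32_sketch with -O2 -msse2 -S, the flags quoted above, should show whether a given compiler still emits extra register moves for the cast.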

[Bug target/36222] x86 fails to optimize out __v4si -> __m128i move

2008-05-16 Thread hjl dot tools at gmail dot com
--- Comment #7 from hjl dot tools at gmail dot com 2008-05-16 21:54 --- find_reloads in reload.c has
  /* Special case a simple move with an input reload and a
     destination of a hard reg, if the hard reg is ok, use it.  */
  for (i = 0; i < n_reloads; i++)
    if

[Bug target/36222] x86 fails to optimize out __v4si -> __m128i move

2008-05-13 Thread hjl dot tools at gmail dot com
--- Comment #4 from hjl dot tools at gmail dot com 2008-05-13 14:09 --- It looks like reload doesn't check any vector instructions. I guess there may be many missed optimizations with vector instructions.

[Bug target/36222] x86 fails to optimize out __v4si -> __m128i move

2008-05-13 Thread hjl dot tools at gmail dot com
--- Comment #3 from hjl dot tools at gmail dot com 2008-05-13 13:58 --- This bug may be related to PR 30961. Another example:
bash-3.2$ cat d.c
#include <emmintrin.h>
__m128i
foo2 (long long x1, long long x2)
{
  return _mm_set_epi64x (x1, x2);
}
bash-3.2$
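
The d.c example above can be turned into a small self-checking unit. This is a sketch only: foo2 follows the quoted testcase, while lane64 and check_foo2 are hypothetical helpers added for illustration, and the code assumes an x86 target where _mm_set_epi64x is available.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_set_epi64x */

/* Same shape as the d.c testcase quoted above. */
__m128i foo2(long long x1, long long x2)
{
    /* First argument becomes the high 64-bit lane, second the low one. */
    return _mm_set_epi64x(x1, x2);
}

/* Hypothetical helper: read back 64-bit lane idx (0 = low, 1 = high). */
long long lane64(__m128i v, int idx)
{
    long long out[2];
    _mm_storeu_si128((__m128i *)out, v);
    return out[idx & 1];
}

/* Self-check on lane order; call it from a test harness. */
void check_foo2(void)
{
    __m128i v = foo2(1, 2);
    assert(lane64(v, 0) == 2);  /* x2 lands in the low lane */
    assert(lane64(v, 1) == 1);  /* x1 lands in the high lane */
}
```

Inspecting the assembly for foo2 alone (for example with -O2 -S) is the way to look for the redundant moves this report is about; the helpers only confirm that the result is correct.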

[Bug target/36222] x86 fails to optimize out __v4si -> __m128i move

2008-05-13 Thread uros at gcc dot gnu dot org
--- Comment #5 from uros at gcc dot gnu dot org 2008-05-13 21:34 ---
Subject: Bug 36222
Author: uros
Date: Tue May 13 21:33:40 2008
New Revision: 135275
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=135275
Log:
PR target/36222
* config/i386/i386.c

[Bug target/36222] x86 fails to optimize out __v4si -> __m128i move

2008-05-12 Thread hjl dot tools at gmail dot com
--- Comment #1 from hjl dot tools at gmail dot com 2008-05-12 15:57 --- Also do we need movq %xmm1, %xmm2?

[Bug target/36222] x86 fails to optimize out __v4si -> __m128i move

2008-05-12 Thread ubizjak at gmail dot com
--- Comment #2 from ubizjak at gmail dot com 2008-05-12 19:23 --- (In reply to comment #1)
> Also do we need movq %xmm1, %xmm2?
We can help RA a bit by emitting an RTL sequence that requires fewer pseudos.
Index: i386.c ===