[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2008-11-15 Thread howarth at nitro dot med dot uc dot edu


--- Comment #11 from howarth at nitro dot med dot uc dot edu  2008-11-15 
23:59 ---
This test case fails at -m64 on i686-apple-darwin9 in current gcc trunk.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2008-11-15 Thread howarth at nitro dot med dot uc dot edu


--- Comment #12 from howarth at nitro dot med dot uc dot edu  2008-11-16 
00:01 ---
Created an attachment (id=16690)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16690&action=view)
assembly file generated for gcc.target/i386/pr32661-1.c at -m64 on
i686-apple-darwin9


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2008-11-15 Thread howarth at nitro dot med dot uc dot edu


--- Comment #13 from howarth at nitro dot med dot uc dot edu  2008-11-16 
00:01 ---
Test fails as...

Executing on host:
/sw/src/fink.build/gcc44-4.3.999-20081115/darwin_objdir/gcc/xgcc
-B/sw/src/fink.build/gcc44-4.3.999-20081115/darwin_objdir/gcc/
/sw/src/fink.build/gcc44-4.3.999-20081115/gcc-4.4-20081115/gcc/testsuite/gcc.target/i386/pr32661-1.c
  -O2 -S  -m64 -o pr32661-1.s   
 (timeout = 300)
PASS: gcc.target/i386/pr32661-1.c (test for excess errors)
FAIL: gcc.target/i386/pr32661-1.c scan-assembler-times mov 2


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2007-08-28 Thread ubizjak at gmail dot com


--- Comment #10 from ubizjak at gmail dot com  2007-08-28 09:57 ---
Fixed.


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2007-08-28 Thread ubizjak at gmail dot com


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.3.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2007-08-28 Thread uros at gcc dot gnu dot org


--- Comment #9 from uros at gcc dot gnu dot org  2007-08-28 09:52 ---
Subject: Bug 32661

Author: uros
Date: Tue Aug 28 09:52:06 2007
New Revision: 127857

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=127857
Log:
PR target/32661
* simplify-rtx.c (simplify_binary_operation_1) [VEC_SELECT]:
Simplify nested VEC_SELECT (with optional VEC_CONCAT operator as
operand) when top VEC_SELECT extracts scalar element.
* config/i386/sse.md (*vec_extract_v4si_mem): New.
(*vec_extract_v4sf_mem): Ditto.

testsuite/ChangeLog:

PR target/32661
* gcc.target/i386/pr32661.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/i386/pr32661.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/sse.md
trunk/gcc/simplify-rtx.c
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2007-07-13 Thread ubizjak at gmail dot com


--- Comment #7 from ubizjak at gmail dot com  2007-07-13 06:08 ---
I have the following patch, which simplifies the nested VEC_SELECT insn.
However, I would like to extend it to the nested
VEC_SELECT (VEC_SELECT (VEC_DUPLICATE (...))) form that is generated, e.g.,
for __builtin_ia32_vec_ext_v4si(*val, 2);

Index: simplify-rtx.c
===================================================================
--- simplify-rtx.c  (revision 126587)
+++ simplify-rtx.c  (working copy)
@@ -2669,6 +2669,31 @@ simplify_binary_operation_1 (enum rtx_co
  if (GET_CODE (trueop0) == CONST_VECTOR)
return CONST_VECTOR_ELT (trueop0, INTVAL (XVECEXP
  (trueop1, 0, 0)));
+ if (GET_CODE (trueop0) == VEC_SELECT
+  && (GET_MODE (XEXP (trueop0, 0)) == GET_MODE (trueop0)))
+   {
+ rtx op = XEXP (trueop0, 0);
+ rtx sel = XEXP (trueop0, 1);
+ enum machine_mode opmode = GET_MODE (op);
+ rtvec vec;
+ rtx tmp;
+
+ int elt_size = GET_MODE_SIZE (GET_MODE_INNER (opmode));
+ int n_elts = GET_MODE_SIZE (opmode) / elt_size;
+
+ int i = INTVAL (XVECEXP (trueop1, 0, 0));
+
+ gcc_assert (GET_CODE (sel) == PARALLEL);
+ gcc_assert (i < n_elts);
+
+ /* Select value, pointed by nested selector.  */
+ vec = rtvec_alloc (1);
+ RTVEC_ELT (vec, 0) = CONST_VECTOR_ELT (sel, i);
+ tmp = gen_rtx_PARALLEL (VOIDmode, vec);
+
+ tmp = gen_rtx_fmt_ee (code, mode, op, tmp);
+ return tmp;
+   }
}
   else
{
Index: config/i386/sse.md
===================================================================
--- config/i386/sse.md  (revision 126587)
+++ config/i386/sse.md  (working copy)
@@ -4578,6 +4578,22 @@
   operands[1] = gen_rtx_REG (SImode, REGNO (operands[1]));
 })

+(define_insn_and_split "*sse2_stored_1"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (vec_select:SI
+ (match_operand:V4SI 1 "memory_operand" "o")
+ (parallel [(match_operand 2 "const_0_to_3_operand" "")])))]
+  "TARGET_SSE"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  int i = INTVAL (operands[2]);
+
+  emit_move_insn (operands[0], adjust_address (operands[1], SImode, i*4));
+  DONE;
+})
+

 (define_expand "sse_storeq"
   [(set (match_operand:DI 0 "nonimmediate_operand" "")
(vec_select:DI
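
The simplify-rtx.c hunk above folds a scalar extraction from a nested
selection into a single extraction from the innermost operand. The following
is only a rough model of that idea using plain C arrays, not GCC's RTL API;
N_ELTS, select_vector, extract_nested and extract_folded are hypothetical
names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define N_ELTS 4  /* hypothetical: a V4SI-like vector with four elements */

/* Inner selection: out[k] = op[sel[k]], modelling
   (vec_select op (parallel sel)).  */
void select_vector(const int op[N_ELTS], const int sel[N_ELTS],
                   int out[N_ELTS])
{
    for (size_t k = 0; k < N_ELTS; k++) {
        assert(sel[k] >= 0 && sel[k] < N_ELTS);
        out[k] = op[sel[k]];
    }
}

/* Unsimplified form: build the inner selection, then take element i.  */
int extract_nested(const int op[N_ELTS], const int sel[N_ELTS], int i)
{
    int tmp[N_ELTS];

    assert(i >= 0 && i < N_ELTS);
    select_vector(op, sel, tmp);
    return tmp[i];
}

/* Folded form: look up sel[i] and extract that element of op directly,
   so the intermediate vector is never built.  */
int extract_folded(const int op[N_ELTS], const int sel[N_ELTS], int i)
{
    assert(i >= 0 && i < N_ELTS);
    return op[sel[i]];
}
```

Both functions return the same value for every valid index, which is the
property the simplification relies on.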


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2007-07-13 Thread ubizjak at gmail dot com


--- Comment #8 from ubizjak at gmail dot com  2007-07-13 13:25 ---
Patch for SImode and SFmode vec_select at
http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01263.html


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|http://gcc.gnu.org/ml/gcc-  |http://gcc.gnu.org/ml/gcc-
                   |patches/2007-               |patches/2007-
                   |07/msg01077.html            |07/msg01263.html
           Keywords|                            |patch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2007-07-11 Thread scovich at gmail dot com


--- Comment #2 from scovich at gmail dot com  2007-07-11 15:03 ---
(In reply to comment #1)
> Confirmed, not a regression.

Also affects 4.3. Changing the target version.


-- 

scovich at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|4.1.2                       |4.3.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2007-07-11 Thread scovich at gmail dot com


--- Comment #3 from scovich at gmail dot com  2007-07-11 15:10 ---
This bug also causes _mm_cvtsi128_si64x() (which calls
__builtin_ia32_vec_ext_v2di) to emit suboptimal code.

// g++-4.3-070710 -mtune=core2 -O3 -S -dp
#include <emmintrin.h>
long vector2long(__m128i* src) { return _mm_cvtsi128_si64x(*src); }

Becomes

_Z11vector2longPU8__vectorx:
.LFB529:
        movdqa  (%rdi), %xmm0   # 6     *movv2di_internal/2     [length = 3]
        movd    %xmm0, %rax     # 25    *movdi_1_rex64/14       [length = 4]
        ret                     # 28    return_internal         [length = 1]

This might be related to bug 32708 (and therefore have a similar fix?)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2007-07-11 Thread uros at gcc dot gnu dot org


--- Comment #4 from uros at gcc dot gnu dot org  2007-07-11 18:43 ---
Subject: Bug 32661

Author: uros
Date: Wed Jul 11 18:42:44 2007
New Revision: 126557

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=126557
Log:
PR target/32661
* config/i386/sse.md (*sse2_storeq_rex64): Handle 64bit mem-reg moves.
(*vec_extractv2di_1_sse2): Disable for TARGET_64BIT.
(*vec_extractv2di_1_rex64): New insn pattern.

testsuite/ChangeLog:

PR target/32661
* gcc.target/i386/pr32661-1.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/i386/pr32661-1.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/sse.md
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2007-07-11 Thread ubizjak at gmail dot com


--- Comment #5 from ubizjak at gmail dot com  2007-07-11 18:47 ---
(In reply to comment #3)

> This might be related to bug 32708 (and therefore have a similar fix?)

Yes, DImode moves are implemented/fixed by the patch above. Your example now
compiles to:

        movq    (%rdi), %rax
        ret

Other examples are shown in
http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01077.html.

SImode moves will be a bit harder, because shufps insn pattern is involved in
the vector expansion.
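
As a rough illustration of why a single scalar load is enough when the vector
lives in memory, the sketch below compares the two forms using GCC's generic
vector extension rather than the SSE intrinsics. The type v2di and the
functions elt_via_vector and elt_via_scalar_load are hypothetical names, and
the sketch assumes a compiler that supports vector_size and vector
subscripting.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit vector of two 64-bit integers.  */
typedef long long v2di __attribute__((vector_size(16)));

/* Vector form: load the whole 16-byte value, then pick element i.  */
long long elt_via_vector(const v2di *src, int i)
{
    v2di v = *src;

    assert(i >= 0 && i < 2);
    return v[i];
}

/* Scalar form: read only the 8 bytes that hold element i, which sit at
   byte offset i * sizeof (long long) from the start of the vector.  */
long long elt_via_scalar_load(const v2di *src, int i)
{
    long long out;

    assert(i >= 0 && i < 2);
    memcpy(&out, (const char *) src + (size_t) i * sizeof out, sizeof out);
    return out;
}
```

The same offset arithmetic with a 4-byte element size corresponds to the
i*4 adjustment used for the V4SI memory operand.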


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |ubizjak at gmail dot com
                   |dot org                     |
                URL|                            |http://gcc.gnu.org/ml/gcc-
                   |                            |patches/2007-
                   |                            |07/msg01077.html
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2007-07-07 09:25:01         |2007-07-11 18:47:20
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2007-07-11 Thread scovich at gmail dot com


--- Comment #6 from scovich at gmail dot com  2007-07-11 20:27 ---
(In reply to comment #5)
> SImode moves will be a bit harder, because shufps insn pattern is involved in
> the vector expansion.

IIRC, shufps takes 3 cycles on Core2
(http://www.agner.org/optimize/instruction_tables.pdf), even without the
operand type mismatch (does that still exist?). That's >=4 cycles.

Storing the vector to the stack and loading the desired entry would take <=4
cycles, even without Intel's store-load optimizations, and I imagine the
optimizer would be able to deal with it better.
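
A minimal sketch of the store-and-reload alternative described above, written
with GCC's generic vector extension instead of SSE intrinsics. The type v4si
and the function extract_via_stack are hypothetical names, and whether the
compiler actually emits a stack store followed by a scalar load depends on the
target and optimization level.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit vector of four 32-bit integers.  */
typedef int32_t v4si __attribute__((vector_size(16)));

/* Spill the vector to a stack buffer, then reload only element i.  */
int32_t extract_via_stack(v4si v, int i)
{
    int32_t spill[4];

    assert(i >= 0 && i < 4);
    memcpy(spill, &v, sizeof spill);  /* store the whole vector */
    return spill[i];                  /* scalar load of one entry */
}
```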


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2007-07-07 Thread ubizjak at gmail dot com


--- Comment #1 from ubizjak at gmail dot com  2007-07-07 09:25 ---
Confirmed, not a regression.


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2007-07-07 09:25:01
               date|                            |
   Target Milestone|---                         |4.3.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2007-07-07 Thread rguenth at gcc dot gnu dot org


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org
   Target Milestone|4.3.0                       |---


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661



[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args

2007-07-06 Thread pinskia at gcc dot gnu dot org


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu dot
                   |                            |org
           Severity|normal                      |enhancement
          Component|middle-end                  |target
           Keywords|                            |missed-optimization


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661