[PATCH] Fix vector division lowering (PR tree-optimization/60960)

2014-04-25 Thread Jakub Jelinek
Hi!

If a vector type has a scalar mode, such as the 4 x char vector in the
testcase, which has SImode, then unfortunately various optab checks in
expand_vector_divmod and the functions it calls can succeed even though
the operation is not actually a vector operation (e.g. an SImode shift is
very different from a V4QImode shift with a scalar shift count).
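
To make the difference concrete, here is a minimal sketch in plain C, not GCC internals. The helper names shift_as_scalar and shift_per_lane are hypothetical and only stand in for what an SImode shift and a V4QImode shift would compute on four packed bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: treat the four bytes as one 32-bit scalar and
   shift it, the way an SImode shift would.  Bits shifted out of one
   byte leak into the byte below it.  */
static uint32_t
shift_as_scalar (uint32_t packed, unsigned count)
{
  assert (count < 32);
  return packed >> count;
}

/* Hypothetical helper: shift every byte lane on its own, the way a
   V4QImode shift with a scalar shift count would.  */
static uint32_t
shift_per_lane (uint32_t packed, unsigned count)
{
  unsigned char lanes[4];

  assert (count < 8);
  memcpy (lanes, &packed, sizeof lanes);
  for (int i = 0; i < 4; i++)
    lanes[i] = (unsigned char) (lanes[i] >> count);
  memcpy (&packed, lanes, sizeof lanes);
  return packed;
}
```

For the testcase's { 5, 5, 5, 5 }, packed as 0x05050505, shift_as_scalar (v, 1) gives 0x02828282 while shift_per_lane (v, 1) gives 0x02020202, so only the per-lane form matches a division of each element by 2.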

The following patch fixes this by not calling expand_vector_divmod
at all if the type doesn't have a vector mode.
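
The patch text does not spell out what happens after the break. Assuming the generic lowering then ends up dividing element by element (an assumption, not something stated here), the result the new test expects can be pictured with the sketch below. divide_per_lane is a hypothetical name, not a GCC function.

```c
#include <assert.h>

/* Hypothetical helper, not GCC code: divide four unsigned char lanes
   one at a time and store the quotients in OUT.  */
static void
divide_per_lane (const unsigned char x[4], const unsigned char y[4],
                 unsigned char out[4])
{
  for (int i = 0; i < 4; i++)
    {
      assert (y[i] != 0);  /* division by zero is undefined */
      out[i] = (unsigned char) (x[i] / y[i]);
    }
}
```

With x = { 5, 5, 5, 5 } and y = { 2, 2, 2, 2 } every lane of the result is 2, which is the value the testcase compares against.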

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.9?

2014-04-25  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/60960
* tree-vect-generic.c (expand_vector_operation): Only call
expand_vector_divmod if type's mode satisfies VECTOR_MODE_P.

* gcc.c-torture/execute/pr60960.c: New test.

--- gcc/tree-vect-generic.c.jj  2014-04-17 14:49:04.0 +0200
+++ gcc/tree-vect-generic.c 2014-04-25 09:27:31.689530647 +0200
@@ -971,7 +971,8 @@ expand_vector_operation (gimple_stmt_ite
 
           if (!optimize
               || !VECTOR_INTEGER_TYPE_P (type)
-              || TREE_CODE (rhs2) != VECTOR_CST)
+              || TREE_CODE (rhs2) != VECTOR_CST
+              || !VECTOR_MODE_P (TYPE_MODE (type)))
             break;
 
           ret = expand_vector_divmod (gsi, type, rhs1, rhs2, code);
--- gcc/testsuite/gcc.c-torture/execute/pr60960.c.jj  2014-04-25 09:30:10.891687231 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr60960.c   2014-04-25 09:29:50.0 +0200
@@ -0,0 +1,38 @@
+/* PR tree-optimization/60960 */
+
+typedef unsigned char v4qi __attribute__ ((vector_size (4)));
+
+__attribute__((noinline, noclone)) v4qi
+f1 (v4qi v)
+{
+  return v / 2;
+}
+
+__attribute__((noinline, noclone)) v4qi
+f2 (v4qi v)
+{
+  return v / (v4qi) { 2, 2, 2, 2 };
+}
+
+__attribute__((noinline, noclone)) v4qi
+f3 (v4qi x, v4qi y)
+{
+  return x / y;
+}
+
+int
+main ()
+{
+  v4qi x = { 5, 5, 5, 5 };
+  v4qi y = { 2, 2, 2, 2 };
+  v4qi z = f1 (x);
+  if (__builtin_memcmp (&y, &z, sizeof (y)) != 0)
+    __builtin_abort ();
+  z = f2 (x);
+  if (__builtin_memcmp (&y, &z, sizeof (y)) != 0)
+    __builtin_abort ();
+  z = f3 (x, y);
+  if (__builtin_memcmp (&y, &z, sizeof (y)) != 0)
+    __builtin_abort ();
+  return 0;
+}

Jakub


Re: [PATCH] Fix vector division lowering (PR tree-optimization/60960)

2014-04-25 Thread Richard Biener
On April 25, 2014 3:39:29 PM CEST, Jakub Jelinek ja...@redhat.com wrote:
> Hi!
>
> If a vector type has scalar mode, such as 4xchar vector in the testcase
> SImode, then unfortunately various optabs checks in expand_vector_divmod
> and functions it calls can succeed, but the operation is actually not
> a vector operation (e.g. a SImode shift is very different from
> V4QImode shift with scalar shift count).
>
> The following patch fixes this by not calling expand_vector_divmod
> at all if the type doesn't have a vector mode.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.9?

OK,
Thanks,
Richard.
