Re: [PATCH, PR middle-end/71488] Fix vectorization of comparison of booleans

2016-06-27 Thread Ilya Enkovich
Looks like it caused PR71655 and therefore is not so safe :/

Ilya

2016-06-22 17:00 GMT+03:00 Ilya Enkovich :
> 2016-06-21 23:57 GMT+03:00 Jeff Law :
>> On 06/16/2016 05:06 AM, Ilya Enkovich wrote:
>>>
>>> Hi,
>>>
>>> This patch fixes incorrect comparison vectorization for booleans.
>>> The problem is that regular comparison which works for scalars
>>> doesn't work for vectors due to different binary representation.
>>> Also this never works for scalar masks.
>>>
>>> This patch replaces such comparisons with bitwise operations
>>> which work correctly for both vector and scalar masks.
>>>
>>> Bootstrapped and regtested on x86_64-unknown-linux-gnu.  Is it
>>> OK for trunk?  What should be done for gcc-6-branch?  Port this
>>> patch or just restrict vectorization for comparison of booleans?
>>>
>>> Thanks,
>>> Ilya
>>> --
>>> gcc/
>>>
>>> 2016-06-15  Ilya Enkovich  
>>>
>>> PR middle-end/71488
>>> * tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Support
>>> comparison of boolean vectors.
>>> * tree-vect-stmts.c (vectorizable_comparison): Vectorize comparison
>>> of boolean vectors using bitwise operations.
>>>
>>> gcc/testsuite/
>>>
>>> 2016-06-15  Ilya Enkovich  
>>>
>>> PR middle-end/71488
>>> * g++.dg/pr71488.C: New test.
>>> * gcc.dg/vect/vect-bool-cmp.c: New test.
>>
>> OK.  Given this is a code generation bug, I'll support porting this patch to
>> the gcc-6 branch.  Is there any reason to think that porting would be more
>> risky than usual?  It looks pretty simple to me, am I missing some subtle
>> dependency?
>
> I don't feel this patch is too risky.  I asked only because a simple
> restriction on mask comparison is even safer.
>
> Thanks,
> Ilya
>
>
>>
>> jeff
>>


Re: [PATCH, PR middle-end/71488] Fix vectorization of comparison of booleans

2016-06-22 Thread Ilya Enkovich
2016-06-21 23:57 GMT+03:00 Jeff Law :
> On 06/16/2016 05:06 AM, Ilya Enkovich wrote:
>>
>> Hi,
>>
>> This patch fixes incorrect comparison vectorization for booleans.
>> The problem is that regular comparison which works for scalars
>> doesn't work for vectors due to different binary representation.
>> Also this never works for scalar masks.
>>
>> This patch replaces such comparisons with bitwise operations
>> which work correctly for both vector and scalar masks.
>>
>> Bootstrapped and regtested on x86_64-unknown-linux-gnu.  Is it
>> OK for trunk?  What should be done for gcc-6-branch?  Port this
>> patch or just restrict vectorization for comparison of booleans?
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2016-06-15  Ilya Enkovich  
>>
>> PR middle-end/71488
>> * tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Support
>> comparison of boolean vectors.
>> * tree-vect-stmts.c (vectorizable_comparison): Vectorize comparison
>> of boolean vectors using bitwise operations.
>>
>> gcc/testsuite/
>>
>> 2016-06-15  Ilya Enkovich  
>>
>> PR middle-end/71488
>> * g++.dg/pr71488.C: New test.
>> * gcc.dg/vect/vect-bool-cmp.c: New test.
>
> OK.  Given this is a code generation bug, I'll support porting this patch to
> the gcc-6 branch.  Is there any reason to think that porting would be more
> risky than usual?  It looks pretty simple to me, am I missing some subtle
> dependency?

I don't feel this patch is too risky.  I asked only because a simple
restriction on mask comparison is even safer.

Thanks,
Ilya


>
> jeff
>


Re: [PATCH, PR middle-end/71488] Fix vectorization of comparison of booleans

2016-06-21 Thread Jeff Law

On 06/16/2016 05:06 AM, Ilya Enkovich wrote:

> Hi,
>
> This patch fixes incorrect comparison vectorization for booleans.
> The problem is that regular comparison which works for scalars
> doesn't work for vectors due to different binary representation.
> Also this never works for scalar masks.
>
> This patch replaces such comparisons with bitwise operations
> which work correctly for both vector and scalar masks.
>
> Bootstrapped and regtested on x86_64-unknown-linux-gnu.  Is it
> OK for trunk?  What should be done for gcc-6-branch?  Port this
> patch or just restrict vectorization for comparison of booleans?
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2016-06-15  Ilya Enkovich
>
> PR middle-end/71488
> * tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Support
> comparison of boolean vectors.
> * tree-vect-stmts.c (vectorizable_comparison): Vectorize comparison
> of boolean vectors using bitwise operations.
>
> gcc/testsuite/
>
> 2016-06-15  Ilya Enkovich
>
> PR middle-end/71488
> * g++.dg/pr71488.C: New test.
> * gcc.dg/vect/vect-bool-cmp.c: New test.

OK.  Given this is a code generation bug, I'll support porting this 
patch to the gcc-6 branch.  Is there any reason to think that porting 
would be more risky than usual?  It looks pretty simple to me, am I 
missing some subtle dependency?


jeff



[PATCH, PR middle-end/71488] Fix vectorization of comparison of booleans

2016-06-16 Thread Ilya Enkovich
Hi,

This patch fixes incorrect comparison vectorization for booleans.
The problem is that regular comparison which works for scalars
doesn't work for vectors due to different binary representation.
Also this never works for scalar masks.

This patch replaces such comparisons with bitwise operations
which work correctly for both vector and scalar masks.
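
To illustrate the representation issue described above, here is a minimal scalar model in C. It is only a sketch, not the vectorizer code: it assumes a vector mask element encodes true as all ones (-1 as a signed integer) and false as 0, and the names mask_t, to_mask, naive_gt, bitwise_gt and count_mismatches are hypothetical helpers made up for this example. Under that assumption an ordinary signed comparison of two mask elements gives the wrong answer for "true > false", while the bitwise forms agree with the scalar result.

```c
#include <assert.h>
#include <stdint.h>

/* Model of one vector mask element: true is all ones (-1), false is 0.
   This encoding is an assumption made for the illustration.  */
typedef int32_t mask_t;

/* Convert a scalar boolean (0 or 1) to the mask encoding.  */
mask_t to_mask (int b) { return b ? -1 : 0; }

/* Naive approach: apply the integer comparison directly to mask
   elements.  Wrong, because true (-1) compares less than false (0).  */
mask_t naive_gt (mask_t a, mask_t b) { return to_mask (a > b); }

/* Bitwise equivalents of boolean comparisons.  */
mask_t bitwise_gt (mask_t a, mask_t b) { return a & ~b; }  /* a >  b */
mask_t bitwise_ge (mask_t a, mask_t b) { return a | ~b; }  /* a >= b */
mask_t bitwise_lt (mask_t a, mask_t b) { return ~a & b; }  /* a <  b */
mask_t bitwise_le (mask_t a, mask_t b) { return ~a | b; }  /* a <= b */

/* Count how many of the four boolean input pairs give a result that
   differs from the scalar "a > b" converted to the mask encoding.  */
int count_mismatches (mask_t (*gt) (mask_t, mask_t))
{
  int a, b, bad = 0;

  for (a = 0; a <= 1; a++)
    for (b = 0; b <= 1; b++)
      if (gt (to_mask (a), to_mask (b)) != to_mask (a > b))
        bad++;
  return bad;
}
```

In this model count_mismatches (naive_gt) reports two wrong cases (true > false and false > true), and count_mismatches (bitwise_gt) reports none. The bitwise forms operate on each bit independently, so the same identities would also apply to a mask that packs one bit per element.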

Bootstrapped and regtested on x86_64-unknown-linux-gnu.  Is it
OK for trunk?  What should be done for gcc-6-branch?  Port this
patch or just restrict vectorization for comparison of booleans?

Thanks,
Ilya
--
gcc/

2016-06-15  Ilya Enkovich  

PR middle-end/71488
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Support
comparison of boolean vectors.
* tree-vect-stmts.c (vectorizable_comparison): Vectorize comparison
of boolean vectors using bitwise operations.

gcc/testsuite/

2016-06-15  Ilya Enkovich  

PR middle-end/71488
* g++.dg/pr71488.C: New test.
* gcc.dg/vect/vect-bool-cmp.c: New test.


diff --git a/gcc/testsuite/g++.dg/pr71488.C b/gcc/testsuite/g++.dg/pr71488.C
new file mode 100644
index 000..d7d657e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr71488.C
@@ -0,0 +1,24 @@
+// PR middle-end/71488
+// { dg-do run }
+// { dg-options "-O3 -std=c++11" }
+// { dg-additional-options "-march=westmere" { target i?86-*-* x86_64-*-* } }
+// { dg-require-effective-target c++11 }
+
+#include <valarray>
+
+int var_4 = 1;
+long long var_9 = 0;
+
+int main() {
+  
+  std::valarray<std::valarray<long long>> v10;
+
+  v10.resize(1);
+  v10[0].resize(4);
+
+  for (int i = 0; i < 4; i++)
+    v10[0][i] = ((var_9 == 0) > unsigned (var_4 == 0)) + (var_9 == 0);
+
+  if (v10[0][0] != 2)
+    __builtin_abort ();
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bool-cmp.c b/gcc/testsuite/gcc.dg/vect/vect-bool-cmp.c
new file mode 100644
index 000..a1e2a24
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bool-cmp.c
@@ -0,0 +1,252 @@
+/* PR71488 */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_pack_trunc } */
+/* { dg-additional-options "-msse4" { target { i?86-*-* x86_64-*-* } } } */
+
+int i1, i2;
+
+void __attribute__((noclone,noinline))
+fn1 (int * __restrict__ p1, int * __restrict__ p2, int * __restrict__ p3, int size)
+{
+  int i;
+
+  for (i = 0; i < size; i++)
+    p1[i] = ((p2[i] == 0) > (unsigned)(p3[i] == 0)) + (p2[i] == 0);
+}
+
+void __attribute__((noclone,noinline))
+fn2 (int * __restrict__ p1, int * __restrict__ p2, short * __restrict__ p3, int size)
+{
+  int i;
+
+  for (i = 0; i < size; i++)
+    p1[i] = ((p2[i] == 0) > (unsigned)(p3[i] == 0)) + (p2[i] == 0);
+}
+
+void __attribute__((noclone,noinline))
+fn3 (int * __restrict__ p1, int * __restrict__ p2, long long * __restrict__ p3, int size)
+{
+  int i;
+
+  for (i = 0; i < size; i++)
+    p1[i] = ((p2[i] == 0) > (unsigned)(p3[i] == 0)) + (p2[i] == 0);
+}
+
+void __attribute__((noclone,noinline))
+fn4 (int * __restrict__ p1, int * __restrict__ p2, int * __restrict__ p3, int size)
+{
+  int i;
+
+  for (i = 0; i < size; i++)
+    p1[i] = ((p2[i] == 0) >= (unsigned)(p3[i] == 0)) + (p2[i] == 0);
+}
+
+void __attribute__((noclone,noinline))
+fn5 (int * __restrict__ p1, int * __restrict__ p2, short * __restrict__ p3, int size)
+{
+  int i;
+
+  for (i = 0; i < size; i++)
+    p1[i] = ((p2[i] == 0) >= (unsigned)(p3[i] == 0)) + (p2[i] == 0);
+}
+
+void __attribute__((noclone,noinline))
+fn6 (int * __restrict__ p1, int * __restrict__ p2, long long * __restrict__ p3, int size)
+{
+  int i;
+
+  for (i = 0; i < size; i++)
+    p1[i] = ((p2[i] == 0) >= (unsigned)(p3[i] == 0)) + (p2[i] == 0);
+}
+
+void __attribute__((noclone,noinline))
+fn7 (int * __restrict__ p1, int * __restrict__ p2, int * __restrict__ p3, int size)
+{
+  int i;
+
+  for (i = 0; i < size; i++)
+    p1[i] = ((p2[i] == 0) < (unsigned)(p3[i] == 0)) + (p2[i] == 0);
+}
+
+void __attribute__((noclone,noinline))
+fn8 (int * __restrict__ p1, int * __restrict__ p2, short * __restrict__ p3, int size)
+{
+  int i;
+
+  for (i = 0; i < size; i++)
+    p1[i] = ((p2[i] == 0) < (unsigned)(p3[i] == 0)) + (p2[i] == 0);
+}
+
+void __attribute__((noclone,noinline))
+fn9 (int * __restrict__ p1, int * __restrict__ p2, long long * __restrict__ p3, int size)
+{
+  int i;
+
+  for (i = 0; i < size; i++)
+    p1[i] = ((p2[i] == 0) < (unsigned)(p3[i] == 0)) + (p2[i] == 0);
+}
+
+void __attribute__((noclone,noinline))
+fn10 (int * __restrict__ p1, int * __restrict__ p2, int * __restrict__ p3, int size)
+{
+  int i;
+
+  for (i = 0; i < size; i++)
+    p1[i] = ((p2[i] == 0) <= (unsigned)(p3[i] == 0)) + (p2[i] == 0);
+}
+
+void __attribute__((noclone,noinline))
+fn11 (int * __restrict__ p1, int * __restrict__ p2, short * __restrict__ p3, int size)
+{
+  int i;
+
+  for (i = 0; i < size; i++)
+    p1[i] = ((p2[i] == 0) <= (unsigned)(p3[i] == 0)) + (p2[i] == 0);
+}
+
+void __attribute__((noclone,noinline))
+fn12 (int