Re: [PATCH, PR68337] Don't fold memcpy/memmove we want to instrument

2015-11-23 Thread Ilya Enkovich
On 23 Nov 10:39, Richard Biener wrote:
> On Fri, Nov 20, 2015 at 3:30 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
> > On 20 Nov 14:54, Richard Biener wrote:
> >> On Fri, Nov 20, 2015 at 2:08 PM, Ilya Enkovich <enkovich@gmail.com> 
> >> wrote:
> >> > On 19 Nov 18:19, Richard Biener wrote:
> >> >> On November 19, 2015 6:12:30 PM GMT+01:00, Bernd Schmidt 
> >> >> <bschm...@redhat.com> wrote:
> >> >> >On 11/19/2015 05:31 PM, Ilya Enkovich wrote:
> >> >> >> Currently we fold all memcpy/memmove calls with a known data size.
> >> >> >> It causes two problems when used with Pointer Bounds Checker.
> >> >> >> The first problem is that we may copy pointers as integer data
> >> >> >> and thus lose bounds.  The second problem is that if we inline
> >> >> >> memcpy, we also have to inline bounds copy and this may result
> >> >> >> in a huge amount of code and significant compilation time growth.
> >> >> >> This patch disables folding for functions we want to instrument.
> >> >> >>
> >> >> >> Does it look reasonable for trunk and GCC5 branch?  Bootstrapped
> >> >> >> and regtested on x86_64-unknown-linux-gnu.
> >> >> >
> >> >> >Can't see anything wrong with it. Ok.
> >> >>
> >> >> But for small sizes this can have a huge impact on optimization.  Which 
> >> >> is why we have the code in the first place.  I'd make the check less 
> >> >> broad, for example inlining copies of size less than a pointer 
> >> >> shouldn't be affected.
> >> >
> >> > Right.  We also may inline in case we know no pointers are copied.  
> >> > Below is a version with extended condition and a couple more tests.  
> >> > Bootstrapped and regtested on x86_64-unknown-linux-gnu.  Is it OK for 
> >> > trunk and gcc-5-branch?
> >> >
> >> >>
> >> >> Richard.
> >> >>
> >> >> >
> >> >> >Bernd
> >> >>
> >> >>
> >> >
> >> > Thanks,
> >> > Ilya
> >> > --
> >> > gcc/
> >> >
> >> > 2015-11-20  Ilya Enkovich  <enkovich@gmail.com>
> >> >
> >> > * gimple-fold.c (gimple_fold_builtin_memory_op): Don't
> >> > fold call if we are going to instrument it and it may
> >> > copy pointers.
> >> >
> >> > gcc/testsuite/
> >> >
> >> > 2015-11-20  Ilya Enkovich  <enkovich@gmail.com>
> >> >
> >> > * gcc.target/i386/mpx/pr68337-1.c: New test.
> >> > * gcc.target/i386/mpx/pr68337-2.c: New test.
> >> > * gcc.target/i386/mpx/pr68337-3.c: New test.
> >> >
> >> >
> >> > diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> >> > index 1ab20d1..dd9f80b 100644
> >> > --- a/gcc/gimple-fold.c
> >> > +++ b/gcc/gimple-fold.c
> >> > @@ -53,6 +53,8 @@ along with GCC; see the file COPYING3.  If not see
> >> >  #include "gomp-constants.h"
> >> >  #include "optabs-query.h"
> >> >  #include "omp-low.h"
> >> > +#include "tree-chkp.h"
> >> > +#include "ipa-chkp.h"
> >> >
> >> >
> >> >  /* Return true when DECL can be referenced from current unit.
> >> > @@ -664,6 +666,23 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator 
> >> > *gsi,
> >> >unsigned int src_align, dest_align;
> >> >tree off0;
> >> >
> >> > +  /* Inlining of memcpy/memmove may cause bounds lost (if we copy
> >> > +pointers as wide integer) and also may result in huge function
> >> > +size because of inlined bounds copy.  Thus don't inline for
> >> > +functions we want to instrument in case pointers are copied.  */
> >> > +  if (flag_check_pointer_bounds
> >> > + && chkp_instrumentable_p (cfun->decl)
> >> > + /* Even if data may contain pointers we can inline if copy
> >> > +less than a pointer size.  */
> >> > + && (!tree_fits_uhwi_p (len)
> >> > + || compare_tree_int (len, POINTER_SIZE_UNITS) >= 0)
> &

Re: [PATCH] Avoid false vector mask conversion

2015-11-23 Thread Ilya Enkovich
Ping

2015-11-13 16:17 GMT+03:00 Ilya Enkovich <enkovich@gmail.com>:
> 2015-11-13 13:03 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>> On Thu, Nov 12, 2015 at 5:08 PM, Ilya Enkovich <enkovich@gmail.com> 
>> wrote:
>>> Hi,
>>>
>>> When we use LTO for Fortran we may have a mix of 32-bit and 1-bit scalar 
>>> booleans.  This means we may have a conversion from one scalar type to 
>>> another, which confuses the vectorizer because values with different 
>>> scalar boolean types may get the same vectype.
>>
>> Confuses aka fails to vectorize?
>
> Right.
>
>>
>>>  This patch transforms such conversions into comparisons.
>>>
>>> I managed to write a small Fortran test which gets vectorized with this 
>>> patch, but I didn't find a way to run a Fortran test with LTO and then scan 
>>> the tree dump to check that it is vectorized.  BTW here is a loop from the test:
>>>
>>>   real*8 a(18)
>>>   logical b(18)
>>>   integer i
>>>
>>>   do i=1,18
>>>      if(a(i).gt.0.d0) then
>>>         b(i)=.true.
>>>      else
>>>         b(i)=.false.
>>>      endif
>>>   enddo
>>
>> This looks like the "error" comes from if-conversion - can't we do
>> better there then?
>
> No, this loop is transformed into a single BB before if-conversion by
> cselim + phiopt.
>
> Ilya
>
>>
>> Richard.
>>
>>> Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?
>>>
>>> Thanks,
>>> Ilya
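
Editorial sketch, not part of the patch: a source-level C analogy of the
Fortran loop above and of "transforming a conversion into a comparison".
The real transformation happens on GIMPLE inside the vectorizer; here the
wide boolean is modeled as int32_t and the comparison as "!= 0", both of
which are assumptions, and the names FLAG_COUNT, set_flags and narrow_flag
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLAG_COUNT 18

/* C rendering of the loop: b(i) is true exactly when a(i) > 0.
   The 32-bit flag type is an assumption standing in for a wide
   Fortran logical.  */
static void
set_flags (const double *a, int32_t *b)
{
  for (int i = 0; i < FLAG_COUNT; i++)
    b[i] = a[i] > 0.0;
}

/* A conversion between two boolean representations written as a
   comparison instead of a plain type conversion.  */
static bool
narrow_flag (int32_t wide)
{
  return wide != 0;
}

/* Small self-check using two sample values.  */
static void
check_flags (void)
{
  double a[FLAG_COUNT] = { 1.5, -2.0 };
  int32_t b[FLAG_COUNT];

  set_flags (a, b);
  assert (narrow_flag (b[0]));
  assert (!narrow_flag (b[1]));
}
```

Calling check_flags () from a test driver exercises both helpers.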


Re: [PATCH, PR68337] Don't fold memcpy/memmove we want to instrument

2015-11-23 Thread Ilya Enkovich
On 23 Nov 14:29, Richard Biener wrote:
> On Mon, Nov 23, 2015 at 12:33 PM, Ilya Enkovich <enkovich@gmail.com> 
> wrote:
> >
> > I see.  But it should still be OK to check type in case of strict aliasing, 
> > right?
> 
> No, memcpy is always "no-strict-aliasing"
> 

Thanks a lot for the help!  Here is a variant with a size check only, as
you originally suggested.  Is it OK for trunk and gcc-5-branch if there are
no regressions?

Thanks,
Ilya
--
gcc/

2015-11-23  Ilya Enkovich  <enkovich@gmail.com>

* gimple-fold.c: Include ipa-chkp.h.
(gimple_fold_builtin_memory_op): Don't fold call if we
are going to instrument it and it may copy pointers.

gcc/testsuite/

2015-11-23  Ilya Enkovich  <enkovich@gmail.com>

* gcc.target/i386/mpx/pr68337-1.c: New test.
* gcc.target/i386/mpx/pr68337-2.c: New test.


diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 1ab20d1..6ff5e26 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gomp-constants.h"
 #include "optabs-query.h"
 #include "omp-low.h"
+#include "ipa-chkp.h"
 
 
 /* Return true when DECL can be referenced from current unit.
@@ -664,6 +665,18 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
   unsigned int src_align, dest_align;
   tree off0;
 
+  /* Inlining of memcpy/memmove may cause bounds lost (if we copy
+     pointers as wide integer) and also may result in huge function
+     size because of inlined bounds copy.  Thus don't inline for
+     functions we want to instrument.  */
+  if (flag_check_pointer_bounds
+      && chkp_instrumentable_p (cfun->decl)
+      /* Even if data may contain pointers we can inline if copy
+         less than a pointer size.  */
+      && (!tree_fits_uhwi_p (len)
+          || compare_tree_int (len, POINTER_SIZE_UNITS) >= 0))
+    return false;
+
   /* Build accesses at offset zero with a ref-all character type.  */
   off0 = build_int_cst (build_pointer_type_for_mode (char_type_node,
                                                      ptr_mode, true), 0);
diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c 
b/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c
new file mode 100644
index 000..3f8d79d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+#include "mpx-check.h"
+
+#define N 2
+
+extern void abort ();
+
+static int
+mpx_test (int argc, const char **argv)
+{
+  char ** src = (char **)malloc (sizeof (char *) * N);
+  char ** dst = (char **)malloc (sizeof (char *) * N);
+  int i;
+
+  for (i = 0; i < N; i++)
+    src[i] = __bnd_set_ptr_bounds (argv[0] + i, i + 1);
+
+  __builtin_memcpy(dst, src, sizeof (char *) * N);
+
+  for (i = 0; i < N; i++)
+    {
+      char *p = dst[i];
+      if (p != argv[0] + i
+          || __bnd_get_ptr_lbound (p) != p
+          || __bnd_get_ptr_ubound (p) != p + i)
+        abort ();
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr68337-2.c 
b/gcc/testsuite/gcc.target/i386/mpx/pr68337-2.c
new file mode 100644
index 000..8845cca
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/pr68337-2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+/* { dg-final { scan-assembler-not "memcpy" } } */
+
+void
+test (void *dst, void *src)
+{
+  __builtin_memcpy (dst, src, sizeof (char *) / 2);
+}
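
Editorial sketch, not part of the patch: the condition added to
gimple_fold_builtin_memory_op in the hunk above can be modeled in plain C
roughly as follows.  The names copy_query, must_keep_call and
check_fold_cases are hypothetical; the struct fields only stand in for
flag_check_pointer_bounds, chkp_instrumentable_p and tree_fits_uhwi_p, and
sizeof (void *) is assumed in place of POINTER_SIZE_UNITS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the compiler state consulted by the patch.  */
struct copy_query
{
  bool instrumenting;   /* Models flag_check_pointer_bounds
                           && chkp_instrumentable_p (cfun->decl).  */
  bool len_is_constant; /* Models tree_fits_uhwi_p (len).  */
  size_t len;           /* Copy length in bytes if len_is_constant.  */
};

/* Return true when folding is skipped and the memcpy/memmove call is
   kept as a call.  */
static bool
must_keep_call (const struct copy_query *q)
{
  return q->instrumenting
         && (!q->len_is_constant || q->len >= sizeof (void *));
}

/* The cases discussed in the thread.  */
static void
check_fold_cases (void)
{
  struct copy_query two_ptrs = { true, true, 2 * sizeof (void *) };
  struct copy_query half_ptr = { true, true, sizeof (void *) / 2 };
  struct copy_query unknown = { true, false, 0 };
  struct copy_query no_mpx = { false, true, 2 * sizeof (void *) };

  assert (must_keep_call (&two_ptrs));   /* As in pr68337-1.c.  */
  assert (!must_keep_call (&half_ptr));  /* As in pr68337-2.c.  */
  assert (must_keep_call (&unknown));    /* Unknown length.  */
  assert (!must_keep_call (&no_mpx));    /* No instrumentation.  */
}
```

When must_keep_call returns false, the model says nothing more than that
the usual folding logic gets a chance to run.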


Re: [PATCH, PR68337] Don't fold memcpy/memmove we want to instrument

2015-11-20 Thread Ilya Enkovich
On 20 Nov 14:54, Richard Biener wrote:
> On Fri, Nov 20, 2015 at 2:08 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
> > On 19 Nov 18:19, Richard Biener wrote:
> >> On November 19, 2015 6:12:30 PM GMT+01:00, Bernd Schmidt 
> >> <bschm...@redhat.com> wrote:
> >> >On 11/19/2015 05:31 PM, Ilya Enkovich wrote:
> >> >> Currently we fold all memcpy/memmove calls with a known data size.
> >> >> It causes two problems when used with Pointer Bounds Checker.
> >> >> The first problem is that we may copy pointers as integer data
> >> >> and thus lose bounds.  The second problem is that if we inline
> >> >> memcpy, we also have to inline bounds copy and this may result
> >> >> in a huge amount of code and significant compilation time growth.
> >> >> This patch disables folding for functions we want to instrument.
> >> >>
> >> >> Does it look reasonable for trunk and GCC5 branch?  Bootstrapped
> >> >> and regtested on x86_64-unknown-linux-gnu.
> >> >
> >> >Can't see anything wrong with it. Ok.
> >>
> >> But for small sizes this can have a huge impact on optimization.  Which is 
> >> why we have the code in the first place.  I'd make the check less broad, 
> >> for example inlining copies of size less than a pointer shouldn't be 
> >> affected.
> >
> > Right.  We also may inline in case we know no pointers are copied.  Below 
> > is a version with extended condition and a couple more tests.  Bootstrapped 
> > and regtested on x86_64-unknown-linux-gnu.  Is it OK for trunk and 
> > gcc-5-branch?
> >
> >>
> >> Richard.
> >>
> >> >
> >> >Bernd
> >>
> >>
> >
> > Thanks,
> > Ilya
> > --
> > gcc/
> >
> > 2015-11-20  Ilya Enkovich  <enkovich@gmail.com>
> >
> > * gimple-fold.c (gimple_fold_builtin_memory_op): Don't
> > fold call if we are going to instrument it and it may
> > copy pointers.
> >
> > gcc/testsuite/
> >
> > 2015-11-20  Ilya Enkovich  <enkovich@gmail.com>
> >
> > * gcc.target/i386/mpx/pr68337-1.c: New test.
> > * gcc.target/i386/mpx/pr68337-2.c: New test.
> > * gcc.target/i386/mpx/pr68337-3.c: New test.
> >
> >
> > diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> > index 1ab20d1..dd9f80b 100644
> > --- a/gcc/gimple-fold.c
> > +++ b/gcc/gimple-fold.c
> > @@ -53,6 +53,8 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "gomp-constants.h"
> >  #include "optabs-query.h"
> >  #include "omp-low.h"
> > +#include "tree-chkp.h"
> > +#include "ipa-chkp.h"
> >
> >
> >  /* Return true when DECL can be referenced from current unit.
> > @@ -664,6 +666,23 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator 
> > *gsi,
> >unsigned int src_align, dest_align;
> >tree off0;
> >
> > +  /* Inlining of memcpy/memmove may cause bounds lost (if we copy
> > +pointers as wide integer) and also may result in huge function
> > +size because of inlined bounds copy.  Thus don't inline for
> > +functions we want to instrument in case pointers are copied.  */
> > +  if (flag_check_pointer_bounds
> > + && chkp_instrumentable_p (cfun->decl)
> > + /* Even if data may contain pointers we can inline if copy
> > +less than a pointer size.  */
> > + && (!tree_fits_uhwi_p (len)
> > + || compare_tree_int (len, POINTER_SIZE_UNITS) >= 0)
> 
> || tree_to_uhwi (len) >= POINTER_SIZE_UNITS
> 
> > + /* Check data type for pointers.  */
> > + && (!TREE_TYPE (src)
> > + || !TREE_TYPE (TREE_TYPE (src))
> > + || VOID_TYPE_P (TREE_TYPE (TREE_TYPE (src)))
> > + || chkp_type_has_pointer (TREE_TYPE (TREE_TYPE (src)))))
> 
> I don't think you can in any way rely on the pointer type of the src argument
> as all pointer conversions are useless and memcpy and friends take void *
> anyway.

This check looks for cases where we have type information indicating that
no pointers are copied.  In the case of 'void *' we have to assume pointers
are copied, so inlining is undesired.  Test pr68337-2.c checks that a suitable
pointer type enables inlining.  It looks like this check is missing
|| !COMPLETE_TYPE_P (TREE_TYPE (TREE_TYPE (src)))?

> 
> Note that you also disable memmove to memcpy simplification with this
> early check.

Doesn't matter for MPX which uses the same implementation for both cases.

> 
> Where is pointer transfer handled for MPX?  I suppose it's not done
> transparently
> for all memory move instructions but explicitly by instrumented block copy
> routines in libmpx?  In which case how does that identify pointers vs.
> non-pointers?

It is handled by the instrumentation pass.  The compiler checks the type of
stored data to find pointer stores.  Each pointer store is instrumented with
a bndstx call.

MPX versions of memcpy, memmove etc. don't make any assumptions about the
type of the copied data and just copy the whole chunk of bounds metadata
corresponding to the copied block.

Thanks,
Ilya

> 
> Richard.
> 
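
Editorial sketch, not part of the thread: a toy C model of the two copy
strategies described above.  It is only an illustration under loud
assumptions: real MPX bounds live in hardware-managed tables, whereas here
every pointer-sized slot simply has a side record, and all names (toy_mem,
bounds, store_ptr, copy_as_integers, copy_with_bounds) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLOTS 4

/* Hypothetical bounds record attached to one pointer-sized slot.  */
struct bounds
{
  uintptr_t lb, ub;
};

/* Toy memory: pointer-sized slots plus one bounds record per slot.  */
struct toy_mem
{
  uintptr_t data[SLOTS];
  struct bounds meta[SLOTS];
};

/* Models an instrumented pointer store: the value and its bounds are
   written together.  */
static void
store_ptr (struct toy_mem *m, int slot, uintptr_t p, struct bounds b)
{
  m->data[slot] = p;
  m->meta[slot] = b;
}

/* Models a copy inlined as integer data: values move, bounds do not.  */
static void
copy_as_integers (struct toy_mem *dst, const struct toy_mem *src, int n)
{
  for (int i = 0; i < n; i++)
    dst->data[i] = src->data[i];
}

/* Models a type-agnostic block copy: the data and the matching chunk of
   metadata are both copied, whatever the element type is.  */
static void
copy_with_bounds (struct toy_mem *dst, const struct toy_mem *src, int n)
{
  memcpy (dst->data, src->data, (size_t) n * sizeof src->data[0]);
  memcpy (dst->meta, src->meta, (size_t) n * sizeof src->meta[0]);
}

/* Shows that only the second strategy keeps the bounds.  */
static void
check_copies (void)
{
  struct toy_mem src, a, b;
  struct bounds bnd = { 100, 107 };

  memset (&src, 0, sizeof src);
  memset (&a, 0, sizeof a);
  memset (&b, 0, sizeof b);
  store_ptr (&src, 0, 100, bnd);

  copy_as_integers (&a, &src, SLOTS);
  assert (a.data[0] == 100 && a.meta[0].ub == 0);   /* Bounds lost.  */

  copy_with_bounds (&b, &src, SLOTS);
  assert (b.data[0] == 100 && b.meta[0].ub == 107); /* Bounds kept.  */
}
```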


Re: [PATCH, PR tree-optimization/68327] Compute vectype for live phi nodes when computing VF

2015-11-20 Thread Ilya Enkovich
On 20 Nov 14:31, Ilya Enkovich wrote:
> 2015-11-20 14:28 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> > On Wed, Nov 18, 2015 at 2:53 PM, Ilya Enkovich <enkovich@gmail.com> 
> > wrote:
> >> 2015-11-18 16:44 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> >>> On Wed, Nov 18, 2015 at 12:34 PM, Ilya Enkovich <enkovich@gmail.com> 
> >>> wrote:
> >>>> Hi,
> >>>>
> >>>> When we compute vectypes we skip non-relevant phi nodes.  But we process 
> >>>> non-relevant live statements and thus may need the vectype of a 
> >>>> non-relevant live phi node to compute the mask vectype.  This patch 
> >>>> enables vectype computation for live phi nodes.  Bootstrapped and 
> >>>> regtested on x86_64-unknown-linux-gnu.  OK for trunk?
> >>>
> >>> Hmm.  What breaks if you instead skip all !relevant stmts and not
> >>> compute vectype for live but not relevant ones?  We won't ever
> >>> "vectorize" !relevant ones, that is, we don't need their vector type.
> >>
> >> I tried it and got regression in SLP.  It expected non-null vectype
> >> for non-relevant but live statement. Regression was in
> >> gcc/gcc/testsuite/gfortran.fortran-torture/execute/pr43390.f90
> >
> > Because somebody put a vector type check before
> >
> >   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> > return false;
> >
> > @@ -7590,6 +7651,9 @@ vectorizable_comparison (gimple *stmt, g
> >tree mask_type;
> >tree mask;
> >
> > +  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> > +return false;
> > +
> >if (!VECTOR_BOOLEAN_TYPE_P (vectype))
> >  return false;
> >
> > @@ -7602,8 +7666,6 @@ vectorizable_comparison (gimple *stmt, g
> >  ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
> >
> >gcc_assert (ncopies >= 1);
> > -  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> > -return false;
> >
> >if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
> >&& !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
> >
> > fixes this particular fallout for me.
> 
> I'll try it.

With this fix it works fine, thanks!  Bootstrapped and regtested on 
x86_64-unknown-linux-gnu.  OK for trunk?

Ilya
--
gcc/

2015-11-20  Ilya Enkovich  <enkovich@gmail.com>
Richard Biener  <rguent...@suse.de>

* tree-vect-loop.c (vect_determine_vectorization_factor): Don't
compute vectype for non-relevant mask producers.
* tree-vect-stmts.c (vectorizable_comparison): Check stmt
relevance earlier.

gcc/testsuite/

2015-11-20  Ilya Enkovich  <enkovich@gmail.com>

* gcc.dg/pr68327.c: New test.


diff --git a/gcc/testsuite/gcc.dg/pr68327.c b/gcc/testsuite/gcc.dg/pr68327.c
new file mode 100644
index 000..c3e6a94
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68327.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int a, d;
+char b, c;
+
+void
+fn1 ()
+{
+  int i = 0;
+  for (; i < 1; i++)
+    d = 1;
+  for (; b; b++)
+    a = 1 && (d & b);
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 80937ec..592372d 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -439,7 +439,8 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
 compute a factor.  */
  if (TREE_CODE (scalar_type) == BOOLEAN_TYPE)
{
- mask_producers.safe_push (stmt_info);
+ if (STMT_VINFO_RELEVANT_P (stmt_info))
+   mask_producers.safe_push (stmt_info);
  bool_result = true;
 
  if (gimple_code (stmt) == GIMPLE_ASSIGN
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 0f64aaf..3723b26 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7546,6 +7546,9 @@ vectorizable_comparison (gimple *stmt, 
gimple_stmt_iterator *gsi,
   tree mask_type;
   tree mask;
 
+  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
+    return false;
+
   if (!VECTOR_BOOLEAN_TYPE_P (vectype))
     return false;
 
@@ -7558,9 +7561,6 @@ vectorizable_comparison (gimple *stmt, 
gimple_stmt_iterator *gsi,
 ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
 
   gcc_assert (ncopies >= 1);
-  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
-    return false;
-
   if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
       && !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
            && reduc_def))


Re: [PATCH, PR tree-optimization/68327] Compute vectype for live phi nodes when computing VF

2015-11-20 Thread Ilya Enkovich
2015-11-20 14:28 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Wed, Nov 18, 2015 at 2:53 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
>> 2015-11-18 16:44 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>> On Wed, Nov 18, 2015 at 12:34 PM, Ilya Enkovich <enkovich@gmail.com> 
>>> wrote:
>>>> Hi,
>>>>
>>>> When we compute vectypes we skip non-relevant phi nodes.  But we process 
>>>> non-relevant live statements and thus may need the vectype of a 
>>>> non-relevant live phi node to compute the mask vectype.  This patch 
>>>> enables vectype computation for live phi nodes.  Bootstrapped and 
>>>> regtested on x86_64-unknown-linux-gnu.  OK for trunk?
>>>
>>> Hmm.  What breaks if you instead skip all !relevant stmts and not
>>> compute vectype for live but not relevant ones?  We won't ever
>>> "vectorize" !relevant ones, that is, we don't need their vector type.
>>
>> I tried it and got regression in SLP.  It expected non-null vectype
>> for non-relevant but live statement. Regression was in
>> gcc/gcc/testsuite/gfortran.fortran-torture/execute/pr43390.f90
>
> Because somebody put a vector type check before
>
>   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> return false;
>
> @@ -7590,6 +7651,9 @@ vectorizable_comparison (gimple *stmt, g
>tree mask_type;
>tree mask;
>
> +  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> +return false;
> +
>if (!VECTOR_BOOLEAN_TYPE_P (vectype))
>  return false;
>
> @@ -7602,8 +7666,6 @@ vectorizable_comparison (gimple *stmt, g
>  ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
>
>gcc_assert (ncopies >= 1);
> -  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> -return false;
>
>if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
>&& !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
>
> fixes this particular fallout for me.

I'll try it.

Thanks,
Ilya

>
> Richard.
>
>> Ilya
>>
>>>
>>> Richard.
>>>
>>>> Thanks,
>>>> Ilya


Re: [PATCH, PR68337] Don't fold memcpy/memmove we want to instrument

2015-11-20 Thread Ilya Enkovich
On 19 Nov 18:19, Richard Biener wrote:
> On November 19, 2015 6:12:30 PM GMT+01:00, Bernd Schmidt 
> <bschm...@redhat.com> wrote:
> >On 11/19/2015 05:31 PM, Ilya Enkovich wrote:
> >> Currently we fold all memcpy/memmove calls with a known data size.
> >> It causes two problems when used with Pointer Bounds Checker.
> >> The first problem is that we may copy pointers as integer data
> >> and thus lose bounds.  The second problem is that if we inline
> >> memcpy, we also have to inline bounds copy and this may result
> >> in a huge amount of code and significant compilation time growth.
> >> This patch disables folding for functions we want to instrument.
> >>
> >> Does it look reasonable for trunk and GCC5 branch?  Bootstrapped
> >> and regtested on x86_64-unknown-linux-gnu.
> >
> >Can't see anything wrong with it. Ok.
> 
> But for small sizes this can have a huge impact on optimization.  Which is 
> why we have the code in the first place.  I'd make the check less broad, for 
> example inlining copies of size less than a pointer shouldn't be affected.

Right.  We also may inline in case we know no pointers are copied.  Below is a 
version with extended condition and a couple more tests.  Bootstrapped and 
regtested on x86_64-unknown-linux-gnu.  Is it OK for trunk and gcc-5-branch?

> 
> Richard.
> 
> >
> >Bernd
> 
> 

Thanks,
Ilya
--
gcc/

2015-11-20  Ilya Enkovich  <enkovich@gmail.com>

* gimple-fold.c (gimple_fold_builtin_memory_op): Don't
fold call if we are going to instrument it and it may
copy pointers.

gcc/testsuite/

2015-11-20  Ilya Enkovich  <enkovich@gmail.com>

* gcc.target/i386/mpx/pr68337-1.c: New test.
* gcc.target/i386/mpx/pr68337-2.c: New test.
* gcc.target/i386/mpx/pr68337-3.c: New test.


diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 1ab20d1..dd9f80b 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -53,6 +53,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "gomp-constants.h"
 #include "optabs-query.h"
 #include "omp-low.h"
+#include "tree-chkp.h"
+#include "ipa-chkp.h"
 
 
 /* Return true when DECL can be referenced from current unit.
@@ -664,6 +666,23 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
   unsigned int src_align, dest_align;
   tree off0;
 
+  /* Inlining of memcpy/memmove may cause bounds lost (if we copy
+     pointers as wide integer) and also may result in huge function
+     size because of inlined bounds copy.  Thus don't inline for
+     functions we want to instrument in case pointers are copied.  */
+  if (flag_check_pointer_bounds
+      && chkp_instrumentable_p (cfun->decl)
+      /* Even if data may contain pointers we can inline if copy
+         less than a pointer size.  */
+      && (!tree_fits_uhwi_p (len)
+          || compare_tree_int (len, POINTER_SIZE_UNITS) >= 0)
+      /* Check data type for pointers.  */
+      && (!TREE_TYPE (src)
+          || !TREE_TYPE (TREE_TYPE (src))
+          || VOID_TYPE_P (TREE_TYPE (TREE_TYPE (src)))
+          || chkp_type_has_pointer (TREE_TYPE (TREE_TYPE (src)))))
+    return false;
+
   /* Build accesses at offset zero with a ref-all character type.  */
   off0 = build_int_cst (build_pointer_type_for_mode (char_type_node,
                                                      ptr_mode, true), 0);
diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c 
b/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c
new file mode 100644
index 000..3f8d79d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+#include "mpx-check.h"
+
+#define N 2
+
+extern void abort ();
+
+static int
+mpx_test (int argc, const char **argv)
+{
+  char ** src = (char **)malloc (sizeof (char *) * N);
+  char ** dst = (char **)malloc (sizeof (char *) * N);
+  int i;
+
+  for (i = 0; i < N; i++)
+    src[i] = __bnd_set_ptr_bounds (argv[0] + i, i + 1);
+
+  __builtin_memcpy(dst, src, sizeof (char *) * N);
+
+  for (i = 0; i < N; i++)
+    {
+      char *p = dst[i];
+      if (p != argv[0] + i
+          || __bnd_get_ptr_lbound (p) != p
+          || __bnd_get_ptr_ubound (p) != p + i)
+        abort ();
+    }
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr68337-2.c 
b/gcc/testsuite/gcc.target/i386/mpx/pr68337-2.c
new file mode 100644
index 000..16736b4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/pr68337-2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+/* { dg-final { scan-assembler-not "

[PATCH, PR68337] Don't fold memcpy/memmove we want to instrument

2015-11-19 Thread Ilya Enkovich
Hi,

Currently we fold all memcpy/memmove calls with a known data size.
It causes two problems when used with Pointer Bounds Checker.
The first problem is that we may copy pointers as integer data
and thus lose bounds.  The second problem is that if we inline
memcpy, we also have to inline bounds copy and this may result
in a huge amount of code and significant compilation time growth.
This patch disables folding for functions we want to instrument.

Does it look reasonable for trunk and GCC5 branch?  Bootstrapped
and regtested on x86_64-unknown-linux-gnu.

Thanks,
Ilya
--
gcc/

2015-11-19  Ilya Enkovich  <enkovich@gmail.com>

* gimple-fold.c (gimple_fold_builtin_memory_op): Don't
fold non-useless call if we are going to instrument it.

gcc/testsuite/

2015-11-19  Ilya Enkovich  <enkovich@gmail.com>

* gcc.target/i386/mpx/pr68337.c: New test.


diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 1ab20d1..b3a1229 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gomp-constants.h"
 #include "optabs-query.h"
 #include "omp-low.h"
+#include "ipa-chkp.h"
 
 
 /* Return true when DECL can be referenced from current unit.
@@ -664,6 +665,13 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
   unsigned int src_align, dest_align;
   tree off0;
 
+  /* Inlining of memcpy/memmove may cause bounds lost (if we copy
+     pointers as wide integer) and also may result in huge function
+     size because of inlined bounds copy.  Thus don't inline for
+     functions we want to instrument.  */
+  if (flag_check_pointer_bounds && chkp_instrumentable_p (cfun->decl))
+    return false;
+
   /* Build accesses at offset zero with a ref-all character type.  */
   off0 = build_int_cst (build_pointer_type_for_mode (char_type_node,
                                                      ptr_mode, true), 0);
diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr68337.c 
b/gcc/testsuite/gcc.target/i386/mpx/pr68337.c
new file mode 100644
index 000..3f8d79d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/pr68337.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+#include "mpx-check.h"
+
+#define N 2
+
+extern void abort ();
+
+static int
+mpx_test (int argc, const char **argv)
+{
+  char ** src = (char **)malloc (sizeof (char *) * N);
+  char ** dst = (char **)malloc (sizeof (char *) * N);
+  int i;
+
+  for (i = 0; i < N; i++)
+    src[i] = __bnd_set_ptr_bounds (argv[0] + i, i + 1);
+
+  __builtin_memcpy(dst, src, sizeof (char *) * N);
+
+  for (i = 0; i < N; i++)
+    {
+      char *p = dst[i];
+      if (p != argv[0] + i
+          || __bnd_get_ptr_lbound (p) != p
+          || __bnd_get_ptr_ubound (p) != p + i)
+        abort ();
+    }
+
+  return 0;
+}


Re: [PATCH] Get rid of insn-codes.h in optabs-tree.c

2015-11-19 Thread Ilya Enkovich
On 19 Nov 16:46, Bernd Schmidt wrote:
> On 11/19/2015 03:28 PM, Ilya Enkovich wrote:
> >This is a refactoring patch discussed in another thread [1].  It gets
> >rid of CODE_FOR_nothing usage in optabs-tree.c by introducing boolean
> >predicates in optabs-query.  Bootstrapped and regtested on
> >x86_64-unknown-linux-gnu.
> 
> Looks pretty reasonable, but I think we have to start saying "not now" after
> the end of stage 1.

I am sending it now because Jeff considered this patch in early stage3.  I can 
also commit it in the next stage1 instead.

> 
> >
> >+/* Return 1 id there is a valid insn code to convert fixed-point mode
> 
> "true", not "1" (elsewhere too), and "if".
> 
> >+{
> >+  return get_fix_icode (fixmode, fltmode, unsignedp, truncp_ptr)
> >+    != CODE_FOR_nothing;
> 
> Formatting.

Thanks! Below is a fixed version.

Ilya

> 
> 
> Bernd


--
gcc/

2015-11-19  Ilya Enkovich  <enkovich@gmail.com>

* optabs-query.h (get_vec_cmp_icode): Remove 'static'.
(get_vcond_mask_icode): Likewise.
(get_extend_icode): New.
(get_float_icode): New.
(get_fix_icode): New.
(can_extend_p): Return bool.
(can_float_p): Return bool.
(can_fix_p): Return bool.
(can_vec_cmp_p): New.
(can_vcond_p): New.
(can_vcond_mask_p): New.
* optabs-query.c (get_extend_icode): New.
(can_extend_p): Return bool.
(get_float_icode): New.
(can_float_p): Return bool.
(get_fix_icode): New.
(can_fix_p): Return bool.
(can_vec_cmp_p): New.
(can_vcond_p): New.
(can_vcond_mask_p): New.
* expr.c (init_expr_target): Use get_extend_icode and
adjust to new can_extend_p return type.
(convert_move): Likewise.
(compress_float_constant): Likewise.
* function.c (assign_parm_setup_reg): Likewise.
* optabs-tree.c: Don't include insn-codes.h.
(supportable_convert_operation): Adjust to can_fix_p and
can_float_p new return types.
* optabs.c (gen_extend_insn): Use get_extend_icode.
(expand_float): Use get_float_icode and adjust to can_float_p
new return type.
(expand_fix): Use get_fix_icode and adjust to can_fix_p
new return type.
* tree-vrp.c (simplify_float_conversion_using_ranges): Adjust
to can_float_p new return type.


diff --git a/gcc/expr.c b/gcc/expr.c
index bd43dc4..f4c06a1 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -223,7 +223,7 @@ init_expr_target (void)
{
  enum insn_code ic;
 
- ic = can_extend_p (mode, srcmode, 0);
+ ic = get_extend_icode (mode, srcmode, 0);
  if (ic == CODE_FOR_nothing)
continue;
 
@@ -452,7 +452,7 @@ convert_move (rtx to, rtx from, int unsignedp)
   int nwords = CEIL (GET_MODE_SIZE (to_mode), UNITS_PER_WORD);
 
   /* Try converting directly if the insn is supported.  */
-  if ((code = can_extend_p (to_mode, from_mode, unsignedp))
+  if ((code = get_extend_icode (to_mode, from_mode, unsignedp))
  != CODE_FOR_nothing)
{
  /* If FROM is a SUBREG, put it into a register.  Do this
@@ -466,7 +466,7 @@ convert_move (rtx to, rtx from, int unsignedp)
}
   /* Next, try converting via full word.  */
   else if (GET_MODE_PRECISION (from_mode) < BITS_PER_WORD
-  && ((code = can_extend_p (to_mode, word_mode, unsignedp))
+  && ((code = get_extend_icode (to_mode, word_mode, unsignedp))
   != CODE_FOR_nothing))
{
  rtx word_to = gen_reg_rtx (word_mode);
@@ -573,7 +573,7 @@ convert_move (rtx to, rtx from, int unsignedp)
   if (GET_MODE_PRECISION (to_mode) > GET_MODE_PRECISION (from_mode))
 {
   /* Convert directly if that works.  */
-  if ((code = can_extend_p (to_mode, from_mode, unsignedp))
+  if ((code = get_extend_icode (to_mode, from_mode, unsignedp))
  != CODE_FOR_nothing)
{
  emit_unop_insn (code, to, from, equiv_code);
@@ -588,12 +588,10 @@ convert_move (rtx to, rtx from, int unsignedp)
  /* Search for a mode to convert via.  */
  for (intermediate = from_mode; intermediate != VOIDmode;
   intermediate = GET_MODE_WIDER_MODE (intermediate))
-   if (((can_extend_p (to_mode, intermediate, unsignedp)
- != CODE_FOR_nothing)
+   if ((can_extend_p (to_mode, intermediate, unsignedp)
 || (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (intermediate)
 && TRULY_NOOP_TRUNCATION_MODES_P (to_mode, intermediate)))
-   && (can_extend_p (intermediate, from_mode, unsignedp)
-   != CODE_FOR_nothing))
+   && can_extend_p (intermediate, from_mode, unsignedp))
  {

[PATCH] Get rid of insn-codes.h in optabs-tree.c

2015-11-19 Thread Ilya Enkovich
Hi,

This is a refactoring patch discussed in another thread [1].  It gets rid of 
CODE_FOR_nothing usage in optabs-tree.c by introducing boolean predicates in 
optabs-query.  Bootstrapped and regtested on x86_64-unknown-linux-gnu.

Thanks,
Ilya

[1] - https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02973.html
--
gcc/

2015-11-19  Ilya Enkovich  <enkovich@gmail.com>

* optabs-query.h (get_vec_cmp_icode): Remove 'static'.
(get_vcond_mask_icode): Likewise.
(get_extend_icode): New.
(get_float_icode): New.
(get_fix_icode): New.
(can_extend_p): Return bool.
(can_float_p): Return bool.
(can_fix_p): Return bool.
(can_vec_cmp_p): New.
(can_vcond_p): New.
(can_vcond_mask_p): New.
* optabs-query.c (get_extend_icode): New.
(can_extend_p): Return bool.
(get_float_icode): New.
(can_float_p): Return bool.
(get_fix_icode): New.
(can_fix_p): Return bool.
(can_vec_cmp_p): New.
(can_vcond_p): New.
(can_vcond_mask_p): New.
* expr.c (init_expr_target): Use get_extend_icode and
adjust to new can_extend_p return type.
(convert_move): Likewise.
(compress_float_constant): Likewise.
* function.c (assign_parm_setup_reg): Likewise.
* optabs-tree.c: Don't include insn-codes.h.
(supportable_convert_operation): Adjust to can_fix_p and
can_float_p new return types.
* optabs.c (gen_extend_insn): Use get_extend_icode.
(expand_float): Use get_float_icode and adjust to can_float_p
new return type.
(expand_fix): Use get_fix_icode and adjust to can_fix_p
new return type.
* tree-vrp.c (simplify_float_conversion_using_ranges): Adjust
to can_float_p new return type.


diff --git a/gcc/expr.c b/gcc/expr.c
index bd43dc4..f4c06a1 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -223,7 +223,7 @@ init_expr_target (void)
{
  enum insn_code ic;
 
- ic = can_extend_p (mode, srcmode, 0);
+ ic = get_extend_icode (mode, srcmode, 0);
  if (ic == CODE_FOR_nothing)
continue;
 
@@ -452,7 +452,7 @@ convert_move (rtx to, rtx from, int unsignedp)
   int nwords = CEIL (GET_MODE_SIZE (to_mode), UNITS_PER_WORD);
 
   /* Try converting directly if the insn is supported.  */
-  if ((code = can_extend_p (to_mode, from_mode, unsignedp))
+  if ((code = get_extend_icode (to_mode, from_mode, unsignedp))
  != CODE_FOR_nothing)
{
  /* If FROM is a SUBREG, put it into a register.  Do this
@@ -466,7 +466,7 @@ convert_move (rtx to, rtx from, int unsignedp)
}
   /* Next, try converting via full word.  */
   else if (GET_MODE_PRECISION (from_mode) < BITS_PER_WORD
-  && ((code = can_extend_p (to_mode, word_mode, unsignedp))
+  && ((code = get_extend_icode (to_mode, word_mode, unsignedp))
   != CODE_FOR_nothing))
{
  rtx word_to = gen_reg_rtx (word_mode);
@@ -573,7 +573,7 @@ convert_move (rtx to, rtx from, int unsignedp)
   if (GET_MODE_PRECISION (to_mode) > GET_MODE_PRECISION (from_mode))
 {
   /* Convert directly if that works.  */
-  if ((code = can_extend_p (to_mode, from_mode, unsignedp))
+  if ((code = get_extend_icode (to_mode, from_mode, unsignedp))
  != CODE_FOR_nothing)
{
  emit_unop_insn (code, to, from, equiv_code);
@@ -588,12 +588,10 @@ convert_move (rtx to, rtx from, int unsignedp)
  /* Search for a mode to convert via.  */
  for (intermediate = from_mode; intermediate != VOIDmode;
   intermediate = GET_MODE_WIDER_MODE (intermediate))
-   if (((can_extend_p (to_mode, intermediate, unsignedp)
- != CODE_FOR_nothing)
+   if ((can_extend_p (to_mode, intermediate, unsignedp)
 || (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (intermediate)
 && TRULY_NOOP_TRUNCATION_MODES_P (to_mode, intermediate)))
-   && (can_extend_p (intermediate, from_mode, unsignedp)
-   != CODE_FOR_nothing))
+   && can_extend_p (intermediate, from_mode, unsignedp))
  {
convert_move (to, convert_to_mode (intermediate, from,
   unsignedp), unsignedp);
@@ -3638,7 +3636,7 @@ compress_float_constant (rtx x, rtx y)
   rtx_insn *last_insn;
 
   /* Skip if the target can't extend this way.  */
-  ic = can_extend_p (dstmode, srcmode, 0);
+  ic = get_extend_icode (dstmode, srcmode, 0);
   if (ic == CODE_FOR_nothing)
continue;
 
diff --git a/gcc/function.c b/gcc/function.c
index afc2c87..1be96dc 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -3143,8 +3143,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
   enum insn_code ic

Re: x32 libraries

2015-11-18 Thread Ilya Enkovich
For libmpx the reason is that MPX ISA doesn't support x32. I.e. MPX
instructions can't use 32bit bounds in 64bit mode.

Ilya

2015-11-18 9:34 GMT+03:00 Ulrich Drepper :
> Is there a reason why libmpx and libgccjit aren't built for x32?  This
> is in the case when building IA-32, x86-64, and x32 all together.
> Haven't tested any other way to build.  I suspect it's just an
> oversight in the way configuration works since I cannot see a
> technical reason.


[PATCH, PR target/68405, i386, committed] Add missing break

2015-11-18 Thread Ilya Enkovich
Hi,

This patch adds a missing break in ix86_expand_mask_vec_cmp.  Bootstrapped and 
tested on x86_64-unknown-linux-gnu.  Committed to trunk as obvious.
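
For readers unfamiliar with this class of bug, a minimal sketch of the
fall-through follows.  The enum and function names are made up for
illustration and only mirror the structure of the switch in the patch:

```c
/* Hypothetical stand-ins for the unspec codes and comparison codes.  */
enum demo_unspec { DEMO_UNSPEC_PCMP, DEMO_UNSPEC_UNSIGNED_PCMP };
enum demo_cmp { DEMO_EQ, DEMO_GT, DEMO_GTU, DEMO_LTU };

/* Without the break, the unsigned cases fall through into default
   and the value chosen for them is overwritten.  */
enum demo_unspec
demo_pick_buggy (enum demo_cmp code)
{
  enum demo_unspec u;
  switch (code)
    {
    case DEMO_GTU:
    case DEMO_LTU:
      u = DEMO_UNSPEC_UNSIGNED_PCMP;
      /* fall through */
    default:
      u = DEMO_UNSPEC_PCMP;
    }
  return u;
}

/* With the break, unsigned comparisons keep their own code.  */
enum demo_unspec
demo_pick_fixed (enum demo_cmp code)
{
  enum demo_unspec u;
  switch (code)
    {
    case DEMO_GTU:
    case DEMO_LTU:
      u = DEMO_UNSPEC_UNSIGNED_PCMP;
      break;

    default:
      u = DEMO_UNSPEC_PCMP;
    }
  return u;
}
```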

Thanks,
Ilya
--
gcc/

2015-11-18  Ilya Enkovich  <enkovich@gmail.com>

PR target/68405
* config/i386/i386.c (ix86_expand_mask_vec_cmp): Add missing
break.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 6173dae..43cbdfb 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -22948,6 +22948,8 @@ ix86_expand_mask_vec_cmp (rtx operands[])
 case GEU:
 case LTU:
   unspec_code = UNSPEC_UNSIGNED_PCMP;
+  break;
+
 default:
   unspec_code = UNSPEC_PCMP;
 }


[PATCH, PR tree-optimization/68327] Compute vectype for live phi nodes when computing VF

2015-11-18 Thread Ilya Enkovich
Hi,

When we compute vectypes we skip non-relevant phi nodes.  But we process 
non-relevant live statements and thus may need vectype of non-relevant live 
phi node to compute mask vectype.  This patch enables vectype computation for 
live phi nodes.  Bootstrapped and regtested on x86_64-unknown-linux-gnu.  OK 
for trunk?

Thanks,
Ilya
--
gcc/

2015-11-18  Ilya Enkovich  <enkovich@gmail.com>

PR tree-optimization/68327
* tree-vect-loop.c (vect_determine_vectorization_factor): Don't
skip non-relevant live phi nodes.

gcc/testsuite/

2015-11-18  Ilya Enkovich  <enkovich@gmail.com>

PR tree-optimization/68327
* gcc.dg/pr68327.c: New test.


diff --git a/gcc/testsuite/gcc.dg/pr68327.c b/gcc/testsuite/gcc.dg/pr68327.c
new file mode 100644
index 000..c3e6a94
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68327.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int a, d;
+char b, c;
+
+void
+fn1 ()
+{
+  int i = 0;
+  for (; i < 1; i++)
+d = 1;
+  for (; b; b++)
+a = 1 && (d & b);
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 80937ec..7dba027 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -216,7 +216,8 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
 
  gcc_assert (stmt_info);
 
- if (STMT_VINFO_RELEVANT_P (stmt_info))
+ if (STMT_VINFO_RELEVANT_P (stmt_info)
+ || STMT_VINFO_LIVE_P (stmt_info))
 {
  gcc_assert (!STMT_VINFO_VECTYPE (stmt_info));
   scalar_type = TREE_TYPE (PHI_RESULT (phi));


Re: [PATCH, PR tree-optimization/68327] Compute vectype for live phi nodes when computing VF

2015-11-18 Thread Ilya Enkovich
2015-11-18 16:44 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Wed, Nov 18, 2015 at 12:34 PM, Ilya Enkovich <enkovich@gmail.com> 
> wrote:
>> Hi,
>>
>> When we compute vectypes we skip non-relevant phi nodes.  But we process 
>> non-relevant live statements and thus may need vectype of non-relevant live 
>> phi node to compute mask vectype.  This patch enables vectype computation 
>> for live phi nodes.  Bootstrapped and regtested on x86_64-unknown-linux-gnu. 
>>  OK for trunk?
>
> Hmm.  What breaks if you instead skip all !relevant stmts and do not
> compute vectype for live but not relevant ones?  We won't ever
> "vectorize" !relevant ones, that is, we don't need their vector type.

I tried it and got a regression in SLP.  It expected a non-null vectype
for a non-relevant but live statement.  The regression was in
gcc/gcc/testsuite/gfortran.fortran-torture/execute/pr43390.f90

Ilya

>
> Richard.
>
>> Thanks,
>> Ilya


Re: [PATCH, PR middle-end/68134] Reject scalar modes in default get_mask_mode hook

2015-11-17 Thread Ilya Enkovich
2015-11-17 15:26 GMT+03:00 Bernd Schmidt <bschm...@redhat.com>:
> On 11/17/2015 12:49 PM, Ilya Enkovich wrote:
>>
>> Default hook for get_mask_mode is supposed to return integer vector
>> modes.  This means it should reject scalar modes returned by
>> mode_for_vector.  Bootstrapped and regtested on
>> x86_64-unknown-linux-gnu, regtested on aarch64-unknown-linux-gnu.  OK
>> for trunk?
>
>
> You didn't say what exactly fails if an integer mode is returned. I'm
> assuming it's build_truth_vector_type which can call make_vector_type with
> an integer mode.

With an integer mode there is no matching instruction in the optab, but
the comparison doesn't get lowered either.

Ilya

>
> The patch looks OK to me.
>
>
> Bernd


[PATCH, PR middle-end/68134] Reject scalar modes in default get_mask_mode hook

2015-11-17 Thread Ilya Enkovich
Hi,

Default hook for get_mask_mode is supposed to return integer vector modes.  
This means it should reject scalar modes returned by mode_for_vector.  
Bootstrapped and regtested on x86_64-unknown-linux-gnu, regtested on 
aarch64-unknown-linux-gnu.  OK for trunk?
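
To make the effect of the changed condition explicit, here is a small
sketch.  The demo_* helpers are hypothetical and only model the two boolean
inputs of the check; "rejects" means the hook falls back to BLKmode:

```c
#include <stdbool.h>

/* Old condition: only a *vector* mode the target does not support is
   rejected, so a scalar mode from mode_for_vector slips through.  */
bool
demo_old_rejects (bool is_vector_mode, bool target_supports)
{
  return is_vector_mode && !target_supports;
}

/* New condition: anything that is not a supported vector mode is
   rejected, which includes every scalar mode.  */
bool
demo_new_rejects (bool is_vector_mode, bool target_supports)
{
  return !is_vector_mode || !target_supports;
}
```

The two conditions agree for vector modes and differ only for scalar modes,
which the new one always rejects.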

Thanks,
Ilya
--
gcc/

2015-11-17  Ilya Enkovich  <enkovich@gmail.com>

PR middle-end/68134
* targhooks.c (default_get_mask_mode): Filter out
scalar modes returned by mode_for_vector.

gcc/testsuite/

2015-11-17  Ilya Enkovich  <enkovich@gmail.com>

PR middle-end/68134
* gcc.dg/pr68134.c: New test.


diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index c34b4e9..66d983b 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1093,8 +1093,8 @@ default_get_mask_mode (unsigned nunits, unsigned vector_size)
   gcc_assert (elem_size * nunits == vector_size);
 
   vector_mode = mode_for_vector (elem_mode, nunits);
-  if (VECTOR_MODE_P (vector_mode)
-  && !targetm.vector_mode_supported_p (vector_mode))
+  if (!VECTOR_MODE_P (vector_mode)
+  || !targetm.vector_mode_supported_p (vector_mode))
 vector_mode = BLKmode;
 
   return vector_mode;
diff --git a/gcc/testsuite/gcc.dg/pr68134.c b/gcc/testsuite/gcc.dg/pr68134.c
new file mode 100644
index 000..522b4c6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68134.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c99" } */
+
+#include <stdint.h>
+
+typedef double float64x1_t __attribute__ ((vector_size (8)));
+typedef uint64_t uint64x1_t;
+
+void
+foo (void)
+{
+  float64x1_t arg1 = (float64x1_t) 0x3fedf9d4343c7c80;
+  float64x1_t arg2 = (float64x1_t) 0x3fcdc53742ea9c40;
+  uint64x1_t result = (uint64x1_t) (arg1 == arg2);
+  uint64_t got = result;
+  uint64_t exp = 0;
+  if (got != 0)
+__builtin_abort ();
+}


Re: [PATCH] Fix ICE for boolean comparison

2015-11-13 Thread Ilya Enkovich
2015-11-13 14:28 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Fri, Nov 13, 2015 at 11:52 AM, Ilya Enkovich <enkovich@gmail.com> 
> wrote:
>> 2015-11-13 13:38 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>> On Thu, Nov 12, 2015 at 4:44 PM, Ilya Enkovich <enkovich@gmail.com> 
>>> wrote:
>>>> Hi,
>>>>
>>>> Currently compiler may ICE when loaded boolean is compared with vector 
>>>> invariant or another boolean value.  This is because we don't detect mix 
>>>> of bool and non-bool vectypes and incorrectly determine vectype for 
>>>> boolean loop invariant for comparison.  This was fixed for COND_EXPR before 
>>>> but also needs to be fixed for comparison.  This patch was bootstrapped 
>>>> and tested on x86_64-unknown-linux-gnu.  OK for trunk?
>>>
>>> Hmm, so this disables vectorization in these cases.  Isn't this a
>>> regression?  Shouldn't we simply "materialize"
>>> the non-bool vector from the boolean one say, with
>>>
>>>  vec = boolvec ? {-1, -1 ... } : {0, 0, 0 ...}
>>
>> We may do this using patterns, but still should catch cases when
>> patterns don't catch it. Patterns don't have vectypes computed and
>> therefore may miss such cases. Thus stability fix is still valid.
>>
>> I don't think we have a compiler version which can vectorize
>> simd-bool-comparison-2.cc, thus technically it is not a regression.
>> There are also other similar cases, e.g. store of comparison result or
>> use loaded boolean as a predicate. I was going to support
>> vectorization for such cases later (seems I don't hit stage1 for them
>> and not sure if it will be OK for stage3).
>
> I still think those checks show that there is an issue we should fix
> differently.  We're accumulating more mess into the already messy
> vectorizer :(

Right. Computing vectypes earlier would make such cases easier to reveal.

>
> Ok.

Thanks!
Ilya

>
> Thanks,
> Richard.
>
>> Ilya
>>
>>>
>>> ?
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Thanks,
>>>> Ilya


Re: [PATCH] Fix ICE for boolean comparison

2015-11-13 Thread Ilya Enkovich
2015-11-13 13:38 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Thu, Nov 12, 2015 at 4:44 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
>> Hi,
>>
>> Currently compiler may ICE when loaded boolean is compared with vector 
>> invariant or another boolean value.  This is because we don't detect mix of 
>> bool and non-bool vectypes and incorrectly determine vectype for boolean 
>> loop invariant for comparison.  This was fixed for COND_EXPR before but also 
>> needs to be fixed for comparison.  This patch was bootstrapped and tested on 
>> x86_64-unknown-linux-gnu.  OK for trunk?
>
> Hmm, so this disables vectorization in these cases.  Isn't this a
> regression?  Shouldn't we simply "materialize"
> the non-bool vector from the boolean one say, with
>
>  vec = boolvec ? {-1, -1 ... } : {0, 0, 0 ...}

We may do this using patterns, but still should catch cases when
patterns don't catch it. Patterns don't have vectypes computed and
therefore may miss such cases. Thus stability fix is still valid.
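
For reference, the materialization suggested in the quoted text amounts,
per element, to the following.  This is a scalar sketch with made-up names,
not vectorizer code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Turn a boolean vector into an integer vector of all-ones (-1) for
   true and all-zeros for false, i.e.
   vec = boolvec ? {-1, -1 ...} : {0, 0 ...}.  */
void
demo_materialize_mask (const bool *boolvec, int32_t *vec, size_t n)
{
  for (size_t i = 0; i < n; i++)
    vec[i] = boolvec[i] ? -1 : 0;
}
```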

I don't think we have a compiler version which can vectorize
simd-bool-comparison-2.cc, thus technically it is not a regression.
There are also other similar cases, e.g. store of comparison result or
use loaded boolean as a predicate. I was going to support
vectorization for such cases later (seems I don't hit stage1 for them
and not sure if it will be OK for stage3).

Ilya

>
> ?
>
> Thanks,
> Richard.
>
>> Thanks,
>> Ilya


Re: [PATCH] Avoid false vector mask conversion

2015-11-13 Thread Ilya Enkovich
2015-11-13 13:03 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Thu, Nov 12, 2015 at 5:08 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
>> Hi,
>>
>> When we use LTO for Fortran we may have a mix of 32-bit and 1-bit scalar 
>> booleans. It means we may have conversion of one scalar type to another 
>> which confuses vectorizer because values with different scalar boolean type 
>> may get the same vectype.
>
> Confuses aka fails to vectorize?

Right.

>
>>  This patch transforms such conversions into comparison.
>>
>> I managed to make a small fortran test which gets vectorized with this patch 
>> but I didn't find how I can run fortran test with LTO and then scan tree 
>> dump to check it is vectorized.  BTW here is a loop from the test:
>>
>>   real*8 a(18)
>>   logical b(18)
>>   integer i
>>
>>   do i=1,18
>>  if(a(i).gt.0.d0) then
>> b(i)=.true.
>>  else
>> b(i)=.false.
>>  endif
>>   enddo
>
> This looks like the "error" comes from if-conversion - can't we do
> better there then?

No, this loop is transformed into a single BB before if-conversion by
cselim + phiopt.

Ilya

>
> Richard.
>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?
>>
>> Thanks,
>> Ilya


[PATCH, PR68286] Fix vector comparison expand

2015-11-12 Thread Ilya Enkovich
Hi,

My vector comparison patches broke expansion of vector comparisons on targets 
which don't have the new comparison patterns but support VEC_COND_EXPR.  This 
happens because we don't check whether a vector comparison can be expanded as 
a comparison.  This patch fixes it.  Bootstrapped and regtested on 
powerpc64le-unknown-linux-gnu.  OK for trunk?
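
The decision the fix introduces can be sketched as below.  The names are
hypothetical and the two booleans stand in for VECTOR_BOOLEAN_TYPE_P and
expand_vec_cmp_expr_p; this is not the code in expr.c:

```c
#include <stdbool.h>

/* The two ways a vector comparison can be expanded.  */
enum demo_expansion { DEMO_EXPAND_VEC_CMP, DEMO_EXPAND_VEC_COND };

/* Use the direct comparison only if the result is a boolean vector
   *and* the target provides a comparison pattern; otherwise fall
   back to the VEC_COND_EXPR form.  */
enum demo_expansion
demo_choose_expansion (bool result_is_bool_vector, bool target_has_vec_cmp)
{
  if (result_is_bool_vector && target_has_vec_cmp)
    return DEMO_EXPAND_VEC_CMP;
  return DEMO_EXPAND_VEC_COND;
}
```

Before the fix, only the first of the two conditions was tested, so targets
without comparison patterns took the first branch as well.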

Thanks,
Ilya
--
gcc/

2015-11-12  Ilya Enkovich  <enkovich@gmail.com>

* expr.c (do_store_flag): Expand vector comparison as
VEC_COND_EXPR if vector comparison is not supported
by target.

gcc/testsuite/

2015-11-12  Ilya Enkovich  <enkovich@gmail.com>

* gcc.dg/pr68286.c: New test.


diff --git a/gcc/expr.c b/gcc/expr.c
index 03936ee..bd43dc4 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11128,7 +11128,8 @@ do_store_flag (sepops ops, rtx target, machine_mode mode)
   if (TREE_CODE (ops->type) == VECTOR_TYPE)
 {
   tree ifexp = build2 (ops->code, ops->type, arg0, arg1);
-  if (VECTOR_BOOLEAN_TYPE_P (ops->type))
+  if (VECTOR_BOOLEAN_TYPE_P (ops->type)
+ && expand_vec_cmp_expr_p (TREE_TYPE (arg0), ops->type))
return expand_vec_cmp_expr (ops->type, ifexp, target);
   else
{
diff --git a/gcc/testsuite/gcc.dg/pr68286.c b/gcc/testsuite/gcc.dg/pr68286.c
new file mode 100644
index 000..d0392e8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68286.c
@@ -0,0 +1,17 @@
+/* PR target/68286 */
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int a, b, c;
+int fn1 ()
+{
+  int d[] = {0};
+  for (; c; c++)
+{
+  float e = c;
+  if (e)
+d[0]++;
+}
+  b = d[0];
+  return a;
+}


Re: Recent patch craters vector tests on powerpc64le-linux-gnu

2015-11-12 Thread Ilya Enkovich
2015-11-12 12:48 GMT+03:00 James Greenhalgh :
> On Wed, Nov 11, 2015 at 05:12:29PM -0600, Bill Schmidt wrote:
>> Hi Ilya,
>>
>> The patch committed as r230098 has caused a number of ICEs on
>> powerpc64le-linux-gnu.
>
> And arm-none-linux-gnueabihf, and aarch64-none-linux-gnu.
>
>> Could you please either revert the patch or fix these issues?
>
> Thanks,
> James
>

Sorry for the breakage. I sent a patch to fix it.

https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01467.html

Thanks,
Ilya


[PATCH, PR tree-optimization/68305] Support masked COND_EXPR in SLP

2015-11-12 Thread Ilya Enkovich
Hi,

This patch fixes the way an operand is chosen by its number for COND_EXPR.  
Bootstrapped and regtested on x86_64-unknown-linux-gnu.  OK for trunk?
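
The operand numbering the patch distinguishes can be sketched as follows.
This is an illustration with made-up names, modelled on the diff in this
mail rather than taken from tree-vect-slp.c:

```c
/* Where an SLP operand of a COND_EXPR comes from (illustrative names).  */
enum demo_operand_source
{
  DEMO_FROM_MASK,        /* the SSA_NAME condition itself */
  DEMO_FROM_COMPARISON,  /* an operand of the embedded comparison */
  DEMO_FROM_THEN_VALUE,
  DEMO_FROM_ELSE_VALUE
};

enum demo_operand_source
demo_cond_expr_operand (int cond_is_ssa_name, int op_num)
{
  if (cond_is_ssa_name)
    {
      /* Three operands: mask, then-value, else-value.  */
      if (op_num == 0)
        return DEMO_FROM_MASK;
      return op_num == 1 ? DEMO_FROM_THEN_VALUE : DEMO_FROM_ELSE_VALUE;
    }

  /* Four operands: the two comparison operands come first, then the
     then-value and the else-value.  */
  if (op_num == 0 || op_num == 1)
    return DEMO_FROM_COMPARISON;
  return op_num == 2 ? DEMO_FROM_THEN_VALUE : DEMO_FROM_ELSE_VALUE;
}
```

With a mask as the condition, operand 1 is already the then-value, which is
why treating it as a comparison operand goes wrong.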

Thanks,
Ilya
--
gcc/

2015-11-12  Ilya Enkovich  <enkovich@gmail.com>

PR tree-optimization/68305
* tree-vect-slp.c (vect_get_constant_vectors): Support
COND_EXPR with SSA_NAME as a condition.

gcc/testsuite/

2015-11-12  Ilya Enkovich  <enkovich@gmail.com>

PR tree-optimization/68305
* gcc.dg/vect/pr68305.c: New test.


diff --git a/gcc/testsuite/gcc.dg/vect/pr68305.c b/gcc/testsuite/gcc.dg/vect/pr68305.c
new file mode 100644
index 000..fde3db7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr68305.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+/* { dg-additional-options "-mavx2" { target avx_runtime } } */
+
+int a, b;
+
+void
+fn1 ()
+{
+  int c, d;
+  for (; b; b++)
+a = a ^ !c ^ !d;
+}
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 9d97140..9402474 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2738,18 +2738,20 @@ vect_get_constant_vectors (tree op, slp_tree slp_node,
  switch (code)
{
  case COND_EXPR:
-   if (op_num == 0 || op_num == 1)
- {
-   tree cond = gimple_assign_rhs1 (stmt);
+   {
+ tree cond = gimple_assign_rhs1 (stmt);
+ if (TREE_CODE (cond) == SSA_NAME)
+   op = gimple_op (stmt, op_num + 1);
+ else if (op_num == 0 || op_num == 1)
op = TREE_OPERAND (cond, op_num);
- }
-   else
- {
-   if (op_num == 2)
- op = gimple_assign_rhs2 (stmt);
-   else
- op = gimple_assign_rhs3 (stmt);
- }
+ else
+   {
+ if (op_num == 2)
+   op = gimple_assign_rhs2 (stmt);
+ else
+   op = gimple_assign_rhs3 (stmt);
+   }
+   }
break;
 
  case CALL_EXPR:


Re: [mask-vec_cond, patch 1/2] Support vectorization of VEC_COND_EXPR with no embedded comparison

2015-11-12 Thread Ilya Enkovich
2015-11-12 13:03 GMT+03:00 Ramana Radhakrishnan <ramana@googlemail.com>:
> On Thu, Oct 8, 2015 at 4:50 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
>> Hi,
>>
>> This patch allows COND_EXPR with no embedded comparison to be vectorized.
>>  It's applied on top of vectorized comparison support series.  New optab 
>> vcond_mask_optab
>> is introduced for such statements.  Bool patterns now avoid comparison in 
>> COND_EXPR in case vector comparison is supported by target.
>
> New standard pattern names are documented in the internals manual.
> This patch does not do so, nor do I see any other patch that does.
>
>
> regards
> Ramana

Thanks for pointing this out.  I see we are also missing descriptions for
some other patterns (e.g. maskload).  Will add them.

Ilya


Re: [PATCH] New version of libmpx with new memmove wrapper

2015-11-12 Thread Ilya Enkovich
2015-11-05 13:37 GMT+03:00 Aleksandra Tsvetkova :
> New version of libmpx was added. There is a new function get_bd() that
> allows to get bounds directory. Wrapper for memmove was modified. Now
> it moves data and then moves corresponding bounds directly from one
> bounds table to another. This approach made moving unaligned pointers
> possible. It also makes memmove function faster on sizes bigger than
> 64 bytes.

+2015-10-27  Tsvetkova Alexandra  
+
+ * gcc.target/i386/mpx/memmove.c: New test for __mpx_wrapper_memmove.
+

Did you test it on different targets?  It seems to me this test will
fail if you run it on a non-MPX target.  Please look at mpx-check.h and
how other MPX run tests use it.

+ * mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
+ * mpxrt/mpxrt.c (REG_IP_IDX): Moved to mpxrt.h.
+ * mpxrt/mpxrt.c (REX_PREFIX): Moved to mpxrt.h.
+ * mpxrt/mpxrt.c (XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
+ * mpxrt/mpxrt.c (MPX_L1_SIZE): Moved to mpxrt.h.

No need to repeat file name.

+ * libmpxwrap/mpx_wrappers.c: Rewrite __mpx_wrapper_memmove to make it faster.

You added new functions, types and modified existing function.  Make
ChangeLog more detailed.

--- /dev/null
+++ b/libmpx/mpxrt/mpxrt.h
@@ -0,0 +1,75 @@
+/* mpxrt.h  -*-C++-*-
+ *
+ *
+ *
+ *  @copyright
+ *  Copyright (C) 2014, 2015, Intel Corporation
+ *  All rights reserved.

2015 only

+const uintptr_t MPX_L1_ADDR_MASK = 0xf000UL;
+const uintptr_t MPX_L2_ADDR_MASK = 0xfffcUL;
+const uintptr_t MPX_L2_VALID_MASK = 0x0001UL;

Use defines


--- a/libmpx/mpxwrap/Makefile.am
+++ b/libmpx/mpxwrap/Makefile.am
@@ -1,4 +1,5 @@
 ALCLOCAL_AMFLAGS = -I .. -I ../config
+AM_CPPFLAGS = -I $(top_srcdir)

This is not reflected in ChangeLog

+/* The mpx_bt_entry struct represents a cell in bounds table.
+   *lb is the lower bound, *ub is the upper bound,
+   *p is the stored pointer.  */

Bounds and pointer are in lb, ub, p, not in *lb, *ub, *p. Right?

+static inline void
+alloc_bt (void *ptr)
+{
+  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
+}

This should be marked as bnd_legacy.

+/* move_bounds function copies N bytes from SRC to DST.

Really?

+   It also copies bounds for all pointers inside.
+   There are 3 parts of the algorithm:
+   1) We copy everything till the end of the first bounds table SRC)

SRC is not a bounds table

+   2) In loop we copy whole bound tables till the second-last one
+   3) Data in the last bounds table is copied separately, after the loop.
+   If one of bound tables in SRC doesn't exist,
+   we skip it because there are no pointers.
+   Depending on the arrangement of SRC and DST we copy from the beginning
+   or from the end.  */
+__attribute__ ((bnd_legacy)) static void *
+move_bounds (void *dst, const void *src, size_t n)

What is returned value for?

+void *
+__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
+{
+  if (n == 0)
+return dst;
+
+  __bnd_chk_ptr_bounds (dst, n);
+  __bnd_chk_ptr_bounds (src, n);
+
+  memmove (dst, src, n);
+  move_bounds (dst, src, n);
+  return dst;
 }

You completely remove the old algorithm, which should be faster on small
sizes.  __mpx_wrapper_memmove should become a dispatcher between the old
and new implementations depending on target (32-bit or 64-bit) and N.
Since the old version performs both data and bounds copy, the BD check
should be moved into __mpx_wrapper_memmove so we never call it when MPX
is disabled.
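
A rough sketch of such a dispatcher is below.  All demo_* names are
hypothetical, both paths simply call memmove here, and the 64-byte limit is
only borrowed from the size mentioned in the quoted description; the target
check and the BD check are left out:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical threshold; the right value would have to be measured.  */
#define DEMO_SMALL_COPY_LIMIT 64

/* Records which path was taken, for demonstration only.  */
int demo_last_path;

/* Stand-in for the old algorithm (data and bounds copied together).  */
void *
demo_memmove_small (void *dst, const void *src, size_t n)
{
  demo_last_path = 1;
  return memmove (dst, src, n);
}

/* Stand-in for the new algorithm (data first, then bounds tables).  */
void *
demo_memmove_bulk (void *dst, const void *src, size_t n)
{
  demo_last_path = 2;
  return memmove (dst, src, n);
}

/* Dispatcher: pick an implementation based on the copy size.  */
void *
demo_wrapper_memmove (void *dst, const void *src, size_t n)
{
  if (n == 0)
    return dst;
  if (n <= DEMO_SMALL_COPY_LIMIT)
    return demo_memmove_small (dst, src, n);
  return demo_memmove_bulk (dst, src, n);
}
```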

Thanks,
Ilya


[PATCH] Fix ICE for masked store of boolean value

2015-11-12 Thread Ilya Enkovich
Hi,

We may get an ICE in the vectorizer when the stored value gets a vectype that 
is not compatible with the storage.  This may happen for bool values.  This 
patch fixes the ICE.  Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK 
for trunk?

Thanks,
Ilya
--
gcc/

2015-11-12  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-stmts.c (vectorizable_mask_load_store): Check
types of stored value and storage are compatible.

gcc/testsuite/

2015-11-12  Ilya Enkovich  <enkovich@gmail.com>

* g++.dg/vect/simd-mask-store-bool.cc: New test.


diff --git a/gcc/testsuite/g++.dg/vect/simd-mask-store-bool.cc b/gcc/testsuite/g++.dg/vect/simd-mask-store-bool.cc
new file mode 100644
index 000..c5f0458
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/simd-mask-store-bool.cc
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_condition } */
+/* { dg-additional-options "-mavx512bw" { target { i?86-*-* x86_64-*-* } } } */
+
+#define N 1024
+
+int a[N], b[N], c[N];
+bool d[N];
+
+void
+test (void)
+{
+  int i;
+#pragma omp simd safelen(64)
+  for (i = 0; i < N; i++)
+if (a[i] > 0)
+  d[i] = b[i] > c[i];
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index cfe30e0..7870b29 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1688,6 +1688,7 @@ vectorizable_mask_load_store (gimple *stmt, gimple_stmt_iterator *gsi,
   bool nested_in_vect_loop = nested_in_vect_loop_p (loop, stmt);
   struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+  tree rhs_vectype = NULL_TREE;
   tree mask_vectype;
   tree elem_type;
   gimple *new_stmt;
@@ -1757,6 +1758,13 @@ vectorizable_mask_load_store (gimple *stmt, gimple_stmt_iterator *gsi,
   if (!mask_vectype)
 return false;
 
+  if (is_store)
+{
+  tree rhs = gimple_call_arg (stmt, 3);
+  if (!vect_is_simple_use (rhs, loop_vinfo, &def_stmt, &dt, &rhs_vectype))
+   return false;
+}
+
   if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
 {
   gimple *def_stmt;
@@ -1790,16 +1798,11 @@ vectorizable_mask_load_store (gimple *stmt, gimple_stmt_iterator *gsi,
   else if (!VECTOR_MODE_P (TYPE_MODE (vectype))
   || !can_vec_mask_load_store_p (TYPE_MODE (vectype),
  TYPE_MODE (mask_vectype),
- !is_store))
+ !is_store)
+  || (rhs_vectype
+  && !useless_type_conversion_p (vectype, rhs_vectype)))
 return false;
 
-  if (is_store)
-{
-  tree rhs = gimple_call_arg (stmt, 3);
-  if (!vect_is_simple_use (rhs, loop_vinfo, &def_stmt, &dt))
-   return false;
-}
-
   if (!vec_stmt) /* transformation not required.  */
 {
   STMT_VINFO_TYPE (stmt_info) = call_vec_info_type;


[PATCH, doc] Document some standard pattern names

2015-11-12 Thread Ilya Enkovich
Hi,

This patch adds description for several standard pattern names.  OK for trunk?
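
As a reading aid for the vcond description in the patch, the formula
op1 & msk | op2 & ~msk can be spelled out per element as below.  This is a
scalar sketch with made-up names; the choice of "greater than" as the signed
comparison is an assumption made only for the example:

```c
#include <stddef.h>
#include <stdint.h>

/* For each element, build a mask of all-ones when the (signed)
   comparison holds and all-zeros otherwise, then blend op1 and op2
   through it.  */
void
demo_vcond (const uint32_t *op1, const uint32_t *op2,
            const int32_t *cmp_a, const int32_t *cmp_b,
            uint32_t *op0, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      uint32_t msk = cmp_a[i] > cmp_b[i] ? UINT32_MAX : 0u;
      op0[i] = (op1[i] & msk) | (op2[i] & ~msk);
    }
}
```

A vcond_mask-style operation would take msk as an input instead of
computing it from a comparison.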

Thanks,
Ilya
--
gcc/

2015-11-12  Ilya Enkovich  <enkovich@gmail.com>

* doc/md.texi (vec_cmp@var{m}@var{n}): New item.
(vec_cmpu@var{m}@var{n}): New item.
(vcond@var{m}@var{n}): Specify comparison is signed.
(vcondu@var{m}@var{n}): New item.
(vcond_mask_@var{m}@var{n}): New item.
(maskload@var{m}@var{n}): New item.
(maskstore@var{m}@var{n}): New item.


diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 71a2791..7fdc935 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -4749,17 +4749,51 @@ specify field index and operand 0 place to store value into.
 Initialize the vector to given values.  Operand 0 is the vector to initialize
 and operand 1 is parallel containing values for individual fields.
 
+@cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
+@item @samp{vec_cmp@var{m}@var{n}}
+Output a vector comparison.  Operand 0 of mode @var{n} is the destination for
+predicate in operand 1 which is a signed vector comparison with operands of
+mode @var{m} in operands 2 and 3.  Predicate is computed by element-wise
+evaluation of the vector comparison with a truth value of all-ones and a false
+value of all-zeros.
+
+@cindex @code{vec_cmpu@var{m}@var{n}} instruction pattern
+@item @samp{vec_cmpu@var{m}@var{n}}
+Similar to @code{vec_cmp@var{m}@var{n}} but perform unsigned vector comparison.
+
 @cindex @code{vcond@var{m}@var{n}} instruction pattern
 @item @samp{vcond@var{m}@var{n}}
 Output a conditional vector move.  Operand 0 is the destination to
 receive a combination of operand 1 and operand 2, which are of mode @var{m},
-dependent on the outcome of the predicate in operand 3 which is a
+dependent on the outcome of the predicate in operand 3 which is a signed
 vector comparison with operands of mode @var{n} in operands 4 and 5.  The
 modes @var{m} and @var{n} should have the same size.  Operand 0
 will be set to the value @var{op1} & @var{msk} | @var{op2} & ~@var{msk}
 where @var{msk} is computed by element-wise evaluation of the vector
 comparison with a truth value of all-ones and a false value of all-zeros.
 
+@cindex @code{vcondu@var{m}@var{n}} instruction pattern
+@item @samp{vcondu@var{m}@var{n}}
+Similar to @code{vcond@var{m}@var{n}} but performs unsigned vector
+comparison.
+
+@cindex @code{vcond_mask_@var{m}@var{n}} instruction pattern
+@item @samp{vcond_mask_@var{m}@var{n}}
+Similar to @code{vcond@var{m}@var{n}} but operand 3 holds a pre-computed
+result of vector comparison.
+
+@cindex @code{maskload@var{m}@var{n}} instruction pattern
+@item @samp{maskload@var{m}@var{n}}
+Perform a masked load of vector from memory operand 1 of mode @var{m}
+into register operand 0.  Mask is provided in register operand 2 of
+mode @var{n}.
+
+@cindex @code{maskstore@var{m}@var{n}} instruction pattern
+@item @samp{maskstore@var{m}@var{n}}
+Perform a masked store of vector from register operand 1 of mode @var{m}
+into memory operand 0.  Mask is provided in register operand 2 of
+mode @var{n}.
+
 @cindex @code{vec_perm@var{m}} instruction pattern
 @item @samp{vec_perm@var{m}}
 Output a (variable) vector permutation.  Operand 0 is the destination


[PATCH] Fix ICE for boolean comparison

2015-11-12 Thread Ilya Enkovich
Hi,

Currently compiler may ICE when loaded boolean is compared with vector 
invariant or another boolean value.  This is because we don't detect mix of 
bool and non-bool vectypes and incorrectly determine vectype for boolean loop 
invariant for comparison.  This was fixed for COND_EXPR before but also needs to 
be fixed for comparison.  This patch was bootstrapped and tested on 
x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2015-11-12  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-loop.c (vect_determine_vectorization_factor): Check
mix of boolean and integer vectors in a single statement.
* tree-vect-slp.c (vect_mask_constant_operand_p): New.
(vect_get_constant_vectors): Use vect_mask_constant_operand_p to
determine constant type.
* tree-vect-stmts.c (vectorizable_comparison): Provide vectype
for loop invariants.

gcc/testsuite/

2015-11-12  Ilya Enkovich  <enkovich@gmail.com>

* g++.dg/vect/simd-bool-comparison-1.cc: New test.
* g++.dg/vect/simd-bool-comparison-2.cc: New test.


diff --git a/gcc/testsuite/g++.dg/vect/simd-bool-comparison-1.cc b/gcc/testsuite/g++.dg/vect/simd-bool-comparison-1.cc
new file mode 100644
index 000..a08362f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/simd-bool-comparison-1.cc
@@ -0,0 +1,21 @@
+// { dg-do compile }
+// { dg-additional-options "-mavx512bw -mavx512dq" { target { i?86-*-* x86_64-*-* } } }
+
+#define N 1024
+
+double a[N];
+bool b[N];
+bool c;
+
+void test ()
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+if (b[i] != c)
+  a[i] = 0.0;
+else
+  a[i] = 1.0;
+}
+
+// { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { i?86-*-* x86_64-*-* } } } }
diff --git a/gcc/testsuite/g++.dg/vect/simd-bool-comparison-2.cc b/gcc/testsuite/g++.dg/vect/simd-bool-comparison-2.cc
new file mode 100644
index 000..4accf56
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/simd-bool-comparison-2.cc
@@ -0,0 +1,20 @@
+// { dg-do compile }
+// { dg-additional-options "-mavx512bw -mavx512dq" { target { i?86-*-* x86_64-*-* } } }
+
+#define N 1024
+
+double a[N];
+bool b[N];
+char c[N];
+
+void test ()
+{
+  int i;
+
+  #pragma omp simd
+  for (i = 0; i < N; i++)
+if ((c[i] > 0) && b[i])
+  a[i] = 0.0;
+else
+  a[i] = 1.0;
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 55e5309..6b78b55 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -649,7 +649,32 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
}
  return false;
}
+ else if (VECTOR_BOOLEAN_TYPE_P (mask_type)
+  != VECTOR_BOOLEAN_TYPE_P (vectype))
+   {
+ if (dump_enabled_p ())
+   {
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+  "not vectorized: mixed mask and "
+  "nonmask vector types in statement, ");
+ dump_generic_expr (MSG_MISSED_OPTIMIZATION, TDF_SLIM,
+mask_type);
+ dump_printf (MSG_MISSED_OPTIMIZATION, " and ");
+ dump_generic_expr (MSG_MISSED_OPTIMIZATION, TDF_SLIM,
+vectype);
+ dump_printf (MSG_MISSED_OPTIMIZATION, "\n");
+   }
+ return false;
+   }
}
+
+ /* We may compare boolean value loaded as vector of integers.
+Fix mask_type in such case.  */
+ if (mask_type
+ && !VECTOR_BOOLEAN_TYPE_P (mask_type)
+ && gimple_code (stmt) == GIMPLE_ASSIGN
+ && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == 
tcc_comparison)
+   mask_type = build_same_sized_truth_vector_type (mask_type);
}
 
   /* No mask_type should mean loop invariant predicate.
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 9d97140..f3acb04 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2589,6 +2589,57 @@ vect_slp_bb (basic_block bb)
 }
 
 
+/* Return 1 if vector type of boolean constant which is OPNUM
+   operand in statement STMT is a boolean vector.  */
+
+static bool
+vect_mask_constant_operand_p (gimple *stmt, int opnum)
+{
+  stmt_vec_info stmt_vinfo = vinfo_for_stmt (stmt);
+  enum tree_code code = gimple_expr_code (stmt);
+  tree op, vectype;
+  gimple *def_stmt;
+  enum vect_def_type dt;
+
+  /* For comparison and COND_EXPR type is chosen depending
+     on the other comparison operand.  */
+  if (TREE_CODE_CLASS (code) == tcc_comparison)
+    {
+      if (opnum)
+        op = gimple_assign_rhs1 (stmt);
+      else
+        op = gimple_assign_rhs2 (stmt);
+
+  i

[PATCH] Enable libmpx by default on supported target

2015-11-12 Thread Ilya Enkovich
Hi,

libmpx was added close to the release date and was therefore disabled by
default for all targets.  This patch enables it by default for supported
targets.  Is it OK for trunk?

Thanks,
Ilya
--
2015-11-12  Tsvetkova Alexandra  

* configure.ac: Enable libmpx by default.
* configure: Regenerated.


diff --git a/configure.ac b/configure.ac
index cb6ca24..55f9ab0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -660,7 +660,7 @@ fi
 
 # Enable libmpx on supported systems by request.
 if test -d ${srcdir}/libmpx; then
-if test x$enable_libmpx = xyes; then
+if test x$enable_libmpx = x; then
AC_MSG_CHECKING([for libmpx support])
if (srcdir=${srcdir}/libmpx; \
. ${srcdir}/configure.tgt; \
@@ -671,8 +671,6 @@ if test -d ${srcdir}/libmpx; then
else
AC_MSG_RESULT([yes])
fi
-else
-   noconfigdirs="$noconfigdirs target-libmpx"
 fi
 fi
 


[PATCH] Avoid false vector mask conversion

2015-11-12 Thread Ilya Enkovich
Hi,

When we use LTO for Fortran we may have a mix of 32-bit and 1-bit scalar
booleans.  This means we may see a conversion from one scalar boolean type to
another, which confuses the vectorizer because values with different scalar
boolean types may get the same vectype.  This patch transforms such
conversions into plain assignments.

I managed to write a small Fortran test which gets vectorized with this patch,
but I didn't find a way to run a Fortran test with LTO and then scan the tree
dump to check that it is vectorized.  BTW here is a loop from the test:

      real*8 a(18)
      logical b(18)
      integer i

      do i=1,18
         if(a(i).gt.0.d0) then
            b(i)=.true.
         else
            b(i)=.false.
         endif
      enddo

Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2015-11-12  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-patterns.c (vect_recog_mask_conversion_pattern):
Transform useless boolean conversion into assignment.


diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index b9d900c..62070da 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -3674,6 +3674,38 @@ vect_recog_mask_conversion_pattern (vec<gimple *> *stmts, tree *type_in,
   if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE)
 return NULL;
 
+  /* Check conversion between boolean types of different sizes.
+     If no vectype is specified, then we have a regular mask
+     assignment with no actual conversion.  */
+  if (rhs_code == CONVERT_EXPR
+      && !STMT_VINFO_DATA_REF (stmt_vinfo)
+      && !STMT_VINFO_VECTYPE (stmt_vinfo))
+    {
+      if (TREE_CODE (rhs1) != SSA_NAME)
+        return NULL;
+
+      rhs1_type = search_type_for_mask (rhs1, vinfo);
+      if (!rhs1_type)
+        return NULL;
+
+      vectype1 = get_mask_type_for_scalar_type (rhs1_type);
+
+      if (!vectype1)
+        return NULL;
+
+      lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
+      pattern_stmt = gimple_build_assign (lhs, rhs1);
+
+      *type_out = vectype1;
+      *type_in = vectype1;
+      stmts->safe_push (last_stmt);
+      if (dump_enabled_p ())
+        dump_printf_loc (MSG_NOTE, vect_location,
+                         "vect_recog_mask_conversion_pattern: detected:\n");
+
+      return pattern_stmt;
+    }
+
   if (rhs_code != BIT_IOR_EXPR
   && rhs_code != BIT_XOR_EXPR
   && rhs_code != BIT_AND_EXPR)


Re: [PATCH] Simple optimization for MASK_STORE.

2015-11-10 Thread Ilya Enkovich
2015-11-10 17:46 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Tue, Nov 10, 2015 at 1:48 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
>> 2015-11-10 15:33 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>> On Fri, Nov 6, 2015 at 2:28 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>>>> Richard,
>>>>
>>>> I tried it but 256-bit precision integer type is not yet supported.
>>>
>>> What's the symptom?  The compare cannot be expanded?  Just add a pattern 
>>> then.
>>> After all we have modes up to XImode.
>>
>> I suppose problem may be in:
>>
>> gcc/config/i386/i386-modes.def:#define MAX_BITSIZE_MODE_ANY_INT (128)
>>
>> which doesn't allow to create constants of bigger size.  Changing it
>> to maximum vector size (512) would mean we increase wide_int structure
>> size significantly. New patterns are probably also needed.
>
> Yes, new patterns are needed but wide-int should be fine (we only need to
> create a literal zero AFAICS).  The "new pattern" would be
> equality/inequality against zero compares only.

Currently 256-bit integer creation fails because wide_int values for the max
and min values cannot be created.  This is fixed by increasing
MAX_BITSIZE_MODE_ANY_INT, but that also increases WIDE_INT_MAX_ELTS and thus
the size of the wide_int structure.  If we use 512 for
MAX_BITSIZE_MODE_ANY_INT, the wide_int structure grows by 48 bytes (16 bytes
if we use 256).  Is it OK for such narrow usage?
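
The size figures above can be checked with a little arithmetic.  The sketch
below assumes that WIDE_INT_MAX_ELTS is computed as
(MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT) / HOST_BITS_PER_WIDE_INT
and that HOST_WIDE_INT is 64 bits; neither is stated in this thread, and the
helper names are made up for illustration only.

```cpp
// Assumption: HOST_WIDE_INT is 64 bits wide on the build host.
constexpr unsigned host_bits_per_wide_int = 64;

// Assumed formula for WIDE_INT_MAX_ELTS (one element more than the
// widest integer mode needs).  Hypothetical helper, not a GCC function.
constexpr unsigned wide_int_max_elts (unsigned max_bitsize_mode_any_int)
{
  return (max_bitsize_mode_any_int + host_bits_per_wide_int)
         / host_bits_per_wide_int;
}

// Bytes taken by the element array alone.
constexpr unsigned elts_bytes (unsigned max_bitsize_mode_any_int)
{
  return wide_int_max_elts (max_bitsize_mode_any_int)
         * (host_bits_per_wide_int / 8);
}

// Growth relative to the current i386 value of 128.
constexpr unsigned growth_bytes (unsigned new_max_bitsize)
{
  return elts_bytes (new_max_bitsize) - elts_bytes (128);
}

static_assert (growth_bytes (256) == 16, "256 bits adds 16 bytes");
static_assert (growth_bytes (512) == 48, "512 bits adds 48 bytes");
```

Under these assumptions the element array goes from 3 to 5 elements for 256
bits and from 3 to 9 elements for 512 bits, which matches the 16 and 48 byte
figures.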

Ilya

>
> Richard.
>
>> Ilya
>>
>>>
>>> Richard.
>>>
>>>> Yuri.
>>>>
>>>>


Re: [mask conversion, patch 2/2, i386] Add pack/unpack patterns for scalar masks

2015-11-10 Thread Ilya Enkovich
On 19 Oct 15:30, Ilya Enkovich wrote:
> Hi,
> 
> This patch adds patterns to be used for vector masks pack/unpack for AVX512.  
>  Bootstrapped and tested on x86_64-unknown-linux-gnu.  Does it look OK?
> 
> Thanks,
> Ilya

Here is a modified version which reflects the change in the sign of the
boolean type.  Only the pattern names were changed.  Bootstrapped and tested
on x86_64-unknown-linux-gnu.  Does it look OK?
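
As a rough scalar model of what the QI/HI patterns in this message express,
here is a sketch in plain C++.  The function names are invented for
illustration, and the operand order simply follows vec_pack_trunc_qi as it is
written in this version of the patch (operand 1 goes to the high byte).

```cpp
#include <cstdint>

// Model of vec_pack_trunc_qi as written here: operand 1 is shifted
// into the high byte, operand 2 fills the low byte.
constexpr std::uint16_t pack_masks_qi (std::uint8_t op1, std::uint8_t op2)
{
  return static_cast<std::uint16_t> ((static_cast<unsigned> (op1) << 8) | op2);
}

// Model of vec_unpacks_lo_hi: take the low byte of the 16-bit mask.
constexpr std::uint8_t unpack_lo_hi (std::uint16_t mask)
{
  return static_cast<std::uint8_t> (mask & 0xff);
}

// Model of vec_unpacks_hi_hi: logical shift right by 8, keep the low byte.
constexpr std::uint8_t unpack_hi_hi (std::uint16_t mask)
{
  return static_cast<std::uint8_t> (mask >> 8);
}
```

With this model, unpacking the low and high halves of a packed value gives
back operand 2 and operand 1 respectively.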

Thanks,
Ilya
--
gcc/

2015-11-10  Ilya Enkovich  <enkovich@gmail.com>

* config/i386/sse.md (HALFMASKMODE): New attribute.
(DOUBLEMASKMODE): New attribute.
(vec_pack_trunc_qi): New.
(vec_pack_trunc_<mode>): New.
(vec_unpacks_lo_hi): New.
(vec_unpacks_lo_si): New.
(vec_unpacks_lo_di): New.
(vec_unpacks_hi_hi): New.
(vec_unpacks_hi_<mode>): New.

gcc/testsuite/

2015-11-10  Ilya Enkovich  <enkovich@gmail.com>

* gcc.target/i386/mask-pack.c: New test.
* gcc.target/i386/mask-unpack.c: New test.


diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 452629f..aad6a0d 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -799,6 +799,14 @@
   [(V32QI "t") (V16HI "t") (V8SI "t") (V4DI "t") (V8SF "t") (V4DF "t")
(V64QI "g") (V32HI "g") (V16SI "g") (V8DI "g") (V16SF "g") (V8DF "g")])
 
+;; Half mask mode for unpacks
+(define_mode_attr HALFMASKMODE
+  [(DI "SI") (SI "HI")])
+
+;; Double mask mode for packs
+(define_mode_attr DOUBLEMASKMODE
+  [(HI "SI") (SI "DI")])
+
 
 ;; Include define_subst patterns for instructions with mask
 (include "subst.md")
@@ -11578,6 +11586,23 @@
   DONE;
 })
 
+(define_expand "vec_pack_trunc_qi"
+  [(set (match_operand:HI 0 ("register_operand"))
+        (ior:HI (ashift:HI (zero_extend:HI (match_operand:QI 1 ("register_operand")))
+                           (const_int 8))
+                (zero_extend:HI (match_operand:QI 2 ("register_operand")))))]
+  "TARGET_AVX512F")
+
+(define_expand "vec_pack_trunc_<mode>"
+  [(set (match_operand:<DOUBLEMASKMODE> 0 ("register_operand"))
+        (ior:<DOUBLEMASKMODE> (ashift:<DOUBLEMASKMODE> (zero_extend:<DOUBLEMASKMODE> (match_operand:SWI24 1 ("register_operand")))
+                           (match_dup 3))
+                (zero_extend:<DOUBLEMASKMODE> (match_operand:SWI24 2 ("register_operand")))))]
+  "TARGET_AVX512BW"
+{
+  operands[3] = GEN_INT (GET_MODE_BITSIZE (<MODE>mode));
+})
+
 (define_insn "_packsswb"
   [(set (match_operand:VI1_AVX512 0 "register_operand" "=x,x")
(vec_concat:VI1_AVX512
@@ -13474,12 +13499,42 @@
   "TARGET_SSE2"
   "ix86_expand_sse_unpack (operands[0], operands[1], true, false); DONE;")
 
+(define_expand "vec_unpacks_lo_hi"
+  [(set (match_operand:QI 0 "register_operand")
+        (subreg:QI (match_operand:HI 1 "register_operand") 0))]
+  "TARGET_AVX512DQ")
+
+(define_expand "vec_unpacks_lo_si"
+  [(set (match_operand:HI 0 "register_operand")
+        (subreg:HI (match_operand:SI 1 "register_operand") 0))]
+  "TARGET_AVX512F")
+
+(define_expand "vec_unpacks_lo_di"
+  [(set (match_operand:SI 0 "register_operand")
+        (subreg:SI (match_operand:DI 1 "register_operand") 0))]
+  "TARGET_AVX512BW")
+
 (define_expand "vec_unpacku_hi_"
   [(match_operand: 0 "register_operand")
(match_operand:VI124_AVX2_24_AVX512F_1_AVX512BW 1 "register_operand")]
   "TARGET_SSE2"
   "ix86_expand_sse_unpack (operands[0], operands[1], true, true); DONE;")
 
+(define_expand "vec_unpacks_hi_hi"
+  [(set (subreg:HI (match_operand:QI 0 "register_operand") 0)
+        (lshiftrt:HI (match_operand:HI 1 "register_operand")
+                     (const_int 8)))]
+  "TARGET_AVX512F")
+
+(define_expand "vec_unpacks_hi_<mode>"
+  [(set (subreg:SWI48x (match_operand:<HALFMASKMODE> 0 "register_operand") 0)
+        (lshiftrt:SWI48x (match_operand:SWI48x 1 "register_operand")
+                         (match_dup 2)))]
+  "TARGET_AVX512BW"
+{
+  operands[2] = GEN_INT (GET_MODE_BITSIZE (<HALFMASKMODE>mode));
+})
+
 ;
 ;;
 ;; Miscellaneous
diff --git a/gcc/testsuite/gcc.target/i386/mask-pack.c 
b/gcc/testsuite/gcc.target/i386/mask-pack.c
new file mode 100644
index 000..0b564ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mask-pack.c
@@ -0,0 +1,100 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O3 -fopenmp-simd -fdump-tree-vect-details" } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 10 "vect" } } */
+/* { dg-final { scan-assembler-

Re: [RFC] Combine vectorized loops with its scalar remainder.

2015-11-10 Thread Ilya Enkovich
2015-11-10 15:30 GMT+03:00 Richard Biener :
> On Tue, Nov 3, 2015 at 1:08 PM, Yuri Rumyantsev  wrote:
>> Richard,
>>
>> It looks like misunderstanding - we assume that for GCCv6 the simple
>> scheme of remainder will be used through introducing new IV :
>> https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01435.html
>>
>> Is it true or we missed something?
>
> 
>> > Do you have an idea how "masking" is better be organized to be usable
>> > for both 4b and 4c?
>>
>> Do 2a ...
> Okay.
> 

2a was 'transform already vectorized loop as a separate
post-processing'.  Isn't that what this prototype patch implements?
The current version only masks the loop body, which in practice is
applicable only to AVX-512 in most cases.  With AVX-512 it's easier to
see how profitable masking might be, and it is the main target for the
first masking version.  Extending it to prologues/epilogues, and thus
making it more profitable for other targets, is the next step and is
out of the scope of this patch.
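
To make "masking the loop body" concrete, here is a generic scalar emulation
in C++.  It is only an illustration of the idea, not code from the prototype
patch; the vectorization factor of 4 and the function name are assumptions.

```cpp
#include <cstddef>

constexpr std::size_t VF = 4;  // assumed vectorization factor

// Emulates a vectorized loop whose body is masked, so the final partial
// iteration is handled by the same body instead of a scalar remainder loop.
void add_one_masked (int *a, std::size_t n)
{
  for (std::size_t base = 0; base < n; base += VF)
    {
      bool mask[VF];
      for (std::size_t lane = 0; lane < VF; lane++)
        mask[lane] = base + lane < n;   // lanes past the end are disabled

      for (std::size_t lane = 0; lane < VF; lane++)
        if (mask[lane])                 // stands in for masked load/add/store
          a[base + lane] += 1;
    }
}
```

For n = 6 the second pass through the body runs with only two active lanes,
which is the work a separate scalar remainder loop would otherwise do.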

Thanks,
Ilya

>
> Richard.
>


Re: [PATCH] Simple optimization for MASK_STORE.

2015-11-10 Thread Ilya Enkovich
2015-11-10 15:33 GMT+03:00 Richard Biener :
> On Fri, Nov 6, 2015 at 2:28 PM, Yuri Rumyantsev  wrote:
>> Richard,
>>
>> I tried it but 256-bit precision integer type is not yet supported.
>
> What's the symptom?  The compare cannot be expanded?  Just add a pattern then.
> After all we have modes up to XImode.

I suppose the problem may be in:

gcc/config/i386/i386-modes.def:#define MAX_BITSIZE_MODE_ANY_INT (128)

which doesn't allow creating constants of a bigger size.  Changing it
to the maximum vector size (512) would mean increasing the wide_int
structure size significantly.  New patterns are probably also needed.

Ilya

>
> Richard.
>
>> Yuri.
>>
>>


Re: [vec-cmp, patch 2/6] Vectorization factor computation

2015-11-09 Thread Ilya Enkovich
2015-10-20 16:45 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Wed, Oct 14, 2015 at 1:21 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
>> 2015-10-13 16:37 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>> On Thu, Oct 8, 2015 at 4:59 PM, Ilya Enkovich <enkovich@gmail.com> 
>>> wrote:
>>>> Hi,
>>>>
>>>> This patch handles statements with boolean result in vectorization factor
>>>> computation.  For a comparison, the type of its operands is used instead
>>>> of the result type to compute VF.  Other boolean statements are ignored
>>>> for VF.
>>>>
>>>> Vectype for comparison is computed using type of compared values.  
>>>> Computed type is propagated into other boolean operations.
>>>
>>> This feels rather ad-hoc, mixing up the existing way of computing
>>> vector type and VF.  I'd rather have turned the whole
>>> vector type computation around to the scheme working on the operands
>>> rather than on the lhs and then searching
>>> for smaller/larger types on the rhs'.
>>>
>>> I know this is a tricky function (heh, but you make it even worse...).
>>> And it needs a helper with knowledge about operations
>>> so one can compute the result vector type for an operation on its
>>> operands.  The seeds should be PHIs (handled like now)
>>> and loads, and yes, externals need special handling.
>>>
>>> Ideally we'd do things in two stages, first compute vector types in a
>>> less constrained manner (not forcing a single vector size)
>>> and then in a 2nd run promote to a common size also computing the VF to do 
>>> that.
>>
>> This sounds like a refactoring, not a functional change, right? Also I
>> don't see a reason to analyze DF to compute vectypes if we promote it
>> to a single vector size anyway. For booleans we have to do it because
>> boolean vectors of the same size may have different number of
>> elements. What is the reason to do it for other types?
>
> For conversions and operators which support different sized operands

That's what we handle in vector patterns, using some helper functions
to determine vectypes there.  It looks like this refactoring would affect
patterns significantly.  Maybe compute vectypes before searching for
patterns?

>
>> Shouldn't it be a patch independent from comparison vectorization series?
>
> As you like.

I'd like to move on with vector comparison and consider VF computation
refactoring once it has stabilized.  This patch is the last one (except
the target ones) not yet approved in the vector comparison related series.
Would it be OK to go ahead with it in its current shape?

Thanks,
Ilya


Re: [PATCH] Use signed boolean type for boolean vectors

2015-11-09 Thread Ilya Enkovich
On 03 Nov 14:42, Richard Biener wrote:
> On Wed, Oct 28, 2015 at 4:30 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
> > 2015-10-28 18:21 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> >> On Wed, Oct 28, 2015 at 2:13 PM, Ilya Enkovich <enkovich@gmail.com> 
> >> wrote:
> >>> Hi,
> >>>
> >>> Testing boolean vector conversions I found several runtime regressions
> >>> and investigation showed it's due to incorrect conversion caused by
> >>> unsigned boolean type.  When boolean vector is represented as an
> >>> integer vector on target it's a signed integer actually.  Unsigned
> >>> boolean type was chosen due to possible single bit values, but for
> >>> multiple bit values it causes wrong casting.  The easiest way to fix
> >>> it is to use signed boolean value.  The following patch does this and
> >>> fixes my problems with conversion.  Bootstrapped and tested on
> >>> x86_64-unknown-linux-gnu.  Is it OK?
> >>
> >> Hmm.  Actually formally the "boolean" vectors were always 0 or -1
> >> (all bits set).  That is also true for a signed boolean with precision 1
> >> but with higher precision what makes sure to sign-extend 'true'?
> >>
> >> So it's far from an obvious change, esp as you don't change the
> >> precision == 1 case.  [I still think we should have precision == 1
> >> for all boolean types]
> >>
> >> Richard.
> >>
> >
> > For a signed type with 1-bit precision the value 1 is out of range, right?
> > This might break in many places because 1 is used as the true value.
> 
> For vectors -1 is true.  Did you try whether it breaks many places?
> build_int_cst (type, 1) should still work fine.
> 
> Richard.
> 

I tried it and didn't find any new failures.  So it looks like I was wrong to
assume it would cause many failures.  Testing is not complete because many
SPEC benchmarks fail to compile at -O3 for AVX-512 on trunk.  But I think we
may proceed with the signed type and fix constant generation issues if any
are revealed.  This patch was bootstrapped and regtested on
x86_64-unknown-linux-gnu.  OK for trunk?
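
For readers following the signedness argument, the effect can be seen on a
single mask element.  This is a plain C++ sketch with invented helper names,
not GCC code: widening an all-ones 8-bit element keeps it all ones only when
the element type is treated as signed.

```cpp
#include <cstdint>

// Widen an 8-bit mask element to 32 bits, treating it as unsigned.
// This zero-extends, so 0xff becomes 0x000000ff.
constexpr std::uint32_t widen_unsigned (std::uint8_t elt)
{
  return elt;
}

// Widen the same bits, treating the element as signed.
// This sign-extends, so 0xff (-1) becomes 0xffffffff.
constexpr std::uint32_t widen_signed (std::uint8_t elt)
{
  return static_cast<std::uint32_t> (
    static_cast<std::int32_t> (static_cast<std::int8_t> (elt)));
}
```

A false element (0) widens to 0 either way; only the true element differs.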

Thanks,
Ilya
--
gcc/

2015-11-09  Ilya Enkovich  <enkovich@gmail.com>

* optabs.c (expand_vec_cond_expr): Always get sign from type.
* tree.c (wide_int_to_tree): Support negative values for boolean.
(build_nonstandard_boolean_type): Use signed type for booleans.


diff --git a/gcc/optabs.c b/gcc/optabs.c
index fdcdc6a..44971ad 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -5365,7 +5365,6 @@ expand_vec_cond_expr (tree vec_cond_type, tree op0, tree 
op1, tree op2,
   op0a = TREE_OPERAND (op0, 0);
   op0b = TREE_OPERAND (op0, 1);
   tcode = TREE_CODE (op0);
-  unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a));
 }
   else
 {
@@ -5374,9 +5373,9 @@ expand_vec_cond_expr (tree vec_cond_type, tree op0, tree 
op1, tree op2,
   op0a = op0;
   op0b = build_zero_cst (TREE_TYPE (op0));
   tcode = LT_EXPR;
-  unsignedp = false;
 }
   cmp_op_mode = TYPE_MODE (TREE_TYPE (op0a));
+  unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a));
 
 
   gcc_assert (GET_MODE_SIZE (mode) == GET_MODE_SIZE (cmp_op_mode)
diff --git a/gcc/tree.c b/gcc/tree.c
index 18d6544..6fb4c09 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -1437,7 +1437,7 @@ wide_int_to_tree (tree type, const wide_int_ref )
case BOOLEAN_TYPE:
  /* Cache false or true.  */
  limit = 2;
- if (hwi < 2)
+ if (IN_RANGE (hwi, 0, 1))
ix = hwi;
  break;
 
@@ -8069,7 +8069,7 @@ build_nonstandard_boolean_type (unsigned HOST_WIDE_INT 
precision)
 
   type = make_node (BOOLEAN_TYPE);
   TYPE_PRECISION (type) = precision;
-  fixup_unsigned_type (type);
+  fixup_signed_type (type);
 
   if (precision <= MAX_INT_CACHED_PREC)
 nonstandard_boolean_type_cache[precision] = type;


Re: [vec-cmp, patch 3/6] Vectorize comparison

2015-11-09 Thread Ilya Enkovich
On 26 Oct 16:09, Richard Biener wrote:
> On Wed, Oct 14, 2015 at 6:12 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
> > +
> > + ops.release ();
> > + vec_defs.release ();
> 
> No need to release auto_vec<>s at the end of scope explicitly.

Fixed

> 
> > + vec_compare = build2 (code, mask_type, vec_rhs1, vec_rhs2);
> > + new_stmt = gimple_build_assign (mask, vec_compare);
> > + new_temp = make_ssa_name (mask, new_stmt);
> > + gimple_assign_set_lhs (new_stmt, new_temp);
> 
>  new_temp = make_ssa_name (mask);
>  gimple_build_assign (new_temp, code, vec_rhs1, vec_rhs2);
> 
> for the 4 stmts above.

Fixed

> 
> > +
> > +  vec_oprnds0.release ();
> > +  vec_oprnds1.release ();
> 
> Please use auto_vec<>s.

These are used to hold vecs returned by vect_get_slp_defs, so I can't
use auto_vec here.

> 
> Ok with those changes.
> 
> Richard.
> 


gcc/

2015-11-09  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-data-refs.c (vect_get_new_vect_var): Support vect_mask_var.
(vect_create_destination_var): Likewise.
* tree-vect-stmts.c (vectorizable_comparison): New.
(vect_analyze_stmt): Add vectorizable_comparison.
(vect_transform_stmt): Likewise.
* tree-vectorizer.h (enum vect_var_kind): Add vect_mask_var.
(enum stmt_vec_info_type): Add comparison_vec_info_type.
(vectorizable_comparison): New.


diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 11bce79..926752b 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -3790,6 +3790,9 @@ vect_get_new_vect_var (tree type, enum vect_var_kind 
var_kind, const char *name)
   case vect_scalar_var:
 prefix = "stmp";
 break;
+  case vect_mask_var:
+prefix = "mask";
+break;
   case vect_pointer_var:
 prefix = "vectp";
 break;
@@ -4379,7 +4382,11 @@ vect_create_destination_var (tree scalar_dest, tree 
vectype)
   tree type;
   enum vect_var_kind kind;
 
-  kind = vectype ? vect_simple_var : vect_scalar_var;
+  kind = vectype
+    ? VECTOR_BOOLEAN_TYPE_P (vectype)
+      ? vect_mask_var
+      : vect_simple_var
+    : vect_scalar_var;
   type = vectype ? vectype : TREE_TYPE (scalar_dest);
 
   gcc_assert (TREE_CODE (scalar_dest) == SSA_NAME);
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index f1216c8..ee549f4 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7416,6 +7416,185 @@ vectorizable_condition (gimple *stmt, 
gimple_stmt_iterator *gsi,
   return true;
 }
 
+/* vectorizable_comparison.
+
+   Check if STMT is comparison expression that can be vectorized.
+   If VEC_STMT is also passed, vectorize the STMT: create a vectorized
+   comparison, put it in VEC_STMT, and insert it at GSI.
+
+   Return FALSE if not a vectorizable STMT, TRUE otherwise.  */
+
+bool
+vectorizable_comparison (gimple *stmt, gimple_stmt_iterator *gsi,
+                         gimple **vec_stmt, tree reduc_def,
+                         slp_tree slp_node)
+{
+  tree lhs, rhs1, rhs2;
+  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+  tree vectype1 = NULL_TREE, vectype2 = NULL_TREE;
+  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+  tree vec_rhs1 = NULL_TREE, vec_rhs2 = NULL_TREE;
+  tree new_temp;
+  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+  enum vect_def_type dts[2] = {vect_unknown_def_type, vect_unknown_def_type};
+  unsigned nunits;
+  int ncopies;
+  enum tree_code code;
+  stmt_vec_info prev_stmt_info = NULL;
+  int i, j;
+  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
+  vec<tree> vec_oprnds0 = vNULL;
+  vec<tree> vec_oprnds1 = vNULL;
+  gimple *def_stmt;
+  tree mask_type;
+  tree mask;
+
+  if (!VECTOR_BOOLEAN_TYPE_P (vectype))
+    return false;
+
+  mask_type = vectype;
+  nunits = TYPE_VECTOR_SUBPARTS (vectype);
+
+  if (slp_node || PURE_SLP_STMT (stmt_info))
+    ncopies = 1;
+  else
+    ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
+
+  gcc_assert (ncopies >= 1);
+  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
+    return false;
+
+  if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
+      && !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
+           && reduc_def))
+    return false;
+
+  if (STMT_VINFO_LIVE_P (stmt_info))
+    {
+      if (dump_enabled_p ())
+        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                         "value used after loop.\n");
+      return false;
+    }
+
+  if (!is_gimple_assign (stmt))
+    return false;
+
+  code = gimple_assign_rhs_code (stmt);
+
+  if (TREE_CODE_CLASS (code) != tcc_comparison)
+    return false;
+
+  rhs1 = gimple_assign_rhs1 (stmt);
+  rhs2 = gimple_assign_rhs2 (stmt);
+
+  if (!vect_is_simple_use (rhs1, stmt_info->vinfo, 

Re: [vec-cmp, patch 4/6] Support vector mask invariants

2015-11-09 Thread Ilya Enkovich
On 26 Oct 16:21, Richard Biener wrote:
> On Wed, Oct 14, 2015 at 6:13 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
> > -   val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (type), val);
> > +   {
> > + /* Can't use VIEW_CONVERT_EXPR for booleans because
> > +of possibly different sizes of scalar value and
> > +vector element.  */
> > + if (VECTOR_BOOLEAN_TYPE_P (type))
> > +   {
> > + if (integer_zerop (val))
> > +   val = build_int_cst (TREE_TYPE (type), 0);
> > + else if (integer_onep (val))
> > +   val = build_int_cst (TREE_TYPE (type), 1);
> > + else
> > +   gcc_unreachable ();
> > +   }
> > + else
> > +   val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (type), val);
> 
> I think the existing code is fine with using fold_convert () here
> which should also work
> for the boolean types.  So does just
> 
>   val = fold_convert (TREE_TYPE (type), val);
> 
> work?

It seems to work OK.

> 
> > @@ -7428,13 +7459,13 @@ vectorizable_condition (gimple *stmt, 
> > gimple_stmt_iterator *gsi,
> >   gimple *gtemp;
> >   vec_cond_lhs =
> >   vect_get_vec_def_for_operand (TREE_OPERAND (cond_expr, 0),
> > -   stmt, NULL);
> > +   stmt, NULL, comp_vectype);
> >   vect_is_simple_use (TREE_OPERAND (cond_expr, 0), stmt,
> >   loop_vinfo, , , [0]);
> >
> >   vec_cond_rhs =
> > vect_get_vec_def_for_operand (TREE_OPERAND (cond_expr, 1),
> > -   stmt, NULL);
> > + stmt, NULL, comp_vectype);
> >   vect_is_simple_use (TREE_OPERAND (cond_expr, 1), stmt,
> >   loop_vinfo, , , [1]);
> 
> I still don't like this very much but I guess without some major
> refactoring of all
> the functions there isn't a better way to do it for now.
> 
> Thus, ok with trying the change suggested above.
> 
> Thanks,
> Richard.
> 

Here is an updated version.

Thanks,
Ilya
--
gcc/

2015-11-09  Ilya Enkovich  <enkovich@gmail.com>

* expr.c (const_vector_mask_from_tree): New.
(const_vector_from_tree): Use const_vector_mask_from_tree
for boolean vectors.
* tree-vect-stmts.c (vect_init_vector): Support boolean vector
invariants.
(vect_get_vec_def_for_operand): Add VECTYPE arg.
(vectorizable_condition): Directly provide vectype for invariants
used in comparison.
* tree-vectorizer.h (vect_get_vec_def_for_operand): Add VECTYPE
arg.


diff --git a/gcc/expr.c b/gcc/expr.c
index 2b2174f..03936ee 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11423,6 +11423,40 @@ try_tablejump (tree index_type, tree index_expr, tree 
minval, tree range,
   return 1;
 }
 
+/* Return a CONST_VECTOR rtx representing vector mask for
+   a VECTOR_CST of booleans.  */
+static rtx
+const_vector_mask_from_tree (tree exp)
+{
+  rtvec v;
+  unsigned i;
+  int units;
+  tree elt;
+  machine_mode inner, mode;
+
+  mode = TYPE_MODE (TREE_TYPE (exp));
+  units = GET_MODE_NUNITS (mode);
+  inner = GET_MODE_INNER (mode);
+
+  v = rtvec_alloc (units);
+
+  for (i = 0; i < VECTOR_CST_NELTS (exp); ++i)
+    {
+      elt = VECTOR_CST_ELT (exp, i);
+
+      gcc_assert (TREE_CODE (elt) == INTEGER_CST);
+      if (integer_zerop (elt))
+        RTVEC_ELT (v, i) = CONST0_RTX (inner);
+      else if (integer_onep (elt)
+               || integer_minus_onep (elt))
+        RTVEC_ELT (v, i) = CONSTM1_RTX (inner);
+      else
+        gcc_unreachable ();
+    }
+
+  return gen_rtx_CONST_VECTOR (mode, v);
+}
+
 /* Return a CONST_VECTOR rtx for a VECTOR_CST tree.  */
 static rtx
 const_vector_from_tree (tree exp)
@@ -11438,6 +11472,9 @@ const_vector_from_tree (tree exp)
   if (initializer_zerop (exp))
 return CONST0_RTX (mode);
 
+  if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (exp)))
+return const_vector_mask_from_tree (exp);
+
   units = GET_MODE_NUNITS (mode);
   inner = GET_MODE_INNER (mode);
 
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index ee549f4..af203ab 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1300,7 +1300,7 @@ vect_init_vector (gimple *stmt, tree val, tree type, 
gimple_stmt_iterator *gsi)
   if (!types_compatible_p (TREE_TYPE (type), TREE_TYPE (val)))
{
  if (CONSTANT_CLASS_P (val))
-   val = fold_unary (VIEW_CONVERT_EXPR, TREE

Re: [PATCH] Fix c-c++-common/torture/vector-compare-1.c on powerpc

2015-11-05 Thread Ilya Enkovich
On 29 Oct 15:29, Richard Biener wrote:
> 
> I think this should unconditionally produce the COND_EXPR and
> build cst_true using build_all_ones_cst (stype).
> 
> Ok with that change.
> 
> Thanks,
> Richard.
> 

Here is an updated patch version.  Bootstrapped and regtested on 
powerpc64le-unknown-linux-gnu and x86_64-unknown-linux-gnu.  Applied to trunk.
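
The effect of the change can be pictured with a scalar loop over the vector
elements.  This is a standalone C++ sketch, not the GCC lowering code; the
function name and the choice of a less-than comparison on 32-bit elements are
assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>

// Element-wise "a < b" lowered to scalars: each result element is all
// ones (-1) when the comparison holds and 0 otherwise, instead of the
// 1/0 a plain scalar comparison would yield.
void lower_less_than (const std::int32_t *a, const std::int32_t *b,
                      std::int32_t *res, std::size_t n)
{
  for (std::size_t i = 0; i < n; i++)
    res[i] = a[i] < b[i] ? -1 : 0;
}
```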

Thanks,
Ilya
--
gcc/

2015-11-05  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-generic.c (do_compare): Use -1 for true
result instead of 1.


diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index d0a4e0f..b59f699 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -161,10 +161,16 @@ static tree
 do_compare (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b,
tree bitpos, tree bitsize, enum tree_code code, tree type)
 {
+  tree stype = TREE_TYPE (type);
+  tree cst_false = build_zero_cst (stype);
+  tree cst_true = build_all_ones_cst (stype);
+  tree cmp;
+
   a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
   b = tree_vec_extract (gsi, inner_type, b, bitsize, bitpos);
 
-  return gimplify_build2 (gsi, code, TREE_TYPE (type), a, b);
+  cmp = build2 (code, boolean_type_node, a, b);
+  return gimplify_build3 (gsi, COND_EXPR, stype, cmp, cst_true, cst_false);
 }
 
 /* Expand vector addition to scalars.  This does bit twiddling


Re: [vec-cmp, patch 1/6] Add optabs for vector comparison

2015-11-05 Thread Ilya Enkovich
2015-10-27 23:52 GMT+03:00 Jeff Law :
>
> Sigh.  I searched for the enum type, not for CODE_FOR_nothing ;(  My bad.
>
> If it's easy to get rid of, yes.  I believe we've got 3 uses of
> CODE_FOR_nothing.  AFAICT in none of those cases do we care about the code
> other than does it correspond to CODE_FOR_nothing.
>
> Ideally we'd like to have both optabs-query and optabs-tree not know about
> insn codes.  The former is supposed to be IR agnostic, but insn codes are
> part of the RTL IR, so that's a wart.  The latter is supposed to be tree
> specific and thus shouldn't know about the RTL IR either.
>
> I'd settle for getting the wart out of optabs-tree and we can put further
> cleanup of optabs-query in the queue.
>
> To get the wart out of optabs-tree all I think we need is a true boolean
> function that tells us if there's a suitable optab.
>
> It's unfortunate that the routines exported by optabs-query are
> can_{extend,float,fix}_p since those would fairly natural for the boolean
> query we want to make and they're used elsewhere, but not in a boolean form.
>
> I think that we ought to rename the existing uses & definition of can_XXX_p
> that are exported by optabs-query.c, then creating new can_XXX_p for those
> uses that just care about the boolean status should work.  At that point we
> remove insn-codes.h from optab-tree.c.

Do you want this refactoring to be part of this patch or series?

Thanks,
Ilya

>
> Jeff


[PATCH, PR tree-optimization/68145] Fix vectype computation in vectorizable_operation

2015-11-05 Thread Ilya Enkovich
Hi,

This patch fixes the way vectype is computed in vectorizable_operation.
Currently op0 is always used to compute vectype.  If op0 is a loop invariant,
its scalar type is used to get vectype, which does not work for booleans
because they need context to compute the correct vectype.  This patch uses
the output vectype in such cases; this should always work for operations on
booleans.  Bootstrapped on x86_64-unknown-linux-gnu.  Regression testing is
in progress.  OK if no regressions?
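
The kind of loop in question can be sketched in a few lines of C++.  This is
a simplified illustration rather than the reduced testcase from the PR, and
it makes no claim about whether GCC vectorizes it or which operand ends up as
op0; the function name is made up.

```cpp
// 'c' does not change inside the loop, so for the AND below it is a
// loop-invariant boolean operand.  Its scalar type alone does not say
// which vector type (mask or integer vector) the operation should use.
void and_with_invariant (const bool *in, bool *out, int n, bool c)
{
  for (int i = 0; i < n; i++)
    out[i] = in[i] & c;
}
```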

Thanks,
Ilya
--
gcc/

2015-11-05  Ilya Enkovich  <enkovich@gmail.com>

PR tree-optimization/68145
* tree-vect-stmts.c (vectorizable_operation): Fix
determination for booleans.

gcc/testsuite/

2015-11-05  Ilya Enkovich  <enkovich@gmail.com>

PR tree-optimization/68145
* g++.dg/vect/pr68145.cc: New test.


diff --git a/gcc/testsuite/g++.dg/vect/pr68145.cc 
b/gcc/testsuite/g++.dg/vect/pr68145.cc
new file mode 100644
index 000..51e663a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/pr68145.cc
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+
+struct A {
+  bool operator()(int p1, int p2) { return p1 && p2; }
+};
+class B {
+public:
+  bool *cbegin();
+  bool *cend();
+};
+template <typename T> void operator&&(B p1, T p2) {
+  B a;
+  arrayContTransform(p1, p2, a, A());
+}
+
+template <typename _InputIterator1, typename _OutputIterator, typename T, typename _BinaryOperation>
+void myrtransform(_InputIterator1 p1, _OutputIterator p2, T p3,
+  _BinaryOperation p4) {
+  _InputIterator1 b;
+  for (; b != p1; ++b, ++p2)
+*p2 = p4(*b, p3);
+}
+
+template <typename L, typename R, typename RES, typename BinaryOperator>
+void arrayContTransform(L p1, R p2, RES p3, BinaryOperator p4) {
+  myrtransform(p1.cend(), p3.cbegin(), p2, p4);
+}
+
+class C {
+public:
+  B getArrayBool();
+};
+class D {
+  B getArrayBool(const int &);
+  C lnode_p;
+};
+bool c;
+B D::getArrayBool(const int &) { lnode_p.getArrayBool() && c; }
+
+// { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target { i?86-*-* 
x86_64-*-* } } } }
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index ae14075..9aa2d4e 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -4697,7 +4697,26 @@ vectorizable_operation (gimple *stmt, 
gimple_stmt_iterator *gsi,
   /* If op0 is an external or constant def use a vector type with
  the same size as the output vector type.  */
   if (!vectype)
-vectype = get_same_sized_vectype (TREE_TYPE (op0), vectype_out);
+{
+  /* For boolean type we cannot determine vectype by
+invariant value (don't know whether it is a vector
+of booleans or vector of integers).  We use output
+vectype because operations on boolean don't change
+type.  */
+  if (TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE)
+   {
+ if (TREE_CODE (TREE_TYPE (scalar_dest)) != BOOLEAN_TYPE)
+   {
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"not supported operation on bool value.\n");
+ return false;
+   }
+ vectype = vectype_out;
+   }
+  else
+   vectype = get_same_sized_vectype (TREE_TYPE (op0), vectype_out);
+}
   if (vec_stmt)
 gcc_assert (vectype);
   if (!vectype)


Re: [Boolean Vector, patch 1/5] Introduce boolean vector to be used as a vector comparison type

2015-10-29 Thread Ilya Enkovich
On 28 Oct 22:37, Ilya Enkovich wrote:
> Seems the problem occurs in this check in expand_vector_operations_1:
> 
>   /* A scalar operation pretending to be a vector one.  */
>   if (VECTOR_BOOLEAN_TYPE_P (type)
>   && !VECTOR_MODE_P (TYPE_MODE (type))
>   && TYPE_MODE (type) != BLKmode)
> return;
> 
> This is to filter out scalar operations on boolean vectors.
> The problem here is that TYPE_MODE (type) doesn't return
> V4SImode assigned to the type but calls vector_type_mode
> instead which tries to find an integer mode for it and returns
> TImode. This causes function exit and we don't expand vector
> comparison.
> 
> Suppose simple option to fix it is to change default get_mask_mode
> hook to return BLKmode in case chosen integer vector mode is not
> vector_mode_supported_p.
> 
> Thanks,
> Ilya
> 

Here is a patch which fixes the problem on ARM (and on i386 with -mno-sse 
also).  I checked it fixes the problem on ARM and also bootstrapped and checked 
it on x86_64-unknown-linux-gnu.  Is it OK?

Thanks,
Ilya
--
gcc/

2015-10-29  Ilya Enkovich  <enkovich@gmail.com>

* targhooks.c (default_get_mask_mode): Use BLKmode in
case target doesn't support required vector mode.
* stor-layout.c (layout_type): Check for BLKmode.


diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 58ecd7b..ae7d6fb 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -2185,7 +2185,8 @@ layout_type (tree type)
TYPE_SATURATING (type) = TYPE_SATURATING (TREE_TYPE (type));
 TYPE_UNSIGNED (type) = TYPE_UNSIGNED (TREE_TYPE (type));
/* Several boolean vector elements may fit in a single unit.  */
-   if (VECTOR_BOOLEAN_TYPE_P (type))
+   if (VECTOR_BOOLEAN_TYPE_P (type)
+   && type->type_common.mode != BLKmode)
  TYPE_SIZE_UNIT (type)
= size_int (GET_MODE_SIZE (type->type_common.mode));
else
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index c39f266..d378864 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1095,10 +1095,16 @@ default_get_mask_mode (unsigned nunits, unsigned 
vector_size)
   unsigned elem_size = vector_size / nunits;
   machine_mode elem_mode
 = smallest_mode_for_size (elem_size * BITS_PER_UNIT, MODE_INT);
+  machine_mode vector_mode;
 
   gcc_assert (elem_size * nunits == vector_size);
 
-  return mode_for_vector (elem_mode, nunits);
+  vector_mode = mode_for_vector (elem_mode, nunits);
+  if (VECTOR_MODE_P (vector_mode)
+  && !targetm.vector_mode_supported_p (vector_mode))
+vector_mode = BLKmode;
+
+  return vector_mode;
 }
 
 /* By default, the cost model accumulates three separate costs (prologue,


[PATCH] Fix c-c++-common/torture/vector-compare-1.c on powerpc

2015-10-29 Thread Ilya Enkovich
Hi,

This patch fixes the powerpc failure of the 
c-c++-common/torture/vector-compare-1.c test.  The problem is that vector 
comparison lowering produces a vector of 0s and 1s instead of 0s and -1s.  This 
doesn't matter if the use of the result is also lowered (as happens on ARM and 
on i386 with -mno-sse), but on powerpc a comparison of doubles may be lowered 
while the following VEC_COND_EXPR is not.  That produces a wrong VEC_COND_EXPR 
result.  I checked that this patch fixes the test.  Full regression testing on 
powerpc64le-unknown-linux-gnu is in progress.  OK if no regressions?
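
To illustrate why the encoding of "true" matters, here is a minimal scalar sketch.  It assumes the non-lowered VEC_COND_EXPR is implemented as a per-lane bitwise select, which is an assumption about the target, not something taken from the patch.  The helper names (select_lane, cmp_gt_one, cmp_gt_all_ones) are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Per-lane select done purely with bit masking, as a vector select
   instruction would do it (assumed model, not GCC code).  */
int32_t
select_lane (int32_t mask, int32_t if_true, int32_t if_false)
{
  return (int32_t) (((uint32_t) mask & (uint32_t) if_true)
                    | (~(uint32_t) mask & (uint32_t) if_false));
}

/* Lowered comparison that yields 1 for true (the problematic encoding).  */
int32_t
cmp_gt_one (double a, double b)
{
  return a > b ? 1 : 0;
}

/* Lowered comparison that yields all bits set for true.  */
int32_t
cmp_gt_all_ones (double a, double b)
{
  return a > b ? -1 : 0;
}

/* Contrast the two encodings on the same inputs.  */
void
check_select (void)
{
  /* With -1 as true the select picks the first value.  */
  assert (select_lane (cmp_gt_all_ones (1.0, 0.0), 42, 7) == 42);
  /* With 1 as true only bit 0 comes from the first value.  */
  assert (select_lane (cmp_gt_one (1.0, 0.0), 42, 7) != 42);
}
```

With a mask of 1 the select mixes bits from both inputs, which is the kind of wrong result the test exposes.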

Thanks,
Ilya
--
gcc/

2015-10-29  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-generic.c (do_compare): Use -1 for true
result instead of 1.


diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index d0a4e0f..0b60b15 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -161,10 +161,27 @@ static tree
 do_compare (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b,
tree bitpos, tree bitsize, enum tree_code code, tree type)
 {
+  tree stype = TREE_TYPE (type);
+
   a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
   b = tree_vec_extract (gsi, inner_type, b, bitsize, bitpos);
 
-  return gimplify_build2 (gsi, code, TREE_TYPE (type), a, b);
+  if (TYPE_PRECISION (stype) > 1)
+{
+  tree cst_false = build_zero_cst (stype);
+  tree cst_true;
+  tree cmp;
+
+  if (TYPE_UNSIGNED (stype))
+   cst_true = TYPE_MAXVAL (stype);
+  else
+   cst_true = build_minus_one_cst (stype);
+
+  cmp = build2 (code, boolean_type_node, a, b);
+  return gimplify_build3 (gsi, COND_EXPR, stype, cmp, cst_true, cst_false);
+}
+
+  return gimplify_build2 (gsi, code, stype, a, b);
 }
 
 /* Expand vector addition to scalars.  This does bit twiddling


[PATCH] Use signed boolean type for boolean vectors

2015-10-28 Thread Ilya Enkovich
Hi,

Testing boolean vector conversions I found several runtime regressions
and investigation showed it's due to incorrect conversion caused by
unsigned boolean type.  When boolean vector is represented as an
integer vector on target it's a signed integer actually.  Unsigned
boolean type was chosen due to possible single bit values, but for
multiple bit values it causes wrong casting.  The easiest way to fix
it is to use signed boolean value.  The following patch does this and
fixes my problems with conversion.  Bootstrapped and tested on
x86_64-unknown-linux-gnu.  Is it OK?
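
A small scalar sketch of the casting problem, assuming mask elements use all-ones for true.  It only shows ordinary C integer widening, not the GCC conversion code itself; widen_unsigned and widen_signed are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Widen an 8-bit mask element as if its type were unsigned:
   the value is zero-extended.  */
int32_t
widen_unsigned (uint8_t elem)
{
  return (int32_t) elem;
}

/* Widen an 8-bit mask element as if its type were signed:
   the value is sign-extended.  */
int32_t
widen_signed (int8_t elem)
{
  return (int32_t) elem;
}

/* An all-ones "true" survives widening only in the signed case.  */
void
check_widen (void)
{
  assert (widen_signed (-1) == -1);        /* still all bits set */
  assert (widen_unsigned (0xFF) == 0xFF);  /* upper 24 bits are zero */
}
```

Only the signed variant keeps the widened element an all-ones mask, which is the motivation for using a signed boolean type when the precision is greater than 1.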

Thanks,
Ilya
--
gcc/

2015-10-28  Ilya Enkovich  <enkovich@gmail.com>

* optabs.c (expand_vec_cond_expr): Always get sign from type.
* tree.c (wide_int_to_tree): Support negative values for boolean.
(build_nonstandard_boolean_type): Use signed type for booleans
with precision greater than 1.


diff --git a/gcc/optabs.c b/gcc/optabs.c
index e1ac0b8..37a67f1 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -5373,7 +5373,6 @@ expand_vec_cond_expr (tree vec_cond_type, tree op0, tree 
op1, tree op2,
   op0a = TREE_OPERAND (op0, 0);
   op0b = TREE_OPERAND (op0, 1);
   tcode = TREE_CODE (op0);
-  unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a));
 }
   else
 {
@@ -5382,9 +5381,9 @@ expand_vec_cond_expr (tree vec_cond_type, tree op0, tree 
op1, tree op2,
   op0a = op0;
   op0b = build_zero_cst (TREE_TYPE (op0));
   tcode = LT_EXPR;
-  unsignedp = false;
 }
   cmp_op_mode = TYPE_MODE (TREE_TYPE (op0a));
+  unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a));
 
 
   gcc_assert (GET_MODE_SIZE (mode) == GET_MODE_SIZE (cmp_op_mode)
diff --git a/gcc/tree.c b/gcc/tree.c
index e77d4b8..712390f 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -1451,7 +1451,7 @@ wide_int_to_tree (tree type, const wide_int_ref &cst)
case BOOLEAN_TYPE:
  /* Cache false or true.  */
  limit = 2;
- if (hwi < 2)
+ if (IN_RANGE (hwi, 0, 1))
ix = hwi;
  break;
 
@@ -8076,7 +8076,10 @@ build_nonstandard_boolean_type (unsigned HOST_WIDE_INT 
precision)
 
   type = make_node (BOOLEAN_TYPE);
   TYPE_PRECISION (type) = precision;
-  fixup_unsigned_type (type);
+  if (precision > 1)
+fixup_signed_type (type);
+  else
+fixup_unsigned_type (type);
 
   if (precision <= MAX_INT_CACHED_PREC)
 nonstandard_boolean_type_cache[precision] = type;


Re: [mask-load, patch 1/2] Use boolean predicate for masked loads and store

2015-10-28 Thread Ilya Enkovich
On 23 Oct 13:36, Ilya Enkovich wrote:
> 2015-10-23 13:32 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> >
> > No, we'd get
> >
> >   mask_1 = bool != 1;
> >
> > and the 'mask' variable should have been simplified to 'bool'
> > (yes, we'd insert a dead stmt).  gimple_build simplifies
> > stmts via the match-and-simplify machinery and match.pd
> > knows how to invert conditions.
> >
> 
> Thanks! I'll try it.
> 
> Ilya

Hi,

Here is a new version.  The changes you suggested cause BIT_NOT_EXPR to be used 
for the generated mask (instead of the != 1 used before).  A small fix was 
required to get it vectorized and avoid regressions.  Is this version OK?
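
As a rough illustration of why a bitwise NOT can stand in for the old comparison, here is a scalar sketch.  It assumes each mask lane holds either 0 or -1; the function names are hypothetical and this is not the vectorizer code.

```c
#include <assert.h>
#include <stdint.h>

/* Invert a lane mask by comparing against "true" (-1).  */
int32_t
invert_by_compare (int32_t m)
{
  return m != -1 ? -1 : 0;
}

/* Invert a lane mask with a plain bitwise NOT.  */
int32_t
invert_by_bit_not (int32_t m)
{
  return ~m;
}

/* Both forms agree on the two valid lane values.  */
void
check_invert (void)
{
  assert (invert_by_compare (0) == invert_by_bit_not (0));
  assert (invert_by_compare (-1) == invert_by_bit_not (-1));
}
```

The two forms agree only for the values 0 and -1, so the equivalence relies on the mask being a well-formed boolean vector.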

Thanks,
Ilya
--
gcc/

2015-10-28  Ilya Enkovich  <enkovich@gmail.com>

* internal-fn.c (expand_MASK_LOAD): Adjust to maskload optab changes.
(expand_MASK_STORE): Adjust to maskstore optab changes.
* optabs-query.c (can_vec_mask_load_store_p): Add MASK_MODE arg.
 Adjust to maskload, maskstore optab changes.
* optabs-query.h (can_vec_mask_load_store_p): Add MASK_MODE arg.
* optabs.def (maskload_optab): Transform into convert optab.
(maskstore_optab): Likewise.
* tree-if-conv.c (ifcvt_can_use_mask_load_store): Adjust to
can_vec_mask_load_store_p signature change.
(predicate_mem_writes): Use boolean mask.
* tree-vect-stmts.c (vectorizable_mask_load_store): Adjust to
can_vec_mask_load_store_p signature change.  Allow invariant masks.
(vectorizable_operation): Ignore type precision for boolean vectors.

gcc/testsuite/

2015-10-28  Ilya Enkovich  <enkovich@gmail.com>

* gcc.target/i386/avx2-vec-mask-bit-not.c: New test.


diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index f12d3af..2317e20 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -1901,7 +1901,9 @@ expand_MASK_LOAD (gcall *stmt)
   create_output_operand (&ops[0], target, TYPE_MODE (type));
   create_fixed_operand (&ops[1], mem);
   create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
-  expand_insn (optab_handler (maskload_optab, TYPE_MODE (type)), 3, ops);
+  expand_insn (convert_optab_handler (maskload_optab, TYPE_MODE (type),
+ TYPE_MODE (TREE_TYPE (maskt))),
+  3, ops);
 }
 
 static void
@@ -1924,7 +1926,9 @@ expand_MASK_STORE (gcall *stmt)
   create_fixed_operand (&ops[0], mem);
   create_input_operand (&ops[1], reg, TYPE_MODE (type));
   create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
-  expand_insn (optab_handler (maskstore_optab, TYPE_MODE (type)), 3, ops);
+  expand_insn (convert_optab_handler (maskstore_optab, TYPE_MODE (type),
+ TYPE_MODE (TREE_TYPE (maskt))),
+  3, ops);
 }
 
 static void
diff --git a/gcc/optabs-query.c b/gcc/optabs-query.c
index 254089f..c20597c 100644
--- a/gcc/optabs-query.c
+++ b/gcc/optabs-query.c
@@ -466,7 +466,9 @@ can_mult_highpart_p (machine_mode mode, bool uns_p)
 /* Return true if target supports vector masked load/store for mode.  */
 
 bool
-can_vec_mask_load_store_p (machine_mode mode, bool is_load)
+can_vec_mask_load_store_p (machine_mode mode,
+  machine_mode mask_mode,
+  bool is_load)
 {
   optab op = is_load ? maskload_optab : maskstore_optab;
   machine_mode vmode;
@@ -474,7 +476,7 @@ can_vec_mask_load_store_p (machine_mode mode, bool is_load)
 
   /* If mode is vector mode, check it directly.  */
   if (VECTOR_MODE_P (mode))
-return optab_handler (op, mode) != CODE_FOR_nothing;
+return convert_optab_handler (op, mode, mask_mode) != CODE_FOR_nothing;
 
   /* Otherwise, return true if there is some vector mode with
  the mask load/store supported.  */
@@ -485,7 +487,12 @@ can_vec_mask_load_store_p (machine_mode mode, bool is_load)
   if (!VECTOR_MODE_P (vmode))
 return false;
 
-  if (optab_handler (op, vmode) != CODE_FOR_nothing)
+  mask_mode = targetm.vectorize.get_mask_mode (GET_MODE_NUNITS (vmode),
+  GET_MODE_SIZE (vmode));
+  if (mask_mode == VOIDmode)
+return false;
+
+  if (convert_optab_handler (op, vmode, mask_mode) != CODE_FOR_nothing)
 return true;
 
   vector_sizes = targetm.vectorize.autovectorize_vector_sizes ();
@@ -496,8 +503,10 @@ can_vec_mask_load_store_p (machine_mode mode, bool is_load)
   if (cur <= GET_MODE_SIZE (mode))
continue;
   vmode = mode_for_vector (mode, cur / GET_MODE_SIZE (mode));
+  mask_mode = targetm.vectorize.get_mask_mode (GET_MODE_NUNITS (vmode),
+  cur);
   if (VECTOR_MODE_P (vmode)
- && optab_handler (op, vmode) != CODE_FOR_nothing)
+ && convert_optab_handler (op, vmode, mask_mode) != CODE_FOR_nothing)
return true;
 }
   return false;
diff --git a/gcc/optabs-query.

Re: [PATCH] Use signed boolean type for boolean vectors

2015-10-28 Thread Ilya Enkovich
2015-10-28 18:21 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Wed, Oct 28, 2015 at 2:13 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
>> Hi,
>>
>> Testing boolean vector conversions I found several runtime regressions
>> and investigation showed it's due to incorrect conversion caused by
>> unsigned boolean type.  When boolean vector is represented as an
>> integer vector on target it's a signed integer actually.  Unsigned
>> boolean type was chosen due to possible single bit values, but for
>> multiple bit values it causes wrong casting.  The easiest way to fix
>> it is to use signed boolean value.  The following patch does this and
>> fixes my problems with conversion.  Bootstrapped and tested on
>> x86_64-unknown-linux-gnu.  Is it OK?
>
> Hmm.  Actually formally the "boolean" vectors were always 0 or -1
> (all bits set).  That is also true for a signed boolean with precision 1
> but with higher precision what makes sure to sign-extend 'true'?
>
> So it's far from an obvious change, esp as you don't change the
> precision == 1 case.  [I still think we should have precision == 1
> for all boolean types]
>
> Richard.
>

For a signed type with 1-bit precision the value 1 is out of range, right?  This
might break in many places because 1 is used as the true value.
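
The range argument can be checked with a few lines of C.  This is just two's complement arithmetic for a signed type of a given precision; signed_min, signed_max and fits_signed are names made up for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest value of a two's complement signed type with PRECISION bits.  */
int64_t
signed_min (unsigned precision)
{
  assert (precision >= 1 && precision <= 63);
  return -((int64_t) 1 << (precision - 1));
}

/* Largest value of a two's complement signed type with PRECISION bits.  */
int64_t
signed_max (unsigned precision)
{
  assert (precision >= 1 && precision <= 63);
  return ((int64_t) 1 << (precision - 1)) - 1;
}

/* Nonzero if VALUE is representable in PRECISION signed bits.  */
int
fits_signed (int64_t value, unsigned precision)
{
  return value >= signed_min (precision) && value <= signed_max (precision);
}
```

For precision 1 the range is [-1, 0], so 1 does not fit while -1 does.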

Ilya


Re: [Boolean Vector, patch 1/5] Introduce boolean vector to be used as a vector comparison type

2015-10-28 Thread Ilya Enkovich
Seems the problem occurs in this check in expand_vector_operations_1:

  /* A scalar operation pretending to be a vector one.  */
  if (VECTOR_BOOLEAN_TYPE_P (type)
  && !VECTOR_MODE_P (TYPE_MODE (type))
  && TYPE_MODE (type) != BLKmode)
return;

This is to filter out scalar operations on boolean vectors.
The problem here is that TYPE_MODE (type) doesn't return
V4SImode assigned to the type but calls vector_type_mode
instead which tries to find an integer mode for it and returns
TImode. This causes function exit and we don't expand vector
comparison.

I suppose a simple way to fix it is to change the default get_mask_mode
hook to return BLKmode in case the chosen integer vector mode is not
vector_mode_supported_p.

Thanks,
Ilya

2015-10-28 19:48 GMT+03:00 Bill Schmidt :
> On Wed, 2015-10-28 at 14:44 +0100, Christophe Lyon wrote:
>> Hi,
>>
>> Since r229128, I see:
>> FAIL: c-c++-common/torture/vector-compare-1.c   -O0  execution test
>> on arm targets, such as arm-none-eabi.
>
> Likewise for powerpc64le-linux-gnu.  The test produces:
>
> 0 != ((1.00 > 0.00 ? -1 : 0) FAIL: 
> c-c++-common/torture/vector-compare-1
> .c   -O0  execution test
>
>>
>> Christophe.
>>
>
>


Re: [PATCH, PR68062] Fix uniform vector operation lowering

2015-10-26 Thread Ilya Enkovich
On 26 Oct 10:56, Richard Biener wrote:
> On Mon, Oct 26, 2015 at 10:35 AM, Ilya Enkovich <enkovich@gmail.com> 
> wrote:
> > On 26 Oct 10:09, Richard Biener wrote:
> >> On Sat, Oct 24, 2015 at 12:29 AM, Ilya Enkovich <enkovich@gmail.com> 
> >> wrote:
> >> > 2015-10-24 0:32 GMT+03:00 Jeff Law <l...@redhat.com>:
> >> >> On 10/23/2015 09:26 AM, Ilya Enkovich wrote:
> >> >>>
> >> >>> Hi,
> >> >>>
> >> >>> This patch checks optab exists before using it vector vector statement
> >> >>> lowering.  It fixes compilation of test from PR68062 with 
> >> >>> -funsigned-char
> >> >>> option added (doesn't fix original testcase).  Bootstrapped for
> >> >>> x86_64-unknown-linux-gnu.  OK for trunk if no regressions?
> >> >>>
> >> >>> Thanks,
> >> >>> Ilya
> >> >>> --
> >> >>> gcc/
> >> >>>
> >> >>> 2015-10-23  Ilya Enkovich  <enkovich@gmail.com>
> >> >>>
> >> >>> * tree-vect-generic.c (expand_vector_operations_1): Check
> >> >>> optab exists before use it.
> >> >>>
> >> >>> gcc/testsuite/
> >> >>>
> >> >>> 2015-10-23  Ilya Enkovich  <enkovich@gmail.com>
> >> >>>
> >> >>> * g++.dg/pr68062.C: New test.
> >> >>
> >> >> OK.
> >> >>
> >> >> Just curious, what was the tree code for which we couldn't find a 
> >> >> suitable
> >> >> optab?
> >> >
> >> > Those are various comparison codes.
> >>
> >> Yeah, sorry for missing that check.  Btw, I was curious to see that we miss
> >> a way to query from optab_tag the "kind" (normal, conversion, etc.) so code
> >> can decide what optab_handler function to call (optab_handler or
> >> convert_optab_handler).  So the code I added errs on the "simplistic" side
> >> and hopes that matching lhs and rhs1 type always gets us a non-convert 
> >> optab...
> >
> > So probably better fix is
> >
> > diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
> > index d1fc0ba..73c5cc5 100644
> > --- a/gcc/tree-vect-generic.c
> > +++ b/gcc/tree-vect-generic.c
> > @@ -1533,7 +1533,8 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi)
> >&& TYPE_MODE (TREE_TYPE (type)) == TYPE_MODE (TREE_TYPE (srhs1)))
> >  {
> >op = optab_for_tree_code (code, TREE_TYPE (type), optab_scalar);
> > -  if (optab_handler (op, TYPE_MODE (TREE_TYPE (type))) != 
> > CODE_FOR_nothing)
> > +  if (op >= FIRST_NORM_OPTAB && op <= LAST_NORM_OPTAB
> > + && optab_handler (op, TYPE_MODE (TREE_TYPE (type))) != 
> > CODE_FOR_nothing)
> > {
> >   tree slhs = make_ssa_name (TREE_TYPE (srhs1));
> >   gimple *repl = gimple_build_assign (slhs, code, srhs1, srhs2);
> >
> > ?
> 
> Ah, didn't know we have those constants - yes, that's a better fix.
> After all we want
> optab_handler to return sth sensible for it.

I'll install it then.  Here is a version rebased on trunk.

Thanks,
Ilya
--
gcc/

2015-10-26  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-generic.c (expand_vector_operations_1): Check
optab type before using it.


diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index 9c59402..a376ca2 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -1533,7 +1533,7 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi)
   && TYPE_MODE (TREE_TYPE (type)) == TYPE_MODE (TREE_TYPE (srhs1)))
 {
   op = optab_for_tree_code (code, TREE_TYPE (type), optab_scalar);
-  if (op != unknown_optab
+  if (op >= FIRST_NORM_OPTAB && op <= LAST_NORM_OPTAB
  && optab_handler (op, TYPE_MODE (TREE_TYPE (type))) != 
CODE_FOR_nothing)
{
  tree slhs = make_ssa_name (TREE_TYPE (srhs1));
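
For readers not familiar with the optab enumeration, here is a toy model of the difference between the two checks.  It assumes, as the FIRST_NORM_OPTAB/LAST_NORM_OPTAB names suggest, that normal optabs form one contiguous range of the enum; every identifier below is hypothetical and none of it is real GCC code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the optab enumeration.  */
enum toy_optab
{
  TOY_UNKNOWN_OPTAB = 0,
  /* Conversion optabs: their handler lookup takes two modes.  */
  TOY_CONV_A_OPTAB,
  TOY_CONV_B_OPTAB,
  /* Normal optabs: their handler lookup takes one mode.  */
  TOY_ADD_OPTAB,
  TOY_SUB_OPTAB,
  TOY_FIRST_NORM_OPTAB = TOY_ADD_OPTAB,
  TOY_LAST_NORM_OPTAB = TOY_SUB_OPTAB
};

/* First version of the fix: only reject the "unknown" value.  */
bool
passes_unknown_check (enum toy_optab op)
{
  return op != TOY_UNKNOWN_OPTAB;
}

/* Installed version: accept only optabs from the normal range.  */
bool
passes_norm_range_check (enum toy_optab op)
{
  return op >= TOY_FIRST_NORM_OPTAB && op <= TOY_LAST_NORM_OPTAB;
}

/* A conversion optab slips through the first check but not the second.  */
void
check_optab_filters (void)
{
  assert (passes_unknown_check (TOY_CONV_A_OPTAB));
  assert (!passes_norm_range_check (TOY_CONV_A_OPTAB));
}
```

In this model the range check also rejects the unknown value, so it subsumes the earlier test.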


Re: [PATCH, PR68062] Fix uniform vector operation lowering

2015-10-26 Thread Ilya Enkovich
On 26 Oct 10:09, Richard Biener wrote:
> On Sat, Oct 24, 2015 at 12:29 AM, Ilya Enkovich <enkovich@gmail.com> 
> wrote:
> > 2015-10-24 0:32 GMT+03:00 Jeff Law <l...@redhat.com>:
> >> On 10/23/2015 09:26 AM, Ilya Enkovich wrote:
> >>>
> >>> Hi,
> >>>
> >>> This patch checks optab exists before using it vector vector statement
> >>> lowering.  It fixes compilation of test from PR68062 with -funsigned-char
> >>> option added (doesn't fix original testcase).  Bootstrapped for
> >>> x86_64-unknown-linux-gnu.  OK for trunk if no regressions?
> >>>
> >>> Thanks,
> >>> Ilya
> >>> --
> >>> gcc/
> >>>
> >>> 2015-10-23  Ilya Enkovich  <enkovich@gmail.com>
> >>>
> >>> * tree-vect-generic.c (expand_vector_operations_1): Check
> >>> optab exists before use it.
> >>>
> >>> gcc/testsuite/
> >>>
> >>> 2015-10-23  Ilya Enkovich  <enkovich@gmail.com>
> >>>
> >>> * g++.dg/pr68062.C: New test.
> >>
> >> OK.
> >>
> >> Just curious, what was the tree code for which we couldn't find a suitable
> >> optab?
> >
> > Those are various comparison codes.
> 
> Yeah, sorry for missing that check.  Btw, I was curious to see that we miss
> a way to query from optab_tag the "kind" (normal, conversion, etc.) so code
> can decide what optab_handler function to call (optab_handler or
> convert_optab_handler).  So the code I added errs on the "simplistic" side
> and hopes that matching lhs and rhs1 type always gets us a non-convert 
> optab...

So probably a better fix is

diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index d1fc0ba..73c5cc5 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -1533,7 +1533,8 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi)
   && TYPE_MODE (TREE_TYPE (type)) == TYPE_MODE (TREE_TYPE (srhs1)))
 {
   op = optab_for_tree_code (code, TREE_TYPE (type), optab_scalar);
-  if (optab_handler (op, TYPE_MODE (TREE_TYPE (type))) != CODE_FOR_nothing)
+  if (op >= FIRST_NORM_OPTAB && op <= LAST_NORM_OPTAB
+ && optab_handler (op, TYPE_MODE (TREE_TYPE (type))) != 
CODE_FOR_nothing)
{
  tree slhs = make_ssa_name (TREE_TYPE (srhs1));
  gimple *repl = gimple_build_assign (slhs, code, srhs1, srhs2);

?

Ilya

> 
> Richard.
> 
> > Ilya
> >
> >>
> >> jeff
> >>


Re: [Boolean Vector, patch 1/5] Introduce boolean vector to be used as a vector comparison type

2015-10-23 Thread Ilya Enkovich
On 23 Oct 11:40, Richard Biener wrote:
> On Thu, Oct 22, 2015 at 6:21 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
> > On 22 Oct 12:37, Andreas Schwab wrote:
> >> Ilya Enkovich <enkovich@gmail.com> writes:
> >>
> >> > 2015-10-22 13:13 GMT+03:00 Andreas Schwab <sch...@suse.de>:
> >> >> FAIL: gcc.c-torture/compile/pr54713-1.c   -O0  (internal compiler error)
> >> >
> >> > Can't reproduce it on i386. What's config used?
> >>
> >> http://gcc.gnu.org/ml/gcc-testresults/2015-10/msg02350.html
> >> http://gcc.gnu.org/ml/gcc-testresults/2015-10/msg02361.html
> >> http://gcc.gnu.org/ml/gcc-testresults/2015-10/msg02396.html
> >>
> >> Andreas.
> >>
> >> --
> >> Andreas Schwab, SUSE Labs, sch...@suse.de
> >> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> >> "And now for something completely different."
> >
> > Thanks!
> > The problem is in wrong mboolean vector size in case target cannot provide 
> > a mode for it.  I tested it on i386 with vector extension switched off, but 
> > with extensions off vector modes still exist, thus I missed this case.  
> > Here is a patch to fix it.  Bootstrapped and regtested on 
> > powerpc64le-unknown-linux-gnu.  I see disappeared fails:
> >
> > gcc.c-torture/compile/pr54713-2.c   -O0  (test for excess errors)
> > gcc.c-torture/compile/pr54713-3.c   -O0  (test for excess errors)
> >
> > I believe other targets should be fixed as well.
> >
> > Thanks,
> > Ilya
> > --
> > gcc/
> >
> > 2015-10-22  Ilya Enkovich  <enkovich@gmail.com>
> >
> > * tree.c (build_truth_vector_type): Support BLK mode
> > returned for boolean vector.
> >
> >
> > diff --git a/gcc/tree.c b/gcc/tree.c
> > index 7d10dd6..836b69a 100644
> > --- a/gcc/tree.c
> > +++ b/gcc/tree.c
> > @@ -10654,8 +10654,12 @@ build_truth_vector_type (unsigned nunits, unsigned 
> > vector_size)
> >
> >gcc_assert (mask_mode != VOIDmode);
> >
> > -  unsigned HOST_WIDE_INT esize = GET_MODE_BITSIZE (mask_mode) / nunits;
> > -  gcc_assert (esize * nunits == GET_MODE_BITSIZE (mask_mode));
> > +  unsigned HOST_WIDE_INT vsize = GET_MODE_BITSIZE (mask_mode);
> > +  if (!vsize)
> 
> This should better check for mask_mode == BLKmode instead?

Here is a version with a BLKmode check.  I bootstrapped it on 
powerpc64le-unknown-linux-gnu (c, c++, fortran only) and checked that 
pr54713-2.c and pr54713-3.c are fixed by this patch.  Is it OK for trunk?
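
A standalone sketch of the element size computation with the BLKmode fallback may help.  It models the mask mode with a flag plus a bit size, and TOY_BITS_PER_UNIT is assumed to be 8; the function and parameter names are invented and this is not the tree.c code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BITS_PER_UNIT 8

/* Bits per boolean element for a mask of NUNITS elements.
   If the mask mode is BLKmode (MASK_IS_BLK), fall back to the vector
   size in bytes; otherwise use the bit size of the mask mode.  */
unsigned
mask_element_bits (unsigned nunits, unsigned vector_size_bytes,
                   bool mask_is_blk, unsigned mask_mode_bits)
{
  unsigned vsize = mask_is_blk
                   ? vector_size_bytes * TOY_BITS_PER_UNIT
                   : mask_mode_bits;

  assert (nunits != 0);
  unsigned esize = vsize / nunits;
  /* The mask must divide evenly into its elements.  */
  assert (esize * nunits == vsize);
  return esize;
}
```

For example, four elements in a 16-byte vector give 32-bit mask elements whether the mask mode is BLKmode or a 128-bit vector mode, while an 8-bit scalar mask mode for eight elements gives 1-bit elements.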

Thanks,
Ilya
--
gcc/

2015-10-23  Ilya Enkovich  <enkovich@gmail.com>

PR middle-end/68066
* tree.c (build_truth_vector_type): Support BLK mode
returned for boolean vector.


diff --git a/gcc/tree.c b/gcc/tree.c
index 09df67e..79bbd07 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10671,8 +10671,14 @@ build_truth_vector_type (unsigned nunits, unsigned 
vector_size)
 
   gcc_assert (mask_mode != VOIDmode);
 
-  unsigned HOST_WIDE_INT esize = GET_MODE_BITSIZE (mask_mode) / nunits;
-  gcc_assert (esize * nunits == GET_MODE_BITSIZE (mask_mode));
+  unsigned HOST_WIDE_INT vsize;
+  if (mask_mode == BLKmode)
+vsize = vector_size * BITS_PER_UNIT;
+  else
+vsize = GET_MODE_BITSIZE (mask_mode);
+
+  unsigned HOST_WIDE_INT esize = vsize / nunits;
+  gcc_assert (esize * nunits == vsize);
 
   tree bool_type = build_nonstandard_boolean_type (esize);
 


Re: [mask-load, patch 1/2] Use boolean predicate for masked loads and store

2015-10-23 Thread Ilya Enkovich
2015-10-23 12:59 GMT+03:00 Richard Biener :
>
> ICK.  So what does the above do?  It basically preserves the boolean condition
> as "mask" unless ... we ought to swap it (formerly easy, just swap arguments
> of the cond_expr, now a bit harder, we need to invert the condition).  But I
> don't understand the 'negate' dance.  It looks like you want to have mask
> not be bool != 0 or bool == 1 but just bool in this case.  I suggest you
> rework this to do sth like

That's right, I want to avoid ==,!= comparisons with 0 and 1 by either
using compared SSA_NAME
or SSA_NAME != 0 (negate case).

>
>gimple_seq stmts = NULL;
>gcc_assert (types_compatible_p (TREE_TYPE (cond), boolean_type_node));

Is it really a valid assert?  Compiling a Fortran test with LTO I may have
a logical(kind=4) [aka 32-bit boolean] type for cond and a single-bit
_Bool for boolean_type_node.

>if (TREE_CODE (cond) == SSA_NAME)
>  ;
>else if (COMPARISON_CLASS_P (cond))
>  mask = gimple_build (&stmts, TREE_CODE (cond), boolean_type_node,
> TREE_OPERAND (cond, 0), TREE_OPERAND (cond, 1));
>else
>   gcc_unreachable ();
>if (swap)
>  mask = gimple_build (&stmts, BIT_XOR_EXPR, boolean_type_node,
> mask, boolean_true_node);
>gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT);
>
> which should do all of the above.

Thus we would get smth like

mask_1 = bool != 1
mask_2 = mask_1 XOR 1
_ifc_ = mask_2

instead of

_ifc_ = bool

Note that cond is built to be used as a condition in COND_EXPR
prepared for vectorization, i.e. it is always a comparison, thus
comparison with 0 and 1 is quite a common case.
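
The three-statement sequence above is logically just the original boolean, which a quick scalar check confirms.  This sketch only verifies the identity for a C bool; it says nothing about what gimple_build actually emits, and the function names are made up.

```c
#include <assert.h>
#include <stdbool.h>

/* The unsimplified sequence: compare with 1, then XOR with 1.  */
bool
unsimplified_mask (bool b)
{
  bool mask_1 = (b != 1);
  bool mask_2 = mask_1 ^ 1;
  return mask_2;
}

/* The form we would like to end up with.  */
bool
simplified_mask (bool b)
{
  return b;
}

/* Both forms agree for every boolean input.  */
void
check_mask_forms (void)
{
  assert (unsimplified_mask (false) == simplified_mask (false));
  assert (unsimplified_mask (true) == simplified_mask (true));
}
```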

Thanks,
Ilya

>
> The rest of the changes look good to me.
>
> Thanks,
> Richard.
>
>> /* Save mask and its size for further use.  */
>> vect_sizes.safe_push (bitsize);
>> vect_masks.safe_push (mask);
>> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
>> index 337ea7b..d6819ec 100644
>> --- a/gcc/tree-vect-stmts.c
>> +++ b/gcc/tree-vect-stmts.c
>> @@ -1800,6 +1800,7 @@ vectorizable_mask_load_store (gimple *stmt, 
>> gimple_stmt_iterator *gsi,
>>bool nested_in_vect_loop = nested_in_vect_loop_p (loop, stmt);
>>struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
>>tree vectype = STMT_VINFO_VECTYPE (stmt_info);
>> +  tree mask_vectype;
>>tree elem_type;
>>gimple *new_stmt;
>>tree dummy;
>> @@ -1827,8 +1828,8 @@ vectorizable_mask_load_store (gimple *stmt, 
>> gimple_stmt_iterator *gsi,
>>
>>is_store = gimple_call_internal_fn (stmt) == IFN_MASK_STORE;
>>mask = gimple_call_arg (stmt, 2);
>> -  if (TYPE_PRECISION (TREE_TYPE (mask))
>> -  != GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (vectype))))
>> +
>> +  if (TREE_CODE (TREE_TYPE (mask)) != BOOLEAN_TYPE)
>>  return false;
>>
>>/* FORNOW. This restriction should be relaxed.  */
>> @@ -1857,6 +1858,19 @@ vectorizable_mask_load_store (gimple *stmt, 
>> gimple_stmt_iterator *gsi,
>>if (STMT_VINFO_STRIDED_P (stmt_info))
>>  return false;
>>
>> +  if (TREE_CODE (mask) != SSA_NAME)
>> +return false;
>> +
>> +  if (!vect_is_simple_use_1 (mask, stmt, loop_vinfo, NULL,
>> +&def_stmt, &def, &dt, &mask_vectype))
>> +return false;
>> +
>> +  if (!mask_vectype)
>> +mask_vectype = get_mask_type_for_scalar_type (TREE_TYPE (vectype));
>> +
>> +  if (!mask_vectype)
>> +return false;
>> +
>>if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
>>  {
>>gimple *def_stmt;
>> @@ -1890,14 +1904,9 @@ vectorizable_mask_load_store (gimple *stmt, 
>> gimple_stmt_iterator *gsi,
>>  : DR_STEP (dr), size_zero_node) <= 0)
>>  return false;
>>else if (!VECTOR_MODE_P (TYPE_MODE (vectype))
>> -  || !can_vec_mask_load_store_p (TYPE_MODE (vectype), !is_store))
>> -return false;
>> -
>> -  if (TREE_CODE (mask) != SSA_NAME)
>> -return false;
>> -
>> -  if (!vect_is_simple_use (mask, stmt, loop_vinfo, NULL,
>> -  &def_stmt, &def, &dt))
>> +  || !can_vec_mask_load_store_p (TYPE_MODE (vectype),
>> + TYPE_MODE (mask_vectype),
>> + !is_store))
>>  return false;
>>
>>if (is_store)


Re: [mask-load, patch 1/2] Use boolean predicate for masked loads and store

2015-10-23 Thread Ilya Enkovich
2015-10-23 13:32 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Fri, Oct 23, 2015 at 12:23 PM, Ilya Enkovich <enkovich@gmail.com> 
> wrote:
>> 2015-10-23 12:59 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>
>>> ICK.  So what does the above do?  It basically preserves the boolean 
>>> condition
>>> as "mask" unless ... we ought to swap it (formerly easy, just swap arguments
>>> of the cond_expr, now a bit harder, we need to invert the condition).  But I
>>> don't understand the 'negate' dance.  It looks like you want to have mask
>>> not be bool != 0 or bool == 1 but just bool in this case.  I suggest you
>>> rework this to do sth like
>>
>> That's right, I want to avoid ==,!= comparisons with 0 and 1 by either
>> using compared SSA_NAME
>> or SSA_NAME != 0 (negate case).
>>
>>>
>>>gimple_seq stmts = NULL;
>>>gcc_assert (types_compatible_p (TREE_TYPE (cond), boolean_type_node));
>>
>> Is it really valid assert? Compiling fortran test with LTO I may have
>> logical(kind=4) [aka 32bit boolean]
>> type for cond and single bit _Bool for boolean_type_node.
>
> I put it there to make sure it is because otherwise the use of 
> boolean_type_node
> below needs adjustment (boolean_true_node as well).  TREE_TYPE (cond)
> would work and constant_boolean_node (true, TREE_TYPE (cond)) for
> boolean_type_node.
>
> Yes, you are right.
>
>>>if (TREE_CODE (cond) == SSA_NAME)
>>>  ;
>>>else if (COMPARISON_CLASS_P (cond))
>>>  mask = gimple_build (&stmts, TREE_CODE (cond), boolean_type_node,
>>> TREE_OPERAND (cond, 0), TREE_OPERAND (cond, 1));
>>>else
>>>   gcc_unreachable ();
>>>if (swap)
>>>  mask = gimple_build (&stmts, BIT_XOR_EXPR, boolean_type_node,
>>> mask, boolean_true_node);
>>>gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT);
>>>
>>> which should do all of the above.
>>
>> Thus we would get smth like
>>
>> mask_1 = bool != 1
>> mask_2 = mask_1 XOR 1
>> _ifc_ = mask_2
>
> No, we'd get
>
>   mask_1 = bool != 1;
>
> and the 'mask' variable should have been simplified to 'bool'
> (yes, we'd insert a dead stmt).  gimple_build simplifies
> stmts via the match-and-simplify machinery and match.pd
> knows how to invert conditions.
>

Thanks! I'll try it.

Ilya


[PATCH, committed] Fix uninitialized variable warning

2015-10-23 Thread Ilya Enkovich
Hi,

This patch fixes an uninitialized variable warning.  Applied to trunk.

Thanks,
Ilya
--
gcc/

2015-10-23  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-generic.c (expand_vector_condition): Avoid
uninitialized variable warning.


diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index 2005383..d1fc0ba 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -844,7 +844,7 @@ expand_vector_condition (gimple_stmt_iterator *gsi)
   tree type = gimple_expr_type (stmt);
   tree a = gimple_assign_rhs1 (stmt);
   tree a1 = a;
-  tree a2;
+  tree a2 = NULL_TREE;
   bool a_is_comparison = false;
   tree b = gimple_assign_rhs2 (stmt);
   tree c = gimple_assign_rhs3 (stmt);


[PATCH, PR68062] Fix uniform vector operation lowering

2015-10-23 Thread Ilya Enkovich
Hi,

This patch checks that the optab exists before using it in vector statement 
lowering.  It fixes compilation of the test from PR68062 with the 
-funsigned-char option added (it doesn't fix the original testcase).  
Bootstrapped for x86_64-unknown-linux-gnu.  OK for trunk if no regressions?

Thanks,
Ilya
--
gcc/

2015-10-23  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-generic.c (expand_vector_operations_1): Check
that the optab exists before using it.

gcc/testsuite/

2015-10-23  Ilya Enkovich  <enkovich@gmail.com>

* g++.dg/pr68062.C: New test.


diff --git a/gcc/testsuite/g++.dg/pr68062.C b/gcc/testsuite/g++.dg/pr68062.C
new file mode 100644
index 000..236a488
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr68062.C
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-funsigned-char" } */
+
+typedef char __attribute__ ((vector_size (4))) v4qi;
+typedef unsigned char __attribute__ ((vector_size (4))) uv4qi;
+
+v4qi v;
+void ret(char a)
+{
+  v4qi c={a,a,a,a};
+  uv4qi d={a,a,a,a};
+  v = (c!=d);
+}
diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index 2005383..9c59402 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -1533,7 +1533,8 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi)
   && TYPE_MODE (TREE_TYPE (type)) == TYPE_MODE (TREE_TYPE (srhs1)))
 {
   op = optab_for_tree_code (code, TREE_TYPE (type), optab_scalar);
-  if (optab_handler (op, TYPE_MODE (TREE_TYPE (type))) != CODE_FOR_nothing)
+  if (op != unknown_optab
+ && optab_handler (op, TYPE_MODE (TREE_TYPE (type))) != CODE_FOR_nothing)
{
  tree slhs = make_ssa_name (TREE_TYPE (srhs1));
  gimple *repl = gimple_build_assign (slhs, code, srhs1, srhs2);
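
The guard added by this patch follows a common pattern: a lookup may return a
"no such entry" sentinel, and that sentinel must be tested before the result is
handed to a second lookup.  A minimal stand-alone sketch follows.  All toy_*
names are hypothetical, and the assert in the toy handler only stands in for
whatever failure the real compiler hits when given an unknown optab.

```c
#include <assert.h>

/* Hypothetical stand-ins for the optab, insn code and tree code enums.  */
enum toy_optab { TOY_UNKNOWN_OPTAB, TOY_ADD_OPTAB, TOY_NUM_OPTABS };
enum toy_icode { TOY_CODE_FOR_NOTHING, TOY_CODE_FOR_ADD };
enum toy_tree_code { TOY_PLUS_EXPR, TOY_NE_EXPR };

/* First lookup: comparison codes have no scalar optab in this toy.  */
static enum toy_optab
toy_optab_for_code (enum toy_tree_code code)
{
  return code == TOY_PLUS_EXPR ? TOY_ADD_OPTAB : TOY_UNKNOWN_OPTAB;
}

/* Second lookup: only valid for a real optab.  */
static enum toy_icode
toy_optab_handler (enum toy_optab op)
{
  assert (op != TOY_UNKNOWN_OPTAB);
  return op == TOY_ADD_OPTAB ? TOY_CODE_FOR_ADD : TOY_CODE_FOR_NOTHING;
}

/* The guarded form: test the sentinel first, then query the handler.  */
static int
toy_can_lower_to_scalar (enum toy_tree_code code)
{
  enum toy_optab op = toy_optab_for_code (code);
  return op != TOY_UNKNOWN_OPTAB
         && toy_optab_handler (op) != TOY_CODE_FOR_NOTHING;
}
```

Because && short-circuits, the handler is never reached with the sentinel.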


Re: [PATCH, PR68062] Fix uniform vector operation lowering

2015-10-23 Thread Ilya Enkovich
2015-10-24 0:32 GMT+03:00 Jeff Law <l...@redhat.com>:
> On 10/23/2015 09:26 AM, Ilya Enkovich wrote:
>>
>> Hi,
>>
>> This patch checks optab exists before using it vector vector statement
>> lowering.  It fixes compilation of test from PR68062 with -funsigned-char
>> option added (doesn't fix original testcase).  Bootstrapped for
>> x86_64-unknown-linux-gnu.  OK for trunk if no regressions?
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2015-10-23  Ilya Enkovich  <enkovich@gmail.com>
>>
>> * tree-vect-generic.c (expand_vector_operations_1): Check
>> optab exists before use it.
>>
>> gcc/testsuite/
>>
>> 2015-10-23  Ilya Enkovich  <enkovich@gmail.com>
>>
>> * g++.dg/pr68062.C: New test.
>
> OK.
>
> Just curious, what was the tree code for which we couldn't find a suitable
> optab?

Those are various comparison codes.

Ilya

>
> jeff
>


Re: [vec-cmp, patch 1/6] Add optabs for vector comparison

2015-10-22 Thread Ilya Enkovich
2015-10-21 20:25 GMT+03:00 Jeff Law <l...@redhat.com>:
> On 10/08/2015 08:52 AM, Ilya Enkovich wrote:
>>
>> Hi,
>>
>> This series introduces autogeneration of vector comparison and its support
>> on i386 target.  It lets comparison statements to be vectorized into vector
>> comparison instead of VEC_COND_EXPR.  This allows to avoid some restrictions
>> implied by boolean patterns.  This series applies on top of bolean vectors
>> series [1].
>>
>> This patch introduces optabs for vector comparison.
>>
>> [1] https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00215.html
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2015-10-08  Ilya Enkovich  <enkovich@gmail.com>
>>
>> * expr.c (do_store_flag): Use expand_vec_cmp_expr for mask
>> results.
>> * optabs-query.h (get_vec_cmp_icode): New.
>> * optabs-tree.c (expand_vec_cmp_expr_p): New.
>> * optabs-tree.h (expand_vec_cmp_expr_p): New.
>> * optabs.c (vector_compare_rtx): Add OPNO arg.
>> (expand_vec_cond_expr): Adjust to vector_compare_rtx change.
>> (expand_vec_cmp_expr): New.
>> * optabs.def (vec_cmp_optab): New.
>> (vec_cmpu_optab): New.
>> * optabs.h (expand_vec_cmp_expr): New.
>> * tree-vect-generic.c (expand_vector_comparison): Add vector
>> comparison optabs check.
>>
>>
>> diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
>> index 3b03338..aa863cf 100644
>> --- a/gcc/optabs-tree.c
>> +++ b/gcc/optabs-tree.c
>> @@ -320,6 +320,19 @@ supportable_convert_operation (enum tree_code code,
>> return false;
>>   }
>>
>> +/* Return TRUE if appropriate vector insn is available
>> +   for vector comparison expr with vector type VALUE_TYPE
>> +   and resulting mask with MASK_TYPE.  */
>> +
>> +bool
>> +expand_vec_cmp_expr_p (tree value_type, tree mask_type)
>> +{
>> +  enum insn_code icode = get_vec_cmp_icode (TYPE_MODE (value_type),
>> +   TYPE_MODE (mask_type),
>> +   TYPE_UNSIGNED (value_type));
>> +  return (icode != CODE_FOR_nothing);
>> +}
>> +
>
> Nothing inherently wrong with the code, but it seems like it's in the wrong
> place.  Why optabs-tree rather than optabs-query?

Because it uses tree type for arguments. There is no single tree usage
in optabs-query.c. I think expand_vec_cond_expr_p is in optabs-tree
for the same reason.

Thanks,
Ilya

>
> I think with that fixed this patch will be ready to go onto the trunk.
>
> jeff
>


Re: [Boolean Vector, patch 1/5] Introduce boolean vector to be used as a vector comparison type

2015-10-22 Thread Ilya Enkovich
2015-10-22 13:13 GMT+03:00 Andreas Schwab <sch...@suse.de>:
> FAIL: gcc.c-torture/compile/pr54713-1.c   -O0  (internal compiler error)

Can't reproduce it on i386. What config was used?

Ilya


Re: [vec-cmp, patch 5/6] Disable bool patterns when possible

2015-10-22 Thread Ilya Enkovich
On 21 Oct 11:45, Jeff Law wrote:
> On 10/08/2015 09:15 AM, Ilya Enkovich wrote:
> >Hi,
> >
> >This patch disables transformation of boolean computations into integer ones 
> >in case target supports vector comparison.  Pattern still applies to 
> >transform resulting boolean value into integer or avoid COND_EXPR with 
> >SSA_NAME as condition.
> >
> >Thanks,
> >Ilya
> >--
> >2015-10-08  Ilya Enkovich  <enkovich@gmail.com>
> >
> > * tree-vect-patterns.c (check_bool_pattern): Check fails
> > if we can vectorize comparison directly.
> > (search_type_for_mask): New.
> > (vect_recog_bool_pattern): Support cases when bool pattern
> > check fails.
> >
> >
> >diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> >index 830801a..e3be3d1 100644
> >--- a/gcc/tree-vect-patterns.c
> >+++ b/gcc/tree-vect-patterns.c
> >@@ -2962,6 +2962,11 @@ check_bool_pattern (tree var, loop_vec_info 
> >loop_vinfo, bb_vec_info bb_vinfo)
> >   if (comp_vectype == NULL_TREE)
> > return false;
> >
> >+  mask_type = get_mask_type_for_scalar_type (TREE_TYPE (rhs1));
> >+  if (mask_type
> >+  && expand_vec_cmp_expr_p (comp_vectype, mask_type))
> >+return false;
> >+
> >   if (TREE_CODE (TREE_TYPE (rhs1)) != INTEGER_TYPE)
> > {
> >   machine_mode mode = TYPE_MODE (TREE_TYPE (rhs1));
> So we're essentially saying here that we've got another preferred method for
> optimizing this case, right?
> 
> Can you update the function comment for check_bool_pattern?  In particular
> change the "if bool VAR can ..." to "can and should".
> 
> I think that more clearly states the updated purpose of that code.
> 
> 
> 
> 
> >@@ -3186,6 +3191,75 @@ adjust_bool_pattern (tree var, tree out_type, tree 
> >trueval,
> >  }
> >
> >
> >+/* Try to determine a proper type for converting bool VAR
> >+   into an integer value.  The type is chosen so that
> >+   conversion has the same number of elements as a mask
> >+   producer.  */
> >+
> >+static tree
> >+search_type_for_mask (tree var, loop_vec_info loop_vinfo, bb_vec_info 
> >bb_vinfo)
> What is the return value here?  Presumably the type or NULL.
> 
> So instead of "Try to determine a proper type" how about
> "Return the proper type or NULL_TREE if no such type exists ..."?
> 
> Please change the references to NULL to instead use NULL_TREE in that
> function as well.  They're functionally equivalent, but the latter is
> considered more correct these days.
> 
> 
> 
> >+{
> >+  tree type = search_type_for_mask (var, loop_vinfo, bb_vinfo);
> >+  tree cst0, cst1, cmp, tmp;
> >+
> >+  if (!type)
> >+return NULL;
> >+
> >+  /* We may directly use cond with narrowed type to avoid
> >+ multiple cond exprs with following result packing and
> >+ perform single cond with packed mask intead.  In case
> s/intead/instead/
> 
> With those changes above, this should be OK for the trunk.
> 
> jeff
> 

Thanks for the review!  Here is an updated version with all the mentioned issues fixed.

Thanks,
Ilya
--
2015-09-12  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-patterns.c (check_bool_pattern): Check fails
if we can vectorize comparison directly.
(search_type_for_mask): New.
(vect_recog_bool_pattern): Support cases when bool pattern
check fails.


diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 3fe094c..516034d 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -2879,7 +2879,9 @@ vect_recog_mixed_size_cond_pattern (vec<gimple *> *stmts, tree *type_in,
 
 
 /* Helper function of vect_recog_bool_pattern.  Called recursively, return
-   true if bool VAR can be optimized that way.  */
+   true if bool VAR can and should be optimized that way.  Assume it shouldn't
+   in case it's a result of a comparison which can be directly vectorized into
+   a vector comparison.  */
 
 static bool
 check_bool_pattern (tree var, vec_info *vinfo)
@@ -2928,7 +2930,7 @@ check_bool_pattern (tree var, vec_info *vinfo)
 default:
   if (TREE_CODE_CLASS (rhs_code) == tcc_comparison)
{
- tree vecitype, comp_vectype;
+ tree vecitype, comp_vectype, mask_type;
 
  /* If the comparison can throw, then is_gimple_condexpr will be
 false and we can't make a COND_EXPR/VEC_COND_EXPR out of it.  */
@@ -2939,6 +2941,11 @@ check_bool_pattern (tree var, vec_info *vinfo)
  if (comp_vectype == NULL_TR

Re: [vec-cmp, patch 1/6] Add optabs for vector comparison

2015-10-22 Thread Ilya Enkovich
2015-10-22 18:52 GMT+03:00 Jeff Law <l...@redhat.com>:
> On 10/22/2015 04:35 AM, Ilya Enkovich wrote:
>>
>> 2015-10-21 20:25 GMT+03:00 Jeff Law <l...@redhat.com>:
>>>
>>> On 10/08/2015 08:52 AM, Ilya Enkovich wrote:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> This series introduces autogeneration of vector comparison and its
>>>> support
>>>> on i386 target.  It lets comparison statements to be vectorized into
>>>> vector
>>>> comparison instead of VEC_COND_EXPR.  This allows to avoid some
>>>> restrictions
>>>> implied by boolean patterns.  This series applies on top of bolean
>>>> vectors
>>>> series [1].
>>>>
>>>> This patch introduces optabs for vector comparison.
>>>>
>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00215.html
>>>>
>>>> Thanks,
>>>> Ilya
>>>> --
>>>> gcc/
>>>>
>>>> 2015-10-08  Ilya Enkovich  <enkovich@gmail.com>
>>>>
>>>>  * expr.c (do_store_flag): Use expand_vec_cmp_expr for mask
>>>> results.
>>>>  * optabs-query.h (get_vec_cmp_icode): New.
>>>>  * optabs-tree.c (expand_vec_cmp_expr_p): New.
>>>>  * optabs-tree.h (expand_vec_cmp_expr_p): New.
>>>>  * optabs.c (vector_compare_rtx): Add OPNO arg.
>>>>  (expand_vec_cond_expr): Adjust to vector_compare_rtx change.
>>>>  (expand_vec_cmp_expr): New.
>>>>  * optabs.def (vec_cmp_optab): New.
>>>>  (vec_cmpu_optab): New.
>>>>  * optabs.h (expand_vec_cmp_expr): New.
>>>>  * tree-vect-generic.c (expand_vector_comparison): Add vector
>>>>  comparison optabs check.
>>>>
>>>>
>>>> diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
>>>> index 3b03338..aa863cf 100644
>>>> --- a/gcc/optabs-tree.c
>>>> +++ b/gcc/optabs-tree.c
>>>> @@ -320,6 +320,19 @@ supportable_convert_operation (enum tree_code code,
>>>>  return false;
>>>>}
>>>>
>>>> +/* Return TRUE if appropriate vector insn is available
>>>> +   for vector comparison expr with vector type VALUE_TYPE
>>>> +   and resulting mask with MASK_TYPE.  */
>>>> +
>>>> +bool
>>>> +expand_vec_cmp_expr_p (tree value_type, tree mask_type)
>>>> +{
>>>> +  enum insn_code icode = get_vec_cmp_icode (TYPE_MODE (value_type),
>>>> +   TYPE_MODE (mask_type),
>>>> +   TYPE_UNSIGNED (value_type));
>>>> +  return (icode != CODE_FOR_nothing);
>>>> +}
>>>> +
>>>
>>>
>>> Nothing inherently wrong with the code, but it seems like it's in the
>>> wrong
>>> place.  Why optabs-tree rather than optabs-query?
>>
>>
>> Because it uses tree type for arguments. There is no single tree usage
>> in optabs-query.c. I think expand_vec_cond_expr_p is in optabs-tree
>> for the same reason.
>
> Note that expand_vec_cond_expr_p doesn't rely on enum insn code.  Well, it
> relies on enum insn code being an integer and CODE_FOR_nothing always having
> the value zero, which is probably worse.

Actually, it also compares against CODE_FOR_nothing:

  || get_vcond_icode (TYPE_MODE (value_type), TYPE_MODE (cmp_op_type),
  TYPE_UNSIGNED (cmp_op_type)) == CODE_FOR_nothing)

There are also two other instances of CODE_FOR_nothing in
optabs-tree.c. Do you want to get rid of "#include insn-codes.h" in
optabs-tree.c? Would it really be better if we replaced it with
"#include optabs-query.h"?

Thanks,
Ilya

>
> We should clean both of these up so that:
>
>   1. We don't need enum insn_code in optabs-tree
>   2. We don't implicitly rely on CODE_FOR_nothing == 0
>
> It may be as simple as a adding a predicate function to optabs-query that
> returns true/false if there's a suitable icode, then using that predicate in
> optabs-tree.
>
> jeff
>
>
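
The cleanup suggested in the quoted text (a yes/no predicate on the query side,
so the tree side needs neither the insn code enum nor the value of its
sentinel) could look roughly like this.  It is a stand-alone sketch: every
toy_* name and the "modes must match" rule are invented for illustration and
are not the real optabs-query or optabs-tree interfaces.

```c
#include <stdbool.h>

/* "Query" side: the only code that knows about instruction codes.
   The sentinel is deliberately not zero here, to show that callers
   do not depend on its value.  */
enum toy_icode { TOY_CODE_FOR_NOTHING = 7, TOY_CODE_FOR_VEC_CMP = 8 };

static enum toy_icode
toy_get_vec_cmp_icode (int value_mode, int mask_mode, bool uns)
{
  (void) uns;
  /* Toy rule: an insn exists only when both modes match.  */
  return value_mode == mask_mode ? TOY_CODE_FOR_VEC_CMP
                                 : TOY_CODE_FOR_NOTHING;
}

/* Predicate exported by the query side.  */
static bool
toy_have_vec_cmp_insn (int value_mode, int mask_mode, bool uns)
{
  return toy_get_vec_cmp_icode (value_mode, mask_mode, uns)
         != TOY_CODE_FOR_NOTHING;
}

/* "Tree" side: works on types and only sees a yes/no answer.  */
struct toy_type { int mode; bool is_unsigned; };

static bool
toy_expand_vec_cmp_expr_p (const struct toy_type *value_type,
                           const struct toy_type *mask_type)
{
  return toy_have_vec_cmp_insn (value_type->mode, mask_type->mode,
                                value_type->is_unsigned);
}
```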


Re: [Boolean Vector, patch 1/5] Introduce boolean vector to be used as a vector comparison type

2015-10-22 Thread Ilya Enkovich
On 22 Oct 12:37, Andreas Schwab wrote:
> Ilya Enkovich <enkovich@gmail.com> writes:
> 
> > 2015-10-22 13:13 GMT+03:00 Andreas Schwab <sch...@suse.de>:
> >> FAIL: gcc.c-torture/compile/pr54713-1.c   -O0  (internal compiler error)
> >
> > Can't reproduce it on i386. What's config used?
> 
> http://gcc.gnu.org/ml/gcc-testresults/2015-10/msg02350.html
> http://gcc.gnu.org/ml/gcc-testresults/2015-10/msg02361.html
> http://gcc.gnu.org/ml/gcc-testresults/2015-10/msg02396.html
> 
> Andreas.
> 
> -- 
> Andreas Schwab, SUSE Labs, sch...@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."

Thanks!
The problem is a wrong boolean vector size when the target cannot provide a 
mode for it.  I tested on i386 with vector extensions switched off, but vector 
modes still exist with extensions off, so I missed this case.  Here is a 
patch to fix it.  Bootstrapped and regtested on powerpc64le-unknown-linux-gnu.  
These failures no longer appear:

gcc.c-torture/compile/pr54713-2.c   -O0  (test for excess errors)
gcc.c-torture/compile/pr54713-3.c   -O0  (test for excess errors)

I believe other targets should be fixed as well.

Thanks,
Ilya
--
gcc/

2015-10-22  Ilya Enkovich  <enkovich@gmail.com>

* tree.c (build_truth_vector_type): Support BLK mode
returned for boolean vector.


diff --git a/gcc/tree.c b/gcc/tree.c
index 7d10dd6..836b69a 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10654,8 +10654,12 @@ build_truth_vector_type (unsigned nunits, unsigned vector_size)
 
   gcc_assert (mask_mode != VOIDmode);
 
-  unsigned HOST_WIDE_INT esize = GET_MODE_BITSIZE (mask_mode) / nunits;
-  gcc_assert (esize * nunits == GET_MODE_BITSIZE (mask_mode));
+  unsigned HOST_WIDE_INT vsize = GET_MODE_BITSIZE (mask_mode);
+  if (!vsize)
+vsize = vector_size * BITS_PER_UNIT;
+
+  unsigned HOST_WIDE_INT esize = vsize / nunits;
+  gcc_assert (esize * nunits == vsize);
 
   tree bool_type = build_nonstandard_boolean_type (esize);
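
The arithmetic of the fix can be tried in isolation.  The sketch below mirrors
the patched lines on plain integers; toy_mask_element_bits and
TOY_BITS_PER_UNIT are hypothetical names, and a mode bit size of 0 is used to
model the "no real mode" case the patch handles.

```c
#include <assert.h>

#define TOY_BITS_PER_UNIT 8

/* Return the bit size of one boolean vector element.
   MODE_BITSIZE is the size reported for the mask mode, or 0 when the
   target provides no mode; VECTOR_SIZE_BYTES is the fallback size.  */
static unsigned long
toy_mask_element_bits (unsigned long mode_bitsize, unsigned nunits,
                       unsigned vector_size_bytes)
{
  unsigned long vsize = mode_bitsize;
  if (!vsize)
    vsize = (unsigned long) vector_size_bytes * TOY_BITS_PER_UNIT;

  assert (nunits != 0);
  unsigned long esize = vsize / nunits;
  /* The vector must split evenly into NUNITS elements.  */
  assert (esize * nunits == vsize);
  return esize;
}
```

With a 128-bit mode and 4 units this gives 32-bit elements; with no mode and a
16-byte vector size the fallback gives the same answer instead of 0.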
 


Re: [PATCH] Fix ICE for SIMD clones usage in LTO

2015-10-21 Thread Ilya Enkovich
Ping

2015-10-05 19:13 GMT+03:00 Ilya Enkovich <enkovich@gmail.com>:
> Hi,
>
> When SIMD clone is created original function may be defined in another 
> partition.  In this case SIMD clone also has to have in_other_partition flag. 
>  Now it doesn't and we get an ICE.  This patch fixes it.  Bootstrapped and 
> regtested for x86_64-unknown-linux-gnu.  OK for trunk?
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2015-10-05  Ilya Enkovich  <enkovich@gmail.com>
>
> * omp-low.c (simd_clone_create): Set in_other_partition
> for created clones.
>
> gcc/testsuite/
>
> 2015-10-05  Ilya Enkovich  <enkovich@gmail.com>
>
> * gcc.dg/lto/simd-function_0.c: New test.
>
>
> diff --git a/gcc/omp-low.c b/gcc/omp-low.c
> index cdcf9d6..8d25784 100644
> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -12948,6 +12948,8 @@ simd_clone_create (struct cgraph_node *old_node)
>DECL_STATIC_CONSTRUCTOR (new_decl) = 0;
>DECL_STATIC_DESTRUCTOR (new_decl) = 0;
>new_node = old_node->create_version_clone (new_decl, vNULL, NULL);
> +  if (old_node->in_other_partition)
> +   new_node->in_other_partition = 1;
>symtab->call_cgraph_insertion_hooks (new_node);
>  }
>if (new_node == NULL)
> diff --git a/gcc/testsuite/gcc.dg/lto/simd-function_0.c 
> b/gcc/testsuite/gcc.dg/lto/simd-function_0.c
> new file mode 100755
> index 000..cda31aa
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/lto/simd-function_0.c
> @@ -0,0 +1,34 @@
> +/* { dg-lto-do link } */
> +/* { dg-require-effective-target avx2 } */
> +/* { dg-lto-options { { -fopenmp-simd -O3 -ffast-math -mavx2 -flto 
> -flto-partition=max } } } */
> +
> +#define SIZE 4096
> +float x[SIZE];
> +
> +
> +#pragma omp declare simd
> +float
> +__attribute__ ((noinline))
> +my_mul (float x, float y) {
> +  return x * y;
> +}
> +
> +__attribute__ ((noinline))
> +int foo ()
> +{
> +  int i = 0;
> +#pragma omp simd safelen (16)
> +  for (i = 0; i < SIZE; i++)
> +x[i] = my_mul ((float)i, 9932.3323);
> +  return (int)x[0];
> +}
> +
> +int main ()
> +{
> +  int i = 0;
> +  for (i = 0; i < SIZE; i++)
> +x[i] = my_mul ((float) i, 9932.3323);
> +  foo ();
> +  return (int)x[0];
> +}
> +


Re: [mask-vec_cond, patch 3/2] SLP support

2015-10-20 Thread Ilya Enkovich
2015-10-19 19:05 GMT+03:00 Jeff Law <l...@redhat.com>:
> On 10/19/2015 05:21 AM, Ilya Enkovich wrote:
>>
>> Hi,
>>
>> This patch adds missing support for cond_expr with no embedded comparison
>> in SLP.  No new test added because vec cmp SLP test becomes (due to changes
>> in bool patterns by the first patch) a regression test for this patch.  Does
>> it look OK?
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2015-10-19  Ilya Enkovich  <enkovich@gmail.com>
>>
>> * tree-vect-slp.c (vect_get_and_check_slp_defs): Allow
>> cond_exp with no embedded comparison.
>> (vect_build_slp_tree_1): Likewise.
>
> Is it even valid gimple to have a COND_EXPR that is anything other than a
> conditional?
>
> From looking at gimplify_cond_expr, it looks like we could have a SSA_NAME
> that's a bool as the conditional.  Presumably we're allowing a vector of
> bools as the conditional once we hit the vectorizer, which seems fairly
> natural.

Currently the vectorizer doesn't handle such a COND_EXPR and never produces
a VEC_COND_EXPR with an SSA_NAME as its condition. Expand treats such a
VEC_COND_EXPR as an implicit (OP < 0) case (we simply have no optab to
expand it without a comparison). But the first patch in this series[1]
allows such conditions so that a vector comparison result can be reused by
multiple VEC_COND_EXPRs.

>
> OK.  Please install when the prerequisites are installed.
>
> Thanks,
> jeff
>

Thanks!
Ilya

[1] https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00862.html
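
To illustrate the reuse described above: the comparison is computed once into
a mask, and several selects then consume that mask, each treating an element
as true when it is negative (the implicit OP < 0 case).  This is a scalar
model over small arrays, not vectorizer code; the toy_* names and TOY_N are
assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_N 4

/* Vector comparison: all-ones (-1) where a < b, otherwise 0.  */
static void
toy_vec_cmp_lt (const int32_t *a, const int32_t *b, int32_t *mask)
{
  for (size_t i = 0; i < TOY_N; i++)
    mask[i] = a[i] < b[i] ? -1 : 0;
}

/* Select with a precomputed mask; no comparison is embedded here,
   the condition is simply "mask element < 0".  */
static void
toy_vec_cond (const int32_t *mask, const int32_t *if_true,
              const int32_t *if_false, int32_t *out)
{
  for (size_t i = 0; i < TOY_N; i++)
    out[i] = mask[i] < 0 ? if_true[i] : if_false[i];
}
```

Calling toy_vec_cmp_lt once and passing the same mask to two toy_vec_cond
calls (for example to get both the element-wise minimum and maximum) avoids
repeating the comparison.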


[vec-cmp, patch 7/6] Vector comparison enabling in SLP

2015-10-19 Thread Ilya Enkovich
Hi,

It turned out that our testsuite doesn't have a test which requires vector 
comparison support in SLP even after boolean patterns are disabled.  This 
patch adds such a test and allows comparisons in SLP.  Is it OK?

Thanks,
Ilya
--
gcc/

2015-10-19  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-slp.c (vect_build_slp_tree_1): Allow
comparison statements.
(vect_get_constant_vectors): Support boolean vector
constants.

gcc/testsuite/

2015-10-19  Ilya Enkovich  <enkovich@gmail.com>

* gcc.dg/vect/slp-cond-5.c: New test.

diff --git a/gcc/testsuite/gcc.dg/vect/slp-cond-5.c 
b/gcc/testsuite/gcc.dg/vect/slp-cond-5.c
new file mode 100644
index 000..5ade7d1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/slp-cond-5.c
@@ -0,0 +1,81 @@
+/* { dg-require-effective-target vect_condition } */
+
+#include "tree-vect.h"
+
+#define N 128
+
+static inline int
+foo (int x, int y, int a, int b)
+{
+  if (x >= y && a > b)
+return a;
+  else
+return b;
+}
+
+__attribute__((noinline, noclone)) void
+bar (int * __restrict__ a, int * __restrict__ b,
+ int * __restrict__ c, int * __restrict__ d,
+ int * __restrict__ e, int w)
+{
+  int i;
+  for (i = 0; i < N/16; i++, a += 16, b += 16, c += 16, d += 16, e += 16)
+{
+  e[0] = foo (c[0], d[0], a[0] * w, b[0] * w);
+  e[1] = foo (c[1], d[1], a[1] * w, b[1] * w);
+  e[2] = foo (c[2], d[2], a[2] * w, b[2] * w);
+  e[3] = foo (c[3], d[3], a[3] * w, b[3] * w);
+  e[4] = foo (c[4], d[4], a[4] * w, b[4] * w);
+  e[5] = foo (c[5], d[5], a[5] * w, b[5] * w);
+  e[6] = foo (c[6], d[6], a[6] * w, b[6] * w);
+  e[7] = foo (c[7], d[7], a[7] * w, b[7] * w);
+  e[8] = foo (c[8], d[8], a[8] * w, b[8] * w);
+  e[9] = foo (c[9], d[9], a[9] * w, b[9] * w);
+  e[10] = foo (c[10], d[10], a[10] * w, b[10] * w);
+  e[11] = foo (c[11], d[11], a[11] * w, b[11] * w);
+  e[12] = foo (c[12], d[12], a[12] * w, b[12] * w);
+  e[13] = foo (c[13], d[13], a[13] * w, b[13] * w);
+  e[14] = foo (c[14], d[14], a[14] * w, b[14] * w);
+  e[15] = foo (c[15], d[15], a[15] * w, b[15] * w);
+}
+}
+
+
+int a[N], b[N], c[N], d[N], e[N];
+
+int main ()
+{
+  int i;
+
+  check_vect ();
+
+  for (i = 0; i < N; i++)
+{
+  a[i] = i;
+  b[i] = 5;
+  e[i] = 0;
+
+  switch (i % 9)
+{
+case 0: asm (""); c[i] = i; d[i] = i + 1; break;
+case 1: c[i] = 0; d[i] = 0; break;
+case 2: c[i] = i + 1; d[i] = i - 1; break;
+case 3: c[i] = i; d[i] = i + 7; break;
+case 4: c[i] = i; d[i] = i; break;
+case 5: c[i] = i + 16; d[i] = i + 3; break;
+case 6: c[i] = i - 5; d[i] = i; break;
+case 7: c[i] = i; d[i] = i; break;
+case 8: c[i] = i; d[i] = i - 7; break;
+}
+}
+
+  bar (a, b, c, d, e, 2);
+  for (i = 0; i < N; i++)
+if (e[i] != ((i % 3) == 0 || i <= 5 ? 10 : 2 * i))
+  abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { i?86-*-* x86_64-*-* } } } } */
+
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 1424123..fa8291e 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -827,6 +827,7 @@ vect_build_slp_tree_1 (vec_info *vinfo,
  if (TREE_CODE_CLASS (rhs_code) != tcc_binary
  && TREE_CODE_CLASS (rhs_code) != tcc_unary
  && TREE_CODE_CLASS (rhs_code) != tcc_expression
+ && TREE_CODE_CLASS (rhs_code) != tcc_comparison
  && rhs_code != CALL_EXPR)
{
  if (dump_enabled_p ())
@@ -2596,7 +2597,14 @@ vect_get_constant_vectors (tree op, slp_tree slp_node,
   struct loop *loop;
   gimple_seq ctor_seq = NULL;
 
-  vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
+  /* Check if vector type is a boolean vector.  */
+  if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE
+  && (VECTOR_BOOLEAN_TYPE_P (STMT_VINFO_VECTYPE (stmt_vinfo))
+ || (code == COND_EXPR && op_num < 2)))
+vector_type
+  = build_same_sized_truth_vector_type (STMT_VINFO_VECTYPE (stmt_vinfo));
+  else
+vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
   nunits = TYPE_VECTOR_SUBPARTS (vector_type);
 
   if (STMT_VINFO_DEF_TYPE (stmt_vinfo) == vect_reduction_def
@@ -2768,8 +2776,21 @@ vect_get_constant_vectors (tree op, slp_tree slp_node,
{
  if (CONSTANT_CLASS_P (op))
{
- op = fold_unary (VIEW_CONVERT_EXPR,
-  TREE_TYPE (vector_type), op);
+ if (VECTOR_BOOLEAN_TYPE_P (vector_type))
+   {
+ /* Can't use VIEW_CONVERT_EXPR for booleans because
+of possibly different sizes of scalar value and
+vector element.  */
+   

[mask conversion, patch 1/2] Add pattern for mask conversions

2015-10-19 Thread Ilya Enkovich
Hi,

This patch adds a vectorization pattern which detects cases where a mask 
conversion is needed and inserts it.  This is done for all statements which 
may consume a mask.  Some additional changes were made to support MASK_LOAD 
in a pattern and to allow a scalar mode for the vectype of a pattern stmt.  
It applies on top of all the other boolean vector series.  Does it look OK?

Thanks,
Ilya
--
gcc/

2015-10-19  Ilya Enkovich  <enkovich@gmail.com>

* optabs.c (expand_binop_directly): Allow scalar mode for
vec_pack_trunc_optab.
* tree-vect-loop.c (vect_determine_vectorization_factor): Skip
boolean vector producers from pattern sequence when computing VF.
* tree-vect-patterns.c (vect_vect_recog_func_ptrs): Add
vect_recog_mask_conversion_pattern.
(search_type_for_mask): Choose the smallest
type if different size types are mixed.
(build_mask_conversion): New.
(vect_recog_mask_conversion_pattern): New.
(vect_pattern_recog_1): Allow scalar mode for boolean vectype.
* tree-vect-stmts.c (vectorizable_mask_load_store): Support masked
load with pattern.
(vectorizable_conversion): Support boolean vectors.
(free_stmt_vec_info): Allow patterns for statements with no lhs.
* tree-vectorizer.h (NUM_PATTERNS): Increase to 14.


diff --git a/gcc/optabs.c b/gcc/optabs.c
index 83f4be3..8d61d33 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -1055,7 +1055,8 @@ expand_binop_directly (machine_mode mode, optab binoptab,
   /* The mode of the result is different then the mode of the
 arguments.  */
   tmp_mode = insn_data[(int) icode].operand[0].mode;
-  if (GET_MODE_NUNITS (tmp_mode) != 2 * GET_MODE_NUNITS (mode))
+  if (VECTOR_MODE_P (mode)
+ && GET_MODE_NUNITS (tmp_mode) != 2 * GET_MODE_NUNITS (mode))
{
  delete_insns_since (last);
  return NULL_RTX;
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 14804b3..e388533 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -497,6 +497,17 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
}
 }
 
+ /* Boolean vectors don't affect VF.  */
+ if (VECTOR_BOOLEAN_TYPE_P (vectype))
+   {
+ if (!analyze_pattern_stmt && gsi_end_p (pattern_def_si))
+   {
+ pattern_def_seq = NULL;
+ gsi_next ();
+   }
+ continue;
+   }
+
  /* The vectorization factor is according to the smallest
 scalar type (or the largest vector size, but we only
 support one vector size per loop).  */
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index a737129..34b1ea6 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -76,6 +76,7 @@ static gimple *vect_recog_mult_pattern (vec<gimple *> *,
 static gimple *vect_recog_mixed_size_cond_pattern (vec<gimple *> *,
  tree *, tree *);
 static gimple *vect_recog_bool_pattern (vec<gimple *> *, tree *, tree *);
+static gimple *vect_recog_mask_conversion_pattern (vec<gimple *> *, tree *, tree *);
 static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
vect_recog_widen_mult_pattern,
vect_recog_widen_sum_pattern,
@@ -89,7 +90,8 @@ static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
vect_recog_divmod_pattern,
vect_recog_mult_pattern,
vect_recog_mixed_size_cond_pattern,
-   vect_recog_bool_pattern};
+   vect_recog_bool_pattern,
+   vect_recog_mask_conversion_pattern};
 
 static inline void
 append_pattern_def_seq (stmt_vec_info stmt_info, gimple *stmt)
@@ -3180,7 +3182,7 @@ search_type_for_mask (tree var, vec_info *vinfo)
   enum vect_def_type dt;
   tree rhs1;
   enum tree_code rhs_code;
-  tree res = NULL;
+  tree res = NULL, res2;
 
   if (TREE_CODE (var) != SSA_NAME)
 return NULL;
@@ -3213,13 +3215,26 @@ search_type_for_mask (tree var, vec_info *vinfo)
 case BIT_AND_EXPR:
 case BIT_IOR_EXPR:
 case BIT_XOR_EXPR:
-  if (!(res = search_type_for_mask (rhs1, vinfo)))
-   res = search_type_for_mask (gimple_assign_rhs2 (def_stmt), vinfo);
+  res = search_type_for_mask (rhs1, vinfo);
+  res2 = search_type_for_mask (gimple_assign_rhs2 (def_stmt), vinfo);
+  if (!res || (res2 && TYPE_PRECISION (res) > TYPE_PRECISION (res2)))
+   res = res2;
   break;
 
 default:
   if (TREE_CODE_CLASS (rhs_code) == tcc_comparison)
{
+ tree comp_vectype, mask_type;
+
+ comp_vectype = get_vectype_for_scalar_type (TREE_TYPE (rhs1));
+ if (comp_vectype == NULL_TREE)
+   return NULL;
+
+ mask_type = get_mask_type_for_scalar_type (TREE_TYPE (rhs1));
+ if (!mask_type
+ || !expand_vec_cmp_expr_p (comp_vectype, mask_type))
+   return NULL;
+
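
The new selection rule in search_type_for_mask for BIT_AND/IOR/XOR operands
(take whichever operand type was found, and prefer the one with the smaller
precision when both were found) can be modelled on plain integers.  In this
sketch a precision of 0 stands for "no type found" (NULL in the patch), and
toy_pick_mask_precision is a hypothetical name.

```c
/* RES and RES2 are the precisions found for the two operands,
   or 0 when the search found nothing for that operand.  */
static unsigned
toy_pick_mask_precision (unsigned res, unsigned res2)
{
  /* Same shape as the patched condition:
     if (!res || (res2 && TYPE_PRECISION (res) > TYPE_PRECISION (res2)))
       res = res2;  */
  if (!res || (res2 && res > res2))
    res = res2;
  return res;
}
```

The result is 0 only when neither operand yields a type.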
   

[mask conversion, patch 2/2, i386] Add pack/unpack patterns for scalar masks

2015-10-19 Thread Ilya Enkovich
Hi,

This patch adds patterns to be used for packing and unpacking vector masks on 
AVX512.  Bootstrapped and tested on x86_64-unknown-linux-gnu.  Does it look OK?

Thanks,
Ilya
--
gcc/

2015-10-19  Ilya Enkovich  <enkovich@gmail.com>

* config/i386/sse.md (HALFMASKMODE): New attribute.
(DOUBLEMASKMODE): New attribute.
(vec_pack_trunc_qi): New.
(vec_pack_trunc_): New.
(vec_unpacku_lo_hi): New.
(vec_unpacku_lo_si): New.
(vec_unpacku_lo_di): New.
(vec_unpacku_hi_hi): New.
(vec_unpacku_hi_): New.

gcc/testsuite/

2015-10-19  Ilya Enkovich  <enkovich@gmail.com>

* gcc.target/i386/mask-pack.c: New test.
* gcc.target/i386/mask-unpack.c: New test.


diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 452629f..ed0eedc 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -799,6 +799,14 @@
   [(V32QI "t") (V16HI "t") (V8SI "t") (V4DI "t") (V8SF "t") (V4DF "t")
(V64QI "g") (V32HI "g") (V16SI "g") (V8DI "g") (V16SF "g") (V8DF "g")])
 
+;; Half mask mode for unpacks
+(define_mode_attr HALFMASKMODE
+  [(DI "SI") (SI "HI")])
+
+;; Double mask mode for packs
+(define_mode_attr DOUBLEMASKMODE
+  [(HI "SI") (SI "DI")])
+
 
 ;; Include define_subst patterns for instructions with mask
 (include "subst.md")
@@ -11578,6 +11586,23 @@
   DONE;
 })
 
+(define_expand "vec_pack_trunc_qi"
+  [(set (match_operand:HI 0 ("register_operand"))
+(ior:HI (ashift:HI (zero_extend:HI (match_operand:QI 1 
("register_operand")))
+   (const_int 8))
+(zero_extend:HI (match_operand:QI 2 ("register_operand")]
+  "TARGET_AVX512F")
+
+(define_expand "vec_pack_trunc_"
+  [(set (match_operand: 0 ("register_operand"))
+(ior: (ashift: 
(zero_extend: (match_operand:SWI24 1 ("register_operand")))
+   (match_dup 3))
+(zero_extend: (match_operand:SWI24 2 
("register_operand")]
+  "TARGET_AVX512BW"
+{
+  operands[3] = GEN_INT (GET_MODE_BITSIZE (mode));
+})
+
 (define_insn "_packsswb"
   [(set (match_operand:VI1_AVX512 0 "register_operand" "=x,x")
(vec_concat:VI1_AVX512
@@ -13474,12 +13499,42 @@
   "TARGET_SSE2"
   "ix86_expand_sse_unpack (operands[0], operands[1], true, false); DONE;")
 
+(define_expand "vec_unpacku_lo_hi"
+  [(set (match_operand:QI 0 "register_operand")
+(subreg:QI (match_operand:HI 1 "register_operand") 0))]
+  "TARGET_AVX512DQ")
+
+(define_expand "vec_unpacku_lo_si"
+  [(set (match_operand:HI 0 "register_operand")
+(subreg:HI (match_operand:SI 1 "register_operand") 0))]
+  "TARGET_AVX512F")
+
+(define_expand "vec_unpacku_lo_di"
+  [(set (match_operand:SI 0 "register_operand")
+(subreg:SI (match_operand:DI 1 "register_operand") 0))]
+  "TARGET_AVX512BW")
+
 (define_expand "vec_unpacku_hi_"
   [(match_operand: 0 "register_operand")
(match_operand:VI124_AVX2_24_AVX512F_1_AVX512BW 1 "register_operand")]
   "TARGET_SSE2"
   "ix86_expand_sse_unpack (operands[0], operands[1], true, true); DONE;")
 
+(define_expand "vec_unpacku_hi_hi"
+  [(set (subreg:HI (match_operand:QI 0 "register_operand") 0)
+(lshiftrt:HI (match_operand:HI 1 "register_operand")
+ (const_int 8)))]
+  "TARGET_AVX512F")
+
+(define_expand "vec_unpacku_hi_"
+  [(set (subreg:SWI48x (match_operand: 0 "register_operand") 0)
+(lshiftrt:SWI48x (match_operand:SWI48x 1 "register_operand")
+ (match_dup 2)))]
+  "TARGET_AVX512BW"
+{
+  operands[2] = GEN_INT (GET_MODE_BITSIZE (mode));
+})
+
 ;
 ;;
 ;; Miscellaneous
diff --git a/gcc/testsuite/gcc.target/i386/mask-pack.c 
b/gcc/testsuite/gcc.target/i386/mask-pack.c
new file mode 100644
index 000..0b564ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mask-pack.c
@@ -0,0 +1,100 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O3 -fopenmp-simd -fdump-tree-vect-details" } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 10 "vect" } } */
+/* { dg-final { scan-assembler-not "maskmov" } } */
+
+#define LENGTH 1000
+
+long l1[LENGTH], l2[LENGTH];
+int i1[LENGTH], i2[LENGTH];
+short s1[LENGTH], s2[LENGTH];
+char c1[LENGTH], c2[LENGTH];
+double d1[LENGTH], d2[LENGTH];
+
+int test1 (int n)
+{
+  int i;
+  #pragma omp simd safelen(1
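
For scalar masks, the pack and unpack expanders in the sse.md part of this
patch amount to shifts and ORs on integers.  The sketch below models the
8-bit/16-bit case as far as the RTL above can be read: operand 1 is shifted
into the upper half when packing, and the low half is taken as the low byte
(little-endian subreg at offset 0).  The toy_* function names are
hypothetical.

```c
#include <stdint.h>

/* Model of vec_pack_trunc_qi: (op1 << 8) | op2 on zero-extended values.  */
static uint16_t
toy_pack_mask8 (uint8_t op1, uint8_t op2)
{
  return (uint16_t) (((uint16_t) op1 << 8) | op2);
}

/* Model of vec_unpacku_lo_hi: the low 8 bits of a 16-bit mask.  */
static uint8_t
toy_unpack_lo16 (uint16_t mask)
{
  return (uint8_t) (mask & 0xff);
}

/* Model of vec_unpacku_hi_hi: logical shift right by 8.  */
static uint8_t
toy_unpack_hi16 (uint16_t mask)
{
  return (uint8_t) (mask >> 8);
}
```

Packing two halves and unpacking the result gives the original halves back.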

[mask-vec_cond, patch 3/2] SLP support

2015-10-19 Thread Ilya Enkovich
Hi,

This patch adds the missing SLP support for a cond_expr with no embedded 
comparison.  No new test is added because the vec cmp SLP test becomes (due to 
the changes to bool patterns in the first patch) a regression test for this 
patch.  Does it look OK?

Thanks,
Ilya
--
gcc/

2015-10-19  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-slp.c (vect_get_and_check_slp_defs): Allow
COND_EXPR with no embedded comparison.
(vect_build_slp_tree_1): Likewise.


diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index fa8291e..48311dd 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -257,7 +257,8 @@ vect_get_and_check_slp_defs (vec_info *vinfo,
 {
   enum tree_code code = gimple_assign_rhs_code (stmt);
   number_of_oprnds = gimple_num_ops (stmt) - 1;
-  if (gimple_assign_rhs_code (stmt) == COND_EXPR)
+  if (gimple_assign_rhs_code (stmt) == COND_EXPR
+ && COMPARISON_CLASS_P (gimple_assign_rhs1 (stmt)))
{
  first_op_cond = true;
  commutative = true;
@@ -482,7 +483,6 @@ vect_build_slp_tree_1 (vec_info *vinfo,
   machine_mode vec_mode;
   HOST_WIDE_INT dummy;
   gimple *first_load = NULL, *prev_first_load = NULL;
-  tree cond;
 
   /* For every stmt in NODE find its def stmt/s.  */
   FOR_EACH_VEC_ELT (stmts, i, stmt)
@@ -527,24 +527,6 @@ vect_build_slp_tree_1 (vec_info *vinfo,
  return false;
}
 
-   if (is_gimple_assign (stmt)
-  && gimple_assign_rhs_code (stmt) == COND_EXPR
-   && (cond = gimple_assign_rhs1 (stmt))
-   && !COMPARISON_CLASS_P (cond))
-{
-  if (dump_enabled_p ())
-{
-  dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, 
-  "Build SLP failed: condition is not "
-  "comparison ");
-  dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM, stmt, 0);
-  dump_printf (MSG_MISSED_OPTIMIZATION, "\n");
-}
- /* Fatal mismatch.  */
- matches[0] = false;
-  return false;
-}
-
   scalar_type = vect_get_smallest_scalar_type (stmt, &dummy, &dummy);
   vectype = get_vectype_for_scalar_type (scalar_type);
   if (!vectype)


[PATCH, libmpx, PR66887] Remove redundant code

2015-10-15 Thread Ilya Enkovich
Hi,

This patch removes redundant memset and memcpy calls from libmpx.  Bootstrapped 
and tested w/ MPX on x86_64-unknown-linux-gnu.  Applied to trunk.
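
As a rough sketch of why the copy is redundant, the two functions below read the same field, once through a zeroed scratch copy and once in place.  The struct layouts and names are hypothetical stand-ins, not the real libmpx or kernel definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the save-area layout.  */
struct demo_bndcsr { uint64_t cfg_reg; uint64_t status_reg; };
struct demo_xsave { uint8_t legacy[64]; struct demo_bndcsr bndcsr; };

/* Before: zero a scratch buffer, copy the whole area, read one field.  */
uint64_t
read_status_via_copy (const struct demo_xsave *area)
{
  struct demo_xsave scratch;

  assert (area != NULL);
  memset (&scratch, 0, sizeof (scratch));
  memcpy (&scratch, area, sizeof (scratch));
  return scratch.bndcsr.status_reg;
}

/* After: read the field in place, with no intermediate buffer.  */
uint64_t
read_status_in_place (const struct demo_xsave *area)
{
  assert (area != NULL);
  return area->bndcsr.status_reg;
}
```

Both functions return the same value for any valid area; the second one just avoids writing and then overwriting a scratch buffer.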

Thanks,
Ilya
--
libmpx/

2015-10-15  Ilya Enkovich  <enkovich@gmail.com>

PR other/66887
* mpxrt/mpxrt.c (read_mpx_status_sig): Remove useless code.


diff --git a/libmpx/mpxrt/mpxrt.c b/libmpx/mpxrt/mpxrt.c
index 0eff87e..c29c5d9 100644
--- a/libmpx/mpxrt/mpxrt.c
+++ b/libmpx/mpxrt/mpxrt.c
@@ -152,13 +152,8 @@ xgetbv (uint32_t index)
 static uint64_t
 read_mpx_status_sig (ucontext_t *uctxt)
 {
-  uint8_t __attribute__ ((__aligned__ (64))) buffer[4096];
-  struct xsave_struct *xsave_buf = (struct xsave_struct *)buffer;
-
-  memset (buffer, 0, sizeof (buffer));
-  memcpy (buffer,
- (uint8_t *)uctxt->uc_mcontext.fpregs + XSAVE_OFFSET_IN_FPMEM,
- sizeof (struct xsave_struct));
+  uint8_t *regs = (uint8_t *)uctxt->uc_mcontext.fpregs + XSAVE_OFFSET_IN_FPMEM;
+  struct xsave_struct *xsave_buf = (struct xsave_struct *)regs;
   return xsave_buf->bndcsr.status_reg;
 }
 


Re: [vec-cmp, patch 3/6] Vectorize comparison

2015-10-14 Thread Ilya Enkovich
On 14 Oct 15:06, Ilya Enkovich wrote:
> 
> Will send an updated version after testing.
> 
> Thanks,
> Ilya
> 

Here is an updated patch version.

Thanks,
Ilya
--
gcc/

2015-10-14  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-data-refs.c (vect_get_new_vect_var): Support vect_mask_var.
(vect_create_destination_var): Likewise.
* tree-vect-stmts.c (vectorizable_comparison): New.
(vect_analyze_stmt): Add vectorizable_comparison.
(vect_transform_stmt): Likewise.
* tree-vectorizer.h (enum vect_var_kind): Add vect_mask_var.
(enum stmt_vec_info_type): Add comparison_vec_info_type.
(vectorizable_comparison): New.


diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 8a4d489..0be0523 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -3870,6 +3870,9 @@ vect_get_new_vect_var (tree type, enum vect_var_kind 
var_kind, const char *name)
   case vect_scalar_var:
 prefix = "stmp";
 break;
+  case vect_mask_var:
+prefix = "mask";
+break;
   case vect_pointer_var:
 prefix = "vectp";
 break;
@@ -4424,7 +4427,11 @@ vect_create_destination_var (tree scalar_dest, tree 
vectype)
   tree type;
   enum vect_var_kind kind;
 
-  kind = vectype ? vect_simple_var : vect_scalar_var;
+  kind = vectype
+? VECTOR_BOOLEAN_TYPE_P (vectype)
+? vect_mask_var
+: vect_simple_var
+: vect_scalar_var;
   type = vectype ? vectype : TREE_TYPE (scalar_dest);
 
   gcc_assert (TREE_CODE (scalar_dest) == SSA_NAME);
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 23cec8a..6a52895 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7516,6 +7516,192 @@ vectorizable_condition (gimple *stmt, 
gimple_stmt_iterator *gsi,
   return true;
 }
 
+/* vectorizable_comparison.
+
+   Check if STMT is comparison expression that can be vectorized.
+   If VEC_STMT is also passed, vectorize the STMT: create a vectorized
+   comparison, put it in VEC_STMT, and insert it at GSI.
+
+   Return FALSE if not a vectorizable STMT, TRUE otherwise.  */
+
+bool
+vectorizable_comparison (gimple *stmt, gimple_stmt_iterator *gsi,
+gimple **vec_stmt, tree reduc_def,
+slp_tree slp_node)
+{
+  tree lhs, rhs1, rhs2;
+  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+  tree vectype1 = NULL_TREE, vectype2 = NULL_TREE;
+  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+  tree vec_rhs1 = NULL_TREE, vec_rhs2 = NULL_TREE;
+  tree vec_compare;
+  tree new_temp;
+  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+  tree def;
+  enum vect_def_type dts[2] = {vect_unknown_def_type, vect_unknown_def_type};
+  unsigned nunits;
+  int ncopies;
+  enum tree_code code;
+  stmt_vec_info prev_stmt_info = NULL;
+  int i, j;
+  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
+  vec<tree> vec_oprnds0 = vNULL;
+  vec<tree> vec_oprnds1 = vNULL;
+  gimple *def_stmt;
+  tree mask_type;
+  tree mask;
+
+  if (!VECTOR_BOOLEAN_TYPE_P (vectype))
+return false;
+
+  mask_type = vectype;
+  nunits = TYPE_VECTOR_SUBPARTS (vectype);
+
+  if (slp_node || PURE_SLP_STMT (stmt_info))
+ncopies = 1;
+  else
+ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
+
+  gcc_assert (ncopies >= 1);
+  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
+return false;
+
+  if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
+  && !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
+  && reduc_def))
+return false;
+
+  if (STMT_VINFO_LIVE_P (stmt_info))
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"value used after loop.\n");
+  return false;
+}
+
+  if (!is_gimple_assign (stmt))
+return false;
+
+  code = gimple_assign_rhs_code (stmt);
+
+  if (TREE_CODE_CLASS (code) != tcc_comparison)
+return false;
+
+  rhs1 = gimple_assign_rhs1 (stmt);
+  rhs2 = gimple_assign_rhs2 (stmt);
+
+  if (!vect_is_simple_use_1 (rhs1, stmt, stmt_info->vinfo,
+&def_stmt, &def, &dts[0], &vectype1))
+return false;
+
+  if (!vect_is_simple_use_1 (rhs2, stmt, stmt_info->vinfo,
+&def_stmt, &def, &dts[1], &vectype2))
+   return false;
+
+  if (vectype1 && vectype2
+  && TYPE_VECTOR_SUBPARTS (vectype1) != TYPE_VECTOR_SUBPARTS (vectype2))
+return false;
+
+  vectype = vectype1 ? vectype1 : vectype2;
+
+  /* Invariant comparison.  */
+  if (!vectype)
+{
+  vectype = build_vector_type (TREE_TYPE (rhs1), nunits);
+  if (tree_to_shwi (TYPE_SIZE_UNIT (vectype)) != current_vector_size)
+   return false;
+}
+  else if (nunits != TYPE_VECTOR_SUBPARTS (vectype))
+return false;
+
+  if (!vec_stmt)
+{
+  STMT_VINFO_TYPE (stmt_info) = comparison_vec_info_type;
+  vect_model_simple_cost (stmt_info, ncopies, dts, NULL

Re: [vec-cmp, patch 4/6] Support vector mask invariants

2015-10-14 Thread Ilya Enkovich
On 14 Oct 13:50, Ilya Enkovich wrote:
> 2015-10-14 11:49 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> > On Tue, Oct 13, 2015 at 4:52 PM, Ilya Enkovich <enkovich@gmail.com> 
> > wrote:
> >> I don't understand what you mean. vect_get_vec_def_for_operand has two
> >> changes made.
> >> 1. For boolean invariants use build_same_sized_truth_vector_type
> >> instead of get_vectype_for_scalar_type in case statement produces a
> >> boolean vector. This covers cases when we use invariants in
> >> comparison, AND, IOR, XOR.
> >
> > Yes, I understand we need this special-casing to differentiate between
> > the vector type
> > used for boolean-typed loads/stores and the type for boolean typed 
> > constants.
> > What happens if we mix them btw, like with
> >
> >   _Bool b = bools[i];
> >   _Bool c = b || d;
> >   ...
> >
> > ?
> 
> Here both statements should get vector of char as a vectype and we
> never go VECTOR_BOOLEAN_TYPE_P way for them
> 
> >
> >> 2. COND_EXPR is an exception because it has built-in boolean vector
> >> result not reflected in its vecinfo. Thus I added additional operand
> >> for vect_get_vec_def_for_operand to directly specify vectype for
> >> vector definition in case it is a loop invariant.
> >> So what do you propose to do with these changes?
> >
> > This is the change I don't like and don't see why we need it.  It works 
> > today
> > and the comparison operands should be of appropriate type already?
> 
> Today it works because we always create vector of integer constant.
> With boolean vectors it may be either integer vector or boolean vector
> depending on context. Consider:
> 
> _Bool _1;
> int _2;
> 
> _2 = _1 != 0 ? 0 : 1
> 
> We have two zero constants here requiring different vectypes.
> 
> Ilya
> 
> >
> > Richard.
> >
> >> Thanks,
> >> Ilya

Here is an updated patch version.

Thanks,
Ilya
--
gcc/

2015-10-14  Ilya Enkovich  <enkovich@gmail.com>

* expr.c (const_vector_mask_from_tree): New.
(const_vector_from_tree): Use const_vector_mask_from_tree
for boolean vectors.
* tree-vect-stmts.c (vect_init_vector): Support boolean vector
invariants.
(vect_get_vec_def_for_operand): Add VECTYPE arg.
(vectorizable_condition): Directly provide vectype for invariants
used in comparison.
* tree-vectorizer.h (vect_get_vec_def_for_operand): Add VECTYPE
arg.
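
As a plain-C sketch of the element mapping that const_vector_mask_from_tree performs in the diff that follows: a zero element stays zero, and a "true" element (given as 1 or -1) becomes all-ones.  The helper name and the use of int32_t lanes are assumptions made for illustration; the real function builds an rtx CONST_VECTOR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: normalize boolean constants into mask elements,
   where "true" is all-ones (-1) and "false" is zero.  */
void
normalize_mask (const int *elts, int32_t *mask, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    {
      /* Like the patch, treat any value other than 0, 1 or -1 as a bug.  */
      assert (elts[i] == 0 || elts[i] == 1 || elts[i] == -1);
      mask[i] = elts[i] == 0 ? 0 : -1;
    }
}
```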


diff --git a/gcc/expr.c b/gcc/expr.c
index b5ff598..ab25d1a 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11344,6 +11344,40 @@ try_tablejump (tree index_type, tree index_expr, tree 
minval, tree range,
   return 1;
 }
 
+/* Return a CONST_VECTOR rtx representing vector mask for
+   a VECTOR_CST of booleans.  */
+static rtx
+const_vector_mask_from_tree (tree exp)
+{
+  rtvec v;
+  unsigned i;
+  int units;
+  tree elt;
+  machine_mode inner, mode;
+
+  mode = TYPE_MODE (TREE_TYPE (exp));
+  units = GET_MODE_NUNITS (mode);
+  inner = GET_MODE_INNER (mode);
+
+  v = rtvec_alloc (units);
+
+  for (i = 0; i < VECTOR_CST_NELTS (exp); ++i)
+{
+  elt = VECTOR_CST_ELT (exp, i);
+
+  gcc_assert (TREE_CODE (elt) == INTEGER_CST);
+  if (integer_zerop (elt))
+   RTVEC_ELT (v, i) = CONST0_RTX (inner);
+  else if (integer_onep (elt)
+  || integer_minus_onep (elt))
+   RTVEC_ELT (v, i) = CONSTM1_RTX (inner);
+  else
+   gcc_unreachable ();
+}
+
+  return gen_rtx_CONST_VECTOR (mode, v);
+}
+
 /* Return a CONST_VECTOR rtx for a VECTOR_CST tree.  */
 static rtx
 const_vector_from_tree (tree exp)
@@ -11359,6 +11393,9 @@ const_vector_from_tree (tree exp)
   if (initializer_zerop (exp))
 return CONST0_RTX (mode);
 
+  if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (exp)))
+return const_vector_mask_from_tree (exp);
+
   units = GET_MODE_NUNITS (mode);
   inner = GET_MODE_INNER (mode);
 
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 6a52895..01168ae 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1308,7 +1308,22 @@ vect_init_vector (gimple *stmt, tree val, tree type, 
gimple_stmt_iterator *gsi)
   if (!types_compatible_p (TREE_TYPE (type), TREE_TYPE (val)))
{
  if (CONSTANT_CLASS_P (val))
-   val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (type), val);
+   {
+ /* Can't use VIEW_CONVERT_EXPR for booleans because
+of possibly different sizes of scalar value and
+vector element.  */
+ if (VECTOR_BOOLEAN_TYPE_P (type))
+   {
+ if (integer_zerop (val))
+   val = build_int_cst (TREE_TYPE (type), 0);
+ else if

Re: [vec-cmp, patch 4/6] Support vector mask invariants

2015-10-14 Thread Ilya Enkovich
2015-10-14 11:49 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Tue, Oct 13, 2015 at 4:52 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
>> I don't understand what you mean. vect_get_vec_def_for_operand has two
>> changes made.
>> 1. For boolean invariants use build_same_sized_truth_vector_type
>> instead of get_vectype_for_scalar_type in case statement produces a
>> boolean vector. This covers cases when we use invariants in
>> comparison, AND, IOR, XOR.
>
> Yes, I understand we need this special-casing to differentiate between
> the vector type
> used for boolean-typed loads/stores and the type for boolean typed constants.
> What happens if we mix them btw, like with
>
>   _Bool b = bools[i];
>   _Bool c = b || d;
>   ...
>
> ?

Here both statements should get vector of char as a vectype and we
never go VECTOR_BOOLEAN_TYPE_P way for them

>
>> 2. COND_EXPR is an exception because it has built-in boolean vector
>> result not reflected in its vecinfo. Thus I added additional operand
>> for vect_get_vec_def_for_operand to directly specify vectype for
>> vector definition in case it is a loop invariant.
>> So what do you propose to do with these changes?
>
> This is the change I don't like and don't see why we need it.  It works today
> and the comparison operands should be of appropriate type already?

Today it works because we always create a vector of integer constants.
With boolean vectors it may be either an integer vector or a boolean vector
depending on context. Consider:

_Bool _1;
int _2;

_2 = _1 != 0 ? 0 : 1

We have two zero constants here requiring different vectypes.
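
A scalar loop that could give rise to such a statement might look like the sketch below (function and parameter names are hypothetical).  In the C source both zeros are written the same way; the distinction only appears in the GIMPLE form above, where the zero compared against _1 is boolean-typed and the selected zero is an int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar form of "_2 = _1 != 0 ? 0 : 1" over arrays.  */
void
invert_flags (const bool *flags, int *out, size_t n)
{
  assert (flags != NULL && out != NULL);
  for (size_t i = 0; i < n; ++i)
    /* First 0: compared with a boolean value.
       Second 0 and the 1: int values being selected.  */
    out[i] = flags[i] != 0 ? 0 : 1;
}
```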

Ilya

>
> Richard.
>
>> Thanks,
>> Ilya


Re: [vec-cmp, patch 2/6] Vectorization factor computation

2015-10-14 Thread Ilya Enkovich
2015-10-13 16:37 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Thu, Oct 8, 2015 at 4:59 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
>> Hi,
>>
>> This patch handles statements with boolean result in vectorization factor 
>> computation.  For a comparison, the type of its operands is used instead of 
>> the result type to compute VF.  Other boolean statements are ignored for VF.
>>
>> Vectype for comparison is computed using type of compared values.  Computed 
>> type is propagated into other boolean operations.
>
> This feels rather ad-hoc, mixing up the existing way of computing
> vector type and VF.  I'd rather have turned the whole
> vector type computation around to the scheme working on the operands
> rather than on the lhs and then searching
> for smaller/larger types on the rhs'.
>
> I know this is a tricky function (heh, but you make it even worse...).
> And it needs a helper with knowledge about operations
> so one can compute the result vector type for an operation on its
> operands.  The seeds should be PHIs (handled like now)
> and loads, and yes, externals need special handling.
>
> Ideally we'd do things in two stages, first compute vector types in a
> less constrained manner (not forcing a single vector size)
> and then in a 2nd run promote to a common size also computing the VF to do 
> that.

This sounds like a refactoring, not a functional change, right? Also I
don't see a reason to analyze DF to compute vectypes if we promote it
to a single vector size anyway. For booleans we have to do it because
boolean vectors of the same size may have different number of
elements. What is the reason to do it for other types?
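
To make the element-count point concrete, here is a small sketch (the helper name is hypothetical).  A mask produced by a comparison has one element per compared value, so for the same data vector size the element count follows the width of the compared type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of elements in a vector of VECTOR_BYTES
   bytes whose elements are ELEM_BYTES wide.  */
unsigned
lanes_for (size_t vector_bytes, size_t elem_bytes)
{
  assert (elem_bytes != 0 && vector_bytes % elem_bytes == 0);
  return (unsigned) (vector_bytes / elem_bytes);
}
```

With 16-byte vectors, comparing 4-byte values gives a 4-element mask, while comparing 1-byte values gives a 16-element mask.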

Shouldn't it be a patch independent from comparison vectorization series?

>
> Btw, I think you "mishandle" bool b = boolvar != 0;

This should be handled fine. Statement will inherit a vectype of
'boolvar' definition. If it's invariant - then yes, invariant boolean
statement case is not handled. But this is only because I supposed we
just shouldn't have such statements in a loop. If we may have them,
then using 'vector _Bool (VF)' type for that should be OK.

Ilya

>
> Richard.
>


Re: [vec-cmp, patch 3/6] Vectorize comparison

2015-10-14 Thread Ilya Enkovich
2015-10-13 16:45 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Thu, Oct 8, 2015 at 5:03 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
>> Hi,
>>
>> This patch supports vectorization of comparison statements based on the 
>> introduced optabs.
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2015-10-08  Ilya Enkovich  <enkovich@gmail.com>
>>
>> * tree-vect-data-refs.c (vect_get_new_vect_var): Support 
>> vect_mask_var.
>> (vect_create_destination_var): Likewise.
>> * tree-vect-stmts.c (vectorizable_comparison): New.
>> (vect_analyze_stmt): Add vectorizable_comparison.
>> (vect_transform_stmt): Likewise.
>> * tree-vectorizer.h (enum vect_var_kind): Add vect_mask_var.
>> (enum stmt_vec_info_type): Add comparison_vec_info_type.
>> (vectorizable_comparison): New.
>>
>>
>> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
>> index 3befa38..9edc663 100644
>> --- a/gcc/tree-vect-data-refs.c
>> +++ b/gcc/tree-vect-data-refs.c
>> @@ -3849,6 +3849,9 @@ vect_get_new_vect_var (tree type, enum vect_var_kind 
>> var_kind, const char *name)
>>case vect_scalar_var:
>>  prefix = "stmp";
>>  break;
>> +  case vect_mask_var:
>> +prefix = "mask";
>> +break;
>>case vect_pointer_var:
>>  prefix = "vectp";
>>  break;
>> @@ -4403,7 +4406,11 @@ vect_create_destination_var (tree scalar_dest, tree 
>> vectype)
>>tree type;
>>enum vect_var_kind kind;
>>
>> -  kind = vectype ? vect_simple_var : vect_scalar_var;
>> +  kind = vectype
>> +? VECTOR_BOOLEAN_TYPE_P (vectype)
>> +? vect_mask_var
>> +: vect_simple_var
>> +: vect_scalar_var;
>>type = vectype ? vectype : TREE_TYPE (scalar_dest);
>>
>>gcc_assert (TREE_CODE (scalar_dest) == SSA_NAME);
>> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
>> index 8eda8e9..6949c71 100644
>> --- a/gcc/tree-vect-stmts.c
>> +++ b/gcc/tree-vect-stmts.c
>> @@ -7525,6 +7525,211 @@ vectorizable_condition (gimple *stmt, 
>> gimple_stmt_iterator *gsi,
>>return true;
>>  }
>>
>> +/* vectorizable_comparison.
>> +
>> +   Check if STMT is comparison expression that can be vectorized.
>> +   If VEC_STMT is also passed, vectorize the STMT: create a vectorized
>> +   comparison, put it in VEC_STMT, and insert it at GSI.
>> +
>> +   Return FALSE if not a vectorizable STMT, TRUE otherwise.  */
>> +
>> +bool
>> +vectorizable_comparison (gimple *stmt, gimple_stmt_iterator *gsi,
>> +gimple **vec_stmt, tree reduc_def,
>> +slp_tree slp_node)
>> +{
>> +  tree lhs, rhs1, rhs2;
>> +  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
>> +  tree vectype1 = NULL_TREE, vectype2 = NULL_TREE;
>> +  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
>> +  tree vec_rhs1 = NULL_TREE, vec_rhs2 = NULL_TREE;
>> +  tree vec_compare;
>> +  tree new_temp;
>> +  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>> +  tree def;
>> +  enum vect_def_type dt, dts[4];
>> +  unsigned nunits;
>> +  int ncopies;
>> +  enum tree_code code;
>> +  stmt_vec_info prev_stmt_info = NULL;
>> +  int i, j;
>> +  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
>> +  vec<tree> vec_oprnds0 = vNULL;
>> +  vec<tree> vec_oprnds1 = vNULL;
>> +  tree mask_type;
>> +  tree mask;
>> +
>> +  if (!VECTOR_BOOLEAN_TYPE_P (vectype))
>> +return false;
>> +
>> +  mask_type = vectype;
>> +  nunits = TYPE_VECTOR_SUBPARTS (vectype);
>> +
>> +  if (slp_node || PURE_SLP_STMT (stmt_info))
>> +ncopies = 1;
>> +  else
>> +ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
>> +
>> +  gcc_assert (ncopies >= 1);
>> +  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
>> +return false;
>> +
>> +  if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
>> +  && !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
>> +  && reduc_def))
>> +return false;
>> +
>> +  if (STMT_VINFO_LIVE_P (stmt_info))
>> +{
>> +  if (dump_enabled_p ())
>> +   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>> +"value used after loop.\n");
>> +  return false;
>> +}
>>

Re: [Boolean Vector, patch 1/5] Introduce boolean vector to be used as a vector comparison type

2015-10-13 Thread Ilya Enkovich
2015-10-13 16:17 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Fri, Oct 9, 2015 at 10:43 PM, Jeff Law <l...@redhat.com> wrote:
>> On 10/02/2015 07:59 AM, Ilya Enkovich wrote:
>>>
>>> 2015-10-02  Ilya Enkovich  <enkovich@gmail.com>
>>>
>>> * doc/tm.texi: Regenerated.
>>> * doc/tm.texi.in (TARGET_VECTORIZE_GET_MASK_MODE): New.
>>> * stor-layout.c (layout_type): Use mode to get vector mask size.
>>> * target.def (get_mask_mode): New.
>>> * targhooks.c (default_get_mask_mode): New.
>>> * targhooks.h (default_get_mask_mode): New.
>>> * gcc/tree-vect-stmts.c (get_same_sized_vectype): Add special case
>>> for boolean vector.
>>> * tree.c (MAX_BOOL_CACHED_PREC): New.
>>> (nonstandard_boolean_type_cache): New.
>>> (build_nonstandard_boolean_type): New.
>>> (make_vector_type): Vector mask has no canonical type.
>>> (build_truth_vector_type): New.
>>> (build_same_sized_truth_vector_type): New.
>>> (truth_type_for): Support vector masks.
>>> * tree.h (VECTOR_BOOLEAN_TYPE_P): New.
>>> (build_truth_vector_type): New.
>>> (build_same_sized_truth_vector_type): New.
>>> (build_nonstandard_boolean_type): New.
>>>
>>>
>>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
>>> index eb495a8..098213e 100644
>>> --- a/gcc/doc/tm.texi
>>> +++ b/gcc/doc/tm.texi
>>> @@ -5688,6 +5688,11 @@ mode returned by
>>> @code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE}.
>>>   The default is zero which means to not iterate over other vector sizes.
>>>   @end deftypefn
>>>
>>> +@deftypefn {Target Hook} machine_mode TARGET_VECTORIZE_GET_MASK_MODE
>>> (unsigned @var{nunits}, unsigned @var{length})
>>> +This hook returns mode to be used for a mask to be used for a vector
>>> +of specified @var{length} with @var{nunits} elements.
>>> +@end deftypefn
>>
>> Does it make sense to indicate the default used if the target does not
>> provide a definition for this hook?
>>
>>
>>
>>
>>> diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
>>> index 938e54b..58ecd7b 100644
>>> --- a/gcc/stor-layout.c
>>> +++ b/gcc/stor-layout.c
>>> @@ -2184,10 +2184,16 @@ layout_type (tree type)
>>>
>>> TYPE_SATURATING (type) = TYPE_SATURATING (TREE_TYPE (type));
>>>   TYPE_UNSIGNED (type) = TYPE_UNSIGNED (TREE_TYPE (type));
>>> -   TYPE_SIZE_UNIT (type) = int_const_binop (MULT_EXPR,
>>> -TYPE_SIZE_UNIT
>>> (innertype),
>>> -size_int (nunits));
>>> -   TYPE_SIZE (type) = int_const_binop (MULT_EXPR, TYPE_SIZE
>>> (innertype),
>>> +   /* Several boolean vector elements may fit in a single unit.  */
>>> +   if (VECTOR_BOOLEAN_TYPE_P (type))
>>> + TYPE_SIZE_UNIT (type)
>>> +   = size_int (GET_MODE_SIZE (type->type_common.mode));
>>
>> Shouldn't this be TYPE_MODE rather than accessing the internals of the tree
>> node directly?
>
> Probably not because of TYPE_MODE interfering for vector types.

Seems I need to roll it back then. I don't think I want scalar mode to
be used for cases when proper integer vector mode is unsupported by
target but returned by default get_mask_mode hook. Such cases just
should be lowered into scalars.
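
As a rough illustration of why the mask size has to come from the mode rather than from element size times element count, consider the sketch below.  Everything in it is an assumption made for the example: the helper name, the "packed" flag, and the one-bit-per-element layout are hypothetical and do not describe any specific target hook implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: storage size in bytes of a mask for a data vector
   of LENGTH bytes with NUNITS elements.
   packed != 0: one bit per element, rounded up to whole bytes.
   packed == 0: an integer vector of the same size as the data vector.  */
size_t
mask_bytes (unsigned nunits, size_t length, int packed)
{
  assert (nunits != 0 && length != 0);
  if (packed)
    return (nunits + 7) / 8;
  return length;
}
```

In the packed case several boolean elements share one byte, which is the situation the layout_type comment in the quoted diff refers to.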

>
> But...
>
> +/* Builds a boolean type of precision PRECISION.
> +   Used for boolean vectors to choose proper vector element size.  */
> +tree
> +build_nonstandard_boolean_type (unsigned HOST_WIDE_INT precision)
> +{
> +  tree type;
> +
> +  if (precision <= MAX_BOOL_CACHED_PREC)
> +{
> +  type = nonstandard_boolean_type_cache[precision];
> +  if (type)
> +   return type;
> +}
> +
> +  type = make_node (BOOLEAN_TYPE);
> +  TYPE_PRECISION (type) = precision;
> +  fixup_unsigned_type (type);
>
> do we really need differing _precision_ boolean types?  I think we only
> need differing size (aka mode) boolean types, no?  Thus, keep precision == 1
> but "only" adjust the mode (possibly by simply setting precision to 1 after
> fixup_unsigned_type ...)?

The reason for that was -1 value of a proper size which may be used as
vector element value. I'm not sure if something breaks in the compiler
if I set 1 precision for all created boolean typ

Re: [Boolean Vector, patch 1/5] Introduce boolean vector to be used as a vector comparison type

2015-10-13 Thread Ilya Enkovich
On 09 Oct 14:43, Jeff Law wrote:
> On 10/02/2015 07:59 AM, Ilya Enkovich wrote:
> >+This hook returns mode to be used for a mask to be used for a vector
> >+of specified @var{length} with @var{nunits} elements.
> >+@end deftypefn
> Does it make sense to indicate the default used if the target does not
> provide a definition for this hook?
> 
> 

Sure

> 
> 
> >diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
> >index 938e54b..58ecd7b 100644
> >--- a/gcc/stor-layout.c
> >+++ b/gcc/stor-layout.c
> >@@ -2184,10 +2184,16 @@ layout_type (tree type)
> >
> > TYPE_SATURATING (type) = TYPE_SATURATING (TREE_TYPE (type));
> >  TYPE_UNSIGNED (type) = TYPE_UNSIGNED (TREE_TYPE (type));
> >-TYPE_SIZE_UNIT (type) = int_const_binop (MULT_EXPR,
> >- TYPE_SIZE_UNIT (innertype),
> >- size_int (nunits));
> >-TYPE_SIZE (type) = int_const_binop (MULT_EXPR, TYPE_SIZE (innertype),
> >+/* Several boolean vector elements may fit in a single unit.  */
> >+if (VECTOR_BOOLEAN_TYPE_P (type))
> >+  TYPE_SIZE_UNIT (type)
> >+= size_int (GET_MODE_SIZE (type->type_common.mode));
> Shouldn't this be TYPE_MODE rather than accessing the internals of the tree
> node directly?

Previous version of this patch had changes in vector_type_mode and seems I 
copy-pasted this field access from there.
Will fix it here.

> 
> 
> >diff --git a/gcc/tree.c b/gcc/tree.c
> >index 84fd34d..0cb8361 100644
> >--- a/gcc/tree.c
> >+++ b/gcc/tree.c
> >@@ -11067,9 +11130,10 @@ truth_type_for (tree type)
> >  {
> >if (TREE_CODE (type) == VECTOR_TYPE)
> >  {
> >-  tree elem = lang_hooks.types.type_for_size
> >-(GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (type))), 0);
> >-  return build_opaque_vector_type (elem, TYPE_VECTOR_SUBPARTS (type));
> >+  if (VECTOR_BOOLEAN_TYPE_P (type))
> >+return type;
> >+  return build_truth_vector_type (TYPE_VECTOR_SUBPARTS (type),
> >+  GET_MODE_SIZE (TYPE_MODE (type)));
> Presumably you're not building an opaque type anymore because you want
> warnings if something tries to do a conversion?  I'm going to assume this
> was intentional.

Right.  I don't expect the front end to cast a boolean vector to anything.  Its 
usage should be limited to VEC_COND_EXPR.

> 
> 
> With the doc update and the fix to use TYPE_MODE (assuming there's not a
> good reason to be looking at the underlying type directly) this is OK.
> 
> jeff

Here is an updated version.

Thanks,
Ilya
--
2015-10-13  Ilya Enkovich  <enkovich@gmail.com>

* doc/tm.texi: Regenerated.
* doc/tm.texi.in (TARGET_VECTORIZE_GET_MASK_MODE): New.
* stor-layout.c (layout_type): Use mode to get vector mask size.
* target.def (get_mask_mode): New.
* targhooks.c (default_get_mask_mode): New.
* targhooks.h (default_get_mask_mode): New.
* gcc/tree-vect-stmts.c (get_same_sized_vectype): Add special case
for boolean vector.
* tree.c (MAX_BOOL_CACHED_PREC): New.
(nonstandard_boolean_type_cache): New.
(build_nonstandard_boolean_type): New.
(make_vector_type): Vector mask has no canonical type.
(build_truth_vector_type): New.
(build_same_sized_truth_vector_type): New.
(truth_type_for): Support vector masks.
* tree.h (VECTOR_BOOLEAN_TYPE_P): New.
(build_truth_vector_type): New.
(build_same_sized_truth_vector_type): New.
(build_nonstandard_boolean_type): New.


diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 33939ec..914cfea 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4225,6 +4225,8 @@ address;  but often a machine-dependent strategy can 
generate better code.
 
 @hook TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES
 
+@hook TARGET_VECTORIZE_GET_MASK_MODE
+
 @hook TARGET_VECTORIZE_INIT_COST
 
 @hook TARGET_VECTORIZE_ADD_STMT_COST
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 938e54b..d2289d9 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -2184,10 +2184,16 @@ layout_type (tree type)
 
TYPE_SATURATING (type) = TYPE_SATURATING (TREE_TYPE (type));
 TYPE_UNSIGNED (type) = TYPE_UNSIGNED (TREE_TYPE (type));
-   TYPE_SIZE_UNIT (type) = int_const_binop (MULT_EXPR,
-TYPE_SIZE_UNIT (innertype),
-size_int (nunits));
-   TYPE_SIZE (type) = int_const_binop (MULT_EXPR, TYPE_SIZE (innertype),
+   /* Several boolean vector elements may fit in a single unit.  */
+   if (VECTOR_BOOLEAN_TYPE_P (type))
+ TYPE_SIZE_UNIT (type)
+  

Re: [Boolean Vector, patch 5/5] Support boolean vectors in vector lowering

2015-10-13 Thread Ilya Enkovich
2015-10-12 13:37 GMT+03:00 Alan Lawrence :
> On 09/10/15 22:01, Jeff Law wrote:
>
>> So my question for the series as a whole is whether or not we need to do
>> something for the other languages, particularly Fortran.  I was a bit
>> surprised to see this stuff bleed into the C/C++ front-ends and
>> obviously wonder if it's bled into Fortran, Ada, Java, etc.
>
>
> Isn't that just because, we have GNU extensions to C/C++, for vectors? I
> admit I don't know enough Ada/Fortran to know whether we've added GNU
> extensions to those languages as well...
>
> A.

I also got an impression only GNU vector extensions should be
affected. And those are for C/C++ only.

Thanks,
Ilya


Re: [Boolean Vector, patch 3/5] Use boolean vector in C/C++ FE

2015-10-13 Thread Ilya Enkovich
On 09 Oct 14:51, Jeff Law wrote:
> On 10/02/2015 08:04 AM, Ilya Enkovich wrote:
> >Hi,
> >
> >This patch makes the C/C++ FE use a boolean vector as the resulting type for 
> >vector comparison.  As a result, a vector comparison in source code is now 
> >parsed into VEC_COND_EXPR, which required a testcase fix-up.
> >
> >Thanks,
> >Ilya
> >--
> >gcc/c
> >
> >2015-10-02  Ilya Enkovich  <enkovich@gmail.com>
> >
> > * c-typeck.c (build_conditional_expr): Use boolean vector
> > type for vector comparison.
> > (build_vec_cmp): New.
> > (build_binary_op): Use build_vec_cmp for comparison.
> >
> >gcc/cp
> >
> >2015-10-02  Ilya Enkovich  <enkovich@gmail.com>
> >
> > * call.c (build_conditional_expr_1): Use boolean vector
> > type for vector comparison.
> > * typeck.c (build_vec_cmp): New.
> > (cp_build_binary_op): Use build_vec_cmp for comparison.
> >
> >gcc/testsuite/
> >
> >2015-10-02  Ilya Enkovich  <enkovich@gmail.com>
> >
> > * g++.dg/ext/vector22.C: Allow VEC_COND_EXPR.
> >
> >
> >diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
> >index 3b26231..3f64d76 100644
> >--- a/gcc/c/c-typeck.c
> >+++ b/gcc/c/c-typeck.c
> >@@ -10220,6 +10232,19 @@ push_cleanup (tree decl, tree cleanup, bool eh_only)
> >STATEMENT_LIST_STMT_EXPR (list) = stmt_expr;
> >  }
> >  
> >+/* Build a vector comparison using VEC_COND_EXPR.  */
> Please make sure your function comments include descriptions of all the
> arguments and return values.

Fixed.

> 
> 
> >+
> >+static tree
> >+build_vec_cmp (tree_code code, tree type,
> >+   tree arg0, tree arg1)
> >+{
> >+  tree zero_vec = build_zero_cst (type);
> >+  tree minus_one_vec = build_minus_one_cst (type);
> >+  tree cmp_type = build_same_sized_truth_vector_type (type);
> >+  tree cmp = build2 (code, cmp_type, arg0, arg1);
> >+  return build3 (VEC_COND_EXPR, type, cmp, minus_one_vec, zero_vec);
> >+}
> Isn't this implementation the same for C & C++?  Does it make sense to put
> it in c-family/c-common.c?

The C++ version calls fold_if_not_in_template for the generated comparison.  It 
is required there to successfully recognize vector MIN, MAX and ABS templates 
for the vector ?: conditional operator.  The vector form of the ?: conditional 
operator is supported for C++ only.
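
For reference, the user-visible result that build_vec_cmp encodes as VEC_COND_EXPR (cmp, -1, 0) can be sketched with the GNU vector extension.  The typedef and function names below are hypothetical; the example assumes a compiler that supports vector_size types, comparison on them, and subscripting.

```c
#include <assert.h>

typedef int v4si __attribute__ ((vector_size (16)));

/* Element-wise a < b: each result element is -1 where the comparison
   holds and 0 where it does not.  */
v4si
less_than (v4si a, v4si b)
{
  return a < b;
}

/* Hypothetical helper: test whether element I of mask M is "true".  */
int
lane_is_true (v4si m, int i)
{
  assert (i >= 0 && i < 4);
  return m[i] == -1;
}
```

The result type seen by the user stays an integer vector; the boolean vector type is only the type of the comparison inside the VEC_COND_EXPR.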

> 
> 
> >+
> >  /* Build a binary-operation expression without default conversions.
> > CODE is the kind of expression to build.
> > LOCATION is the operator's location.
> >@@ -10786,7 +10811,8 @@ build_binary_op (location_t location, enum tree_code 
> >code,
> >result_type = build_opaque_vector_type (intt,
> >   TYPE_VECTOR_SUBPARTS (type0));
> >converted = 1;
> >-  break;
> >+  ret = build_vec_cmp (resultcode, result_type, op0, op1);
> >+  goto return_build_binary_op;
> I suspect there's some kind of whitespace/tab problem.  Those two lines
> should be indented the same, right?

Fixed.

> 
> 
> >  }
> >if (FLOAT_TYPE_P (type0) || FLOAT_TYPE_P (type1))
> > warning_at (location,
> >@@ -10938,7 +10964,8 @@ build_binary_op (location_t location, enum tree_code 
> >code,
> >result_type = build_opaque_vector_type (intt,
> >   TYPE_VECTOR_SUBPARTS (type0));
> >converted = 1;
> >-      break;
> >+  ret = build_vec_cmp (resultcode, result_type, op0, op1);
> >+  goto return_build_binary_op;
> Similarly here.
> 
> With the items above fixed, this is OK.
> 
> However, more generally, do we need to do anything for the other languages?

Looking into that I got an impression vector modes are used by C/C++ vector 
extensions only.  And I think regression testing would reveal some failures 
otherwise.

> 
> Jeff

Here is an updated version.

Thanks,
Ilya
--
gcc/c

2015-10-02  Ilya Enkovich  <enkovich@gmail.com>

* c-typeck.c (build_conditional_expr): Use boolean vector
type for vector comparison.
(build_vec_cmp): New.
(build_binary_op): Use build_vec_cmp for comparison.

gcc/cp

2015-10-02  Ilya Enkovich  <enkovich@gmail.com>

* call.c (build_conditional_expr_1): Use boolean vector
type for vector comparison.
* typeck.c (build_vec_cmp): New.
(cp_build_binary_op): Use build_vec_cmp for comparison.

gcc/testsuite/

2015-10-02  Ilya Enkovich  <enkovich@gmail.com>

* g++.dg/e

Re: [vec-cmp, patch 4/6] Support vector mask invariants

2015-10-13 Thread Ilya Enkovich
2015-10-13 16:54 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Thu, Oct 8, 2015 at 5:11 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
>> Hi,
>>
>> This patch adds a special handling of boolean vector invariants.  We need 
>> additional code to determine type of generated invariant.  For VEC_COND_EXPR 
>> case we even provide this type directly because statement vectype doesn't 
>> allow us to compute it.  Separate code is used to generate and expand such 
>> vectors.
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2015-10-08  Ilya Enkovich  <enkovich@gmail.com>
>>
>> * expr.c (const_vector_mask_from_tree): New.
>> (const_vector_from_tree): Use const_vector_mask_from_tree
>> for boolean vectors.
>> * tree-vect-stmts.c (vect_init_vector): Support boolean vector
>> invariants.
>> (vect_get_vec_def_for_operand): Add VECTYPE arg.
>> (vectorizable_condition): Directly provide vectype for invariants
>> used in comparison.
>> * tree-vectorizer.h (vect_get_vec_def_for_operand): Add VECTYPE
>> arg.
>>
>>
>> diff --git a/gcc/expr.c b/gcc/expr.c
>> index 88da8cb..a624a34 100644
>> --- a/gcc/expr.c
>> +++ b/gcc/expr.c
>> @@ -11320,6 +11320,40 @@ try_tablejump (tree index_type, tree index_expr, 
>> tree minval, tree range,
>>return 1;
>>  }
>>
>> +/* Return a CONST_VECTOR rtx representing vector mask for
>> +   a VECTOR_CST of booleans.  */
>> +static rtx
>> +const_vector_mask_from_tree (tree exp)
>> +{
>> +  rtvec v;
>> +  unsigned i;
>> +  int units;
>> +  tree elt;
>> +  machine_mode inner, mode;
>> +
>> +  mode = TYPE_MODE (TREE_TYPE (exp));
>> +  units = GET_MODE_NUNITS (mode);
>> +  inner = GET_MODE_INNER (mode);
>> +
>> +  v = rtvec_alloc (units);
>> +
>> +  for (i = 0; i < VECTOR_CST_NELTS (exp); ++i)
>> +{
>> +  elt = VECTOR_CST_ELT (exp, i);
>> +
>> +  gcc_assert (TREE_CODE (elt) == INTEGER_CST);
>> +  if (integer_zerop (elt))
>> +   RTVEC_ELT (v, i) = CONST0_RTX (inner);
>> +  else if (integer_onep (elt)
>> +  || integer_minus_onep (elt))
>> +   RTVEC_ELT (v, i) = CONSTM1_RTX (inner);
>> +  else
>> +   gcc_unreachable ();
>> +}
>> +
>> +  return gen_rtx_CONST_VECTOR (mode, v);
>> +}
>> +
>>  /* Return a CONST_VECTOR rtx for a VECTOR_CST tree.  */
>>  static rtx
>>  const_vector_from_tree (tree exp)
>> @@ -11335,6 +11369,9 @@ const_vector_from_tree (tree exp)
>>if (initializer_zerop (exp))
>>  return CONST0_RTX (mode);
>>
>> +  if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (exp)))
>> +  return const_vector_mask_from_tree (exp);
>> +
>>units = GET_MODE_NUNITS (mode);
>>inner = GET_MODE_INNER (mode);
>>
>> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
>> index 6949c71..337ea7b 100644
>> --- a/gcc/tree-vect-stmts.c
>> +++ b/gcc/tree-vect-stmts.c
>> @@ -1308,27 +1308,61 @@ vect_init_vector_1 (gimple *stmt, gimple *new_stmt, 
>> gimple_stmt_iterator *gsi)
>>  tree
>>  vect_init_vector (gimple *stmt, tree val, tree type, gimple_stmt_iterator 
>> *gsi)
>>  {
>> +  tree val_type = TREE_TYPE (val);
>> +  machine_mode mode = TYPE_MODE (type);
>> +  machine_mode val_mode = TYPE_MODE(val_type);
>>tree new_var;
>>gimple *init_stmt;
>>tree vec_oprnd;
>>tree new_temp;
>>
>>if (TREE_CODE (type) == VECTOR_TYPE
>> -  && TREE_CODE (TREE_TYPE (val)) != VECTOR_TYPE)
>> -{
>> -  if (!types_compatible_p (TREE_TYPE (type), TREE_TYPE (val)))
>> +  && TREE_CODE (val_type) != VECTOR_TYPE)
>> +{
>> +  /* Handle vector of bool represented as a vector of
>> +integers here rather than on expand because it is
>> +a default mask type for targets.  Vector mask is
>> +built in a following way:
>> +
>> +tmp = (int)val
>> +vec_tmp = {tmp, ..., tmp}
>> +vec_cst = VIEW_CONVERT_EXPR<vector(N) _Bool>(vec_tmp);  */
>> +  if (TREE_CODE (val_type) == BOOLEAN_TYPE
>> + && VECTOR_MODE_P (mode)
>> + && SCALAR_INT_MODE_P (GET_MODE_INNER (mode))
>> + && GET_MODE_INNER (mode) != val_mode)
>> {
>> - if (CONSTANT_CLASS_P (val))
>> -   val = fold_unary (V

Re: [Boolean Vector, patch 5/5] Support boolean vectors in vector lowering

2015-10-13 Thread Ilya Enkovich
2015-10-13 18:35 GMT+03:00 Jeff Law <l...@redhat.com>:
> On 10/13/2015 08:56 AM, Ilya Enkovich wrote:
>>
>> 2015-10-12 13:37 GMT+03:00 Alan Lawrence <alan.lawre...@arm.com>:
>>>
>>> On 09/10/15 22:01, Jeff Law wrote:
>>>
>>>> So my question for the series as a whole is whether or not we need to do
>>>> something for the other languages, particularly Fortran.  I was a bit
>>>> surprised to see this stuff bleed into the C/C++ front-ends and
>>>> obviously wonder if it's bled into Fortran, Ada, Java, etc.
>>>
>>>
>>>
>>> Isn't that just because, we have GNU extensions to C/C++, for vectors? I
>>> admit I don't know enough Ada/Fortran to know whether we've added GNU
>>> extensions to those languages as well...
>>>
>>> A.
>>
>>
>> I also got an impression only GNU vector extensions should be
>> affected. And those are for C/C++ only.
>
> I'd be surprised if Fortran doesn't have vector capabilities.  I think some
> sanity checking in there would be wise.

A vector type in a language doesn't mean SIMD.  AFAIK OpenMP is used in
Fortran for SIMD features.  Also, if such a feature existed, I would get
a lot of Fortran regressions due to the fixed IL checker.

Thanks,
Ilya

>
> jeff


Re: [Boolean Vector, patch 3/5] Use boolean vector in C/C++ FE

2015-10-13 Thread Ilya Enkovich
2015-10-13 18:42 GMT+03:00 Jeff Law <l...@redhat.com>:
> On 10/13/2015 08:14 AM, Ilya Enkovich wrote:
>>>>
>>>> +
>>>> +static tree
>>>> +build_vec_cmp (tree_code code, tree type,
>>>> +  tree arg0, tree arg1)
>>>> +{
>>>> +  tree zero_vec = build_zero_cst (type);
>>>> +  tree minus_one_vec = build_minus_one_cst (type);
>>>> +  tree cmp_type = build_same_sized_truth_vector_type (type);
>>>> +  tree cmp = build2 (code, cmp_type, arg0, arg1);
>>>> +  return build3 (VEC_COND_EXPR, type, cmp, minus_one_vec, zero_vec);
>>>> +}
>>>
>>> Isn't this implementation the same for C & C++?  Does it make sense to
>>> put
>>> it in c-family/c-common.c?
>>
>>
>> C++ version calls fold_if_not_in_template for generated comparison.  It is
>> required there to successfully recognize vector MIN, MAX and ABS templates
>> for vector ?: conditional operator.  Vector form of ?: conditional operator
>> is supported for C++ only.
>
> Ah, nevermind then.
>
>
>>>
>>> However, more generally, do we need to do anything for the other
>>> languages?
>>
>>
>> Looking into that I got an impression vector modes are used by C/C++
>> vector extensions only.  And I think regression testing would reveal some
>> failures otherwise.
>
> Maybe this stuff hasn't bled into the Fortran front-end, but the gfortran
> front-end certainly has OpenMP support which presumably has vector
> extensions.

The OpenMP extension doesn't produce any vector code in the front-end.  Code
will be produced by the vectorizer anyway.

>
> The fact that nothing's failing in the testsuite is encouraging, but it'd be
> worth spending a few minutes taking a look to see if there's something that
> might need updating.

I also grepped for VEC_COND_EXPR and it never occurs in front-ends
other than C/C++.

Thanks,
Ilya

>
> Jeff
>


[vec-cmp, patch 2/6] Vectorization factor computation

2015-10-08 Thread Ilya Enkovich
Hi,

This patch handles statements with a boolean result in vectorization factor
computation.  For a comparison, the type of its operands is used instead of
the result type to compute VF.  Other boolean statements are ignored for VF.

The vectype for a comparison is computed from the type of the compared values.
The computed type is propagated into other boolean operations.
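
As a rough illustration of the rule above, here is a simplified model in plain
C.  It is only a sketch: struct stmt_model and both functions are hypothetical
and much simpler than the real vectorizer, which also looks at the smallest
scalar type used by each statement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified description of one scalar statement.  */
struct stmt_model
{
  int bool_result;      /* nonzero if the statement produces a boolean */
  int is_comparison;    /* nonzero for a comparison such as a < b */
  int bool_operands;    /* nonzero if the compared values are booleans */
  size_t result_size;   /* size in bytes of the result's scalar type */
  size_t operand_size;  /* size in bytes of the compared values' type */
};

/* Return the scalar size this statement contributes to the VF
   computation, or 0 if the statement is ignored.  */
static size_t
vf_scalar_size (const struct stmt_model *s)
{
  if (!s->bool_result)
    return s->result_size;
  if (s->is_comparison && !s->bool_operands)
    return s->operand_size;   /* use the compared type, not bool */
  return 0;                   /* other boolean ops do not participate */
}

/* The factor is the largest number of units any participating
   statement needs for a vector of VECTOR_BYTES bytes.  */
unsigned
model_vectorization_factor (const struct stmt_model *stmts, size_t n,
                            size_t vector_bytes)
{
  unsigned vf = 0;
  for (size_t i = 0; i < n; i++)
    {
      size_t size = vf_scalar_size (&stmts[i]);
      if (size == 0)
        continue;
      assert (vector_bytes % size == 0);
      unsigned nunits = (unsigned) (vector_bytes / size);
      if (nunits > vf)
        vf = nunits;
    }
  return vf;
}
```

With 16-byte vectors, an int addition alone gives a factor of 4, while adding
a comparison of two shorts raises it to 8, because the comparison is measured
by its operands.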

Thanks,
Ilya
--
gcc/

2015-10-08  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-loop.c (vect_determine_vectorization_factor):  Ignore mask
operations for VF.  Add mask type computation.
* tree-vect-stmts.c (get_mask_type_for_scalar_type): New.
* tree-vectorizer.h (get_mask_type_for_scalar_type): New.


diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 63e29aa..c7e8067 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -183,19 +183,21 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
 {
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-  int nbbs = loop->num_nodes;
+  unsigned nbbs = loop->num_nodes;
   unsigned int vectorization_factor = 0;
   tree scalar_type;
   gphi *phi;
   tree vectype;
   unsigned int nunits;
   stmt_vec_info stmt_info;
-  int i;
+  unsigned i;
   HOST_WIDE_INT dummy;
   gimple *stmt, *pattern_stmt = NULL;
   gimple_seq pattern_def_seq = NULL;
   gimple_stmt_iterator pattern_def_si = gsi_none ();
   bool analyze_pattern_stmt = false;
+  bool bool_result;
+  auto_vec<stmt_vec_info> mask_producers;
 
   if (dump_enabled_p ())
 dump_printf_loc (MSG_NOTE, vect_location,
@@ -414,6 +416,8 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
  return false;
}
 
+ bool_result = false;
+
  if (STMT_VINFO_VECTYPE (stmt_info))
{
  /* The only case when a vectype had been already set is for stmts
@@ -434,6 +438,32 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
scalar_type = TREE_TYPE (gimple_call_arg (stmt, 3));
  else
scalar_type = TREE_TYPE (gimple_get_lhs (stmt));
+
+ /* Bool ops don't participate in vectorization factor
+computation.  For comparison use compared types to
+compute a factor.  */
+ if (TREE_CODE (scalar_type) == BOOLEAN_TYPE)
+   {
+ mask_producers.safe_push (stmt_info);
+ bool_result = true;
+
+ if (gimple_code (stmt) == GIMPLE_ASSIGN
+ && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+== tcc_comparison
+ && TREE_CODE (TREE_TYPE (gimple_assign_rhs1 (stmt)))
+!= BOOLEAN_TYPE)
+   scalar_type = TREE_TYPE (gimple_assign_rhs1 (stmt));
+ else
+   {
+ if (!analyze_pattern_stmt && gsi_end_p (pattern_def_si))
+   {
+ pattern_def_seq = NULL;
+ gsi_next (&si);
+   }
+ continue;
+   }
+   }
+
  if (dump_enabled_p ())
{
  dump_printf_loc (MSG_NOTE, vect_location,
@@ -456,7 +486,8 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
  return false;
}
 
- STMT_VINFO_VECTYPE (stmt_info) = vectype;
+ if (!bool_result)
+   STMT_VINFO_VECTYPE (stmt_info) = vectype;
 
  if (dump_enabled_p ())
{
@@ -469,8 +500,9 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
  /* The vectorization factor is according to the smallest
 scalar type (or the largest vector size, but we only
 support one vector size per loop).  */
- scalar_type = vect_get_smallest_scalar_type (stmt, &dummy,
-  &dummy);
+ if (!bool_result)
+   scalar_type = vect_get_smallest_scalar_type (stmt, &dummy,
+&dummy);
  if (dump_enabled_p ())
{
  dump_printf_loc (MSG_NOTE, vect_location,
@@ -545,6 +577,100 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
 }
   LOOP_VINFO_VECT_FACTOR (loop_vinfo) = vectorization_factor;
 
+  for (i = 0; i < mask_producers.length (); i++)
+{
+  tree mask_type = NULL;
+  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (mask_producers[i]);
+
+  stmt = STMT_VINFO_STMT (mask_producers[i]);
+
+  if (gimple_code (stmt) == GIMPLE_ASSIGN
+ && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_comparison
+ && TREE_CODE (TREE_TYPE (gimple_assign_rhs1 (stmt))) != BOOLEAN_TYPE)
+   {
+ scalar_type = TREE_TYPE (gimple_assign_rhs1 (stmt));
+

[vec-cmp, patch 4/6] Support vector mask invariants

2015-10-08 Thread Ilya Enkovich
Hi,

This patch adds special handling of boolean vector invariants.  We need
additional code to determine the type of the generated invariant.  For the
VEC_COND_EXPR case we even provide this type directly because the statement
vectype doesn't allow us to compute it.  Separate code is used to generate and
expand such vectors.
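
The element mapping used when expanding a constant boolean vector can be
sketched in plain C as follows.  This is only a model of what
const_vector_mask_from_tree does per element; build_mask_lanes and its
parameters are hypothetical names, and the 32-bit lane width is an assumption
made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Turn N boolean constants into mask lanes: 0 stays 0, while 1 and -1
   both become -1 (all bits set).  Any other value is a caller error.  */
void
build_mask_lanes (const int *elts, int32_t *lanes, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      assert (elts[i] == 0 || elts[i] == 1 || elts[i] == -1);
      lanes[i] = elts[i] ? -1 : 0;
    }
}
```

Accepting both 1 and -1 as "true" means the same constant can be read either
as a one-bit boolean or as an already expanded integer mask.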

Thanks,
Ilya
--
gcc/

2015-10-08  Ilya Enkovich  <enkovich@gmail.com>

* expr.c (const_vector_mask_from_tree): New.
(const_vector_from_tree): Use const_vector_mask_from_tree
for boolean vectors.
* tree-vect-stmts.c (vect_init_vector): Support boolean vector
invariants.
(vect_get_vec_def_for_operand): Add VECTYPE arg.
(vectorizable_condition): Directly provide vectype for invariants
used in comparison.
* tree-vectorizer.h (vect_get_vec_def_for_operand): Add VECTYPE
arg.


diff --git a/gcc/expr.c b/gcc/expr.c
index 88da8cb..a624a34 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11320,6 +11320,40 @@ try_tablejump (tree index_type, tree index_expr, tree 
minval, tree range,
   return 1;
 }
 
+/* Return a CONST_VECTOR rtx representing vector mask for
+   a VECTOR_CST of booleans.  */
+static rtx
+const_vector_mask_from_tree (tree exp)
+{
+  rtvec v;
+  unsigned i;
+  int units;
+  tree elt;
+  machine_mode inner, mode;
+
+  mode = TYPE_MODE (TREE_TYPE (exp));
+  units = GET_MODE_NUNITS (mode);
+  inner = GET_MODE_INNER (mode);
+
+  v = rtvec_alloc (units);
+
+  for (i = 0; i < VECTOR_CST_NELTS (exp); ++i)
+{
+  elt = VECTOR_CST_ELT (exp, i);
+
+  gcc_assert (TREE_CODE (elt) == INTEGER_CST);
+  if (integer_zerop (elt))
+   RTVEC_ELT (v, i) = CONST0_RTX (inner);
+  else if (integer_onep (elt)
+  || integer_minus_onep (elt))
+   RTVEC_ELT (v, i) = CONSTM1_RTX (inner);
+  else
+   gcc_unreachable ();
+}
+
+  return gen_rtx_CONST_VECTOR (mode, v);
+}
+
 /* Return a CONST_VECTOR rtx for a VECTOR_CST tree.  */
 static rtx
 const_vector_from_tree (tree exp)
@@ -11335,6 +11369,9 @@ const_vector_from_tree (tree exp)
   if (initializer_zerop (exp))
 return CONST0_RTX (mode);
 
+  if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (exp)))
+  return const_vector_mask_from_tree (exp);
+
   units = GET_MODE_NUNITS (mode);
   inner = GET_MODE_INNER (mode);
 
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 6949c71..337ea7b 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1308,27 +1308,61 @@ vect_init_vector_1 (gimple *stmt, gimple *new_stmt, 
gimple_stmt_iterator *gsi)
 tree
 vect_init_vector (gimple *stmt, tree val, tree type, gimple_stmt_iterator *gsi)
 {
+  tree val_type = TREE_TYPE (val);
+  machine_mode mode = TYPE_MODE (type);
+  machine_mode val_mode = TYPE_MODE(val_type);
   tree new_var;
   gimple *init_stmt;
   tree vec_oprnd;
   tree new_temp;
 
   if (TREE_CODE (type) == VECTOR_TYPE
-  && TREE_CODE (TREE_TYPE (val)) != VECTOR_TYPE)
-{
-  if (!types_compatible_p (TREE_TYPE (type), TREE_TYPE (val)))
+  && TREE_CODE (val_type) != VECTOR_TYPE)
+{
+  /* Handle vector of bool represented as a vector of
+integers here rather than on expand because it is
+a default mask type for targets.  Vector mask is
+built in a following way:
+
+tmp = (int)val
+vec_tmp = {tmp, ..., tmp}
+vec_cst = VIEW_CONVERT_EXPR<vector(N) _Bool>(vec_tmp);  */
+  if (TREE_CODE (val_type) == BOOLEAN_TYPE
+ && VECTOR_MODE_P (mode)
+ && SCALAR_INT_MODE_P (GET_MODE_INNER (mode))
+ && GET_MODE_INNER (mode) != val_mode)
{
- if (CONSTANT_CLASS_P (val))
-   val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (type), val);
- else
+ unsigned size = GET_MODE_BITSIZE (GET_MODE_INNER (mode));
+ tree stype = build_nonstandard_integer_type (size, 1);
+ tree vectype = get_vectype_for_scalar_type (stype);
+
+ new_temp = make_ssa_name (stype);
+ init_stmt = gimple_build_assign (new_temp, NOP_EXPR, val);
+ vect_init_vector_1 (stmt, init_stmt, gsi);
+
+ val = make_ssa_name (vectype);
+ new_temp = build_vector_from_val (vectype, new_temp);
+ init_stmt = gimple_build_assign (val, new_temp);
+ vect_init_vector_1 (stmt, init_stmt, gsi);
+
+ val = build1 (VIEW_CONVERT_EXPR, type, val);
+   }
+  else
+   {
+ if (!types_compatible_p (TREE_TYPE (type), val_type))
{
- new_temp = make_ssa_name (TREE_TYPE (type));
- init_stmt = gimple_build_assign (new_temp, NOP_EXPR, val);
- vect_init_vector_1 (stmt, init_stmt, gsi);
- val = new_temp;
+ if (CONSTANT_CLASS_P (val))
+   val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (type), val);
+ else
+   {
+ new_temp = make_ssa_name (TREE_TYPE (type));
+ 

[mask-load, patch 2/2, i386] Add/modify mask load/store patterns

2015-10-08 Thread Ilya Enkovich
Hi,

This patch reflects changes in maskload and maskstore optabs and adds patterns 
for AVX-512.

Thanks,
Ilya
--
2015-10-08  Ilya Enkovich  <enkovich@gmail.com>

* config/i386/sse.md (maskload): Rename to ...
(maskload): ... this.
(maskstore): Rename to ...
(maskstore): ... this.
(maskload): New.
(maskstore): New.


diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 3a9d2d3..48424fc 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -18153,7 +18153,7 @@
(set_attr "btver2_decode" "vector") 
(set_attr "mode" "")])
 
-(define_expand "maskload"
+(define_expand "maskload"
   [(set (match_operand:V48_AVX2 0 "register_operand")
(unspec:V48_AVX2
  [(match_operand: 2 "register_operand")
@@ -18161,7 +18161,23 @@
  UNSPEC_MASKMOV))]
   "TARGET_AVX")
 
-(define_expand "maskstore"
+(define_expand "maskload"
+  [(set (match_operand:V48_AVX512VL 0 "register_operand")
+   (vec_merge:V48_AVX512VL
+ (match_operand:V48_AVX512VL 1 "memory_operand")
+ (match_dup 0)
+ (match_operand: 2 "register_operand")))]
+  "TARGET_AVX512F")
+
+(define_expand "maskload"
+  [(set (match_operand:VI12_AVX512VL 0 "register_operand")
+   (vec_merge:VI12_AVX512VL
+ (match_operand:VI12_AVX512VL 1 "memory_operand")
+ (match_dup 0)
+ (match_operand: 2 "register_operand")))]
+  "TARGET_AVX512BW")
+
+(define_expand "maskstore"
   [(set (match_operand:V48_AVX2 0 "memory_operand")
(unspec:V48_AVX2
  [(match_operand: 2 "register_operand")
@@ -18170,6 +18186,22 @@
  UNSPEC_MASKMOV))]
   "TARGET_AVX")
 
+(define_expand "maskstore"
+  [(set (match_operand:V48_AVX512VL 0 "memory_operand")
+   (vec_merge:V48_AVX512VL
+ (match_operand:V48_AVX512VL 1 "register_operand")
+ (match_dup 0)
+ (match_operand: 2 "register_operand")))]
+  "TARGET_AVX512F")
+
+(define_expand "maskstore"
+  [(set (match_operand:VI12_AVX512VL 0 "memory_operand")
+   (vec_merge:VI12_AVX512VL
+ (match_operand:VI12_AVX512VL 1 "register_operand")
+ (match_dup 0)
+ (match_operand: 2 "register_operand")))]
+  "TARGET_AVX512BW")
+
 (define_insn_and_split "avx__"
   [(set (match_operand:AVX256MODE2P 0 "nonimmediate_operand" "=x,m")
(unspec:AVX256MODE2P


[vec-cmp, patch 1/6] Add optabs for vector comparison

2015-10-08 Thread Ilya Enkovich
Hi,

This series introduces autogeneration of vector comparisons and its support on
the i386 target.  It lets comparison statements be vectorized into vector
comparisons instead of VEC_COND_EXPR, which avoids some restrictions
imposed by boolean patterns.  This series applies on top of the boolean vectors
series [1].

This patch introduces optabs for vector comparison.

[1] https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00215.html

Thanks,
Ilya
--
gcc/

2015-10-08  Ilya Enkovich  <enkovich@gmail.com>

* expr.c (do_store_flag): Use expand_vec_cmp_expr for mask results.
* optabs-query.h (get_vec_cmp_icode): New.
* optabs-tree.c (expand_vec_cmp_expr_p): New.
* optabs-tree.h (expand_vec_cmp_expr_p): New.
* optabs.c (vector_compare_rtx): Add OPNO arg.
(expand_vec_cond_expr): Adjust to vector_compare_rtx change.
(expand_vec_cmp_expr): New.
* optabs.def (vec_cmp_optab): New.
(vec_cmpu_optab): New.
* optabs.h (expand_vec_cmp_expr): New.
* tree-vect-generic.c (expand_vector_comparison): Add vector
comparison optabs check.


diff --git a/gcc/expr.c b/gcc/expr.c
index 0bbfccd..88da8cb 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11025,9 +11025,15 @@ do_store_flag (sepops ops, rtx target, machine_mode 
mode)
   if (TREE_CODE (ops->type) == VECTOR_TYPE)
 {
   tree ifexp = build2 (ops->code, ops->type, arg0, arg1);
-  tree if_true = constant_boolean_node (true, ops->type);
-  tree if_false = constant_boolean_node (false, ops->type);
-  return expand_vec_cond_expr (ops->type, ifexp, if_true, if_false, 
target);
+  if (VECTOR_BOOLEAN_TYPE_P (ops->type))
+   return expand_vec_cmp_expr (ops->type, ifexp, target);
+  else
+   {
+ tree if_true = constant_boolean_node (true, ops->type);
+ tree if_false = constant_boolean_node (false, ops->type);
+ return expand_vec_cond_expr (ops->type, ifexp, if_true,
+  if_false, target);
+   }
 }
 
   /* Get the rtx comparison code to use.  We know that EXP is a comparison
diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h
index 73f2729..81ac362 100644
--- a/gcc/optabs-query.h
+++ b/gcc/optabs-query.h
@@ -74,6 +74,16 @@ trapv_binoptab_p (optab binoptab)
  || binoptab == smulv_optab);
 }
 
+/* Return insn code for a comparison operator with VMODE
+   resulting in MASK_MODE, unsigned if UNS is true.  */
+
+static inline enum insn_code
+get_vec_cmp_icode (machine_mode vmode, machine_mode mask_mode, bool uns)
+{
+  optab tab = uns ? vec_cmpu_optab : vec_cmp_optab;
+  return convert_optab_handler (tab, vmode, mask_mode);
+}
+
 /* Return insn code for a conditional operator with a comparison in
mode CMODE, unsigned if UNS is true, resulting in a value of mode VMODE.  */
 
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index 3b03338..aa863cf 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -320,6 +320,19 @@ supportable_convert_operation (enum tree_code code,
   return false;
 }
 
+/* Return TRUE if appropriate vector insn is available
+   for vector comparison expr with vector type VALUE_TYPE
+   and resulting mask with MASK_TYPE.  */
+
+bool
+expand_vec_cmp_expr_p (tree value_type, tree mask_type)
+{
+  enum insn_code icode = get_vec_cmp_icode (TYPE_MODE (value_type),
+   TYPE_MODE (mask_type),
+   TYPE_UNSIGNED (value_type));
+  return (icode != CODE_FOR_nothing);
+}
+
 /* Return TRUE iff, appropriate vector insns are available
for vector cond expr with vector type VALUE_TYPE and a comparison
with operand vector types in CMP_OP_TYPE.  */
diff --git a/gcc/optabs-tree.h b/gcc/optabs-tree.h
index bf6c9e3..5b966ca 100644
--- a/gcc/optabs-tree.h
+++ b/gcc/optabs-tree.h
@@ -39,6 +39,7 @@ optab optab_for_tree_code (enum tree_code, const_tree, enum 
optab_subtype);
 optab scalar_reduc_to_vector (optab, const_tree);
 bool supportable_convert_operation (enum tree_code, tree, tree, tree *,
enum tree_code *);
+bool expand_vec_cmp_expr_p (tree, tree);
 bool expand_vec_cond_expr_p (tree, tree);
 void init_tree_optimization_optabs (tree);
 
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 8d9d742..ca1a6e7 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -5100,11 +5100,13 @@ get_rtx_code (enum tree_code tcode, bool unsignedp)
 }
 
 /* Return comparison rtx for COND. Use UNSIGNEDP to select signed or
-   unsigned operators. Do not generate compare instruction.  */
+   unsigned operators.  OPNO holds an index of the first comparison
+   operand in insn with code ICODE.  Do not generate compare instruction.  */
 
 static rtx
 vector_compare_rtx (enum tree_code tcode, tree t_op0, tree t_op1,
-   bool unsignedp, enum insn_code icode)
+   bool unsignedp, enum insn_code icode,
+   

[mask-load, patch 1/2] Use boolean predicate for masked loads and store

2015-10-08 Thread Ilya Enkovich
Hi,

This patch replaces the integer mask argument of MASK_LOAD and MASK_STORE calls
with a boolean one.  To allow for the various boolean vector modes a target may
assign, the maskload and maskstore optabs were transformed into convert optabs
that take the mask mode as a second operand.  The patch applies on top of the
boolean vector patch series.
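
To show why a second mode is needed in the lookup, here is a toy model in C.
It is not GCC's optab implementation: every toy_* name, the table layout, the
instruction codes and the mode pairings are assumptions made up for this
sketch.

```c
#include <assert.h>

/* Toy stand-ins for machine modes and instruction codes.  */
enum toy_mode { TOY_V8SI, TOY_V16SI, TOY_HI, TOY_NUM_MODES };
enum { TOY_CODE_FOR_NOTHING = 0 };

/* Handlers are keyed on (data mode, mask mode) rather than on the data
   mode alone.  Static storage starts out as TOY_CODE_FOR_NOTHING.  */
static int toy_maskload_table[TOY_NUM_MODES][TOY_NUM_MODES];

void
toy_set_maskload_handler (enum toy_mode vmode, enum toy_mode mask_mode,
                          int icode)
{
  assert (icode != TOY_CODE_FOR_NOTHING);
  toy_maskload_table[vmode][mask_mode] = icode;
}

int
toy_maskload_handler (enum toy_mode vmode, enum toy_mode mask_mode)
{
  return toy_maskload_table[vmode][mask_mode];
}

/* Analogue of the two-mode can_vec_mask_load_store_p check.  */
int
toy_can_mask_load (enum toy_mode vmode, enum toy_mode mask_mode)
{
  return toy_maskload_handler (vmode, mask_mode) != TOY_CODE_FOR_NOTHING;
}
```

A target could then register one pattern whose mask is an integer vector of
the same shape as the data and another whose mask is a scalar integer, and the
query would distinguish them by mask mode.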

Thanks,
Ilya
--
gcc/

2015-10-08  Ilya Enkovich  <enkovich@gmail.com>

* internal-fn.c (expand_MASK_LOAD): Adjust to maskload optab changes.
(expand_MASK_STORE): Adjust to maskstore optab changes.
* optabs-query.c (can_vec_mask_load_store_p): Add MASK_MODE arg.
 Adjust to maskload, maskstore optab changes.
* optabs-query.h (can_vec_mask_load_store_p): Add MASK_MODE arg.
* optabs.def (maskload_optab): Transform into convert optab.
(maskstore_optab): Likewise.
* tree-if-conv.c (ifcvt_can_use_mask_load_store): Adjust to
can_vec_mask_load_store_p signature change.
(predicate_mem_writes): Use boolean mask.
* tree-vect-stmts.c (vectorizable_mask_load_store): Adjust to
can_vec_mask_load_store_p signature change.  Allow invariant masks.


diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 71f811c..5ea3c0d 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -1885,7 +1885,9 @@ expand_MASK_LOAD (gcall *stmt)
   create_output_operand (&ops[0], target, TYPE_MODE (type));
   create_fixed_operand (&ops[1], mem);
   create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
-  expand_insn (optab_handler (maskload_optab, TYPE_MODE (type)), 3, ops);
+  expand_insn (convert_optab_handler (maskload_optab, TYPE_MODE (type),
+ TYPE_MODE (TREE_TYPE (maskt))),
+  3, ops);
 }
 
 static void
@@ -1908,7 +1910,9 @@ expand_MASK_STORE (gcall *stmt)
   create_fixed_operand (&ops[0], mem);
   create_input_operand (&ops[1], reg, TYPE_MODE (type));
   create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
-  expand_insn (optab_handler (maskstore_optab, TYPE_MODE (type)), 3, ops);
+  expand_insn (convert_optab_handler (maskstore_optab, TYPE_MODE (type),
+ TYPE_MODE (TREE_TYPE (maskt))),
+  3, ops);
 }
 
 static void
diff --git a/gcc/optabs-query.c b/gcc/optabs-query.c
index 254089f..c20597c 100644
--- a/gcc/optabs-query.c
+++ b/gcc/optabs-query.c
@@ -466,7 +466,9 @@ can_mult_highpart_p (machine_mode mode, bool uns_p)
 /* Return true if target supports vector masked load/store for mode.  */
 
 bool
-can_vec_mask_load_store_p (machine_mode mode, bool is_load)
+can_vec_mask_load_store_p (machine_mode mode,
+  machine_mode mask_mode,
+  bool is_load)
 {
   optab op = is_load ? maskload_optab : maskstore_optab;
   machine_mode vmode;
@@ -474,7 +476,7 @@ can_vec_mask_load_store_p (machine_mode mode, bool is_load)
 
   /* If mode is vector mode, check it directly.  */
   if (VECTOR_MODE_P (mode))
-return optab_handler (op, mode) != CODE_FOR_nothing;
+return convert_optab_handler (op, mode, mask_mode) != CODE_FOR_nothing;
 
   /* Otherwise, return true if there is some vector mode with
  the mask load/store supported.  */
@@ -485,7 +487,12 @@ can_vec_mask_load_store_p (machine_mode mode, bool is_load)
   if (!VECTOR_MODE_P (vmode))
 return false;
 
-  if (optab_handler (op, vmode) != CODE_FOR_nothing)
+  mask_mode = targetm.vectorize.get_mask_mode (GET_MODE_NUNITS (vmode),
+  GET_MODE_SIZE (vmode));
+  if (mask_mode == VOIDmode)
+return false;
+
+  if (convert_optab_handler (op, vmode, mask_mode) != CODE_FOR_nothing)
 return true;
 
   vector_sizes = targetm.vectorize.autovectorize_vector_sizes ();
@@ -496,8 +503,10 @@ can_vec_mask_load_store_p (machine_mode mode, bool is_load)
   if (cur <= GET_MODE_SIZE (mode))
continue;
   vmode = mode_for_vector (mode, cur / GET_MODE_SIZE (mode));
+  mask_mode = targetm.vectorize.get_mask_mode (GET_MODE_NUNITS (vmode),
+  cur);
   if (VECTOR_MODE_P (vmode)
- && optab_handler (op, vmode) != CODE_FOR_nothing)
+ && convert_optab_handler (op, vmode, mask_mode) != CODE_FOR_nothing)
return true;
 }
   return false;
diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h
index 81ac362..162d2e9 100644
--- a/gcc/optabs-query.h
+++ b/gcc/optabs-query.h
@@ -140,7 +140,7 @@ enum insn_code find_widening_optab_handler_and_mode (optab, 
machine_mode,
 machine_mode, int,
 machine_mode *);
 int can_mult_highpart_p (machine_mode, bool);
-bool can_vec_mask_load_store_p (machine_mode, bool);
+bool can_vec_mask_load_store_p (machine_mode, machine_mode, bool);
 bool can_compare_and_swap_p (machine_mode, bool);
 bool can_atomic_exchange_p (machine_mode, bool);
 bool lshift_c

[mask-vec_cond, patch 1/2] Support vectorization of VEC_COND_EXPR with no embedded comparison

2015-10-08 Thread Ilya Enkovich
Hi,

This patch allows COND_EXPR with no embedded comparison to be vectorized.  It
applies on top of the vectorized comparison support series.  A new optab,
vcond_mask_optab, is introduced for such statements.  Bool patterns now avoid
a comparison in COND_EXPR when vector comparison is supported by the target.
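
The operation a vcond_mask pattern performs, a select driven by an already
computed mask, can be sketched with the GNU vector extension.  The names v4si
and select_by_mask are illustrative only, and the sketch assumes an integer
vector mask whose lanes are 0 or -1.

```c
#include <assert.h>

/* Four int lanes; requires the GNU vector extension (GCC or Clang).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Per lane, pick IF_TRUE where MASK is -1 and IF_FALSE where it is 0.
   No comparison is embedded here; the mask is taken as given.  */
v4si
select_by_mask (v4si mask, v4si if_true, v4si if_false)
{
  for (int i = 0; i < 4; i++)
    assert (mask[i] == 0 || mask[i] == -1);
  return (mask & if_true) | (~mask & if_false);
}
```

For example, select_by_mask (a < b, a, b) yields the lane-wise minimum of a
and b, with the comparison and the select as two separate steps.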

Thanks,
Ilya
--
gcc/

2015-10-08  Ilya Enkovich  <enkovich@gmail.com>

* optabs-query.h (get_vcond_mask_icode): New.
* optabs-tree.c (expand_vec_cond_expr_p): Use
get_vcond_mask_icode for VEC_COND_EXPR with mask.
* optabs.c (expand_vec_cond_mask_expr): New.
(expand_vec_cond_expr): Use get_vcond_mask_icode
when possible.
* optabs.def (vcond_mask_optab): New.
* tree-vect-patterns.c (vect_recog_bool_pattern): Don't
generate redundant comparison for COND_EXPR.
* tree-vect-stmts.c (vect_is_simple_cond): Allow SSA_NAME
as a condition.
(vectorizable_condition): Likewise.


diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h
index 162d2e9..48bcf7c 100644
--- a/gcc/optabs-query.h
+++ b/gcc/optabs-query.h
@@ -98,6 +98,15 @@ get_vcond_icode (machine_mode vmode, machine_mode cmode, 
bool uns)
   return icode;
 }
 
+/* Return insn code for a conditional operator with a mask mode
+   MMODE resulting in a value of mode VMODE.  */
+
+static inline enum insn_code
+get_vcond_mask_icode (machine_mode vmode, machine_mode mmode)
+{
+  return convert_optab_handler (vcond_mask_optab, vmode, mmode);
+}
+
 /* Enumerates the possible extraction_insn operations.  */
 enum extraction_pattern { EP_insv, EP_extv, EP_extzv };
 
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index aa863cf..d887619 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -342,6 +342,9 @@ expand_vec_cond_expr_p (tree value_type, tree cmp_op_type)
 {
   machine_mode value_mode = TYPE_MODE (value_type);
   machine_mode cmp_op_mode = TYPE_MODE (cmp_op_type);
+  if (VECTOR_BOOLEAN_TYPE_P (cmp_op_type))
+return get_vcond_mask_icode (TYPE_MODE (value_type),
+TYPE_MODE (cmp_op_type)) != CODE_FOR_nothing;
   if (GET_MODE_SIZE (value_mode) != GET_MODE_SIZE (cmp_op_mode)
   || GET_MODE_NUNITS (value_mode) != GET_MODE_NUNITS (cmp_op_mode)
   || get_vcond_icode (TYPE_MODE (value_type), TYPE_MODE (cmp_op_type),
diff --git a/gcc/optabs.c b/gcc/optabs.c
index ca1a6e7..d26b8f8 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -5346,6 +5346,38 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx 
sel, rtx target)
   return tmp;
 }
 
+/* Generate insns for a VEC_COND_EXPR with mask, given its TYPE and its
+   three operands.  */
+
+rtx
+expand_vec_cond_mask_expr (tree vec_cond_type, tree op0, tree op1, tree op2,
+  rtx target)
+{
+  struct expand_operand ops[4];
+  machine_mode mode = TYPE_MODE (vec_cond_type);
+  machine_mode mask_mode = TYPE_MODE (TREE_TYPE (op0));
+  enum insn_code icode = get_vcond_mask_icode (mode, mask_mode);
+  rtx mask, rtx_op1, rtx_op2;
+
+  if (icode == CODE_FOR_nothing)
+return 0;
+
+  mask = expand_normal (op0);
+  rtx_op1 = expand_normal (op1);
+  rtx_op2 = expand_normal (op2);
+
+  mask = force_reg (GET_MODE (mask), mask);
+  rtx_op1 = force_reg (GET_MODE (rtx_op1), rtx_op1);
+
+  create_output_operand (&ops[0], target, mode);
+  create_input_operand (&ops[1], rtx_op1, mode);
+  create_input_operand (&ops[2], rtx_op2, mode);
+  create_input_operand (&ops[3], mask, mask_mode);
+  expand_insn (icode, 4, ops);
+
+  return ops[0].value;
+}
+
 /* Generate insns for a VEC_COND_EXPR, given its TYPE and its
three operands.  */
 
@@ -5371,12 +5403,21 @@ expand_vec_cond_expr (tree vec_cond_type, tree op0, 
tree op1, tree op2,
 }
   else
 {
-  /* Fake op0 < 0.  */
   gcc_assert (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (op0)));
-  op0a = op0;
-  op0b = build_zero_cst (TREE_TYPE (op0));
-  tcode = LT_EXPR;
-  unsignedp = false;
+  if (get_vcond_mask_icode (mode, TYPE_MODE (TREE_TYPE (op0)))
+ != CODE_FOR_nothing)
+   return expand_vec_cond_mask_expr (vec_cond_type, op0, op1,
+ op2, target);
+  /* Fake op0 < 0.  */
+  else
+   {
+ gcc_assert (GET_MODE_CLASS (TYPE_MODE (TREE_TYPE (op0)))
+ == MODE_VECTOR_INT);
+ op0a = op0;
+ op0b = build_zero_cst (TREE_TYPE (op0));
+ tcode = LT_EXPR;
+ unsignedp = false;
+   }
 }
   cmp_op_mode = TYPE_MODE (TREE_TYPE (op0a));
 
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 9804378..70530a6 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -61,6 +61,7 @@ OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b")
 OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b")
 OPTAB_CD(vcond_optab, "vcond$a$b")
 OPTAB_CD(vcondu_optab, "vcondu$a$b")
+OPTAB_CD(vcond_mask_optab, "vcond_mask_$a$b")
 OPTAB_CD(vec_cmp_optab, "vec_cmp$a$b")

[mask-vec_cond, patch 2/2, i386] Add patterns for vcond_mask_optab

2015-10-08 Thread Ilya Enkovich
Hi,

This patch adds patterns for vcond_mask_optab.  No new expand code is required; 
the existing ix86_expand_sse_movcc is used.
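
As a rough illustration of what a masked select computes, here is a scalar
model in C.  It assumes the non-AVX-512 case, where every mask lane is either
all-ones or all-zeros, and uses the AND/ANDNOT/OR form that a "sequence of
logical operations" suggests.  The name select_lanes is hypothetical; this is
not code from the patch and not the actual ix86_expand_sse_movcc logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: scalar model of a masked vector select.
   select_lanes is a hypothetical name, not a GCC function.
   Each mask lane must be all-ones (-1) or all-zeros (0).  */
void
select_lanes (int32_t *dest, const int32_t *mask,
              const int32_t *op_true, const int32_t *op_false, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      assert (mask[i] == 0 || mask[i] == -1);
      uint32_t m = (uint32_t) mask[i];
      /* dest = (mask & op_true) | (~mask & op_false).  */
      dest[i] = (int32_t) ((m & (uint32_t) op_true[i])
                           | (~m & (uint32_t) op_false[i]));
    }
}
```

With mask lanes { -1, 0, -1, 0 }, the result takes lanes 0 and 2 from op_true
and lanes 1 and 3 from op_false.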

Thanks,
Ilya
--
gcc/ChangeLog:

2015-10-08  Ilya Enkovich  <enkovich@gmail.com>

* config/i386/i386-protos.h (ix86_expand_sse_movcc): New.
* config/i386/i386.c (ix86_expand_sse_movcc): Make public.
Cast mask to FP mode if required.
* config/i386/sse.md (vcond_mask_): New.
(vcond_mask_): New.
(vcond_mask_): New.
(vcond_mask_): New.
(vcond_mask_v2div2di): New.
(vcond_mask_): New.
(vcond_mask_): New.


diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index e22aa57..6a0e437 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -132,6 +132,7 @@ extern bool ix86_expand_vec_perm_const (rtx[]);
 extern bool ix86_expand_mask_vec_cmp (rtx[]);
 extern bool ix86_expand_int_vec_cmp (rtx[]);
 extern bool ix86_expand_fp_vec_cmp (rtx[]);
+extern void ix86_expand_sse_movcc (rtx, rtx, rtx, rtx);
 extern void ix86_expand_sse_unpack (rtx, rtx, bool, bool);
 extern bool ix86_expand_int_addcc (rtx[]);
 extern rtx ix86_expand_call (rtx, rtx, rtx, rtx, rtx, bool);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a8e3538..0619b9a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -21497,7 +21497,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx 
cmp_op0, rtx cmp_op1,
 /* Expand DEST = CMP ? OP_TRUE : OP_FALSE into a sequence of logical
operations.  This is used for both scalar and vector conditional moves.  */
 
-static void
+void
 ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
 {
   machine_mode mode = GET_MODE (dest);
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 48424fc..1e5a455 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -3015,6 +3015,87 @@
   DONE;
 })
 
+(define_expand "vcond_mask_"
+  [(set (match_operand:V48_AVX512VL 0 "register_operand")
+   (vec_merge:V48_AVX512VL
+ (match_operand:V48_AVX512VL 1 "nonimmediate_operand")
+ (match_operand:V48_AVX512VL 2 "vector_move_operand")
+ (match_operand: 3 "register_operand")))]
+  "TARGET_AVX512F")
+
+(define_expand "vcond_mask_"
+  [(set (match_operand:VI12_AVX512VL 0 "register_operand")
+   (vec_merge:VI12_AVX512VL
+ (match_operand:VI12_AVX512VL 1 "nonimmediate_operand")
+ (match_operand:VI12_AVX512VL 2 "vector_move_operand")
+ (match_operand: 3 "register_operand")))]
+  "TARGET_AVX512BW")
+
+(define_expand "vcond_mask_"
+  [(set (match_operand:VI_256 0 "register_operand")
+   (vec_merge:VI_256
+ (match_operand:VI_256 1 "nonimmediate_operand")
+ (match_operand:VI_256 2 "vector_move_operand")
+ (match_operand: 3 "register_operand")))]
+  "TARGET_AVX2"
+{
+  ix86_expand_sse_movcc (operands[0], operands[3],
+operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "vcond_mask_"
+  [(set (match_operand:VI124_128 0 "register_operand")
+   (vec_merge:VI124_128
+ (match_operand:VI124_128 1 "nonimmediate_operand")
+ (match_operand:VI124_128 2 "vector_move_operand")
+ (match_operand: 3 "register_operand")))]
+  "TARGET_SSE2"
+{
+  ix86_expand_sse_movcc (operands[0], operands[3],
+operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "vcond_mask_v2div2di"
+  [(set (match_operand:V2DI 0 "register_operand")
+   (vec_merge:V2DI
+ (match_operand:V2DI 1 "nonimmediate_operand")
+ (match_operand:V2DI 2 "vector_move_operand")
+ (match_operand:V2DI 3 "register_operand")))]
+  "TARGET_SSE4_2"
+{
+  ix86_expand_sse_movcc (operands[0], operands[3],
+operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "vcond_mask_"
+  [(set (match_operand:VF_256 0 "register_operand")
+   (vec_merge:VF_256
+ (match_operand:VF_256 1 "nonimmediate_operand")
+ (match_operand:VF_256 2 "vector_move_operand")
+ (match_operand: 3 "register_operand")))]
+  "TARGET_AVX"
+{
+  ix86_expand_sse_movcc (operands[0], operands[3],
+operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "vcond_mask_"
+  [(set (match_operand:VF_128 0 "register_operand")
+   (vec_merge:VF_128
+ (match_operand:VF_128 1 "nonimmediate_operand")
+ (match_operand:VF_128 2 "vector_move_operand")
+ (match_operand: 3 "register_operand")))]
+  "TARGET_SSE"
+{
+  ix86_expand_sse_movcc (operands[0], operands[3],
+operands[1], operands[2]);
+  DONE;
+})
+
 ;
 ;;
 ;; Parallel floating point logical operations


[vec-cmp, patch 3/6] Vectorize comparison

2015-10-08 Thread Ilya Enkovich
Hi,

This patch supports vectorization of comparison statements, based on the newly 
introduced optabs.
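
For orientation, a loop of roughly the following shape contains the kind of
statement this is about: an assignment whose right-hand side is a comparison.
This is only a sketch; the function name is made up, and whether GCC actually
vectorizes such a loop depends on the target and the options used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative only: each iteration stores the truth value of a
   comparison, i.e. the assignment's right-hand side is a comparison.  */
void
less_than (const int *a, const int *b, bool *out, size_t n)
{
  assert (a != NULL && b != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] < b[i];
}
```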

Thanks,
Ilya
--
gcc/

2015-10-08  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-data-refs.c (vect_get_new_vect_var): Support vect_mask_var.
(vect_create_destination_var): Likewise.
* tree-vect-stmts.c (vectorizable_comparison): New.
(vect_analyze_stmt): Add vectorizable_comparison.
(vect_transform_stmt): Likewise.
* tree-vectorizer.h (enum vect_var_kind): Add vect_mask_var.
(enum stmt_vec_info_type): Add comparison_vec_info_type.
(vectorizable_comparison): New.


diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 3befa38..9edc663 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -3849,6 +3849,9 @@ vect_get_new_vect_var (tree type, enum vect_var_kind 
var_kind, const char *name)
   case vect_scalar_var:
 prefix = "stmp";
 break;
+  case vect_mask_var:
+prefix = "mask";
+break;
   case vect_pointer_var:
 prefix = "vectp";
 break;
@@ -4403,7 +4406,11 @@ vect_create_destination_var (tree scalar_dest, tree 
vectype)
   tree type;
   enum vect_var_kind kind;
 
-  kind = vectype ? vect_simple_var : vect_scalar_var;
+  kind = vectype
+? VECTOR_BOOLEAN_TYPE_P (vectype)
+? vect_mask_var
+: vect_simple_var
+: vect_scalar_var;
   type = vectype ? vectype : TREE_TYPE (scalar_dest);
 
   gcc_assert (TREE_CODE (scalar_dest) == SSA_NAME);
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 8eda8e9..6949c71 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7525,6 +7525,211 @@ vectorizable_condition (gimple *stmt, 
gimple_stmt_iterator *gsi,
   return true;
 }
 
+/* vectorizable_comparison.
+
+   Check if STMT is comparison expression that can be vectorized.
+   If VEC_STMT is also passed, vectorize the STMT: create a vectorized
+   comparison, put it in VEC_STMT, and insert it at GSI.
+
+   Return FALSE if not a vectorizable STMT, TRUE otherwise.  */
+
+bool
+vectorizable_comparison (gimple *stmt, gimple_stmt_iterator *gsi,
+gimple **vec_stmt, tree reduc_def,
+slp_tree slp_node)
+{
+  tree lhs, rhs1, rhs2;
+  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+  tree vectype1 = NULL_TREE, vectype2 = NULL_TREE;
+  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+  tree vec_rhs1 = NULL_TREE, vec_rhs2 = NULL_TREE;
+  tree vec_compare;
+  tree new_temp;
+  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+  tree def;
+  enum vect_def_type dt, dts[4];
+  unsigned nunits;
+  int ncopies;
+  enum tree_code code;
+  stmt_vec_info prev_stmt_info = NULL;
+  int i, j;
+  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
+  vec<tree> vec_oprnds0 = vNULL;
+  vec<tree> vec_oprnds1 = vNULL;
+  tree mask_type;
+  tree mask;
+
+  if (!VECTOR_BOOLEAN_TYPE_P (vectype))
+return false;
+
+  mask_type = vectype;
+  nunits = TYPE_VECTOR_SUBPARTS (vectype);
+
+  if (slp_node || PURE_SLP_STMT (stmt_info))
+ncopies = 1;
+  else
+ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
+
+  gcc_assert (ncopies >= 1);
+  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
+return false;
+
+  if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
+  && !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
+  && reduc_def))
+return false;
+
+  if (STMT_VINFO_LIVE_P (stmt_info))
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"value used after loop.\n");
+  return false;
+}
+
+  if (!is_gimple_assign (stmt))
+return false;
+
+  code = gimple_assign_rhs_code (stmt);
+
+  if (TREE_CODE_CLASS (code) != tcc_comparison)
+return false;
+
+  rhs1 = gimple_assign_rhs1 (stmt);
+  rhs2 = gimple_assign_rhs2 (stmt);
+
+  if (TREE_CODE (rhs1) == SSA_NAME)
+{
+  gimple *rhs1_def_stmt = SSA_NAME_DEF_STMT (rhs1);
+  if (!vect_is_simple_use_1 (rhs1, stmt, loop_vinfo, bb_vinfo,
+&rhs1_def_stmt, &def, &dt, &vectype1))
+   return false;
+}
+  else if (TREE_CODE (rhs1) != INTEGER_CST && TREE_CODE (rhs1) != REAL_CST
+  && TREE_CODE (rhs1) != FIXED_CST)
+return false;
+
+  if (TREE_CODE (rhs2) == SSA_NAME)
+{
+  gimple *rhs2_def_stmt = SSA_NAME_DEF_STMT (rhs2);
+  if (!vect_is_simple_use_1 (rhs2, stmt, loop_vinfo, bb_vinfo,
+&rhs2_def_stmt, &def, &dt, &vectype2))
+   return false;
+}
+  else if (TREE_CODE (rhs2) != INTEGER_CST && TREE_CODE (rhs2) != REAL_CST
+  && TREE_CODE (rhs2) != FIXED_CST)
+return false;
+
+  if (vectype1 && vectype2
+  && TYPE_VECTOR_SUBPARTS (vectype1) != TYPE_VECTOR_SUBPARTS (vectype2))
+return false;
+
+  vectype = vectype1 ? vectype1 : vectype2;
+
+  /* Invariant comparison.  */
+  if

[vec-cmp, patch 5/6] Disable bool patterns when possible

2015-10-08 Thread Ilya Enkovich
Hi,

This patch disables the transformation of boolean computations into integer ones 
when the target supports vector comparison.  The pattern still applies when the 
resulting boolean value has to be converted into an integer, or to avoid a 
COND_EXPR with an SSA_NAME as its condition.
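
To make the two forms concrete, here is a small scalar sketch.  The first
function keeps the comparisons as truth values; the second expresses the same
computation on 0/1 integers selected by the comparisons, which is roughly the
shape the bool pattern rewrites to.  All names are hypothetical and the code
only models the idea; it is not what the vectorizer emits.

```c
#include <assert.h>
#include <stdbool.h>

/* Boolean form: comparison results are combined as truth values.  */
int
both_bool (int a, int b, int c, int d)
{
  bool t1 = a < b;
  bool t2 = c != d;
  bool t3 = t1 & t2;
  return (int) t3;
}

/* Integer form: each comparison selects 1 or 0, and the
   combination is done on integers.  */
int
both_int (int a, int b, int c, int d)
{
  int t1 = a < b ? 1 : 0;
  int t2 = c != d ? 1 : 0;
  return t1 & t2;
}

/* The two forms are expected to agree for every input.  */
void
check_same (int a, int b, int c, int d)
{
  assert (both_bool (a, b, c, d) == both_int (a, b, c, d));
}
```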

Thanks,
Ilya
--
2015-10-08  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-patterns.c (check_bool_pattern): Check fails
if we can vectorize comparison directly.
(search_type_for_mask): New.
(vect_recog_bool_pattern): Support cases when bool pattern
check fails.


diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 830801a..e3be3d1 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -2951,7 +2951,7 @@ check_bool_pattern (tree var, loop_vec_info loop_vinfo, 
bb_vec_info bb_vinfo)
 default:
   if (TREE_CODE_CLASS (rhs_code) == tcc_comparison)
{
- tree vecitype, comp_vectype;
+ tree vecitype, comp_vectype, mask_type;
 
  /* If the comparison can throw, then is_gimple_condexpr will be
 false and we can't make a COND_EXPR/VEC_COND_EXPR out of it.  */
@@ -2962,6 +2962,11 @@ check_bool_pattern (tree var, loop_vec_info loop_vinfo, 
bb_vec_info bb_vinfo)
  if (comp_vectype == NULL_TREE)
return false;
 
+ mask_type = get_mask_type_for_scalar_type (TREE_TYPE (rhs1));
+ if (mask_type
+ && expand_vec_cmp_expr_p (comp_vectype, mask_type))
+   return false;
+
  if (TREE_CODE (TREE_TYPE (rhs1)) != INTEGER_TYPE)
{
  machine_mode mode = TYPE_MODE (TREE_TYPE (rhs1));
@@ -3186,6 +3191,75 @@ adjust_bool_pattern (tree var, tree out_type, tree 
trueval,
 }
 
 
+/* Try to determine a proper type for converting bool VAR
+   into an integer value.  The type is chosen so that
+   conversion has the same number of elements as a mask
+   producer.  */
+
+static tree
+search_type_for_mask (tree var, loop_vec_info loop_vinfo, bb_vec_info bb_vinfo)
+{
+  gimple *def_stmt;
+  enum vect_def_type dt;
+  tree def, rhs1;
+  enum tree_code rhs_code;
+  tree res = NULL;
+
+  if (TREE_CODE (var) != SSA_NAME)
+return NULL;
+
+  if ((TYPE_PRECISION (TREE_TYPE (var)) != 1
+   || !TYPE_UNSIGNED (TREE_TYPE (var)))
+  && TREE_CODE (TREE_TYPE (var)) != BOOLEAN_TYPE)
+return NULL;
+
+  if (!vect_is_simple_use (var, NULL, loop_vinfo, bb_vinfo, &def_stmt, &def,
+  &dt))
+return NULL;
+
+  if (dt != vect_internal_def)
+return NULL;
+
+  if (!is_gimple_assign (def_stmt))
+return NULL;
+
+  rhs_code = gimple_assign_rhs_code (def_stmt);
+  rhs1 = gimple_assign_rhs1 (def_stmt);
+
+  switch (rhs_code)
+{
+case SSA_NAME:
+case BIT_NOT_EXPR:
+CASE_CONVERT:
+  res = search_type_for_mask (rhs1, loop_vinfo, bb_vinfo);
+  break;
+
+case BIT_AND_EXPR:
+case BIT_IOR_EXPR:
+case BIT_XOR_EXPR:
+  if (!(res = search_type_for_mask (rhs1, loop_vinfo, bb_vinfo)))
+   res = search_type_for_mask (gimple_assign_rhs2 (def_stmt),
+   loop_vinfo, bb_vinfo);
+  break;
+
+default:
+  if (TREE_CODE_CLASS (rhs_code) == tcc_comparison)
+   {
+ if (TREE_CODE (TREE_TYPE (rhs1)) != INTEGER_TYPE
+ || !TYPE_UNSIGNED (TREE_TYPE (rhs1)))
+   {
+ machine_mode mode = TYPE_MODE (TREE_TYPE (rhs1));
+ res = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1);
+   }
+ else
+   res = TREE_TYPE (rhs1);
+   }
+}
+
+  return res;
+}
+
+
 /* Function vect_recog_bool_pattern
 
Try to find pattern like following:
@@ -3243,6 +3317,7 @@ vect_recog_bool_pattern (vec<gimple *> *stmts, tree 
*type_in,
   enum tree_code rhs_code;
   tree var, lhs, rhs, vectype;
   stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
+  stmt_vec_info new_stmt_info;
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
   bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_vinfo);
   gimple *pattern_stmt;
@@ -3268,16 +3343,53 @@ vect_recog_bool_pattern (vec<gimple *> *stmts, tree 
*type_in,
   if (vectype == NULL_TREE)
return NULL;
 
-  if (!check_bool_pattern (var, loop_vinfo, bb_vinfo))
-   return NULL;
-
-  rhs = adjust_bool_pattern (var, TREE_TYPE (lhs), NULL_TREE, stmts);
-  lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
-  if (useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)))
-   pattern_stmt = gimple_build_assign (lhs, SSA_NAME, rhs);
+  if (check_bool_pattern (var, loop_vinfo, bb_vinfo))
+   {
+ rhs = adjust_bool_pattern (var, TREE_TYPE (lhs), NULL_TREE, stmts);
+ lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
+ if (useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)))
+   pattern_stmt = gimple_build_assign (lhs, SSA_NAME, rhs);
+ else
+   pattern_stmt
+ = gimple_build_assi

[vec-cmp, patch 6/6, i386] Add i386 support for vector comparison

2015-10-08 Thread Ilya Enkovich
Hi,

This patch adds patterns for the vec_cmp optabs.  The vector comparison expand 
code was moved from the VEC_COND_EXPR expanders into separate functions.  The 
AVX-512 patterns use simpler masked versions.

Thanks,
Ilya
--
gcc/

2015-10-08  Ilya Enkovich  <enkovich@gmail.com>

* config/i386/i386-protos.h (ix86_expand_mask_vec_cmp): New.
(ix86_expand_int_vec_cmp): New.
(ix86_expand_fp_vec_cmp): New.
* config/i386/i386.c (ix86_expand_sse_cmp): Allow NULL for
op_true and op_false.
(ix86_int_cmp_code_to_pcmp_immediate): New.
(ix86_fp_cmp_code_to_pcmp_immediate): New.
(ix86_cmp_code_to_pcmp_immediate): New.
(ix86_expand_mask_vec_cmp): New.
(ix86_expand_fp_vec_cmp): New.
(ix86_expand_int_sse_cmp): New.
(ix86_expand_int_vcond): Use ix86_expand_int_sse_cmp.
(ix86_expand_fp_vcond): Use ix86_expand_sse_cmp.
(ix86_expand_int_vec_cmp): New.
(ix86_get_mask_mode): New.
(TARGET_VECTORIZE_GET_MASK_MODE): New.
* config/i386/sse.md (avx512fmaskmodelower): New.
(vec_cmp): New.
(vec_cmp): New.
(vec_cmpv2div2di): New.
(vec_cmpu): New.
(vec_cmpu): New.
(vec_cmpuv2div2di): New.


diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 6a17ef4..e22aa57 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -129,6 +129,9 @@ extern bool ix86_expand_fp_vcond (rtx[]);
 extern bool ix86_expand_int_vcond (rtx[]);
 extern void ix86_expand_vec_perm (rtx[]);
 extern bool ix86_expand_vec_perm_const (rtx[]);
+extern bool ix86_expand_mask_vec_cmp (rtx[]);
+extern bool ix86_expand_int_vec_cmp (rtx[]);
+extern bool ix86_expand_fp_vec_cmp (rtx[]);
 extern void ix86_expand_sse_unpack (rtx, rtx, bool, bool);
 extern bool ix86_expand_int_addcc (rtx[]);
 extern rtx ix86_expand_call (rtx, rtx, rtx, rtx, rtx, bool);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8a26f68..a8e3538 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -21446,8 +21446,8 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx 
cmp_op0, rtx cmp_op1,
 cmp_op1 = force_reg (cmp_ops_mode, cmp_op1);
 
   if (optimize
-  || reg_overlap_mentioned_p (dest, op_true)
-  || reg_overlap_mentioned_p (dest, op_false))
+  || (op_true && reg_overlap_mentioned_p (dest, op_true))
+  || (op_false && reg_overlap_mentioned_p (dest, op_false)))
 dest = gen_reg_rtx (maskcmp ? cmp_mode : mode);
 
   /* Compare patterns for int modes are unspec in AVX512F only.  */
@@ -21508,6 +21508,14 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, 
rtx op_false)
 
   rtx t2, t3, x;
 
+  /* If we have an integer mask and FP value then we need
+ to cast mask to FP mode.  */
+  if (mode != cmpmode && VECTOR_MODE_P (cmpmode))
+{
+  cmp = force_reg (cmpmode, cmp);
+  cmp = gen_rtx_SUBREG (mode, cmp, 0);
+}
+
   if (vector_all_ones_operand (op_true, mode)
   && rtx_equal_p (op_false, CONST0_RTX (mode))
   && !maskcmp)
@@ -21719,34 +21727,127 @@ ix86_expand_fp_movcc (rtx operands[])
   return true;
 }
 
-/* Expand a floating-point vector conditional move; a vcond operation
-   rather than a movcc operation.  */
+/* Helper for ix86_cmp_code_to_pcmp_immediate for int modes.  */
+
+static int
+ix86_int_cmp_code_to_pcmp_immediate (enum rtx_code code)
+{
+  switch (code)
+{
+case EQ:
+  return 0;
+case LT:
+case LTU:
+  return 1;
+case LE:
+case LEU:
+  return 2;
+case NE:
+  return 4;
+case GE:
+case GEU:
+  return 5;
+case GT:
+case GTU:
+  return 6;
+default:
+  gcc_unreachable ();
+}
+}
+
+/* Helper for ix86_cmp_code_to_pcmp_immediate for fp modes.  */
+
+static int
+ix86_fp_cmp_code_to_pcmp_immediate (enum rtx_code code)
+{
+  switch (code)
+{
+case EQ:
+  return 0x08;
+case NE:
+  return 0x04;
+case GT:
+  return 0x16;
+case LE:
+  return 0x1a;
+case GE:
+  return 0x15;
+case LT:
+  return 0x19;
+default:
+  gcc_unreachable ();
+}
+}
+
+/* Return immediate value to be used in UNSPEC_PCMP
+   for comparison CODE in MODE.  */
+
+static int
+ix86_cmp_code_to_pcmp_immediate (enum rtx_code code, machine_mode mode)
+{
+  if (FLOAT_MODE_P (mode))
+return ix86_fp_cmp_code_to_pcmp_immediate (code);
+  return ix86_int_cmp_code_to_pcmp_immediate (code);
+}
+
+/* Expand AVX-512 vector comparison.  */
 
 bool
-ix86_expand_fp_vcond (rtx operands[])
+ix86_expand_mask_vec_cmp (rtx operands[])
 {
-  enum rtx_code code = GET_CODE (operands[3]);
+  machine_mode mask_mode = GET_MODE (operands[0]);
+  machine_mode cmp_mode = GET_MODE (operands[2]);
+  enum rtx_code code = GET_CODE (operands[1]);
+  rtx imm = GEN_INT (ix86_cmp_code_to_pcmp_immediate (code, cmp_mode));
+  int unspec_code;
+  rtx unspec;
+
+  switch (code)

[PATCH] Fix ICE for SIMD clones usage in LTO

2015-10-05 Thread Ilya Enkovich
Hi,

When a SIMD clone is created, the original function may be defined in another 
partition.  In this case the SIMD clone also has to have the in_other_partition 
flag set.  Currently it does not, and we get an ICE.  This patch fixes it.  
Bootstrapped and regtested for x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2015-10-05  Ilya Enkovich  <enkovich@gmail.com>

* omp-low.c (simd_clone_create): Set in_other_partition
for created clones.

gcc/testsuite/

2015-10-05  Ilya Enkovich  <enkovich@gmail.com>

* gcc.dg/lto/simd-function_0.c: New test.


diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index cdcf9d6..8d25784 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -12948,6 +12948,8 @@ simd_clone_create (struct cgraph_node *old_node)
   DECL_STATIC_CONSTRUCTOR (new_decl) = 0;
   DECL_STATIC_DESTRUCTOR (new_decl) = 0;
   new_node = old_node->create_version_clone (new_decl, vNULL, NULL);
+  if (old_node->in_other_partition)
+   new_node->in_other_partition = 1;
   symtab->call_cgraph_insertion_hooks (new_node);
 }
   if (new_node == NULL)
diff --git a/gcc/testsuite/gcc.dg/lto/simd-function_0.c 
b/gcc/testsuite/gcc.dg/lto/simd-function_0.c
new file mode 100755
index 000..cda31aa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/simd-function_0.c
@@ -0,0 +1,34 @@
+/* { dg-lto-do link } */
+/* { dg-require-effective-target avx2 } */
+/* { dg-lto-options { { -fopenmp-simd -O3 -ffast-math -mavx2 -flto 
-flto-partition=max } } } */
+
+#define SIZE 4096
+float x[SIZE];
+
+
+#pragma omp declare simd
+float
+__attribute__ ((noinline))
+my_mul (float x, float y) {
+  return x * y;
+}
+
+__attribute__ ((noinline))
+int foo ()
+{
+  int i = 0;
+#pragma omp simd safelen (16)
+  for (i = 0; i < SIZE; i++)
+x[i] = my_mul ((float)i, 9932.3323);
+  return (int)x[0];
+}
+
+int main ()
+{
+  int i = 0;
+  for (i = 0; i < SIZE; i++)
+x[i] = my_mul ((float) i, 9932.3323);
+  foo ();
+  return (int)x[0];
+}
+


[Boolean Vector, patch 2/5] Change vector comparison IL requirement

2015-10-02 Thread Ilya Enkovich
Hi,

This patch changes vector comparison to require a boolean vector result type.

Thanks,
Ilya
--
gcc/

2015-10-02  Ilya Enkovich  <enkovich@gmail.com>

* tree-cfg.c (verify_gimple_comparison): Require boolean
vector type for vector comparison.
(verify_gimple_assign_ternary): Likewise.


diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 807d96f..c3dcced 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3464,10 +3464,10 @@ verify_gimple_comparison (tree type, tree op0, tree op1)
   return true;
 }
 }
-  /* Or an integer vector type with the same size and element count
+  /* Or a boolean vector type with the same element count
  as the comparison operand types.  */
   else if (TREE_CODE (type) == VECTOR_TYPE
-  && TREE_CODE (TREE_TYPE (type)) == INTEGER_TYPE)
+  && TREE_CODE (TREE_TYPE (type)) == BOOLEAN_TYPE)
 {
   if (TREE_CODE (op0_type) != VECTOR_TYPE
  || TREE_CODE (op1_type) != VECTOR_TYPE)
@@ -3478,12 +3478,7 @@ verify_gimple_comparison (tree type, tree op0, tree op1)
   return true;
 }
 
-  if (TYPE_VECTOR_SUBPARTS (type) != TYPE_VECTOR_SUBPARTS (op0_type)
- || (GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (type)))
- != GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op0_type))))
- /* The result of a vector comparison is of signed
-integral type.  */
- || TYPE_UNSIGNED (TREE_TYPE (type)))
+  if (TYPE_VECTOR_SUBPARTS (type) != TYPE_VECTOR_SUBPARTS (op0_type))
 {
   error ("invalid vector comparison resulting type");
   debug_generic_expr (type);
@@ -3970,15 +3965,13 @@ verify_gimple_assign_ternary (gassign *stmt)
   break;
 
 case VEC_COND_EXPR:
-  if (!VECTOR_INTEGER_TYPE_P (rhs1_type)
- || TYPE_SIGN (rhs1_type) != SIGNED
- || TYPE_SIZE (rhs1_type) != TYPE_SIZE (lhs_type)
+  if (!VECTOR_BOOLEAN_TYPE_P (rhs1_type)
  || TYPE_VECTOR_SUBPARTS (rhs1_type)
 != TYPE_VECTOR_SUBPARTS (lhs_type))
{
- error ("the first argument of a VEC_COND_EXPR must be of a signed "
-"integral vector type of the same size and number of "
-"elements as the result");
+ error ("the first argument of a VEC_COND_EXPR must be of a "
+"boolean vector type of the same number of elements "
+"as the result");
  debug_generic_expr (lhs_type);
  debug_generic_expr (rhs1_type);
  return true;


[Boolean Vector, patch 4/5] Use boolean vectors in VEC_COND_EXPR

2015-10-02 Thread Ilya Enkovich
Hi,

This patch forces the use of boolean vectors in VEC_COND_EXPRs generated by the 
vectorizer.  The VEC_COND_EXPR expander is fixed accordingly.
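
The "fake op0 < 0" comparison in the expander relies on a simple property:
if every mask lane is all-ones or all-zeros, then, read as a signed integer,
a lane is negative exactly when it is set.  A scalar sketch of that idea
follows; select_by_sign is a hypothetical name and the code is only a model,
not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: select lanes by testing the sign of the mask.
   A lane of -1 (all bits set) is negative, a lane of 0 is not.  */
void
select_by_sign (int32_t *dest, const int32_t *mask,
                const int32_t *op1, const int32_t *op2, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      assert (mask[i] == 0 || mask[i] == -1);
      dest[i] = mask[i] < 0 ? op1[i] : op2[i];
    }
}
```

This is also why the comparison has to be treated as signed: as an unsigned
value no lane would ever be less than zero.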

Thanks,
Ilya
--
gcc/

2015-10-02  Ilya Enkovich  <enkovich@gmail.com>

* optabs.c (expand_vec_cond_expr): Accept boolean vector as
condition operand.
* tree-vect-stmts.c (vectorizable_condition): Use boolean
vector type for vector comparison.


diff --git a/gcc/optabs.c b/gcc/optabs.c
index c49d66b..8d9d742 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -5365,16 +5365,17 @@ expand_vec_cond_expr (tree vec_cond_type, tree op0, 
tree op1, tree op2,
   op0a = TREE_OPERAND (op0, 0);
   op0b = TREE_OPERAND (op0, 1);
   tcode = TREE_CODE (op0);
+  unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a));
 }
   else
 {
   /* Fake op0 < 0.  */
-  gcc_assert (!TYPE_UNSIGNED (TREE_TYPE (op0)));
+  gcc_assert (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (op0)));
   op0a = op0;
   op0b = build_zero_cst (TREE_TYPE (op0));
   tcode = LT_EXPR;
+  unsignedp = false;
 }
-  unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a));
   cmp_op_mode = TYPE_MODE (TREE_TYPE (op0a));
 
 
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 2ff2827..e93f5ef 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7384,10 +7384,7 @@ vectorizable_condition (gimple *stmt, 
gimple_stmt_iterator *gsi,
   && TREE_CODE (else_clause) != FIXED_CST)
 return false;
 
-  unsigned int prec = GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (vectype)));
-  /* The result of a vector comparison should be signed type.  */
-  tree cmp_type = build_nonstandard_integer_type (prec, 0);
-  vec_cmp_type = get_same_sized_vectype (cmp_type, vectype);
+  vec_cmp_type = build_same_sized_truth_vector_type (comp_vectype);
   if (vec_cmp_type == NULL_TREE)
 return false;
 


[Boolean Vector, patch 1/5] Introduce boolean vector to be used as a vector comparison type

2015-10-02 Thread Ilya Enkovich
Hi,

This patch starts the first series to introduce vec<bool> as a vector 
comparison type.  The series introduces the new vec<bool> type and forces its 
use for all vector comparisons.  It does not introduce any new vectorization 
features.  I split it into five small patches but will commit them as a single 
chunk.  The patch series was bootstrapped and tested on 
x86_64-unknown-linux-gnu.

The first patch introduces a target hook and functions to produce the new 
vector type.
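
To illustrate why the mask mode is left to the target, the sketch below
contrasts two possible mask layouts for a vector of a given byte length and
lane count: a mask as wide as the data vector, and a mask with one bit per
lane.  The helper and its third parameter are hypothetical; the hook itself
takes only the lane count and the vector length, and this is not the default
implementation.

```c
#include <assert.h>

/* Illustrative only: size in bytes of a mask for a vector of LENGTH
   bytes with NUNITS lanes.  BIT_PER_LANE is a hypothetical switch
   standing in for the target's choice of mask layout.  */
unsigned
mask_size_bytes (unsigned nunits, unsigned length, int bit_per_lane)
{
  assert (nunits > 0 && length % nunits == 0);
  if (bit_per_lane)
    return (nunits + 7) / 8;   /* one bit per lane, rounded up to bytes */
  return length;               /* one full-width lane per data lane */
}
```

For a 64-byte vector with 16 lanes this gives 2 bytes in the bit-per-lane
layout and 64 bytes in the full-width layout, so several boolean elements can
share a single storage unit in the first case.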

Thanks,
Ilya
--
2015-10-02  Ilya Enkovich  <enkovich@gmail.com>

* doc/tm.texi: Regenerated.
* doc/tm.texi.in (TARGET_VECTORIZE_GET_MASK_MODE): New.
* stor-layout.c (layout_type): Use mode to get vector mask size.
* target.def (get_mask_mode): New.
* targhooks.c (default_get_mask_mode): New.
* targhooks.h (default_get_mask_mode): New.
* gcc/tree-vect-stmts.c (get_same_sized_vectype): Add special case
for boolean vector.
* tree.c (MAX_BOOL_CACHED_PREC): New.
(nonstandard_boolean_type_cache): New.
(build_nonstandard_boolean_type): New.
(make_vector_type): Vector mask has no canonical type.
(build_truth_vector_type): New.
(build_same_sized_truth_vector_type): New.
(truth_type_for): Support vector masks.
* tree.h (VECTOR_BOOLEAN_TYPE_P): New.
(build_truth_vector_type): New.
(build_same_sized_truth_vector_type): New.
(build_nonstandard_boolean_type): New.


diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index eb495a8..098213e 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5688,6 +5688,11 @@ mode returned by 
@code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE}.
 The default is zero which means to not iterate over other vector sizes.
 @end deftypefn
 
+@deftypefn {Target Hook} machine_mode TARGET_VECTORIZE_GET_MASK_MODE (unsigned 
@var{nunits}, unsigned @var{length})
+This hook returns mode to be used for a mask to be used for a vector
+of specified @var{length} with @var{nunits} elements.
+@end deftypefn
+
 @deftypefn {Target Hook} {void *} TARGET_VECTORIZE_INIT_COST (struct loop 
*@var{loop_info})
 This hook should initialize target-specific data structures in preparation for 
modeling the costs of vectorizing a loop or basic block.  The default allocates 
three unsigned integers for accumulating costs for the prologue, body, and 
epilogue of the loop or basic block.  If @var{loop_info} is non-NULL, it 
identifies the loop being vectorized; otherwise a single block is being 
vectorized.
 @end deftypefn
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 92835c1..92cfa1d 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4225,6 +4225,8 @@ address;  but often a machine-dependent strategy can 
generate better code.
 
 @hook TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES
 
+@hook TARGET_VECTORIZE_GET_MASK_MODE
+
 @hook TARGET_VECTORIZE_INIT_COST
 
 @hook TARGET_VECTORIZE_ADD_STMT_COST
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 938e54b..58ecd7b 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -2184,10 +2184,16 @@ layout_type (tree type)
 
TYPE_SATURATING (type) = TYPE_SATURATING (TREE_TYPE (type));
 TYPE_UNSIGNED (type) = TYPE_UNSIGNED (TREE_TYPE (type));
-   TYPE_SIZE_UNIT (type) = int_const_binop (MULT_EXPR,
-TYPE_SIZE_UNIT (innertype),
-size_int (nunits));
-   TYPE_SIZE (type) = int_const_binop (MULT_EXPR, TYPE_SIZE (innertype),
+   /* Several boolean vector elements may fit in a single unit.  */
+   if (VECTOR_BOOLEAN_TYPE_P (type))
+ TYPE_SIZE_UNIT (type)
+   = size_int (GET_MODE_SIZE (type->type_common.mode));
+   else
+ TYPE_SIZE_UNIT (type) = int_const_binop (MULT_EXPR,
+  TYPE_SIZE_UNIT (innertype),
+  size_int (nunits));
+   TYPE_SIZE (type) = int_const_binop (MULT_EXPR,
+   TYPE_SIZE (innertype),
bitsize_int (nunits));
 
/* For vector types, we do not default to the mode's alignment.
diff --git a/gcc/target.def b/gcc/target.def
index f330709..b96fd51 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1789,6 +1789,15 @@ The default is zero which means to not iterate over 
other vector sizes.",
  (void),
  default_autovectorize_vector_sizes)
 
+/* Function to get a target mode for a vector mask.  */
+DEFHOOK
+(get_mask_mode,
+ "This hook returns mode to be used for a mask to be used for a vector\n\
+of specified @var{length} with @var{nunits} elements.",
+ machine_mode,
+ (unsigned nunits, unsigned length),
+ default_get_mask_mode)
+
 /* Target builtin that implements vector gather operation.  */
 DEFHOOK
 (builtin_gather,
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 7238c8f..

[Boolean Vector, patch 3/5] Use boolean vector in C/C++ FE

2015-10-02 Thread Ilya Enkovich
Hi,

This patch makes the C/C++ front ends use a boolean vector as the result type 
of a vector comparison.  As a result, a vector comparison in source code is now 
parsed into a VEC_COND_EXPR, which required a testcase fix-up.
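
At the source level the visible behaviour should stay the same: with the GNU
vector extension, comparing two vectors yields an integer vector whose lanes
are -1 where the comparison holds and 0 where it does not, which matches a
select between a minus-one vector and a zero vector.  A small sketch, with
made-up function names and relying on the vector_size attribute extension:

```c
#include <assert.h>

/* Four 32-bit integer lanes, using the GNU vector extension.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Lane-wise comparison: each result lane is -1 if a < b in that
   lane and 0 otherwise.  */
v4si
lanes_less (v4si a, v4si b)
{
  return a < b;
}

/* Return nonzero if lane I of comparison result M is "true".  */
int
lane_is_true (v4si m, int i)
{
  assert (i >= 0 && i < 4);
  return m[i] == -1;
}
```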

Thanks,
Ilya
--
gcc/c

2015-10-02  Ilya Enkovich  <enkovich@gmail.com>

* c-typeck.c (build_conditional_expr): Use boolean vector
type for vector comparison.
(build_vec_cmp): New.
(build_binary_op): Use build_vec_cmp for comparison.

gcc/cp

2015-10-02  Ilya Enkovich  <enkovich@gmail.com>

* call.c (build_conditional_expr_1): Use boolean vector
type for vector comparison.
* typeck.c (build_vec_cmp): New.
(cp_build_binary_op): Use build_vec_cmp for comparison.

gcc/testsuite/

2015-10-02  Ilya Enkovich  <enkovich@gmail.com>

* g++.dg/ext/vector22.C: Allow VEC_COND_EXPR.


diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 3b26231..3f64d76 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -4771,6 +4771,18 @@ build_conditional_expr (location_t colon_loc, tree 
ifexp, bool ifexp_bcp,
   && TREE_CODE (orig_op2) == INTEGER_CST
   && !TREE_OVERFLOW (orig_op2)));
 }
+
+  /* Need to convert condition operand into a vector mask.  */
+  if (VECTOR_TYPE_P (TREE_TYPE (ifexp)))
+{
+  tree vectype = TREE_TYPE (ifexp);
+  tree elem_type = TREE_TYPE (vectype);
+  tree zero = build_int_cst (elem_type, 0);
+  tree zero_vec = build_vector_from_val (vectype, zero);
+  tree cmp_type = build_same_sized_truth_vector_type (vectype);
+  ifexp = build2 (NE_EXPR, cmp_type, ifexp, zero_vec);
+}
+
   if (int_const || (ifexp_bcp && TREE_CODE (ifexp) == INTEGER_CST))
 ret = fold_build3_loc (colon_loc, COND_EXPR, result_type, ifexp, op1, op2);
   else
@@ -10220,6 +10232,19 @@ push_cleanup (tree decl, tree cleanup, bool eh_only)
   STATEMENT_LIST_STMT_EXPR (list) = stmt_expr;
 }
 
+/* Build a vector comparison using VEC_COND_EXPR.  */
+
+static tree
+build_vec_cmp (tree_code code, tree type,
+  tree arg0, tree arg1)
+{
+  tree zero_vec = build_zero_cst (type);
+  tree minus_one_vec = build_minus_one_cst (type);
+  tree cmp_type = build_same_sized_truth_vector_type (type);
+  tree cmp = build2 (code, cmp_type, arg0, arg1);
+  return build3 (VEC_COND_EXPR, type, cmp, minus_one_vec, zero_vec);
+}
+
 /* Build a binary-operation expression without default conversions.
CODE is the kind of expression to build.
LOCATION is the operator's location.
@@ -10786,7 +10811,8 @@ build_binary_op (location_t location, enum tree_code 
code,
   result_type = build_opaque_vector_type (intt,
  TYPE_VECTOR_SUBPARTS (type0));
   converted = 1;
-  break;
+ ret = build_vec_cmp (resultcode, result_type, op0, op1);
+  goto return_build_binary_op;
 }
   if (FLOAT_TYPE_P (type0) || FLOAT_TYPE_P (type1))
warning_at (location,
@@ -10938,7 +10964,8 @@ build_binary_op (location_t location, enum tree_code 
code,
   result_type = build_opaque_vector_type (intt,
  TYPE_VECTOR_SUBPARTS (type0));
   converted = 1;
-  break;
+ ret = build_vec_cmp (resultcode, result_type, op0, op1);
+  goto return_build_binary_op;
 }
   build_type = integer_type_node;
   if ((code0 == INTEGER_TYPE || code0 == REAL_TYPE
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 367d42b..0488b82 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -4615,6 +4615,15 @@ build_conditional_expr_1 (location_t loc, tree arg1, 
tree arg2, tree arg3,
 
   if (VECTOR_INTEGER_TYPE_P (TREE_TYPE (arg1)))
 {
+  /* If arg1 is another cond_expr choosing between -1 and 0,
+then we can use its comparison.  It may help to avoid
+additional comparison, produce more accurate diagnostics
+and enables folding.  */
+  if (TREE_CODE (arg1) == VEC_COND_EXPR
+ && integer_minus_onep (TREE_OPERAND (arg1, 1))
+ && integer_zerop (TREE_OPERAND (arg1, 2)))
+   arg1 = TREE_OPERAND (arg1, 0);
+
   arg1 = force_rvalue (arg1, complain);
   arg2 = force_rvalue (arg2, complain);
   arg3 = force_rvalue (arg3, complain);
@@ -4727,8 +4736,10 @@ build_conditional_expr_1 (location_t loc, tree arg1, 
tree arg2, tree arg3,
}
 
   if (!COMPARISON_CLASS_P (arg1))
-   arg1 = cp_build_binary_op (loc, NE_EXPR, arg1,
-  build_zero_cst (arg1_type), complain);
+   {
+ tree cmp_type = build_same_sized_truth_vector_type (arg1_type);
+ arg1 = build2 (NE_EXPR, cmp_type, arg1, build_zero_cst (arg1_type));
+   }
   return fold_build3 (VEC_COND_EXPR, arg2_type, arg1, arg2, arg3);
 }
 
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 482e42c..96b1683 100

[Boolean Vector, patch 5/5] Support boolean vectors in vector lowering

2015-10-02 Thread Ilya Enkovich
Hi,

This patch supports boolean vectors in vector lowering.  The main change is to
lower a vector comparison into element-wise comparisons rather than cond_exprs.
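
A rough scalar model of the difference, assuming 32-bit elements (this is plain C for illustration, not GIMPLE; both function names are hypothetical).  The first function mirrors the old lowering, where each element goes through a conditional producing -1 or 0; the second mirrors the new one, where each element is just the boolean result of the comparison:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Old style: each element comparison feeds a conditional that
   produces an all-ones or all-zeros integer.  */
void
lower_cmp_as_cond (const int32_t *a, const int32_t *b,
                   int32_t *mask, size_t n)
{
  for (size_t i = 0; i < n; i++)
    mask[i] = (a[i] < b[i]) ? -1 : 0;
}

/* New style: each element comparison directly produces a boolean
   element of the mask.  */
void
lower_cmp_as_bool (const int32_t *a, const int32_t *b,
                   bool *mask, size_t n)
{
  for (size_t i = 0; i < n; i++)
    mask[i] = a[i] < b[i];
}
```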

Thanks,
Ilya
--
2015-10-02  Ilya Enkovich  <enkovich@gmail.com>

* tree-vect-generic.c (elem_op_func): Add new operand to hold
vector type.
(do_unop): Adjust to modified function type.
(do_binop): Likewise.
(do_plus_minus): Likewise.
(do_negate): Likewise.
(expand_vector_piecewise): Likewise.
(do_cond): Likewise.
(do_compare): Use comparison instead of condition.
(expand_vector_divmod): Use boolean vector type for comparison.
(expand_vector_operations_1): Skip scalar mask operations.


diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index dad38a2..a20b9af 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -105,14 +105,27 @@ build_word_mode_vector_type (int nunits)
 }
 
 typedef tree (*elem_op_func) (gimple_stmt_iterator *,
- tree, tree, tree, tree, tree, enum tree_code);
+ tree, tree, tree, tree, tree, enum tree_code,
+ tree);
 
 static inline tree
 tree_vec_extract (gimple_stmt_iterator *gsi, tree type,
  tree t, tree bitsize, tree bitpos)
 {
   if (bitpos)
-return gimplify_build3 (gsi, BIT_FIELD_REF, type, t, bitsize, bitpos);
+{
+  if (TREE_CODE (type) == BOOLEAN_TYPE)
+   {
+ tree itype
+   = build_nonstandard_integer_type (tree_to_uhwi (bitsize), 0);
+ tree field = gimplify_build3 (gsi, BIT_FIELD_REF, itype, t,
+   bitsize, bitpos);
+ return gimplify_build2 (gsi, NE_EXPR, type, field,
+ build_zero_cst (itype));
+   }
+  else
+   return gimplify_build3 (gsi, BIT_FIELD_REF, type, t, bitsize, bitpos);
+}
   else
 return gimplify_build1 (gsi, VIEW_CONVERT_EXPR, type, t);
 }
@@ -120,7 +133,7 @@ tree_vec_extract (gimple_stmt_iterator *gsi, tree type,
 static tree
 do_unop (gimple_stmt_iterator *gsi, tree inner_type, tree a,
 tree b ATTRIBUTE_UNUSED, tree bitpos, tree bitsize,
-enum tree_code code)
+enum tree_code code, tree type ATTRIBUTE_UNUSED)
 {
   a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
   return gimplify_build1 (gsi, code, inner_type, a);
@@ -128,7 +141,8 @@ do_unop (gimple_stmt_iterator *gsi, tree inner_type, tree a,
 
 static tree
 do_binop (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b,
- tree bitpos, tree bitsize, enum tree_code code)
+ tree bitpos, tree bitsize, enum tree_code code,
+ tree type ATTRIBUTE_UNUSED)
 {
   if (TREE_CODE (TREE_TYPE (a)) == VECTOR_TYPE)
 a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
@@ -145,20 +159,12 @@ do_binop (gimple_stmt_iterator *gsi, tree inner_type, 
tree a, tree b,
size equal to the size of INNER_TYPE.  */
 static tree
 do_compare (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b,
- tree bitpos, tree bitsize, enum tree_code code)
+   tree bitpos, tree bitsize, enum tree_code code, tree type)
 {
-  tree comp_type;
-
   a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
   b = tree_vec_extract (gsi, inner_type, b, bitsize, bitpos);
 
-  comp_type = build_nonstandard_integer_type
- (GET_MODE_BITSIZE (TYPE_MODE (inner_type)), 0);
-
-  return gimplify_build3 (gsi, COND_EXPR, comp_type,
- fold_build2 (code, boolean_type_node, a, b),
- build_int_cst (comp_type, -1),
- build_int_cst (comp_type, 0));
+  return gimplify_build2 (gsi, code, TREE_TYPE (type), a, b);
 }
 
 /* Expand vector addition to scalars.  This does bit twiddling
@@ -177,7 +183,7 @@ do_compare (gimple_stmt_iterator *gsi, tree inner_type, 
tree a, tree b,
 static tree
 do_plus_minus (gimple_stmt_iterator *gsi, tree word_type, tree a, tree b,
   tree bitpos ATTRIBUTE_UNUSED, tree bitsize ATTRIBUTE_UNUSED,
-  enum tree_code code)
+  enum tree_code code, tree type ATTRIBUTE_UNUSED)
 {
   tree inner_type = TREE_TYPE (TREE_TYPE (a));
   unsigned HOST_WIDE_INT max;
@@ -209,7 +215,8 @@ static tree
 do_negate (gimple_stmt_iterator *gsi, tree word_type, tree b,
   tree unused ATTRIBUTE_UNUSED, tree bitpos ATTRIBUTE_UNUSED,
   tree bitsize ATTRIBUTE_UNUSED,
-  enum tree_code code ATTRIBUTE_UNUSED)
+  enum tree_code code ATTRIBUTE_UNUSED,
+  tree type ATTRIBUTE_UNUSED)
 {
   tree inner_type = TREE_TYPE (TREE_TYPE (b));
   HOST_WIDE_INT max;
@@ -255,7 +262,7 @@ expand_vector_piecewise (gimple_stmt_iterator *gsi, 
elem_op_func f,
   for (i = 0; i < nunits;
i += delta, index = int_const_binop (PLUS_EXPR, index, part_width))
 {
-  tree result = f (gsi, inner_type, a, b, index, part_width, code);
+ 

Re: [PATCH, testsuite]: Fix gcc.target/i386/pr65105-1.c test

2015-10-01 Thread Ilya Enkovich
2015-10-01 13:12 GMT+03:00 Uros Bizjak :
> Hello!
>
> Attached patch fixes gcc.target/i386/pr65105-1.c:

Thanks!
Ilya

>
> a) As a runtime SSE2 test, we have to check for target SSE2 support
> and use proper test infrastructure.
>
> b) A runtime test can't check output assembly without -save-temps.
>
> The patch also removes another misuse of -save-temps in the gcc.target/i386 directory.
>
> The patch solves:
>
> UNRESOLVED: gcc.target/i386/pr65105-1.c scan-assembler por
> UNRESOLVED: gcc.target/i386/pr65105-1.c scan-assembler pand
>
> 2015-10-01  Uros Bizjak  
>
> * gcc.target/i386/pr65105-1.c: Require sse2 effective target.
> (main): Rename to sse2_test.  Abort if count != 5.
> (dg-options): Add -save-temps.  Use "-msse2 -mtune=slm" instead
> of -march=slm.
> * gcc.target/i386/pr46865-2.c (dg-options): Remove -save-temps.
>
> Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN.
>
> Uros.


Re: [PATCH, PR target/67761] Fix i686-*-* bootstrap comparison failure

2015-09-30 Thread Ilya Enkovich
2015-09-30 9:06 GMT+03:00 Uros Bizjak <ubiz...@gmail.com>:
> Hello!
>
>> My recently introduced STV pass doesn't skip debug instructions, which makes
>> the transformation (mostly the cost computation) depend on debug info.  This
>> causes a bootstrap comparison failure.  This patch fixes it.  Bootstrapped
>> for i686-linux.  Testing for x86_64-unknown-linux-gnu{,m32} is in progress.
>> OK for trunk if it passes?
>
> IMO, it would be also beneficial to bootstrap with slm default
> architecture, so new code paths get some coverage via bootstrap.

I bootstrapped with --with-cpu=slm also.

>
>> gcc/
>>
>> 2015-09-29  Ilya Enkovich  <enkovich@gmail.com>
>>
>> * config/i386/i386.c (scalar_chain::analyze_register_chain): Ignore
>> debug insns.
>> (scalar_chain::convert_reg): Likewise.
>>
>> gcc/testsuite/
>>
>> 2015-09-29  Ilya Enkovich  <enkovich@gmail.com>
>>
>> * gcc.target/i386/pr67761.c: New test.
>
> OK.

Thanks!

Ilya

>
> Thanks,
> Uros.


[PATCH, PR target/67761] Fix i686-*-* bootstrap comparison failure

2015-09-29 Thread Ilya Enkovich
Hi,

My recently introduced STV pass doesn't skip debug instructions, which makes
the transformation (mostly the cost computation) depend on debug info.  This
causes a bootstrap comparison failure.  This patch fixes it.  Bootstrapped for
i686-linux.  Testing for x86_64-unknown-linux-gnu{,m32} is in progress.  OK for
trunk if it passes?
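
To show the idea in isolation, here is a small model in plain C (not the i386.c code; struct insn_model and chain_gain are hypothetical names).  The cost computation skips debug entries, so the result is the same with and without -g:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an instruction in a chain.  */
struct insn_model
{
  bool is_debug;   /* true for a debug-only instruction */
  int gain;        /* estimated gain from converting this insn */
};

/* Sum the conversion gain over a chain, ignoring debug
   instructions so that debug info cannot change the decision.  */
int
chain_gain (const struct insn_model *insns, size_t n)
{
  int gain = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (insns[i].is_debug)
        continue;
      gain += insns[i].gain;
    }
  return gain;
}
```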

Thanks,
Ilya
--
gcc/

2015-09-29  Ilya Enkovich  <enkovich@gmail.com>

* config/i386/i386.c (scalar_chain::analyze_register_chain): Ignore
debug insns.
(scalar_chain::convert_reg): Likewise.

gcc/testsuite/

2015-09-29  Ilya Enkovich  <enkovich@gmail.com>

* gcc.target/i386/pr67761.c: New test.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 6f2380f..7b3ffb0 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2919,6 +2919,10 @@ scalar_chain::analyze_register_chain (bitmap candidates, 
df_ref ref)
   for (chain = DF_REF_CHAIN (ref); chain; chain = chain->next)
 {
   unsigned uid = DF_REF_INSN_UID (chain->ref);
+
+  if (!NONDEBUG_INSN_P (DF_REF_INSN (chain->ref)))
+   continue;
+
   if (!DF_REF_REG_MEM_P (chain->ref))
{
  if (bitmap_bit_p (insns, uid))
@@ -3279,7 +3283,7 @@ scalar_chain::convert_reg (unsigned regno)
bitmap_clear_bit (conv, DF_REF_INSN_UID (ref));
  }
   }
-else
+else if (NONDEBUG_INSN_P (DF_REF_INSN (ref)))
   {
replace_rtx (DF_REF_INSN (ref), reg, scopy);
df_insn_rescan (DF_REF_INSN (ref));
diff --git a/gcc/testsuite/gcc.target/i386/pr67761.c 
b/gcc/testsuite/gcc.target/i386/pr67761.c
new file mode 100644
index 0000000..9b13d58
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr67761.c
@@ -0,0 +1,13 @@
+/* PR target/pr67761 */
+/* { dg-do run { target { ia32 } } } */
+/* { dg-options "-O2 -march=slm -g" } */
+/* { dg-final { scan-assembler "paddq" } } */
+
+void
+test (long long *values, long long val, long long delta)
+{
+  unsigned i;
+
+  for (i = 0; i < 128; i++, val += delta)
+values[i] = val;
+}


Re: [RFC] Try vector<bool> as a new representation for vector masks

2015-09-25 Thread Ilya Enkovich
2015-09-23 16:53 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Wed, Sep 23, 2015 at 3:41 PM, Ilya Enkovich <enkovich@gmail.com> wrote:
>> 2015-09-18 16:40 GMT+03:00 Ilya Enkovich <enkovich@gmail.com>:
>>> 2015-09-18 15:22 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>>
>>>> I was thinking about targets not supporting generating vec<bool>
>>>> (of whatever mode) from a comparison directly but only via
>>>> a COND_EXPR.
>>>
>>> Where may these direct comparisons come from? The vectorizer never
>>> generates unsupported statements. Does it mean we get them from the
>>> gimplifier? Should we then touch optabs in the gimplifier to avoid direct
>>> comparisons? Actually, vect lowering checks whether we are able to make
>>> the comparison, and expand also uses vec_cond to expand a vector
>>> comparison, so we can probably live with them.
>>>
>>>>
>>>> Not sure if we are always talking about the same thing for
>>>> "bool patterns".  I'd remove bool patterns completely, IMHO
>>>> they are not necessary at all.
>>>
>>> I refer to the transformations made by vect_recog_bool_pattern. I don't
>>> see how to remove them completely for targets not supporting comparison
>>> vectorization.
>>>
>>>>
>>>> I think we do allow this, just the vectorizer doesn't expect it.  In the 
>>>> long
>>>> run I want to get rid of the GENERIC exprs in both COND_EXPR and
>>>> VEC_COND_EXPR.  Just didn't have the time to do this...
>>>
>>> That would be nice. As a first step I'd like to support optabs for
>>> VEC_COND_EXPR directly using vec<bool>.
>>>
>>> Thanks,
>>> Ilya
>>>
>>>>
>>>> Richard.
>>
>> Hi Richard,
>>
>> Do you think we have enough confidence that the approach is working and
>> that we may start integrating it into trunk? What would the integration
>> plan be then?
>
> I'm still worried about the vec<bool> vector size vs. element size
> issue (well, somewhat).

Yeah, I hit another problem related to element size in vec lowering.
It uses inner type sizes in expand_vector_piecewise, so the expansion of
bool vectors goes wrong. There were also other places with similar
problems, and therefore I want to try using bools of different sizes
and see how it goes. Having differently sized bools may also be useful
to represent mask pack/unpack in scalar code.

>
> Otherwise the integration plan would be
>
>  1) put in the vector<bool> GIMPLE type support and change the vector
> comparison type IL requirement to be vector<bool>,
> fixing all fallout
>
>  2) get support for directly expanding vector comparisons to
> vector<bool> and make use of that from the x86 backend
>
>  3) make the vectorizer generate the above if supported
>
> I think independent improvements are
>
>  1) remove (most) of the bool patterns from the vectorizer
>
>  2) make VEC_COND_EXPR not have a GENERIC comparison embedded

Sounds great!

Ilya

>
> (same for COND_EXPR?)
>
> Richard.
>
>> Thanks,
>> Ilya


Re: [PATCH, PR67405, committed] Avoid NULL pointer dereference

2015-09-24 Thread Ilya Enkovich
2015-09-15 14:01 GMT+03:00 Ilya Enkovich <enkovich@gmail.com>:
> 2015-09-15 13:32 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>> On Tue, Sep 15, 2015 at 11:28 AM, Ilya Enkovich <enkovich@gmail.com> 
>> wrote:
>>
>> I see.  I wonder why we even call chkp_find_bound_slots if seen_errors().
>
> Even with errors we still gimplify function. Function parameters
> gimplification checks where parameters are passed to possibly copy
> some of them. It triggers ix86_function_arg_advance which uses
> chkp_find_bound_slots to skip required amount of bounds registers.
>
>> I suppose only recursing for COMPLETE_TYPE_P () would work?
>
> Yep, it should work. I'll rework my fix.

It turned out to be wrong. For this test

struct S
{
  S f;
};

void fn1 (S p1) {}

Structure S is considered complete (it has size 8 for some reason) at
fn1 gimplification. Thus, even with the complete-type check, I still hit
this field with error_node instead of a type and NULL in
DECL_FIELD_BIT_OFFSET. Should my current fix be OK then?

Thanks,
Ilya


Re: [RFC, PR target/65105] Use vector instructions for scalar 64bit computations on 32bit target

2015-09-23 Thread Ilya Enkovich
On 14 Sep 17:50, Uros Bizjak wrote:
> 
> +(define_insn_and_split "*zext<mode>_doubleword"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> + (zero_extend:DI (match_operand:SWI24 1 "nonimmediate_operand" "rm")))]
> +  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2"
> +  "#"
> +  "&& reload_completed && GENERAL_REG_P (operands[0])"
> +  [(set (match_dup 0) (zero_extend:SI (match_dup 1)))
> +   (set (match_dup 2) (const_int 0))]
> +  "split_double_mode (DImode, &operands[0], 1, &operands[0], &operands[2]);")
> +
> +(define_insn_and_split "*zextqi_doubleword"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> + (zero_extend:DI (match_operand:QI 1 "nonimmediate_operand" "qm")))]
> +  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2"
> +  "#"
> +  "&& reload_completed && GENERAL_REG_P (operands[0])"
> +  [(set (match_dup 0) (zero_extend:SI (match_dup 1)))
> +   (set (match_dup 2) (const_int 0))]
> +  "split_double_mode (DImode, &operands[0], 1, &operands[0], &operands[2]);")
> +
> 
> Please put the above patterns together with other zero_extend
> patterns. You can also merge these two patterns using SWI124 mode
> iterator with <r> mode attribute as a register constraint. Also, no
> need to check for GENERAL_REG_P after reload, when "r" constraint is
> in effect:
> 
> (define_insn_and_split "*zext<mode>_doubleword"
>   [(set (match_operand:DI 0 "register_operand" "=r")
>  (zero_extend:DI (match_operand:SWI124 1 "nonimmediate_operand" "<r>m")))]
>   "!TARGET_64BIT && TARGET_STV && TARGET_SSE2"
>   "#"
>   "&& reload_completed"
>   [(set (match_dup 0) (zero_extend:SI (match_dup 1)))
>    (set (match_dup 2) (const_int 0))]
>   "split_double_mode (DImode, &operands[0], 1, &operands[0], &operands[2]);")

The register constraint doesn't affect the split, and I need GENERAL_REG_P to
filter out the case of other registers.

I merged the QI and HI cases of zext but made a separate pattern for the SI
case because it doesn't need a zero_extend in the resulting code.  Bootstrapped
and regtested for x86_64-unknown-linux-gnu.
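
As an illustration of what these splits compute on a 32-bit target (a plain C model, not the md patterns themselves; struct di_halves and the function names are made up): a DImode result is held as two SImode halves, the low half gets the (zero-extended) source and the high half is set to zero.  For an SI source the low half is a plain copy, which is why that case needs no zero_extend:

```c
#include <stdint.h>

/* A 64-bit value represented as two 32-bit halves.  */
struct di_halves
{
  uint32_t lo;
  uint32_t hi;
};

/* QI/HI source: the low half is a zero-extension to 32 bits.  */
struct di_halves
zext_hi_to_di (uint16_t src)
{
  struct di_halves r = { (uint32_t) src, 0 };
  return r;
}

/* SI source: the low half is a plain copy, no extension needed.  */
struct di_halves
zext_si_to_di (uint32_t src)
{
  struct di_halves r = { src, 0 };
  return r;
}

/* Recombine the halves to check the result.  */
uint64_t
di_value (struct di_halves h)
{
  return ((uint64_t) h.hi << 32) | h.lo;
}
```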

Thanks,
Ilya
--
gcc/

2015-09-23  Ilya Enkovich  <enkovich@gmail.com>

* config/i386/i386.c: Include dbgcnt.h.
(has_non_address_hard_reg): New.
(convertible_comparison_p): New.
(scalar_to_vector_candidate_p): New.
(remove_non_convertible_regs): New.
(scalar_chain): New.
(scalar_chain::scalar_chain): New.
(scalar_chain::~scalar_chain): New.
(scalar_chain::add_to_queue): New.
(scalar_chain::mark_dual_mode_def): New.
(scalar_chain::analyze_register_chain): New.
(scalar_chain::add_insn): New.
(scalar_chain::build): New.
(scalar_chain::compute_convert_gain): New.
(scalar_chain::replace_with_subreg): New.
(scalar_chain::replace_with_subreg_in_insn): New.
(scalar_chain::emit_conversion_insns): New.
(scalar_chain::make_vector_copies): New.
(scalar_chain::convert_reg): New.
(scalar_chain::convert_op): New.
(scalar_chain::convert_insn): New.
(scalar_chain::convert): New.
(convert_scalars_to_vector): New.
(pass_data_stv): New.
(pass_stv): New.
(make_pass_stv): New.
(ix86_option_override): Create and register stv pass.
(flag_opts): Add -mstv.
(ix86_option_override_internal): Likewise.
* config/i386/i386.md (SWIM1248x): New.
(*movdi_internal): Add xmm to mem alternative for TARGET_STV.
(and<mode>3): Use SWIM1248x iterator instead of SWIM.
(*anddi3_doubleword): New.
(*zext<mode>_doubleword): New.
(*zextsi_doubleword): New.
(<code><mode>3): Use SWIM1248x iterator instead of SWIM.
(*<code>di3_doubleword): New.
* config/i386/i386.opt (mstv): New.
* dbgcnt.def (stv_conversion): New.

gcc/testsuite/

2015-09-23  Ilya Enkovich  <enkovich@gmail.com>

* gcc.target/i386/pr65105-1.c: New.
* gcc.target/i386/pr65105-2.c: New.
* gcc.target/i386/pr65105-3.c: New.
* gcc.target/i386/pr65105-4.C: New.
* gcc.dg/lower-subreg-1.c: Add -mno-stv options for ia32.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index d547cfd..2663f85 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-iterator.h"
 #include "tree-chkp.h"
 #include "rtl-chkp.h"
+#include "dbgcnt.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -2600,6 +2601,908 @@ rest_of_handle_insert_vzer
