Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-12 Thread Tamar Christina
I've been able to build spec again with this patch.

Thanks Richard!

Tamar.

From: Richard Biener <rguent...@suse.de>
Sent: Tuesday, September 12, 2017 5:36:10 PM
To: Andreas Schwab
Cc: Tamar Christina; Vidya Praveen; Andrew Pinski; Jon Beniston; 
gcc-patches@gcc.gnu.org; nd
Subject: Re: [RFC, vectorizer] Allow single element vector types for vector 
reduction operations

On September 12, 2017 6:25:32 PM GMT+02:00, Andreas Schwab 
<sch...@linux-m68k.org> wrote:
>On Sep 12 2017, Richard Biener <rguent...@suse.de> wrote:
>
>> Index: gcc/tree-vect-generic.c
>> ===
>> --- gcc/tree-vect-generic.c  (revision 251997)
>> +++ gcc/tree-vect-generic.c  (working copy)
>> @@ -1638,7 +1638,11 @@ expand_vector_operations_1 (gimple_stmt_
>>/* The signedness is determined from input argument.  */
>>if (code == VEC_UNPACK_FLOAT_HI_EXPR
>>|| code == VEC_UNPACK_FLOAT_LO_EXPR)
>> -type = TREE_TYPE (rhs1);
>> +{
>> +  type = TREE_TYPE (rhs1);
>> +  /* We do not know how to scalarize those.  */
>> +  return;
>
>The assignment is surely useless.

Sure. I left it because all this needs further cleanup.

Richard.

>Andreas.



Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-12 Thread Richard Biener
On September 12, 2017 6:25:32 PM GMT+02:00, Andreas Schwab 
 wrote:
>On Sep 12 2017, Richard Biener  wrote:
>
>> Index: gcc/tree-vect-generic.c
>> ===
>> --- gcc/tree-vect-generic.c  (revision 251997)
>> +++ gcc/tree-vect-generic.c  (working copy)
>> @@ -1638,7 +1638,11 @@ expand_vector_operations_1 (gimple_stmt_
>>/* The signedness is determined from input argument.  */
>>if (code == VEC_UNPACK_FLOAT_HI_EXPR
>>|| code == VEC_UNPACK_FLOAT_LO_EXPR)
>> -type = TREE_TYPE (rhs1);
>> +{
>> +  type = TREE_TYPE (rhs1);
>> +  /* We do not know how to scalarize those.  */
>> +  return;
>
>The assignment is surely useless.

Sure. I left it because all this needs further cleanup. 

Richard. 

>Andreas.



Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-12 Thread Andreas Schwab
On Sep 12 2017, Richard Biener  wrote:

> Index: gcc/tree-vect-generic.c
> ===
> --- gcc/tree-vect-generic.c   (revision 251997)
> +++ gcc/tree-vect-generic.c   (working copy)
> @@ -1638,7 +1638,11 @@ expand_vector_operations_1 (gimple_stmt_
>/* The signedness is determined from input argument.  */
>if (code == VEC_UNPACK_FLOAT_HI_EXPR
>|| code == VEC_UNPACK_FLOAT_LO_EXPR)
> -type = TREE_TYPE (rhs1);
> +{
> +  type = TREE_TYPE (rhs1);
> +  /* We do not know how to scalarize those.  */
> +  return;

The assignment is surely useless.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-12 Thread Richard Biener
On Tue, 12 Sep 2017, Richard Biener wrote:

> On Tue, 12 Sep 2017, Tamar Christina wrote:
> 
> > Hi Jon, Richard,
> > 
> > > > >
> > > > > Testing the following instead:
> > > >
> > > > Any news on this?
> > > 
> > > Didn't work out as expected.  I think the logic in tree-vect-generic is 
> > > the one
> > > to be fixed but I have to carve out some time to look into it so stay 
> > > tuned.
> > 
> > In the meantime can this patch be reverted? It is causing ICEs on 
> > AArch64 and ARM Preventing us from building benchmarks and tracking the 
> > toolchain performance.
> 
> You can always locally patch things.  I have yet another patch in
> testing right now.

I've installed the following.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2017-09-12  Richard Biener  

* tree-vect-generic.c (expand_vector_operations_1): Do nothing
for operations we cannot scalarize.

Index: gcc/tree-vect-generic.c
===
--- gcc/tree-vect-generic.c (revision 251997)
+++ gcc/tree-vect-generic.c (working copy)
@@ -1638,7 +1638,11 @@ expand_vector_operations_1 (gimple_stmt_
   /* The signedness is determined from input argument.  */
   if (code == VEC_UNPACK_FLOAT_HI_EXPR
   || code == VEC_UNPACK_FLOAT_LO_EXPR)
-type = TREE_TYPE (rhs1);
+{
+  type = TREE_TYPE (rhs1);
+  /* We do not know how to scalarize those.  */
+  return;
+}
 
   /* For widening/narrowing vector operations, the relevant type is of the
  arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
@@ -1655,7 +1659,11 @@ expand_vector_operations_1 (gimple_stmt_
   || code == VEC_PACK_FIX_TRUNC_EXPR
   || code == VEC_WIDEN_LSHIFT_HI_EXPR
   || code == VEC_WIDEN_LSHIFT_LO_EXPR)
-type = TREE_TYPE (rhs1);
+{
+  type = TREE_TYPE (rhs1);
+  /* We do not know how to scalarize those.  */
+  return;
+}
 
   /* Choose between vector shift/rotate by vector and vector shift/rotate by
  scalar */


RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-12 Thread Richard Biener
On Tue, 12 Sep 2017, Tamar Christina wrote:

> Hi Jon, Richard,
> 
> > > >
> > > > Testing the following instead:
> > >
> > > Any news on this?
> > 
> > Didn't work out as expected.  I think the logic in tree-vect-generic is the 
> > one
> > to be fixed but I have to carve out some time to look into it so stay tuned.
> 
> In the meantime can this patch be reverted? It is causing ICEs on 
> AArch64 and ARM Preventing us from building benchmarks and tracking the 
> toolchain performance.

You can always locally patch things.  I have yet another patch in
testing right now.

Richard.

> Thanks,
> Tamar
> 
> > 
> > Richard.
> > 
> > > VP.
> > >
> > >
> > > >
> > > > Index: gcc/tree-vect-generic.c
> > > >
> > ==
> > =
> > > > --- gcc/tree-vect-generic.c (revision 251642)
> > > > +++ gcc/tree-vect-generic.c (working copy)
> > > > @@ -1640,7 +1640,7 @@ expand_vector_operations_1 (gimple_stmt_
> > > >|| code == VEC_UNPACK_FLOAT_LO_EXPR)
> > > >  type = TREE_TYPE (rhs1);
> > > >
> > > > -  /* For widening/narrowing vector operations, the relevant type is
> > > > of the
> > > > +  /* For widening vector operations, the relevant type is of the
> > > >   arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
> > > >   calculated in the same way above.  */
> > > >if (code == WIDEN_SUM_EXPR
> > > > @@ -1650,9 +1650,6 @@ expand_vector_operations_1 (gimple_stmt_
> > > >|| code == VEC_WIDEN_MULT_ODD_EXPR
> > > >|| code == VEC_UNPACK_HI_EXPR
> > > >|| code == VEC_UNPACK_LO_EXPR
> > > > -  || code == VEC_PACK_TRUNC_EXPR
> > > > -  || code == VEC_PACK_SAT_EXPR
> > > > -  || code == VEC_PACK_FIX_TRUNC_EXPR
> > > >|| code == VEC_WIDEN_LSHIFT_HI_EXPR
> > > >|| code == VEC_WIDEN_LSHIFT_LO_EXPR)
> > > >  type = TREE_TYPE (rhs1);
> > > >
> > > >
> > > > also fix for a bug uncovered by the previous one:
> > > >
> > > > Index: gcc/gimple-ssa-strength-reduction.c
> > > >
> > ==
> > =
> > > > --- gcc/gimple-ssa-strength-reduction.c (revision 251710)
> > > > +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> > > > @@ -1742,8 +1742,7 @@ find_candidates_dom_walker::before_dom_c
> > > > slsr_process_ref (gs);
> > > >
> > > >else if (is_gimple_assign (gs)
> > > > -  && SCALAR_INT_MODE_P
> > > > -   (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)
> > > > +  && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs
> > > > + (gs
> > > > {
> > > >   tree rhs1 = NULL_TREE, rhs2 = NULL_TREE;
> > > >
> > > >
> > >
> > >
> > 
> > --
> > Richard Biener 
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> > HRB 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-12 Thread Tamar Christina
Hi Jon, Richard,

> > >
> > > Testing the following instead:
> >
> > Any news on this?
> 
> Didn't work out as expected.  I think the logic in tree-vect-generic is the 
> one
> to be fixed but I have to carve out some time to look into it so stay tuned.

In the meantime can this patch be reverted? It is causing ICEs on AArch64 and 
ARM
Preventing us from building benchmarks and tracking the toolchain performance.

Thanks,
Tamar

> 
> Richard.
> 
> > VP.
> >
> >
> > >
> > > Index: gcc/tree-vect-generic.c
> > >
> ==
> =
> > > --- gcc/tree-vect-generic.c (revision 251642)
> > > +++ gcc/tree-vect-generic.c (working copy)
> > > @@ -1640,7 +1640,7 @@ expand_vector_operations_1 (gimple_stmt_
> > >|| code == VEC_UNPACK_FLOAT_LO_EXPR)
> > >  type = TREE_TYPE (rhs1);
> > >
> > > -  /* For widening/narrowing vector operations, the relevant type is
> > > of the
> > > +  /* For widening vector operations, the relevant type is of the
> > >   arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
> > >   calculated in the same way above.  */
> > >if (code == WIDEN_SUM_EXPR
> > > @@ -1650,9 +1650,6 @@ expand_vector_operations_1 (gimple_stmt_
> > >|| code == VEC_WIDEN_MULT_ODD_EXPR
> > >|| code == VEC_UNPACK_HI_EXPR
> > >|| code == VEC_UNPACK_LO_EXPR
> > > -  || code == VEC_PACK_TRUNC_EXPR
> > > -  || code == VEC_PACK_SAT_EXPR
> > > -  || code == VEC_PACK_FIX_TRUNC_EXPR
> > >|| code == VEC_WIDEN_LSHIFT_HI_EXPR
> > >|| code == VEC_WIDEN_LSHIFT_LO_EXPR)
> > >  type = TREE_TYPE (rhs1);
> > >
> > >
> > > also fix for a bug uncovered by the previous one:
> > >
> > > Index: gcc/gimple-ssa-strength-reduction.c
> > >
> ==
> =
> > > --- gcc/gimple-ssa-strength-reduction.c (revision 251710)
> > > +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> > > @@ -1742,8 +1742,7 @@ find_candidates_dom_walker::before_dom_c
> > > slsr_process_ref (gs);
> > >
> > >else if (is_gimple_assign (gs)
> > > -  && SCALAR_INT_MODE_P
> > > -   (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)
> > > +  && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs
> > > + (gs
> > > {
> > >   tree rhs1 = NULL_TREE, rhs2 = NULL_TREE;
> > >
> > >
> >
> >
> 
> --
> Richard Biener 
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)


Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-12 Thread Richard Biener
On Mon, 11 Sep 2017, Vidya Praveen wrote:

> On Tue, Sep 05, 2017 at 03:12:47PM +0200, Richard Biener wrote:
> > On Tue, 5 Sep 2017, Tamar Christina wrote:
> > 
> > > 
> > > 
> > > > -Original Message-
> > > > From: Richard Biener [mailto:rguent...@suse.de]
> > > > Sent: 05 September 2017 13:51
> > > > To: Tamar Christina
> > > > Cc: Andrew Pinski; Andreas Schwab; Jon Beniston; 
> > > > gcc-patches@gcc.gnu.org;
> > > > nd
> > > > Subject: RE: [RFC, vectorizer] Allow single element vector types for 
> > > > vector
> > > > reduction operations
> > > > 
> > > > On Tue, 5 Sep 2017, Richard Biener wrote:
> > > > 
> > > > > On Tue, 5 Sep 2017, Tamar Christina wrote:
> > > > >
> > > > > > Hi Richard,
> > > > > >
> > > > > > That was an really interesting analysis, thanks for the details!
> > > > > >
> > > > > > Would you be submitting the patch you proposed at the end as a fix?
> > > > >
> > > > > I'm testing it currently.
> > > > 
> > > > Unfortunately it breaks some required lowering.  I'll have to more 
> > > > closely
> > > > look at this.
> > > 
> > > Ah, ok. In the meantime, can this patch be reverted? It's currently 
> > > breaking spec for us so we're
> > > Not able to get any benchmarking numbers.
> > 
> > Testing the following instead:
> 
> Any news on this?

Didn't work out as expected.  I think the logic in tree-vect-generic
is the one to be fixed but I have to carve out some time to look
into it so stay tuned.

Richard.

> VP.
> 
> 
> > 
> > Index: gcc/tree-vect-generic.c
> > ===
> > --- gcc/tree-vect-generic.c (revision 251642)
> > +++ gcc/tree-vect-generic.c (working copy)
> > @@ -1640,7 +1640,7 @@ expand_vector_operations_1 (gimple_stmt_
> >|| code == VEC_UNPACK_FLOAT_LO_EXPR)
> >  type = TREE_TYPE (rhs1);
> >  
> > -  /* For widening/narrowing vector operations, the relevant type is of 
> > the
> > +  /* For widening vector operations, the relevant type is of the
> >   arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
> >   calculated in the same way above.  */
> >if (code == WIDEN_SUM_EXPR
> > @@ -1650,9 +1650,6 @@ expand_vector_operations_1 (gimple_stmt_
> >|| code == VEC_WIDEN_MULT_ODD_EXPR
> >|| code == VEC_UNPACK_HI_EXPR
> >|| code == VEC_UNPACK_LO_EXPR
> > -  || code == VEC_PACK_TRUNC_EXPR
> > -  || code == VEC_PACK_SAT_EXPR
> > -  || code == VEC_PACK_FIX_TRUNC_EXPR
> >|| code == VEC_WIDEN_LSHIFT_HI_EXPR
> >|| code == VEC_WIDEN_LSHIFT_LO_EXPR)
> >  type = TREE_TYPE (rhs1);
> > 
> > 
> > also fix for a bug uncovered by the previous one:
> > 
> > Index: gcc/gimple-ssa-strength-reduction.c
> > ===
> > --- gcc/gimple-ssa-strength-reduction.c (revision 251710)
> > +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> > @@ -1742,8 +1742,7 @@ find_candidates_dom_walker::before_dom_c
> > slsr_process_ref (gs);
> >  
> >else if (is_gimple_assign (gs)
> > -  && SCALAR_INT_MODE_P
> > -   (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)
> > +  && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (gs
> > {
> >   tree rhs1 = NULL_TREE, rhs2 = NULL_TREE;
> >  
> > 
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-11 Thread Vidya Praveen
On Tue, Sep 05, 2017 at 03:12:47PM +0200, Richard Biener wrote:
> On Tue, 5 Sep 2017, Tamar Christina wrote:
> 
> > 
> > 
> > > -Original Message-
> > > From: Richard Biener [mailto:rguent...@suse.de]
> > > Sent: 05 September 2017 13:51
> > > To: Tamar Christina
> > > Cc: Andrew Pinski; Andreas Schwab; Jon Beniston; gcc-patches@gcc.gnu.org;
> > > nd
> > > Subject: RE: [RFC, vectorizer] Allow single element vector types for 
> > > vector
> > > reduction operations
> > > 
> > > On Tue, 5 Sep 2017, Richard Biener wrote:
> > > 
> > > > On Tue, 5 Sep 2017, Tamar Christina wrote:
> > > >
> > > > > Hi Richard,
> > > > >
> > > > > That was an really interesting analysis, thanks for the details!
> > > > >
> > > > > Would you be submitting the patch you proposed at the end as a fix?
> > > >
> > > > I'm testing it currently.
> > > 
> > > Unfortunately it breaks some required lowering.  I'll have to more closely
> > > look at this.
> > 
> > Ah, ok. In the meantime, can this patch be reverted? It's currently 
> > breaking spec for us so we're
> > Not able to get any benchmarking numbers.
> 
> Testing the following instead:

Any news on this?

VP.


> 
> Index: gcc/tree-vect-generic.c
> ===
> --- gcc/tree-vect-generic.c (revision 251642)
> +++ gcc/tree-vect-generic.c (working copy)
> @@ -1640,7 +1640,7 @@ expand_vector_operations_1 (gimple_stmt_
>|| code == VEC_UNPACK_FLOAT_LO_EXPR)
>  type = TREE_TYPE (rhs1);
>  
> -  /* For widening/narrowing vector operations, the relevant type is of 
> the
> +  /* For widening vector operations, the relevant type is of the
>   arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
>   calculated in the same way above.  */
>if (code == WIDEN_SUM_EXPR
> @@ -1650,9 +1650,6 @@ expand_vector_operations_1 (gimple_stmt_
>|| code == VEC_WIDEN_MULT_ODD_EXPR
>|| code == VEC_UNPACK_HI_EXPR
>|| code == VEC_UNPACK_LO_EXPR
> -  || code == VEC_PACK_TRUNC_EXPR
> -  || code == VEC_PACK_SAT_EXPR
> -  || code == VEC_PACK_FIX_TRUNC_EXPR
>|| code == VEC_WIDEN_LSHIFT_HI_EXPR
>|| code == VEC_WIDEN_LSHIFT_LO_EXPR)
>  type = TREE_TYPE (rhs1);
> 
> 
> also fix for a bug uncovered by the previous one:
> 
> Index: gcc/gimple-ssa-strength-reduction.c
> ===
> --- gcc/gimple-ssa-strength-reduction.c (revision 251710)
> +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> @@ -1742,8 +1742,7 @@ find_candidates_dom_walker::before_dom_c
> slsr_process_ref (gs);
>  
>else if (is_gimple_assign (gs)
> -  && SCALAR_INT_MODE_P
> -   (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)
> +  && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (gs
> {
>   tree rhs1 = NULL_TREE, rhs2 = NULL_TREE;
>  
> 


RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-07 Thread Richard Biener
On September 7, 2017 1:20:13 PM GMT+02:00, Jon Beniston  
wrote:
>Hi Bernd,
>
>>This seems to ring a bell. I think I submitted a patch for this here 
>>- is this the same problem?
>>https://gcc.gnu.org/ml/gcc-patches/2010-12/msg01724.html
>
>Looks like it.

Looks my suggestion was exactly what we have now. Note I didn't yet manage to 
do minimal surgery to vector lowering to fix the ARM fallout... 

Richard. 

>Cheers,
>Jon



RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-07 Thread Jon Beniston
Hi Bernd,

>This seems to ring a bell. I think I submitted a patch for this here 
>- is this the same problem?
>https://gcc.gnu.org/ml/gcc-patches/2010-12/msg01724.html

Looks like it.

Cheers,
Jon




Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-07 Thread Bernd Schmidt
On 08/27/2017 09:36 PM, Jon Beniston wrote:
> I have an out-of-tree GCC port and it is struggling supporting
> auto-vectorization on some dot product instructions.  For example, I have an
> instruction that takes three operands which are all 32-bit general
> registers. The second and third operands will be treated as V2HI then do dot
> product, and then generate an SI result which is then added to the first
> operand which is SI as well.

This seems to ring a bell. I think I submitted a patch for this here -
is this the same problem?
https://gcc.gnu.org/ml/gcc-patches/2010-12/msg01724.html


Bernd


RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-05 Thread Tamar Christina


> -Original Message-
> From: Richard Biener [mailto:rguent...@suse.de]
> Sent: 05 September 2017 14:13
> To: Tamar Christina
> Cc: Andrew Pinski; Andreas Schwab; Jon Beniston; gcc-patches@gcc.gnu.org;
> nd
> Subject: RE: [RFC, vectorizer] Allow single element vector types for vector
> reduction operations
> 
> On Tue, 5 Sep 2017, Tamar Christina wrote:
> 
> >
> >
> > > -Original Message-
> > > From: Richard Biener [mailto:rguent...@suse.de]
> > > Sent: 05 September 2017 13:51
> > > To: Tamar Christina
> > > Cc: Andrew Pinski; Andreas Schwab; Jon Beniston;
> > > gcc-patches@gcc.gnu.org; nd
> > > Subject: RE: [RFC, vectorizer] Allow single element vector types for
> > > vector reduction operations
> > >
> > > On Tue, 5 Sep 2017, Richard Biener wrote:
> > >
> > > > On Tue, 5 Sep 2017, Tamar Christina wrote:
> > > >
> > > > > Hi Richard,
> > > > >
> > > > > That was an really interesting analysis, thanks for the details!
> > > > >
> > > > > Would you be submitting the patch you proposed at the end as a fix?
> > > >
> > > > I'm testing it currently.
> > >
> > > Unfortunately it breaks some required lowering.  I'll have to more
> > > closely look at this.
> >
> > Ah, ok. In the meantime, can this patch be reverted? It's currently
> > breaking spec for us so we're Not able to get any benchmarking numbers.
> 
> Testing the following instead:

That does seem to build spec again, haven't tested the testsuite yet. 

> Index: gcc/tree-vect-generic.c
> ==
> =
> --- gcc/tree-vect-generic.c (revision 251642)
> +++ gcc/tree-vect-generic.c (working copy)
> @@ -1640,7 +1640,7 @@ expand_vector_operations_1 (gimple_stmt_
>|| code == VEC_UNPACK_FLOAT_LO_EXPR)
>  type = TREE_TYPE (rhs1);
> 
> -  /* For widening/narrowing vector operations, the relevant type is of the
> +  /* For widening vector operations, the relevant type is of the
>   arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
>   calculated in the same way above.  */
>if (code == WIDEN_SUM_EXPR
> @@ -1650,9 +1650,6 @@ expand_vector_operations_1 (gimple_stmt_
>|| code == VEC_WIDEN_MULT_ODD_EXPR
>|| code == VEC_UNPACK_HI_EXPR
>|| code == VEC_UNPACK_LO_EXPR
> -  || code == VEC_PACK_TRUNC_EXPR
> -  || code == VEC_PACK_SAT_EXPR
> -  || code == VEC_PACK_FIX_TRUNC_EXPR
>|| code == VEC_WIDEN_LSHIFT_HI_EXPR
>|| code == VEC_WIDEN_LSHIFT_LO_EXPR)
>  type = TREE_TYPE (rhs1);
> 
> 
> also fix for a bug uncovered by the previous one:
> 
> Index: gcc/gimple-ssa-strength-reduction.c
> ==
> =
> --- gcc/gimple-ssa-strength-reduction.c (revision 251710)
> +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> @@ -1742,8 +1742,7 @@ find_candidates_dom_walker::before_dom_c
> slsr_process_ref (gs);
> 
>else if (is_gimple_assign (gs)
> -  && SCALAR_INT_MODE_P
> -   (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)
> +  && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (gs
> {
>   tree rhs1 = NULL_TREE, rhs2 = NULL_TREE;
> 



RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-05 Thread Richard Biener
On Tue, 5 Sep 2017, Tamar Christina wrote:

> 
> 
> > -Original Message-
> > From: Richard Biener [mailto:rguent...@suse.de]
> > Sent: 05 September 2017 13:51
> > To: Tamar Christina
> > Cc: Andrew Pinski; Andreas Schwab; Jon Beniston; gcc-patches@gcc.gnu.org;
> > nd
> > Subject: RE: [RFC, vectorizer] Allow single element vector types for vector
> > reduction operations
> > 
> > On Tue, 5 Sep 2017, Richard Biener wrote:
> > 
> > > On Tue, 5 Sep 2017, Tamar Christina wrote:
> > >
> > > > Hi Richard,
> > > >
> > > > That was an really interesting analysis, thanks for the details!
> > > >
> > > > Would you be submitting the patch you proposed at the end as a fix?
> > >
> > > I'm testing it currently.
> > 
> > Unfortunately it breaks some required lowering.  I'll have to more closely
> > look at this.
> 
> Ah, ok. In the meantime, can this patch be reverted? It's currently breaking 
> spec for us so we're
> Not able to get any benchmarking numbers.

Testing the following instead:

Index: gcc/tree-vect-generic.c
===
--- gcc/tree-vect-generic.c (revision 251642)
+++ gcc/tree-vect-generic.c (working copy)
@@ -1640,7 +1640,7 @@ expand_vector_operations_1 (gimple_stmt_
   || code == VEC_UNPACK_FLOAT_LO_EXPR)
 type = TREE_TYPE (rhs1);
 
-  /* For widening/narrowing vector operations, the relevant type is of 
the
+  /* For widening vector operations, the relevant type is of the
  arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
  calculated in the same way above.  */
   if (code == WIDEN_SUM_EXPR
@@ -1650,9 +1650,6 @@ expand_vector_operations_1 (gimple_stmt_
   || code == VEC_WIDEN_MULT_ODD_EXPR
   || code == VEC_UNPACK_HI_EXPR
   || code == VEC_UNPACK_LO_EXPR
-  || code == VEC_PACK_TRUNC_EXPR
-  || code == VEC_PACK_SAT_EXPR
-  || code == VEC_PACK_FIX_TRUNC_EXPR
   || code == VEC_WIDEN_LSHIFT_HI_EXPR
   || code == VEC_WIDEN_LSHIFT_LO_EXPR)
 type = TREE_TYPE (rhs1);


also fix for a bug uncovered by the previous one:

Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 251710)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -1742,8 +1742,7 @@ find_candidates_dom_walker::before_dom_c
slsr_process_ref (gs);
 
   else if (is_gimple_assign (gs)
-  && SCALAR_INT_MODE_P
-   (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)
+  && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (gs
{
  tree rhs1 = NULL_TREE, rhs2 = NULL_TREE;
 



RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-05 Thread Tamar Christina


> -Original Message-
> From: Richard Biener [mailto:rguent...@suse.de]
> Sent: 05 September 2017 13:51
> To: Tamar Christina
> Cc: Andrew Pinski; Andreas Schwab; Jon Beniston; gcc-patches@gcc.gnu.org;
> nd
> Subject: RE: [RFC, vectorizer] Allow single element vector types for vector
> reduction operations
> 
> On Tue, 5 Sep 2017, Richard Biener wrote:
> 
> > On Tue, 5 Sep 2017, Tamar Christina wrote:
> >
> > > Hi Richard,
> > >
> > > That was an really interesting analysis, thanks for the details!
> > >
> > > Would you be submitting the patch you proposed at the end as a fix?
> >
> > I'm testing it currently.
> 
> Unfortunately it breaks some required lowering.  I'll have to more closely
> look at this.

Ah, ok. In the meantime, can this patch be reverted? It's currently breaking 
spec for us so we're
Not able to get any benchmarking numbers.

> 
> Richard.


RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-05 Thread Richard Biener
On Tue, 5 Sep 2017, Richard Biener wrote:

> On Tue, 5 Sep 2017, Tamar Christina wrote:
> 
> > Hi Richard,
> > 
> > That was an really interesting analysis, thanks for the details!
> > 
> > Would you be submitting the patch you proposed at the end as a fix?
> 
> I'm testing it currently.

Unfortunately it breaks some required lowering.  I'll have to more
closely look at this.

Richard.


RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-05 Thread Richard Biener
On Tue, 5 Sep 2017, Tamar Christina wrote:

> Hi Richard,
> 
> That was an really interesting analysis, thanks for the details!
> 
> Would you be submitting the patch you proposed at the end as a fix?

I'm testing it currently.

Richard.

> Thanks,
> Tamar
> 
> > -Original Message-
> > From: Richard Biener [mailto:rguent...@suse.de]
> > Sent: 05 September 2017 10:38
> > To: Andrew Pinski
> > Cc: Tamar Christina; Andreas Schwab; Jon Beniston; gcc-
> > patc...@gcc.gnu.org; nd
> > Subject: Re: [RFC, vectorizer] Allow single element vector types for vector
> > reduction operations
> > 
> > On Mon, 4 Sep 2017, Andrew Pinski wrote:
> > 
> > > On Mon, Sep 4, 2017 at 7:28 AM, Tamar Christina
> > <tamar.christ...@arm.com> wrote:
> > > >> >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > > >> intD.11>(vect__4.21_65);
> > > >> >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > > >> intD.11>(vect__4.22_63);
> > > >> >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > > >> intD.11>(vect__4.23_61);
> > > >> >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > > >> > intD.11>(vect__4.24_59);
> > > >> >
> > > >> > I suspect this patch will be quite bad for us performance wise as
> > > >> > it thinks it's as cheap to do all our integer operations on the
> > > >> > vector side with
> > > >> vectors of 1 element. But I'm still waiting for the perf numbers to
> > confirm.
> > > >>
> > > >> Looks like the backend advertises that it can do POPCOUNT on V1DI.
> > > >> So SLP vectorization decides it can vectorize this without unrolling.
> > > >
> > > > We don't, POPCOUNT is only defined for vector modes V8QI and V16QI,
> > > > we also don't define support For V1DI anywhere in the backend, we do
> > > > however say we support V1DF, but removing That doesn't cause the ICE
> > to go away.
> > > >
> > > >> Vectorization with V2DI is rejected:
> > > >>
> > > >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: function
> > > >> is not vectorizable.
> > > >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: not
> > vectorized:
> > > >> relevant stmt not supported: _8 = __builtin_popcountl (_5); ...
> > > >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: *
> > > >> Re-trying analysis with vector size 8
> > > >>
> > > >> and that now succeeds (it probably didn't succeed before the patch).
> > > >
> > > > In the .optimized file, I see it vectorised it to
> > > >
> > > >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > intD.11>(vect__4.21_65);
> > > >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > intD.11>(vect__4.22_63);
> > > >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > intD.11>(vect__4.23_61);
> > > >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > intD.11>(vect__4.24_59);
> > > >   _54 = POPCOUNT (vect__5.25_58);
> > > >   _53 = POPCOUNT (vect__5.25_57);
> > > >
> > > > Which is something we just don't have a pattern for. Before this patch, 
> > > > it
> > was rejecting "long unsigned int"
> > > > With this patch is somehow thinks we support an integer vector of 1
> > > > element, even though 1) we don't have an optab Defined for this
> > operation for POPCOUNT (or at all in aarch64 as far as I can tell), and 2) 
> > we
> > don't have it in our supported list of vector modes.
> > >
> > > Here are the two popcount optab aarch64 has:
> > > (define_mode_iterator GPI [SI DI])
> > > (define_expand "popcount2"
> > >   [(match_operand:GPI 0 "register_operand")
> > >(match_operand:GPI 1 "register_operand")]
> > >
> > >
> > > (define_insn "popcount2"
> > > (define_mode_iterator VB [V8QI V16QI])
> > >   [(set (match_operand:VB 0 "register_operand" "=w")
> > > (popcount:VB (match_operand:VB 1 "register_operand" "w")))]
> > >
> > > As you can see we only define popcount optab for SI, DI, V8QI and
> 

RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-05 Thread Tamar Christina
Hi Richard,

That was an really interesting analysis, thanks for the details!

Would you be submitting the patch you proposed at the end as a fix?

Thanks,
Tamar

> -Original Message-
> From: Richard Biener [mailto:rguent...@suse.de]
> Sent: 05 September 2017 10:38
> To: Andrew Pinski
> Cc: Tamar Christina; Andreas Schwab; Jon Beniston; gcc-
> patc...@gcc.gnu.org; nd
> Subject: Re: [RFC, vectorizer] Allow single element vector types for vector
> reduction operations
> 
> On Mon, 4 Sep 2017, Andrew Pinski wrote:
> 
> > On Mon, Sep 4, 2017 at 7:28 AM, Tamar Christina
> <tamar.christ...@arm.com> wrote:
> > >> >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > >> intD.11>(vect__4.21_65);
> > >> >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > >> intD.11>(vect__4.22_63);
> > >> >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > >> intD.11>(vect__4.23_61);
> > >> >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > >> > intD.11>(vect__4.24_59);
> > >> >
> > >> > I suspect this patch will be quite bad for us performance wise as
> > >> > it thinks it's as cheap to do all our integer operations on the
> > >> > vector side with
> > >> vectors of 1 element. But I'm still waiting for the perf numbers to
> confirm.
> > >>
> > >> Looks like the backend advertises that it can do POPCOUNT on V1DI.
> > >> So SLP vectorization decides it can vectorize this without unrolling.
> > >
> > > We don't, POPCOUNT is only defined for vector modes V8QI and V16QI,
> > > we also don't define support For V1DI anywhere in the backend, we do
> > > however say we support V1DF, but removing That doesn't cause the ICE
> to go away.
> > >
> > >> Vectorization with V2DI is rejected:
> > >>
> > >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: function
> > >> is not vectorizable.
> > >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: not
> vectorized:
> > >> relevant stmt not supported: _8 = __builtin_popcountl (_5); ...
> > >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: *
> > >> Re-trying analysis with vector size 8
> > >>
> > >> and that now succeeds (it probably didn't succeed before the patch).
> > >
> > > In the .optimized file, I see it vectorised it to
> > >
> > >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.21_65);
> > >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.22_63);
> > >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.23_61);
> > >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.24_59);
> > >   _54 = POPCOUNT (vect__5.25_58);
> > >   _53 = POPCOUNT (vect__5.25_57);
> > >
> > > Which is something we just don't have a pattern for. Before this patch, it
> was rejecting "long unsigned int"
> > > With this patch is somehow thinks we support an integer vector of 1
> > > element, even though 1) we don't have an optab Defined for this
> operation for POPCOUNT (or at all in aarch64 as far as I can tell), and 2) we
> don't have it in our supported list of vector modes.
> >
> > Here are the two popcount optab aarch64 has:
> > (define_mode_iterator GPI [SI DI])
> > (define_expand "popcount2"
> >   [(match_operand:GPI 0 "register_operand")
> >(match_operand:GPI 1 "register_operand")]
> >
> >
> > (define_insn "popcount2"
> > (define_mode_iterator VB [V8QI V16QI])
> >   [(set (match_operand:VB 0 "register_operand" "=w")
> > (popcount:VB (match_operand:VB 1 "register_operand" "w")))]
> >
> > As you can see we only define popcount optab for SI, DI, V8QI and
> > V16QI.  (note SI and DI uses the V8QI and V16QI during the expansion
> > but that is a different story).
> >
> > Maybe somehow the vectorizer is thinking V1DI and DI are
> > interchangeable in some places.
> 
> We ask
> 
> Breakpoint 5, vectorizable_internal_function
> (cfn=CFN_BUILT_IN_POPCOUNTL,
> fndecl=,
> vectype_out=,
> vectype_in=)
> at /tmp/trunk/gcc/tree-vect-stmts.c:1666
> 1666  if (internal_fn_p (cfn))
> (gdb) p debug_generic_expr (vectype_out)
> vector(2) in

Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-05 Thread Richard Biener
torizable = 1}

so require vectype_in as vectype_out which means we ask for V1DI -> V1DI
(and the vectorizer compensates for the demotion).  But the
mode of V1DI is actually DI (because the target doesn't support V1DI).
Thus we end up with asking for popcount with DImode which is
available.

Note that this V1DI vector type having DImode is desirable for Jons
case as far as I understand.  DImode gets used because aarch64
advertises generic vectorization support with word_mode.

Things may get tricky later when we look for VEC_PACK_TRUNC
of V1DI, V1DI -> V2SI but the vec_pack_trunc_optab advertises
DImode support via the VDN iterator (maybe expecting this only
in case no V2SI support is available).

So in the end the vectorizer works as expected ;)

What we don't seem to handle is a single-element vector typed, DImode
constructor with a single DImode element.

We call store_bit_field with fieldmode DI but a value with V2SI,
I guess things go downhill from there.  The SSA exp we store
was DImode.  Ah.

 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set 3 canonical-type 
0x768a07e0 precision:64 min  max 

pointer_to_this >
visited
def_stmt _62 = VEC_PACK_TRUNC_EXPR <_67, _64>;

I suppose _that_'s already somewhat bogus.  vector lowering creates it
for some reason:

-  vect__8.26_52 = VEC_PACK_TRUNC_EXPR <_54, _53>;
+  _67 = BIT_FIELD_REF <_54, 64, 0>;
+  _64 = BIT_FIELD_REF <_53, 64, 0>;
+  _62 = VEC_PACK_TRUNC_EXPR <_67, _64>;
+  _60 = {_62};
+  _48 = VIEW_CONVERT_EXPR<vector(2) int>(_60);
+  vect__8.26_52 = _48;

which happens because

static tree
get_compute_type (enum tree_code code, optab op, tree type)
{
  /* For very wide vectors, try using a smaller vector mode.  */
  tree compute_type = type;
  if (op
  && (!VECTOR_MODE_P (TYPE_MODE (type))
  || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing))

and/or

  if (compute_type == NULL_TREE)
compute_type = get_compute_type (code, op, type);
  if (compute_type == type)
return; 

note that lowering sth like VEC_PACK_TRUNC_EXPR into scalars is
pointless - at least in the way the lowering code does.

Index: gcc/tree-vect-generic.c
===
--- gcc/tree-vect-generic.c (revision 251642)
+++ gcc/tree-vect-generic.c (working copy)
@@ -1439,7 +1439,7 @@ get_compute_type (enum tree_code code, o
   /* For very wide vectors, try using a smaller vector mode.  */
   tree compute_type = type;
   if (op
-  && (!VECTOR_MODE_P (TYPE_MODE (type))
+  && (TYPE_MODE (type) == BLKmode
  || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing))
 {
   tree vector_compute_type
@@ -1459,7 +1459,7 @@ get_compute_type (enum tree_code code, o
   if (compute_type == type)
 {
   machine_mode compute_mode = TYPE_MODE (compute_type);
-  if (VECTOR_MODE_P (compute_mode))
+  if (compute_mode != BLKmode)
{
  if (op && optab_handler (op, compute_mode) != CODE_FOR_nothing)
return compute_type;

fixes this and we pass the testcase again.  Note we vectorize
that way even with the cost model enabled so it might show us
that the cost model fails to account for some costs:

/tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: Cost model 
analysis:
  Vector inside of loop cost: 8
  Vector prologue cost: 2
  Vector epilogue cost: 0
  Scalar iteration cost: 16
  Scalar outside cost: 0
  Vector outside cost: 2
  prologue iterations: 0
  epilogue iterations: 0
  Calculated minimum iters for profitability: 1

SLP costing seems to fail to account for promotion/demotion costs
it seems accounting for that still doesn't push it over the edge.

Richard.

> Thanks,
> Andrew
> 
> 
> >
> > So I don't quite understand how we end up with this expression.
> >
> > Regards,
> > Tamar
> >
> >>
> >> Richard.
> >>
> >> > 
> >> > From: gcc-patches-ow...@gcc.gnu.org <gcc-patches-ow...@gcc.gnu.org>
> >> on
> >> > behalf of Andreas Schwab <sch...@linux-m68k.org>
> >> > Sent: Saturday, September 2, 2017 10:58 PM
> >> > To: Jon Beniston
> >> > Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
> >> > Subject: Re: [RFC, vectorizer] Allow single element vector types for
> >> > vector reduction operations
> >> >
> >> > On Aug 30 2017, "Jon Beniston" <j...@beniston.com> wrote:
> >> >
> >> > > gcc/
> >> > > 2017-08-30  Jon Beniston  <j...@beniston.com>
> >> > > Richard Biener  <rguent...@suse.de>
> >> > >
> >> > > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect

Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-04 Thread Andrew Pinski
On Mon, Sep 4, 2017 at 7:28 AM, Tamar Christina <tamar.christ...@arm.com> wrote:
>> >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
>> intD.11>(vect__4.21_65);
>> >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
>> intD.11>(vect__4.22_63);
>> >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
>> intD.11>(vect__4.23_61);
>> >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
>> > intD.11>(vect__4.24_59);
>> >
>> > I suspect this patch will be quite bad for us performance wise as it
>> > thinks it's as cheap to do all our integer operations on the vector side 
>> > with
>> vectors of 1 element. But I'm still waiting for the perf numbers to confirm.
>>
>> Looks like the backend advertises that it can do POPCOUNT on V1DI.  So SLP
>> vectorization decides it can vectorize this without unrolling.
>
> We don't, POPCOUNT is only defined for vector modes V8QI and V16QI, we also 
> don't define support
> For V1DI anywhere in the backend, we do however say we support V1DF, but 
> removing
> That doesn't cause the ICE to go away.
>
>> Vectorization with V2DI is rejected:
>>
>> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: function is not
>> vectorizable.
>> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: not vectorized:
>> relevant stmt not supported: _8 = __builtin_popcountl (_5); ...
>> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: * Re-trying
>> analysis with vector size 8
>>
>> and that now succeeds (it probably didn't succeed before the patch).
>
> In the .optimized file, I see it vectorised it to
>
>   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
> intD.11>(vect__4.21_65);
>   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
> intD.11>(vect__4.22_63);
>   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
> intD.11>(vect__4.23_61);
>   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
> intD.11>(vect__4.24_59);
>   _54 = POPCOUNT (vect__5.25_58);
>   _53 = POPCOUNT (vect__5.25_57);
>
> Which is something we just don't have a pattern for. Before this patch, it 
> was rejecting "long unsigned int"
> With this patch is somehow thinks we support an integer vector of 1 element, 
> even though 1) we don't have an optab
> Defined for this operation for POPCOUNT (or at all in aarch64 as far as I can 
> tell), and 2) we don't have it in our supported list of vector modes.

Here are the two popcount optab aarch64 has:
(define_mode_iterator GPI [SI DI])
(define_expand "popcount2"
  [(match_operand:GPI 0 "register_operand")
   (match_operand:GPI 1 "register_operand")]


(define_insn "popcount2"
(define_mode_iterator VB [V8QI V16QI])
  [(set (match_operand:VB 0 "register_operand" "=w")
(popcount:VB (match_operand:VB 1 "register_operand" "w")))]

As you can see we only define popcount optab for SI, DI, V8QI and
V16QI.  (note SI and DI uses the V8QI and V16QI during the expansion
but that is a different story).

Maybe somehow the vectorizer is thinking V1DI and DI are
interchangeable in some places.

Thanks,
Andrew


>
> So I don't quite understand how we end up with this expression.
>
> Regards,
> Tamar
>
>>
>> Richard.
>>
>> > 
>> > From: gcc-patches-ow...@gcc.gnu.org <gcc-patches-ow...@gcc.gnu.org>
>> on
>> > behalf of Andreas Schwab <sch...@linux-m68k.org>
>> > Sent: Saturday, September 2, 2017 10:58 PM
>> > To: Jon Beniston
>> > Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
>> > Subject: Re: [RFC, vectorizer] Allow single element vector types for
>> > vector reduction operations
>> >
>> > On Aug 30 2017, "Jon Beniston" <j...@beniston.com> wrote:
>> >
>> > > gcc/
>> > > 2017-08-30  Jon Beniston  <j...@beniston.com>
>> > > Richard Biener  <rguent...@suse.de>
>> > >
>> > > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
>> > > index cfdb72c..5ebeac2 100644
>> > > --- a/gcc/tree-vect-patterns.c
>> > > +++ b/gcc/tree-vect-patterns.c
>> > > @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func
>> *recog_func,
>> > >loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>> > >
>> > >if (VECTOR_BOOLEAN_TYPE_P (type_in)
>> > > -  || VECTOR_MODE_P (TYPE_MODE (type_in)))

Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-04 Thread Christophe Lyon
On 30 August 2017 at 17:11, Jon Beniston  wrote:
> Hi Richard,
>
>> I think the issue is that with TARGET_VECTOR_MODE_SUPPORTED_P false
>> for V1SI you'll get a SImode vector type and the
>>
>>  if (VECTOR_BOOLEAN_TYPE_P (type_in)
>>  || VECTOR_MODE_P (TYPE_MODE (type_in)))
>>
>>check fails.  Try changing that to || VECTOR_TYPE_P (type_in).
>>The else path should be hopefully dead ...
>>
>>> It looks to me like the other call sites of optab_for_tree_code which
>>> are passing optab_default are mostly places where GCC is sure it is
>>> not a shift operation.
>>> Given TYPE_IN is returned from get_vectype_for_scalar_type, is it
>>> safe to simply pass optab_vector in vect_pattern_recog_1?
>>
>>I guess so.  Can you also try the above?
>
> Changing VECTOR_MODE_P check into VECTOR_TYPE_P check also works for me, I
> verified the following patch could vectorize my dot product case and there
> is no regression on gcc tests on my port.
>
> Meanwhile x86-64 bootstraps OK and no regression on gcc/g++ test.
>
>  Does this look OK?
>
>
> gcc/
> 2017-08-30  Jon Beniston  
> Richard Biener  
>
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index cfdb72c..5ebeac2 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
>loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>
>if (VECTOR_BOOLEAN_TYPE_P (type_in)
> -  || VECTOR_MODE_P (TYPE_MODE (type_in)))
> +  || VECTOR_TYPE_P (type_in))
>  {
>/* No need to check target support (already checked by the pattern
>   recognition function).  */
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 013fb1f..fc62efb 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> scalar_type, unsigned size)
>else
>  simd_mode = mode_for_vector (inner_mode, size / nbytes);
>nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> -  if (nunits <= 1)
> +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> */
> +  if (nunits < 1)
>  return NULL_TREE;
>
>vectype = build_vector_type (scalar_type, nunits);
>
>

Hi,

This patch makes this test fail:
gcc.dg/tree-ssa/ssa-dom-cse-2.c scan-tree-dump optimized "return 28;"
on arm-none-linux-gnueabi when using the default fpu, or on
arm-none-linux-gnueabihf --with-fpu=vfp (as opposed to
--with-fpu=neon*)

Christophe


RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-04 Thread Tamar Christina
> >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.21_65);
> >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.22_63);
> >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.23_61);
> >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > intD.11>(vect__4.24_59);
> >
> > I suspect this patch will be quite bad for us performance wise as it
> > thinks it's as cheap to do all our integer operations on the vector side 
> > with
> vectors of 1 element. But I'm still waiting for the perf numbers to confirm.
> 
> Looks like the backend advertises that it can do POPCOUNT on V1DI.  So SLP
> vectorization decides it can vectorize this without unrolling.

We don't, POPCOUNT is only defined for vector modes V8QI and V16QI, we also 
don't define support
For V1DI anywhere in the backend, we do however say we support V1DF, but 
removing
That doesn't cause the ICE to go away. 

> Vectorization with V2DI is rejected:
> 
> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: function is not
> vectorizable.
> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: not vectorized:
> relevant stmt not supported: _8 = __builtin_popcountl (_5); ...
> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: * Re-trying
> analysis with vector size 8
> 
> and that now succeeds (it probably didn't succeed before the patch).

In the .optimized file, I see it vectorised it to 

  vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
intD.11>(vect__4.21_65);
  vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
intD.11>(vect__4.22_63);
  vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
intD.11>(vect__4.23_61);
  vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
intD.11>(vect__4.24_59);
  _54 = POPCOUNT (vect__5.25_58);
  _53 = POPCOUNT (vect__5.25_57);

Which is something we just don't have a pattern for. Before this patch, it was 
rejecting "long unsigned int"
With this patch is somehow thinks we support an integer vector of 1 element, 
even though 1) we don't have an optab
Defined for this operation for POPCOUNT (or at all in aarch64 as far as I can 
tell), and 2) we don't have it in our supported list of vector modes.

So I don't quite understand how we end up with this expression.

Regards,
Tamar

> 
> Richard.
> 
> > 
> > From: gcc-patches-ow...@gcc.gnu.org <gcc-patches-ow...@gcc.gnu.org>
> on
> > behalf of Andreas Schwab <sch...@linux-m68k.org>
> > Sent: Saturday, September 2, 2017 10:58 PM
> > To: Jon Beniston
> > Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
> > Subject: Re: [RFC, vectorizer] Allow single element vector types for
> > vector reduction operations
> >
> > On Aug 30 2017, "Jon Beniston" <j...@beniston.com> wrote:
> >
> > > gcc/
> > > 2017-08-30  Jon Beniston  <j...@beniston.com>
> > > Richard Biener  <rguent...@suse.de>
> > >
> > > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> > > index cfdb72c..5ebeac2 100644
> > > --- a/gcc/tree-vect-patterns.c
> > > +++ b/gcc/tree-vect-patterns.c
> > > @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func
> *recog_func,
> > >loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
> > >
> > >if (VECTOR_BOOLEAN_TYPE_P (type_in)
> > > -  || VECTOR_MODE_P (TYPE_MODE (type_in)))
> > > +  || VECTOR_TYPE_P (type_in))
> > >  {
> > >/* No need to check target support (already checked by the pattern
> > >   recognition function).  */ diff --git
> > > a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index
> > > 013fb1f..fc62efb 100644
> > > --- a/gcc/tree-vect-stmts.c
> > > +++ b/gcc/tree-vect-stmts.c
> > > @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> > > scalar_type, unsigned size)
> > >else
> > >  simd_mode = mode_for_vector (inner_mode, size / nbytes);
> > >nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> > > -  if (nunits <= 1)
> > > +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> > > */
> > > +  if (nunits < 1)
> > >  return NULL_TREE;
> > >
> > >vectype = build_vector_type (scalar_type, nunits);
> > >
> > >
> > >
> >
> > That breaks vect/pr68577.c on aarch64.
> >
> > during RTL pass: expand
> > /opt/gcc/gcc-20170902/gcc/tests

Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-04 Thread Richard Biener
On Mon, 4 Sep 2017, Tamar Christina wrote:

> Indeed it does, but even worse if you look at the gimple,
> you see lots of
> 
>   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
> intD.11>(vect__4.21_65);
>   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
> intD.11>(vect__4.22_63);
>   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
> intD.11>(vect__4.23_61);
>   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
> intD.11>(vect__4.24_59);
> 
> I suspect this patch will be quite bad for us performance wise as it thinks 
> it's as cheap to do all our integer
> operations on the vector side with vectors of 1 element. But I'm still 
> waiting for the perf numbers to confirm.

Looks like the backend advertises that it can do POPCOUNT on V1DI.  So
SLP vectorization decides it can vectorize this without unrolling.

Vectorization with V2DI is rejected:

/tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: function is not 
vectorizable.
/tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: not vectorized: 
relevant stmt not supported: _8 = __builtin_popcountl (_5);
...
/tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: * Re-trying 
analysis with vector size 8

and that now succeeds (it probably didn't succeed before the patch).

Richard.

> 
> From: gcc-patches-ow...@gcc.gnu.org <gcc-patches-ow...@gcc.gnu.org> on behalf 
> of Andreas Schwab <sch...@linux-m68k.org>
> Sent: Saturday, September 2, 2017 10:58 PM
> To: Jon Beniston
> Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
> Subject: Re: [RFC, vectorizer] Allow single element vector types for vector 
> reduction operations
> 
> On Aug 30 2017, "Jon Beniston" <j...@beniston.com> wrote:
> 
> > gcc/
> > 2017-08-30  Jon Beniston  <j...@beniston.com>
> > Richard Biener  <rguent...@suse.de>
> >
> > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> > index cfdb72c..5ebeac2 100644
> > --- a/gcc/tree-vect-patterns.c
> > +++ b/gcc/tree-vect-patterns.c
> > @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
> >loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
> >
> >if (VECTOR_BOOLEAN_TYPE_P (type_in)
> > -  || VECTOR_MODE_P (TYPE_MODE (type_in)))
> > +  || VECTOR_TYPE_P (type_in))
> >  {
> >/* No need to check target support (already checked by the pattern
> >   recognition function).  */
> > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> > index 013fb1f..fc62efb 100644
> > --- a/gcc/tree-vect-stmts.c
> > +++ b/gcc/tree-vect-stmts.c
> > @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> > scalar_type, unsigned size)
> >else
> >  simd_mode = mode_for_vector (inner_mode, size / nbytes);
> >nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> > -  if (nunits <= 1)
> > +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> > */
> > +  if (nunits < 1)
> >  return NULL_TREE;
> >
> >vectype = build_vector_type (scalar_type, nunits);
> >
> >
> >
> 
> That breaks vect/pr68577.c on aarch64.
> 
> during RTL pass: expand
> /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c: In function 
> 'slp_test':
> /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c:20:12: internal 
> compiler error: in simplify_subreg, at simplify-rtx.c:6050
> 0xb55983 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
> ../../gcc/simplify-rtx.c:6049
> 0xb595f7 simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned 
> int)
> ../../gcc/simplify-rtx.c:6278
> 0x81d277 store_bit_field_1
> ../../gcc/expmed.c:798
> 0x81d55f store_bit_field(rtx_def*, unsigned long, unsigned long, unsigned 
> long, unsigned long, machine_mode, rtx_def*, bool)
> ../../gcc/expmed.c:1133
> 0x840bf7 store_field
> ../../gcc/expr.c:6950
> 0x84792f store_constructor_field
> ../../gcc/expr.c:6142
> 0x83edbf store_constructor
> ../../gcc/expr.c:6726
> 0x840443 expand_constructor
> ../../gcc/expr.c:8027
> 0x82d5bf expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
> expand_modifier, rtx_def**, bool)
> ../../gcc/expr.c:10133
> 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
> expand_modifier, rtx_def**, bool)
> ../../gcc/expr.c:9819
> 0x82dadf expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
> expand_modifier, rtx_def**, bool)
> ../../gcc/expr.c:10942
> 0x82eaeb expand_expr_re

Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-04 Thread Tamar Christina
Indeed it does, but even worse if you look at the gimple,
you see lots of

  vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
intD.11>(vect__4.21_65);
  vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
intD.11>(vect__4.22_63);
  vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
intD.11>(vect__4.23_61);
  vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
intD.11>(vect__4.24_59);

I suspect this patch will be quite bad for us performance wise as it thinks 
it's as cheap to do all our integer
operations on the vector side with vectors of 1 element. But I'm still waiting 
for the perf numbers to confirm.

From: gcc-patches-ow...@gcc.gnu.org <gcc-patches-ow...@gcc.gnu.org> on behalf 
of Andreas Schwab <sch...@linux-m68k.org>
Sent: Saturday, September 2, 2017 10:58 PM
To: Jon Beniston
Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
Subject: Re: [RFC, vectorizer] Allow single element vector types for vector 
reduction operations

On Aug 30 2017, "Jon Beniston" <j...@beniston.com> wrote:

> gcc/
> 2017-08-30  Jon Beniston  <j...@beniston.com>
> Richard Biener  <rguent...@suse.de>
>
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index cfdb72c..5ebeac2 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
>loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>
>if (VECTOR_BOOLEAN_TYPE_P (type_in)
> -  || VECTOR_MODE_P (TYPE_MODE (type_in)))
> +  || VECTOR_TYPE_P (type_in))
>  {
>/* No need to check target support (already checked by the pattern
>   recognition function).  */
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 013fb1f..fc62efb 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> scalar_type, unsigned size)
>else
>  simd_mode = mode_for_vector (inner_mode, size / nbytes);
>nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> -  if (nunits <= 1)
> +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> */
> +  if (nunits < 1)
>  return NULL_TREE;
>
>vectype = build_vector_type (scalar_type, nunits);
>
>
>

That breaks vect/pr68577.c on aarch64.

during RTL pass: expand
/opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c: In function 
'slp_test':
/opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c:20:12: internal 
compiler error: in simplify_subreg, at simplify-rtx.c:6050
0xb55983 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
../../gcc/simplify-rtx.c:6049
0xb595f7 simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
../../gcc/simplify-rtx.c:6278
0x81d277 store_bit_field_1
../../gcc/expmed.c:798
0x81d55f store_bit_field(rtx_def*, unsigned long, unsigned long, unsigned long, 
unsigned long, machine_mode, rtx_def*, bool)
../../gcc/expmed.c:1133
0x840bf7 store_field
../../gcc/expr.c:6950
0x84792f store_constructor_field
../../gcc/expr.c:6142
0x83edbf store_constructor
../../gcc/expr.c:6726
0x840443 expand_constructor
../../gcc/expr.c:8027
0x82d5bf expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
../../gcc/expr.c:10133
0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
../../gcc/expr.c:9819
0x82dadf expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
../../gcc/expr.c:10942
0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
../../gcc/expr.c:9819
0x83d197 expand_expr
../../gcc/expr.h:276
0x83d197 expand_assignment(tree_node*, tree_node*, bool)
../../gcc/expr.c:4971
0x71e2f3 expand_gimple_stmt_1
../../gcc/cfgexpand.c:3653
0x71e2f3 expand_gimple_stmt
../../gcc/cfgexpand.c:3751
0x721cdb expand_gimple_basic_block
../../gcc/cfgexpand.c:5750
0x726b07 execute
../../gcc/cfgexpand.c:6357

Andreas.

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-02 Thread Andreas Schwab
On Aug 30 2017, "Jon Beniston"  wrote:

> gcc/
> 2017-08-30  Jon Beniston  
> Richard Biener  
>
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index cfdb72c..5ebeac2 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
>loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>   
>if (VECTOR_BOOLEAN_TYPE_P (type_in)
> -  || VECTOR_MODE_P (TYPE_MODE (type_in)))
> +  || VECTOR_TYPE_P (type_in))
>  {
>/* No need to check target support (already checked by the pattern
>   recognition function).  */
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 013fb1f..fc62efb 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> scalar_type, unsigned size)
>else
>  simd_mode = mode_for_vector (inner_mode, size / nbytes);
>nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> -  if (nunits <= 1)
> +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> */
> +  if (nunits < 1)
>  return NULL_TREE;
>  
>vectype = build_vector_type (scalar_type, nunits);
>
>
>

That breaks vect/pr68577.c on aarch64.

during RTL pass: expand
/opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c: In function 
'slp_test':
/opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c:20:12: internal 
compiler error: in simplify_subreg, at simplify-rtx.c:6050
0xb55983 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
../../gcc/simplify-rtx.c:6049
0xb595f7 simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
../../gcc/simplify-rtx.c:6278
0x81d277 store_bit_field_1
../../gcc/expmed.c:798
0x81d55f store_bit_field(rtx_def*, unsigned long, unsigned long, unsigned long, 
unsigned long, machine_mode, rtx_def*, bool)
../../gcc/expmed.c:1133
0x840bf7 store_field
../../gcc/expr.c:6950
0x84792f store_constructor_field
../../gcc/expr.c:6142
0x83edbf store_constructor
../../gcc/expr.c:6726
0x840443 expand_constructor
../../gcc/expr.c:8027
0x82d5bf expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
../../gcc/expr.c:10133
0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
../../gcc/expr.c:9819
0x82dadf expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
../../gcc/expr.c:10942
0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
../../gcc/expr.c:9819
0x83d197 expand_expr
../../gcc/expr.h:276
0x83d197 expand_assignment(tree_node*, tree_node*, bool)
../../gcc/expr.c:4971
0x71e2f3 expand_gimple_stmt_1
../../gcc/cfgexpand.c:3653
0x71e2f3 expand_gimple_stmt
../../gcc/cfgexpand.c:3751
0x721cdb expand_gimple_basic_block
../../gcc/cfgexpand.c:5750
0x726b07 execute
../../gcc/cfgexpand.c:6357

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-08-30 Thread Richard Sandiford
"Jon Beniston"  writes:
> Hi Richard,
>
>> I think the issue is that with TARGET_VECTOR_MODE_SUPPORTED_P false 
>> for V1SI you'll get a SImode vector type and the
>>
>>  if (VECTOR_BOOLEAN_TYPE_P (type_in)
>>  || VECTOR_MODE_P (TYPE_MODE (type_in)))
>>
>>check fails.  Try changing that to || VECTOR_TYPE_P (type_in).
>>The else path should be hopefully dead ...
>>
>>> It looks to me like the other call sites of optab_for_tree_code which 
>>> are passing optab_default are mostly places where GCC is sure it is 
>>> not a shift operation.
>>> Given TYPE_IN is returned from get_vectype_for_scalar_type, is it 
>>> safe to simply pass optab_vector in vect_pattern_recog_1?
>>
>>I guess so.  Can you also try the above?
>
> Changing VECTOR_MODE_P check into VECTOR_TYPE_P check also works for me, I
> verified the following patch could vectorize my dot product case and there
> is no regression on gcc tests on my port.
>
> Meanwhile x86-64 bootstraps OK and no regression on gcc/g++ test.
>
>  Does this look OK?
>
>
> gcc/
> 2017-08-30  Jon Beniston  
> Richard Biener  
>
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index cfdb72c..5ebeac2 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
>loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>   
>if (VECTOR_BOOLEAN_TYPE_P (type_in)
> -  || VECTOR_MODE_P (TYPE_MODE (type_in)))
> +  || VECTOR_TYPE_P (type_in))
>  {
>/* No need to check target support (already checked by the pattern
>   recognition function).  */

It looks like this makes the VECTOR_BOOLEAN_TYPE_P (type_in) check redundant.

Thanks,
Richard


RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-08-30 Thread Richard Biener
On August 30, 2017 5:11:24 PM GMT+02:00, Jon Beniston  wrote:
>Hi Richard,
>
>> I think the issue is that with TARGET_VECTOR_MODE_SUPPORTED_P false 
>> for V1SI you'll get a SImode vector type and the
>>
>>  if (VECTOR_BOOLEAN_TYPE_P (type_in)
>>  || VECTOR_MODE_P (TYPE_MODE (type_in)))
>>
>>check fails.  Try changing that to || VECTOR_TYPE_P (type_in).
>>The else path should be hopefully dead ...
>>
>>> It looks to me like the other call sites of optab_for_tree_code
>which 
>>> are passing optab_default are mostly places where GCC is sure it is 
>>> not a shift operation.
>>> Given TYPE_IN is returned from get_vectype_for_scalar_type, is it 
>>> safe to simply pass optab_vector in vect_pattern_recog_1?
>>
>>I guess so.  Can you also try the above?
>
>Changing VECTOR_MODE_P check into VECTOR_TYPE_P check also works for
>me, I
>verified the following patch could vectorize my dot product case and
>there
>is no regression on gcc tests on my port.
>
>Meanwhile x86-64 bootstraps OK and no regression on gcc/g++ test.
>
> Does this look OK?

OK. 

Thanks, 
Richard. 

>
>gcc/
>2017-08-30  Jon Beniston  
>Richard Biener  
>
>diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
>index cfdb72c..5ebeac2 100644
>--- a/gcc/tree-vect-patterns.c
>+++ b/gcc/tree-vect-patterns.c
>@@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func
>*recog_func,
>   loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>  
>   if (VECTOR_BOOLEAN_TYPE_P (type_in)
>-  || VECTOR_MODE_P (TYPE_MODE (type_in)))
>+  || VECTOR_TYPE_P (type_in))
> {
> /* No need to check target support (already checked by the pattern
>  recognition function).  */
>diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
>index 013fb1f..fc62efb 100644
>--- a/gcc/tree-vect-stmts.c
>+++ b/gcc/tree-vect-stmts.c
>@@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
>scalar_type, unsigned size)
>   else
> simd_mode = mode_for_vector (inner_mode, size / nbytes);
>   nunits = GET_MODE_SIZE (simd_mode) / nbytes;
>-  if (nunits <= 1)
>+  /* NOTE: nunits == 1 is allowed to support single element vector
>types.
>*/
>+  if (nunits < 1)
> return NULL_TREE;
> 
>   vectype = build_vector_type (scalar_type, nunits);



RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-08-30 Thread Jon Beniston
Hi Richard,

> I think the issue is that with TARGET_VECTOR_MODE_SUPPORTED_P false 
> for V1SI you'll get a SImode vector type and the
>
>  if (VECTOR_BOOLEAN_TYPE_P (type_in)
>  || VECTOR_MODE_P (TYPE_MODE (type_in)))
>
>check fails.  Try changing that to || VECTOR_TYPE_P (type_in).
>The else path should be hopefully dead ...
>
>> It looks to me like the other call sites of optab_for_tree_code which 
>> are passing optab_default are mostly places where GCC is sure it is 
>> not a shift operation.
>> Given TYPE_IN is returned from get_vectype_for_scalar_type, is it 
>> safe to simply pass optab_vector in vect_pattern_recog_1?
>
>I guess so.  Can you also try the above?

Changing VECTOR_MODE_P check into VECTOR_TYPE_P check also works for me, I
verified the following patch could vectorize my dot product case and there
is no regression on gcc tests on my port.

Meanwhile x86-64 bootstraps OK and no regression on gcc/g++ test.

 Does this look OK?


gcc/
2017-08-30  Jon Beniston  
Richard Biener  

diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index cfdb72c..5ebeac2 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
   loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
  
   if (VECTOR_BOOLEAN_TYPE_P (type_in)
-  || VECTOR_MODE_P (TYPE_MODE (type_in)))
+  || VECTOR_TYPE_P (type_in))
 {
   /* No need to check target support (already checked by the pattern
  recognition function).  */
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 013fb1f..fc62efb 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
scalar_type, unsigned size)
   else
 simd_mode = mode_for_vector (inner_mode, size / nbytes);
   nunits = GET_MODE_SIZE (simd_mode) / nbytes;
-  if (nunits <= 1)
+  /* NOTE: nunits == 1 is allowed to support single element vector types.
*/
+  if (nunits < 1)
 return NULL_TREE;
 
   vectype = build_vector_type (scalar_type, nunits);




RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-08-30 Thread Richard Biener
On Wed, 30 Aug 2017, Jon Beniston wrote:

> Hi Richard,
> 
> >> Ah, that's what I first tried, and mistakenly sent the patch against
> that.
> >> 
> >> That did help the initial problem, but then I needed to add a lot of 
> >> support for the V1SI type in the backend (which just duplicated SI 
> >> patterns) and there are a few places where GCC seems to have sanity 
> >> checks against vectors that are only one element wide. A dot product 
> >> producing a scalar result seems natural.
> >
> >Where do you need V1SI types?  The vectorizer should perform a SI extract 
> >from V1SI here.  You need to elaborate a bit here.
> 
> After adding single element vector type support in the middle-end, at the
> tree level, V1SI vector types would be the result type for the dot product
> if the input operands were V2HI.
> 
> During RTL expansion, if V1SImode is accepted in the
> TARGET_VECTOR_MODE_SUPPORTED_P hook, then V1SImode will be used after RTL
> expansion and V1SImode patterns are needed, although they are actually
> duplicates of SImode patterns.  I have now removed V1SImode from
> TARGET_VECTOR_MODE_SUPPORTED_P, and they are turned into SI mode so there is
> no need for the V1SI patterns now.
> 
> >> The vectorizer doesn't really support vector->scalar ops so V1SI 
> >> feels more natural here.  That is, your patch looks a bit ugly -- 
> >> there's nothing wrong with V1SI vector types.
> 
> I agree "there's nothing wrong with V1SI vector types" and the original
> patch was trying to let vector reduction operation accept V1SI vector types.
> 
> However, if the only change was "<= 1" into "< 1" in
> get_vectype_for_scalar_type_and_size, then it will cause a GCC assertion
> failure in optab_for_tree_node for some shift operations. It seems all such
> failures come from:
> 
> vect_pattern_recog_1:4185
>   optab = optab_for_tree_code (code, type_in, optab_default);
> 
> The SUB_TYPE passed to optab_for_tree_code is "optab_default", however it is
> possible that vect_pattern_recog_1 will generate gimple containing shift
> operations, for example vect_recog_divmod_pattern will generate RSHIFT_EXPR,
> while shift operation requires sub_type to be not optab_default?

I think the issue is that with TARGET_VECTOR_MODE_SUPPORTED_P false
for V1SI you'll get a SImode vector type and the

  if (VECTOR_BOOLEAN_TYPE_P (type_in)
  || VECTOR_MODE_P (TYPE_MODE (type_in)))

check fails.  Try changing that to || VECTOR_TYPE_P (type_in).
The else path should be hopefully dead ...

With your TARGET_VECTOR_MODE_SUPPORTED_P setting you are essentially
making the vectorizer mix what was desired as "generic" vectorization
and regular vectorization...  if it works, fine ;)

>   gcc/optabs-tree.h:
> 
>   /* An extra flag to control optab_for_tree_code's behavior.  This is
> needed to
>  distinguish between machines with a vector shift that takes a scalar
> for the
>  shift amount vs. machines that take a vector for the shift amount.  */
> 
>   enum optab_subtype
>   {
> optab_default,
> optab_scalar,
> optab_vector
>   };
> 
> It looks to me like the other call sites of optab_for_tree_code which are
> passing optab_default are mostly places where GCC is sure it is not a shift
> operation.
> Given TYPE_IN is returned from get_vectype_for_scalar_type, is it safe to
> simply pass optab_vector in vect_pattern_recog_1?

I guess so.  Can you also try the above?

> I have verified the following middle-end fix also works for my port, it
> passed also x86-64 bootstrap and there is no regression in gcc/g++
> regression.
> 
> How is this new patch?
> 
> gcc/
> 2017-08-30  Jon Beniston  
> 
> * tree-vect-patterns.c (vect_pattern_recog_1): Pass optab_vector
> when
> calling optab_for_tree_code.
> * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Allow
> single
> element vector types.
> 
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index cfdb72c..4b39cb6 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -4182,7 +4182,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
>   code = CALL_EXPR;
> }
> 
> -  optab = optab_for_tree_code (code, type_in, optab_default);
> +  optab = optab_for_tree_code (code, type_in, optab_vector);
>vec_mode = TYPE_MODE (type_in);
>if (!optab
>|| (icode = optab_handler (optab, vec_mode)) == CODE_FOR_nothing
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 013fb1f..fc62efb 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> scalar_type, unsigned size)
>else
>  simd_mode = mode_for_vector (inner_mode, size / nbytes);
>nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> -  if (nunits <= 1)
> +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> */
> +  if (nunits < 1)
>  return NULL_TREE;
> 
>

RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-08-30 Thread Jon Beniston
Hi Richard,

>> Ah, that's what I first tried, and mistakenly sent the patch against
that.
>> 
>> That did help the initial problem, but then I needed to add a lot of 
>> support for the V1SI type in the backend (which just duplicated SI 
>> patterns) and there are a few places where GCC seems to have sanity 
>> checks against vectors that are only one element wide. A dot product 
>> producing a scalar result seems natural.
>
>Where do you need V1SI types?  The vectorizer should perform a SI extract 
>from V1SI here.  You need to elaborate a bit here.

After adding single element vector type support in the middle-end, at the
tree level, V1SI vector types would be the result type for the dot product
if the input operands were V2HI.

During RTL expansion, if V1SImode is accepted in the
TARGET_VECTOR_MODE_SUPPORTED_P hook, then V1SImode will be used after RTL
expansion and V1SImode patterns are needed, although they are actually
duplicates of SImode patterns.  I have now removed V1SImode from
TARGET_VECTOR_MODE_SUPPORTED_P, and they are turned into SI mode so there is
no need for the V1SI patterns now.

>> The vectorizer doesn't really support vector->scalar ops so V1SI 
>> feels more natural here.  That is, your patch looks a bit ugly -- 
>> there's nothing wrong with V1SI vector types.

I agree "there's nothing wrong with V1SI vector types" and the original
patch was trying to let vector reduction operation accept V1SI vector types.

However, if the only change was "<= 1" into "< 1" in
get_vectype_for_scalar_type_and_size, then it will cause a GCC assertion
failure in optab_for_tree_node for some shift operations. It seems all such
failures come from:

vect_pattern_recog_1:4185
  optab = optab_for_tree_code (code, type_in, optab_default);

The SUB_TYPE passed to optab_for_tree_code is "optab_default", however it is
possible that vect_pattern_recog_1 will generate gimple containing shift
operations, for example vect_recog_divmod_pattern will generate RSHIFT_EXPR,
while shift operation requires sub_type to be not optab_default?

  gcc/optabs-tree.h:

  /* An extra flag to control optab_for_tree_code's behavior.  This is
needed to
 distinguish between machines with a vector shift that takes a scalar
for the
 shift amount vs. machines that take a vector for the shift amount.  */

  enum optab_subtype
  {
optab_default,
optab_scalar,
optab_vector
  };

It looks to me like the other call sites of optab_for_tree_code which are
passing optab_default are mostly places where GCC is sure it is not a shift
operation.
Given TYPE_IN is returned from get_vectype_for_scalar_type, is it safe to
simply pass optab_vector in vect_pattern_recog_1?

I have verified the following middle-end fix also works for my port, it
passed also x86-64 bootstrap and there is no regression in gcc/g++
regression.

How is this new patch?

gcc/
2017-08-30  Jon Beniston  

* tree-vect-patterns.c (vect_pattern_recog_1): Pass optab_vector
when
calling optab_for_tree_code.
* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Allow
single
element vector types.

diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index cfdb72c..4b39cb6 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -4182,7 +4182,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
  code = CALL_EXPR;
}

-  optab = optab_for_tree_code (code, type_in, optab_default);
+  optab = optab_for_tree_code (code, type_in, optab_vector);
   vec_mode = TYPE_MODE (type_in);
   if (!optab
   || (icode = optab_handler (optab, vec_mode)) == CODE_FOR_nothing
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 013fb1f..fc62efb 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
scalar_type, unsigned size)
   else
 simd_mode = mode_for_vector (inner_mode, size / nbytes);
   nunits = GET_MODE_SIZE (simd_mode) / nbytes;
-  if (nunits <= 1)
+  /* NOTE: nunits == 1 is allowed to support single element vector types.
*/
+  if (nunits < 1)
 return NULL_TREE;

   vectype = build_vector_type (scalar_type, nunits);




RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-08-28 Thread Richard Biener
On Mon, 28 Aug 2017, Jon Beniston wrote:

> Hi Richard,
> 
> >-  if (nunits < 1) /* Support V1SI.  */
> >+  if (nunits < 1 || (nunits == 1 && !reduct_p))
> > return NULL_TREE;
> >
> >doesn't seem to be against trunk which has
> >
> >  if (nunits <= 1)
> >return NULL_TREE;
> >
> >so what happens if you just change that to
> >
> >  if (nunits < 1)
> >return NULL_TREE;
> 
> Ah, that's what I first tried, and mistakenly sent the patch against that.
> 
> That did help the initial problem, but then I needed to add a lot of support
> for the V1SI type in the backend (which just duplicated SI patterns) and
> there are a few places where GCC seems to have sanity checks against vectors
> that are only one element wide. A dot product producing a scalar result
> seems natural.

Where do you need V1SI types?  The vectorizer should perform a SI extract
from V1SI here.  You need to elaborate a bit here.

> 
> Also, as well as ARC benefitting from this, I think the TI c6x port would
> too. That currently has a sdot_prodv2hi pattern that uses SI and V2HI types.

The vectorizer doesn't really support vector->scalar ops so V1SI feels
more natural here.  That is, your patch looks a bit ugly -- there's
nothing wrong with V1SI vector types.  So maybe instead of adjusting
that function the respective caller when facing a lane-reducing op
such as DOT_PROD needs to consider scalar result types.

Richard.

> Cheers,
> Jon
> 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-08-28 Thread Jon Beniston
Hi Richard,

>-  if (nunits < 1) /* Support V1SI.  */
>+  if (nunits < 1 || (nunits == 1 && !reduct_p))
> return NULL_TREE;
>
>doesn't seem to be against trunk which has
>
>  if (nunits <= 1)
>return NULL_TREE;
>
>so what happens if you just change that to
>
>  if (nunits < 1)
>return NULL_TREE;

Ah, that's what I first tried, and mistakenly sent the patch against that.

That did help the initial problem, but then I needed to add a lot of support
for the V1SI type in the backend (which just duplicated SI patterns) and
there are a few places where GCC seems to have sanity checks against vectors
that are only one element wide. A dot product producing a scalar result
seems natural. 

Also, as well as ARC benefitting from this, I think the TI c6x port would
too. That currently has a sdot_prodv2hi pattern that uses SI and V2HI types.

Cheers,
Jon




Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-08-28 Thread Richard Biener
On Sun, 27 Aug 2017, Jon Beniston wrote:

> Hi,
> 
> I have an out-of-tree GCC port and it is struggling supporting
> auto-vectorization on some dot product instructions.  For example, I have an
> instruction that takes three operands which are all 32-bit general
> registers. The second and third operands will be treated as V2HI then do dot
> product, and then generate an SI result which is then added to the first
> operand which is SI as well.
> 
> I do see there is dot product recognizer in tree-vect-patters.c, however, I
> found the following testcase still can't be auto-vectorized on my port which
> has implemented all necessary dot product standard patterns.  This testcase
> can't be auto-vectorized on other targets that have similar V2HI dot product
> instructions as well, for example ARC.
> 
> === test.c ===
> #define K 4
> #define M 4
> #define N 256
> int in[N*K][M];
> int out[K];
> int coeff[N][M];
> void
> foo (void)
> {
>   int i, j, k;
>   int sum;
>   for (k = 0; k < K; k++)
> {
>   sum = 0;
>   for (j = 0; j < M; j++)
> for (i = 0; i < N; i++)
>   sum += in[i+k][j] * coeff[i][j];
>   out[k] = sum;
> }
> }
> ===
>   The reason that auto-vectorizer doesn't work seems to be that GCC doesn't
> support single-element vector types in get_vectype_for_scalar_type_and_size.
> tree-vect-stmts.c: get_vectype_for_scalar_type_and_size
>   ...
>   if (nunits <= 1)
> return NULL_TREE;
> 
> So, I am thinking this actually should be relaxed to support more cases.  At
> least on vector reduction operations which normally will have scalar result
> with wider types than the element type of input operands.
> 
> I have tried to make the auto-vectorizer work for my V2HI dot product case,
> with the patch attached. Is this the correct approach?

Hum,

@@ -9039,7 +9040,7 @@
   else
 simd_mode = mode_for_vector (inner_mode, size / nbytes);
   nunits = GET_MODE_SIZE (simd_mode) / nbytes;
-  if (nunits < 1) /* Support V1SI.  */
+  if (nunits < 1 || (nunits == 1 && !reduct_p))
 return NULL_TREE;

   vectype = build_vector_type (scalar_type, nunits);

doesn't seem to be against trunk which has

  if (nunits <= 1)
return NULL_TREE;

so what happens if you just change that to

  if (nunits < 1)
return NULL_TREE;

?

Richard.

> Cheers,
> Jon
> 
> gcc/
> 2017-08-27  Jon Beniston 
> 
> * tree-vectorizer.h (get_vectype_for_scalar_type): New optional
> parameter declaration.
> * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Add new
> optional parameter "reduct_p".  Support single element vector types
> if it is true.
> (get_vectype_for_scalar_type): Add new parameter "reduct_p".
> * tree-vect-patterns.c (vect_pattern_recog_1): Pass new parameter
> "reduct_p".
> * tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
> (vect_model_reduction_cost): Likewise.
> (get_initial_def_for_induction): Likewise.
> (vect_create_epilog_for_reduction): Likewise.
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)