Re: [PATCH] Avoid useless work in loop vectorization

2015-11-19 Thread Richard Biener
On Wed, 18 Nov 2015, Alan Lawrence wrote:

> On 13/11/15 08:41, Richard Biener wrote:
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> > 
> > Richard.
> > 
> > 2015-11-13  Richard Biener  
> > 
> > * tree-vect-loop.c (vect_analyze_loop_2): Add fatal parameter.
> > Signal fatal failure if early checks fail.
> > (vect_analyze_loop): If vect_analyze_loop_2 fails fatally
> > do not bother testing further vector sizes.
> 
> It seems that on AArch64 this causes:
> 
> FAIL: gcc.dg/vect/vect-outer-1-big-array.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1-big-array.c scan-tree-dump-times vect "grouped
> access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1.c -flto -ffat-lto-objects  scan-tree-dump-times
> vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1.c scan-tree-dump-times vect "grouped access in
> outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1a-big-array.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1a-big-array.c scan-tree-dump-times vect "grouped
> access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1a.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1a.c scan-tree-dump-times vect "grouped access in
> outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1b-big-array.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1b-big-array.c scan-tree-dump-times vect "grouped
> access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1b.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-1b.c scan-tree-dump-times vect "grouped access in
> outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-2b.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-2b.c scan-tree-dump-times vect "grouped access in
> outer loop" 2
> FAIL: gcc.dg/vect/vect-outer-3b.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "grouped access in outer loop" 4
> FAIL: gcc.dg/vect/vect-outer-3b.c scan-tree-dump-times vect "grouped access in
> outer loop" 4
> 
> Still there on r230556, I haven't dug any further yet.

Probably a testsuite issue as we have

/* { dg-final { scan-tree-dump-times "grouped access in outer loop" 1 "vect" { target { ! vect_multiple_sizes } } } } */
/* { dg-final { scan-tree-dump-times "grouped access in outer loop" 2 "vect" { target vect_multiple_sizes } } } */

and we now possibly terminate early, before considering the other vector
size(s), so the message may be dumped fewer times than these tests expect.

Richard.
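
A rough illustration of the effect described above, not taken from GCC:
the sketch below models a target with two vector sizes where the
size-independent check fails and logs the scanned message once per
attempt. The names analyze_once, count_messages and stop_on_fatal are
hypothetical, and the sizes 16 and 8 are only example values.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one analysis attempt at a given vector size.
// The early, size-independent check fails and logs the scanned message.
static bool analyze_once(unsigned /*vector_size*/, bool &fatal,
                         std::vector<std::string> &dump)
{
  fatal = true;  // this failure does not depend on the vector size
  dump.push_back("grouped access in outer loop");
  return false;
}

// Try each candidate size in turn and return how often the message was
// dumped.  stop_on_fatal models the behaviour after the patch.
static int count_messages(const std::vector<unsigned> &sizes,
                          bool stop_on_fatal)
{
  assert(!sizes.empty());  // at least one size is always tried
  std::vector<std::string> dump;
  for (unsigned size : sizes)
    {
      bool fatal = false;
      if (analyze_once(size, fatal, dump))
        break;                      // analysis succeeded
      if (stop_on_fatal && fatal)
        break;                      // retrying another size cannot help
    }
  return static_cast<int>(dump.size());
}
```

Under these assumptions, count_messages({16, 8}, false) yields 2 and
count_messages({16, 8}, true) yields 1, which matches the shape of the
failures reported for vect_multiple_sizes targets.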


Re: [PATCH] Avoid useless work in loop vectorization

2015-11-18 Thread Alan Lawrence

On 13/11/15 08:41, Richard Biener wrote:


> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
>
> Richard.
>
> 2015-11-13  Richard Biener
>
> * tree-vect-loop.c (vect_analyze_loop_2): Add fatal parameter.
> Signal fatal failure if early checks fail.
> (vect_analyze_loop): If vect_analyze_loop_2 fails fatally
> do not bother testing further vector sizes.


It seems that on AArch64 this causes:

FAIL: gcc.dg/vect/vect-outer-1-big-array.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "grouped access in outer loop" 2
FAIL: gcc.dg/vect/vect-outer-1-big-array.c scan-tree-dump-times vect "grouped 
access in outer loop" 2
FAIL: gcc.dg/vect/vect-outer-1.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "grouped access in outer loop" 2
FAIL: gcc.dg/vect/vect-outer-1.c scan-tree-dump-times vect "grouped access in 
outer loop" 2
FAIL: gcc.dg/vect/vect-outer-1a-big-array.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "grouped access in outer loop" 2
FAIL: gcc.dg/vect/vect-outer-1a-big-array.c scan-tree-dump-times vect "grouped 
access in outer loop" 2
FAIL: gcc.dg/vect/vect-outer-1a.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "grouped access in outer loop" 2
FAIL: gcc.dg/vect/vect-outer-1a.c scan-tree-dump-times vect "grouped access in 
outer loop" 2
FAIL: gcc.dg/vect/vect-outer-1b-big-array.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "grouped access in outer loop" 2
FAIL: gcc.dg/vect/vect-outer-1b-big-array.c scan-tree-dump-times vect "grouped 
access in outer loop" 2
FAIL: gcc.dg/vect/vect-outer-1b.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "grouped access in outer loop" 2
FAIL: gcc.dg/vect/vect-outer-1b.c scan-tree-dump-times vect "grouped access in 
outer loop" 2
FAIL: gcc.dg/vect/vect-outer-2b.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "grouped access in outer loop" 2
FAIL: gcc.dg/vect/vect-outer-2b.c scan-tree-dump-times vect "grouped access in 
outer loop" 2
FAIL: gcc.dg/vect/vect-outer-3b.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "grouped access in outer loop" 4
FAIL: gcc.dg/vect/vect-outer-3b.c scan-tree-dump-times vect "grouped access in 
outer loop" 4


Still there on r230556, I haven't dug any further yet.

Thanks, Alan



[PATCH] Avoid useless work in loop vectorization

2015-11-13 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2015-11-13  Richard Biener  

* tree-vect-loop.c (vect_analyze_loop_2): Add fatal parameter.
Signal fatal failure if early checks fail.
(vect_analyze_loop): If vect_analyze_loop_2 fails fatally
do not bother testing further vector sizes.

Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c (revision 230260)
+++ gcc/tree-vect-loop.c (working copy)
@@ -1709,13 +1709,16 @@ vect_analyze_loop_operations (loop_vec_i
for it.  The different analyses will record information in the
loop_vec_info struct.  */
 static bool
-vect_analyze_loop_2 (loop_vec_info loop_vinfo)
+vect_analyze_loop_2 (loop_vec_info loop_vinfo, bool &fatal)
 {
   bool ok;
   int max_vf = MAX_VECTORIZATION_FACTOR;
   int min_vf = 2;
   unsigned int n_stmts = 0;
 
+  /* The first group of checks is independent of the vector size.  */
+  fatal = true;
+
   /* Find all data references in the loop (which correspond to vdefs/vuses)
  and analyze their evolution in the loop.  */
 
@@ -1795,7 +1798,6 @@ vect_analyze_loop_2 (loop_vec_info loop_
 
   /* Classify all cross-iteration scalar data-flow cycles.
  Cross-iteration cycles caused by virtual phis are analyzed separately.  */
-
   vect_analyze_scalar_cycles (loop_vinfo);
 
   vect_pattern_recog (loop_vinfo);
@@ -1825,6 +1827,9 @@ vect_analyze_loop_2 (loop_vec_info loop_
   return false;
 }
 
+  /* While the rest of the analysis below depends on it in some way.  */
+  fatal = false;
+
   /* Analyze data dependences between the data-refs in the loop
  and adjust the maximum vectorization factor according to
  the dependences.
@@ -2118,7 +2169,8 @@ vect_analyze_loop (struct loop *loop)
  return NULL;
}
 
-  if (vect_analyze_loop_2 (loop_vinfo))
+  bool fatal = false;
+  if (vect_analyze_loop_2 (loop_vinfo, fatal))
{
  LOOP_VINFO_VECTORIZABLE_P (loop_vinfo) = 1;
 
@@ -2128,7 +2180,8 @@ vect_analyze_loop (struct loop *loop)
   destroy_loop_vec_info (loop_vinfo, true);
 
   vector_sizes &= ~current_vector_size;
-  if (vector_sizes == 0
+  if (fatal
+ || vector_sizes == 0
  || current_vector_size == 0)
return NULL;
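
For readers without the GCC sources at hand, the control flow of the
patch above can be sketched standalone. This is a simplified model under
stated assumptions, not the vectorizer's code: analyze, pick_size,
early_check_fails and max_ok are hypothetical names, candidate sizes are
encoded as a bitmask of powers of two, and the biggest remaining size is
tried first.

```cpp
#include <cassert>

// Hypothetical analysis step.  The first group of checks is independent
// of the vector size, so a failure there is reported as fatal; later
// checks depend on the size and are modelled here as size <= max_ok.
static bool analyze(unsigned size, bool early_check_fails,
                    unsigned max_ok, bool &fatal)
{
  fatal = true;                // size-independent checks
  if (early_check_fails)
    return false;
  fatal = false;               // size-dependent checks from here on
  return size <= max_ok;
}

// Try the sizes in sizes_mask from biggest to smallest.  Returns the
// first size that analyzes successfully, or 0 if none does.  The number
// of attempts made is stored in tries.
static unsigned pick_size(unsigned sizes_mask, bool early_check_fails,
                          unsigned max_ok, int &tries)
{
  assert(sizes_mask != 0);
  tries = 0;
  while (sizes_mask != 0)
    {
      // Isolate the highest set bit, i.e. the biggest remaining size.
      unsigned size = 1u;
      while ((size << 1) != 0 && (size << 1) <= sizes_mask)
        size <<= 1;

      ++tries;
      bool fatal = false;
      if (analyze(size, early_check_fails, max_ok, fatal))
        return size;

      sizes_mask &= ~size;     // this size has been tried
      if (fatal)
        break;                 // no other size can succeed either
    }
  return 0;
}
```

With a mask of 16 | 8, a failing early check stops after one attempt,
while a size-dependent failure at 16 still falls back to trying 8.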