https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66285

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization

--- Comment #1 from vries at gcc dot gnu.org ---
FWIW, this patch moves pass_parallelize_loops after pass_vectorize:
...
diff --git a/gcc/passes.def b/gcc/passes.def
index 4690e23..f0629ff 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -243,14 +243,14 @@ along with GCC; see the file COPYING3.  If not see
              NEXT_PASS (pass_dce);
          POP_INSERT_PASSES ()
          NEXT_PASS (pass_iv_canon);
-         NEXT_PASS (pass_parallelize_loops);
-         PUSH_INSERT_PASSES_WITHIN (pass_parallelize_loops)
-             NEXT_PASS (pass_expand_omp_ssa);
-         POP_INSERT_PASSES ()
          NEXT_PASS (pass_if_conversion);
          /* pass_vectorize must immediately follow pass_if_conversion.
             Please do not add any other passes in between.  */
          NEXT_PASS (pass_vectorize);
+         NEXT_PASS (pass_parallelize_loops);
+         PUSH_INSERT_PASSES_WITHIN (pass_parallelize_loops)
+             NEXT_PASS (pass_expand_omp_ssa);
+         POP_INSERT_PASSES ()
           PUSH_INSERT_PASSES_WITHIN (pass_vectorize)
              NEXT_PASS (pass_dce);
           POP_INSERT_PASSES ()
...

And that makes the problem go away (btw, dump file names need adapting in
investigate.sh):
...
$ ./investigate.sh 
parloops_factor: 0, index_type: int:
  vectorized: 1, parallelized: 0
parloops_factor: 0, index_type: unsigned int:
  vectorized: 1, parallelized: 0
parloops_factor: 0, index_type: long:
  vectorized: 1, parallelized: 0
parloops_factor: 0, index_type: unsigned long:
  vectorized: 1, parallelized: 0
parloops_factor: 2, index_type: int:
  vectorized: 1, parallelized: 1
parloops_factor: 2, index_type: unsigned int:
  vectorized: 1, parallelized: 1
parloops_factor: 2, index_type: long:
  vectorized: 1, parallelized: 1
parloops_factor: 2, index_type: unsigned long:
  vectorized: 1, parallelized: 1
...

Of course, the patch means we're no longer vectorizing parallelized loops, but
parallelizing vectorized loops.
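
For reference, a minimal sketch of the kind of loop such a test would exercise. This is not the testcase from the PR or from investigate.sh. The function name add_arrays and the INDEX_TYPE macro are hypothetical, and it is an assumption that parloops_factor in the output above corresponds to the value passed to -ftree-parallelize-loops.

```c
/* Hypothetical kernel, not taken from the PR: every iteration is
   independent and the accesses are unit-stride, so the loop is a
   candidate for both pass_vectorize and pass_parallelize_loops.  */

/* Assumption: the index type is what investigate.sh varies.  Override
   with e.g. -DINDEX_TYPE="unsigned long".  */
#ifndef INDEX_TYPE
#define INDEX_TYPE int
#endif

void
add_arrays (unsigned int *restrict c, const unsigned int *restrict a,
            const unsigned int *restrict b, INDEX_TYPE n)
{
  INDEX_TYPE i;

  /* restrict rules out aliasing between c, a and b, which removes one
     obstacle to vectorizing and parallelizing the loop.  */
  for (i = 0; i < n; i++)
    c[i] = a[i] + b[i];
}
```

Compiling this with something like -O2 -ftree-vectorize -ftree-parallelize-loops=2 and inspecting the vect and parloops dumps should show which of the two passes picked up the loop. The exact dump file names depend on where the passes sit in passes.def, as noted above.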
