https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66285
vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization

--- Comment #1 from vries at gcc dot gnu.org ---
FWIW, this patch puts pass_parallelize_loops after pass_vectorize:
...
diff --git a/gcc/passes.def b/gcc/passes.def
index 4690e23..f0629ff 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -243,14 +243,14 @@ along with GCC; see the file COPYING3.  If not see
          NEXT_PASS (pass_dce);
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_iv_canon);
-      NEXT_PASS (pass_parallelize_loops);
-      PUSH_INSERT_PASSES_WITHIN (pass_parallelize_loops)
-         NEXT_PASS (pass_expand_omp_ssa);
-      POP_INSERT_PASSES ()
       NEXT_PASS (pass_if_conversion);
       /* pass_vectorize must immediately follow pass_if_conversion.
          Please do not add any other passes in between.  */
       NEXT_PASS (pass_vectorize);
+      NEXT_PASS (pass_parallelize_loops);
+      PUSH_INSERT_PASSES_WITHIN (pass_parallelize_loops)
+         NEXT_PASS (pass_expand_omp_ssa);
+      POP_INSERT_PASSES ()
       PUSH_INSERT_PASSES_WITHIN (pass_vectorize)
          NEXT_PASS (pass_dce);
       POP_INSERT_PASSES ()
...

And that makes the problem go away (btw, dump file names need adapting in
investigate.sh):
...
$ ./investigate.sh
parloops_factor: 0, index_type: int: vectorized: 1, parallelized: 0
parloops_factor: 0, index_type: unsigned int: vectorized: 1, parallelized: 0
parloops_factor: 0, index_type: long: vectorized: 1, parallelized: 0
parloops_factor: 0, index_type: unsigned long: vectorized: 1, parallelized: 0
parloops_factor: 2, index_type: int: vectorized: 1, parallelized: 1
parloops_factor: 2, index_type: unsigned int: vectorized: 1, parallelized: 1
parloops_factor: 2, index_type: long: vectorized: 1, parallelized: 1
parloops_factor: 2, index_type: unsigned long: vectorized: 1, parallelized: 1
...

Of course, the patch means we're no longer vectorizing parallelized loops, but
parallelizing vectorized loops.
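
For readers without the attachments at hand: the test case and investigate.sh
are not reproduced in this comment, so the following is only a minimal sketch
of the kind of loop that both passes could plausibly handle. The names
INDEX_TYPE, init_arrays, add_arrays and check_sum are hypothetical and not
taken from the PR, and it is an assumption that parloops_factor corresponds to
the value given to -ftree-parallelize-loops.

```c
#include <assert.h>

/* Hypothetical stand-in for the index_type variations in the output above;
   override with e.g. -DINDEX_TYPE='unsigned long'.  */
#ifndef INDEX_TYPE
#define INDEX_TYPE int
#endif

#define N 1000

static unsigned int a[N], b[N], c[N];

/* Fill the inputs with known values so the result can be checked.  */
void
init_arrays (void)
{
  for (INDEX_TYPE i = 0; i < N; i++)
    {
      a[i] = (unsigned int) i;
      b[i] = 2u * (unsigned int) i;
    }
}

/* Independent iterations with unit stride: a candidate for both
   pass_vectorize and pass_parallelize_loops.  */
void
add_arrays (void)
{
  for (INDEX_TYPE i = 0; i < N; i++)
    c[i] = a[i] + b[i];
}

/* Verify c[i] == a[i] + b[i] == 3 * i for every element.  */
void
check_sum (void)
{
  for (INDEX_TYPE i = 0; i < N; i++)
    assert (c[i] == 3u * (unsigned int) i);
}
```

Something like "gcc -O2 -ftree-vectorize -ftree-parallelize-loops=2
-fdump-tree-vect-details -fdump-tree-parloops-details" should produce the two
dump files to inspect. Whether this particular loop is reported as vectorized
and parallelized depends on the GCC version and target, and the dump file
names change with the pass order.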