Thanks a lot Richard for your review.

Here is an updated patch which is no longer gated by force_vectorize.
I added a test for outer loops in vect_enhance_data_refs_alignment,
and it returns false for them because, in general, we cannot improve
DR alignment through outer-loop peeling. So I assume that only
versioning for alignment can be applied on targets that do not support
unaligned memory access.
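
As a rough illustration of why outer-loop peeling does not help, consider
an inner-loop access such as in[j+i]. The sketch below is a toy model, not
the vectorizer's analysis: it assumes `in` starts on a vector boundary and
that a vector holds `vf` elements; the function name and its parameters are
made up for this example.

```c
/* Toy model, not GCC code.  After peeling `peel` outer iterations, the
   vector load for inner iteration j starts at element index peel + j
   (plus a multiple of vf for later vector iterations).  Count the inner
   iterations whose load is still misaligned.  */
int misaligned_inner_steps(int peel, int m, int vf)
{
    int count = 0;
    for (int j = 0; j < m; j++) {
        if ((peel + j) % vf != 0)
            count++;   /* this inner step still starts off a vector boundary */
    }
    return count;
}
```

With m = 8 and vf = 4, every peel count from 0 to 3 leaves 6 of the 8 inner
steps misaligned, so no choice of peel count aligns the access.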

I did not change the tests for outer loops in slpeel_can_duplicate_loop_p
as you proposed, since it is not called outside vectorization.

I also noticed one unresolved issue with outer-loop peeling: we do not
consider the remainder loop for possible vectorization of its inner loop,
as the following example shows:

  for (i = 0; i < n; i++) {
    diff = 0;
    for (j = 0; j < M; j++) {
      diff += in[j+i]*coeff[j];
    }
    out[i] = diff;
  }

Is it worth fixing?
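
To make the question concrete, here is a hand-written sketch of the shape
the loop nest takes once the outer loop is vectorized and peeled for the
number of iterations. It is only a model under stated assumptions, not the
code GCC generates: VF, the fixed value of M, the int element type, and the
names fir_scalar and fir_outer_peeled are all invented for illustration.

```c
#define M  8   /* hypothetical inner trip count */
#define VF 4   /* hypothetical vectorization factor */

/* Scalar reference: the loop nest from the example.
   `in` must hold at least n + M - 1 elements.  */
void fir_scalar(const int *in, const int *coeff, int *out, int n)
{
    for (int i = 0; i < n; i++) {
        int diff = 0;
        for (int j = 0; j < M; j++)
            diff += in[j + i] * coeff[j];
        out[i] = diff;
    }
}

/* Model of outer-loop vectorization: VF outer iterations are processed
   per step (one per lane), followed by a scalar remainder loop.  */
void fir_outer_peeled(const int *in, const int *coeff, int *out, int n)
{
    int i = 0;

    /* Main loop: lanes correspond to consecutive values of i.  */
    for (; i + VF <= n; i += VF) {
        int diff[VF] = { 0 };
        for (int j = 0; j < M; j++)
            for (int lane = 0; lane < VF; lane++)
                diff[lane] += in[j + i + lane] * coeff[j];
        for (int lane = 0; lane < VF; lane++)
            out[i + lane] = diff[lane];
    }

    /* Remainder: fewer than VF outer iterations are left.  The inner
       j loop here is an ordinary reduction that could be considered
       for inner-loop vectorization on its own.  */
    for (; i < n; i++) {
        int diff = 0;
        for (int j = 0; j < M; j++)
            diff += in[j + i] * coeff[j];
        out[i] = diff;
    }
}
```

The remainder loop at the end is the part in question: it runs the inner
loop in scalar form even though that inner loop is a plain reduction.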

2015-06-16  Yuri Rumyantsev  <ysrum...@gmail.com>

* tree-vect-loop-manip.c (rename_variables_in_bb): Add argument
to allow renaming of PHI arguments on edges incoming from outer
loop header, add corresponding check before start PHI iterator.
(slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool
variable DUPLICATE_OUTER_LOOP and set it to true for outer loops
with true force_vectorize.  Set-up dominator for outer loop too.
Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb.
(slpeel_can_duplicate_loop_p): Allow duplicate of outer loop if it
was marked with force_vectorize and has restricted cfg.
* tree-vect-data-refs.c (vector_alignment_reachable_p): Alignment can
not be reachable for outer loops.
(vect_enhance_data_refs_alignment): Add test on true value of
do_peeling.

gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-outer-simd-2.c: New test.

2015-06-09 16:26 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Mon, Jun 8, 2015 at 12:27 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>> Hi All,
>>
>> Here is a simple fix which allows duplication of outer loops to
>> perform peeling for number of iterations if outer loop is marked with
>> pragma omp simd.
>>
>> Bootstrap and regression testing did not show any new failures.
>> Is it OK for trunk?
>
> Hmm, I don't remember needing to adjust rename_variables_in_bb
> when teaching loop distribution to call slpeel_tree_duplicate_to_edge_cfg
> on non-innermost loops...  (I just copied, I never called
> slpeel_can_duplicate_loop_p though).
>
> So - you should just remove the loop->inner condition from
> slpeel_can_duplicate_loop_p as it is used by non-vectorizer
> code as well (yeah, I never merged the nested loop support
> for loop distribution...).
>
> Index: tree-vect-loop.c
> ===================================================================
> --- tree-vect-loop.c    (revision 224100)
> +++ tree-vect-loop.c    (working copy)
> @@ -1879,6 +1879,10 @@
>        return false;
>      }
>
> +  /* Peeling for alignment is not supported for outer-loop vectorization.  */
> +  if (LOOP_VINFO_LOOP (loop_vinfo)->inner)
> +    LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) = 0;
>
> I think you can't simply do this - if vect_enhance_data_refs_alignment
> decided to peel for alignment then it has adjusted the DRs alignment
> info already.  So instead of the above simply disallow peeling for
> alignment in vect_enhance_data_refs_alignment?  Thus add
> || ->inner to
>
>   /* Check if we can possibly peel the loop.  */
>   if (!vect_can_advance_ivs_p (loop_vinfo)
>       || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
>     do_peeling = false;
>
> ?
>
> I also can't see why the improvement has to be gated on force_vect,
> it surely looks profitable to enable more outer loop vectorization in
> general, no?
>
> How do the cost model calculations end up with peeling the outer loop
> for niter?
>
> On targets which don't support unaligned accesses we're left with
> versioning for alignment.  Isn't peeling for alignment better there?
> Thus only disallow peeling for alignment if there is no unhandled
> alignment?
>
> Thanks,
> Richard.
>
>> ChangeLog:
>>
>> 2015-06-08  Yuri Rumyantsev  <ysrum...@gmail.com>
>>
>> * tree-vect-loop-manip.c (rename_variables_in_bb): Add argument
>> to allow renaming of PHI arguments on edges incoming from outer
>> loop header, add corresponding check before start PHI iterator.
>> (slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool
>> variable DUPLICATE_OUTER_LOOP and set it to true for outer loops
>> with true force_vectorize.  Set-up dominator for outer loop too.
>> Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb.
>> (slpeel_can_duplicate_loop_p): Allow duplicate of outer loop if it
>> was marked with force_vectorize and has restricted cfg.
>> * tree-vect-loop.c (vect_analyze_loop_2): Prohibit alignment peeling
>> for outer loops.
>>
>> gcc/testsuite/ChangeLog:
>> * gcc.dg/vect/vect-outer-simd-2.c: New test.

Attachment: patch.1.2