[Bug tree-optimization/45764] [4.6 Regression] wrong code -O2 vs -O3 (problem in vectorizer???)

rguenth at gcc dot gnu.org Thu, 14 Oct 2010 03:19:32 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45764


--- Comment #6 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> 2010-10-03 
05:41:44 UTC ---
Author: hjl
Date: Sun Oct  3 05:39:32 2010
New Revision: 164914

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164914
Log:
Disallow negative steps in vectorizer.

gcc/

2010-10-02  H.J. Lu  <hongjiu...@intel.com>

    PR tree-optimization/45720
    PR tree-optimization/45764
    * tree-vect-data-refs.c (vect_analyze_data_ref_access):
    Don't accept backwards consecutive accesses.
    (vect_create_data_ref_ptr): Disallow negative steps.

    * tree-vect-stmts.c (vectorizable_store): Allow negative steps.
    (perm_mask_for_reverse): Removed.
    (reverse_vec_elements): Likewise.
    (vectorizable_load): Don't hanle negative steps.

gcc/testsuite/

2010-10-02  H.J. Lu  <hongjiu...@intel.com>

    PR tree-optimization/45720
    PR tree-optimization/45764
    * g++.dg/torture/pr45764.C: New.

    * gcc.dg/vect/pr43432.c: Xfail.
    * gcc.dg/vect/vect-114.c: Likewise.
    * gcc.dg/vect/vect-15.c: Likewise.

Added:
    trunk/gcc/testsuite/g++.dg/torture/pr45764.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/vect/pr43432.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-114.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-15.c
    trunk/gcc/tree-vect-data-refs.c
    trunk/gcc/tree-vect-stmts.c

--- Comment #7 from Richard Guenther <rguenth at gcc dot gnu.org> 2010-10-14 
10:19:07 UTC ---
(In reply to comment #5)
> This patch will set the correct misalign info on that data-reference:
> 
> Index: tree-vect-data-refs.c
> ===================================================================
> --- tree-vect-data-refs.c       (revision 164476)
> +++ tree-vect-data-refs.c       (working copy)
> @@ -900,6 +900,19 @@ vect_compute_data_ref_alignment (struct
>               || (TREE_CODE (base) == VAR_DECL
>                   && DECL_ALIGN (base) >= TYPE_ALIGN (vectype)));
> 
> +  /* If this is a backward running DR then first access in the larger
> +     vectype actually is N-1 elements before the address in the DR.
> +     Adjust misalign accordingly.  */
> +  if (tree_int_cst_compare (DR_STEP (dr), size_zero_node) < 0)
> +    {
> +      tree offset = ssize_int (TYPE_VECTOR_SUBPARTS (vectype) - 1);
> +      /* DR_STEP(dr) is the same as -TYPE_SIZE of the scalar type,
> +         otherwise we wouldn't be here.  */
> +      offset = fold_build2 (MULT_EXPR, ssizetype, offset, DR_STEP (dr));
> +      /* PLUS because DR_STEP was negative.  */
> +      misalign = size_binop (PLUS_EXPR, misalign, offset);
> +    }
> +
>    /* Modulo alignment.  */
>    misalign = size_binop (FLOOR_MOD_EXPR, misalign, alignment);
> 
> 
> Unfortunately this will later result in an ICE during peeling.  It
> wants to peel these data-refs that now are known misaligned, and during cost
> computation changes data-refs that have the same alignment state to be
> aligned too (makes sense because those data-refs will also be aligned when
> the to-be-peeled DR is aligned).
> 
> Unfortunately the STMT_VINFO_SAME_ALIGN_REFS contains bogus reference,
> because ultimately the distance vectors are wrong:
> 
> (compute_affine_dependence
>   (stmt_a = D.2097_46 = ibuf[D.2094_43];)
>   (stmt_b = D.2100_64 = ibuf[D.2099_63];)
> (subscript_dependence_tester
> (analyze_overlapping_iterations
>   (chrec_a = {64, +, -1}_2)
>   (chrec_b = {64, +, 1}_2)
> (analyze_siv_subscript
> (analyze_subscript_affine_affine
>   (overlaps_a = [0]
>   (overlaps_b = [0]
>   (overlap_iterations_a = [0]
>   (overlap_iterations_b = [0]
> (build_classic_dist_vector
>   dist_vector = (  0  )
> 
> So, we have two DRs running in opposite directions, which happen to
> have the initial element in common (index 64), but afterwards diverge and
> have nothing in common anymore.  So overlap_iterations_a/b is correct.
> But the distance vector is 0, meaning 'same in all iterations'.  That
> is used to initialize STMT_VINFO_SAME_ALIGN_REFS in
> vect_find_same_alignment_drs .
> 
> Obviously the distance vector should be unknown (it could be a chrec, if
> we really want, namely {0, +, 2}_2, but we don't do anything with such
> distances).  I tried to determine why it is wrong, it's ultimately
> coming from SUB_DISTANCE of that DDR, computed like so
> (compute_subscript_distance):
> 
>           subscript = DDR_SUBSCRIPT (ddr, i);
>           cf_a = SUB_CONFLICTS_IN_A (subscript);
>           cf_b = SUB_CONFLICTS_IN_B (subscript);
> 
>           fn_a = common_affine_function (cf_a);
>           fn_b = common_affine_function (cf_b);
>           if (!fn_a || !fn_b)
>             {
>               SUB_DISTANCE (subscript) = chrec_dont_know;
>               return;
>             }
>           diff = affine_fn_minus (fn_a, fn_b);
> 
>           if (affine_function_constant_p (diff))
>             SUB_DISTANCE (subscript) = affine_function_base (diff);
>           else
>             SUB_DISTANCE (subscript) = chrec_dont_know;
> 
> And this is baffling me a bit.  How could it be correct to
> determine a distance vector from only the conflict functions?  Doing so
> ignores all non-conflicting accesses, although they can (and indeed here do)
> influence the distance too.
> 
> CCing Zdenek and Sebastian, maybe they have some insight in the latter
> problem, as those aren't really related to the vectorizer.

It seems to be that the distance vector should be unknown if for
any pair of indices in the data-references the evolution of the
access function is different.  Thus,

   if (tree_does_not_contain_chrecs
        (chrec_fold_minus (some-type,
                         DR_ACCESS_FN (DDR_A (ddr), i),
                         DR_ACCESS_FN (DDR_B (ddr), i))))
     {
       SUB_DISTANCE (subscript) = chrec_dont_know;
       return;
     }

in case the DDR subscripts match up with the DR access-fns.  Otherwise
something more complicated of course.

And maybe there's a helper for the above already.

Zdenek?  Sebastian?

[Bug tree-optimization/45764] [4.6 Regression] wrong code -O2 vs -O3 (problem in vectorizer???)

Reply via email to