http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45764
--- Comment #6 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> 2010-10-03 05:41:44 UTC --- Author: hjl Date: Sun Oct 3 05:39:32 2010 New Revision: 164914 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=164914 Log: Disallow negative steps in vectorizer. gcc/ 2010-10-02 H.J. Lu <hongjiu...@intel.com> PR tree-optimization/45720 PR tree-optimization/45764 * tree-vect-data-refs.c (vect_analyze_data_ref_access): Don't accept backwards consecutive accesses. (vect_create_data_ref_ptr): Disallow negative steps. * tree-vect-stmts.c (vectorizable_store): Allow negative steps. (perm_mask_for_reverse): Removed. (reverse_vec_elements): Likewise. (vectorizable_load): Don't hanle negative steps. gcc/testsuite/ 2010-10-02 H.J. Lu <hongjiu...@intel.com> PR tree-optimization/45720 PR tree-optimization/45764 * g++.dg/torture/pr45764.C: New. * gcc.dg/vect/pr43432.c: Xfail. * gcc.dg/vect/vect-114.c: Likewise. * gcc.dg/vect/vect-15.c: Likewise. Added: trunk/gcc/testsuite/g++.dg/torture/pr45764.C Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/vect/pr43432.c trunk/gcc/testsuite/gcc.dg/vect/vect-114.c trunk/gcc/testsuite/gcc.dg/vect/vect-15.c trunk/gcc/tree-vect-data-refs.c trunk/gcc/tree-vect-stmts.c --- Comment #7 from Richard Guenther <rguenth at gcc dot gnu.org> 2010-10-14 10:19:07 UTC --- (In reply to comment #5) > This patch will set the correct misalign info on that data-reference: > > Index: tree-vect-data-refs.c > =================================================================== > --- tree-vect-data-refs.c (revision 164476) > +++ tree-vect-data-refs.c (working copy) > @@ -900,6 +900,19 @@ vect_compute_data_ref_alignment (struct > || (TREE_CODE (base) == VAR_DECL > && DECL_ALIGN (base) >= TYPE_ALIGN (vectype))); > > + /* If this is a backward running DR then first access in the larger > + vectype actually is N-1 elements before the address in the DR. > + Adjust misalign accordingly. */ > + if (tree_int_cst_compare (DR_STEP (dr), size_zero_node) < 0) > + { > + tree offset = ssize_int (TYPE_VECTOR_SUBPARTS (vectype) - 1); > + /* DR_STEP(dr) is the same as -TYPE_SIZE of the scalar type, > + otherwise we wouldn't be here. */ > + offset = fold_build2 (MULT_EXPR, ssizetype, offset, DR_STEP (dr)); > + /* PLUS because DR_STEP was negative. */ > + misalign = size_binop (PLUS_EXPR, misalign, offset); > + } > + > /* Modulo alignment. */ > misalign = size_binop (FLOOR_MOD_EXPR, misalign, alignment); > > > Unfortunately this will later result in an ICE during peeling. It > wants to peel these data-refs that now are known misaligned, and during cost > computation changes data-refs that have the same alignment state to be > aligned too (makes sense because those data-refs will also be aligned when > the to-be-peeled DR is aligned). > > Unfortunately the STMT_VINFO_SAME_ALIGN_REFS contains bogus reference, > because ultimately the distance vectors are wrong: > > (compute_affine_dependence > (stmt_a = D.2097_46 = ibuf[D.2094_43];) > (stmt_b = D.2100_64 = ibuf[D.2099_63];) > (subscript_dependence_tester > (analyze_overlapping_iterations > (chrec_a = {64, +, -1}_2) > (chrec_b = {64, +, 1}_2) > (analyze_siv_subscript > (analyze_subscript_affine_affine > (overlaps_a = [0] > (overlaps_b = [0] > (overlap_iterations_a = [0] > (overlap_iterations_b = [0] > (build_classic_dist_vector > dist_vector = ( 0 ) > > So, we have two DRs running in opposite directions, which happen to > have the initial element in common (index 64), but afterwards diverge and > have nothing in common anymore. So overlap_iterations_a/b is correct. > But the distance vector is 0, meaning 'same in all iterations'. That > is used to initialize STMT_VINFO_SAME_ALIGN_REFS in > vect_find_same_alignment_drs . > > Obviously the distance vector should be unknown (it could be a chrec, if > we really want, namely {0, +, 2}_2, but we don't do anything with such > distances). I tried to determine why it is wrong, it's ultimately > coming from SUB_DISTANCE of that DDR, computed like so > (compute_subscript_distance): > > subscript = DDR_SUBSCRIPT (ddr, i); > cf_a = SUB_CONFLICTS_IN_A (subscript); > cf_b = SUB_CONFLICTS_IN_B (subscript); > > fn_a = common_affine_function (cf_a); > fn_b = common_affine_function (cf_b); > if (!fn_a || !fn_b) > { > SUB_DISTANCE (subscript) = chrec_dont_know; > return; > } > diff = affine_fn_minus (fn_a, fn_b); > > if (affine_function_constant_p (diff)) > SUB_DISTANCE (subscript) = affine_function_base (diff); > else > SUB_DISTANCE (subscript) = chrec_dont_know; > > And this is baffling me a bit. How could it be correct to > determine a distance vector from only the conflict functions? Doing so > ignores all non-conflicting accesses, although they can (and indeed here do) > influence the distance too. > > CCing Zdenek and Sebastian, maybe they have some insight in the latter > problem, as those aren't really related to the vectorizer. It seems to be that the distance vector should be unknown if for any pair of indices in the data-references the evolution of the access function is different. Thus, if (tree_does_not_contain_chrecs (chrec_fold_minus (some-type, DR_ACCESS_FN (DDR_A (ddr), i), DR_ACCESS_FN (DDR_B (ddr), i)))) { SUB_DISTANCE (subscript) = chrec_dont_know; return; } in case the DDR subscripts match up with the DR access-fns. Otherwise something more complicated of course. And maybe there's a helper for the above already. Zdenek? Sebastian?