Richard Guenther wrote:
> On Mon, Jun 11, 2012 at 5:25 PM, Richard Earnshaw <rearn...@arm.com> wrote:
> > On 11/06/12 15:53, Richard Guenther wrote:
> >> The type argument or the size argument looks redundant.
> >
> > Technically, yes, we could get rid of "tree_low_cst (TYPE_SIZE (type)"
> > and calculate it inside the alignment function if it was needed.
> > However, it seemed likely that most targets would need that number one
> > way or another, such that passing it would be helpful.
> 
> Well, you don't need it in stor-layout and targets might think the value
> may be completely unrelated to the type ...
> 
> >> Note that we still can have such vector "properly" aligned, thus the
> >> vectorizer would need to use build_aligned_type also if it knows the
> >> type is aligned, not only when thinks it is misaligned.  You basically
> >> change the alignment of the "default" vector type.
> >
> > I'm not sure I follow...
> 
> I say that a large vector may be still aligned, so the vectorizer when
> creating vector memory references has to use a non-default aligned vector
> type when the vector is aligned.  It won't do that at the moment.

Richard (Earnshaw) has asked me to take over working on this patch now.

I've now made the change requested above and removed the size argument.
The target is now simply asked to return the required alignment for the
given vector type.  I've also added a check for the case where the
target provides both an alignment and a mode for a vector type, but
the mode actually requires bigger alignment than the type.  This is
simply rejected (the target can fix this by reporting a different
type alignment or changing the mode alignment).

I've not made any attempts to have the vectorizer register larger
alignments than the one returned by the target hook.  It's not
clear to me when this would be useful (at least on ARM) ...

I've also run the testsuite, and this actually uncovered to bugs in
the vectorizer where it made an implicit assumption that vector types
must always be naturally aligned:

- In vect_update_misalignment_for_peel, the code used the vector size
  instead of the required alignment in order to bound misalignment
  values -- leading to a misalignment value bigger than the underlying
  alignment requirement of the vector type, causing an ICE later on

- In vect_do_peeling_for_loop_bound, the code divided the vector type
  alignment by the number of elements in order to arrive at the element
  size ... this returns a wrong value if the alignment is less than the
  vector size, causing incorrect code to be generated

  (This routine also had some confusion between size and alignment in
  comments and variable names, which I've fixed as well.)

Finally, two test cases still failed spuriously:

- gcc.dg/align-2.c actually checked that vector types are naturally
  aligned

- gcc.dg/vect/slp-25.c checked that we needed to perform peeling for
  alignment, which we actually don't need any more if vector types
  have a lesser alignment requirement in the first place

I've added a new effective target flag to check whether the target
requires natural alignment for vector types, and disabled those two
tests if it doesn't.

With those changes, I've completed testing with no regressions on
arm-linux-gnueabi.

OK for mainline?

Bye,
Ulrich


ChangeLog:

        * target.def (vector_alignment): New target hook.
        * doc/tm.texi.in (TARGET_VECTOR_ALIGNMENT): Document new hook.
        * doc/tm.texi: Regenerate.
        * targhooks.c (default_vector_alignment): New function.
        * targhooks.h (default_vector_alignment): Add prototype.
        * stor-layout.c (layout_type): Use targetm.vector_alignment.
        * config/arm/arm.c (arm_vector_alignment): New function.
        (TARGET_VECTOR_ALIGNMENT): Define.

        * tree-vect-data-refs.c (vect_update_misalignment_for_peel): Use
        vector type alignment instead of size.
        * tree-vect-loop-manip.c (vect_do_peeling_for_loop_bound): Use
        element type size directly instead of computing it from alignment.
        Fix variable naming and comment.

testsuite/ChangeLog:

        * lib/target-supports.exp
        (check_effective_target_vect_natural_alignment): New function.
        * gcc.dg/align-2.c: Only run on targets with natural alignment
        of vector types.
        * gcc.dg/vect/slp-25.c: Adjust tests for targets without natural
        alignment of vector types.


Index: gcc/target.def
===================================================================
*** gcc/target.def      (revision 189809)
--- gcc/target.def      (working copy)
*************** DEFHOOK
*** 1659,1664 ****
--- 1659,1672 ----
   bool, (enum machine_mode mode),
   hook_bool_mode_false)
  
+ DEFHOOK
+ (vector_alignment,
+  "This hook can be used to define the alignment for a vector of type\n\
+ @var{type}, in order to comply with a platform ABI.  The default is to\n\
+ require natural alignment for vector types.",
+  HOST_WIDE_INT, (const_tree type),
+  default_vector_alignment)
+ 
  /* True if we should try to use a scalar mode to represent an array,
     overriding the usual MAX_FIXED_MODE limit.  */
  DEFHOOK
Index: gcc/doc/tm.texi
===================================================================
*** gcc/doc/tm.texi     (revision 189809)
--- gcc/doc/tm.texi     (working copy)
*************** make it all fit in fewer cache lines.
*** 1107,1112 ****
--- 1107,1118 ----
  If the value of this macro has a type, it should be an unsigned type.
  @end defmac
  
+ @deftypefn {Target Hook} HOST_WIDE_INT TARGET_VECTOR_ALIGNMENT (const_tree 
@var{type})
+ This hook can be used to define the alignment for a vector of type
+ @var{type}, in order to comply with a platform ABI.  The default is to
+ require natural alignment for vector types.
+ @end deftypefn
+ 
  @defmac STACK_SLOT_ALIGNMENT (@var{type}, @var{mode}, @var{basic-align})
  If defined, a C expression to compute the alignment for stack slot.
  @var{type} is the data type, @var{mode} is the widest mode available,
Index: gcc/doc/tm.texi.in
===================================================================
*** gcc/doc/tm.texi.in  (revision 189809)
--- gcc/doc/tm.texi.in  (working copy)
*************** make it all fit in fewer cache lines.
*** 1091,1096 ****
--- 1091,1098 ----
  If the value of this macro has a type, it should be an unsigned type.
  @end defmac
  
+ @hook TARGET_VECTOR_ALIGNMENT
+ 
  @defmac STACK_SLOT_ALIGNMENT (@var{type}, @var{mode}, @var{basic-align})
  If defined, a C expression to compute the alignment for stack slot.
  @var{type} is the data type, @var{mode} is the widest mode available,
Index: gcc/targhooks.c
===================================================================
*** gcc/targhooks.c     (revision 189809)
--- gcc/targhooks.c     (working copy)
*************** tree default_mangle_decl_assembler_name 
*** 945,950 ****
--- 945,957 ----
     return id;
  }
  
+ /* Default to natural alignment for vector types.  */
+ HOST_WIDE_INT
+ default_vector_alignment (const_tree type)
+ {
+   return tree_low_cst (TYPE_SIZE (type), 0);
+ }
+ 
  bool
  default_builtin_vector_alignment_reachable (const_tree type, bool is_packed)
  {
Index: gcc/targhooks.h
===================================================================
*** gcc/targhooks.h     (revision 189809)
--- gcc/targhooks.h     (working copy)
*************** extern int default_builtin_vectorization
*** 83,88 ****
--- 83,90 ----
  
  extern tree default_builtin_reciprocal (unsigned int, bool, bool);
  
+ extern HOST_WIDE_INT default_vector_alignment (const_tree);
+ 
  extern bool default_builtin_vector_alignment_reachable (const_tree, bool);
  extern bool
  default_builtin_support_vector_misalignment (enum machine_mode mode,
Index: gcc/stor-layout.c
===================================================================
*** gcc/stor-layout.c   (revision 189809)
--- gcc/stor-layout.c   (working copy)
*************** layout_type (tree type)
*** 2131,2139 ****
        TYPE_SIZE (type) = int_const_binop (MULT_EXPR, TYPE_SIZE (innertype),
                                            bitsize_int (nunits));
  
!       /* Always naturally align vectors.  This prevents ABI changes
!          depending on whether or not native vector modes are supported.  */
!       TYPE_ALIGN (type) = tree_low_cst (TYPE_SIZE (type), 0);
          break;
        }
  
--- 2131,2147 ----
        TYPE_SIZE (type) = int_const_binop (MULT_EXPR, TYPE_SIZE (innertype),
                                            bitsize_int (nunits));
  
!       /* For vector types, we do not default to the mode's alignment.
!          Instead, query a target hook, defaulting to natural alignment.
!          This prevents ABI changes depending on whether or not native
!          vector modes are supported.  */
!       TYPE_ALIGN (type) = targetm.vector_alignment (type);
! 
!       /* However, if the underlying mode requires a bigger alignment than
!          what the target hook provides, we cannot use the mode.  For now,
!          simply reject that case.  */
!       gcc_assert (TYPE_ALIGN (type)
!                   >= GET_MODE_ALIGNMENT (TYPE_MODE (type)));
          break;
        }
  
Index: gcc/config/arm/arm.c
===================================================================
*** gcc/config/arm/arm.c        (revision 189809)
--- gcc/config/arm/arm.c        (working copy)
*************** static bool arm_array_mode_supported_p (
*** 254,259 ****
--- 254,260 ----
                                        unsigned HOST_WIDE_INT);
  static enum machine_mode arm_preferred_simd_mode (enum machine_mode);
  static bool arm_class_likely_spilled_p (reg_class_t);
+ static HOST_WIDE_INT arm_vector_alignment (const_tree type);
  static bool arm_vector_alignment_reachable (const_tree type, bool is_packed);
  static bool arm_builtin_support_vector_misalignment (enum machine_mode mode,
                                                     const_tree type,
*************** static const struct attribute_spec arm_a
*** 602,607 ****
--- 603,611 ----
  #undef TARGET_CLASS_LIKELY_SPILLED_P
  #define TARGET_CLASS_LIKELY_SPILLED_P arm_class_likely_spilled_p
  
+ #undef TARGET_VECTOR_ALIGNMENT
+ #define TARGET_VECTOR_ALIGNMENT arm_vector_alignment
+ 
  #undef TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE
  #define TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE \
    arm_vector_alignment_reachable
*************** arm_have_conditional_execution (void)
*** 25051,25056 ****
--- 25055,25072 ----
    return !TARGET_THUMB1;
  }
  
+ /* The AAPCS sets the maximum alignment of a vector to 64 bits.  */
+ static HOST_WIDE_INT
+ arm_vector_alignment (const_tree type)
+ {
+   HOST_WIDE_INT align = tree_low_cst (TYPE_SIZE (type), 0);
+ 
+   if (TARGET_AAPCS_BASED)
+     align = MIN (align, 64);
+ 
+   return align;
+ }
+ 
  static unsigned int
  arm_autovectorize_vector_sizes (void)
  {
Index: gcc/tree-vect-data-refs.c
===================================================================
*** gcc/tree-vect-data-refs.c   (revision 189809)
--- gcc/tree-vect-data-refs.c   (working copy)
*************** vect_update_misalignment_for_peel (struc
*** 1059,1065 ****
        int misal = DR_MISALIGNMENT (dr);
        tree vectype = STMT_VINFO_VECTYPE (stmt_info);
        misal += negative ? -npeel * dr_size : npeel * dr_size;
!       misal &= GET_MODE_SIZE (TYPE_MODE (vectype)) - 1;
        SET_DR_MISALIGNMENT (dr, misal);
        return;
      }
--- 1059,1065 ----
        int misal = DR_MISALIGNMENT (dr);
        tree vectype = STMT_VINFO_VECTYPE (stmt_info);
        misal += negative ? -npeel * dr_size : npeel * dr_size;
!       misal &= (TYPE_ALIGN (vectype) / BITS_PER_UNIT) - 1;
        SET_DR_MISALIGNMENT (dr, misal);
        return;
      }
Index: gcc/tree-vect-loop-manip.c
===================================================================
*** gcc/tree-vect-loop-manip.c  (revision 189809)
--- gcc/tree-vect-loop-manip.c  (working copy)
*************** vect_do_peeling_for_loop_bound (loop_vec
*** 1937,1943 ****
     If the misalignment of DR is known at compile time:
       addr_mis = int mis = DR_MISALIGNMENT (dr);
     Else, compute address misalignment in bytes:
!      addr_mis = addr & (vectype_size - 1)
  
     prolog_niters = min (LOOP_NITERS, ((VF - addr_mis/elem_size)&(VF-1))/step)
  
--- 1937,1943 ----
     If the misalignment of DR is known at compile time:
       addr_mis = int mis = DR_MISALIGNMENT (dr);
     Else, compute address misalignment in bytes:
!      addr_mis = addr & (vectype_align - 1)
  
     prolog_niters = min (LOOP_NITERS, ((VF - addr_mis/elem_size)&(VF-1))/step)
  
*************** vect_gen_niters_for_prolog_loop (loop_ve
*** 1991,1999 ****
        tree start_addr = vect_create_addr_base_for_vector_ref (dr_stmt,
                                                &new_stmts, offset, loop);
        tree type = unsigned_type_for (TREE_TYPE (start_addr));
!       tree vectype_size_minus_1 = build_int_cst (type, vectype_align - 1);
!       tree elem_size_log =
!         build_int_cst (type, exact_log2 (vectype_align/nelements));
        tree nelements_minus_1 = build_int_cst (type, nelements - 1);
        tree nelements_tree = build_int_cst (type, nelements);
        tree byte_misalign;
--- 1991,2000 ----
        tree start_addr = vect_create_addr_base_for_vector_ref (dr_stmt,
                                                &new_stmts, offset, loop);
        tree type = unsigned_type_for (TREE_TYPE (start_addr));
!       tree vectype_align_minus_1 = build_int_cst (type, vectype_align - 1);
!       HOST_WIDE_INT elem_size =
!                 int_cst_value (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
!       tree elem_size_log = build_int_cst (type, exact_log2 (elem_size));
        tree nelements_minus_1 = build_int_cst (type, nelements - 1);
        tree nelements_tree = build_int_cst (type, nelements);
        tree byte_misalign;
*************** vect_gen_niters_for_prolog_loop (loop_ve
*** 2002,2011 ****
        new_bb = gsi_insert_seq_on_edge_immediate (pe, new_stmts);
        gcc_assert (!new_bb);
  
!       /* Create:  byte_misalign = addr & (vectype_size - 1)  */
        byte_misalign =
          fold_build2 (BIT_AND_EXPR, type, fold_convert (type, start_addr), 
!                      vectype_size_minus_1);
  
        /* Create:  elem_misalign = byte_misalign / element_size  */
        elem_misalign =
--- 2003,2012 ----
        new_bb = gsi_insert_seq_on_edge_immediate (pe, new_stmts);
        gcc_assert (!new_bb);
  
!       /* Create:  byte_misalign = addr & (vectype_align - 1)  */
        byte_misalign =
          fold_build2 (BIT_AND_EXPR, type, fold_convert (type, start_addr), 
!                      vectype_align_minus_1);
  
        /* Create:  elem_misalign = byte_misalign / element_size  */
        elem_misalign =
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
*** gcc/testsuite/lib/target-supports.exp       (revision 189809)
--- gcc/testsuite/lib/target-supports.exp       (working copy)
*************** proc check_effective_target_natural_alig
*** 3379,3384 ****
--- 3379,3404 ----
      return $et_natural_alignment_64_saved
  }
  
+ # Return 1 if all vector types are naturally aligned (aligned to their
+ # type-size), 0 otherwise.
+ #
+ # This won't change for different subtargets so cache the result.
+ 
+ proc check_effective_target_vect_natural_alignment { } {
+     global et_vect_natural_alignment
+ 
+     if [info exists et_vect_natural_alignment_saved] {
+         verbose "check_effective_target_vect_natural_alignment: using cached 
result" 2
+     } else {
+         set et_vect_natural_alignment_saved 1
+         if { [check_effective_target_arm_eabi] } {
+             set et_vect_natural_alignment_saved 0
+         }
+     }
+     verbose "check_effective_target_vect_natural_alignment: returning 
$et_vect_natural_alignment_saved" 2
+     return $et_vect_natural_alignment_saved
+ }
+ 
  # Return 1 if vector alignment (for types of size 32 bit or less) is 
reachable, 0 otherwise.
  #
  # This won't change for different subtargets so cache the result.
Index: gcc/testsuite/gcc.dg/align-2.c
===================================================================
*** gcc/testsuite/gcc.dg/align-2.c      (revision 189809)
--- gcc/testsuite/gcc.dg/align-2.c      (working copy)
***************
*** 1,5 ****
  /* PR 17962 */
! /* { dg-do compile } */
  /* { dg-options "" } */
  
  typedef float v4 __attribute__((vector_size(sizeof(float)*4)));
--- 1,5 ----
  /* PR 17962 */
! /* { dg-do compile { target vect_natural_alignment } } */
  /* { dg-options "" } */
  
  typedef float v4 __attribute__((vector_size(sizeof(float)*4)));
Index: gcc/testsuite/gcc.dg/vect/slp-25.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/slp-25.c  (revision 189809)
--- gcc/testsuite/gcc.dg/vect/slp-25.c  (working copy)
*************** int main (void)
*** 56,60 ****
  
  /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  } } */
  /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 
"vect" } } */
! /* { dg-final { scan-tree-dump-times "Alignment of access forced using 
peeling" 2 "vect" { xfail { vect_no_align } } } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
--- 56,60 ----
  
  /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  } } */
  /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 
"vect" } } */
! /* { dg-final { scan-tree-dump-times "Alignment of access forced using 
peeling" 2 "vect" { xfail { vect_no_align || { ! vect_natural_alignment } } } } 
} */
  /* { dg-final { cleanup-tree-dump "vect" } } */

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com

Reply via email to