Richard Biener <richard.guent...@gmail.com> writes:
> On Fri, Oct 25, 2019 at 2:43 PM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> After previous patches, it's now possible to make the vectoriser
>> support multiple vector sizes in the same vector region, using
>> related_vector_mode to pick the right vector mode for a given
>> element mode.  No port yet takes advantage of this, but I have
>> a follow-on patch for AArch64.
>>
>> This patch also seemed like a good opportunity to add some more dump
>> messages: one to make it clear which vector size/mode was being used
>> when analysis passed or failed, and another to say when we've decided
>> to skip a redundant vector size/mode.
>
> OK.
>
> I wonder if, when we requested a specific size previously, we now
> have to verify we got that constraint satisfied after the change.
> Esp. the epilogue vectorization cases want to get V2DI
> from V4DI.
>
>           sz /= 2;
> -         vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
> +         vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> +                                                         scalar_type,
> +                                                         sz / scalar_bytes);
>
> doesn't look like an improvement in readability to me there.

Yeah, guess it isn't great.

> Maybe re-formulating the whole code in terms of lanes instead of size
> would make it easier to follow?

OK, how about this version?  It still won't win awards, but it's at
least a bit more readable.

Tested as before.

Richard


2019-11-06  Richard Sandiford  <richard.sandif...@arm.com>

gcc/
	* machmode.h (opt_machine_mode::operator==): New function.
	(opt_machine_mode::operator!=): Likewise.
	* tree-vectorizer.h (vec_info::vector_mode): Update comment.
	(get_related_vectype_for_scalar_type): Declare.
	(get_vectype_for_scalar_type_and_size): Delete.
	* tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say
	whether analysis passed or failed, and with what vector modes.
	Use related_vector_mode to check whether trying a particular
	vector mode would be redundant with the autodetected mode,
	and print a dump message if we decide to skip it.
	* tree-vect-loop.c (vect_analyze_loop): Likewise.
	(vect_create_epilog_for_reduction): Use
	get_related_vectype_for_scalar_type instead of
	get_vectype_for_scalar_type_and_size.
	* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace
	with...
	(get_related_vectype_for_scalar_type): ...this new function.
	Take a starting/"prevailing" vector mode rather than a vector size.
	Take an optional nunits argument, with the same meaning as for
	related_vector_mode.  Use related_vector_mode when not
	auto-detecting a mode, falling back to mode_for_vector if no
	target mode exists.
	(get_vectype_for_scalar_type): Update accordingly.
	(get_same_sized_vectype): Likewise.
	* tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise.

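As a rough standalone illustration of the "redundant mode" test described in the ChangeLog (this is not GCC code): a candidate mode is skipped only when the related-mode lookup maps the candidate to the autodetected mode and the autodetected mode back to the candidate.  ToyMode, toy_related_vector_mode and toy_redundant_p are hypothetical names, and the toy lookup assumes a target whose related mode always keeps the total vector size of the base mode.  Under that assumption the test gives the same answer as the old size comparison; a target that mixes vector sizes would give different answers.

```cpp
#include <cassert>

// Toy stand-in for a vector machine mode (an assumption for this sketch):
// NUNITS lanes of ELEM_BITS-wide elements.
struct ToyMode
{
  unsigned elem_bits;
  unsigned nunits;
  bool operator== (const ToyMode &o) const
  { return elem_bits == o.elem_bits && nunits == o.nunits; }
};

// Hypothetical model of a related-mode lookup for a target that keeps the
// total size of BASE.  Returns false if no such mode exists, otherwise
// stores the mode in *RESULT (mirroring the ".exists (&mode)" idiom).
bool
toy_related_vector_mode (ToyMode base, unsigned elem_bits, ToyMode *result)
{
  assert (elem_bits != 0);
  unsigned total_bits = base.elem_bits * base.nunits;
  if (total_bits % elem_bits != 0)
    return false;
  *result = ToyMode { elem_bits, total_bits / elem_bits };
  return true;
}

// CANDIDATE is redundant with AUTODETECTED if each maps to the other when
// asked for a vector of the other's element size.
bool
toy_redundant_p (ToyMode candidate, ToyMode autodetected)
{
  ToyMode from_candidate, from_autodetected;
  return (toy_related_vector_mode (candidate, autodetected.elem_bits,
                                   &from_candidate)
          && from_candidate == autodetected
          && toy_related_vector_mode (autodetected, candidate.elem_bits,
                                      &from_autodetected)
          && from_autodetected == candidate);
}
```

With toy equivalents of V16QI and V4SI (both 128 bits), toy_redundant_p reports the pair as redundant, whereas a 64-bit V8QI candidate is not redundant with an autodetected V16QI and would still be tried.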
Index: gcc/machmode.h
===================================================================
--- gcc/machmode.h	2019-11-06 12:35:12.460201615 +0000
+++ gcc/machmode.h	2019-11-06 12:35:27.972093472 +0000
@@ -258,6 +258,9 @@ #define CLASS_HAS_WIDER_MODES_P(CLASS)
   bool exists () const;
   template<typename U> bool exists (U *) const;
 
+  bool operator== (const T &m) const { return m_mode == m; }
+  bool operator!= (const T &m) const { return m_mode != m; }
+
 private:
   machine_mode m_mode;
 };
Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2019-11-06 12:35:12.764199495 +0000
+++ gcc/tree-vectorizer.h	2019-11-06 12:35:27.976093444 +0000
@@ -335,8 +335,9 @@ typedef std::pair<tree, tree> vec_object
   /* Cost data used by the target cost model.  */
   void *target_cost_data;
 
-  /* If we've chosen a vector size for this vectorization region,
-     this is one mode that has such a size, otherwise it is VOIDmode.  */
+  /* The argument we should pass to related_vector_mode when looking up
+     the vector mode for a scalar mode, or VOIDmode if we haven't yet
+     made any decisions about which vector modes to use.  */
   machine_mode vector_mode;
 
 private:
@@ -1609,8 +1610,9 @@ extern bool vect_can_advance_ivs_p (loop
 extern void vect_update_inits_of_drs (loop_vec_info, tree, tree_code);
 
 /* In tree-vect-stmts.c.  */
+extern tree get_related_vectype_for_scalar_type (machine_mode, tree,
+                                                 poly_uint64 = 0);
 extern tree get_vectype_for_scalar_type (vec_info *, tree);
-extern tree get_vectype_for_scalar_type_and_size (tree, poly_uint64);
 extern tree get_mask_type_for_scalar_type (vec_info *, tree);
 extern tree get_same_sized_vectype (tree, tree);
 extern bool vect_get_loop_mask_type (loop_vec_info);
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2019-11-06 12:35:12.760199523 +0000
+++ gcc/tree-vect-slp.c	2019-11-06 12:35:27.972093472 +0000
@@ -3202,7 +3202,12 @@ vect_slp_bb_region (gimple_stmt_iterator
           && dbg_cnt (vect_slp))
         {
           if (dump_enabled_p ())
-            dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
+            {
+              dump_printf_loc (MSG_NOTE, vect_location,
+                               "***** Analysis succeeded with vector mode"
+                               " %s\n", GET_MODE_NAME (bb_vinfo->vector_mode));
+              dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
+            }
 
           bb_vinfo->shared->check_datarefs ();
           vect_schedule_slp (bb_vinfo);
@@ -3222,6 +3227,13 @@ vect_slp_bb_region (gimple_stmt_iterator
 
           vectorized = true;
         }
+      else
+        {
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "***** Analysis failed with vector mode %s\n",
+                             GET_MODE_NAME (bb_vinfo->vector_mode));
+        }
 
       if (mode_i == 0)
         autodetected_vector_mode = bb_vinfo->vector_mode;
@@ -3229,9 +3241,22 @@ vect_slp_bb_region (gimple_stmt_iterator
       delete bb_vinfo;
 
       if (mode_i < vector_modes.length ()
-          && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
-                       GET_MODE_SIZE (autodetected_vector_mode)))
-        mode_i += 1;
+          && VECTOR_MODE_P (autodetected_vector_mode)
+          && (related_vector_mode (vector_modes[mode_i],
+                                   GET_MODE_INNER (autodetected_vector_mode))
+              == autodetected_vector_mode)
+          && (related_vector_mode (autodetected_vector_mode,
+                                   GET_MODE_INNER (vector_modes[mode_i]))
+              == vector_modes[mode_i]))
+        {
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "***** Skipping vector mode %s, which would"
+                             " repeat the analysis for %s\n",
+                             GET_MODE_NAME (vector_modes[mode_i]),
+                             GET_MODE_NAME (autodetected_vector_mode));
+          mode_i += 1;
+        }
 
       if (vectorized
           || mode_i == vector_modes.length ()
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2019-11-06 12:35:12.756199552 +0000
+++ gcc/tree-vect-loop.c	2019-11-06 12:35:27.972093472 +0000
@@ -2417,6 +2417,17 @@ vect_analyze_loop (class loop *loop, vec
       res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts);
       if (mode_i == 0)
         autodetected_vector_mode = loop_vinfo->vector_mode;
+      if (dump_enabled_p ())
+        {
+          if (res)
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "***** Analysis succeeded with vector mode %s\n",
+                             GET_MODE_NAME (loop_vinfo->vector_mode));
+          else
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "***** Analysis failed with vector mode %s\n",
+                             GET_MODE_NAME (loop_vinfo->vector_mode));
+        }
 
       loop->aux = NULL;
       if (res)
@@ -2479,9 +2490,22 @@ vect_analyze_loop (class loop *loop, vec
         }
 
       if (mode_i < vector_modes.length ()
-          && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
-                       GET_MODE_SIZE (autodetected_vector_mode)))
-        mode_i += 1;
+          && VECTOR_MODE_P (autodetected_vector_mode)
+          && (related_vector_mode (vector_modes[mode_i],
+                                   GET_MODE_INNER (autodetected_vector_mode))
+              == autodetected_vector_mode)
+          && (related_vector_mode (autodetected_vector_mode,
+                                   GET_MODE_INNER (vector_modes[mode_i]))
+              == vector_modes[mode_i]))
+        {
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "***** Skipping vector mode %s, which would"
+                             " repeat the analysis for %s\n",
+                             GET_MODE_NAME (vector_modes[mode_i]),
+                             GET_MODE_NAME (autodetected_vector_mode));
+          mode_i += 1;
+        }
 
       if (mode_i == vector_modes.length ()
           || autodetected_vector_mode == VOIDmode)
@@ -4870,13 +4894,15 @@ vect_create_epilog_for_reduction (stmt_v
          in a vector mode of smaller size and first reduce upper/lower
          halves against each other.  */
       enum machine_mode mode1 = mode;
-      unsigned sz = tree_to_uhwi (TYPE_SIZE_UNIT (vectype));
-      unsigned sz1 = sz;
+      unsigned nunits = TYPE_VECTOR_SUBPARTS (vectype).to_constant ();
+      unsigned nunits1 = nunits;
       if (!slp_reduc
           && (mode1 = targetm.vectorize.split_reduction (mode)) != mode)
-        sz1 = GET_MODE_SIZE (mode1).to_constant ();
+        nunits1 = GET_MODE_NUNITS (mode1).to_constant ();
 
-      tree vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz1);
+      tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
+                                                           scalar_type,
+                                                           nunits1);
       reduce_with_shift = have_whole_vector_shift (mode1);
       if (!VECTOR_MODE_P (mode1))
         reduce_with_shift = false;
@@ -4890,11 +4916,13 @@ vect_create_epilog_for_reduction (stmt_v
       /* First reduce the vector to the desired vector size we should
          do shift reduction on by combining upper and lower halves.  */
       new_temp = new_phi_result;
-      while (sz > sz1)
+      while (nunits > nunits1)
         {
           gcc_assert (!slp_reduc);
-          sz /= 2;
-          vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
+          nunits /= 2;
+          vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
+                                                          scalar_type, nunits);
+          unsigned int bitsize = tree_to_uhwi (TYPE_SIZE (vectype1));
 
           /* The target has to make sure we support lowpart/highpart
              extraction, either via direct vector extract or through
@@ -4919,15 +4947,14 @@ vect_create_epilog_for_reduction (stmt_v
                 = gimple_build_assign (dst2, BIT_FIELD_REF,
                                        build3 (BIT_FIELD_REF, vectype1,
                                                new_temp, TYPE_SIZE (vectype1),
-                                               bitsize_int (sz * BITS_PER_UNIT)));
+                                               bitsize_int (bitsize)));
               gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT);
             }
           else
             {
               /* Extract via punning to appropriately sized integer mode
                  vector.  */
-              tree eltype = build_nonstandard_integer_type (sz * BITS_PER_UNIT,
-                                                            1);
+              tree eltype = build_nonstandard_integer_type (bitsize, 1);
               tree etype = build_vector_type (eltype, 2);
               gcc_assert (convert_optab_handler (vec_extract_optab,
                                                  TYPE_MODE (etype),
@@ -4956,7 +4983,7 @@ vect_create_epilog_for_reduction (stmt_v
                 = gimple_build_assign (tem, BIT_FIELD_REF,
                                        build3 (BIT_FIELD_REF, eltype,
                                                new_temp, TYPE_SIZE (eltype),
-                                               bitsize_int (sz * BITS_PER_UNIT)));
+                                               bitsize_int (bitsize)));
               gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT);
               dst2 = make_ssa_name (vectype1);
               epilog_stmt = gimple_build_assign (dst2, VIEW_CONVERT_EXPR,
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2019-11-06 12:35:12.796199272 +0000
+++ gcc/tree-vect-stmts.c	2019-11-06 12:35:27.976093444 +0000
@@ -11097,18 +11097,28 @@ vect_remove_stores (stmt_vec_info first_
     }
 }
 
-/* Function get_vectype_for_scalar_type_and_size.
-
-   Returns the vector type corresponding to SCALAR_TYPE and SIZE as supported
-   by the target.  */
+/* If NUNITS is nonzero, return a vector type that contains NUNITS
+   elements of type SCALAR_TYPE, or null if the target doesn't support
+   such a type.
+
+   If NUNITS is zero, return a vector type that contains elements of
+   type SCALAR_TYPE, choosing whichever vector size the target prefers.
+
+   If PREVAILING_MODE is VOIDmode, we have not yet chosen a vector mode
+   for this vectorization region and want to "autodetect" the best choice.
+   Otherwise, PREVAILING_MODE is a previously-chosen vector TYPE_MODE
+   and we want the new type to be interoperable with it.  PREVAILING_MODE
+   in this case can be a scalar integer mode or a vector mode; when it
+   is a vector mode, the function acts like a tree-level version of
+   related_vector_mode.  */
 
 tree
-get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size)
+get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
+                                     tree scalar_type, poly_uint64 nunits)
 {
   tree orig_scalar_type = scalar_type;
   scalar_mode inner_mode;
   machine_mode simd_mode;
-  poly_uint64 nunits;
   tree vectype;
 
   if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode)
@@ -11148,10 +11158,11 @@ get_vectype_for_scalar_type_and_size (tr
   if (scalar_type == NULL_TREE)
     return NULL_TREE;
 
-  /* If no size was supplied use the mode the target prefers.   Otherwise
-     lookup a vector mode of the specified size.  */
-  if (known_eq (size, 0U))
+  /* If no prevailing mode was supplied, use the mode the target prefers.
+     Otherwise lookup a vector mode based on the prevailing mode.  */
+  if (prevailing_mode == VOIDmode)
     {
+      gcc_assert (known_eq (nunits, 0U));
       simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
       if (SCALAR_INT_MODE_P (simd_mode))
         {
@@ -11167,9 +11178,19 @@ get_vectype_for_scalar_type_and_size (tr
             return NULL_TREE;
         }
     }
-  else if (!multiple_p (size, nbytes, &nunits)
-           || !mode_for_vector (inner_mode, nunits).exists (&simd_mode))
-    return NULL_TREE;
+  else if (SCALAR_INT_MODE_P (prevailing_mode)
+           || !related_vector_mode (prevailing_mode,
+                                    inner_mode, nunits).exists (&simd_mode))
+    {
+      /* Fall back to using mode_for_vector, mostly in the hope of being
+         able to use an integer mode.  */
+      if (known_eq (nunits, 0U)
+          && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits))
+        return NULL_TREE;
+
+      if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode))
+        return NULL_TREE;
+    }
 
   vectype = build_vector_type_for_mode (scalar_type, simd_mode);
 
@@ -11197,9 +11218,8 @@ get_vectype_for_scalar_type_and_size (tr
 tree
 get_vectype_for_scalar_type (vec_info *vinfo, tree scalar_type)
 {
-  tree vectype;
-  poly_uint64 vector_size = GET_MODE_SIZE (vinfo->vector_mode);
-  vectype = get_vectype_for_scalar_type_and_size (scalar_type, vector_size);
+  tree vectype = get_related_vectype_for_scalar_type (vinfo->vector_mode,
+                                                      scalar_type);
   if (vectype && vinfo->vector_mode == VOIDmode)
     vinfo->vector_mode = TYPE_MODE (vectype);
   return vectype;
@@ -11232,8 +11252,13 @@ get_same_sized_vectype (tree scalar_type
   if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type))
     return truth_type_for (vector_type);
 
-  return get_vectype_for_scalar_type_and_size
-           (scalar_type, GET_MODE_SIZE (TYPE_MODE (vector_type)));
+  poly_uint64 nunits;
+  if (!multiple_p (GET_MODE_SIZE (TYPE_MODE (vector_type)),
+                   GET_MODE_SIZE (TYPE_MODE (scalar_type)), &nunits))
+    return NULL_TREE;
+
+  return get_related_vectype_for_scalar_type (TYPE_MODE (vector_type),
+                                              scalar_type, nunits);
 }
 
 /* Function vect_is_simple_use.
Index: gcc/tree-vectorizer.c
===================================================================
--- gcc/tree-vectorizer.c	2019-11-06 12:35:12.764199495 +0000
+++ gcc/tree-vectorizer.c	2019-11-06 12:35:27.976093444 +0000
@@ -1359,7 +1359,7 @@ get_vec_alignment_for_array_type (tree t
   poly_uint64 array_size, vector_size;
 
   tree scalar_type = strip_array_types (type);
-  tree vectype = get_vectype_for_scalar_type_and_size (scalar_type, 0);
+  tree vectype = get_related_vectype_for_scalar_type (VOIDmode, scalar_type);
   if (!vectype
       || !poly_int_tree_p (TYPE_SIZE (type), &array_size)
       || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size)
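
For readers following the lane-based rewrite of the reduction epilogue, the halving loop can be sketched outside GCC roughly as below.  This is only a toy model: toy_halving_bitsizes is a hypothetical helper, the element width is passed in directly rather than derived from a type, and power-of-two lane counts are assumed.  Each recorded value plays the role of "bitsize" in the patch, i.e. the size in bits of the half-width vector type, which is also the bit offset of the upper half being extracted.

```cpp
#include <cassert>
#include <vector>

// Toy model (not GCC code) of the lane-based halving loop.  Starting from
// NUNITS lanes of ELEM_BITS-wide elements, halve the lane count until it
// reaches NUNITS1, recording the bit size of each half-width vector.
std::vector<unsigned>
toy_halving_bitsizes (unsigned nunits, unsigned nunits1, unsigned elem_bits)
{
  assert (nunits1 != 0 && elem_bits != 0);
  std::vector<unsigned> bitsizes;
  while (nunits > nunits1)
    {
      nunits /= 2;
      // Size of the narrower vector type for this step; the upper half of
      // the wider vector starts at this bit offset.
      bitsizes.push_back (nunits * elem_bits);
    }
  return bitsizes;
}
```

For the V4DI to V2DI case mentioned in the review, this would be a call with 4 lanes, a target of 2 lanes and 64-bit elements, giving a single step with a 128-bit half.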