Hi,

Sorry for the slow review.

> The new version v7 is attached which has addressed your review comments
> on v6.  Could you have a further look?  Many thanks in advance!
>
> Bootstrapped/regtested on aarch64-linux-gnu and powerpc64le-linux-gnu P9.
> Even with explicit vect-partial-vector-usage settings 1/2 on Power target,
> I didn't find any remarkable failures (only some trivial test case issues).

Thanks, this looks great.  OK for trunk with the minor nits below fixed.

"Kewen.Lin" <li...@linux.ibm.com> writes:
> @@ -968,4 +968,8 @@ Bound on number of runtime checks inserted by the 
> vectorizer's loop versioning f
>  Common Joined UInteger Var(param_vect_max_version_for_alignment_checks) 
> Init(6) Param Optimization
>  Bound on number of runtime checks inserted by the vectorizer's loop 
> versioning for alignment check.
>  
> +-param=vect-partial-vector-usage=
> +Common Joined UInteger Var(param_vect_partial_vector_usage) Init(2) 
> IntegerRange(0, 2) Param Optimization
> +Controls how loop vectorizer uses partial vectors.  0 means never, 1 means 
> only for loops whose iterating need can be removed, 2 means for all loops.  
> The default value is 2.

IMO reads better as s/iterating need/need to iterate/

> +  FOR_EACH_MODE_IN_CLASS (tmode_iter, MODE_INT)
> +  {
> +    scalar_mode tmode = tmode_iter.require ();
> +    unsigned int tbits = GET_MODE_BITSIZE (tmode);
> +
> +    /* ??? Do we really want to construct one IV whose precision exceeds
> +       BITS_PER_WORD?  */
> +    if (tbits > BITS_PER_WORD)
> +      break;
> +
> +    /* Find the first available standard integral type.  */
> +    if (tbits >= min_ni_prec && targetm.scalar_mode_supported_p (tmode))
> +      {
> +     iv_type = build_nonstandard_integer_type (tbits, true);
> +     break;
> +      }
> +  }

The outer {…} block should be indented by two spaces relative
to the FOR_EACH_MODE_IN_CLASS.
> +
> +  if (!iv_type)
> +    {
> +      if (dump_enabled_p ())
> +     dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> +                      "can't vectorize with length-based partial vectors"
> +                      " due to no suitable iv type.\n");

IMO reads better as s/due to/because there is/

> +  /* Shouldn't go with length-based approach if fully masked.  */
> +  gcc_assert (!loop_lens || (loop_lens && !loop_masks));

The “loop_lens &&” is redundant.

Same for vectorizable_load.

> @@ -7994,6 +8030,42 @@ vectorizable_store (vec_info *vinfo,
>                 vect_finish_stmt_generation (vinfo, stmt_info, call, gsi);
>                 new_stmt = call;
>               }
> +           else if (loop_lens)
> +             {
> +               tree final_len
> +                 = vect_get_loop_len (loop_vinfo, loop_lens,
> +                                      vec_num * ncopies, vec_num * j + i);
> +               align = least_bit_hwi (misalign | align);
> +               tree ptr = build_int_cst (ref_type, align);
> +               machine_mode vmode = TYPE_MODE (vectype);
> +               opt_machine_mode new_ovmode
> +                 = get_len_load_store_mode (vmode, false);
> +               gcc_assert (new_ovmode.exists ());
> +               machine_mode new_vmode = new_ovmode.require ();

The assert is redundant with the “require ()”.

Same for vectorizable_load.

Thanks,
Richard

Reply via email to