Re: [PATCH v3 2/2]AArch64: propose -mmax-vectorization as an option to override vector costing

Andrew Pinski Tue, 03 Jun 2025 08:54:43 -0700

On Tue, Jun 3, 2025 at 3:07 AM Tamar Christina <tamar.christ...@arm.com> wrote:
>
> Hi All,
>
> The middle-end now provides a way to make vectorization more profitable by
> scaling vect-scalar-cost-multiplier.  This patch adds a more user-friendly
> option that makes the scaling easier to use.
>
> I propose making it an actual -m option that we document and retain, rather
> than exposing the parameter name.  In the future I would like to extend this
> option to modify additional costing in the AArch64 backend itself.
>
> This can be used together with --param aarch64-autovec-preference to get the
> vectorizer to, say, always vectorize with SVE.  I did consider making this an
> additional enum value for --param aarch64-autovec-preference, but I think it is
> also useful to be able to set this with pragmas and attributes.  I am open to
> suggestions.
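
To make the scaling idea in the quoted text concrete, here is a minimal sketch. It assumes the multiplier is read as a percentage applied to the scalar cost, as the comment in the patch suggests. The function name and the comparison are hypothetical simplifications for illustration, not GCC's actual profitability code.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a percentage-based scalar cost
   multiplier.  This is NOT GCC's real costing logic; it only shows
   why a large multiplier makes the vector loop win the comparison.  */
static bool
vector_is_profitable (unsigned long scalar_cost, unsigned long vector_cost,
                      unsigned long multiplier_percent)
{
  /* Scale the scalar cost by the percentage before comparing.  */
  unsigned long scaled_scalar = scalar_cost * multiplier_percent / 100;
  return vector_cost < scaled_scalar;
}
```

With a multiplier of 100 the costs are compared unchanged. With 10000, the value the patch sets, the scalar cost in this model is treated as 100 times larger, so the vector loop is chosen unless its cost is extreme.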


I filed a bug requesting support for --param in attributes/pragmas:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116092 . It might be easy
to add the support there too.
Note that this is not an objection to adding an -m option, which is
considered more stable than a --param; I just wanted to let you know
there is a bug about allowing --param there.
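
For reference, per-function use of the proposed attribute might look like the sketch below. It assumes the patch is applied and the compiler targets AArch64. HAVE_MAX_VECTORIZATION and MAX_VECT are hypothetical macros introduced here so the file still builds with compilers that lack the attribute, and scale is just an illustrative function.

```c
/* HAVE_MAX_VECTORIZATION is a hypothetical macro: define it only when
   building with an AArch64 GCC that includes this patch.  */
#if defined (__aarch64__) && defined (HAVE_MAX_VECTORIZATION)
# define MAX_VECT __attribute__ ((target ("max-vectorization")))
#else
# define MAX_VECT
#endif

/* Ask the vectorizer to treat vectorizing this function's loop as
   profitable; without the attribute this is an ordinary function.  */
MAX_VECT void
scale (int *restrict dst, const int *restrict src, int n, int k)
{
  for (int i = 0; i < n; i++)
    dst[i] = src[i] * k;
}
```

The attribute only influences costing, so the function computes the same result either way.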

Thanks,
Andrew

>
> Note that as a follow-up I plan on extending -fdump-tree-vect to support
> -stats, which is then intended to be usable with this flag.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf and x86_64-pc-linux-gnu (-m32 and -m64)
> with no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>         * config/aarch64/aarch64.opt (max-vectorization): New.
>         * config/aarch64/aarch64.cc (aarch64_override_options_internal): Save
>         and restore option.
>         Implement it through vect-scalar-cost-multiplier.
>         (aarch64_attributes): Default to off.
>         * common/config/aarch64/aarch64-common.cc (aarch64_handle_option):
>         Initialize option.
>         * doc/extend.texi (max-vectorization): Document attribute.
>         * doc/invoke.texi (max-vectorization): Document flag.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/aarch64/sve/cost_model_17.c: New test.
>         * gcc.target/aarch64/sve/cost_model_18.c: New test.
>
> ---
> diff --git a/gcc/common/config/aarch64/aarch64-common.cc b/gcc/common/config/aarch64/aarch64-common.cc
> index b9ed83642ade4462f1b030d68cf9744d31d70c23..1488697c6ce43108ae2938e5b8a00ac7ac262da6 100644
> --- a/gcc/common/config/aarch64/aarch64-common.cc
> +++ b/gcc/common/config/aarch64/aarch64-common.cc
> @@ -142,6 +142,10 @@ aarch64_handle_option (struct gcc_options *opts,
>        opts->x_aarch64_flag_outline_atomics = val;
>        return true;
>
> +    case OPT_mmax_vectorization:
> +      opts->x_flag_aarch64_max_vectorization = val;
> +      return true;
> +
>      default:
>        return true;
>      }
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 9e3f2885bccb62550c5fcfdf93d72fbc2e63233e..f11f0da28915f49829360cd7a6269e2a3f67a860 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -18973,6 +18973,12 @@ aarch64_override_options_internal (struct gcc_options *opts)
>    if (TARGET_SME && !TARGET_SVE2)
>      sorry ("no support for %qs without %qs", "sme", "sve2");
>
> +  /* Set scalar costing to a high value such that we always pick
> +     vectorization.  Increase scalar costing by 10000%.  */
> +  if (opts->x_flag_aarch64_max_vectorization)
> +    SET_OPTION_IF_UNSET (opts, &global_options_set,
> +                        param_vect_scalar_cost_multiplier, 10000);
> +
>    aarch64_override_options_after_change_1 (opts);
>  }
>
> @@ -19723,6 +19729,8 @@ static const struct aarch64_attribute_info aarch64_attributes[] =
>       OPT_msign_return_address_ },
>    { "outline-atomics", aarch64_attr_bool, true, NULL,
>       OPT_moutline_atomics},
> +  { "max-vectorization", aarch64_attr_bool, false, NULL,
> +     OPT_mmax_vectorization},
>    { NULL, aarch64_attr_custom, false, NULL, OPT____ }
>  };
>
> diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
> index f32d56d4ffaef7862c1c45a11753be5d480220d0..2725c50da64a2c05489ea6202bdd5eedf1ba7e27 100644
> --- a/gcc/config/aarch64/aarch64.opt
> +++ b/gcc/config/aarch64/aarch64.opt
> @@ -290,6 +290,10 @@ msve-vector-bits=
>  Target RejectNegative Joined Enum(sve_vector_bits) Var(aarch64_sve_vector_bits) Init(SVE_SCALABLE)
>  -msve-vector-bits=<number>     Set the number of bits in an SVE vector register.
>
> +mmax-vectorization
> +Target Undocumented Var(flag_aarch64_max_vectorization) Save
> +Override the scalar cost model such that vectorization is always profitable.
> +
>  mverbose-cost-dump
>  Target Undocumented Var(flag_aarch64_verbose_cost)
>  Enables verbose cost model dumping in the debug dump files.
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 40ccf22b29f4316928f905ec2c978fdaf30a55ec..429cbeb8a3c8186af60b1441acb52bc052b60721 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -3882,6 +3882,15 @@ Enable or disable calls to out-of-line helpers to implement atomic operations.
>  This corresponds to the behavior of the command-line options
>  @option{-moutline-atomics} and @option{-mno-outline-atomics}.
>
> +@cindex @code{max-vectorization} function attribute, AArch64
> +@item max-vectorization
> +@itemx no-max-vectorization
> +@code{max-vectorization} tells GCC's vectorizer to treat all vector
> +loops as being more profitable than the original scalar loops when
> +optimizing the current function.  @code{no-max-vectorization} disables
> +this behavior.  This corresponds to the command-line options
> +@option{-mmax-vectorization} and @option{-mno-max-vectorization}.
> +
>  @cindex @code{indirect_return} function attribute, AArch64
>  @item indirect_return
>  The @code{indirect_return} attribute can be applied to a function type
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 95a25c0f63b77f26db05a7b48bfad8f9c58bcc5f..54d4cf88cccd4d0e4ede3442ad9907faac325d52 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -21984,6 +21984,14 @@ used directly.  The same applies when using @option{-mcpu=} when the
>  selected cpu supports the @samp{lse} feature.
>  This option is on by default.
>
> +@item -mmax-vectorization
> +@itemx -mno-max-vectorization
> +Enable or disable override to vectorizer cost model making vectorization always
> +profitable.  This option can be combined with -mautovec-preference=... allowing
> +precise control over which ISA will be used for auto-vectorization.  Unlike
> +-fno-vect-cost-model or -fvect-cost-model=unlimited this option does not turn
> +off cost comparison between different vector modes.
> +
>  @opindex march
>  @item -march=@var{name}
>  Specify the name of the target architecture and, optionally, one or
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cost_model_17.c b/gcc/testsuite/gcc.target/aarch64/sve/cost_model_17.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..c405591a101d50b4734bc6d65a6d6c01888bea48
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/cost_model_17.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast -march=armv8-a+sve -mmax-vectorization -fdump-tree-vect-details" } */
> +
> +void
> +foo (char *restrict a, int *restrict b, int *restrict c,
> +     int *restrict d, int stride)
> +{
> +    if (stride <= 1)
> +        return;
> +
> +    for (int i = 0; i < 3; i++)
> +        {
> +            int res = c[i];
> +            int t = b[i * stride];
> +            if (a[i] != 0)
> +                res = t * d[i];
> +            c[i] = res;
> +        }
> +}
> +
> +/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cost_model_18.c b/gcc/testsuite/gcc.target/aarch64/sve/cost_model_18.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..8e91f9e9c29971a4bb7033be6be4d2fc4c71d05a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/cost_model_18.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast -march=armv8-a+sve -fdump-tree-vect-details" } */
> +
> +void __attribute__ (( target ("max-vectorization")))
> +foo (char *restrict a, int *restrict b, int *restrict c,
> +     int *restrict d, int stride)
> +{
> +    if (stride <= 1)
> +        return;
> +
> +    for (int i = 0; i < 3; i++)
> +        {
> +            int res = c[i];
> +            int t = b[i * stride];
> +            if (a[i] != 0)
> +                res = t * d[i];
> +            c[i] = res;
> +        }
> +}
> +
> +/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" } } */
>
>
> --
