Re: [PATCH 03/14] aarch64: Relaxed SEL combiner patterns for unpacked SVE FP conversions

Richard Sandiford Tue, 10 Jun 2025 11:43:38 -0700

Spencer Abson <spencer.ab...@arm.com> writes:
> On Mon, Jun 09, 2025 at 02:48:58PM +0100, Richard Sandiford wrote:
>> Spencer Abson <spencer.ab...@arm.com> writes:
>> > On Thu, Jun 05, 2025 at 09:24:27PM +0100, Richard Sandiford wrote:
>> >> Spencer Abson <spencer.ab...@arm.com> writes:
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/unpacked_cond_cvtf_1.c b/gcc/testsuite/gcc.target/aarch64/sve/unpacked_cond_cvtf_1.c
>> >> > new file mode 100644
>> >> > index 00000000000..8f69232f2cf
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/unpacked_cond_cvtf_1.c
>> >> > @@ -0,0 +1,47 @@
>> >> > +/* { dg-do compile } */
>> >> > +/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=2048 -fno-trapping-math" } */
>> >> 
>> >> The =2048 is ok, but do you need it for these autovectorisation tests?
>> >> If vectorisation is treated as not profitable without it, then perhaps
>> >> we could switch to Tamar's -mmax-vectorization, once that's in.
>> >
>> > This isn't needed to make vectorization profitable, but rather to
>> > make partial vector modes the reliably obvious choice - and hopefully
>> > one that isn't affected by future cost model changes.  With =2048
>> > and COUNT, each loop should be fully-unrolled into a single unpacked 
>> > operation (plus setup and return).
>> >
>> > For me, this was much more flexible than using builtin vector types,
>> > and easier to reason about.  Maybe that's just me though!  I can try
>> > something else if it would be preferred.
>> 
>> I don't really agree about the "easier to reason about" bit: IMO,
>> builtin vector types are the most direct and obvious way of testing
>> things with fixed-length vectors, for the cases that they can handle
>> directly.  But I agree that vectorisation is more flexible, in that
>> it can deal with cases that fixed-length builtin vectors can't yet
>> handle directly.
>> 
>> My main concern was that the tests didn't seem to have much coverage
>> of normal VLA codegen.  If the aim is predictable costing, it might
>> be enough to use -moverride=sve_width=2048 instead of
>> -msve-vector-bits=2048.
>
> I see - yeah, -moverride=sve_width=2048 is enough.
>
> How about we use builtin vectors wherever possible, and fall back
> to the current approach (but replacing -msve-vector-bits with
> -moverride=sve_width) everywhere else?
>
> Alternatively, if we'd like to focus on VLA codegen, I could
> just replace -msve-vector-bits with -moverride=sve_width throughout
> the series.


I don't think there's any need to go back and change the way the tests
are written.  Just replacing -msve-vector-bits with -moverride=sve_width
for the vectoriser-based tests sounds good.

Thanks,
Richard
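
For illustration only, the option swap agreed above might look roughly like
this in a vectoriser-based test.  This is a sketch under assumptions, not
code from the patch: the function name, the COUNT value and the loop body
are hypothetical, and only the option string follows the thread.

```c
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -moverride=sve_width=2048 -fno-trapping-math" } */

#include <stdint.h>

/* Hypothetical element count: 2048 bits / 64-bit containers = 32.  */
#define COUNT 32

/* Hypothetical conditional int64_t -> float conversion.  Elements whose
   predicate is zero take the fallback value from B instead.  */
void
cond_cvtf_s64_f32 (float *restrict r, const int64_t *restrict a,
		   const float *restrict b, const int64_t *restrict pred)
{
  for (int i = 0; i < COUNT; i++)
    r[i] = pred[i] ? (float) a[i] : b[i];
}
```

The intent, as discussed in the thread, is that sve_width only informs
costing, so the generated code remains vector-length agnostic, whereas
-msve-vector-bits=2048 fixes the vector length.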
