Re: [PATCH 03/14] aarch64: Relaxed SEL combiner patterns for unpacked SVE FP conversions

Spencer Abson Mon, 09 Jun 2025 09:12:18 -0700

On Mon, Jun 09, 2025 at 02:48:58PM +0100, Richard Sandiford wrote:
> Spencer Abson <spencer.ab...@arm.com> writes:
> > On Thu, Jun 05, 2025 at 09:24:27PM +0100, Richard Sandiford wrote:
> >> Spencer Abson <spencer.ab...@arm.com> writes:
> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/unpacked_cond_cvtf_1.c b/gcc/testsuite/gcc.target/aarch64/sve/unpacked_cond_cvtf_1.c
> >> > new file mode 100644
> >> > index 00000000000..8f69232f2cf
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/unpacked_cond_cvtf_1.c
> >> > @@ -0,0 +1,47 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=2048 -fno-trapping-math" } */
> >> 
> >> The =2048 is ok, but do you need it for these autovectorisation tests?
> >> If vectorisation is treated as not profitable without it, then perhaps
> >> we could switch to Tamar's -mmax-vectorization, once that's in.
> >
> > This isn't needed to make vectorization profitable, but rather to
> > make partial vector modes the reliably obvious choice - and hopefully
> > one that isn't affected by future cost model changes.  With =2048
> > and COUNT, each loop should be fully-unrolled into a single unpacked 
> > operation (plus setup and return).
> >
> > For me, this was much more flexible than using builtin vector types,
> > and easier to reason about.  Maybe that's just me though!  I can try
> > something else if it would be preferred.
> 
> I don't really agree about the "easier to reason about" bit: IMO,
> builtin vector types are the most direct and obvious way of testing
> things with fixed-length vectors, for the cases that they can handle
> directly.  But I agree that vectorisation is more flexible, in that
> it can deal with cases that fixed-length builtin vectors can't yet
> handle directly.
> 
> My main concern was that the tests didn't seem to have much coverage
> of normal VLA codegen.  If the aim is predictable costing, it might
> be enough to use -moverride=sve_width=2048 instead of
> -msve-vector-bits=2048.


I see - yeah, -moverride=sve_width=2048 is enough.

How about we use builtin vectors wherever possible, and fall back
to the current approach (but replacing -msve-vector-bits with
-moverride=sve_width) everywhere else?

Alternatively, if we'd like to focus on VLA codegen, I could
just replace -msve-vector-bits with -moverride=sve_width throughout
the series.
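
To make the two styles concrete, here is a rough sketch of what I have
in mind.  The type names, function names and COUNT value are purely
illustrative assumptions and are not taken from the series; the vector
width is kept small so that the snippet builds on any target.

```c
/* Sketch only: names and sizes are illustrative, not from the patch.  */
#include <stdint.h>

#define COUNT 4

/* Style 1: GNU builtin vector types.  The conversion is written
   directly on fixed-length vectors.  */
typedef int32_t v4si __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

v4sf
cvtf_builtin (v4si x)
{
  return __builtin_convertvector (x, v4sf);
}

/* Style 2: a fixed-trip-count loop left to the autovectoriser.  The
   condition means only selected lanes are converted; the others keep
   their previous value.  */
void
cond_cvtf_loop (float *restrict out, const int32_t *restrict in,
                const int32_t *restrict pred)
{
  for (int i = 0; i < COUNT; i++)
    out[i] = pred[i] ? (float) in[i] : out[i];
}
```

The loop form is the one that would keep the existing dg-options, with
-msve-vector-bits=2048 swapped for -moverride=sve_width=2048.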

Thanks,
Spencer
> 
> Thanks,
> Richard
