On Wed, 22 Oct 2025, Tamar Christina wrote:
> > -----Original Message-----
> > From: Richard Biener <[email protected]>
> > Sent: 22 October 2025 14:05
> > To: Tamar Christina <[email protected]>
> > Cc: Soumya AR <[email protected]>; [email protected]; Kyrylo
> > Tkachov <[email protected]>
> > Subject: RE: [PATCH] aarch64: Use SVE ASRD with vector division using
> > division operator
> >
> > On Wed, 22 Oct 2025, Tamar Christina wrote:
> >
> > > Hi Soumya,
> > >
> > > > -----Original Message-----
> > > > From: Soumya AR <[email protected]>
> > > > Sent: 13 October 2025 13:42
> > > > To: Tamar Christina <[email protected]>
> > > > Cc: [email protected]; Kyrylo Tkachov <[email protected]>;
> > > > Richard Biener <[email protected]>
> > > > Subject: Re: [PATCH] aarch64: Use SVE ASRD with vector division using
> > > > division operator
> > > > Importance: High
> > > >
> > > >
> > > >
> > > > > On 13 Oct 2025, at 1:48 PM, Tamar Christina
> > <[email protected]>
> > > > wrote:
> > > > >
> > > > > External email: Use caution opening links or attachments
> > > > >
> > > > >
> > > > > Hi Soumya,
> > > > >
> > > > > The 10/13/2025 07:05, Soumya AR wrote:
> > > > >> The ASRD instruction on SVE performs an arithmetic shift right by an
> > > > immediate
> > > > >> for divide. This patch enables ASRD when dividing vectors using the
> > > > >> GNU
> > C
> > > > >> division operator.
> > > > >>
> > > > >> For example:
> > > > >>
> > > > >> int32x4_t
> > > > >> foo (int32x4_t x)
> > > > >> {
> > > > >> return x / 4;
> > > > >> }
> > > > >>
> > > > >> svint32_t
> > > > >> bar (svint32_t x)
> > > > >> {
> > > > >> return x / 4;
> > > > >> }
> > > > >>
> > > > >> currently generates a DIV, but can be done using ASRD.
> > > > >>
> > > > >> The patch was bootstrapped and regtested on aarch64-linux-gnu, no
> > > > regression.
> > > > >> OK for GCC16?
> > > > >>
> > > > >> Signed-off-by: Soumya AR <[email protected]>
> > > > >>
> > > > >> gcc/ChangeLog:
> > > > >>
> > > > >> * expmed.cc (expand_divmod): Expand to sdiv_pow2 optab for
> > vectors.
> > > > >>
> > > > >> gcc/testsuite/ChangeLog:
> > > > >>
> > > > >> * gcc.target/aarch64/sve/sve-asrd-2.c: New test.
> > > > >>
> > > > >
> > > >
> > > > Hi Tamar,
> > > >
> > > > Thanks for the quick review!
> > > >
> > > > > The test is OK, but you'll need a middle-end maintainer for the
> > > > > expmed.cc
> > > > changes.
> > > >
> > > > My bad, CC'ing Richi here as well.
> > > >
> > > > > But that said...
> > > > >
> > > > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > > > > index df09cbccd08..a35aa229d44 100644
> > > > > --- a/gcc/expmed.cc
> > > > > +++ b/gcc/expmed.cc
> > > > > @@ -4465,6 +4465,24 @@ expand_divmod (int rem_flag, enum
> > tree_code
> > > > code, machine_mode mode,
> > > > > {
> > > > > case TRUNC_MOD_EXPR:
> > > > > case TRUNC_DIV_EXPR:
> > > > > + if (CONST_VECTOR_P (op1)
> > > > > + && optab_handler (sdiv_pow2_optab, mode) !=
> > CODE_FOR_nothing)
> > > > > + {
> > > > > + rtx scalar_op1 = unwrap_const_vec_duplicate (op1);
> > > > > + if (scalar_op1 != NULL_RTX && CONST_INT_P (scalar_op1))
> > > > >
> > > > > You can simplify this using const_vec_duplicate_p.
> > > > >
> > > > > + {
> > > > > + HOST_WIDE_INT d = INTVAL (scalar_op1);
> > > > > + if (d > 0 && pow2p_hwi (d))
> > > > > + {
> > > > > + rtx shift_amount
> > > > > + = gen_const_vec_duplicate (mode,
> > > > > + GEN_INT (floor_log2
> > > > > (d)));
> > > > > + return expand_binop (mode, sdiv_pow2_optab, op0,
> > > > > + shift_amount, target,
> > > > > unsignedp,
> > > > > + methods);
> > > > >
> > > > > This expansion can fail, when e.g. the shift amount is larger than the
> > bitsize
> > > > of the element,
> > > > > so you should check the result of it before exiting from
> > > > > expand_divmod.
> > > > >
> > > >
> > > > Makes sense, attaching an updated patch with both changes.
> > > >
> > >
> > > The new patch looks OK to me (it uses the old interface but that's what
> > > most
> > of
> > > expmed uses).
> > >
> > > I think your cover letter needs to mention that we can't do this in the
> > backend because
> > > Adv. SIMD lacks a division instruction and that the decomposition is
> > > harder
> > to match
> > > correctly in all cases.
> > >
> > > That said you still need a middle-end maintainer approval.
> >
> > So the backend advertises a vector division? Othewise I would have
> > expected vector lowering to lower this (if it does that at all, for
> > VLA vectors?!).
> >
> > So this is an alternative better expansion - but there should already
> > be generic code to deal with divisions by a power of two, why does
> > that not trigger? And what's this sdiv_pow2 thing again?
>
> Yes, expand intercepts in expand_divmod division by a power of 2 and
> decomposes
> It to shifts. We could re-recognize the instruction in combine of course,
> but we already
> have an optab the vectorizer would have used in this case [us]div_pow2, see
> vect_recog_divmod_pattern. So this is just teaching expand about the special
> division of
> a power of 2 optab that the vectorizer uses for normal vector operations.
I can't remember the difference of div_pow2 and shifts, but it feels like
this expand-time improvement should be done by expand_divmod then?
'asrd' is described as arithmetic shift right for divide by immediate
whatever the exact difference from 'asr' is ... the docs say sth
about round towards zero, but then we're using arithmetic right shifts
for signed division already ...
Richard.
> Thanks,
> Tamar
>
> >
> > Richard.
> >
> > > Thanks,
> > > Tamar
> > >
> > > > Thanks,
> > > > Soumya
> > > >
> > > > > Thanks,
> > > > > Tamar
> > > > >
> > > > > + }
> > > > > + }
> > > > > + }
> > > > > if (op1_is_constant)
> > > > > {
> > > > > scalar_int_mode int_mode = as_a <scalar_int_mode>
> > > > (compute_mode);
> > > >
> > >
> > >
> >
> > --
> > Richard Biener <[email protected]>
> > SUSE Software Solutions Germany GmbH,
> > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > Nuernberg)
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)