On Tue, Jun 10, 2025 at 07:54:31PM +0100, Richard Sandiford wrote:
> Spencer Abson <spencer.ab...@arm.com> writes:
> > On Fri, Jun 06, 2025 at 12:46:32PM +0100, Richard Sandiford wrote:
> >> Spencer Abson <spencer.ab...@arm.com> writes:
> >> > This patch extends the unpredicated FP division expander to support
> >> > partial FP modes.  It extends the existing patterns used to implement
> >> > UNSPEC_COND_FDIV and it's approximation as needed.
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> >  * config/aarch64/aarch64-sve.md: (@aarch64_sve_<optab><mode>):
> >> >  Extend from SVE_FULL_F to SVE_F, use aarch64_predicate_operand.
> >> >  (div<mode>3): Extend from SVE_FULL_F to SVE_F.
> >> >  (@aarch64_frecpe<mode>): Likewise.
> >> >  (@aarch64_frecps<mode>): Likewise.
> >> >
> >> > gcc/testsuite/ChangeLog:
> >> >
> >> >  * gcc.target/aarch64/sve/unpacked_fdiv_1.c: New test.
> >> >  * gcc.target/aarch64/sve/unpacked_fdiv_2.c: Likewise.
> >> >  * gcc.target/aarch64/sve/unpacked_fdiv_3.c: Likewise.
> >> > ---
> >> >  gcc/config/aarch64/aarch64-sve.md             | 50 +++++++++----------
> >> >  .../gcc.target/aarch64/sve/unpacked_fdiv_1.c  | 34 +++++++++++++
> >> >  .../gcc.target/aarch64/sve/unpacked_fdiv_2.c  | 11 ++++
> >> >  .../gcc.target/aarch64/sve/unpacked_fdiv_3.c  | 11 ++++
> >> >  4 files changed, 81 insertions(+), 25 deletions(-)
> >> >  create mode 100644 
> >> > gcc/testsuite/gcc.target/aarch64/sve/unpacked_fdiv_1.c
> >> >  create mode 100644 
> >> > gcc/testsuite/gcc.target/aarch64/sve/unpacked_fdiv_2.c
> >> >  create mode 100644 
> >> > gcc/testsuite/gcc.target/aarch64/sve/unpacked_fdiv_3.c
> >> >
> >> > diff --git a/gcc/config/aarch64/aarch64-sve.md 
> >> > b/gcc/config/aarch64/aarch64-sve.md
> >> > index cdad900d9cf..79a087837de 100644
> >> > --- a/gcc/config/aarch64/aarch64-sve.md
> >> > +++ b/gcc/config/aarch64/aarch64-sve.md
> >> > @@ -3752,9 +3752,9 @@
> >> >  
> >> >  ;; Unpredicated floating-point unary operations.
> >> >  (define_insn "@aarch64_sve_<optab><mode>"
> >> > -  [(set (match_operand:SVE_FULL_F 0 "register_operand" "=w")
> >> > -        (unspec:SVE_FULL_F
> >> > -          [(match_operand:SVE_FULL_F 1 "register_operand" "w")]
> >> > +  [(set (match_operand:SVE_F 0 "register_operand" "=w")
> >> > +        (unspec:SVE_F
> >> > +          [(match_operand:SVE_F 1 "register_operand" "w")]
> >> >            SVE_FP_UNARY))]
> >> >    "TARGET_SVE"
> >> >    "<sve_fp_op>\t%0.<Vetype>, %1.<Vetype>"
> >> 
> >> I agree the patch is correct for the current definitions of SVE_FP_UNARY
> >> and SVE_FP_BINARY.  Since the names are generic, though, I think it
> >> would be worth adding a comment to iterators.md above the definition
> >> of both iterators, saying something like:
> >> 
> >> ;; This iterator is currently only used for estimation instructions,
> >> ;; where lax handling of exceptions is assumed to be acceptable.
> >> ;; The iterator is therefore applied unconditionally to partial FP modes.
> >> ;; This would need to be revisited if new operations are added in future.
> >> 
> >> (feel free to reword)
> >> 
> >> The patch LGTM with that change.
> >> 
> >> That said, I suppose the documentation of the -mlow-precision-*
> >> options doesn't explicit state that they change exception behaviour.
> >> Maybe it would be safer to leave the reciprocal stuff out for now,
> >> even though wanting low-precision results with strict exception
> >> conformance seems like an odd combination.  Does anyone else have
> >> any thoughts?
> >
> > Yeah, I agree that it's not immediately clear whether -mlow-precision-*
> > alone justifies this.  I wouldn't have made this change if the low-
> > precision expansion wasn't predicated on all of:
> >
> >   if (!flag_finite_math_only
> >       || flag_trapping_math
> >       || !flag_unsafe_math_optimizations
> >       || optimize_function_for_size_p (cfun)
> >       || !use_approx_division_p)
> >     return false;
> >
> > Which, IIUC, reflects the fact that it also requires -ffast-math or
> > -funsafe-math-optimizations.
> 
> Ah, OK.
> >
> > I should have placed an assertion (or something similar) to make sure
> > that we have !flag_trapping_math when the low-precision expander is
> > handling partial vector modes.
> >
> > Perhaps something for V2?  Happy to drop it for now if not.
> 
> I think in that case we can just change the comment I suggested to:
> 
> ;; This iterator is currently only used for estimation instructions,
> ;; which are never generated automatically when -ftrapping-math is true.
> ;; The iterator is therefore applied unconditionally to partial FP modes.
> ;; This might need to be revisited if new operations are added in future.
> 
> It can be tricky for expanders to assert that the caller is sane,
> because expanders generally don't have as much context as the caller does.
> For example, the instructions could theoretically be protected by a
> test for trouble-free inputs, a bit like for -ftree-builtin-call-dce.

Thank you, sounds good.

Spencer
> 
> Thanks,
> Richard

Reply via email to