On Tue, 26 Aug 2025, Zhou Zhao wrote:

> 
> 在 2025/8/26 下午6:52, Richard Biener 写道:
> > On Tue, 26 Aug 2025, Zhou Zhao wrote:
> >
> >> 在 2025/8/26 下午3:37, Richard Biener 写道:
> >>> On Thu, 21 Aug 2025, Zhou Zhao wrote:
> >>>
> >>>> This patch is a respond of the patch posted at
> >>>> https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673051.html
> >>>> as some suggestion by Richard Biener, I have adopted these suggestions
> >>>> and regenerated the patch.
> >>>>
> >>>> In the 538.imagick_r benchmark of Spec2017, I find these pattern from
> >>>> MagickRound function. This patch implements these pattern in match.pd
> >>>> for 4 rules under -funsafe-math-optimizations:
> >>>> 1) (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) -> floor(x+0.5)
> >>>> 2) (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) -> floor(x+0.5)
> >>>> 3) (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) -> floor(x+0.5)
> >>>> 4) (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) -> floor(x+0.5)
> >>>>
> >>>> The patch implements floor(x+0.5) operation to replace these pattern
> >>>> that semantics of round(x) function.
> >>> Can you say why you particularly chose floor (x + 0.5) as result
> >>> while describing it to have the semantics of round (x)?  The reasonable
> >>> other choice is round(x) itself?
> >>>
> >>> What exact differences prompt you do gate this with
> >>> -funsafe-math-optimizations?  I can see signed zeros have
> >>> different behavior, for x == -0.0 all forms in original form
> >>> return -0.0 while the simplification will return 0.0.  The behavior
> >>> for Infs and NaNs looks unchanged.  0.5 and -0.5 seem to compute
> >>> to the same value when using floor(x+0.5) as simplification (unless
> >>> I made a mistake).  floor or ceil do not raise IEEE exceptions,
> >>> so wouldn't -fno-signed-zeros be enough as a gate?
> >>>
> >>> Thanks,
> >>> Richard.
> >>>
> >> Thank you for your reply. The time interval since the last patch
> >> submission might be too long, so I will re-describe our discussion on
> >> the above issues:
> >>
> >> 1. I consider the round functions are round(x) and rint(x). In round
> >> halfway cases, round(x) away from zero, rint(x) rounds x to the nearest
> >> even integer. When the pattern input is x=-2.5, return -2.0, but
> >> round(-2.5) return -3.0. When the pattern input is x=2.5, it return
> >> 3.0, but rint(2.5) return 2.0. Therefore, using floor(x + 0.5) is the
> >> best matches expression I think. Do you have any other functions with
> >> semantics of round that could be used to represent this pattern?
> > No, I think that covers it.  rint() also is affected by the rounding
> > mode so I think cannot be used here.
> >
> >> 2. As you mentioned, I need to add the -funsafe-math-optimizations
> >> option to protect the (+0.5) operation. With the -Ofast option, which
> >> enables -fno-signed-zeros, I observed that when x = -0.4, the pattern
> >> returns -0.0 on aarch64-linux-gnu but returns 0.0 on x86_64-linux-gnu.
> >> floor(x + 0.5) will return 0.0 on all the above targets. Additionally,
> >> the pattern behaves the same for all double values, including INFs
> >> and NaNs.
> >>
> >> 3. Indeed, I cannot guarantee that (+0.5) will always yield the
> >> expected value, so I use -funsafe-math-optimizations  for protection.
> >> I think this is better than checking HONOR_NANS/INFS/SIGNED_ZEROS,
> >> because when the true result of x + 0.5 cannot be exactly represented
> >> in the target floating-point format, and happens to be halfway between
> >> two adjacent floating-point numbers, different rounding rules will
> >> make different choices, leading to different final results.
> > I think 0.5 can be always exactly represented, but there might be
> > a special 'x' where + 0.5 triggers a one-ulp difference depending
> > on rounding mode.
> >
> > That said, HONOR_SIGN_DEPENDENT_ROUNDING might be also an issue
> > because of that.
> >
> > But other than that the transform should be value-preserving?
> yes, I think so.
> > So I'd gate with !HONOR_SIGNED_ZEROS && !HONOR_SIGN_DEPENDENT_ROUNDING
> > instead?
> These conditions can indeed cover all the case I have considered,
> I'm no issues with it.
> 
> If no other issues, I will make the revisions and then resubmit
> [patch v3] afterward.

Yes please, the rest looks OK.

Richard.

> Thanks
> Zhou Zhao.
> > Richard.
> >
> >> Thanks,
> >> Zhou Zhao.
> >>>> The patch was regtested on aarch64-linux-gnu and x86_64-linux-gnu,
> >>>> SPEC 2017 and SPEC 2006 were run:
> >>>> As for SPEC 2017, 538.imagick_r benchmark performance increased by 3%+
> >>>> in base test of ratio mode.
> >>>> As for SPEC 2006, while the transform does not seem to be triggered,
> >>>> we also see no non-noise impact on performance.
> >>>> OK for mainline?
> >>>>
> >>>> gcc/ChangeLog:
> >>>>
> >>>>   * match.pd: Add new pattern for round.
> >>>>
> >>>> gcc/testsuite/ChangeLog:
> >>>>
> >>>>  * gcc.dg/fold-round-1.c: New test.
> >>>> ---
> >>>>    gcc/match.pd                        | 17 +++++++++
> >>>>    gcc/testsuite/gcc.dg/fold-round-1.c | 56 +++++++++++++++++++++++++++++
> >>>>    2 files changed, 73 insertions(+)
> >>>>    create mode 100644 gcc/testsuite/gcc.dg/fold-round-1.c
> >>>>
> >>>> diff --git a/gcc/match.pd b/gcc/match.pd
> >>>> index 66e8a787449..94036603e08 100644
> >>>> --- a/gcc/match.pd
> >>>> +++ b/gcc/match.pd
> >>>> @@ -794,6 +794,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >>>>     (rdiv @0 (negate @1))
> >>>>     (rdiv (negate @0) @1))
> >>>>    +(if (flag_unsafe_math_optimizations)
> >>>> +/* convert semantics of round(x) function to floor(x+0.5).  */
> >>>> +/* (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) --> floor(x+0.5).  */
> >>>> +/* (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) --> floor(x+0.5).
> >>>> */
> >>>> +/* (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) --> floor(x+0.5).  */
> >>>> +/* (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) --> floor(x+0.5).
> >>>> */
> >>>> +(for op (lt ge)
> >>>> +     bt (FLOOR CEIL)
> >>>> +     bf (CEIL FLOOR)
> >>>> +     floor (FLOOR FLOOR)
> >>>> +     ceil (CEIL CEIL)
> >>>> + (simplify
> >>>> +  (cond (op:c (minus:s SSA_NAME@0 (floor SSA_NAME@0))
> >>>> +              (minus:s (ceil SSA_NAME@0) SSA_NAME@0))
> >>>> +        (bt SSA_NAME@0) (bf SSA_NAME@0))
> >>>> +  (floor (plus @0 { build_real (type, dconsthalf); })))))
> >>>> +
> >>>>    (if (flag_unsafe_math_optimizations)
> >>>>     /* Simplify (C / x op 0.0) to x op 0.0 for C != 0, C != Inf/Nan.
> >>>>        Since C / x may underflow to zero, do this only for unsafe math.
> >>>> */
> >>>> diff --git a/gcc/testsuite/gcc.dg/fold-round-1.c
> >>>> b/gcc/testsuite/gcc.dg/fold-round-1.c
> >>>> new file mode 100644
> >>>> index 00000000000..845d6d2e475
> >>>> --- /dev/null
> >>>> +++ b/gcc/testsuite/gcc.dg/fold-round-1.c
> >>>> @@ -0,0 +1,56 @@
> >>>> +/* { dg-do compile } */
> >>>> +/* { dg-options "-Ofast -funsafe-math-optimizations" } */
> >>>> +
> >>>> +extern void link_error (void);
> >>>> +
> >>>> +#define TEST_ROUND(TYPE, FFLOOR, FCEIL)
> >>>> \
> >>>> +  void round_##FFLOOR##_1 (TYPE x)
> >>>> \
> >>>> +  {
> >>>> \
> >>>> +    TYPE t1 = 0;
> >>>> \
> >>>> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);
> >>>> \
> >>>> +    if ((x - __builtin_##FFLOOR (x)) < (__builtin_##FCEIL (x) - x))
> >>>> \
> >>>> +      t1 = __builtin_##FFLOOR (x);
> >>>> \
> >>>> +    else
> >>>> \
> >>>> +      t1 = __builtin_##FCEIL (x);
> >>>> \
> >>>> +    if (t1 != t2)
> >>>> \
> >>>> +      link_error ();
> >>>> \
> >>>> +  }
> >>>> \
> >>>> +  void round_##FFLOOR##_2 (TYPE x)
> >>>> \
> >>>> +  {
> >>>> \
> >>>> +    TYPE t1 = 0;
> >>>> \
> >>>> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);
> >>>> \
> >>>> +    if ((__builtin_##FCEIL (x) - x) > (x - __builtin_##FFLOOR (x)))
> >>>> \
> >>>> +      t1 = __builtin_##FFLOOR (x);
> >>>> \
> >>>> +    else
> >>>> \
> >>>> +      t1 = __builtin_##FCEIL (x);
> >>>> \
> >>>> +    if (t1 != t2)
> >>>> \
> >>>> +      link_error ();
> >>>> \
> >>>> +  }
> >>>> \
> >>>> +  void round_##FFLOOR##_3 (TYPE x)
> >>>> \
> >>>> +  {
> >>>> \
> >>>> +    TYPE t1 = 0;
> >>>> \
> >>>> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);
> >>>> \
> >>>> +    if ((__builtin_##FCEIL (x) - x) <= (x - __builtin_##FFLOOR (x)))
> >>>> \
> >>>> +      t1 = __builtin_##FCEIL (x);
> >>>> \
> >>>> +    else
> >>>> \
> >>>> +      t1 = __builtin_##FFLOOR (x);
> >>>> \
> >>>> +    if (t1 != t2)
> >>>> \
> >>>> +      link_error ();
> >>>> \
> >>>> +  }
> >>>> \
> >>>> +  void round_##FFLOOR##_4 (TYPE x)
> >>>> \
> >>>> +  {
> >>>> \
> >>>> +    TYPE t1 = 0;
> >>>> \
> >>>> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);
> >>>> \
> >>>> +    if ((x - __builtin_##FFLOOR (x)) >= (__builtin_##FCEIL (x) - x))
> >>>> \
> >>>> +      t1 = __builtin_##FCEIL (x);
> >>>> \
> >>>> +    else
> >>>> \
> >>>> +      t1 = __builtin_##FFLOOR (x);
> >>>> \
> >>>> +    if (t1 != t2)
> >>>> \
> >>>> +      link_error ();
> >>>> \
> >>>> +  }
> >>>> +
> >>>> +TEST_ROUND (float, floorf, ceilf)
> >>>> +TEST_ROUND (double, floor, ceil)
> >>>> +TEST_ROUND (long double, floorl, ceill)
> >>>> +
> >>>> +/* { dg-final { scan-assembler-not "link_error" } } */
> >>>>
> >>
> >>
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to