On Thu, 21 Aug 2025, Zhou Zhao wrote:

> This patch is a respond of the patch posted at
> https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673051.html
> as some suggestion by Richard Biener, I have adopted these suggestions
> and regenerated the patch.
> 
> In the 538.imagick_r benchmark of Spec2017, I find these pattern from
> MagickRound function. This patch implements these pattern in match.pd
> for 4 rules under -funsafe-math-optimizations:
> 1) (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) -> floor(x+0.5)
> 2) (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) -> floor(x+0.5)
> 3) (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) -> floor(x+0.5)
> 4) (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) -> floor(x+0.5)
> 
> The patch implements floor(x+0.5) operation to replace these pattern
> that semantics of round(x) function.

Can you say why you particularly chose floor (x + 0.5) as result
while describing it to have the semantics of round (x)?  The reasonable
other choice is round(x) itself?

What exact differences prompt you do gate this with 
-funsafe-math-optimizations?  I can see signed zeros have
different behavior, for x == -0.0 all forms in original form
return -0.0 while the simplification will return 0.0.  The behavior
for Infs and NaNs looks unchanged.  0.5 and -0.5 seem to compute
to the same value when using floor(x+0.5) as simplification (unless
I made a mistake).  floor or ceil do not raise IEEE exceptions,
so wouldn't -fno-signed-zeros be enough as a gate?

Thanks,
Richard.

> 
> The patch was regtested on aarch64-linux-gnu and x86_64-linux-gnu,
> SPEC 2017 and SPEC 2006 were run:
> As for SPEC 2017, 538.imagick_r benchmark performance increased by 3%+
> in base test of ratio mode.
> As for SPEC 2006, while the transform does not seem to be triggered,
> we also see no non-noise impact on performance.
> OK for mainline?
> 
> gcc/ChangeLog:
> 
>       * match.pd: Add new pattern for round.
> 
> gcc/testsuite/ChangeLog:
> 
>       * gcc.dg/fold-round-1.c: New test.
> ---
>  gcc/match.pd                        | 17 +++++++++
>  gcc/testsuite/gcc.dg/fold-round-1.c | 56 +++++++++++++++++++++++++++++
>  2 files changed, 73 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/fold-round-1.c
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 66e8a787449..94036603e08 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -794,6 +794,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (rdiv @0 (negate @1))
>   (rdiv (negate @0) @1))
>  
> +(if (flag_unsafe_math_optimizations)
> +/* convert semantics of round(x) function to floor(x+0.5).  */
> +/* (x-floor(x)) < (ceil(x)-x) ? floor(x) : ceil(x) --> floor(x+0.5).  */
> +/* (x-floor(x)) >= (ceil(x)-x) ? ceil(x) : floor(x) --> floor(x+0.5).  */
> +/* (ceil(x)-x) > (x-floor(x)) ? floor(x) : ceil(x) --> floor(x+0.5).  */
> +/* (ceil(x)-x) <= (x-floor(x)) ? ceil(x) : floor(x) --> floor(x+0.5).  */
> +(for op (lt ge)
> +     bt (FLOOR CEIL)
> +     bf (CEIL FLOOR)
> +     floor (FLOOR FLOOR)
> +     ceil (CEIL CEIL)
> + (simplify
> +  (cond (op:c (minus:s SSA_NAME@0 (floor SSA_NAME@0))
> +           (minus:s (ceil SSA_NAME@0) SSA_NAME@0))
> +     (bt SSA_NAME@0) (bf SSA_NAME@0))
> +  (floor (plus @0 { build_real (type, dconsthalf); })))))
> +
>  (if (flag_unsafe_math_optimizations)
>   /* Simplify (C / x op 0.0) to x op 0.0 for C != 0, C != Inf/Nan.
>      Since C / x may underflow to zero, do this only for unsafe math.  */
> diff --git a/gcc/testsuite/gcc.dg/fold-round-1.c 
> b/gcc/testsuite/gcc.dg/fold-round-1.c
> new file mode 100644
> index 00000000000..845d6d2e475
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/fold-round-1.c
> @@ -0,0 +1,56 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast -funsafe-math-optimizations" } */
> +
> +extern void link_error (void);
> +
> +#define TEST_ROUND(TYPE, FFLOOR, FCEIL)                                      
>  \
> +  void round_##FFLOOR##_1 (TYPE x)                                           
>  \
> +  {                                                                          
>  \
> +    TYPE t1 = 0;                                                             
>  \
> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);                                  
>  \
> +    if ((x - __builtin_##FFLOOR (x)) < (__builtin_##FCEIL (x) - x))          
>  \
> +      t1 = __builtin_##FFLOOR (x);                                           
>  \
> +    else                                                                     
>  \
> +      t1 = __builtin_##FCEIL (x);                                            
>  \
> +    if (t1 != t2)                                                            
>  \
> +      link_error ();                                                         
>  \
> +  }                                                                          
>  \
> +  void round_##FFLOOR##_2 (TYPE x)                                           
>  \
> +  {                                                                          
>  \
> +    TYPE t1 = 0;                                                             
>  \
> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);                                  
>  \
> +    if ((__builtin_##FCEIL (x) - x) > (x - __builtin_##FFLOOR (x)))          
>  \
> +      t1 = __builtin_##FFLOOR (x);                                           
>  \
> +    else                                                                     
>  \
> +      t1 = __builtin_##FCEIL (x);                                            
>  \
> +    if (t1 != t2)                                                            
>  \
> +      link_error ();                                                         
>  \
> +  }                                                                          
>  \
> +  void round_##FFLOOR##_3 (TYPE x)                                           
>  \
> +  {                                                                          
>  \
> +    TYPE t1 = 0;                                                             
>  \
> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);                                  
>  \
> +    if ((__builtin_##FCEIL (x) - x) <= (x - __builtin_##FFLOOR (x)))         
>  \
> +      t1 = __builtin_##FCEIL (x);                                            
>  \
> +    else                                                                     
>  \
> +      t1 = __builtin_##FFLOOR (x);                                           
>  \
> +    if (t1 != t2)                                                            
>  \
> +      link_error ();                                                         
>  \
> +  }                                                                          
>  \
> +  void round_##FFLOOR##_4 (TYPE x)                                           
>  \
> +  {                                                                          
>  \
> +    TYPE t1 = 0;                                                             
>  \
> +    TYPE t2 = __builtin_##FFLOOR (x + 0.5);                                  
>  \
> +    if ((x - __builtin_##FFLOOR (x)) >= (__builtin_##FCEIL (x) - x))         
>  \
> +      t1 = __builtin_##FCEIL (x);                                            
>  \
> +    else                                                                     
>  \
> +      t1 = __builtin_##FFLOOR (x);                                           
>  \
> +    if (t1 != t2)                                                            
>  \
> +      link_error ();                                                         
>  \
> +  }
> +
> +TEST_ROUND (float, floorf, ceilf)
> +TEST_ROUND (double, floor, ceil)
> +TEST_ROUND (long double, floorl, ceill)
> +
> +/* { dg-final { scan-assembler-not "link_error" } } */
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to