Instead of calculating the reciprocal of x upfront and doing the repeated multiplication with that, calculate the reciprocal of the result at the end instead.
As a special case, for cases where this algorithm overflows (where the result would be a subnormal), fall back to the previous algorithm. Instead of spending upfront complexity on trying to decide between these two algorithms, just try to detect whether this happened afterwards (which should be quite cheap) and redo the calculation if that happened - with the assumption that this is the uncommon case (subnormals generally can be considered more expensive). This fixes https://github.com/msys2/MINGW-packages/issues/11733. Signed-off-by: Martin Storsjö <[email protected]> --- This is a separate attempt at this; instead of trying to replace the implementation of the core loop in the algorithm with a separate one, just keep that part (which is essentially fine even if it's vaguely different than others), and just apply the logic change of moving the reciprocal to the end. Then secondly, instead of making an expensive effort upfront of detecting whether the calculation will overflow (which should be the exception case, not the default case), just try doing the preferred algorithm, and if it fails, redo it with the original algorithm. --- mingw-w64-crt/math/powi.def.h | 45 +++++++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 12 deletions(-) diff --git a/mingw-w64-crt/math/powi.def.h b/mingw-w64-crt/math/powi.def.h index 1997727fd..bc57fb0f1 100644 --- a/mingw-w64-crt/math/powi.def.h +++ b/mingw-w64-crt/math/powi.def.h @@ -68,6 +68,22 @@ #include <math.h> #include <errno.h> +static __FLT_TYPE do_powi_iter(__FLT_TYPE d, int y) +{ + unsigned int u = (unsigned int) y; + __FLT_TYPE rslt = ((u & 1) != 0) ? d : __FLT_CST(1.0); + u >>= 1; + do + { + d *= d; + if ((u & 1) != 0) + rslt *= d; + u >>= 1; + } + while (u > 0); + return rslt; +} + __FLT_TYPE __cdecl __FLT_ABI(__powi) (__FLT_TYPE x, int y); @@ -76,6 +92,7 @@ __FLT_ABI(__powi) (__FLT_TYPE x, int y) { int x_class = fpclassify (x); int odd_y = y & 1; + int recip = 0; __FLT_TYPE d, rslt; if (y == 0 || x == __FLT_CST(1.0)) @@ -125,7 +142,8 @@ __FLT_ABI(__powi) (__FLT_TYPE x, int y) if (y < 0) { - d = __FLT_CST(1.0) / d; + /* By default, do the reciprocal of the result. */ + recip = 1; y = -y; } @@ -135,18 +153,21 @@ __FLT_ABI(__powi) (__FLT_TYPE x, int y) rslt = d; else { - unsigned int u = (unsigned int) y; - rslt = ((u & 1) != 0) ? d : __FLT_CST(1.0); - u >>= 1; - do - { - d *= d; - if ((u & 1) != 0) - rslt *= d; - u >>= 1; - } - while (u > 0); + rslt = do_powi_iter(d, y); + if (recip && fpclassify(rslt) == FP_INFINITE && d > __FLT_CST(1.0)) + { + /* Uncommon case - we had overflow, but we're going to calculate + the reciprocal. If this happened, redo the calculation by doing + the reciprocal upfront instead. Instead of trying to calculate + whether this will happen, we prefer keeping the default case + cheap. */ + d = __FLT_CST(1.0) / d; + recip = 0; + rslt = do_powi_iter(d, y); + } } + if (recip) + rslt = __FLT_CST(1.0) / rslt; if (signbit (x) && odd_y) rslt = -rslt; return rslt; -- 2.25.1 _______________________________________________ Mingw-w64-public mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
