Tim Peters <t...@python.org> added the comment:

Thanks for doing the "real ulp" calc, Raymond!  It was intended to make the 
Kahan gimmick look better, and it succeeded ;-)  I don't personally care 
whether adding 10K things ends up with 50 ulp error, but to each their own.

Division can be mostly removed from scaling, but it's a bit delicate.  The 
obvious stab would be to multiply by 1/max instead of dividing by max.  That 
introduces another rounding error, but as you've already discovered this 
computation as a whole is relatively insensitive to rounding errors in scaling. 
 The real problem is that 1/max can overflow (when max is subnormal - the IEEE 
range isn't symmetric).

Instead a power of 2 could be used, chosen so that it and its reciprocal are 
both representable.  There's no particular virtue in scaling the largest value 
to 1.0 - the real point is to avoid spurious overflow/underflow when squaring 
the scaled x.  Code like the following could work.  Scaling would introduce 
essentially no rounding errors then.

But I bet a call to `ldexp()` is even more expensive than division on most 
platforms.  So it may well be slower when there are few points.

def hyp_ps(xs):
    b = max(map(abs, xs))
    _, exp = frexp(b)
    exp = -3 * exp // 4
    # scale and invscale are exact.
    # Multiplying by scale is exact, except when
    # the result becomes subnormal (in which case
    # |x| is insignifcant compared to |b|).
    # invscale isn't needed until the loop ends,
    # so the division time should overlap with the
    # loop body.
    scale = ldexp(1.0, exp) # exact
    invscale = 1.0 / scale  # exact
    # and `x * scale` is exact except for underflow
    s = fsum((x * scale)**2 for x in xs)
    return invscale * sqrt(s)


Python tracker <rep...@bugs.python.org>
Python-bugs-list mailing list

Reply via email to