On 24 November 2015 at 16:20, Tom Lane <t...@sss.pgh.pa.us> wrote:
> After further thought I've pretty well convinced myself that there is
> indeed no observable bug, at least as long as you assume that overflow
> within the multiplication will behave as stated above. The proof goes
> like this:
> We already know that the divisor-subtraction step cannot cause an overflow
> (modulo the question of div[qi] possibly exceeding the maxdiv limit,
> which I'll get back to). So what is at stake is just the possibility of
> overflow in the final update that transfers leftover digits from div[qi]
> to div[qi+1].
> To analyze that, consider just the first two dividend and divisor terms in
> the qdigit calculation, that is during each loop iteration approximate
> qdigit as
> qdigit ~= trunc((div[qi]*NBASE + div[qi+1]) /
> (var2digits*NBASE + var2digits))
> This holds whether or not we do a carry propagation step. Now write
> what the divisor-subtraction step updates div[qi] to as
> div[qi]' = div[qi] - qdigit * var2digits
> and the updated div[qi+1] as
> div[qi+1]' = div[qi+1] - qdigit * var2digits
> So, if we temporarily disregard the possibility of overflow within the
> final calculation
> div[qi+1]'' = div[qi+1]' + div[qi]'*NBASE
> we can see that div[qi+1]'' is going to be
> = div[qi+1]' + div[qi]'*NBASE
> = div[qi+1] - qdigit * var2digits +
> (div[qi] - qdigit * var2digits)*NBASE
> = div[qi]*NBASE + div[qi+1] -
> qdigit * (var2digits*NBASE + var2digits)
> Comparing that to the approximate value of qdigit, we can see that
> what we've got here is a modulo calculation, and the value of div[qi+1]''
> is going to cancel to zero except for a remainder modulo
> var2digits*NBASE + var2digits, which of course must fall between 0
> and NBASE*NBASE+NBASE (since the truncation is towards minus infinity).
Yes that makes sense.
> Now of course this is only an approximation. The main source of error is
> that we've omitted the lower-order dividend terms. Including div[qi+2]
> could change the result by at most about INT_MAX/NBASE (which is the most
> it could add to the numerator in the qdigit expression above, which will
> propagate more or less linearly to div[qi+1]''). Similarly, adding
> div[qi+3] could add at most INT_MAX/NBASE/NBASE. So we aren't anywhere
> near the overflow threshold. Including the lower-order divisor terms
> could change the value of qdigit by a multiplicative factor of no more
> than about 1/NBASE. Lastly, roundoff error in the floating-point part
> of the calculation could cause qdigit to be off by one count either
> way. None of these effects are going to let the final div[qi+1] value
> get to more than two or three times NBASE squared, which is still
> an order of magnitude less than INT_MAX.
Right. In fact I think div[qi+1] is even more constrained than that (see below).
> Therefore, the final div[qi+1] value cannot overflow an int, and even
> though it might be larger than what maxdiv would suggest, it's not going
> to be large enough to cause overflow in the next loop iteration either.
> It obviously won't cause overflow during carry propagation, and as for
> the next divisor-subtraction step, we can do an analysis similar to the
> above but approximating qdigit with just these terms:
> qdigit ~= trunc((div[qi] + div[qi+1]/NBASE) / var2digits)
> Plugging that into
> div[qi]' = div[qi] - qdigit * var2digits
> shows that the updated div[qi] in any loop iteration is going to be just
> about -div[qi+1]/NBASE, plus a truncation term that's between 0 and
> var2digits, plus lower-order terms that aren't going to get you very
> much past INT_MAX/NBASE. So div[qi]' is never an overflow hazard in any
> loop iteration.
> In short, it's impossible for the end result of the div[qi+1] update
> calculation to overflow, or even to get large enough to create a problem
> in the next loop iteration. However, the intermediate result
> div[qi]*NBASE can overflow as per the example I showed before.
I tried analysing this in a different way, by considering the maximum
possible error in the floating point computations. fdividend takes up
to 4 digits from the dividend:
fdividend = div[qi] * NBASE^3 + div[qi+1] * NBASE^2
+ div[qi+2] * NBASE + div[qi+3]
and the digits being discarded are limited to a little less than
+/-INT_MAX, so the error in fdividend due to discarding digits is at
most +/-INT_MAX/NBASE. fdivisor is computed in a similar way, except
that the var2 digits are in the range [0,NBASE-1], so fdivisor is in
the range [NBASE^3, NBASE^4-1]. Therefore the maximum error in the
computation of fquotient is +/-INT_MAX/NBASE^4, which is less than 1.
So the most this can do is to cause a single integer boundary to be
crossed, and therefore qdigit will be within +/-1 of the correct
result. That's consistent with the observation that in practice qdigit
always falls in the range [-1,10000].
In fact, since we round down when computing qdigit, it should always
be correct or 1 smaller than the correct result, but the "correct"
result in this context may be equal to 10000, to compensate for an
underestimate in the previously calculated digit.
It then follows that the result of the divisor subtraction step can be
to have up to one extra copy of the divisor added to the remainder,
and this will contribute up to an extra var2digits*NBASE^4 +
var2digits*NBASE^3 to fdividend in the next iteration, making
fdividend up to around NBASE*fdivisor.
So fdividend will always be less than around NBASE^5, which means that
at the top of the loop div[qi] is always less than
NBASE^2+INT_MAX/NBASE (since the next digit of the divisor could be
large and negative, down to -INT_MAX).
So, in other words, the limit on the value of div[qi+1] computed at
the bottom of the loop is around NBASE^2+INT_MAX/NBASE=100214748, and
so it can't possibly overflow a 32-bit integer.
I added some logging and ran the self tests plus a few other examples
analogous to your example upthread and the largest value for div[qi+1]
logged was 100210148, which is consistent with that bound.
> We could just ignore this on the grounds that it doesn't matter given sane
> behavior of integer arithmetic. Or, if we wanted to be more paranoid, we
> could do the multiplication-and-addition in int64 arithmetic; but that
> would probably slow things down a bit.
I doubt that a single int64 computation would make any noticeable
performance difference to this computation, given all the other code
in the loop, but I think it's probably unnecessary.
> I think all this deserves some code comments, but I'm not sure whether
> we need to bother with casting to int64. Thoughts?
> regards, tom lane
I think probably just a comment is OK.
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription: