Andi Gutmans wrote: > I don't quite understand. Can you give us a few examples, how they are > handled by PHP today and how they would be handled by your code. > I'll start off: > 0.9+0.1 > > 0.9+0.0000001 > > 8/10.0 + 0.2 > > I'm sure you know of juicier examples :) > > Andi >
Andi, All the fix does is to round the answer to DBL_DIG, i.e. 15, decimal digits. Much of the time the answer is already rounded to 15 decimal digits so the fix has no effect. This is the case with all your examples. However, if we change your last example to 8/10.0 - 0.2, the fix does do something. It doesn't change the division, which is still 0.8, (or rather the double closest to 0.8), but it does change the subtraction. Neither 0.8 nor 0.2 can be held precisely in a double, so they are actually stored as: 0.8 =1.1001100110011001100110011001100110011001100110011010*2^-1 0.2 =1.1001100110011001100110011001100110011001100110011010*2^-3 To see what happens on the subtraction we can look at them as 0.8 =1.1001100110011001100110011001100110011001100110011010*2^-1 0.2 =0.0110011001100110011001100110011001100110011001100110*2^-1 The result is : Answer=1.0011001100110011001100110011001100110011001100110100*2^-1 But this is not the same as 0.6 which is stored as 0.6 =1.0011001100110011001100110011001100110011001100110011*2^-1 The difference is only in the last three digits, 100 as against 011. That's a difference of only one "bit" or rather 2^-53, but that's quite enough for comparisons of (0.8 - 0.2) to 0.6 fail. What happens with my fix is that the answer that comes back from (0.8 - 0.2) is actually turned into the same number as 0.6, so that (0.8 - 0.2) is now equivalent to 0.6 and everything works fine. It's a pain in the neck to have to show the full binary representation of doubles. An alternative is to show the doubles in decimal notation to 17 digits. 17 decimal digits is the magic number which gives every double a unique decimal representation. Expressed as the best 17 decimal digit representation of the doubles, the previous description becomes: 0.8 =0.80000000000000004 0.2 =0.20000000000000001 Answer=0.60000000000000009 0.6 =0.59999999999999998 You'll notice that you can't check the arithmetic in the last place any more, but you can see there's a difference and that, to 15 decimal places the answer is 0.6, so as long as my rounding works properly you the answer turn into the double for 0.6, i.e. 0.59999... That means that comparisons of (0.8 - 0.2) and 0.6 work fine and (0.8 - 0.2) - 0.6 comes out as zero for me. If we look at another example, this time taken from the bug database, we can also see why the rounding is different for subtraction/addition because of "catastrophic loss of precision on differences". The example is 120 - 119.9: 120.0 =120.00000000000000 |<- 15th digit 119.9 =119.90000000000001 Answer=000.099999999999994316 |<- 15th digit 000.1 =000.10000000000000001 Note that all of these are shown to 17 decimal digits, and the 15th digit is marked. Now in this case the calculated answer is quite different from the real one, and it isn't the same to 15 decimal digits any more. What's happened is that the three digits shown in the answer which come before the decimal point, which are a "significant" part of the answer, are not counted as decimal digits because they are all zeros. They should really have been counted as proper digits when we decided where to round and then everything would have been fine. Compare this with 130 - 119.9: 130.0 =130.00000000000000 |<- 15th digit 119.9 =119.90000000000001 Answer=010.099999999999994 |<- 15th digit 010.1 =010.100000000000000 The answer is no more accurate than in the first case, but this time rounding to the 15th digit is fine. The problem is that, by convention, leading zeros are not decimal digits, even if they may contain useful information. What we need to do is to round to the digit corresponding to the same place as the 15th digit of the original number, to ensure leading zeros in the result are given due respect. So, instead of using the result to tell us where to round, we should look at the original operands, or rather whichever is biggest in size, (furthest from zero). To be even more precise, for addition and subtraction we should base rounding on whichever of the two operands and the result have the greatest absolute value. If we do that then we get answers that are equal to 0.1 and 10.1 and once again everything is fine. We haven't lost any real precision by using less than 15 decimal digits from the answer, we've just ignored the apparent extra precision which was simply unreliable. This business of special treatment for addition/subtraction is what my 'addednum' change was all about. As it happens I still got it wrong, because I forgot to look at both operands and the result itself, so redecimalize should really be: double redecimalize(double dval,double rval) { double f; if (dval == 0 || dval > pow(2.0,52)) { return dval; } if (fabs(dval)>fabs(rval)) { rval = fabs(dval); } f = pow(10.0, (double) DBL_DIG - (1 + floor(log10(fabs(rval))))); return (double) (rint(dval*f))/f; } You might notice that I've also added a check to make sure the number we're handling isn't actually greater than 2^52. If it is, it must be a representation of an integer in which case there is no point changing it, it is as good as it's going to get. This check also saves us having to worry about floating point overflow during rounding if the result being rounded is close to the maximum floating point. That's more than enough for now, but your questions are very helpful. That's two bugs in my fix that I've already spotted when trying to answer you. Many thanks and please ask more questions! George