Andi Gutmans wrote:

> I don't quite understand. Can you give us a few examples, how they are
> handled by PHP today and how they would be handled by your code.
> I'll start off:
> 0.9+0.1
>
> 0.9+0.0000001
>
> 8/10.0 + 0.2
>
> I'm sure you know of juicier examples :)
>
> Andi
>

Andi,

All the fix does is to round the answer to DBL_DIG, i.e. 15, decimal
digits.   Much of the time the answer is already rounded to 15 decimal
digits so the fix has no effect.  This is the case with all your examples.

However, if we change your last example to 8/10.0 - 0.2, the fix
does do something.

It doesn't change the division, which is still 0.8, (or rather the double
closest to 0.8), but it does change the subtraction.

Neither 0.8 nor 0.2 can be held precisely in a double, so they are
actually stored as:

0.8   =1.1001100110011001100110011001100110011001100110011010*2^-1
0.2   =1.1001100110011001100110011001100110011001100110011010*2^-3

To see what happens on the subtraction we can look at them as
0.8   =1.1001100110011001100110011001100110011001100110011010*2^-1
0.2   =0.0110011001100110011001100110011001100110011001100110*2^-1

The result is :

Answer=1.0011001100110011001100110011001100110011001100110100*2^-1

But this is not the same as 0.6 which is stored as

0.6   =1.0011001100110011001100110011001100110011001100110011*2^-1

The difference is only in the last three digits, 100 as against 011.
That's a difference of only one "bit" or rather 2^-53, but that's
quite enough for comparisons of (0.8 - 0.2) to 0.6 fail.

What happens with my fix is that the answer that comes back from
(0.8 - 0.2) is actually turned into the same number as 0.6, so that
(0.8 - 0.2) is now equivalent to 0.6 and everything works fine.

It's a pain in the neck to have to show the full binary representation
of doubles. An alternative is to show the doubles in decimal notation
to 17 digits.  17 decimal digits is the magic number which gives every
double a unique decimal representation.  Expressed as the best
17 decimal digit representation of the doubles, the previous description
becomes:


0.8   =0.80000000000000004
0.2   =0.20000000000000001

Answer=0.60000000000000009

0.6   =0.59999999999999998

You'll notice that you can't check the arithmetic in the last place any
more, but you can see there's a difference and that, to 15 decimal
places the answer is 0.6, so as long as my rounding works
properly you the answer turn into the double for 0.6, i.e. 0.59999...
That means that comparisons of (0.8 - 0.2) and  0.6 work fine
and (0.8 - 0.2) - 0.6 comes out as zero for me.

If we look at another example, this time taken from the bug database,
we can also see why the rounding is different for subtraction/addition
because of "catastrophic loss of precision on differences".

The example is 120 - 119.9:

120.0 =120.00000000000000
                       |<- 15th digit
119.9 =119.90000000000001

Answer=000.099999999999994316
                          |<- 15th digit
000.1 =000.10000000000000001

Note that all of these are shown to 17 decimal digits, and
the 15th digit is marked.  Now in this case the calculated
answer is quite different from the real one, and it isn't
the same to 15 decimal digits any more.

What's happened is that the three digits shown in the answer
which come before the decimal point, which are a
"significant" part of the answer, are not counted as decimal
digits because they are all zeros.   They should really have
been counted as proper digits when we decided where to
round and then everything would have been fine.

Compare this with 130 - 119.9:

130.0 =130.00000000000000
                      |<- 15th digit
119.9 =119.90000000000001

Answer=010.099999999999994
                       |<- 15th digit
010.1 =010.100000000000000

The answer is no more accurate than in the first case,
but this time rounding to the 15th digit is fine.  The problem
is that, by convention, leading zeros are not decimal digits,
even if they may contain useful information.   What we need
to do is to round to the digit corresponding to the same place
as the 15th digit of the original number, to ensure leading
zeros in the result are given due respect.

So, instead of using the result to tell us where to round, we
should look at the original operands, or rather whichever
is biggest in size, (furthest from zero).  To be even more precise,
for addition and subtraction we should base rounding on
whichever of the two operands and the result have the
greatest absolute value.

If we do that then we get answers  that are equal to 0.1 and 10.1
and once again everything is fine.  We haven't lost any real precision
by using less than 15 decimal digits from the answer, we've just
ignored the apparent extra precision which was simply unreliable.

This business of special treatment for addition/subtraction
is what my 'addednum' change was all about.  As it happens
I still got it wrong, because I forgot to look at both operands
and the result itself, so redecimalize should really be:

double redecimalize(double dval,double rval)
{
    double f;
    if (dval == 0 || dval > pow(2.0,52))
   {
      return dval;
   }
   if (fabs(dval)>fabs(rval))
   {
      rval = fabs(dval);
   }
   f = pow(10.0, (double) DBL_DIG - (1 + floor(log10(fabs(rval)))));
   return (double) (rint(dval*f))/f;
  }

You might notice that I've also added a check to make sure the number
we're handling isn't actually greater than 2^52.  If it is, it must be a
representation of an integer in which case there is no point changing it,
it is as good as it's going to get.  This check  also saves us having to worry
about floating point overflow during rounding if the result being
rounded is close to the maximum floating point.

That's more than enough for now, but your questions are very
helpful.  That's two bugs  in my fix that I've already spotted
when trying to answer you.

Many thanks and please ask more questions!


George


Reply via email to