One of the things that people fail to understand is that floating point
numbers are stored in *binary*.  In fact, I bet a number of people who
understand the exact binary formatting of integers don't understand that the
technique translates pretty much directly into floating point: a floating
point number is recorded like 1.01110000001010101110101....

So when they hear "floating point numbers are inaccurate because of rounding
errors", they often think "Oh, that doesn't apply to me -- I'm not doing
anything with enough decimal places to hit a rounding error."  This exact
sort of problem cropped up in a financial transaction processing system I
helped maintain.

Problem: Sometimes, members's transactions would be declined even though
they had enough to cover the transaction.

This was narrowed down to

Specific problem: Member X is unable to make a purchase for $1.40 cents
despite them having 40 cents in their account and a $1.00-off coupon.

I looked at the code in question and announced it was a rounding error due
to the use of a 'double' for storing currency values.  "How is that
possible? It's only two decimal places, I've seen these things work for half
a dozen decimal places!" I was asked.

So I demonstrated what happened:  The application's test was (total amount)
- (balance) <= (coupon).  Or something like that.  It was years ago; all I
remember is that the three numbers were $1.40, $1.00, and $0.40.  So I
translated everything into the internal binary form:

Purchase amount: $1.40 =>
1.0110011001100110011001100110011001100110011001100110 * (2**0)
Member's balance: $0.40 =>
1.1001100110011001100110011001100110011001100110011010 * (2**-2)

To add these together, they need to be adjusted to the same exponent:

Purchase amount: $1.40 =>
1.0110011001100110011001100110011001100110011001100110 * (2**0)
Member's balance: $0.40 =>
0.011001100110011001100110011001100110011001100110011010
* (2**0)

This is where things go wrong.  You see that extra '10' at the end of the
member's balance?  The floating point process doesn't have room for it, so
it rounds.  And much the same way as 0.5 rounds up to 1.0, so does binary
0.1:

Purchase amount: $1.40 =>
1.0110011001100110011001100110011001100110011001100110 * (2**0)
Member's balance: $0.40 =>
0.0110011001100110011001100110011001100110011001100111
* (2**0)

Now we subtract:


  1.0110011001100110011001100110011001100110011001100110
- 0.0110011001100110011001100110011001100110011001100111
========================================================
  0.1111111111111111111111111111111111111111111111111111

This is *practically* 1, in much the same way as
0.99999999999999999999999999999999999999 is *practically* 1.  But it's still
technically less than 1.  So when the application compared it to the coupon
amount, or whichever it was, the rounding error caused a false failure and
the transaction was declined.

Things are easier to understand if you realize that for any fraction (P/Q),
if Q is not exactly a power of 2, then the answer cannot be exactly
represented in binary.  In contrast, for our decimal system, any fraction
(P/Q) cannot be represented exactly unless Q can be expressed as (some power
of 2)*(some power of 5).

For your edification, I wrote a Perl script to tell how many Qs offer exact
representations in bases 2, 10, and 60.  These are the results:

Bin: 25 of 16777216 (0.00015%)
Decimal: 143 of 16777216 (0.00085%)
Base60: 836 of 16777216 (0.00498%)

This roughly indicates that if you have a number that can be expressed
exactly in decimal, there's only about a 1-in-6 chance that it's *also*
expressible exactly in binary without running into rounding errors. I also
threw in base 60 for comparison -- an arbitrary number is nearly 6 times as
likely to be expressible exactly using base-60 than it is in base-10.  GPS
coordinates are expressed using base-60 (degrees, minutes, seconds).

== script  ==
#!/usr/bin/perl
my ($b,$d,$b60) = (0,0,0);
my $max = 16_777_216;
for (1..$max) {
  my $q = $_;
  while ( ($q % 2) == 0 ) { $q /= 2; }
  if ($q == 1) { $b++; }
  while ( ($q % 5) == 0) { $q /= 5; }
  if ($q == 1) { $d++; }
  while ( ($q % 3) == 0) { $q /= 3; }
  if ($q == 1) { $b60++; }
}
printf "%s: %d of $max (%.5f%%)\n", @$_, 100*$_->[1]/$max
  for [Bin=>$b],[Decimal=>$d],[Base60=>$b60];
#!/usr/bin/perl
my ($b,$d,$b60) = (0,0,0);
my $max = 16_777_216;
for (1..$max) {
  my $q = $_;
  while ( ($q % 2) == 0 ) { $q /= 2; }
  if ($q == 1) { $b++; }
  while ( ($q % 5) == 0) { $q /= 5; }
  if ($q == 1) { $d++; }
  while ( ($q % 3) == 0) { $q /= 3; }
  if ($q == 1) { $b60++; }
}
printf "%s: %d of $max (%.5f%%)\n", @$_, 100*$_->[1]/$max
  for [Bin=>$b],[Decimal=>$d],[Base60=>$b60];
== snip ==

--
-- Stevie-O
Real programmers use COPY CON PROGRAM.EXE
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to