Requirements for an integer-based value representation

Bill Gribble Fri, 23 Jun 2000 10:54:49 -0700
PREAMBLE 
--------

As of version 1.4.0, gnucash uses double-precision IEEE floats to
represent all monetary quantities.  We see several related reasons
that doubles are inadequate for representation of monetary quantities:

  - Range.  IEEE 64-bit doubles have a 53-bit significand (52 bits
    stored explicitly), giving about 16 decimal digits of precision.
    With some very small-valued currencies, this leaves only one or
    two orders of magnitude of "buffer" between practical
    recordkeeping and the point at which the representation can no
    longer reliably count by one unit of currency.

  - Floating-point error accumulation.  Financial institutions conduct
    monetary transactions using integer quantities of some
    smallest-currency-unit (for example, pennies/USD$0.01 in dollars).
    However, most such quantities cannot be represented exactly in
    IEEE double format; for example, 0.01 has an infinitely repeating
    binary expansion, so the stored value is only an approximation.

    While the error is very small, it can rather quickly accumulate to
    the point of error in the smallest-currency-unit; for example, the
    result of adding "1000.03" to itself one million times, which
    should result in the integer value "1,000,030,000", actually
    results in 1000029999.9794, an error of at least 2
    smallest-currency-units (pennies) [1].  

These problems are not immediately catastrophic, but they are
sufficiently serious that we should address them now.  

This document is an attempt to pull together the various threads of
discussion on the list about integer-based representations for
monetary quantities and make a first pass at the requirements of such
a representation.

REQUIREMENTS
------------

  1. Maximum magnitude.  We should be able to represent very large
     monetary sums as exact integer quantities of
     smallest-currency-units.  A rule of thumb is that we should be
     able to represent national budgets to the SCU
     (smallest-currency-unit) with several orders of magnitude to
     spare, even in very small-valued currencies.

  2. Genericity.  We must be able to handle fixed-point operations
     including addition, subtraction, multiplication, and division for
     numbers in different units.  SCU values must be allowed to be
     non-decimal fractions of the currency (such as share prices in
     64ths of a dollar, or old-style English currency units with 20
     shillings per pound sterling and 12 pence per shilling).
 
     We must be flexible enough to allow a fixed-point "number of
     shares" (perhaps represented as an integer number of
     thousandths-of-a-share) to be multiplied by a fixed-point "price
     per share" (perhaps represented as an integer number of
     64ths-of-a-dollar) to get a fixed-point "total transaction value"
     (perhaps represented as an integer number of pennies).

  3. Rounding/truncation control.  When performing multiplication and
     division between fixed-point quantities, rounding/truncation may
     be performed at various stages of the computation, depending on
     the desired fixed-point format of the output.  There must be a
     mechanism to control when and how rounding/truncation is done, if
     there are several choices.

  4. Intermediate results.  If an overall computation does not
     overflow the bounds of the representation, the computation must
     be performed correctly and without an error flag even if
     intermediate results might overflow.  Where overflow is possible
     in intermediate results, an extended-range format must be used
     temporarily to store intermediate results.
     
  5. Input/output conversion.  There must be an interface API which
     will allow for conversion to and from common C data types,
     including floating-point and integer data types, with a mechanism
     for signaling overflow/underflow and loss-of-information in the
     conversion.
    
  6. Compatibility with existing codebase.  The implementation of 
     a new numeric format must be compatible with both the C and
     Scheme portions of gnucash.

  7. Abstraction.  Since we have made at least one fundamental mistake
     in specifying the original representation of monetary values in
     gnucash, we have to assume that we may make others.  We are going
     to have to do significant surgery on gnucash to remove instances
     of double math using built-in operators; by all means, let's
     replace *all* such math with a C-function-based API that can have
     its implementation repaired at a later time, even if some simple
     operations (such as addition) can be implemented directly as C
     operations on "long long" (if that happens to end up as the
     underlying representation).

  8. Fixed storage size.  Considering that the gnucash engine may at
     some time have a database backend, it's important to have a
     numeric format that has a fixed storage size and (optimally) can
     be directly operated on by SQL operators.

-------------------------------------------------------------
[1] Sample program.

#include <stdio.h>

int
main(void) {
  double accumulator = 0.0;
  double unit = 1000.03;   /* not exactly representable as a double */
  int    count;

  /* Add the unit one million times; the rounding error accumulates. */
  for(count = 0; count < 1000000; count++) {
    accumulator += unit;
  }
  printf("total = %.4f\n", accumulator);
  return 0;
}

--
Gnucash Developer's List