Hi Jon and James!

On Wed, Mar 23, 2011 at 12:45 PM, JonY <[email protected]> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 3/23/2011 22:06, James K Beard wrote:
>> Jon: The simplest and quite possibly the most efficient way to implement a
>> standard function library in BCD decimal arithmetic is to convert to IEEE
>> standard double precision (or, if necessary, quad precision), use the
>> existing libraries, and convert back to BCD decimal floating point format.
>> The binary floating point will have more accuracy, thus providing a few
>> guard bits for the process, and hardware arithmetic (even quad precision is
>> supported by hardware because the preserved carry fields make quad precision
>> simple to support and allow good efficiency) is hard to match with software
>> floating point, which is what any BCD decimal arithmetic would be.
>>
>> James K Beard
>
> Hi,
>
> Thanks for the reply.
>
> To my understanding, converting DFP to BCD then IEEE float and back
> again seems to defeat the purpose using decimal floating points where
> exact representation is needed, I'm not too clear about this part. Will
> calculations suffer from inexact representation?
I believe that this is a fully legitimate concern.

To be explicit: because decimal exponents scale numbers by powers of five as well as powers of two (10 = 2 * 5), while binary exponents scale only by powers of two, decimal floating-point numbers can represent exactly some real numbers that binary floating-point numbers cannot. By way of example, 1/2 can be represented exactly in both decimal and binary floating-point, 1/5 can be represented exactly only in decimal floating-point (and 1/3 can be represented exactly in neither). Because of this, blindly converting from decimal to binary, carrying out the computation, and converting back to decimal can fail to produce the same result as carrying out the "correct" decimal computation.

Having said that, if you wish to perform fixed-precision (as distinct from fixed-point) decimal arithmetic, and your binary floating-point hardware has enough extra precision (I'm not sure exactly how much is needed, but I would think that one extra decimal digit of precision would be more than enough), then I believe that (neglecting underflow, overflow, denormalization, and so on) James's scheme can be made to work, although I don't think I would call it simple. (I use the phrase "fixed-precision" in contrast to arbitrary-precision. By a fixed-precision decimal floating-point number, I mean a mantissa with a fixed number of decimal digits -- say ten -- and a decimal exponent.)

In its simplest form, the basic idea is, for each decimal floating-point arithmetic operation, to convert the operands to binary floating-point, perform the operation in binary, and convert back to decimal floating-point by rounding to the nearest decimal floating-point value. This, however, isn't cheap: all of this converting and rounding is costly, and it forfeits much of the benefit of a hardware floating-point pipeline and register set.
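To make the scheme concrete, here is a minimal sketch in Python, using the standard `decimal` module with a 10-digit context as a stand-in for a fixed-precision DFP format (the 10-digit choice and the `dfp_add` helper are illustrative assumptions of mine, not anything mingw-w64 would actually ship):

```python
from decimal import Decimal, getcontext

# A 10-digit decimal context stands in for a fixed-precision DFP format
# (an assumption for illustration, not a mingw-w64 design decision).
getcontext().prec = 10

# Representability: 1/2 is exact in both bases, 1/5 only in decimal.
# Decimal(float) converts the binary double's exact value.
assert Decimal(0.5) == Decimal("0.5")   # the double 0.5 is exactly 1/2
assert Decimal(0.2) != Decimal("0.2")   # the double 0.2 only approximates 1/5

def dfp_add(a, b):
    """One step of the per-operation scheme: convert both operands to
    binary double, add in hardware, then round the binary result back
    to the nearest 10-digit decimal value."""
    binary = float(a) + float(b)
    # repr() yields the shortest decimal string that round-trips the
    # double; the unary + rounds it to the context's 10-digit precision.
    return +Decimal(repr(binary))

# Double precision carries roughly 16 decimal digits, leaving plenty of
# headroom above 10, so rounding back recovers the exact decimal sum.
assert dfp_add(Decimal("0.1"), Decimal("0.2")) == Decimal("0.3")
```

In binary, 0.1 + 0.2 yields 0.30000000000000004; it is the final rounding back to 10 decimal digits that recovers the "correct" decimal answer, which is exactly the step the per-operation scheme cannot skip.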
Note that if you don't convert back to decimal floating-point after every operation (or implement some other additional logic), you won't be guaranteed to get "correct" decimal floating-point results. For example, (1/5 + 2/5) - 3/5 is exactly zero in real (non-computer) arithmetic. It should also be exactly zero in correctly implemented decimal floating-point arithmetic, because all of the input values, intermediate results, and the final result are exactly representable as decimal floating-point numbers. However, if you calculate this with double-precision binary floating-point operations (without rounding the intermediate results back to decimal floating-point and reconverting them to binary), you will get a non-zero result on the order of 10^-16 (the approximate precision of double precision). Note that rounding this result back to decimal floating-point still leaves you with a non-zero result: the result of the binary computation is a perfectly good value that is well approximated by a decimal floating-point number with, say, ten decimal digits of precision.

Of course, it all depends on what you actually need. If you don't need the specific results that correct decimal floating-point arithmetic gives you, then converting to binary, computing, and converting back will generally give you a very good result. But then, if you don't need the specific decimal results, why not just use binary from the beginning?

Good luck.

K. Frank
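The (1/5 + 2/5) - 3/5 example is easy to reproduce. A quick illustration, again using Python's `decimal` module as a stand-in for a correct decimal floating-point implementation (an assumption of this sketch):

```python
from decimal import Decimal

# (1/5 + 2/5) - 3/5 in binary double precision: none of the operands is
# exactly representable in base 2, so the result is not exactly zero.
binary_result = (0.2 + 0.4) - 0.6
assert binary_result != 0.0
assert abs(binary_result) < 1e-15   # on the order of 10^-16

# The same computation in decimal arithmetic: every operand and every
# intermediate result is exactly representable, so the result is exactly 0.
decimal_result = (Decimal("0.2") + Decimal("0.4")) - Decimal("0.6")
assert decimal_result == 0
```

And, as noted above, rounding `binary_result` to ten decimal digits at the end does not rescue it: 1.110223e-16 is itself a perfectly representable decimal value, so only per-operation round-back (or equivalent logic) yields the exact zero.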
_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
