Wayne Werner wrote:
On Wed, Jan 5, 2011 at 4:59 PM, Steven D'Aprano <[email protected]> wrote:

Wayne Werner wrote:

<snip>
I never said rounding errors - I said "pesky floating point errors". When
Which ARE rounding errors. They're *all* rounding errors, caused by the
same fundamental issue --  the impossibility of representing some specific
exact number in the finite number of bits, or digits, available.

Only the specific numbers change, not the existence of the errors.


So truncation == rounding. I can agree with that, though they've always
seemed distinct entities before, because you can round up or round down, but
truncation simply removes what you don't want, which is equivalent to
rounding down at whatever precision you want.

Well, technically truncation is a special case of rounding: round towards zero. When you round, you are throwing away information: the number you have might have (say) 20 digits of precision, and you only need, or want, or can take (say) 18 digits. (Or bits, for binary numbers, or whatever base you are using. There were some early Russian computers that used base three, many early Western machines used base 10, etc.) So you have to throw away two digits. How you throw them away is up to you. There are five basic types of rounding:

1 round towards positive infinity (take the ceiling);
2 round towards negative infinity (take the floor);
3 round towards zero (truncate);
4 round away from zero (like ceil for +ve numbers and floor for -ve);
5 round towards the nearest integer.

Number five is interesting, because numbers of the form N.5 are exactly half-way between two integers, and so you have to choose a strategy for breaking ties:

5a always round up (what you probably learned in school);
5b always round down;
5c round towards zero;
5d round away from zero;
5e round up if the result will be even, otherwise down;
5f round up if the result will be odd, otherwise down;
5g round up or down at random;
5h alternate between rounding up and rounding down.

5a introduces a small bias in the result: assuming the numbers you round are randomly distributed, you will tend to increase them more often than decrease them. 5b is the same, only reversed.

5c and 5d are overall symmetrical, but they introduce a bias in positive numbers, and an equal but reversed bias in negative numbers.

5e and 5f are symmetrical, as is 5g provided the random number generator is fair. Likewise for 5h. Provided the numbers you deal with are unbiased, they won't introduce any bias.

5e is also interesting. It is sometimes called "statistician's rounding", but more often "banker's rounding" even though there is no evidence that it was ever used by bankers until the advent of modern computers.
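
If you want to play with these, most of them are available as rounding modes in Python's decimal module (there is no built-in random or alternating tie-break). One thing to watch: decimal's ROUND_HALF_UP and ROUND_HALF_DOWN break ties away from zero and towards zero, so they correspond to 5d and 5c above rather than 5a and 5b. A quick sketch you can paste into the interpreter:

from decimal import (Decimal, ROUND_CEILING, ROUND_FLOOR, ROUND_DOWN,
                     ROUND_UP, ROUND_HALF_UP, ROUND_HALF_DOWN,
                     ROUND_HALF_EVEN)

modes = [ROUND_CEILING,    # 1: towards positive infinity
         ROUND_FLOOR,      # 2: towards negative infinity
         ROUND_DOWN,       # 3: towards zero (truncate)
         ROUND_UP,         # 4: away from zero
         ROUND_HALF_EVEN,  # 5e: nearest, ties go to the even digit
         ROUND_HALF_UP,    # 5d: nearest, ties go away from zero
         ROUND_HALF_DOWN]  # 5c: nearest, ties go towards zero

for x in (Decimal("2.5"), Decimal("-2.5"), Decimal("2.6")):
    for mode in modes:
        # quantize to a whole number using the chosen rounding mode
        print(x, mode, x.quantize(Decimal("1"), rounding=mode))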


The bias introduced by a poor choice of rounding can be significant. In 1982, the Vancouver Stock Exchange started a new index with an initial value of 1000.000. After 22 months it had fallen to approximately 520 points, during a period when most stock prices were increasing. It turned out that the index was calculated by always rounding down to three decimal places, thousands of times each day. The correct value of the index should have been just under 1100. The accumulated rounding error from over half a million calculations in 22 months was enough to introduce an error of nearly 580 points -- a relative error of just over 50%.
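
You can watch that kind of bias accumulate with the decimal module. This is only a rough simulation with made-up numbers, not the real Vancouver data: apply a long run of small, unbiased changes, re-rounding the index to three decimal places after every step, once rounding towards negative infinity and once with banker's rounding. Since both runs use the same random seed, the difference between the two final values is pure rounding bias.

from decimal import Decimal, ROUND_FLOOR, ROUND_HALF_EVEN
import random

def run_index(rounding, steps=100000):
    # Start at 1000.000 and apply `steps` random changes of up to
    # +/- 1 point (five decimal places), re-rounding the running
    # index to three decimal places after every change.
    random.seed(12345)
    index = Decimal("1000.000")
    for _ in range(steps):
        change = Decimal(random.randint(-99999, 99999)).scaleb(-5)
        index = (index + change).quantize(Decimal("0.001"),
                                          rounding=rounding)
    return index

print(run_index(ROUND_FLOOR))      # ends up roughly 50 points lower...
print(run_index(ROUND_HALF_EVEN))  # ...than this, purely from rounding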


Having re-read and thought about it for a while, I think my argument simply
distills down to this: using Decimal both allows you control over your
significant figures,

In Python, Decimal gives you more control over precision and rounding than binary floats do. If you're programming in a low-level language that gives you better access to the floating point routines, binary floats give you almost as much control. The only difference I'm aware of is that the Decimal module lets you choose any arbitrary number of significant digits, while low-level floats only offer a choice of a few fixed sizes. The IEEE 754 standard defines half precision (16 bits), single (32 bits), double (64 bits, which is what Python uses for floats) and quadruple (128 bits). Not all of those bits are available for precision: one bit is used for the sign and some are used for the exponent. E.g. doubles have 53 bits of precision (except for denormalised numbers, which have fewer).
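
For example, with the decimal module the working precision is just an attribute of the arithmetic context, and you can set it to whatever you like:

from decimal import Decimal, getcontext, localcontext

print(Decimal(1) / Decimal(7))    # 28 significant digits by default

with localcontext() as ctx:
    ctx.prec = 50                 # or 500, or 5 -- your choice
    print(Decimal(1) / Decimal(7))

getcontext().prec = 6             # change the default context
print(Decimal(1) / Decimal(7))    # 0.142857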


and (at least for me) *requires* you to think about
what sort of truncation/rounding you will experience, and let's be honest -
usually the source of errors is we, the programmers, not thinking enough
about precision - and the result of this thought process is usually the
elimination, not of truncation/rounding, but of not accounting for these
errors. Which, to me, equates to "eliminating those pesky floating point
errors".

You can't eliminate rounding errors unless you have effectively infinite precision, which, even at the cheap prices of RAM these days, would be quite costly :)

But what you can do is *control* how much rounding error you get. This is not as easy as it might seem though... one problem is the so-called "Table-maker's Dilemma" (table as in a table of numbers): in general, there is no way of knowing how many extra digits you need to calculate in order to correctly round a mathematical function.
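
Here's a small taste of why the number of working digits matters. This shows double rounding rather than the full Dilemma, but the principle is the same: if you don't carry enough digits into the final rounding step, the "correctly rounded" answer comes out wrong:

from decimal import Decimal, ROUND_HALF_EVEN

exact = Decimal("0.1251")

# Round directly to two decimal places: 0.1251 is more than half-way
# between 0.12 and 0.13, so the correctly rounded result is 0.13.
direct = exact.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)

# Round in two stages: first to three places (0.125), then to two.
# The intermediate rounding throws away the final 1, leaving an exact
# tie, and banker's rounding then gives 0.12 -- the wrong answer.
staged = exact.quantize(Decimal("0.001"), rounding=ROUND_HALF_EVEN)
staged = staged.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)

print(direct, staged)   # 0.13 0.12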




--
Steven
