Re: how to avoid precision problem w/o switching to double?

Joerg Bruehe Wed, 13 Jul 2005 07:38:08 -0700

Hi!

Gleb Paharenko wrote ((re-ordered)):

Jacek Becla <[EMAIL PROTECTED]> wrote:

Hi,

I'm having problems with floats while doing comparisons.

create table fff (x float);
create table ddd (x double);
insert into fff (0.1);
insert into ddd (0.1);

"select * from ddd where x = 0.1" correctly returns one row, but
"select * from fff where x = 0.1" does not return anything.

I found on many websites that "If you are comparing FLOAT or
DOUBLE columns with numbers that have decimals, you can't use '='.
This problem is common in most computer languages because
floating-point values are not exact values. In most cases,
changing the FLOAT to a DOUBLE will fix this."

Note the term "In most cases": Both FLOAT and DOUBLE are implemented asbinary numbers (with different precision), so for both there are manydecimal numbers which cannot be represented exactly.Roughly, this is equivalent to the fact that you cannot represent thefraction 1/3 _exactly_ with any finite number of decimal digits, youjust have (increasingly good) approximations 0.3, 0.33, 0.333, ...


For a detailed treatment, see this paper:
http://cch.loria.fr/documentation/IEEE754/ACM/goldberg.pdf

This solution is fine as long as involved data size is relatively small.We are planning to store ~500 millions rows per day, each containingmany floats. Switching to double is not a viable option due to spaceoverhead. I wonder if anyone has a better suggestion how to solve it.
Thanks,
Jacek
Hello.


If the space is so important to you, what do you think about 'LIKE' operator?
It seems to work corectly and it is able to use indexes. See:

[[...]]

select * from fff where x like 0.1;


Even though this may (seem to!) work, IMO it is the wrong thing to do:

1) "LIKE" is a character string operator. If it is applied to numbers,this causes conversions to strings and will be slow.


2) It suffers from the same problems of inexact conversions.

The whole situation will get worse when the values are not entered butcomputed: While in decimal arithmetic you can be sure that

   100 * 0.001 == 0.1

this cannot be guaranteed when it is calculated using binary arithmetic,because 0.001 cannot be represented exactly.



The correct way to handle such searches is to do something like

SELECT .... FROM tab WHERE col BETWEEN val - epsilon AND val + epsilon

with epsilon being some value that is small enough for the precisionyour application needs, and large enough that the resulting intervalcovers the conversion errors between decimal and binary numbers.

As an alternative, for some applications you could scale your values:For currencies, do not store (calculate) Euro (Dollar, Pound, ...) butCent (Pennies, ...) and scale by 100 on both data entry and output;for times, do not use hours (like 40 per week or 8 per day) but ratherminutes (2400 per week or 480 per day) or even seconds; etc.(Of course, this may fail when you get to values smaller than yourstored units - same problem as you originally had.)



Regards,
Jörg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Re: how to avoid precision problem w/o switching to double?

Reply via email to