Hi!

Gleb Paharenko wrote ((re-ordered)):
Jacek Becla <[EMAIL PROTECTED]> wrote:

Hi,

I'm having problems with floats while doing comparisons.

create table fff (x float);
create table ddd (x double);
insert into fff (0.1);
insert into ddd (0.1);

"select * from ddd where x = 0.1" correctly returns one row, but
"select * from fff where x = 0.1" does not return anything.

I found on many websites that "If you are comparing FLOAT or
DOUBLE columns with numbers that have decimals, you can't use '='.
This problem is common in most computer languages because
floating-point values are not exact values. In most cases,
changing the FLOAT to a DOUBLE will fix this."

Note the term "In most cases": Both FLOAT and DOUBLE are implemented as binary numbers (with different precision), so for both there are many decimal numbers which cannot be represented exactly. Roughly, this is equivalent to the fact that you cannot represent the fraction 1/3 _exactly_ with any finite number of decimal digits, you just have (increasingly good) approximations 0.3, 0.33, 0.333, ...

For a detailed treatment, see this paper:
http://cch.loria.fr/documentation/IEEE754/ACM/goldberg.pdf


This solution is fine as long as involved data size is relatively small. We are planning to store ~500 millions rows per day, each containing many floats. Switching to double is not a viable option due to space overhead. I wonder if anyone has a better suggestion how to solve it.

Thanks,
Jacek
Hello.


If the space is so important to you, what do you think about 'LIKE' operator?
It seems to work corectly and it is able to use indexes. See:

[[...]]

select * from fff where x like 0.1;

Even though this may (seem to!) work, IMO it is the wrong thing to do:

1) "LIKE" is a character string operator. If it is applied to numbers, this causes conversions to strings and will be slow.

2) It suffers from the same problems of inexact conversions.

The whole situation will get worse when the values are not entered but computed: While in decimal arithmetic you can be sure that
   100 * 0.001 == 0.1
this cannot be guaranteed when it is calculated using binary arithmetic, because 0.001 cannot be represented exactly.


The correct way to handle such searches is to do something like

SELECT .... FROM tab WHERE col BETWEEN val - epsilon AND val + epsilon

with epsilon being some value that is small enough for the precision your application needs, and large enough that the resulting interval covers the conversion errors between decimal and binary numbers.


As an alternative, for some applications you could scale your values: For currencies, do not store (calculate) Euro (Dollar, Pound, ...) but Cent (Pennies, ...) and scale by 100 on both data entry and output; for times, do not use hours (like 40 per week or 8 per day) but rather minutes (2400 per week or 480 per day) or even seconds; etc. (Of course, this may fail when you get to values smaller than your stored units - same problem as you originally had.)


Regards,
Jörg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to