Florian Pflug <f...@phlo.org> wrote:
> Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Florian Pflug <f...@phlo.org> writes:
>>> For float 4 and float8, wasn't the consensus that the potential
>>> lossy-ness of addition makes this impossible anyway, even
>>> without the NaN issue? But...
>> Well, that was my opinion, I'm not sure if it was consensus ;-).
> I'd say your example showing how it could produce completely
> bogus results was pretty convincing...

Aggregates on approximate (floating-point) numbers are not nearly
as consistent as many people probably assume.  Picture for a minute
a table where a column contains positive floating point numbers
happen to be located in the heap in increasing order, perhaps as
the result of a CLUSTER on an index on the column.  SELECT
sum(colname) FROM tablename; would tend to give the most accurate
answer possible when a sequence scan was run -- unless there
happened to be a seqscan already half-way through the heap.  Then
the result would be different.  So the exact same query against the
exact same data, with no intervening modifications or maintenance
activity could give one answer most of the time, and give various
other answers depending on concurrent SELECT queries.

Given that this is already the case with aggregates on floating
point approximate numbers, why should we rule out an optimization
which only makes rounding errors more likely to be visible?  The
real issue here is that if you are using an approximate data type
and expecting exact answers, you will have problems.

That's not to say that approximations are useless.  If you
represent the circumference of the earth with a double precision
number you're dealing with an expected rounding error of about a
foot.  That's close enough for many purposes.  The mistake is
assuming that it will be exact or that rounding errors cannot
accumulate.  In situations where SQL does not promise particular
ordering of operations, it should not be assumed; so any
expectations of a specific or repeatable result from a sum or
average of approximate numbers is misplaced.

>> But NaN is an orthogonal problem I think.


Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to