Re: [PATCHES] [BUGS] BUG #2846: inconsistent and confusing handling of underflows,

2006-12-29 Thread Roman Kononov

On 12/27/2006 12:44 PM, Bruce Momjian wrote:

The only unsolved issue is the one with underflow checks.  I have added
comments explaining the problem in case someone ever figures out how to
address it.


This will behave better for float4:

   Datum float4pl(PG_FUNCTION_ARGS)
   {
---float4  arg1 = PG_GETARG_FLOAT4(0);
---float4  arg2 = PG_GETARG_FLOAT4(1);
+++double  arg1 = PG_GETARG_FLOAT4(0);
+++double  arg2 = PG_GETARG_FLOAT4(1);
   double  result;

   result = arg1 + arg2;
   CheckFloat4Val(result,isinf(arg1) || isinf(arg2));
   PG_RETURN_FLOAT4((float4) result);
}

Roman

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [BUGS] BUG #2846: inconsistent and confusing handling

2006-12-27 Thread Roman Kononov

On 12/27/2006 05:19 PM, Tom Lane wrote:

Roman Kononov <[EMAIL PROTECTED]> writes:

On 12/27/2006 03:23 PM, Bruce Momjian wrote:

Are you sure?  As I remember, computation automatically upgrades to
'double'.  See this program and output:



This is platform- and compiler- dependent:


... and probably irrelevant, too.  We should store the result into a
float4 variable and then test for isinf() on that; that eliminates the
question of whether the compiler did the multiply in a wider format or
not.


You are right provided that you want to ignore underflows and silently
produce zeros instead.

If you go this way, I recommend to ignore overflows as well, and silently
produce infinities and NaNs.

Roman

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PATCHES] [BUGS] BUG #2846: inconsistent and confusing handling of underflows,

2006-12-27 Thread Tom Lane
Roman Kononov <[EMAIL PROTECTED]> writes:
> In float4mul() and float4div(), the computation should be double precision.

Why?  It's going to have to fit in a float4 eventually anyway.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [BUGS] BUG #2846: inconsistent and confusing handling of underflows,

2006-12-27 Thread Tom Lane
Roman Kononov <[EMAIL PROTECTED]> writes:
> On 12/27/2006 03:23 PM, Bruce Momjian wrote:
>> Are you sure?  As I remember, computation automatically upgrades to
>> 'double'.  See this program and output:

> This is platform- and compiler- dependent:

... and probably irrelevant, too.  We should store the result into a
float4 variable and then test for isinf() on that; that eliminates the
question of whether the compiler did the multiply in a wider format or
not.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [BUGS] BUG #2846: inconsistent and confusing handling of underflows,

2006-12-27 Thread Roman Kononov

On 12/27/2006 04:04 PM, Bruce Momjian wrote:

Interesting.  I didn't know that, but in the float4pl() function,
because the overflow tests and result is float4, what value is there to
doing things as double --- as soon as the float4 maximum is exceeded, we
throw an error?



This is useful for underflows.

float a=1e-30;
float b=1e-30;
double r1=a*b;
double r2=(double)a*b;

r1 is zero and underflow is lost.
r2 is not zero and underflow is detected.

In float4mul() and float4div(), the computation should be double precision.

In float4pl() and float4mi(), it depends on the ability of the hardware
to generate denormalized numbers. If denormalized numbers are generated,
float vs double makes no difference. If denormalized numbers are not
generated (zero is generated), then double computation is safer.

Another way to detect underflows, overflows and other junk is to use FPU
status flags after each computation. Performance will likely suffer.

#include 

Datum
float4mul(PG_FUNCTION_ARGS)
{
float4   arg1 = PG_GETARG_FLOAT4(0);
float4   arg2 = PG_GETARG_FLOAT4(1);
float4   result;
int fe_exceptions;
feclearexcept(FE_ALL_EXCEPT);
result = arg1 * arg2;

fe_exceptions=fetestexcept(FE_DIVBYZERO,FE_INVALID,FE_OVERFLOW,FE_UNDERFLOW);
if (fe_exceptions) handle_exceptions(fe_exceptions); //??
PG_RETURN_FLOAT4(result);
}

Yet another way to detect exceptions is to remove all CheckFloat4Val(),
CheckFloat8Val(), isnan(), unmask FPU exceptions and install an exception 
handler.
Might have portability difficulties. Comparisons of NaNs must be done in
non-IEEE way (FPU does not compare NaNs, it generates exceptions).

Roman


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [BUGS] BUG #2846: inconsistent and confusing handling of

2006-12-27 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> I think what we should probably consider is removing CheckFloat4Val
>> and CheckFloat8Val altogether, and just letting the float arithmetic
>> have its head.  Most modern hardware gets float arithmetic right per
>> spec, and we shouldn't be second-guessing it.

> Well, I am on an Xeon and can confirm that our computations of large
> non-infinite doubles who's result greatly exceed the max double are
> indeed returning infinity, as the poster reported, so something isn't
> working, if it supposed to.  What do people get for this computation?

Infinity is what you are *supposed* to get, per IEEE spec.

>> A slightly less radical proposal is to reject only the case where
>> isinf(result) and neither input isinf(); and perhaps likewise with
>> respect to NaNs.

> Uh, that's what the patch does for 'Inf':

> result = arg1 + arg2;
> CheckFloat4Val(result, isinf(arg1) || isinf(arg2));

No, because you are still comparing against FLOAT4_MAX.  I'm suggesting
that only an actual infinity should be rejected.  Even that is contrary
to IEEE spec, though.

The other problem with this coding technique is that it must invoke
isinf three times when the typical case really only requires one (if the
output isn't inf there is no need to perform isinf on the inputs).
If we're going to check for overflow at all, I think we should lose the
subroutine and just do

if (isinf(result) &&
!(isinf(arg1) || isinf(arg2)))
ereport(...OVERFLOW...);

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [BUGS] BUG #2846: inconsistent and confusing handling of

2006-12-27 Thread Bruce Momjian
Roman Kononov wrote:
> On 12/27/2006 03:23 PM, Bruce Momjian wrote:
> > Are you sure?  As I remember, computation automatically upgrades to
> > 'double'.  See this program and output:
> 
> This is platform- and compiler- dependent:
> 
> ~>uname -a
> Linux rklinux 2.6.15-27-amd64-generic #1 SMP PREEMPT Fri Dec 8 17:50:54 UTC 
> 2006 x86_64 GNU/Linux
> ~>gcc --version
> gcc (GCC) 4.3.0 20061213 (experimental)
> Copyright (C) 2006 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> 
> ~>cat test.c
> #include 
> #include 
> 
> int
> main(int argc, char *argv[])
> {
> float a = 1e30, b = 1e30;
> double c;
> 
> c = a * b;
> 
> printf("%e\n", c);
> return 0;
> }
> ~>gcc test.c
> ~>./a.out
> inf
> ~>gcc -march=i386 -m32 test.c
> ~>./a.out
> 1.00e+60

Interesting.  I didn't know that, but in the float4pl() function,
because the overflow tests and result is float4, what value is there to
doing things as double --- as soon as the float4 maximum is exceeded, we
throw an error?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] [BUGS] BUG #2846: inconsistent and confusing handling of underflows,

2006-12-27 Thread Roman Kononov

On 12/27/2006 03:23 PM, Bruce Momjian wrote:

Are you sure?  As I remember, computation automatically upgrades to
'double'.  See this program and output:


This is platform- and compiler- dependent:

~>uname -a
Linux rklinux 2.6.15-27-amd64-generic #1 SMP PREEMPT Fri Dec 8 17:50:54 UTC 
2006 x86_64 GNU/Linux
~>gcc --version
gcc (GCC) 4.3.0 20061213 (experimental)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

~>cat test.c
#include 
#include 

int
main(int argc, char *argv[])
{
   float a = 1e30, b = 1e30;
   double c;

   c = a * b;

   printf("%e\n", c);
   return 0;
}
~>gcc test.c
~>./a.out
inf
~>gcc -march=i386 -m32 test.c
~>./a.out
1.00e+60

Roman

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PATCHES] [BUGS] BUG #2846: inconsistent and confusing handling of

2006-12-27 Thread Bruce Momjian
Roman Kononov wrote:
> On 12/27/2006 12:44 PM, Bruce Momjian wrote:
> > The only unsolved issue is the one with underflow checks.  I have added
> > comments explaining the problem in case someone ever figures out how to
> > address it.
> 
> This will behave better for float4:
> 
> Datum float4pl(PG_FUNCTION_ARGS)
> {
> ---float4  arg1 = PG_GETARG_FLOAT4(0);
> ---float4  arg2 = PG_GETARG_FLOAT4(1);
> +++double  arg1 = PG_GETARG_FLOAT4(0);
> +++double  arg2 = PG_GETARG_FLOAT4(1);
> double  result;
> 
> result = arg1 + arg2;
> CheckFloat4Val(result,isinf(arg1) || isinf(arg2));
> PG_RETURN_FLOAT4((float4) result);
> }

Are you sure?  As I remember, computation automatically upgrades to
'double'.  See this program and output:

$ cat tst1.c
#include 
#include 

int
main(int argc, char *argv[])
{
float a = 1e30, b = 1e30;
double c;

c = a * b;

printf("%e\n", c);
return 0;
}

$  tst1
1.00e+60

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [BUGS] BUG #2846: inconsistent and confusing handling of

2006-12-27 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> I have made some more progress on this patch.

I'm not convinced that you're fixing things so much as doing your best
to destroy IEEE-compliant float arithmetic behavior.

I think what we should probably consider is removing CheckFloat4Val
and CheckFloat8Val altogether, and just letting the float arithmetic
have its head.  Most modern hardware gets float arithmetic right per
spec, and we shouldn't be second-guessing it.

A slightly less radical proposal is to reject only the case where
isinf(result) and neither input isinf(); and perhaps likewise with
respect to NaNs.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [BUGS] BUG #2846: inconsistent and confusing handling of

2006-12-27 Thread Bruce Momjian

I have made some more progress on this patch.  I have fixed the issue
with aggregates:

test=> select avg(ff) from tt;
ERROR:  type "double precision" value out of range: overflow

and tested the performance overhead of the new CheckFloat8Val() calls
the fix requires using:

EXPLAIN ANALYZE SELECT AVG(x.COL) 
FROM (SELECT 323.2 AS COL FROM generate_series(1,100)) AS x;

and could not measure any overhead.

I also found a few more bugs, this one with float4 negation:

test=> SELECT -('0'::float4);
 ?column?
--
   -0
(1 row)

test=> SELECT -('0'::float8);
 ?column?
--
0
(1 row)

and this one with casting 'Nan' to an integer:

test=> SELECT 'Nan'::float8::int4;
int4
-
 -2147483648
(1 row)

I have fixed these as well:

test=> SELECT -('0'::float4);
 ?column?
--
0
(1 row)

test=>  SELECT 'Nan'::float8::int4;
ERROR:  integer out of range

The only unsolved issue is the one with underflow checks.  I have added
comments explaining the problem in case someone ever figures out how to
address it.

If I don't receive further comments, I will apply the new attached patch
shortly.

---

bruce wrote:
> Roman Kononov wrote:
> > 
> > The following bug has been logged online:
> > 
> > Bug reference:  2846
> > Logged by:  Roman Kononov
> > Email address:  [EMAIL PROTECTED]
> > PostgreSQL version: 8.2.0 and older
> > Operating system:   linux 2.6.15-27-amd64 ubuntu
> > Description:inconsistent and confusing handling of underflows, NaNs
> > and INFs
> > Details: 
> 
> This is a very interesting bug report.  It seems you have done some good
> analysis of PostgreSQL and how it handles certain corner cases,
> infinity, and NaN.
> 
> I have researched your findings and will show some fixes below:
> 
> > Please compare the results of the simple queries.
> > ==
> > test=# select ('NaN'::float4)::int2;
> >  int2
> > --
> > 0
> > (1 row)
> 
> There certainly should be an isnan() test when converting to int2
> because while float can represent NaN, int2 cannot.  The fix shows:
> 
>   test=> select ('NaN'::float4)::int2;
>   ERROR:  smallint out of range
> 
> > test=# select ('NaN'::float4)::int4;
> > int4
> > -
> >  -2147483648
> > (1 row)
> 
> Same for int4:
> 
>   test=> select ('NaN'::float4)::int4;
>   ERROR:  integer out of range
> 
> > test=# select ('NaN'::float4)::int8;
> > ERROR:  bigint out of range
> 
> This one was correct because it uses rint() internally.
> 
> > test=# select ('nan'::numeric)::int4;
> > ERROR:  cannot convert NaN to integer
> > ==
> > test=# select abs('INF'::float4);
> >abs
> > --
> >  Infinity
> > (1 row)
> 
> Correct.
> 
> > test=# select abs('INF'::float8);
> > ERROR:  type "double precision" value out of range: overflow
> 
> This one was more complicated.  float4/8 operations test for
> results > FLOAT[84]_MAX.  This is because if you do this:
> 
>   test=> select (1e201::float8)*(1e200::float8);
> 
> the result internally is Infinity, so they check for Inf as a check for
> overflow.  The bottom line is that while the current code allows
> infinity to be entered, it does not allow the value to operate in many
> context because it is assumes Inf to be an overflow indicator.  I have
> fixed this by passing a boolean to indicate if any of the operands were
> infinity, and if so, allow an infinite result, so this now works:
> 
>   test=> select abs('INF'::float8);
>  abs
>   --
>Infinity
>   (1 row)
> 
> > ==
> > test=# select -('INF'::float4);
> >  ?column?
> > ---
> >  -Infinity
> > (1 row)
> > 
> > test=# select -('INF'::float8);
> > ERROR:  type "double precision" value out of range: overflow
> 
> And this now works too:
> 
>   test=> select -('INF'::float8);
>?column?
>   ---
>-Infinity
>   (1 row)
> 
> > ==
> > test=# select (1e-37::float4)*(1e-22::float4);
> >  ?column?
> > --
> > 0
> > (1 row)
> 
> This one is quite complex.  For overflow, there is a range of values
> that is represented as > FLOAT8_MAX, but for values very large, the
> result becomes Inf.  The old code assumed an Inf result was an overflow,
> and threw an error, as I outlined above.  The new code does a better
> job.
> 
> Now, for underflow.  For underflow, we again have a range slightly
> smaller than DBL_MIN where we can detect an underflow, and throw an
> error, but just like overflow, if the underflow is too small, the result

Re: [PATCHES] [BUGS] BUG #2846: inconsistent and confusing handling of

2006-12-23 Thread Bruce Momjian
Roman Kononov wrote:
> 
> The following bug has been logged online:
> 
> Bug reference:  2846
> Logged by:  Roman Kononov
> Email address:  [EMAIL PROTECTED]
> PostgreSQL version: 8.2.0 and older
> Operating system:   linux 2.6.15-27-amd64 ubuntu
> Description:inconsistent and confusing handling of underflows, NaNs
> and INFs
> Details: 

This is a very interesting bug report.  It seems you have done some good
analysis of PostgreSQL and how it handles certain corner cases,
infinity, and NaN.

I have researched your findings and will show some fixes below:

> Please compare the results of the simple queries.
> ==
> test=# select ('NaN'::float4)::int2;
>  int2
> --
> 0
> (1 row)

There certainly should be an isnan() test when converting to int2
because while float can represent NaN, int2 cannot.  The fix shows:

test=> select ('NaN'::float4)::int2;
ERROR:  smallint out of range

> test=# select ('NaN'::float4)::int4;
> int4
> -
>  -2147483648
> (1 row)

Same for int4:

test=> select ('NaN'::float4)::int4;
ERROR:  integer out of range

> test=# select ('NaN'::float4)::int8;
> ERROR:  bigint out of range

This one was correct because it uses rint() internally.

> test=# select ('nan'::numeric)::int4;
> ERROR:  cannot convert NaN to integer
> ==
> test=# select abs('INF'::float4);
>abs
> --
>  Infinity
> (1 row)

Correct.

> test=# select abs('INF'::float8);
> ERROR:  type "double precision" value out of range: overflow

This one was more complicated.  float4/8 operations test for
results > FLOAT[84]_MAX.  This is because if you do this:

test=> select (1e201::float8)*(1e200::float8);

the result internally is Infinity, so they check for Inf as a check for
overflow.  The bottom line is that while the current code allows
infinity to be entered, it does not allow the value to operate in many
context because it is assumes Inf to be an overflow indicator.  I have
fixed this by passing a boolean to indicate if any of the operands were
infinity, and if so, allow an infinite result, so this now works:

test=> select abs('INF'::float8);
   abs
--
 Infinity
(1 row)

> ==
> test=# select -('INF'::float4);
>  ?column?
> ---
>  -Infinity
> (1 row)
> 
> test=# select -('INF'::float8);
> ERROR:  type "double precision" value out of range: overflow

And this now works too:

test=> select -('INF'::float8);
 ?column?
---
 -Infinity
(1 row)

> ==
> test=# select (1e-37::float4)*(1e-22::float4);
>  ?column?
> --
> 0
> (1 row)

This one is quite complex.  For overflow, there is a range of values
that is represented as > FLOAT8_MAX, but for values very large, the
result becomes Inf.  The old code assumed an Inf result was an overflow,
and threw an error, as I outlined above.  The new code does a better
job.

Now, for underflow.  For underflow, we again have a range slightly
smaller than DBL_MIN where we can detect an underflow, and throw an
error, but just like overflow, if the underflow is too small, the result
becomes zero.  The bad news is that unlike Inf, zero isn't a special
value.  With Inf, we could say if we got an infinite result from
non-infinite arguments, we had an overflow, but for underflow, how do we
know if zero is an underflow or just the correct result?  For
multiplication, we could say that a zero result for non-zero arguments
is almost certainly an underflow, but I don't see how we can handle the
other operations as simply.

I was not able to fix the underflow problems you reported.

> test=# select (1e-37::float4)*(1e-2::float4);
> ERROR:  type "real" value out of range: underflow
> ==
> test=# select (1e-300::float8)*(1e-30::float8);
>  ?column?
> --
> 0
> (1 row)
> 
> test=# select (1e-300::float8)*(1e-20::float8);
> ERROR:  type "double precision" value out of range: underflow
> ==
> test=# select ('INF'::float8-'INF'::float8);
>  ?column?
> --
>   NaN
> (1 row)
> 
> test=# select ('INF'::float8+'INF'::float8);
> ERROR:  type "double precision" value out of range: overflow

This works fine now:

test=> select ('INF'::float8+'INF'::float8);
 ?column?
--
 Infinity
(1 row)

> ==
> test=# select ('INF'::float4)::float8;
>   float8
> --
>  Infinity
> (1 row)
> 
> test=# select ('INF'::float8)::float4;
> ERROR:  type "real" value out of range: overflow
> ==
> test=# select cbrt('INF'::float4);
>cbrt
> --
>  Infinity
> (1 row)
> 
> test=# select sqrt('INF'::fl