Re: [HACKERS] Slaying the HYPOTamus

2009-08-25 Thread Paul Matthews
Greg Stark wrote:
 We're implementing things like box_distance and point_distance which
 as it happens will already be doing earlier arithmetic on the doubles,
 so whatever HYPOT() does had better be consistent with that or the
 results will be just an inexplicable mishmash.

   
Let's look at what the HYPOT macro in PostgreSQL does right now:

  #define HYPOT(A, B) sqrt((A)*(A)+(B)*(B))

And let's see how it is used in point_distance:

  Datum
  point_distance(PG_FUNCTION_ARGS)
  {
Point  *pt1 = PG_GETARG_POINT_P(0);
Point  *pt2 = PG_GETARG_POINT_P(1);

PG_RETURN_FLOAT8(HYPOT(pt1-x - pt2-x, pt1-y - pt2-y));
  }

Oh. Surprise. It does not look out for NaN's, InF's, overflows,
underflows or anything else. In addition, for the majority of it's input
space, it fails to work correctly at all. If x and y are both 1E200 the
hypotenuse should be 1.4142135e200, well within the range of the double.
However by naively squaring x yields 1E400, outside the range of a
double. Currently HYPOT() returns INFINITY, which is not the correct answer.

I am trying to introduce the hypot() function (where required) and
associated patch, in order to start correcting some of the more obvious
defects with the current geometry handling. Maybe my proposed hypot()
function is not perfect, but it's so far in front of current the HYPOT
macro is not funny.

 Incidentally, what on earth does it mean to multiply or divide a
 circle by a point?

   
Basically it's complex math. This comment, from my new geometry
implementation, might help.

/*-
 * Additional Operators
 *
 * The +, -, * and / operators treat IPoint's as complex numbers.
 * (This would indicate a requirement for a Complex type?)
 *
 *   (a+bi)+(c+di) = ((a+c)+(b+d)i)
 *   (a+bi)-(c+di) = ((a-c)+(b-d)i)
 *   (a+bi)*(c+di) = ac + adi + bci + bdi^2
 * = ac + (ad+bc)i - bd # as i^2 = -1
 * = ((ac-bd)+(ad+bc)i)
 *   (a+bi)/(c+di) = (a+bi)(c-di) / (c+di)(c-di)
 * = ((ac+bd) + (bc-ad)i) / (c^2+d^2)
 * = ((ac + bd)/(c^2+d^2) + ((bc-ad)/(c^2+d^2))i)
 * 
 * translation  +   IPoint_IPoint_add
 * translation  -   IPoint_IPoint_sub
 * multiplication   *   IPoint_IPoint_mul
 * division /   IPoint_IPoint_div
 *---*/



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-24 Thread Paul Matthews
This is to go with the hypot() patch I posted the other day.

As other code, such as found in adt/float.c and adt/numeric.c, simply
assumes that isnan() exists, despite it being a C99 function, I have
assumed the same.

The below code should be placed into a file called src/port/hypot.c.

Unfortunately I do not know how to drive configure and all the other
associated build magics. Could some kind soul please implemented that
portion. (Or shove me in the right direction)


#include math.h
#include c.h
#include utils/builtins.h

/*
 * Find the hypotenuse. Firstly x and y are swapped, if required, to make
 * x the larger number. The traditional formulae of x^2+y^2 is rearranged
 * to bring x outside the sqrt. This allows computation of the hypotenuse
 * for much larger magnitudes than otherwise normally possible.
 *
 * sqrt( x^2 + y^2 ) = sqrt( x^2( 1 + y^2/x^2) )
 *   = x * sqrt( 1 + y^2/x^2 )
 *   = x * sqrt( 1 + y/x * y/x )
 */
double hypot( double x, double y )
{
double yx;

if( isinf(x) || isinf(y) )
  return get_float8_infinity();

if( isnan(x) || isnan(y) )
  return get_float8_nan();

x = fabs(x);
y = fabs(y);
if (x  y) {
double temp = x;
x = y;
y = temp;
}
if (x == 0.0)
return 0.0;
else {
yx = y/x;
return x*sqrt(1.0+yx*yx);
}
}


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-24 Thread David Fetter
On Mon, Aug 24, 2009 at 11:14:19PM +1000, Paul Matthews wrote:
 This is to go with the hypot() patch I posted the other day.
 
 As other code, such as found in adt/float.c and adt/numeric.c, simply
 assumes that isnan() exists, despite it being a C99 function, I have
 assumed the same.
 
 The below code should be placed into a file called src/port/hypot.c.
 
 Unfortunately I do not know how to drive configure and all the other
 associated build magics. Could some kind soul please implemented that
 portion. (Or shove me in the right direction)

Comments below :)

 #include math.h
 #include c.h
 #include utils/builtins.h
 
 /*
  * Find the hypotenuse. Firstly x and y are swapped, if required, to make
  * x the larger number. The traditional formulae of x^2+y^2 is rearranged
  * to bring x outside the sqrt. This allows computation of the hypotenuse
  * for much larger magnitudes than otherwise normally possible.
  *
  * sqrt( x^2 + y^2 ) = sqrt( x^2( 1 + y^2/x^2) )
  *   = x * sqrt( 1 + y^2/x^2 )
  *   = x * sqrt( 1 + y/x * y/x )
  */
 double hypot( double x, double y )
 {
 double yx;
 
 if( isinf(x) || isinf(y) )
   return get_float8_infinity();
 
 if( isnan(x) || isnan(y) )
   return get_float8_nan();
 
 x = fabs(x);
 y = fabs(y);
 
 if (x  y) {
 double temp = x;
 x = y;
 y = temp;
 }

How about putting:

if (x = 0.0  y == 0.0)
return x;

here?

These next two lines are a teensy bit baroque.  Is there some
significant speed increase that would justify them?

 if (x == 0.0)
 return 0.0;
 else {
 yx = y/x;
 return x*sqrt(1.0+yx*yx);
 }
 }

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-24 Thread Kevin Grittner
David Fetter da...@fetter.org wrote: 
 On Mon, Aug 24, 2009 at 11:14:19PM +1000, Paul Matthews wrote:
 
 These next two lines are a teensy bit baroque.  Is there some
 significant speed increase that would justify them?
 
 if (x == 0.0)
 return 0.0;
 else {
 yx = y/x;
 return x*sqrt(1.0+yx*yx);
 }
 }
 
I think the reason is overflow.  From the function comment:
 
  * The traditional formulae of x^2+y^2 is rearranged
  * to bring x outside the sqrt. This allows computation of the
hypotenuse
  * for much larger magnitudes than otherwise normally possible.
 
Although I don't see why the first part isn't:
 
if (y == 0.0)
return x;
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-24 Thread David Fetter
On Mon, Aug 24, 2009 at 12:47:42PM -0500, Kevin Grittner wrote:
 David Fetter da...@fetter.org wrote: 
  On Mon, Aug 24, 2009 at 11:14:19PM +1000, Paul Matthews wrote:
  
  These next two lines are a teensy bit baroque.  Is there some
  significant speed increase that would justify them?
  
  if (x == 0.0)
  return 0.0;
  else {
  yx = y/x;
  return x*sqrt(1.0+yx*yx);
  }
  }
  
 I think the reason is overflow.  From the function comment:
  
   * The traditional formulae of x^2+y^2 is rearranged
   * to bring x outside the sqrt. This allows computation of the
 hypotenuse
   * for much larger magnitudes than otherwise normally possible.
  
 Although I don't see why the first part isn't:
  
 if (y == 0.0)
 return x;

D'oh!

Good point :)

So the code should read as follows?

#include math.h
#include c.h
#include utils/builtins.h

/*
 * Find the hypotenuse. Firstly x and y are swapped, if required, to make
 * x the larger number. The traditional formulae of x^2+y^2 is rearranged
 * to bring x outside the sqrt. This allows computation of the hypotenuse
 * for much larger magnitudes than otherwise normally possible.
 *
 * sqrt( x^2 + y^2 ) = sqrt( x^2( 1 + y^2/x^2) )
 *   = x * sqrt( 1 + y^2/x^2 )
 *   = x * sqrt( 1 + y/x * y/x )
 */
double hypot( double x, double y )
{
double yx;

if( isinf(x) || isinf(y) )
return get_float8_infinity();

if( isnan(x) || isnan(y) )
return get_float8_nan();

x = fabs(x);
y = fabs(y);

if (x  y) {
double temp = x;
x = y;
y = temp;
}
if (y == 0.0)
return x;
yx = y/x;
return x*sqrt(1.0+yx*yx);
}

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-24 Thread Sam Mason
On Mon, Aug 24, 2009 at 07:07:13AM -0700, David Fetter wrote:
 These next two lines are a teensy bit baroque.  Is there some
 significant speed increase that would justify them?

Just noticed with your revised code that the following check:

 On Mon, Aug 24, 2009 at 11:14:19PM +1000, Paul Matthews wrote:
  if (x == 0.0)
  return 0.0;
  else {
  yx = y/x;

is preventing a divide by zero on the line above.  So it's not a
performance hack, it's just allowing it to remain correct as a result of
changing the maths around.

  return x*sqrt(1.0+yx*yx);
  }
  }

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-24 Thread Sam Mason
On Mon, Aug 24, 2009 at 06:59:38PM +0100, Sam Mason wrote:
  On Mon, Aug 24, 2009 at 11:14:19PM +1000, Paul Matthews wrote:
   if (x == 0.0)
   return 0.0;
   else {
   yx = y/x;
 
 is preventing a divide by zero on the line above.  So it's not a
 performance hack, it's just allowing it to remain correct as a result of
 changing the maths around.

I've also just realized why it's safe to return zero here; y contains
the smaller number and so if x is zero, y must be zero as well.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-24 Thread Greg Stark
On Mon, Aug 24, 2009 at 6:52 PM, David Fetterda...@fetter.org wrote:
 double hypot( double x, double y )
 {
    double yx;

    if( isinf(x) || isinf(y) )
    return get_float8_infinity();

    if( isnan(x) || isnan(y) )
    return get_float8_nan();

For what it's worth though, check out the code in float.c to see how
postgres handles floating point overflows. I'm not sure whether it's
forced by the standard, we discussed and settled on this behaviour, or
the person who wrote it just thought it was a good idea but we may as
well be consistent.

The behaviour we have now is that if you pass Inf or -Inf then we
happily return Inf or -Inf (or whatever the result is) as appropriate.
But if the operation overflows despite being passed reasonable values
then it throws an error.

Afaict hypot() can still overflow even with this new coding if you
have, say, hypot(MAX_FLOAT, MAX_FLOAT) which will return MAX_FLOAT *
sqrt(2). In that case we should throw an error unless either input was
Inf.

Also, the question arises what should be returned for hypot(Inf,NaN)
which your code returns Inf for. Empirically, it seems the normal
floating point behaviour is to return NaN so I think the NaN test
should be first.

Lastly I find the swap kind of confusing and also think it might
perform less well than just having one branch and simple logic in each
branch. This is just a style question that you could see either way
though, the performance difference is probably immeasurable if even if
my guess is right.

so I would suggest:

if (isnan(x) || isnan(y)
  return float8_get_nan();
else if (isinf(x) || isinf(y))
  return float8_get_inf();
else if (x == 0.0  y == 0.0)
  return 0.0;

x = fabs(x);
y = fabs(y);

if (x  y)
  retval = x * sqrt((y/x)*(y/x) + 1);
else
  retval = y * sqrt((x/y)*(x/y) + 1);

if (isinf(retval))
  ereport(overflow...)

return retval;
}

-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-24 Thread Paul Matthews
Greg Stark wrote:
 Also, the question arises what should be returned for hypot(Inf,NaN)
 which your code returns Inf for. Empirically, it seems the normal
 floating point behaviour is to return NaN so I think the NaN test
 should be first.

   
See http://www.opengroup.org/onlinepubs/95399/functions/hypot.html

If /x/ or /y/ is ±Inf, +Inf shall be returned (even if one of /x/ or
/y/ is NaN).
If /x/ or /y/ is NaN, and the other is not ±Inf, a NaN shall be
returned.

Just trying to implement correct C99 behaviour here.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-24 Thread Tom Lane
Paul Matthews p...@netspace.net.au writes:
 Just trying to implement correct C99 behaviour here.

Around here we tend to read the Single Unix Spec before C99, and SUS
saith something different:

http://www.opengroup.org/onlinepubs/007908799/xsh/hypot.html

It would be serious folly for us to suppose that every platform's
version of hypot() behaves exactly the same for these corner cases,
anyway.  If you're proposing to write code that would depend on that,
we need to call it something else and not pretend that it's just a
fill-in for platforms that lack hypot() entirely.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-24 Thread Greg Stark
On Tue, Aug 25, 2009 at 1:14 AM, Tom Lanet...@sss.pgh.pa.us wrote:
 Paul Matthews p...@netspace.net.au writes:
 Just trying to implement correct C99 behaviour here.

 Around here we tend to read the Single Unix Spec before C99, and SUS
 saith something different:

It doesn't seem to anticipate NaN at all.

Neither of these seems on-point since we're neither implementing SuS
nor a C compiler. In fact we're not even implementing hypot().

We're implementing things like box_distance and point_distance which
as it happens will already be doing earlier arithmetic on the doubles,
so whatever HYPOT() does had better be consistent with that or the
results will be just an inexplicable mishmash.

Incidentally, what on earth does it mean to multiply or divide a
circle by a point?




-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-23 Thread Paul Matthews
Tom Lane wrote:

 Greg Stark gsst...@mit.edu writes:
   
 If there's a performance advantage then we could add a configure test
 and define the macro to call hypot(). You said it existed before C99
 though, how widespread was it? If it's in all the platforms we support
 it might be reasonable to just go with it.
 

 For one data point, I see hypot() in HPUX 10.20, released circa 1996.
 I suspect we would want a configure test and a substitute function
 anyway.  Personally I wouldn't have a problem with the substitute being
 the naive sqrt(x*x+y*y), particularly if it's replacing existing code
 that overflows in the same places.

   regards, tom lane

   

A hypot() substitute might look something like this psudo-code, this is
how Python does it if the real hypot() is missing.

double hypot( double dx, double dy )
{
double yx;

if( isinf(dx) || ifinf(dy) ) {
  return INFINITY;
}
  
dx = fabs(dx);
dy = fabs(dy);
if (dx  dy) {
double temp = dx;
dx = dy;
dy = temp;
}
if (x == 0.)
return 0.;
else {
yx = dy/dx;
return dx*sqrt(1.0+yx*yx);
}
}

As the following link shows, a lot of care could be put into getting a
substitute hypot() correct.
http://gforge.inria.fr/plugins/scmsvn/viewcvs.php/trunk/hypot.c?rev=5677root=mpfrview=markup




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-23 Thread Magnus Hagander
On Sun, Aug 23, 2009 at 07:42, Tom Lanet...@sss.pgh.pa.us wrote:
 Greg Stark gsst...@mit.edu writes:
 If there's a performance advantage then we could add a configure test
 and define the macro to call hypot(). You said it existed before C99
 though, how widespread was it? If it's in all the platforms we support
 it might be reasonable to just go with it.

 For one data point, I see hypot() in HPUX 10.20, released circa 1996.
 I suspect we would want a configure test and a substitute function
 anyway.  Personally I wouldn't have a problem with the substitute being
 the naive sqrt(x*x+y*y), particularly if it's replacing existing code
 that overflows in the same places.

For another data point, Microsoft documentation says:
This POSIX function is deprecated beginning in Visual C++ 2005. Use
the ISO C++ conformant  _hypot instead.

_hypot() has been there since Windows 95, so it shouldn't be a problem
to use it - it just needs a define, like we have for some other such
functions.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-23 Thread Nicolas Barbier
2009/8/23 Magnus Hagander mag...@hagander.net:

 For another data point, Microsoft documentation says:
 This POSIX function is deprecated beginning in Visual C++ 2005. Use
 the ISO C++ conformant  _hypot instead.

 _hypot() has been there since Windows 95, so it shouldn't be a problem
 to use it - it just needs a define, like we have for some other such
 functions.

FYI, according to [1], this warning is bogus (i.e., hypot without
underscore should be entirely OK) and will be removed in VS 2010.

Nicolas

[1] 
url:http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=289653

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-22 Thread Greg Stark
On Sun, Aug 23, 2009 at 4:54 AM, Paul Matthewsp...@netspace.net.au wrote:

 The hypot() function has been part of the C standard since C99 (ie 10
 years ago)

Postgres targets C89. The date of the standard is when the standard
came out, it takes years before it's widely available and then years
again before the systems with the old compiler are no longer
interesting.

If there's a performance advantage then we could add a configure test
and define the macro to call hypot(). You said it existed before C99
though, how widespread was it? If it's in all the platforms we support
it might be reasonable to just go with it.

 The function is designed not to fail where the current naive macro would
 result in overflow.

The code seems to be blissfully unaware of overflow dangers :( Even if
hypot() avoids spurious overflows it's always possible for there to be
a legitimate overflow which we ought to detect and handle properly.

-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slaying the HYPOTamus

2009-08-22 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 If there's a performance advantage then we could add a configure test
 and define the macro to call hypot(). You said it existed before C99
 though, how widespread was it? If it's in all the platforms we support
 it might be reasonable to just go with it.

For one data point, I see hypot() in HPUX 10.20, released circa 1996.
I suspect we would want a configure test and a substitute function
anyway.  Personally I wouldn't have a problem with the substitute being
the naive sqrt(x*x+y*y), particularly if it's replacing existing code
that overflows in the same places.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers