Re: [HACKERS] Slaying the HYPOTamus
Greg Stark wrote: We're implementing things like box_distance and point_distance which as it happens will already be doing earlier arithmetic on the doubles, so whatever HYPOT() does had better be consistent with that or the results will be just an inexplicable mishmash. Let's look at what the HYPOT macro in PostgreSQL does right now: #define HYPOT(A, B) sqrt((A)*(A)+(B)*(B)) And let's see how it is used in point_distance: Datum point_distance(PG_FUNCTION_ARGS) { Point *pt1 = PG_GETARG_POINT_P(0); Point *pt2 = PG_GETARG_POINT_P(1); PG_RETURN_FLOAT8(HYPOT(pt1-x - pt2-x, pt1-y - pt2-y)); } Oh. Surprise. It does not look out for NaN's, InF's, overflows, underflows or anything else. In addition, for the majority of it's input space, it fails to work correctly at all. If x and y are both 1E200 the hypotenuse should be 1.4142135e200, well within the range of the double. However by naively squaring x yields 1E400, outside the range of a double. Currently HYPOT() returns INFINITY, which is not the correct answer. I am trying to introduce the hypot() function (where required) and associated patch, in order to start correcting some of the more obvious defects with the current geometry handling. Maybe my proposed hypot() function is not perfect, but it's so far in front of current the HYPOT macro is not funny. Incidentally, what on earth does it mean to multiply or divide a circle by a point? Basically it's complex math. This comment, from my new geometry implementation, might help. /*- * Additional Operators * * The +, -, * and / operators treat IPoint's as complex numbers. * (This would indicate a requirement for a Complex type?) * * (a+bi)+(c+di) = ((a+c)+(b+d)i) * (a+bi)-(c+di) = ((a-c)+(b-d)i) * (a+bi)*(c+di) = ac + adi + bci + bdi^2 * = ac + (ad+bc)i - bd # as i^2 = -1 * = ((ac-bd)+(ad+bc)i) * (a+bi)/(c+di) = (a+bi)(c-di) / (c+di)(c-di) * = ((ac+bd) + (bc-ad)i) / (c^2+d^2) * = ((ac + bd)/(c^2+d^2) + ((bc-ad)/(c^2+d^2))i) * * translation + IPoint_IPoint_add * translation - IPoint_IPoint_sub * multiplication * IPoint_IPoint_mul * division / IPoint_IPoint_div *---*/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
This is to go with the hypot() patch I posted the other day. As other code, such as found in adt/float.c and adt/numeric.c, simply assumes that isnan() exists, despite it being a C99 function, I have assumed the same. The below code should be placed into a file called src/port/hypot.c. Unfortunately I do not know how to drive configure and all the other associated build magics. Could some kind soul please implemented that portion. (Or shove me in the right direction) #include math.h #include c.h #include utils/builtins.h /* * Find the hypotenuse. Firstly x and y are swapped, if required, to make * x the larger number. The traditional formulae of x^2+y^2 is rearranged * to bring x outside the sqrt. This allows computation of the hypotenuse * for much larger magnitudes than otherwise normally possible. * * sqrt( x^2 + y^2 ) = sqrt( x^2( 1 + y^2/x^2) ) * = x * sqrt( 1 + y^2/x^2 ) * = x * sqrt( 1 + y/x * y/x ) */ double hypot( double x, double y ) { double yx; if( isinf(x) || isinf(y) ) return get_float8_infinity(); if( isnan(x) || isnan(y) ) return get_float8_nan(); x = fabs(x); y = fabs(y); if (x y) { double temp = x; x = y; y = temp; } if (x == 0.0) return 0.0; else { yx = y/x; return x*sqrt(1.0+yx*yx); } } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
On Mon, Aug 24, 2009 at 11:14:19PM +1000, Paul Matthews wrote: This is to go with the hypot() patch I posted the other day. As other code, such as found in adt/float.c and adt/numeric.c, simply assumes that isnan() exists, despite it being a C99 function, I have assumed the same. The below code should be placed into a file called src/port/hypot.c. Unfortunately I do not know how to drive configure and all the other associated build magics. Could some kind soul please implemented that portion. (Or shove me in the right direction) Comments below :) #include math.h #include c.h #include utils/builtins.h /* * Find the hypotenuse. Firstly x and y are swapped, if required, to make * x the larger number. The traditional formulae of x^2+y^2 is rearranged * to bring x outside the sqrt. This allows computation of the hypotenuse * for much larger magnitudes than otherwise normally possible. * * sqrt( x^2 + y^2 ) = sqrt( x^2( 1 + y^2/x^2) ) * = x * sqrt( 1 + y^2/x^2 ) * = x * sqrt( 1 + y/x * y/x ) */ double hypot( double x, double y ) { double yx; if( isinf(x) || isinf(y) ) return get_float8_infinity(); if( isnan(x) || isnan(y) ) return get_float8_nan(); x = fabs(x); y = fabs(y); if (x y) { double temp = x; x = y; y = temp; } How about putting: if (x = 0.0 y == 0.0) return x; here? These next two lines are a teensy bit baroque. Is there some significant speed increase that would justify them? if (x == 0.0) return 0.0; else { yx = y/x; return x*sqrt(1.0+yx*yx); } } Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
David Fetter da...@fetter.org wrote: On Mon, Aug 24, 2009 at 11:14:19PM +1000, Paul Matthews wrote: These next two lines are a teensy bit baroque. Is there some significant speed increase that would justify them? if (x == 0.0) return 0.0; else { yx = y/x; return x*sqrt(1.0+yx*yx); } } I think the reason is overflow. From the function comment: * The traditional formulae of x^2+y^2 is rearranged * to bring x outside the sqrt. This allows computation of the hypotenuse * for much larger magnitudes than otherwise normally possible. Although I don't see why the first part isn't: if (y == 0.0) return x; -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
On Mon, Aug 24, 2009 at 12:47:42PM -0500, Kevin Grittner wrote: David Fetter da...@fetter.org wrote: On Mon, Aug 24, 2009 at 11:14:19PM +1000, Paul Matthews wrote: These next two lines are a teensy bit baroque. Is there some significant speed increase that would justify them? if (x == 0.0) return 0.0; else { yx = y/x; return x*sqrt(1.0+yx*yx); } } I think the reason is overflow. From the function comment: * The traditional formulae of x^2+y^2 is rearranged * to bring x outside the sqrt. This allows computation of the hypotenuse * for much larger magnitudes than otherwise normally possible. Although I don't see why the first part isn't: if (y == 0.0) return x; D'oh! Good point :) So the code should read as follows? #include math.h #include c.h #include utils/builtins.h /* * Find the hypotenuse. Firstly x and y are swapped, if required, to make * x the larger number. The traditional formulae of x^2+y^2 is rearranged * to bring x outside the sqrt. This allows computation of the hypotenuse * for much larger magnitudes than otherwise normally possible. * * sqrt( x^2 + y^2 ) = sqrt( x^2( 1 + y^2/x^2) ) * = x * sqrt( 1 + y^2/x^2 ) * = x * sqrt( 1 + y/x * y/x ) */ double hypot( double x, double y ) { double yx; if( isinf(x) || isinf(y) ) return get_float8_infinity(); if( isnan(x) || isnan(y) ) return get_float8_nan(); x = fabs(x); y = fabs(y); if (x y) { double temp = x; x = y; y = temp; } if (y == 0.0) return x; yx = y/x; return x*sqrt(1.0+yx*yx); } Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
On Mon, Aug 24, 2009 at 07:07:13AM -0700, David Fetter wrote: These next two lines are a teensy bit baroque. Is there some significant speed increase that would justify them? Just noticed with your revised code that the following check: On Mon, Aug 24, 2009 at 11:14:19PM +1000, Paul Matthews wrote: if (x == 0.0) return 0.0; else { yx = y/x; is preventing a divide by zero on the line above. So it's not a performance hack, it's just allowing it to remain correct as a result of changing the maths around. return x*sqrt(1.0+yx*yx); } } -- Sam http://samason.me.uk/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
On Mon, Aug 24, 2009 at 06:59:38PM +0100, Sam Mason wrote: On Mon, Aug 24, 2009 at 11:14:19PM +1000, Paul Matthews wrote: if (x == 0.0) return 0.0; else { yx = y/x; is preventing a divide by zero on the line above. So it's not a performance hack, it's just allowing it to remain correct as a result of changing the maths around. I've also just realized why it's safe to return zero here; y contains the smaller number and so if x is zero, y must be zero as well. -- Sam http://samason.me.uk/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
On Mon, Aug 24, 2009 at 6:52 PM, David Fetterda...@fetter.org wrote: double hypot( double x, double y ) { double yx; if( isinf(x) || isinf(y) ) return get_float8_infinity(); if( isnan(x) || isnan(y) ) return get_float8_nan(); For what it's worth though, check out the code in float.c to see how postgres handles floating point overflows. I'm not sure whether it's forced by the standard, we discussed and settled on this behaviour, or the person who wrote it just thought it was a good idea but we may as well be consistent. The behaviour we have now is that if you pass Inf or -Inf then we happily return Inf or -Inf (or whatever the result is) as appropriate. But if the operation overflows despite being passed reasonable values then it throws an error. Afaict hypot() can still overflow even with this new coding if you have, say, hypot(MAX_FLOAT, MAX_FLOAT) which will return MAX_FLOAT * sqrt(2). In that case we should throw an error unless either input was Inf. Also, the question arises what should be returned for hypot(Inf,NaN) which your code returns Inf for. Empirically, it seems the normal floating point behaviour is to return NaN so I think the NaN test should be first. Lastly I find the swap kind of confusing and also think it might perform less well than just having one branch and simple logic in each branch. This is just a style question that you could see either way though, the performance difference is probably immeasurable if even if my guess is right. so I would suggest: if (isnan(x) || isnan(y) return float8_get_nan(); else if (isinf(x) || isinf(y)) return float8_get_inf(); else if (x == 0.0 y == 0.0) return 0.0; x = fabs(x); y = fabs(y); if (x y) retval = x * sqrt((y/x)*(y/x) + 1); else retval = y * sqrt((x/y)*(x/y) + 1); if (isinf(retval)) ereport(overflow...) return retval; } -- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
Greg Stark wrote: Also, the question arises what should be returned for hypot(Inf,NaN) which your code returns Inf for. Empirically, it seems the normal floating point behaviour is to return NaN so I think the NaN test should be first. See http://www.opengroup.org/onlinepubs/95399/functions/hypot.html If /x/ or /y/ is ±Inf, +Inf shall be returned (even if one of /x/ or /y/ is NaN). If /x/ or /y/ is NaN, and the other is not ±Inf, a NaN shall be returned. Just trying to implement correct C99 behaviour here. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
Paul Matthews p...@netspace.net.au writes: Just trying to implement correct C99 behaviour here. Around here we tend to read the Single Unix Spec before C99, and SUS saith something different: http://www.opengroup.org/onlinepubs/007908799/xsh/hypot.html It would be serious folly for us to suppose that every platform's version of hypot() behaves exactly the same for these corner cases, anyway. If you're proposing to write code that would depend on that, we need to call it something else and not pretend that it's just a fill-in for platforms that lack hypot() entirely. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
On Tue, Aug 25, 2009 at 1:14 AM, Tom Lanet...@sss.pgh.pa.us wrote: Paul Matthews p...@netspace.net.au writes: Just trying to implement correct C99 behaviour here. Around here we tend to read the Single Unix Spec before C99, and SUS saith something different: It doesn't seem to anticipate NaN at all. Neither of these seems on-point since we're neither implementing SuS nor a C compiler. In fact we're not even implementing hypot(). We're implementing things like box_distance and point_distance which as it happens will already be doing earlier arithmetic on the doubles, so whatever HYPOT() does had better be consistent with that or the results will be just an inexplicable mishmash. Incidentally, what on earth does it mean to multiply or divide a circle by a point? -- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
Tom Lane wrote: Greg Stark gsst...@mit.edu writes: If there's a performance advantage then we could add a configure test and define the macro to call hypot(). You said it existed before C99 though, how widespread was it? If it's in all the platforms we support it might be reasonable to just go with it. For one data point, I see hypot() in HPUX 10.20, released circa 1996. I suspect we would want a configure test and a substitute function anyway. Personally I wouldn't have a problem with the substitute being the naive sqrt(x*x+y*y), particularly if it's replacing existing code that overflows in the same places. regards, tom lane A hypot() substitute might look something like this psudo-code, this is how Python does it if the real hypot() is missing. double hypot( double dx, double dy ) { double yx; if( isinf(dx) || ifinf(dy) ) { return INFINITY; } dx = fabs(dx); dy = fabs(dy); if (dx dy) { double temp = dx; dx = dy; dy = temp; } if (x == 0.) return 0.; else { yx = dy/dx; return dx*sqrt(1.0+yx*yx); } } As the following link shows, a lot of care could be put into getting a substitute hypot() correct. http://gforge.inria.fr/plugins/scmsvn/viewcvs.php/trunk/hypot.c?rev=5677root=mpfrview=markup -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
On Sun, Aug 23, 2009 at 07:42, Tom Lanet...@sss.pgh.pa.us wrote: Greg Stark gsst...@mit.edu writes: If there's a performance advantage then we could add a configure test and define the macro to call hypot(). You said it existed before C99 though, how widespread was it? If it's in all the platforms we support it might be reasonable to just go with it. For one data point, I see hypot() in HPUX 10.20, released circa 1996. I suspect we would want a configure test and a substitute function anyway. Personally I wouldn't have a problem with the substitute being the naive sqrt(x*x+y*y), particularly if it's replacing existing code that overflows in the same places. For another data point, Microsoft documentation says: This POSIX function is deprecated beginning in Visual C++ 2005. Use the ISO C++ conformant _hypot instead. _hypot() has been there since Windows 95, so it shouldn't be a problem to use it - it just needs a define, like we have for some other such functions. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
2009/8/23 Magnus Hagander mag...@hagander.net: For another data point, Microsoft documentation says: This POSIX function is deprecated beginning in Visual C++ 2005. Use the ISO C++ conformant _hypot instead. _hypot() has been there since Windows 95, so it shouldn't be a problem to use it - it just needs a define, like we have for some other such functions. FYI, according to [1], this warning is bogus (i.e., hypot without underscore should be entirely OK) and will be removed in VS 2010. Nicolas [1] url:http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=289653 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
On Sun, Aug 23, 2009 at 4:54 AM, Paul Matthewsp...@netspace.net.au wrote: The hypot() function has been part of the C standard since C99 (ie 10 years ago) Postgres targets C89. The date of the standard is when the standard came out, it takes years before it's widely available and then years again before the systems with the old compiler are no longer interesting. If there's a performance advantage then we could add a configure test and define the macro to call hypot(). You said it existed before C99 though, how widespread was it? If it's in all the platforms we support it might be reasonable to just go with it. The function is designed not to fail where the current naive macro would result in overflow. The code seems to be blissfully unaware of overflow dangers :( Even if hypot() avoids spurious overflows it's always possible for there to be a legitimate overflow which we ought to detect and handle properly. -- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
Greg Stark gsst...@mit.edu writes: If there's a performance advantage then we could add a configure test and define the macro to call hypot(). You said it existed before C99 though, how widespread was it? If it's in all the platforms we support it might be reasonable to just go with it. For one data point, I see hypot() in HPUX 10.20, released circa 1996. I suspect we would want a configure test and a substitute function anyway. Personally I wouldn't have a problem with the substitute being the naive sqrt(x*x+y*y), particularly if it's replacing existing code that overflows in the same places. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers