[
https://issues.apache.org/jira/browse/NUMBERS-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17028450#comment-17028450
]
Alex Herbert commented on NUMBERS-143:
--------------------------------------
Here are the definitions of what hypot is required to do from ISO C99:
7.12.7.3
The hypot functions compute the square root of the sum of the squares of x and
y, without undue overflow or underflow. A range error may occur.
F.9.4.3
Special cases:
* hypot(x, y), hypot(y, x), and hypot(x, −y) are equivalent.
* hypot(x, ±0) is equivalent to |x|.
* hypot(±∞, y) returns +∞, even if y is a NaN.
There is no requirement for a set level of accuracy. The fdlibm implementation
of hypot specifies it achieves <1 ULP from the exact result.
Thus an implementation should handle scaling of the intermediate sum
{{x^2+y^2}} to avoid over/underflow, be commutative and handle a special case
of zero or infinite.
Here is the pseudocode for what fdlibm does:
{noformat}
// Get exponents of input x and y
// Get the absolutes: |x| and |y|
// Sort |x| and |y| by magnitude
// If the exponent difference is large: return |x| (this may be inf/nan)
// If the exponent is large then:
// If |x| is inf/nan then: return inf/nan (depends on |y|)
// else: scale down
// If the exponent is small then:
// If |y| is zero then: return |x|
// else: scale up
// Compute sqrt(x*x + y*y)
// Scale back the result
// return result
{noformat}
The code uses the exponent to perform magnitude checks and handle the special
cases of inf/nan without ever requiring explicit identification of inf/nan.
There are variations on this to do the scaling and inf/nan checks. A key point
is that the computation of {{x^2+y^2}} could be done using any method. The
hypot function aims for high precision using a condition based on the relative
magnitude of |x| and |y|:
{code:java}
final double w = x - y;
if (w > y) {
// x > 2y
// Extended precision computation of x^2, standard y^2
} else {
// 2y > x > y
// Extended precision computation of x^2 + y^2
}
{code}
The point here is that when y is a similar scale to x then any round-off of y^2
is as important as round-off of x^2.
This branch is a key part of the performance of the method. For any uniformly
distributed polar coordinate (magnitude, angle) input data, X and Y are the
edges of a right angle triangle with the magnitude as the hypotenuse. If |x|
and |y| can take any value then they will be within a 2-fold factor when the
ratio of the edges of the triangle are between 2/1 and 1/2. The angle between
them is arctan(2/1) - arctan(1/2) = arctan(3/4) (by trigonomic identity). This
forms a segment in the quarter circle area of arctan(3/4) / (pi/2) = 0.41. Thus
the branch to (2y > x > y) will be taken 41% of the time. If data is generated
using a uniform distribution of x and y then the input data is a square (with
one vertex the origin (0,0)) containing point (x,y). The branch (2y > x > y)
will be taken 50% of the time. This follows from the lines (2,1) and (1,2)
creating segments in a 2x2 square. The area not within the arc is (2*2 - 2*1/2
- 2*1/2) / 4 = 2 / 4. (A diagram would have helped here.) The result being that
for randomly simulated data that is uniform in polar or Cartesian coordinates
this branch is taken around half the time. This is hard to predict for any
input data and the processor cannot efficiently pipeline this computation. This
will be demonstrated in results from JMH performance benchmarks.
The following candidate methods will be tested to compute {{x^2+y^2}}:
* fdlibm computation (involves a branch)
* simple x*x + y*y
* fused multiply add: Math.fma(x, x, y*y)
* extended precision x*x + y*y summation using Dekker's method then standard
sqrt
Ideally the reference would be computed using 128-bit precision. This can be
done using Java 9 BigDecimal which has a sqrt() function. An alternative is
extended precision x*x + y*y summation and extended precision sqrt using
Dekker's method [1] to set a reference value.
Results from an accuracy test and performance test will be added to this ticket.
1. [Dekker (1971) A floating-point technique for extending the available
precision|https://doi.org/10.1007/BF01397083]
> Investigate Math.hypot for computing the absolute of a complex number
> ---------------------------------------------------------------------
>
> Key: NUMBERS-143
> URL: https://issues.apache.org/jira/browse/NUMBERS-143
> Project: Commons Numbers
> Issue Type: Task
> Components: complex
> Reporter: Alex Herbert
> Priority: Minor
>
> {{Math.hypot}} computes the value {{sqrt(x^2+y^2)}} to within 1 ULP. The
> function uses the [e_hypot.c|https://www.netlib.org/fdlibm/e_hypot.c]
> implementation from the Freely Distributable Math Library (fdlibm).
> Pre-java 9 this function used JNI to call an external implementation. The
> performance was slow. Java 9 ported the function to native java (see
> [JDK-7130085 : Port fdlibm hypot to
> Java|https://bugs.java.com/bugdatabase/view_bug.do?bug_id=7130085]).
> This function is used to define the absolute value of a complex number. It is
> also used in sqrt() and log(). This ticket is to investigate the performance
> and accuracy of \{{Math.hypot}} against alternatives for use in Complex.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)