[
https://issues.apache.org/jira/browse/NUMBERS-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334854#comment-17334854
]
Alex Herbert commented on NUMBERS-156:
--------------------------------------
SafeNorm has some issues.
I think the constants to check for over/underflow in SafeNorm are for 32-bit
single precision floats. The original source implementation should be checked.
A lot of the time scaling is done when a double would easily handle the
unscaled sum.
Second, the scaling of large or small values is done using divisions by an input
value. This is not exact. So as soon as you start scaling there will be a loss of
precision over a standard unscaled sum, e.g.
{code:java}
double a = SafeNorm.value(IntStream.range(1, 100).mapToDouble(i -> i / Math.PI).toArray()) * 0x1.0p500;
double b = SafeNorm.value(IntStream.range(1, 100).mapToDouble(i -> i * 0x1.0p500 / Math.PI).toArray());
// Standard sum
double c = Math.sqrt(IntStream.range(1, 100).mapToDouble(i -> i * 0x1.0p500 / Math.PI).map(x -> x * x).sum());
double d = Math.sqrt(IntStream.range(1, 100).mapToDouble(i -> i / Math.PI).map(x -> x * x).sum()) * 0x1.0p500;
System.out.println(a);
System.out.println(b);
System.out.println(c);
System.out.println(d);
assert c == d;
assert a != b;
assert a != c;
{code}
Outputs:
{noformat}
5.970579281879817E152
5.970579281879819E152
5.970579281879818E152
5.970579281879818E152
{noformat}
The actual values only differ by a few ULP, but it shows that SafeNorm is not stable
to scaling and is not the same as the unscaled sum.
I built a safe norm for the RNG line sampler I am working on. Currently it
scales based on the maximum value. Ideally you scale up based on the minimum
value (as the minimum value may underflow when squared in the case where the
maximum will not) but this requires more checks.
{code:java}
/**
 * Compute the Euclidean length of the vector. The input is scaled up or down
 * based on the maximum value to avoid over/underflow.
 *
 * @param x Vector
 * @return the length
 */
static double length(double[] x) {
    double max = Math.abs(x[0]);
    for (int i = 1; i < x.length; i++) {
        // Math.max will return NaN for any NaN input
        max = Math.max(max, Math.abs(x[i]));
    }
    if (!Double.isFinite(max) || max == 0) {
        // No scaling possible for zero, infinite or NaN input
        return max;
    }
    // Perform scaling
    double forward;
    double reverse;
    if (max > 0x1.0p500) {
        // Scale down big values
        forward = 0x1.0p-600;
        reverse = 0x1.0p600;
    } else if (max < 0x1.0p-500) {
        // Scale up small values
        forward = 0x1.0p600;
        reverse = 0x1.0p-600;
    } else {
        forward = reverse = 1.0;
    }
    double sum = 0;
    for (int i = 0; i < x.length; i++) {
        sum += square(x[i] * forward);
    }
    return Math.sqrt(sum) * reverse;
}

/**
 * Return the value squared ({@code x^2}).
 *
 * @param x the value
 * @return the squared value
 */
private static double square(double x) {
    return x * x;
}
{code}
If you want a truly stable result without having to use extended precision
in the sum, you can sort the squares and sum them. This minimises the loss of bits
when summing as you always add increasingly larger values. The summation can
additionally be done in extended precision to ensure the round-off term is carried.
Sorting also ensures that permutations of the input vector have the same length. This
requires storing an array of the absolute magnitudes and sorting it. After the
sort you get the maximum and minimum for free to detect whether scaling is needed.
This method could be inlined for length 3 to avoid the array, using a few if
statements for an inline sort of the 3 magnitudes (a rough sketch of this follows
the method below).
{code:java}
/**
 * Compute the Euclidean length of the vector. The input is scaled up or down
 * based on the maximum/minimum value to avoid over/underflow. The result is
 * the same for permutations of the input.
 *
 * @param v Vector
 * @return the length
 */
static double length(double[] v) {
    final double[] x = new double[v.length];
    for (int i = 0; i < x.length; i++) {
        x[i] = Math.abs(v[i]);
    }
    // NaN will move to the end of the array
    Arrays.sort(x);
    final double max = x[x.length - 1];
    if (!Double.isFinite(max) || max == 0) {
        // No scaling possible for zero, infinite or NaN input
        return max;
    }
    // Perform scaling
    double forward;
    double reverse;
    if (max > 0x1.0p500) {
        // Scale down big values
        forward = 0x1.0p-600;
        reverse = 0x1.0p600;
    } else if (x[0] < 0x1.0p-500) {
        // Scale up small values
        forward = 0x1.0p600;
        reverse = 0x1.0p-600;
    } else {
        forward = reverse = 1.0;
    }
    double sum = 0;
    for (int i = 0; i < x.length; i++) {
        // This summation could be done in extended precision...
        sum += square(x[i] * forward);
    }
    return Math.sqrt(sum) * reverse;
}
{code}
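For the 3-argument overload requested in this issue, a minimal sketch of that inlined version could look like the following. This is my own illustration, not a tested implementation; it reuses the {{square}} helper from the first snippet and the same scaling thresholds.
{code:java}
/**
 * Sketch of a 3-argument Euclidean length: branch-based sort of the absolute
 * magnitudes, then the same scaled, smallest-first summation as above.
 */
static double length(double x, double y, double z) {
    double a = Math.abs(x);
    double b = Math.abs(y);
    double c = Math.abs(z);
    // Math.max will return NaN for any NaN input
    final double max = Math.max(Math.max(a, b), c);
    if (!Double.isFinite(max) || max == 0) {
        // No scaling possible for zero, infinite or NaN input
        return max;
    }
    // Inline sort so that a <= b <= c (no NaN at this point)
    double t;
    if (a > b) {
        t = a; a = b; b = t;
    }
    if (b > c) {
        t = b; b = c; c = t;
        if (a > b) {
            t = a; a = b; b = t;
        }
    }
    // Perform scaling based on the maximum and minimum
    double forward;
    double reverse;
    if (c > 0x1.0p500) {
        // Scale down big values
        forward = 0x1.0p-600;
        reverse = 0x1.0p600;
    } else if (a < 0x1.0p-500) {
        // Scale up small values
        forward = 0x1.0p600;
        reverse = 0x1.0p-600;
    } else {
        forward = reverse = 1.0;
    }
    // Sum from smallest to largest
    final double sum = square(a * forward) + square(b * forward) + square(c * forward);
    return Math.sqrt(sum) * reverse;
}
{code}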
To check the exact result I would look into using Java 9 for the test code and
BigDecimal sqrt. You will need a method to compute the ULP difference between
doubles. Since all lengths should be positive this reduces to:
{code:java}
Double.doubleToLongBits(expected) - Double.doubleToLongBits(actual); // Will be signed
{code}
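As a hedged sketch of that check (my own illustration; the method names {{referenceLength}} and {{ulpDifference}} are hypothetical), the Java 9 reference computation and the ULP difference could be:
{code:java}
import java.math.BigDecimal;
import java.math.MathContext;

/** Compute a high-precision reference length (assumes finite input). */
static double referenceLength(double[] v) {
    BigDecimal sum = BigDecimal.ZERO;
    for (final double value : v) {
        // The square of a finite double is exact in BigDecimal
        final BigDecimal x = new BigDecimal(value);
        sum = sum.add(x.multiply(x));
    }
    // Requires Java 9+; DECIMAL128 is far more precision than a double needs
    // for a test reference (exact halfway cases aside)
    return sum.sqrt(MathContext.DECIMAL128).doubleValue();
}

/** Signed ULP difference, valid when both values are positive finite doubles. */
static long ulpDifference(double expected, double actual) {
    return Double.doubleToLongBits(expected) - Double.doubleToLongBits(actual);
}
{code}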
Then you can check the mean and standard deviation of the ULP difference from
the actual result over a large number of random vectors. I expect the mean to
be 0 but the standard deviation will be higher for the less precise methods.
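A rough sketch of that loop over random vectors (reusing the hypothetical {{referenceLength}}, {{ulpDifference}} and {{length}} methods above; the uniform range here will not exercise the scaling branches) could be:
{code:java}
final ThreadLocalRandom rng = ThreadLocalRandom.current();
final int samples = 100_000;
double sum = 0;
double sumSq = 0;
for (int i = 0; i < samples; i++) {
    // A wider exponent range would be needed to exercise the scaling paths
    final double[] v = rng.doubles(3, -1, 1).toArray();
    final long ulp = ulpDifference(referenceLength(v), length(v));
    sum += ulp;
    sumSq += (double) ulp * ulp;
}
final double mean = sum / samples;
final double sd = Math.sqrt(sumSq / samples - mean * mean);
System.out.printf("mean ULP error = %.4f, sd = %.4f%n", mean, sd);
{code}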
If you are building a test suite of different methods then you can create the
same functions as above but using an extended precision sum for comparison, for
example using [Kahan summation|https://en.wikipedia.org/wiki/Kahan_summation_algorithm].
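A minimal sketch of such a compensated-sum variant (again my own illustration, taking the scaling factors as parameters and reusing the {{square}} helper) could be:
{code:java}
/**
 * Scaled sum of squares using Kahan (compensated) summation.
 * The forward/reverse scaling factors are chosen as in the methods above.
 */
static double kahanLength(double[] x, double forward, double reverse) {
    double sum = 0;
    double compensation = 0;
    for (int i = 0; i < x.length; i++) {
        // Carry the round-off term from the previous addition
        final double y = square(x[i] * forward) - compensation;
        final double t = sum + y;
        compensation = (t - sum) - y;
        sum = t;
    }
    return Math.sqrt(sum) * reverse;
}
{code}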
> SafeNorm 3D overload
> --------------------
>
> Key: NUMBERS-156
> URL: https://issues.apache.org/jira/browse/NUMBERS-156
> Project: Commons Numbers
> Issue Type: Improvement
> Reporter: Matt Juntunen
> Priority: Major
>
> We should create an overload of {{SafeNorm.value}} that accepts 3 arguments
> to potentially improve performance for 3D vectors.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)