[
https://issues.apache.org/jira/browse/NUMBERS-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17345740#comment-17345740
]
Matt Juntunen commented on NUMBERS-156:
---------------------------------------
Below are the accuracy statistics for vectors of length 100, including the
{{LinearCombination}} extended-precision algorithm (named "extLinear").
||name||input type||error mean||error std dev||error min||error max||failed||
|direct|high|NaN|-0.00|Infinity|-Infinity|200000|
|enorm|high|0.00124|1.10|-8.00|6.00|0|
|enormMod|high|-0.00179|1.02|-6.00|5.00|0|
|enormModKahan|high|0.00106|0.385|-1.00|1.00|0|
|extLinear|high|0.00130|0.369|-1.00|1.00|0|
|direct|high-thresh|-0.00174|1.02|-6.00|5.00|0|
|enorm|high-thresh|0.00129|1.10|-8.00|7.00|0|
|enormMod|high-thresh|0.00114|0.798|-4.00|4.00|0|
|enormModKahan|high-thresh|0.00111|0.385|-1.00|1.00|0|
|extLinear|high-thresh|0.00135|0.369|-1.00|1.00|0|
|direct|mid|-0.00223|1.02|-6.00|5.00|0|
|enorm|mid|-0.00223|1.02|-6.00|5.00|0|
|enormMod|mid|-0.00223|1.02|-6.00|5.00|0|
|enormModKahan|mid|0.000620|0.392|-1.00|1.00|0|
|extLinear|mid|0.000860|0.376|-1.00|1.00|0|
|direct|low-thresh|-0.00271|1.02|-6.00|5.00|0|
|enorm|low-thresh|0.000315|1.10|-8.00|7.00|0|
|enormMod|low-thresh|0.000165|0.797|-4.00|4.00|0|
|enormModKahan|low-thresh|0.000135|0.385|-1.00|1.00|0|
|extLinear|low-thresh|0.000380|0.370|-1.00|1.00|0|
|direct|low|5.74e+04|1.88e+05|-3.55e+06|3.33e+06|0|
|enorm|low|-2.50e-05|1.10|-8.00|6.00|0|
|enormMod|low|-0.00305|1.02|-6.00|5.00|0|
|enormModKahan|low|-0.000200|0.384|-1.00|1.00|0|
|extLinear|low|4.00e-05|0.369|-1.00|1.00|0|
|direct|full|0.00942|0.510|-2.00|1.00|194054|
|enorm|full|0.0952|0.677|-3.00|3.00|0|
|enormMod|full|0.00829|0.498|-2.00|2.00|0|
|enormModKahan|full|-8.50e-05|0.481|-2.00|2.00|0|
|extLinear|full|0.00200|0.437|-1.00|1.00|1866|
{{enormModKahan}} and {{extLinear}} are very close in terms of accuracy.
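For readers unfamiliar with the technique the "Kahan" suffix refers to, here is a minimal sketch of Kahan (compensated) summation applied to a sum of squares. It is not the benchmarked {{enormModKahan}} code; the class and method names are hypothetical and the scaling logic is left out. It only illustrates how carrying a running compensation term recovers low-order bits that a plain sum drops, at the cost of a few extra floating-point operations per element.

```java
import java.util.Arrays;

final class KahanSumSketch {

    /** Plain sum of squares: each addition rounds independently. */
    public static double plainSumOfSquares(double[] v) {
        double sum = 0;
        for (double x : v) {
            sum += x * x;
        }
        return sum;
    }

    /** Kahan sum of squares: comp carries the rounding error of the previous addition. */
    public static double kahanSumOfSquares(double[] v) {
        double sum = 0;
        double comp = 0;
        for (double x : v) {
            double term = x * x - comp;   // apply the correction from the last step
            double next = sum + term;     // low-order bits of term may be lost here
            comp = (next - sum) - term;   // recover what was lost (negated)
            sum = next;
        }
        return sum;
    }

    public static void main(String[] args) {
        // One large square followed by many squares below half an ulp of 1.0.
        double[] v = new double[1025];
        v[0] = 1.0;
        Arrays.fill(v, 1, v.length, 0x1.0p-27);
        System.out.println("plain: " + plainSumOfSquares(v));
        System.out.println("kahan: " + kahanSumOfSquares(v));
    }
}
```

In the example input, each small square (2^-54) is below half an ulp of 1.0, so the plain sum never moves off 1.0, while the compensated sum accumulates the small terms.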
I also took a closer look at performance, focusing on the mid (unscaled)
exponent range, since that is mostly what I'm concerned about for
commons-geometry. Below are graphs of the performance of the various methods
for different vector lengths.
!performance-all.png!
!performance-len-1-5.png!
Except for {{extLinear}}, all of the methods are in the same general range for
small vectors. However, as the vector length grows, {{enormModKahan}} slows
down much more quickly than the others.
_Conclusion_
Overall, I'm leaning toward using {{enormMod}} in {{Norms}} and documenting the
fact that overflow and underflow are protected against but accuracy in standard
exponent ranges resembles that of direct computation. The reasons for this
choice are:
1. {{enormMod}} has good performance across a wide range of vector lengths and
has an accuracy at least as good as direct computation.
2. Our previous Euclidean norm class ({{SafeNorm}}) also did not offer enhanced
precision, so we are not losing anything.
3. If we later decide to implement an extended precision Euclidean norm method,
we can easily place that in a separate class or method. Users could then decide
whether or not to take the performance hit for the higher accuracy.
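To make the trade-off above concrete, here is a minimal sketch of what "protected against overflow and underflow, with accuracy like direct computation" can look like. This is not the {{enormMod}} implementation; the class name, thresholds, and scale factors are hypothetical values picked for illustration. Values are split by magnitude into three accumulators, the large and small ones are rescaled by exact powers of two before squaring, and no compensation is applied to the sums.

```java
final class ScaledNormSketch {

    // Hypothetical thresholds and scale factors, chosen for illustration only.
    private static final double LARGE = 0x1.0p500;
    private static final double SMALL = 0x1.0p-500;
    private static final double SCALE_DOWN = 0x1.0p-600;
    private static final double SCALE_UP = 0x1.0p600;

    /** Direct computation: square root of the plain sum of squares. */
    public static double direct(double[] v) {
        double sum = 0;
        for (double x : v) {
            sum += x * x;
        }
        return Math.sqrt(sum);
    }

    /** Scaled computation: large and small magnitudes are rescaled before squaring. */
    public static double scaled(double[] v) {
        double big = 0;
        double mid = 0;
        double small = 0;
        for (double x : v) {
            double a = Math.abs(x);
            if (a > LARGE) {
                double s = a * SCALE_DOWN;
                big += s * s;
            } else if (a < SMALL) {
                double s = a * SCALE_UP;
                small += s * s;
            } else {
                mid += a * a;
            }
        }
        if (big != 0) {
            // Values below SMALL are too small to affect a result of this magnitude.
            return Math.sqrt(big + mid * SCALE_DOWN * SCALE_DOWN) * SCALE_UP;
        }
        if (mid != 0) {
            return Math.sqrt(mid + small * SCALE_DOWN * SCALE_DOWN);
        }
        return Math.sqrt(small) * SCALE_DOWN;
    }

    public static void main(String[] args) {
        double[] high = {3e200, 4e200};
        double[] low = {3e-200, 4e-200};
        System.out.println("high: direct=" + direct(high) + " scaled=" + scaled(high));
        System.out.println("low:  direct=" + direct(low) + " scaled=" + scaled(low));
    }
}
```

For inputs in the mid range, {{scaled}} performs the same additions as {{direct}}, which is why a method of this shape would not be expected to improve accuracy there.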
Thoughts?
> SafeNorm 3D overload
> --------------------
>
> Key: NUMBERS-156
> URL: https://issues.apache.org/jira/browse/NUMBERS-156
> Project: Commons Numbers
> Issue Type: Improvement
> Reporter: Matt Juntunen
> Priority: Major
> Attachments: performance-all.png, performance-len-1-5.png
>
>
> We should create an overload of {{SafeNorm.value}} that accepts 3 arguments
> to potentially improve performance for 3D vectors.