[
https://issues.apache.org/jira/browse/STATISTICS-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17480077#comment-17480077
]
Alex Herbert commented on STATISTICS-52:
----------------------------------------
The previous JMH benchmark only tested the call to the exponential function.
However the normal distribution PDF requires more computations:
{code:java}
// Provided
double mean, sd;
// Precomputed normalisation factor
double sdSqrt2pi = sd * Math.sqrt(2 * Math.PI);

double density(double x) {
    final double z = (x - mean) / sd;
    return Math.exp(-0.5 * z * z) / sdSqrt2pi;
}{code}
Thus the density computation must first normalise the input value x, compute
the exponential and perform a divide. Even if the mean and standard deviation
are 0 and 1, the divide uses a non-trivial value of approximately 2.5066. The
benchmark has therefore been updated to include these extra steps.
Note the extra columns in the results simply remove the time for generation of
the random deviate measured in the baseline method:
{noformat}
baseline = generation of random X deviate
std      = standard precision PDF
hp       = high precision PDF
Adjusted = Score[method] - Score[baseline]
Relative = Adjusted[hp] / Adjusted[std]{noformat}
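For example, the derived columns can be reproduced directly from the raw scores
reported for the normally distributed X deviate in the table below:
{code:java}
// Derive the Adjusted and Relative columns from the raw JMH scores.
double baseline = 8.878;
double std = 34.722;
double hp = 35.897;
double adjustedStd = std - baseline;        // 25.844
double adjustedHp = hp - baseline;          // 27.019
double relative = adjustedHp / adjustedStd; // ~1.045
{code}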
h2. Normally distributed X deviate
||Method||Score||Adjusted||Relative||
| baseline|8.878| | |
| std|34.722|25.844| |
| hp|35.897|27.019|1.045|
On this data the difference is smaller (4.5%) than the previously observed 10%
for the exp function in isolation.
h2. Uniformly distributed X deviate
||Method||Low||High||Score||Adjusted||Relative||
| baseline|0|1|8.94| | |
| std|0|1|36.947|28.007| |
| hp|0|1|36.858|27.918|0.997|
| baseline|0|10|8.902| | |
| std|0|10|37.161|28.259| |
| hp|0|10|40.043|31.141|1.102|
| baseline|0|30|8.841| | |
| std|0|30|36.84|27.999| |
| hp|0|30|39.684|30.843|1.102|
| baseline|0|100|8.981| | |
| std|0|100|32.619|23.638| |
| hp|0|100|28.142|19.161|0.811|
| baseline|2|20|9.184| | |
| std|2|20|36.932|27.748| |
| hp|2|20|39.526|30.342|1.093|
| baseline|0|2.83|8.894| | |
| std|0|2.83|36.879|27.985| |
| hp|0|2.83|41.707|32.813|1.173|
When the full PDF is computed the relative speed difference is minor compared
to the previous benchmark of the exp function in isolation.
* When the computation is entirely standard precision ([0, 1]) there is no
speed difference.
* When it is entirely high precision ([2, 20]) it is about 10% slower.
* When the function must choose between the standard precision computation
(x^2 < 2) and high precision it is again about 10% slower on the [0, 10]
and [0, 30] data.
* In the worst-case scenario of [0, 2.83] the random deviate value x^2 will be
< 2 approximately 50% of the time. This is 17% slower.
* On the [0, 100] data the high precision method is faster; on this data about
60% of the time the computation will not call Math.exp (presumably because
exp(-0.5 * x^2) underflows to zero for large x).
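The branch structure implied by these observations can be sketched as below.
This is a hypothetical illustration, not the actual Commons Statistics
implementation: the x^2 < 2 threshold is taken from the observations above, the
round-off of the squaring is recovered here with Math.fma, and the first-order
correction exp(-0.5 * r) ~ 1 - 0.5 * r follows the idea described in
NUMBERS-177.
{code:java}
// Hypothetical sketch of the standard/high precision branching.
static double density(double x, double mean, double sd) {
    final double sdSqrt2pi = sd * Math.sqrt(2 * Math.PI);
    final double z = (x - mean) / sd;
    final double zz = z * z;
    if (zz > 1490) {
        // exp(-0.5 * zz) underflows to zero once -0.5 * zz is below
        // about -745; no call to Math.exp is required.
        return 0;
    }
    if (zz <= 2) {
        // Standard precision is sufficient.
        return Math.exp(-0.5 * zz) / sdSqrt2pi;
    }
    // Round-off of the squaring: z * z = zz + round (to extended precision).
    final double round = Math.fma(z, z, -zz);
    // exp(-0.5 * (zz + round)) ~ exp(-0.5 * zz) * (1 - 0.5 * round)
    return Math.exp(-0.5 * zz) * (1 - 0.5 * round) / sdSqrt2pi;
}
{code}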
h2. Conclusion
Given that the CDF computation for the normal distribution also uses a high
precision exp function within the error function (erf) to increase accuracy,
it would be consistent to add the high precision PDF. The overall speed impact
is minor: around 5% slower on normally distributed X data and 17% slower on
worst-case uniformly distributed input data.
> High precision PDF for the Normal distribution
> ----------------------------------------------
>
> Key: STATISTICS-52
> URL: https://issues.apache.org/jira/browse/STATISTICS-52
> Project: Apache Commons Statistics
> Issue Type: Improvement
> Components: distribution
> Affects Versions: 1.0
> Reporter: Alex Herbert
> Priority: Minor
>
> The normal distribution PDF is computed using:
>
> {code:java}
> Math.exp(-0.5 * x * x) / Math.sqrt(2 * Math.PI)
> {code}
> The value {{x^2}} can be computed to extended precision. This extra
> information in the round-off bits can increase the accuracy of the
> exponential function (see NUMBERS-177 under the title 'Accurate scaling by
> exp(z*z)').
>
> The effect of including the round-off bits on both accuracy and speed should
> be investigated.
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)