[
https://issues.apache.org/jira/browse/NUMBERS-174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443765#comment-17443765
]
Alex Herbert commented on NUMBERS-174:
--------------------------------------
h2. Using Dfp to increase precision
The Dfp class provided arbitrary precision floating-point arithmetic (DFP =
Decimal Floating Point library for Java). I have testing using this class with
the version in Commons Math 3 to perform the computation of the regularised
gamma prefix:
{noformat}
x^a e^-x / gamma(a){noformat}
I used 64 digits of precision which is enough for the current test data.
Results:
||Test||Standard|| ||Dfp|| ||Difference|| ||
|| ||max||RMS||max||RMS||max||RMS||
|GAMMA_P_DERIV_INT|3.16382|0.974935|3.16382|0.974935|0|0|
|GAMMA_P_DERIV_SMALL|3.12638|0.901515|3.12638|0.901515|0|0|
|GAMMA_P_DERIV_MED|4.39499|1.00498|4.39499|1.00498|0|0|
|GAMMA_P_DERIV_BIG|538.126|52.8391|2.43417|0.796512|-535.69183|-52.042588|
|IGAMMA_P_INT|3.33453|0.837093|3.33453|0.837093|0|0|
|IGAMMA_P_SMALL|2.51022|0.707713|2.51022|0.707713|0|0|
|IGAMMA_P_MED|4.31041|0.743446|4.31041|0.743446|0|0|
|IGAMMA_P_BIG|412.953|31.3717|3.92196|0.786162|-409.03104|-30.585538|
|IGAMMA_P_EXTRA|0.502906|0.12985|0.502906|0.12985|0|0|
|IGAMMA_Q_INT|4.83677|1.13616|4.83677|1.13616|0|0|
|IGAMMA_Q_SMALL|3.74672|0.961876|3.74672|0.961876|0|0|
|IGAMMA_Q_MED|5.35148|0.82354|5.35148|0.82354|0|0|
|IGAMMA_Q_BIG|537.101|42.6655|5.7472|0.917859|-531.3538|-41.747641|
|IGAMMA_Q_EXTRA|54.6793|15.1705|54.6793|15.1705|0|0|
|IGAMMA_LARGE_X_ASYMP_TERM|276.954|103.997|276.954|103.997|0|0|
|IGAMMA_LARGE_X_Q|537.101|120.441|1.0676|0.420528|-536.0334|-120.020472|
|IGAMMA_LARGE_X_P|0.772756|0.229714|0.772756|0.229714|0|0|
|IGAMMA_LARGE_X_Q_ASYM|537.101|170.353|344.395|119.68|-192.706|-50.673|
|IGAMMA_LARGE_X_P_ASYM|341.213|98.6633|337.213|97.9287|-4|-0.7346|
|IGAMMA_LOWER_INT|5.41081|1.07932|5.41081|1.07932|0|0|
|IGAMMA_LOWER_SMALL|1.38617|0.472474|1.38617|0.472474|0|0|
|IGAMMA_LOWER_MED|5.99945|1.15425|5.99945|1.15425|0|0|
|IGAMMA_LOWER_BIG|7.29508|1.1326|7.29508|1.1326|0|0|
|IGAMMA_LOWER_EXTRA|4.45394|1.19886|4.45394|1.19886|0|0|
|IGAMMA_LARGE_X_ASYMP_TERM|276.954|103.997|276.954|103.997|0|0|
|IGAMMA_LARGE_X_Q|537.101|120.441|1.0676|0.420528|-536.0334|-120.020472|
|IGAMMA_LARGE_X_P|0.772756|0.229714|0.772756|0.229714|0|0|
|IGAMMA_LARGE_X_Q_ASYM|537.101|170.353|344.395|119.68|-192.706|-50.673|
|IGAMMA_LARGE_X_P_ASYM|341.213|98.6633|337.213|97.9287|-4|-0.7346|
|IGAMMA_UPPER_INT|5.1089|1.36073|5.1089|1.36073|0|0|
|IGAMMA_UPPER_SMALL|2.2915|0.78016|2.2915|0.78016|0|0|
|IGAMMA_UPPER_MED|8.2182|1.68102|8.2182|1.68102|0|0|
|IGAMMA_UPPER_BIG|7.28961|1.1512|7.28961|1.1512|0|0|
|IGAMMA_UPPER_EXTRA|11.1293|3.59124|11.1293|3.59124|0|0|
Note that this change only affects the regularized incomplete gamma functions P
and Q. In the case of the standard functions the result is typically 0 or
infinity and so extended precision makes no difference.
In cases where the error is due to cancellation in the power terms using the
Dfp class eliminates this error. However there are some cases where the error
is in the other part of the computation, for example the series summation for
large x which can be computed via the asymptotic approximation (all cases using
LARGE_X in the name). These errors could only be eliminated by using extended
precision for the remaining parts of the computation.
Using Dfp throughout the computation would have a big performance impact and
require significant code changes. The cases where Dfp can improve the
computation of the power terms are difficult to predict from the input
arguments. For simplicity the method should use double precision and accept the
loss of precision for large arguments.
> Update Gamma functions using the Boost implementation
> -----------------------------------------------------
>
> Key: NUMBERS-174
> URL: https://issues.apache.org/jira/browse/NUMBERS-174
> Project: Commons Numbers
> Issue Type: Improvement
> Components: gamma
> Affects Versions: 1.0
> Reporter: Alex Herbert
> Assignee: Alex Herbert
> Priority: Major
> Fix For: 1.1
>
>
> The current regularised incomplete gamma functions for P and Q compute using
> a series representation for all input. This method is not robust to extreme
> arguments.
> The Gamma functions are used in Commons Statistics in the Gamma and Poisson
> distributions.
> Large values of the mean in the Poisson distribution have low
> precision for the CDF. The distribution also has a configurable
> threshold for convergence of the gamma function evaluation
> (STATISTICS-38). Ideally this implementation detail should be
> removed from the API.
> The Gamma distribution has a function switch based on a threshold to
> compute the PDF. This threshold results in incorrect values for
> certain parameters (STATISTICS-39).
> The function switch in the Gamma distribution is based on the
> documentation for the Boost gamma functions [1]. However it does not
> implement all the suggested optimisations detailed in the most recent
> Boost documentation. This includes avoiding using the converging
> series of the gamma function when the convergence is slow or unstable,
> i.e. addresses the need for a configurable convergence threshold in
> the Poisson distribution.
> One part of the evaluation of the incomplete gamma functions require
> accuracy in the leading term:
> {noformat}
> x^a exp(-x) / gamma(a) = exp(a log(x) - x - loggamma(a){noformat}
> When x and a are large then using logs to compute this leads to
> cancellation. Use of logs to compute this term is done in the current
> implementation in numbers for RegularizedGamma P and Q. It is the
> source of low precision for the CDF for the Poisson distribution when
> the mean is large.
> I propose to port the Boost gamma functions to numbers. This will be a
> process similar to the recent port of the error functions to the same
> package in numbers. These functions improved both accuracy and
> speed over the existing implementation. Once the gamma functions are
> ported a comparison can be made between the existing and new
> implementations.
> [1]
> [https://www.boost.org/doc/libs/1_77_0/libs/math/doc/html/math_toolkit/sf_gamma/igamma.html]
--
This message was sent by Atlassian Jira
(v8.20.1#820001)