[ 
https://issues.apache.org/jira/browse/RNG-137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17374011#comment-17374011
 ] 

Alex Herbert commented on RNG-137:
----------------------------------

I have re-implemented the StableSampler due to issues in the commons math 
version. The version in CM did not have extensive unit tests and when I 
investigated further I found the sampler was incorrect for several of the 
parameter combinations (alpha=2, alpha != 1). I could not find the reference 
provided in the source code for the StableRandomGenerator for the origin of the 
formulas. The alpha != 1 formula does not match others I found.
h2. References

Chambers, Mallows & Stuck (1976) "A Method for Simulating Stable Random 
Variables". Journal of the American Statistical Association. 71 (354): 340–344.

Weron (1996) "On the Chambers-Mallows-Stuck method for simulating skewed stable 
random variables". Statistics & Probability Letters. 28 (2): 165–171.

Nolan (2020) "Univariate Stable Distributions: Models for Heavy Tailed Data". 
Springer Series in Operations Research and Financial Engineering. Springer. 
Sections 1.7 and 3.3.3.

Hart, JF et al (1968) Computer Approximations, New York: John Wiley & Sons, Inc.
h2. Introduction

The stable distribution is a family of distributions having an alpha (in (0, 
2]) and beta (in [-1, 1]) parameter that define the shape (alpha=stability, 
beta=skew) and then a scale and location parameter. Special cases include the 
Gaussian (alpha=2, beta ignored), Cauchy (alpha=1, beta=0), and Levy 
(alpha=0.5, beta=1). The support is in the interval (-inf, inf) unless 
beta=+/-1 and alpha < 1. Here the support is to beta * tan(2*alpha/pi), and 
this finite limit is upper or lower depending on the sign of beta.

The stable distribution is defined by the characteristic function which 
includes a complex number component. There are several parameterisations for 
the stable distribution which have different characteristic functions. These 
have been formally described in several places but are summarised in Nolan. In 
Nolan the scale is named gamma, and the location is named delta.

The 1-parameterisation is that computed by the CM implementation. This 
parameterisation is discontinuous at alpha=1. The 0-parameterisation is 
continuous at alpha=1. A 0-parameterised stable deviate can be shifted to a 
1-parameterised deviate by an offset. The magnitude of the offset approaches 
infinity as alpha -> 1 thus the discontinuity. For numerical work the 
0-parameterisation is recommended. A standard deviate x in the 
0-parameterisation with scale=1, location=0 can be shifted to any deviate by a 
linear transform gamma * x + delta.

A method for computing a stable random deviate was published by Chambers et al 
and is known as the Chambers-Mallows-Stuck (CMS) method. It requires generation 
of a uniform deviate in (-pi/2, pi/2) and an exponential deviate with mean 1. 
Two formulas were provided for alpha=1 and alpha != 1 and later a formal proof 
of this was provided by Weron. These formulas output a deviate in the 
1-parameterisation. A shift (zeta) can be applied to obtain the 
0-parameterisation. Close to alpha=1 this shift approaches 1e16 using the JDK 
math library to compute tan:
{noformat}
Zeta
            Beta
Alpha       1.0         0.5         0.25        0.1         0.0
0.001       0.001571    0.0007854   0.0003927   0.0001571   0.0
0.01        0.01571     0.007855    0.003927    0.001571    0.0
0.05        0.07870     0.03935     0.01968     0.007870    0.0
0.1         0.1584      0.07919     0.03960     0.01584     0.0
0.5         1.000       0.5000      0.2500      0.1000      0.0
0.9         6.314       3.157       1.578       0.6314      0.0
0.95        12.71       6.353       3.177       1.271       0.0
0.99        63.66       31.83       15.91       6.366       0.0
0.995       127.3       63.66       31.83       12.73       0.0
0.999       636.6       318.3       159.2       63.66       0.0
0.9995      1273        636.6       318.3       127.3       0.0
0.9999      6366        3183        1592        636.6       0.0
1.0         1.633E+16   8.166E+15   4.083E+15   1.633E+15   0.0
{noformat}
Since most of the deviates when alpha=1 are in the range [-5, 5] this large 
offset to create the deviate reduces the number of significant bits in the 
sample due to cancellation.

However the Chambers paper provided a rearrangement of the formulas using a 
re-parameterisation and half angle trigonomic identities to create a function 
that is continuous as alpha -> 1. This outputs a 0-parameterisation deviate. 
The paper provides FORTRAN code to compute the sample (the function RSTAB). I 
refer to this as the CMS formula and the alternative as the Weron formula, 
although both are implementations of the CMS method.

The CMS algorithm RSTAB uses a function tan( x) / x and calls it using a value 
of x in the range (-pi/4, pi/4). This can be approximated using an expansion 
summation. The approximation is from Hart. The function is named tan2 in the 
Chambers paper and the RSTAB program. This should have the properties:
 * For x=0 this returns 1.
 * For x=pi/4 this returns 4/pi.
 * For x=pi/4 this multiplied by x returns 1.

The original RSTAB program used single precision (32-bit) floats. The 
approximation when written using 64-bit doubles did not satisfy the 
requirements. When the requirements fail then extreme values of x create values 
that result in cancellation in the formula and invalid output (i.e. NaNs). I 
located the reference by Hart and used the highest precision approximation for 
tan( x) that was available. This did satisfy the requirements.
h2. CMS algorithm

I implemented the CMS algorithm using the JDK Math.tan function and a tan2 
approximation from Hart. This algorithm avoids many trigonomic calls to sin and 
cos by using half angle identities that reduce to just 2 calls to tan. I also 
implemented the variant Weron formulas.

One issue is that the CMS algorithm computes a term z that should be in the 
range [0, inf]. This is used with log(z) to compute further terms. The 
trigonomic identities can result in z below 0. This creates a NaN. Later 
computations can also multiply infinity by 0 or compute inf - inf, which also 
create NaN. Thus the algorithm should be checked. After some effort to add 
checks to the CMS formula I found it was easier to use the Weron formula for 
edge cases. This uses multiplication of terms and can avoid a potential 0 / 0 
by restriction of the uniform deviate to (0, 1), i.e. not [0, 1). Thus 
infinities can be handled by boxing the infinity to the maximum double value, 
computing the expression and then unboxing the result by multiplying by the 
maximum double value. This will recreate the infinity from a large value, and 
handle multiplication of infinite values by 0.

The CMS and Weron implementations create the same output sample (within 
relative error) when used with the same uniform deviate and exponential 
deviate, and the deviates are not extreme. This demonstrates they are 
rearrangements.

These edge cases of NaN output from the CMS formula are extremely rare unless 
the alpha term approaches 0. The following shows a table of the number of times 
the term d (computed from z) is infinite:
{noformat}
            Beta
Alpha       1.0         0.5         0.25        0.1         0.0
0.05        0.0         0.0         0.0         0.0         0.0
0.02        6.752E-7    6.836E-7    6.761E-7    6.612E-7    7.404E-7
0.01        0.0008493   0.0008533   0.0008541   0.0008546   0.0008539
0.005       0.02889     0.02896     0.02897     0.02897     0.02897
0.001       0.3901      0.3903      0.3903      0.3903      0.3903
{noformat}
When beta=+/-1 the term must also be restricted to the expected domain of the 
distribution. The following is the number of times the term was at or outside 
the domain when using Math.tan:
{noformat}
            Beta
Alpha       1.0         0.5         0.25        0.1         0.0
0.05        0.0003456   0.0         0.0         0.0         0.0
0.02        0.009033    0.0         0.0         0.0         0.0
0.01        0.004025    0.0         0.0         0.0         0.0
0.005       0.1230      0.0         0.0         0.0         0.0
0.001       0.2137      0.0         0.0         0.0         0.0
{noformat}
Or when using the tan2 approximation:
{noformat}
            Beta
Alpha       1.0         0.5         0.25        0.1         0.0
0.05        0.001116    0.0         0.0         0.0         0.0
0.02        0.009026    0.0         0.0         0.0         0.0
0.01        0.1650      0.0         0.0         0.0         0.0
0.005       0.2520      0.0         0.0         0.0         0.0
0.001       0.3091      0.0         0.0         0.0         0.0
{noformat}
This shows the use of the tan2 function has more numerical cancellation issues 
than using Math.tan. This only applies when beta=+/-1 and alpha is small.
h2. Sampler output

For reference the CDF can be computed using a million samples. The following 
shows plot for different alpha and beta parameters.

As alpha -> 0 the distribution is extremely long tailed.

!alpha1.3.jpg!

!alpha0.5.jpg!

!beta0.3.jpg!
h2. Performance

All 3 implementations use a uniform deviate in (0, 1), i.e. 0 must be avoided, 
and an exponential deviate. A baseline sampler for this computes both deviates 
and multiplies them together. The time for this baseline has been subtracted 
from the timings for the 3 implementations. Thus the timings reflect the calls 
to Math trigonomic, power, log and exponential functions used by the samplers. 

Note that when alpha=0.5 the Weron sampler has a speed increase. The formula 
uses two power functions with the exponent as 1/alpha and (1-alpha)/alpha. 
These reduce to 2 and 1 respectively when alpha=0.5 thus the power functions 
are much faster for this case.

General Case (beta = 0.1, 0.4, 0.7,  x-axis = alpha)

!GeneralPerformance.jpg!

Alpha = 1 (x-axis = beta)

!Alpha1Performance.jpg!

Beta = 0 (x-axis = alpha)

!Beta0Performance.jpg!
h2. A robust implementation

The CMS algorithm often will compute NaNs for small alpha and also can do this 
for extreme random deviates when the beta term is 1 (these cases are probably 
around 1 in 10^15). Use of the Weron formula is simpler to correct. Note that 
when alpha is small the term zeta to convert the Weron formula to the 
0-parameterisation is very small and will not effect the number of bits of 
significance in the sample. Thus a hybrid sampler is a compromise. This can use 
the CMS algorithm to create a sample. This will output samples in the 
0-parameterisation which is continuous for alpha and beta. It is checked 
against the bounds of the distribution support and if not within the expected 
support the Weron formula is used to correct NaN or check infinite results. 
When alpha is small then the Weron formula can be used directly as the zeta 
shift has a small magnitude and will not effect output precision. A value of 
alpha < 0.02 is a level where approximately 1 in million samples will have an 
infinite term in the computation.
h2. Approximate tan( x) / x

The sampler using tan2 (tan( x) / x) is faster by about 20-30%. I tested the 
function with x in the range [0, pi/4] and the mean ULP is <0.02 from using 
Math.tan. This is a high level of precision. Note the same test using the 
original tan2 function from Chambers et al has a mean ULP of 148293.94. The 
approximations are 4283 and 4288 from Hart. This states the maximum relative 
error (verses tan) is 1e-10.66 and 1e-26.68 respectively.

Since this is sampling and not a high accuracy computation of the PDF at a 
given point then use of the approximation seems justified by the faster 
sampling speed. Any output will be checked against the support and the 
correction of invalid samples using the Weron formula will use the JDK Math 
trigonomic functions.

 

Note:

I created these implementation using the AhrensDieterExponentialSampler. Many 
of the unit tests manipulate the RNG to create extreme output from the 
exponential deviate. These should be updated to use the 
ZigguratExponentialSampler. I will do this then open a PR for review with the 
implementations.

 

> Move class from "Commons Math"
> ------------------------------
>
>                 Key: RNG-137
>                 URL: https://issues.apache.org/jira/browse/RNG-137
>             Project: Commons RNG
>          Issue Type: Task
>          Components: sampling
>            Reporter: Gilles Sadowski
>            Assignee: Alex Herbert
>            Priority: Trivial
>              Labels: commons-math, port
>             Fix For: 1.4
>
>         Attachments: Alpha1Performance.jpg, Beta0Performance.jpg, 
> GeneralPerformance.jpg, alpha0.5.jpg, alpha1.3.jpg, beta0.3.jpg
>
>
> Shouldn't CM's class 
> [{{StableRandomGenerator}}|https://gitbox.apache.org/repos/asf?p=commons-math.git;a=blob;f=src/main/java/org/apache/commons/math4/random/StableRandomGenerator.java;h=f3a851b9b16c47546cac8371e272dbf59a909c09;hb=HEAD]
>  be moved to the \{{commons-rng-sampling}} module?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to