[
https://issues.apache.org/jira/browse/RNG-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371381#comment-17371381
]
Alex Herbert commented on RNG-147:
----------------------------------
I created a LevySampler using a ZigguratNormalizedGaussianSampler to generate
the normalised Gaussian deviate.
Parameter order is (location, scale), or (mu, c). This matches the nomenclature
Levy(mu, c) on Wikipedia, several other sources and in Commons Math and Commons
Statistics for the LevyDistribution.
{code:java}
UniformRandomProvider rng = ...;
double location = 1.23;
double scale = 4.56;
LevySampler sampler = LevySampler.of(rng, location, scale);
{code}
As discussed recently the API is fluent for the SharedStateSampler returning
the LevyDistribution:
{code:java}
LevySampler sampler1 = LevySampler.of(...);
UniformRandomProvider rng = ...;
LevySampler sampler2 = sampler1.withUniformRandomProvider(rng);
{code}
h2. Performance
The sampler was compared using JMH to an inverse transform sampler using the
LevyDistribution from Commons Math to compute the inverse CDF. Times have been
adjusted for the baseline in the normalised column:
||RandomSource||Sampler||Score||Normalised||Relative||
|baseline| | 2.468418| | |
|SPLIT_MIX_64|InverseTransformDiscreteSampler|105.03|102.56| |
|SPLIT_MIX_64|LevySampler|11.26|8.79|0.09|
|MWC_256|InverseTransformDiscreteSampler|104.72|102.25| |
|MWC_256|LevySampler|13.68|11.21|0.11|
|JDK|InverseTransformDiscreteSampler|119.44|116.97| |
|JDK|LevySampler|27.89|25.42|0.22|
* The new sampler is about 5-10x faster.
* The LevySampler slows down as the RNG gets slower. This is because a fair
amount of the sampling time is spent creating random variates for the
underlying fast normalised Gaussian sampler.
* The inverse transform method does not slow down much with the RNG. Most of
the time is spent in computing the inverse CDF of the Levy distribution using
the error function.
h2. Distribution Support
The sample bounds for a zero location and unit scale Levy distribution should
be [0, inf).
The ZigguratNormalizedGaussianSampler will output a value of zero if the
underlying RNG creates a long value of 0. This will hit the upper bound of
infinity: 1/ (0*0) = inf.
The ZigguratNormalizedGaussianSampler will output a value of +/-infinity if the
underlying RNG produces a long value close to the maximum positive value (the
lowest 7 bits must be zero) and then the next double should be zero. This will
hit the lower bound of 0: 1/ (inf*inf) = 0.
A check for this has been added to the LevySamplerTest.
> LevySampler
> -----------
>
> Key: RNG-147
> URL: https://issues.apache.org/jira/browse/RNG-147
> Project: Commons RNG
> Issue Type: New Feature
> Components: sampling
> Affects Versions: 1.3
> Reporter: Alex Herbert
> Priority: Minor
>
> [Sampling from a Levy
> distribution|https://en.wikipedia.org/wiki/L%C3%A9vy_distribution#Random_sample_generation]
> is done using an inverse transform of the cumulative distribution function
> of the standard normal distribution.
> {noformat}
> Levy(Z) = 1
> -------------------
> (inv CDF_norm(u))^2
> {noformat}
> With u a uniform deviate in [0, 1). An alternative is direct generation of a
> uniform normal variate with mean 0 and standard deviation 1: N(0, 1):
> {noformat}
> Levy(Z) = 1
> --------
> N(0,1)^2
> {noformat}
> This should be faster than inverse transform sampling if generation of the
> normal distribution sample is faster than computation of the inverse
> cumulative probability function.
> This sampler can be used in Commons Statistics for the Levy distribution.
> The extremes of the support should be investigated, i.e. what is the maximum
> value for a sample from a standard normal distribution such as the
> ZigguratNormalizedGaussianSampler vs the maximum value of the inverse CDF of
> the normal distribution when the uniform deviate is at the upper limit of 1 -
> 2^-53.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)