[
https://issues.apache.org/jira/browse/RNG-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834563#comment-16834563
]
Alex D Herbert commented on RNG-101:
------------------------------------
The results of this sampler for generating the Poisson distribution are shown
in RNG-91.
> MarsagliaTsangWang discrete probability sampler
> -----------------------------------------------
>
> Key: RNG-101
> URL: https://issues.apache.org/jira/browse/RNG-101
> Project: Commons RNG
> Issue Type: New Feature
> Components: sampling
> Affects Versions: 1.3
> Reporter: Alex D Herbert
> Assignee: Alex D Herbert
> Priority: Major
> Fix For: 1.3
>
>
> The following paper provides a method for sampling from a discrete
> probability distribution that requires a single 30-bit unisigned integer per
> sample:
> {noformat}
> George Marsaglia, Wai Wan Tsang, Jingbo Wang (2004)
> Fast Generation of Discrete Random Variables.
> Journal of Statistical Software. Vol. 11, Issue. 3, pp. 1-11.
> {noformat}
> [JSS Vol 11, Issue 3|http://dx.doi.org/10.18637/jss.v011.i03]
> The paper gives a general method that requires input probabilities expressed
> as unsigned 30 bit integers. The probabilities thus have an assumed divisor
> of 2^30^. The probabilities must sum to 2^30^. These are then efficiently
> tabulated for fast look-up using 5 base-64 digits from the 30-bit number.
> Sampling then takes 1 random integer per sample. No samples can be obtained
> for any observation where the probability is < 1^-32^.
> The paper provides software to compute the probability distributions for
> Poisson and Binomial and the look-up tables.
> It is applicable to any distribution by normalising to 2^30^:
> {code:java}
> // Compute the normalisation: 2^30 / sum
> // (assuming sum has been precomputed)
> final double normalisation = (1 << 30) / sum;
> final int[] prob = new int[probabilities.length];
> int sum = 0;
> int max = 0;
> int mode = 0;
> for (int i = 0; i < prob.length; i++) {
> // Add 0.5 for rounding
> final int p = (int) (probabilities[i] * normalisation + 0.5);
> sum += p;
> // Find the mode (maximum probability)
> if (max < p) {
> max = p;
> mode = i;
> }
> prob[i] = p;
> }
> // The sum must be >= 2^30.
> // Here just compensate the difference onto the highest probability.
> prob[mode] += (1 << 30) - sum;
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)