[ 
https://issues.apache.org/jira/browse/RNG-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830995#comment-16830995
 ] 

Alex D Herbert commented on RNG-99:
-----------------------------------

{quote}Am I right that it could replace part of the code of CM's 
EnumeratedDistribution?
{quote}
Yes.

It can be the engine behind the RNG {{DiscreteProbabilityCollectionSampler}}. 
So no changes needed for CM4. Just a changed in the RNG library.

I have already built working implementations for this and the guide table 
discrete sampler as I used them when testing the Poisson sampler. All versions 
passed units tests for their output before including in performance testing.

So far the guide table method is fastest unless the probabilities passed to the 
Alias method are zero-padded to at least a power of 2 but sometimes more (e.g. 
8 * the next power of 2). Zero padding allows the alias method to often pick 
the sample using a single call to nextInt() without requiring a double. This is 
because any zero probability is always aliased and the method does not have to 
decide which of a pair (the original index or its alias) to choose based on a 
fraction (i.e. a double).

The guide table method always needs a double. However it is faster to 
pre-compute and does not require padding to a power of 2 so is more space 
efficient.

I will submit the performance benchmark with the PR and post results here. Then 
we can decide on how to change the {{DiscreteProbabilityCollectionSampler}}.



> AliasMethodDiscreteSampler
> --------------------------
>
>                 Key: RNG-99
>                 URL: https://issues.apache.org/jira/browse/RNG-99
>             Project: Commons RNG
>          Issue Type: New Feature
>          Components: sampling
>    Affects Versions: 1.3
>            Reporter: Alex D Herbert
>            Assignee: Alex D Herbert
>            Priority: Minor
>             Fix For: 1.3
>
>
> From Wikipedia:
> [Alias Method|https://en.wikipedia.org/wiki/Alias_method]
> {noformat}
> In computing, the alias method is a family of efficient algorithms for 
> sampling from a discrete probability distribution, due to A. J. Walker. That 
> is, it returns integer values 1 ≤ i ≤ n according to some arbitrary 
> probability distribution. The algorithms typically use O(n log n) or O(n) 
> preprocessing time, after which random values can be drawn from the 
> distribution in O(1) time.
> {noformat}
> Create a sampler using the Alias method:
> {code:java}
> package org.apache.commons.rng.sampling.distribution;
> public class AliasMethodDiscreteSampler implements DiscreteSampler {
>     public AliasMethodDiscreteSampler(UniformRandomProvider rng, 
>                                       double[] probabilities) {
>         // Validate probabilities sum to >0
>         // Construct alias tables
>     }
>     @Override
>     public int sample() {
>         // Sample from alias tables
>     }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to