[
https://issues.apache.org/jira/browse/RNG-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830995#comment-16830995
]
Alex D Herbert commented on RNG-99:
-----------------------------------
{quote}Am I right that it could replace part of the code of CM's
EnumeratedDistribution?
{quote}
Yes.
It can be the engine behind the RNG {{DiscreteProbabilityCollectionSampler}}.
So no changes needed for CM4. Just a changed in the RNG library.
I have already built working implementations for this and the guide table
discrete sampler as I used them when testing the Poisson sampler. All versions
passed units tests for their output before including in performance testing.
So far the guide table method is fastest unless the probabilities passed to the
Alias method are zero-padded to at least a power of 2 but sometimes more (e.g.
8 * the next power of 2). Zero padding allows the alias method to often pick
the sample using a single call to nextInt() without requiring a double. This is
because any zero probability is always aliased and the method does not have to
decide which of a pair (the original index or its alias) to choose based on a
fraction (i.e. a double).
The guide table method always needs a double. However it is faster to
pre-compute and does not require padding to a power of 2 so is more space
efficient.
I will submit the performance benchmark with the PR and post results here. Then
we can decide on how to change the {{DiscreteProbabilityCollectionSampler}}.
> AliasMethodDiscreteSampler
> --------------------------
>
> Key: RNG-99
> URL: https://issues.apache.org/jira/browse/RNG-99
> Project: Commons RNG
> Issue Type: New Feature
> Components: sampling
> Affects Versions: 1.3
> Reporter: Alex D Herbert
> Assignee: Alex D Herbert
> Priority: Minor
> Fix For: 1.3
>
>
> From Wikipedia:
> [Alias Method|https://en.wikipedia.org/wiki/Alias_method]
> {noformat}
> In computing, the alias method is a family of efficient algorithms for
> sampling from a discrete probability distribution, due to A. J. Walker. That
> is, it returns integer values 1 ≤ i ≤ n according to some arbitrary
> probability distribution. The algorithms typically use O(n log n) or O(n)
> preprocessing time, after which random values can be drawn from the
> distribution in O(1) time.
> {noformat}
> Create a sampler using the Alias method:
> {code:java}
> package org.apache.commons.rng.sampling.distribution;
> public class AliasMethodDiscreteSampler implements DiscreteSampler {
> public AliasMethodDiscreteSampler(UniformRandomProvider rng,
> double[] probabilities) {
> // Validate probabilities sum to >0
> // Construct alias tables
> }
> @Override
> public int sample() {
> // Sample from alias tables
> }
> }
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)