[
https://issues.apache.org/jira/browse/RNG-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17530391#comment-17530391
]
Alex Herbert commented on RNG-176:
----------------------------------
There is one notable difference between the API from JDK 17 and the sampling
module. This method:
{code:java}
double nextDouble(double origin, double bound);
{code}
cannot handle a range where (bound - origin) overflows to infinity.
The Commons RNG sampler will handle this:
{code:java}
double x = Double.MAX_VALUE;
ContinuousUniformSampler.of(rng, -x, x).sample();
{code}
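To make the difference concrete, here is a minimal sketch of why an infinite range is awkward. It assumes a plain {{origin + u * (bound - origin)}} scaling purely for illustration (this is not claimed to be the JDK's arithmetic) and contrasts it with a two-sided interpolation that never computes the range. The class and method names are hypothetical.

```java
public class InfiniteRangeSketch {
    /** Naive scaling: computes (hi - lo) first, which can overflow to infinity. */
    static double scaleByRange(double u, double lo, double hi) {
        return lo + u * (hi - lo);
    }

    /** Interpolation: weights each bound separately and never computes (hi - lo). */
    static double interpolate(double u, double lo, double hi) {
        return u * hi + (1 - u) * lo;
    }

    public static void main(String[] args) {
        double x = Double.MAX_VALUE;
        double u = 0.5; // stand-in for a uniform deviate in [0, 1)
        System.out.println(x - (-x));               // the range itself is not finite
        System.out.println(scaleByRange(u, -x, x)); // inherits the overflow
        System.out.println(interpolate(u, -x, x));  // stays finite
    }
}
```

The interpolation form costs one extra multiplication but keeps every intermediate value within the magnitude of the bounds.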
If the JDK 17 methods are added then this should be noted in the user guide.
Given the usefulness of a stream of values, it may be sensible to add stream
methods to the sampler interfaces, e.g.
{code:java}
default DoubleStream samples() {
    return DoubleStream.generate(this::sample).sequential();
}
{code}
Thus allowing:
{code:java}
UniformRandomProvider rng = ...;
double[] data = ContinuousUniformSampler.of(rng, lo, hi)
    .samples().limit(50).toArray();
{code}
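A self-contained sketch of the same idea, for anyone who wants to try it without the library. The nested {{ContinuousSampler}} interface and the {{uniform}} factory are hypothetical stand-ins backed by {{java.util.Random}}; they are not the Commons RNG classes.

```java
import java.util.Random;
import java.util.stream.DoubleStream;

public class SamplerStreamSketch {
    /** Hypothetical stand-in for a sampler interface with the proposed default method. */
    @FunctionalInterface
    interface ContinuousSampler {
        double sample();

        /** Infinite sequential stream of samples. */
        default DoubleStream samples() {
            return DoubleStream.generate(this::sample).sequential();
        }
    }

    /** Hypothetical uniform sampler between lo and hi, backed by java.util.Random. */
    static ContinuousSampler uniform(Random rng, double lo, double hi) {
        return () -> lo + rng.nextDouble() * (hi - lo);
    }

    public static void main(String[] args) {
        // The stream is infinite, so it must be truncated before collecting.
        double[] data = uniform(new Random(42), 1.0, 2.0).samples().limit(50).toArray();
        System.out.println(data.length);
    }
}
```

Because the method is a default on the interface, any existing sampler written as a class or lambda would gain it without changes.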
The different sampler interfaces already have a name clash on 'sample', so a
further clash on the 'samples' name is not a major issue. For example, this
will not compile:
{code:java}
class FixedSampler implements DiscreteSampler, ContinuousSampler {
    public int sample() {
        return 42;
    }
    public double sample() {
        return 42.0;
    }
}
{code}
If stream methods are added to the sampling module, then the only JDK methods
that cannot be reproduced using the classes in the sampling module are:
{code:java}
float nextFloat(float bound);
float nextFloat(float origin, float bound);
DoubleStream doubles();
IntStream ints();
LongStream longs();
{code}
The latter three (the stream methods) are simple one-liners to implement, e.g.:
{code:java}
default DoubleStream doubles() {
    return DoubleStream.generate(this::nextDouble).sequential();
}
{code}
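As a rough sketch of how all three stream methods could sit on an interface as defaults, the following uses a hypothetical cut-down {{MiniProvider}} whose only abstract method is {{nextLong()}}. It is not the UniformRandomProvider interface, and the bit manipulations are common conventions assumed here for illustration.

```java
import java.util.Random;
import java.util.stream.DoubleStream;
import java.util.stream.IntStream;
import java.util.stream.LongStream;

public class ProviderStreamSketch {
    /** Hypothetical cut-down provider; only nextLong() is abstract. */
    @FunctionalInterface
    interface MiniProvider {
        long nextLong();

        default int nextInt() {
            return (int) (nextLong() >>> 32); // upper 32 bits
        }

        default double nextDouble() {
            return (nextLong() >>> 11) * 0x1.0p-53; // upper 53 bits scaled into [0, 1)
        }

        default DoubleStream doubles() {
            return DoubleStream.generate(this::nextDouble).sequential();
        }

        default IntStream ints() {
            return IntStream.generate(this::nextInt).sequential();
        }

        default LongStream longs() {
            return LongStream.generate(this::nextLong).sequential();
        }
    }

    public static void main(String[] args) {
        Random source = new Random(7);
        MiniProvider rng = source::nextLong; // any source of 64-bit values will do
        System.out.println(rng.doubles().limit(3).count());
    }
}
```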
A further point to note is that by default the streams will not allow
parallelisation. However, adding them to the UniformRandomProvider interface
would allow the streams to be made parallel if the RNG also implements a
SplittableUniformRandomProvider interface, i.e. the RNG can arbitrarily split
into two independent instances. A splittable interface is one of several
introduced in JDK 17, and splitting is a feature provided by the LXM
generators in JDK 17 (see RNG-168).
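For a feel of what splitting enables, the JDK's own {{java.util.SplittableRandom}} (available since Java 8) already produces streams that can be processed in parallel. The sketch below is only an analogy for a SplittableUniformRandomProvider, not its API; the class and method names are hypothetical.

```java
import java.util.SplittableRandom;
import java.util.stream.DoubleStream;

public class ParallelStreamSketch {
    /** Mean of n uniform deviates in [0, 1), computed on a parallel stream. */
    static double parallelMean(long seed, int n) {
        return new SplittableRandom(seed)
            .doubles(n)      // finite stream of n values
            .parallel()      // allow fork-join processing
            .average()
            .orElse(Double.NaN);
    }

    public static void main(String[] args) {
        DoubleStream s = new SplittableRandom(1L).doubles(10).parallel();
        System.out.println(s.isParallel());
        System.out.println(parallelMean(1L, 10000));
    }
}
```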
h2. Options
These are not mutually exclusive:
# Add all methods from the JDK RandomGenerator interface
# Add only the methods to produce a stream (as a convenience)
# Add stream methods to the Sampler interfaces
# Add a FloatSampler interface and add the sampling of a range of floats in
the sampling module
I do not see much use for option 2 on its own: it is usually more practical to
create a stream of numbers in a range. So I would prefer option 1, which adds
all the stream methods, over option 2.
I like the idea of a simple addition to the Sampler interfaces of a stream
(option 3).
Adding a FloatSampler makes less sense in the modern JDK world where floats are
a subset of doubles and thus there is no FloatStream. So the FloatSampler
interface would only exist as an alternative to adding the range methods found
in the JDK.
I think it makes little sense to add the float range methods and then defer
long, int and double range methods to the sampling module. Either no
number-in-a-range methods are added or we accept a level of duplication
(although in the case of floating point generation the ContinuousUniformSampler
is functionally different).
h2. Final Thoughts
Here is what I propose:
* Add all methods from JDK 17 RandomGenerator
* Add default stream methods to the sampler interfaces
* Update the user guide to highlight similarities and differences between using
a sampler and the UniformRandomProvider interface
* Look to add a SplittableUniformRandomProvider interface to allow fork-join
processing via a parallel stream. (The extra interfaces in JDK 17 will be the
subject of another ticket.)
Any feedback?
> Enhance the UniformRandomProvider interface with extra methods and default
> implementations
> ------------------------------------------------------------------------------------------
>
> Key: RNG-176
> URL: https://issues.apache.org/jira/browse/RNG-176
> Project: Commons RNG
> Issue Type: New Feature
> Affects Versions: 1.4
> Reporter: Alex Herbert
> Assignee: Alex Herbert
> Priority: Major
>
> JDK 17 introduced the {{RandomGenerator}} interface with the following
> methods:
> {code:java}
> DoubleStream doubles();
> DoubleStream doubles(double randomNumberOrigin, double randomNumberBound);
> DoubleStream doubles(long streamSize);
> DoubleStream doubles(long streamSize, double randomNumberOrigin,
> double randomNumberBound);
> IntStream ints();
> IntStream ints(int randomNumberOrigin, int randomNumberBound);
> IntStream ints(long streamSize);
> IntStream ints(long streamSize, int randomNumberOrigin,
> int randomNumberBound);
> LongStream longs();
> LongStream longs(long randomNumberOrigin, long randomNumberBound);
> LongStream longs(long streamSize);
> LongStream longs(long streamSize, long randomNumberOrigin,
> long randomNumberBound);
> boolean nextBoolean();
> void nextBytes(byte[] bytes);
> float nextFloat();
> float nextFloat(float bound);
> float nextFloat(float origin, float bound);
> double nextDouble();
> double nextDouble(double bound);
> double nextDouble(double origin, double bound);
> int nextInt();
> int nextInt(int bound);
> int nextInt(int origin, int bound);
> long nextLong();
> long nextLong(long bound);
> long nextLong(long origin, long bound);
> double nextGaussian();
> double nextGaussian(double mean, double stddev);
> double nextExponential();
> {code}
> The only method that is *non-default* is {{{}nextLong{}}}. This allows a new
> generator to be simply implemented by providing the source of randomness as
> 64-bit longs.
> The {{UniformRandomProvider}} interface can be expanded to include these
> generation methods. Using Java 8 default interface methods will not require
> any changes to generators currently implementing the interface.
> I propose to:
> # Add the new methods for streams and numbers in a range.
> # Add default implementations of the current API. These can be extracted
> from the o.a.c.rng.core.BaseProvider implementations.
> # Remove the implementations in o.a.c.rng.core.BaseProvider. This change
> would be binary compatible.
> The base classes in commons core for 32-bit and 64-bit sources of randomness,
> IntProvider and LongProvider, can be updated suitably to only override the
> default interface methods where they can be more efficiently implemented
> given the source of randomness. This applies to:
> ||Source||Update||Details||
> |int|nextBytes|Use nextInt() for the source of bytes|
> | |nextBoolean|Use a cached int for the randomness|
> | |nextInt|Directly supply the int rather than using 32-bits from nextLong()|
> | |nextDouble|Optimise the bits used from two ints for the 53-bits required for the double.|
> |long|nextInt; nextBoolean|Use a cached long for the randomness|
> h3. Note 1
> The UniformRandomProvider also has the method:
> {code:java}
> void nextBytes(byte[] bytes,
> int start,
> int len);
> {code}
> This can also have a default implementation using the output from nextLong().
> h3. Note 2
> The methods to generate an exponential and Gaussian are already implemented
> in the {{commons-rng-sampling}} module.
> java.util.Random has a nextGaussian() method and so this method appears to be
> for backward compatibility with legacy Java code. The method is implemented
> using a modified Ziggurat sampler which uses an exponential sampler for the
> long tail. The API has thus exposed the exponential sampling method that is
> used internally in the nextGaussian implementation.
> With no backward compatibility requirements the Commons RNG interface can
> avoid the distribution sampling methods. Users should select an appropriate
> sampler from the sampling module.
>