[ 
https://issues.apache.org/jira/browse/RNG-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536041#comment-17536041
 ] 

Alex Herbert commented on RNG-176:
----------------------------------

{quote}Is it really useful?
{quote}
The idea was to emphasise that Commons RNG provides the functionality of the 
JDK 17 interfaces, and a lot more. It may not be that useful. It is more useful 
to document what can be achieved so that developers can make up their own minds.

Note that the new methods are not only in JDK 17. The functionality to sample 
in the range [origin, bound) for all primitive types is present in JDK 8:

 
||RNG||next [origin, bound)||stream [origin, bound)||
|Random| |Y|
|SplittableRandom|Y|Y|
|ThreadLocalRandom|Y|Y|
|SecureRandom| |Y|

Some range methods are missing from Random. It only has nextInt(int); there is 
not even a nextLong(long). The weak 48-bit LCG underlying the Random class 
appears to have been deliberately left out of the newer range sampling methods. 
However, the stream methods for a range are present, which is a mix-up. It also 
means that single-value range sampling is not available with SecureRandom, 
which extends Random.
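
As a rough illustration of the table above, the following sketch uses only JDK 8 methods. The class and method names (Jdk8RangeSampling, nextInRange, streamInRange) are made up for this example; the JDK calls themselves (SplittableRandom.nextLong(long, long) and Random.longs(long, long, long)) exist in JDK 8.

```java
import java.util.Random;
import java.util.SplittableRandom;

// Illustrative only: class and method names are hypothetical.
final class Jdk8RangeSampling {
    private Jdk8RangeSampling() {}

    /** Single value in [origin, bound): available on SplittableRandom in JDK 8. */
    static long nextInRange(SplittableRandom rng, long origin, long bound) {
        return rng.nextLong(origin, bound);
    }

    /**
     * Random (and subclasses such as SecureRandom) have no JDK 8 method for a
     * single long in a range, but the stream method accepts the range.
     */
    static long[] streamInRange(Random rng, int count, long origin, long bound) {
        return rng.longs(count, origin, bound).toArray();
    }
}
```

Passing a SecureRandom to streamInRange works because it inherits the stream methods from Random.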

Thus the JDK 17 RandomGenerator interface just describes what SplittableRandom 
can do. As such, perhaps the user guide can simply mention this fact in passing. 
Any RNG from Commons can provide the same single-threaded sampling 
functionality as SplittableRandom.

What is missing from Commons is a way to support fork-join parallelism by 
splitting. This is applicable to the stream methods. 

There was a previous discussion of splitting on the mailing list. This was not 
deemed useful as it is mainly applicable to use in parallel streams. Now that 
the library supports Java 8 streams, this idea can be revisited. This would 
require some way to divide a generator of values:

At the generator level:
{code:java}
public interface SplittableUniformRandomProvider extends UniformRandomProvider {
    SplittableUniformRandomProvider split();
}
{code}
RNGs implementing this interface would be expected to override the stream 
methods in UniformRandomProvider to allow splitting the stream.
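
One way the stream override could look is sketched below. This is not the Commons RNG implementation: SplittableSource is a hypothetical stand-in for the proposed SplittableUniformRandomProvider, and it is backed by the JDK SplittableRandom purely so the example is self-contained. The point is that trySplit() hands half of the remaining values to a generator obtained from split().

```java
import java.util.Objects;
import java.util.Spliterator;
import java.util.SplittableRandom;
import java.util.function.LongConsumer;
import java.util.stream.LongStream;
import java.util.stream.StreamSupport;

final class SplittableStreamSketch {
    private SplittableStreamSketch() {}

    /** Hypothetical stand-in for the proposed SplittableUniformRandomProvider. */
    interface SplittableSource {
        long nextLong();
        SplittableSource split();
    }

    /** Demo source backed by SplittableRandom (an assumption of this sketch). */
    static SplittableSource wrap(SplittableRandom rng) {
        return new SplittableSource() {
            @Override public long nextLong() { return rng.nextLong(); }
            @Override public SplittableSource split() { return wrap(rng.split()); }
        };
    }

    /** Sized spliterator; each split gives half the remaining values to a split generator. */
    static final class SourceSpliterator implements Spliterator.OfLong {
        private final SplittableSource source;
        private long remaining;

        SourceSpliterator(SplittableSource source, long size) {
            this.source = Objects.requireNonNull(source);
            this.remaining = size;
        }

        @Override public Spliterator.OfLong trySplit() {
            final long half = remaining >>> 1;
            if (half == 0) {
                return null; // too small to divide further
            }
            remaining -= half;
            return new SourceSpliterator(source.split(), half);
        }

        @Override public boolean tryAdvance(LongConsumer action) {
            Objects.requireNonNull(action);
            if (remaining <= 0) {
                return false;
            }
            remaining--;
            action.accept(source.nextLong());
            return true;
        }

        @Override public long estimateSize() { return remaining; }

        @Override public int characteristics() {
            return SIZED | SUBSIZED | NONNULL | IMMUTABLE;
        }
    }

    /** Sequential stream of 'size' values; call parallel() on it to exercise splitting. */
    static LongStream longs(SplittableSource source, long size) {
        return StreamSupport.longStream(new SourceSpliterator(source, size), false);
    }
}
```

For example, longs(wrap(new SplittableRandom()), 1000).parallel() would let the fork-join framework divide the work, with each task drawing from its own split generator.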
 
At the sampler level, e.g.:
{noformat}
public interface SplittableLongSampler extends LongSampler {
    SplittableLongSampler split();
}

// Created using some factory methods:

public static SplittableLongSampler of(SplittableUniformRandomProvider rng,
                                       SharedStateLongSampler sampler) {
    // Return a class that provides:
    // LongSampler.sample() using sampler
    // SplittableLongSampler split() using:
    //   rng2 = rng.split(), sampler.withUniformRandomProvider(rng2)
    // LongStream samples() using a custom Spliterator.OfLong that allows
    //   splitting
}

public static SplittableLongSampler of(RandomSource source, 
                                       SharedStateLongSampler sampler) {
    // Use rng2 = RandomSource.create() for 'splitting'
}

// Note:
// 'SharedStateLongSampler sampler' requires an instance.
// It could be changed to:
// Function<UniformRandomProvider, LongSampler> factory
{noformat}
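
A minimal sketch of the factory-function variant mentioned in the note above. The nested interfaces Rng and LongSampler are hypothetical stand-ins for SplittableUniformRandomProvider and the Commons RNG LongSampler, and wrap() uses the JDK SplittableRandom only so the example stands alone. It is meant to show the shape of the idea, not a proposed API.

```java
import java.util.SplittableRandom;
import java.util.function.Function;

final class SplittableSamplerSketch {
    private SplittableSamplerSketch() {}

    /** Hypothetical stand-in for SplittableUniformRandomProvider. */
    interface Rng {
        long nextLong();
        Rng split();
    }

    /** Hypothetical stand-in for LongSampler. */
    interface LongSampler {
        long sample();
    }

    /** Mirrors the SplittableLongSampler outlined above. */
    interface SplittableLongSampler extends LongSampler {
        SplittableLongSampler split();
    }

    /** Builds a splittable sampler from a generator and a sampler factory. */
    static SplittableLongSampler of(Rng rng, Function<Rng, LongSampler> factory) {
        final LongSampler delegate = factory.apply(rng);
        return new SplittableLongSampler() {
            @Override public long sample() {
                return delegate.sample();
            }
            @Override public SplittableLongSampler split() {
                // rng2 = rng.split(), then a new sampler bound to rng2
                return of(rng.split(), factory);
            }
        };
    }

    /** Demo generator backed by SplittableRandom (an assumption of this sketch). */
    static Rng wrap(SplittableRandom r) {
        return new Rng() {
            @Override public long nextLong() { return r.nextLong(); }
            @Override public Rng split() { return wrap(r.split()); }
        };
    }
}
```

With a factory in place of a sampler instance, the caller does not need to construct a sampler up front; each split simply applies the factory to the split generator.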
Splitting is supported in the JDK 17 LXM family of generators by just creating 
a new state from the current generator. If the additive parameter for the LCG 
is different then the split generator is very likely to be independent. So 
splitting has at most a 1 in 2^31, 1 in 2^63, or 1 in 2^127 chance of overlap 
for the 32-bit, 64-bit and 128-bit LCG based generators. However, even in the 
smallest generator the remaining state is 96 bits, so the overlap probability 
is much lower due to the large period of a generator seeded with the same LCG 
additive parameter. So IIUC the default generator L32X64Mix, based on a 32-bit 
LCG, has better splitting characteristics than SplittableRandom: the chance of 
overlap is similar (127 bits of state vs 128 bits of state), but the two child 
generators are more likely to be statistically strong, whereas the 
SplittableRandom split operation can create a statistically weak generator. So 
the LXM family is a good candidate to add splitting functionality to the 
library.
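
For reference, this is roughly how splitting an LXM generator looks with the JDK 17 API. It requires JDK 17 or later. The class and method names (LxmSplitSketch, firstOutputs) are invented for the example; SplittableGenerator.of(String), splits(long) and nextLong() are JDK methods.

```java
import java.util.random.RandomGenerator;
import java.util.random.RandomGenerator.SplittableGenerator;

// Illustrative only: class and method names are hypothetical.
final class LxmSplitSketch {
    private LxmSplitSketch() {}

    /**
     * Creates a splittable generator by algorithm name (e.g. "L32X64MixRandom"),
     * splits off the requested number of children and returns the first
     * output of each child.
     */
    static long[] firstOutputs(String algorithm, int children) {
        SplittableGenerator parent = SplittableGenerator.of(algorithm);
        return parent.splits(children)
                     .mapToLong(RandomGenerator::nextLong)
                     .toArray();
    }
}
```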

 

 

 

> Enhance the UniformRandomProvider interface with extra methods and default 
> implementations
> ------------------------------------------------------------------------------------------
>
>                 Key: RNG-176
>                 URL: https://issues.apache.org/jira/browse/RNG-176
>             Project: Commons RNG
>          Issue Type: New Feature
>    Affects Versions: 1.4
>            Reporter: Alex Herbert
>            Assignee: Alex Herbert
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> JDK 17 introduced the {{RandomGenerator}} interface with the following 
> methods:
> {code:java}
> DoubleStream doubles();
> DoubleStream doubles(double randomNumberOrigin, double randomNumberBound);
> DoubleStream doubles(long streamSize);
> DoubleStream doubles(long streamSize, double randomNumberOrigin,
>                      double randomNumberBound);
> IntStream ints();
> IntStream ints(int randomNumberOrigin, int randomNumberBound);
> IntStream ints(long streamSize);
> IntStream ints(long streamSize, int randomNumberOrigin,
>                int randomNumberBound);
> LongStream longs();
> LongStream longs(long randomNumberOrigin, long randomNumberBound);
> LongStream longs(long streamSize);
> LongStream longs(long streamSize, long randomNumberOrigin,
>                  long randomNumberBound);
> boolean nextBoolean();
> void nextBytes(byte[] bytes);
> float nextFloat();
> float nextFloat(float bound);
> float nextFloat(float origin, float bound);
> double nextDouble();
> double nextDouble(double bound);
> double nextDouble(double origin, double bound);
> int nextInt();
> int nextInt(int bound);
> int nextInt(int origin, int bound);
> long nextLong();
> long nextLong(long bound);
> long nextLong(long origin, long bound);
> double nextGaussian();
> double nextGaussian(double mean, double stddev);
> double nextExponential();
> {code}
> The only method that is *non-default* is {{nextLong}}. This allows a new 
> generator to be simply implemented by providing the source of randomness as 
> 64-bit longs.
> The {{UniformRandomProvider}} interface can be expanded to include these 
> generation methods. Using Java 8 default interface methods will not require 
> any changes to generators currently implementing the interface.
> I propose to:
>  # Add the new methods for streams and numbers in a range.
>  # Add default implementations of the current API. These can be extracted 
> from the o.a.c.rng.core.BaseProvider implementations.
>  # Remove the implementations in o.a.c.rng.core.BaseProvider. This change 
> would be binary compatible.
> The base classes in commons core for 32-bit and 64-bit sources of randomness, 
> IntProvider and LongProvider, can be updated suitably to only override the 
> default interface methods where they can be more efficiently implemented 
> given the source of randomness. This applies to:
> ||Source||Update||Details||
> |int|nextBytes|Use nextInt() for the source of bytes|
> | |nextBoolean|Use a cached int for the randomness|
> | |nextInt|Directly supply the int rather than using 32-bits from nextLong()|
> | |nextDouble|Optimise the bits used from two ints for the 53-bits required 
> for the double.|
> |long|nextInt; nextBoolean|Use a cached long for the randomness|
> h3. Note 1
> The UniformRandomProvider also has the method:
> {code:java}
> void nextBytes(byte[] bytes,
>                int start,
>                int len);
> {code}
> This can also have a default implementation using the output from nextLong().
> h3. Note 2
> The methods to generate an exponential and Gaussian are already implemented 
> in the {{commons-rng-sampling}} module.
> java.util.Random has a nextGaussian() method and so this method appears to be 
> for backward compatibility with legacy Java code. The method is implemented 
> using a modified Ziggurat sampler which uses an exponential sampler for the 
> long tail. The API has thus exposed the exponential sampling method that is 
> used internally in the nextGaussian implementation.
> With no backward compatibility requirements the Commons RNG interface can 
> avoid the distribution sampling methods. Users should select an appropriate 
> sampler from the sampling module.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
