Back on mailing list.

On 24/09/2019 15:18, Gilles Sadowski wrote:
Hi.

2019-09-24 12:56 UTC+02:00, Alex Herbert <alex.d.herb...@gmail.com>:
On 24/09/2019 01:04, Gilles Sadowski wrote:
Hi.

Le lun. 23 sept. 2019 à 22:38, Alex Herbert <alex.d.herb...@gmail.com> a
écrit :
[...]
isLongJumpable()
isJumpable()
I’ve put these methods into a PR for RNG-116 [1].

I’ve also added:

public int getSeedByteSize()

However I note that the RandomSource has factory methods to create int[]
and long[]
Those are used internally to create "native" type seeds
They are public in RandomSource:

public static int createInt();
public static long createLong();
public static int[] createIntArray(int);
public static long[] createLongArray(int);

So these are API method to be used to create seeds. Since the
create(RandomSource) methods support 5 seed types it seems that the
ability to create a byte[] is missing. It also makes the API more
complete to go along with the instance method:

public int getSeedByteSize();

However when I put the changes into the current PR it seems to be a part
solution open to pitfalls. It offers the chance for a user to get the
native byte size, create some bytes then construct a generator. However
some generators have extra seeding requirements. So far I handled this
by putting a warning in the javadoc and a suggestion to just use a null
seed and let the library build the correct seed.

It may make more sense to drop the getSeedByteSize() method idea. The
only use cases I have for it are: (1) to populate the entire seed byte
array using another RNG such as SecureRandom; (2) to be able to
reproduce building the same generator with a good seed.

It could be replaced with an instance method:

Object createSeed();

This would create a seed using the internal package methods. It would
ensure the seed is correct for the RandomSource. It would return the
same seed that is created if null is passed as the seed. So the
following would make 2 identical RNGs (and satisfy requirement (2)):

RandomSource source = ...;
Object seed = source.createSeed();
UniformRandomProvider rng1 = RandomSource.create(source, seed);
UniformRandomProvider rng2 = RandomSource.create(source, seed);

The use case for getSeedByteSize() to build your own seed using your own
random source (use case (1)) may be better served by either: (a) an
overloaded:

Object createSeed(UniformRandomProvider);

Or (b): changing RandomSource.create(...) to handle
UniformRandomProvider as a seed type:

RandomSource source = ...;
UniformRandomProvider seedRng = RandomSource.create(source);
UniformRandomProvider rng1 = RandomSource.create(source, seedRng);
UniformRandomProvider rng2 = RandomSource.create(source, seedRng);
(Here rng1 and rng2 are different.)

Or both (a) and (b) (since the framework to handle either would be the
same in the internal package).

but not byte[].
This one would not be used internally.
Agreed. It just completes the API but I think is open to pitfalls when
combined with getSeedByteSize().
Given the seed size is reported in bytes it would make sense to have:

public byte[] createByteArray(int)

In the seed factory methods.
Do you mean to replace/deprecate all the constructors (in
"commons-rng-core")?
No. The core implementations should accept their native type for a seed.

Here is a more rounded approach:

1. Complete the public API to support building any supported seed type
for RandomSource.create(...). This requires adding:

public static byte[] createByteArray(int);

The RandomSource is then a public API for factory methods to build seeds.


2. Support each RandomSource being able to create a suitable seed for
itself:

public Object createSeed();
I'm a bit worried about the public API dealing with "Object".

I only suggested it due to the fact that 'Object' is the type handled for the seed by:

public boolean isNativeSeed(Object);
public static RestorableUniformRandomProvider create(RandomSource, Object seed, Object... data)

Thus createSeed() can be documented to return an object that will be 'true' for isNativeSeed(Object), and it will be one of the supported types (Integer, Long, int[], long[] or byte[]).

It is a bit vague though. Perhaps a compromise:

public byte[] createSeed();
public byte[] createSeed(UniformRandomProvider source);

This will apply to any of the generators. The seed would have to be converted from its native type but at least the API avoids Object. It also prevents the user just getting the byte size and creating their own seed. It allows the user to supply a source of randomness but will pass the additional seed requirement checks for the specific random source.



3. Support a user supplied source of randomness for seeding:

RandomSource.create(...) to support UniformRandomProvider as a seed.

This works together with (2) if this is also added:

public Object createSeed(Object seed);

The method would create the native seed type using the internal
conversion methods. It effectively allows a user to change:

RandomSource source = ...;
long something = ...;
UniformRandomProvider rng1 = RandomSource.create(source, something);

to:

RandomSource source = ...;
long something = ...;
Object seed = source.createSeed(something);
// done later/again ...
UniformRandomProvider rng1 = RandomSource.create(source, seed);

The create() method can be called repeatedly to create the same
generator. It is something I would do for simulations where you wish to
repeat a simulation that produced a desirable outcome, for example to
monitor the  steps that occurred during the same simulation.


My vote:

-0 Item (1). It is not used by the library but may be used by someone.
-1
Use-case is better served with "nextBytes" (where the source is not
limited to the one set in "SeedFactory").

+1 Item (2). It replaces getSeedByteSize() with a more user friendly API
for seed generation.

+1 Item (3). It is useful, e.g. it could be combined with the
JDKRandomWrapper:

UniformRandomProvider seed = new JDKRandomWrapper(new SecureRandom());
UniformRandomProvider rng1 = RandomSource.create(source, seed);
Ueful indeed but preferably with a strongly typed API:

RandomSource {
      public UniformRandomProvider create(RandomSource source,
          UniformRandomProvider seed) { /* ... */ }
}

This would be an overloaded factory method:

public static RestorableUniformRandomProvider create(RandomSource source,
UniformRandomProvider seed,
                                                         Object... data);

If a user wants to recreate the same generator then they have to feed in the same source of randomness:

RandomSource source = ...;
long seed = 12345;
UniformRandomProvider rng1 = RandomSource.create(source, new SplitMix64(seed)); UniformRandomProvider rng2 = RandomSource.create(source, new SplitMix64(seed));

This works for any UniformRandomProvider which can be easily recreated at the same point. But if wanting to recreate the same generator from SecureRandom they would have to deal with serialisation of SecureRandom to be able to restore it.

I think it would be an unnecessary overload of the current factory method just to change the type of a single parameter. However this new factory method would not be needed if we add:

public byte[] createSeed(UniformRandomProvider source);

Allowing the equivalent:

RandomSource source = ...;
byte[] seed = source.createSeed(new SplitMix64(12345));
UniformRandomProvider rng1 = RandomSource.create(source, seed);
UniformRandomProvider rng2 = RandomSource.create(source, seed);

So I am +1 for just adding the createSeed methods so allowing the seed verification internals to be used for public consumption.

Alex


Best,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Reply via email to