[
https://issues.apache.org/jira/browse/RNG-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507660#comment-17507660
]
Alex Herbert commented on RNG-169:
----------------------------------
Some further investigation shows that array seed conversions were inconsistent.
* byte[] input used the bytes in little endian order.
* int[] input was split into two halves: the first half of the input int[]
filled the low half of each output long; the second half of the input int[]
filled the upper half.
* long[] input used the low half of each long for the first half of the output
int[]; the upper half of each long filled the rest of the int[] output.
* All array to array conversions used the entire input length, even if the
native seed length is shorter. This made the output arrays longer than
required. The extra converted bytes would not be used by any RNG in the library.
* Array to primitive (Long/Int) seed conversions convert to an array of the
correct type and xor the entire seed array. So these RNGs use all the bytes in
an input seed.
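A minimal sketch of the array to primitive behaviour described in the last bullet: every element of the seed array is combined with xor. The class and method names here are hypothetical and only illustrate the idea; the library's converters may be structured differently.

```java
public final class XorSeedSketch {
    private XorSeedSketch() {}

    /** Combine every element of a long[] seed into a single long using xor. */
    static long toLong(long[] seed) {
        long result = 0;
        for (final long value : seed) {
            result ^= value;
        }
        return result;
    }

    /** Combine every element of an int[] seed into a single int using xor. */
    static int toInt(int[] seed) {
        int result = 0;
        for (final int value : seed) {
            result ^= value;
        }
        return result;
    }
}
```

Because each element contributes to the result, no part of the input seed is ignored, which is the "Consume" behaviour.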
h2. Modifications
I have updated the array converters to all use little endian conversion and
implement the Seed2ArrayConverter interface. It is now tested that the
following works:
{noformat}
byte[] -> int[] -> long[] == byte[] -> long[]
byte[] -> long[] -> int[] == byte[] -> int[]
int[] -> long[] -> int[] == int[]
long[] -> int[] -> long[] == long[]{noformat}
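The identities above can be illustrated with a small self-contained sketch of little endian conversion. The class and method names are hypothetical and this is not the library code. Unlike the updated converters, this sketch always converts the whole input, so the second identity only holds here when the byte[] length is a multiple of 8, and the third only for an even-length int[].

```java
public final class LittleEndianSeedSketch {
    private LittleEndianSeedSketch() {}

    /** byte[] to long[]: byte i goes to bits 8*(i%8) of long i/8. */
    static long[] bytesToLongs(byte[] in) {
        final long[] out = new long[(in.length + 7) / 8];
        for (int i = 0; i < in.length; i++) {
            out[i >>> 3] |= (in[i] & 0xffL) << ((i & 7) << 3);
        }
        return out;
    }

    /** byte[] to int[]: byte i goes to bits 8*(i%4) of int i/4. */
    static int[] bytesToInts(byte[] in) {
        final int[] out = new int[(in.length + 3) / 4];
        for (int i = 0; i < in.length; i++) {
            out[i >>> 2] |= (in[i] & 0xff) << ((i & 3) << 3);
        }
        return out;
    }

    /** int[] to long[]: even-index ints fill the low half, odd-index the upper half. */
    static long[] intsToLongs(int[] in) {
        final long[] out = new long[(in.length + 1) / 2];
        for (int i = 0; i < in.length; i++) {
            out[i >>> 1] |= (in[i] & 0xffffffffL) << ((i & 1) << 5);
        }
        return out;
    }

    /** long[] to int[]: each long is written as its low half then its upper half. */
    static int[] longsToInts(long[] in) {
        final int[] out = new int[in.length * 2];
        for (int i = 0; i < in.length; i++) {
            out[2 * i] = (int) in[i];
            out[2 * i + 1] = (int) (in[i] >>> 32);
        }
        return out;
    }
}
```

Trailing bytes are written directly into the final element, so an input such as a 13 byte seed needs no padded copy before conversion.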
Array-to-array conversions set the output length to the minimum of the native
seed length and the length available from the input bytes. No zero filling of
the input occurs if the input seed is too small. Extra bytes from the input
seed are not converted. Previously they would have been converted and then
discarded after the RNG had been constructed using only its required number of
bytes.
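A rough sketch of the length rule, assuming the available length is the number of longs needed to hold all input bytes (rounded up). The class name, method name and the nativeLength parameter are hypothetical; this is not the Seed2ArrayConverter implementation.

```java
public final class SeedLengthSketch {
    private SeedLengthSketch() {}

    /**
     * Convert a byte[] seed to a long[] in little endian order.
     *
     * @param in input seed bytes
     * @param nativeLength number of longs required by the generator (assumed parameter)
     * @return an array of length min(nativeLength, ceil(in.length / 8))
     */
    static long[] bytesToLongs(byte[] in, int nativeLength) {
        // Longs needed to hold every input byte.
        final int available = (in.length + 7) / 8;
        // Never create more than the generator needs; never pad a short seed.
        final int length = Math.min(nativeLength, available);
        final long[] out = new long[length];
        // Only the bytes that fit in the output are read.
        final int nBytes = Math.min(in.length, length * 8);
        for (int i = 0; i < nBytes; i++) {
            out[i >>> 3] |= (in[i] & 0xffL) << ((i & 7) << 3);
        }
        return out;
    }
}
```

With this rule a 40 byte seed passed to a generator that needs 2 longs converts only the first 16 bytes, and a 13 byte seed passed to a generator that needs 4 longs produces 2 longs.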
h2. Behavioural Change
This change does introduce a behavioural change for int[] or long[] seeds. They
are now converted in little endian order. Any seeding using fixed seeds of type
int[] or long[] to produce an RNG with a different native seed type will now
create a different RNG state from previous versions. If a seed is of the
correct native seed type then there is no change: the seed is passed through
unmodified.
I believe this is an acceptable change as seeding behaviour is not part of the
API. The behaviour change will still use the same number of bits from the input
seed to create an RNG. When seeds are too large the bits used will be different.
When seeds are the correct size the order of bits passed to the RNG will be
different.
The recommended way to create a seed for reuse within the same Major.Minor
release version number of the library remains:
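{code:java}
// Either: create a seed suitable for the source.
RandomSource rs = ...;
byte[] seed = rs.createSeed();

// Or: create the seed using an existing generator as the source of randomness.
UniformRandomProvider rng = ...;
byte[] seedFromRng = rs.createSeed(rng);{code}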
Since byte[] seed behaviour has only been changed to avoid conversion of bytes
that are discarded, any byte[] seeds from previous versions should still
produce the same RNG output.
h2. Native seed type behaviour is still inconsistent
||Native type||Array seed||Primitive seed||
|Array|Truncate|Expand|
|Primitive|Consume|Consume/Expand|
* Truncate = Use only the number of bytes required from the seed
* Consume = Use all the bytes from the seed
* Expand = Use the input seed to generate a random seed of the correct length
Consistency could be obtained by:
# Updating primitive native seed types to only use the required number of
bytes to create the native seed, effectively truncating the input seed.
# Updating the array native seed types to construct the output array of the
correct length, fill it, then continue to use any remaining bytes to mix in
with the current seed.
Option 1 would be faster.
Option 2 would be slower.
Note: The only RNGs in the library with a primitive seed type are:
{noformat}
JDKRandom (long)
SplitMix64 (long)
TwoCmres (int)
JenkinsSmallFast32 (int)
JenkinsSmallFast64 (long)
PcgMcgXshRr32 (long)
PcgMcgXshRs32 (long)
PcgXshRr32 (long)
PcgXshRs32 (long)
PcgRxsMXs64 (long){noformat}
> Update byte[] array conversion use optimum memory allocation
> ------------------------------------------------------------
>
> Key: RNG-169
> URL: https://issues.apache.org/jira/browse/RNG-169
> Project: Commons RNG
> Issue Type: Improvement
> Components: simple
> Affects Versions: 1.4
> Reporter: Alex Herbert
> Priority: Trivial
> Fix For: 1.5
>
>
> The seed conversion routines in ByteArray2LongArray and ByteArray2IntArray
> can be optimised for memory usage.
> The converters can be updated to implement Seed2ArrayConverter. This allows
> the length of the output seed to be constructed to the correct length. This
> will avoid converting part of the byte[] seed that is not used.
> In addition the input seed is expanded using Arrays.copyOf if its length is
> not a multiple of 8 or 4 respectively. This will zero fill the end of the seed.
> The array can then be converted by the NumberFactory without an exception.
> These routines should be updated to use the same method as NumberFactory to
> fill in a long[] and then add any trailing bytes to the final long.
> This avoids any array copy when using arbitrary seed lengths, e.g.
> SecureRandom.getSeed(13).