[jira] [Commented] (RNG-169) Update byte[] array conversion use optimum memory allocation

Alex Herbert (Jira) Wed, 16 Mar 2022 07:39:04 -0700


    [ 
https://issues.apache.org/jira/browse/RNG-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507660#comment-17507660
 ]


Alex Herbert commented on RNG-169:
----------------------------------

Some further investigation shows that array seed conversions were inconsistent.
 * byte[] input used the bytes in little endian order.
 * int[] input was split in two: the first half of the input int[] filled the 
lower 32 bits of each output long, and the second half filled the upper 32 
bits.
 * long[] input was split the other way: the lower 32 bits of each long filled 
the first half of the output int[], and the upper 32 bits filled the rest.
 * All array to array conversions used the entire input length, even if the 
native seed length is shorter. This produced output arrays longer than 
required; the extra converted values would not be used by any RNG in the 
library.
 * Array to primitive (Long/Int) seed conversions will convert to an array of 
the correct type and xor the entire seed array. So these RNGs will use all the 
bytes in an input seed.

h2. Modifications

I have updated the array converters to all use little endian conversion and 
implement the Seed2ArrayConverter interface. Tests now verify that the 
following equalities hold:
{noformat}
byte[] -> int[] -> long[]   == byte[] -> long[]
byte[] -> long[] -> int[]   == byte[] -> int[]

int[] -> long[] -> int[]    == int[]
long[] -> int[] -> long[]   == long[]{noformat}
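
The equalities above follow directly from using one byte order everywhere. A minimal sketch with hand-written little-endian converters is shown below. The names are hypothetical and this is not the library code. Output length handling is left out, so the byte[] equalities are only exact here when the input length is a multiple of 8 (otherwise the results can differ by trailing zero elements).

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative only: hypothetical little-endian converters, not the library classes. */
final class LittleEndianSeeds {
    static int[] bytesToInts(byte[] b) {
        final int[] out = new int[(b.length + 3) / 4];
        for (int i = 0; i < b.length; i++) {
            // Byte i goes into int i/4 at bit offset 8 * (i % 4)
            out[i >> 2] |= (b[i] & 0xff) << ((i & 3) << 3);
        }
        return out;
    }

    static long[] bytesToLongs(byte[] b) {
        final long[] out = new long[(b.length + 7) / 8];
        for (int i = 0; i < b.length; i++) {
            // Byte i goes into long i/8 at bit offset 8 * (i % 8)
            out[i >> 3] |= (b[i] & 0xffL) << ((i & 7) << 3);
        }
        return out;
    }

    static long[] intsToLongs(int[] in) {
        final long[] out = new long[(in.length + 1) / 2];
        for (int i = 0; i < in.length; i++) {
            // Even ints fill the low 32 bits, odd ints the high 32 bits
            out[i >> 1] |= (in[i] & 0xffffffffL) << ((i & 1) << 5);
        }
        return out;
    }

    static int[] longsToInts(long[] in) {
        final int[] out = new int[in.length * 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (int) (in[i >> 1] >>> ((i & 1) << 5));
        }
        return out;
    }

    public static void main(String[] args) {
        final byte[] seed = new byte[16];
        new Random(42).nextBytes(seed);
        final int[] ints = bytesToInts(seed);
        final long[] longs = bytesToLongs(seed);
        System.out.println(Arrays.equals(intsToLongs(ints), longs));
        System.out.println(Arrays.equals(longsToInts(longs), ints));
        System.out.println(Arrays.equals(longsToInts(intsToLongs(ints)), ints));
        System.out.println(Arrays.equals(intsToLongs(longsToInts(longs)), longs));
    }
}
```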
Array-to-array conversions now set the output length to the minimum of the 
native seed length and the length available from the input bytes. The input is 
not zero filled if the input seed is too small. Extra bytes from the input seed 
are not converted. Previously they were converted and then discarded once the 
RNG had been constructed using only its required number of bytes.
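
As a rough illustration of the length rule, the sketch below converts byte[] to long[] with a requested native length. The class name and the nativeLength parameter are hypothetical. Trailing bytes that do not fill a whole long are added to the final long directly, so no padded copy of the input is made.

```java
/** Illustrative only: hypothetical names, not the library's converter classes. */
final class TruncatingConverter {
    /**
     * Convert bytes to longs in little-endian order.
     *
     * @param seed input bytes
     * @param nativeLength number of longs the generator requires (hypothetical parameter)
     * @return an array of length min(nativeLength, ceil(seed.length / 8))
     */
    static long[] bytesToLongs(byte[] seed, int nativeLength) {
        // Number of longs that the input can (partially) fill
        final int available = (seed.length + 7) / 8;
        final int n = Math.min(nativeLength, available);
        final long[] out = new long[n];
        // Bytes beyond the required output length are never read
        final int limit = Math.min(seed.length, n * 8);
        for (int i = 0; i < limit; i++) {
            out[i >> 3] |= (seed[i] & 0xffL) << ((i & 7) << 3);
        }
        return out;
    }
}
```

With a 40 byte seed and a native length of 2, only the first 16 bytes are read. With a 13 byte seed and a native length of 4, the output has length 2 and the last 5 bytes form the second long.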
h2. Behavioural Change

This change does introduce a behavioural change for int[] or long[] seeds. They 
are now converted in little endian order. Any seeding that uses fixed seeds of 
type int[] or long[] to produce an RNG with a different native seed type will 
now create a different RNG state from previous versions. If a seed is of the 
correct native seed type then there is no change: the seed is passed through 
unmodified.

I believe this is an acceptable change as seeding behaviour is not part of the 
API. The change still uses the same number of bits from the input seed to 
create an RNG. When seeds are too large, the bits used will be different. When 
seeds are the correct size, the order of bits passed to the RNG will be 
different.

The recommended way to create a seed for reuse within the same Major.Minor 
release version number of the library remains:

 
{code:java}
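RandomSource rs = ...;

// Either: create a seed of the native size for the source
byte[] seed = rs.createSeed();

// Or: create the seed using an existing generator as the source of randomness
UniformRandomProvider rng = ...;
byte[] seedFromRng = rs.createSeed(rng);{code}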
 

Since byte[] seed behaviour has only been changed to avoid conversion of bytes 
that are discarded, any byte[] seeds from previous versions should function to 
create the same RNG output.
h2. Native seed types behaviour is still inconsistent
||Native type||Array seed||Primitive seed||
|Array|Truncate|Expand|
|Primitive|Consume|Consume/Expand|
 * Truncate = Use only the number of bytes required from the seed
 * Consume = Use all the bytes from the seed
 * Expand = Use the input seed to generate a random seed of the correct length
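
To illustrate the difference between Consume and Truncate for a generator with a primitive long seed, here is a minimal sketch. The names are hypothetical and this is not the library implementation; it only assumes the xor combination and little-endian order described in this comment.

```java
/** Illustrative only: hypothetical names, not the library implementation. */
final class PrimitiveSeedSketch {
    /** Consume: convert all bytes to little-endian longs and xor them together. */
    static long consume(byte[] seed) {
        long result = 0;
        for (int i = 0; i < seed.length; i++) {
            // Xor byte i into its position within its little-endian long;
            // this equals the xor of every converted long
            result ^= (seed[i] & 0xffL) << ((i & 7) << 3);
        }
        return result;
    }

    /** Truncate: use only the first 8 bytes (or fewer if the seed is shorter). */
    static long truncate(byte[] seed) {
        long result = 0;
        final int n = Math.min(8, seed.length);
        for (int i = 0; i < n; i++) {
            result |= (seed[i] & 0xffL) << (i << 3);
        }
        return result;
    }
}
```

Both methods return the same value for seeds of up to 8 bytes. They differ as soon as the input carries more bytes than the native seed requires.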
Consistency could be obtained by:
 # Updating primitive native seed types to only use the required number of 
bytes to create the native seed, effectively truncating the input seed.
 # Updating the array native seed types to construct the output array of the 
correct length, fill it, then continue to use any remaining bytes to mix in 
with the current seed.

Option 1 would be faster as only the required bytes are read. 

Option 2 would be slower as every input byte is processed.
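
A rough sketch of what option 2 could look like for a long[] native seed is given below. The mixing step is an assumption: here the remaining bytes are xor'd into the output, wrapping around the array. The class name and the nativeLength parameter are hypothetical, and a different mixing function could equally be chosen.

```java
/** Illustrative only: one possible reading of option 2, with hypothetical names. */
final class MixingConverterSketch {
    /**
     * Fill the output array in little-endian order, then xor any remaining
     * input bytes into the output, wrapping around the array.
     *
     * @param seed input bytes
     * @param nativeLength number of longs the generator requires (hypothetical parameter)
     */
    static long[] bytesToLongs(byte[] seed, int nativeLength) {
        final int n = Math.min(nativeLength, (seed.length + 7) / 8);
        final long[] out = new long[n];
        if (n == 0) {
            return out;
        }
        for (int i = 0; i < seed.length; i++) {
            // Assumption: xor is the mixing function; the first pass over the
            // zero-initialised array is a plain fill
            out[(i >> 3) % n] ^= (seed[i] & 0xffL) << ((i & 7) << 3);
        }
        return out;
    }
}
```

Every input byte is read, which is why this approach costs more than truncation when the seed is longer than required.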

Note: The only RNGs in the library with a primitive seed type are:

 
{noformat}
JDKRandom (long)
SplitMix64 (long)
TwoCmres (int)
JenkinsSmallFast32 (int)
JenkinsSmallFast64 (long)
PcgMcgXshRr32 (long)
PcgMcgXshRs32 (long)
PcgXshRr32 (long)
PcgXshRs32 (long)
PcgRxsMXs64 (long){noformat}
 

 

> Update byte[] array conversion use optimum memory allocation
> ------------------------------------------------------------
>
>                 Key: RNG-169
>                 URL: https://issues.apache.org/jira/browse/RNG-169
>             Project: Commons RNG
>          Issue Type: Improvement
>          Components: simple
>    Affects Versions: 1.4
>            Reporter: Alex Herbert
>            Priority: Trivial
>             Fix For: 1.5
>
>
> The seed conversion routines in ByteArray2LongArray and ByteArray2IntArray 
> can be optimised for memory usage.
> The converters can be updated to implement Seed2ArrayConverter. This allows 
> the length of the output seed to be constructed to the correct length. This 
> will avoid converting part of the byte[] seed that is not used.
> In addition the input seed is expanded if it is not modulus 8 or 4 
> respectively using Arrays.copyOf. This will zero fill the end of the seed. 
> The array can then be converted by the NumberFactory without an exception.
> These routines should be updated to use the same method as NumberFactory to 
> fill in a long[] and then add any trailing bytes to the final long.
> This avoids any array copy when using arbitrary seed lengths, e.g. 
> SecureRandom.getSeed(13).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
