[
https://issues.apache.org/jira/browse/RNG-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16854542#comment-16854542
]
Alex D Herbert commented on RNG-75:
-----------------------------------
I have created a PR that rearranges the internals of the ProviderBuilder. The
goal was to allow:
* Construction of array seeds of the correct length
* Customisation of seed creation on a per generator basis, e.g. to prevent
creation of seeds containing only zero bytes
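As a rough illustration of the second goal (this is not the code from the PR), a per-generator seed routine could guard against an all-zero seed as sketched below. The class name, method names and the fallback constant are hypothetical; the sketch assumes a generator whose state must contain at least one non-zero element.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper; the names are illustrative and not taken from the PR. */
final class NonZeroSeedSketch {
    /** Arbitrary non-zero fallback value (an assumption of this sketch). */
    private static final long FALLBACK = 0x9e3779b97f4a7c15L;

    /** Returns true if every element of the seed is zero. */
    static boolean isAllZero(long[] seed) {
        for (long v : seed) {
            if (v != 0) {
                return false;
            }
        }
        return true;
    }

    /** Fills a seed of the given length from the source, then ensures it is not all zero. */
    static long[] createNonZeroSeed(int length, LongSupplier source) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive: " + length);
        }
        long[] seed = new long[length];
        for (int i = 0; i < length; i++) {
            seed[i] = source.getAsLong();
        }
        if (isAllZero(seed)) {
            // Overwrite one element so the generator state cannot be all zero.
            seed[0] = FALLBACK;
        }
        return seed;
    }

    public static void main(String[] args) {
        long[] seed = createNonZeroSeed(4, () -> 0L);
        System.out.println(isAllZero(seed));
    }
}
```

A generator that accepts any seed would simply skip the final check, which is the kind of per-generator customisation the list above refers to.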
The code change does the following:
* Moves the methods for creating a seed and a new generator into
RandomSourceInternal
* Adds a new interface for seed conversions to specify output array size
* Adds a property to each RandomSourceInternal containing the size of the seed
* Uses a custom enum supporting all the seed types to perform conversions
* Caches the constructor (it is always the same) to avoid repeat lookups using
reflection
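The constructor caching in the last point can be sketched as follows. This is a minimal illustration, not the code from the PR: the class {{CachedConstructorSketch}} is hypothetical and {{java.util.Random}} merely stands in for a generator class. The lookup via {{Class.getConstructor}} is done once and only {{Constructor.newInstance}} is called on each creation.

```java
import java.lang.reflect.Constructor;
import java.util.Random;

/** Hypothetical sketch: look up the constructor once and reuse it. */
final class CachedConstructorSketch {
    /** The cached constructor; it is always the same for a given generator type. */
    private final Constructor<?> constructor;

    CachedConstructorSketch(Class<?> type, Class<?>... argTypes) {
        try {
            // The reflective lookup happens once, here.
            this.constructor = type.getConstructor(argTypes);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Creates a new instance using the cached constructor. */
    Object create(Object... args) {
        try {
            return constructor.newInstance(args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // java.util.Random is only a stand-in for a generator class.
        CachedConstructorSketch factory = new CachedConstructorSketch(Random.class, long.class);
        Random rng = (Random) factory.create(123L);
        System.out.println(rng.nextLong() == new Random(123L).nextLong());
    }
}
```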
Note: The new code still creates a maximum array length of 128 when using a
null seed. This is to minimise the work done in the SeedFactory to create
arrays using a single synchronized RNG. This could be changed but has been left
as is to match the behaviour of the old code.
To support sized conversions a new interface has been added where the output
array size is specified. This is only applicable when the input seed is either
an int or long and is converted to an array using a SplitMix generator. So to
avoid supporting this interface for all seed converters a new internal enum was
created that performs seed conversions. This uses the seed converters directly
and uses the array size when appropriate.
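A sized conversion of this kind can be sketched as below. The class and method names are hypothetical, and the code uses the standard SplitMix64 update and mixing steps; whether the library's converter seeds and advances its SplitMix generator in exactly this way is an assumption.

```java
/** Hypothetical sketch of a sized long-to-long[] seed conversion. */
final class SizedLongConversionSketch {
    /** Increment used by the SplitMix64 algorithm. */
    private static final long GOLDEN_GAMMA = 0x9e3779b97f4a7c15L;

    /** Expands a single long into an array of the requested size. */
    static long[] toLongArray(long seed, int size) {
        long[] out = new long[size];
        long state = seed;
        for (int i = 0; i < size; i++) {
            // Standard SplitMix64 step: advance the state, then mix it.
            state += GOLDEN_GAMMA;
            long z = state;
            z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
            z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
            out[i] = z ^ (z >>> 31);
        }
        return out;
    }

    public static void main(String[] args) {
        // 312 is the seed length quoted in this comment for MT_64.
        System.out.println(toLongArray(42L, 312).length);
    }
}
```

With a fixed size of 128 the caller receives the same leading values but must fill or pad the remainder itself, which is why passing the native seed size through to the conversion matters.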
I have JMH construction timing data for all the generators. For reference here
are the methods:
* Create using the native constructor with the {{new}} keyword
* Create using a cached {{Constructor<Object>}} with newInstance
* Create using a cached {{Class<?>}}, lookup the {{Constructor<Object>}} and
use newInstance
* Create using {{RandomSource}} with the native seed
* Create using {{RandomSource}} with a {{null}} seed
* Create using {{RandomSource}} with a truncated native seed forcing self
seeding
* Create using {{RandomSource}} with a {{byte[]}} of the appropriate length
To avoid a large table I've computed the relative time between the old code
and the new. I've split the generators into groups with:
* Long seeding routines
* Large array seeds (above the supported max length of 128)
* Small seeds (below length 128)
Here are some charts of the relative construction time:
*Long seeding routines*
!long.jpg!
In this case the changes have not made the construction faster. I do not
understand the slower time for TWO_CMRES. Construction with the same seed using
the *new* keyword takes the same time, but the new factory methods are slower.
The seed is a single integer so this timing difference is strange. This
generator is very slow to self-seed so I would expect no difference for the
construction since instantiation time is minimal compared to the self-seeding
time.
The other generators are the same speed. The slower time for createLongSeed for
the Mersenne Twisters is explained by the fact that the old method computed a
seed of length 128. The MT requires a seed of 624 and MT_64 a seed of 312. So
the new method which creates the correct seed length is doing more work.
*Large array seeds*
!large.jpg!
For the generators with big seeds most methods are the same speed. The MWC
result for *createNullSeed* is an outlier. It requires repeating on a different
machine to verify.
The other generators are slower on *createLongSeed* since the new method is
array size aware and creates a full length seed, whereas the old method created
a seed of length 128.
*Small seeds*
!small.jpg!
For *createLongSeed* construction is faster as the correct seed length is
created rather than a seed of length 128.
For other cases the construction is faster as the method now caches the
constructor that is used to create the generator. This adds a noticeable
improvement when the seed is pre-built.
Note that *createNullSeed* is not much faster for JDK or SplitMix as they have
a single long seed. The time is mainly consumed by the SeedFactory creating the
seed.
A table comparing the time to create a single seed value using various
methods, including synchronised ones, is shown below for a single thread and
for 4 threads running in JMH.
||Method||1 Thread||4 Threads||
|AtomicInt_getAndIncrement|4.88|66.20|
|AtomicLong_getAndIncrement|4.85|62.57|
|SeedFactory_createInt|54.75|791.25|
|SeedFactory_createLong|63.92|842.57|
|SyncSplitMix_nextInt|6.66|78.77|
|SyncSplitMix_nextLong|6.63|75.18|
|System_currentTimeMillis|570.87|2,217.24|
|System_identityHashCode|36.43|41.42|
|System_nanoTime|575.76|2,224.95|
|ThreadLocalRandom_nextInt|1.44|1.65|
|ThreadLocalRandom_nextLong|1.63|1.66|
|volatileInt_increment|4.94|145.98|
|volatileLong_increment|4.87|154.59|
|Well44497b_nextInt|18.95|572.74|
|Well44497b_nextLong|29.48|473.10|
|XoRoShiRo128Plus_nextInt|16.71|333.53|
|XoRoShiRo128Plus_nextLong|17.20|302.67|
|XorShift1024StarPhi_nextInt|17.33|309.53|
|XorShift1024StarPhi_nextLong|16.86|302.52|
Some of these timings are to be ignored (currentTimeMillis and nanoTime) as my
machine was busy.
It is clear that the SeedFactory is not the fastest method of generating new
primitive values. This is due to the reliance on:
{code:java}
System.identityHashCode(new Object())
{code}
Interestingly this method does not slow down when running on multiple threads
so the JVM must somehow pool the identity hashcodes it uses for objects.
This data is from a new benchmark for methods of seed creation. It contains a
test for creating an array seed using a block size to define the number of
calls to the generator to perform in each synchronized block.
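The block-size idea can be sketched as follows. This is only an illustration of the technique and not the benchmark code: the class name, the shared SplitMix64-style state and its {{System.nanoTime()}} initial value are all assumptions.

```java
/** Hypothetical sketch: fill an array seed using synchronized blocks of a fixed size. */
final class BlockSeedSketch {
    private static final Object LOCK = new Object();
    /** Shared generator state; the initial value is an assumption of this sketch. */
    private static long state = System.nanoTime();

    /** One SplitMix64 step. The caller must hold LOCK. */
    private static long next() {
        long z = (state += 0x9e3779b97f4a7c15L);
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    /** Creates a seed, taking the lock once per block of up to blockSize values. */
    static long[] createLongArray(int size, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive: " + blockSize);
        }
        long[] seed = new long[size];
        int i = 0;
        while (i < size) {
            int end = Math.min(size, i + blockSize);
            synchronized (LOCK) {
                while (i < end) {
                    seed[i++] = next();
                }
            }
            // The lock is released here so other threads can obtain their own block.
        }
        return seed;
    }

    public static void main(String[] args) {
        System.out.println(createLongArray(10, 3).length);
    }
}
```

A block size of 1 takes the lock for every value, while a block size equal to the array length holds it for the whole seed; the benchmark is intended to show where between these extremes contention is lowest.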
A previous discussion on the mailing list stated that a new method could be
added to SeedFactory for faster generation of primitive and array seeds. I will
run the new benchmark overnight and create a ticket for analysing performance
of the SeedFactory.
> Improve the speed of the RandomSource create method.
> ----------------------------------------------------
>
> Key: RNG-75
> URL: https://issues.apache.org/jira/browse/RNG-75
> Project: Commons RNG
> Issue Type: Improvement
> Components: simple
> Affects Versions: 1.3
> Reporter: Alex D Herbert
> Assignee: Alex D Herbert
> Priority: Minor
> Fix For: 1.3
>
> Attachments: large.jpg, long.jpg, small.jpg
>
>
> Update the {{o.a.c.rng.simple.internal}} package to improve the construction
> speed of random generators.
> Areas identified by the construction benchmark
> [RNG-72|https://issues.apache.org/jira/projects/RNG/issues/RNG-72] include:
> * Update the {{RandomSourceInternal}} to know the desired size for the native
> seed
> * Update the {{SeedFactory}} for faster {{byte[]}} conversions
> * Remove the use of reflection for fast seeding generators
> It is intended that all changes made are non-destructive to the quality of
> any generated seed.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)