[
https://issues.apache.org/jira/browse/RNG-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855747#comment-16855747
]
Alex D Herbert commented on RNG-104:
------------------------------------
Results of generating a single int/long value using various thread-safe
methods.
Most methods are self-explanatory. Each method with an RNG uses this approach:
{code:java}
synchronized (rng) {
    return rng.nextLong();
}
{code}
System_identityHashCode does this:
{code:java}
System.identityHashCode(new Object());
{code}
The SyncSplitMix_nextLong and SyncSplitMix_nextInt methods use a single
AtomicLong to hold the state of a SplitMix-type generator. The state is
incremented atomically. The random int or long is then generated using the hash
functions found in the JDK 8 SplittableRandom: the long output matches the
SplitMix algorithm, and the int output is the 32 high bits of the Stafford
variant 4 mix64 function.
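For reference, a minimal sketch of the SyncSplitMix idea described above. This is not the benchmark code: the class name, the use of {{addAndGet}} and the golden-ratio increment are assumptions made for illustration. The two mix functions are the ones used by the JDK 8 SplittableRandom.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch of a thread-safe SplitMix-style generator backed by a single AtomicLong. */
final class SyncSplitMixSketch {
    /** Increment applied to the state; assumed to be SplittableRandom's default gamma. */
    private static final long GOLDEN_GAMMA = 0x9e3779b97f4a7c15L;

    /** The whole generator state; all threads share it. */
    private final AtomicLong state;

    SyncSplitMixSketch(long seed) {
        state = new AtomicLong(seed);
    }

    /** Atomically advance the state, then hash it with the SplitMix64 mix function. */
    long nextLong() {
        long z = state.addAndGet(GOLDEN_GAMMA);
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    /** Atomically advance the state, then take the 32 high bits of the variant 4 mix. */
    int nextInt() {
        long z = state.addAndGet(GOLDEN_GAMMA);
        z = (z ^ (z >>> 33)) * 0x62a9d9ed799705f5L;
        return (int) (((z ^ (z >>> 28)) * 0xcb24d0a5c88c35b3L) >>> 32);
    }
}
```

With these assumptions the output sequence from a single thread should match {{new SplittableRandom(seed)}} for the same seed. The only shared operation is the atomic add, so no lock is held while hashing.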
Timings are in nanoseconds, taken as the median of 5 runs with the JMH overhead
(roughly 2 ns) subtracted. The last two columns are the number of threads:
||Method||Type||1 thread||4 threads||
|ThreadLocalRandom_nextInt|int|1.38|1.72|
|ThreadLocalRandom_nextLong|long|1.60|1.84|
|XoRoShiRo128Plus_nextLong|long|2.49|318.60|
|XoRoShiRo128Plus_nextInt|int|2.58|311.43|
|XorShift1024StarPhi_nextInt|int|3.18|301.01|
|XorShift1024StarPhi_nextLong|long|3.71|283.80|
|AtomicLong_getAndIncrement|long|4.83|65.04|
|volatileInt_increment|int|4.83|147.90|
|AtomicInt_getAndIncrement|int|4.83|67.31|
|volatileLong_increment|long|4.83|153.98|
|SyncSplitMix_nextLong|long|6.56|70.98|
|SyncSplitMix_nextInt|int|9.75|68.85|
|Well44497b_nextInt|int|9.95|379.08|
|Well44497b_nextLong|long|21.31|779.19|
|System_identityHashCode|int|36.69|41.36|
|SeedFactory_createInt|int|54.62|730.53|
|SeedFactory_createLong|long|63.01|804.52|
|System_nanoTime|long|560.05|2,190.05|
|System_currentTimeMillis|long|564.09|2,184.35|
Observations:
* System.currentTimeMillis and System.nanoTime are very slow. On a different
machine these methods are not this slow but are still worse than other methods
here and so should be avoided.
* Multi-threaded generation of random values is much slower than
single-threaded generation for most methods.
* Only the ThreadLocalRandom and System identityHashCode methods run at
approximately the same speed on 1 and 4 threads. ThreadLocalRandom is designed
to work across threads and the identity hash code must have JVM framework
support to work across threads.
* Direct use of volatile variables is slower than using Atomic classes when
multi-threaded (the single-thread times are identical).
*Single thread*
When running on a single thread, using a fast generator within a synchronised
block is close to ThreadLocalRandom.
Use of Atomic classes is not as fast as synchronisation on a single RNG. So
entry to and exit from a synchronised block is fast when no other thread
requires synchronisation on the same RNG.
The SeedFactory is slowest, excluding the System time methods. It is both
synchronising on an RNG and also running the identityHashCode method. If you
add the time for the Well44497b to the time for the identityHashCode the total
is similar but several nanoseconds faster (int: 9.95 + 36.69 = 46.64 against
54.62; long: 21.31 + 36.69 = 58.00 against 63.01).
This sum omits the xor operation done in the SeedFactory but the main
difference may be how the JVM can inline the code in the benchmark versus the
SeedFactory code.
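To make the cost breakdown concrete, here is a rough sketch of a seed method that combines a synchronised RNG with an identity hash code using xor. It is an illustration only and not the SeedFactory source: the class name is hypothetical and {{java.util.Random}} is a stand-in for the Well44497b generator so that the example has no dependencies.

```java
import java.util.Random;

/** Sketch of a seed source that xors a synchronised RNG with an identity hash code. */
final class XorSeedSketch {
    /** Stand-in for the shared generator (assumption: the real code uses Well44497b). */
    private static final Random SHARED_RNG = new Random();

    private XorSeedSketch() {}

    /** Each call pays for one lock, one RNG call and one identity hash of a new object. */
    static long createLong() {
        synchronized (SHARED_RNG) {
            return SHARED_RNG.nextLong() ^ System.identityHashCode(new Object());
        }
    }

    /** Same approach for a 32-bit seed. */
    static int createInt() {
        synchronized (SHARED_RNG) {
            return SHARED_RNG.nextInt() ^ System.identityHashCode(new Object());
        }
    }
}
```

Each call therefore contains the three costs measured separately in the table: lock entry and exit, the RNG call, and the identity hash code.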
*Multi-threaded*
ThreadLocalRandom easily wins. The JMH benchmark must create one for each
thread and then avoid contention through the entire benchmark as the times are
almost the same as a single thread.
The next best methods are all based on Atomic classes. These easily outperform
direct use of volatile class variables. Many of the operations in the Unsafe
class behind the Atomic classes are JVM intrinsics in JDK 8 (see [JVM 8
intrinsics|https://gist.github.com/apangin/7a9b7062a4bd0cd41fcc]), so these use
custom-written assembly code.
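A small sketch of the two counter styles being compared, assuming the volatile benchmarks perform a plain {{++}} on a volatile field (the benchmark source is not shown here, so that is a guess). The class and method names are hypothetical. Note that {{++}} on a volatile field is a separate read and write, so besides being slower under contention it can also lose updates, whereas {{getAndIncrement}} is a single atomic operation.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch comparing a volatile counter with an AtomicLong counter across threads. */
final class CounterSketch {
    /** Writes are visible to all threads, but ++ is not an atomic operation. */
    private volatile long volatileCounter;
    /** Atomic read-modify-write counter. */
    private final AtomicLong atomicCounter = new AtomicLong();

    long volatileTotal() {
        return volatileCounter;
    }

    long atomicTotal() {
        return atomicCounter.get();
    }

    /** Increment both counters from several threads and wait for all to finish. */
    void run(int threads, int incrementsPerThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    volatileCounter++;               // read, add, write: updates may be lost
                    atomicCounter.getAndIncrement(); // never loses an update
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }
}
```

After {{run(4, 10000)}} the atomic total is exactly 40000, while the volatile total can be anything up to 40000.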
There is a strange timing outlier for the SeedFactory. For long generation the
time (804.52) is almost the same as the time for Well44497b_nextLong + the
identity hash code method (779.19 + 41.36 = 820.55). But the int generation
(730.53) is much slower than expected from Well44497b_nextInt + the identity
hash code method (379.08 + 41.36 = 420.44).
*Discussion points*:
ThreadLocalRandom was added in Java 1.7 so cannot be used in the RNG codebase,
which targets Java 1.6.
It seems that it would be ideal for fast generation of seeds. Any methods added
to the SeedFactory to provide a fast int/long for Java 1.6 would then become
obsolete with an upgrade to Java 1.7.
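A minimal sketch of what a ThreadLocalRandom based seed method could look like after such an upgrade. The class and method names are hypothetical and this is not a proposal for the actual API.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Sketch of lock-free seed creation using ThreadLocalRandom (requires Java 1.7+). */
final class FastSeedSketch {
    private FastSeedSketch() {}

    /** current() resolves the generator for the calling thread, so no lock is shared. */
    static long createLong() {
        return ThreadLocalRandom.current().nextLong();
    }

    /** Same approach for a 32-bit seed. */
    static int createInt() {
        return ThreadLocalRandom.current().nextInt();
    }
}
```

The call to {{current()}} must be made on the thread that uses the value. Caching the returned instance and sharing it between threads would defeat the purpose.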
Also note that the use of System.identityHashCode mixed with the Well44497b
generator:
* Is slow
* Does not allow it to pass BigCrush (see RNG-75)
The stress test application uses Java 8. Thus a test of ThreadLocalRandom mixed
with the Well44497b generator should be done. This would create a long period
generator in a similar fashion to mixing with the identity hash code method.
> SeedFactory seed creation performance analysis
> ----------------------------------------------
>
> Key: RNG-104
> URL: https://issues.apache.org/jira/browse/RNG-104
> Project: Commons RNG
> Issue Type: Task
> Components: simple
> Affects Versions: 1.3
> Reporter: Alex D Herbert
> Assignee: Alex D Herbert
> Priority: Minor
>
> The SeedFactory is used to create seeds for the random generators. To ensure
> thread safety this uses synchronized blocks around a single generator. The
> current method only generates a single int or long per synchronisation.
> Analyze the performance of this approach. The analysis will investigate
> generating multiple values inside each synchronisation around the generator.
> This analysis will also investigate methods to supplement the SeedFactory
> with fast methods to create seeds. This will use a fast seeding method to
> generate a single long value. This can be a seed for a SplitMix generator
> used to create a seed of any length.
>