[
https://issues.apache.org/jira/browse/RNG-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855697#comment-16855697
]
Alex D Herbert commented on RNG-104:
------------------------------------
Results of generating multiple values inside the synchronized block. Seed size
was limited to 128 as this is the maximum size generated by the RandomSource
factory create method. In the following table the numbered columns (1 to 16)
give the chunk size, i.e. the number of values generated per synchronisation.
The speed-up column is the time for chunk size 1 divided by the time for chunk
size 16:
|Array type|Seed size|Threads|1|2|4|8|16|Speed-up|
|int|2|1|13.17|12.07|12.02|12.02|12.05|1.09|
| | |4|726.19|422.72|397.07|411.54|449.25|1.62|
| |4|1|20.54|19.70|19.40|16.28|16.06|1.28|
| | |4|1,282.19|746.31|404.00|465.23|494.23|2.59|
| |8|1|36.30|34.34|27.52|23.78|26.17|1.39|
| | |4|2,701.81|1,362.47|816.16|481.00|398.19|6.79|
| |16|1|66.37|64.52|46.36|42.87|41.16|1.61|
| | |4|4,612.01|2,740.44|1,620.57|834.50|632.86|7.29|
| |128|1|509.27|502.99|349.39|336.29|281.26|1.81|
| | |4|37,264.61|25,228.67|16,460.90|6,402.86|5,412.38|6.89|
|long|2|1|17.88|13.52|13.68|13.53|13.48|1.33|
| | |4|771.18|367.42|493.18|405.20|497.45|1.55|
| |4|1|23.95|23.15|18.83|22.15|18.69|1.28|
| | |4|1,451.37|906.70|575.56|431.16|412.23|3.52|
| |8|1|42.56|41.52|33.17|28.18|28.45|1.50|
| | |4|2,602.50|2,236.36|804.16|772.09|503.62|5.17|
| |16|1|85.11|78.86|65.72|62.58|51.84|1.64|
| | |4|4,475.03|3,045.68|1,446.96|851.92|552.16|8.10|
| |128|1|734.61|678.53|535.54|450.59|423.76|1.73|
| | |4|38,662.13|28,054.10|10,627.97|6,685.73|5,524.78|7.00|
Observations:
* Generating the array in chunks has a small effect when running on a single
thread, as there is no contention for the synchronized code.
* On a single thread the performance is nearly doubled (speed-up of 1.81 for
int and 1.73 for long) when using chunks of size 16 to output arrays of
length 128.
* Running on 4 threads the generation of seeds is much slower.
* Using chunks shows much faster performance on 4 threads, up to an 8-fold
speed-up.
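For reference, the chunked approach measured above can be sketched roughly as
follows. This is a minimal illustration only, not the SeedFactory code: the
class name ChunkedSeedSketch, the createLongArray method and the use of
java.util.SplittableRandom as the shared generator are all assumptions made
for the example.

```java
import java.util.SplittableRandom;

// Hypothetical sketch; the names here are illustrative and are not part of
// the Commons RNG SeedFactory API.
final class ChunkedSeedSketch {
    // Stand-in (assumption) for the single shared generator guarded by a lock.
    private final SplittableRandom source;

    ChunkedSeedSketch(long seed) {
        this.source = new SplittableRandom(seed);
    }

    // Fills an array of the given size, acquiring the lock once per chunk
    // of values rather than once per value.
    long[] createLongArray(int size, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        final long[] seed = new long[size];
        int from = 0;
        while (from < size) {
            // The last chunk may be shorter than chunkSize.
            final int n = Math.min(chunkSize, size - from);
            synchronized (source) {
                for (int i = from; i < from + n; i++) {
                    seed[i] = source.nextLong();
                }
            }
            from += n;
        }
        return seed;
    }
}
```

With a chunk size of 1 this reduces to one synchronisation per value; larger
chunks reduce the number of lock acquisitions for the same output.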
Discussion points:
Is there an ideal chunk size for seed generation? Perhaps this should be a
balance between the overhead time to perform a synchronization on an object and
the number of values that can be produced by the generator in the same time.
This would mean the system is spending half the time synchronising and half
outputting values.
This point may be difficult to identify as the synchronization time may vary
based on thread contention.
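As a rough model of this balance (an assumption for illustration, not fitted
to the table above), let t_s be the overhead of one synchronization and t_g
the time to generate one value. For a chunk of n values the fraction of time
spent synchronising is

\[
f(n) = \frac{t_s}{t_s + n \, t_g}
\]

so f(n) = 1/2 when n = t_s / t_g. Under contention t_s grows, which would push
the balance point towards larger chunks.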
Note: The table can be expanded to include seed sizes of 32 and 64 and chunk
sizes of 32, 64 and 128. This roughly doubles the number of combinations and
so is still computable. Ideally the benchmark should exclude using a chunk size
larger than the seed size, but the only way I have found to do this is to throw
an exception in the state initialisation, which is not very elegant.
> SeedFactory seed creation performance analysis
> ----------------------------------------------
>
> Key: RNG-104
> URL: https://issues.apache.org/jira/browse/RNG-104
> Project: Commons RNG
> Issue Type: Task
> Components: simple
> Affects Versions: 1.3
> Reporter: Alex D Herbert
> Assignee: Alex D Herbert
> Priority: Minor
>
> The SeedFactory is used to create seeds for the random generators. To ensure
> thread safety this uses synchronized blocks around a single generator. The
> current method only generates a single int or long per synchronisation.
> Analyze the performance of this approach. The analysis will investigate
> generating multiple values inside each synchronisation around the generator.
> This analysis will also investigate methods to supplement the SeedFactory
> with fast methods to create seeds. This will use a fast seeding method to
> generate a single long value. This can be a seed for a SplitMix generator
> used to create a seed of any length.
>
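The last idea in the description, expanding a single long into a seed of any
length with a SplitMix generator, can be sketched as below. The mixing
function uses the standard SplitMix64 constants; the class and method names
are hypothetical, and how the initial long is obtained is left open because
the description does not specify it.

```java
// Hypothetical sketch; not the Commons RNG implementation.
final class SplitMixSeedSketch {
    private long state;

    SplitMixSeedSketch(long seed) {
        this.state = seed;
    }

    // Standard SplitMix64 step: advance the state by the golden-ratio
    // increment, then mix the bits of the new state.
    long nextLong() {
        long z = (state += 0x9e3779b97f4a7c15L);
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    // Expands a single long into a long[] seed of the requested size.
    static long[] createLongArray(long seed, int size) {
        final SplitMixSeedSketch rng = new SplitMixSeedSketch(seed);
        final long[] out = new long[size];
        for (int i = 0; i < size; i++) {
            out[i] = rng.nextLong();
        }
        return out;
    }

    // Expands a single long into an int[] seed, using the upper 32 bits
    // of each generated long.
    static int[] createIntArray(long seed, int size) {
        final SplitMixSeedSketch rng = new SplitMixSeedSketch(seed);
        final int[] out = new int[size];
        for (int i = 0; i < size; i++) {
            out[i] = (int) (rng.nextLong() >>> 32);
        }
        return out;
    }
}
```

The generator here is local to each call, so no synchronisation is needed
once the initial long value has been obtained.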