[
https://issues.apache.org/jira/browse/RNG-171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509450#comment-17509450
]
Alex Herbert commented on RNG-171:
----------------------------------
Benchmarked on JDK 11.0.12 on the macbook.
10 warm-up runs, 10 measurement iterations.
This is based on the current benchmark for nextBoolean/nextInt. It records the
time to generate a single boolean (or int). A baseline for the JMH overhead of
measurement has been subtracted. So the ratio should be a true relative time.
All providers benefit from the boolean cache. The fastest int provider (MSWS)
is still faster with the cache than using a sign test.
All LongProviders benefit from the int cache. The fastest long provider
(XO_RO_SHI_RO_128_PLUS) is still faster with the int cache.
Note: As stated in the user guide, the use of a cache means all the bits from
the provider are used. Thus the choice of generator should ideally be limited
to those that pass PractRand and Test U01 BigCrush.
||Source||nextBoolean||signTest||Relative||
|ISAAC|3.848|7.496|0.513|
|JDK|3.874|10.4|0.373|
|JSF_32|3.614|5.118|0.706|
|JSF_64|3.821|5.176|0.738|
|KISS|3.762|7.717|0.487|
|MSWS|3.56|4.176|0.852|
|MT|4.268|6.75|0.632|
|MT_64|4.608|7.723|0.597|
|MWC_256|3.768|5.694|0.662|
|PCG_MCG_XSH_RR_32|3.596|6.107|0.589|
|PCG_MCG_XSH_RS_32|3.6|4.364|0.825|
|PCG_RXS_M_XS_64|3.846|4.961|0.775|
|PCG_RXS_M_XS_64_OS|3.836|5.028|0.763|
|PCG_XSH_RR_32|3.609|5.62|0.642|
|PCG_XSH_RR_32_OS|3.676|5.57|0.660|
|PCG_XSH_RS_32|3.614|4.666|0.775|
|PCG_XSH_RS_32_OS|3.679|4.641|0.793|
|SFC_32|3.634|5.244|0.693|
|SFC_64|3.819|5.597|0.682|
|SPLIT_MIX_64|4.018|4.704|0.854|
|TWO_CMRES|3.843|5.883|0.653|
|WELL_1024_A|4.285|10.312|0.416|
|WELL_19937_A|3.95|11.657|0.339|
|WELL_19937_C|3.801|12.508|0.304|
|WELL_44497_A|4.208|14.879|0.283|
|WELL_44497_B|3.86|12.967|0.298|
|WELL_512_A|5.927|9.542|0.621|
|XO_RO_SHI_RO_1024_PP|3.975|6.109|0.651|
|XO_RO_SHI_RO_1024_S|4.679|5.82|0.804|
|XO_RO_SHI_RO_1024_SS|3.967|6.711|0.591|
|XO_RO_SHI_RO_128_PLUS|3.942|4.755|0.829|
|XO_RO_SHI_RO_128_PP|3.844|4.803|0.800|
|XO_RO_SHI_RO_128_SS|3.932|5.252|0.749|
|XO_RO_SHI_RO_64_S|3.712|4.38|0.847|
|XO_RO_SHI_RO_64_SS|3.715|4.857|0.765|
|XO_SHI_RO_128_PLUS|3.607|5.244|0.688|
|XO_SHI_RO_128_PP|3.593|5.252|0.684|
|XO_SHI_RO_128_SS|3.637|5.769|0.630|
|XO_SHI_RO_256_PLUS|3.93|5.285|0.744|
|XO_SHI_RO_256_PP|3.928|5.238|0.750|
|XO_SHI_RO_256_SS|3.914|5.73|0.683|
|XO_SHI_RO_512_PLUS|3.949|7.448|0.530|
|XO_SHI_RO_512_PP|3.956|7.544|0.524|
|XO_SHI_RO_512_SS|3.967|7.6|0.522|
|XOR_SHIFT_1024_S|4.028|6.006|0.671|
|XOR_SHIFT_1024_S_PHI|4.029|5.833|0.691|
||Source||nextInt||shiftLong||Relative||
|JSF_64|4.419|5.085|0.869|
|MT_64|5.842|7.293|0.801|
|PCG_RXS_M_XS_64|4.077|4.812|0.847|
|PCG_RXS_M_XS_64_OS|4.082|4.802|0.850|
|SFC_64|4.535|5.152|0.880|
|SPLIT_MIX_64|4.074|4.815|0.846|
|TWO_CMRES|4.777|5.622|0.850|
|XO_RO_SHI_RO_1024_PP|5.021|5.896|0.852|
|XO_RO_SHI_RO_1024_S|4.907|5.737|0.855|
|XO_RO_SHI_RO_1024_SS|5.105|6.328|0.807|
|XO_RO_SHI_RO_128_PLUS|3.977|4.311|0.923|
|XO_RO_SHI_RO_128_PP|4.041|4.63|0.873|
|XO_RO_SHI_RO_128_SS|4.036|4.797|0.841|
|XO_SHI_RO_256_PLUS|4.498|5.079|0.886|
|XO_SHI_RO_256_PP|4.515|5.279|0.855|
|XO_SHI_RO_256_SS|4.58|5.47|0.837|
|XO_SHI_RO_512_PLUS|5.548|7.141|0.777|
|XO_SHI_RO_512_PP|5.609|7.216|0.777|
|XO_SHI_RO_512_SS|5.485|7.347|0.747|
|XOR_SHIFT_1024_S|4.901|5.708|0.859|
|XOR_SHIFT_1024_S_PHI|4.888|5.726|0.854|
> Reduce memory footprint of cached int and boolean source
> --------------------------------------------------------
>
> Key: RNG-171
> URL: https://issues.apache.org/jira/browse/RNG-171
> Project: Commons RNG
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.4
> Reporter: Alex Herbert
> Priority: Major
> Fix For: 1.5
>
>
> The base implementation for the IntProvider and LongProvider caches values
> from the generation method to be used as the source for boolean bits, and in
> the case of the LongProvider, for int bits. The nextBoolean and nextInt
> methods thus use all the bits output by the generator. The memory footprint
> of this cache can be reduced.
> h3. Boolean
> IntProvider
> Currently a 32-bit value and a 32-bit mask is stored.
> However when the cache is refilled a single bit is used for the boolean
> return value. The cache need only store the unused 31-bits. This leaves room
> to use a bit to indicate when to reset the cache. A similar change can be
> made for the LongProvider by only storing 63-bit unused bits.
> In this example for a 64-bit cache the booleanSource is initialised as
> MIN_VALUE (a single bit in the most significant position).
> {code:java}
> @Override
> public boolean nextBoolean() {
> long l = booleanSource;
> if (l == Long.MIN_VALUE) {
> // Refill
> l = next();
> // Store unused 63 bits and a refill flag, return highest bit
> booleanSource = (l << 1) | 1;
> return l < 0;
> }
> // Shift up eventually resetting, return current high bit
> booleanSource = l << 1;
> return l < 0;
> }
> {code}
> h3. Int
> LongProvider only
> Stores a 64-bit value and a boolean flag. However only 32-bits of the stored
> value are used. Thus the remaining 32-bits can be used as a flag. In this
> example the sign bit is used as it can be easily checked with a < operator.
> {code:java}
> @Override
> public int nextInt() {
> long l = intSource;
> if (l < 0) {
> // Refill
> l =next();
> // Store low 32 bits, return high 32 bits
> intSource = l & 0xffff_ffffL;
> // OR
> // Store low 63 bits, return high 32 bits
> //intSource = (l << 1) >>> 1;
> return (int) (l >>> 32);
> }
> // Reset and return previous low bits
> intSource = -1;
> return (int) l;
> }{code}
> The example above maintains compatibility to return the upper then lower
> parts of the long for the 2 int values. An alternative with fewer operations
> is:
> {code:java}
> @Override
> public int nextInt() {
> long l = intSource;
> if (l < 0) {
> // Refill
> l = next();
> // FUNCTIONALLY BREAKING CHANGE
> // Store high 32 bits, return low 32 bits
> intSource = (l >>> 32);
> return (int) l;
> }
> // Reset and return previous low bits
> intSource = -1;
> return (int) l;
> }
> {code}
> This is arguably simpler but would be a functionally breaking change.
> Memory footprint would change:
> {noformat}
> IntProvider
> 8 bytes (int,int) -> 4 bytes (int)
> LongProvider
> 25 bytes (long,long,long,boolean) -> 16 bytes (long,long)
> {noformat}
--
This message was sent by Atlassian Jira
(v8.20.1#820001)