[ 
https://issues.apache.org/jira/browse/RNG-171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509450#comment-17509450
 ] 

Alex Herbert commented on RNG-171:
----------------------------------

Benchmarked on JDK 11.0.12 on the macbook.

10 warm-up runs, 10 measurement iterations.

This is based on the current benchmark for nextBoolean/nextInt. It records the 
time to generate a single boolean (or int). A baseline for the JMH overhead of 
measurement has been subtracted. So the ratio should be a true relative time.

All providers benefit from the boolean cache. The fastest int provider (MSWS) 
is still faster with the cache than using a sign test.

All LongProviders benefit from the int cache. The fastest long provider 
(XO_RO_SHI_RO_128_PLUS) is still faster with the int cache.

Note: As stated in the user guide, the use of a cache means all the bits from 
the provider are used. Thus the choice of generator should ideally be limited 
to those that pass PractRand and Test U01 BigCrush.
||Source||nextBoolean||signTest||Relative||
|ISAAC|3.848|7.496|0.513|
|JDK|3.874|10.4|0.373|
|JSF_32|3.614|5.118|0.706|
|JSF_64|3.821|5.176|0.738|
|KISS|3.762|7.717|0.487|
|MSWS|3.56|4.176|0.852|
|MT|4.268|6.75|0.632|
|MT_64|4.608|7.723|0.597|
|MWC_256|3.768|5.694|0.662|
|PCG_MCG_XSH_RR_32|3.596|6.107|0.589|
|PCG_MCG_XSH_RS_32|3.6|4.364|0.825|
|PCG_RXS_M_XS_64|3.846|4.961|0.775|
|PCG_RXS_M_XS_64_OS|3.836|5.028|0.763|
|PCG_XSH_RR_32|3.609|5.62|0.642|
|PCG_XSH_RR_32_OS|3.676|5.57|0.660|
|PCG_XSH_RS_32|3.614|4.666|0.775|
|PCG_XSH_RS_32_OS|3.679|4.641|0.793|
|SFC_32|3.634|5.244|0.693|
|SFC_64|3.819|5.597|0.682|
|SPLIT_MIX_64|4.018|4.704|0.854|
|TWO_CMRES|3.843|5.883|0.653|
|WELL_1024_A|4.285|10.312|0.416|
|WELL_19937_A|3.95|11.657|0.339|
|WELL_19937_C|3.801|12.508|0.304|
|WELL_44497_A|4.208|14.879|0.283|
|WELL_44497_B|3.86|12.967|0.298|
|WELL_512_A|5.927|9.542|0.621|
|XO_RO_SHI_RO_1024_PP|3.975|6.109|0.651|
|XO_RO_SHI_RO_1024_S|4.679|5.82|0.804|
|XO_RO_SHI_RO_1024_SS|3.967|6.711|0.591|
|XO_RO_SHI_RO_128_PLUS|3.942|4.755|0.829|
|XO_RO_SHI_RO_128_PP|3.844|4.803|0.800|
|XO_RO_SHI_RO_128_SS|3.932|5.252|0.749|
|XO_RO_SHI_RO_64_S|3.712|4.38|0.847|
|XO_RO_SHI_RO_64_SS|3.715|4.857|0.765|
|XO_SHI_RO_128_PLUS|3.607|5.244|0.688|
|XO_SHI_RO_128_PP|3.593|5.252|0.684|
|XO_SHI_RO_128_SS|3.637|5.769|0.630|
|XO_SHI_RO_256_PLUS|3.93|5.285|0.744|
|XO_SHI_RO_256_PP|3.928|5.238|0.750|
|XO_SHI_RO_256_SS|3.914|5.73|0.683|
|XO_SHI_RO_512_PLUS|3.949|7.448|0.530|
|XO_SHI_RO_512_PP|3.956|7.544|0.524|
|XO_SHI_RO_512_SS|3.967|7.6|0.522|
|XOR_SHIFT_1024_S|4.028|6.006|0.671|
|XOR_SHIFT_1024_S_PHI|4.029|5.833|0.691|

 
||Source||nextInt||shiftLong||Relative||
|JSF_64|4.419|5.085|0.869|
|MT_64|5.842|7.293|0.801|
|PCG_RXS_M_XS_64|4.077|4.812|0.847|
|PCG_RXS_M_XS_64_OS|4.082|4.802|0.850|
|SFC_64|4.535|5.152|0.880|
|SPLIT_MIX_64|4.074|4.815|0.846|
|TWO_CMRES|4.777|5.622|0.850|
|XO_RO_SHI_RO_1024_PP|5.021|5.896|0.852|
|XO_RO_SHI_RO_1024_S|4.907|5.737|0.855|
|XO_RO_SHI_RO_1024_SS|5.105|6.328|0.807|
|XO_RO_SHI_RO_128_PLUS|3.977|4.311|0.923|
|XO_RO_SHI_RO_128_PP|4.041|4.63|0.873|
|XO_RO_SHI_RO_128_SS|4.036|4.797|0.841|
|XO_SHI_RO_256_PLUS|4.498|5.079|0.886|
|XO_SHI_RO_256_PP|4.515|5.279|0.855|
|XO_SHI_RO_256_SS|4.58|5.47|0.837|
|XO_SHI_RO_512_PLUS|5.548|7.141|0.777|
|XO_SHI_RO_512_PP|5.609|7.216|0.777|
|XO_SHI_RO_512_SS|5.485|7.347|0.747|
|XOR_SHIFT_1024_S|4.901|5.708|0.859|
|XOR_SHIFT_1024_S_PHI|4.888|5.726|0.854|

> Reduce memory footprint of cached int and boolean source
> --------------------------------------------------------
>
>                 Key: RNG-171
>                 URL: https://issues.apache.org/jira/browse/RNG-171
>             Project: Commons RNG
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.4
>            Reporter: Alex Herbert
>            Priority: Major
>             Fix For: 1.5
>
>
> The base implementation for the IntProvider and LongProvider caches values 
> from the generation method to be used as the source for boolean bits, and in 
> the case of the LongProvider, for int bits. The nextBoolean and nextInt 
> methods thus use all the bits output by the generator. The memory footprint 
> of this cache can be reduced.
> h3. Boolean
> IntProvider
> Currently a 32-bit value and a 32-bit mask is stored.
> However when the cache is refilled a single bit is used for the boolean 
> return value. The cache need only store the unused 31-bits. This leaves room 
> to use a bit to indicate when to reset the cache. A similar change can be 
> made for the LongProvider by only storing 63-bit unused bits.
> In this example for a 64-bit cache the booleanSource is initialised as 
> MIN_VALUE (a single bit in the most significant position).
> {code:java}
> @Override
> public boolean nextBoolean() {
>     long l = booleanSource;
>     if (l == Long.MIN_VALUE) {
>         // Refill
>         l = next();
>         // Store unused 63 bits and a refill flag, return highest bit
>         booleanSource = (l << 1) | 1;
>         return l < 0;
>     }
>     // Shift up eventually resetting, return current high bit
>     booleanSource = l << 1;
>     return l < 0;
> }
> {code}
> h3. Int
> LongProvider only
> Stores a 64-bit value and a boolean flag. However only 32-bits of the stored 
> value are used. Thus the remaining 32-bits can be used as a flag. In this 
> example the sign bit is used as it can be easily checked with a < operator.
> {code:java}
> @Override
> public int nextInt() {
>     long l = intSource;
>     if (l < 0) {
>         // Refill
>         l =next();
>         // Store low 32 bits, return high 32 bits
>         intSource = l & 0xffff_ffffL;
>         // OR
>         // Store low 63 bits, return high 32 bits
>         //intSource = (l << 1) >>> 1;
>         return (int) (l >>> 32);
>     }
>     // Reset and return previous low bits
>     intSource = -1;
>     return (int) l;
> }{code}
> The example above maintains compatibility to return the upper then lower 
> parts of the long for the 2 int values. An alternative with fewer operations 
> is:
> {code:java}
> @Override
> public int nextInt() {
>     long l = intSource;
>     if (l < 0) {
>         // Refill
>         l = next();
>         // FUNCTIONALLY BREAKING CHANGE
>         // Store high 32 bits, return low 32 bits
>         intSource = (l >>> 32);
>         return (int) l;
>     }
>     // Reset and return previous low bits
>     intSource = -1;
>     return (int) l;
> }
> {code}
> This is arguably simpler but would be a functionally breaking change.
> Memory footprint would change:
> {noformat}
> IntProvider
> 8 bytes (int,int)                 -> 4 bytes (int)
> LongProvider
> 25 bytes (long,long,long,boolean) -> 16 bytes (long,long)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to