[jira] [Comment Edited] (RNG-50) PoissonSampler single use speed improvements

Alex D Herbert (JIRA) Wed, 01 Aug 2018 03:20:22 -0700


    [ 
https://issues.apache.org/jira/browse/RNG-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16565098#comment-16565098
 ]


Alex D Herbert edited comment on RNG-50 at 8/1/18 10:19 AM:
------------------------------------------------------------

{quote}We keep the name PoissonSampler and change the implementation.
{quote}
OK. I'll do that today.
{quote}Why is it needed for benchmarking?
{quote}
It depends what you want to achieve. If you just want to time how long it takes 
to do N samples then no. If you want to time how long it takes to do the same N 
samples then yes.

In the {{SamplersPerformance}} the benchmark is just to check how long sampling 
takes with different RNGs. So no.

In my case I am directly comparing two classes that produce the exact same 
samples with the same RNG. So it helps to reset the RNG. This ensures the 
algorithm loops will be used in the same way.
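
As a minimal sketch of what resetting the RNG buys here (this is not the actual JMH benchmark): {{java.util.Random}} stands in for the Commons RNG provider, and the class and method names are made up for illustration. Two variants of a multiplication-based small mean Poisson loop, differing only in the counter type, are each run from a generator created with the same seed. Both then consume the same uniform deviates and take the same number of loop iterations, so any timing difference comes from the code and not from the random path taken.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative only: java.util.Random stands in for the Commons RNG provider. */
final class ResetRngSketch {

    /** Poisson sample by multiplying uniform deviates; loop counter held as a long. */
    static int sampleWithLongCounter(Random rng, double mean) {
        final double limit = Math.exp(-mean);
        long n = 0;
        double product = rng.nextDouble();
        while (product > limit) {
            product *= rng.nextDouble();
            n++;
        }
        return (int) n;
    }

    /** The same loop with an int counter. */
    static int sampleWithIntCounter(Random rng, double mean) {
        final double limit = Math.exp(-mean);
        int n = 0;
        double product = rng.nextDouble();
        while (product > limit) {
            product *= rng.nextDouble();
            n++;
        }
        return n;
    }

    /** Draws count samples from a generator freshly created from the given seed. */
    static int[] run(boolean intCounter, long seed, double mean, int count) {
        final Random rng = new Random(seed); // the "reset": same seed, same sequence
        final int[] samples = new int[count];
        for (int i = 0; i < count; i++) {
            samples[i] = intCounter
                ? sampleWithIntCounter(rng, mean)
                : sampleWithLongCounter(rng, mean);
        }
        return samples;
    }

    public static void main(String[] args) {
        final int[] a = run(false, 42L, 5.3, 1000);
        final int[] b = run(true, 42L, 5.3, 1000);
        System.out.println("same samples: " + Arrays.equals(a, b));
    }
}
```

Without the reseed the second variant would continue from wherever the first left the generator, and the two runs would no longer be doing the same work.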

I ran the benchmark 50 times for repeat use or single use. The following table 
shows relative performance within the group of [Source+Use+Mean]:

||Source||Use||Mean||Name||Relative Score||
|SPLIT_MIX_64|Repeat|5.3|PoissonSampler|1|
|SPLIT_MIX_64|Repeat|5.3|WrapperPoissonSampler|0.43323176751383|
|SPLIT_MIX_64|Repeat|5.3|SmallMeanPoissonSampler|0.417177345476191|
|SPLIT_MIX_64|Repeat|20.1|PoissonSampler|1|
|SPLIT_MIX_64|Repeat|20.1|WrapperPoissonSampler|0.64270924157568|
|SPLIT_MIX_64|Repeat|20.1|SmallMeanPoissonSampler|0.632731209052441|
|SPLIT_MIX_64|Repeat|35.7|PoissonSampler|1|
|SPLIT_MIX_64|Repeat|35.7|WrapperPoissonSampler|0.728838753545663|
|SPLIT_MIX_64|Repeat|35.7|SmallMeanPoissonSampler|0.72432464048095|
|SPLIT_MIX_64|Repeat|40.3|PoissonSampler|1|
|SPLIT_MIX_64|Repeat|40.3|WrapperPoissonSampler|0.430599850392507|
|SPLIT_MIX_64|Repeat|40.3|LargeMeanPoissonSampler|0.418219761100271|
|SPLIT_MIX_64|Repeat|60.9|PoissonSampler|1|
|SPLIT_MIX_64|Repeat|60.9|WrapperPoissonSampler|0.428623656039417|
|SPLIT_MIX_64|Repeat|60.9|LargeMeanPoissonSampler|0.420101837541331|
|SPLIT_MIX_64|Repeat|142.3|PoissonSampler|1|
|SPLIT_MIX_64|Repeat|142.3|LargeMeanPoissonSampler|0.397325774356007|
|SPLIT_MIX_64|Repeat|142.3|WrapperPoissonSampler|0.394901813798589|
|SPLIT_MIX_64|Single|5.3|PoissonSampler|1|
|SPLIT_MIX_64|Single|5.3|WrapperPoissonSampler|0.852853749721316|
|SPLIT_MIX_64|Single|5.3|SmallMeanPoissonSampler|0.789007728438008|
|SPLIT_MIX_64|Single|20.1|PoissonSampler|1|
|SPLIT_MIX_64|Single|20.1|WrapperPoissonSampler|0.907398568647412|
|SPLIT_MIX_64|Single|20.1|SmallMeanPoissonSampler|0.853113896432368|
|SPLIT_MIX_64|Single|35.7|PoissonSampler|1|
|SPLIT_MIX_64|Single|35.7|WrapperPoissonSampler|0.940808492515571|
|SPLIT_MIX_64|Single|35.7|SmallMeanPoissonSampler|0.913374950144956|
|SPLIT_MIX_64|Single|40.3|PoissonSampler|1|
|SPLIT_MIX_64|Single|40.3|LargeMeanPoissonSampler|0.497462356322885|
|SPLIT_MIX_64|Single|40.3|WrapperPoissonSampler|0.494328324915481|
|SPLIT_MIX_64|Single|60.9|PoissonSampler|1|
|SPLIT_MIX_64|Single|60.9|LargeMeanPoissonSampler|0.364819009315596|
|SPLIT_MIX_64|Single|60.9|WrapperPoissonSampler|0.363949055950517|
|SPLIT_MIX_64|Single|142.3|PoissonSampler|1|
|SPLIT_MIX_64|Single|142.3|LargeMeanPoissonSampler|0.276494835605773|
|SPLIT_MIX_64|Single|142.3|WrapperPoissonSampler|0.273537609763915|
|WELL_19937_C|Repeat|5.3|PoissonSampler|1|
|WELL_19937_C|Repeat|5.3|WrapperPoissonSampler|0.814222792528576|
|WELL_19937_C|Repeat|5.3|SmallMeanPoissonSampler|0.733096117447755|
|WELL_19937_C|Repeat|20.1|WrapperPoissonSampler|1|
|WELL_19937_C|Repeat|20.1|PoissonSampler|0.984065218429984|
|WELL_19937_C|Repeat|20.1|SmallMeanPoissonSampler|0.869601102904367|
|WELL_19937_C|Repeat|35.7|PoissonSampler|1|
|WELL_19937_C|Repeat|35.7|WrapperPoissonSampler|0.953292758663224|
|WELL_19937_C|Repeat|35.7|SmallMeanPoissonSampler|0.931605181685712|
|WELL_19937_C|Repeat|40.3|PoissonSampler|1|
|WELL_19937_C|Repeat|40.3|WrapperPoissonSampler|0.458962363853179|
|WELL_19937_C|Repeat|40.3|LargeMeanPoissonSampler|0.45813232265573|
|WELL_19937_C|Repeat|60.9|PoissonSampler|1|
|WELL_19937_C|Repeat|60.9|WrapperPoissonSampler|0.471445071376392|
|WELL_19937_C|Repeat|60.9|LargeMeanPoissonSampler|0.45689017354128|
|WELL_19937_C|Repeat|142.3|PoissonSampler|1|
|WELL_19937_C|Repeat|142.3|WrapperPoissonSampler|0.442641901291292|
|WELL_19937_C|Repeat|142.3|LargeMeanPoissonSampler|0.440315608121226|
|WELL_19937_C|Single|5.3|WrapperPoissonSampler|1|
|WELL_19937_C|Single|5.3|PoissonSampler|0.983305632700587|
|WELL_19937_C|Single|5.3|SmallMeanPoissonSampler|0.947038102440451|
|WELL_19937_C|Single|20.1|WrapperPoissonSampler|1|
|WELL_19937_C|Single|20.1|PoissonSampler|0.915537463505366|
|WELL_19937_C|Single|20.1|SmallMeanPoissonSampler|0.898370831974599|
|WELL_19937_C|Single|35.7|PoissonSampler|1|
|WELL_19937_C|Single|35.7|WrapperPoissonSampler|0.994369807448147|
|WELL_19937_C|Single|35.7|SmallMeanPoissonSampler|0.991845798740678|
|WELL_19937_C|Single|40.3|PoissonSampler|1|
|WELL_19937_C|Single|40.3|LargeMeanPoissonSampler|0.516493476245896|
|WELL_19937_C|Single|40.3|WrapperPoissonSampler|0.514790856642155|
|WELL_19937_C|Single|60.9|PoissonSampler|1|
|WELL_19937_C|Single|60.9|WrapperPoissonSampler|0.400123977410898|
|WELL_19937_C|Single|60.9|LargeMeanPoissonSampler|0.397006185061951|
|WELL_19937_C|Single|142.3|PoissonSampler|1|
|WELL_19937_C|Single|142.3|WrapperPoissonSampler|0.319267455996389|
|WELL_19937_C|Single|142.3|LargeMeanPoissonSampler|0.312641547735181|
|WELL_44497_B|Repeat|5.3|PoissonSampler|1|
|WELL_44497_B|Repeat|5.3|WrapperPoissonSampler|0.804604033070474|
|WELL_44497_B|Repeat|5.3|SmallMeanPoissonSampler|0.734683100088343|
|WELL_44497_B|Repeat|20.1|PoissonSampler|1|
|WELL_44497_B|Repeat|20.1|WrapperPoissonSampler|0.894314294228112|
|WELL_44497_B|Repeat|20.1|SmallMeanPoissonSampler|0.889294612537888|
|WELL_44497_B|Repeat|35.7|PoissonSampler|1|
|WELL_44497_B|Repeat|35.7|SmallMeanPoissonSampler|0.924749377980263|
|WELL_44497_B|Repeat|35.7|WrapperPoissonSampler|0.923739790541932|
|WELL_44497_B|Repeat|40.3|PoissonSampler|1|
|WELL_44497_B|Repeat|40.3|WrapperPoissonSampler|0.481675060745583|
|WELL_44497_B|Repeat|40.3|LargeMeanPoissonSampler|0.478115041992058|
|WELL_44497_B|Repeat|60.9|PoissonSampler|1|
|WELL_44497_B|Repeat|60.9|LargeMeanPoissonSampler|0.471153366451228|
|WELL_44497_B|Repeat|60.9|WrapperPoissonSampler|0.468242614221675|
|WELL_44497_B|Repeat|142.3|PoissonSampler|1|
|WELL_44497_B|Repeat|142.3|LargeMeanPoissonSampler|0.446657858746463|
|WELL_44497_B|Repeat|142.3|WrapperPoissonSampler|0.444850363127775|
|WELL_44497_B|Single|5.3|WrapperPoissonSampler|1|
|WELL_44497_B|Single|5.3|PoissonSampler|0.974634788197136|
|WELL_44497_B|Single|5.3|SmallMeanPoissonSampler|0.925716568187496|
|WELL_44497_B|Single|20.1|WrapperPoissonSampler|1|
|WELL_44497_B|Single|20.1|PoissonSampler|0.932624919274866|
|WELL_44497_B|Single|20.1|SmallMeanPoissonSampler|0.921849035227966|
|WELL_44497_B|Single|35.7|PoissonSampler|1|
|WELL_44497_B|Single|35.7|WrapperPoissonSampler|0.975418737952865|
|WELL_44497_B|Single|35.7|SmallMeanPoissonSampler|0.974960890175136|
|WELL_44497_B|Single|40.3|PoissonSampler|1|
|WELL_44497_B|Single|40.3|LargeMeanPoissonSampler|0.505995036533615|
|WELL_44497_B|Single|40.3|WrapperPoissonSampler|0.501050657913843|
|WELL_44497_B|Single|60.9|PoissonSampler|1|
|WELL_44497_B|Single|60.9|WrapperPoissonSampler|0.402856380694965|
|WELL_44497_B|Single|60.9|LargeMeanPoissonSampler|0.401878567674722|
|WELL_44497_B|Single|142.3|PoissonSampler|1|
|WELL_44497_B|Single|142.3|WrapperPoissonSampler|0.319770844045709|
|WELL_44497_B|Single|142.3|LargeMeanPoissonSampler|0.313800282155152|

Observations:
 * The small mean sampler is faster than the original PoissonSampler.
 * The large mean sampler is faster than the original PoissonSampler.
 * Using the wrapper has virtually no impact on the speed.
 * Looking at the different RNGs from fastest to slowest (SPLIT_MIX_64, 
WELL_19937_C, WELL_44497_B) it can be seen that the speed-up shrinks as the RNG 
gets slower. So the RNG is taking a fair chunk of the sampling time.
 * The advantage of the large mean sampler increases as the mean increases. 
This is true for single use (where the PoissonSampler is computing a largely 
unused cache) and repeat use where the prior results show that the cache is 
used less than 0.3% of the time.
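
For reference, a sketch of how relative scores like those in the table above can be derived. It assumes the score is a time (lower is faster) and that each group is divided by its slowest entry, which is what the rows scoring 1 suggest. The class name and the timings in {{main}} are made-up placeholders, not values from [^jmh-result.csv].

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: normalise the timings of one group by the slowest entry. */
final class RelativeScoreSketch {

    /** Returns each timing divided by the largest timing in the group. */
    static Map<String, Double> relativeToSlowest(Map<String, Double> timings) {
        double max = 0;
        for (double t : timings.values()) {
            max = Math.max(max, t);
        }
        if (max <= 0) {
            throw new IllegalArgumentException("need at least one positive timing");
        }
        final Map<String, Double> relative = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : timings.entrySet()) {
            relative.put(e.getKey(), e.getValue() / max);
        }
        return relative;
    }

    public static void main(String[] args) {
        // Placeholder timings for a single [Source+Use+Mean] group.
        final Map<String, Double> group = new LinkedHashMap<>();
        group.put("PoissonSampler", 200.0);
        group.put("WrapperPoissonSampler", 86.0);
        group.put("SmallMeanPoissonSampler", 84.0);
        relativeToSlowest(group).forEach((name, score) ->
            System.out.println(name + " " + score));
    }
}
```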

Comparing the single use to the repeat use with all timings relative to the 
repeat use of the original PoissonSampler within the group of [Source+Mean]:

||Source||Mean||Name||Relative Score||
|SPLIT_MIX_64|5.3|Single PoissonSampler|1.30579709264129|
|SPLIT_MIX_64|5.3|Single WrapperPoissonSampler|1.11365394683431|
|SPLIT_MIX_64|5.3|Single SmallMeanPoissonSampler|1.03028399786586|
|SPLIT_MIX_64|5.3|Repeat PoissonSampler|1|
|SPLIT_MIX_64|5.3|Repeat WrapperPoissonSampler|0.43323176751383|
|SPLIT_MIX_64|5.3|Repeat SmallMeanPoissonSampler|0.417177345476191|
|SPLIT_MIX_64|20.1|Single PoissonSampler|1.20343026132986|
|SPLIT_MIX_64|20.1|Single WrapperPoissonSampler|1.09199089659769|
|SPLIT_MIX_64|20.1|Single SmallMeanPoissonSampler|1.02666307932774|
|SPLIT_MIX_64|20.1|Repeat PoissonSampler|1|
|SPLIT_MIX_64|20.1|Repeat WrapperPoissonSampler|0.64270924157568|
|SPLIT_MIX_64|20.1|Repeat SmallMeanPoissonSampler|0.632731209052441|
|SPLIT_MIX_64|35.7|Single PoissonSampler|1.14250882410445|
|SPLIT_MIX_64|35.7|Single WrapperPoissonSampler|1.07488200449145|
|SPLIT_MIX_64|35.7|Single SmallMeanPoissonSampler|1.04353894025658|
|SPLIT_MIX_64|35.7|Repeat PoissonSampler|1|
|SPLIT_MIX_64|35.7|Repeat WrapperPoissonSampler|0.728838753545663|
|SPLIT_MIX_64|35.7|Repeat SmallMeanPoissonSampler|0.72432464048095|
|SPLIT_MIX_64|40.3|Single PoissonSampler|2.73331565485164|
|SPLIT_MIX_64|40.3|Single LargeMeanPoissonSampler|1.35972164623673|
|SPLIT_MIX_64|40.3|Single WrapperPoissonSampler|1.35115534912807|
|SPLIT_MIX_64|40.3|Repeat PoissonSampler|1|
|SPLIT_MIX_64|40.3|Repeat WrapperPoissonSampler|0.430599850392507|
|SPLIT_MIX_64|40.3|Repeat LargeMeanPoissonSampler|0.418219761100271|
|SPLIT_MIX_64|60.9|Single PoissonSampler|3.6398342292321|
|SPLIT_MIX_64|60.9|Single LargeMeanPoissonSampler|1.32788071758145|
|SPLIT_MIX_64|60.9|Single WrapperPoissonSampler|1.3247142315454|
|SPLIT_MIX_64|60.9|Repeat PoissonSampler|1|
|SPLIT_MIX_64|60.9|Repeat WrapperPoissonSampler|0.428623656039417|
|SPLIT_MIX_64|60.9|Repeat LargeMeanPoissonSampler|0.420101837541331|
|SPLIT_MIX_64|142.3|Single PoissonSampler|4.83630203157899|
|SPLIT_MIX_64|142.3|Single LargeMeanPoissonSampler|1.3372125351613|
|SPLIT_MIX_64|142.3|Single WrapperPoissonSampler|1.32291049781448|
|SPLIT_MIX_64|142.3|Repeat PoissonSampler|1|
|SPLIT_MIX_64|142.3|Repeat LargeMeanPoissonSampler|0.397325774356007|
|SPLIT_MIX_64|142.3|Repeat WrapperPoissonSampler|0.394901813798589|
|WELL_19937_C|5.3|Single WrapperPoissonSampler|1.13142946141638|
|WELL_19937_C|5.3|Single PoissonSampler|1.11254096241411|
|WELL_19937_C|5.3|Single SmallMeanPoissonSampler|1.07150681018499|
|WELL_19937_C|5.3|Repeat PoissonSampler|1|
|WELL_19937_C|5.3|Repeat WrapperPoissonSampler|0.814222792528576|
|WELL_19937_C|5.3|Repeat SmallMeanPoissonSampler|0.733096117447755|
|WELL_19937_C|20.1|Single WrapperPoissonSampler|1.14340653099588|
|WELL_19937_C|20.1|Single PoissonSampler|1.04683151514344|
|WELL_19937_C|20.1|Single SmallMeanPoissonSampler|1.02720307653596|
|WELL_19937_C|20.1|Repeat WrapperPoissonSampler|1.01619281046783|
|WELL_19937_C|20.1|Repeat PoissonSampler|1|
|WELL_19937_C|20.1|Repeat SmallMeanPoissonSampler|0.883682388746309|
|WELL_19937_C|35.7|Single PoissonSampler|1.02906569480566|
|WELL_19937_C|35.7|Single WrapperPoissonSampler|1.0232718567954|
|WELL_19937_C|35.7|Single SmallMeanPoissonSampler|1.02067448602115|
|WELL_19937_C|35.7|Repeat PoissonSampler|1|
|WELL_19937_C|35.7|Repeat WrapperPoissonSampler|0.953292758663224|
|WELL_19937_C|35.7|Repeat SmallMeanPoissonSampler|0.931605181685712|
|WELL_19937_C|40.3|Single PoissonSampler|2.57485236349569|
|WELL_19937_C|40.3|Single LargeMeanPoissonSampler|1.32989444804185|
|WELL_19937_C|40.3|Single WrapperPoissonSampler|1.32551045393102|
|WELL_19937_C|40.3|Repeat PoissonSampler|1|
|WELL_19937_C|40.3|Repeat WrapperPoissonSampler|0.458962363853179|
|WELL_19937_C|40.3|Repeat LargeMeanPoissonSampler|0.45813232265573|
|WELL_19937_C|60.9|Single PoissonSampler|3.28174959565948|
|WELL_19937_C|60.9|Single WrapperPoissonSampler|1.31310670108188|
|WELL_19937_C|60.9|Single LargeMeanPoissonSampler|1.30287488730137|
|WELL_19937_C|60.9|Repeat PoissonSampler|1|
|WELL_19937_C|60.9|Repeat WrapperPoissonSampler|0.471445071376392|
|WELL_19937_C|60.9|Repeat LargeMeanPoissonSampler|0.45689017354128|
|WELL_19937_C|142.3|Single PoissonSampler|4.19534277243741|
|WELL_19937_C|142.3|Single WrapperPoissonSampler|1.33943641398893|
|WELL_19937_C|142.3|Single LargeMeanPoissonSampler|1.31163845765444|
|WELL_19937_C|142.3|Repeat PoissonSampler|1|
|WELL_19937_C|142.3|Repeat WrapperPoissonSampler|0.442641901291292|
|WELL_19937_C|142.3|Repeat LargeMeanPoissonSampler|0.440315608121226|
|WELL_44497_B|5.3|Single WrapperPoissonSampler|1.1263780335341|
|WELL_44497_B|5.3|Single PoissonSampler|1.09780721614341|
|WELL_44497_B|5.3|Single SmallMeanPoissonSampler|1.04270680768496|
|WELL_44497_B|5.3|Repeat PoissonSampler|1|
|WELL_44497_B|5.3|Repeat WrapperPoissonSampler|0.804604033070474|
|WELL_44497_B|5.3|Repeat SmallMeanPoissonSampler|0.734683100088343|
|WELL_44497_B|20.1|Single WrapperPoissonSampler|1.11707788657526|
|WELL_44497_B|20.1|Single PoissonSampler|1.04181467379099|
|WELL_44497_B|20.1|Single SmallMeanPoissonSampler|1.0297771720139|
|WELL_44497_B|20.1|Repeat PoissonSampler|1|
|WELL_44497_B|20.1|Repeat WrapperPoissonSampler|0.894314294228112|
|WELL_44497_B|20.1|Repeat SmallMeanPoissonSampler|0.889294612537888|
|WELL_44497_B|35.7|Single PoissonSampler|1.0262099604823|
|WELL_44497_B|35.7|Single WrapperPoissonSampler|1.0009844245283|
|WELL_44497_B|35.7|Single SmallMeanPoissonSampler|1.00051457657841|
|WELL_44497_B|35.7|Repeat PoissonSampler|1|
|WELL_44497_B|35.7|Repeat SmallMeanPoissonSampler|0.924749377980263|
|WELL_44497_B|35.7|Repeat WrapperPoissonSampler|0.923739790541932|
|WELL_44497_B|40.3|Single PoissonSampler|2.7329010610034|
|WELL_44497_B|40.3|Single LargeMeanPoissonSampler|1.38283437220517|
|WELL_44497_B|40.3|Single WrapperPoissonSampler|1.36932187462919|
|WELL_44497_B|40.3|Repeat PoissonSampler|1|
|WELL_44497_B|40.3|Repeat WrapperPoissonSampler|0.481675060745583|
|WELL_44497_B|40.3|Repeat LargeMeanPoissonSampler|0.478115041992058|
|WELL_44497_B|60.9|Single PoissonSampler|3.26060908572745|
|WELL_44497_B|60.9|Single WrapperPoissonSampler|1.31355717513728|
|WELL_44497_B|60.9|Single LargeMeanPoissonSampler|1.31036890911933|
|WELL_44497_B|60.9|Repeat PoissonSampler|1|
|WELL_44497_B|60.9|Repeat LargeMeanPoissonSampler|0.471153366451228|
|WELL_44497_B|60.9|Repeat WrapperPoissonSampler|0.468242614221675|
|WELL_44497_B|142.3|Single PoissonSampler|4.20264531572208|
|WELL_44497_B|142.3|Single WrapperPoissonSampler|1.3438834398332|
|WELL_44497_B|142.3|Single LargeMeanPoissonSampler|1.31879128587162|
|WELL_44497_B|142.3|Repeat PoissonSampler|1|
|WELL_44497_B|142.3|Repeat LargeMeanPoissonSampler|0.446657858746463|
|WELL_44497_B|142.3|Repeat WrapperPoissonSampler|0.444850363127775|

Observations:

* Repeat use is faster.
* Using the wrapper has virtually no impact on the speed.
* Single use of the original PoissonSampler with mean above 40 has a large 
penalty (due to cache computation) of roughly 2.6 to 4.8 fold, growing with the 
mean.
* Single use of the large mean sampler has a penalty of 30-40% over repeat use 
of the original PoissonSampler. This may be partly due to some advantage of the 
cache that has been constructed (used in 0.7% or less of the samples) but is more 
likely due to the overhead of instantiating the LargeMeanPoissonSampler and 
the three distributions that are used internally.
* Single use of the small mean sampler has a penalty of 2-7% over repeat use of 
the original PoissonSampler. However, single use of the original PoissonSampler 
also has a low penalty of 2-30%. The penalty is greater when using the 
SPLIT_MIX_64 RNG, showing the penalty is a combination of the instantiation and 
the algorithm loop. When the RNG is faster the penalty is greater due to the 
improvement in the algorithm (it has been changed from using {{long}} arithmetic 
to {{int}}).
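
To make the cache-related penalty concrete, here is a rough sketch of single use with a per-instance log factorial table versus a table built once and shared. It is not the Commons RNG code: {{CacheCostSketch}}, {{LogFactorialTable}} and the build counter are hypothetical names, and only the table size of 2*PIVOT with PIVOT=40 is taken from the issue description.

```java
/** Illustrative only: counts how often a log(n!) table is built under each usage. */
final class CacheCostSketch {
    static final int PIVOT = 40;

    /** Number of tables constructed so far (declared before SHARED on purpose). */
    static int tablesBuilt = 0;

    /** Precomputed log(n!) for n in [0, size). */
    static final class LogFactorialTable {
        private final double[] values;

        LogFactorialTable(int size) {
            values = new double[size];
            // values[0] and values[1] stay 0 because log(0!) = log(1!) = 0.
            for (int i = 2; i < size; i++) {
                values[i] = values[i - 1] + Math.log(i);
            }
            tablesBuilt++;
        }

        double value(int n) {
            return values[n];
        }
    }

    /** Built once when the class is initialised. */
    private static final LogFactorialTable SHARED = new LogFactorialTable(2 * PIVOT);

    static double logFactorial(int n) {
        return SHARED.value(n);
    }

    /** Single use: every "sampler" builds its own table before one lookup. */
    static double singleUseWithOwnTable(int samplers) {
        double sum = 0;
        for (int i = 0; i < samplers; i++) {
            sum += new LogFactorialTable(2 * PIVOT).value(PIVOT + 1);
        }
        return sum;
    }

    /** Single use against the shared table: no per-sampler precomputation. */
    static double singleUseWithSharedTable(int samplers) {
        double sum = 0;
        for (int i = 0; i < samplers; i++) {
            sum += SHARED.value(PIVOT + 1);
        }
        return sum;
    }

    public static void main(String[] args) {
        int before = tablesBuilt;
        singleUseWithOwnTable(1000);
        System.out.println("own table builds: " + (tablesBuilt - before));
        before = tablesBuilt;
        singleUseWithSharedTable(1000);
        System.out.println("shared table builds: " + (tablesBuilt - before));
    }
}
```

Each per-instance build fills 2*PIVOT entries with a {{Math.log}} call, which is the kind of work the single use timings of the original PoissonSampler include for means above 40.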

I attach the raw CSV file from the benchmark ([^jmh-result.csv]).




was (Author: alexherbert):
{quote}We keep the name PoissonSampler and change the implementation.
{quote}
OK. I'll do that today.
{quote}Why is it needed for benchmarking?
{quote}
It depends what you want to achieve. If you just want to time how long it takes 
to do N samples then no. If you want to time how long it takes to do the same N 
samples then yes.

In the {{SamplersPerformance}} the benchmark is just to check how long sampling 
takes with different RNGs. So no.

In my case I am directly comparing two classes that produce the exact same 
samples with the same RNG. So it helps to reset the RNG. This ensures the 
algorithm loops will be used in the same way.

I ran the benchmark 50 times for repeat use or single use. The following table 
shows relative performance within the group of [Source+Use+Mean]:

||Source||Use||Mean||Name||Relative Score||
|SPLIT_MIX_64|Repeat|5.3|PoissonSampler|1|
|SPLIT_MIX_64|Repeat|5.3|WrapperPoissonSampler|0.43323176751383|
|SPLIT_MIX_64|Repeat|5.3|SmallMeanPoissonSampler|0.417177345476191|
|SPLIT_MIX_64|Repeat|20.1|PoissonSampler|1|
|SPLIT_MIX_64|Repeat|20.1|WrapperPoissonSampler|0.64270924157568|
|SPLIT_MIX_64|Repeat|20.1|SmallMeanPoissonSampler|0.632731209052441|
|SPLIT_MIX_64|Repeat|35.7|PoissonSampler|1|
|SPLIT_MIX_64|Repeat|35.7|WrapperPoissonSampler|0.728838753545663|
|SPLIT_MIX_64|Repeat|35.7|SmallMeanPoissonSampler|0.72432464048095|
|SPLIT_MIX_64|Repeat|40.3|PoissonSampler|1|
|SPLIT_MIX_64|Repeat|40.3|WrapperPoissonSampler|0.430599850392507|
|SPLIT_MIX_64|Repeat|40.3|LargeMeanPoissonSampler|0.418219761100271|
|SPLIT_MIX_64|Repeat|60.9|PoissonSampler|1|
|SPLIT_MIX_64|Repeat|60.9|WrapperPoissonSampler|0.428623656039417|
|SPLIT_MIX_64|Repeat|60.9|LargeMeanPoissonSampler|0.420101837541331|
|SPLIT_MIX_64|Repeat|142.3|PoissonSampler|1|
|SPLIT_MIX_64|Repeat|142.3|LargeMeanPoissonSampler|0.397325774356007|
|SPLIT_MIX_64|Repeat|142.3|WrapperPoissonSampler|0.394901813798589|
|SPLIT_MIX_64|Single|5.3|PoissonSampler|1|
|SPLIT_MIX_64|Single|5.3|WrapperPoissonSampler|0.852853749721316|
|SPLIT_MIX_64|Single|5.3|SmallMeanPoissonSampler|0.789007728438008|
|SPLIT_MIX_64|Single|20.1|PoissonSampler|1|
|SPLIT_MIX_64|Single|20.1|WrapperPoissonSampler|0.907398568647412|
|SPLIT_MIX_64|Single|20.1|SmallMeanPoissonSampler|0.853113896432368|
|SPLIT_MIX_64|Single|35.7|PoissonSampler|1|
|SPLIT_MIX_64|Single|35.7|WrapperPoissonSampler|0.940808492515571|
|SPLIT_MIX_64|Single|35.7|SmallMeanPoissonSampler|0.913374950144956|
|SPLIT_MIX_64|Single|40.3|PoissonSampler|1|
|SPLIT_MIX_64|Single|40.3|LargeMeanPoissonSampler|0.497462356322885|
|SPLIT_MIX_64|Single|40.3|WrapperPoissonSampler|0.494328324915481|
|SPLIT_MIX_64|Single|60.9|PoissonSampler|1|
|SPLIT_MIX_64|Single|60.9|LargeMeanPoissonSampler|0.364819009315596|
|SPLIT_MIX_64|Single|60.9|WrapperPoissonSampler|0.363949055950517|
|SPLIT_MIX_64|Single|142.3|PoissonSampler|1|
|SPLIT_MIX_64|Single|142.3|LargeMeanPoissonSampler|0.276494835605773|
|SPLIT_MIX_64|Single|142.3|WrapperPoissonSampler|0.273537609763915|
|WELL_19937_C|Repeat|5.3|PoissonSampler|1|
|WELL_19937_C|Repeat|5.3|WrapperPoissonSampler|0.814222792528576|
|WELL_19937_C|Repeat|5.3|SmallMeanPoissonSampler|0.733096117447755|
|WELL_19937_C|Repeat|20.1|WrapperPoissonSampler|1|
|WELL_19937_C|Repeat|20.1|PoissonSampler|0.984065218429984|
|WELL_19937_C|Repeat|20.1|SmallMeanPoissonSampler|0.869601102904367|
|WELL_19937_C|Repeat|35.7|PoissonSampler|1|
|WELL_19937_C|Repeat|35.7|WrapperPoissonSampler|0.953292758663224|
|WELL_19937_C|Repeat|35.7|SmallMeanPoissonSampler|0.931605181685712|
|WELL_19937_C|Repeat|40.3|PoissonSampler|1|
|WELL_19937_C|Repeat|40.3|WrapperPoissonSampler|0.458962363853179|
|WELL_19937_C|Repeat|40.3|LargeMeanPoissonSampler|0.45813232265573|
|WELL_19937_C|Repeat|60.9|PoissonSampler|1|
|WELL_19937_C|Repeat|60.9|WrapperPoissonSampler|0.471445071376392|
|WELL_19937_C|Repeat|60.9|LargeMeanPoissonSampler|0.45689017354128|
|WELL_19937_C|Repeat|142.3|PoissonSampler|1|
|WELL_19937_C|Repeat|142.3|WrapperPoissonSampler|0.442641901291292|
|WELL_19937_C|Repeat|142.3|LargeMeanPoissonSampler|0.440315608121226|
|WELL_19937_C|Single|5.3|WrapperPoissonSampler|1|
|WELL_19937_C|Single|5.3|PoissonSampler|0.983305632700587|
|WELL_19937_C|Single|5.3|SmallMeanPoissonSampler|0.947038102440451|
|WELL_19937_C|Single|20.1|WrapperPoissonSampler|1|
|WELL_19937_C|Single|20.1|PoissonSampler|0.915537463505366|
|WELL_19937_C|Single|20.1|SmallMeanPoissonSampler|0.898370831974599|
|WELL_19937_C|Single|35.7|PoissonSampler|1|
|WELL_19937_C|Single|35.7|WrapperPoissonSampler|0.994369807448147|
|WELL_19937_C|Single|35.7|SmallMeanPoissonSampler|0.991845798740678|
|WELL_19937_C|Single|40.3|PoissonSampler|1|
|WELL_19937_C|Single|40.3|LargeMeanPoissonSampler|0.516493476245896|
|WELL_19937_C|Single|40.3|WrapperPoissonSampler|0.514790856642155|
|WELL_19937_C|Single|60.9|PoissonSampler|1|
|WELL_19937_C|Single|60.9|WrapperPoissonSampler|0.400123977410898|
|WELL_19937_C|Single|60.9|LargeMeanPoissonSampler|0.397006185061951|
|WELL_19937_C|Single|142.3|PoissonSampler|1|
|WELL_19937_C|Single|142.3|WrapperPoissonSampler|0.319267455996389|
|WELL_19937_C|Single|142.3|LargeMeanPoissonSampler|0.312641547735181|
|WELL_44497_B|Repeat|5.3|PoissonSampler|1|
|WELL_44497_B|Repeat|5.3|WrapperPoissonSampler|0.804604033070474|
|WELL_44497_B|Repeat|5.3|SmallMeanPoissonSampler|0.734683100088343|
|WELL_44497_B|Repeat|20.1|PoissonSampler|1|
|WELL_44497_B|Repeat|20.1|WrapperPoissonSampler|0.894314294228112|
|WELL_44497_B|Repeat|20.1|SmallMeanPoissonSampler|0.889294612537888|
|WELL_44497_B|Repeat|35.7|PoissonSampler|1|
|WELL_44497_B|Repeat|35.7|SmallMeanPoissonSampler|0.924749377980263|
|WELL_44497_B|Repeat|35.7|WrapperPoissonSampler|0.923739790541932|
|WELL_44497_B|Repeat|40.3|PoissonSampler|1|
|WELL_44497_B|Repeat|40.3|WrapperPoissonSampler|0.481675060745583|
|WELL_44497_B|Repeat|40.3|LargeMeanPoissonSampler|0.478115041992058|
|WELL_44497_B|Repeat|60.9|PoissonSampler|1|
|WELL_44497_B|Repeat|60.9|LargeMeanPoissonSampler|0.471153366451228|
|WELL_44497_B|Repeat|60.9|WrapperPoissonSampler|0.468242614221675|
|WELL_44497_B|Repeat|142.3|PoissonSampler|1|
|WELL_44497_B|Repeat|142.3|LargeMeanPoissonSampler|0.446657858746463|
|WELL_44497_B|Repeat|142.3|WrapperPoissonSampler|0.444850363127775|
|WELL_44497_B|Single|5.3|WrapperPoissonSampler|1|
|WELL_44497_B|Single|5.3|PoissonSampler|0.974634788197136|
|WELL_44497_B|Single|5.3|SmallMeanPoissonSampler|0.925716568187496|
|WELL_44497_B|Single|20.1|WrapperPoissonSampler|1|
|WELL_44497_B|Single|20.1|PoissonSampler|0.932624919274866|
|WELL_44497_B|Single|20.1|SmallMeanPoissonSampler|0.921849035227966|
|WELL_44497_B|Single|35.7|PoissonSampler|1|
|WELL_44497_B|Single|35.7|WrapperPoissonSampler|0.975418737952865|
|WELL_44497_B|Single|35.7|SmallMeanPoissonSampler|0.974960890175136|
|WELL_44497_B|Single|40.3|PoissonSampler|1|
|WELL_44497_B|Single|40.3|LargeMeanPoissonSampler|0.505995036533615|
|WELL_44497_B|Single|40.3|WrapperPoissonSampler|0.501050657913843|
|WELL_44497_B|Single|60.9|PoissonSampler|1|
|WELL_44497_B|Single|60.9|WrapperPoissonSampler|0.402856380694965|
|WELL_44497_B|Single|60.9|LargeMeanPoissonSampler|0.401878567674722|
|WELL_44497_B|Single|142.3|PoissonSampler|1|
|WELL_44497_B|Single|142.3|WrapperPoissonSampler|0.319770844045709|
|WELL_44497_B|Single|142.3|LargeMeanPoissonSampler|0.313800282155152|

Observations:
 * The small mean sampler is faster.
 * The large mean sampler is faster.
 * Using the wrapper has virtually no impact on the speed.
 * Looking at the different RNGs in order of speed (SPLIT_MIX < WELL_19937_C < 
WELL_44497_B) it can be seen that the speed up is less as the RNG is slower. So 
the RNG is taking a fair chunk of the sampling time.
 * The advantage of the large mean sampler increases as the mean increases. 
This is true for single use (where the PoissonSampler is computing a largely 
unused cache) and repeat use where the prior results show that the cache is 
used less than 0.3% of the time.

Comparing the single use to the repeat use with all timings relative to the 
repeat use of the original PoissonSampler within the group of [Source+Mean]:

||Source||Use||Mean||Name||Relative Score||
|SPLIT_MIX_64|5.3|Single PoissonSampler|1.30579709264129|
|SPLIT_MIX_64|5.3|Single WrapperPoissonSampler|1.11365394683431|
|SPLIT_MIX_64|5.3|Single SmallMeanPoissonSampler|1.03028399786586|
|SPLIT_MIX_64|5.3|Repeat PoissonSampler|1|
|SPLIT_MIX_64|5.3|Repeat WrapperPoissonSampler|0.43323176751383|
|SPLIT_MIX_64|5.3|Repeat SmallMeanPoissonSampler|0.417177345476191|
|SPLIT_MIX_64|20.1|Single PoissonSampler|1.20343026132986|
|SPLIT_MIX_64|20.1|Single WrapperPoissonSampler|1.09199089659769|
|SPLIT_MIX_64|20.1|Single SmallMeanPoissonSampler|1.02666307932774|
|SPLIT_MIX_64|20.1|Repeat PoissonSampler|1|
|SPLIT_MIX_64|20.1|Repeat WrapperPoissonSampler|0.64270924157568|
|SPLIT_MIX_64|20.1|Repeat SmallMeanPoissonSampler|0.632731209052441|
|SPLIT_MIX_64|35.7|Single PoissonSampler|1.14250882410445|
|SPLIT_MIX_64|35.7|Single WrapperPoissonSampler|1.07488200449145|
|SPLIT_MIX_64|35.7|Single SmallMeanPoissonSampler|1.04353894025658|
|SPLIT_MIX_64|35.7|Repeat PoissonSampler|1|
|SPLIT_MIX_64|35.7|Repeat WrapperPoissonSampler|0.728838753545663|
|SPLIT_MIX_64|35.7|Repeat SmallMeanPoissonSampler|0.72432464048095|
|SPLIT_MIX_64|40.3|Single PoissonSampler|2.73331565485164|
|SPLIT_MIX_64|40.3|Single LargeMeanPoissonSampler|1.35972164623673|
|SPLIT_MIX_64|40.3|Single WrapperPoissonSampler|1.35115534912807|
|SPLIT_MIX_64|40.3|Repeat PoissonSampler|1|
|SPLIT_MIX_64|40.3|Repeat WrapperPoissonSampler|0.430599850392507|
|SPLIT_MIX_64|40.3|Repeat LargeMeanPoissonSampler|0.418219761100271|
|SPLIT_MIX_64|60.9|Single PoissonSampler|3.6398342292321|
|SPLIT_MIX_64|60.9|Single LargeMeanPoissonSampler|1.32788071758145|
|SPLIT_MIX_64|60.9|Single WrapperPoissonSampler|1.3247142315454|
|SPLIT_MIX_64|60.9|Repeat PoissonSampler|1|
|SPLIT_MIX_64|60.9|Repeat WrapperPoissonSampler|0.428623656039417|
|SPLIT_MIX_64|60.9|Repeat LargeMeanPoissonSampler|0.420101837541331|
|SPLIT_MIX_64|142.3|Single PoissonSampler|4.83630203157899|
|SPLIT_MIX_64|142.3|Single LargeMeanPoissonSampler|1.3372125351613|
|SPLIT_MIX_64|142.3|Single WrapperPoissonSampler|1.32291049781448|
|SPLIT_MIX_64|142.3|Repeat PoissonSampler|1|
|SPLIT_MIX_64|142.3|Repeat LargeMeanPoissonSampler|0.397325774356007|
|SPLIT_MIX_64|142.3|Repeat WrapperPoissonSampler|0.394901813798589|
|WELL_19937_C|5.3|Single WrapperPoissonSampler|1.13142946141638|
|WELL_19937_C|5.3|Single PoissonSampler|1.11254096241411|
|WELL_19937_C|5.3|Single SmallMeanPoissonSampler|1.07150681018499|
|WELL_19937_C|5.3|Repeat PoissonSampler|1|
|WELL_19937_C|5.3|Repeat WrapperPoissonSampler|0.814222792528576|
|WELL_19937_C|5.3|Repeat SmallMeanPoissonSampler|0.733096117447755|
|WELL_19937_C|20.1|Single WrapperPoissonSampler|1.14340653099588|
|WELL_19937_C|20.1|Single PoissonSampler|1.04683151514344|
|WELL_19937_C|20.1|Single SmallMeanPoissonSampler|1.02720307653596|
|WELL_19937_C|20.1|Repeat WrapperPoissonSampler|1.01619281046783|
|WELL_19937_C|20.1|Repeat PoissonSampler|1|
|WELL_19937_C|20.1|Repeat SmallMeanPoissonSampler|0.883682388746309|
|WELL_19937_C|35.7|Single PoissonSampler|1.02906569480566|
|WELL_19937_C|35.7|Single WrapperPoissonSampler|1.0232718567954|
|WELL_19937_C|35.7|Single SmallMeanPoissonSampler|1.02067448602115|
|WELL_19937_C|35.7|Repeat PoissonSampler|1|
|WELL_19937_C|35.7|Repeat WrapperPoissonSampler|0.953292758663224|
|WELL_19937_C|35.7|Repeat SmallMeanPoissonSampler|0.931605181685712|
|WELL_19937_C|40.3|Single PoissonSampler|2.57485236349569|
|WELL_19937_C|40.3|Single LargeMeanPoissonSampler|1.32989444804185|
|WELL_19937_C|40.3|Single WrapperPoissonSampler|1.32551045393102|
|WELL_19937_C|40.3|Repeat PoissonSampler|1|
|WELL_19937_C|40.3|Repeat WrapperPoissonSampler|0.458962363853179|
|WELL_19937_C|40.3|Repeat LargeMeanPoissonSampler|0.45813232265573|
|WELL_19937_C|60.9|Single PoissonSampler|3.28174959565948|
|WELL_19937_C|60.9|Single WrapperPoissonSampler|1.31310670108188|
|WELL_19937_C|60.9|Single LargeMeanPoissonSampler|1.30287488730137|
|WELL_19937_C|60.9|Repeat PoissonSampler|1|
|WELL_19937_C|60.9|Repeat WrapperPoissonSampler|0.471445071376392|
|WELL_19937_C|60.9|Repeat LargeMeanPoissonSampler|0.45689017354128|
|WELL_19937_C|142.3|Single PoissonSampler|4.19534277243741|
|WELL_19937_C|142.3|Single WrapperPoissonSampler|1.33943641398893|
|WELL_19937_C|142.3|Single LargeMeanPoissonSampler|1.31163845765444|
|WELL_19937_C|142.3|Repeat PoissonSampler|1|
|WELL_19937_C|142.3|Repeat WrapperPoissonSampler|0.442641901291292|
|WELL_19937_C|142.3|Repeat LargeMeanPoissonSampler|0.440315608121226|
|WELL_44497_B|5.3|Single WrapperPoissonSampler|1.1263780335341|
|WELL_44497_B|5.3|Single PoissonSampler|1.09780721614341|
|WELL_44497_B|5.3|Single SmallMeanPoissonSampler|1.04270680768496|
|WELL_44497_B|5.3|Repeat PoissonSampler|1|
|WELL_44497_B|5.3|Repeat WrapperPoissonSampler|0.804604033070474|
|WELL_44497_B|5.3|Repeat SmallMeanPoissonSampler|0.734683100088343|
|WELL_44497_B|20.1|Single WrapperPoissonSampler|1.11707788657526|
|WELL_44497_B|20.1|Single PoissonSampler|1.04181467379099|
|WELL_44497_B|20.1|Single SmallMeanPoissonSampler|1.0297771720139|
|WELL_44497_B|20.1|Repeat PoissonSampler|1|
|WELL_44497_B|20.1|Repeat WrapperPoissonSampler|0.894314294228112|
|WELL_44497_B|20.1|Repeat SmallMeanPoissonSampler|0.889294612537888|
|WELL_44497_B|35.7|Single PoissonSampler|1.0262099604823|
|WELL_44497_B|35.7|Single WrapperPoissonSampler|1.0009844245283|
|WELL_44497_B|35.7|Single SmallMeanPoissonSampler|1.00051457657841|
|WELL_44497_B|35.7|Repeat PoissonSampler|1|
|WELL_44497_B|35.7|Repeat SmallMeanPoissonSampler|0.924749377980263|
|WELL_44497_B|35.7|Repeat WrapperPoissonSampler|0.923739790541932|
|WELL_44497_B|40.3|Single PoissonSampler|2.7329010610034|
|WELL_44497_B|40.3|Single LargeMeanPoissonSampler|1.38283437220517|
|WELL_44497_B|40.3|Single WrapperPoissonSampler|1.36932187462919|
|WELL_44497_B|40.3|Repeat PoissonSampler|1|
|WELL_44497_B|40.3|Repeat WrapperPoissonSampler|0.481675060745583|
|WELL_44497_B|40.3|Repeat LargeMeanPoissonSampler|0.478115041992058|
|WELL_44497_B|60.9|Single PoissonSampler|3.26060908572745|
|WELL_44497_B|60.9|Single WrapperPoissonSampler|1.31355717513728|
|WELL_44497_B|60.9|Single LargeMeanPoissonSampler|1.31036890911933|
|WELL_44497_B|60.9|Repeat PoissonSampler|1|
|WELL_44497_B|60.9|Repeat LargeMeanPoissonSampler|0.471153366451228|
|WELL_44497_B|60.9|Repeat WrapperPoissonSampler|0.468242614221675|
|WELL_44497_B|142.3|Single PoissonSampler|4.20264531572208|
|WELL_44497_B|142.3|Single WrapperPoissonSampler|1.3438834398332|
|WELL_44497_B|142.3|Single LargeMeanPoissonSampler|1.31879128587162|
|WELL_44497_B|142.3|Repeat PoissonSampler|1|
|WELL_44497_B|142.3|Repeat LargeMeanPoissonSampler|0.446657858746463|
|WELL_44497_B|142.3|Repeat WrapperPoissonSampler|0.444850363127775|

Observations:

* Repeat use is faster.
* Using the wrapper has virtually no impact on the speed.
* Single use of the original PoissonSampler with mean above 40 has a large 
penalty (due to cache computation) of 3-4 fold.
* Single use of the large mean sampler has a penalty of 30-40% over repeat use 
of the original PoissonSampler. This is attributed to some advantage of the 
cache that has been constructed (used in 0.7% or less of the samples) but more 
likely the overhead of object instantiation of the LargeMeanPoissonSampler and 
the three distributions that are used internally.
* Single use of the small mean sampler has a penalty of 2-7% over repeat use of 
the original PoissonSampler. However, the original PoissonSampler also has a low 
penalty of 2-30%. The penalty is greater when using the SPLIT_MIX_64 RNG, 
showing that it is a combination of the instantiation and the algorithm 
loop. When the RNG is faster the penalty is greater due to the improvement in 
the algorithm (it has been changed from using {{long}} arithmetic to {{int}}).
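
The wrapper mentioned in the second observation can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the Commons RNG code: the class names, the {{IntSampler}} interface and the use of {{java.util.Random}} are hypothetical, the switch point of 40 is taken from the issue description, and the large mean branch is a simple stand-in (a sum of small mean samples) rather than the algorithm used by the real LargeMeanPoissonSampler.

```java
import java.util.Random;

/** Sketch only: stand-in classes, not the Commons RNG implementation. */
final class PoissonWrapperSketch {
    /** Assumed switch point between the two strategies (PIVOT = 40 in the issue). */
    static final double PIVOT = 40;

    /** Hypothetical minimal sampler interface. */
    interface IntSampler {
        int sample();
    }

    /** Small mean: multiply uniform deviates until the product drops below exp(-mean). */
    static final class SmallMean implements IntSampler {
        private final Random rng;
        private final double p0;

        SmallMean(Random rng, double mean) {
            this.rng = rng;
            this.p0 = Math.exp(-mean);
        }

        @Override
        public int sample() {
            int n = 0; // the counter is an int, no long arithmetic is needed
            double r = rng.nextDouble();
            while (r >= p0) {
                r *= rng.nextDouble();
                n++;
            }
            return n;
        }
    }

    /** Stand-in for the large mean case: a sum of independent small mean samples. */
    static final class LargeMeanStandIn implements IntSampler {
        private final SmallMean part;
        private final int parts;

        LargeMeanStandIn(Random rng, double mean) {
            parts = (int) Math.ceil(mean / PIVOT);
            part = new SmallMean(rng, mean / parts);
        }

        @Override
        public int sample() {
            int sum = 0;
            for (int i = 0; i < parts; i++) {
                sum += part.sample();
            }
            return sum;
        }
    }

    /** Wrapper: the strategy is chosen once in the constructor, then only delegated to. */
    static final class Wrapper implements IntSampler {
        final IntSampler delegate;

        Wrapper(Random rng, double mean) {
            delegate = mean < PIVOT ? new SmallMean(rng, mean) : new LargeMeanStandIn(rng, mean);
        }

        @Override
        public int sample() {
            return delegate.sample();
        }
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        System.out.println("mean 5.3  -> " + new Wrapper(rng, 5.3).sample());
        System.out.println("mean 60.9 -> " + new Wrapper(rng, 60.9).sample());
    }
}
```

Because the choice is made once at construction, each call to {{sample()}} only adds one level of delegation, which would be consistent with the wrapper scores in the table sitting so close to those of the specialised samplers.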

I attach the raw CSV file from the benchmark ([^jmh-result.csv]).



> PoissonSampler single use speed improvements
> --------------------------------------------
>
>                 Key: RNG-50
>                 URL: https://issues.apache.org/jira/browse/RNG-50
>             Project: Commons RNG
>          Issue Type: Improvement
>    Affects Versions: 1.0
>            Reporter: Alex D Herbert
>            Priority: Minor
>         Attachments: PoissonSamplerTest.java, jmh-result.csv
>
>
> The Sampler architecture of {{org.apache.commons.rng.sampling.distribution}} 
> is nicely written for fast sampling of small dataset sizes. The constructors 
> for the samplers do not check the input parameters are valid for the 
> respective distributions (in contrast to the old 
> {{org.apache.commons.math3.random.distribution}} classes). I assume this is a 
> design choice for speed. Thus most of the samplers can be used within a loop 
> to sample just one value with very little overhead.
> The {{PoissonSampler}} precomputes log factorial numbers upon construction if 
> the mean is above 40. This is done using the {{InternalUtils.FactorialLog}} 
> class. As of version 1.0 this internal class is currently only used in the 
> {{PoissonSampler}}.
> The cache size is limited to 2*PIVOT (where PIVOT=40). But it creates and 
> precomputes the cache every time a PoissonSampler is constructed if the mean 
> is above the PIVOT value.
> Why not create this once in a static block for the PoissonSampler?
> {code:java}
> /** {@code log(n!)}. */
> private static final FactorialLog factorialLog;
>
> static {
>     factorialLog = FactorialLog.create().withCache((int) (2 * PoissonSampler.PIVOT));
> }
> {code}
> This will make the construction cost of a new {{PoissonSampler}} negligible. 
> If the table is instead built lazily by a static construction method then the 
> overhead will be paid only on first use. Thus the following call will be much 
> faster:
> {code:java}
> UniformRandomProvider rng = ...;
> int value = new PoissonSampler(rng, 50).sample();
> {code}
> I have tested this modification (see attached file) and the results are:
> {noformat}
> Mean 40  Single construction ( 7330792) vs Loop construction                          (24334724)   (3.319522.2x faster)
> Mean 40  Single construction ( 7330792) vs Loop construction with static FactorialLog ( 7990656)   (1.090013.2x faster)
> Mean 50  Single construction ( 6390303) vs Loop construction                          (19389026)   (3.034132.2x faster)
> Mean 50  Single construction ( 6390303) vs Loop construction with static FactorialLog ( 6146556)   (0.961857.2x faster)
> Mean 60  Single construction ( 6041165) vs Loop construction                          (21337678)   (3.532047.2x faster)
> Mean 60  Single construction ( 6041165) vs Loop construction with static FactorialLog ( 5329129)   (0.882136.2x faster)
> Mean 70  Single construction ( 6064003) vs Loop construction                          (23963516)   (3.951765.2x faster)
> Mean 70  Single construction ( 6064003) vs Loop construction with static FactorialLog ( 5306081)   (0.875013.2x faster)
> Mean 80  Single construction ( 6064772) vs Loop construction                          (26381365)   (4.349935.2x faster)
> Mean 80  Single construction ( 6064772) vs Loop construction with static FactorialLog ( 6341274)   (1.045591.2x faster)
> {noformat}
> Thus the speed improvements would be approximately 3-4 fold for single use 
> Poisson sampling.
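
The difference between a per-instance cache and the static cache proposed in the description can be sketched as follows. This is a self-contained illustration under stated assumptions, not the library code: {{LogFactorialCacheSketch}}, {{PerInstance}}, {{Shared}} and the {{tableBuilds}} counter are hypothetical names, and a plain {{double[]}} of {{log(n!)}} values stands in for {{InternalUtils.FactorialLog}}.

```java
/** Sketch only: a plain array of log(n!) values stands in for the FactorialLog cache. */
final class LogFactorialCacheSketch {
    /** PIVOT = 40 as stated in the issue; the cache holds 2 * PIVOT entries. */
    static final int PIVOT = 40;

    /** Counts how often the table is computed (for illustration only). */
    static int tableBuilds = 0;

    /** Computes log(n!) for n in [0, size) by accumulating log(n). */
    static double[] buildTable(int size) {
        tableBuilds++;
        double[] table = new double[size];
        // table[0] = log(0!) = 0 and table[1] = log(1!) = 0 by default
        for (int n = 2; n < size; n++) {
            table[n] = table[n - 1] + Math.log(n);
        }
        return table;
    }

    /** Per-instance cache: every constructor call recomputes the table. */
    static final class PerInstance {
        private final double[] table = buildTable(2 * PIVOT);

        double logFactorial(int n) {
            return table[n];
        }
    }

    /** Shared cache: the table is computed once, when the class is initialised. */
    static final class Shared {
        private static final double[] TABLE = buildTable(2 * PIVOT);

        double logFactorial(int n) {
            return TABLE[n];
        }
    }

    public static void main(String[] args) {
        int start = tableBuilds;
        for (int i = 0; i < 1000; i++) {
            new PerInstance().logFactorial(50);
        }
        System.out.println("per-instance table builds: " + (tableBuilds - start));

        start = tableBuilds;
        for (int i = 0; i < 1000; i++) {
            new Shared().logFactorial(50);
        }
        System.out.println("shared table builds: " + (tableBuilds - start));
    }
}
```

In the single use pattern (construct, sample once, discard) the per-instance variant pays for the whole table on every call, whereas the shared variant pays at most once, on first use of the class.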



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
