Hi. Le ven. 10 mai 2019 à 15:53, Alex Herbert <alex.d.herb...@gmail.com> a écrit : > > > On 10/05/2019 14:27, Gilles Sadowski wrote: > > Hi Alex. > > > > Le ven. 10 mai 2019 à 13:57, Alex Herbert <alex.d.herb...@gmail.com> a > > écrit : > >> Can I get a review of the PR for RNG-101 please. > > Thanks for this work! > > > > I didn't go into the details; however, I see many fields and methods like > > table1 ... table5 > > fillTable1 ... fillTable5 > > getTable1 ... getTable5 > > Wouldn't it be possible to use a 2D table: > > table[5][]; > > so that e.g. only one "fillTable(int tableIndex, /* other args */)" method > > is necessary (where "tableIndex" runs from 0 to 4)? > > Yes. The design is based around using 5 tables as per the example code. > > The sample() method knows which table it needs so it can directly jump > to the table in question. I'd have to look at the difference in speed > when using a 2D table as you are adding another array access but > reducing the number of possible method calls (although you still need a > method call). Maybe this will be optimised out by the JVM. > > If the speed is not a factor then I'll rewrite it. Otherwise it's > probably better done for speed as this is the entire point of the > sampler given it disregards any probability under 2^-31 (i.e. it's not a > perfectly fair sampler). > > Note that 5 tables are needed for 5 hex digits (base 2^6). The paper > states using 3 tables of base 2^10 then you get a speed increase > (roughly 1.16x) at the cost of storage (roughly 9x). Changing to 2 > tables of base 2^15 does not make it much faster again. > > I'll have a rethink to see if I can make the design work for different > base sizes.
That could be an extension made easier with the 2D table, but I quite agree that given the relatively minor speed improvement to be expected, it is not the main reason; the rationale was just to make the code a more compact and a little easier to grasp (IMHO). Gilles > > > > > The diff for "DiscreteSamplersList.java" refers to > > MarsagliaTsangWangBinomialSampler > > but > > MarsagliaTsangWangSmallMeanPoissonSampler > > seems to be missing. > > Oops, I missed adding that back. I built the PR from code where I was > testing lots of implementations. > > I've just added it back and it is still passing locally. Travis should > see that too as I pushed the change to the PR. > > > > > Regards, > > Gilles > > > >> This is a new sampler based on the source code from the paper: > >> > >> George Marsaglia, Wai Wan Tsang, Jingbo Wang (2004) > >> Fast Generation of Discrete Random Variables. > >> Journal of Statistical Software. Vol. 11, Issue. 3, pp. 1-11. > >> > >> https://www.jstatsoft.org/article/view/v011i03 > >> > >> The code has no explicit licence. > >> > >> The paper states: > >> > >> "We have provided C versions of the two methods described here, for > >> inclusion in the “Browse > >> files”section of the journal. ... You may then want to examine the > >> components of the two files, for illumination > >> or for extracting portions that might be usefully applied to your > >> discrete distributions." > >> > >> So I assuming that it can be incorporated with little modification. > >> > >> The Java implementation has been rewritten to allow the storage to be > >> optimised for the required size. The generation of the tables has been > >> adapted appropriately and checks have been added on the input parameters > >> to ensure the sampler does not generate exceptions once constructed (I > >> found out the hard way that the original code was not entirely correct). > >> > >> Thanks. > >> > >> Alex --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org