On 3 August 2011 09:06, Luc Maisonobe <luc.maison...@free.fr> wrote: > Le 03/08/2011 09:38, Luc Maisonobe a écrit : >> >> Le 01/08/2011 22:40, Luc Maisonobe a écrit : >>> >>> Hi Phil, >>> >>> Le 01/08/2011 20:39, Phil Steitz a écrit : >>>> >>>> On 8/1/11 1:31 AM, luc.maison...@free.fr wrote: >>>>> >>>>> Hi Phil, >>>>> >>>>> ----- Mail original ----- >>>>>> >>>>>> In my own applications, I noticed what appears to be poor >>>>>> performance in the nextInt(int) method of the Mersenne twister, >>>>>> which I was using to *improve* speed. I think that for small n, the >>>>>> default implementation in BistreamGenerator may be running too many >>>>>> iterations. >>>>> >>>>> Mersenne twister uses a quite large pool. It creates pseudo-random bits >>>>> by twisting it and creates large bunches at a time (624 words at a >>>>> time). >>>>> Hence when you ask for large sets, you should have several calls that >>>>> return fast, and one call that takes a longer time to generate another >>>>> large pool. >>>>> >>>>> So good performances are obtained for generating large sets, not >>>>> small sets. >>>>> >>>>> Well generators should be faster and are preferred over Mersenne >>>>> twister now, >>>>> which is now an old generator. Well generators also have large pools, >>>>> but they >>>>> don't generate bits in large batches in advance, they do generate a >>>>> few words >>>>> at a time. >>>> >>>> Yeah, I know. Both are faster than the JDK, though, even for just >>>> 32-bit chunks in my tests at least. >>>> >>>> One thing I have been thinking about is exposing nextInt[], >>>> nextDouble[], nextGaussian[] etc methods that take advantage of the >>>> pools. So you basically generate a large block of bits use this to >>>> fill the output arrays. >>> >>> Seems a very good idea. Most of the time, people generate only one kind >>> of numbers several times, so it really does make sense. >>> >>>>> >>>>>> I am still figuring out how the code works, but I >>>>>> thought it would be good to run some benchmarks - using Gilles' new >>>>>> stuff - against the Harmony implementation in java.util.Random of >>>>>> this method. That led me to notice that there are no unit tests for >>>>>> BitstreamGenerator. I propose that we add >>>>>> 0) RandomGeneratorAbstractTest with an abstract makeGenerator >>>>>> method including fixed seed tests for all RandomGenerator methods >>>>>> 1) BitstreamGeneratorTest extending RandomGeneratorAbstractTest >>>>>> implementing makeGenerator with a BitStreamGenerator that uses the >>>>>> JDK generator for next(int) >>>>>> 2) Make the test classes for Mersenne and Weil generators extend >>>>>> RandomGeneratorAbstractTest, moving redundant tests up into the base >>>>>> class >>>>>> >>>>>> Sound reasonable? >>>>> >>>>> +1 >>>>> >>>>>> Also, any recollection why we are using a >>>>>> different implementation in BitStreamGenerator for next(int) than >>>>>> Harmony and the JDK use? >>>>> >>>>> I don't understand what you mean. next(int) is used to generate the raw >>>>> bits and is the heart of each generator. Each generator has its own >>>>> implementation. Replacing next(int) by the JDK generation would imply >>>>> dropping completely Mersenne twister and Well generators. >>>> >>>> I am sorry. I meant nextInt(int). It is that code that seems to be >>>> slow in BitStreamGenerator and different from the JDK and Harmony. >>> >>> Could you point me at some code ? There are many pitfalls in netInt(int) >>> if one wants to make sure the generator is uniform, which explain the >>> strange implementation, with the mask computation and the loop. By the >>> way, even this implementation would benefit from your proposed array >>> generation, as the mask could be computed only once. >> >> I have looked at the implementation for JDK and Harmony and am a little >> puzzled. >> >> The trick for the power of two (i.e. if ((n & -n) == n)) is not useful >> for the very elaborate generators like Mersenne twister or Well. Both >> are proven to be equidistributed even for the low order bits. They are >> based on linear recurrences but not linear congruences and do not suffer >> from the drawbacks of the latter. >> >> What puzzles me more is the loop. It is documented as avoiding the >> uneven distributions, but at first glance the modulo operation bothers >> me. As documentation explicitly states it is designed for this, it is >> most probably true, I simply don't understand how yet. >> >> So our current implementation is slow, then go ahead and change it to >> the one you showed me. I would simply suggest to get rid of the ((n & >> -n) == n) test. I'll try to understand the condition in the while loop >> to understand how it rejects uneven distributions, just out of curiosity >> for myself. > > OK, I finally understood the algorithm and how it rejects the largest > incomplete numbers from k*n to (2^31)-1 where k*n is the largest multiple of > n that fits in a positive integer. The trick lies in the addition of (n-1) > which overflows the integer and wraps the result back to negative values. It > is smart. > > +1 to use it.
Provided that the algorithm is documented ... > Luc > >> >> Luc >> >>> >>> Luc >>> >>> >>>> >>>> Phil >>>>> >>>>> Mersenne twister and Well should be fast for generating large sets, but >>>>> most importantly they have very good and *proven* properties >>>>> (equidistribution >>>>> on large dimensions, null correlation, maximal period ...). These >>>>> properties >>>>> are essential for example in Monte-Carlo simulations with lots of >>>>> variables that >>>>> must be independent or have controlled correlations. >>>>> >>>>> Luc >>>>> >>>>>> The Harmony impl is almost identical to >>>>>> what is documented in the JDK javadoc. >>>>>> >>>>>> Phil >>>>>> >>>>>> --------------------------------------------------------------------- >>>>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >>>>>> For additional commands, e-mail: dev-h...@commons.apache.org >>>>>> >>>>>> >>>>> --------------------------------------------------------------------- >>>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >>>>> For additional commands, e-mail: dev-h...@commons.apache.org >>>>> >>>>> >>>> >>>> >>>> --------------------------------------------------------------------- >>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >>>> For additional commands, e-mail: dev-h...@commons.apache.org >>>> >>> >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >>> For additional commands, e-mail: dev-h...@commons.apache.org >>> >>> >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >> For additional commands, e-mail: dev-h...@commons.apache.org >> >> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org