On Sun, Mar 10, 2013 at 6:12 PM, Siu Kwan Lam <[email protected]> wrote:
> Hi all,
>
> I am redirecting a discussion on github issue tracker here. My original
> post (https://github.com/numpy/numpy/issues/3137):
>
> "The current implementation of the RNG seems to be MT19937-32. Since 64-bit
> machines are common nowadays, I am suggesting adding or upgrading to
> MT19937-64. Thoughts?"
>
> Let me start by answering to njsmith's comments on the issue tracker:
>
> Would it be faster?
>
>
> Although I have not benchmarked the 64-bit implementation, it is likely that
> it will be faster on a 64-bit machine since the number of iteration
> (controlled by NN and MM in the reference implementation
> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/mt19937-64.c)
> is reduced by half. In addition, each generation in the 64-bit
> implementation produces a 64-bit random int which can be used to generate
> double precision random number. Unlike the 32-bit implementation which
> requires generating a pair of 32-bit random int.

From the last time this was brought up, it looks like getting a single
64-bit integer out from MT19937-64 takes about the same amount of time
as getting a single 32-bit integer from MT19937-32, perhaps a little
slower, even on a 64-bit machine.

http://comments.gmane.org/gmane.comp.python.numeric.general/27773

So getting a single double would be not quite twice as fast.

> But, on a 32-bit machine, a 64-bit instruction is translated into 4 32-bit
> instructions; thus, it is likely to be slower. (1)
>
> Use less memory?
>
>
> The amount of memory use will remain the same. The size of the RNG state is
> the same.
>
> Provide higher quality randomness?
>
>
> My naive answer is that 32-bit and 64-bit implementation have the same
> 2^19937-1 period. Need to do some research and experiments.
>
> Would it change the output of this program: import numpy
> numpy.random.seed(0) print numpy.random.random() ?
>
>
> Unfortunately, yes. The 64-bit implementation generates a different random
> number sequence with the same seed. (2)
>
>
> My suggestion to overcome (1) and (2) is to allow the user to select between
> the two implementations (and possibly different algorithms in the future).
> If user does not provide a choice, we use the MT19937-32 by default.
>
> numpy.random.set_state("MT19937_64", …) # choose the 64-bit
> implementation

Most likely, the different PRNGs should be different subclasses of
RandomState. The module-level convenience API should probably be left
alone. If you need to control the PRNG that you are using, you really
need to be passing around a RandomState instance and not relying on
reseeding the shared global instance. Aside: I really wish we hadn't
exposed `set_state()` in the module API. It's an attractive nuisance.

There is some low-level C work that needs to be done to allow the
non-uniform distributions to be shared between implementations of the
core uniform PRNG, but that's the same no matter how you organize the
upper layer.
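
To illustrate the point about passing a RandomState instance around
instead of reseeding the shared global one, here is a minimal sketch.
It uses only the existing numpy.random.RandomState class; the function
name noisy_mean is hypothetical and exists purely for the example. It
does not show the proposed MT19937-64 subclass, which does not exist
in this discussion's code.

```python
import numpy as np


def noisy_mean(values, rng):
    """Return the mean of `values` plus Gaussian noise drawn from `rng`.

    `noisy_mean` is a hypothetical helper for illustration only. The
    caller supplies the generator, so this function never touches the
    shared global state behind numpy.random.seed().
    """
    values = np.asarray(values, dtype=float)
    return float(values.mean() + rng.normal(scale=0.1))


data = [1.0, 2.0, 3.0]

# Two separate instances seeded identically produce the same stream.
result_a = noisy_mean(data, np.random.RandomState(0))
result_b = noisy_mean(data, np.random.RandomState(0))

# Reseeding and drawing from the global instance in between does not
# disturb an explicitly passed generator.
rng_c = np.random.RandomState(0)
np.random.seed(123)
np.random.random()
result_c = noisy_mean(data, rng_c)

print(result_a == result_b == result_c)
```

Because the generator is an argument, code written this way would not
need to change if a different RandomState subclass were handed in.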

--
Robert Kern
