Re: [Numpy-discussion] Adopt Mersenne Twister 64bit?
On Sun, Mar 10, 2013 at 6:12 PM, Siu Kwan Lam s...@continuum.io wrote: Hi all, I am redirecting a discussion on github issue tracker here. My original post (https://github.com/numpy/numpy/issues/3137): The current implementation of the RNG seems to be MT19937-32. Since 64-bit machines are common nowadays, I am suggesting adding or upgrading to MT19937-64. Thoughts? Let me start by answering to njsmith's comments on the issue tracker: Would it be faster? Although I have not benchmarked the 64-bit implementation, it is likely that it will be faster on a 64-bit machine since the number of iteration (controlled by NN and MM in the reference implementation http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/mt19937-64.c) is reduced by half. In addition, each generation in the 64-bit implementation produces a 64-bit random int which can be used to generate double precision random number. Unlike the 32-bit implementation which requires generating a pair of 32-bit random int. From the last time this was brought up, it looks like getting a single 64-bit integer out from MT19937-64 takes about the same amount of time as getting a single 32-bit integer from MT19937-32, perhaps a little slower, even on a 64-bit machine. http://comments.gmane.org/gmane.comp.python.numeric.general/27773 So getting a single double would be not quite twice as fast. But, on a 32-bit machine, a 64-bit instruction is translated into 4 32-bit instructions; thus, it is likely to be slower. (1) Use less memory? The amount of memory use will remain the same. The size of the RNG state is the same. Provide higher quality randomness? My naive answer is that 32-bit and 64-bit implementation have the same 2^19937-1 period. Need to do some research and experiments. Would it change the output of this program: import numpy numpy.random.seed(0) print numpy.random.random() ? Unfortunately, yes. The 64-bit implementation generates a different random number sequence with the same seed. (2) My suggestion to overcome (1) and (2) is to allow the user to select between the two implementations (and possibly different algorithms in the future). If user does not provide a choice, we use the MT19937-32 by default. numpy.random.set_state(MT19937_64, …) # choose the 64-bit implementation Most likely, the different PRNGs should be different subclasses of RandomState. The module-level convenience API should probably be left alone. If you need to control the PRNG that you are using, you really need to be passing around a RandomState instance and not relying on reseeding the shared global instance. Aside: I really wish we hadn't exposed `set_state()` in the module API. It's an attractive nuisance. There is some low-level C work that needs to be done to allow the non-uniform distributions to be shared between implementations of the core uniform PRNG, but that's the same no matter how you organize the upper layer. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Casting and promotion rules (e.g. int + uint64 = float)
On Fri, Mar 8, 2013 at 8:23 AM, Sergio Callegari sergio.calleg...@gmail.com wrote: I have noticed that numpy introduces some unexpected type casts, that are in some cases problematic. There has been a lot of discussion about casting on this list in the last couple months -- I suggest you peruse that discussion and see what conclusions it has lead to. A very weird cast is int + uint64 - float I think the idea here is that an int can hold negative numbers, so you can't put it in a uint64 -- but you can't put a uint64 into a signed int64. A float64 can hold the range of numbers of both a int and uint64, so it is used, even though it can't hold the full precision of a uint64 (far from it!) Another issue is that variables unexpectedly change type with accumulation operators a=np.uint64(1) a+=1 now a is float yeah -- that should NEVER happen -- += is supposed to be an iin=place operator, it should never change the array! However, what you've crated here is not an array, but a numpy scalar, and the rules are different there (but should they be?). I suspect that part of the issue is that array scalars behave a bit more like the built-in numpy number types, and thus += is not an in-place operator, but rather, translates to: a = a + 1 and as you've seen, that casts to a float64. A little test: In [34]: d = np.int64(2) In [35]: e = d # e and d are the same object In [36]: d += 1 In [37]: e is d Out[37]: False # they are not longer the same object -- the += created a new object In [38]: type(d) Out[38]: numpy.int64 # even though it's still the same type (no casting needed) If you do use an array, you don't get casting with +=: In [39]: a = np.array((1,), dtype=np.uint64) In [40]: a Out[40]: array([1], dtype=uint64) In [41]: a + 1.0 Out[41]: array([ 2.]) # got a cast with the additon and creation of a new array In [42]: a += 1.0 In [43]: a Out[43]: array([2], dtype=uint64) # but no cast with the in-place operator. Personally, I think the in-place operators should be just that -- and only work for mutable objects, but I guess the ability to easily increment in integer was just too tempting! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR(206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Request code review of numpy.i changes
https://github.com/wfspotz/numpy/compare/numpy-swig ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-0154 ** ** Albuquerque, NM 87185-0370Email: wfsp...@sandia.gov ** ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion