Re: [Numpy-discussion] Adopt Mersenne Twister 64bit?

2013-03-11 Thread Robert Kern
On Sun, Mar 10, 2013 at 6:12 PM, Siu Kwan Lam s...@continuum.io wrote:
 Hi all,

 I am redirecting a discussion on github issue tracker here.  My original
 post (https://github.com/numpy/numpy/issues/3137):

 The current implementation of the RNG seems to be MT19937-32. Since 64-bit
 machines are common nowadays, I am suggesting adding or upgrading to
 MT19937-64.  Thoughts?

 Let me start by answering to njsmith's comments on the issue tracker:

 Would it be faster?


 Although I have not benchmarked the 64-bit implementation, it is likely that
 it will be faster on a 64-bit machine since the number of iteration
 (controlled by NN and MM in the reference implementation
 http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/mt19937-64.c)
 is reduced by half.  In addition, each generation in the 64-bit
 implementation produces a 64-bit random int which can be used to generate
 double precision random number.  Unlike the 32-bit implementation which
 requires generating a pair of 32-bit random int.

From the last time this was brought up, it looks like getting a single
64-bit integer out from MT19937-64 takes about the same amount of time
as getting a single 32-bit integer from MT19937-32, perhaps a little
slower, even on a 64-bit machine.

http://comments.gmane.org/gmane.comp.python.numeric.general/27773

So getting a single double would be not quite twice as fast.

 But, on a 32-bit machine, a 64-bit instruction is translated into 4 32-bit
 instructions; thus, it is likely to be slower.  (1)

 Use less memory?


 The amount of memory use will remain the same.  The size of the RNG state is
 the same.

 Provide higher quality randomness?


 My naive answer is that 32-bit and 64-bit implementation have the same
 2^19937-1 period. Need to do some research and experiments.

 Would it change the output of this program: import numpy
 numpy.random.seed(0) print numpy.random.random() ?


 Unfortunately, yes.  The 64-bit implementation generates a different random
 number sequence with the same seed. (2)


 My suggestion to overcome (1) and (2) is to allow the user to select between
 the two implementations (and possibly different algorithms in the future).
 If user does not provide a choice, we use the MT19937-32 by default.

 numpy.random.set_state(MT19937_64, …)   # choose the 64-bit
 implementation

Most likely, the different PRNGs should be different subclasses of
RandomState. The module-level convenience API should probably be left
alone. If you need to control the PRNG that you are using, you really
need to be passing around a RandomState instance and not relying on
reseeding the shared global instance. Aside: I really wish we hadn't
exposed `set_state()` in the module API. It's an attractive nuisance.

There is some low-level C work that needs to be done to allow the
non-uniform distributions to be shared between implementations of the
core uniform PRNG, but that's the same no matter how you organize the
upper layer.

--
Robert Kern
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Casting and promotion rules (e.g. int + uint64 = float)

2013-03-11 Thread Chris Barker - NOAA Federal
On Fri, Mar 8, 2013 at 8:23 AM, Sergio Callegari
sergio.calleg...@gmail.com wrote:
   I have noticed that numpy introduces some unexpected type casts, that are
 in some cases problematic.

There has been a lot of discussion about  casting on this list in the
last couple months -- I suggest you peruse that discussion and see
what conclusions it has lead to.

 A very weird cast is

 int + uint64 - float

I think the idea here is that an int can hold negative numbers, so you
can't put it in a uint64 -- but you can't put a uint64 into a signed
int64. A float64 can hold the range of numbers of both a int and
uint64, so it is used, even though it can't  hold the full precision
of a uint64 (far from it!)

 Another issue is that variables unexpectedly change type with accumulation
 operators

 a=np.uint64(1)
 a+=1

 now a is float

yeah -- that should NEVER happen -- += is supposed to be an iin=place
operator, it should never change the array! However, what you've
crated here is not an array, but a numpy scalar, and the rules are
different there (but should they be?). I suspect that part of the
issue is that array scalars behave a bit more like the built-in numpy
number types, and thus += is not an in-place operator, but rather,
translates to:

a = a + 1

and as you've seen, that casts to a float64. A little test:

In [34]: d = np.int64(2)

In [35]: e = d
# e and d are the same object

In [36]: d += 1

In [37]: e is d
Out[37]: False

# they are not longer the same object -- the += created a new object

In [38]: type(d)
Out[38]: numpy.int64

# even though it's still the same type (no casting needed)

If you do use an array, you don't get casting with +=:

In [39]: a = np.array((1,), dtype=np.uint64)

In [40]: a
Out[40]: array([1], dtype=uint64)

In [41]: a + 1.0
Out[41]: array([ 2.])

# got a cast with the additon and creation of a new array

In [42]: a += 1.0

In [43]: a
Out[43]: array([2], dtype=uint64)
# but no cast with the in-place operator.

Personally, I think the in-place operators should be just that --
and only work for mutable objects, but I guess the ability to easily
increment in integer was just too tempting!

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Request code review of numpy.i changes

2013-03-11 Thread Bill Spotz
https://github.com/wfspotz/numpy/compare/numpy-swig

** Bill Spotz  **
** Sandia National Laboratories  Voice: (505)845-0170  **
** P.O. Box 5800 Fax:   (505)284-0154  **
** Albuquerque, NM 87185-0370Email: wfsp...@sandia.gov **


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion