Re: [Numpy-discussion] Intel random number package

Pavlyk, Oleksandr Wed, 26 Oct 2016 14:26:08 -0700

Please see responses inline.

From: NumPy-Discussion [mailto:numpy-discussion-boun...@scipy.org] On Behalf Of 
Todd
Sent: Wednesday, October 26, 2016 4:04 PM
To: Discussion of Numerical Python <numpy-discussion@scipy.org>
Subject: Re: [Numpy-discussion] Intel random number package

On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr
<oleksandr.pav...@intel.com<mailto:oleksandr.pav...@intel.com>> wrote:

The module under review, like the randomstate package, provides alternative
basic pseudo-random number generators (BRNGs), such as MT2203, MCG31, MRG32K3A,
and Wichmann-Hill. The scope of support differs, with randomstate implementing
some generators absent in MKL and vice versa.

Is there a reason that randomstate shouldn't implement those generators?

No, randomstate certainly can implement all the BRNGs implemented in MKL. It is
at the developer's discretion.

Thinking about the possibility of providing the functionality of this module
within the framework of randomstate, I find that randomstate implements
samplers from statistical distributions as functions that take the state of the
underlying BRNG, and produce a single variate, e.g.:

https://github.com/bashtage/ng-numpy-randomstate/blob/master/randomstate/distributions.c#L23-L26

This design stands in the way of efficient use of MKL, which generates a whole
vector of variates at a time. This can be done faster than sampling a variate
at a time by using vectorized instructions. So I wrote mkl_distributions.cpp
to provide functions that return a given size vector of sampled variates from
each supported distribution.
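
To illustrate the difference in interface only, here is a minimal Python
sketch. The function names are hypothetical and are taken neither from
randomstate nor from mkl_distributions.cpp; the stdlib generator stands in for
a BRNG, and the loop in the vector version stands in for what a real backend
would do with vectorized instructions.

```python
import random

def sample_exponential_one(rng, scale):
    """Per-variate style: take the generator state, return one sample."""
    return rng.expovariate(1.0 / scale)

def sample_exponential_vec(rng, scale, n):
    """Vector style: one call returns n samples.

    The Python loop is only a placeholder for a bulk, SIMD-friendly fill.
    """
    return [rng.expovariate(1.0 / scale) for _ in range(n)]

# Same seed, same stream: only the calling convention differs.
rng_a = random.Random(1234)
rng_b = random.Random(1234)
one_at_a_time = [sample_exponential_one(rng_a, 2.0) for _ in range(5)]
whole_vector = sample_exponential_vec(rng_b, 2.0, 5)
```

With the vector signature the caller pays the call overhead once per request
rather than once per variate, and the backend is free to choose how to fill
the result.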

I don't know a huge amount about pseudo-random number generators, but this
seems superficially to be something that would benefit random number generation
as a whole independently of whether MKL is used. Might it be possible to
modify the numpy implementation to support this sort of vectorized approach?

I also think that adopting a vectorized mindset would benefit np.random. For
example, Gaussians are currently generated using the Box-Muller algorithm, which
produces two variates at a time, so one of them currently needs to be saved in
the random state struct itself, along with an indicator that it should be used
on the next iteration. With a vectorized approach one could populate the vector
two elements at a time with better memory locality, resulting in better
performance.
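
A rough Python sketch of the two styles follows. It uses the basic
trigonometric form of Box-Muller for brevity, which is not necessarily the
exact variant used in np.random, and the class and function names are
hypothetical.

```python
import math
import random

def _box_muller_pair(rng):
    """Turn two uniforms into two independent standard normals."""
    u1 = 1.0 - rng.random()  # in (0, 1], so log() stays finite
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)

class CachedGauss:
    """One variate per call: the second value must live in the state."""
    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.has_spare = False
        self.spare = 0.0

    def next(self):
        if self.has_spare:
            self.has_spare = False
            return self.spare
        first, self.spare = _box_muller_pair(self.rng)
        self.has_spare = True
        return first

def fill_gauss(rng, n):
    """Vector fill: write both values of each pair into adjacent slots."""
    out = [0.0] * n
    for i in range(0, n - 1, 2):
        out[i], out[i + 1] = _box_muller_pair(rng)
    if n % 2:
        # Odd length: use one value of a fresh pair, discard the other.
        out[n - 1], _ = _box_muller_pair(rng)
    return out
```

The vector version needs no cached value or flag in the generator state, at
the cost of discarding one value when the requested length is odd.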

The vectorized approach has merits with or without the use of MKL.

Another point, already raised by Nathaniel, is that numpy's randomness ideally
should provide a way to override the default algorithm for sampling from a
particular distribution. For example, a RandomState object that implements PCG
may rely on the default acceptance-rejection algorithm for sampling from Gamma,
while a RandomState object that provides an interface to MKL might want to call
into MKL directly.
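
One way such an override could look, sketched in Python with entirely
hypothetical class and function names (this is not numpy's, randomstate's, or
MKL's API). The stdlib gammavariate() plays the role of the default sampler,
and backend_gamma() is a placeholder for a vendor routine that fills a whole
vector.

```python
import random

class DefaultState:
    """Hypothetical state object using the default Gamma sampler."""
    def __init__(self, seed):
        self._rng = random.Random(seed)

    def gamma(self, shape, scale, n):
        # Default path: one gammavariate() call per variate.
        return [self._rng.gammavariate(shape, scale) for _ in range(n)]

def backend_gamma(rng, shape, scale, n):
    """Placeholder for a vendor routine; a real one would not loop here."""
    return [rng.gammavariate(shape, scale) for _ in range(n)]

class BackendState(DefaultState):
    """Hypothetical state object that overrides only the Gamma sampler."""
    def __init__(self, seed):
        super().__init__(seed)
        self.backend_calls = 0

    def gamma(self, shape, scale, n):
        # Override: hand the whole request to the backend in one call.
        self.backend_calls += 1
        return backend_gamma(self._rng, shape, scale, n)

default_draws = DefaultState(1).gamma(2.5, 1.0, 4)
backend_state = BackendState(1)
backend_draws = backend_state.gamma(2.5, 1.0, 4)
```

Callers use the same gamma() method either way; which algorithm runs is
decided by the object they hold.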

The approach that pyfftw uses at least for scipy, which may also work here, is
that you can monkey-patch the scipy.fftpack module at runtime, replacing it
with pyfftw's drop-in replacement. scipy then proceeds to use pyfftw instead
of its built-in fftpack implementation. Might such an approach work here?
Users can either use this alternative randomstate replacement directly, or they
can replace numpy's with it at runtime and numpy will then proceed to use the
alternative.

I think the monkey-patching approach will work.
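
A minimal sketch of the mechanism, using a throwaway module object rather than
numpy or scipy themselves. All names are hypothetical stand-ins.

```python
import random
import types

# Hypothetical stand-in for a library module that exposes a sampler.
builtin_random = types.ModuleType("builtin_random")
builtin_random.standard_normal = (
    lambda n: [random.gauss(0.0, 1.0) for _ in range(n)]
)

def library_routine(n):
    """Library code that looks the sampler up on the module at call time."""
    return builtin_random.standard_normal(n)

def fast_standard_normal(n):
    """Hypothetical drop-in replacement with the same signature."""
    rng = random.Random(0)  # fixed seed only to keep the example repeatable
    return [rng.normalvariate(0.0, 1.0) for _ in range(n)]

original = builtin_random.standard_normal
builtin_random.standard_normal = fast_standard_normal  # apply the patch
patched = library_routine(3)                            # uses the replacement
builtin_random.standard_normal = original               # restore
```

The patch only reaches code that looks the function up on the module at call
time. Code that did `from module import name` before the patch keeps its
reference to the original.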

RandomState was written with a view to replace numpy.random at some point in
the future. It is standalone at the moment, from what I understand, only
because it is still being worked on and extended.

One particularly important development is the ability to sample continuous
distributions in floats, or to populate a given preallocated buffer with random
samples. These features are missing from numpy.random_intel and we thought
about providing them.
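
As a small illustration of what filling a preallocated buffer means, here is a
stdlib-only Python sketch. The function name is hypothetical, and this is
neither randomstate's nor numpy.random_intel's interface.

```python
import random
from array import array

def fill_uniform(rng, out):
    """Overwrite every element of `out` in place with a uniform sample."""
    for i in range(len(out)):
        out[i] = rng.random()
    return out

# Type code "f" stores C floats, so samples are narrowed on assignment.
buf32 = array("f", [0.0] * 8)
fill_uniform(random.Random(7), buf32)
```

The caller owns the storage and chooses its element type; the sampler
allocates nothing.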

As I said earlier, another missing feature is the C-API for randomness in
numpy.

Oleksandr

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
