Re: [Numpy-discussion] Intel random number package

2016-10-27 Thread Robert Kern
On Thu, Oct 27, 2016 at 10:45 AM, Todd  wrote:
>
> On Thu, Oct 27, 2016 at 12:12 PM, Nathaniel Smith  wrote:
>>
>> Ever notice how Anaconda doesn't provide pyfftw? They can't legally ship
>> both MKL and pyfftw, and they picked MKL.
>
> Anaconda does ship GPL code [1].  They even ship GPL code that depends on
> numpy, such as cvxcanon and pystan, and there doesn't seem to be anything
> that prevents me from installing them alongside the MKL version of numpy.
> So I don't see how it would be any different for pyfftw.

I think we've exhausted the relevance of this tangent to Oleksandr's
contributions.

--
Robert Kern
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Intel random number package

2016-10-27 Thread Todd
On Thu, Oct 27, 2016 at 12:12 PM, Nathaniel Smith  wrote:

> Ever notice how Anaconda doesn't provide pyfftw? They can't legally ship
> both MKL and pyfftw, and they picked MKL.


Anaconda does ship GPL code [1].  They even ship GPL code that depends on
numpy, such as cvxcanon and pystan, and there doesn't seem to be anything
that prevents me from installing them alongside the MKL version of numpy.
So I don't see how it would be any different for pyfftw.

[1] https://docs.continuum.io/anaconda/pkg-docs


Re: [Numpy-discussion] Intel random number package

2016-10-27 Thread Nathaniel Smith
On Oct 27, 2016 8:42 AM, "Robert McLeod"  wrote:
>
> Releasing NumPy under GPL would make it incompatible with SciPy, which
> may be _slightly_ inconvenient to the scientific Python community:
>
> https://scipy.github.io/old-wiki/pages/License_Compatibility.html
>
> https://mail.scipy.org/pipermail/scipy-dev/2013-August/019149.html

There's 0 chance that numpy is going to switch to the GPL in general, so
please don't panic. Also, you're misunderstanding license compatibility, so
let's back up a step :-).

The discussion was about whether numpy might potentially, at some
unspecified future date, be available with *optional* GPL code. A numpy
build with optional GPL bits included would be similar to the numpy
builds that many people already use that are linked to MKL, and thus
subject to MKL's license terms. In both cases the license is no longer
numpy's regular BSD, but has these extra bits added. Neither changes the
availability of BSD-licensed numpy; they just give another option.

And, both numpy+GPL-bits and numpy+MKL-bits are/would be license
*compatible* with scipy in the sense that matters to end users: you can
absolutely use and distribute numpy+(pick one of the above)+scipy together,
and the licenses are happy to allow that.

The sense in which they're both *in*compatible with scipy is just that if
you want to *add code to scipy itself*, then that code can't be GPL like
pyfftw, or proprietary like MKL, because the scipy devs have decided that
they don't want to allow that. That's a decision they've made for good
reasons, but it isn't a legal inevitability, and it doesn't stop *you* from
using and distributing scipy and GPL code together, or scipy and
proprietary code together.

(The real license incompatibility is between GPL and proprietary. Either
one can be mixed with BSD, but they can't be mixed with each other and then
distributed. Ever notice how Anaconda doesn't provide pyfftw? They can't
legally ship both MKL and pyfftw, and they picked MKL. Even then, though,
this license restriction only applies to software distributors: if you as
an end user go and install MKL and pyfftw together in the privacy of your
own cluster, then that's also totally legal.)

-n


Re: [Numpy-discussion] Intel random number package

2016-10-27 Thread Julian Taylor
As I understand it, the wiki is talking about including code in
numpy/scipy itself. All code in numpy and scipy must be permissively
licensed so that it is easy to reason about when building your binaries.


The license of the binaries produced from the code is a different
matter, which at that time didn't really exist as we didn't distribute
binaries at all (except for Windows).


A GPL licensed binary containing numpy is perfectly compatible with
SciPy. It may not be compatible with some other component which has an
actually incompatible license (e.g. anything you cannot distribute the
source of as required by the GPL).
It is not numpy that is GPL licensed; it is the restriction of another
component in the binary distribution that makes the full product adhere
to the most restrictive license.
But numpy itself is always permissive; the distributor can always build
a permissive numpy binary without the viral component in it.



On 10/27/2016 05:42 PM, Robert McLeod wrote:

Releasing NumPy under GPL would make it incompatible with SciPy, which
may be _slightly_ inconvenient to the scientific Python community:

https://scipy.github.io/old-wiki/pages/License_Compatibility.html

https://mail.scipy.org/pipermail/scipy-dev/2013-August/019149.html

Robert


Re: [Numpy-discussion] Intel random number package

2016-10-27 Thread Todd
It would still be compatible with SciPy, it would "just" mean that SciPy
(and anything else that uses numpy) would be effectively GPL.

On Thu, Oct 27, 2016 at 11:42 AM, Robert McLeod <robbmcl...@gmail.com>
wrote:

> Releasing NumPy under GPL would make it incompatible with SciPy, which may
> be _slightly_ inconvenient to the scientific Python community:
>
> https://scipy.github.io/old-wiki/pages/License_Compatibility.html
>
> https://mail.scipy.org/pipermail/scipy-dev/2013-August/019149.html
>
> Robert

Re: [Numpy-discussion] Intel random number package

2016-10-27 Thread Robert McLeod
Releasing NumPy under GPL would make it incompatible with SciPy, which may
be _slightly_ inconvenient to the scientific Python community:

https://scipy.github.io/old-wiki/pages/License_Compatibility.html

https://mail.scipy.org/pipermail/scipy-dev/2013-August/019149.html

Robert


Re: [Numpy-discussion] Intel random number package

2016-10-27 Thread Julian Taylor

On 10/27/2016 04:52 PM, Todd wrote:

> On Thu, Oct 27, 2016 at 10:43 AM, Julian Taylor
> <jtaylor.deb...@googlemail.com> wrote:
>
>> Its not that similar, the better approach is certainly possible with
>> FFTW, the GPL is compatible with numpys license. It is only a
>> concern users of binary distributions. Nobody provided the code to
>> use fftw yet, but it would certainly be accepted.
>
> Although it is technically compatible, it would make numpy effectively
> GPL.  Suggestions for this have been explicitly rejected on these
> grounds [1]
>
> [1] https://github.com/numpy/numpy/issues/3485



Yes, it would make numpy GPL, but that is not a concern for a lot of
users. Users for whom it is a problem can still use the non-GPL version.
A more interesting debate is whether our binary wheels should then be
GPL wheels by default or not. Probably not, but that is something that
should be discussed when it's an actual issue.


But to clarify what I said, it would be accepted if the value it
provides is sufficient compared to the code maintenance it adds. Given
that pyfftw already exists, the value is probably relatively small, but
personally I'd still be interested in code that allows switching the FFT
backend, as that could also allow plugging in e.g. GPU-based
implementations (though again this is already covered by other
third-party modules).



Re: [Numpy-discussion] Intel random number package

2016-10-27 Thread Todd
On Thu, Oct 27, 2016 at 10:43 AM, Julian Taylor <
jtaylor.deb...@googlemail.com> wrote:

> On 10/27/2016 04:30 PM, Todd wrote:
>
>> On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers <ralf.gomm...@gmail.com
>> <mailto:ralf.gomm...@gmail.com>> wrote:
>>
>>
>> On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr
>> <oleksandr.pav...@intel.com <mailto:oleksandr.pav...@intel.com>>
>> wrote:
>>
>> Please see responses inline.
>>
>>
>>
>> *From:*NumPy-Discussion
>> [mailto:numpy-discussion-boun...@scipy.org
>> <mailto:numpy-discussion-boun...@scipy.org>] *On Behalf Of *Todd
>> *Sent:* Wednesday, October 26, 2016 4:04 PM
>> *To:* Discussion of Numerical Python <numpy-discussion@scipy.org
>> <mailto:numpy-discussion@scipy.org>>
>> *Subject:* Re: [Numpy-discussion] Intel random number package
>>
>>
>>
>>
>> On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr
>> <oleksandr.pav...@intel.com <mailto:oleksandr.pav...@intel.com>>
>> wrote:
>>
>> Another point already raised by Nathaniel is that for
>> numpy's randomness ideally should provide a way to override
>> default algorithm for sampling from a particular
>> distribution.  For example RandomState object that
>> implements PCG may rely on default acceptance-rejection
>> algorithm for sampling from Gamma, while the RandomState
>> object that provides interface to MKL might want to call
>> into MKL directly.
>>
>>
>>
>> The approach that pyfftw uses at least for scipy, which may also
>> work here, is that you can monkey-patch the scipy.fftpack module
>> at runtime, replacing it with pyfftw's drop-in replacement.
>> scipy then proceeds to use pyfftw instead of its built-in
>> fftpack implementation.  Might such an approach work here?
>> Users can either use this alternative randomstate replacement
>> directly, or they can replace numpy's with it at runtime and
>> numpy will then proceed to use the alternative.
>>
>>
>> The only reason that pyfftw uses monkeypatching is that the better
>> approach is not possible due to license constraints with FFTW (it's
>> GPL).
>>
>>
>> Yes, that is exactly why I brought it up.  Better approaches are also
>> not possible with MKL due to license constraints.  It is a very similar
>> situation overall.
>>
>>
> Its not that similar, the better approach is certainly possible with FFTW,
> the GPL is compatible with numpys license. It is only a concern users of
> binary distributions. Nobody provided the code to use fftw yet, but it
> would certainly be accepted.


Although it is technically compatible, it would make numpy effectively
GPL.  Suggestions for this have been explicitly rejected on these grounds
[1]

[1] https://github.com/numpy/numpy/issues/3485


Re: [Numpy-discussion] Intel random number package

2016-10-27 Thread Julian Taylor

On 10/27/2016 04:30 PM, Todd wrote:

> On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers <ralf.gomm...@gmail.com>
> wrote:
>
>> The only reason that pyfftw uses monkeypatching is that the better
>> approach is not possible due to license constraints with FFTW (it's
>> GPL).
>
> Yes, that is exactly why I brought it up.  Better approaches are also
> not possible with MKL due to license constraints.  It is a very similar
> situation overall.



It's not that similar; the better approach is certainly possible with
FFTW, since the GPL is compatible with numpy's license. It is only a
concern for users of binary distributions. Nobody has provided the code
to use fftw yet, but it would certainly be accepted.



Re: [Numpy-discussion] Intel random number package

2016-10-27 Thread Todd
On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers <ralf.gomm...@gmail.com>
wrote:
>
>
> On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr <
> oleksandr.pav...@intel.com> wrote:
>
>> Please see responses inline.
>>
>>
>>
>> *From:* NumPy-Discussion [mailto:numpy-discussion-boun...@scipy.org] *On
>> Behalf Of *Todd
>> *Sent:* Wednesday, October 26, 2016 4:04 PM
>> *To:* Discussion of Numerical Python <numpy-discussion@scipy.org>
>> *Subject:* Re: [Numpy-discussion] Intel random number package
>>
>>
>>
>> On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr <
>> oleksandr.pav...@intel.com> wrote:
>>
>> Another point already raised by Nathaniel is that for numpy's randomness
>> ideally should provide a way to override default algorithm for sampling
>> from a particular distribution.  For example RandomState object that
>> implements PCG may rely on default acceptance-rejection algorithm for
>> sampling from Gamma, while the RandomState object that provides interface
>> to MKL might want to call into MKL directly.
>>
>>
>>
>> The approach that pyfftw uses at least for scipy, which may also work
>> here, is that you can monkey-patch the scipy.fftpack module at runtime,
>> replacing it with pyfftw's drop-in replacement.  scipy then proceeds to use
>> pyfftw instead of its built-in fftpack implementation.  Might such an
>> approach work here?  Users can either use this alternative randomstate
>> replacement directly, or they can replace numpy's with it at runtime and
>> numpy will then proceed to use the alternative.
>>
>
> The only reason that pyfftw uses monkeypatching is that the better
> approach is not possible due to license constraints with FFTW (it's GPL).
>

Yes, that is exactly why I brought it up.  Better approaches are also not
possible with MKL due to license constraints.  It is a very similar
situation overall.


Re: [Numpy-discussion] Intel random number package

2016-10-27 Thread Ralf Gommers
On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr <
oleksandr.pav...@intel.com> wrote:

> Please see responses inline.
>
>
>
> *From:* NumPy-Discussion [mailto:numpy-discussion-boun...@scipy.org] *On
> Behalf Of *Todd
> *Sent:* Wednesday, October 26, 2016 4:04 PM
> *To:* Discussion of Numerical Python <numpy-discussion@scipy.org>
> *Subject:* Re: [Numpy-discussion] Intel random number package
>
>
>
> On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr <
> oleksandr.pav...@intel.com> wrote:
>
> Another point already raised by Nathaniel is that for numpy's randomness
> ideally should provide a way to override default algorithm for sampling
> from a particular distribution.  For example RandomState object that
> implements PCG may rely on default acceptance-rejection algorithm for
> sampling from Gamma, while the RandomState object that provides interface
> to MKL might want to call into MKL directly.
>
>
>
> The approach that pyfftw uses at least for scipy, which may also work
> here, is that you can monkey-patch the scipy.fftpack module at runtime,
> replacing it with pyfftw's drop-in replacement.  scipy then proceeds to use
> pyfftw instead of its built-in fftpack implementation.  Might such an
> approach work here?  Users can either use this alternative randomstate
> replacement directly, or they can replace numpy's with it at runtime and
> numpy will then proceed to use the alternative.
>

The only reason that pyfftw uses monkeypatching is that the better approach
is not possible due to license constraints with FFTW (it's GPL).


> I think the monkey-patching approach will work.
>

It will work, for a while at least, but it's bad design.

I think we're all on the same page that a separate submodule for
random_intel is a no-go, but as an explicitly switchable backend for
functions with the same signature it would be fine imho. Of course we don't
have that backend infrastructure today, but it's something we want and have
been discussing anyway.
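
A minimal sketch of what an explicitly switchable backend could look like,
assuming a simple registry keyed by backend name. All names here
(register_backend, use_backend, active_backend, the "vendor" backend) are
hypothetical and chosen only for illustration; this is not numpy code, and
the standard library's random.gammavariate stands in for both
implementations.

```python
import random

# Registry of backends: backend name -> {function name: callable}.
# Everything in this sketch is hypothetical, not an existing numpy API.
_BACKENDS = {}
_active = "default"


def register_backend(name, **functions):
    """Register a backend as a mapping from function name to callable."""
    _BACKENDS[name] = functions


def use_backend(name):
    """Explicitly select which registered backend serves later calls."""
    global _active
    if name not in _BACKENDS:
        raise ValueError("unknown backend: %r" % (name,))
    _active = name


def active_backend():
    """Return the name of the backend currently in use."""
    return _active


def gamma(shape, scale=1.0, size=1):
    """Public function: same signature whichever backend is active."""
    return _BACKENDS[_active]["gamma"](shape, scale, size)


def _gamma_default(shape, scale, size):
    # Default path: one variate per call into the sampler.
    return [random.gammavariate(shape, scale) for _ in range(size)]


def _gamma_vendor(shape, scale, size):
    # Placeholder: a real vendor backend would fill the whole vector
    # with a single call into its library instead of looping in Python.
    return [random.gammavariate(shape, scale) for _ in range(size)]


register_backend("default", gamma=_gamma_default)
register_backend("vendor", gamma=_gamma_vendor)
```

With this shape, callers always write gamma(2.0, size=4); only an explicit
use_backend("vendor") changes which implementation runs, so nothing is
replaced behind the caller's back.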

Ralf


Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Pavlyk, Oleksandr
Please see responses inline.

From: NumPy-Discussion [mailto:numpy-discussion-boun...@scipy.org] On Behalf Of 
Todd
Sent: Wednesday, October 26, 2016 4:04 PM
To: Discussion of Numerical Python <numpy-discussion@scipy.org>
Subject: Re: [Numpy-discussion] Intel random number package

On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr 
<oleksandr.pav...@intel.com<mailto:oleksandr.pav...@intel.com>> wrote:

The module under review, similarly to the randomstate package, provides
alternative basic pseudo-random number generators (BRNGs), like MT2203, MCG31,
MRG32K3A, Wichmann-Hill. The scope of support differs, with randomstate
implementing some generators absent in MKL and vice versa.

Is there a reason that randomstate shouldn't implement those generators?


No, randomstate certainly can implement all the BRNGs implemented in MKL. It is 
at the developer’s discretion.



Thinking about the possibility of providing the functionality of this module 
within the framework of randomstate, I find that randomstate implements 
samplers from statistical distributions as functions that take the state of the 
underlying BRNG, and produce a single variate, e.g.:

https://github.com/bashtage/ng-numpy-randomstate/blob/master/randomstate/distributions.c#L23-L26

This design stands in the way of efficient use of MKL, which generates a whole 
vector of variates at a time. This can be done faster than sampling one variate 
at a time by using vectorized instructions.  So I wrote mkl_distributions.cpp 
to provide functions that return a vector of a given size of sampled variates 
from each supported distribution.

I don't know a huge amount about pseudo-random number generators, but this 
seems superficially to be something that would benefit random number generation 
as a whole independently of whether MKL is used.  Might it be possible to 
modify the numpy implementation to support this sort of vectorized approach?

I also think that adopting a vectorized mindset would benefit np.random. For 
example, Gaussians are currently generated using the Box-Muller algorithm, which 
produces two variates at a time, so one currently needs to be saved in the 
random state struct itself, along with an indicator that it should be used on 
the next iteration.  With a vectorized approach one could populate the vector 
two elements at a time with better memory locality, resulting in better 
performance.

Vectorized approach has merits with or without use of MKL.
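
As a rough illustration of the point above, here is a minimal pure-Python
sketch of filling a preallocated buffer two elements at a time. It uses the
basic trigonometric form of Box-Muller and the standard library's random
module; the function name fill_gaussian is made up for this example, and
none of this is numpy or MKL code.

```python
import math
import random


def fill_gaussian(buf, rng=random):
    """Fill the preallocated list `buf` in place with N(0, 1) samples.

    Basic Box-Muller: each pair of uniforms yields two normal variates,
    which are written to adjacent slots, so no variate has to be cached
    in the generator state between calls.
    """
    n = len(buf)
    for i in range(0, n - 1, 2):
        u1 = 1.0 - rng.random()          # in (0, 1], keeps log() finite
        u2 = rng.random()
        r = math.sqrt(-2.0 * math.log(u1))
        theta = 2.0 * math.pi * u2
        buf[i] = r * math.cos(theta)
        buf[i + 1] = r * math.sin(theta)
    if n % 2:                            # odd length: keep one variate, drop its twin
        u1 = 1.0 - rng.random()
        u2 = rng.random()
        buf[n - 1] = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return buf
```

Usage would be along the lines of buf = [0.0] * 8 followed by
fill_gaussian(buf, random.Random(42)). The only leftover bookkeeping is the
odd-length tail, which is handled locally rather than in the state struct.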

Another point, already raised by Nathaniel, is that numpy's randomness 
ideally should provide a way to override the default algorithm for sampling 
from a particular distribution.  For example, a RandomState object that 
implements PCG may rely on the default acceptance-rejection algorithm for 
sampling from Gamma, while a RandomState object that provides an interface to 
MKL might want to call into MKL directly.

The approach that pyfftw uses at least for scipy, which may also work here, is 
that you can monkey-patch the scipy.fftpack module at runtime, replacing it 
with pyfftw's drop-in replacement.  scipy then proceeds to use pyfftw instead 
of its built-in fftpack implementation.  Might such an approach work here?  
Users can either use this alternative randomstate replacement directly, or they 
can replace numpy's with it at runtime and numpy will then proceed to use the 
alternative.

I think the monkey-patching approach will work.
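
For illustration, the run-time replacement described above can be sketched
with toy stand-ins. Nothing here is numpy, scipy or pyfftw code; host,
alternative, library_routine and patch_host are hypothetical names used only
to show the mechanism.

```python
import random
import types

# Toy stand-in for a host package (think "numpy"); not the real thing.
host = types.SimpleNamespace()
host.random = types.SimpleNamespace(
    name="builtin",
    random_sample=lambda n: [random.random() for _ in range(n)],
)


def library_routine(n):
    """Code elsewhere in the ecosystem that looks up host.random at call time."""
    return host.random.name, host.random.random_sample(n)


# Hypothetical drop-in replacement exposing the same function signature.
alternative = types.SimpleNamespace(
    name="alternative",
    random_sample=lambda n: [random.uniform(0.0, 1.0) for _ in range(n)],
)


def patch_host(replacement):
    """Swap host.random for `replacement`; return the original for undoing."""
    original = host.random
    host.random = replacement
    return original


original = patch_host(alternative)
used_while_patched, _ = library_routine(3)   # served by the replacement
host.random = original                       # undo the patch
used_after_restore, _ = library_routine(3)   # served by the built-in again
```

The sketch only works because library_routine looks up host.random at call
time. Code that bound random_sample to a local name before the patch keeps
calling the original, which is one general weakness of monkey-patching.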

RandomState was written with a view to replace numpy.random at some point in 
the future. It is standalone at the moment, from what I understand, only 
because it is still being worked on and extended.

One particularly important development is the ability to sample continuous 
distributions in floats, or to populate a given preallocated buffer with 
random samples. These features are missing from numpy.random_intel, and we 
have thought about providing them.
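
For illustration only: NumPy releases that postdate this thread expose both 
features through the Generator API, and the calls below show what they look 
like from the user's side. This is not the randomstate or random_intel 
interface.

```python
import numpy as np

rng = np.random.default_rng(2016)

# Single-precision variates, requested directly rather than cast afterwards.
x32 = rng.standard_normal(1000, dtype=np.float32)

# Fill a caller-owned, preallocated buffer instead of allocating a new array.
buf = np.empty(1000, dtype=np.float64)
rng.random(out=buf)
```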

As I have said earlier, another missing feature is a C-API for randomness in 
numpy.


Oleksandr


Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Todd
On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr <
oleksandr.pav...@intel.com> wrote:

>
> The module under review, similarly to randomstate package, provides
> alternative basic pseudo-random number generators (BRNGs), like MT2203,
> MCG31, MRG32K3A, Wichmann-Hill. The scope of support differ, with
> randomstate implementing some generators absent in MKL and vice-versa.
>
>
Is there a reason that randomstate shouldn't implement those generators?



> Thinking about the possibility of providing the functionality of this
> module within the framework of randomstate, I find that randomstate
> implements samplers from statistical distributions as functions that take
> the state of the underlying BRNG, and produce a single variate, e.g.:
>
> https://github.com/bashtage/ng-numpy-randomstate/blob/master/randomstate/
> distributions.c#L23-L26
>
> This design stands in a way of efficient use of MKL, which generates a
> whole vector of variates at a time. This can be done faster than sampling a
> variate at a time by using vectorized instructions.  So I wrote
> mkl_distributions.cpp to provide functions that return a given size vector
> of sampled variates from each supported distribution.
>

I don't know a huge amount about pseudo-random number generators, but this
seems superficially to be something that would benefit random number
generation as a whole independently of whether MKL is used.  Might it be
possible to modify the numpy implementation to support this sort of
vectorized approach?


Another point already raised by Nathaniel is that for numpy's randomness
> ideally should provide a way to override default algorithm for sampling
> from a particular distribution.  For example RandomState object that
> implements PCG may rely on default acceptance-rejection algorithm for
> sampling from Gamma, while the RandomState object that provides interface
> to MKL might want to call into MKL directly.
>

The approach that pyfftw uses at least for scipy, which may also work here,
is that you can monkey-patch the scipy.fftpack module at runtime, replacing
it with pyfftw's drop-in replacement.  scipy then proceeds to use pyfftw
instead of its built-in fftpack implementation.  Might such an approach
work here?  Users can either use this alternative randomstate replacement
directly, or they can replace numpy's with it at runtime and numpy will
then proceed to use the alternative.


Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Pavlyk, Oleksandr
Hi, 

Thanks a lot everybody for the feedback. 

The package can certainly be made a stand-alone drop-in replacement for 
np.random. There are many points raised and unraised in favor of this, 
and it is easy to accomplish.  I will create a stand-alone package on github, 
but would still appreciate some help in reviewing it 
and making it available at PyPI.

Interestingly, Nathaniel's link to a representative change, specifically 


https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833

points at unused code borrowed directly from mtrand/distributions.c:

https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L297

A more representative change would be the implementation of Student's 
T-distribution:

https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L232-L262
 

The module under review, similarly to the randomstate package, provides alternative 
basic pseudo-random number generators (BRNGs), like MT2203, MCG31, MRG32K3A, 
and Wichmann-Hill. The scope of support differs, with randomstate implementing some 
generators absent in MKL and vice versa. 

Thinking about the possibility of providing the functionality of this module 
within the framework of randomstate, I find that randomstate implements 
samplers from statistical distributions as functions that take the state of the 
underlying BRNG, and produce a single variate, e.g.:

https://github.com/bashtage/ng-numpy-randomstate/blob/master/randomstate/distributions.c#L23-L26
 

This design stands in the way of efficient use of MKL, which generates a whole 
vector of variates at a time. This can be done faster than sampling a variate 
at a time by using vectorized instructions.  So I wrote mkl_distributions.cpp 
to provide functions that return a given size vector of sampled variates from 
each supported distribution.
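
To make the contrast concrete, here is a sketch of the two designs for 
Student's t, using the standard construction t = Z / sqrt(V / df) with Z 
normal and V chi-square. The function names are hypothetical and NumPy stands 
in for the underlying library; this is not the algorithm in 
mkl_distributions.cpp.

```python
import numpy as np

def student_t_one(rng, df):
    """One variate per call: the single-variate design."""
    z = rng.standard_normal()
    v = rng.chisquare(df)
    return z / np.sqrt(v / df)

def student_t_vector(rng, df, n):
    """A whole vector per call: the design a vectorized backend prefers."""
    z = rng.standard_normal(n)
    v = rng.chisquare(df, n)
    return z / np.sqrt(v / df)

rng = np.random.default_rng(7)
one_at_a_time = np.array([student_t_one(rng, 5.0) for _ in range(1000)])
all_at_once = student_t_vector(rng, 5.0, 1000)
```

In the second form the library sees the full request at once and can use 
vector instructions on the intermediate arrays; in the first it is called 
1000 times with no such opportunity.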

mklrand.pyx was then written by modifying mtrand.pyx to work with such vector 
generators.   In particular, this allowed for efficient sampling from product 
distributions of Poisson distributions with different rate parameters, which is 
implemented in MKL:

https://software.intel.com/en-us/node/521894 

https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1071
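
From the Python side, sampling Poisson variates with a different rate per 
element looks like the following sketch, written against NumPy's own 
broadcasting poisson rather than mklrand. The rates are made-up illustration 
values.

```python
import numpy as np

rng = np.random.default_rng(0)
rates = np.array([0.5, 2.0, 10.0, 50.0])   # one rate parameter per output element
counts = rng.poisson(lam=rates)             # a single call, broadcasting over rates

# Repeating the draw: each row is one sample from the product distribution.
many = rng.poisson(lam=rates, size=(10_000, rates.size))
column_means = many.mean(axis=0)            # should sit close to the rates
```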
 


Another point already raised by Nathaniel is that numpy's randomness 
ideally should provide a way to override the default algorithm for sampling from a 
particular distribution.  For example, a RandomState object that implements PCG 
may rely on the default acceptance-rejection algorithm for sampling from Gamma, 
while a RandomState object that provides an interface to MKL might want to call 
into MKL directly.

While on this topic, I would also like to point out the need for a C-API 
interface to randomness, particularly felt when writing parallel algorithms, where 
Python's GIL and the use of Lock() in RandomState hurt scalability.
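
A sketch of one way to sidestep the shared lock from Python, assuming the 
SeedSequence and Generator API of NumPy releases newer than this thread: each 
task builds its own generator from an independent child seed, so no generator 
state is shared between threads. It does not provide the C-level API asked 
for above.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def worker(seed_seq, n):
    # Each task owns its generator, so no lock is shared between threads.
    rng = np.random.default_rng(seed_seq)
    return rng.standard_normal(n).sum()

root = np.random.SeedSequence(42)
children = root.spawn(4)                    # independent child seeds
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(worker, children, [100_000] * 4))
```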

Oleksandr

-Original Message-
From: NumPy-Discussion [mailto:numpy-discussion-boun...@scipy.org] On Behalf Of 
Nathaniel Smith
Sent: Wednesday, October 26, 2016 2:25 PM
To: Discussion of Numerical Python <numpy-discussion@scipy.org>
Subject: Re: [Numpy-discussion] Intel random number package

On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor <jtaylor.deb...@googlemail.com> 
wrote:
> On 10/26/2016 06:00 PM, Julian Taylor wrote:
>>
>> On 10/26/2016 10:59 AM, Ralf Gommers wrote:
>>>
>>>
>>>
>>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor 
>>> <jtaylor.deb...@googlemail.com 
>>> <mailto:jtaylor.deb...@googlemail.com>>
>>> wrote:
>>>
>>> On 26.10.2016 06:34, Charles R Harris wrote:
>>> > Hi All,
>>> >
>>> > There is a proposed random number package PR now up on github:
>>> > https://github.com/numpy/numpy/pull/8209
>>> <https://github.com/numpy/numpy/pull/8209>. It is from
>>> > oleksandr-pavlyk <https://github.com/oleksandr-pavlyk
>>> <https://github.com/oleksandr-pavlyk>> and implements
>>> > the number random number package using MKL for increased speed.
>>> I think
>>> > we are definitely interested in the improved speed, but I'm 
>>> not sure
>>> > numpy is the best place to put the package. I'd welcome any 
>>> comments on
>>> > the PR itself, as well as any thoughts on the best way 
>>> organize or use
>>> > of this work. Maybe scikit-random
>>>
>>>
>>> Note that this thread is a continuation of 
>>> https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.h
>&

Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Nathaniel Smith
On Wed, Oct 26, 2016 at 12:41 PM, Warren Weckesser
 wrote:
>
>
> On Wed, Oct 26, 2016 at 3:24 PM, Nathaniel Smith  wrote:
>>
>> On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor
>>  wrote:
>> > On 10/26/2016 06:00 PM, Julian Taylor wrote:
>> >>
>> >> On 10/26/2016 10:59 AM, Ralf Gommers wrote:
>> >>>
>> >>>
>> >>>
>> >>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor
>> >>> >
>> >>> wrote:
>> >>>
>> >>> On 26.10.2016 06:34, Charles R Harris wrote:
>> >>> > Hi All,
>> >>> >
>> >>> > There is a proposed random number package PR now up on github:
>> >>> > https://github.com/numpy/numpy/pull/8209
>> >>> . It is from
>> >>> > oleksandr-pavlyk > >>> > and implements
>> >>> > the number random number package using MKL for increased speed.
>> >>> I think
>> >>> > we are definitely interested in the improved speed, but I'm not
>> >>> sure
>> >>> > numpy is the best place to put the package. I'd welcome any
>> >>> comments on
>> >>> > the PR itself, as well as any thoughts on the best way organize
>> >>> or use
>> >>> > of this work. Maybe scikit-random
>> >>>
>> >>>
>> >>> Note that this thread is a continuation of
>> >>>
>> >>> https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html
>> >>>
>> >>>
>> >>>
>> >>> I'm not a fan of putting code depending on a proprietary library
>> >>> into numpy.
>> >>> This should be a standalone package which may provide the same
>> >>> interface
>> >>> as numpy.
>> >>>
>> >>>
>> >>> I don't really see a problem with that in principle. Numpy can use
>> >>> Intel
>> >>> MKL (and Accelerate) as well if it's available. It needs some thought
>> >>> put into the API though - a ``numpy.random_intel`` module is certainly
>> >>> not what we want.
>> >>>
>> >>
>> >> For me there is a difference between being able to optionally use a
>> >> proprietary library as an alternative to free software libraries if the
>> >> user wishes to do so and offering functionality that only works with
>> >> non-free software.
>> >> We are providing a form of advertisement for them by allowing it (hey
>> >> if
>> >> you buy this black box that you cannot modify or use freely you get
>> >> this
>> >> neat numpy feature!).
>> >>
>> >> I prefer for the full functionality of numpy to stay available with a
>> >> stack of community owned software, even if it may be less powerful that
>> >> way.
>> >
>> > But then if this is really just the same random numbers numpy already
>> > provides just faster, it is probably acceptable in principle. I haven't
>> > actually looked at the PR yet.
>>
>> The RNG stream is totally different, so yeah, it can't just be a
>> silent drop-in replacement like BLAS/LAPACK.
>>
>> The patch also adds ~10,000 lines of code; here's an example of what
>> some of it looks like:
>>
>>
>> https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833
>>
>> I don't see how we can realistically commit to maintaining this.
>>
>
>
> FYI:  numpy already maintains code exactly like that:
> https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L397
>
> Perhaps the point should be that the numpy devs won't want to maintain two
> nearly identical versions of that code.

Heh, good catch! Okay, if random_intel is a massive copy-paste of
random with modifications applied on top, then that's its own issue...
on the one hand, yeah, we definitely don't want to carry around
massive copy/paste code. OTOH, it suggests that it might be possible
to refactor the code so that common parts are shared, and this would
be a benefit to integrating random and random_intel more closely. (And
this benefit would then have to be weighed against all the other
considerations, like how much sharing there actually was,
maintainability of the remaining random_intel-specific bits, the
desire to keep numpy free-and-open, etc.) Hard to make that call just
from skimming a 10,000 line patch, though...

Oleksandr, or others at Intel: how much possibility do you think there
is for sharing code between random and random_intel?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Robert Kern
On Wed, Oct 26, 2016 at 12:41 PM, Warren Weckesser <
warren.weckes...@gmail.com> wrote:
>
> On Wed, Oct 26, 2016 at 3:24 PM, Nathaniel Smith  wrote:

>> The patch also adds ~10,000 lines of code; here's an example of what
>> some of it looks like:
>>
>>
https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833
>>
>> I don't see how we can realistically commit to maintaining this.
>
> FYI:  numpy already maintains code exactly like that:
https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L397
>
> Perhaps the point should be that the numpy devs won't want to maintain
two nearly identical versions of that code.

Indeed. That's how the algorithm was published. The /* sigh ... */ is my
own. ;-)

--
Robert Kern


Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Warren Weckesser
On Wed, Oct 26, 2016 at 3:24 PM, Nathaniel Smith  wrote:

> On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor
>  wrote:
> > On 10/26/2016 06:00 PM, Julian Taylor wrote:
> >>
> >> On 10/26/2016 10:59 AM, Ralf Gommers wrote:
> >>>
> >>>
> >>>
> >>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor
> >>> >
> >>> wrote:
> >>>
> >>> On 26.10.2016 06:34, Charles R Harris wrote:
> >>> > Hi All,
> >>> >
> >>> > There is a proposed random number package PR now up on github:
> >>> > https://github.com/numpy/numpy/pull/8209
> >>> . It is from
> >>> > oleksandr-pavlyk  >>> > and implements
> >>> > the number random number package using MKL for increased speed.
> >>> I think
> >>> > we are definitely interested in the improved speed, but I'm not
> >>> sure
> >>> > numpy is the best place to put the package. I'd welcome any
> >>> comments on
> >>> > the PR itself, as well as any thoughts on the best way organize
> >>> or use
> >>> > of this work. Maybe scikit-random
> >>>
> >>>
> >>> Note that this thread is a continuation of
> >>> https://mail.scipy.org/pipermail/numpy-discussion/
> 2016-July/075822.html
> >>>
> >>>
> >>>
> >>> I'm not a fan of putting code depending on a proprietary library
> >>> into numpy.
> >>> This should be a standalone package which may provide the same
> >>> interface
> >>> as numpy.
> >>>
> >>>
> >>> I don't really see a problem with that in principle. Numpy can use
> Intel
> >>> MKL (and Accelerate) as well if it's available. It needs some thought
> >>> put into the API though - a ``numpy.random_intel`` module is certainly
> >>> not what we want.
> >>>
> >>
> >> For me there is a difference between being able to optionally use a
> >> proprietary library as an alternative to free software libraries if the
> >> user wishes to do so and offering functionality that only works with
> >> non-free software.
> >> We are providing a form of advertisement for them by allowing it (hey if
> >> you buy this black box that you cannot modify or use freely you get this
> >> neat numpy feature!).
> >>
> >> I prefer for the full functionality of numpy to stay available with a
> >> stack of community owned software, even if it may be less powerful that
> >> way.
> >
> > But then if this is really just the same random numbers numpy already
> > provides just faster, it is probably acceptable in principle. I haven't
> > actually looked at the PR yet.
>
> The RNG stream is totally different, so yeah, it can't just be a
> silent drop-in replacement like BLAS/LAPACK.
>
> The patch also adds ~10,000 lines of code; here's an example of what
> some of it looks like:
>
> https://github.com/oleksandr-pavlyk/numpy/blob/
> b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/
> mklrand/mkl_distributions.cpp#L1724-L1833
>
> I don't see how we can realistically commit to maintaining this.
>
>

FYI:  numpy already maintains code exactly like that:
https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L397

Perhaps the point should be that the numpy devs won't want to maintain two
nearly identical versions of that code.

Warren




> I'm also not really seeing how shipping it as part of numpy provides
> extra benefits to maintainers or users? AFAICT right now it's
> basically structured as a standalone library that's been dropped into
> the numpy source tree, and it would be just as easy to ship separately
> (or am I wrong?). And since the public API is that all the
> functionality comes from importing this specific new module
> ('numpy.random_intel'), it'd be a one-line change for users to import
> from a non-numpy namespace, like 'mkl.random' or whatever. If it were
> more integrated with the rest of numpy then the trade-offs would be
> more complicated, but in its present form this seems like an easy
> call.
>
> The other question is whether it could/should change to *become* more
> integrated... that's more tricky. There's been some work towards
> supporting swappable backends inside np.random; but the focus has
> mostly been on allowing new core generators, though, and this code
> seems to want to take over the whole thing (core generator +
> distributions), so even once the swappable backends stuff is working
> I'm not sure it would be relevant here. The one case I can think of
> that does seem promising is that if we get an API for users to say "I
> don't care about stream compatibility, just give me un-reproducible
> variates as fast as you can", then it might make sense for that to
> silently use MKL if available -- this would be pretty analogous to the
> use of MKL in np.linalg. But we don't have that API yet, I'm not sure
> how the MKL fallback could be maintainably implemented given 

Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Nathaniel Smith
On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor
 wrote:
> On 10/26/2016 06:00 PM, Julian Taylor wrote:
>>
>> On 10/26/2016 10:59 AM, Ralf Gommers wrote:
>>>
>>>
>>>
>>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor
>>> >
>>> wrote:
>>>
>>> On 26.10.2016 06:34, Charles R Harris wrote:
>>> > Hi All,
>>> >
>>> > There is a proposed random number package PR now up on github:
>>> > https://github.com/numpy/numpy/pull/8209
>>> . It is from
>>> > oleksandr-pavlyk >> > and implements
>>> > the number random number package using MKL for increased speed.
>>> I think
>>> > we are definitely interested in the improved speed, but I'm not
>>> sure
>>> > numpy is the best place to put the package. I'd welcome any
>>> comments on
>>> > the PR itself, as well as any thoughts on the best way organize
>>> or use
>>> > of this work. Maybe scikit-random
>>>
>>>
>>> Note that this thread is a continuation of
>>> https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html
>>>
>>>
>>>
>>> I'm not a fan of putting code depending on a proprietary library
>>> into numpy.
>>> This should be a standalone package which may provide the same
>>> interface
>>> as numpy.
>>>
>>>
>>> I don't really see a problem with that in principle. Numpy can use Intel
>>> MKL (and Accelerate) as well if it's available. It needs some thought
>>> put into the API though - a ``numpy.random_intel`` module is certainly
>>> not what we want.
>>>
>>
>> For me there is a difference between being able to optionally use a
>> proprietary library as an alternative to free software libraries if the
>> user wishes to do so and offering functionality that only works with
>> non-free software.
>> We are providing a form of advertisement for them by allowing it (hey if
>> you buy this black box that you cannot modify or use freely you get this
>> neat numpy feature!).
>>
>> I prefer for the full functionality of numpy to stay available with a
>> stack of community owned software, even if it may be less powerful that
>> way.
>
> But then if this is really just the same random numbers numpy already
> provides just faster, it is probably acceptable in principle. I haven't
> actually looked at the PR yet.

The RNG stream is totally different, so yeah, it can't just be a
silent drop-in replacement like BLAS/LAPACK.

The patch also adds ~10,000 lines of code; here's an example of what
some of it looks like:


https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833

I don't see how we can realistically commit to maintaining this.

I'm also not really seeing how shipping it as part of numpy provides
extra benefits to maintainers or users? AFAICT right now it's
basically structured as a standalone library that's been dropped into
the numpy source tree, and it would be just as easy to ship separately
(or am I wrong?). And since the public API is that all the
functionality comes from importing this specific new module
('numpy.random_intel'), it'd be a one-line change for users to import
from a non-numpy namespace, like 'mkl.random' or whatever. If it were
more integrated with the rest of numpy then the trade-offs would be
more complicated, but in its present form this seems like an easy
call.

The other question is whether it could/should change to *become* more
integrated... that's more tricky. There's been some work towards
supporting swappable backends inside np.random; but the focus has
mostly been on allowing new core generators, though, and this code
seems to want to take over the whole thing (core generator +
distributions), so even once the swappable backends stuff is working
I'm not sure it would be relevant here. The one case I can think of
that does seem promising is that if we get an API for users to say "I
don't care about stream compatibility, just give me un-reproducible
variates as fast as you can", then it might make sense for that to
silently use MKL if available -- this would be pretty analogous to the
use of MKL in np.linalg. But we don't have that API yet, I'm not sure
how the MKL fallback could be maintainably implemented given that it
would require somehow swapping the entire RandomState implementation,
and it's entirely possible that once we figure out solutions to those
then it'd still make sense for the actual MKL wrappers to live in a
third-party library that numpy imports.
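
A sketch of what such an opt-in fallback could look like. The function names 
and the module name "hypothetical_fast_random" are made up for illustration; 
the only assumption about a candidate backend is that it exposes a 
numpy.random-style standard_normal(size).

```python
import importlib
import numpy as np

def load_fast_backend(candidates):
    """Return the first importable module from candidates, else numpy.random."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return np.random

def fast_normal(size, backend):
    """Draw variates with no promise about which stream produced them."""
    return np.asarray(backend.standard_normal(size))

# The module below does not exist, so this falls back to numpy.random.
backend = load_fast_backend(["hypothetical_fast_random"])
draws = fast_normal(5, backend)
```

Because the caller has explicitly given up stream compatibility, the choice 
of backend can vary between machines without breaking any documented 
guarantee.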

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Robert Kern
On Wed, Oct 26, 2016 at 9:36 AM, Sebastian Berg 
wrote:
>
> On Mi, 2016-10-26 at 09:29 -0700, Robert Kern wrote:
> > On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor wrote:
> > >
> > > On 10/26/2016 06:00 PM, Julian Taylor wrote:
> >
> > >> I prefer for the full functionality of numpy to stay available
> > with a
> > >> stack of community owned software, even if it may be less powerful
> > that
> > >> way.
> > >
> > > But then if this is really just the same random numbers numpy
> > already provides just faster, it is probably acceptable in principle.
> > I haven't actually looked at the PR yet.
> >
> > I think the stream is different in some places, at least. And it's
> > not a silent backend drop-in like np.linalg being built against an
> > optimized BLAS, just a separate module that is inoperative without
> > MKL.
>
> I might be swayed, but my gut feeling would be that a backend change
> (if the default stream changes, an explicit one, though maybe one could
> make a "fastest") would be the only reasonable way to provide such a
> thing in numpy itself.

That mostly argues for distributing it as a separate package, not part of
numpy at all.

--
Robert Kern


Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Sebastian Berg
On Mi, 2016-10-26 at 09:29 -0700, Robert Kern wrote:
> On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor wrote:
> >
> > On 10/26/2016 06:00 PM, Julian Taylor wrote:
> 
> >> I prefer for the full functionality of numpy to stay available
> with a
> >> stack of community owned software, even if it may be less powerful
> that
> >> way.
> >
> > But then if this is really just the same random numbers numpy
> already provides just faster, it is probably acceptable in principle.
> I haven't actually looked at the PR yet.
> 
> I think the stream is different in some places, at least. And it's
> not a silent backend drop-in like np.linalg being built against an
> optimized BLAS, just a separate module that is inoperative without
> MKL.
> 

I might be swayed, but my gut feeling would be that a backend change
(if the default stream changes, an explicit one, though maybe one could
make a "fastest") would be the only reasonable way to provide such a
thing in numpy itself.

- Sebastian



> --
> Robert Kern


Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Robert Kern
On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor <
jtaylor.deb...@googlemail.com> wrote:
>
> On 10/26/2016 06:00 PM, Julian Taylor wrote:

>> I prefer for the full functionality of numpy to stay available with a
>> stack of community owned software, even if it may be less powerful that
>> way.
>
> But then if this is really just the same random numbers numpy already
provides just faster, it is probably acceptable in principle. I haven't
actually looked at the PR yet.

I think the stream is different in some places, at least. And it's not a
silent backend drop-in like np.linalg being built against an optimized
BLAS, just a separate module that is inoperative without MKL.

--
Robert Kern


Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Julian Taylor

On 10/26/2016 06:00 PM, Julian Taylor wrote:

On 10/26/2016 10:59 AM, Ralf Gommers wrote:



On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor wrote:

On 26.10.2016 06:34, Charles R Harris wrote:
> Hi All,
>
> There is a proposed random number package PR now up on github:
> https://github.com/numpy/numpy/pull/8209. It is from
> oleksandr-pavlyk and implements
> the number random number package using MKL for increased speed. I think
> we are definitely interested in the improved speed, but I'm not sure
> numpy is the best place to put the package. I'd welcome any comments on
> the PR itself, as well as any thoughts on the best way organize or use
> of this work. Maybe scikit-random


Note that this thread is a continuation of
https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html



I'm not a fan of putting code depending on a proprietary library
into numpy.
This should be a standalone package which may provide the same
interface
as numpy.


I don't really see a problem with that in principle. Numpy can use Intel
MKL (and Accelerate) as well if it's available. It needs some thought
put into the API though - a ``numpy.random_intel`` module is certainly
not what we want.



For me there is a difference between being able to optionally use a
proprietary library as an alternative to free software libraries if the
user wishes to do so and offering functionality that only works with
non-free software.
We are providing a form of advertisement for them by allowing it (hey if
you buy this black box that you cannot modify or use freely you get this
neat numpy feature!).

I prefer for the full functionality of numpy to stay available with a
stack of community owned software, even if it may be less powerful that
way.


But then if this is really just the same random numbers numpy already 
provides just faster, it is probably acceptable in principle. I haven't 
actually looked at the PR yet.



Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Julian Taylor

On 10/26/2016 10:59 AM, Ralf Gommers wrote:



On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor wrote:

On 26.10.2016 06:34, Charles R Harris wrote:
> Hi All,
>
> There is a proposed random number package PR now up on github:
> https://github.com/numpy/numpy/pull/8209. It is from
> oleksandr-pavlyk and implements
> the number random number package using MKL for increased speed. I think
> we are definitely interested in the improved speed, but I'm not sure
> numpy is the best place to put the package. I'd welcome any comments on
> the PR itself, as well as any thoughts on the best way organize or use
> of this work. Maybe scikit-random


Note that this thread is a continuation of
https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html



I'm not a fan of putting code depending on a proprietary library
into numpy.
This should be a standalone package which may provide the same interface
as numpy.


I don't really see a problem with that in principle. Numpy can use Intel
MKL (and Accelerate) as well if it's available. It needs some thought
put into the API though - a ``numpy.random_intel`` module is certainly
not what we want.



For me there is a difference between being able to optionally use a 
proprietary library as an alternative to free software libraries if the 
user wishes to do so and offering functionality that only works with 
non-free software.
We are providing a form of advertisement for them by allowing it (hey if 
you buy this black box that you cannot modify or use freely you get this 
neat numpy feature!).


I prefer for the full functionality of numpy to stay available with a 
stack of community owned software, even if it may be less powerful that way.



Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Ralf Gommers
On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor <
jtaylor.deb...@googlemail.com> wrote:

> On 26.10.2016 06:34, Charles R Harris wrote:
> > Hi All,
> >
> > There is a proposed random number package PR now up on github:
> > https://github.com/numpy/numpy/pull/8209. It is from
> > oleksandr-pavlyk  and implements
> > the number random number package using MKL for increased speed. I think
> > we are definitely interested in the improved speed, but I'm not sure
> > numpy is the best place to put the package. I'd welcome any comments on
> > the PR itself, as well as any thoughts on the best way organize or use
> > of this work. Maybe scikit-random
>

Note that this thread is a continuation of
https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html


>
> I'm not a fan of putting code depending on a proprietary library into
> numpy.
> This should be a standalone package which may provide the same interface
> as numpy.
>

I don't really see a problem with that in principle. Numpy can use Intel
MKL (and Accelerate) as well if it's available. It needs some thought put
into the API though - a ``numpy.random_intel`` module is certainly not what
we want.

Ralf


Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Julian Taylor
On 26.10.2016 06:34, Charles R Harris wrote:
> Hi All,
> 
> There is a proposed random number package PR now up on github:
> https://github.com/numpy/numpy/pull/8209. It is from
> oleksandr-pavlyk and implements
> the numpy random number package using MKL for increased speed. I think
> we are definitely interested in the improved speed, but I'm not sure
> numpy is the best place to put the package. I'd welcome any comments on
> the PR itself, as well as any thoughts on the best way to organize or use
> this work. Maybe scikit-random?
> 

I'm not a fan of putting code depending on a proprietary library into numpy.
This should be a standalone package which may provide the same interface
as numpy.


Re: [Numpy-discussion] Intel random number package

2016-10-25 Thread Robert Kern
On Tue, Oct 25, 2016 at 10:22 PM, Charles R Harris <
charlesr.har...@gmail.com> wrote:
>
> On Tue, Oct 25, 2016 at 10:41 PM, Robert Kern wrote:
>>
>> On Tue, Oct 25, 2016 at 9:34 PM, Charles R Harris <
>> charlesr.har...@gmail.com> wrote:
>> >
>> > Hi All,
>> >
>> > There is a proposed random number package PR now up on github:
>> > https://github.com/numpy/numpy/pull/8209. It is from
>> > oleksandr-pavlyk and implements the numpy random number package using
>> > MKL for increased speed. I think we are definitely interested in the
>> > improved speed, but I'm not sure numpy is the best place to put the
>> > package. I'd welcome any comments on the PR itself, as well as any thoughts
>> > on the best way to organize or use this work. Maybe scikit-random?
>>
>> This is what ng-numpy-randomstate is for.
>>
>> https://github.com/bashtage/ng-numpy-randomstate
>
> Interesting, despite old fashioned original ziggurat implementation of
> the normal and gnu c style... Does that project seek to preserve all the
> bytestreams or is it still in flux?

I would assume some flux for now, but you can ask the author by submitting
a corrected ziggurat PR as a trial balloon. ;-)

--
Robert Kern


Re: [Numpy-discussion] Intel random number package

2016-10-25 Thread Charles R Harris
On Tue, Oct 25, 2016 at 10:41 PM, Robert Kern  wrote:

> On Tue, Oct 25, 2016 at 9:34 PM, Charles R Harris <
> charlesr.har...@gmail.com> wrote:
> >
> > Hi All,
> >
> > There is a proposed random number package PR now up on github:
> > https://github.com/numpy/numpy/pull/8209. It is from
> > oleksandr-pavlyk and implements the numpy random number package using
> > MKL for increased speed. I think we are definitely interested in the
> > improved speed, but I'm not sure numpy is the best place to put the
> > package. I'd welcome any comments on the PR itself, as well as any thoughts
> > on the best way to organize or use this work. Maybe scikit-random?
>
> This is what ng-numpy-randomstate is for.
>
> https://github.com/bashtage/ng-numpy-randomstate
>

Interesting, despite old fashioned original ziggurat implementation of the
normal and gnu c style... Does that project seek to preserve all the
bytestreams or is it still in flux?

Chuck


Re: [Numpy-discussion] Intel random number package

2016-10-25 Thread Robert Kern
On Tue, Oct 25, 2016 at 9:34 PM, Charles R Harris wrote:
>
> Hi All,
>
> There is a proposed random number package PR now up on github:
> https://github.com/numpy/numpy/pull/8209. It is from
> oleksandr-pavlyk and implements the numpy random number package using
> MKL for increased speed. I think we are definitely interested in the
> improved speed, but I'm not sure numpy is the best place to put the
> package. I'd welcome any comments on the PR itself, as well as any thoughts
> on the best way to organize or use this work. Maybe scikit-random?

This is what ng-numpy-randomstate is for.

https://github.com/bashtage/ng-numpy-randomstate

--
Robert Kern


[Numpy-discussion] Intel random number package

2016-10-25 Thread Charles R Harris
Hi All,

There is a proposed random number package PR now up on github:
https://github.com/numpy/numpy/pull/8209. It is from
oleksandr-pavlyk and implements the
numpy random number package using MKL for increased speed. I think we are
definitely interested in the improved speed, but I'm not sure numpy is the
best place to put the package. I'd welcome any comments on the PR itself,
as well as any thoughts on the best way to organize or use this work. Maybe
scikit-random?

Chuck