Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-30 Thread Robert Kern
On Sat, Jun 30, 2018 at 12:14 PM Stephan Hoyer  wrote:

> I’d love to see a generic way of doing random number generation, but I
> agree with Marten that I don’t see it fitting naturally into this NEP. An
> invasive change to add an array_reference argument to a bunch of functions
> might indeed be worthy of its own NEP, but again I’m not convinced that’s
> actually the right approach. I’d rather add a few new functions like
> random_like, which is a small enough change that consensus on the list
> might be enough.
>

random_like() seems very weird to me. It doesn't seem like a function that
anyone actually wants. It seems like what people actually want is to be
able to draw random numbers from any distribution as a specified array-like
type and shape, not just sample U(0, 1) with the shape of an existing array.

The most workable way to do this is to modify RandomGenerator (i.e. the new
RandomState design)[1] to accept the array-like type in the class
constructor, and modify its internals to do the right thing. Because the
intrusion on the API is so small, that doesn't require a NEP, just a PR (a
long, complicated, and tedious PR, to be sure)[2]. There are a bunch of
technical issues (if you want to avoid memory copies) because the Cython
implementation requires direct memory access, but that's intrinsic to any
solution to this problem, regardless of the API choices. random_like()
would have the same issues.
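
To make the constructor-based idea concrete, here is a minimal sketch. It is not the randomgen API: the class name ``ArrayAwareGenerator`` and its ``array_type`` parameter are hypothetical, and the standard-library ``random`` module stands in for the real bit generators. It only illustrates that the container type is chosen once, at construction, and then applies to every distribution.

```python
import random


class ArrayAwareGenerator:
    """Hypothetical sketch: a generator that returns a caller-chosen container."""

    def __init__(self, seed=None, array_type=list):
        self._rng = random.Random(seed)
        # Assumption: array_type is any callable that accepts an iterable.
        self._array_type = array_type

    def _draw(self, sampler, size):
        # Draw `size` samples and hand them to the configured container type.
        return self._array_type(sampler() for _ in range(size))

    def uniform(self, low=0.0, high=1.0, size=1):
        return self._draw(lambda: self._rng.uniform(low, high), size)

    def normal(self, loc=0.0, scale=1.0, size=1):
        return self._draw(lambda: self._rng.gauss(loc, scale), size)


as_list = ArrayAwareGenerator(seed=42).normal(size=3)
as_tuple = ArrayAwareGenerator(seed=42, array_type=tuple).normal(size=3)
```

Both generators produce the same numbers for the same seed; only the container differs. A real implementation would also have to deal with the memory-access issues mentioned above, which this sketch ignores.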

[1] https://github.com/bashtage/randomgen
[2] Sorry, Kevin.

-- 
Robert Kern
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-30 Thread Stephan Hoyer
On Sat, Jun 30, 2018 at 11:59 AM Hameer Abbasi 
wrote:

> Hi Marten,
>
>> Still, I'm not sure whether this should be included in the present NEP or
>> is best done separately after, with a few concrete examples of where it
>> would be useful.
>
> There already are concrete examples from Dask and CuPy, and this is
> currently a blocker for them, which is part of the reason I’m pushing so
> hard for it. See #11074 for context; I think it was part of what inspired
> Matt and Stephan to write this protocol in the first place.
>

Overloading np.ones_like() is definitely in scope already.

I’d love to see a generic way of doing random number generation, but I
agree with Marten that I don’t see it fitting naturally into this NEP. An
invasive change to add an array_reference argument to a bunch of functions
might indeed be worthy of its own NEP, but again I’m not convinced that’s
actually the right approach. I’d rather add a few new functions like
random_like, which is a small enough change that consensus on the list
might be enough.


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-30 Thread Marten van Kerkwijk
Hi Hameer,

I think the override on `dtype` would work - after all, the override is
checked before anything is done, so one can just pass in `self` if one
wishes (or some helper class that contains both `self` and any desired
further information).

But, as you note, it would not cover everything, and your `array_reference`
idea definitely makes things more uniform. Indeed, it would allow one to
implement things like `np.zeros_like` using `np.zeros`, which seems quite
nice.

Still, I'm not sure whether this should be included in the present NEP or
is best done separately after, with a few concrete examples of where it
would be useful.

All the best,

Marten





Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-30 Thread Hameer Abbasi
Hi Marten,

> Sorry, I had clearly misunderstood. It would indeed be nice for overrides
> to work on functions like `zeros` or `arange` as well, but it seems strange
> to change the signature just for that. As a possible alternative, should we
> perhaps generally check for overrides on `dtype`?


While this very clearly makes sense for something like astropy, it has a
few drawbacks:

   - Other duck arrays such as Dask need more information than just the
   dtype. For example, Dask needs chunk sizes, XArray needs axis labels, and
   pydata/sparse needs to know the type of the reference array in order to
   make one of the same type. The information in a reference array is a strict
   superset of information in the dtype.
   - There’s a need for a separate protocol, which might be a lot harder to
   work with for both NumPy and library authors.
   - Some things, like numpy.random.RandomState, don’t accept a dtype
   argument.

As for your concern about changing the signature, it’s easy enough with a
decorator. We’ll need a separate decorator for array generation functions.
Something like:

    import functools

    import numpy as np

    def array_generation_function(func):
        @functools.wraps(func)
        def wrapped(*args, array_reference=np._NoValue, **kwargs):
            # array_reference is keyword-only; it must precede **kwargs.
            if array_reference is not np._NoValue:
                success, result = try_array_function_override(
                    wrapped, [array_reference], args, kwargs)

                if success:
                    return result

            return func(*args, **kwargs)

        return wrapped
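
As an illustration of how such a decorator could be used, here is a self-contained sketch that runs without NumPy. Everything in it is an assumption made for the example: ``_NoValue`` stands in for ``np._NoValue``, ``try_array_function_override`` is a heavily simplified stand-in for the NEP helper of the same name, and ``zeros`` and ``ChunkedArray`` are toy definitions.

```python
import functools

_NoValue = object()  # stand-in for np._NoValue


def try_array_function_override(func, relevant_args, args, kwargs):
    """Simplified, hypothetical stand-in for the NEP helper of the same name."""
    for arg in relevant_args:
        handler = getattr(type(arg), "__array_function__", None)
        if handler is None:
            continue
        types = frozenset([type(arg)])
        result = handler(arg, func, types, args, kwargs)
        if result is not NotImplemented:
            return True, result
    return False, None


def array_generation_function(func):
    @functools.wraps(func)
    def wrapped(*args, array_reference=_NoValue, **kwargs):
        if array_reference is not _NoValue:
            success, result = try_array_function_override(
                wrapped, [array_reference], args, kwargs)
            if success:
                return result
        return func(*args, **kwargs)
    return wrapped


@array_generation_function
def zeros(shape):
    """Default implementation: a plain list of zeros (1-D only)."""
    return [0] * shape


class ChunkedArray:
    """Toy duck array that remembers a chunk size, loosely like Dask."""

    def __init__(self, data, chunks):
        self.data = data
        self.chunks = chunks

    def __array_function__(self, func, types, args, kwargs):
        if func is zeros:
            # Reuse the reference array's chunking for the new array.
            return ChunkedArray(zeros(*args, **kwargs), self.chunks)
        return NotImplemented


ref = ChunkedArray([1, 2, 3, 4], chunks=2)
plain = zeros(3)                         # default path, no reference given
chunked = zeros(3, array_reference=ref)  # dispatched to ChunkedArray
```

The reference array carries the extra information (here the chunk size) that a dtype alone could not.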

Hameer Abbasi


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-30 Thread Marten van Kerkwijk
Hi Hameer,


> It is. The point of the proposed feature was to handle array generation
> mechanisms, that don't take an array as input in the standard NumPy API.
> Giving them a reference handles both the dispatch and the decision about
> which implementation to call.
>

Sorry, I had clearly misunderstood. It would indeed be nice for overrides
to work on functions like `zeros` or `arange` as well, but it seems strange
to change the signature just for that. As a possible alternative, should we
perhaps generally check for overrides on `dtype`?

All the best,

Marten


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-30 Thread Marten van Kerkwijk
On Fri, Jun 29, 2018 at 9:54 PM, Eric Wieser 
wrote:

> Good catch,
>
> I think the latter failing is because np.add.reduce ends up calling
> np.ufunc.reduce.__get__(np.add), and builtin_function.__get__ doesn’t
> appear to do any caching. I suppose caching bound methods would just be a
> waste of time.
> == would work just fine in my suggestion above, it seems - irrespective
> of the resolution of the discussion on python-dev.
>
> Eric
>
I think for implementers it might be easiest anyway to look up the ufunc
itself in a dict or so and then check the name of the method. (At least,
for my implementations of `__array_ufunc__`, it made a lot of sense to use
the method in that way; possibly less so for the larger variety of other
numpy functions.)
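
A minimal sketch of that lookup pattern follows. The class ``Tagged`` and the table ``_SUPPORTED`` are hypothetical names invented for the example; only the ``__array_ufunc__`` signature is NumPy's own.

```python
import numpy as np

# Supported ufuncs, looked up by identity; each maps to the method names
# ("__call__", "reduce", ...) this toy class knows how to handle.
_SUPPORTED = {
    np.add: {"__call__", "reduce"},
    np.multiply: {"__call__"},
}


class Tagged:
    """Toy duck array: wraps an ndarray and carries a tag string."""

    def __init__(self, values, tag):
        self.values = np.asarray(values)
        self.tag = tag

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        allowed = _SUPPORTED.get(ufunc)
        if allowed is None or method not in allowed:
            return NotImplemented
        # Unwrap Tagged inputs, apply the ufunc method, and re-wrap.
        raw = [x.values if isinstance(x, Tagged) else x for x in inputs]
        result = getattr(ufunc, method)(*raw, **kwargs)
        return Tagged(result, self.tag)


a = Tagged([1, 2, 3], "metres")
shifted = np.add(a, 1)       # method == "__call__"
total = np.add.reduce(a)     # method == "reduce"
```

An unsupported ufunc such as ``np.subtract`` falls through to ``NotImplemented``, so NumPy raises ``TypeError``.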

-- Marten


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-29 Thread Eric Wieser
Good catch,

I think the latter failing is because np.add.reduce ends up calling
np.ufunc.reduce.__get__(np.add), and builtin_function.__get__ doesn’t
appear to do any caching. I suppose caching bound methods would just be a
waste of time.
== would work just fine in my suggestion above, it seems - irrespective of
the resolution of the discussion on python-dev.
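
The same effect can be seen with ordinary Python bound methods, which are also created afresh on each attribute access. This is an analogy in pure Python, not a reproduction of the C-level ufunc behaviour; the class ``Accumulator`` is made up for the example.

```python
class Accumulator:
    def reduce(self, values):
        return sum(values)


acc = Accumulator()
first = acc.reduce    # each attribute access builds a new bound-method object
second = acc.reduce

same_object = first is second   # identity: two distinct objects
equal = first == second         # equality: compares __self__ and __func__
same_parts = (first.__self__ is second.__self__
              and first.__func__ is second.__func__)
```

Here ``same_object`` is ``False`` while ``equal`` and ``same_parts`` are ``True``, which is why ``==`` is the safer comparison for bound methods.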

Eric



Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-29 Thread Stephan Hoyer
On Thu, Jun 28, 2018 at 5:36 PM Eric Wieser 
wrote:

> Another option would be to directly compare the methods against known ones:
>
>     obj = func.__self__
>     if isinstance(obj, np.ufunc):
>         if func is obj.reduce:
>             got_reduction()
>

I'm not quite sure why, but this doesn't seem to work with current ufunc
objects:

>>> np.add.reduce == np.add.reduce  # OK
True

>>> np.add.reduce is np.add.reduce  # what?!?
False

Maybe this is a bug? There's been some somewhat related discussion recently
on python-dev:
https://mail.python.org/pipermail/python-dev/2018-June/153959.html


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-28 Thread Hameer Abbasi
Hi Marten,

It is. The point of the proposed feature was to handle array generation
mechanisms that don't take an array as input in the standard NumPy API.
Giving them a reference handles both the dispatch and the decision about
which implementation to call.

> I'm confused: Isn't your reference array just `self`?


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-28 Thread Matti Picus




On 28/06/18 17:18, Stephan Hoyer wrote:
> On Thu, Jun 28, 2018 at 1:12 PM Marten van Kerkwijk wrote:
>
>> For C classes like the ufuncs, it seems `__self__` is defined for
>> methods as well (at least, `np.add.reduce.__self__` gives
>> `np.add`), but not a `__func__`. There is a `__name__`
>> (="reduce"), though, which means that I think one can still
>> retrieve what is needed (obviously, this also means
>> `__array_ufunc__` could have been simpler...)
>
> Good point!
>
> I guess this means we should encourage using __name__ rather than
> __func__. I would not want to preclude refactoring classes from Python
> to C/Cython.

There was opposition to that in a PR I made to provide a wrapper around
matmul to turn it into a ufunc. It would have left the __name__ but
changed the __func__.

https://github.com/numpy/numpy/pull/11061#issuecomment-387468084


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-28 Thread Eric Wieser
Another option would be to directly compare the methods against known ones:

    obj = func.__self__
    if isinstance(obj, np.ufunc):
        if func is obj.reduce:
            got_reduction()

Eric

On Thu, 28 Jun 2018 at 17:19 Stephan Hoyer  wrote:

> On Thu, Jun 28, 2018 at 1:12 PM Marten van Kerkwijk <
> m.h.vankerkw...@gmail.com> wrote:
>
>> For C classes like the ufuncs, it seems `__self__` is defined for methods
>> as well (at least, `np.add.reduce.__self__` gives `np.add`), but not a
>> `__func__`. There is a `__name__` (="reduce"), though, which means that I
>> think one can still retrieve what is needed (obviously, this also means
>> `__array_ufunc__` could have been simpler...)
>>
>
> Good point!
>
> I guess this means we should encourage using __name__ rather than
> __func__. I would not want to preclude refactoring classes from Python to
> C/Cython.


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-28 Thread Marten van Kerkwijk
> I did a little more digging, and turned up the __self__ and __func__
> attributes of bound methods:
> https://stackoverflow.com/questions/4679592/how-to-find-instance-of-a-bound-method-in-python
>
> So we might need another decorator function, but it seems that the current
> interface would actually suffice just fine for overriding methods. I'll
> update the NEP with some examples. It will look something like:
>
>     def __array_function__(self, func, types, args, kwargs):
>         ...
>         if isinstance(func, types.MethodType):
>             object = func.__self__
>             unbound_func = func.__func__
>             ...
>
>
For C classes like the ufuncs, it seems `__self__` is defined for methods
as well (at least, `np.add.reduce.__self__` gives `np.add`), but not a
`__func__`. There is a `__name__` (="reduce"), though, which means that I
think one can still retrieve what is needed (obviously, this also means
`__array_ufunc__` could have been simpler...)
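
A small sketch of retrieving that information through `__self__` and `__name__` only, so that it works for C-implemented ufunc methods as well. The helper name ``classify`` and its return format are made up for the example.

```python
import numpy as np


def classify(func):
    """Describe a callable using only __self__ and __name__.

    Returns ("ufunc-method", ufunc_name, method_name) for bound ufunc
    methods such as np.add.reduce, and ("function", None, name) otherwise.
    """
    owner = getattr(func, "__self__", None)
    if isinstance(owner, np.ufunc):
        return ("ufunc-method", owner.__name__, func.__name__)
    return ("function", None, func.__name__)


def plain(x):
    return x


reduce_info = classify(np.add.reduce)
accumulate_info = classify(np.multiply.accumulate)
plain_info = classify(plain)
```

Because nothing here touches `__func__`, the same check keeps working if a class is later moved from Python to C/Cython.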

-- Marten


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-28 Thread Hameer Abbasi
I think this feature is actually needed. Consider
`np.random.RandomState`. If we were to add what I proposed, the two could
work very nicely together to (for example) create Dask random arrays from
RandomState objects.

For reproducibility, Dask could generate multiple RandomState objects with
seeds that run sequentially with the job numbers.
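
A rough sketch of the per-job seeding idea, using plain NumPy rather than Dask. The helper names ``chunk_states`` and ``random_chunks`` are hypothetical, and the sketch says nothing about whether streams seeded this way are statistically independent.

```python
import numpy as np


def chunk_states(base_seed, n_chunks):
    """One RandomState per job, seeded with base_seed + job number."""
    return [np.random.RandomState(base_seed + i) for i in range(n_chunks)]


def random_chunks(base_seed, n_chunks, chunk_size):
    """Draw one block of uniform samples per job."""
    return [state.random_sample(chunk_size)
            for state in chunk_states(base_seed, n_chunks)]


first = random_chunks(0, 3, 4)
second = random_chunks(0, 3, 4)   # same base seed gives the same chunks
```

Each chunk depends only on the base seed and its own job number, so jobs can run in any order and still reproduce the same array.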

Looping in Matt Rocklin for this — He might have some input about the
design.

Best Regards,
Hameer Abbasi


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-28 Thread Marten van Kerkwijk
On Wed, Jun 27, 2018 at 3:50 PM, Stephan Hoyer  wrote:





> So perhaps it's worth "future proofing" the interface by passing `obj` and
> `method` to __array_function__ rather than only `func`. It is slower to
> call a func via func.__call__ than func, but only very marginally (~100 ns
> in my tests).
>

That would make it more similar yet to `__array_ufunc__`, but I'm not sure
how useful it is, as you cannot generically assume the methods have the
same arguments and hence they need their own dispatcher. Once you're there
you might as well pass them on directly (since any callable can be used as
the function). Indeed, for `__array_ufunc__`, this might not have been a
bad idea either...

-- Marten


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-27 Thread Marten van Kerkwijk
Hi Hameer,

I'm confused: Isn't your reference array just `self`?
All the best,

Marten



Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-27 Thread Hameer Abbasi

[Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-26 Thread Stephan Hoyer
After much discussion (and the addition of three new co-authors!), I’m
pleased to present a significant revision of NumPy Enhancement Proposal
18: A dispatch mechanism for NumPy's high level array functions:
http://www.numpy.org/neps/nep-0018-array-function-protocol.html

The full text is also included below.

Best,
Stephan

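===========================================================
A dispatch mechanism for NumPy's high level array functions
===========================================================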

:Author: Stephan Hoyer 
:Author: Matthew Rocklin 
:Author: Marten van Kerkwijk 
:Author: Hameer Abbasi 
:Author: Eric Wieser 
:Status: Draft
:Type: Standards Track
:Created: 2018-05-29

Abstract
--------

We propose the ``__array_function__`` protocol, to allow arguments of NumPy
functions to define how that function operates on them. This will allow
using NumPy as a high level API for efficient multi-dimensional array
operations, even with array implementations that differ greatly from
``numpy.ndarray``.

Detailed description
--------------------

NumPy's high level ndarray API has been implemented several times
outside of NumPy itself for different architectures, such as for GPU
arrays (CuPy), Sparse arrays (scipy.sparse, pydata/sparse) and parallel
arrays (Dask array) as well as various NumPy-like implementations in the
deep learning frameworks, like TensorFlow and PyTorch.

Similarly there are many projects that build on top of the NumPy API
for labeled and indexed arrays (XArray), automatic differentiation
(Autograd, Tangent), masked arrays (numpy.ma), physical units
(astropy.units, pint, unyt), etc. that add additional functionality on top
of the NumPy API. Most of these projects also implement a close variation
of NumPy's high level API.

We would like to be able to use these libraries together, for example we
would like to be able to place a CuPy array within XArray, or perform
automatic differentiation on Dask array code. This would be easier to
accomplish if code written for NumPy ndarrays could also be used by
other NumPy-like projects.

For example, we would like for the following code example to work
equally well with any NumPy-like array object:

.. code:: python

    def f(x):
        y = np.tensordot(x, x.T)
        return np.mean(np.exp(y))

Some of this is possible today with various protocol mechanisms within
NumPy.

-  The ``np.exp`` function checks the ``__array_ufunc__`` protocol
-  The ``.T`` method works using Python's method dispatch
-  The ``np.mean`` function explicitly checks for a ``.mean`` method on
   the argument

However other functions, like ``np.tensordot`` do not dispatch, and
instead are likely to coerce to a NumPy array (using the ``__array__``
protocol), or err outright. To achieve enough coverage of the NumPy API
to support downstream projects like XArray and autograd we want to
support *almost all* functions within NumPy, which calls for a more
far-reaching protocol than just ``__array_ufunc__``. We would like a
protocol that allows arguments of a NumPy function to take control and
divert execution to another function (for example a GPU or parallel
implementation) in a way that is safe and consistent across projects.

Implementation
--------------

We propose adding support for a new protocol in NumPy,
``__array_function__``.

This protocol is intended to be a catch-all for NumPy functionality that
is not covered by the ``__array_ufunc__`` protocol for universal functions
(like ``np.exp``). The semantics are very similar to ``__array_ufunc__``,
except the operation is specified by an arbitrary callable object rather
than a ufunc instance and method.

A prototype implementation can be found in
`this notebook <https://nbviewer.jupyter.org/gist/shoyer/1f0a308a06cd96df20879a1ddb8f0006>`_.

The interface
~~~~~~~~~~~~~

We propose the following signature for implementations of
``__array_function__``:

.. code-block:: python

    def __array_function__(self, func, types, args, kwargs)

-  ``func`` is an arbitrary callable exposed by NumPy's public API,
   which was called in the form ``func(*args, **kwargs)``.
-  ``types`` is a ``frozenset`` of unique argument types from the original
   NumPy function call that implement ``__array_function__``.
-  The tuple ``args`` and dict ``kwargs`` are directly passed on from the
   original call.

Unlike ``__array_ufunc__``, there are no high-level guarantees about the
type of ``func``, or about which of ``args`` and ``kwargs`` may contain
objects implementing the array API.
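
As an illustration of the signature, here is a minimal sketch of a duck array that overrides a single function. It assumes a NumPy release in which the protocol is implemented (it is enabled by default from 1.17 onward). ``DiagonalArray``, ``HANDLED_FUNCTIONS`` and ``implements`` are names invented for the example, and the code only iterates over ``types``, so it does not depend on the container type.

```python
import numpy as np

HANDLED_FUNCTIONS = {}


def implements(numpy_function):
    """Register an implementation of a NumPy function for DiagonalArray."""
    def decorator(func):
        HANDLED_FUNCTIONS[numpy_function] = func
        return func
    return decorator


class DiagonalArray:
    """Toy duck array: an n x n matrix with a constant diagonal."""

    def __init__(self, n, value):
        self.n = n
        self.value = value

    def __array_function__(self, func, types, args, kwargs):
        if func not in HANDLED_FUNCTIONS:
            return NotImplemented
        # Defer unless every participating type is one we know about.
        if not all(issubclass(t, DiagonalArray) for t in types):
            return NotImplemented
        return HANDLED_FUNCTIONS[func](*args, **kwargs)


@implements(np.sum)
def _sum(arr, *args, **kwargs):
    # Only the n diagonal entries are non-zero.
    return arr.n * arr.value


d = DiagonalArray(5, 2)
total = np.sum(d)   # dispatched to _sum via __array_function__
```

Functions without a registered implementation, such as ``np.mean`` here, return ``NotImplemented`` from the override, and NumPy then raises ``TypeError``.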

As a convenience for ``__array_function__`` implementors, ``types``
provides all argument types with an ``'__array_function__'`` attribute.
This allows downstream implementations to quickly determine if they are
likely able to support the operation. A ``frozenset`` is used to ensure that
``__array_function__`` implementations cannot rely on the iteration order of
``types``, which would facilitate violating the well-defined "Type casting
hierarchy" described in