We seem to have run out of steam a bit here.
On Tue, Apr 30, 2019 at 7:24 AM Stephan Hoyer <sho...@gmail.com> wrote:

> On Mon, Apr 29, 2019 at 5:49 AM Marten van Kerkwijk <m.h.vankerkw...@gmail.com> wrote:

>>> The uses that I've seen so far (in CuPy and JAX) involve a handful of functions that are directly re-exported from NumPy, e.g., jax.numpy.array_repr is the exact same object as numpy.array_repr:

>>> https://github.com/cupy/cupy/blob/c3f1be602bf6951b007beaae644a5662f910048b/cupy/__init__.py#L341-L366
>>> https://github.com/google/jax/blob/5edb23679f2605654949156da84e330205840695/jax/numpy/lax_numpy.py#L89-L132

>>> I suspect this will be less common in the future if __array_function__ takes off, but for now it's convenient because users don't need to know exactly which functions have been reimplemented. They can just use "import jax.numpy as np" and everything works.

>>> These libraries are indeed passing CuPy or JAX arrays into NumPy functions, which currently happen to have the desired behavior, thanks to accidental details about how NumPy currently supports duck-typing and/or coercions.

>>> To this end, it would be really nice to have an alias that *is* guaranteed to work exactly as if __array_function__ didn't exist, and not only for numpy.ndarray arrays.

>> Just to be clear: for this purpose, being able to call the implementation is still mostly a convenient crutch, correct? For classes that define __array_function__, would you expect more than the guarantee I wrote above, that the wrapped version will continue to work as advertised for ndarray input only?

> I'm not sure I agree -- what would be the more principled alternative here?

> Modules that emulate NumPy's public API for a new array type are both pretty common (cupy, jax.numpy, autograd, dask.array, pydata/sparse, etc.) and also the best early candidates for adopting NEP-18, because they don't need to do much extra work to write a __array_function__ method. I want to make it as easy as possible for these early adopters, because their success will make or break the entire __array_function__ protocol.

> In the long term, I agree that the importance of these numpy-like namespaces will diminish, because it will be possible to use the original NumPy namespace instead. Possibly new projects will decide that they don't need to bother with them at all. But there are still lots of plausible reasons for keeping them around even for a project that implements __array_function__, e.g.:
> (a) to avoid the overhead of NumPy's dispatching
> (b) to access functions like np.ones that return a different array type
> (c) to make use of optional duck-array-specific arguments, e.g., the split_every argument to dask.array.sum()
> (d) if they care about supporting versions of NumPy older than 1.17

> In practice, I suspect we'll see these modules continue to exist for a long time. And they really do rely upon the exact behavior of NumPy today, whatever that happens to be (e.g., the undocumented fact that np.result_type supports duck-typing with the .dtype attribute rather than coercing its arguments to NumPy arrays).
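As a rough illustration of why adopting the protocol is "not much extra work" for such projects: an __array_function__ method mostly just maps NumPy functions to the project's own implementations. A minimal sketch, assuming a hypothetical DuckArray class and HANDLED_FUNCTIONS registry (names invented here, not taken from any of the projects above):

```python
import numpy as np

# Hypothetical registry of NumPy functions this duck array reimplements.
HANDLED_FUNCTIONS = {}

def implements(numpy_function):
    """Register a DuckArray implementation for a NumPy function."""
    def decorator(func):
        HANDLED_FUNCTIONS[numpy_function] = func
        return func
    return decorator

class DuckArray:
    """Hypothetical duck array wrapping a plain ndarray."""

    def __init__(self, data):
        self.data = np.asarray(data)

    # Exposing .dtype is the kind of duck-typing that functions like
    # np.result_type rely on today, per the observation above.
    @property
    def dtype(self):
        return self.data.dtype

    def __array_function__(self, func, types, args, kwargs):
        # NEP-18 hook: use our implementation if we have one and all
        # argument types are ones we understand; otherwise decline.
        if func not in HANDLED_FUNCTIONS:
            return NotImplemented
        if not all(issubclass(t, (DuckArray, np.ndarray)) for t in types):
            return NotImplemented
        return HANDLED_FUNCTIONS[func](*args, **kwargs)

@implements(np.sum)
def _sum(a, axis=None):
    return DuckArray(np.sum(a.data, axis=axis))
```

With NumPy 1.17 (or 1.16 with the NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1 environment variable), np.sum(DuckArray([1, 2, 3])) then dispatches to _sum and returns a DuckArray.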
>> In particular, suppose we change an implementation to use different numpy functions internally (which are of course overridden using __array_function__). I could imagine situations where that would work fine for everything that does not define __array_ufunc__, but where it would not for classes that do define it. Is that then a problem for numpy, or for the project that has a class that defines __array_function__?

> If we change an existing NumPy function to start calling ufuncs directly on input arguments, rather than calling np.asarray() on its inputs,

This wasn't really the question, I believe. More like: if numpy function A now calls B under the hood, and we replace B with C (in a way that's fully backwards compatible for users of A), will that be a problem in the future? I think that in practice this doesn't happen a lot, and it is quite unlikely to be a problem.

> that will already (potentially) be a breaking change. We lost the ability to do these sorts of refactors without breaking backwards compatibility when we added __array_ufunc__. So I think it's already our problem, unless we're willing to risk breaking __array_ufunc__ users.

> That said, I doubt this would actually be a major issue in practice. The projects for which __array_function__ makes the most sense are "full duck arrays," and all these projects are going to implement __array_ufunc__, too, in a mostly compatible way.

> I'm a little puzzled by why you are concerned about retaining the flexibility to later reuse the attribute I'm asking for here for a function that works differently. What I want is a special attribute that is guaranteed to work like the public version of a NumPy function, but without checking for an __array_function__ attribute.

> If we later decide we want to expose an attribute that provides a non-coercing function that calls ufuncs directly instead of np.asarray, what do we lose by giving it a new name, so users don't need to worry about changed behavior? There is plenty of room for special attributes on NumPy functions. We can have both np.something.__skip_array_overrides__ and np.something.__array_implementation__.

That's a good argument, I think.

Ralf

>>> So we might as well pick a name that works for both, e.g., __skip_array_overrides__ rather than __skip_array_function__. This would let us save our users a bit of pain by not requiring them to make changes like np.where.__skip_array_function__ -> np.where.__skip_array_ufunc__.

>> Note that for ufuncs it is not currently possible to skip the override. I don't think it is super-hard to do, but I'm not sure I see the need to add a crutch where none has been needed so far. More generally, it is not obvious there is any C code where skipping the override is useful, since the C code relies much more directly on inputs being ndarray.

> To be entirely clear: I was thinking of ufunc.method.__skip_array_overrides__() as "equivalent to ufunc.method() except not checking for __array_ufunc__ attributes".

> I think the use cases would be for Python code that calls ufuncs, in much the same way that there are use cases for Python code that calls other NumPy functions, e.g.:
> - np.sin.__skip_array_overrides__() could be slightly faster than np.sin(), because it avoids checking for __array_ufunc__ attributes.
> - np.add.__skip_array_overrides__(x, y) is definitely going to be faster than np.add(np.asarray(x), np.asarray(y)), because it avoids the overhead of two Python function calls.

> The use cases here are certainly not as compelling as those for __array_function__, because __array_ufunc__'s arguments are in a standardized form, but I think they're still meaningful. Not to mention, we can refactor np.ndarray.__array_ufunc__ to work exactly like np.ndarray.__array_function__, eliminating the special case in NEP-13's dispatch rules.
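For concreteness, downstream code would use the proposed attribute roughly along these lines. This is only a sketch: the __skip_array_overrides__ name follows the proposal in this thread and may not exist in any released NumPy, so the lookup is guarded and falls back to the public function (which still performs override dispatch):

```python
import numpy as np

def unwrapped(func):
    # Use the proposed attribute if this NumPy provides it; otherwise fall
    # back to the public function, which still checks for
    # __array_function__ / __array_ufunc__ attributes on its arguments.
    return getattr(func, "__skip_array_overrides__", func)

# A generic function and a ufunc, called without override dispatch
# whenever the attribute is available:
mask = unwrapped(np.where)([True, False, True], 1, 2)
total = unwrapped(np.add)(3.0, 4.0)
```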
> I agree that it wouldn't make sense to call the "generic duck-array implementation" of a ufunc (these don't exist), but that wasn't what I was proposing here.
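For readers less familiar with the NEP-13 hook referred to above: a duck-array class opts into ufunc dispatch by defining __array_ufunc__. A minimal sketch, with a hypothetical MyArray class (unrelated to any project in this thread):

```python
import numpy as np

class MyArray:
    """Hypothetical duck array; only the NEP-13 hook is sketched."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Unwrap MyArray inputs, apply the ufunc to the underlying
        # ndarrays, and re-wrap the result; decline anything else.
        if method != "__call__" or kwargs.get("out") is not None:
            return NotImplemented
        unwrapped = [x.data if isinstance(x, MyArray) else x for x in inputs]
        return MyArray(ufunc(*unwrapped, **kwargs))

# np.add(MyArray([1, 2]), 3) dispatches to this hook and returns a MyArray.
```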