Hi Ralf, You're right, the problem is with the added keyword argument (which would appear also if we did not still have to support the old .sum method override but just dispatched to __array_ufunc__ with `np.add.reduce` - maybe worse given that it seems likely the reduce method has seen much less testing in __array_ufunc__ implementations).
Still, I do think the question stands: we implement a `nansum` for our ndarray class a certain way, and provide ways to override it (three now, in fact). Is it really reasonable to expect that we wait 4 versions for other packages to keep up with this, and thus get stuck with given internal implementations? Aside: note that the present version of the nanfunctions relies on turning the arguments into arrays and copying 0s into it - that suggests that currently they do not work for duck arrays like Dask. All the best, Marten On Wed, Jun 12, 2019 at 4:32 PM Ralf Gommers <ralf.gomm...@gmail.com> wrote: > > > On Wed, Jun 12, 2019 at 12:02 AM Stefan van der Walt <stef...@berkeley.edu> > wrote: > >> On Tue, 11 Jun 2019 15:10:16 -0400, Marten van Kerkwijk wrote: >> > In a way, I brought it up mostly as a concrete example of an internal >> > implementation which we cannot change to an objectively cleaner one >> because >> > other packages rely on an out-of-date numpy API. >> > > I think this is not the right way to describe the problem (see below). > > >> This, and the comments Nathaniel made on the array function thread, are >> important to take note of. Would it be worth updating NEP 18 with a >> list of pitfalls? Or should this be a new informational NEP that >> discusses—on a higher level—the benefits, risks, and design >> considerations of providing protocols? >> > > That would be a nice thing to do (the higher level one), but in this case > I think the issue has little to do with NEP 18. The summary of the issue in > this thread is a little brief, so let me try to clarify. > > 1. np.sum gained a new `where=` keyword in 1.17.0 > 2. using np.sum(x) will detect a `x.sum` method if it's present and try to > use that > 3. the `_wrapreduction` utility that forwards the function to the method > will compare signatures of np.sum and x.sum, and throw an error if there's > a mismatch for any keywords that have a value other than the default > np._NoValue > > Code to check this: > >>> x1 = np.arange(5) > >>> x2 = np.asmatrix(x1) > >>> np.sum(x1) # works > >>> np.sum(x2) # works > >>> np.sum(x1, where=x1>3) # works > >>> np.sum(x2, where=x2>3) # _wrapreduction throws TypeError > ... > TypeError: sum() got an unexpected keyword argument 'where' > > Note that this is not specific to np.matrix. Using pandas.Series you also > get a TypeError: > >>> y = pd.Series(x1) > >>> np.sum(y) # works > >>> np.sum(y, where=y>3) # pandas throws TypeError > ... > TypeError: sum() got an unexpected keyword argument 'where' > > The issue is that when we have this kind of forwarding logic, irrespective > of how it's implemented, new keywords cannot be used until the array-like > objects with the methods that get forwarded to gain the same keyword. > > tl;dr this is simply a cost we have to be aware of when either proposing > to add new keywords, or when proposing any kind of dispatching logic (in > this case `_wrapreduction`). > > Regarding internal use of `np.sum(..., where=)`: this should not be done > until at least 4-5 versions from now, and preferably way longer than that. > Because doing so will break already released versions of Pandas, Dask, and > other libraries with array-like objects. > > Cheers, > Ralf > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@python.org > https://mail.python.org/mailman/listinfo/numpy-discussion >
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion