Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1
On Tue, Jul 2, 2019 at 1:15 PM Ralf Gommers wrote:
> On Tue, Jul 2, 2019 at 8:38 AM Stephan Hoyer wrote:
>> On Tue, Jul 2, 2019 at 8:16 AM Ralf Gommers wrote:
>>> On Tue, Jul 2, 2019 at 1:45 AM Juan Nunez-Iglesias wrote:
>>>> On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
>>>>> This is addressed in the NEP, see bullet 1 under "Partial
>>>>> implementation of NumPy's API":
>>>>> http://www.numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api
>>>>>
>>>>> My concern is that fallback coercion behavior makes it difficult
>>>>> to reliably implement "strict" overrides of NumPy's API. Fallback
>>>>> coercion is certainly useful for interactive use, but it isn't
>>>>> really appropriate for libraries.
>>>
>>> Do you mean "fallback coercion in NumPy itself", or "at all"? Right
>>> now there's lots of valid code around, e.g.
>>> `np.median(some_dask_array)`. Users will keep wanting to do that.
>>> Forcing everyone to write `np.median(np.array(some_dask_array))`
>>> serves no purpose. So the coercion has to be somewhere. You're
>>> arguing that that's up to Dask et al I think?
>>
>> Yes, I'm arguing this is up to dask to maintain backwards
>> compatibility -- or not, as the maintainers see fit.
>>
>> NumPy adding dispatching with __array_function__ did not break any
>> existing code, until the maintainers of other libraries started
>> adding __array_function__ methods. I hope that the risks of
>> implementing such experimental methods were self-evident.
>
> Yeah, that's a bit of a chicken-and-egg story though. We add something
> and try to be "strict". Dask adds something because they like the idea
> and generally are quick to adopt these types of things. If we make it
> too hard to be backwards compatible, then neither NumPy nor Dask may
> try and it ends up breaking scikit-image & co. I for one don't care
> where the fix lands, but it's pretty clear to me that breaking
> scikit-image is the worst of all options.
>
>>> Putting it in Dask right now still doesn't address Juan's backwards
>>> compat concern, but perhaps that could be bridged with a Dask bugfix
>>> release and some short-lived pain.
>>
>> I really think this is the best (only?) path forward.
>
> I think I agree (depending on how easy it is to get the Dask fix
> landed).

That's landed, and Dask is planning a bugfix release in 2 days, so before the NumPy 1.17.0 release. So this is not a release blocker anymore for us I think.

Cheers,
Ralf

>>> I'm not convinced that this shouldn't be fixed in NumPy though. Your
>>> concern "reliably implement "strict" overrides of NumPy's API" is a
>>> bit abstract. Overriding the _whole_ NumPy API is definitely
>>> undesirable. If we had a reference list somewhere of every function
>>> that is handled with __array_function__, would that address your
>>> concern? Such a list could be auto-generated fairly easily.
>>
>> By "reliably implement strict overrides" I mean the ability to ensure
>> that every operation either uses an override or raises an informative
>> error -- making it very clear which operation needs to be implemented
>> or avoided.
>
> That isn't necessarily a good goal in itself though. In many cases, an
> `asarray` call still needs to go *somewhere*. If the "reliably
> implement strict overrides" is to help library authors, then there may
> be other ways to do that. For end users it can only hurt; those
> TypeErrors aren't exactly easy to understand.
>
>> It's true that we didn't really consider "always issuing warnings" as
>> a long term solution in the NEP. I can see how this would simplify a
>> backwards compatibility story for libraries like dask, but in
>> general, I really don't like warnings:
>
> I agree.
>
> Cheers,
> Ralf
>
>> Using them like exceptions can easily result in code that is
>> partially broken or that fails later for non-obvious reasons. There's
>> a reason why Python's errors stop execution flow, unlike errors in
>> languages like PHP or JavaScript.

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1
On Tue, Jul 2, 2019 at 8:38 AM Stephan Hoyer wrote:
> On Tue, Jul 2, 2019 at 8:16 AM Ralf Gommers wrote:
>> On Tue, Jul 2, 2019 at 1:45 AM Juan Nunez-Iglesias wrote:
>>> On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
>>>> This is addressed in the NEP, see bullet 1 under "Partial
>>>> implementation of NumPy's API":
>>>> http://www.numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api
>>>>
>>>> My concern is that fallback coercion behavior makes it difficult to
>>>> reliably implement "strict" overrides of NumPy's API. Fallback
>>>> coercion is certainly useful for interactive use, but it isn't
>>>> really appropriate for libraries.
>>
>> Do you mean "fallback coercion in NumPy itself", or "at all"? Right
>> now there's lots of valid code around, e.g.
>> `np.median(some_dask_array)`. Users will keep wanting to do that.
>> Forcing everyone to write `np.median(np.array(some_dask_array))`
>> serves no purpose. So the coercion has to be somewhere. You're
>> arguing that that's up to Dask et al I think?
>
> Yes, I'm arguing this is up to dask to maintain backwards
> compatibility -- or not, as the maintainers see fit.
>
> NumPy adding dispatching with __array_function__ did not break any
> existing code, until the maintainers of other libraries started adding
> __array_function__ methods. I hope that the risks of implementing such
> experimental methods were self-evident.

Yeah, that's a bit of a chicken-and-egg story though. We add something and try to be "strict". Dask adds something because they like the idea and generally are quick to adopt these types of things. If we make it too hard to be backwards compatible, then neither NumPy nor Dask may try and it ends up breaking scikit-image & co. I for one don't care where the fix lands, but it's pretty clear to me that breaking scikit-image is the worst of all options.

>> Putting it in Dask right now still doesn't address Juan's backwards
>> compat concern, but perhaps that could be bridged with a Dask bugfix
>> release and some short-lived pain.
>
> I really think this is the best (only?) path forward.

I think I agree (depending on how easy it is to get the Dask fix landed).

>> I'm not convinced that this shouldn't be fixed in NumPy though. Your
>> concern "reliably implement "strict" overrides of NumPy's API" is a
>> bit abstract. Overriding the _whole_ NumPy API is definitely
>> undesirable. If we had a reference list somewhere of every function
>> that is handled with __array_function__, would that address your
>> concern? Such a list could be auto-generated fairly easily.
>
> By "reliably implement strict overrides" I mean the ability to ensure
> that every operation either uses an override or raises an informative
> error -- making it very clear which operation needs to be implemented
> or avoided.

That isn't necessarily a good goal in itself though. In many cases, an `asarray` call still needs to go *somewhere*. If the "reliably implement strict overrides" is to help library authors, then there may be other ways to do that. For end users it can only hurt; those TypeErrors aren't exactly easy to understand.

> It's true that we didn't really consider "always issuing warnings" as
> a long term solution in the NEP. I can see how this would simplify a
> backwards compatibility story for libraries like dask, but in general,
> I really don't like warnings:

I agree.

Cheers,
Ralf

> Using them like exceptions can easily result in code that is partially
> broken or that fails later for non-obvious reasons. There's a reason
> why Python's errors stop execution flow, unlike errors in languages
> like PHP or JavaScript.
Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1
On Tue, Jul 2, 2019 at 8:16 AM Ralf Gommers wrote:
> On Tue, Jul 2, 2019 at 1:45 AM Juan Nunez-Iglesias wrote:
>> On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
>>> This is addressed in the NEP, see bullet 1 under "Partial
>>> implementation of NumPy's API":
>>> http://www.numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api
>>>
>>> My concern is that fallback coercion behavior makes it difficult to
>>> reliably implement "strict" overrides of NumPy's API. Fallback
>>> coercion is certainly useful for interactive use, but it isn't
>>> really appropriate for libraries.
>
> Do you mean "fallback coercion in NumPy itself", or "at all"? Right now
> there's lots of valid code around, e.g. `np.median(some_dask_array)`.
> Users will keep wanting to do that. Forcing everyone to write
> `np.median(np.array(some_dask_array))` serves no purpose. So the
> coercion has to be somewhere. You're arguing that that's up to Dask et
> al I think?

Yes, I'm arguing this is up to dask to maintain backwards compatibility -- or not, as the maintainers see fit.

NumPy adding dispatching with __array_function__ did not break any existing code, until the maintainers of other libraries started adding __array_function__ methods. I hope that the risks of implementing such experimental methods were self-evident.

> Putting it in Dask right now still doesn't address Juan's backwards
> compat concern, but perhaps that could be bridged with a Dask bugfix
> release and some short-lived pain.

I really think this is the best (only?) path forward.

> I'm not convinced that this shouldn't be fixed in NumPy though. Your
> concern "reliably implement "strict" overrides of NumPy's API" is a bit
> abstract. Overriding the _whole_ NumPy API is definitely undesirable.
> If we had a reference list somewhere of every function that is handled
> with __array_function__, would that address your concern? Such a list
> could be auto-generated fairly easily.

By "reliably implement strict overrides" I mean the ability to ensure that every operation either uses an override or raises an informative error -- making it very clear which operation needs to be implemented or avoided.

It's true that we didn't really consider "always issuing warnings" as a long term solution in the NEP. I can see how this would simplify a backwards compatibility story for libraries like dask, but in general, I really don't like warnings: Using them like exceptions can easily result in code that is partially broken or that fails later for non-obvious reasons. There's a reason why Python's errors stop execution flow, unlike errors in languages like PHP or JavaScript.
Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1
On Tue, Jul 2, 2019 at 1:46 AM Juan Nunez-Iglesias wrote:
> I'm also wondering where the list of functions that must be implemented
> can be found, so that libraries like dask and CuPy can be sure that
> they have a complete implementation, and further TypeErrors won't be
> raised with their arrays.

This is a good question. We don't have a master list currently.

In practice, I would be surprised if there is ever more than one full implementation of NumPy's API. We added dispatch with __array_function__ even to really obscure corners of NumPy's API, e.g., np.lib.scimath.

The short answer right now is "Any publicly exposed function that says it takes array-like arguments, aside from functions specifically for coercing to NumPy arrays and the functions in numpy.testing."
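For what it's worth, a rough first cut of such a list can probably be generated by introspection. The sketch below is only a guess at how one might do it, not an official tool: it assumes that dispatched functions expose a private `_implementation` attribute, which is an implementation detail that may change between NumPy releases, and the helper name `dispatched_names` is made up for this example. It only looks at the top-level `numpy` namespace, so it would miss submodules such as np.linalg or np.fft.

```python
import numpy as np


def dispatched_names(namespace=np):
    """List public callables in `namespace` that appear to be dispatched.

    ASSUMPTION: functions overridable via __array_function__ carry a
    private `_implementation` attribute. This is not a documented API
    and may not hold in every NumPy version.
    """
    names = []
    # vars() reads the module dict directly, so nothing is lazily imported.
    for name, obj in sorted(vars(namespace).items()):
        if name.startswith("_"):
            continue  # skip private names
        if callable(obj) and hasattr(obj, "_implementation"):
            names.append(name)
    return names


names = dispatched_names()
print(len(names), "candidate functions, starting with", names[:5])
```

The result is a list of candidates to review by hand against the "short answer" above, since the check depends on a private attribute.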
Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1
On Tue, Jul 2, 2019 at 1:45 AM Juan Nunez-Iglesias wrote:
> On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
>> This is addressed in the NEP, see bullet 1 under "Partial
>> implementation of NumPy's API":
>> http://www.numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api
>>
>> My concern is that fallback coercion behavior makes it difficult to
>> reliably implement "strict" overrides of NumPy's API. Fallback
>> coercion is certainly useful for interactive use, but it isn't really
>> appropriate for libraries.

Do you mean "fallback coercion in NumPy itself", or "at all"? Right now there's lots of valid code around, e.g. `np.median(some_dask_array)`. Users will keep wanting to do that. Forcing everyone to write `np.median(np.array(some_dask_array))` serves no purpose. So the coercion has to be somewhere. You're arguing that that's up to Dask et al I think?

Putting it in Dask right now still doesn't address Juan's backwards compat concern, but perhaps that could be bridged with a Dask bugfix release and some short-lived pain.

I'm not convinced that this shouldn't be fixed in NumPy though. Your concern "reliably implement "strict" overrides of NumPy's API" is a bit abstract. Overriding the _whole_ NumPy API is definitely undesirable. If we had a reference list somewhere of every function that is handled with __array_function__, would that address your concern? Such a list could be auto-generated fairly easily.

>> In contrast to putting this into NumPy, if a library like dask
>> prefers to issue warnings or even keep around fallback coercion
>> indefinitely (not that I would recommend it), they can do that by
>> putting it in their __array_function__ implementation.
>
> I get the above concerns, and thanks for bringing them up, Stephan, as
> I'd only skimmed the NEP the first time around and missed them.
> Nevertheless, the fact is that the current behaviour breaks user code
> that was perfectly valid until NumPy 1.16, which seems, well, insane.
> So, warning for a few versions followed by raising seems like the only
> way forward to me. The NEP explicitly states “We would like to gain
> experience with how __array_function__ is actually used before making
> decisions that would be difficult to roll back.” I think that this
> breakage *is* that experience, and the decision right now should be
> not to break user code with no warning period.
>
> I'm also wondering where the list of functions that must be implemented
> can be found, so that libraries like dask and CuPy can be sure that
> they have a complete implementation, and further TypeErrors won't be
> raised with their arrays.

This is one of the reasons I'm working on https://github.com/Quansight-Labs/rnumpy. It doesn't make sense for any library to copy the whole NumPy API: it's way too large, with lots of stuff in there that's only there for backwards compat and has a better alternative or shouldn't be in NumPy in the first place.

Cheers,
Ralf
Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1
On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
> This is addressed in the NEP, see bullet 1 under "Partial
> implementation of NumPy's API":
> http://www.numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api
>
> My concern is that fallback coercion behavior makes it difficult to
> reliably implement "strict" overrides of NumPy's API. Fallback coercion
> is certainly useful for interactive use, but it isn't really
> appropriate for libraries.
>
> In contrast to putting this into NumPy, if a library like dask prefers
> to issue warnings or even keep around fallback coercion indefinitely
> (not that I would recommend it), they can do that by putting it in
> their __array_function__ implementation.

I get the above concerns, and thanks for bringing them up, Stephan, as I'd only skimmed the NEP the first time around and missed them. Nevertheless, the fact is that the current behaviour breaks user code that was perfectly valid until NumPy 1.16, which seems, well, insane. So, warning for a few versions followed by raising seems like the only way forward to me. The NEP explicitly states “We would like to gain experience with how `__array_function__` is actually used before making decisions that would be difficult to roll back.” I think that this breakage *is* that experience, and the decision right now should be not to break user code with no warning period.

I'm also wondering where the list of functions that must be implemented can be found, so that libraries like dask and CuPy can be sure that they have a complete implementation, and further TypeErrors won't be raised with their arrays.
Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1
>> Your suggestion on the issue to switch from TypeError to warning is,
>> imho, much better, as long as the warning contains a link to an
>> issue/webpage explaining what needs to happen. It's only because I've
>> been vaguely aware of the `__array_function__` discussions that I was
>> able to diagnose relatively quickly. The average user would be very
>> confused by this code break or by a warning, and be unsure of what
>> they need to do to get rid of the warning.
>
> This would work I think. It's not even a band-aid, it's probably the
> better design option because any sane library that implements
> __array_function__ will have a much smaller API surface than NumPy -
> and why forbid users from feeding array-like input to the rest of the
> NumPy functions?

This is addressed in the NEP, see bullet 1 under "Partial implementation of NumPy's API":
http://www.numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api

My concern is that fallback coercion behavior makes it difficult to reliably implement "strict" overrides of NumPy's API. Fallback coercion is certainly useful for interactive use, but it isn't really appropriate for libraries.

In contrast to putting this into NumPy, if a library like dask prefers to issue warnings or even keep around fallback coercion indefinitely (not that I would recommend it), they can do that by putting it in their __array_function__ implementation.
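To make the last point concrete, here is a minimal sketch of what a library-side choice between "strict" overrides and warn-then-coerce fallback might look like. It is an illustration under stated assumptions, not Dask's actual code: `DuckArray`, its `strict` flag, the `_overrides` registry and `duck_sum` are all hypothetical names, and only top-level positional arguments are unwrapped.

```python
import warnings

import numpy as np


class DuckArray:
    """Hypothetical array-like wrapper, for illustration only."""

    _overrides = {}  # NumPy function -> this library's implementation

    def __init__(self, data, strict=False):
        self._data = np.asarray(data)
        self.strict = strict

    def __array_function__(self, func, types, args, kwargs):
        if func in self._overrides:
            return self._overrides[func](*args, **kwargs)
        if self.strict:
            # NumPy turns NotImplemented into a TypeError for the caller.
            return NotImplemented
        warnings.warn(
            f"{func.__name__} is not implemented for DuckArray; "
            "coercing to numpy.ndarray instead",
            FutureWarning,
        )
        # Sketch only: unwrap top-level positional DuckArray arguments.
        coerced = tuple(a._data if isinstance(a, DuckArray) else a for a in args)
        return func(*coerced, **kwargs)


def duck_sum(x, **kwargs):
    # Hypothetical override: a real library would do its own computation.
    return np.sum(x._data, **kwargs)


DuckArray._overrides[np.sum] = duck_sum

fallback = DuckArray([1.0, 2.0, 3.0])
total = np.sum(fallback)  # uses the registered override, no warning

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = np.median(fallback)  # no override: warns, then coerces

strict_error = None
try:
    np.median(DuckArray([1.0, 2.0, 3.0], strict=True))
except TypeError as exc:
    strict_error = exc  # strict mode surfaces the missing override
```

With the fallback branch the existing `np.median(some_array)` style of user code keeps working and emits a warning; with `strict=True` the same call raises, which is the behaviour under discussion in this thread.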
Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1
On Mon, Jul 1, 2019 at 7:37 AM Juan Nunez-Iglesias wrote:
> On Mon, 1 Jul 2019, at 2:34 PM, Ralf Gommers wrote:
>> This issue is not very surprising - __array_function__ is going to
>> have a fair bit of backwards compat impact for people who were
>> relying on feeding all sorts of stuff into numpy functions that
>> previously got converted with asarray. At this point Dask is the main
>> worry, followed by CuPy and pydata/sparse. All those libraries have
>> very responsive maintainers. Perhaps we should just try to get these
>> issues fixed asap in those libraries instead?
>
> Fixing them is not sufficient, because many people are still going to
> end up with broken code unless they are bleeding-edge with everything.
> It's best to minimise the number of forbidden version combinations.

Yes, fair enough.

> Your suggestion on the issue to switch from TypeError to warning is,
> imho, much better, as long as the warning contains a link to an
> issue/webpage explaining what needs to happen. It's only because I've
> been vaguely aware of the `__array_function__` discussions that I was
> able to diagnose relatively quickly. The average user would be very
> confused by this code break or by a warning, and be unsure of what
> they need to do to get rid of the warning.

This would work I think. It's not even a band-aid, it's probably the better design option because any sane library that implements __array_function__ will have a much smaller API surface than NumPy - and why forbid users from feeding array-like input to the rest of the NumPy functions?

Cheers,
Ralf
Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1
On Mon, 1 Jul 2019, at 2:34 PM, Ralf Gommers wrote:
> This issue is not very surprising - __array_function__ is going to have
> a fair bit of backwards compat impact for people who were relying on
> feeding all sorts of stuff into numpy functions that previously got
> converted with asarray. At this point Dask is the main worry, followed
> by CuPy and pydata/sparse. All those libraries have very responsive
> maintainers. Perhaps we should just try to get these issues fixed asap
> in those libraries instead?

Fixing them is not sufficient, because many people are still going to end up with broken code unless they are bleeding-edge with everything. It's best to minimise the number of forbidden version combinations.

Your suggestion on the issue to switch from TypeError to warning is, imho, much better, as long as the warning contains a link to an issue/webpage explaining what needs to happen. It's only because I've been vaguely aware of the `__array_function__` discussions that I was able to diagnose relatively quickly. The average user would be very confused by this code break or by a warning, and be unsure of what they need to do to get rid of the warning.