Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1

2019-07-05 Thread Ralf Gommers
On Tue, Jul 2, 2019 at 1:15 PM Ralf Gommers  wrote:

>
>
> On Tue, Jul 2, 2019 at 8:38 AM Stephan Hoyer  wrote:
>
>> On Tue, Jul 2, 2019 at 8:16 AM Ralf Gommers 
>> wrote:
>>
>>>
>>>
>>> On Tue, Jul 2, 2019 at 1:45 AM Juan Nunez-Iglesias 
>>> wrote:
>>>
>>>> On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
>>>>
>>>> This is addressed in the NEP, see bullet 1 under "Partial
>>>> implementation of NumPy's API":
>>>>
>>>> http://www.numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api
>>>>
>>>> My concern is that fallback coercion behavior makes it difficult to
>>>> reliably implement "strict" overrides of NumPy's API. Fallback coercion is
>>>> certainly useful for interactive use, but it isn't really appropriate for
>>>> libraries.
>>>>
>>>>
>>> Do you mean "fallback coercion in NumPy itself", or "at all"? Right now
>>> there's lots of valid code around, e.g. `np.median(some_dask_array)`. Users
>>> will keep wanting to do that. Forcing everyone to write
>>> `np.median(np.array(some_dask_array))` serves no purpose. So the coercion
>>> has to be somewhere. You're arguing that that's up to Dask et al I think?
>>>
>>
>> Yes, I'm arguing this is up to dask to maintain backwards compatibility
>> -- or not, as the maintainers see fit.
>>
>> NumPy adding dispatching with __array_function__ did not break any
>> existing code, until the maintainers of other libraries started adding
>> __array_function__ methods. I hope that the risks of implementing such
>> experimental methods were self-evident.
>>
>
> Yeah, that's a bit of a chicken-and-egg story though. We add something and
> try to be "strict". Dask adds something because they like the idea and
> generally are quick to adopt these types of things. If we make it too hard
> to be backwards compatible, then neither NumPy nor Dask may try and it ends
> up breaking scikit-image & co. I for one don't care where the fix lands,
> but it's pretty clear to me that breaking scikit-image is the worst of all
> options.
>
>
>>
>>> Putting it in Dask right now still doesn't address Juan's backwards
>>> compat concern, but perhaps that could be bridged with a Dask bugfix
>>> release and some short-lived pain.
>>>
>>
>> I really think this is the best (only?) path forward.
>>
>
> I think I agree (depending on how easy it is to get the Dask fix landed).
>

That's landed, and Dask is planning a bugfix release in 2 days, so before
the NumPy 1.17.0 release. So this is not a release blocker anymore for us I
think.

Cheers,
Ralf


>>> I'm not convinced that this shouldn't be fixed in NumPy though. Your
>>> concern "reliably implement "strict" overrides of NumPy's API" is a bit
>>> abstract. Overriding the _whole_ NumPy API is definitely undesirable. If
>>> we'd have a reference list somewhere about every function that is handled
>>> with __array_function__, then would that address your concern? Such a list
>>> could be auto-generated fairly easily.
>>>
>>
>> By "reliably implement strict overrides" I mean the ability to ensure
>> that every operation either uses an override or raises an informative error
>> -- making it very clear which operation needs to be implemented or avoided.
>>
>
> That isn't necessarily a good goal in itself though. In many cases, an
> `asarray` call still needs to go *somewhere*. If the "reliably implement
> strict overrides" is to help library authors, then there may be other ways
> to do that. For end users it can only hurt; those TypeErrors aren't exactly
> easy to understand.
>
>
>> It's true that we didn't really consider "always issuing warnings" as a
>> long term solution in the NEP. I can see how this would simplify a backwards
>> compatibility story for libraries like dask, but in general, I really don't
>> like warnings:
>>
>
> I agree.
>
> Cheers,
> Ralf
>
>> Using them like exceptions can easily result in code that is partially
>> broken or that fails later for non-obvious reasons. There's a reason why
>> Python's errors stop execution flow, unlike errors in languages like PHP or
>> JavaScript.
>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1

2019-07-02 Thread Ralf Gommers
On Tue, Jul 2, 2019 at 8:38 AM Stephan Hoyer  wrote:

> On Tue, Jul 2, 2019 at 8:16 AM Ralf Gommers 
> wrote:
>
>>
>>
>> On Tue, Jul 2, 2019 at 1:45 AM Juan Nunez-Iglesias 
>> wrote:
>>
>>> On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
>>>
>>> This is addressed in the NEP, see bullet 1 under "Partial implementation
>>> of NumPy's API":
>>>
>>> http://www.numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api
>>>
>>> My concern is that fallback coercion behavior makes it difficult to
>>> reliably implement "strict" overrides of NumPy's API. Fallback coercion is
>>> certainly useful for interactive use, but it isn't really appropriate for
>>> libraries.
>>>
>>>
>> Do you mean "fallback coercion in NumPy itself", or "at all"? Right now
>> there's lots of valid code around, e.g. `np.median(some_dask_array)`. Users
>> will keep wanting to do that. Forcing everyone to write
>> `np.median(np.array(some_dask_array))` serves no purpose. So the coercion
>> has to be somewhere. You're arguing that that's up to Dask et al I think?
>>
>
> Yes, I'm arguing this is up to dask to maintain backwards compatibility --
> or not, as the maintainers see fit.
>
> NumPy adding dispatching with __array_function__ did not break any
> existing code, until the maintainers of other libraries started adding
> __array_function__ methods. I hope that the risks of implementing such
> experimental methods were self-evident.
>

Yeah, that's a bit of a chicken-and-egg story though. We add something and
try to be "strict". Dask adds something because they like the idea and
generally are quick to adopt these types of things. If we make it too hard
to be backwards compatible, then neither NumPy nor Dask may try and it ends
up breaking scikit-image & co. I for one don't care where the fix lands,
but it's pretty clear to me that breaking scikit-image is the worst of all
options.


>
>> Putting it in Dask right now still doesn't address Juan's backwards
>> compat concern, but perhaps that could be bridged with a Dask bugfix
>> release and some short-lived pain.
>>
>
> I really think this is the best (only?) path forward.
>

I think I agree (depending on how easy it is to get the Dask fix landed).

>
>> I'm not convinced that this shouldn't be fixed in NumPy though. Your
>> concern "reliably implement "strict" overrides of NumPy's API" is a bit
>> abstract. Overriding the _whole_ NumPy API is definitely undesirable. If
>> we'd have a reference list somewhere about every function that is handled
>> with __array_function__, then would that address your concern? Such a list
>> could be auto-generated fairly easily.
>>
>
> By "reliably implement strict overrides" I mean the ability to ensure that
> every operation either uses an override or raises an informative error --
> making it very clear which operation needs to be implemented or avoided.
>

That isn't necessarily a good goal in itself though. In many cases, an
`asarray` call still needs to go *somewhere*. If the "reliably implement
strict overrides" is to help library authors, then there may be other ways
to do that. For end users it can only hurt; those TypeErrors aren't exactly
easy to understand.


> It's true that we didn't really consider "always issuing warnings" as a
> long term solution in the NEP. I can see how this would simplify a backwards
> compatibility story for libraries like dask, but in general, I really don't
> like warnings:
>

I agree.

Cheers,
Ralf

> Using them like exceptions can easily result in code that is partially
> broken or that fails later for non-obvious reasons. There's a reason why
> Python's errors stop execution flow, unlike errors in languages like PHP or
> JavaScript.
>


Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1

2019-07-02 Thread Stephan Hoyer
On Tue, Jul 2, 2019 at 8:16 AM Ralf Gommers  wrote:

>
>
> On Tue, Jul 2, 2019 at 1:45 AM Juan Nunez-Iglesias 
> wrote:
>
>> On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
>>
>> This is addressed in the NEP, see bullet 1 under "Partial implementation
>> of NumPy's API":
>>
>> http://www.numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api
>>
>> My concern is that fallback coercion behavior makes it difficult to
>> reliably implement "strict" overrides of NumPy's API. Fallback coercion is
>> certainly useful for interactive use, but it isn't really appropriate for
>> libraries.
>>
>>
> Do you mean "fallback coercion in NumPy itself", or "at all"? Right now
> there's lots of valid code around, e.g. `np.median(some_dask_array)`. Users
> will keep wanting to do that. Forcing everyone to write
> `np.median(np.array(some_dask_array))` serves no purpose. So the coercion
> has to be somewhere. You're arguing that that's up to Dask et al I think?
>

Yes, I'm arguing this is up to dask to maintain backwards compatibility --
or not, as the maintainers see fit.

NumPy adding dispatching with __array_function__ did not break any existing
code, until the maintainers of other libraries started adding
__array_function__ methods. I hope that the risks of implementing such
experimental methods were self-evident.


> Putting it in Dask right now still doesn't address Juan's backwards compat
> concern, but perhaps that could be bridged with a Dask bugfix release and
> some short-lived pain.
>

I really think this is the best (only?) path forward.

> I'm not convinced that this shouldn't be fixed in NumPy though. Your
> concern "reliably implement "strict" overrides of NumPy's API" is a bit
> abstract. Overriding the _whole_ NumPy API is definitely undesirable. If
> we'd have a reference list somewhere about every function that is handled
> with __array_function__, then would that address your concern? Such a list
> could be auto-generated fairly easily.
>

By "reliably implement strict overrides" I mean the ability to ensure that
every operation either uses an override or raises an informative error --
making it very clear which operation needs to be implemented or avoided.
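
To make "strict" concrete, here is a minimal sketch that is not taken from the thread. `StrictArray`, `STRICT_HANDLED` and `implements` are hypothetical names, and the sketch assumes a NumPy version where `__array_function__` dispatch is enabled (1.17 or later by default). The only protocol behaviour it relies on is that returning `NotImplemented` from `__array_function__` makes NumPy raise `TypeError`.

```python
import numpy as np

# Hypothetical registry mapping NumPy functions to this library's overrides.
STRICT_HANDLED = {}


def implements(numpy_function):
    """Register an override for `numpy_function` (illustrative helper)."""
    def decorator(func):
        STRICT_HANDLED[numpy_function] = func
        return func
    return decorator


class StrictArray:
    """Toy duck array with no fallback coercion."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        if func not in STRICT_HANDLED:
            # Returning NotImplemented makes NumPy raise TypeError.
            return NotImplemented
        return STRICT_HANDLED[func](*args, **kwargs)


@implements(np.sum)
def strict_sum(arr, **kwargs):
    # The override unwraps the toy array and delegates to NumPy.
    return np.sum(arr._data, **kwargs)


x = StrictArray([1, 2, 3])
print(np.sum(x))  # handled by the registered override

try:
    np.median(x)  # no override registered, so NumPy raises
except TypeError as exc:
    print("TypeError:", exc)
```

With this shape, every call either hits a registered override or fails immediately at the call site, which is the property described above.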

It's true that we didn't really consider "always issuing warnings" as a
long term solution in the NEP. I can see how this would simplify a backwards
compatibility story for libraries like dask, but in general, I really don't
like warnings: Using them like exceptions can easily result in code that is
partially broken or that fails later for non-obvious reasons. There's a
reason why Python's errors stop execution flow, unlike errors in languages
like PHP or JavaScript.
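
A small standard-library sketch of that difference, with hypothetical function names (`step_that_warns`, `step_that_raises`) chosen only for illustration: a warning lets the caller carry on with whatever value comes back, an exception stops the flow, and a caller who wants strictness can promote warnings to errors.

```python
import warnings


def step_that_warns():
    # Execution continues after the warning is issued.
    warnings.warn("falling back to coercion", UserWarning)
    return "continued"


def step_that_raises():
    # Execution stops here unless the caller handles the exception.
    raise TypeError("no override available")


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    outcome = step_that_warns()
print(outcome, len(caught))

try:
    step_that_raises()
except TypeError as exc:
    print("stopped:", exc)

# Callers who want strict behaviour can turn warnings into errors themselves.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        step_that_warns()
    except UserWarning as exc:
        print("promoted to error:", exc)
```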


Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1

2019-07-02 Thread Stephan Hoyer
On Tue, Jul 2, 2019 at 1:46 AM Juan Nunez-Iglesias  wrote:

> I'm also wondering where the list of functions that must be implemented
> can be found, so that libraries like dask and CuPy can be sure that they
> have a complete implementation, and further TypeErrors won't be raised with
> their arrays.
>

This is a good question. We don't have a master list currently.

In practice, I would be surprised if there is ever more than exactly one
full implementation of NumPy's full API. We added dispatch with
__array_function__ even to really obscure corners of NumPy's API, e.g.,
np.lib.scimath.

The short answer right now is "Any publicly exposed function that says it
takes array-like arguments, aside from functions specifically for coercing
to NumPy arrays and the functions in numpy.testing."
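
In the absence of a master list, one could approximate it by introspection. The sketch below is an assumption-laden illustration, not an official listing: it uses `numpy.testing.overrides.get_overridable_numpy_array_functions` where that helper is importable (newer NumPy releases, 1.25 and later as far as I know), and otherwise falls back to checking for the private `_implementation` attribute, which is a heuristic and not a documented API. `overridable_function_names` is a hypothetical helper name, and only the top-level `numpy` namespace is scanned.

```python
import numpy as np


def overridable_function_names():
    """Best-effort list of top-level NumPy functions that dispatch."""
    try:
        # Public helper in newer NumPy releases (assumed: 1.25 and later).
        from numpy.testing.overrides import (
            get_overridable_numpy_array_functions,
        )
        dispatched = get_overridable_numpy_array_functions()
    except ImportError:
        dispatched = None

    def is_dispatched(obj):
        if dispatched is None:
            # Heuristic for older releases: dispatched functions carry a
            # private `_implementation` attribute (an assumption).
            return hasattr(obj, "_implementation")
        try:
            return obj in dispatched
        except TypeError:  # unhashable object, cannot be in the set
            return False

    names = []
    for name in dir(np):
        obj = getattr(np, name, None)
        if callable(obj) and is_dispatched(obj):
            names.append(name)
    return sorted(names)


names = overridable_function_names()
print(len(names), "dispatched functions in the top-level namespace")
print("median" in names)
```

Submodules such as `numpy.linalg` or `numpy.fft` would need the same scan to get closer to the full set.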


Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1

2019-07-02 Thread Ralf Gommers
On Tue, Jul 2, 2019 at 1:45 AM Juan Nunez-Iglesias  wrote:

> On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
>
> This is addressed in the NEP, see bullet 1 under "Partial implementation
> of NumPy's API":
>
> http://www.numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api
>
> My concern is that fallback coercion behavior makes it difficult to
> reliably implement "strict" overrides of NumPy's API. Fallback coercion is
> certainly useful for interactive use, but it isn't really appropriate for
> libraries.
>
>
Do you mean "fallback coercion in NumPy itself", or "at all"? Right now
there's lots of valid code around, e.g. `np.median(some_dask_array)`. Users
will keep wanting to do that. Forcing everyone to write
`np.median(np.array(some_dask_array))` serves no purpose. So the coercion
has to be somewhere. You're arguing that that's up to Dask et al I think?

Putting it in Dask right now still doesn't address Juan's backwards compat
concern, but perhaps that could be bridged with a Dask bugfix release and
some short-lived pain.

I'm not convinced that this shouldn't be fixed in NumPy though. Your
concern "reliably implement "strict" overrides of NumPy's API" is a bit
abstract. Overriding the _whole_ NumPy API is definitely undesirable. If
we'd have a reference list somewhere about every function that is handled
with __array_function__, then would that address your concern? Such a list
could be auto-generated fairly easily.

>
> In contrast to putting this into NumPy, if a library like dask prefers to
> issue warnings or even keep around fallback coercion indefinitely (not that
> I would recommend it), they can do that by putting it in their
> __array_function__ implementation.
>
>
> I get the above concerns, and thanks for bringing them up, Stephan, as I'd
> only skimmed the NEP the first time around and missed them. Nevertheless,
> the fact is that the current behaviour breaks user code that was perfectly
> valid until NumPy 1.16, which seems, well, insane. So, warning for a few
> versions followed by raising seems like the only way forward to me. The NEP
> explicitly states “We would like to gain experience with how
> __array_function__ is actually used before making decisions that would be
> difficult to roll back.” I think that this breakage *is* that experience,
> and the decision right now should be not to break user code with no warning
> period.
>

> I'm also wondering where the list of functions that must be implemented
> can be found, so that libraries like dask and CuPy can be sure that they
> have a complete implementation, and further TypeErrors won't be raised with
> their arrays.
>

This is one of the reasons I'm working on
https://github.com/Quansight-Labs/rnumpy. It doesn't make sense for any
library to copy the whole NumPy API, it's way too large with lots of stuff
in there that's only there for backwards compat and has a better
alternative or shouldn't be in NumPy in the first place.

Cheers,
Ralf


Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1

2019-07-02 Thread Juan Nunez-Iglesias
On Tue, 2 Jul 2019, at 4:34 PM, Stephan Hoyer wrote:
> This is addressed in the NEP, see bullet 1 under "Partial implementation of 
> NumPy's API":
> http://www.numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api
> 
> My concern is that fallback coercion behavior makes it difficult to reliably 
> implement "strict" overrides of NumPy's API. Fallback coercion is certainly 
> useful for interactive use, but it isn't really appropriate for libraries.
> 
> In contrast to putting this into NumPy, if a library like dask prefers to 
> issue warnings or even keep around fallback coercion indefinitely (not that I 
> would recommend it), they can do that by putting it in their 
> __array_function__ implementation.

I get the above concerns, and thanks for bringing them up, Stephan, as I'd only 
skimmed the NEP the first time around and missed them. Nevertheless, the fact 
is that the current behaviour breaks user code that was perfectly valid until 
NumPy 1.16, which seems, well, insane. So, warning for a few versions followed
by raising seems like the only way forward to me. The NEP explicitly states “We
would like to gain experience with how `__array_function__` is actually used 
before making decisions that would be difficult to roll back.” I think that 
this breakage *is* that experience, and the decision right now should be not to 
break user code with no warning period.

I'm also wondering where the list of functions that must be implemented can be 
found, so that libraries like dask and CuPy can be sure that they have a 
complete implementation, and further TypeErrors won't be raised with their
arrays.


Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1

2019-07-02 Thread Stephan Hoyer
>
>> Your suggestion on the issue to switch from TypeError to warning is, imho,
>> much better, as long as the warning contains a link to an issue/webpage
>> explaining what needs to happen. It's only because I've been vaguely aware
>> of the `__array_function__` discussions that I was able to diagnose
>> relatively quickly. The average user would be very confused by this code
>> break or by a warning, and be unsure of what they need to do to get rid of
>> the warning.
>>
>
> This would work, I think. It's not even a band-aid, it's probably the
> better design option because any sane library that implements
> __array_function__ will have a much smaller API surface than NumPy - and
> why forbid users from feeding array-like input to the rest of the NumPy
> functions?
>

This is addressed in the NEP, see bullet 1 under "Partial implementation of
NumPy's API":
http://www.numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api

My concern is that fallback coercion behavior makes it difficult to
reliably implement "strict" overrides of NumPy's API. Fallback coercion is
certainly useful for interactive use, but it isn't really appropriate for
libraries.

In contrast to putting this into NumPy, if a library like dask prefers to
issue warnings or even keep around fallback coercion indefinitely (not that
I would recommend it), they can do that by putting it in their
__array_function__ implementation.
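
As a rough illustration of what that could look like on the library side (a sketch under stated assumptions, not Dask's actual code): `LenientArray`, `HANDLED_FUNCTIONS` and `implements` are hypothetical names, the warning text is made up, and only top-level arguments are coerced, so nested containers of arrays are not handled.

```python
import warnings

import numpy as np

# Hypothetical registry of the functions this library overrides.
HANDLED_FUNCTIONS = {}


def implements(numpy_function):
    """Register an override for `numpy_function` (illustrative helper)."""
    def decorator(func):
        HANDLED_FUNCTIONS[numpy_function] = func
        return func
    return decorator


class LenientArray:
    """Toy duck array that warns and coerces when no override exists."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        return np.asarray(self._data, dtype=dtype)

    def __array_function__(self, func, types, args, kwargs):
        if func in HANDLED_FUNCTIONS:
            return HANDLED_FUNCTIONS[func](*args, **kwargs)
        name = getattr(func, "__name__", repr(func))
        warnings.warn(
            f"{name} is not implemented for LenientArray; "
            "coercing to numpy.ndarray",
            UserWarning,
        )
        # Coerce only top-level LenientArray arguments in this sketch.
        new_args = tuple(
            np.asarray(a) if isinstance(a, LenientArray) else a for a in args
        )
        new_kwargs = {
            k: np.asarray(v) if isinstance(v, LenientArray) else v
            for k, v in kwargs.items()
        }
        return func(*new_args, **new_kwargs)


@implements(np.sum)
def lenient_sum(arr, **kwargs):
    return np.sum(arr._data, **kwargs)


x = LenientArray([1, 2, 3])
print(np.sum(x))  # registered override, no warning

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(np.median(x))  # falls back to coercion and warns
print([str(w.message) for w in caught])
```

Dropping the `warnings.warn` call gives silent fallback coercion, and replacing the fallback branch with `return NotImplemented` gives the strict behaviour, so the policy stays entirely in the library's hands.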


Re: [Numpy-discussion] __array_function related regression for 1.17.0rc1

2019-06-30 Thread Ralf Gommers
On Mon, Jul 1, 2019 at 7:37 AM Juan Nunez-Iglesias  wrote:

>
>
> On Mon, 1 Jul 2019, at 2:34 PM, Ralf Gommers wrote:
>
> This issue is not very surprising - __array_function__ is going to have a
> fair bit of backwards compat impact for people who were relying on feeding
> all sorts of stuff into numpy functions that previously got converted with
> asarray. At this point Dask is the main worry, followed by CuPy and
> pydata/sparse. All those libraries have very responsive maintainers.
> Perhaps we should just try to get these issues fixed asap in those
> libraries instead?
>
>
> Fixing them is not sufficient, because many people are still going to end
> up with broken code unless they are bleeding-edge with everything. It's
> best to minimise the number of forbidden version combinations.
>

Yes, fair enough.


> Your suggestion on the issue to switch from TypeError to warning is, imho,
> much better, as long as the warning contains a link to an issue/webpage
> explaining what needs to happen. It's only because I've been vaguely aware
> of the `__array_function__` discussions that I was able to diagnose
> relatively quickly. The average user would be very confused by this code
> break or by a warning, and be unsure of what they need to do to get rid of
> the warning.
>

This would work, I think. It's not even a band-aid, it's probably the
better design option because any sane library that implements
__array_function__ will have a much smaller API surface than NumPy - and
why forbid users from feeding array-like input to the rest of the NumPy
functions?

Cheers,
Ralf