Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-13 Thread Juan Nunez-Iglesias
Hello everyone again!

A few clarifications about my proposal of external peer review:

- Yes, all this work is public and announced on the mailing list. However, I 
don’t think there’s a single person in this discussion or even this whole 
ecosystem that does not have a more immediately-pressing and also virtually 
infinite to-do list, so it’s unreasonable to expect that generally they would 
do more than glance at the stuff in the mailing list. In the peer review 
analogy, the mailing list is like the arXiv or Biorxiv stream — yep, anyone can 
see the stuff on there and comment, but most people just don’t have the time or 
attention to grab onto that. The only reason I stopped to comment here is 
Sebastian’s “Imma merge, YOLO!”, which had me raising my eyebrows real high.  
Especially for something that would expand the NumPy API!

- So, my proposal is that there needs to be an *editor* of NEPs who takes 
responsibility, once they are themselves satisfied with the NEP, for seeking 
out external reviewers and pinging them individually and asking them if they 
would be ok to review.

- A good friend who does screenwriting once told me, “don’t use all your 
proofreaders at once”. You want to get feedback, improve things, then feedback 
from a *totally independent* new person who can see the document with fresh 
eyes.

Obviously, all of the above slows things down. But “alone we go fast, together 
we go far”. The point of a NEP is to document critical decisions for the long 
term health of the project. If the documentation is insufficient, it defeats 
the whole purpose. Might as well just implement stuff and skip the whole NEP 
process. (Side note: Stephan, I for one would definitely appreciate an update 
to existing NEPs if there are obvious ways they can be improved!)

I do think that NEP templates should be strict, and I don’t think that is 
incompatible with plain, jargon-free text. The NEP template and guidelines 
should specify that, and that the motivation should be understandable by a 
casual NumPy user — the kind described by Ilhan, for whom bare NumPy actually 
meets all their needs. Maybe they’ve also used PyTorch but they’ve never really 
had cause to mix them or write a program that worked with both kinds of arrays.

Ditto for backwards compatibility — everyone should be clear when their 
existing code is going to be broken. Actually NEP18 broke so much of my code, 
but its Backward compatibility section basically says all good! 
https://numpy.org/neps/nep-0018-array-function-protocol.html#backward-compatibility

Anywho, as always, none of this is criticism of the work done; I thank you all, 
and am eternally grateful for all the hard work everyone is doing to keep the 
ecosystem from fragmenting. I’m just hoping that this discussion can improve 
the process going forward!

And, yes, apologies to Peter, I know from repeated personal experience how 
frustrating it can be to have last-minute drive-by objections after months of 
consensus building! But I think in the end every time that happened the end 
result was better — I hope the same is true here! And yes, I’ll reiterate 
Ralf’s point: my concerns are about the NEP process itself rather than this 
one. I’ll summarise my proposal:

- strict NEP template. NEPs with missing sections will not be accepted.
- sections Abstract, Motivation, and Backwards Compatibility should be 
understandable at a high level by casual users with ~zero background on the 
topic
- enforce the above with at least two independent rounds of coordinated peer 
review.

Thank you,

Juan.

> On 14 Aug 2020, at 5:29 am, Stephan Hoyer  wrote:
> 
> On Thu, Aug 13, 2020 at 5:22 AM Ralf Gommers wrote:
> Thanks for raising these concerns Ilhan and Juan, and for answering Peter. 
> Let me give my perspective as well.
> 
> To start with, this is not specifically about Peter's NEP and PR. NEP 35 
> simply follows the pattern set by previous PRs, and given its tight scope is 
> less difficult to understand than other NEPs on such technical topics. Peter 
> has done a lot of things right, and is close to the finish line.
> 
> 
> On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev wrote:
> 
> > I think, arriving to an agreement would be much faster if there is an 
> > executive summary of who this is intended for and what the regular usage 
> > is. Because with no offense, all I see is "dispatch", "_array_function_" 
> > and a lot of technical details of which I am absolutely ignorant.
> 
> This is what I intended to do in the Usage Guidance [2] section. Could
> you elaborate on what more information you'd want to see there? Or is
> it just a matter of reorganizing the NEP a bit to try and summarize
> such things right at the top?
> 
> We adapted the NEP template [6] several times last year to try and improve 
> this. And specified in 

[Numpy-discussion] Use of booleans in slices

2020-08-13 Thread Aaron Meurer
I noticed that np.bool_.__index__() gives a DeprecationWarning

>>> np.bool_(True).__index__()
__main__:1: DeprecationWarning: In future, it will be an error for
'np.bool_' scalars to be interpreted as an index
1

This is good, because booleans don't actually act like integers in
indexing contexts. However, raw Python bools also allow __index__()

>>> True.__index__()
1

A consequence of this is that NumPy slices allow booleans, as long as
they are the Python type (if you use the NumPy bool_ type you get the
deprecation warning).

>>> a = np.arange(10)
>>> a[True:]
array([1, 2, 3, 4, 5, 6, 7, 8, 9])

Should this behavior also be considered deprecated? Presumably
deprecating bool.__index__() in Python is a no-go, but it could be
deprecated in NumPy contexts (in the pure Python collections, booleans
don't have a special indexing meaning anyway).
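
For reference, the pure-Python side of this can be checked without NumPy. The sketch below is illustrative only: `strict_index` is a hypothetical helper (not a NumPy or standard-library function) showing one way a library could keep accepting `__index__` types while warning on booleans.

```python
import operator
import warnings

# bool is a subclass of int, so it satisfies the __index__ protocol.
assert isinstance(True, int)
assert operator.index(True) == 1

# Built-in sequences therefore accept booleans as slice bounds.
items = list(range(10))
assert items[True:] == [1, 2, 3, 4, 5, 6, 7, 8, 9]


def strict_index(value):
    """Hypothetical helper: like operator.index, but warn on booleans."""
    if isinstance(value, bool):
        warnings.warn(
            "booleans used as an index are deprecated (illustrative only)",
            DeprecationWarning,
            stacklevel=2,
        )
    return operator.index(value)
```

Whether NumPy should do something along these lines for slices is exactly the open question here.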

Interestingly, places that use a shape don't allow booleans (I guess
they don't necessarily use __index__()?)

>>> np.empty((True,))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: an integer is required

Aaron Meurer
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-13 Thread Stephan Hoyer
On Thu, Aug 13, 2020 at 5:22 AM Ralf Gommers  wrote:

> Thanks for raising these concerns Ilhan and Juan, and for answering Peter.
> Let me give my perspective as well.
>
> To start with, this is not specifically about Peter's NEP and PR. NEP 35
> simply follows the pattern set by previous PRs, and given its tight scope
> is less difficult to understand than other NEPs on such technical topics.
> Peter has done a lot of things right, and is close to the finish line.
>
>
> On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev <
> pe...@entschev.com> wrote:
>
>>
>> > I think, arriving to an agreement would be much faster if there is an
>> executive summary of who this is intended for and what the regular usage
>> is. Because with no offense, all I see is "dispatch", "_array_function_"
>> and a lot of technical details of which I am absolutely ignorant.
>>
>> This is what I intended to do in the Usage Guidance [2] section. Could
>> you elaborate on what more information you'd want to see there? Or is
>> it just a matter of reorganizing the NEP a bit to try and summarize
>> such things right at the top?
>>
>
> We adapted the NEP template [6] several times last year to try and improve
> this. And specified in there as well that NEP content sent to the mailing
> list should only contain the sections: Abstract, Motivation and Scope,
> Usage and Impact, and Backwards compatibility. This to ensure we fully
> understand the "why" and "what" before the "how". Unfortunately that
> template and procedure hasn't been exercised much yet, only in NEP 38 [7]
> and partially in NEP 41 [8].
>
> If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image
> (Juan) and CuPy (Leo, on the PR review) all saying they don't understand
> the goals, relevance, target audience, or how they're supposed to use a new
> feature, that indicates that the people doing the writing and having the
> discussion are doing something wrong at a very fundamental level.
>
> At this point I'm pretty disappointed in and tired of how we write and
> discuss NEPs on technical topics like dispatching, dtypes and the like.
> People literally refuse to write down concrete motivations, goals and
> non-goals, code that's problematic now and will be better/working post-NEP
> and usage examples before launching into extensive discussion of the gory
> details of the internals. I'm not sure what to do about it. Completely
> separate API and behavior proposals from implementation proposals? Make
> separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo
> on the API team which then needs to approve every API change in new NEPs?
> Offer to co-write NEPs if someone is willing but doesn't understand how to
> go about it? Keep the current structure/process but veto further approvals
> until NEP authors get it right?
>

I think the NEP template is great, and we should try to be more diligent
about following it!

My own NEP 37 (__array_module__) is probably a good example of poor
presentation due to not following the template structure. It goes pretty
deep into low-level motivation and some implementation details before usage
examples.

Speaking just for myself, I would have appreciated a friendly nudge to use
the template. Certainly I think it would be fine to require using the
template for newly submitted NEPs. I did not remember about it when I
started drafting NEP 37, and it definitely would have helped. I may still
try to do a revision at some point to use the template structure.


> I want to make an exception for merging the current NEP, for which the
> plan is to merge it as experimental to try in downstream PRs and get more
> experience. That does mean that master will be in an unreleasable state by
> the way, which is unusual and it'd be nice to get Chuck's explicit OK for
> that. But after that, I think we need a change here. I would like to hear
> what everyone thinks is the shape that change should take - any of my above
> suggestions, or something else?
>
>
>
>> > Finally as a minor point, I know we are mostly (ex-)academics but this
>> necessity of formal language on NEPs is self-imposed (probably PEPs are to
>> blame) and not quite helping. It can be a bit more descriptive in my
>> external opinion.
>>
>> TBH, I don't really know how to solve that point, so if you have any
>> specific suggestions, that's certainly welcome. I understand the
>> frustration for a reader trying to understand all the details, with
>> many being only described in NEP-18 [3], but we also strive to avoid
>> rewriting things that are written elsewhere, which would also
>> overburden those who are aware of what's being discussed.
>>
>>
>> > I also share Ilhan’s concern (and I mentioned this in a previous NEP
>> discussion) that NEPs are getting pretty inaccessible. In a sense these are
>> difficult topics and readers should be expected to have *some* familiarity
>> with the topics being discussed, but perhaps more effort should be put into
>> the 

Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-13 Thread Ilhan Polat
Yes, the underlying gory details should be spelled out of course but if it
is also modifying/adding to API then it is best to sound the horn and
invite zombies to take a stab at it. Often people arrive with interesting
use-cases that you wouldn't have thought about.

And I am very familiar with the pushback feeling you are having right now,
probably internally shouting "where have you been all this time you
slackers?". As you might have seen me asking questions here and Cython
lists, when I am done with some new feature over SciPy, it is also going to
be a very very long and tiring process. I am really not looking forward to
it :-)  but I guess it is part of the deal. Maybe I can give some comfort
that if more people start to flock over that means it has morphed into a
finished product so people can shoot. But, I honestly thought this was a
new NEP, that's a mistake on my part.

For the like, typeof and other candidates, by esoteric I mean foreign
enough to most users. We already have a nice candidate I think; ehm...
"dispatch" or "dispatch_like" or something like that, nobody sober enough
would confuse this with any other. And since this won't be typed in daily
usage, or so I understood, I guess it is ok to make it verbose. But still
take it as an initial guess and feel free to dismiss.

I still would be in a platonic love with "numpy.DIY" or "numpy.hermes"
namespace with a nice "bring your own _array_function_" service.


On Thu, Aug 13, 2020 at 4:16 PM Peter Andreas Entschev 
wrote:

> Ilhan,
>
> Thanks, that does clarify things.
>
> I think the main point -- and correct me here if I'm still wrong -- is
> that we want the NEP to have some very clear example of when/why/how
> to use it, preferably as early in the text as possible, maybe just
> below the Abstract, in a Motivation and Scope section, as the NEP
> Template [6] pointed out to by Ralf earlier suggests. That is a
> totally valid ask, and I'll try to address it as soon as possible
> (hopefully today or tomorrow).
>
> To the point of whether NEPs are to be read by users, I normally don't
> expect users to be required to read and understand those NEPs other
> than by pure curiosity. If we need them to do so, then there's
> definitely a big problem in the API. This may sound counterintuitive
> with what I said before about the "like=" name, but that's really the
> piece of the NumPy API that I with a somewhat reasonable understanding of
> arrays don't quite get or like, for instance "asarray" and "like"
> sound exactly the same thing, but they're not in the NumPy context,
> and on the other hand it's quite difficult to find a reasonable name
> to clarify that. And once more, I do like the "typeof=" suggestion
> more than "like=" to be perfectly honest, I'm just afraid it could be
> mistaken by the "dtype=" keyword somehow and thus still not solve the
> clarity problem. Going back to users reading NEPs or not, I would
> really expect that the docstring from the function is sufficiently
> clear to keep users off of it, but still give them an understanding of
> why that exists, the current docstring is in [9], please do comment on
> it if you have ideas of how to make it more accessible to users.
>
> You also mentioned you'd like that the name is as esoteric as
> possible, do you have any suggestions for an esoteric name that is
> hopefully unambiguous too? Naming has definitely been very much on the
> table since the NEP was written, but the consensus was more that
> "like=" is reasonably similar enough in both application and the name
> itself to "empty_like" and derived functions, that's why we just stuck
> to it.
>
> Best,
> Peter
>
> [9]
> https://github.com/numpy/numpy/pull/16935/files#diff-e5969453e399f2d32519d305b2582da9R16-R22
>
> On Thu, Aug 13, 2020 at 3:43 PM Ilhan Polat  wrote:
> >
> > To maybe lighten up the discussion a bit and to make my outsider
> confusion more tangible, let me start by apologizing for diving head first
> without weighing the past luggage :-) I always forget how much effort goes
> into these things and for outsiders like me, it's a matter of dipping the
> finger and tasting it just before starting to complain how much salt is
> missing etc. What I was mentioning about NEPs wasn't only related
> specifically to this one by the way. It's the generic feeling that I have.
> >
> > First let me start with what I mean by the NumPy users and downstreamers
> distinction. This is very much related to how data-science and huge-array
> users are magnetizing every tool out there in the Python world which is
> fine though the majority of number-crunchers have nothing to do with any of
> GPU/Parallelism/ClusterUsage etc. Hence when I mention NumPy users, think
> of people who use NumPy in its own right with no duck-typing and nothing
> related to subclassing. Just straightforward array creation and lots of ops
> on these arrays. For those people (I'm one of them), this option brings in
> a keyword that we would never use. And it gets 

Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-13 Thread Sebastian Berg
On Thu, 2020-08-13 at 15:47 +0200, Peter Andreas Entschev wrote:
> > We adapted the NEP template [6] several times last year to try and
> > improve this. And specified in there as well that NEP content sent
> > to the mailing list should only contain the sections: Abstract,
> > Motivation and Scope, Usage and Impact, and Backwards
> > compatibility. This to ensure we fully understand the "why" and
> > "what" before the "how". Unfortunately that template and procedure
> > hasn't been exercised much yet, only in NEP 38 [7] and partially in
> > NEP 41 [8].
> > 
> > If we have long-time maintainers of SciPy (Ilhan and myself),
> > scikit-image (Juan) and CuPy (Leo, on the PR review) all saying
> > they don't understand the goals, relevance, target audience, or how
> > they're supposed to use a new feature, that indicates that the
> > people doing the writing and having the discussion are doing
> > something wrong at a very fundamental level.
> 
> I'm more than happy to edit the NEP and try to clarify all the
> concerns. However, it gets pretty difficult to do so when I as an
> author don't understand where the difficulty is. Ilhan, Juan and Ralf
> now pointed out things that are missing/unclear, but no comment was
> made in that regard when I sent the NEP, my point being: I couldn't
> fix what I didn't know was a problem to others.
> 
> > At this point I'm pretty disappointed in and tired of how we write
> > and discuss NEPs on technical topics like dispatching, dtypes and
> > the like. People literally refuse to write down concrete
> > motivations, goals and non-goals, code that's problematic now and
> > will be better/working post-NEP and usage examples before launching
> > into extensive discussion of the gory details of the internals. I'm
> > not sure what to do about it.
> 
> Honestly, I don't really understand this. From my perspective, there
> are two ways to deal with such things:
> 
> 1. Templates are to be taken mainly as _guidelines_ rather than
> _hardlines_, and the current text of NEP-35 definitely falls in the
> first category;
> 2. Templates are _hardlines_ and to be guided/enforced by maintainers
> at some point (maybe before merging the PR?).
> 
> If 2 is the desired case for NumPy, which sounds a lot like what is
> wanted from NEP-35 and other NEPs generally, maintainers should let
> the authors know as early as possible that something isn't following
> the template's hardlines and it should be corrected. I don't mean any
> of this to remove myself of any responsibility, but would like to
> express my frustration that a 10 month-old NEP is only now getting so
> much pushback for being unclear after its implementation is nearing
> completion.
> 
> > I want to make an exception for merging the current NEP, for which
> > the plan is to merge it as experimental to try in downstream PRs
> > and get more experience. That does mean that master will be in an
> > unreleasable state by the way, which is unusual and it'd be nice to
> > get Chuck's explicit OK for that.
> 
> I don't quite understand this either, why would that leave master in
> an unreleasable state?
> 

Well, a few points are not discussed to the end yet. The name is one
that did not get much attention yet. Maybe because nobody had many
concerns about it yet, or maybe it was just lower on the priority list.

To be clear: I am fully prepared to pull this out of master before
release or probably rather disable it in release versions. An
alternative could be an environment variable (an env variable will not
stop actual adoption, but we may be fine with that).
And unless NEP 35 is accepted, that probably has to be the default,
fortunately there is still some time until the next release.
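
As a rough illustration of the environment-variable option, here is a toy sketch in plain Python. The variable name `EXAMPLE_ENABLE_EXPERIMENTAL_LIKE` and the `create` function are hypothetical, made up for this example; nothing here reflects what NumPy actually ships.

```python
import os

# Hypothetical variable name, chosen for illustration only.
ENV_VAR = "EXAMPLE_ENABLE_EXPERIMENTAL_LIKE"


def like_enabled(environ=None):
    """Return True only if the user explicitly opted in."""
    environ = os.environ if environ is None else environ
    return environ.get(ENV_VAR, "0") == "1"


def create(data, like=None, environ=None):
    """Toy creation function that rejects `like=` unless opted in."""
    if like is not None and not like_enabled(environ):
        raise TypeError(f"like= is experimental; set {ENV_VAR}=1 to enable it")
    return list(data)
```

With a gate like this, code that never passes `like=` is unaffected, while anyone who sets the variable can still adopt the feature, which is the caveat noted above.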

- Sebastian


> Best,
> Peter
> 
> On Thu, Aug 13, 2020 at 2:21 PM Ralf Gommers 
> wrote:
> > Thanks for raising these concerns Ilhan and Juan, and for answering
> > Peter. Let me give my perspective as well.
> > 
> > To start with, this is not specifically about Peter's NEP and PR.
> > NEP 35 simply follows the pattern set by previous PRs, and given
> > its tight scope is less difficult to understand than other NEPs on
> > such technical topics. Peter has done a lot of things right, and is
> > close to the finish line.
> > 
> > 
> > On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev <
> > pe...@entschev.com> wrote:
> > > 
> > > > I think, arriving to an agreement would be much faster if there
> > > > is an executive summary of who this is intended for and what
> > > > the regular usage is. Because with no offense, all I see is
> > > > "dispatch", "_array_function_" and a lot of technical details
> > > > of which I am absolutely ignorant.
> > > 
> > > This is what I intended to do in the Usage Guidance [2] section.
> > > Could
> > > you elaborate on what more information you'd want to see there?
> > > Or is
> > > it just a matter of reorganizing the NEP a bit to try and
> > > summarize
> > > such things right at the top?
> > 
> > We adapted the NEP 

Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-13 Thread Peter Andreas Entschev
Ralf,

I know none of it is a criticism of my work or directly of anybody
else's work. I was just making a couple of general points (or
questions really):

1. What is accepted as a reasonably clear NEP? It seems to point that
a NEP _must_ follow the Template
2. Should the NEP Template be followed as a hardline? Personally, I
think that would be fine in general, and diverging seems to be only an
option of when additional information is necessary, but less should
not be acceptable.

And to be perfectly clear, none of what I said is a criticism of
anybody in particular, but it's a frustration about the process
seemingly not being clear in itself for either authors or maintainers,
thus my two points above. I apologize if any of what I said so far has
been taken as a personal criticism of someone, it was definitely not
meant that way.

Finally, I like Juan's previous suggestion of having someone not involved
in the discussion proofread the NEP, though I'm not sure if
that's achievable in practice. However, I think that discussion
is a bit out of context, so I'll try to address the unclear parts of
this NEP in a PR and we could continue the general discussion of the
NEP process in a different thread if people wish to do so.

Best,
Peter

On Thu, Aug 13, 2020 at 4:13 PM Ralf Gommers  wrote:
>
>
>
> On Thu, Aug 13, 2020 at 2:47 PM Peter Andreas Entschev  
> wrote:
>>
>> > We adapted the NEP template [6] several times last year to try and improve 
>> > this. And specified in there as well that NEP content sent to the mailing 
>> > list should only contain the sections: Abstract, Motivation and Scope, 
>> > Usage and Impact, and Backwards compatibility. This to ensure we fully 
>> > understand the "why" and "what" before the "how". Unfortunately that 
>> > template and procedure hasn't been exercised much yet, only in NEP 38 [7] 
>> > and partially in NEP 41 [8].
>> >
>> > If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image 
>> > (Juan) and CuPy (Leo, on the PR review) all saying they don't understand 
>> > the goals, relevance, target audience, or how they're supposed to use a 
>> > new feature, that indicates that the people doing the writing and having 
>> > the discussion are doing something wrong at a very fundamental level.
>>
>> I'm more than happy to edit the NEP and try to clarify all the
>> concerns.
>
>
> Thanks Peter. Let me reiterate, you did a lot of things right, have been 
> happy to adapt when given feedback, and your willingness to go back and fix 
> things up now is much appreciated (and I'm happy to help). No criticism of 
> your work or attitude intended, on the contrary.
>
>>
>> However, it gets pretty difficult to do so when I as an
>> author don't understand where the difficulty is. Ilhan, Juan and Ralf
>> now pointed out things that are missing/unclear, but no comment was
>> made in that regard when I sent the NEP, my point being: I couldn't
>> fix what I didn't know was a problem to others.
>
>
> Yes of course, I totally understand that.
>
>>
>> > At this point I'm pretty disappointed in and tired of how we write and 
>> > discuss NEPs on technical topics like dispatching, dtypes and the like. 
>> > People literally refuse to write down concrete motivations, goals and 
>> > non-goals, code that's problematic now and will be better/working post-NEP 
>> > and usage examples before launching into extensive discussion of the gory 
>> > details of the internals. I'm not sure what to do about it.
>>
>> Honestly, I don't really understand this. From my perspective, there
>> are two ways to deal with such things:
>>
>> 1. Templates are to be taken mainly as _guidelines_ rather than
>> _hardlines_, and the current text of NEP-35 definitely falls in the
>> first category;
>> 2. Templates are _hardlines_ and to be guided/enforced by maintainers
>> at some point (maybe before merging the PR?).
>>
>> If 2 is the desired case for NumPy, which sounds a lot like what is
>> wanted from NEP-35 and other NEPs generally, maintainers should let
>> the authors know as early as possible that something isn't following
>> the template's hardlines and it should be corrected.
>
>
> Yes agreed, maintainers should do this. It was always meant as something in 
> between, "please follow but deviate if needed". If essential elements are 
> missing, I think that should be flagged earlier going forward.
>
> As a concrete example: Stephan (the main author of __array_function__) was 
> still fuzzy on the functions covered and whether it solves array coercion, in 
> the last 24 hours*. You answered by pointing to concrete code in Dask and 
> Xarray. That code, why it doesn't work well now but will work with like=, 
> should be at the top of the NEP as concrete problem statement / code 
> examples. It's quite unfortunate that no maintainer explicitly requested this 
> many months ago.
>
> * https://github.com/numpy/numpy/pull/16935#issuecomment-673379038
>
>> I don't mean any of this to remove 

Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-13 Thread Peter Andreas Entschev
Ilhan,

Thanks, that does clarify things.

I think the main point -- and correct me here if I'm still wrong -- is
that we want the NEP to have some very clear example of when/why/how
to use it, preferably as early in the text as possible, maybe just
below the Abstract, in a Motivation and Scope section, as the NEP
Template [6] pointed out to by Ralf earlier suggests. That is a
totally valid ask, and I'll try to address it as soon as possible
(hopefully today or tomorrow).

To the point of whether NEPs are to be read by users, I normally don't
expect users to be required to read and understand those NEPs other
than by pure curiosity. If we need them to do so, then there's
definitely a big problem in the API. This may sound counterintuitive
with what I said before about the "like=" name, but that's really the
piece of the NumPy API that I with a somewhat reasonable understanding of
arrays don't quite get or like, for instance "asarray" and "like"
sound exactly the same thing, but they're not in the NumPy context,
and on the other hand it's quite difficult to find a reasonable name
to clarify that. And once more, I do like the "typeof=" suggestion
more than "like=" to be perfectly honest, I'm just afraid it could be
mistaken by the "dtype=" keyword somehow and thus still not solve the
clarity problem. Going back to users reading NEPs or not, I would
really expect that the docstring from the function is sufficiently
clear to keep users off of it, but still give them an understanding of
why that exists, the current docstring is in [9], please do comment on
it if you have ideas of how to make it more accessible to users.
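
To make the difference between plain `asarray(data)` and `asarray(data, like=ref)` a bit more concrete, here is a toy model in plain Python. It is not NumPy's implementation: `ToyArray`, `DuckArray`, `__toy_create__` and `toy_asarray` are all hypothetical names. The only point is that `like=` picks which library builds the result, based on the type of the reference object, while the data still comes from the first argument.

```python
class ToyArray:
    """Stand-in for the default array type."""

    def __init__(self, data):
        self.data = list(data)


class DuckArray:
    """Stand-in for a downstream array type (e.g. a GPU or lazy array)."""

    def __init__(self, data):
        self.data = list(data)

    @classmethod
    def __toy_create__(cls, data):
        # Hypothetical hook: build an instance of this type from raw data.
        return cls(data)


def toy_asarray(data, like=None):
    """Create an array from `data`; if `like` is given, defer to its type."""
    hook = getattr(type(like), "__toy_create__", None)
    if hook is not None:
        return hook(data)
    return ToyArray(data)
```

Here `toy_asarray([1, 2, 3])` returns a `ToyArray`, while `toy_asarray([1, 2, 3], like=DuckArray([0]))` returns a `DuckArray` holding the same data. The spelling proposed in the NEP is analogous, e.g. `np.asarray(data, like=ref)`.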

You also mentioned you'd like that the name is as esoteric as
possible, do you have any suggestions for an esoteric name that is
hopefully unambiguous too? Naming has definitely been very much on the
table since the NEP was written, but the consensus was more that
"like=" is reasonably similar enough in both application and the name
itself to "empty_like" and derived functions, that's why we just stuck
to it.

Best,
Peter

[9] 
https://github.com/numpy/numpy/pull/16935/files#diff-e5969453e399f2d32519d305b2582da9R16-R22

On Thu, Aug 13, 2020 at 3:43 PM Ilhan Polat  wrote:
>
> To maybe lighten up the discussion a bit and to make my outsider confusion 
> more tangible, let me start by apologizing for diving head first without 
> weighing the past luggage :-) I always forget how much effort goes into these 
> things and for outsiders like me, it's a matter of dipping the finger and 
> tasting it just before starting to complain how much salt is missing etc. 
> What I was mentioning about NEPs wasn't only related specifically to this one 
> by the way. It's the generic feeling that I have.
>
> First let me start with what I mean by the NumPy users and downstreamers distinction. 
> This is very much related to how data-science and huge-array users are 
> magnetizing every tool out there in the Python world which is fine though the 
> majority of number-crunchers have nothing to do with any of 
> GPU/Parallelism/ClusterUsage etc. Hence when I mention NumPy users, think of 
> people who use NumPy in its own right with no duck-typing and nothing related 
> to subclassing. Just straightforward array creation and lots of ops on these 
> arrays. For those people (I'm one of them), this option brings in a keyword 
> that we would never use. And it gets into many major functions (linspace and 
> others mentioned somewhere). So it has a very appealing name but has nothing 
> to do with me in an already very crowded namespace and keyword catalogue. 
> That's basically a UX issue to be addressed (under the assumption that users 
> like me are the majority). Either making its name as esoteric as possible so 
> I naturally stay away from it or I don't see it. This has absolutely nothing 
> to do with looking down on the downstream libraries. They are flat-out 
> amazing and the more we can support them the merrier.
>
> Using yet another metaphor, I was hoping that NumPy would have a loading dock 
> for heavy duty deliveries for downstream projects or specialized array 
> creations and won't disturb the regular customer entrance. Because if I look 
> at this page 
> https://numpy.org/doc/stable/reference/routines.array-creation.html, there are 
> a lot of functions and I think most of them are candidates to gain this 
> keyword.  I wish I could comment on a viable alternative but I really cannot 
> understand the _array__ discussions since they fly way over my head no 
> matter how many times I tried. So that's why I naively mentioned the 
> "np.astypedarray" or "np.asarray_but_not_numpy_array" or whatever. Now I see 
> that it is even more complicated and I generated extra noise. So you can just 
> ignore my previous suggestions. Except that I want to draw attention to the 
> UX problem and I'd like to leave it at that.
>
> The other point is about the NEP stuff. I think I need to elaborate. If the 
> NEPs are meant for 

Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-13 Thread Ralf Gommers
On Thu, Aug 13, 2020 at 2:47 PM Peter Andreas Entschev 
wrote:

> > We adapted the NEP template [6] several times last year to try and
> improve this. And specified in there as well that NEP content sent to the
> mailing list should only contain the sections: Abstract, Motivation and
> Scope, Usage and Impact, and Backwards compatibility. This to ensure we
> fully understand the "why" and "what" before the "how". Unfortunately that
> template and procedure hasn't been exercised much yet, only in NEP 38 [7]
> and partially in NEP 41 [8].
> >
> > If we have long-time maintainers of SciPy (Ilhan and myself),
> scikit-image (Juan) and CuPy (Leo, on the PR review) all saying they don't
> understand the goals, relevance, target audience, or how they're supposed
> to use a new feature, that indicates that the people doing the writing and
> having the discussion are doing something wrong at a very fundamental level.
>
> I'm more than happy to edit the NEP and try to clarify all the
> concerns.


Thanks Peter. Let me reiterate, you did a lot of things right, have been
happy to adapt when given feedback, and your willingness to go back and fix
things up now is much appreciated (and I'm happy to help). No criticism of
your work or attitude intended, on the contrary.


> However, it gets pretty difficult to do so when I as an
> author don't understand where the difficulty is. Ilhan, Juan and Ralf
> now pointed out things that are missing/unclear, but no comment was
> made in that regard when I sent the NEP, my point being: I couldn't
> fix what I didn't know was a problem to others.
>

Yes of course, I totally understand that.


> > At this point I'm pretty disappointed in and tired of how we write and
> discuss NEPs on technical topics like dispatching, dtypes and the like.
> People literally refuse to write down concrete motivations, goals and
> non-goals, code that's problematic now and will be better/working post-NEP
> and usage examples before launching into extensive discussion of the gory
> details of the internals. I'm not sure what to do about it.
>
> Honestly, I don't really understand this. From my perspective, there
> are two ways to deal with such things:
>
> 1. Templates are to be taken mainly as _guidelines_ rather than
> _hardlines_, and the current text of NEP-35 definitely falls in the
> first category;
> 2. Templates are _hardlines_ and to be guided/enforced by maintainers
> at some point (maybe before merging the PR?).
>
> If 2 is the desired case for NumPy, which sounds a lot like what is
> wanted from NEP-35 and other NEPs generally, maintainers should let
> the authors know as early as possible that something isn't following
> the template's hardlines and it should be corrected.


Yes agreed, maintainers should do this. It was always meant as something in
between, "please follow but deviate if needed". If essential elements are
missing, I think that should be flagged earlier going forward.

As a concrete example: Stephan (the main author of __array_function__) was
still fuzzy on the functions covered and whether it solves array coercion,
in the last 24 hours*. You answered by pointing to concrete code in Dask
and Xarray. That code, why it doesn't work well now but will work with
like=, should be at the top of the NEP as concrete problem statement / code
examples. It's quite unfortunate that no maintainer explicitly requested
this many months ago.

* https://github.com/numpy/numpy/pull/16935#issuecomment-673379038

> I don't mean any of this to remove myself of any responsibility, but would
> like to
> express my frustration that a 10 month-old NEP is only now getting so
> much pushback for being unclear after its implementation is nearing
> completion.
>

Totally understandable. I think part of the problem is that people only
weigh in when they see concrete "this part is for you, and here's how you
use it to solve problem X".

As for me personally, if I'm saying things now that I didn't manage to
respond to earlier (specific to your NEP), I apologize. 10 months ago I was
in the middle of an intercontinental move and a new-ish job getting busier
fast. Again, apologies and no criticism of your work.


>
> > I want to make an exception for merging the current NEP, for which the
> plan is to merge it as experimental to try in downstream PRs and get more
> experience. That does mean that master will be in an unreleasable state by
> the way, which is unusual and it'd be nice to get Chuck's explicit OK for
> that.
>
> I don't quite understand this either, why would that leave master in
> an unreleasable state?
>

That's what Sebastian proposed yesterday: let's merge right now, open
issues for all the things being brought up right now, and deal with them
pre-1.20-release. I'm saying I'm fine with that, but then we actually need
to go back and finalize the discussions before the next release.

Cheers,
Ralf





> Best,
> Peter
>
> On Thu, Aug 13, 2020 at 2:21 PM Ralf Gommers 
> wrote:
> >
> > Thanks for 

Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-13 Thread Peter Andreas Entschev
> We adapted the NEP template [6] several times last year to try and improve 
> this. And specified in there as well that NEP content sent to the mailing list 
> should only contain the sections: Abstract, Motivation and Scope, Usage and 
> Impact, and Backwards compatibility. This to ensure we fully understand the 
> "why" and "what" before the "how". Unfortunately that template and procedure 
> hasn't been exercised much yet, only in NEP 38 [7] and partially in NEP 41 
> [8].
>
> If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image 
> (Juan) and CuPy (Leo, on the PR review) all saying they don't understand the 
> goals, relevance, target audience, or how they're supposed to use a new 
> feature, that indicates that the people doing the writing and having the 
> discussion are doing something wrong at a very fundamental level.

I'm more than happy to edit the NEP and try to clarify all the
concerns. However, it gets pretty difficult to do so when I as an
author don't understand where the difficulty is. Ilhan, Juan and Ralf
now pointed out things that are missing/unclear, but no comment was
made in that regard when I sent the NEP, my point being: I couldn't
fix what I didn't know was a problem to others.

> At this point I'm pretty disappointed in and tired of how we write and 
> discuss NEPs on technical topics like dispatching, dtypes and the like. 
> People literally refuse to write down concrete motivations, goals and 
> non-goals, code that's problematic now and will be better/working post-NEP 
> and usage examples before launching into extensive discussion of the gory 
> details of the internals. I'm not sure what to do about it.

Honestly, I don't really understand this. From my perspective, there
are two ways to deal with such things:

1. Templates are to be taken mainly as _guidelines_ rather than
_hardlines_, and the current text of NEP-35 definitely falls in the
first category;
2. Templates are _hardlines_ and to be guided/enforced by maintainers
at some point (maybe before merging the PR?).

If 2 is the desired case for NumPy, which sounds a lot like what is
wanted from NEP-35 and other NEPs generally, maintainers should let
the authors know as early as possible that something isn't following
the template's hardlines and it should be corrected. I don't mean any
of this to remove myself of any responsibility, but would like to
express my frustration that a 10 month-old NEP is only now getting so
much pushback for being unclear after its implementation is nearing
completion.

> I want to make an exception for merging the current NEP, for which the plan 
> is to merge it as experimental to try in downstream PRs and get more 
> experience. That does mean that master will be in an unreleasable state by 
> the way, which is unusual and it'd be nice to get Chuck's explicit OK for 
> that.

I don't quite understand this either, why would that leave master in
an unreleasable state?

Best,
Peter

On Thu, Aug 13, 2020 at 2:21 PM Ralf Gommers  wrote:
>
> Thanks for raising these concerns Ilhan and Juan, and for answering Peter. 
> Let me give my perspective as well.
>
> To start with, this is not specifically about Peter's NEP and PR. NEP 35 
> simply follows the pattern set by previous PRs, and given its tight scope is 
> less difficult to understand than other NEPs on such technical topics. Peter 
> has done a lot of things right, and is close to the finish line.
>
>
> On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev  
> wrote:
>>
>>
>> > I think, arriving to an agreement would be much faster if there is an 
>> > executive summary of who this is intended for and what the regular usage 
>> > is. Because with no offense, all I see is "dispatch", "_array_function_" 
>> > and a lot of technical details of which I am absolutely ignorant.
>>
>> This is what I intended to do in the Usage Guidance [2] section. Could
>> you elaborate on what more information you'd want to see there? Or is
>> it just a matter of reorganizing the NEP a bit to try and summarize
>> such things right at the top?
>
>
> We adapted the NEP template [6] several times last year to try and improve 
> this. And specified in there as well that NEP content sent to the mailing list 
> should only contain the sections: Abstract, Motivation and Scope, Usage and 
> Impact, and Backwards compatibility. This to ensure we fully understand the 
> "why" and "what" before the "how". Unfortunately that template and procedure 
> hasn't been exercised much yet, only in NEP 38 [7] and partially in NEP 41 
> [8].
>
> If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image 
> (Juan) and CuPy (Leo, on the PR review) all saying they don't understand the 
> goals, relevance, target audience, or how they're supposed to use a new 
> feature, that indicates that the people doing the writing and having the 
> discussion are doing something wrong at a very fundamental level.
>
> At this point I'm pretty disappointed in 

Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-13 Thread Ilhan Polat
To maybe lighten up the discussion a bit and to make my outsider confusion
more tangible, let me start by apologizing for diving head first without
weighing the past luggage :-) I always forget how much effort goes into
these things and for outsiders like me, it's a matter of dipping the finger
and tasting it just before starting to complain how much salt is missing
etc. What I was mentioning about NEPs wasn't only related specifically to
this one by the way. It's the generic feeling that I have.

First let me start with what I mean by NumPy users and downstreamers
distinction. This is very much related to how data-science and huge-array
users are magnetizing every tool out there in the Python world which is
fine though the majority of number-crunchers have nothing to do with any of
GPU/Parallelism/ClusterUsage etc. Hence when I mention NumPy users, think
of people who use NumPy in its own right with no duck-typing and nothing
related to subclassing. Just straightforward array creation and lots of ops
on these arrays. For those people (I'm one of them), this option brings in
a keyword that we would never use. And it gets into many major functions
(linspace and others mentioned somewhere). So it has a very appealing name
but has nothing to do with me in an already very crowded namespace and
keyword catalogue. That's basically a UX issue to be addressed (under the
assumption that users like me are the majority). Either making its name as
esoteric as possible so I naturally stay away from it or I don't see it.
This has absolutely nothing to do with looking down on the downstream
libraries. They are flat-out amazing and the more we can support them the
merrier.

Using yet another metaphor, I was hoping that NumPy would have a loading
dock for heavy duty deliveries for downstream projects or specialized array
creations and won't disturb the regular customer entrance. Because if I
look at this page
https://numpy.org/doc/stable/reference/routines.array-creation.html, there
are a lot of functions and I think most of them are candidates to gain this
keyword.  I wish I could comment on a viable alternative but I really cannot
understand the _array__ discussions since they fly way over my head no
matter how many times I tried. So that's why I naively mentioned the
"np.astypedarray" or "np.asarray_but_not_numpy_array" or whatever. Now I
see that it is even more complicated and I generated extra noise. So you
can just ignore my previous suggestions. Except that I want to draw
attention to the UX problem and I'd like to leave it at that.

The other point is about the NEP stuff. I think I need to elaborate. If the
NEPs are meant for internal NumPy discussions, then by all means, crank up
the pointer*-meter to 11 and dive into it, totally fine with me. But if you
also want to get feedback from outside, then probably a few lines of code
examples for mere mortals would go a long way. Also it would make the
discussion much more streamlined in my humble opinion. What I was trying to
get at was that almost all NEPs read like a legal document that I want to
agree to as soon as possible, because they often come with little or no
code in them. In NEP35 for example, there are nice code blocks in
function dispatching but I guess it's not meant for me. Because it is only
decorating asarray with some black magic happening there somehow (I guess).
So I can't even comprehend what the proposition would mean for the regular,
friendly, anti-duck users. But I am pretty sure it is about dispatching
something because the word is repeated ~20 times :-)  Thus the feedback
would be limited. That was also what I meant there. But again I totally
understand the complexity of these issues. So I'm not expecting to
understand all details of NumPy machinery in a single NEP.

But anyways, hope this clarifies a few things that I failed to convey in my
previous mail.
ilhan



On Thu, Aug 13, 2020 at 2:23 PM Ralf Gommers  wrote:

> Thanks for raising these concerns Ilhan and Juan, and for answering Peter.
> Let me give my perspective as well.
>
> To start with, this is not specifically about Peter's NEP and PR. NEP 35
> simply follows the pattern set by previous PRs, and given its tight scope
> is less difficult to understand than other NEPs on such technical topics.
> Peter has done a lot of things right, and is close to the finish line.
>
>
> On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev <
> pe...@entschev.com> wrote:
>
>>
>> > I think, arriving to an agreement would be much faster if there is an
>> executive summary of who this is intended for and what the regular usage
>> is. Because with no offense, all I see is "dispatch", "_array_function_"
>> and a lot of technical details of which I am absolutely ignorant.
>>
>> This is what I intended to do in the Usage Guidance [2] section. Could
>> you elaborate on what more information you'd want to see there? Or is
>> it just a matter of reorganizing the NEP a bit to try and summarize
>> 

Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-13 Thread Ralf Gommers
Thanks for raising these concerns Ilhan and Juan, and for answering Peter.
Let me give my perspective as well.

To start with, this is not specifically about Peter's NEP and PR. NEP 35
simply follows the pattern set by previous PRs, and given its tight scope
is less difficult to understand than other NEPs on such technical topics.
Peter has done a lot of things right, and is close to the finish line.


On Thu, Aug 13, 2020 at 12:02 PM Peter Andreas Entschev 
wrote:

>
> > I think, arriving to an agreement would be much faster if there is an
> executive summary of who this is intended for and what the regular usage
> is. Because with no offense, all I see is "dispatch", "_array_function_"
> and a lot of technical details of which I am absolutely ignorant.
>
> This is what I intended to do in the Usage Guidance [2] section. Could
> you elaborate on what more information you'd want to see there? Or is
> it just a matter of reorganizing the NEP a bit to try and summarize
> such things right at the top?
>

We adapted the NEP template [6] several times last year to try and improve
this. And specified in there as well that NEP content sent to the mailing
list should only contain the sections: Abstract, Motivation and Scope,
Usage and Impact, and Backwards compatibility. This to ensure we fully
understand the "why" and "what" before the "how". Unfortunately that
template and procedure hasn't been exercised much yet, only in NEP 38 [7]
and partially in NEP 41 [8].

If we have long-time maintainers of SciPy (Ilhan and myself), scikit-image
(Juan) and CuPy (Leo, on the PR review) all saying they don't understand
the goals, relevance, target audience, or how they're supposed to use a new
feature, that indicates that the people doing the writing and having the
discussion are doing something wrong at a very fundamental level.

At this point I'm pretty disappointed in and tired of how we write and
discuss NEPs on technical topics like dispatching, dtypes and the like.
People literally refuse to write down concrete motivations, goals and
non-goals, code that's problematic now and will be better/working post-NEP
and usage examples before launching into extensive discussion of the gory
details of the internals. I'm not sure what to do about it. Completely
separate API and behavior proposals from implementation proposals? Make
separate "API" and "internals" teams with the likes of Juan, Ilhan and Leo
on the API team which then needs to approve every API change in new NEPs?
Offer to co-write NEPs if someone is willing but doesn't understand how to
go about it? Keep the current structure/process but veto further approvals
until NEP authors get it right?

I want to make an exception for merging the current NEP, for which the plan
is to merge it as experimental to try in downstream PRs and get more
experience. That does mean that master will be in an unreleasable state by
the way, which is unusual and it'd be nice to get Chuck's explicit OK for
that. But after that, I think we need a change here. I would like to hear
what everyone thinks is the shape that change should take - any of my above
suggestions, or something else?



> > Finally as a minor point, I know we are mostly (ex-)academics but this
> necessity of formal language on NEPs is self-imposed (probably PEPs are to
> blame) and not quite helping. It can be a bit more descriptive in my
> external opinion.
>
> TBH, I don't really know how to solve that point, so if you have any
> specific suggestions, that's certainly welcome. I understand the
> frustration for a reader trying to understand all the details, with
> many being only described in NEP-18 [3], but we also strive to avoid
> rewriting things that are written elsewhere, which would also
> overburden those who are aware of what's being discussed.
>
>
> > I also share Ilhan’s concern (and I mentioned this in a previous NEP
> discussion) that NEPs are getting pretty inaccessible. In a sense these are
> difficult topics and readers should be expected to have *some* familiarity
> with the topics being discussed, but perhaps more effort should be put into
> the context/motivation/background of a NEP before accepting it. One way to
> ensure this might be to require a final proofreading step by someone who
> has not been involved at all in the discussions, like peer review does for
> papers.
>

Some variant of this proposal would be my preference.

Cheers,
Ralf


> [1] https://github.com/numpy/numpy/issues/14441#issuecomment-529969572
> [2]
> https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html#usage-guidance
> [3] https://numpy.org/neps/nep-0018-array-function-protocol.html
> [4] https://numpy.org/neps/nep-.html#nep-workflow
> [5]
> https://mail.python.org/pipermail/numpy-discussion/2019-October/080176.html


[6] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst
[7]
https://github.com/numpy/numpy/blob/master/doc/neps/nep-0038-SIMD-optimizations.rst
[8]

Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-13 Thread Peter Andreas Entschev
> I am not sure adding a new keyword to an already confusing function is the 
> right thing to do.

Could you clarify what is the confusing function in question?

> This is already a very (I mean extremely very) easy keyword name to confuse 
> with ones_like, zeros_like and by its nature any other interpretation.

To be fair, the usage is the same. Therefore
empty_like(downstream_array, ...) and empty(downstream_array.shape, ...,
like=downstream_array) should have the exact same behavior, which is
arguably redundant now.
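
A minimal sketch of how the two spellings compare, assuming NumPy 1.20
or later (where `like=` is available) and using a plain ndarray as the
reference, since no downstream library is assumed to be installed. Note
that `empty` takes a shape as its first argument, so the shape is passed
explicitly here; `like=` itself only selects the array type:

```python
import numpy as np

ref = np.zeros((2, 3), dtype=np.float32)

# empty_like mimics the shape and dtype of the reference array.
a = np.empty_like(ref)

# like= only selects which array type is created; shape and dtype
# still come from the regular arguments (dtype defaults to float64).
b = np.empty((4,), like=ref)

print(type(a).__name__, a.shape, a.dtype)
print(type(b).__name__, b.shape, b.dtype)
```

With a downstream array as `ref`, the intent is that both calls would
return that library's array type instead of numpy.ndarray; that part is
not exercised by this sketch.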

> It is not signalling anything about the functionality that is being 
> discussed. I would seriously consider reserving such obvious names for really 
> obvious tasks. Because you would also expect the shape and ndim would be 
> mimicked by the "like"d argument but it turns out it is acting more like 
> "typeof=" and not "like=" at all.

I understand this can be confusing, and naming was one of the hardest
discussions as there's no clear unambiguous name to use for this
keyword; "like=" was simply the name that came closest to converging
during discussions. While I think "typeof=" is perhaps a better name
than "like=", it could very easily be confused with "dtype=", and that
would possibly just shift the confusion.

> Again, if this is meant for downstream libraries (because that's what I got 
> out of the PR discussion, cupy, dask, and JAX were the only examples I could 
> read) then hiding it in another function and writing with capital letters 
> "this is not meant for numpy users" would be a much more convenient way to 
> separate the target audience and regular users.

The problem with this approach is that the __array_function__ protocol
relies on downstream libraries implementing functions with the same
signature (for example, Dask and CuPy both implement an "array"
function that matches NumPy). The purpose of __array_function__ and
NEP-35 is to introduce only minimal changes to both NumPy's API and
downstream libraries. Of course adding new functions for such cases
would work, but IMO it would defeat the purpose of __array_function__
in general as it would require a considerable amount of work in
downstream libraries, and we discussed this previously deciding that
an argument is better than many new functions [1].
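
As a rough illustration of that reliance on matching signatures, here is
a sketch assuming NumPy 1.20 or later. `LoggedArray` and its `_np_`
naming scheme are hypothetical stand-ins invented for this example, not
part of Dask, CuPy or any real library; they only show how a type that
implements `__array_function__` can answer an array creation call made
with `like=`:

```python
import numpy as np


class LoggedArray:
    """Hypothetical stand-in for a downstream array type."""

    def __init__(self, data):
        self.data = list(data)

    def __array_function__(self, func, types, args, kwargs):
        # NumPy passes its own public function (e.g. np.asarray) as `func`.
        # Look up an override with a matching name on this class.
        override = getattr(self, "_np_" + func.__name__, None)
        if override is None:
            return NotImplemented
        # Drop `like` defensively so the override works whether or not
        # NumPy has already removed it from kwargs.
        kwargs = {k: v for k, v in kwargs.items() if k != "like"}
        return override(*args, **kwargs)

    @staticmethod
    def _np_asarray(a, dtype=None, order=None):
        # Same leading parameters as np.asarray, but returns our own type.
        return LoggedArray(a)


ref = LoggedArray([0])
out = np.asarray([1, 2, 3], like=ref)
print(type(out).__name__, out.data)
```

The caller keeps writing `np.asarray(...)`; only the reference object
passed through `like=` decides which implementation runs.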

> I think, arriving to an agreement would be much faster if there is an 
> executive summary of who this is intended for and what the regular usage is. 
> Because with no offense, all I see is "dispatch", "_array_function_" and a 
> lot of technical details of which I am absolutely ignorant.

This is what I intended to do in the Usage Guidance [2] section. Could
you elaborate on what more information you'd want to see there? Or is
it just a matter of reorganizing the NEP a bit to try and summarize
such things right at the top?

> Finally as a minor point, I know we are mostly (ex-)academics but this 
> necessity of formal language on NEPs is self-imposed (probably PEPs are to 
> blame) and not quite helping. It can be a bit more descriptive in my external 
> opinion.

TBH, I don't really know how to solve that point, so if you have any
specific suggestions, that's certainly welcome. I understand the
frustration for a reader trying to understand all the details, with
many being only described in NEP-18 [3], but we also strive to avoid
rewriting things that are written elsewhere, which would also
overburden those who are aware of what's being discussed.

> I’ve generally been on the “let the NumPy devs worry about it” side of 
> things, but I do agree with Ilhan that `like=` is confusing and `typeof=` 
> would be a much more appropriate name for that parameter.

To be clear, I have no strong opinion on renaming it, I'm fine either
way, but I think it's unrealistic to expect that we find a single name
that is short, unambiguous and properly descriptive. If the preference
now shifts towards the "typeof=" name, we can change it, but "like="
was really named after "empty_like" and similar functions.

> I do think library writers are NumPy users and so I wouldn’t really make that 
> distinction, though. Users writing their own analysis code could very well be 
> interested in writing code using numpy functions that will transparently work 
> when the input is a CuPy array or whatever.

I'm guessing this is somewhat of a loose definition of "library", to
some extent if you really need "like=" it means that you're writing
your own functions around the NumPy API (and that IMO is a library,
even if you call it something else), rather than just writing your
application on top of the existing NumPy API. I'm also happy to
rephrase that in the NEP if people feel it should be done.
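
A small sketch of the kind of function meant here, assuming NumPy 1.20
or later. The helper name `pad_ends` is hypothetical, and the example
only runs with plain NumPy arrays; with a CuPy or Dask array as input,
the intent is that `padding` would be created as that array type rather
than being coerced to numpy.ndarray:

```python
import numpy as np


def pad_ends(arr, padding):
    # Create `padding` as the same array type as `arr` instead of
    # always coercing it to numpy.ndarray.
    padding = np.asarray(padding, like=arr)
    return np.concatenate((padding, arr, padding))


result = pad_ends(np.array([1, 2, 3]), [0, 0])
print(result)
```

Code that only calls existing NumPy functions on its inputs does not
need `like=`; it becomes relevant once a function has to create a new
array that must match the type of its input.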

> I also share Ilhan’s concern (and I mentioned this in a previous NEP 
> discussion) that NEPs are getting pretty inaccessible. In a sense these are 
> difficult topics and readers should be expected to have *some* familiarity 
> with the topics being discussed, but perhaps more effort should