Re: [Numpy-discussion] np.{bool,float,int} deprecation

Eric Wieser Fri, 11 Dec 2020 00:46:50 -0800

> you might want to discuss this with us at the array API standard
> https://github.com/data-apis/array-api (which is currently in RFC
> stage). The spec uses bool as the name for the boolean dtype.


I don't fully understand this argument - `np.bool` is already not the
boolean dtype. Either:

* The spec is suggesting that `pkg.bool` be some arbitrary object that can
  be passed into a dtype argument and will produce a boolean array.
  If this is the case, the spec could also just require that
  `dtype=builtins.bool` have this behavior.
* The spec is suggesting that `pkg.bool` is some rich dtype object.
  Ignoring the question of whether this should be `np.bool_` or
  `np.dtype(np.bool_)`, it's currently neither, and changing it will break
  users relying on `np.bool(True) is True`.
  That's not to say this isn't a sensible thing for the specification to
  have, it's just something that numpy can't conform to without breaking
  code.
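
To make the two readings concrete, here is a minimal sketch (my own
illustration, not something the spec prescribes). It only uses
`builtins.bool` and `np.bool_`, and deliberately avoids `np.bool` itself,
because what that name refers to depends on the NumPy version installed:

```python
import builtins

import numpy as np

# Reading 1: an object that is merely accepted by `dtype=`.
# The builtin already satisfies this and yields a boolean array.
arr = np.zeros(3, dtype=builtins.bool)
is_bool_dtype = arr.dtype == np.dtype(np.bool_)

# Reading 2: a rich NumPy object. Calling it no longer returns the
# Python singleton, so code relying on `... is True` would change meaning.
builtin_identity = builtins.bool(True) is True   # builtin returns the singleton
scalar_identity = np.bool_(True) is True         # NumPy scalar is a distinct object

print(is_bool_dtype, builtin_identity, scalar_identity)
```

The last comparison is the breakage I mean: an identity check that holds
for the builtin does not hold for the NumPy scalar type.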

While it would be great if `np.bool_` could be spelt `np.bool`, I really
don't think we can make that change without a long deprecation first (if at
all).

Eric

On Thu, 10 Dec 2020 at 20:00, Sebastian Berg <sebast...@sipsolutions.net>
wrote:

> On Thu, 2020-12-10 at 20:38 +0100, Ralf Gommers wrote:
> > On Thu, Dec 10, 2020 at 7:25 PM Sebastian Berg <
> > sebast...@sipsolutions.net>
> > wrote:
> >
> > > On Wed, 2020-12-09 at 13:37 -0800, Stephan Hoyer wrote:
> > > > On Wed, Dec 9, 2020 at 1:07 PM Aaron Meurer <asmeu...@gmail.com>
> > > > wrote:
> > > >
> > > > > On Wed, Dec 9, 2020 at 9:41 AM Sebastian Berg
> > > > > <sebast...@sipsolutions.net> wrote:
> > > > > >
> > > > > > On Mon, 2020-12-07 at 14:18 -0700, Aaron Meurer wrote:
> > > > > > > Regarding np.bool specifically, if you want to deprecate this,
> > > > > > > you might want to discuss this with us at the array API standard
> > > > > > > https://github.com/data-apis/array-api (which is currently in RFC
> > > > > > > stage). The spec uses bool as the name for the boolean dtype.
> > > > > > >
> > > > > > > Would it make sense for NumPy to change np.bool to just be the
> > > > > > > boolean dtype object? Unlike int and float, there is no ambiguity
> > > > > > > with bool, and NumPy clearly doesn't have any issues with
> > > > > > > shadowing builtin names in its namespace.
> > > > > >
> > > > > > We could keep the Python alias around (which for `dtype=` is the
> > > > > > same as `np.bool_`).
> > > > > >
> > > > > > I am not sure I like the idea of immediately shadowing the builtin.
> > > > > > That is a switch we can avoid flipping (without warning);
> > > > > > `np.bool_` and `bool` are fairly different beasts? [1]
> > > > >
> > > > > NumPy already shadows a lot of builtins, in many cases, in ways
> > > > > that are incompatible with existing ones. It's not something I
> > > > > would have done personally, but it's been this way for a long time.
> > > > >
> > > >
> > > > It may be defensible to keep np.bool as an alias for Python's bool
> > > > even when we remove the other aliases.
> > >
> >
> > I'd agree with that.
> >
> >
> > > That is true, `int` is probably the most confusing, since it is not at
> > > all compatible with a Python integer, but rather the "default" integer
> > > (which happens to be the same as C `long` currently).
> > >
> > > So we could focus on `np.int`, `np.long`.  I am a bit unsure whether
> > > you would prefer that or are mainly pointing out the possibility?
> > >
> >
> > Not sure what you mean by focus: focus on describing this in the release
> > notes? Deprecating `np.int` seems like the most beneficial part of this
> > whole exercise.
> >
>
> I meant limiting the current deprecation to `np.int`, maybe `np.long`,
> and a "carefully chosen" set.
> To be honest, I don't mind either way, so any stronger opinion will tip
> the scale for me personally (my default currently is to update the
> release notes to recommend the more descriptive names).
>
> There are probably more doc updates that would be nice, I will suggest
> updating a separate issue for that.
>
>
> > > Right now, my main take-away from the discussion is that it would be
> > > good to clarify the release notes a bit more.
> > >
> > > Using `float` for a dtype seems fine to me, but I prefer mentioning
> > > `np.float64` over `np.float_`.
> > > For integers, I wonder if we should also suggest `np.int64`, even – or
> > > because – if the default integer on many systems is currently
> > > `np.int_`?
> > >
> >
> > I agree. I think we should recommend sane, descriptive names that do the
> > right thing. So ideally we'd have people spell their dtype specifiers as
> >   dtype=bool  # or np.bool
> >   dtype=np.float64
> >   dtype=np.int64
> >   dtype=np.complex128
> > The names with underscores at the end make little sense from a UX
> > perspective. And the C equivalents (single/double/etc) made sense 15
> > years ago, but with the user base of today - the majority of whom will
> > not know C fluently or at all - also don't make too much sense.
> >
> > The `dtype=int` or `dtype=np.int_` behaviour flopping between 32 and 64
> > bits is likely to be a pitfall much more often than it is what the user
> > actually needs, so shouldn't be recommended and probably deserves a
> > warning in the docs.
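
A short sketch of the explicit spellings recommended above, assuming only
standard NumPy dtype behaviour. The width of the default integer is
deliberately not asserted as a single value, since it differs between
platforms and NumPy versions:

```python
import numpy as np

# Explicit names pin the width regardless of platform.
explicit_int = np.arange(3, dtype=np.int64)
explicit_float = np.zeros(3, dtype=np.float64)
explicit_complex = np.zeros(3, dtype=np.complex128)

# The builtin spellings map onto NumPy defaults.
float_is_64 = np.dtype(float) == np.dtype(np.float64)
bool_is_bool_ = np.dtype(bool) == np.dtype(np.bool_)

# `dtype=int` follows the platform default integer, which may be 4 or 8 bytes.
default_int_bytes = np.dtype(int).itemsize

print(explicit_int.dtype, explicit_float.dtype, explicit_complex.dtype)
print("default int width in bytes:", default_int_bytes)
```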
>
> Right, there is one slight trickery because `np.intp` is often a great
> integer dtype to use, because it is the integer that NumPy uses for all
> things related to indexing and array sizes.
> (I would be happy to dig out my PR making `np.intp` the default NumPy
> integer.)
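
As a small illustration of the point about `np.intp` (a sketch, with array
contents chosen arbitrarily), index-returning functions such as
`np.nonzero` and `np.argsort` hand back `np.intp` arrays:

```python
import numpy as np

values = np.array([10, 40, 20, 30])

# Indices of the elements greater than 15.
hits = np.nonzero(values > 15)[0]

# Indices that would sort the array.
order = np.argsort(values)

print(hits.dtype == np.intp, order.dtype == np.intp)
```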
>
> Cheers,
>
> Sebastian
>
>
> >
> > Cheers,
> > Ralf
> >
> >
> > >
> > > >
> > > > np.int_ and np.float_ have fixed precision, which makes them somewhat
> > > > different from the builtin types. NumPy has a whole bunch of
> > > > different precisions for integer and floats, so this distinction
> > > > matters.
> > > >
> > > > In contrast, there is only one boolean dtype in NumPy, which matches
> > > > Python's bool. So we wouldn't have to worry, for example, about
> > > > whether a user has requested a specific precision explicitly. This
> > > > comes up in issues like type-promotion where libraries like JAX and
> > > > PyTorch have special case logic for most Python types vs NumPy dtypes
> > > > (but booleans are the same for both):
> > > > https://jax.readthedocs.io/en/latest/type_promotion.html
> > >
> > >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
