> you might want to discuss this with us at the array API standard
> https://github.com/data-apis/array-api (which is currently in RFC
> stage). The spec uses bool as the name for the boolean dtype.

I don't fully understand this argument - `np.bool` is already not the
boolean dtype. Either:

* The spec is suggesting that `pkg.bool` be some arbitrary object that
  can be passed into a dtype argument and will produce a boolean array.
  If this is the case, the spec could also just require that
  `dtype=builtins.bool` have this behavior.
* The spec is suggesting that `pkg.bool` is some rich dtype object.
  Ignoring the question of whether this should be `np.bool_` or
  `np.dtype(np.bool_)`, it's currently neither, and changing it will
  break users relying on `np.bool(True) is True`.

That's not to say this isn't a sensible thing for the specification to
have, it's just something that numpy can't conform to without breaking
code.

While it would be great if `np.bool_` could be spelt `np.bool`, I really
don't think we can make that change without a long deprecation first (if
at all).

Eric

On Thu, 10 Dec 2020 at 20:00, Sebastian Berg <sebast...@sipsolutions.net> wrote:

> On Thu, 2020-12-10 at 20:38 +0100, Ralf Gommers wrote:
> > On Thu, Dec 10, 2020 at 7:25 PM Sebastian Berg
> > <sebast...@sipsolutions.net> wrote:
> >
> > > On Wed, 2020-12-09 at 13:37 -0800, Stephan Hoyer wrote:
> > > > On Wed, Dec 9, 2020 at 1:07 PM Aaron Meurer <asmeu...@gmail.com>
> > > > wrote:
> > > >
> > > > > On Wed, Dec 9, 2020 at 9:41 AM Sebastian Berg
> > > > > <sebast...@sipsolutions.net> wrote:
> > > > > >
> > > > > > On Mon, 2020-12-07 at 14:18 -0700, Aaron Meurer wrote:
> > > > > > > Regarding np.bool specifically, if you want to deprecate
> > > > > > > this, you might want to discuss this with us at the array
> > > > > > > API standard https://github.com/data-apis/array-api (which
> > > > > > > is currently in RFC stage). The spec uses bool as the name
> > > > > > > for the boolean dtype.
> > > > > > >
> > > > > > > Would it make sense for NumPy to change np.bool to just be
> > > > > > > the boolean dtype object? Unlike int and float, there is no
> > > > > > > ambiguity with bool, and NumPy clearly doesn't have any
> > > > > > > issues with shadowing builtin names in its namespace.
> > > > > >
> > > > > > We could keep the Python alias around (which for `dtype=` is
> > > > > > the same as `np.bool_`).
> > > > > >
> > > > > > I am not sure I like the idea of immediately shadowing the
> > > > > > builtin. That is a switch we can avoid flipping (without
> > > > > > warning); `np.bool_` and `bool` are fairly different
> > > > > > beasts? [1]
> > > > >
> > > > > NumPy already shadows a lot of builtins, in many cases, in ways
> > > > > that are incompatible with existing ones. It's not something I
> > > > > would have done personally, but it's been this way for a long
> > > > > time.
> > > > >
> > > >
> > > > It may be defensible to keep np.bool as an alias for Python's
> > > > bool even when we remove the other aliases.
> > >
> >
> > I'd agree with that.
> >
> >
> > > That is true, `int` is probably the most confusing, since it is not
> > > at all compatible to a Python integer, but rather the "default"
> > > integer (which happens to be the same as C `long` currently).
> > >
> > > So we could focus on `np.int`, `np.long`. I am a bit unsure
> > > whether you would prefer that or are mainly pointing out the
> > > possibility?
> > >
> >
> > Not sure what you mean with focus, focus on describing in the release
> > notes? Deprecating `np.int` seems like the most beneficial part of
> > this whole exercise.
> >
>
> I meant limiting the current deprecation to `np.int`, maybe `np.long`,
> and a "carefully chosen" set.
> To be honest, I don't mind either way, so any stronger opinion will tip
> the scale for me personally (my default currently is to update the
> release notes to recommend the more descriptive names).
>
> There are probably more doc updates that would be nice, I will suggest
> updating a separate issue for that.
>
> > > Right now, my main take-away from the discussion is that it would be
> > > good to clarify the release notes a bit more.
> > >
> > > Using `float` for a dtype seems fine to me, but I prefer mentioning
> > > `np.float64` over `np.float_`.
> > > For integers, I wonder if we should also suggest `np.int64`, even –
> > > or because – if the default integer on many systems is currently
> > > `np.int_`?
> > >
> >
> > I agree. I think we should recommend sane, descriptive names that do
> > the right thing. So ideally we'd have people spell their dtype
> > specifiers as
> >   dtype=bool  # or np.bool
> >   dtype=np.float64
> >   dtype=np.int64
> >   dtype=np.complex128
> > The names with underscores at the end make little sense from a UX
> > perspective. And the C equivalents (single/double/etc) made sense 15
> > years ago, but with the user base of today - the majority of whom
> > will not know C fluently or at all - also don't make too much sense.
> >
> > The `dtype=int` or `dtype=np.int_` behaviour flopping between 32 and
> > 64 bits is likely to be a pitfall much more often than it is what the
> > user actually needs, so shouldn't be recommended and probably
> > deserves a warning in the docs.
>
> Right, there is one slight trickery because `np.intp` is often a great
> integer dtype to use, because it is the integer that NumPy uses for all
> things related to indexing and array sizes.
> (I would be happy to dig out my PR making `np.intp` the default NumPy
> integer.)
>
> Cheers,
>
> Sebastian
>
> >
> > Cheers,
> > Ralf
> >
> > > > np.int_ and np.float_ have fixed precision, which makes them
> > > > somewhat different from the builtin types. NumPy has a whole
> > > > bunch of different precisions for integer and floats, so this
> > > > distinction matters.
> > > >
> > > > In contrast, there is only one boolean dtype in NumPy, which
> > > > matches Python's bool. So we wouldn't have to worry, for example,
> > > > about whether a user has requested a specific precision
> > > > explicitly. This comes up in issues like type-promotion where
> > > > libraries like JAX and PyTorch have special case logic for most
> > > > Python types vs NumPy dtypes (but booleans are the same for
> > > > both):
> > > > https://jax.readthedocs.io/en/latest/type_promotion.html
> > > >
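
For anyone who wants to check the behaviours discussed in this thread, here is a minimal sketch. It is not taken from any of the messages; the variable names are made up for illustration, and it deliberately avoids `np.bool` itself, since what that name means depends on the NumPy version installed. It only assumes that `numpy` is importable.

```python
import numpy as np

# As a dtype argument, the builtin and np.bool_ give the same boolean dtype.
a = np.zeros(3, dtype=bool)
b = np.zeros(3, dtype=np.bool_)
same_bool_dtype = a.dtype == b.dtype

# np.bool_ is a NumPy scalar type, not the builtin: its instances are not
# the Python singleton True, and they are not instances of bool.
scalar_is_builtin_true = np.bool_(True) is True
scalar_is_bool_instance = isinstance(np.bool_(True), bool)

# Explicitly sized names have the same width (in bytes) on every platform.
fixed_widths = {
    "float64": np.dtype(np.float64).itemsize,
    "int64": np.dtype(np.int64).itemsize,
    "complex128": np.dtype(np.complex128).itemsize,
}

# dtype=int and np.intp depend on the platform (and NumPy version),
# so their widths are only reported here, not assumed.
default_int_width = np.dtype(int).itemsize
index_int_width = np.dtype(np.intp).itemsize

print(same_bool_dtype, scalar_is_builtin_true, scalar_is_bool_instance)
print(fixed_widths, default_int_width, index_int_width)
```

The identity check on `np.bool_(True)` is the reason code relying on `np.bool(True) is True` would break if `np.bool` were rebound to the NumPy scalar type.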
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion