On Tue, 2022-02-22 at 01:43 -0600, Juan Nunez-Iglesias wrote:
> 
> On Tue, 22 Feb 2022, at 1:01 AM, Stefan van der Walt wrote:
> > it is easier to explain away `x + 1` behaving oddly over `x[0] + 1`
> > behaving oddly
> 
> Is it? I find the two equivalent, honestly.
> 
> > given that we pretend like NumPy scalars do not exist.
> 
> This is the leaky abstraction that I think should be plugged in this
> revamp.
> 
> > This then argues for making explicit to the user that there are
> > scalars involved. I.e., no more:
> > 
> > In [4]: x = np.array([1, 2, 3])
> > 
> > In [5]: x[0]
> > Out[5]: 1
> > 
> > But rather
> > 
> > Out[5]: np.int64(1)
> 
> Yup. I would be in favour of such a repr change. (And to be clear, it
> is *only* a repr change, not a behaviour change!) I have indeed run
> across this a few times, e.g. trying to encode a single value in json
> only to find that it was a NumPy int64 rather than an int.
> 
> > 
> > > > The benefit of these semantics are that you can readily express
> > > > sequences of operations with clean Python code, without having
> > > > to explicitly cast scalars to the appropriate type. Imagine if
> > > > rather than writing this:
> > > > 
> > > > 3 * (x + 1) ** 2
> > > > you had to write this:
> > > > 
> > > > np.int32(3) * (x + np.int32(1)) ** np.int32(2)
> > 
> > And how do you write the much more common
> > 
> > x[0] + 1
> 
> Is it really much more common than arithmetic combining arrays and
> literals? I'd say it's much *less* common, especially in "idiomatic"
> NumPy which tries to avoid Python looping over elements.

I think there are a few use-cases for this (one that comes to mind is
integration, where the integration function is sometimes called on
scalar values). Especially if you look to new users, who may be using
scalars for lack of experience writing vectorized code.

But mainly, I think it is the sneakiest backcompat break...

The one "middle ground" possibility I see here is that we could limit
the weak logic to Python operators in principle (I know this seems
unpopular). The main arguments are:

* It seems somewhat straightforward to explain that `np.add(x, 1)`
  behaves more like `np.add(x, np.asarray(1))`
* We can give warnings for operators: At least integer overflows will
  give a warning, notifying users of a potential problem.
* The long notation `np.add(x, np.uint8(1))` isn't so bad if you don't
  have operators. (or `dtype=x.dtype`)

(I may well be missing a reason for why this doesn't add up at all.)

Unfortunately, there will always be strange cases. No matter what we
do, it will not always be clear if a library function calls
`np.asarray()` on the input first, or first uses the input directly.

I do not think that `asarray` should drag around the information that
it was "weak" as JAX at least can (to me this seems prone to errors and
unlike JAX our arrays are not immutable).
So if you want "weak" logic for function input you need to take care to
handle it before calling `np.asarray()`.

Cheers,

Sebastian


> 
> > now? It becomes: x[0] + np.int64(1).
> 
> I would write it as x[0].astype(np.int64) + 1, and indeed I think I
> would find that less confusing, reading the code years later, because
> it would allow me to not even have to think about type promotion.
> 
> > The reason we had value inspection was that it gave us a cushy
> > "best of both worlds"; when going with dtype-only casting, you have
> > to give something up.
> 
> Yes yes, we agree we are giving something up, we merely disagree
> about what is better to give up long term for our community. For me,
> the attractiveness of unified scalar and array semantics, together
> with unified type promotion, beats the attractiveness of hiding
> overflow from users, especially since the hiding can only ever be
> patchy.* I 100% agree with you that it is a tradeoff. But, imho, one
> worth making.
> 
> * e.g. the same user might initially be happy about the result of
> x[0] + 1 matching their infinite-precision expectation, but then be
> surprised by
> 
> x[0] + 1
> -> 256
> 
> y[0] = 1
> x[0] + y[0]
> -> 0 # WTH
> 
> Juan.
> _______________________________________________
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: sebast...@sipsolutions.net
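
To make Juan's footnote concrete, here is a small sketch in plain Python (no NumPy calls). It assumes x[0] is a uint8 holding 255, which the footnote's results imply but do not state, and `wrap_uint8` is a made-up helper that only models 8-bit wraparound. It is a toy model of the three cases under discussion, not a description of what any NumPy version does internally.

```python
# Toy model only: plain Python ints stand in for NumPy values, and
# wrap_uint8 is a hypothetical helper, not a NumPy function.

def wrap_uint8(value):
    """Reduce an integer modulo 2**8, as unsigned 8-bit arithmetic does."""
    return value % 256


x0 = 255  # assumed content of x[0], stored as uint8
y0 = 1    # y[0] from the footnote, also stored as uint8

# uint8 + uint8: both operands carry a dtype, so the sum stays in uint8.
both_typed = wrap_uint8(x0 + y0)

# Value-based handling of the literal: the result is widened to hold 256.
value_based = x0 + 1

# "Weak" literal: the Python int adopts uint8 from x[0], so the sum wraps
# exactly like the uint8 + uint8 case.
weak_literal = wrap_uint8(x0 + 1)

print(both_typed, value_based, weak_literal)
```

With a weak literal the two spellings agree, at the price of a wrapped result, which is the case where the operator could emit the overflow warning mentioned in the list of arguments above.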