[Numpy-discussion] Type resolver related deprecations and changes

Sebastian Berg Fri, 02 Apr 2021 13:46:56 -0700

Hi all,

I have to do some changes to the type resolution, and I started these
here: https://github.com/numpy/numpy/pull/18718


There are four changes:

* Deprecate `signature="l"` and `signature=("l",)`, these are confusing
  since the signature should include all inputs and outputs.  To only
  provide the output use `dtype="l"`.

* Using `dtype=` for comparisons (e.g. `np.equal`) used to be weird:

      np.equal(1, 2, dtype=object)  -> returns boolean
      np.equal(None, 2, dtype=object)  -> returns object array

  The first one will now give a FutureWarning. Comparisons that provide
  a dtype other than object or bool give a DeprecationWarning (or
  fail).
  I hope the warning can be preserved when more refactoring happens.

* NumPy *almost* always ignores any metadata, byte-order, time unit
  information from the `dtype` or `signature` arguments to ufuncs.
  Practically, the dtypes passed actually denote the DType types
  rather than the specific instance (which could be byte swapped).

  NumPy will now do this always and give a warning if byte-order or
  time unit is ignored!

* It is THEORETICALLY possible to call `ufunc->type_resolver` in the C
  API (as opposed to providing it, which is somewhat OK).
  If someone does that they have to normalize the type tuple now, I
  don't really see a reason for keeping support, when NumPy will stop
  calling it itself almost always and anyone using it would probably be
  in trouble soon.
  To be clear: I have NOT found a single instance of such code in a
  code search. Even *providing* it – which is much more reasonable –
  is probably only done by astropy/pyerfa.



** Long example for the "time unit" dropping change **

For the third point, which is in theory the largest impact. Both pandas
and astropy do not notice it (I also grepped scipy, its clean).
These are the biggest changes:

    # The following will now warn on most systems (unchanged result):
    np.add(3, 5, dtype=">i32")

    # The biggest impact is for timedelta or datetimes:
    arr = np.arange(10, dtype="m8[s]")
    # The examples always ignored the time unit "ns" (using the
    # unit of `arr`.  They now issue a warning:
    np.add(arr, arr, dtype="m8[ns]")
    np.maximum.reduce(arr, dtype="m8[ns]")

    # The following issue a warning but previously did return
    # a "ns" result.
    np.add(3, 5, dtype="m8[ns]")  # Now return generic time units
    np.maximum(arr, arr, dtype="m8[ns]")  # Now returns "s" (from `arr`)


I doubt there is a good way to just keep the old behaviour.  It is
hopelessly inconsistent.  (The result even depends on how you pass
things, as the paths I deprecate align with `dtype=` but not with the
`signature=(None, None, dtype)` equivalent call.)
One thought I had just raising a hard error instead of a UserWarning
right now.

If you say that why not have `dtype=...` always be honored as the
correct output dtype, I don't disagree.  But it seems to me we probably
would have wade through a FutureWarning first, so a warning is the
"right direction".  (Or just add an `output_dtypes` keyword argument.)

Cheers,

Sebastian

signature.asc
Description: This is a digitally signed message part

_______________________________________________
NumPy-Discussion mailing list
[email protected]
https://mail.python.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] Type resolver related deprecations and changes

Reply via email to