This is one of my oldest NumPy pain points:
>>> np.array([1, 2, 'three'])
array(['1', '2', 'three'],
      dtype='<U21')

This is almost never what I want. In many cases, I simply write
dtype=object, but for others (e.g., numpy.where), it's a minor annoyance to
explicitly cast inputs to the right type.
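
For reference, a minimal sketch of the workaround I mean. The variable names here are only illustrative, and casting both branches of `np.where` to object is one way to do it, not the only one:

```python
import numpy as np

# Passing dtype=object keeps each element as its original Python object
# instead of coercing the numbers to strings.
mixed = np.array([1, 2, 'three'], dtype=object)
kinds = [type(x).__name__ for x in mixed]

# For np.where, explicitly cast the inputs to object so the NaN sentinel
# stays a float rather than being turned into a string.
labels = np.array(['a', 'b', 'c'], dtype=object)
mask = np.array([True, False, True])
fill = np.array(np.nan, dtype=object)  # 0-d object array holding a float NaN
result = np.where(mask, labels, fill)
```

Here `result` is an object array in which the masked-out entry is still a real float NaN, so checks such as `x != x` keep working on it.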

Autoconverting numbers into strings occasionally introduces real bugs
(e.g., when using `np.nan` as a sentinel value for NA when working with
strings, as in https://github.com/pydata/xarray/pull/1847), but mostly just
hides bugs until later. It's certainly very un-Pythonic.

The sane promotion rule would be `np.promote_types(str, float) -> object`,
not a 32-character string (`<U32`).
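
To make the proposed rule concrete, here is a rough sketch. `promote_types_strict` is a hypothetical helper, not a NumPy function, and treating bools as numeric here is my assumption:

```python
import numpy as np

STRING_KINDS = "SU"       # bytes and unicode string dtypes
NUMERIC_KINDS = "biufc"   # bool, signed/unsigned int, float, complex

def promote_types_strict(a, b):
    """Hypothetical rule: string mixed with numeric promotes to object."""
    a, b = np.dtype(a), np.dtype(b)
    mixes_str_and_num = (
        (a.kind in STRING_KINDS and b.kind in NUMERIC_KINDS)
        or (b.kind in STRING_KINDS and a.kind in NUMERIC_KINDS)
    )
    if mixes_str_and_num:
        return np.dtype(object)
    # Every other pair defers to NumPy's existing promotion.
    return np.promote_types(a, b)
```

With this sketch, `promote_types_strict(str, float)` gives `dtype('O')`, while purely numeric or purely string pairs behave as they do today.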

Is it way too late to fix this for NumPy, or is this something we could
change in a major release? It would certainly need at least a deprecation
cycle. The coercion is easy enough to trigger accidentally that there are
undoubtedly many users whose code would break if we changed this.
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
