If you convert an array of strings to datetime64 and 'NaT' (or one of its variants) appears as one of the elements, all subsequent values are also rendered as NaT:
(This is in 1.7.1, but the problem is present in the current dev version as well.)

>>> import numpy as np
>>> a = np.array(['2010', 'nat', '2030'])
>>> a.astype(np.datetime64)
array(['2010', 'NaT', 'NaT'], dtype='datetime64[Y]')

The fix is to re-initialize 'dt' inside the loop in _strided_to_strided_string_to_datetime (patch attached).

Correct behavior (with patch):

>>> import numpy as np
>>> a = np.array(['2010', 'nat', '2020'])
>>> a.astype(np.datetime64)
array(['2010', 'NaT', '2020'], dtype='datetime64[Y]')

diff --git a/numpy/core/src/multiarray/dtype_transfer.c b/numpy/core/src/multiarray/dtype_transfer.c
index f758139..3bd362c 100644
--- a/numpy/core/src/multiarray/dtype_transfer.c
+++ b/numpy/core/src/multiarray/dtype_transfer.c
@@ -884,12 +884,13 @@ _strided_to_strided_string_to_datetime(char *dst, npy_intp dst_stride,
                         NpyAuxData *data)
 {
     _strided_datetime_cast_data *d = (_strided_datetime_cast_data *)data;
-    npy_int64 dt = ~NPY_DATETIME_NAT;
     npy_datetimestruct dts;
     char *tmp_buffer = d->tmp_buffer;
     char *tmp;
 
     while (N > 0) {
+        npy_int64 dt = ~NPY_DATETIME_NAT;
+
         /* Replicating strnlen with memchr, because Mac OS X lacks it */
         tmp = memchr(src, '\0', src_itemsize);
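
For readers who do not want to dig through the C source, here is a rough pure-Python sketch of the kind of bug the patch addresses: a value initialized once before the loop keeps the NaT sentinel after the first 'nat' element, so later elements are never converted. This is only a simplified model under my own assumptions, not NumPy's actual code; the names NAT, parse_year, convert_buggy and convert_fixed are hypothetical, and the guard that skips conversion is my guess at how the stale value takes effect.

```python
# Hypothetical, simplified model of a conversion loop; NOT NumPy's actual code.
NAT = None  # stand-in for the NaT sentinel value


def parse_year(text):
    """Toy parser: return an int year, or NAT for any spelling of 'nat'."""
    return NAT if text.strip().lower() == "nat" else int(text)


def convert_buggy(strings):
    """`value` is initialized once, so a NaT result sticks for later items."""
    out = []
    value = 0  # initialized once, before the loop: the bug
    for text in strings:
        # Assumed guard: conversion is skipped if `value` already holds NAT.
        if value is not NAT:
            value = parse_year(text)
        out.append(value)
    return out


def convert_fixed(strings):
    """`value` is reset on every iteration, mirroring the patch."""
    out = []
    for text in strings:
        value = 0  # re-initialized inside the loop: the fix
        if value is not NAT:
            value = parse_year(text)
        out.append(value)
    return out


print(convert_buggy(["2010", "nat", "2030"]))  # [2010, None, None]
print(convert_fixed(["2010", "nat", "2030"]))  # [2010, None, 2030]
```

Moving the initialization into the loop body is the whole fix in this model as well: each element starts from a non-NaT value, so one 'nat' entry no longer affects the elements after it.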