If you convert an array of strings to datetime64 and 'NaT' (or one of
its variants) appears as one of the elements, all subsequent values
also come out as NaT (this is in 1.7.1, but the problem is present in
the current dev version as well):
>>> import numpy as np
>>> a = np.array(['2010', 'nat', '2030'])
>>> a.astype(np.datetime64)
array(['2010', 'NaT', 'NaT'], dtype='datetime64[Y]')
The fix is to re-initialize 'dt' inside the loop in
_strided_to_strided_string_to_datetime, so that a NaT left over from
one element cannot carry into the next (patch attached).
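To make the failure mode concrete, here is a toy Python model of a loop with state that is only initialized once. This is a sketch, not the C code: the function name convert, the flag reinit_each_item, and the guard that skips the conversion when dt is already NaT are assumptions made for illustration, and the real logic in dtype_transfer.c may differ in detail.

```python
NAT = "NaT"  # stand-in for NPY_DATETIME_NAT in this toy model


def convert(strings, reinit_each_item):
    """Toy model of the conversion loop; not the actual C implementation."""
    out = []
    dt = 0  # stand-in for ~NPY_DATETIME_NAT, set once before the loop
    for s in strings:
        if reinit_each_item:
            dt = 0  # patched behavior: fresh non-NaT value for every item
        if s.lower() == "nat":
            dt = NAT
        elif dt != NAT:  # assumed guard: a stale NaT skips the conversion
            dt = int(s)
        out.append(dt)
    return out


# State initialized once: the NaT sticks to every later element.
print(convert(["2010", "nat", "2030"], reinit_each_item=False))
# State reset per element: only the 'nat' entry becomes NaT.
print(convert(["2010", "nat", "2030"], reinit_each_item=True))
```

With the flag off, the model reproduces the symptom from the report; with it on, it behaves like the patched loop.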
Correct behavior (with patch):
>>> import numpy as np
>>> a = np.array(['2010', 'nat', '2020'])
>>> a.astype(np.datetime64)
array(['2010', 'NaT', '2020'], dtype='datetime64[Y]')
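If it helps, the expected behavior could be pinned down with a small check along these lines. This is only a sketch: the helper name convert_year_strings is made up for illustration, and it passes the unit explicitly ('datetime64[Y]') and spells the missing value 'NaT' so that it does not depend on unit inference or on which NaT spellings a given NumPy version accepts.

```python
import numpy as np


def convert_year_strings(strings):
    """Cast year strings to datetime64[Y] and return their string forms."""
    converted = np.array(strings).astype('datetime64[Y]')
    # Compare string forms to avoid depending on NaT comparison semantics.
    return [str(value) for value in converted]


rendered = convert_year_strings(['2010', 'NaT', '2030'])
# The element after the NaT must keep its own value.
assert rendered[2] != 'NaT', "NaT leaked into the following element"
```

On a build with the fix, the assertion passes; on an affected build it should fail on the last element.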
diff --git a/numpy/core/src/multiarray/dtype_transfer.c b/numpy/core/src/multiarray/dtype_transfer.c
index f758139..3bd362c 100644
--- a/numpy/core/src/multiarray/dtype_transfer.c
+++ b/numpy/core/src/multiarray/dtype_transfer.c
@@ -884,12 +884,13 @@ _strided_to_strided_string_to_datetime(char *dst, npy_intp dst_stride,
                         NpyAuxData *data)
 {
     _strided_datetime_cast_data *d = (_strided_datetime_cast_data *)data;
-    npy_int64 dt = ~NPY_DATETIME_NAT;
     npy_datetimestruct dts;
     char *tmp_buffer = d->tmp_buffer;
     char *tmp;
     while (N > 0) {
+        npy_int64 dt = ~NPY_DATETIME_NAT;
+
         /* Replicating strnlen with memchr, because Mac OS X lacks it */
         tmp = memchr(src, '\0', src_itemsize);