If you convert an array of strings to datetime64 and 'NaT' (or one of
its variants) appears anywhere in the array, every value after it is
also converted to NaT:

(This is in 1.7.1, but the problem is present in the current development
version as well.)

>>> import numpy as np
>>> a = np.array(['2010', 'nat', '2030'])
>>> a.astype(np.datetime64)
array(['2010', 'NaT', 'NaT'], dtype='datetime64[Y]')

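The fix is to re-initialize 'dt' inside the loop in
_strided_to_strided_string_to_datetime (patch attached). At the moment
'dt' is initialized only once, before the loop, so its value carries
over from one element to the next.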
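
To illustrate the kind of stale-variable problem involved, here is a rough
pure-Python model of the loop. It is only a sketch and not the actual C
code: parse_year, convert_buggy and convert_fixed are hypothetical names,
and the guard that skips the conversion once dt already holds NaT is my
assumption about why the old value sticks.

```python
NAT = "NaT"  # stand-in for NPY_DATETIME_NAT


def parse_year(text):
    """Toy parser: 'nat' in any case maps to NAT, anything else to an int year."""
    return NAT if text.strip().lower() == "nat" else int(text)


def convert_buggy(strings):
    dt = None  # initialized once, outside the loop (as before the patch)
    out = []
    for s in strings:
        # Assumed guard: the conversion is skipped once dt is already NaT,
        # so the NaT from an earlier element is written out again.
        if dt != NAT:
            dt = parse_year(s)
        out.append(dt)
    return out


def convert_fixed(strings):
    out = []
    for s in strings:
        dt = None  # re-initialized for every element (what the patch does)
        if dt != NAT:
            dt = parse_year(s)
        out.append(dt)
    return out


print(convert_buggy(["2010", "nat", "2030"]))
print(convert_fixed(["2010", "nat", "2030"]))
```

In this model the first call keeps emitting NaT after the 'nat' element,
while the second recovers on the next element, which matches the behavior
seen before and after the patch.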



Correct behavior (with patch):

>>> import numpy as np
>>> a = np.array(['2010', 'nat', '2020'])
>>> a.astype(np.datetime64)
array(['2010', 'NaT', '2020'], dtype='datetime64[Y]')

diff --git a/numpy/core/src/multiarray/dtype_transfer.c b/numpy/core/src/multiarray/dtype_transfer.c
index f758139..3bd362c 100644
--- a/numpy/core/src/multiarray/dtype_transfer.c
+++ b/numpy/core/src/multiarray/dtype_transfer.c
@@ -884,12 +884,13 @@ _strided_to_strided_string_to_datetime(char *dst, npy_intp dst_stride,
                         NpyAuxData *data)
 {
     _strided_datetime_cast_data *d = (_strided_datetime_cast_data *)data;
-    npy_int64 dt = ~NPY_DATETIME_NAT;
     npy_datetimestruct dts;
     char *tmp_buffer = d->tmp_buffer;
     char *tmp;
 
     while (N > 0) {
+        npy_int64 dt = ~NPY_DATETIME_NAT;
+
         /* Replicating strnlen with memchr, because Mac OS X lacks it */
         tmp = memchr(src, '\0', src_itemsize);
 
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
