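This looks like a bug in the Unicode string length computation in arraytypes.c.src:UNICODE_compare(), based on comparison with the code in arrayobject.c:_myunicmp() and arrayobject.c:_compare_strings(). ap->descr->elsize is a size in bytes, but UNICODE_compare() appears to use it directly as a count of 4-byte PyArray_UCS4 characters.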
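To illustrate the byte-versus-character mismatch, here is a minimal standalone sketch. It is not NumPy code: ucs4_t, chars_from_bytes(), ucs4_compare() and demo_compare() are hypothetical names made up for this example, and it assumes only that the item size is given in bytes and that each character occupies 4 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for PyArray_UCS4: one 4-byte code point. */
typedef uint32_t ucs4_t;

/* Convert an item size in bytes to a count of UCS4 characters. */
static int
chars_from_bytes(int elsize)
{
    assert(elsize >= 0 && elsize % (int)sizeof(ucs4_t) == 0);
    return elsize >> 2;    /* same as elsize / 4 */
}

/* Compare the first nchars code points of two buffers; returns -1, 0 or 1. */
static int
ucs4_compare(const ucs4_t *a, const ucs4_t *b, int nchars)
{
    for (int i = 0; i < nchars; i++) {
        if (a[i] != b[i]) {
            return (a[i] < b[i]) ? -1 : 1;
        }
    }
    return 0;
}

/*
 * Each buffer holds four adjacent items of 2 characters (8 bytes per item).
 * Only the first item of each buffer ("ab" vs "ab") is meant to be compared.
 */
static int
demo_compare(int use_byte_count)
{
    static const ucs4_t x[8] = {'a', 'b', 'c', 'd', 0, 0, 0, 0};
    static const ucs4_t y[8] = {'a', 'b', 'c', 'e', 0, 0, 0, 0};
    int elsize = 2 * (int)sizeof(ucs4_t);    /* 8 bytes */
    int n = use_byte_count ? elsize : chars_from_bytes(elsize);

    return ucs4_compare(x, y, n);
}
```

In this sketch, demo_compare(0) compares exactly the two characters of the first item and reports them equal, while demo_compare(1) walks eight characters, runs into the neighbouring items, and reports a difference that does not belong to the item being compared.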
Patch below (against maintenance/1.6.x, but the bug also looks to be present in master based on my reading of the code).

---
 numpy/core/src/multiarray/arraytypes.c.src | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/numpy/core/src/multiarray/arraytypes.c.src b/numpy/core/src/multiarray/arraytypes.c.src
index fde95c4..660d1e5 100644
--- a/numpy/core/src/multiarray/arraytypes.c.src
+++ b/numpy/core/src/multiarray/arraytypes.c.src
@@ -2789,7 +2789,7 @@ static int
 UNICODE_compare(PyArray_UCS4 *ip1, PyArray_UCS4 *ip2,
                 PyArrayObject *ap)
 {
-    int itemsize = ap->descr->elsize;
+    int itemsize = (ap->descr->elsize) >> 2;
 
     if (itemsize < 0) {
         return 0;
-- 
1.7.9.3