On Oct 27, 2009, at 2:31 PM, Michael Droettboom wrote:

> Christopher Barker wrote:
>> Nadav Horesh wrote:
>>
>>> np.equal(a,a).sum(0)
>>>
>>> but, for unknown reason, np.equal operates only on "normal" arrays.
>>>
>>
>> true:
>>
>> In [25]: a
>> Out[25]:
>> array(['abc', 'def', 'abc', 'ghij'],
>>       dtype='|S4')
>>
>> In [27]: np.equal(a,a)
>> Out[27]: NotImplemented
>>
>> however:
>>
>> In [28]: a == a
>> Out[28]: array([ True,  True,  True,  True], dtype=bool)
>>
>> don't they use the same code? or is "==" reverting to plain old generic
>> python sequence comparison, which would partly explain why it is so
>> slow.
>>
> It looks as if "a == a" (that is array_richcompare) is triggering
> special case code for strings, so it is fast.  However, IMHO np.equal
> should be made to work as well.  Can you file a bug and assign it to me
> (I'm dealing with a number of other string-related things, so I might as
> well take this too).
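For anyone who wants to reproduce the behaviour quoted above, here is a minimal sketch. It assumes a NumPy installation that provides the `np.char` module. Whether `np.equal` accepts string arrays depends on the NumPy version, so the sketch treats that call as optional rather than asserting any particular result. The variable names are illustrative only.

```python
import numpy as np

# Same fixed-width byte-string array as in the quoted session.
a = np.array(['abc', 'def', 'abc', 'ghij'], dtype='S4')

# The == operator goes through the array rich-comparison path,
# which handles string arrays.
eq_operator = (a == a)

# np.char.equal is an element-wise string comparison that can stand in
# for np.equal when counting matches.
eq_char = np.char.equal(a, a)

# np.equal on string arrays is version dependent: it may return
# NotImplemented, raise a TypeError, or return a boolean array.
try:
    eq_ufunc = np.equal(a, a)
except TypeError:
    eq_ufunc = None
ufunc_ok = isinstance(eq_ufunc, np.ndarray)

print(eq_operator.sum(0), eq_char.sum(0), ufunc_ok)
```

Both `(a == a).sum(0)` and `np.char.equal(a, a).sum(0)` count the matching elements, which is what the original `np.equal(a,a).sum(0)` was meant to do.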

array_richcompare special-cases strings not for speed but for actual
functionality.  Making np.equal work with strings requires changes to the
ufunc code itself, which was never written to work with "variable-length"
data types (such as strings, unicode, and records).  Several things would
have to be fixed.  Some of the changes we made to allow for date-time data
types also made it possible to support variable-length strings, but this is
non-trivial to implement.  It's certainly possible, but I would want to look
at any changes you make before you commit them, to make sure all the issues
are understood.

Thanks,

-Travis

--
Travis Oliphant
Enthought Inc.
1-512-536-1057
http://www.enthought.com
oliph...@enthought.com