On Saturday 21 March 2009, Paul Northug wrote:
[clip]
> numpy arrays are not hashable, maybe for a good reason.

NumPy arrays are not hashable because they are mutable.

> I tried
> anyway by keeping a dict of hash(tuple(X)), but started having
> collisions. So I switched to md5.new(X).digest() as the hash function
> and it seems to work ok. In a quick search, I saw cPickle.dumps and
> repr are also used as key values.

Having collisions is not necessarily very bad, unless you have *a lot*
of them.  I wonder what kind of X you are dealing with that can provoke
so many collisions when using hash(tuple(X))?  Just curious.

> I am assuming this is a common problem with functions with numpy
> array arguments and was wondering what the best approach is
> (including not using memoization).

If md5.new(X).digest() works well for you, then go ahead; it seems fast:

In [14]: X = np.arange(1000.)

In [15]: timeit hash(tuple(X))
1000 loops, best of 3: 504 µs per loop

In [16]: timeit md5.new(X).digest()
10000 loops, best of 3: 40.4 µs per loop

Cheers,

-- 
Francesc Alted
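
For reference, here is a minimal sketch of the digest-based memoization discussed in this thread. It is not code from the original posters. It assumes Python 3, where hashlib.md5 stands in for the old md5.new, and the names array_key, memoize_array and total are hypothetical. Putting the dtype and shape into the key is an extra precaution added here, so that arrays with the same raw bytes but a different layout do not share a cache entry.

```python
import hashlib

import numpy as np


def array_key(x):
    """Build a hashable key from an array's dtype, shape and raw bytes."""
    a = np.ascontiguousarray(x)
    # hashlib.md5 replaces the Python 2 md5.new used in the thread (assumption).
    digest = hashlib.md5(a.tobytes()).digest()
    return (a.dtype.str, a.shape, digest)


def memoize_array(func):
    """Cache the results of a one-argument function keyed on array contents."""
    cache = {}

    def wrapper(x):
        key = array_key(x)
        if key not in cache:
            cache[key] = func(x)
        return cache[key]

    wrapper.cache = cache  # exposed so the cache can be inspected or cleared
    return wrapper


calls = []  # records each real evaluation, for illustration only


@memoize_array
def total(x):
    calls.append(1)
    return float(x.sum())
```

Calling total(X) twice with arrays that hold the same values should evaluate the function only once, while changing a single element gives a new key. Because the key is computed from the contents at call time, modifying an array in place afterwards does not corrupt the cache; the next call simply hashes the new contents.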
