Thank you very much. I had better crack open a NumPy reference manual instead of relying on my Python "intuition".
On Wed, Jul 21, 2010 at 3:44 PM, Pauli Virtanen <p...@iki.fi> wrote:
> Wed, 21 Jul 2010 15:12:14 -0400, wheres pythonmonks wrote:
>
>> I have an recarray -- the first column is date.
>>
>> I have the following function to compute the number of unique dates in
>> my data set:
>>
>>
>> def byName(): return(len(list(set(d['Date'])) ))
>
> What this code does is:
>
> 1. d['Date']
>
>    Extract an array slice containing the dates. This is fast.
>
> 2. set(d['Date'])
>
>    Make copies of each array item, and box them into Python objects.
>    This is slow.
>
>    Insert each of the objects in the set. Also this is somewhat slow.
>
> 3. list(set(d['Date']))
>
>    Get each item in the set, and insert them to a new list.
>    This is somewhat slow, and unnecessary if you only want to
>    count.
>
> 4. len(list(set(d['Date'])))
>
>
> So the slowness arises because the code is copying data around, and
> boxing it into Python objects.
>
> You should try using Numpy functions (these don't re-box the data) to do
> this. http://docs.scipy.org/doc/numpy/reference/routines.set.html
>
> --
> Pauli Virtanen
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
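
For reference, here is a minimal sketch of what the suggestion in the quoted message might look like, using np.unique from the set routines page linked there. It is not code from the thread: the sample recarray `d` and its values are made up for illustration, and the function names `count_unique_dates` and `count_unique_dates_slow` are hypothetical.

```python
import numpy as np

# Hypothetical sample data standing in for the recarray `d` from the thread.
d = np.rec.fromrecords(
    [
        ("2010-07-19", 1.0),
        ("2010-07-20", 2.0),
        ("2010-07-20", 3.0),
        ("2010-07-21", 4.0),
    ],
    names=["Date", "Value"],
)


def count_unique_dates(records):
    """Count distinct values in the 'Date' field using NumPy only."""
    # np.unique sorts the column and drops duplicates inside NumPy,
    # so the items are not boxed one by one into Python objects.
    return np.unique(records["Date"]).size


def count_unique_dates_slow(records):
    """The set-based approach from the original question, for comparison."""
    # The intermediate list() is dropped because len() works on a set directly.
    return len(set(records["Date"]))


n_fast = count_unique_dates(d)
n_slow = count_unique_dates_slow(d)
```

Both functions should agree on the count; how much faster the np.unique version is will depend on the array size and dtype, so it is worth timing on the real data.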