On Tue, Aug 18, 2009 at 12:30 AM, Maria Liukis<liu...@usc.edu> wrote:
> Hello everybody,
> While re-implementing some Matlab code in Python, I've run into a problem of
> finding a NumPy function analogous to Matlab's "unique(array, 'rows')"
> to get unique rows of an array. Searching the web, I've found a similar
> discussion from a couple of years ago with an example:
>
> ############## A SNIPPET FROM THE DISCUSSION
> [Numpy-discussion] Finding unique rows in an array [Was: Finding a row match
> within a numpy array]
> On Tuesday 21 August 2007, Mark.Miller wrote:
>> A slightly related question on this topic...
>>
>> Is there a good loopless way to identify all of the unique rows in an
>> array? Something like numpy.unique() is ideal, but capable of
>> extracting unique subarrays along an axis.
> You can always do a view of the rows as strings and then use unique().
> Here is an example:
> In [1]: import numpy
> In [2]: a=numpy.arange(12).reshape(4,3)
> In [3]: a[2]=(3,4,5)
> In [4]: a
> Out[4]:
> array([[ 0,  1,  2],
>        [ 3,  4,  5],
>        [ 3,  4,  5],
>        [ 9, 10, 11]])
> now, create the view and select the unique rows:
> In [5]: b=numpy.unique(a.view('S%d'%a.itemsize*a.shape[0])).view('i4')
> and finally restore the shape:
> In [6]: b.reshape((len(b)/a.shape[1], a.shape[1]))
> Out[6]:
> array([[ 0,  1,  2],
>        [ 3,  4,  5],
>        [ 9, 10, 11]])
> If you want to find unique columns instead of rows, do a transpose first
> on the initial array.
> ################END OF DISCUSSION
>
> Provided example works only because array elements are row-sorted.
> Changing the tested array to (in my case, it's 'c'):
> >>> c
> array([[ 0,  1,  2],
>        [ 3,  4,  5],
>        [ 3,  4,  5],
>        [ 9, 10, 11]])
> >>> c[0] = (11, 10, 0)
> >>> c
> array([[11, 10,  0],
>        [ 3,  4,  5],
>        [ 3,  4,  5],
>        [ 9, 10, 11]])
> >>> b = np.unique(c.view('S%s' %c.itemsize*c.shape[0]))
> >>> b
> array(['', '\x03', '\x04', '\x05', '\t', '\n', '\x0b'],
>       dtype='|S4')
> >>> b.view('i4')
> array([ 0,  3,  4,  5,  9, 10, 11])
> >>> b.reshape((len(b)/c.shape[1], c.shape[1])).view('i4')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: total size of new array must be unchanged
> >>>
> Since len(b) = 7.
> The suggested approach would work if the whole row were converted to a
> single string, I guess. But from what I could gather, numpy.array.view()
> only changes the display element-wise.
> Before I start re-inventing the wheel, I was just wondering whether one
> could find unique rows in an array using existing numpy functionality.
>
> Many thanks in advance!
> Masha
> --------------------
> liu...@usc.edu
>
>
One way is to convert to a structured array:

>>> c = np.array([[ 0, 1, 2], [ 3, 4, 5], [ 3, 4, 5], [ 9, 10, 11]])
>>> np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 9, 10, 11]])

For an explanation, I asked a similar question last December about "sortrows".
(I never remember when I need the last reshape and when not.)

Josef
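
For anyone reading this in the archive later: below is a minimal sketch of the same structured-array idea written against a current NumPy. The helper name `unique_rows` is hypothetical (it is not a NumPy function). It calls `np.unique` rather than `np.unique1d`, which was deprecated and later removed, and it spells out the field names `f0`, `f1`, ... instead of relying on empty names. It is applied to the unsorted array `c` from the question. NumPy 1.13 and later also accept `np.unique(c, axis=0)`, so the last line compares the two results.

```python
import numpy as np

def unique_rows(a):
    """Return the unique rows of a 2-D array, sorted lexicographically.

    Hypothetical helper for illustration; not part of NumPy.
    """
    # The view below needs each row to be one contiguous block of memory.
    a = np.ascontiguousarray(a)
    # One field per column, so a whole row becomes a single structured element.
    row_dtype = [('f%d' % i, a.dtype) for i in range(a.shape[1])]
    rows = a.view(row_dtype)  # shape (n_rows, 1)
    # np.unique flattens its input and compares whole records, field by field.
    uniq = np.unique(rows)
    # View back as plain numbers and restore the column count.
    return uniq.view(a.dtype).reshape(-1, a.shape[1])

c = np.array([[11, 10, 0], [3, 4, 5], [3, 4, 5], [9, 10, 11]])
print(unique_rows(c))
# Should match the built-in axis argument (NumPy >= 1.13).
print(np.array_equal(unique_rows(c), np.unique(c, axis=0)))
```

The final reshape is needed because viewing the structured result back as the original dtype yields a flat 1-D array with one entry per field.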