Josef,

Thanks, I'll try that and will search for your question from last December :)
Masha
--------------------
liu...@usc.edu



On Aug 17, 2009, at 9:44 PM, josef.p...@gmail.com wrote:

On Tue, Aug 18, 2009 at 12:30 AM, Maria Liukis<liu...@usc.edu> wrote:
Hello everybody,
While re-implementing some Matlab code in Python, I've run into the problem of finding a NumPy function analogous to Matlab's "unique(array, 'rows')" to get the unique rows of an array. Searching the web, I've found a similar
discussion from a couple of years ago with an example:

############## A SNIPPET FROM THE DISCUSSION
[Numpy-discussion] Finding unique rows in an array [Was: Finding a row match
within a numpy array]
On Tuesday 21 August 2007, Mark.Miller wrote:
A slightly related question on this topic...

Is there a good loopless way to identify all of the unique rows in an
array?  Something like numpy.unique() is ideal, but capable of
extracting unique subarrays along an axis.
You can always do a view of the rows as strings and then use unique().
Here is an example:
In [1]: import numpy
In [2]: a=numpy.arange(12).reshape(4,3)
In [3]: a[2]=(3,4,5)
In [4]: a
Out[4]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])
now, create the view and select the unique rows:
In [5]: b=numpy.unique(a.view('S%d'%a.itemsize*a.shape[0])).view ('i4')
and finally restore the shape:
In [6]: b.reshape((len(b)/a.shape[1], a.shape[1]))
Out[6]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 9, 10, 11]])
If you want to find unique columns instead of rows, do a transpose first
on the initial array.
################END OF DISCUSSION

The provided example works only because the array elements are row-sorted.
Changing the tested array (in my case, it's 'c') as follows breaks it:
c
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])
c[0] = (11, 10, 0)
c
array([[11, 10,  0],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])
b = np.unique(c.view('S%s' %c.itemsize*c.shape[0]))
b
array(['', '\x03', '\x04', '\x05', '\t', '\n', '\x0b'],
      dtype='|S4')
b.view('i4')
array([ 0,  3,  4,  5,  9, 10, 11])
b.reshape((len(b)/c.shape[1], c.shape[1])).view('i4')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: total size of new array must be unchanged

The reshape fails because len(b) = 7, which is not a multiple of the row length.
The suggested approach would work if the whole row were converted to a single string, I guess. But from what I could gather, numpy.ndarray.view()
only reinterprets the array element-wise here.
Before I start re-inventing the wheel, I was just wondering whether one could
find the unique rows of an array using existing numpy functionality.
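A possible explanation for the element-wise result, offered as a sketch rather than a diagnosis: in the format expression, % and * have the same precedence and are applied left to right, so the item size is never multiplied by the row length. The example below shows the two strings and then views each row as a single opaque item. The use of np.void (instead of an 'S' string type, which drops trailing null bytes), the int32 dtype, and the variable names are assumptions made for illustration.

```python
import numpy as np

c = np.array([[11, 10, 0],
              [3, 4, 5],
              [3, 4, 5],
              [9, 10, 11]], dtype=np.int32)

# '%' is applied first, then the resulting string is repeated by '*'.
as_written = 'S%d' % c.itemsize * c.shape[0]      # 'S4S4S4S4'
# Parenthesised version: one item spanning a whole row (3 columns * 4 bytes).
intended = 'S%d' % (c.itemsize * c.shape[1])      # 'S12'

# View every row as one opaque block of bytes.
row_dtype = np.dtype((np.void, c.dtype.itemsize * c.shape[1]))
rows = np.ascontiguousarray(c).view(row_dtype).ravel()

# np.unique orders the items byte-wise, so index back into c instead,
# sorting the indices to keep the rows in order of first appearance.
_, first_idx = np.unique(rows, return_index=True)
unique_rows = c[np.sort(first_idx)]
print(unique_rows)
```

Indexing the original array with the returned indices avoids having to convert the byte blocks back to integers.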

Many thanks in advance!
Masha
--------------------
liu...@usc.edu



One way is to convert to a structured array:

c = np.array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])

np.unique(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 9, 10, 11]])

For an explanation, see the similar question I asked last December about "sortrows".
(I never remember when I need the last reshape and when not.)
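For readers following along, here is the one-liner above split into steps, as a sketch. It suggests why the final reshape is needed in this case: viewing the unique records back as the plain dtype gives a flat array. The variable names are illustrative, and the last line assumes NumPy 1.13 or later, where np.unique accepts an axis argument.

```python
import numpy as np

c = np.array([[0, 1, 2],
              [3, 4, 5],
              [3, 4, 5],
              [9, 10, 11]])

# One field per column, so each row becomes a single structured item.
row_dtype = [('', c.dtype)] * c.shape[1]
as_records = c.view(row_dtype)            # shape (4, 1)

# np.unique flattens its input and compares the records field by field.
uniq_records = np.unique(as_records)      # shape (3,)

# Viewing back as the plain dtype yields a flat array of 9 values,
# hence the reshape to restore the column count.
uniq_rows = uniq_records.view(c.dtype).reshape(-1, c.shape[1])
print(uniq_rows)

# Assumption: NumPy >= 1.13, which supports unique rows directly.
same_rows = np.unique(c, axis=0)
```

Both results come back sorted by row, so the original row order is not preserved.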

Josef
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
