Re: [Numpy-discussion] unique rows of array

2009-08-18 Thread Maria Liukis

Josef,

Many thanks for the example! It should become an official NumPy  
recipe :)


Thanks again,
Masha

liu...@usc.edu



On Aug 17, 2009, at 10:03 PM, josef.p...@gmail.com wrote:


On Tue, Aug 18, 2009 at 12:59 AM, Maria Liukis liu...@usc.edu wrote:

On Aug 17, 2009, at 9:51 PM, Charles R Harris wrote:

On Mon, Aug 17, 2009 at 10:30 PM, Maria Liukis liu...@usc.edu wrote:

Hello everybody,
While re-implementing some Matlab code in Python, I've run into a problem
of finding a NumPy function analogous to the Matlab's unique(array,
'rows') to get unique rows of an array. Searching the web, I've found a
similar discussion from couple of years ago with an example:

Just to be clear, do you mean finding all rows that only occur once in the
array?

Yes.


I interpreted your question as removing duplicates. It keeps rows that
occur more than once.
That's what my example is intended to do.

Josef



<snip>

Chuck


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion



Re: [Numpy-discussion] unique rows of array

2009-08-18 Thread josef . pktd
On Tue, Aug 18, 2009 at 2:01 AM, Maria Liukisliu...@usc.edu wrote:
 Josef,
 Many thanks for the example! It should become an official NumPy recipe :)
 Thanks again,
 Masha
 
 liu...@usc.edu

Actually, there is also an implementation of unique rows in
scipy.stats._support. It uses loops (and array concatenation inside the
loop), but it preserves the order of the rows in the array.

In general, I don't recommend using scipy.stats._support, since many
or most of its functions are untested and only some are used in
scipy.stats. These functions are waiting for a rewrite or removal. When I
thought about a rewrite last year, I didn't know much about structured
arrays and views.

Josef

>>> cc
array([[10,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])
>>> scipy.stats._support.unique(cc)
array([[10,  1,  2],
       [ 3,  4,  5],
       [ 9, 10, 11]])

unique columns using transpose:

>>> cct = cc.T.copy()
>>> cct
array([[10,  3,  3,  9],
       [ 1,  4,  4, 10],
       [ 2,  5,  5, 11]])
>>> scipy.stats._support.unique(cct.T).T
array([[10,  3,  9],
       [ 1,  4, 10],
       [ 2,  5, 11]])

Josef
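
For readers who want the order-preserving behaviour without relying on the private scipy.stats._support module, here is a minimal sketch. It assumes NumPy 1.13 or later (where np.unique accepts an axis argument); the helper name unique_rows_keep_order is hypothetical and not part of NumPy or SciPy.

```python
import numpy as np

def unique_rows_keep_order(a):
    """Return the distinct rows of a 2-D array in order of first appearance.

    Hypothetical helper for illustration; not a NumPy or SciPy function.
    """
    a = np.asarray(a)
    # np.unique sorts the rows; return_index reports where each distinct
    # row first occurs in the input.
    _, first_idx = np.unique(a, axis=0, return_index=True)
    # Sorting those positions restores the original row order.
    return a[np.sort(first_idx)]

cc = np.array([[10, 1, 2],
               [3, 4, 5],
               [3, 4, 5],
               [9, 10, 11]])
result = unique_rows_keep_order(cc)
print(result)
```

Unlike the loop-based version, this does one sort plus one fancy-indexing step, so it should scale better on large arrays.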






[Numpy-discussion] unique rows of array

2009-08-17 Thread Maria Liukis

Hello everybody,

While re-implementing some Matlab code in Python, I've run into the
problem of finding a NumPy function analogous to Matlab's
unique(array, 'rows') to get the unique rows of an array. Searching the
web, I've found a similar discussion from a couple of years ago with an
example:



## A SNIPPET FROM THE DISCUSSION
[Numpy-discussion] Finding unique rows in an array [Was: Finding a
row match within a numpy array]

On Tuesday 21 August 2007, Mark.Miller wrote:
 A slightly related question on this topic...

 Is there a good loopless way to identify all of the unique rows in an
 array?  Something like numpy.unique() is ideal, but capable of
 extracting unique subarrays along an axis.

You can always do a view of the rows as strings and then use unique().
Here is an example:

In [1]: import numpy
In [2]: a=numpy.arange(12).reshape(4,3)
In [3]: a[2]=(3,4,5)
In [4]: a
Out[4]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])

now, create the view and select the unique rows:

In [5]: b=numpy.unique(a.view('S%d'%a.itemsize*a.shape[0])).view('i4')

and finally restore the shape:

In [6]: b.reshape((len(b)/a.shape[1], a.shape[1]))
Out[6]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 9, 10, 11]])

If you want to find unique columns instead of rows, do a transpose first
on the initial array.

END OF DISCUSSION


The provided example works only because the array elements are sorted
row by row. Changing the test array (in my case, it's 'c') breaks it:

>>> c
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])
>>> c[0] = (11, 10, 0)
>>> c
array([[11, 10,  0],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])
>>> b = np.unique(c.view('S%s' %c.itemsize*c.shape[0]))
>>> b
array(['', '\x03', '\x04', '\x05', '\t', '\n', '\x0b'],
      dtype='|S4')
>>> b.view('i4')
array([ 0,  3,  4,  5,  9, 10, 11])
>>> b.reshape((len(b)/c.shape[1], c.shape[1])).view('i4')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: total size of new array must be unchanged

The reshape fails because len(b) = 7.

The suggested approach would work if the whole row were converted to
a single string, I guess. But from what I could gather,
numpy.ndarray.view() only reinterprets the data element-wise here.
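
A possible explanation for the element-wise result, offered as a sketch rather than a confirmed diagnosis: in Python, `%` and `*` have the same precedence and group left to right, so the format expression in the quoted snippet repeats the string instead of multiplying the item size. The code below shows that, and then views each whole row as one opaque block of bytes. Using np.void rather than an 'S' string type is my own substitution (it avoids the stripping of trailing null bytes); the int32 dtype is assumed to match the 'i4' used above.

```python
import numpy as np

c = np.array([[11, 10, 0],
              [3, 4, 5],
              [3, 4, 5],
              [9, 10, 11]], dtype=np.int32)

# ('S%d' % itemsize) is evaluated first, then the string is repeated.
fmt = 'S%d' % c.itemsize * c.shape[0]
print(fmt)  # S4S4S4S4

# View every row as a single fixed-size item: itemsize * number of columns.
row_bytes = c.itemsize * c.shape[1]
rows = np.ascontiguousarray(c).view(np.dtype((np.void, row_bytes)))

# Each item now stands for one complete row, so unique() compares rows.
uniq = np.unique(rows).view(c.dtype).reshape(-1, c.shape[1])
print(uniq)
```

The rows come back ordered by their raw bytes, which depends on the machine's byte order and is generally not numerical order.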


Before I start re-inventing the wheel, I was just wondering if using  
existing numpy functionality one could find unique rows in an array.



Many thanks in advance!
Masha

liu...@usc.edu





Re: [Numpy-discussion] unique rows of array

2009-08-17 Thread josef . pktd

One way is to convert to a structured array:

>>> c = np.array([[ 0,  1,  2],
...               [ 3,  4,  5],
...               [ 3,  4,  5],
...               [ 9, 10, 11]])

>>> np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 9, 10, 11]])

For an explanation, see the similar question I asked last December about
sortrows. (I never remember when I need the final reshape and when not.)
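
A note for readers on current NumPy: as far as I know, np.unique1d was later deprecated and removed in favour of np.unique, and NumPy 1.13 and later accept an axis argument that handles rows directly. The sketch below shows both the structured-array view from this message (with np.unique substituted for np.unique1d) and the axis form; it is an illustration under those assumptions, not the code that was posted.

```python
import numpy as np

c = np.array([[0, 1, 2],
              [3, 4, 5],
              [3, 4, 5],
              [9, 10, 11]])

# Structured-array view: one record per row, one field per column.
# ascontiguousarray guards against non-contiguous input such as a transpose.
struct = np.ascontiguousarray(c).view([('', c.dtype)] * c.shape[1])
via_view = np.unique(struct).view(c.dtype).reshape(-1, c.shape[1])

# Same result with the axis argument (NumPy 1.13+).
via_axis = np.unique(c, axis=0)

print(via_view)
print(via_axis)
```

The final reshape is needed in the view version because viewing the 1-D array of records back as plain integers yields a flat array.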

Josef


Re: [Numpy-discussion] unique rows of array

2009-08-17 Thread Charles R Harris
On Mon, Aug 17, 2009 at 10:30 PM, Maria Liukis liu...@usc.edu wrote:

 Hello everybody,
 While re-implementing some Matlab code in Python, I've run into a problem
 of finding a NumPy function analogous to the Matlab's unique(array,
 'rows') to get unique rows of an array. Searching the web, I've found a
 similar discussion from couple of years ago with an example:


Just to be clear, do you mean finding all rows that only occur once in the
array?

<snip>

Chuck


Re: [Numpy-discussion] unique rows of array

2009-08-17 Thread Maria Liukis

Josef,

Thanks, I'll try that and will search for your question from last
December :)

Masha

liu...@usc.edu





Re: [Numpy-discussion] unique rows of array

2009-08-17 Thread Maria Liukis


On Aug 17, 2009, at 9:51 PM, Charles R Harris wrote:




On Mon, Aug 17, 2009 at 10:30 PM, Maria Liukis liu...@usc.edu wrote:
Hello everybody,

While re-implementing some Matlab code in Python, I've run into a  
problem of finding a NumPy function analogous to the Matlab's  
unique(array, 'rows') to get unique rows of an array. Searching  
the web, I've found a similar discussion from couple of years ago  
with an example:



Just to be clear, do you mean finding all rows that only occur once  
in the array?


Yes.



<snip>

Chuck




Re: [Numpy-discussion] unique rows of array

2009-08-17 Thread josef . pktd
On Tue, Aug 18, 2009 at 12:59 AM, Maria Liukisliu...@usc.edu wrote:

 On Aug 17, 2009, at 9:51 PM, Charles R Harris wrote:


 On Mon, Aug 17, 2009 at 10:30 PM, Maria Liukis liu...@usc.edu wrote:

 Hello everybody,
 While re-implementing some Matlab code in Python, I've run into a problem
 of finding a NumPy function analogous to the Matlab's unique(array,
 'rows') to get unique rows of an array. Searching the web, I've found a
 similar discussion from couple of years ago with an example:

 Just to be clear, do you mean finding all rows that only occur once in the
 array?

 Yes.

I interpreted your question as removing duplicates: rows that occur more
than once are kept, but only one copy of each.
That's what my example is intended to do.

Josef
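
To make the two readings of "unique rows" concrete, here is a small sketch. It assumes NumPy 1.13 or later for the axis and return_counts arguments of np.unique; the variable names are only illustrative.

```python
import numpy as np

c = np.array([[0, 1, 2],
              [3, 4, 5],
              [3, 4, 5],
              [9, 10, 11]])

# counts[i] is how many times the i-th distinct row appears in c.
distinct, counts = np.unique(c, axis=0, return_counts=True)

# Reading 1 (Matlab's unique(array, 'rows')): one copy of every row.
deduplicated = distinct

# Reading 2: only the rows that occur exactly once in the array.
occur_once = distinct[counts == 1]

print(deduplicated)
print(occur_once)
```

With this input the first reading keeps [3, 4, 5] once, while the second drops it entirely.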


 <snip>

 Chuck




Re: [Numpy-discussion] unique rows of array

2009-08-17 Thread josef . pktd
On Tue, Aug 18, 2009 at 1:03 AM, josef.p...@gmail.com wrote:
 On Tue, Aug 18, 2009 at 12:59 AM, Maria Liukisliu...@usc.edu wrote:

 On Aug 17, 2009, at 9:51 PM, Charles R Harris wrote:


 On Mon, Aug 17, 2009 at 10:30 PM, Maria Liukis liu...@usc.edu wrote:

 Hello everybody,
 While re-implementing some Matlab code in Python, I've run into a problem
 of finding a NumPy function analogous to the Matlab's unique(array,
 'rows') to get unique rows of an array. Searching the web, I've found a
 similar discussion from couple of years ago with an example:

 Just to be clear, do you mean finding all rows that only occur once in the
 array?

 Yes.

 I interpreted your question as removing duplicates. It keeps rows that
 occur more than once.
 That's what my example is intended to do.

 Josef


 <snip>

 Chuck


Just a reminder about views on views: I don't think the recommendation
to take the transpose to get unique columns works.
We had a discussion some time ago that views work on the original
array data and not on the view, and in this case the transpose creates
a view. Example below.

Also, unique does a sort and doesn't preserve order.

Josef


>>> c = np.array([[10,  1,  2],
...               [ 3,  4,  5],
...               [ 3,  4,  5],
...               [ 9, 10, 11]])
>>> cc = c.copy() #backup
>>> c = cc.T
>>> cc
array([[10,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])
>>> np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
Traceback (most recent call last):
  File "<pyshell#46>", line 1, in <module>
    np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
ValueError: new type not compatible with array.

>>> c = cc.T.copy()
>>> c
array([[10,  3,  3,  9],
       [ 1,  4,  4, 10],
       [ 2,  5,  5, 11]])
>>> np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
array([[ 1,  4,  4, 10],
       [ 2,  5,  5, 11],
       [10,  3,  3,  9]])
>>> c = np.ascontiguousarray(cc.T)
>>> np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
array([[ 1,  4,  4, 10],
       [ 2,  5,  5, 11],
       [10,  3,  3,  9]])
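
The contiguity point can be checked directly, and on NumPy 1.13 or later the axis argument of np.unique avoids the manual transpose and copy altogether. The following is a sketch under that version assumption; the small array d is made up purely to show a duplicated column.

```python
import numpy as np

cc = np.array([[10, 1, 2],
               [3, 4, 5],
               [3, 4, 5],
               [9, 10, 11]])

# The transpose is a view on the same memory and is not C-contiguous,
# which is why the structured-dtype view above is rejected without a copy.
t = cc.T
print(t.flags['C_CONTIGUOUS'])

# axis=1 treats each column as one item; columns come back sorted.
uniq_cols = np.unique(cc, axis=1)
print(uniq_cols)

# Hypothetical array with a repeated column, to show the deduplication.
d = np.array([[1, 1, 2],
              [3, 3, 4]])
uniq_cols_d = np.unique(d, axis=1)
print(uniq_cols_d)
```

As with rows, the result is sorted, so the original column order is not preserved.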


Re: [Numpy-discussion] unique rows of array

2009-08-17 Thread Maria Liukis
On Aug 17, 2009, at 10:03 PM, josef.p...@gmail.com wrote:

 On Tue, Aug 18, 2009 at 12:59 AM, Maria Liukisliu...@usc.edu wrote:

 On Aug 17, 2009, at 9:51 PM, Charles R Harris wrote:


 On Mon, Aug 17, 2009 at 10:30 PM, Maria Liukis liu...@usc.edu  
 wrote:

 Hello everybody,
 While re-implementing some Matlab code in Python, I've run into a  
 problem
 of finding a NumPy function analogous to the Matlab's unique(array,
 'rows') to get unique rows of an array. Searching the web, I've  
 found a
 similar discussion from couple of years ago with an example:

 Just to be clear, do you mean finding all rows that only occur  
 once in the
 array?

Sorry, I think it shows that I should stop working past 10pm :)


 Yes.

 I interpreted your question as removing duplicates. It keeps rows that
 occur more than once.

Yes, I meant keeping only unique (without duplicates) rows.

 That's what my example is intended to do.

 Josef


 <snip>

 Chuck

