[Numpy-discussion] Remove duplicate columns

2010-05-06 Thread T J
Hi,

Is there a way to sort the columns in an array?  I need to sort it so
that I can easily go through and keep only the unique columns.
ndarray.sort(axis=1) doesn't do what I want as it destroys the
relative ordering between the various columns. For example, I would
like:

[[2,1,3],
 [3,5,1],
 [0,3,1]]

to go to:

[[1,2,3],
 [5,3,1],
 [3,0,1]]

(swap the first and second columns).  So I want to treat the columns
as objects and sort them.  I can do this if I convert to a Python
list, but I was hoping to avoid doing that because I ultimately need
to do element-wise bitwise operations.

Thanks!
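
To make the problem concrete, here is a minimal sketch using the example array from the message (the variable names are illustrative). Sorting along axis=1 sorts each row independently, so values that started in the same column end up in different columns:

```python
import numpy as np

x = np.array([[2, 1, 3],
              [3, 5, 1],
              [0, 3, 1]])

# Sorting along axis=1 sorts every row on its own, so the columns
# are no longer moved around as whole units.
row_sorted = np.sort(x, axis=1)
print(row_sorted)

# Compare the set of columns before and after the row-wise sort.
original_cols = {tuple(col) for col in x.T.tolist()}
surviving_cols = {tuple(col) for col in row_sorted.T.tolist()}
print(original_cols & surviving_cols)  # columns that survived intact
```

For this input none of the original columns survive, which is why a column-level sort is needed instead.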
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Remove duplicate columns

2010-05-06 Thread Keith Goodman
On Thu, May 6, 2010 at 10:25 AM, T J tjhn...@gmail.com wrote:
> Is there a way to sort the columns in an array?  I need to sort it so
> that I can easily go through and keep only the unique columns.
> [...]

Assuming you want to sort columns by the values in the first row:

>>> x
array([[2, 1, 3],
       [3, 5, 1],
       [0, 3, 1]])
>>> idx = x[0,:].argsort()
>>> x[:,idx]
array([[1, 2, 3],
       [5, 3, 1],
       [3, 0, 1]])
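
One caveat worth illustrating: sorting by the first row alone does not group duplicate columns together when several columns share the same first-row value. The sketch below uses a made-up array (not from the thread) and compares the first-row argsort with np.lexsort, which breaks ties using the remaining rows. The `kind="stable"` argument is only there to make the tie order deterministic.

```python
import numpy as np

# Hypothetical input: every column has the same value in the first row,
# and columns 0 and 2 are duplicates of each other.
x = np.array([[1, 1, 1],
              [5, 2, 5],
              [3, 0, 3]])

# Sorting on the first row only: all keys tie, so nothing moves and the
# duplicate columns stay separated.
idx = x[0, :].argsort(kind="stable")
by_first_row = x[:, idx]

# np.lexsort treats the LAST key as the primary one, so reversing the rows
# makes the first row primary and lets later rows break ties.
full_order = np.lexsort(x[::-1])
fully_sorted = x[:, full_order]

print(by_first_row)
print(fully_sorted)
```

After the lexsort the two identical columns sit next to each other, which is what a later "drop adjacent duplicates" pass needs.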


Re: [Numpy-discussion] Remove duplicate columns

2010-05-06 Thread josef . pktd
On Thu, May 6, 2010 at 1:25 PM, T J tjhn...@gmail.com wrote:
> Is there a way to sort the columns in an array?  I need to sort it so
> that I can easily go through and keep only the unique columns.
> [...]

There was a thread last August on unique rows that might be useful,
and a thread in Dec 2008 on sorting rows.

Something like

np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])

With numpy 1.4 it may be np.unique instead.

Josef
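
A sketch of how the structured-view idea above might be applied to columns rather than rows. It uses np.unique in place of np.unique1d and transposes first so that each column becomes a row; the test array and variable names are made up for illustration. The last line assumes NumPy 1.13 or later, where np.unique accepts an axis argument and does the same job directly.

```python
import numpy as np

# Hypothetical input: columns 1 and 3 are duplicates.
x = np.array([[2, 1, 3, 1],
              [3, 5, 1, 5],
              [0, 3, 1, 3]])

# Transpose so each column becomes a row, and make the memory contiguous
# so that a whole row can be viewed as a single structured record.
c = np.ascontiguousarray(x.T)
records = c.view([('', c.dtype)] * c.shape[1])

# np.unique sorts and de-duplicates the records; view back to plain
# integers and restore the 2-D shape, then transpose back to columns.
unique_rows = np.unique(records).view(c.dtype).reshape(-1, c.shape[1])
via_view = unique_rows.T

# Assumes NumPy >= 1.13: the axis argument handles columns directly.
via_axis = np.unique(x, axis=1)

print(via_view)
print(via_axis)
```

Both variants return the unique columns in lexicographic order (first row compared first), so the original column order is not preserved.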



Re: [Numpy-discussion] Remove duplicate columns

2010-05-06 Thread T J
On Thu, May 6, 2010 at 10:36 AM, josef.p...@gmail.com wrote:
> There was a thread last August on unique rows that might be useful,
> and a thread in Dec 2008 on sorting rows.
> [...]

The thread is useful:

  http://www.mail-archive.com/numpy-discussion@scipy.org/msg19830.html

I'll have to see if it is quicker for me to just do:

 y = x.transpose().tolist()
 y.sort()
 x = np.array(y).transpose()
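
For reference, a minimal sketch of how the list-based sort above could be followed by the actual duplicate removal. The input array and the names `sorted_x`, `keep` and `unique_x` are illustrative assumptions, not part of the original message. Once the columns are sorted, duplicates are adjacent, so a column is kept only if it differs from its left neighbour.

```python
import numpy as np

# Hypothetical input: columns 1 and 3 are duplicates.
x = np.array([[2, 1, 3, 1],
              [3, 5, 1, 5],
              [0, 3, 1, 3]])

# Sort the columns as Python lists (lexicographic order), then convert back.
y = x.transpose().tolist()
y.sort()
sorted_x = np.array(y).transpose()

# Keep the first column, plus every column that differs from the one
# immediately before it in at least one row.
keep = np.ones(sorted_x.shape[1], dtype=bool)
keep[1:] = (sorted_x[:, 1:] != sorted_x[:, :-1]).any(axis=0)
unique_x = sorted_x[:, keep]

print(unique_x)
```

The result is an ordinary integer ndarray again, so element-wise bitwise operations can be applied to it afterwards.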


Re: [Numpy-discussion] Remove duplicate columns

2010-05-06 Thread josef . pktd
On Thu, May 6, 2010 at 4:45 PM, T J tjhn...@gmail.com wrote:
> I'll have to see if it is quicker for me to just do:
>
>  y = x.transpose().tolist()
>  y.sort()
>  x = np.array(y).transpose()

For sure it's easier to read. The difference might be the temporary array
creation, compared to using numpy.sort on a view.

Josef



Re: [Numpy-discussion] Remove duplicate columns

2010-05-06 Thread Charles R Harris
On Thu, May 6, 2010 at 11:25 AM, T J tjhn...@gmail.com wrote:
> Is there a way to sort the columns in an array?  I need to sort it so
> that I can easily go through and keep only the unique columns.
> [...]

To get the order illustrated:

In [9]: a = array([[2,1,3],[3,5,1],[0,3,1]])

In [10]: i = lexsort([a[::-1][i] for i in range(3)])

In [11]: a[:,i]
Out[11]:
array([[1, 2, 3],
       [5, 3, 1],
       [3, 0, 1]])


But if you just want them sorted, it is easier to do

In [12]: i = lexsort([a[i] for i in range(3)])

In [13]: a[:,i]
Out[13]:
array([[2, 3, 1],
       [3, 1, 5],
       [0, 1, 3]])

or just

In [18]: a[:,lexsort(a)]
Out[18]:
array([[2, 3, 1],
       [3, 1, 5],
       [0, 1, 3]])

For the bigger array

In [21]: a
Out[21]:
array([[3, 2, 2, 2, 2],
       [2, 2, 0, 2, 2],
       [0, 1, 1, 0, 1],
       [5, 5, 3, 0, 5]])

In [22]: a[:, lexsort(a)]
Out[22]:
array([[2, 2, 3, 2, 2],
       [2, 0, 2, 2, 2],
       [0, 1, 0, 1, 1],
       [0, 3, 5, 5, 5]])

Chuck
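
Building on the lexsort examples above, the following is a minimal sketch of the remaining step from the subject line, removing the duplicate columns. The helper name `unique_columns` is hypothetical and not from the thread. After the lexsort, identical columns are adjacent, so a boolean mask can drop every column that equals its left neighbour.

```python
import numpy as np

def unique_columns(a):
    """Return the distinct columns of a 2-D array, in lexsort order."""
    a = np.asarray(a)
    # lexsort uses the last row as the primary key; any consistent total
    # order works here, since it only has to make duplicates adjacent.
    order = np.lexsort(a)
    s = a[:, order]
    # Keep the first column and each column that differs from the
    # previous one in at least one row.
    keep = np.ones(s.shape[1], dtype=bool)
    keep[1:] = (s[:, 1:] != s[:, :-1]).any(axis=0)
    return s[:, keep]

# The bigger array from the message above; columns 1 and 4 are duplicates.
a = np.array([[3, 2, 2, 2, 2],
              [2, 2, 0, 2, 2],
              [0, 1, 1, 0, 1],
              [5, 5, 3, 0, 5]])

result = unique_columns(a)
print(result)
```

The approach stays inside NumPy the whole way, so it avoids the round trip through Python lists at the cost of one sorted temporary copy.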