Re: [Numpy-discussion] Find indices of largest elements

2010-04-16 Thread Neal Becker
Nikolaus Rath wrote:

 eat e.antero.ta...@gmail.com writes:
 Nikolaus Rath Nikolaus at rath.org writes:

 [snip]
 Not quite, because I'm interested in the n largest values over all
 elements, not the largest element in each row or column. But Keith's
 solution seems to work fine, even though I'm still struggling to
 understand what's going on there .

 My bad. I just concentrated on your example, not the actual question.

 However, what's wrong with your above approach
 [ np.unravel_index(i, x.shape) for i in idx[-3:] ] ?

 Especially if your n largest elements are just a small fraction of all
 elements.
 
 The fact that it sorts the entire list. But since for my arrays it's not
 really an efficiency problem, I will use it anyway.
 
 
 
 Best,
 
-Nikolaus
 

I use boost::python and pyublas to package a function that exposes the c++ 
std::partial_sort

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-15 Thread Nikolaus Rath
eat e.antero.ta...@gmail.com writes:
 How do I best find out the indices of the largest x elements in an
 array?

 Just
 a= np.asarray([[1, 8, 2], [2, 1, 3]])
 print np.where((a.T== a.max(axis= 1)).T)

 However, if any row contains more than 1 max entity, above will fail. Please 
 let me to know if that's relevant for you.

Not quite, because I'm interested in the n largest values over all
elements, not the largest element in each row or column. But Keith's
solution seems to work fine, even though I'm still struggling to
understand what's going on there :-).

Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-15 Thread Nikolaus Rath
Keith Goodman kwgood...@gmail.com writes:
 On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote:
 Keith Goodman kwgood...@gmail.com writes:
 On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote:
 On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote:
 Hello,

 How do I best find out the indices of the largest x elements in an
 array?

 Example:

 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]

 Since the largest 2 elements are at positions (0,1) and (1,2).

 Here's a quick way to rank the data if there are no ties and no NaNs:

 ...or if you need the indices in order:

 shape = (3,2)
 x = np.random.rand(*shape)
 x
 array([[ 0.52420123,  0.43231286],
        [ 0.97995333,  0.87416228],
        [ 0.71604075,  0.66018382]])
 r = x.reshape(-1).argsort().argsort()

 I don't understand why this works. Why do you call argsort() twice?
 Doesn't that give you the indices of the sorted indices?

 It is confusing. Let's look at an example:

 x = np.random.rand(4)
 x
array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])

 If we call argsort once we get the index that will sort x:

 idx = x.argsort()
 idx
array([2, 0, 3, 1])
 x[idx]
array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])

 Notice that the first element of idx is 2. That's because element x[2]
 is the min of x. But that's not what we want.

I think that's exactly what I want, the index of the smallest element.
It also seems to work:

In [3]: x = np.random.rand(3,3)
In [4]: x
Out[4]: 
array([[ 0.49064281,  0.54989584,  0.05319183],
   [ 0.50510206,  0.39683101,  0.22801874],
   [ 0.04595144,  0.3329171 ,  0.61156205]])
In [5]: idx = x.reshape(-1).argsort()
In [6]: [ np.unravel_index(i, x.shape) for i in idx[-3:] ]
Out[6]: [(1, 0), (0, 1), (2, 2)]


So why the additional complication with the second argsort? I just don't
get it...

 We want the first
 element to be the rank of the first element of x.

I'm not quite sure why we want that...?


Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-15 Thread Keith Goodman
On Thu, Apr 15, 2010 at 12:41 PM, Nikolaus Rath nikol...@rath.org wrote:
 Keith Goodman kwgood...@gmail.com writes:
 On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote:
 Keith Goodman kwgood...@gmail.com writes:
 On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote:
 On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote:
 Hello,

 How do I best find out the indices of the largest x elements in an
 array?

 Example:

 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]

 Since the largest 2 elements are at positions (0,1) and (1,2).

 Here's a quick way to rank the data if there are no ties and no NaNs:

 ...or if you need the indices in order:

 shape = (3,2)
 x = np.random.rand(*shape)
 x
 array([[ 0.52420123,  0.43231286],
        [ 0.97995333,  0.87416228],
        [ 0.71604075,  0.66018382]])
 r = x.reshape(-1).argsort().argsort()

 I don't understand why this works. Why do you call argsort() twice?
 Doesn't that give you the indices of the sorted indices?

 It is confusing. Let's look at an example:

 x = np.random.rand(4)
 x
    array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])

 If we call argsort once we get the index that will sort x:

 idx = x.argsort()
 idx
    array([2, 0, 3, 1])
 x[idx]
    array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])

 Notice that the first element of idx is 2. That's because element x[2]
 is the min of x. But that's not what we want.

 I think that's exactly what I want, the index of the smallest element.
 It also seems to work:

 In [3]: x = np.random.rand(3,3)
 In [4]: x
 Out[4]:
 array([[ 0.49064281,  0.54989584,  0.05319183],
       [ 0.50510206,  0.39683101,  0.22801874],
       [ 0.04595144,  0.3329171 ,  0.61156205]])
 In [5]: idx = x.reshape(-1).argsort()
 In [6]: [ np.unravel_index(i, x.shape) for i in idx[-3:] ]
 Out[6]: [(1, 0), (0, 1), (2, 2)]

Yes, you are right. My first thought was to approach the problem by
ranking the data. But that is not needed here since the position in
the argsorted index tells us the rank. I guess my approach was to rank
first and then ask questions later. Well, at least we got to see
Anne's fast ranking method.


 So why the additional complication with the second argsort? I just don't
 get it...

 We want the first
 element to be the rank of the first element of x.

 I'm not quite sure why we want that...?


 Best,

   -Nikolaus

 --
  »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-15 Thread eat
Nikolaus Rath Nikolaus at rath.org writes:

[snip]
 Not quite, because I'm interested in the n largest values over all
 elements, not the largest element in each row or column. But Keith's
 solution seems to work fine, even though I'm still struggling to
 understand what's going on there .

My bad. I just concentrated on your example, not the actual question.

However, what's wrong with your above approach
[ np.unravel_index(i, x.shape) for i in idx[-3:] ] ?

Especially if your n largest elements are just a small fraction of all 
elements.

# Note also the differencies
a= np.asarray([[1, 8, 2], [2, 3, 3], [3, 4, 1]])
n= 3
# between
print [np.unravel_index(ind, a.shape) for ind in np.argsort(a.ravel())[-n:]]
# and
print [np.where(val== a) for val in np.sort(a.ravel())[-n:]]


Regards,
eat
 
 Best,
 
-Nikolaus
 




___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-15 Thread Keith Goodman
On Thu, Apr 15, 2010 at 1:48 PM, Keith Goodman kwgood...@gmail.com wrote:
 On Thu, Apr 15, 2010 at 12:41 PM, Nikolaus Rath nikol...@rath.org wrote:
 Keith Goodman kwgood...@gmail.com writes:
 On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote:
 Keith Goodman kwgood...@gmail.com writes:
 On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com 
 wrote:
 On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote:
 Hello,

 How do I best find out the indices of the largest x elements in an
 array?

 Example:

 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]

 Since the largest 2 elements are at positions (0,1) and (1,2).

 Here's a quick way to rank the data if there are no ties and no NaNs:

 ...or if you need the indices in order:

 shape = (3,2)
 x = np.random.rand(*shape)
 x
 array([[ 0.52420123,  0.43231286],
        [ 0.97995333,  0.87416228],
        [ 0.71604075,  0.66018382]])
 r = x.reshape(-1).argsort().argsort()

 I don't understand why this works. Why do you call argsort() twice?
 Doesn't that give you the indices of the sorted indices?

 It is confusing. Let's look at an example:

 x = np.random.rand(4)
 x
    array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])

 If we call argsort once we get the index that will sort x:

 idx = x.argsort()
 idx
    array([2, 0, 3, 1])
 x[idx]
    array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])

 Notice that the first element of idx is 2. That's because element x[2]
 is the min of x. But that's not what we want.

 I think that's exactly what I want, the index of the smallest element.
 It also seems to work:

 In [3]: x = np.random.rand(3,3)
 In [4]: x
 Out[4]:
 array([[ 0.49064281,  0.54989584,  0.05319183],
       [ 0.50510206,  0.39683101,  0.22801874],
       [ 0.04595144,  0.3329171 ,  0.61156205]])
 In [5]: idx = x.reshape(-1).argsort()
 In [6]: [ np.unravel_index(i, x.shape) for i in idx[-3:] ]
 Out[6]: [(1, 0), (0, 1), (2, 2)]

 Yes, you are right. My first thought was to approach the problem by
 ranking the data. But that is not needed here since the position in
 the argsorted index tells us the rank. I guess my approach was to rank
 first and then ask questions later. Well, at least we got to see
 Anne's fast ranking method.

I see now that the first method I tried in this thread requires
ranking. But the second method, the one that uses unravel_index,
doesn't.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-15 Thread Nikolaus Rath
eat e.antero.ta...@gmail.com writes:
 Nikolaus Rath Nikolaus at rath.org writes:

 [snip]
 Not quite, because I'm interested in the n largest values over all
 elements, not the largest element in each row or column. But Keith's
 solution seems to work fine, even though I'm still struggling to
 understand what's going on there .

 My bad. I just concentrated on your example, not the actual question.

 However, what's wrong with your above approach
 [ np.unravel_index(i, x.shape) for i in idx[-3:] ] ?

 Especially if your n largest elements are just a small fraction of all 
 elements.

The fact that it sorts the entire list. But since for my arrays it's not
really an efficiency problem, I will use it anyway.



Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Nikolaus Rath
Hello,

How do I best find out the indices of the largest x elements in an
array?

Example:

a = [ [1,8,2], [2,1,3] ]
magic_function(a, 2) == [ (0,1), (1,2) ]

Since the largest 2 elements are at positions (0,1) and (1,2).


Best,

   -Niko

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Skipper Seabold
On Wed, Apr 14, 2010 at 11:16 AM, Nikolaus Rath nikol...@rath.org wrote:
 Hello,

 How do I best find out the indices of the largest x elements in an
 array?

 Example:

 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]

 Since the largest 2 elements are at positions (0,1) and (1,2).

Something like this might be made to work, if you want the max
elements along all the rows.

In [3]: a = np.asarray(a)

In [4]: a[range(len(a)),np.argmax(a, axis=1)]
Out[4]: array([8, 3])

Skipper
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Keith Goodman
On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote:
 Hello,

 How do I best find out the indices of the largest x elements in an
 array?

 Example:

 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]

 Since the largest 2 elements are at positions (0,1) and (1,2).

Here's a quick way to rank the data if there are no ties and no NaNs:

 shape = (3,2)
 x = np.random.rand(*shape)
 x
array([[ 0.83424288,  0.17821326],
   [ 0.62718311,  0.63514286],
   [ 0.18373934,  0.90634162]])
 r = x.reshape(-1).argsort().argsort().reshape(shape)
 r
array([[4, 0],
   [2, 3],
   [1, 5]])

To find the indices you can use where:

 r  2
array([[False,  True],
   [False, False],
   [ True, False]], dtype=bool)
 np.where(r  2)
   (array([0, 2]), array([1, 0]))

...but the indices will not be in order.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Gökhan Sever
On Wed, Apr 14, 2010 at 10:16 AM, Nikolaus Rath nikol...@rath.org wrote:

 Hello,

 How do I best find out the indices of the largest x elements in an
 array?

 Example:

 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]

 Since the largest 2 elements are at positions (0,1) and (1,2).

 I[1]: a = np.array([ [1,8,2], [2,1,3] ])

I[2]: b = a.max(axis=1)[:,np.newaxis]

I[3]: b
O[3]:
array([[8],
   [3]])

I[4]: np.where(a==b)
O[4]: (array([0, 1]), array([1, 2]))



-- 
Gökhan
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Keith Goodman
On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote:
 On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote:
 Hello,

 How do I best find out the indices of the largest x elements in an
 array?

 Example:

 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]

 Since the largest 2 elements are at positions (0,1) and (1,2).

 Here's a quick way to rank the data if there are no ties and no NaNs:

 shape = (3,2)
 x = np.random.rand(*shape)
 x
 array([[ 0.83424288,  0.17821326],
       [ 0.62718311,  0.63514286],
       [ 0.18373934,  0.90634162]])
 r = x.reshape(-1).argsort().argsort().reshape(shape)
 r
 array([[4, 0],
       [2, 3],
       [1, 5]])

 To find the indices you can use where:

 r  2
 array([[False,  True],
       [False, False],
       [ True, False]], dtype=bool)
 np.where(r  2)
   (array([0, 2]), array([1, 0]))

 ...but the indices will not be in order.

...or if you need the indices in order:

 shape = (3,2)
 x = np.random.rand(*shape)
 x
array([[ 0.52420123,  0.43231286],
   [ 0.97995333,  0.87416228],
   [ 0.71604075,  0.66018382]])
 r = x.reshape(-1).argsort().argsort()
 np.unravel_index(r[0], shape)
   (0, 1)
 np.unravel_index(r[1], shape)
   (0, 0)
 np.unravel_index(r[2], shape)
   (2, 1)
 np.unravel_index(r[-1], shape)
   (1, 0)
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Nikolaus Rath
Keith Goodman kwgood...@gmail.com writes:
 On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote:
 On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote:
 Hello,

 How do I best find out the indices of the largest x elements in an
 array?

 Example:

 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]

 Since the largest 2 elements are at positions (0,1) and (1,2).

 Here's a quick way to rank the data if there are no ties and no NaNs:

 ...or if you need the indices in order:

 shape = (3,2)
 x = np.random.rand(*shape)
 x
 array([[ 0.52420123,  0.43231286],
[ 0.97995333,  0.87416228],
[ 0.71604075,  0.66018382]])
 r = x.reshape(-1).argsort().argsort()

I don't understand why this works. Why do you call argsort() twice?
Doesn't that give you the indices of the sorted indices?


Thanks,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Keith Goodman
On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote:
 Keith Goodman kwgood...@gmail.com writes:
 On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote:
 On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote:
 Hello,

 How do I best find out the indices of the largest x elements in an
 array?

 Example:

 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]

 Since the largest 2 elements are at positions (0,1) and (1,2).

 Here's a quick way to rank the data if there are no ties and no NaNs:

 ...or if you need the indices in order:

 shape = (3,2)
 x = np.random.rand(*shape)
 x
 array([[ 0.52420123,  0.43231286],
        [ 0.97995333,  0.87416228],
        [ 0.71604075,  0.66018382]])
 r = x.reshape(-1).argsort().argsort()

 I don't understand why this works. Why do you call argsort() twice?
 Doesn't that give you the indices of the sorted indices?

It is confusing. Let's look at an example:

 x = np.random.rand(4)
 x
   array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])

If we call argsort once we get the index that will sort x:

 idx = x.argsort()
 idx
   array([2, 0, 3, 1])
 x[idx]
   array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])

Notice that the first element of idx is 2. That's because element x[2]
is the min of x. But that's not what we want. We want the first
element to be the rank of the first element of x. So we need to
shuffle idx around so that the order aligns with x. How do we do that?
We sort it!

 idx.argsort()
   array([1, 3, 0, 2])

The min value of x is x[2], that's why 2 is the first element of idx
which means that we want ranked(x) to contain a 0 at position 2 which
it does.

Bah, it's all magic.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Keith Goodman
On Wed, Apr 14, 2010 at 1:56 PM, Keith Goodman kwgood...@gmail.com wrote:
 On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote:
 Keith Goodman kwgood...@gmail.com writes:
 On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote:
 On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote:
 Hello,

 How do I best find out the indices of the largest x elements in an
 array?

 Example:

 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]

 Since the largest 2 elements are at positions (0,1) and (1,2).

 Here's a quick way to rank the data if there are no ties and no NaNs:

 ...or if you need the indices in order:

 shape = (3,2)
 x = np.random.rand(*shape)
 x
 array([[ 0.52420123,  0.43231286],
        [ 0.97995333,  0.87416228],
        [ 0.71604075,  0.66018382]])
 r = x.reshape(-1).argsort().argsort()

 I don't understand why this works. Why do you call argsort() twice?
 Doesn't that give you the indices of the sorted indices?

 It is confusing. Let's look at an example:

 x = np.random.rand(4)
 x
   array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])

 If we call argsort once we get the index that will sort x:

 idx = x.argsort()
 idx
   array([2, 0, 3, 1])
 x[idx]
   array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])

 Notice that the first element of idx is 2. That's because element x[2]
 is the min of x. But that's not what we want. We want the first
 element to be the rank of the first element of x. So we need to
 shuffle idx around so that the order aligns with x. How do we do that?
 We sort it!

 idx.argsort()
   array([1, 3, 0, 2])

 The min value of x is x[2], that's why 2 is the first element of idx
 which means that we want ranked(x) to contain a 0 at position 2 which
 it does.

 Bah, it's all magic.

You can also use rankdata from scipy:

 from scipy.stats import rankdata
 rankdata(x)
   array([ 2.,  4.,  1.,  3.])

Note the the smallest rank is 1.

 rankdata(x) - 1
   array([ 1.,  3.,  0.,  2.])
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread eat
Nikolaus Rath Nikolaus at rath.org writes:

 
 Hello,
 
 How do I best find out the indices of the largest x elements in an
 array?
 
 Example:
 
 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]
 
 Since the largest 2 elements are at positions (0,1) and (1,2).
 
 Best,
 
-Niko
 
Hi,

Just
a= np.asarray([[1, 8, 2], [2, 1, 3]])
print np.where((a.T== a.max(axis= 1)).T)

However, if any row contains more than 1 max entity, above will fail. Please 
let me to know if that's relevant for you.

-eat


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Anne Archibald
On 14 April 2010 16:56, Keith Goodman kwgood...@gmail.com wrote:
 On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote:
 Keith Goodman kwgood...@gmail.com writes:
 On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote:
 On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote:
 Hello,

 How do I best find out the indices of the largest x elements in an
 array?

 Example:

 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]

 Since the largest 2 elements are at positions (0,1) and (1,2).

 Here's a quick way to rank the data if there are no ties and no NaNs:

 ...or if you need the indices in order:

 shape = (3,2)
 x = np.random.rand(*shape)
 x
 array([[ 0.52420123,  0.43231286],
[ 0.97995333,  0.87416228],
[ 0.71604075,  0.66018382]])
 r = x.reshape(-1).argsort().argsort()

 I don't understand why this works. Why do you call argsort() twice?
 Doesn't that give you the indices of the sorted indices?

 It is confusing. Let's look at an example:

 x = np.random.rand(4)
 x
   array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])

 If we call argsort once we get the index that will sort x:

 idx = x.argsort()
 idx
   array([2, 0, 3, 1])
 x[idx]
   array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])

 Notice that the first element of idx is 2. That's because element x[2]
 is the min of x. But that's not what we want. We want the first
 element to be the rank of the first element of x. So we need to
 shuffle idx around so that the order aligns with x. How do we do that?
 We sort it!

Unless I am mistaken, what you are doing here is inverting the
permutation returned by the first argsort. The second argsort is an n
log n method, though, and permutations can be inverted in linear time:

ix = np.argsort(X)
ixinv = np.zeros_like(ix)
ixinv[ix] = np.arange(len(ix))

This works because if ix is a permutation and ixinv is its inverse,
A = B[ix]
is the same as
A[ixinv] = B
This also means that you can often do without the actual inverse by
moving the indexing operation to the other side of the equal sign.
(Not in the OP's case, though.)

I'll also add that if the OP needs the top m for 1mn, sorting the
whole input array is not the most efficient algorithm; there are
priority-queue-based schemes that are asymptotically more efficient,
but none exists in numpy. Since numpy's sorting is quite fast, I
personally would just use the sorting.

Anne




 idx.argsort()
   array([1, 3, 0, 2])

 The min value of x is x[2], that's why 2 is the first element of idx
 which means that we want ranked(x) to contain a 0 at position 2 which
 it does.

 Bah, it's all magic.
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Find indices of largest elements

2010-04-14 Thread Keith Goodman
On Wed, Apr 14, 2010 at 3:12 PM, Anne Archibald
peridot.face...@gmail.com wrote:
 On 14 April 2010 16:56, Keith Goodman kwgood...@gmail.com wrote:
 On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote:
 Keith Goodman kwgood...@gmail.com writes:
 On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote:
 On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote:
 Hello,

 How do I best find out the indices of the largest x elements in an
 array?

 Example:

 a = [ [1,8,2], [2,1,3] ]
 magic_function(a, 2) == [ (0,1), (1,2) ]

 Since the largest 2 elements are at positions (0,1) and (1,2).

 Here's a quick way to rank the data if there are no ties and no NaNs:

 ...or if you need the indices in order:

 shape = (3,2)
 x = np.random.rand(*shape)
 x
 array([[ 0.52420123,  0.43231286],
        [ 0.97995333,  0.87416228],
        [ 0.71604075,  0.66018382]])
 r = x.reshape(-1).argsort().argsort()

 I don't understand why this works. Why do you call argsort() twice?
 Doesn't that give you the indices of the sorted indices?

 It is confusing. Let's look at an example:

 x = np.random.rand(4)
 x
   array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])

 If we call argsort once we get the index that will sort x:

 idx = x.argsort()
 idx
   array([2, 0, 3, 1])
 x[idx]
   array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])

 Notice that the first element of idx is 2. That's because element x[2]
 is the min of x. But that's not what we want. We want the first
 element to be the rank of the first element of x. So we need to
 shuffle idx around so that the order aligns with x. How do we do that?
 We sort it!

 Unless I am mistaken, what you are doing here is inverting the
 permutation returned by the first argsort. The second argsort is an n
 log n method, though, and permutations can be inverted in linear time:

 ix = np.argsort(X)
 ixinv = np.zeros_like(ix)
 ixinv[ix] = np.arange(len(ix))

 This works because if ix is a permutation and ixinv is its inverse,
 A = B[ix]
 is the same as
 A[ixinv] = B
 This also means that you can often do without the actual inverse by
 moving the indexing operation to the other side of the equal sign.
 (Not in the OP's case, though.)

That is very nice. And very fast for large arrays:

 x = np.random.rand(4)
 timeit idx = x.argsort().argsort()
100 loops, best of 3: 1.45 us per loop
 timeit idx = x.argsort(); idxinv = np.zeros_like(idx); idxinv[idx] = 
 np.arange(len(idx))
10 loops, best of 3: 9.52 us per loop

 x = np.random.rand(1000)
 timeit idx = x.argsort().argsort()
1 loops, best of 3: 112 us per loop
 timeit idx = x.argsort(); idxinv = np.zeros_like(idx); idxinv[idx] = 
 np.arange(len(idx))
1 loops, best of 3: 82.9 us per loop

 x = np.random.rand(10)
 timeit idx = x.argsort().argsort()
10 loops, best of 3: 20.4 ms per loop
 timeit idx = x.argsort(); idxinv = np.zeros_like(idx); idxinv[idx] = 
 np.arange(len(idx))
100 loops, best of 3: 13.2 ms per loop

 I'll also add that if the OP needs the top m for 1mn, sorting the
 whole input array is not the most efficient algorithm; there are
 priority-queue-based schemes that are asymptotically more efficient,
 but none exists in numpy. Since numpy's sorting is quite fast, I
 personally would just use the sorting.

Partial sorting would find a lot of uses in the numpy community (like
in median).
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion