Re: [Numpy-discussion] Find indices of largest elements
Nikolaus Rath wrote: eat e.antero.ta...@gmail.com writes: Nikolaus Rath Nikolaus at rath.org writes: [snip] Not quite, because I'm interested in the n largest values over all elements, not the largest element in each row or column. But Keith's solution seems to work fine, even though I'm still struggling to understand what's going on there . My bad. I just concentrated on your example, not the actual question. However, what's wrong with your above approach [ np.unravel_index(i, x.shape) for i in idx[-3:] ] ? Especially if your n largest elements are just a small fraction of all elements. The fact that it sorts the entire list. But since for my arrays it's not really an efficiency problem, I will use it anyway. Best, -Nikolaus I use boost::python and pyublas to package a function that exposes the c++ std::partial_sort ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
eat e.antero.ta...@gmail.com writes: How do I best find out the indices of the largest x elements in an array? Just a= np.asarray([[1, 8, 2], [2, 1, 3]]) print np.where((a.T== a.max(axis= 1)).T) However, if any row contains more than 1 max entity, above will fail. Please let me to know if that's relevant for you. Not quite, because I'm interested in the n largest values over all elements, not the largest element in each row or column. But Keith's solution seems to work fine, even though I'm still struggling to understand what's going on there :-). Best, -Nikolaus -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
Keith Goodman kwgood...@gmail.com writes: On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote: Keith Goodman kwgood...@gmail.com writes: On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote: On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote: Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). Here's a quick way to rank the data if there are no ties and no NaNs: ...or if you need the indices in order: shape = (3,2) x = np.random.rand(*shape) x array([[ 0.52420123, 0.43231286], [ 0.97995333, 0.87416228], [ 0.71604075, 0.66018382]]) r = x.reshape(-1).argsort().argsort() I don't understand why this works. Why do you call argsort() twice? Doesn't that give you the indices of the sorted indices? It is confusing. Let's look at an example: x = np.random.rand(4) x array([ 0.37412289, 0.68248559, 0.12935131, 0.42510212]) If we call argsort once we get the index that will sort x: idx = x.argsort() idx array([2, 0, 3, 1]) x[idx] array([ 0.12935131, 0.37412289, 0.42510212, 0.68248559]) Notice that the first element of idx is 2. That's because element x[2] is the min of x. But that's not what we want. I think that's exactly what I want, the index of the smallest element. It also seems to work: In [3]: x = np.random.rand(3,3) In [4]: x Out[4]: array([[ 0.49064281, 0.54989584, 0.05319183], [ 0.50510206, 0.39683101, 0.22801874], [ 0.04595144, 0.3329171 , 0.61156205]]) In [5]: idx = x.reshape(-1).argsort() In [6]: [ np.unravel_index(i, x.shape) for i in idx[-3:] ] Out[6]: [(1, 0), (0, 1), (2, 2)] So why the additional complication with the second argsort? I just don't get it... We want the first element to be the rank of the first element of x. I'm not quite sure why we want that...? Best, -Nikolaus -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
On Thu, Apr 15, 2010 at 12:41 PM, Nikolaus Rath nikol...@rath.org wrote: Keith Goodman kwgood...@gmail.com writes: On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote: Keith Goodman kwgood...@gmail.com writes: On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote: On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote: Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). Here's a quick way to rank the data if there are no ties and no NaNs: ...or if you need the indices in order: shape = (3,2) x = np.random.rand(*shape) x array([[ 0.52420123, 0.43231286], [ 0.97995333, 0.87416228], [ 0.71604075, 0.66018382]]) r = x.reshape(-1).argsort().argsort() I don't understand why this works. Why do you call argsort() twice? Doesn't that give you the indices of the sorted indices? It is confusing. Let's look at an example: x = np.random.rand(4) x array([ 0.37412289, 0.68248559, 0.12935131, 0.42510212]) If we call argsort once we get the index that will sort x: idx = x.argsort() idx array([2, 0, 3, 1]) x[idx] array([ 0.12935131, 0.37412289, 0.42510212, 0.68248559]) Notice that the first element of idx is 2. That's because element x[2] is the min of x. But that's not what we want. I think that's exactly what I want, the index of the smallest element. It also seems to work: In [3]: x = np.random.rand(3,3) In [4]: x Out[4]: array([[ 0.49064281, 0.54989584, 0.05319183], [ 0.50510206, 0.39683101, 0.22801874], [ 0.04595144, 0.3329171 , 0.61156205]]) In [5]: idx = x.reshape(-1).argsort() In [6]: [ np.unravel_index(i, x.shape) for i in idx[-3:] ] Out[6]: [(1, 0), (0, 1), (2, 2)] Yes, you are right. My first thought was to approach the problem by ranking the data. But that is not needed here since the position in the argsorted index tells us the rank. I guess my approach was to rank first and then ask questions later. Well, at least we got to see Anne's fast ranking method. So why the additional complication with the second argsort? I just don't get it... We want the first element to be the rank of the first element of x. I'm not quite sure why we want that...? Best, -Nikolaus -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
Nikolaus Rath Nikolaus at rath.org writes: [snip] Not quite, because I'm interested in the n largest values over all elements, not the largest element in each row or column. But Keith's solution seems to work fine, even though I'm still struggling to understand what's going on there . My bad. I just concentrated on your example, not the actual question. However, what's wrong with your above approach [ np.unravel_index(i, x.shape) for i in idx[-3:] ] ? Especially if your n largest elements are just a small fraction of all elements. # Note also the differencies a= np.asarray([[1, 8, 2], [2, 3, 3], [3, 4, 1]]) n= 3 # between print [np.unravel_index(ind, a.shape) for ind in np.argsort(a.ravel())[-n:]] # and print [np.where(val== a) for val in np.sort(a.ravel())[-n:]] Regards, eat Best, -Nikolaus ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
On Thu, Apr 15, 2010 at 1:48 PM, Keith Goodman kwgood...@gmail.com wrote: On Thu, Apr 15, 2010 at 12:41 PM, Nikolaus Rath nikol...@rath.org wrote: Keith Goodman kwgood...@gmail.com writes: On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote: Keith Goodman kwgood...@gmail.com writes: On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote: On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote: Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). Here's a quick way to rank the data if there are no ties and no NaNs: ...or if you need the indices in order: shape = (3,2) x = np.random.rand(*shape) x array([[ 0.52420123, 0.43231286], [ 0.97995333, 0.87416228], [ 0.71604075, 0.66018382]]) r = x.reshape(-1).argsort().argsort() I don't understand why this works. Why do you call argsort() twice? Doesn't that give you the indices of the sorted indices? It is confusing. Let's look at an example: x = np.random.rand(4) x array([ 0.37412289, 0.68248559, 0.12935131, 0.42510212]) If we call argsort once we get the index that will sort x: idx = x.argsort() idx array([2, 0, 3, 1]) x[idx] array([ 0.12935131, 0.37412289, 0.42510212, 0.68248559]) Notice that the first element of idx is 2. That's because element x[2] is the min of x. But that's not what we want. I think that's exactly what I want, the index of the smallest element. It also seems to work: In [3]: x = np.random.rand(3,3) In [4]: x Out[4]: array([[ 0.49064281, 0.54989584, 0.05319183], [ 0.50510206, 0.39683101, 0.22801874], [ 0.04595144, 0.3329171 , 0.61156205]]) In [5]: idx = x.reshape(-1).argsort() In [6]: [ np.unravel_index(i, x.shape) for i in idx[-3:] ] Out[6]: [(1, 0), (0, 1), (2, 2)] Yes, you are right. My first thought was to approach the problem by ranking the data. But that is not needed here since the position in the argsorted index tells us the rank. I guess my approach was to rank first and then ask questions later. Well, at least we got to see Anne's fast ranking method. I see now that the first method I tried in this thread requires ranking. But the second method, the one that uses unravel_index, doesn't. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
eat e.antero.ta...@gmail.com writes: Nikolaus Rath Nikolaus at rath.org writes: [snip] Not quite, because I'm interested in the n largest values over all elements, not the largest element in each row or column. But Keith's solution seems to work fine, even though I'm still struggling to understand what's going on there . My bad. I just concentrated on your example, not the actual question. However, what's wrong with your above approach [ np.unravel_index(i, x.shape) for i in idx[-3:] ] ? Especially if your n largest elements are just a small fraction of all elements. The fact that it sorts the entire list. But since for my arrays it's not really an efficiency problem, I will use it anyway. Best, -Nikolaus -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Find indices of largest elements
Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). Best, -Niko -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
On Wed, Apr 14, 2010 at 11:16 AM, Nikolaus Rath nikol...@rath.org wrote: Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). Something like this might be made to work, if you want the max elements along all the rows. In [3]: a = np.asarray(a) In [4]: a[range(len(a)),np.argmax(a, axis=1)] Out[4]: array([8, 3]) Skipper ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote: Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). Here's a quick way to rank the data if there are no ties and no NaNs: shape = (3,2) x = np.random.rand(*shape) x array([[ 0.83424288, 0.17821326], [ 0.62718311, 0.63514286], [ 0.18373934, 0.90634162]]) r = x.reshape(-1).argsort().argsort().reshape(shape) r array([[4, 0], [2, 3], [1, 5]]) To find the indices you can use where: r 2 array([[False, True], [False, False], [ True, False]], dtype=bool) np.where(r 2) (array([0, 2]), array([1, 0])) ...but the indices will not be in order. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
On Wed, Apr 14, 2010 at 10:16 AM, Nikolaus Rath nikol...@rath.org wrote: Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). I[1]: a = np.array([ [1,8,2], [2,1,3] ]) I[2]: b = a.max(axis=1)[:,np.newaxis] I[3]: b O[3]: array([[8], [3]]) I[4]: np.where(a==b) O[4]: (array([0, 1]), array([1, 2])) -- Gökhan ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote: On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote: Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). Here's a quick way to rank the data if there are no ties and no NaNs: shape = (3,2) x = np.random.rand(*shape) x array([[ 0.83424288, 0.17821326], [ 0.62718311, 0.63514286], [ 0.18373934, 0.90634162]]) r = x.reshape(-1).argsort().argsort().reshape(shape) r array([[4, 0], [2, 3], [1, 5]]) To find the indices you can use where: r 2 array([[False, True], [False, False], [ True, False]], dtype=bool) np.where(r 2) (array([0, 2]), array([1, 0])) ...but the indices will not be in order. ...or if you need the indices in order: shape = (3,2) x = np.random.rand(*shape) x array([[ 0.52420123, 0.43231286], [ 0.97995333, 0.87416228], [ 0.71604075, 0.66018382]]) r = x.reshape(-1).argsort().argsort() np.unravel_index(r[0], shape) (0, 1) np.unravel_index(r[1], shape) (0, 0) np.unravel_index(r[2], shape) (2, 1) np.unravel_index(r[-1], shape) (1, 0) ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
Keith Goodman kwgood...@gmail.com writes: On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote: On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote: Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). Here's a quick way to rank the data if there are no ties and no NaNs: ...or if you need the indices in order: shape = (3,2) x = np.random.rand(*shape) x array([[ 0.52420123, 0.43231286], [ 0.97995333, 0.87416228], [ 0.71604075, 0.66018382]]) r = x.reshape(-1).argsort().argsort() I don't understand why this works. Why do you call argsort() twice? Doesn't that give you the indices of the sorted indices? Thanks, -Nikolaus -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote: Keith Goodman kwgood...@gmail.com writes: On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote: On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote: Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). Here's a quick way to rank the data if there are no ties and no NaNs: ...or if you need the indices in order: shape = (3,2) x = np.random.rand(*shape) x array([[ 0.52420123, 0.43231286], [ 0.97995333, 0.87416228], [ 0.71604075, 0.66018382]]) r = x.reshape(-1).argsort().argsort() I don't understand why this works. Why do you call argsort() twice? Doesn't that give you the indices of the sorted indices? It is confusing. Let's look at an example: x = np.random.rand(4) x array([ 0.37412289, 0.68248559, 0.12935131, 0.42510212]) If we call argsort once we get the index that will sort x: idx = x.argsort() idx array([2, 0, 3, 1]) x[idx] array([ 0.12935131, 0.37412289, 0.42510212, 0.68248559]) Notice that the first element of idx is 2. That's because element x[2] is the min of x. But that's not what we want. We want the first element to be the rank of the first element of x. So we need to shuffle idx around so that the order aligns with x. How do we do that? We sort it! idx.argsort() array([1, 3, 0, 2]) The min value of x is x[2], that's why 2 is the first element of idx which means that we want ranked(x) to contain a 0 at position 2 which it does. Bah, it's all magic. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
On Wed, Apr 14, 2010 at 1:56 PM, Keith Goodman kwgood...@gmail.com wrote: On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote: Keith Goodman kwgood...@gmail.com writes: On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote: On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote: Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). Here's a quick way to rank the data if there are no ties and no NaNs: ...or if you need the indices in order: shape = (3,2) x = np.random.rand(*shape) x array([[ 0.52420123, 0.43231286], [ 0.97995333, 0.87416228], [ 0.71604075, 0.66018382]]) r = x.reshape(-1).argsort().argsort() I don't understand why this works. Why do you call argsort() twice? Doesn't that give you the indices of the sorted indices? It is confusing. Let's look at an example: x = np.random.rand(4) x array([ 0.37412289, 0.68248559, 0.12935131, 0.42510212]) If we call argsort once we get the index that will sort x: idx = x.argsort() idx array([2, 0, 3, 1]) x[idx] array([ 0.12935131, 0.37412289, 0.42510212, 0.68248559]) Notice that the first element of idx is 2. That's because element x[2] is the min of x. But that's not what we want. We want the first element to be the rank of the first element of x. So we need to shuffle idx around so that the order aligns with x. How do we do that? We sort it! idx.argsort() array([1, 3, 0, 2]) The min value of x is x[2], that's why 2 is the first element of idx which means that we want ranked(x) to contain a 0 at position 2 which it does. Bah, it's all magic. You can also use rankdata from scipy: from scipy.stats import rankdata rankdata(x) array([ 2., 4., 1., 3.]) Note the the smallest rank is 1. rankdata(x) - 1 array([ 1., 3., 0., 2.]) ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
Nikolaus Rath Nikolaus at rath.org writes: Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). Best, -Niko Hi, Just a= np.asarray([[1, 8, 2], [2, 1, 3]]) print np.where((a.T== a.max(axis= 1)).T) However, if any row contains more than 1 max entity, above will fail. Please let me to know if that's relevant for you. -eat ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
On 14 April 2010 16:56, Keith Goodman kwgood...@gmail.com wrote: On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote: Keith Goodman kwgood...@gmail.com writes: On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote: On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote: Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). Here's a quick way to rank the data if there are no ties and no NaNs: ...or if you need the indices in order: shape = (3,2) x = np.random.rand(*shape) x array([[ 0.52420123, 0.43231286], [ 0.97995333, 0.87416228], [ 0.71604075, 0.66018382]]) r = x.reshape(-1).argsort().argsort() I don't understand why this works. Why do you call argsort() twice? Doesn't that give you the indices of the sorted indices? It is confusing. Let's look at an example: x = np.random.rand(4) x array([ 0.37412289, 0.68248559, 0.12935131, 0.42510212]) If we call argsort once we get the index that will sort x: idx = x.argsort() idx array([2, 0, 3, 1]) x[idx] array([ 0.12935131, 0.37412289, 0.42510212, 0.68248559]) Notice that the first element of idx is 2. That's because element x[2] is the min of x. But that's not what we want. We want the first element to be the rank of the first element of x. So we need to shuffle idx around so that the order aligns with x. How do we do that? We sort it! Unless I am mistaken, what you are doing here is inverting the permutation returned by the first argsort. The second argsort is an n log n method, though, and permutations can be inverted in linear time: ix = np.argsort(X) ixinv = np.zeros_like(ix) ixinv[ix] = np.arange(len(ix)) This works because if ix is a permutation and ixinv is its inverse, A = B[ix] is the same as A[ixinv] = B This also means that you can often do without the actual inverse by moving the indexing operation to the other side of the equal sign. (Not in the OP's case, though.) I'll also add that if the OP needs the top m for 1mn, sorting the whole input array is not the most efficient algorithm; there are priority-queue-based schemes that are asymptotically more efficient, but none exists in numpy. Since numpy's sorting is quite fast, I personally would just use the sorting. Anne idx.argsort() array([1, 3, 0, 2]) The min value of x is x[2], that's why 2 is the first element of idx which means that we want ranked(x) to contain a 0 at position 2 which it does. Bah, it's all magic. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Find indices of largest elements
On Wed, Apr 14, 2010 at 3:12 PM, Anne Archibald peridot.face...@gmail.com wrote: On 14 April 2010 16:56, Keith Goodman kwgood...@gmail.com wrote: On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath nikol...@rath.org wrote: Keith Goodman kwgood...@gmail.com writes: On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman kwgood...@gmail.com wrote: On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath nikol...@rath.org wrote: Hello, How do I best find out the indices of the largest x elements in an array? Example: a = [ [1,8,2], [2,1,3] ] magic_function(a, 2) == [ (0,1), (1,2) ] Since the largest 2 elements are at positions (0,1) and (1,2). Here's a quick way to rank the data if there are no ties and no NaNs: ...or if you need the indices in order: shape = (3,2) x = np.random.rand(*shape) x array([[ 0.52420123, 0.43231286], [ 0.97995333, 0.87416228], [ 0.71604075, 0.66018382]]) r = x.reshape(-1).argsort().argsort() I don't understand why this works. Why do you call argsort() twice? Doesn't that give you the indices of the sorted indices? It is confusing. Let's look at an example: x = np.random.rand(4) x array([ 0.37412289, 0.68248559, 0.12935131, 0.42510212]) If we call argsort once we get the index that will sort x: idx = x.argsort() idx array([2, 0, 3, 1]) x[idx] array([ 0.12935131, 0.37412289, 0.42510212, 0.68248559]) Notice that the first element of idx is 2. That's because element x[2] is the min of x. But that's not what we want. We want the first element to be the rank of the first element of x. So we need to shuffle idx around so that the order aligns with x. How do we do that? We sort it! Unless I am mistaken, what you are doing here is inverting the permutation returned by the first argsort. The second argsort is an n log n method, though, and permutations can be inverted in linear time: ix = np.argsort(X) ixinv = np.zeros_like(ix) ixinv[ix] = np.arange(len(ix)) This works because if ix is a permutation and ixinv is its inverse, A = B[ix] is the same as A[ixinv] = B This also means that you can often do without the actual inverse by moving the indexing operation to the other side of the equal sign. (Not in the OP's case, though.) That is very nice. And very fast for large arrays: x = np.random.rand(4) timeit idx = x.argsort().argsort() 100 loops, best of 3: 1.45 us per loop timeit idx = x.argsort(); idxinv = np.zeros_like(idx); idxinv[idx] = np.arange(len(idx)) 10 loops, best of 3: 9.52 us per loop x = np.random.rand(1000) timeit idx = x.argsort().argsort() 1 loops, best of 3: 112 us per loop timeit idx = x.argsort(); idxinv = np.zeros_like(idx); idxinv[idx] = np.arange(len(idx)) 1 loops, best of 3: 82.9 us per loop x = np.random.rand(10) timeit idx = x.argsort().argsort() 10 loops, best of 3: 20.4 ms per loop timeit idx = x.argsort(); idxinv = np.zeros_like(idx); idxinv[idx] = np.arange(len(idx)) 100 loops, best of 3: 13.2 ms per loop I'll also add that if the OP needs the top m for 1mn, sorting the whole input array is not the most efficient algorithm; there are priority-queue-based schemes that are asymptotically more efficient, but none exists in numpy. Since numpy's sorting is quite fast, I personally would just use the sorting. Partial sorting would find a lot of uses in the numpy community (like in median). ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion