On Tue, Jan 16, 2024 at 11:05 PM hao chen <unbelieveble.c...@gmail.com>
wrote:

> When dealing with lists that contain duplicate data, np.argsort fails to
> return index values that correspond to the actual sorting positions of the
> data, as it does when handling arrays without duplicates.
>
> Dear Author:
>
> When I use the np.argsort function on an array without duplicate data, the
> returned index values correspond to the sorting positions of the respective
> data.😀
>
> x = [1, 2, 5, 4]
> rank = np.argsort(x)
> print(rank)
> # [0 1 3 2]
>
That is not what `argsort` is intended or documented to do. It returns an
array of indices _into `x`_ such that taking the values from `x` in that
order gives a sorted array. That is, if `x` were sorted into the array
`sorted_x`, then `x[rank[i]] == sorted_x[i]` for all `i in
range(len(x))`. The indices in `rank` are positions in `x`, not positions
in `sorted_x`. The two happen to coincide in this case, but only because
this particular permutation is its own inverse, which is fairly common in
small examples like this one. Consider `[20, 30, 10, 40]` instead:

>>> import numpy as np
>>> x = np.array([20, 30, 10, 40])
>>> ix = np.argsort(x)
>>> def position(x):
...     sorted_x = np.array(x)
...     sorted_x.sort()
...     return np.searchsorted(sorted_x, x)
...
>>> ip = position(x)
>>> ix
array([2, 0, 1, 3])
>>> ip
array([1, 2, 0, 3])
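
To make the difference between the two index arrays concrete, here is a
small sketch that checks what each one satisfies for the same `x`. The
names `taken_in_order` and `looked_up` are mine, purely for illustration:

```python
import numpy as np

x = np.array([20, 30, 10, 40])
sorted_x = np.sort(x)

ix = np.argsort(x)                 # positions in x, listed in sorted order
ip = np.searchsorted(sorted_x, x)  # position in sorted_x of each element of x

# argsort: taking x in the order given by ix yields the sorted array.
taken_in_order = x[ix]

# position: looking up each element's slot in sorted_x gives back x itself.
looked_up = sorted_x[ip]

print(np.array_equal(taken_in_order, sorted_x))
print(np.array_equal(looked_up, x))
```

So `ix` answers "which element of `x` goes in slot `i`?", while `ip`
answers "which slot does element `i` of `x` go to?".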

But also notice:

>>> np.argsort(np.argsort(x))
array([1, 2, 0, 3])

This double-argsort is what you seem to be looking for, though it depends
on how you want duplicates handled: do you return the first index into the
sorted array that holds the same value, as in my `position()`
implementation, or the index that the particular item was actually sorted
to?
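
As a rough illustration of how the two choices differ once there are
ties, here is a sketch on a made-up input. The array and the names
`first_slot` and `actual_slot` are assumptions for the example, and I pass
`kind="stable"` so that tied elements keep their original relative order:

```python
import numpy as np

x = np.array([20, 10, 20, 10])  # illustrative input with ties

sorted_x = np.sort(x)

# Option 1: first index in the sorted array holding the same value,
# so tied elements share one position.
first_slot = np.searchsorted(sorted_x, x)

# Option 2: the slot each element was actually sorted to.
# kind="stable" keeps tied elements in their original order.
actual_slot = np.argsort(np.argsort(x, kind="stable"))

print(first_slot)
print(actual_slot)
```

With the first option both 20s report the same position; with the second
every element gets a distinct slot, and ties are broken by original order.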

Either way, we probably aren't going to add this as its own function. Both
options are straightforward combinations of existing primitives.

-- 
Robert Kern
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/