Re: [Numpy-discussion] Sorting of an array row-by-row?
On Fri, Oct 20, 2017 at 1:40 PM, Joseph Fox-Rabinovitz <jfoxrabinov...@gmail.com> wrote:

> I do not think that there is any particular relationship between the
> order of the keys and lexicographic order. The key order is just a
> convention, which is clearly documented. I agree that it is a bit
> counter-intuitive for anyone who has used Excel or MATLAB, but it is
> ingrained in the API at this point.

When I wrote lexsort for numarray, together with the typed sorting routines, I went back and forth on the key order, but finally decided that the simplest thing would be to leave the keys in the same order as the sorts. That requires a bit of knowledge as to what the effect of that is, but if one remembers that the last sort dominates, it isn't too bad.

Chuck

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
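To illustrate the remark that the last sort dominates, here is a minimal sketch (not part of the original thread). The arrays `primary` and `secondary` are made-up example data, and the sketch assumes a NumPy version recent enough to accept kind="stable" in argsort. It checks that np.lexsort returns the same ordering as applying stable sorts one key at a time, in the order the keys are listed.

```python
import numpy as np

# Hypothetical example data: two key columns with repeated values.
primary = np.array([2, 1, 2, 1, 1])
secondary = np.array([9, 7, 3, 8, 7])

# lexsort: the LAST key in the sequence is the primary sort key.
idx_lexsort = np.lexsort((secondary, primary))

# The same ordering from successive stable sorts, applied in key order:
# first by `secondary`, then by `primary`. The last sort dominates,
# and stability preserves the earlier ordering among ties.
idx_manual = np.argsort(secondary, kind="stable")
idx_manual = idx_manual[np.argsort(primary[idx_manual], kind="stable")]

print(idx_lexsort)
print(idx_manual)
```

Read this way, the key order passed to lexsort is simply the order in which the individual sorts would be performed.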
Re: [Numpy-discussion] Sorting of an array row-by-row?
I do not think that there is any particular relationship between the order of the keys and lexicographic order. The key order is just a convention, which is clearly documented. I agree that it is a bit counter-intuitive for anyone who has used Excel or MATLAB, but it is ingrained in the API at this point.

-Joe

On Fri, Oct 20, 2017 at 3:03 PM, Kirill Balunov wrote:

> Thank you Josef, you gave me an idea, and now the fastest version (for big
> arrays) on my laptop is:
>
>     np.lexsort(arr[:, ::-1].T)
>
> For me the strangest thing is the order of the keys: what was the idea
> behind keeping them right-to-left? How does this relate to lexicographic
> order?
Re: [Numpy-discussion] Sorting of an array row-by-row?
Thank you Josef, you gave me an idea, and now the fastest version (for big arrays) on my laptop is:

    np.lexsort(arr[:, ::-1].T)

For me the strangest thing is the order of the keys: what was the idea behind keeping them right-to-left? How does this relate to lexicographic order?

2017-10-20 17:11 GMT+03:00 Joseph Fox-Rabinovitz:

> There are two mistakes in your PS. The immediate error comes from the
> fact that lexsort accepts an iterable of 1D arrays, so when you pass
> in arr as the argument, it is treated as an iterable over the rows,
> each of which is 1D. 1D arrays do not have an axis=1. You actually
> want to iterate over the columns, so np.lexsort(a.T) is the correct
> phrasing of that. No idea about the speed difference.
>
> -Joe
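As a quick check of the compact call np.lexsort(arr[:, ::-1].T) from this message, the sketch below (not from the thread) compares it with the explicit key list used in the original post. The seed and the array size are arbitrary choices, and np.random.default_rng assumes a reasonably recent NumPy. Reversing the columns before transposing puts column 0 last, which is where lexsort expects the primary key.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
arr = rng.integers(-100, 100, size=(100, 3))

# Explicit keys: column 0 is listed last, so it is the primary key.
idx_explicit = np.lexsort([arr[:, 2], arr[:, 1], arr[:, 0]])

# Compact form: reverse the column order, then transpose so that
# iterating over the result yields columns 2, 1, 0 in that order.
idx_compact = np.lexsort(arr[:, ::-1].T)

sorted_arr = arr[idx_compact]

# Cross-check against Python's own lexicographic ordering of tuples.
rows = [tuple(r) for r in sorted_arr.tolist()]
print(np.array_equal(idx_explicit, idx_compact), rows == sorted(rows))
```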
Re: [Numpy-discussion] Sorting of an array row-by-row?
There are two mistakes in your PS. The immediate error comes from the fact that lexsort accepts an iterable of 1D arrays, so when you pass in arr as the argument, it is treated as an iterable over the rows, each of which is 1D. 1D arrays do not have an axis=1. You actually want to iterate over the columns, so np.lexsort(arr.T) is the correct phrasing of that. No idea about the speed difference.

-Joe

On Fri, Oct 20, 2017 at 6:00 AM, Kirill Balunov wrote:

> [...]
> p.s.: One more thing: when I first tried to use lexsort, I got this
> strange exception:
>
>     np.lexsort(arr, axis=1)
>
>     AxisError: axis 1 is out of bounds for array of dimension 1
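The explanation above can be reproduced with a small sketch (not from the thread; the 3x3 array is made-up data). It catches ValueError, which AxisError subclasses, so it does not depend on where a given NumPy version exposes AxisError. Note that passing the transposed array makes the last column the primary key; reversing the columns first makes column 0 the primary key instead.

```python
import numpy as np

# Hypothetical example data: three rows, three columns.
arr = np.array([[3, 1, 2],
                [1, 5, 0],
                [1, 2, 9]])

# Iterating over a 2-D array yields its rows, each of which is 1-D,
# so axis=1 does not exist for the individual keys.
try:
    np.lexsort(arr, axis=1)
    error_message = None
except ValueError as exc:  # AxisError is a ValueError subclass
    error_message = str(exc)

# Keys are the columns; the last column is the primary key.
by_last_col = np.lexsort(arr.T)

# Keys are the columns in reverse order; column 0 is the primary key.
by_first_col = np.lexsort(arr[:, ::-1].T)

print(error_message)
print(by_last_col, by_first_col)
```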
[Numpy-discussion] Sorting of an array row-by-row?
Hi,

I was trying to sort an (N, 3) array by rows, and first came up with this solution:

    import numpy as np

    N = 100
    arr = np.random.randint(-100, 100, size=(N, 3))
    dt = np.dtype([('x', int), ('y', int), ('z', int)])

    arr.view(dtype=dt).sort(axis=0)

Then I found another way, using the lexsort function:

    idx = np.lexsort([arr[:, 2], arr[:, 1], arr[:, 0]])
    arr = arr[idx]

This is 4 times faster than the previous solution. And now I have several questions:

Why is the first way so much slower?
What is the fastest way in numpy to sort an array by rows?
Why is the order of keys in the lexsort function reversed?

The last question was really the root of the problem for me with the lexsort function. I still cannot understand the idea behind such an order (the last key is the primary one); it seems confusing to me.

Thank you!!!

With kind regards, Kirill.

p.s.: One more thing: when I first tried to use lexsort, I got this strange exception:

    np.lexsort(arr, axis=1)

    ---
    AxisError                    Traceback (most recent call last)
    in ()
    ----> 1 np.lexsort(ls, axis=1)

    AxisError: axis 1 is out of bounds for array of dimension 1
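For reference, the two approaches in this message can be checked against each other with the sketch below (not from the thread). It makes two assumptions that differ from the original snippet: the structured dtype is built from the array's own dtype rather than from `int`, so the view is valid regardless of the platform's default integer size, and the in-place sort is done on a copy so that both results can be compared. It says nothing about relative speed.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, for reproducibility only
arr = rng.integers(-100, 100, size=(100, 3))

# Approach 1: view each row as one structured element and sort in place.
# Structured elements compare field by field: x first, then y, then z.
a1 = arr.copy()
dt = np.dtype([("x", a1.dtype), ("y", a1.dtype), ("z", a1.dtype)])
a1.view(dt).sort(axis=0)  # the view shares memory with a1

# Approach 2: lexsort with column 0 listed last (the primary key).
idx = np.lexsort([arr[:, 2], arr[:, 1], arr[:, 0]])
a2 = arr[idx]

print(np.array_equal(a1, a2))
```

Both approaches order the rows lexicographically by (column 0, column 1, column 2), so the sorted arrays should match even though they are produced differently.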