Re: [Numpy-discussion] Sorting of an array row-by-row?

2017-10-20 Thread Charles R Harris
On Fri, Oct 20, 2017 at 1:40 PM, Joseph Fox-Rabinovitz <
jfoxrabinov...@gmail.com> wrote:

> I do not think that there is any particular relationship between the
> order of the keys and lexicographic order. The key order is just a
> convention, which is clearly documented. I agree that it is a bit
> counter-intuitive for anyone that has used excel or MATLAB, but it is
> ingrained in the API at this point.
>

When I wrote lexsort for numarray, together with the typed sorting
routines, I went back and forth on the key order, but finally decided that
the simplest thing would be to leave them in the same order as the sorts.
That requires a bit of knowledge as to what the effect of that is, but if
one remembers that the last sort dominates, it isn't too bad.
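
The "last sort dominates" rule can be checked with a small sketch. The two key arrays below are made up for illustration; the sketch only assumes that lexsort behaves like a sequence of stable sorts applied in key order.

```python
import numpy as np

# Two hypothetical keys; the primary key has ties that the secondary key breaks.
secondary = np.array([3, 1, 2, 1, 3, 2])
primary = np.array([1, 0, 1, 0, 0, 1])

# lexsort takes the keys in the order the sorts are applied:
# the last key is sorted last, so it dominates.
idx_lexsort = np.lexsort([secondary, primary])

# The same permutation built by hand from two stable sorts.
idx = np.argsort(secondary, kind="stable")          # first sort: minor key
idx = idx[np.argsort(primary[idx], kind="stable")]  # last sort: major key

assert np.array_equal(idx_lexsort, idx)
print(idx_lexsort)  # [1 3 4 2 5 0]
```

Read this way, the key order is simply the order in which one would chain the individual sorts by hand.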



Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Sorting of an array row-by-row?

2017-10-20 Thread Joseph Fox-Rabinovitz
I do not think that there is any particular relationship between the
order of the keys and lexicographic order. The key order is just a
convention, which is clearly documented. I agree that it is a bit
counter-intuitive for anyone that has used excel or MATLAB, but it is
ingrained in the API at this point.

-Joe

On Fri, Oct 20, 2017 at 3:03 PM, Kirill Balunov  wrote:
> Thank you Josef, you gave me an idea, and now the fastest version (for big
> arrays) on my laptop is:
>
> np.lexsort(arr[:, ::-1].T)
>
> For me the strangest thing is the order of the keys: what was the idea behind
> keeping them right-to-left? How does this relate to lexicographic order?
>


Re: [Numpy-discussion] Sorting of an array row-by-row?

2017-10-20 Thread Kirill Balunov
Thank you Josef, you gave me an idea, and now the fastest version (for big
arrays) on my laptop is:

np.lexsort(arr[:, ::-1].T)
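
As a quick check of the one-liner above (a sketch only; the seed and array size are arbitrary choices, not from the original timing), reversing the columns and transposing should give the same keys as listing the columns from last to first:

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed, chosen only for reproducibility
arr = rng.randint(-100, 100, size=(1000, 3))

# Keys listed explicitly: column 0 is the primary key, so it goes last.
idx_explicit = np.lexsort([arr[:, 2], arr[:, 1], arr[:, 0]])

# Reversing the columns and transposing yields the same keys in the same order.
idx_compact = np.lexsort(arr[:, ::-1].T)

assert np.array_equal(idx_explicit, idx_compact)

# The reordered rows are in lexicographic (row-by-row) order.
sorted_rows = arr[idx_compact]
as_tuples = [tuple(row) for row in sorted_rows.tolist()]
assert as_tuples == sorted(as_tuples)
```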

For me the strangest thing is the order of the keys: what was the idea behind
keeping them right-to-left? How does this relate to lexicographic order?

2017-10-20 17:11 GMT+03:00 Joseph Fox-Rabinovitz :

> There are two mistakes in your PS. The immediate error comes from the
> fact that lexsort accepts an iterable of 1D arrays, so when you pass
> in arr as the argument, it is treated as an iterable over the rows,
> each of which is 1D. 1D arrays do not have an axis=1. You actually
> want to iterate over the columns, so np.lexsort(a.T) is the correct
> phrasing of that. No idea about the speed difference.
>
>-Joe
>


Re: [Numpy-discussion] Sorting of an array row-by-row?

2017-10-20 Thread Joseph Fox-Rabinovitz
There are two mistakes in your PS. The immediate error comes from the
fact that lexsort accepts an iterable of 1D arrays, so when you pass
in arr as the argument, it is treated as an iterable over the rows,
each of which is 1D. 1D arrays do not have an axis=1. The second mistake is
that you actually want to iterate over the columns, so np.lexsort(arr.T) is
the correct phrasing of that. No idea about the speed difference.
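
A small sketch of the difference, assuming an (N, 3) integer array like the one in the original post (the seed is arbitrary): passing the array directly makes every row a key, while passing the transpose makes every column a key.

```python
import numpy as np

N = 100
rng = np.random.RandomState(0)  # fixed seed, only for reproducibility
arr = rng.randint(-100, 100, size=(N, 3))

# Passing arr directly: each of the N rows is one key of length 3,
# so the result indexes 3 items, which is not what was intended.
idx_rows_as_keys = np.lexsort(arr)
print(idx_rows_as_keys.shape)  # (3,)

# Passing arr.T: each of the 3 columns is one key of length N,
# so the result is a permutation of the N rows.
idx_cols_as_keys = np.lexsort(arr.T)
print(idx_cols_as_keys.shape)  # (100,)

# With arr.T the last column, arr[:, 2], is the primary key.
primary = arr[idx_cols_as_keys, 2]
assert np.all(primary[:-1] <= primary[1:])
```

Note that np.lexsort(arr.T) sorts by the last column first; to make column 0 the primary key, the columns have to be reversed before transposing.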

   -Joe

On Fri, Oct 20, 2017 at 6:00 AM, Kirill Balunov  wrote:
> Hi,
>
> I was trying to sort an array (N, 3) by rows, and firstly come with this
> solution:
>
> N = 100
> arr = np.random.randint(-100, 100, size=(N, 3))
> dt = np.dtype([('x', int),('y', int),('z', int)])
>
> arr.view(dtype=dt).sort(axis=0)
>
> Then I found another way using lexsort function:
>
> idx = np.lexsort([arr[:, 2], arr[:, 1], arr[:, 0]])
> arr = arr[idx]
>
> Which is 4 times faster than the previous solution. And now i have several
> questions:
>
> Why is the first way so much slower?
> What is the fastest way in numpy to sort array by rows?
> Why is the order of keys in lexsort function reversed?
>
> The last question  was really the root of the problem for me with the
> lexsort function.
> And I still can not understand the idea of such an order (the last is the
> primary), it seems to me confusing.
>
> Thank you!!! With kind regards, Kirill.
>
> p.s.: One more thing, when i first try to use lexsort. I catch this strange
> exception:
>
> np.lexsort(arr, axis=1)
>
> ---
> AxisError Traceback (most recent call last)
>  in ()
> > 1 np.lexsort(ls, axis=1)
>
> AxisError: axis 1 is out of bounds for array of dimension 1
>
>
>
>


[Numpy-discussion] Sorting of an array row-by-row?

2017-10-20 Thread Kirill Balunov
Hi,

I was trying to sort an (N, 3) array by rows, and first came up with this
solution:

import numpy as np

N = 100
arr = np.random.randint(-100, 100, size=(N, 3))
dt = np.dtype([('x', int),('y', int),('z', int)])

arr.view(dtype=dt).sort(axis=0)

Then I found another way, using the lexsort function:

idx = np.lexsort([arr[:, 2], arr[:, 1], arr[:, 0]])
arr = arr[idx]

This is 4 times faster than the previous solution. Now I have several
questions:

Why is the first way so much slower?
What is the fastest way in numpy to sort array by rows?
Why is the order of keys in lexsort function reversed?

The last question was really the root of my problem with the lexsort
function. I still cannot understand the idea behind such an order (the last
key is the primary one); it seems confusing to me.
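
For reference, a sketch that checks the two approaches described in this message produce the same row order. It makes no claim about their relative speed. Building the structured dtype from arr.dtype, rather than from int, is a choice made here so that the view's item size always matches the array; the seed is arbitrary.

```python
import numpy as np

N = 100
rng = np.random.RandomState(0)  # fixed seed, only for reproducibility
arr = rng.randint(-100, 100, size=(N, 3))

# Approach 1: view each row as one structured element and sort in place.
# The field types are taken from arr.dtype so the view is always valid.
dt = np.dtype([('x', arr.dtype), ('y', arr.dtype), ('z', arr.dtype)])
by_view = arr.copy()              # sort a copy so arr stays untouched
by_view.view(dtype=dt).sort(axis=0)

# Approach 2: lexsort with column 0 as the primary (last) key.
idx = np.lexsort([arr[:, 2], arr[:, 1], arr[:, 0]])
by_lexsort = arr[idx]

assert np.array_equal(by_view, by_lexsort)
```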

Thank you!!! With kind regards, Kirill.

p.s.: One more thing: when I first tried to use lexsort, I got this
strange exception:

np.lexsort(arr, axis=1)

---
AxisError Traceback (most recent call last)
 in ()
> 1 np.lexsort(ls, axis=1)

AxisError: axis 1 is out of bounds for array of dimension 1