Re: [Numpy-discussion] Type annotations for NumPy

2017-11-26 Thread Kirill Balunov
Hi!

2017-11-26 4:31 GMT+03:00 Juan Nunez-Iglesias:

>
> On 26 Nov 2017, 12:27 PM +1100, Nathaniel Smith wrote:
>
> It turns out that the PEP 484 type system is *mostly* not useful for
> this. They're really designed for checking consistency across a large
> code-base, not for enabling compiler speedups. For example, if you
> annotate something as an int, that means "this object is a subclass of
> int". This is enough to let mypy catch your mistake if you
> accidentally pass in a float instead, but it's not enough to tell you
> anything at all about the object's behavior -- you could make a wacky
> int subclass that acts like a string or something.
>
>
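A toy illustration of that last point (a hypothetical `WackyInt` class,
purely a sketch): an `int` subclass satisfies an `int` annotation at the
call site, yet nothing stops it from behaving like a string at runtime.

class WackyInt(int):
    # Addition returns a string instead of a number.
    def __add__(self, other):
        return "wat" * int(other)

def add_one(x: int) -> int:
    return x + 1

# The argument is a genuine subclass of int, so the `x: int` annotation is
# satisfied, but the runtime result is a string, not an int:
print(add_one(WackyInt(3)))   # -> 'wat'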
I am subscribed to many lists, although I am not an active participant in
them. Nevertheless, the topic of using type annotations in these projects
has been discussed several times on all the Cython-like channels (and it
has become much more pressing nowadays). "Misconceptions" arise both for
ordinary users and for developers, but I have never seen anyone write
clearly why applying type annotations in Cython (and similar projects) is
impossible or unreasonable. Maybe someone close to the topic has the time
and energy to sum up and write a brief summary of how to perceive them and
why they should be viewed as "orthogonal"?

Maybe I'm looking at this topic too superficially. But both Mypy and Cython
perform type checking. From the Cython point of view I do not see any
pitfalls: type checking and type conversions are what Cython is doing right
now during compilation (and it looks at types as strictly as necessary).
From Mypy's point of view, it may be possible to delegate all this work,
via a certain option, to a project-specific type checker (which can be much
stricter in its assumptions).
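As a small sketch of what I mean (plain CPython, a made-up function, no
Cython or Mypy involved): the annotations are only stored as metadata and
CPython never enforces or uses them, which is exactly the gap a compiler
like Cython has to fill by its own means.

def scale(x: int, factor: int) -> int:
    return x * factor

# CPython merely records the annotations; it neither checks nor exploits them:
print(scale.__annotations__)   # the types are just stored in this dict
print(scale("ab", 3))          # 'ababab' -- Mypy would flag this call, CPython runs it happily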

With kind regards, -gdg
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Sorting of an array row-by-row?

2017-10-20 Thread Kirill Balunov
Hi,

I was trying to sort an (N, 3) array by rows, and first came up with this
solution:

import numpy as np

N = 100
arr = np.random.randint(-100, 100, size=(N, 3))
dt = np.dtype([('x', int), ('y', int), ('z', int)])

arr.view(dtype=dt).sort(axis=0)
Then I found another way using the lexsort function:

idx = np.lexsort([arr[:, 2], arr[:, 1], arr[:, 0]])
arr = arr[idx]
This turned out to be 4 times faster than the previous solution. Now I have
several questions:

Why is the first way so much slower?
What is the fastest way in numpy to sort an array by rows?
Why is the order of keys in lexsort function reversed?

The last question was really the root of the problem for me with the
lexsort function. I still cannot understand the idea behind such an order
(the last key is the primary one); it seems confusing to me.
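For concreteness, a tiny sketch of the behavior I mean (toy keys, made up
just for illustration):

import numpy as np

first = np.array([1, 1, 0])    # the key I want to be primary
second = np.array([2, 0, 1])   # the key I want to break ties with

# lexsort treats the LAST key in the sequence as the primary key,
# so the intended primary key has to go last:
idx = np.lexsort((second, first))
print(idx)   # [2 1 0] -- sorted by `first`, ties broken by `second`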

Thank you!!! With kind regards, Kirill.

p.s.: One more thing: when I first tried to use lexsort, I caught this
strange exception:

np.lexsort(arr, axis=1)

---------------------------------------------------------------------------
AxisError                                 Traceback (most recent call last)
<ipython-input> in <module>()
----> 1 np.lexsort(ls, axis=1)

AxisError: axis 1 is out of bounds for array of dimension 1


Re: [Numpy-discussion] Sorting of an array row-by-row?

2017-10-20 Thread Kirill Balunov
Thank you, Josef; you gave me an idea, and now the fastest version (for big
arrays) on my laptop is:

np.lexsort(arr[:, ::-1].T)

For me the strangest thing is the order of the keys: what was the idea
behind keeping them right-to-left? How does this relate to lexicographic
order?
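Just to convince myself that the trick is equivalent to my original
spelling, a quick sanity check (a sketch; the array size is arbitrary):

import numpy as np

arr = np.random.randint(-100, 100, size=(1000, 3))

# arr[:, ::-1].T hands lexsort the columns in reverse order (z, y, x),
# which matches lexsort's "primary key last" convention:
idx1 = np.lexsort(arr[:, ::-1].T)
idx2 = np.lexsort([arr[:, 2], arr[:, 1], arr[:, 0]])
assert (idx1 == idx2).all()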

2017-10-20 17:11 GMT+03:00 Joseph Fox-Rabinovitz <jfoxrabinov...@gmail.com>:

> There are two mistakes in your PS. The immediate error comes from the
> fact that lexsort accepts an iterable of 1D arrays, so when you pass
> in arr as the argument, it is treated as an iterable over the rows,
> each of which is 1D. 1D arrays do not have an axis=1. You actually
> want to iterate over the columns, so np.lexsort(a.T) is the correct
> phrasing of that. No idea about the speed difference.
>
>-Joe
>
> On Fri, Oct 20, 2017 at 6:00 AM, Kirill Balunov <kirillbalu...@gmail.com>
> wrote:
> > Hi,
> >
> > I was trying to sort an array (N, 3) by rows, and firstly come with this
> > solution:
> >
> > N = 100
> > arr = np.random.randint(-100, 100, size=(N, 3))
> > dt = np.dtype([('x', int),('y', int),('z', int)])
> >
> > arr.view(dtype=dt).sort(axis=0)
> >
> > Then I found another way using lexsort function:
> >
> > idx = np.lexsort([arr[:, 2], arr[:, 1], arr[:, 0]])
> > arr = arr[idx]
> >
> > Which is 4 times faster than the previous solution. And now i have
> several
> > questions:
> >
> > Why is the first way so much slower?
> > What is the fastest way in numpy to sort array by rows?
> > Why is the order of keys in lexsort function reversed?
> >
> > The last question  was really the root of the problem for me with the
> > lexsort function.
> > And I still can not understand the idea of such an order (the last is the
> > primary), it seems to me confusing.
> >
> > Thank you!!! With kind regards, Kirill.
> >
> > p.s.: One more thing, when i first try to use lexsort. I catch this
> strange
> > exception:
> >
> > np.lexsort(arr, axis=1)
> >
> > ---------------------------------------------------------------------------
> > AxisError                                 Traceback (most recent call last)
> > <ipython-input> in <module>()
> > ----> 1 np.lexsort(ls, axis=1)
> >
> > AxisError: axis 1 is out of bounds for array of dimension 1
> >
> >
> >
> >


Re: [Numpy-discussion] Syntax Improvement for Array Transpose

2019-06-26 Thread Kirill Balunov
This only concerns #4 from Ilhan's list.

Wed, 26 Jun 2019 at 00:01, Ralf Gommers:

>
> []
>
> Perhaps not full consensus between the many people with different opinions
> and interests. But for the first one, arr.T change: it's clear that this
> won't happen.
>

To begin with, I must admit that I am not familiar with the accepted policy
for introducing changes to NumPy. But I find it rather unconstructive simply
to say "it will not happen". What, then, is the point of the discussion?


> Between Juan's examples of valid use, and what Stephan and Matthew said,
> there's not much more to add. We're not going to change correct code for
> minor benefits.
>

I fully agree that any feature can find its use; whether that use is valid
or not is another question. Juan did not present these examples, but I will
allow myself to assume that what is being done there is more correctly
described as a permutation, not a transpose. In addition, in the very next
sentence Juan adds that "These could be easily changed to .transpose()
(honestly they probably should!)".
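A small sketch of the distinction I mean (the array shape is made up): with
an explicit axes argument, `.transpose()` is really an arbitrary axis
permutation; the linear-algebra transpose is only the 2-D special case.

import numpy as np

a = np.ones((2, 3, 4))

# An arbitrary permutation of the axes, not a matrix transpose in the
# linear-algebra sense:
print(a.transpose(2, 0, 1).shape)          # (4, 2, 3)
print(np.ones((2, 3)).transpose().shape)   # (3, 2) -- the 2-D special case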

We're not going to change correct code for minor benefits.
>

That's fair. Personally I have no preference either way; the most important
thing for me is that the 2-D case works correctly. To be honest, until today
I thought that `.T` would raise for `ndim > 2`. At least that's what my
experience told me. For example:

Matlab - Error using  .' Transpose on ND array is not defined. Use
PERMUTE instead.

Julia - transpose not defined for Array(Float64, 3). Consider using
permutedims for higher-dimensional arrays.

Sympy - raise ValueError("array rank not 2")

Here I agree with the authors that, to begin with, `transpose` is not the
best name, since in general it does not fit any mathematical definition (of
course, it depends on what we take as an element) or any definition from
linear algebra. Thus the name `transpose` only leads to confusion.

As a note about another suggestion - `.T` meaning a transpose of the last
two dimensions - in Mathematica the authors for some reason did the opposite
(personally, I could not understand why they made such a choice :) ):

Transpose[list]
transposes the first two levels in list.

I feel strongly that we should have the following policy:
>
> * Under no circumstances should we make changes that mean that correct
> old code will give different results with new Numpy.
>

I find this an overly strict rule that does not allow the project to
evolve. I completely agree that a silent change in behavior is a disaster,
and that changing behavior (if it is not a bug) within the same minor
version (1.X.Y) is not acceptable, but I see no reason to extend this rule
to a major version bump (2.A.B), especially if it allows something to be
improved.

I can see a rough version of a roadmap for such a change (I foresee my
loneliness in this :)). Also considering this comment,

Personally I would find any divergence between a.T and a.transpose()
> to be rather surprising.
>

it would be as follows:

1. In 1.18, add a `.permute` method to the array, with the same semantics
as `.transpose`.
2. Starting from 1.18, emit a `FutureWarning` / `DeprecationWarning` for
`.transpose` and advise replacing it with `.permute`.
3. Starting from 1.18, for `.T` with `ndim > 2`, emit a `FutureWarning` with
a note that in future versions the behavior will change.
4. In version 2, remove `.transpose` and change the behavior of `.T`.

Regarding `.T` with `ndim > 2`, I have no preference between raising an
error and transposing the last two dimensions.
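For illustration only, a sketch of the two candidate behaviors for
`ndim > 2`, spelled with today's API (the array shape is just an example):

import numpy as np

a = np.ones((2, 3, 4))

print(a.T.shape)                     # (4, 3, 2) -- current .T: reverse all axes
print(np.swapaxes(a, -2, -1).shape)  # (2, 4, 3) -- "transpose of the last two dimensions"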

with kind regards,
-gdg