Re: [Numpy-discussion] Type annotations for NumPy
Hi!

2017-11-26 4:31 GMT+03:00 Juan Nunez-Iglesias:

> On 26 Nov 2017, 12:27 PM +1100, Nathaniel Smith wrote:
>
> It turns out that the PEP 484 type system is *mostly* not useful for
> this. They're really designed for checking consistency across a large
> code-base, not for enabling compiler speedups. For example, if you
> annotate something as an int, that means "this object is a subclass of
> int". This is enough to let mypy catch your mistake if you
> accidentally pass in a float instead, but it's not enough to tell you
> anything at all about the object's behavior -- you could make a wacky
> int subclass that acts like a string or something.

I have subscribed to many lists, although I am not an active participant in them. Nevertheless, the topic of using type annotations in these projects has been discussed several times on all the Cython-like channels (and it is becoming much more acute nowadays). "Misconceptions" arise among both ordinary users and developers, but I have never seen anyone write down clearly why applying type annotations in Cython (and similar projects) is impossible or unreasonable. Maybe someone close to the topic has the time and energy to write a brief summary of how to think about them and why they should be viewed as "orthogonal"?

Maybe I'm looking too superficially at this topic. But both Mypy and Cython perform type checking. From the Cython point of view I do not see any pitfalls: type checking and type conversions are what Cython already does during compilation (and it looks at types as strictly as necessary). From Mypy's point of view, it could perhaps delegate all of this, via some option, to a project-specific type checker (which can be much stricter in its assumptions).

With kind regards,
-gdg

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
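To make the "wacky int subclass" point from the quoted text concrete, here is a minimal sketch. The names `WackyInt` and `add_one` are made up for illustration. A static checker may well complain about the override itself, but nothing at runtime prevents it, which is one reason a compiler cannot treat an `int` annotation as a promise of machine-integer behavior.

```python
class WackyInt(int):
    """Hypothetical int subclass whose addition behaves like string concatenation."""

    def __add__(self, other):
        # Deliberately ignores integer arithmetic and builds a string instead.
        return f"{int(self)}{other}"


def add_one(x: int) -> int:
    # The annotation only promises "instance of int or a subclass of it".
    return x + 1


value = WackyInt(41)
is_an_int = isinstance(value, int)  # the subclass still counts as an int
result = add_one(value)             # dispatches to WackyInt.__add__
print(is_an_int, repr(result))
```

A compiler such as Cython needs a stronger guarantee (for example an exact C integer type) before it can emit a plain machine addition for `x + 1`.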
[Numpy-discussion] Sorting of an array row-by-row?
Hi,

I was trying to sort an (N, 3) array by rows, and first came up with this solution:

    N = 100
    arr = np.random.randint(-100, 100, size=(N, 3))
    dt = np.dtype([('x', int), ('y', int), ('z', int)])
    arr.view(dtype=dt).sort(axis=0)

Then I found another way, using the lexsort function:

    idx = np.lexsort([arr[:, 2], arr[:, 1], arr[:, 0]])
    arr = arr[idx]

This is 4 times faster than the previous solution. And now I have several questions:

Why is the first way so much slower?
What is the fastest way in numpy to sort an array by rows?
Why is the order of keys in the lexsort function reversed?

The last question was really the root of my problem with the lexsort function. I still cannot understand the idea behind such an order (the last key is the primary one); it seems confusing to me.

Thank you!!!

With kind regards,
Kirill.

p.s.: One more thing: when I first tried to use lexsort, I got this strange exception:

    np.lexsort(arr, axis=1)

    ---
    AxisError                Traceback (most recent call last)
    in ()
    ----> 1 np.lexsort(ls, axis=1)

    AxisError: axis 1 is out of bounds for array of dimension 1
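The two approaches from this message can be checked against each other with a small sketch like the one below. It assumes a fixed `np.int64` dtype and a seeded generator so the result is reproducible; the variable names `by_view`, `by_lexsort` and `reference` are illustrative only, and no timing claims are made here.

```python
import numpy as np

rng = np.random.default_rng(0)
arr = rng.integers(-100, 100, size=(100, 3), dtype=np.int64)

# Approach 1: view each row as one structured record and sort in place.
by_view = arr.copy()
dt = np.dtype([("x", np.int64), ("y", np.int64), ("z", np.int64)])
by_view.view(dt).sort(axis=0)  # records compare field by field: x, then y, then z

# Approach 2: lexsort returns an index array; the LAST key is the primary one.
idx = np.lexsort([arr[:, 2], arr[:, 1], arr[:, 0]])
by_lexsort = arr[idx]

# Reference: plain Python tuple ordering of the rows.
reference = np.array(sorted(map(tuple, arr.tolist())), dtype=np.int64)

print(np.array_equal(by_view, by_lexsort), np.array_equal(by_lexsort, reference))
```

Note that the first approach sorts in place through the view, while the second builds a new array from the index.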
Re: [Numpy-discussion] Sorting of an array row-by-row?
Thank you Josef, you gave me an idea, and now the fastest version (for big arrays) on my laptop is:

    np.lexsort(arr[:, ::-1].T)

For me the strangest thing is the order of the keys: what was the idea behind reading them right-to-left? How does this relate to lexicographic order?

2017-10-20 17:11 GMT+03:00 Joseph Fox-Rabinovitz <jfoxrabinov...@gmail.com>:

> There are two mistakes in your PS. The immediate error comes from the
> fact that lexsort accepts an iterable of 1D arrays, so when you pass
> in arr as the argument, it is treated as an iterable over the rows,
> each of which is 1D. 1D arrays do not have an axis=1. You actually
> want to iterate over the columns, so np.lexsort(a.T) is the correct
> phrasing of that. No idea about the speed difference.
>
> -Joe
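One way to read the key order (an interpretation, not a documented rationale) is that lexsort behaves like a chain of stable sorts applied key by key, where the sort applied last dominates. The sketch below checks that reading and also confirms that the compact form above matches the explicit list of columns. The seeded data and the names `keys`, `explicit`, `compact` and `chained` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
arr = rng.integers(-5, 5, size=(200, 3), dtype=np.int64)

# Explicit keys versus the reversed-and-transposed form from the message above.
explicit = np.lexsort([arr[:, 2], arr[:, 1], arr[:, 0]])
keys = arr[:, ::-1].T          # rows of `keys` are column 2, column 1, column 0
compact = np.lexsort(keys)

# Chain of stable sorts: sort by the first key, then re-sort by the next, etc.
# The key applied last ends up as the primary sort key.
chained = np.arange(arr.shape[0])
for key in keys:
    chained = chained[np.argsort(key[chained], kind="stable")]

# Passing the (N, 3) array directly treats each ROW as a key of length 3.
rows_as_keys = np.lexsort(arr)

print(np.array_equal(explicit, compact), np.array_equal(compact, chained))
print(rows_as_keys.shape)
```

The last line illustrates Joe's point: without the transpose, the result indexes the 3 columns rather than the N rows.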
Re: [Numpy-discussion] Syntax Improvement for Array Transpose
Only concerns #4 from Ilhan's list.

Wed, 26 Jun 2019 at 00:01, Ralf Gommers:

> []
>
> Perhaps not full consensus between the many people with different opinions
> and interests. But for the first one, arr.T change: it's clear that this
> won't happen.

To begin with, I must admit that I am not familiar with NumPy's accepted policy for introducing changes. But I find it quite unconstructive to say simply that it will not happen. What then is the point of the discussion?

> Between Juan's examples of valid use, and what Stephan and Matthew said,
> there's not much more to add. We're not going to change correct code for
> minor benefits.

I fully agree that any feature can find a use; whether that use is valid is another question. Juan did not present these examples, but I will allow myself to assume that what is being done there is more correctly described as a permutation, not a transpose. In addition, in the very next sentence Juan adds: "These could be easily changed to .transpose() (honestly they probably should!)"

> We're not going to change correct code for minor benefits.

That's fair. I personally have no preference in either case; the most important thing for me is that it works correctly in the 2d case. To be honest, until today I thought that `.T` would raise for `ndim > 2`. At least that's what my experience told me. For example:

Matlab - Error using .' Transpose on ND array is not defined. Use PERMUTE instead.
Julia - transpose not defined for Array(Float64, 3). Consider using permutedims for higher-dimensional arrays.
Sympy - raise ValueError("array rank not 2")

Here I agree with the authors that, to begin with, `transpose` is not the best name, since in general it does not fit any mathematical definition (of course, that depends on what we take as an element) or any definition from linear algebra. Thus the name `transpose` only leads to confusion.
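For reference, this is what NumPy currently does with `.T` on a 3-d array, as a small sketch (the array `a` and the variable names are illustrative): all axes are reversed, which is an axis permutation rather than a matrix transpose.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# .T does not raise for ndim > 2; it reverses the order of all axes.
full_reverse = a.T
explicit_perm = a.transpose(2, 1, 0)

# In the 2-d case this is the usual matrix transpose.
m = np.arange(6).reshape(2, 3)

# Swapping only the last two axes, the alternative meaning discussed in this thread.
last_two = np.swapaxes(a, -1, -2)

print(full_reverse.shape, m.T.shape, last_two.shape)
```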
As a note on another suggestion, `.T` meaning a transpose of the last two dimensions: in Mathematica the authors for some reason did the opposite (personally, I could not understand why they made such a choice :) ):

Transpose[list] transposes the first two levels in list.

> I feel strongly that we should have the following policy:
>
> * Under no circumstances should we make changes that mean that correct
> old code will give different results with new Numpy.

I find these rules overly strict; they do not allow NumPy to evolve. I completely agree that a silent change in behavior is a disaster, and that changing behavior (if it is not an error) within the same minor version (1.X.Y) is not acceptable, but I see no reason to extend this rule to a major version bump (2.A.B), especially if it allows something to improve. I would see a rough roadmap of the change like the one below (I foresee my loneliness in this :)). Also considering this comment

> Personally I would find any divergence between a.T and a.transpose()
> to be rather surprising.

it will be as follows:

1. In 1.18, add the `.permute` method to the array, with the same semantics as `.transpose`.
2. Starting from 1.18, emit a `FutureWarning` or `DeprecationWarning` for `.transpose` and advise replacing it with `.permute`.
3. Starting from 1.18, for `.T` with `ndim > 2`, emit a `FutureWarning` with a note that the behavior will change in future versions.
4. In version 2, remove `.transpose` and change the behavior of `.T`.

Regarding `.T` with `ndim > 2`, I have no preference between raising an error and transposing the last two dimensions.

With kind regards,
-gdg
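Steps 1 to 3 of the roadmap above could be prototyped outside NumPy with an ndarray subclass, roughly as follows. This is only a sketch: the class name `RoadmapArray`, the warning messages, and the choice of `DeprecationWarning` for `.transpose` are assumptions, and `.permute` is the method proposed in this message, not an existing NumPy API.

```python
import warnings

import numpy as np


class RoadmapArray(np.ndarray):
    """Hypothetical subclass imitating the proposed transition period."""

    def permute(self, *axes):
        # Step 1: new name with the same semantics as ndarray.transpose.
        return np.ndarray.transpose(self, *axes)

    def transpose(self, *axes):
        # Step 2: the old name keeps working but warns.
        warnings.warn("use .permute instead of .transpose",
                      DeprecationWarning, stacklevel=2)
        return np.ndarray.transpose(self, *axes)

    @property
    def T(self):
        # Step 3: warn only when the result would change in the future.
        if self.ndim > 2:
            warnings.warn(".T for ndim > 2 will change behavior",
                          FutureWarning, stacklevel=2)
        return np.ndarray.transpose(self)


a = np.arange(24).reshape(2, 3, 4).view(RoadmapArray)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    permuted = a.permute(2, 0, 1)       # no warning
    transposed = a.transpose(2, 0, 1)   # DeprecationWarning
    reversed_axes = a.T                 # FutureWarning, since ndim == 3

categories = [w.category for w in caught]
print(permuted.shape, [c.__name__ for c in categories])
```

During such a period the results are unchanged and only warnings are added; step 4 is the only point where behavior would actually differ.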