Re: [Numpy-discussion] argsort speed

2014-02-17 Thread Francesc Alted
On 2/17/14, 1:08 AM, josef.p...@gmail.com wrote: On Sun, Feb 16, 2014 at 6:12 PM, Daπid davidmen...@gmail.com wrote: On 16 February 2014 23:43, josef.p...@gmail.com wrote: What's the fastest argsort for a 1d array with around 28 Million elements, roughly uniformly distributed, random order?
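
A minimal timing sketch of the comparison being asked about, assuming the standard NumPy sort kinds and an array matching the description in the question (about 28 million uniform random values); the numbers of course depend on the machine:

    import numpy as np
    from timeit import timeit

    # ~28 million uniformly distributed values in random order,
    # matching the setup described in the question.
    rng = np.random.RandomState(0)
    a = rng.uniform(size=28 * 10**6)

    for kind in ("quicksort", "mergesort", "heapsort"):
        t = timeit(lambda: np.argsort(a, kind=kind), number=1)
        print("argsort kind=%-9s %.2f s" % (kind, t))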

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Dave Hirschfeld
alex argriffi at ncsu.edu writes: Hello list, Here's another idea resurrection from numpy github comments that I've been advised could be posted here for re-discussion. The proposal would be to make np.linalg.svd more like scipy.linalg.svd with respect to input checking. The argument

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Jason Grout
On 2/15/14 3:37 PM, alex wrote: The proposal would be to make np.linalg.svd more like scipy.linalg.svd with respect to input checking. The argument against the change is raw speed; if you know that you will never feed non-finite input to svd, then np.linalg.svd is a bit faster than
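
For reference, the difference under discussion boils down to something like the following sketch. The check_finite argument of scipy.linalg.svd is real; checked_svd is a hypothetical illustration of what a numpy-side check would amount to, not the actual proposed patch:

    import numpy as np
    import scipy.linalg

    a = np.random.rand(500, 500)

    # scipy.linalg.svd validates its input by default; the check can be
    # disabled when the caller guarantees finite data.
    u, s, vt = scipy.linalg.svd(a, check_finite=True)

    # np.linalg.svd skips the check for speed.  The proposal is roughly
    # equivalent to adding a guard like this (illustration only):
    def checked_svd(a):
        if not np.isfinite(a).all():
            raise np.linalg.LinAlgError("array must not contain infs or NaNs")
        return np.linalg.svd(a)

    u, s, vt = checked_svd(a)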

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
Dave Hirschfeld novi...@gmail.com wrote: It certainly shouldn't crash or hang though and for me at least it doesn't - it returns NaN which immediately suggests to me that I've got bad input (maybe just because I've seen it before). It might be dependent on the BLAS or LAPACK version. Since

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
Jason Grout jason-s...@creativetrax.com wrote: For what my vote is worth, -1. I thought this was pretty much the designed difference between the scipy and numpy linalg routines. Scipy does the checking, and numpy provides the raw speed. Maybe this is better resolved as a note in the

Re: [Numpy-discussion] argsort speed

2014-02-17 Thread josef . pktd
On Mon, Feb 17, 2014 at 9:18 AM, Francesc Alted franc...@continuum.io wrote: On 2/17/14, 1:08 AM, josef.p...@gmail.com wrote: On Sun, Feb 16, 2014 at 6:12 PM, Daπid davidmen...@gmail.com wrote: On 16 February 2014 23:43, josef.p...@gmail.com wrote: What's the fastest argsort for a 1d array

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread alex
On Mon, Feb 17, 2014 at 4:49 AM, Dave Hirschfeld novi...@gmail.com wrote: alex argriffi at ncsu.edu writes: Hello list, Here's another idea resurrection from numpy github comments that I've been advised could be posted here for re-discussion. The proposal would be to make np.linalg.svd

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread josef . pktd
On Mon, Feb 17, 2014 at 10:03 AM, alex argri...@ncsu.edu wrote: On Mon, Feb 17, 2014 at 4:49 AM, Dave Hirschfeld novi...@gmail.com wrote: alex argriffi at ncsu.edu writes: Hello list, Here's another idea resurrection from numpy github comments that I've been advised could be posted here

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
josef.p...@gmail.com wrote: I use the official numpy release for development, Windows, 32bit python, i.e. MinGW 3.5 and whatever old ATLAS the release includes. a constant 13% cpu usage is 1/8th of my 8 virtual cores. Based on this and Alex's message it seems the offender is the f2c generated

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Dave Hirschfeld
Sturla Molden sturla.molden at gmail.com writes: josef.pktd at gmail.com wrote: I use the official numpy release for development, Windows, 32bit python, i.e. MinGW 3.5 and whatever old ATLAS the release includes. a constant 13% cpu usage is 1/8th of my 8 virtual cores. Based on this

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
Dave Hirschfeld novi...@gmail.com wrote: Even if lapack_lite always performed the isfinite check and threw a python error if False, it would be much better than either hanging or segfaulting and people who care about the isfinite cost probably would be linking to a fast lapack anyway.

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
Sturla Molden sturla.mol...@gmail.com wrote: Dave Hirschfeld novi...@gmail.com wrote: Even if lapack_lite always performed the isfinite check and threw a python error if False, it would be much better than either hanging or segfaulting and people who care about the isfinite cost probably

Re: [Numpy-discussion] argsort speed

2014-02-17 Thread Julian Taylor
On 17.02.2014 15:18, Francesc Alted wrote: On 2/17/14, 1:08 AM, josef.p...@gmail.com wrote: On Sun, Feb 16, 2014 at 6:12 PM, Daπid davidmen...@gmail.com wrote: On 16 February 2014 23:43, josef.p...@gmail.com wrote: What's the fastest argsort for a 1d array with around 28 Million elements,

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
josef.p...@gmail.com wrote: maybe -1. statsmodels is using np.linalg.pinv, which uses svd. I never heard of any crash (*), and the only time I compared with scipy I didn't like the slowdown. If you did care about speed in least-squares fitting you would not call QR or SVD directly, but use

Re: [Numpy-discussion] argsort speed

2014-02-17 Thread Charles R Harris
On Mon, Feb 17, 2014 at 11:32 AM, Julian Taylor jtaylor.deb...@googlemail.com wrote: On 17.02.2014 15:18, Francesc Alted wrote: On 2/17/14, 1:08 AM, josef.p...@gmail.com wrote: On Sun, Feb 16, 2014 at 6:12 PM, Daπid davidmen...@gmail.com wrote: On 16 February 2014 23:43,

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
Sturla Molden sturla.mol...@gmail.com wrote: josef.p...@gmail.com wrote: maybe -1. statsmodels is using np.linalg.pinv, which uses svd. I never heard of any crash (*), and the only time I compared with scipy I didn't like the slowdown. If you did care about speed in least-squares

[Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Julian Taylor
hi, I noticed that during some simplistic benchmarks (e.g. https://github.com/numpy/numpy/issues/4310) a lot of time is spent in the kernel zeroing pages. This is because under linux glibc will always allocate large memory blocks with mmap. As these pages can come from other processes the kernel
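
The actual proposal lives at the C level inside numpy's allocator; a toy Python sketch of the mechanism (a small free list keyed by block size, with names made up here) just to illustrate the idea:

    import collections

    class BlockCache(object):
        # Toy sketch of a size-keyed allocation cache (illustrative only;
        # the real change would be in numpy's C allocation routines).

        def __init__(self, max_blocks=10):
            self.max_blocks = max_blocks
            self.free = collections.defaultdict(list)  # size -> [blocks]

        def allocate(self, size):
            blocks = self.free.get(size)
            if blocks:
                return blocks.pop()   # reuse: no fresh pages to zero
            return bytearray(size)    # fall back to a new allocation

        def release(self, block):
            blocks = self.free[len(block)]
            if len(blocks) < self.max_blocks:
                blocks.append(block)  # keep it around for the next array
            # otherwise drop it and let it be freed normally

    cache = BlockCache()
    buf = cache.allocate(8 * 1024 * 1024)
    cache.release(buf)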

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Sturla Molden
Julian Taylor jtaylor.deb...@googlemail.com wrote: When an array is created it tries to get its memory from the cache and when its deallocated it returns it to the cache. Good idea, however there is already a C function that does this. It uses a heap to keep the cached memory blocks sorted

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Julian Taylor
On 17.02.2014 21:16, Sturla Molden wrote: Julian Taylor jtaylor.deb...@googlemail.com wrote: When an array is created it tries to get its memory from the cache and when its deallocated it returns it to the cache. Good idea, however there is already a C function that does this. It uses a

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Nathaniel Smith
On 17 Feb 2014 15:17, Sturla Molden sturla.mol...@gmail.com wrote: Julian Taylor jtaylor.deb...@googlemail.com wrote: When an array is created it tries to get its memory from the cache and when its deallocated it returns it to the cache. Good idea, however there is already a C function

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Stefan Seefeld
On 02/17/2014 03:42 PM, Nathaniel Smith wrote: Another optimization we should consider that might help a lot in the same situations where this would help: for code called from the cpython eval loop, it's afaict possible to determine which inputs are temporaries by checking their refcnt. In the
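
The refcount trick Nathaniel describes would be done in C by inspecting Py_REFCNT on the operands; a rough Python illustration of the idea follows. The exact count depends on how the interpreter passes arguments, so the threshold is a heuristic for illustration only:

    import sys
    import numpy as np

    def add_maybe_inplace(a, b):
        # If `a` is a temporary (no references beyond this call), its
        # buffer can be reused for the result.  The real check would be
        # on Py_REFCNT in C; the threshold here is purely illustrative.
        if sys.getrefcount(a) <= 3:
            np.add(a, b, out=a)
            return a
        return a + b

    x = np.ones(5)
    y = np.ones(5)
    z = add_maybe_inplace(x * 2, y)   # x * 2 is a temporary here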

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Sturla Molden
Nathaniel Smith n...@pobox.com wrote: Also, I'd be pretty wary of caching large chunks of unused memory. People already have a lot of trouble understanding their program's memory usage, and getting rid of 'greedy free' will make this even worse. A cache would only be needed when there is a lot

[Numpy-discussion] Proposal: Chaining np.dot with mdot helper function

2014-02-17 Thread Stefan Otte
Hey guys, I wrote myself a little helper function `mdot` which chains np.dot for multiple arrays. So I can write mdot(A, B, C, D, E) instead of these: A.dot(B).dot(C).dot(D).dot(E) or np.dot(np.dot(np.dot(np.dot(A, B), C), D), E). I know you can use `numpy.matrix` to get nicer
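
A minimal version of such a helper, using a plain left-to-right fold (Stefan's actual implementation may differ, and this makes no attempt to pick an optimal multiplication order):

    from functools import reduce
    import numpy as np

    def mdot(*arrays):
        # Chain np.dot over any number of arrays:
        # mdot(A, B, C) == A.dot(B).dot(C)
        return reduce(np.dot, arrays)

    A, B, C, D, E = (np.random.rand(3, 3) for _ in range(5))
    assert np.allclose(mdot(A, B, C, D, E),
                       A.dot(B).dot(C).dot(D).dot(E))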

Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function

2014-02-17 Thread josef . pktd
On Mon, Feb 17, 2014 at 4:39 PM, Stefan Otte stefan.o...@gmail.com wrote: Hey guys, I wrote myself a little helper function `mdot` which chains np.dot for multiple arrays. So I can write mdot(A, B, C, D, E) instead of these A.dot(B).dot(C).dot(D).dot(E)

Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function

2014-02-17 Thread Jaime Fernández del Río
Perhaps you could reuse np.dot, by giving its second argument a default None value, and passing a tuple as first argument, i.e. np.dot((a, b, c)) would compute a.dot(b).dot(c), possibly not in that order. As is suggested in the matlab thread linked by Josef, if you do implement an optimal
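
Choosing the evaluation order is the classic matrix-chain problem; a textbook dynamic-programming sketch of it is below (not anything in numpy, and an optimized mdot might well use a different strategy):

    import numpy as np

    def chain_order(dims):
        # dims[i], dims[i+1] is the shape of matrix i; returns the split
        # table that parenthesizes the product with the fewest scalar
        # multiplications.
        n = len(dims) - 1
        cost = [[0] * n for _ in range(n)]
        split = [[0] * n for _ in range(n)]
        for length in range(2, n + 1):
            for i in range(n - length + 1):
                j = i + length - 1
                cost[i][j] = float("inf")
                for k in range(i, j):
                    q = (cost[i][k] + cost[k + 1][j]
                         + dims[i] * dims[k + 1] * dims[j + 1])
                    if q < cost[i][j]:
                        cost[i][j], split[i][j] = q, k
        return split

    def chain_dot(mats, split, i, j):
        # Multiply the sub-chain mats[i..j] in the order the table dictates.
        if i == j:
            return mats[i]
        k = split[i][j]
        return np.dot(chain_dot(mats, split, i, k),
                      chain_dot(mats, split, k + 1, j))

    mats = [np.random.rand(10, 100), np.random.rand(100, 5), np.random.rand(5, 50)]
    dims = [m.shape[0] for m in mats] + [mats[-1].shape[1]]
    result = chain_dot(mats, chain_order(dims), 0, len(mats) - 1)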

Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function

2014-02-17 Thread Eelco Hoogendoorn
considering np.dot takes only its binary positional args and a single defaulted kwarg, passing in a variable number of positional args as a list makes sense. Then just call the builtin reduce on the list, and there you go. I also generally approve of such semantics for binary associative

Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function

2014-02-17 Thread josef . pktd
On Mon, Feb 17, 2014 at 4:57 PM, josef.p...@gmail.com wrote: On Mon, Feb 17, 2014 at 4:39 PM, Stefan Otte stefan.o...@gmail.com wrote: Hey guys, I wrote myself a little helper function `mdot` which chains np.dot for multiple arrays. So I can write mdot(A, B, C, D, E) instead of these

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Nathaniel Smith
On Mon, Feb 17, 2014 at 3:55 PM, Stefan Seefeld ste...@seefeld.name wrote: On 02/17/2014 03:42 PM, Nathaniel Smith wrote: Another optimization we should consider that might help a lot in the same situations where this would help: for code called from the cpython eval loop, it's afaict possible

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Julian Taylor
On 17.02.2014 22:27, Sturla Molden wrote: Nathaniel Smith n...@pobox.com wrote: Also, I'd be pretty wary of caching large chunks of unused memory. People already have a lot of trouble understanding their program's memory usage, and getting rid of 'greedy free' will make this even worse. A

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Stefan Seefeld
On 02/17/2014 06:56 PM, Nathaniel Smith wrote: On Mon, Feb 17, 2014 at 3:55 PM, Stefan Seefeld ste...@seefeld.name wrote: On 02/17/2014 03:42 PM, Nathaniel Smith wrote: Another optimization we should consider that might help a lot in the same situations where this would help: for code called

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread David Cournapeau
On Mon, Feb 17, 2014 at 7:31 PM, Julian Taylor jtaylor.deb...@googlemail.com wrote: hi, I noticed that during some simplistic benchmarks (e.g. https://github.com/numpy/numpy/issues/4310) a lot of time is spent in the kernel zeroing pages. This is because under linux glibc will always

[Numpy-discussion] Proposal to make power return float, and other such things.

2014-02-17 Thread Charles R Harris
This is apropos issue #899 https://github.com/numpy/numpy/issues/899, where it is suggested that power promote integers to float. That sounds reasonable to me, but such a change in behavior makes it a bit iffy. Thoughts? Chuck ___ NumPy-Discussion
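
A hedged illustration of what is at stake: exact integer-power behavior varies somewhat across numpy versions and platforms, but silent wraparound for large exponents is the common case, and negative integer exponents are similarly surprising; promotion to float would avoid both.

    import numpy as np

    base = np.array([2], dtype=np.int64)

    # With integer dtypes, a large exponent wraps around silently instead
    # of giving the mathematically correct result.
    print(base ** 70)                     # not 2**70

    # Promoting to float, as proposed, gives a sane (if inexact) answer
    # over the whole range.
    print(base.astype(np.float64) ** 70)  # ~1.1805916e+21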

Re: [Numpy-discussion] Proposal to make power return float, and other such things.

2014-02-17 Thread Alan G Isaac
On 2/17/2014 8:13 PM, Charles R Harris wrote: This is apropos issue #899 https://github.com/numpy/numpy/issues/899, where it is suggested that power promote integers to float. Even when base and exponent are both positive integers? Alan Isaac ___

Re: [Numpy-discussion] Proposal to make power return float, and other such things.

2014-02-17 Thread alex
On Mon, Feb 17, 2014 at 8:13 PM, Charles R Harris charlesr.har...@gmail.com wrote: This is apropos issue #899, where it is suggested that power promote integers to float. That sounds reasonable to me, but such a change in behavior makes it a bit iffy. Thoughts? After this change, what would

[Numpy-discussion] bug with mmap'ed datetime64 arrays

2014-02-17 Thread Charles G. Waldman
test case:
#!/usr/bin/env python
import numpy as np
a = np.array(['2014', '2015', '2016'], dtype='datetime64')
x = np.datetime64('2015')
print a > x
np.save('test.npy', a)
b = np.load('test.npy', mmap_mode='c')
print b > x
result:
[False False True]
Traceback (most recent call last):
  File "<stdin>",