Re: [Numpy-discussion] How exactly ought 'dot' to work?
On 24 February 2014 05:21, Nathaniel Smith n...@pobox.com wrote:
> I've tentatively rewritten the first section of the PEP to try and accomplish this framing: https://github.com/njsmith/numpy/blob/matmul-pep/doc/neps/return-of-revenge-of-matmul-pep.rst Comments welcome etc.

I've not been following the discussion about this, but just out of interest: is there no interest in Matlab-style left- and right-matrix division? Personally I always found the idea of matrix division confusing and dislike teaching it to students, but it does seem to be popular among Matlab users.

Oscar
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
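For readers unfamiliar with the notation Oscar mentions: in Matlab, left division A\b solves A*x = b and right division b/A solves x*A = b. A minimal numpy translation (my sketch, not part of the thread; the sample matrix is arbitrary):

```python
import numpy as np

# Matlab's A \ b solves A @ x = b; b / A solves x @ A = b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x_left = np.linalg.solve(A, b)          # A \ b
assert np.allclose(A.dot(x_left), b)

# Right division via transposes: x @ A = b  <=>  A.T @ x.T = b.T
x_right = np.linalg.solve(A.T, b.T).T   # b / A
assert np.allclose(x_right.dot(A), b)
```

Spelling both out with `solve` is arguably why numpy never grew division operators: the explicit form makes clear which side is being solved.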
[Numpy-discussion] np.savetxt() default format
Hello, I've noticed that numpy's default format for saving data in ascii representation with np.savetxt() is %.18e. Given that the default data type for numpy is double and that the resolution of doubles is 15 decimal digits, what's the reason for the additional three digits? The three additional digits are definitely not an issue, but I would like to understand why they are there. Thanks. Best, Daniele
Re: [Numpy-discussion] np.savetxt() default format
On 24 February 2014 13:26, Daniele Nicolodi dani...@grinta.net wrote:
> I've noticed that numpy's default format for saving data in ascii representation with np.savetxt() is %.18e. Given that the default data type for numpy is double and that the resolution of doubles is 15 decimal digits, what's the reason for the additional three digits?

It is reasonable to think of doubles as being limited to 15 decimal digits when reasoning about rounding errors, but that's a conservative estimate. If you want to be able to round-trip from IEEE-754 double precision to decimal and back, you need more than 15 digits. The difference between 1.0 and the next representable double precision number appears only at the 16th decimal place:

>>> b = 1 + 1.01*2**-53
>>> b
1.0000000000000002
>>> len(repr(b)) - 2
16

I think the convention to use 18 digits generally stems from not fully trusting the IEEE-754-ness of other systems that might read your data.

Oscar
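As a side note (my sketch, not part of the thread): the usual statement of the round-trip requirement is 17 significant digits, i.e. a "%.16e" format, so savetxt's "%.18e" default carries a couple of digits of slack:

```python
# 17 significant digits ("%.16e") round-trip any IEEE-754 double exactly;
# 15 significant digits ("%.14e") can lose the last bits. numpy's savetxt
# default of "%.18e" is therefore comfortably beyond the threshold.
for v in [0.1 + 0.2, 1 + 2.0 ** -52, 1.2345678901234567e-300]:
    assert float("%.16e" % v) == v                  # exact round-trip
assert float("%.14e" % (0.1 + 0.2)) != 0.1 + 0.2   # 15 digits are lossy
```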
Re: [Numpy-discussion] How exactly ought 'dot' to work?
23.02.2014 00:03, Nathaniel Smith kirjoitti:
> Currently numpy's 'dot' acts a bit weird for ndim > 2 or ndim < 1. In practice this doesn't usually matter much, because these are very rarely used. But, I would like to nail down the behaviour so we can say something precise in the matrix multiplication PEP.

On Sat, Feb 22, 2014 at 7:09 PM, Pauli Virtanen wrote:
> I'm not sure it's necessary to say much about this in the PEP. It should in my view concentrate on arguing why the new binop is needed in the Python language, and for that, restricting to 2D is good enough IMHO. How exactly Numpy makes use of the capability for >2-dim arrays is something that should definitely be discussed. But I think this is a problem mainly interesting for Numpy devs, and not for CPython devs.

On 2/24/2014 12:21 AM, Nathaniel Smith wrote:
> I actually disagree strongly. I think it's very important to make clear that we have a clear, well thought through, and cross-project approach to what @ is supposed to mean.

I think Pauli is right. We know `@` is supposed to mean matrix multiply when dealing with conformable 2d arrays. That is the real motivation of the PEP. I cannot see why the PEP itself would need to go beyond that. The behavior of `@` in other cases seems a discussion that should go *much* slower than that of the core of the PEP, which is to get an operator for matrix multiplication. Furthermore, I am not able to understand the principles behind the discussion of how `@` should behave in other cases. I do not think they are being clearly stated. (I have added a comment to the PEP asking for clarification.) To be concrete, if `@` is proposed to behave unlike Mathematica's `Dot` command, I would hope to hear a very clear mathematical motivation for this. (Specifically, I do not understand why `@` would do scalar product.) On the other hand, if the proposal is just that `@` should behave just like NumPy's `dot` does, that should be simply stated.
Cheers, Alan Isaac
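For context, the >2-dim behaviour being debated can be seen directly (a small demo of mine, not part of the original exchange): np.dot contracts the last axis of its first argument with the second-to-last axis of its second, combining all remaining axes rather than treating the leading axes as a broadcast stack:

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((5, 4, 6))

# Contracts a's last axis (length 4) with b's second-to-last axis (also 4);
# the result keeps every other axis from both operands.
c = np.dot(a, b)
assert c.shape == (2, 3, 5, 6)   # not a stacked (.., 3, 6) matmul
```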
Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function
Hey guys, I just pushed an updated version to github: https://github.com/sotte/numpy_mdot Here is an ipython notebook with some experiments: http://nbviewer.ipython.org/urls/raw2.github.com/sotte/numpy_mdot/master/2014-02_numpy_mdot.ipynb

- I added (almost numpy compliant) documentation.
- I use a dedicated function for len(args) == 3 to improve the speed.
- Some general cleanup.

Before I create a pull request I have a few questions:

- Should there be an optimize argument, or should we always optimize the parenthesization? There is an overhead, but maybe we could neglect it? I think we should keep the flag, but set it to True by default.
- I currently use a recursive algorithm to do the multiplication. Any objections?
- In which file should `mdot` live?
- I wrote a function `print_optimal_chain_order(D, A, B, C, names=list('DABC'))` which determines the optimal parenthesization and prints out a numpy expression. It's kinda handy, but do we actually need it?

Best regards, Stefan

On Thu, Feb 20, 2014 at 8:39 PM, Nathaniel Smith n...@pobox.com wrote:
> On Thu, Feb 20, 2014 at 1:35 PM, Stefan Otte stefan.o...@gmail.com wrote:
>> Hey guys, I quickly hacked together a prototype of the optimization step: https://github.com/sotte/numpy_mdot I think there is still room for improvements so feedback is welcome :) I'll probably have some time to code on the weekend. @Nathaniel, I'm still not sure about integrating it in dot. Don't a lot of people use the optional out parameter of dot?
>
> The email you're replying to below about deprecating stuff in 'dot' was in reply to Eric's email about using dot on arrays with shape (k, n, n), so those comments are unrelated to the mdot stuff. I wouldn't mind seeing out= arguments become kw-only in general, but even if we decided to do that it would take a long deprecation period, so yeah, let's give up on 'dot(A, B, C, D)' as syntax for mdot. However, the suggestion of supporting np.dot([A, B, C, D]) still seems like it might be a good idea...?
I have mixed feelings about it -- one less item cluttering up the namespace, but it is weird and magical to have two totally different calling conventions for the same function. -n

On Thu, Feb 20, 2014 at 4:02 PM, Nathaniel Smith n...@pobox.com wrote:
> If you send a patch that deprecates dot's current behaviour for ndim > 2, we'll probably merge it. (We'd like it to function like you suggest, for consistency with other gufuncs. But to get there we have to deprecate the current behaviour first.) While I'm wishing for things I'll also mention that it would be really neat if binary gufuncs had a .outer method like regular ufuncs do, so anyone currently using ndim > 2 dot could just switch to that. But that's a lot more work than just deprecating something :-). -n

On 20 Feb 2014 09:27, Eric Moore e...@redtetrahedron.org wrote: On Thursday, February 20, 2014, Eelco Hoogendoorn hoogendoorn.ee...@gmail.com wrote: If the standard semantics are not affected, and the most common two-argument scenario does not take more than a single if-statement of overhead, I don't see why it couldn't be a replacement for the existing np.dot; but others' mileage may vary. On Thu, Feb 20, 2014 at 11:34 AM, Stefan Otte stefan.o...@gmail.com wrote: Hey, so I propose the following. I'll implement a new function `mdot`. Incorporating the changes in `dot` is unlikely. Later, one can still include the features in `dot` if desired. `mdot` will have a default parameter `optimize`. If `optimize==True` the reordering of the multiplication is done. Otherwise it simply chains the multiplications. I'll test and benchmark my implementation and create a pull request. Cheers, Stefan

Another consideration here is that we need a better way to work with stacked matrices such as np.linalg handles now. I.e. I want to compute the matrix product of two (k, n, n) arrays producing a (k, n, n) result.
Near as I can tell there isn't a way to do this right now that doesn't involve an explicit loop, since dot will return a (k, n, k, n) result. Yes, this output contains what I want, but it also computes a lot of things that I don't want. It would also be nice to be able to do a matrix product reduction, (k, n, n) -> (n, n), in a single line too. Eric

-- Nathaniel J. Smith
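A sketch (mine, not from the thread) of how Eric's stacked product can be written without an explicit Python-level loop over k, using np.einsum, together with a functools.reduce spelling of the (k, n, n) -> (n, n) product reduction:

```python
import numpy as np
from functools import reduce

k, n = 4, 3
rng = np.random.RandomState(0)
a = rng.rand(k, n, n)
b = rng.rand(k, n, n)

# Stacked matrix product: multiply matching (n, n) matrices pairwise,
# giving a (k, n, n) result instead of dot's (k, n, k, n).
c = np.einsum('kij,kjl->kil', a, b)
assert c.shape == (k, n, n)
assert np.allclose(c[1], np.dot(a[1], b[1]))

# Product reduction over the stack: a[0] @ a[1] @ ... @ a[k-1].
p = reduce(np.dot, a)
assert p.shape == (n, n)
```

The einsum route avoids the unwanted cross terms, though it still does not use BLAS the way a dedicated stacked-matmul gufunc would.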
[Numpy-discussion] problems building numpy with ACML blas/lapack
Hello, I'm trying to build numpy from source to use AMD's ACML for matrix multiplication (specifically the multi-threaded gfortran64_mp version). I'm able to successfully compile and use a working version of np.dot, but the resulting installation doesn't pass numpy's test suite; instead, I get a segfault. I'm hoping for some advice on what might be wrong.

I'm on Debian, with a fresh install of Python 2.7.6. To install numpy, I've followed exactly the instructions previously posted to this list by Thomas Unterthiner; see http://numpy-discussion.10968.n7.nabble.com/numpy-ACML-support-is-kind-of-broken-td35454.html. The only thing I've adjusted is to use the gfortran64_mp version of ACML instead of just gfortran64. Using those instructions, I can compile numpy 1.8.0 so that it successfully uses the desired ACML libraries. I can confirm this with `ldd site-packages/numpy/core/_dotblas.so`, which shows that I'm linked to libacml_mp.so as desired. Furthermore, some quick timing tests indicate that for a 1000x1000 matrix X, calls to np.dot(X, X) have similar speeds as custom C code that directly calls the ACML libraries. So, dot seems to work as desired.

However, when I run numpy.test(verbose=4), I get a seg fault:

```
test_einsum_sums_cfloat128 (test_einsum.TestEinSum) ... Segmentation fault
```

Any ideas what might be wrong? From my benchmark tests, ACML is way faster than MKL or other options on my system, so I'd really like to use it, but I don't trust this current install. Thanks! - Mike
Re: [Numpy-discussion] argsort speed
On Fri, Feb 21, 2014 at 11:09 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Feb 21, 2014 at 10:35 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote: On Mon, Feb 17, 2014 at 11:40 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Feb 17, 2014 at 11:32 AM, Julian Taylor jtaylor.deb...@googlemail.com wrote: On 17.02.2014 15:18, Francesc Alted wrote: On 2/17/14, 1:08 AM, josef.p...@gmail.com wrote: On Sun, Feb 16, 2014 at 6:12 PM, Daπid davidmen...@gmail.com wrote: On 16 February 2014 23:43, josef.p...@gmail.com wrote:

What's the fastest argsort for a 1d array with around 28 million elements, roughly uniformly distributed, random order?

On numpy latest version:

for kind in ['quicksort', 'mergesort', 'heapsort']:
    print kind
    %timeit np.sort(data, kind=kind)
    %timeit np.argsort(data, kind=kind)

quicksort
1 loops, best of 3: 3.55 s per loop
1 loops, best of 3: 10.3 s per loop
mergesort
1 loops, best of 3: 4.84 s per loop
1 loops, best of 3: 9.49 s per loop
heapsort
1 loops, best of 3: 12.1 s per loop
1 loops, best of 3: 39.3 s per loop

It looks like quicksort is quicker at sorting, but mergesort is marginally faster at sorting args. The difference is slim, but upon repetition it remains significant. Why is that?

Probably part of the reason is what Eelco said, and part is that in sort the comparisons are done accessing the array elements directly, but in argsort you have to index the array, introducing some overhead.

Thanks, both. I also gain a second with mergesort. Matlab would be nicer in my case: it returns both. I still need to use the argsort to index into the array to also get the sorted array.

Many years ago I needed something similar, so I made some functions for sorting and argsorting in one single shot. Maybe you want to reuse them.
Here is an example of the C implementation: https://github.com/PyTables/PyTables/blob/develop/src/idx-opt.c#L619 and here the Cython wrapper for all of them: https://github.com/PyTables/PyTables/blob/develop/tables/indexesextension.pyx#L129 Francesc

That doesn't really make a big difference if the data is randomly distributed. The sorting operation is normally much more expensive than later applying the indices:

In [1]: d = np.arange(1000)
In [2]: np.random.shuffle(d)
In [3]: %timeit np.argsort(d)
1 loops, best of 3: 1.99 s per loop
In [4]: idx = np.argsort(d)
In [5]: %timeit d[idx]
1 loops, best of 3: 213 ms per loop

But if your data is not random it can make a difference, as even quicksort can be a lot faster then. timsort would be a nice addition to numpy; it performs very well for partially sorted data. Unfortunately it's quite complicated to implement.

Quicksort and shellsort gain speed by having simple inner loops. I have the impression that timsort is optimal when compares and memory access are expensive, but I haven't seen any benchmarks for native types in contiguous memory.

I found some benchmarks for contiguous memory here: https://github.com/swenson/sort/ https://github.com/gfx/cpp-TimSort The first one seems the best; it probably can be directly reused in numpy. The only issue is that it only sorts the array, but does not provide argsort.

I'm impressed by the heapsort time. Heapsort is the slowest of the numpy sorts, so either that heapsort implementation is better than ours or the other sorts are worse ;) Partially sorted sequences are pretty common, so timsort might be a worthy addition. Last time I looked, the JDK was using timsort for sorting objects, and quicksort for native types. Another sort is dual-pivot quicksort, which I've heard some good things about. Adding indirect sorts isn't so difficult once the basic sort is available. Since indirect sorting tends to access memory more randomly, and is thus more expensive, timsort might be a good choice there.
Indeed, I think one has to be very careful with these benchmarks, since the results depend highly on the structure of the arrays being sorted. I've been looking into this a bit, since I need a fast algorithm in Fortran that returns the indices that sort the array. So far I use quicksort, but this timsort might perform better for partially sorted arrays, which is typically my use case. Ondrej
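Regarding josef's wish for a Matlab-style call that returns both the sorted array and the indices: one sort suffices, since applying the argsort result with fancy indexing reproduces the sorted array (a sketch of mine, not from the thread):

```python
import numpy as np

rng = np.random.RandomState(0)
data = rng.rand(1000)

idx = np.argsort(data, kind='mergesort')  # sort once, keeping the indices
sorted_data = data[idx]                   # fancy indexing applies them

# Same result as sorting directly, without a second O(n log n) pass.
assert np.array_equal(sorted_data, np.sort(data, kind='mergesort'))
```

As Julian's timings show, the extra fancy-indexing pass is cheap relative to the sort itself for randomly ordered data.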
Re: [Numpy-discussion] 1.8.1 release
Hi,

On Sun, Feb 23, 2014 at 10:26 AM, Charles R Harris charlesr.har...@gmail.com wrote:
> Hi All, A lot of fixes have gone into the 1.8.x branch and it looks about time to do a bugfix release. There are a couple of important bugfixes still to backport, but if all goes well next weekend, March 1, looks like a good target date. So give the current 1.8.x branch a try so as to check that it covers your most urgent bugfix needs.

I'd like to volunteer to make a .whl build for Mac. Is there anything special I should do to coordinate with y'all? It would be very good to put it up on pypi for seamless pip install...

Thanks a lot, Matthew
Re: [Numpy-discussion] 1.8.1 release
Has anyone alerted C Gohlke? http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy - Ray
Re: [Numpy-discussion] 1.8.1 release
On Mon, Feb 24, 2014 at 1:54 PM, RayS r...@blue-cove.com wrote:
> Has anyone alerted C Gohlke? http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy - Ray

Christoph seems to keep a pretty good eye on numpy, and we rely on him to test it on Windows. In any case, there are enough fixes backported that I think we'd better start with a 1.8.1rc. Chuck
Re: [Numpy-discussion] 1.8.1 release
I am pretty sure that you guys test pandas master, but 1.8.1 looks good to me.

On Feb 24, 2014, at 4:42 PM, Charles R Harris charlesr.har...@gmail.com wrote:
> Christoph seems to keep a pretty good eye on numpy, and we rely on him to test it on Windows. In any case, there are enough fixes backported that I think we'd better start with a 1.8.1rc. Chuck
Re: [Numpy-discussion] 1.8.1 release
Hi,

On Mon, Feb 24, 2014 at 12:40 PM, Matthew Brett matthew.br...@gmail.com wrote:
> I'd like to volunteer to make a .whl build for Mac. Is there anything special I should do to coordinate with y'all? It would be very good to put it up on pypi for seamless pip install...

Current trunk wheel at http://nipy.bic.berkeley.edu/scipy_installers/numpy-1.8.0.dev_a89a36e-cp27-none-macosx_10_6_intel.whl

Testing welcome. You'll need OSX and latest setuptools and latest pip, and:

curl -O http://nipy.bic.berkeley.edu/scipy_installers/numpy-1.8.0.dev_a89a36e-cp27-none-macosx_10_6_intel.whl
pip install --pre --no-index --find-links . numpy

I built the wheel on a 10.9 laptop and tested it on a bare-metal no-compiler 10.6 laptop, so I think it will work. I'll set up daily wheel build / tests on our buildbots as well.

Cheers, Matthew
[Numpy-discussion] recfunctions
Hi All, Does anyone recall if it was a deliberate choice to not expose recfunctions in the lib package? This is apropos issue #4242: https://github.com/numpy/numpy/issues/4242 . Chuck
Re: [Numpy-discussion] recfunctions
On February 25, 2014 at 01:26:51, Charles R Harris (charlesr.har...@gmail.com) wrote:
> Hi All, Does anyone recall if it was a deliberate choice to not expose recfunctions in the lib package? This is apropos issue #4242.

Yes. The job was a rip-off of John Hunter's similar functions in matplotlib (with some rewriting and extensions) that sounded like a good idea at the time. However, I never really used recarrays, so I wasn't sure whether the implementation could be useful and... oh, look: http://mail.scipy.org/pipermail/numpy-discussion/2011-July/057477.html
Re: [Numpy-discussion] 1.8.1 release
What's up with the OpenBLAS work? Any chance that might make it into official binaries? Or is it just too fresh? Also -- from an off-hand comment in the thread it looked like OpenBLAS could provide a library that selects optimized code at run-time depending on hardware -- this would solve the superpack problem with wheels, which would be really nice... Or did I dream that? -Chris

On Mon, Feb 24, 2014 at 12:40 PM, Matthew Brett matthew.br...@gmail.com wrote:
> I'd like to volunteer to make a .whl build for Mac. Is there anything special I should do to coordinate with y'all? It would be very good to put it up on pypi for seamless pip install... Thanks a lot, Matthew

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/ORR (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
chris.bar...@noaa.gov
Re: [Numpy-discussion] 1.8.1 release
When will we see a changelog (http://sourceforge.net/projects/numpy/files/NumPy/1.8.1/Changelog/download)? I'd like to get this into our organization's SRS, and a list of fixes (related or not) would be great. - Ray
[Numpy-discussion] flatnonzero fails with lists
I was surprised that `flatnonzero` fails with lists because it calls the `ravel` method, which they do not have, instead of using the `ravel` function. I do not know that there is any policy that NumPy functions should always work on lists, but I have gotten used to them doing so. So I'm just asking, is this intentional? (version 1.7.1) Thanks, Alan Isaac
Re: [Numpy-discussion] flatnonzero fails with lists
On Mon, Feb 24, 2014 at 10:37 PM, Alan G Isaac alan.is...@gmail.com wrote:
> I was surprised that `flatnonzero` fails with lists because it calls the `ravel` method, which they do not have, instead of using the `ravel` function. I do not know that there is any policy that NumPy functions should always work on lists, but I have gotten used to them doing so. So I'm just asking, is this intentional? (version 1.7.1)

It's documented to take an ndarray, but I see no reason that it shouldn't be modified to take array_like. Chuck
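In the meantime, a workaround (my sketch, not from the thread) is to convert the list first with np.asarray, or to use the `ravel` function Alan mentions, both of which accept array_like:

```python
import numpy as np

data = [0, 3.5, 0, -2]

# flatnonzero calls arr.ravel(), which plain lists lack; converting the
# input to an ndarray first sidesteps the problem:
idx = np.flatnonzero(np.asarray(data))
assert idx.tolist() == [1, 3]

# Equivalent spelling using the ravel *function* rather than the method:
idx2 = np.nonzero(np.ravel(data))[0]
assert idx2.tolist() == [1, 3]
```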