Re: [Numpy-discussion] Numpy 1.12.x branched
My changes to numpy.mean for float16 aren't in the docs. Should I make a PR against numpy master or maintenance/1.12.x?

Fred

On Mon, Nov 7, 2016 at 2:59 PM, Charles R Harris wrote:
> On Mon, Nov 7, 2016 at 11:32 AM, Matti Picus wrote:
>> On 07/11/16 10:19, numpy-discussion-requ...@scipy.org wrote:
>>> Date: Sun, 06 Nov 2016 17:56:12 +0100
>>> From: Sebastian Berg
>>> To: numpy-discussion@scipy.org
>>> Subject: Re: [Numpy-discussion] Numpy 1.12.x branched
>>> Message-ID: <1478451372.3875.5.ca...@sipsolutions.net>
>>> Content-Type: text/plain; charset="utf-8"
>>>
>>> On Sa, 2016-11-05 at 17:04 -0600, Charles R Harris wrote:
>>>> Hi All,
>>>>
>>>> Numpy 1.12.x has been branched and the 1.13 development branch is
>>>> open. It would be helpful if folks could review the release notes, as
>>>> it is likely I've missed something. I'd like to make the first beta
>>>> release in a couple of days.
>>>
>>> Very cool, thanks for all the hard work!
>>>
>>> - Sebastian
>>>
>>>> Chuck
>>
>> Thanks for managing this. I don't know where, but it would be nice if
>> the release notes could mention the PyPy support - we are down to only a
>> few failures on the test suite. The only real outstanding issue is nditer
>> using UPDATEIFCOPY, which depends on refcounting semantics to trigger the
>> copy. Other than that, PyPy + NumPy 1.12 is a working thing; we (PyPy devs)
>> will soon try to make it work faster :).
>
> A PR updating the release notes would be welcome. This might be one of the
> highlights for those interested in PyPy.
>
> Chuck

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Announcing Theano 0.8.0
Announcing Theano 0.8.0

This is a major release, with lots of new features, bug fixes, and some interface changes (deprecated or potentially misleading features were removed). The upgrade is recommended for everybody. For those using the bleeding-edge version in the git repository, we encourage you to update to the `rel-0.8.0` tag.

What's New
----------

Highlights:
- Python 2 and 3 support with the same code base
- Faster optimization
- Integration of cuDNN for better GPU performance
- Many Scan improvements (execution speedup, ...)
- optimizer=fast_compile moves computation to the GPU
- Better convolution on CPU and GPU (CorrMM, cuDNN, 3d conv, more parameters)
- Interactive visualization of graphs with d3viz
- cnmem (better memory management on GPU)
- BreakpointOp
- Multi-GPU data parallelism via Platoon (https://github.com/mila-udem/platoon/)
- More pooling parameters supported
- Bilinear interpolation of images
- New GPU back-end:
  * float16 support (needs CUDA 7.5)
  * multiple dtypes
  * multi-GPU support in the same process

A total of 141 people contributed to this release; please see the end of NEWS.txt for the complete list. If you are among the authors and would like to update the information, please let us know.

Download and Install
--------------------

You can download Theano from http://pypi.python.org/pypi/Theano

Installation instructions are available at http://deeplearning.net/software/theano/install.html

Description
-----------

Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It is built on top of NumPy. Theano features:

* tight integration with NumPy: a similar interface to NumPy's; numpy.ndarrays are also used internally in Theano-compiled functions.
* transparent use of a GPU: perform data-intensive computations up to 140x faster than on a CPU (support for float32 only).
* efficient symbolic differentiation: Theano can compute derivatives for functions of one or many inputs.
* speed and stability optimizations: avoid nasty bugs when computing expressions such as log(1 + exp(x)) for large values of x.
* dynamic C code generation: evaluate expressions faster.
* extensive unit-testing and self-verification: includes tools for detecting and diagnosing bugs and/or potential problems.

Theano has been powering large-scale computationally intensive scientific research since 2007, but it is also approachable enough to be used in deep learning classes.

All questions/comments are always welcome on the Theano mailing lists (http://deeplearning.net/software/theano/#community)

Frédéric
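The stability point above can be illustrated in plain NumPy — a sketch of the same trick Theano applies automatically when it rewrites log(1 + exp(x)):

```python
import numpy as np

x = 1000.0
with np.errstate(over='ignore'):
    naive = np.log(1 + np.exp(x))  # exp(1000) overflows to inf, so the log is inf
stable = np.logaddexp(0, x)        # log(exp(0) + exp(x)), computed without overflow
print(naive, stable)               # inf 1000.0
```

For large x, log(1 + exp(x)) is essentially x, which is what the stable form returns.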
Re: [Numpy-discussion] ANN: NumPy 1.8.2 release candidate
All Theano tests pass. Thanks!

Fred

On Tue, Aug 5, 2014 at 8:46 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Tue, Aug 5, 2014 at 2:27 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Tue, Aug 5, 2014 at 1:57 PM, Julian Taylor jtaylor.deb...@googlemail.com wrote:
On 05.08.2014 22:32, Christoph Gohlke wrote:
On 8/5/2014 12:45 PM, Julian Taylor wrote:
Hello, I am pleased to announce the first release candidate for numpy 1.8.2, a pure bugfix release for the 1.8.x series. https://sourceforge.net/projects/numpy/files/NumPy/1.8.2rc1/ If no regressions show up, the final release is planned for this weekend. The upgrade is recommended for all users of the 1.8.x series. The following issues have been fixed:
* gh-4836: partition produces wrong results for multiple selections in equal ranges
* gh-4656: Make fftpack._raw_fft threadsafe
* gh-4628: incorrect argument order to _copyto in np.nanmax, np.nanmin
* gh-4613: Fix lack of NULL check in array_richcompare
* gh-4642: Hold GIL for converting dtypes types with fields
* gh-4733: fix np.linalg.svd(b, compute_uv=False)
* gh-4853: avoid unaligned simd load on reductions on i386
* gh-4774: avoid unaligned access for strided byteswap
* gh-650: Prevent division by zero when creating arrays from some buffers
* gh-4602: ifort has issues with optimization flag O2, use O1
Source tarballs, windows installers and release notes can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.8.2rc1/
Cheers, Julian Taylor
Hello, thank you. Looks good. All builds and tests pass on Windows (using msvc/MKL). Any chance gh-4722 can make it into the release? Fix seg fault converting empty string to object https://github.com/numpy/numpy/pull/4722
Thanks, I missed that one. Pretty simple, I'll add it to the final release.
OSX wheels built and tested and uploaded OK: http://wheels.scikit-image.org https://travis-ci.org/matthew-brett/numpy-atlas-binaries/builds/31747958 OSX wheel tested OK against current scipy stack for system Python, python.org Python, homebrew, macports: https://travis-ci.org/matthew-brett/scipy-stack-osx-testing/builds/31756325 Cheers, Matthew
Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays
On Thu, Jul 24, 2014 at 12:59 PM, Charles R Harris charlesr.har...@gmail.com wrote:
On Thu, Jul 24, 2014 at 8:27 AM, Jaime Fernández del Río jaime.f...@gmail.com wrote:
On Thu, Jul 24, 2014 at 4:56 AM, Julian Taylor jtaylor.deb...@googlemail.com wrote:
In practice one of the better methods is pairwise summation, which is pretty much as fast as naive summation but has an accuracy of O(log N) ulp. This is the method numpy 1.9 will use by default (and it's even a bit faster than our old implementation of the naive sum): https://github.com/numpy/numpy/pull/3685 But it has some limitations: it is limited to blocks of the buffer size (8192 elements by default) and does not work along the slow axes due to limitations in the numpy iterator.
For what it's worth, I see the issue on a 64-bit Windows numpy 1.8, but not on a 32-bit Windows numpy master:
np.__version__
'1.8.0'
np.ones(1e8, dtype=np.float32).mean()
0.16777216
np.__version__
'1.10.0.dev-Unknown'
np.ones(1e8, dtype=np.float32).mean()
1.0
Interesting. Might be compiler related, as there are many choices for floating point instructions/registers on i386. The i386 version may effectively be working in double precision.
Also note the different numpy versions. Julian said that numpy 1.9 will use a more precise version in that case. That could explain it.
Fred
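The failure mode under discussion is easy to reproduce directly: float32 has a 24-bit significand, so a naive running sum of ones stalls once it reaches 2**24 = 16777216 (hence the 0.16777216 result for 1e8 ones). A sketch:

```python
import numpy as np

# adding 1.0 to 2**24 in float32 is a no-op: the increment is below 1 ulp
s = np.float32(2**24)
print(s + np.float32(1.0) == s)  # True: this is where the naive sum stalls

# a float64 accumulator (or numpy 1.9's pairwise summation) avoids the problem
x = np.ones(10**7, dtype=np.float32)
print(x.mean(dtype=np.float64))  # 1.0
```

Passing `dtype=np.float64` to mean/sum is the portable workaround on versions that still use naive single-precision accumulation.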
Re: [Numpy-discussion] Reporting a bug to Apple.
Hi,

We already filed a bug report with Apple, but they haven't acted on it yet. A process calls into the kernel and loops infinitely there. The only way to kill the process is to reboot.

Arnaud, how did you report it? Good luck, and if they act on this, I would be happy to know how you did it.

Frédéric Bastien

On Mon, Jun 9, 2014 at 7:29 PM, Charles R Harris charlesr.har...@gmail.com wrote:
Hi All, Julian has tracked down a bug in the Accelerate library in Mavericks, details here https://github.com/numpy/numpy/issues/4007#issuecomment-45541678. Is there a registered Apple Developer here who can report the bug to Apple? TIA, Chuck
Re: [Numpy-discussion] IDL vs Python parallel computing
Just a quick question/possibility. What about parallelizing only ufuncs with a single input that is C- or Fortran-contiguous, like the trigonometric functions? Is there a fast path in the ufunc mechanism when the input is C/Fortran contiguous? If so, it would be relatively easy to add an OpenMP pragma to parallelize that loop, with a condition on a minimum number of elements.

Anyway, I won't do it. I'm just outlining what I think is the easiest case to implement (depending on NumPy internals that I don't know well enough), and I think the most frequent one (so possibly a quick fix for someone with knowledge of that code).

In Theano, we found on a few CPUs that, for addition, we need a minimum of 200k elements for the parallelization of an elemwise to be useful. We use that number by default for all operations to keep things simple. It is user configurable. This guarantees that, on the current generation of hardware, threading doesn't slow things down. I think this point is the more important one: don't show users a slowdown by default with a new version.

Fred

On Wed, May 7, 2014 at 2:27 PM, Julian Taylor jtaylor.deb...@googlemail.com wrote:
On 07.05.2014 20:11, Sturla Molden wrote:
On 03/05/14 23:56, Siegfried Gonzi wrote:
A more technical answer is that NumPy's internals do not play very nicely with multithreading. For example, the array iterators used in ufuncs store internal state. Multithreading would imply excessive contention for this state, as well as induce false sharing of the iterator object. Therefore, a multithreaded NumPy would have performance problems due to synchronization as well as hierarchical memory collisions. Adding multithreading support to the current NumPy core would just degrade the performance. NumPy will not be able to use multithreading efficiently unless we redesign the iterators in the NumPy core. That is a massive undertaking which probably means rewriting most of NumPy's core C code.
A better strategy would be to monkey-patch some of the more common ufuncs with multithreaded versions.

I wouldn't say that the iterator is a problem; the important iterator functions are threadsafe, and there is support for multithreaded iteration using NpyIter_Copy, so no data is shared between threads. I'd say the main issue is that there simply aren't many functions worth parallelizing in numpy. Most of the commonly used stuff is already memory-bandwidth bound with only one or two threads. The only things I can think of that would profit are sorting/partition and the special functions like sqrt, exp, log, etc. Generic efficient parallelization would require merging of operations to improve the FLOPS/loads ratio. E.g. numexpr and theano are able to do so and thus also have built-in support for multithreading.

That being said, you can use Python threads with numpy, as (especially in 1.9) most expensive functions release the GIL. But unless you are doing very flop-intensive stuff, you will probably have to manually block your operations to the last-level cache size if you want to scale beyond one or two threads.
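Julian's last point — plain Python threads over GIL-releasing numpy calls, guarded by a minimum-size threshold like Theano's 200k elements — can be sketched as follows. The function name, the chunking scheme, and the threshold value here are illustrative, not tuned:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_sqrt(x, nthreads=2, min_size=200_000):
    # below the threshold, thread overhead outweighs any gain
    if x.size < min_size:
        return np.sqrt(x)
    out = np.empty_like(x)
    bounds = np.linspace(0, x.size, nthreads + 1).astype(int)
    def work(i):
        # np.sqrt on a large block releases the GIL,
        # so the chunks can run concurrently
        np.sqrt(x[bounds[i]:bounds[i + 1]], out=out[bounds[i]:bounds[i + 1]])
    with ThreadPoolExecutor(nthreads) as ex:
        list(ex.map(work, range(nthreads)))
    return out

a = np.arange(1_000_000, dtype=np.float64)
print(np.allclose(parallel_sqrt(a), np.sqrt(a)))  # True
```

Whether this actually scales depends on the function being compute-bound rather than memory-bandwidth bound, exactly as discussed above.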
Re: [Numpy-discussion] GSoC project: draft of proposal
Just a comment: supporting a library that is BSD 3-clause licensed could greatly reduce compilation problems like those we have with BLAS. We could include it in numpy, download it automatically, or whatever makes the install trivial, and then we could assume all users have it. Dealing with BLAS is already not fun; if a new dependency were trivial to link to, it would be great.

Fred

On Fri, Mar 14, 2014 at 8:57 AM, Gregor Thalhammer gregor.thalham...@gmail.com wrote:
On 14.03.2014 at 11:00, Eric Moore e...@redtetrahedron.org wrote:
On Friday, March 14, 2014, Gregor Thalhammer gregor.thalham...@gmail.com wrote:
On 13.03.2014 at 18:35, Leo Mao lmao20...@gmail.com wrote:
Hi, Thanks a lot for your advice, Chuck. Following your advice, I have modified my draft proposal (attachment). I think it still needs more comments so that I can make it better. And I found that maybe I can also make some functions related to linalg (like dot, svd or something else) faster by integrating a proper library into numpy.
Regards, Leo Mao
Dear Leo, large parts of your proposal are covered by the uvml package https://github.com/geggo/uvml In my opinion you should also consider Intel's VML (part of MKL) as a candidate. (Yes, I know, it is not free.) To my best knowledge it provides many more vectorized functions than the open source alternatives. Concerning your timetable, once you have implemented support for one function, adding more functions is very easy.
Gregor
I'm not sure that your week-old project is enough to discourage this GSoC project. In particular, it would be nice to be able to ship this directly as part of numpy, and that won't really be possible with MKL.
Eric
Hi, it's not at all my intention to discourage this project. I hope Leo Mao can use the uvml package as a starting point for further improvements. Since most vectorized math libraries share a very similar interface, I think the actual choice of library could be made a configurable option.
Adapting uvml to use e.g. yeppp instead of MKL should be straightforward. Similar to numpy or scipy built with MKL lapack and distributed by Enthought or Christoph Gohlke, using MKL should not be ruled out completely. Gregor
Re: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator
This is great news. Excellent work, Nathaniel and all others!

Frédéric

On Fri, Mar 14, 2014 at 8:57 PM, Aron Ahmadia a...@ahmadia.net wrote:
That's the best news I've had all week. Thanks for all your work on this Nathan. -A
On Fri, Mar 14, 2014 at 8:51 PM, Nathaniel Smith n...@pobox.com wrote:
Well, that was fast. Guido says he'll accept the addition of '@' as an infix operator for matrix multiplication, once some details are ironed out: https://mail.python.org/pipermail/python-ideas/2014-March/027109.html http://legacy.python.org/dev/peps/pep-0465/
Specifically, we need to figure out whether we want to make an argument for a matrix power operator (@@), and what precedence/associativity we want '@' to have. I'll post two separate threads to get feedback on those in an organized way -- this is just a heads-up.
-n
--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
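For readers coming to this thread later: the proposal became PEP 465, and on 2-D arrays `@` behaves like `np.dot`. A minimal sketch (requires Python 3.5+ and numpy 1.10+):

```python
import numpy as np

A = np.arange(6).reshape(2, 3)
B = np.arange(6).reshape(3, 2)

C = A @ B                                # infix matrix multiply (PEP 465)
print(np.array_equal(C, np.dot(A, B)))   # True

# the matrix power operator (@@) discussed below was ultimately not adopted;
# np.linalg.matrix_power covers that use case
M = np.eye(2)
print(np.array_equal(np.linalg.matrix_power(M, 3), M))  # True
```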
Re: [Numpy-discussion] 1.8.1 release
Hi,

I have a PR that fixes the excessive printing to stdout when finding the BLAS linking information: https://github.com/numpy/numpy/pull/4081

This was caused by a change in NumPy. A review comment requested that I put the removed information in the dict that we return to the user. I won't have time to do this for the 1.8.1rc, as I'm on vacation next week and need to prepare for that. I'll try to find someone else to finish it, but that is not certain. I'll keep you updated.

thanks

Frédéric

On Tue, Feb 25, 2014 at 5:52 PM, Carl Kleffner cmkleff...@gmail.com wrote:
I built wheels for 32-bit and 64-bit (Windows, OpenBLAS) and put them here: https://drive.google.com/folderview?id=0B4DmELLTwYmlX05WSWpYVWJfRjgusp=sharing Due to shortage of time I can't give much more detailed information before the 1st of March.
Carl
2014-02-25 1:53 GMT+01:00 Chris Barker chris.bar...@noaa.gov:
What's up with the OpenBLAS work? Any chance that might make it into official binaries? Or is it just too fresh? Also -- from an off-hand comment in the thread it looked like OpenBLAS could provide a library that selects optimized code at run-time depending on hardware -- this would solve the superpack problem with wheels, which would be really nice... Or did I dream that?
-Chris
On Mon, Feb 24, 2014 at 12:40 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Sun, Feb 23, 2014 at 10:26 AM, Charles R Harris charlesr.har...@gmail.com wrote:
Hi All, A lot of fixes have gone into the 1.8.x branch and it looks about time to do a bugfix release. There are a couple of important bugfixes still to backport, but if all goes well next weekend, March 1, looks like a good target date. So give the current 1.8.x branch a try so as to check that it covers your most urgent bugfix needs.
I'd like to volunteer to make a .whl build for Mac. Is there anything special I should do to coordinate with y'all? It would be very good to put it up on pypi for seamless pip install...
Thanks a lot, Matthew

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/ORR (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
chris.bar...@noaa.gov
Re: [Numpy-discussion] 1.8.1 release
Hi,

Arnaud finished that in a different way than we had discussed in the PR.

https://github.com/numpy/numpy/pull/4081

Fred
[Numpy-discussion] Suggestion: Port Theano RNG implementation to NumPy
Hi,

In a ticket I made a comment and Charles suggested that I post it here:

In Theano we have a C implementation of a faster RNG: MRG31k3p. It is faster on CPU, and we have a GPU implementation. It would be relatively easy to parallelize on the CPU with OpenMP.

If someone is interested in porting this to numpy, there wouldn't be any dependency problem. No license problem either, as Theano has the same license as NumPy. The speed difference is significant, but I don't recall the numbers.

Fred
Re: [Numpy-discussion] Suggestion: Port Theano RNG implementation to NumPy
I won't go into the discussion of which RNG is better for which problems; I'll just explain why we picked this one. We needed a parallel RNG and we wanted to use the same RNG on CPU and GPU. We discussed with a professor in our department who is well known in that field (Pierre L'Ecuyer), and he recommended this one for our problem. For the GPU, we also don't want an RNG that uses too many registers.

Robert K. commented that this would need a refactoring of numpy.random, and then it would be easy to support many RNGs.

Fred

On Tue, Feb 18, 2014 at 10:56 AM, Matthieu Brucher matthieu.bruc...@gmail.com wrote:
Hi, The main issue with PRNGs and MT is that you don't know how to initialize all MT generators properly. A hash-based PRNG is much more efficient in that regard (see Random123 for a more detailed explanation). From what I heard, even if MT is indeed the RNG of choice in the numerical world, in the parallel world it is not as obvious because of this pitfall.
Cheers, Matthieu
2014-02-18 15:50 GMT+00:00 Sturla Molden sturla.mol...@gmail.com:
AFAIK, CMRG (MRG31k3p) is more equidistributed than Mersenne Twister, but the period is much shorter. However, MT is getting acceptance as the PRNG of choice for numerical work. And when we are doing stochastic simulations in Python, the speed of the PRNG is unlikely to be the bottleneck.
Sturla

--
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher
Music band: http://liliejay.com/
Re: [Numpy-discussion] MKL and OpenBLAS
On Fri, Feb 7, 2014 at 4:31 AM, Robert Kern robert.k...@gmail.com wrote:
On Thu, Feb 6, 2014 at 9:45 PM, Matthieu Brucher matthieu.bruc...@gmail.com wrote:
According to the discussions on the ML, they switched from GPL to MPL to enable the kind of distribution numpy/scipy is looking for. They had some hesitation between BSD and MPL, but IIRC their official stance is to allow inclusion inside BSD-licensed code.
If they want BSD-licensed projects to incorporate their code, they need to license it under the BSD license (or similar). They are not in a position to allow their MPL-licensed code to be included in a BSD-licensed project. That just doesn't mean anything. We never needed their permission. We could be BSD-licensed except for this one bit which is MPLed, but we don't want to be.

I agree that we shouldn't include Eigen code in NumPy. But what about distributing Windows binaries that include the Eigen headers? They write this on their web site:

Virtually any software may use Eigen. For example, closed-source software may use Eigen without having to disclose its own source code. Many proprietary and closed-source software projects are using Eigen right now, as well as many BSD-licensed projects.

Fred
Re: [Numpy-discussion] windows and C99 math
Just a guess, as I don't make those binaries, but I think they are built with Visual Studio, and it only supports C89... We had to backport some of our C code on Windows for the GPU, as nvcc uses VS and doesn't support C99.

Fred

On Mon, Jan 27, 2014 at 3:04 PM, Julian Taylor jtaylor.deb...@googlemail.com wrote:
hi, numpy's no-C99 fallback keeps turning up issues in corner cases, e.g.:
hypot https://github.com/numpy/numpy/issues/2385
log1p https://github.com/numpy/numpy/issues/4225
These only seem to happen on Windows; on Linux and Mac it seems to use the C99 math library just fine. Are our binary builds for Windows not correct, or does Windows just not support C99 math? Hopefully it is the former. Any insight is appreciated (and patches to fix the build even more!)
Cheers, Julian
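The reason a correct `hypot` matters (and why a broken C89 fallback is noticeable) is the overflow case it is specified to handle: sqrt(a*a + b*b) overflows even when the true result is representable. A quick illustration with numpy's own wrapper:

```python
import numpy as np

a = np.float64(1e200)
with np.errstate(over='ignore'):
    naive = np.sqrt(a * a + a * a)  # a*a overflows to inf, so the sqrt is inf
print(naive)                         # inf
print(np.hypot(a, a))                # ~1.414e200, computed by internal rescaling
```

A conforming C99 `hypot` rescales its arguments internally, which is what the fallback implementations in the linked issues got wrong in corner cases.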
Re: [Numpy-discussion] Memory allocation cleanup
On Fri, Jan 10, 2014 at 4:18 AM, Julian Taylor jtaylor.deb...@googlemail.com wrote:
On Fri, Jan 10, 2014 at 3:48 AM, Nathaniel Smith n...@pobox.com wrote:
On Thu, Jan 9, 2014 at 11:21 PM, Charles R Harris charlesr.har...@gmail.com wrote:
[...]
After a bit more research, some further points to keep in mind: Currently, PyDimMem_* and PyArray_* are just aliases for malloc/free, and PyDataMem_* is an alias for malloc/free with some extra tracing hooks wrapped around it. (AFAIK, these tracing hooks are not used by anyone anywhere -- at least, if they are, I haven't heard about it, and there is no code on github that uses them.) There is one substantial difference between the PyMem_* and PyObject_* interfaces as compared to malloc(), which is that the Py* interfaces require that the GIL be held when they are called. (@Julian -- I think your PR we just merged fulfills this requirement, is that right?)
I only replaced object allocation, which should always happen under the GIL. I'm not sure about nditer construction, but it does use Python exceptions for errors, which I think also requires the GIL.
[...]
Also, none of the Py* interfaces implement calloc(), which is annoying because it messes up our new optimization of using calloc() for np.zeros. [...]
Another thing that is not directly implemented in Python is aligned allocation. This is going to get increasingly important with the advent of heavily vectorized x86 CPUs (e.g. AVX512 is rolling out now), while the C malloc is optimized for the oldish SSE (16 bytes). I want to change the array buffer allocation to make use of posix_memalign and C11 aligned_alloc if available, to avoid some penalties when loading from buffers not aligned to 32 bytes. I could imagine it might also help coprocessors and GPUs to have higher alignments, but I'm not very familiar with that type of hardware. The allocator used by Python 3.4 is pluggable, so we could implement our special allocators with the new API, but only once 3.4 is more widespread.
As for coprocessors and GPUs: it could help, but as NumPy is CPU-only and there are other problems with using it directly there, I doubt this change would help code targeting coprocessors/GPUs.

Fred
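Until the allocator itself produces aligned buffers, the usual workaround on the Python side is over-allocation plus an offset view. A sketch (the function name is made up here, and the 64-byte alignment is just an example suited to AVX-512):

```python
import numpy as np

def aligned_zeros(n, dtype=np.float64, alignment=64):
    """Allocate an n-element zeroed 1-D array whose data pointer is aligned."""
    dtype = np.dtype(dtype)
    nbytes = n * dtype.itemsize
    # over-allocate by `alignment` bytes, then slice to an aligned offset
    buf = np.zeros(nbytes + alignment, dtype=np.uint8)
    offset = (-buf.ctypes.data) % alignment
    return buf[offset:offset + nbytes].view(dtype)

a = aligned_zeros(1000, alignment=64)
print(a.ctypes.data % 64)  # 0
```

The slice keeps a reference to the oversized buffer alive, so no copy is made and the memory is not freed early.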
Re: [Numpy-discussion] adding fused multiply and add to numpy
Hi,

It frequently happens that NumPy isn't compiled with all the instructions that are available on the machine where it runs, for example in distros. So if the decision is made to use the fast version when the newer instructions aren't available, the user needs a way to know that, so the library needs a function/attribute to tell them.

How hard would it be to provide the choice to the user? We could provide two functions, like fma_fast() and fma_prec() (for precision)? Or this could be a parameter or a user configuration option, like for the overflow/underflow errors.

Fred

On Thu, Jan 9, 2014 at 9:43 AM, Freddie Witherden fred...@witherden.org wrote:
On 08/01/14 21:39, Julian Taylor wrote:
An issue is software emulation of real fma. This can be enabled in the test ufunc with npfma.set_type(libc). This is unfortunately incredibly slow, about a factor 300 on my machine without hardware fma. This means we either have a function that is fast on some platforms and slow on others but always gives the same result, or we have a fast function that gives better results on some platforms. Given that we are no worse than what numpy currently provides, I favor the latter. Any opinions on whether this should go into numpy or maybe stay a third-party ufunc?
My preference would be to initially add a madd intrinsic. This can be supported on all platforms and can be documented to permit the use of FMA where available. A 'true' FMA intrinsic function should only be provided when hardware FMA support is available. Many of the more interesting applications of FMA depend on there only being a single rounding step, and as such FMA should probably mean a*b + c with only a single rounding.
Regards, Freddie.
Re: [Numpy-discussion] Memory allocation cleanup
This shouldn't affect Theano, so I have no objection. Making things faster and more trackable is always good, so it seems like a good idea to me. Fred On Thu, Jan 9, 2014 at 6:21 PM, Charles R Harris charlesr.har...@gmail.com wrote: Apropos Julian's changes to use the PyObject_* allocation suite for some parts of numpy, I posted the following I think numpy memory management is due a cleanup. Currently we have PyDataMem_* PyDimMem_* PyArray_* Plus the malloc, PyMem_*, and PyObject_* interfaces. That is six ways to manage heap allocations. As far as I can tell, PyArray_* is always PyMem_* in practice. We probably need to keep the PyDataMem family as it has a memory tracking option, but PyDimMem just confuses things, I'd rather just use PyMem_* with explicit size. Curiously, the PyObject_Malloc family is not documented apart from some release notes. We should also check for the macro versions of PyMem_* as they are deprecated for extension modules. Nathaniel then suggested that we consider going all Python allocators, especially as new memory tracing tools are coming online in 3.4. Given that these changes could have some impact on current extension writers I thought I'd bring this up on the list to gather opinions. Thoughts? ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
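The "memory tracing tools coming online in 3.4" are exposed as the standard tracemalloc module; a minimal sketch of what routing allocations through the Python allocators buys (plain bytearrays are used here, since whether a given NumPy build's data buffers are traced depends on how it allocates):

```python
import tracemalloc

tracemalloc.start()

# ~5 MB allocated through Python's allocators, so tracemalloc sees it.
data = [bytearray(10**6) for _ in range(5)]

current, peak = tracemalloc.get_traced_memory()
print(current, peak)   # traced bytes currently held, and the high-water mark
tracemalloc.stop()
```

Allocations made with raw malloc() bypass this machinery entirely, which is the argument for moving NumPy's internal allocations onto the Python allocator families.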
Re: [Numpy-discussion] adding fused multiply and add to numpy
Good question: where do we stop? I think, as you do, that an fma with guarantees is a good new feature. But if it is made available, people will want to use it for speed. Some people won't want another library or dependency, and they won't like random speedups or slowdowns. So why not add madd and fma, and draw the line at the operations for which the CPU implements a fused version? That would make a sensible limit, I think. Anyway, we won't use it directly. This is just my thought. Do you know whether gcc uses those instructions automatically when given the right architecture flags? Fred On Thu, Jan 9, 2014 at 12:07 PM, Nathaniel Smith n...@pobox.com wrote: On Thu, Jan 9, 2014 at 3:30 PM, Julian Taylor jtaylor.deb...@googlemail.com wrote: On Thu, Jan 9, 2014 at 3:50 PM, Frédéric Bastien no...@nouiz.org wrote: How hard would it be to give the user the choice? We could provide 2 functions like: fma_fast() fma_prec() (for precision)? Or this could be a parameter or a user configuration option like for the overflow/underflow error. I like Freddie Witherden's proposal to name the function madd, which does not guarantee one rounding operation. This leaves the namespace open for a special fma function with that guarantee. It can use the libc fma function which is very slow sometimes but platform independent. This is assuming apple did not again take shortcuts like they did with their libc hypot implementation; can someone disassemble apple libc to check what they are doing for C99 fma? And it leaves users the possibility to use the faster madd function if they do not need the precision guarantee. If madd doesn't provide any rounding guarantees, then its only reason for existence is that it provides a fused a*b+c loop that better utilizes memory bandwidth, right?
I'm guessing that speed-wise it doesn't really matter whether you use the fancy AVX instructions or not, since even the naive implementation is memory bound -- the advantage is just in the fusion? Lack of loop fusion is obviously a major limitation of numpy, but it's a very general problem. I'm sceptical about whether we want to get into the business of adding functions whose only purpose is to provide pre-fused loops. After madd, what other operations should we provide like this? msub (a*b-c)? add3 (a+b+c)? maddm (a*b+c*d)? mult3 (a*b*c)? How do we decide? Surely it's better to direct people who are hitting memory bottlenecks to much more powerful and general solutions to this problem, like numexpr/cython/numba/theano? (OTOH the version that gives rounding guarantees is obviously a unique new feature.) -n ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
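Short of true loop fusion, NumPy itself offers a partial workaround for the temporary-allocation cost Nathaniel mentions: the out= argument lets one preallocated buffer serve both steps of a*b + c. A sketch (this still makes two passes over memory, so it is not a substitute for a fused loop):

```python
import numpy as np

a = np.random.rand(1000)
b = np.random.rand(1000)
c = np.random.rand(1000)

# Naive: allocates a temporary for a*b, then another array for the sum.
naive = a * b + c

# Reuse one preallocated buffer for both steps; no extra temporaries.
out = np.empty_like(a)
np.multiply(a, b, out=out)
np.add(out, c, out=out)

print(np.allclose(naive, out))   # True
```

Tools like numexpr go further by actually fusing the loop, evaluating the whole expression in cache-sized blocks.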
Re: [Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array
Hi, As I said, I don't think Theano swaps the stride buffer. Most of the time we allocate with PyArray_Empty or PyArray_Zeros (not sure of the capitalization). The only exception I remember was changed in the last release to use PyArray_NewFromDescr(). Before that, we were allocating the PyArray with the right number of dimensions and then manually filling the ptr, shapes and strides. I don't recall any swapping of the shape and stride pointers in Theano. So I don't see why Theano would prevent doing just one malloc for the struct and the shapes/strides. If it does, tell me and I'll fix Theano :) I don't want Theano to prevent optimization in NumPy. Theano now completely supports the new NumPy C-API interface. Nathaniel also said that resizing the PyArray could prevent this. When Theano calls PyArray_Resize (not sure of the exact name), we always keep the number of dimensions the same, but I don't know if other code does differently. That could be a reason to keep separate allocations. I don't know of any software that manually frees the strides/shapes pointer to swap it. So I also think your suggestion to change PyDimMem_NEW to call the small allocator is good. The new interface prevents people from doing that anyway, I think. Do we need to wait until we completely remove the old interface for this? Fred On Wed, Jan 8, 2014 at 1:13 PM, Julian Taylor jtaylor.deb...@googlemail.com wrote: On 18.07.2013 15:36, Nathaniel Smith wrote: On Wed, Jul 17, 2013 at 5:57 PM, Frédéric Bastien no...@nouiz.org wrote: On Wed, Jul 17, 2013 at 10:39 AM, Nathaniel Smith n...@pobox.com wrote: On Tue, Jul 16, 2013 at 11:55 AM, Nathaniel Smith n...@pobox.com wrote: It's entirely possible I misunderstood, so let's see if we can work it out. I know that you want to assign to the ->data pointer in a PyArrayObject, right? That's what caused some trouble with the 1.7 API deprecations, which were trying to prevent direct access to this field?
Creating a new array given a pointer to a memory region is no problem, and obviously will be supported regardless of any optimizations. But if that's all you were doing then you shouldn't have run into the deprecation problem. Or maybe I'm misremembering! What is currently done in only one place is to create a new PyArrayObject with a given ptr, so NumPy doesn't do the allocation. We later change that ptr to another one. Hmm, OK, so that would still work. If the array has the OWNDATA flag set (or you otherwise know where the data came from), then swapping the data pointer would still work. The change would be that in most cases when asking numpy to allocate a new array from scratch, the OWNDATA flag would not be set. That's because the OWNDATA flag really means when this object is deallocated, call free(self->data), but if we allocate the array struct and the data buffer together in a single memory region, then deallocating the object will automatically cause the data buffer to be deallocated as well, without the array destructor having to take any special effort. It is the change to the ptr of the just-created PyArrayObject that caused problems with the interface deprecation. I fixed all the other problems related to the deprecation (mostly just renames of functions/macros), but I didn't fix this one yet. I would need to change the logic to compute the final ptr before creating the PyArrayObject object and create it with the final data ptr. But in all cases, NumPy didn't allocate the data memory for this object, so this case doesn't block your optimization. Right. One thing on our optimization wish list is to reuse allocated PyArrayObjects between Theano function calls for intermediate results (so completely under Theano control). This could be useful in particular for reshape/transpose/subtensor. Those functions are pretty fast and, from memory, I had already found the allocation time to be significant.
But in those cases, it is PyArrayObjects that are views, so the metadata and the data would be in different memory regions in all cases. The other case on the optimization wish list is reusing the PyArrayObject when the shape isn't the right one (but the number of dimensions is the same). If we do that for operations like addition, we will need to use PyArray_Resize(). This will be done on PyArrayObjects whose data memory was allocated by NumPy. So if you do one memory allocation for metadata and data, just make sure that PyArray_Resize() will handle that correctly. I'm not sure I follow the details here, but it does turn out that a really surprising amount of time in PyArray_NewFromDescr is spent in just calculating and writing out the shape and strides buffers, so for programs that e.g. use hundreds of small 3-element arrays to represent points in space, re-using even these buffers might be a big win... On the usefulness of doing only one memory allocation: on our old gpu ndarray, we were doing two allocations on the GPU, one for metadata and one for data. I removed
Re: [Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array
On Wed, Jan 8, 2014 at 3:40 PM, Nathaniel Smith n...@pobox.com wrote: On Wed, Jan 8, 2014 at 12:13 PM, Julian Taylor jtaylor.deb...@googlemail.com wrote: On 18.07.2013 15:36, Nathaniel Smith wrote: On Wed, Jul 17, 2013 at 5:57 PM, Frédéric Bastien no...@nouiz.org wrote: On the usefulness of doing only one memory allocation: on our old gpu ndarray, we were doing two allocations on the GPU, one for metadata and one for data. I removed this, as it was a bottleneck. Allocation on the CPU is faster than on the GPU, but this is still something that is slow unless you reuse memory. Does PyMem_Malloc reuse previous small allocations? Yes, at least in theory PyMem_Malloc is highly-optimized for small buffer re-use. (For requests over 256 bytes it just calls malloc().) And it's possible to define type-specific freelists; not sure if there's any value in doing that for PyArrayObjects. See Objects/obmalloc.c in the Python source tree. PyMem_Malloc is just a wrapper around malloc, so it's only as optimized as the C library is (glibc is not good for small allocations). PyObject_Malloc uses a small object allocator for requests smaller than 512 bytes (256 in python2). Right, I meant PyObject_Malloc of course. I filed a pull request [0] replacing a few functions which I think are safe to convert to this API: the nditer allocation, which is completely encapsulated, and the construction of the scalar and array python objects, which are deleted via the tp_free slot (we really should not support third party libraries using PyMem_Free on python objects without checks). This already gives up to 15% improvements for scalar operations compared to glibc 2.17 malloc. Do I understand the discussions here right that we could replace PyDimMem_NEW, which is used for strides in PyArray, with the small object allocation too? It would still allow swapping the stride buffer, but every application must then delete it with PyDimMem_FREE, which should be a reasonable requirement. That sounds reasonable to me.
If we wanted to get even more elaborate, we could by default stick the shape/strides into the same allocation as the PyArrayObject, and then defer allocating a separate buffer until someone actually calls PyArray_Resize. (With a new flag, similar to OWNDATA, that tells us whether we need to free the shape/stride buffer when deallocating the array.) It's got to be a vanishingly small proportion of arrays where PyArray_Resize is actually called, so for most arrays, this would let us skip the allocation entirely, and the only cost would be that for arrays where PyArray_Resize *is* called to add new dimensions, we'd leave the original buffers sitting around until the array was freed, wasting a tiny amount of memory. Given that no-one has noticed that currently *every* array wastes 50% of this much memory (see upthread), I doubt anyone will care... Seems like a good plan. When is it planned to remove the old interface? We can't do this before then, I think. Fred ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Theano 0.6 released
What's New -- We recommend that everybody update to this version. Highlights (since 0.6rc5): * Last release with support for Python 2.4 and 2.5. * We will try to release more frequently. * Fix crash/installation problems. * Use less memory for conv3d2d. 0.6rc4 skipped for a technical reason. Highlights (since 0.6rc3): * Python 3.3 compatibility with buildbot test for it. * Full advanced indexing support. * Better Windows 64 bit support. * New profiler. * Better error messages that help debugging. * Better support for newer NumPy versions (remove useless warning/crash). * Faster optimization/compilation for big graphs. * Moved the Conv3d2d implementation into Theano. * Better SymPy/Theano bridge: make a Theano op from a SymPy expression and use the SymPy C code generator. * Bug fixes. Changes from 0.6rc5: * Fix crash when specifying march in the cxxflags Theano flag. (Frederic B., reported by FiReTiTi) * Code cleanup. (Jorg Bornschein) * Fix Canopy installation on Windows when it was installed for all users. (Raingo) * Fix Theano tests due to a SciPy change. (Frederic B.) * Work around bug introduced in SciPy dev 0.14. (Frederic B.) * Fix Theano tests following bugfix in SciPy. (Frederic B., reported by Ziyuan Lin) * Add Theano flag cublas.lib. (Misha Denil) * Make conv3d2d work more in place (so less memory usage). (Frederic B., reported by Jean-Philippe Ouellet) See https://pypi.python.org/pypi/Theano for more details. Download and Install You can download Theano from http://pypi.python.org/pypi/Theano Installation instructions are available at http://deeplearning.net/software/theano/install.html Description --- Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It is built on top of NumPy. Theano features: * tight integration with NumPy: a similar interface to NumPy's. numpy.ndarrays are also used internally in Theano-compiled functions.
* transparent use of a GPU: perform data-intensive computations up to 140x faster than on a CPU (support for float32 only). * efficient symbolic differentiation: Theano can compute derivatives for functions of one or many inputs. * speed and stability optimizations: avoid nasty bugs when computing expressions such as log(1+ exp(x)) for large values of x. * dynamic C code generation: evaluate expressions faster. * extensive unit-testing and self-verification: includes tools for detecting and diagnosing bugs and/or potential problems. Theano has been powering large-scale computationally intensive scientific research since 2007, but it is also approachable enough to be used in the classroom (IFT6266 at the University of Montreal). Resources - About Theano: http://deeplearning.net/software/theano/ Theano-related projects: http://github.com/Theano/Theano/wiki/Related-projects About NumPy: http://numpy.scipy.org/ About SciPy: http://www.scipy.org/ Machine Learning Tutorial with Theano on Deep Architectures: http://deeplearning.net/tutorial/ Acknowledgments --- I would like to thank all contributors of Theano. For this particular release (since 0.5), many people have helped, notably: Frederic Bastien Pascal Lamblin Ian Goodfellow Olivier Delalleau Razvan Pascanu abalkin Arnaud Bergeron Nicolas Bouchard + Jeremiah Lowin + Matthew Rocklin Eric Larsen + James Bergstra David Warde-Farley John Salvatier + Vivek Kulkarni + Yann N. 
Dauphin Ludwig Schmidt-Hackenberg + Gabe Schwartz + Rami Al-Rfou' + Guillaume Desjardins Caglar + Sigurd Spieckermann + Steven Pigeon + Bogdan Budescu + Jey Kottalam + Mehdi Mirza + Alexander Belopolsky + Ethan Buchman + Jason Yosinski Nicolas Pinto + Sina Honari + Ben McCann + Graham Taylor Hani Almousli Ilya Dyachenko + Jan Schlüter + Jorg Bornschein + Micky Latowicki + Yaroslav Halchenko + Eric Hunsberger + Amir Elaguizy + Hannes Schulz + Huy Nguyen + Ilan Schnell + Li Yao Misha Denil + Robert Kern + Sebastian Berg + Vincent Dumoulin + Wei Li + XterNalz + A total of 51 people contributed to this release. People with a + by their names contributed a patch for the first time. Also, thank you to all NumPy and Scipy developers as Theano builds on their strengths. All questions/comments are always welcome on the Theano mailing-lists ( http://deeplearning.net/software/theano/#community ) ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] nasty bug in 1.8.0??
It is the NPY_RELAXED_STRIDES_CHECKING=1 flag that caused this. Fred On Mon, Dec 2, 2013 at 2:18 PM, Neal Becker ndbeck...@gmail.com wrote: I built using: CFLAGS='-march=native -O3' NPY_RELAXED_STRIDES_CHECKING=1 python3 setup.py install --user Daπid wrote: I get: In [4]: x.strides Out[4]: (8,) Same architecture and OS, Numpy installed via Pip on Python 2.7.5. On 2 December 2013 20:08, Neal Becker ndbeck...@gmail.com wrote: This is np 1.8.0 on fedora x86_64: In [5]: x =np.array ((1,)) In [6]: x.shape Out[6]: (1,) In [7]: x.strides Out[7]: (9223372036854775807,) ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] nasty bug in 1.8.0??
There is a way to compile NumPy so it uses strange strides for dimensions with a shape of 1. This is done to help developers test that their code doesn't rely on those strides. There was never a guarantee about the value of strides in those cases; most of the time it was the same, but in some cases it was different. Using such strange strides will cause a segfault if you actually use them, so it lets you see whether you rely on them. In Theano, we did some assertions on strides and checked them for optimized calls to BLAS, so we will need to change some code to support this. But I don't think those strange strides should happen in the wild. Did you install NumPy manually? Fred On Mon, Dec 2, 2013 at 2:14 PM, Daπid davidmen...@gmail.com wrote: I get: In [4]: x.strides Out[4]: (8,) Same architecture and OS, Numpy installed via Pip on Python 2.7.5. On 2 December 2013 20:08, Neal Becker ndbeck...@gmail.com wrote: This is np 1.8.0 on fedora x86_64: In [5]: x =np.array ((1,)) In [6]: x.shape Out[6]: (1,) In [7]: x.strides Out[7]: (9223372036854775807,) ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
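The advice above (never branch on the stride of a length-1 axis) can be followed by querying the flags instead; a small sketch:

```python
import numpy as np

x = np.ones((3, 1, 4))

# Fragile: under relaxed strides checking, the stride of the length-1
# axis is unspecified and may be an arbitrary, even huge, value.
print(x.strides)

# Robust: ask NumPy about contiguity directly through the flags.
print(x.flags['C_CONTIGUOUS'])   # True
```

A BLAS wrapper that checks x.flags (or recomputes effective strides while skipping size-1 axes) keeps working under both the old and the relaxed stride conventions.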
Re: [Numpy-discussion] nasty bug in 1.8.0??
Just build without NPY_RELAXED_STRIDES_CHECKING to get the old behavior, I think (which did not always produce the same strides, depending on how the array was created; I don't know if they changed that or not). Does someone else recall the details of this? Fred p.s. I didn't do this or ask for it, but it helps you test that your software doesn't depend on the strides when a shape is 1. On Mon, Dec 2, 2013 at 2:35 PM, Neal Becker ndbeck...@gmail.com wrote: I don't think that behavior is acceptable. Frédéric Bastien wrote: It is the NPY_RELAXED_STRIDES_CHECKING=1 flag that caused this. Fred On Mon, Dec 2, 2013 at 2:18 PM, Neal Becker ndbeck...@gmail.com wrote: I built using: CFLAGS='-march=native -O3' NPY_RELAXED_STRIDES_CHECKING=1 python3 setup.py install --user Daπid wrote: I get: In [4]: x.strides Out[4]: (8,) Same architecture and OS, Numpy installed via Pip on Python 2.7.5. On 2 December 2013 20:08, Neal Becker ndbeck...@gmail.com wrote: This is np 1.8.0 on fedora x86_64: In [5]: x =np.array ((1,)) In [6]: x.shape Out[6]: (1,) In [7]: x.strides Out[7]: (9223372036854775807,) ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Silencing NumPy output
Hi, After more investigation, I found that there already exists a way to suppress those messages on POSIX systems, so I reused it in the PR. That was faster and avoids changes in that area, so there is less chance of breaking other systems: https://github.com/numpy/numpy/pull/4081 It removes the stdout output when we run this command: numpy.distutils.system_info.get_info(blas_opt) But during compilation, we still have the info about what is found: atlas_blas_threads_info: Setting PTATLAS=ATLAS Setting PTATLAS=ATLAS customize Gnu95FCompiler Found executable /usr/bin/gfortran customize Gnu95FCompiler customize Gnu95FCompiler using config compiling '_configtest.c': /* This file is generated from numpy/distutils/system_info.py */ void ATL_buildinfo(void); int main(void) { ATL_buildinfo(); return 0; } C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O2 -fPIC compile options: '-c' gcc: _configtest.c gcc -pthread _configtest.o -L/usr/lib64/atlas -lptf77blas -lptcblas -latlas -o _configtest success!
removing: _configtest.c _configtest.o _configtest Setting PTATLAS=ATLAS FOUND: libraries = ['ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/usr/lib64/atlas'] language = c define_macros = [('ATLAS_INFO', '\\3.8.3\\')] include_dirs = ['/usr/include'] FOUND: libraries = ['ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/usr/lib64/atlas'] language = c define_macros = [('ATLAS_INFO', '\\3.8.3\\')] include_dirs = ['/usr/include'] non-existing path in 'numpy/lib': 'benchmarks' lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in ['/opt/lisa/os_v2/common/Canopy_64bit/User/lib', '/usr/local/lib64', '/usr/local/lib', '/usr/lib64', '/usr/lib'] NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /opt/lisa/os_v2/common/Canopy_64bit/User/lib libraries lapack_atlas not found in /opt/lisa/os_v2/common/Canopy_64bit/User/lib libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib64 libraries lapack_atlas not found in /usr/local/lib64 libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/lib64/atlas numpy.distutils.system_info.atlas_threads_info Setting PTATLAS=ATLAS Setting PTATLAS=ATLAS FOUND: libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/usr/lib64/atlas'] language = f77 define_macros = [('ATLAS_INFO', '\\3.8.3\\')] include_dirs = ['/usr/include'] FOUND: libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/usr/lib64/atlas'] language = f77 define_macros = [('ATLAS_INFO', '\\3.8.3\\')] include_dirs = ['/usr/include'] Frédéric On Fri, Nov 22, 2013 at 4:26 PM, Frédéric Bastien no...@nouiz.org wrote: I didn't forgot this, but I got side tracked. Here is the Theano code I would like to try to use to replace os.system: https://github.com/Theano/Theano/blob/master/theano/misc/windows.py But I won't be able to try this before next week. 
Fred On Fri, Nov 15, 2013 at 5:49 PM, David Cournapeau courn...@gmail.com wrote: On Fri, Nov 15, 2013 at 7:41 PM, Robert Kern robert.k...@gmail.com wrote: On Fri, Nov 15, 2013 at 7:28 PM, David Cournapeau courn...@gmail.com wrote: On Fri, Nov 15, 2013 at 6:21 PM, Charles R Harris charlesr.har...@gmail.com wrote: Sure, give it a shot. Looks like subprocess.Popen was intended to replace os.system in any case. Except that output is not 'real time' with straight Popen, and doing so reliably on every platform (cough - windows - cough) is not completely trivial. You also have to handle buffered output, etc... That code is very fragile, so this would be quite a lot of testing to change, and I am not sure it worths it. It doesn't have to be real time. Just use .communicate() and print out the stdout and stderr to their appropriate streams after the subprocess finishes. Indeed, it does not have to be, but that's useful for debugging compilation issues (not so much for numpy itself, but for some packages which have files that takes a very long time to build, like scipy.sparsetools or bottleneck). That's a minor point compared to the potential issues when building on windows, though. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] MKL + CPU, GPU + cuBLAS comparison
We have such a benchmark in Theano: https://github.com/Theano/Theano/blob/master/theano/misc/check_blas.py#L177 HTH Fred On Tue, Nov 26, 2013 at 7:10 AM, Dinesh Vadhia dineshbvad...@hotmail.com wrote: Jerome, Thanks for the swift response and tests. Crikey, that is a significant difference at first glance. Would it be possible to compare a BLAS computation eg. matrix-vector or matrix-matrix calculation? Thx! ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
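Theano's check_blas.py linked above essentially times large gemm calls; a stripped-down NumPy sketch of the same measurement (the 2*n^3 flop count for gemm is standard; numbers are machine- and BLAS-dependent):

```python
import time
import numpy as np

n, iters = 512, 10
a = np.random.rand(n, n)
b = np.random.rand(n, n)

np.dot(a, b)                              # warm-up (thread pools, caches)
t0 = time.time()
for _ in range(iters):
    np.dot(a, b)
dt = time.time() - t0

gflops = iters * 2.0 * n**3 / dt / 1e9    # 2*n^3 flops per gemm
print("gemm: %.2f GFLOP/s" % gflops)
```

Running this under a reference BLAS, MKL/ATLAS, and a GPU library on the same sizes gives the kind of apples-to-apples comparison asked about in the thread.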
Re: [Numpy-discussion] Silencing NumPy output
I didn't forget this, but I got sidetracked. Here is the Theano code I would like to try to use to replace os.system: https://github.com/Theano/Theano/blob/master/theano/misc/windows.py But I won't be able to try this before next week. Fred On Fri, Nov 15, 2013 at 5:49 PM, David Cournapeau courn...@gmail.com wrote: On Fri, Nov 15, 2013 at 7:41 PM, Robert Kern robert.k...@gmail.com wrote: On Fri, Nov 15, 2013 at 7:28 PM, David Cournapeau courn...@gmail.com wrote: On Fri, Nov 15, 2013 at 6:21 PM, Charles R Harris charlesr.har...@gmail.com wrote: Sure, give it a shot. Looks like subprocess.Popen was intended to replace os.system in any case. Except that output is not 'real time' with straight Popen, and doing so reliably on every platform (cough - windows - cough) is not completely trivial. You also have to handle buffered output, etc... That code is very fragile, so this would be quite a lot of testing to change, and I am not sure it's worth it. It doesn't have to be real time. Just use .communicate() and print out the stdout and stderr to their appropriate streams after the subprocess finishes. Indeed, it does not have to be, but that's useful for debugging compilation issues (not so much for numpy itself, but for some packages which have files that take a very long time to build, like scipy.sparsetools or bottleneck). That's a minor point compared to the potential issues when building on windows, though. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Silencing NumPy output
Hi, NumPy 1.8 removed the private NumPy interface numpy.distutils.__config__, so a Theano user made a PR to make Theano use the official interface: numpy.distutils.system_info.get_info(blas_opt) But this prints a lot of things to the output. I can silence part of it by silencing warnings, but I'm not able to silence this output: Found executable /usr/bin/gfortran ATLAS version 3.8.3 built by mockbuild on Wed Jul 28 02:12:34 UTC 2010: UNAME: Linux x86-15.phx2.fedoraproject.org 2.6.32-44.el6.x86_64 #1 SMP Wed Jul 7 15:47:50 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux INSTFLG : -1 0 -a 1 ARCHDEFS : -DATL_OS_Linux -DATL_ARCH_Corei7 -DATL_CPUMHZ=1596 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_USE64BITS -DATL_GAS_x8664 F2CDEFS : -DAdd_ -DF77_INTEGER=int -DStringSunStyle CACHEEDGE: 524288 F77 : gfortran, version GNU Fortran (GCC) 4.5.0 20100716 (Red Hat 4.5.0-3) F77FLAGS : -O -g -Wa,--noexecstack -fPIC -m64 SMC : gcc, version gcc (GCC) 4.5.0 20100716 (Red Hat 4.5.0-3) SMCFLAGS : -fomit-frame-pointer -mfpmath=sse -msse3 -O2 -fno-schedule-insns2 -g -Wa,--noexecstack -fPIC -m64 SKC : gcc, version gcc (GCC) 4.5.0 20100716 (Red Hat 4.5.0-3) SKCFLAGS : -fomit-frame-pointer -mfpmath=sse -msse3 -O2 -fno-schedule-insns2 -g -Wa,--noexecstack -fPIC -m64 -L/opt/lisa/os_v2/canopy/appdata/canopy-1.1.0.1371.rh5-x86_64/../../appdata/canopy-1.1.0.1371.rh5-x86_64/lib -lptf77blas -lptcblas -latlas I tried to redirect stdout and stderr, but it doesn't work. I looked into the NumPy code and I don't see a way to change this from a library that uses NumPy. Is there a way to silence that output? Is there a new place for the old numpy.distutils.__config__ interface that I can reuse? It doesn't need to be a public interface. thanks Frédéric ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
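Redirecting sys.stdout at the Python level cannot work here because the messages come from child processes (spawned via os.system) writing to the real file descriptor 1; a file-descriptor-level redirect does work on POSIX. A sketch of that approach, with silent_call a hypothetical helper name:

```python
import os
import sys

def silent_call(func, *args, **kwargs):
    """Run func with stdout (fd 1) pointed at /dev/null. POSIX-only sketch.

    Unlike swapping sys.stdout, this also silences child processes,
    which inherit and write to the real descriptor.
    """
    sys.stdout.flush()                       # don't lose buffered output
    saved = os.dup(1)
    devnull = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull, 1)                  # fd 1 now points at /dev/null
        return func(*args, **kwargs)
    finally:
        sys.stdout.flush()
        os.dup2(saved, 1)                    # restore the original stdout
        os.close(devnull)
        os.close(saved)

status = silent_call(os.system, "echo you will not see this")
print("exit status:", status)   # exit status: 0
```

The same dance with fd 2 covers stderr; Windows needs a different mechanism, which is why a cross-platform fix leans toward subprocess instead.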
Re: [Numpy-discussion] Silencing NumPy output
I found a line like this in the file numpy/distutils/fcompiler/__init__.py. I changed the -2 to 2, but it didn't change anything; in fact, this line wasn't called. The function set_verbosity() is called only once, with the value 0 (the default set at import). If I change that to 2 or -2, it does not change anything. I think the problem is related to exec_command, as you said: it seems to invoke commands in a way that doesn't go through the stdout/stderr set in Python, so I can't redirect them. The problem is that it uses os.system(), and we can't redirect its stdout/stderr. What about replacing the os.system call with subprocess.Popen? That would allow us to catch stdout/stderr. We use this call in Theano and it is compatible with Python 2.4. Fred On Fri, Nov 15, 2013 at 12:40 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Nov 15, 2013 at 10:31 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Nov 15, 2013 at 8:12 AM, Frédéric Bastien no...@nouiz.org wrote: Hi, NumPy 1.8 removed the private NumPy interface numpy.distutils.__config__. So a Theano user made a PR to make Theano use the official interface: numpy.distutils.system_info.get_info(blas_opt) But this prints a lot of things to the output.
I can silence part of it by silencing warnings, but I'm not able to silence this output: Found executable /usr/bin/gfortran ATLAS version 3.8.3 built by mockbuild on Wed Jul 28 02:12:34 UTC 2010: UNAME: Linux x86-15.phx2.fedoraproject.org 2.6.32-44.el6.x86_64 #1 SMP Wed Jul 7 15:47:50 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux INSTFLG : -1 0 -a 1 ARCHDEFS : -DATL_OS_Linux -DATL_ARCH_Corei7 -DATL_CPUMHZ=1596 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_USE64BITS -DATL_GAS_x8664 F2CDEFS : -DAdd_ -DF77_INTEGER=int -DStringSunStyle CACHEEDGE: 524288 F77 : gfortran, version GNU Fortran (GCC) 4.5.0 20100716 (Red Hat 4.5.0-3) F77FLAGS : -O -g -Wa,--noexecstack -fPIC -m64 SMC : gcc, version gcc (GCC) 4.5.0 20100716 (Red Hat 4.5.0-3) SMCFLAGS : -fomit-frame-pointer -mfpmath=sse -msse3 -O2 -fno-schedule-insns2 -g -Wa,--noexecstack -fPIC -m64 SKC : gcc, version gcc (GCC) 4.5.0 20100716 (Red Hat 4.5.0-3) SKCFLAGS : -fomit-frame-pointer -mfpmath=sse -msse3 -O2 -fno-schedule-insns2 -g -Wa,--noexecstack -fPIC -m64 -L/opt/lisa/os_v2/canopy/appdata/canopy-1.1.0.1371.rh5-x86_64/../../appdata/canopy-1.1.0.1371.rh5-x86_64/lib -lptf77blas -lptcblas -latlas I tried to redirect the stdout and stderr, but it don't work. I looked into NumPy code and I don't see a way to change that from a library that use NumPy. Is there a way to access to silence that output? Is there a new place of the old interface: numpy.distutils.__config__ that I can reuse? It don't need to be a public interface. Looks like the problem is in numpy/distutils/exec_command.py and numpy/distutils/log.py. In particular, it looks like a logging problem and I'd guess it may be connected to the debug logs. Also, looks like numpy.distutils.log inherits from distutils.log, which may be obsolete. You might get some control of the log with an environment variable, but the function itself looks largely undocumented. That said, it should probably be printing to stderror when run from the command line. 
In numpy/distutils/__init__.py, line 886, try changing log.set_verbosity(-2) to log.set_verbosity(2). Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
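Chuck's suggestion above targets the logger, but part of the output comes from a child process spawned via os.system writing straight to the process's file descriptors, which reassigning sys.stdout in Python cannot intercept. Below is a minimal sketch, not code from NumPy or Theano, of descriptor-level suppression with os.dup2:

```python
import os
import sys

class SuppressFDOutput:
    """Context manager that silences stdout/stderr at the file-descriptor
    level, so output written by child processes (e.g. via os.system) is
    hidden too, not just output from Python-level print calls."""

    def __enter__(self):
        sys.stdout.flush()
        sys.stderr.flush()
        # Save copies of the real stdout/stderr descriptors.
        self.saved = (os.dup(1), os.dup(2))
        # Point fds 1 and 2 at /dev/null for the duration of the block.
        self.devnull = os.open(os.devnull, os.O_WRONLY)
        os.dup2(self.devnull, 1)
        os.dup2(self.devnull, 2)
        return self

    def __exit__(self, *exc):
        # Restore the original descriptors and close our copies.
        os.dup2(self.saved[0], 1)
        os.dup2(self.saved[1], 2)
        for fd in (self.saved[0], self.saved[1], self.devnull):
            os.close(fd)

with SuppressFDOutput():
    os.system('echo "this is hidden"')
print("visible again")
```

Anything written to fd 1 or 2 inside the with block, including by subprocesses, goes to /dev/null; redirecting only sys.stdout would miss the subprocess output.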
Re: [Numpy-discussion] Silencing NumPy output
If that doesn't change, it currently means that each process that uses Theano with BLAS will get this printed:
Found executable /usr/bin/gfortran ATLAS version 3.8.3 built by mockbuild on Wed Jul 28 02:12:34 UTC 2010: UNAME: Linux x86-15.phx2.fedoraproject.org 2.6.32-44.el6.x86_64 #1 SMP Wed Jul 7 15:47:50 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux INSTFLG : -1 0 -a 1 ARCHDEFS : -DATL_OS_Linux -DATL_ARCH_Corei7 -DATL_CPUMHZ=1596 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_USE64BITS -DATL_GAS_x8664 F2CDEFS : -DAdd_ -DF77_INTEGER=int -DStringSunStyle CACHEEDGE: 524288 F77 : gfortran, version GNU Fortran (GCC) 4.5.0 20100716 (Red Hat 4.5.0-3) F77FLAGS : -O -g -Wa,--noexecstack -fPIC -m64 SMC : gcc, version gcc (GCC) 4.5.0 20100716 (Red Hat 4.5.0-3) SMCFLAGS : -fomit-frame-pointer -mfpmath=sse -msse3 -O2 -fno-schedule-insns2 -g -Wa,--noexecstack -fPIC -m64 SKC : gcc, version gcc (GCC) 4.5.0 20100716 (Red Hat 4.5.0-3) SKCFLAGS : -fomit-frame-pointer -mfpmath=sse -msse3 -O2 -fno-schedule-insns2 -g -Wa,--noexecstack -fPIC -m64
Not very nice. As I said, in Theano we do this and it works on Mac, Linux and Windows, in 32 and 64 bits. Yes, we have our own wrapper around it to handle Windows, but since this is already done and tested in those cases, I think it is a good idea to do it. Do you think it should be tested in other environments? thanks Frédéric p.s. There are also warnings printed, but I can hide those without changes to NumPy. p.p.s. The first line isn't yet removed in my local tests, so maybe more is needed.
On Fri, Nov 15, 2013 at 2:28 PM, David Cournapeau courn...@gmail.com wrote: On Fri, Nov 15, 2013 at 6:21 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Nov 15, 2013 at 11:05 AM, Frédéric Bastien no...@nouiz.org wrote: I found a line like this in the file numpy/distutils/fcompiler/__init__.py. I changed the -2 to 2, but it didn't change anything; in fact, this line was never reached. The function set_verbosity() is called only once, with the value 0.
The default value is set at import. If I change it to 2 or -2, nothing changes. I think the problem is related to exec_command, as you said. It seems it is called in a way that doesn't use the stdout/stderr set in Python, so I can't redirect it. The problem is that it uses os.system(), and we can't redirect its stdout/stderr. What about replacing the os.system call with subprocess.Popen? This would allow us to catch the stdout/stderr. We use this call in Theano and it is compatible with Python 2.4.
Numpy 1.8 doesn't support Python 2.4 in any case, so that isn't a problem ;) Sure, give it a shot. Looks like subprocess.Popen was intended to replace os.system in any case.
Except that output is not 'real time' with straight Popen, and doing so reliably on every platform (cough - windows - cough) is not completely trivial. You also have to handle buffered output, etc. That code is very fragile, so this would require quite a lot of testing to change, and I am not sure it is worth it. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
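For reference, a minimal sketch of what an os.system replacement based on subprocess.Popen could look like, capturing stdout/stderr instead of letting them reach the terminal. The function name and behaviour here are illustrative assumptions, not NumPy's actual exec_command, and David's caveats (no real-time output, platform quirks, buffering) still apply:

```python
import subprocess

def exec_command(command):
    """Run a shell command and capture its combined stdout/stderr,
    instead of letting the child write to the terminal as os.system does.
    Returns (exit_status, output). Sketch only, not NumPy's exec_command."""
    proc = subprocess.Popen(
        command,
        shell=True,                   # mimic os.system's shell semantics
        stdout=subprocess.PIPE,       # capture instead of inheriting fd 1
        stderr=subprocess.STDOUT,     # merge stderr into the same stream
        universal_newlines=True,      # decode bytes to str
    )
    output, _ = proc.communicate()
    return proc.returncode, output

status, output = exec_command('echo captured')
```

The caller can then decide whether to log, discard, or re-emit the captured text, which is exactly the control that os.system denies.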
[Numpy-discussion] How we support new and old NumPy C API.
Hi, With recent versions of NumPy, compiling C code raises a deprecation warning by default. To remove it, we must use only the new NumPy C API and define a macro. The new API only exists for NumPy 1.6 and later, so if we want to support older NumPy versions we need to do more work. As Theano compiles many C files that include NumPy, this generated too many warnings for the user, so I spent about two weeks updating Theano. Here is what I did, in the hope that it helps others. In particular, I think Cython could do the same; currently it uses only the old interface because it wants to support old NumPy versions.
1) Define this macro when compiling against numpy: NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION
2) Replace uses of the old macros with the new ones, and define macros that map the new names to the old ones when compiling with old NumPy versions: New macro = old macro NPY_ARRAY_ENSURECOPY=NPY_ENSURECOPY NPY_ARRAY_ALIGNED=NPY_ALIGNED NPY_ARRAY_WRITEABLE=NPY_WRITEABLE NPY_ARRAY_UPDATE_ALL=NPY_UPDATE_ALL NPY_ARRAY_C_CONTIGUOUS=NPY_C_CONTIGUOUS NPY_ARRAY_F_CONTIGUOUS=NPY_F_CONTIGUOUS
3) Do not access members of PyArrayObject directly; use the old macros (which are inline functions in newer NumPy). For example, change a_object->dimensions to PyArray_DIMS(a_object).
4) Another change is that the new API does not allow assignment to the BASE attribute of an ndarray; you must call a function instead. So we use this code, which works for all versions of NumPy: #if NPY_API_VERSION < 0x00000007 PyArray_BASE(xview) = py_%(x)s; #else PyArray_SetBaseObject(xview, py_%(x)s); #endif
5) The new interface has no way to modify the data pointer of an ndarray. The workaround we needed was to change our code so that we create the ndarray directly with the right data pointer. In the past, we created it with a temporary value, computed the one we wanted, and updated it afterwards. This was done in our subtensor code (a_ndarray[slice[,...]]).
6) Lastly, we have one C file that is generated from Cython code. We modified this code manually to be compatible with the new NumPy API so that it does not generate errors, since we disable the old NumPy interface.
Here is the PR to Theano for this: https://github.com/scikit-learn/scikit-learn/issues/2573 Hoping that this will help someone. Frédéric ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
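Step 1 above is typically done at build time rather than in the C source. A sketch of passing the macro through a setuptools Extension follows; the module and file names are hypothetical, and the NumPy include directory is left as a comment since it depends on the installed NumPy:

```python
from setuptools import Extension

# Build an extension against the NumPy 1.7+ API only. With this macro
# defined, any use of a deprecated NumPy C API becomes a compile error
# instead of a deprecation warning.
ext = Extension(
    "mymodule",                 # hypothetical module name
    sources=["mymodule.c"],     # hypothetical source file
    define_macros=[("NPY_NO_DEPRECATED_API", "NPY_1_7_API_VERSION")],
    # include_dirs=[numpy.get_include()],  # add the NumPy headers here
)
```

The extension would then be passed to setup(ext_modules=[ext]) as usual; the macro ends up on the compiler command line as -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION.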
Re: [Numpy-discussion] Numpy 1.8.0 release
Hi, This is to report that all Theano tests pass with the 1.8.x branch at commit 397fdec2a2c. thanks Frédéric
On Sun, Oct 20, 2013 at 1:35 PM, Charles R Harris charlesr.har...@gmail.com wrote: Hi All, I'm planning on releasing Numpy 1.8.0 next weekend. There have been a few minor fixes since 1.8.0rc2, but nothing that I think warrants another rc release. Please make sure to test the 1.8.0rc2 or maintenance/1.8.x branch with your code, for after next weekend it will be too late. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ANN: 1.8.0b2 release.
Hi, I created an Ubuntu VM, cloned numpy, and it compiled correctly there. I tried in my normal development environment (instead of a virtualenv) with my old, up-to-date clone of numpy and it failed. So this isn't a virtualenv issue. I created a new clone of numpy and it compiled correctly in my normal development environment. This seems to indicate that the __multiarray_api.* files are cached. In the new clone, I have this after compilation:
$find . -name '*multiarray_api*' ./build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.c ./build/src.linux-x86_64-2.7/numpy/core/include/numpy/multiarray_api.txt ./build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.h
In my old clone, I have this:
$cd ../numpy [bastienf@oolong numpy]$find . -name '*multiarray_api*' ./numpy/core/include/numpy/__multiarray_api.h ./build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.c ./build/src.linux-x86_64-2.7/numpy/core/include/numpy/multiarray_api.txt ./build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.h
So for some reason, the __multiarray_api.h file ended up in the numpy source tree. When this happens, every subsequent compilation reuses it instead of regenerating it. If, in the new numpy clone, I run: python setup.py build_ext --inplace I get:
$find . -name '*multiarray_api*' ./numpy/core/include/numpy/__multiarray_api.c ./numpy/core/include/numpy/multiarray_api.txt ./numpy/core/include/numpy/__multiarray_api.h ./build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.c ./build/src.linux-x86_64-2.7/numpy/core/include/numpy/multiarray_api.txt ./build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.h
So this could explain how I ended up with a numpy clone in such a state. I think numpy should always regenerate the auto-generated files even if they are present, or it should check whether they are older than their source files. Do you think this is the right solution too?
Does someone have guidance on where those files are generated? Do you support partial rebuilds as well as full rebuilds, and if so, how? Or is building in place simply not supported? In the .gitignore file, there are 41 files that seem to be generated. thanks Frédéric
On Mon, Sep 9, 2013 at 2:14 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Sep 9, 2013 at 12:04 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Sep 9, 2013 at 11:09 AM, Frédéric Bastien no...@nouiz.org wrote: I don't have CFLAGS defined. But I have other env variables that point to other Python things, like CPATH. But even in that case, I don't understand how other people could have compiled methods.c. The includes aren't part of the env variables, but in the file. Anyway, I think your PR is the right fix. I checked out your PR and now I have this new error:
gcc: numpy/core/src/multiarray/multiarraymodule.c In file included from numpy/core/src/multiarray/multiarraymodule.c:3753:0: build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.c:303:18: error: ‘PyArray_Partition’ undeclared here (not in a function) build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.c:304:18: error: ‘PyArray_ArgPartition’ undeclared here (not in a function) In file included from numpy/core/src/multiarray/multiarraymodule.c:3753:0: build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.c:303:18: error: ‘PyArray_Partition’ undeclared here (not in a function) build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.c:304:18: error: ‘PyArray_ArgPartition’ undeclared here (not in a function) error: Command gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -O2 -fPIC -DHAVE_NPY_CONFIG_H=1 -Inumpy/core/include -Ibuild/src.linux-x86_64-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include
-I/opt/lisa/os/epd-7.1.2/include/python2.7 -c numpy/core/src/multiarray/multiarraymodule.c -o build/temp.linux-x86_64-2.7/numpy/core/src/multiarray/multiarraymodule.o failed with exit status 1
So it seems we have the same problem with those two functions. They are defined in numpy/core/src/multiarray/item_selection.c, but not in the .h file. I'm going to guess that there is something special about your virtualenv. snip The prototypes should be in `arrayobject.h` and `ndarrayobject.h`. Perhaps old versions are being used and cached somewhere. Are they precompiled somewhere? Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
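A small sketch of how one might detect stale generated API files that leaked into the source tree, as described in this thread. The assumption, based on the directory listings above, is that freshly generated copies live under a build/ directory while stale ones sit anywhere else:

```python
from pathlib import Path

def find_stale_generated(root):
    """Return generated *multiarray_api* files that sit outside any
    build/ directory under `root`. Such stale copies shadow the freshly
    generated headers and can break subsequent compilations."""
    root = Path(root)
    return [
        path
        for path in root.rglob("*multiarray_api*")
        if "build" not in path.relative_to(root).parts
    ]
```

Running this from the top of a numpy clone and deleting whatever it reports (or simply using a fresh clone) would restore the expected state where generated files exist only under build/.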
Re: [Numpy-discussion] ANN: 1.8.0b2 release.
Hi, I checked out the dev version of numpy and it fails to compile with this error:
creating build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/numpy/core/src/multiarray compile options: '-DHAVE_NPY_CONFIG_H=1 -Inumpy/core/include -Ibuild/src.linux-x86_64-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/opt/lisa/os/epd-7.1.2/include/python2.7 -c' gcc: numpy/core/src/multiarray/sequence.c gcc: numpy/core/src/multiarray/descriptor.c gcc: numpy/core/src/multiarray/getset.c gcc: numpy/core/src/multiarray/arrayobject.c gcc: numpy/core/src/multiarray/methods.c numpy/core/src/multiarray/methods.c: In function ‘array_partition’: numpy/core/src/multiarray/methods.c:1199:38: error: ‘PyArray_SelectkindConverter’ undeclared (first use in this function) numpy/core/src/multiarray/methods.c:1199:38: note: each undeclared identifier is reported only once for each function it appears in numpy/core/src/multiarray/methods.c: In function ‘array_argpartition’: numpy/core/src/multiarray/methods.c:1316:38: error: ‘PyArray_SelectkindConverter’ undeclared (first use in this function) numpy/core/src/multiarray/methods.c:1352:9: warning: assignment makes pointer from integer without a cast numpy/core/src/multiarray/methods.c: In function ‘array_partition’: numpy/core/src/multiarray/methods.c:1199:38: error: ‘PyArray_SelectkindConverter’ undeclared (first use in this function) numpy/core/src/multiarray/methods.c:1199:38: note: each undeclared identifier is reported only once for each function it appears in numpy/core/src/multiarray/methods.c: In function ‘array_argpartition’: numpy/core/src/multiarray/methods.c:1316:38: error: ‘PyArray_SelectkindConverter’ undeclared (first use in this function) numpy/core/src/multiarray/methods.c:1352:9: warning: assignment makes pointer from integer without a cast error: Command gcc -pthread
-fno-strict-aliasing -g -O2 -DNDEBUG -O2 -fPIC -DHAVE_NPY_CONFIG_H=1 -Inumpy/core/include -Ibuild/src.linux-x86_64-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/opt/lisa/os/epd-7.1.2/include/python2.7 -c numpy/core/src/multiarray/methods.c -o build/temp.linux-x86_64-2.7/numpy/core/src/multiarray/methods.o failed with exit status 1
PyArray_SelectkindConverter is defined in numpy/core/src/multiarray/conversion_utils.c. methods.c includes conversion_utils.h, but there is no declaration of this function in that file. Is that normal? Fred
On Sun, Sep 8, 2013 at 8:55 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sun, Sep 8, 2013 at 6:36 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sun, Sep 8, 2013 at 3:45 PM, Christoph Gohlke cgoh...@uci.edu wrote: On 9/8/2013 12:14 PM, Charles R Harris wrote: Hi all, I'm happy to announce the second beta release of Numpy 1.8.0. This release should solve the Windows problems encountered in the first beta. Many thanks to Christoph Gohlke and Julian Taylor for their hard work in getting those issues settled. It would be good if folks running OS X could try out this release and report any issues on the numpy-dev mailing list. Unfortunately the files still need to be installed from source as dmg files are not available at this time. Source tarballs and release notes can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.8.0b2/. The Windows and OS X installers will follow when the infrastructure issues are dealt with. Chuck
Hello, I tested numpy 1.8.0b2 with Visual Studio and Intel MKL on Python 2.7 and 3.3 for Windows, 32 and 64 bit. There's only a single test failure on win-amd64-py3.3, which looks strange since the test expects a TypeError to be raised.
== ERROR: test_record_no_hash (test_multiarray.TestRecord) -- Traceback (most recent call last): File X:\Python33\lib\site-packages\numpy\core\tests\test_multiarray.py, line 2464, in test_record_no_hash self.assertRaises(TypeError, hash, a[0]) File X:\Python33\lib\unittest\case.py, line 570, in assertRaises return context.handle('assertRaises', callableObj, args, kwargs) File X:\Python33\lib\unittest\case.py, line 135, in handle callable_obj(*args, **kwargs) File X:\Python33\lib\unittest\case.py, line 153, in __exit__ self.obj_name)) TypeError: unhashable type: 'writeable void-scalar' Hmm, that *is* strange. I don't know what to make of the scipy errors at first glance. snip I'm going to guess self.assertRaises tried to hash it again, raising the error again, and we see the second one. The assertRaises
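As a standalone illustration of the assertRaises pattern under discussion, using a plain unhashable Python object rather than an actual writeable void scalar (which requires a NumPy record array):

```python
import unittest

class Unhashable(object):
    # Setting __hash__ to None makes instances unhashable, loosely
    # mimicking a writeable void scalar.
    __hash__ = None

class TestNoHash(unittest.TestCase):
    def test_no_hash(self):
        # hash() on an unhashable object raises TypeError; assertRaises
        # passes when, and only when, that exception is raised inside it.
        self.assertRaises(TypeError, hash, Unhashable())
```

If hash() raised before control ever reached assertRaises (for example, while building the arguments), the test would be reported as an error rather than a pass, which matches the confusing traceback quoted above.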
Re: [Numpy-discussion] ANN: 1.8.0b2 release.
I don't have CFLAGS defined. But I have other env variables that point to other Python things, like CPATH. But even in that case, I don't understand how other people could have compiled methods.c. The includes aren't part of the env variables, but in the file. Anyway, I think your PR is the right fix. I checked out your PR and now I have this new error:
gcc: numpy/core/src/multiarray/multiarraymodule.c In file included from numpy/core/src/multiarray/multiarraymodule.c:3753:0: build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.c:303:18: error: ‘PyArray_Partition’ undeclared here (not in a function) build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.c:304:18: error: ‘PyArray_ArgPartition’ undeclared here (not in a function) In file included from numpy/core/src/multiarray/multiarraymodule.c:3753:0: build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.c:303:18: error: ‘PyArray_Partition’ undeclared here (not in a function) build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.c:304:18: error: ‘PyArray_ArgPartition’ undeclared here (not in a function) error: Command gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -O2 -fPIC -DHAVE_NPY_CONFIG_H=1 -Inumpy/core/include -Ibuild/src.linux-x86_64-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/opt/lisa/os/epd-7.1.2/include/python2.7 -c numpy/core/src/multiarray/multiarraymodule.c -o build/temp.linux-x86_64-2.7/numpy/core/src/multiarray/multiarraymodule.o failed with exit status 1
So it seems we have the same problem with those two functions. They are defined in numpy/core/src/multiarray/item_selection.c, but not in the .h file.
thanks Fred On Mon, Sep 9, 2013 at 11:44 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Sep 9, 2013 at 9:33 AM, Frédéric Bastien no...@nouiz.org wrote: I tried it and retried and it still fail. This is in an virtualenv $git show commit c9b06111227f7a4ec213571f97e1b8d19b9c23f5 Merge: 73fbfb2 8edccea Author: Charles Harris charlesr.har...@gmail.com Date: Sun Sep 8 19:47:21 2013 -0700 Merge pull request #3701 from cgohlke/patch-2 ENH: add support for Python 3.4 ast.NameConstant $rm -rf build ## Fail as there is no such directory $pip install . # fail with the same error $pip uninstall numpy $python setup.py install --user # fail with the same error $pip install . ## fail with the same error: $git grep PyArray_SelectkindConverter doc/release/1.8.0-notes.rst:* PyArray_SelectkindConverter numpy/core/code_generators/numpy_api.py: 'PyArray_SelectkindConverter': 298, numpy/core/src/multiarray/conversion_utils.c:PyArray_SelectkindConverter(PyObject *obj, NPY_SELECTKIND *selectkind) numpy/core/src/multiarray/methods.c: PyArray_SelectkindConverter, sortkind, numpy/core/src/multiarray/methods.c: PyArray_SelectkindConverter, sortkind, Here I don't see PyArray_SelectkindConverter in conversion_utils.h as you said it is present. Witch commit do you use? It's not there, it is part of the API. I've got a PR to add it to the *.h file. The question is why you are the only person (so far) to have a problem compiling. What are your CFLAGS? 
Chuck Fred On Mon, Sep 9, 2013 at 11:02 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Sep 9, 2013 at 8:51 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Sep 9, 2013 at 7:46 AM, Frédéric Bastien no...@nouiz.orgwrote: Hi, I checkout the dev version of numpy and it fail to compile with this error: creating build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/numpy/core/src/multiarray compile options: '-DHAVE_NPY_CONFIG_H=1 -Inumpy/core/include -Ibuild/src.linux-x86_64-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/opt/lisa/os/epd-7.1.2/include/python2.7 -c' gcc: numpy/core/src/multiarray/sequence.c gcc: numpy/core/src/multiarray/descriptor.c gcc: numpy/core/src/multiarray/getset.c gcc: numpy/core/src/multiarray/arrayobject.c gcc: numpy/core/src/multiarray/methods.c numpy/core/src/multiarray/methods.c: In function ‘array_partition’: numpy/core/src/multiarray/methods.c:1199:38: error: ‘PyArray_SelectkindConverter’ undeclared (first use in this function) numpy/core/src/multiarray/methods.c:1199:38: note: each undeclared identifier is reported only once for each function it appears in numpy/core/src/multiarray/methods.c: In function ‘array_argpartition’: numpy/core/src/multiarray/methods.c:1316:38: error: ‘PyArray_SelectkindConverter’ undeclared (first use in this function) numpy/core/src/multiarray/methods.c:1352:9: warning: assignment makes pointer from integer
Re: [Numpy-discussion] ANN: 1.8.0b2 release.
I tried it and retried, and it still fails. This is in a virtualenv.
$git show commit c9b06111227f7a4ec213571f97e1b8d19b9c23f5 Merge: 73fbfb2 8edccea Author: Charles Harris charlesr.har...@gmail.com Date: Sun Sep 8 19:47:21 2013 -0700 Merge pull request #3701 from cgohlke/patch-2 ENH: add support for Python 3.4 ast.NameConstant
$rm -rf build ## Fails as there is no such directory $pip install . # fails with the same error $pip uninstall numpy $python setup.py install --user # fails with the same error $pip install . ## fails with the same error:
$git grep PyArray_SelectkindConverter doc/release/1.8.0-notes.rst:* PyArray_SelectkindConverter numpy/core/code_generators/numpy_api.py: 'PyArray_SelectkindConverter': 298, numpy/core/src/multiarray/conversion_utils.c:PyArray_SelectkindConverter(PyObject *obj, NPY_SELECTKIND *selectkind) numpy/core/src/multiarray/methods.c: PyArray_SelectkindConverter, sortkind, numpy/core/src/multiarray/methods.c: PyArray_SelectkindConverter, sortkind,
Here I don't see PyArray_SelectkindConverter in conversion_utils.h, where you said it is present. Which commit do you use?
Fred On Mon, Sep 9, 2013 at 11:02 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Sep 9, 2013 at 8:51 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Sep 9, 2013 at 7:46 AM, Frédéric Bastien no...@nouiz.org wrote: Hi, I checkout the dev version of numpy and it fail to compile with this error: creating build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/numpy/core/src/multiarray compile options: '-DHAVE_NPY_CONFIG_H=1 -Inumpy/core/include -Ibuild/src.linux-x86_64-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/opt/lisa/os/epd-7.1.2/include/python2.7 -c' gcc: numpy/core/src/multiarray/sequence.c gcc: numpy/core/src/multiarray/descriptor.c gcc: numpy/core/src/multiarray/getset.c gcc: numpy/core/src/multiarray/arrayobject.c gcc: numpy/core/src/multiarray/methods.c numpy/core/src/multiarray/methods.c: In function ‘array_partition’: numpy/core/src/multiarray/methods.c:1199:38: error: ‘PyArray_SelectkindConverter’ undeclared (first use in this function) numpy/core/src/multiarray/methods.c:1199:38: note: each undeclared identifier is reported only once for each function it appears in numpy/core/src/multiarray/methods.c: In function ‘array_argpartition’: numpy/core/src/multiarray/methods.c:1316:38: error: ‘PyArray_SelectkindConverter’ undeclared (first use in this function) numpy/core/src/multiarray/methods.c:1352:9: warning: assignment makes pointer from integer without a cast numpy/core/src/multiarray/methods.c: In function ‘array_partition’: numpy/core/src/multiarray/methods.c:1199:38: error: ‘PyArray_SelectkindConverter’ undeclared (first use in this function) numpy/core/src/multiarray/methods.c:1199:38: note: each undeclared identifier is reported only once for each function it appears in numpy/core/src/multiarray/methods.c: In function ‘array_argpartition’: 
numpy/core/src/multiarray/methods.c:1316:38: error: ‘PyArray_SelectkindConverter’ undeclared (first use in this function) numpy/core/src/multiarray/methods.c:1352:9: warning: assignment makes pointer from integer without a cast error: Command gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -O2 -fPIC -DHAVE_NPY_CONFIG_H=1 -Inumpy/core/include -Ibuild/src.linux-x86_64-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/opt/lisa/os/epd-7.1.2/include/python2.7 -c numpy/core/src/multiarray/methods.c -o build/temp.linux-x86_64-2.7/numpy/core/src/multiarray/methods.o failed with exit status 1 PyArray_SelectkindConverter is defined in numpy/core/src/multiarray/conversion_utils.c. methods.c include conversion_utils.h, but there is no fct declaration of this fct in this file. Is that normal? No, it looks like a bug. What is strange is that it doesn't show up on my machine. What compiler flags are you using? Could you make a PR for this? snip Wait a minute, it is in the API. Try a clean build and see what happens. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Minimal NumPy for distribution
Hi, thanks for the information. It is very useful to know that the c code could call back to python. Fred On Wed, Sep 4, 2013 at 2:14 PM, Nathaniel Smith n...@pobox.com wrote: There do exist numpy c functions that call .py file routines. I don't know how likely you are to find them in practice, but it definitely happens. You don't need .py files if you have .pyc files, and those can be compressed (python can import directly from .zip files). -n On 4 Sep 2013 18:52, Frédéric Bastien no...@nouiz.org wrote: Hi, I have done some exploratory work with Theano to generate a shared library from a Theano function. This link with numpy c api. If we want to distribute this library and call it from C and/or python, what is the minimal installed part of NumPy needed? I suppose that only the c api is needed. Do someone already checked that? From what I found on linux, numpy linked with a dynamic BLAS take 13MB. There is 2.7MB from .so, 5.3MB from .py and 3.8MB from .pyc. Can we just keep all .so and remove all .py and .pyc file? I suppose we need to modify the __init__.py file too. On Windows, with the unofficial 64 bit binary of NumPy linked with MKL, the numpy directory take 56MB. So I would also like to know witch shared library(dll) is needed. Here is the .so file found on linux: /lib/python2.7/site-packages/numpy/core/multiarray_tests.so /lib/python2.7/site-packages/numpy/core/multiarray.so /lib/python2.7/site-packages/numpy/core/_dotblas.so /lib/python2.7/site-packages/numpy/core/_sort.so /lib/python2.7/site-packages/numpy/core/scalarmath.so /lib/python2.7/site-packages/numpy/core/umath_tests.so /lib/python2.7/site-packages/numpy/core/umath.so /lib/python2.7/site-packages/numpy/numarray/_capi.so /lib/python2.7/site-packages/numpy/linalg/lapack_lite.so /lib/python2.7/site-packages/numpy/random/mtrand.so /lib/python2.7/site-packages/numpy/fft/fftpack_lite.so /lib/python2.7/site-packages/numpy/lib/_compiled_base.so Can I get rid of all shared lib outside of core? 
thanks Frédéric ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Minimal NumPy for distribution
Hi, I have done some exploratory work with Theano to generate a shared library from a Theano function. It links against the NumPy C API. If we want to distribute this library and call it from C and/or Python, what is the minimal installed part of NumPy needed? I suppose that only the C API is needed. Has someone already checked that? From what I found on Linux, numpy linked with a dynamic BLAS takes 13MB: 2.7MB from .so files, 5.3MB from .py files and 3.8MB from .pyc files. Can we just keep all the .so files and remove all the .py and .pyc files? I suppose we would need to modify the __init__.py file too. On Windows, with the unofficial 64-bit binary of NumPy linked with MKL, the numpy directory takes 56MB. So I would also like to know which shared libraries (DLLs) are needed. Here are the .so files found on Linux:
/lib/python2.7/site-packages/numpy/core/multiarray_tests.so /lib/python2.7/site-packages/numpy/core/multiarray.so /lib/python2.7/site-packages/numpy/core/_dotblas.so /lib/python2.7/site-packages/numpy/core/_sort.so /lib/python2.7/site-packages/numpy/core/scalarmath.so /lib/python2.7/site-packages/numpy/core/umath_tests.so /lib/python2.7/site-packages/numpy/core/umath.so /lib/python2.7/site-packages/numpy/numarray/_capi.so /lib/python2.7/site-packages/numpy/linalg/lapack_lite.so /lib/python2.7/site-packages/numpy/random/mtrand.so /lib/python2.7/site-packages/numpy/fft/fftpack_lite.so /lib/python2.7/site-packages/numpy/lib/_compiled_base.so
Can I get rid of all shared libs outside of core? thanks Frédéric ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
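A quick sketch for reproducing the size breakdown quoted above (13MB total, split across .so/.py/.pyc) on any install:

```python
import os

def size_by_extension(root):
    """Total file sizes under `root`, grouped by extension. Useful for
    seeing how much of an installed package is native code (.so) versus
    Python source (.py) and bytecode (.pyc)."""
    totals = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1] or "<none>"
            path = os.path.join(dirpath, name)
            totals[ext] = totals.get(ext, 0) + os.path.getsize(path)
    return totals
```

For example, `size_by_extension(os.path.dirname(numpy.__file__))` would report the per-extension totals for the local numpy install; the exact numbers of course depend on the platform and how NumPy was built.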
Re: [Numpy-discussion] Correct way to query NumPy for linktime BLAS and LAPACK
Hi, In Theano, we use the information in this dictionary: numpy.distutils.__config__.blas_opt_info. We have done this for a few years already, so I don't know how future-proof it is, but I would not expect it to change shortly. We use this dict for the default configuration, but we still allow the user to provide their own library, and that works well. In case you don't know Theano, it is a compiler that generates C code dynamically, compiles it as Python modules, and loads them in the Python interpreter. So it happens that numpy and Theano modules use different versions of BLAS. Up to now, I have never heard of a problem with this. Don't forget that many BLAS implementations use different internal symbols for the BLAS functions and just provide an official function for the interface. So if we mix different BLAS libraries, it works. But I'm not sure what would happen if we linked against different versions of the same BLAS project, like different MKL versions. Maybe just one of them would get imported if the library name is the same. HTH Fred
On Mon, Aug 5, 2013 at 5:06 PM, Aron Ahmadia a...@ahmadia.net wrote: Dear NumPy Developers, In the Clawpack/* repositories [1], we use a mixture of Fortran and Python source, currently glued together using f2py. Occasionally, we'll need to link the Fortran code directly against LAPACK. In particular, we're using dgeev and dgesv to solve several different Riemann problems [2,3]. In the past, we've relied on either the operating system or the user to provide these link commands for us, but it would be ideal in the future if we could query NumPy for how it is linked. Currently, the only information I can find is in the hidden __config__ module of NumPy's distutils module: numpy.distutils.__config__.blas_opt_info['extra_link_args'] numpy.distutils.__config__.lapack_opt_info['extra_link_args'] This seems to suggest that we shouldn't be relying on this information being available in future versions of NumPy (or at least, not in this location).
That said, we'd still probably like to use this to avoid the possibility of multiple BLAS/LAPACK libraries being linked into our builds. Any comments? Thanks, Aron [1] https://github.com/clawpack/clawpack [2] https://github.com/clawpack/riemann/blob/master/src/rp1_layered_shallow_water.f90#L687 [3] https://github.com/clawpack/riemann/blob/master/src/rpn2_layered_shallow_water.f90#L478 ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
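Since numpy.distutils.__config__ is private and not guaranteed to exist in future NumPy versions (as the thread itself anticipates), any lookup probably has to be defensive. A best-effort sketch follows; every probed location is an assumption that may be absent in a given NumPy release, so each step is guarded and the function degrades to an empty dict:

```python
def get_blas_link_info():
    """Best-effort lookup of the BLAS options NumPy was built with.
    Probes private/unstable locations, so every step is guarded;
    returns {} when nothing can be found."""
    # Hidden module mentioned in this thread; private, may be missing.
    try:
        import numpy.distutils.__config__ as cfg
        info = getattr(cfg, "blas_opt_info", None)
        if info:
            return info
    except Exception:
        pass
    # The interface Theano moved to; also not guaranteed forever.
    try:
        import numpy.distutils.system_info as system_info
        return system_info.get_info("blas_opt") or {}
    except Exception:
        return {}
```

Callers can then look for keys like 'extra_link_args' or 'libraries' in the result, falling back to user-supplied link flags when the dict is empty.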
Re: [Numpy-discussion] Allow == and != to raise errors
On Thu, Jul 25, 2013 at 7:48 AM, Nathaniel Smith n...@pobox.com wrote: On Tue, Jul 23, 2013 at 4:10 PM, Frédéric Bastien no...@nouiz.org wrote: I'm torn, because I see the value, but I'm not able to guess the consequences of the interface change. Doing your FutureWarning would allow us to gather some data about this, and if it seems to cause too many problems, we could cancel the change. Also, in case a few pieces of software depend on the old behaviour, this will cause a crash (except if they have a catch-all Exception handler), which is not a bad outcome. I think we have to be willing to fix bugs, even if we can't be sure what all the consequences are. Carefully of course, and with due consideration to possible compatibility consequences, but if we rejected every change that might have unforeseen effects then we'd have to stop accepting changes altogether. (And anyway the show-stopper regressions that make it into releases always seem to be the ones we didn't anticipate at all, so I doubt that being 50% more careful with obscure corner cases like this will have any measurable impact in our overall release-to-release compatibility.) So I'd consider Fred's comments above to be a vote for the change, in practice... I think it is always hard to predict the consequences of an interface change in NumPy. To help measure them, we could ask people to contribute to a collection of software that uses NumPy and has good test suites. We could then evaluate an interface change by running those test suites to get a sense of its impact. What do you think of that? I think it was already discussed on the mailing list, but not acted upon. Yeah, if we want to be careful then it never hurts to run other projects' test suites to flush out bugs :-). We don't do this systematically right now.
Maybe we should stick some precompiled copies of scipy and other core numpy-dependants up on a host somewhere and then pull them down and run their test suites as part of the Travis tests? We have maybe 10 minutes of CPU budget for tests on Travis; Theano's tests would take too long. I'm not sure that travis-ci is the right place for this: doing it for each version of a PR would take too long and would limit which projects we could test against. What about a Vagrant VM that updates/installs the development version of NumPy, then reinstalls some predetermined versions of other projects and runs their tests? I have started playing with Vagrant VMs to help test different OS configurations for Theano. I haven't finished this, but it seems to do the job well: people just cd into a directory, run vagrant up, and then everything is automatic; they just wait and read the output. Other ideas? I know some other projects use Jenkins. Would that be a better idea? Fred
Re: [Numpy-discussion] Allow == and != to raise errors
I'm torn, because I see the value, but I'm not able to guess the consequences of the interface change. Doing your FutureWarning would allow us to gather some data about this, and if it seems to cause too many problems, we could cancel the change. Also, in case a few pieces of software depend on the old behaviour, this will cause a crash (except if they have a catch-all Exception handler), which is not a bad outcome. I think it is always hard to predict the consequences of an interface change in NumPy. To help measure them, we could ask people to contribute to a collection of software that uses NumPy and has good test suites. We could then evaluate an interface change by running those test suites to get a sense of its impact. What do you think of that? I think it was already discussed on the mailing list, but not acted upon. Fred On Tue, Jul 23, 2013 at 10:29 AM, Sebastian Berg sebast...@sipsolutions.net wrote: On Sat, 2013-07-13 at 11:28 -0400, josef.p...@gmail.com wrote: On Sat, Jul 13, 2013 at 9:14 AM, Nathaniel Smith n...@pobox.com wrote: snip I'm now +1 on the exception that Sebastian proposed. I like consistency, and having a more straightforward mental model of the numpy behavior for elementwise operations that don't sometimes pretend to be Python (when I'm doing array math), like this: I am not sure what the result of this discussion is. As far as I can see, Benjamin and Frédéric were opposed and overall it seemed pretty mixed, so unless you two have changed your minds or say that it was just a small personal preference, I will drop it for now. I obviously think the current behaviour is inconsistent, if not buggy, and am really only afraid of possibly breaking code out there. Which is why I think I should maybe first add a FutureWarning if we decide on changing it.
Regards, Sebastian

>>> [1, 2, 3] == [1, 2]
False
>>> [1, 2, 3] != [1, 2]
True

Josef -n
Re: [Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array
On Wed, Jul 17, 2013 at 10:39 AM, Nathaniel Smith n...@pobox.com wrote: On Tue, Jul 16, 2013 at 7:53 PM, Frédéric Bastien no...@nouiz.org wrote: Hi, On Tue, Jul 16, 2013 at 11:55 AM, Nathaniel Smith n...@pobox.com wrote: On Tue, Jul 16, 2013 at 2:34 PM, Arink Verma arinkve...@gmail.com wrote: Each ndarray does two mallocs, for the obj and buffer. These could be combined into 1 - just allocate the total size and do some pointer arithmetic, then set OWNDATA to false. So, those two mallocs have been mentioned in the project introduction. I got that wrong. On further thought/reading the code, it appears to be more complicated than that, actually. It looks like (for a non-scalar array) we have 2 calls to PyMem_Malloc: 1 for the array object itself, and one for the shapes + strides. And, one call to regular-old malloc: for the data buffer. (Mysteriously, shapes + strides together have 2*ndim elements, but to hold them we allocate a memory region sized to hold 3*ndim elements. I'm not sure why.) And contrary to what I said earlier, this is about as optimized as it can be without breaking ABI. We need at least 2 calls to malloc/PyMem_Malloc, because the shapes+strides may need to be resized without affecting the much larger data area. But it's tempting to allocate the array object and the data buffer in a single memory region, like I suggested earlier. And this would ALMOST work. But, it turns out there is code out there which assumes (whether wisely or not) that you can swap around which data buffer a given PyArrayObject refers to (hi Theano!). And supporting this means that data buffers and PyArrayObjects need to be in separate memory regions. Are you sure that Theano swaps the data pointer of an ndarray? When we play with that, it is on a newly created ndarray. So a node in our graph won't change the input ndarray structure; it will create a new ndarray structure with new shape/strides, pass in a data pointer, and flag the new ndarray with OWNDATA correctly, to my knowledge.
If Theano poses a problem here, I'll suggest that I fix Theano; currently I don't see the problem. So if this makes you change your mind about this optimization, tell me. I don't want Theano to prevent optimizations in NumPy. It's entirely possible I misunderstood, so let's see if we can work it out. I know that you want to assign to the ->data pointer in a PyArrayObject, right? That's what caused some trouble with the 1.7 API deprecations, which were trying to prevent direct access to this field? Creating a new array given a pointer to a memory region is no problem, and obviously will be supported regardless of any optimizations. But if that's all you were doing then you shouldn't have run into the deprecation problem. Or maybe I'm misremembering! What we currently do in only one place is create a new PyArrayObject with a given pointer, so NumPy doesn't do the allocation; we later change that pointer to another one. It is the change of the pointer on the just-created PyArrayObject that caused problems with the interface deprecation. I fixed all the other problems related to the deprecation (mostly just renames of functions/macros), but I haven't fixed this one yet. I would need to change the logic to compute the final pointer before creating the PyArrayObject, and create it with the final data pointer. But in all cases, NumPy didn't allocate the data memory for this object, so this case doesn't block your optimization. One thing on our optimization wish list is to reuse an allocated PyArrayObject between Theano function calls for intermediate results (so completely under Theano's control). This could be useful in particular for reshape/transpose/subtensor. Those functions are pretty fast, and from memory, I found the allocation time to be significant. But in those cases, the PyArrayObjects are views, so the metadata and the data would be in different memory regions in all cases anyway.
The other case on our optimization wish list is to reuse a PyArrayObject when the shape isn't the right one (but the number of dimensions is the same). If we do that for operations like addition, we will need to use PyArray_Resize(). This will be done on PyArrayObjects whose data memory was allocated by NumPy. So if you do one memory allocation for metadata and data, just make sure that PyArray_Resize() handles that correctly. On the usefulness of doing only one memory allocation: in our old GPU ndarray, we were doing two allocations on the GPU, one for metadata and one for data. I removed this, as it was a bottleneck. Allocations on the CPU are faster than on the GPU, but they are still slow unless you reuse memory. Does PyMem_Malloc reuse previous small allocations? For those who read all of this, the conclusion is that Theano shouldn't block this optimization. If you optimize the allocation of new PyArrayObject, there will be less incentive to do
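The metadata-versus-data distinction in this thread can be observed from Python (a small illustration, not the Theano C code itself): a view is a separate PyArrayObject whose data pointer targets its base array's buffer, which is why, for views, metadata and data necessarily live in separate memory regions.

```python
import numpy as np

base = np.arange(12, dtype=np.float64)

# A reshaped view: a new array object with new shape/strides metadata,
# but no new data buffer -- its data pointer targets base's memory.
view = base.reshape(3, 4)

print(view.base is base)     # the view keeps its base alive
print(view.flags.owndata)    # False: the view does not own the buffer
print(base.flags.owndata)    # True: base owns (and will free) the buffer

# Writing through the view is visible in the base: same data region.
view[0, 0] = 99.0
print(base[0])               # 99.0
```

This is the Python-visible counterpart of the OWNDATA flag Arink and Nathaniel discuss above.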
Re: [Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array
Hi, On Tue, Jul 16, 2013 at 11:55 AM, Nathaniel Smith n...@pobox.com wrote: On Tue, Jul 16, 2013 at 2:34 PM, Arink Verma arinkve...@gmail.com wrote: Each ndarray does two mallocs, for the obj and buffer. These could be combined into 1 - just allocate the total size and do some pointer arithmetic, then set OWNDATA to false. So, those two mallocs have been mentioned in the project introduction. I got that wrong. On further thought/reading the code, it appears to be more complicated than that, actually. It looks like (for a non-scalar array) we have 2 calls to PyMem_Malloc: 1 for the array object itself, and one for the shapes + strides. And, one call to regular-old malloc: for the data buffer. (Mysteriously, shapes + strides together have 2*ndim elements, but to hold them we allocate a memory region sized to hold 3*ndim elements. I'm not sure why.) And contrary to what I said earlier, this is about as optimized as it can be without breaking ABI. We need at least 2 calls to malloc/PyMem_Malloc, because the shapes+strides may need to be resized without affecting the much larger data area. But it's tempting to allocate the array object and the data buffer in a single memory region, like I suggested earlier. And this would ALMOST work. But, it turns out there is code out there which assumes (whether wisely or not) that you can swap around which data buffer a given PyArrayObject refers to (hi Theano!). And supporting this means that data buffers and PyArrayObjects need to be in separate memory regions. Are you sure that Theano swaps the data pointer of an ndarray? When we play with that, it is on a newly created ndarray. So a node in our graph won't change the input ndarray structure; it will create a new ndarray structure with new shape/strides, pass in a data pointer, and flag the new ndarray with OWNDATA correctly, to my knowledge. If Theano poses a problem here, I'll suggest that I fix Theano. But currently I don't see the problem.
So if this makes you change your mind about this optimization, tell me. I don't want Theano to prevent optimizations in NumPy. Fred
Re: [Numpy-discussion] Allow == and != to raise errors
Just a question: should == behave like a ufunc, or like Python's == for tuples? I think that all ndarray comparisons (==, !=, <, <=, ...) should behave the same. If they don't (as was said), making them consistent is good. What is the minimal change to have them behave the same? From my understanding, your proposal is to change == and != to behave like real ufuncs. But I'm not sure the minimal change is the best one: which will new users expect more, the ufunc behavior or the Python behavior? Anyway, I see the advantage of simplifying the interface to something more consistent. And if we make all comparisons behave like ufuncs, there is array_equal, as noted, to get the Python behavior of ==; would it be useful to have equivalent functions for the other comparisons? Do they already exist? thanks Fred On Mon, Jul 15, 2013 at 10:20 AM, Nathaniel Smith n...@pobox.com wrote: On Mon, Jul 15, 2013 at 2:09 PM, bruno Piguet bruno.pig...@gmail.com wrote: Python itself doesn't raise an exception in such cases: (3,4) != (2, 3, 4) True (3,4) == (2, 3, 4) False Should numpy behave differently? The numpy equivalent to Python's scalar == is called array_equal, and that does indeed behave the same: In [5]: np.array_equal([3, 4], [2, 3, 4]) Out[5]: False But in numpy, the name == is shorthand for the ufunc np.equal, which raises an error: In [8]: np.equal([3, 4], [2, 3, 4]) ValueError: operands could not be broadcast together with shapes (2) (3) -n
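Nathaniel's distinction between the two behaviours can be checked directly on a current NumPy:

```python
import numpy as np

# array_equal mirrors Python's scalar ==: mismatched shapes simply
# mean "not equal", no error.
print(np.array_equal([3, 4], [2, 3, 4]))   # False

# The ufunc np.equal broadcasts element-wise, so mismatched,
# non-broadcastable shapes raise instead of returning False.
try:
    np.equal([3, 4], [2, 3, 4])
except ValueError as e:
    print("broadcast error:", e)
```

So code that wants the tuple-like Python semantics has an explicit spelling (array_equal) even if == becomes strict.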
Re: [Numpy-discussion] Allow == and != to raise errors
I also don't like that idea, but I'm not able to come up with as good a reasoning as Benjamin's. I don't see an advantage to this change, and I don't think the rationale is good enough to justify breaking the interface. But I don't think we rely on this, so if the change goes in, it probably won't break our stuff, or the breakage will be easily seen and repaired. Fred On Fri, Jul 12, 2013 at 9:13 AM, Benjamin Root ben.r...@ou.edu wrote: I can see what you are getting at, but I would have to disagree. First of all, when a comparison between two mis-shaped arrays occurs, you get back a bona fide Python boolean, not a numpy array of bools. So if any action taken on the result of such a comparison assumed that the result was some sort of array, it would fail (yes, this does make it a bit difficult to trace back the source of the problem, but not impossible). Second, no semantics are broken with this. Are the arrays equal or not? If they aren't broadcastable, then returning False for == and True for != makes perfect sense to me. At least, that is my take on it. Cheers! Ben Root On Fri, Jul 12, 2013 at 8:38 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey, the array comparisons == and != never raise errors but instead simply return False for invalid comparisons. The main examples are arrays of non-matching dimensions, and object arrays with invalid element-wise comparisons: In [1]: np.array([1,2,3]) == np.array([1,2]) Out[1]: False In [2]: np.array([1, np.array([2, 3])], dtype=object) == [1, 2] Out[2]: False This seems wrong to me, and I am sure not just to me. I doubt any large project makes use of such comparisons, and I assume most would prefer the shape mismatch to raise an error, so I would like to change it. But I am a bit unsure, especially about smaller projects. So to keep the transition a bit safer, I could imagine implementing a FutureWarning for these cases (which would at least notify new users that what they are doing doesn't seem like the right thing).
So the question is: Is such a change safe enough, or is there some good reason for the current behavior that I am missing? Regards, Sebastian (There may be other issues with structured types that would continue returning False, I think, because neither side knows how to compare.)
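A minimal sketch of the transitional behaviour Sebastian proposes. The helper name checked_eq and the warning text are invented for illustration; NumPy's real comparison machinery lives in C, so this only models the intent: keep returning False for now, but warn that it will become an error.

```python
import warnings
import numpy as np

def checked_eq(a, b):
    """Hypothetical transitional ==: warn instead of silently returning False."""
    a, b = np.asarray(a), np.asarray(b)
    try:
        np.broadcast_shapes(a.shape, b.shape)
    except ValueError:
        warnings.warn(
            "elementwise == of non-broadcastable arrays will raise an "
            "error in the future; returning False for now",
            FutureWarning,
            stacklevel=2,
        )
        return False
    return a == b

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = checked_eq([1, 2, 3], [1, 2])

print(result)   # False: old behaviour preserved, but with a FutureWarning
```

The point of the FutureWarning stage is exactly the data-gathering Fred asks for: downstream test suites surface the warning before any behaviour actually changes.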
Re: [Numpy-discussion] Equality not working as expected with ndarray sub-class
Hi, __array_priority__ wasn't checked for the ==, !=, <, <=, >, >= operations. I added it in the development version, and someone else back-ported it to the 1.7.x branch, so this will work in the next release of numpy. I don't know of a workaround until the next release. Fred On Thu, Jul 4, 2013 at 9:06 AM, Thomas Robitaille thomas.robitai...@gmail.com wrote: Hi everyone, The following example:

import numpy as np

class SimpleArray(np.ndarray):

    __array_priority__ = 1

    def __new__(cls, input_array, info=None):
        return np.asarray(input_array).view(cls)

    def __eq__(self, other):
        return False

a = SimpleArray(10)
print (np.int64(10) == a)
print (a == np.int64(10))

gives the following output:

$ python2.7 eq.py
True
False

so that in the first case, SimpleArray.__eq__ is not called. Is this a bug, and if so, can anyone think of a workaround? If this is expected behavior, how do I ensure SimpleArray.__eq__ gets called in both cases? Thanks, Tom ps: cross-posting to stackoverflow
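With the fix Fred mentions (in the 1.7.x maintenance branch and later), both directions should defer to the subclass. A compressed version of Thomas's example to check this on a current NumPy:

```python
import numpy as np

class SimpleArray(np.ndarray):
    # A priority above plain ndarray's (0.0) tells NumPy to defer to us
    # in mixed operations, including comparisons once the fix is in.
    __array_priority__ = 1

    def __new__(cls, input_array):
        return np.asarray(input_array).view(cls)

    def __eq__(self, other):
        return False  # deliberately silly, to show *our* __eq__ runs

a = SimpleArray(10)
print(np.int64(10) == a)   # False once __array_priority__ is honoured
print(a == np.int64(10))   # False
```

Before the fix, the first comparison was handled by the np.int64 scalar itself (printing True); with the fix, the scalar sees the higher __array_priority__ and returns NotImplemented, so Python falls back to SimpleArray.__eq__.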
Re: [Numpy-discussion] Equality not working as expected with ndarray sub-class
On Thu, Jul 4, 2013 at 9:12 AM, sebastian sebast...@sipsolutions.net wrote: On 2013-07-04 15:06, Thomas Robitaille wrote: Hi everyone, The following example:

import numpy as np

class SimpleArray(np.ndarray):

    __array_priority__ = 1

    def __new__(cls, input_array, info=None):
        return np.asarray(input_array).view(cls)

    def __eq__(self, other):
        return False

a = SimpleArray(10)
print (np.int64(10) == a)
print (a == np.int64(10))

gives the following output:

$ python2.7 eq.py
True
False

so that in the first case, SimpleArray.__eq__ is not called. Is this a bug, and if so, can anyone think of a workaround? If this is expected behavior, how do I ensure SimpleArray.__eq__ gets called in both cases? This should be working in all development versions. I.e. NumPy 1.7.2 (which is not released yet). I think you mean: NumPy >= 1.7.2 Fred
Re: [Numpy-discussion] low level optimization in NumPy and minivect
On Wed, Jun 26, 2013 at 7:30 AM, mark florisson markflorisso...@gmail.com wrote: On 26 June 2013 09:05, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote: On 06/25/2013 04:21 PM, Frédéric Bastien wrote: Hi, I wasn't able to attend this year's SciPy conference. My tutorial proposal was rejected, and other deadlines interfered with the conference dates. Will the presentation be recorded? If not, can you make the slides available? What is your opinion on this question: - Should other libs like NumPy/Theano/Cython/Numba base their elemwise implementation (or part of it) on dynd or minivect? I know Cython and Numba do it, but that was before dynd, and I don't know where dynd fits in the big picture. Does dynd reuse minivect itself? Actually, I think the Cython branch with minivect support was in the end not merged, due to lack of interest/manpower to maintain support for vectorization in the long term (so it was better to not add the feature than to have a badly supported feature). My understanding is that Numba is based on minivect and not on dynd, so it's more of a competitor. Perhaps Mark Florisson will be able to comment. Dag Sverre Hey Dag, Indeed, numba uses it for its array expression support, but it will likely remove the minivect dependency and generate a simple loop nest for now. I'm working on pykit now (https://github.com/ContinuumIO/pykit) which, similarly to minivect, defines its own intermediate representation, with array expressions in the form of map/reduce/scan/etc functions. The project has a broader scope than minivect, to be used by projects like numba, but with minivect-like functionality baked in. As such, minivect isn't really maintained any longer, and I wouldn't recommend anyone use the code at this point (just maybe some of the ideas :)). Hi, thanks for the information. I took a quick look at the repo and didn't find information on how to use it the way I expect to.
I would like to be able to take a small Theano graph (with just elemwise operations) and build a graph in pykit to have it generate the C code. Do you have some tests/docs that demonstrate something in that direction? Ideally I would like to be able to implement something like this simple example: (x ** 2).sum(1) or (x ** 2).sum() Is the pykit or Numba IR ready for that? thanks Frédéric
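For reference, the fused form of Fred's example can already be expressed in plain NumPy with einsum, which squares and reduces in one pass instead of materializing the x ** 2 temporary (a sketch of the fusion idea, not pykit/Numba IR):

```python
import numpy as np

x = np.arange(12.0).reshape(3, 4)

# Two-pass version: allocates a temporary for x ** 2, then reduces it.
naive = (x ** 2).sum(axis=1)

# One-pass "fused" version: einsum multiplies and accumulates per row,
# never materializing the squared temporary.
fused = np.einsum("ij,ij->i", x, x)

print(np.allclose(naive, fused))   # True

# The full reduction (x ** 2).sum() fuses the same way:
print(np.allclose((x ** 2).sum(), np.einsum("ij,ij->", x, x)))   # True
```

This is exactly the kind of elementwise-plus-reduction fusion that an expression compiler (minivect, numexpr, a pykit backend) would generate automatically from the graph.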
Re: [Numpy-discussion] low level optimization in NumPy and minivect
Hi, I wasn't able to attend this year's SciPy conference. My tutorial proposal was rejected, and other deadlines interfered with the conference dates. Will the presentation be recorded? If not, can you make the slides available? What is your opinion on this question: - Should other libs like NumPy/Theano/Cython/Numba base their elemwise implementation (or part of it) on dynd or minivect? I know Cython and Numba do it, but that was before dynd, and I don't know where dynd fits in the big picture. Does dynd reuse minivect itself? thanks Frédéric On Mon, Jun 24, 2013 at 11:46 AM, Mark Wiebe mwwi...@gmail.com wrote: On Wed, Jun 19, 2013 at 7:48 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Jun 19, 2013 at 5:45 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Jun 19, 2013 at 1:43 AM, Frédéric Bastien no...@nouiz.org wrote: Hi, On Mon, Jun 17, 2013 at 5:03 PM, Julian Taylor jtaylor.deb...@googlemail.com wrote: On 17.06.2013 17:11, Frédéric Bastien wrote: Hi, I saw that Julian Taylor has recently been doing many low-level optimizations, like using SSE instructions. I think it is great. Last year, Mark Florisson released the minivect [1] project that he worked on during his master's thesis. minivect is a compiler for element-wise expressions that does some of the same low-level optimizations that Julian is doing in NumPy right now. Mark designed minivect so that it can be reused by other projects; it is used now by Cython and Numba, I think. I had planned to reuse it in Theano, but I haven't had time to integrate it so far. What about reusing it in NumPy? I think that some of Julian's optimizations aren't in minivect (I didn't check to confirm). But from what I heard, minivect doesn't implement reductions, and there is a pull request to optimize those in NumPy. Hi, what I vectorized is just the really easy cases of unit-stride contiguous operations, so the min/max reductions which are now in numpy are in essence pretty trivial.
minivect goes much further in optimizing general strided access and broadcasting via loop optimizations (it seems to have a lot of overlap with the graphite loop optimizer available in GCC [0]), so my code is probably not of very much use to minivect. The most interesting part of minivect for numpy is probably the optimization of broadcasting loops, which seem to be pretty inefficient in numpy [0]. Concerning the rest, I'm not sure how much of a bottleneck general strided operations really are in common numpy-using code. I guess a similar discussion about adding an expression compiler to numpy already happened when numexpr was released? If yes, what was the outcome? I don't recall a discussion when numexpr was released, as that was before I started reading this list. numexpr does an optimization that NumPy can't: fusing element-wise operations into one call. So I don't see how it could be reused in NumPy. You call your optimization trivial, but I don't. In NumPy's git log, the first commit is from 2001; this is the first time someone has done this in 12 years! Also, it gives a 1.5-8x speedup (from memory, from your PR description), which is not negligible. But how much time did you spend on it? Also, some of these optimizations are processor dependent; how many people on this list have already done this kind of work? Not many, I suppose. Yes, your optimizations don't cover all the cases that minivect does. I see two levels of optimization: 1) the inner-loop/contiguous cases, and 2) the strided, broadcasted level. We don't need every optimization for them to be useful; any of them helps. So what I think is that we could reuse/share that work. NumPy has C code generators; they could call minivect's code generator for some cases when compiling NumPy. That would let optimizations made to that code generator be reused by more people. For example, when new processors are launched, only one place would need to change for many projects.
Or, if the call to the MKL vector library is made there, more people will benefit from it; right now, only numexpr does that. About the level-2 optimizations (strides, broadcasting), I have never read the NumPy code that deals with that. Does someone who knows it have an idea whether minivect could be reused for this? Would someone be able to guide some of the numpy C experts into a room to do some thinking / writing on this at the scipy conference? I completely agree that these kinds of optimizations and code sharing seem likely to be very important for the future. I'm not at the conference, but if there's anything I can do to help, please someone let me know. Concerning the future development of numpy, I'd also suggest that we look at libdynd (https://github.com/ContinuumIO/libdynd). It looks to me like it is reaching a level of maturity where it is worth trying to plan out a long term path
Re: [Numpy-discussion] low level optimization in NumPy and minivect
I didn't know about this project. It is interesting. If some of you discuss this at the SciPy conference, it would be appreciated if you write a summary of it here; I won't be there this year. thanks Frédéric On Wed, Jun 19, 2013 at 8:48 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Jun 19, 2013 at 5:45 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Jun 19, 2013 at 1:43 AM, Frédéric Bastien no...@nouiz.org wrote: Hi, On Mon, Jun 17, 2013 at 5:03 PM, Julian Taylor jtaylor.deb...@googlemail.com wrote: On 17.06.2013 17:11, Frédéric Bastien wrote: Hi, I saw that Julian Taylor has recently been doing many low-level optimizations, like using SSE instructions. I think it is great. Last year, Mark Florisson released the minivect [1] project that he worked on during his master's thesis. minivect is a compiler for element-wise expressions that does some of the same low-level optimizations that Julian is doing in NumPy right now. Mark designed minivect so that it can be reused by other projects; it is used now by Cython and Numba, I think. I had planned to reuse it in Theano, but I haven't had time to integrate it so far. What about reusing it in NumPy? I think that some of Julian's optimizations aren't in minivect (I didn't check to confirm). But from what I heard, minivect doesn't implement reductions, and there is a pull request to optimize those in NumPy. Hi, what I vectorized is just the really easy cases of unit-stride contiguous operations, so the min/max reductions which are now in numpy are in essence pretty trivial. minivect goes much further in optimizing general strided access and broadcasting via loop optimizations (it seems to have a lot of overlap with the graphite loop optimizer available in GCC [0]), so my code is probably not of very much use to minivect. The most interesting part of minivect for numpy is probably the optimization of broadcasting loops, which seem to be pretty inefficient in numpy [0].
Concerning the rest, I'm not sure how much of a bottleneck general strided operations really are in common numpy-using code. I guess a similar discussion about adding an expression compiler to numpy already happened when numexpr was released? If yes, what was the outcome? I don't recall a discussion when numexpr was released, as that was before I started reading this list. numexpr does an optimization that NumPy can't: fusing element-wise operations into one call. So I don't see how it could be reused in NumPy. You call your optimization trivial, but I don't. In NumPy's git log, the first commit is from 2001; this is the first time someone has done this in 12 years! Also, it gives a 1.5-8x speedup (from memory, from your PR description), which is not negligible. But how much time did you spend on it? Also, some of these optimizations are processor dependent; how many people on this list have already done this kind of work? Not many, I suppose. Yes, your optimizations don't cover all the cases that minivect does. I see two levels of optimization: 1) the inner-loop/contiguous cases, and 2) the strided, broadcasted level. We don't need every optimization for them to be useful; any of them helps. So what I think is that we could reuse/share that work. NumPy has C code generators; they could call minivect's code generator for some cases when compiling NumPy. That would let optimizations made to that code generator be reused by more people. For example, when new processors are launched, only one place would need to change for many projects. Or, if the call to the MKL vector library is made there, more people will benefit from it; right now, only numexpr does that. About the level-2 optimizations (strides, broadcasting), I have never read the NumPy code that deals with that. Does someone who knows it have an idea whether minivect could be reused for this? Would someone be able to guide some of the numpy C experts into a room to do some thinking / writing on this at the scipy conference?
I completely agree that these kinds of optimizations and code sharing seem likely to be very important for the future. I'm not at the conference, but if there's anything I can do to help, please someone let me know. Concerning the future development of numpy, I'd also suggest that we look at libdynd (https://github.com/ContinuumIO/libdynd). It looks to me like it is reaching a level of maturity where it is worth trying to plan out a long term path to merger. Chuck
Re: [Numpy-discussion] low level optimization in NumPy and minivect
Hi, On Mon, Jun 17, 2013 at 5:03 PM, Julian Taylor jtaylor.deb...@googlemail.com wrote: On 17.06.2013 17:11, Frédéric Bastien wrote: Hi, I saw that Julian Taylor has recently been doing many low-level optimizations, like using SSE instructions. I think it is great. Last year, Mark Florisson released the minivect [1] project that he worked on during his master's thesis. minivect is a compiler for element-wise expressions that does some of the same low-level optimizations that Julian is doing in NumPy right now. Mark designed minivect so that it can be reused by other projects; it is used now by Cython and Numba, I think. I had planned to reuse it in Theano, but I haven't had time to integrate it so far. What about reusing it in NumPy? I think that some of Julian's optimizations aren't in minivect (I didn't check to confirm). But from what I heard, minivect doesn't implement reductions, and there is a pull request to optimize those in NumPy. Hi, what I vectorized is just the really easy cases of unit-stride contiguous operations, so the min/max reductions which are now in numpy are in essence pretty trivial. minivect goes much further in optimizing general strided access and broadcasting via loop optimizations (it seems to have a lot of overlap with the graphite loop optimizer available in GCC [0]), so my code is probably not of very much use to minivect. The most interesting part of minivect for numpy is probably the optimization of broadcasting loops, which seem to be pretty inefficient in numpy [0]. Concerning the rest, I'm not sure how much of a bottleneck general strided operations really are in common numpy-using code. I guess a similar discussion about adding an expression compiler to numpy already happened when numexpr was released? If yes, what was the outcome? I don't recall a discussion when numexpr was released, as that was before I started reading this list. numexpr does an optimization that NumPy can't: fusing element-wise operations into one call.
So I don't see how NumPy could reuse that part. You call your optimization trivial, but I don't. In the git log of NumPy, the first commit is in 2001. It is the first time someone has done this in 12 years! Also, this gives a 1.5-8x speedup (from memory, from your PR description). This is not negligible. But how much time did you spend on it? Also, some of the optimizations are processor dependent; how many people on this list have already done this? I suppose not many. Yes, your optimizations don't cover all the cases that minivect does. I see 2 levels of optimization: 1) the inner loop/contiguous cases, 2) the strided, broadcasted level. We don't need all optimizations to be done for them to be useful. Any of them is useful. So what I think is that we could reuse/share that work. NumPy has C code generators. They could call the minivect code generator for some of them when compiling NumPy. That would make optimizations done to that code generator reusable by more people. For example, when new processors are launched, we will need only 1 place to change for many projects. Or, for example, if the call to the MKL vector library is done there, more people will benefit from it. Right now, only numexpr does it. About the level 2 optimization (strides, broadcast), I have never read the NumPy code that deals with that. Does someone who knows it have an idea whether it would be possible to reuse minivect for this? Frédéric ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
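A side note on the fusion point: numexpr (and minivect) evaluate a whole expression in one pass over the data, while plain NumPy allocates a temporary array per operation. A rough illustration of the temporary problem, using only NumPy's `out=` arguments to emulate part of the effect (this is just a sketch, not what numexpr does internally):

```python
import numpy as np

a = np.random.rand(100000)
b = np.random.rand(100000)
c = np.random.rand(100000)

# naive evaluation: one temporary for a*b, another array for the +c result
r1 = a * b + c

# reusing one preallocated buffer removes the extra temporaries;
# a fused loop (numexpr/minivect) goes further and touches each
# element only once, which is much friendlier to the CPU cache
tmp = np.empty_like(a)
np.multiply(a, b, out=tmp)
np.add(tmp, c, out=tmp)

assert np.allclose(r1, tmp)
```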
[Numpy-discussion] low level optimization in NumPy and minivect
Hi, I saw that recently Julian Taylor has been doing much low-level optimization, like using SSE instructions. I think it is great. Last year, Mark Florisson released the minivect[1] project that he worked on during his master's thesis. minivect is a compiler for element-wise expressions that does some of the same low-level optimization that Julian is doing in NumPy right now. Mark wrote minivect in a way that allows it to be reused by other projects. It is used now by Cython and Numba, I think. I had planned to reuse it in Theano, but I didn't get the time to integrate it up to now. What about reusing it in NumPy? I think that some of Julian's optimizations aren't in minivect (I didn't check to confirm). But from what I heard, minivect doesn't implement reductions, and there is a pull request to optimize this in NumPy. The advantage of concentrating some functionality in one common package is that more projects benefit from optimizations done to it (after the work to use it first!). How could this be done in NumPy? NumPy has its own code generators for many dtypes. We could call the minivect code generator to replace some of them. What do you think of this idea? Sadly, I won't be able to spend time on the code for this, but I wanted to raise the idea while people are working on that, in case it is helpful. Frédéric [1] https://github.com/markflorisson88/minivect
Re: [Numpy-discussion] __array_priority__ ignored if __array__ is present
I think so. Changing the order between np.array([1,2,3]) * a and a * np.array([1,2,3]) should return the same type, I think, specifically when __array_priority__ is defined. Fred On Thu, May 30, 2013 at 3:28 PM, Thomas Robitaille thomas.robitai...@gmail.com wrote: Hi Frederic, On 16 May 2013 15:58, Frédéric Bastien no...@nouiz.org wrote: I looked rapidly in the code yesterday and didn't find the reason (I don't know the code well, that is probably why). But last night I thought of one possible cause. I found this code 2 times in the file core/src/umath/ufunc_object.c:

if (nin == 2 && nout == 1 && dtypes[1]->type_num == NPY_OBJECT) {
    PyObject *_obj = PyTuple_GET_ITEM(args, 1);
    if (!PyArray_CheckExact(_obj)) {
        double self_prio, other_prio;
        self_prio = PyArray_GetPriority(PyTuple_GET_ITEM(args, 0),
                                        NPY_SCALAR_PRIORITY);
        other_prio = PyArray_GetPriority(_obj, NPY_SCALAR_PRIORITY);
        if (self_prio < other_prio &&
            _has_reflected_op(_obj, ufunc_name)) {
            retval = -2;
            goto fail;
        }
    }
}

It is this code that calls the _has_reflected_op() function. The if condition is: dtypes[1]->type_num == NPY_OBJECT. I wouldn't be surprised if dtypes[1] isn't NPY_OBJECT when you implement __array__. dtypes is set with this line:

retval = ufunc->type_resolver(ufunc, casting, op, type_tup, dtypes);

Thanks for looking into this - should this be considered a bug?
Tom HTH Fred On Thu, May 16, 2013 at 9:19 AM, Thomas Robitaille thomas.robitai...@gmail.com wrote: Hi everyone, (this was posted as part of another topic, but since it was unrelated, I'm reposting as a separate thread) I've also been having issues with __array_priority__ - the following code behaves differently for __mul__ and __rmul__:

import numpy as np

class TestClass(object):

    def __init__(self, input_array):
        self.array = input_array

    def __mul__(self, other):
        print "Called __mul__"

    def __rmul__(self, other):
        print "Called __rmul__"

    def __array_wrap__(self, out_arr, context=None):
        print "Called __array_wrap__"
        return TestClass(out_arr)

    def __array__(self):
        print "Called __array__"
        return np.array(self.array)

with output:

In [7]: a = TestClass([1,2,3])

In [8]: print type(np.array([1,2,3]) * a)
Called __array__
Called __array_wrap__
<class '__main__.TestClass'>

In [9]: print type(a * np.array([1,2,3]))
Called __mul__
<type 'NoneType'>

Is this also an oversight? I opened a ticket for it a little while ago: https://github.com/numpy/numpy/issues/3164 Any ideas? Thanks! Tom
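For context, this is the deferral behaviour being discussed: when the right operand does not define __array__ but has a higher __array_priority__ and implements the reflected method, the ndarray gives up and lets the other class handle the operation. A minimal sketch (the class name is made up for illustration):

```python
import numpy as np

class Deferring(object):
    # higher than ndarray's default priority of 0.0
    __array_priority__ = 1000.0

    def __rmul__(self, other):
        return "Deferring.__rmul__"

d = Deferring()
# ndarray.__mul__ returns NotImplemented because d has a higher
# __array_priority__ and defines __rmul__, so Python calls d.__rmul__
result = np.array([1, 2, 3]) * d
print(result)  # Deferring.__rmul__
```

The bug in the thread is that defining __array__ (as TestClass does) short-circuits this deferral for the non-reflected case.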
Re: [Numpy-discussion] __array_priority__ don't work for gt, lt, ... operator
This is a different issue than mine. Mine is that __array_priority__ is not implemented for comparisons. Yours is that __array_priority__ isn't used when __array__ is defined. Maybe you can start a new mailing list thread? Issues get more attention when there is an associated email. I looked rapidly in the numpy code, but didn't find the problem. So you will need to find someone with more time/knowledge about this to look at it, or take a look yourself. HTH Fred On Sun, May 12, 2013 at 3:59 AM, Thomas Robitaille thomas.robitai...@gmail.com wrote: I've also been having issues with __array_priority__ - the following code behaves differently for __mul__ and __rmul__:

import numpy as np

class TestClass(object):

    def __init__(self, input_array):
        self.array = input_array

    def __mul__(self, other):
        print "Called __mul__"

    def __rmul__(self, other):
        print "Called __rmul__"

    def __array_wrap__(self, out_arr, context=None):
        print "Called __array_wrap__"
        return TestClass(out_arr)

    def __array__(self):
        print "Called __array__"
        return np.array(self.array)

with output:

In [7]: a = TestClass([1,2,3])

In [8]: print type(np.array([1,2,3]) * a)
Called __array__
Called __array_wrap__
<class '__main__.TestClass'>

In [9]: print type(a * np.array([1,2,3]))
Called __mul__
<type 'NoneType'>

Is this also an oversight? I opened a ticket for it a little while ago: https://github.com/numpy/numpy/issues/3164 Any ideas? Cheers, Tom On 10 May 2013 18:34, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, May 10, 2013 at 10:08 AM, Frédéric Bastien no...@nouiz.org wrote: Hi, it popped up again on the Theano mailing list that this doesn't work: np.arange(10) <= a_theano_vector. The reason is that __array_priority__ isn't respected for that class of operation.
This page explains the problem and gives a workaround: http://stackoverflow.com/questions/14619449/how-can-i-override-comparisons-between-numpys-ndarray-and-my-type The workaround is to make a Python function that decides which version of the comparator to call, and does the call. Then we tell NumPy to use that function instead of its current one with: np.set_numeric_ops(...). But if we do that, when we import theano, we will slow down all normal numpy comparisons for the user: when <= is executed, first numpy C code runs, which calls the Python function to decide which version to use; then, if both operands are numpy ndarrays, it calls numpy C code again. That isn't a good solution. We could do the same override in C, but then Theano wouldn't work the same when there isn't a C++ compiler. That isn't nice. What do you think of changing the comparison operators to check for __array_priority__ before doing the comparison? This looks like an oversight and should be fixed. Chuck
Re: [Numpy-discussion] slight MapIter change
Hi, this is used in Theano. What is the consequence of not doing this? There are people who use it; the question is how many. Is there a way to detect which version needs to be used? thanks Fred On Sat, May 11, 2013 at 11:41 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey, (this is only interesting if you know what MapIter is and actually use it) In case anyone already uses the newly exposed mapiter (it was never released yet): there is a tiny change, which only affects indexes that start with np.newaxis and otherwise just simplifies things a tiny bit. The old block for swapping axes should be changed like this:

 if ((mit->subspace != NULL) && (mit->consec)) {
-    if (mit->iteraxes[0] > 0) {
-        PyArray_MapIterSwapAxes(mit, (PyArrayObject **)arr, 0);
-        if (arr == NULL) {
-            return -1;
-        }
+    PyArray_MapIterSwapAxes(mit, (PyArrayObject **)arr, 0);
+    if (arr == NULL) {
+        return -1;
     }
 }

Regards, Sebastian
Re: [Numpy-discussion] slight MapIter change
ok, thanks for the associated PR. Fred On Mon, May 13, 2013 at 10:19 AM, Sebastian Berg sebast...@sipsolutions.net wrote: On Mon, 2013-05-13 at 09:58 -0400, Frédéric Bastien wrote: Hi, this is used in Theano. What is the consequence of not doing this? There are people who use it; the question is how many. There are no consequences. Only if you use the equivalent of `array[np.newaxis, fancy_index, possibly more]` would the result be wrong. But all code that used to work will continue to work, since such an index was not legal before anyway. Is there a way to detect which version needs to be used? There is no released version of numpy with the other definition. Also, changing it should be safe even for someone who has an older NumPy development version, because as far as I can tell the check is only an optimization in the first place. So just remove that check and you are good. And even if someone uses a new numpy with an old Theano development version, they would have to do weird things to run into problems. - Sebastian thanks Fred On Sat, May 11, 2013 at 11:41 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey, (this is only interesting if you know what MapIter is and actually use it) In case anyone already uses the newly exposed mapiter (it was never released yet): there is a tiny change, which only affects indexes that start with np.newaxis and otherwise just simplifies things a tiny bit.
The old block for swapping axes should be changed like this:

 if ((mit->subspace != NULL) && (mit->consec)) {
-    if (mit->iteraxes[0] > 0) {
-        PyArray_MapIterSwapAxes(mit, (PyArrayObject **)arr, 0);
-        if (arr == NULL) {
-            return -1;
-        }
+    PyArray_MapIterSwapAxes(mit, (PyArrayObject **)arr, 0);
+    if (arr == NULL) {
+        return -1;
     }
 }

Regards, Sebastian
[Numpy-discussion] __array_priority__ don't work for gt, lt, ... operator
Hi, it popped up again on the Theano mailing list that this doesn't work: np.arange(10) <= a_theano_vector. The reason is that __array_priority__ isn't respected for that class of operation. This page explains the problem and gives a workaround: http://stackoverflow.com/questions/14619449/how-can-i-override-comparisons-between-numpys-ndarray-and-my-type The workaround is to make a Python function that decides which version of the comparator to call, and does the call. Then we tell NumPy to use that function instead of its current one with: np.set_numeric_ops(...). But if we do that, when we import theano, we will slow down all normal numpy comparisons for the user: when <= is executed, first numpy C code runs, which calls the Python function to decide which version to use; then, if both operands are numpy ndarrays, it calls numpy C code again. That isn't a good solution. We could do the same override in C, but then Theano wouldn't work the same when there isn't a C++ compiler. That isn't nice. What do you think of changing the comparison operators to check for __array_priority__ before doing the comparison? Frédéric
Re: [Numpy-discussion] __array_priority__ don't work for gt, lt, ... operator
I'm trying to do it, but each time I want to test something, it takes a long time to rebuild numpy. Is there a way to avoid recompiling everything for each test? thanks Fred On Fri, May 10, 2013 at 1:34 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, May 10, 2013 at 10:08 AM, Frédéric Bastien no...@nouiz.org wrote: Hi, it popped up again on the Theano mailing list that this doesn't work: np.arange(10) <= a_theano_vector. The reason is that __array_priority__ isn't respected for that class of operation. This page explains the problem and gives a workaround: http://stackoverflow.com/questions/14619449/how-can-i-override-comparisons-between-numpys-ndarray-and-my-type The workaround is to make a Python function that decides which version of the comparator to call, and does the call. Then we tell NumPy to use that function instead of its current one with: np.set_numeric_ops(...). But if we do that, when we import theano, we will slow down all normal numpy comparisons for the user: when <= is executed, first numpy C code runs, which calls the Python function to decide which version to use; then, if both operands are numpy ndarrays, it calls numpy C code again. That isn't a good solution. We could do the same override in C, but then Theano wouldn't work the same when there isn't a C++ compiler. That isn't nice. What do you think of changing the comparison operators to check for __array_priority__ before doing the comparison? This looks like an oversight and should be fixed. Chuck
Re: [Numpy-discussion] __array_priority__ don't work for gt, lt, ... operator
thanks, I'll look at it. I made a PR: https://github.com/numpy/numpy/pull/3324 Where should I put the tests for this? thanks Fred On Fri, May 10, 2013 at 4:03 PM, Sebastian Berg sebast...@sipsolutions.net wrote: On Fri, 2013-05-10 at 15:35 -0400, Frédéric Bastien wrote: I'm trying to do it, but each time I want to test something, it takes a long time to rebuild numpy. Is there a way to avoid recompiling everything for each test? Are you using current master? It defaults to using the ENABLE_SEPARATE_COMPILATION environment variable, which, together with ccache, makes most changes in numpy compile fast for me. - Sebastian thanks Fred On Fri, May 10, 2013 at 1:34 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, May 10, 2013 at 10:08 AM, Frédéric Bastien no...@nouiz.org wrote: Hi, it popped up again on the Theano mailing list that this doesn't work: np.arange(10) <= a_theano_vector. The reason is that __array_priority__ isn't respected for that class of operation. This page explains the problem and gives a workaround: http://stackoverflow.com/questions/14619449/how-can-i-override-comparisons-between-numpys-ndarray-and-my-type The workaround is to make a Python function that decides which version of the comparator to call, and does the call. Then we tell NumPy to use that function instead of its current one with: np.set_numeric_ops(...). But if we do that, when we import theano, we will slow down all normal numpy comparisons for the user: when <= is executed, first numpy C code runs, which calls the Python function to decide which version to use; then, if both operands are numpy ndarrays, it calls numpy C code again. That isn't a good solution. We could do the same override in C, but then Theano wouldn't work the same when there isn't a C++ compiler. That isn't nice. What do you think of changing the comparison operators to check for __array_priority__ before doing the comparison? This looks like an oversight and should be fixed.
Chuck
Re: [Numpy-discussion] ANN: NumPy 1.7.1 release
Sorry, I didn't see the release candidate. I was away for a month and didn't read all my email in order. Normally I try to test the release candidate, but I wasn't able to this time. I have nothing to report against NumPy 1.7.1. I reread the previous emails and noticed that I had misread one of them the first time. I understood that someone suggested doing a release candidate as if you hadn't done one, but he actually wrote about not doing a 1.7.2 for datetime! Sorry for the noise. Fred On Thu, Apr 25, 2013 at 10:11 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote: On Tue, Apr 23, 2013 at 12:10 PM, Frédéric Bastien no...@nouiz.org wrote: Hi, A big thanks for that release. I also think it would be useful to do a release candidate for this. This release changed the behavior related to Python longs and broke a test in Theano. Nothing important, but we could have fixed this before the release. The numpy change is that a Python long that doesn't fit in an int64, but fits in a uint64, was throwing an overflow exception. Now it returns a uint64. My apologies for this. There was a release candidate here: http://mail.scipy.org/pipermail/numpy-discussion/2013-March/065948.html and I don't see any offending patch between 1.7.1rc1 and 1.7.1. If the bugs are in numpy, would you please report them as issues? So that we can fix them. Thanks, Ondrej
Re: [Numpy-discussion] ANN: NumPy 1.7.1 release
Hi, A big thanks for that release. I also think it would be useful to do a release candidate for this. This release changed the behavior related to Python longs and broke a test in Theano. Nothing important, but we could have fixed this before the release. The numpy change is that a Python long that doesn't fit in an int64, but fits in a uint64, was throwing an overflow exception. Now it returns a uint64. thanks again! Fred On Sun, Apr 7, 2013 at 4:09 AM, Ondřej Čertík ondrej.cer...@gmail.com wrote: Hi, I'm pleased to announce the availability of the final NumPy 1.7.1 release. Sources and binary installers can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.7.1/ Only three simple bugs were fixed since 1.7.1rc1 (#3166, #3179, #3187). I would like to thank everybody who contributed patches since 1.7.1rc1: Eric Fode, Nathaniel J. Smith and Charles Harris. Cheers, Ondrej P.S. I'll create the Mac binary installers in a few days. PyPI is updated.

= NumPy 1.7.1 Release Notes =

This is a bugfix only release in the 1.7.x series.

Issues fixed
gh-2973  Fix `1` is printed during numpy.test()
gh-2983  BUG: gh-2969: Backport memory leak fix 80b3a34.
gh-3007  Backport gh-3006
gh-2984  Backport fix complex polynomial fit
gh-2982  BUG: Make nansum work with booleans.
gh-2985  Backport large sort fixes
gh-3039  Backport object take
gh-3105  Backport nditer fix op axes initialization
gh-3108  BUG: npy-pkg-config ini files were missing after Bento build.
gh-3124  BUG: PyArray_LexSort allocates too much temporary memory.
gh-3131  BUG: Exported f2py_size symbol prevents linking multiple f2py modules.
gh-3117  Backport gh-2992
gh-3135  DOC: Add mention of PyArray_SetBaseObject stealing a reference
gh-3134  DOC: Fix typo in fft docs (the indexing variable is 'm', not 'n').
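The behaviour change Fred describes is easy to demonstrate. Since this release, a Python int that fits only in the unsigned 64-bit range is accepted and converted, instead of raising an OverflowError:

```python
import numpy as np

big = 2**63  # one past the int64 maximum (2**63 - 1)
arr = np.array(big)

# the value does not fit in int64 but does fit in uint64,
# so NumPy now picks uint64 instead of raising OverflowError
print(arr.dtype)   # uint64
print(int(arr) == big)  # True
```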
gh-3136  Backport #3128

Checksums
=
9e369a96b94b107bf3fab7e07fef8557  release/installers/numpy-1.7.1-win32-superpack-python2.6.exe
0ab72b3b83528a7ae79c6df9042d61c6  release/installers/numpy-1.7.1.tar.gz
bb0d30de007d649757a2d6d2e1c59c9a  release/installers/numpy-1.7.1-win32-superpack-python3.2.exe
9a72db3cad7a6286c0d22ee43ad9bc6c  release/installers/numpy-1.7.1.zip
0842258fad82060800b8d1f0896cb83b  release/installers/numpy-1.7.1-win32-superpack-python3.1.exe
1b8f29b1fa89a801f83f551adc13aaf5  release/installers/numpy-1.7.1-win32-superpack-python2.7.exe
9ca22df942e5d5362cf7154217cb4b69  release/installers/numpy-1.7.1-win32-superpack-python2.5.exe
2fd475b893d8427e26153e03ad7d5b69  release/installers/numpy-1.7.1-win32-superpack-python3.3.exe
Re: [Numpy-discussion] MapIter api
Hi, this is currently used in Theano! In fact, it is John S. who implemented it in NumPy, to allow a fast gradient of advanced indexing in Theano. It allows code like: matrix1[vector1, vector2] += matrix2, where there are duplicate indices in the vectors. Looking at the code, I saw it uses at least those parts of the interface:

PyArrayMapIterObject
PyArray_MapIterNext
PyArray_ITER_NEXT
PyArray_MapIterSwapAxes
PyArray_BroadcastToShape

I lost the end of this discussion, but I think this is not possible in NumPy itself, as there was no agreement to include it. But I remember a few other users on this list asking for this (and they were Theano users, to my knowledge). So I would prefer that you don't remove the parts that we use, for the next 1.8 release. thanks Frédéric On Tue, Apr 16, 2013 at 9:54 AM, Nathaniel Smith n...@pobox.com wrote: On Mon, Apr 15, 2013 at 5:29 PM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey, the MapIter API has only been made public in master, right? So it is no problem at all to change at least the mapiter struct, right? I got annoyed at all those special cases that make it difficult to get an idea where to fix e.g. the boolean array-like stuff. So I actually started rewriting it (and I already have one big function that does all index preparation -- ok, it is untested, but it's basically there). I would guess it is not really a big problem even if it had been public for longer, since you shouldn't do direct struct access, probably? But just checking. Why don't we just make the struct opaque, i.e., just declare it in the public header file and move the actual definition to an internal header file? If it's too annoying I guess we could even make it non-public, at least in 1.8 -- IIRC it's only there so we can use it in umath, and IIRC the patch to use it hasn't landed yet.
Or we could just merge umath and multiarray into a single .so, that would save a *lot* of annoying fiddling with the public API that doesn't actually serve any purpose. -n
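On the Theano use case Frédéric describes: the unbuffered accumulation that MapIter makes possible was later exposed at the Python level as ufunc.at (e.g. np.add.at) in NumPy 1.8. A small sketch of why the buffered += is not enough when indices repeat:

```python
import numpy as np

idx = np.array([0, 0, 1])

# buffered fancy-index assignment: the duplicate index 0 is
# read once, incremented once, and written back once
a = np.zeros(4)
a[idx] += 1
print(a)  # [1. 1. 0. 0.]

# unbuffered accumulation (what the MapIter-based gradient needs):
# every occurrence of an index contributes
b = np.zeros(4)
np.add.at(b, idx, 1)
print(b)  # [2. 1. 0. 0.]
```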
Re: [Numpy-discussion] Please stop bottom posting!!
On Thu, Apr 4, 2013 at 6:01 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: [...] Which is why I advocate interspersed posting. It comes down to: please take some thought to compose your post in a way that is suited to the thread and what you are writing, rather than simply using whatever your mail or news reader makes easy without thinking about it. Ideally, for each person who writes a post, a lot of people are reading it -- so be respectful of the readers' time, more than your own. Since I read that idea somewhere else, that is what I suggest too. But it seems more people say something different, and I suppose they never thought of this. I have never heard an argument against it. Fred
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
Hi, win32 does not mean it is a 32-bit Windows. sys.platform always returns "win32" on 32-bit and 64-bit Windows, even for a 64-bit Python. But that is a good question: is your Python 32 or 64 bits? Fred On Wed, Mar 20, 2013 at 10:14 AM, Daπid davidmen...@gmail.com wrote: Without much detailed knowledge of the topic, I would expect both versions to give very similar timings, as it is essentially a call to an ATLAS function; not much is done in Python. Given this, maybe the difference is in ATLAS itself. When you compile ATLAS, it will do some machine-specific optimisation, but if you have installed a binary, chances are that your version is optimised for a machine quite different from yours. So, two different installations could have been compiled on different machines, and one is more suited to your machine. How have you installed it? If you want to be sure, I would try to compile ATLAS (this may be difficult) or check the same on a very different machine (like an AMD processor, different architecture...). Just for reference, on Linux Python 2.7 64 bits can deal with these matrices easily.

%timeit mat=np.random.random((6143,6143)); matinv= np.linalg.inv(mat); res = np.dot(mat, matinv); diff= res-np.eye(6143); print np.sum(np.abs(diff))
2.41799631031e-05
1.13955868701e-05
3.64338191541e-05
1.13484781021e-05
1 loops, best of 3: 156 s per loop

Intel i5, 4 GB of RAM and SSD. ATLAS installed from the Fedora repository (I don't run heavy stuff on this computer). On 20 March 2013 14:46, Colin J. Williams c...@ncf.ca wrote: I have a small program which builds random matrices for increasing matrix orders, inverts the matrix and checks the precision of the product. At some point, one would expect operations to fail when the memory capacity is exceeded. In both Python 2.7 and 3.2, matrices of order 3,071 are handled, but not 6,143. Using wall-clock times, with win32, Python 3.2 is slower than Python 2.7. The profiler indicates a problem in the solver.
Done on a Pentium with a 2.7 GHz processor, 2 GB of RAM and 221 GB of free disk space. Both Python 3.2.3 and Python 2.7.3 use numpy 1.6.2. The results are shown below. Colin W.

2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)]
order=    2  measure of imprecision= 0.097  Time elapsed (seconds)=  0.004143
order=    5  measure of imprecision= 2.207  Time elapsed (seconds)=  0.001514
order=   11  measure of imprecision= 2.372  Time elapsed (seconds)=  0.001455
order=   23  measure of imprecision= 3.318  Time elapsed (seconds)=  0.001608
order=   47  measure of imprecision= 4.257  Time elapsed (seconds)=  0.002339
order=   95  measure of imprecision= 4.986  Time elapsed (seconds)=  0.005747
order=  191  measure of imprecision= 5.788  Time elapsed (seconds)=  0.029974
order=  383  measure of imprecision= 6.765  Time elapsed (seconds)=  0.145339
order=  767  measure of imprecision= 7.909  Time elapsed (seconds)=  0.841142
order= 1535  measure of imprecision= 8.532  Time elapsed (seconds)=  5.793630
order= 3071  measure of imprecision= 9.774  Time elapsed (seconds)= 39.559540
order= 6143  Process terminated by a MemoryError

Above: 2.7.3  Below: Python 3.2.3

3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]
order=    2  measure of imprecision= 0.000  Time elapsed (seconds)=   0.113930
order=    5  measure of imprecision= 1.807  Time elapsed (seconds)=   0.001373
order=   11  measure of imprecision= 2.395  Time elapsed (seconds)=   0.001468
order=   23  measure of imprecision= 3.073  Time elapsed (seconds)=   0.001609
order=   47  measure of imprecision= 5.642  Time elapsed (seconds)=   0.002687
order=   95  measure of imprecision= 5.745  Time elapsed (seconds)=   0.013510
order=  191  measure of imprecision= 5.866  Time elapsed (seconds)=   0.061560
order=  383  measure of imprecision= 7.129  Time elapsed (seconds)=   0.418490
order=  767  measure of imprecision= 8.240  Time elapsed (seconds)=   3.815713
order= 1535  measure of imprecision= 8.735  Time elapsed (seconds)=  27.877270
order= 3071  measure of imprecision= 9.996  Time elapsed (seconds)= 212.545610
order= 6143  Process terminated by a MemoryError
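As Fred notes, sys.platform cannot tell the two apart; a quick way to check whether the interpreter itself is a 32-bit or 64-bit build:

```python
import struct
import sys

# pointer size in bits: 32 on a 32-bit build, 64 on a 64-bit build
bits = struct.calcsize("P") * 8
print("%d-bit Python" % bits)

# equivalent check: sys.maxsize is 2**31 - 1 on 32-bit builds
print(sys.maxsize > 2**32)  # True only on a 64-bit build
```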
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On Wed, Mar 20, 2013 at 11:01 AM, Colin J. Williams cjwilliam...@gmail.com wrote: On 20/03/2013 10:30 AM, Frédéric Bastien wrote: Hi, win32 does not mean it is a 32-bit Windows. sys.platform always returns "win32" on 32-bit and 64-bit Windows, even for a 64-bit Python. But that is a good question: is your Python 32 or 64 bits? 32 bits. That explains why you have memory problems while other people with 64-bit versions do not. So if you want to work with bigger inputs, switch to a 64-bit Python. Fred
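Back-of-the-envelope arithmetic (my own estimate, not from the thread) for why order 6143 fails in a 32-bit process while 3071 succeeds: each float64 matrix of order n takes n*n*8 bytes, and the benchmark keeps several of them alive at once (mat, matinv, their product, the diff), each needing a contiguous block inside a roughly 2 GB user address space:

```python
n = 6143                         # the order that raised MemoryError
bytes_per_matrix = n * n * 8     # float64 elements
mb = bytes_per_matrix / 2.0**20
print(int(round(mb)))            # 288, i.e. ~288 MB per matrix

# four such live arrays plus LAPACK workspace already exceed 1 GB,
# and 32-bit address-space fragmentation makes large contiguous
# allocations fail well before the nominal 2 GB limit
print(int(round(4 * mb)))        # 1152, i.e. ~1.1 GB
```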
Re: [Numpy-discussion] aligned / unaligned structured dtype behavior
On Fri, Mar 8, 2013 at 5:22 AM, Francesc Alted franc...@continuum.io wrote: On 3/7/13 7:26 PM, Frédéric Bastien wrote: I'm surprised that Theano worked with the unaligned input. I added some checks to make this raise an error, as we do not support that! Francesc, can you check if Theano gives the correct result? It is possible that someone (maybe me) just copies the input to an aligned ndarray when we receive an unaligned one. That could explain why it worked, but my memory tells me that we raise an error. It seems to work for me:

In [10]: f = theano.function([a], a**2)
In [11]: f(baligned)
Out[11]: array([ 1.,  1.,  1., ...,  1.,  1.,  1.])
In [12]: f(bpacked)
Out[12]: array([ 1.,  1.,  1., ...,  1.,  1.,  1.])
In [13]: f2 = theano.function([a], a.sum())
In [14]: f2(baligned)
Out[14]: array(100.0)
In [15]: f2(bpacked)
Out[15]: array(100.0)

I understand what happened. You declare the symbolic variable like this: a = theano.tensor.vector(). This creates a symbolic variable with dtype floatX, which is float64 by default. baligned and bpacked are of dtype int64. When a Theano function receives as input an ndarray of the wrong dtype, we try to cast it to the right dtype and check that we don't lose precision. As the inputs are only 1s, there is no loss of precision, so the input is silently accepted and copied. So when we check later for the aligned flag, it passes. If you change the symbolic variable to have a dtype of int64, there won't be a copy and we will see the error:

a = theano.tensor.lvector()
f = theano.function([a], a ** 2)
f(bpacked)
TypeError: ('Bad input argument to theano function at index 0(0-based)', 'The numpy.ndarray object is not aligned.
Theano C code does not support that.', '', 'object shape', (100,), 'object strides', (9,)) If I now time this new function I get: In [14]: timeit baligned**2 100 loops, best of 3: 7.5 ms per loop In [15]: timeit bpacked**2 100 loops, best of 3: 8.25 ms per loop In [16]: timeit f(baligned) 100 loops, best of 3: 7.36 ms per loop So the Theano overhead was the copy in this case. It is not the first time I have seen this. We added the automatic cast to allow specifying most Python ints/lists/reals as input. Fred ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
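The silent accept-and-copy behaviour Fred describes can be sketched in plain NumPy (illustrative only, not Theano's internals): an int64 array of ones casts to float64 without losing precision, and the fresh copy made by the cast is aligned, which is why the alignment check on the function input passed.

```python
import numpy as np

# Stand-in for the all-ones int64 input column from the thread.
ints = np.ones(100, dtype=np.int64)

# Casting to float64 makes a copy; all-ones values lose no precision.
floats = ints.astype(np.float64)

# The copy produced by the cast is a fresh, aligned array, so a later
# alignment check passes even if the original buffer was unaligned.
print(floats.flags.aligned)  # True
print((floats == ints).all())  # True
```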
Re: [Numpy-discussion] aligned / unaligned structured dtype behavior
I agree that documenting this better would be useful to many people. So if someone wants to summarize this and put it in the docs, I think many people will appreciate it. Fred On Thu, Mar 7, 2013 at 10:28 PM, Kurt Smith kwmsm...@gmail.com wrote: On Thu, Mar 7, 2013 at 12:26 PM, Frédéric Bastien no...@nouiz.org wrote: Hi, It is normal that unaligned accesses are slower. The hardware has been optimized for aligned access. So this is a user's choice: space vs. speed. The quantitative difference is still important, so this thread is useful for future reference, I think. If reading in data into a packed array is 3x faster than reading into an aligned array, but the core computation is 4x slower with a packed array...you get the idea. I would have benefited years ago from knowing that (1) numpy structured dtypes are packed by default, and (2) computations with unaligned data can be several factors slower than aligned. That's strong motivation to always make sure I'm using 'aligned=True' except when memory usage is an issue, or for file IO with packed binary data, etc. We can't get around that. We can only minimize the cost of unaligned access in some cases, but not all, and those optimizations depend on the CPU. But newer CPUs have lowered the cost of unaligned access. I'm surprised that Theano worked with the unaligned input. I added some checks to make this raise an error, as we do not support that! Francesc, can you check whether Theano gives the correct result? It is possible that someone (maybe me) just copies the input to an aligned ndarray when we receive an unaligned one. That could explain why it worked, but my memory tells me that we raise an error. As you saw in the numbers, this is a bad example for Theano, as the compiled function is too fast. There is more Theano overhead than computation time in that example. We have recently reduced the overhead, but we can do more to lower it. 
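For reference, the packed-by-default behaviour Kurt mentions comes down to the align= flag when building a structured dtype; a quick way to see the padding (a small sketch, matching the dtypes exercised in the timings in this thread):

```python
import numpy as np

aligned_dt = np.dtype([('a', 'i1'), ('b', 'i8')], align=True)
packed_dt = np.dtype([('a', 'i1'), ('b', 'i8')], align=False)  # the default

# With align=True, 'b' is padded out to an 8-byte boundary.
print(aligned_dt.itemsize)  # 16
# Packed, 'b' starts at offset 1 and every element is unaligned.
print(packed_dt.itemsize)   # 9
```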
Fred On Thu, Mar 7, 2013 at 1:06 PM, Francesc Alted franc...@continuum.io wrote: On 3/7/13 6:47 PM, Francesc Alted wrote: On 3/6/13 7:42 PM, Kurt Smith wrote: And regarding performance, doing simple timings shows a 30%-ish slowdown for unaligned operations: In [36]: %timeit packed_arr['b']**2 100 loops, best of 3: 2.48 ms per loop In [37]: %timeit aligned_arr['b']**2 1000 loops, best of 3: 1.9 ms per loop Hmm, that clearly depends on the architecture. On my machine: In [1]: import numpy as np In [2]: aligned_dt = np.dtype([('a', 'i1'), ('b', 'i8')], align=True) In [3]: packed_dt = np.dtype([('a', 'i1'), ('b', 'i8')], align=False) In [4]: aligned_arr = np.ones((10**6,), dtype=aligned_dt) In [5]: packed_arr = np.ones((10**6,), dtype=packed_dt) In [6]: baligned = aligned_arr['b'] In [7]: bpacked = packed_arr['b'] In [8]: %timeit baligned**2 1000 loops, best of 3: 1.96 ms per loop In [9]: %timeit bpacked**2 100 loops, best of 3: 7.84 ms per loop That is, the unaligned column is 4x slower (!). numexpr allows somewhat better results: In [11]: %timeit numexpr.evaluate('baligned**2') 1000 loops, best of 3: 1.13 ms per loop In [12]: %timeit numexpr.evaluate('bpacked**2') 1000 loops, best of 3: 865 us per loop Just for completeness, here is what Theano gets: In [18]: import theano In [20]: a = theano.tensor.vector() In [22]: f = theano.function([a], a**2) In [23]: %timeit f(baligned) 100 loops, best of 3: 7.74 ms per loop In [24]: %timeit f(bpacked) 100 loops, best of 3: 12.6 ms per loop So yeah, Theano is also slower for the unaligned case (but less than 2x in this case). Yes, in this case, the unaligned array goes faster (as much as 30%). I think the reason is that numexpr optimizes the unaligned access by doing a copy of the different chunks in internal buffers that fit in L1 cache. Apparently this is very beneficial in this case (not sure why, though). 
Whereas summing shows just a 10%-ish slowdown: In [38]: %timeit packed_arr['b'].sum() 1000 loops, best of 3: 1.29 ms per loop In [39]: %timeit aligned_arr['b'].sum() 1000 loops, best of 3: 1.14 ms per loop On my machine: In [14]: %timeit baligned.sum() 1000 loops, best of 3: 1.03 ms per loop In [15]: %timeit bpacked.sum() 100 loops, best of 3: 3.79 ms per loop Again, the 4x slowdown is here. Using numexpr: In [16]: %timeit numexpr.evaluate('sum(baligned)') 100 loops, best of 3: 2.16 ms per loop In [17]: %timeit numexpr.evaluate('sum(bpacked)') 100 loops, best of 3: 2.08 ms per loop And with Theano: In [26]: f2 = theano.function([a], a.sum()) In [27]: %timeit f2(baligned) 100 loops, best of 3: 2.52 ms per loop In [28]: %timeit f2(bpacked) 100 loops, best of 3: 7.43 ms per loop Again, the unaligned case is significantly slower (as much as 3x here!). -- Francesc Alted ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] aligned / unaligned structured dtype behavior
Hi, It is normal that unaligned accesses are slower. The hardware has been optimized for aligned access. So this is a user's choice: space vs. speed. We can't get around that. We can only minimize the cost of unaligned access in some cases, but not all, and those optimizations depend on the CPU. But newer CPUs have lowered the cost of unaligned access. I'm surprised that Theano worked with the unaligned input. I added some checks to make this raise an error, as we do not support that! Francesc, can you check whether Theano gives the correct result? It is possible that someone (maybe me) just copies the input to an aligned ndarray when we receive an unaligned one. That could explain why it worked, but my memory tells me that we raise an error. As you saw in the numbers, this is a bad example for Theano, as the compiled function is too fast. There is more Theano overhead than computation time in that example. We have recently reduced the overhead, but we can do more to lower it. Fred On Thu, Mar 7, 2013 at 1:06 PM, Francesc Alted franc...@continuum.io wrote: On 3/7/13 6:47 PM, Francesc Alted wrote: On 3/6/13 7:42 PM, Kurt Smith wrote: And regarding performance, doing simple timings shows a 30%-ish slowdown for unaligned operations: In [36]: %timeit packed_arr['b']**2 100 loops, best of 3: 2.48 ms per loop In [37]: %timeit aligned_arr['b']**2 1000 loops, best of 3: 1.9 ms per loop Hmm, that clearly depends on the architecture. On my machine: In [1]: import numpy as np In [2]: aligned_dt = np.dtype([('a', 'i1'), ('b', 'i8')], align=True) In [3]: packed_dt = np.dtype([('a', 'i1'), ('b', 'i8')], align=False) In [4]: aligned_arr = np.ones((10**6,), dtype=aligned_dt) In [5]: packed_arr = np.ones((10**6,), dtype=packed_dt) In [6]: baligned = aligned_arr['b'] In [7]: bpacked = packed_arr['b'] In [8]: %timeit baligned**2 1000 loops, best of 3: 1.96 ms per loop In [9]: %timeit bpacked**2 100 loops, best of 3: 7.84 ms per loop That is, the unaligned column is 4x slower (!). 
numexpr allows somewhat better results: In [11]: %timeit numexpr.evaluate('baligned**2') 1000 loops, best of 3: 1.13 ms per loop In [12]: %timeit numexpr.evaluate('bpacked**2') 1000 loops, best of 3: 865 us per loop Just for completeness, here is what Theano gets: In [18]: import theano In [20]: a = theano.tensor.vector() In [22]: f = theano.function([a], a**2) In [23]: %timeit f(baligned) 100 loops, best of 3: 7.74 ms per loop In [24]: %timeit f(bpacked) 100 loops, best of 3: 12.6 ms per loop So yeah, Theano is also slower for the unaligned case (but less than 2x in this case). Yes, in this case, the unaligned array goes faster (as much as 30%). I think the reason is that numexpr optimizes the unaligned access by doing a copy of the different chunks in internal buffers that fit in L1 cache. Apparently this is very beneficial in this case (not sure why, though). Whereas summing shows just a 10%-ish slowdown: In [38]: %timeit packed_arr['b'].sum() 1000 loops, best of 3: 1.29 ms per loop In [39]: %timeit aligned_arr['b'].sum() 1000 loops, best of 3: 1.14 ms per loop On my machine: In [14]: %timeit baligned.sum() 1000 loops, best of 3: 1.03 ms per loop In [15]: %timeit bpacked.sum() 100 loops, best of 3: 3.79 ms per loop Again, the 4x slowdown is here. Using numexpr: In [16]: %timeit numexpr.evaluate('sum(baligned)') 100 loops, best of 3: 2.16 ms per loop In [17]: %timeit numexpr.evaluate('sum(bpacked)') 100 loops, best of 3: 2.08 ms per loop And with Theano: In [26]: f2 = theano.function([a], a.sum()) In [27]: %timeit f2(baligned) 100 loops, best of 3: 2.52 ms per loop In [28]: %timeit f2(bpacked) 100 loops, best of 3: 7.43 ms per loop Again, the unaligned case is significantly slower (as much as 3x here!). -- Francesc Alted ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Numpy deprecation schedule
That sounds good. To be sure: does the "now" mean the first release that includes the deprecation, in this case NumPy 1.7? Fred On Wed, Mar 6, 2013 at 3:09 PM, Nathaniel Smith n...@pobox.com wrote: A number of items on the 1.8 todo list are reminders to remove things that we deprecated in 1.7, and said we would remove in 1.8, e.g.: https://github.com/numpy/numpy/issues/596 https://github.com/numpy/numpy/issues/294 But, since 1.8 is so soon after 1.7, we probably shouldn't actually do that. I suggest we switch to a time-based deprecation schedule, where instead of saying "this will be removed in N releases" we say "this will be removed in the first release on or after (now+N months)". I also suggest that we set N=12, because it's a round number, it roughly matches numpy's historical release cycle, and because AFAICT that's the number that python itself uses for core and stdlib deprecations. Thoughts? -n ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Adding .abs() method to the array object
On Sat, Feb 23, 2013 at 9:34 PM, Benjamin Root ben.r...@ou.edu wrote: My issue is having to remember which ones are methods and which ones are functions. There doesn't seem to be a rhyme or reason for the choices, and I would rather like to see that a line is drawn, but I am not picky as to where it is drawn. I like that. I think it would be a good idea to find a good line for NumPy 2.0. As we will already break the API, why not break this other part at the same time? I don't have any idea what a good line would be... Does someone have a good idea? Do you agree that this would be a good idea for 2.0? Fred ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Windows, blas, atlas and dlls
I just read a web page on how to embed Python in an application[1]. It explains that we can keep the symbols exported even if we statically link the BLAS library into scipy. This makes me think we could just change how we compile the libs that link with BLAS, and we would be able to reuse them in other projects! But I haven't played much with this type of thing. Does someone have more information? Do you think it would be useful? Fred [1] http://docs.python.org/2/extending/embedding.html#linking-requirements On Fri, Feb 22, 2013 at 3:38 PM, David Cournapeau courn...@gmail.com wrote: On 22 Feb 2013 16:53, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: On Thu, Feb 21, 2013 at 4:16 AM, Matyáš Novák lo...@centrum.cz wrote: You could also look into OpenBLAS, which is easier to build and generally faster than ATLAS. (But alas, not supported by NumPy/SciPY AFAIK.) It looks like OpenBLAS is BSD-licensed, and thus compatible with numpy/scipy. Is there a reason (other than someone having to do the work) it could not be used as the standard BLAS for numpy? no reason, and it actually works quite nicely. Bento supports it, at least on Mac/linux. David -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR(206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Windows, blas, atlas and dlls
Hi, We also have the same problem for Theano. Having one reusable BLAS on Windows would be useful to many projects. Also, if possible, try to make it accessible from C and C++ too, not just Cython. Fred On Feb 20, 2013 5:15 AM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote: On 02/20/2013 10:18 AM, Sergio wrote: Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no writes: On 02/18/2013 05:26 PM, rif wrote: I have no answer to the question, but I was curious as to why directly calling the cblas would be 10x-20x slower in the first place. That seems surprising, although I'm just learning about python numerics. The statement was that directly (on the Cython level) calling cblas is 10x-20x faster than going through the (slow) SciPy wrapper routines. That makes a lot of sense if the matrices are small enough. Dag Sverre Sorry for expressing myself badly. I need to call cblas directly from cython, because it is faster. I use matrix multiplication in a tight loop. Let the speed with the standard dot be 100, Speed using the scipy.linalg.blas routines is 200 And speed calling directly atlas from cython is 2000 Which is reasonable, since this avoids any type checking. The point is that I need to ship an extra atlas lib to do so in windows, notwithstanding the fact that numpy/scipy incorporate atlas in the windows build. I was wondering if there is a way to build numpy/scipy with atlas dynamically linked into it, in order to be able to share the atlas libs between my code and scipy. You could also look into OpenBLAS, which is easier to build and generally faster than ATLAS. (But alas, not supported by NumPy/SciPY AFAIK.) Dag Sverre ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ANN: NumPy 1.7.0rc2 release
Hi, As expected, all of Theano's tests passed. Thanks! Fred On Wed, Feb 6, 2013 at 10:10 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote: Hi, I'm pleased to announce the availability of the second release candidate of NumPy 1.7.0, 1.7.0rc2. Sources and binary installers can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.7.0rc2/ We have fixed all issues known to us since the 1.7.0rc1 release. Please test this release and report any issues on the numpy-discussion mailing list. If there are no further problems, I plan to release the final version in a few days. I would like to thank Sandro Tosi, Sebastian Berg, Charles Harris, Marcin Juszkiewicz, Mark Wiebe, Ralf Gommers and Nathaniel J. Smith for sending patches, fixes and helping with reviews for this release since 1.7.0rc1, and Vincent Davis for providing the Mac build machine. Cheers, Ondrej = NumPy 1.7.0 Release Notes = This release includes several new features as well as numerous bug fixes and refactorings. It supports Python 2.4 - 2.7 and 3.1 - 3.3 and is the last release that supports Python 2.4 - 2.5. Highlights == * ``where=`` parameter to ufuncs (allows the use of boolean arrays to choose where a computation should be done) * ``vectorize`` improvements (added 'excluded' and 'cache' keyword, general cleanup and bug fixes) * ``numpy.random.choice`` (random sample generating function) Compatibility notes === In a future version of numpy, the functions np.diag, np.diagonal, and the diagonal method of ndarrays will return a view onto the original array, instead of producing a copy as they do now. This makes a difference if you write to the array returned by any of these functions. To facilitate this transition, numpy 1.7 produces a FutureWarning if it detects that you may be attempting to write to such an array. See the documentation for np.diagonal for details. 
Similar to np.diagonal above, in a future version of numpy, indexing a record array by a list of field names will return a view onto the original array, instead of producing a copy as they do now. As with np.diagonal, numpy 1.7 produces a FutureWarning if it detects that you may be attempting to write to such an array. See the documentation for array indexing for details. In a future version of numpy, the default casting rule for UFunc out= parameters will be changed from 'unsafe' to 'same_kind'. (This also applies to in-place operations like a += b, which is equivalent to np.add(a, b, out=a).) Most usages which violate the 'same_kind' rule are likely bugs, so this change may expose previously undetected errors in projects that depend on NumPy. In this version of numpy, such usages will continue to succeed, but will raise a DeprecationWarning. Full-array boolean indexing has been optimized to use a different, optimized code path. This code path should produce the same results, but any feedback about changes to your code would be appreciated. Attempting to write to a read-only array (one with ``arr.flags.writeable`` set to ``False``) used to raise either a RuntimeError, ValueError, or TypeError inconsistently, depending on which code path was taken. It now consistently raises a ValueError. The ufunc.reduce functions evaluate some reductions in a different order than in previous versions of NumPy, generally providing higher performance. Because of the nature of floating-point arithmetic, this may subtly change some results, just as linking NumPy to a different BLAS implementations such as MKL can. If upgrading from 1.5, then generally in 1.6 and 1.7 there have been substantial code added and some code paths altered, particularly in the areas of type resolution and buffered iteration over universal functions. This might have an impact on your code particularly if you relied on accidental behavior in the past. 
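The 'unsafe' vs. 'same_kind' distinction described above can be checked directly with np.can_cast (a small illustration; a += b with float b and integer a is exactly the kind of usage that starts warning):

```python
import numpy as np

# float64 -> int64 discards the fractional part: allowed under the old
# default 'unsafe' rule, rejected under 'same_kind'.
print(np.can_cast(np.float64, np.int64, casting='unsafe'))     # True
print(np.can_cast(np.float64, np.int64, casting='same_kind'))  # False

# int64 -> float64 is kind-compatible under NumPy's rules.
print(np.can_cast(np.int64, np.float64, casting='same_kind'))  # True
```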
New features Reduction UFuncs Generalize axis= Parameter --- Any ufunc.reduce function call, as well as other reductions like sum, prod, any, all, max and min support the ability to choose a subset of the axes to reduce over. Previously, one could say axis=None to mean all the axes or axis=# to pick a single axis. Now, one can also say axis=(#,#) to pick a list of axes for reduction. Reduction UFuncs New keepdims= Parameter There is a new keepdims= parameter, which if set to True, doesn't throw away the reduction axes but instead sets them to have size one. When this option is set, the reduction result will broadcast correctly to the original operand which was reduced. Datetime support .. note:: The datetime API is *experimental* in 1.7.0, and may undergo changes in future versions of NumPy. There have been a lot of fixes and enhancements to datetime64 compared to NumPy
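The two reduction features described above, axis tuples and the keepdims= parameter, can be sketched together:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)

# Reduce over a subset of the axes at once.
print(x.sum(axis=(0, 2)).shape)   # (3,)

# keepdims=True keeps the reduced axes with size one...
m = x.mean(axis=(0, 2), keepdims=True)
print(m.shape)                    # (1, 3, 1)

# ...so the result broadcasts back against the original operand.
centered = x - m
print(centered.shape)             # (2, 3, 4)
```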
Re: [Numpy-discussion] Modern Fortran vs NumPy syntax
Hi, I just read a paper[1] that compares Python with NumPy or PyPy vs. C++ and Fortran from a code, memory, and speed point of view. The Python code was still better, as you can't have a list of ndarrays in Fortran, and some other things were harder to do. The fastest was Fortran, then C++, with PyPy around 2x slower than C++. That isn't bad for a more productive development language. Maybe you can check that article to find more cases to compare. HTH Fred [1] http://arxiv.org/abs/1301.1334 On Thu, Feb 7, 2013 at 2:22 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote: Hi, I have recently setup a page about modern Fortran: http://fortran90.org/ and in particular, it has a long section with side by side syntax examples of Python/NumPy vs Fortran: http://fortran90.org/src/rosetta.html I would be very interested if some NumPy gurus would provide me feedback. I personally knew NumPy long before I learned Fortran, and I was amazed that the modern Fortran pretty much allows 1:1 syntax with NumPy, including most of all the fancy indexing etc. Is there some NumPy feature that is not covered there? I would like it to be a nice resource for people who know NumPy to feel like at home with Fortran, and vice versa. I personally use both every day (Fortran a bit more than NumPy). Or if you have any other comments or tips for the site, please let me know. Eventually I'd like to also put there C++ way of doing the same things, but at the moment I want to get Fortran and Python/NumPy done first. Ondrej ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Travis failures with no errors
Hi, Go to the Travis-CI site (the next.travis-ci.org part): https://next.travis-ci.org/numpy/numpy/jobs/4118113 When you go that way, if you are authorized, a drop-down menu on the screen lets you ask Travis-CI to rerun the tests. You can do it on the particular job, or on the commit page to rerun all tests for that commit. I find this useful for rerunning failed tests caused by VM errors... HTH Fred On Wed, Jan 16, 2013 at 12:51 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote: On Thu, Dec 20, 2012 at 6:32 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Thu, Dec 20, 2012 at 6:25 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote: On Thu, Dec 13, 2012 at 4:39 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote: Hi, I found these recent weird failures in Travis, but I can't find any problem with the log and all tests pass. Any ideas what is going on? https://travis-ci.org/numpy/numpy/jobs/3570123 https://travis-ci.org/numpy/numpy/jobs/3539549 https://travis-ci.org/numpy/numpy/jobs/3369629 And here is another one: https://travis-ci.org/numpy/numpy/jobs/3768782 Hmm, that is strange indeed. The first three are old, >= 12 days, but the last is new, although the run time was getting up there. Might try running the last one again. I don't know if there is an easy way to do that. And another one from 3 days ago: https://travis-ci.org/numpy/numpy/jobs/4118113 Ondrej ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] New numpy functions: filled, filled_like
Why not optimize NumPy to detect the multiplication of an ndarray by a scalar and call fill? That way, np.empty * 2 would be as fast as x = np.empty; x.fill(2)? Fred On Mon, Jan 14, 2013 at 9:57 AM, Benjamin Root ben.r...@ou.edu wrote: On Mon, Jan 14, 2013 at 7:38 AM, Pierre Haessig pierre.haes...@crans.org wrote: Hi, On 14/01/2013 00:39, Nathaniel Smith wrote: (The nice thing about np.filled() is that it makes np.zeros() and np.ones() feel like clutter, rather than the reverse... not that I'm suggesting ever getting rid of them, but it makes the API conceptually feel smaller, not larger.) Coming from the Matlab syntax, I feel that np.zeros and np.ones are in numpy for Matlab (and maybe others ?) compatibility and are useful for that. Now that I've been enlightened by Python, I think that those functions (especially np.ones) are indeed clutter. Therefore I favor the introduction of these two new functions. However, I think Eric's remark about masked array API compatibility is important. I don't know what other names are possible ? np.const ? Or maybe np.tile is also useful for that same purpose ? In that case adding a dtype argument to np.tile would be useful. best, Pierre I am also +1 on the idea of having a filled() and filled_like() function (I learned a long time ago to just do a = np.empty() and a.fill() rather than the multiplication trick I learned from Matlab). However, the collision with the masked array API is a non-starter for me. np.const() and np.const_like() probably make the most sense, but I would prefer a verb over a noun. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
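The empty-then-fill idiom Ben mentions can be compared with the multiplication trick directly (a small sketch; NumPy later gained np.full, which wraps exactly this pattern):

```python
import numpy as np

n = 5

# The multiplication trick first builds an intermediate array of ones.
a = np.ones(n) * 2.0

# The empty-then-fill idiom writes the value once, with no temporary.
b = np.empty(n)
b.fill(2.0)

print((a == b).all())  # True: same result, one fewer allocation
```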
Re: [Numpy-discussion] 1.8 release
Hi, I don't volunteer to be the next release manager, but +1 for shorter releases. I have heard only good comments about that. Also, I'm not sure it would ask more of the release manager. Does someone have an idea? The most work I do as release manager for Theano is the preparation/tests/release notes, and this depends on the amount of new stuff. And this seems exponential in the number of changes in the release, not linear (no data, just an impression...). Making smaller releases makes this easier. But yes, this means more announcements. But that isn't what takes the most time. Also, doing the release notes more frequently means the changes are more recent in memory when you check the merged PRs, so it is easier to do. But what prevents us from making shorter releases? Other priorities that can't wait, like work for papers to submit, or collaborations with partners. Just my 2 cents. Fred On Mon, Jan 14, 2013 at 7:18 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Mon, Jan 14, 2013 at 12:19 AM, David Cournapeau courn...@gmail.com wrote: On Sun, Jan 13, 2013 at 5:26 PM, Nathaniel Smith n...@pobox.com wrote: On Sun, Jan 13, 2013 at 7:03 PM, Charles R Harris charlesr.har...@gmail.com wrote: Now that 1.7 is nearing release, it's time to look forward to the 1.8 release. I'd like us to get back to the twice yearly schedule that we tried to maintain through the 1.3 - 1.6 releases, so I propose a June release as a goal. Call it the Spring Cleaning release. As to content, I'd like to see the following. Removal of Python 2.4-2.5 support. Removal of SCons support. The index work consolidated. Initial stab at removing the need for 2to3. See Pauli's PR for scipy. Miscellaneous enhancements and fixes. I'd actually like to propose a faster release cycle than this, even. Perhaps 3 months between releases; 2 months from release n to the first beta of n+1? The consequences would be: * Changes get out to users faster. 
* Each release is smaller, so it's easier for downstream projects to adjust to each release -- instead of having this giant pile of changes to work through all at once every 6-12 months * End-users are less scared of updating, because the changes aren't so overwhelming, so they end up actually testing (and getting to take advantage of) the new stuff more. * We get feedback more quickly, so we can fix up whatever we break while we still know what we did. * And for larger changes, if we release them incrementally, we can get feedback before we've gone miles down the wrong path. * Releases come out on time more often -- sort of paradoxical, but with small, frequent releases, beta cycles go smoother, and it's easier to say don't worry, I'll get it ready for next time, or right, that patch was less done than we thought, let's take it out for now (also this is much easier if we don't have another years worth of changes committed on top of the patch!). * If your schedule does slip, then you still end up with a 6 month release cycle. 1.6.x was branched from master in March 2011 and released in May 2011. 1.7.x was branched from master in July 2012 and still isn't out. But at least we've finally found and fixed the second to last bug! Wouldn't it be nice to have a 2-4 week beta cycle that only found trivial and expected problems? We *already* have 6 months worth of feature work in master that won't be in the *next* release. Note 1: if we do do this, then we'll also want to rethink the deprecation cycle a bit -- right now we've sort of vaguely been saying well, we'll deprecate it in release n and take it out in n+1. Whenever that is. 3 months definitely isn't long enough for a deprecation period, so if we do do this then we'll want to deprecate things for multiple releases before actually removing them. Details to be determined. 
Note 2: in this kind of release schedule, you definitely don't want to say here are the features that will be in the next release!, because then you end up slipping and sliding all over the place. Instead you say here are some things that I want to work on next, and we'll see which release they end up in. Since we're already following the rule that nothing goes into master until it's done and tested and ready for release anyway, this doesn't really change much. Thoughts? Hey, my time to have a time-machine: http://mail.scipy.org/pipermail/numpy-discussion/2008-May/033754.html I still think it is a good idea :) I guess it is the release manager who has by far the largest say in this. Who will that be for the next year or so? Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Remove support for numeric and numarray in 1.8
Hi, Just to note: as they plan to remove their dependency on it this year, is it bad that they can't use 1.8 for a few months until they finish the conversion? They already have a working version. They can continue to use it for as long as they want. The only advantage for them, if the compat layers are kept, is the ability to use the new NumPy 1.8 a few months earlier. I don't know enough about this issue, but from Nathaniel's description, the consequence of dropping it in 1.8 seems light compared to the potential problem, in my view. But the question is: how many other groups are in their situation? Can we make a big warning print when we compile those compatibility layers, to make it clear they will get removed? (Probably it is already like that.) My 2 cents, Fred On Wed, Jan 9, 2013 at 6:41 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Jan 9, 2013 at 4:21 PM, Christopher Hanley chan...@gmail.com wrote: After poking around our code base and talking to a few folks I predict that we at STScI can remove our dependence on the numpy-numarray compatibility layer by the end of this calendar year. I'm unsure of what the timeline for numpy 1.8 is so I don't know if this schedule supports removal of the compatibility layer from 1.8 or not. Together with the previous post that puts the kibosh on removing either numeric or numarray support from 1.8, at least if we get 1.8 before the end of summer. It's good to know where folks stand with regard to those packages, we'll give it another shot next year. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ANN: NumPy 1.7.0rc1 release
Hi, Congratulations on the release, and a big thanks for the hard work. I tested it with our software and everything works fine. Thanks! Frédéric On Sun, Dec 30, 2012 at 7:17 PM, Sandro Tosi mo...@debian.org wrote: Hi Ondrej et al., On Sat, Dec 29, 2012 at 1:02 AM, Ondřej Čertík ondrej.cer...@gmail.com wrote: I'm pleased to announce the availability of the first release candidate of NumPy 1.7.0rc1. Congrats on this RC release! I've uploaded this version to Debian and updated some of the issues related to it. There are also a couple of minor PRs you might want to consider for 1.7: 2872 and 2873. Cheers, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Howto bisect old commits correctly
Hi, I had many errors when trying to build the checked-out version and recompile. The problem was that I didn't erase the build directory each time; this causes problems, as not everything is recompiled correctly in that case. Just deleting this directory manually fixed my problem. HTH Fred On Fri, Jan 4, 2013 at 8:29 PM, Sebastian Berg sebast...@sipsolutions.net wrote: On Sat, 2013-01-05 at 00:17 +0100, Sebastian Berg wrote: Hey, this is probably just because I do not have any experience with bisect and the like, but when I try running a bisect I keep running into: Nevermind that. Probably I just stumbled on some bad versions... ImportError: /home/sebastian/.../lib/python2.7/site-packages/numpy/core/multiarray.so: undefined symbol: PyDataMem_NEW or: RuntimeError: module compiled against API version 8 but this version of numpy is 7 I am sure I am missing something simple, but I have no idea where to look. Am I just forgetting to delete some things and my version is not clean!? Regards, Sebastian ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] required nose version.
This is fine for us. Frédéric On Sun, Dec 16, 2012 at 5:36 PM, Charles R Harris charlesr.har...@gmail.com wrote: Hi All, Looking at INSTALL.txt with an eye to updating it since we have dropped Python 2.4-2.5 support, it looks like we could update the nose version also. The first version of nose to support Python 3 was 1.0, but I think 1.1 would be better because of some bug fixes. IPython also requires nose 1.1. So I propose the required nose version be updated to 1.1. Thoughts? ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Status of the 1.7 release
Hi, I added a new issue for a regression in numpy.ndindex() that we already discussed, but it was a duplicate [1], so I closed it. I think it got lost because the ticket wasn't marked for the 1.7 milestone. Can someone do that? I don't have the rights. This regression breaks something in Theano. We can work around it, but from a comment in that ticket it also breaks stuff in SciPy. Fred [1] github.com/numpy/numpy/issues/2781 On Mon, Dec 17, 2012 at 2:11 AM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Mon, Dec 17, 2012 at 12:26 AM, Nathaniel Smith n...@pobox.com wrote: On 16 Dec 2012 23:01, Charles R Harris charlesr.har...@gmail.com wrote: On Sun, Dec 16, 2012 at 3:50 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote: Thanks Ralf and Nathan, I have put high priority on the issues that need to be fixed before the rc1. There are now 4 issues: https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&page=1&state=open I am working on the mingw one, as that one is the most difficult. Ralf (or anyone else), do you know how to fix this one: https://github.com/numpy/numpy/issues/438 I am not very familiar with this part of numpy, so maybe you know how to document it well. The sooner we can fix these 4 issues, the sooner we can release. I believe mingw was updated last month to a new compiler version. I don't know what other changes there were, but it is possible that some problems have been fixed. It'd be worth checking in case it allows us to get off the (incredibly old) GCC that we currently require on windows. But that's a long-term problem that we probably shouldn't be messing with for 1.7 purposes. AFAICT all we need to do for 1.7 is switch to using our current POSIX code on win32 as well, instead of the (weird and broken) MS-specific API that we're currently using.
(Plus suppress some totally spurious warnings): http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063346.html (Or I could be missing something, but I don't think any problems with that solution have been discussed on the list anyway.) AFAICT Nathaniel's suggestion in the thread linked above is the way to go. Trying again to go to gcc 4.x doesn't sound like a good idea. Probably David C. already has a good idea about whether recent changes to MinGW have made a difference to the issue he ran into about a year ago. Ralf ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Status of the 1.7 release
While we are at it, back-porting https://github.com/numpy/numpy/pull/2730 would give a good speed-up for an LTS release. I made a new PR that does this back-port: https://github.com/numpy/numpy/pull/2847 Fred On Mon, Dec 17, 2012 at 11:46 AM, David Warde-Farley d.warde.far...@gmail.com wrote: A bit off-topic, but could someone have a look at https://github.com/numpy/numpy/pull/2699 and provide some feedback? If 1.7 is meant to be an LTS release, this would be a nice wart to have out of the way. The Travis failure was a spurious one that has since been fixed. On Sat, Dec 15, 2012 at 6:52 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote: Hi, If you go to the issues for 1.7 and click high priority: https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&state=open you will see 3 issues as of right now. Two of those have PRs attached. It's been a lot of work to get to this point and I'd like to thank all of you for helping out with the issues. In particular, I have just fixed a very annoying segfault (#2738) in the PR: https://github.com/numpy/numpy/pull/2831 If you can review that one carefully, that would be highly appreciated. The more people the better; it's a reference counting issue, and since this would go into the 1.7 release and it's in the core of numpy, I want to make sure that it's correct. So the last high-priority issue is: https://github.com/numpy/numpy/issues/568 and that's the one I will be concentrating on now. After it's fixed, I think we are ready to release the rc1. There are more open issues (that are not high priority): https://github.com/numpy/numpy/issues?labels=&milestone=3&page=1&state=open But I don't think we should delay the release any longer because of them. Let me know if there are any objections. Of course, if you attach a PR fixing any of those, we'll merge it.
Ondrej ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Use OpenBLAS for the binary releases?
Hi, it was mainly developed by one person, and he left for Intel. They relicensed GotoBLAS2 under the BSD 3-clause license, and some Chinese developers forked it as OpenBLAS. There was another fork, but I haven't heard news of it; maybe I just missed the news. Fred On Mon, Nov 19, 2012 at 12:20 PM, Daniele Nicolodi dani...@grinta.net wrote: On 19/11/2012 18:12, Sturla Molden wrote: I think NumPy and SciPy should consider to use OpenBLAS (a fork of GotoBLAS2) instead of ATLAS or f2c'd Netlib BLAS for the binary releases. ... * Funded and developed for use in major Chinese HPC projects. Actively maintained. (GotoBLAS2 is abandonware.) Hello Sturla, do you know why GotoBLAS2 is not worked on anymore? It looked like a project with serious university backing and its web page does not reflect the fact that it has been discontinued by its authors. Thanks. Cheers, Daniele ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Scipy dot
On Fri, Nov 9, 2012 at 3:32 AM, Nathaniel Smith n...@pobox.com wrote: On Fri, Nov 9, 2012 at 6:18 AM, Nicolas SCHEFFER scheffer.nico...@gmail.com wrote: Fred, Thanks for the advice. The code will only affect the part in _dotblas.c where gemm is called. There are tons of checks before that which make sure both matrices are of ndim 2. We should check, though, whether we can do these tricks in other parts of the function. Otherwise: - I've built against ATLAS 3.10 - I'm happy to add a couple more tests for C- and F-contiguous. I'm not sure how to get the third type (strided); would you have an example?

def with_memory_order(a, order):
    assert order in ("C", "F", "discontig")
    assert a.ndim == 2
    if order in ("C", "F"):
        return np.asarray(a, order=order)
    else:
        buf = np.empty((a.shape[0] * 2, a.shape[1] * 2), dtype=a.dtype)
        buf[::2, ::2] = a
        # This returns a view onto every other element of 'buf':
        result = buf[::2, ::2]
        assert not result.flags.c_contiguous and not result.flags.f_contiguous
        return result

The following test, for instance, checks integrity against multiarray.dot, which I believe is the default when not compiled with BLAS. Dot is a hard function to test imho, so if anybody has ideas on what kind of test they'd like to see, please let me know. If that's ok I might now be able to: - Check for more bugs; I need to dig a bit more into the gemm call and make sure everything is ok. - Create an issue on github and link to this discussion - Make a commit in a separate branch - Move forward like that.
==

import numpy as np
from time import time
from numpy.testing import assert_almost_equal

def test_dot_regression():
    """Test numpy dot by comparing with multiarray dot."""
    np.random.seed(7)
    a = np.random.randn(3, 3)
    b = np.random.randn(3, 2)
    c = np.random.randn(2, 3)
    _dot = np.core.multiarray.dot
    assert_almost_equal(np.dot(a, a), _dot(a, a))
    assert_almost_equal(np.dot(b, c), _dot(b, c))
    assert_almost_equal(np.dot(b.T, c.T), _dot(b.T, c.T))
    assert_almost_equal(np.dot(a.T, a), _dot(a.T, a))
    assert_almost_equal(np.dot(a, a.T), _dot(a, a.T))
    assert_almost_equal(np.dot(a.T, a.T), _dot(a.T, a.T))

You should check that the result is C-contiguous in all cases too.

for a_order in ("C", "F", "discontig"):
    for b_order in ("C", "F", "discontig"):
        this_a = with_memory_order(a, a_order)
        this_b = with_memory_order(b, b_order)
        result = np.dot(this_a, this_b)
        assert_almost_equal(result, expected)
        assert result.flags.c_contiguous

You could also wrap the above in yet another loop to try a few different combinations of a and b matrices (perhaps after sticking the code into a utility function, like run_dot_tests(a, b, expected), so the indentation doesn't get out of hand ;-)). Then you can easily test some of the edge cases, like Nx1 matrices. I agree that tests are needed for the Nx1 and variant cases. I saw BLAS errors being raised with some BLAS versions. You also need to test with the output provided, so there are 3 loops:

for a_order in ("C", "F", "discontig", "neg"):
    for b_order in ("C", "F", "discontig", "neg"):
        for c_order in ("C", "F", "discontig", "neg"):
            ...

I also added the stride type "neg" (result = buf[::-1, ::-1]); I'm not sure it is needed, but that is another corner case. I just looked again at our code and there is another constraint: that the strides are multiples of the elemsize. Theano does not support unaligned arrays, but numpy does, so there is a need for a test for this.
You can make an unaligned array like this:

dtype = "b1,f4"
a = numpy.empty(1e4, dtype=dtype)['f1']

I just saw the strides problems that affect only some BLAS versions. I think the best explanation is our code:

/* create appropriate strides for malformed matrices that are row or column
 * vectors, or empty matrices.
 * In that case, the value of the stride does not really matter, but
 * some versions of BLAS insist that:
 *  - they are not smaller than the number of elements in the array,
 *  - they are not 0.
 */
sx_0 = (Nx[0] > 1) ? Sx[0]/type_size : (Nx[1] + 1);
sx_1 = (Nx[1] > 1) ? Sx[1]/type_size : (Nx[0] + 1);
sy_0 = (Ny[0] > 1) ? Sy[0]/type_size : (Ny[1] + 1);
sy_1 = (Ny[1] > 1) ? Sy[1]/type_size : (Ny[0] + 1);
sz_0 = (Nz[0] > 1) ? Sz[0]/type_size : (Nz[1] + 1);
sz_1 = (Nz[1] > 1) ? Sz[1]/type_size : (Nz[0] + 1);

So this asks for tests with empty matrices too. HTH Fred ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
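The corner cases discussed in this thread — discontiguous views, negative strides, and unaligned data — can be sketched as a small self-contained check. This is illustrative only (the names and sizes are arbitrary, not from the proposed patch):

```python
import numpy as np
from numpy.testing import assert_almost_equal

a = np.random.randn(4, 4)

# Discontiguous: a view onto every other element of a larger buffer.
buf = np.empty((8, 8))
buf[::2, ::2] = a
discontig = buf[::2, ::2]
assert not discontig.flags.c_contiguous and not discontig.flags.f_contiguous

# Negative strides: reverse both axes.
neg = a[::-1, ::-1]

# Unaligned but contiguous: offset a byte buffer by one, then reinterpret.
u = np.empty(100, dtype="i1")[1:-3].view(np.float32)
assert u.flags.c_contiguous and not u.flags.aligned

# np.dot must give the same values regardless of memory layout.
expected = np.dot(a, a)
assert_almost_equal(np.dot(discontig, discontig), expected)
# Reversing both axes of both operands reverses both axes of the product.
assert_almost_equal(np.dot(neg, neg), expected[::-1, ::-1])
```

The last assertion follows because `a[::-1, ::-1]` is `P a P` for the exchange matrix `P`, so the product is `P (a a) P`.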
Re: [Numpy-discussion] Scipy dot
On Fri, Nov 9, 2012 at 10:20 AM, Nathaniel Smith n...@pobox.com wrote: On Fri, Nov 9, 2012 at 2:45 PM, Frédéric Bastien no...@nouiz.org wrote: snip I just looked again at our code and there is another constraint: that the strides are multiples of the elemsize. Theano does not support unaligned arrays, but numpy does, so there is a need for a test for this. You can make an unaligned array like this:

dtype = "b1,f4"
a = numpy.empty(1e4, dtype=dtype)['f1']

I think you're mixing up two issues here...
Requiring that the strides and the itemsize match is part of the requirement for C- and F-contiguity; the example code you give here produces a non-contiguous array, so it's already checked for by the function we're talking about. However, there can be contiguous arrays that are not aligned, like:

In [25]: a = np.empty(100, dtype="i1")[1:-3].view(np.float32)
In [26]: a.flags.c_contiguous
Out[26]: True
In [27]: a.flags.aligned
Out[27]: False

I suspect the np.dot code that Nicolas is looking at already checks for the ALIGNED flag and makes a copy if necessary, but it would be good to have an array like this in the tests to be sure. You are right, my test wasn't good and your example is good. Nicolas, I hope I don't de-motivate you with all those details; this needs to be done. When dealing with BLAS libs, the devil is in the details... and the fact that not all libs accept the same inputs... We hit
Re: [Numpy-discussion] Scipy dot
Hi, I also think it should go into numpy.dot and that the output order should not be changed. A new point: what about the additional overhead for small ndarrays? To remove it, I would suggest putting this code into the C function that does the actual work (at least, from memory, it is a C function, not a Python one). HTH Fred On Thu, Nov 8, 2012 at 12:29 PM, Anthony Scopatz scop...@gmail.com wrote: On Thu, Nov 8, 2012 at 7:06 AM, David Cournapeau courn...@gmail.com wrote: On Thu, Nov 8, 2012 at 12:12 PM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote: On 11/08/2012 01:07 PM, Gael Varoquaux wrote: On Thu, Nov 08, 2012 at 11:28:21AM +, Nathaniel Smith wrote: I think everyone would be very happy to see numpy.dot modified to do this automatically. But adding a scipy.dot IMHO would be fixing things in the wrong place and just create extra confusion. I am not sure I agree: numpy is often compiled without lapack support, as it is not necessary. On the other hand scipy is always compiled with lapack. Thus this makes more sense in scipy. Well, numpy.dot already contains multiple fallback cases for when it is compiled with BLAS and not. So I'm +1 on just making this an improvement on numpy.dot. I don't think there's a time when you would not want to use this (provided the output order issue is fixed), and it doesn't make sense to not have old codes take advantage of the speed improvement. Indeed, there is no reason not to make this available in NumPy. Nicolas, can you prepare a patch for numpy? +1, I agree, this should be a fix in numpy, not scipy. Be Well Anthony David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Scipy dot
On Thu, Nov 8, 2012 at 12:06 PM, Nicolas SCHEFFER scheffer.nico...@gmail.com wrote: I've made the necessary changes to get the proper order for the output array. Also, a pass of pep8 and some tests (fixmes are in failing tests) http://pastebin.com/M8TfbURi -n On Thu, Nov 8, 2012 at 11:38 AM, Nicolas SCHEFFER scheffer.nico...@gmail.com wrote: Thanks for all the responses folks. This is indeed a nice problem to solve. Few points: I. Change the order from 'F' to 'C': I'll look into it. II. Integration with scipy / numpy: opinions are diverging here. Let's wait a bit to get more responses on what people think. One thing though: I'd need the same functionality as get_blas_funcs in numpy. Since numpy does not require lapack, what functions can I get? III. Complex arrays: I unfortunately don't have enough knowledge here. If someone could propose a fix, that'd be great. IV. C: Writing this in C sounds like a good idea. I'm not sure I'd be the right person to do this though. V. Patch in numpy: I'd love to do that and learn to do it as a byproduct. Let's make sure we agree this can go into numpy first and that all FIXMEs can be fixed. Although I guess we can resolve fixmes using git. Let me know how you'd like to proceed. Thanks! FIXMEs: - Fix for ndim != 2 - Fix for dtype == np.complex* - Fix order of output array On Thu, Nov 8, 2012 at 9:42 AM, Frédéric Bastien no...@nouiz.org wrote: Hi, I also think it should go into numpy.dot and that the output order should not be changed. A new point: what about the additional overhead for small ndarrays? To remove it, I would suggest putting this code into the C function that does the actual work (at least, from memory, it is a C function, not a Python one).
HTH Fred On Thu, Nov 8, 2012 at 12:29 PM, Anthony Scopatz scop...@gmail.com wrote: snip +1, I agree, this should be a fix in numpy, not scipy.
Be Well Anthony David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
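Fred's point above about per-call overhead for small ndarrays can be illustrated with a rough timing sketch (the sizes and repeat counts here are arbitrary choices, not from the proposed patch):

```python
import numpy as np
from timeit import timeit

# Rough sketch: for tiny operands, np.dot's cost is dominated by per-call
# dispatch overhead; for large operands, by the BLAS gemm itself.
small = np.random.randn(2, 2)
large = np.random.randn(400, 400)

t_small = timeit(lambda: np.dot(small, small), number=1000)  # overhead-bound
t_large = timeit(lambda: np.dot(large, large), number=10)    # compute-bound

# One large product costs far more than one small product, even though
# the small case pays mostly fixed overhead per call.
assert t_large / 10 > t_small / 1000
```

Any extra Python-level checks added before the gemm call would land entirely in the overhead term, which is why doing them in C matters for small arrays.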
Re: [Numpy-discussion] 1.7.0 release
Hi, I updated numpy master and recompiled it. I still have the compilation error I got from Theano. I'll bring that email thread up again to have the history, and I made a PR for this. Also, I think I said that numpy.ndindex changed its interface: in the past numpy.ndindex() was valid, and did not raise an error:

>>> import numpy
>>> a = numpy.ndindex()
>>> a.next()
()
>>> a.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/lisa/os/epd-7.1.2/lib/python2.7/site-packages/numpy/lib/index_tricks.py", line 577, in next
    raise StopIteration
StopIteration
>>> numpy.__version__
'1.6.1'

The error I have with master: [...] ValueError: __array_interface__ shape must be at least size 1 That is the only stopper I saw, but I didn't follow what was needed by other people. Fred On Tue, Nov 6, 2012 at 2:33 AM, Travis Oliphant tra...@continuum.io wrote: Hey all, Ondrej has been tied up finishing his PhD for the past several weeks. He is defending his work shortly and should be available to continue to help with the 1.7.0 release around the first of December. He and I have been in contact during this process, and I've been helping where I can. Fortunately, other NumPy developers have been active closing tickets and reviewing pull requests, which has helped the process substantially. The release has taken us longer than we expected, but I'm really glad that we've received the bug reports and issues that we have seen, because it will help the 1.7.0 release be a more stable series. Also, the merging of the Trac issues with Git has exposed over-looked problems as well and will hopefully encourage more Git-focused participation by users. We are targeting getting the final release of 1.7.0 out by mid December (based on Ondrej's availability). But, I would like to find out which issues are seen as blockers by people on this list.
I think most of the issues that I had as blockers have been resolved. If there are no more remaining blockers, then we may be able to accelerate the final release of 1.7.0 to just after Thanksgiving. Best regards, -Travis ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
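The ndindex regression discussed above is easy to check. A minimal sketch, assuming a NumPy where the regression has been fixed (the zero-argument form again behaves like the 1.6.1 session shown earlier):

```python
import numpy as np

# With an explicit shape, ndindex iterates over all index tuples:
assert list(np.ndindex(2, 2)) == [(0, 0), (0, 1), (1, 0), (1, 1)]

# The zero-argument (0-d) case yields a single empty tuple, matching
# the old behaviour: one "index" into a zero-dimensional array.
assert list(np.ndindex()) == [()]
```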
Re: [Numpy-discussion] np 1.7b2 PyArray_BYTES(obj)=ptr fail
Hi, I made a PR with my fix: https://github.com/numpy/numpy/pull/2709 Frédéric On Tue, Oct 2, 2012 at 6:18 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Oct 2, 2012 at 1:44 PM, Frédéric Bastien no...@nouiz.org wrote: With numpy 1.6.2, it is working. So this is an interface change. Are you sure you want this? This breaks existing code. I do not understand what you mean by slot? Pythonese for structure member ;) I'm not sure the PyArray_SWAP is a good long-term idea. I would not make it if it is only temporary. The C++ stdlib provides something similar for std::vector. One common use case would be to pass in a vector by reference that gets swapped with one on the stack. When the function exits, the one on the stack is cleaned up and the vector that was passed in has the new data, but it has to be the same type. For PyArray_SWAP I was thinking of swapping everything: type, dims, strides, data, etc. That is what f2py does. To set the base ptr, there is the PyArray_SetBaseObject() function that is new in 1.7. Is a similar function useful in the long term for numpy? In the case where we implement the ndarray object differently, I think it won't be useful. We will also need to know how the memory is laid out by numpy for performance-critical code, so we will need an attribute that tells which internal structure is used. So do you want to force this interface change in numpy 1.7, so that I need to write code now, or can I wait to do it when you force the new interface? Well, no, we don't want to force you to use the new interface. If you don't define NPY_NO_DEPRECATED_API things should still work. Although if it is defined the function returns an rvalue, so some other method needs to be provided for what you are doing. Currently the code used for PyArray_BYTES is:

#define PyArray_BYTES(obj) ((char *)(((PyArrayObject_fields *)(obj))->data))

if I change it to

#define PyArray_BYTES(obj) (((PyArrayObject_fields *)(obj))->data)

it works!
I don't understand why removing the cast makes it work. The data field is already a (char *), so this should not make a difference to my understanding. But I'm missing something here; does someone know? What I find strange is that it is the same macro in 1.7 and 1.6; only the name of the structure was changed. Hmm... This looks almost like some compiler subtlety; I wonder if the compiler version/optimization flags have changed? In any case, I think the second form would be more correct for the lvalue, since the structure member is, as you say, already a char*. We want things to work for you as they should, so we need to understand this and fix it. snip Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] NumPy to CPU+GPU compiler, looking for tests
On Tue, Oct 23, 2012 at 11:48 AM, Henry Gomersall h...@cantab.net wrote: On Tue, 2012-10-23 at 11:41 -0400, Frédéric Bastien wrote: Did you see the gpu nd array project? We try to do something similar, but only for the GPU. Out of interest, is there a reason why the backend for Numpy could not be written entirely in OpenCL? Assuming of course all the relevant backends are up to scratch. Is there a fundamental reason why targeting a CPU through OpenCL is worse than doing it exclusively through C or C++? First, OpenCL does not allow us to do pointer arithmetic. So when taking a slice of an ndarray, we can't just move the pointer; we would need to change the object structure. I didn't do any speed analysis of this, but I think that by using OpenCL it would have a bigger overhead, so it would only be useful for big ndarrays. I don't have any size in mind either. I don't know, but if we could access the OpenCL data directly from C/C++, we could bypass this for small arrays if we want. But maybe this is not possible! Fred ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
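The pointer-arithmetic point above can be seen from the Python side: a basic NumPy slice is a view whose data pointer is simply offset into its parent's buffer. A minimal sketch (the array is arbitrary):

```python
import numpy as np

a = np.arange(10, dtype=np.float32)
b = a[3:]  # basic slicing creates a view, not a copy

# The view's data pointer is the parent's pointer plus 3 elements --
# exactly the pointer arithmetic an opaque OpenCL buffer doesn't expose.
a_ptr = a.__array_interface__['data'][0]
b_ptr = b.__array_interface__['data'][0]
assert b_ptr - a_ptr == 3 * a.itemsize
assert b.base is a  # both arrays share one underlying buffer
```

An OpenCL-backed ndarray would instead have to carry an explicit (buffer, offset) pair in its object structure, which is the structural change Fred describes.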
Re: [Numpy-discussion] NumPy to CPU+GPU compiler, looking for tests
That is possible. The gpu nd array project I mentioned above works on the CPU and the GPU, in OpenCL and with CUDA. But there is much stuff in numpy that we haven't ported. We started integrating it into Theano, so the GPU code from Theano will be ported to this project, and there will be more code available later. If some people are willing to help/collaborate, don't hesitate. I think we should collaborate much more on GPU/OpenCL code than we do now; that was part of the goal of gpu nd array. The PyCUDA and PyOpenCL authors also collaborate with us. Fred On Mon, Oct 29, 2012 at 11:26 AM, Henry Gomersall h...@cantab.net wrote: On Mon, 2012-10-29 at 11:11 -0400, Frédéric Bastien wrote: Assuming of course all the relevant backends are up to scratch. Is there a fundamental reason why targeting a CPU through OpenCL is worse than doing it exclusively through C or C++? First, OpenCL does not allow us to do pointer arithmetic. So when taking a slice of an ndarray, we can't just move the pointer; we would need to change the object structure. I didn't do any speed analysis of this, but I think that by using OpenCL it would have a bigger overhead, so it would only be useful for big ndarrays. I don't have any size in mind either. I don't know, but if we could access the OpenCL data directly from C/C++, we could bypass this for small arrays if we want. But maybe this is not possible! My understanding is that when running OpenCL on CPU, one can simply map memory from a host pointer using CL_MEM_USE_HOST_PTR during buffer creation. On a CPU, this will result in no copies being made. The overhead is clearly an issue, and was the subject of my question. I wouldn't be surprised to find that the speedup associated with the free multithreading that comes with OpenCL on CPU, along with the vector data types mapping nicely to SSE etc, would make OpenCL on CPU faster on any reasonably sized array.
It strikes me that if there is a neat way in which numpy objects can be represented by coherent versions in both main memory and device memory, then OpenCL could be used when it makes sense (either on CPU or GPU), and the CPU natively when _it_ makes sense. Henry ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] NumPy to CPU+GPU compiler, looking for tests
On Mon, Oct 29, 2012 at 11:53 AM, Henry Gomersall h...@cantab.net wrote:

> On Mon, 2012-10-29 at 11:49 -0400, Frédéric Bastien wrote:
> > That is possible.
>
> Great!

Just to be clear, I mean that it is possible to make it work; we do not do that for now. Also, sharing memory between the CPU and the GPU is not trivial if we want it to be efficient, so I would be surprised if we made it work other than by copying when needed. Most of the time, if the data is on the GPU it is better to keep it there, as the transfer overhead will probably be too high to gain efficiency by using the GPU.

> > The gpu nd array project I talked about above works on the CPU and the
> > GPU, in OpenCL and with CUDA. But there is much in numpy that we
> > haven't ported.
>
> This is: https://github.com/inducer/compyte/wiki right?

Yes.

> > We started integrating it into Theano. This means the GPU code from
> > Theano will be ported to this project, so there will be more code
> > available later. If some people are willing to help/collaborate, don't
> > hesitate. I think we should collaborate much more on GPU/OpenCL code
> > than we do now. That was part of the goal of gpu nd array. The PyCUDA
> > and PyOpenCL authors also collaborate with us.
>
> I'm interested, but somewhat of an OpenCL rookie with limited time! I'll
> have a play when I have a moment.

Great. I think people work best when they work on things that motivate them, so if there is any algorithm, even a simple one, that you want to work on, do not hesitate.

Fred
Re: [Numpy-discussion] NumPy to CPU+GPU compiler, looking for tests
Did you see the gpu nd array project? We are trying to do something similar, but only for the GPU. https://github.com/inducer/compyte/wiki

Fred

On Sun, Oct 21, 2012 at 2:57 PM, Rahul Garg rahulgar...@gmail.com wrote:

> Thanks! I need to add support for eig and inv (will do this week, at
> least for CPU), but other than that I should definitely be able to
> handle those kinds of benchmarks.
>
> rahul
>
> On Sun, Oct 21, 2012 at 12:01 PM, Aron Ahmadia a...@ahmadia.net wrote:
>
> > Hi Rahul,
> >
> > Very cool! I'm looking forward to seeing some performance results!
> > Anders Logg posted a computational challenge to G+ about a month ago,
> > and we got entries in Octave, Fortran, Python, and Julia (all
> > implementing the same solution from Jed Brown). The challenge is here:
> > https://plus.google.com/116518787475147930287/posts/jiULACjiGnW
> >
> > Here is my simple attempt at Cythonizing Jed's Octave code:
> > https://gist.github.com/3893361
> >
> > The best solution in Fortran took 38 microseconds. The best Python
> > solution clocked in at around 445. The Julia solution implemented by
> > Jed took around 224 microseconds; a good LLVM solution should come
> > close to or beat that.
> >
> > Hope this helps.
> >
> > Aron
> >
> > On Sun, Oct 21, 2012 at 3:27 PM, Rahul Garg rahulgar...@gmail.com wrote:
> >
> > > Hi. I am a PhD student at McGill University, and I am developing a
> > > compiler for Python for CPUs and GPUs. For CPUs, I build upon LLVM;
> > > for GPUs, I generate OpenCL, and I have also implemented some
> > > library functions on the GPU myself. The restriction is that it is
> > > only for numerical code and intended for NumPy users. The compiler
> > > is aware of simple things in NumPy like matrix multiplication,
> > > slicing operators, strided layouts, some library functions (though
> > > limited at this time), negative-indexing semantics, etc. However,
> > > the compiler is not limited to vector code: scalar code and manually
> > > written loops also work. Only numerical datatypes are supported,
> > > with no support for lists, dicts, classes, etc.
First-class functions are not currently supported but are on the roadmap. You will have to add some type annotations to your functions. If you have a compatible GPU, you can also use the GPU by indicating which parts to run on it; otherwise you can just use the compiler to run your code on the CPU. As an example, simple scalar code like a Fibonacci function works fine. Simple loops like those used in stencil-type computations are also working, parallel-for loops are provided and working, and simple vector-oriented code works on both CPU and GPU.

The system is being tested on Ubuntu 12.04 with Python 2.7 (though I think it should work with other Python 2.x variants). For GPUs, I am ensuring that the system works with AMD and Nvidia GPUs.

The compiler is in its early stages and I am looking for test cases. The project will be open-sourced in November under Apache 2 and thereafter developed in an open repo. If you have some simple code that I can use as a benchmark to test and evaluate the compiler, that will be very helpful. Some annotations will be required, which I can help you write. I will be VERY grateful to anyone who can provide test cases; in turn, it will help improve the compiler and everyone will benefit.

Some of you may be wondering how it compares to Numba. It is essentially very similar in idea. So why build a new compiler? The project I am building is not specific to Python: I am building a far more general compiler infrastructure for array languages, and the Python frontend is just one small part of it. For example, I am also working on a MATLAB frontend. (Some of you may remember me from an earlier compiler project that unfortunately went nowhere. This is a different project, and this time I am determined to turn it into a usable system. I realize the proof is in the pudding, so I hope to convince people by releasing code soon.)
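As an illustration of the kind of stencil test case Rahul is asking for (plain NumPy here; any annotations the compiler would need are not shown, and `jacobi_step` is just a hypothetical benchmark name):

```python
import numpy as np

def jacobi_step(u):
    """One Jacobi relaxation sweep on a 2-D grid: a typical simple
    stencil loop that an array-language compiler should handle."""
    v = u.copy()
    # each interior point becomes the average of its four neighbours
    v[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                            u[1:-1, :-2] + u[1:-1, 2:])
    return v

u = np.zeros((5, 5))
u[0, :] = 1.0               # fixed boundary condition on one edge
u1 = jacobi_step(u)
# interior points next to the hot edge pick up a quarter of its value
assert u1[1, 2] == 0.25
```

The same sweep written as two nested scalar loops would exercise the compiler's loop path rather than its vector path.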
thanks,
Rahul
Re: [Numpy-discussion] Is there a way to reset an accumulate function?
Hi,

Why not start counting from the end of the vector until you find a nan? Your problem does not require checking the full vector.

Fred

On Tue, Oct 23, 2012 at 1:11 PM, Cera, Tim t...@cerazone.net wrote:

> I have an array that is peppered throughout in random spots with 'nan'.
> I would like to use 'cumsum', but I want it to reset the accumulation to
> 0 whenever a 'nan' is encountered. Is there a way to do this, aside from
> a loop - which is what I am going to set up here in a moment?
>
> Kindest regards,
> Tim
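A loop-free way to do what Tim asks: take the cumsum with nans treated as zero, then subtract the running total recorded at the most recent nan. A sketch (the helper name `cumsum_reset_at_nan` is hypothetical, not a numpy function):

```python
import numpy as np

def cumsum_reset_at_nan(a):
    """cumsum that restarts from 0 after every nan."""
    a = np.asarray(a, dtype=float)
    mask = np.isnan(a)
    # running sum with nans treated as 0
    c = np.cumsum(np.where(mask, 0.0, a))
    # index of the most recent nan at or before each position
    last_nan = np.maximum.accumulate(np.where(mask, np.arange(a.size), 0))
    seen_nan = np.maximum.accumulate(mask)
    # subtract the total recorded at that nan to reset the accumulation
    return c - np.where(seen_nan, c[last_nan], 0.0)

x = np.array([1.0, 2.0, np.nan, 3.0, 4.0])
print(cumsum_reset_at_nan(x))   # [1. 3. 0. 3. 7.]
```

Positions holding a nan come out as 0 here; masking them back to nan afterwards is a one-liner if preferred.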
[Numpy-discussion] np 1.7b2 PyArray_BYTES(obj)=ptr fail
Hi,

I don't know if this was raised already, but in np 1.7b2, doing

    PyArray_BYTES(obj) = ptr

fails with this error message:

    file:line_number:offset: error: lvalue required as left operand of assignment

I tried PyArray_DATA(obj) = ptr and this also fails. Do you want to remove this feature now? I would have thought this change would be done at the same time as the one related to the macro NPY_NO_DEPRECATED_API. If I missed the discussion about this, tell me.

thanks

Fred
Re: [Numpy-discussion] np 1.7b2 PyArray_BYTES(obj)=ptr fail
We implement our own subtensor (x[...], where ... can be an index or a slice) C code, due to the way Theano works. I can probably change the logic, but if you plan to revert this interface change, I prefer to wait for this change, as someone else is doing other changes that would conflict. Also, I made a Theano release candidate, and I really would like the final version to work with the next release of NumPy.

thanks.

Fred

On Tue, Oct 2, 2012 at 11:33 AM, Charles R Harris charlesr.har...@gmail.com wrote:

> On Tue, Oct 2, 2012 at 8:34 AM, Frédéric Bastien no...@nouiz.org wrote:
>
> > Hi,
> >
> > I don't know if this was raised already, but in np 1.7b2, doing
> > PyArray_BYTES(obj) = ptr fails with: "error: lvalue required as left
> > operand of assignment". I tried PyArray_DATA(obj) = ptr and this also
> > fails. Do you want to remove this feature now? I would have thought
> > this change would be done at the same time as the one related to the
> > macro NPY_NO_DEPRECATED_API. If I missed the discussion about this,
> > tell me.
>
> f2py wants to do the same thing, i.e., change the data pointer of an
> existing array, which is why NPY_NO_DEPRECATED_API is not defined in
> that module. I had some discussion off list with Pearu about that, but
> it petered out. I think a new function needs to be implemented for this
> sort of thing. What precisely is your application? IIRC, Pearu was using
> it to exchange pointers between two arrays to avoid a copy.
>
> Chuck
Re: [Numpy-discussion] np 1.7b2 PyArray_BYTES(obj)=ptr fail
On Tue, Oct 2, 2012 at 1:18 PM, Charles R Harris charlesr.har...@gmail.com wrote:

> On Tue, Oct 2, 2012 at 9:45 AM, Frédéric Bastien no...@nouiz.org wrote:
>
> > We implement our own subtensor (x[...], where ... can be an index or a
> > slice) C code, due to the way Theano works. I can probably change the
> > logic, but if you plan to revert this interface change, I prefer to
> > wait, as someone else is doing other changes that would conflict.
> > Also, I made a Theano release candidate, and I really would like the
> > final version to work with the next release of NumPy.
>
> Well, you don't *have* to define NPY_NO_DEPRECATED_API. If you don't,
> you can access the array as before using the macros, even for future
> versions of numpy. The only way that could cause a problem is if the
> array structure is rearranged, and I don't think that will happen
> anytime soon. On that account, there has not been any discussion of
> reverting the changes. However, I'd like f2py-generated modules to use
> the new functions at some point, and in order to do that numpy needs to
> supply some extra functionality; I'm just not sure of the best way to do
> it at the moment. If I had a good idea of what you want to do, it would
> help in deciding what numpy should provide.

I do not define NPY_NO_DEPRECATED_API, so I get a g++ warning about that. That is the problem I have.

What I do is close to this: allocate a new ndarray that points to the start of the ndarray I want to view, with the right number of output dimensions; compute the new dimensions/strides and data pointer; then set the data pointer to what I computed; then set the base pointer. I could reverse this and create the ndarray only at the end, but since this change breaks existing code here, it can break more code elsewhere; that is why I wrote to the list.

Doing PyArray_BASE(xview) = ptr works when I don't define NPY_NO_DEPRECATED_API and does not work when I define it. I would have expected the same for PyArray_BYTES/PyArray_DATA. Does this explain clearly the problem I saw?
Fred
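Chuck's suggestion, computing the dimensions, strides and offset first and creating the view in a single step rather than patching the data pointer of an already-allocated array, has a Python-level analogue in `np.lib.stride_tricks.as_strided` (at the C level the counterparts are `PyArray_NewFromDescr`, which accepts the data pointer and strides at creation time, and `PyArray_SetBaseObject`, new in 1.7). A sketch of the "create-then-done" pattern:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(12).reshape(3, 4)

# Build a view of every other column: the shape and strides are computed
# up front and passed at construction, rather than mutating an existing
# array's data pointer afterwards.
sub = as_strided(a, shape=(3, 2),
                 strides=(a.strides[0], 2 * a.strides[1]))

assert sub.base is not None          # a view, not a copy
assert np.array_equal(sub, a[:, ::2])
```

`as_strided` does no bounds checking, so the caller must guarantee the computed shape/strides stay inside the base buffer, which is exactly the invariant the C code has to maintain too.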
Re: [Numpy-discussion] np 1.7b2 PyArray_BYTES(obj)=ptr fail
With numpy 1.6.2, it is working, so this is an interface change. Are you sure you want this? It breaks existing code.

I do not understand what you mean by "slot". I'm not sure a PyArray_SWAP is a good long-term idea; I would not add it if it is only temporary. To set the base pointer, there is the PyArray_SetBaseObject() function, which is new in 1.7. Would a similar function be useful for numpy in the long term? In the case where the ndarray object is implemented differently, I think it won't be useful. We will also need to know how the memory is laid out by numpy for performance-critical code, so we will need an attribute that tells which internal structure is used.

So, do you want to force this interface change in numpy 1.7, so that I need to write the code now, or can I wait and do it when you force the new interface?

Currently the code used for PyArray_BYTES is:

    #define PyArray_BYTES(obj) ((char *)(((PyArrayObject_fields *)(obj))->data))

If I change it to

    #define PyArray_BYTES(obj) (((PyArrayObject_fields *)(obj))->data)

it works! I don't understand why removing the cast makes it work: the data field is already a (char *), so this should not make a difference to my understanding. But I'm missing something here; does someone know?

Fred

On Tue, Oct 2, 2012 at 1:47 PM, Charles R Harris charlesr.har...@gmail.com wrote:

> On Tue, Oct 2, 2012 at 11:43 AM, Charles R Harris
> charlesr.har...@gmail.com wrote:
>
> > On Tue, Oct 2, 2012 at 11:30 AM, Frédéric Bastien no...@nouiz.org wrote:
> >
> > > Doing PyArray_BASE(xview) = ptr works when I don't define
> > > NPY_NO_DEPRECATED_API and does not work when I define it. I would
> > > have expected the same for PyArray_BYTES/PyArray_DATA. Does this
> > > explain clearly the problem I saw?
> >
> > Yes, thanks. I see in ndarraytypes.h
> >
> >     #define PyArray_DATA(obj) ((void *)(((PyArrayObject_fields *)(obj))->data))
> >
> > I wonder if the cast to void* is causing a problem? Could you try
> > char* instead?
>
> Oops, the problem is that you need a pointer to the slot, not the
> pointer in the slot. That is, you need a different macro/function.
Chuck
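Fred's puzzle about why dropping the cast makes the assignment compile comes down to C's lvalue rules: a cast expression is an rvalue even when its operand is an lvalue, so `(char *)(x->data)` cannot appear on the left of `=`, while the bare member access `x->data` can. A minimal illustration (hypothetical struct and macro names, not numpy's real headers):

```c
#include <string.h>

struct arr { char *data; };   /* stand-in for PyArrayObject_fields */

/* With the cast, as in ndarraytypes.h, the macro expands to an rvalue:
 * `DATA_CAST(p) = ptr` fails with "lvalue required as left operand". */
#define DATA_CAST(p)   ((char *)(((struct arr *)(p))->data))

/* Without the cast, the expression is the member itself: an lvalue. */
#define DATA_NOCAST(p) (((struct arr *)(p))->data)

const char *demo(void) {
    static struct arr a;
    static char buf[] = "abc";
    /* DATA_CAST(&a) = buf;   -- would not compile */
    DATA_NOCAST(&a) = buf;    /* compiles: assignment through the lvalue */
    return DATA_CAST(&a);     /* reading through the cast is fine */
}
```

So the 1.6.2 behaviour only worked by accident of the macro lacking the cast; the cast was presumably added so the macro yields a consistently typed value, which as a side effect forbids using it as an assignment target.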
Re: [Numpy-discussion] Memory order of array copies
As always, I think it is better not to change the default behaviour. Many people do not update frequently, and two releases is not enough; this will lead to many hard-to-find bugs. It will also give the impression that numpy's default behaviour can't be relied on and that numpy is not stable.

As a rule of thumb, we need to weigh the benefit of changing a default behaviour against its consequences. In this case, I see only a marginal speed gain (marginal in the sense that it won't matter in the user's overall script, though locally it could be significant) versus silent and hard-to-find bugs. If speed matters in that case, I think it would be much better to write an optimized version that takes the strides and cache-line length into account. Even if we hard-code the cache-line length, this will probably bring most of the local speed-up, without the inconvenience.

If people still want to make this change, I think only a big release like numpy 2.0 would make it acceptable, and then with the warning Gael described. But I would still prefer it not be done; people who care about the speed can write optimized code.

Fred

On Sun, Sep 30, 2012 at 2:22 PM, Gael Varoquaux gael.varoqu...@normalesup.org wrote:

> On Sun, Sep 30, 2012 at 07:17:42PM +0100, Nathaniel Smith wrote:
> > Is there anything better to do than simply revert np.copy() to its
> > traditional behaviour and accept that np.copy(a) and a.copy() will
> > continue to have different semantics indefinitely?
>
> Have np.copy take an 'order=None', which would translate to 'K'. Detect
> 'None' as a sentinel that order has not been specified. If the order is
> not specified, raise a FutureWarning that np.copy will change semantics
> in 2 releases. In two releases, make the change. That's how I would deal
> with it.
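Gael's sentinel-plus-FutureWarning scheme can be sketched as a wrapper (the name `copy_compat` is hypothetical; this is not numpy's actual implementation):

```python
import warnings
import numpy as np

_unspecified = object()   # sentinel: tells "not passed" apart from any value

def copy_compat(a, order=_unspecified):
    """Hypothetical np.copy wrapper following Gael's proposal: keep the
    old default ('K') for now, but warn callers who rely on it."""
    if order is _unspecified:
        warnings.warn(
            "np.copy's default order will change in 2 releases; "
            "pass order= explicitly to keep the current behaviour",
            FutureWarning, stacklevel=2)
        order = 'K'
    return np.array(a, order=order, copy=True)

a = np.arange(6).reshape(2, 3).T   # an F-contiguous array
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    b = copy_compat(a)             # no order given: warns, keeps 'K'
assert any(w.category is FutureWarning for w in caught)
assert b.flags['F_CONTIGUOUS']     # 'K' preserved the memory order
```

Callers who pass `order` explicitly never see the warning, so by the time the default flips, only code that was silently depending on it is affected.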