Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
Stefan van der Walt ste...@sun.ac.za writes:

> On 2014-10-27 15:26:58, D. Michael McFarland dm...@dmmcf.net wrote:
>
>> What I would like to ask about is the situation this illustrates, where both NumPy and SciPy provide similar functionality (sometimes identical, to judge by the documentation). Is there some guidance on which is to be preferred?
>
> I'm not sure if you've received an answer to your question so far. My advice: use the SciPy functions. SciPy is often built on more extensive Fortran libraries not available during NumPy compilation, and I am not aware of any cases where a function in NumPy is faster or more extensive than the equivalent in SciPy.

The whole thread has been interesting reading (now that I've finally come back to it...got busy for a few days), but this is the sort of answer I was hoping for. Thank you.

Best,
Michael

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
Just to throw in my two cents here. I feel that sometimes, features are tried out first elsewhere (possibly in scipy) and then brought down into numpy after sufficient shakedown time. So, in some cases, I wonder if the numpy version is actually more refined than the scipy version? Of course, there is no way to know this from the documentation, which is a problem. Didn't scipy have nanmean() for a while before NumPy added it in version 1.8?

Ben Root
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
On Fri, Oct 31, 2014 at 11:07 AM, Benjamin Root ben.r...@ou.edu wrote:

> Just to throw in my two cents here. I feel that sometimes, features are tried out first elsewhere (possibly in scipy) and then brought down into numpy after sufficient shakedown time. So, in some cases, I wonder if the numpy version is actually more refined than the scipy version? Of course, there is no way to know this from the documentation, which is a problem. Didn't scipy have nanmean() for a while before NumPy added it in version 1.8?

That's true for several functions in scipy.stats, and we have more deprecations in scipy.stats in favor of numpy pending. Part of polynomials is another case, kind of. But I don't remember any other ones in my time.

(There is also a reverse extension for scipy binned_stats based on the np.histogram code.)

Josef
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
On Fri, Oct 31, 2014 at 3:07 PM, Benjamin Root ben.r...@ou.edu wrote:

> Just to throw in my two cents here. I feel that sometimes, features are tried out first elsewhere (possibly in scipy) and then brought down into numpy after sufficient shakedown time. So, in some cases, I wonder if the numpy version is actually more refined than the scipy version? Of course, there is no way to know this from the documentation, which is a problem. Didn't scipy have nanmean() for a while before NumPy added it in version 1.8?

Not that often, and these usually get actively deprecated eventually. Most duplications are of the form Stefan discusses.

--
Robert Kern
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
On Tue, Oct 28, 2014 at 5:24 AM, Sturla Molden sturla.mol...@gmail.com wrote:

> Matthew Brett matthew.br...@gmail.com wrote:
>
>> Is this an option for us? Aren't we a little behind the performance curve on FFT after we lost FFTW?
>
> It does not run on Windows because it uses POSIX to allocate executable memory for tasklets, as I understand it.
>
> By the way, why did we lose FFTW, apart from GPL? One thing to mention here is that MKL supports the FFTW APIs. If we can use MKL for linalg and numpy.dot I don't see why we cannot use it for FFT.

The problem is APIs: MKL, Accelerate, etc. all use a standard API (BLAS/LAPACK), but for FFT, you need to reimplement pretty much the whole thing. Unsurprisingly, this meant the code was not well maintained. Wrapping non-standard, non-BSD libraries makes much more sense in separate libraries in general.

David

> On Mac there is also vDSP in the Accelerate framework, which has an insanely fast FFT (also claimed to be faster than FFTW). Since it is a system library there should be no license problems.
>
> There are clearly options if someone wants to work on it and maintain it.
>
> Sturla
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
Hi Michael

On 2014-10-27 15:26:58, D. Michael McFarland dm...@dmmcf.net wrote:

> What I would like to ask about is the situation this illustrates, where both NumPy and SciPy provide similar functionality (sometimes identical, to judge by the documentation). Is there some guidance on which is to be preferred? I could argue that using only NumPy when possible avoids unnecessary dependence on SciPy in some code, or that using SciPy consistently makes for a single interface and so is less error prone. Is there a rule of thumb for cases where SciPy names shadow NumPy names?

I'm not sure if you've received an answer to your question so far. My advice: use the SciPy functions. SciPy is often built on more extensive Fortran libraries not available during NumPy compilation, and I am not aware of any cases where a function in NumPy is faster or more extensive than the equivalent in SciPy.

If you want code that falls back gracefully when SciPy is not available, you may use the ``numpy.dual`` library.

Regards
Stéfan
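The graceful-fallback idea in the last paragraph can also be written out by hand with an ordinary try/except import. What follows is only a minimal sketch, not the ``numpy.dual`` mechanism itself; the name `BACKEND` and the small test matrix are made up for illustration, and it assumes the only routine needed is `inv`, which both `numpy.linalg` and `scipy.linalg` provide.

```python
import numpy as np

try:
    # Prefer SciPy's routines when SciPy is installed.
    from scipy import linalg
    BACKEND = "scipy"   # illustrative name, not part of either library
except ImportError:
    # Fall back to the smaller set of routines bundled with NumPy.
    from numpy import linalg
    BACKEND = "numpy"

a = np.array([[4.0, 1.0], [2.0, 3.0]])
a_inv = linalg.inv(a)   # `inv` exists in both numpy.linalg and scipy.linalg
residual = np.abs(a @ a_inv - np.eye(2)).max()
print(BACKEND, residual)
```

This only works for functions whose names and call signatures match in both packages; anything SciPy-specific (extra keyword arguments, generalized problems) still needs its own guard.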
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
I would add one element to the discussion: for some (odd) reasons, SciPy is lacking the functions `rfftn` and `irfftn`, functions using half the memory space compared to their non-real equivalents `fftn` and `ifftn`. However, I haven't (yet) seriously tested `scipy.fftpack.fftn` vs. `np.fft.rfftn` to check if there is a serious performance gain (besides memory usage).

Cheers,
Pierre
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
Pierre Barbier de Reuille pie...@barbierdereuille.net wrote:

> I would add one element to the discussion: for some (odd) reasons, SciPy is lacking the functions `rfftn` and `irfftn`, functions using half the memory space compared to their non-real equivalents `fftn` and `ifftn`.

In both NumPy and SciPy the N-dimensional FFTs are implemented in Python. It is just a Python loop over all the axes, calling fft or rfft on each axis.

> However, I haven't (yet) seriously tested `scipy.fftpack.fftn` vs. `np.fft.rfftn` to check if there is a serious performance gain (besides memory usage).

Real-value FFT is implemented with complex-value FFT. You save half the memory, but not quite half the computation. Apart from that, the FFT in SciPy is written in Fortran and the FFT in NumPy is written in C, but they are algorithmically similar. I don't see any good reason why the Fortran code in SciPy should be faster than the C code in NumPy. It used to be the case that Fortran was faster than C, everything else being equal, but with modern C compilers and CPUs with deep pipelines and branch prediction this is rarely the case. So I would expect the NumPy rfftn to be slightly faster than SciPy fftn, but keep in mind that both have a huge Python overhead.

Sturla
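To make the "half the memory" point concrete, here is a small sketch comparing `np.fft.fftn` and `np.fft.rfftn` on the same real input. The array size and the random seed are arbitrary choices for illustration; no timing claims are made.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 64, 64))   # real-valued input

full = np.fft.fftn(x)    # complex output, same shape as the input
half = np.fft.rfftn(x)   # last axis shortened to n // 2 + 1

print(full.shape, full.nbytes)
print(half.shape, half.nbytes)

# The real transform still round-trips back to the input.
x_back = np.fft.irfftn(half, s=x.shape)
```

The saving comes from the Hermitian symmetry of the spectrum of a real signal: `rfftn` stores only the non-redundant half along the last axis, so the output is slightly more than half the size of the `fftn` output.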
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
The same occurred to me when reading that question. My personal opinion is that such functionality should be deprecated from numpy. I don't know who said this, but it really stuck with me: the power of numpy is first and foremost in it being a fantastic interface, not in being a library.

There is nothing more annoying than every project having its own array type. The fact that the whole scientific python stack can so seamlessly communicate is where all good things begin.

In my opinion, that is what numpy should focus on: basic data structures, and tools for manipulating them. Linear algebra is way too high level for numpy imo, and used by only a small subset of its 'matlab-like' users.

When I get serious about linear algebra or ffts or what have you, I'd rather import an extra module that wraps a specific library.

On Mon, Oct 27, 2014 at 2:26 PM, D. Michael McFarland dm...@dmmcf.net wrote:

> A recent post raised a question about differences in results obtained with numpy.linalg.eigh() and scipy.linalg.eigh(), documented at http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.eigh.html#numpy.linalg.eigh and http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.eigh.html#scipy.linalg.eigh , respectively. It is clear that these functions address different mathematical problems (among other things, the SciPy routine can solve the generalized as well as standard eigenproblems); I am not concerned here with numerical differences in the results for problems both should be able to solve (the author of the original post received useful replies in that thread).
>
> What I would like to ask about is the situation this illustrates, where both NumPy and SciPy provide similar functionality (sometimes identical, to judge by the documentation). Is there some guidance on which is to be preferred? I could argue that using only NumPy when possible avoids unnecessary dependence on SciPy in some code, or that using SciPy consistently makes for a single interface and so is less error prone. Is there a rule of thumb for cases where SciPy names shadow NumPy names?
>
> I've used Python for a long time, but have only recently returned to doing serious numerical work with it. The tools are very much improved, but sometimes, like now, I feel I'm missing the obvious. I would appreciate pointers to any relevant documentation, or just a summary of conventional wisdom on the topic.
>
> Regards,
> Michael
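As a concrete illustration of the "different mathematical problems" remark in the quoted question, the sketch below solves a standard symmetric eigenproblem with `numpy.linalg.eigh` and then a generalized one, A x = w B x, by reducing it to standard form with a Cholesky factor of B. The matrices are made up for the example, and the Cholesky reduction is a textbook approach chosen here for illustration; it is not claimed to be what `scipy.linalg.eigh` does internally. If SciPy is installed, `scipy.linalg.eigh(a, b)` is called as a cross-check.

```python
import numpy as np

a = np.array([[2.0, 1.0], [1.0, 3.0]])   # symmetric
b = np.array([[4.0, 1.0], [1.0, 2.0]])   # symmetric positive definite

# Standard problem  A x = w x  (what numpy.linalg.eigh solves).
w_std, _ = np.linalg.eigh(a)

# Generalized problem  A x = w B x, reduced to standard form with B = L L^T:
#   (L^-1 A L^-T) y = w y,   x = L^-T y
l = np.linalg.cholesky(b)
l_inv = np.linalg.inv(l)
w_gen, y = np.linalg.eigh(l_inv @ a @ l_inv.T)
x = l_inv.T @ y

try:
    # Optional cross-check: SciPy accepts the second matrix directly.
    from scipy.linalg import eigh as scipy_eigh
    w_scipy = scipy_eigh(a, b, eigvals_only=True)
except ImportError:
    w_scipy = None

print(w_std, w_gen, w_scipy)
```

The two sets of eigenvalues differ because they answer different questions, which is the situation the original post ran into.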
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
On Mon, Oct 27, 2014 at 2:24 PM, Eelco Hoogendoorn hoogendoorn.ee...@gmail.com wrote:

> The same occurred to me when reading that question. My personal opinion is that such functionality should be deprecated from numpy. I don't know who said this, but it really stuck with me: the power of numpy is first and foremost in it being a fantastic interface, not in being a library.
>
> There is nothing more annoying than every project having its own array type. The fact that the whole scientific python stack can so seamlessly communicate is where all good things begin.
>
> In my opinion, that is what numpy should focus on: basic data structures, and tools for manipulating them. Linear algebra is way too high level for numpy imo, and used by only a small subset of its 'matlab-like' users.
>
> When I get serious about linear algebra or ffts or what have you, I'd rather import an extra module that wraps a specific library.

We are not always getting serious about linalg; just a quick call to pinv or qr or matrix_rank or similar doesn't necessarily mean we need a linalg library with all advanced options.

@ matrix operations and linear algebra are basic stuff.

> On Mon, Oct 27, 2014 at 2:26 PM, D. Michael McFarland dm...@dmmcf.net wrote:
>
>> A recent post raised a question about differences in results obtained with numpy.linalg.eigh() and scipy.linalg.eigh(), documented at http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.eigh.html#numpy.linalg.eigh and http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.eigh.html#scipy.linalg.eigh , respectively. It is clear that these functions address different mathematical problems (among other things, the SciPy routine can solve the generalized as well as standard eigenproblems); I am not concerned here with numerical differences in the results for problems both should be able to solve (the author of the original post received useful replies in that thread).
>>
>> What I would like to ask about is the situation this illustrates, where both NumPy and SciPy provide similar functionality (sometimes identical, to judge by the documentation). Is there some guidance on which is to be preferred? I could argue that using only NumPy when possible avoids unnecessary dependence on SciPy in some code, or that using SciPy consistently makes for a single interface and so is less error prone. Is there a rule of thumb for cases where SciPy names shadow NumPy names?
>>
>> I've used Python for a long time, but have only recently returned to doing serious numerical work with it. The tools are very much improved, but sometimes, like now, I feel I'm missing the obvious. I would appreciate pointers to any relevant documentation, or just a summary of conventional wisdom on the topic.

Just an opinion as a user:

Most of the time I don't care and treat this just as different versions. For example, in the linalg case, I use numpy.linalg by default and switch to scipy if I need the extras. pinv is the only one that I ever seriously compared.

Some details are nicer: np.linalg.qr(x, mode='r') returns the reduced matrix instead of the full matrix as does scipy.linalg. np.linalg.pinv is faster but maybe slightly less accurate (or has defaults that make it less accurate in corner cases). scipy often has more overhead (and an isfinite check by default).

I just checked; I didn't even know scipy.linalg also has an `inv`. One of my arguments for np.linalg would have been that it's easy to switch between inv and pinv.

For fft I use mostly scipy, IIRC. (scipy's fft imports numpy's fft, partially?)

Essentially, I don't care most of the time that there are different ways of doing essentially the same thing, but some better information about the differences would be useful.

Josef

>> Regards,
>> Michael
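The "quick call to pinv or qr or matrix_rank" use case mentioned above looks roughly like the following with `numpy.linalg` alone. This is a minimal sketch on a made-up random matrix; it only illustrates the shapes returned by the NumPy functions and makes no claim about speed or accuracy relative to `scipy.linalg`.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal((5, 3))       # tall matrix with full column rank

q, r = np.linalg.qr(x)                # default mode='reduced': q is 5x3, r is 3x3
r_only = np.linalg.qr(x, mode='r')    # only the 3x3 triangular factor

x_pinv = np.linalg.pinv(x)            # 3x5 Moore-Penrose pseudo-inverse
rank = np.linalg.matrix_rank(x)

print(r_only.shape, x_pinv.shape, rank)
```

For a matrix with full column rank, `x_pinv @ x` is the identity up to rounding, which gives a quick sanity check when switching between `inv` and `pinv`.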
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
josef.p...@gmail.com wrote:

> For fft I use mostly scipy, IIRC. (scipy's fft imports numpy's fft, partially?)

No. SciPy uses the Fortran library FFTPACK (wrapped with f2py) and NumPy uses a smaller C library called fftpack_lite. Algorithmically they are similar, but fftpack_lite has fewer features (e.g. no DCT). scipy.fftpack does not import numpy.fft.

Neither of these libraries is very fast, but usually they are fast enough for practical purposes. If we really need a kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or Apple's Accelerate Framework, or even use tools like CUDA or OpenCL to run the FFT on the GPU. But using such tools takes more coding (and reading API specifications) than the convenience of just using the FFTs already in NumPy or SciPy. So if you count your own time as well, it might not be that FFTW or MKL are the faster FFTs.

Sturla
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
Sturla Molden sturla.mol...@gmail.com wrote:

> If we really need a kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or Apple's Accelerate Framework,

I should perhaps also mention FFTS here, which claims to be faster than FFTW and has a BSD licence:

http://anthonix.com/ffts/index.html
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
On Mon, Oct 27, 2014 at 10:50 PM, Sturla Molden sturla.mol...@gmail.com wrote:

> josef.p...@gmail.com wrote:
>
>> For fft I use mostly scipy, IIRC. (scipy's fft imports numpy's fft, partially?)
>
> No. SciPy uses the Fortran library FFTPACK (wrapped with f2py) and NumPy uses a smaller C library called fftpack_lite. Algorithmically they are similar, but fftpack_lite has fewer features (e.g. no DCT). scipy.fftpack does not import numpy.fft.
>
> Neither of these libraries is very fast, but usually they are fast enough for practical purposes. If we really need a kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or Apple's Accelerate Framework, or even use tools like CUDA or OpenCL to run the FFT on the GPU. But using such tools takes more coding (and reading API specifications) than the convenience of just using the FFTs already in NumPy or SciPy. So if you count your own time as well, it might not be that FFTW or MKL are the faster FFTs.

Ok, I didn't remember correctly.

I haven't used fft much recently, and I have never used DCT. My favorite fft function is fftconvolve.

https://github.com/scipy/scipy/blob/e758c482efb8829685dcf494bdf71eeca3dd77f0/scipy/signal/signaltools.py#L13 doesn't seem to mind mixing numpy and scipy (quick github search)

It's sometimes useful to have simplified functions that are good enough where we don't have to figure out all the extras that the docstring of the fancy version is mentioning.

Josef
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
On Mon, Oct 27, 2014 at 11:31 PM, josef.p...@gmail.com wrote:

> Ok, I didn't remember correctly.
>
> I didn't use much fft recently, I never used DCT. My favorite fft function is fftconvolve.
>
> https://github.com/scipy/scipy/blob/e758c482efb8829685dcf494bdf71eeca3dd77f0/scipy/signal/signaltools.py#L13 doesn't seem to mind mixing numpy and scipy (quick github search)
>
> It's sometimes useful to have simplified functions that are good enough where we don't have to figure out all the extras that the docstring of the fancy version is mentioning.

I take this back (even if it's true), because IMO the defaults should work, and I have a tendency to pile on options in my code that are intended for experts.

Josef
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
Hi,

On Mon, Oct 27, 2014 at 8:07 PM, Sturla Molden sturla.mol...@gmail.com wrote:

> Sturla Molden sturla.mol...@gmail.com wrote:
>
>> If we really need a kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or Apple's Accelerate Framework,
>
> I should perhaps also mention FFTS here, which claims to be faster than FFTW and has a BSD licence:
>
> http://anthonix.com/ffts/index.html

Nice. And a funny New Zealand name too.

Is this an option for us? Aren't we a little behind the performance curve on FFT after we lost FFTW?

Matthew
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
josef.p...@gmail.com wrote:

> https://github.com/scipy/scipy/blob/e758c482efb8829685dcf494bdf71eeca3dd77f0/scipy/signal/signaltools.py#L13 doesn't seem to mind mixing numpy and scipy (quick github search)

I believe it is because NumPy's FFTs (beginning with 1.9.0) are thread-safe. But FFTs from numpy.fft and scipy.fftpack should be rather similar in performance. (Except if you use Enthought, in which case the former is much faster.)

It seems from the code that fftconvolve does not use overlap-add or overlap-save. I seem to remember that it did before, but I might be wrong. Personally I prefer to use overlap-add instead of a very long FFT.

There is also a scipy.fftpack.convolve module. I have not used it though.

Sturla
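For readers unfamiliar with overlap-add, the idea is to split a long signal into blocks, convolve each block with the kernel using a short FFT, and sum the overlapping tails. The following is a minimal NumPy-only sketch under stated assumptions: the function name `overlap_add` and the `block` parameter are hypothetical, the input is one-dimensional and real, and this is not a description of how `scipy.signal.fftconvolve` is implemented.

```python
import numpy as np

def overlap_add(signal, kernel, block=256):
    """Convolve 1-D real `signal` with `kernel` using overlap-add."""
    m = len(kernel)
    nfft = block + m - 1                   # long enough to avoid circular wrap-around
    kernel_f = np.fft.rfft(kernel, nfft)   # kernel spectrum, computed once
    out = np.zeros(len(signal) + m - 1)
    for start in range(0, len(signal), block):
        chunk = signal[start:start + block]
        # Linear convolution of this chunk with the kernel via a short FFT.
        piece = np.fft.irfft(np.fft.rfft(chunk, nfft) * kernel_f, nfft)
        stop = min(start + nfft, len(out))
        out[start:stop] += piece[:stop - start]   # overlapping tails add up
    return out

rng = np.random.default_rng(2)
sig = rng.standard_normal(1000)
ker = rng.standard_normal(31)
result = overlap_add(sig, ker)
reference = np.convolve(sig, ker)   # direct convolution for comparison
print(np.abs(result - reference).max())
```

The block length trades FFT size against the number of blocks; the value 256 here is an arbitrary choice for the example, not a tuned recommendation.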
Re: [Numpy-discussion] Choosing between NumPy and SciPy functions
Matthew Brett matthew.br...@gmail.com wrote:

> Is this an option for us? Aren't we a little behind the performance curve on FFT after we lost FFTW?

It does not run on Windows because it uses POSIX to allocate executable memory for tasklets, as I understand it.

By the way, why did we lose FFTW, apart from GPL? One thing to mention here is that MKL supports the FFTW APIs. If we can use MKL for linalg and numpy.dot I don't see why we cannot use it for FFT.

On Mac there is also vDSP in the Accelerate framework, which has an insanely fast FFT (also claimed to be faster than FFTW). Since it is a system library there should be no license problems.

There are clearly options if someone wants to work on it and maintain it.

Sturla