Re: [Numpy-discussion] Future of numpy (was: DARPA funding for Blaze and passing the NumPy torch)
On Dec 20, 2012, at 7:39 PM, Nathaniel Smith wrote:

On Thu, Dec 20, 2012 at 11:46 PM, Matthew Brett matthew.br...@gmail.com wrote: Travis - I think you are suggesting that there should be no one person in charge of numpy, and I think this is very unlikely to work well. Perhaps there are good examples of well-led projects where there is not a clear leader, but I can't think of any myself at the moment. My worry would be that, without a clear leader, it will be unclear how decisions are made, and that will make it very hard to take strategic decisions.

Curious; my feeling is the opposite: that among mature and successful FOSS projects, having a clear leader is the uncommon case. GCC doesn't, Glibc not only has no leader but recently decided to get rid of its formal steering committee, I'm pretty sure git doesn't, Apache certainly doesn't, Samba doesn't really, etc. As usual, Karl Fogel has sensible comments on this: http://producingoss.com/en/consensus-democracy.html In practice the main job of a successful FOSS leader is to refuse to make decisions, nudge people to work things out, and then, if they refuse to work things out, tell them to go away until they do: https://lwn.net/Articles/105375/ What actually gives people influence in a project is the respect of the other members. The former is something anyone can do, and the latter isn't something you can confer or take away with a vote.

I will strongly voice my opinion that NumPy does not need an official single leader. What it needs are committed, experienced, service-oriented developers and users who are willing to express their concerns and requests because they are used to being treated well. It also needs new developers who are willing to dive into code, contribute to discussions, tackle issues, make pull requests, and review pull requests. As people do this regularly, the leaders of the project will emerge as they have done in the past. Even though I called out three people explicitly --- there are many more contributors to NumPy whose voices deserve attention. But you don't need me to point out the obvious in what the Github record shows about who is shepherding NumPy these days. The Github record is not the only one that matters, though. I would love to see NumPy developers continue to pay attention to and deeply respect the users (especially of downstream projects that depend on NumPy). I plan to continue using NumPy myself and to continue encouraging others around me to contribute patches, fixes and features.

Obviously, there are people who have rights to merge pull requests to the repository. But this group seems always open to new, willing help. As a practical matter, this group is the head development group of the official NumPy fork. I believe this group will continue to be open enough to new, motivated contributors, which will allow it to grow to the degree that such developers are available.

Nor do we necessarily have a great track record for executive decisions actually working things out. -n
Re: [Numpy-discussion] Byte aligned arrays
On 12/20/12 7:35 PM, Henry Gomersall wrote: On Thu, 2012-12-20 at 15:23 +0100, Francesc Alted wrote: On 12/20/12 9:53 AM, Henry Gomersall wrote: On Wed, 2012-12-19 at 19:03 +0100, Francesc Alted wrote: The only scenario that I see that would create unaligned arrays is for machines having AVX. But provided that the Intel architecture is making great strides in fetching unaligned data, I'd be surprised if the difference in performance were even noticeable. Can you tell us what difference in performance you are seeing between an AVX-aligned array and one that is not AVX-aligned? Just curious.

Further to this point, from an Intel article... http://software.intel.com/en-us/articles/practical-intel-avx-optimization-on-2nd-generation-intel-core-processors

"Aligning data to vector length is always recommended. When using Intel SSE and Intel SSE2 instructions, loaded data should be aligned to 16 bytes. Similarly, to achieve best results use Intel AVX instructions on 32-byte vectors that are 32-byte aligned. The use of Intel AVX instructions on unaligned 32-byte vectors means that every second load will be across a cache-line split, since the cache line is 64 bytes. This doubles the cache-line split rate compared to Intel SSE code that uses 16-byte vectors. A high cache-line split rate in memory-intensive code is extremely likely to cause performance degradation. For that reason, it is highly recommended to align the data to 32 bytes for use with Intel AVX."

Though it would be nice to put together a little example of this!

Indeed, an example is what I was looking for. So, provided that I have access to an AVX-capable machine (having 6 physical cores), and that MKL 10.3 has support for AVX, I have made some comparisons using the Anaconda Python distribution (it ships with most packages linked against MKL 10.3). [snip] All in all, it is not clear that AVX alignment would have an advantage, even for memory-bounded problems. But of course, if Intel people are saying that AVX alignment is important, it is because they have use cases for asserting this. It is just that I'm having a difficult time finding these cases.

Thanks for those examples, they were very interesting. I managed to temporarily get my hands on a machine with AVX and I have shown some speed-up with aligned arrays. FFT (using my wrappers) gives about a 15% speedup. Also this convolution code: https://github.com/hgomersall/SSE-convolution/blob/master/convolve.c shows a small but repeatable speed-up (a few %) when using some aligned loads (as many as I can work out to use!).

Okay, so a 15% speedup is significant, yes. I'm still wondering why I did not get any speedup at all using MKL; probably the reason is that it handles the unaligned edges of the datasets first, and then uses aligned access for the rest of the data (but I'm just guessing here).

-- Francesc Alted
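For anyone wanting to experiment with this, here is a minimal sketch of one way to obtain a 32-byte-aligned array in pure NumPy, by over-allocating a byte buffer and slicing at an aligned offset. The helper name aligned_empty is made up for illustration; it is not a NumPy function.

import numpy as np

def aligned_empty(n, dtype=np.float64, align=32):
    # Over-allocate by `align` bytes, then start at the first aligned offset.
    itemsize = np.dtype(dtype).itemsize
    buf = np.empty(n * itemsize + align, dtype=np.uint8)
    offset = (-buf.ctypes.data) % align
    return buf[offset:offset + n * itemsize].view(dtype)

a = aligned_empty(10**7)
assert a.ctypes.data % 32 == 0   # the data pointer is 32-byte aligned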
Re: [Numpy-discussion] Byte aligned arrays
On Fri, 2012-12-21 at 11:34 +0100, Francesc Alted wrote: Also this convolution code: https://github.com/hgomersall/SSE-convolution/blob/master/convolve.c shows a small but repeatable speed-up (a few %) when using some aligned loads (as many as I can work out to use!). Okay, so a 15% speedup is significant, yes. I'm still wondering why I did not get any speedup at all using MKL; probably the reason is that it handles the unaligned edges of the datasets first, and then uses aligned access for the rest of the data (but I'm just guessing here).

With SSE in that convolution code example above (in which all alignments need to be considered for each output element), I noted a significant speedup from creating 4 copies of the float input array using memcpy, each shifted by 1 float (so the 5th element is aligned again). Despite all the extra copies, it's still quicker than using an unaligned load. However, when one tries the same trick with 8 copies for AVX, it's actually slower than the SSE case. The fastest AVX implementation (indeed, the fastest overall) I have so far uses 16-byte-aligned arrays (made with 4 copies, as with SSE), with alternating aligned and unaligned loads (which is always at worst 16-byte aligned). Fascinating stuff!

hen
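The arithmetic behind the four-copies trick can be seen directly from NumPy, in a small sketch of my own (assuming the base array happens to be 16-byte aligned, which most allocators provide): shifting a 4-byte float view by one element cycles the data pointer through all four 16-byte residues, so four shifted copies guarantee an aligned starting point for every output offset.

import numpy as np

a = np.zeros(1024, dtype=np.float32)
for shift in range(4):
    addr = a[shift:].ctypes.data
    print(shift, addr % 16)   # residues cycle 0, 4, 8, 12 when `a` is 16-byte aligned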
[Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions
Dear Python users, do mathematical functions have limitations in Python in comparison with other programming languages? I have two mathematical functions:

from scipy.special import sph_jn, sph_jnyn

1) sph_jn(n, z) --- n is a float, z is a complex number. For example, a, b = sph_jn(2.0, 5+0.4j) gives the following result:

a = array([ -0.20416243 + 0.03963597j, -0.10714653 - 0.06227716j, 0.13731305 - 0.07165432j ])
b = array([ 0.10714653 + 0.06227716j, -0.15959617 + 0.06098154j, -0.18559289 - 0.01300886j ])

2) sph_jnyn(n, x) --- n is a float, x is a float. For example, c, d, e, f = sph_jnyn(2.0, 3.0) gives:

c = array([ 0.04704, 0.3456775, 0.2986375 ])
d = array([ -0.3456775, -0.18341166, 0.04704 ])
e = array([ 0.3299975, 0.06295916, -0.26703834 ])
f = array([ -0.06295916, 0.28802472, 0.3299975 ])

THE PROBLEM IS HERE! But if I give (it is a necessary value for my program):

a, b = sph_jn(536, 2513.2741228718346 + 201.0619298974676j)

I would like to see comments on this, as deep and as specific as possible!
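For anyone wanting to reproduce the symptom, a short sketch (using scipy.special.sph_jn as it existed at the time; newer SciPy replaces it with spherical_jn) that checks the outputs for non-finite entries:

import numpy as np
from scipy.special import sph_jn   # removed in modern SciPy in favor of spherical_jn

a, b = sph_jn(536, 2513.2741228718346 + 201.0619298974676j)
print(np.isfinite(a).all())   # False: NaN/inf entries appear for this argument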
Re: [Numpy-discussion] Byte aligned arrays
On 12/21/12 11:58 AM, Henry Gomersall wrote: On Fri, 2012-12-21 at 11:34 +0100, Francesc Alted wrote: Also this convolution code: https://github.com/hgomersall/SSE-convolution/blob/master/convolve.c shows a small but repeatable speed-up (a few %) when using some aligned loads (as many as I can work out to use!). Okay, so a 15% speedup is significant, yes. I'm still wondering why I did not get any speedup at all using MKL; probably the reason is that it handles the unaligned edges of the datasets first, and then uses aligned access for the rest of the data (but I'm just guessing here). With SSE in that convolution code example above (in which all alignments need to be considered for each output element), I noted a significant speedup from creating 4 copies of the float input array using memcpy, each shifted by 1 float (so the 5th element is aligned again). Despite all the extra copies, it's still quicker than using an unaligned load. However, when one tries the same trick with 8 copies for AVX, it's actually slower than the SSE case. The fastest AVX implementation (indeed, the fastest overall) I have so far uses 16-byte-aligned arrays (made with 4 copies, as with SSE), with alternating aligned and unaligned loads (which is always at worst 16-byte aligned). Fascinating stuff!

Yes, to say the least. And it supports the fact that, when fine-tuning memory access performance, there is no replacement for experimentation (often in quite weird ways :)

-- Francesc Alted
Re: [Numpy-discussion] Byte aligned arrays
On 12/20/2012 03:23 PM, Francesc Alted wrote: On 12/20/12 9:53 AM, Henry Gomersall wrote: On Wed, 2012-12-19 at 19:03 +0100, Francesc Alted wrote: The only scenario that I see that would create unaligned arrays is for machines having AVX. But provided that the Intel architecture is making great strides in fetching unaligned data, I'd be surprised if the difference in performance were even noticeable. Can you tell us what difference in performance you are seeing between an AVX-aligned array and one that is not AVX-aligned? Just curious.

Further to this point, from an Intel article... http://software.intel.com/en-us/articles/practical-intel-avx-optimization-on-2nd-generation-intel-core-processors

"Aligning data to vector length is always recommended. When using Intel SSE and Intel SSE2 instructions, loaded data should be aligned to 16 bytes. Similarly, to achieve best results use Intel AVX instructions on 32-byte vectors that are 32-byte aligned. The use of Intel AVX instructions on unaligned 32-byte vectors means that every second load will be across a cache-line split, since the cache line is 64 bytes. This doubles the cache-line split rate compared to Intel SSE code that uses 16-byte vectors. A high cache-line split rate in memory-intensive code is extremely likely to cause performance degradation. For that reason, it is highly recommended to align the data to 32 bytes for use with Intel AVX."

Though it would be nice to put together a little example of this!

Indeed, an example is what I was looking for. So, provided that I have access to an AVX-capable machine (having 6 physical cores), and that MKL 10.3 has support for AVX, I have made some comparisons using the Anaconda Python distribution (it ships with most packages linked against MKL 10.3).

Here is a first example using a DGEMM operation. First, using a NumPy that is not turbo-loaded with MKL:

In [34]: a = np.linspace(0, 1, 1e7)
In [35]: b = a.reshape(1000, 10000)
In [36]: c = a.reshape(10000, 1000)
In [37]: %time d = np.dot(b,c)
CPU times: user 7.56 s, sys: 0.03 s, total: 7.59 s
Wall time: 7.63 s
In [38]: %time d = np.dot(c,b)
CPU times: user 78.52 s, sys: 0.18 s, total: 78.70 s
Wall time: 78.89 s

This is getting around 2.6 GFlop/s. Now, with an MKL 10.3 NumPy and AVX-unaligned data:

In [7]: p = ctypes.create_string_buffer(int(8e7)); hex(ctypes.addressof(p))
Out[7]: '0x7fcdef3b4010'   # 16-byte alignment
In [8]: a = np.ndarray(1e7, "f8", p)
In [9]: a[:] = np.linspace(0, 1, 1e7)
In [10]: b = a.reshape(1000, 10000)
In [11]: c = a.reshape(10000, 1000)
In [37]: %timeit d = np.dot(b,c)
10 loops, best of 3: 164 ms per loop
In [38]: %timeit d = np.dot(c,b)
1 loops, best of 3: 1.65 s per loop

That is around 120 GFlop/s (i.e. almost 50x faster than without MKL/AVX). Now, using MKL 10.3 and AVX-aligned data:

In [21]: p2 = ctypes.create_string_buffer(int(8e7+16)); hex(ctypes.addressof(p))
Out[21]: '0x7f8cb9598010'
In [22]: a2 = np.ndarray(1e7+2, "f8", p2)[2:]   # skip the first 16 bytes (now 32-byte aligned)
In [23]: a2[:] = np.linspace(0, 1, 1e7)
In [24]: b2 = a2.reshape(1000, 10000)
In [25]: c2 = a2.reshape(10000, 1000)
In [35]: %timeit d2 = np.dot(b2,c2)
10 loops, best of 3: 163 ms per loop
In [36]: %timeit d2 = np.dot(c2,b2)
1 loops, best of 3: 1.67 s per loop

So, again, around 120 GFlop/s, and the difference wrt unaligned AVX data is negligible. One may argue that DGEMM is CPU-bounded and that memory access plays little role here, and that is certainly true. So, let's go with a more memory-bounded problem, like computing a transcendental function with numexpr.
First, with NumPy and numexpr without MKL support:

In [8]: a = np.linspace(0, 1, 1e8)
In [9]: %time b = np.sin(a)
CPU times: user 1.20 s, sys: 0.22 s, total: 1.42 s
Wall time: 1.42 s
In [10]: import numexpr as ne
In [12]: %time b = ne.evaluate("sin(a)")
CPU times: user 1.42 s, sys: 0.27 s, total: 1.69 s
Wall time: 0.37 s

This is around 4x faster than the regular 'sin' in libc, and about the same speed as a memcpy():

In [13]: %time c = a.copy()
CPU times: user 0.19 s, sys: 0.20 s, total: 0.39 s
Wall time: 0.39 s

Now, with an MKL-aware numexpr and non-AVX alignment:

In [8]: p = ctypes.create_string_buffer(int(8e8)); hex(ctypes.addressof(p))
Out[8]: '0x7fce435da010'   # 16-byte alignment
In [9]: a = np.ndarray(1e8, "f8", p)
In [10]: a[:] = np.linspace(0, 1, 1e8)
In [11]: %time b = ne.evaluate("sin(a)")
CPU times: user 0.44 s, sys: 0.27 s, total: 0.71 s
Wall time: 0.15 s

That is, more than 2x faster than a memcpy() in this system, meaning that the problem is truly memory-bounded. So now, with an AVX-aligned buffer:

In [14]: a2 = a[2:]   # skip the first 16 bytes
In [15]: %time b = ne.evaluate("sin(a2)")
CPU times: user 0.40 s, sys: 0.28 s, total: 0.69 s
Wall time: 0.16 s

Again, the times are very close. Just to make sure, let's use the timeit magic:
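As a quick cross-check of the DGEMM figures quoted above (assuming the matrices were shaped (1000, 10000) and (10000, 1000), so the smaller product costs about 2*m*k*n floating-point operations):

# flop count for a (1000, 10000) x (10000, 1000) matrix product
flops = 2 * 1000 * 10000 * 1000   # ~2e10 flops
print(flops / 7.63 / 1e9)     # ~2.6 GFlop/s, matching the no-MKL timing
print(flops / 0.164 / 1e9)    # ~122 GFlop/s, matching the MKL/AVX timing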
Re: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions
Happyman bahtiyor_zohidov at mail.ru writes: [clip] IF I GIVE ( it is necessary value for my program ): a , b = sph_jn ( 536 , 2513.2741228718346 + 201.0619298974676j ) The implementation of the spherical Bessel functions is through this Fortran code: https://github.com/scipy/scipy/blob/master/scipy/special/specfun/specfun.f#L1091 It does not have asymptotic expansions for dealing with parts of the complex plane where the computation via the recurrence does not work. -- Pauli Virtanen
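To illustrate what "computation via the recurrence" means here, a minimal sketch of my own (not the specfun algorithm, which uses a backward variant) of the standard three-term recurrence j_{n+1}(z) = ((2n+1)/z) j_n(z) - j_{n-1}(z). Already the seed j_0(z) = sin(z)/z overflows double precision once |Im z| exceeds roughly 700, which is one way such implementations degenerate to inf/NaN:

import numpy as np

def sph_jn_upward(nmax, z):
    # Upward three-term recurrence for spherical Bessel j_n (illustration only).
    j = np.empty(nmax + 1, dtype=complex)
    j[0] = np.sin(z) / z                        # |sin(z)| ~ exp(|Im z|)/2
    j[1] = np.sin(z) / z**2 - np.cos(z) / z
    for n in range(1, nmax):
        j[n + 1] = (2 * n + 1) / z * j[n] - j[n - 1]
    return j

print(sph_jn_upward(5, 1263309.36 + 101064.75j)[-1])   # non-finite: exp(101064) overflows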
Re: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions
Thanks Pauli. But I have already built a short version for the Bessel functions, and the code you pointed me to is in Fortran. I also used f2py, but I could not manage to read the Fortran code. That is why I asked here what is wrong on the Python side.

Friday, 21 December 2012, 12:46 UTC, from Pauli Virtanen p...@iki.fi:

Happyman bahtiyor_zohidov at mail.ru writes: [clip] IF I GIVE ( it is necessary value for my program ): a , b = sph_jn ( 536 , 2513.2741228718346 + 201.0619298974676j ) The implementation of the spherical Bessel functions is through this Fortran code: https://github.com/scipy/scipy/blob/master/scipy/special/specfun/specfun.f#L1091 It does not have asymptotic expansions for dealing with parts of the complex plane where the computation via the recurrence does not work. -- Pauli Virtanen
Re: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions
Happyman bahtiyor_zohidov at mail.ru writes: Thanks Pauli. But I have already built a short version for the Bessel functions, and the code you pointed me to is in Fortran. I also used f2py, but I could not manage to read the Fortran code. That is why I asked here what is wrong on the Python side.

That Fortran code is `sph_jn`, which you used. It works via f2py. Only some of the special functions in scipy.special are written in Python. Most of them are in C or in Fortran, using some existing special-function library not written by us. Some of the implementations provided by these libraries are not complete, and do not cover the whole complex plane (or the whole real axis). Other functions (the more common ones), however, have very good implementations. -- Pauli Virtanen
Re: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions
I have everything in C or Fortran... Following my friends' recommendations I started learning Python for my research... Do you mean that the functions which gave the NaN result have not been fully developed in Python yet? For about 1.5 months I have been facing the same problem with Bessel functions. I think the code that I showed as an example is not working in Python. What to do?

Friday, 21 December 2012, 13:17, from Pauli Virtanen p...@iki.fi:

Happyman bahtiyor_zohidov at mail.ru writes: Thanks Pauli. But I have already built a short version for the Bessel functions, and the code you pointed me to is in Fortran. I also used f2py, but I could not manage to read the Fortran code. That is why I asked here what is wrong on the Python side. That Fortran code is `sph_jn`, which you used. It works via f2py. Only some of the special functions in scipy.special are written in Python. Most of them are in C or in Fortran, using some existing special-function library not written by us. Some of the implementations provided by these libraries are not complete, and do not cover the whole complex plane (or the whole real axis). Other functions (the more common ones), however, have very good implementations. -- Pauli Virtanen
Re: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions
On 12/21/2012 02:30 PM, Happyman wrote: I have everything in C or Fortran... Following my friends' recommendations I started learning Python for my research... Do you mean that the functions which gave the NaN result have not been fully developed in Python yet?

The way most of NumPy and SciPy works is by calling into C and Fortran code.

For about 1.5 months I have been facing the same problem with Bessel functions. I think the code that I showed as an example is not working in Python. What to do?

Do you have an implementation of the Bessel functions that works as you wish in C or Fortran? If so, that could be wrapped and called from Python.

Dag Sverre
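For concreteness, a hedged sketch of that wrapping route, using numpy.f2py to compile a toy Fortran routine at runtime. The module and routine names here are made up, and the routine is a placeholder rather than a real Bessel implementation:

import numpy.f2py

src = """
      subroutine scale2(x, y)
      double precision x, y
cf2py intent(out) y
      y = 2.0d0 * x
      end
"""
# Builds and links an extension module named "toy" in the current directory.
numpy.f2py.compile(src, modulename='toy')

import toy
print(toy.scale2(21.0))   # -> 42.0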
Re: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions
Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no writes: [clip] Do you have an implementation of the Bessel functions that works as you wish in C or Fortran? If so, that could be wrapped and called from Python.

For spherical Bessel functions it's possible to also use the relation to Bessel functions, which have a better implementation (AMOS) in Scipy:

import numpy as np
from scipy.special import jv

def sph_jn(n, z):
    return jv(n + 0.5, z) * np.sqrt(np.pi / (2 * z))

print sph_jn(536, 2513.2741228718346 + 201.0619298974676j)
# (2.5167666386507171e+81+3.3576357192536334e+81j)

This should solve the problem. -- Pauli Virtanen
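As a quick sanity check of the relation, one can compare it against the specfun-based routine at a point where the latter still works, reusing the original poster's small example (sph_jn from scipy.special, as available at the time, returns the values for orders 0..n):

import numpy as np
from scipy.special import jv, sph_jn

n, z = 2, 5 + 0.4j
via_jv = jv(n + 0.5, z) * np.sqrt(np.pi / (2 * z))
print(via_jv)               # ~0.13731305 - 0.07165432j
print(sph_jn(n, z)[0][-1])  # last entry of the j_n values; should agree closely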
Re: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions
Received from Pauli Virtanen on Fri, Dec 21, 2012 at 08:59:02AM EST: Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no writes: [clip] Do you have an implementation of the Bessel functions that works as you wish in C or Fortran? If so, that could be wrapped and called from Python. For spherical Bessel functions it's possible to also use the relation to Bessel functions, which have a better implementation (AMOS) in Scipy:

import numpy as np
from scipy.special import jv

def sph_jn(n, z):
    return jv(n + 0.5, z) * np.sqrt(np.pi / (2 * z))

print sph_jn(536, 2513.2741228718346 + 201.0619298974676j)
# (2.5167666386507171e+81+3.3576357192536334e+81j)

This should solve the problem.

You can also try the spherical Bessel function in Sympy; it should be able to handle this case as well [1].

L.G.

[1] http://docs.sympy.org/0.7.2/modules/functions/special.html#bessel-type-functions
Re: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions
I think the code you suggested has the same shape as what I already have. The problem is not there, Sir. I will give you exactly what I was talking about; I have the code ready already (it would be kind of you to check the following):

import numpy as np
from numpy import sqrt, pi
from scipy.special import jv, yv

## Bessel function of the first kind
# mathematical form: Jn(x) -- n = arg1, x = arg2
# returns the Jn(x) value in complex form
def bes_1(arg1, arg2):
    nu = arg1 + 0.5                  # nu = n + 0.5: Jn(x) -> Jnu(x)
    return sqrt(pi / (2 * arg2)) * np.round(jv(nu, arg2), 5)   # jv from scipy.special

## Bessel function of the second kind
# mathematical form: Yn(x) -- n = arg1, x = arg2
# returns the Yn(x) value in complex form
def bes_2(arg1, arg2):
    nu = arg1 + 0.5                  # nu = n + 0.5: Yn(x) -> Ynu(x)
    return sqrt(pi / (2 * arg2)) * np.round(yv(nu, arg2), 5)   # yv from scipy.special

## Hankel function of the first kind
# mathematical form: Hn(x) = Jn(x) + Yn(x)j
# returns the Hn(x) value in complex form
def hank_1(arg1, arg2):
    return bes_1(arg1, arg2) + bes_2(arg1, arg2) * 1.0j

## Bessel function of the first kind, derivative
# mathematical form: d(z*jn(z)) = z*jn-1(z) - n*jn(z), where z = x or m*x
def bes_der1(arg1, arg2):
    return arg2 * bes_1(arg1 - 1, arg2) - arg1 * bes_1(arg1, arg2)

## Hankel function of the first kind, derivative
# mathematical form: d(z*hankeln(z)) = z*hankeln-1(z) - n*hankeln(z), where z = x or m*x
def hank_der1(arg1, arg2):
    return arg2 * hank_1(arg1 - 1, arg2) - arg1 * hank_1(arg1, arg2)

FOR MY CASE:

m = 2513.2741228718346 + 201.0619298974676j
x = 502.6548245743669

def F(m, x):
    nmax = x + 2.0 + 4.0 * x ** (1.0 / 3.0)   # nmax gives 536.0 as the expected value
    nstop = np.round(nmax)
    n = np.arange(0.0, nstop, dtype=float)
    z = m * x
    m2 = m * m
    val1 = m2 * bes_1(n, z) * bes_der1(n, x)
    val2 = bes_1(n, x) * bes_der1(n, z)
    val3 = m2 * bes_1(n, z) * hank_der1(n, x)
    val4 = hank_1(n, x) * bes_der1(n, z)
    an = (val1 - val2) / (val3 - val4)
    val5 = bes_1(n, z) * bes_der1(n, x)
    val6 = bes_1(n, x) * bes_der1(n, z)
    val7 = bes_1(n, z) * hank_der1(n, x)
    val8 = hank_1(n, x) * bes_der1(n, z)
    bn = (val5 - val6) / (val7 - val8)
    return an, bn

The problem is in returning an, bn for the values of m and x given above in the F(m, x) function. What is wrong with this?

Friday, 21 December 2012, 13:59, from Pauli Virtanen p...@iki.fi:

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no writes: [clip] Do you have an implementation of the Bessel functions that works as you wish in C or Fortran? If so, that could be wrapped and called from Python. For spherical Bessel functions it's possible to also use the relation to Bessel functions, which have a better implementation (AMOS) in Scipy:

import numpy as np
from scipy.special import jv

def sph_jn(n, z):
    return jv(n + 0.5, z) * np.sqrt(np.pi / (2 * z))

print sph_jn(536, 2513.2741228718346 + 201.0619298974676j)
# (2.5167666386507171e+81+3.3576357192536334e+81j)

This should solve the problem. -- Pauli Virtanen
Re: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions
Hi, your code tries to evaluate:

z = 1263309.3633394379 + 101064.74910119522j
jv(536, z)   # -> (inf + inf*j)

In reality, this number is not infinite:

jv(536, z) == -2.3955170861527422e+43888 + 9.6910119847300024e+43887j

These numbers (~ 10^43888) are too large for the floating-point numbers that computers use (maximum ~ 10^308). This is why you get infinities and NaNs in the result. The same is true for the spherical Bessel functions. You will not be able to do this calculation in any software that uses only floating-point numbers (Scipy, Matlab, ...). You need to use analytical properties of your problem to get rid of such large numbers. Alternatively, you can use arbitrary-precision numbers. Python has libraries for that: http://code.google.com/p/mpmath/

By the way, the proper place for this discussion is the following mailing list: http://mail.scipy.org/mailman/listinfo/scipy-user

-- Pauli Virtanen
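A minimal sketch of the arbitrary-precision route with mpmath (my example, not from the thread): besselj accepts complex arguments, the working precision is set via mp.dps, and the evaluation can be slow for such extreme arguments.

from mpmath import mp, mpc, besselj, sqrt, pi

mp.dps = 30                                     # 30 significant digits
z = mpc(1263309.3633394379, 101064.74910119522)
j = besselj(536, z)                             # ~1e43888 is no problem for mpmath
sph = j * sqrt(pi / (2 * z))                    # spherical Bessel via the jv relation
print(sph)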
[Numpy-discussion] Numpy speed ups to simple tasks - final findings and suggestions
Hello,

On Dec/2/2012 I sent an email about some meaningful speed problems I was facing when porting our core program from Numeric (Python 2.2) to Numpy (Python 2.6). Some of our tests went from 30 seconds to 90 seconds, for example. I saw interest from some people in this list and I left the topic saying I would do a more complete profile of the program and report back anything meaningful. It took me quite a bit to get through things because I ended up having to figure out how to create a Visual Studio project that I could debug and compile from the IDE.

First, the obvious: everything that relies heavily on Numpy for speed (mid to large arrays) is pretty much the same speed when compared to Numeric. The areas that are considerably slower in Numpy vs Numeric are the trivial tasks that we end up using either for convenience (small arrays) or because scalar types such as 'float64' propagate everywhere throughout the program and creep into several of our data structures. This email is really only relevant to people stuck with doing trivial operations with Numpy who want a meaningful speed boost. I focused on float64.

* NOTE: I ended up doing everything in Numpy 1.6.2 as opposed to using the latest stuff. I am going to guess all my findings still apply, but I will only be able to confirm that later.

=====

In this email I include:

1) The main bottlenecks I found, which I list and refer to as (a), (b) and (c).
2) The benchmark tests I made and their speed-ups.
3) Details on the actual changes to the C code.

=====

Summary of conclusions:

- Our code is finally running as fast as it used to, after some changes in Numpy and also some minor changes in our code. Half of our problems were caused by instantiating small arrays several times, which is fairly slow in Numpy. The other half were caused by the slow math performance of Numpy scalars. We did find a particular Python function in our code that was a big candidate to be rewritten in C, and just got it done.

- In Numpy I did four sets of changes in the source code. I believe three of them are relevant to everyone using Numpy and one of them is probably not going to be very popular.

- The main speed-up is in float64 scalar operations and creation of small arrays from lists or tuples. The speed-up in small-array operations is only marginal, but I believe there is potential to get them at least twice as fast.

=====

1) By profiling the program I found three generic types of bottlenecks in Numpy that were affecting our code:

a) Operations that result in Python internally raising an error, e.g. PyObject_GetAttrString(obj, "__array_priority__") when __array_priority__ is not an attribute of obj.

b) Creation / destruction of scalar array types. In some places this was happening unnecessarily.

c) Ufuncs trying to figure out the proper type for an operation (e.g. if I multiply a float64 array by a float64 array, a fair amount of time is spent deciding that it should use float64).

I came up with specific changes to address (a) and (b). I gave up on (c) for now since I couldn't think of a way to speed it up without a large rewrite and I really don't know the Numpy code (never saw it before this work).
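To make bottlenecks (b) and (c) concrete, a small timeit sketch of my own comparing Python float math with Numpy float64 scalar math (absolute numbers are machine-dependent; the gap is the scalar-overhead effect described above):

import timeit

setup = "import numpy as np; pf = 3.1; f64 = np.float64(3.1)"
print(timeit.timeit("pf * pf", setup=setup, number=10**6))     # plain Python float multiply
print(timeit.timeit("f64 * f64", setup=setup, number=10**6))   # several times slower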
=====

2) The tests I did were (some are Python natives for reference):

1) Array * Array
2) PyFloat * Array
3) Float64 * Array
4) PyFloat + Array
5) Float64 + Array
6) PyFloat * PyFloat
7) Float64 * Float64
8) PyFloat * Float64
9) PyFloat * vector1[1]
10) PyFloat + Float64
11) PyFloat < Float64
12) if PyFloat < Float64:
13) Create array from list
14) Assign PyFloat to all
15) Assign Float64 to all
16) Float64 * Float64 * Float64 * Float64 * Float64
17) Float64 * Float64 * Float64 * Float64 * Float64
18) Float64 ** 2
19) PyFloat ** 2

where:

Array - Numpy array of float64 of two elements (vector1 = array([2.0, 3.1]))
PyFloat - pyFloat = 3.1
Float64 - Numpy scalar 'float64' (scalarFloat64 = vector1[1])
Create array from list - newVec = array([0.2, 0.3], dtype=float64)
Assign PyFloat to all - vector1[:] = pyFloat
Assign Float64 to all - vector1[:] = scalarFloat64

I ran every test 10 times and timed it in seconds. These are the base timings with the original Numpy:

      TIME[s]   TEST
 1)   0.2003    Array * Array
 2)   0.2502    PyFloat * Array
 3)   0.2689    Float64 * Array
 4)   0.2469    PyFloat + Array
 5)   0.2640    Float64 + Array
 6)   0.0055    PyFloat * PyFloat
 7)   0.0278    Float64 * Float64
 8)   0.0778    PyFloat * Float64
 9)   0.0893    PyFloat * vector1[1]
10)   0.0767    PyFloat + Float64
11)   0.0532    PyFloat < Float64
12)   0.0543    if PyFloat < Float64:
13)   0.6788    Create array from list
14)   0.0708    Assign PyFloat to all
15)   0.0775    Assign Float64 to all
16)   0.2994
Re: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions
Thanks. But I could not find it for 64-bit Windows. Second question: did you mean that I have to put limits on those numbers?

Friday, 21 December 2012, 15:45 UTC, from Pauli Virtanen p...@iki.fi:

Hi, your code tries to evaluate: z = 1263309.3633394379 + 101064.74910119522j; jv(536, z) # -> (inf + inf*j). In reality, this number is not infinite: jv(536, z) == -2.3955170861527422e+43888 + 9.6910119847300024e+43887j. These numbers (~ 10^43888) are too large for the floating-point numbers that computers use (maximum ~ 10^308). This is why you get infinities and NaNs in the result. The same is true for the spherical Bessel functions. You will not be able to do this calculation in any software that uses only floating-point numbers (Scipy, Matlab, ...). You need to use analytical properties of your problem to get rid of such large numbers. Alternatively, you can use arbitrary-precision numbers. Python has libraries for that: http://code.google.com/p/mpmath/ By the way, the proper place for this discussion is the following mailing list: http://mail.scipy.org/mailman/listinfo/scipy-user -- Pauli Virtanen
Re: [Numpy-discussion] Travis-CI stopped supporting Python 3.1, but added 3.3
On Fri, Dec 21, 2012 at 2:23 AM, Ondřej Čertík ondrej.cer...@gmail.com wrote: Hi, I noticed that the 3.1 tests are now failing. After clarification with the Travis guys: https://groups.google.com/d/topic/travis-ci/02iRu6kmwY8/discussion I've submitted a fix to our .travis.yml (and backported to 1.7): https://github.com/numpy/numpy/pull/2850 https://github.com/numpy/numpy/pull/2851 In case you were wondering. Do we need to support Python 3.1? We could in principle test 3.1 just like we test 2.4. I don't know if it is worth the pain.

I think for 1.7.x we should.

Ralf
Re: [Numpy-discussion] Travis-CI stopped supporting Python 3.1, but added 3.3
On Fri, Dec 21, 2012 at 1:23 AM, Ondřej Čertík ondrej.cer...@gmail.com wrote: Hi, I noticed that the 3.1 tests are now failing. After clarification with the Travis guys: https://groups.google.com/d/topic/travis-ci/02iRu6kmwY8/discussion I've submitted a fix to our .travis.yml (and backported to 1.7): https://github.com/numpy/numpy/pull/2850 https://github.com/numpy/numpy/pull/2851 In case you were wondering. Do we need to support Python 3.1? We could in principle test 3.1 just like we test 2.4. I don't know if it is worth the pain.

It probably isn't much pain, since Python is easy to compile, and I don't think we've been running into many cases where supporting 3.1 required workarounds yet? (As compared to 2.4, where we get compatibility-breaking patches constantly.) It's a crude metric and I'm not sure what conclusion it suggests, but we have download stats for the different python-version-specific pre-built numpy binaries: http://sourceforge.net/projects/numpy/files/NumPy/1.6.1/ http://pypi.python.org/pypi/numpy/1.6.1 http://sourceforge.net/projects/numpy/files/NumPy/1.6.2/ http://pypi.python.org/pypi/numpy/1.6.2 It looks like python 2.5 is several times more popular than 3.1, and its popularity is dropping quickly. (30% fewer downloads so far for 1.6.2 than 1.6.1, even though 1.6.2 has more downloads overall.)

-n
Re: [Numpy-discussion] Travis-CI stopped supporting Python 3.1, but added 3.3
On Fri, Dec 21, 2012 at 12:05 PM, Nathaniel Smith n...@pobox.com wrote: On Fri, Dec 21, 2012 at 1:23 AM, Ondřej Čertík ondrej.cer...@gmail.com wrote: Hi, I noticed that the 3.1 tests are now failing. After clarification with the Travis guys: https://groups.google.com/d/topic/travis-ci/02iRu6kmwY8/discussion I've submitted a fix to our .travis.yml (and backported to 1.7): https://github.com/numpy/numpy/pull/2850 https://github.com/numpy/numpy/pull/2851 In case you were wondering. Do we need to support Python 3.1? We could in principle test 3.1 just like we test 2.4. I don't know if it is worth the pain. It probably isn't much pain, since Python is easy to compile, and I don't think we've been running into many cases where supporting 3.1 required workarounds yet? (As compared to 2.4, where we get compatibility-breaking patches constantly.)

Yes, I was going to suggest that it really is not a big deal to support 3.1, as far as I can tell from my experience with Travis tests on numpy PRs over the last half a year. We can compile it like we do for 2.4, or we can even provide a prebuilt binary and just install it in the Travis VM each time. So I'll try to provide a PR which implements testing of 3.1, unless someone beats me to it.

Ondrej
Re: [Numpy-discussion] [pystatsmodels] Re: ANN: pandas 0.10.0 released
On Dec 21, 2012, at 3:27 PM, Collin Sellman collin.sell...@gmail.com wrote: Thanks, Wes and team. I've been looking through the new features, but haven't found any documentation on the integration with the Google Analytics API. I was just in the midst of trying to pull data into Pandas from GA in v0.9.0, so would love to try what you built in 0.10. -Collin

On Monday, December 17, 2012 10:19:49 AM UTC-7, Wes McKinney wrote:

hi all,

I'm super excited to announce the pandas 0.10.0 release. This is a major release including a new high-performance file reading engine with tons of new user-facing functionality as well, a bunch of work on the HDF5/PyTables integration layer, much-expanded Unicode support, a new option/configuration interface, integration with the Google Analytics API, and a wide array of other new features, bug fixes, and performance improvements. I strongly recommend that all users get upgraded as soon as feasible. Many performance improvements made are quite substantial over 0.9.x; see the vbenchmarks at the end of this e-mail.

As of this release, we are no longer supporting Python 2.5. Also, this is the first release to officially support Python 3.3.

Note: there are a number of minor, but necessary API changes that long-time pandas users should pay attention to in the What's New.

Thanks to all who contributed to this release, especially Chang She, Yoval P, and Jeff Reback (and everyone else listed in the commit log!). As always, source archives and Windows installers are on PyPI.

What's new: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html
Installers: http://pypi.python.org/pypi/pandas

$ git log v0.9.1..v0.10.0 --pretty=format:%aN | sort | uniq -c | sort -rn
    246 Wes McKinney
    140 y-p
     99 Chang She
     45 jreback
     18 Abraham Flaxman
     17 Jeff Reback
     14 locojaydev
     11 Keith Hughitt
      5 Adam Obeng
      2 Dieter Vandenbussche
      1 zach powers
      1 Luke Lee
      1 Laurent Gautier
      1 Ken Van Haren
      1 Jay Bourque
      1 Donald Curtis
      1 Chris Mulligan
      1 alex arsenovic
      1 A. Flaxman

Happy data hacking!

- Wes

What is it
==========

pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with relational, time series, or any other kind of labeled data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python.

Links
=====

Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst
Documentation: http://pandas.pydata.org
Installers: http://pypi.python.org/pypi/pandas
Code Repository: http://github.com/pydata/pandas
Mailing List: http://groups.google.com/group/pydata
Performance vs. v0.9.0
======================

Benchmarks from https://github.com/pydata/pandas/tree/master/vb_suite
Ratio < 1 means that v0.10.0 is faster

                                          v0.10.0     v0.9.0    ratio
name
unstack_sparse_keyspace                    1.2813   144.1262   0.0089
groupby_frame_apply_overhead              20.1520   337.3330   0.0597
read_csv_comment2                         25.3097   363.2860   0.0697
groupby_frame_apply                       75.1554   504.1661   0.1491
frame_iteritems_cached                     0.0711     0.3919   0.1815
read_csv_thou_vb                          35.2690   191.9360   0.1838
concat_small_frames                       12.9019    55.3561   0.2331
join_dataframe_integer_2key                5.8184    21.5823   0.2696
series_value_counts_strings                5.3824    19.1262   0.2814
append_frame_single_homogenous             0.3413     0.9319   0.3662
read_csv_vb                               18.4084    46.9500   0.3921
read_csv_standard                         12.0651    29.9940   0.4023
panel_from_dict_all_different_indexes     73.6860   158.2949   0.4655
frame_constructor_ndarray                  0.0471     0.0958   0.4918
groupby_first                              3.8502     7.1988   0.5348
groupby_last                               3.6962     6.7792   0.5452
panel_from_dict_two_different_indexes     50.7428    86.4980   0.5866
append_frame_single_mixed                  1.2950     2.1930   0.5905
frame_get_numeric_data                     0.0695     0.1119   0.6212
replace_fillna                             4.6349     7.0540   0.6571
frame_to_csv                             281.9340   427.7921   0.6590
replace_replacena                          4.7154     7.1207   0.6622
frame_iteritems                            2.5862     3.7463   0.6903
series_align_int64_index                  29.7370    41.2791   0.7204
join_dataframe_integer_key                 1.7980     2.4303