Re: [Numpy-discussion] ANN: Numpy 1.6.1 release candidate 1
On Tue, Jun 21, 2011 at 3:55 AM, Bruce Southey bsout...@gmail.com wrote:

On Mon, Jun 20, 2011 at 2:43 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote:

On Mon, Jun 20, 2011 at 8:50 PM, Bruce Southey bsout...@gmail.com wrote:

I copied the files but that just moves the problem, so that patch is incorrect. I get the same errors on Fedora 15 supplied Python 3.2 for numpy 1.6.0 and using git from 'https://github.com/rgommers/numpy.git'. Numpy is getting the Fedora-supplied Atlas (1.5.1 does not). It appears that there is a misunderstanding of the PEP, because 'SO' and 'SOABI' do exactly what the PEP says on my systems:

It doesn't on OS X. But that's not even the issue. As I explained before, the issue is that get_config_var('SO') is used to determine the extension of both system libraries (such as liblapack.so) and Python-related ones (such as multiarray.cpython-32m.so). And the current functions don't do mindreading.

>>> from distutils import sysconfig
>>> sysconfig.get_config_var('SO')
'.cpython-32m.so'
>>> sysconfig.get_config_var('SOABI')
'cpython-32m'

Consequently, the name 'multiarray.pyd' created within numpy is invalid.

I removed the line in ctypeslib that was trying this, so I think you are not testing my patch.

Ralf

Looking at the code, I see this line, which makes no sense given that the second part is true under Linux:

if (not is_python_ext) and 'SOABI' in distutils.sysconfig.get_config_vars():

So I think the 'get_shared_lib_extension' function is wrong and probably unneeded.

Bruce

Just to show that this is the new version, I added two print statements in the 'get_shared_lib_extension' function:

>>> from numpy.distutils.misc_util import get_shared_lib_extension
>>> get_shared_lib_extension(True)
first so_ext .cpython-32mu.so
returned so_ext .cpython-32mu.so
'.cpython-32mu.so'
>>> get_shared_lib_extension(False)
first so_ext .cpython-32mu.so
returned so_ext .so
'.so'

This all looks correct. Before you were saying you were still getting 'multiarray.pyd'; now your error says 'multiarray.so'. So now you are testing the right thing. Test test_basic2() in test_ctypeslib was fixed, but I forgot to fix it in two other places. I updated both my branches on github, please try again.

The reason for the same location is obvious, because all the patch does is move the code that gets the extension into that function. So the 'get_shared_lib_extension' function returns the extension '.so' to the load_library function. However, that name is wrong under Linux, as it has to be 'multiarray.cpython-32mu.so', and hence the error in the same location. I did come across this thread 'http://bugs.python.org/issue10262' which indicates why Linux is different by default.

So what is the actual name of the multiarray shared library with the Mac? If it is 'multiarray.so' then the correct name is libname + sysconfig.get_config_var('SO'), as I previously indicated.

It is, and yes that's correct. Orthogonal to the actual issue though.

Ralf

Bruce

$ python3
Python 3.2 (r32:88445, Feb 21 2011, 21:11:06) [GCC 4.6.0 20110212 (Red Hat 4.6.0-0.7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.test()
Running unit tests for numpy
NumPy version 2.0.0.dev-Unknown
NumPy is installed in /usr/lib64/python3.2/site-packages/numpy
Python version 3.2 (r32:88445, Feb 21 2011, 21:11:06) [GCC 4.6.0 20110212 (Red Hat 4.6.0-0.7)]
nose version 1.0.0
first so_ext .cpython-32mu.so
returned so_ext .so
...F...S..F.E..KK..K..K...first so_ext .cpython-32mu.so returned so_ext .so
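For readers checking the same thing on a current Python (this note is an editorial addition, not from the thread): the 'SO' config variable discussed above was later renamed, and modern Pythons expose the ABI-tagged extension suffix as 'EXT_SUFFIX'. A quick way to see the suffix your interpreter would use for an extension module such as multiarray:

```python
import sysconfig

# Python 3.2 exposed this as 'SO' (e.g. '.cpython-32mu.so'); later Pythons
# call it 'EXT_SUFFIX'. Either way, appending it to the module base name
# gives the on-disk filename of a Python extension module.
ext = sysconfig.get_config_var('EXT_SUFFIX')
print('multiarray' + ext)  # e.g. 'multiarray.cpython-311-x86_64-linux-gnu.so'
```

This is exactly the distinction the thread is about: a plain system library keeps '.so', while a Python extension module carries the ABI tag.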
[Numpy-discussion] faster in1d() for monotonic case?
The following call is a bottleneck for me: np.in1d( large_array.field_of_interest, values_of_interest ) I'm not sure how in1d() is implemented, but this call seems to be slower than O(n) and faster than O( n**2 ), so perhaps it sorts the values_of_interest and does a binary search for each element of large_array? In any case, in my situation I actually know that field_of_interest increases monotonically across the large_array. So if I were writing this in C, I could do a simple O(n) loop by sorting values_of_interest and then just checking each value of large_array against values_of_interest[ i ] and values_of_interest[ i + 1 ], and any time it matched values_of_interest[ i + 1 ] increment i. Is there some way to achieve that same efficiency in numpy, taking advantage of the monotonic nature of field_of_interest? ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Controlling endianness of ndarray.tofile()
Hi, On my system (Intel Xeon, Windows 7 64-bit), ndarray.tofile() outputs in little-endian. This is a bit inconvenient, since everything else I do is in big-endian. Unfortunately, scipy.io.write_array() is deprecated, and I can't find any other routines that write pure raw binary. Are there any other options, or perhaps could tofile() be modified to allow control over endianness? Cheers, Ben -- Benjamin D. Forbes School of Physics The University of Melbourne Parkville, VIC 3010, Australia
[Numpy-discussion] Re: faster in1d() for monotonic case?
Did you try searchsorted? Nadav

From: numpy-discussion-boun...@scipy.org [mailto:numpy-discussion-boun...@scipy.org] On behalf of Michael Katz
Sent: Tuesday, June 21, 2011 10:06
To: Discussion of Numerical Python
Subject: [Numpy-discussion] faster in1d() for monotonic case?
Re: [Numpy-discussion] Controlling endianness of ndarray.tofile()
Hi Ben, based on this example https://bitbucket.org/lannybroo/numpyio/src/a6191c989804/numpyIO.py I suspect the way to do it is with numpy.byteswap() and numpy.tofile(). From http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.byteswap.html we can do

>>> A = np.array([1, 256, 8755], dtype=np.int16)
>>> map(hex, A)
['0x1', '0x100', '0x2233']
>>> A.tofile('a_little.bin')
>>> A.byteswap(True)
array([  256,     1, 13090], dtype=int16)
>>> map(hex, A)
['0x100', '0x1', '0x3322']
>>> A.tofile('a_big.bin')

Gary
Re: [Numpy-discussion] Controlling endianness of ndarray.tofile()
Thanks Gary, that works. Out of interest I timed it: http://pastebin.com/HA4Qn9Ge On average the swapping incurred a 0.04 second penalty (compared with a 1.5 second total run time) for a 4096x4096 array of 64-bit reals. So there is no real penalty. Cheers, Ben
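For completeness (this alternative is an editorial addition, not from the thread): instead of byteswapping in place, one can cast to an explicitly big-endian dtype before writing. tofile() simply dumps the array's memory, so the file then comes out big-endian without mutating the working array:

```python
import numpy as np

A = np.array([1, 256, 8755], dtype=np.int16)
big = A.astype('>i2')      # '>' forces big-endian byte order in memory
big.tofile('a_big.bin')    # raw dump, so the bytes on disk are big-endian
# Read it back with the matching dtype to round-trip the values
B = np.fromfile('a_big.bin', dtype='>i2')
# B holds the same values: 1, 256, 8755
```

This avoids the in-place byteswap(True) step, at the cost of one temporary copy.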
Re: [Numpy-discussion] ANN: Numpy 1.6.1 release candidate 1
On 06/21/2011 01:01 AM, Ralf Gommers wrote:

So now you are testing the right thing. Test test_basic2() in test_ctypeslib was fixed, but I forgot to fix it in two other places. I updated both my branches on github, please try again.

So what is the actual name of the multiarray shared library with the Mac? If it is 'multiarray.so' then the correct name is libname + sysconfig.get_config_var('SO'), as I previously indicated.

It is, and yes that's correct. Orthogonal to the actual issue though.

Ralf

While the tests now pass, you have now changed an API for load_library. This is not something that is meant to occur in a bug-fix release, and the new argument is undocumented. But I do not understand the need for this extra complexity when libname + sysconfig.get_config_var('SO') works on Linux, Windows and Mac.
Bruce

$ git clone git://github.com/rgommers/numpy.git numpy
$ cd numpy
$ git checkout sharedlib-ext
Switched to branch 'sharedlib-ext'
$ git branch -a
  master
* sharedlib-ext
  remotes/origin/1.5.x
  remotes/origin/HEAD -> origin/master
  remotes/origin/compilation-issues-doc
  remotes/origin/doc-noinstall
  remotes/origin/maintenance/1.4.x
  remotes/origin/maintenance/1.5.x
  remotes/origin/maintenance/1.6.x
  remotes/origin/master
  remotes/origin/sharedlib-ext
  remotes/origin/sharedlib-ext-1.6.x
  remotes/origin/swigopts
  remotes/origin/ticket-1218-array2string
  remotes/origin/ticket-1689-fromstring
  remotes/origin/ticket-99
  remotes/origin/warn-noclean-build
[built and installed numpy]
$ python3
Python 3.2 (r32:88445, Feb 21 2011, 21:11:06) [GCC 4.6.0 20110212 (Red Hat 4.6.0-0.7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.__version__
'2.0.0.dev-Unknown'
>>> np.test()
Running unit tests for numpy
NumPy version 2.0.0.dev-Unknown
NumPy is installed in /usr/lib64/python3.2/site-packages/numpy
Python version 3.2 (r32:88445, Feb 21 2011, 21:11:06) [GCC 4.6.0
Re: [Numpy-discussion] Re: faster in1d() for monotonic case?
I'm not quite sure how to use searchsorted to get the output I need (e.g., the length of the output needs to be as long as large_array). But in any case it says it uses binary search, so it would seem to be an O( n * log( n ) ) solution, whereas I'm hoping for an O( n ) solution.

From: Nadav Horesh nad...@visionsense.com
To: Discussion of Numerical Python numpy-discussion@scipy.org
Sent: Tue, June 21, 2011 2:33:24 AM
Subject: [Numpy-discussion] Re: faster in1d() for monotonic case?

Did you try searchsorted? Nadav
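For what it's worth, searchsorted can produce an in1d-style boolean mask, the same length as large_array; the following is an editorial sketch with made-up example arrays, assuming values_of_interest is sorted. It is O(n log m) with m the number of values of interest, which is typically small, so in practice it is close to the O(n) loop described above:

```python
import numpy as np

large = np.array([1, 2, 2, 3, 5, 7, 7, 8])   # monotonically increasing field
values = np.array([2, 5, 8])                 # sorted values of interest

idx = np.searchsorted(values, large)         # binary search per element
idx[idx == len(values)] = 0                  # clamp indices that fell off the end
mask = values[idx] == large                  # True where the element is in values
print(mask)  # [False  True  True False  True False False  True]
```

The clamp step handles elements larger than values.max(), which searchsorted maps to index len(values); pointing them back at index 0 is safe because the subsequent equality test rejects them.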
[Numpy-discussion] poor performance of sum with sub-machine-word integer types
Hello all, As a result of the fast greyscale conversion thread, I noticed an anomaly with numpy.ndarray.sum(): summing along certain axes is much slower with sum() than doing it explicitly, but only with integer dtypes, and only when the size of the dtype is less than the machine word. I checked in 32-bit and 64-bit modes, and in both cases the speed difference only went away once the dtype got as large as the machine word. See below... Is this something to do with numpy, or something inexorable about machine / memory architecture? Zach

Timings -- 64-bit mode:
In [2]: i = numpy.ones((1024,1024,4), numpy.int8)
In [3]: timeit i.sum(axis=-1)
10 loops, best of 3: 131 ms per loop
In [4]: timeit i[...,0]+i[...,1]+i[...,2]+i[...,3]
100 loops, best of 3: 2.57 ms per loop
In [5]: i = numpy.ones((1024,1024,4), numpy.int16)
In [6]: timeit i.sum(axis=-1)
10 loops, best of 3: 131 ms per loop
In [7]: timeit i[...,0]+i[...,1]+i[...,2]+i[...,3]
100 loops, best of 3: 4.75 ms per loop
In [8]: i = numpy.ones((1024,1024,4), numpy.int32)
In [9]: timeit i.sum(axis=-1)
10 loops, best of 3: 131 ms per loop
In [10]: timeit i[...,0]+i[...,1]+i[...,2]+i[...,3]
100 loops, best of 3: 6.37 ms per loop
In [11]: i = numpy.ones((1024,1024,4), numpy.int64)
In [12]: timeit i.sum(axis=-1)
100 loops, best of 3: 16.6 ms per loop
In [13]: timeit i[...,0]+i[...,1]+i[...,2]+i[...,3]
100 loops, best of 3: 15.1 ms per loop

Timings -- 32-bit mode:
In [2]: i = numpy.ones((1024,1024,4), numpy.int8)
In [3]: timeit i.sum(axis=-1)
10 loops, best of 3: 138 ms per loop
In [4]: timeit i[...,0]+i[...,1]+i[...,2]+i[...,3]
100 loops, best of 3: 3.68 ms per loop
In [5]: i = numpy.ones((1024,1024,4), numpy.int16)
In [6]: timeit i.sum(axis=-1)
10 loops, best of 3: 140 ms per loop
In [7]: timeit i[...,0]+i[...,1]+i[...,2]+i[...,3]
100 loops, best of 3: 4.17 ms per loop
In [8]: i = numpy.ones((1024,1024,4), numpy.int32)
In [9]: timeit i.sum(axis=-1)
10 loops, best of 3: 22.4 ms per loop
In [10]: timeit i[...,0]+i[...,1]+i[...,2]+i[...,3]
100 loops, best of 3: 12.2 ms per loop
In [11]: i = numpy.ones((1024,1024,4), numpy.int64)
In [12]: timeit i.sum(axis=-1)
10 loops, best of 3: 29.2 ms per loop
In [13]: timeit i[...,0]+i[...,1]+i[...,2]+i[...,3]
10 loops, best of 3: 23.8 ms per loop
Re: [Numpy-discussion] poor performance of sum with sub-machine-word integer types
On Tue, Jun 21, 2011 at 10:46 AM, Zachary Pincus zachary.pin...@yale.edu wrote:

Is this something to do with numpy or something inexorable about machine / memory architecture?

It's because of the type conversion sum uses by default for greater precision.

In [8]: timeit i.sum(axis=-1)
10 loops, best of 3: 140 ms per loop
In [9]: timeit i.sum(axis=-1, dtype=int8)
100 loops, best of 3: 16.2 ms per loop

If you have 1.6, einsum is faster but also conserves type:

In [10]: timeit einsum('ijk->ij', i)
100 loops, best of 3: 5.95 ms per loop

We could probably make better loops for summing within kinds, i.e., accumulate in higher precision, then cast to specified precision. snip Chuck
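A caveat worth noting with the dtype=int8 trick above (this illustration is an editorial addition, not from the thread): passing a small accumulator dtype trades the default upcasting for speed, so partial sums can wrap around:

```python
import numpy as np

a = np.ones(300, dtype=np.int8)
print(a.sum())                # default accumulator upcasts to platform int: 300
print(a.sum(dtype=np.int8))   # accumulated in int8: wraps to 44 (300 - 256)
```

So the fast path is only safe when the true sum fits in the requested dtype, which is why the default upcasts in the first place.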
Re: [Numpy-discussion] poor performance of sum with sub-machine-word integer types
On Tue, Jun 21, 2011 at 9:46 AM, Zachary Pincus zachary.pin...@yale.edu wrote:

Is this something to do with numpy or something inexorable about machine / memory architecture?

One difference is that i.sum() changes the output dtype of int input when the int dtype is less than the default int dtype:

>>> i.dtype
dtype('int32')
>>> i.sum(axis=-1).dtype
dtype('int64')   # <-- dtype changed
>>> (i[...,0]+i[...,1]+i[...,2]+i[...,3]).dtype
dtype('int32')

Here are my timings:

>>> i = numpy.ones((1024,1024,4), numpy.int32)
>>> timeit i.sum(axis=-1)
1 loops, best of 3: 278 ms per loop
>>> timeit i[...,0]+i[...,1]+i[...,2]+i[...,3]
100 loops, best of 3: 12.1 ms per loop
>>> import bottleneck as bn
>>> timeit bn.func.nansum_3d_int32_axis2(i)
100 loops, best of 3: 8.27 ms per loop

Does making an extra copy of the input explain all of the speed difference (is this what np.sum does internally?):

>>> timeit i.astype(numpy.int64)
10 loops, best of 3: 29.2 ms per loop

No. Initializing the output also adds some time:

>>> timeit np.empty((1024,1024,4), dtype=np.int32)
10 loops, best of 3: 2.67 us per loop
>>> timeit np.empty((1024,1024,4), dtype=np.int64)
10 loops, best of 3: 12.8 us per loop

Switching back and forth between the input and output array takes more memory time too with int64 arrays compared to int32.
Re: [Numpy-discussion] poor performance of sum with sub-machine-word integer types
On Tue, Jun 21, 2011 at 11:17 AM, Keith Goodman kwgood...@gmail.com wrote:

Does making an extra copy of the input explain all of the speed difference (is this what np.sum does internally?)

No.

I think you can see the overhead here:

In [14]: timeit einsum('ijk->ij', i, dtype=int32)
100 loops, best of 3: 17.6 ms per loop
In [15]: timeit einsum('ijk->ij', i, dtype=int64)
100 loops, best of 3: 18 ms per loop
In [16]: timeit einsum('ijk->ij', i, dtype=int16)
100 loops, best of 3: 18.3 ms per loop
In [17]: timeit einsum('ijk->ij', i, dtype=int8)
100 loops, best of 3: 5.87 ms per loop

Chuck
Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs
On Mon, Jun 20, 2011 at 12:32 PM, Mark Wiebe mwwi...@gmail.com wrote:

NumPy has a mechanism built in to allow subclasses to adjust or override aspects of the ufunc behavior. While this goal is important, this mechanism only allows for very limited customization, making, for instance, masked arrays unable to work with the native ufuncs in a full and proper way. I would like to deprecate the current mechanism, in particular __array_prepare__ and __array_wrap__, and introduce a new method I will describe below. If you've ever used these mechanisms, please review this design to see if it meets your needs.

The current approach is at a dead end, so something better needs to be done.

Any class type which would like to override its behavior in ufuncs would define a method called _numpy_ufunc_, and optionally an attribute __array_priority__ as can already be done. The class which wins the priority battle gets its _numpy_ufunc_ function called as follows:

return arr._numpy_ufunc_(current_ufunc, *args, **kwargs)

To support this overloading, the ufunc would get a new support method, result_type, and there would be a new global function, broadcast_empty_like. The method ufunc.result_type behaves like the global np.result_type, but produces the output type or a tuple of output types specific to the ufunc, which may follow a different convention than regular arithmetic type promotion. This allows a class to create an output array of the correct type to pass to the ufunc if it needs to be different than the default. The function broadcast_empty_like is just like empty_like, but takes a list or tuple of arrays which are to be broadcast together for producing the output, instead of just one.

How does the ufunc get called so it doesn't get caught in an endless loop? I like the proposed method if it can also be used for classes that don't subclass ndarray. Masked array, for instance, should probably not subclass ndarray.
Thanks, Mark

A simple class which overrides the ufuncs might look as follows:

def sin(ufunc, *args, **kwargs):
    # Convert degrees to radians
    args[0] = np.deg2rad(args[0])
    # Return a regular array, since the result is not in degrees
    return ufunc(*args, **kwargs)

class MyDegreesClass:
    """Array-like object with a degrees unit"""

    def __init__(self, arr):
        self.arr = arr

    def _numpy_ufunc_(self, ufunc, *args, **kwargs):
        override = globals().get(ufunc.name)
        if override:
            return override(ufunc, *args, **kwargs)
        else:
            raise TypeError("ufunc %s incompatible with MyDegreesClass" % ufunc.name)

A more complex example will be something like this:

def my_general_ufunc(ufunc, *args, **kwargs):
    # Extract the 'out' argument. This only supports ufuncs with
    # one output, currently.
    out = kwargs.get('out')
    if len(args) > ufunc.nin:
        if out is None:
            out = args[ufunc.nin]
        else:
            raise ValueError("'out' given as both a positional and keyword argument")
    # Just want the inputs from here on
    args = args[:ufunc.nin]

    # Strip out MyArrayClass, but allow operations with regular ndarrays
    raw_in = []
    for a in args:
        if isinstance(a, MyArrayClass):
            raw_in.append(a.arr)
        else:
            raw_in.append(a)

    # Allocate the output array
    if out is not None:
        if not isinstance(out, MyArrayClass):
            raise TypeError("'out' must have type MyArrayClass")
    else:
        # Create the output array, obeying the 'order' parameter,
        # but disallowing subclasses
        out = np.broadcast_empty_like(args, order=kwargs.get('order'),
                                      dtype=ufunc.result_type(args),
                                      subok=False)

    # Override the output argument
    kwargs['out'] = out.arr

    # Call the ufunc
    ufunc(*args, **kwargs)

    # Return the output
    return out

class MyArrayClass:
    def __init__(self, arr):
        self.arr = arr

    def _numpy_ufunc_(self, ufunc, *args, **kwargs):
        override = globals().get(ufunc.name)
        if override:
            return override(ufunc, *args, **kwargs)
        else:
            return my_general_ufunc(ufunc, *args, **kwargs)

Chuck
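To make the dispatch idea concrete, here is a minimal, runnable editorial sketch of the proposed pattern. The _numpy_ufunc_ hook is only a proposal and does not exist in NumPy, so the hypothetical helper apply_ufunc below stands in for the ufunc machinery that would perform the check:

```python
import numpy as np

class Degrees:
    """Array-like wrapper whose values are in degrees (illustrative only)."""
    def __init__(self, arr):
        self.arr = np.asarray(arr, dtype=float)

    def _numpy_ufunc_(self, ufunc, *args, **kwargs):
        # Convert degrees to radians for the sin ufunc; reject anything else.
        if ufunc is np.sin:
            return ufunc(np.deg2rad(self.arr), **kwargs)
        raise TypeError('ufunc %s incompatible with Degrees' % ufunc.__name__)

def apply_ufunc(ufunc, x, **kwargs):
    # Stand-in for the proposed machinery: delegate to the hook if present,
    # otherwise call the ufunc directly on the argument.
    if hasattr(x, '_numpy_ufunc_'):
        return x._numpy_ufunc_(ufunc, x, **kwargs)
    return ufunc(x, **kwargs)

result = apply_ufunc(np.sin, Degrees([0.0, 90.0]))
print(result)  # approximately [0. 1.]
```

Note the hook returns a plain ndarray, which is how the sketch sidesteps the endless-loop question raised above: the delegated call operates on raw arrays, not on the wrapper.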
Re: [Numpy-discussion] poor performance of sum with sub-machine-word integer types
On Jun 21, 2011, at 1:16 PM, Charles R Harris wrote: It's because of the type conversion sum uses by default for greater precision. Aah, makes sense. Thanks for the detailed explanations and timings!
Re: [Numpy-discussion] fast numpy i/o
Neal Becker wrote: I'm wondering what are good choices for fast numpy array serialization? mmap: fast, but I guess not self-describing? hdf5: ? pickle: self-describing, but maybe not fast? others? I think, in addition, that hdf5 is the only one that easily interoperates with matlab? speaking of hdf5, I see: pyhdf5io 0.7 - Python module containing high-level hdf5 load and save functions. h5py 2.0.0 - Read and write HDF5 files from Python Any thoughts on the relative merits of these? ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] fast numpy i/o
On Tue, Jun 21, 2011 at 12:49, Neal Becker ndbeck...@gmail.com wrote: I'm wondering what are good choices for fast numpy array serialization? mmap: fast, but I guess not self-describing? hdf5: ? pickle: self-describing, but maybe not fast? others? NPY: http://docs.scipy.org/doc/numpy/reference/generated/numpy.save.html http://docs.scipy.org/doc/numpy/reference/generated/numpy.load.html (Note the mmap_mode argument) https://raw.github.com/numpy/numpy/master/doc/neps/npy-format.txt -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
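Robert's NPY suggestion in a nutshell: np.save/np.load round-trip dtype and shape automatically, and the mmap_mode argument he notes gives lazy, memory-mapped access. A minimal sketch (the file path is arbitrary):

```python
import os
import tempfile

import numpy as np

arr = np.arange(10, dtype=np.float64).reshape(2, 5)

# NPY is self-describing: dtype and shape round-trip automatically.
path = os.path.join(tempfile.mkdtemp(), 'arr.npy')
np.save(path, arr)

loaded = np.load(path)                  # regular in-memory load
mapped = np.load(path, mmap_mode='r')   # memory-mapped, lazy read

print(loaded.dtype, mapped.shape)
```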
Re: [Numpy-discussion] fast numpy i/o
Neal Becker wrote: I'm wondering what are good choices for fast numpy array serialization? mmap: fast, but I guess not self-describing? hdf5: ? Should be pretty fast, and self-describing -- advantage of being a standard. Disadvantage is that it requires an hdf5 library, which can be a pain to install on some systems. pickle: self-describing, but maybe not fast? others? There are .tofile() and .fromfile() -- should be about as fast as you can get, but not self-describing. Then there are .save(), .savez() and .load() (.npy/.npz formats). They should be pretty fast, and self-describing (but not a standard outside of numpy). I doubt pickle will ever be your best bet. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
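Chris's point that .tofile()/.fromfile() is fast but not self-describing is easy to demonstrate: the reader must already know the dtype and shape out of band, and a wrong dtype silently misinterprets the bytes. A small sketch using a temporary file:

```python
import os
import tempfile

import numpy as np

arr = np.arange(6, dtype=np.int32).reshape(2, 3)
path = os.path.join(tempfile.mkdtemp(), 'raw.bin')

# tofile() writes raw bytes only: no dtype, no shape stored.
arr.tofile(path)

# Reading back requires out-of-band knowledge of the metadata.
back = np.fromfile(path, dtype=np.int32).reshape(2, 3)

# Reading with the wrong dtype silently reinterprets the bytes:
# 6 int32 values (24 bytes) become 12 int16 values.
wrong = np.fromfile(path, dtype=np.int16)
print(back.shape, wrong.size)
```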
Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs
On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Jun 20, 2011 at 12:32 PM, Mark Wiebe mwwi...@gmail.com wrote: NumPy has a mechanism built in to allow subclasses to adjust or override aspects of the ufunc behavior. While this goal is important, this mechanism only allows for very limited customization, making for instance the masked arrays unable to work with the native ufuncs in a full and proper way. I would like to deprecate the current mechanism, in particular __array_prepare__ and __array_wrap__, and introduce a new method I will describe below. If you've ever used these mechanisms, please review this design to see if it meets your needs. The current approach is at a dead end, so something better needs to be done. Any class type which would like to override its behavior in ufuncs would define a method called _numpy_ufunc_, and optionally an attribute __array_priority__ as can already be done. The class which wins the priority battle gets its _numpy_ufunc_ function called as follows: return arr._numpy_ufunc_(current_ufunc, *args, **kwargs) To support this overloading, the ufunc would get a new support method, result_type, and there would be a new global function, broadcast_empty_like. The function ufunc.result_type behaves like the global np.result_type, but produces the output type or a tuple of output types specific to the ufunc, which may follow a different convention than regular arithmetic type promotion. This allows for a class to create an output array of the correct type to pass to the ufunc if it needs to be different than the default. The function broadcast_empty_like is just like empty_like, but takes a list or tuple of arrays which are to be broadcast together for producing the output, instead of just one. How does the ufunc get called so it doesn't get caught in an endless loop? I like the proposed method if it can also be used for classes that don't subclass ndarray. 
Masked array, for instance, should probably not subclass ndarray. The function being called needs to ensure this, either by extracting a raw ndarray from instances of its class, or adding a 'subok = False' parameter to the kwargs. Supporting objects that aren't ndarray subclasses is one of the purposes for this approach, and neither of my two example cases subclassed ndarray. -Mark

Thanks, Mark

A simple class which overrides the ufuncs might look as follows:

    def sin(ufunc, *args, **kwargs):
        # Convert degrees to radians
        args = (np.deg2rad(args[0]),) + args[1:]
        # Return a regular array, since the result is not in degrees
        return ufunc(*args, **kwargs)

    class MyDegreesClass:
        """Array-like object with a degrees unit"""

        def __init__(self, arr):
            self.arr = arr

        def _numpy_ufunc_(self, ufunc, *args, **kwargs):
            override = globals().get(ufunc.__name__)
            if override:
                return override(ufunc, *args, **kwargs)
            else:
                raise TypeError('ufunc %s incompatible with MyDegreesClass'
                                % ufunc.__name__)

A more complex example will be something like this:

    def my_general_ufunc(ufunc, *args, **kwargs):
        # Extract the 'out' argument. This only supports ufuncs with
        # one output, currently.
        out = kwargs.get('out')
        if len(args) > ufunc.nin:
            if out is None:
                out = args[ufunc.nin]
            else:
                raise ValueError("'out' given as both a positional "
                                 "and keyword argument")
        # Just want the inputs from here on
        args = args[:ufunc.nin]
        # Strip out MyArrayClass, but allow operations with regular ndarrays
        raw_in = []
        for a in args:
            if isinstance(a, MyArrayClass):
                raw_in.append(a.arr)
            else:
                raw_in.append(a)
        # Allocate the output array
        if out is not None:
            if not isinstance(out, MyArrayClass):
                raise TypeError("'out' must have type MyArrayClass")
        else:
            # Create the output array, obeying the 'order' parameter,
            # but disallowing subclasses
            out = np.broadcast_empty_like(args,
                                          order=kwargs.get('order'),
                                          dtype=ufunc.result_type(args),
                                          subok=False)
        # Override the output argument
        kwargs['out'] = out.arr
        # Call the ufunc on the unwrapped inputs
        ufunc(*raw_in, **kwargs)
        # Return the output
        return out

    class MyArrayClass:
        def __init__(self, arr):
            self.arr = arr

        def _numpy_ufunc_(self, ufunc, *args, **kwargs):
            override = globals().get(ufunc.__name__)
            if override:
                return override(ufunc, *args, **kwargs)
            else:
                return my_general_ufunc(ufunc, *args, **kwargs)

Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___
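The dispatch rule in the proposal ("the class which wins the priority battle gets its _numpy_ufunc_ function called") can be mimicked today in pure Python. The apply_ufunc helper below is a hypothetical stand-in for what the ufunc machinery itself would do, not NumPy API; the Logged class is an invented example of an array-like wrapper that is not an ndarray subclass:

```python
import numpy as np

def apply_ufunc(ufunc, *args, **kwargs):
    # Hypothetical dispatcher: hand the call to the argument with the
    # highest __array_priority__ that defines _numpy_ufunc_.
    best, best_prio = None, None
    for a in args:
        if hasattr(a, '_numpy_ufunc_'):
            prio = getattr(a, '__array_priority__', 0.0)
            if best is None or prio > best_prio:
                best, best_prio = a, prio
    if best is not None:
        return best._numpy_ufunc_(ufunc, *args, **kwargs)
    # No override found: call the ufunc directly.
    return ufunc(*args, **kwargs)

class Logged:
    """Toy array-like wrapper; deliberately NOT an ndarray subclass."""
    __array_priority__ = 10.0

    def __init__(self, arr):
        self.arr = np.asarray(arr)

    def _numpy_ufunc_(self, ufunc, *args, **kwargs):
        # Unwrap the inputs, call the real ufunc, re-wrap the result.
        # This avoids the endless-loop problem Chuck raises: the real
        # ufunc only ever sees plain ndarrays.
        raw = [a.arr if isinstance(a, Logged) else a for a in args]
        return Logged(ufunc(*raw, **kwargs))

r = apply_ufunc(np.add, Logged([1, 2]), 3)
print(r.arr)  # [4 5]
```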
Re: [Numpy-discussion] what python module to handle csv?
I am a huge fan of rec2csv and csv2rec, which might not technically be part of numpy, and can be found in pylab or the matplotlib.mlab module. --Abie From: numpy-discussion-boun...@scipy.org [mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Olivier Delalleau Sent: Wednesday, June 15, 2011 9:26 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] what python module to handle csv? Using savetxt with delimiter=',' should do the trick. If you want a more advanced csv interface to e.g. save more than a numpy array into a single csv, you can probably look into the python csv module. -=- Olivier 2011/6/15 Chao YUE chaoyue...@gmail.com Dear all pythoners, what python module do you use to handle csv files (for reading we can use numpy.genfromtxt)? Is there any way we can handle csv files as conveniently as in R? Can numpy.genfromtxt be used for writing? (I didn't try this yet because on our server we have only numpy 1.0.1...). This really makes me struggle, since csv is a very important interface file (I think so...). Thanks a lot, Sincerely, Chao -- *** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 77 30; Fax: 01.69.08.77.16 ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
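Olivier's suggestion in code form: a round trip through savetxt with a comma delimiter and genfromtxt. A minimal sketch using a temporary file:

```python
import os
import tempfile

import numpy as np

a = np.array([[1.5, 2.5], [3.5, 4.5]])
path = os.path.join(tempfile.mkdtemp(), 'data.csv')

# Writing: savetxt with a comma delimiter produces a plain csv file.
np.savetxt(path, a, delimiter=',', fmt='%.3f')

# Reading: genfromtxt handles the same delimiter.
b = np.genfromtxt(path, delimiter=',')
print(b)
```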
Re: [Numpy-discussion] fast numpy i/o
Neal Becker wrote: I'm wondering what are good choices for fast numpy array serialization? mmap: fast, but I guess not self-describing? hdf5: ? pickle: self-describing, but maybe not fast? others? I think, in addition, that hdf5 is the only one that easily interoperates with matlab? netcdf is another option if you want an open standard supported by Matlab and the like. I like the netCDF4 package: http://code.google.com/p/netcdf4-python/ Though it can be kind of slow to write (I have no idea why!) plain old tofile() should be readable by other tools as well, as long as you have a way to specify what is in the file. speaking of hdf5, I see: pyhdf5io 0.7 - Python module containing high-level hdf5 load and save functions. h5py 2.0.0 - Read and write HDF5 files from Python There is also pytables, which uses HDF5 under the hood. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] fast numpy i/o
On 21.06.2011, at 7:58PM, Neal Becker wrote: I think, in addition, that hdf5 is the only one that easily interoperates with matlab? speaking of hdf5, I see: pyhdf5io 0.7 - Python module containing high-level hdf5 load and save functions. h5py 2.0.0 - Read and write HDF5 files from Python Any thoughts on the relative merits of these? In my experience, HDF5 access usually approaches disk access speed, and random access to sub-datasets should be significantly faster than reading in the entire file, though I have not been able to test this. I have not heard about pyhdf5io (how does it work together with numpy?) - as alternative to h5py I'd rather recommend pytables, though I prefer the former for its cleaner/simpler interface (but that probably depends on your programming habits). HTH, Derek ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs
On Tue, Jun 21, 2011 at 11:57 AM, Mark Wiebe mwwi...@gmail.com wrote: On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Jun 20, 2011 at 12:32 PM, Mark Wiebe mwwi...@gmail.com wrote: NumPy has a mechanism built in to allow subclasses to adjust or override aspects of the ufunc behavior. While this goal is important, this mechanism only allows for very limited customization, making for instance the masked arrays unable to work with the native ufuncs in a full and proper way. I would like to deprecate the current mechanism, in particular __array_prepare__ and __array_wrap__, and introduce a new method I will describe below. If you've ever used these mechanisms, please review this design to see if it meets your needs. The current approach is at a dead end, so something better needs to be done. Any class type which would like to override its behavior in ufuncs would define a method called _numpy_ufunc_, and optionally an attribute __array_priority__ as can already be done. The class which wins the priority battle gets its _numpy_ufunc_ function called as follows: return arr._numpy_ufunc_(current_ufunc, *args, **kwargs) To support this overloading, the ufunc would get a new support method, result_type, and there would be a new global function, broadcast_empty_like. The function ufunc.empty_like behaves like the global np.result_type, but produces the output type or a tuple of output types specific to the ufunc, which may follow a different convention than regular arithmetic type promotion. This allows for a class to create an output array of the correct type to pass to the ufunc if it needs to be different than the default. The function broadcast_empty_like is just like empty_like, but takes a list or tuple of arrays which are to be broadcast together for producing the output, instead of just one. How does the ufunc get called so it doesn't get caught in an endless loop? 
I like the proposed method if it can also be used for classes that don't subclass ndarray. Masked array, for instance, should probably not subclass ndarray. The function being called needs to ensure this, either by extracting a raw ndarray from instances of its class, or adding a 'subok = False' parameter to the kwargs. Supporting objects that aren't ndarray subclasses is one of the purposes for this approach, and neither of my two example cases subclassed ndarray. Sounds good. Many of the current uses of __array_wrap__ that I am aware of are in the wrappers in the linalg module and don't go through the ufunc machinery. How would that be handled? snip Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] fast numpy i/o
Hi, I have been using h5py a lot (both on windows and Mac OSX) and can only recommend it- haven't tried the other options though Cheers, Simon On Tue, Jun 21, 2011 at 8:24 PM, Derek Homeier de...@astro.physik.uni-goettingen.de wrote: On 21.06.2011, at 7:58PM, Neal Becker wrote: I think, in addition, that hdf5 is the only one that easily interoperates with matlab? speaking of hdf5, I see: pyhdf5io 0.7 - Python module containing high-level hdf5 load and save functions. h5py 2.0.0 - Read and write HDF5 files from Python Any thoughts on the relative merits of these? In my experience, HDF5 access usually approaches disk access speed, and random access to sub-datasets should be significantly faster than reading in the entire file, though I have not been able to test this. I have not heard about pyhdf5io (how does it work together with numpy?) - as alternative to h5py I'd rather recommend pytables, though I prefer the former for its cleaner/simpler interface (but that probably depends on your programming habits). HTH, Derek ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] fast numpy i/o
Robert Kern wrote: https://raw.github.com/numpy/numpy/master/doc/neps/npy-format.txt Just a note. From that doc: HDF5 is a complicated format that more or less implements a hierarchical filesystem-in-a-file. This fact makes satisfying some of the Requirements difficult. To the author's knowledge, as of this writing, there is no application or library that reads or writes even a subset of HDF5 files that does not use the canonical libhdf5 implementation. I'm pretty sure that the NetcdfJava libs, developed by Unidata, use their own home-grown code. netcdf4 is built on HDF5, so that qualifies as a library that reads or writes a subset of HDF5 files. Perhaps there are lessons to be learned there. (too bad it's Java) Furthermore, by providing the first non-libhdf5 implementation of HDF5, we would be able to encourage more adoption of simple HDF5 in applications where it was previously infeasible because of the size of the library. I suppose this point is still true -- a C lib that supported a subset of hdf would be nice. That being said, I like the simplicity of the .npy format, and I don't know that anyone wants to take any of this on anyway. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR(206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
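The simplicity Chris likes about the .npy format is visible in the bytes themselves: per the NPY format spec, the file starts with a fixed magic string, two version bytes, and then a short plain-text header describing dtype and shape. A small sketch inspecting an in-memory save:

```python
import io

import numpy as np

buf = io.BytesIO()
np.save(buf, np.arange(3))
buf.seek(0)

magic = buf.read(6)    # b'\x93NUMPY' per the NPY format spec
version = buf.read(2)  # (major, minor) format version bytes
print(magic, version[0])
```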
Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs
On Tue, Jun 21, 2011 at 2:28 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Jun 21, 2011 at 11:57 AM, Mark Wiebe mwwi...@gmail.com wrote: On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Jun 20, 2011 at 12:32 PM, Mark Wiebe mwwi...@gmail.com wrote: NumPy has a mechanism built in to allow subclasses to adjust or override aspects of the ufunc behavior. While this goal is important, this mechanism only allows for very limited customization, making for instance the masked arrays unable to work with the native ufuncs in a full and proper way. I would like to deprecate the current mechanism, in particular __array_prepare__ and __array_wrap__, and introduce a new method I will describe below. If you've ever used these mechanisms, please review this design to see if it meets your needs. The current approach is at a dead end, so something better needs to be done. Any class type which would like to override its behavior in ufuncs would define a method called _numpy_ufunc_, and optionally an attribute __array_priority__ as can already be done. The class which wins the priority battle gets its _numpy_ufunc_ function called as follows: return arr._numpy_ufunc_(current_ufunc, *args, **kwargs) To support this overloading, the ufunc would get a new support method, result_type, and there would be a new global function, broadcast_empty_like. The function ufunc.empty_like behaves like the global np.result_type, but produces the output type or a tuple of output types specific to the ufunc, which may follow a different convention than regular arithmetic type promotion. This allows for a class to create an output array of the correct type to pass to the ufunc if it needs to be different than the default. The function broadcast_empty_like is just like empty_like, but takes a list or tuple of arrays which are to be broadcast together for producing the output, instead of just one. 
How does the ufunc get called so it doesn't get caught in an endless loop? I like the proposed method if it can also be used for classes that don't subclass ndarray. Masked array, for instance, should probably not subclass ndarray. The function being called needs to ensure this, either by extracting a raw ndarray from instances of its class, or adding a 'subok = False' parameter to the kwargs. Supporting objects that aren't ndarray subclasses is one of the purposes for this approach, and neither of my two example cases subclassed ndarray. Sounds good. Many of the current uses of __array_wrap__ that I am aware of are in the wrappers in the linalg module and don't go through the ufunc machinery. How would that be handled? I contributed the __array_prepare__ method a while back so classes could raise errors before the array data is modified in place. Specifically, I was concerned about units support in my quantities package (http://pypi.python.org/pypi/quantities). But I agree that this approach needs to be reconsidered. It would be nice for subclasses to have an opportunity to intercept and process the values passed to a ufunc on their way in. For example, it would be nice if when I did np.cos(1.5 degrees), my subclass could intercept the value and pass a new one on to the ufunc machinery that is expressed in radians. I thought PJ Eby's generic functions PEP would be a really good way to handle ufuncs, but the PEP has stagnated. Darren ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs
On Tue, Jun 21, 2011 at 12:46 PM, Darren Dale dsdal...@gmail.com wrote: On Tue, Jun 21, 2011 at 2:28 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Jun 21, 2011 at 11:57 AM, Mark Wiebe mwwi...@gmail.com wrote: On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Jun 20, 2011 at 12:32 PM, Mark Wiebe mwwi...@gmail.com wrote: NumPy has a mechanism built in to allow subclasses to adjust or override aspects of the ufunc behavior. While this goal is important, this mechanism only allows for very limited customization, making for instance the masked arrays unable to work with the native ufuncs in a full and proper way. I would like to deprecate the current mechanism, in particular __array_prepare__ and __array_wrap__, and introduce a new method I will describe below. If you've ever used these mechanisms, please review this design to see if it meets your needs. The current approach is at a dead end, so something better needs to be done. Any class type which would like to override its behavior in ufuncs would define a method called _numpy_ufunc_, and optionally an attribute __array_priority__ as can already be done. The class which wins the priority battle gets its _numpy_ufunc_ function called as follows: return arr._numpy_ufunc_(current_ufunc, *args, **kwargs) To support this overloading, the ufunc would get a new support method, result_type, and there would be a new global function, broadcast_empty_like. The function ufunc.empty_like behaves like the global np.result_type, but produces the output type or a tuple of output types specific to the ufunc, which may follow a different convention than regular arithmetic type promotion. This allows for a class to create an output array of the correct type to pass to the ufunc if it needs to be different than the default. 
The function broadcast_empty_like is just like empty_like, but takes a list or tuple of arrays which are to be broadcast together for producing the output, instead of just one. How does the ufunc get called so it doesn't get caught in an endless loop? I like the proposed method if it can also be used for classes that don't subclass ndarray. Masked array, for instance, should probably not subclass ndarray. The function being called needs to ensure this, either by extracting a raw ndarray from instances of its class, or adding a 'subok = False' parameter to the kwargs. Supporting objects that aren't ndarray subclasses is one of the purposes for this approach, and neither of my two example cases subclassed ndarray. Sounds good. Many of the current uses of __array_wrap__ that I am aware of are in the wrappers in the linalg module and don't go through the ufunc machinery. How would that be handled? I contributed the __array_prepare__ method a while back so classes could raise errors before the array data is modified in place. Specifically, I was concerned about units support in my quantities package (http://pypi.python.org/pypi/quantities). But I agree that this approach needs to be reconsidered. It would be nice for subclasses to have an opportunity to intercept and process the values passed to a ufunc on their way in. For example, it would be nice if when I did np.cos(1.5 degrees), my subclass could intercept the value and pass a new one on to the ufunc machinery that is expressed in radians. I thought PJ Eby's generic functions PEP would be a really good way to handle ufuncs. Link to PEP-3124: http://tinyurl.com/3brnk6. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] (cumsum, broadcast) in (numexpr, weave)
Hi All, is there a fast way to do cumsum with numexpr? I could not find it, but the functions available in numexpr do not seem to be exhaustively documented, so it is possible that I missed it. I do not know if 'sum' takes special arguments that can be used. To try another track, do numexpr operators have something like the 'out' parameter for ufuncs? If so, one could perhaps use add(a[0:-1], a[1,:], out=a[1,:]) provided it is possible to preserve the sequential semantics. Another option is to use weave, which does have cumsum. However my code requires expressions which implement broadcast. That leads to my next question: does repeat or concat return a copy or a view? If they avoid copying, I could perhaps use repeat to simulate efficient broadcasting. Or will it make a copy of that array anyway? I would ideally like to use numexpr because I make heavy use of transcendental functions and was hoping to exploit the VML library. Thanks for the help -- srean ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
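srean's idea of building a cumulative sum from repeated in-place adds does work with NumPy's 'out' parameter, as long as the adds are applied sequentially, one row at a time (a sketch of the trick in plain NumPy, outside numexpr):

```python
import numpy as np

a = np.arange(12, dtype=float).reshape(4, 3)
b = a.copy()

# Sequentially add each row into the next one in place; after the
# loop, row i holds the cumulative sum of rows 0..i along axis 0.
for i in range(1, b.shape[0]):
    np.add(b[i - 1], b[i], out=b[i])

print(np.allclose(b, np.cumsum(a, axis=0)))  # True
```

A single vectorized add over overlapping slices would not be equivalent, because each step must see the already-updated previous row; that is the "sequential semantics" caveat in the question.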
Re: [Numpy-discussion] (cumsum, broadcast) in (numexpr, weave)
Apologies, intended to send this to the scipy list. On Tue, Jun 21, 2011 at 2:35 PM, srean srean.l...@gmail.com wrote: Hi All, is there a fast way to do cumsum with numexpr? I could not find it, but the functions available in numexpr do not seem to be exhaustively documented, so it is possible that I missed it. I do not know if 'sum' takes special arguments that can be used. To try another track, do numexpr operators have something like the 'out' parameter for ufuncs? If so, one could perhaps use add(a[0:-1], a[1,:], out=a[1,:]) provided it is possible to preserve the sequential semantics. Another option is to use weave, which does have cumsum. However my code requires expressions which implement broadcast. That leads to my next question: does repeat or concat return a copy or a view? If they avoid copying, I could perhaps use repeat to simulate efficient broadcasting. Or will it make a copy of that array anyway? I would ideally like to use numexpr because I make heavy use of transcendental functions and was hoping to exploit the VML library. Thanks for the help -- srean ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ANN: Numpy 1.6.1 release candidate 1
On Tue, Jun 21, 2011 at 4:38 PM, Bruce Southey bsout...@gmail.com wrote: On 06/21/2011 01:01 AM, Ralf Gommers wrote: On Tue, Jun 21, 2011 at 3:55 AM, Bruce Southey bsout...@gmail.com wrote: On Mon, Jun 20, 2011 at 2:43 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote: On Mon, Jun 20, 2011 at 8:50 PM, Bruce Southey bsout...@gmail.com wrote: I copied the files but that just moves the problem. So that patch is incorrect. I get the same errors on Fedora 15 supplied Python3.2 for numpy 1.6.0 and using git from 'https://github.com/rgommers/numpy.git'. Numpy is getting Fedora supplied Atlas (1.5.1 does not). It appears that there is a misunderstanding of the PEP because 'SO' and 'SOABI' do exactly what the PEP says on my systems: It doesn't on OS X. But that's not even the issue. As I explained before, the issue is that get_config_var('SO') is used to determine the extension of system libraries (such as liblapack.so) and python-related ones (such as multiarray.cpython-32m.so). And the current functions don't do mindreading.

    >>> from distutils import sysconfig
    >>> sysconfig.get_config_var('SO')
    '.cpython-32m.so'
    >>> sysconfig.get_config_var('SOABI')
    'cpython-32m'

Consequently, the name, 'multiarray.pyd', created within numpy is invalid. I removed the line in ctypeslib that was trying this, so I think you are not testing my patch. Ralf Looking at the code, I see this line, which makes no sense given that the second part is true under Linux: if (not is_python_ext) and 'SOABI' in distutils.sysconfig.get_config_vars(): So I think the 'get_shared_lib_extension' function is wrong and probably unneeded. 
Bruce Just to show that this is the new version, I added two print statements in the 'get_shared_lib_extension' function:

    >>> from numpy.distutils.misc_util import get_shared_lib_extension
    >>> get_shared_lib_extension(True)
    first so_ext .cpython-32mu.so
    returned so_ext .cpython-32mu.so
    '.cpython-32mu.so'
    >>> get_shared_lib_extension(False)
    first so_ext .cpython-32mu.so
    returned so_ext .so
    '.so'

This all looks correct. Before you were saying you were still getting 'multiarray.pyd', now your error says 'multiarray.so'. So now you are testing the right thing. Test test_basic2() in test_ctypeslib was fixed, but I forgot to fix it in two other places. I updated both my branches on github, please try again. The reason for the same location is obvious because all the patch does is move the code to get the extension into that function. So the 'get_shared_lib_extension' function returns the extension '.so' to the load_library function. However that name is wrong under Linux as it has to be 'multiarray.cpython-32mu.so' and hence the error in the same location. I did come across this thread 'http://bugs.python.org/issue10262' which indicates why Linux is different by default. So what is the actual name of the multiarray shared library with the Mac? If it is 'multiarray.so' then the correct name is libname + sysconfig.get_config_var('SO') as I previously indicated. It is, and yes that's correct. Orthogonal to the actual issue though. Ralf While the tests now pass, you have now changed an API for load_library. Only in a backwards-compatible way, which should be fine. I added a keyword, the default of which does the same as before. The only thing I did other than that was remove tries with clearly invalid extensions, like .pyd on Linux. Now that I'm writing that though, I think it's better to try both .so and .cpython-32mu.so by default for python >= 3.2. This is not something that is meant to occur in a bug-fix release as well as the new argument is undocumented. 
But I do not understand the need for this extra complexity when libname + sysconfig.get_config_var('SO') works on Linux, Windows and Mac. I've tried to explain this twice already. You have both multiarray.cpython-32mu.so and liblapack.so (or some other library like that) on your system. The extension of both is needed, and always obtained via get_config_var('SO'). See the problem? If someone knows a better way to do this, I'm all for it. But I don't see a simpler way. Cheers, Ralf ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
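The distinction Ralf is drawing, the suffix of compiled Python extension modules versus the short suffix of plain system shared libraries, can be inspected directly. Note that on modern Pythons the relevant config variable is named EXT_SUFFIX; the 'SO' name used throughout this thread was later deprecated and removed. A small sketch:

```python
import sysconfig

# Suffix for compiled Python extension modules, e.g.
# '.cpython-312-x86_64-linux-gnu.so' on Linux ('.so' alone on old
# Python 2.x, which is what made the 'SO' variable ambiguous).
ext_suffix = sysconfig.get_config_var('EXT_SUFFIX')

# A plain system shared library (like liblapack) just uses the short
# platform suffix: '.so' on Linux, '.dylib' on macOS, '.dll' on
# Windows - which is why one config variable cannot serve both uses.
print(ext_suffix)
```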
Re: [Numpy-discussion] ANN: Numpy 1.6.1 release candidate 1
On Tue, Jun 21, 2011 at 10:05 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote: On Tue, Jun 21, 2011 at 4:38 PM, Bruce Southey bsout...@gmail.com wrote: On 06/21/2011 01:01 AM, Ralf Gommers wrote: On Tue, Jun 21, 2011 at 3:55 AM, Bruce Southey bsout...@gmail.com wrote: So what is the actual name of the multiarray shared library with the Mac? If it is 'multiarray.so' then the correct name is libname + sysconfig.get_config_var('SO') as I previously indicated. It is, and yes that's correct. Orthogonal to the actual issue though. Ralf While the tests now pass, you have now changed an API for load_library. Only in a backwards-compatible way, which should be fine. I added a keyword, the default of which does the same as before. The only thing I did other than that was remove tries with clearly invalid extensions, like .pyd on Linux. Now that I'm writing that though, I think it's better to try both .so and .cpython-32mu.so by default for python >= 3.2. This should try both extensions: https://github.com/rgommers/numpy/tree/sharedlibext Does that look better? Ralf This is not something that is meant to occur in a bug-fix release as well as the new argument is undocumented. But I do not understand the need for this extra complexity when libname + sysconfig.get_config_var('SO') works on Linux, Windows and Mac. I've tried to explain this twice already. You have both multiarray.cpython-32mu.so and liblapack.so (or some other library like that) on your system. The extension of both is needed, and always obtained via get_config_var('SO'). See the problem? If someone knows a better way to do this, I'm all for it. But I don't see a simpler way. Cheers, Ralf ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs
On Tue, Jun 21, 2011 at 1:28 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Jun 21, 2011 at 11:57 AM, Mark Wiebe mwwi...@gmail.com wrote: On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Jun 20, 2011 at 12:32 PM, Mark Wiebe mwwi...@gmail.com wrote: NumPy has a mechanism built in to allow subclasses to adjust or override aspects of the ufunc behavior. While this goal is important, this mechanism only allows for very limited customization, making for instance the masked arrays unable to work with the native ufuncs in a full and proper way. I would like to deprecate the current mechanism, in particular __array_prepare__ and __array_wrap__, and introduce a new method I will describe below. If you've ever used these mechanisms, please review this design to see if it meets your needs. The current approach is at a dead end, so something better needs to be done. Any class type which would like to override its behavior in ufuncs would define a method called _numpy_ufunc_, and optionally an attribute __array_priority__ as can already be done. The class which wins the priority battle gets its _numpy_ufunc_ function called as follows: return arr._numpy_ufunc_(current_ufunc, *args, **kwargs) To support this overloading, the ufunc would get a new support method, result_type, and there would be a new global function, broadcast_empty_like. The function ufunc.result_type behaves like the global np.result_type, but produces the output type or a tuple of output types specific to the ufunc, which may follow a different convention than regular arithmetic type promotion. This allows for a class to create an output array of the correct type to pass to the ufunc if it needs to be different than the default. The function broadcast_empty_like is just like empty_like, but takes a list or tuple of arrays which are to be broadcast together for producing the output, instead of just one.
How does the ufunc get called so it doesn't get caught in an endless loop? I like the proposed method if it can also be used for classes that don't subclass ndarray. Masked array, for instance, should probably not subclass ndarray. The function being called needs to ensure this, either by extracting a raw ndarray from instances of its class, or adding a 'subok = False' parameter to the kwargs. Supporting objects that aren't ndarray subclasses is one of the purposes for this approach, and neither of my two example cases subclassed ndarray. Sounds good. Many of the current uses of __array_wrap__ that I am aware of are in the wrappers in the linalg module and don't go through the ufunc machinery. How would that be handled? Those could stay as they are, and just the ufunc usage of __array_wrap__ can be deprecated. For classes which currently use __array_wrap__, they would just need to also implement _numpy_ufunc_ to eliminate any deprecation messages. -Mark snip Chuck
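The shape of the proposed protocol can be sketched like this. Note that _numpy_ufunc_ is a proposal, not a released NumPy API, so apply_ufunc below is a hypothetical stand-in for the dispatch NumPy itself would perform; the Degrees class is purely illustrative:

```python
import numpy as np

class Degrees:
    """Toy non-ndarray-subclass wrapper: values are stored in degrees."""
    def __init__(self, values):
        self.values = np.asarray(values, dtype=float)

    def _numpy_ufunc_(self, ufunc, *args, **kwargs):
        # Extract raw ndarrays (converting degrees to radians) before
        # re-calling the ufunc, so dispatch does not recurse endlessly.
        raw = [np.deg2rad(a.values) if isinstance(a, Degrees) else a
               for a in args]
        return ufunc(*raw, **kwargs)

def apply_ufunc(ufunc, *args, **kwargs):
    # Stand-in for NumPy's dispatch: the argument that wins the priority
    # battle would get its _numpy_ufunc_ hook called.
    for a in args:
        if hasattr(a, '_numpy_ufunc_'):
            return a._numpy_ufunc_(ufunc, *args, **kwargs)
    return ufunc(*args, **kwargs)

result = apply_ufunc(np.cos, Degrees([0.0, 60.0, 90.0]))
```

Because Degrees does not subclass ndarray, this also illustrates the point about supporting objects outside the ndarray hierarchy.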
Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs
On Tue, Jun 21, 2011 at 1:46 PM, Darren Dale dsdal...@gmail.com wrote: On Tue, Jun 21, 2011 at 2:28 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Jun 21, 2011 at 11:57 AM, Mark Wiebe mwwi...@gmail.com wrote: On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Jun 20, 2011 at 12:32 PM, Mark Wiebe mwwi...@gmail.com wrote: NumPy has a mechanism built in to allow subclasses to adjust or override aspects of the ufunc behavior. While this goal is important, this mechanism only allows for very limited customization, making for instance the masked arrays unable to work with the native ufuncs in a full and proper way. I would like to deprecate the current mechanism, in particular __array_prepare__ and __array_wrap__, and introduce a new method I will describe below. If you've ever used these mechanisms, please review this design to see if it meets your needs. The current approach is at a dead end, so something better needs to be done. Any class type which would like to override its behavior in ufuncs would define a method called _numpy_ufunc_, and optionally an attribute __array_priority__ as can already be done. The class which wins the priority battle gets its _numpy_ufunc_ function called as follows: return arr._numpy_ufunc_(current_ufunc, *args, **kwargs) To support this overloading, the ufunc would get a new support method, result_type, and there would be a new global function, broadcast_empty_like. The function ufunc.result_type behaves like the global np.result_type, but produces the output type or a tuple of output types specific to the ufunc, which may follow a different convention than regular arithmetic type promotion. This allows for a class to create an output array of the correct type to pass to the ufunc if it needs to be different than the default.
The function broadcast_empty_like is just like empty_like, but takes a list or tuple of arrays which are to be broadcast together for producing the output, instead of just one. How does the ufunc get called so it doesn't get caught in an endless loop? I like the proposed method if it can also be used for classes that don't subclass ndarray. Masked array, for instance, should probably not subclass ndarray. The function being called needs to ensure this, either by extracting a raw ndarray from instances of its class, or adding a 'subok = False' parameter to the kwargs. Supporting objects that aren't ndarray subclasses is one of the purposes for this approach, and neither of my two example cases subclassed ndarray. Sounds good. Many of the current uses of __array_wrap__ that I am aware of are in the wrappers in the linalg module and don't go through the ufunc machinery. How would that be handled? I contributed the __array_prepare__ method a while back so classes could raise errors before the array data is modified in place. Specifically, I was concerned about units support in my quantities package (http://pypi.python.org/pypi/quantities). But I agree that this approach needs to be reconsidered. It would be nice for subclasses to have an opportunity to intercept and process the values passed to a ufunc on their way in. For example, it would be nice if when I did np.cos(1.5 degrees), my subclass could intercept the value and pass a new one on to the ufunc machinery that is expressed in radians. I thought PJ Eby's generic functions PEP would be a really good way to handle ufuncs, but the PEP has stagnated. I made one of my examples overriding sin with degrees, because I think this overloading method can work well for a physical quantities library.
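For contrast, here is a minimal sketch of the *current* mechanism being discussed for deprecation. The class name is illustrative; the key limitation is visible in the code: __array_wrap__ only sees the result, so an input-side conversion such as degrees to radians (Darren's example) cannot be expressed there:

```python
import numpy as np

class Wrapped(np.ndarray):
    """Minimal ndarray subclass using the existing __array_wrap__ hook."""
    def __new__(cls, data):
        return np.asarray(data, dtype=float).view(cls)

    def __array_wrap__(self, out_arr, context=None, return_scalar=False):
        # Called only *after* the ufunc has computed out_arr; all we can
        # do here is post-process, e.g. re-wrap the result in our class.
        return np.asarray(out_arr).view(Wrapped)

y = np.cos(Wrapped([0.0, np.pi]))  # inputs reach cos untouched
```

The proposed _numpy_ufunc_ hook, by contrast, runs before the ufunc and sees the original arguments, which is what the quantities use case needs.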
-Mark Darren ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] fast SSD
Is there a fast way to compute an array of sum-of-squared-differences between a (small) K x K array and all K x K sub-arrays of a larger array? (i.e. each element x,y in the output array is the SSD between the small array and the sub-array (x:x+K, y:y+K).) My current implementation loops over each sub-array and computes the SSD with something like ((A-B)**2).sum(). Cheers, Alex
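For reference, the looping implementation Alex describes might look like this (function name and argument order are assumptions, not from the post):

```python
import numpy as np

def ssd_map(A, B):
    """Naive sum-of-squared-differences map: B is the small K x K
    template, A the larger array; out[y, x] is the SSD between B and
    the sub-array A[y:y+K, x:x+K]."""
    K = B.shape[0]
    H = A.shape[0] - K + 1
    W = A.shape[1] - K + 1
    out = np.empty((H, W))
    for y in range(H):
        for x in range(W):
            out[y, x] = ((A[y:y+K, x:x+K] - B) ** 2).sum()
    return out
```

Each output element costs O(K^2), so the whole map is O(H * W * K^2) with Python-level loop overhead on top, which is what motivates the question.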
Re: [Numpy-discussion] fast SSD
On Tue, Jun 21, 2011 at 5:09 PM, Alex Flint alex.fl...@gmail.com wrote: Is there a fast way to compute an array of sum-of-squared-differences between a (small) K x K array and all K x K sub-arrays of a larger array? (i.e. each element x,y in the output array is the SSD between the small array and the sub-array (x:x+K, y:y+K).) My current implementation loops over each sub-array and computes the SSD with something like ((A-B)**2).sum(). I don't know of a clever way. But if ((A-B)**2).sum() is a sizable fraction of the time, then you could use bottleneck:

>>> a = np.random.rand(5,5)
>>> timeit (a**2).sum()
10 loops, best of 3: 4.63 us per loop
>>> import bottleneck as bn
>>> timeit bn.ss(a)
100 loops, best of 3: 1.77 us per loop
>>> func, b = bn.func.ss_selector(a, axis=None)
>>> func
<built-in function ss_2d_float64_axisNone>
>>> timeit func(b)
100 loops, best of 3: 830 ns per loop
Re: [Numpy-discussion] fast SSD
On Tue, Jun 21, 2011 at 7:09 PM, Alex Flint alex.fl...@gmail.com wrote: Is there a fast way to compute an array of sum-of-squared-differences between a (small) K x K array and all K x K sub-arrays of a larger array? (i.e. each element x,y in the output array is the SSD between the small array and the sub-array (x:x+K, y:y+K).) My current implementation loops over each sub-array and computes the SSD with something like ((A-B)**2).sum(). You can use stride tricks and broadcasting:

import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(24).reshape(4,6)
k = np.array([[1,2,0],[2,1,0],[0,1,1]])

# If `a` has shape (4,6), then `b` will have shape (2, 4, 3, 3).
# b[i,j] will be the 2-D sub-array of `a` with shape (3, 3).
b = as_strided(a,
               shape=(a.shape[0] - k.shape[0] + 1,
                      a.shape[1] - k.shape[1] + 1) + k.shape,
               strides=a.strides * 2)

ssd = ((b - k)**2).sum(-1).sum(-1)

print a
print k
print ssd

It's a neat trick, but be aware that the temporary result b - k will be nine times the size of a. If a is large, this might be unacceptable. Warren Cheers, Alex
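As a sanity check on the stride arithmetic (getting as_strided shapes or strides wrong silently reads the wrong memory), the broadcast result can be compared against a direct loop over every sub-array:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(24, dtype=float).reshape(4, 6)
k = np.array([[1., 2., 0.], [2., 1., 0.], [0., 1., 1.]])

# Broadcasting version from Warren's post: b is a (2, 4, 3, 3) view
# whose [i, j] entry is the 3x3 window of `a` at offset (i, j).
out_shape = (a.shape[0] - k.shape[0] + 1, a.shape[1] - k.shape[1] + 1)
b = as_strided(a, shape=out_shape + k.shape, strides=a.strides * 2)
ssd_fast = ((b - k) ** 2).sum(-1).sum(-1)

# Direct loop, as in Alex's original implementation.
ssd_slow = np.array([[((a[i:i+3, j:j+3] - k) ** 2).sum()
                      for j in range(out_shape[1])]
                     for i in range(out_shape[0])])
```

The two should agree exactly; only the memory behaviour (the 9x temporary from b - k) differs.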
Re: [Numpy-discussion] ANN: Numpy 1.6.1 release candidate 1
On Tue, Jun 21, 2011 at 3:52 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote: On Tue, Jun 21, 2011 at 10:05 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote: On Tue, Jun 21, 2011 at 4:38 PM, Bruce Southey bsout...@gmail.com wrote: On 06/21/2011 01:01 AM, Ralf Gommers wrote: On Tue, Jun 21, 2011 at 3:55 AM, Bruce Southey bsout...@gmail.com wrote: So what is the actual name of the multiarray shared library with the Mac? If it is 'multiarray.so' then the correct name is libname + sysconfig.get_config_var('SO') as I previously indicated. It is, and yes that's correct. Orthogonal to the actual issue though. Ralf While the tests now pass, you have now changed an API for load_library. Only in a backwards-compatible way, which should be fine. I added a keyword, the default of which does the same as before. The only thing I did other than that was remove tries with clearly invalid extensions, like .pyd on Linux. Now that I'm writing that though, I think it's better to try both .so and .cpython-32mu.so by default for Python >= 3.2. This should try both extensions: https://github.com/rgommers/numpy/tree/sharedlibext Does that look better? Ralf This is not something that should happen in a bug-fix release, and the new argument is undocumented. But I do not understand the need for this extra complexity when libname + sysconfig.get_config_var('SO') works on Linux, Windows and Mac. I've tried to explain this twice already. You have both multiarray.cpython-32mu.so and liblapack.so (or some other library like that) on your system. The extension of both is needed, and always obtained via get_config_var('SO'). See the problem? If someone knows a better way to do this, I'm all for it. But I don't see a simpler way.
Cheers, Ralf Okay, I see what you are getting at: there are various Python bindings/interfaces to C libraries (Fortran in SciPy) such as lapack and atlas created from non-Python sources, but there is also a 'shared library' created with Python as part of an extension. The PEP does not appear to differentiate between these - probably because it assumes non-Python libraries are loaded differently (ctypes is Python 2.5+). So there has to be a different way to find the extension. Bruce
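The two naming schemes Bruce distinguishes can be seen side by side with plain sysconfig and ctypes. Example names in the comments are illustrative ('m' is the C math library on Unix):

```python
import ctypes.util
import sysconfig

# PEP 3149: Python extension modules embed the ABI tag in their suffix
# (e.g. 'multiarray.cpython-32mu.so').  'SO' was the config var name at
# the time of this thread; newer Pythons expose it as 'EXT_SUFFIX'.
ext_suffix = (sysconfig.get_config_var('SO')
              or sysconfig.get_config_var('EXT_SUFFIX'))

# Ordinary shared libraries carry no ABI tag; the loader resolves them
# by their conventional platform name (e.g. 'libm.so.6' on Linux).
libm_name = ctypes.util.find_library('m')
```

ctypes.load_library in NumPy has to cope with both conventions, which is the complexity Ralf's patch addresses.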