Re: [Numpy-discussion] Another suggestion for making numpy's functions generic
On Sat, Oct 17, 2009 at 2:49 PM, Darren Dale dsdal...@gmail.com wrote:

> numpy's functions, especially ufuncs, have had some ability to support
> subclasses through the ndarray.__array_wrap__ method, which provides
> masked arrays or quantities (for example) with an opportunity to set the
> class and metadata of the output array at the end of an operation. An
> example:
>
>     q1 = Quantity(1, 'meter')
>     q2 = Quantity(2, 'meters')
>     numpy.add(q1, q2) # yields Quantity(3, 'meters')
>
> At SciPy2009 we committed a change to the numpy trunk that provides a
> chance to determine the class and some metadata of the output *before*
> the ufunc performs its calculation, but after the output array has been
> established (while its data is still uninitialized). Consider:
>
>     q1 = Quantity(1, 'meter')
>     q2 = Quantity(2, 'J')
>     numpy.add(q1, q2, q1)
>     # or equivalently:
>     # q1 += q2
>
> With only __array_wrap__, the attempt to propagate the units happens
> after q1's data has been updated in place, too late to raise an error;
> the data is now corrupted. __array_prepare__ solves that problem: an
> exception can be raised in time.
>
> Now I'd like to suggest one more improvement to numpy to make its
> functions more generic. Consider one more example:
>
>     q1 = Quantity(1, 'meter')
>     q2 = Quantity(2, 'feet')
>     numpy.add(q1, q2)
>
> In this case, I'd like an opportunity to operate on the input arrays on
> the way in to the ufunc, to rescale the second input to meters. I think
> it would be a hack to try to stuff this capability into
> __array_prepare__. One form of this particular example is already
> supported in quantities, q1 + q2, by overriding the __add__ method to
> rescale the second input, but there are ufuncs that do not have an
> associated special method. So I'd like to look into adding another check
> for a special method, perhaps called __input_prepare__.
>
> My time is really tight for the next month, so I'd rather not start if
> there are strong objections, but otherwise I'd like to try to get it in
> in time for numpy-1.4. (Has a timeline been established?)
>
> I think it will not be too difficult to document this overall scheme.
> When calling numpy functions:
>
> 1) __input_prepare__ provides an opportunity to operate on the inputs to
>    yield versions that are compatible with the operation (they should
>    obviously not be modified in place)
> 2) the output array is established
> 3) __array_prepare__ is used to determine the class of the output array,
>    as well as any metadata that needs to be established before the
>    operation proceeds
> 4) the ufunc performs its operations
> 5) __array_wrap__ provides an opportunity to update the output array
>    based on the results of the computation
>
> Comments, criticisms?
>
> If PEP 3124^ were already a part of the standard library, it could serve
> as the basis for generalizing numpy's functions. But I think the PEP
> will not be approved in its current form, and it is unclear when and if
> the author will revisit the proposal. The scheme I'm imagining might be
> sufficient for our purposes.
>
> Darren
>
> ^ http://www.python.org/dev/peps/pep-3124/

I'm all for generic (u)funcs, since they might come in handy for me: I'm
doing lots of operations on arrays of polynomials. I don't quite get the
reasoning, though. Could you correct me where I get it wrong?

* the class Quantity derives from numpy.ndarray
* Quantity overrides __add__, __mul__ etc., and you get the correct
  behaviour for

      q1 = Quantity(1, 'meter')
      q2 = Quantity(2, 'J')

  by raising an exception when performing q1 += q2
* The problem is that numpy.add(q1, q2, q1) would corrupt q1 before
  raising an exception

Sebastian

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
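[Editor's note] The parts of this scheme that already existed at the time can be sketched with a minimal ndarray subclass. This is not the actual quantities implementation, only an illustration of how __array_finalize__ and __array_wrap__ (step 5 above) let a subclass carry metadata, here a toy `units` string, through a ufunc call; `Quantity` and `units` are illustrative names:

```python
import numpy as np

class Quantity(np.ndarray):
    """Toy ndarray subclass carrying a units string through ufuncs."""

    def __new__(cls, data, units=''):
        obj = np.asarray(data, dtype=float).view(cls)
        obj.units = units
        return obj

    def __array_finalize__(self, obj):
        # called whenever a new Quantity comes into existence
        # (views, copies, ufunc outputs)
        self.units = getattr(obj, 'units', '')

    def __array_wrap__(self, out, *args, **kwargs):
        # step 5 of the scheme: fix up the output after the ufunc ran
        # (*args/**kwargs absorb the context across numpy versions)
        out = out.view(Quantity)
        out.units = self.units
        return out

q1 = Quantity([1.0], 'meter')
q2 = Quantity([2.0], 'meter')
q3 = np.add(q1, q2)  # units survive the ufunc call
```

A real implementation would compare the units of all inputs inside the hooks and raise when they are incompatible; this sketch only propagates them.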
[Numpy-discussion] fortran vs numpy on mac/linux - gcc performance?
Hi,

I have been looking at moving some of my bottleneck functions to fortran
with f2py. To get started I tried some simple things, and was surprised
they performed so much better than the numpy builtins, which I assumed
would be C and would be quite fast.

On my Macbook Pro laptop (Intel Core 2 Duo) I got the following results.
Numpy is built with xcode gcc 4.0.1 and gfortran is 4.2.3 (fortran code
for shuffle and bincount below):

    In [1]: x = np.random.random_integers(0,1023,100).astype(int)
    In [2]: import ftest
    In [3]: timeit np.bincount(x)
    100 loops, best of 3: 3.97 ms per loop
    In [4]: timeit ftest.bincount(x,1024)
    1000 loops, best of 3: 1.15 ms per loop
    In [5]: timeit np.random.shuffle(x)
    1 loops, best of 3: 605 ms per loop
    In [6]: timeit ftest.shuffle(x)
    10 loops, best of 3: 139 ms per loop

So fortran was about 4 times faster for these loops (similarly faster
than cython as well). I was really happy, as these are two of my biggest
bottlenecks, but when I moved to a linux workstation I got different
results. Here with gcc/gfortran 4.3.3:

    In [3]: x = np.random.random_integers(0,1023,100).astype(int)
    In [4]: timeit np.bincount(x)
    100 loops, best of 3: 8.18 ms per loop
    In [5]: timeit ftest.bincount(x,1024)
    100 loops, best of 3: 8.25 ms per loop
    In [7]: timeit np.random.shuffle(x)
    1 loops, best of 3: 379 ms per loop
    In [8]: timeit ftest.shuffle(x)
    10 loops, best of 3: 172 ms per loop

So shuffle is still a bit faster, but numpy's bincount is now the same
as fortran. The only thing I can think is that it is due to much better
performance of the more recent C compiler. I think this would also
explain why the f2py extension was performing so much better than cython
on the mac.

So my question is: is there a way to build numpy with a more recent
compiler on leopard? (I guess I could upgrade to snow leopard now.)
Could I make the numpy install use gcc-4.2 from xcode, or would it break
stuff? Could I use gcc 4.3.3 from macports?

It would be great to get a 4x speed-up on all numpy C loops! (Already,
just these two functions I use a lot would make a big difference.)

Cheers

Robin
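[Editor's note] For readers wanting to check what the two Fortran routines compute, here is a pure-numpy sketch of both. Variable names follow the Fortran code posted later in the thread; `np.random.default_rng` stands in for the Fortran RNG, and `np.bincount`'s `minlength` argument (added to numpy after this thread) pads the result to a fixed length:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 1024, size=100_000)

# bincount: count occurrences of each value in 0..1023
counts = np.zeros(1024, dtype=np.intp)
for v in x:
    counts[v] += 1
assert np.array_equal(counts, np.bincount(x, minlength=1024))

# Knuth shuffle, matching the Fortran loop (0-based indices here)
def knuth_shuffle(a, rng):
    s = a.copy()
    for i in range(len(s) - 1, 0, -1):
        j = int(rng.integers(0, i + 1))  # uniform on [0, i]
        s[i], s[j] = s[j], s[i]
    return s

shuffled = knuth_shuffle(x, rng)
# a shuffle is a permutation: same elements, different order
assert np.array_equal(np.sort(shuffled), np.sort(x))
```

The Python loops above are of course far slower than either the numpy builtins or the f2py versions; they only pin down the semantics being benchmarked.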
Re: [Numpy-discussion] fortran vs numpy on mac/linux - gcc performance?
Forgot to include the fortran code used:

jm-g26b101:fortran robince$ cat test.f95

    subroutine bincount (x,c,n,m)
        implicit none
        integer, intent(in) :: n,m
        integer, dimension(0:n-1), intent(in) :: x
        integer, dimension(0:m-1), intent(out) :: c
        integer :: i

        c = 0
        do i = 0, n-1
            c(x(i)) = c(x(i)) + 1
        end do
    end

    subroutine shuffle (x,s,n)
        implicit none
        integer, intent(in) :: n
        integer, dimension(n), intent(in) :: x
        integer, dimension(n), intent(out) :: s
        integer :: i,randpos,temp
        real :: r

        ! copy input
        s = x

        call init_random_seed()

        ! knuth shuffle from
        ! http://rosettacode.org/wiki/Knuth_shuffle#Fortran
        do i = n, 2, -1
            call random_number(r)
            randpos = int(r * i) + 1
            temp = s(randpos)
            s(randpos) = s(i)
            s(i) = temp
        end do
    end

    subroutine init_random_seed()
        ! init_random_seed from the gfortran documentation
        integer :: i, n, clock
        integer, dimension(:), allocatable :: seed

        call random_seed(size = n)
        allocate(seed(n))
        call system_clock(count=clock)
        seed = clock + 37 * (/ (i - 1, i = 1, n) /)
        call random_seed(put = seed)
        deallocate(seed)
    end subroutine
Re: [Numpy-discussion] Another suggestion for making numpy's functions generic
On Mon, Oct 19, 2009 at 3:10 AM, Sebastian Walter
sebastian.wal...@gmail.com wrote:

> On Sat, Oct 17, 2009 at 2:49 PM, Darren Dale dsdal...@gmail.com wrote:
>> [...]
>
> I'm all for generic (u)funcs, since they might come in handy for me:
> I'm doing lots of operations on arrays of polynomials. I don't quite
> get the reasoning, though. Could you correct me where I get it wrong?
>
> * the class Quantity derives from numpy.ndarray
> * Quantity overrides __add__, __mul__ etc., and you get the correct
>   behaviour for
>
>       q1 = Quantity(1, 'meter')
>       q2 = Quantity(2, 'J')
>
>   by raising an exception when performing q1 += q2

No, Quantity does not override __iadd__ to catch this. Quantity
implements __array_prepare__ to perform the dimensional analysis based
on the identity of the ufunc and the inputs, and to set the class and
dimensionality of the output array, or raise an error when dimensional
analysis fails. This approach lets quantities support all ufuncs (in
principle), not just the built-in numerical operations. It should also
make it easier to subclass from MaskedArray, so we could have a
MaskedQuantity without having to establish yet another suite of ufuncs
specific to quantities or masked quantities.

> * The problem is that numpy.add(q1, q2, q1) would corrupt q1 before
>   raising an exception

That was solved by the addition of __array_prepare__ to numpy back in
August. What I am proposing now is supporting operations on arrays that
would be compatible if we had a chance to transform them on the way into
the ufunc, like meter + foot.

Darren
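[Editor's note] To make the meter + foot case concrete: a hypothetical __input_prepare__ hook would rescale convertible inputs before the ufunc runs, and raise for inconvertible ones. The hook itself was only a proposal at this point, so the sketch below shows just the rescaling logic such a hook would delegate to; the conversion table and the `rescale` function are made-up names for illustration:

```python
# Hypothetical helper an __input_prepare__ hook could use to make
# ufunc inputs unit-compatible. Nothing here is part of numpy or of
# the quantities package; it only illustrates the proposed behaviour.
CONVERSIONS = {
    ('foot', 'meter'): 0.3048,
    ('meter', 'foot'): 1.0 / 0.3048,
}

def rescale(value, from_units, to_units):
    """Return `value` expressed in `to_units`, or raise if impossible."""
    if from_units == to_units:
        return value
    try:
        return value * CONVERSIONS[(from_units, to_units)]
    except KeyError:
        raise ValueError(
            'cannot convert %s to %s' % (from_units, to_units))

# meter + foot: the second input is rescaled on the way in
total = 1.0 + rescale(2.0, 'foot', 'meter')   # 1.6096 meters

# meter + joule: no conversion exists, so the error is raised in time,
# before any output data could be written
try:
    rescale(2.0, 'J', 'meter')
except ValueError:
    pass
```

The point of putting this in an input hook rather than in __add__ is exactly the one made above: the rescaling then applies to every ufunc, not only the operators that have special methods.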
Re: [Numpy-discussion] numpy build/installation problems ?
I had the same 4 errors in genfromtxt yesterday when I upgraded numpy to
r7539. mac os x, python 2.5.2.

--George.

2009/10/19 Pierre GM pgmdevl...@gmail.com:
> On Oct 19, 2009, at 10:40 AM, josef.p...@gmail.com wrote:
>> I wanted to finally upgrade my numpy, so I can build scipy trunk
>> again, but I get test failures with numpy. And running the tests of
>> the previously compiled version of scipy crashes in signaltools.
>
> The ConversionWarnings are expected. I'm probably to be blamed for the
> AttributeErrors (I'm testing on 2.6, where tuples do have an index
> attribute). I'm gonna check that.
Re: [Numpy-discussion] numpy build/installation problems ?
On Oct 19, 2009, at 12:01 PM, George Nurser wrote:
> I had the same 4 errors in genfromtxt yesterday when I upgraded numpy
> to r7539. mac os x, python 2.5.2.

I'm on it, should be fixed in a few hours. Please don't hesitate to open
a ticket next time (so that I remember to test on 2.5 as well...).
Thx
Re: [Numpy-discussion] user defined types
On Mon, Oct 19, 2009 at 14:55, Artem Serebriyskiy
v.for.van...@gmail.com wrote:
> Hello! Would you please give me some examples of open source projects
> which use the implementation of user-defined types for the numpy
> library? (implementation at the C-API level)

I'm not sure that anyone currently does. We do have an example in
doc/newdtype_example/.

--
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco
Re: [Numpy-discussion] object array alignment issues
On Oct 19, 2009, at 9:55 AM, Michael Droettboom wrote:
> I've filed a bug and attached a patch:
>
> http://projects.scipy.org/numpy/ticket/1267
>
> No guarantees that I've found all of the alignment issues. I did a
> grep for PyObject ** to find possible locations where PyObject *
> values in arrays were being dereferenced. If I could write a unit test
> to make it fall over on Solaris, then I fixed it; otherwise I left it
> alone. For example, there are places where misaligned dereferencing is
> theoretically possible (OBJECT_dot, OBJECT_compare), but a
> higher-level function already did a BEHAVED array cast. In those cases
> I added a unit test, so hopefully we'll be able to catch it in the
> future if the caller no longer ensures well-behavedness.

This patch looks great technically. Thank you for tracking this down and
correcting my error. Right now, though, the patch has too many
whitespace-only changes in it. Could you submit a new patch that removes
those changes?

Thanks,

-Travis

--
Travis Oliphant
Enthought Inc.
1-512-536-1057
http://www.enthought.com
oliph...@enthought.com
Re: [Numpy-discussion] object array alignment issues
On Mon, Oct 19, 2009 at 3:55 PM, Travis Oliphant
oliph...@enthought.com wrote:
> On Oct 19, 2009, at 9:55 AM, Michael Droettboom wrote:
>> [...]
>
> This patch looks great technically. Thank you for tracking this down
> and correcting my error. Right now, though, the patch has too many
> whitespace-only changes in it. Could you submit a new patch that
> removes those changes?

The old whitespace is hard tabs and needs to be replaced anyway. The new
whitespace doesn't always get the indentation right, however. That file
needs a style/whitespace cleanup.

Chuck
Re: [Numpy-discussion] object array alignment issues
On Mon, Oct 19, 2009 at 4:36 PM, Robert Kern robert.k...@gmail.com wrote:
> On Mon, Oct 19, 2009 at 17:28, Charles R Harris
> charlesr.har...@gmail.com wrote:
>> On Mon, Oct 19, 2009 at 3:55 PM, Travis Oliphant
>> oliph...@enthought.com wrote:
>>> Right now, though, the patch has too many whitespace-only changes in
>>> it. Could you submit a new patch that removes those changes?
>>
>> The old whitespace is hard tabs and needs to be replaced anyway. The
>> new whitespace doesn't always get the indentation right, however.
>> That file needs a style/whitespace cleanup.
>
> That's fine, but whitespace cleanup needs to be done in commits that
> are separate from the functional changes.

I agree, but it can be tricky to preserve hard tabs when your editor
uses spaces and has hard tabs set to 8 spaces. That file is on my
cleanup list anyway; I'll try to get to it this weekend.

Chuck
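[Editor's note] For readers unfamiliar with the underlying issue: on strict-alignment platforms such as SPARC/Solaris, dereferencing a pointer that is not aligned for its item type raises a bus error, while x86 silently tolerates it, which is why the bug only showed up on Solaris. Misaligned object arrays cannot easily be built from pure Python, but numpy's aligned flag can be demonstrated with a numeric sketch (unrelated to the patch itself, just an illustration of what "misaligned" means):

```python
import numpy as np

# 81 bytes of backing storage: ten float64 items read starting at
# byte offset 1 cannot sit on 8-byte boundaries
buf = bytearray(8 * 10 + 1)

aligned = np.frombuffer(buf, dtype=np.float64, count=10, offset=0)
misaligned = np.frombuffer(buf, dtype=np.float64, count=10, offset=1)

assert aligned.flags.aligned
assert not misaligned.flags.aligned

# copying allocates a fresh, properly aligned buffer; a BEHAVED cast in
# the C code is the analogous fix for misaligned PyObject * items
fixed = misaligned.copy()
assert fixed.flags.aligned
```

Code that loops over raw array memory must either require aligned, behaved inputs up front or copy misaligned items to a local before dereferencing, which is what the patch under discussion does.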
Re: [Numpy-discussion] opening pickled numarray data with numpy
Try creating an empty module/class with the given name. I.e. create a
'numarray' dir off your PYTHONPATH, create an empty __init__.py file,
create a 'generic.py' file in that dir, and populate it with whatever
class python complains about, like so:

    #!/usr/bin/env python

    class MissingClass(object):
        pass

Cheers,

Jason

On Mon, Oct 19, 2009 at 1:00 PM, dagmar wismeijer dagma...@gmail.com wrote:
> Hi,
>
> I've been trying to open (using numpy) old pickled data files that I
> once created using numarray, but I keep getting the message that there
> is no module numarray.generic. Is there any way I could open these
> datafiles without installing numarray again?
>
> Thanks in advance,
> Dagmar

--
Jason Rennie
Research Scientist, ITA Software
617-714-2645
http://www.itasoftware.com/
Re: [Numpy-discussion] Another suggestion for making numpy's functions generic
2009/10/19 Sebastian Walter sebastian.wal...@gmail.com:
> I'm all for generic (u)funcs, since they might come in handy for me:
> I'm doing lots of operations on arrays of polynomials.

Just as a side note, if you don't mind my asking, what sorts of
operations do you do on arrays of polynomials? In a thread on scipy-dev
we're discussing improving scipy's polynomial support, and we'd be happy
to get some more feedback on what they need to be able to do.

Thanks!
Anne