Re: [Numpy-discussion] Bitwise operations and unsigned types
On Thu, Apr 5, 2012 at 11:57 PM, Travis Oliphant tra...@continuum.io wrote:

> As of 1.5.1 this worked:
>
> >>> numpy.__version__
> '1.5.1'
> >>> numpy.uint64(5) & 3
> 1L
>
> So, this is a regression and a bug. It should be fixed so that it doesn't raise an error. I believe the scalars were special-cased so that a raw 3 would not be interpreted as a signed int when it is clearly unsigned. The casting rules were well established over a long period. They had issues, but they should not have been changed like this in a 1.X release. Fortunately there are work-arounds, and these issues arise only in corner cases, but we should strive to do better.

I disagree; promoting to object kind of destroys the whole idea of bitwise operations. I think we *fixed* a bug.

Chuck
Re: [Numpy-discussion] Bitwise operations and unsigned types
On Apr 6, 2012, at 1:01 AM, Charles R Harris wrote:

> On Thu, Apr 5, 2012 at 11:57 PM, Travis Oliphant tra...@continuum.io wrote:
>
>> As of 1.5.1 this worked:
>>
>> >>> numpy.__version__
>> '1.5.1'
>> >>> numpy.uint64(5) & 3
>> 1L
>>
>> So, this is a regression and a bug. [...]
>
> I disagree; promoting to object kind of destroys the whole idea of bitwise operations. I think we *fixed* a bug.
>
> Chuck

That is an interesting point of view. I could see that point of view. But was this discussed as a bug prior to this change occurring?

I just heard from a very heavy user of NumPy that they are nervous about upgrading because of little changes like this one. I don't know if this particular issue would affect them or not, but I will re-iterate my view that we should be very careful with these kinds of changes.

In this particular case, what should the behavior be? It would be ideal if the scalar math did not just re-use the array-math machinery. Let's say that scalars had their own bitwise operators. What should the output of

>>> numpy.uint64(5) & 3

actually be? I think it should interpret the 3 as unsigned and perform the operation (but not promote to an object).

-Travis
Re: [Numpy-discussion] Bitwise operations and unsigned types
On Fri, Apr 6, 2012 at 12:01 AM, Charles R Harris charlesr.har...@gmail.com wrote:

> On Thu, Apr 5, 2012 at 11:57 PM, Travis Oliphant tra...@continuum.io wrote:
>
>> As of 1.5.1 this worked:
>>
>> >>> numpy.uint64(5) & 3
>> 1L
>>
>> So, this is a regression and a bug. [...]
>
> I disagree; promoting to object kind of destroys the whole idea of bitwise operations. I think we *fixed* a bug.

Although 1.5.1 also gives

>>> np.uint(3) + 4
7.0

i.e., a float, which certainly doesn't look right either. Whereas

>>> np.int(3) + 4
7

The float promotion is still there in 1.6.1:

In [4]: uint64(1) + 2
Out[4]: 3.0

So I suppose the larger question is how combining numpy scalars with python scalars should be done in general.

Chuck
Re: [Numpy-discussion] Bitwise operations and unsigned types
> Although 1.5.1 also gives
>
> >>> np.uint(3) + 4
> 7.0
>
> i.e., a float, which certainly doesn't look right either. Whereas
>
> >>> np.int(3) + 4
> 7
>
> The float promotion is still there in 1.6.1:
>
> In [4]: uint64(1) + 2
> Out[4]: 3.0
>
> So I suppose the larger question is how combining numpy scalars with python scalars should be done in general.

Yes, exactly! This is a good discussion to have. As mentioned before, I would also like to see the NumPy scalars get their own math instead of re-using the umath machinery. In the case of scalars, it would seem to make sense to interpret integers as signed or unsigned based on their value.

-Travis
Re: [Numpy-discussion] Bitwise operations and unsigned types
On Fri, Apr 6, 2012 at 12:19 AM, Travis Oliphant tra...@continuum.io wrote:

> On Apr 6, 2012, at 1:01 AM, Charles R Harris wrote:
>
>> I disagree; promoting to object kind of destroys the whole idea of bitwise operations. I think we *fixed* a bug.
>
> That is an interesting point of view. [...] In this particular case, what should the behavior be? It would be ideal if the scalar math did not just re-use the array-math machinery. Let's say that scalars had their own bitwise operators. What should the output of numpy.uint64(5) & 3 actually be? I think it should interpret the 3 as unsigned and perform the operation (but not promote to an object).

Yes, I agree with that. We should think about how the scalar types combine in the common operations. It looks like it could be made more consistent.

Chuck
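[For anyone hitting this on 1.6/master, a minimal sketch of the behavior under discussion and the workaround mentioned later in the thread (giving the Python constant an explicit unsigned type so no unsafe cast is needed); exact error text may vary by version:]

import numpy as np

x = np.uint64(5)

# On the versions discussed here this raises TypeError, because the
# bare 3 is treated as a signed int and there is no safe common type
# for uint64 and int64:
try:
    x & 3
except TypeError:
    pass

# Workaround: make the constant's type explicit.
print(x & np.uint64(3))  # 1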
[Numpy-discussion] why does eigvalsh return a complex array?
I noticed that numpy.linalg.eigvalsh returns a complex array, even though mathematically the resulting eigenvalues are guaranteed to be real. Looking at the source code, the underlying zheevd routine of LAPACK indeed returns an array of real numbers, which is then converted to complex in the numpy wrapper. Does numpy policy require the type of the result to be the same as the type of the input? Copying an array twice to arrive at the original result seems pointless to me.
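[For reference, a minimal illustration of the behavior being described, with the user-side workaround of discarding the (zero) imaginary part; this assumes a NumPy of the era discussed here, where the wrapper up-casts the real LAPACK output:]

import numpy as np

# Hermitian matrix: its eigenvalues are mathematically real.
a = np.array([[2.0, 1.0 + 1.0j],
              [1.0 - 1.0j, 3.0]])

w = np.linalg.eigvalsh(a)
print(w.dtype)  # complex on the NumPy versions discussed in this thread

# Workaround until the wrapper is fixed: drop the zero imaginary part.
w = w.real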
Re: [Numpy-discussion] Bitwise operations and unsigned types
On Fri, Apr 6, 2012 at 7:19 AM, Travis Oliphant tra...@continuum.io wrote:

> That is an interesting point of view. I could see that point of view. But was this discussed as a bug prior to this change occurring?
>
> I just heard from a very heavy user of NumPy that they are nervous about upgrading because of little changes like this one. I don't know if this particular issue would affect them or not, but I will re-iterate my view that we should be very careful with these kinds of changes.

I agree -- these changes make me very nervous as well, especially since I haven't seen any short, simple description of what changed or what the rules actually are now (comparable to the old "scalars do not affect the type of arrays").

But I also want to speak up in favor in one respect, since real-world data points are always good. I had some code that did:

def do_something(a):
    a = np.asarray(a)
    a -= np.mean(a)
    ...

If someone happens to pass in an integer array, then this is totally broken -- np.mean(a) may be non-integral, and in 1.6, numpy silently discards the fractional part and performs the subtraction anyway, e.g.:

In [4]: a
Out[4]: array([0, 1, 2, 3])

In [5]: a -= 1.5

In [6]: a
Out[6]: array([-1, 0, 0, 1])

The bug was discovered when Skipper tried running my code against numpy master, and it errored out on the -=. So Mark's changes did catch one real bug that would have silently caused completely wrong numerical results!

https://github.com/charlton/charlton/commit/d58c72529a5b33d06b49544bc3347c6480dc4512

- Nathaniel
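[To make the failure mode concrete, a defensive sketch of the snippet above: forcing a floating-point working copy up front sidesteps both the silent truncation of 1.6 and the error on master. This is an illustration, not the actual charlton fix:]

import numpy as np

def do_something(a):
    # A float copy means subtracting a non-integral mean can never
    # silently truncate (1.6) or raise on the in-place cast (master).
    a = np.asarray(a, dtype=float)
    a -= np.mean(a)
    return a

print(do_something([0, 1, 2, 3]))  # [-1.5 -0.5  0.5  1.5]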
Re: [Numpy-discussion] Improving NumPy's indexing / subsetting / fancy indexing implementation
Hi Wes,

I believe that Mark rewrote a bunch of the fancy-indexing-related code from scratch in the masked-NA branch. I don't know if it affects anything you're talking about here, but just as a heads-up, you might want to benchmark master, since it may have a different performance profile.

-- Nathaniel

On Fri, Apr 6, 2012 at 4:04 AM, Wes McKinney wesmck...@gmail.com wrote:

> dear all,
>
> I've routinely found that:
>
> 1) ndarray.take is up to 1 order of magnitude faster than fancy indexing
> 2) Hand-coded Cython boolean indexing is many times faster than ndarray indexing
> 3) putmask is significantly faster than ndarray indexing
>
> For example, I stumbled on this tonight. Straightforward Cython code:
>
> def row_bool_subset(ndarray[float64_t, ndim=2] values,
>                     ndarray[uint8_t, cast=True] mask):
>     cdef:
>         Py_ssize_t i, j, n, k, pos = 0
>         ndarray[float64_t, ndim=2] out
>
>     n, k = (<object> values).shape
>     assert(n == len(mask))
>
>     out = np.empty((mask.sum(), k), dtype=np.float64)
>
>     for i in range(n):
>         if mask[i]:
>             for j in range(k):
>                 out[pos, j] = values[i, j]
>             pos += 1
>
>     return out
>
> In [1]: values = randn(100, 4)
>
> In [2]: mask = np.ones(100, dtype=bool)
>
> In [3]: import pandas._sandbox as sbx
>
> In [4]: result = sbx.row_bool_subset(values, mask)
>
> In [5]: timeit result = sbx.row_bool_subset(values, mask)
> 100 loops, best of 3: 9.58 ms per loop
>
> Pure NumPy:
>
> In [6]: timeit values[mask]
> 10 loops, best of 3: 81.6 ms per loop
>
> Here's the kind of take performance problem that I routinely experience:
>
> In [12]: values = randn(100, 4)
>
> In [13]: values.shape
> Out[13]: (100, 4)
>
> In [14]: indexer = np.random.permutation(100)[:50]
>
> In [15]: timeit values[indexer]
> 10 loops, best of 3: 70.7 ms per loop
>
> In [16]: timeit values.take(indexer, axis=0)
> 100 loops, best of 3: 13.3 ms per loop
>
> When I can spare the time in the future I will personally work on these indexing routines in the C codebase, but I suspect that I'm not the only one adversely affected by these kinds of performance problems, and it might be worth thinking about a community effort to split up the work of retooling these methods to be more performant. We could use a tool like my vbench project (github.com/wesm/vbench) to make a list of the performance benchmarks of interest (like the ones above).
>
> Unfortunately I am too time-constrained, at least for the next 6 months, to devote to a complete rewrite of the code in question. Possibly sometime in 2013 if no one has gotten to it yet, but this seems like someplace that we should be concerned about as the language performance wars continue to rage.
>
> - Wes
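[For anyone who wants to re-run these comparisons against master, a minimal timing harness in the spirit of Wes's measurements; the array sizes here are illustrative assumptions, not the ones from his run:]

import timeit

setup = """
import numpy as np
values = np.random.randn(1000000, 4)
indexer = np.random.permutation(1000000)[:500000]
mask = np.ones(1000000, dtype=bool)
"""

for stmt in ("values[indexer]",
             "values.take(indexer, axis=0)",
             "values[mask]"):
    # Best of 3 repeats of 10 runs each, reported per loop.
    best = min(timeit.repeat(stmt, setup=setup, number=10, repeat=3))
    print("%-30s %.2f ms per loop" % (stmt, best / 10 * 1000))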
[Numpy-discussion] (no subject)
http://alumnos.digicap.cl/images/rmngl.html
Re: [Numpy-discussion] (no subject)
Could someone please ban this person from the mailing list? He keeps sending spam.

On 6 April 2012 12:41, Jean-Baptiste Rudant boogalo...@yahoo.fr wrote:

> http://alumnos.digicap.cl/images/rmngl.html
Re: [Numpy-discussion] (no subject)
On 06/04/2012 14:06, mark florisson wrote:

> Could someone please ban this person from the mailing list? He keeps sending spam.

I was about to ask the same thing. In the meantime, I googled the name of this gentleman and found a possible match with a person working for the French national institute for statistics (INSEE). I've tried to forge a valid email address from his name and the INSEE domain name to warn him about his spamming Yahoo address. I'll see if the email comes back as undelivered or not...

-- Pierre
Re: [Numpy-discussion] Slice specified axis
On Friday, April 6, 2012, Val Kalatsky wrote:

> The only slicing short-cut I can think of is the Ellipsis object, but it's not going to help you much here. The alternatives that come to my mind are (1) manipulation of shape directly and (2) building a string and running eval on it. Your solution is better than (1), and (2) is a horrible hack, so your solution wins again.
>
> Cheers
> Val

Take a peek at how np.gradient() does it. It creates a list of slice(None) objects with a length equal to the number of dimensions, and then inserts the desired slice object in the appropriate spot in the list.

Cheers!
Ben Root
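[For concreteness, a small sketch of the trick Ben describes; the helper name is an invention for illustration, not from np.gradient itself:]

import numpy as np

def slice_along_axis(arr, start, stop, axis):
    # Full slice in every dimension, then drop the real slice into the
    # requested spot -- the np.gradient approach.
    index = [slice(None)] * arr.ndim
    index[axis] = slice(start, stop)
    return arr[tuple(index)]

a = np.arange(24).reshape(2, 3, 4)
print(slice_along_axis(a, 1, 3, axis=2).shape)  # (2, 3, 2)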
Re: [Numpy-discussion] (no subject)
On Fri, Apr 6, 2012 at 7:06 AM, mark florisson markflorisso...@gmail.com wrote:

> Could someone please ban this person from the mailing list? He keeps sending spam.
>
> On 6 April 2012 12:41, Jean-Baptiste Rudant boogalo...@yahoo.fr wrote:
>
>> http://alumnos.digicap.cl/images/rmngl.html

They should be gone now.

Ognen
Re: [Numpy-discussion] Bitwise operations and unsigned types
On Fri, Apr 6, 2012 at 3:57 AM, Nathaniel Smith n...@pobox.com wrote:

> On Fri, Apr 6, 2012 at 7:19 AM, Travis Oliphant tra...@continuum.io wrote:
>
>> That is an interesting point of view. I could see that point of view. But was this discussed as a bug prior to this change occurring? [...]
>
> I agree -- these changes make me very nervous as well, especially since I haven't seen any short, simple description of what changed or what the rules actually are now (comparable to the old "scalars do not affect the type of arrays").
>
> But I also want to speak up in favor in one respect, since real-world data points are always good. I had some code that did:
>
> def do_something(a):
>     a = np.asarray(a)
>     a -= np.mean(a)
>     ...
>
> If someone happens to pass in an integer array, then this is totally broken -- np.mean(a) may be non-integral, and in 1.6, numpy silently discards the fractional part and performs the subtraction anyway. [...] The bug was discovered when Skipper tried running my code against numpy master, and it errored out on the -=. So Mark's changes did catch one real bug that would have silently caused completely wrong numerical results!
>
> https://github.com/charlton/charlton/commit/d58c72529a5b33d06b49544bc3347c6480dc4512

Yes, these things are trade-offs between correctness and convenience. I don't mind new warnings/errors so much; they may break old code, but they don't lead to wrong results. It's the unexpected and unnoticed successes that are scary.

Chuck
Re: [Numpy-discussion] Bitwise operations and unsigned types
Good morning all -- didn't realize this would generate quite such a buzz. To answer a direct question, I'm using the github master.

A few thoughts (from a fairly heavy numpy user for numerical simulations and analysis): The current behavior is confusing and (as far as I can tell) undocumented. Scalars act up only if they are big:

In [152]: np.uint32(1) & 1
Out[152]: 1

In [153]: np.uint64(1) & 1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/Users/claumann/<ipython-input-153-191a0b5fe216> in <module>()
----> 1 np.uint64(1) & 1

TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

But arrays don't seem to mind:

In [154]: ones(3, dtype=np.uint32) & 1
Out[154]: array([1, 1, 1], dtype=uint32)

In [155]: ones(3, dtype=np.uint64) & 1
Out[155]: array([1, 1, 1], dtype=uint64)

As you mentioned, explicitly casting the 1 to np.uint makes the above scalar case work, but I don't understand why this is unnecessary for the arrays. I could understand a general argument that type-casting rules should always be the same independent of the underlying ufunc, but I'm not sure that is sufficiently smart. Bitwise ops probably really ought to treat nonnegative python integers as unsigned.

>> I disagree, promoting to object kind of destroys the whole idea of bitwise operations. I think we *fixed* a bug.
>
> That is an interesting point of view. I could see that point of view. But, was this discussed as a bug prior to this change occurring?

I'm not sure what 'promoting to object' constitutes in the new numpy, but just a small thought. I can think of two reasons to go to the trouble of using bitfields over more pythonic (higher-level) representations: speed/memory overhead and interfacing with external hardware/software. For me, it's mostly the former -- I've already implemented this program once using a much more pythonic approach, but it just has too much memory overhead to scale to where I want it. If a coder goes to the trouble of using bitfields, there's probably a good reason they wanted a lower-level representation in which bitfield ops happen in parallel as integer operations.

But what do you mean that bitwise operations are destroyed by promotion to objects?

Best,
Chris

On Apr 6, 2012, at 5:57 AM, Nathaniel Smith wrote:

> On Fri, Apr 6, 2012 at 7:19 AM, Travis Oliphant tra...@continuum.io wrote:
>
>> That is an interesting point of view. [...]
>
> I agree -- these changes make me very nervous as well [...] But, I also want to speak up in favor in one respect, since real world data points are always good. I had some code that did:
>
> def do_something(a):
>     a = np.asarray(a)
>     a -= np.mean(a)
>     ...
>
> If someone happens to pass in an integer array, then this is totally broken [...] So Mark's changes did catch one real bug that would have silently caused completely wrong numerical results!
>
> https://github.com/charlton/charlton/commit/d58c72529a5b33d06b49544bc3347c6480dc4512
>
> - Nathaniel
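[On the 'promoting to object' question: as the objection reads, an object-dtype result would mean each element is a boxed Python integer, so every & is dispatched through the interpreter per element instead of running as one fixed-width integer operation. A hypothetical sketch of that outcome, constructed here with an explicit astype:]

import numpy as np

a = np.ones(3, dtype=np.uint64)

# Boxed Python ints: each bitwise op goes element by element through
# the interpreter.
b = a.astype(object)
print((b & 1).dtype)  # object -- no word-at-a-time, fixed-width bit ops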
[Numpy-discussion] problem with vectorized difference equation
Hello everyone,

After reading the very good post http://technicaldiscovery.blogspot.com/2011/06/speeding-up-python-numpy-cython-and.html and the book by H. P. Langtangen, 'Python scripting for computational science', I was trying to speed up the execution of a loop on numpy arrays being used to describe a simple difference equation. The actual code I am working on involves some more complicated equations, but I am seeing the exact same behavior as described below. To test the improvement in speed I wrote the following in vect_try.py:

#!/usr/bin/python
import numpy as np
import matplotlib.pyplot as plt

dt = 0.02                    # time step
time = np.arange(0, 2, dt)   # time array
u = np.sin(2*np.pi*time)     # input signal array

def vect_int(u, y):          # vectorized function
    n = u.size
    y[1:n] = y[0:n-1] + u[1:n]
    return y

def sc_int(u, y):            # scalar function
    y = y + u
    return y

def calc_vect(u, func=vect_int):
    out = np.zeros(u.size)
    for i in xrange(u.size):
        out = func(u, out)
    return out

def calc_sc(u, func=sc_int):
    out = np.zeros(u.size)
    for i in xrange(u.size-1):
        out[i+1] = sc_int(u[i+1], out[i])
    return out

To verify the execution time I've used the timeit function in IPython:

import vect_try as vt
timeit vt.calc_vect(vt.u)  --> 1000 loops, best of 3: 494 us per loop
timeit vt.calc_sc(vt.u)    --> 10000 loops, best of 3: 92.8 us per loop

As you can see, the scalar implementation looping one item at a time (calc_sc) is 494/92.8 ~ 5.3 times faster than the vectorized one (calc_vect). My problem consists in the fact that I need to iterate the execution of calc_vect in order for it to operate on all the elements of the input array. If I execute calc_vect only once, it will only operate on the first slice of the vectors, leaving the remainder untouched. My understanding was that the vector expression y[1:n] = y[0:n-1] + u[1:n] was supposed to iterate over all the array, but that's not happening for me. Can anyone tell me what I am doing wrong?

Thanks!
Francesco
Re: [Numpy-discussion] Slice specified axis
On Fri, Apr 6, 2012 at 8:54 AM, Benjamin Root ben.r...@ou.edu wrote:

> On Friday, April 6, 2012, Val Kalatsky wrote:
>
>> The only slicing short-cut I can think of is the Ellipsis object, but it's not going to help you much here. [...]
>
> Take a peek at how np.gradient() does it. It creates a list of slice(None) objects with a length equal to the number of dimensions, and then inserts the desired slice object in the appropriate spot in the list.
>
> Cheers!
> Ben Root

Hmm, it looks like my original implementation wasn't too far off. Thanks for the tip!

-Tony
Re: [Numpy-discussion] Slice specified axis
Hi,

On Fri, Apr 6, 2012 at 1:12 PM, Tony Yu tsy...@gmail.com wrote:

> [...]
>
> Hmm, it looks like my original implementation wasn't too far off. Thanks for the tip!

Another option:

me_first = np.rollaxis(arr, axis)
slice = me_first[start:end]
slice = np.rollaxis(slice, 0, axis+1)

Best,
Matthew
Re: [Numpy-discussion] problem with vectorized difference equation
On Saturday 07 April 2012 12:14 AM, francesco82 wrote:

> Hello everyone,
>
> After reading the very good post http://technicaldiscovery.blogspot.com/2011/06/speeding-up-python-numpy-cython-and.html and the book by H. P. Langtangen 'Python scripting for computational science' I was trying to speed up the execution of a loop on numpy arrays being used to describe a simple difference equation. [...] My understanding was that the vector expression y[1:n] = y[0:n-1] + u[1:n] was supposed to iterate over all the array, but that's not happening for me. Can anyone tell me what I am doing wrong?
>
> Thanks!
> Francesco

1. By vectorizing, we mean replacing a loop with a single expression. In your program, both the scalar and vector implementations (calc_vect and calc_sc) have a loop each. This isn't going to make anything faster. The vectorized implementation is just a convoluted way of achieving the same result and is slower.

2. The expression y[1:n] = y[0:n-1] + u[1:n] is /not/ equivalent to the following loop:

for i in range(0, n-1):
    y[i+1] = y[i] + u[i+1]

It is equivalent to something like:

z = np.zeros(n-1)
for i in range(0, n-1):
    z[i] = y[i] + u[i+1]
y[1:n] = z

i.e., the RHS is computed in its entirety and then assigned to the LHS. This is how array operations work even in other languages such as Matlab.

3. I personally don't think there is a simple/obvious way to vectorize what you're trying to achieve.

Sameer
Re: [Numpy-discussion] problem with vectorized difference equation
Hello Sameer,

Thank you very much for your reply. My goal was to try to speed up the loop describing the accumulator. In the (excellent) book I was mentioning in my initial post I could find one example that seemed to match what I was trying to do. Basically, it is said that a loop of the following kind:

n = size(u) - 1
for i in xrange(1, n, 1):
    u_new[i] = u[i-1] + u[i] + u[i+1]

should be equivalent to:

u_new[1:n] = u[0:n-1] + u[1:n] + u[2:n+1]

Am I missing something?

Regards,
Francesco

Sameer Grover wrote:

> On Saturday 07 April 2012 12:14 AM, francesco82 wrote:
>
>> Hello everyone, [...]
>
> 1. By vectorizing, we mean replacing a loop with a single expression. [...]
>
> 2. The expression y[1:n] = y[0:n-1] + u[1:n] is /not/ equivalent to the accumulating loop; the RHS is computed in its entirety and then assigned to the LHS. This is how array operations work even in other languages such as Matlab.
>
> 3. I personally don't think there is a simple/obvious way to vectorize what you're trying to achieve.
>
> Sameer
Re: [Numpy-discussion] problem with vectorized difference equation
On Saturday 07 April 2012 02:51 AM, Francesco Barale wrote:

> Hello Sameer,
>
> Thank you very much for your reply. My goal was to try to speed up the loop describing the accumulator. In the (excellent) book I was mentioning in my initial post I could find one example that seemed to match what I was trying to do. Basically, it is said that a loop of the following kind:
>
> n = size(u) - 1
> for i in xrange(1, n, 1):
>     u_new[i] = u[i-1] + u[i] + u[i+1]
>
> should be equivalent to:
>
> u_new[1:n] = u[0:n-1] + u[1:n] + u[2:n+1]

This example is correct. What I was trying to point out was that the single expression y[1:n] = y[0:n-1] + u[1:n] will iterate over the array but will not accumulate. It will add y[0:n-1] to u[1:n] and assign the result to y[1:n]. For example, with

y = [1,2,3,4]
u = [5,6,7,8]

we have y[0:n-1] = [1,2,3] and u[1:n] = [6,7,8]. The statement y[1:n] = y[0:n-1] + u[1:n] implies y[1:n] = [6+1, 7+2, 8+3] = [7,9,11], yielding y = [1,7,9,11], whereas the code:

for i in range(0, n-1):
    y[i+1] = y[i] + u[i+1]

will accumulate and give y = [1,7,14,22].

Sameer

> Am I missing something?
>
> Regards,
> Francesco
>
> [...]
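[That said, this particular recurrence is a prefix sum, so for the accumulator in this thread there is a genuinely vectorized form. A minimal sketch using the numbers above:]

import numpy as np

y0 = 1
u = np.array([5, 6, 7, 8])

# y[i+1] = y[i] + u[i+1] is a running sum of u[1:] starting from y[0]:
y = np.concatenate(([y0], y0 + np.cumsum(u[1:])))
print(y)  # [ 1  7 14 22] -- matches the accumulating loop above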
[Numpy-discussion] Keyword argument support for vectorize.
Hi,

I added a simple enhancement patch to provide vectorize with simple keyword argument support. (I added a new kwvectorize decorator, but suspect this could/should easily be rolled into the existing vectorize.)

http://projects.scipy.org/numpy/ticket/2100

This just reorders any kwargs into the correct positions (filling in defaults as needed) and then calls the vectorized function. If people think this is reasonable, I can improve the patch with more comprehensive testing and error messages.

Michael.
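[The ticket's attachment is not reproduced here, but the idea might look something like this sketch -- hypothetical code, not the actual patch:]

import inspect
import numpy as np

def kwvectorize(func):
    """Sketch of the idea: map keyword arguments back into positional
    order, filling in defaults, then call a plain np.vectorize'd func."""
    vfunc = np.vectorize(func)
    spec = inspect.getargspec(func)  # Python 2 era, matching the thread
    defaults = dict(zip(spec.args[len(spec.args) - len(spec.defaults or ()):],
                        spec.defaults or ()))

    def wrapper(*args, **kwargs):
        full = list(args)
        for name in spec.args[len(args):]:
            full.append(kwargs[name] if name in kwargs else defaults[name])
        return vfunc(*full)

    return wrapper

@kwvectorize
def f(x, a=2.0):
    return x ** a

print(f(np.arange(4), a=3.0))  # [  0.   1.   8.  27.]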
[Numpy-discussion] speed of append_fields() in numpy.lib.recfunctions vs matplotlib.mlab
It seems that append_fields() in numpy.lib.recfunctions is much slower than rec_append_fields() in matplotlib.mlab. See the following code:

import numpy as np
import matplotlib.mlab as mlab
import numpy.lib.recfunctions as nprf
import time

# Set up recarray
nr_pts = 1E6
dt = np.dtype([('x', float), ('y', float)])
data = np.zeros(nr_pts, dtype=dt)
data = data.view(np.recarray)
data.x = np.linspace(0, 5, nr_pts)
data.y = np.linspace(5, 10, nr_pts)
z = np.linspace(20, 15, nr_pts)

# Test mlab
last_time_clock = time.clock()
data_mlab = mlab.rec_append_fields(data, ['z'], [z])
time_taken = time.clock() - last_time_clock
print 'mlab took %i milliseconds.' % (time_taken*1000)

# Test nprf
last_time_clock = time.clock()
data_nprf = nprf.append_fields(data, ['z'], [z], usemask=False, asrecarray=True)
time_taken = time.clock() - last_time_clock
print 'nprf took %i milliseconds.' % (time_taken*1000)

On this computer, the output is (+/- 10 ms):

mlab took 49 milliseconds.
nprf took 440 milliseconds.

Does anyone know why the numpy.lib.recfunctions version is so much slower? I thought these were a port from matplotlib.mlab. Changing to usemask=True has an effect on the time of the nprf way, making it vary 330-480 ms (still much slower). I'm using numpy 1.5.1.

Best,
Chris
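[For comparison, a hand-rolled append -- a hypothetical helper, assuming a plain unmasked recarray with a simple dtype -- that does essentially what the mlab version does: allocate the widened array once and copy column by column:]

import numpy as np

def append_field_fast(rec, name, arr):
    # Build the widened dtype, then copy each existing column across.
    new_dtype = np.dtype(rec.dtype.descr + [(name, arr.dtype.str)])
    out = np.empty(rec.shape, dtype=new_dtype)
    for field in rec.dtype.names:
        out[field] = rec[field]
    out[name] = arr
    return out.view(np.recarray)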
Re: [Numpy-discussion] problem with vectorized difference equation
Now I am clear. I guess the vectorized notation speeds up difference equations describing FIR structures, whereas IIR ones won't benefit.

Francesco Barale wrote:

> Hello everyone,
>
> After reading the very good post [...] My understanding was that the vector expression y[1:n] = y[0:n-1] + u[1:n] was supposed to iterate over all the array, but that's not happening for me. Can anyone tell me what I am doing wrong?
>
> Thanks!
> Francesco
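[For reference, IIR-style recurrences like this accumulator do still have a compiled escape hatch outside plain array expressions: the recurrence y[n] = y[n-1] + u[n] is an IIR filter with b = [1], a = [1, -1], so scipy.signal.lfilter can run the whole recursion in compiled code. A sketch on the signal from the original post:]

import numpy as np
from scipy.signal import lfilter

dt = 0.02
time = np.arange(0, 2, dt)
u = np.sin(2 * np.pi * time)

# a[0]*y[n] = b[0]*u[n] - a[1]*y[n-1] with b=[1], a=[1, -1]
# gives exactly y[n] = y[n-1] + u[n].
y = lfilter([1.0], [1.0, -1.0], u)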
Re: [Numpy-discussion] speed of append_fields() in numpy.lib.recfunctions vs matplotlib.mlab
Hi,

On Fri, Apr 6, 2012 at 3:50 PM, cgraves christoph.gra...@gmail.com wrote:

> It seems that append_fields() in numpy.lib.recfunctions is much slower than rec_append_fields() in matplotlib.mlab. See the following code: [...]

As I remember it (Pierre M can probably correct me), the recfunctions are not ports of the mlab functions; they are considerably extended in order to deal with masking, and do not have exactly the same API. When I noticed this I wondered if there would be some sensible way of making the mlab routines available in a separate namespace, but I did not pursue it.

Cheers,
Matthew
Re: [Numpy-discussion] why does eigvalsh return a complex array?
On Fri, Apr 6, 2012 at 1:56 AM, Christoph Groth c...@falma.de wrote:

> I noticed that numpy.linalg.eigvalsh returns a complex array, even though mathematically the resulting eigenvalues are guaranteed to be real. [...] Copying an array twice to arrive at the original result seems pointless to me.

I think this should be fixed; the problem is in the wrapper for complex arrays. Not sure what the easiest fix is. I expect eigh is similar in behavior.

Chuck