[Numpy-discussion] speed of array vs matrix
Hello, I have noticed a significant speed difference between the array and the matrix implementation of the dot product, especially for not-so-big matrices. For example:

In [1]: import numpy as np
In [2]: b = np.random.rand(104,1)
In [3]: bm = np.mat(b)
In [4]: a = np.random.rand(8, 104)
In [5]: am = np.mat(a)
In [6]: %timeit np.dot(a, b)
100 loops, best of 3: 1.74 us per loop
In [7]: %timeit am * bm
10 loops, best of 3: 6.38 us per loop

The results for two different PCs (PC1 with windows/EPD6.2-2 and PC2 with ubuntu/numpy-1.3.0) and two different sizes are below:

                      array    matrix
8x104 * 104x1   PC1   1.74us   6.38us
                PC2   1.23us   5.85us
8x10  * 10x5    PC1   2.38us   7.55us
                PC2   1.56us   6.01us

For bigger matrices the timings seem to approach each other asymptotically. Is this something worth trying to fix, or should I just accept it as a fact and, when working with small matrices, stick to array?

Thanks,
Luca
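A quick way to see that the gap is Python-level wrapper overhead rather than the underlying dot itself (a sketch; .A returns the matrix's underlying ndarray without copying):

import numpy as np
a = np.random.rand(8, 104)
b = np.random.rand(104, 1)
am, bm = np.mat(a), np.mat(b)
# bypass matrix.__mul__ and the re-wrapping of the result as np.matrix:
np.dot(am.A, bm.A)   # runs at np.dot(a, b) speed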
Re: [Numpy-discussion] speed of array vs matrix
Thank you for your replies. I might just use arrays. Maybe in the future something like PyPy might help reduce the Python overhead.
Best,
Luca
Re: [Numpy-discussion] Applying Patch #1085
Hello!

> What do people think of applying patch #1085?

> Fine with me. I'd rename the function ... Let me know if you want me to make these changes or feel free to make them.

> It looks like the routine doesn't try to determine if the views actually overlap, just if they might potentially share data. Is that correct? That seems safe and if the time isn't much it might be a nice safety catch.

The function compares the ultimate base of every output with those of the inputs, and if an output and an input have the same base (or either one is the base of the other), the input is copied to a temporary object before the operation (unless it is the easy case of same dimensions and strides, strides all positive, and the output pointer less than the input one). The two views might not actually overlap (such as z[1::2] = z[0::2] + 1) but the routine is not smart enough to see that and makes a copy anyway. This should be a conservative approach: it should be safe; at most it might copy some array unnecessarily. Let me know if you can think of a better approach that saves some unnecessary copies but is light enough (as it is applied nin x nout times).

Best,
Luca
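A small sketch of the two situations discussed above, same ultimate base with and without actual overlap (the out= form makes the hazard explicit):

import numpy as np
z = np.arange(6)
# disjoint views of the same base: a copy is not strictly needed,
# but the conservative base check makes one anyway
z[1::2] = z[0::2] + 1
# genuinely overlapping input and output: without a temporary copy,
# elements written early would be read back as inputs later
y = np.arange(6)
np.add(y[:-1], 1, out=y[1:])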
Re: [Numpy-discussion] np.any and np.all short-circuiting
Hello David, thank you. I followed your suggestion but I was unable to make it work. Surprisingly, I found that with numpy in a different folder it worked. I am afraid this is due to the fact that the first folder is not a Linux filesystem and cannot deal with permissions and ownership. This would make sense and agree with a recent post about nose having problems with certain permissions or ownerships. Problem solved: I checked out the svn version into a Linux filesystem and it works. Thanks again to all those that offered help.
Best,
Luca
Re: [Numpy-discussion] np.any and np.all short-circuiting
Thanks for your reply. So, what is the correct way to test a numpy development version without installing it in /usr/lib/... or /usr/local/lib/...? What do you guys do?
[Numpy-discussion] np.any and np.all short-circuiting
Hello, I noticed that Python's any can be faster than numpy's any (and similarly for all). Then I wondered why. I realized that numpy implements any as logical_or.reduce (and all as logical_and.reduce). This means that numpy cannot take advantage of short-circuiting. Looking at the timings confirmed my suspicion. I think Python fetches one element at a time from the array and, as soon as any of them is true, it returns true. Instead, numpy goes on until the end of the array even if the very first element is already true.

Looking at the code I think I found a way to fix it. I don't see a reason why it should not work. It seems to work. But you never know, so I wanted to run the test suite. I am unable to run the tests on the svn version, neither from .../build/lib... nor from a different folder using sys.path.insert(0, '.../build/lib...'). In the first case I get

NameError: name 'numeric' is not defined

while in the second case zero tests are successfully performed :-) What is the correct way of running the tests (without installing the development version in the system)? Is there some expert of the inner numpy core able to tell whether the approach is correct and won't break something? I opened a ticket for this: http://projects.scipy.org/numpy/ticket/1237

Best,
Luca

In the following table any(x) is Python's version, np.any(x) is numpy's, while *np.any(x)* is mine ('1.4.0.dev7417'):

x = np.zeros(10, dtype=bool); x[i] = True;  %timeit any(x); %timeit np.any(x)
x = np.ones(10, dtype=bool);  x[i] = False; %timeit all(x); %timeit np.all(x)

ANY
   i    any(x)    np.any(x)  *np.any(x)*
  //    6.84 ms   831 µs     189 µs
   5    3.41 ms   832 µs      98 µs
   1    683 µs    831 µs     24.7 µs
1000    68.9 µs   859 µs     8.41 µs
 100    7.92 µs   888 µs      6.9 µs
  10    1.42 µs   832 µs     6.68 µs
   0    712 ns    831 µs     6.65 µs

ALL
   i    all(x)    np.all(x)  *np.all(x)*
  //    6.65 ms   676 µs     300 µs
   5    3.32 ms   677 µs     154 µs
   1    666 µs    676 µs     36.4 µs
1000    67.9 µs   686 µs     9.86 µs
 100    7.53 µs   677 µs     7.26 µs
  10    1.39 µs   676 µs     7.06 µs
   0    716 ns    678 µs     6.96 µs
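A minimal illustration of the short-circuiting difference (sizes are illustrative):

import numpy as np
x = np.zeros(1000000, dtype=bool)
x[10] = True
any(x)     # Python fetches elements one at a time and stops at the first True
np.any(x)  # logical_or.reduce scans the whole array regardless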
Re: [Numpy-discussion] np.any and np.all short-circuiting
Thank you for your instantaneous reply! This is what I usually do: from the numpy folder I run (emptying the build folder if I just fetched svn updates)

$ python setup.py build
$ cd build/lib-...
$ ipython

In [1]: import numpy as np
In [2]: np.__version__
Out[2]: '1.4.0.dev7417'

Everything works except for np.test() which gives

NameError: name 'numeric' is not defined

Otherwise I move into a different folder, say /tmp, and run ipython:

In [1]: import sys
In [2]: sys.path.insert(0, '~/numpy/build/lib.linux-i686-2.6/')
In [3]: import numpy as np
In [4]: np.__version__
Out[4]: '1.4.0.dev7417'
In [5]: np.test()
Running unit tests for numpy
NumPy version 1.4.0.dev7417
NumPy is installed in /_space_/Temp/numpy/build/lib.linux-i686-2.6/numpy
Python version 2.6.2 (release26-maint, Apr 19 2009, 01:56:41) [GCC 4.3.3]
nose version 0.10.4
--
Ran 0 tests in 0.002s

OK
Out[5]: nose.result.TextTestResult run=0 errors=0 failures=0

What should I do instead?
Thanks,
Luca
Re: [Numpy-discussion] np.any and np.all short-circuiting
Thank you both for your help!

$ python setup.py build_src --inplace build_ext --inplace

I'll give it a try.
Re: [Numpy-discussion] np.any and np.all short-circuiting
I am sorry. I followed your suggestion: I re-checked out the svn folder and then compiled with

$ python setup.py build_src --inplace build_ext --inplace

but I get the same behaviour. If I am inside the tree I get the NameError; if I am outside and use path.insert, it successfully performs zero tests. I have tried numpy-1.3 with sys.path.insert and it works. I re-implemented the same patch for 1.3 and it passes all 2030 tests: http://projects.scipy.org/numpy/ticket/1237 I think the speed improvement is impressive. Thanks. I still wonder why I am unable to make the tests work with the svn version.
Best,
Luca
Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?
> numpy.may_share_memory() should be pretty cheap. It's just arithmetic.

True, but it is in Python, so not something that should go in construct_arrays of ufunc_object.c, I suppose. But the same approach can probably be translated to C. I can try, if we decide http://projects.scipy.org/numpy/ticket/1085 is worth fixing. Let me know.
Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?
My vote (if I am entitled to one) goes to changing the code. Whether or not the addressee of .base is an array, it should be the object that has to be kept alive so that the data does not get deallocated, rather than one object which will keep alive another object, which will keep alive another object, ..., which will keep alive the object with the data. On creation of a new view B of object A: if A has OWNDATA true then B.base = A, else B.base = A.base.

When working on http://projects.scipy.org/numpy/ticket/1085 I had to walk the chain of bases to establish whether any of the inputs and the outputs were views of the same data. If base were the ultimate base, one would only need to check whether any of the inputs has the same base as any of the outputs.

I tried to modify the code to change the behaviour. I have opened a ticket for this, http://projects.scipy.org/numpy/ticket/1232, and attached a patch, but I am not 100% sure. I changed PyArray_View in convert.c and a few places in mapping.c and sequence.c. But if there is any reason why the current behaviour should be kept, just ignore the ticket.
Luca
[Numpy-discussion] is ndarray.base the closest base or the ultimate base?
Hello, I cannot quite understand whether ndarray.base is the closest base, the one from which the view was made, or the ultimate base, the one actually containing the data. I think the documentation and the actual behaviour mismatch.

In [1]: import numpy as np
In [2]: x = np.arange(12)
In [3]: y = x[::2]
In [4]: z = y[2:]
In [5]: x.flags
Out[5]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
In [6]: y.flags
Out[6]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
In [7]: z.flags
Out[7]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
In [8]: z.base
Out[8]: array([ 0, 2, 4, 6, 8, 10])

It looks like the base of z is y, i.e. its closest base, the array from which the view z was created. But the documentation says:

  base : ndarray
    If the array is a view on another array, that array is its `base`
    (unless that array is also a view). The `base` array is where the
    array data is ultimately stored.

and it looks like the base should be x, the array where the data is ultimately stored. I like the second one better. First, because this way I do not have to travel all the bases until I find an array with OWNDATA set. Second, because the current implementation keeps y alive because of z, while in the end z only needs x:

In [11]: del y
In [12]: z.base
Out[12]: array([ 0, 2, 4, 6, 8, 10])

Comments?
Best,
Luca
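Walking the chain by hand, a sketch of what the closest-base behaviour requires:

import numpy as np

def ultimate_base(a):
    # follow .base until reaching an array that owns its data
    while isinstance(a.base, np.ndarray):
        a = a.base
    return a

x = np.arange(12)
z = x[::2][2:]
ultimate_base(z) is x   # True, with the closest-base behaviour shown above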
Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?
Thanks for your quick answer. Is there a reason for that? Am I wrong, or does it only make life harder, as in:

while (PyArray_Check(base) && !PyArray_CHKFLAGS(base, NPY_OWNDATA)) {
    base = PyArray_BASE(base);
}

plus the possible error you underlined, plus the fact that this keeps a chain of zombies alive without reason. Are there cases where the current behaviour is useful?
Best,
Luca
Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?
I think you do not need to do the chain-up walk on view creation. If the assumption is that base is the ultimate base, on view creation you can do something like (pseudo-code):

view.base = parent if parent.owndata else parent.base
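The same rule as a runnable Python sketch (numpy would apply it in C at view-creation time; base_for_new_view is a hypothetical helper name):

import numpy as np

def base_for_new_view(parent):
    # one step suffices if every existing view already points at its ultimate base
    return parent if parent.flags.owndata else parent.base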
Re: [Numpy-discussion] Numpy large array bug
I can confirm this bug for the last svn. Also:

>>> a.put([2*1024*1024*1024 + 100,], 8)
IndexError: index out of range for array

In this case, I think the error is that in numpy/core/src/multiarray/item_selection.c, in PyArray_PutTo, line 209 should be:

    intp i, chunk, ni, max_item, nv, tmp;

instead of:

    int i, chunk, ni, max_item, nv, tmp;

Fixing it as suggested:

>>> a.put([2*1024*1024*1024 + 100,], 8)
>>> a.max()
8
Re: [Numpy-discussion] Numpy large array bug
I think the original bug is due to line 535 of numpy/core/src/multiarray/ctors.c (svn), which should be:

    intp numcopies, nbytes;

instead of:

    int numcopies, nbytes;

To sum up: in line 535 of numpy/core/src/multiarray/ctors.c and in line 209 of numpy/core/src/multiarray/item_selection.c, int should be replaced with intp.
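A pure-Python illustration of why the 32-bit int matters here (simulating the C truncation):

idx = 2*1024*1024*1024 + 100             # 2147483748, just above 2**31 - 1
wrapped = (idx + 2**31) % 2**32 - 2**31  # value after storing it in a 32-bit signed int
print(wrapped)                           # -2147483548: negative, hence the bogus IndexError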
Re: [Numpy-discussion] Numpy large array bug
Here it is... http://projects.scipy.org/numpy/ticket/1229
Re: [Numpy-discussion] Create 2D array from EXISTING 1D array
If I get your question correctly, np.tile could be what you need.
Re: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1')
Hi, with the standard ubuntu version (1.2.1), I get

>>> a=np.zeros(0x80,dtype='b1') # OK
>>> a=np.zeros(0x80000000,dtype='b1')
TypeError: data type not understood
>>> a=np.zeros(0x80000000,dtype=bool)
ValueError: negative dimensions are not allowed

while with 1.4.0.dev7375, I get

>>> a=np.zeros(0x80000000,dtype='b1') # (both 'b1' and bool give the same result)
ValueError: Maximum allowed dimension exceeded

I think it might have something to do with the fact that on a 32-bit machine (like mine, what about yours?) the default int is 32 bit. An int32 with value 0x80000000 has the sign bit set and is actually -2**31. In fact with a different machine (64-bit ubuntu, numpy 1.2.1), it works:

>>> a=np.zeros(0x80000000,dtype='b1') # OK

In any case, with a 32-bit machine you could not address such a big array anyway.
Best,
Luca
Re: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1')
I just realized that Sebastian posted his 'uname -a' and he has a 64-bit machine. In this case it should work as mine (the 64-bit one) does. Maybe during the compilation some flags prevented fully 64-bit code from being compiled?
Re: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1')
Python shouldn't be the problem here. Even on a 32-bit machine:

>>> a = 0x80000000
>>> a
2147483648L
>>> a = np.zeros(a, dtype=bool)
ValueError: negative dimensions are not allowed
Re: [Numpy-discussion] Adding a 2D with a 1D array...
Hi Ruben,
In both cases, as the print statement shows, offspr is already created.

offspr[...] = r + a[:, None]

means "fill the existing object pointed to by offspr with r + a[:, None]", while

offspr = r + a[:,None]

means "create a new array and assign it to the variable offspr (after decref-ing the object previously pointed to by offspr)".
Best,
Luca
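A short demonstration of the difference between the two assignments:

import numpy as np
offspr = np.zeros((2, 3))
alias = offspr
offspr[...] = 1.0          # fills the existing buffer: alias sees the change
offspr = np.ones((2, 3))   # rebinds the name: alias still holds the old buffer
alias is offspr            # False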
Re: [Numpy-discussion] Fwd: GPU Numpy
Hi Sturla,

> The proper way to speed up dot(a*b+c*sqrt(d), e) is to get rid of temporary intermediates.

I implemented a patch, http://projects.scipy.org/numpy/ticket/1153, that reduces the number of temporary intermediates (in your example, from 4 to 2). There is a big improvement in terms of memory footprint, and some improvement in terms of speed (especially for large matrices), but not as much as I expected. In your example

result = 0
for i in range(n):
    result += (a[i]*b[i] + c[i]*sqrt(d[i])) * e[i]

another big speedup could come from the fact that it makes better use of the cache. That is exactly why numexpr is faster in these cases. I hope one day numpy will be able to perform such optimizations.
Best,
Luca
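For comparison, the numexpr version of the same computation (a sketch; requires the numexpr package, which evaluates the whole expression blockwise to stay in cache):

import numpy as np
import numexpr as ne
n = 10**6
a, b, c, d, e = (np.random.rand(n) for _ in range(5))
r = ne.evaluate("sum((a*b + c*sqrt(d)) * e)")   # no full-size temporaries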
Re: [Numpy-discussion] Adding a 2D with a 1D array...
Hi Ruben,
One-dimensional arrays can be thought of as rows. If you want a column, you need to append a dimension:

d = a + b[:,None]

which is equivalent to

d = a + b[:,np.newaxis]

Best,
Luca
Re: [Numpy-discussion] Adding a 2D with a 1D array...
I am sorry but it doesn't make much sense. How do you measure the performance? Are you sure you include the creation of the c output array in the time spent (which is outside the for loop but should be considered anyway)? Here are my results...

In [84]: a = np.random.rand(8,26)
In [85]: b = np.random.rand(8)
In [86]: def o(a,b):
   ....:     c = np.empty_like(a)
   ....:     for i in range(len(a)):
   ....:         c[i] = a[i] + b[i]
   ....:     return c
   ....:
In [87]: d = a + b[:,None]
In [88]: (d == o(a,b)).all()
Out[88]: True
In [89]: %timeit o(a,b)
1 loops, best of 3: 36.8 µs per loop
In [90]: %timeit d = a + b[:,None]
10 loops, best of 3: 5.17 µs per loop
In [91]: a = np.random.rand(8,26)
In [92]: b = np.random.rand(8)
In [93]: %timeit o(a,b)
10 loops, best of 3: 287 ms per loop
In [94]: %timeit d = a + b[:,None]
100 loops, best of 3: 15.4 ms per loop
Re: [Numpy-discussion] A faster median (Wirth's method)
Hello Sturla,
I had a quick look at your code. Looks fine. A few notes... In select you should replace numpy with np. In _median, how can you, if n==2, use s[] if s is not defined? What if n==1? Also, I think that when returning an empty array, it should be of the same dtype you would get in the other cases. You could replace _median with the following.
Best,
Luca

def _median(x, inplace):
    assert(x.ndim == 1)
    n = x.shape[0]
    if n > 2:
        k = n >> 1
        s = select(x, k, inplace=inplace)
        if n & 1:
            return s[k]
        else:
            return 0.5*(s[k]+s[:k].max())
    elif n == 0:
        return np.empty(0, dtype=x.dtype)
    elif n == 2:
        return 0.5*(x[0]+x[1])
    else: # n == 1
        return x[0]
Re: [Numpy-discussion] How to concatenate two arrays without duplicating memory?
As Gaël pointed out, you cannot create A, B and then C as the concatenation of A and B without duplicating the vectors.

> I am looking for a way to have a non-contiguous array C in which the left (1, 2000) elements point to A and the right (1, 4000) elements point to B.

But you can still re-link A to the left elements and B to the right ones afterwards by using views into C:

C = numpy.concatenate((A, B), axis=1)
A, B = C[:,:2000], C[:,2000:]

Best,
Luca
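In full, with hypothetical shapes matching the ones above:

import numpy
A = numpy.random.rand(1, 2000)
B = numpy.random.rand(1, 4000)
C = numpy.concatenate((A, B), axis=1)   # the one unavoidable copy
A, B = C[:, :2000], C[:, 2000:]         # now views into C's memory
A[0, 0] = 7.0
C[0, 0] == 7.0                          # True: A shares C's data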
[Numpy-discussion] np.bitwise_and.identity
Hello, I know I am splitting hairs, but should not np.bitwise_and.identity be -1 instead of 1? I mean, something with all the bits set? I am checking whether all elements of a vector 'v' have a certain bit 'b' set:

if np.bitwise_and.reduce(v) & (1 << b):
    # do something

If v is empty, the expression is true for b==0 and false otherwise. In fact np.bitwise_and.identity is 1. I like being able to use np.bitwise_and.reduce because it is many times faster than (v & (1 << b)).all() (it does not create the temporary vector). Of course there are workarounds, but I was wondering if there is a reason for this behaviour.
Best,
Luca
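A small example of the check described above, including the empty-vector corner case:

import numpy as np
v = np.array([0b0110, 0b0010, 0b1010], dtype=np.uint8)
b = 1
bool(np.bitwise_and.reduce(v) & (1 << b))            # True: bit 1 is set in every element
np.bitwise_and.reduce(np.array([], dtype=np.uint8))  # returns the identity, 1, not ~0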
Re: [Numpy-discussion] Fastest way to parsing a specific binay file
If I understand the problem... if you are 100% sure that ', ' only occurs between fields and never within, you can use the 'split' method of the string, which could be faster than a regexp in this simple case.
Re: [Numpy-discussion] np.bitwise_and.identity
Thank you, Robert, for the quick reply. I just saw the line

#define PyUFunc_None -1

in the ufuncobject.h file. It is always the same story: you choose a sentinel thinking it doesn't conflict with any possible value, and you later find there is one such case. As said, it is not a big deal; I wouldn't spend time on it.
Best,
Luca
Re: [Numpy-discussion] numpy core dump on linux
I experience the same problem. A few more additional test cases:

In [1]: import numpy
In [2]: numpy.lexsort([numpy.arange(5)[::-1].copy(), numpy.arange(5)])
Out[2]: array([0, 1, 2, 3, 4])
In [3]: numpy.lexsort([numpy.arange(5)[::-1].copy(), numpy.arange(5.)])
Out[3]: array([0, 1, 2, 3, 4])
In [4]: numpy.lexsort([numpy.arange(5), numpy.arange(5)])
Out[4]: array([0, 1, 2, 3, 4])
In [5]: numpy.lexsort([numpy.arange(5), numpy.arange(5.)])
Out[5]: array([0, 1, 2, 3, 4])
In [6]: numpy.lexsort([numpy.arange(5)[::-1], numpy.arange(5)])
Out[6]: array([0, 1, 2, 3, 4])
In [7]: numpy.lexsort([numpy.arange(5)[::-1], numpy.arange(5.)])
*** glibc detected *** /usr/bin/python: free(): invalid next size (fast): 0x09be6eb8 ***

It looks like the problem occurs when the first array is reversed and the second is float. I am not familiar with gdb. If I run gdb python, run it, and give the commands above, it hangs at the glibc line without returning to gdb unless I hit CTRL-C. In this case, I guess, the backtrace I get is related to the CTRL-C rather than the error. Any hint on how to obtain useful information from gdb?
Best,
Luca
Re: [Numpy-discussion] genfromtext advice
I have tried

$ awk -F '|' '{if(NF != 12) print NR;}' /tmp/pp.txt

and besides the first 23 lines and the last 3 lines of the file, the following also have a number of '|' different from 11:

1635
2851
5538

i.e. BIKIN, BENGUERIR and TERESINA AIRPORT. But I completely agree with you: genfromtxt could print out the line number and the actual line giving problems.
Re: [Numpy-discussion] genfromtext advice
import sys
f = open(sys.argv[1], 'rt')
for l in f:
    if len(l.split('|')) != 12:
        print(l)
Re: [Numpy-discussion] Efficiently defining a multidimensional array
One solution I can think of still requires one loop (instead of three):

import numpy as np
a = np.arange(12).reshape(3,4)
b = np.arange(15).reshape(3,5)
z = np.empty(a.shape + (b.shape[-1],))
for i in range(len(z)):
    z[i] = np.add.outer(a[i], b[i])
Re: [Numpy-discussion] Efficiently defining a multidimensional array
Or:

a[:,:,None] + b[:,None,:]
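Checking that the broadcasting one-liner matches the loop version:

import numpy as np
a = np.arange(12).reshape(3, 4)
b = np.arange(15).reshape(3, 5)
z1 = a[:, :, None] + b[:, None, :]
z2 = np.empty(a.shape + (b.shape[-1],))
for i in range(len(z2)):
    z2[i] = np.add.outer(a[i], b[i])
(z1 == z2).all()   # True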
Re: [Numpy-discussion] why does b[:-0] not work, and is there an elegant solution?
The problem is that n is an integer, and integers do not have different representations for 0 and -0 (while floats do). Therefore it is impossible to disambiguate the following two scenarios:

b[:n]   # take the first n
b[:-n]  # take all but the last n

when n == 0. One possible solution (you decide whether it is elegant :-D):

b[:len(b)-n]
Re: [Numpy-discussion] why does b[:-0] not work, and is there an elegant solution?
Another solution (elegant?? readable??):

x[slice(-n or None)]  # with n == 0, 1, ...
Re: [Numpy-discussion] is there a better way to do this array repeat?
As you stress on "repeat the array ..." rather than "repeat each element", you may want to consider tile as well:

>>> np.tile(a, [10,1])
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])
Re: [Numpy-discussion] saving incrementally numpy arrays
You can do something a bit tricky but possibly working. I made the assumption of a C-ordered 1-d vector.

import numpy as np
import numpy.lib.format as fmt

# example of chunks
chunks = [np.arange(l) for l in range(5,10)]

# at the beginning
fp = open('myfile.npy', 'wb')
d = dict(
    descr=fmt.dtype_to_descr(chunks[0].dtype),
    fortran_order=False,
    shape=(2**30,),  # some big shape you think you'll never reach
)
fp.write(fmt.magic(1,0))
fmt.write_array_header_1_0(fp, d)
h_len = fp.tell()
l = 0

# ... for each chunk ...
for chunk in chunks:
    l += len(chunk)
    fp.write(chunk.tostring('C'))

# finally
fp.seek(0,0)
fp.write(fmt.magic(1,0))
d['shape'] = (l,)
fmt.write_array_header_1_0(fp, d)
fp.write(' ' * (h_len - fp.tell() - 1))
fp.close()
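To check the result afterwards (using the chunks defined above):

import numpy as np
data = np.load('myfile.npy')
assert len(data) == sum(len(c) for c in chunks)
assert (data == np.concatenate(chunks)).all()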
Re: [Numpy-discussion] Not enough storage for memmap on 32 bit WinXP for accumulated file size above approx. 1 GB
Hello! I have access to both a 32-bit and a 64-bit linux machine. I had to change your code (appended) because I got an error about not being able to create a mmap larger than the file. Here are the results...

On the 32-bit machine:

lc...@xps2:~$ python /tmp/ppp.py
Created 1 writeable mmaps containing 100 MB
Created 2 writeable mmaps containing 200 MB
Created 3 writeable mmaps containing 300 MB
Created 4 writeable mmaps containing 400 MB
Created 5 writeable mmaps containing 500 MB
[..]
Created 24 writeable mmaps containing 2400 MB
Created 25 writeable mmaps containing 2500 MB
Created 26 writeable mmaps containing 2600 MB
Created 27 writeable mmaps containing 2700 MB
Created 28 writeable mmaps containing 2800 MB
Created 29 writeable mmaps containing 2900 MB
Created 30 writeable mmaps containing 3000 MB
Removing mmaps...
Done...
Traceback (most recent call last):
  File "/tmp/ppp.py", line 19, in <module>
    mm = mmap.mmap(f.fileno(), 0)
mmap.error: [Errno 12] Cannot allocate memory

On the 64-bit machine I can create 510 mmaps both with bytes_per_mmap at 100 MiB and 1 GiB:

Created 1 writeable mmaps containing 1000 MB
Created 2 writeable mmaps containing 2000 MB
Created 3 writeable mmaps containing 3000 MB
Created 4 writeable mmaps containing 4000 MB
Created 5 writeable mmaps containing 5000 MB
Created 6 writeable mmaps containing 6000 MB
[..]
Created 501 writeable mmaps containing 501000 MB
Created 502 writeable mmaps containing 502000 MB
Created 503 writeable mmaps containing 503000 MB
Created 504 writeable mmaps containing 504000 MB
Created 505 writeable mmaps containing 505000 MB
Created 506 writeable mmaps containing 506000 MB
Created 507 writeable mmaps containing 507000 MB
Created 508 writeable mmaps containing 508000 MB
Created 509 writeable mmaps containing 509000 MB
Created 510 writeable mmaps containing 510000 MB
Removing mmaps...
Done...
Traceback (most recent call last):
  File "/tmp/ppp.py", line 19, in <module>
    mm = mmap.mmap(f.fileno(), 0)
mmap.error: [Errno 24] Too many open files

I do not even have 510 GiB free on the disk, but I think that is because the ext3 filesystem allows sparse files. I think this shows that the maximum mapped space cannot be more than the maximum address space, which is 2**64 for 64-bit machines, 2 GiB for windows32 and 3 GiB for linux32. Under WindowsXP, you can try to increase it from 2 GiB to 3 GiB using the /3GB switch in boot.ini.
Best,
Luca

### CODE ###
import itertools
import mmap
import os

files = []
mmaps = []
file_names = []
mmap_cap = 0
bytes_per_mmap = 100 * 1024 ** 2
try:
    for i in itertools.count(1):
        file_name = "/home/lciti/%d.tst" % i
        file_names.append(file_name)
        f = open(file_name, "w+b")
        files.append(f)
        f.seek(bytes_per_mmap)
        f.write('a')
        f.seek(0)
        mm = mmap.mmap(f.fileno(), 0)
        mmaps.append(mm)
        mmap_cap += bytes_per_mmap
        print "Created %d writeable mmaps containing %d MB" % (i, mmap_cap/(1024**2))
# Clean up
finally:
    print "Removing mmaps..."
    for mm, f, file_name in zip(mmaps, files, file_names):
        mm.close()
        f.close()
        os.remove(file_name)
    print "Done..."
Re: [Numpy-discussion] Comparing the precision of dtypes?
Hi Hans,
You can follow this thread on approximately the same topic:
http://news.gmane.org/find-root.php?group=gmane.comp.python.numeric.general&article=31551
It should have been fixed in http://projects.scipy.org/numpy/changeset/7133
Best,
Luca
Re: [Numpy-discussion] Overloading numpy's ufuncs for better typecoercion?
Hi Hans!

> Ideally, I'd like numpy to be fixed

What do you mean by "fixed"? Are you referring to Out[2] or Out[3]?

In [1]: a = numpy.array([200], numpy.uint8)
In [2]: a + a
Out[2]: array([144], dtype=uint8)

Please do not "fix" this: that IS the correct output. What should numpy do? Promote every sum of arrays of uint8 to uint16? Or perform the operation as uint16 and cast it back to uint8 only if all elements are less than 256, therefore having the same operation on arrays of the same type return an unpredictable data type? I think the answer is simple: a = numpy.array([200]) if you want an integer, a = numpy.array([200.]) if you want a float. These are pretty safe for reasonable inputs. Whenever the user writes a = numpy.array([200], dtype=...), it means he/she knows what they are doing.

If instead you refer to

In [3]: numpy.add(a, a, numpy.empty((1, ), dtype = numpy.uint32))
Out[3]: array([144], dtype=uint32)

then in this case I agree with you: the expected result should be 400. The inputs could be cast to the output type before performing the operation. I do not think performing the operations with the output dtype would break something, even in borderline cases like the following:

b = numpy.array([400], numpy.int16)
c = numpy.array([200], numpy.int16)
numpy.subtract(b.astype(numpy.int8), c.astype(numpy.int8), numpy.empty((1, ), dtype = numpy.int8))
array([100], dtype=int8)

Best,
Luca
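The two behaviours side by side; the second line sketches the cast-the-inputs-first semantics argued for above:

import numpy as np
a = np.array([200], np.uint8)
a + a                                       # array([144], dtype=uint8): wraps, as it should
a.astype(np.uint32) + a.astype(np.uint32)   # array([400], dtype=uint32)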
Re: [Numpy-discussion] Overloading numpy's ufuncs for better type coercion?
Hello Hans,

> Although it should be noted that in C/C++, the result of uint8+uint8 is int.

But C/C++ works with scalars and often temporary results are kept in registers. On the contrary, numpy works with arrays. We cannot expect (a+b)*c to grow from uint8 to uint16 and then uint32 :-D

> For example, the conversion would typically have to be performed only once (after loading), no? Then, why not simplify things further by adding a dtype= parameter to importImage()? This could even default to float32 then.

I think this is the cleanest, easiest, best way to go. Also, Matlab's imread reads the image in its own format (e.g. uint8) and an explicit conversion is needed (e.g. double()), otherwise all the operations are performed in uint8. Even i = imread('...'); i = i + 34.5 does not trigger an upcast. The only difference is that Matlab saturates to 0 and 255 instead of wrapping.
Best,
Luca
Re: [Numpy-discussion] Getting 95%/99% margin of ndarray
You can do it by hand by sorting the array and taking the corresponding elements, or you can use scipy.stats.scoreatpercentile, which also interpolates.
Best,
Luca
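For example, for the central 95% (assuming scipy is available):

import numpy as np
from scipy.stats import scoreatpercentile
a = np.random.rand(1000)
lo = scoreatpercentile(a, 2.5)
hi = scoreatpercentile(a, 97.5)   # (lo, hi) bounds the central 95% of the values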
Re: [Numpy-discussion] Getting 95%/99% margin of ndarray
I am afraid I misunderstand your question, because I do not get the results you expected.

def pdyn(a, p):
    a = np.sort(a)
    n = round((1-p) * len(a))
    return a[int((n+1)/2)], a[len(a)-1-int(n/2)]  # a[-int(n/2)] would not work if n=1

>>> pdyn([0, 0, 0, 0, 1, 2, 3, 4, 5, 2000], 1)
(0, 2000)
>>> pdyn([0, 0, 0, 0, 1, 2, 3, 4, 5, 2000], .9)
(0, 2000)
>>> pdyn([0, 0, 0, 0, 1, 2, 3, 4, 5, 2000], .5)
(0, 4)
>>> pdyn([0, 0, 0, 0, 1, 2, 3, 4, 5, 2000], .3)
(1, 3)

If you have the array [0 0 0 0 1 2 3 4 5 2000], why should p = 0.5 -> (1, 5)? I mean, 10*.5 is 5, so you throw away 5 elements between the upper and lower tails (either 3+2 or 2+3) and you should end up with either (0, 4) or (0, 3), so why (1, 5)?
Best,
Luca
Re: [Numpy-discussion] My identity
Just my 2 cents. It is duplicated code, but it is only 3 lines. identity does not need to handle rectangular matrices and non-principal diagonals, therefore it can be reasonably faster (especially for small matrices, I guess).
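Roughly what the specialised version amounts to (a sketch, not numpy's actual source):

import numpy as np

def identity_sketch(n, dtype=float):
    a = np.zeros((n, n), dtype=dtype)
    a.flat[::n+1] = 1   # main diagonal only: no rectangular or offset handling
    return a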
Re: [Numpy-discussion] alternative mechanism for initializing an array
Hello, all these alternative mechanisms for initializing arrays risk breaking current code, don't they? One would then need to specify the data type with a keyword argument, while with the current implementation the second argument is the data type, regardless of whether or not it is given with the dtype keyword:

>>> np.array(['AB','S4'])
array(['AB', 'S4'], dtype='|S2')
>>> np.array('AB','S4')
array('AB', dtype='|S4')

Best,
Luca
Re: [Numpy-discussion] Optimizing reduction loops (sum(), prod(), et al.)
Hi Pauli,
On my PC I have tried this and some of the regressions disappear; maybe you can give it a try. In its present state it is compiler- and architecture-dependent, therefore not the best choice. But it may be worth trying.
Best,
Luca

/* My additions are unindented */

/*
 * Vectorized reduction along an axis
 *
 * Evaluating the inner loop in smaller blocks interleaved with the
 * reduction loop aims to avoid cache misses in the loop->ret array.
 */
{

typedef unsigned long long ticks;
__inline__ ticks getticks(void)
{
    unsigned a, d;
    /* return clock(); */
    /* asm("cpuid"); */
    asm volatile("rdtsc" : "=a" (a), "=d" (d));
    return (((ticks)a) | (((ticks)d) << 32));
}
npy_intp new_block_size;
ticks t0, t1;
int delta = 8;
int speed, speed_p;

/*t0 = getticks();
t0 = getticks();*/
t0 = getticks();
speed_p = 0.;
block_size = 2 + (loop->bufsize / loop->outsize / 2);
new_block_size = block_size;
/*printf("was %d", block_size);*/

    for (k = 0; k < loop->size; k += block_size) {
        char *bufptr[3];

block_size = new_block_size;
/*printf(" then %d (speed_p %d)", block_size, speed_p);*/

        bufptr[0] = loop->bufptr[0] + k * loop->steps[0];
        bufptr[1] = loop->bufptr[1] + k * loop->steps[1];
        bufptr[2] = loop->bufptr[2] + k * loop->steps[2];
        if (k + block_size > loop->size) {
            block_size = loop->size - k;
        }
        for (i = i0; i <= loop->N; ++i) {
            bufptr[1] += loop->instrides;
            loop->function((char **)bufptr, block_size, loop->steps, loop->funcdata);
            UFUNC_CHECK_ERROR(loop);
        }

t1 = getticks();
speed = (block_size << 12) / (t1 - t0);
if (speed < speed_p) delta = -delta;
new_block_size = (1 + ((block_size * (128 + delta)) >> 10)) << 3;
speed_p = speed;
t0 = t1;

    }
/*printf(" is %d (speed_p %d)\n", block_size, speed_p);*/
}
PyArray_ITER_NEXT(loop->it);
PyArray_ITER_NEXT(loop->rit);
}
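The adaptation rule in the patch, restated in Python for readability (as reconstructed above; delta flips sign whenever the measured speed drops, probing multiplicatively around the current block size):

def next_block_size(block_size, speed, speed_prev, delta):
    if speed < speed_prev:
        delta = -delta
    # scale by roughly (128 +/- 8)/128, quantised to a multiple of 8
    new = (1 + ((block_size * (128 + delta)) >> 10)) << 3
    return new, delta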
Re: [Numpy-discussion] find_common_type broken?
Hi,
I am not very confident with types, but I will try to give you my opinion. As for the first part of the question:

>>> np.find_common_type([np.ndarray, np.ma.MaskedArray, np.recarray], [])
dtype('object')

I think that the first argument of np.find_common_type should be the type of an _element_ of the array, not the type of the array. In your case you are asking np.find_common_type the common type between an array of arrays, an array of masked arrays, and an array of record arrays. Therefore the best thing np can do is to return object as the common type. Correctly:

>>> np.find_common_type([np.float64, np.int32], [])
dtype('float64')

As for the second part of the question: np.find_common_type internally uses np.dtype(t) for each type in input. While the comparison between types works as expected:

>>> np.complex128 > np.complex64 > np.float64 > np.float32 > np.int64 > np.int32
True

the comparison between dtype(t) gives different results:

>>> np.dtype(np.float64) > np.dtype(np.int64)
True
>>> np.dtype(np.float32) > np.dtype(np.int64)
False
>>> np.dtype(np.float32) > np.dtype(np.int32)
False
>>> np.dtype(np.float32) > np.dtype(np.int16)
True

At first I thought the comparison was made based on the number of bits in the mantissa or the highest integer N for which N-1 was still representable. But then I could not explain the first result. What is surprising is that

>>> np.dtype(np.float32) > np.dtype(np.int32)
False
>>> np.dtype(np.float32) < np.dtype(np.int32)
False
>>> np.dtype(np.float32) == np.dtype(np.int32)
False

therefore the max() function in np.find_common_type cannot tell which to return, and returns the first. In fact:

>>> np.find_common_type([], [np.int64, np.float32])
dtype('int64')

but

>>> np.find_common_type([], [np.float32, np.int64])
dtype('float32')

which is unexpected.
Best,
Luca
Re: [Numpy-discussion] find_common_type broken?
> That is what I thought at first, but then what is the difference between array_types and scalar_types? Function signature is: find_common_type(array_types, scalar_types)

As I understand it, the difference is that in the following case:

np.choose(range(5), [np.arange(1,6), np.zeros(5, dtype=np.uint8), 1j*np.arange(5), 22, 1.5])

one should call:

find_common_type([np.int64, np.uint8, np.complex128], [int, float])

I had a look at the code and it looks like dtype1 < dtype2 if dtype1 can safely be broadcast to dtype2. As this is not the case, in either direction, for int32 and float32, then neither dtype(int32) < dtype(float32) nor dtype(int32) > dtype(float32), and this causes the problem you highlighted. I think in this case find_common_type should return float64. The same problem arises with:

>>> np.find_common_type([np.int8, np.uint8], [])
dtype('int8')
>>> np.find_common_type([np.uint8, np.int8], [])
dtype('uint8')

Here too, I think find_common_type should return a third type, the smallest to which both can be safely broadcast: int16.
Best,
Luca
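Later numpy versions expose exactly this expectation as np.promote_types (added in numpy 1.6, after this thread; shown only to illustrate the intent):

import numpy as np
np.promote_types(np.int32, np.float32)   # dtype('float64')
np.promote_types(np.uint8, np.int8)      # dtype('int16')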
Re: [Numpy-discussion] speed up np.diag
I have submitted Ticket #1167 with a patch to speed up diag and eye. On average the code is 3 times faster (but up to 9!).
[Numpy-discussion] speed up np.diag
Hello, I happened to have a look at the code for np.diag and found it more complicated than necessary. I think it can be rewritten more cleanly and efficiently. Appended you can find both versions. The speed improvement is significant:

In [145]: x = S.rand(1000,1300)
In [146]: assert all(diag(x,-13) == diag2(x,-13))
In [147]: %timeit diag(x,-13)
1 loops, best of 3: 37.4 µs per loop
In [148]: %timeit diag2(x,-13)
10 loops, best of 3: 14.1 µs per loop
In [149]: x = x.T
In [150]: assert all(diag(x,-13) == diag2(x,-13))
In [151]: %timeit diag(x,-13)
1 loops, best of 3: 81.1 µs per loop
In [152]: %timeit diag2(x,-13)
10 loops, best of 3: 14.8 µs per loop

It takes slightly more than one third of the time for the C-contiguous input and slightly more than one sixth for the F-contiguous input.
Best,
Luca

## ORIGINAL
def diag(v, k=0):
    v = asarray(v)
    s = v.shape
    if len(s)==1:
        n = s[0]+abs(k)
        res = zeros((n,n), v.dtype)
        if (k>=0):
            i = arange(0,n-k)
            fi = i+k+i*n
        else:
            i = arange(0,n+k)
            fi = i+(i-k)*n
        res.flat[fi] = v
        return res
    elif len(s)==2:
        N1,N2 = s
        if k >= 0:
            M = min(N1,N2-k)
            i = arange(0,M)
            fi = i+k+i*N2
        else:
            M = min(N1+k,N2)
            i = arange(0,M)
            fi = i + (i-k)*N2
        return v.flat[fi]
    else:
        raise ValueError, "Input must be 1- or 2-d."

## SUGGESTED
def diag(v, k=0):
    v = asarray(v)
    s = v.shape
    if len(s) == 1:
        n = s[0]+abs(k)
        res = zeros((n,n), v.dtype)
        i = k if k >= 0 else (-k) * n
        res.flat[i::n+1] = v
        return res
    elif len(s) == 2:
        if v.flags.f_contiguous:
            v, k, s = v.T, -k, s[::-1]
        i = k if k >= 0 else (-k) * s[1]
        return v.flat[i::s[1]+1]
    else:
        raise ValueError, "Input must be 1- or 2-d."
Re: [Numpy-discussion] speed up np.diag
> > if v.flags.f_contiguous:
> >     v, k, s = v.T, -k, s[::-1]
>
> Is this correct? The .flat iterator always traverses the array in virtual C-order, not in the order it's laid out in memory.

The code would work (and gives the same results) even without the two lines above, which in theory do nothing: taking the k-th diagonal of an array or the (-k)-th of its transpose should be the same. But in this case ndarray.flat works faster if the array is C-contiguous.
Best,
Luca
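The identity the transpose trick relies on can be checked directly:

import numpy as np
x = np.arange(12).reshape(3, 4)
(np.diag(x, 1) == np.diag(x.T, -1)).all()   # True: the k-th diagonal of x is the (-k)-th of x.T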
Re: [Numpy-discussion] performing operations in-place in numpy
Hello Pauli,
Excuse me if I insist: PyArray_Conjugate is not the problem. If, when using the numpy API, something like this is accepted:

obj1 = PyArray_CreateSomehowAnArray();
obj2 = PyArray_DoSomethingWithArray(obj1, ...);
obj3 = PyArray_DoSomethingElseWithArray(obj1, ...);
Py_DECREF(obj1);

then there is no way my patch is guaranteed not to break things. What happens with conjugate can happen with sin, round, sum, prod, ... The problem is that the philosophy of the patch is: "If I (function) have been called with an input with ref_count=1, that input is mine and I can do whatever I want with it. If you (caller) need it for later, be sure you let me know by incrementing its ref_count." From what I understand, this works fine from within numpy, by changing a few operations (like divmod and variance), but could break libraries using the numpy C API. I hope someone can find a way to have the cake and eat it too.
Best,
Luca
Re: [Numpy-discussion] interpolation in numpy
Hi, you can have a look at interp2d in scipy.interpolate. I think it is what you are looking for.
Best,
Luca
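A minimal usage sketch:

import numpy as np
from scipy.interpolate import interp2d
x = np.arange(5.)
y = np.arange(4.)
z = np.random.rand(4, 5)              # z[j, i] is the value at (x[i], y[j])
f = interp2d(x, y, z, kind='linear')
f(1.5, 2.5)                           # interpolated value at an off-grid point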
Re: [Numpy-discussion] performing operations in-place in numpy
Hello Gaël,
I think it might be an option. Also, one could have an internal flag which says whether or not it is safe to overwrite inputs with ref_count=1. Then import_array() sets this flag to "unsafe" (i.e. the current behaviour). If the user of the numpy C API is aware of how the new feature works, he/she can enable it by switching the flag to "safe" and act accordingly (increase refcounts before / decrease after) whenever he/she needs an array for later reuse. If possible, when imported from Python (is there a way to know it? is import_array() called anyway?) the flag could be set to "safe".
Best,
Luca
Re: [Numpy-discussion] performing operations in-place in numpy
Let me see if I understand correctly... what you suggest is something like:
1) adding an argument flag to construct_arrays that enables/disables the feature
2) adding the same argument flag to construct_loop, which is passed untouched to construct_arrays
3) setting the flag to "disable" in the construct_loop call inside PyUFunc_GenericFunction
4) writing an exact copy of PyUFunc_GenericFunction with the flag enabled
5) in ufunc_generic_call, calling the latter instead of PyUFunc_GenericFunction
Am I correct? Sounds doable to me, as long as ufunc.__call__ is not directly or indirectly in the numpy C API.
Re: [Numpy-discussion] performing operations in-place in numpy
I tried to implement what you suggest. The patch is in the ticket page.
[Numpy-discussion] performing operations in-place in numpy
Hello guys, I made a patch for numpy which allows performing operations in-place to save memory allocations. For example 'sqrt(a**2 + b**2)' can be performed allocating only two arrays instead of four. You find the details in ticket #1153 of numpy-core. I thought you might be interested. I am impatient to have some feedback from the community.
Best,
Luca
Re: [Numpy-discussion] performing operations in-place in numpy
Hi Stefan,
I am afraid I did not explain myself clearly enough. Of course c = a + b + d leaves a, b, and d unchanged. The only array that is overwritten is (a+b), which is a temporary array that would be destroyed anyway. Normally the operation above is performed like this:
1) allocation of a temporary array f
2) execution of sum(a, b, out=f)
3) allocation of a temporary array g
4) execution of sum(f, d, out=g)
5) assignment of g to c
6) deallocation of f
With my method it is performed like this:
1) allocation of a temporary array f
2) execution of sum(a, b, out=f)
3) execution of sum(f, d, out=f)
4) assignment of f to c

When I write "The approach works in most cases (it passes most of the numpy tests) but makes the assumption that ufuncs can work even if one of the inputs is used as output", I mean that the core ufunc, the atomic operation working on the single array element, must be able to work even if one of the inputs is used as output. As far as I can tell, they already work like this. Can we expect someone implementing a new ufunc to keep that in mind? Actually, with my last version of the patch, all tests are passed. If after further thorough testing this approach is found not to break things, there is no reason not to include it. It is completely transparent to the Python user. The only programmer that should be aware of it is the numpy C developer. I hope I answered your concerns. I am sorry if my description of the ticket is a bit confused.
Best,
Luca
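The same steps written out with explicit out= arguments:

import numpy as np
a, b, d = (np.random.rand(1000) for _ in range(3))
f = np.empty_like(a)
np.add(a, b, f)   # f = a + b  (the only temporary allocated)
np.add(f, d, f)   # reuse f instead of allocating a second temporary g
c = f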
Re: [Numpy-discussion] performing operations in-place in numpy
@Charles R Harris

> > For example 'sqrt(a**2 + b**2)' can be performed...
> I think this particular function is already available as a ufunc.

I am not sure it is implemented as a ufunc. But in any case it was just an example. Another example is sin(2*pi*w+phi), which is currently implemented allocating a temporary vector for (2*pi*w), another temporary vector for (2*pi*w+phi) and another for sin(2*pi*w+phi). With my patch only the first temporary vector is allocated; it is then reused twice and finally returned. One allocation instead of three. All this happens under the hood and is completely transparent to the user.
Best,
Luca

P.S. I am sorry I am messing up the threads, but I had the digest enabled and I did not know how to reply to a post.
Re: [Numpy-discussion] performing operations in-place in numpy
> One thing to keep in mind is that the inputs might be different views of the same array, so the elements might be accessed in an unexpected order.

Only inputs owning their own data and with refcount 1 (i.e. no other array can be a view of them) are re-used as outputs.
Re: [Numpy-discussion] performing operations in-place in numpy
Hello,
The problem is not PyArray_Conjugate itself. The problem is that whenever you call a function from the C side and one of the inputs has ref_count 1, it can be overwritten. This is not a problem from the Python side, because if the ufunc sees a ref_count of 1 it means that no Python object is referencing it. Let us consider a ufunc like sumdiff:

c, d = sumdiff(a, b)   # i.e. c, d = a+b, a-b

If you implement it in Python, it is fine. If it is implemented in C, it has to be implemented like this:

Py_INCREF(obj1);
Py_INCREF(obj2);
obj3 = PyArray_Sum((PyAO *)obj1, (PyAO *)obj2, NULL);
Py_DECREF(obj1);
Py_DECREF(obj2);
obj4 = PyArray_Diff((PyAO *)obj1, (PyAO *)obj2, NULL);

Without the Py_INCREF/Py_DECREF pair, if sumdiff is called with arrays not bound to any Python variable, such as sumdiff(arange(10), ones(10)), PyArray_Sum can overwrite one of the inputs and make the result of PyArray_Diff wrong. With the Py_INCREF/Py_DECREF pair, PyArray_Sum cannot overwrite its inputs (while PyArray_Diff can). Moving the Py_INCREF/Py_DECREF inside the ufuncs would make the patch useless, because no array could have ref_count 1. You are right: existing C extension modules that use the Numpy/Numeric API might give wrong results.
Best,
Luca