Re: [Numpy-discussion] NOOB Alert: Looping Through Text Files...
On 15/01/11 09:52, dstaley wrote: Warning, I am a Python noob. Not only do I not know Python, I really don't know anything about programming outside of ArcInfo and the ancient AML language. Regardless, here is my problem. Let's say I have three text files (test1.txt, test2.txt and test3.txt). Each text file has one line of text in it, "This is my text file", to which I want to add (append) a new line of text saying "I suck at Python and need help", and then save the file with a suffix attached (e.g. test1_modified.txt, test2_modified.txt, test3_modified.txt). I guess this is the equivalent of making a change in MS Word and using the Save As... command to maintain the integrity of the original file, and save the changes in a new file. But I also want to do that in a loop (this is a simplified example of something I want to do with hundreds of text files). Now, I understand how to add this line to an existing text file:

    text_file = open('test1.txt', 'a')
    text_file.write('\nI suck at Python and need help')
    text_file.close()

While this adds a line of text, it saves the change to the original file (it does not add the _modified.txt suffix to the file name), nor does it allow me to loop through all three of the files. I'm sure this is an easy thing to do, and is online in a million places. Unfortunately, I just cannot seem to find the answer. Here is my thought process. First I would define the list from which I would loop:

    textlist = ['test1.txt', 'test2.txt', 'test3.txt']
    for i in textlist:
        text_file = open(textlist, 'a')

This is your problem. You create the textlist, and then loop over the list. Now the i's are the elements of the list; however, you are passing the whole list to the open function. That is what the error says: it expects a string but found a list. The better way to do this would be:

    for filename in textlist:
        text_file = open(filename, 'a')
        ...
        text_file.write('\nI suck at Python and need help')
        text_file.close()

But, this doesn't work.
It gives me the error: "coercing to Unicode: need string or buffer, list found". So, I guess I need to do this from something other than a list? Even if it did work, it does not suit my needs, as it does not create a new file and does not allow me to add the _modified.txt suffix, which would let me keep the original file intact. There are a number of ways to do this, and what the best way is might depend on your text files. If they are very short, the easiest way is to just read the content of your text files and write the content to a different file. Something like this:

    for fn in textlist:
        fp = open(fn, 'r')
        s = fp.read()
        fp.close()
        s += '\nI suck at Python and need help'
        fp_new = open(fn[:-4] + '_modified.txt', 'w')
        fp_new.write(s)
        fp_new.close()

Just for the future: this is the numpy list, which is for discussing issues relating to the numpy Python module. The next time you might want to post a question like this to one of the Python beginners lists. This link might get you started: http://wiki.python.org/moin/BeginnersGuide Cheers Jochen From a response to a previous post, this seems as if it may have something to do with a Python dictionary, but I'm not sure. I'm probably totally off on how this should even be written, so any advice or suggestions would be greatly appreciated. Thanks in advance for your help! -DS ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
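Putting the two fixes together — looping over file names and writing to a new file — a complete sketch might look like this. The use of os.path.splitext instead of fn[:-4] is my suggestion, so the suffix handling doesn't depend on a three-letter extension; the sample files are created first so the snippet is self-contained:

```python
import os

# Create three small sample files so the sketch is self-contained
textlist = ['test1.txt', 'test2.txt', 'test3.txt']
for filename in textlist:
    with open(filename, 'w') as fp:
        fp.write('This is my text file')

# Read each original, append the new line, and save under a new name
for filename in textlist:
    with open(filename) as fp:
        contents = fp.read()
    contents += '\nI suck at Python and need help'
    root, ext = os.path.splitext(filename)  # e.g. ('test1', '.txt')
    with open(root + '_modified' + ext, 'w') as fp_new:
        fp_new.write(contents)
```

This leaves test1.txt etc. untouched and produces test1_modified.txt alongside each original.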
Re: [Numpy-discussion] Another reality check
On 07/12/2010 12:36 PM, David Goldsmith wrote: On Sun, Jul 11, 2010 at 6:18 PM, David Goldsmith d.l.goldsm...@gmail.com wrote: In numpy.fft we find the following: "Then A[1:n/2] contains the positive-frequency terms, and A[n/2+1:] contains the negative-frequency terms, in order of decreasingly negative frequency." Just want to confirm that "decreasingly negative frequency" means ..., A[n-2] = A_(-2), A[n-1] = A_(-1), as implied by our definition (attached). DG And while I have your attention :-) "For an odd number of input points, A[(n-1)/2] contains the largest positive frequency, while A[(n+1)/2] contains the largest [in absolute value] negative frequency." Are these not also termed Nyquist frequencies? If not, would it be incorrect to characterize them as the largest realizable frequencies (in the sense that the data contain no information about any higher frequencies)? DG I would find the term "the largest realizable frequency" quite confusing. "Realizable" is too ambiguous a term, IMO. It's the largest possible frequency contained in the array, so "Nyquist frequency" would be correct, IMO.
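The ordering being confirmed here can be checked directly with np.fft.fftfreq, which returns the sample frequencies in the same standard order that fft uses (a quick sketch):

```python
import numpy as np

# Standard order: zero frequency first, then positive frequencies,
# then negative frequencies from most negative up to the one closest to zero
freqs = np.fft.fftfreq(8)   # n even: index n/2 holds the (negative) Nyquist term

# For odd n, the largest positive frequency sits at index (n-1)/2 and the
# largest-in-magnitude negative frequency at (n+1)/2, as discussed above
f7 = np.fft.fftfreq(7)
```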
Re: [Numpy-discussion] numpy.fft, yet again
On 13/07/10 08:04, Eric Firing wrote: On 07/12/2010 11:43 AM, David Goldsmith wrote: From the docstring: "A[0] contains the zero-frequency term (the mean of the signal)". And yet, consistent w/ the definition given in the docstring (and included w/ an earlier email), the code gives, e.g.:

    >>> import numpy as np
    >>> x = np.ones((16,)); x
    array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
            1.,  1.,  1.,  1.])
    >>> y = np.fft.fft(x); y
    array([ 16.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,
             0.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,
             0.+0.j,  0.+0.j])

i.e., the zeroth term is the sum, not the mean (which, again, is consistent w/ the stated defining formula). So, same ol', same ol': bug in the doc (presumably) or bug in the code? Bug in the doc. Good catch. "Mean" is correct for the ifft, not for the fft. Eric I'd say that a pointer to a discussion about the normalization of FFTs would be good here. The issue is that numpy applies a normalization of 1/len(x) for the inverse fft. However, to make the FFTs unitary, fft and ifft should both be normalized by 1/sqrt(len(x)). And some FFT implementations don't do any normalization at all (FFTW). Cheers Jochen DG
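The doc fix is easy to verify numerically — the zero-frequency term of fft is the sum of the input, while the 1/n normalization lives in ifft:

```python
import numpy as np

x = np.ones(16)
y = np.fft.fft(x)
# y[0] is the sum of the input (16.0), not the mean (1.0);
# ifft applies the 1/n factor, so the round trip recovers x
roundtrip = np.fft.ifft(y)
```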
Re: [Numpy-discussion] Here's what I've done to numpy.fft
On 13/07/10 08:47, David Goldsmith wrote: In light of my various questions and the responses thereto, here's what I've done (but not yet committed) to numpy.fft. There are many ways to define the DFT, varying in the sign of the exponent, normalization, etc. In this implementation, the DFT is defined as

.. math:: A_k = \sum_{m=0}^{n-1} a_m \exp\left\{-2\pi i{mk \over n}\right\} \qquad k = 0,\ldots,n-1

where `n` is the number of input points. In general, the DFT is defined for complex inputs and outputs, and a single-frequency component at linear frequency :math:`f` is represented by a complex exponential :math:`a_m = \exp\{2\pi i\,f m\Delta t\}`, where :math:`\Delta t` is the *sampling interval*. Note that, due to the periodicity of the exponential function, formally :math:`A_{n-1} = A_{-1}, A_{n-2} = A_{-2}`, etc. That said, the values in the result are in the so-called standard order: if ``A = fft(a,n)``, then ``A[0]`` contains the zero-frequency term (the sum of the data), which is always purely real for real inputs. Then ``A[1:n/2]`` contains the positive-frequency terms, and ``A[n/2+1:]`` contains the negative-frequency (in the sense described above) terms, from least (most negative) to largest (closest to zero). In particular, for `n` even, ``A[n/2]`` represents both the positive and the negative Nyquist frequencies, and is also purely real for real input. For `n` odd, ``A[(n-1)/2]`` contains the largest positive frequency, while ``A[(n+1)/2]`` contains the largest (in absolute value) negative frequency. In both cases, i.e., `n` even or odd, ``A[n-1]`` contains the negative frequency closest to zero. Feedback welcome. DG Hi David, great work. I agree with Travis: leave the sampling out. It makes things more confusing.
I'd also suggest pointing to fftshift for converting from the standard order to an ordering from minimum to maximum frequency. Cheers Jochen
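A short sketch of the suggestion — fftshift rearranges the standard order into monotonically increasing frequency:

```python
import numpy as np

freqs = np.fft.fftfreq(8)         # standard (fft) order: 0, positive, negative
shifted = np.fft.fftshift(freqs)  # minimum frequency ... maximum frequency
```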
Re: [Numpy-discussion] Request for testing
No warning for me:

    └─(08:26 $)─ python isinf.py
    True
    └─(08:26 $)─ python2.5 isinf.py
    True

Python 2.6.4 (r264:75706, Dec 7 2009, 18:43:55) [GCC 4.4.1] on linux2
Python 2.5.4 (r254:67916, Jan 20 2010, 21:43:02) [GCC 4.4.1] on linux2
numpy.version.version '1.3.0'

    └─(08:33 $)─ uname -a
    Linux cudos0803 2.6.31-19-generic #56-Ubuntu SMP Thu Jan 28 02:39:34 UTC 2010 x86_64 GNU/Linux
    └─(08:31 $)─ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 9.10
    Release:        9.10
    Codename:       karmic

On 02/21/10 03:30, Charles R Harris wrote: Hi All, I would be much obliged if some folks would run the attached script and report the output, numpy version, and python version. It just runs np.isinf(np.inf), which raises an invalid value warning with current numpy. As far as I can see the function itself hasn't changed since numpy 1.3, yet numpy 1.3 under python 2.5 gives no such warning. Chuck

    import numpy as np
    import warnings
    warnings.simplefilter('always')
    np.seterr(invalid='print')
    print (np.isinf(np.inf))
Re: [Numpy-discussion] Assigning complex values to a real array
On 12/08/09 02:32, David Warde-Farley wrote: On 7-Dec-09, at 11:13 PM, Dr. Phillip M. Feldman wrote: Example #1: IPython 0.10 [on Py 2.5.4]

    In [1]: z = zeros(3)
    In [2]: z[0] = 1+1J
    TypeError: can't convert complex to float; use abs(z)

The problem is that you're using Python's built-in complex type, and it responds to type coercion differently than NumPy types do. Calling float() on a Python complex will raise the exception. Calling float() on (for example) a numpy.complex64 will not. Notice what happens here:

    In [14]: z = zeros(3)
    In [15]: z[0] = complex64(1+1j)
    In [16]: z[0]
    Out[16]: 1.0

Example #2:

    ### START OF CODE ###
    from numpy import *
    q = ones(2, dtype=complex)*(1 + 1J)
    r = zeros(2, dtype=float)
    r[:] = q
    print 'q = ', q
    print 'r = ', r
    ### END OF CODE ###

Here, both operands are NumPy arrays. NumPy is in complete control of the situation, and it's well documented what it will do. I do agree that the behaviour in example #1 is mildly inconsistent, but such is the way with NumPy vs. Python scalars: they are mostly transparently intermingled, except when they're not. At a minimum, this inconsistency needs to be cleared up. My preference would be that the programmer should have to explicitly downcast from complex to float, and that if he/she fails to do this, an exception be triggered. That would most likely break a *lot* of deployed code that depends on the implicit downcast behaviour. A less harmful solution (if a solution is warranted, which is for the Council of the Elders to decide) would be to treat the Python complex type as a special case, so that the .real attribute is accessed instead of trying to cast to float. I'm not sure how much code actually relies on the implicit downcast, but I'd argue that it's bad programming anyway. It is really difficult to spot when reviewing someone else's code. As others mentioned, it's also a bitch to track down a bug that has been accidentally introduced by this behaviour.
Jochen
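A minimal sketch of the explicit downcast argued for above — taking .real makes the dropped imaginary part visible in the code rather than silent:

```python
import numpy as np

q = np.ones(2, dtype=complex) * (1 + 1j)
r = np.zeros(2, dtype=float)

# Explicit downcast: a reviewer can see that the imaginary part is discarded
r[:] = q.real
```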
Re: [Numpy-discussion] strange sin/cos performance
Hi all, I see something similar on my system. OK, I've just done a test. System is Ubuntu 9.04 AMD64; there seems to be a regression for float32 with high values:

    In [47]: a=np.random.rand(1).astype(np.float32)
    In [48]: b=np.random.rand(1).astype(np.float64)
    In [49]: c=1000*np.random.rand(1).astype(np.float32)
    In [50]: d=1000*np.random.rand(1000).astype(np.float64)
    In [51]: %timeit -n 10 np.sin(a)
    10 loops, best of 3: 251 µs per loop
    In [52]: %timeit -n 10 np.sin(b)
    10 loops, best of 3: 395 µs per loop
    In [53]: %timeit -n 10 np.sin(c)
    10 loops, best of 3: 5.65 ms per loop
    In [54]: %timeit -n 10 np.sin(d)
    10 loops, best of 3: 87.7 µs per loop
    In [55]: %timeit -n 10 np.sin(c.astype(np.float64)).astype(np.float32)
    10 loops, best of 3: 891 µs per loop

Cheers Jochen On Mon, 3 Aug 2009 15:45:56 +0200 Emmanuelle Gouillart emmanuelle.gouill...@normalesup.org wrote: Hi Andrew, %timeit is an IPython magic command that uses the timeit module, see http://ipython.scipy.org/doc/stable/html/interactive/reference.html?highlight=timeit for more information about how to use it. So you were right to suppose that it is not a normal Python. However, I was not able to reproduce your observations.

    import numpy as np
    a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32)
    b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64)
    %timeit -n 10 np.sin(a)
    10 loops, best of 3: 8.67 ms per loop
    %timeit -n 10 np.sin(b)
    10 loops, best of 3: 9.29 ms per loop

Emmanuelle On Mon, Aug 03, 2009 at 09:32:57AM -0400, Andrew Friedley wrote: While working on GSoC stuff I came across this weird performance behavior for sine and cosine -- using float32 is way slower than float64. On a 2GHz Opteron:

    sin float32 1.12447786331
    sin float64 0.133481025696
    cos float32 1.14155912399
    cos float64 0.131420135498

The times are in seconds, and are best of three runs of ten iterations of numpy.{sin,cos} over a 1000-element array (script attached). I've produced similar results on a PS3 system also.
The Opteron is running Python 2.6.1 and NumPy 1.3.0, while the PS3 has Python 2.5.1 and NumPy 1.1.1. I haven't jumped into the code yet, but does anyone know why sin/cos are ~8.5x slower for 32-bit floats compared to 64-bit doubles? Side question: I see people in emails writing things like 'timeit foo(x)' and having it run some sort of standard benchmark; how exactly do I do that? Is that some environment other than a normal Python? Thanks, Andrew

    import timeit

    t = timeit.Timer("numpy.sin(a)",
        "import numpy\na = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float32)")
    print "sin float32", min(t.repeat(3, 10))
    t = timeit.Timer("numpy.sin(a)",
        "import numpy\na = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float64)")
    print "sin float64", min(t.repeat(3, 10))
    t = timeit.Timer("numpy.cos(a)",
        "import numpy\na = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float32)")
    print "cos float32", min(t.repeat(3, 10))
    t = timeit.Timer("numpy.cos(a)",
        "import numpy\na = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float64)")
    print "cos float64", min(t.repeat(3, 10))
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On 09/06/09 00:16, Gael Varoquaux wrote: On Mon, Jun 08, 2009 at 05:14:27PM -0500, Gökhan SEVER wrote: IPython's edit command works in a similar fashion, too. edit test.py The cool thing is that you can select text in the editor and execute in EPDLab. On the other hand, I know that IPython has hooks to grow this in the code base, and I would like this to grow also directly in IPython. Hell, I use vim. How cool would it be to select (using visual mode) snippets in vim, and execute them in a running IPython session. I think there's a vim script for executing the marked code in python. If IPython has already hooks for executing code in an existing session, it might be possible to adapt this script. Also I encourage everyone to have a look at pida: http://pida.co.uk/ which is a python IDE using an embedded vim (although you can embed other editors as well I think). The website looks like development has been stale, but if you look at svn there've been commits lately. Cheers Jochen Gaël
[Numpy-discussion] view takes no keyword arguments exception
Hi all, I'm trying to help someone out with some problems with pyfftw (pyfftw.berlios.de). He is seeing an exception, TypeError: view() takes no keyword arguments. This doesn't only happen when he uses pyfftw, but also when he does the following:

    >>> import numpy as np
    >>> a = np.arange(10)
    >>> print a.view(dtype='float')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: view() takes no keyword arguments

He's on Windows and sees this error both with numpy 1.1.1 and 1.3. I'm a bit lost; does anybody have an idea what could be the problem? Cheers Jochen
Re: [Numpy-discussion] view takes no keyword arguments exception
On 21/05/09 00:20, Stéfan van der Walt wrote: Hi Jochen 2009/5/20 Jochen Schroeder cycoma...@gmail.com: I'm trying to help someone out with some problems with pyfftw (pyfftw.berlios.de). He is seeing an exception, TypeError: view() takes no keyword arguments. This doesn't only happen when he uses pyfftw, but also when he does the following:

    >>> import numpy as np
    >>> a = np.arange(10)
    >>> print a.view(dtype='float')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: view() takes no keyword arguments

He's on Windows and sees this error both with numpy 1.1.1 and 1.3. I'm a bit lost; does anybody have an idea what could be the problem? In the older versions of numpy, a.view(float) should work (float is preferable to 'float' as well), but I would guess that you are really looking for a.astype(float). Sorry, maybe I phrased my question wrongly. I don't want to change the code (this was just a short example). I just want to know why it is failing on his system and what he can do so that a.view(dtype='...') works. I suspected it was an old numpy installation, but the person says he installed a new version and is still seeing the same problem (or does he just have an old version of numpy floating around?). Cheers Jochen
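For reference, the positional form of view is accepted across the NumPy versions discussed here, so it is a practical workaround when the dtype keyword is rejected (a sketch, assuming a 64-bit integer input so the byte counts line up):

```python
import numpy as np

a = np.arange(10, dtype=np.int64)
# Positional argument instead of view(dtype=...): accepted by old releases too.
# This reinterprets the existing buffer as float64 without copying.
b = a.view(np.float64)
```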
Re: [Numpy-discussion] Construct a large n-dimensional complex matrix
On 06/04/09 21:20, Robert Kern wrote: On Mon, Apr 6, 2009 at 20:53, Nathan Faggian nfagg...@unimelb.edu.au wrote: Hi, I want to construct a large complex matrix. I have the real and imaginary components as double vectors; is there a fast way to construct a complex vector in numpy?

    C = np.empty((n, m), dtype=complex)
    C.real.flat[:] = real_values
    C.imag.flat[:] = imag_values

Just a question: is there any advantage to doing C = 1.j*imag_values + real_values? Cheers Jochen -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
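Jochen's question is left open in the thread; the practical difference is allocation. The arithmetic form builds temporary arrays for 1j*imag_values and for the sum, while filling an empty array touches the output memory directly. A quick sketch with small 1-D vectors (hypothetical data) showing both give the same result:

```python
import numpy as np

real_values = np.array([1.0, 2.0, 3.0])
imag_values = np.array([4.0, 5.0, 6.0])

# Robert's approach: one output allocation, components filled in place
C = np.empty(3, dtype=complex)
C.real[:] = real_values
C.imag[:] = imag_values

# Arithmetic construction: same result, but allocates temporaries for
# 1j*imag_values and for the sum before the final array exists
D = 1j * imag_values + real_values
```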
Re: [Numpy-discussion] Optical autocorrelation calculated with numpy is slow
On Tue, Mar 31, 2009 at 8:54 PM, Jochen S cycoma...@gmail.com wrote: On Tue, Mar 31, 2009 at 7:13 AM, João Luís Silva jsi...@fc.up.pt wrote: Hi, I wrote a script to calculate the *optical* autocorrelation of an electric field. It's like the autocorrelation, but sums the fields instead of multiplying them. I'm calculating

    I(tau) = integral( abs(E(t)+E(t-tau))**2, t=-inf..inf )

An autocorrelation is just a convolution, which is a multiplication in frequency space. Thus you can do:

    FT_E = fft(E)
    FT_ac = FT_E*FT_E.conj()
    ac = fftshift(ifft(FT_ac))

where E is your field and ac is your autocorrelation. Also, what sort of autocorrelation are you talking about? For instance, SHG autocorrelation is an intensity autocorrelation, so the first line should be:

    FT_E = fft(abs(E)**2)

Sorry, I was reading over your example too quickly earlier; you're obviously using an intensity autocorrelation, so what you should be doing is:

    FT_E = fft(abs(E)**2)
    FT_ac = FT_E*FT_E.conj()
    ac = fftshift(ifft(FT_ac))

HTH Jochen with the script appended at the end. It's too slow for my purposes (takes ~5 seconds, and scales ~O(N**2)). numpy's correlate is fast enough, but isn't what I need, as it multiplies instead of adding the fields. Could you help me get this script to run faster (without having to write it in another programming language)?
Thanks, João Silva

    import numpy as np
    #import matplotlib.pyplot as plt

    n = 2**12
    n_autocorr = 3*n-2
    c = 3E2
    w0 = 2.0*np.pi*c/800.0
    t_max = 100.0
    t = np.linspace(-t_max/2.0, t_max/2.0, n)
    E = np.exp(-(t/10.0)**2)*np.exp(1j*w0*t)  # Electric field
    dt = t[1]-t[0]
    t_autocorr = np.linspace(-dt*n_autocorr/2.0, dt*n_autocorr/2.0, n_autocorr)
    E1 = np.zeros(n_autocorr, dtype=E.dtype)
    E2 = np.zeros(n_autocorr, dtype=E.dtype)
    Ac = np.zeros(n_autocorr, dtype=np.float64)
    E2[n-1:n-1+n] = E[:]
    for i in range(2*n-2):
        E1[:] = 0.0
        E1[i:i+n] = E[:]
        Ac[i] = np.sum(np.abs(E1+E2)**2)
    Ac *= dt

    #plt.plot(t_autocorr, Ac)
    #plt.show()
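Jochen's FFT-based suggestion, written out as a runnable sketch (the zero-padding to 2*n is my addition, to avoid circular wrap-around; the field definition follows the script above):

```python
import numpy as np

n = 2**12
c = 3E2
w0 = 2.0 * np.pi * c / 800.0
t = np.linspace(-50.0, 50.0, n)
E = np.exp(-(t / 10.0)**2) * np.exp(1j * w0 * t)  # electric field

# Intensity autocorrelation via the convolution theorem: O(n log n)
# instead of the O(n**2) explicit shift-and-sum loop
I = np.abs(E)**2
FT_I = np.fft.fft(I, 2 * n)   # zero-pad to avoid circular wrap-around
ac = np.fft.fftshift(np.fft.ifft(FT_I * FT_I.conj()).real)
```

The zero-lag peak ends up in the middle of ac after fftshift, mirroring the loop version's layout.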
Re: [Numpy-discussion] numpy.ctypeslib.ndpointer and the restype attribute
On 23/03/09 15:40, Sturla Molden wrote: Jens Rantil wrote: Hi all, So I have a C-function in a DLL loaded through ctypes. This particular function returns a pointer to a double. In fact I know that this pointer points to the first element in an array of, say for simplicity, 200 elements. How do I convert this pointer to a NumPy array that uses this data (i.e. no copy of data in memory)? I am able to create a numpy array using a copy of the data.

    def fromaddress(address, nbytes, dtype=numpy.double):
        class Dummy(object):
            pass
        d = Dummy()
        d.__array_interface__ = {
            'data': (address, False),
            'typestr': numpy.dtype(numpy.uint8).str,
            'descr': numpy.dtype(numpy.uint8).descr,
            'shape': (nbytes,),
            'strides': None,
            'version': 3,
        }
        return numpy.asarray(d).view(dtype=dtype)

Might I suggest that restype be removed from the documentation? It also cost me quite some time trying to get ndpointer to work with restype when I first tried it, until I finally came to the conclusion that an approach like the above is necessary and ndpointer does not work with restype. Cheers Jochen
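For the original question — wrapping a C pointer without copying — numpy also ships a helper that avoids hand-writing the __array_interface__ dict: np.ctypeslib.as_array. A sketch using a ctypes buffer to stand in for the pointer returned by the DLL:

```python
import ctypes
import numpy as np

# Stand-in for memory returned by the C function: 200 doubles
buf = (ctypes.c_double * 200)(*range(200))
ptr = ctypes.cast(buf, ctypes.POINTER(ctypes.c_double))

# Wrap the pointer in an ndarray; no copy is made, the memory is shared
arr = np.ctypeslib.as_array(ptr, shape=(200,))
arr[0] = 42.0  # writes through to the underlying buffer
```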
Re: [Numpy-discussion] numpy ndarray questions
On Tue, 2009-01-27 at 12:37 +0100, Sturla Molden wrote: On 1/27/2009 1:26 AM, Jochen wrote:

    a = fftw3.AlignedArray(1024, complex)
    a = a + 1

= used this way is not assignment, it is name binding. It is easy to use functions like fftw_malloc with NumPy:

    import ctypes
    import numpy

    fftw_malloc = ctypes.cdll.fftw.fftw_malloc
    fftw_malloc.argtypes = [ctypes.c_ulong,]
    fftw_malloc.restype = ctypes.c_ulong

    def aligned_array(N, dtype):
        address = fftw_malloc(N * numpy.dtype(dtype).itemsize)
        if address == 0:
            raise MemoryError, 'fftw_malloc returned NULL'
        class Dummy(object):
            pass
        d = Dummy()
        d.__array_interface__ = {
            'data': (address, False),
            'typestr': numpy.dtype(dtype).str,
            'descr': numpy.dtype(dtype).descr,
            'shape': (N,),
            'strides': None,
            'version': 3,
        }
        return numpy.asarray(d)

I actually do it slightly differently, because I want to free the memory using fftw_free. So I'm subclassing ndarray, and in the __new__ function of the subclass I create a buffer object from fftw_malloc using PyBuffer_FromReadWriteMemory and pass that to ndarray.__new__. In __del__ I check if .base is a buffer object and do an fftw_free. If you have to check for a particular alignment before calling fftw, that is trivial as well:

    def is_aligned(array, boundary):
        address = array.__array_interface__['data'][0]
        return not (address % boundary)

Yes, I knew that. BTW, do you know of any systems which need alignment to boundaries other than 16 bytes for SIMD operations? Is there a way that I get a different object type? Or even better, is there a way to prevent operations like a = a + 1, or make them automatically in-place operations?

    a = a + 1     # rebinds the name 'a' to another array
    a[:] = a + 1  # fills a with the result of a + 1

I knew about this one; this is what I have been doing. This has to do with Python syntax, not NumPy per se. You cannot overload the behaviour of Python's name binding operator (=).
Yes, I actually thought about it some more yesterday when going home and realised that this would really not be possible. I guess I just have to document it clearly so that people don't run into it. Cheers Jochen Sturla Molden
Re: [Numpy-discussion] numpy ndarray questions
On Tue, 2009-01-27 at 14:16 +0100, Sturla Molden wrote: On 1/27/2009 12:37 PM, Sturla Molden wrote:

    import ctypes
    import numpy

    fftw_malloc = ctypes.cdll.fftw.fftw_malloc
    fftw_malloc.argtypes = [ctypes.c_ulong,]
    fftw_malloc.restype = ctypes.c_ulong

    def aligned_array(N, dtype):
        address = fftw_malloc(N * numpy.dtype(dtype).itemsize)
        if address == 0:
            raise MemoryError, 'fftw_malloc returned NULL'
        class Dummy(object):
            pass
        d = Dummy()
        d.__array_interface__ = {
            'data': (address, False),
            'typestr': numpy.dtype(dtype).str,
            'descr': numpy.dtype(dtype).descr,
            'shape': (N,),
            'strides': None,
            'version': 3,
        }
        return numpy.asarray(d)

Or if you just want to use numpy, aligning to a 16 byte boundary can be done like this:

    def aligned_array(N, dtype):
        d = numpy.dtype(dtype)
        tmp = numpy.empty(N * d.itemsize + 16, dtype=numpy.uint8)
        address = tmp.__array_interface__['data'][0]
        offset = (16 - address % 16) % 16
        return tmp[offset:offset + N * d.itemsize].view(dtype=dtype)

S.M. Ah, I didn't think about doing it in python, cool thanks.
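Sturla's pure-numpy trick, restated as a self-contained sketch with the byte count spelled out (the boundary parameter is my generalization, prompted by Jochen's question about non-16-byte SIMD boundaries):

```python
import numpy as np

def aligned_array(n, dtype, boundary=16):
    """Allocate n items of `dtype` whose data pointer is aligned to `boundary` bytes."""
    d = np.dtype(dtype)
    # Over-allocate by one boundary's worth of bytes, then slice to alignment
    buf = np.empty(n * d.itemsize + boundary, dtype=np.uint8)
    address = buf.__array_interface__['data'][0]
    offset = (boundary - address % boundary) % boundary
    return buf[offset:offset + n * d.itemsize].view(dtype=d)

a = aligned_array(1024, np.complex128)
```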
[Numpy-discussion] numpy ndarray questions
Hi all, I just wrote ctypes bindings to fftw3 (see http://projects.scipy.org/pipermail/scipy-user/2009-January/019557.html for the post to scipy). Now I have a couple of numpy related questions: In order to be able to use SIMD instructions I create an ndarray subclass, which uses fftw_malloc to allocate the memory and fftw_free to free the memory when the array is deleted. This works fine for in-place operations; however, if someone does something like this:

    a = fftw3.AlignedArray(1024, complex)
    a = a + 1

a.ctypes.data points to a different memory location (this is actually an even bigger problem when executing fftw plans), however type(a) still gives me class 'fftw3.planning.AlignedArray'. I think I understand the reason for this: Python does a copy internally (it does not call the __copy__ or __deepcopy__ methods?). Is there a way that I get a different object type? Or even better, is there a way to prevent operations like a = a + 1, or make them automatically in-place operations? I realise that I could change the __add__ ... methods, but that would also prevent b = a + 1 operations. My second comment is with respect to the documentation for numpy.ctypeslib.ndpointer. The documentation says: "An ndpointer instance is used to describe an ndarray in restypes and argtypes specifications." However, if I specify a restype to a function, e.g.

    clib.malloc.restype = ctypeslib.ndpointer(flags='contiguous,aligned', shape=shape, dtype=dtype)

calling clib.malloc(1024) results in a TypeError:

    TypeError: default __new__ takes no parameters

Am I correct in assuming that the documentation is incorrect and ndpointer can only be used for argtypes? Or am I missing something? Cheers Jochen
Re: [Numpy-discussion] numpy ndarray questions
On Mon, 2009-01-26 at 19:25 -0600, Ryan May wrote: Jochen wrote: Hi all, I just wrote ctypes bindings to fftw3 (see http://projects.scipy.org/pipermail/scipy-user/2009-January/019557.html for the post to scipy). Now I have a couple of numpy related questions: In order to be able to use SIMD instructions I create an ndarray subclass, which uses fftw_malloc to allocate the memory and fftw_free to free the memory when the array is deleted. This works fine for in-place operations; however, if someone does something like this:

    a = fftw3.AlignedArray(1024, complex)
    a = a + 1

a.ctypes.data points to a different memory location (this is actually an even bigger problem when executing fftw plans), however type(a) still gives me class 'fftw3.planning.AlignedArray'. This might help some: http://www.scipy.org/Subclasses Ryan Thanks, I had read about __array_finalize__, but not about __array_wrap__. I'm not quite sure I understand how I can use this to get around my problem; can you explain? Cheers Jochen
Re: [Numpy-discussion] numpy ndarray questions
On Tue, 2009-01-27 at 13:28 +0900, David Cournapeau wrote: Jochen wrote: On Tue, 2009-01-27 at 12:49 +0900, David Cournapeau wrote: Jochen wrote: Hi all, I just wrote ctypes bindings to fftw3 (see http://projects.scipy.org/pipermail/scipy-user/2009-January/019557.html for the post to scipy). Now I have a couple of numpy related questions: In order to be able to use SIMD instructions I create an ndarray subclass, which uses fftw_malloc to allocate the memory and fftw_free to free the memory when the array is deleted. This works fine for in-place operations; however, if someone does something like this:

    a = fftw3.AlignedArray(1024, complex)
    a = a + 1

a.ctypes.data points to a different memory location (this is actually an even bigger problem when executing fftw plans), however type(a) still gives me class 'fftw3.planning.AlignedArray'. I can't comment about subclassing ndarrays, but I can give you a hint about the aligned allocator problem: you could maintain two lists of cached plans, automatically detect whether your arrays are aligned or not, and use the appropriate list of plans; one list is for aligned arrays, one for unaligned. Before removing support for fftw, I played with some C++ code to do exactly that. You can tell fftw to create plans for unaligned arrays by using the FFTW_UNALIGNED flag: http://projects.scipy.org/scipy/scipy/browser/branches/refactor_fft/scipy/fftpack/backends/fftw3/src/zfft.cxx cheers, David Hi David, I have actually kept more closely to the fftw way of doing things, i.e. I create a plan python object from two arrays; it also stores the two arrays to prevent someone from deleting the original arrays and then executing the plan. I am not sure I follow you when you say "the fftw way": you can avoid having to store the arrays altogether while still using fftw plans; there is nothing un-fftw about using plans that way. I think trying to guarantee that your arrays' data buffers won't change is more complicated.
David Sorry maybe I wasn't quite clear, what I mean by the fftw way is creating a plan and executing the plan, instead of doing x=fft(y). As you say the problem comes when you execute a plan on arrays which don't exist anymore, which causes python to segfault (I'm talking about using fftw_execute() not fftw_execute_dft). So yes essentially my problem is trying to ensure that the buffer does not change. BTW memmap arrays have the same problem if I create a memmap array and later do something like a=a+1 all later changes will not be written to the file. BTW I just answered to you in the other thread on scipy-users as well, I'll try to just keep the rest of my posts here so people don't have to look at two lists if they want to follow things. Cheers Jochen __ _ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
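Jochen's memmap aside is easy to reproduce: rebinding the name with `a = a + 1` detaches it from the file, while in-place operations keep writing through the mapping. A small sketch (the temporary file path is illustrative):

```python
import os
import tempfile
import numpy as np

# create a small file-backed array
path = os.path.join(tempfile.mkdtemp(), "data.dat")
m = np.memmap(path, dtype=np.float64, mode="w+", shape=(4,))
m[:] = 1.0
m.flush()

detached = m + 1        # the ufunc result lives in a NEW in-memory buffer,
detached[:] = 99.0      # so these writes never reach the file
assert (np.fromfile(path, dtype=np.float64) == 1.0).all()

m += 1                  # in-place: still backed by the file mapping
m.flush()
assert (np.fromfile(path, dtype=np.float64) == 2.0).all()
```

The same name-rebinding semantics are behind both problems in this thread: Python names point at buffers, and ufuncs allocate new ones.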
Re: [Numpy-discussion] numpy ndarray questions
On Tue, 2009-01-27 at 13:54 +0900, David Cournapeau wrote:
> Jochen wrote:
>> [earlier exchange quoted above]
>>
>> Sorry, maybe I wasn't quite clear: what I mean by "the fftw way" is
>> creating a plan and executing the plan, instead of doing x = fft(y).
>
> I guess I was not very clear, because my suggestion has nothing to do
> with the above.
>
>> As you say, the problem comes when you execute a plan on arrays which
>> don't exist anymore, which causes Python to segfault (I'm talking
>> about using fftw_execute(), not fftw_execute_dft()). So yes,
>> essentially my problem is trying to ensure that the buffer does not
>> change.
>
> I agree that if you just use fftw_execute, you will have segfaults; I
> am just not sure that your problem is to ensure that the buffer does
> not change :) You can instead handle relatively easily the case where
> the buffers are not aligned, while still using the fftw API.
> fftw_execute is just not an appropriate API for Python code, IMHO.

Ah, OK, I think I understand you now :) I agree that using fftw_execute
is not the most Pythonic way of doing things. However, with every other
way I could think of, you would need to do a number of checks and
possibly create new plans, as well as keep old plans around when doing
an fft. (I believe that's what you did in the old fftw bindings in
scipy?) So I decided to create fftw bindings first for maximum
performance (I know, just doing the whole calculation in C is probably
more appropriate if you want maximum performance ;), and it also fits my
needs. However, I've been thinking about ways to make the whole thing
more intuitive. At least now I can compare how much overhead I'm adding
if I start to do checks in order to make the handling easier :)

> Another solution would be to work on numpy itself, so that it uses
> aligned buffers, but that's obviously much longer term - that's just
> one of those things on my TODO list that I never took the time to do.

I agree that would be nice; I'd think a couple of operations would
probably benefit from that.

> David
Re: [Numpy-discussion] numpy ndarray questions
On Tue, 2009-01-27 at 14:46 +0900, David Cournapeau wrote:
> Jochen wrote:
>> [earlier exchange quoted above]
>>
>> Ah, OK, I think I understand you now :) I agree that using
>> fftw_execute is not the most Pythonic way of doing things. However,
>> with every other way I could think of, you would need to do a number
>> of checks and possibly create new plans, as well as keep old plans
>> around when doing an fft.
>
> Yes, but I don't see the problem with keeping old plans/creating new
> ones. It was a pain in C++, because I had to handle many backends, not
> just fftw, but in your case, doing it in Python is not very difficult I
> think.
>
>> (I believe that's what you did in the old fftw bindings in scipy?)
>
> That's what was done in scipy.fftpack originally, but not by me.
>
>> So I decided to create fftw bindings first for maximum performance (I
>> know, just doing the whole calculation in C is probably more
>> appropriate if you want maximum performance ;), and it also fits my
>> needs.
>
> Caching plans will not affect performance; on the contrary, you will be
> able to use more aggressively optimized plans if you add the ability to
> save/load plans to disk (using the wisdom mechanism). Checking for
> aligned arrays is one line in Python, and should not affect performance
> much, especially if you do ffts on big arrays.

I agree checking for aligned arrays does not affect performance much; I
guess the lookup mechanism for the plans is what would take up the most
time, IMO. I'll definitely look into this, though :)

Cheers
Jochen
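One way to reconcile fftw-style plan execution with Python's rebinding semantics is for the plan object to remember the data pointer it was created against and fall back to re-planning when that pointer has moved. A hypothetical sketch (the `Plan` class is illustrative, and `np.fft.fft` again stands in for fftw_execute):

```python
import numpy as np

class Plan:
    """Remember the buffer address at planning time; re-plan if it moves."""

    def __init__(self, arr):
        self.arr = arr
        self.addr = arr.ctypes.data  # the pointer fftw_execute would use

    def execute(self, arr=None):
        arr = self.arr if arr is None else arr
        if arr.ctypes.data != self.addr:
            # The buffer moved (e.g. after a = a + 1).  Executing the old
            # plan would dereference freed memory in C and segfault, so
            # re-plan against the new buffer instead.
            self.arr, self.addr = arr, arr.ctypes.data
        return np.fft.fft(self.arr)

a = np.ones(8, dtype=complex)
plan = Plan(a)
out1 = plan.execute()
a = a + 1               # rebinds a; the data pointer changes
out2 = plan.execute(a)  # detected, re-planned, no stale pointer used
```

The pointer comparison costs a dict-free attribute lookup per call, which supports David's point that the bookkeeping overhead is negligible next to an fft on a big array.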