[Numpy-discussion] recarray from list of lists
Dear all, I am new to this list and still consider myself a Python newbie even though I have worked with it for almost two years now. My question may be a bit academic (because there are numpy.genfromtxt and matplotlib.mlab.csv2rec), yet it took me quite a bit of searching before I found out how to convert a list of lists into a recarray. Here is an example:

- start of code
import numpy as np
import csv

reader = csv.reader(open('csvfile.csv', 'rb'), delimiter=';')
x = list(reader)                     # list of lists of strings
h = x.pop(0)                         # first line holds the column headers
# convert the data to a recarray (must make a list of tuples!)
xx = [tuple(elem) for elem in x]
z = np.dtype('S16, f8, f4, f4')
res = np.array(xx, dtype=z)
res.dtype.names = h                  # set the field names from the header
- end of code

Doesn't this consume too much memory? It first creates the list of lists x, then converts it into a list of tuples xx, and then forms the recarray res from xx. Why is there the limitation that a recarray must be given a list of tuples instead of a list of lists (which is what csv.reader produces, for example)?

Thanks for any hints to make this more efficient, or explanations of why these things are done the way they are.

Martin

--
View this message in context: http://old.nabble.com/recarray-from-list-of-lists-tp34590361p34590361.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
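[Editor's note] One way to cut the memory overhead is to build the tuples while reading, so the intermediate list of lists is never materialized. The sketch below (Python 3; the sample data and io.StringIO stand in for the real csvfile.csv, which is not available here) also illustrates why tuples are required: in a structured array a tuple marks one record, while a list or array is interpreted as a sequence of ordinary array elements.

```python
import csv
import io

import numpy as np

# Hypothetical sample standing in for csvfile.csv
data = "name;val;x;y\nfoo;1.5;2.0;3.0\nbar;2.5;4.0;5.0\n"

reader = csv.reader(io.StringIO(data), delimiter=';')
header = next(reader)                # first row: column names
dt = np.dtype('S16, f8, f4, f4')

# Build tuples directly while reading; NumPy casts the strings
# to each field's type ('1.5' -> 1.5, etc.) field by field.
rows = [tuple(row) for row in reader]
res = np.array(rows, dtype=dt)
res.dtype.names = header             # set the field names from the header
```

For actual files, np.genfromtxt(fname, delimiter=';', dtype=dt, names=True) does the reading, conversion, and naming in one step, avoiding the Python-level intermediates entirely.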
Re: [Numpy-discussion] Issue tracking
On Tue, Oct 23, 2012 at 5:05 AM, Thouis (Ray) Jones tho...@gmail.com wrote (quoting his earlier updates, oldest first):

On Fri, Oct 19, 2012 at 11:20 AM, Thouis (Ray) Jones tho...@gmail.com wrote:
I started the import with the oldest 75 and newest 125 Trac issues, and will wait a few hours to do the rest to allow feedback, just in case something is broken that I haven't noticed. I did make one change to better emulate Trac behavior: some Trac usernames are also email addresses, which Trac anonymizes in its display, and I decided it was safer to do the same.

On Fri, Oct 19, 2012 at 4:46 PM, Thouis (Ray) Jones wrote:
The import is running again, though I've been having some failures in a few comments and general hangs (these might be network-related). I'm keeping track of which issues might have had difficulties. @endolith noticed that I didn't correctly relink #XXX Trac id numbers to GitHub id numbers (both Trac and GitHub create links automatically), so that will have to be handled by a postprocessing script (which it probably would have been anyway, since the GitHub # isn't known before import).

On Fri, Oct 19, 2012 at 9:34 PM, Thouis (Ray) Jones wrote:
Import has finished. The following Trac #s had issues in creating the comments (I think due to network problems): 182, 297, 619, 621, 902, 904, 909, 913, 914, 915, 1044, 1526. I'll review them and see if I can pull in anything missing. I'll also work on a script for updating the Trac crossrefs to GitHub crossrefs. In the "no good deed goes unpunished" category, I accidentally logged in as myself (rather than numpy-gitbot) and pushed about 500 issues, so now I receive updates whenever one of them gets changed. At least most of them were closed already...

On Tue, Oct 23, 2012 at 5:05 AM, Thouis (Ray) Jones wrote:
I just updated the cross-issue references to use GitHub rather than Trac id numbers. Stupidly, I may have accidentally removed comments that were added in the last few days to issues moved from Trac to GitHub. Hopefully not, or at least not many.
It's probably a good idea to turn off Trac soon, to keep too many new bugs from needing to be ported, and old bugs from being commented on. The latter is more of a pain to deal with.

I will look into making the NumPy Trac read-only. It should not be too complicated to extend Pauli's code to redirect the tickets part to GitHub issues.

Have we decided what to do with the wiki content?

David
Re: [Numpy-discussion] Issue tracking
On Oct 23, 2012, at 9:58 AM, David Cournapeau wrote:

[...]

I will look into making the NumPy trac read-only. It should not be too complicated to extend Pauli's code to redirect the tickets part to github issues. Have we decided what to do with the wiki content ?

I believe there is a wiki dump command in trac wiki. We should put that content linked off the numpy pages at github.

Thanks for helping with this.

-Travis
Re: [Numpy-discussion] NumPy to CPU+GPU compiler, looking for tests
Did you see the GPU nd array project? We are trying to do something similar, but only for the GPU.

https://github.com/inducer/compyte/wiki

Fred

On Sun, Oct 21, 2012 at 2:57 PM, Rahul Garg rahulgar...@gmail.com wrote:
Thanks! I need to add support for eig and inv (will do this week, at least for CPU), but other than that I should definitely be able to handle those kinds of benchmarks.

rahul

On Sun, Oct 21, 2012 at 12:01 PM, Aron Ahmadia a...@ahmadia.net wrote:
Hi Rahul, Very cool! I'm looking forward to seeing some performance results! Anders Logg posted a computational challenge to G+ about a month ago, and we got entries in Octave, Fortran, Python, and Julia (all implementing the same solution from Jed Brown). The challenge is here: https://plus.google.com/116518787475147930287/posts/jiULACjiGnW Here is my simple attempt at Cythonizing Jed's Octave code: https://gist.github.com/3893361 The best solution in Fortran took 38 microseconds. The best Python solution clocked in at around 445. The Julia solution implemented by Jed took around 224 microseconds; a good LLVM solution should come close to or beat that. Hope this helps. Aron

On Sun, Oct 21, 2012 at 3:27 PM, Rahul Garg rahulgar...@gmail.com wrote:
Hi. I am a PhD student at McGill University and I am developing a compiler for Python for CPUs and GPUs. For CPUs, I build upon LLVM. For GPUs, I generate OpenCL, and I have also implemented some library functions on the GPU myself. The restriction is that it is only for numerical code and intended for NumPy users. The compiler is aware of simple things in NumPy like matrix multiplication, slicing operators, strided layouts, some library functions (though limited at this time), the negative indexing semantics, etc. However, the compiler is not limited to vector code: scalar code or manually written loops also work. However, only numerical datatypes are supported, with no support for lists, dicts, classes, etc.
First-class functions are not currently supported but are on the roadmap. You will have to add some type annotations to your functions. If you have a compatible GPU, you can also use the GPU by indicating which parts to run on it; otherwise you can just use the compiler to run your code on the CPU.

As an example, simple scalar code like a Fibonacci function works fine. Simple loops like those used in stencil-type computations are also working. Parallel-for loops are also provided and working. Simple vector-oriented code is also working fine on both CPU and GPU. The system is being tested on Ubuntu 12.04 with Python 2.7 (though I think it should work with other Python 2.x variants). For GPUs, I am ensuring that the system works with AMD and Nvidia GPUs.

The compiler is in early stages and I am looking for test cases. The project will be open-sourced in November under Apache 2 and thereafter will be developed in an open repo. If you have some simple code that I can use as a benchmark to test and evaluate the compiler, that will be very helpful. Some annotations will be required, which I can help you write. I will be VERY grateful to anyone who can provide test cases. In turn, it will help improve the compiler and everyone will benefit.

Some of you may be wondering how it compares to Numba. It is essentially very similar in idea. So why build a new compiler? The project I am building is not specific to Python: I am building a far more general compiler infrastructure for array languages, and the Python frontend is just one small part of it. For example, I am also working on a MATLAB frontend. (Some of you may remember me from an earlier compiler project which unfortunately went nowhere. This is a different project, and this time I am determined to turn it into a usable system. I realize the proof is in the pudding, so I hope to convince people by releasing code soon.)
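[Editor's note] The compiler's annotation syntax isn't public yet, so the following are plain Python/NumPy versions of the kinds of kernels described above as candidate test cases: a scalar Fibonacci function, a manually written stencil loop, and its vectorized equivalent (function names are illustrative, not from the project).

```python
import numpy as np

def fib(n):
    # simple scalar kernel: iterative Fibonacci, fib(0) = 0
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def jacobi_step(u):
    # manually written stencil loop: 4-point neighbor average
    out = u.copy()
    for i in range(1, u.shape[0] - 1):
        for j in range(1, u.shape[1] - 1):
            out[i, j] = 0.25 * (u[i-1, j] + u[i+1, j] +
                                u[i, j-1] + u[i, j+1])
    return out

def jacobi_step_vec(u):
    # the same stencil in vector-oriented (sliced) form
    out = u.copy()
    out[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                              u[1:-1, :-2] + u[1:-1, 2:])
    return out
```

Both stencil forms compute the same result, so they make a natural pair for checking that loop code and sliced code compile to the same answer.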
thanks, Rahul
Re: [Numpy-discussion] NumPy to CPU+GPU compiler, looking for tests
On Tue, 2012-10-23 at 11:41 -0400, Frédéric Bastien wrote:
Did you see the GPU nd array project? We are trying to do something similar, but only for the GPU.

Out of interest, is there a reason why the backend for NumPy could not be written entirely in OpenCL? Assuming, of course, that all the relevant backends are up to scratch. Is there a fundamental reason why targeting a CPU through OpenCL is worse than doing it exclusively through C or C++?

Henry
Re: [Numpy-discussion] NumPy to CPU+GPU compiler, looking for tests
Thanks! I did not know about the project. Looking into it.

rahul

On Tue, Oct 23, 2012 at 11:41 AM, Frédéric Bastien no...@nouiz.org wrote:
Did you see the GPU nd array project? We are trying to do something similar, but only for the GPU. https://github.com/inducer/compyte/wiki Fred

[...]
Re: [Numpy-discussion] Issue tracking
On Tue, Oct 23, 2012 at 5:13 PM, Travis Oliphant tra...@continuum.io wrote:

[...]

I believe there is a wiki dump command in trac wiki. We should put that content linked off the numpy pages at github.

Please don't do that. Most of the content is either links to what was already moved into the numpy git repo, or very outdated stuff. I see at least two better options:
- just leave the wiki pages as is, make them read-only, and add a clear "outdated" warning at the top
- move the rest of the useful content into the git repo, and remove the rest

Ralf
[Numpy-discussion] Is there a way to reset an accumulate function?
I have an array that is peppered throughout in random spots with 'nan'. I would like to use 'cumsum', but I want it to reset the accumulation to 0 whenever a 'nan' is encountered. Is there a way to do this, aside from a loop (which is what I am going to set up here in a moment)?

Kindest regards,
Tim
Re: [Numpy-discussion] Is there a way to reset an accumulate function?
There's nothing like that built into numpy, no. I guess you could use reduceat to calculate the total for each span of non-NaNs and then replace each NaN with the negative of that value, so that a plain cumsum would work; but a loop is going to be easier (and a loop in Cython will be faster).

-n

On 23 Oct 2012 18:11, Cera, Tim t...@cerazone.net wrote: [...]
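[Editor's note] The reduceat idea can be sketched as follows: zero the NaNs, sum each span with np.add.reduceat, and place the negative of the preceding span's total at each NaN position, so an ordinary cumsum cancels back to 0 there. The function name is made up for illustration; it assumes a 1-D input.

```python
import numpy as np

def cumsum_reset_at_nan(a):
    """Cumulative sum that restarts from 0 at every NaN position."""
    a = np.asarray(a, dtype=float)
    idx = np.flatnonzero(np.isnan(a))
    out = np.where(np.isnan(a), 0.0, a)   # zero out the NaNs
    if idx.size == 0:
        return np.cumsum(out)
    # Spans are delimited by the start of the array and each NaN.
    starts = np.concatenate(([0], idx))
    span_sums = np.add.reduceat(out, starts)
    # The NaN at idx[i] cancels the span that starts at starts[i],
    # so the running total returns to 0 exactly at that position.
    out[idx] = -span_sums[:-1]
    return np.cumsum(out)
```

This still makes full temporary copies, so as noted above a plain loop is easier to read, and a Cython loop would beat both on large arrays.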
Re: [Numpy-discussion] Is there a way to reset an accumulate function?
Hi,

Why not start counting from the end of the vector until you find a NaN? Your problem does not need to check the full vector.

Fred

On Tue, Oct 23, 2012 at 1:11 PM, Cera, Tim t...@cerazone.net wrote: [...]
Re: [Numpy-discussion] Issue tracking
On Tue, Oct 23, 2012 at 10:58 AM, David Cournapeau courn...@gmail.com wrote:
I will look into making the NumPy trac read-only. It should not be too complicated to extend Pauli's code to redirect the tickets part to github issues.

If you need the map of trac IDs to github IDs, I have code to grab that.

Ray
Re: [Numpy-discussion] Is there a way to reset an accumulate function?
On 23 October 2012 13:11, Cera, Tim t...@cerazone.net wrote: [...]

How about this hackish solution, for a quick non-looping fix?

In [39]: a = np.array([1, 2, 3, 4, np.nan, 1, 2, 3, np.nan, 3])
    ...: idx = np.flatnonzero(np.isnan(a))
    ...: a_ = a.copy()
    ...: a_[idx] = 0
    ...: np.add.reduceat(a_, np.hstack((0, idx)))
Out[39]: array([ 10.,   6.,   3.])

Note that this gives the total of each span rather than a running cumsum, and that if the last element of a is NaN you get a trailing 0 in the result.

Angus
--
AJC McMorland
Post-doctoral research fellow
Neurobiology, University of Pittsburgh
Re: [Numpy-discussion] Issue tracking
On Tue, Oct 23, 2012 at 8:57 PM, Thouis (Ray) Jones tho...@gmail.com wrote:
If you need the map of trac IDs to github IDs, I have code to grab that.

Oh, I meant something much simpler: just redirect the view issues link into GH :)

David