Re: [Numpy-discussion] Looking for people interested in helping with Python compiler to LLVM
On 03/10/2012 10:35 PM, Travis Oliphant wrote:
> Hey all,
>
> I gave a lightning talk this morning on numba which is the start of a
> Python compiler to machine code through the LLVM tool-chain. It is proof
> of concept stage only at this point (use it only if you are interested
> in helping develop the code at this point). The only thing that works is
> a fast-vectorize capability on a few functions (without for-loops). But,
> it shows how creating functions in Python that can be used by the NumPy
> runtime in various ways. Several NEPS that will be discussed in the
> coming months will use this concept.
>
> Right now there is very little design documentation, but I will be
> adding some in the days ahead, especially if I get people who are
> interested in collaborating on the project. I did talk to Fijal and Alex
> of the PyPy project at PyCon and they both graciously suggested that I
> look at some of the PyPy code which walks the byte-code and does
> translation to their intermediate representation for inspiration.
>
> Again, the code is not ready for use, it is only proof of concept, but I
> would like to get feedback and help especially from people who might
> have written compilers before. The code lives at:
> https://github.com/ContinuumIO/numba

Hi Travis,

Mark F. and I have been talking today about whether some of numba and Cython development could overlap -- not right away, but in the sense that if Cython gets some features for optimization of numerical code, we could make it easy for numba to reuse that functionality.

This may be somewhat off-topic re: the above, but part of the goal of this post is to figure out numba's intended scope. If there isn't an overlap, that's good to know in itself.

Question 1: Did you look at Clyther and/or Copperhead? Though similar, they target GPUs... but at first glance they look as though they may be parsing Python bytecode to get their ASTs...
(didn't check though)

Question 2: What kind of performance are you targeting -- in the short term, and in the long term? Is competing with "Fortran-level" performance a goal at all?

E.g., for ufunc computations with different iteration orders such as "a + b.T" (a and b in C-order), one must do blocking to get good performance. And when dealing with strided arrays, copying small chunks at a time will sometimes help performance (and sometimes not).

These are optimization strategies which (as I understand it) are quite beyond what NumPy iterators etc. can provide. And the LLVM level could be too low -- one has quite a lot of information when generating the ufunc/reduction/etc. that would be thrown away when generating LLVM code. Vectorizing compilers do their best to reconstruct this information; I know nothing about what actually exists here for LLVM. They are certainly a lot more complicated to implement and work with than making use of the higher-level information available before code generation.

The idea we've been playing with is for Cython to define a limited subset of its syntax tree (essentially the "GIL-less" subset) separate from the rest of Cython, with a more well-defined API for optimization passes etc., and targeted at a numerical optimization pipeline.

This subset would actually be pretty close to what numba needs to compile, even if the overlap isn't perfect. So such a pipeline could possibly be shared between Cython and numba, even if Cython would use it at compile-time and numba at runtime, and even if the code generation backend is different (the code generation backend is probably not the hard part...).
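As an illustration of the blocking point raised for "a + b.T" (this is not code from the thread), here is a minimal sketch of a tiled evaluation. The function name `blocked_add_transpose` and the `block` parameter are hypothetical, and a pure-Python loop like this is not expected to beat NumPy; it only shows the tile-by-tile access pattern that a higher-level optimizer might emit before code generation.

```python
import numpy as np


def blocked_add_transpose(a, b, block=64):
    """Compute a + b.T one square tile at a time.

    Assumes a is (n, m) and b is (m, n). The tile size `block` is an
    illustrative tuning knob, not a value taken from the thread.
    """
    n, m = a.shape
    if b.shape != (m, n):
        raise ValueError("b must have the transposed shape of a")
    out = np.empty((n, m), dtype=np.result_type(a, b))
    for i in range(0, n, block):
        for j in range(0, m, block):
            # The tile of b.T covering rows i.. and columns j.. is the
            # transpose of the tile of b covering rows j.. and columns i..,
            # so both operands are touched in small, cache-sized pieces.
            out[i:i + block, j:j + block] = (
                a[i:i + block, j:j + block] + b[j:j + block, i:i + block].T
            )
    return out


a = np.arange(12.0).reshape(3, 4)
b = 2.0 * np.arange(12.0).reshape(4, 3)
result = blocked_add_transpose(a, b, block=2)
```

The result is the same as the plain expression `a + b.T`; only the order in which memory is visited changes.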
To be concrete, the idea is:

  (Cython|numba)
    -> high-level numerical compiler and loop-structure/blocking optimizer
       (by us on a shared parse tree representation)
    -> (LLVM/C/OpenCL)
    -> low-level optimization (by the respective compilers)

Some algorithms that could be shareable are iteration strategies (already in NumPy though), blocking strategies, etc.

Even if this may be beyond numba's (and perhaps Cython's) current ambition, it may be worth thinking about, if nothing else then just for how Cython's code should be structured.

(Mark F., how does the above match how you feel about this?)

Dag
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy videos
On Mar 12, 2012, at 5:23 PM, Abhishek Pratap wrote:
> Super awesome. I love how the python community in general keeps the
> recordings available for free.
>
> @Adam : I do have some problems that I can hit numpy with, mainly
> bigData based. So in summary I have millions/billions of rows of
> biological data on which I want to run some computation but at the
> same time have a capability to do quick lookup. I am not sure if numpy
> will be applicable for quick lookups by a string based key right ??

PyTables does precisely that. It lets you do out-of-core operations on large arrays, store tables with an unlimited number of rows on disk and, by using its integrated indexing engine (OPSI), perform quick lookups based on strings (or any other type). Look into these examples:

http://www.pytables.org/moin/HowToUse#Selectingvalues

HTH,

--
Francesc Alted
Re: [Numpy-discussion] numpy videos
This is probably quite a common problem, so I'd be interested to hear others chime in. I am referring to lookup and storage of numpy data. Your implementation will of course be unique, but there are several avenues that you can consider. Here is how I handle a similar problem.

Imagine I have data, probably similar to yours, where there is qualitative data (maybe biological or experimental parameters and other things), as well as numerical data. I would define a dictionary object that maps a unique key to both of these. In my work, I use the original file that all the information was taken from as my key. So for example:

    store = {key: (file_info, np.asarray(data_array, dtype='float'))}

The value of each item in the dictionary is split so that the information and the actual data arrays are kept separate. Notice my use of dtype... it is also possible to build your own numpy data type, which gives you a bit more flexibility for storing your data. This is very useful if your data is not all that standardized, or if you want to quickly look up data by name. For example, if you have a column in your file called "counts" and you later want to access it, a custom datatype will let you do this with ease. Anyway, you can read into that later.

This storage type is also highly useful if you need to make new data structures later. For example, if you want to plot all of your data in a multiplot, you can design a method that takes this object and returns the formatted multi-array data, as well as any axis arrays that can be extracted from it. Generally, if you can get this object built, then any other representation of the data that you need can be derived from it.

This approach is useful to me, but may not be ideal if your dataset is so large that you cannot afford to have several data structures holding it simultaneously in your code.

On Mon, Mar 12, 2012 at 6:23 PM, Abhishek Pratap wrote:
> Super awesome. I love how the python community in general keeps the
> recordings available for free.
>
> @Adam : I do have some problems that I can hit numpy with, mainly
> bigData based. So in summary I have millions/billions of rows of
> biological data on which I want to run some computation but at the
> same time have a capability to do quick lookup. I am not sure if numpy
> will be applicable for quick lookups by a string based key right ??
>
> -Abhi
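The dictionary-plus-custom-dtype layout described in the message above could look roughly like the sketch below. The file name, the `file_info` fields, and the column names other than "counts" are invented for illustration; only the idea of keying on the source file and keeping metadata apart from a structured array comes from the message.

```python
import numpy as np

# Hypothetical record layout: one named field per column of the source file.
record_dtype = np.dtype([("wavelength", "f8"), ("counts", "i8")])


def make_entry(file_info, rows):
    """Pair free-form metadata with a structured array of measurements."""
    return (file_info, np.array(rows, dtype=record_dtype))


# The key is the name of the file the data came from (made up here).
datasets = {
    "run_001.txt": make_entry(
        {"sample": "A", "temperature_C": 25.0},
        [(400.0, 12), (410.0, 18), (420.0, 7)],
    ),
}

# Lookup by string key, then by column name.
file_info, data = datasets["run_001.txt"]
counts = data["counts"]
total_counts = int(counts.sum())
```

Because the array carries named fields, `data["counts"]` pulls out the whole column without tracking column positions by hand.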
Re: [Numpy-discussion] numpy videos
Super awesome. I love how the python community in general keeps the recordings available for free.

@Adam : I do have some problems that I can hit numpy with, mainly bigData based. So in summary I have millions/billions of rows of biological data on which I want to run some computation but at the same time have a capability to do quick lookup. I am not sure if numpy will be applicable for quick lookups by a string based key right ??

-Abhi

On Mon, Mar 12, 2012 at 3:18 PM, Adam Hughes wrote:
> Abhi,
>
> One thing I would suggest is to tackle numpy with a particular focus. Once
> you've gotten the basics down through tutorials and videos, do you have a
> research project in mind to use with numpy?
Re: [Numpy-discussion] numpy videos
Abhi,

One thing I would suggest is to tackle numpy with a particular focus. Once you've gotten the basics down through tutorials and videos, do you have a research project in mind to use with numpy?

On Mon, Mar 12, 2012 at 6:08 PM, Skipper Seabold wrote:
> On Mon, Mar 12, 2012 at 6:04 PM, Abhishek Pratap wrote:
> >
> > Hey Guys
> >
> > Few days with folks at my first pycon has made me wonder how much of
> > cool things I was missing ..
> >
> > I am looking to do some quick catch up on numpy and wondering if there
> > are any set of videos that I can refer to. I learn quicker seeing
> > videos and would appreciate if you guys can point me to anything
> > available it will be of great help.
> >
>
> You'll find a lot of videos here. The tutorials in particular may
> interest you from past conferences.
>
> http://conference.scipy.org/index.html
>
> Oddly though it doesn't look like there's a straight link to the 2011
> conference there.
>
> http://conference.scipy.org/scipy2011/
>
> Skipper
Re: [Numpy-discussion] numpy videos
On Mon, Mar 12, 2012 at 6:04 PM, Abhishek Pratap wrote:
>
> Hey Guys
>
> Few days with folks at my first pycon has made me wonder how much of
> cool things I was missing ..
>
> I am looking to do some quick catch up on numpy and wondering if there
> are any set of videos that I can refer to. I learn quicker seeing
> videos and would appreciate if you guys can point me to anything
> available it will be of great help.
>

You'll find a lot of videos here. The tutorials in particular may interest you from past conferences.

http://conference.scipy.org/index.html

Oddly though it doesn't look like there's a straight link to the 2011 conference there.

http://conference.scipy.org/scipy2011/

Skipper
[Numpy-discussion] numpy videos
Hey Guys

A few days with folks at my first pycon have made me wonder how many cool things I was missing ..

I am looking to do some quick catch-up on numpy and wondering if there is any set of videos that I can refer to. I learn quicker from videos, and if you guys can point me to anything available it will be of great help.

Thanks!
-Abhi
Re: [Numpy-discussion] Looking for people interested in helping with Python compiler to LLVM
Hi,

On 12/03/2012 00:21, Sturla Molden wrote:
>
> It could also put Python/Numba high up on the Debian shootout ;-)

Can you tell me a bit more about it? (I just didn't understand the whole sentence, in fact ;-) )

Thanks!
--
Pierre
[Numpy-discussion] unique along axis?
I see unique does not take an axis arg. Suggested way to apply unique to each column of a 2d array?
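One possible answer, offered as a sketch and not as a reply from the list: loop over the columns (the rows of the transpose) and call `np.unique` on each. The example array is made up. The result has to be a list, since each column can have a different number of distinct values.

```python
import numpy as np

# Illustrative 2-D array; each column has a different number of distinct values.
a = np.array([[1, 2, 2],
              [1, 3, 2],
              [4, 2, 2]])

# Iterating over a.T yields the columns of a one at a time.
unique_per_column = [np.unique(column) for column in a.T]
```

Each element of `unique_per_column` is a sorted 1-D array of the distinct values in the corresponding column.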
Re: [Numpy-discussion] Looking for people interested in helping with Python compiler to LLVM
One major difference is that Theano doesn't attempt to parse existing Python (byte)code: you need to code explicitly with the Theano syntax (which tries to be close to Numpy, but can end up looking quite different, especially if you want to control the program flow with loops and ifs, for instance).

A potentially interesting avenue would be to parse Python (byte)code to generate a Theano graph. It'd be nice if numba could output some intermediate information that would represent the computational graph being compiled, so that Theano could re-use it directly :) (probably much easier said than done though)

-=- Olivier

On 12 March 2012 12:57, Till Stensitzki wrote:
> Doesn't Theano do the same, only via GCC compilation?
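To give a feel for what "parsing Python (byte)code" starts from, here is a small sketch using the standard library `dis` module to list the instructions of an ordinary function. This is not how numba or Theano work internally; the function `saxpy` and the helper `opcode_names` are made up for illustration, and the exact opcode names differ between Python versions.

```python
import dis


def saxpy(a, x, y):
    """A tiny numerical kernel of the kind a bytecode compiler might target."""
    return a * x + y


def opcode_names(func):
    """Return the opcode name of every bytecode instruction in func."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


names = opcode_names(saxpy)

# Print the instruction stream a translator would have to walk.
for instruction in dis.get_instructions(saxpy):
    print(instruction.opname, instruction.argrepr)
```

A translator would walk this stream, tracking the evaluation stack, to rebuild the expression `a * x + y` as a graph or intermediate representation.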
Re: [Numpy-discussion] Looking for people interested in helping with Python compiler to LLVM
Doesn't Theano do the same, only via GCC compilation?
Re: [Numpy-discussion] nltk dispersion plot problem
On Mon, Mar 12, 2012 at 04:15:04AM +, Gias wrote:
> I am using Ubuntu 11.04 (natty) in my laptop and Python 2.7. I installed nltk
> (2.09b), numpy (1.5.1), and matplotlib(1.1.0). The installation is global and I
> am not using virtualenv. When I try (text4.dispersion_plot(["citizens",
> "democracy", "freedom", "duties", "America"])) in terminal (gnome
> terminal 2.32.1), the plot is not showing up. There is no error message, just
> a second or two interval before the last (>>>) shows up.

Of those three packages, I'd say that the least likely to be implicated is NumPy, making this one probably the list where you'll get the least help. Since it's a plotting problem I would try the matplotlib-users mailing list, and include the source of dispersion_plot, or a link to it in the Google Code code browser for the nltk project, e.g.

http://code.google.com/p/nltk/source/browse/trunk/nltk/nltk/draw/dispersion.py

David