Re: [Numpy-discussion] Looking for people interested in helping with Python compiler to LLVM

2012-03-12 Thread Dag Sverre Seljebotn
On 03/10/2012 10:35 PM, Travis Oliphant wrote:
> Hey all,
>
> I gave a lightning talk this morning on numba which is the start of a
> Python compiler to machine code through the LLVM tool-chain. It is proof
> of concept stage only at this point (use it only if you are interested
> in helping develop the code at this point). The only thing that works is
> a fast-vectorize capability on a few functions (without for-loops). But,
> it shows how creating functions in Python that can be used by the NumPy
> runtime in various ways. Several NEPS that will be discussed in the
> coming months will use this concept.
>
> Right now there is very little design documentation, but I will be
> adding some in the days ahead, especially if I get people who are
> interested in collaborating on the project. I did talk to Fijal and Alex
> of the PyPy project at PyCon and they both graciously suggested that I
> look at some of the PyPy code which walks the byte-code and does
> translation to their intermediate representation for inspiration.
>
> Again, the code is not ready for use, it is only proof of concept, but I
> would like to get feedback and help especially from people who might
> have written compilers before. The code lives at:
> https://github.com/ContinuumIO/numba

Hi Travis,

Mark F. and I have been talking today about whether some of numba and
Cython development could overlap -- not right away, but in the sense
that if Cython gets some features for optimization of numerical code,
it should be easy for numba to reuse that functionality.

This may be somewhat off-topic with respect to the above, but part of the
goal of this post is to figure out numba's intended scope. If there isn't
an overlap, that's good to know in itself.

Question 1: Did you look at Clyther and/or Copperhead? Though similar, 
they target GPUs...but at first glance they look as though they may be 
parsing Python bytecode to get their ASTs... (didn't check though)

Question 2: What kind of performance are you targeting -- in the short
term, and in the long term? Is competing with "Fortran-level"
performance a goal at all?

E.g., for ufunc computations with different iteration orders such
as "a + b.T" (a and b in C-order), one must do blocking to get good
performance. And when dealing with strided arrays, copying small chunks
at a time will sometimes help performance (and sometimes not).
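
To make the blocking idea concrete, here is a minimal NumPy sketch. It is
an illustration only, not how NumPy, Cython or numba actually does it; the
function name blocked_add_transpose and the block size are made up. It
evaluates "a + b.T" tile by tile, so that each tile of b is read while it
is still small enough to stay in cache:

```python
import numpy as np

def blocked_add_transpose(a, b, block=64):
    """Compute a + b.T one (block x block) tile at a time.

    Hypothetical helper for illustration; `block` is an assumed tuning knob.
    """
    n, m = a.shape
    assert b.shape == (m, n), "b must have the transposed shape of a"
    out = np.empty(a.shape, dtype=np.result_type(a, b))
    for i in range(0, n, block):
        for j in range(0, m, block):
            # The tile of b is read with swapped indices and then transposed,
            # so both operands cover the same (i, j) region of the output.
            out[i:i + block, j:j + block] = (
                a[i:i + block, j:j + block] + b[j:j + block, i:i + block].T
            )
    return out

a = np.arange(12.0).reshape(3, 4)
b = np.arange(12.0).reshape(4, 3)
result = blocked_add_transpose(a, b, block=2)
```

In pure Python the loop overhead may well outweigh any cache benefit; the
point is only the access pattern that a compiler would have to generate.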

These are optimization strategies which (as I understand it) are quite
beyond what NumPy iterators etc. can provide. And the LLVM level could
be too low -- one has quite a lot of information when generating the
ufunc/reduction/etc. that would be thrown away when generating LLVM
code. Vectorizing compilers do their best to reconstruct this
information; I know nothing about what actually exists here for
LLVM. Such compilers are certainly a lot more complicated to implement
and work with than making use of the higher-level information available
before code generation.

The idea we've been playing with is for Cython to define a limited
subset of its syntax tree (essentially the "GIL-less" subset) separate
from the rest of Cython, with a more well-defined API for optimization
passes etc., and targeted at a numerical optimization pipeline.

This subset would actually be pretty close to what numba needs to
compile, even if the overlap isn't perfect. So such a pipeline could
possibly be shared between Cython and numba, even if Cython would use
it at compile-time and numba at runtime, and even if the code
generation backend is different (the code generation backend is
probably not the hard part...). To be concrete, the idea is:

(Cython|numba) -> high-level numerical compiler and
loop-structure/blocking optimizer (by us on a shared parse tree
representation) -> (LLVM/C/OpenCL) -> low-level optimization (by the
respective compilers)

Some algorithms that could be shareable are iteration strategies
(already in NumPy though), blocking strategies, etc.

Even if this may be beyond numba's (and perhaps Cython's) current
ambition, it may be worth thinking about, if nothing else then just
for how Cython's code should be structured.

(Mark F., how does the above match how you feel about this?)

Dag
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy videos

2012-03-12 Thread Francesc Alted
On Mar 12, 2012, at 5:23 PM, Abhishek Pratap wrote:

> Super awesome. I love how the python community in general keeps the
> recordings available for free.
> 
> @Adam : I do have some problems that I can hit numpy with, mainly
> big-data based. In summary, I have millions/billions of rows of
> biological data on which I want to run some computation while at the
> same time being able to do quick lookups. I am not sure whether numpy
> is suitable for quick lookups by a string-based key, right?

PyTables does precisely that.  It allows you to do out-of-core operations
with large arrays, store tables with an unlimited number of rows on disk and,
by using its integrated indexing engine (OPSI), perform quick lookups based on
strings (or any other type).  Look into these examples:

http://www.pytables.org/moin/HowToUse#Selectingvalues

HTH,

-- Francesc Alted





Re: [Numpy-discussion] numpy videos

2012-03-12 Thread Adam Hughes
This is probably quite a common problem, so I'd be interested to
hear others chime in.  I am referring to lookup and storage of numpy data.

Your implementation will of course be unique, but there are several avenues
that you can consider.  Here is how I handle a similar problem.

Imagine I have data, probably similar to yours, where there is qualitative
data (maybe biological or experimental parameters and other things), as
well as numerical data.  I would define a dictionary object that stores
both of these to a unique key.  In my work, I use the original file that
all the information was taken from as my key.  So for example:

{key: (file_info, numpy.array(data, dtype='float'))}

The value of the item in the dictionary is split so that the information
and the actual data arrays are kept separate.  Notice my use of
dtype...it is also possible to build your own numpy data type that gives
you a bit more flexibility for storing your data.  This is very useful if
your data is not all that standardized, or if you want to quickly look up
data by reference.  For example, if you have a column in your file called
"counts" and you want later to access this, having a custom datatype will
let you do this with ease.  Anyway, you can read into that later.
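
As a rough sketch of what I mean (the file name, the field names and the
values below are made up purely for illustration), a structured dtype lets
you pull out a column such as "counts" by name:

```python
import numpy as np

# Hypothetical record layout: one row per measurement in the original file.
record_dtype = np.dtype([("wavelength", "f8"), ("counts", "i8")])

# Key is the original file name; value is (file_info, data_array).
data = {
    "run_001.txt": (
        {"sample": "A", "temperature_C": 25.0},  # qualitative information
        np.array([(400.0, 12), (410.0, 15), (420.0, 9)], dtype=record_dtype),
    ),
}

file_info, table = data["run_001.txt"]
counts = table["counts"]  # access the whole "counts" column by name
```

Keeping the qualitative information in a plain dict next to the array is
just one possible layout; the structured array is the part that gives you
the named columns.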

This storage type is also highly useful if you need to make new data
structures later.  For example, if you want to plot all of your data in a
multiplot, you can design a method to take this object and return the
formatted multi-array data, as well as any axis arrays that can be
extracted from this data.  Generally, if you can get this object built, then
any other representation of the data that you need can be derived from it.
This approach is useful to me, but may not be ideal if your dataset is so
large that you cannot afford to have several data structures
holding it simultaneously in your code.



Re: [Numpy-discussion] numpy videos

2012-03-12 Thread Abhishek Pratap
Super awesome. I love how the python community in general keeps the
recordings available for free.

@Adam : I do have some problems that I can hit numpy with, mainly
big-data based. In summary, I have millions/billions of rows of
biological data on which I want to run some computation while at the
same time being able to do quick lookups. I am not sure whether numpy
is suitable for quick lookups by a string-based key, right?

-Abhi



Re: [Numpy-discussion] numpy videos

2012-03-12 Thread Adam Hughes
Abhi,

One thing I would suggest is to tackle numpy with a particular focus.  Once
you've gotten the basics down through tutorials and videos, do you have a
research project in mind to use with numpy?



Re: [Numpy-discussion] numpy videos

2012-03-12 Thread Skipper Seabold
On Mon, Mar 12, 2012 at 6:04 PM, Abhishek Pratap  wrote:
>
> Hey Guys
>
> A few days with folks at my first PyCon have made me wonder how many
> cool things I was missing.
>
> I am looking to do some quick catching up on numpy and wondering if there
> is any set of videos that I can refer to. I learn quicker by watching
> videos and would appreciate it if you could point me to anything
> available; it would be of great help.
>

You'll find a lot of videos here. The tutorials from past conferences
in particular may interest you.

http://conference.scipy.org/index.html

Oddly, though, it doesn't look like there's a direct link to the 2011
conference there.

http://conference.scipy.org/scipy2011/

Skipper


[Numpy-discussion] numpy videos

2012-03-12 Thread Abhishek Pratap
Hey Guys

A few days with folks at my first PyCon have made me wonder how many
cool things I was missing.

I am looking to do some quick catching up on numpy and wondering if there
is any set of videos that I can refer to. I learn quicker by watching
videos and would appreciate it if you could point me to anything
available; it would be of great help.

Thanks!
-Abhi


Re: [Numpy-discussion] Looking for people interested in helping with Python compiler to LLVM

2012-03-12 Thread Pierre Haessig
Hi,
On 12/03/2012 00:21, Sturla Molden wrote:
>
> It could also put Python/Numba high up on the Debian shootout ;-)
Can you tell me a bit more about it? (In fact, I just didn't understand
the sentence at all ;-) )

Thanks !
-- 
Pierre





[Numpy-discussion] unique along axis?

2012-03-12 Thread Neal Becker
I see unique does not take an axis arg.

Suggested way to apply unique to each column of a 2d array?
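
For illustration, one straightforward approach (a sketch, not necessarily
the fastest; the array below is made up) is to call unique on each column
in turn. Since columns can have different numbers of unique values, the
result is a list of arrays rather than a 2d array:

```python
import numpy as np

a = np.array([[1, 2, 2],
              [1, 3, 2],
              [4, 2, 2]])

# Iterating over a.T yields the columns of a one at a time.
unique_per_column = [np.unique(col) for col in a.T]
```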



Re: [Numpy-discussion] Looking for people interested in helping with Python compiler to LLVM

2012-03-12 Thread Olivier Delalleau
One major difference is that Theano doesn't attempt to parse existing
Python (byte)code: you need to explicitly code with the Theano syntax
(which tries to be close to Numpy, but can end up looking quite different,
especially if you want to control the program flow with loops and ifs for
instance).

A potentially interesting avenue would be to parse Python (byte)code to
generate a Theano graph. It'd be nice if numba could output some
intermediate information that would represent the computational graph being
compiled, so that Theano could re-use it directly :) (probably much easier
said than done though)

-=- Olivier

On 12 March 2012 12:57, Till Stensitzki wrote:

> Doesn't Theano do the same, only via GCC compilation?
>


Re: [Numpy-discussion] Looking for people interested in helping with Python compiler to LLVM

2012-03-12 Thread Till Stensitzki
Doesn't Theano do the same, only via GCC compilation?



Re: [Numpy-discussion] nltk dispersion plot problem

2012-03-12 Thread David Warde-Farley
On Mon, Mar 12, 2012 at 04:15:04AM +, Gias wrote:
> I am using Ubuntu 11.04 (natty) on my laptop and Python 2.7. I installed nltk
> (2.09b), numpy (1.5.1), and matplotlib (1.1.0). The installation is global and
> I am not using virtualenv. When I try
> (text4.dispersion_plot(["citizens", "democracy", "freedom", "duties", "America"]))
> in terminal (gnome terminal 2.32.1), the plot is not showing up. There is no
> error message, just a second or two interval before the last (>>>) shows up.

Of those three packages, I'd say that the least likely to be implicated is
NumPy, making this one probably the list where you'll get the least help.

Since it's a plotting problem I would try the matplotlib-users mailing list,
and include the source of dispersion_plot, or a link to it in the Google
Code code browser for the nltk project, e.g.

http://code.google.com/p/nltk/source/browse/trunk/nltk/nltk/draw/dispersion.py

David