I think I agree more with you than you give me credit for...

Roland Schulz wrote:
> I think it is important to see which features of this proposal are 
> possible but will only be included in the unforeseeable (= maybe never) 
> future, and what could/will be done soon.
> Of course this depends on how much time there is, so it is difficult to say.

OK, I'll be crystal clear about this: in autumn,

    cdef int[:] a = ..., b = ...
    a = a + b

will be a syntax error. The main motivation for adding this type now is 
to make it easier to pass array data between Python objects (PEP 3118) 
and external libraries.
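For context, here is a hedged, plain-Python sketch (not Cython) of the kind of zero-copy buffer access PEP 3118 enables; Cython's typed memoryviews would give the same kind of access at C speed:

```python
# Plain-Python illustration of PEP 3118 buffer sharing: a memoryview
# wraps another object's data without copying it. The variable names
# here are illustrative, not from any proposal.
from array import array

data = array('i', [1, 2, 3, 4])   # an object exporting the buffer protocol
view = memoryview(data)

view[0] = 99                      # writes through to the underlying array
assert data[0] == 99
assert view.format == 'i'         # PEP 3118 format string for C int
assert view.nbytes == 4 * view.itemsize
```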

However, adding a new, native type is not something you do without 
thinking things through, and discussing possibilities for arithmetic 
operators was important to get the whole picture of "what kind of type" 
this is and what design and implementation considerations to take.

Also there was the issue of the whole direction of Cython in the area of 
numeric programming which has now largely been resolved. This also means 
that

cdef ndarray[int] a = ..., b = ...
a = a + b

will never be optimized -- because it is vital that such code allows 
arbitrary subclasses of ndarray to overload the operators.
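To illustrate the point (a hedged, plain-Python analogy, not the actual ndarray machinery): if the compiler replaced `a + b` with a fixed element-wise loop, a subclass's overloaded operator would silently be bypassed.

```python
# Why `a + b` on an object type cannot be compiled to a raw loop:
# a subclass may overload __add__, and that overload must win.
# Base/Logged are invented stand-ins for ndarray and a subclass.
class Base(list):
    def __add__(self, other):
        return Base(x + y for x, y in zip(self, other))

class Logged(Base):
    def __add__(self, other):
        result = super().__add__(other)
        result.tag = "logged"     # subclass-specific behaviour
        return result

a, b = Logged([1, 2]), Logged([3, 4])
c = a + b
assert list(c) == [4, 6]
assert c.tag == "logged"          # a hard-coded element-wise loop would lose this
```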

> My point is, if collapsing d=a*b*c*d into one loop is possible, but it is 
> not clear when it will be added, this proposal won't speed up 
> calculations over vectorized NumPy for the foreseeable future.
> 
> Or putting it another way: Introducing only element-wise access won't 
> help performance by itself, so performance wise it is not a meaningful 
> intermediate step.

I agree; what it is needed for is more convenient "raw" handling of PEP 
3118 without invoking NumPy to do it (and NumPy doesn't handle all the 
memory layouts that PEP 3118 handles).
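As an aside on the collapsing mentioned in the quote above, here is a hedged, plain-Python stand-in showing what fusing a chained expression into one loop buys (the function names are invented for illustration):

```python
# d = a*b*c evaluated as chained vectorized ops makes a temporary per
# operator; a fused loop touches each element once with no temporaries.
def mul_unfused(a, b, c):
    t = [x * y for x, y in zip(a, b)]               # temporary for a*b
    return [x * y for x, y in zip(t, c)]            # second pass for (a*b)*c

def mul_fused(a, b, c):
    return [x * y * z for x, y, z in zip(a, b, c)]  # one pass, no temporary

a, b, c = [1, 2], [3, 4], [5, 6]
assert mul_unfused(a, b, c) == mul_fused(a, b, c) == [15, 48]
```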

BUT: the code needed to, e.g., pass Python Imaging Library images to 
Fortran code is 50% of the way towards what it takes to implement naive 
componentwise operators.
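A hedged sketch of what such a naive componentwise operator involves, using plain-Python memoryviews as a stand-in for generated C (`add_into` is an invented name): the buffer unpacking and validation is the first half, and the loop itself is the easy remainder.

```python
# Naive componentwise addition over 1-D int buffers. Once the buffer
# checks (format, length) are in place, the loop is the trivial part.
from array import array

def add_into(out, a, b):
    """out[i] = a[i] + b[i] over 1-D C-int buffers of equal length."""
    va, vb, vo = memoryview(a), memoryview(b), memoryview(out)
    if not (va.format == vb.format == vo.format == 'i'):
        raise TypeError("expected C int buffers")
    if not (len(va) == len(vb) == len(vo)):
        raise ValueError("length mismatch")
    for i in range(len(vo)):      # the naive loop a compiler would emit
        vo[i] = va[i] + vb[i]

a = array('i', [1, 2, 3])
b = array('i', [10, 20, 30])
out = array('i', [0, 0, 0])
add_into(out, a, b)
assert list(out) == [11, 22, 33]
```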


> Also, this optimization is not straightforward. If you look at
> http://eigen.tuxfamily.org/index.php?title=Benchmark-August2008
> 
> you see that many libraries trying to do even a simple Y += alpha*X do 
> a bad job performance-wise.
> And having to do all the required optimization in the Cython compiler 
> would basically mean rewriting a Fortran compiler. And of course the 
> language is suboptimal, but the compiler can sometimes be rather smart. 
> And putting all this in the Cython compiler sounds like an awful lot of work.

I completely agree!! -- I don't see Cython doing the optimizations 
itself, but making use of external libraries (like Eigen) through 
plugins. The question is how this is facilitated!

I.e., you could probably, once the C++ integration is more mature, 
define a type that allows you to do

     cdef eigen.vector a = ..., b = ... # no pun intended
     a = a + b # faster than anything else

but can you transparently convert those to and from PIL images, or pass 
them to a SciPy/NumPy function?

Those are the issues that are really at stake here.
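The buffer protocol is what would make such transparent conversion possible in principle; a hedged, stdlib-only illustration (bytearray and array standing in for, say, a PIL image and an Eigen wrapper):

```python
# Two unrelated container types sharing one block of memory with no
# copy -- the kind of interop that passing data between PIL, NumPy and
# an Eigen wrapper type would need.
from array import array

raw = bytearray(array('i', [1, 2]).tobytes())  # native-order C-int bytes
as_ints = memoryview(raw).cast('i')            # reinterpret in place, no copy

assert as_ints.tolist() == [1, 2]
as_ints[1] = 7                                 # mutation through one view...
assert array('i', bytes(raw))[1] == 7          # ...is visible in the other
```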

>     2) Various plugins (with a separate release process from Cython, so that
>     work in this area doesn't bog down Cython development), using e.g.
>     Eigen/Blitz as backends, or running it on a GPU, or in parallel using
>     OpenMP (well, for heavy componentwise functions), etc. etc.
> 
> 
> Why does it bog down Cython development if it is a library which ships 
> with Cython? Also, you can always require a certain version of the external 
> components (and make it optional at compilation for people not 
> interested in numerics).

What I meant to say is that we must avoid situations like this:

Dag Sverre, year 2010: "Please wait with releasing Cython 1.3 until I'm 
done supporting the next version of Eigen"

So plugins have "their own release process" in the sense that Cython can 
be released at any time shipping a previous, stable version of the plugin.

But this is premature discussion ... :-)


>     Well, the point here is that when you use Eigen/Blitz, what you do is
>     use the metalanguage of C++. Which is much more powerful than anything
>     we're likely to get in Cython. So in some sense you use C++ to do what
>     you cannot do in Cython.
> 
> 
> Exactly -- you have rather easy usage of Eigen/Blitz, and only that 
> code has to be generated by Cython. Then the C++ compiler takes care of 
> the expression templates.

Generating the code to use Eigen/Blitz and generating naive loops are 
probably about the same amount of work, and probably something like 70% 
overlap. I see no reason for not doing both, if only to avoid having to 
say "you need C++ to use this feature of the language".

Really, neither is rocket science, but all the "buffer interoperability 
infrastructure" needs to be there first.

-- 
Dag Sverre
_______________________________________________
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev
