A quick answer to some of the points.

Robert Bradshaw wrote:
> First, it sounds a bit like you're proposing to essentially re- 
> implement NumPy. Would it just be slicing and arithmetic, or where  
> would you draw the line before it doesn't really belong in a  
> compiler, but rather a library. More below:

At this point, I should stress that what I'm after is a roadmap (how to 
deal with requests as they come up, and things like Hoyt's proposed 
project when that came up). I won't have time to do more than the basic 
array stuff this time around (to better make use of Kurt's work) -- 
which is still very useful for exchanging array data with C/Fortran code.

It would be "reimplementing" NumPy as far as the core array API goes, 
but the implementation would be totally different since we have a 
compilation stage. If anything, it is reimplementing Fortran, though 
Fortran desperately needs more alternatives :-)

As for the library, the CEP sketches full interoperability with it; you 
would likely still do

arr = np.dot(arr, arr)

for matrix multiplication.

What to include? I think slicing, arithmetic (with "broadcasting", i.e. 
dimensions of length 1 are repeated to make the arrays conform), and 
transpose (arr.T). No "sum()" member function or similar.
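
For illustration, this is just broadcasting as NumPy already defines it; 
the typed arrays would follow the same rule (the snippet shows today's 
NumPy semantics, nothing new):

import numpy as np

a = np.arange(6.0).reshape(2, 3)   # shape (2, 3)
b = np.array([[10.0], [20.0]])     # shape (2, 1): the length-1 axis is repeated
c = a + b                          # result has shape (2, 3)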

In addition perhaps high-level utilities for looping, such as 
"ndenumerate" etc:

cdef int[:,:] arr = ...
for (i, j), value in cython.array.ndenumerate(arr):
     ...

These would be quite nicely contained in cython.array, though (and 
again, NumPy has no direct equivalent that isn't horribly slow due to 
Python-level iterators).
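
(For comparison, NumPy's existing ndenumerate is a Python-level 
iterator; the snippet below is ordinary NumPy and shows the overhead a 
compiled version would avoid -- one index tuple and one boxed value per 
element:)

import numpy as np

arr = np.arange(6).reshape(2, 3)
for (i, j), value in np.ndenumerate(arr):   # pure-Python iteration, hence slow
    print(i, j, value)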


> 
> On Jun 10, 2009, at 11:57 AM, Dag Sverre Seljebotn wrote:
> 
>> Brian Granger wrote:
>>> Dag,
>>>
>>> I quickly glanced through the proposal and have two big picture  
>>> questions:
>>>
>>> * What will this make possible that is currently not possible?
> 
> This was originally my first question too, but you beat me to it.
> 
>> 1) Efficient slices
> 
> Is the inefficiency just in the object creation and Python indexing  
> semantics? It's still O(1), right? Same with the other operations. (I  
> guess there's also a question of result type.)

Well, and then the resulting buffer is acquired and checked for data 
type. But essentially, yes.

This is largely a matter of notational convenience -- does it hurt to 
pay that O(1) cost in your almost-inner loop when taking a slice of 50 
elements, or do you pass "start" and "end" around in your functions 
instead?
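
A minimal sketch of the two styles, written in the proposed typed-array 
syntax (the function names and the chunk size are purely illustrative):

# (a) slice per call -- the convenient notation, but every call pays the
#     slice-creation/validation overhead in the almost-inner loop
cdef double sum_slice(double[:] chunk):
    cdef double s = 0
    cdef Py_ssize_t i
    for i in range(chunk.shape[0]):
        s += chunk[i]
    return s

# (b) pass "start"/"end" explicitly and never create a slice object
cdef double sum_range(double[:] arr, Py_ssize_t start, Py_ssize_t end):
    cdef double s = 0
    cdef Py_ssize_t i
    for i in range(start, end):
        s += arr[i]
    return s

Efficient slices are about making style (a) cheap enough that you don't 
have to fall back to (b).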

>> 2) Leave the road open for memory bus friendlier arithmetic (re: Hoyt
>> Koepke's project proposal)
> 
> Could you provide a bit more of a concrete example here? There is  
> avoiding allocation of temporary arrays, is there more?

For simple operations like addition, most systems today are constrained 
by memory bus speed, not CPU speed. Consider

B = A + A + A + A + A + A

With NumPy, the data of A has to travel over the bus six times; with the 
whole expression fused into a single loop it only travels once, as it 
would turn into

B = new memory
for idx in ...
    B[idx] = A[idx] + A[idx] + ...

This can mean a dramatic speed increase as data doesn't have to enter 
the cache more than once.

(NumPy can't do this, but there is numexpr, which allows it:

B = numexpr.evaluate("A+A+...")

using a bytecode interpreter and working on cache-sized blocks.)
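
Spelled out (assuming numexpr is installed; evaluate() picks the array A 
up from the local scope by name):

import numpy as np
import numexpr

A = np.random.rand(10**7)
B = numexpr.evaluate("A + A + A + A + A + A")   # one pass over A, in cache-sized blocks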

>> 3) Automatic contiguous copies in/out if a function is coded to  
>> work on
>> contiguous memory
>>
>> Why can't this be done currently?
>>
>>  * Cython contains no in-compiler support for NumPy, and so cannot  
>> know
>> how to create new underlying ndarray objects.
>>
>>  * All optimizations need to hard-code semantics at compile-time. With
>> single-element indexing it seemed fair to assume the usual  
>> semantics of
>> zero-based indexing etc., but with slices things get worse (which  
>> kind of
>> object is returned) and with arithmetic downright impossible (what  
>> does *
>> do again?)
>>
>> That's not to say there's not other options:
>> 1) We could hard-code support for NumPy only, and only allow  
>> ndarray and
>> not subclasses thereof.
>>
>> 2) We could invent some new protocol/syntax for defining compile-time
>> semantics for all relevant operations.
> 
> Here I am torn--I don't like defining compile-time semantics because  
> it goes against the whole OO style of inheritance (and feels even  
> more remote than the very dynamic, late-binding Python runtime). I  
> don't like option (1) either though.
> 
> Another idea: have you thought of using NumPy as the backend? I.e.  
> an int[:,:] is any bufferinfo-supporting object, but if one needs to  
> be created you create it via an ndarray? This could (potentially)  
> facilitate a lot more code reuse (especially for operations that are  
> more complicated than a single loop over the data). (Might be messier  
> than implementing int[:,:] directly though.) Suppose one develops a  
> vectorized version of ndarrays, could that be a drop-in replacement?

? ndarrays are vectorized? But subclasses may not be.

I've thought about it. If arithmetic is implemented on the arrays 
themselves, the only thing that seems to be reused this way is the 
malloc/free of a memory area, though. Better to lose the dependency, then.

> The plug-in idea is very interesting, both from the perspective that  
> it would allow one to play with different ways to operate on arrays,  
> and also it could separate some of the array-processing logic out of  
> the core compiler itself.

I can definitely see NumPy as being the first such plugin, i.e. it would 
coerce each array to exactly ndarray (not a subclass) on arithmetic, to 
at least provide some arithmetic support right away.
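
Purely as an illustration of what "coerce to exactly ndarray" might mean 
-- a hypothetical helper, not anything the CEP specifies:

import numpy as np

def _numpy_plugin_add(a, b):
    # np.asarray returns a base-class ndarray even for subclass input,
    # so the result of the arithmetic is always exactly ndarray
    return np.add(np.asarray(a), np.asarray(b))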

-- 
Dag Sverre