Re: [Cython] Proposal: idea for automatic management of dynamic memory

Robert Bradshaw Fri, 11 Apr 2008 03:52:29 -0700

On Apr 11, 2008, at 2:56 AM, Dag Sverre Seljebotn wrote:

> (Hmm. I'm really not that against the approach. But I'll make sure the
> arguments against it are at least heard.)


Yes, your input is valued!

> I see two kinds of uses for arrays in Cython:
>
> 1) Users who simply want to allocate an array and do stuff with it. In
> these cases, having something a bit more capable than a standard array
> is always going to be an advantage -- even if one doesn't use anything
> but C array capabilities, being able to quickly insert a print statement
> to dump the array during debugging, pass it to NumPy functions for
> debugging purposes (hmmm, what's wrong...perhaps "any(x > 1)"?) and so
> on is convenient _during development_. (Though not necessarily NumPy,
> see below).

I don't see the utility as being limited to development.

> 2) One wants to know exactly what is going on and be as near C as
> possible, possibly when wrapping C libraries. But then, nothing can beat
> try/finally anyway, which really isn't that bad to write and makes it
> very clear exactly what is going on:
>
> cdef char* data = NULL
> try:
>   data = getmem(100)
>   ...
> finally:
>   # can't remember if free checks for null, but could make xfree anyway.
>   if data: free(data)
>
>
> Introducing some special syntax candy for the landscape that is
> "in-between" these two options just doesn't seem worth it (it makes the
> Cython language heavier and ultimately more difficult to learn).
> Especially when with this syntax candy
>
> a) it looks like the data is going to be allocated on the stack

That's what it acts like.

> b) in a language that doesn't already have a concept of allocating
> objects on the stack (as opposed to C and C++), and

All of your other objects, including fixed-length arrays, are  
allocated on the stack.

> c) magically it doesn't allocate it on the stack anyway

If the user doesn't know about stack vs. heap allocation, this won't  
bother them. If they do, then they're probably savvy enough to not  
worry about it.

> (BTW, not using try/finally in the SAGE code posted does to me (to be
> honest) just fall into the category of bad and/or sloppy programming,
> and one shouldn't make changes to Cython on the basis of that code.)

I agree, but there has to be a better way so that the user doesn't
have to worry about it. Cython takes care of refcounting, and it
would be nice if it took care of simple C array memory management too
(most Python programmers are not familiar with managing their own
memory, and that is an area where it is really easy to shoot oneself
in the foot).
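
For concreteness, here is a minimal sketch in plain C of the
bookkeeping a user currently has to get right by hand when using a
runtime-sized scratch array. The function name sum_of_squares and its
signature are made up for illustration; nothing here is existing
Cython or Sage code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical example function, not part of Cython or Sage.
 * Fills a scratch array of n doubles with squares and sums them.
 * Returns 0 on success, -1 if the array could not be allocated. */
int sum_of_squares(size_t n, double *result)
{
    double *scratch = NULL;
    double total = 0.0;
    int status = -1;
    size_t i;

    assert(result != NULL);

    /* Guard against overflow in n * sizeof(double). */
    if (n > SIZE_MAX / sizeof *scratch)
        goto cleanup;

    /* Ask for at least one element so that NULL always means failure. */
    scratch = malloc((n ? n : 1) * sizeof *scratch);
    if (scratch == NULL)
        goto cleanup;

    for (i = 0; i < n; i++)
        scratch[i] = (double)i * (double)i;
    for (i = 0; i < n; i++)
        total += scratch[i];

    *result = total;
    status = 0;

cleanup:
    /* Single exit path: standard C defines free(NULL) as a no-op. */
    free(scratch);
    return status;
}
```

Every early exit has to reach the cleanup label, which is exactly the
part that is easy to forget and that automatic management would take
over.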

> If NumPy is overkill then perhaps one should instead (as has been
> suggested a few times already the last day) make another "buffer"
> library that operates in the same manner with respect to Cython
> (reference counted etc., but no syntax candy) but is simpler (always
> one-dimensional char* buffer for instance). This could quickly be
> implemented in an inlineable pxd file that is shipped with Cython, and
> potentially be inlined completely in a few years of Cython development.

This is the direction I would lean, and it would be very easy to do
with the kind of improvements that we have talked about for easy NumPy
array support. It has the advantage of being able to create them,
pass them around, etc. but the disadvantage that one needs the GIL to
rely on Python's refcounting infrastructure.
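
To illustrate the shape of such a buffer, here is a rough sketch in
plain C of a one-dimensional, reference-counted char buffer. All names
(Buffer, buffer_new, buffer_incref, buffer_decref) are hypothetical;
this is not an existing Cython API, and a real version would most
likely be a Python object rather than a bare struct.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not an existing Cython API. */
typedef struct {
    size_t refcount;  /* number of live owners */
    size_t length;    /* usable bytes in data */
    char *data;
} Buffer;

/* Creates a zero-filled buffer with one owner; returns NULL on failure. */
Buffer *buffer_new(size_t length)
{
    Buffer *buf = malloc(sizeof *buf);
    if (buf == NULL)
        return NULL;
    /* Allocate at least one byte so that NULL always means failure. */
    buf->data = calloc(length ? length : 1, 1);
    if (buf->data == NULL) {
        free(buf);
        return NULL;
    }
    buf->refcount = 1;
    buf->length = length;
    return buf;
}

/* Registers an additional owner. */
void buffer_incref(Buffer *buf)
{
    assert(buf != NULL);
    buf->refcount++;
}

/* Drops one owner; frees the storage when the last owner is gone.
 * Returns 1 if the buffer was freed, 0 otherwise. */
int buffer_decref(Buffer *buf)
{
    assert(buf != NULL && buf->refcount > 0);
    if (--buf->refcount > 0)
        return 0;
    free(buf->data);
    free(buf);
    return 1;
}
```

The counter in this sketch is a plain integer with no locking, so it
is only safe when one thread touches it at a time. That is the same
constraint that ties the Python-object version to the GIL.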

In my mind, arrays are primitive enough that perhaps syntactic sugar  
should be developed, e.g.

cdef double[] a = cdef double[size]

Perhaps there should be a CEP with several alternatives/pros/cons? I  
think the main point is that people want to be able to use arrays  
without having to manually malloc/free (including error-handling).

However, allowing array declarations with a non-constant size seems
like much lower-hanging fruit (as well as a much smaller change).

> BTW: Why would NumPy be overkill? Because of a few extra bytes of memory
> per array object? Invoking the incantation "overkill" to me only
> suggests the Not Built Here syndrome, I always like to talk about
> specific, more rational reasons like memory usage, runtime performance,
> library dependency, ...

Library dependency is the obvious drawback: as nice as NumPy is, we
are not going to require it to use Cython. Runtime performance is
also an issue. NumPy arrays are fast, but if you've looked at the
code it is obvious that creating one is nowhere near as fast as a call
to malloc().

- Robert


_______________________________________________
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev
