Robert Bradshaw wrote:
> On Mar 7, 2009, at 2:37 PM, Dag Sverre Seljebotn wrote:
>> My proposal is here:
>> http://wiki.cython.org/enhancements/buffersyntax
>>
>> And the thread from the numpy list here:
>> http://thread.gmane.org/gmane.comp.python.numeric.general/28439
>
> Very interesting, and I like the syntax for the most part
> (int[:,:,"fortran"] seems a bit odd to me, perhaps mode should be a
> keyword).

Consider though that

    @cython.locals(a=cython.int[:,:,"fortran"])
    def foo(): ...

is legal Python. Dropping Python compatibility for a new syntax like
this seems like a high price to pay. Perhaps

    cython.int[:,:,cython.buffer.FORTRAN_MODE]

or similar?

> Can you pass them around/have them assigned to global/class-level
> variables? Will they be garbage collected? Also, how much of numpy

Global/class-level use was always the intention for the current buffers
as well; I just never got around to it. So yes, it would just need to
be done.

Implementation-wise they can be considered (smaller variations on)
Py_buffer structs. The variables themselves will (up to optimization)
be structs which are copied by value; however, they point to Python
objects which will be refcounted in the process:

a) They will hold a refcounted reference to an underlying object owning
the memory, and

b) they will (sometimes) need a separate "acquisition refcount" so that
when making a slice, the buffer will not be released until the last
slice goes out of scope.

All in all a happy mixture of copying by value and reference counting.

> will you be duplicating, and how easy will it be to turn one of these
> bare buffers into a numpy array? (I do like the easier int* <-> int[:]
> relation though.)

    myarr = numpy.array(mybuf)

This is currently used for conversion from lists etc.
Implementation-wise this is rather simple (NumPy can support memoryview
in newer releases/Python versions, and until then we can subclass
memoryview to provide the NumPy __array__ protocol which numpy.array
supports).

How much duplication: Well, at first nothing, but suppose the whole
wanted-list is done (which depends on bringing in e.g. a student): All
in all, it's more a matter of reimplementing small parts of Fortran
than reimplementing NumPy. (I am actually toying with the idea of
outputting and linking in Fortran code, but I don't think it would buy
us much.)

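To make the "acquisition refcount" point from earlier in this reply a bit more concrete, here is a small sketch using CPython's built-in memoryview as an analogy. It is not the proposed Cython implementation, and the function name demo_acquisition is made up for illustration. It only shows the behaviour being described: a slice keeps the underlying buffer acquired even after the view it was sliced from is gone, and the owner is only free again once the last slice is released.

```python
def demo_acquisition():
    """Show 'last slice out releases the buffer' with memoryview.

    Analogy only: this uses CPython's memoryview, not Cython buffers.
    """
    owner = bytearray(b"abcdefgh")   # object owning the memory
    whole = memoryview(owner)        # first acquisition of the buffer
    part = whole[2:6]                # slice sharing that same acquisition

    whole.release()                  # the original view goes away...
    still_readable = part.tobytes() == b"cdef"  # ...the slice still works

    try:
        owner.extend(b"!")           # resizing is refused while exported
        locked_while_sliced = False
    except BufferError:
        locked_while_sliced = True

    part.release()                   # last slice released: buffer let go
    owner.extend(b"!")               # now the owner can be resized again
    return still_readable, locked_while_sliced, bytes(owner)


if __name__ == "__main__":
    print(demo_acquisition())
```

In the scheme discussed here the same bookkeeping would presumably live in the by-value structs rather than in Python objects, but the lifetime rule is the same.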
a) Slicing is pretty much duplication of functionality, but doing it at
compile time, for speed, means it must be coded in a different way than
in NumPy.

b) The same goes for arithmetic. The NumPy one-operation-at-a-time
approach isn't the most efficient one, and if we decide to do this
ourselves, it would be in order to do it in a different way, i.e. turn
"a = b + sqrt(c)" into a full inline loop.

c) Functions (log, sqrt, sin, etc.). Currently there's NumPy, which
only takes a whole array, and C, which only takes a single element.
What one needs is functions which can be used on both levels, so that
"a = sqrt(c)" can be used on buffer level but be turned by Cython into
a loop calling sqrt on all elements individually.

All in all, I think arithmetic on buffers needs pluggability to handle
different backends anyway (does one have the commercial Intel MKL
library? And so on.)

At the cost of a very strong NumPy dependency (if this feature is
used), one could of course implement this through NumPy at first,
perhaps as one "backend". I.e. Cython-generated modules with buffer
arithmetic would at first implicitly depend on NumPy (even if all
you're trying to do is sum the contents of two C arrays).

But I'm still thinking about possible solutions for the arithmetic
part; it's a tricky part.

Dag Sverre
_______________________________________________
Cython-dev mailing list
http://codespeak.net/mailman/listinfo/cython-dev
