On May 15, 2008, at 12:33 AM, Dag Sverre Seljebotn wrote:
>
>> My proposal is that for someone coming from using NumPy from Python,
>> they only need declare their object (say x) as being a numpy array,
>> and then all access to x.shape is suddenly faster rather than having
>> to remember a new way to access shape, ndim, etc.
>
> I've been ranting about this before, so I'll be brief [edit: I failed],
> but I have strong feelings (and a strong investment in the summer too)
> about this.
>
> The problem is that typical use cases want to either use the Python API
> (though *invisibly* optimized), or the C API.
Having two separate APIs to the same library is suboptimal, but is
(currently) sometimes forced by namespace conflicts. What I think
people want is to write against the Python API and have Cython compile
that into direct use of the C API. I see nothing gained by trying to make
the two more distinct--in fact the more they overlap the less
"translation" Cython will have to do (which would reduce your burden,
right?).
> I don't see a need for the in-between crossbreed.
cpdef is the ultimate crossbreed, and was very welcome. In fact, I
would say the whole point of Python/Cython is to be an in-between
crossbreed between C and Python. Or maybe I'm misinterpreting you here.
>
> Explicitly, I'd love for the following to fail, hard:
>
> cdef int* s = arr.shape
> cdef object s = raw_arr.dimensions
As defined in the current NumPy library, I agree (though it's a
little unclear what you mean by "arr" vs. "raw_arr").
However, in the ideal world, I think if arr has a cdef attribute
named "shape" and a property named "shape" then it should use the
best one for the task at hand (e.g. printing vs indexing).
> ("cdef object s = arr.shape" should work though, but I consider that
> GSoC-stuff, not something you can fake at this stage.)
Doesn't that work now?
>
> We might just have to agree that we disagree on this one. Though
> remember: Explicit is better than implicit.
>
> Renaming the fields at least keeps the APIs relatively separate. Even if
> the Python API is provided, there are still use cases for the native
> API.
>
> For instance, how can one implement the ndarray inlines if there's no
> raw access? If there's no way to access the raw C API but one relies on
> "magic" then they will create infinite loops.
No, I'm seeing things going the other way. obj.attr is viewed as a
cdef attribute (if it exists) and only as a Python attribute if it is
used in a Python context and can't be coerced. (One would implement
the shape property to return the appropriate object.) The "raw
access" is always available (assuming obj is typed correctly) and
there is no need or motivation for a separate API.
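
To make the lookup rule above concrete, here is a toy model in plain
Python. It is only a sketch of the rule as I described it, not of how
Cython is implemented; TypedObject, COERCIBLE and resolve are
hypothetical names made up for this illustration.

```python
# Toy model only: every name here is hypothetical, not Cython internals.

# C types that this toy model knows how to coerce to a Python object.
COERCIBLE = {"int", "double"}


class TypedObject:
    """Stands in for an object whose static type is known to the compiler."""

    def __init__(self, c_fields, py_properties):
        self.c_fields = c_fields            # name -> (c_type, value)
        self.py_properties = py_properties  # name -> zero-argument callable


def resolve(obj, name, context):
    """Resolve obj.name; context is "c" or "python"."""
    if name in obj.c_fields:
        c_type, value = obj.c_fields[name]
        if context == "c":
            # The cdef attribute always wins when a C value is wanted.
            return ("cdef attribute", value)
        if c_type in COERCIBLE:
            # Python context, but the C value can be coerced: still raw access.
            return ("cdef attribute, coerced", value)
    if context == "python" and name in obj.py_properties:
        # Python context and no coercion possible: fall back to the property.
        return ("python property", obj.py_properties[name]())
    raise AttributeError(name)


arr = TypedObject(
    c_fields={"ndim": ("int", 2), "shape": ("Py_intptr_t*", [3, 4])},
    py_properties={"shape": lambda: (3, 4)},
)

print(resolve(arr, "shape", "c"))       # raw pointer-like field
print(resolve(arr, "shape", "python"))  # tuple from the property
print(resolve(arr, "ndim", "python"))   # C int coerced to a Python int
```

In this model the property is never consulted when a C value is wanted,
so a property implemented on top of the raw field cannot call itself.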
>
> Perhaps the following can be a compromise?:
>
> cdef extern ...
>     ctypedef struct ndarray_extension_struct "PyArrayObject":
>         int nd
>         Py_intptr_t *dimensions
>
> cdef class numpy.ndarray [object PyArrayObject]:
>     property shape:
>         cdef inline final object __get__(self):
>             cdef ndarray_extension_struct* raw = \
>                 <ndarray_extension_struct*>self
>             return make_tuple_from(raw.dimensions, raw.nd)
>
>         # And if you want auto-conversion to int-array,
>         # add a type-overload like this:
>         cdef inline final Py_intptr_t* __get__(self):
>             cdef ndarray_extension_struct* raw = \
>                 <ndarray_extension_struct*>self
>             return raw.dimensions
>
>
> So if you really want access to the raw struct, there's a predefined way.
>
> Lisandro's proposal is a lot easier on the fingers though ("return
> make_tuple_from(self.cshape, self.cndim)").
I think "make_tuple_from(self.shape, self.ndim)" is even easier than
that, especially if one has a Python program using NumPy and wants to
start using Cython to make it faster.
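
For what it's worth, make_tuple_from is only a name used in this thread,
not an existing function. A rough Python model of what such a helper
would have to do, using ctypes to stand in for the raw dimensions
pointer (an assumption made purely for illustration), might look like:

```python
import ctypes


def make_tuple_from(dims, nd):
    """Copy the first nd entries behind a C integer pointer into a tuple."""
    return tuple(dims[i] for i in range(nd))


# Stand-in for the raw struct fields: a C array of three extents and a
# pointer to its first element.
storage = (ctypes.c_ssize_t * 3)(2, 3, 4)
ptr = ctypes.cast(storage, ctypes.POINTER(ctypes.c_ssize_t))

shape = make_tuple_from(ptr, 3)
print(shape)
```

The helper only needs a pointer and a count, which is why it reads the
same whether the fields are spelled shape/ndim or cshape/cndim.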
- Robert
_______________________________________________
Cython-dev mailing list
http://codespeak.net/mailman/listinfo/cython-dev