On Jul 30, 2008, at 12:14 AM, Dag Sverre Seljebotn wrote:
> Robert Bradshaw wrote:
>> On Jul 29, 2008, at 1:00 PM, Dag Sverre Seljebotn wrote:
>>
>>> There's a lot of email today, but as I finished one part I'm staking
>>> out the course again.
>>>
>>> I'm thinking about letting the class author specify default buffer
>>> options. Something like this:
>>>
>>> cdef extern class Image ...:
>>>     __cythonbufferdefaults__ = {'ndim': 2}
>>>     __cythonbuffermandatory__ = {'dtype': unsigned char,
>>>                                  'indirect': True}
>>>     __cythonbufferalways__ = True
>>>
>>> [1]
>>>
>>> I think this makes a lot of sense, as the author of the Image class
>>> might export exactly this kind of buffer and nothing else. With the
>>> above settings, one would automatically get efficient indexing on
>>> all "cdef Image" instances.
>>>
>>> NumPy is kind of special in the flexibility it provides, but even
>>> there, setting indirect=False can provide a speedup transparently.
>>> Once the indirect option is implemented at all, that is :-) but that
>>> *must* happen to make NumPy as efficient as possible.
>>>
>>> As for priority, I guess this is below most of what I've talked
>>> about so far. Though it might suddenly look like low-hanging fruit
>>> that I go for when I need a break from other stuff.
>>>
>>> [1] (Well, I consider custom parsing for that last line better than
>>> inventing a wholly new syntax...and we've been talking about new
>>> Python-style type references as well, which fits in here)
>>
>> This almost seems too magical for me; if one wants to use a buffer,
>> perhaps it is better to be explicit. But I'm curious to hear what
>> other people think.
>
> Let me just give one more concrete example for NumPy then. If you do
>
> cdef ndarray[int, 2] buf
>
> currently, then it is going to create code for checking whether it
> should do indirect access, which is one if-test per dimension per
> lookup -- and you *know* that for ndarrays, you can always get by
> with strided access, i.e. something like
>
> cdef ndarray[int, 2, 'strided'] buf
>
> Now, I would kind of like to be able to tell people to use
> "object[int, 2]" to write a generic buffer algorithm, but
> "ndarray[int, 2]" to write something that only works with NumPy in an
> optimized fashion, and that is it -- and the proposal kind of grew out
> of that; it is a way of letting the users not have to type
> mode="strided" all the time.
>
> (BTW, do you like mode=different strings for this, or should I go with
> "strided=True", "c=True", "fortran=True", etc?
I like just providing strings, which I am assuming map to access flags.
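
To make that assumption concrete, here is a rough plain-Python sketch of
mode strings mapping to access flags. The numeric values mirror the
PyBUF_* constants from PEP 3118; the mapping itself, the MODE_FLAGS table
and the flags_for_mode helper are hypothetical names for illustration,
not anything that exists in Cython.

```python
# Flag values mirror the PyBUF_* constants defined by PEP 3118.
PyBUF_ND = 0x0008
PyBUF_STRIDES = 0x0010 | PyBUF_ND
PyBUF_C_CONTIGUOUS = 0x0020 | PyBUF_STRIDES
PyBUF_F_CONTIGUOUS = 0x0040 | PyBUF_STRIDES
PyBUF_ANY_CONTIGUOUS = 0x0080 | PyBUF_STRIDES
PyBUF_INDIRECT = 0x0100 | PyBUF_STRIDES

# Hypothetical mapping from a mode string to the flags to request.
# Which flag each mode should use is an assumption, not settled design.
MODE_FLAGS = {
    "full": PyBUF_INDIRECT,          # may need indirect (suboffset) access
    "strided": PyBUF_STRIDES,        # plain strided access, no suboffsets
    "c": PyBUF_C_CONTIGUOUS,
    "fortran": PyBUF_F_CONTIGUOUS,
    "contig": PyBUF_ANY_CONTIGUOUS,
}


def flags_for_mode(mode):
    """Return the access flags for a mode string, rejecting unknown modes."""
    try:
        return MODE_FLAGS[mode]
    except KeyError:
        raise ValueError("unknown buffer mode: %r" % (mode,))
```

One advantage of strings over separate "strided=True", "c=True" keywords
is that the modes are mutually exclusive by construction, so there is no
conflicting combination to reject.
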
> There will be two modes at
> first: "full" and "strided", although if cython.buffer.bufptr is
> introduced then "c", "fortran", "contig" will be useful as well.)
>
> Of course I could do this only for the mode, but this doesn't seem to
> be a special case -- though most buffer use cases seem to be even more
> fixed (if you have a JPEG library, why bother to specify the ndim and
> so on?).
>
> As for an option for automatically retrieving a buffer, I might agree.
> Also I see the downside that it allows syntax like "JPEGImage[]" and
> "MultiDimImage[3]".
You have me convinced that providing defaults is a good thing (and I
agree many (most?) libraries/classes will have a fixed dimension/type).
The __cythonbuffermandatory__ is just there to turn what would be a
runtime error into a compile-time error, right? It may fail to be
true for subclasses. __cythonbufferalways__ can be assumed -- if there
is enough (default) information to provide a buffer, then do it;
otherwise don't.
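
As a minimal sketch of how I read the option resolution, modeled in plain
Python rather than in the compiler: resolve_buffer_options and
has_enough_info are hypothetical names, the set of required options
("dtype" and "ndim") is an assumption, and the exception stands in for
what would be a compile-time error.

```python
def resolve_buffer_options(defaults, mandatory, given):
    """Merge class-level buffer options with those written at the use site."""
    options = dict(defaults)   # the class author's defaults come first
    options.update(given)      # the user may override plain defaults
    for name, value in mandatory.items():
        if name in given and given[name] != value:
            # Stand-in for a compile-time error in the real compiler.
            raise TypeError("buffer option %r must be %r" % (name, value))
        options[name] = value  # mandatory options always apply
    return options


def has_enough_info(options, required=("dtype", "ndim")):
    """True if a buffer can be acquired without further user input."""
    return all(name in options for name in required)


# The Image example from earlier in the thread, with dtype as a string.
image_defaults = {"ndim": 2}
image_mandatory = {"dtype": "unsigned char", "indirect": True}

plain_image = resolve_buffer_options(image_defaults, image_mandatory, {})
```

Here a bare "cdef Image" resolves to a complete option set, so
has_enough_info(plain_image) holds and a buffer would be acquired
automatically, while a class with no defaults at all would not get one.
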
- Robert