Matthew Honnibal wrote:
> Hi,
> I've only just started using Cython today, and I'm having trouble with
> the buffer interface indexing described here:
> http://wiki.cython.org/enhancements/buffer . I want to iterate over a
> unicode string getting contiguous subsequences.
>
The buffer PEP is *available* in Python 2.6; however, I don't think any
objects in the Python standard library export their buffers using it,
unfortunately.
What you can try to do is use the backwards-compatibility mechanism of
implementing __getbuffer__ in Cython, something like (untested):
    from python_unicode cimport Py_UNICODE

    cdef extern from "Python.h": # Or Python's unicodeobject.h
        ctypedef class unicode [object PyUnicodeObject]:
            Py_ssize_t length
            Py_UNICODE *str

            def __getbuffer__(self, Py_buffer* buf, int flags):
                ... fill in buf struct with PEP 3118 information to export
                self.str/self.length ...
Notes:
a) If you only want to deal with unicodes, you can probably just as well
drop __getbuffer__. With the declaration above, you can still do
    cdef unicode u = myunicode
    cdef Py_UNICODE *buf = u.str
    print buf[3] # gets 4th unicode character
without any buffer support.
b) If you do write up a decent unicode declaration, make sure to
contribute it to Cython's Cython/Includes/python_unicode.
c) If you go the __getbuffer__ route for more convenient syntax, be
aware that unicode types are not supported as buffer element types; you
need to export the data as "=I", "=H", or "=B" (unsigned int, short,
byte) depending on sizeof(Py_UNICODE) (see the struct module), and then
acquire a buffer through
    cdef unicode[Py_UNICODE] u = myunicode
    cdef Py_UNICODE onechar = u[3]
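
To make the format-code choice in (c) concrete, here is a rough sketch in
plain Python (not Cython) using only the struct module. The helper names
format_for_code_unit and code_units are made up for illustration, and the
UTF-16/UTF-32 encoding step is only a stand-in for the raw Py_UNICODE
array, assuming 2- or 4-byte code units in native byte order:

```python
import struct
import sys


def format_for_code_unit(size):
    """Return the struct format code for an unsigned integer of `size` bytes."""
    for code in ("=B", "=H", "=I"):
        if struct.calcsize(code) == size:
            return code
    raise ValueError("no struct format for %d-byte code units" % size)


def code_units(text, size):
    """Encode `text` as fixed-width code units and unpack them as integers.

    Only 2- and 4-byte units are handled here (UTF-16 / UTF-32 in native
    byte order), mirroring the two usual values of sizeof(Py_UNICODE).
    """
    suffix = "-le" if sys.byteorder == "little" else "-be"
    codec = {2: "utf-16", 4: "utf-32"}[size] + suffix
    raw = text.encode(codec)
    fmt = format_for_code_unit(size)
    return [struct.unpack_from(fmt, raw, i)[0] for i in range(0, len(raw), size)]


units = code_units(u"abcd", 2)
print(format_for_code_unit(2), units[3] == ord("d"))
```

In the Cython version, the string returned by something like
format_for_code_unit(sizeof(Py_UNICODE)) is what would go into the format
field of the Py_buffer struct.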
Dag Sverre