Re: [Cython] coercion of char/Py_UNICODE to Python objects - string or integer?

Stefan Behnel Sun, 25 Apr 2010 22:17:15 -0700

Robert Bradshaw, 25.04.2010 07:05:
> I think char ->  bytes and Py_UNICODE ->  unicode make a lot of sense,
> my only concern would be backwards incompatibility.


There is another issue with the char/bytes case that shows up in Py3,
where indexing a bytes object returns an integer value, i.e.

     b"abcdefg"[4]

returns 'e' (str) in Py2 and 101 (int) in Py3. With my recent changes, the 
following now works in both environments:

     cdef bytes s = b"abcdefg"
     cdef char c = s[4]

and (efficiently) sets c to 101. The Py2/3 bytes difference also shows up
in other places, though, and Cython doesn't hide it. For example, the
bytes() constructor behaves differently:

     Py2:   bytes(3)  ==  b'3'
     Py3:   bytes(3)  ==  b'\0'*3

so this will not do the right thing in either Python version, since c is
passed to bytes() as the integer 101:

     cdef char c = b'e'
     s = bytes(c)

and neither will this in Py3, where chr() returns a unicode string:

     s = chr(c)

But at least this can be made to work in Cython:

     s = <bytes>c   # s == b'e'
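For comparison, here is a minimal sketch of how plain Python code (outside Cython) could sidestep both differences. It assumes the character arrives as an integer code; char_to_bytes is a hypothetical helper name, not something Cython or Python provides. It relies on struct.pack() and on one-element slicing, which return a bytes object on both Py2 and Py3:

```python
import struct


def char_to_bytes(c):
    """Return a length-1 bytes object for an integer character code.

    Hypothetical helper for illustration only; not part of Cython.
    """
    # Mask to 0..255 so a negative value (as a signed C char may
    # produce) still fits the unsigned "B" format.
    return struct.pack("B", c & 0xFF)


s = b"abcdefg"

# Unlike s[4], a one-element slice is a bytes object in both Py2 and Py3.
assert s[4:5] == b"e"

# Integer code -> single byte, without going through bytes() or chr().
assert char_to_bytes(101) == b"e"
```

The mask is a design choice for the signed-char case; if the value is known to be in 0..255 already, struct.pack("B", c) alone is enough.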

We may still have to do more to make the bytes type really usable in Cython 
in a portable way...

Stefan
