On Sep 11, 2009, at 5:36 PM, Dominic Sacré wrote:

> Hi,
>
> I'm trying to make a Pyrex/Cython module that was originally written for
> Python 2.x work with Python 3.x, while at the same time keeping it
> compatible with older versions.
>
> It seems like when using Python 3.x, Cython will automatically replace
> 'unicode' with 'str', and 'str' with 'bytes'. Also, string literals are
> interpreted as 'bytes' unless prefixed with 'u'.
> However, 'bytes' is not really useful in a context where an actual
> string is expected, and causes problems for example when working with
> strings passed from Python.
> (One of many issues I have run into is the fact that b"foo" != "foo"...)
>
> The only solution I've found to at least get most of my code working is
> basically to use unicode for almost everything,

I think this is (unfortunately) by design.

> but if possible I'd like
> to avoid unicode strings in the 2.x version.
>
> Is there a sane way to use the native string type (i.e. 'str') in either
> Python version?

How to handle strings/unicode, especially in Python 3, has been a
huge area of debate on the list. However, I'm surprised that str is
mapped to bytes in Python 3. What was the justification for this, or
is it just a bug? I think if

def foo():
    return str, isinstance("abc", str)

behaves differently in Cython and Python, then there's a bug
(unless there's a *very* good reason for the difference). I'm not
trying to re-advocate automatic char* <-> unicode conversions.
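
For reference, here is a minimal sketch of what I'd expect, run as plain
Python rather than compiled with Cython. The version check and the
assertions are only my illustration of the expected semantics; they say
nothing about what Cython currently generates.

```python
import sys

def foo():
    # Same body as the example above, executed by the plain interpreter.
    return str, isinstance("abc", str)

native_type, is_native = foo()

# In plain Python (2.x or 3.x), an unprefixed literal is an instance of
# the native 'str' type, so the second element is always True.
assert native_type is str
assert is_native

if sys.version_info[0] >= 3:
    # Under Python 3, 'str' is the text type and is distinct from 'bytes'.
    assert native_type is not bytes
    # bytes and str never compare equal, which is the b"foo" != "foo"
    # problem mentioned in the original mail.
    assert b"foo" != "foo"
    # An explicit decode is needed to get back to a native string.
    assert b"foo".decode("ascii") == "foo"
```

If the compiled module returned bytes as the first element, or False as
the second, that would be the divergence I'm calling a bug.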

- Robert
