On Sep 11, 2009, at 5:36 PM, Dominic Sacré wrote:
> Hi,
>
> I'm trying to make a Pyrex/Cython module that was originally written
> for Python 2.x work with Python 3.x, while at the same time keeping it
> compatible with older versions.
>
> It seems like when using Python 3.x, Cython will automatically replace
> 'unicode' with 'str', and 'str' with 'bytes'. Also, string literals are
> interpreted as 'bytes' unless prefixed with 'u'.
> However, 'bytes' is not really useful in a context where an actual
> string is expected, and causes problems for example when working with
> strings passed from Python.
> (One of many issues I have run into is the fact that
> b"foo" != "foo"...)
>
> The only solution I've found to at least get most of my code working
> is basically to use unicode for almost everything,
I think this is (unfortunately) by design.
> but if possible I'd like
> to avoid unicode strings in the 2.x version.
>
> Is there a sane way to use the native string type (i.e. 'str') in
> either Python version?
How to handle strings/unicode, especially in Python 3, has been a
huge area of debate on the list. However, I'm surprised that str is
mapped to bytes in Python 3. What was the justification for this, or
is it just a bug? I think that if

    def foo():
        return str, isinstance("abc", str)

behaves differently in Cython and Python, then there's a bug
(unless there's a *very* good reason for it). I'm not trying to
re-advocate automatic char* <-> unicode conversions.
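
For reference, here is a minimal sketch of what the plain CPython
interpreter does with that function, which is the behavior I would expect
compiled code to match. It only exercises pure Python and says nothing
about what Cython actually generates; the sys.version_info check is just
there so the same file runs on both 2.x and 3.x.

```python
import sys


def foo():
    # Same function as above, executed by the plain interpreter.
    return str, isinstance("abc", str)


native_type, is_native = foo()

# In plain Python an unprefixed literal is an instance of the builtin
# 'str', on both 2.x and 3.x.
assert native_type is str
assert is_native

if sys.version_info[0] >= 3:
    # On Python 3, bytes and str never compare equal, which is the
    # b"foo" != "foo" problem mentioned in the original question.
    assert b"foo" != "foo"
    # An explicit decode is needed to get back to the native string type.
    assert b"foo".decode("ascii") == "foo"
```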
- Robert
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev