Nathaniel Smith, 12.12.2009 10:05:
> After upgrading to Cython 0.12 today (Python 2.5.2, x86-64, linux),
> some code of mine broke. Specifically, it's code for reading a binary
> format, and in the tests I had a string that made Cython fail to
> compile with the error:
>   String decoding as 'UTF-8' failed. Consider using a byte string or
> unicode string explicitly, or adjust the source code encoding.
> 
> As an example, here's a complete file that Cython 0.12 will refuse to compile:
> -------------
> s = "\x12\x34\x9f\x65"
> -------------
>
> I'm not sure why it's nattering about the source code encoding when
> the problem is with explicitly quoted byte values

Because you are using an unprefixed 'str' literal. In Python 3 it has to be
decoded to become the equivalent str (i.e. unicode) object, so Cython checks
at compile time that its bytes can be decoded. That check is required for the
semantics of the 'str' type in Cython: without it, it would be impossible to
switch the type in the generated C code, since you simply can't write out a
unicode literal into C in a portable way.
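
To illustrate, here is a minimal sketch in plain Python 3 of roughly the kind
of check that fails. It is not Cython's actual implementation, and the helper
name decodes_as_utf8 is made up for this example. It takes the four bytes your
escapes spell out and tries to decode them as UTF-8:

```python
# The four bytes spelled out by the escapes in the failing literal.
raw = b"\x12\x34\x9f\x65"


def decodes_as_utf8(data):
    """Return True if `data` is a valid UTF-8 byte sequence (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(decodes_as_utf8(raw))          # False: 0x9f is a stray continuation byte
print(decodes_as_utf8(b"\x12\x34"))  # True: plain ASCII bytes decode fine
```

The byte 0x9f can only appear in UTF-8 as part of a multi-byte sequence, so
the literal as a whole has no valid str equivalent.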

The relevant CEP is here:

http://wiki.cython.org/enhancements/stringliterals


> but... my question
> is, I can fix this by adding a "b" sigil on the front, but that's
> incompatible with earlier versions of Cython.

Yes, bytes literals were fixed up fairly recently - may have been 0.11 or
so. Given that they were partly broken before that, I don't really see why
you would want to support earlier versions of Cython anyway.


> Is there any way to
> write this string that will work with all versions of Cython?

I'd just drop support for earlier Cython versions and go with an explicit
b'...' literal.
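
For a test of a binary reader, that might look like the sketch below. It is
written as plain Python 3, and the record layout (two big-endian unsigned
16-bit fields) is only an assumption for illustration, not something taken
from your code:

```python
import struct

# Explicit bytes literal: the escapes are raw byte values and are never decoded.
data = b"\x12\x34\x9f\x65"

# Hypothetical record layout: two big-endian unsigned 16-bit fields.
first, second = struct.unpack(">HH", data)

print(len(data))                # 4
print(hex(first), hex(second))  # 0x1234 0x9f65
```

With the b prefix the value is a byte string under both Python 2 and Python 3
semantics, so no decoding check applies to it.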


> (And was it really intentional to break Python source compatibility so badly?)

What do you mean? And what version of Python are you referring to?

Stefan

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev