Vitja Makarov, 15.01.2011 18:46:
> 2011/1/15 Stefan Behnel:
>> Vitja Makarov, 15.01.2011 17:44:
>>> When I say about py3 I mean that strings are unicode by default.
>>
>> For this code
>>
>>     '\u'
>>
>> Cython now gives you an error just like you'd get when pasting the above
>> into Python 3. That's how 'str' works in Cython.
>
> That's ok to me, but will break some pyregr tests.

I know, but that's what was decided (as the result of a huge amount of
discussion). It's there to make the life of the users easier.

> What will it print for '\u1234'?
>
> Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56)
> [GCC 4.4.5] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> '\u'
> '\\u'
> >>> '\u1234'
> '\\u1234'
> >>>
>
> I think that '\u' should be translated into '\\u' for python2

That's what it does, yes. This works because we actually parse unprefixed
strings in parallel as byte strings and unicode strings. However, now that I
tried it, I actually get the same result in Py3, although it should have
parsed the string correctly. Not sure if we discussed this problem before,
but it looks like a bug to me.

>>> How about this kind of errors:
>>>
>>> Error converting Pyrex file to C:
>>> ------------------------------------------------------------
>>> ...
>>> print "\xff"
>>>       ^
>>> ------------------------------------------------------------
>>>
>>> /tmp/foo.pyx:1:6: String decoding as 'UTF-8' failed. Consider using a
>>> byte string or unicode string explicitly, or adjust the source code
>>> encoding.
>>
>> This error has been gone for good for quite a while now. Where did you see
>> it?
>
> Hmm, it's still there:
>
> vitja@vitja-laptop:~/work/cython.git$ cat foo.pyx
> print '\xFF'
> vitja@vitja-laptop:~/work/cython.git$ python cython.py foo.pyx
>
> Error compiling Cython file:
> ------------------------------------------------------------
> ...
> print '\xFF'
>       ^
> ------------------------------------------------------------
>
> foo.pyx:1:6: Decoding unprefixed string literal from 'UTF-8' failed.
> Consider usinga byte string or unicode string explicitly, or adjust
> the source code encoding.

Weird, I didn't get that. Anyway, different error message, different place
to look then. This is related to the above problem. If both string
representations get written into the C file correctly, this is no longer
necessary.
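
To make the "two readings" of an unprefixed literal concrete, here is a
minimal sketch in plain Python 3. It is not Cython's parser: it only reuses
Python's own literal rules through ast.literal_eval, and the helper name
parse_both is made up for illustration. It also assumes the literal body
contains no quote characters.

```python
import ast
import warnings


def parse_both(body):
    """Read the body of an unprefixed literal under both sets of rules.

    Hypothetical helper for illustration only. Returns a dict with the
    value under byte-string rules and under unicode rules; a reading that
    is not a valid literal is stored as the SyntaxError it raised.
    """
    results = {}
    for kind, prefix in (("bytes", "b"), ("unicode", "")):
        source = prefix + "'" + body + "'"
        try:
            with warnings.catch_warnings():
                # Byte strings treat '\u' as an unknown escape and only warn.
                warnings.simplefilter("ignore")
                results[kind] = ast.literal_eval(source)
        except SyntaxError as exc:
            results[kind] = exc
    return results


if __name__ == "__main__":
    for body in (r"\u", r"\u1234", r"\xff"):
        print(repr(body), parse_both(body))
```

Under byte-string rules, '\u' and '\u1234' keep their backslash, which
matches the Python 2.6 session quoted above. Under unicode rules, '\u1234'
is a single character and '\u' alone is rejected as a truncated escape.
For '\xff' both readings are valid literals, but the byte b'\xff' on its
own is not valid UTF-8, so b'\xff'.decode('utf-8') raises
UnicodeDecodeError.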

Stefan