2011/1/15 Stefan Behnel <stefan...@behnel.de>:
> Vitja Makarov, 15.01.2011 17:21:
>> 2011/1/15 Stefan Behnel:
>>> Vitja Makarov, 15.01.2011 15:02:
>>>> Looking into pyregr test log, I found that this code crashes cython
>>>> compiler:
>>>>
>>>> print('\uXX')
>>>>
>>>> Here is traceback:
>>>>
>>>>   File "Cython/Compiler/Parsing.py", line 788, in p_string_literal
>>>>     chrval = int(systr[2:], 16)
>>>> ValueError: invalid literal for int() with base 16: ''
>>>
>>> Hmm, right, the scanner notices that '\uXX' is not a valid Unicode escape
>>> sequence and reads it as '\u' + 'XX'.
>>>
>>> Good catch, I'll fix it.
>>>
>>> http://trac.cython.org/cython_trac/ticket/647
>
> The same applies to hex sequences, BTW.
>
>
>> Please notice that '\u' is valid string but not unicode string,
>
> I know, I've written (and rewritten) most of that code. ;-)
>
>
>> so it's valid in py2 and not py3.
>
> Nope, it's valid in byte strings but not in unicode strings. Py2/Py3 is not
> an issue here. Invalid hex sequences should trigger an error in both cases.
>
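For illustration, the quoted traceback comes down to `int('', 16)`: once '\uXX' is read as '\u' + 'XX', `systr[2:]` is empty. Below is a minimal sketch of a stricter check. The helper names `parse_unicode_escape` and `reproduce_crash` are hypothetical, the choice of `SyntaxError` is an assumption, and none of this is the actual code in Cython/Compiler/Parsing.py.

```python
import re

# Hypothetical helper for illustration only; not the code in
# Cython/Compiler/Parsing.py. Accepts exactly a backslash, 'u', 4 hex digits.
_UNICODE_ESCAPE = re.compile(r'\\u([0-9A-Fa-f]{4})\Z')


def parse_unicode_escape(systr):
    """Return the code point of a 4-digit unicode escape or raise SyntaxError."""
    match = _UNICODE_ESCAPE.match(systr)
    if match is None:
        # Report the bad escape instead of letting int('', 16) raise ValueError.
        raise SyntaxError("invalid \\u escape sequence: %r" % (systr,))
    return int(match.group(1), 16)


def reproduce_crash(systr):
    """Mimic the failing expression from the traceback: int(systr[2:], 16)."""
    return int(systr[2:], 16)
```

With this sketch, `parse_unicode_escape('\\u00e9')` yields the code point, while a bare `'\\u'` is rejected with a readable error rather than a `ValueError` from `int()`.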
When I mention py3, I mean that strings are unicode by default there.

What about this kind of error:

Error converting Pyrex file to C:
------------------------------------------------------------
...
print "\xff"
      ^
------------------------------------------------------------

/tmp/foo.pyx:1:6: String decoding as 'UTF-8' failed. Consider using a byte string or unicode string explicitly, or adjust the source code encoding.

I think it should be OK in py2 mode and give an error in py3.

--
vitja.
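For illustration, the ambiguity behind the "\xff" error can be shown in plain Python 3. Read as a byte string, the literal is the single byte 0xFF, which is not valid UTF-8; read as a unicode string, it is the character U+00FF. This is a sketch only: the helper name `decodes_as_utf8` is made up here, and it does not show how Cython itself decides.

```python
# Illustration only (plain Python 3, not Cython internals).
byte_literal = b"\xff"      # one byte, 0xFF
text_literal = "\xff"       # one character, U+00FF


def decodes_as_utf8(data):
    """Return True if the bytes are valid UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(decodes_as_utf8(byte_literal))    # a lone 0xFF byte is not valid UTF-8
print(text_literal == "\u00ff")         # the text literal is U+00FF
print(text_literal.encode("utf-8"))     # encoding it takes two bytes
```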