Re: [Cython] string literal parsing problem

Vitja Makarov Sat, 15 Jan 2011 08:44:43 -0800

2011/1/15 Stefan Behnel <stefan...@behnel.de>:
> Vitja Makarov, 15.01.2011 17:21:
>> 2011/1/15 Stefan Behnel:
>>> Vitja Makarov, 15.01.2011 15:02:
>>>> Looking into the pyregr test log, I found that this code crashes the
>>>> Cython compiler:
>>>>
>>>> print('\uXX')
>>>>
>>>> Here is traceback:
>>>>
>>>>     File "Cython/Compiler/Parsing.py", line 788, in p_string_literal
>>>>       chrval = int(systr[2:], 16)
>>>> ValueError: invalid literal for int() with base 16: ''
>>>
>>> Hmm, right, the scanner notices that '\uXX' is not a valid Unicode escape
>>> sequence and reads it as '\u' + 'XX'.
>>>
>>> Good catch, I'll fix it.
>>>
>>> http://trac.cython.org/cython_trac/ticket/647
>
> The same applies to hex sequences, BTW.
>
>
>> Please note that '\u' is a valid string but not a valid unicode string,
>
> I know, I've written (and rewritten) most of that code. ;-)
>
>
>> so it's valid in py2 and not py3.
>
> Nope, it's valid in byte strings but not in unicode strings. Py2/Py3 is not
> an issue here. Invalid hex sequences should trigger an error in both cases.
>
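
To make the failure concrete, here is a rough sketch of the problem and of
one possible check. The function names and the digit table are my own
assumptions for illustration; this is not the code in
Cython/Compiler/Parsing.py, only the int(systr[2:], 16) call is taken from
the traceback.

```python
import string

# Hypothetical table: hex digits required after each escape prefix.
HEX_DIGITS = {'x': 2, 'u': 4, 'U': 8}

def naive_escape_value(systr):
    # Mirrors the failing line from the traceback: for the token '\u'
    # the slice systr[2:] is empty and int('', 16) raises ValueError.
    return int(systr[2:], 16)

def checked_escape_value(systr):
    """Return the code point of a complete \\xHH, \\uHHHH or \\UHHHHHHHH token."""
    kind, digits = systr[1:2], systr[2:]
    expected = HEX_DIGITS.get(kind)
    # Check the digits explicitly: int() alone would also accept
    # surrounding whitespace or a sign.
    if (expected is None or len(digits) != expected
            or not all(c in string.hexdigits for c in digits)):
        raise ValueError("invalid escape sequence: %r" % systr)
    return int(digits, 16)
```

With such a check the compiler could report a proper error at the string
position instead of crashing with a bare ValueError.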


When I say py3, I mean that strings are unicode by default there.
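
A small sketch of what I mean, using plain CPython 3 rather than Cython.
The helper name literal_compiles is made up for this example, and I am
assuming the Python 3 releases I know of, where an unknown escape in a
byte string only produces a warning.

```python
import warnings

def literal_compiles(source):
    """Return True if `source` compiles as a Python 3 expression."""
    with warnings.catch_warnings():
        # Newer Python 3 versions warn about unknown escapes such as
        # b'\u'; silence that so only real syntax errors are reported.
        warnings.simplefilter("ignore")
        try:
            compile(source, "<literal>", "eval")
        except SyntaxError:
            return False
    return True

# Unprefixed literal: a unicode string in py3, so '\uXX' is a truncated escape.
str_ok = literal_compiles(r"'\uXX'")
# Byte string: \u is not an escape there, so the backslash is kept as-is.
bytes_ok = literal_compiles(r"b'\uXX'")
```

Here str_ok comes out False and bytes_ok comes out True, which is the
byte string / unicode string distinction from above.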


How about this kind of error:

Error converting Pyrex file to C:
------------------------------------------------------------
...
print "\xff"
     ^
------------------------------------------------------------

/tmp/foo.pyx:1:6: String decoding as 'UTF-8' failed. Consider using a
byte string or unicode string explicitly, or adjust the source code
encoding.


I think it should be OK in py2 mode and give an error in py3.
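
For reference, a rough illustration in plain Python 3 (not Cython's code
path) of why the decoding step fails: the byte 0xFF never occurs in valid
UTF-8, while Latin-1 maps every byte to the code point with the same value.

```python
raw = b"\xff"  # the single byte that "\xff" denotes in a byte string

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    utf8_error = exc       # 0xFF is not a valid UTF-8 start byte
else:
    utf8_error = None

# Latin-1 decoding always succeeds and yields U+00FF here.
as_latin1 = raw.decode("latin-1")

# Going the other way, the character U+00FF takes two bytes in UTF-8.
as_utf8 = "\xff".encode("utf-8")
```

So treating "\xff" as a byte string is unambiguous, but reading it as
UTF-8 encoded text is not possible.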

-- 
vitja.
_______________________________________________
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev
