Re: [Cython] string literal parsing problem

Stefan Behnel Sat, 15 Jan 2011 10:13:52 -0800

Vitja Makarov, 15.01.2011 18:46:
> 2011/1/15 Stefan Behnel:
>> Vitja Makarov, 15.01.2011 17:44:
>>> When I say py3, I mean that strings are unicode by default.
>>
>> For this code
>>
>>    '\u'
>>
>> Cython now gives you an error just like you'd get when pasting the above
>> into Python 3. That's how 'str' works in Cython.
>
> That's OK with me, but it will break some pyregr tests.


I know, but that's what was decided after a lot of discussion. It's there
to make users' lives easier.


> What will it print for '\u1234'?
>
> Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56)
> [GCC 4.4.5] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> '\u'
> '\\u'
> >>> '\u1234'
> '\\u1234'
> >>>
>
> I think that '\u' should be translated into '\\u' for python2

That's what it does, yes. This works because we actually parse unprefixed 
strings in parallel as byte strings and unicode strings.
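
To illustrate the idea of reading one unprefixed literal both ways, here is
a minimal Python 3 sketch. It is not Cython's actual parser code. The helper
names (as_byte_string, as_unicode_string, parse_both) are made up for this
example, the byte side only understands \xHH escapes (an escaped backslash
in front of an x would be misread), and the literal body is assumed to be
plain ASCII.

```python
import re

# Hypothetical helpers for illustration only, not Cython's parser.
_HEX_ESCAPE = re.compile(r'\\x([0-9a-fA-F]{2})')


def as_byte_string(body):
    # Byte-string reading: only \xHH is translated in this sketch; anything
    # else, including \u, is copied through with its backslash intact.
    out = bytearray()
    pos = 0
    for match in _HEX_ESCAPE.finditer(body):
        out += body[pos:match.start()].encode('ascii')
        out.append(int(match.group(1), 16))
        pos = match.end()
    out += body[pos:].encode('ascii')
    return bytes(out)


def as_unicode_string(body):
    # Unicode reading: let the unicode_escape codec resolve the escapes.
    # A truncated escape such as a bare \u raises UnicodeDecodeError,
    # which this sketch reports as None.
    try:
        return body.encode('ascii').decode('unicode_escape')
    except UnicodeDecodeError:
        return None


def parse_both(body):
    # Return (byte-string value, unicode value or None) for one literal body.
    return as_byte_string(body), as_unicode_string(body)


for body in (r'\u', r'\u1234', r'\xff'):
    print(body, parse_both(body))
```

In this sketch the byte reading of \u1234 keeps all six characters, while
the unicode reading yields the single character U+1234, and a bare \u only
has a byte reading.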

However, now that I've tried it, I actually get the same result in Py3,
although it should have parsed the string correctly. I'm not sure if we
discussed this problem before, but it looks like a bug to me.


>>> How about this kind of error:
>>>
>>> Error converting Pyrex file to C:
>>> ------------------------------------------------------------
>>> ...
>>> print "\xff"
>>>        ^
>>> ------------------------------------------------------------
>>>
>>> /tmp/foo.pyx:1:6: String decoding as 'UTF-8' failed. Consider using a
>>> byte string or unicode string explicitly, or adjust the source code
>>> encoding.
>>
>> This error has been gone for good for quite a while now. Where did you see 
>> it?
>
> Hmm, it's still there:
>
> vitja@vitja-laptop:~/work/cython.git$ cat foo.pyx
> print '\xFF'
> vitja@vitja-laptop:~/work/cython.git$ python cython.py foo.pyx
>
> Error compiling Cython file:
> ------------------------------------------------------------
> ...
> print '\xFF'
>       ^
> ------------------------------------------------------------
>
> foo.pyx:1:6: Decoding unprefixed string literal from 'UTF-8' failed.
> Consider using a byte string or unicode string explicitly, or adjust
> the source code encoding.

Weird, I didn't get that.

Anyway, different error message, different place to look then. This is
related to the above problem. Once both string representations get written
into the C file correctly, this error will no longer be necessary.
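
For reference, here is a small Python 3 check of what the message seems to
be complaining about, going only by its wording: the byte 0xFF that '\xFF'
stands for is not valid UTF-8 on its own, whereas the same byte decodes fine
as Latin-1. The helper name decodes_as is made up for this example, and the
sketch says nothing about where in the compiler the decoding happens.

```python
def decodes_as(data, encoding):
    # Hypothetical helper: True if the bytes are valid in the given encoding.
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# The raw byte 0xFF is never valid as a standalone UTF-8 sequence.
print(decodes_as(b'\xff', 'utf-8'))                 # False
# Latin-1 maps every byte value to a code point, so this always succeeds.
print(decodes_as(b'\xff', 'latin-1'))               # True
# U+00FF encoded as UTF-8 is two bytes, and those do decode as UTF-8.
print(decodes_as('\xff'.encode('utf-8'), 'utf-8'))  # True
```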

Stefan
_______________________________________________
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev
