Re: [Cython] string literal parsing problem

Stefan Behnel Sat, 15 Jan 2011 10:20:47 -0800

Stefan Behnel, 15.01.2011 19:13:
> Vitja Makarov, 15.01.2011 18:46:
>> What will it print for '\u1234'?
>>
>> Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56)
>> [GCC 4.4.5] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> '\u'
>> '\\u'
>> >>> '\u1234'
>> '\\u1234'
>> >>>
>>
>> I think that '\u' should be translated into '\\u' for python2
>
> That's what it does, yes. This works because we actually parse unprefixed
> strings in parallel as byte strings and unicode strings.
>
> However, now that I tried it, I actually get the same result in Py3,
> although it should have parsed the string correctly. Not sure if we
> discussed this problem before, but it looks like a bug to me.


Thinking about this some more, it's inconsistent either way.

1) If the literal string semantics should be fixed at compile time, you 
shouldn't get a unicode string in Python 3 in the first place.

2) If the literal should become a byte string in Py2 and a unicode string 
in Py3, then the unicode string should be what you'd get if you ran 
your code in Py3, i.e. the unescaped unicode literal.

Given that 1) is out of the question, 2) should be fixed, IMHO.
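
The two readings in 2) can be illustrated in plain Python 3. This is only a
sketch of the intended semantics, not Cython's parser: the variable names are
made up for the example, and the standard `unicode_escape` codec stands in for
what a Py3 compiler does with an unprefixed literal.

```python
# The six characters written between the quotes in the source file.
source = r"\u1234"

# Py2 reading: an unprefixed literal is a byte string, where \u is not an
# escape, so the backslash survives and all six bytes are kept.
py2_value = source.encode("ascii")

# Py3 reading: an unprefixed literal is a unicode string, where \u1234 is a
# single code point (U+1234).
py3_value = source.encode("ascii").decode("unicode_escape")

print(len(py2_value), repr(py2_value))
print(len(py3_value), hex(ord(py3_value)))
```

The byte-string value has six bytes and still contains the backslash, while
the unicode value is one character. The second result is what the literal
should evaluate to when the compiled module runs under Py3.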

Stefan
