2011/1/15 Stefan Behnel <stefan...@behnel.de>:
> Stefan Behnel, 15.01.2011 19:13:
>> Vitja Makarov, 15.01.2011 18:46:
>>> What will it print for '\u1234'?
>>>
>>> Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56)
>>> [GCC 4.4.5] on linux2
>>> Type "help", "copyright", "credits" or "license" for more information.
>>> >>> '\u'
>>> '\\u'
>>> >>> '\u1234'
>>> '\\u1234'
>>> >>>
>>>
>>> I think that '\u' should be translated into '\\u' for python2
>>
>> That's what it does, yes. This works because we actually parse unprefixed
>> strings in parallel as byte strings and unicode strings.
>>
>> However, now that I tried it, I actually get the same result in Py3,
>> although it should have parsed the string correctly. Not sure if we
>> discussed this problem before, but it looks like a bug to me.
>
> Thinking about this some more, it's inconsistent either way.
>
> 1) If the literal string semantics should be fixed at compile time, you
> shouldn't get a unicode string in Python 3 in the first place.
>
> 2) If the literal should become a byte string in Py2 and a unicode string
> in Py3, then the unicode string should be what you'd get if you ran
> your code in Py3, i.e. the unescaped unicode literal.
>
> Given that 1) is out of discussion, 2) should be fixed, IMHO.
>
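
For illustration, the two readings of an unprefixed '\u1234' described in 2) can be sketched in plain Python 3. This is only a rough model of the \u case, not Cython's actual parser, and the helper names as_py2_byte_string and as_py3_unicode_string are made up for this example.

```python
# The six characters that appear between the quotes in the source file.
SOURCE_TEXT = r"\u1234"


def as_py2_byte_string(literal_body):
    # Hypothetical helper: a Py2 byte string literal has no \u escape,
    # so the backslash and the following characters are kept verbatim.
    # (Other escapes such as \x are not modelled here.)
    return literal_body.encode("ascii")


def as_py3_unicode_string(literal_body):
    # Hypothetical helper: a Py3 str literal resolves \uXXXX into a
    # single code point.
    return literal_body.encode("ascii").decode("unicode_escape")


byte_value = as_py2_byte_string(SOURCE_TEXT)
unicode_value = as_py3_unicode_string(SOURCE_TEXT)

# ascii() keeps the output printable on terminals without Unicode support.
print(repr(byte_value), len(byte_value))
print(ascii(unicode_value), len(unicode_value))
```

Under these assumptions the byte string keeps all six characters, while the unicode string collapses to the single character U+1234, which is the result 2) asks for in Py3.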
Can't we rely on the -2/-3 cython switch? In -2 mode strings are always
byte strings, and in -3 mode always unicode?

About the '\xFF' error, take a look at
https://sage.math.washington.edu:8091/hudson/view/cython-devel/job/cython-devel-tests-pyregr-py27-c/294/console

There are lots of similar errors reported.

test_audioop.py:

Error compiling Cython file:
------------------------------------------------------------
...
    def test_lin2adpcm(self):
        # Very cursory test
        self.assertEqual(audioop.lin2adpcm('\0\0\0\0', 1, None), ('\0\0', (0,0)))

    def test_lin2alaw(self):
        self.assertEqual(audioop.lin2alaw(data[0], 1), '\xd5\xc5\xf5')

--
vitja.