On 08/03/2011 01:48 PM, Richard D. Moores wrote:
On Wed, Aug 3, 2011 at 10:11, Peter Otten<__pete...@web.de>  wrote:

<SNIP>
Dave was close, but Steven hit the nail: the string r"C:\Users\Dick\..." is
fine, but when you put it into the docstring it is not a raw string within
another string, it becomes just a sequence of characters that is part of the
outer string. As such \U marks the beginning of a special way to define a
unicode codepoint:
<snip>
Here's from my last post:

====================================
Now I edit it back to its original problem form:

def convertPath(path):
    """
    Given a path with backslashes, return that path with forward slashes.

    By Steven D'Aprano  07/31/2011 on Tutor list
    >>> path = r'C:\Users\Dick\Desktop\Documents\Notes\College Notes.rtf'
    >>> convertPath(path)
    'C:/Users/Dick/Desktop/Documents/Notes/College Notes.rtf'
    """<snip>

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python32\lib\site-packages\mycalc2.py", line 10
    """
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes
in position 144-146: truncated \UXXXXXXXX escape

Using HxD, I find that the bytes in 144-146 are 20, 54, 75 or  the
<space>, 'T', 'u' of  " Tutor" .  A screen shot of HxD with this
version of mycalc2.py open in it is at
<http://www.rcblue.com/images/HxD.jpg>. You can see that I believe the
offset integers are base-10 ints. I do hope that's correct, or I've
done a lot of work for naught.
====================================

So have I not used HxD correctly (my first time to use a hex reader)?
If I have used it correctly, why do the reported problem offsets of
144-146 correspond to such innocuous things as 'T', 'u' and<space>,
and which come BEFORE the problems you and Steven point out?

This one is my fault, for pointing you to the hex viewer. Peter is correct. The offset is relative to the beginning of the triple-quoted string, not to the beginning of the file. The problem has nothing to do with the encoding of the file itself; it comes from the backslashes inside the triple-quoted string. Since you have a \U, the parser expects exactly 8 hex digits to follow it. What threw me was that this particular symptom is specific to Python 3.x, which I don't normally use.
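As a rough sketch of one way around this, the docstring itself can be marked raw with an r prefix, so the backslashes in the example path stay literal. Note the assumptions: the function body below is a guess, since the original was snipped from the post, and the raw docstring is just one option (doubling each backslash inside the docstring would be another).

```python
def convertPath(path):
    r"""
    Given a path with backslashes, return that path with forward slashes.

    >>> path = r'C:\Users\Dick\Desktop\Documents\Notes\College Notes.rtf'
    >>> convertPath(path)
    'C:/Users/Dick/Desktop/Documents/Notes/College Notes.rtf'
    """
    # Assumed body: the original implementation was snipped from the post.
    return path.replace("\\", "/")


path = r'C:\Users\Dick\Desktop\Documents\Notes\College Notes.rtf'
print(convertPath(path))
```

Because of the r prefix on the docstring, the \U in C:\Users is no longer read as the start of a unicode escape, so the module compiles.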

The following line would have the same problem:

mystring = "abc \Unexpected def"

since the eight characters following the \U, nexpecte, are not all valid hex digits. You would instead want a raw string:

mystring = r"abc \Unexpected def"
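To see the difference without breaking your own file, here is a small sketch for Python 3 that hands both one-line programs to compile() as text. The names bad_source, good_source and namespace are made up for this illustration.

```python
# Raw string literals hold the two one-line programs as text, so this
# file itself contains no bad escape.
bad_source = r'mystring = "abc \Unexpected def"'
good_source = r'mystring = r"abc \Unexpected def"'

try:
    compile(bad_source, "<bad>", "exec")
    bad_error = None
except SyntaxError as exc:
    # Python 3 rejects the non-raw literal while compiling it.
    bad_error = exc
    print("rejected:", exc.msg)

# The raw version compiles, and the backslash survives as an ordinary character.
namespace = {}
exec(compile(good_source, "<good>", "exec"), namespace)
print(namespace["mystring"])
```

The first program fails at compile time, before any of it runs, which is why the traceback points at the string literal rather than at a line that uses it.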

--

DaveA

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
