Re: [Tutor] string conversion

Steven D'Aprano Mon, 28 Feb 2011 15:46:43 -0800

Robert Clement wrote:

Hi
I have a wxpython control in which users are intended to enter control characters used to define binary string delimiters, eg. '\xBA\xBA' or '\t\r\n'.

Do you mean that your users enter *actual* control characters? What do they type to enter (say) an ASCII null character into the field?

Or do you mean they type the string representation of the control character, e.g. for ASCII null they press \ then 0 on their keyboard, and the field shows \0 rather than one of those funny little square boxes you get for missing characters in fonts?

I will assume you mean the second, because I can't imagine how to enter control characters directly into a field (other than the simple ones like newline and tab).

The string returned by the control is a unicode version of the string entered by the user, eg. u'\\xBA\\xBA' or u'\\t\\r\\n'.

The data you are dealing with is binary, that is, made up of bytes with values between 0 and 255. The field is Unicode, that is, made up of characters with code points between 0 and some upper limit which is *much* higher than 255. If wxpython has some way to set the encoding of the field to ASCII, that will probably save you a lot of grief; otherwise, you'll need to decide what you want to do if the user types something like £ or © or other non-ASCII characters.

In any case, it seems that you are expecting strings with the representation of control characters, rather than actual control characters.
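If the field can't be restricted to ASCII, one option is to check the text yourself before using it. Here is a minimal sketch of that check. The function name is_plain_ascii is made up for illustration, and what you do with a rejected string (warn the user, ignore it, and so on) is up to you.

```python
def is_plain_ascii(text):
    """Return True if every character in text is in the ASCII range."""
    try:
        # Encoding to ASCII fails for any code point above 127.
        text.encode('ascii')
    except UnicodeEncodeError:
        return False
    return True


typed_escapes = '\\xBA\\xBA'   # backslash, x, B, A, twice: all ASCII
pound_sign = '\u00a3'          # a single non-ASCII character

print(is_plain_ascii(typed_escapes))
print(is_plain_ascii(pound_sign))
```

Note that the escaped form \xBA passes the check even though the byte it stands for is above 127, because the user only typed ASCII characters to describe it.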

I would like to be able to retrieve the original string containing the escaped control characters or hex values so that I can assign it to a variable to be used to split the binary string.

You have the original string -- the user typed <backslash> <code>, and you are provided <backslash> <code>.

Remember that backslashes in Python are special, and so they are escaped when displaying the string. Because \t is used for the display of tab, it can't be used for the display of backslash-t. Instead the display of backslash is backslash-backslash. But that's just the *display*, not the string itself. If you type \t into your field and retrieve the string which looks like u'\\t', calling len() on that string gives 2, not 3 or 6. If you print it with the print command, it will print as \t with no string delimiters u' and ' and no escaped backslash.

So you have the original string, exactly as typed by the user. I *think* what you want is to convert it to *actual* control characters, so that a literal backslash-t is converted to a tab character, etc.
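To see the difference between the string and its display for yourself, here is a small sketch. It is written in Python 3 syntax, so strings are Unicode without the u prefix and the display form has no leading u; the variable names are just for illustration.

```python
# What the user typed: a backslash followed by the letter t.
s = '\\t'

length = len(s)    # counts the actual characters in the string
shown = repr(s)    # the display form, with quotes and a doubled backslash
first, second = s[0], s[1]

print(length)      # the string holds two characters
print(s)           # printed as-is: one backslash, then t
print(shown)       # the display form you see at the interactive prompt
```

The doubled backslash only ever appears in the repr() output, never in the string itself.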



>>> s = u'\\t'
>>> print len(s), s, repr(s)
2 \t u'\\t'
>>> t = s.decode('string_escape')
>>> print len(t), t, repr(t)
1       '\t'
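The 'string_escape' codec used above only exists in Python 2. If you are on Python 3, something along the following lines should do the same job and also show the split you are after. Treat it as a sketch under a few assumptions: the helper name unescape_delimiter and the sample data are made up, the typed text is assumed to be plain ASCII, and every escape is assumed to name a single byte (a \u escape above 255 would raise an error at the latin-1 step).

```python
def unescape_delimiter(text):
    """Convert typed escapes such as \\t or \\xBA into the bytes they name."""
    # Only ASCII input is accepted; anything else raises UnicodeEncodeError.
    raw = text.encode('ascii')
    # unicode_escape interprets the backslash sequences, giving code points.
    chars = raw.decode('unicode_escape')
    # latin-1 maps code points 0-255 one-to-one onto byte values 0-255.
    return chars.encode('latin-1')


# Strings as they would come back from the field (backslashes are literal).
hex_delim = unescape_delimiter('\\xBA\\xBA')
ws_delim = unescape_delimiter('\\t\\r\\n')

# Hypothetical binary data using the first delimiter.
data = b'one\xba\xbatwo\xba\xbathree'
parts = data.split(hex_delim)

print(repr(hex_delim), repr(ws_delim))
print(parts)
```

The result is a bytes object, which is what you need for splitting binary data.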


Hope that helps.




--
Steven
