Robert Clement wrote:
Hi

I have a wxPython control in which users are intended to enter control characters used to define binary string delimiters, e.g. '\xBA\xBA' or '\t\r\n' .


Do you mean that your users enter *actual* control characters? What do they type to enter (say) an ASCII null character into the field?

Or do you mean they type the string representation of the control character, e.g. for ASCII null they press \ then 0 on their keyboard, and the field shows \0 rather than one of those funny little square boxes you get for missing characters in fonts?

I will assume you mean the second, because I can't imagine how to enter control characters directly into a field (other than the simple ones like newline and tab).


The string returned by the control is a unicode version of the string entered by the user, e.g. u'\\xBA\\xBA' or u'\\t\\r\\n' .

The data you are dealing with is binary, that is, made up of bytes with values between 0 and 255. The field is Unicode, that is, made up of characters with code points between 0 and an upper limit (0x10FFFF) which is *much* higher than 255. If wxPython has some way to restrict the field to ASCII, that will probably save you a lot of grief; otherwise, you'll need to decide what you want to do if the user types something like £ or © or other non-ASCII characters.
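
If the field can't be restricted, one option is to check the text yourself before using it. Here is a rough sketch of such a check, written for Python 3 rather than Python 2; is_ascii_only is a name I made up for illustration, not something wxPython provides:

```python
def is_ascii_only(text):
    """Return True if every character in text has a code point below 128."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


# Backslash, x, B and A are all ordinary ASCII characters.
print(is_ascii_only("\\xBA\\xBA"))
# The pound sign (U+00A3) is not ASCII.
print(is_ascii_only("\u00a3"))
```

What to do when the check fails (reject the input, warn the user, and so on) is up to you.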

In any case, it seems that you are expecting strings with the representation of control characters, rather than actual control characters.


I would like to be able to retrieve the original string containing the escaped control characters or hex values so that I can assign it to a variable to be used to split the binary string.

You have the original string -- the user typed <backslash> <code>, and you are provided <backslash> <code>.

Remember that backslashes in Python string literals are special, and so they are escaped when the string is displayed with repr(). Because \t is used for the display of tab, it can't be used for the display of backslash-t. Instead the display of backslash is backslash-backslash. But that's just the *display*, not the string itself. If you type \t into your field and retrieve the string which looks like u'\\t', calling len() on it will give you 2, not 3 or 6. If you print it with the print statement, it will print as \t, with no string delimiters u' and ' and no escaped backslash.

So you have the original string, exactly as typed by the user. I *think* what you want is to convert it to *actual* control characters, so that a literal backslash-t is converted to a tab character, etc.


>>> s = u'\\t'
>>> print len(s), s, repr(s)
2 \t u'\\t'
>>> t = s.decode('string_escape')
>>> print len(t), t, repr(t)
1       '\t'
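
The session above is Python 2. The 'string_escape' codec does not exist in Python 3, so there you would need something different. The following is only a sketch of one way to do it, assuming the user's input is plain ASCII and every escape stands for a value between 0 and 255; parse_delimiter and the sample data are made up for illustration:

```python
def parse_delimiter(text):
    """Convert typed escape sequences into the bytes they stand for.

    Hypothetical helper. Raises UnicodeEncodeError if the text contains
    non-ASCII characters, or an escape for a code point above 255.
    """
    # Step 1: str -> bytes; fails on anything outside ASCII.
    raw = text.encode("ascii")
    # Step 2: interpret the backslash sequences (backslash-t becomes a tab).
    unescaped = raw.decode("unicode_escape")
    # Step 3: latin-1 maps code points 0-255 one-to-one onto byte values.
    return unescaped.encode("latin-1")


# What the user typed: backslash, x, B, A, backslash, x, B, A.
delimiter = parse_delimiter("\\xBA\\xBA")

# Made-up binary data containing that two-byte delimiter twice.
data = b"first\xba\xbasecond\xba\xbathird"
print(data.split(delimiter))
```

The result of parse_delimiter is a bytes object, so it can be passed straight to the split() method of the binary data.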


Hope that helps.




--
Steven

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
