Hello everyone,

Before I pose my question, I should mention that I'm still pretty unfamiliar
with the proper terminology for string encoding, so I might get some of it
wrong. Please bear with me.

I'm writing a program that accepts arguments from the command line. Some of
my users run Windows with a non-Unicode locale setting and pass characters
outside the ASCII set. So something like

$ program --option <Cyrillic text>

ultimately results in "UnicodeDecodeError: 'utf8' codec can't decode bytes
in position 0-3: invalid data"

My questions:
1) Is it safe to immediately decode all strings in sys.argv with something
like sys.argv = [arg.decode(sys.stdin.encoding) for arg in sys.argv]?
2) Can something similar be done to anything returned by raw_input()?
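
To make the idea concrete, here is a minimal sketch of what I have in mind.
The helper name to_unicode is made up for illustration, and the fallback to
locale.getpreferredencoding() is my own assumption for the case where
sys.stdin.encoding is None (for example when stdin is redirected). I don't
know whether this is the recommended approach.

```python
import locale
import sys


def to_unicode(value, encoding=None):
    """Decode a byte string to text; return text values unchanged.

    Hypothetical helper, not part of any library.
    """
    if not isinstance(value, bytes):
        # Already text, so there is nothing to decode.
        return value
    # Assumption: prefer the explicit encoding, then the console encoding,
    # then the locale's preferred encoding if the console one is unavailable.
    encoding = (encoding
                or getattr(sys.stdin, "encoding", None)
                or locale.getpreferredencoding())
    return value.decode(encoding)


# Decode every command-line argument once, at program start.
args = [to_unicode(arg) for arg in sys.argv]
```

Presumably the same helper could be applied to whatever raw_input() returns,
which is really what question 2 is asking.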

Thanks,
Tom
-- 
http://mail.python.org/mailman/listinfo/python-list
