"Kent Johnson" <ken...@tds.net> wrote in message
news:1c2a2c590905050337j1afc177ene64f800dcc3a7...@mail.gmail.com...
On Tue, May 5, 2009 at 1:14 AM, Mark Tolonen <metolone+gm...@gmail.com>
wrote:
> The below works. ConfigParser isn't written to support Unicode
> correctly. I was able to get Unicode sections to write out, but it was
> just luck. Unicode keys and values break as the OP discovered. So treat
> everything as byte strings:
Thanks for the complete example.
> files = glob.glob('*.txt')
> c.add_section('files')
>
> for i,fn in enumerate(files):
>     fn = fn.decode(sys.getfilesystemencoding())
I think if you give a Unicode string to glob.glob(), e.g.
glob.glob(u'*.txt'), then the strings returned will also be unicode
and this decode step will not be needed.
You're right, that's why I had the comment above it :^)
# The following could be glob.glob(u'.') to get a filename in
# Unicode, but this is for illustration that the encoding of the
# source file has no bearing on the encoding of strings other than
# ones hard-coded in the source file.
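
For anyone following along, here is a minimal sketch of the glob.glob(u'*.txt') behaviour Kent describes. The scratch directory, the file name example.txt and the helper list_txt are made up for illustration and are not part of the original script; the sketch also assumes the temporary directory path is plain ASCII.

```python
# -*- coding: utf-8 -*-
import glob
import os
import shutil
import tempfile


def list_txt(directory, pattern):
    """Return the bare names of files in `directory` matching `pattern`.

    Hypothetical helper, used only for this illustration.
    """
    matches = glob.glob(os.path.join(directory, pattern))
    return sorted(os.path.basename(path) for path in matches)


# Scratch directory and file created only for this demonstration.
tmpdir = tempfile.mkdtemp()
try:
    open(os.path.join(tmpdir, 'example.txt'), 'w').close()

    # A Unicode pattern is passed in, so the names come back as Unicode
    # text and no fn.decode(sys.getfilesystemencoding()) step is needed.
    names = list_txt(tmpdir, u'*.txt')
    all_unicode = all(isinstance(name, type(u'')) for name in names)
finally:
    shutil.rmtree(tmpdir)
```

With a byte-string pattern on Python 2 the names would come back as byte strings in the file system encoding instead, which is the case the decode step in my example was handling.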
The OP had wondered why his source file encoding "doesn't use the encoding
defined for the application (# -*- coding: utf-8 -*-)." and I thought this
would illustrate that byte strings could be in other encodings. It also
shows the reason spir could say "... you shouldn't even need explicit
encoding; they should pass through silently because they fit in an 8 bit
latin charset.". If I'd left out the Chinese, I could've used a latin-1
encoding for everything and not decoded or encoded at all (assuming the
file system was latin-1).
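
A small sketch of that last point, using made-up sample strings (these are not the strings from the original script): the same text becomes different bytes under different encodings, and text outside the 8-bit latin range cannot be stored as latin-1 at all.

```python
# -*- coding: utf-8 -*-
# Sample strings written with escapes so the result does not depend on
# the coding line above. Both are invented for this illustration.
latin_text = u'caf\xe9'          # every character fits in latin-1
chinese_text = u'\u9a6c\u514b'   # characters outside latin-1

# The same text encodes to different byte strings.
latin_bytes = latin_text.encode('latin-1')   # one byte per character
utf8_bytes = latin_text.encode('utf-8')      # the accented e takes two bytes

# Chinese text cannot be represented in an 8-bit latin charset.
try:
    chinese_text.encode('latin-1')
    chinese_fits_latin1 = True
except UnicodeEncodeError:
    chinese_fits_latin1 = False

# UTF-8 can hold it, and decoding with the same codec restores the text.
round_trip_ok = chinese_text.encode('utf-8').decode('utf-8') == chinese_text
```

So a byte string only means something once you know which encoding produced it, and the coding line at the top of the source file tells Python that only for the literals typed into that file.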
-Mark
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor