"Kent Johnson" <ken...@tds.net> wrote in message news:1c2a2c590905050337j1afc177ene64f800dcc3a7...@mail.gmail.com...
On Tue, May 5, 2009 at 1:14 AM, Mark Tolonen <metolone+gm...@gmail.com> wrote:

> The below works. ConfigParser isn't written to support Unicode correctly. I
> was able to get Unicode sections to write out, but it was just luck. Unicode
> keys and values break as the OP discovered. So treat everything as byte
> strings:

Thanks for the complete example.

> files = glob.glob('*.txt')
> c.add_section('files')
>
> for i,fn in enumerate(files):
>     fn = fn.decode(sys.getfilesystemencoding())

I think if you give a Unicode string to glob.glob(), e.g.
glob.glob(u'*.txt'), then the strings returned will also be unicode
and this decode step will not be needed.

You're right, that's why I had the comment above it :^)

   # The following could be glob.glob(u'.') to get a filename in
   # Unicode, but this is for illustration that the encoding of the
   # source file has no bearing on the encoding of strings other than
   # ones hard-coded in the source file.
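
For reference, here is a minimal sketch of the Unicode-pattern approach Kent describes. The temporary directory, the file name example.txt, and the helper list_text_files are made up for illustration. It is written to run under Python 3, where every str is already Unicode and the u'' prefix is accepted but redundant; under the Python 2 of this thread, the u'' pattern is what makes glob.glob() hand back unicode objects instead of byte strings, so the decode step goes away.

```python
import glob
import os
import tempfile


def list_text_files(directory):
    """Return the *.txt paths in `directory` as text (Unicode) strings."""
    # A text pattern asks glob for text results, so no .decode() is needed.
    pattern = os.path.join(directory, u'*.txt')
    return sorted(glob.glob(pattern))


# Hypothetical setup: one empty file in a scratch directory.
tmp = tempfile.mkdtemp()
open(os.path.join(tmp, u'example.txt'), 'w').close()

names = list_text_files(tmp)
for i, fn in enumerate(names):
    # fn is already text here; there is nothing to decode.
    print(i, os.path.basename(fn))
```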

The OP had wondered why his source file encoding "doesn't use the encoding defined for the application (# -*- coding: utf-8 -*-)." and I thought this would illustrate that byte strings could be in other encodings. It also shows why spir could say "... you shouldn't even need explicit encoding; they should pass through silently because they fit in an 8 bit latin charset.". If I'd left out the Chinese, I could've used a latin-1 encoding for everything and not decoded or encoded at all (assuming the file system was latin-1).
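
A small sketch of that point, with sample strings that are my own and not from the original script: the coding line only governs how literals in the source file are read, while any other byte string carries whatever encoding it was produced with. The same text gives different bytes in utf-8 and latin-1, and Chinese text cannot be encoded as latin-1 at all, which is why it forces explicit decoding and encoding.

```python
# -*- coding: utf-8 -*-
# The coding line above affects only literals typed into this file.

text = u'caf\xe9'  # sample text that fits in latin-1 (assumed example)

utf8_bytes = text.encode('utf-8')      # two bytes for the accented letter
latin1_bytes = text.encode('latin-1')  # one byte for the accented letter

# Bytes must be decoded with the encoding that produced them.
round_trip = latin1_bytes.decode('latin-1')

chinese = u'\u4e2d\u6587'  # sample Chinese text (assumed example)
try:
    chinese.encode('latin-1')
    fits_latin1 = True
except UnicodeEncodeError:
    # latin-1 has no code points for these characters.
    fits_latin1 = False

print(repr(utf8_bytes), repr(latin1_bytes), fits_latin1)
```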

-Mark


_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
