Re: [Tutor] Encode problem

Mark Tolonen Tue, 05 May 2009 07:17:31 -0700

"Kent Johnson" <ken...@tds.net> wrote in message news:1c2a2c590905050337j1afc177ene64f800dcc3a7...@mail.gmail.com...

On Tue, May 5, 2009 at 1:14 AM, Mark Tolonen <metolone+gm...@gmail.com> wrote:

> The below works. ConfigParser isn't written to support Unicode
> correctly. I was able to get Unicode sections to write out, but it
> was just luck. Unicode keys and values break as the OP discovered.
> So treat everything as byte strings:

Thanks for the complete example.

> files = glob.glob('*.txt')
> c.add_section('files')
>
> for i,fn in enumerate(files):
>     fn = fn.decode(sys.getfilesystemencoding())

I think if you give a Unicode string to glob.glob(), e.g.
glob.glob(u'*.txt'), then the strings returned will also be unicode
and this decode step will not be needed.


You're right, that's why I had the comment above it :^)

   # The following could be glob.glob(u'.') to get a filename in
   # Unicode, but this is for illustration that the encoding of the
   # source file has no bearing on the encoding of strings other than
   # ones hard-coded in the source file.
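
A minimal sketch of Kent's point, for anyone who wants to try it: glob.glob() hands back names of the same string type as the pattern it is given. It is written for Python 3, where the split is bytes vs. str rather than str vs. unicode, and the scratch directory and the file name example.txt are hypothetical stand-ins, not part of the original script.

```python
import glob
import os
import shutil
import sys
import tempfile

# Hypothetical setup: one empty file in a scratch directory.
tmpdir = tempfile.mkdtemp()
open(os.path.join(tmpdir, 'example.txt'), 'w').close()

text_pattern = os.path.join(tmpdir, '*.txt')   # text (Unicode) pattern
byte_pattern = os.fsencode(text_pattern)       # the same pattern as bytes

text_results = glob.glob(text_pattern)   # names come back as text
byte_results = glob.glob(byte_pattern)   # names come back as bytes

# Byte names need the explicit decode step from the quoted loop.
decoded = [fn.decode(sys.getfilesystemencoding()) for fn in byte_results]

print(type(text_results[0]).__name__, type(byte_results[0]).__name__)
print(decoded == text_results)

shutil.rmtree(tmpdir)   # clean up the scratch directory
```

With a text pattern the decode step drops out, which is the shortcut Kent describes.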

The OP had wondered why his source file encoding "doesn't use the encoding defined for the application (# -*- coding: utf-8 -*-)." and I thought this would illustrate that byte strings could be in other encodings. It also shows the reason spir could have said "... you shouldn't even need explicit encoding; they should pass through silently because they fit in an 8 bit latin charset.". If I'd left out the Chinese, I could've used a latin-1 encoding for everything and not decoded or encoded at all (assuming the file system was latin-1).
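
To make the latin-1 remark concrete, here is a small sketch. The sample strings are hypothetical stand-ins (not the ones from the original script), written with escapes so the result does not depend on the encoding of the source file.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample text; escapes keep this source file pure ASCII.
latin_text = u'caf\xe9'          # every character fits in latin-1
chinese_text = u'\u4e2d\u6587'   # two Chinese characters

latin1_bytes = latin_text.encode('latin-1')   # one byte per character
utf8_bytes = latin_text.encode('utf-8')       # the e-acute takes two bytes

# Same text, different byte strings: the bytes depend on the codec chosen,
# not on the coding line at the top of the source file.
print(repr(latin1_bytes), repr(utf8_bytes))

# latin-1 maps bytes 0-255 straight to code points 0-255, so it round-trips.
assert latin1_bytes.decode('latin-1') == latin_text

# Chinese characters have no latin-1 byte, hence the need for another codec.
try:
    chinese_text.encode('latin-1')
    chinese_fits_latin1 = True
except UnicodeEncodeError:
    chinese_fits_latin1 = False
print(chinese_fits_latin1)
```

As long as all the text stays within latin-1 and the file system uses it too, the bytes can pass through untouched; the Chinese characters are what force an explicit decode and encode.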


-Mark


_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
