On 30 Sep, 09:37 pm, [EMAIL PROTECTED] wrote:
On Tue, Sep 30, 2008 at 11:42 AM,  <[EMAIL PROTECTED]> wrote:
There are other ways to glean this knowledge; for example, looking at the 'iocharset' or 'nls' mount options supplied to mount various filesystems.

I know we could do a better job, but absent anyone who knows what
they're doing we've chosen a fairly conservative approach. I certainly
hope that someone will contribute some mean encoding-guessing code to
the stdlib that users can use. I'm not sure if I'll ever endorse doing
this automatically in io.open(), though I'd be fine with a convention
like passing encoding="guess".

I think the conservative approach is actually correct, or rather, as close to correct as it is possible to get in this mess. Inspecting these fantastically obscure options is only likely to be helpful in a tool which tries to correct filesystem encoding errors on legacy data. I wouldn't even know about them if I hadn't written several such tools (well, just little scripts, really) in the past. I was just verifying that I wasn't missing some "right way" which would let someone else do the guesswork for me.
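
For the curious, the kind of little script I mean looks roughly like the sketch below. It assumes the usual Linux mount table layout (device, mount point, filesystem type, comma-separated options, as found in /proc/mounts). The function name and the sample mount table are made up for illustration, and it only reports what the options claim, which says nothing about what bytes are actually on disk.

```python
def mount_charsets(mounts_text):
    """Map mount point -> charset named by an 'iocharset' or 'nls' option."""
    charsets = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a well-formed mount table line
        mount_point, options = fields[1], fields[3]
        for option in options.split(","):
            key, sep, value = option.partition("=")
            # Only options of the form key=value are of interest here.
            if sep and key in ("iocharset", "nls"):
                charsets[mount_point] = value
    return charsets


# Invented sample in /proc/mounts layout; not taken from a real system.
sample = (
    "/dev/sdb1 /media/usb vfat rw,iocharset=sjis,codepage=932 0 0\n"
    "/dev/sda1 /mnt/win ntfs ro,nls=iso8859-8 0 0\n"
    "/dev/sda2 / ext3 rw 0 0\n"
)
print(mount_charsets(sample))
```

On a real system you would pass in the contents of /proc/mounts instead of the sample string. Mount points with no such option are simply absent from the result, which is the common case.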

In reality, you have two options for filesystem encoding on Linux:

 * UTF-8
 * fall in a well and die

The OS will happily let you create a completely nonsensical environment where no application can possibly do anything reasonable: set LC_ALL to KOI8-R, mount your USB keychain as Shift_JIS and your Windows partition as ISO-8859-8. Of course nobody would actually _do_ this, because they want things to work, so everything is gradually evolving to a default of UTF-8 everywhere. In practice, however, there are still problems with CIFS/SMB shares where other clients have different ideas about encoding. I've experienced this most commonly when sharing with Macs, which have very particular and different ideas about normalization, as has already been discussed in this thread.
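
To make the normalization problem concrete, here is a minimal sketch using the stdlib unicodedata module. The file names are invented, and the helper name same_name is hypothetical. The assumption is that one client writes the name with a precomposed character and the other with a base letter plus a combining accent; both are valid UTF-8, yet the byte strings differ.

```python
import unicodedata

nfc_name = "caf\u00e9.txt"    # 'é' as a single precomposed code point
nfd_name = "cafe\u0301.txt"   # 'e' followed by a combining acute accent

# Both names display identically but are different strings and different bytes.
assert nfc_name != nfd_name
assert nfc_name.encode("utf-8") != nfd_name.encode("utf-8")


def same_name(a, b):
    """Compare two file names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(same_name(nfc_name, nfd_name))
```

Normalizing for comparison only, rather than renaming files, leaves whatever is on disk untouched, which seems the safer default on a shared volume.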
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev