On 30 Sep, 09:37 pm, [EMAIL PROTECTED] wrote:
On Tue, Sep 30, 2008 at 11:42 AM, <[EMAIL PROTECTED]> wrote:
There are other ways to glean this knowledge; for example, looking at
the 'iocharset' or 'nls' mount options supplied to mount various
filesystems.
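
For illustration, a minimal sketch of what inspecting those mount options
might look like on Linux. It assumes the usual /proc/mounts layout (device,
mount point, filesystem type, comma-separated options); the function name
charset_mount_options is made up for this example, and it only checks the
two option names mentioned above.

```python
def charset_mount_options(mounts_text):
    """Return {mount_point: {option: value}} for charset-related mount options.

    mounts_text is the content of a mounts table in /proc/mounts format.
    Only 'iocharset' and 'nls' are examined (an assumption for this sketch).
    """
    found = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3]
        hits = {}
        for opt in options.split(","):
            key, _, value = opt.partition("=")
            if key in ("iocharset", "nls"):
                hits[key] = value
        if hits:
            found[mount_point] = hits
    return found
```

On a Linux machine this could be fed the text of /proc/mounts. A filesystem
mounted without either option simply does not appear in the result, so the
caller is still left guessing for those.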
I know we could do a better job, but absent anyone who knows what
they're doing we've chosen a fairly conservative approach. I certainly
hope that someone will contribute some mean encoding-guessing code to
the stdlib that users can use. I'm not sure if I'll ever endorse doing
this automatically in io.open(), though I'd be fine with a convention
like passing encoding="guess".
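
To make the encoding="guess" idea concrete, here is a rough sketch of one
conservative guessing strategy: try strict UTF-8 first, then fall back
through a fixed list of candidates. This is purely illustrative. io.open()
does not accept encoding="guess", and guess_decode and its default candidate
list are hypothetical names, not stdlib API.

```python
def guess_decode(data, candidates=("utf-8", "latin-1")):
    """Decode bytes with the first candidate encoding that succeeds.

    Returns (text, encoding_used). UTF-8 goes first because arbitrary
    non-UTF-8 data rarely happens to be valid UTF-8; latin-1 goes last
    because it accepts any byte sequence and so never fails.
    """
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")
```

A caller that wants this behaviour has to ask for it by name, which is the
point of keeping it out of the default path.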
I think the conservative approach is actually correct, or rather, as
close to correct as it is possible to get in this mess. Inspecting
these fantastically obscure options is only likely to be helpful in a
tool which tries to correct filesystem encoding errors on legacy data.
I wouldn't even know about them if I hadn't written several such tools
(well, just little scripts, really) in the past. I was just verifying
that I wasn't missing some "right way" which would let someone else do
the guesswork for me.
In reality, you have two options for filesystem encoding on Linux:
* UTF-8
* fall in a well and die
The OS will happily let you create a completely nonsensical environment
where no application can possibly do anything reasonable: set LC_ALL to
KOI8-R, mount your USB keychain as Shift_JIS and your Windows partition
as ISO-8859-8. Of course nobody would actually _do_ this, because they
want things to work, so everything is gradually evolving to a default of
UTF-8 everywhere. In practice, however, there are still problems with
CIFS/SMB shares where other clients have different ideas about encoding.
I've experienced this most commonly when sharing with Macs, which have
very particular and different ideas about normalization, as has already
been discussed in this thread.
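
A small sketch of the normalization problem, using only the stdlib
unicodedata module: the same visible name can be two different byte
sequences even when everyone agrees on UTF-8. The helper same_name is a
hypothetical name for illustration, and comparing under NFC is just one
possible convention.

```python
import unicodedata

# The same visible name in composed (NFC) and decomposed (NFD) form.
nfc = unicodedata.normalize("NFC", "caf\u00e9")
nfd = unicodedata.normalize("NFD", "caf\u00e9")

# Both are valid UTF-8 once encoded, but the bytes differ, so a naive
# byte-for-byte filename comparison treats them as different files.
nfc_bytes = nfc.encode("utf-8")
nfd_bytes = nfd.encode("utf-8")


def same_name(a, b):
    """Compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

Here nfc == nfd is False while same_name(nfc, nfd) is True, which is the
kind of mismatch that shows up when a client that stores decomposed names
shares a directory with one that stores composed names.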