Re: is utf-8 the standard filename encoding?

Rodney Dawes Wed, 21 Dec 2011 11:19:10 -0800

On Wed, 2011-12-21 at 09:42 -0800, Steve Langasek wrote:
> It's possible I'm mistaken about the default behavior on Ubuntu Server,
> though - someone please correct me if I'm wrong.  Maybe this is another
> reason why we need to get the C.UTF-8 locale going everywhere.


It is definitely not using C.UTF-8 everywhere, and the plain C locale is
not UTF-8. Is it even valid to specify a charset for the C locale?
Doesn't POSIX define it as always being ASCII?


> Notwithstanding the above (which indeed also explains why using the
> locale's charset value is a poor heuristic for interpreting filenames on
> the Linux filesystem), it's my understanding that the GNOME vfs stack has
> refused for several years now to work with any filenames that aren't
> UTF-8.  So desktop users with non-utf8 filenames are going to have a hard
> time of it.
> 
This isn't quite true. There is a complicated set of environment
variables, and checks in the code, to ensure that what gets displayed is
always UTF-8, but the stack generally handles non-UTF-8 filenames
gracefully. Python, on the other hand, just raises Unicode
encoding/decoding exceptions, and apps have to handle these themselves
if they want to be graceful.

I think Python 3 might make this a bit better though, by using Unicode
as the default string type, rather than the byte strings of 2.x.
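
To make the Python point concrete, here is a minimal sketch of what an
app might do, assuming Python 3.1 or later, where the "surrogateescape"
error handler is available. The filename bytes and the helper names are
made up for illustration; the only real APIs used are bytes.decode and
str.encode. As far as I know, os.fsdecode and os.fsencode apply the same
handler on POSIX systems, but treat that as an assumption to check.

```python
# Hypothetical filename: "cafe" with a Latin-1 e-acute (0xE9), which is
# not a valid UTF-8 byte sequence.
raw_name = b"caf\xe9.txt"


def decode_strict(name):
    """Decode as strict UTF-8; raises UnicodeDecodeError on bad bytes."""
    return name.decode("utf-8")


def decode_lossless(name):
    """Decode with surrogateescape: each undecodable byte becomes a lone
    surrogate code point, so the original bytes can be recovered."""
    return name.decode("utf-8", "surrogateescape")


def for_display(text):
    """Turn a surrogate-escaped name into something safe to print by
    replacing the undecodable bytes with U+FFFD."""
    original = text.encode("utf-8", "surrogateescape")
    return original.decode("utf-8", "replace")


# Strict decoding is the case where the app has to catch the exception.
try:
    decode_strict(raw_name)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# The lossless route keeps the name usable for file operations.
escaped = decode_lossless(raw_name)
round_tripped = escaped.encode("utf-8", "surrogateescape")
```

The idea is to keep the escaped string for anything that touches the
file system, and only use the for_display() form in the UI, since that
form no longer maps back to the real name on disk.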


-- 
ubuntu-devel mailing list
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel
