Re: filename encoding

Markus Kuhn Mon, 05 Feb 2001 03:57:35 -0800
Marcin 'Qrczak' Kowalczyk wrote on 2001-02-05 07:06 UTC:
> I cannot imagine how it could work to be always UTF-8, where the rest
> of the system uses the locale encoding and most of the system is not
> aware of character encodings.

This obviously won't work. If all filenames used by a user are encoded
in UTF-8, then obviously the user has to work in a UTF-8 locale as well.

Moving all filenames on a system to UTF-8 has to happen at the same time
as moving all users on this system to UTF-8 locales, otherwise we'd have
to add zillions of additional conversion layers between applications and
the kernel, an option that not only generates code bloat and
inefficiency, but is also at least two orders of magnitude more
complex and expensive than making sure that the majority of applications
are usable without too severe problems under UTF-8 locales. (Remember
that applications that aren't UTF-8 ready can still be used with just
ASCII under a UTF-8 locale, therefore the migration is not as dire as it
might sound at first, especially for users who have so far used mostly
ASCII.)
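
The ASCII remark rests on the fact that UTF-8 encodes every ASCII
character as the same single byte that ASCII and the common legacy
encodings use. A minimal Python sketch of that check follows; the helper
name and the list of sample encodings are my own choices, purely for
illustration:

```python
# Illustrative only: the helper name and the sample encodings are assumptions.
LEGACY_ENCODINGS = ["latin-1", "iso8859-2", "koi8-r"]

def same_bytes_everywhere(name: str) -> bool:
    """Return True if `name` has identical bytes in UTF-8 and in every
    legacy encoding listed above."""
    utf8 = name.encode("utf-8")
    for enc in LEGACY_ENCODINGS:
        try:
            if name.encode(enc) != utf8:
                return False
        except UnicodeEncodeError:
            # The name cannot even be represented in this legacy encoding.
            return False
    return True

print(same_bytes_everywhere("README.txt"))  # pure ASCII name
print(same_bytes_everywhere("Grüße.txt"))   # name with non-ASCII letters
```

Names for which this returns True look the same to a program whether or
not it knows anything about UTF-8, which is why ASCII-only use keeps
working during the migration.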

Locales with mutually different character encodings can only be used by
user groups that do not share non-ASCII files. It has always been like
that and UTF-8 doesn't change anything here. That's actually quite
practical, because virtually *all* non-ASCII plaintext files are
restricted to a user's or group's private subdirectories. The vast
majority of the system-wide text files are either in ASCII or in some
locale-independent encoding (e.g., HTML documentation with MIME/
.htaccess headers).
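
To make the sharing problem concrete: a filename is stored as plain
bytes, so a name written under one locale encoding is simply
reinterpreted under another. A small Python sketch, using ISO 8859-2 and
a Polish name as assumed examples (the function name is hypothetical):

```python
def view_name(raw: bytes, locale_encoding: str) -> str:
    """Decode filename bytes the way a program running in the given locale
    encoding would, replacing undecodable bytes with U+FFFD."""
    return raw.decode(locale_encoding, errors="replace")

# Name created by a user working in an ISO 8859-2 locale (assumed example).
raw = "żółw.txt".encode("iso8859-2")

print(view_name(raw, "iso8859-2"))  # the creating user sees the intended name
print(view_name(raw, "utf-8"))      # a UTF-8 user sees replacement characters

# An ASCII-only name reads the same in both locales.
print(view_name(b"notes.txt", "utf-8") == view_name(b"notes.txt", "iso8859-2"))
```

The two users only agree on the name when it is pure ASCII, which is the
constraint described above.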

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/