On Fri, 22 Jun 2001 [EMAIL PROTECTED] wrote:
> > Would it be acceptable to change internals of functions like fopen()
> > so that the passed file name is converted to utf-8 through iconv() ?
>
> And how is the character set of the file name supposed to be guessed?

Trivial: Filenames would be always ASCII or UTF-8.

I think the most practical recommendation is that today nobody should be
using non-ASCII filenames (except for UTF-8 testing, of course) until the
big switch to UTF-8. In practice, we are reasonably close to that
situation. Even in very non-Latin user communities, sort-of-English
filenames are currently the dominant practice, not only on web servers
but also because of the severe regexp hazards of some unsuitable encodings
(BIG5, GB18030, etc.).

Adding locale-dependent encoding conversion functionality to fopen() etc.
really is completely out of the question. I don't even want to start
thinking about the huge number of obvious new security problems that such
a severe change to the highly stable Unix/Linux file system semantics
would bring. For those with no imagination at all, let me just mention
lock files and file existence tests to start with. Changing fopen() here
is an absolute no-go! Good Qwafu, we want to decrease the
locale-dependency of the C and X11 API, not increase it.

Let's focus on making the Linux environment suitable for smooth pure UTF-8
usage, not add layer after layer of redundant conversion and recoding
extensions to one API after another, until the system spends half of its
CPU cycles checking whether character encoding conversion is necessary.

Markus


-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
