Re: filename encoding

Tom Tromey Sun, 04 Feb 2001 16:18:45 -0800
>>>>> "Kai" == Kai Henningsen <[EMAIL PROTECTED]> writes:

Kai> IMAO, a *real* filesystem should use some encoding of ISO 10646 -
Kai> UTF-8, UTF-16, or UTF-32 are all viable options. The same should
Kai> be true for the kernel filename interfaces.

I like this, but what should I do right now?

I work on libgcj, the runtime component of gcj, the Java front end to
GCC.  In libgcj of course we use UCS-2 everywhere, since that is what
Java does.  Currently, for Unixy systems, we assume that all file
names are UTF-8.  (Actually, we do something notably worse, which is
to assume that file names are Java-style UTF-8, with the weird
two-byte encoding for \u0000.)
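
For anyone who has not run into the \u0000 quirk, here is a minimal
sketch of the difference, assuming a stock JDK.  The class and helper
names (ModifiedUtf8Demo, standardUtf8, modifiedUtf8, hex) are made up
for illustration and are not libgcj code; the sketch borrows
DataOutputStream.writeUTF only because it is a convenient way to get
Java-style UTF-8 out of the standard library.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Demo {
    // Standard UTF-8: U+0000 is encoded as the single byte 0x00.
    public static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Java-style ("modified") UTF-8 as produced by writeUTF:
    // U+0000 becomes the two bytes 0xC0 0x80, so the output
    // never contains an embedded zero byte.
    public static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        // writeUTF prefixes the data with a 2-byte length; drop it.
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Render bytes as space-separated lowercase hex for display.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "a\0b"; // 'a', U+0000, 'b'
        System.out.println(hex(standardUtf8(name))); // 61 00 62
        System.out.println(hex(modifiedUtf8(name))); // 61 c0 80 62
    }
}
```

The two forms agree for every character in U+0001 through U+FFFF, so
in the UCS-2 world the mismatch shows up only for \u0000.  A strict
UTF-8 decoder will reject c0 80 as an overlong sequence.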

This was a reasonable choice for us because, when the file handling
code was written, we didn't have the ability to recode into any other
character set anyway.

Now we do have that capability, but the thought of using
locale-specific filenames grosses me out.  It is hard to see how that
can work sensibly.

Is there some document somewhere that will tell me what I really ought
to be doing?

Tom
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/