On Tue, Oct 21, 2003 at 11:32:28AM -0400, Edward H. Trager <[EMAIL PROTECTED]> wrote a message of 118 lines which said:
> I think there can be big debates about whether a Linux (or any *nix
> kernel, for that matter) has any business normalizing file names.
> Personally I think Unicode normalization is not the kernel's
> business. This is better left to the userland applications.

I do not agree. It would mean that *each* application has to normalize,
because it cannot rely on the kernel. This has huge security
implications: two file names can be identical once normalized to NFC,
and so visually impossible to distinguish, yet still be two different
strings of code points. Normalization has to be done in the kernel for
the same reason that access control (the rwx bits in Unix) has to be in
the kernel: so that no application can bypass it.

> Are you sure about ls? ls should sort UTF-8-encoded file names in
> raw Unicode order, n'est-ce pas?

Yes, but that order has no meaning for a human reader (in French, é
should not sort after z).

> What about ICU's regexp package?
> (http://oss.software.ibm.com/icu/userguide/regexp.html) You should
> be able to use ICU on *any* platform. Linux does not yet having a
> Unicode grep

I never said that Unix cannot be "Unicodized". I just said that it is
not Unicodized today. That's why I talked about an "act of faith": you
need to configure many things and compile many things before you have a
working Unicode environment.

> I thought both Postgres and MySQL already have, or are working on
> this issue?

Neither of them has it. They claim "Unicode support", which only means
that they can store and retrieve UTF-8.
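
To make the normalization and sort-order points above concrete, here is a small Python sketch. The file name "é.txt", the helper same_name_after_nfc and the word list are only illustrations I made up; the sketch uses the standard unicodedata module and says nothing about what any particular kernel or file system actually does.

```python
import unicodedata

# Two spellings of the same visible name "é.txt" (hypothetical file name).
composed = "\u00e9.txt"     # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301.txt"  # "e" followed by U+0301 COMBINING ACUTE ACCENT


def same_name_after_nfc(a: str, b: str) -> bool:
    """Return True when the two names are equal once both are put in NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# As raw code points (and as raw UTF-8 bytes) the two names differ, so a
# layer that compares names without normalizing sees two distinct names.
print(composed == decomposed)                                  # False
print(composed.encode("utf-8") == decomposed.encode("utf-8"))  # False

# After normalization they are the same name.
print(same_name_after_nfc(composed, decomposed))               # True

# Sorting by raw code point puts "é" (U+00E9) after "z" (U+007A).
print(sorted(["zoo", "été"]))                                  # ['zoo', 'été']
```

An application that wants to treat both spellings as one name has to run every name through something like same_name_after_nfc itself, which is exactly the per-application burden I am complaining about.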