On Fri, 22 Feb 2002, Markus Kuhn wrote:

> Gaspar Sinai wrote on 2002-02-22 13:25 UTC:
> > I was thinking about this: maybe the NFS server could enforce
> > normalization form 'C' so that only the precomposed variant:
> >
> >   U+00F6 ö
> >
> > could create a file. A huge number of scripts could be supported,
> > without duplicate filenames. Hangul would immediately be ok
> > without the need of jamo decomposition. And we are also very
> > lucky that CJK can not be decomposed to radicals :)
> >
> > I admit this would create some problems...
>
> ... starting with violating the spirit of the POSIX standard for instance,
> which already contains a lengthy rationale on why case normalization is
> not allowed.

This was just a suggestion to clean things up by specifying which
characters are allowed in filenames. Currently a filename cannot be
"." or "..", and cannot contain "/" or "\0". What if we say a
filename cannot contain combining or zero-width characters either?
That would not need complicated normalization - just a character check.

But of course you are right - if it is against the POSIX standard,
there is very little that can be done here.

> No. If you are worried about filenames that are not in NFC, then for
> example you can extend the GNU "find" command with a new predicate that
> tests whether a file name is not in NFC. This way, worried
> people can very quickly list all files on a harddisk that are not in NFC
> and then apply a little script that does the normalization or some other
> desired action manually. Problem solved, without any unpleasant
> surprises.

The problem occurs if normalization does happen - and some programs
may do normalization.

> I think I speak for a lot of experienced Unix people if I say that we
> really do *not* want to have any Unicode normalization code in the
> kernel. Unix kernels remain mostly character encoding ignorant, and
> Unicode changes nothing here.

Normalization code is very complicated - I am not an advocate of
that - I am just looking for a quick and nice solution to a potential
problem.

Thank you,
gaspar

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
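
For illustration, here is a minimal user-space sketch in Python of the
two checks discussed above: the plain "character check" and a
find-like scan for names that are not in NFC. All names in it
(ZERO_WIDTH, has_forbidden_char, is_nfc, find_non_nfc) are
hypothetical, and the set of zero-width characters is a hand-picked
assumption rather than a definition taken from any standard. It is not
how GNU find, POSIX or any NFS server behaves.

```python
import os
import unicodedata

# Assumption: a small, hand-picked set of zero-width characters.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def has_forbidden_char(name):
    """Return True if name contains a combining mark or a zero-width character.

    This is stricter than an NFC test: it also rejects marks that have
    no precomposed form.
    """
    for ch in name:
        # General categories Mn, Mc and Me are the Unicode "Mark" categories.
        if unicodedata.category(ch).startswith("M") or ch in ZERO_WIDTH:
            return True
    return False


def is_nfc(name):
    """Return True if name is unchanged by NFC normalization."""
    return unicodedata.normalize("NFC", name) == name


def find_non_nfc(root):
    """Yield paths under root whose last component is not in NFC."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_nfc(name):
                yield os.path.join(dirpath, name)
```

Something like "for path in find_non_nfc('.'): print(path)" would then
list the candidates, and what to do with them stays a manual decision.
The first check needs only per-character properties, while the second
has to call a normalization routine, which is the more complicated
part.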
