On Fri, 22 Feb 2002, Markus Kuhn wrote:
> Gaspar Sinai wrote on 2002-02-22 13:25 UTC:
> > I was thinking about this: maybe the NFS server could enforce
> > normalization form 'C' so that only the precomposed variant:
> >
> > U+00F6 ö
> >
> > could create a file. A huge number of scripts could be supported,
> > without duplicate filenames. Hangul would immediately be ok
> > without the need of jamo decomposition. And we are also very
> > lucky that CJK can not be decomposed to radicals :)
> >
> > I admit this would create some problems...
>
> ... starting with violating the spirit of the POSIX standard for instance,
> which contains already a lengthy rationale on why case normalization is
> not allowed.

This was just a suggestion to clean things up by
specifying the characters that are allowed in
filenames. Currently we can not have "/", ".", ".."
and "\0" for a filename. What if we say we can not
have combining and zero-width characters in a filename?
That would not need complicated normalization - just
a character check.
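
A rough sketch of such a character check, in Python for illustration only.
The function name is_allowed_filename and the ZERO_WIDTH set are made up
for this example, and the list of zero-width characters is an assumption,
not a complete one:

```python
import unicodedata

# Assumed (incomplete) set of zero-width characters to reject.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

def is_allowed_filename(name: str) -> bool:
    """Return True if name passes the proposed per-character check."""
    # Names that are already reserved or impossible today.
    if name in ("", ".", ".."):
        return False
    for ch in name:
        # Characters that can not appear in a filename today.
        if ch in ("/", "\0"):
            return False
        # Combining marks: general categories Mn, Mc and Me.
        if unicodedata.category(ch) in ("Mn", "Mc", "Me"):
            return False
        # Zero-width characters from the assumed set above.
        if ch in ZERO_WIDTH:
            return False
    return True

print(is_allowed_filename("\u00f6"))    # precomposed o with diaeresis
print(is_allowed_filename("o\u0308"))   # o followed by combining diaeresis
```

Note that a check like this looks at one character at a time, so it would
not reject decomposed Hangul: conjoining jamo are letters, not combining
marks.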

But of course you are right - if it is against the
POSIX standard, there is very little that can be done here.

> No. If you are worried about filenames that are not in NFC, then for
> example you can extend the GNU "find" command with a new predicate that
> tests whether a file name is not in NFC. This way, worried
> people can very quickly list all files on a harddisk that are not in NFC
> and then apply a little script that does the normalization or some other
> desired action manually. Problem solved, without any unpleasant
> surprises.
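
For illustration, the kind of listing described above could look roughly
like the following Python sketch. This is not an actual GNU "find"
predicate; the names is_nfc and find_non_nfc are made up for this example,
and it assumes filenames decode cleanly in the current locale:

```python
import os
import unicodedata

def is_nfc(name: str) -> bool:
    """Return True if name is unchanged by NFC normalization."""
    return unicodedata.normalize("NFC", name) == name

def find_non_nfc(root: str):
    """Yield paths under root whose last component is not in NFC."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_nfc(name):
                yield os.path.join(dirpath, name)

if __name__ == "__main__":
    # Only list the offending names; renaming is left to the user.
    for path in find_non_nfc("."):
        print(path)
```

The sketch only reports names and changes nothing on disk, so any renaming
stays a manual, deliberate step.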

The problem occurs if normalization does happen - and some programs
may do normalization.

> I think I speak for a lot of experienced Unix people if I say that we
> really do *not* want to have any Unicode normalization code in the
> kernel. Unix kernels remain mostly character encoding ignorant, and
> Unicode changes nothing here.

Normalization code is very complicated - I am not an advocate
of that - I am just looking for a quick and nice solution to
a potential problem.

Thank you,
gaspar


--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
