On Thu, 21 Feb 2002, Markus Kuhn wrote:
[...]
> Glenn Maynard wrote on 2002-02-21 08:10 UTC:
[...]
> > A normalization form would help a lot, though. It'd guarantee that in
> > all cases where I *do* know how to enter a character in a filename,
> > I can always manipulate the file.  (If I see "cár", I'd be able to "cat
> > cár" and see it, reliably.)
>
> We agreed already ages ago here that Normalization Form C should be
> considered to be recommended practice under Linux and on the Web. But
> nothing should prevent you in the future from using arbitrary opaque
> byte strings as POSIX file names. In particular, POSIX forbids that the
> file system applies any sort of normalization automatically. All the URL
> security issues that IIS on NTFS had demonstrate what a wise decision
> that was.

I can see your point now. I think RFC 3010 should have
been a bit clearer on this. Instead of:

> The NFS version 4 protocol does not mandate the
> use of a particular normalization form at this time.

It would have been much clearer to me if it said:

clear> The NFS version 4 protocol will not use any
clear> normalization automatically.

This also means that I can create two or more
files with names like:

U+00F6: ö
U+006F U+0308: ö

When compared as Unicode strings, these two are the same
(canonically equivalent). When used as filenames, they will
be considered different. Can we resolve this issue without
normalization?
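
To make the problem concrete, here is a minimal Python sketch of the two
spellings above. It only illustrates the comparison; the helper names
same_name() and find_matches() are hypothetical, and doing the matching in
user space (rather than in the file system) is an assumption of this sketch,
not something RFC 3010 or POSIX prescribes.

```python
import unicodedata

# The two spellings of "ö" from the example above.
precomposed = "\u00f6"   # U+00F6
decomposed = "o\u0308"   # U+006F U+0308


def same_name(a: str, b: str) -> bool:
    """Compare two names after converting both to Normalization Form C."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def find_matches(wanted: str, names: list[str]) -> list[str]:
    """Return every stored name that is canonically equivalent to `wanted`.

    The stored names are returned unchanged, so the caller can still open
    each file by its exact byte sequence.
    """
    return [name for name in names if same_name(wanted, name)]


if __name__ == "__main__":
    # As opaque byte strings, the two names differ.
    print(precomposed.encode("utf-8"))   # b'\xc3\xb6'
    print(decomposed.encode("utf-8"))    # b'o\xcc\x88'
    print(precomposed == decomposed)     # False

    # After normalization, they compare equal.
    print(same_name(precomposed, decomposed))   # True

    # A directory listing may legitimately contain both.
    listing = [precomposed, decomposed, "other"]
    print(len(find_matches(precomposed, listing)))   # 2
```

The file system still stores two distinct names here; only the lookup done by
the application treats them as equivalent, and it has to decide what to do
when more than one name matches.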

Thank you
gaspar

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
