On Thu, 21 Feb 2002, Markus Kuhn wrote:
[...]
> Glenn Maynard wrote on 2002-02-21 08:10 UTC:
[...]
> > A normalization form would help a lot, though. It'd guarantee that in
> > all cases where I *do* know how to enter a character in a filename,
> > I can always manipulate the file. (If I see "cár", I'd be able to "cat
> > cár" and see it, reliably.)
>
> We agreed already ages ago here that Normalization Form C should be
> considered to be recommended practice under Linux and on the Web. But
> nothing should prevent you in the future from using arbitrary opaque
> byte strings as POSIX file names. In particular, POSIX forbids that the
> file system applies any sort of normalization automatically. All the URL
> security issues that IIS on NTFS had demonstrates, what a wise decision
> that was.

I can see your point now. I think RFC 3010 should have been a bit
clearer on this. Instead of:

> The NFS version 4 protocol does not mandate the
> use of a particular normalization form at this time.

it would have been much clearer to me if it had said:

clear> The NFS version 4 protocol will not use any
clear> normalization automatically.

This also means that I can create two or more files with names like:

    U+00F6:         ö
    U+006F U+0308:  ö

Compared as strings (according to Unicode), these two are the same:
they are canonically equivalent. Used as filenames, they will be
considered different.

Can we resolve this issue without normalization?

Thank you
  gaspar

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
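
To make the U+00F6 versus U+006F U+0308 case concrete, here is a
minimal Python sketch. It is only an illustration: the helper name
same_name is hypothetical, and it assumes that the application (not the
file system or NFS) normalizes names to NFC before comparing them,
while the stored bytes stay untouched.

```python
import unicodedata

precomposed = "\u00f6"   # U+00F6, one code point
decomposed = "o\u0308"   # U+006F U+0308, base letter plus combining diaeresis

# As opaque byte strings (what a POSIX file system sees), the names differ.
precomposed_bytes = precomposed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")


def same_name(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names after normalizing both to NFC.

    The comparison happens in the application; nothing is renamed on disk.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(precomposed_bytes, decomposed_bytes)
    print(precomposed == decomposed)           # code-point comparison
    print(same_name(precomposed, decomposed))  # comparison after NFC
```

With this approach the file system still stores two distinct names; only
the program that looks names up decides to treat them as equal.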
