Followup to: <[EMAIL PROTECTED]>
By author: Gaspar Sinai <[EMAIL PROTECTED]>
In newsgroup: linux.utf8
>
> I can see your point now. I think the RFC-3010 should have
> been a bit clearer on this and instead of:
>
> > The NFS version 4 protocol does not mandate the
> > use of a particular normalization form at this time.
>
> It would have been much clearer to me if it said:
>
> clear> The NFS version 4 protocol will not use any
> clear> normalization automatically.
>
> And this also means that I can create two, or more
> files, like:
>
> U+00F6: ö
> U+006F U+0308: ö
>
> When looking at these as strings (according to Unicode) these
> strings are the same. In case these strings are used as
> filenames they will be considered different. Can we resolve
> this issue without normalization?
>
Normalization won't help you, either; it's impossible to tell the
difference between a Latin "A" (U+0041) and a Greek "Α" (U+0391) in most fonts.
Therefore, you have to deal with the general problem anyway, which
means you need to be able to determine the actual byte sequence used.
That being said, I agree with what others have said, i.e. the use of
Normalization Form C during *input* should be encouraged throughout,
thus minimizing the risk that this will actually cause problems.
However, normalization in the filesystem is a security hole waiting to
happen.
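
To make the two cases concrete, here is a minimal sketch in Python using the
standard unicodedata module. The helper names (same_bytes, same_after_nfc)
are made up for illustration and are not part of NFSv4 or any filesystem
API. It only shows that the two spellings of "ö" differ as UTF-8 byte
sequences but compare equal after NFC, while Latin "A" and Greek capital
alpha stay distinct even after NFC.

```python
import unicodedata

# Two spellings of the same visible character.
PRECOMPOSED = "\u00f6"   # U+00F6 LATIN SMALL LETTER O WITH DIAERESIS
DECOMPOSED = "o\u0308"   # U+006F followed by U+0308 COMBINING DIAERESIS

# Two letters that look alike in most fonts but are unrelated code points.
LATIN_A = "\u0041"       # LATIN CAPITAL LETTER A
GREEK_ALPHA = "\u0391"   # GREEK CAPITAL LETTER ALPHA


def same_bytes(a: str, b: str) -> bool:
    """Hypothetical helper: compare names the way a byte-oriented filesystem would."""
    return a.encode("utf-8") == b.encode("utf-8")


def same_after_nfc(a: str, b: str) -> bool:
    """Hypothetical helper: compare names after Normalization Form C."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(PRECOMPOSED.encode("utf-8"))               # b'\xc3\xb6'
    print(DECOMPOSED.encode("utf-8"))                # b'o\xcc\x88'
    print(same_bytes(PRECOMPOSED, DECOMPOSED))       # False: two distinct filenames
    print(same_after_nfc(PRECOMPOSED, DECOMPOSED))   # True: NFC unifies them
    print(same_after_nfc(LATIN_A, GREEK_ALPHA))      # False: NFC does not help here
```

Applying NFC at input time, as suggested above, would take care of the first
pair before the name ever reaches the filesystem; the second pair still
requires looking at the actual byte sequence.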
-hpa
--
<[EMAIL PROTECTED]> at work, <[EMAIL PROTECTED]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[EMAIL PROTECTED]>
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/