2012/2/3 Julian Foad <julianf...@btopenworld.com>:
> You may well be correct that NFC is never longer than NFD, but that's not the 
> question.  The question is whether NFC may be longer than the current paths 
> (which are not normalized to normalization form C or to form D).  And the 
> answer is yes it may be longer.  See 
> <http://unicode.org/faq/normalization.html#11>.

Oh, I didn't know that. Thanks for letting me know.
I also read the other items in <http://unicode.org/faq/normalization.html#11>
and all of <http://www.unicode.org/reports/tr15/>, and learned more about
normalization.
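
To convince myself, I tried the worst case for UTF-8 that the FAQ mentions,
U+1D160 (MUSICAL SYMBOL EIGHTH NOTE). This is only a minimal sketch using
Python's standard unicodedata module; the helper name utf8_len is made up
for illustration and is not anything in Subversion.

```python
import unicodedata


def utf8_len(s: str) -> int:
    """Return the number of bytes s occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


# U+1D160 has a canonical decomposition whose parts are excluded from
# recomposition, so NFC leaves it in decomposed form.
raw = "\U0001D160"
nfc = unicodedata.normalize("NFC", raw)

print(len(raw), utf8_len(raw))  # code points and UTF-8 bytes before
print(len(nfc), utf8_len(nfc))  # code points and UTF-8 bytes after NFC
```

The NFC result is three code points of four UTF-8 bytes each, so a buffer
sized to the un-normalized input would be too small here.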

Maybe we should revise the note:
http://svn.apache.org/repos/asf/subversion/trunk/notes/unicode-composition-for-filenames

>
>
>> Here I quote from
>> http://svn.apache.org/repos/asf/subversion/trunk/notes/unicode-composition-for-filenames
>>   > The proposed internal 'normal form' should be NFC, if only if
>>   > it were because it's the most compact form of the two:  when
>>   > allocating memory to store a conversion result, it won't be
>>   > necessary (ever) to allocate more than the size of the input buffer.
>
> That statement seems to be talking about converting between NFC and NFD, not 
> from un-normalized to normalized.

Yes, indeed.

So we need to normalize input paths before processing them,
and we choose NFC as the normalization form.
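
As a rough illustration of what I mean (not actual Subversion code; the
function normalize_path is a hypothetical name, and I use Python's standard
unicodedata module only to keep the sketch short):

```python
import unicodedata


def normalize_path(path: str) -> str:
    """Hypothetical helper: bring an input path to NFC before further use."""
    return unicodedata.normalize("NFC", path)


# The same visible file name, once precomposed and once decomposed
# (for example as it might arrive from different clients).
composed = "caf\u00e9.txt"     # 'e with acute' as one code point
decomposed = "cafe\u0301.txt"  # 'e' followed by COMBINING ACUTE ACCENT

print(composed == decomposed)
print(normalize_path(composed) == normalize_path(decomposed))
```

Without normalization the two strings compare as different paths; after
normalizing both to NFC they compare equal.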

-- 
)Hiroaki Nakamura) hnaka...@gmail.com
