>> >The terminal
>> >should renormalize everything (including pastes) to NFC.
>> 
>> Then how will I paste some wacky invalid filename into
>> my terminal in order to, say, rm it? Like I was saying,
>> pastes should not be normalized.
>I already explained this at length: ls (and other tools) should escape
>"wacky" filenames using \x, \u and \U.  This is nothing new; ls already
>escapes things, so it's just an extension of existing functionality.

I meant that rather than invisibly normalizing the paste, the terminal
would do what you say and print the escape sequences instead. If it
were to normalize on paste, it could hide problems.
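
Here is a minimal sketch (Python; the escape_filename name and the
exact set of characters escaped are illustrative assumptions) of the
kind of escaping being discussed, where anything outside printable
ASCII is shown as a \x, \u or \U escape instead of being emitted raw:

    def escape_filename(name: str) -> str:
        # (a real tool would also escape the backslash itself)
        out = []
        for ch in name:
            cp = ord(ch)
            if 0x20 <= cp < 0x7F:           # printable ASCII passes through
                out.append(ch)
            elif cp <= 0xFF:
                out.append(f"\\x{cp:02x}")   # one byte: \xNN
            elif cp <= 0xFFFF:
                out.append(f"\\u{cp:04x}")   # BMP: \uNNNN
            else:
                out.append(f"\\U{cp:08x}")   # beyond the BMP: \UNNNNNNNN
        return "".join(out)

    # A decomposed "résumé" shows up as escapes, not as invisible marks:
    print(escape_filename("re\u0301sume\u0301"))  # -> re\u0301sume\u0301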

>> Normalization form D has some serious drawbacks: if you were to try
>> to implement, say, Vietnamese using only combining characters,
>> it would look horrible. The appearance, position, shape, and size
>> of the combining accents depend on which letter they are being
>> combined with, as well as which other diacritics are being combined
>> with that same letter.
>That's entirely a rendering implementation detail; it should be easy for
>the terminal's font renderer to normalize internally in whatever way is
>most appropriate.
>What scripts do you think NFD would be more appropriate than NFC for?
>NFC seems to be fairly (de-facto) standard in Unix.


If characters are ever introduced which have no precomposed codepoint,
then it will be difficult for a font to "normalize" them to one
glyph with the appropriate internal layout. The font file itself
would then have to know about composition rules, such as: when
X is combined with Y and then Z, use the glyph XYZ, which has no
single codepoint in Unicode.
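
A quick demonstration with Python's unicodedata module of why the
renderer needs this knowledge (the choice of example character is
mine): the Vietnamese letter U+1EC7 carries two accents whose
placement depends on each other, and under NFD it becomes one base
letter plus two combining marks:

    import unicodedata

    s = "\u1EC7"                        # e with circumflex and dot below
    nfd = unicodedata.normalize("NFD", s)
    print([hex(ord(c)) for c in nfd])   # ['0x65', '0x323', '0x302']
    print(unicodedata.normalize("NFC", nfd) == s)   # True: NFC recomposes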

For that reason, I don't like form D at all. I wonder how much space
it would take to represent every possible Jamo combination, and then
just do away with combining characters altogether...
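
For modern Hangul the arithmetic is small enough that Unicode in fact
already did exactly this: all 19 * 21 * 28 = 11,172 lead/vowel/trail
Jamo combinations are precomposed in the block U+AC00..U+D7A3, and
composition is a pure formula. A sketch of the standard algorithm:

    L_COUNT, V_COUNT, T_COUNT = 19, 21, 28   # leads, vowels, trails (incl. none)
    S_BASE = 0xAC00                          # first precomposed syllable

    def compose(l: int, v: int, t: int = 0) -> str:
        """Map lead/vowel/trail indices to one precomposed syllable."""
        return chr(S_BASE + (l * V_COUNT + v) * T_COUNT + t)

    print(L_COUNT * V_COUNT * T_COUNT)   # 11172 codepoints in total
    print(compose(0, 0))                 # U+AC00, the first syllable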

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
