pugs-comm...@feather.perl6.nl wrote:
 In the abstract, Perl is written in Unicode, and has consistent Unicode
-semantics regardless of the underlying text representations.
+semantics regardless of the underlying text representations. By default
+Perl presents Unicode in "NFG" formation, where each grapheme counts as
+one character. A grapheme is what the novice user would think of as a
+character in their normal everyday life, including any diacritics.
What's with this NFG / Normal Form G that you refer to? I don't see any mention of that in http://unicode.org/reports/tr15/ ... did you mean NFC?
For that matter, is it possible for all realistic combinations of diacritics and base letters to be represented by a single Unicode codepoint, including all language-dependent graphemes?
I thought NFC gave roughly one codepoint per grapheme, with a few exceptions ... but I could be wrong on that point.
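For what it's worth, here is a quick sketch of the NFC question, written in Python only because its standard unicodedata module exposes the tr15 normalization forms directly. It has nothing to do with how Perl or "NFG" would handle this. The helper name nfc_len and the two sample strings are made up for illustration. It compares a base letter plus combining mark that has a precomposed codepoint against one that, as far as I know, does not.

```python
import unicodedata


def nfc_len(s: str) -> int:
    """Return the number of codepoints in s after NFC normalization."""
    return len(unicodedata.normalize("NFC", s))


# "e" + COMBINING ACUTE ACCENT: Unicode has a precomposed form, U+00E9.
e_acute = "e\u0301"

# "q" + COMBINING TILDE: assumed to have no precomposed codepoint,
# so NFC has nothing to compose it into.
q_tilde = "q\u0303"

print(nfc_len(e_acute))  # 1
print(nfc_len(q_tilde))  # 2
```

Both strings are a single grapheme to a reader, but only the first collapses to one codepoint under NFC, which suggests that a one-codepoint-per-grapheme view would need something beyond NFC.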
-- Darren Duncan