Hi all,

I came across the following point in the Unicode FAQ that explains why the Unicode standard does not contain any characters for digraphs:

http://www.unicode.org/unicode/faq/ligature_digraph.html#3

I find the comments therein rather perplexing, especially since, if each digraph were denoted by a single new glyph, it would certainly have been included. Where a combination of glyphs is treated as a single character in every way, from sorting in dictionaries to filling in crossword puzzles, it seems counterintuitive that we should have to rely upon (albeit well-developed) heuristics to collate words.

In all practicality, I do not expect that writers in languages such as Spanish, Hungarian, and Welsh, where digraphs are used fairly commonly, would immediately change all their texts to use the appropriate single Unicode characters, had they existed. Of course, it is also true that a decade or two ago, common substitutions such as i^ for i-with-circumflex had to be made. Since then, these characters have appeared in various character "sets", and whilst it's still fairly common to leave the diacritics off, a great many more people now use the proper characters.

Since there are 676 (26 x 26) possible digraph combinations, I endeavoured to come up with a simpler approach to marking a digraph as a single character than simply creating a codepoint for each one. I have two ideas so far:

* Come up with a set of A-Za-z combining characters, such that c + combining-h would form a "ch" grapheme
* Come up with a digraph combining character, such that c + h + digraph-combining-character forms the "ch" grapheme

The former idea is the more costly, since it involves reserving more codepoints, is not backwards compatible in languages such as HTML (with the latter solution, you can use CSS to prevent the digraph combining character from being displayed), and is limited to Latin digraphs.

If anyone has any comments on this, or any references to previous discussions, they would be gladly received.

-- 
Kindest Regards,
Sean B. Palmer
@prefix : <http://purl.org/net/swn#> .
:Sean :homepage <http://purl.org/net/sbp/> .
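
P.S. To make the second idea a little more concrete, here is a rough Python sketch of how software might treat "c + h + digraph-combining-character" as one unit. Everything in it is an assumption for illustration only: no such character exists in Unicode, so I use the Private Use Area codepoint U+E000 as a stand-in, and the names DIGRAPH_MARK, graphemes, strip_mark and sort_key are made up. The alphabet is the traditional Spanish ordering (ch after c, ll after l) as I understand it.

```python
# Hypothetical stand-in for the proposed digraph combining character.
# U+E000 is a Private Use Area codepoint; no such character exists in Unicode.
DIGRAPH_MARK = "\uE000"

# Traditional Spanish ordering, with "ch" and "ll" as letters in their own right.
ALPHABET = "a b c ch d e f g h i j k l ll m n ñ o p q r s t u v w x y z".split()
RANK = {unit: index for index, unit in enumerate(ALPHABET)}


def graphemes(text):
    """Split text into units, treating letter + letter + DIGRAPH_MARK as one unit."""
    units = []
    i = 0
    while i < len(text):
        if i + 2 < len(text) and text[i + 2] == DIGRAPH_MARK:
            units.append(text[i:i + 2])  # the marked pair, without the mark
            i += 3
        else:
            units.append(text[i])
            i += 1
    return units


def strip_mark(text):
    """Fallback for software that does not know the mark: drop it and show plain letters."""
    return text.replace(DIGRAPH_MARK, "")


def sort_key(word):
    """Collation key built from whole units, so a marked "ch" sorts after every "c..." word."""
    return [RANK.get(unit, len(RANK)) for unit in graphemes(word.lower())]


words = ["dama", "ch" + DIGRAPH_MARK + "ico", "cuna", "cosa"]
ordered = [strip_mark(w) for w in sorted(words, key=sort_key)]
print(ordered)
```

The point is that no heuristic is needed to decide whether a given "ch" is a digraph: the mark says so explicitly, and stripping it (the equivalent of hiding it with CSS) gives back the text as it is written today.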

