Another example: suppose you want to represent the epigraphic notation in which a tie groups several orthographic characters, for use in texts discussing grammar. You can use the special combining character with combining class 0 that I propose to annotate:
- the first orthographic character (or cluster) with the standard half
  diacritic encoding the left part of the tie: encode that half
  diacritic after the special character
- the second orthographic character (or cluster) with the standard half
  diacritic encoding the right part of the tie: encode that half
  diacritic after the special character
- use the SAME special combining character (if there is a set of such
  characters) to indicate that both notations are associated.

This gives renderers the hint that they can safely join the two halves!

On 12 March 2012 at 00:56, Philippe Verdy <[email protected]> wrote:
> One example: say you want to encode an epigraphic C with CEDILLA
> appearing as a letter above another one. You would encode:
>
> - (1) the orthographic base letter (with its standard diacritics,
> including CGJ if needed)
> - (2) the new special combining character with combining class 0 that I
> propose
> - (3) the existing combining letter C
> - (4) the existing combining CEDILLA (or other existing diacritics,
> including CGJ if needed to avoid reorderings by normalizers).
>
> Character (2) tells renderers that they must not reorder, mix, or
> compose arbitrarily across parts (1) and (3, 4). It also tells them
> that they can safely precompose the characters in (3, 4) without
> breaking anything, and that they do not have to represent character
> (2) itself (though they could still do so, using some other layout
> mechanism).
>
> Semantic analysers know how to interpret the characters in (3, 4)
> together, at the semantic level they associate with the special
> character (2).
>
> Orthographic checkers know that characters (2, 3, 4) are to be
> ignored: they check only the characters in (1) and skip the rest, as
> indicated by character (2), to which they assign no orthographic
> meaning.
>
> Sorters continue to work (characters (2, 3, 4) can be given a non-zero
> weight only at higher collation levels).
>
> On 12 March 2012 at 00:44, Philippe Verdy <[email protected]> wrote:
>> Also, I do think that this proposal would avoid having to encode many
>> new "precomposed" diacritics made of a diacritic letter and a
>> diacritic applying to it. We would just encode them using such a
>> separator first, followed by the encoded diacritic letter and the
>> standard combining diacritics.
>>
>> With this tool we can immediately cover all scripts at once, for all
>> languages and all usages.
>>
>> On 12 March 2012 at 00:36, Philippe Verdy <[email protected]> wrote:
>>> In other words, that circumflex is an epigraphic notation. This means
>>> three distinct levels of analysis of the text: one for Chi, one for
>>> the small letter above it noting something about the Chi, and another
>>> for the circumflex noting something about the Chi itself.
>>>
>>> This causes a major problem: how do we cleanly separate those levels
>>> of representation when diacritics are NOT supposed to modify a letter
>>> orthographically?
>>>
>>> 1) Use an upper-layer protocol (this is the position constantly
>>> adopted, but it has its limits).
>>>
>>> 2) Use a special invisible combining character as a prefix (with
>>> combining class 0, to avoid reorderings and other ambiguous combined
>>> forms caused by normalization) to separate the standard diacritics
>>> encoded after it and give them an unspecified additional semantic.
>>>
>>> 3) Or possibly use several such special invisible combining
>>> characters in a coherent set, if that is needed to make semantic
>>> distinctions between these multiple (but optional) epigraphic levels.
>>> We could have 16 of them, encoded at once in one column in the
>>> special plane, each one with a numeric property which does not
>>> designate how it will be used in actual texts, in a way similar to
>>> the multiple variation selectors or the multiple PUAs, which are not
>>> very well suited to combining characters.
>>>
>>>
>>> On 11 March 2012 at 14:06, Michael Everson <[email protected]> wrote:
>>>> On 11 Mar 2012, at 12:05, Denis Jacquerye wrote:
>>>>
>>>>> Stacked letters are also found in some Greek manuscripts.
>>>>> See the page
>>>>> http://www.archive.org/stream/revuearchologi27pariuoft#page/156/mode/1up
>>>>> with some examples: Nu, omicron, omicron and Greek circumflex (tilde),
>>>>> chi and Greek circumflex.
>>>>> Would these also have to be represented by combining characters?
>>>>
>>>> Yes, but in this case I don't think that circumflex is part of the
>>>> superscript letter per se. It's a base letter with a combining letter, and
>>>> the whole thing has a mark over it to show it's an abbreviation. (There is
>>>> obviously no chi-circumflex in Greek orthography.)
>>>>
>>>> Michael Everson * http://www.evertype.com/
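
As a rough illustration of the normalization concern raised in this thread, the sketch below uses Python's standard unicodedata module to show that canonical reordering swaps a combining letter c (U+0368) and a combining cedilla (U+0327) that follow a base letter, and that a character with combining class 0 placed between them blocks the swap. The proposed separator is not an encoded character, so U+034F COMBINING GRAPHEME JOINER stands in for "a class-0 combining character" here; that substitution, and all the names in the code, are assumptions made for illustration only.

```python
import unicodedata

BASE = "\u03C7"     # GREEK SMALL LETTER CHI, the orthographic base letter
COMB_C = "\u0368"   # COMBINING LATIN SMALL LETTER C
CEDILLA = "\u0327"  # COMBINING CEDILLA
CGJ = "\u034F"      # COMBINING GRAPHEME JOINER, used here only as a
                    # stand-in for a combining character with class 0


def classes(text):
    """Return the canonical combining class of each code point."""
    return [unicodedata.combining(ch) for ch in text]


def nfd(text):
    """Apply canonical decomposition, which includes canonical reordering."""
    return unicodedata.normalize("NFD", text)


# Letter c followed directly by the cedilla meant to apply to it.
unguarded = BASE + COMB_C + CEDILLA

# Same sequence with a class-0 character between the two marks.
guarded = BASE + COMB_C + CGJ + CEDILLA

for label, seq in (("unguarded", unguarded), ("guarded", guarded)):
    out = nfd(seq)
    print(label, classes(seq), "->", [f"U+{ord(ch):04X}" for ch in out])
```

In the unguarded sequence the cedilla has the lower combining class, so a normalizer moves it in front of the combining c and it no longer follows the letter it was meant to modify. In the guarded sequence the class-0 character acts as a barrier and the encoded order survives.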

