"Disunification may be an answer?" We should avoid it as well. We have other solutions in Unicode - variation selectors (often used for sinograms when their unified shapes must be distinguished in some contexts such as people names or toponyms or trademark names or in other specific contexts), - or combining sequences (including in Arabic or Hebrew where many combining characters are not always represented visually, the same occuring as well in Latin with accents not always presented over capitals), - or sequences of multiple characters (like in Emojis for skin color variants, or sequences for encoding flags), - or other sequences using joiners (e.g. in South Asian scripts).
Disunification is only acceptable when
- the concepts are completely distinct and the "similar" shapes also differ, even if one originates from the other (e.g. the Latin slashed o, disunified from the Latin o even though the sequence o + combining slash also exists; that sequence is almost never used because its rendering is too approximate in most cases),
- or there is a clear distinction of semantics and properties (e.g. the Latin AE ligature, which is not appropriately represented by the two separate letters, not even with a "hinting" joiner, and which has specific properties as a plain letter, e.g. with mappings).

Before disunifying a character, we should first study the alternative of representing it as a sequence.

2016-03-16 18:34 GMT+01:00 Asmus Freytag (t) <[email protected]>:

> On 3/15/2016 8:14 PM, David Faulks wrote:
>
> As part of my investigations into astrological symbols, I'm beginning to
> wonder if glyph variations are justifications for separate encoding of
> symbols I would have previously considered the same or unifiable with symbols
> already in Unicode.
>
> For example, the semisquare aspect is usually shown with a glyph that is
> identical to ∠ (U+2220 ANGLE). However, sometimes it looks like <, or like ∟
> (U+221F RIGHT ANGLE). Would this be better encoded as a separate codepoint?
>
> The parallel aspect, similarly, sometimes looks like ∥ (U+2225 PARALLEL TO),
> but is often shown as // or ⫽ (U+2AFD DOUBLE SOLIDUS OPERATOR). This is not a
> typographical kludge since astrological fonts often show it this way.
> There is also contra-parallel, which is sometimes shown like ∦ (U+2226 NOT
> PARALLEL TO), but has variant glyphs with slanted lines (and the crossbar is
> often horizontal).
>
> The ‘part of fortune’ is sometimes a circled ×, or sometimes a circled +.
>
> Would it be better to have dedicated characters than to assume unifications
> in these cases?
>
> My take is that for symbols there's always that tension between encoding
> the "concept" or encoding the shape. In my view, it is often impossible to
> answer the question whether the different angles (for example) are merely
> different "shapes" of one and the same "symbol", or whether it isn't the
> case that there are different "conventions" (using different symbols for
> the same concept).
>
> Disunification is useful whenever different concepts require distinct
> symbol shapes (even if there are some general similarities). If other
> concepts make use of the same shapes interchangeably, it is then up to the
> author to fix the convention by selecting one or the other shape.
> Conceptually, that is similar to the decimal point: it can be either a
> period, or a comma, depending on locale (read: depending on the convention
> the author follows).
>
> Sometimes, concepts use multiple symbol shapes, but all of these shapes
> map to the same concept (and other uses are not known). In that case,
> unifying the shapes might be acceptable. The selection of shape is then a
> matter of the font (and may not always be under the control of the author).
> Conceptually, that is similar to the integral sign, which can be slanted or
> upright. The choice is one of style. While authors or readers may prefer
> one look over the other, the identity of the symbol is not in question, and
> there's no impact on transmission of the contents of the text.
>
> Whenever we have the former case, that is, multiple conventional
> presentations that are symbols in their own right in other contexts, then
> encoding an additional "generic" shape should be avoided. Unicode
> explicitly did not encode a generic "decimal point". If the convention that
> is used matters, the author is better off being able to select a specific
> shape. The results will be more predictable. The downside is that a search
> will have to cover all the conventions.
> Conceptually, that is no different from having to search for both "color"
> and "colour".
>
> The final case is where a convention for depicting a concept uses a symbol
> that itself has some variability (for example when representing some other
> concepts), such that some of its forms make it less than ideal for the
> conventional use intended for the concept in question. Unicode has
> historically not always been able to provide a solution. In some of these
> cases, plain text (that is, without a fixed font association) may simply
> not give the desired answer. If specialized fonts for the convention (e.g.
> astrological fonts) do not usually exist or can't be expected, then
> disunifying the symbol's shapes may be an answer.
>
> A./
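
As a footnote to the two disunification examples in the reply at the top (the slashed o and the AE ligature), here is a minimal Python sketch of how those letters behave in the Unicode Character Database as exposed by Python's unicodedata module. Treating "combining slash" as U+0338 COMBINING LONG SOLIDUS OVERLAY is an assumption of this sketch; the reply does not name a code point.

```python
import unicodedata

SLASHED_O = "\u00F8"      # LATIN SMALL LETTER O WITH STROKE
O_PLUS_SLASH = "o\u0338"  # o + COMBINING LONG SOLIDUS OVERLAY (assumed "combining slash")
AE = "\u00E6"             # LATIN SMALL LETTER AE
E_ACUTE = "\u00E9"        # LATIN SMALL LETTER E WITH ACUTE, for contrast


def is_atomic(ch):
    """True if the character has no decomposition mapping at all."""
    return unicodedata.decomposition(ch) == ""


# The slashed o and the AE ligature are atomic letters; e-acute is not.
print(is_atomic(SLASHED_O), is_atomic(AE), is_atomic(E_ACUTE))

# Normalization does not turn the sequence into the atomic letter.
same_after_nfc = unicodedata.normalize("NFC", O_PLUS_SLASH) == SLASHED_O
print(same_after_nfc)

# The AE ligature has its own case mapping, and compatibility
# decomposition does not split it into "a" + "e".
ae_upper = AE.upper()
ae_nfkd = unicodedata.normalize("NFKD", AE)
print(ae_upper == "\u00C6", ae_nfkd == AE)
```

The contrast with e-acute shows the difference: an accented letter is canonically equivalent to its combining sequence, while the disunified letters stand on their own.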

