2013/6/19 Michael Everson <[email protected]>

> On 19 Jun 2013, at 09:59, "Jörg Knappen" <[email protected]> wrote:
>
> > Somehow, the compromise solution found at the ad hoc meeting sounds
> > fishy, because the is no such thing as
> > LATIN CAPITAL LETTER MARSHALLESE L or LATIN SMALL LETTER MARSHALLESE N
> > (to be equipped with a cedilla).
> >
> > It is not the base letter but the diacritic which makes the difference,
> > hence names like
> >
> > LATIN CAPITAL LETTER L WITH PROPER CEDILLA (marshallese)
> >
> > would sound better and more clear.
>
> The use of the name MARSHALLESE L and MARSHALLESE N serve to help prevent
> the mis-use of these characters.

Do you mean that it is supposed to prevent their use in Latvian/Livonian? This will happen anyway, simply because they are precomposed and users will still mix them, or because some styled fonts will look better to some authors and will have a cute presentation for these characters (notably in fonts with variable stroke widths) compared with a badly styled triangular comma (or a basic thin rectangle looking like an accent) in some Swiss-style fonts. It also creates an initial restriction on correct use in other languages or contexts.

For me the proposal is just a way to fix the presentation of the existing COMBINING CEDILLA, which has three major forms (chosen depending on the base Latin letter and its capitalisation), including the form that looks like COMBINING COMMA BELOW (in Latvian for L/l and N/n, and in Romanian for S/s, where similar confusion still occurs in NFC and NFD alike, long after the comma below was encoded distinctly).

For languages that consider this variation of glyphs for COMBINING CEDILLA unacceptable, we had better encode its specific form (as we did for COMBINING COMMA BELOW), so that we will have COMBINING CEDILLA ATTACHED BELOW (and at the same time you can encode the 4 precomposed letters you proposed, with their canonical decomposition using the new diacritic). Legacy usages will persist where existing precomposed letters already decompose to COMBINING CEDILLA. Notes added to these characters, as well as the representative glyph, can indicate which of the 3 forms is expected.

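To make the point about existing decompositions concrete, here is a minimal sketch using Python's standard `unicodedata` module. It only inspects characters that are already encoded; the helper name `decomposition_names` is my own and is not part of any proposal. It shows that the Latvian letters decompose to COMBINING CEDILLA, whereas the Romanian S WITH COMMA BELOW decomposes to COMBINING COMMA BELOW, so the cedilla/comma distinction already survives both NFC and NFD.

```python
import unicodedata


def decomposition_names(ch):
    """Return the character names of the canonical (NFD) decomposition of ch."""
    return [unicodedata.name(c) for c in unicodedata.normalize("NFD", ch)]


# Already-encoded precomposed letters mentioned in this thread.
SAMPLES = [
    "\u013B",  # LATIN CAPITAL LETTER L WITH CEDILLA (Latvian)
    "\u0146",  # LATIN SMALL LETTER N WITH CEDILLA (Latvian)
    "\u015E",  # LATIN CAPITAL LETTER S WITH CEDILLA
    "\u0218",  # LATIN CAPITAL LETTER S WITH COMMA BELOW (Romanian)
]

for ch in SAMPLES:
    print(unicodedata.name(ch), "=>", " + ".join(decomposition_names(ch)))

# NFC recomposes to the same precomposed letter, so the two S letters
# remain distinct characters in either normalization form.
for ch in SAMPLES:
    assert unicodedata.normalize("NFC", unicodedata.normalize("NFD", ch)) == ch
```

Since these decompositions are fixed by the stability policy, the glyph shown for L/l and N/n with COMBINING CEDILLA can only be influenced through fonts and annotations, not by changing what the existing characters decompose to.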
And for the 4 proposed characters, you can directly drop the word "MARSHALLESE": the encoded canonical decomposition using the new diacritic will already explicitly say that only one form is acceptable (for the other possible forms, use the base letter followed by COMBINING CEDILLA, or more precisely by COMBINING COMMA BELOW or COMBINING COMMA ABOVE).

In other words, the encoding could just as well be:

* COMBINING CEDILLA ATTACHED BELOW ; Mn ; <no decomposition>
* LATIN CAPITAL LETTER L WITH CEDILLA ATTACHED BELOW ; Lu ; <LATIN CAPITAL LETTER L, COMBINING CEDILLA ATTACHED BELOW>
* LATIN SMALL LETTER L WITH CEDILLA ATTACHED BELOW ; Ll ; <LATIN SMALL LETTER L, COMBINING CEDILLA ATTACHED BELOW>
* LATIN CAPITAL LETTER N WITH CEDILLA ATTACHED BELOW ; Lu ; <LATIN CAPITAL LETTER N, COMBINING CEDILLA ATTACHED BELOW>
* LATIN SMALL LETTER N WITH CEDILLA ATTACHED BELOW ; Ll ; <LATIN SMALL LETTER N, COMBINING CEDILLA ATTACHED BELOW>

(We still need this precision in the names of these precomposed letters, because of the pre-existing letters with legacy presentations looking like a comma below, which are also decomposable, but differently.)

And maybe we could map a few other precomposed letters at the same time **without requiring** existing languages to use them (but also **without restricting** them from doing so, if their usage changes or if new distinctions are needed when they borrow words such as toponyms from other languages and want to keep the distinctions).

E.g.:

* LATIN CAPITAL LETTER C WITH CEDILLA ATTACHED BELOW ; Lu ; <LATIN CAPITAL LETTER C, COMBINING CEDILLA ATTACHED BELOW>
* LATIN SMALL LETTER C WITH CEDILLA ATTACHED BELOW ; Ll ; <LATIN SMALL LETTER C, COMBINING CEDILLA ATTACHED BELOW>

(These are still not needed for use in French or Portuguese, but they become possible if there is ever a new development in which they coexist with forms with comma below. Those forms are already encoded explicitly in decomposed form and may already be used in fonts currently intended for French or Portuguese, where the comma below is also acceptable **today** without distinction.)
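As a rough consistency check on the entries suggested in this message, the sketch below writes them out as data and verifies the naming pattern. Everything in it is hypothetical: none of these characters is encoded, no code points are assigned, and the record layout (`name ; category ; decomposition`) and the helper names `parse` and `is_consistent` are only my own notation for this message, not the UnicodeData.txt format.

```python
# Hypothetical records copied from this proposal; no code points exist for them.
NEW_MARK = "COMBINING CEDILLA ATTACHED BELOW"

PROPOSED = """\
COMBINING CEDILLA ATTACHED BELOW ; Mn ;
LATIN CAPITAL LETTER L WITH CEDILLA ATTACHED BELOW ; Lu ; LATIN CAPITAL LETTER L, COMBINING CEDILLA ATTACHED BELOW
LATIN SMALL LETTER L WITH CEDILLA ATTACHED BELOW ; Ll ; LATIN SMALL LETTER L, COMBINING CEDILLA ATTACHED BELOW
LATIN CAPITAL LETTER N WITH CEDILLA ATTACHED BELOW ; Lu ; LATIN CAPITAL LETTER N, COMBINING CEDILLA ATTACHED BELOW
LATIN SMALL LETTER N WITH CEDILLA ATTACHED BELOW ; Ll ; LATIN SMALL LETTER N, COMBINING CEDILLA ATTACHED BELOW
LATIN CAPITAL LETTER C WITH CEDILLA ATTACHED BELOW ; Lu ; LATIN CAPITAL LETTER C, COMBINING CEDILLA ATTACHED BELOW
LATIN SMALL LETTER C WITH CEDILLA ATTACHED BELOW ; Ll ; LATIN SMALL LETTER C, COMBINING CEDILLA ATTACHED BELOW
"""


def parse(text):
    """Split each 'name ; category ; decomposition' line into a tuple."""
    records = []
    for line in text.splitlines():
        name, category, decomposition = (field.strip() for field in line.split(";"))
        parts = [p.strip() for p in decomposition.split(",")] if decomposition else []
        records.append((name, category, parts))
    return records


def is_consistent(name, category, parts):
    """Check that a record follows the naming pattern used in this message."""
    if not parts:
        # Only the new combining mark itself has no decomposition.
        return name == NEW_MARK and category == "Mn"
    if len(parts) != 2:
        return False
    base, mark = parts
    expected_category = "Lu" if "CAPITAL" in base else "Ll"
    return (
        mark == NEW_MARK
        and name == f"{base} WITH CEDILLA ATTACHED BELOW"
        and category == expected_category
    )


RECORDS = parse(PROPOSED)
for record in RECORDS:
    print("ok " if is_consistent(*record) else "BAD", record[0])
```

The point of the check is only that every precomposed name is derivable from its base letter plus the single new diacritic, which is why the word "MARSHALLESE" adds nothing to the names.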

