Re: Standaridized variation sequences for the Desert alphabet?
> On 26 Mar 2017, at 09:12, Martin J. Dürst wrote:
>
>> That's a good point: any disunification requires showing examples of
>> contrasting uses.
>
> Fully agreed.

The default position is NOT “everything is encoded unified until disunified”.

The characters in question have different and undisputed origins, undisputed. We’ve encoded one pair; evidently this pair was deprecated and another pair was devised.

The letters wynn and w are also used for the same thing. They too have different origins and are encoded separately. The letters yogh and ezh have different origins and are encoded separately. (These are not perfect analogies, but they are pertinent.)

> We haven't yet heard of any contrasting uses for the letter shapes we are
> discussing.

Contrasting use is NOT the only criterion we apply when establishing the characterhood of characters. Please try to remember that. (It’s a bit shocking to have to remind people of this.)

Michael Everson
Re: Standaridized variation sequences for the Desert alphabet?
On 3/26/2017 1:51 PM, Michael Everson wrote:
>> Finally, if this was in major, modern use, adding these code points would have grave consequences for security.
>
> Why? They’re not visually similar to the existing characters. So spoofing wouldn’t be an issue.

Spoofing would absolutely be an issue, because if there are free alternates users will mis-remember which one was used for a given label. Goes for the whole simplified / traditional issue in the Han script. Issues are not limited to visual similarity.

A./
Re: Standaridized variation sequences for the Desert alphabet?
On 3/26/2017 9:20 AM, Michael Everson wrote:
> On 26 Mar 2017, at 16:45, Asmus Freytag wrote:
>> The priority in encoding has to be with allowing distinctions in modern texts, or distinctions that matter to modern users of historic writing systems. Beyond that, theoretical analysis of typographical evolution can give some interesting insight, but I would be in the camp that does not accord them a status as primary rationale for encoding decisions.
>
> Our rationales are NOT ranked in the way you suggest. A variety of criteria are applied.

And the way you weigh the criteria?

>> Thus, critical need for contrasting use of the glyph distinctions would have to be established before it makes sense to discuss this further.
>
> Precedent for such needs is well-established. Consider the Latin Extended-D block. Sometimes it is editorial preference, and that’s not even always universal.

I think the Latin Extended-D block may have its own problems. However, Latin as a script caters to so many varied levels of users, from ordinary text to scholarly notations that it really cannot be used to settle this issue.

>> I see no principled objection to having a font choice result in a noticeable or structural glyph variation for only a few elements of an alphabet. We have handle-a vs. bowl-a as well as hook-g vs. loop-g in Latin, and fonts routinely select one or the other.
>
> Well, Asmus, we encode a and ɑ as well as g and ɡ and ᵹ.

And we do that for reasons that are very different from preserving the early and possibly transient history of a minor script.

> And we do not consider ɑ and ɡ and ᵹ to be things that ought to be distinguished by variation selectors. (I am of course well aware of IPA usage.)

Yes, and the absence of such usage in the current example makes all the difference.

> Whole-font switching is well understood. But character origin has always been taken into account. Consider 2EBC ⺼ CJK RADICAL MEAT and 2E9D ⺝ CJK RADICAL MOON which are apparently really supposed to have identical glyphs, though we use an old-fashioned style in the charts for the former. (Yes, I am of course aware that there are other reasons for distinguishing these, but as far as glyphs go, even our standard distinguishes them artificially.)

Apparently not only in the standard, because they show as different in the plaintext view of this message.

>> (It is only for usage outside normal text that the distinction between these forms matters).
>
> What’s “normal” text? “Normal” text in Latin probably doesn’t use the characters from the Latin Extended-D block.

"ordinary" text, if you like, reflecting standard orthographies. As opposed to notational systems.

>> While the Deseret forms are motivated by their pronunciation, I'm not necessarily convinced that the distinction has any practical significance that is in any way different than similar differences in derivation (e.g. for long s-s or long-s-z for German esszett).
>
> One practical consequence of changing the chart glyphs now, for instance, would be that it would invalidate every existing Deseret font. Adding new characters would not.

No, if we state that both glyphs are alternates for the same character *and if we decide, to _not_ add variation selectors* the choice is where it belongs: with the font maker.

>> In fact, it would seem that if a Deseret text was encoded in one of the two systems, changing to a different font would have the attractive property of preserving the content of the text (while not preserving the appearance).
>
> Changing to a different font in order to change one or two glyphs is a mechanism that we have actually rejected many times in the past. We have encoded variant and alternate characters for many scripts.

If the underlying text element is the same, font switching can be the correct choice.

>> This, in a nutshell, is the criterion for making something a font difference vs. an encoding distinction.
>
> Character identity is not defined by any single criterion.

Make it the "primary" criterion then.

> Moreover, in Deseret, it is not the case that all texts which contain the diphthong /juː/ or /ɔɪ/ write it using EW 𐐧 or OI 𐐦. Many write them as Y + U 𐐏𐐋 and O + I 𐐄𐐆. So the choice is one of *spelling*, and spelling has always been a primary criterion for such decisions.

Yes, and those other spellings are not affected.

>>> This is complicated by combining characters mostly identified by glyph, and the fact that while ä and aͤ may be the same character across time, there are people wanting to distinguish them in the same text today, and in both cases the theoretical falls to the practical. In this case, there are no combining character issues and there's nobody needing to use the two forms in the same text.
>>
>> huh?
>
> He’s wrong there, as I pointed out. A text in German may write an older Clavieruͤbung in a citation alongside the normal spelling Klavierübung. The choice of spelling is key.

That would
Re: Standaridized variation sequences for the Desert alphabet?
On 3/26/2017 9:23 AM, Michael Everson wrote:
> On 26 Mar 2017, at 17:02, Asmus Freytag wrote:
>> On 3/26/2017 6:18 AM, Michael Everson wrote:
>>> In any case it’s not a disunification. Some characters are encoded; they were used to write diphthongs in 1855. These characters were abandoned by 1859, and other characters were devised.
>>
>> Calling them "characters" is pre-judging the issue, don't you think?
>
> No, I don’t think so.

I really think it is.

>> We know that these are different shapes, but that they stand for the same text elements.
>
> No, they don’t. Those diphthongs can also be represented in other ways in Deseret.

Having alternative ways to represent these doesn't invalidate or affect my argument.

> I’ve never accepted the view that “everything is already encoded and everything new is a disunification” which seems to be a pretty common view.

I would not say I aspire to the view you quote.

If you encode a certain shape, it may get used for a range of text elements. This would (de facto) encode these text elements via that shape. If it is later felt that the given shape should not be used for the full range of text elements, then you could say that the "implicit" unification based on the usage (or, if you will, "fallback usage") was mistaken and should be better handled by two (or more) shapes. This represents a "de-facto" disunification.

However, where I part from your description is the "everything is already encoded". That would not be the case anywhere a range of text elements cannot be represented at all. Your statement also implies a "correctly encoded" or "successfully encoded" which is different from "there's an encoding that some people use as a fallback", which, if disunification should prove proper later on, would be a better way of describing what was the original situation.

Perhaps the point is subtle, but it is important.

In the current case, you have the opposite, to wit, the text elements are unchanged, but you would like to add alternate code elements to represent what are, ultimately, the same text elements. That's not disunification, but dual encoding.

A./
Re: Standaridized variation sequences for the Desert alphabet?
On 26 Mar 2017, at 17:02, Asmus Freytag wrote:
>
> On 3/26/2017 6:18 AM, Michael Everson wrote:
>
>> In any case it’s not a disunification. Some characters are encoded; they
>> were used to write diphthongs in 1855. These characters were abandoned by
>> 1859, and other characters were devised.
>
> Calling them "characters" is pre-judging the issue, don't you think?

No, I don’t think so.

> We know that these are different shapes, but that they stand for the same
> text elements.

No, they don’t. Those diphthongs can also be represented in other ways in Deseret.

I’ve never accepted the view that “everything is already encoded and everything new is a disunification” which seems to be a pretty common view.

Michael Everson
Re: Standaridized variation sequences for the Desert alphabet?
Michael Everson wrote:
> One practical consequence of changing the chart glyphs now, for instance, would be that it would invalidate every existing Deseret font. Adding new characters would not.

I thought the chart glyphs were not normative.

--
Doug Ewell | Thornton, CO, US | ewellic.org
Re: Standaridized variation sequences for the Desert alphabet?
On 26 Mar 2017, at 18:20, Doug Ewell wrote:
>
> Michael Everson wrote:
>
>> One practical consequence of changing the chart glyphs now, for instance,
>> would be that it would invalidate every existing Deseret font. Adding new
>> characters would not.
>
> I thought the chart glyphs were not normative.

Come on, Doug. The letter W is a ligature of V and V. But sure, the glyphs are only informative, so why don’t we use an OO ligature instead?

Michael.
Re: Diaeresis vs. umlaut (was: Re: Standaridized variation sequences for the Desert alphabet?)
Philippe Verdy wrote:
> Or maybe, only for historic texts, we could add a combining lowercase e as an alternative to the existing diaeresis.

Something like U+0364 COMBINING LATIN SMALL LETTER E, maybe?

--
Doug Ewell | Thornton, CO, US | ewellic.org
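For readers who want to check the character Doug names, here is a minimal Python sketch using only the standard `unicodedata` module. The variable names are illustrative assumptions, and the results depend on the Unicode data bundled with the interpreter.

```python
import unicodedata

# U+0364 is the combining mark named in the message above.
combining_e = "\u0364"
mark_name = unicodedata.name(combining_e)
mark_class = unicodedata.combining(combining_e)  # non-zero for combining marks

# Historic spelling (u + combining small e) versus precomposed u with diaeresis.
historic_u = "u" + combining_e
modern_u = "\u00fc"

# Normalization leaves the two spellings distinct; it does not merge them.
unified_by_nfc = unicodedata.normalize("NFC", historic_u) == modern_u

print(mark_name, mark_class, unified_by_nfc)
```

So the superscript-e spelling and the diaeresis spelling already remain separate in plain text.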
Re: Standaridized variation sequences for the Desert alphabet?
On 26 Mar 2017, at 16:45, Asmus Freytag wrote:
>
> The priority in encoding has to be with allowing distinctions in modern
> texts, or distinctions that matter to modern users of historic writing
> systems. Beyond that, theoretical analysis of typographical evolution can
> give some interesting insight, but I would be in the camp that does not
> accord them a status as primary rationale for encoding decisions.

Our rationales are NOT ranked in the way you suggest. A variety of criteria are applied.

> Thus, critical need for contrasting use of the glyph distinctions would have
> to be established before it makes sense to discuss this further.

Precedent for such needs is well-established. Consider the Latin Extended-D block. Sometimes it is editorial preference, and that’s not even always universal.

> I see no principled objection to having a font choice result in a noticeable
> or structural glyph variation for only a few elements of an alphabet. We have
> handle-a vs. bowl-a as well as hook-g vs. loop-g in Latin, and fonts
> routinely select one or the other.

Well, Asmus, we encode a and ɑ as well as g and ɡ and ᵹ. And we do not consider ɑ and ɡ and ᵹ to be things that ought to be distinguished by variation selectors. (I am of course well aware of IPA usage.)

Whole-font switching is well understood. But character origin has always been taken into account. Consider 2EBC ⺼ CJK RADICAL MEAT and 2E9D ⺝ CJK RADICAL MOON which are apparently really supposed to have identical glyphs, though we use an old-fashioned style in the charts for the former. (Yes, I am of course aware that there are other reasons for distinguishing these, but as far as glyphs go, even our standard distinguishes them artificially.)

> (It is only for usage outside normal text that the distinction between these
> forms matters).

What’s “normal” text? “Normal” text in Latin probably doesn’t use the characters from the Latin Extended-D block.

> While the Deseret forms are motivated by their pronunciation, I'm not
> necessarily convinced that the distinction has any practical significance
> that is in any way different than similar differences in derivation (e.g. for
> long s-s or long-s-z for German esszett).

One practical consequence of changing the chart glyphs now, for instance, would be that it would invalidate every existing Deseret font. Adding new characters would not.

> In fact, it would seem that if a Deseret text was encoded in one of the two
> systems, changing to a different font would have the attractive property of
> preserving the content of the text (while not preserving the appearance).

Changing to a different font in order to change one or two glyphs is a mechanism that we have actually rejected many times in the past. We have encoded variant and alternate characters for many scripts.

> This, in a nutshell, is the criterion for making something a font difference
> vs. an encoding distinction.

Character identity is not defined by any single criterion. Moreover, in Deseret, it is not the case that all texts which contain the diphthong /juː/ or /ɔɪ/ write it using EW 𐐧 or OI 𐐦. Many write them as Y + U 𐐏𐐋 and O + I 𐐄𐐆. So the choice is one of *spelling*, and spelling has always been a primary criterion for such decisions.

>> This is complicated by combining characters mostly identified by glyph, and
>> the fact that while ä and aͤ may be the same character across time, there
>> are people wanting to distinguish them in the same text today, and in both
>> cases the theoretical falls to the practical. In this case,
>> there are no combining character issues and there's nobody needing to use
>> the two forms in the same text.
>
> huh?

He’s wrong there, as I pointed out. A text in German may write an older Clavieruͤbung in a citation alongside the normal spelling Klavierübung. The choice of spelling is key.

Michael Everson
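The spelling point can be illustrated with a short Python sketch. It assumes the currently encoded Deseret capitals EW (U+10427) and OI (U+10426) and the two-letter spellings YEE + SHORT OO and LONG O + SHORT I; the helper names `names` and `nfc` are made up for this example.

```python
import unicodedata

# Ligature-letter spellings, as currently encoded.
ew = "\U00010427"   # DESERET CAPITAL LETTER EW
oi = "\U00010426"   # DESERET CAPITAL LETTER OI

# Two-letter spellings of the same diphthongs.
y_u = "\U0001040F\U0001040B"   # YEE + SHORT OO
o_i = "\U00010404\U00010406"   # LONG O + SHORT I

def names(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]

def nfc(text):
    return unicodedata.normalize("NFC", text)

# Normalization does not equate a ligature-letter with its two-letter spelling.
spellings_differ = nfc(ew) != nfc(y_u) and nfc(oi) != nfc(o_i)

for label, text in [("EW", ew), ("Y+U", y_u), ("OI", oi), ("O+I", o_i)]:
    print(label, names(text))
```

In other words, which spelling a text uses is already visible in the encoded data, whatever glyph a font later draws for EW or OI.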
Re: Standaridized variation sequences for the Desert alphabet?
> On 26 Mar 2017, at 16:59, Asmus Freytag wrote:
>
> On 3/26/2017 8:47 AM, Michael Everson wrote:
>>> On 26 Mar 2017, at 16:45, Asmus Freytag wrote:
>>>
>>> The latter is patent nonsense, because ä and aͤ are even less related to
>>> each other than "i" and "j"; never mind the fact that their forms are both
>>> based on the letter "a". Encoding and font choice should be seen as
>>> separate.
>> He refers to the shape of the diacritical marks.
>
> I see the issue: the font selected on my end made the "e" look like an "o",
> which completely changed my understanding of what he tried to communicate.

Ah, yes.

M
Re: Standaridized variation sequences for the Desert alphabet?
On 3/26/2017 10:33 AM, Michael Everson wrote:
> On 26 Mar 2017, at 18:20, Doug Ewell wrote:
>> Michael Everson wrote:
>>> One practical consequence of changing the chart glyphs now, for instance, would be that it would invalidate every existing Deseret font. Adding new characters would not.
>>
>> I thought the chart glyphs were not normative.
>
> Come on, Doug. The letter W is a ligature of V and V. But sure, the glyphs are only informative, so why don’t we use an OO ligature instead?

If there was a tradition of writing W like omega, then switching the chart glyphs to that alternative tradition would be something that is at least not inconceivable -- even if perhaps not advisable.

For letters, their primary identity is not given by their shape, but their position / function in the alphabet. That's why making Gaelic style and Fraktur a font switch works at all, even if that is not perfect (viz, ligatures in Fraktur).

In the Deseret case, making this alternation a font choice would tend to preserve the content of all documents. Making this an encoding difference would indeed invalidate some documents.

Finally, if this was in major, modern use, adding these code points would have grave consequences for security.

A./
Re: Standaridized variation sequences for the Desert alphabet?
On 26 Mar 2017, at 21:39, Asmus Freytag wrote:

>> Come on, Doug. The letter W is a ligature of V and V. But sure, the glyphs
>> are only informative, so why don’t we use an OO ligature instead?
>
> If there was a tradition of writing W like omega, then switching the chart
> glyphs to that alternative tradition would be something that is at least not
> inconceivable -- even if perhaps not advisable.

You know, Asmus, no analogy is perfect. But mine was a discussion of letters derived from ligatures, and yours is just a random note about shape.

> For letters, their primary identity is not given by their shape, but their
> position / function in the alphabet.

This isn’t really something you can turn into an axiom, much as you would like to. Position in the alphabet may vary WIDELY from language to language. As can function. The Latin letter c can mean /k s tʃ ts ʔ ʃ θ/…

> That's why making Gaelic style and Fraktur a font switch works at all, even
> if that is not perfect (viz, ligatures in Fraktur).

Font style isn’t the same thing in this context. The historical letters used to make the 1855 ligatures are *different* letters than those used for the 1859 ligatures.

> In the Deseret case, making this alternation a font choice would tend to
> preserve the content of all documents.

No, since it’s a question of *spelling*. Some documents use a ligature-letter for the diphthong /juː/. Some documents use two separate letters for the same diphthong. So there’s no “standardized” spelling that works for all text that would be affected here. (Spelling for English wasn’t standardized anyway in historical Deseret texts and there is much variety.)

> Making this an encoding difference would indeed invalidate some documents.

Right now the 1859 characters aren’t representable. Deciding to change the chart glyphs to 1859 glyphs would just destabilize EVERY current Deseret font. That’s not something we should do.

> Finally, if this was in major, modern use, adding these code points would
> have grave consequences for security.

Why? They’re not visually similar to the existing characters. So spoofing wouldn’t be an issue.

Michael Everson
Re: Standaridized variation sequences for the Desert alphabet?
On Sun, 26 Mar 2017 18:33:00 +0100 Michael Everson wrote:

> On 26 Mar 2017, at 18:20, Doug Ewell wrote:
> > Michael Everson wrote:
> >> One practical consequence of changing the chart glyphs now, for
> >> instance, would be that it would invalidate every existing Deseret
> >> font. Adding new characters would not.
> > I thought the chart glyphs were not normative.
> Come on, Doug. The letter W is a ligature of V and V. But sure, the
> glyphs are only informative, so why don’t we use an OO ligature
> instead?

A script-style font might legitimately use a glyph that looks like a small omega for U+0077 LATIN SMALL LETTER W. Small omega, of course, is an οο ligature.

More to the point, a font may legitimately use the same glyphs for U+0067 LATIN SMALL LETTER G and U+0261 LATIN SMALL LETTER SCRIPT G.

A more serious issue is the multiple forms of U+014A LATIN CAPITAL LETTER ENG, for which the underlying unity comes from their being the capital form of U+014B LATIN SMALL LETTER ENG.

Are there not serious divergences with the shapes of the Syriac letters?

Richard.
Re: Standaridized variation sequences for the Desert alphabet?
On 26 Mar 2017, at 21:48, Richard Wordingham wrote:

>> Come on, Doug. The letter W is a ligature of V and V. But sure, the glyphs
>> are only informative, so why don’t we use an OO ligature instead?
>
> A script-style font might legitimately use a glyph that looks like a small
> omega for U+0077 LATIN SMALL LETTER W.

As I said to Asmus, my analogy was about ligatures made from underlying letters. Yours doesn’t apply because it’s just talking about glyph shapes.

> Small omega, of course, is an οο ligature.

True. :-) Isn’t history wonderful?

> More to the point, a font may legitimately use the same glyphs for U+0067
> LATIN SMALL LETTER G and U+0261 LATIN SMALL LETTER SCRIPT G.

A good font will still find a way to distinguish them. :-)

> A more serious issue is the multiple forms of U+014A LATIN CAPITAL LETTER
> ENG, for which the underlying unity comes from their being the capital form
> of U+014B LATIN SMALL LETTER ENG.

We could have, and should have, solved this problem *long ago* by encoding LATIN CAPITAL LETTER AFRICAN ENG and LATIN SMALL LETTER AFRICAN ENG.

> Are there not serious divergences with the shapes of the Syriac letters?

That is analogous to Roman/Gaelic/Fraktur. That analogy doesn’t apply to these Deseret characters; it’s not a whole-script gestalt.

Michael Everson
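The Latin examples traded in this exchange can be inspected with a few lines of Python. This is only a sketch using the standard `unicodedata` module; the dictionary labels and the `folded` helper are invented for the example, and NFKC is used merely as a convenient stand-in for "the strongest folding normalization offers".

```python
import unicodedata

# Pairs mentioned in the thread that are encoded as separate characters.
pairs = {
    "g / script g": ("\u0067", "\u0261"),
    "w / wynn": ("\u0077", "\u01bf"),
}

def folded(ch):
    """Apply compatibility normalization (NFKC) to a single character."""
    return unicodedata.normalize("NFKC", ch)

# Even compatibility normalization keeps each pair apart.
still_distinct = {label: folded(a) != folded(b) for label, (a, b) in pairs.items()}

# Both glyph traditions for capital eng share one code point, reached from
# the small letter through ordinary case mapping.
small_eng = "\u014b"
capital_eng = small_eng.upper()

print(still_distinct)
print(unicodedata.name(capital_eng), hex(ord(capital_eng)))
```

The eng case shows the opposite situation from g and script g: one code point, with the glyph variation left entirely to fonts.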
Re: Standaridized variation sequences for the Desert alphabet?
> Well, in most cases, but not e.g. for names. Goethe is not spelled
> Göthe.

Have a look into `Grimmsches Wörterbuch' to see the opposite :-)

Werner
Re: Standaridized variation sequences for the Desert alphabet?
On 2017/03/26 11:24, Philippe Verdy wrote:
> That's a good point: any disunification requires showing examples of contrasting uses.

Fully agreed. We haven't yet heard of any contrasting uses for the letter shapes we are discussing.

> Now depending on individual publications, authors would use one character or the other according to their choice, and the encoding will respect it. If we need further unification for matching texts in the same language across periods of time or authors, collation (UCA) can provide help: this is already what it does in modern German with the digram "ae" and the letter "ä" which are orthographic variants not distinguished by the language but by authors' preference.

Well, in most cases, but not e.g. for names. Goethe is not spelled Göthe.

Regards, Martin.
VS: Standaridized variation sequences for the Desert alphabet?
I tend to agree with Martin, Philippe and others in questioning the disunification.

Sincerely, Erkki I. Kolehmainen

-Original Message-
From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Martin J. Dürst
Sent: 26 March 2017 11:12
To: verd...@wanadoo.fr; David Starner
Cc: Michael Everson; unicode Unicode Discussion
Subject: Re: Standaridized variation sequences for the Desert alphabet?

On 2017/03/26 11:24, Philippe Verdy wrote:
> That's a good point: any disunification requires showing examples of
> contrasting uses.

Fully agreed. We haven't yet heard of any contrasting uses for the letter shapes we are discussing.

> Now depending on individual publications, authors would use one
> character or the other according to their choice, and the encoding
> will respect it. If we need further unification for matching texts in
> the same language across periods of time or authors, collation (UCA)
> can provide help: this is already what it does in modern German with
> the digram "ae" and the letter "ä" which are orthographic variants not
> distinguished by the language but by authors' preference.

Well, in most cases, but not e.g. for names. Goethe is not spelled Göthe.

Regards, Martin.
Re: Standaridized variation sequences for the Desert alphabet?
On 25 Mar 2017, at 22:15, David Starner wrote:
>
> And I'd argue that a good theoretical model of the Latin script makes ä, ꞛ
> and aͤ the same character, distinguished only by the font.

Fortunately for the users of our standard, we don’t do this.

> This is complicated by combining characters mostly identified by glyph, and
> the fact that while ä and aͤ may be the same character across time, there are
> people wanting to distinguish them in the same text today, and in both cases
> the theoretical falls to the practical. In this case, there are no combining
> character issues and there's nobody needing to use the two forms in the same
> text.

I’m fairly sure that a person citing a medieval document using aͤ may very well also need to write this alongside Swedish or German using ä.

Michael Everson
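A small Python sketch makes the "we don’t do this" concrete. It assumes the middle character in David's list is U+A79B (Volapük ae) and treats aͤ as "a" followed by U+0364; the variable names are illustrative only.

```python
import unicodedata

a_diaeresis = "\u00e4"   # precomposed a with diaeresis
volapuk_ae = "\ua79b"    # assumed identity of the middle character in the list
a_small_e = "a\u0364"    # a + COMBINING LATIN SMALL LETTER E

forms = [a_diaeresis, volapuk_ae, a_small_e]

# Even compatibility normalization (NFKC) keeps all three spellings apart.
normalized = {unicodedata.normalize("NFKC", form) for form in forms}
all_distinct = len(normalized) == 3

# A citation can therefore carry both spellings side by side in plain text.
citation = "Clavieru\u0364bung / Klavier\u00fcbung"

print(all_distinct, citation)
```

This only shows what the encoding does today; it takes no position on what a theoretical model of the script ought to unify.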
Re: Standaridized variation sequences for the Desert alphabet?
On 3/26/2017 6:18 AM, Michael Everson wrote:
> On 26 Mar 2017, at 10:07, Erkki I Kolehmainen wrote:
>> I tend to agree with Martin, Philippe and others in questioning the disunification.
>
> You may, but you give no evidence or discussion about it, so...
>
> In any case it’s not a disunification. Some characters are encoded; they were used to write diphthongs in 1855. These characters were abandoned by 1859, and other characters were devised.

Calling them "characters" is pre-judging the issue, don't you think?

We know that these are different shapes, but that they stand for the same text elements.

A./

> The origin of all of the characters as ligatures of other characters isn’t questioned. The right thing to do is to add the missing characters, not to invalidate any font that uses the 1855 characters by claiming that the 1855 and 1859 characters are “the same”.
>
> Michael Everson
Re: Standaridized variation sequences for the Desert alphabet?
On Sun, Mar 26, 2017 at 6:12 AM Michael Everson wrote:
> On 25 Mar 2017, at 22:15, David Starner wrote:
> >
> > And I'd argue that a good theoretical model of the Latin script makes ä,
> > ꞛ and aͤ the same character, distinguished only by the font.
>
> Fortunately for the users of our standard, we don’t do this.

You've yet to come up with users to whom these Deseret letters are relevant.

> I’m fairly sure that a person citing a medieval document using aͤ may very
> well also need to write this alongside Swedish or German using ä.

I'm fairly sure that a person citing an early 20th century German document may well feel the need to cite it in Fraktur. In both cases, I believe that's going above and beyond the identity of the characters involved, but in your case, people do contrast the aͤ with ä, and the use case has been made. Show me the users who want to use these Deseret letters contrastingly.
Re: Diaeresis vs. umlaut (was: Re: Standaridized variation sequences for the Desert alphabet?)
On 2017/03/25 03:33, Doug Ewell wrote:
> Philippe Verdy wrote:
>> But Unicode just preferred to keep the roundtrip compatibility with earlier 8-bit encodings (including existing ISO 8859 and DIN standards) so that "ü" in German and French also have the same canonical decomposition even if the diacritic is a diaeresis in French and an umlaut in German, with different semantics and origins.
>
> Was this only about compatibility, or perhaps also that the two signs look identical and that disunifying them would have caused endless confusion and misuse among users?

I'm not sure to what extent this was explicitly discussed when Unicode was created. The fact that the first 256 code points are identical to those in ISO-8859-1 was used as a big selling point when Unicode was first introduced. It may well have been that for Unicode, there was no discussion at all in this area, because ISO-8859-1 was already so well established. And for ISO-8859-1, space was an important concern. Ideally, both Icelandic and Turkish (and the letters missing for French) would have been covered, but that wasn't possible. Disunifying diaeresis and umlaut would have been an unaffordable luxury.

The above reasons mask any inherent reasons for why diaeresis and umlaut would have been unified or not if the decision had been argued purely "on the merits". But having used both German and French, and e.g. looking at the situation in Switzerland, where it was important to be able to write both French and German on the same typewriter, I would definitely argue that disunifying them would have caused endless confusion and errors among users.

Also, it was argued a few mails ago that diaeresis and umlaut don't look exactly the same. I remember well that when Apple introduced its first laser printers, there were widespread complaints that the fonts (was it Helvetica, Times Roman, and Palatino?) unified away the traditional differences in the cuts of these typefaces for different languages.

So to quite some extent, in the relevant period (i.e. the 1970s/80s), the differences between diaeresis and umlaut may be due to design differences in the cuts for different languages (e.g. French and German). Nobody would have disunified some basic letters because they may have looked slightly different in cuts for different languages, and so people may also have been just fine with unifying diaeresis and umlaut. (German fonts e.g. may have contained a 'ë' for use e.g. with "Citroën", but the dots on that 'ë' will have been the same shape as 'ä', 'ö', and 'ü' umlauts for design consistency, and the other way round for French).

Regards, Martin.
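The two technical claims in this message (shared canonical decomposition, and the first 256 code points matching ISO-8859-1) can be checked with a short Python sketch. The sample words are taken from the thread; the `marks` helper is a name invented for this example.

```python
import unicodedata

german = "Klavier\u00fcbung"   # u with umlaut
french = "Citro\u00ebn"        # e with diaeresis

def marks(word):
    """Names of the combining marks left after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", word)
    return [unicodedata.name(ch) for ch in decomposed if unicodedata.combining(ch)]

# Umlaut and diaeresis decompose to the same combining character.
german_marks = marks(german)
french_marks = marks(french)

# Every ISO-8859-1 byte value decodes to the code point with the same number.
latin1_matches = all(bytes([b]).decode("latin-1") == chr(b) for b in range(256))

print(german_marks, french_marks, latin1_matches)
```

Both words yield the same single mark name, which is the unification the message describes.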
Re: Standaridized variation sequences for the Desert alphabet?
On 26 Mar 2017, at 10:07, Erkki I Kolehmainen wrote:
>
> I tend to agree with Martin, Philippe and others in questioning the
> disunification.

You may, but you give no evidence or discussion about it, so...

In any case it’s not a disunification. Some characters are encoded; they were used to write diphthongs in 1855. These characters were abandoned by 1859, and other characters were devised.

The origin of all of the characters as ligatures of other characters isn’t questioned. The right thing to do is to add the missing characters, not to invalidate any font that uses the 1855 characters by claiming that the 1855 and 1859 characters are “the same”.

Michael Everson
Re: Standaridized variation sequences for the Desert alphabet?
On 26 Mar 2017, at 14:32, David Starner wrote:

>>> And I'd argue that a good theoretical model of the Latin script makes ä, ꞛ
>>> and aͤ the same character, distinguished only by the font.
>>
>> Fortunately for the users of our standard, we don’t do this.
>
> You've yet to come up with users to whom these Deseret letters are relevant.

You might imagine it takes time to identify problems and address them.

>> I’m fairly sure that a person citing a medieval document using aͤ may very
>> well also need to write this alongside Swedish or German using ä.
>
> I'm fairly sure that a person citing an early 20th century German document
> may well feel the need to cite it in Fraktur.

Fraktur is a whole-font substitution (modulo the ligatures). This is not the same thing as an editor choosing w or ƿ. Imagine if we had unified those two. After all, they both represent the same sound, right? (Shudder.)

> In both cases, I believe that's going above and beyond the identity of the
> characters involved, but in your case, people do contrast the aͤ with ä, and
> the use case has been made. Show me the users who want to use these Deseret
> letters contrastingly.

Do try to be less dismissive. Firstly, *I* have published entire books in Deseret and so I myself have a legitimate interest. Secondly, I am in fact beginning discussions with relevant experts.

Michael Everson
Re: Standaridized variation sequences for the Desert alphabet?
> On 26 Mar 2017, at 16:45, Asmus Freytag wrote:
>
> The latter is patent nonsense, because ä and aͤ are even less related to each
> other than "i" and "j"; never mind the fact that their forms are both based
> on the letter "a". Encoding and font choice should be seen as separate.

He refers to the shape of the diacritical marks.

Michael Everson
Re: Standaridized variation sequences for the Desert alphabet?
On 3/26/2017 8:47 AM, Michael Everson wrote:
> On 26 Mar 2017, at 16:45, Asmus Freytag wrote:
>> The latter is patent nonsense, because ä and aͤ are even less related to each other than "i" and "j"; never mind the fact that their forms are both based on the letter "a". Encoding and font choice should be seen as separate.
>
> He refers to the shape of the diacritical marks.

I see the issue: the font selected on my end made the "e" look like an "o", which completely changed my understanding of what he tried to communicate.

A./

> Michael Everson
Re: Standaridized variation sequences for the Desert alphabet?
On 3/25/2017 3:15 PM, David Starner wrote:

> On Fri, Mar 24, 2017 at 9:17 AM Michael Everson wrote:
>
>> And we *can* distinguish i and j in that Latin text, because we have separate characters encoded for it. And we *have* encoded many other Latin ligature-based letters and sigla of various kinds for the representation of medieval European texts. Indeed, that’s just a stronger argument for distinguishing the ligature-based letters for Deseret, I think.
>
> And I'd argue that a good theoretical model of the Latin script makes ä, ꞛ and aͤ the same character, distinguished only by the font.

The latter is patent nonsense, because ä and aͤ are even less related to each other than "i" and "j"; never mind the fact that their forms are both based on the letter "a". Encoding and font choice should be seen as separate.

The priority in encoding has to be with allowing distinctions in modern texts, or distinctions that matter to modern users of historic writing systems. Beyond that, theoretical analysis of typographical evolution can give some interesting insight, but I would be in the camp that does not accord them a status as primary rationale for encoding decisions.

Thus, critical need for contrasting use of the glyph distinctions would have to be established before it makes sense to discuss this further.

I see no principled objection to having a font choice result in a noticeable or structural glyph variation for only a few elements of an alphabet. We have handle-a vs. bowl-a as well as hook-g vs. loop-g in Latin, and fonts routinely select one or the other. (It is only for usage outside normal text that the distinction between these forms matters.)

While the Deseret forms are motivated by their pronunciation, I'm not necessarily convinced that the distinction has any practical significance that is in any way different than similar differences in derivation (e.g. long-s-s or long-s-z for German Eszett).
In fact, it would seem that if a Deseret text was encoded in one of the two systems, changing to a different font would have the attractive property of preserving the content of the text (while not preserving the appearance). This, in a nutshell, is the criterion for making something a font difference vs. an encoding distinction.

A./

> PS: This is complicated by combining characters mostly identified by glyph, and the fact that while ä and aͤ may be the same character across time, there are people wanting to distinguish them in the same text today, and in both cases the theoretical falls to the practical. In this case, there are no combining character issues and there's nobody needing to use the two forms in the same text.

huh?
Re: Standaridized variation sequences for the Desert alphabet?
Asmus Freytag wrote,

> In the current case, you have the opposite,
> to wit, the text elements are unchanged, but
> you would like to add alternate code elements
> to represent what are, ultimately, the same
> text elements. That's not disunification, but
> dual encoding.

If spelling a word with an x+y string versus a z+y string represents two different spellings of the same word, then hand printing the same word with an x/y ligature versus a z/y ligature also represents two different spellings of the same word.

Best regards,

James Kass
Re: Standaridized variation sequences for the Desert alphabet?
On 2017/03/26 22:15, Michael Everson wrote:

> On 26 Mar 2017, at 09:12, Martin J. Dürst wrote:
>
>>> That's a good point: any disunification requires showing examples of
>>> contrasting uses.
>>
>> Fully agreed.
>
> The default position is NOT “everything is encoded unified until disunified”.

Nor is it "everything is encoded separately unless it's unified".

> The characters in question have different and undisputed origins, undisputed.

If you change that to the somewhat more neutral "the shapes in question have different and undisputed origins", then I'm with you. I actually have said as much (in different words) in an earlier post.

> We’ve encoded one pair; evidently this pair was deprecated and another pair was devised. The letters wynn and w are also used for the same thing. They too have different origins and are encoded separately. The letters yogh and ezh have different origins and are encoded separately. (These are not perfect analogies, but they are pertinent.)

Fine. I (and others) have also given quite a few analogies, none of them perfect, but most if not all of them pertinent.

>> We haven't yet heard of any contrasting uses for the letter shapes we are discussing.
>
> Contrasting use is NOT the only criterion we apply when establishing the characterhood of characters.

Sorry, but where did I say that it's the only criterion? I don't think it's the only criterion. On the other hand, I also don't think that historical origin is or should be the only criterion. Unfortunately, much of what you wrote gave me the impression that you may think that historical origin is the only criterion, or a criterion that trumps all others. If you don't think so, it would be good if you could confirm this. If you think so, it would be good to know why.

> Please try to remember that. (It’s a bit shocking to have to remind people of this.)

You don't have to remind me, at least. I have mentioned "usability for average users in average contexts" and "contrasting use" as criteria, and I have also in earlier mail acknowledged history as a (not the) criterion, and have mentioned legacy/roundtrip issues. I'm sure there are others.

Regards,

Martin.