Hello Michael, others,

On 2017/03/27 21:07, Michael Everson wrote:
On 27 Mar 2017, at 06:42, Martin J. Dürst <due...@it.aoyama.ac.jp> wrote:

The characters in question have different and undisputed origins, undisputed.

If you change that to the somewhat more neutral "the shapes in question have 
different and undisputed origins", then I'm with you. I actually have said as much 
(in different words) in an earlier post.

And what would the value of this be? Why should I (who have been doing this for 
two decades) not be able to use the word “character” when I believe it correct? 
Sometimes you people who have been here for a long time behave as though we had 
no precedent, as though every time a character were proposed for encoding it’s 
as though nothing had ever been encoded before.

I didn't say that you have to change words. I just said that I could agree to a slightly differently worded phrase.

And as for precedent, the fact that we have encoded a lot of characters in Unicode doesn't mean that we can encode more characters without checking each and every single case very carefully, as we are doing in this discussion.


The sharp s analogy wasn’t useful because, whether it is ſs or ſz, users can’t 
tell and don’t care.

Sorry, but that was exactly the point of this analogy. As to "can't tell", it's easy to ask somebody to look at an actual ß letter and say whether the right part looks more like an s or like a z. On the other hand, users of Deseret may or may not ignore the difference between the 1855 and 1859 shapes when they read. Of course they will easily see different shapes, but what's important isn't the shapes, it's what they associate them with. If for them it's just two shapes for one and the same 40th letter of the Deseret alphabet, then that is a strong argument against encoding them separately, even if the shapes look really different.


No Fraktur fonts, for instance, offer a shape for U+00DF that looks like an ſs. 
And what Antiqua fonts do, well, you get this:

https://en.wikipedia.org/wiki/%C3%9F#/media/File:Sz_modern.svg

Yes. And we are just starting to collect evidence for Deseret fonts.


And there’s nothing unrecognizable about the ſɜ (< ſꝫ (= ſz)) ligature there.

Well, not to somebody used to it. But non-German users quite often use a Greek β where they should use a ß, so it's no surprise people don't distinguish the ſs and ſz derived glyphs.
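
For illustration, a minimal Python sketch using the standard unicodedata module shows that, however similar they may look to some readers, ß and β are already separately encoded characters, whereas the ſs and ſz derived shapes both sit on the single code point U+00DF. The sample list and the helper name describe are assumptions made for this sketch only.

```python
import unicodedata

# Characters mentioned in this thread; the selection is illustrative only.
samples = ["ß", "β", "ſ"]


def describe(ch):
    """Return the code point label and Unicode character name for one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


for ch in samples:
    code_point, name = describe(ch)
    print(code_point, name)
```

Whether a font draws U+00DF from ſs or from ſz is invisible at this level; only the glyph changes, not the character.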


The situation in Deseret is different.

The graphic difference is definitely bigger, so to an outsider it's quite impossible to match up the pairs of shapes. But that in no way means that these have to be seen as different characters (rather than just different glyphs) by insiders (actual users).

To use another analogy, many people these days (me included) would have difficulties identifying Fraktur letters, in particular if they show up just as individual letters. The same goes for many fantasy fonts, and for people not very familiar with the Latin script.


Underlying ligature difference is indicative of character identity. 
Particularly when two resulting ligatures are SO different from one another as 
to be unrecognizable. And that is the case with EW on the left and OI on the 
right here:
https://en.wikipedia.org/wiki/Deseret_alphabet#/media/File:Deseret_glyphs_ew_and_oi_transformation_from_1855_to_1859.svg

The lower two letterforms are in no way “glyph variants” of the upper two 
letterforms. Apart from the stroke of the SHORT I 𐐆 they share nothing in 
common — because they come from different sources and are therefore different 
characters.

The range of what can be a glyph variant is quite wide across scripts and font styles. The mere fact that the shapes differ widely, or that the origins are different, isn't conclusive.


Character origin is intimately related to character identity.

In most cases, yes. But it's not a foregone conclusion.


I don’t think that ANY user of Deseret is all that “average”. Certainly some 
users of Deseret are experts interested in the script origin, dating, 
variation, and so on — just as we have medievalists who do the same kind of 
work. I’m about to publish a volume full of characters from Latin Extended-D. 
My work would have been impossible had we not encoded those characters.

No, your work wouldn't be impossible. It might be quite a bit more difficult, but not impossible. I have written papers about Han ideographs and Japanese text processing where I had to create my own fonts (8-bit, with mostly random assignments of characters because these were one-off jobs), or fake things with inline bitmap images (trying to get information on the final printer resolution and how many black pixels wide a stem or crossbar would have to be to avoid dropouts, and not being very successful).

I have heard the argument that some character variant is needed because of research, history,... quite a few times. If a character has indeed been historically used in a contrasting way, this is definitely a good argument for encoding. But if a character just looked somewhat different a few (hundreds of) years ago, that doesn't make such a good argument. Otherwise, somebody may want to propose new codepoints for Bodoni and Helvetica,...


Regards,    Martin.
