On 12/14/25 5:44 PM, Asmus Freytag via Unicode wrote:

On 12/14/2025 10:47 AM, Phil Smith III via Unicode wrote:

Well, I’m sorta “asking for a friend” – a coworker who is deep in the weeds of working with something Unicode-related. I’m blaming him for having told me that :)


This actually deserves a deeper answer, or a more "bird's-eye" one, if you want. Read to the end.

The way you asked the question seems to hint that you and your friend conflate the concepts of "combining mark" and "diacritic". That would not be surprising if you are mainly familiar with European scripts and languages, because in that case the equivalence kind of applies.

Yes.  This is crucial.  You (Phil) are writing like "sheez, so there's e and there's e-with-an-acute, we might as well just treat them like separate letters."  And that may make sense for languages where "combining characters" means two or three diacritics that can live on five or six letters.  Maybe it does make sense to consider those combinations distinct letters (indeed, some of the languages in question do just that).

But some combining characters are more rightly perceived as things separate from the letters which are written in the same space (and have historically always been considered so).  The most obvious examples are Hebrew and Arabic vowel points.  Does it really make sense to consider בְ and בֶ and בְּ and all the other combinations as separate, distinct things, when they clearly contain separate units, each of which has its own consistent character?

Throw in the Hebrew "accents" (cantillation marks) and you're talking about an enormous combinatorial explosion at the *cost* of simplicity and consistency, not an improvement to it.  Ditto Indic vowel marks and a jillion other abjads and abugidas.  If anything, there's a better case to be made that the precomposed letters were maybe a wrong move.
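
For concreteness, here's a minimal sketch of how that plays out in normalization, assuming Python 3 and its standard unicodedata module (my illustration, not anything Asmus wrote): the Latin letter-plus-accent folds into a single precomposed code point under NFC, while the Hebrew letter-plus-point stays as separate units because no precomposed form exists, and each extra mark is just one more code point.

    import unicodedata

    # Latin: a precomposed letter exists, so NFC folds 'e' + COMBINING ACUTE ACCENT
    # into the single code point U+00E9.
    e_acute = "e\u0301"
    print([hex(ord(c)) for c in unicodedata.normalize("NFC", e_acute)])    # ['0xe9']

    # Hebrew: BET + POINT SHEVA has no precomposed form, so NFC leaves the two
    # units as they are -- the mark keeps its own identity.
    bet_sheva = "\u05d1\u05b0"
    print([hex(ord(c)) for c in unicodedata.normalize("NFC", bet_sheva)])  # ['0x5d1', '0x5b0']

    # Stacking another mark (here, a dagesh) just appends one more combining
    # code point; no new "letter" is needed for each combination.
    bet_dagesh_sheva = "\u05d1\u05bc\u05b0"
    for c in bet_dagesh_sheva:
        print(f"U+{ord(c):04X} {unicodedata.name(c)}")

Run it and you get one code point for the Latin case and a short list of independently named units for the Hebrew one, which is exactly the "separate things sharing a space" point above.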

(TL;DR: what Asmus said.)

~mark
