Otto Stolz wrote:
As has been said before, in this thread (by Jörg Knappen, IIRC), the
little bow in the -burg abbreviation stems from the "u" stripped
together with the "r".

In German handwriting it used to be common to place a mark above the letter 'u' to distinguish it from 'n'. When I first saw the sample, I thought it was that mark, retained to indicate that a 'u' had been there. The problem with that interpretation is that such a mark (in handwriting it could look like a macron or a breve) is not used with the kind of font style seen in the example. Therefore, it would have to be a holdover from the time when such abbreviations were written in the old style.

Hence, I deem this character quite different from a breve (be it
"semi-cyrillic" or otherwise) and quite akin to a "u" superscript
with some special kerning applied: It's just so that the bow of
the "u" fits nicely above the gap between the bowls of "b" and
"g", respectively.

I would agree, but even if the mark really was a superscript 'u' in origin, it is no longer a superscript 'u': it is simply a mark with a downward-pointing curve and a pronounced left/right asymmetry.

In HTML, e. g., I could write "Herrenb<sup>u</sup>g." -- but then
I'd get the kerning wrong: Virtually all browsers would assign
some space to the superscript "u", resulting in an ugly gap be-
tween "b" and "g". So I am beginning to ask myself: Should you
rather look for a kerning directive in higher-level protocols,
such as HTML?

No - this would be incorrect. It is not a superscript u.

If you, however, decide that this abbreviation should be encoded
even in plain text, then there are three possibilities:

- Encode a character GERMAN MISSING U INDICATOR (or some such)
  to represent that little "u", and explain that it should take
  no extra space in the x-height region. (Precedents: U+00AA,
  U+00BA, as specially styled characters to be used in a very
  special context only, and in rather few languages)

- Encode a character GERMAN BURG ABBREVIATION (or some such),
  and show a representative glyph for it (as in the scan from
  Diercke's Atlas). (Precedents: U+01CA, U+20A7, U+213B, or
  U+0A74)


- Encode a character COMBINING DOUBLE CONTRACTED U, with a
  representative glyph as in the source given, but applied across
  two characters. (Precedents: U+035D, U+035E, etc.) The name
  could be improved.


- Devise a general method to place, in plain text, a diacritic
  between two base characters, and then define a suitable
  diacritic for this special case.

The existing precedent indicates that UTC has decided to encode such double diacritics explicitly, rather than create a productive mechanism that would allow existing combining characters to be positioned across two base characters.

All in all, I think that the UTC position on that is probably
wise, even though it means that we'll have to continue considering
newly discovered examples like this one. Their number
is probably quite small in the end anyway.
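
For anyone who wants to check the precedent, here is a small Python sketch that lists the double diacritics already encoded next to U+035D and U+035E. It relies only on the standard unicodedata module. The range scanned is my own choice for illustration, and what gets printed depends on the Unicode version shipped with the Python in use. A canonical combining class of 234 marks a "double above" diacritic, 233 a "double below" one.

```python
import unicodedata

# Scan the neighbourhood of U+035D in the Combining Diacritical Marks
# block. The range is an assumption of this sketch, not an official list.
for cp in range(0x035C, 0x0363):
    ch = chr(cp)
    # Fall back to a placeholder if this Python's data lacks the character.
    name = unicodedata.name(ch, "<not in this Python's Unicode data>")
    ccc = unicodedata.combining(ch)  # canonical combining class
    print(f"U+{cp:04X}  ccc={ccc:3d}  {name}")

double_breve = "\u035D"
double_macron = "\u035E"
```

Each of these is a separately encoded character, which is the explicit approach described above, as opposed to a general positioning mechanism.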

But a breve, spanning both "b" and "g": No, this does definitely
not fit the bill.

By that you mean using U+035D COMBINING DOUBLE BREVE. At the moment, that would be the only character in Unicode that is even remotely similar. However, I agree with you that, despite an apparent visual resemblance between the shape of a breve and the shape of this mark indicating a missing 'u', a unification would be ill-conceived, unless the assumption that it represents a 'u' can be proven wrong.
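
To make the fallback concrete, here is a minimal Python sketch of what a plain-text encoding with U+035D would look like. The helper name abbreviate_burg is hypothetical and exists only for this illustration. The sketch assumes the usual convention for double diacritics: the mark is placed after the first of the two base letters it spans.

```python
import unicodedata

DOUBLE_BREVE = "\u035D"  # COMBINING DOUBLE BREVE

def abbreviate_burg(name: str) -> str:
    """Hypothetical helper: turn a trailing 'burg' into 'b' + U+035D + 'g.'."""
    if not name.endswith("burg"):
        return name
    # The double diacritic follows the first base letter ('b') and is
    # meant to be rendered across 'b' and the following 'g'.
    return name[:-4] + "b" + DOUBLE_BREVE + "g."

s = abbreviate_burg("Herrenburg")
for ch in s[-4:]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Whether the result actually displays as one bow spanning both letters depends on the font and the rendering engine. The encoding itself says "breve", not "missing u", which is exactly the objection raised above.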

Cynically, UTC could simply do nothing, in which case, by natural
gravitation, 035D as the only available fallback would probably
be used anyway. It wouldn't be the first time that limitations in
technology or character encoding had an effect on orthography...

A./





