The Old Icelandic character ǫ (Unicode U+01ED: LATIN SMALL LETTER
O WITH OGONEK) is replaced in modern Icelandic by
ö.
Would it be proper therefore
to use U+00F6, the code point which Marco Cimarosti wants to use for
o with superscript e, also for o with ogonek?
In Icelandic they could be called the same character. Of course that only
works for Icelandic. We could not use this font for German or English or French,
unless we build some kind of recognition of language tags into it.
In French the circumflex accent indicates an earlier superscript s over the vowel. So should we allow combining
superscript s as a variant glyph
for the circumflex? But what of French text containing transliterated Arabic
names or Welsh names or transliterated classical Greek names which use a
circumflex which never had such a meaning? Again we would need language tagging.
The Old English and Middle English letter thorn (þ) is replaced in Modern English by the
combination th. Would it make sense
then for a modern font to represent U+00FE by a glyph showing th? Would it also make sense to replace
the kinds of glyphs used for U+204A TIRONIAN SIGN ET with
an ampersand? The meaning is exactly the same. But what if we want to use
this font for Icelandic or Old English? Do we again need an intelligent font
that understands language tagging?
Do we now have different flavors of Unicode, one for English, one for Icelandic,
one for French, one for German ... ? What of other languages?
A diaeresis used in the transliterated Classical names Peirithoüs and Menelaüs is not the same as a superscript
e, though in German (and some other
languages) the superscript e once written over a vowel has been replaced by a
diaeresis over the vowel. A font which rendered any diaeresis over
u or o or a as a superscript e would therefore be incorrect for the classical names
cited and also possibly for other foreign names. How would J.R.R. Tolkien's
name Eärendil be rendered by such
a font, where the diaeresis indicates separate pronunciation of the a, not an umlauted a?
Surely it makes more
sense for an author or advertising designer who wishes to use u with superscript e to use the Unicode method of u followed by a combining superscript e,
so that it will appear as desired
in any font, rather than to rely on a font change? Font changes should not change
the orthography or spelling of the original but should represent transparently
what the writer intended, and Unicode gives us a clear way to distinguish
combining superscript e from combining
diaeresis and combining superscript s
from combining circumflex.
Using the Unicode method makes far more sense than creating fonts that work
for particular languages only, provided no foreign words or names appear, or
which require language tagging.
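To make the distinction concrete, here is a minimal Python sketch using only the standard unicodedata module. The constant and function names are my own, chosen for illustration. It assumes that combining superscript e is U+0364 COMBINING LATIN SMALL LETTER E, and it shows that u followed by that mark is a different character sequence from ü, one that normalization does not merge with the diaeresis form.

```python
import unicodedata

# Three ways a writer might encode "u with something above it".
U_PRECOMPOSED = "\u00fc"      # ü as a single code point
U_PLUS_DIAERESIS = "u\u0308"  # u + COMBINING DIAERESIS
U_PLUS_SUPER_E = "u\u0364"    # u + COMBINING LATIN SMALL LETTER E


def code_points(text):
    """Return 'U+XXXX NAME' for every code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def nfc_equal(a, b):
    """True if the two strings are canonically equivalent."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    for text in (U_PRECOMPOSED, U_PLUS_DIAERESIS, U_PLUS_SUPER_E):
        print(code_points(text))
    # The two diaeresis spellings are equivalent to each other ...
    print(nfc_equal(U_PRECOMPOSED, U_PLUS_DIAERESIS))
    # ... but the superscript e spelling is equivalent to neither.
    print(nfc_equal(U_PRECOMPOSED, U_PLUS_SUPER_E))
```

The writer's choice is therefore carried in the text itself, and a font has no need to guess at it.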
In most European languages æ and
œ are
ligatures at one time commonly used in names and technical words of Latin
origin. Modern stylistic preference is to avoid these ligatures. However
French uses œ for a particular sound,
though the use of that ligature instead of oe was not considered important enough
for œ to be generally available on French
typewriters. Also both digraphs were separate letters in Old English, whence
the continued use of æ
in modern Danish and Icelandic. Should this modern convention be properly
indicated in an intelligent font by using unconnected ae and oe
for these digraphs except where language tagging indicates Danish, Icelandic,
or older Scandinavian use or Old English? Should we have to language tag
Encyclopædia Britannica to be sure
that æ appears in the name properly connected?
In fact, the stylistic conventions are indicated not by font changes or tagging
but by typing the appropriate characters.
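As a rough illustration, the following Python sketch (standard library only, with names of my own choosing) checks that æ and œ are left untouched by all four Unicode normalization forms. If that holds, nothing in ordinary text processing turns Encyclopædia into Encyclopaedia or back again; the writer decides by what is typed.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

WITH_AE_LETTER = "Encyclop\u00e6dia"  # typed with æ (U+00E6)
WITH_TWO_LETTERS = "Encyclopaedia"    # typed with separate a and e


def survives_normalization(text):
    """True if no normalization form alters the text."""
    return all(unicodedata.normalize(form, text) == text for form in FORMS)


if __name__ == "__main__":
    print(survives_normalization(WITH_AE_LETTER))
    print(survives_normalization("\u0153uvre"))  # French word with œ (U+0153)
    print(WITH_AE_LETTER == WITH_TWO_LETTERS)
```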
Should an English language font render ö
as oe, so that Göthe appears automatically in the more
normal English form Goethe?
Marco's desire to use a font to indicate combining superscript e instead of the way Unicode wants it
done seems prompted by the fact that most Unicode fonts do not currently
support the combining superscript characters, and he wishes a fallback to
normal diaeresis instead of to an undefined character indicator.
This is a reasonable wish.
In light of current Unicode support, the hack of identifying diaeresis with
combining superscript e makes
sense.
There has never been anything wrong with using a hack when required for a
task at hand. But hacks of this kind, if followed up widely in many fonts
in many languages, would produce a chaos of interpretations and numerous fonts
suited only for particular languages, filtering the text and not presenting
what is there, unless complex and otherwise unnecessary tagging is added.
Surely this is not what Unicode should be?
If a writer uses a long s in modern
writing, whether quoting text of an earlier era or purposely being archaic,
normal fonts should display a long s,
not a short s on the grounds that
long s happens not to be normally
used in modern writing in Antiqua fonts.
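The long s case can be checked the same way. In this small Python sketch (the sample word and the names are only illustrative), long s is its own code point, U+017F, and it survives canonical normalization; only a deliberate compatibility folding such as NFKC replaces it with an ordinary s. A font that silently showed a short s would in effect be applying that folding behind the writer's back.

```python
import unicodedata

LONG_S = "\u017f"            # ſ
SAMPLE = "Wach\u017ftube"    # illustrative word written with long s


def canonical(text):
    """Canonical composition: keeps the writer's long s."""
    return unicodedata.normalize("NFC", text)


def folded(text):
    """Compatibility composition: folds long s to ordinary s."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    print(unicodedata.name(LONG_S))
    print(canonical(SAMPLE) == SAMPLE)
    print(folded(SAMPLE))
```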
If a writer decides between using ü,
ue, or uͤ (u with combining superscript e), the font should leave the text alone.
If you have a newer version of the Code 2000 font on your machine which contains
the combining superscripts, then the superscript e appears correctly in newer browsers,
even if you are using a different font for the base character. A diacritic
from one font is placed over the base character of another.
I can understand Marco not wishing to bother viewers with the demand to load
a particular font and also knowing that dynamic downloading of a font will
not work with every system or browser or with user settings of browsers. So
use the hack for now. In two or three years, hopefully, it will not be necessary.
Generally a font should not be correcting the text.
The use of macron for diaeresis is a somewhat different matter. If a particular
style of German script uses a line for a diaeresis, then indeed the diaeresis
in that script has fallen together in appearance with the macron. This would
be especially so if a diaeresis was used over e and i (in foreign words and names). Representing
diaeresis by a glyph of macron form would be no more of a hack than would
be the use in an English script font of a p with an ascender, though presumably
an Icelander would identify that as the letter þ,
not p. (How þ itself should be presented in such
a script font is problematical!)
The main difficulty with identification of diaeresis and combining superscript
e is that the identification does
not work universally, even within German, if foreign names or words appear.
Even in German text, combining superscript e may not always correctly replace diaeresis.
Jim Allan

