Sorry to go on about this again, but I still feel puzzled, so bear with
me once more.
I'm not quite sure if Mark's answer solves my problem. I can see that the
case mappings and decompositions as defined in the charts are internally
contradiction-free, no problem so far. Only, there still seems to be a
mismatch between what the charts show and what users will probably expect to
see. Let me repeat: as far as I can gather, there are several different
typographical traditions, but roughly speaking there are two: In one
tradition, readers expect to see full-size, spacing glyphs for mute iotas
*both* in titlecase and in uppercase (usually a small iota glyph in
titlecase, a small or capital iota glyph in uppercase). In the other
tradition, readers expect to see smaller, diacritic-like glyphs (either
centered under, or near the right corner of, the base letter), again *both*
in titlecase and in uppercase. All the printing I've seen so far seems to
adhere either to the one major pattern or the other; they apparently don't
often get mixed. And as we've seen, many people who are used to the one
pattern aren't even aware that the other exists.
The Unicode charts, somewhat arbitrarily, seem to rule in favour of the
one tradition in titlecase and of the other tradition in uppercase. In
titlecase you get some sort of non-spacing diacritic, while in uppercase
you *must* use the full-size capital iota glyph. Users who want full-size
iota glyphs throughout will find it difficult to live with the decomposition
to U+0345 in titlecase, while users who want small diacritic glyphs
throughout will see no sense in the U+0399 (capital iota) in uppercase.
Without some *very* sophisticated rendering engine, neither group will be
able to get everything displayed to their taste. People will prefer to encode
their texts in ways that deviate from the standard, sacrificing case
equivalence rather than what each of them considers "correct" display.
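
To make the mismatch concrete, here is a minimal sketch. It assumes a
Python 3 interpreter whose str.upper() applies the full case mappings
(including SpecialCasing.txt) and whose unicodedata module follows the
published character database. The helper name describe() is my own and
purely illustrative.

```python
import unicodedata


def code_points(text):
    """Format a string as space-separated hexadecimal code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


def describe(ch):
    """Return (general category, canonical decomposition, full uppercase).

    Hypothetical helper for illustration only; the last two items are
    given as code point strings so the iota variant is unambiguous.
    """
    return (
        unicodedata.category(ch),
        code_points(unicodedata.normalize("NFD", ch)),
        code_points(ch.upper()),
    )


# U+1FBC is named GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI.
category, decomposed, uppercased = describe("\u1FBC")
print("category:     ", category)
print("decomposition:", decomposed)
print("uppercase:    ", uppercased)
```

On such an interpreter I would expect the decomposition to end in 0345
(the combining ypogegrammeni) and the uppercase form to end in 0399
(capital iota), which is exactly the split between the two traditions
described above.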
Lukas
----- Original Message -----
From: "Mark Davis" <[EMAIL PROTECTED]>
To: "Unicode List" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Sunday, November 19, 2000 8:18 PM
Subject: Re: Greek Prosgegrammeni
> I haven't had time to read this list recently, so here is a somewhat
> belated response.
>
> >But, even if you do so, we are left with a "wrong" canonical decomposition:
>
> >1FBC;GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI;Lt;0;L;0391 0345;;;;N;;;;1FB3;
>
> >According to James' statement (which is not totally supported by others,
> >anyway), the decomposition should be U+0391 U+0399 (GREEK CAPITAL LETTER
> >ALPHA + GREEK CAPITAL LETTER IOTA).
>
> Unfortunately, due to historical reasons the characters are misnamed. They
> should be named:
>
> GREEK TITLECASE LETTER ALPHA WITH PROSGEGRAMMENI, etc.
>
> However, we can't change the names. See
> http://www.unicode.org/unicode/standard/policies.html. We can add
> annotations.
>
> Notice that the general category is "Lt" = Titlecase letter, so despite
> the name the character is the titlecase version. The decomposition is correct
> for that titlecase letter. The full case mapping, as provided in Unidata +
> SpecialCasing is also for the titlecase mapping (see
> http://www.unicode.org/unicode/reports/tr21/ ) You will also find that the
> combining ypogegrammeni cases correctly.
>
> The uppercase mappings in Unidata alone are not sufficient for full case
> mapping, but are the best that can be done without changing string lengths.
> For the full mapping, you have to use SpecialCasing.txt. You can see what
> results on
> http://www.unicode.org/unicode/reports/tr21/charts/CaseChart4.html (you'll
> need a font for the Greek characters). Search for 1FBC. You will find that
> it is the titlecase form. Some fonts will not show the 1FBC with the right
> iota, but you can see from its position in the chart what it should be.
>
> > However, the precomposed characters containing the prosgegrammeni, e.g.
> > "GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI" (u+1FBC) still canonically
> > decompose to base letter + "COMBINING GREEK YPOGEGRAMMENI" (u+0345), as if
> > prosgegrammeni and ypogegrammeni were the same thing. This means that, even
>
> Those are the right decompositions (see
> http://www.unicode.org/unicode/reports/tr15/charts/NormalizationChart17.html
> ), however, because the characters are misnamed it leads to confusion.
>
> Mark
>