https://bugs.documentfoundation.org/show_bug.cgi?id=166037

--- Comment #10 from [email protected] ---
(In reply to Khaled Hosny from comment #9)
> I don’t understand the relationship between various language tag systems,
> but OpenType seems to map both ELL and PGR to ISO 639 ell
> https://learn.microsoft.com/en-us/typography/opentype/spec/languagetags.
> 
> Similarly, HarfBuzz does not know grc language tag and converts it to
> uppercase GRC tag unchanged, and converts -polyton subtag to PGR regardless
> of the language tag (even with no language tag at all).
Since that Microsoft page associates both 'ELL ' [Greek] and 'PGR ' [Polytonic
Greek] with ISO 639-2/ISO 639-3 `ell` [Modern Greek], 'ELL ' effectively means
[Modern Greek] and 'PGR ' effectively means [Polytonic Modern Greek]. That
implies there is no OpenType language system tag for Ancient Greek at all,
because Ancient Greek has its own ISO 639-2/ISO 639-3 code (`grc`). If so, no
OpenType language system tag corresponds to the LibreOffice language “Greek,
Ancient”. Did the LibreOffice developers intend to support a language that has
no matching OpenType language system tag?
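
To make the gap concrete, here is a minimal Python sketch of a reverse lookup.
The table holds only the two entries discussed above, as that Microsoft page
lists them; the names `OPENTYPE_TO_ISO639` and `opentype_tags_for` are
hypothetical and not part of HarfBuzz, LibreOffice, or any other library.

```python
# Only the two registry entries discussed in this comment; the real
# OpenType language system tag registry is much larger.
OPENTYPE_TO_ISO639 = {
    "ELL ": ["ell"],  # Greek
    "PGR ": ["ell"],  # Polytonic Greek
}


def opentype_tags_for(iso_code):
    """Return the OpenType language system tags mapped to an ISO 639 code."""
    return sorted(
        tag for tag, codes in OPENTYPE_TO_ISO639.items() if iso_code in codes
    )


print(opentype_tags_for("ell"))  # both Greek tags
print(opentype_tags_for("grc"))  # empty list: nothing maps to Ancient Greek
```

Under these assumptions, `ell` resolves to two tags while `grc` resolves to
none.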

The Library of Congress is the registration authority for ISO 639-2, and
https://www.loc.gov/standards/iso639-2/php/code_list.php shows the following:

• Modern Greek is associated with ISO 639-1 `el`, [legacy] bibliographic ISO
639-2 `gre`, and terminological ISO 639-2 `ell` (twenty ISO 639-2 languages
have legacy bibliographic codes);

• Ancient Greek is associated with ISO 639-2 `grc`.

SIL International is the registration authority for ISO 639-3, and
https://iso639-3.sil.org/code_tables/639/data/g?page=1 shows that the ISO 639-2
distinctions between Modern Greek and Ancient Greek have been preserved in ISO
639-3. (ISO 639-3 doesn’t use the legacy bibliographic codes of ISO 639-2.)

The IETF language tags of BCP 47 come from RFC 5646. RFC 5646 in turn defines
its primary language subtag as the “shortest ISO 639 code”, which is why it
uses the two-letter ISO 639-1 code `el` for Modern Greek. By the same rule,
RFC 5646 (and thus BCP 47) uses ISO 639-2/ISO 639-3 `grc` for Ancient Greek,
since there is no two-letter ISO 639-1 code for Ancient Greek.
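
The “shortest ISO 639 code” rule can be sketched in Python as follows. The
table covers only the two languages discussed here, using the codes from the
lists cited above; the names `ISO_639` and `bcp47_primary_subtag` are
hypothetical, and a real implementation would consult the IANA Language
Subtag Registry rather than a hand-written table.

```python
# Codes for the two languages discussed above. None marks a missing code.
# For Modern Greek the terminological 639-2 code is used, not legacy `gre`.
ISO_639 = {
    "Modern Greek": {"639-1": "el", "639-2": "ell", "639-3": "ell"},
    "Ancient Greek": {"639-1": None, "639-2": "grc", "639-3": "grc"},
}


def bcp47_primary_subtag(language):
    """Pick the shortest ISO 639 code: two-letter if one exists, else three-letter."""
    codes = ISO_639[language]
    return codes["639-1"] or codes["639-2"] or codes["639-3"]


print(bcp47_primary_subtag("Modern Greek"))
print(bcp47_primary_subtag("Ancient Greek"))
```

With this table, Modern Greek yields `el` and Ancient Greek falls through to
`grc`.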

If HarfBuzz uses BCP 47 language tags to match languages, then HarfBuzz needs
to be updated to recognize the ISO 639-2/ISO 639-3 `grc` code, so that (at
minimum) the LibreOffice language “Greek, Ancient” can be correctly associated
with BCP 47 `grc`.
