https://bugs.documentfoundation.org/show_bug.cgi?id=166037
--- Comment #10 from [email protected] ---
(In reply to Khaled Hosny from comment #9)
> I don’t understand the relationship between various language tag systems,
> but OpenType seems to map both ELL and PGR to ISO 639 ell
> https://learn.microsoft.com/en-us/typography/opentype/spec/languagetags.
>
> Similarly, HarfBuzz does not know grc language tag and converts it to
> uppercase GRC tag unchanged, and converts -polyton subtag to PGR regardless
> of the language tag (even with no language tag at all).

Since that Microsoft page associates both 'ELL ' [Greek] and 'PGR ' [Polytonic Greek] with ISO 639-2/ISO 639-3 `ell` [Modern Greek], that means that 'ELL ' really means [Modern Greek], and 'PGR ' really means [Polytonic Modern Greek], which implies that there is no OpenType language system tag at all for Ancient Greek, since it has a separate ISO 639-2/ISO 639-3 code.

If that is the case, then there would be no matching OpenType language system tag to correspond to the LibreOffice language “Greek, Ancient”. Was the intention of the LibreOffice developers to include support for a language with no matching OpenType language system tag?

The Library of Congress is the registration authority for ISO 639-2, and https://www.loc.gov/standards/iso639-2/php/code_list.php shows the following:

• Modern Greek is associated with ISO 639-1 `el`, [legacy] bibliographic ISO 639-2 `gre`, and terminological ISO 639-2 `ell` (twenty ISO 639-2 languages have legacy bibliographic codes);
• Ancient Greek is associated with ISO 639-2 `grc`.

SIL International is the registration authority for ISO 639-3, and https://iso639-3.sil.org/code_tables/639/data/g?page=1 shows that the ISO 639-2 distinctions between Modern Greek and Ancient Greek have been preserved in ISO 639-3. (ISO 639-3 doesn’t use the legacy bibliographic codes of ISO 639-2.)

The IETF language tags of BCP 47 come from RFC 5646. RFC 5646 in turn defines its language tags as the “shortest ISO 639 code”, which is why it uses the two-letter ISO 639-1 code `el` for Modern Greek. Accordingly, RFC 5646 (and thus BCP 47) uses ISO 639-2/ISO 639-3 `grc` for Ancient Greek, since there is no two-letter ISO 639-1 code for Ancient Greek.

If HarfBuzz uses BCP 47 language tags to match languages, then HarfBuzz needs to be updated to recognize the ISO 639-2/ISO 639-3 `grc` code, so that (at minimum) the LibreOffice language “Greek, Ancient” can be correctly associated with BCP 47 `grc`.
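
As a rough illustration of the “shortest ISO 639 code” rule described above, here is a minimal Python sketch. The table holds only the two Greek entries cited in this comment, and the names `ISO_639_CODES` and `bcp47_primary_subtag` are hypothetical, made up for this example; this is not how HarfBuzz or LibreOffice implement language matching.

```python
from typing import Optional

# Illustrative data only: the codes quoted in this comment.
# Each value is (ISO 639-1 code or None, ISO 639-2/T and ISO 639-3 code).
ISO_639_CODES: dict[str, tuple[Optional[str], str]] = {
    "Modern Greek": ("el", "ell"),
    "Ancient Greek": (None, "grc"),  # no two-letter ISO 639-1 code exists
}


def bcp47_primary_subtag(language: str) -> str:
    """Pick the shortest available ISO 639 code for a language.

    Hypothetical helper: prefers the two-letter ISO 639-1 code and
    falls back to the three-letter code when no two-letter code exists.
    """
    two_letter, three_letter = ISO_639_CODES[language]
    return two_letter if two_letter is not None else three_letter


if __name__ == "__main__":
    for name in ISO_639_CODES:
        print(name, "->", bcp47_primary_subtag(name))
```

Under this rule Modern Greek resolves to `el` and Ancient Greek to `grc`, which is the tag a shaping library would need to recognize for “Greek, Ancient”.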
