Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

> 2011/8/31 Doug Ewell <[email protected]>:
>> Philippe Verdy wrote:
>>> the existing
>>> BCP 47 implementations, but that would limit the may-be future
>>> extension of ISO 639 to longer codes): ISO 639 could immediately say
>>> that it will never allocate any language code (of any length)
>>> starting by qa..qz.
>>
>> Not possible; 'qu' is already taken for Quechua. And not necessary:
>> 'qaa' through 'qtz' are reserved.
>
> I said using the prefixes starting by "qa..qt",
This was a direct quote from your post of Wed, 31 Aug 2011 21:58:25 +0200 (http://www.unicode.org/mail-arch/unicode-ml/y2011-m08/0456.html). But I'll assume it was a typo for "qa..qt"; you did mention this shorter range in other posts.

> these prefixes are not
> supposed to be used alone, there must be additional letters. so this
> does not apply to "qu" alone (yes, assigned to the Quechua
> macrolanguage, or isn't it a language subfamily ?).

From here on, I assume you are asking about BCP 47, not about any part of ISO 639.

BCP 47 uses the IANA Language Subtag Registry, which uses ISO 639 as a primary source, but adds constraints. As one example, when an alpha-2 code element (from 639-1) exists for a given language, a BCP 47 subtag exists only for that alpha-2 code element and not for the corresponding alpha-3 code element. So for French, you can only use 'fr' for French and not 'fre' (from 639-2/B) or 'fra' (from 639-2/T and 639-3).

BCP 47 language subtags do not have "prefixes." An ISO-based language subtag is either 2 or 3 letters long. There is no correlation, explicit or implicit, between a 2-letter language subtag and any 3-letter subtags that begin with those same two letters.

ISO 639-3 does classify Quechua as a macrolanguage, but that doesn't affect code allocation; macrolanguages are assigned code elements and subtags just like any other language. It is often useful to be able to specify, say, "Quechua" in a tag instead of one of the many specific varieties of Quechua, such as Chimborazo Highland Quichua or Yanahuanca Pasco Quechua; this in fact is why the concept of "macrolanguage" exists.

> But I admit that there's an additional caveat: BCP47 opens all codes
> with 5 to 8 characters to possible registration in the IANA registry.
> I have not checked if there were some registration of language tags
> starting by "qa..qt" in the IANA registry, but there's apaprently no
> policy defined to forbid such registration.
BCP 47 language subtags of 2 and 3 letters correspond to code elements assigned in some part of ISO 639.

ISO 639-1, as stated earlier, has assigned 'qu' to Quechua. This is reflected in the Registry. I don't have a copy of 639-1 and don't know if it reserves 'qa..qt' or any other range. The 639-2 Web site, which lists 639-1 allocations, doesn't mention any such reservation.

ISO 639-2 and 639-3 have defined 'qaa' through 'qtz' as "Reserved for local use," which is reflected in the Registry as "Private use." BCP 47 explains the use of these subtags as an alternative to the "x-" mechanism. One advantage, as you pointed out elsewhere, is that the resulting tag can be parsed like a normal tag; the region 'ZW' in "qaa-ZW" explicitly means Zimbabwe.

ISO 639-2 has assigned 'que' to Quechua and ISO 639-5 has assigned 'qwe' to "Quechuan (family)." ISO 639-3 has assigned more than 50 code elements in the non-private range beginning with 'q', many of which (but not all) are for varieties of Quechua. The Registry reflects all of these assignments except 'que' (because Quechua in BCP 47 is 'qu').

> And your "not necessary" comment does not apply here too: it just
> assigns the 3-letter codes for local use, not the longer codes which
> are only reserved for the 4-letter codes, but not assigned for private
> use (and there's also no provision given in ISO 649 to protect an
> encoding space for 5-letter codes or longer, as they are now usable
> for IANA registration).

4-letter language subtags are reserved, and will remain reserved unless and until BCP 47 is updated (via a new RFC) to make use of them. I don't care to speculate on their future allocation or use.

If and when language subtags of 5 to 8 letters are registered, there will be no restriction (as far as I can tell) on subtags beginning with 'q' or any other letter or sequence.

--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell
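
As a rough illustration of the length rules discussed above, here is a minimal Python sketch that sorts a primary language subtag into one of the categories mentioned in this message. It looks only at the shape of the subtag and does not consult the IANA Language Subtag Registry, so it cannot tell, for example, that 'que' is absent from the Registry because 'qu' is used instead. The function name and the category labels are hypothetical, chosen for this example only.

```python
import re


def classify_language_subtag(subtag: str) -> str:
    """Classify a primary language subtag by its shape alone.

    Hypothetical helper for illustration; it does NOT check the
    IANA Language Subtag Registry, only length and letter range.
    """
    s = subtag.lower()  # subtags are case-insensitive
    # This sketch only handles purely alphabetic subtags of 2 to 8 letters.
    if not re.fullmatch(r"[a-z]{2,8}", s):
        return "not a language subtag shape"
    if len(s) == 2:
        return "2-letter (ISO 639-1 shape)"
    if len(s) == 3:
        # 'qaa' through 'qtz' are reserved for local use ("Private use").
        if "qaa" <= s <= "qtz":
            return "3-letter private use"
        return "3-letter (ISO 639-2/3/5 shape)"
    if len(s) == 4:
        return "4-letter (reserved)"
    # 5 to 8 letters: available for registration, no restriction on 'q'.
    return "5-8 letters (registered shape)"


if __name__ == "__main__":
    for tag in ("qu", "qaa", "qtz", "que", "qwe", "qabc", "qaaaa"):
        print(tag, "->", classify_language_subtag(tag))
```

Note that 'qu' and 'qaa' fall into unrelated categories here even though they share a first letter, which matches the point that 2-letter subtags are not "prefixes" of 3-letter ones.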

