[OT] RE: Flag tags with U+1F3F3 and subtypes
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:

> If ever the country codes used in BCP47 become full (all pairs of letters used), just some time before this happens, we could see new prefixes added before a new range of codes. It is possible to use a 1-letter prefix for new country/territory code extensions, but with some maintenance of BCP47 parsing rules (notably the letter used should not be reordered with other singleton prefixes)

This would be a major revision to BCP 47; it would have nothing to do with reordering, and it would not in any case involve 1-letter prefixes, which already have a different meaning. And the time frame we are talking about is reminiscent of Ken's estimate of when 17 planes will no longer be enough for Unicode.

> But I feel it will first be simpler to assign a special 2-letter code like C1- followed by a new series of 2-letter country codes

We actually thought about this stuff over in LTRU. Really.

I'm not the least bit concerned about the DNS. Five years from now they could be assigning TLDs consisting entirely of emoji. This is no longer relevant to flag tags or anything else Unicode.

-- Doug Ewell | http://ewellic.org | Thornton, CO
Re: [OT] RE: Flag tags with U+1F3F3 and subtypes
2015-05-18 23:55 GMT+02:00 Doug Ewell d...@ewellic.org:

> Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
>
>> If ever the country codes used in BCP47 become full (all pairs of letters used), just some time before this happens, we could see new prefixes added before a new range of codes. It is possible to use a 1-letter prefix for new country/territory code extensions, but with some maintenance of BCP47 parsing rules (notably the letter used should not be reordered with other singleton prefixes)
>
> This would be a major revision to BCP 47, it would have nothing to do with reordering,

It would have to do with reordering, because all subtags after the primary language subtag in BCP47 are optional, and you can distinguish them only by their length *or* by the role assigned to specific singletons. There is already the x singleton exception (which is ordered at the end); the other singletons are currently described as using a canonical order, but that order is used only for encoding variants unrelated to region subtags or even to the languages.

Very few singletons are in fact used. The singleton subtags occurring at the start of the tag are also treated separately from the others. This could also be used to support new syntaxes for BCP47 tags, but for now we just have i-, deprecated but still valid, and x- for private use. For all other letters there is no parsing defined for now; their syntax is unknown and they are not interchangeable without a standard, so they are used only for private use. Another constraint comes from the length limit of subtags: the first subtag is either a special singleton or a primary language code using 2 or 3 letters for now. Some BCP47 tags use an empty first subtag, i.e. the tag starts with a hyphen. Double hyphens could be used as extensions to change the parsing rules locally, and possibly return to the next logical subtag, and could be used to encode international organizations without needing a formal exceptional reservation in ISO 3166-1. For example, *-EU could have been encoded as --O-EU, and we could have the same system for NATO, EEA, EFTA... There is still ample space for extensions of parsing rules in BCP47, but not in ISO 3166.

ISO 3166 also encodes some 4-letter codes, but they are not used in BCP47 (so there is no confusion with 4-letter script codes).
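As a rough illustration of distinguishing subtags by length, here is a minimal Python sketch. It assumes the simplified subtag shapes from the RFC 5646 grammar (singleton: 1 character; region: 2 letters or 3 digits; script: 4 letters; variant: 5 to 8 alphanumerics, or a digit followed by 3 alphanumerics). The name classify_subtag is hypothetical, and the sketch ignores the primary language subtag, extlang subtags, ordering rules, and the registry lookup that a real validator would need.

```python
import re

# Simplified subtag shapes, following the RFC 5646 grammar.
# classify_subtag is a hypothetical helper, not part of any library.
_SHAPES = [
    ("singleton", re.compile(r"[0-9A-Za-z]")),
    ("region", re.compile(r"[A-Za-z]{2}|[0-9]{3}")),
    ("script", re.compile(r"[A-Za-z]{4}")),
    ("variant", re.compile(r"[0-9A-Za-z]{5,8}|[0-9][0-9A-Za-z]{3}")),
]


def classify_subtag(subtag: str) -> str:
    """Classify a non-initial subtag by length and character class only."""
    for name, pattern in _SHAPES:
        if pattern.fullmatch(subtag):
            return name
    # Empty strings and 3-letter extlang subtags land here in this sketch.
    return "unknown"


if __name__ == "__main__":
    # Skip the primary language subtag and classify the rest.
    for subtag in "sr-Latn-RS".split("-")[1:]:
        print(subtag, classify_subtag(subtag))
```

For sr-Latn-RS, the two non-initial subtags classify as script and region respectively. An empty string matches none of the shapes and is reported as unknown.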
[OT] RE: Flag tags with U+1F3F3 and subtypes
This is why I knew I would regret it. Clearing up some errors here. No more posts from me on this non-Unicode topic after this one.

Philippe Verdy verdy underscore p at wanadoo dot fr wrote:

>> This would be a major revision to BCP 47, it would have nothing to do with reordering,
>
> It would have to do with reordering, because all subtags after the primary language subtag in BCP47 are optional, and you can distinguish them only by their length *or* by the role assigned to specific singletons: there's already the x singleton exception (that is ordered at end), but other singletons are currently described to use a canonical order but it is used only for encoding variants unrelated to region subtags or even to the languages.

All non-initial singletons introduce an extension, except for 'x', which introduces a private-use sequence and which must be last. Even if an extension were defined to hold top-level region information, WHICH WILL NEVER HAPPEN, it would not matter whether that extension appeared before or after other extensions, because it would be an extension and not a region subtag.

> but for now we just have i-, deprecated but still valid,

i- is not deprecated.

> for all other letters there's no parsing defined for now, their syntax is unknown and they are not interchangeable without a standard, so they are used only for private use

Extension 't' was defined in 2011 and 'u' in 2010. They have well-defined syntax, specified in RFC 6497 and RFC 6067 respectively. Undefined singletons may not be used for private use.

> some BCP47 use an empty first subtag, i.e. the tag starts by an hyphen;

Absolutely, utterly false.

-- Doug Ewell | http://ewellic.org | Thornton, CO
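The singleton rules described above can be sketched in Python. This is a minimal illustration under a simplified reading of RFC 5646, not a validator: split_extensions is a hypothetical helper, it does not check the base subtags against the registry, and it does not handle grandfathered tags such as i-klingon. It lowercases the tag for comparison, since language tags are case-insensitive.

```python
def split_extensions(tag: str):
    """Split a language tag into (base, extensions, private_use).

    Hypothetical helper, simplified from RFC 5646: a non-initial singleton
    starts an extension, and 'x' starts a private-use sequence that runs
    to the end of the tag.
    """
    subtags = tag.lower().split("-")
    if "" in subtags:
        # Covers leading, trailing, and doubled hyphens.
        raise ValueError(f"empty subtag in {tag!r}")

    base = []
    extensions = {}
    private_use = []
    target = base  # list that currently receives subtags

    for index, subtag in enumerate(subtags):
        if target is private_use:
            # Everything after 'x' is private use, including 1-char subtags.
            private_use.append(subtag)
        elif subtag == "x":
            target = private_use
        elif len(subtag) == 1 and index > 0:
            if subtag in extensions:
                raise ValueError(f"duplicate singleton {subtag!r} in {tag!r}")
            extensions[subtag] = []
            target = extensions[subtag]
        else:
            target.append(subtag)

    if any(not parts for parts in extensions.values()):
        raise ValueError(f"extension without subtags in {tag!r}")
    if target is private_use and not private_use:
        raise ValueError(f"'x' without subtags in {tag!r}")
    return base, extensions, private_use


if __name__ == "__main__":
    print(split_extensions("en-US-u-ca-gregory-t-hi-x-foo"))
```

With the tag en-US-u-ca-gregory-t-hi-x-foo, the sketch returns the base subtags en and us, the extensions u (ca, gregory) and t (hi), and the private-use subtag foo. The extensions are keyed by singleton, so their relative order in the tag does not affect the result. Tags such as -en or --O-EU raise ValueError because they contain empty subtags.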
[OT] Re: Flag tags
Philippe Verdy wrote:

> Also there should exist somewhere a registry of known flag codes. There are well-known vexillological sites that list large collections of flags, but so far they have not developed a standard (ASCII-based) codification. [...] But this registry does not have to be defined and maintained by the Unicode Consortium or by ISO, unless they have the desire to develop it.

This doesn't seem at all within the scope of Unicode, though perhaps CLDR would want it.

-- Doug Ewell | Thornton, Colorado, USA http://www.ewellic.org | @DougEwell