I just filed the bug via the CLDR contact form.

2017-03-28 12:49 GMT+02:00 Mark Davis ☕️ <[email protected]>:
> Thanks. Probably best as:
>
> unicode_locale_id = unicode_language_id
>     ( transformed_extensions unicode_locale_extensions?
>     | unicode_locale_extensions transformed_extensions? )?
>     ;
>
> even clearer would be two steps:
>
> unicode_locale_id = unicode_language_id extensions? ;
>
> extensions = transformed_extensions unicode_locale_extensions?
>            | unicode_locale_extensions transformed_extensions? ;
>
> Could you file a CLDR ticket on this?
>
> Mark
>
> On Tue, Mar 28, 2017 at 12:36 PM, Philippe Verdy <[email protected]> wrote:
>
>> I note this in TR35
>> *3.2 Unicode Locale Identifier
>> <http://unicode.org/reports/tr35/index.html#Unicode_locale_identifier>*
>>
>> EBNF:
>>
>> unicode_locale_id
>> <http://unicode.org/reports/tr35/index.html#unicode_locale_id> =
>>     unicode_language_id
>>     (transformed_extensions unicode_locale_extensions?
>>     | unicode_locale_extensions? transformed_extensions?) ;
>>
>> ABNF:
>>
>>     = unicode_language_id
>>     ([trasformed_extensions [unicode_locale_extensions]]
>>     / [unicode_locale_extensions [transformed_extensions]])
>>
>> * first, there's a typo in the ABNF syntax ("trasformed")
>> * the syntax is not strictly equivalent, or the ABNF is unnecessarily not
>>   context-free
>>
>> It would be better as:
>>
>> EBNF:
>>
>> unicode_locale_id
>> <http://unicode.org/reports/tr35/index.html#unicode_locale_id> =
>>     unicode_language_id
>>     (transformed_extensions unicode_locale_extensions?
>>     | unicode_locale_extensions transformed_extensions?)? ;
>>
>> ABNF:
>>
>>     = unicode_language_id
>>     [transformed_extensions [unicode_locale_extensions]
>>     / unicode_locale_extensions [transformed_extensions]]
>>
>> 2017-03-28 11:56 GMT+02:00 Joan Montané <[email protected]>:
>>
>>> 2017-03-28 7:57 GMT+02:00 Mark Davis ☕️ <[email protected]>:
>>>
>>>> To add to what Ken and Markus said: like many other identifiers, there
>>>> are a number of different categories.
>>>>
>>>> 1. *Ill-formed:* "$1"
>>>> 2. *Well-formed, but not valid:* "usx". Is *syntactic* according to
>>>>    http://unicode.org/reports/tr51/proposed.html#def_emoji_tag_sequence,
>>>>    but is not *valid* according to
>>>>    http://unicode.org/reports/tr51/proposed.html#valid-emoji-tag-sequences.
>>>> 3. *Valid, but not recommended:* "usca". Corresponds to the valid
>>>>    Unicode subdivision code for California according to
>>>>    http://unicode.org/reports/tr51/proposed.html#valid-emoji-tag-sequences
>>>>    and CLDR, but is not listed in http://unicode.org/Public/emoji/5.0/.
>>>> 4. *Recommended:* "gbsct". Corresponds to the valid Unicode
>>>>    subdivision code for Scotland, and *is* listed in
>>>>    http://unicode.org/Public/emoji/5.0/.
>>>>
>>>> As Ken says, the terminology is a little bit in flux for the term
>>>> 'recommended'. TR51 is still open for comment, although we won't make any
>>>> changes that would invalidate http://unicode.org/Public/emoji/5.0/.
>>>
>>> Just two remarks:
>>>
>>> 1st one: point 4 (Unicode subdivision codes listed on the Unicode emoji
>>> site) raises something like a chicken-and-egg problem. Vendors don't
>>> easily add new subdivision flags (because they aren't recommended), and
>>> Unicode doesn't recommend new subdivision flags (because they aren't
>>> supported by vendors).
>>>
>>> 2nd one: What about "Adopt a Character" (AKA "Adopt an emoji")? Will
>>> valid, but not recommended, Unicode subdivision codes be eligible? For
>>> instance, could someone adopt the California, Texas, Pomerania, or
>>> Catalonia flags?
>>>
>>> Regards,
>>> Joan Montané
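
For anyone who wants to experiment with the two-step grammar Mark suggests in the quoted message, here is a minimal Python sketch of the `extensions` rule only. It is not taken from TR35 or CLDR: the function names `extension_order` and `has_valid_extension_order` are hypothetical, the `unicode_language_id` part and the subtags inside each extension are not validated at all, and it assumes that the single-letter subtags `t` and `u` appear only as extension singletons (so private-use sequences are ignored).

```python
import re

# Mark's two-step form, with T = transformed_extensions and
# U = unicode_locale_extensions:
#
#   unicode_locale_id = unicode_language_id extensions? ;
#   extensions        = T U? | U T? ;
#
# So the order of extensions must be one of "", "T", "TU", "U", "UT".
EXTENSIONS_ORDER = re.compile(r"(?:TU?|UT?)?")


def extension_order(locale_id: str) -> str:
    """Return the order of the -t- and -u- singletons as a string of T/U.

    Assumption: any single-letter subtag "t" or "u" after the first subtag
    starts an extension. Nothing else about the identifier is checked.
    """
    subtags = re.split(r"[-_]", locale_id.lower())
    return "".join(s.upper() for s in subtags[1:] if s in ("t", "u"))


def has_valid_extension_order(locale_id: str) -> bool:
    """Check only the extension ordering allowed by the `extensions` rule."""
    return EXTENSIONS_ORDER.fullmatch(extension_order(locale_id)) is not None


if __name__ == "__main__":
    for candidate in ("en", "hi-t-en-u-nu-latn", "en-u-ca-gregory-t-hi",
                      "en-u-ca-gregory-u-nu-latn"):
        print(candidate, has_valid_extension_order(candidate))
```

The regular expression is a direct transcription of the alternation in the proposed rule, with the outer `?` making the whole extensions part optional. A repeated `-u-` or `-t-` block is rejected because neither alternative allows the same extension twice.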

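
The four categories in Mark's earlier message quoted above can also be read as a chain of increasingly strict checks. The Python sketch below is only an illustration of that layering. The `Status` enum and `classify` function are hypothetical names, the well-formedness pattern is a simplifying assumption (lowercase ASCII letters and digits) rather than the TR51 definition, and the two sets are tiny samples containing only the codes mentioned in the thread, not the real CLDR or emoji 5.0 data.

```python
import re
from enum import Enum


class Status(Enum):
    ILL_FORMED = "ill-formed"
    WELL_FORMED_NOT_VALID = "well-formed, but not valid"
    VALID_NOT_RECOMMENDED = "valid, but not recommended"
    RECOMMENDED = "recommended"


# Assumption: a simplified syntax check, NOT the TR51 definition.
WELL_FORMED = re.compile(r"[a-z0-9]+")

# Assumption: sample data taken only from the examples in the thread.
VALID_SAMPLE = frozenset({"usca", "gbsct"})
RECOMMENDED_SAMPLE = frozenset({"gbsct"})


def classify(tag_spec: str) -> Status:
    """Apply the checks from least to most strict and report the category."""
    if WELL_FORMED.fullmatch(tag_spec) is None:
        return Status.ILL_FORMED
    if tag_spec not in VALID_SAMPLE:
        return Status.WELL_FORMED_NOT_VALID
    if tag_spec not in RECOMMENDED_SAMPLE:
        return Status.VALID_NOT_RECOMMENDED
    return Status.RECOMMENDED


if __name__ == "__main__":
    for spec in ("$1", "usx", "usca", "gbsct"):
        print(spec, "->", classify(spec).value)
```

Each category implies all the weaker ones, which is why a single pass of early returns is enough here.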
