Thanks. Probably best as:

unicode_locale_id = unicode_language_id
    ( transformed_extensions unicode_locale_extensions?
    | unicode_locale_extensions transformed_extensions? )? ;
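As a rough sanity check of that rule, here is a minimal sketch that compares it with the EBNF currently in the spec (quoted further down) and with the same rule factored through a separate extensions rule. The one-letter tokens L, T and U are hypothetical stand-ins for unicode_language_id, transformed_extensions and unicode_locale_extensions; the real productions are far richer, so this only models the top-level ordering.

```python
import re
from itertools import product

# Hypothetical one-letter stand-ins (not part of TR35):
#   L = unicode_language_id
#   T = transformed_extensions
#   U = unicode_locale_extensions
SPEC_EBNF = re.compile(r"L(?:TU?|U?T?)")   # EBNF as quoted from the spec
ONE_STEP = re.compile(r"L(?:TU?|UT?)?")    # single-rule form proposed above
EXTENSIONS = r"(?:TU?|UT?)"                # separate "extensions" rule
TWO_STEP = re.compile("L" + EXTENSIONS + "?")


def accepted(pattern, max_len=3):
    """Return every T/U sequence up to max_len that the pattern fully matches."""
    result = set()
    for n in range(max_len + 1):
        for combo in product("TU", repeat=n):
            tail = "".join(combo)
            if pattern.fullmatch("L" + tail):
                result.add(tail)
    return result


for name, pattern in [("spec", SPEC_EBNF), ("one-step", ONE_STEP), ("two-step", TWO_STEP)]:
    print(name, sorted(accepted(pattern)))
```

Under these assumptions all three patterns accept the same five extension orderings (none, T, TU, U, UT). So the rewrite does not appear to change which identifiers are well-formed at this level; what it removes is the overlap in the published EBNF, where a lone T can be derived from either alternative.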
Even clearer would be two steps:

unicode_locale_id = unicode_language_id extensions? ;
extensions = transformed_extensions unicode_locale_extensions?
           | unicode_locale_extensions transformed_extensions? ;

Could you file a CLDR ticket on this?

Mark

On Tue, Mar 28, 2017 at 12:36 PM, Philippe Verdy <verd...@wanadoo.fr> wrote:

> I note this in TR35
> *3.2 Unicode Locale Identifier
> <http://unicode.org/reports/tr35/index.html#Unicode_locale_identifier>*
>
> EBNF:
> unicode_locale_id
> <http://unicode.org/reports/tr35/index.html#unicode_locale_id> =
> unicode_language_id
> (transformed_extensions unicode_locale_extensions?
> | unicode_locale_extensions? transformed_extensions?) ;
>
> ABNF:
> = unicode_language_id
> ([trasformed_extensions [unicode_locale_extensions]]
> / [unicode_locale_extensions [transformed_extensions]])
>
> * first, there's a typo in the ABNF syntax ("trasformed")
> * the syntax is not strictly equivalent, or the ABNF is unnecessarily not
> context-free
>
> It would be better as:
>
> EBNF:
> unicode_locale_id
> <http://unicode.org/reports/tr35/index.html#unicode_locale_id> =
> unicode_language_id
> (transformed_extensions unicode_locale_extensions?
> | unicode_locale_extensions transformed_extensions?)? ;
>
> ABNF:
> = unicode_language_id
> [transformed_extensions [unicode_locale_extensions]
> / unicode_locale_extensions [transformed_extensions]]
>
>
> 2017-03-28 11:56 GMT+02:00 Joan Montané <j...@montane.cat>:
>
>> 2017-03-28 7:57 GMT+02:00 Mark Davis ☕️ <m...@macchiato.com>:
>>
>>> To add to what Ken and Markus said: like many other identifiers, there
>>> are a number of different categories.
>>>
>>> 1. *Ill-formed:* "$1"
>>> 2. *Well-formed, but not valid:* "usx". Is *syntactic* according to
>>> http://unicode.org/reports/tr51/proposed.html#def_emoji_tag_sequence,
>>> but is not *valid* according to
>>> http://unicode.org/reports/tr51/proposed.html#valid-emoji-tag-sequences.
>>> 3. *Valid, but not recommended:* "usca". Corresponds to the valid
>>> Unicode subdivision code for California according to
>>> http://unicode.org/reports/tr51/proposed.html#valid-emoji-tag-sequences
>>> and CLDR, but is not listed in http://unicode.org/Public/emoji/5.0/.
>>> 4. *Recommended:* "gbsct". Corresponds to the valid Unicode
>>> subdivision code for Scotland, and *is* listed in
>>> http://unicode.org/Public/emoji/5.0/.
>>>
>>> As Ken says, the terminology is a little bit in flux for the term
>>> 'recommended'. TR51 is still open for comment, although we won't make any
>>> changes that would invalidate http://unicode.org/Public/emoji/5.0/.
>>>
>>
>> Just two remarks.
>>
>> 1st one: point 4 (Unicode subdivision codes listed on the Unicode emoji
>> site) raises something like a chicken-and-egg problem. Vendors don't
>> readily add new subdivision flags (because they aren't recommended), and
>> Unicode doesn't recommend new subdivision flags (because they aren't
>> supported by vendors).
>>
>> 2nd one: What about "Adopt a Character" (AKA "Adopt an emoji")? Will
>> valid-but-not-recommended Unicode subdivision codes be eligible? For
>> instance, could someone adopt the California, Texas, Pomerania, or
>> Catalonia flags?
>>
>>
>> Regards,
>> Joan Montané
>>
>