Thanks Mark
On Tue, Mar 28, 2017 at 1:01 PM, Philippe Verdy <[email protected]> wrote:
> I just filed the bug in the CLDR contact form.
>
> 2017-03-28 12:49 GMT+02:00 Mark Davis ☕️ <[email protected]>:
>
>> Thanks. Probably best as:
>>
>> unicode_locale_id = unicode_language_id
>>     ( transformed_extensions unicode_locale_extensions?
>>     | unicode_locale_extensions transformed_extensions?
>>     )? ;
>>
>> even clearer would be two steps:
>>
>> unicode_locale_id = unicode_language_id extensions? ;
>>
>> extensions = transformed_extensions unicode_locale_extensions?
>>            | unicode_locale_extensions transformed_extensions? ;
>>
>> Could you file a CLDR ticket on this?
>>
>> Mark
>>
>> On Tue, Mar 28, 2017 at 12:36 PM, Philippe Verdy <[email protected]> wrote:
>>
>>> I note this in TR35:
>>>
>>> *3.2 Unicode Locale Identifier
>>> <http://unicode.org/reports/tr35/index.html#Unicode_locale_identifier>*
>>>
>>> unicode_locale_id
>>> <http://unicode.org/reports/tr35/index.html#unicode_locale_id>
>>>
>>> EBNF:
>>>   = unicode_language_id
>>>     (transformed_extensions unicode_locale_extensions?
>>>     | unicode_locale_extensions? transformed_extensions?) ;
>>>
>>> ABNF:
>>>   = unicode_language_id
>>>     ([trasformed_extensions [unicode_locale_extensions]]
>>>     / [unicode_locale_extensions [transformed_extensions]])
>>>
>>> * first there's a typo in the ABNF syntax ("trasformed")
>>> * the syntax is not strictly equivalent, or the ABNF is unnecessarily
>>>   not context-free
>>>
>>> It would be better as:
>>>
>>> unicode_locale_id
>>> <http://unicode.org/reports/tr35/index.html#unicode_locale_id>
>>>
>>> EBNF:
>>>   = unicode_language_id
>>>     (transformed_extensions unicode_locale_extensions?
>>>     | unicode_locale_extensions transformed_extensions?)? ;
>>>
>>> ABNF:
>>>   = unicode_language_id
>>>     [transformed_extensions [unicode_locale_extensions]
>>>     / unicode_locale_extensions [transformed_extensions]]
>>>
>>> 2017-03-28 11:56 GMT+02:00 Joan Montané <[email protected]>:
>>>
>>>> 2017-03-28 7:57 GMT+02:00 Mark Davis ☕️ <[email protected]>:
>>>>
>>>>> To add to what Ken and Markus said: like many other identifiers,
>>>>> there are a number of different categories.
>>>>>
>>>>> 1. *Ill-formed:* "$1"
>>>>> 2. *Well-formed, but not valid:* "usx". Is *syntactic* according to
>>>>>    http://unicode.org/reports/tr51/proposed.html#def_emoji_tag_sequence,
>>>>>    but is not *valid* according to
>>>>>    http://unicode.org/reports/tr51/proposed.html#valid-emoji-tag-sequences.
>>>>> 3. *Valid, but not recommended:* "usca". Corresponds to the valid
>>>>>    Unicode subdivision code for California according to
>>>>>    http://unicode.org/reports/tr51/proposed.html#valid-emoji-tag-sequences
>>>>>    and CLDR, but is not listed in http://unicode.org/Public/emoji/5.0/.
>>>>> 4. *Recommended:* "gbsct". Corresponds to the valid Unicode
>>>>>    subdivision code for Scotland, and *is* listed in
>>>>>    http://unicode.org/Public/emoji/5.0/.
>>>>>
>>>>> As Ken says, the terminology is a little bit in flux for the term
>>>>> 'recommended'. TR51 is still open for comment, although we won't
>>>>> make any changes that would invalidate
>>>>> http://unicode.org/Public/emoji/5.0/.
>>>>
>>>> Just two remarks.
>>>>
>>>> 1st one: point 4 (Unicode subdivision codes listed on the Unicode
>>>> emoji site) raises something like a chicken-and-egg problem. Vendors
>>>> don't easily add new subdivision flags (because they aren't
>>>> recommended), and Unicode doesn't recommend new subdivision flags
>>>> (because they aren't supported by vendors).
>>>>
>>>> 2nd one: What about "Adopt a Character" (AKA "Adopt an emoji")? Will
>>>> valid, but not recommended, Unicode subdivision codes be eligible?
>>>> For instance, could someone adopt the California, Texas, Pomerania,
>>>> or Catalonia flags?
>>>>
>>>> Regards,
>>>> Joan Montané
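For anyone who wants to check the grammar fix quoted above by hand, here is a rough sketch rather than anything taken from TR35 or CLDR. It assumes each extension block can be treated as a single token (T for transformed_extensions, U for unicode_locale_extensions), which is my simplification, and the names PUBLISHED_EBNF, PUBLISHED_ABNF, PROPOSED and accepted are made up for the illustration. Under that assumption it lists which orderings of T and U each form of the rule accepts after unicode_language_id.

```python
import re
from itertools import product

# Assumption: one letter stands for one whole extension block.
#   T = transformed_extensions, U = unicode_locale_extensions
# The ABNF typo ("trasformed") is ignored here; only the structure is modeled.
PUBLISHED_EBNF = re.compile(r"(?:TU?|U?T?)")           # (T U? | U? T?)
PUBLISHED_ABNF = re.compile(r"(?:(?:TU?)?|(?:UT?)?)")  # ([T [U]] / [U [T]])
PROPOSED = re.compile(r"(?:TU?|UT?)?")                 # (T U? | U T?)?


def accepted(pattern, max_len=3):
    """Return, sorted, every T/U sequence up to max_len that the pattern fully matches."""
    words = (
        "".join(letters)
        for n in range(max_len + 1)
        for letters in product("TU", repeat=n)
    )
    return sorted(w for w in words if pattern.fullmatch(w))


if __name__ == "__main__":
    for name, pattern in [
        ("published EBNF", PUBLISHED_EBNF),
        ("published ABNF", PUBLISHED_ABNF),
        ("proposed", PROPOSED),
    ]:
        print(name, accepted(pattern))
```

In this abstraction all three forms accept the same five orderings (none, T, U, TU, UT). What the rewrite changes is that each ordering is then matched in exactly one way: the published EBNF can match the T-only case through either alternative, and the published ABNF can match the empty case through either alternative.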

