On 31 May 2012 00:24, Mark Davis ☕ <m...@macchiato.com> wrote:
>
> There is definitely a problem.

Is it really such a problem?  Why can't implementations simply use
ZWSP to demarcate the 2-character units in a sequence of more than two
regional indicator symbols (and maybe always emit 2-character codes
wrapped between ZWSP on either side to be safe), so for example
US<ZWSP>ES<ZWSP>GE would be parsed as the regional indicator symbols
for the USA, Spain and Georgia, whereas U<ZWSP>SE<ZWSP>SG<ZWSP>E would be
parsed as the regional indicator symbols for U (invalid), Sweden,
Singapore and E (invalid).  Algorithms such as line-breaking would not
break between two regional indicator symbols, but only at a ZWSP.
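
To make the parsing rule concrete, here is a rough Python sketch of what I have in mind. The names (to_ri, parse_units) are made up for illustration, and "valid" here only means "a run of exactly two regional indicator symbols"; it does not check the pair against ISO 3166.

```python
ZWSP = "\u200b"      # ZERO WIDTH SPACE
RI_BASE = 0x1F1E6    # REGIONAL INDICATOR SYMBOL LETTER A
RI_LAST = 0x1F1FF    # REGIONAL INDICATOR SYMBOL LETTER Z


def to_ri(letters):
    """Map ASCII letters A-Z to the matching regional indicator symbols."""
    return "".join(chr(RI_BASE + ord(c) - ord("A")) for c in letters.upper())


def is_ri(ch):
    """True if ch is one of the 26 regional indicator symbols."""
    return RI_BASE <= ord(ch) <= RI_LAST


def parse_units(text):
    """Split text at ZWSP and return (letters, valid) for each non-empty run.

    A run is valid only if it is exactly two regional indicator symbols.
    Characters that are not regional indicators are shown as '?'.
    """
    units = []
    for run in text.split(ZWSP):
        if not run:
            continue  # leading, trailing or doubled ZWSP gives an empty run
        letters = "".join(
            chr(ord(c) - RI_BASE + ord("A")) if is_ri(c) else "?" for c in run
        )
        valid = len(run) == 2 and all(is_ri(c) for c in run)
        units.append((letters, valid))
    return units


good = ZWSP.join(to_ri(code) for code in ["US", "ES", "GE"])
bad = ZWSP.join(to_ri(code) for code in ["U", "SE", "SG", "E"])
print(parse_units(good))
print(parse_units(bad))
```

With this, the first string gives three valid units and the second gives U and E flagged as invalid with SE and SG valid, matching the two examples above.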

And if implementations wanted to support two- and three-letter
regional codes, they might parse
<ZWSP>GB<ZWSP>CYM<ZWSP>ENG<ZWSP>NIR<ZWSP>SCO<ZWSP> as the codes for
United Kingdom, Wales, England, Northern Ireland, and Scotland, and
represent them visually with the appropriate flag icons.
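
The same idea extended to two- and three-letter codes might look like the sketch below. The lookup table contains only the five codes from my example and is purely illustrative; it is not a real registry, and region_names is a hypothetical helper.

```python
ZWSP = "\u200b"      # ZERO WIDTH SPACE
RI_BASE = 0x1F1E6    # REGIONAL INDICATOR SYMBOL LETTER A
RI_LAST = 0x1F1FF    # REGIONAL INDICATOR SYMBOL LETTER Z

# Illustrative table only: the codes used in the example above.
REGION_NAMES = {
    "GB": "United Kingdom",
    "CYM": "Wales",
    "ENG": "England",
    "NIR": "Northern Ireland",
    "SCO": "Scotland",
}


def to_ri(letters):
    """Map ASCII letters A-Z to the matching regional indicator symbols."""
    return "".join(chr(RI_BASE + ord(c) - ord("A")) for c in letters.upper())


def decode(run):
    """Return the ASCII letters for a run of regional indicators, else None."""
    if not all(RI_BASE <= ord(c) <= RI_LAST for c in run):
        return None
    return "".join(chr(ord(c) - RI_BASE + ord("A")) for c in run)


def region_names(text):
    """Return one name per ZWSP-delimited run; None for unknown or bad runs."""
    names = []
    for run in text.split(ZWSP):
        if not run:
            continue  # the wrapping ZWSPs produce empty runs at both ends
        letters = decode(run) if len(run) in (2, 3) else None
        names.append(REGION_NAMES.get(letters))
    return names


codes = ["GB", "CYM", "ENG", "NIR", "SCO"]
sample = ZWSP + ZWSP.join(to_ri(c) for c in codes) + ZWSP
print(region_names(sample))
```

A renderer would then pick a flag icon for each name that is not None and fall back to showing the raw symbols otherwise.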

Andrew

