On 31 May 2012 00:24, Mark Davis ☕ <m...@macchiato.com> wrote:
>
> There is definitely a problem.

Is it really such a problem? Why can't implementations simply use ZWSP to demarcate the two-character units in a sequence of more than two regional indicator symbols (and perhaps always emit two-character codes with a ZWSP on either side, to be safe)? For example, US<ZWSP>ES<ZWSP>GE would be parsed as the regional indicator symbols for the USA, Spain and Georgia, whereas U<ZWSP>SE<ZWSP>SG<ZWSP>E would be parsed as the regional indicator symbols for U (invalid), Sweden, Singapore and E (invalid). Algorithms such as line breaking would not break between two regional indicator symbols, but only at a ZWSP.

And if implementations wanted to support two- and three-letter regional codes, they might parse <ZWSP>GB<ZWSP>CYM<ZWSP>ENG<ZWSP>NIR<ZWSP>SCO<ZWSP> as the codes for the United Kingdom, Wales, England, Northern Ireland and Scotland, and represent them visually with the appropriate flag icons.

Andrew
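
As a rough illustration of the ZWSP-delimited parsing suggested above, here is a minimal Python sketch. The function names (to_ri, ri_to_letters, parse_regions) and the allowed_lengths parameter are hypothetical, invented for this example. It assumes the input holds only regional indicator symbols (U+1F1E6..U+1F1FF) and ZWSP (U+200B), and "valid" here only means that a unit has an allowed length, not that the code is actually assigned to a region.

```python
ZWSP = "\u200b"    # ZERO WIDTH SPACE
RI_A = 0x1F1E6     # REGIONAL INDICATOR SYMBOL LETTER A


def to_ri(letters):
    """Map ASCII letters A-Z to the corresponding regional indicator symbols."""
    return "".join(chr(RI_A + ord(c) - ord("A")) for c in letters.upper())


def ri_to_letters(chunk):
    """Map regional indicator symbols back to ASCII letters.

    Returns None if the chunk contains anything else.
    """
    out = []
    for ch in chunk:
        offset = ord(ch) - RI_A
        if not 0 <= offset < 26:
            return None
        out.append(chr(ord("A") + offset))
    return "".join(out)


def parse_regions(text, allowed_lengths=(2,)):
    """Split text at ZWSP and return (letters, is_valid) for each unit.

    Empty units (from leading, trailing or doubled ZWSP) are skipped.
    A unit is valid only if it consists solely of regional indicator
    symbols and its length is in allowed_lengths (an assumption of this
    sketch; no lookup against a real region-code registry is done).
    """
    units = []
    for chunk in text.split(ZWSP):
        if not chunk:
            continue
        letters = ri_to_letters(chunk)
        if letters is None:
            units.append((chunk, False))
        else:
            units.append((letters, len(letters) in allowed_lengths))
    return units
```

With these definitions, the sequence for US<ZWSP>ES<ZWSP>GE yields three valid units, the sequence for U<ZWSP>SE<ZWSP>SG<ZWSP>E yields an invalid U, a valid SE and SG, and an invalid E, and passing allowed_lengths=(2, 3) accepts the GB/CYM/ENG/NIR/SCO example as well.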