The UTC considered ZWSP as one of the possible approaches to the problem. While easier in terms of line breaking, there'd still be a requirement to change grapheme cluster boundaries and word boundaries to join sequences like 🇦🇦, and people felt the approach didn't work well with encoding conversion. About conversion, I think the discussion was something like the following:
It is relatively simple to have a mapping like:

<sjis bytes> ↔ 🇦[joiner]🇦

If we used ZWSP, then we'd have:

<sjis bytes> ← 🇦🇦 // but the code wouldn't know when to also absorb adjacent ZWSPs.
<sjis bytes> → 🇦🇦 // but the code would need context to know when to add adjacent ZWSPs.

Both of those would be complicated for encoding converters to handle. People also felt that 🇦[joiner]🇦 would be more consistent with treating the sequence as a unit, both conceptually and in fonts.

I personally favored the ZWSP, but was convinced during the discussion that ZWJ was a better approach.

------------------------------
Mark <https://plus.google.com/114199149796022210033>
*Il meglio è l’inimico del bene*

On Thu, May 31, 2012 at 2:47 AM, Andrew West <[email protected]> wrote:

> On 31 May 2012 00:24, Mark Davis ☕ <[email protected]> wrote:
> >
> > There is definitely a problem.
>
> Is it really such a problem? Why can't implementations simply use
> ZWSP to demarcate the 2-character units in a sequence of more than two
> regional indicator symbols (and maybe always emit 2-character codes
> wrapped between ZWSP on either side to be safe), so for example
> US<ZWSP>ES<ZWSP>GE would be parsed as the regional indicator symbols
> for USA, SPAIN and Georgia, whereas U<ZWSP>SE<ZWSP>SG<ZWSP>E would be
> parsed as the regional indicator symbols for U (invalid), Sweden,
> Singapore and E (invalid). Algorithms such as line-breaking would not
> break between two regional indicator symbols, but only at a ZWSP.
>
> And if implementations wanted to support two- and three-letter
> regional codes, they might parse
> <ZWSP>GB<ZWSP>CYM<ZWSP>ENG<ZWSP>NIR<ZWSP>SCO<ZWSP> as the codes for
> United Kingdom, Wales, England, Northern Ireland, and Scotland, and
> represent them visually with the appropriate flag icons.
>
> Andrew
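
As a rough illustration of the pairing ambiguity in this thread, here is a minimal Python sketch. It is not taken from the UTC discussion: the helper names (`to_ri`, `from_ri`, `pair_run`, `parse_zwsp_units`) are hypothetical, and it assumes input made up only of regional indicator symbols (U+1F1E6..U+1F1FF) and ZWSP (U+200B). It shows that an undelimited run pairs differently depending on where parsing starts, and how the ZWSP-delimited two-letter scheme from Andrew's message might be parsed.

```python
ZWSP = "\u200b"      # ZERO WIDTH SPACE
RI_BASE = 0x1F1E6    # REGIONAL INDICATOR SYMBOL LETTER A


def is_ri(ch):
    """True if ch is one of the 26 regional indicator symbols."""
    return RI_BASE <= ord(ch) <= RI_BASE + 25


def to_ri(letters):
    """Map ASCII letters A-Z to regional indicator symbols (hypothetical helper)."""
    return "".join(chr(RI_BASE + ord(c) - ord("A")) for c in letters.upper())


def from_ri(symbols):
    """Map regional indicator symbols back to ASCII letters; other characters pass through."""
    return "".join(
        chr(ord(c) - RI_BASE + ord("A")) if is_ri(c) else c for c in symbols
    )


def pair_run(run, start=0):
    """Pair up an undelimited run of regional indicators, beginning at offset start."""
    letters = from_ri(run[start:])
    return [letters[i:i + 2] for i in range(0, len(letters), 2)]


def parse_zwsp_units(text):
    """Split on ZWSP; a unit counts as valid only if it is exactly two regional indicators."""
    units = []
    for chunk in text.split(ZWSP):
        if not chunk:
            continue  # empty pieces come from leading, trailing, or doubled ZWSPs
        valid = len(chunk) == 2 and all(is_ri(c) for c in chunk)
        units.append((from_ri(chunk), valid))
    return units


run = to_ri("USESGE")
print(pair_run(run))      # ['US', 'ES', 'GE']
print(pair_run(run, 1))   # ['SE', 'SG', 'E']  (same text, entered one symbol later)

delimited = ZWSP.join(to_ri(code) for code in ["U", "SE", "SG", "E"])
print(parse_zwsp_units(delimited))
# [('U', False), ('SE', True), ('SG', True), ('E', False)]
```

The sketch covers only the parsing side. It does not address the conversion problem described above, where a converter would have to decide when to add or absorb the ZWSPs around each pair.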

