In UAX 29, the GB10 rule[1] (and the WB14 rule[2]) states that we should not break before an E_Modifier character when it follows an emoji base (E_Base or E_Base_GAZ), with optional Extend characters in between.
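
To make the rule concrete, here is a rough sketch of GB10 on its own, written in Python. This is my own illustration and not spec text: the function name and the string labels are hypothetical, the input is assumed to be a list of Grapheme_Cluster_Break values that has already been looked up for each code point, and all the other GB rules are ignored.

```python
# Hypothetical labels for the break categories involved in GB10.
# Real values would come from GraphemeBreakProperty.txt.
E_BASE = "E_Base"
E_BASE_GAZ = "E_Base_GAZ"
E_MODIFIER = "E_Modifier"
EXTEND = "Extend"
OTHER = "Other"


def gb10_forbids_break(cats, i):
    """Return True if GB10 alone forbids a break between cats[i-1] and cats[i].

    GB10: (E_Base | E_Base_GAZ) Extend* x E_Modifier
    """
    # The rule only applies when the character after the boundary is E_Modifier.
    if i <= 0 or i >= len(cats) or cats[i] != E_MODIFIER:
        return False
    # Skip backwards over any run of Extend characters.
    j = i - 1
    while j >= 0 and cats[j] == EXTEND:
        j -= 1
    # The rule matches only if an emoji base precedes that run.
    return j >= 0 and cats[j] in (E_BASE, E_BASE_GAZ)
```

The backwards scan over Extend is the part that needs the separate emoji base categories: without them, there would be nothing to look for at the start of the run.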

Given that the spec is allowed to ignore degenerate cases, is there any value lost by merging E_Modifier and Extend into a single category? That would let us get rid of the E_Base category completely, and the E_Base_GAZ (EBG) category would get merged with Glue_After_Zwj (GAZ).

<random non-emoji, skin tone modifier> sounds very much like a degenerate case to me. <GAZ emoji, skin tone> also feels rather degenerate. There are only three GAZ characters (heart (U+2764), kiss (U+1F48B), speech bubble (U+1F5E8)), and I can't see why you'd end up with a skin tone modifier on them except by accident. (Unless we plan to support lip colors or something, but in that case the kiss emoji would switch to EBG anyway.)

Thanks,
-Manish

[1]: http://www.unicode.org/reports/tr29/#GB10
[2]: http://www.unicode.org/reports/tr29/#WB14