Non-RGI sequences are not emoji? (was: Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10))

2018-01-02 Thread Doug Ewell via Unicode
Mark Davis wrote: BTW, relevant to this discussion is a proposal filed http://www.unicode.org/L2/L2017/17434-emoji-rejex-uts51-def.pdf (The date is wrong, should be 2017-12-22) The phrase "emoji regex" had caused me to ignore this document, but I took a look based on this thread. It says "we

Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10)

2018-01-02 Thread Richard Wordingham via Unicode
On Tue, 2 Jan 2018 01:21:37 -0800 Asmus Freytag via Unicode wrote:
> On 1/1/2018 6:52 AM, Richard Wordingham via Unicode wrote:
> > Generally yes, but I'm not sure that they'd be inappropriate for
> > Egyptian hieroglyphs showing human beings. The choice of

Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10)

2018-01-02 Thread Mark Davis ☕️ via Unicode
BTW, relevant to this discussion is a proposal filed http://www.unicode.org/L2/L2017/17434-emoji-rejex-uts51-def.pdf (The date is wrong, should be 2017-12-22)
Mark
On Tue, Jan 2, 2018 at 11:41 AM, Mark Davis ☕️ wrote:
> We had that originally, but some people objected that

Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10)

2018-01-02 Thread Mark Davis ☕️ via Unicode
We had that originally, but some people objected that some languages (Arabic, as I recall) can end a string of letters with a ZWJ, and immediately follow it by an emoji (without an intervening space) without wanting it to be joined into a grapheme cluster with a following symbol. While I

Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10)

2018-01-02 Thread Mark Davis ☕️ via Unicode
> Or is bringing it up here good enough?
You should submit a proposal, which you can do at https://www.unicode.org/reporting.html. It doesn't have to be much more than what you put in email. (A reminder for everyone here: This is simply a discussion list, and has no effect whatsoever unless

Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10)

2018-01-02 Thread Manish Goregaokar via Unicode
In the current draft GB11 mentions Extended_Pictographic Extend* ZWJ x Extended_Pictographic. Can this similarly be distilled to just ZWJ x Extended_Pictographic? This does affect cases like or and I'm not certain if that counts as a degenerate case. If we do this then all of the rules except
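The difference between the two formulations of GB11 can be sketched in a few lines of Python. This is only an illustration, not the UAX 29 algorithm: the `classify` table is a hypothetical toy stand-in for the real Extended_Pictographic, Extend, and ZWJ property data, and the function names are made up for this example. It shows that the two rules agree on an ordinary emoji ZWJ sequence but differ when a ZWJ follows a letter and precedes an emoji, which is the case raised elsewhere in this thread.

```python
EP, EXT, ZWJ, OTHER = "Extended_Pictographic", "Extend", "ZWJ", "Other"


def classify(ch):
    """Toy property lookup (assumption: NOT the real Unicode data)."""
    cp = ord(ch)
    if cp == 0x200D:
        return ZWJ
    if cp == 0xFE0F or 0x0300 <= cp <= 0x036F:
        return EXT
    if cp == 0x2764 or 0x1F300 <= cp <= 0x1FAFF:
        return EP
    return OTHER


def gb11_draft(cats, i):
    """Draft rule: Extended_Pictographic Extend* ZWJ x Extended_Pictographic.

    Returns True if the rule forbids a break before position i.
    """
    if i < 1 or cats[i] != EP or cats[i - 1] != ZWJ:
        return False
    j = i - 2
    while j >= 0 and cats[j] == EXT:  # skip back over Extend*
        j -= 1
    return j >= 0 and cats[j] == EP


def gb11_simplified(cats, i):
    """Simplified rule under discussion: ZWJ x Extended_Pictographic."""
    return i >= 1 and cats[i - 1] == ZWJ and cats[i] == EP


def compare(text):
    """Return (draft, simplified) verdicts for a break before the last char."""
    cats = [classify(ch) for ch in text]
    last = len(cats) - 1
    return gb11_draft(cats, last), gb11_simplified(cats, last)


if __name__ == "__main__":
    # Emoji ZWJ sequence: both rules keep it together.
    print(compare("\U0001F469\u200D\U0001F52C"))
    # Arabic letter + ZWJ + emoji: only the simplified rule joins them.
    print(compare("\u0628\u200D\U0001F600"))
```

Under these toy assumptions, the simplified rule joins the emoji to a preceding ZWJ regardless of what came before it, while the draft rule requires an Extended_Pictographic base.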

Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10)

2018-01-02 Thread Manish Goregaokar via Unicode
> Note: we are already planning to get rid of the GAZ/EBG distinction
> (http://www.unicode.org/reports/tr29/tr29-32.html#GB10) in any event.
This is great! I hadn't noticed this when I last saw that draft (I was focusing on the Virama stuff). Good to know!
> Instead, we'd add one line to

Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10)

2018-01-02 Thread Asmus Freytag via Unicode
On 1/1/2018 6:52 AM, Richard Wordingham via Unicode wrote:
> On Mon, 1 Jan 2018 13:24:29 +0530 Manish Goregaokar via Unicode wrote:
> > sounds very much like a degenerate case to me.
> Generally yes, but I'm not sure that they'd be inappropriate for Egyptian hieroglyphs showing