Re: Non-RGI sequences are not emoji? (was: Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10))

2018-01-15 Thread Doug Ewell via Unicode

On January 5, Mark Davis wrote:


> Doug, I modified my working draft, at
> https://docs.google.com/document/d/1EuNjbs0XrBwqlvCJxra44o3EVrwdBJUWsPf8Ec1fWKY
>
> If that looks ok, I'll submit.


Sorry for the delay. The text substitutions look fine.

--
Doug Ewell | Thornton, CO, US | ewellic.org



Re: Non-RGI sequences are not emoji? (was: Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10))

2018-01-05 Thread Mark Davis ☕️ via Unicode
Doug, I modified my working draft, at
https://docs.google.com/document/d/1EuNjbs0XrBwqlvCJxra44o3EVrwdBJUWsPf8Ec1fWKY

If that looks ok, I'll submit.

Thanks again for your comments.

Mark

On Wed, Jan 3, 2018 at 9:29 AM, Mark Davis ☕️ wrote:

> Thanks for your comments; you raise an excellent issue. There are valid
> sequences that are not RGI; a vendor can support additional emoji sequences
> (in particular, flags). So the wording in the doc isn't correct.
>
> It should do something like replace the use of "testing for RGI" by
> "testing for validity". The key areas involved in that are checking for the
> valid base+modifier combinations, valid RI pairs, and TAG sequences. The
> latter two involve testing based on the information applied in the
> appendix, while the valid base+modifiers are more regular and can be tested
> based on properties.
>
>
> Mark
>
> On Tue, Jan 2, 2018 at 9:55 PM, Doug Ewell via Unicode <
> unicode@unicode.org> wrote:
>
>> Mark Davis wrote:
>>
>>> BTW, relevant to this discussion is a proposal filed
>>> http://www.unicode.org/L2/L2017/17434-emoji-rejex-uts51-def.pdf (The
>>> date is wrong, should be 2017-12-22)
>>>
>>
>> The phrase "emoji regex" had caused me to ignore this document, but I
>> took a look based on this thread. It says "we still depend on the RGI test
>> to filter the set of emoji sequences" and proposes that the EBNF in UTS #51
>> be simplified on the basis that only RGI sequences will pass the "possible
>> emoji" test anyway.
>>
>> Thus it is true, as some people have said (i.e. in L2/17‐382), that
>> non-RGI sequences do not actually count as emoji, and therefore there is no
>> way — not merely no "recommended" way — to represent the flags of entities
>> such as Catalonia and Brittany.
>>
>> In 2016 I had asked for the emoji tag sequence mechanism for flags to be
>> available for all CLDR subdivisions, not just three, with the understanding
>> that the vast majority would not be supported by vendor glyphs. It is
>> unfortunate that, while the conciliatory name "recommended" was adopted for
>> the three, the intent of "exclusively permitted" was retained.
>>
>> --
>> Doug Ewell | Thornton, CO, US | ewellic.org
>>
>>
>


Re: Non-RGI sequences are not emoji? (was: Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10))

2018-01-03 Thread Mark Davis ☕️ via Unicode
Thanks for your comments; you raise an excellent issue. There are valid
sequences that are not RGI; a vendor can support additional emoji sequences
(in particular, flags). So the wording in the doc isn't correct.

It should do something like replace the use of "testing for RGI" by
"testing for validity". The key areas involved in that are checking for the
valid base+modifier combinations, valid RI pairs, and TAG sequences. The
latter two involve testing based on the information applied in the
appendix, while the valid base+modifiers are more regular and can be tested
based on properties.
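
As a rough illustration of the three checks named above (base+modifier combinations, RI pairs, and TAG sequences), here is a minimal Python sketch. The function names are hypothetical, and the sets SAMPLE_MODIFIER_BASES, SAMPLE_REGIONS and SAMPLE_SUBDIVISIONS are tiny stand-ins for the real property and validity data, so this shows only the shape of a "testing for validity" approach, not the wording proposed for UTS #51.

```python
# Sketch only: the data sets below are small illustrative samples,
# not the real Unicode/CLDR tables.
EMOJI_MODIFIERS = range(0x1F3FB, 0x1F400)        # U+1F3FB..U+1F3FF
SAMPLE_MODIFIER_BASES = {0x1F44B, 0x1F44D}       # two Emoji_Modifier_Base characters
SAMPLE_REGIONS = {"ES", "FR", "US"}              # stand-in for valid region codes
SAMPLE_SUBDIVISIONS = {"gbeng", "gbsct", "gbwls", "esct"}  # stand-in subdivision ids

RI_A = 0x1F1E6          # REGIONAL INDICATOR SYMBOL LETTER A
BLACK_FLAG = 0x1F3F4    # WAVING BLACK FLAG, base of flag tag sequences
CANCEL_TAG = 0xE007F    # CANCEL TAG, terminates a tag sequence


def is_valid_modifier_sequence(s):
    """Base+modifier: regular enough to test from properties alone."""
    cps = [ord(c) for c in s]
    return (len(cps) == 2
            and cps[0] in SAMPLE_MODIFIER_BASES
            and cps[1] in EMOJI_MODIFIERS)


def is_valid_flag_pair(s):
    """RI pair: well-formed AND the region code is in the validity data."""
    cps = [ord(c) for c in s]
    if len(cps) != 2 or not all(RI_A <= cp <= RI_A + 25 for cp in cps):
        return False
    region = "".join(chr(cp - RI_A + ord("A")) for cp in cps)
    return region in SAMPLE_REGIONS


def is_valid_tag_flag(s):
    """TAG sequence: well-formed AND the subdivision id is in the validity data."""
    cps = [ord(c) for c in s]
    if len(cps) < 3 or cps[0] != BLACK_FLAG or cps[-1] != CANCEL_TAG:
        return False
    body = cps[1:-1]
    if not all(0xE0020 <= cp <= 0xE007E for cp in body):
        return False
    subdivision = "".join(chr(cp - 0xE0000) for cp in body)
    return subdivision in SAMPLE_SUBDIVISIONS
```

Note that none of these checks consults an RGI list: a sequence can pass them without being recommended for general interchange, which is the distinction at issue.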


Mark

On Tue, Jan 2, 2018 at 9:55 PM, Doug Ewell via Unicode wrote:

> Mark Davis wrote:
>
>> BTW, relevant to this discussion is a proposal filed
>> http://www.unicode.org/L2/L2017/17434-emoji-rejex-uts51-def.pdf (The
>> date is wrong, should be 2017-12-22)
>>
>
> The phrase "emoji regex" had caused me to ignore this document, but I took
> a look based on this thread. It says "we still depend on the RGI test to
> filter the set of emoji sequences" and proposes that the EBNF in UTS #51 be
> simplified on the basis that only RGI sequences will pass the "possible
> emoji" test anyway.
>
> Thus it is true, as some people have said (i.e. in L2/17‐382), that
> non-RGI sequences do not actually count as emoji, and therefore there is no
> way — not merely no "recommended" way — to represent the flags of entities
> such as Catalonia and Brittany.
>
> In 2016 I had asked for the emoji tag sequence mechanism for flags to be
> available for all CLDR subdivisions, not just three, with the understanding
> that the vast majority would not be supported by vendor glyphs. It is
> unfortunate that, while the conciliatory name "recommended" was adopted for
> the three, the intent of "exclusively permitted" was retained.
>
> --
> Doug Ewell | Thornton, CO, US | ewellic.org
>
>


Non-RGI sequences are not emoji? (was: Re: Unifying E_Modifier and Extend in UAX 29 (i.e. the necessity of GB10))

2018-01-02 Thread Doug Ewell via Unicode

Mark Davis wrote:


> BTW, relevant to this discussion is a proposal filed
> http://www.unicode.org/L2/L2017/17434-emoji-rejex-uts51-def.pdf (The
> date is wrong, should be 2017-12-22)


The phrase "emoji regex" had caused me to ignore this document, but I 
took a look based on this thread. It says "we still depend on the RGI 
test to filter the set of emoji sequences" and proposes that the EBNF in 
UTS #51 be simplified on the basis that only RGI sequences will pass the 
"possible emoji" test anyway.


Thus it is true, as some people have said (i.e. in L2/17‐382), that 
non-RGI sequences do not actually count as emoji, and therefore there is 
no way — not merely no "recommended" way — to represent the flags of 
entities such as Catalonia and Brittany.


In 2016 I had asked for the emoji tag sequence mechanism for flags to be 
available for all CLDR subdivisions, not just three, with the 
understanding that the vast majority would not be supported by vendor 
glyphs. It is unfortunate that, while the conciliatory name 
"recommended" was adopted for the three, the intent of "exclusively 
permitted" was retained.
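
For readers unfamiliar with the mechanism, the following Python sketch shows how an emoji tag sequence for a subdivision flag is put together: U+1F3F4 WAVING BLACK FLAG, then one tag character per character of the subdivision identifier, then U+E007F CANCEL TAG. The helper names are hypothetical, and "ES-CT" for Catalonia is assumed here as the ISO 3166-2 code behind the CLDR subdivision identifier "esct". Whether any font displays the result as a flag is a separate question, which is the point of the paragraph above.

```python
BLACK_FLAG = "\U0001F3F4"   # WAVING BLACK FLAG, the tag base for flags
CANCEL_TAG = "\U000E007F"   # CANCEL TAG, the terminator
TAG_OFFSET = 0xE0000        # tag character = ASCII code point + 0xE0000


def subdivision_flag(subdivision_id: str) -> str:
    """Build the tag sequence for an id such as 'ES-CT' or 'gbeng' (hypothetical helper)."""
    tag_id = subdivision_id.lower().replace("-", "")
    if not tag_id or not all(c.isascii() and c.isalnum() for c in tag_id):
        raise ValueError(f"not a plausible subdivision id: {subdivision_id!r}")
    tags = "".join(chr(TAG_OFFSET + ord(c)) for c in tag_id)
    return BLACK_FLAG + tags + CANCEL_TAG


def code_points(s: str) -> str:
    """Render a string as space-separated U+XXXX notation."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


print(code_points(subdivision_flag("ES-CT")))
# U+1F3F4 U+E0065 U+E0073 U+E0063 U+E0074 U+E007F
```

The same function produces the sequences for England, Scotland and Wales from "gbeng", "gbsct" and "gbwls"; nothing in the encoding itself distinguishes those three from any other subdivision.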


--
Doug Ewell | Thornton, CO, US | ewellic.org