Re: Unicode Emoji 5.0 characters now final

2017-03-28 Thread Mark Davis ☕️
Thanks Mark On Tue, Mar 28, 2017 at 1:01 PM, Philippe Verdy wrote: > I just filed the bug in the CLDR contact form. > > 2017-03-28 12:49 GMT+02:00 Mark Davis ☕️ : > >> Thanks. Probably best as: >> >> unicode_locale_id = unicode_language_id >>

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Martin J. Dürst
I agree with Alastair. The list of font technology options was mostly to show that there are already a lot of options (some might even say too many), so font technology doesn't really limit our choices. Regards, Martin. On 2017/03/27 23:04, Alastair Houghton wrote: On 27 Mar 2017, at

Re: Unicode Emoji 5.0 characters now final

2017-03-28 Thread Mark Davis ☕️
Good questions. On Tue, Mar 28, 2017 at 11:56 AM, Joan Montané wrote: > 1st one: point 4 (Unicode subdivision codes listed in emoji Unicode site) > arises something like chicken-egg problem. Vendors don't easily add new > subdivision-flags (because they aren't recommended),

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Michael Everson
On 28 Mar 2017, at 07:32, Martin J. Dürst wrote: > On 2017/03/28 01:03, Michael Everson wrote: >> On 27 Mar 2017, at 16:56, John H. Jenkins wrote: > >> The 1857 St Louis punches definitely included both the 1855 EW Ч and the >> 1859 OI <ЃІ>. Ken

Re: Unicode Emoji 5.0 characters now final

2017-03-28 Thread Joan Montané
2017-03-28 7:57 GMT+02:00 Mark Davis ☕️ : > To add to what Ken and Markus said: like many other identifiers, there are > a number of different categories. > >1. *Ill-formed: *"$1" >2. *Well-formed, but not valid: *"usx". Is *syntactic* according to >

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Mark Davis ☕️
On Tue, Mar 28, 2017 at 12:39 PM, Martin J. Dürst wrote: No, your work wouldn't be impossible. It might be quite a bit more > difficult, but not impossible. I have written papers about Han ideographs > and Japanese text processing where I had to create my own

Re: Unicode Emoji 5.0 characters now final

2017-03-28 Thread Philippe Verdy
I note this in TR32 *3.2 Unicode Locale Identifier * EBNF ABNF unicode_locale_id = unicode_language_id (transformed_extensions unicode_locale_extensions? |

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Martin J. Dürst
Hello Michael, others, On 2017/03/27 21:07, Michael Everson wrote: On 27 Mar 2017, at 06:42, Martin J. Dürst wrote: The characters in question have different and undisputed origins, undisputed. If you change that to the somewhat more neutral "the shapes in question

Re: Unicode Emoji 5.0 characters now final

2017-03-28 Thread Mark Davis ☕️
Thanks. Probably best as: unicode_locale_id = unicode_language_id ( transformed_extensions unicode_locale_extensions? | unicode_locale_extensions transformed_extensions? )? ; even clearer would be two steps: unicode_locale_id = unicode_language_id
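A rough Python sketch of the ordering rule in the first production quoted above: a language id, optionally followed by one of the two extension kinds, optionally followed by the other. The one-letter abbreviations (L, T, U) and the function name are invented here for illustration only. Splitting a real locale identifier into these components is assumed to have happened already and is not attempted.

```python
import re

# Hypothetical one-letter stand-ins for the nonterminals in the production:
#   L = unicode_language_id
#   T = transformed_extensions
#   U = unicode_locale_extensions
ORDERING = re.compile(r"L(?:TU?|UT?)?")


def follows_production(components):
    """Return True if the component sequence fits the quoted production."""
    # Reject anything that is not one of the three stand-in letters.
    if any(c not in ("L", "T", "U") for c in components):
        return False
    return ORDERING.fullmatch("".join(components)) is not None


print(follows_production(["L", "T", "U"]))  # language id, T, then U
print(follows_production(["L", "U", "T"]))  # language id, U, then T
print(follows_production(["L", "T", "T"]))  # same extension kind twice
```

Collapsing each nonterminal to a single letter keeps the alternation in the production visible as a plain regular expression. It says nothing about whether each component is itself well formed.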

Re: Unicode Emoji 5.0 characters now final

2017-03-28 Thread Philippe Verdy
I just filed the bug in the CLDR contact form. 2017-03-28 12:49 GMT+02:00 Mark Davis ☕️ : > Thanks. Probably best as: > > unicode_locale_id = unicode_language_id > ( transformed_extensions unicode_locale_extensions? > |

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Martin J. Dürst
On 2017/03/28 01:49, Michael Everson wrote: Sorry, but typographic control of that sort is grand for typesetting, where you can select ranges of text and language-tag it (assuming your program accepts and supports all the language tags you might need (which they don’t)) and you can select

Re: Encoding of old compatibility characters

2017-03-28 Thread Philippe Verdy
Ideally a smart text renderer could as well display that glyph with a leading multiplication sign (a mathematical middle dot) and implicitly convert the following digits (and sign) as real superscript/exponent (using contextual substitution/positioning like for Eastern Arabic/Urdu), without

Re: Encoding of old compatibility characters

2017-03-28 Thread Ian Clifton
Philippe Verdy writes: > Ideally a smart text renderer could as well display that glyph with a > leading multiplication sign (a mathematical middle dot) and implicitly > convert the following digits (and sign) as real superscript/exponent > (using contextual

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Martin J. Dürst
On 2017/03/28 01:20, Michael Everson wrote: Ken transcribes into modern type a letter by Shelton dated 1859, in which “boy” is written В<ЃІ>, “few” as Й<ІЋ>, “truefully” [sic] as ГС<ІЋ>ЙЋТІ, and “you” as Џ<ІЋ>. These are all 1859 variants, yes? That would just show that these variants

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Martin J. Dürst
On 2017/03/27 21:59, Michael Everson wrote: On 27 Mar 2017, at 08:05, Martin J. Dürst wrote: Consider 2EBC ⺼ CJK RADICAL MEAT and 2E9D ⺝ CJK RADICAL MOON which are apparently really supposed to have identical glyphs, though we use an old-fashioned style in the charts

Re: Encoding of old compatibility characters

2017-03-28 Thread Frédéric Grosshans
On 28/03/2017 at 02:22, Mark E. Shoulson wrote: Aw, but ⏨ is awesome! It's much cooler-looking and more visually understandable than "e" for exponent notation. In some code I've been playing around with I support it as a valid alternative to "e". I agree 1⏨3 times with you on this!
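Accepting ⏨ (U+23E8 DECIMAL EXPONENT SYMBOL) as an alternative to "e" can be sketched in a few lines of Python. This is not the code mentioned in the message, which is not shown in the thread. The function name is made up for illustration, and the sketch simply rewrites ⏨ to "e" before handing the text to the built-in float().

```python
DECIMAL_EXPONENT_SYMBOL = "\u23E8"  # ⏨


def parse_decimal(text):
    """Parse a float, treating ⏨ as an alternative to 'e' (illustrative only)."""
    return float(text.replace(DECIMAL_EXPONENT_SYMBOL, "e"))


print(parse_decimal("1⏨3"))     # the number used in the reply above
print(parse_decimal("2.5⏨-2"))  # a negative exponent works the same way
```

Because the substitution happens before parsing, anything float() rejects with "e" is rejected with ⏨ as well, for example a bare exponent with no leading digits.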

Re: different version of common/annotations/ja.xml

2017-03-28 Thread Takao Fujiwara
Thanks, I will file that ticket. I'd like to have another version of ja.xml for both TTS and non-TTS. Fujiwara On 03/28/17 15:20, Mark Davis ☕️-san wrote: Ah, yes. Sorry for my confusion. One main purpose for the short names is for TTS, and for that I think people felt that the reading was

Re: Unicode Emoji 5.0 characters now final

2017-03-28 Thread Mark Davis ☕️
(I'm sure you know this, Philippe, but a reminder for others: as far as the Unicode projects go, discussions on this list have no effect unless they are turned into a submission (UTC or Emoji proposal, CLDR or ICU ticket).) If you see any problems in the CLDR data, please file a ticket at

Re: different version of common/annotations/ja.xml

2017-03-28 Thread Mark Davis ☕️
Ah, yes. Sorry for my confusion. One main purpose for the short names is for TTS, and for that I think people felt that the reading was more useful. However, it would probably be better for the keywords to have the normal spelling. You might consider filing a ticket at

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Martin J. Dürst
On 2017/03/28 01:03, Michael Everson wrote: On 27 Mar 2017, at 16:56, John H. Jenkins wrote: The 1857 St Louis punches definitely included both the 1855 EW Ч and the 1859 OI <ЃІ>. Ken Beesley shows them in smoke proofs in his 2004 paper on Metafont. Good to have some

Re: Encoding of old compatibility characters

2017-03-28 Thread Asmus Freytag
On 3/28/2017 4:00 AM, Ian Clifton wrote: I’ve used ⏨ a couple of times, without explanation, in my own emails—without, as far as I’m aware, causing any misunderstanding. Works especially well, whenever it renders as a box with 23E8 inscribed!

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Michael Everson
On 28 Mar 2017, at 11:39, Martin J. Dürst wrote: >> And what would the value of this be? Why should I (who have been doing this >> for two decades) not be able to use the word “character” when I believe it >> correct? Sometimes you people who have been here for a long

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Philippe Verdy
2017-03-28 18:30 GMT+02:00 Asmus Freytag : > On 3/28/2017 6:56 AM, Michael Everson wrote: > >> An æ ligature is a ligature of a and of e. It is not some sort of pretzel. >> > We need a pretzel emoji. We need a broken tooth emoji too !

Aw: Re: U+0261 LATIN SMALL LETTER SCRIPT G

2017-03-28 Thread Jörg Knappen
This is a script capital G or, in TeX notation, {\cal G}. It reflects the use of multiple styles of the same underlying alphabet in mathematics and sciences. It is not a capital script g (note the different ordering of capital and script). --Jörg Knappen I had found in 2013 a GꞬ

U+0261 LATIN SMALL LETTER SCRIPT G

2017-03-28 Thread Richard Wordingham
On Tue, 28 Mar 2017 21:10:58 +0900 "Martin J. Dürst" wrote: (in Re: Standaridized variation sequences for the Desert alphabet?) > On 2017/03/27 21:59, Michael Everson wrote: > > Aa and Ɑɑ are used contrastively for different sounds in some > > languages and in the IPA.

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Asmus Freytag
On 3/28/2017 6:56 AM, Michael Everson wrote: An æ ligature is a ligature of a and of e. It is not some sort of pretzel. We need a pretzel emoji. A./

Re: U+0261 LATIN SMALL LETTER SCRIPT G

2017-03-28 Thread Frédéric Grosshans
On 28/03/2017 at 18:14, Richard Wordingham wrote: On Tue, 28 Mar 2017 21:10:58 +0900 "Martin J. Dürst" wrote: (in Re: Standaridized variation sequences for the Desert alphabet?) On 2017/03/27 21:59, Michael Everson wrote: Aa and Ɑɑ are used contrastively for

Re: Unicode Emoji 5.0 characters now final

2017-03-28 Thread Markus Scherer
On Tue, Mar 28, 2017 at 11:41 AM, Doug Ewell wrote: > Mark Davis wrote: > > > 3. Valid, but not recommended: "usca". Corresponds to the valid > > Unicode subdivision code for California according to > > http://unicode.org/reports/tr51/proposed.html#valid-emoji-tag-sequences > >

Re: Unicode Emoji 5.0 characters now final

2017-03-28 Thread Richard Wordingham
On Tue, 28 Mar 2017 11:41:38 -0700 "Doug Ewell" wrote: > "Not recommended," "not standard," "not interoperable," or any other > term ESC settles on for the 5000+ valid flag sequences that are not > England, Scotland, and Wales is just a short, easy step away from > deprecation

Re: Encoding of old compatibility characters

2017-03-28 Thread Mark E. Shoulson
I don't think I want my text renderer to be *that* smart. If I want ⏨, I'll put ⏨. If I want a multiplication sign or something, I'll put that. Without the multiplication sign, it's still quite understandable, more so than just "e". It is valid for a text rendering engine to render "g"

Re: Unicode Emoji 5.0 characters now final

2017-03-28 Thread Mark E. Shoulson
Kind of have to agree with Doug here. Either support the mechanism or don't. Saying "well, you CAN do this if you WANT to" always implies a "...but you probably shouldn't." Why even bother making it a possibility? On 03/28/2017 02:41 PM, Doug Ewell wrote: "Even though it is possible

Re: Encoding of old compatibility characters

2017-03-28 Thread Leo Broukhis
On Tue, Mar 28, 2017 at 6:09 AM, Asmus Freytag wrote: > On 3/28/2017 4:00 AM, Ian Clifton wrote: > > I’ve used ⏨ a couple of times, without explanation, in my own > emails—without, as far as I’m aware, causing any misunderstanding. > > Works especially well, whenever it

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Martin J. Dürst
On 2017/03/29 01:47, Philippe Verdy wrote: 2017-03-28 18:30 GMT+02:00 Asmus Freytag : On 3/28/2017 6:56 AM, Michael Everson wrote: An æ ligature is a ligature of a and of e. It is not some sort of pretzel. We need a pretzel emoji. We need a broken tooth emoji too !

Re: Encoding of old compatibility characters

2017-03-28 Thread Mark E. Shoulson
On 03/28/2017 09:09 AM, Asmus Freytag wrote: On 3/28/2017 4:00 AM, Ian Clifton wrote: I’ve used ⏨ a couple of times, without explanation, in my own emails—without, as far as I’m aware, causing any misunderstanding. Works especially well, whenever it renders as a box with 23E8 inscribed! A./

Re: Unicode Emoji 5.0 characters now final

2017-03-28 Thread Martin J. Dürst
Hello Doug, On 2017/03/29 03:41, Doug Ewell wrote: If this story sounds vaguely familiar to old-timers, it's exactly the path that was followed the last time Plane 14 tag characters were under discussion, between 1998 and 2000: someone wrote an RFC to embed language tags in plain text using

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Peter Edberg
> On Mar 28, 2017, at 9:30 AM, Asmus Freytag wrote: > > On 3/28/2017 6:56 AM, Michael Everson wrote: >> An æ ligature is a ligature of a and of e. It is not some sort of pretzel. > We need a pretzel emoji. Already in Unicode 10 / emoji 5.0:

Re: Re: U+0261 LATIN SMALL LETTER SCRIPT G

2017-03-28 Thread Frédéric Grosshans
I don't think it is a script capital G, but I admit it is arguable. One of the reasons is that the related variables s and μ are not script capitals. If you're interested, I could check whether script capitals are used in this book for other notations. On Tue, 28 Mar 2017 at 18:52, "Jörg

Re: Unicode Emoji 5.0 characters now final

2017-03-28 Thread Doug Ewell
Mark Davis wrote: > 3. Valid, but not recommended: "usca". Corresponds to the valid > Unicode subdivision code for California according to > http://unicode.org/reports/tr51/proposed.html#valid-emoji-tag-sequences > and CLDR, but is not listed in http://unicode.org/Public/emoji/5.0/. "Not

Re: U+0261 LATIN SMALL LETTER SCRIPT G

2017-03-28 Thread Asmus Freytag
On 3/28/2017 10:26 AM, Frédéric Grosshans wrote: I don't think it is a script capital G, but I admit it is arguable. One of the reasons is that the related variables s and μ are not script capital. If you're interested, I could check in the

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-28 Thread Asmus Freytag (c)
On 3/28/2017 10:30 AM, Peter Edberg wrote: On Mar 28, 2017, at 9:30 AM, Asmus Freytag wrote: On 3/28/2017 6:56 AM, Michael Everson wrote: An æ ligature is a ligature of a and of e. It is not some sort of pretzel. We need a pretzel emoji.

Re: Unicode Emoji 5.0 characters now final

2017-03-28 Thread Mark Davis ☕️
To add to what Ken and Markus said: like many other identifiers, there are a number of different categories. 1. *Ill-formed: *"$1" 2. *Well-formed, but not valid: *"usx". Is *syntactic* according to http://unicode.org/reports/tr51/proposed.html#def_emoji_tag_sequence, but is not