Re: IDC's versus Egyptian format controls

2018-02-16 Thread James Kass via Unicode
> Wouldn't that break existing data? Functionality, not data.

Re: IDC's versus Egyptian format controls

2018-02-16 Thread James Kass via Unicode
Richard Wordingham wrote: > One can argue that once the compound ideograph have been encoded, the > IDS should no longer be interpreted. Wouldn't that break existing data? If this sort of thing were done at OS or app level, it might be possible to replace the IDS string with the appropriate

Re: IDC's versus Egyptian format controls

2018-02-16 Thread James Kass via Unicode
Richard Wordingham wrote: > There is another possible use of the latitude given by TUS 5.0 to 10.0 > and possibly earlier. I can certainly imagine a case where someone > writes a font so that an unencoded character may be manipulated like any > other character. He has two choices - he can put

Re: IDC's versus Egyptian format controls

2018-02-16 Thread Richard Wordingham via Unicode
On Fri, 16 Feb 2018 15:25:22 -0800 James Kass via Unicode wrote: > Some people studying Han characters use the IDCs to illustrate the > ideographs and their components for various purposes. For example: > > U-0002A8B8 𪢸 ⿰土土 > U-0002A8B9 𪢹 ⿰土凡 > U-0002A8BA 𪢺 ⿱夂土 >

Re: IDC's versus Egyptian format controls

2018-02-16 Thread James Kass via Unicode
Richard Wordingham wrote, > And doing it reasonably well could be a lot of work. > However, I don't see any good reason to discourage > fonts from doing it by default, which is what is now > being proposed. Some people studying Han characters use the IDCs to illustrate the ideographs and their

Re: IDC's versus Egyptian format controls

2018-02-16 Thread Richard Wordingham via Unicode
On Fri, 16 Feb 2018 11:10:29 -0800 Ken Whistler via Unicode wrote: > On 2/16/2018 11:00 AM, Asmus Freytag via Unicode wrote: > > On 2/16/2018 8:00 AM, Richard Wordingham via Unicode wrote: > >> That doesn't square well with, "An implementation *may* render a > >> valid

Re: IDC's versus Egyptian format controls

2018-02-16 Thread Asmus Freytag (c) via Unicode
On 2/16/2018 11:10 AM, Ken Whistler wrote: It's the "may either" which is not the same as "may also". A./ On 2/16/2018 11:00 AM, Asmus Freytag via Unicode wrote: On 2/16/2018 8:00 AM, Richard Wordingham via Unicode wrote: That doesn't square well with, "An implementation *may* render a

Re: IDC's versus Egyptian format controls

2018-02-16 Thread Ken Whistler via Unicode
On 2/16/2018 11:00 AM, Asmus Freytag via Unicode wrote: On 2/16/2018 8:00 AM, Richard Wordingham via Unicode wrote: That doesn't square well with, "An implementation *may* render a valid Ideographic Description Sequence either by rendering the individual characters separately or by parsing

Re: IDC's versus Egyptian format controls

2018-02-16 Thread Asmus Freytag via Unicode
On 2/16/2018 10:20 AM, Richard Wordingham via Unicode wrote: On Fri, 16 Feb 2018 08:22:23 -0800 Ken Whistler via Unicode wrote: On 2/16/2018 8:00 AM, Richard Wordingham via Unicode wrote: A more portable solution for ideographs is to render an Ideographic Description

Re: Unicode of Death 2.0

2018-02-16 Thread Manish Goregaokar via Unicode
FWIW I dissected the crashing strings, it's basically all sequences in Telugu, Bengali, Devanagari where the consonant is suffix-joining (ra in Devanagari, jo and ro in Bengali, and all Telugu consonants), the vowel is not Bengali au or o / Telugu ai,

Re: IDC's versus Egyptian format controls

2018-02-16 Thread Richard Wordingham via Unicode
On Fri, 16 Feb 2018 08:22:23 -0800 Ken Whistler via Unicode wrote: > On 2/16/2018 8:00 AM, Richard Wordingham via Unicode wrote: > > > A more portable solution for ideographs is to render an Ideographic > > Description Sequences (IDS) as approximations to the characters

Re: IDC's versus Egyptian format controls

2018-02-16 Thread Ken Whistler via Unicode
On 2/16/2018 8:22 AM, Ken Whistler wrote: The Egyptian quadrat controls, on the other hand, are full-fledged Unicode format controls. One more point of distinction: The (gc=So) IDC's follow a syntax that uses Polish notation order for the descriptive operators (inherited from the intended
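
To illustrate the Polish notation order mentioned above, here is a minimal sketch of reading an IDS as a prefix expression, where each IDC is followed by its operands. The names IDC_ARITY and parse_ids are hypothetical and not from the thread, and the arity table assumes only the twelve IDCs U+2FF0..U+2FFB that existed at the time of this discussion. It does no validation beyond checking for a premature end of the sequence.

```python
# Arity of each Ideographic Description Character (assumed range U+2FF0..U+2FFB).
# U+2FF2 and U+2FF3 take three operands; the others take two.
IDC_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDC_ARITY["\u2FF2"] = 3
IDC_ARITY["\u2FF3"] = 3


def parse_ids(text, pos=0):
    """Parse one IDS in prefix (Polish) order starting at pos.

    Returns (tree, next_pos). A tree is either a single character
    or a tuple (idc, operand, operand[, operand]).
    """
    if pos >= len(text):
        raise ValueError("unexpected end of sequence")
    ch = text[pos]
    pos += 1
    arity = IDC_ARITY.get(ch)
    if arity is None:
        # Not an operator: treat the character as a leaf component.
        return ch, pos
    operands = []
    for _ in range(arity):
        node, pos = parse_ids(text, pos)
        operands.append(node)
    return (ch, *operands), pos


# The operator comes first, so nesting needs no brackets.
tree, end = parse_ids("⿰土⿱夂土")
print(tree)  # ('⿰', '土', ('⿱', '夂', '土'))
```

Because the operator precedes its operands and each operator has a fixed arity, a nested description can be read in a single left-to-right pass without any bracketing.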

IDC's versus Egyptian format controls (was: Re: Why so much emoji nonsense?)

2018-02-16 Thread Ken Whistler via Unicode
On 2/16/2018 8:00 AM, Richard Wordingham via Unicode wrote: A more portable solution for ideographs is to render an Ideographic Description Sequences (IDS) as approximations to the characters they describe. The Unicode Standard carefully does not prohibit so doing, and a similar scheme is

Re: Why so much emoji nonsense?

2018-02-16 Thread Richard Wordingham via Unicode
On Fri, 16 Feb 2018 10:57:57 + Phake Nick via Unicode wrote: > 2. Actually, the problem is not just limited to emoji. Many > Ideographic characters (Chinese, Japanese, etc) are adding to the > unicode each years, while at the current rate there are still many > rooms in

Re: Origin of Alphasyllabaries (was: Why so much emoji nonsense?)

2018-02-16 Thread Richard Wordingham via Unicode
On Fri, 16 Feb 2018 11:42:51 +0100 Philippe Verdy via Unicode wrote: > I said the opposite: the alphabets, abjads, abugidas and today's full > syllabaries derive from early simplified syllabaries,... In the Old World, alphabets and abugidas derive from abjads, which do not

Re: Why so much emoji nonsense?

2018-02-16 Thread Phake Nick via Unicode
2018-02-16 FRI 15:55, James Kass via Unicode wrote: > Pierpaolo Bernardi wrote: > > > But it's always a good time to argue against the addition of more > > nonsense to what we already have got. > > It's an open-ended set and precedent for encoding them exists. > Generally,

Re: Origin of Alphasyllabaries (was: Why so much emoji nonsense?)

2018-02-16 Thread Philippe Verdy via Unicode
2018-02-16 1:59 GMT+01:00 Richard Wordingham via Unicode < unicode@unicode.org>: > On Wed, 14 Feb 2018 21:49:57 +0100 > Philippe Verdy via Unicode wrote: > > > The concept of vowels as distinctive letters came later, even the > > letter A was initially a representation of a

Re: Why so much emoji nonsense?

2018-02-16 Thread Mark Davis ☕️ via Unicode
A few points 1. To add to what Asmus said, see also http://unicode.org/L2/L2018/18044-encoding-emoji.pdf "Their encoding, surprisingly, has been a boon for language support. The emoji draw on Unicode mechanisms that are used by various languages, but which had been incompletely implemented on

Re: Why so much emoji nonsense?

2018-02-16 Thread James Kass via Unicode
Asmus Freytag wrote: >> Words suffice. We go by what people actually say rather than whatever >> they might have meant. When we read text, we go by what's written. > > That is a worthy opinion, but not one that is shared, either in principle > or in lived practice (esp. related to digital

Re: Why so much emoji nonsense?

2018-02-16 Thread Asmus Freytag via Unicode
On 2/15/2018 11:54 PM, James Kass via Unicode wrote: Pierpaolo Bernardi wrote: But it's always a good time to argue against the addition of more nonsense to what we already have got. It's an open-ended set and precedent for encoding

Re: Why so much emoji nonsense?

2018-02-16 Thread Asmus Freytag via Unicode
Words suffice.  We go by what people actually say rather than whatever they might have meant.  When we read text, we go by what's written. That is a worthy opinion, but not one that is shared, either in principle or in lived practice (esp. related to digital communication) by vast numbers of

Re: Why so much emoji nonsense?

2018-02-16 Thread James Kass via Unicode
Pierpaolo Bernardi wrote: > But it's always a good time to argue against the addition of more > nonsense to what we already have got. It's an open-ended set and precedent for encoding them exists. Generally, input regarding the addition of characters to a repertoire is solicited from the user