Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread srivas sinnathurai
Doug, first of all, flat code space is the primary functionality of Unicode, and I am not calling for any changes to existing encodings. What I propose is to assign about 16,000 codes to a code-page switching model. Why this suggestion? With the current flat space, one code point is only allocated to one and

RE: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Doug Ewell
srivas sinnathurai sisrivas at blueyonder dot co dot uk wrote: Why this suggestion? With current flat space, one code point is only allocated to one and only one purpose. We can run out of code space soon. Argument over. There are not 800,000 more characters that need to be encoded for

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread John H. Jenkins
srivas sinnathurai wrote on 19 August 2011 at 9:40 AM: Why this suggestion? With current flat space, one code point is only allocated to one and only one purpose. We can run out of code space soon. There are a couple of problems here. We currently have over 860,000 unassigned code points. Surveys

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 18:24, John H. Jenkins wrote: We currently have over 860,000 unassigned code points. Surveys of all known writing systems indicate that only a small fraction of these will be needed. Indeed, although it looks likely that Han will spill out of the SIP into plane 3, all

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Mark E. Shoulson
On 08/19/2011 01:24 PM, John H. Jenkins wrote: In order to get the UTC and WG2 to agree to a major architectural change such as you're suggesting, you'd have to have some very solid evidence that it's needed—not an interesting idea, not potentially useful, but seriously *needed*. That's how

RE: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Doug Ewell
Mark E. Shoulson mark at kli dot org wrote: And indeed, it went the other way too, back when ISO-10646 had not 17, but 65536 *planes* and someone provided some reasonable evidence (or just plain reasoned arguments) that 4.3 *billion* characters was probably overkill. Technically, I think

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Jukka K. Korpela
20.8.2011 0:07, Doug Ewell wrote: Of course, 2.1 billion characters is also overkill, but the advent of UTF-16 was how we ended up with 17 planes. And now we think that a little over a million is enough for everyone, just as they thought in the late 1980s that 16 bits is enough for everyone.
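
For reference, the code-space sizes quoted in this thread (17 planes, 2.1 billion, 4.3 billion) follow from simple arithmetic on 65,536-code-point planes and the UTF-16 surrogate ranges. A minimal Python sketch follows; the names `PLANE_SIZE`, `code_space`, and `utf16_planes` are illustrative and do not come from the thread or from any Unicode API.

```python
# Illustrative arithmetic only; all names here are hypothetical helpers.
PLANE_SIZE = 0x10000  # 65,536 code points per plane


def code_space(planes: int) -> int:
    """Total code points available with the given number of planes."""
    return planes * PLANE_SIZE


# UTF-16 reaches beyond the BMP by pairing one high surrogate
# (U+D800..U+DBFF) with one low surrogate (U+DC00..U+DFFF).
high_surrogates = 0xDBFF - 0xD800 + 1  # 1,024 values
low_surrogates = 0xDFFF - 0xDC00 + 1   # 1,024 values
supplementary = high_surrogates * low_surrogates  # code points reachable via pairs

# Plane 0 (the BMP) plus the planes addressable through surrogate pairs.
utf16_planes = 1 + supplementary // PLANE_SIZE

print(f"UTF-16 planes:           {utf16_planes}")
print(f"UTF-16 code space:       {code_space(utf16_planes):,}")
print(f"32,768 planes (31 bits): {code_space(32768):,}")
print(f"65,536 planes (32 bits): {code_space(65536):,}")
```

The "little over a million" figure is the 17-plane total; the 2.1 billion and 4.3 billion figures correspond to 31-bit and 32-bit code spaces respectively.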

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Mark E. Shoulson
On 08/19/2011 05:07 PM, Doug Ewell wrote: Mark E. Shoulson mark at kli dot org wrote: And indeed, it went the other way too, back when ISO-10646 had not 17, but 65536 *planes* and someone provided some reasonable evidence (or just plain reasoned arguments) that 4.3 *billion* characters was

RE: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Doug Ewell
Jukka K. Korpela jkorpela at cs dot tut dot fi wrote: And now we think that a little over a million is enough for everyone, just as they thought in the late 1980s that 16 bits is enough for everyone. I know this is an enjoyable exercise — people love to ridicule Bill Gates for his comment in

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Ken Whistler
On 8/19/2011 2:07 PM, Doug Ewell wrote: Technically, I think 10646 was always limited to 32,768 planes so that one could always address a code point with a 32-bit signed integer (a nod to the Java fans). Well, yes, but it didn't really have anything to do with Java. Remember that Java wasn't

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Asmus Freytag
On 8/19/2011 2:35 PM, Jukka K. Korpela wrote: 20.8.2011 0:07, Doug Ewell wrote: Of course, 2.1 billion characters is also overkill, but the advent of UTF-16 was how we ended up with 17 planes. And now we think that a little over a million is enough for everyone, just as they thought in the

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Asmus Freytag
On 8/19/2011 3:24 PM, Ken Whistler wrote: On 8/19/2011 2:07 PM, Doug Ewell wrote: Technically, I think 10646 was always limited to 32,768 planes so that one could always address a code point with a 32-bit signed integer (a nod to the Java fans). Well, yes, but it didn't really have anything