RE: An Aburdly Brief Introduction to Unicode (was Re: Perception

2001-02-23 Thread Marco Cimarosti
Paul Keinanen wrote: Regarding how to describe Unicode in the public, I think it is best to say that it can encode more than a million characters, of which about 10% (in 3.1) is used. It is better to defer the discussion of any transformation forms to a much later stage. I don't agree.
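
The "more than a million" figure follows from the 17 planes mentioned elsewhere in this thread list. A minimal sketch of that arithmetic, assuming the usual layout of 17 planes with 65,536 code points each; the function names are illustrative and not taken from any message:

```python
# Illustrative arithmetic only; the names below are hypothetical.
PLANES = 17                      # plane 0 (BMP) plus 16 supplementary planes
CODE_POINTS_PER_PLANE = 0x10000  # 65,536 code points in each plane
SURROGATE_COUNT = 0xDFFF - 0xD800 + 1  # 2,048 code points reserved for UTF-16


def code_space_size() -> int:
    """Total number of code points, U+0000 through U+10FFFF."""
    return PLANES * CODE_POINTS_PER_PLANE


def scalar_value_count() -> int:
    """Code points left once the surrogate range is excluded."""
    return code_space_size() - SURROGATE_COUNT


if __name__ == "__main__":
    print(code_space_size())               # 1114112
    print(scalar_value_count())            # 1112064
    print(hex(code_space_size() - 1))      # 0x10ffff
    print((code_space_size() - 1).bit_length())  # 21
```

The last line shows why the code space is often described as needing 21 bits: the highest code point, U+10FFFF, has a bit length of 21.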

Re: fictional scripts revisited

2001-02-23 Thread Arnt Gulbrandsen
Joel Rees [EMAIL PROTECTED] I'm telling you that 17 planes is not enough, and it _will_ become a painful constraint in your lifetime. How? It looks likely to me that unicode now encodes more than half of the characters known by living people. Do you think people are going to expand their

Re: fictional scripts revisited

2001-02-23 Thread Joel Rees
On 2001.02.23 19:42 Arnt Gulbrandsen [EMAIL PROTECTED] asked: Joel Rees [EMAIL PROTECTED] I'm telling you that 17 planes is not enough, and it _will_ become a painful constraint in your lifetime. How? It looks likely to me that unicode now encodes more than half of the characters known

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-23 Thread Peter_Constable
On 02/22/2001 01:38:24 PM Tom Lord wrote: [EMAIL PROTECTED] wrote: "Unicode is a character set encoding standard which currently provides for its entire character repertoire to be represented using 8-bit, 16-bit or 32-bit encodings." Please say "encoding forms". OK, but I'm more
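
The distinction between one character repertoire and its 8-bit, 16-bit and 32-bit encoding forms can be seen by encoding a single code point three ways. This is a rough sketch using Python's built-in codecs; the helper name `code_units` and the choice of U+10400 as the sample character are assumptions made for illustration:

```python
from typing import List


def code_units(text: str, encoding: str, unit_bytes: int) -> List[int]:
    """Encode text and split the bytes into code units of unit_bytes each."""
    data = text.encode(encoding)
    return [
        int.from_bytes(data[i:i + unit_bytes], "big")
        for i in range(0, len(data), unit_bytes)
    ]


# U+10400 DESERET CAPITAL LETTER LONG I, a character outside plane 0.
sample = "\U00010400"

utf8_units = code_units(sample, "utf-8", 1)       # four 8-bit code units
utf16_units = code_units(sample, "utf-16-be", 2)  # two 16-bit code units
utf32_units = code_units(sample, "utf-32-be", 4)  # one 32-bit code unit

if __name__ == "__main__":
    print([hex(u) for u in utf8_units])   # ['0xf0', '0x90', '0x90', '0x80']
    print([hex(u) for u in utf16_units])  # ['0xd801', '0xdc00']
    print([hex(u) for u in utf32_units])  # ['0x10400']
```

The same code point comes back from all three forms; only the size and number of code units differ.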

Re: An Aburdly Brief Introduction to Unicode (was Re: Perception ...)

2001-02-23 Thread Mark Davis
many comments - Original Message - From: "Tom Lord" [EMAIL PROTECTED] To: "Unicode List" [EMAIL PROTECTED] Sent: Wednesday, February 21, 2001 21:15 Subject: An Aburdly Brief Introduction to Unicode (was Re: Perception ...) We've seen several posts about the perception that Unicode is

Re: fictional scripts revisited

2001-02-23 Thread John H. Jenkins
At 6:28 AM -0800 2/23/01, [EMAIL PROTECTED] wrote: The unlikelihood of you or anybody coming up with sufficient evidence to make that case is such that I'd be willing to put less constraint on you: present clear evidence that more than 880,790 characters will *ever* be in wide use and will merit

Re: An Aburdly Brief Introduction to Unicode (was Re: Perception ...)

2001-02-23 Thread John Cowan
Mark Davis wrote: A _code_point_ is an integer value which is assigned to an abstract character. Each character receives a unique code point. inaccurate. Multiple *abstract characters* can have a single code point; multiple code points can correspond to a single *abstract character*.

RE: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-23 Thread Ayers, Mike
I advocate taking it one step farther, and referring to Unicode as "21 bits and counting". Sure, it should be a long long time before more space is needed, but it's a good idea to prepare the audience now. After all, pretty much every ceiling ever established in computing has been

RE: fictional scripts revisited

2001-02-23 Thread Ayers, Mike
From: David Starner [mailto:[EMAIL PROTECTED]] The second example I would like to raise are the "Square Words" or "New English Calligraphy"[6] (I don't know which name is more appropriate, but I will refer to it hereafter as "NEC"), which is a Sinoform script. NEC is a system where

Fictional scripts revisited, might as well relax about it now

2001-02-23 Thread Dan Kolis
On Wed, Feb 21, 2001 at 10:58:06PM -0800, Thomas Chan wrote: First, there are the 4000 new[4] "CJK Ideographs" that he created solely for a work called _Tianshu_ (A Book from the Sky)[5] (1987-1991), which Xu spent three years carving movable wooden type for. There is no doubt that

Re: An Aburdly Brief Introduction to Unicode (was Re: Perception ...)

2001-02-23 Thread Mark Davis
In somewhat more detail: In general, a single abstract character corresponds to a single code point. However, due to the requirement of compatibility with legacy code sets, plus some inherent fuzziness in what constitutes abstract characters, there are cases where this is not true: - one

Re: Fictional scripts revisited, might as well relax about it now

2001-02-23 Thread John H. Jenkins
At 8:27 AM -0800 2/23/01, Dan Kolis wrote: Well, if you have no cultural bias and you encode Klingon, you pretty well have to include anything. Klingon is not likely to be encoded any time soon. The basic problem here is that the Klingon Language Institute has shown little interest in

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-23 Thread John Cowan
Ayers, Mike wrote: After all, pretty much every ceiling ever established in computing has been broken through, and there is no reason to believe that it won't happen again! On the contrary. There *are* reasons to believe that it won't happen in the case of character encoding. As for

Re: 21 bits ...

2001-02-23 Thread Nelson H. F. Beebe
Since folks are debating whether 21 bits is really enough for Unicode forever, I thought I should toss in these gems from my quotation collection, about previous mistakes when people thought something was big enough: \QUOTATION{ There is only one mistake that can be made in computer design that

RE: 8-bit ASCII

2001-02-23 Thread Jim Melton
Gentlepeople, I'm surprised that nobody whose responses I've seen has taken the trouble to actually go to ANSI to see what "ASCII" means to that standards-publishing body. A quick search at http://webstore.ansi.org for the word "ASCII" (without the quotes, of course) shows the following two

RE: fictional scripts revisited

2001-02-23 Thread Daniel Biddle
On Fri, 23 Feb 2001, Ayers, Mike wrote: This, however, is absurd - one of those 1,000,000 words is "antidisestablishmentarianism", and there's a whole bunch half that long or longer. Show me the glyphs for them! This NEC thingy may make cute artsy stuff, but it would be useless for

Re: fictional scripts revisited

2001-02-23 Thread David Starner
On Fri, Feb 23, 2001 at 08:11:51AM -0800, Ayers, Mike wrote: Besides, does anyone really believe that alphabetic writers would decide that they'd rather learn thousands of glyphs? We're getting deeply fictional here... All it would take is some small dictator-run communist country whose

Re: An Aburdly Brief Introduction to Unicode (was Re: Perception ...)

2001-02-23 Thread Kenneth Whistler
Mark said: In somewhat more detail: In general, a single abstract character corresponds to a single code point. However, due to the requirement of compatibility with legacy code sets, plus some inherent fuzziness in what constitutes abstract characters, there are cases where this is not

Re: An Aburdly Brief Introduction to Unicode (was Re: Perception ...)

2001-02-23 Thread Peter_Constable
On 02/23/2001 09:58:55 AM John Cowan wrote: Mark Davis wrote: A _code_point_ is an integer value which is assigned to an abstract character. Each character receives a unique code point. inaccurate. Multiple *abstract characters* can have a single code point; multiple code points can

Re: An Aburdly Brief Introduction to Unicode (was Re: Perception ...)

2001-02-23 Thread Peter_Constable
On 02/23/2001 10:34:05 AM "Mark Davis" wrote: In somewhat more detail: In general, a single abstract character corresponds to a single code point. However, due to the requirement of compatibility with legacy code sets, plus some inherent fuzziness in what constitutes abstract characters, there

Re: An Aburdly Brief Introduction to Unicode (was Re: Perception ...)

2001-02-23 Thread Peter_Constable
On 02/23/2001 01:28:07 PM Kenneth Whistler wrote: - one abstract character can correspond to two different code points {a with ring above} == U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE == U+212B ANGSTROM SIGN (singleton canonical equivalence
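
The equivalence between U+00C5 and U+212B quoted above can be checked with Python's standard `unicodedata` module. A small sketch, assuming that comparing NFD forms is an acceptable test for canonical equivalence; the helper `canonically_equivalent` is a hypothetical name, not part of the module:

```python
import unicodedata

RING_A = "\u00C5"      # precomposed letter
ANGSTROM = "\u212B"    # ANGSTROM SIGN, a singleton canonical equivalent
DECOMPOSED = "A\u030A"  # U+0041 followed by U+030A COMBINING RING ABOVE


def canonically_equivalent(a: str, b: str) -> bool:
    """Treat two strings as equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    print(unicodedata.name(RING_A))    # LATIN CAPITAL LETTER A WITH RING ABOVE
    print(unicodedata.name(ANGSTROM))  # ANGSTROM SIGN
    print(canonically_equivalent(RING_A, ANGSTROM))    # True
    print(canonically_equivalent(RING_A, DECOMPOSED))  # True
    # NFC maps both the singleton and the combining sequence to U+00C5.
    print(unicodedata.normalize("NFC", ANGSTROM) == RING_A)  # True
```

This covers both cases under discussion: two different code points for one abstract character, and one abstract character spelled as a sequence of two code points.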

Re: fictional scripts revisited

2001-02-23 Thread Curtis Clark
At 10:51 PM 2/22/01, Joel Rees wrote: So Plane 9, say, can be nothing but surrogates-of-surrogates, to some 64- or 128-bit code space. You do mean for UTF-16, don't you? Let me be somewhat more explicit, now that I've thought about it for a while. IIRC there is an entire private use
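
For readers following the surrogate discussion, here is a sketch of how existing UTF-16 surrogate pairs map to supplementary code points, which is the mechanism any "surrogates-of-surrogates" proposal would be layered on. The function names are illustrative; the arithmetic is the standard UTF-16 pairing of a high surrogate (D800..DBFF) with a low surrogate (DC00..DFFF):

```python
from typing import Tuple


def to_surrogates(code_point: int) -> Tuple[int, int]:
    """Split a supplementary code point into a (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF use surrogate pairs")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogates(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into a code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    print([hex(u) for u in to_surrogates(0x10000)])   # ['0xd800', '0xdc00']
    print([hex(u) for u in to_surrogates(0x10FFFF)])  # ['0xdbff', '0xdfff']
    print(hex(from_surrogates(0xD801, 0xDC00)))       # 0x10400
```

The 1,024 high surrogates times 1,024 low surrogates give 1,048,576 pairs, which is exactly 16 planes of 65,536 code points. Together with plane 0 that accounts for the 17-plane limit being debated in this thread.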

RE: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-23 Thread Ayers, Mike
From: John Cowan [mailto:[EMAIL PROTECTED]] Ayers, Mike wrote: After all, pretty much every ceiling ever established in computing has been broken through, and there is no reason to believe that it won't happen again! On the contrary. There *are* reasons to believe that it won't

Re: Benefits of Unicode

2001-02-23 Thread Richard Cook
Sorry, I tuned out for a moment: is there a URL for the final version of Tex's tabulation of benefits? Also, I'd appreciate any similar links that might be used in a page of info for the uninitiated. Best, Richard

Re: Benefits of Unicode

2001-02-23 Thread Tex Texin
Richard, The list is attached. The page contains some links which would help someone get started. I did mean to make a couple more small changes that I haven't gotten to yet. In particular, someone wrote me that the item: "Standards insure interoperability and portability by prescribing

Re: bijective (was re: An Aburdly Brief Introduction to Unicode (was Re:

2001-02-23 Thread Richard Cook
Mark Davis wrote: that must be made about what counts as an abstract character and what does not; and the generally acknowledged desirability of supporting bijective mappings between a variety of older character sets and while I like bijective, it is not a commonly understood term. I