RE: Help with Greek special casing

2001-02-26 Thread Carl W. Brown
Nick, If you have a lowercase sigma in the middle of the word followed by a diacritic is it final; sigma, hacek, some other letter. Carl -Original Message- From: Nick Nicholas [mailto:[EMAIL PROTECTED]] Sent: Sunday, February 25, 2001 10:53 PM To: Unicode List Subject: Re: Help with

RE: Help with Greek special casing

2001-02-26 Thread Carl W. Brown
Nick, Maybe it might be clearer to ask if there are any cases where you use the final sigma form where it is not the last letter in a word. Modern Greek only. Carl -Original Message- From: Nick Nicholas [mailto:[EMAIL PROTECTED]] Sent: Sunday, February 25, 2001 10:53 PM To: Unicode

Re: Help with Greek special casing

2001-02-26 Thread Lukas Pietsch
Carl Brown asked: Is it final when followed by a hyphen or combining diacritical mark? Patrick Rourke answered: Don't know what the Unicode rules are, but the answer is no. The final sigma form is not used if the sigma is in a medial position in the word but at the end of the line (e.g.,

RE: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-26 Thread Marco Cimarosti
Doug Ewell wrote: A *script* like Latin or Cyrillic typically has many more characters than any one language will ever use. An *alphabet* is, by definition, language-specific. Hhmmm... We probably all agree that Chinese, Japanese and Korean share the "CJK script". But would you say,

Armenian Eternity Sign

2001-02-26 Thread Marco Cimarosti
Armenian single-byte character sets contain a character called "Armenian Eternity Sign" (it looks like a helix or a flower; see it in http://www.freenet.am/armscii/armcs-006.html). What is the meaning and usage of this symbol? How does it map to Unicode? If there is no mapping, was it ever

Hebrew shaping (was RE: Benefits of Unicode)

2001-02-26 Thread Marco Cimarosti
Sorry for coming back so late on an old issue (29 Jan 2001). I (Marco Cimarosti) wrote: Each different positional form of a letter in Arabic, Syriac or Mongolian is encoded with the same code point; the rendering engine must select the proper form. The same problem in Greek and Hebrew has

Re: fictional scripts revisited

2001-02-26 Thread Michael Everson
At 18:33 -0800 2001-02-25, Joel Rees wrote: And the PUA is already being fairly actively used, which means there is already a fair amount of extant plain text that is only legible within a specific context, and the UNICODE standard has no way of approaching it. And is guaranteed not to, unless

Re: Armenian Eternity Sign

2001-02-26 Thread Michael Everson
At 01:39 -0800 2001-02-26, Marco Cimarosti wrote: Armenian single-byte character sets contain a character called "Armenian Eternity Sign" (it looks like a helix or a flower; see it in http://www.freenet.am/armscii/armcs-006.html). What is the meaning and usage of this symbol? How does it map

Re: Help with Greek special casing

2001-02-26 Thread Michael Everson
At 20:28 -0800 2001-02-25, Patrick T. Rourke wrote: Don't know what the Unicode rules are, but the answer is no. The final sigma form is not used if the sigma is in a medial position in the word but at the end of the line (e.g., when it occurs at the point of hyphenation in a hyphenated word at

RE: Question on Unicode data files

2001-02-26 Thread Marco Cimarosti
I wrote - UNIDATA/CJKXREF.TXT ([...] Errata: I meant UNIDATA/UNIHAN.TXT Sorry. _ Marco

RE: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-26 Thread Peter_Constable
On 02/26/2001 03:21:15 AM Marco Cimarosti wrote: But would you say, following your definition, that the subset of the CJK Script used to write Mandarin in Mainland China should be called "The Chinese Simplified *Alphabet*"? It is a writing system, but not an alphabetic one. - "Alphabet" is a

Re: bijective (was re: An Absurdly Brief Introduction to Unicode (was

2001-02-26 Thread Peter_Constable
On 02/24/2001 04:43:41 PM Richard Cook wrote: Whence does this terminology derive? Set or Mapping theory? I learned it in high school algebra. Anyone recommend a definitive text? I have handy the book from a topology course I took that gives definitions: Munkres, James A. 1975. Topology: A

Re: fictional scripts revisited

2001-02-26 Thread Peter_Constable
On 02/25/2001 08:33:32 PM "Joel Rees" wrote: And the PUA is already being fairly actively used, which means there is already a fair amount of extant plain text that is only legible within a specific context, and the UNICODE standard has no way of approaching it. Yes, Unicode does: if there is

Re: Introducing uniengine technology

2001-02-26 Thread Peter_Constable
On 02/25/2001 05:16:08 AM "William Overington" wrote: I am researching a concept that I am hoping to call a uniengine... The other commands are intended to include a comprehensive set of commands for certain graphics and other purposes, yet to be essentially a small set of easily implemented

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-26 Thread John Cowan
Marco Cimarosti wrote: - "Script" is a generic term meaning a writing system of any kind, its inventory of signs and its orthographic rules. - "Alphabet" is a specific class of scripts, whose principal characteristic is that it tends to map each sign to one of the language's phonemes. I

Alphabet vs. script (was RE: Perception that Unicode is 16-bit (w

2001-02-26 Thread Marco Cimarosti
John Cowan wrote: - "Alphabet" is a specific class of scripts, whose principal characteristic is that it tends to map each sign to one of the language's phonemes. I think that should rather be called an "alphabetic script", e.g. Latin, Greek, Cyrillic. You are right. I didn't consider

Re: Possibilities of future expansion (from Perception etc thread and

2001-02-26 Thread Peter_Constable
On 02/25/2001 05:16:15 AM "William Overington" wrote: [snip] Yet suppose that some organization were to have "Encode Your Character Here For Free" with light moderation only and openly stated that the way that the organization planned to make a profit were to encode all of the characters

Tengwar and Cirth (was: RE: Fictional scripts revisited, might as

2001-02-26 Thread Ayers, Mike
From: Michael Everson [mailto:[EMAIL PROTECTED]] Oh, we've got a *proposal* for Klingon. It does not, however, appear that it meets the criteria for use as well as Tengwar and Cirth. Okay, I've finally gotta ask: what are Tengwar and Cirth? Klingon I've heard of (and wish I

RE: Help with Greek special casing

2001-02-26 Thread Nick Nicholas
Nick, If you have a lowercase sigma in the middle of the word followed by a diacritic is it final; sigma, hacek, some other letter. No, sir. And medial sigma-diacritic is far more frequent than a sigma having a diacritic word-finally. Nick Nicholas, Thesaurus Linguae Graecae. [EMAIL
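
The rule discussed in this thread (a lowercase sigma takes its final form only when it ends the word, and a following diacritic does not by itself make it final) can be sketched in Python. This is a minimal illustration, not the Unicode SpecialCasing algorithm: the function name choose_sigma_forms and its helpers are hypothetical, "letter" and "mark" are approximated by Unicode general categories L* and M*, and a word-final sigma carrying a diacritic is assumed to take the final form. It does not handle line-end hyphenation, where the thread says the medial form is kept.

```python
import unicodedata

SIGMA_MEDIAL = "\u03c3"  # GREEK SMALL LETTER SIGMA
SIGMA_FINAL = "\u03c2"   # GREEK SMALL LETTER FINAL SIGMA


def _is_letter(ch):
    return unicodedata.category(ch).startswith("L")


def _is_mark(ch):
    return unicodedata.category(ch).startswith("M")


def choose_sigma_forms(text):
    """Return text with each lowercase sigma set to medial or final form.

    Assumption for this sketch: a sigma is final when a letter precedes it
    and no letter follows it, skipping combining marks on either side.
    """
    out = []
    for i, ch in enumerate(text):
        if ch in (SIGMA_MEDIAL, SIGMA_FINAL):
            # Look backwards past combining marks for a preceding letter.
            j = i - 1
            while j >= 0 and _is_mark(text[j]):
                j -= 1
            preceded = j >= 0 and _is_letter(text[j])
            # Look forwards past combining marks for a following letter.
            k = i + 1
            while k < len(text) and _is_mark(text[k]):
                k += 1
            followed = k < len(text) and _is_letter(text[k])
            ch = SIGMA_FINAL if (preceded and not followed) else SIGMA_MEDIAL
        out.append(ch)
    return "".join(out)
```

With this sketch, a medial sigma followed by a combining hacek (U+030C) and then another letter stays medial, which matches the answer given in the message above.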

Re: Tengwar and Cirth (was: RE: Fictional scripts revisited,

2001-02-26 Thread Michael Everson
At 10:35 -0600 2001-02-26, Ayers, Mike wrote: From: Michael Everson [mailto:[EMAIL PROTECTED]] Oh, we've got a *proposal* for Klingon. It does not, however, appear that it meets the criteria for use as well as Tengwar and Cirth. Okay, I've finally gotta ask: what are Tengwar and

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-26 Thread Michael Everson
At 07:25 -0800 2001-02-26, John Cowan wrote: Marco Cimarosti wrote: - "Script" is a generic term meaning a writing system of any kind, its inventory of signs and its orthographic rules. - "Alphabet" is a specific class of scripts, whose principal characteristic is that it tends to map each sign to

RE: bijective (was re: An Absurdly Brief Introduction to Unicode

2001-02-26 Thread Handwerker, Reinhard (ISS Atlanta)
Peter, that's not correct, either. A function (by definition) does not leave out any values in its domain (or it is not well-defined). If a function maps every point of its domain one-to-one into the codomain, it is injective. If a function maps every point of its domain onto the codomain (i.e.
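
For finite sets, the definitions in this message can be checked mechanically. The following Python sketch is only an illustration: it models a function as a dict from domain elements to codomain elements, and the helper names (is_function, is_injective, is_surjective, is_bijective) and the example sets are hypothetical, not taken from the thread.

```python
def is_function(mapping, domain, codomain):
    """A function assigns a codomain value to every point of its domain."""
    return set(mapping) == set(domain) and all(
        value in codomain for value in mapping.values()
    )


def is_injective(mapping):
    """One-to-one: no two domain points share the same image."""
    images = list(mapping.values())
    return len(images) == len(set(images))


def is_surjective(mapping, codomain):
    """Onto: every element of the codomain is the image of some point."""
    return set(mapping.values()) == set(codomain)


def is_bijective(mapping, codomain):
    """Bijective means both injective and surjective."""
    return is_injective(mapping) and is_surjective(mapping, codomain)


# Illustrative examples (sets chosen for this sketch only).
f = {1: "a", 2: "b", 3: "c"}   # into {"a", "b", "c", "d"}: injective, not onto
g = {1: "a", 2: "b", 3: "a"}   # onto {"a", "b"}: surjective, not one-to-one
h = {1: "a", 2: "b"}           # onto {"a", "b"}: bijective
```

A dict cannot leave a key without a value or give one key two values, so the "well-defined" requirement reduces to checking that the keys are exactly the domain.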

RE: Tengwar and Cirth (was: RE: Fictional scripts revisited, migh

2001-02-26 Thread jarkko . hietaniemi
Tengwar and Cirth are the scripts used to write Tolkien's languages: http://www.dcs.ed.ac.uk/misc/local/TolkLang/fonts/ http://fan.theonering.net/~rolozo/tengwar/tengwar/ http://fan.theonering.net/~rolozo/tengwar/cirth/ http://www.uib.no/People/hnohf/ -Original Message- From: ext

Re: Klingon? take a look

2001-02-26 Thread Michael \(michka\) Kaplan
Nobody doubted the reality of the submission, we all know it is there. No one is really pushing for it, though. MichKa Michael Kaplan Trigeminal Software, Inc. http://www.trigeminal.com/ - Original Message - From: "Dan Kolis" [EMAIL PROTECTED] To: "Unicode List" [EMAIL PROTECTED]

RE: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-26 Thread Marco Cimarosti
Michael Everson wrote: Yes, an alphabet proper is usually the subset of an alphabetic script. Armenian seems to be the exception, as it is only used for one language; Georgian, Latin, Cyrillic, Ogham, Runic, and Greek have been used for other languages. I miss the other language(s) for

RE: Help with Greek special casing

2001-02-26 Thread Nick Nicholas
Maybe it might be clearer to ask if there are any cases where you use the final sigma form where it is not the last letter in a word. Modern Greek only. What I described in my first paragraph is the only such instance I'm aware of (the 19th-century texts I have in mind were editions of Byzantine

RE: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-26 Thread Michael Everson
At 18:44 +0100 2001-02-26, Marco Cimarosti wrote: Michael Everson wrote: Yes, an alphabet proper is usually the subset of an alphabetic script. Armenian seems to be the exception, as it is only used for one language; Georgian, Latin, Cyrillic, Ogham, Runic, and Greek have been used for

Close enough

2001-02-26 Thread Dan Kolis
Wait a minute. You want to insert this: ISOMORPHIC BOWL-OF-WRATH SPANNER WITH JEWS HARP ACCOMPANIMENT. Can't you use instead the existing American equivalent? "ISOMORPHIC BOWL-OF-WRATH WRENCH WITH JEWS HARP ACCOMPANIMENT" Should be close enough. Dan

Re: bijective (was re: An Absurdly Brief Introduction to Unicode (was

2001-02-26 Thread David Starner
On Mon, Feb 26, 2001 at 07:18:55AM -0800, [EMAIL PROTECTED] wrote: On 02/24/2001 04:43:41 PM Richard Cook wrote: Whence does this terminology derive? Set or Mapping theory? I learned it in high school algebra. A quick survey around here indicates that OSU doesn't teach it in any course

RE: Alphabet vs. script (was RE: Perception that Unicode is 16-bi

2001-02-26 Thread jarkko . hietaniemi
A single alphabet of a single language is actually pretty useless for anything other than simple testing like "could this string of characters be a native word in that language". As an example, 'c' is not part of the Finnish language since no native Finnish word uses it, but as soon as one

Re: Klingon? take a look

2001-02-26 Thread Michael Everson
At 09:03 -0800 2001-02-26, Michael \(michka\) Kaplan wrote: Nobody doubted the reality of the submission, we all know it is there. No one is really pushing for it, though. It was originally made because there was a Unix or Linux implementation somewhere. It has been seen and approved by the

Re: Implementing Complex Unicode Scripts

2001-02-26 Thread jgo
The OpenType font format is supported; that means that the OS can read the files, do *basic* (i.e. 1:1) character-to-glyph mapping, and rasterize the glyph outlines. This is as much as is involved in supporting plain-vanilla TrueType fonts, only with additional possibilities for what formats

Re: Klingon? take a look

2001-02-26 Thread Nick NICHOLAS
Oh, and http://www.klingonska.org/piqad/ . Sorry. -- Nick Nicholas. TLG, UCI, USA. [EMAIL PROTECTED]; www.tlg.uci.edu/~opoudjis Many among their proselytes had sold their lands and houses to increase the public riches of the sect --- at the expense, indeed, of their unfortunate children,

Re: Possibilities of future expansion (from Perception etc thread

2001-02-26 Thread Michael Everson
At 07:21 -0800 2001-02-26, [EMAIL PROTECTED] wrote: Like it or not, Unicode is the property of the Unicode Consortium and its members, not ordinary people. The character set is also the property of the International Organization for Standardization. Personally, I think the PUA is a wonderful

Re: Klingon? take a look

2001-02-26 Thread Nick NICHOLAS
qaStaHvIS Mon, 26 Feb 2001 DIS, ghItlh Michael Everson: implementation somewhere. It has been seen and approved by the Klingon Language Institute, but it does remain true that most users of the Klingon language read and write it in its Latin orthography, although they will use the font for

Klingon silliness

2001-02-26 Thread John O'Conner
That anyone could seriously consider adding the Klingon script to Unicode seems preposterous. Even if someone were to provide an "accurate" script, a sample font, etc that meets the general requirements of a proposal, the idea is quite silly. I am surprised that the consortium hasn't simply

Re: An Absurdly Brief Introduction to Unicode (was Re: Perception ...)

2001-02-26 Thread Peter_Constable
On 02/24/2001 02:36:26 PM "Mark Davis" wrote: The glossary entry for "abstract character", as he points out, was inherited from 10646. "Abstract Character. A unit of information used for the organization, control, or representation of textual data. (See Definition D3 in Section 3.3, Characters

Re: Klingon silliness

2001-02-26 Thread Rick McGowan
Let me throw my light weight in with John O'Conner... It's silly to even consider Klingon for Unicode or 10646. Many members of both committees know this, and that's why it hasn't moved anywhere in several years. The question keeps cropping up because that silly proposal is still "on the

Re: Close enough

2001-02-26 Thread Peter_Constable
On 02/26/2001 11:53:27 AM dank wrote: Wait a minute. You want to insert this: ISOMORPHIC BOWL-OF-WRATH SPANNER WITH JEWS HARP ACCOMPANIMENT. Can't you use instead the existing American equivalent? "ISOMORPHIC BOWL-OF-WRATH WRENCH WITH JEWS HARP ACCOMPANIMENT" Should be close enough

Re: Close enough

2001-02-26 Thread P. T. Rourke
It was an Asimov story, I think - I should remember for sure, but don't. But it's too similar in style to his "tell all the Foys on Sortibackenstrete that I will soon be there" punch line to give it to Clarke. (Asimov, Azimov, I doubt he'd have cared). Patrick Rourke - Original Message

Re: Klingon silliness ii

2001-02-26 Thread Dan Kolis
[EMAIL PROTECTED] said: That anyone could seriously consider adding the Klingon script to Unicode seems preposterous. Even if someone were to provide an "accurate" script, a sample font, etc that meets the general requirements of a proposal, the idea is quite silly. I am surprised that the

Re: Klingon silliness

2001-02-26 Thread Michael \(michka\) Kaplan
From: "Rick McGowan" [EMAIL PROTECTED] I have said repeatedly over the years, that I will entertain the encoding of Klingon when the tribble-kissing wimps at the Klingon High Command beam an armed delegation into a UTC meeting and demand the encoding of their script. Until then, I see no

Re: Possibilities of future expansion (from Perception etc thread

2001-02-26 Thread Thomas Chan
On Mon, 26 Feb 2001 [EMAIL PROTECTED] wrote: On 02/25/2001 08:01:38 PM "Joel Rees" wrote: I know this has been hashed over time and time again, and the answer has been handed down as if by edict time and again, but _your_ attitude as expressed below is taken by many who are not involved as

Re: bijective (was re: An Absurdly Brief Introduction to Unicode

2001-02-26 Thread rscook
On Mon, 26 Feb 2001 [EMAIL PROTECTED] wrote: On 02/24/2001 04:43:41 PM Richard Cook wrote: Whence does this terminology derive? Set or Mapping theory? I learned it in high school algebra. and you still remember it? my memory for ancient history is not good. assuming even that i ever

Re: Klingon silliness

2001-02-26 Thread Tex Texin
Perhaps the real question is what are the criteria for including or excluding a fictional script. I have deleted John's mail, but his criteria applied more broadly than Klingon if I recall. Should we worry about elvish communication and not Klingon? Do we apply a business case to fictional

Well

2001-02-26 Thread Dan Kolis
You know, this Klingon thing might have an unexpected upside, though. I'm not sure. I think there's a good chance that, if ratified, the item could make the funny story of the day globally; you know, evening news style. "Trekkies everywhere applaud the inclusion of Klingon into Unicode, an effort to make

Re: Klingon silliness

2001-02-26 Thread John O'Conner
If it were up to me alone, I would put that proposal in the bin of things that have been politely refused. The fact that it has not yet gone to the great bit-bucket in the sky probably reflects the general esteem in which the gentlebeing who proposed it is held. Yes, I think we all highly

RE: Hebrew shaping (was RE: Benefits of Unicode)

2001-02-26 Thread Jonathan Rosenne
In Hebrew the exceptions are in abbreviations and foreign words, but they are not so rare. The most common ones are when the final Pe is hard, like in Philip. We have been encoding Hebrew since the 1950's, on punch cards, and the decision taken then for Hebrew was to have 5 extra letters for

Re: Tribble-kissing, was Re: Klingon silliness

2001-02-26 Thread rscook
Rick McGowan wrote: I have said repeatedly over the years, that I will entertain the encoding of Klingon when the tribble-kissing wimps at the Klingon High Command beam an armed delegation into a UTC meeting and demand the encoding of their script. Until then, I see no reason to consider

Re: Klingon silliness

2001-02-26 Thread Keld Jørn Simonsen
On Mon, Feb 26, 2001 at 01:02:43PM -0800, Tex Texin wrote: Perhaps the real question is what are the criteria for including or excluding a fictional script. I have deleted John's mail, but his criteria applied more broadly than Klingon if I recall. Should we worry about elvish communication

Re: Question on Unicode data files

2001-02-26 Thread John H. Jenkins
At 7:57 AM -0800 2/26/01, Richard Zhang wrote: Hello, Marco, Unihan is the official site I think. You can visit www.unihan.com.cn for more information about this, if you know Chinese :). If you sign up for cooperation with them, you will get full access to their database. No, Unihan is *NOT*

Re: Close enough

2001-02-26 Thread John H. Jenkins
At 12:58 PM -0800 2/26/01, P. T. Rourke wrote: It was an Asimov story, I think - I should remember for sure, but don't. But it's too similar in style to his "tell all the Foys on Sortibackenstrete that I will soon be there" punch line to give it to Clarke. (Asimov, Azimov, I doubt he'd have

Re: Implementing Complex Unicode Scripts

2001-02-26 Thread John H. Jenkins
At 10:57 AM -0800 2/26/01, jgo wrote: Doing complex character-to-glyph mapping involving the OpenType tables is another matter. My understanding is that the MacOS can do the former, but cannot yet do the latter. Well, the Apple folks who should know are on the list, so let's ask them.

Re: Klingon silliness

2001-02-26 Thread G. Adam Stanislav
At 12:11 26-02-2001 -0800, Rick McGowan wrote: It's silly to even consider Klingon for Unicode or 10646. Nah, it's not silly. It's offensive. Back when I suggested that 'ch' be added to Unicode, I received a ton of replies why that should not be. That despite the fact in Slovak 'ch' has a

Re: bijective (was re: An Absurdly Brief Introduction to Unicode (was

2001-02-26 Thread Peter_Constable
A brief intro to some algebra terminology (I'm drafting these from memory, but I don't think there are any oversights as in one of my earlier posts): I will assume the definition of "set" is understood. Examples will assume the sets K = { 1, 2, 3 }, L = { a, b}, M = { a, b, c, d }, N = { a, b,

Re: Possibilities of future expansion (from Perception etc thread

2001-02-26 Thread Peter_Constable
On 02/25/2001 08:01:38 PM "Joel Rees" wrote: Michael, I know this has been hashed over time and time again, and the answer has been handed down as if by edict time and again, but _your_ attitude as expressed below is taken by many who are not involved as rather arrogant. Michael and I don't

Re: Klingon silliness

2001-02-26 Thread David Starner
On Mon, Feb 26, 2001 at 02:35:43PM -0800, G. Adam Stanislav wrote: At 12:11 26-02-2001 -0800, Rick McGowan wrote: It's silly to even consider Klingon for Unicode or 10646. Nah, it's not silly. It's offensive. Back when I suggested that 'ch' be added to Unicode, I received a ton of

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-26 Thread Kenneth Whistler
Doug Ewell asked, on this hopelessly wandering thread: (Is there an English-language term for the subset of the CJK ideographic script that is used by a given language, say, Japanese?) Well, since "kanji" by now has been borrowed into English, at least among a rather large class of

Re: Close enough

2001-02-26 Thread DougEwell2
In a message dated 2001-02-26 10:38:23 Pacific Standard Time, [EMAIL PROTECTED] writes: Wait a minute. You want to insert this: ISOMORPHIC BOWL-OF-WRATH SPANNER WITH JEWS HARP ACCOMPANIMENT. Can't you use instead the existing American equivalent? "ISOMORPHIC BOWL-OF-WRATH

Re: Introducing uniengine technology

2001-02-26 Thread Curtis Clark
At 07:19 AM 2/26/01, [EMAIL PROTECTED] wrote: On 02/25/2001 05:16:08 AM "William Overington" wrote: I am researching a concept that I am hoping to call a uniengine... Not happy with existing embedding technologies? Your codes would effectively amount to binary data contained within document.

Re: Question on Unicode data files

2001-02-26 Thread Richard Cook
"John H. Jenkins" wrote: At 7:57 AM -0800 2/26/01, Richard Zhang wrote: Hello, Marco, Unihan is the official site I think. You can visit www.unihan.com.cn for more information about this, if you know Chinese :). Knowing Chinese is not enough. You and your browser need to know Simplified

Re: An Absurdly Brief Introduction to Unicode (was Re: Perception ...)

2001-02-26 Thread Kenneth Whistler
Peter said: As I indicated above, I think that there is a non-vacuous notion that merits a specific term for the purposes of discussion, and that that notion is the one I have been assuming up to now. And that is (abstract character)1, as I clarified earlier. I agree with you, Peter, that

Random behaviour with Netscape/IE 5.0 for UTF8 support....Help!

2001-02-26 Thread thejokrishna
Hi all, I am relatively new to this list and hence this might be a FAQ. But I couldn't find a valid answer to my problem. My System architecture: 1: A Web application, written using WebObjects 4.5 and Java ( JDK 1.1.6) on Mac OS X server 2: Oracle 8i DB

Re: Klingon silliness

2001-02-26 Thread G. Adam Stanislav
At 16:09 26-02-2001 -0800, David Starner wrote: Bah. Life requires compromise. There are many people working on Unicode, each with their own reasons. To stop working on Unicode because someone else finds something a cool idea that you don't is absurd, especially when that cool idea is going