Re: Synthetic scripts

2002-03-18 Thread Sampo Syreeni
On Sun, 17 Mar 2002, Andy Heninger wrote: Tighten up the definition of an artificially constructed language to be one that has never had native speakers, and you're there. According to what I've heard, you have just thrown out both Esperanto and, believe it or not, Klingon -- linguists do funky

RE: Synthetic scripts (was: Re: Private Use Agreements and Unapproved Characters)

2002-03-18 Thread Sampo Syreeni
On Sun, 17 Mar 2002, Asmus Freytag wrote: Like all organizations, neither Unicode nor ISO have infinite resources. Of course. I actually think both the Unicode Consortium and the ISO are doing a fine job. The point was, if there was a problem prioritization could solve, it still wouldn't be the

RE: Collation - last character?

2002-03-18 Thread Lars Kristan
Markus Scherer wrote: How about U+10FFFF? It is a non-character, which gives it a high (unassigned character) weight in the UCA. It is the highest code point = the last character. That is definitely not what I was looking for. It is an illegal codepoint, while I was looking for a legal

RE: Missing values in mapping-tables?

2002-03-18 Thread Lars Kristan
Again - 'invalid data' and 'garbage'. Because you're thinking old data with old definition. How about new data and old software? Your approach means that if a new character is defined in say ISO 8859-8, then all old software should report it as error. And all users must upgrade. When (and if!)

RE: Synthetic scripts (was: Re: Private Use Agreements and Unapproved Characters)

2002-03-18 Thread Marco Cimarosti
Jungshik Shin wrote: [Dan Kogai] Dan the Man whose Name was Compromised by the Japanese government (*) (*) My parents wanted me to name me 彈 (U+5F48), a classical form, but it was not listed on the table of Kanjis allowed for names so I was named U+5F3E. Frankly speaking, I

31 Angry Watanabes (or the Itaiji problem)

2002-03-18 Thread Dan Kogai
I've changed the Subject: header because this thread is diverging. On Saturday, March 16, 2002, at 11:43 , Thomas Chan wrote: This particular case in a Chinese context wouldn't be respected. One of the strongest taboo in business correspondence in Asia is to misspell names. (Thanks to |

Re: 31 Angry Watanabes (or the Itaiji problem)

2002-03-18 Thread Thomas Chan
On Mon, 18 Mar 2002, Dan Kogai wrote: However, if one is to pick over little details, then I still don't know what U+5F3E is (in the context of Dan's name)--does the upper right corner have two or three strokes? Three. That's the only official 'Dan' with 'Bow' and 'Single'

RE: Synthetic scripts (was: Re: Private Use Agreements and Unapproved Characters)

2002-03-18 Thread Suzanne M. Topping
-Original Message- From: Dan Kogai [mailto:[EMAIL PROTECTED]] As Kato pointed out, Unicode is more pro-programmers than pro-users. This is true of any character set. Users are not at all concerned with how their script is stored. Most would prefer to never know about, hear about,

Re: 31 Angry Watanabes (or the Itaiji problem)

2002-03-18 Thread John H. Jenkins
On Monday, March 18, 2002, at 03:54 AM, Dan Kogai wrote: In other words, neither Unicode nor any portable charset to date can be used just to issue driver's license, much less exchange legal documents electrically. This is a serious obstacle to digitize government but never discussed

Re: Synthetic scripts

2002-03-18 Thread James E. Agenbroad
On Sun, 17 Mar 2002, Miikka-Markus Alhonen wrote: On 17-Mar-02 Curtis Clark wrote: At 04:45 PM 3/16/02, Doug Ewell wrote: But right away that definition includes not only Shavian, Tengwar, Cirth, Klingon, and most of the contents of ConScript, but also Ethiopic, Cherokee, Canadian

Re: Synthetic scripts (was: Re: Private Use Agreements and Unapproved Characters)

2002-03-18 Thread James E. Agenbroad
On Fri, 15 Mar 2002, Kenneth Whistler wrote: Dan Kogai continued: [snip] His favorite appears to be ISO-2022 but as Yet Another Perl Encoding Hacker, ISO-2022 is pain in the arse You got that right! --Ken Monday, March

ISO 2022 (was: Re: Synthetic scripts (was: wandering off topic) )

2002-03-18 Thread Kenneth Whistler
Jim Agenbroad asked: Monday, March 18, 2002 Is ISO 2022 a character set (characters with their codes) or a complex (painful?) means to announce and negotiate among various sets? I thought it was the latter; am I missing something? ISO 2022 is a

RE: Collation - last character?

2002-03-18 Thread Kenneth Whistler
Lars Kristan responded: Markus Scherer wrote: How about U+10FFFF? It is a non-character, which gives it a high (unassigned character) weight in the UCA. It is the highest code point = the last character. That is definitely not what I was looking for. It is an illegal codepoint,

Re: Synthetic scripts

2002-03-18 Thread Timothy Partridge
Doug Ewell recently said: The closest I can come is something like a script that was invented, generally by one person and in a relatively short period of time, rather than evolving from existing scripts in a gradual and progressive manner. But right away that definition includes not only

Atama-ga-itai-ji

2002-03-18 Thread ろ〇〇〇〇 ろ〇〇〇
Yes, just get the ruddy things into Unicode. Or else use Plan B: Plan B is, the only valid characters in a personal name are kana.

Montagnard / Vietnam Highlands characters

2002-03-18 Thread Jerome Hodges IV
I'm currently typesetting a book written in Jarai (var. Jrai, J'rai), a tribal language used in the highlands of Vietnam; besides using characters already accounted for in the Vietnamese script, the written Jarai language uses several characters that are, to my knowledge, unique to it (and a

Re: Synthetic scripts

2002-03-18 Thread Asmus Freytag
At 07:11 PM 3/17/02 -0800, Doug Ewell wrote: The myth I was trying to communicate was that the process is totally serial, such that if 3 weeks are spent on getting Tai Le encoded, CJK Extension X is pushed back by 3 weeks. Stated this way, it's of course overstated. You pointed out the

Re: Montagnard / Vietnam Highlands characters

2002-03-18 Thread Kenneth Whistler
Jerome Hodges asked: I'm currently typesetting a book written in Jarai (var. Jrai, J'rai), a tribal language used in the highlands of Vietnam; besides using characters already accounted for in the Vietnamese script, the written Jarai language uses several characters that are, to my

Re: Synthetic scripts

2002-03-18 Thread Kenneth Whistler
John Jenkins wrote: Basically, the place where I personally would draw the line is between having a body of people (size left vague) who want to interchange data in the script, or if there is a historic body of literature in the script. I find myself very much in sympathy with this

Re: Montagnard / Vietnam Highlands characters

2002-03-18 Thread J Do
Hi Jerome, I'm currently typesetting a book written in Jarai (var. Jrai, J'rai), a tribal language used in the highlands of Vietnam; besides using characters already accounted for in the Vietnamese script, the written Jarai language uses several characters that are, to my knowledge, unique

RE: Missing values in mapping-tables?

2002-03-18 Thread Kenneth Whistler
Lars Kristan suggested: OK, another way of looking at all this. I believe you would accept three options: A - Reject the stream. B - Drop the invalid data. If you were defining an application concerned with security, and if you had a clearly defined conversion you were performing, yes these

RE: Missing values in mapping-tables?

2002-03-18 Thread Kenneth Whistler
And it seems to have overlooked the fact that not all conversions are defined on multi-byte character encodings to Unicode. Grr. What I meant of course was: And it seems to have overlooked the fact that not all conversions are defined on single-byte character encodings to Unicode. --Ken

Avestan and Old Persian (was: Re: Private Use Agreements ... wandering off-topic)

2002-03-18 Thread Kenneth Whistler
Vladimir Ivanov noted: Old Persian and Avestan are closely related ancient languages that usually go side by side. If a linguist refers to an Old Persian example, he must show its Avestan form or his work would be considered to be incomplete ... [ lots of good information followed ] What

Re: 頭が痛い字

2002-03-18 Thread ろ〇〇〇〇 ろ〇〇〇
Plan B is, the only valid characters in a personal name are kana. I meant, *in Japan*. Stefan

Re: Synthetic scripts

2002-03-18 Thread ろ〇〇〇〇 ろ〇〇〇
John Jenkins wrote: Basically, the place where I personally would draw the line is between having a body of people (size left vague) who want to interchange data in the script, or if there is a historic body of literature in the script. So the script of the Codex Seraphinianus would NOT

Re: Synthetic scripts

2002-03-18 Thread Doug Ewell
Timothy Partridge [EMAIL PROTECTED] wrote: If I went to a community whose language doesn't have a written form and convinced them that Tengwar would be an ideal way of recording their culture, would that make Tengwar more legitimate? Or cause people to regard it as a higher priority? Yes.

Re: Avestan and Old Persian (was: Re: Private Use Agreements ... wandering off-topic)

2002-03-18 Thread Doug Ewell
Kenneth Whistler [EMAIL PROTECTED] wrote: The proposals will likely languish until Michael Everson discovers he has some free time on his hands to pursue consensus with academic Iranianists and other interested parties, or until someone from that community emerges as a champion to push the

Re: Montagnard / Vietnam Highlands characters

2002-03-18 Thread Doug Ewell
Kenneth Whistler [EMAIL PROTECTED] wrote: b-stroke: 0180 ~ 0180 id. ... (all in both lower- and upper-case variants). Substitute out the uppercase for the relevant base characters, and you have it. There's a problem, though. There is no uppercase form of U+0180

Re: Private Use Agreements and Unapproved Characters

2002-03-18 Thread Doug Ewell
Sorry for the belated response to this. I hope it is still relevant. Patrick T. Rourke [EMAIL PROTECTED] wrote: I would think you could simply use the version number of the Unicode Standard. For example, the use of Tagalog would have been conformant to this proposed PUA registry until

Re: Private Use Agreements and Unapproved Characters

2002-03-18 Thread David Starner
On Mon, Mar 18, 2002 at 08:59:15PM -0800, Doug Ewell wrote: You are not going to find many fonts on the Web that contain PUA characters. There are a few Shavian fonts using the ConScript PUA encoding. -- David Starner - [EMAIL PROTECTED] It's not a habit; it's cool; I feel alive. If you

Ladino Transliteration Of Hebrew Letters

2002-03-18 Thread Robert
Hello, Unicoders!! About the transliteration of the Hebrew letters for the Ladino (Judeo-Spanish, uemo) language, an acceptable system for that is one used by Padre (=Father) Pascal Recuero (which looks Esperanto-like, as can be seen just below): alef—' (apostrophe) beth-daghesh—b