Re: Unicode-based Cyrillic-Latin transliteration table

2001-05-29 Thread DougEwell2
In a message dated 2001-05-28 9:11:44 Pacific Daylight Time, [EMAIL PROTECTED] writes: I fear you have undertaken something hopeless. One could transliterate U+0429 as SHCH or S^C^ or any number of other things, but that is only appropriate for Russian. In Bulgarian, the only natural
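
The point about U+0429 can be sketched as a lookup that depends on the language as well as the character. This is only an illustration: the table name, the function name and the language codes are hypothetical, and the Bulgarian value "SHT" is an assumption supplied here, since the message is cut off before it names one.

```python
# Hypothetical per-language tables (illustrative only, not from the thread).
TRANSLIT = {
    "ru": {"\u0429": "SHCH"},  # Russian reading of CYRILLIC CAPITAL LETTER SHCHA
    "bg": {"\u0429": "SHT"},   # assumed Bulgarian reading of the same letter
}


def transliterate(text, lang):
    """Replace characters listed in the table for `lang`; pass others through."""
    table = TRANSLIT[lang]
    return "".join(table.get(ch, ch) for ch in text)


print(transliterate("\u0429", "ru"))
print(transliterate("\u0429", "bg"))
```

A single script-wide table cannot express this, which is presumably the difficulty the message is pointing at.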

Re: Unicode-based Cyrillic-Latin transliteration table

2001-05-29 Thread DougEwell2
I apologize for sending the previous message three times. My e-mail client told me the first two attempts had been unsuccessful. -Doug Ewell Fullerton, California

RE: Unicode-based Cyrillic-Latin transliteration table

2001-05-29 Thread James Williams
Can someone please help me understand whether support for double byte is the same as being Unicode compliant. Any elaboration would be greatly appreciated. If for instance, being Unicode compliant has any additional value/benefits, etc... I'd like to understand how, why! Thanks, Jim Williams

Unicode compliance (was RE: Unicode-based Cyrillic-Latin transliteration table)

2001-05-29 Thread Marco Cimarosti
James Williams wrote: Can someone please help me understand whether support for double byte is the same as being Unicode compliant. No. Double byte normally refers to the national character sets used in China, Japan and Korea (much older than Unicode). As these languages require thousands of
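
A small sketch of the distinction, assuming Shift_JIS as a representative "double byte" national character set (that choice, and the helper name `encoded_lengths`, are assumptions for illustration). It compares how many bytes the same two characters take in a legacy encoding and in two Unicode encoding forms.

```python
def encoded_lengths(text, codecs=("shift_jis", "utf-16-be", "utf-8")):
    """Return the encoded byte length of `text` under each codec."""
    return {codec: len(text.encode(codec)) for codec in codecs}


# LATIN CAPITAL LETTER A followed by the Han character U+65E5.
sample = "A\u65e5"
for codec, length in encoded_lengths(sample).items():
    print(codec, length, sample.encode(codec).hex())
```

In the legacy encoding the Latin letter takes one byte and the Han character two, so "double byte" describes a byte-level storage scheme for one character set, not support for the Unicode repertoire.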

Re: Unicode-based Cyrillic-Latin transliteration table

2001-05-29 Thread Peter_Constable
On 05/28/2001 05:30:15 AM Doug Ewell wrote: I know that neither UTC nor WG2 engages in the very controversial business of assigning canonical transliterations between scripts No, but ISO TC46/SC2 does. http://www.elot.gr/tc46sc2/ The goal is to improve an existing program I wrote which

Term Asian is not used properly on Computers and NET

2001-05-29 Thread N.R.Liwal
Dear Unicoders: While surfing the net, a link with the word ASIAN most of the time leads to a Chinese, Japanese or Korean site; is that not confusing? Because there are many nations and countries in Asia! But today I was more confused, when I opened the Microsoft Word XP FONT dialog, it has three Font

Re: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

2001-05-29 Thread Peter_Constable
On 05/27/2001 08:03:37 PM Jianping Yang wrote: But it seems to me that we've lived without Premise B in the past, and that it won't benefit us to adopt it now. Why bother with it? Why not continue doing what we already know how to do? As a matter of fact, the surrogate or supplementary

RE: Unicode-based Cyrillic-Latin transliteration table

2001-05-29 Thread Peter_Constable
On 05/29/2001 02:02:36 AM James Williams wrote: Can someone please help me understand whether support for double byte is the same as being Unicode compliant. No. Any elaboration would be greatly appreciated. Oh, you'd like an explanation? :-) Double byte refers to a variety of legacy

Re: Term Asian is not used properly on Computers and NET

2001-05-29 Thread Peter_Constable
On 05/29/2001 05:12:39 PM N.R.Liwal wrote: I think Calling CJK specifically Asian is not appropriate nor helpful, because Asia is big and has hundreds of languages and scripts, either all Asian Script i.e. Arabic, Hebrew, Devanagari, Bengali, Thai etc. should be called Asian

Braille vs Bidi

2001-05-29 Thread Roozbeh Pournader
Why are the Braille characters classified as Other Neutrals regarding bidi? Shouldn't they be Left-to-right? Does any Right-to-Left Braille exist anywhere in the world? --roozbeh

Re: Genesis v. UDHR?

2001-05-29 Thread Frank da Cruz
Trying to translate an English sentence often causes problems. Does hurt mean 1. Injure 2. Cause pain to 3. Both? I believe the intention of the sentence I can eat glass and it doesn't hurt me is to convey the idea that the speaker is... eccentric, which would characterize someone who

Home of Yiddish (was Re: A Europe of fonts)

2001-05-29 Thread Edward Cherlin
At 4:39 AM -0700 5/25/01, [EMAIL PROTECTED] wrote: I thought that Yiddish was a language without a home. Although Yiddish is one of the best examples of a language without an army or navy, it is a dialect of Old High German. It was spoken everywhere that German was,

UTF-32s

2001-05-29 Thread Antoine Leca
Billancourt, le 1er avril 2001, I was thinking about this while reading the thread about UTF-8s. If the binary order of UTF-16 is of so prime interest that the (numerous) users of UTF-8 should slightly modify their code to co-operate with UTF-16-based database engines, by accepting UTF-8s rather

RE: Braille vs Bidi

2001-05-29 Thread Marco Cimarosti
Roozbeh Pournader wrote: Does any Right-to-Left Braille exist anywhere in the world? I know that Hebrew Braille is left-to-right. _ Marco

Re: Term Asian is not used properly on Computers and NET

2001-05-29 Thread DougEwell2
In a message dated 2001-05-29 7:10:48 Pacific Daylight Time, [EMAIL PROTECTED] writes: I think Calling CJK specifically Asian is not appropriate nor helpful, because Asia is big and has hundreds of languages and scripts... This is certainly a valid point. Far East or East Asian

RE: Term Asian is not used properly on Computers and NET

2001-05-29 Thread Marco Cimarosti
Doug Ewell wrote: Peter has an excellent solution -- much better than trying to explain the term CJK to ordinary people -- and I plan to use the term East Asian in the future. But, if by East Asian you mean languages written with Han ideographs, you fall in another pitfall, because

Re: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

2001-05-29 Thread Kenneth Whistler
Doug wrote: UTF-8 and UTF-32 should absolutely not be similarly hacked to maintain some sort of bizarre compatibility with the binary sorting order of UTF-16. UTC should not, and almost certainly will not, endorse such a proposal on the part of the database vendors. I would be loath
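
The sorting difference at issue can be reproduced with two code points. This is a minimal sketch; the helper name `binary_order` and the choice of U+E000 and U+10000 as test characters are assumptions made for illustration.

```python
def binary_order(strings, codec):
    """Sort strings by the raw bytes of their encoding in `codec`."""
    return sorted(strings, key=lambda s: s.encode(codec))


bmp = "\ue000"       # a BMP code point above the surrogate range
supp = "\U00010000"  # the first supplementary code point

for codec in ("utf-8", "utf-16-be", "utf-32-be"):
    order = binary_order([bmp, supp], codec)
    print(codec, [f"U+{ord(ch):04X}" for ch in order])
```

UTF-8 and UTF-32 keep code point order, so U+E000 sorts first. In UTF-16, U+10000 is encoded as the surrogate pair D800 DC00, which compares below E000, so the order flips. That flip is the behaviour the proposed variants would copy into the other encoding forms.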

RE: Term Asian is not used properly on Computers and NET

2001-05-29 Thread Ayers, Mike
From: Marco Cimarosti [mailto:[EMAIL PROTECTED]] Doug Ewell wrote: Peter has an excellent solution -- much better than trying to explain the term CJK to ordinary people -- and I plan to use the term East Asian in the future. But, if by East Asian you mean languages written with

Re: Braille vs Bidi

2001-05-29 Thread Kenneth Whistler
Roozbeh asked: Why are the Braille characters classified as Other Neutrals regarding bidi? Because they were all given a general category of So (Symbol Other), and the default bidi property for So is ON: 2801;BRAILLE PATTERN DOTS-1;So;0;ON;N; No one spoke out for any
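
The quoted data line can be read field by field. The sketch below parses only the first five semicolon-separated fields, since the record appears abbreviated in the message; the function name and dictionary keys are hypothetical labels, not names from the data file.

```python
# The record exactly as quoted in the message (it appears to be abbreviated).
RECORD = "2801;BRAILLE PATTERN DOTS-1;So;0;ON;N;"


def parse_record(line):
    """Interpret the first five fields of a UnicodeData-style record."""
    fields = line.split(";")
    return {
        "code_point": int(fields[0], 16),
        "name": fields[1],
        "general_category": fields[2],     # So = Symbol, Other
        "combining_class": int(fields[3]),
        "bidi_class": fields[4],           # ON = Other Neutral
    }


print(parse_record(RECORD))
```

The fifth field is the bidi class Roozbeh asked about. Values reported by a current Unicode library may differ from this 2001 record.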

RE: Term Asian is not used properly on Computers and NET

2001-05-29 Thread Marco Cimarosti
Someone (clearly having Chinese roots) wrote me privately: But, if by East Asian you mean languages written with Han ideographs, you fall in another pitfall, because Mongolian, Russian, Vietnamese and many I don't think so, Mongolia is not in East Asia, it's in North Asia. Russia

Re: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

2001-05-29 Thread Jianping Yang
Antoine Leca wrote: Jianping Yang wrote: As a matter of fact, the surrogate or supplementary character was not defined in the past, How long is the past? I remember reading about these surrogates the first time I put my hands on a draft copy of ISO 10646. It was nearly six years ago.

RE: Term Asian is not used properly on Computers and NET

2001-05-29 Thread Peter_Constable
On 05/29/2001 12:55:19 PM Marco Cimarosti wrote: But, if by East Asian you mean languages written with Han ideographs, you fall in another pitfall, because Mongolian, Russian, Vietnamese and many other languages spoken in East Asia aren't accounted for. At least in academic contexts, or at

RE: Term Asian is not used properly on Computers and NET

2001-05-29 Thread Thomas Chan
On Tue, 29 May 2001, Marco Cimarosti wrote: Doug Ewell wrote: Peter has an excellent solution -- much better than trying to explain the term CJK to ordinary people -- and I plan to use the term East Asian in the future. But, if by East Asian you mean languages written with Han

UTF-64 [warning: contains bits bytes humor] (was RE: [OT] bits and bytes)

2001-05-29 Thread Marco Cimarosti
I originally thought this could be a way of storing Unicode text in databases. However, after some thinking, I decided that idea was completely bogus, so I thought to turn it into a joke for geeks. But it wasn't even amusing, so it went in the Deleted Items folder. However, I see that illogical ideas

Programming language identifier normalization/casing

2001-05-29 Thread Achim Ruopp
Unicode 3.1 Technical report #15, Annex 7 (http://www.unicode.org/unicode/reports/tr15/#Programming_Language_Identifiers) contains the following remark: Generally if the programming language has case-sensitive identifiers then Normalization Form C may be used, while if the programming language
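
A rough sketch of the remark, using Python's standard `unicodedata` module. The function name `identifier_key` is hypothetical, and combining `casefold()` with NFKC is a simplifying assumption here rather than anything the report prescribes.

```python
import unicodedata


def identifier_key(name, case_sensitive):
    """Return a comparison key for an identifier.

    Case-sensitive languages: Normalization Form C only.
    Case-insensitive languages: Form KC plus case folding (assumed approach).
    """
    if case_sensitive:
        return unicodedata.normalize("NFC", name)
    folded = unicodedata.normalize("NFKC", name).casefold()
    return unicodedata.normalize("NFKC", folded)


# "e" + COMBINING ACUTE ACCENT composes to U+00E9 under NFC.
print(identifier_key("e\u0301", case_sensitive=True) == "\u00e9")
# NFC keeps the LATIN SMALL LIGATURE FI; NFKC plus folding turns it into "fi".
print(identifier_key("\ufb01le", case_sensitive=True) == "\ufb01le")
print(identifier_key("\ufb01LE", case_sensitive=False) == "file")
```

Form KC erases compatibility distinctions such as ligatures, which is why it pairs naturally with identifiers that already ignore case.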

Re: UTF-32s

2001-05-29 Thread Rick McGowan
So I suggest to correct the problem before it came out. And I would like to propose UTF-32s. I think this has been anticipated, I think by some people who proposed UTF-8S. My opinion, for what it's worth, is that there should be no new formats. We have too many of them already, and making

RE: Term Asian is not used properly on Computers and NET

2001-05-29 Thread Jungshik Shin
On Tue, 29 May 2001, Marco Cimarosti wrote: Doug Ewell wrote: Peter has an excellent solution -- much better than trying to explain the term CJK to ordinary people -- and I plan to use the term East Asian in the future. But, if by East Asian you mean languages written with Han

RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

2001-05-29 Thread Carl W. Brown
Ken, UTF-8s is essentially a way to ignore surrogate processing. It allows a company to encode UTF-16 with UCS-2 logic. The problem is that by not implementing surrogate support you can introduce subtle errors. For example it is common to break buffers apart into segments. These segments may
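
The buffer-splitting hazard can be shown with one supplementary character. This is a sketch under stated assumptions: both function names are hypothetical, the data is UTF-16 little-endian, and the "safe" variant handles only the single case of a high surrogate just before the cut.

```python
def split_code_units(data, unit_index):
    """Split UTF-16-LE bytes after `unit_index` 16-bit code units (UCS-2 logic)."""
    cut = unit_index * 2
    return data[:cut], data[cut:]


def safe_split(data, unit_index):
    """Same split, but back up one unit if it would separate a surrogate pair."""
    cut = unit_index * 2
    if 0 < cut < len(data):
        prev_unit = int.from_bytes(data[cut - 2:cut], "little")
        if 0xD800 <= prev_unit <= 0xDBFF:  # high surrogate right before the cut
            cut -= 2
    return data[:cut], data[cut:]


# Code units: 0061 D801 DC00 0062 (U+10400 needs a surrogate pair).
data = "a\U00010400b".encode("utf-16-le")

head, tail = split_code_units(data, 2)
try:
    head.decode("utf-16-le")
except UnicodeDecodeError as exc:
    print("naive split broke a pair:", exc.reason)

head, tail = safe_split(data, 2)
print(len(head.decode("utf-16-le")), len(tail.decode("utf-16-le")))
```

Code that counts 16-bit units without checking for surrogates produces the first result: two segments that are each invalid on their own.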

RE: Term Asian is not used properly on Computers and NET

2001-05-29 Thread Jungshik Shin
On Tue, 29 May 2001, Jungshik Shin wrote: On Tue, 29 May 2001, Marco Cimarosti wrote: Doug Ewell wrote: Peter has an excellent solution -- much better than trying to explain the term CJK to ordinary people -- and I plan to use the term East Asian in the future. you fall in

RE: Term Asian is not used properly on Computers and NET

2001-05-29 Thread Kenneth Whistler
Thomas Chan wrote: There are many pitfalls. Does the definition exclude Korean when written solely in Hangul? Is Vietnamese clearly East Asian? How about Yi (TUS3.0 thinks so)? Whoa, wait a minute. Let's not extrapolate too much from some pragmatic decisions that were taken to divide up

Re: Programming language identifier normalization/casing

2001-05-29 Thread Peter_Constable
On 05/29/2001 02:46:48 PM Achim Ruopp wrote: Generally if the programming language has case-sensitive identifiers then Normalization Form C may be used, while if the programming language has case-insensitive identifiers then Normalization Form KC may be more appropriate. If I'm not mistaken

RE: Term Asian is not used properly on Computers and NET

2001-05-29 Thread Peter_Constable
On 05/29/2001 02:37:55 PM Thomas Chan wrote: I think what one wants is something like languages usually and currently possibly including Han characters in their written form. That frees us from worrying about historical or aberrant cases, I think. Folks, this discussion was about how to label

Re: Question about UTR#24

2001-05-29 Thread Kenneth Whistler
Marco asked: I have a question about the file http://www.unicode.org/Public/UNIDATA/Scripts.txt, the data file for UTR#24 (Script Names). I see that script-specific combining characters are normally assigned to that script. However, a few of them are in the INHERITED class: Are these

RE: Term Asian is not used properly on Computers and NET

2001-05-29 Thread Thomas Chan
On Tue, 29 May 2001 [EMAIL PROTECTED] wrote: On 05/29/2001 02:37:55 PM Thomas Chan wrote: I think what one wants is something like languages usually and currently possibly including Han characters in their written form. That frees us from worrying about historical or aberrant cases, I

RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

2001-05-29 Thread Kenneth Whistler
Carl, Ken, UTF-8s is essentially a way to ignore surrogate processing. It allows a company to encode UTF-16 with UCS-2 logic. The problem is that by not implementing surrogate support you can introduce subtle errors. For example it is common to break buffers apart into segments.

RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

2001-05-29 Thread Carl W. Brown
Ken, I suspect that Oracle is specifically pushing for this standard because of its unique database design. In a sense Oracle almost picks itself up by its own bootstraps. It has always tried to minimize actual code. Therefore it was a natural choice to implement Unicode with UTF-8 because

RE: Term Asian is not used properly on Computers and NET

2001-05-29 Thread てんどうりゅうじ
So say "Han font" or "Hanzi font". ★じゅういっちゃん★ EKYWY TXLY NPZ P MPVD XPHYV LPWWQY NKT ZPN XT WYPZTX PE PMM ET HPWWD "EYX EKTSZPXV'Z HTWY GSX P XSHOYW EKPX TXY PXV LTHHQEHYXE, ET HY, QZ RSQEY ZLPWD"

RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

2001-05-29 Thread てんどうりゅうじ
You can just say Screw the number 8, let's use 21-bit bytes. ★じゅういっちゃん★ EKYWY TXLY NPZ P MPVD XPHYV LPWWQY NKT ZPN XT WYPZTX PE PMM ET HPWWD "EYX EKTSZPXV'Z HTWY GSX P XSHOYW EKPX TXY PXV LTHHQEHYXE, ET HY, QZ RSQEY ZLPWD" --- Original Message --- From: "Carl W. Brown"

Re: Term Asian is not used properly on Computers and NET

2001-05-29 Thread David Gallardo
Actually it would be more accurate to say that geographic expressions involving cardinal points without an _explicit_ point of reference are biased, because they traditionally assume that Europe is the _implicit_ point of reference. Hence, Far East, Orient, Near East (or Middle East) are biased

Re: Term Asian is not used properly on Computers and NET

2001-05-29 Thread John Cowan
David Gallardo scripsit: Actually it would be more accurate to say that geographic expressions involving cardinal points without an _explicit_ point of reference are biased, because they traditionally assume that Europe is the _implicit_ point of reference. Hence, Far East, Orient, Near

Re: Term Asian is not used properly on Computers and NET

2001-05-29 Thread David Gallardo
Please excuse the unintended querulousness, but isn't the Greenwich meridian merely the reification of this bias? The Greenwich meridian division was established in 1884 by representatives from 25 countries, mostly from Europe and the Americas. Though there were, notably, representatives from

Re: Term Asian is not used properly on Computers and NET

2001-05-29 Thread John Cowan
David Gallardo scripsit: Please excuse the unintended querulousness, but isn't the Greenwich meridian merely the reification of this bias? Sure. Ditto the Gregorian calendar, and the decimal digit system, and many other international standards. But they *are* standards. -- John Cowan