Re: Acronyms

2000-07-11 Thread Michael Everson
At 14:56 -0800 2000-07-10, Christopher J. Fynn wrote: "Antoine Leca" [EMAIL PROTECTED] wrote: Also, SMP is intended "for scripts and symbols" in English, and in French « pour caractères et symboles » ("for characters and symbols"), a slightly different thing... Recte « pour écritures et

RE: Acronyms

2000-07-11 Thread Michael Everson
From: Antoine Leca [mailto:[EMAIL PROTECTED]] Which appears to me as slightly wrong, because the acronym for UCS in French ought to be JUC, as in the French title of part 1, or even better J.U.C. I hope this is not yet engraved in hard stone and that it will be corrected before it becomes

RE: What is this case folding?

2000-07-11 Thread Marco . Cimarosti
Robert Lozyniak wrote: If it is what I think it is, I don't want it in English. How could it tell "aids" from "AIDS", for instance? Or "joy" from "Joy"(name)? (C'mon, 11BB, you were supposed to know this one ;-) Case folding (or case conversion) is the process of changing letters from one
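To make the concern in the quoted question concrete, here is a minimal Python sketch of case folding using the standard `str.casefold()` method. The sample words are the ones from the message; the helper name `same_ignoring_case` is an assumption chosen for illustration.

```python
def same_ignoring_case(a: str, b: str) -> bool:
    """Compare two strings after case folding both of them."""
    return a.casefold() == b.casefold()


# Folding erases exactly the distinctions the poster is worried about.
print(same_ignoring_case("aids", "AIDS"))  # True
print(same_ignoring_case("joy", "Joy"))    # True

# Folding yields a new string used for matching; the original text
# is not modified and can still be stored and displayed as written.
original = "AIDS"
print(original.casefold(), original)       # aids AIDS
```

Folding is normally applied only to a comparison key, so the stored text keeps its original case.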

Re: Bug in TR 19, and fancy HTML in TR's

2000-07-11 Thread Otto Stolz
At 00:05 UTC on 2000-07-11, Erik van der Poel wrote: Or the document.write can be removed from TR22, but I don't know whether that workaround is acceptable to the authors. Or, the construct <pre>&lt;?xml version=&quot;1.0&quot; ...?&gt; &lt;!DOCTYPE characterMapping SYSTEM

RE: Acronyms

2000-07-11 Thread John Cowan
On Mon, 10 Jul 2000, Jonathan Rosenne wrote: What about ISO - The International Organization for Standardization? It is common practice sometimes to keep the acronyms although they are from another language. Or in this case from no language: ISO is not the acronym of the full name in any

Traditional Chinese Simplified Chinese

2000-07-11 Thread Tony Yuen
Does anyone know whether UTF-8 is a good choice of default charset for a website that mainly uses Traditional Chinese, Simplified Chinese, and English, running on a Red Hat Linux server with a MySQL database?

Re: Han character names?

2000-07-11 Thread Jon Babcock
Whichever one you pick, some ideographs have multiple pronunciations, and a lot have no pronunciation. I wonder if someone could point to just one Chinese graph in the Unicode CJK Unified "Ideographs" that has no documented pronunciation. I didn't know such critters actually existed.

Re: Acronyms

2000-07-11 Thread Patrick Andries
- Original Message - From: "Jonathan Rosenne" [EMAIL PROTECTED] What about ISO - The International Organization for Standardization? It is common practice sometimes to keep the acronyms although they are from another language. Well, ISO apparently is not an acronym but a reference to

ISO 233 and Unicode

2000-07-11 Thread Graeme E. Coutts
Does anyone have experience with implementing ISO 233 - Transliteration of Arabic Characters into Latin Characters? The standard omits any references to the Unicode values to use for the transliterated characters. In certain instances the clarity (typesetting) of the document leaves the

Re: Bestsellers. Previously: Difference between EM QUAD and EM

2000-07-11 Thread Curtis Clark
At 06:59 AM 00.07.10 -0800, [EMAIL PROTECTED] wrote: make us all rich. What about: "The Dark Underside of Unicode"? Harry Potter and the Code of Mystery. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Biological Sciences Department Voice: (909) 869-4062

Re: Han character names?

2000-07-11 Thread Jon Babcock
John H. Jenkins wrote: That was my point. (And in any event, since the names have to be unique, it would be hard to use pronunciation for code point names and give every ideograph a unique name.) I totally agree with this. I think the Unicode way of naming the Han characters is the only

Re: Bug in TR 19, and fancy HTML in TR's

2000-07-11 Thread Doug Ewell
Thanks to Netscapers Katsuhiko Momoi and Erik van der Poel for helping untangle this for me. It turns out that there is a bug in Navigator 4.06 that operates as Erik described. So the problem is Netscape's, but OTOH it *is* related to the use of JavaScript in TR's which I originally questioned.

Re: Irish case folding

2000-07-11 Thread Michael Everson
At 14:10 -0800 2000-07-10, [EMAIL PROTECTED] wrote: One definitely has to be careful with case here, e.g. "Francach" (Frenchman) is definitely not the same as a title-cased "francach" (rat) - the potential for diplomatic difficulties is enormous :-) I thought that the words are the same.

RE: Not all Arabics are created equal...

2000-07-11 Thread Roozbeh Pournader
On Mon, 10 Jul 2000, John Cowan wrote: No, what I am trying to nail down is whether the LSD is represented first in the Unicode datastream or last. European digits are represented with the MSD first and the LSD last. What's the story with Arabic digits? Without knowing that, I can't

RE: FW: Unicode to UTF-8

2000-07-11 Thread Marco . Cimarosti
Mark Davis wrote: Joe, try http://www.macchiato.com/unicode/charts.html. [...] Or if you are typing in the UTF-8 and going to UTF-16 or UTF-32, you can try http://www.macchiato.com/mark/UnicodeConverter. [...] Or, as a last resort, use this cute manual converter: - * - * - * - * - * - The
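The manual converter itself is cut off in this snippet, so it is not reproduced here. As a stand-in, the following is a minimal Python sketch of encoding one code point to UTF-8 by hand, following the bit layout of UTF-8 as it is defined today (1 to 4 bytes, up to U+10FFFF). The function name `utf8_encode` is an assumption for illustration and is not taken from the message.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 by bit manipulation."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp < 0x80:
        # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


print(utf8_encode(0x20AC).hex())  # e282ac
```

The result can be cross-checked against Python's built-in codec with `chr(cp).encode("utf-8")`.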

Re: Names of planes, and request for sneak preview

2000-07-11 Thread Mark Davis
We haven't used the notion of Planes and Groups. These actually derived, as far as I can remember from early days in L2, from later-discarded mechanisms that would let you swap in planes into the BMP. Thus it was important to distinguish these levels. Planes and Groups are themselves not

RE: How-To handle i18n when you don't know charset?

2000-07-11 Thread Alan Wood
Mike Brown kindly supplied some JavaScript to determine the current and default encoding for Internet Explorer 4+. This gives some interesting results for default encoding: Mac IE 4.5: utf-8; Mac IE 5: utf-8; Win IE 5.01: x-user-defined; Win IE 5.01 SP1: big5. Would anyone from

Re: Han character names?

2000-07-11 Thread John Cowan
Jon Babcock wrote: I was interested in seeing an example of a Han graph that has no documented pronunciation because I was under the impression that such a graph doesn't/cannot exist. Aren't there characters used on the tortoise-shell oracle records that are totally unknown later? In that

Re: Names of planes, and request for sneak preview

2000-07-11 Thread John H. Jenkins
At 7:19 AM -0800 7/11/00, Mark Davis wrote: However, there are certain units or thresholds that are useful to distinguish in Unicode. The most important threshold is the one between and 1: important for UTF-16 implementations (and to a minor degree, UTF-8 implementations). So there are
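The figures in this snippet were lost in archiving. Assuming the threshold meant is the one between the BMP and the supplementary planes (that reading is an assumption, not something the surviving text confirms), a small Python sketch shows why it matters to UTF-16 and, to a lesser degree, UTF-8. The helper name `code_units` is hypothetical.

```python
def code_units(cp: int) -> tuple[int, int]:
    """Return (UTF-16 code units, UTF-8 bytes) needed for one code point."""
    ch = chr(cp)
    utf16_units = len(ch.encode("utf-16-be")) // 2  # 2 bytes per code unit
    utf8_bytes = len(ch.encode("utf-8"))
    return utf16_units, utf8_bytes


# Last code point of plane 0 versus first code point of plane 1.
for cp in (0xFFFF, 0x10000):
    units, nbytes = code_units(cp)
    print(f"U+{cp:04X}: {units} UTF-16 unit(s), {nbytes} UTF-8 byte(s)")
```

Crossing the boundary switches UTF-16 from one code unit to a surrogate pair, while UTF-8 only grows from three bytes to four.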

FW: How-To handle i18n when you don't know charset?

2000-07-11 Thread Chris Wendt
Speaking for the Windows versions: All language versions of IE5 behave the same. The only difference in behavior is the encoding of the base part of URLs which defaults to UTF-8 for all translations except the Traditional Chinese and Korean ones. The initial default encoding for documents is

RE: Not all Arabics are created equal...

2000-07-11 Thread Marco . Cimarosti
Greg Reynolds wrote: The only remedy I can see for this particular flaw in Unicode is the introduction of a codepoint to set or maybe swap the evaluation rule for number strings. It is not a flaw. Rather, IMHO, we are all making the mistake of considering this as an *encoding* issue. Which

Re: Not all Arabics are created equal...

2000-07-11 Thread Roozbeh Pournader
On Tue, 11 Jul 2000, John Cowan wrote: Okay, I now grasp that firmly. Now just what is the difference between the ARABIC-INDIC DIGITs (U+0660 et seq.) and the EASTERN ARABIC-INDIC DIGITs (U+06F0 et seq.) other than glyph shape? The EASTERN ones are classed as "European numbers" for bidi
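The two questions in this exchange (digit order in the data stream, and the bidi class of the two digit sets) can be checked against the character database that ships with Python. This is only a sketch: it reports whatever Unicode version the local `unicodedata` module carries, and the helper name `describe` is an assumption. Note that the database names the U+06F0 series EXTENDED ARABIC-INDIC DIGIT rather than "EASTERN".

```python
import unicodedata


def describe(cp: int) -> tuple[str, str, int]:
    """Return (character name, bidi class, digit value) for a code point."""
    ch = chr(cp)
    return (unicodedata.name(ch),
            unicodedata.bidirectional(ch),
            unicodedata.digit(ch))


for cp in (0x0031, 0x0661, 0x06F1):
    name, bidi, value = describe(cp)
    print(f"U+{cp:04X} {name}: bidi class {bidi}, value {value}")

# In logical (stored) order the most significant digit comes first,
# for Arabic-Indic digits just as for European ones.
print(int("\u0661\u0662\u0663"))  # 123
```

The bidi class is the property that differs: AN (Arabic number) for the U+0660 series and EN (European number) for the U+06F0 series.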

Re: Names of planes, and request for sneak preview

2000-07-11 Thread Michael Everson
At 07:53 -0800 2000-07-11, John H. Jenkins wrote: At the same time, it would be nice to have a Unicodally correct way of referring to planes 1 and 2, since there is an important boundary between them. Just use the acronyms BMP, SMP, and SIP. Michael Everson ** Everson Gunn Teoranta **

Re: Traditional Chinese Simplified Chinese

2000-07-11 Thread addison
Hi Tony, Do you mean Traditional, Simplified and English at the *same time* in the *same* HTML page? If so, UTF-8 is the only way you can effectively do it. You may need to spend a fair amount of time on setting up fonts/style sheets to get the desired look and feel for each block of Traditional

Re: Landmark name format

2000-07-11 Thread Antoine Leca
Sebastian Hagedorn wrote: 15 km SE of Montréal, Québec I am more interested in the structure than the actual character or language encoding here. Actually I think that in Germany you might not even specify the state, so that you would have only one level. You only specify the state

Re: Han character names?

2000-07-11 Thread John H. Jenkins
At 7:54 AM -0800 7/11/00, John Cowan wrote: Aren't there characters used on the tortoise-shell oracle records that are totally unknown later? In that case, there would hardly be a documented pronunciation! Yes. (And is anything that old in CJK Supplement B?) No. Vertical Extension B adds

Re: Landmark name format

2000-07-11 Thread John Cowan
Antoine Leca wrote: And my girlfriend was to laugh loudly, because she cannot imagine Paris anywhere outside France. I am sure that everyone in the U.S.A., on the other hand, certainly can imagine other places... A brief check turns up ten localities named "Paris" in the U.S., of which

Re: Names of planes, and request for sneak preview

2000-07-11 Thread Mark Davis
I ALY FND ANMs HRD2 DL WTH. WD PFR NML WDS. Michael Everson wrote: At 07:53 -0800 2000-07-11, John H. Jenkins wrote: At the same time, it would be nice to have a Unicodally correct way of referring to planes 1 and 2, since there is an important boundary between them. Just use the

Re: Han character names?

2000-07-11 Thread Kenneth Whistler
John Cowan asked: "John H. Jenkins" wrote: Maybe they'll start on the Shuowen next and then move back to pre-Zhou stuff. :-) If they did, would the SIP overflow? Quite possibly, depending on what one does in terms of unifications. But that is what Plane 3 is for. MDIP ("More damn

Subset of Unicode to represent Japanese Kanji?

2000-07-11 Thread Magda Danish (Unicode)
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Tuesday, July 11, 2000 7:02 AM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Language Support From: MICHAEL W. MARTIN To Whom It May Concern: I am writing embedded software to control a print head.

Re: Names of planes, and request for sneak preview

2000-07-11 Thread Jonathan Coxhead
Oh, by the way, if 12 is a dozen and 144 is a gross, what are 16 and 256? 272

Re: Names of planes, and request for sneak preview

2000-07-11 Thread Tex Texin
Shoot, it's 1 1/3 dozen. Tex Texin wrote: [EMAIL PROTECTED] wrote: What about F? I was told that there are 0x10 possible characters? Oh, by the way, if 12 is a dozen and 144 is a gross, what are 16 and 256? 1 and 1/4 dozen and 9/16 of a gross. --

RE: Euro character in ISO

2000-07-11 Thread Leon Spencer
Does anyone know where I can easily download the latest ISO-8859-X specs? The ones at ftp.unicode.org seem to be dated 1996. Also, does anyone know which ISO-8859-X contains the Euro? Thanks. Leon -Original Message- From: Murray Sargent [mailto:[EMAIL PROTECTED]] Sent:

Re: Names of planes, and request for sneak preview

2000-07-11 Thread Asmus Freytag
At 12:18 PM 7/11/00 -0800, [EMAIL PROTECTED] wrote: What about F? I was told that there are 0x10 possible characters? Oh, by the way, if 12 is a dozen and 144 is a gross, what are 16 and 256? There are 0x10 - 34 possible characters! All code values ending in 0xFFFE and Ox do
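The "minus 34" arithmetic can be reproduced in a few lines of Python. This sketch assumes a code space of 17 planes of 0x10000 code points each and removes only the two code points per plane whose low 16 bits are 0xFFFE or 0xFFFF; it does not account for surrogate code points or any other reserved ranges.

```python
PLANES = 17
PLANE_SIZE = 0x10000

total = PLANES * PLANE_SIZE  # 0x110000 code points in all

# Two noncharacters per plane: xxFFFE and xxFFFF.
noncharacters = [(plane << 16) | low
                 for plane in range(PLANES)
                 for low in (0xFFFE, 0xFFFF)]

print(len(noncharacters))               # 34
print(hex(total - len(noncharacters)))  # 0x10ffde
```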

Re: Euro character in ISO

2000-07-11 Thread Asmus Freytag
At 01:25 PM 7/11/00 -0800, Leon Spencer wrote: Has ISO addressed the Euro character? Yes. It's at 0x20AC in ISO/IEC 10646-1. There has been an attempt to create a series of 'touched up' 8859 standards. The problem with these is that you get all the issues of character set confusion that
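To illustrate the character-set confusion being described, the following Python sketch encodes U+20AC EURO SIGN with a few of the codecs in Python's standard library. Of the 8859 parts tried here, only ISO-8859-15 has the Euro (at 0xA4); ISO-8859-1 has no Euro at all, and Windows code page 1252 places it at 0x80. The helper name `encode_or_none` is an assumption for illustration.

```python
from typing import Optional

EURO = "\u20ac"  # EURO SIGN


def encode_or_none(text: str, codec: str) -> Optional[bytes]:
    """Encode text with the given codec, or return None if it cannot."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


for codec in ("iso8859_1", "iso8859_15", "cp1252", "utf-8"):
    print(f"{codec}: {encode_or_none(EURO, codec)}")
```

The same byte therefore means different things depending on the label: 0xA4 is the Euro in ISO-8859-15 but the generic currency sign in ISO-8859-1.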

Re: Names of planes, and request for sneak preview

2000-07-11 Thread 11digitboy
Okay, 0x10FFDE different characters. But what of planes 15 and 16? -- Robert Lozyniak Accusplit pedometer, purchased about 2000a07l01d19h45mZ, has NOT FLIPPED My page: http://walk.to/11 [EMAIL PROTECTED] - email (917) 421-3909 x1133 - voicemail/fax Asmus Freytag [EMAIL PROTECTED] wrote:

Re: Euro character in ISO

2000-07-11 Thread Robert A. Rosenberg
At 15:30 -0800 on 07/11/00, Asmus Freytag wrote about Re: Euro character in ISO: There has been an attempt to create a series of 'touched up' 8859 standards. The problem with these is that you get all the issues of character set confusion that abound today with e.g. Windows CP 1252 mistaken for

Re: Han character names?

2000-07-11 Thread Thomas Chan
On Tue, 11 Jul 2000, Jon Babcock wrote: I was interested in seeing an example of a Han graph that has no documented pronunciation because I was under the impression that such a graph doesn't/cannot exist. The "beikao" chapter (pp. 1585-1631) of the _Kangxi Zidian_ would be one place to start

Re: Euro character in ISO

2000-07-11 Thread Michael \(michka\) Kaplan
Robert, I am a big fan of the Windows code pages, they often make my life easier. However, there is a disadvantage to the fact that even over the course of a few service packs (let alone a few operating systems!) the code pages have changed, and there is simply no good documentation that will