Re: Emoticons

2000-07-20 Thread Iman Saad
BTW, did anyone get the smileys right at the first sight? I got the smile but not the frown (any guesses as to why?). The font, however, is too small to see, but I *think* it's a smiley... The same happened to me. I got the smile, but not the frown. I use OE 5.00.2314.1300

U+3358

2000-07-20 Thread 11digitboy
This says reiten, not reiji. Why?! Shouldn't it say REIJI??!!! Or am I going to look like a total fool when I find out that it SHOULD say REITEN? If the thing said REIJI, it and its friends could be used to shorten encoding of times in text. Why are there no reifun, ippun, nifun, ... gojuukyuufun

Re: Decimal separator in Hindi Numbering

2000-07-20 Thread N.R.Liwal
I tested in Excel 2000. Excel 2000 automatically places a leading zero. Hindi numbers in Word 2000 can be typed in left-to-right order using 060C as the thousands separator and 066B as the decimal separator. Liwal Subject: Re: Decimal separator and I would expect use of U+060C as a
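The formatting described in the message above can be sketched roughly as follows. This is only an illustration, assuming that "Hindi numbers" here means the Arabic-Indic digits U+0660..U+0669; the helper name `to_arabic_indic` is hypothetical, and the separators are the ones named in the message (U+060C for thousands, U+066B for the decimal point).

```python
# Hypothetical helper, not taken from Excel or Word: renders a number with
# Arabic-Indic digits and the separators named in the message above.
ARABIC_INDIC_ZERO = 0x0660   # ARABIC-INDIC DIGIT ZERO
THOUSANDS_SEP = "\u060C"     # ARABIC COMMA, used as thousands separator
DECIMAL_SEP = "\u066B"       # ARABIC DECIMAL SEPARATOR


def to_arabic_indic(value: float, decimals: int = 2) -> str:
    """Format value with grouping, then map ASCII digits and separators."""
    ascii_text = f"{value:,.{decimals}f}"  # e.g. "1,234.50"
    out = []
    for ch in ascii_text:
        if "0" <= ch <= "9":
            out.append(chr(ARABIC_INDIC_ZERO + ord(ch) - ord("0")))
        elif ch == ",":
            out.append(THOUSANDS_SEP)
        elif ch == ".":
            out.append(DECIMAL_SEP)
        else:
            out.append(ch)  # keep a minus sign unchanged
    return "".join(out)


print(to_arabic_indic(1234.5))
print(to_arabic_indic(0.5))  # the leading zero is kept, as Excel does
```

The string is built in logical order; how it is displayed depends on the bidirectional behaviour of the rendering application.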

Re: Unicode FAQ addendum

2000-07-20 Thread Lars Marius Garshol
* John Cowan | | C1 says "A process shall interpret Unicode code values as 16-bit | quantities." This I find mightily confusing. Why say something like this when there are (well, will be) characters that cannot be represented with 16 bits in any of the Unicode encodings? | "Code unit" is

Re: Subset of Unicode to represent Japanese Kanji?

2000-07-20 Thread Otto Stolz
On Mon, 17 Jul 2000, I wrote: Curtly, Now, I have found out that this word has a meaning different from what I had tried to express. I apologize for any offense that may have been perceived by anybody. Best wishes, Otto Stolz

Re: Unicode FAQ addendum

2000-07-20 Thread Asmus Freytag
There's no updating needed. The key is that The Unicode Standard, Version 3.0 recognizes UTF-16 as the default encoding. Therefore code values (or units) which are defined as 'minimal bit combination that can represent a unit of encoded text' are 16-bit. In UTF-16, one sometimes needs two of

Ethiopic digits

2000-07-20 Thread 11digitboy
Look at page 92 in the book. Then look at this: http://www.cyberethiopia.com/ethiopic/counter.htm Especially the part about no zero. -- Robert Lozyniak Accusplit pedometer, purchased about 2000a07l01d19h45mZ, has NOT FLIPPED My page: http://walk.to/11 [EMAIL PROTECTED] - email (917) 421-3909

Re: Unicode FAQ addendum

2000-07-20 Thread Elliotte Rusty Harold
At 8:00 AM -0800 7/19/00, John Cowan wrote: The new Unicode FAQ (like the old) supplies the panting world with John's Own Version of Unicode Conformance: 1) Unicode code units are 16 bits long; deal with it. 2) Byte order is only an issue in files. I've got to take issue with #2. People can and

Re: Designing a multilingual web site

2000-07-20 Thread Otto Stolz
Munzir Taha had written: Suppose I publish the page, how can people know that I told notepad to save as Unicode ;-) On 2000-07-18 at 03:03 UTC, Michael (michka) Kaplan wrote: The following should go all in one line at the very top of the header: META

Re: Designing a multilingual web site

2000-07-20 Thread Michael \(michka\) Kaplan
- Original Message - From: "Otto Stolz" [EMAIL PROTECTED] As said several times before, this is only part of the story. However, since there are no browsers out there that would refuse to accept a charset tag on the basis of no HTML 4.0 tag, it is the whole story from a functional

RE: Unicode FAQ addendum

2000-07-20 Thread Jonathan Rosenne
How about: 2) Byte order is only an issue in I/O. Jony -Original Message- From: Elliotte Rusty Harold [mailto:[EMAIL PROTECTED]] Sent: Thursday, July 20, 2000 3:19 PM To: Unicode List Subject: Re: Unicode FAQ addendum At 8:00 AM -0800 7/19/00, John Cowan wrote: The

Uniscribe API files? (Programming)(Microsoft)(Windows)

2000-07-20 Thread Marco . Cimarosti
I am looking for the header file containing the declarations for Uniscribe (USP10.DLL). I think it should be a single file named "usp10.h", but I cannot find it on the Microsoft web site or elsewhere. Could somebody point me in the right direction? Thank you. _ Marco

Re: Ethiopic digits

2000-07-20 Thread Michael \(michka\) Kaplan
Yes, Robert, there is no zero there. Tamil has the same issue. Not everyone recognizes a zero in their numbering system. michka - Original Message - From: [EMAIL PROTECTED] To: "Unicode List" [EMAIL PROTECTED] Sent: Thursday, July 20, 2000 2:58 AM Subject: Ethiopic "digits" Look at

Re: Uniscribe API files? (Programming)(Microsoft)(Windows)

2000-07-20 Thread Michael \(michka\) Kaplan
It does install with the Platform SDK (that's how I got it, to port it to VB). http://msdn.microsoft.com/developer/sdk/platform.asp If you need individual structure or API definitions, you can get them from the MSDN docs: http://msdn.microsoft.com/library/psdk/winbase/uniscrib_2oth.htm michka

Re: Pronunciation of Unicode

2000-07-20 Thread John Cowan
Otto Stolz wrote: Such as Asterix and Obelix? Yes, well, they are Celts, not really French at all. :-) -- Schlingt dreifach einen Kreis um dies! || John Cowan [EMAIL PROTECTED] Schliesst euer Aug vor heiliger Schau, || http://www.reutershealth.com Denn er genoss vom Honig-Tau, ||

RE: Font for Japanese US applications

2000-07-20 Thread Alan Wood
Pierre Vaures ([EMAIL PROTECTED]) asked: We need to display both English and Japanese (Kanji, Hiragana, Katakana) characters. We don't find a font able to display both, in particular on NT US. Microsoft supplies fonts that probably do what you want. MS Gothic is part of the Japanese

Re: Depends on the language

2000-07-20 Thread John Cowan
[EMAIL PROTECTED] wrote: In English, it's ['junIkowd]. Think "unicycle" or "unilateral" or "universal". And the "code" part is the root word "code". Quod dixit, dixit. As for "unique", well, why doesn't "one" rhyme with "stone", "bone", and "alone"? Because some people thought it was

Re: Font for Japanese US applications

2000-07-20 Thread Michael \(michka\) Kaplan
MS Mincho is actually on the NT4 CD in the \langpack directory. michka - Original Message - From: "Alan Wood" [EMAIL PROTECTED] To: "Unicode List" [EMAIL PROTECTED] Cc: "'pierre vaures'" [EMAIL PROTECTED] Sent: Thursday, July 20, 2000 7:22 AM Subject: RE: Font for Japanese US

Re: Depends on the language

2000-07-20 Thread Michael Everson
Because some people thought it was clever to adopt a nonce pronunciation of "one" /Own/, namely /wVn/, and it stuck. Nah, surely it was drawling. Old English [a:n] Middle English [o:n], breaking under stress and so on to [u:@n] then transitioning to [wVn]. Michael Everson ** Everson Gunn

Re: Unicode FAQ addendum

2000-07-20 Thread Doug Ewell
| C1 says "A process shall interpret Unicode code values as 16-bit | quantities." I think the focus here was supposed to be on the fact that Unicode code values are *not 8-bit* quantities. I found out about Unicode in late 1991 when I discovered a copy of TUS 1.0 in a bookstore, and for years

Security Risks of Unicode

2000-07-20 Thread Doug Ewell
Elliotte Rusty Harold [EMAIL PROTECTED] wrote: Bruce Schneier expresses some concerns about "Security Risks of Unicode" in the latest issue of his Cryptogram newsletter. Those who don't subscribe can see: http://www.counterpane.com/crypto-gram-0007.html#9 I'm no expert on computer

Re: Font for Japanese US applications

2000-07-20 Thread John O'Conner
pierre vaures wrote: To Whom It May Concern: We develop, on NT4 using Visual C++ 6.0, an international application for Japanese and US users. We need to display both English and Japanese (Kanji, Hiragana, Katakana) characters. We don't find a font able to display both, in particular on

RE: Thoughts

2000-07-20 Thread Hohberger, Clive
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Monday, July 17, 2000 9:40 AM For a device that will print a relatively basic label (such as sequence number, date, time, name, department, etc) onto a document in Japanese -- what is your consensus? Basic

Re: Font for Japanese US applications

2000-07-20 Thread Michael \(michka\) Kaplan
Yes, truly globalized applications must try the name both ways. I am glad they finally fixed this implementation problem in Windows 2000. michka - Original Message - From: [EMAIL PROTECTED] To: "Unicode List" [EMAIL PROTECTED] Cc: "Unicode List" [EMAIL PROTECTED] Sent: Thursday, July

Re: Using unicode in a Java program

2000-07-20 Thread addison
Whups! My fat fingers typed "John O'Connor" yesterday when I meant to type "John O'Conner"... typing the "o-ful" version does really limit the results that you get on the Javasoft web site search I proposed. My apologies to John for the renaming! Addison

Re: Font for Japanese US applications

2000-07-20 Thread Michael \(michka\) Kaplan
- Original Message - From: [EMAIL PROTECTED] there's a character set identifier that is 0 for CP1252 and 128 for Asian fonts 128 is only good for Japanese... the actual definitions for charsets are in wingdi.h in the Platform SDK, but you can use for DEFAULT_CHARSET and not worry

RE: Unicode in VFAT file system

2000-07-20 Thread Yves Arrouye
Recently I've had the dubious pleasure of delving into the details of the VFAT file system. For long file names, I thought it used UCS-2, but in looking at the data with a disk editor, it appears to be byte-swapping (little endian). I thought that UCS-2 was by definition big endian, thus

RE: Unicode FAQ addendum

2000-07-20 Thread Becker, Joseph
| C1 says "A process shall interpret Unicode code values as 16-bit | quantities." DE I think the focus here was supposed to be on the fact that Unicode code DE values are *not 8-bit* quantities. This may be the path to an update that is pithy yet true. The original mantra, paraphrased in C1

Re: Font for Japanese US applications

2000-07-20 Thread Asmus Freytag
At 08:17 AM 7/20/00 -0800, John O'Conner wrote: 2. Compiling your app as a UNICODE application means that all Win32 API calls use Unicode-enabled versions of the API. Text areas expect you to pass Unicode, and it displays correctly when an appropriate font is used. Even if you don't compile an

Re: Unicode in VFAT file system

2000-07-20 Thread Asmus Freytag
At 09:53 AM 7/20/00 -0800, Ken Krugler wrote: 2. Is little-endian UCS-2 a valid encoding that I just don't know about? Yes, it is. Your example of the VFAT system is a near perfect case, since the details of it form what Unicode calls a 'Higher level protocol' and those may legitimately override

Re: Unicode in VFAT file system

2000-07-20 Thread Ken Krugler
Hi Addison, UCS-2 is pretty close to the same thing as UTF-16. The differences do not apply here. UCS-2 can be big-endian or little-endian. The rule is that BE is the default. However, on Intel platforms, you shouldn't be surprised to see LE everywhere: that's the architecture. Microsoft is

Re: Unicode in VFAT file system

2000-07-20 Thread Asmus Freytag
At 11:34 AM 7/20/00 -0800, John Cowan wrote: 1. Could it be using UTF-16LE? I tried creating an entry with a surrogate pair, but the name was displayed with two black boxes on a Windows 2000-based computer, so I assumed that surrogates were not supported. Probably not. So technically it

Re: Unicode FAQ addendum

2000-07-20 Thread Markus Scherer
Becker, Joseph wrote: terminology in an informal statement, I wouldn't have a problem with the simple update: 1) Unicode code units are not 8 bits long; deal with it. how about: 1) Unicode code units are not necessarily 8 bits long [wide], code points use 21 bits; deal with it. rationale:
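The distinction drawn in the message above, between code units of varying width and code points that need at most 21 bits, can be illustrated with a short sketch. The function name `code_units` is hypothetical; the codec names are the standard Python ones.

```python
# Hypothetical illustration: count how many code units one character needs
# in each Unicode encoding form.
MAX_CODE_POINT = 0x10FFFF  # highest Unicode code point


def code_units(ch: str) -> dict:
    """Return the number of code units used for ch in UTF-8, UTF-16, UTF-32."""
    return {
        "utf-8": len(ch.encode("utf-8")),            # 8-bit code units
        "utf-16": len(ch.encode("utf-16-be")) // 2,  # 16-bit code units
        "utf-32": len(ch.encode("utf-32-be")) // 4,  # 32-bit code units
    }


print(MAX_CODE_POINT.bit_length())   # number of bits needed for any code point
print(code_units("A"))               # U+0041
print(code_units("\u3358"))          # U+3358, inside the 16-bit range
print(code_units("\U00010000"))      # first code point beyond 16 bits
```

A code point above U+FFFF takes two 16-bit code units in UTF-16 (a surrogate pair), which is why "16 bits" describes the UTF-16 code unit rather than the character.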

Re: Unicode in VFAT file system

2000-07-20 Thread Asmus Freytag
At 11:41 AM 7/20/00 -0800, Ken Krugler wrote: No. UCS-2 and UCS-4 have always been big-endian. Read ISO 10646-1:1993, section "6.3 Octet order" (page 7): When serialized as octets, a more significant octet shall precede less significant octets. The section continues: "When not serialized
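The octet-order question in this thread can be made concrete with a small sketch. It is only an illustration using Python's standard codecs, not a description of how VFAT itself reads long file names; the function name `decode_utf16` is hypothetical, and defaulting to big-endian when no byte order mark is present follows the rule quoted in the thread.

```python
import codecs

text = "\u3358"  # a single 16-bit code unit, 0x3358

big_endian = text.encode("utf-16-be")     # more significant octet first
little_endian = text.encode("utf-16-le")  # less significant octet first


def decode_utf16(data: bytes) -> str:
    """Decode 16-bit Unicode text, using a leading BOM to pick the byte order."""
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[2:].decode("utf-16-le")
    # No BOM: assume big-endian, per the default discussed in the thread.
    return data.decode("utf-16-be")


print(big_endian.hex(), little_endian.hex())
print(decode_utf16(codecs.BOM_UTF16_LE + little_endian) == text)
```

A file system that always stores names little-endian without a BOM would need the byte order supplied from outside the data, which is the role the thread assigns to a higher-level protocol.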

Re: Unicode in VFAT file system

2000-07-20 Thread addison
Well... There has always been a BOM in Unicode and it's there for a reason: to indicate the byte order on different processors. There is an inherent BE bias in Unicode. But this doesn't invalidate an LE view of the Universe. Avoiding for the moment the word-parsing that Markus suggests, Unicode

Re: Unicode FAQ addendum

2000-07-20 Thread Mark Davis
Narrowing in on it, with one amendation. UTF-8 code units are 8 bits, so we can't say that. Mark Becker, Joseph wrote: | C1 says "A process shall interpret Unicode code values as 16-bit | quantities." DE I think the focus here was supposed to be on the fact that Unicode code DE values are

127 strokes beyond the radical?!

2000-07-20 Thread 11digitboy
On page 876, the character U+6B8B is listed as being 127 strokes beyond the radical. I'd say it's more like 6 strokes beyond the radical. I do not suppose that characters of 128+ strokes are indeed possible, due to the fact that the paper would get quite soggy from the repeated strokes. --

Re: Signature for SCSU

2000-07-20 Thread David Starner
On Thu, Jul 20, 2000 at 02:38:31PM -0800, Markus Scherer wrote: i am curious as to which product or application you are implementing scsu for. can you tell us/me, please? I'm working on a Unicode library for Ada, pretty much reinventing the wheel for another language.

Signature for SCSU

2000-07-20 Thread Doug Ewell
David Starner [EMAIL PROTECTED], another brave SCSU pioneer, wrote: I'm implementing SCSU, and I was curious about the signature for SCSU. The UTR specifies 10 different signatures and then labels 0E FE FF as recommended. Is it acceptable for a decoder to interpret an initial 0E FE FF as the
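As a rough sketch of what checking for the recommended signature might look like: the bytes 0E FE FF are the SCSU single-byte-mode tag SQU (quote one Unicode character) followed by U+FEFF. The function name `split_scsu_signature` is hypothetical, only the recommended signature is checked, and nothing here is an SCSU decoder or a statement about what a decoder is permitted to do with the signature.

```python
from typing import Tuple

# Recommended SCSU signature named in the message: SQU (0x0E) + U+FEFF.
SCSU_SIGNATURE = bytes([0x0E, 0xFE, 0xFF])


def split_scsu_signature(data: bytes) -> Tuple[bool, bytes]:
    """Report whether data starts with the recommended signature.

    Returns (found, rest); rest is the data after the signature if it was
    found, otherwise the data unchanged.
    """
    if data.startswith(SCSU_SIGNATURE):
        return True, data[len(SCSU_SIGNATURE):]
    return False, data


found, payload = split_scsu_signature(b"\x0e\xfe\xffHello")
print(found, payload)
```

Whether the remaining bytes should then be treated as the start of the text is exactly the question raised in the message, so the sketch leaves that decision to the caller.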

Re: Unicode in VFAT file system

2000-07-20 Thread Doug Ewell
Addison Phillips [EMAIL PROTECTED] wrote: Avoiding for the moment the word-parsing that Markus suggests, Unicode on Microsoft platforms has always been LE (at least on Intel) and they have called the encoding they use "UCS-2" (when they bothered with such things: in the past they always