BTW, did anyone get the smileys right at first sight?
I got the smile but not the frown (any guesses as to why?). The
font, however, is too small to see, but I *think* it's a smiley...
The same happened to me. I got the smile, but not the frown.
I use OE 5.00.2314.1300
This says reiten, not reiji. Why?! Shouldn't it say
REIJI??!!! Or am I going to look like a total fool
when I find out that it SHOULD say REITEN?
If the thing said REIJI, it and its friends could
be used to shorten encoding of times in text.
Why are there no reifun, ippun, nifun, ... gojuukyuufun
I tested in Excel 2000
Excel 2000 automatically places a leading zero.
Hindi numbers in Word 2000 can be typed in
left-to-right order using U+060C as the thousands separator
and U+066B as the decimal separator.
Liwal
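
To make the separator convention above concrete, here is a minimal Python sketch. It assumes the digits meant are ARABIC-INDIC DIGIT ZERO..NINE (U+0660..U+0669); the helper name to_arabic_indic is hypothetical and not taken from Word or Excel.

```python
# Map ASCII digits to ARABIC-INDIC DIGIT ZERO..NINE (U+0660..U+0669).
# Assumption: these are the "Hindi numbers" the post refers to.
ARABIC_INDIC_DIGITS = {ord(str(d)): 0x0660 + d for d in range(10)}

THOUSANDS_SEP = "\u060C"  # ARABIC COMMA, used as thousands separator in the post
DECIMAL_SEP = "\u066B"    # ARABIC DECIMAL SEPARATOR


def to_arabic_indic(value: float, places: int = 2) -> str:
    """Format a number with Arabic-Indic digits and the separators above."""
    ascii_text = f"{value:,.{places}f}"  # ASCII form with "," and "."
    integer_part, _, fraction = ascii_text.partition(".")
    integer_part = integer_part.replace(",", THOUSANDS_SEP)
    text = integer_part + (DECIMAL_SEP + fraction if fraction else "")
    return text.translate(ARABIC_INDIC_DIGITS)


if __name__ == "__main__":
    # Print the code points so the result is readable on any terminal.
    print([f"U+{ord(c):04X}" for c in to_arabic_indic(1234.5)])
```

The string is built in logical order; how it is laid out on screen is left to the rendering software.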
Subject: Re: Decimal separator
and I would expect use of U+060C as a
* John Cowan
|
| C1 says "A process shall interpret Unicode code values as 16-bit
| quantities."
This I find mightily confusing. Why say something like this when
there are (well, will be) characters that cannot be represented with
16 bits in any of the Unicode encodings?
| "Code unit" is
On Mon, 17 Jul 2000, I wrote:
Curtly,
Now, I have found out that this word has a meaning different from
what I had tried to express.
I apologize for any offense that may have been perceived by anybody.
Best wishes,
Otto Stolz
There's no updating needed. The key is that The Unicode Standard, Version
3.0 recognizes UTF-16 as the default encoding. Therefore code values (or
units) which are defined as 'minimal bit combination that can represent a
unit of encoded text' are 16-bit. In UTF-16, one sometimes needs two of
Look at page 92 in the book. Then look at this:
http://www.cyberethiopia.com/ethiopic/counter.htm
Especially the part about no zero.
--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909
At 8:00 AM -0800 7/19/00, John Cowan wrote:
The new Unicode FAQ (like the old) supplies the panting world with
John's Own Version of Unicode Conformance:
1) Unicode code units are 16 bits long; deal with it.
2) Byte order is only an issue in files.
I've got to take issue with #2. People can and
Munzir Taha had written:
Suppose I publish the page, how can people know that I told notepad
to save as Unicode ;-)
On 2000-07-18 at 03:03 UCT, Michael (michka) Kaplan wrote:
The following should go all in one line at the very top of the header:
META
- Original Message -
From: "Otto Stolz" [EMAIL PROTECTED]
As said several times before, this is only part of the story.
However, since there are no browsers out there that would refuse to accept a
charset tag on the basis of no HTML 4.0 tag, it is the whole story from a
functional
How about:
2) Byte order is only an issue in I/O.
Jony
-Original Message-
From: Elliotte Rusty Harold [mailto:[EMAIL PROTECTED]]
Sent: Thursday, July 20, 2000 3:19 PM
To: Unicode List
Subject: Re: Unicode FAQ addendum
At 8:00 AM -0800 7/19/00, John Cowan wrote:
The
I am looking for the header file containing the declarations for Uniscribe
(USP10.DLL).
I think it should be a single file named "usp10.h", but I cannot find it on
the Microsoft web site or elsewhere.
Could somebody point me in the right direction?
Thank you.
_ Marco
Yes, Robert, there is no zero there. Tamil has the same issue. Not everyone
recognizes a zero in their numbering system.
michka
- Original Message -
From: [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Sent: Thursday, July 20, 2000 2:58 AM
Subject: Ethiopic "digits"
Look at
It does install with the Platform SDK (that's how I got it, to port it to
VB).
http://msdn.microsoft.com/developer/sdk/platform.asp
If you need individual structure or API definitions, you can get it from
MSDN docs:
http://msdn.microsoft.com/library/psdk/winbase/uniscrib_2oth.htm
michka
Otto Stolz wrote:
Such as Asterix and Obelix?
Yes, well, they are Celts, not really French at all.
:-)
--
Schlingt dreifach einen Kreis um dies! || John Cowan [EMAIL PROTECTED]
Schliesst euer Aug vor heiliger Schau, || http://www.reutershealth.com
Denn er genoss vom Honig-Tau, ||
Pierre Vaures ([EMAIL PROTECTED]) asked:
We need to display both English and Japanese (Kanji, Hiragana,
Katakana) characters. We don't find a font able to display both,
in particular on NT US.
Microsoft supplies fonts that probably do what you want.
MS Gothic is part of the Japanese
[EMAIL PROTECTED] wrote:
In English, it's ['junIkowd]. Think "unicycle" or
"unilateral" or "universal". And the "code" part
is the root word "code".
Quod dixit, dixit.
As for "unique", well, why doesn't "one" rhyme with
"stone", "bone", and "alone"?
Because some people thought it was
MS Mincho is actually on the NT4 CD in the \langpack directory.
michka
- Original Message -
From: "Alan Wood" [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Cc: "'pierre vaures'" [EMAIL PROTECTED]
Sent: Thursday, July 20, 2000 7:22 AM
Subject: RE: Font for Japanese US
Because some people thought it was clever to adopt a nonce pronunciation
of "one" /Own/, namely /wVn/, and it stuck.
Nah, surely it was drawling. Old English [a:n] Middle English [o:n],
breaking under stress and so on to [u:@n] then transitioning to [wVn].
Michael Everson ** Everson Gunn
| C1 says "A process shall interpret Unicode code values as 16-bit
| quantities."
I think the focus here was supposed to be on the fact that Unicode code
values are *not 8-bit* quantities. I found out about Unicode in late
1991 when I discovered a copy of TUS 1.0 in a bookstore, and for years
Elliotte Rusty Harold [EMAIL PROTECTED] wrote:
Bruce Schneier expresses some concerns about "Security Risks of
Unicode" in the latest issue of his Cryptogram newsletter. Those who
don't subscribe can see:
http://www.counterpane.com/crypto-gram-0007.html#9
I'm no expert on computer
pierre vaures wrote:
To Whom It May Concern:
We develop, on NT4 using Visual C++ 6.0, an international application for
Japanese
and US users.
We need to display both English and Japanese (Kanji, Hiragana, Katakana)
characters.
We don't find a font able to display both, in particular on
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Monday, July 17, 2000 9:40 AM
For a device that will print a relatively basic label (such
as sequence
number, date, time, name, department, etc) onto a document in
Japanese --
what is your consensus? Basic
Yes, truly globalized applications must try the name both ways. I am glad
they finally fixed this implementation problem in Windows 2000.
michka
- Original Message -
From: [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Cc: "Unicode List" [EMAIL PROTECTED]
Sent: Thursday, July
Whups! My fat fingers typed "John O'Connor" yesterday when I meant to type
"John O'Conner"... typing the "o-ful" version does really limit the
results that you get on the Javasoft web site search I proposed.
My apologies to John for the renaming!
Addison
- Original Message -
From: [EMAIL PROTECTED]
there's a character set identifier that is 0 for CP1252
and 128 for Asian fonts
128 is only good for Japanese... the actual definitions for charsets are in
wingdi.h in the Platform SDK, but you can use for DEFAULT_CHARSET and not
worry
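
For reference, the character set identifiers mentioned above can be written out as a small lookup table. This is only a sketch: the numeric values are the ones defined in wingdi.h, while the Python dictionary and the charset_name helper are hypothetical illustrations, not part of any Windows API.

```python
# A few lfCharSet values as defined in wingdi.h (Platform SDK).
# The table and helper below are illustrative only.
WINGDI_CHARSETS = {
    0: "ANSI_CHARSET",           # Western European (CP1252)
    1: "DEFAULT_CHARSET",        # let the system choose
    128: "SHIFTJIS_CHARSET",     # Japanese only
    129: "HANGUL_CHARSET",       # Korean
    134: "GB2312_CHARSET",       # Simplified Chinese
    136: "CHINESEBIG5_CHARSET",  # Traditional Chinese
}


def charset_name(value: int) -> str:
    """Return the symbolic name for a charset value, or 'unknown'."""
    return WINGDI_CHARSETS.get(value, "unknown")


if __name__ == "__main__":
    for value in (0, 1, 128, 134):
        print(value, charset_name(value))
```

The table shows why 128 cannot stand for "Asian fonts" in general: Korean and the two Chinese character sets each have their own value.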
Recently I've had the dubious pleasure of delving into the details of
the VFAT file system. For long file names, I thought it used UCS-2,
but in looking at the data with a disk editor, it appears to be
byte-swapping (little endian). I thought that UCS-2 was by definition
big endian, thus
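
The byte-swapping described above can be reproduced with a short Python sketch. The sample bytes are hypothetical (they are not taken from a real VFAT directory entry), and because Python has no codec named UCS-2, the UTF-16 codecs are used as a stand-in, which gives the same result for text without surrogates.

```python
def decode_both(raw: bytes) -> tuple[str, str]:
    """Decode the same 16-bit data as little-endian and as big-endian."""
    return raw.decode("utf-16-le"), raw.decode("utf-16-be")


if __name__ == "__main__":
    # Hypothetical bytes as they might appear in a disk editor: "AB" stored
    # with the low-order byte first.
    raw = bytes([0x41, 0x00, 0x42, 0x00])
    little, big = decode_both(raw)
    print("little-endian:", [f"U+{ord(c):04X}" for c in little])
    print("big-endian:   ", [f"U+{ord(c):04X}" for c in big])
```

Read as little-endian, the bytes give U+0041 U+0042; read as big-endian, the same bytes give U+4100 U+4200, which is the kind of mismatch that shows up when the byte order is assumed incorrectly.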
| C1 says "A process shall interpret Unicode code values as 16-bit
| quantities."
DE I think the focus here was supposed to be on the fact that Unicode code
DE values are *not 8-bit* quantities.
This may be the path to an update that is pithy yet true. The original
mantra, paraphrased in C1
At 08:17 AM 7/20/00 -0800, John O'Conner wrote:
2. Compiling your app as a UNICODE application means that all Win32 API calls
use Unicode-enabled versions of the API. Text areas expect you to pass
Unicode, and it displays correctly when an appropriate font is used.
Even if you don't compile an
At 09:53 AM 7/20/00 -0800, Ken Krugler wrote:
2. Is little-endian UCS-2 a valid encoding that I just don't know about?
Yes, it is. Your example of the VFAT system is a near perfect case, since
the details of it form what Unicode calls a 'Higher level protocol' and
those may legitimately override
Hi Addison,
UCS-2 is pretty close to the same thing as UTF-16. The differences do not
apply here.
UCS-2 can be big-endian or little-endian. The rule is that BE is the
default. However, on Intel platforms, you shouldn't be surprised to see LE
everywhere: that's the architecture. Microsoft is
At 11:34 AM 7/20/00 -0800, John Cowan wrote:
1. Could it be using UTF-16LE? I tried creating an entry with a
surrogate pair, but the name was displayed with two black boxes on a
Windows 2000-based computer, so I assumed that surrogates were not
supported.
Probably not. So technically it
Becker, Joseph wrote:
terminology in an informal statement, I wouldn't have a problem with the
simple update:
1) Unicode code units are not 8 bits long; deal with it.
how about:
1) Unicode code units are not necessarily 8 bits long [wide], code points use 21 bits;
deal with it.
rationale:
At 11:41 AM 7/20/00 -0800, Ken Krugler wrote:
No. UCS-2 and UCS-4 have always been bigendian. Read ISO 10646-1:1993,
section "6.3 Octet order" (page 7):
When serialized as octets, a more significant octet shall
precede less significant octets.
The section continues: "When not serialized
Well...
There has always been a BOM in Unicode and it's there for a reason: to
indicate the byte order on different processors. There is an inherent BE
bias in Unicode. But this doesn't invalidate an LE view of the Universe.
Avoiding for the moment the word-parsing that Markus suggests, Unicode
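
As an illustration of how the BOM signals byte order, here is a minimal Python sketch. The function name sniff_utf16_byte_order is hypothetical, and the fallback to big-endian when no BOM is present is an assumption based on the BE-by-default rule discussed in this thread; a higher-level protocol may specify otherwise.

```python
import codecs


def sniff_utf16_byte_order(data: bytes) -> str:
    """Guess the byte order of 16-bit Unicode data from a leading BOM."""
    if data.startswith(codecs.BOM_UTF16_BE):   # bytes FE FF
        return "big-endian"
    if data.startswith(codecs.BOM_UTF16_LE):   # bytes FF FE
        return "little-endian"
    # Assumption: without a BOM, fall back to the big-endian default.
    return "big-endian (assumed, no BOM)"


if __name__ == "__main__":
    print(sniff_utf16_byte_order(codecs.BOM_UTF16_LE + "A".encode("utf-16-le")))
    print(sniff_utf16_byte_order(codecs.BOM_UTF16_BE + "A".encode("utf-16-be")))
    print(sniff_utf16_byte_order("A".encode("utf-16-be")))
```

The check only looks at the first two bytes, so it does not distinguish 16-bit data from other encodings that happen to begin with the same bytes.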
Narrowing in on it, with one amendation. UTF-8 code units are 8 bits, so we
can't say that.
Mark
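
The point about code unit widths can be checked with a small Python sketch that counts the code units one code point needs in each encoding form. The helper name code_unit_counts and the sample code points are chosen for illustration only.

```python
def code_unit_counts(ch: str) -> dict[str, int]:
    """Count the code units one code point occupies in each encoding form."""
    return {
        "utf-8": len(ch.encode("utf-8")),           # 8-bit code units
        "utf-16": len(ch.encode("utf-16-be")) // 2,  # 16-bit code units
        "utf-32": len(ch.encode("utf-32-be")) // 4,  # 32-bit code units
    }


if __name__ == "__main__":
    # U+0041, a BMP ideograph (U+6B8B), and the first code point beyond the BMP.
    for ch in ("A", "\u6B8B", "\U00010000"):
        print(f"U+{ord(ch):04X}", code_unit_counts(ch))
```

A code point beyond U+FFFF takes two 16-bit code units (a surrogate pair) in UTF-16 and four 8-bit code units in UTF-8, so no single width describes "Unicode code units" across all encoding forms.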
Becker, Joseph wrote:
| C1 says "A process shall interpret Unicode code values as 16-bit
| quantities."
DE I think the focus here was supposed to be on the fact that Unicode code
DE values are
On page 876, the character U+6B8B is listed as being
127 strokes beyond the radical. I'd say it's more
like 6 strokes beyond the radical. I do not suppose
that characters of 128+ strokes are indeed possible,
due to the fact that the paper would get quite soggy
from the repeated strokes.
--
On Thu, Jul 20, 2000 at 02:38:31PM -0800, Markus Scherer wrote:
i am curious as to which product or application you are implementing scsu for. can
you tell us/me, please?
I'm working on a Unicode library for Ada, pretty much reinventing the wheel
for another language.
David Starner [EMAIL PROTECTED], another brave SCSU
pioneer, wrote:
I'm implementing SCSU, and I was curious about the signature for SCSU.
The UTR specifies 10 different signatures and then labels 0E FE FF as
recommended. Is it acceptable for a decoder to interpret an initial 0E
FE FF as the
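
For readers unfamiliar with the byte sequence in question, here is a minimal Python sketch that only detects the recommended signature. In SCSU, 0E is the SQU (quote Unicode) tag, so 0E FE FF encodes U+FEFF. The helper name has_scsu_signature is hypothetical, and the sketch deliberately takes no position on what a decoder should do once the signature is found.

```python
# Recommended SCSU signature: SQU tag (0E) followed by U+FEFF as FE FF.
SCSU_SIGNATURE = bytes([0x0E, 0xFE, 0xFF])


def has_scsu_signature(data: bytes) -> bool:
    """Report whether the data starts with the recommended SCSU signature."""
    return data.startswith(SCSU_SIGNATURE)


if __name__ == "__main__":
    print(has_scsu_signature(bytes([0x0E, 0xFE, 0xFF, 0x41])))  # signature present
    print(has_scsu_signature(b"A"))                             # plain data
```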
Addison Phillips [EMAIL PROTECTED] wrote:
Avoiding for the moment the word-parsing that Markus suggests, Unicode
on Microsoft platforms has always been LE (at least on Intel) and they
have called the encoding they use "UCS-2" (when they bothered with
such things: in the past they always