Harry R Aufderheide wrote:
1. Is the UTF-8's character set equal to the Latin-1 (ASCII)
Code Page's? If not, what are the differences?
As Brendan Murray already mentioned, UTF-8 is an encoding form of Unicode,
so it supports *all* Unicode characters.
In case you are wondering how this is
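To make the difference from Latin-1 concrete, here is a minimal Python sketch (mine, not from the original message). It encodes a few code points as UTF-8 and checks whether each one also exists in Latin-1. The sample code points and the helper names `utf8_hex` and `fits_latin1` are assumptions chosen for illustration.

```python
# Sample code points picked for illustration; they are not from the thread.
samples = [0x0041, 0x00DF, 0x20AC, 0x1F600]

def utf8_hex(cp):
    """Return the UTF-8 encoding of one code point as spaced hex bytes."""
    return " ".join(f"{b:02X}" for b in chr(cp).encode("utf-8"))

def fits_latin1(cp):
    """True if the code point can be encoded in Latin-1 (ISO 8859-1)."""
    try:
        chr(cp).encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

for cp in samples:
    print(f"U+{cp:04X}  UTF-8: {utf8_hex(cp):<12} Latin-1: {fits_latin1(cp)}")
```

Latin-1 only reaches U+00FF, while UTF-8 spends one to four bytes per code point and so covers the whole Unicode range.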
Only if you call "V" the same glyph as "5"
I want the new digits to be:
U+218A DUODECIMAL DIGIT TEN aka TURNED DIGIT TWO
U+218B DUODECIMAL DIGIT ELEVEN aka REVERSED DIGIT THREE
Don't worry; I'll be filling out the form in my best cursive :)
The reason I chose these codepoints is because they
Antoine Leca [EMAIL PROTECTED] wrote:
[...]
APIs use and return single 16-bit values.
Ah, that may be a problem (what is the ToUpper return value of ß?)
I don't know the mentioned API, but it could return 0x00DF or (to
indicate it as an error) 0x. I don't see a problem.
--Torsten
Torsten Mohrin wrote:
Antoine Leca [EMAIL PROTECTED] wrote:
[...]
APIs use and return single 16-bit values.
Ah, that may be a problem (what is the ToUpper return value of ß?)
I don't know the mentioned API, but it could return 0x00DF or (to
indicate it as an error) 0x. I don't
Exactly what constitutes a phonetic sound, besides being made by a human
being? I mean, clapping isn't phonetic, is it?
Robert Lozyniak
01 02 03 04 05 06
"Don't stop movin',
07 08 09 10 11 12 13 14
It's your life, keep on groovin',
15 16 17 18 19 20 21
Get it right,
22
Please e-mail me and I'll e-mail you a Word document with the form so you can
help me fill it out. I already have most of it.
Robert Lozyniak
01 02 03 04 05 06
"Don't stop movin',
07 08 09 10 11 12 13 14
It's your life, keep on groovin',
15 16 17 18 19 20 21
Get it right,
Unicode does not have these two characters (dozenal digit 10 {a turned
digit 2} and dozenal digit 11 {a reversed digit 3}).
I see two slightly different forms for this DUODECIMAL DIGIT ELEVEN on
the Dozenal Society's Web page. The PDF referenced by Robert shows a
*reversed* 3 (rotated about
I have two questions about Plane 14 language tags as specified in
Technical Report #7:
1. Does anyone know of any implementation that interprets language tags
and actually does something with the result? I'm not looking for
code, just information and ideas.
2. (Ken and Glenn) Can you
Almost all international functions (upper-, lower-, titlecasing, case folding,
drawing, measuring, collation, transliteration, grapheme-, word-, linebreaks, etc.)
should take *strings* in the API, NOT single code-points. Single code-point APIs
almost always malfunction once you get outside of
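The ß case discussed elsewhere in this thread shows why. Below is a small Python sketch, under the assumption that `to_upper_single` stands in for a generic single-code-point API (it is a hypothetical helper, not any real library's function). Python's own `str.upper()` works on strings and applies the full case mapping.

```python
def to_upper_single(cp):
    """Hypothetical single-code-point API: one code point in, one out."""
    upper = chr(cp).upper()
    # A one-value return type cannot hold a multi-character result,
    # so this sketch falls back to returning the input unchanged.
    return ord(upper) if len(upper) == 1 else cp

def to_upper_string(s):
    """String-based API: the result may be longer than the input."""
    return s.upper()

print(hex(to_upper_single(0x00DF)))   # ß comes back unchanged
print(to_upper_string("straße"))      # full mapping turns ß into SS
```

The string version can return a result of a different length, which is exactly what a single 16-bit return value cannot express.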
On Tue, 27 Jun 2000, Tex Texin wrote:
I have been asked to gather some examples of mathematical
expressions used in bidirectional languages, where the
expressions go right to left rather than left to right.
Persian and Hebrew math are left to right. At least some Arabic math is
right to
On Wed, 28 Jun 2000, Gary P. Grosso wrote:
A user's query has been passed on to me, regarding
CYRILLIC SMALL LETTER I and CYRILLIC SMALL LETTER SHORT I
(U+0438 and U+0439). They pointed out that they
noticed that when they are italicized, they look like Us
instead of backward Ns.
A
Antoine Leca [EMAIL PROTECTED] wrote:
Torsten Mohrin wrote:
Antoine Leca [EMAIL PROTECTED] wrote:
[...]
APIs use and return single 16-bit values.
Ah, that may be a problem (what is the ToUpper return value of ß?)
I don't know the mentioned API, but it could return 0x00DF or (to
Doug Ewell wrote:
2. (Ken and Glenn) Can you explain in a little more detail the rationale
for lowercasing the entire language tag? It seems that if RFC 1766
is the model to be followed, then the RFC 1766 casing convention
(lowercase for language, uppercase for country) might
How do I look up a han character if I don't know its codepoint? What if all I
have is its shape, or its EUC-JP or Shift-JIS number? There are a couple I
want to see.
Robert Lozyniak
01 02 03 04 05 06
"Don't stop movin',
07 08 09 10 11 12 13 14
It's your life, keep on groovin',
Doug Ewell asked:
2. (Ken and Glenn) Can you explain in a little more detail the rationale
for lowercasing the entire language tag? It seems that if RFC 1766
is the model to be followed, then the RFC 1766 casing convention
(lowercase for language, uppercase for country) might
Rampshot asked...
> How do I look up a han character if I don't know its codepoint?
> What if all I have is its shape, or its EUC-JP or Shift-JIS number?
> There are a couple I want to see.
The people at Sanseido have just now made it
[EMAIL PROTECTED] wrote:
How do I look up a han character if I don't know its codepoint? What if
all I have is its shape, or its EUC-JP or Shift-JIS number? There are a
couple I
You can use the information in the unihan.txt file (link to it from
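If what you have is the EUC-JP or Shift-JIS number rather than the shape, a converter will get you to the code point, which you can then look up in the Unihan data. Here is a minimal sketch using Python's built-in codecs. The helper name `code_point_from_legacy` and the sample byte values are my own assumptions for illustration, not something from the thread.

```python
import unicodedata

def code_point_from_legacy(hex_code, encoding):
    """Decode one character given as hex bytes in a legacy encoding
    and return its Unicode code point."""
    ch = bytes.fromhex(hex_code).decode(encoding)
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return ord(ch)

# Sample values (assumed for illustration): the same kanji in two encodings.
for hex_code, encoding in [("B4C1", "euc_jp"), ("8ABF", "shift_jis")]:
    cp = code_point_from_legacy(hex_code, encoding)
    print(f"{encoding} {hex_code} -> U+{cp:04X} {unicodedata.name(chr(cp))}")
```

Looking up by shape is a different problem (radical and stroke indexes) and is not covered by this sketch.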
Harry Aufderheide recently said:
I work for a large global firm in the transportation industry and we are
taking a high-level look at our future business requirements for, and the
I.S. effort involved in, properly handling all the characters of all the
languages currently in use on planet Earth.
I had been told that in Egypt math is right to left, at least in school
books. I have no first hand knowledge.
Jony
-----Original Message-----
From: Roozbeh Pournader [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, June 28, 2000 6:40 PM
To: Unicode List
Cc: Unicode List; [EMAIL PROTECTED]
Note that in C, it's essentially just as fast to make character comparisons
with (ch | 0x20) as with ch alone, i.e., if you know ch is in an ASCII range
(0 - 0x7F or 0xE0000 - 0xE007F), you can do a case insensitive compare as
quickly as a case sensitive one. The problem with assuming lower case
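The `| 0x20` trick mentioned above can be sketched as follows. This is in Python rather than C, and `eq_ignore_case_ascii` is a hypothetical helper name. It assumes both inputs are already known to be letters, either plain ASCII or the Plane 14 tag characters that mirror ASCII.

```python
def eq_ignore_case_ascii(a, b):
    """Compare two single characters, ignoring ASCII case, by forcing
    bit 0x20 on in both. Only valid when both are known to be letters."""
    return (ord(a) | 0x20) == (ord(b) | 0x20)

print(eq_ignore_case_ascii("A", "a"))                    # letters: same
print(eq_ignore_case_ascii("A", "b"))                    # letters: different
print(eq_ignore_case_ascii(chr(0xE0041), chr(0xE0061)))  # tag characters
# Caveat: non-letters that differ only in bit 0x20 also compare equal.
print(eq_ignore_case_ascii("@", "`"))
```

The last line shows why the range check matters: the bit trick alone cannot tell letters from punctuation.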
At 11:00 AM 6/28/00 -0800, [EMAIL PROTECTED] wrote:
How do I look up a han character if I don't know its codepoint? What if all I
have is its shape, or its EUC-JP or Shift-JIS number? There are a couple I
want to see.
If you know the characters you are looking for in their Japanese (Kanji)
[EMAIL PROTECTED] (Timothy Partridge) wrote:
Do IBM DBCS strings assume starting in single byte mode?
And would the presence of certain bytes in UTF-16 trigger a switch from
double to single byte mode?
Yes and yes. There are a number of Asian EBCDIC codepages that follow this
structure. These
Has anyone out there taken a cross platform non-Unicode enabled legacy
application and converted it to run UTF-8 instead of UTF-16? I've read
Markus Kuhn's UTF-8/Unicode FAQ at
http://www.cl.cam.ac.uk/~mgk25/unicode.html and while it was helpful, it
only addresses Unix. I also have to consider
"Michael Kaplan (Trigeminal Inc.)" [EMAIL PROTECTED] wrote
Issues such as this one can obviously cause major issues since it even
affects logical vs. visual order of numbers!
I don't think there was any suggestion that the logical order would differ:
AFAIK, only the display varies.
[EMAIL PROTECTED] wrote:
Has Unicode, by any name, the two mutant digits in the attached file?
What about the pairs
0041;LATIN CAPITAL LETTER A
0042;LATIN CAPITAL LETTER B
and
0061;LATIN SMALL LETTER A
0062;LATIN SMALL LETTER B
(which will be the one chosen by almost any software for this
Unicoders, Java programmers, hello.
I wish to change the local language (from my keyboard) in my Java
application.
I now know how to display all the characters of Unicode in my JTextField,
but now I want to be able to ENTER some characters in my JEditorPane (like
Greek, Russian). I can do
Asmus Freytag [EMAIL PROTECTED] wrote:
Yes. The Unicode Standard will deprecate the use of U+FEFF (Note: not
U+FFFE) as a zero-width non-breaking space (despite its formal name).
And U+FEFF should *only* be used as a byte order mark and/or
signature. (That is already ambiguous and trouble
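For the signature use, here is a minimal sketch of how a reader might check for a leading byte order mark (code point U+FEFF) using the constants in Python's standard `codecs` module. The function name `sniff_bom` and the choice to skip UTF-32 are my assumptions, not anything proposed in the thread.

```python
import codecs

# Signatures checked in this order; UTF-32 is left out to keep the sketch
# short (its little-endian BOM begins with the UTF-16-LE bytes).
SIGNATURES = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

def sniff_bom(data):
    """Return (encoding name, payload without BOM), or (None, data)
    if the bytes do not start with a known signature."""
    for bom, name in SIGNATURES:
        if data.startswith(bom):
            return name, data[len(bom):]
    return None, data

name, payload = sniff_bom("\ufeffhi".encode("utf-16-le"))
print(name, payload.decode(name))
```

A missing signature says nothing about the encoding, so the caller still needs a default.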