As Unicode will soon contain characters defined beyond the code point range
[0,65535], I'm wondering how Java is going to handle this.
I didn't find any hints from JDK documentation either, at least a few days
ago when I browsed the Java documentation about internationalization I just
saw a
Hi
I have recently started to study Unicode and tried to understand what it is,
beyond being a system that supports double-byte languages. When doing
this, I've bumped into Big5, Shift-JIS, x-Jis. Are these synonyms for
different Chinese and Japanese character sets, and for which? I'm
Hello,
In the Unihan.txt database, in the kMandarin field there are entries
with duplicate pronunciations. For example:
U+4E21 kMandarin 1 LIANG3 2 LIANG3 3 LIANG4
U+4E4E kMandarin 1 HU1 HU2 2 HU1
U+4E86 kMandarin 1 LIAO3 2 LE LIAO3
Is there a reason for these duplicates?
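
One way to spot such repeats mechanically is sketched below. This is only an illustration: the class name `KMandarinDuplicates` and the method `repeatedReadings` are made up for this example, and it assumes that the purely numeric tokens in the field are group indices rather than readings, which the examples above suggest but do not confirm.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any Unihan tooling.
class KMandarinDuplicates {

    // Returns the readings that occur more than once in a kMandarin value,
    // in order of first appearance. Purely numeric tokens are ASSUMED to be
    // group indices and are skipped.
    static List<String> repeatedReadings(String fieldValue) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : fieldValue.trim().split("\\s+")) {
            if (token.isEmpty() || token.matches("\\d+")) {
                continue; // empty input or a group index
            }
            counts.merge(token, 1, Integer::sum);
        }
        List<String> repeated = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                repeated.add(entry.getKey());
            }
        }
        return repeated;
    }

    public static void main(String[] args) {
        System.out.println(repeatedReadings("1 LIANG3 2 LIANG3 3 LIANG4")); // [LIANG3]
        System.out.println(repeatedReadings("1 HU1 HU2 2 HU1"));            // [HU1]
        System.out.println(repeatedReadings("1 LIAO3 2 LE LIAO3"));         // [LIAO3]
    }
}
```

The sketch only reports which readings repeat; it says nothing about why the database lists them more than once.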
On Tue, Nov 14, 2000 at 08:22:21AM -0800, D.V. Henkel-Wallace wrote:
Sadly, it seems unlikely that any future change or adoption of orthography
will use characters not already supported by the then major computer
systems. In fact the trend seems to be the other way, viz Spain's changing
On Tuesday, November 14, 2000, at 08:24 AM, Pierpaolo Bernardi wrote:
In the Unihan.txt database, in the kMandarin field there are entries
with duplicate pronunciations. For example:
U+4E21 kMandarin 1 LIANG3 2 LIANG3 3 LIANG4
U+4E4E kMandarin 1 HU1 HU2 2 HU1
"D.V. Henkel-Wallace" wrote:
For a minority language (which all remaining unwritten languages are) the
pressure will be strong to use existing combinations (since they won't
constitute a large enough community for people to write special rendering
support).
OTOH minority languages have
Mark Davis wrote:
The Unicode Standard does define the rendering of such combinations, which
is, in the absence of any other information, to stack outwards.
A dumb implementation would simply move the accent outwards if there was
one in the same position. This will not
necessarily produce an
You can currently store UTF-16 in the String and StringBuffer classes. However,
all operations are on char values or 16-bit code units. The upcoming release of
the J2SE platform will include support for Unicode 3.0 (maybe 3.0.1)
properties, case mapping, collation, and character break iteration.
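
To make the code-unit point concrete, here is a minimal sketch of what it means for a character above U+FFFF to sit in a String as two 16-bit units. The class `SurrogatePairSketch` and its helpers are hypothetical names for this illustration, not J2SE APIs; the sketch deliberately uses only `String.length()` and `String.charAt()` and does the surrogate arithmetic by hand.

```java
// Hypothetical illustration; none of these helpers are part of the platform.
class SurrogatePairSketch {

    static boolean isHigh(char c) { return c >= 0xD800 && c <= 0xDBFF; }

    static boolean isLow(char c) { return c >= 0xDC00 && c <= 0xDFFF; }

    // Combines a high and a low surrogate into the code point they encode.
    static int toCodePoint(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    // Counts code points by treating each well-formed surrogate pair as one;
    // an unpaired surrogate is counted as a single unit.
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (isHigh(s.charAt(i)) && i + 1 < s.length() && isLow(s.charAt(i + 1))) {
                i++; // skip the low half of the pair
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "A\uD800\uDC00"; // 'A' followed by U+10000 stored as a pair
        System.out.println(s.length());         // 3 (code units)
        System.out.println(countCodePoints(s)); // 2 (code points)
        System.out.println(Integer.toHexString(toCodePoint(s.charAt(1), s.charAt(2)))); // 10000
    }
}
```

The gap between `length()` and the code point count is the practical consequence of all operations working on char values.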
[EMAIL PROTECTED] wrote:
Unfortunately, there's no corresponding LATIN CAPITAL LETTER N WITH LONG
RIGHT LEG, which Lakota needs.
To my knowledge, the discussion in September between John Cowan and Curtis Clark
didn't terminate with any actual proposal, and I'm not clear on whether the above
From: D.V. Henkel-Wallace [mailto:[EMAIL PROTECTED]]
At 06:30 2000-11-14 -0800, Marco Cimarosti wrote:
But my point was: not even Mr. Ethnologue himself knows exactly *which*
combinations are meaningful, in all orthographic systems. And, clearly, no
one can figure out which combinations
Mike Ayers wrote:
The last I knew, computer-savvy Taiwan and Hong Kong were continuing to
invent new characters. In the end, the onus is on the computer to support
the user.
Yes, the computer should support the user, but... The invention of new characters to
serve multitudes is OK, and
On Tue, 14 Nov 2000, Rick McGowan wrote:
Mike Ayers wrote:
The last I knew, computer-savvy Taiwan and Hong Kong were continuing to
invent new characters. In the end, the onus is on the computer to support
the user.
Yes, the computer should support the user, but... The invention of new