I seem to remember that someone recently posted a link to some
statistics on character set usage, but I can't seem to find it in my
old messages. Can anyone help?
John.
At 09:05 7/31/2001 -0500, Hohberger, Clive wrote:
Tundra Nenets, together with Forest Nenets, forms the Nenets group of
languages, which belongs to the Samoyed branch of the Finno-Ugrian (Uralic)
language family. Nenets was formerly known as Yurak or Yurak Samoyed, both
now obsolete.
Last year,
James Kass wrote, Kairat A.
Rakhim wrote, I have notes about languages of the former USSR
included in the list. In the 1930s almost all of them were
written in a Latin script known as the 'Unified New Turkic Alphabet', or
in its derivatives (Common Northern Alphabet etc.). It should be
On http://www.macchiato.com/slides/transliteration_in_icu.ppt,
I have slides for my conference talk on transliteration. For those people having
an interest in transliteration, I would appreciate any feedback.
Mark
P.S. The slides are in PowerPoint. If someone is
interested and can only
From what I have read on this list, a Roman-to-Hangul translation would be GREATLY
aided by the use of arithmetic on Unicode values. Is this in there too? Arithmetic on
Unicode values?
じゅういっちゃん (Juuitchan)
Well, I guess what you say is true,
I could
Since Win2000 and NT are native Unicode, is it true to say that any use of a
non-Unicode font (in fact, most of the fonts on Windows, and in particular
Asian fonts like MS Mincho and MS Gothic) in a Unicode application will generate
a WideCharToMultiByte conversion (to convert the Unicode text to the
Peter Constable thought maybe a couple and you illustrate
no additional characters required.
I'll split the difference and say one.
With the lower case... it's a couple, isn't it?
I meant the upper / lower of what I think Marco proposed as 413+321, but
I'm not sure these should be
I just saw the slides. That cursor-backup looks very tricky.
So for someone doing the kana-to-Hepburn, you might have this: (here, "o^" means
o-with-circumflex)
(bakayarou disclaimer: I make lots of errors)
こ→k|お
そ→s|お
と→t|お
.
おう→o^
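Reading the rules above as ko, so, to and ou, the cursor idea can be sketched roughly as follows. This is only an illustration in Python, not ICU's implementation: the `RULES` table, the `transliterate` function, and the extra single-お rule are assumptions added for the example, and "ô" stands in for the "o^" notation. The `|` in a replacement marks where the cursor is left, so the text after it is matched again by later rules.

```python
# Hypothetical rule table: (source, replacement). A "|" in the replacement
# marks where the cursor is left, so the text after it is rescanned.
RULES = [
    ("こ", "k|お"),
    ("そ", "s|お"),
    ("と", "t|お"),
    ("おう", "ô"),
    ("お", "o"),  # assumption: not in the original list, added so a lone お is converted
]


def transliterate(text, rules=RULES):
    """Apply the rules left to right, preferring the longest matching source."""
    ordered = sorted(rules, key=lambda rule: len(rule[0]), reverse=True)
    buf = text
    pos = 0
    while pos < len(buf):
        for source, target in ordered:
            if buf.startswith(source, pos):
                # Split the replacement at the cursor mark (if there is one).
                before, _, after = target.partition("|")
                buf = buf[:pos] + before + after + buf[pos + len(source):]
                # The cursor lands after `before`; `after` is seen again.
                pos += len(before)
                break
        else:
            # No rule applies here: leave the character alone and move on.
            pos += 1
    return buf


print(transliterate("こう"))
print(transliterate("とこ"))
```

With these rules, こ first becomes k followed by お, and because the cursor is backed up in front of that お, the longer おう rule gets a chance to fire. A rule set whose rescanned text keeps rewriting itself would loop forever in this sketch, so real rule sets need some care.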
From: Marco Cimarosti [mailto:[EMAIL PROTECTED]]
This is not correct: I have found the term Han or hanzi in any kind of
literature, not only in Unicode documentation.
Hanzi is a loan word which I have also often seen (usually written
in italics as it should be), but I never said
Actually fonts on Windows are normally Unicode based (including MS
Mincho and MS Gothic) and most have in addition some codepage access. So
there is neither a perf hit nor a codepage problem in using such fonts
on NT, Win2000 and WinXP. These considerations are orthogonal to
OpenType.
Murray
11 wrote:
From what I have read on this list, a Roman-to-Hangul
translation would be GREATLY aided by the use of arithmetic
on Unicode values. Is this in there too? Arithmetic on Unicode values?
I guess you mean this:
http://www.unicode.org/unicode/reports/tr15/#Hangul
_ Marco
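For readers who have not followed the link, the arithmetic in question can be sketched like this. It is a minimal Python illustration of the Hangul syllable composition and decomposition arithmetic described in that report; the function names are made up for the example, and only modern precomposed syllables (U+AC00..U+D7A3) are handled.

```python
# Constants for the Hangul syllable arithmetic.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT   # 11172 precomposed syllables


def compose_hangul(lead, vowel, trail=None):
    """Combine conjoining jamo (L, V, optional T) into one syllable."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = ord(trail) - T_BASE if trail else 0
    if not (0 <= l_index < L_COUNT
            and 0 <= v_index < V_COUNT
            and 0 <= t_index < T_COUNT):
        raise ValueError("not a composable jamo sequence")
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)


def decompose_hangul(syllable):
    """Split a precomposed syllable into a tuple of conjoining jamo."""
    s_index = ord(syllable) - S_BASE
    if not 0 <= s_index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    lead = chr(L_BASE + s_index // N_COUNT)
    vowel = chr(V_BASE + (s_index % N_COUNT) // T_COUNT)
    t_index = s_index % T_COUNT
    if t_index == 0:
        return (lead, vowel)
    return (lead, vowel, chr(T_BASE + t_index))


# U+1112 + U+1161 + U+11AB compose to U+D55C.
print(hex(ord(compose_hangul("\u1112", "\u1161", "\u11ab"))))
```

A Roman-to-Hangul converter would still need rules to get from Latin letters to jamo; the arithmetic only covers the step between jamo and syllables.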
On Wed, 1 Aug 2001, Ayers, Mike wrote:
From: Marco Cimarosti [mailto:[EMAIL PROTECTED]]
BTW, I notice that a single Chinese entry is listed. This should probably
be split into several entries for the various Chinese languages (or
dialects, e.g. Mandarin, Cantonese, Hakka, etc.). This
We have specialized transliterators that are algorithmic. See
http://oss.software.ibm.com/icu/apiref/class_Transliterator.html
For the specific case of Hangul, what we have is an algorithmic Hangul-Jamo
converter, and a rule-based Jamo-Latin converter. The Hangul-Latin
transliterator internally
In fact both MS Mincho and MS Gothic contain far more characters (and
glyphs) than appear in JIS X 0208 and a few more than JIS X 0212, so
these already go far beyond code page based (code page 932 covers
essentially JIS X 0208).
In fact much Microsoft software no longer officially supports
Yes, you could use backup in that way, if you wanted. In that case, though,
it doesn't buy you much. Where it is more useful is for the kyo, gyo,...
case.
For those not familiar with Japanese, there are a large number of cases that
follow the same pattern: kyo maps to a large katakana for
Since Win2000 and NT are native Unicode, is it true to say that any use of a
non-Unicode font (in fact, most of the fonts on Windows, and in particular
Asian fonts like MS Mincho, MS Gothic
Your question has a mistaken premise: the vast majority of TTF fonts on
Windows *do* have Unicode cmaps.
-Original Message-
From: NORIEGA,DANNY (A-HongKong,ex1) [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, August 01, 2001 2:56 AM
To: '[EMAIL PROTECTED]'
Subject: Unicode in Asia Question
Hi:
My company is planning to implement 16-bit Unicode. The proposal is to
go strictly and solely
On Wed, 1 Aug 2001, Richard, Francois M wrote:
Since Win2000 and NT are native Unicode, is it true to say that any use of a
non-Unicode font (in fact, most of the fonts on Windows, and in particular
Asian fonts like MS Mincho and MS Gothic) in a Unicode application will generate
a conversion
Mark Davis wrote:
Yes, you could use backup in that way, if you wanted. In that case, though,
it doesn't buy you much. Where it is more useful is for the kyo, gyo,... case.
Another case on the top of my mind is the transliteration (or whatever it
is) of Indic scripts. This mechanism may
Hi Danny,
Implementing Unicode is a good thing for creating multilingual applications
and for supporting code that is distributed worldwide (or at least to a
number of locales). Based on your questions below, you probably should start
with the Unicode FAQ (on the website) and with the standard
Danny,
I am currently working on xIUA. This is sample code that you can integrate
into your application to help you interface to ICU.
http://oss.software.ibm.com/icu/
It has added functionality specifically tailored for Web servers. It will
allow you to develop application Unicode that will
You can go and try using a web server that works internally in 16-bit Unicode (UTF-16)
and serves web pages in many languages in either UTF-8 (default) or many other
codepages. (Now that ICU was mentioned already...)
Go to
Unicoders,
I have difficulty understanding this person's request but would like to help.
Can you?
Thanks.
Magda.
-Original Message-
From: korlvinke [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, August 01, 2001 1:59 PM
To: Magda Danish (Unicode)
Subject: Re: Codepage
I
Microsoft's Euro story can be seen at:
http://www.microsoft.com/europe/euro/
Specifically, the Windows info is at
http://microsoft.com/windows/euro.asp
There is no way to arbitrarily add code points to a Windows code page,
though. Either you have the patch or the newest file, or you do not. If
I put an HTML version on
http://www.macchiato.com/slides/transliteration_in_icu.htm. BTW, I notice
that a link on slide 21 is wrong. The text is ok, but the internal link is
wrong: should be
http://oss.software.ibm.com/icu/demo
Mark
—
πάντων μέτρον ἄνθρωπος — Πρωταγόρας