Touché!
I was misled by a fictional character by V. Montalbán: Pepe Carva*lh*o, a
Catalan detective of Galician origins...
Ciao. Marco
-----Original Message-----
From: Antoine Leca [mailto:[EMAIL PROTECTED]]
Sent: Friday, 16 June, 2000 12.44
To: Unicode List
Cc: [EMAIL PROTECTED]
Doug Ewell wrote:
I have sometimes wondered why these two useful, pre-existing symbols
are not used in the U.S. to denote 'male' and 'female' on
e.g. restroom
doors. One possibility is that, because they are frequently
associated
with 'sexuality' or 'relations between the sexes,' they
Harry R Aufderheide wrote:
1. Is the UTF-8's character set equal to the Latin-1 (ASCII)
Code Page's? If not, what are the differences?
As Brendan Murray already mentioned, UTF-8 is an encoding form of Unicode,
so it supports *all* Unicode characters.
In case you are wondering how this is
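As a rough illustration of the point above that UTF-8 covers every Unicode character and not just Latin-1, the sketch below encodes a few sample code points and shows how many bytes each one takes. The helper name utf8_bytes is made up for this example; it is not from the original message.

```python
def utf8_bytes(code_point: int) -> bytes:
    """Return the UTF-8 encoding of a single Unicode code point."""
    return chr(code_point).encode("utf-8")


# One sample from each UTF-8 length class: ASCII, Latin-1 range,
# rest of the BMP, and a supplementary-plane code point.
for cp in (0x41, 0xE9, 0x20AC, 0x10000):
    encoded = utf8_bytes(cp)
    print(f"U+{cp:04X} -> {encoded.hex()} ({len(encoded)} bytes)")
```

Only U+0041 keeps its single-byte ASCII value; U+00E9, which is one byte in Latin-1, already needs two bytes in UTF-8.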
Antoine Leca wrote:
the lowercase of Italian (or Corsican) "A'", "E'", ... at the end of a
word is likely to be "à", "é/è", ... (Marco, is it really true? and how
é and è are handled?)
We should rather say that -A' (etc.) is a poor man's capitalization for -à
(etc.). The proper capital form
Robert Lozyniak wrote:
How do I look up a han character if I don't know its
codepoint? What if all I have is its shape, or its
EUC-JP or Shift-JIS number? There are a couple I
want to see.
If you know the value in JIS (or any other encoding), you just need to look
up a conversion table.
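The table lookup suggested above can be sketched with the conversion tables that ship with Python's codecs, assuming those are an acceptable stand-in for a published mapping file. The function name code_points_from_legacy is hypothetical.

```python
def code_points_from_legacy(raw: bytes, encoding: str) -> str:
    """Decode bytes in a legacy encoding and list the resulting code points."""
    text = raw.decode(encoding)
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# The same kanji, given once as Shift-JIS bytes and once as EUC-JP bytes.
print(code_points_from_legacy(bytes.fromhex("889f"), "shift_jis"))
print(code_points_from_legacy(bytes.fromhex("b0a1"), "euc_jp"))
```

Both calls report U+4E9C, so the character can then be found in the code charts by that number.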
Antoine Leca wrote:
Hmmm. Writing from top of my head (which is *not* the good
way to go in such a list), I understood that Unicode was
the default character set, [...]
You are right (see http://www.w3.org/International/O-HTML-charset.html).
OTOH, I believe that for upward compatibility,
[EMAIL PROTECTED] wrote:
How do the Japanese read the hex digits A thru F?
Probably: ei, bi, shi, di, i, effu.
_ Marco
Daniel Biddle wrote:
On Wed, 5 Jul 2000, Rick McGowan wrote:
iRck
I thought this was a typo until I saw your address. U263A
It's not a typo: Rick's signature has passed through an Indic renderer, so
the "i" was reordered. U+FF1AU+FF0DU+FF09
_ Maco`
Haï, Antoine.
[...] plan supplémentaire pour idogrammes CJK, [...]
But is "CJK" the correct acronym for "chinois, japonais, coréen"!?
Tchao.
Marco
Robert Lozyniak wrote:
If it is what I think it is, I don't want it in English.
How could it tell "aids" from "AIDS", for instance?
Or "joy" from "Joy"(name)?
(C'mon, 11BB, you were supposed to know this one ;-)
Case folding (or case conversion) is the process of changing letters from
one
[BF Ax]
FFx = [BF Bx]
copyleft 2000 by Marco Cimarosti
- * - * - * - * - * -
_ Marco
Greg Reynolds wrote:
The only remedy I can see for this particular flaw in Unicode is the
introduction of a codepoint to set or maybe swap the
evaluation rule for
number strings.
It is not a flaw. Rather, IMHO, we are all making the mistake of considering
this as an *encoding* issue. Which
N.R.Liwal wrote:
I would vote that UNICODE, host mailing lists
on Script level, because issues discussed of CJK
are not much related to Roman and Arabic.
If there are several lists like:
Arabic...
etc.
If one wish to participate in all that should be an option.
but still if all are happy
Michael W. Martin
For a device that will print a relatively basic label (such
as sequence
number, date, time, name, department, etc) onto a document in
Japanese --
what is your consensus? Basic Kanji+Hiragana+Katakana or will
Hiragana+Katakana or just Katakana suffice?
My vote is
Visual Basic (6.0 for 32-bit Windows Development, under Windows NT) is
capable of handling Unicode strings, internally, but I found no way to
display an arbitrary Unicode text in any of the built-in controls (buttons,
text boxes, combos, etc.).
Even if I set the control's font to an
Munzir Taha wrote:
utf-8: ef bb bf
utf-16be: fe ff
utf-16le: ff fe
utf-32be: 00 00 fe ff
utf-32le: ff fe 00 00 (check before utf-16le!)
scsu: 0e fe ff (unfortunately rather rarely used)
Sorry for being a dummy about this. But I can't understand
where these bytes
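The signature list quoted above can be turned into a small sniffing routine. This is only a sketch: the byte values are the ones in the list, while the function name sniff_signature and the table layout are assumptions made for illustration.

```python
# Signatures from the quoted list. Longer ones come first so that
# UTF-32LE (ff fe 00 00) is tested before UTF-16LE (ff fe).
SIGNATURES = [
    (b"\x00\x00\xfe\xff", "utf-32be"),
    (b"\xff\xfe\x00\x00", "utf-32le"),
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\x0e\xfe\xff", "scsu"),
    (b"\xfe\xff", "utf-16be"),
    (b"\xff\xfe", "utf-16le"),
]


def sniff_signature(data: bytes):
    """Return the encoding name suggested by a leading signature, or None."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return None


print(sniff_signature(b"\xff\xfe\x00\x00A\x00\x00\x00"))  # UTF-32LE input
print(sniff_signature(b"\xff\xfeA\x00"))                  # UTF-16LE input
```

A file without any of these leading bytes yields None, and the caller then has to fall back on some other way of determining the encoding.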
I am looking for the header file containing the declarations for Uniscribe
(USP10.DLL).
I think it should be a single file named "usp10.h", but I cannot find it on
the Microsoft web site or elsewhere.
Could somebody point me in the right direction?
Thank you.
_ Marco
Patrick Andries wrote:
De : [EMAIL PROTECTED]
On page 876, the character U+6B8B is listed as being
127 strokes beyond the radical. I'd say it's more
like 6 strokes beyond the radical.
I believe it to be 5 strokes and it is already listed under
radical + 5
strokes.
Funny: it is +6
Asmus Freytag wrote:
At 09:53 AM 7/20/00 -0800, Ken Krugler wrote:
2. Is little-endian UCS-2 a valid encoding that I just don't
know about?
Yes, it is. Your example of the VFAT system is a near perfect
case, since
the details of it form what Unicode calls a 'Higher level
protocol' and
1) The UTF whose bits can be counted is not the eternal UTF.
The encoding that is not in UTR-17 is not a compliant encoding.
UCS-2 is the origin of the BMP.
UTF-16 is the origin of 1,048,576 more code points.
Therefore, constantly use UTF-8 and you'll see the mystery on your mail
Sorry to all those who are seeing the mystery above here ^ but this mail
really required UTF-8.
Joseph Becker wrote:
It seems that Chinese is the only major language in which the
term "Unicode"
needs to be translated rather than transliterated. [...]
We have collected these candidates so
Brendan Murray wrote: "Md Ziaur Rahman"
[EMAIL PROTECTED] wrote:
... found that a letter that is frequently used in Bangla is absent from
the standard. It is Bangla letter Khondo-ta I believe that
this character is a composition of TA (U+09A4) and the ZERO-WIDTH
JOINER, the so-called
Robert Brady wrote:
On Thu, 27 Jul 2000, Abdul Malik wrote:
How am I to encode the different forms in unicode?
For the last three, you can do something like
BENGALI LETTER WHATEVER
BENGALI VIRAMA
BENGALI LETTER BA
for the -va form, and
BENGALI LETTER WHATEVER
BENGALI
Addison wrote:
Actually, I erred. It's Switzerland that prefers this formula (see the
ITS and DES locales on Windows or in Java--although Java uses
three digits
for grouping and it should be four).
The Swiss locale on Windows systems actually uses ' (U+0027) as a thousands
separator, not `
Asmus Freytag wrote:
The problem with the commission design of the euro glyph is
that it only
works as long as you use their aspect ratio and uniform
stroke width. As
long as you have these, the eye will complete them to a lower
case 'e' form [...]
Visual perception is indeed a funny
((( Sorry to those who see a mangled subject. It should read "RE: Encodings
for SQL Databases" )))
Jon Peck wrote:
Most of the major databases now support Unicode at some
level, but what is
the best way to encode SQL statements for various database
access apis? [...]
According to the
Antoine Leca wrote:
char C_thai[] =
"\u0E40\u0E02\u0E17\u0E32\u0E49\u0E1B\u0E07\u0E1C\u0E33";
Would the Unicode values be converted to the local SBCS/MBCS character set?
If yes:
Is the definition of this locale info part of the C99 standard itself, or is
it operating system's locale?
And
Hi, Antoine.
I can continue to dissert on this subject (all of this should
finally be
cooked in a FAQ anyway), but I do not want to flood the list
with a marginally interesting subject.
Merci beaucoup. It was very informative!
Ciao.
Marco
P.S. You should not be so shy: up
Bob Hallissy wrote:
1) Is the Arabic Joining Class [...] normative or informative?
Like it or not, it is normative. See
http://www.unicode.org/Public/UNIDATA/UnicodeCharacterDatabase.html, that
reads:
...
ArabicShaping.txt (Section 8.2)
Basic Arabic and Syriac character shaping
Michael (michka) Kaplan wrote:
Is not
http://www.hclrss.demon.co.uk/unicode/braille_patterns.html
or alternately
http://charts.unicode.org/Web/U2800.html
already covering this?
No. These are at most the building blocks for braille. A better parallel
would be to consider these "presentation
Halldor G. Gestsson:
Can I find a list where all languages supported in the basic latin
(0x-0x00FF)?
[...]
Which languages use the latin extensions A,B and C?
Page http://www.eki.ee/letter/ contains the information to build your
lists.
_ Marco
Jörg Knappen wrote:
Are there good (authoritative) references on the so called
swiss numerical format with its peculiar thousand separator?
Why not compare the locale settings of the main operating systems? I think
that at least WinNT, Apple, Linux, and other Unixes are widely represented
on this
Steven R. Loomis wrote:
[...] Presumably the unicode codepoints in braille
would make a great format for these translations on their way to a
printer. One would hope they would get such use and not simply for
braille-looking characters on paper or screen.
You are right, I didn't catch it:
Roozbeh Pournader wrote:
That seems problematic to me, when used for Arabic. How should one use
ZWNJ between two Arabic letters to stop the ligature? They'll get
disconnected!
Good point.
ZWJ+ZWNJ+ZWJ comes to mind, but it is really not the maximum of elegance...
_ Marco
Addison P. Phillips wrote:
This is a weakness of the locale model used on the Web and most UNIX
systems: the hierarchy is based on the ISO 639 language codes
and the ISO 3166 country codes. It doesn't cover such minutiae as
"inside-a-country" variation easily nor does it deal well with
Addison P. Phillips wrote:
Differences in writing systems are much more problematic than the
Norwegian example. The Simplified/Traditional Chinese thing
leaps to mind, of course, [...]
Right. I just note that, in Unicode, this is not a display difference but
an encoding one: corresponding
Antoine Leca joked:
Neither you nor I would accept that our national language are tagged,
respectively, la-ital and la-fran... ;-)
Similarly, I believe Norwegians and Danes would not accept having their
present 2-letter codes replaced with cascaded ones in the form
"Norse"-n? or "Norse"-da
Michael (michka) Kaplan wrote:
The one irrevocable thing that LCIDs give you is a collation
choice (the regional options do not allow you to specify a separate
default
collation choice).
Another important setting that is hard-wired with Windows locale is
language. This affects some standard
Elliotte Rusty Harold wrote:
Is anyone here familiar with Armenian? The CSS Level 2 specification
from the W3C makes reference to "Traditional Armenian numbering" but
Unicode doesn't seem to include any Armenian numbers, at least as
such. Is this another language like Hebrew where the
Markus Scherer wrote:
of this list, only UTF-EBCDIC is a viable encoding form.
the others are either deprecated, never made it beyond draft,
or are unofficial discussion pieces that never made it
anywhere (i proposed one of them :-).
Please notice that at least one of these has never even
Antoine Leca wrote:
Michael (michka) Kaplan wrote:
[...]
The Monotype font and Latha in Windows 2000 are the way
that my client got
both display types.
I believe this is a rather special need that your client
have: as I understand,
he wants, at the same time, some rendering forms
Title: Win32: Commandline/batch ANSI-UTF8-UTF16-UTF8-ANSI conversion tools
Sure:
uniconv.exe by Basis Technology.
It is distributed for free
as a demo of the Rosette library; download from http://rosette.basistech.com/demo.html.
The version I
have (quite old) does not support UTF-16, but it
Michael (michka) Kaplan wrote:
From: "Rick McGowan" [EMAIL PROTECTED]
[...]
I suppose if you just want to display the non-ligature type
thing in a
situation where the font wants to give you the ligature type
thing, you
might be able to use a ZWNJ or ZWNBSP between the chars.
[...]
Please ignore my previous message (subj "[EMAIL PROTECTED]", to Antoine,
cc [EMAIL PROTECTED]). Sorry about that.
Antoine Leca wrote:
[EMAIL PROTECTED] wrote:
[...]
In ordinary cases, a ZW[N]J inside a consonant cluster does
not prevent
matra reordering. E.g., in Devanagari:
Peter Constable wrote:
- code values: integers within the space of some encoding
form; d800 - dfff
*are* code values, but not codepoints
- surrogate: I'm inclined to say that this should refer
*only* to a UTF-16
code value in the range d800 - dfff; equal to "surrogate code value"
-
_ Marco
-----Original Message-----
From: "mlinguist" [EMAIL PROTECTED]
To: "Marco Cimarosti" [EMAIL PROTECTED]
Sent: September 12, 2000 1:55:59 PM GMT
Subject: Re: Tamil glyphs
Dear Mr.Marco,
Sorry for sending an unsolicited mail to you.
I am interested in knowing a lot about t
Michael Everson wrote:
Tire Center (US)
Tire Centre (CA)
Tyre Centre (GB)
civilization (US)
civilization (GB) Oxford recommendation
civilisation (GB) Lots of folks
(Ouch! The e-mail spellchecker had a lot to complain about the above
quotation :-)
Out of curiosity: is no "en-IE" tag needed
Dieter Hoffmann wrote:
Are there known issues between the way AMD K6/2 handles
Unicode when sent to printer by Office97?
In the Windows98 SE environment whence originates this
question, Wordpad98 document containing
Greek and other special characters prints correctly, but when
handled
Roozbeh Pournader wrote:
This sequence, ZWJ ZWNJ ZWJ, really worries me. In the Arabic
script, my
interest, this is always the case. The ZWNJ is not enough in any case,
since it disconnects the letters.
And this also means some change in many simple rendering
programs that use
other
Edwin F. Hart asked:
Is there a need for a "fuzzy" comparison where names with and without
points in Hebrew? Is there a similar need for other scripts such as
Arabic?
Mark Davis replied
UCA (#10) already handles that. You will get a "fuzzy" compare if you
mask off less important weights,
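Python's standard library has no UCA implementation, so the sketch below does not mask collation weights as described above. It swaps in a cruder technique with a similar effect for this case: decompose the strings and drop all nonspacing marks before comparing. The function names strip_marks and fuzzy_equal are made up for the example.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose text (NFD) and remove nonspacing marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def fuzzy_equal(a: str, b: str) -> bool:
    """Compare two strings while ignoring points and other combining marks."""
    return strip_marks(a) == strip_marks(b)


# A Hebrew word written with points and without them.
pointed = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD"
unpointed = "\u05E9\u05DC\u05D5\u05DD"
print(fuzzy_equal(pointed, unpointed))
```

Unlike a real UCA comparison, this gives only a yes/no answer and no ordering, and it treats every nonspacing mark as ignorable, which is too aggressive for some scripts.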
Out of curiosity, when did the acronym "UTR" ("Unicode Technical Report")
mutate to those "UAX", "UTS", "DUTR" that I see in
http://www.unicode.org/unicode/reports/index.html?
And, BTW, how is it that a "Superseded UTR" is not, say, a "SUTR"?
_ Marco
Michael (michka) Kaplan wrote:
It is not that simple... what if someone else registers the
domain that uses
the common orthographic variants?
Well, I assume that it would not be possible because, by those hypothetic
collation rules, the two domains would be considered the same -- like trying
Peter Constable wrote:
On 09/16/2000 12:56:31 PM Doug Ewell wrote:
MKJ is the Ethnologue code for both 'Macedonian' and 'Slavic'.
Absolutely *everyone* knows there is no one 'Slavic'
language; the name
refers to an entire language family. This is much more
imprecise than
any of the
Jörg Knappen wrote:
No, in german "welsch" always means a romance language (in most
cases french, but also italian and even romanian can fill in). Note
also "rotwelsch".
The "generic" term for slavonic languages is "wendisch" or "windisch"
derived from the formerly slavonic "Wenden",
Karambir Rohilla wrote:
wath is maping of unicode font in indian language?
Sorry, your question is too clumsy. I think that no one will be able to give
you an answer.
You should first make some points clear to yourself, then try and ask the
question differently. The things that make your
Hi, Carl.
(You replied privately; was this intentional? If not, you can resend it to
the list, and I will re-send this one).
A better choice, IMHO, would be to normalize by *decomposition*. In this
way, the problem above would be addressed by rule 3 below.
I think you have a very good
[EMAIL PROTECTED] wrote:
Just to clarify, I have no connection with the XNS project
(other than as a
user), but posted the info about it as of possible interest
[...]
I am certainly one of those who gave the impression of addressing Tom
himself, as if he were the author of the proposal.
I
I wrote this blunder:
*Spell checking* is one of these cases, that we are all quite
familiar with. If I have to write a text using traditional
hanzi in Unicode, I can tag it as "Chinese-simplified", so
that my spell-checker can assist me signaling simplified
characters that slipped in by
Jukka Korpela wrote:
Does Unicode encode traditional and simplified Chinese characters
separately, or is the difference considered as glyph variation only,
to be indicated (if desired) at higher protocol levels?
They are encoded separately, at different code points.
What you heard about
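A one-character illustration of the answer above, using the traditional and simplified forms of the character for "country". The choice of example character is mine, not the original poster's.

```python
traditional = "\u570B"  # traditional form of "country"
simplified = "\u56FD"   # simplified form of the same character

# The two forms are distinct characters with distinct code points,
# so plain string comparison tells them apart without any markup.
print(f"U+{ord(traditional):04X} U+{ord(simplified):04X}")
print(traditional == simplified)
```

Because the difference lives in the encoding itself, a search for one form will not match the other unless the software maps between them.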
Karambir Rohilla wrote:
Please help me anyone
waht is UTF8 UTF16 ?
I found these to be well written and helpful:
- "Forms of Unicode"
(http://www-4.ibm.com/software/developer/library/utfencodingforms/index.html)
by Mark Davis.
- "Unicode Transformation Formats: UTF-8 & Co."
Carl W. Brown wrote:
It would certainly seem that the optimal solution would be to
carry the locale.
Not at all, and for a good reason: I need to reach the same web site
whenever and wherever I type in a certain string.
Scenario:
Imagine that I am a customer of Äöü, a (fictional)
George Zeigler wrote:
someone send me a FAQ page that explains the difference
between UTF-8 and Unicode (UTF-16 I suppose).
You should perhaps read it again ;-)
UTF-8 if I understand correctly only supports
European characters, where as UTF-16 supports all major
characters world
I muttered this incomprehensible paragraph:
- UTF-16 has 16-bit units ("words") and uses 1 or 2 units per
character. Characters 00 to 00 use the corresponding
word; higher values use a pair of "surrogates", the first one
("high") being in . It too exists in the same 3 variants as
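Since the numeric ranges in the paragraph above were lost in the archive, here is a sketch of the surrogate mechanism using the ranges defined by the Unicode Standard (high surrogates D800 to DBFF, low surrogates DC00 to DFFF). These values are supplied from the standard, not recovered from the message, and the function name to_utf16_units is hypothetical.

```python
def to_utf16_units(code_point: int) -> list:
    """Return the UTF-16 code units (one or two) for a code point."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point <= 0xFFFF:
        return [code_point]              # BMP: the unit equals the code point
    offset = code_point - 0x10000        # 20 bits to split across two units
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return [high, low]


for cp in (0x41, 0x10000, 0x1D11E, 0x10FFFF):
    units = " ".join(f"{u:04X}" for u in to_utf16_units(cp))
    print(f"U+{cp:04X} -> {units}")
```

The two 10-bit halves give 1024 x 1024 = 1,048,576 code points beyond the BMP.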
John Cowan wrote (in ASCII(tm), by the way):
In fact, of course, every extant Klingon text can be written
with Unicode, and indeed with ISO 646:1983.
Well, it can -- provided that you properly *registered* your copy of
ASCII(tm) (http://www.wholehog.fsnet.co.uk/robert/ascii/), and paid your
Raghu Kolluru wrote:
My email delivery programs works with most of the charsets
but not with
shift_jis.
Here are the steps that I do,
1) I get a text file from Japan which as the content in the
encoded charset.
2) I paste this content in web based UI and store it in SQL server
3) Then I
Carl W. Brown:
An article in the October 12, 2000 issue of Linux Weekly News
http://lwn.net/bigpage.php3 tries to explain the benefit: "Many
Asian characters are composites, made up of one or more simpler
characters. Unicode simply makes a big catalog of characters, without
recognizing
Jon Babcock wrote:
It seems to me that if not for that, how could anyone
make a Chinese font? Who is going to sit down and
draw a *myriad* or more characters? Since elements
recur, this reduces the amount of labour required
greatly.
I too would have bet that all CJK foundries used some form
Jon Babcock wrote:
BTW, Marco, as near as I can recall, the above quotation in not from
me.
Did it again! Shame on me! Sorry!
_ Marco
James E. Agenbroad wrote:
If I had to make a guess it would be that transforming the
glyphs of parts of characters so they will fit together in
a pleasing fashion would take about as much effort (or
more) than designing separate glyphs for each new character.
Perhaps. I am a programmer, so
[EMAIL PROTECTED] wrote on [EMAIL PROTECTED]:
Are there languages you might need to encode where
colour is important? (such as, if a certain shape
in red is one letter, but in blue it is a different
letter)
I think this is the case for the Nahuatl (Aztec) script, where color is a
primary
igrams", "holograms", etc.),
and how (and whether) this analysis could be useful for encoding text on
computers, building software fonts, and other computer-related fall downs.
Then I (Marco Cimarosti) wrote:
Anyway. I think that everybody has probably had quite enough of these
daydreams of
Patrick Andries wrote, quoting from the Frankfurter Allgemeine Zeitung:
[...] drei völlig getrennte Schriftsysteme gewissermaßen in bunter
Mischung [...]
I am not sure which "three completely separate writing systems" the author
had in mind. There are several possible ways of counting "Japanese
Well, my executives are mostly Italians or Dutchmen, so they are quite used
to the perils of their own languages.
Ouch! I have just bitten my tongue in the attempt of pronouncing a very
dangerous Italian phoneme! I need medical assistance, fast!
_ Ma?co
-----Original Message-----
From: J. P.
Mike Ayers wrote:
I discovered this weekend that Chinese, despite grouping large
numbers by ten thousands [...], write their digits with comma
separators every 3 digits [...]
This may be different in different operating systems, but I too was
convinced that they grouped four digits at
Flask Eric wrote:
I have installed the Unicode versions of Arial and Times New
Roman on Windows
98 running Office 97 on several PCs. Everything works fine
but on two separate
occasions I found out that when printing the Maltese
Characters on particular
printers, the Maltese Characters are
Paul Deuter wrote:
So can anyone point me to a web-site or page that is encoded
in Unicode (UTF-16 or UCS-2)?
I have seen one single example of a web page in UTF-16 (but I can't remember
the URL), and never saw one in UCS-2.
It is much more likely to find Unicode web pages in the form of UTF-8
I have some questions about the usage of hanja (Chinese characters) in
Korean.
1) Is it correct to say that hanja are only used for words derived from
Chinese, and never for genuinely Korean words?
2) Is it true that hanja have been abolished in North Korea? When did this
happen?
3) How often
John Cowan wrote:
3) How often are hanja used today, however? (...)
I believe they are still common in newspaper headlines,
because of the greater
degree of compression they permit.
Do you mean that some hanja have a polysyllabic pronunciation in Korean?
I thought that any single hanja
John Cowan wrote:
Marco Cimarosti wrote:
Do you mean that some hanja have a polysyllabic
pronunciation in Korean?
Yes. Of the 9033 Unihan characters with Korean readings
given in the Unihan.txt
file, there are 689 with two-syllable mappings, 13 with
three-syllable mappings,
and 2
Antoine Leca wrote:
My understanding is that there are a number of similar cases,
which are not
officially prohibited (AFAIK), but does not carry any sense.
For example, how about digits followed by accents (as
combining marks)?
Or the kana voicing/voiceless combining marks, when they
Elliotte Rusty Harold wrote:
One thing I'm very curious about going forward: Right now character
values greater than 65535 are purely theoretical. However this will
change. It seems to me that handling these characters properly is
going to require redefining the char data type from two
Addison P. Phillips wrote:
I ended up deciding that the Unicode API for this OS will only work in
strings. CTYPE replacement functions (such as isalpha) and
character based
replacement functions (such as strchr) will take and return
strings for
all of their arguments.
Internally, my
Ooops!
In my previous message, I wrote:
wchar_t * _wcschr_32(const wint_t * s, wchar_t c);
wchar_t * _wcsrchr_32(const wint_t * s, wchar_t c);
What I actually wanted to write is:
wchar_t * _wcschr_32(const wchar_t * s, wint_t c);
wchar_t * _wcsrchr_32(const wchar_t * s, wint_t c);
Sorry if
Antoine Leca wrote:
Marco Cimarosti wrote:
Actually, C does have different types for characters within
strings and for
characters in isolation.
That is not my point of view.
There is a special case for 'H', that holds int type rather
than char, for
backward compatibility reasons
David Starner wrote:
Sent: 20 Nov 2000, Mon 16.18
To: Unicode List
Subject: Re: string vs. char [was Re: Java and Unicode]
On Mon, Nov 20, 2000 at 06:54:27AM -0800, Michael (michka)
Kaplan wrote:
From: "Marco Cimarosti" [EMAIL PROTECTED]
the Surrogate (aka "Astral"
Lukas Pietsch wrote:
a lot was said in this thread about intelligent rendering
mechanisms, [...]
I figure that people are mostly thinking of the technology
called "Open Type", is that right?
Right, but quite partial. There are several major technologies for rendering
"complex Unicode
list for a while. And, about points 2 and 3 above, beware that I am
a second language English speaker and that I don't have much experience of
American pronunciation.
Ciao.
Marco Cimarosti
Peter Constable wrote:
I'd add the square brackets, an off-glide on the "o", and
aspiration (02b0) after the "k".
Is that k aspirated? I do hear an aspiration when [p], [t] or [k] are at the
*beginning* of "words" (mainly because teachers told me I was supposed to
notice it), but I don't feel
Mark Davis wrote:
Much as I admire and appreciate the French language (second only to
Italian),
the proximate derivation of "Unicode" was not from that language, and the
transcription should not match the French pronunciation. Instead, it has
solid Northern Californian roots (even though not
{Notice: way off-topic}
Mark Davis wrote:
There was a period well after the Norman invasion where a
large number of words came into English directly from
Latin, which was still in widespread use among scholars.
Right. And it also was the language of priests, on both sides of the
Channel.
Peter Constable wrote:
In the better known Indic scripts, are there ever cases of conjuncts formed
with independent vowels and a following consonant?
I know this may sound weird. The idea would be a VC syllable like "al".
Things that are more familiar are to have CC conjuncts, which would have an
Rob Hardy wrote:
I'm preparing some mappings of teletext character sets to Unicode.
From
http://www.sneezes.freeserve.co.uk/teletext/tech/encodings/G0_ARABIC.txt:
0x60 0x2010 # HYPHEN (or is it a dash?)
I think that 0x60 should be U+0640 (ARABIC TATWEEL): a character used to
extend
Peter Constable:
Does anybody recognise the script in the attached sample.gif?
I already tried with handwritten Devanagari (without the top bar), but an
expert on another list said that it is unlikely. I thought too it could be
Georgian, but then I was unable to match any single letter.
My Greek textbook has acute, grave, and circumflex (called by
those names),
but I'm not sure what these correspond to in the Greek and
Greek Extended
blocks (there seem to be many more diacriticals than those).
Is there an on-line guide somewhere?
There are in fact other diacritics
Erik Garrs wrote:
The elements of the periodical table (chemistry) are
missing, and they are especially needed in Chinese
because they don't have alphabet, so they need
them as a graphical representation.
Some of these characters are quite common in modern life (e.g., "oxygen" is
certainly
Michael Everson wrote:
There is no reason the Chinese or anyone else cannot write
this with LATIN CAPITAL LETTER O and SUBSCRIPT TWO.
I think there is a misunderstanding, probably on my side.
In his Spanish version, Erik claimed that the chemical elements were missing
"en el contexto de los
Erik Garrs wrote:
Now that thanks to Pierpaolo BERNARDI who found a book (...)
(dictionary) where shows what I was mentioning,
MOST Chinese dictionaries that I have seen bear a table of chemical elements
at the end. Perhaps you would have found out earlier by going to a public
library.
here we
Richard Cook wrote:
Has anybody played devil's advocate to this, with a list of
"Failings of
Unicode"? Are there any? :-) This question might in fact result in a
longer Benefits list
Although I've always been a Unicode fan, Richard's invitation is too
tempting. :-)
I'll add these to