On Aug 13, 2016, at 2:06 PM, eduardo marin wrote:
>
> It is well known that the Southern Song style of counting rods had different
> forms for the digits 4, 5 and 9 https://en.wikipedia.org/wiki/Counting_rods ;
> however, currently there is no way to represent such
On Jul 28, 2015, at 6:00 AM, Michael Everson ever...@evertype.com allegedly
wrote:
Emojis are not for labelling things. They’re for the playful expression of
emotions.
Is that what they're for? I thought they were (encoded) to satisfy certain
device manufacturers. And, what is the emotion
On Jul 28, 2015, at 7:53 AM, Doug Ewell d...@ewellic.org wrote:
Richard Cook rscook at wenlin dot com wrote:
And, what is the emotion playfully expressed by ?
I'm having a burger and fries for lunch but can't be bothered to type
all that into this text message lol
Is all that one
On Jul 28, 2015, at 8:56 AM, Asmus Freytag asm...@ix.netcom.com wrote:
On 7/28/2015 8:07 AM, Richard Cook wrote:
On Jul 28, 2015, at 7:53 AM, Doug Ewell d...@ewellic.org wrote:
Richard Cook rscook at wenlin dot com wrote:
And, what is the emotion playfully expressed by ?
I'm having
On Jul 7, 2015, at 7:53 AM, Richard Cook rsc...@wenlin.com wrote:
Ken Whistler wrote:
vexillology
Garth Wallace wrote:
Tangentially, I recently ran across something called International
Flag Identification Symbols. It's a symbolic notation for vexillology
that describes their use
Ken Whistler wrote:
vexillology
Garth Wallace wrote:
Tangentially, I recently ran across something called International
Flag Identification Symbols. It's a symbolic notation for vexillology
that describes their use of flags and some aspects of their design but
not enough to reproduce
On Jun 30, 2015, at 9:11 AM, Garth Wallace gwa...@gmail.com wrote:
I don't think display of U+1F308 as a rainbow flag would be expected
behavior. It risks turning a text like It's a beautiful day! into a
political statement.
Garth,
Any statement can be a political statement, in the
Ken,
I know that U+1F308 is RAINBOW ... because my nameslist lookup tool tells me so
...
T C UTF-8 Codepoint : Name : Annotations
1 C2_A0 1F308 RAINBOW
http://linguistics.berkeley.edu/~rscook/cgi/nameslistsearch.html
... but could also be a 'rainbow (flag)'?
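A lookup like the one above can be reproduced locally. This is a minimal sketch using Python's standard unicodedata module rather than the CGI tool at the URL above; the helper name describe is made up for illustration, and the name returned depends on the Unicode version bundled with the interpreter.

```python
import unicodedata

def describe(codepoint: int) -> str:
    """Return 'UTF-8 bytes, code point, character name' for one code point."""
    ch = chr(codepoint)
    # Hex bytes of the UTF-8 encoding, joined with underscores.
    utf8 = "_".join(f"{b:02X}" for b in ch.encode("utf-8"))
    # Fall back to a placeholder if this Python build has no name for it.
    name = unicodedata.name(ch, "<unnamed>")
    return f"{utf8} {codepoint:04X} {name}"

print(describe(0x1F308))
```

The character name is only an identifier; how a font or platform chooses to draw the glyph is outside what this lookup can show.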
On Oct 7, 2014, at 5:23 PM, Mark E. Shoulson m...@kli.org wrote:
The infamous Biang-Biang Noodle
Mark,
You seem to know as much as anyone about biang. All I can say is, biang is
attested in tones 2, 4 and 1, and enshrined (along with a glyph variant) in
Wenlin CDL PUA at U+E999, with 51 or
On Sep 20, 2014, at 5:35 PM, Jonathan Coxhead jonat...@doves.demon.co.uk
wrote:
Here's an icosahedral die from the Ptolemaic period:
http://www.metmuseum.org/collection/the-collection-online/search/551070
I find myself idly wondering whether the identities of the characters are all
On Sep 9, 2014, at 8:28 AM, Richard COOK rsc...@wenlin.com wrote:
On Sep 8, 2014, at 12:03 PM, John Armstrong john.armstrong@gmail.com
wrote:
Mr. Armstrong,
I see that my reply to your message bounced from the main Unicode list, due to
length constraints.
At any rate, the message did
On Jul 3, 2014, at 1:48 PM, Asmus Freytag asm...@ix.netcom.com wrote:
On 7/3/2014 11:02 AM, Richard COOK wrote:
On Jul 2, 2014, at 8:02 AM, Karl Williamson pub...@khwilliamson.com wrote:
Corrigendum #9 has changed this so much that people are coming to me and
saying that inputs may very
On Jul 2, 2014, at 8:02 AM, Karl Williamson pub...@khwilliamson.com wrote:
Corrigendum #9 has changed this so much that people are coming to me and
saying that inputs may very well have non-characters, and that the default
should be to pass them through. Since we have no published wording
On Apr 24, 2014, at 2:16 PM, Whistler, Ken wrote:
Given the incredible level of interest shown on this list during
the last week, I am glad that I can finally announce the publication
of Bidi Brackets for Dummies:
http://www.unicode.org/notes/tn39/
Dear Dr. Ken,
Thanks ever so much for
On Mar 12, 2014, at 2:59 AM, Adam Nohejl wrote:
Since kRSUnicode is a Normative property, a formal proposal to modify that
data is required, for review in WG2. I have added notes on the items you
mention below, for consideration in that process, and in the meantime, if
you identify any
Mr. Nohejl,
About the property data you mention below. kRSUnicode property data permits
multiple/variant (space-delimited) radical/stroke values, and I think we will
see important variants added in the future. Where a specific value attested in
a specific Kangxi edition is missing from
On Feb 27, 2014, at 7:23 AM, Michael Everson wrote:
On 27 Feb 2014, at 02:32, Shriramana Sharma samj...@gmail.com wrote:
Given that Unicode encodes scripts and not languages, how appropriate is it
to call the BMP and the SMP as the multi*lingual* planes?
You are more than two decades
On Dec 16, 2004, at 3:20 PM, Tom Emerson wrote:
Ah, I don't have my copy of the Comprehensive ABC here at home with me.
If you have Wenlin, you have it in electronic form. Wenlin does the
typesetting (and sub-licensing) for ABC, and the ABC data is accessible
from within the Wenlin app.
But on
On Dec 5, 2004, at 07:02 PM, Doug Ewell wrote:
A word-based encoding for English could automatically assume spaces
where they are appropriate. The sentence:
What means this, my lord?
would have seven encodable elements: the five words, the comma, and the
question mark. Spaces would be
On Dec 5, 2004, at 12:27 AM, Tim Finney wrote:
my co-worker suggested encoding entire words in Unicode.
The word is considerably less well-defined than the character. The
set of words is open-ended. If you'd like to see where you go when you
start trying to encode words, take a look at CJK
On Dec 4, 2004, at 12:15 PM, John Hudson wrote:
I think Peter's point was that complex scripts require font layout
tables
Script complexity is not so easily quantified. Has anyone tried to sort
scripts by complexity? In terms of the present discussion, Han would be
viewed as a simple script, and
On Thu, 2 Dec 2004, John Cowan xiele:
Paul Hastings scripsit:
speaking of which, *are* there any open source fonts that come even
close to Arial Unicode MS?
In what, breadth of coverage or aesthetics? The GNU Unifont has very
wide coverage though it is a bitmap font; James Kass's CODE
On Mon, 29 Nov 2004, Kenneth Whistler opined contemplatively:
Allen Haaheim provided some further detailed clarification:
Note that Han characters are logographic, not ideographic. That is,
they are graphemes that represent words (or at least morphemes),
not ideas.
This correctly states
The term ideograph has special meaning in Unicode/ISO usage. Ideograph
is short for CJK Unified Ideograph, and is one of the characters with
mapping or reference data in the Unihan.txt database.
Likewise, Radical has special meaning. CJK Radicals are found in two
places, in the Kangxi Radicals
On Oct 13, 2004, at 1:42 PM, Eric Muller wrote:
Going back to the original scenario, to make my point clearer:
System A, a subset of FileMaker, has {U+0065, U+0303, U+1EBD} as its
repertoire. When presented with the input U+0065, U+0303, it
produces the output U+1EBD.
System B, my rendering
Jon,
Thanks for your reply.
On Oct 13, 2004, at 3:15 AM, you wrote:
imported UTF-8 sequences like [U+0065][U+0303] e, tilde get
remapped internally to [U+1ebd] LATIN SMALL LETTER E WITH TILDE.
Is this kind of behavior what one would expect?
That's conformant, if it causes problems with any other
Using a certain newly Unicode-aware database application which shall
remain nameless (FileMaker 7):
imported UTF-8 sequences like [U+0065][U+0303] e, tilde get remapped
internally to [U+1ebd] LATIN SMALL LETTER E WITH TILDE.
Is this kind of behavior what one would expect?
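The remapping described here matches what Unicode canonical composition (NFC) does to this sequence. The sketch below uses Python's standard unicodedata module to show that. Whether FileMaker 7 actually applies NFC internally is an assumption; the message only reports the observed result.

```python
import unicodedata

# LATIN SMALL LETTER E followed by COMBINING TILDE, as in the imported data.
decomposed = "\u0065\u0303"

# NFC composes canonically equivalent sequences where a precomposed form exists.
composed = unicodedata.normalize("NFC", decomposed)
print([f"U+{ord(c):04X}" for c in composed])

# NFD reverses the step, so the two forms are canonically equivalent.
print(unicodedata.normalize("NFD", composed) == decomposed)
```

Because the two forms are canonically equivalent, a process that compares text after normalizing both sides to the same form will treat them as the same string.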
It's problematic (and
On Wed, 7 Apr 2004, Peter Constable wrote:
They were encoded that way some while before they were accepted in
Unicode. Also, until Unicode 4.1 is published, there is a possibility
that codepoints may change.
I see. I assumed the codepoint assignments were already firm.
, 2003, at 15:55 US/Pacific, Richard Cook wrote:
The English TXJ names come from Michael Nylan's book. You'll have to
find that book to learn what she meant. Or better, get a copy of the
Chinese original. -Richard
On Saturday, Oct 11, 2003, at 13:28 US/Pacific, Patrick Andries wrote:
Would
The English TXJ names come from Michael Nylan's book. You'll have to
find that book to learn what she meant. Or better, get a copy of the
Chinese original. -Richard
On Saturday, Oct 11, 2003, at 13:28 US/Pacific, Patrick Andries wrote:
Would anyone know where I could find some background
On Thursday, Sep 11, 2003, at 09:42 US/Pacific, Michael Everson wrote:
At 11:04 -0400 2003-09-11, Patrick Andries wrote:
Does TAI LE, encoded in Unicode 4.0, refer to the same language as
TAI NÜA ?
Yes.
If so, isn't TAI NÜA the most frequently used form of this language ?
According to the
On Thursday, Sep 11, 2003, at 10:45 US/Pacific, Michael Everson wrote:
At 10:02 -0700 2003-09-11, Richard Cook wrote:
I'm guessing that Tai Le would be the exonym (Chinese name), while
TAI NÜA is the autonym.
Don't guess. The Chinese name is Dehong Dai.
Well, Le is a Chinese (Mandarin) syllable
Gedney says nuea/nü is a Thai word for 'north/northern' ... looks
as if the syllable in this name gets written many different ways ...
le, lu, lü, lüe, lue, nü, nüa, nüe, neua, nuea ... at least it's
possibly the same syllable.
Here are some references:
Gedney, William J. 1976. Notes on Tai
Ostermueller, Erik wrote:
I apologize if you all have already discussed this.
At unicode.org, when I click this link,
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=2
I'm expecting to see a little square GIF that displays U+2.
Instead, I see N/A.
Shouldn't there
Sourav,
You wrote:
Hi All,
Does Unicode support both Simplified as well as Traditional Chinese ?
Yes, it does, though the Simplified support is rather lacking in
comparison with the Traditional, since the Traditional character set is
rather large, if not completely open-ended, and
Michael Everson wrote:
I wonder what Quark would do if we all wrote to [EMAIL PROTECTED] to
ask for Unicode support.
Good idea. I just did. But, Quark is just the tip of the iceberg. I
still need a good (Mac OS X) database that can do Unicode Chinese
(including supplemental planes). Any
On Friday, June 20, 2003, at 02:44 , Kenneth Whistler wrote:
What is true is that use of italicized text is unusual
in Chinese or Japanese body text--certainly not with the frequency
or same range of functions as occurs in Latin typography.
Bold text is not that unusual, however.
In precomputer
On Tuesday, October 16, 2001, at 08:00 PM, James Kass wrote:
Are there any instructions for reporting errata such as the glyphs
at U+29FD7 and U+29FCE being identical?
[U+29FD7] and [U+29FCE] are not identical. They are (admittedly rather
close) graphical variants. If you want to ID all
Michael (michka) Kaplan wrote:
From: John H. Jenkins [EMAIL PROTECTED]
Has the UNIHAN.TXT file been updated to include radical-stroke data
for Plane Two characters?
Yes. Ever since Unicode 3.1 was released. (We still don't have an
Extension B font, however.)
There is one in
Becker, Joseph wrote:
Unicode is going to stick with the KangXi radical system
There Unicode goes again, flouting the will of the people ... while
meanwhile in another thread an esteemed Unicode elder has proposed the death
radical. It's time to bring this system into the 21st Century:
Thomas Chan wrote:
On Mon, 9 Jul 2001, Richard Cook wrote:
On a related note, I have 9000 word/char frequencies from Hanyu Pinlu
Cidian (a mainland text; I typed the entries in back in the early 90's,
and this is the freq data currently used in Wenlin). I'd be happy to
give
John H. Jenkins wrote:
It is on occasion something of an art figuring out the correct
radical/stroke position for a character in this kind of an index, sad
to say.
I'd say, when 2 radicals are possible, put it under both. When 3, well
... you probably get the idea ...
James Kass wrote:
Richard Cook wrote:
John H. Jenkins wrote:
It is on occasion something of an art figuring out the correct
radical/stroke position for a character in this kind of an index, sad
to say.
I'd say, when 2 radicals are possible, put it under both. When 3, well
Michael Everson wrote:
UTC approved it and there's a new document from John Jenkins and me
on Shavian for WG2, so it should get approved for ballotting at the
next meeting of WG2.
Hi Michael,
I'm new to the idea that anyone would care to have Shavian encoded. Will
you enlighten me?
Best,
Michael Everson wrote:
At 11:10 -0700 2001-07-04, Richard Cook wrote:
Michael Everson wrote:
UTC approved it and there's a new document from John Jenkins and me
on Shavian for WG2, so it should get approved for ballotting at the
next meeting of WG2.
Hi Michael,
I'm new
Rick McGowan wrote:
I don't think there's any point in encoding 64 hexagrams; especially when
we have the pieces already. Use the pieces of three and position them with
a drawing program. We don't have combining thingies for putting chess
pieces on board squares, either.
Hi Rick,
I
Another list member mentioned (off-list) the system of 9 bigrams and 81 tetragrams.
These appear in the text of a book called [U+592a][U+7384][U+7d93]
Tai Xuan Jing by [U+63da][U+96c4] Yang Xiong (c. 53 BC-c. 18 AD).
Where the 64 hexagrams are based on a binary system,
the 81 tetragrams are based
John Cowan wrote:
Rick McGowan scripsit:
I don't think there's any point in encoding 64 hexagrams; especially when
we have the pieces already. Use the pieces of three and position them with
a drawing program. We don't have combining thingies for putting chess
pieces on board
Michael Everson wrote:
At 13:59 -0700 2001-07-03, Edward Cherlin wrote:
But I thought proposals for characters with decompositions into existing
characters are no longer being accepted.
True for accented letters where the combining marks already exist,
but I don't think we want to
John H. Jenkins wrote:
At 8:07 PM +0200 7/3/01, Genenz wrote:
Should one consider the Chinese oracle bone
inscriptions (1200 BC) for entry to the unicode list?
They really did exist.
As a rule, historical scripts (in which I'll include OBI, even though
their descendant is with us
Michael Everson wrote:
At 12:33 -0700 2001-07-02, Edward Cherlin wrote:
Has anyone proposed the following for inclusion in Unicode? If so,
what is their status?
Daoist Hexagrams, 64 forms (the trigrams are already included, but
with no combining mechanism)
You're welcome to, if you
John H. Jenkins wrote:
At 7:07 PM -0700 7/2/01, Richard Cook wrote:
Evidence? There's ample evidence, starting c. 1000 BC, with
[U+5468][U+6613] _Zhou Yi_ (aka _Yi Jing_ aka _I Ching_ aka _The Book of
Changes_), an artifact of the Zhou Dynasty ...
I agree with Richard here. It's silly
Edward Cherlin wrote:
I use Cangjie to access my character database, since it is usually much
faster than radical and stroke count, and I usually don't know the Chinese
pronunciation of characters I need to look up. The database gives me
Radical number, Stroke count, Chinese, Japanese, and
John H. Jenkins wrote:
At 4:16 PM -0600 6/1/01, Jon Babcock wrote:
The Asia/East Asian/CJK thread reminded me of one of my own pet
peeves, the use of 'ideograph' to refer to kanji.
Perhaps some of the professionals on this list can enlighten me
here. I thought that an ideograph meant
Jon Babcock wrote:
The Asia/East Asian/CJK thread reminded me of one of my own pet peeves,
the use of 'ideograph' to refer to kanji.
Perhaps some of the professionals on this list can enlighten me here. I
thought that an ideograph meant that the graph stood for an idea, not a
sound or a
Anyone know which US president is [U+704c][U+6027][U+704c] ?
Someone told me this (admittedly silly) joke in Japanese, with
[U+85ea][U+6027][U+85ea]
Thomas Chan wrote:
On Sat, 26 May 2001, Richard Cook wrote:
Gaspar Sinai wrote:
On Sat, 26 May 2001, Richard Cook wrote:
Here's a puzzle: Any idea 1.) what this character is, and 2.) if
it's in Unicode?
http://linguistics.berkeley.edu/~rscook/bishop/Picture1.gif
Here's a puzzle: Any idea 1.) what this character is, and 2.) if it's in Unicode?
http://linguistics.berkeley.edu/~rscook/bishop/Picture1.gif
[EMAIL PROTECTED]
http://www.yudit.org/
On Sat, 26 May 2001, Richard Cook wrote:
Here's a puzzle: Any idea 1.) what this character is, and 2.) if it's in Unicode?
http://linguistics.berkeley.edu/~rscook/bishop/Picture1.gif
at http://www.perl.com/pub/2001/05/03/wall.html Larry Wall writes:
Perl 6 programs are notionally written in Unicode, and assume
Unicode semantics by default even when they happen to be
processing other character sets behind the scenes. Note that
when we say that Perl is written in
Another web page, for your collective amusement:
http://linguistics.berkeley.edu/~rscook/html/Unicode-tetralog.html
I thought Sarasvati was immune to this. Parvati?
Tex Texin wrote:
not the same as work for execs. The success of Unicode is obvious
to us (techies) is not clear to them.
Tex,
Recently looking at and talking about this
http://i18n.homepage.com/UnicodeBenefits.html
with some people, initiated and uninitiated, I quickly wrote this:
Kenneth Whistler wrote:
Doug Ewell asked, on this hopelessly wandering thread:
(Is there an English-language term for the subset of the CJK ideographic script
that is used by a given language, say, Japanese?)
Well, since "kanji" by now has been borrowed into English, at least among
Thomas Chan wrote:
But is a romanized version of U+6F22 U+5B57 based on the Cantonese
pronunciation ever used in English writing the way hanzi (based on
Mandarin pronunciation) is?
it could be ... it might even be used as a special term to distinguish
"Cantonese Ideographs" ...
For those
Thomas Chan wrote:
There is also a similar phenomenon in Chinese, called fangyanzi '"dialect"
character', which may be considered analogous to the above, the most well
known being the Cantonese ones, although others (Wu, Hakka, etc) do exist.
[1] There is a small chance that they might
Jungshik Shin wrote:
On Tue, 27 Feb 2001, Thomas Chan wrote:
On Tue, 27 Feb 2001, Richard Cook wrote:
* 'chunom' in Vietnamese [similar to (i.e., analogical) Chinese characters].
If one is going to talk about Vietnamese chu+~ no^m '"southern"
characters', then one mig
"John H. Jenkins" wrote:
At 7:57 AM -0800 2/26/01, Richard Zhang wrote:
Hello, Marco,
Unihan is the official site I think. You can visit www.unihan.com.cn for
more information about this, if you know Chinese :).
Knowing Chinese is not enough. You and your browser need to know
Simplified
Tom Lord wrote:
I think I'd like bijective too, if I knew what it meant. Someone?
It would be a lot more fun to answer this question in plain-text
Unicode (using math notation) than in ASCII.
Informally:
"Bijective" describes a mapping between two sets. Every element of
the source
Sorry, I tuned out for a moment: is there a URL for the final version of
Tex's tabulation of benefits?
Also, I'd appreciate any similar links that might be used in a page of
info for the uninitiated.
Best,
Richard
Mark Davis wrote:
that must be made about what counts as an abstract character and what
does not; and the generally acknowledged desirability of supporting
bijective mappings between a variety of older character sets and
while I like bijective, it is not a commonly understood term.
I
Just a correction. Someone previously asked about
http://www.wenlin.com/
and its support for Vertical Ext. A. It turns out that this support has
not yet made it into the public release ...
Best,
Richard
Kenneth Whistler wrote:
I cannot check now if these characters are included in Unicode as I don't
have TUS handy in this moment.
http://www.unicode.org/unicode/uni2book/u2.html (The Online Edition)
and
http://www.unicode.org/charts/draftunicode31/ (for CJK Extension
John Jenkins wrote:
On Thursday, January 25, 2001, at 03:14 AM, Pierpaolo BERNARDI wrote:
I was talking about the index for the hanzi's ordered by radical+strokes
which can be found at the end of the book, since I wanted to check
whether high numbered elements were there. I know the
Kenneth Whistler wrote:
I could not find the radical index. Has this been put online too?
No. The CJK radical index was generated and printed with custom
software from the Unihan database. It was too much effort to try
to convert that software to produce a postable .pdf file, so the
Richard Cook wrote:
Kenneth Whistler wrote:
I could not find the radical index. Has this been put online too?
No. The CJK radical index was generated and printed with custom
software from the Unihan database. It was too much effort to try
to convert that software to produce
Kenneth Whistler wrote:
I could not find the radical index. Has this been put online too?
No. The CJK radical index was generated and printed with custom
software from the Unihan database. It was too much effort to try
to convert that software to produce a postable .pdf
Kenneth Whistler wrote:
Richard Cook surmised:
BTW, in a very close transcription, if one is using superscription
(position above baseline) and relative size reduction to indicate
aspiration, I suppose that degree of superscription or the size or both
could be modulated to indicate
I see 2 Traditional Chinese translations here:
http://www.macchiato.com/unicode/Unicode_transcriptions.html
Which one do people like?
http://my.ispchannel.com/~markdavis//unicode/Unicode_transcription_images/U_Chinese2.gif
John Jenkins wrote:
On Thursday, January 11, 2001, at 10:25 AM, Richard Cook wrote:
Which one do people like?
http://my.ispchannel.com/~markdavis//unicode/Unicode_transcription_images/U_Chinese2.gif
Is much better. "Unified Code"
This was my opinion too. I like "to
Jon Babcock wrote:
At first glance, I agreed. But then if the U_Chinese3.gif gets
shortened to the last three characters, wanguo ma, as I suspect it
would in practice, I'd favor it slightly over the three-character
tongyi ma of U_Chinese2.gif. FWIW. To me, wanguo ma emphasizes the
Kenneth Whistler wrote:
Thus the Uighur script is the direct ancestor of the Mongolian
script, and is also a term used for the modern Mongolian script
itself, to distinguish it from Mongolian written in one of the other
scripts (including Latin and Tibetan).
And the Uighur script has
"J%ORG KNAPPEN" wrote:
The curly-tail consonants t, d, n, l, c, z are also included in the
TeX IPA (tipa fonts). The documentation of those fonts is available
on
ftp://ftp.dante.de/texarchive/fonts/tipa/tipaman.ps.gz
--J"org Knappen
Hi J"org,
It looks as if you sent the wrong url. The
I've been meaning to mention this program on-list. Tom Bishop's Wenlin at
http://www.wenlin.com/
is a self-contained, Mac/Win means of editing Unicode Chinese. I've
heard Unicoders speak well of it before. At the last conference one
presenter said in his presentation, concluding his praise of
This table has undergone some further revision:
http://stedt.berkeley.edu/pdf/curly-tail-table3.pdf
Please note in the center of the table:
U+0291/U+0293 and U+0255/U+0286
These 4 may in fact be 2 pairs of functional equivalents (synographs),
pointing to the same place of articulation.
Michael Everson wrote:
At 13:10 -0800 2000-11-23, Richard Cook wrote:
Hi everyone,
This paper, brought to your attention last June
http://stedt.berkeley.edu/pdf/curly-tailed-tdnlcz.pdf
http://stedt.berkeley.edu/pdf/TranscriptionTable-WUZongji.jpg
has been updated recently. Still
List members might find information on work being done on Xi Xia
(Tangut) Script to be of interest.
Prof. GONG Hwang-cherng and his colleagues in the Institute of
Linguistics at Academia Sinica in Taiwan have been working for the past
several years to
"J%ORG KNAPPEN" wrote:
The curly-tail consonants t, d, n, l, c, z are also included in the
TeX IPA (tipa fonts). The documentation of those fonts is available
on
ftp://ftp.dante.de/texarchive/fonts/tipa/tipaman.ps.gz
--J"org Knappen
Thanks. The URL should have a hyphen in it:
Hi everyone,
This paper, brought to your attention last June
http://stedt.berkeley.edu/pdf/curly-tailed-tdnlcz.pdf
http://stedt.berkeley.edu/pdf/TranscriptionTable-WUZongji.jpg
has been updated recently. Still working on getting the formal
proposal together, and still welcoming comments and/or
90 matches