"Andrew C. West" <[EMAIL PROTECTED]>
wrote on Friday, February 14, 2003 2:29 AM
Subject: Re: traditional vs simplified chinese
> On Thu, 13 Feb 2003 09:48:45 -0800 (PST), "Zhang Weiwu" wrote:
> 
> > Take it easy, if you find one 500B (the measure word)  it is usually enough to
> > say it is traditional Chinese, one 4E2A (measure word)  is in simplified
> > Chinese. They never happen together in a logically correct document.
> 
> Marco is absolutely correct that Simplified and Traditional Chinese may
> legitimately be found together on the same Web page (and I for one have several
> pages where they do).

> Certainly, I've seen "traditional" texts which mix U+500B with U+4E2A (and with
> U+7B87 for that matter). With Unicode it is now possible to transcribe
> traditional texts as they are written, rather than translate into "traditional"
> or "simplified". Take, for example, this Web page --
> http://uk.geocities.com/Morrison1782/Texts/TianguanCifu.html -- which
> transcribes a short one-act play from the Cantonese Opera tradition, published
> during the Qing dynasty (probably early 19th century). 

Okay, Andrew is a real expert and is right about this. I would like to have a look at 
that page if I could reach geocities.com. (For at least two years no one has been able 
to reach geocities.com directly from China.)

I never saw 500B and 4E2A together in the same printed document in the 20 years I 
lived in China. (Well, minus the years when I could not read yet :) Unless you had an 
obvious reason to do so, printing a book in Traditional characters used to be 
considered somewhat wrong in China. There is a language council (YuWei) in charge of 
such issues. At some periods in the past people wanted to eliminate Traditional 
Chinese completely. I remember an advertisement on the street when I was a child, 
which said people should report public appearances of Traditional Chinese characters 
to the local culture ministry of some sort. (Oh, this is very OT.) So let me correct 
what I said: if you find a 4E2A, the text may still be Traditional, but if you find a 
500B it is very likely to be Traditional Chinese. I think we can search for 500B; if 
it does not occur, the text is likely to be Simplified.
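
The corrected rule of thumb can be written down as a short Python sketch. This is only an illustration of the heuristic, not a real classifier: the function name guess_script and the three return labels are made up here, and the "unknown" case for text containing neither character is my own assumption. As Andrew's example shows, a page with 4E2A and no 500B can still be mostly Traditional, so the "likely simplified" answer is the weaker one.

```python
# Hypothetical helper: name and return labels are for illustration only.
GE_TRADITIONAL = "\u500b"  # U+500B, traditional form of the measure word ge4
GE_SIMPLIFIED = "\u4e2a"   # U+4E2A, simplified form of the measure word ge4


def guess_script(text: str) -> str:
    """Guess the script of `text` from the measure word ge4 alone."""
    if GE_TRADITIONAL in text:
        # Finding U+500B is the strong signal: very likely Traditional.
        return "likely traditional"
    if GE_SIMPLIFIED in text:
        # Weaker signal: U+4E2A also turns up in some traditional texts.
        return "likely simplified"
    # Assumption: with neither character present, make no guess at all.
    return "unknown"
```

A text that mixes both characters is reported as "likely traditional", because the check for U+500B comes first.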

It's a bad thing that I have never read reproduced books (I mean copies of original 
ancient books), which is why I made this kind of mistake. I will try to read more in 
future.

> It has U+4E2A (simplified
> ge4) but not U+500B (traditional ge4), and yet is written mostly in
> "traditional" characters. How would your algorithm classify such a page ?

Well, I was not talking about an algorithm at first. I thought Paul Hastings 
<[EMAIL PROTECTED]> wanted to do it by looking at the text. And we don't have many 
such mixed pages.
