<[EMAIL PROTECTED]> wrote:

> I am trying to write an application that can read input in Traditional
> Chinese but output (printout on paper) in Simplified Chinese, without
> any 3rd party software (e.g. ChineseStar, TwinBridge).
>
> How can I implement Unicode in the coding? The programming language
> I'm using is Ms Visual Basic 6 Professional Edition.

It depends on how much of the problem you want to solve.

Mapping between Traditional Chinese (TC) and Simplified Chinese (SC) is *not* generally 1-to-1, despite what many people believe. It could be 1-to-many, many-to-1, or even many-to-many, depending on which characters are involved. Some TC characters have different SC "equivalents" depending on which meaning of the word is intended. And not every TC character ever invented has an SC equivalent. There is even at least one character A that is both the traditional form of some character B *and* the simplified form of another character C!

TC/SC equivalence in the general case is a linguistic problem. The Unicode Standard is a character encoding standard, not a linguistic standard, so it does not attempt to provide definitive TC/SC mapping tables.

The official Unicode Han database:

http://www.unicode.org/Public/UNIDATA/Unihan.txt

does include fields called "kSimplifiedVariant" and "kTraditionalVariant," which may be of some assistance. But as you will see, only 2629 "simplified variants" and 2554 "traditional variants" are listed, out of tens of thousands of Han characters.

A group of mainland Chinese and Taiwanese industry specialists has tried (unsuccessfully) to establish a TC/SC conversion layer within the forthcoming internationalized domain name (IDN) architecture. Their document includes a list of about 2000 1-to-1 TC/SC pairs taken from official Chinese and Taiwanese references. It explicitly does not propose a solution for the non-1-to-1 conversion cases, but dismisses these cases as uncommon. The document (draft-ietf-idn-tsconv-02.txt) has expired from the IETF timetable and is no longer available, but I can supply a copy if you are still interested.

Of course, if you already have the TC/SC conversion module and just need to convert between a DBCS encoding (e.g. GB 2312) and Unicode in order to "implement Unicode in the coding," the Unihan.txt file does include these mappings.

-Doug Ewell
 Fullerton, California
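
To make the variant-field discussion above concrete, here is a minimal sketch of how the "kSimplifiedVariant" and "kTraditionalVariant" fields might be loaded and applied. It is written in Python rather than Visual Basic 6 purely for illustration. It assumes the tab-separated "U+XXXX, field name, value" layout used by the Unihan data files; the sample lines are a hand-picked illustration, not an excerpt to rely on, and the function names (parse_code_point, load_variants, convert) are hypothetical. Characters with more than one candidate are left unchanged and reported, since choosing among them is the linguistic problem described above.

```python
from collections import defaultdict

# Illustrative lines in the tab-separated Unihan layout (assumed sample data).
SAMPLE = """\
# comment lines start with '#'
U+767C\tkSimplifiedVariant\tU+53D1
U+9AEE\tkSimplifiedVariant\tU+53D1
U+53D1\tkTraditionalVariant\tU+767C U+9AEE
U+570B\tkSimplifiedVariant\tU+56FD
U+56FD\tkTraditionalVariant\tU+570B
"""


def parse_code_point(token):
    """Turn a token such as 'U+767C' into the character it names."""
    # Some variant fields append a source tag after '<'; drop it if present.
    token = token.split("<")[0]
    return chr(int(token[2:], 16))


def load_variants(lines, field):
    """Build {character: [candidate, ...]} for one variant field."""
    table = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t")
        if len(parts) != 3 or parts[1] != field:
            continue
        source = parse_code_point(parts[0])
        for token in parts[2].split():
            target = parse_code_point(token)
            if target not in table[source]:
                table[source].append(target)
    return dict(table)


def convert(text, table):
    """Replace only characters with exactly one candidate.

    Returns (converted_text, ambiguous), where ambiguous lists
    (index, character, candidates) for every 1-to-many case left untouched.
    """
    out = []
    ambiguous = []
    for index, ch in enumerate(text):
        candidates = table.get(ch, [])
        if len(candidates) == 1:
            out.append(candidates[0])
        else:
            if len(candidates) > 1:
                ambiguous.append((index, ch, candidates))
            out.append(ch)
    return "".join(out), ambiguous


if __name__ == "__main__":
    to_simplified = load_variants(SAMPLE.splitlines(), "kSimplifiedVariant")
    to_traditional = load_variants(SAMPLE.splitlines(), "kTraditionalVariant")
    print(convert("發國", to_simplified))
    print(convert("发国", to_traditional))
```

In this sample, two traditional characters share one simplified form, so the TC-to-SC direction is unambiguous while the reverse direction is not; a real converter would need word-level context or human review to resolve the reported cases.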

