Doug Ewell wrote:
SRIDHARAN Aravind <ASridharan at covansys dot com> wrote:
How to convert EBCDIC data into Unicode?
There are informative mapping tables available at:

http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/
There are also various places where IBM publishes EBCDIC<->Unicode conversion tables.
You can find them in ICU's .ucm format, or in UTR #22 XML format, at http://oss.software.ibm.com/icu/charset/
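
As a rough illustration of how such a table can be used directly, here is a minimal Python sketch. It assumes the plain-text layout of the unicode.org vendor tables (hex byte, hex code point, then a '#' comment); the function names and the three sample rows are hypothetical, and the .ucm and XML formats would need a different parser.

```python
# Sample rows in the assumed layout: hex byte, hex code point, '#' comment.
SAMPLE_TABLE = """\
# illustrative excerpt only, not a complete table
0x81\t0x0061\t#LATIN SMALL LETTER A
0xC1\t0x0041\t#LATIN CAPITAL LETTER A
0xF0\t0x0030\t#DIGIT ZERO
"""


def parse_mapping_table(text):
    """Return {ebcdic_byte: unicode_character} from mapping-table text."""
    table = {}
    for line in text.splitlines():
        # Drop the trailing comment (or a whole comment line).
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # byte listed without a Unicode mapping
        table[int(fields[0], 16)] = chr(int(fields[1], 16))
    return table


def decode_with_table(data, table):
    """Decode bytes via the table; unmapped bytes become U+FFFD."""
    return "".join(table.get(byte, "\ufffd") for byte in data)
```

With a complete table loaded from one of the files above, decode_with_table() would convert a whole EBCDIC byte string; with the excerpt here, only the three listed bytes are mapped.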

You can use ICU to perform the conversion: http://oss.software.ibm.com/icu/userguide/conversion.html
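
For a quick experiment without ICU, the same kind of conversion can be sketched with Python's built-in codecs. This is only a stand-in, not the ICU API; the helper names are hypothetical, and the default "cp037" is an assumption that must be replaced by the code page actually in use.

```python
def ebcdic_to_unicode(data: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes to a Unicode string using a named code page."""
    return data.decode(codepage)


def unicode_to_ebcdic(text: str, codepage: str = "cp037") -> bytes:
    """Encode a Unicode string back to EBCDIC bytes."""
    return text.encode(codepage)


if __name__ == "__main__":
    # 0xC8 0x85 0x93 0x93 0x96 are the EBCDIC bytes for "Hello".
    print(ebcdic_to_unicode(b"\xc8\x85\x93\x93\x96"))
```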

You need to know which EBCDIC variant (code page) you are converting
from.  There are dozens.
Yes - in IBM parlance, you will need to identify which CCSID is used. For some CCSIDs, there are multiple Unicode conversion tables, but this is less common with EBCDIC CCSIDs. ICU has an API function ucnv_openCCSID(): http://oss.software.ibm.com/icu/apiref/ucnv_8h.html#a55
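
To see why the CCSID matters, the following sketch decodes the same bytes with several candidate code pages. It uses Python's built-in codecs rather than ICU, and the candidate list is an arbitrary assumption for illustration; the function name is hypothetical.

```python
# A few EBCDIC code pages that ship with Python (an arbitrary selection).
CANDIDATE_CODEPAGES = ["cp037", "cp500", "cp1026", "cp1140"]


def decode_with_candidates(data, codepages=CANDIDATE_CODEPAGES):
    """Decode the same bytes under each code page for side-by-side comparison."""
    return {cp: data.decode(cp, errors="replace") for cp in codepages}


if __name__ == "__main__":
    # The letters decode identically, but the final byte (0x5A) does not.
    for codepage, text in decode_with_candidates(b"\xc8\x85\x93\x93\x96\x5a").items():
        print(codepage, repr(text))
```

If the output differs between candidates, the data alone cannot tell you which one is right; the CCSID has to come from whoever produced the data.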

They are all the same in the A-Z, a-z, and 0-9
ranges, but beyond that they can differ substantially.
There are some more characters that have the same codes in most EBCDIC code pages, but there are also some code pages in which not all of the Latin letters are present. (I think some old Japanese EBCDIC code pages replace the small Latin letters with Katakana.)
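
One way to check how much two code pages actually share is to decode all 256 byte values under both and list the positions that disagree. The sketch below does this with Python's built-in codecs (an assumption; it is not ICU), and the function name is hypothetical.

```python
def differing_bytes(codepage_a, codepage_b):
    """Return the byte values that map to different characters in the two code pages."""
    all_bytes = bytes(range(256))
    # errors="replace" keeps one output character per input byte,
    # so index i in each string corresponds to byte value i.
    decoded_a = all_bytes.decode(codepage_a, errors="replace")
    decoded_b = all_bytes.decode(codepage_b, errors="replace")
    return [i for i in range(256) if decoded_a[i] != decoded_b[i]]


if __name__ == "__main__":
    for value in differing_bytes("cp037", "cp500"):
        byte = bytes([value])
        print(hex(value), repr(byte.decode("cp037")), repr(byte.decode("cp500")))
```

Python does not appear to ship the Japanese Katakana EBCDIC code pages, so the missing-lowercase case mentioned above cannot be reproduced this way without an external table.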

markus

If you don't find the mapping table you are looking for, I can probably
dig it up or reconstruct it.

-Doug Ewell
 Fullerton, California
--
Opinions expressed here may not reflect my company's positions unless otherwise noted.

