Re: [CMS-PIPELINES] XLATE (or any method) for converting UTF-8 to/from EBCDIC.

Larson, John E. Mon, 19 Nov 2012 14:25:52 -0800

I have been searching and searching (Google, all VM documentation I can find, 
pipeline forum and history pages, etc.) for days and can't figure out how to 
deal with ASCII UTF-8.


I've read on one site that codepage 1207/1208 "might" be the way, but of course 
I can't find any more specifics about these codepages, and I'm not so 
interested in them anyway as they're not supported by XLATE.

My requirement seems simple enough: update a CMS XML browser to display ASCII 
UTF-8 data in displayable EBCDIC, and translate it back to UTF-8 before saving 
to disk or sending the data to a TPF system.

For example, an input message may contain an extended Latin vowel, say, x'51' 
(e with an acute accent) from EBCDIC codepage 1047, and the UTF-8 equivalent 
of this is actually the two-byte value x'C3A9'.
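
For anyone who wants to check byte values like these outside CMS, here is a 
minimal Python sketch. It is only an illustration, not a CMS Pipelines 
solution. It assumes Python's built-in cp037 codec as a stand-in for codepage 
1047 (Python ships no 1047 codec; the two codepages agree on the accented 
Latin letters but differ on a few characters such as the square brackets).

```python
# Assumption: cp037 stands in for code page 1047, which Python's
# standard library does not provide.
ch = "\u00e9"                    # e with an acute accent (U+00E9)

ebcdic = ch.encode("cp037")      # a single EBCDIC byte
utf8 = ch.encode("utf-8")        # a two-byte UTF-8 sequence

print("EBCDIC:", ebcdic.hex().upper())
print("UTF-8: ", utf8.hex().upper())
```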

I have been doing just fine using the standard XLATE A2E and E2A until this new 
requirement to support the Latin characters (accented vowels and consonants).

Is there no other way to do this than write my own translate table?  Even that 
is not so straightforward, as the characters are not all a byte-for-byte 
substitution.

After days of searching, I can't think of any other way than to write a pure 
REXX routine that loops through the entire string, replacing some bytes with 
a different single byte and others with a two-byte sequence.

And of course I have to go both ways.
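
To make the idea concrete, the loop described above can be sketched in Python 
as a pair of table-driven routines, one for each direction. This is a rough 
illustration under stated assumptions, not a tested CMS solution: cp037 again 
stands in for codepage 1047, the names E2U, U2E, ebcdic_to_utf8 and 
utf8_to_ebcdic are made up for the example, and anything with no EBCDIC 
equivalent is replaced by the EBCDIC SUB character x'3F'.

```python
# Assumption: cp037 stands in for code page 1047 (no 1047 codec in Python).
# Forward table, built once: EBCDIC byte -> UTF-8 sequence (1 or 2 bytes).
E2U = [bytes([b]).decode("cp037").encode("utf-8") for b in range(256)]
# Reverse table: UTF-8 sequence -> EBCDIC byte.
U2E = {seq: b for b, seq in enumerate(E2U)}


def ebcdic_to_utf8(data: bytes) -> bytes:
    """Replace each EBCDIC byte with its one- or two-byte UTF-8 sequence."""
    return b"".join(E2U[b] for b in data)


def utf8_to_ebcdic(data: bytes, sub: int = 0x3F) -> bytes:
    """Walk the UTF-8 input one sequence at a time and emit one EBCDIC byte
    per sequence; sequences with no entry in the table become `sub`."""
    out = bytearray()
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            n = 1                  # plain ASCII
        elif lead >> 5 == 0b110:
            n = 2                  # two-byte sequence
        elif lead >> 4 == 0b1110:
            n = 3                  # three-byte sequence
        elif lead >> 3 == 0b11110:
            n = 4                  # four-byte sequence
        else:
            n = 1                  # stray continuation byte
        out.append(U2E.get(data[i:i + n], sub))
        i += n
    return bytes(out)


msg = "caf\u00e9".encode("cp037")          # EBCDIC input
as_utf8 = ebcdic_to_utf8(msg)              # outbound direction
back = utf8_to_ebcdic(as_utf8)             # inbound direction
print(as_utf8.hex(), back == msg)
```

The tables are built once, so the per-message work is a lookup per input 
byte or sequence. Whether that would be fast enough for the message rates 
involved here is something only measurement could tell.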

What makes this really unappealing is that I am dealing with a message driver 
that sends tens of thousands of messages a second (1K-5K bytes in length for 
each message), and I can't help but feel that taking the time for a REXX 
routine to do this translation is going to noticeably slow things down.

I'm really surprised that with all the Internet UTF-8 usage "out there" that 
there isn't a way to do this with a "built-in" routine.

Anyone else have to deal with UTF-8 to EBCDIC and back?

John
