Re: [CMS-PIPELINES] CMS-PIPELINES XLATE, VCHAR, UTF, etc.

Bob Cronin Sat, 24 Nov 2012 09:21:50 -0800

UTF-8 isn't a character set per se; it is an algorithm for encoding Unicode
characters as sequences of 8-bit bytes. A fine point, but important to
realize nonetheless. So to deal with UTF-8 on a mainframe you have to be
able to undo the 8-bit encoding to recover the Unicode code points, and then
use the appropriate Unicode-to-EBCDIC translation table to convert those
code points to the target EBCDIC character set. Unfortunately, you have to
know something about which characters the source uses to pick the
appropriate table. If the mainframe supported Unicode natively you wouldn't
have to, but z/VM doesn't, so you are out of luck in that regard. Unless you
are dealing with Asian countries, it is probably safe to assume the text
stays within ISO Latin-1 (ISO 8859-1). That should handle most of the sorts
of characters you're talking about.
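
To make "undo the 8-bit encoding" concrete, here is a rough sketch in Python
rather than Pipelines. It is only an illustration under stated assumptions:
the input is restricted to Latin-1 (U+0000 to U+00FF), the function names are
made up for this example, and Python's cp037 codec stands in for whatever
EBCDIC code page your site actually uses.

```python
def utf8_to_latin1_code_points(data):
    """Decode UTF-8 by hand, accepting only code points U+0000..U+00FF."""
    out = []
    i = 0
    while i < len(data):
        b0 = data[i]
        if b0 < 0x80:
            # 0xxxxxxx: a one-byte sequence, i.e. plain ASCII.
            out.append(b0)
            i += 1
        elif b0 in (0xC2, 0xC3):
            # 110xxxxx 10yyyyyy: the only two lead bytes that yield
            # U+0080..U+00FF are 0xC2 and 0xC3.
            if i + 1 >= len(data) or (data[i + 1] & 0xC0) != 0x80:
                raise ValueError("truncated or malformed sequence at offset %d" % i)
            out.append(((b0 & 0x1F) << 6) | (data[i + 1] & 0x3F))
            i += 2
        else:
            raise ValueError("malformed or outside Latin-1 at offset %d" % i)
    return out


def utf8_to_ebcdic(data, codepage="cp037"):
    """UTF-8 bytes -> code points -> EBCDIC bytes (code page is an assumption)."""
    code_points = utf8_to_latin1_code_points(data)
    # Latin-1 byte values equal the Unicode code points, so the list can be
    # turned into text directly and then encoded with the EBCDIC table.
    return bytes(code_points).decode("latin-1").encode(codepage)


if __name__ == "__main__":
    print(utf8_to_ebcdic("café".encode("utf-8")).hex())
```

Anything outside Latin-1 raises an error here instead of being silently
mangled, which is probably what you want while you find out what your
customers really send.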
--
bc
 On Nov 23, 2012 4:29 PM, "Larson, John E." <[email protected]> wrote:


> I just wanted to thank everyone for their input. No wonder I couldn't find
> any easy solutions: I didn't realize what a large "can of worms" I had
> opened, or that I needed a whole "service", not just a routine.
>
> I will take everyone's input and try to decide on the ultimate solution
> that will work for me.
>
> At this point I am told by customers that all I need to worry about (for
> the near/medium future) is just a subset of the extended Latin-1 character
> set.
> It is really just the accented vowels and consonants that I'm told I need
> to deal with.
>
> Not sure yet if I can use this assumption to utilize a simpler solution
> rather than supporting the entire Unicode character set that UTF-8 can
> encode, which I now understand is...HUGE.
>
> Thanks again to everyone who replied with suggestions and help in
> understanding what I'm facing here.
>
> John
>
>
> -----Original Message-----
> From: CMSTSO Pipelines Discussion List [mailto:[email protected]]
> On Behalf Of Bob Cronin
> Sent: Tuesday, November 20, 2012 10:46 AM
> To: [email protected]
> Subject: Re: CMS-PIPELINES Digest - 1 Nov 2012 to 19 Nov 2012 (#2012-22)
>
> Of course it's simpler if you know you're always dealing with Latin-1.
> --
> bc
>
>
> On Tue, Nov 20, 2012 at 12:56 PM, Glenn Knickerbocker <[email protected]> wrote:
>
> > On 11/20/2012 11:04 AM, Bob Cronin wrote:
> > > I used to have my own Rexx code to
> > > translate utf8 to utf16, but dropped that in favor of the "undocumented"
> > > utf stage the Piper was so kind to provide. So basically you just need to
> > > convert utf8 to utf16 and then use the Unicode to EBCDIC table appropriate
> > > to your situation.
> >
> > I think by "ASCII UTF-8" he really means UTF-8 encoded 8-bit Latin-1
> > (ISO 8859-1), so VCHAR should fill in the missing step by skipping over
> > the extra 8 0-bits.  From UTF-8 to EBCDIC:
> >
> >   ... | utf from utf-8 to utf-16 | vchar 16 8 | xlate a2e | ...
> >
> > From EBCDIC to UTF-8:
> >
> >   ... | xlate e2a | vchar 8 16 | utf from utf-16 to utf-8 | ...
> >
> > ¬R
> >
>
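
For anyone who wants to see what each stage in the quoted pipelines
contributes, here is a rough Python analogue, one line per stage. It is a
sketch under assumptions, not a description of what the stages do internally:
the function names are invented, cp037 stands in for whatever table xlate
a2e/e2a uses on your system, and rejecting characters whose high-order byte is
nonzero is a choice made here, not necessarily how vchar 16 8 behaves.

```python
def utf8_to_ebcdic_staged(data, codepage="cp037"):
    # Stage 1, comparable to: utf from utf-8 to utf-16
    utf16 = data.decode("utf-8").encode("utf-16-be")
    # Stage 2, comparable to: vchar 16 8
    # Every 16-bit character must have a zero high-order byte to fit in 8 bits.
    if any(utf16[0::2]):
        raise ValueError("input contains characters outside Latin-1")
    narrow = utf16[1::2]
    # Stage 3, comparable to: xlate a2e (cp037 is an assumption)
    return narrow.decode("latin-1").encode(codepage)


def ebcdic_to_utf8_staged(data, codepage="cp037"):
    # Stage 1, comparable to: xlate e2a (cp037 is an assumption)
    narrow = data.decode(codepage).encode("latin-1")
    # Stage 2, comparable to: vchar 8 16 (prefix each byte with eight zero bits)
    utf16 = b"".join(bytes((0, b)) for b in narrow)
    # Stage 3, comparable to: utf from utf-16 to utf-8
    return utf16.decode("utf-16-be").encode("utf-8")


if __name__ == "__main__":
    original = "crème brûlée".encode("utf-8")
    ebcdic = utf8_to_ebcdic_staged(original)
    print(ebcdic_to_utf8_staged(ebcdic) == original)
```

The middle stage is where the Latin-1 assumption lives: widening 8 to 16 bits
always works, but narrowing 16 to 8 only works when the high-order byte of
every character is zero.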
