On Nov 24, 2012, at 10:21, Bob Cronin wrote:

> Utf8 isn't a character set per se it is an algorithm for encoding unicode
> characters using mail-safe 8-bit quantities. A fine point but important to
> realize nonetheless. So to deal with Utf8 on a mainframe you have to be
> able to undo the 8bit encoding algorithm to yield ascii unicode and then
> use the appropriate ascii to ebcdic translation table to convert the ascii
> unicode to the target ebcdic character set. ...
>
Isn't "ascii unicode" somewhat oxymoronic?  There isn't an "ASCII Unicode",
nor an "EBCDIC Unicode", nor a "Baudot Unicode", ...  There's one Unicode
(that's what the "Uni-" means), but, yes, numerous representations for
transmission and compaction, one of which is UTF-8.  (And part of the code
space is reserved for local purposes (Klingon?).  I suspect that's not a
concern for the OP.)
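
A minimal sketch of that point in Python: one code point, several byte
representations. The sample character (the euro sign) and the variable
names are my own choices for illustration, not anything from the thread.

```python
import unicodedata

# One Unicode code point (sample character chosen for illustration).
ch = "\u20ac"                      # EURO SIGN, U+20AC
code_point = ord(ch)

# Three different byte representations of that same code point.
utf8 = ch.encode("utf-8")          # variable length, 1 to 4 bytes
utf16 = ch.encode("utf-16-be")     # 2 or 4 bytes, big-endian, no BOM
utf32 = ch.encode("utf-32-be")     # always 4 bytes, big-endian, no BOM

print(f"U+{code_point:04X}", utf8.hex(), utf16.hex(), utf32.hex())

# U+E000 is the first code point of the Private Use Area; its general
# category is "Co" (private use).
pua_category = unicodedata.category("\ue000")
print(pua_category)
```

Decoding any of the three byte strings with the matching codec yields the
same single character.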

Some filters, such as iconv(1), will convert UTF-8 to an EBCDIC code page
with a single command.  But their hidden internal operation may be as
you suggest.
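
The two-step route (undo the UTF-8 encoding, then map the resulting
Unicode characters to an EBCDIC code page) can be sketched in Python as
below. This is only an illustration under assumptions: the function name
utf8_to_ebcdic is hypothetical, and cp037 is just one EBCDIC code page
that Python's standard codecs happen to include; the right target code
page depends on the system.

```python
def utf8_to_ebcdic(data: bytes, codepage: str = "cp037") -> bytes:
    """Convert UTF-8 bytes to an EBCDIC code page (hypothetical helper)."""
    # Step 1: undo the UTF-8 encoding to recover Unicode characters.
    text = data.decode("utf-8")
    # Step 2: map those characters to the target EBCDIC code page.
    # Raises UnicodeEncodeError if a character has no mapping there.
    return text.encode(codepage)


ebcdic = utf8_to_ebcdic("Hello".encode("utf-8"))
print(ebcdic.hex())  # c885939396
```

Characters outside the target code page raise an error here rather than
being silently substituted; a real conversion would need a policy for
those.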

"[M]ail-safe 8-bit" is a bit of wishful thinking.  Almost any MUA I use,
on encountering a character value >127, will cautiously encode it further
as either quoted-printable or base64.  This happens somewhere along
the route even when the proximate MTA agrees in the handshaking to use
8-bit.
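
For illustration, here is roughly what that further encoding does to a
UTF-8 byte string, using Python's standard quopri and base64 modules.
The sample word is my own; real mail software also adds headers such as
Content-Transfer-Encoding, which this sketch leaves out.

```python
import base64
import quopri

# "cafe" with an acute accent on the e; the last character is U+00E9,
# which UTF-8 encodes as two bytes, both >127.
raw = "caf\u00e9".encode("utf-8")

# Quoted-printable: bytes >127 become =XX hex escapes.
qp = quopri.encodestring(raw)

# Base64: every 3 input bytes become 4 ASCII characters.
b64 = base64.b64encode(raw)

print(raw, qp, b64)
```

Both results contain only 7-bit ASCII, and decoding either one returns
the original UTF-8 bytes.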

-- gil
