Oh goody. A character sets question.

On 06/12/16 18:20, Scott Ford wrote:
I found the problem: we are using an IBM TBL EZACICTR which doesn't support
CP 437, duh ....

Bummer.

Today the role of Lynn Wheeler will be played by /moi/ as I give some interesting (to me) history related to this topic. (Salted with entertaining embellishment because my facts just aren't as detailed or interesting as his.)


I have a bigger question: if we wanted to support Unicode (yeah ugh), how
do I know what CCSIDs to support?

The problem with Unicode is that it's not an 8-bit codepage. Code points need up to 21 bits, which means 32 bits apiece if you store them fixed-width (UTF-32). It is the solution to all of our planet-wide problems because "there's room for everyone!".

I like UTF-8 where you get an 8-bit wide byte stream. (And there are no worries over endianness.) But that doesn't suit everyone. 8-bit bytes don't even work for program source code anymore! Eight bit bummer.
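To make the endianness point concrete, here is a minimal Python sketch. The sample string is made up for illustration. It encodes the same text as UTF-8 and as both byte orders of UTF-16: the UTF-8 bytes have only one possible order, while the two UTF-16 forms differ in every code unit.

```python
# Sketch only: the sample text is an arbitrary choice for illustration.
text = "caf\u00e9 \u20ac"          # "cafe" with e-acute, a space, and a euro sign

utf8 = text.encode("utf-8")        # variable width, 1 to 4 bytes per code point
utf16_le = text.encode("utf-16-le")  # 16-bit units, low byte first
utf16_be = text.encode("utf-16-be")  # 16-bit units, high byte first


def hexdump(data: bytes) -> str:
    """Render bytes as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in data)


print(hexdump(utf8))      # 63 61 66 c3 a9 20 e2 82 ac
print(hexdump(utf16_le))  # 63 00 61 00 66 00 e9 00 20 00 ac 20
print(hexdump(utf16_be))  # 00 63 00 61 00 66 00 e9 00 20 20 ac
```

The plain ASCII letters stay one byte each in UTF-8, which is a big part of why it travels so well.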


For example, we go from EBCDIC on z/OS to ASCII and from ASCII to EBCDIC.
Do I somehow have to tell the target what the sending CCSID is?

Yes. (But I more often see the codepage numbers than some CCSID.)

Without better knowledge of your data and the environment, I can only recommend circling near "EBCDIC is CP1047" and "ASCII is ISO-8859-1". If your stuff is US and most of Western Europe, that works. (Not so helpful for the Russians or the Greeks or anyone East of them.)
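A small Python sketch of why the target has to be told. Assumptions: Python's standard library ships cp037 and cp500 codecs but, as far as I know, no cp1047, so cp037 stands in for "EBCDIC" here; the function name and sample string are invented for the example.

```python
# Sketch only: cp037 stands in for the sender's EBCDIC because the Python
# standard library has no cp1047 codec that I know of.

def ebcdic_to_ascii(data: bytes, sender_codepage: str) -> bytes:
    """Decode with the codepage the sender declared, re-encode as ISO-8859-1."""
    return data.decode(sender_codepage).encode("iso-8859-1")


original = "a[0] = !b"
wire = original.encode("cp037")          # what a CP 37 sender puts on the wire

right = ebcdic_to_ascii(wire, "cp037")   # receiver was told the real codepage
wrong = ebcdic_to_ascii(wire, "cp500")   # receiver assumed a different EBCDIC

print(right.decode("iso-8859-1"))        # matches the original text
print(wrong.decode("iso-8859-1"))        # letters survive, punctuation does not
```

Letters and digits come through either way, which is exactly why this kind of mismatch hides until somebody sends brackets or an exclamation mark.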

The Story

We've enjoyed this hemorrhoid for decades.

Dirty little secret: IBM was one of the backers of ASCII in the 1960s. The S/360 had an ASCII/EBCDIC switch. But too much momentum with Hollerith history. OS/360 and its siblings continued using EBCDIC. So the nifty A/E HW bit got re-purposed. Besides, we can fix everything in software, right? Ahh, those were the days. If only 16M were enough. Twenty-four bit addressing mode bummer.

In the late 1980s, Edwin Hart, then at the Johns Hopkins Applied Physics Laboratory and active with SHARE, spearheaded a customer effort to _distill common practice_ into consistency. The result was

*SHARE Report SSD No. 366*:
ASCII and EBCDIC Character Set and Code Issues in Systems Application Architecture,
The ASCII/EBCDIC Character Set Task Force.
Edited by Edwin Hart,
The Johns Hopkins University,
Applied Physics Laboratory,
Laurel, Maryland, USA;
published by Share Inc.,
111 East Wacker Drive, Chicago, Illinois, USA 60601;
*June 1989*

The effect was what some called "Codepage 37 version 2". Most mainframe sites were using either CP 37 or CP 500 (or subsets), neither of which mapped correctly to de-facto EBCDIC (for common translations to/from ASCII). CP 37 was the closer of the two. With minor code point re-assignment, a codepage floated to the surface which many of us rabidly skimmed off and ran with.

IBM took the SHARE report to heart. Mostly. They soon blessed us with CP 1047, the standard on USS, even now. Codepage 1047 is closer to the legendary and mythical CP 37v2, but still off by two points. It switches /not/ and /hat/ (circumflex, shift 6 on your US PC keyboard). Makes a /mess/ of code and scripts which use either of those characters. Thirty-two bit bummer.
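To show what the /not/ and /hat/ switch does to a line of code, here is a hedged Python sketch. Assumptions: Python's cp037 codec puts the not sign at X'5F' and the circumflex at X'B0', and the swap table below models only those two points. It is not a full CP 1047 table and not the CP 37v2 table either. The names `SWAP` and `swap_not_hat` are invented for the example.

```python
# Sketch only: models just the two swapped code points, nothing else.
NOT_SIGN = "\u00ac"
HAT = "^"

cp037_not = NOT_SIGN.encode("cp037")   # b'\x5f' in Python's cp037 codec
cp037_hat = HAT.encode("cp037")        # b'\xb0' in Python's cp037 codec

# Exchange the two byte values and leave the other 254 alone.
SWAP = bytes.maketrans(b"\x5f\xb0", b"\xb0\x5f")


def swap_not_hat(data: bytes) -> bytes:
    """Move text between the two conventions for these two characters."""
    return data.translate(SWAP)


line = "x = a ^ b"
as_037 = line.encode("cp037")
mangled = swap_not_hat(as_037).decode("cp037")

print(mangled)   # the hat has become a not sign
```

Applying the swap twice gets you back where you started, so the same table serves in both directions.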

Interestingly, this unofficial _CP 37v2 persists_. At least one ISV of note (I won't say which, but Dave Rivers might chime in) continues using an official pair of translate tables that /work consistently/ between z/OS and Unix/Linux/Windows. And there was much rejoicing.

I can offer these ...

   http://www.casita.net/pub/aecs.h
   http://www.casita.net/pub/aecs.c


No warranties expressed or implied. In fact, I recommend /not/ using the C routines for anything more than reference. (Code-up something in assembler and let the hardware do the grunt work.)
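For reference in the same spirit, a minimal Python sketch of the table-driven approach. These are /not/ the tables in aecs.c; they are derived from Python's own cp037 and ISO-8859-1 codecs, and the names `E2A`, `A2E`, `e2a` and `a2e` are invented here. A 256-byte table applied with `bytes.translate` is the same idea as a translate instruction working through a table.

```python
# Sketch only: tables built from Python's cp037 codec, not from aecs.c.
ALL_BYTES = bytes(range(256))

# Entry i of each table is the byte that value i becomes on the other side.
E2A = ALL_BYTES.decode("cp037").encode("iso-8859-1")
A2E = ALL_BYTES.decode("iso-8859-1").encode("cp037")


def e2a(data: bytes) -> bytes:
    """EBCDIC (cp037) to ISO-8859-1, one table lookup per byte."""
    return data.translate(E2A)


def a2e(data: bytes) -> bytes:
    """ISO-8859-1 to EBCDIC (cp037), one table lookup per byte."""
    return data.translate(A2E)


print(e2a(b"\xc8\x85\x93\x93\x96"))   # b'Hello'
```

Because each table is a permutation of all 256 values, a round trip through both returns the original bytes, which is the property you want from a matched pair.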

Tagging text with one codepage or another is madness.
But assuming EBCDIC is always one thing and ASCII always an invariant other is painting yourself into a corner. Eventually we will get to Unicode, and the chief cause of problems is solutions.
Sixty-four bit bummer.

-- R; <><



