Re: CCSID

Rick Troth Wed, 15 Jun 2016 17:07:31 -0700

Oh goody. A character sets question.


On 06/12/16 18:20, Scott Ford wrote:

I found the problem we are using an IBM TBL EZACICTR which doesnt support
CP 437, duh ....


Bummer.

Today the role of Lynn Wheeler will be played by /moi/ as I give some interesting (to me) history related to this topic. (Salted with entertaining embellishment because my facts just aren't as detailed or interesting as his.)

I have a bigger question, if we wanted to support Unicode (yeah ugh), how
do I know what CCSIDS to support ?

The problem with Unicode is that it's not an 8-bit codepage. It's 32 bits. It is the solution to all of our planet-wide problems because "there's room for everyone!".

I like UTF-8 where you get an 8-bit wide byte stream. (And there are no worries over endianness.) But that doesn't suit everyone. 8-bit bytes don't even work for program source code anymore! Eight bit bummer.

For example we go from EBCDIC on z/OS to ASCII and from ASCII to EBCDIC.
Do I some how have to tell the target what the sending CCSID is ?


Yes. (But I more often see the codepage numbers than some CCSID.)

Without better knowledge of your data and the environment, I can only recommend circling near "EBCDIC is CP 1047" and "ASCII is ISO-8859-1". If your stuff is US and most of Western Europe, that works. (Not so helpful for the Russians or the Greeks or anyone East of them.)
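
A minimal Python sketch of why the target has to be told the sending codepage. It uses the cp037 and cp500 codecs from Python's standard library as stand-ins, on the assumption that no CP 1047 codec ships there; the function name and the sample string are hypothetical, not from this thread.

```python
# Sketch: the same EBCDIC bytes read differently under different codepages.
# cp037 and cp500 are stand-ins (assumption: no cp1047 codec in the stdlib).

def ebcdic_to_latin1(data: bytes, sender_codepage: str) -> bytes:
    """Decode EBCDIC bytes with the sender's codepage, re-encode as ISO-8859-1."""
    return data.decode(sender_codepage).encode("iso-8859-1")

text = "if (a[0] != b) | c;"
wire = text.encode("cp037")                 # what the sending side puts on the wire

right = ebcdic_to_latin1(wire, "cp037")     # receiver was told the codepage
wrong = ebcdic_to_latin1(wire, "cp500")     # receiver guessed a different one

print(right.decode("iso-8859-1"))           # matches the original text
print(wrong.decode("iso-8859-1"))           # brackets and friends come out scrambled
```

Letters and digits survive the wrong guess; it is the punctuation that programmers care about which moves around.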


The Story

We've enjoyed this hemorrhoid for decades.

Dirty little secret: IBM was one of the backers of ASCII in the 1960s. The S/360 had an ASCII/EBCDIC switch. But too much momentum with Hollerith history. OS/360 and its siblings continued using EBCDIC. So the nifty A/E HW bit got re-purposed. Besides, we can fix everything in software, right? Ahh, those were the days. If only 16M were enough. Twenty-four bit addressing mode bummer.

In the late 1980s, Edwin Hart, then at Johns Hopkins Applied Physics and active with SHARE, spear-headed a customer effort to _distill common practice_ into consistency. The result was


*SHARE Report SSD No. 366*:

ASCII and EBCDIC Character Set and Code Issues in Systems Application Architecture,

The ASCII/EBCDIC Character Set Task Force.
Edited by Edwin Hart,
The Johns Hopkins University,
Applied Physics Laboratory,
Laurel, Maryland, USA;
published by Share Inc.,
111 East Wacker Drive, Chicago, Illinois, USA 60601;
*June 1989*

The effect was what some called "Codepage 37 version 2". Most mainframe sites were using either CP 37 or CP 500 (or subsets), neither of which mapped correctly to de-facto EBCDIC (for common translations to/from ASCII). CP 37 was the closer of the two. With minor code point re-assignment, a codepage floated to the surface which many of us rabidly skimmed off and ran with.

IBM took the SHARE report to heart. Mostly. They soon blessed us with CP 1047, the standard on USS, even now. Codepage 1047 is closer to the legendary and mythical CP 37v2, but still off by two points. It switches /not/ and /hat/ (circumflex, shift 6 on your US PC keyboard). Makes a /mess/ of code and scripts which use either of those characters. Thirty-two bit bummer.
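
A two-point swap like that can be expressed as a 256-byte translate table. The sketch below is illustrative only: the positions 0x5F and 0xB0 are an assumption (they are where CP 037 and CP 1047 are commonly documented to place /not/ and /hat/), and the names are hypothetical. Check the positions against your own tables before relying on them.

```python
# Sketch: build a translate table that exchanges two EBCDIC code points.
# NOT_POS and HAT_POS are assumed positions, not taken from the SHARE report.

NOT_POS, HAT_POS = 0x5F, 0xB0

def swap_points(a: int, b: int) -> bytes:
    """Return a 256-byte identity table with byte values a and b exchanged."""
    table = bytearray(range(256))
    table[a], table[b] = table[b], table[a]
    return bytes(table)

SWAP = swap_points(NOT_POS, HAT_POS)

def swap_not_hat(ebcdic: bytes) -> bytes:
    """Apply the two-point swap to an EBCDIC byte string."""
    return ebcdic.translate(SWAP)

sample = "x ^ y".encode("cp037")
print(sample.hex(), swap_not_hat(sample).hex())
```

Applying the swap twice returns the original bytes, so the same table serves in both directions.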

Interestingly, this unofficial _CP 37v2 persists_. At least one ISV of note (I won't say which, but Dave Rivers might chime in) continues using an official pair of translate tables that /work consistently/ between z/OS and Unix/Linux/Windows. And there was much rejoicing.


I can offer these ...

   http://www.casita.net/pub/aecs.h
   http://www.casita.net/pub/aecs.c

No warranties expressed or implied. In fact, I recommend /not/ using the C routines for anything more than reference. (Code-up something in assembler and let the hardware do the grunt work.)


Tagging text with one codepage or another is madness.

But assuming EBCDIC is always one thing and ASCII always an invariant other is paint cornering you. Eventually we will get to Unicode, and the chief cause of problems is solutions.

Sixty-four bit bummer.

-- R; <><




----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
