Re: Code Page for dataset names

Joel C. Ewing Sat, 08 Jul 2023 08:23:03 -0700

To call this "handling UTF-8 data" is being overly generous. If the UTF-8 data contains more unique characters than can fit within the limited number of characters of any 3270 terminal codeset, then you are obviously SOL, in that you couldn't even invent a new terminal codeset that could be used as a translation target without loss of information. When you actually need to represent an expanded number of unique characters, there is no way to just "convert" UTF-8 to and from any 3270 terminal CCSID and get a meaningful result.
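To make the loss-of-information point concrete, here is a minimal Python sketch that lists which characters of a string have no slot in a single-byte EBCDIC code page. It uses Python's built-in `cp037` codec as a stand-in for a terminal CCSID, which is an assumption: the tables an actual 3270 emulator or z/OS uses may differ, and the helper name `unmappable` is made up for illustration.

```python
def unmappable(text: str, codec: str = "cp037") -> set:
    """Return the characters of text that the single-byte codec cannot encode."""
    missing = set()
    for ch in set(text):
        try:
            ch.encode(codec)  # strict mode: raises if the code page has no slot
        except UnicodeEncodeError:
            missing.add(ch)
    return missing


# Hypothetical sample mixing Latin-1 letters, CJK and the Euro sign.
sample = "naïve café, 日本語, €"
print(sorted(unmappable(sample)))
```

Any character the function returns has nothing to be translated to, so a conversion to that code page has to either fail or substitute something else.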

There are so many useful glyphs available in UTF-8 that aren't in any current 3270 terminal CCSID and which can't be in any 3270 terminal CCSID without throwing out some other characters. It's a shame if ISPF is still inseparably tied to 3270-architecture restrictions. I guess another way to look at it is that any application that needs to fully support UTF-8 data just can't be written as an ISPF application.

I remember there used to be an ISPF client that ran on PCs that made it possible to pretend ISPF was a PC app, but it was cumbersome, still required the companion TSO/ISPF session, still was limited by 3270 design, and hangs could then occur on two platforms instead of one -- generally more difficult to use and for minimal added benefit.

One interesting approach to bypass the 3270 architecture limitations and allow direct use of UTF-8 would be to build an ISPF-like tool but with a web-server user interface that would be accessed via web browsers, rather than using a terminal interface accessed by 3270 emulators. That way you at least don't have to design a new communication protocol to support UTF-8. It would still be a challenge to design such a tool in a way that retained the same level of security as TSO/ISPF, where each user is isolated from other users in their own address space and RACF environment.


 JC Ewing


On 7/7/23 21:01, Attila Fogarasi wrote:

ISPF was enhanced years ago to handle both ASCII and UTF-8 data.  The EU
command edits the file containing UTF-8 data and converts it to the CCSID
of your terminal.  If the file is tagged with CCSID 1208 then the E command
automatically does this UTF-8 to terminal codepage conversion.  It's up to
you to be using an appropriate terminal codepage for the data you are
editing :)
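The kind of conversion described above can be sketched in Python as follows. This is only an illustration of a UTF-8 to single-byte code page translation with substitution, not how ISPF implements the EU command; `cp037` stands in for the terminal CCSID, and the function names are hypothetical.

```python
def to_terminal(utf8_bytes: bytes, terminal_codec: str = "cp037") -> bytes:
    """Translate UTF-8 bytes to a single-byte terminal code page.

    Characters the code page lacks are replaced by the codec's
    substitute character (a question mark for Python's charmap codecs).
    """
    text = utf8_bytes.decode("utf-8")
    return text.encode(terminal_codec, errors="replace")


def round_trip(utf8_bytes: bytes, terminal_codec: str = "cp037") -> str:
    """Show what the text looks like after going to the terminal and back."""
    return to_terminal(utf8_bytes, terminal_codec).decode(terminal_codec)


print(round_trip("café".encode("utf-8")))        # all characters exist in cp037
print(round_trip("price: 5 €".encode("utf-8")))  # the Euro sign is substituted
```

Whether the result is usable depends entirely on the terminal code page covering every character that actually occurs in the file.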

On Sat, Jul 8, 2023 at 11:47 AM Joel C. Ewing <[email protected]> wrote:

Admittedly I've been away from 3270 devices, real or emulated, and ISPF
for over a decade now.  ISPF support for the Unix filesystem was a
little rudimentary and confusing back then, partly because of
conflicting codeset definitions, but how on earth is it supported these
days from a full-screen device limited to variants of the 8-bit EBCDIC
code?  Linux, and even Windows, now supports directory names and file
names using all but a few restricted UTF-8 characters.   Surely that
means the Unix filesystems on z/OS must now support that as well?  And
of course text data in files on non-z/OS systems these days frequently
uses UTF-8 by default.  How can you even specify Unix file paths on an
ISPF panel when arbitrary UTF-8 characters with no counterparts in any
EBCDIC variant may be in the file path?

Has no one yet figured out how to create a successor to 3270
Architecture and 3270 communication protocol that supports the UTF-8
charset?  If ISPF design is still centered around and restricted by an
architecture that can only support less than 256 different glyphs, that
would seem to be a serious deficiency in today's world.

      JC Ewing

On 7/7/23 20:37, Attila Fogarasi wrote:

Codepage 1047 is obsolete, superseded by 924.  Since this is mainframe, it
remains supported "forever".  Euro did not exist at the time 1047 and 037
and 037-2 were created.  That is one reason that 924 was created, with Euro
symbol amongst other changes.  My suspicion is that the tangled codepage
history has to do with the multiple conflicting divisions at IBM with
printers, PCs, S/3x, 8100 and Series/1 all intersecting on codepage in
various ways.  Most likely all divisions had veto power over codepage
standards.  This is all ancient history and not relevant in the past 20
years, but we have the legacy of strange codepage sets (and hundreds of
them) to deal with.  The politicized ISO standards at the time did not help
matters.  Eventually the answer became Unicode -- and look how that has
struggled for 20+ years to become the standard.

On Sat, Jul 8, 2023 at 11:23 AM Paul Gilmartin <[email protected]> wrote:

On Sat, 8 Jul 2023 09:37:23 +1000, Attila Fogarasi wrote:

Codepage 1047 was created to provide a bi-directional mapping to
ISO8859-1 character codes (this preserves values when going in either

That is not a valid rationale for codepage 1047.  There is a bi-directional
mapping between 037 and ISO8859-1.

direction).  It also included most EBCDIC control codes (mapped to
unused ASCII codepoints) and about half the ASCII control codes (as many as

That is not a valid rationale for codepage 1047.  It may be a reason for
ISO8859-1, which has 32 non-ASCII control codes at 128-159.

would fit).  I think it was created in preparation for OpenEdition MVS
which became USS once it was Unix certified.  Codepage 924 is an update of
CP1047 adding things like Euro sign, and matches ISO8859-15 (not
ISO8859-1).  CP037-2 differs from CP037 at 4 codepoints and is more widely

Which 4?  Did they usurp any USASCII graphic equivalents from 037?  Was
there any reason that neither 037 nor 037-2 could have been used for OMVS?

used than CP037 (though I've encountered CP037-2 implemented with the name
CP037 by various products (!!)).  Luckily for human readable data the
differences don't matter.   I don't know if there are any other CP037-n
codepages, and these days it rarely matters.

"rarely matter" and "don't matter" are in the eye of the beholder.

Does 1047, 037, or 037-2 have €?  Why could neither 037 nor 037-2 have been
used for OMVS?

I remain unpersuaded of any rationale for 1047.
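The point above that 037 already has a bi-directional mapping to ISO8859-1 can be checked mechanically, at least against Python's codec tables. The sketch below assumes Python's `cp037` codec reflects the code page under discussion; the Python standard library ships no 1047 or 924 codec, so those cannot be checked the same way. The function name is made up for illustration.

```python
def is_latin1_permutation(codec: str) -> bool:
    """True if the 256 byte values of codec decode to exactly U+0000..U+00FF.

    When that holds, every byte maps to a distinct ISO8859-1 character
    and back, so conversion in either direction loses nothing.
    """
    decoded = bytes(range(256)).decode(codec)
    return sorted(map(ord, decoded)) == list(range(256))


print(is_latin1_permutation("cp037"))
```

A code page that passes this check contains the same character set as ISO8859-1 and differs only in where each character sits.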

--
gil


----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN

--
Joel C. Ewing

