On Wed, 2017 May  3 23:06+0000, Thorsten Glaser wrote:
>
> >in the terminal. Effectively an ASCII->EBCDIC->ASCII round trip.
>
> While I still cringe at the way you Mainframe types (also the IBM docs
> I read) write ASCII and mean 8-bit charsets, yes, that’s exactly what
> I was thinking about.

Hey, the issue remains even for the 7-bit core of ASCII :]

(I'm not really a mainframe type, because I have no beard ^_^)

> >I don't know if there are use cases where this may yield
> >unintuitive results... perhaps if this "nega-UTF-8" were redirected
> >to a file and then processed further in z/OS, that may lead to some
> >surprises. But in
>
> That was *also* what I was thinking about. But then I remembered (and
> this is why I replied to the old thread for this) that you said that
> UTF-8 is rarely seen and UTF-EBCDIC isn’t seen at all and UCS-{2,4} is
> just used on the system if needed.

Right. If and when general Unicode is handled on z/OS, it's usually data
in transit, rather than anything that is consumed locally. That's why
z/OS still only does 8-bit codepages in 2017...

> Of course, ASCII-mksh on z/OS would do away with all this; out of
> curiosity, IIRC you said something about wanting to create an ASCII
> environment for z/OS, have you come any closer to that?

I was working on that for a time, but the effort ultimately fell off
track due to a number of severe bugs in the USS environment (basic stuff
like signal behavior, floating-point ops returning incorrect results,
etc.). I've been working with IBM to get these fixed, and may yet return
to the ASCII work one day. But nowadays, even IBM themselves are hawking
z/Linux more aggressively than z/OS proper, so it would all become moot
if we go that path.

> (There’s that added data point of \uXXXX on EBCDIC-mksh writing stuff
> to local files that ASCII-mksh could not read… unless it were able to
> access the files “as if” they were autoconverted to extended ASCII. Oh
> well, or just use iconv…)

FWIW, both the iconv(1) program and the iconv(3) API are available on
z/OS. They don't support as many encodings as glibc's iconv, but the
basics are there.
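To make the "just use iconv" fix concrete: what the conversion amounts to
is a single-byte remap between the local EBCDIC codepage and extended
ASCII. Here's a rough stdlib-only Python sketch of that remap; note that
Python ships no IBM-1047 codec, so cp037 (another EBCDIC Latin-1
variant) stands in for it, and the iconv invocation named in the comment
is only what I'd expect to type on z/OS, not something tested here:

```python
# Sketch of what an iconv round trip does: a byte-for-byte remap between
# an EBCDIC codepage and extended ASCII (Latin-1).
# Assumption: cp037 stands in for IBM-1047 (no IBM-1047 codec in the
# Python stdlib). On z/OS itself this would be something like
#   iconv -f IBM-1047 -t ISO8859-1 <file>
ebcdic = "Hello, world.".encode("cp037")            # local EBCDIC bytes
ascii_ = ebcdic.decode("cp037").encode("latin-1")   # -> extended ASCII
assert ascii_ == b"Hello, world."                   # readable by ASCII tools
assert ascii_.decode("latin-1").encode("cp037") == ebcdic  # lossless both ways
```

Since both codepages are single-byte, the conversion is lossless in
either direction, which is what makes the "as if autoconverted" reading
of local files plausible at all.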

> So, to summarise, I believe we both agree in saying that this (botch
> \uXXXX (and base-1 integers in utf8-mode) to output not UTF-8 but
> UTF-8-converted-as-extended-ASCII-to-EBCDIC) makes sense, or, at
> least, more than not doing it?

Yes, I agree with that. I don't see the usefulness of a shell generating
real UTF-8 in the EBCDIC environment, at least not as the default.

Nega-UTF-8 > UTF-8, in other words, and certainly nega-UTF-8 >> error.
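For the archives, the scheme we're agreeing on can be sketched in a few
lines of Python. This is only an illustration: again, cp037 stands in
for the IBM-1047 codepage an EBCDIC mksh would actually use, since the
Python stdlib has no IBM-1047 codec, and the helper names are mine:

```python
# "Nega-UTF-8" sketch: \uXXXX expands to the real UTF-8 byte sequence,
# which is then treated as extended ASCII (Latin-1) and mapped
# byte-for-byte into EBCDIC.
# Assumption: cp037 stands in for IBM-1047 (stdlib-only illustration).

def nega_utf8(s: str) -> bytes:
    utf8 = s.encode("utf-8")                       # real UTF-8 bytes
    return utf8.decode("latin-1").encode("cp037")  # per-byte ASCII->EBCDIC

def un_nega(b: bytes) -> str:
    # Inverse: EBCDIC -> extended ASCII, then decode the bytes as UTF-8.
    return b.decode("cp037").encode("latin-1").decode("utf-8")

euro = nega_utf8("\u20ac")
assert un_nega(euro) == "\u20ac"            # round trip is lossless
assert euro != "\u20ac".encode("utf-8")     # but the bytes are not UTF-8
assert nega_utf8("A") == "A".encode("cp037")  # ASCII core degrades to plain EBCDIC
```

The last assertion is the nice property: for the 7-bit core, nega-UTF-8
is indistinguishable from ordinary EBCDIC output, so only characters
outside ASCII produce the "surprising" multi-byte sequences.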


--Daniel


-- 
Daniel Richard G. || [email protected]
My ASCII-art .sig got a bad case of Times New Roman.
