Re: [PATCH] IBM z/OS + EBCDIC support

Thorsten Glaser Fri, 01 May 2015 09:40:06 -0700

Daniel Richard G. dixit:

>Hate to disappoint...


oO

>But tr(1) does support octal escapes, so you could do e.g.
>
>    $ echo a | tr '\201' X
>    X

OK, wonderful, that should work.
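
A minimal sketch of that approach which does not hard-code the code
page, assuming a POSIX printf(1) that accepts the "'c" notation for a
character's numeric value. The helper names octal_of and a_to_x are
made up for illustration:

```shell
# Hypothetical helper: print the octal code of a single character in the
# code page the shell runs in (141 for "a" in ASCII, 201 in EBCDIC 1047).
octal_of() {
	printf '%o\n' "'$1"
}

# Replace every "a" with "X" by handing tr(1) the octal escape
# instead of the literal character.
a_to_x() {
	tr "\\$(octal_of a)" X
}

echo a | a_to_x
```

The same script text should then work unchanged on both an ASCII and
an EBCDIC host, since the octal value is computed at run time.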

[ set the compiler charset option ]
>themselves anyway. A Build.sh option just seems like overkill to me.

Right, but the idea is that *if* we need the selected charset
in mksh anyway, it’s easier and cheaper to do it like that.
If not, then, sure.

>Couldn't you just keep the backslash-newlines in the *.gen files? Are

Several shells’ read -r option is broken, so if I use that,
I get the exact same problem for a different set of ancient
shells ☹

I could write it all on one line in the source tree, but
that’s like giving up. I convert them for the releases
already anyway… but I feel stupid right now.

The files have only one user each, so I can just pull the
macros out of them (I think – without looking). Duh!

[ host C tool ]
>It's possible some chtag(1) tagging might be needed, as encodings could
>potentially get mixed up in certain instances. (This was the case with a
>-qascii Bash build)

Sure, we’ll see about this when it becomes relevant. I assume
that’s just one more line, “if OS/390 then chtag”, in Build.sh.
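
Roughly like this, as a sketch only: maybe_chtag is a made-up helper,
not something Build.sh contains, and the code set to pass would depend
on how the binary was built. It assumes uname -s reports OS/390 on
z/OS and that chtag(1) takes -t (text) and -c (code set):

```shell
# Hypothetical helper for a build script: tag a file on z/OS only.
# Arguments: OS name (as printed by uname -s), code set, file name.
maybe_chtag() {
	case $1 in
	OS/390)
		# mark the file as text in the given code set
		chtag -tc "$2" "$3"
		;;
	*)
		# other systems have no file tags; nothing to do
		:
		;;
	esac
}

# Example call (commented out, the file name is a placeholder):
# maybe_chtag "$(uname -s)" IBM-1047 some-generated-file
```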

>> >I'm presuming this would be wchar_t and its related functions?
>>
>> Absolutely no! These are extremely unportable.
>
>I suspected as much, but I've never actually had to deal with them, so
>was unsure. I'll keep that point in mind, however.

If you target NetBSD, GNU, musl or Solaris, they can be used.
MirBSD too, though we have only one locale. Anything else, or
even older versions of these systems, will suck. Plus, the local
admin must usually install the locale you want to use.

So, this is not for mksh in the generic case. We *can* use this
in the specific z/OS case, though.

>Oh, okay. I don't think the conditionals should get hairy... and if they
>do, then there is probably a better way of going about it.

Right.

>> We can have, say, zmksh (and zlksh), for which this does not hold.
>
>Is it convention to name the binaries differently for nonstandard
>variants? (E.g. the native Win32 port would also have modified names?)

I wish for it to be convention, so they don’t accidentally get used when
        #!/usr/bin/env mksh
is a script’s shebang line. Granted, you can possibly check $KSH_VERSION
but tbh that’s like enabling UTF-8 mode for scripts by default if the
current locale is Unicode: too many scripts (the majority) implicitly
assume LC_ALL=C and don’t set that. So no, I prefer to not put this
burden on the script writers.

>The code page is set at compile time, with the -qconvlit option. From
>the xlc(1) man page:

Thanks.

>believe the only case where this could become an issue is when you have
>mismatched code pages (e.g. EBCDIC 1047 mksh + EBCDIC 037 user), and
>then you pray that as many code points agree as possible. This, IMO,
>falls squarely in the category of "user caveat."

So we assume EBCDIC 1047 mksh + EBCDIC 037 user is allowed to fail,
and we only really have to support the code page used at compilation.

>This situation could change, however, once mksh is doing UTF-16
>internally. Then, because it has to translate everything to and from the
>outside world anyway, I see no reason why it couldn't use a 1047 table
>for user A, and a 037 table for user B. Perhaps even straight UTF-8 for

I think there are two things speaking against this:
• the compiler transcodes the strings and chars already anyway,
  and we rely on that too much
• there is a speed and simplicity advantage of having only one charset
Weak reasons, but as this is already very tricky, they should be
kept in mind.

>There are definitely other EBCDIC platforms, but will they become
>relevant? That all depends on whether there's some random schmuck
>messing around on those systems who takes a liking to your project :)

;-)

>I'm not sure about "Z/OS MKSH", however, if the -qascii build would have
>"MIRBSD MKSH". Both are z/OS, after all, and the only thing
>significantly different about the EBCDIC build is, well, EBCDIC.

OK. So how about “EBCDIC MKSH” for zmksh, keeping “MIRBSD KSH” for mksh
(historic reasons, I’d use MKSH there nowadays)?
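
If that naming were adopted, a script that does want to look at
$KSH_VERSION could tell the variants apart roughly as follows. This
is a sketch: ksh_flavour is a made-up name, and “EBCDIC MKSH” is
only the string proposed here, not something a released shell prints.

```shell
# Hypothetical helper: classify a KSH_VERSION string.
ksh_flavour() {
	case $1 in
	*"EBCDIC MKSH"*) echo ebcdic ;;   # proposed name, an assumption
	*"MIRBSD KSH"*) echo mksh ;;
	*) echo other ;;
	esac
}

# Classify the shell running this script; KSH_VERSION may be unset.
ksh_flavour "${KSH_VERSION:-}"
```

Separate binary names would still spare script writers from having to
do this at all.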

>(Couldn't get uhr to work with R50 on my Debian system, however... lots
>of "no coprocess" errors...)

Huh.

tg@tglase-eee:~ $ zcat /usr/share/doc/mksh/examples/uhr.gz | mksh

This works OOTB for me. But you do have to install bc(1) first;
unlike real Unix systems, absolutely-basic-should-be-everywhere
tools like bc, ed, uudecode are not installed by default on GNU.
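
A quick way to check for that before piping the script into mksh; the
helper names are made up, and the sketch only assumes a POSIX
“command -v”:

```shell
# Hypothetical helper: succeed if the named tool is found.
have_tool() {
	command -v "$1" >/dev/null 2>&1
}

# Print the name of every listed tool that is not installed.
missing_tools() {
	for tool in "$@"; do
		have_tool "$tool" || echo "$tool"
	done
}

missing=$(missing_tools bc)
if [ -n "$missing" ]; then
	echo "please install first: $missing" >&2
fi
```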

Now go get some sleep ;-)

bye,
//mirabilos
-- 
“The final straw, to be honest, was probably my amazement at the volume of
petty, peevish whingeing certain of your peers are prone to dish out on
d-devel, telling each other how to talk more like a pretty princess, as though
they were performing some kind of public service.” (someone to me, privately)
