USS does support Unicode ... sort of.
Actually, any support of Unicode these days is still "sort of", so I
don't mean to slam USS's support of it.  It's just that USS is (still)
heavily constrained by non-Unicode compatibility requirements (as you
would probably expect).  Look for UTF-8 and how to enable it.

The quickest tell for me, w/r/t which EBCDIC code page was used, is
where the square brackets land.  If I see the brackets at AD and BD,
then it's probably CP 1047.  If I see them at BA and BB, then it's a
good guess that it's CP 037.  But most of my world involves C source
or other things rich in square brackets.  YMMV.  Scanning for the not
sign (¬) is another helpful hint.
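
If it helps, here's a rough sketch of that bracket test in Python (the
file name and the "unknown" fallback are placeholders, and it only
tries to tell 1047 and 037 apart, nothing else):

# Count the bytes that would be square brackets under CP 1047 (X'AD',
# X'BD') versus CP 037 (X'BA', X'BB') and report whichever pair shows
# up more often.  A tie, or no brackets at all, proves nothing.
def guess_bracket_codepage(data):
    cp1047_hits = data.count(0xAD) + data.count(0xBD)
    cp037_hits = data.count(0xBA) + data.count(0xBB)
    if cp1047_hits > cp037_hits:
        return "IBM-1047"
    if cp037_hits > cp1047_hits:
        return "IBM-037"
    return "unknown"

with open("sample.ebcdic", "rb") as f:    # placeholder file name
    print(guess_bracket_codepage(f.read()))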

-- R;   <><
On Fri, Mar 5, 2010 at 13:38, Bob Cronin <[email protected]> wrote:
> This is not specifically pipeline-related (although if there's a solution,
> I'll likely implement it using pipelines). I'd just like to pick the brains
> of a lot of very smart people with lots of IBM mainframe experience ...
>
> Can anyone suggest possible approaches to the problem of examining an
> arbitrary collection of EBCDIC text (all presumed to have been prepared
> using the same codepage) and somehow determining which codepage that was?
> ASCII mail clients (such as Lotus Notes) have functionality to choose a
> "best match" ASCII character set to use for Internet mail. I would like to
> be able to do the same thing for EBCDIC (so that when I convert it to ASCII,
> I choose an EBCDIC-to-ASCII translation table that has the maximum
> probability of delivering the correct characters). I need to detect both
> single- and double-byte EBCDIC encodings. At present I use a somewhat
> cumbersome table-driven approach which defines the most likely EBCDIC
> codepage to be in use by the users of a given VM system (e.g. if that system
> is in Japan, I presume EBCDIC 939). I'd like to try to improve it.
>
> I'd really rather just use Unicode, but alas, VM does not support it (not
> sure about MVS, but I suspect not).
> --
> bc
>
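
P.S.  For the table-driven part, a minimal sketch of what I picture:
a per-system default table, falling back to the bracket test above
when the system isn't listed.  The system names are invented, and
IBM-939 for the Japanese system just follows your own example.

# Hypothetical per-system defaults; guess_bracket_codepage() is the
# sketch from earlier in this note.
SYSTEM_DEFAULTS = {
    "TOKYOVM": "IBM-939",   # double-byte Japanese, per Bob's example
    "POKVM1":  "IBM-037",
    "LABVM2":  "IBM-1047",
}

def guess_codepage(system, sample):
    if system in SYSTEM_DEFAULTS:
        return SYSTEM_DEFAULTS[system]
    guess = guess_bracket_codepage(sample)
    return guess if guess != "unknown" else "IBM-037"  # arbitrary last resort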
