Re: UNICODE to EBCDIC

Steve Comstock Mon, 23 Sep 2013 11:56:32 -0700

On 9/23/2013 12:02 PM, Paul Gilmartin wrote:

On Mon, 23 Sep 2013 10:44:04 -0600, Steve Comstock wrote:

On 9/23/2013 10:22 AM, John McKown wrote:

If you mean a program, then the UNIX "iconv" command can do that. There is
also the "iconv" set of C language subroutines if you want to write your
own.
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/edclb1c0/3.440
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/cbcpg1c0/8.6.3

If you are really good with COBOL, you can probably figure out how to call
these using COBOL.
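
To see what such a conversion produces before writing any COBOL or C, here is a minimal sketch in Python rather than the iconv calls themselves. It assumes big-endian UTF-16 input with no BOM and EBCDIC code page 037 (Python codec name "cp037"); the function name utf16_to_ebcdic is made up for illustration, and your site may well use a different code page.

```python
def utf16_to_ebcdic(utf16_bytes, codepage="cp037"):
    """Decode big-endian UTF-16 bytes and re-encode them in an EBCDIC code page.

    Assumptions: input has no BOM; code page 037 is only a default guess.
    Raises UnicodeEncodeError if a character has no mapping in the code page.
    """
    text = utf16_bytes.decode("utf-16-be")
    return text.encode(codepage)


if __name__ == "__main__":
    sample = "HELLO".encode("utf-16-be")
    print(utf16_to_ebcdic(sample).hex())  # c8c5d3d3d6
```

Other EBCDIC codecs shipped with Python (for example "cp500" or "cp1140") can be passed as the second argument.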


Actually, COBOL has the intrinsic function DISPLAY-OF, which
converts UTF-16 (national) data to ASCII, EBCDIC, or UTF-8.

UTF-16?  I found an interesting article:

     
http://programmers.stackexchange.com/questions/102205/should-utf-16-be-considered-harmful


Well, duh. Any programmer working with UTF-16 had better be
aware that it is possible to encounter pairs of surrogate
code units representing a single Unicode character. While
these are rare, they are part of the coding scheme, and
the programmer should be prepared to deal with them.

Actually, UTF-8 is more dangerous, since a UTF-8 character
can take 1, 2, 3, or 4 bytes.
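
A quick way to see both effects is to count encoded units for a few characters. This is only a sketch in Python, not anything COBOL-specific; the helper name encoded_sizes and the sample characters are my own choices.

```python
def encoded_sizes(ch):
    """Return (UTF-8 byte count, UTF-16 code unit count) for one character."""
    utf8_bytes = len(ch.encode("utf-8"))
    utf16_units = len(ch.encode("utf-16-be")) // 2  # 2 bytes per code unit
    return utf8_bytes, utf16_units


if __name__ == "__main__":
    # U+0041, U+00E9, U+20AC, U+1F600 (the last is outside the BMP)
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
        utf8_bytes, utf16_units = encoded_sizes(ch)
        print(f"U+{ord(ch):04X}: UTF-8 bytes={utf8_bytes}, UTF-16 units={utf16_units}")
```

Only the last sample needs a surrogate pair (two UTF-16 code units), while the UTF-8 length runs from 1 through 4 bytes across the four samples.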

COBOL 5 introduces some new intrinsic functions to support
this more completely. (The COBOL docs talk about
'supplementary' characters; these are the characters outside
the Basic Multilingual Plane, which UTF-16 encodes as
surrogate pairs.)

Looks like UVALID, ULENGTH, UWIDTH, UPOS, USUBSTR, and
USUPPLEMENTARY could be of service for the programmer
working with UTF-8 and / or UTF-16.


Long term, I think we should move to UTF-32, where every
code point is exactly four bytes long. But I doubt that will
happen anytime soon.

There is a similar builtin in PL/I. For HLASM you can
use the various translate instructions (but you have to
build your own translate table).

I suspect John M. might recommend calling LE services out of HLASM.


I doubt it, since there are no such services. But you can call
Unicode Services from HLASM (your program doesn't have to be
LE-enabled, but it may be).


Things I'd worry about:


I'm sorry: in what context would you worry about these issues?


o Which UNICODE representation: UTF-16, UTF-8, UCS-2, ...?


Presumably documented for the particular environment.


o Which EBCDIC code page?  SBCS?  DBCS?


User's choice


o Is the error handling useful if the SMF UNICODE character is
   absent from the EBCDIC code page?


Which error handling is that? For COBOL DISPLAY-OF you get a
substitution character, which your code can check for after
the fact. For Unicode Services, check the return and
reason codes.
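
The "substitute, then check after the fact" approach can be sketched like this. It is a Python illustration only, not the behavior of DISPLAY-OF or Unicode Services: Python's "replace" handler substitutes a question mark rather than the EBCDIC SUB character, and the function name to_ebcdic_checked and the choice of code page 037 are assumptions of mine.

```python
def to_ebcdic_checked(text, codepage="cp037"):
    """Encode text to EBCDIC, substituting for characters the code page lacks.

    Returns (ebcdic_bytes, unmappable), where unmappable is a list of
    (index, character) pairs for every character that had to be replaced.
    Note: Python substitutes '?', not the EBCDIC SUB character.
    """
    unmappable = []
    for index, ch in enumerate(text):
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            unmappable.append((index, ch))
    return text.encode(codepage, errors="replace"), unmappable


if __name__ == "__main__":
    # The euro sign (U+20AC) is not in code page 037.
    data, bad = to_ebcdic_checked("PRICE \u20ac5")
    print(data.hex(), bad)
```

Returning the positions of the replaced characters alongside the bytes avoids having to guess later whether a substitution character in the output was real data or a replacement.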


o Even, if the SMF data are UTF-8 (or UTF-16) and contain an
   invalid code, is the error handling useful?


See above.


-- gil



--

Kind regards,

-Steve Comstock
The Trainer's Friend, Inc.

303-355-2752
http://www.trainersfriend.com

* We are going out of business effective 30 December, 2013

* To purchase a set of our training materials at terrific prices,
  check out our Going Out Of Business Sale:

    http://www.trainersfriend.com/SpecialSale

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
