On 9/23/2013 12:02 PM, Paul Gilmartin wrote:
On Mon, 23 Sep 2013 10:44:04 -0600, Steve Comstock wrote:
On 9/23/2013 10:22 AM, John McKown wrote:
If you mean a program, then the UNIX "iconv" command can do that. There is
also the "iconv" set of C language subroutines if you want to write your
own.
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/edclb1c0/3.440
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/cbcpg1c0/8.6.3
If you are really good with COBOL, you can probably figure out how to call
these using COBOL.
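
For readers without a z/OS system at hand, the kind of conversion iconv performs can be sketched in Python with its built-in codecs. This is not iconv itself, and the code page (cp037, EBCDIC US/Canada) and the helper names are assumptions chosen only for illustration.

```python
# Sketch only: Python's built-in codecs stand in for iconv here.
# "cp037" is an assumed EBCDIC code page; substitute the one your data uses.

def utf8_to_ebcdic(data, codepage="cp037"):
    """Decode UTF-8 bytes and re-encode them in the given EBCDIC code page."""
    return data.decode("utf-8").encode(codepage)

def ebcdic_to_utf8(data, codepage="cp037"):
    """Decode EBCDIC bytes and re-encode them as UTF-8."""
    return data.decode(codepage).encode("utf-8")

sample = "HELLO".encode("utf-8")
ebcdic = utf8_to_ebcdic(sample)
print(ebcdic.hex())            # c8c5d3d3d6
print(ebcdic_to_utf8(ebcdic))  # b'HELLO'
```

Both helpers raise an exception on input that cannot be converted, which is stricter than a substitution-character approach.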
Actually, COBOL has the built-in function DISPLAY-OF, which
converts UTF-16 to ASCII, EBCDIC, or UTF-8.
UTF-16? I found an interesting article:
http://programmers.stackexchange.com/questions/102205/should-utf-16-be-considered-harmful
Well, duh. Any programmer working with UTF-16 had better be
aware that it is possible to encounter pairs of surrogate
characters representing a single Unicode character. While
these are rare, they are part of the coding scheme, and
the programmer should be prepared to deal with them.
Actually, UTF-8 is more dangerous, since a UTF-8 character
can take 1, 2, 3, or 4 bytes.
COBOL 5 introduces some new intrinsic functions to support
this more completely. (For some reason the COBOL docs talk
about 'supplementary' characters; these are the characters
that need a surrogate pair in UTF-16.)
Looks like UVALID, ULENGTH, UWIDTH, UPOS, USUBSTR, and
USUPPLEMENTARY could be of service for the programmer
working with UTF-8 and / or UTF-16.
Long term, I think we should move to UTF-32, where character
length is always four bytes. But I doubt if that will happen
anytime soon.
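
The width differences among the three encoding forms mentioned above can be checked with a small Python sketch. The sample characters and the function name `widths` are assumptions picked for illustration; the point is only to show 1 to 4 bytes in UTF-8, one or two 16-bit units in UTF-16, and a constant four bytes in UTF-32.

```python
# Sketch: measure how much space one character takes in each encoding form.
# The "-be" codec variants are used so that no byte-order mark is counted.

def widths(ch):
    """Return (UTF-8 bytes, UTF-16 code units, UTF-32 bytes) for one character."""
    utf8_bytes = len(ch.encode("utf-8"))
    utf16_units = len(ch.encode("utf-16-be")) // 2
    utf32_bytes = len(ch.encode("utf-32-be"))
    return utf8_bytes, utf16_units, utf32_bytes

# U+0041, U+00E9, U+20AC, and U+1F600 (a supplementary character)
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X}", widths(ch))
# U+0041 (1, 1, 4)
# U+00E9 (2, 1, 4)
# U+20AC (3, 1, 4)
# U+1F600 (4, 2, 4)
```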
There is a similar builtin in PL/I. For HLASM you can
use the various translate instructions (but you gotta
build your own translate table).
I suspect John M. might recommend calling LE services out of HLASM.
I doubt it, since there are no such services. But you can call
Unicode Services from HLASM (don't have to be LE-enabled but
you may be).
Things I'd worry about:
I'm sorry: in what context would you worry about these issues?
o Which UNICODE representation: UTF-16, UTF-8, UCS-2, ...?
Presumably documented for particular environment
o Which EBCDIC code page? SBCS? DBCS?
User's choice
o Is the error handling useful if the SMF UNICODE character is
absent from the EBCDIC code page?
Which error handling is that? For COBOL DISPLAY-OF you get a
substitution character, which your code can check for after
the fact. For Unicode Services check standard return and
reason codes.
o Even, if the SMF data are UTF-8 (or UTF-16) and contain an
invalid code, is the error handling useful?
See above.
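
The two error cases raised above (a character missing from the target EBCDIC code page, and invalid UTF-8 input) can be illustrated with a Python sketch. This is not how DISPLAY-OF or Unicode Services behave internally: Python's codecs stand in for them, cp037 is an assumed code page, and both helper names are hypothetical.

```python
# Sketch: Python codecs stand in for DISPLAY-OF / Unicode Services.
# "cp037" is an assumed EBCDIC code page.

def to_ebcdic_with_check(text, codepage="cp037"):
    """Return (ebcdic_bytes, had_substitution)."""
    try:
        return text.encode(codepage), False
    except UnicodeEncodeError:
        # Fall back to a substitution character and report that it happened.
        return text.encode(codepage, errors="replace"), True

def decode_utf8_checked(data):
    """Return the decoded text, or None if the input is not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None

print(to_ebcdic_with_check("ABC"))           # (b'\xc1\xc2\xc3', False)
print(to_ebcdic_with_check("5 \u20ac")[1])   # True: the euro sign is not in cp037
print(decode_utf8_checked(b"\xc3\x28"))      # None: 0xC3 needs a continuation byte
```

Returning a flag alongside the converted bytes mirrors the "check after the fact" approach described for DISPLAY-OF, while the `None` result plays the role of a non-zero return code.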
-- gil
--
Kind regards,
-Steve Comstock
The Trainer's Friend, Inc.
303-355-2752
http://www.trainersfriend.com