Re: UNICODE to EBCDIC

2013-10-06 Thread Shmuel Metz (Seymour J.)
In 006101ceb8b8$8fd70610$af851230$@mcn.org, on 09/23/2013
   at 04:56 PM, Charles Mills charl...@mcn.org said:

> Unicode is not a character set

Sure it is.

> If it's UTF-8

UTF-8 is a transform, not a character set, even though you can specify
it in charset for MIME.

> If it's UTF-16 or UCS-2 you can do a 98% job if you just discard
> bytes 0, 2, 4, ... and treat bytes 1, 2, 5, ... as ASCII.

It's not my dog.
 
-- 
 Shmuel (Seymour J.) Metz, SysProg and JOAT
 ISO position; see http://patriot.net/~shmuel/resume/brief.html 
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: UNICODE to EBCDIC

2013-09-28 Thread Shmuel Metz (Seymour J.)
In 9791327634617405.wa.paulgboulderaim@listserv.ua.edu, on
09/23/2013
   at 09:43 PM, Paul Gilmartin paulgboul...@aim.com said:

> o (not mentioned) A simple byte-by-byte lexical sort doesn't
>   sort into UNICODE code point order (but I'm guessing).  This
>   may not be very important.

I suspect that where there is a need for Unicode then there is also a
need to use the sort order for a particular country, which will not in
general be either the octet order or the code-point order.
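
To make the distinction between octet order and code-point order concrete, here is a minimal Python sketch. It assumes Python's built-in cp037 codec as a stand-in EBCDIC code page (the thread does not name one), and the sample strings are invented for illustration.

```python
# Compare two machine sort orders for the same strings.
# cp037 is only a stand-in EBCDIC code page chosen for illustration.
words = ["apple", "Zebra", "42", "\u00c4pfel"]

# Order by Unicode code point (Python's default string comparison).
by_code_point = sorted(words)

# Order by the octets of the EBCDIC (cp037) encoding:
# lowercase letters sort before uppercase, and digits sort last.
by_ebcdic_octets = sorted(words, key=lambda w: w.encode("cp037"))

print(by_code_point)
print(by_ebcdic_octets)
```

Neither result is a dictionary order that a reader in any particular country would expect; that takes a real collation service, which this sketch does not attempt.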

> It doesn't say (but perhaps TR # 16 does) which EBCDIC code page
> (or for that matter, which flavor of ASCII

There is only one flavor of ASCII. The ISO 8859 character sets are not
ASCII.

Generally you will need to tell any conversion program what the source
and target code pages are.
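
A small Python sketch of why both code pages have to be named. The helper `convert` and the sample field are hypothetical, and cp037 and cp500 are used only because Python ships codecs for them; a real system may well use a different EBCDIC page.

```python
def convert(raw: bytes, source: str, target: str) -> bytes:
    """Decode raw bytes from the source encoding, re-encode in the target."""
    return raw.decode(source).encode(target)

# A made-up field value as it might appear in big-endian UTF-16.
field = "A[1]".encode("utf-16-be")

as_cp037 = convert(field, "utf-16-be", "cp037")
as_cp500 = convert(field, "utf-16-be", "cp500")

# Letters and digits agree across these two EBCDIC pages; the square
# brackets do not, so the choice of target page changes the output bytes.
print(as_cp037.hex(), as_cp500.hex())
```

The same applies in reverse: decoding EBCDIC bytes with the wrong page silently yields the wrong brackets rather than an error.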
 
-- 
 Shmuel (Seymour J.) Metz, SysProg and JOAT
 ISO position; see http://patriot.net/~shmuel/resume/brief.html 
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)



Re: UNICODE to EBCDIC

2013-09-24 Thread Charles Mills
Also, I don't know what I was thinking when I said "assuming assembler."
Although the Unicode Services API is very much in the classic IBM z/OS
assembler style, it is supported from C/C++, and that is in fact how I am
using it.

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Charles Mills
Sent: Monday, September 23, 2013 4:57 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: UNICODE to EBCDIC

z/OS Unicode Services is an AWESOME facility but there is a little bit of a 
learning curve (or coding curve if there is such a thing). It will certainly 
handle whatever you need, assuming assembler is viable option for you.

Unicode is not a character set (or format) -- it's a whole family of 
character sets. http://en.wikipedia.org/wiki/Unicode. If it's UTF-8 then you 
can do a 98% job if you just treat it as ASCII. If it's UTF-16 or UCS-2 you can 
do a 98% job if you just discard bytes 0, 2, 4, ... and treat bytes 1, 2, 5, 
... as ASCII.

There is actually a Unicode EBCDIC (UTF-EBCDIC) but it's pretty obscure.

Charles
-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Donald Likens
Sent: Monday, September 23, 2013 8:55 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: UNICODE to EBCDIC

WebSphere Application Server supplies some of its information in its SMF 
records in Unicode format. Is there a facility available to convert Unicode to 
EBCDIC?



Re: UNICODE to EBCDIC

2013-09-23 Thread Scott Barry
On Mon, 23 Sep 2013 10:54:44 -0500, Donald Likens dlik...@infosecinc.com wrote:

> WebSphere Application Server supplies some of its information in its SMF
> records in Unicode format. Is there a facility available to convert Unicode to
> EBCDIC?


Suggest structured Internet search argument --  convert unicode to ebcdic 
site:ibm.com

Scott Barry
SBBWorks, Inc.



Re: UNICODE to EBCDIC

2013-09-23 Thread Steve Comstock

On 9/23/2013 9:54 AM, Donald Likens wrote:

> WebSphere Application Server supplies some of its information in its SMF
> records in Unicode format. Is there a facility available to convert Unicode to
> EBCDIC?



Which EBCDIC code page would you like?

Check out Unicode Services User's Guide and Reference, SA22-7649-14


--

Kind regards,

-Steve Comstock
The Trainer's Friend, Inc.

303-355-2752
http://www.trainersfriend.com

* We are going out of business effective 30 December, 2013

* To purchase a set of our training materials at terrific prices,
  check out our Going Out Of Business Sale:

http://www.trainersfriend.com/SpecialSale



Re: UNICODE to EBCDIC

2013-09-23 Thread Barry Merrill
The SAS Language (and of course, MXG Software, which processes every SMF record
on the face of the earth and is written in the SAS Language) has NO problem
inputting Unicode data and storing it as either ASCII or EBCDIC characters.

And it's not ONLY WEBSPHERE SMF that has UNICODE.

Barry


Herbert W. “Barry” Merrill, PhD
President-Programmer
MXG Software
Merrill Consultants
10717 Cromwell Drive
Dallas, TX 75229
ba...@mxg.com

http://www.mxg.com - FAQ has Most Answers 
ad...@mxg.com  – invoices/PO/Payment
supp...@mxg.com– technical
tel: 214 351 1966  - expect slow reply, use email 
fax: 214 350 3694  – prefer email, still works

 

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Donald Likens
Sent: Monday, September 23, 2013 10:55 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: UNICODE to EBCDIC

WebSphere Application Server supplies some of its information in its SMF 
records in Unicode format. Is there a facility available to convert Unicode to 
EBCDIC?



Re: UNICODE to EBCDIC

2013-09-23 Thread Clark Morris
On 23 Sep 2013 09:23:00 -0700, in bit.listserv.ibm-main you wrote:

> If you mean a program, then the UNIX iconv command can do that. There is
> also the iconv set of C language subroutines if you want to write your
> own.
> http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/edclb1c0/3.440
> http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/cbcpg1c0/8.6.3
>
> If you are really good with COBOL, you can probably figure out how to call
> these using COBOL. Likewise with PL/I or HLASM. If _I_ needed these in
> COBOL, I would likely write an HLASM stub routine to marshall the
> argument to/from the COBOL / C calling conventions.
>
> DFSORT can do it, with some difficulty, on a field basis by using the
> TRAN=ALTSEQ phrase in an INREC or OUTREC FIELDS= command. Too bad there
> isn't an easy way that I can see to just use UNICODE System Services.

While I haven't looked at the manual recently, COBOL provides a way
to describe UNICODE fields, EBCDIC fields, and ISO 8-bit character
fields, so I am fairly certain that simple MOVE statements among
them would suffice.

Clark Morris




Re: UNICODE to EBCDIC

2013-09-23 Thread John McKown
On Mon, Sep 23, 2013 at 11:44 AM, Steve Comstock
st...@trainersfriend.com wrote:

> On 9/23/2013 10:22 AM, John McKown wrote:
>
>> If you mean a program, then the UNIX iconv command can do that. There is
>> also the iconv set of C language subroutines if you want to write your
>> own.
>> http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/edclb1c0/3.440
>> http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/cbcpg1c0/8.6.3
>>
>> If you are really good with COBOL, you can probably figure out how to call
>> these using COBOL.
>
> Actually, COBOL has the builtin function DISPLAY-OF that
> converts UTF-16 to ASCII, EBCDIC, or UTF-8

That's a new intrinsic to me. I don't use COBOL very much. Thanks.


 --

 Kind regards,

 -Steve Comstock
 The Trainer's Friend, Inc.

 303-355-2752
 http://www.trainersfriend.com



-- 
As of next week, passwords will be entered in Morse code.

Maranatha! 
John McKown



Re: UNICODE to EBCDIC

2013-09-23 Thread Steve Comstock

On 9/23/2013 10:22 AM, John McKown wrote:

> If you mean a program, then the UNIX iconv command can do that. There is
> also the iconv set of C language subroutines if you want to write your
> own.
> http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/edclb1c0/3.440
> http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/cbcpg1c0/8.6.3
>
> If you are really good with COBOL, you can probably figure out how to call
> these using COBOL.


Actually, COBOL has the builtin function DISPLAY-OF that
converts UTF-16 to ASCII, EBCDIC, or UTF-8

There is a similar builtin in PL/I. For HLASM you can
use the various translate instructions (but you gotta'
build your own translate table).
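
As an illustration of the "build your own translate table" idea, here is a minimal Python sketch that builds a 256-byte table and applies it byte by byte, much as a translate instruction would. It assumes ISO 8859-1 input and cp037 output purely as examples; the names `LATIN1_TO_CP037` and `latin1_to_ebcdic` are hypothetical.

```python
# Build a 256-byte translate table mapping ISO 8859-1 (Latin-1) bytes
# to EBCDIC cp037 bytes. Both code pages are assumptions for this sketch.
LATIN1_TO_CP037 = bytes(range(256)).decode("latin-1").encode("cp037")

def latin1_to_ebcdic(data: bytes) -> bytes:
    """Translate each input byte through the table, one byte at a time."""
    return data.translate(LATIN1_TO_CP037)

sample = b"HELLO, WORLD"
print(latin1_to_ebcdic(sample).hex())
```

A byte-for-byte table like this only fits single-byte input. UTF-8 or UTF-16 data has to be decoded to code points first, which is where a conversion service earns its keep.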



> Likewise with PL/I or HLASM. If _I_ needed these in
> COBOL, I would likely write an HLASM stub routine to marshall the
> argument to/from the COBOL / C calling conventions.
>
> DFSORT can do it, with some difficulty, on a field basis by using the
> TRAN=ALTSEQ phrase in an INREC or OUTREC FIELDS= command. Too bad there
> isn't an easy way that I can see to just use UNICODE System Services.



--

Kind regards,

-Steve Comstock
The Trainer's Friend, Inc.

303-355-2752
http://www.trainersfriend.com



Re: UNICODE to EBCDIC

2013-09-23 Thread Steve Comstock

On 9/23/2013 12:02 PM, Paul Gilmartin wrote:

> On Mon, 23 Sep 2013 10:44:04 -0600, Steve Comstock wrote:
>
>> On 9/23/2013 10:22 AM, John McKown wrote:
>>
>>> If you mean a program, then the UNIX iconv command can do that. There is
>>> also the iconv set of C language subroutines if you want to write your
>>> own.
>>> http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/edclb1c0/3.440
>>> http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/cbcpg1c0/8.6.3
>>>
>>> If you are really good with COBOL, you can probably figure out how to call
>>> these using COBOL.
>>
>> Actually, COBOL has the builtin function DISPLAY-OF that
>> converts UTF-16 to ASCII, EBCDIC, or UTF-8
>
> UTF-16?  I found an interesting article:
>
> http://programmers.stackexchange.com/questions/102205/should-utf-16-be-considered-harmful


Well, duh. Any programmer working with UTF-16 had better be
aware that it is possible to encounter pairs of surrogate
characters representing a single Unicode character. While
these are rare, they are part of the coding scheme, and
the programmer should be prepared to deal with it.

Actually, UTF-8 is more dangerous, since a UTF-8 character
can take 1, 2, 3, or 4 bytes.

COBOL 5 introduces some new intrinsic functions to support
this more completely. (For some reason the COBOL docs talk
about 'supplementary' characters; these are surrogate pair
situations.)

Looks like UVALID, ULENGTH, UWIDTH, UPOS, USUBSTR, and
USUPPLEMENTARY could be of service for the programmer
working with UTF-8 and / or UTF-16.


Long term, I think we should move to UTF-32, where character
length is always four bytes. But I doubt if that will happen
anytime soon.
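
The size differences described above are easy to see in a short Python sketch. The sample characters are arbitrary choices, and `encoded_sizes` is a hypothetical helper, not part of any product mentioned in this thread.

```python
# Sample characters: ASCII, Latin-1, a BMP character, and one outside the
# BMP (which UTF-16 must encode as a surrogate pair).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes the character needs in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

Only the UTF-32 column is constant, which is the property the paragraph above is asking for; both UTF-8 and UTF-16 are variable-length once supplementary characters are in play.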





>> There is a similar builtin in PL/I. For HLASM you can
>> use the various translate instructions (but you gotta'
>> build your own translate table).
>
> I suspect John M. might recommend calling LE services out of HLASM.


I doubt it, since there are no such services. But you can call
Unicode Services from HLASM (don't have to be LE-enabled but
you may be).



> Things I'd worry about:

I'm sorry: in what context would you worry about these issues?

> o Which UNICODE representation: UTF-16, UTF-8, UCS-2, ...?

Presumably documented for particular environment

> o Which EBCDIC code page?  SBCS?  DBCS?

User's choice

> o Is the error handling useful if the SMF UNICODE character is
>    absent from the EBCDIC code page?


Which error handling is that? For COBOL DISPLAY-OF you get a
substitution character, which your code can check for after
the fact. For Unicode Services check standard return and
reason codes.




> o Even, if the SMF data are UTF-8 (or UTF-16) and contain an
>    invalid code, is the error handling useful?


See above.
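
The two error cases raised in the questions above can be sketched in Python. This shows Python's own codec behaviour only, as an analogy: it is not how COBOL DISPLAY-OF or z/OS Unicode Services report errors, and cp037 is an assumed target page.

```python
# Case 1: a valid character that the target EBCDIC page cannot represent.
text = "A\u03a9"  # "A" followed by GREEK CAPITAL LETTER OMEGA

try:
    text.encode("cp037")            # strict mode refuses to guess
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# errors="replace" substitutes "?" for the unmappable character.
substituted = text.encode("cp037", errors="replace")

# Case 2: input bytes that are not valid UTF-8 in the first place.
bad_utf8 = b"A\xc3\x28"             # 0xC3 needs a continuation byte after it

try:
    bad_utf8.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# errors="replace" marks the damage with U+FFFD, which a caller can test for.
repaired = bad_utf8.decode("utf-8", errors="replace")
```

Either way the caller has a choice between a hard failure and a substitution that can be detected afterwards, which mirrors the return-code versus substitution-character split described above.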



> -- gil




--

Kind regards,

-Steve Comstock
The Trainer's Friend, Inc.

303-355-2752
http://www.trainersfriend.com



Re: UNICODE to EBCDIC

2013-09-23 Thread Paul Gilmartin
On Mon, 23 Sep 2013 10:44:04 -0600, Steve Comstock wrote:

> On 9/23/2013 10:22 AM, John McKown wrote:
>> If you mean a program, then the UNIX iconv command can do that. There is
>> also the iconv set of C language subroutines if you want to write your
>> own.
>> http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/edclb1c0/3.440
>> http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/cbcpg1c0/8.6.3
>>
>> If you are really good with COBOL, you can probably figure out how to call
>> these using COBOL.
>
> Actually, COBOL has the builtin function DISPLAY-OF that
> converts UTF-16 to ASCII, EBCDIC, or UTF-8
 
UTF-16?  I found an interesting article:


http://programmers.stackexchange.com/questions/102205/should-utf-16-be-considered-harmful

> There is a similar builtin in PL/I. For HLASM you can
> use the various translate instructions (but you gotta'
> build your own translate table).
 
I suspect John M. might recommend calling LE services out of HLASM.

Things I'd worry about:

o Which UNICODE representation: UTF-16, UTF-8, UCS-2, ...?

o Which EBCDIC code page?  SBCS?  DBCS?

o Is the error handling useful if the SMF UNICODE character is
  absent from the EBCDIC code page?

o Even, if the SMF data are UTF-8 (or UTF-16) and contain an 
  invalid code, is the error handling useful?

-- gil



Re: UNICODE to EBCDIC

2013-09-23 Thread Jon Perryman
For official character conversion in assembler, see the IBM manual z/OS 
Unicode Services User's Guide and Reference  which documents use of their 
unicode services.

Jon Perryman.
 

> From: Steve Comstock st...@trainersfriend.com
>
> Actually, COBOL has the builtin function DISPLAY-OF that
> converts UTF-16 to ASCII, EBCDIC, or UTF-8
>
> There is a similar builtin in PL/I. For HLASM you can
> use the various translate instructions (but you gotta'
> build your own translate table).




Re: UNICODE to EBCDIC

2013-09-23 Thread John McKown
If you mean a program, then the UNIX iconv command can do that. There is
also the iconv set of C language subroutines if you want to write your
own.
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/edclb1c0/3.440
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/cbcpg1c0/8.6.3

If you are really good with COBOL, you can probably figure out how to call
these using COBOL. Likewise with PL/I or HLASM. If _I_ needed these in
COBOL, I would likely write an HLASM stub routine to marshal the
arguments to/from the COBOL / C calling conventions.

 DFSORT can do it, with some difficulty, on a field basis by using the
TRAN=ALTSEQ phrase in an INREC or OUTREC FIELDS= command. Too bad there
isn't an easy way that I can see to just use UNICODE System Services.



Re: UNICODE to EBCDIC

2013-09-23 Thread Charles Mills
z/OS Unicode Services is an AWESOME facility but there is a little bit of a 
learning curve (or coding curve if there is such a thing). It will certainly 
handle whatever you need, assuming assembler is a viable option for you.

Unicode is not a character set (or format) -- it's a whole family of 
character sets. http://en.wikipedia.org/wiki/Unicode. If it's UTF-8 then you 
can do a 98% job if you just treat it as ASCII. If it's UTF-16 or UCS-2 you can 
do a 98% job if you just discard bytes 0, 2, 4, ... and treat bytes 1, 2, 5, 
... as ASCII.

There is actually a Unicode EBCDIC (UTF-EBCDIC) but it's pretty obscure.

Charles
-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Donald Likens
Sent: Monday, September 23, 2013 8:55 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: UNICODE to EBCDIC

WebSphere Application Server supplies some of its information in its SMF 
records in Unicode format. Is there a facility available to convert Unicode to 
EBCDIC?



Re: UNICODE to EBCDIC

2013-09-23 Thread Paul Gilmartin
On Mon, 23 Sep 2013 16:56:46 -0700, Charles Mills wrote:

> Unicode is not a character set (or format) -- it's a whole family of
> character sets. http://en.wikipedia.org/wiki/Unicode. If it's UTF-8 then you
> can do a 98% job if you just treat it as ASCII. If it's UTF-16 or UCS-2 you
> can do a 98% job if you just discard bytes 0, 2, 4, ... and treat bytes 1, 2,
> 5, ... as ASCII.

A little misleading, as I see it.  There's only one set of code points, but,
yes, multiple encoding methods (op. cit.).  This is similar to saying that there
are two (or more) USASCII character sets because they're represented big-endian
in storage but little-endian in network transmission.

> There is actually a Unicode EBCDIC (UTF-EBCDIC) but it's pretty obscure.

Not as obscure as it deserves to be.

-- gil



Re: UNICODE to EBCDIC

2013-09-23 Thread Charles Mills
Yer right. It's a single character set (all the characters in the world! -- 
well, not quite: Jurchen, Nü Shu, Tangut, and Linear A are working their way 
through the approval process; Klingon is ineligible because of lack of real 
world use) and a variety of ways of encoding them. Okay?

It's not a format, right?

Also, a fairly obvious typo in what I wrote: treat bytes 1, 3, 5, ... as 
ASCII.
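
With the corrected byte positions, the shortcut can be sketched in Python as follows. It assumes big-endian UTF-16 with no byte order mark, both function names are hypothetical, and the sample strings are invented.

```python
def utf16be_to_ascii_naive(raw: bytes) -> bytes:
    """Keep bytes 1, 3, 5, ... of big-endian UTF-16 and drop the rest."""
    return raw[1::2]

def is_safe_for_naive(raw: bytes) -> bool:
    """True only if every UTF-16BE code unit is plain 7-bit ASCII."""
    high, low = raw[0::2], raw[1::2]
    return len(raw) % 2 == 0 and not any(high) and all(b < 0x80 for b in low)

plain = "SMF120".encode("utf-16-be")
accented = "caf\u00e9 \u20ac5".encode("utf-16-be")

print(utf16be_to_ascii_naive(plain))   # the ASCII text survives
print(is_safe_for_naive(accented))     # the shortcut would corrupt this input
```

Checking the discarded bytes first, as `is_safe_for_naive` does, is a cheap way to find out whether a given field falls into the easy case before trusting the shortcut. Little-endian data would need the opposite byte positions.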

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Paul Gilmartin
Sent: Monday, September 23, 2013 5:18 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: UNICODE to EBCDIC

On Mon, 23 Sep 2013 16:56:46 -0700, Charles Mills wrote:

Unicode is not a character set (or format) -- it's a whole family of 
character sets. http://en.wikipedia.org/wiki/Unicode. If it's UTF-8 then you 
can do a 98% job if you just treat it as ASCII. If it's UTF-16 or UCS-2 you 
can do a 98% job if you just discard bytes 0, 2, 4, ... and treat bytes 1, 2, 
5, ... as ASCII.

A little misleading, as I see it.  There's only one set of code points, but, 
yes, multiple encoding methods (op. cit.).  This is similar to saying that 
there are two (or more) USASCII character sets because they're represented 
big-endian in storage but little-endian in network transmission.

There is actually a Unicode EBCDIC (UTF-EBCDIC) but it's pretty obscure.

Not as obscure as it deserves to be.



Re: UNICODE to EBCDIC

2013-09-23 Thread Tony Harminc
On 23 September 2013 20:18, Paul Gilmartin paulgboul...@aim.com wrote:
> On Mon, 23 Sep 2013 16:56:46 -0700, Charles Mills wrote:
>
>> There is actually a Unicode EBCDIC (UTF-EBCDIC) but it's pretty obscure.
>
> Not as obscure as it deserves to be.

Never miss a chance on this one, do you Gil... As you know, I think
UTF-EBCDIC was a great idea, and can't understand how it failed to
catch on. Maybe there's just not much call for invoking legacy
interfaces on the IBM zArch OSs with new data. Or perhaps UNIX
programmers are lazier. Or something...

Tony H.



Re: UNICODE to EBCDIC

2013-09-23 Thread John Gilmore
Paul Gilmartin's opinions about EBCDIC are not always or necessarily
wrong in detail, but they are wholly predictable.

John Gilmore, Ashland, MA 01721 - USA



Re: UNICODE to EBCDIC

2013-09-23 Thread John Gilmore
On 9/23/13, John Gilmore jwgli...@gmail.com wrote:
> Paul Gilmartin's opinions about EBCDIC are not always or necessarily
> wrong in detail, but they are wholly predictable.
>
> John Gilmore, Ashland, MA 01721 - USA



-- 
John Gilmore, Ashland, MA 01721 - USA



Re: UNICODE to EBCDIC

2013-09-23 Thread Paul Gilmartin
On Mon, 23 Sep 2013 21:23:04 -0400, Tony Harminc wrote:

>> Not as obscure as it deserves to be.
>
> Never miss a chance on this one, do you Gil... As you know, I think
> UTF-EBCDIC was a great idea, and can't understand how it failed to
> catch on. Maybe there's just not much call for invoking legacy
> interfaces on the IBM zArch OSs with new data. Or perhaps UNIX
> programmers are lazier. Or something...
 
OK.  I was shooting from the hip.  Hmmm:

http://en.wikipedia.org/wiki/UTF-EBCDIC

o ... Its advantages for existing EBCDIC-based systems are similar to
  UTF-8's advantages for existing ASCII-based systems.

IOW, all control characters and metacharacters for most programming
languages are encoded as themselves.  This makes it friendly to
those programming languages, and to transmission over (EBCDIC)
networks.

o ... the UTF-8-Mod encoding of codepoints above U+009F is generally
  larger than the UTF-8 encoding.

Somewhat a disadvantage.

o (not mentioned) A simple byte-by-byte lexical sort doesn't
  sort into UNICODE code point order (but I'm guessing).  This
  may not be very important.

o ..., so each byte is fed through a reversible (one-to-one) lookup
  table to produce the final UTF-EBCDIC encoding.

It doesn't say (but perhaps TR # 16 does) which EBCDIC code page
(or for that matter, which flavor of ASCII -- ISO8859-??) is used.
This could be chaotic.  And the dreadful LF-NEL pitfall lurks.
ASCII suffers similar problems; else why would we have the dreadful
trigraphs in ANSI C?
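
The LF-NEL pitfall mentioned above can be seen in a few lines of Python. This relies on the mapping in Python's built-in cp037 codec, used here as an assumed example page; other EBCDIC pages and other conversion tools may map these two controls differently.

```python
# EBCDIC has both NL (X'15') and LF (X'25'). Python's cp037 table maps
# them to two different Unicode characters.
ebcdic_nl = b"\x15".decode("cp037")   # U+0085 NEXT LINE (NEL)
ebcdic_lf = b"\x25".decode("cp037")   # U+000A LINE FEED

record = b"\xc1\x15\xc2"              # "A", NL, "B" in cp037
text = record.decode("cp037")

# Code that splits only on LF does not see a line break here...
split_on_lf = text.split("\n")
# ...while str.splitlines(), which also recognises NEL, does.
split_lines = text.splitlines()

print(split_on_lf, split_lines)
```

Whether a converted file ends up with one line or many therefore depends on which newline convention the consuming program honours.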

What should UTF-EBCDIC be used for?  Web servers?  That would still
require a tedious UTF-EBCDIC to UTF-8 (the modal practice) conversion.
Better simply to work in ASCII/UTF-8.  That's what the z/OS C
compiler Enhanced ASCII support is for (but is that much used?)

And I see:

user@HOST: iconv -l | grep UTF

1208    UTF-8

... no UTF-EBCDIC.  Similarly with UTF-EBCDIC.  Cause and effect?
Does LE not support UTF-EBCDIC because of lack of demand, or
does UTF-EBCDIC languish because of lack of LE support?

-- gil



Re: UNICODE to EBCDIC

2013-09-23 Thread David Crayford

On 24/09/2013 10:43 AM, Paul Gilmartin wrote:

> It doesn't say (but perhaps TR # 16 does) which EBCDIC code page
> (or for that matter, which flavor of ASCII -- ISO8859-??) is used.
> This could be chaotic.  And the dreadful LF-NEL pitfall lurks.
> ASCII suffers similar problems; else why would we have the dreadful
> trigraphs in ANSI C?


The C++ committee wanted to deprecate trigraphs in the last standard
(http://tinyurl.com/n3nas3u). EBCDIC was the only tangible reason
for keeping them alive. I've lost count of the number of times I've had
a C/C++ analysis tool choke because of trigraphs - doxygen, valgrind etc.




Re: UNICODE to EBCDIC

2013-09-23 Thread Paul Gilmartin
On Tue, 24 Sep 2013 11:18:30 +0800, David Crayford  wrote:

> The C++ committee wanted to deprecate trigraphs in the last standard
> http://tinyurl.com/n3nas3u. EBCDIC was the only tangible reason
> for keeping them alive. I've lost count of the amount of times I've had
> a C/C++ analysis tool choke because of trigraphs - doxygen, valgrind etc.
 
I could make a remark on this.  But it would be predictable.

-- gil
