On Sat, 10 Jun 2017 16:48:06 -0500, John McKown wrote:
>>
>Hum, 0x4c in UTF-8 is an "L". In EBCDIC CP-037 (et al.) it is a "<". If
>you look at the first line:
>
><?xml version="1.0" encoding="UTF-8" ?>
>
>the phrase encoding="UTF-8" says that the rest of the data is in UTF-8.
>But it's actually in EBCDIC. So the XML parser "sees" the "<" (in EBCDIC,
>this is 0x4c, as in the error) as the UTF-8 value "L", which is not what it
>wants at this point.
>
>I'm not totally sure, but I think you need the first line to look like:
>
><?xml version="1.0" encoding="IBM037" ?>
>
>or maybe even just this, leaving off the encoding entirely:
>
><?xml version="1.0" ?>
>
>a good source of information on XML on z/OS:
>http://www.redbooks.ibm.com/redbooks/pdfs/sg247810.pdf section 1.4 on
>"Encoding".
>
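The byte-level mismatch described above can be seen in a minimal Python sketch. This is an illustration only, not something from the original exchange: it assumes Python's built-in "cp037" codec matches the CP-037 table in question, and the variable names are made up.

```python
# Illustration only: the same byte read under two different encodings.
b = bytes([0x4C])
as_ebcdic = b.decode("cp037")   # interpreted as EBCDIC CP-037
as_utf8 = b.decode("utf-8")     # interpreted as UTF-8 (same as ASCII here)
print(repr(as_ebcdic), repr(as_utf8))

# An XML declaration that claims UTF-8 but is stored as EBCDIC bytes.
decl = '<?xml version="1.0" encoding="UTF-8" ?>'
ebcdic_bytes = decl.encode("cp037")
first_byte = ebcdic_bytes[0]

# What a parser sees if it trusts the declared (ASCII-compatible) encoding
# for the very first byte of the document.
misread_first = bytes([first_byte]).decode("ascii")
print(hex(first_byte), repr(misread_first))

# Decoding with the encoding actually used recovers the declaration.
recovered = ebcdic_bytes.decode("cp037")
```

So the parser is handed an "L" where it expects the "<" that opens the declaration, and it gives up before it ever gets to the encoding attribute.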
So wattaya gonna do!?
I hate EBCDIC!
A related example: When I send email to a CMS ID, it often arrives with:
Content-Type: text/plain; charset=UTF-8
... but the body has clearly been translated, byte-by-byte, from ASCII to
EBCDIC.
If they convert the body, shouldn't they adjust the MIME headers accordingly?
To UTF-EBCDIC, whatever that is, if that is the conversion performed?
... and I believe (some) standards require that the headers themselves be
US-ASCII, and those have also been translated to EBCDIC.
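A rough Python sketch of that mismatch, under stated assumptions: the sample text is invented, "cp037" stands in for whatever EBCDIC code page the gateway really uses, and decodes_cleanly is a hypothetical helper, not part of any mail software.

```python
def decodes_cleanly(declared_charset: str, body: bytes) -> bool:
    """Return True if body is valid under the charset named in the header."""
    try:
        body.decode(declared_charset)
        return True
    except UnicodeDecodeError:
        return False

declared = "utf-8"                # what the Content-Type header still claims
text = "Hello, CMS"               # invented sample body
wire_body = text.encode("utf-8")  # the body as originally sent

# Byte-by-byte ASCII-to-EBCDIC translation of the body (assumed: CP-037).
translated = wire_body.decode("ascii").encode("cp037")

header_still_true = decodes_cleanly(declared, translated)
actual_charset_ok = decodes_cleanly("cp037", translated)
round_trip = translated.decode("cp037")
print(header_still_true, actual_charset_ok, round_trip)
```

Once the body has been translated, the charset in the header no longer describes the bytes, so a reader that trusts the header cannot recover the text.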
-- gil