/ [EMAIL PROTECTED] was heard to say:
| I am doing an XML to XML transformation.
[With my moderator hat on, I want to remind everyone that the appropriate
DocBook list for stylesheet and other application-related questions is
[EMAIL PROTECTED]]
| If I put the following into my XSL file, entities such as &mdash; create
| gibberish characters in the new XML file:
|
| <xsl:output method="xml" />
I think with closer inspection you'll find that they aren't gibberish,
they're UTF-8 representations of the Unicode characters that those
entities represent. For example, an mdash is Unicode character 8212
(U+2014). The only way to represent that in UTF-8 is with a multi-byte
sequence of octets (here E2 80 94). That sequence, when viewed in a tool
that does not understand UTF-8, appears as three upper-ASCII characters.
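
To see this for yourself, here is a small Python sketch (Python is my
choice for illustration; it is not part of the original stylesheet setup).
It encodes the mdash as UTF-8 and then decodes the same octets the way a
tool that assumes Windows-1252 might. The choice of Windows-1252 is an
assumption; your viewer may use some other single-byte encoding.

```python
# U+2014 (decimal 8212) is the character that &mdash; stands for.
mdash = "\u2014"

# What an XSLT processor writes when the output encoding is UTF-8.
utf8_bytes = mdash.encode("utf-8")
print(ord(mdash))           # the code point as a decimal number
print(utf8_bytes.hex(" "))  # the octets, in hexadecimal

# Assumption: the viewing tool reads the file as Windows-1252,
# so each octet is shown as a separate character.
misread = utf8_bytes.decode("cp1252")
print(len(misread), ascii(misread))
```

The file itself is fine; only the tool's guess about the encoding is wrong.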
| If I replace the output method with "html" the entities work fine but the
| processing instructions no longer have the correct format. For example - I
| get the following line without the ? for the closing tag:
|
| <?xml:stylesheet type="text/xsl" href="ae_toc.xsl">
Right. When you asked for HTML, you told the processor to output HTML,
which is in ISO-Latin1, so entities have to be used for special
characters, and PIs have the SGML form (closed with ">" instead of "?>").
| Any ideas? Thanks very much, Freda
I'm not sure what you want. The first form, in UTF-8, should be
understandable to any XML processor. The second form isn't XML.
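
As a rough check of the first point, the sketch below feeds a UTF-8
document containing the mdash octets to Python's standard
xml.etree.ElementTree parser. The sample document and the variable names
are made up for illustration. It also shows that serializing the same
tree in US-ASCII yields a numeric character reference instead of raw
octets, which may be closer to what you were hoping to see.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the mdash is stored as the UTF-8
# octets E2 80 94, as in the output of method="xml".
doc_bytes = b'<?xml version="1.0" encoding="UTF-8"?><p>a\xe2\x80\x94b</p>'

# An XML parser decodes the octets back into the single character U+2014.
root = ET.fromstring(doc_bytes)
print(ascii(root.text))

# Serializing in US-ASCII forces characters outside ASCII to be written
# as numeric character references.
ascii_out = ET.tostring(root, encoding="us-ascii")
print(ascii_out)
```

Either serialization is the same document as far as an XML processor is
concerned.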
Be seeing you,
norm
--
Norman Walsh <[EMAIL PROTECTED]> | Any idiot can face a crisis; it's
http://www.oasis-open.org/docbook/ | this day-to-day living that wears
Chair, DocBook Technical Committee | you out.--Anton Chekhov