Hi Michael,
in this case the problem is not caused by a character that the target encoding cannot represent (Xerces-C would handle that). It occurs because a node contains a character that XML itself does not allow.

Alberto

On 03/01/2012 17:47, Michael Glavassevich wrote:
Does Xerces-C's implementation of LSSerializer [1] support the
"well-formed" parameter? It's a required feature.

Turning that on in Xerces-J would cause an error to be reported for the
invalid character.

Thanks.

[1] http://www.w3.org/TR/DOM-Level-3-LS/load-save.html#LS-LSSerializer

Michael Glavassevich
XML Technologies and WAS Development
IBM Toronto Lab
E-mail: mrgla...@ca.ibm.com
E-mail: mrgla...@apache.org

Alberto Massari <alberto.mass...@progress.com> wrote on 01/03/2012 11:34:21 AM:

Hi Nedim,
it's a known limitation of the current codebase; see
https://issues.apache.org/jira/browse/XERCESC-1854
You can check whether a character is valid according to XML 1.0 by using
XMLChar1_0::isXMLChar. For XML 1.1, use XMLChar1_1::isXMLChar.
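To make the rule concrete, here is a minimal standalone sketch of the check those functions are meant to perform. It does not call Xerces-C: the helper names isXml10Char, isXml11Char and isValidXmlText are hypothetical, and the ranges are the Char productions from the XML 1.0 and XML 1.1 recommendations, applied to already-decoded Unicode code points.

```cpp
#include <string>

// Char production of XML 1.0:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
bool isXml10Char(char32_t c) {
    return c == 0x9 || c == 0xA || c == 0xD ||
           (c >= 0x20 && c <= 0xD7FF) ||
           (c >= 0xE000 && c <= 0xFFFD) ||
           (c >= 0x10000 && c <= 0x10FFFF);
}

// Char production of XML 1.1:
// [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
bool isXml11Char(char32_t c) {
    return (c >= 0x1 && c <= 0xD7FF) ||
           (c >= 0xE000 && c <= 0xFFFD) ||
           (c >= 0x10000 && c <= 0x10FFFF);
}

// Hypothetical helper: returns true only if every code point in the
// text is allowed by the Char production of the chosen XML version.
bool isValidXmlText(const std::u32string& text, bool xml11) {
    for (char32_t c : text) {
        const bool ok = xml11 ? isXml11Char(c) : isXml10Char(c);
        if (!ok) {
            return false;
        }
    }
    return true;
}
```

With this sketch, a string containing U+0001 fails the XML 1.0 check and passes the XML 1.1 one. Note that XML 1.1 still requires such control characters to be written as character references in the serialized document, so passing the 1.1 check only means the character may appear in the content at all.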

Alberto

On 03/01/2012 15:01, Nedim Srndic wrote:
Hello,

&#x1; is an invalid character reference in XML 1.0. If I write the byte
value "\x01" to a Xerces-C TextNode and serialize the entire DOMDocument
using UTF-8 and StdOutFormatTarget with XML version set to "1.0", then
Xerces-C writes the resulting XML document (without substituting the
character with the corresponding character reference) and doesn't report
any errors. Of course, the resulting XML is not well-formed and so I
cannot use it in other programs.

In XML 1.1 this character reference is allowed and Xerces-C correctly
performs the character substitution, but the software I am using these
documents with sadly still does not support XML 1.1.

Is there a Xerces-C function that I can call that will check if a string
that I want to put in an XML document satisfies the rules of
well-formedness for the given XML version? Is there something else I can
do about this problem? Why doesn't Xerces-C report the error?

Thank you,
Nedim Srndic
