On 2016-01-21 15:27, rle...@codelibre.net wrote:

My unit test here fails with "invalid document structure".  However,
the XML is well-formed UTF-8 with no BOM; it's been working fine for
years.

The difference between the exception being thrown or not is very
simple: whether I link only Xerces-C (works) or both Xerces-C *and*
Xalan-C (fails).  I'm not *using* Xalan-C, I'm merely linking my
shared library against it.

What I don't understand is why xercesc::AbstractDOMParser::parse would
be affected by linking Xalan-C into my library.  Does it have some
dynamically-registered hooks which subtly alter the behaviour of
Xerces-C?  There's no visible presence of Xalan in the stacktrace
below, but it's clearly had *some* effect.  If I remove Xalan from the
linker options, it works immediately.

Another data point: If I build Xerces-C from source on the same
platform, linking against ICU, it works with Xalan-C built from source
(without ICU).  Is there some required combination to get a functional
setup?  The other platforms also had Xerces compiled against ICU.  Or
is there any way to configure things to make it work?

An additional observation. I recompiled the xerces-c3 port with the icu transcoder in place of the iconv transcoder, and then recompiled the xalan-c port against it. When I rebuild and re-run my code, the tests then pass. So it can at least be made to work, but there's still a bit of a mystery in why this is happening.

- why does Xalan break Xerces when ICU support is missing from both, merely by being linked in?
- why does building Xerces (not Xalan) with ICU support fix this?

Could it be these:

$ nm -D -C /usr/local/lib/libxalan-c.so|grep " W xercesc_"
0000000000344a90 W xercesc_3_1::InputSource::setIssueFatalErrorIfNotFound(bool)
00000000003472b0 W xercesc_3_1::ErrorHandler::~ErrorHandler()
0000000000214e40 W xercesc_3_1::SAXException::~SAXException()
0000000000214e10 W xercesc_3_1::SAXException::~SAXException()
00000000001da9b0 W xercesc_3_1::OutOfMemoryException::~OutOfMemoryException()
0000000000344a50 W xercesc_3_1::InputSource::getEncoding() const
0000000000344a60 W xercesc_3_1::InputSource::getPublicId() const
0000000000344a70 W xercesc_3_1::InputSource::getSystemId() const
0000000000344a80 W xercesc_3_1::InputSource::getIssueFatalErrorIfNotFound() const
0000000000214e90 W xercesc_3_1::SAXException::getMessage() const

Could Xalan be providing replacement InputSource methods which are not functioning correctly, resulting in invalid input? These are the only obvious symbols I can see which could override Xerces symbols (the rest are vtables/typeinfo).
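
To make this kind of check repeatable across libraries, here is a minimal sketch that filters `nm -D -C` output for weak function symbols in a given namespace. The names NmSymbol, parse_nm and weak_functions are hypothetical helpers of mine, not part of Xerces-C or Xalan-C, and the sketch assumes the three-column "address type name" layout shown above. It only lists candidates; it does not show which definition the dynamic linker actually binds at run time.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// One parsed line of `nm -D -C` output (hypothetical helper type).
struct NmSymbol {
    std::string address;  // hex address column
    char type = '?';      // nm type letter, e.g. 'W' or 'V'
    std::string name;     // demangled name, may contain spaces
};

// Parse "ADDRESS TYPE NAME" lines. Lines without an address column
// (for example undefined 'U' entries) do not fit and are skipped.
std::vector<NmSymbol> parse_nm(std::istream& in) {
    std::vector<NmSymbol> out;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        NmSymbol sym;
        std::string type;
        if (!(fields >> sym.address >> type) || type.size() != 1) continue;
        sym.type = type[0];
        std::getline(fields >> std::ws, sym.name);  // rest of line is the name
        if (!sym.name.empty()) out.push_back(sym);
    }
    return out;
}

// Keep only type 'W' symbols whose demangled name starts with `prefix`.
// Entries such as "vtable for ..." or type 'V' are dropped by this filter.
std::vector<NmSymbol> weak_functions(const std::vector<NmSymbol>& syms,
                                     const std::string& prefix) {
    std::vector<NmSymbol> out;
    for (const NmSymbol& s : syms) {
        if (s.type == 'W' && s.name.compare(0, prefix.size(), prefix) == 0) {
            out.push_back(s);
        }
    }
    return out;
}
```

Wrapped in a small main that calls parse_nm(std::cin) and prints the result of weak_functions(..., "xercesc_"), this could be fed the nm output of both libxalan-c.so and the Xerces-C library, so the two lists can be compared for names defined in both.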


Regards,
Roger
