Invalid IGXMLScanner::fDTDGrammar, causing segfault
---------------------------------------------------
Key: XERCESC-1961
URL: https://issues.apache.org/jira/browse/XERCESC-1961
Project: Xerces-C++
Issue Type: Bug
Components: SAX/SAX2, Utilities, Validating Parser (DTD)
Affects Versions: 2.6.0, 2.8.0
Environment: Linux, OpenVXI (http://sourceforge.net/projects/openvxi/)
Reporter: Peter Burns
The problem occurs while OpenVXI is initialising, when it parses a couple of
(hard-coded) DTDs and then a (hard-coded) XSD. During the DTD parsing
(SAX2XMLReader::parse()), a DTDGrammar is created and stored in two places:
GrammarResolver::fGrammarBucket and IGXMLScanner::fDTDGrammar. At the start of
the XSD parsing (SAX2XMLReader::loadGrammar()), the GrammarBucket is cleared,
deleting the DTDGrammar but leaving IGXMLScanner::fDTDGrammar still pointing to
it. During that parse, IGXMLScanner::getEntityDeclPool() is called, which in
turn calls fDTDGrammar->getEntityDeclPool() on the deleted object. This
sometimes causes a segfault (though usually only after our app, which performs
these operations over and over, has been running for a few hours).
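
The ownership pattern described above can be modelled in a few lines. This is
only a sketch with hypothetical stand-in types (Grammar, Bucket, Scanner,
Observation and reproduceHazard() are invented for illustration and are not
the Xerces-C++ classes). It uses a std::weak_ptr purely as a witness that the
owned object has been destroyed while the raw pointer was never reset.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; not the Xerces-C++ classes.
struct Grammar {
    int entityDeclPool = 0;
};

// Owning container, playing the role of the grammar bucket.
class Bucket {
public:
    Grammar* add() {
        owned_.push_back(std::make_shared<Grammar>());
        return owned_.back().get();
    }
    // Non-owning observer of the most recently added grammar.
    // Only valid to call after add().
    std::weak_ptr<Grammar> watchLast() const { return owned_.back(); }
    void clear() { owned_.clear(); }  // destroys every Grammar it owns
private:
    std::vector<std::shared_ptr<Grammar>> owned_;
};

// Keeps a second, non-owning copy of the pointer, like the scanner member.
struct Scanner {
    Bucket bucket;
    Grammar* cached = nullptr;

    void parseDtd() { cached = bucket.add(); }
    void loadGrammar() { bucket.clear(); }  // bug: 'cached' is not reset
};

struct Observation {
    bool cachedSetByParse;  // scanner stored a pointer during the DTD parse
    bool grammarDestroyed;  // bucket destroyed that object when it was cleared
};

Observation reproduceHazard() {
    Scanner s;
    s.parseDtd();
    std::weak_ptr<Grammar> witness = s.bucket.watchLast();
    const bool wasSet = (s.cached != nullptr);
    s.loadGrammar();
    // s.cached is deliberately not dereferenced: that would be undefined
    // behaviour, which is why the real crash is intermittent.
    return {wasSet, witness.expired()};
}
```

Calling reproduceHazard() reports both conditions as true: the pointer was
cached, and the object behind it no longer exists.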
I have some code which reproduces the problem - I'll attach it to this case as
soon as I can work out how. Since the code rarely segfaults, I've been
demonstrating it by adding printf()s to the DTDGrammar constructor/destructor
and IGXMLScanner::getEntityDeclPool(). So my test code currently generates this:
[peter@ultra1 xerces_bug]$ ./test
DTDGrammar::DTDGrammar() this = 0x95691508
Warning in file vxml 1.0 defaults at line 2 column 51
Reason: Element 'metadata' was referenced in a content model but never declared
DTDGrammar::~DTDGrammar() this = 0x95691508
DTDGrammar::DTDGrammar() this = 0x956fa908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::DTDGrammar() this = 0x95ac4908
IGXMLScanner::getEntityDeclPool() fDTDGrammar = 0x95691508
DTDGrammar::~DTDGrammar() this = 0x95ac4908
DTDGrammar::~DTDGrammar() this = 0x956fa908
[peter@ultra1 xerces_bug]$
showing the DTDGrammar this=0x95691508 being created, deleted and then used by
IGXMLScanner.
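
For anyone wanting to repeat this kind of tracing outside Xerces, the
constructor/destructor instrumentation might look roughly like the sketch
below. TracedGrammar, traceLog() and traceOneLifetime() are hypothetical names
made up for this example; the real test simply added printf() calls to the
existing DTDGrammar code.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Collected trace lines; a real run could simply print them instead.
inline std::vector<std::string>& traceLog() {
    static std::vector<std::string> lines;
    return lines;
}

// Hypothetical class standing in for the instrumented grammar class.
class TracedGrammar {
public:
    TracedGrammar()  { trace("TracedGrammar::TracedGrammar()"); }
    ~TracedGrammar() { trace("TracedGrammar::~TracedGrammar()"); }
    TracedGrammar(const TracedGrammar&) = delete;
    TracedGrammar& operator=(const TracedGrammar&) = delete;
private:
    // Prints and records one line containing the object's address.
    void trace(const char* what) {
        char buf[128];
        std::snprintf(buf, sizeof buf, "%s this = %p", what,
                      static_cast<void*>(this));
        std::puts(buf);
        traceLog().push_back(buf);
    }
};

// Creates and destroys one object so the paired lines can be compared.
void traceOneLifetime() {
    TracedGrammar* g = new TracedGrammar();
    delete g;
}
```

Matching the address on a later "use" line against an earlier destructor line
is what exposes the stale pointer in the output above.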
Our fix is to set fDTDGrammar to 0 after the bucket-clearing operation
    fGrammarResolver->useCachedGrammarInParse(toCache);
at the start of IGXMLScanner::loadGrammar(), and this solves our problem.
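
The shape of that fix, reduced to a toy model, is sketched below. The types
are hypothetical stand-ins rather than the Xerces-C++ classes, and the null
check in entityDeclPool() is an assumption added for this example: once the
pointer can legitimately be null, any accessor that dereferences it needs to
handle that case somehow.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; not the Xerces-C++ classes.
struct Grammar {
    int entityDeclPool = 0;
};

// Owning container, playing the role of the grammar bucket.
class Bucket {
public:
    Grammar* add() {
        owned_.push_back(std::unique_ptr<Grammar>(new Grammar()));
        return owned_.back().get();
    }
    void clear() { owned_.clear(); }  // destroys every Grammar it owns
private:
    std::vector<std::unique_ptr<Grammar>> owned_;
};

class Scanner {
public:
    void parseDtd() { cached_ = bucket_.add(); }

    void loadGrammar() {
        bucket_.clear();
        cached_ = nullptr;  // reset the non-owning copy right after the clear
    }

    bool hasCachedGrammar() const { return cached_ != nullptr; }

    // Assumed guard: returns nullptr instead of touching a missing grammar.
    const int* entityDeclPool() const {
        return cached_ ? &cached_->entityDeclPool : nullptr;
    }
private:
    Bucket bucket_;
    Grammar* cached_ = nullptr;  // non-owning; must never outlive bucket_'s objects
};
```

With the reset in place, a Scanner that has run parseDtd() and then
loadGrammar() reports no cached grammar, and entityDeclPool() returns a null
pointer rather than reading freed memory.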
We've reproduced the problem in v2.6.0 and v2.7.0, but v3.1.1 doesn't call
IGXMLScanner::getEntityDeclPool() in our test code. However, tracing it in gdb
I can see that v3.1.1 does potentially have the same problem, i.e.
IGXMLScanner::fDTDGrammar is pointing to a deleted DTDGrammar after
IGXMLScanner::loadGrammar() has cleared the cache.