Hello, I'm trying to use Xerces C++ 2.7 to read information from a DTD. The code below reads the following entry in a DTD: <!ELEMENT a (x, y, z)>

// Load the DTD
XercesDOMParser parser;
DTDGrammar* grammar = (DTDGrammar*) parser.loadGrammar(dtdPath, Grammar::DTDGrammarType);

// Get a specific DTD element (objName is "a")
DTDElementDecl* elementDecl = (DTDElementDecl*) grammar->getElemDecl(0, nil, objName, 0);

// Get information about the children of this element
ContentSpecNode* specNode = elementDecl->getContentSpec();
DFAContentModel contentModel(true, specNode);

// Question 1: Why does this return nil?
ContentLeafNameTypeVector* content = contentModel.getContentLeafNameTypeVector();

// Question 2: Why does this line return (child count + 1), in this case 4?
int leafCount = content->getLeafCount();

Question 1: The only way I can see to access the list of children, with their names and types (exactly one, zero or one, zero or more, one or more, etc.), is to use getContentLeafNameTypeVector. Unfortunately, this returns nil. Looking at DFAContentModel::buildDFA, line 471 appears to be the reason nil is returned:

if ( (fLeafListType[outIndex] & 0x0f) != ContentSpecNode::Leaf )

I'm assuming that ContentSpecNode::Leaf means "exactly once". So the net effect of this line is that if every child is a leaf node, the ContentLeafNameTypeVector isn't populated. Why?

Question 2: If I patch the code to ignore that check, the next problem is that the count returned is (child count + 1), so for the DTD element listed above, 4 is returned. The comments in the code indicate that the last node is the end of content (EOC) node, which the implementation of DFAContentModel needs to "get rid of any repetition short cuts". Can I assume that I will always get child count + 1?

Cheers!
