Dear All, Thank you Alberto for guiding me to get rid of the "Unknown element" validation errors.
I tried setting the parameter 'XMLUni::fgDOMErrorHandler' on the DOMBuilder parser, but it has no such parameter. Also, I am using the DOM document that is returned after parsing.

When parsing against the schema, the DOMBuilder parser reports the first schema-related error and then continues parsing, reporting any further schema-related errors. Is it possible for the DOMBuilder parser to behave in the same way (and not do any auto-modification) when there are invalid XML statement(s) like the one reported in my previous mail?

Regards,
Neetha

On Mon, Jul 16, 2012 at 5:03 PM, Alberto Massari <[email protected]> wrote:

> Hi Neetha,
> the correct thing to do would be to not make these calls
>
>     (*pParser)->setFeature( XMLUni::fgXercesSchema, true );
>     (*pParser)->setFeature( XMLUni::fgXercesSchemaFullChecking, true );
>     (*pParser)->setFeature( XMLUni::fgDOMValidation, true);
>     (*pParser)->setFeature( XMLUni::fgXercesCacheGrammarFromParse, true);
>
> when bValidate == false, as you are asking to validate against a schema
> that you are not going to provide. This will remove the "Unknown element"
> validation errors. As for what you call an "auto-modification", it's
> the correct behaviour: <name="abc"> is not a valid XML statement (either
> there is a missing tag name, and "name" is an attribute, or "name" is the
> element and it's missing a space followed by the attribute name). If you
> force the parser to continue, the DOM tree you get back will be incomplete,
> at best.
> If you really want to get a DOM tree out of that invalid XML, you could
> attach a W3C DOMErrorHandler (different from the one you provided) using
>     (*pParser)->setParameter(XMLUni::fgDOMErrorHandler, domErrorHandlerVar)
> This class has a handleError method where you can check what happened by
> examining the DOMError argument, and the DOMLocation inside it (it contains
> the DOM node where the error was located).
> If you return "true", the parser
> will try continuing the parse process; if you return "false", parsing will
> be aborted.
>
> Alberto
>
>
> On 16/07/2012 12:06, neetha patil wrote:
>
> Dear Alberto,
>
> Thank you for the quick reply.
>
> As I do not load the grammar (schema) into the parser, it gives errors like
> "Unknown element.." etc., for all the XML tags until it hits the invalid
> tag, for which it gives the error 'Expected an attribute name' and aborts
> parsing as you mentioned.
>
> So I set the feature 'XMLUni::fgXercesContinueAfterFatalError' to true
> and got the complete file parsed. However the line containing the invalid
> tag was modified as follows:-
> ...
> ...
> <Services>
> ...
> ...
> </Services>
> ...
> <name>
> ...
> ...
> </name>
> ...
> ...
>
> Since http://xml.apache.org/xerces-c-new/program-dom.html says that
> setting this feature to true might result in *undetermined* behavior
> of the parser, is there any other way for the parser to report the error
> and continue parsing? Also can we prevent the auto-modification (in this
> case, the modification from <name="abc"> to <name>)?
>
> Thanks
>
> Regards,
> Neetha
>
> On Mon, Jul 16, 2012 at 2:39 PM, Alberto Massari <[email protected]> wrote:
>
>> Hi,
>> Xerces doesn't modify your document; you should check the error handler
>> to see if the parsing was aborted because of an error. In this case the
>> returned DOM tree would be complete up to the position of the error.
>>
>> Alberto
>>
>> On 16/07/2012 10:25, neetha patil wrote:
>>
>> Dear All,
>>
>> I am using Xercesc_2_8 C++. I provide an XML file (containing an invalid
>> tag) to the DOMBuilder parser. I then edit the DOM document which is
>> generated and save the document back to the XML file. The content of this
>> file is now truncated from the invalid tag onwards. Why does the parser
>> modify the file while parsing? How do I prevent the same?
>> i.e., I want the parser to report the error and continue parsing but
>> not modify the XML content.
>> Following is the snapshot of the XML file:-
>> ...
>> ...
>> <Header id="My Project Id" nameStructure="DevName" revision="0" version="1">
>> ...
>> </Header>
>> ...
>> ...
>> <Services>
>> ...
>> ...
>> </Services>
>> <!-- Invalid tag: No node name -->
>> <name="abc">
>> ...
>> ...
>> Following is the code snippet of the parser:-
>> void CHelper::InitDOM()
>> {
>>     // m_pDomImpl is a pointer to DOMImplementation
>>     m_pDomImpl = 0;
>>     if(m_pDomImpl == NULL)
>>     {
>>         XMLPlatformUtils::Initialize();
>>         m_pDomImpl = DOMImplementationRegistry::getDOMImplementation( gLS );
>>     }
>> }
>>
>> int CHelper::LoadFile(DOMBuilder** pParser, const CString& strXMLFile,
>>     DOMDocument** pDoc, CStringArray& arrError, bool bValidate,
>>     const CString& strSchemaFile)
>> {
>>     ...
>>     if(*pParser == NULL)
>>     {
>>         *pParser = ((DOMImplementationLS*)m_pDomImpl)->createDOMBuilder(
>>             DOMImplementationLS::MODE_SYNCHRONOUS, 0 );
>>         if((*pParser) == NULL)
>>         {
>>             return DOM_INITIALIZE_FAILED;
>>         }
>>
>>         (*pParser)->setFeature( XMLUni::fgDOMNamespaces, true );
>>         (*pParser)->setFeature( XMLUni::fgXercesSchema, true );
>>         (*pParser)->setFeature( XMLUni::fgXercesSchemaFullChecking, true );
>>         (*pParser)->setFeature( XMLUni::fgDOMValidation, true);
>>         (*pParser)->setFeature( XMLUni::fgXercesCacheGrammarFromParse, true);
>>     }
>>
>>     try
>>     {
>>         CMyDOMErrHandler eh;
>>         m_arrValidationErrs.RemoveAll();
>>
>>         // parseURI is a blocking call. If an error handler is set, all the
>>         // errors are reported first; only then is the next line executed.
>>         if(bValidate == true)
>>         {
>>             (*pParser)->setErrorHandler(&eh);
>>             (*pParser)->loadGrammar( strSchemaFile, Grammar::SchemaGrammarType, true);
>>         }
>>         else
>>         {
>>             (*pParser)->setErrorHandler(NULL);
>>         }
>>         *pDoc =(*pParser)->parseURI(strXMLFile);
>>         ...
>>         ...
>>     }
>>     catch(...)
>>     {
>>         ...
>>     }
>>
>>     return SUCCESS;
>>
>> }
>>
>> Thank you in advance.
>> Regards,
>> Neetha
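
For reference, the return-value contract described in the thread for handleError (return "true" to keep parsing, "false" to abort) can be sketched without Xerces. This is only an illustration under stated assumptions, not Xerces-C code: `Severity`, `ParseError`, `CollectingHandler` and `reportAll` are hypothetical stand-ins, and the policy of continuing after non-fatal errors but stopping at the first fatal one is just one possible choice. In a real program the handler would derive from the Xerces-C DOMErrorHandler class and receive a DOMError.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the parser's error types; NOT the Xerces-C API.
enum class Severity { Warning, Error, FatalError };

struct ParseError {
    Severity severity;
    std::string message;
    int line;
};

// Records every error it is given. The return value mirrors the contract
// discussed in the thread: true = keep parsing, false = abort.
class CollectingHandler {
public:
    bool handleError(const ParseError& err) {
        errors_.push_back(err);
        // Assumed policy: continue after warnings and validation errors,
        // stop at a fatal (well-formedness) error because the tree built
        // after that point would be incomplete at best.
        return err.severity != Severity::FatalError;
    }

    const std::vector<ParseError>& errors() const { return errors_; }

private:
    std::vector<ParseError> errors_;
};

// Stand-in for a parse run: hands each problem to the handler in order and
// stops as soon as the handler returns false. Returns the number reported.
inline std::size_t reportAll(const std::vector<ParseError>& problems,
                             CollectingHandler& handler) {
    std::size_t reported = 0;
    for (const ParseError& p : problems) {
        ++reported;
        if (!handler.handleError(p)) {
            break;  // handler asked to abort
        }
    }
    return reported;
}
```

With this policy, the "Unknown element" style errors are all collected, while an error such as 'Expected an attribute name' ends the run, so the caller can inspect `errors()` and decide whether the partial document is usable before saving it back.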
