Hi Neetha,
the correct thing to do would be to not make these calls

    (*pParser)->setFeature( XMLUni::fgXercesSchema, true );
    (*pParser)->setFeature( XMLUni::fgXercesSchemaFullChecking, true );
    (*pParser)->setFeature( XMLUni::fgDOMValidation, true);
    (*pParser)->setFeature( XMLUni::fgXercesCacheGrammarFromParse, true);

when bValidate == false, because they ask the parser to validate against a schema that you are not going to provide. Skipping them will remove the "Unknown element" validation errors.

As for what you call an "auto-modification", it's the correct behaviour: <name="abc"> is not well-formed XML (either the tag name is missing and "name" is an attribute, or "name" is the element and it's missing a space followed by the attribute name). If you force the parser to continue, the DOM tree you get back will be incomplete, at best.

If you really want to get a DOM tree out of that invalid XML, you could attach a W3C DOMErrorHandler (different from the one you provided) using

    (*pParser)->setParameter(XMLUni::fgDOMErrorHandler, domErrorHandlerVar)

This class has a handleError method where you can check what happened by examining the DOMError argument and the DOMLocator inside it (it contains the DOM node where the error was located). If you return "true", the parser will try to continue parsing; if you return "false", parsing will be aborted.
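
The two points above (enable the schema features only when a schema will be loaded, and let the error handler's return value decide whether parsing goes on) can be sketched without Xerces at all. This is only an illustration of the control flow, not the Xerces API: StubParser, StubError, CollectingHandler, configureParser, deliverErrors and the feature labels are all made-up stand-ins, and the policy of stopping on fatal errors is just one possible choice.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Stand-in for the parser (NOT a Xerces class): it only records
// which features have been switched on.
struct StubParser {
    std::set<std::string> enabled;
    void setFeature(const std::string& name, bool on) {
        if (on) enabled.insert(name); else enabled.erase(name);
    }
};

// Hypothetical helper: namespaces are always on, while the schema-related
// features are enabled only when a schema is actually going to be loaded.
// The string labels are placeholders for the XMLUni constants.
void configureParser(StubParser& parser, bool bValidate) {
    parser.setFeature("namespaces", true);
    if (bValidate) {
        parser.setFeature("schema", true);
        parser.setFeature("schema-full-checking", true);
        parser.setFeature("validation", true);
        parser.setFeature("cache-grammar-from-parse", true);
    }
}

// Stand-in for an error report (NOT the real DOMError).
struct StubError {
    bool fatal;
    std::string message;
};

// Handler following the contract described above:
// return true to keep parsing, false to abort.
struct CollectingHandler {
    std::vector<std::string> messages;
    bool handleError(const StubError& err) {
        messages.push_back(err.message);
        return !err.fatal;  // assumed policy: stop at the first fatal error
    }
};

// Simulates a parse run: errors are delivered in order until the handler
// asks to abort. Returns how many errors were delivered.
std::size_t deliverErrors(const std::vector<StubError>& errors,
                          CollectingHandler& handler) {
    std::size_t delivered = 0;
    for (const auto& err : errors) {
        ++delivered;
        if (!handler.handleError(err)) break;
    }
    return delivered;
}
```

With bValidate == false only the namespaces feature ends up enabled, so nothing asks for validation against a missing schema. A handler that returns true for fatal errors as well would let deliverErrors run to the end, which corresponds to asking the parser to continue.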

Alberto


On 16/07/2012 12:06, neetha patil wrote:
Dear Alberto,
Thank you for the quick reply.
As I do not load the grammar (schema) into the parser, it reports errors such as "Unknown element.." for all the XML tags until it hits the invalid tag, for which it reports 'Expected an attribute name' and aborts parsing, as you mentioned. So I set the feature 'XMLUni::fgXercesContinueAfterFatalError' to true and got the complete file parsed. However, the line containing the invalid tag was modified as follows:-
...
...
<Services>
     ...
     ...
</Services>
...
<name>
...
...
</name>
...
...
The documentation at http://xml.apache.org/xerces-c-new/program-dom.html says that setting this feature to true might result in *undetermined* behavior of the parser. Is there any other way for the parser to report the error and continue parsing? Also, can we prevent the auto-modification (in this case, the modification from <name="abc"> to <name>)?
Thanks
Regards,
Neetha

On Mon, Jul 16, 2012 at 2:39 PM, Alberto Massari <[email protected]> wrote:

    Hi,
    Xerces doesn't modify your document; you should check the error
    handler to see if the parsing was aborted because of an error. In
    this case the returned DOM tree would be complete only up to the
    position of the error.

    Alberto

    On 16/07/2012 10:25, neetha patil wrote:
    Dear All,

    I am using Xercesc_2_8 C++. I provide an XML file (containing an
    invalid tag) to the DOMBuilder parser. I then edit the DOM document
    which is generated and save the document back to the XML file. The
    content of this file is now truncated from the invalid tag onwards.
    Why does the parser modify the file while parsing? How do I prevent
    this? I.e., I want the parser to report the error and continue
    parsing but not modify the XML content.
    Following is the snapshot of the XML file:-
    ...
    ...
    <Header id="My Project Id" nameStructure="DevName" revision="0"
    version="1">
         ...
    </Header>
         ...
         ...
    <Services>
         ...
         ...
    </Services>
    <!-- Invalid tag: No node name -->
    <name="abc">
    ...
    ...
     
    Following is the code snippet of the parser:-
    void CHelper::InitDOM()
    {
        // m_pDomImpl is a pointer to DOMImplementation
        m_pDomImpl = 0;
        if(m_pDomImpl == NULL)
        {
            XMLPlatformUtils::Initialize();
            m_pDomImpl = DOMImplementationRegistry::getDOMImplementation( gLS );
        }
    }

    int CHelper::LoadFile(DOMBuilder** pParser, const CString& strXMLFile,
                          DOMDocument** pDoc, CStringArray& arrError,
                          bool bValidate, const CString& strSchemaFile)
    {
        ...
        if(*pParser == NULL)
        {
            *pParser = ((DOMImplementationLS*)m_pDomImpl)->createDOMBuilder(
                           DOMImplementationLS::MODE_SYNCHRONOUS, 0 );
            if((*pParser) == NULL)
            {
                return DOM_INITIALIZE_FAILED;
            }

            (*pParser)->setFeature( XMLUni::fgDOMNamespaces, true );
            (*pParser)->setFeature( XMLUni::fgXercesSchema, true );
            (*pParser)->setFeature( XMLUni::fgXercesSchemaFullChecking, true );
            (*pParser)->setFeature( XMLUni::fgDOMValidation, true);
            (*pParser)->setFeature( XMLUni::fgXercesCacheGrammarFromParse, true);
        }

        try
        {
            CMyDOMErrHandler eh();
            m_arrValidationErrs.RemoveAll();

            // parseURI a blocking call. All the errors will be reported first
            // if any error handler is set, then only the next line will be executed.
            if(bValidate == true)
            {
                (*pParser)->setErrorHandler(&eh);
                (*pParser)->loadGrammar( strSchemaFile, Grammar::SchemaGrammarType, true);
            }
            else
            {
                (*pParser)->setErrorHandler(NULL);
            }
            *pDoc = (*pParser)->parseURI(strXMLFile);
            ...
            ...
        }
        catch(...)
        {
            ...
        }

        return SUCCESS;
    }

    Thank you in advance.

    Regards,
    Neetha





