Hi Todd,

Todd Ditchendorf <[EMAIL PROTECTED]> writes:

> Specifically, I would like to use the SAX2XMLReader class to parse an
> XML document that may or may not have a DOCTYPE specifying a DTD. I
> would like to *ignore* any internally referenced DTDs for validation
> purposes (if it is still parsed for other reasons, that is fine), and
> validate the document against a different DTD that I will specify.

There was a similar question on comp.text.xml. Here is a post from
that thread that you may find interesting:


[EMAIL PROTECTED] writes:

> Hello,
>
> I want to load a grammar which does not come with a DOCTYPE
> declaration,
>
> I tried with those lines, and it doesn't work
>
>    parser->setFeature(XMLUni::fgSAX2CoreValidation, true);
>    parser->setFeature(XMLUni::fgSAX2CoreNameSpaces, false);

What happens if you also add these?

parser->setFeature(XMLUni::fgXercesDynamic, false);
parser->setFeature(XMLUni::fgXercesSchema, true);
parser->setFeature(XMLUni::fgXercesSchemaFullChecking, true);


>
>    // Load grammar and cache it
>    parser->loadGrammar("/path/to/my/file.dtd", Grammar::DTDGrammarType, true);
>    // enable grammar reuse
>    parser->setFeature(XMLUni::fgXercesUseCachedGrammarInParse, true);
>
> Actually, if the <!DOCTYPE validation SYSTEM "file.dtd"> declaration
> is present, it does the validation with the file in the current path,
> not the one I loaded in cache.
> If the <!DOCTYPE ...> is not there, it does not validate.
>
> For example, it would be nice if it were possible to override a DOCTYPE
> declaration.

My understanding is that loadGrammar works as a proxy to the entity
resolver. In other words, when the parser sees

<!DOCTYPE validation SYSTEM "file.dtd">

it normally calls the entity resolver to locate file.dtd, but before that
it consults the grammar cache to see if this grammar was already loaded.
So I think the reason it does not work is that, without the above
DOCTYPE, there is nothing in the document that specifies against
which schema we should validate (note that nothing prevents you from
caching several unrelated schemas). In the case of XML Schema, there is
the fgXercesSchemaExternalSchemaLocation property that allows you
to specify schemas for namespaces and thus trigger validation if
a document refers to one of those namespaces. In the case of DTD, I don't
see any similar mechanism. You may also find the following article
interesting:

http://www-128.ibm.com/developerworks/webservices/library/x-xsdxerc.html
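
To make the lookup order described above concrete, here is a small toy
model in plain C++. It is only a sketch of my understanding, not Xerces-C++
code: ToyParser, Grammar, Origin and Lookup are hypothetical names, and the
idea that the cache is keyed by the exact system identifier is an assumption
of the sketch.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: none of these types are Xerces-C++ classes.
struct Grammar {
    std::string source;  // where the grammar text came from
};

// Which mechanism supplied the grammar for a document.
enum class Origin { None, Cache, Resolver };

struct Lookup {
    Origin origin;
    Grammar grammar;
};

class ToyParser {
public:
    // Stands in for loadGrammar(..., true): cache a grammar under its
    // system identifier (assumed here to be the cache key).
    void cacheGrammar(const std::string& systemId, const std::string& source) {
        assert(!systemId.empty());
        cache_[systemId] = Grammar{source};
    }

    // doctypeSystemId is the SYSTEM identifier from the DOCTYPE,
    // or an empty string when the document has no DOCTYPE at all.
    Lookup grammarFor(const std::string& doctypeSystemId) const {
        // No DOCTYPE: nothing names a grammar, so the cache is never asked.
        if (doctypeSystemId.empty()) {
            return Lookup{Origin::None, Grammar{}};
        }
        // The cache is consulted first ...
        std::map<std::string, Grammar>::const_iterator it =
            cache_.find(doctypeSystemId);
        if (it != cache_.end()) {
            return Lookup{Origin::Cache, it->second};
        }
        // ... and only on a miss does the entity resolver get involved.
        return Lookup{Origin::Resolver, Grammar{"resolved:" + doctypeSystemId}};
    }

private:
    std::map<std::string, Grammar> cache_;
};
```

In this model, caching a grammar under "/path/to/my/file.dtd" and then
asking for "file.dtd" misses the cache and falls through to the resolver,
while asking with no DOCTYPE yields no grammar at all, however many
grammars are cached.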

hth,
-boris


-- 
Boris Kolpackov
Code Synthesis Tools CC
http://www.codesynthesis.com
Open-Source, Cross-Platform C++ XML Data Binding
