Dear Wiki user, You have subscribed to a wiki page or wiki category on "Jakarta-commons Wiki" for change notification.
The following page has been changed by SimonKitching:
http://wiki.apache.org/jakarta-commons/Digester/FAQ

The comment on the change is:
Added info on entity resolvers

------------------------------------------------------------------------------
  digester.addCallParam("map/entry", 1, true);
  }}}

+ === Why does Digester read the DTD even when validation is disabled? ===
+
+ A DTD can affect the meaning of a document, so an XML parser still needs to read it even when validation is disabled.
+ Note that this is a fundamental feature of XML parsing and is not specific to Digester.
+
+ For example, the DTD may define default values for XML attributes:
+ {{{
+ <!ATTLIST some-element some-attribute CDATA "some-default-value">
+ }}}
+
+ When the DTD is present, and the user specifies
+ {{{
+ <some-element/>
+ }}}
+ the XML parser will report that the element has an attribute "some-attribute" with value "some-default-value".
+ But if the DTD is ignored (not read) then the element is reported as having no attributes.
+
+ The DTD can also define entities that can be referenced from the document. Without the DTD, references to these entities cannot be resolved.
+
+ === How can I use a local version of a DTD referenced from an xml document? ===
+
+ When an XML document contains {{{<!DOCTYPE rootelement PUBLIC "xxxx" "yyyy">}}} (public id xxxx, system id yyyy) the XML parser used by
+ Digester will try to load file yyyy in order to process the DTD. As noted in the previous FAQ entry, this
+ occurs even when validation is disabled.
+
+ The SYSTEM id is a totally non-portable identifier. Usually it is a reference to a local file that is really only useful
+ on the same machine the document was created on. Even when it is an http reference, it is not really wise for the
+ receiver to download the specified file each time the document needs to be parsed (if it is accessible at all).
+
+ The PUBLIC id is a portable identifier that is essentially a key used to look up the real location of the corresponding resource.
+ An application receiving a document from a remote source is expected to register local copies of the relevant documents by
+ public id, so the lookup returns a local copy. This solves the problem of passing XML documents between host machines.
+
+ In order to support this, method Digester.register(String publicId, String entityURL) can be used to specify what local file
+ (or http url) should be read instead. Digester acts as an !EntityResolver for the XML parser it creates, and uses the registered
+ mappings in the !EntityResolver.resolveEntity method. This mapping applies to all "external entities" referenced by the XML
+ document being parsed, not just the DTD (though XML documents don't typically use external entities other than the DTD).
+
+ Note that this method takes a ''PUBLIC'' id only, not a ''SYSTEM'' id. A document that is meant to be used across machines but
+ omits the PUBLIC identifier is broken.
+
+ If you do have to deal with a broken XML document that has only a SYSTEM id and no PUBLIC id then you will need to create an
+ !EntityResolver and pass it to the Digester.setEntityResolver method. If you really do want to ignore the DTD, you can roll your
+ own !EntityResolver class in about 10 lines; it just needs to return an empty stream. Before doing this, however, re-read the
+ FAQ entry describing why the DTD is read even when validation is disabled.
+
+ An alternative to writing your own !EntityResolver is to use a real !EntityResolver such as the one available from:
+ http://xml.apache.org/commons/components/resolver/index.html
+
+ More information on !EntityResolver behaviour can be found here:
+ http://xml.apache.org/commons/components/resolver/resolver-article.html
+
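
As a rough illustration of the two resolver strategies described above (looking up a local copy by PUBLIC id, and returning an empty stream to skip a DTD), here is a minimal Java sketch that uses only the JDK's SAX classes. The class name LocalDtdResolver, its register method, and any ids or paths passed to it are hypothetical and are not part of Digester.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/**
 * Hypothetical resolver (not part of Digester): maps PUBLIC ids to local
 * copies and falls back to an empty stream for anything unregistered.
 */
class LocalDtdResolver implements EntityResolver {

    // PUBLIC id -> URL of the local copy (assumed to be set up by the application)
    private final Map<String, String> localCopies = new HashMap<>();

    /** Remember that the entity with this PUBLIC id should be read from localUrl. */
    void register(String publicId, String localUrl) {
        localCopies.put(publicId, localUrl);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String localUrl = (publicId == null) ? null : localCopies.get(publicId);
        if (localUrl != null) {
            // Known PUBLIC id: point the parser at the local copy instead.
            return new InputSource(localUrl);
        }
        // Unknown entity (for example a document with only a SYSTEM id):
        // hand back an empty stream so the parser reads nothing at all.
        // Attribute defaults and entities declared in the real DTD are lost.
        return new InputSource(new StringReader(""));
    }
}
```

An instance of such a class could be passed to the Digester.setEntityResolver method mentioned above. Returning an empty stream for every unknown entity is a blunt fallback; a stricter variant might throw an exception instead so that missing registrations are noticed.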
