On Thu, 2004-07-15 at 09:26, Craig McClanahan wrote:
> Paolo Valladolid wrote:
>
> >I need to use Digester to parse XML that has been retrieved from a
> >database. The XML I'm working with was received from elsewhere (ie. Not
> >created by our team). How do I get Digester to ignore the <!DOCTYPE>
> >tag? I've tried setValidating( false ) and it did not work.
>
> The setValidating(false) call does indeed tell Digester to not validate
> the XML data. However, it does *not* tell the underlying XML parser to
> skip the DOCTYPE, and there is no API in JAXP to say that sort of thing.
>
> If your problem is unresolved entities, one thing you can do is to
> provide your own EntityResolver method whose resolveEntity() method
> always returns null. That way, the parser won't go traipsing around the
> network trying to find things that it can't.

Hi Paolo,

I'm presuming the problem is that you have a DOCTYPE like this:

  <!DOCTYPE root SYSTEM "http://www.acme.com/mydtd.dtd">

and want to suppress loading of the referenced document, or have a DTD
which declares <!ENTITY ....> and want to suppress loading of the entity.

In other words, you don't want to ignore the DOCTYPE; you want to
suppress loading of external entities.

Craig's suggestion of writing an EntityResolver will work, but he has
made a minor mistake: if you return *null* from the entity resolver,
then the parser will apply its normal resolving rules, including
retrieving the entity (eg the DTD) from the specified URL. This is
explicitly stated in the javadoc for the org.xml.sax.EntityResolver
interface.

In order to ignore remote entities, you can instead get your
EntityResolver to return an InputSource that wraps an empty InputStream.

Note, however, that this can change the *meaning* of your xml document.
For example, if the DTD defines a default value for an attribute, then
ignoring the DTD will result in the attribute not getting its expected
value. In general, it is better to ensure you have a local copy of the
DTD, then use an EntityResolver to return the local DTD rather than
returning an empty stream.

Still, if you *know* that the DTD doesn't have this sort of stuff in it,
returning an InputSource which wraps an empty stream will work ok.

If you happen to know that the underlying xml parser is Xerces then you
can use the setFeature method to disable loading of DTDs. However this is
parser-specific. See the xerces documentation on "features" for more
info.

By the way, this has nothing to do with the Digester; it is related to
JAXP parsing in general. So you may be better off asking this on a list
for xml parsing & JAXP.

Regards,

Simon
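
To make the empty-InputSource approach concrete, here is a minimal sketch. It uses a plain JAXP SAX parser rather than Digester, so it runs with nothing but the JDK. The names IgnoreDtdExample, EmptyEntityResolver and parseRootName are made up for illustration and are not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; the names here are illustrative only.
class IgnoreDtdExample {

    // Answers every external entity request (including the DTD named in the
    // DOCTYPE) with an empty stream, so the parser never fetches the URL.
    // Returning null instead would make the parser fall back to its default
    // behaviour and open the system identifier itself.
    static class EmptyEntityResolver implements EntityResolver {
        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            return new InputSource(new ByteArrayInputStream(new byte[0]));
        }
    }

    // Parses the XML text with a non-validating JAXP SAX parser and returns
    // the name of the root element.
    static String parseRootName(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(false);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setEntityResolver(new EmptyEntityResolver());

        final String[] root = new String[1];
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (root[0] == null) {
                    root[0] = qName;
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return root[0];
    }
}
```

With Digester the same resolver would presumably be installed through its setEntityResolver method before calling parse. The caveat about DTD-supplied attribute defaults still applies: with an empty DTD, those defaults are simply absent.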
