I asked this on XML-DEV once. It seems the best, correct solution is simply to demand valid XML ;) The XML spec [1] defines "valid XML" as having an associated doctype declaration. That's the contract between an XML producer and consumer.

Contracts are very important things. The whole *point* of XML is that it gives you a pre-defined, rigorous contract, to which both sides can agree. If your XML source breaks that contract, then the correct response is to yell at them until they fix it. If you "work around" the problem, you're throwing away the contract and losing the main benefit of XML.

An illustration: once upon a time, HTML was the "contract" that governed browsers. That contract was broken during the browser wars, as vendors rushed to add new tags and competed to see whose browser accepted crappier pseudo-HTML. End result: buggy, bloated browsers making web developers' lives miserable.

That said, there are instances where you have no control over the XML source, and can apply no pressure to get it fixed. In that case, I suggest you look at Simon St Laurent's "DOCTYPEChangerStream" class:

http://www.simonstl.com/projects/doctypes/

It is a filter that lets you replace an existing doctype declaration, or add one if one doesn't exist.

In addition (and regardless of whether you use SimonStL's hack), you'll probably need some additional code to validate against your "local" DTD, instead of whatever is specified in the doctype declaration's system id. In servlet environments, your webapp may be deployed from an unpacked .war, so it's not a good idea to let the parser resolve the DTD to a file. Here, you can use a custom EntityResolver which loads the DTD via getResourceAsStream(), and returns it to the parser. I've attached a class which does this.

You might also want to look at XML Catalogs for using a local DTD instead of that specified in the doctype. Here's a good article about it:

http://www.arbortext.com/Think_Tank/XML_Resources/Issue_Three/issue_three.html

HTH,
--Jeff

[1] "[Definition:] An XML document is valid if it has an associated document type declaration and if the document complies with the constraints expressed in it."
-- http://www.xml.com/axml/target.html#sec-prolog-dtd

On Wed, Mar 28, 2001 at 10:51:28PM +0200, Henrik Melander wrote:
> I have a server that receives a XML-file over http and responds with
> another. I do not have control over the client and they may not send
> correct xml. (usually not ;) Therefore we want to validate the xml file
> against the dtd.
>
> Is it possible to force the dom parser to validate against a dtd? I have
> not found anything in the api. If not, the best way seems to do a regexp
> in the file and insert the "dtd-link".
>
> Regards,
> Henrik

//package net.socialchange.bob.util;

import java.io.IOException;
import java.io.InputStream;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/**
 * This SAX EntityResolver allows us to validate against local DTDs.
 * {@link net.socialchange.bob.framework.ResultMerger}s
 * <code>
 * public static final String DSML_DTD_PUBLIC_ID = "http://www.dsml.org/DSML";
 * public static final String DSML_DTD_RESOURCE = "/net/socialchange/bob/dtds/dsml.dtd";
 * ..
 * ..
 * SAXBuilder builder = new SAXBuilder(true);
 * ..
 * builder.setEntityResolver(new LocalEntityResolver(DSML_DTD_PUBLIC_ID, DSML_DTD_RESOURCE));
 * </code>
 *
 * Stolen from jakarta tomcat 3.2.2, share/org/apache/jasper/compiler/JspUtil.java
 */
public class LocalEntityResolver implements EntityResolver {

    /** Public id of the DTD this resolver serves. */
    String dtdId;
    /** Classpath location of the local copy of that DTD. */
    String dtdResource;

    public LocalEntityResolver(String id, String resource) {
        this.dtdId = id;
        this.dtdResource = resource;
    }

    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        //System.out.println ("publicId = " + publicId);
        //System.out.println ("systemId is " + systemId);
        //System.out.println ("resource is " + dtdResource);

        // Returning null tells the parser to resolve the system id itself.
        if (publicId == null || !publicId.equals(dtdId)) {
            return null;
        }

        // Load the DTD from the classpath, so it works from a packed .war too.
        InputStream input = this.getClass().getResourceAsStream(dtdResource);
        if (input == null) {
            throw new IOException("DTD resource not found: " + dtdResource);
        }
        return new InputSource(input);
    }
}
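
To come back to the original question (forcing the DOM parser to validate), here is a minimal sketch of how the pieces might fit together with a JAXP DOM parser. It is written for a current JDK and is not tested against any particular servlet container. The class name ValidatingDomSketch, the public id, the example.invalid system id and the one-line DTD are all made up for the example, and the inline resolver only stands in for a classpath-backed one like LocalEntityResolver above so that the snippet runs on its own.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class ValidatingDomSketch {

    // Hypothetical public id and DTD, invented for this example only.
    static final String PUBLIC_ID = "-//Example//DTD Greeting//EN";
    static final String DTD = "<!ELEMENT greeting (#PCDATA)>";

    /** Returns true only if the document is well-formed and valid against DTD. */
    static boolean isValid(String xml)
            throws ParserConfigurationException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // DTD validation is off by default
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Serve our own copy of the DTD whenever the doctype names our public id;
        // returning null leaves any other entity to the parser's default lookup.
        builder.setEntityResolver((publicId, systemId) ->
                PUBLIC_ID.equals(publicId)
                        ? new InputSource(new StringReader(DTD))
                        : null);

        // Turn validity errors into exceptions instead of letting parsing continue.
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) throws SAXException { throw e; }
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });

        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, not valid, or no doctype at all
        }
    }

    public static void main(String[] args) throws Exception {
        String doctype = "<!DOCTYPE greeting PUBLIC \"" + PUBLIC_ID
                + "\" \"http://example.invalid/greeting.dtd\">";
        System.out.println(isValid(doctype + "<greeting>hi</greeting>"));
        System.out.println(isValid(doctype + "<greeting><b>hi</b></greeting>"));
        System.out.println(isValid("<greeting>hi</greeting>"));
    }
}
```

The ErrorHandler is the part that is easy to miss: validity problems are reported as recoverable errors, so unless a handler rethrows them the parse carries on and hands back a Document anyway. A document with no doctype at all is rejected here as well, which is where something like DOCTYPEChangerStream would come in if you cannot get the sender to fix their output.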