Hi,
I am using the Oracle XDK parser and am parsing an XML
document in two ways.
1. Not using dom4j: I do the following:
//.....
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setNamespaceAware(true);
spf.setValidating(true);
SAXParser sp = spf.newSAXParser();
sp.setProperty(
"http://java.sun.com/xml/jaxp/properties/schemaLanguage",
"http://www.w3.org/2001/XMLSchema");
DefaultHandler dh = new testhandler();
XMLReader reader = sp.getXMLReader();
reader.setEntityResolver(new schemaval());
reader.setContentHandler(dh);
reader.setErrorHandler(dh);
reader.parse(xmlurl);
//.......
This works fine.
2. Using dom4j: I do the following:
//......
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setNamespaceAware(true);
spf.setValidating(true);
SAXParser sp = spf.newSAXParser();
System.out.println("SAx parser class: " + sp.getClass().getName());
sp.setProperty(
"http://java.sun.com/xml/jaxp/properties/schemaLanguage",
"http://www.w3.org/2001/XMLSchema");
xmlReader = sp.getXMLReader();
xmlReader.setErrorHandler(this);
xmlReader.setEntityResolver(this);
SAXReader saxr = new SAXReader(xmlReader, true);
Document doc = saxr.read(xml);
Doing this raises the following exception:
org.dom4j.DocumentException: Error on line 3 of document : <Line 3, Column 9>: XML-20149: (Error) Element 'dataset' used but not declared. Nested exception: <Line 3, Column 9>: XML-20149: (Error) Element 'dataset' used but not declared.
at org.dom4j.io.SAXReader.read(SAXReader.java:358)
at org.dom4j.io.SAXReader.read(SAXReader.java:271)
at TestCase.parseAndPrint(TestCase.java:81)
at TestCase.main(TestCase.java:113)
Nested exception: org.xml.sax.SAXParseException: <Line 3, Column 9>: XML-20149: (Error) Element 'dataset' used but not declared.
at oracle.xml.parser.v2.XMLError.flushErrorHandler(XMLError.java:428)
at oracle.xml.parser.v2.XMLError.flushErrors1(XMLError.java:290)
at oracle.xml.parser.v2.NonValidatingParser.parseDocument(NonValidatingParser.java:287)
at oracle.xml.parser.v2.XMLParser.parse(XMLParser.java:180)
at org.dom4j.io.SAXReader.read(SAXReader.java:342)
at org.dom4j.io.SAXReader.read(SAXReader.java:271)
at TestCase.parseAndPrint(TestCase.java:81)
at TestCase.main(TestCase.java:113)
Now, if I comment out the following part of org.dom4j.io.SAXReader in the
dom4j source (around line 732), the program using dom4j behaves correctly,
i.e. it validates the document and parses it without errors. Of course,
only commenting out the call that sets the validation feature matters.
try {
// configure validation support
reader.setFeature(
"http://xml.org/sax/features/validation",
isValidating()
);
if (errorHandler != null) {
reader.setErrorHandler(errorHandler);
}
else {
reader.setErrorHandler(contentHandler);
}
}
catch (Exception e) {
if (isValidating()) {
throw new DocumentException(
"Validation not supported for XMLReader: " +
reader,
e
);
}
}
My question is: if I want to rely only on
SAXParserFactory.setValidating(true),
how does dom4j accommodate that?
Another problem is that even if I do not request validation in the
SAXReader constructor, i.e. I use "new SAXReader(xmlReader)", the code
above sets the validation feature to false and so turns the parse into
a non-validating one.
Isn't this a bug?
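One possible workaround, without patching dom4j, is to wrap the XMLReader in a filter that ignores changes to the validation feature, so that whatever SAXParserFactory.setValidating(true) configured stays in effect. The following is only a sketch under assumptions: the names KeepValidationDemo and KeepValidationFilter are made up for illustration, it uses only JAXP/SAX classes from the JDK, and it has not been tried against the Oracle XDK parser.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

public class KeepValidationDemo {
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    /** Hypothetical filter: silently ignores attempts to change the validation feature. */
    static class KeepValidationFilter extends XMLFilterImpl {
        KeepValidationFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void setFeature(String name, boolean value)
                throws SAXNotRecognizedException, SAXNotSupportedException {
            if (VALIDATION.equals(name)) {
                return; // keep the setting chosen on the SAXParserFactory
            }
            super.setFeature(name, value); // every other feature goes to the wrapped reader
        }
    }

    /** Builds a reader the same way as in the post: namespace aware and validating. */
    static XMLReader newValidatingReader() throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        spf.setValidating(true);
        return spf.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        // Unwrapped reader: resetting the feature overrides the factory setting.
        XMLReader plain = newValidatingReader();
        plain.setFeature(VALIDATION, false);
        System.out.println("plain reader validating: " + plain.getFeature(VALIDATION));

        // Wrapped reader: the same call is ignored by the filter.
        XMLReader guarded = new KeepValidationFilter(newValidatingReader());
        guarded.setFeature(VALIDATION, false);
        System.out.println("guarded reader validating: " + guarded.getFeature(VALIDATION));
    }
}
```

With dom4j, the wrapped reader would then be passed in as "new SAXReader(new KeepValidationFilter(xmlReader))". Note that the error handler and entity resolver would have to be set on the filter rather than on the wrapped reader, because XMLFilterImpl installs itself as the wrapped reader's handlers when parsing starts.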
Thanks,
Gurdev