[
https://issues.apache.org/jira/browse/XERCESJ-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maurizio Merli updated XERCESJ-1604:
------------------------------------
Description:
Parsing a document with a big CDATA section (about 25MB) with the feature
"http://apache.org/xml/features/dom/defer-node-expansion" set to false causes
an infinite loop.
Run this test class against the attachment.
package test;

import java.io.File;
import java.io.FileInputStream;
import java.util.concurrent.TimeUnit;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;

public class BigCData {
    public static void main(String[] args) throws Exception {
        // Request the Xerces JAXP implementation explicitly.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance(
                "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl",
                BigCData.class.getClassLoader());
        factory.setNamespaceAware(true);
        factory.setValidating(false);
        try {
            // The reported loop occurs with this feature set to false.
            factory.setFeature(
                    "http://apache.org/xml/features/dom/defer-node-expansion", false);
            // Do not load external DTDs.
            factory.setFeature(
                    "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            //
            factory.setAttribute(
                    "http://apache.org/xml/properties/input-buffer-size",
                    Integer.valueOf(100000000));
        } catch (Throwable ex) {
            System.err.println(
                    "Cannot set IGNORE DTD feature. You can have performance problems.");
        }
        XPathExpression xpath =
                XPathFactory.newInstance().newXPath().compile("//style");
        long t0 = System.nanoTime();
        File file = new File("/Users/maurizio/Downloads/web_dossier_mars.xml");
        Document doc = factory.newDocumentBuilder().parse(new FileInputStream(file));
        // Elapsed time for the parse alone.
        long dt = System.nanoTime() - t0;
        System.out.println(TimeUnit.NANOSECONDS.toMillis(dt) + " parse > " + doc);
        String nodeValue = xpath.evaluate(doc);
        // Cumulative elapsed time since t0 (parse plus XPath evaluation).
        dt = System.nanoTime() - t0;
        System.out.println(TimeUnit.NANOSECONDS.toMillis(dt) + " xpath > " + nodeValue.length());
    }
}
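
For anyone without access to the attachment, a stand-in input file could be generated along the lines below. This is only a sketch under assumptions: the class name BigCDataGenerator, the output file name big_cdata.xml, the <root> wrapper element and the ASCII filler are all hypothetical, and the real structure of web_dossier_mars.xml is not described in this report. The <style> element is used only because the test class queries "//style". There is no guarantee that a file produced this way triggers the same behaviour.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BigCDataGenerator {

    /**
     * Writes a small XML document whose single <style> element holds one
     * CDATA section with exactly cdataChars characters of ASCII filler.
     * Returns the number of filler characters written.
     */
    public static long generate(Path target, long cdataChars) throws IOException {
        // 64-character filler line; it never contains "]]>", so the CDATA
        // section is terminated only by the marker written after the loop.
        String line = "abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_\n";
        long written = 0;
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
            out.write("<root><style><![CDATA[");
            while (written < cdataChars) {
                // Write a full line, or a partial one to hit the exact size.
                int n = (int) Math.min(line.length(), cdataChars - written);
                out.write(line, 0, n);
                written += n;
            }
            out.write("]]></style></root>\n");
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical defaults: output name and a size of about 25MB,
        // matching the size mentioned in the description.
        Path target = Paths.get(args.length > 0 ? args[0] : "big_cdata.xml");
        long size = args.length > 1 ? Long.parseLong(args[1]) : 25L * 1024 * 1024;
        long written = generate(target, size);
        System.out.println("Wrote " + written + " CDATA characters to " + target);
    }
}
```

The generated path can then be substituted for the hard-coded file in BigCData. Passing a smaller size as the second argument may help to check how the parse time changes with the size of the CDATA section.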
was:
Parsing a document with a big CDATA section (about 25MB) with the feature
"http://apache.org/xml/features/dom/defer-node-expansion" set to false causes
an infinite loop.
> Big CDATA section cause a loop
> ------------------------------
>
> Key: XERCESJ-1604
> URL: https://issues.apache.org/jira/browse/XERCESJ-1604
> Project: Xerces2-J
> Issue Type: Bug
> Components: DOM (Level 3 Core)
> Reporter: Maurizio Merli
> Attachments: web_dossier_mars.rar