Hi all,
I'm parsing an XHTML document using Xerces.
This is the code that I'm using to parse the document:
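import java.io.StringReader;
import org.apache.html.dom.HTMLDocumentImpl;
import org.apache.xerces.parsers.DOMParser;
import org.xml.sax.InputSource;

String xhtmlSource = "<the xhtml source>";
DOMParser parser = new DOMParser();
// Ask Xerces to build the tree with the HTML DOM document class
parser.setProperty("http://apache.org/xml/properties/dom/document-class-name", "org.apache.html.dom.HTMLDocumentImpl");
InputSource iSource = new InputSource(new StringReader(xhtmlSource));
parser.parse(iSource);
HTMLDocumentImpl document = (HTMLDocumentImpl) parser.getDocument();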
The parsing seems to work, except that when I query the HTMLDocumentImpl, most
nodes are of type ElementNSImpl rather than the actual Apache HTML DOM
implementation classes. (For example, I can't even call
document.getBody(); it returns null. Instead I have to walk the XML DOM
looking for the 'body' node.)
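
A minimal sketch of that kind of workaround, for illustration only: it looks up the 'body' element through the generic DOM API instead of getBody(). It uses the JDK's standard JAXP parser rather than the Xerces DOMParser setup above, and the names BodyLookup, findBody and parse are hypothetical, not part of Xerces.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class; none of these names come from Xerces.
final class BodyLookup {
    // Namespace URI that XHTML documents declare on their root element.
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Returns the first 'body' element, or null if the document has none.
    static Element findBody(Document doc) {
        // Namespace-aware lookup first: matches <body> in the XHTML namespace.
        NodeList bodies = doc.getElementsByTagNameNS(XHTML_NS, "body");
        if (bodies.getLength() == 0) {
            // Fall back to a plain tag-name match for documents without a namespace.
            bodies = doc.getElementsByTagName("body");
        }
        return bodies.getLength() > 0 ? (Element) bodies.item(0) : null;
    }

    // Parses an XHTML string into a namespace-aware DOM tree.
    static Document parse(String xhtml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xhtml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XHTML source", e);
        }
    }
}
```

With a Document from any parser, findBody(document) returns a plain org.w3c.dom.Element, so the HTML-specific accessors such as those on HTMLBodyElement are still unavailable; this only replaces the manual tree walk.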
This behaviour is described in NekoHTML's 'Requirements and Limitations'
section at http://people.apache.org/~andyc/neko/doc/html/index.html
However, I'm not using NekoHTML; I'm currently using Xerces 2.8.0. I did try
various other versions of Xerces, but to no avail.
I'm having to carry on working with plain nodes, but I'd much rather
work with the HTML DOM.
Can anyone give any hints?
Thanks in advance.
Daniel