I'm trying to determine the official stance (spec and/or Xerces) on whether it is OK for an HTMLDocument to contain a DocumentType object. For a while I've heard that the spec says the HTML DOM does not carry a DocumentType. However, I can't seem to track down exactly where this is stated in any spec. The only place I can find such a statement is in the doc for DocumentTypeImpl [1], which says:

"DocumentType is an Extended DOM feature, used in XML documents but not in HTML."

Then I came across XERCESJ-1021 [2], where a fix was applied to allow an HTMLDocument with a DTD to be cloned. Why would such a thing be fixed if it wasn't supposed to be allowed in the first place? It seems to me that it's implicitly allowed.

I have a parser class that extends the Xerces2 DOMParser and passes NekoHTML's HTMLConfiguration to it. If I parse a document with a DTD, then unless I override doctypeDecl() as a NOP, I end up with an HTMLDocument that has a DocumentType object containing the DTD information.
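
To make the DOM-level outcome concrete, here is a minimal sketch of what I mean. It is only a stand-in: it uses the JDK's built-in JAXP DocumentBuilder rather than my Xerces2 DOMParser subclass with NekoHTML's HTMLConfiguration, so the result is a plain Document, not an HTMLDocument. The class name DoctypeDemo, the SAMPLE markup, and the stripDoctype helper are made up for illustration. Removing the node after parsing is a different technique from overriding doctypeDecl(), though it should leave the tree in roughly the same state.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

public class DoctypeDemo {
    // Hypothetical sample input: a tiny XHTML document with a DOCTYPE declaration.
    static final String SAMPLE =
        "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
        + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
        + "<html><head><title>t</title></head><body/></html>";

    // Parse markup into a DOM tree without fetching the external DTD.
    static Document parse(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Resolve every external entity to an empty stream (no network access).
            builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
            return builder.parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Remove the DocumentType node, if any, after parsing.
    static void stripDoctype(Document doc) {
        DocumentType doctype = doc.getDoctype();
        if (doctype != null) {
            doc.removeChild(doctype);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        DocumentType doctype = doc.getDoctype();
        System.out.println("doctype name: " + doctype.getName());
        System.out.println("public id:    " + doctype.getPublicId());
        stripDoctype(doc);
        System.out.println("after strip:  " + doc.getDoctype());
    }
}
```

With the default behavior the parsed tree carries a DocumentType; after stripDoctype() it does not, which is the state I would get by suppressing the declaration in the parser.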

So, my question is, is there some specification or best practice that says I should override doctypeDecl() as a NOP to prevent a DocumentType from existing in the HTML DOM, or is it entirely left to my discretion? If it is left to my discretion, do you have any personal preference one way or the other? Are there any "real world" reasons to leave out the DocumentType from the HTML DOM?

I guess I'd prefer to have the DocumentType there if I can. I'm just trying to make sure I'm not implementing something harmful to users.

Thanks,

Jake


[1] http://xerces.apache.org/xerces-j/apiDocs/org/apache/xerces/dom/DocumentTypeImpl.html
[2] http://issues.apache.org/jira/browse/XERCESJ-1021


