I'm trying to determine the official stance (spec and/or Xerces) on
whether it is OK for an HTMLDocument to contain a DocumentType
object. For a while I've heard that the spec says the HTML DOM
does not carry a DocumentType. However, I can't seem to
track down exactly where this is stated in any spec. The only place
I can find such a statement is in the doc for DocumentTypeImpl [1],
where it says:
"DocumentType is an Extended DOM feature, used in XML documents
but not in HTML."
Then I happened upon XERCESJ-1021 [2], where a fix was applied allowing
an HTMLDocument with a DTD to be cloned. Why would such a thing be
fixed if it wasn't supposed to be allowed in the first place? It seems
to me that it's implicitly allowed.
I have a parser class that extends the Xerces2 DOMParser and passes
NekoHTML's HTMLConfiguration to it. If I parse a document
with a DTD, then unless I override doctypeDecl() as a NOP, I end up with
an HTMLDocument that has a DocumentType object containing the DTD information.
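
For comparison, here is a minimal sketch of the post-parse alternative to
overriding doctypeDecl(): check whether the parsed Document carries a
DocumentType and remove it afterwards. It uses only the JDK's built-in DOM
parser, not my actual Xerces2/NekoHTML setup, so treat it as an illustration
under that assumption. The class and method names (DoctypeCheck, parse,
hasDoctype, stripDoctype) are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Xerces or NekoHTML.
class DoctypeCheck {

    // Parse well-formed markup from a string with the JDK's default parser.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // True if the document has a DocumentType child.
    static boolean hasDoctype(Document doc) {
        return doc.getDoctype() != null;
    }

    // Remove the DocumentType node, if any, after parsing.
    static void stripDoctype(Document doc) {
        DocumentType doctype = doc.getDoctype();
        if (doctype != null) {
            doc.removeChild(doctype);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<!DOCTYPE html><html><body/></html>");
        System.out.println("before: " + hasDoctype(doc));
        stripDoctype(doc);
        System.out.println("after: " + hasDoctype(doc));
    }
}
```

Stripping after the parse keeps the parser subclass untouched, at the cost of
building a DocumentType node that is then thrown away.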
So my question is: is there some specification or best practice that
says I should override doctypeDecl() as a NOP to prevent a
DocumentType from existing in the HTML DOM, or is it left entirely to
my discretion? If it is left to my discretion, do you have any
personal preference one way or the other? Are there any "real world"
reasons to leave the DocumentType out of the HTML DOM?
I guess I'd prefer to have the DocumentType there if I can. I'm just
trying to make sure I'm not implementing something harmful to users.
thanks,
Jake
[1] http://xerces.apache.org/xerces-j/apiDocs/org/apache/xerces/dom/DocumentTypeImpl.html
[2] http://issues.apache.org/jira/browse/XERCESJ-1021