Re: parsing HTML into a document object in Fx3

Boris Zbarsky Thu, 16 Nov 2006 16:48:29 -0800

Myk Melez wrote:

Folks (particularly extension developers) regularly ask for a way to parse HTML into a document object, which is currently hard and hacky to do.


So as I see it, the steps to get this working are:

1) Decide what the problem we're solving is. Specifically, how should noscript, noframes, and such be parsed in these documents? Keep in mind that depending on user settings (like whether script is enabled) we create different DOMs from the same source.

2) Decide what the plan is for charsets (currently we depend on having a docshell to handle charset autodetect and in some cases <meta> tags, because we have to throw away the document and reparse).

3) Go through the HTML content sink and HTML document, and make sure all the places that use the docshell or window can survive without one.


4) Do whatever we decided to do for charsets.

5) Make DOMParser parse HTML.
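
To make item 1 concrete, here is a rough sketch of how the same source can yield two different node lists depending on whether script is enabled. It uses Python's standard html.parser rather than Gecko's parser, so it only illustrates the behavior; OutlineParser, parse, and the scripting_enabled flag are names invented for this example, and the flat list of events is a stand-in for a real DOM.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Records (kind, value) events as a crude stand-in for a DOM.

    Assumption for illustration: with scripting enabled, the content of
    <noscript> is kept as one raw text node; with scripting disabled it
    is parsed as ordinary markup.
    """

    def __init__(self, scripting_enabled):
        super().__init__(convert_charrefs=True)
        self.scripting_enabled = scripting_enabled
        self.events = []
        self._raw = None  # list of raw pieces while inside <noscript>

    def handle_starttag(self, tag, attrs):
        if self._raw is not None:
            # Inside a raw <noscript>: keep the tag as literal text.
            self._raw.append(self.get_starttag_text())
            return
        self.events.append(("element", tag))
        if tag == "noscript" and self.scripting_enabled:
            self._raw = []

    def handle_endtag(self, tag):
        if self._raw is None:
            return  # end tags are not recorded in this simplified outline
        if tag == "noscript":
            self.events.append(("text", "".join(self._raw)))
            self._raw = None
        else:
            self._raw.append("</%s>" % tag)

    def handle_data(self, data):
        if self._raw is not None:
            self._raw.append(data)
        else:
            self.events.append(("text", data))


def parse(source, scripting_enabled):
    parser = OutlineParser(scripting_enabled)
    parser.feed(source)
    parser.close()
    return parser.events


SOURCE = '<p>hi</p><noscript><img src="x.png"></noscript>'
without_script = parse(SOURCE, scripting_enabled=False)
with_script = parse(SOURCE, scripting_enabled=True)
```

With scripting disabled the img shows up as an element; with scripting enabled it survives only as text inside noscript. Whichever behavior a docshell-less parse picks, callers would see one of these two shapes.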

1. Will things get better in Gecko 1.9/Firefox 3 (i.e. are there concrete plans or promising developments in this area)?

I'm not aware of significant changes in this area since 1.8, and I'm not sure anyone is working on this actively. I strongly suspect that given our existing code, once item #1 above is sorted out handling item #3 and item #5 should not be that bad -- a few days' work at most. Items #2 and #4 I'm really not sure about; I guess in large part it depends on what we decide to do about #2.
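
For a sense of what one possible answer to #2 might look like without a docshell, here is a minimal sketch of the "throw away and reparse" idea: decode with a fallback charset, look for a <meta> charset declaration near the start of the bytes, and decode again if it disagrees. This is not what Gecko does; decode_html, the 1024-byte scan window, and the windows-1252 fallback are all assumptions made for the example, and it does no autodetection at all.

```python
import re

# Matches both <meta charset="..."> and the http-equiv/content form.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def decode_html(raw, fallback="windows-1252"):
    """Return (text, charset_used, reparsed) for a byte string of HTML."""
    # First pass: decode with the fallback guess.
    text = raw.decode(fallback, errors="replace")

    # Only the first 1024 bytes are scanned (an arbitrary window here).
    match = META_CHARSET.search(raw[:1024])
    if match is None:
        return text, fallback, False

    declared = match.group(1).decode("ascii").lower()
    if declared == fallback.lower():
        return text, fallback, False

    try:
        # The declaration disagrees: discard the first pass and redo it.
        return raw.decode(declared, errors="replace"), declared, True
    except LookupError:
        # Unknown charset label: keep the first-pass result.
        return text, fallback, False


utf8_page = '<meta charset="utf-8"><p>caf\u00e9</p>'.encode("utf-8")
text, charset, reparsed = decode_html(utf8_page)

plain_page = b"<p>hello</p>"
plain_text, plain_charset, plain_reparsed = decode_html(plain_page)
```

The second decode here is the cheap analogue of discarding the document and reparsing; in a real parser the cost is in rebuilding the DOM, which is why the decision in #2 matters.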


-Boris
_______________________________________________
dev-tech-layout mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-tech-layout
