Myk Melez wrote:
> Folks (particularly extension developers) regularly ask for a way to parse HTML into a document object, which is currently hard and hacky to do.

So as I see it, the steps to get this working are:

1) Decide what the problem we're solving is. Specifically, how should noscript, noframes, and such be parsed in these documents? Keep in mind that depending on user settings (like whether script is enabled) we create different DOMs from the same source.

2) Decide what the plan is for charsets (currently we depend on having a docshell to handle charset autodetection and, in some cases, <meta> tags, because we have to throw away the document and reparse).

3) Go through the HTML content sink and HTML document, and make sure all the places that use the docshell or window can survive without one.

4) Do whatever we decided to do for charsets.

5) Make DOMParser parse HTML.
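
To make the question in step 1 concrete, here is a minimal sketch of how the same source can yield two different structures depending on a scripting flag. It is an illustration in Python using the standard html.parser module, not Gecko code; the names OutlineBuilder, outline, and scripting_enabled are hypothetical, and the rule it applies (noscript content kept as raw text when script is enabled, parsed as markup when it is not) is an assumption about the behaviour under discussion.

```python
from html.parser import HTMLParser


class OutlineBuilder(HTMLParser):
    """Record a flat outline of ("element", name) and ("text", value) events.

    Hypothetical helper: when scripting_enabled is true, everything inside
    <noscript> is collected as one raw text event instead of being parsed.
    """

    def __init__(self, scripting_enabled):
        super().__init__(convert_charrefs=True)
        self.scripting_enabled = scripting_enabled
        self.events = []
        self._in_raw_noscript = False
        self._noscript_raw = []

    def handle_starttag(self, tag, attrs):
        if self._in_raw_noscript:
            # Keep the tag exactly as written; it is just text here.
            self._noscript_raw.append(self.get_starttag_text())
            return
        self.events.append(("element", tag))
        if tag == "noscript" and self.scripting_enabled:
            self._in_raw_noscript = True

    def handle_endtag(self, tag):
        if not self._in_raw_noscript:
            return
        if tag == "noscript":
            # End of the raw region: emit the collected text as one event.
            self.events.append(("text", "".join(self._noscript_raw)))
            self._noscript_raw = []
            self._in_raw_noscript = False
        else:
            self._noscript_raw.append(f"</{tag}>")

    def handle_data(self, data):
        if self._in_raw_noscript:
            self._noscript_raw.append(data)
        elif data.strip():
            self.events.append(("text", data.strip()))


def outline(source, scripting_enabled):
    """Parse source and return the recorded outline events."""
    builder = OutlineBuilder(scripting_enabled)
    builder.feed(source)
    builder.close()
    return builder.events


SOURCE = '<p>Hi</p><noscript><img src="fallback.png"></noscript>'
with_script = outline(SOURCE, scripting_enabled=True)
without_script = outline(SOURCE, scripting_enabled=False)
```

With the flag off, the outline contains an img element; with it on, the same bytes produce a text node holding the literal markup. A docshell-less parser has no user setting to consult, so it has to pick one of these two results.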

> 1. Will things get better in Gecko 1.9/Firefox 3 (i.e. are there concrete plans or promising developments in this area)?

I'm not aware of significant changes in this area since 1.8, and I'm not sure anyone is working on this actively. I strongly suspect that, given our existing code, once item #1 above is sorted out, handling item #3 and item #5 should not be that bad -- a few days' work at most. Items #2 and #4 I'm really not sure about; I guess in large part it depends on what we decide to do about #2.
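
As a rough illustration of what item #2 has to settle, the sketch below shows the "throw away the document and reparse" pattern in isolation: decode with a tentative charset, look for a <meta> declaration, and start over if it disagrees. This is Python with the standard library, not Gecko code; MetaCharsetFinder, parse_with_reparse, and the windows-1252 default are all assumptions made for the example, and real autodetection is considerably more involved.

```python
import codecs
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Remember the first charset declared by a <meta> tag, if any."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attrs = dict(attrs)
        if attrs.get("charset"):
            # <meta charset="...">
            self.charset = attrs["charset"].strip().lower()
            return
        # <meta http-equiv="Content-Type" content="text/html; charset=...">
        content = (attrs.get("content") or "").lower()
        http_equiv = (attrs.get("http-equiv") or "").lower()
        if http_equiv == "content-type" and "charset=" in content:
            self.charset = content.split("charset=", 1)[1].split(";")[0].strip()


def same_codec(a, b):
    """True if both labels name the same Python codec; False if either is unknown."""
    try:
        return codecs.lookup(a).name == codecs.lookup(b).name
    except LookupError:
        return False


def parse_with_reparse(data, tentative="windows-1252"):
    """Return (text, charset, reparsed) for the given bytes."""
    text = data.decode(tentative, errors="replace")
    finder = MetaCharsetFinder()
    finder.feed(text)
    finder.close()
    declared = finder.charset
    if declared and not same_codec(declared, tentative):
        try:
            # Discard the first attempt and decode again from the raw bytes.
            return data.decode(declared, errors="replace"), declared, True
        except LookupError:
            pass  # Unknown label: keep the tentative result.
    return text, tentative, False


page = '<meta charset="utf-8"><p>caf\u00e9</p>'.encode("utf-8")
text, charset, reparsed = parse_with_reparse(page)
plain_text, plain_charset, plain_reparsed = parse_with_reparse(b"<p>plain</p>")
```

The point of the sketch is only that the caller must keep the original bytes around and be prepared to build the document twice, which is the part a docshell currently handles for us.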

-Boris
_______________________________________________
dev-tech-layout mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-tech-layout
