Chris said:
Essentially true. The details are a bit more complicated, but this is
the idea. We call tidy to parse the HTML, and we get back a structure
from tidy called a document. It contains our tree of nodes, and we
can iterate over it. The problem is that this is a usable parse tree for
the HTML, but it isn't a true DOM. We can remove nodes and attributes
from the tree, but we can't add them. That causes problems for JS that
needs to add new nodes. So we're going to have to take the parse tree
we get back from Tidy5, build our own DOM out of it, and eventually
render it.
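
To make the "build our own DOM out of it" step concrete, here is a minimal sketch in C. It does not call tidy at all: struct ParseNode is a hypothetical stand-in for a read-only parse tree, and DomNode, dom_new, dom_append, dom_from_parse, dom_count and dom_free are names invented for this illustration, not edbrowse or tidy identifiers. The only point is that the copy, unlike the parse tree, accepts new nodes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a read-only parse node (not the tidy type). */
struct ParseNode {
    const char *name;
    const struct ParseNode *child;   /* first child, or NULL */
    const struct ParseNode *next;    /* next sibling, or NULL */
};

/* Our own mutable node: children can be appended at any time. */
struct DomNode {
    char *name;
    struct DomNode *parent;
    struct DomNode *first_child;
    struct DomNode *last_child;
    struct DomNode *next;
};

/* Allocate a detached node holding a private copy of the name. */
struct DomNode *dom_new(const char *name)
{
    struct DomNode *n = calloc(1, sizeof *n);
    if (!n)
        return NULL;
    size_t len = strlen(name) + 1;
    n->name = malloc(len);
    if (!n->name) {
        free(n);
        return NULL;
    }
    memcpy(n->name, name, len);
    return n;
}

/* Attach a detached node as the last child of parent. */
void dom_append(struct DomNode *parent, struct DomNode *kid)
{
    assert(parent && kid && !kid->parent);
    kid->parent = parent;
    if (parent->last_child)
        parent->last_child->next = kid;
    else
        parent->first_child = kid;
    parent->last_child = kid;
}

/* Free a node and everything below it. */
void dom_free(struct DomNode *n)
{
    if (!n)
        return;
    struct DomNode *kid = n->first_child;
    while (kid) {
        struct DomNode *following = kid->next;
        dom_free(kid);
        kid = following;
    }
    free(n->name);
    free(n);
}

/* Recursively copy the read-only tree into a mutable one. */
struct DomNode *dom_from_parse(const struct ParseNode *p)
{
    struct DomNode *n = dom_new(p->name);
    if (!n)
        return NULL;
    for (const struct ParseNode *c = p->child; c; c = c->next) {
        struct DomNode *kid = dom_from_parse(c);
        if (!kid) {
            dom_free(n);
            return NULL;
        }
        dom_append(n, kid);
    }
    return n;
}

/* Count this node plus all of its descendants. */
size_t dom_count(const struct DomNode *n)
{
    size_t total = 1;
    for (const struct DomNode *c = n->first_child; c; c = c->next)
        total += dom_count(c);
    return total;
}
```

After dom_from_parse returns, a script-driven insertion is just another dom_append on the copy, which is the operation the parse tree does not offer.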
So the switch statement over (action) goes away, and the painstaking
character-by-character tag recognition goes away. But in return, do we
need a new switch statement that handles every value listed under
"Known HTML element types" in the tidyenum.h file, perhaps with related
values grouped together into shared cases?
And would some or most of the old case blocks be preserved? For example:
the old case TAGACT_TABLE might resemble a new case TidyTag_TABLE
the old case TAGACT_TR might resemble a new case TidyTag_TR
...
So the library does some of the work for you, but you still want a crack
at these node types, differentiated by what they are. Is that correct?
We're still building the new string 'ns'. Hmm, is more standardization
possible there, or do you still have to do a variety of things in order
to add to ns properly?
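
As a rough illustration of the question above, here is a sketch in C of a switch over element ids with grouped cases, where every case adds to the output through one helper. Everything in it is an assumption made for illustration: enum TagId stands in for a handful of the ids in tidyenum.h, struct Node stands in for a parsed node, StrBuf plays the role of 'ns', and the separators it emits are invented, not what edbrowse actually produces.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for a few element ids of the kind tidyenum.h lists. */
enum TagId { Tag_UNKNOWN, Tag_TABLE, Tag_TR, Tag_TD, Tag_TH, Tag_P, Tag_BR };

/* Hypothetical parsed node: an id, optional text, children and siblings. */
struct Node {
    enum TagId id;
    const char *text;            /* text content, or NULL */
    const struct Node *child;    /* first child, or NULL */
    const struct Node *next;     /* next sibling, or NULL */
};

/* Growable output string, playing the role of 'ns'. */
struct StrBuf {
    char *data;
    size_t len, cap;
};

/* The single place where text is added to the buffer; returns 0 on success. */
static int buf_add(struct StrBuf *b, const char *s)
{
    size_t n = strlen(s);
    if (b->len + n + 1 > b->cap) {
        size_t cap = (b->len + n + 1) * 2;
        char *p = realloc(b->data, cap);
        if (!p)
            return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, s, n + 1);   /* copies the terminating NUL too */
    b->len += n;
    return 0;
}

/* Walk a sibling list; emit text and children, then a per-tag separator. */
int render(const struct Node *n, struct StrBuf *ns)
{
    for (; n; n = n->next) {
        if (n->text && buf_add(ns, n->text))
            return -1;
        if (render(n->child, ns))
            return -1;
        switch (n->id) {
        case Tag_TD:
        case Tag_TH:             /* grouped: both kinds of cell end the same way */
            if (buf_add(ns, "|"))
                return -1;
            break;
        case Tag_TR:
        case Tag_P:
        case Tag_BR:             /* grouped: all of these end a line */
            if (buf_add(ns, "\n"))
                return -1;
            break;
        case Tag_TABLE:
        case Tag_UNKNOWN:
        default:                 /* nothing extra for the remaining ids */
            break;
        }
    }
    return 0;
}
```

Under these assumptions a table with one row and two cells holding "a" and "b" renders as "a|b|" followed by a newline. Whether the real case blocks can be made this uniform is exactly the open question; the sketch only shows that grouped cases and a single append helper are compatible.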
---
Here is a second note, on what Karl said (paraphrasing): as a first step,
how about bringing libtidy into html.c, running its parse method, and just
carrying the output around as part of the ebWindow struct for further
examination, without breaking what now exists?
My note on this: I went to eb.h to see what would happen if I included
tidy.h and added a TidyDoc to the ebWindow struct. Interestingly, because
of includes from includes, there is a name collision when I try to
compile, I think, between mkdir in plugin.c and mkdir in
/usr/include/sys/stat.h. Uh, maybe it's a client thing, though.
Disregard if it doesn't sound salient.
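
One possible way around a collision like this, if it really does come from headers pulled in through tidy.h, is to keep tidy.h out of eb.h and let the window struct carry an opaque pointer that only html.c interprets. The sketch below is an assumption-laden illustration, not a tested fix: ebWindowSketch, parsed_doc, FakeDoc, window_set_doc and window_get_doc are all invented names, and FakeDoc merely stands in for whatever document handle the parser returns.

```c
#include <assert.h>
#include <stddef.h>

/* --- what a shared header such as eb.h might hold (hypothetical) --- */
struct ebWindowSketch {
    /* ... the existing members would stay here ... */
    void *parsed_doc;    /* opaque handle; the shared header needs no parser include */
};

/* --- what html.c alone might hold (hypothetical) --- */
/* Stand-in for the parser's document type; only this file knows its layout. */
struct FakeDoc {
    int node_count;
};

/* Store the parse result on the window for later examination. */
void window_set_doc(struct ebWindowSketch *w, struct FakeDoc *doc)
{
    w->parsed_doc = doc;
}

/* Fetch it back, restoring the concrete type. */
struct FakeDoc *window_get_doc(const struct ebWindowSketch *w)
{
    return w->parsed_doc;
}
```

With this layout, files such as plugin.c would see only a void pointer and would not pick up anything the parser's headers include, at the cost of losing type checking on that one member.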
thanks.. this is fun.. I hope tidy will work
Kevin
--------
Kevin Carhart * 415 225 5306 * The Ten Ninety Nihilists
_______________________________________________
Edbrowse-dev mailing list
http://lists.the-brannons.com/mailman/listinfo/edbrowse-dev