Hi Karl,
> In an ideal world,
LOL! Well we all know that does not exist!
Tidy does leave the form open, waiting, as it
should, for a close form, but then it hits
an opening tr (table row) tag, and reports -
line 5 column 1 - Warning: missing close form
before tr
It is at this point that it *must* close the
form... and carries on parsing the table
row.. etc...
And that is why tidy emits an error when it
does eventually find a close form...
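
To make that sequence concrete, here is a toy sketch in
Python. It is NOT libtidy code, just my own little model:
the token format, the function name toy_parse and the
message wording are all made up for illustration. It only
shows why an implicit close, once done, turns the real
close form into an unexpected one...

```python
def toy_parse(tokens):
    """Toy model only: (kind, name) tokens, not real HTML, not libtidy."""
    stack = []      # names of the currently open elements
    messages = []   # diagnostics; the wording here is invented
    for kind, name in tokens:
        if kind == "open":
            # In this toy model a tr may not start while a form is the
            # innermost open element, so the form is closed implicitly.
            if name == "tr" and stack and stack[-1] == "form":
                messages.append("missing close form before tr")
                stack.pop()
            stack.append(name)
        else:  # kind == "close"
            if name in stack:
                # Pop back to, and including, the matching open element.
                while stack.pop() != name:
                    pass
            else:
                # The element was already closed implicitly, so there is
                # nothing left for this close tag to match.
                messages.append("unexpected close " + name)
    return messages


tokens = [
    ("open", "table"), ("open", "form"),
    ("open", "tr"), ("open", "td"),
    ("close", "td"), ("close", "tr"),
    ("close", "form"), ("close", "table"),
]
for message in toy_parse(tokens):
    print(message)
```

By the time the close form arrives, the form is long gone
from the stack, and the only choices left are to complain
or to go back and rework the tree...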
I too have had the thought - does this not
tell tidy that the earlier implicit form
close it added was not right - but what can
it do about it at that stage?
> postmuck with the tree
Yes, I hear you! That is *not* fun, and as you
point out in fixing one page, you can break so
many others...
> Using libtidy
You know, for a long time I have wondered why
you do not write your own html parser!
Not that I particularly want you to abandon
libtidy... your participation has helped solve
some libtidy problems... and so I do hope you
continue...
But like any standard html browser, IE, Firefox,
Chrome, whoever, you are not really interested in how
well a document is formed... browsers can just skip
over many problems...
If necessary, you could maybe leverage code from
text-based web browsers, like Lynx, but in my
experiments with some of these, they too can get
very hairy...
It is just that once you have the html text in a
buffer, parsing basically consists of looking for
each `<` and its closing `>`, with not too many
exceptions...
I have done this, with reasonable success, in several
perl scripts I have written... as I am sure you
probably have... like I remember in your first perl
version...
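
Just as a rough sketch of what I mean by looking for `<`
and `>`, here it is in Python rather than perl. The
function name scan_tags is my own invention, and it
handles only one of the exceptions (comments). A `>`
inside a quoted attribute value, or script content, would
need more care, so take it as an illustration and not a
real parser...

```python
def scan_tags(html):
    """Return the text found between each '<' and the next '>'."""
    tags = []
    pos = 0
    while True:
        start = html.find("<", pos)
        if start == -1:
            break              # no more tags in the buffer
        # Comments are one of the exceptions: they may contain '>'.
        if html.startswith("<!--", start):
            end = html.find("-->", start + 4)
            if end == -1:
                break          # unterminated comment, stop scanning
            pos = end + 3
            continue
        end = html.find(">", start + 1)
        if end == -1:
            break              # unterminated tag, stop scanning
        tags.append(html[start + 1:end])
        pos = end + 1
    return tags


sample = '<form><!-- a > b --><tr class="x"></tr></form>'
print(scan_tags(sample))
```

This does no repair at all. It just reports the tags in
the order they appear, and leaves it to the caller to
decide what a stray close form means...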
But I understand, this is a long, LONG way around...
quite an amount of new work initially...
But libtidy is always going to give you problems
when it runs into invalid html, because of its
efforts to make it valid...
Just some thoughts... Sorry, I cannot seem to help
more...
Regards, Geoff.