Now that innerHTML is available more often, particularly body.innerHTML,
i.e. the entire page, which some scripts want, I must point out a bug
that we inherit from tidy.
Review the example www.eklhad.net/div
which is small enough that I include it here.

<body>
<input type=button name=whatever value=hohaa>
<a href="#bottom"><div>Cognitive business is here</div></a>
<script>alert("hello");</script>
</body>

The script tag is just so we create javascript, else there wouldn't be any.
Tidy knocks the div section outside the anchor and rewrites the html
that way. That rewritten html is what it hands us, and it is what I use for innerHTML.
Now I come along and cleverly detect what has happened, and work around it
by moving the div node back underneath the anchor where it belongs.
Even if I'm right, even if this isn't a false positive
where I shouldn't have moved anything, the html is still wrong.
Specifically, innerHTML is wrong, including a.innerHTML and body.innerHTML.
Jump into jdb and see for yourself.
This is truly an instance of Sir Walter Scott's:
"Oh, what a tangled web we weave, when first we practise to deceive!"

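To make the mismatch concrete, here is a minimal javascript sketch.
It is not edbrowse code: node(), serialize(), and innerOf() are hypothetical
helpers, and I am assuming tidy leaves the div as a sibling right after an
empty anchor, which may not be exactly what it emits.
It only shows why a string captured before the repair no longer matches the tree.

```javascript
// Hypothetical miniature tree, NOT edbrowse's real structures.
function node(tag, children, text) {
  return { tag: tag, children: children || [], text: text || "" };
}

// Serialize a node and its subtree back to html.
function serialize(n) {
  return "<" + n.tag + ">" + innerOf(n) + "</" + n.tag + ">";
}

// What innerHTML would be if it were computed from the tree itself.
function innerOf(n) {
  return n.text + n.children.map(serialize).join("");
}

// Assumed shape after tidy: the div has been knocked outside the anchor.
const a = node("a");
const div = node("div", [], "Cognitive business is here");
const body = node("body", [a, div]);

// innerHTML strings taken from tidy's rewritten html, before any repair.
const cached = { a: innerOf(a), body: innerOf(body) };

// The workaround: move the div node back underneath the anchor.
body.children.splice(body.children.indexOf(div), 1);
a.children.push(div);

// Compare the cached strings against the repaired tree.
const stale = {
  a: cached.a !== innerOf(a),
  body: cached.body !== innerOf(body)
};
console.log(stale);
```

Both flags come out true in this toy model: the tree is repaired, but the
strings still describe tidy's version of the page.
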
We should probably work more closely with tidy, to prevent some of these html rewritings,
rather than work around them, but then again, the tidy crew might say,
"We call ourselves tidy because we *fix* html,
it is antithetic for us to always leave it the way it is."
And yet that is what edbrowse needs.
We could fork tidy and make it do what we want,
but I *really* don't want to do that,
as we would lose the benefit of them maintaining it and enhancing it as html evolves.
Same reason we don't want to fork mozjs, or curl, etc.
I'm not sure what the solution is here.
I added some comments to this effect in decorate.c,
though not as long-winded as this email.

Karl Dahlke
_______________________________________________
Edbrowse-dev mailing list
[email protected]
http://lists.the-brannons.com/mailman/listinfo/edbrowse-dev
