On Thu, Sep 10, 2015 at 10:28:03PM -0700, Kevin Carhart wrote:
> 
> Interesting.. Karl, does your certainty mean that you are saying
> that the distinction between the two tags is fundamentally
> unknowable for a parser?

It's certainly difficult if the parser isn't also capable of parsing the
scripting language within the script tags.

> I guess one good sign is that there appears to be a lot of
> past literature on this issue, on Tidy listservs.  Including
> one from 2006 called "Tidy barfs on split <SCRIPT> tags".
> Unless it's an impossible problem, maybe these past threads
> will contain something we can use.  I will read some of this
> correspondence.

I've also run the example through the main tidy html5 version, and it spits
it out as well.

> This reminds me of other gnarly situations with literals.
> For instance, when there are regular expression criteria in
> javascript strings that contain just solely a close brace or close
> parenthesis, if I come along and want to make
> assumptions about pairs of braces, the unmatched literal gets me
> out of sync.

Agreed, literals in scripts can cause issues like this.
There's also the issue of JSON shoved into script tags, etc. (I've seen web apps
use this for pre-caching server responses).
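
To make the out-of-sync problem concrete, here is a rough sketch. The helper
names are hypothetical and are not taken from edbrowse or tidy; the second
helper only understands plain quoted strings, so regex literals, comments and
template literals would still trip it up.

```javascript
// Hypothetical helpers for illustration only; not edbrowse or tidy code.

// Naive depth tracking: counts every brace, even inside string literals.
function naiveBraceDepth(src) {
  let depth = 0;
  for (const ch of src) {
    if (ch === "{") depth++;
    else if (ch === "}") depth--;
  }
  return depth;
}

// Skips single- and double-quoted strings, honouring backslash escapes.
// Assumption: no regex literals, comments or template literals in src.
function braceDepthSkippingStrings(src) {
  let depth = 0;
  let quote = null; // the quote character we are currently inside, if any
  for (let i = 0; i < src.length; i++) {
    const ch = src[i];
    if (quote) {
      if (ch === "\\") i++;               // skip the escaped character
      else if (ch === quote) quote = null; // end of the string literal
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
    }
  }
  return depth;
}

// A close brace that exists only inside a string literal.
const sample = 'function f() { return new RegExp("}"); }';
console.log(naiveBraceDepth(sample));           // -1, out of sync
console.log(braceDepthSkippingStrings(sample)); // 0, balanced
```

Even the second version is only as good as its knowledge of the script
language, which is the underlying difficulty here.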

I'm not sure what we can do about this, but I'm inclined to think that
whatever we do won't catch every case, and that at some stage we have to
accept that and move on.
I seem to remember that the accepted "fix" for this in HTML is not to split
the script inside </script> but rather to split it at the /, thus:
document.write("<"); document.write("/script>");
But I may be wrong there.
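
As a rough illustration of why splitting at the / helps, here is a sketch of
what a scanner that does not understand the script language might do: treat
the first "</script" after the opening tag as the end of the element. The
function name is hypothetical and this is a simplification, not tidy's actual
logic; it also assumes an opening tag with no attributes.

```javascript
// Hypothetical sketch of a script-unaware scanner; not tidy's actual code.
// Returns the script body as such a scanner would see it: everything up to
// the first case-insensitive "</script" after the opening tag.
function scriptBodyAsSeenByScanner(html) {
  const lower = html.toLowerCase();
  const open = lower.indexOf("<script>"); // assumption: no attributes
  if (open === -1) return null;
  const start = open + "<script>".length;
  const end = lower.indexOf("</script", start);
  return end === -1 ? null : html.slice(start, end);
}

// The close tag appears whole inside a string literal.
const unsplit = '<script>document.write("</script>");</script>';
// The close tag is split at the slash, as suggested above.
const split =
  '<script>document.write("<"); document.write("/script>");</script>';

console.log(scriptBodyAsSeenByScanner(unsplit));
// document.write("
console.log(scriptBodyAsSeenByScanner(split));
// document.write("<"); document.write("/script>");
```

In the unsplit case the body is cut off in the middle of the string literal;
in the split case the scanner never sees "</script" until the real close tag.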

We should probably report a bug against tidy5 for this in any case.
That's why we're using a parsing library, after all.
At least this one's maintained for us, so there's a reasonable chance they'll
fix these things once they find a workable solution.

Cheers,
Adam.


_______________________________________________
Edbrowse-dev mailing list
[email protected]
http://lists.the-brannons.com/mailman/listinfo/edbrowse-dev
