Sorry, I think my use of "literal" is backwards, but I hope you can tell what I meant from context. It's the whole cluster of questions around an actual token with a formal meaning versus the same character appearing as part of a string. In this case, < could have a formal meaning, but here it isn't part of an HTML tag; it's a regular expression pattern to be matched, delimited not by quotes but by slashes. And the parser may not have enough information to tell the two situations apart.


On Tue, 12 Jan 2016, Kevin Carhart wrote:


I was trying to dig into the problem Sebastian from the commandline list ran into when trying to read Google Groups with edbrowse.

There may be a few things going on with Google Groups, but one that I could isolate as a short example is that they use inline regular expression literals, as follows:

<script type="text/javascript">
ua=/</g;va=/>/g;
</script>

And the routine fails because the expression criteria are taken as a literal, so the error is then "SyntaxError: unterminated regular expression literal".
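
For what it's worth, here is a minimal sketch showing that the two patterns are valid JavaScript on their own, and that they can be spelled without a raw angle bracket in the source text. The helper escapeAngles and the sample string are hypothetical, made up for illustration and not taken from the Google Groups code. Whether the escaped spelling would sidestep the edbrowse failure is an untested assumption.

```javascript
// The two patterns from the page, written as regex literals.
const ua = /</g;
const va = />/g;

// Equivalent patterns with no raw angle bracket in the source text.
const uaEscaped = /\x3c/g;               // \x3c is the hex escape for "<"
const vaCtor = new RegExp("\\x3e", "g"); // \x3e is the hex escape for ">"

// Hypothetical helper (an assumption for illustration only):
// replace every "<" and ">" using the supplied global patterns.
function escapeAngles(s, lt, gt) {
  return s.replace(lt, "&lt;").replace(gt, "&gt;");
}

const sample = "a<b>c<d";
const viaLiteral = escapeAngles(sample, ua, va);
const viaEscaped = escapeAngles(sample, uaEscaped, vaCtor);

console.log(viaLiteral);                // a&lt;b&gt;c&lt;d
console.log(viaLiteral === viaEscaped); // true
```

If a standalone JavaScript engine accepts the first two lines, as it should, then the syntax error presumably comes from how the script text is cut up before the engine sees it.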

I know this is very similar to the problem from months back where string contents were interpreted as literals, which is now fixed, right? Maybe this one is harder to deal with because it isn't delimited by quotes? It becomes ambiguous what /</ means.
Or should this work?
Or is it slipping my mind, and we already talked about the regex syntax back when we talked about things like document.writeln("<script language=JavaScript>document.writeln('Subject: ');<" + "/script>");
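
To make the question concrete: as I understand the HTML rules, everything between <script> and the next </script is raw text, so a lone < inside it should not count as markup. Below is a rough sketch of that idea. The function extractScriptBodies is a hypothetical name of my own and is not how edbrowse or tidy actually works; it is only meant to show that the body survives intact when the scan looks only for the closing tag.

```javascript
// Hypothetical sketch: pull out script bodies by treating everything up to
// the next "</script" as raw text, without interpreting "<" inside the body.
function extractScriptBodies(html) {
  const bodies = [];
  const open = /<script\b[^>]*>/gi;
  const close = /<\/script/gi;
  let m;
  while ((m = open.exec(html)) !== null) {
    const start = open.lastIndex;   // first character after the opening tag
    close.lastIndex = start;        // search for the closing tag from there
    const c = close.exec(html);
    if (c === null) break;          // unterminated script element
    bodies.push(html.slice(start, c.index));
    open.lastIndex = c.index;       // continue scanning after this body
  }
  return bodies;
}

const page = '<script type="text/javascript">\nua=/</g;va=/>/g;\n</script>';
const bodies = extractScriptBodies(page);
console.log(JSON.stringify(bodies[0]));

// Compiling (not running) the body throws a SyntaxError if it is malformed.
new Function(bodies[0]);
```

A real parser has more cases to handle (comments, escapes inside script data, and so on), so this is only an illustration of the raw-text idea.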


Note, I made sure my tidy was up to date before trying this.  When I say:
tidy -v
I get
HTML Tidy for Linux version 5.1.33

Any idea what can be done here?
thanks
Kevin


--------
Kevin Carhart * 415 225 5306 * The Ten Ninety Nihilists