Well, if they've been talking about this bug since 2007,
then they might not fix it soon.
With that in mind I was thinking about a preprocessing workaround.
My first attempt looked for strings within scripts,
and then for <script within a string,
but that was completely derailed by code like
        if(whatever.match(/"/)) { do stuff }
where the quote inside the regexp looks like the start of a string.
Bare regexps like this make the approach impossible.
So I made the routine even simpler, and I think safer and better:
it escapes every <script found inside a script, whether it is in a string or not.
If it flags a false positive, then the script won't compile and won't run,
which is better than running and doing the wrong thing.
Normally I just push stuff without review,
in the interest of getting things done,
which is perhaps arrogant, and I apologize for that,
but this one I think people should look at first.
It is ready to push if you say go.
The new string is longer than the original,
so I have to use all those dynamic string functions.

/* Work around a nasty bug in tidy5 wherein "<script" anywhere
 * inside a javascript will totally derail things.
 * I turn the < of each such inner <script into \x3c.
 * Returns an allocated string, or 0 if a script is never closed. */
static char *escapeLessScript(const char *htmltext)
{
        char *ns;               /* new string */
        int ns_l;
        const char *s1, *s2;    /* start and end of script */
        const char *lw;         /* last write */
        const char *q;          /* inner script */

        ns = initString(&ns_l);
        lw = htmltext;

        while (true) {
                s1 = strstrCI(lw, "<script");
                if (!s1)
                        break;
//  printf("@@%s", s1);
                s1 += 7;
                if (isalnumByte(*s1)) { /* <scriptx */
                        stringAndBytes(&ns, &ns_l, lw, s1 - lw);
                        lw = s1;
                        continue;
                }
                s2 = strstrCI(s1, "</script");
                if (!s2)
                        goto abort;

/* script now has a start and end */
                stringAndBytes(&ns, &ns_l, lw, s1 - lw);
                lw = s1;

                while (true) {
                        q = strstrCI(lw, "<script");
                        if (!q || q > s2)
                                break;
                        stringAndBytes(&ns, &ns_l, lw, q - lw);
                        stringAndString(&ns, &ns_l, "\\x3c");
                        lw = q + 1;
                }

                stringAndBytes(&ns, &ns_l, lw, s2 - lw);
                lw = s2;
        }

        stringAndString(&ns, &ns_l, lw);
        return ns;

abort:
        nzFree(ns);
        return 0;
}                               /* escapeLessScript */
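
For anyone who wants to try the idea outside of edbrowse, here is a minimal standalone sketch of the same technique using only the standard C library. It is not the code I would push. The names find_ci, struct buf, buf_add, and escape_nested_script are hypothetical stand-ins for strstrCI and the dynamic string functions, and I am assuming plain NUL-terminated input.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Case-insensitive substring search; hypothetical stand-in for strstrCI. */
static const char *find_ci(const char *hay, const char *needle)
{
	size_t n = strlen(needle);
	for (; *hay; ++hay) {
		size_t i = 0;
		while (i < n && tolower((unsigned char)hay[i]) ==
		       tolower((unsigned char)needle[i]))
			++i;
		if (i == n)
			return hay;
	}
	return NULL;
}

/* Growable string; hypothetical stand-in for initString/stringAndBytes. */
struct buf { char *p; size_t len, cap; };

/* Append n bytes and keep the buffer NUL-terminated. Returns 0 or -1. */
static int buf_add(struct buf *b, const char *s, size_t n)
{
	if (b->len + n + 1 > b->cap) {
		size_t cap = (b->len + n + 1) * 2;
		char *np = realloc(b->p, cap);
		if (!np)
			return -1;
		b->p = np;
		b->cap = cap;
	}
	memcpy(b->p + b->len, s, n);
	b->len += n;
	b->p[b->len] = '\0';
	return 0;
}

/* Turn the < of any "<script" found inside a script into \x3c.
 * Returns a malloc'd string, or NULL if a script is never closed
 * or memory runs out. The caller frees the result. */
static char *escape_nested_script(const char *html)
{
	struct buf b = { NULL, 0, 0 };
	const char *lw = html;	/* last write position */
	const char *s1, *s2, *q;
	int err = buf_add(&b, "", 0);	/* start with an empty string */

	while (!err && (s1 = find_ci(lw, "<script")) != NULL) {
		s1 += 7;
		if (isalnum((unsigned char)*s1)) {	/* <scriptx, not a script tag */
			err |= buf_add(&b, lw, (size_t)(s1 - lw));
			lw = s1;
			continue;
		}
		s2 = find_ci(s1, "</script");
		if (!s2) {	/* unclosed script: give up */
			err = -1;
			break;
		}
		err |= buf_add(&b, lw, (size_t)(s1 - lw));
		lw = s1;
		/* escape each <script that starts before the closing tag */
		while ((q = find_ci(lw, "<script")) != NULL && q < s2) {
			err |= buf_add(&b, lw, (size_t)(q - lw));
			err |= buf_add(&b, "\\x3c", 4);
			lw = q + 1;	/* skip the < we just replaced */
		}
		err |= buf_add(&b, lw, (size_t)(s2 - lw));
		lw = s2;
	}
	if (!err)
		err = buf_add(&b, lw, strlen(lw));
	if (err) {
		free(b.p);
		return NULL;
	}
	return b.p;
}
```

Note that, like the routine above, this treats the first </script after the opening tag as the end of the script, so a literal </script inside a string is still not handled.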