> > Well, if someone would volunteer to poke into muffin and look for the
> > JavaScript filtering code, we can always look at that (I'm assuming it's
> > GPL?). If it looks reasonable, I'm sure a patch for htdig/HTML.cc can be
> > made.
>
> Yes, Muffin is GPL. If you guys can get me more information about the
> problem I'll do my best to fix it.

I guess the "problem" is this: ht://Dig treats JavaScript embedded in HTML
files as ordinary text and indexes it. So if we can take the code Muffin
uses to strip JavaScript and run it as a "remove JavaScript" pass over the
HTML files before ht://Dig begins the real indexing, we'd be set.

This could be pretty simple. If Muffin's JavaScript-filtering code lives in
one or two files and has a high-level function (something that returns an
HTML buffer without the JavaScript), then it would be almost a drop-in. If
it's not quite that simple, we can extract what we need into a separate file.
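
To make the idea concrete, here is a rough sketch of what such a high-level
function might look like in C++ (the language of htdig/HTML.cc). This is not
Muffin's code and not an existing ht://Dig function; the name
strip_script_blocks is made up for illustration. It only removes
<script>...</script> elements, so event-handler attributes and javascript:
URLs would still need separate handling.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not from Muffin or ht://Dig): returns a copy of
// `html` with every <script ...> ... </script> element removed.
std::string strip_script_blocks(const std::string& html) {
    // Lower-cased copy so tag matching is case-insensitive. Its length is
    // unchanged, so offsets into `lower` are valid offsets into `html`.
    std::string lower(html);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const std::string open_tag = "<script";
    const std::string close_tag = "</script";
    std::string out;
    out.reserve(html.size());

    std::string::size_type pos = 0;
    while (pos < html.size()) {
        std::string::size_type start = lower.find(open_tag, pos);
        if (start == std::string::npos) {
            out.append(html, pos, std::string::npos);  // no more scripts
            break;
        }
        // Only accept a real tag name, so "<scripture>" is left alone.
        std::string::size_type after = start + open_tag.size();
        bool is_tag = after >= lower.size() || lower[after] == '>' ||
                      lower[after] == '/' ||
                      std::isspace(static_cast<unsigned char>(lower[after]));
        if (!is_tag) {
            out.append(html, pos, after - pos);
            pos = after;
            continue;
        }
        out.append(html, pos, start - pos);  // keep text before the script
        std::string::size_type end = lower.find(close_tag, after);
        if (end == std::string::npos) break;  // unterminated: drop the rest
        std::string::size_type gt = lower.find('>', end);
        if (gt == std::string::npos) break;
        pos = gt + 1;  // resume just past "</script>"
    }
    return out;
}
```

The indexer would call something like this on each document buffer before
handing it to the HTML parser. Dropping the tail of a document after an
unterminated <script> is a choice made here for simplicity; keeping it would
put the script text back into the index.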
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/