This script is present in the HTML page itself, inside <script>//<!-- code //--></script>



-----Original Message-----
From: Jack Tang [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, March 15, 2006 10:58 AM
To: [email protected]
Subject: Re: javascript in summaries [nutch-0.7.1]

Maybe you can filter out JavaScript files (*.js) using the URL filter.
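
As a rough illustration of that suggestion, here is a minimal standalone sketch of a regex-based exclusion rule for *.js URLs. The class name JsUrlFilterSketch and the accept method are hypothetical and are not part of the Nutch API; the pattern is only an assumption about what such a rule could look like. A rule like this acts on URLs only, so it would not affect scripts embedded in the HTML page itself.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Nutch class.
public class JsUrlFilterSketch {
    // Assumed rule: reject URLs ending in ".js", optionally followed by a query string.
    private static final Pattern JS_URL =
            Pattern.compile("\\.js(\\?.*)?$", Pattern.CASE_INSENSITIVE);

    /** Returns true if the URL should be kept, false if it looks like a JavaScript file. */
    public static boolean accept(String url) {
        return !JS_URL.matcher(url).find();
    }

    public static void main(String[] args) {
        // Print the decision for an ordinary page and for a script file.
        System.out.println(accept("http://example.com/index.html"));
        System.out.println(accept("http://example.com/scripts/menu.js"));
    }
}
```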

/Jack

On 3/15/06, Ilia S. Yatsenko <[EMAIL PROTECTED]> wrote:
> Hello
>
>
>
> Sorry for my poor English.
>
>
>
> I use nutch-0.7.1 and have an issue with the HTML parser.
>
>
>
> I get JavaScript code in the summaries and don't know how to remove it. For
> example:
>
>
>
> . \n'); } if (plugin) { document.write(' '); document.write(' ');
> document.write(' '); document.write(' '); document.write(' ');
> document.write ...
>
>
>
> Or http://62.141.52.208:8080/dual/search.jsp?query=document.write :)
>
>
>
> This is my nutch-site.plugin line:
>
> <property>
> <value>nutch-extensionpoints|protocol-(http|httpclient)|urlfilter-regex|parse-html|index-(basic|more)|query-(more|stemmer|site|url)</value>
> </property>
>
>
>
> Can anybody help me?
>
>
>


--
Keep Discovering ... ...
http://www.jroller.com/page/jmars
