Hi,
What you can do is remove parse-js and the other related plugins from
both the nutch-site.xml and nutch-default.xml files.
Changing nutch-default.xml is not recommended, though sometimes the change
does not take effect unless nutch-default.xml is changed as well.

So see which changes fit your requirements. I am sure that once you
remove parse-js it won't crawl JavaScript. Also try removing other
plugins, such as parse-msword.
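
As a rough sketch of the change described above, an override in
nutch-site.xml could look something like the following. The plugin list
here is an assumption based on a typical plugin.includes value with
parse-js dropped; check the value in your own nutch-default.xml and copy
it from there before trimming it.

```xml
<!-- conf/nutch-site.xml: properties set here override nutch-default.xml -->
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- Assumed plugin list; parse-js is left out so JavaScript is not parsed -->
    <value>protocol-http|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  </property>
</configuration>
```

This only controls which plugins are loaded, so it is worth running a
small test crawl afterwards to confirm the result is what you expect.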

I hope that will do it.

Ratnesh,V2Solutions,India



Meryl Silverburgh wrote:
> 
> Hi,
> 
> How can I configure nutch just crawl html links (no images, no
> javascript files, no css files)?
> And it won't record in the crawl database for non html pages links.
> 
> thank you.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/How-to-config-nutch-just-crawl-html-links--tf3562947.html#a9957697
Sent from the Nutch - User mailing list archive at Nabble.com.

