Hi,

You can write a simple parse filter or indexing filter plugin that checks the
page content for your keywords and acts accordingly.
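
A rough sketch of the keyword check such a filter could run on the parsed
text. This is plain Java with no Nutch dependency; KeywordGate and
containsAnyKeyword are hypothetical names made up for illustration, and the
keyword list is a placeholder. As far as I recall, in Nutch 1.x an
IndexingFilter can return null from filter(...) to keep a document out of the
index, while the outlinks found during parsing are still followed, but
please verify that against your version.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Nutch: decides whether a page's text
// mentions at least one of the configured keywords.
class KeywordGate {

    // Case-insensitive substring match; returns true on the first hit.
    static boolean containsAnyKeyword(String text, List<String> keywords) {
        if (text == null || text.isEmpty()) {
            return false;
        }
        String haystack = text.toLowerCase(Locale.ROOT);
        for (String keyword : keywords) {
            if (keyword == null || keyword.isEmpty()) {
                continue; // ignore blank entries in the keyword list
            }
            if (haystack.contains(keyword.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Placeholder keywords; a real plugin would read them from config.
        List<String> keywords = Arrays.asList("nutch", "crawler");
        System.out.println(containsAnyKeyword("Apache Nutch is a web crawler", keywords));
        System.out.println(containsAnyKeyword("Unrelated page text", keywords));
    }
}
```

Inside an indexing filter you would call something like this on the parsed
text and skip the document when it returns false. Note that it matches
substrings, so "crawler" also matches "crawlers"; tokenize first if you need
whole-word matching.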

Cheers 
 
-----Original message-----
> From:mausmust <[email protected]>
> Sent: Tue 17-Jul-2012 09:43
> To: [email protected]
> Subject: Nutch Content Filtering
> 
> While Apache Nutch 1.3 is crawling pages, I want to analyze the content of 
> each page. If the content contains certain keywords, the page should go on 
> to the next steps, say indexing. If the content does not contain at least 
> one keyword, I only want to extract the links from that page and otherwise 
> ignore it. How can I do that? Is there a filtering plugin available for 
> this purpose? Thanks.
> 
